AnxhelaDani edited this page Oct 2, 2015 · 19 revisions

The fourth CAP meeting took place on Thursday, 17th of September 2015, 3-5PM CET

Notes from the meeting:

  • P. and S. reported on the progress of the JSON metadata schema specifications. P. implemented a first version (which will be revised based on testing over the next months). DASPOS organized a HEP ontology boot camp at the University of Notre Dame in May. A first draft of a common HEP ontology has been developed so far, covering the detector final state and data processing workflows (with input from all LHC collaborations). These are now being tested with real data (e.g. the latest CMS Open Data release). Part of the results has been submitted to, and will be presented at, the Workshop on Ontology and Semantic Web Patterns in October 2015.

  • T. and P. showed the latest progress on the development site. Major changes were made in the backend: e.g. the system now runs entirely through Docker, data submitted to the current CAP version are already being stored, and CERN OAuth/SSO is used for authentication and for setting permissions/access (through e-group membership). The CMS Statistics Questionnaire is done, and so is the first JSON-based submission form (which is now being tested and revised). P. showed a prototype of JSON-schema-based submission for analysis preservation and how it can connect to collaboration databases, using the connection between CAP and DAS (CMS) as an example. The discussion underlined the need for an additional layer of user guidance so that using CAP is as easy as possible, for example via automated pre-filling of the form by grabbing analysis source code when possible (A.P.). It should also be mentioned that different options are available, from grabbing code from a GitHub repo to very detailed manual entry via the JSON forms. Users will need guidance here. Furthermore, the discussion highlighted that the JSON forms should be flexible and available 'on demand', so that users can pick and combine forms and subforms to represent their analysis. T. confirmed that schema composition is possible (e.g. cap-common-file-0.7.json or cap-cms-physics-objects-0.3.json, etc.). K. mentioned the usefulness of composing submission sub-schemas with already submitted data. Also, it would be good to share the JSON schemas more publicly on GitHub, so that others can revise and comment.

  • Update from CMS:
    CAP: Eager to test the available pieces of CAP. Interest in using the CMS survey in production first (before CAP as a whole is at production level).
    COD: Getting ready to select the MC data for the next data release.

  • Update from ATLAS:
    CAP: The task force for data preservation is working on a report and recommendations. A more detailed update will be provided by D. S. (later).
    COD: Development team met with F. S. regarding the release of new ATLAS masterclass data this autumn.

  • Update from LHCb:
    CAP: Eager to test the LHCb JSON schema. Looking for volunteers to test. Feedback will be elicited at the computing workshop in mid-November. It would be good if a tutorial and test were possible by then. Interest in organizing some work days at CERN with the CAP team to get this done. Possible use of M-O's code. Common LHCb/Yandex work on exercises using IPython notebooks.
    COD: Data release approved by collaboration. In addition, a prototype of a new tool for the research section of the Open Data Portal is ready. Meeting needed to understand how this could be implemented on COD.

  • Update from ALICE (absent):
    COD: Big data release coming. Data/code underlying publications have been prepared for preservation/release.


  • Demo: support from SUSY and preservation etc. are using code from published analyses. 20 real-world analyses can be used to extend the ATLAS schema. We can try to use this knowledge for subschemas.
  • RECAST could be both a data provider and a data consumer for CAP. As a data provider, it would offer a different mode of capturing analyses: a means to get ATLAS analyses on board early?
  • Slides will be available here; Demo:
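
The schema composition discussed above (picking and combining forms and subforms 'on demand') could be sketched roughly as follows. This is only an illustration and not the CAP implementation: the `compose_schema` helper, the `SCHEMA_BASE` location and the use of JSON Schema's `allOf` with `$ref` are assumptions; only the two file names come from the notes.

```python
import json

# Hypothetical base location; the notes do not say where the schemas are hosted.
SCHEMA_BASE = "https://example.org/schemas/"


def compose_schema(title, subschema_files, base=SCHEMA_BASE):
    """Build a JSON Schema that references each selected sub-schema.

    With "allOf", a record is valid only if it satisfies every referenced
    sub-schema. This is one possible way to combine subforms (an assumption).
    """
    return {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "title": title,
        "allOf": [{"$ref": base + name} for name in subschema_files],
    }


# A user picks the sub-schemas that describe their analysis.
composed = compose_schema(
    "example-cms-analysis",  # hypothetical title
    ["cap-common-file-0.7.json", "cap-cms-physics-objects-0.3.json"],
)
print(json.dumps(composed, indent=2))
```

Referencing sub-schemas by file name keeps each subform versioned on its own, so a composed form can pin, for example, version 0.7 of the common file schema while other parts evolve.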

Homework for CAP team (and others where applicable):

  • Finalise and share JSON Schema(s) publicly (GitHub) and invite comments
  • Prepare infrastructure for pre-production for the CMS Statistics Questionnaire
  • Organize LHCb follow-up meeting and sprint for testing
  • CMS meeting for testing of CAP and preparation for new data releases
  • Contact D. S. for detailed update on task force
  • RECAST: understand how to use data from RECAST, reach out for feedback on JSON/metadata.