SciCat Data Model Discussion


Agenda + Minutes


Present: AP, AGB, TF, AL, NC, BG

Agenda

Links: SciCat Data Model v3, SciCat Data Model v4, API Explorer

  • Our data model
    • Metadata schemas previously had a hierarchical structure but now reference each other via ID
      • i.e. experiment metadata has a campaignID and pulse metadata has an experimentID
    • HIVE
      Experiments (Usually corresponding to a research proposal)
      ----> tests/pulses within an experiment
      ----> lots of separate data sources for each pulse, coming from multiple diagnostics (all associated to that pulse)
    • MAST-U
      Campaigns
      ----> Experiments
      ----> Pulses
      • Lots of separate data sources for each pulse, coming from multiple diagnostics (all associated to that pulse)
  • V3 vs V4
    • FYI these are the changes in the SciCat dataset schema going from API v3 to v4:

      • "instrumentId": "string" --> "instrumentIds": [ "string" ]
      • "sampleId": "string" --> "sampleIds": [ "string" ]
      • "proposalId": "string" --> "proposalIds": [ "string" ]
      • "principalInvestigator": "string" --> "principalInvestigators": [ "string" ]
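
The four field changes listed above can be illustrated with a small conversion helper. This is only a sketch: dataset_v3_to_v4 is a hypothetical function, not part of SciCat, and it assumes a v3 dataset is available as a plain dict and that an empty v3 value should become an empty list. The example values are made up.

```python
from typing import Any

# Singular v3 field -> plural v4 field, as listed in the minutes.
V3_TO_V4_FIELDS = {
    "instrumentId": "instrumentIds",
    "sampleId": "sampleIds",
    "proposalId": "proposalIds",
    "principalInvestigator": "principalInvestigators",
}


def dataset_v3_to_v4(dataset: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a v3-style dataset dict using the v4 field names.

    Hypothetical helper: each singular string is wrapped in a one-element
    list. Fields absent from the input are left absent (assumption), and
    empty values become an empty list (assumption).
    """
    converted = dict(dataset)  # shallow copy, the input is not modified
    for old_key, new_key in V3_TO_V4_FIELDS.items():
        if old_key not in converted:
            continue
        value = converted.pop(old_key)
        converted[new_key] = [value] if value else []
    return converted


# Made-up example record
v3_dataset = {
    "datasetName": "example-pulse",
    "proposalId": "P-001",
    "instrumentId": "INST-1",
}
v4_dataset = dataset_v3_to_v4(v3_dataset)
```

Fields outside the four listed ones are copied through unchanged, so the helper only covers the differences noted in this meeting.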

Issues

1) On the frontend, we need to be able to filter on arbitrary nested scientific metadata (including its presentation).

2) How can we include our complex data model with pulses/shots?
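
To make issue 1 concrete, one way to think about filtering arbitrarily nested metadata is to flatten it into dotted key paths and match on those. The sketch below is illustrative only: flatten and matches are hypothetical helpers, the example metadata is made up, and this is not a description of how the SciCat frontend or backend filters.

```python
from typing import Any, Iterator


def flatten(metadata: dict[str, Any], prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (dotted_path, leaf_value) pairs for arbitrarily nested dicts."""
    for key, value in metadata.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested dicts, extending the dotted path
            yield from flatten(value, path)
        else:
            yield path, value


def matches(metadata: dict[str, Any], path: str, expected: Any) -> bool:
    """True if the nested metadata holds `expected` at the dotted `path`."""
    return any(p == path and v == expected for p, v in flatten(metadata))


# Made-up nested scientific metadata for one pulse
scientific_metadata = {
    "diagnostics": {
        "camera": {"exposure": {"value": 5, "unit": "ms"}},
    },
    "operator": "AP",
}

flat = dict(flatten(scientific_metadata))
hit = matches(scientific_metadata, "diagnostics.camera.exposure.value", 5)
```

The flattened paths could also serve as candidate column names when presenting nested metadata in a table, although key names that themselves contain dots would need separate handling.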

DS: Public vs. private metadata. Do we want all metadata to be public?

Actions/Next steps

  • Upgrade our SciCat test instance to v4 (AL)
  • AL and DS to meet to discuss searching across the scientific metadata and grouping etc (AL/DS)
    • Need something more specific without having to drill down to scientific metadata level, top-level would be better
    • ScientificMetadata displayed as a column header on the front page
  • Attempt to ingest more HIVE data to test on a local SciCat instance (AP)
    • Includes investigating the unit/value and unitSI/valueSI keys in the scientific metadata (relates to GitHub Issue on validation)
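
As a starting point for the unit/value investigation, a small check like the one below could report which of the four keys are missing from each entry before ingestion. This is a sketch under assumptions: it assumes each quantity in the scientific metadata is stored as a dict holding "value" and related keys, missing_quantity_keys is a hypothetical helper, and the example entries are made up. It does not reproduce any validation that SciCat itself performs.

```python
from typing import Any

# The four keys mentioned in the action item.
QUANTITY_KEYS = ("value", "unit", "valueSI", "unitSI")


def missing_quantity_keys(scientific_metadata: dict[str, Any]) -> dict[str, list[str]]:
    """Map each quantity-like entry to the keys from QUANTITY_KEYS it lacks.

    An entry counts as quantity-like if it is a dict containing "value"
    (assumption). Entries with nothing missing are left out of the result.
    """
    report: dict[str, list[str]] = {}
    for name, entry in scientific_metadata.items():
        if not (isinstance(entry, dict) and "value" in entry):
            continue  # plain strings, numbers, etc. are skipped
        missing = [key for key in QUANTITY_KEYS if key not in entry]
        if missing:
            report[name] = missing
    return report


# Made-up scientific metadata
example_metadata = {
    "pulse_length": {"value": 30, "unit": "s"},
    "coolant_temperature": {"value": 300, "unit": "K", "valueSI": 300, "unitSI": "K"},
    "comment": "test pulse",
}
report = missing_quantity_keys(example_metadata)
```

Only top-level entries are inspected here; nested entries would need a recursive walk.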

 

    • 10:30 - 11:30
      Roundtable Discussion + AOB 1h
      Speakers: Adam Parker (Data Solutions Unit, UKAEA), Alejandra Gonzalez-Beltran, Andrew Lahiff (Advanced Computing), Nathan Cummings (High Performance Data Analytics), Shaun de Witt, Tom Farmer (Research Data Management)