Improving metadata creation for datasets and inPort #16
Comments
Sharing my InPort experience and data archive goals
I've spoken with some PAM teams from other centers, and I think we are all in the same place with InPort, so it would be great to develop something with this group that I can share with my larger PAM team (and any other data type)!
On Feb 4, Kourtney, Eli, Alice and Matt met.
We outlined some questions to get more info on
More notes in our Google Doc: https://docs.google.com/document/d/1mZ42TqSfOfvpYFctABao749wzSX6-JcZHRdslf0hTxs/edit?usp=sharing
Hi All, I won't be able to make co-working today, but I did chat with our Life History Team about their use of InPort and ERDDAP. ERDDAP is what they use as the primary backup to publish trawl biological data. InPort seems to hold the metadata and a complete copy of the historical biological data. ERDDAP will have the biological data published by survey each year, but InPort will have one csv file that has all of the biological data year after year. So right now there is redundancy between what is on InPort and what is on ERDDAP. Both are updated separately by emailing the data to two different individuals, who then publish it to the websites (someone from SWFSC IT and someone from ERDDAP). As far as we know, they are not linked to NCEI and are submitted by emailing csv files. On InPort, the csv files seem to be linked to some type of Google storage (storage.googleapis.com), but only IT people have editing access to the Google folder to upload or change data sets.
Hi Alice, we are keeping notes on that shared Google Doc for today's meeting. Do you know who at SWFSC was helping with the InPort upload?
Awesome, I'll be able to take a look at those later today. Thanh Vu from IT is the SWFSC POC for the life history group for InPort upload.
On Feb 13, @Kourtney-Burger and @alicebeittel met, and Kourtney shared her process for creating package profiles in batch for passive acoustic data using Passive Packer to send to NCEI. Her code and process are listed here on GitHub, and she has an awesome Quarto page! There are many similarities to CruisePack (the NCEI packager for water column acoustic data), and I'll be looking at how to adapt the code for CruisePack and our fisheries surveys. The first steps will be 1) writing some code to compile our survey metadata from the various R scripts used to make our survey report and from folder names on our server, and 2) downloading SQLiteStudio to open up the backend of the CruisePack executable and see how CruisePack organizes the metadata.
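For step 2, here is a minimal sketch of listing tables and columns from a script instead of a GUI. It assumes the CruisePack backend is a SQLite file, which the SQLiteStudio plan implies but which is not confirmed in this thread. The function name `describe_sqlite` and the path in the usage comment are hypothetical; only Python's standard-library `sqlite3` module is used.

```python
import sqlite3
from pathlib import Path


def describe_sqlite(path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    # Open read-only so inspecting the file can never modify it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        schema = {}
        for (table,) in tables:
            # PRAGMA table_info returns one row per column; index 1 is the column name.
            quoted = table.replace('"', '""')
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()


# Hypothetical usage (the file name is an assumption, not a real CruisePack path):
# for table, columns in describe_sqlite("cruisepack_backend.sqlite").items():
#     print(table, columns)
```

Seeing the table and column names this way could help decide which survey metadata fields need to be compiled in step 1.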
Excited to see how you can adopt this method! Hopefully it saves some time for future you!
Did you get any info on how they create the metadata? Do they create separate metadata for ERDDAP versus inPort or are they able to use the same XML (or similar) file?
@eeholmes test how long to get notification
@eeholmes I learned that the metadata doesn't really change year to year for the InPort FRD trawl database. It is the same general cruise description. The descriptions were revised recently (it sounds like the metadata hadn't been updated for some time). What IS updated each year is the csv files listed with the actual trawl biological data. Each year the new data is appended to the csv, and a new csv is submitted to InPort. The Life History group also sends the same trawl biological data to ERDDAP. They don't submit metadata with the ERDDAP submission (it sounds like the metadata is already stored on the site and doesn't change year to year). So the same biological data is stored in two places. And yes, the metadata format is different for ERDDAP vs InPort, but since it doesn't change year to year, they are not creating separate metadata each year for submission. Trawl Specimen Data ERDDAP: https://coastwatch.pfeg.noaa.gov/erddap/tabledap/FRDCPSTrawlLHSpecimen.html
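The yearly "append the new data to the csv" step described above could be sketched roughly like this. This is not the Life History group's actual process, just an illustration under assumptions: the function name `append_year` and the file names are hypothetical, and it assumes both files share an identical header row.

```python
import csv


def append_year(cumulative_csv, new_year_csv):
    """Append the data rows of new_year_csv to cumulative_csv; return rows added."""
    with open(new_year_csv, newline="") as src:
        reader = csv.reader(src)
        new_header = next(reader, None)
        new_rows = [row for row in reader if row]  # skip blank lines

    with open(cumulative_csv, newline="") as dst:
        old_header = next(csv.reader(dst), None)

    # Refuse to append if the columns differ, so the cumulative file stays consistent.
    if new_header != old_header:
        raise ValueError(f"Header mismatch: {new_header} vs {old_header}")

    # Check whether the cumulative file already ends with a newline.
    with open(cumulative_csv, "rb") as f:
        f.seek(0, 2)
        needs_newline = False
        if f.tell() > 0:
            f.seek(-1, 2)
            needs_newline = f.read(1) != b"\n"

    with open(cumulative_csv, "a", newline="") as dst:
        if needs_newline:
            dst.write("\n")
        csv.writer(dst, lineterminator="\n").writerows(new_rows)
    return len(new_rows)


# Hypothetical usage (file names are assumptions):
# added = append_year("trawl_specimens_all_years.csv", "trawl_specimens_new_year.csv")
```

Checking the header before appending is a cheap guard against a column being renamed or reordered between years, which matters when the same file is then emailed to two different places.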
Google Doc for Notes. Add info and what you know here.
SMART GOAL: Come up with some specific, doable (i.e. small) things we can do to improve metadata creation, especially with regard to InPort. These could be documentation, a write-up of how different groups are approaching metadata creation, scripts, packages, or interviews with people who might help us (NCEI, InPort).
Eli, Craig, Alice, Molly, Kourtney, Marylou, Dawn, Lynn, Carissa, Ana
Craig
Eli
Alice
Molly