
Commit

Also mention CSV Validator.
David Underdown committed Feb 24, 2018
1 parent a04d844 commit d73d0b3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -9,7 +9,7 @@ See Richard Dunley's blog posts on using the catalogue as data: [Catalogue as Da
## Input CSV
On launching the script (or EXE), it will ask for an input CSV file. Enter the full path to your input file, or drag and drop it from a file explorer window onto the command line (at least in Windows); you'll still need to hit Enter afterwards so that the script continues. Otherwise just hit Enter, and the script will look for an input CSV file called [discovery_api_SearchRecords_input_params.csv](https://github.com/DavidUnderdown/DiscoveryAPI/blob/master/discovery_api_SearchRecords_input_params.csv) in the current working directory (i.e. in most situations, the same directory as the script itself). The version of the file in this repository contains the parameters necessary to obtain the basic data used in Richard's first blog post (1795 records from record series SC 8, restricted to petitions from 1360-1380).
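
As a rough illustration, a minimal Python sketch of the prompt-and-default behaviour described above (the prompt text and variable names are assumptions, not the script's actual code):

```python
import csv
import os

DEFAULT_INPUT = "discovery_api_SearchRecords_input_params.csv"

# A path dragged and dropped on Windows may arrive wrapped in quotes, so strip them.
path = input("Input CSV file (press Enter for default): ").strip().strip('"')
if not path:
    # Fall back to the default file in the current working directory.
    path = os.path.join(os.getcwd(), DEFAULT_INPUT)

with open(path, newline="", encoding="utf-8") as f:
    input_rows = list(csv.DictReader(f))
```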
### Input parameters
Within the input CSV file you can include up to 38 columns. The first 34 (parameter names prefixed with "sps.") are used as the URL parameters for the API call. The remaining 4 (labels, output_filepath, output_encoding, discovery_columns) give the list of labels expected in a structured description, the filepath(s) for the output, the text encoding to use (defaults to UTF-8), and the data fields from Discovery to be included in the output. To help understand what valid input looks like, a CSV Schema file has also been created, [discovery_api_SearchRecords_input_params.csvs](https://github.com/DavidUnderdown/DiscoveryAPI/blob/master/discovery_api_SearchRecords_input_params.csvs), using the [CSV Schema Language 1.1](http://digital-preservation.github.io/csv-schema/csv-schema-1.1.html) created by The National Archives. This can be used to check the structure of your own input CSV files with the [CSV Validator](http://digital-preservation.github.io/csv-validator/).
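
A sketch of how such a row might be split into URL parameters and output settings (the endpoint URL, helper function, and sample values below are assumptions for illustration, not the script's own code):

```python
from urllib.parse import urlencode

# The four non-URL columns named above.
CONTROL_COLUMNS = {"labels", "output_filepath", "output_encoding", "discovery_columns"}

def split_row(row):
    """Separate the "sps."-prefixed URL parameters from the four control columns."""
    params = {k: v for k, v in row.items() if k.startswith("sps.") and v}
    controls = {k: v for k, v in row.items() if k in CONTROL_COLUMNS}
    return params, controls

# A hypothetical input row (values for illustration only).
row = {
    "sps.searchQuery": "petition",
    "sps.recordSeries": "SC 8",
    "output_filepath": "sc8_petitions.csv",
    "output_encoding": "",  # empty, so fall back to the UTF-8 default
}

params, controls = split_row(row)
encoding = controls.get("output_encoding") or "utf-8"
url = "https://discovery.nationalarchives.gov.uk/API/search/records?" + urlencode(params)
print(url)
```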

The only mandatory parameters are sps.searchQuery among the URL parameters, and output_filepath. You can include multiple rows to send different queries to the API in one run of the script. Note, though, that you can also query across multiple series in a single query by supplying a list of series within a single row: in the sps.recordSeries field of the CSV file you can specify values like "ADM 188, ADM 362, ADM 363" (which would query across several series containing the service records of naval ratings). It probably only really makes sense (in terms of the output you'll get) to provide a list of series which have a (near) identical set of labels.
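
For example, a comma-separated sps.recordSeries value might be expanded into repeated URL parameters along these lines (whether the API expects repetition or some other form is an assumption here):

```python
from urllib.parse import urlencode

# Split the comma-separated list taken from the sps.recordSeries column.
series = [s.strip() for s in "ADM 188, ADM 362, ADM 363".split(",")]

# One plausible convention: repeat the parameter once per series.
query = urlencode([("sps.recordSeries", s) for s in series])
print(query)  # sps.recordSeries=ADM+188&sps.recordSeries=ADM+362&sps.recordSeries=ADM+363
```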

