pip install datapackage-pipelines-datahub
You will need the DataHub command-line tool installed on your machine.
You can use datapackage-pipelines-datahub as a plugin for dpp. In pipeline-spec.yaml it will look like this:
...
- run: datahub.dump.to_datahub
Note: To push datasets to the testing server, set DATAHUB_ENV=testing.
datahub.dump.to_datahub publishes a dataset to DataHub.io.
Parameters:
- config: full path to the config.json file. Default: ~/.config/datahub/config.json
  - Alternatively, you can set the DATAHUB_JSON environment variable to the path of the config file.
- findability: dataset visibility on DataHub.io. One of public (default), private, unlisted.
- other options accepted by the data push command, e.g. schedule, name, etc. See data push -h for more.
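The config lookup and the findability values listed above can be sketched in Python. This is a minimal sketch, not the plugin's actual code: resolve_config_path and check_findability are hypothetical helpers, and the precedence shown (explicit config parameter first, then DATAHUB_JSON, then the default path) is an assumption.

```python
import os

# Default location and allowed values, taken from the parameter list above.
DEFAULT_CONFIG = "~/.config/datahub/config.json"
FINDABILITY_VALUES = ("public", "private", "unlisted")


def resolve_config_path(parameters, environ=None):
    """Hypothetical helper: pick the config file path.

    Assumed order: the `config` parameter, then the DATAHUB_JSON
    environment variable, then the default path.
    """
    environ = os.environ if environ is None else environ
    path = parameters.get("config") or environ.get("DATAHUB_JSON") or DEFAULT_CONFIG
    # Expand a leading "~" to the user's home directory.
    return os.path.expanduser(path)


def check_findability(parameters):
    """Hypothetical helper: return the findability value or raise ValueError."""
    value = parameters.get("findability", "public")
    if value not in FINDABILITY_VALUES:
        raise ValueError(
            "findability must be one of %s, got %r" % (", ".join(FINDABILITY_VALUES), value)
        )
    return value
```

Passing the environment as an argument keeps the helper easy to test without touching the real process environment.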
Example:
datahub:
  title: my-dataset
  pipeline:
    -
      run: load_metadata
      parameters:
        url: http://example.com/my-datapackage/datapackage.json
    -
      run: load_resource
      parameters:
        url: http://example.com/my-datapackage/datapackage.json
        resource: my-resource
    -
      run: datahub.dump.to_datahub
      parameters:
        findability: private
        schedule: every 2d
        config: config/config.json.datahq
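To run the example pipeline against the testing server, one option is to launch dpp with DATAHUB_ENV set. The sketch below only builds the command and environment. build_dpp_run is a hypothetical helper, and the pipeline id ./datahub assumes that pipeline-spec.yaml sits in the current directory and defines the datahub pipeline shown above.

```python
import os


def build_dpp_run(pipeline_id="./datahub", testing=False, base_env=None):
    """Hypothetical helper: return (command, env) for running one pipeline.

    Assumes the `dpp` executable is on PATH and that `pipeline_id`
    matches a pipeline defined in the local pipeline-spec.yaml.
    """
    # Copy the environment so the caller's mapping is not modified.
    env = dict(os.environ if base_env is None else base_env)
    if testing:
        # Push to the testing server instead of the default one.
        env["DATAHUB_ENV"] = "testing"
    return ["dpp", "run", pipeline_id], env


# Usage (requires dpp to be installed):
#   import subprocess
#   command, env = build_dpp_run(testing=True)
#   subprocess.run(command, env=env, check=True)
```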