release v1.0.10
jpwiedekopf committed Jun 3, 2021
1 parent 2f1b641 commit d8fded2
Showing 8 changed files with 172 additions and 23 deletions.
132 changes: 132 additions & 0 deletions README.md
@@ -0,0 +1,132 @@
# FHIR Populator

[![PyPI version](https://badge.fury.io/py/fhir-populator.svg)](https://badge.fury.io/py/fhir-populator)

A tool to load a lot of FHIR resources into a "naked" FHIR server.

It is intended to quickly load a package of FHIR Profiles (`StructureDefinitions`) and associated artefacts (such as `CodeSystem`, `ValueSet`, `ConceptMap`) into a FHIR server that has just been spun up.

This tool was developed in the context of the [Core Dataset (KDS) of the Medical Informatics Initiative (MII) in Germany](https://simplifier.net/organization/koordinationsstellemii/~projects) as well as the [German Corona Consensus Dataset (GECCO) developed by the Network University Medicine](https://simplifier.net/ForschungsnetzCovid-19).

The script is written in Python 3 and [available on PyPI](https://pypi.org/project/fhir-populator/).

## Installation

As the package is available on the Python Package Index, you can install it quickly into a virtual environment. First, you may need to create a folder for FHIR Populator and a virtual environment (all commands are for Unix-based operating systems and may need tweaking on Windows):

```bash
mkdir fhir-populator
cd fhir-populator
python -m venv .venv
source .venv/bin/activate
```

These commands will create a new directory, change into it, create the virtual environment, and activate it.

Next, load the package from PyPI:

```bash
pip install fhir-populator
```

You can now start it as a Python module:

```bash
python -m fhir_populator --help
```

and the help will be printed:

```
usage: fhir_populator [-h] --endpoint ENDPOINT [--authorization-header AUTHORIZATION_HEADER] [--log-file LOG_FILE]
[--get-dependencies] [--non-interactive] [--include-examples]
[--log-level {INFO,WARNING,DEBUG,ERROR}] [--rewrite-versions] [--only-put] [--versioned-ids]
[--exclude-resource-type [EXCLUDE_RESOURCE_TYPE ...]] [--registry-url REGISTRY_URL]
[--package PACKAGES [PACKAGES ...]]
optional arguments:
-h, --help show this help message and exit
--endpoint ENDPOINT The FHIR server REST endpoint (default: None)
--authorization-header AUTHORIZATION_HEADER
an authorization header to use for uploading. If none, nothing will be sent. (default: None)
--log-file LOG_FILE A log file path (default: None)
--get-dependencies if provided, dependencies will be retrieved from the FHIR registry. (default: False)
--non-interactive In case of errors returned by this FHIR server, the error will be ignored with only a log
message being written out. Might be helpful when integrating this module into a script.
(default: False)
--include-examples If provided, the resources in the 'examples' folder of the packages will be uploaded.
(default: False)
--log-level {INFO,WARNING,DEBUG,ERROR}
The level to log at (default: INFO)
--rewrite-versions If provided, all versions of FHIR resources will be modified to be consistent with the
package version. Otherwise, the version is used as-is! (default: False)
--only-put if provided, IDs will be generated for all resources that lack one. This can be combined with
--versioned-ids. (default: False)
--versioned-ids if provided, all resource IDs will be prefixed with the package version. (default: False)
--exclude-resource-type [EXCLUDE_RESOURCE_TYPE ...]
Specify resource types to ignore! (default: None)
--registry-url REGISTRY_URL
The FHIR registry url, Simplifier by default (default: https://packages.simplifier.net)
--package PACKAGES [PACKAGES ...]
Specification for the package to download and push to the FHIR server. You can specify more
than one package. Use the syntax 'package@version', or leave out the version to use the
latest package available on the registry. (default: None)
```

There are a lot of command line options that can be used to customize the behaviour of the program.

## Example Invocation

To try out the program, you can spin up a FHIR server, such as [HAPI FHIR JPA Server Starter](https://github.com/hapifhir/hapi-fhir-jpaserver-starter) on your local machine, e.g. using Docker. Assuming the endpoint of the server is http://localhost:8080/fhir, you can upload the latest version of the GECCO package, including dependencies (e.g. the MII KDS modules used by that package), thus:

```bash
python -m fhir_populator --endpoint http://localhost:8080/fhir --get-dependencies --package de.gecco
```

As this example does not specify a version of the `de.gecco` package, the latest version of the package will first be determined from the Simplifier API. You can also specify a version using the syntax `package@version`:

```bash
python -m fhir_populator --endpoint http://localhost:8080/fhir --get-dependencies --package de.gecco@1.0.3
```

Also, you can specify as many packages as you like, and mix-and-match versioned references with unversioned ones:

```bash
python -m fhir_populator --endpoint http://localhost:8080/fhir --get-dependencies --package de.gecco@1.0.3 de.medizininformatikinitiative.kerndatensatz.person
```
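
The `package@version` syntax can be illustrated with a small parsing sketch. This is not the code used by the script; the function name `parse_package_spec` is a hypothetical helper introduced only for illustration, and the sketch assumes that a missing version simply means "resolve the latest version from the registry later".

```python
from typing import Optional, Tuple


def parse_package_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'package@version' into (name, version).

    The version is None when the spec contains no '@', which stands for
    "use the latest version available on the registry".
    """
    name, separator, version = spec.partition("@")
    if not name:
        raise ValueError(f"missing package name in {spec!r}")
    if separator and not version:
        raise ValueError(f"missing version after '@' in {spec!r}")
    return name, (version if separator else None)


# Versioned and unversioned references can be mixed freely.
specs = ["de.gecco@1.0.3", "de.medizininformatikinitiative.kerndatensatz.person"]
for name, version in (parse_package_spec(s) for s in specs):
    print(name, version or "(latest from the registry)")
```

Returning `None` for the version keeps the "latest" case explicit, so a later step can decide how to resolve it.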

## Implementation Details

The script is broken into multiple steps:

1. All unversioned package references are converted to versioned references, by retrieving the package metadata from the NPM registry.
2. The packages are downloaded as tarballs into a temporary directory (under `/tmp` for Unix systems), and extracted there.
3. After each package is downloaded, the `package.json` is examined, and dependencies are added to the download queue, if desired.
* During this download, a dependency graph is built from the downloaded packages, to make sure that every package is uploaded after its dependencies
4. The packages are uploaded, file-by-file, to the FHIR server. This uses the topological sort of the directed dependency graph, to maintain consistency. Also, the files are uploaded in a logical order (e.g. `CodeSystem` before `ValueSet` before `StructureDefinition` before `Patient` etc.)
5. If the FHIR server returns an error, the user is prompted interactively for input.
6. When all resources are uploaded (or if the user aborts execution with *CTRL-C*), the temporary directory is recursively deleted.
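
The ordering in step 4 can be sketched as follows. This is a minimal illustration, not the script's implementation: it uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+) instead of the dependency graph library used by the script, the package names are made up, and the `TYPE_PRIORITY` list only contains the resource types named above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency data: each package maps to the packages it depends on.
dependencies = {
    "example.app": {"example.profiles", "example.terminology"},
    "example.profiles": {"example.terminology"},
    "example.terminology": set(),
}

# static_order() yields every package only after all of its dependencies.
upload_order = list(TopologicalSorter(dependencies).static_order())
print(upload_order)

# Assumed ordering of resource types within one package.
# Types that are not listed are uploaded last.
TYPE_PRIORITY = ["CodeSystem", "ValueSet", "StructureDefinition"]


def type_rank(resource_type: str) -> int:
    """Return the position of the type in TYPE_PRIORITY, or a rank after all of them."""
    try:
        return TYPE_PRIORITY.index(resource_type)
    except ValueError:
        return len(TYPE_PRIORITY)


# Hypothetical (resource type, file name) pairs found in one package.
files = [
    ("Patient", "patient.json"),
    ("ValueSet", "vs.json"),
    ("CodeSystem", "cs.json"),
    ("StructureDefinition", "sd.json"),
]
files_sorted = sorted(files, key=lambda item: type_rank(item[0]))
print([resource_type for resource_type, _ in files_sorted])
```

Uploading in this order means that a profile is only sent after the terminology it binds to, and a package only after the packages it depends on.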

## Configuration

There are a number of configuration options, which are (hopefully) mostly self-explanatory. Some of the more obscure ones are explained below:

* `--authorization-header`: Use this if your server is configured for authentication. You can enter something like `--authorization-header "Bearer asdf"` here, which will be presented to the server for each request.
* `--exclude-resource-type`: You can skip resource types, e.g. `--exclude-resource-type CodeSystem ValueSet ConceptMap`. Matching is not case-sensitive: the lower-case version of the resource type is matched against the lower-cased parameter list.
* `--include-examples`: Examples in FHIR packages are great, but often not consistent across packages. For example, an `Observation` example might reference `Patient/example`, and this patient is nowhere to be found in the package, or its dependencies. Some FHIR servers (such as HAPI JPA Server) validate references on CREATE and return errors for missing references. Hence, examples (files in the `examples` folder of the package, as per the spec) are ignored by default.
* `--non-interactive`: If provided, errors returned by the FHIR server will be ignored, and only a warning will be printed out.
* `--only-put`: FHIR requires that IDs are present for all resources that are uploaded via HTTP PUT. Hence, if IDs are missing, an HTTP POST request is used by the script. This does not generate stable, or nice, IDs by default. You can provide this parameter to make the script generate IDs from the file name of the resource, which should be stable across reruns. This uses a "slugified" version of the filename without unsafe characters, and restricted to 64 characters, as per the specification.
* `--registry-url`: While the script was only tested using the Simplifier registry, it should be compatible with other implementations of the [FHIR NPM Package Spec](https://wiki.hl7.org/FHIR_NPM_Package_Spec), which is implemented by the Simplifier software. Use this option to provide the endpoint of an alternative registry.
* `--rewrite-versions`: If provided, all `version` attributes of the resources will be rewritten to match the version in the `package.json`, to separate these definitions from previous versions. You will need to think about the version numbers you use when communicating with others, who might not use the same versions - ⚠️ use with caution! ⚠️
* `--versioned-ids`: To separate versions of the resources on the same FHIR server, you can override the IDs provided in the resources, by including the slugified version of the package in the ID. If combined with the `--only-put` switch, this will work the same, versioning existing IDs, and slugifying + versioning the filename of resources without IDs.
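
The interplay of `--only-put` and `--versioned-ids` can be sketched as below. This is an illustration under stated assumptions, not the script's code: `slugify` here is a simple regex-based stand-in rather than the slugify library the package depends on, `make_resource_id` is a hypothetical helper, and the `-` separator between version and ID is an assumption.

```python
import re
from pathlib import Path
from typing import Optional

MAX_ID_LENGTH = 64  # FHIR restricts resource ids to 64 characters


def slugify(text: str) -> str:
    """Replace every run of characters other than ASCII letters and digits with '-'."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


def make_resource_id(filename: str,
                     existing_id: Optional[str] = None,
                     package_version: Optional[str] = None) -> str:
    """Derive a stable resource id (hypothetical helper).

    - Without an existing id, the slugified file name (minus extension) is used,
      which corresponds to the idea behind --only-put.
    - With a package version, the slugified version is put in front of the id,
      which corresponds to the idea behind --versioned-ids.
    """
    base = existing_id if existing_id else slugify(Path(filename).stem)
    if package_version is not None:
        base = f"{slugify(package_version)}-{base}"  # separator is an assumption
    return base[:MAX_ID_LENGTH]


print(make_resource_id("StructureDefinition-my_profile.json"))
print(make_resource_id("patient.json", existing_id="gecco-patient", package_version="1.0.3"))
```

Because the ID depends only on the file name and the package version, rerunning the upload produces the same IDs, so a PUT updates the existing resource rather than creating a duplicate.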

## Hacking

If you want to customize the program, you should:

1. Create a fork on GitHub, and clone it.
2. Create a new virtual environment in your fork: `python -m venv venv`; `source venv/bin/activate`
3. Install the package locally, using `pip install .`
4. Customize the script. Re-run step 3 if you change the script.
5. Run `python -m fhir_populator`, as before.
6. Create an issue and pull request in the GitHub repo! We welcome contributions!
2 changes: 0 additions & 2 deletions fhir_populator/__init__.py

This file was deleted.

6 changes: 6 additions & 0 deletions pyproject.toml
@@ -0,0 +1,6 @@
[build-system]
requires = [
"setuptools>=42",
"wheel"
]
build-backend = "setuptools.build_meta"
32 changes: 32 additions & 0 deletions setup.cfg
@@ -0,0 +1,32 @@
[metadata]
name = fhir-populator
version = 1.0.10
author = Joshua Wiedekopf
author_email = j.wiedekopf@uni-luebeck.de
description = Load Simplifier packages into a FHIR server, quickly and consistently.
long_description = file: README.md
long_description_content_type=text/markdown
url = https://github.com/itcr-uni-luebeck/fhir-populator
project_urls =
    Bug Tracker = https://github.com/itcr-uni-luebeck/fhir-populator/issues
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: BSD License
    Operating System :: OS Independent
    Intended Audience :: Healthcare Industry
    Topic :: Utilities

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.6
install_requires =
    requests
    rich
    inquirer
    networkx
    python-slugify

[options.packages.find]
where = src
19 changes: 0 additions & 19 deletions setup.py

This file was deleted.

Empty file added src/fhir_populator/__init__.py
Empty file.
File renamed without changes.
@@ -170,7 +170,7 @@ def __init__(self, args: argparse.Namespace, log: logging.Logger):
 self.include_examples = args.include_examples
 self.rewrite_versions = args.rewrite_versions
 self.log_level = args.log_level
-self.exclude_resource_type = args.exclude_resource_type
+self.exclude_resource_type = [a.lower() for a in args.exclude_resource_type] if args.exclude_resource_type is not None else []
 self.only_put = args.only_put
 self.versioned_ids = args.versioned_ids
 self.log = log
@@ -411,7 +411,7 @@ def upload_resources(self, dependency_graph: nx.DiGraph):
 fhir_resource = FhirResource(encoded_path, package_version, self.args.only_put,
 self.args.versioned_ids)
 if self.args.exclude_resource_type is not None \
-and fhir_resource.resource_type in self.args.exclude_resource_type:
+and fhir_resource.resource_type.lower() in self.args.exclude_resource_type:
 self.log.debug(
 f"Resource {encoded_path} is of resource type {fhir_resource.resource_type}" +
 f" and is skipped.")
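
The idea behind this change, case-insensitive matching of excluded resource types, can be shown in isolation. The following is a minimal sketch, not the code from the repository; `normalise_excluded` and `is_excluded` are hypothetical helper names.

```python
from typing import Iterable, List, Optional


def normalise_excluded(excluded: Optional[Iterable[str]]) -> List[str]:
    """Lower-case the excluded types once; None means nothing is excluded."""
    return [t.lower() for t in excluded] if excluded is not None else []


def is_excluded(resource_type: str, excluded_lower: List[str]) -> bool:
    """Compare the lower-cased resource type against the lower-cased list."""
    return resource_type.lower() in excluded_lower


excluded = normalise_excluded(["codesystem", "VALUESET"])
for resource_type in ["CodeSystem", "ValueSet", "Patient"]:
    print(resource_type, "skipped" if is_excluded(resource_type, excluded) else "uploaded")
```

Both sides of the comparison have to be lower-cased for the match to be case-insensitive.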