Merge pull request #7 from itcr-uni-luebeck/release/1.1.0
Release/1.1.0
jpwiedekopf authored Jun 8, 2021
2 parents d8fded2 + ee1948f commit 7fea2b8
Showing 8 changed files with 116 additions and 45 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -150,3 +150,5 @@ crashlytics.properties
crashlytics-build.properties
fabric.properties

.idea/*.iml
.idea/misc.xml
11 changes: 0 additions & 11 deletions .idea/fhir-populator.iml

This file was deleted.

4 changes: 0 additions & 4 deletions .idea/misc.xml

This file was deleted.

11 changes: 11 additions & 0 deletions LICENSE
@@ -0,0 +1,11 @@
Copyright 2021 IT Center for Clinical Research, Universität zu Lübeck

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29 changes: 23 additions & 6 deletions README.md
@@ -26,7 +26,7 @@ These commands will create a new directory, visit it, create the virtual environ
Next, load the package from PyPI:

```bash
pip install fhir-populator
python -m pip install fhir-populator
```

You can now start it as a Python module:
@@ -41,12 +41,12 @@ and the help will be printed:
usage: fhir_populator [-h] --endpoint ENDPOINT [--authorization-header AUTHORIZATION_HEADER] [--log-file LOG_FILE]
[--get-dependencies] [--non-interactive] [--include-examples]
[--log-level {INFO,WARNING,DEBUG,ERROR}] [--rewrite-versions] [--only-put] [--versioned-ids]
[--exclude-resource-type [EXCLUDE_RESOURCE_TYPE ...]] [--registry-url REGISTRY_URL]
[--package PACKAGES [PACKAGES ...]]
[--exclude-resource-type [EXCLUDE_RESOURCE_TYPE ...] | --only [ONLY ...]]
[--registry-url REGISTRY_URL] [--package PACKAGES [PACKAGES ...]]
optional arguments:
-h, --help show this help message and exit
--https://wiki.hl7.org/FHIR_NPM_Package_Spec ENDPOINT The FHIR server REST endpoint (default: None)
--endpoint ENDPOINT The FHIR server REST endpoint (default: None)
--authorization-header AUTHORIZATION_HEADER
an authorization header to use for uploading. If none, nothing will be sent. (default: None)
--log-file LOG_FILE A log file path (default: None)
@@ -65,6 +65,8 @@ optional arguments:
--versioned-ids if provided, all resource IDs will be prefixed with the package version. (default: False)
--exclude-resource-type [EXCLUDE_RESOURCE_TYPE ...]
Specify resource types to ignore! (default: None)
--only [ONLY ...] Only upload the resource types provided here, e.g. only StructureDefinitions, CodeSystems and
ValueSets (default: None)
--registry-url REGISTRY_URL
The FHIR registry url, Simplifier by default (default: https://packages.simplifier.net)
--package PACKAGES [PACKAGES ...]
@@ -120,13 +122,28 @@ There are a number of configuration options, which are (hopefully) mostly self-e
* `--rewrite-versions`: If provided, all `version` attributes of the resources will be rewritten to match the version in the `package.json`, to separate these definitions from previous versions. You will need to think about the versions numbers you use when communicating with others, who might not use the same versions - ⚠️ use with caution! ⚠️
* `--versioned-ids`: To separate versions of the resources on the same FHIR server, you can override the IDs provided in the resources, by including the slugified version of the package in the ID. If combined with the `--only-put` switch, this will work the same, versioning existing IDs, and slugifying + versioning the filename of resources without IDs.
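
As a rough illustration of what `--versioned-ids` is described as doing, the sketch below prefixes a resource ID with a slugified package version. The helper names `slugify_version` and `versioned_id` are hypothetical, the regex-based slug is only a stand-in for the `slugify` package the tool imports, and the exact ID format the tool produces may differ.

```python
import re


def slugify_version(version: str) -> str:
    """Hypothetical stand-in for a slugifier: lowercase, other characters become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", version.lower()).strip("-")


def versioned_id(resource_id: str, package_version: str) -> str:
    """Prefix the resource ID with the slugified package version (assumed format)."""
    return f"{slugify_version(package_version)}-{resource_id}"


print(versioned_id("my-profile", "1.1.0"))  # 1-1-0-my-profile
```

Keeping the version in the ID lets two releases of the same profile coexist on one server, at the cost of IDs that no longer match the ones published in the package.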

## Updating

```bash
cd fhir-populator
source venv/bin/activate
python -m pip install --upgrade fhir-populator
```

## Hacking

If you want to customize the program, you should:

1. create a fork in GitHub, and clone it.
2. create a new virtual environment in your fork: `python -m venv venv`; `source venv/bin/active`
2. create a new virtual environment in your fork: `python -m venv .venv`; `source .venv/bin/activate`
3. Install the package locally, using `pip install .`
4. Customize the script. Re-run step 3 if you change the script.
5. `python -m fhir_populator`, as before.
6. Create a issue and pull request in the GitHub Repo! We welcome contributions!
6. Create a issue and pull request in the GitHub Repo! We welcome contributions!

## Changelog

| Version | Date | Changes |
|-|-|-|
| v1.0.10 | 2021-06-03 | Initial release |
| v1.1.0 | 2021-06-08 | - handle Unicode filenames, especially on BSD/macOS (#1)<br>- do not serialize null ID for POST (#2)<br>- include option for only certain resource types(#6)<br>- fix XML handling (#6)<br>- add LICENSE |
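
The new `--only` option and its interaction with `--exclude-resource-type` can be sketched as follows. This is a minimal, standalone approximation of the behaviour shown in this commit, not the tool's actual module: `build_parser`, `lowered`, and `should_skip` are hypothetical names, and only the two relevant arguments are declared.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fhir_populator_sketch")
    # argparse rejects a command line that uses both options of this group
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--exclude-resource-type", type=str, nargs="*")
    group.add_argument("--only", type=str, nargs="*")
    return parser


def lowered(values: Optional[List[str]]) -> List[str]:
    """Normalise an optional list of resource types to lowercase."""
    return [v.lower() for v in values] if values is not None else []


def should_skip(resource_type: str, exclude: List[str], only: List[str]) -> bool:
    """Skip excluded types, and skip everything not listed when `only` is non-empty."""
    r_type = resource_type.lower()
    return r_type in exclude or (len(only) != 0 and r_type not in only)


args = build_parser().parse_args(["--only", "CodeSystem", "ValueSet"])
only = lowered(args.only)
exclude = lowered(args.exclude_resource_type)
print(should_skip("ValueSet", exclude, only))  # False
print(should_skip("Patient", exclude, only))   # True
```

Comparing lowercase strings means that `--only valueset` and `--only ValueSet` behave the same.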
2 changes: 1 addition & 1 deletion setup.cfg
@@ -1,6 +1,6 @@
[metadata]
name = fhir-populator
version = 1.0.10
version = 1.1.0
author = Joshua Wiedekopf
author_email = j.wiedekopf@uni-luebeck.de
description = Load Simplifier packages into a FHIR server, quickly and consistently.
98 changes: 75 additions & 23 deletions src/fhir_populator/populator.py
@@ -1,18 +1,20 @@
import sys
import tempfile
from typing import List, Optional, Dict
import argparse
import json
import logging
import os
import argparse
import tarfile
import shutil
import logging
import requests
from rich.logging import RichHandler
import sys
import tarfile
import tempfile
import xml.etree.ElementTree as ElementTree
from enum import Enum
from io import BufferedReader
from typing import List, Optional, Dict

import inquirer
import networkx as nx
import requests
from rich.logging import RichHandler
from slugify import slugify


@@ -93,8 +95,9 @@ def get_payload_rewrite_xml(self, rewrite_version: Optional[str]) -> str:
version_node = root.find("version")
if version_node is not None:
version_node.text = rewrite_version
id_node = root.find("id")
id_node.text = self.id
if self.id is not None:
id_node = root.find("id")
id_node.text = self.id
return ElementTree.tostring(root, encoding="unicode")

def get_payload_rewrite_json(self, rewrite_version: Optional[str], indent: int = 2) -> str:
@@ -103,12 +106,20 @@ def get_payload_rewrite_json(self, rewrite_version: Optional[str], indent: int =
if rewrite_version is not None:
if "version" in json_dict:
json_dict["version"] = rewrite_version
json_dict["id"] = self.id
if self.id is not None:
json_dict["id"] = self.id
return json.dumps(json_dict, indent=indent)

def get_argument_xml(self, argument: str, raise_on_missing: bool = False):
tree = ElementTree.parse(self.file_path)
root = tree.getroot()
if argument == "resourceType":
# resource type is provided as the name of the tag, instead of as an attribute
tag = root.tag
if "{" in tag:
return tag.split("}")[1] # Tag name without namespace
else:
return tag # Tag does not seem to contain a namespace
res_node = root.find(argument)
if res_node is None and raise_on_missing:
raise LookupError(f"the resource {self.file_path} does not have an attribute {argument}!")
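
The `resourceType` branch in the hunk above relies on how `ElementTree` reports namespaced tags, namely as `{namespace-uri}LocalName`. A minimal standalone sketch of that lookup follows; the function name `resource_type_from_xml` is hypothetical, and it parses a string rather than a file path as the method in this commit does.

```python
import xml.etree.ElementTree as ElementTree


def resource_type_from_xml(xml_text: str) -> str:
    """Return the root element's local name, which FHIR XML uses as the resource type."""
    root = ElementTree.fromstring(xml_text)
    tag = root.tag
    if tag.startswith("{"):
        # Namespaced tag: drop the "{namespace-uri}" prefix
        return tag.split("}", 1)[1]
    return tag  # no namespace present


print(resource_type_from_xml('<CodeSystem xmlns="http://hl7.org/fhir"><id value="x"/></CodeSystem>'))
# CodeSystem
```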
@@ -157,6 +168,7 @@ class PopulatorSettings:
only_put: bool
versioned_ids: bool
registry_url: str
only: List[str]
log: logging.Logger

def __init__(self, args: argparse.Namespace, log: logging.Logger):
@@ -170,7 +182,12 @@ def __init__(self, args: argparse.Namespace, log: logging.Logger):
self.include_examples = args.include_examples
self.rewrite_versions = args.rewrite_versions
self.log_level = args.log_level
self.exclude_resource_type = [a.lower() for a in args.exclude_resource_type] if args.exclude_resource_type is not None else []
self.exclude_resource_type = [a.lower() for a in args.exclude_resource_type] \
if args.exclude_resource_type is not None \
else []
self.only = [a.lower() for a in args.only] \
if args.only is not None \
else []
self.only_put = args.only_put
self.versioned_ids = args.versioned_ids
self.log = log
@@ -257,10 +274,16 @@ def parse_args() -> argparse.Namespace:
"--versioned-ids", action="store_true",
help="if provided, all resource IDs will be prefixed with the package version."
)
parser.add_argument(
group = parser.add_mutually_exclusive_group()
group.add_argument(
"--exclude-resource-type", type=str, nargs="*",
help="Specify resource types to ignore!"
)
group.add_argument(
"--only", type=str, nargs="*",
help="Only upload the resource types provided here, " +
"e.g. only StructureDefinitions, CodeSystems and ValueSets"
)
parser.add_argument(
"--registry-url", type=str, default="https://packages.simplifier.net",
help="The FHIR registry url, Simplifier by default"
@@ -296,8 +319,6 @@ def download_packages(self, packages: List[str]) -> nx.DiGraph:
if not any(ignored):
dependency_graph.add_edge(dep, package)
packages_to_download.append(dep)
#else:
# self.log.warning(f"The package {dep} will not be uploaded, it is ignored.")
downloaded_packages.append(package)
self.log.debug("Packages downloaded with dependencies:")
for node in dependency_graph.nodes:
@@ -328,8 +349,29 @@ def download_untar_package(self, package_name: str) -> str:
for chunk in download_response.iter_content(chunk_size=8192):
download_fs.write(chunk)
self.log.debug(f"Downloaded to {download_path}")
with tarfile.open(download_path) as download_tar_fs:
download_tar_fs.extractall(extract_path)
try:
with tarfile.open(download_path) as download_tar_fs:
for tarinfo in download_tar_fs:
try:
extract_dir = os.path.dirname(tarinfo.path)
t_filename, t_ext = os.path.splitext(os.path.basename(tarinfo.path))
slug_filename = slugify(t_filename)
extract_filename = f"{slug_filename}{t_ext}"
extract_to_folder = os.path.join(extract_path, extract_dir)
os.makedirs(extract_to_folder, exist_ok=True)
extract_to = os.path.join(extract_to_folder, extract_filename)
with open(extract_to, "wb") as out_fp:
tar_br: BufferedReader
with download_tar_fs.extractfile(tarinfo) as tar_br:
out_fp.write(tar_br.read())
self.log.debug(f"Extracted {extract_to}")
except (tarfile.TarError, IOError, OSError):
logging.exception(f"Unhandled error extracting member '{tarinfo}' from {download_path}. " +
"Extraction will continue.")
continue
except (tarfile.TarError, IOError, OSError):
logging.exception(f"Unhandled error extracting archive {download_path}")
exit(1)
self.log.debug(f"Extracted to {extract_path}")
return extract_path
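
The member-by-member extraction introduced in this hunk replaces `extractall` so that each file can be written under a sanitised name. Below is a self-contained sketch of that idea under stated assumptions: `safe_name` and `extract_with_safe_names` are hypothetical helpers, and the regex slug is a stdlib stand-in for `slugify` (which transliterates characters such as `ö` rather than dropping them). Like the code in the commit, it does not guard against archive members whose paths escape the target directory.

```python
import io
import os
import re
import tarfile
import tempfile


def safe_name(filename: str) -> str:
    """Hypothetical stand-in for slugify: keep the extension, slug the stem."""
    stem, ext = os.path.splitext(filename)
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return f"{slug}{ext}"


def extract_with_safe_names(archive: tarfile.TarFile, extract_path: str) -> list:
    """Write every regular file of the archive under a sanitised file name."""
    written = []
    for tarinfo in archive:
        if not tarinfo.isfile():
            continue  # directories are created on demand below
        target_dir = os.path.join(extract_path, os.path.dirname(tarinfo.name))
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, safe_name(os.path.basename(tarinfo.name)))
        with archive.extractfile(tarinfo) as member, open(target, "wb") as out_fp:
            out_fp.write(member.read())
        written.append(target)
    return written


# Build a small in-memory archive with a non-ASCII file name for demonstration.
data = b'{"resourceType": "CodeSystem"}'
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="package/Größe Beispiel.json")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
buffer.seek(0)

with tempfile.TemporaryDirectory() as tmp, tarfile.open(fileobj=buffer) as tar:
    for path in extract_with_safe_names(tar, tmp):
        print(os.path.relpath(path, tmp))
```

Writing each member explicitly also makes it possible to log and skip a single failing file instead of aborting the whole package.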

@@ -392,14 +434,21 @@ def upload_resources(self, dependency_graph: nx.DiGraph):
# topological sort only returns the node name as str
package_dir = node_with_info["path"]
self.log.debug("Uploading package '%s' files from package directory: %s", package_node, package_dir)
self.log.debug("Uploading package '%s' files from package directory: %s", package_node, package_dir)
fhir_files = []
package_json = self.read_package_json(package_dir)
if package_json is None:
raise FileNotFoundError(f"package.json was not found within {package_dir}!")
package_version = package_json["version"]
for (directory_path, _, filenames) in os.walk(package_dir):
file_name: str
for file_name in filenames:
if file_name == "package.json":
if os.path.basename(directory_path) == "other": # other directory SHALL be ignored
# https://wiki.hl7.org/FHIR_NPM_Package_Spec#Format
continue
if file_name == "package.json" or file_name == "index.json":
continue
elif file_name.endswith(".sch"): # Schematron rules, not a FHIR resource
continue
full_path = os.path.join(directory_path, file_name)
encoded_path = full_path.encode('utf-8', 'surrogateescape').decode('utf-8', 'replace')
@@ -410,11 +459,13 @@ def upload_resources(self, dependency_graph: nx.DiGraph):
try:
fhir_resource = FhirResource(encoded_path, package_version, self.args.only_put,
self.args.versioned_ids)
if self.args.exclude_resource_type is not None \
and fhir_resource.resource_type.lower() in self.args.exclude_resource_type:
r_type = fhir_resource.resource_type.lower()
if (r_type in self.args.exclude_resource_type) or (
len(self.args.only) != 0 and r_type not in self.args.only):
self.log.debug(
f"Resource {encoded_path} is of resource type {fhir_resource.resource_type}" +
f"Resource {encoded_path} is of resource type {r_type}" +
f" and is skipped.")
continue
else:
fhir_files.append(fhir_resource)
except (LookupError, json.decoder.JSONDecodeError):
@@ -457,11 +508,12 @@ def upload_resources(self, dependency_graph: nx.DiGraph):
method=request_method,
url=upload_url,
headers={
"Content-Type": content_type
"Content-Type": content_type,
"Accept": "application/json"
},
data=payload
).prepare()
self.log.debug(f"uploading to {upload_url} (content type: {content_type})")
self.log.info(f"uploading to {upload_url} (content type: {content_type})")
upload_result = self.request_session.send(upload_request)
if 200 <= upload_result.status_code < 300:
self.log.debug(f"uploaded {fhir_file.resource_type} with status {upload_result.status_code}")
4 changes: 4 additions & 0 deletions src/fhir_populator/testmain.py
@@ -0,0 +1,4 @@
from populator import Populator

if __name__ == "__main__":
Populator().populate()
