Merge pull request #50 from OpenEnergyPlatform/release-v0.4.0
Release v0.4.0
jh-RLI authored Apr 16, 2024
2 parents 0a8253e + f18778a commit 7f173ea
Showing 14 changed files with 190 additions and 94 deletions.
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -42,6 +42,23 @@ Template:

______________________________________________________________________

## [v0.4.0] - 2024-04-16

### Added

- Metadata upload in case of single table in OEM [#21](https://github.com/OpenEnergyPlatform/oem2orm/pull/21)

### Changed

- Reworked repo file structure & extended documentation [#49](https://github.com/OpenEnergyPlatform/oem2orm/pull/49)

### Fixed

- oedialect version to v0.1.1 [#47](https://github.com/OpenEnergyPlatform/oem2orm/pull/47)
- sqlalchemy version to v1.3.16 [#47](https://github.com/OpenEnergyPlatform/oem2orm/pull/47)

______________________________________________________________________

## [v0.3.3] - 2024-04-15

### Added
122 changes: 53 additions & 69 deletions README.md
@@ -1,8 +1,8 @@
# OEM to ORM
# oem 2 orm

Create database tables (and schema) from oemetadata json file(s)
Create database tables (and schema) from oemetadata json file(s). This tool is part of the open-energy-metadata (OEM) integration into the [OEP](https://openenergyplatform.org/).

## Installation:
## Installation

You can install the package using the standard Python installation:
`
@@ -15,94 +15,78 @@ pipx install oem2orm
`
see [Pipx-Documentation](https://pypa.github.io/pipx/) for further information.

## Usage

## Usage:
Read the Restrictions section and have a look at our [tutorial](./tutorial/USAGE.md) section to learn how to use oem2orm either as a code module or as a CLI tool. The tutorials also explain how to validate your oemetadata files.

This tool is part of the open-energy-metadata (OEM) integration into the [OEP](https://openenergyplatform.org/).
To use this tool with the OEP API you need to be signed up to the OEP since
you need to provide an API-Token.
### Restrictions

If you want to upload OEM that was officially reviewed you must clone the
OEP data-preprocessing repository on [GitHub](https://github.com/OpenEnergyPlatform/data-preprocessing).
The data-review folder contains all of the successfully reviewed OEM files.
To use this tool with the OEP API you need to be signed up to the OEP since
you need to provide an API-Token.

For security reasons, tables can only be created in existing
For security reasons, tables can only be created in existing
schemas and just in the schemas "model_draft" and "sandbox".

Keep in mind the current state is not fully tested. The code is
still quit error prone f.e. the postgres types (column datatype) are not fully
Keep in mind that, for example, the postgres types (column datatype) are not fully
supported by [oedialect](https://pypi.org/project/oedialect/). This is work in progress.

### Terminal/CLI-Application
Step-by-Step:
0. pip and python have to be installed and setup on your machine
1. Create env from requirements.txt, and activate
2. Put the metadata file in the folder metadata or put your own folder in this
directory
3. execute the following in a terminal:
```
pipx install oem2orm
oem2orm
Enter metadata folder name:
...
```
4. Provide credentials and folder name in prompt
5. The table will be created
## Docs

### Import as Module
### Database connection

You can simply import this module in your Python script.py like this:
We use a global namedtuple called "DB" to store the sqlalchemy connection objects engine and metadata.
The namedtuple is available when you import oem2orm in a script. To set up the namedtuple, call the function
setup_db_connection(). You can then use DB.engine or DB.metadata. In the background, the connection is established
using oedialect and the HTTP API of the oeplatform website.
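
The namedtuple pattern described above can be sketched roughly as follows. This is only an illustration using the standard library: `setup_db_connection_sketch` is a hypothetical helper, and the placeholder strings stand in for the real SQLAlchemy engine and metadata objects that oem2orm would store.

```python
from collections import namedtuple

# Hypothetical stand-in for the "DB" namedtuple described above.
DB = namedtuple("DB", ["engine", "metadata"])


def setup_db_connection_sketch(engine, metadata):
    """Bundle the two connection objects so they can be passed around together."""
    return DB(engine=engine, metadata=metadata)


# Placeholder strings are used here instead of a real engine and MetaData object.
db = setup_db_connection_sketch("engine-placeholder", "metadata-placeholder")
print(db.engine)
print(db.metadata)
```

Bundling both objects in one immutable tuple means functions only need a single `db` argument, and the fields stay accessible by name.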

```python
from oem2orm import oep_oedialect_oem2orm as oem2orm
```
### oem2orm generator

Now just call the functions provided in oem2orm like this:
The table objects (ORM) are generated on the fly from an oemetadata.json file. [oemetadata](https://github.com/OpenEnergyPlatform/oemetadata) is a metadata specification of the Open Energy Family. It includes about 50 fields that can be used to provide metadata for tabular data resources.
A subset of these fields is grouped under the key "resources" ([see our example](https://github.com/OpenEnergyPlatform/oemetadata/blob/develop/metadata/v160/example.json#L237-L388)) in the metadata. These fields describe the schema of
the data table (like table name, columns, data types & table relations).
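
To make the role of the "resources" key more concrete, here is a minimal sketch that reads table and column information from a metadata dictionary. The trimmed `metadata` document and the `describe_tables` helper are hypothetical. The nested key names (`schema`, `fields`, `primaryKey`) and the `schema.table` naming of a resource are assumptions modelled on the linked example, not a description of how oem2orm itself parses the file.

```python
# Hypothetical, heavily trimmed metadata document (illustration only).
metadata = {
    "name": "example_table",
    "resources": [
        {
            "name": "model_draft.example_table",
            "schema": {
                "fields": [
                    {"name": "id", "type": "bigint", "description": "Unique identifier"},
                    {"name": "year", "type": "integer", "description": "Reference year"},
                ],
                "primaryKey": ["id"],
            },
        }
    ],
}


def describe_tables(oemetadata):
    """Return one plain dict per resource with schema, table and column info."""
    tables = []
    for resource in oemetadata.get("resources", []):
        # Assumption: the resource name has the form "<schema>.<table>".
        schema_name, _, table_name = resource["name"].rpartition(".")
        table_schema = resource.get("schema", {})
        primary_keys = table_schema.get("primaryKey", [])
        columns = [
            {
                "name": field["name"],
                "type": field["type"],
                "primary_key": field["name"] in primary_keys,
            }
            for field in table_schema.get("fields", [])
        ]
        tables.append({"schema": schema_name, "table": table_name, "columns": columns})
    return tables


tables = describe_tables(metadata)
```

A generator like oem2orm would then turn each of these plain descriptions into an SQLAlchemy table object.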

Recommended execution order:
- Setup the logger
```python
oem2orm.setup_logger()
```
oem2orm provides a method to create data tables on the OEP. It is especially useful if you want to automate the table creation and already use Python or already have an oemetadata file available. The alternatives are:

- Setup the Database API connection as Namedtuple storing the SQLAlchemy engine and metadata:
```python
db = oem2orm.setup_db_connection()
```
1. [Manually describe](https://openenergyplatform.github.io/academy/tutorials/01_api/02_api_upload/#create-table) the table object in JSON and then use the OEP HTTP API directly to create a table.
2. Use the [User Interface of the oeplatform website](https://openenergyplatform.org/dataedit/wizard/) to create a table and upload data.

- Provide the oem files in a folder (in the current directory).
- Pass the folder name to the function:
```python
metadata_folder = oem2orm.select_oem_dir(oem_folder_name="folder_name")
```
### Oemetadata format

- Setup a SQLAlchemy ORM including all data-model in the provided oem files:
```python
orm = oem2orm.collect_ordered_tables_from_oem(db, metadata_folder)
```
[Specification for the oemetadata](https://github.com/OpenEnergyPlatform/oemetadata)

- Create the tables on the Database:
```python
oem2orm.create_tables(db, orm)
```
#### Oemetadata validation

- Delete all tables that have been created (all tables available in sa.metadata)
```python
oem2orm.delete_tables(db, orm)
```
The oemetadata specification is integrated into the open energy platform using a tool called [omi (metadata integration)](https://github.com/OpenEnergyPlatform/omi). OMI provides functionality to run validation checks on the metadata up to the oemetadata version 1.6.0. oem2orm also provides a minimal oep compliance check that mocks the checks that are run on the oep website once the metadata is uploaded to a table.

## Docs:
#### Supported column data types

### Database connection
We use a global namedtuple called "DB" To store the sqlalchemy connection objects engine and metadata.
The namedtuple is available wen import oem2orm in a script. To establish the namedtuple use the function
setup_db_connection(). Now you can use DB.engine or DB.metadata.

### oem2orm generator
Currently oem2orm supports

#### Supported datatypes
"bigint"
"int"
"integer"
"varchar"
"json"
"text"
"timestamp"
"interval"
"string"
"float"
"boolean"
"date"
"hstore"
"decimal"
"numeric"
"double precision"

#### Spatial Types
We create columns with spatial datatypes using Geoalchemy2.

"geometry point": Geometry("POINT", spatial_index=False),
"geom": Geometry("GEOMETRY", spatial_index=False),
"geometry": Geometry("GEOMETRY", spatial_index=False),

We create columns with spatial datatypes using Geoalchemy2.
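
As a rough illustration of how a declared type name might be resolved against such a list, consider the sketch below. It is not the oem2orm implementation: `SUPPORTED_TYPES` holds plain string labels instead of SQLAlchemy or Geoalchemy2 types, `resolve_column_type` is a hypothetical helper, and the whitespace and case normalisation is an assumption.

```python
# Stand-in labels; the real mapping points to SQLAlchemy / Geoalchemy2 types.
SUPPORTED_TYPES = {
    "bigint": "BIGINT",
    "integer": "INTEGER",
    "text": "TEXT",
    "geometry": "GEOMETRY",
}


def resolve_column_type(type_name):
    """Look up a declared type name and fail loudly if it is not supported."""
    # Assumption: names are matched case-insensitively, ignoring outer whitespace.
    key = type_name.strip().lower()
    if key not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported column data type: {type_name!r}")
    return SUPPORTED_TYPES[key]


print(resolve_column_type("bigint"))
```

Raising an error for unknown names surfaces unsupported types before any table is created, rather than midway through an upload.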

## Database support

We only tested this tool with PostgreSQL & sqlalchemy version 1.3
11 changes: 7 additions & 4 deletions oem2orm/main.py
@@ -1,4 +1,3 @@

import click
import pathlib

@@ -11,22 +10,26 @@ def cli():


@cli.command()
@click.argument('metadata-folder', type=click.Path(exists=True))
@click.argument("metadata-folder", type=click.Path(exists=True))
def create_tables(metadata_folder):
    db = oep_oedialect_oem2orm.setup_db_connection()
    folder = pathlib.Path.cwd() / metadata_folder
    tables = oep_oedialect_oem2orm.collect_tables_from_oem(db, folder)
    oep_oedialect_oem2orm.create_tables(db, tables)
    if len(tables) == 1:
        # Upload metadata for single table
        metadata = oep_oedialect_oem2orm.mdToDict(metadata_folder)
        oep_oedialect_oem2orm.api_updateMdOnTable(metadata)


@cli.command()
@click.argument('metadata-folder', type=click.Path(exists=True))
@click.argument("metadata-folder", type=click.Path(exists=True))
def delete_tables(metadata_folder):
    db = oep_oedialect_oem2orm.setup_db_connection()
    folder = pathlib.Path.cwd() / metadata_folder
    tables = oep_oedialect_oem2orm.collect_tables_from_oem(db, folder)
    oep_oedialect_oem2orm.delete_tables(db, tables)


if __name__ == '__main__':
if __name__ == "__main__":
cli()
18 changes: 13 additions & 5 deletions oem2orm/oep_compliance.py
@@ -6,6 +6,7 @@
"""

import logging
import pathlib
import json
@@ -24,6 +25,7 @@

logger = logging.getLogger()


def read_input_json(file_path: pathlib.Path = "tests/data/metadata_v15.json"):
with open(file_path, "r", encoding="utf-8") as f:
jsn = json.load(f)
@@ -123,7 +125,9 @@ def check_oemetadata_is_oep_compatible(metadata):
# -------------------------------------------


def run_metadata_checks(oemetadata: dict = None, oemetadata_path: str = None, check_jsonschema: bool = False):
def run_metadata_checks(
oemetadata: dict = None, oemetadata_path: str = None, check_jsonschema: bool = False
):
"""
Runs metadata checks includes:
- basic oep compliant check - tested by using omi's parsing and compiling
@@ -165,10 +169,12 @@ def run_metadata_checks(oemetadata: dict = None, oemetadata_path: str = None, c
schema = parser_validation.get_schema_by_metadata_version(metadata=metadata)
result = parser_validation.is_valid(inp=metadata, schema=schema)
if result is False:
result = result, parser_validation.validate(metadata=metadata, save_report=True)

result = result, parser_validation.validate(
metadata=metadata, save_report=True
)

return result


if __name__ == "__main__":
correct_v15_test_data = "tests/data/metadata_v15.json"
@@ -177,7 +183,9 @@ def run_metadata_checks(oemetadata: dict = None, oemetadata_path: str = None, c

meta = read_input_json(file_path=correct_v15_test_data)
print("Check v15 metadata from file!")
result = run_metadata_checks(oemetadata_path=correct_v15_test_data, check_jsonschema=True)
result = run_metadata_checks(
oemetadata_path=correct_v15_test_data, check_jsonschema=True
)
print("Check v15 metadata from object!")
run_metadata_checks(oemetadata=meta)

13 changes: 9 additions & 4 deletions oem2orm/oep_oedialect_oem2orm.py
@@ -109,8 +109,10 @@ def create_tables(db: DB, tables: List[sa.Table]):
for table in tables:
logging.info(f"Working on table: {table}")
if not db.engine.dialect.has_schema(db.engine, table.schema):
error_msg = f'The provided database schema: "{table.schema}" does not exist. Please use an existing ' \
f'schema from the `name` column from: {OEP_URL}/dataedit/schemas'
error_msg = (
f'The provided database schema: "{table.schema}" does not exist. Please use an existing '
f"schema from the `name` column from: {OEP_URL}/dataedit/schemas"
)
logging.info(error_msg)
raise DatabaseError(error_msg)
else:
@@ -119,7 +121,9 @@ def create_tables(db: DB, tables: List[sa.Table]):
table.create(checkfirst=True)
logging.info(f"Created table {table.name}")
except oedialect.engine.ConnectionException as ce:
error_msg = f'Error when uploading table "{table.name}". Reason: {ce}.'
error_msg = (
f'Error when uploading table "{table.name}". Reason: {ce}.'
)
logging.error(error_msg)
raise DatabaseError(error_msg) from ce
except sa.exc.ProgrammingError as pe:
@@ -225,7 +229,8 @@ def create_tables_from_metadata_file(
column = sa.Column(
field["name"],
column_type,
primary_key=field["name"] in primary_keys,
primary_key=field["name"]
in primary_keys, # TODO: Should be fixed, see https://github.com/OpenEnergyPlatform/oedialect/issues/43
comment=field["description"],
)
columns.append(column)
Expand Down
17 changes: 7 additions & 10 deletions oem2orm/postgresql_types.py
@@ -1,7 +1,7 @@
__copyright__ = "Reiner Lemoine Institut"
__license__ = "GNU Affero General Public License Version 3 (AGPL-3.0)"
__url__ = "https://github.com/openego/data_processing/blob/master/LICENSE"
__author__ = "henhuy"
__license__ = "GNU Affero General Public License Version 3 (AGPL-3.0)"
__url__ = "https://github.com/openego/data_processing/blob/master/LICENSE"
__author__ = "henhuy"

import sqlalchemy as sa
import sqlalchemy.dialects.postgresql as psql
@@ -27,16 +27,13 @@ class DatabaseTypes:
"hstore": HSTORE,
"decimal": sa.DECIMAL,
"numeric": sa.NUMERIC,

# Spatial types
"geometry point": Geometry("POINT", spatial_index=False),
"geom": Geometry("GEOMETRY", spatial_index=False),
"geometry": Geometry("GEOMETRY", spatial_index=False),

"geometry point": Geometry("POINT", spatial_index=False),
"geom": Geometry("GEOMETRY", spatial_index=False),
"geometry": Geometry("GEOMETRY", spatial_index=False),
# not support with oedialect
"double precision": psql.DOUBLE_PRECISION
"double precision": psql.DOUBLE_PRECISION,
# "double precision array": sa.ARRAY("DOUBLE_PRECISION"),

}

def __getitem__(self, item):
4 changes: 2 additions & 2 deletions setup.py
@@ -8,7 +8,7 @@

setuptools.setup(
name="oem2orm",
version="0.3.3",
version="0.4.0",
author="henhuy, jh-RLI",
author_email="Hendrik.Huyskens@rl-institut.de",
description="SQLAlchemy module to generate ORM, read from data model (oedatamodel) in open-energy-metadata JSON format",
@@ -28,7 +28,7 @@
"Operating System :: OS Independent",
],
python_requires='>=3.6',
install_requires=['sqlalchemy==1.3.14', 'oedialect==1.1', 'requests', 'jmespath', 'omi', 'click'], # Optional
install_requires=['sqlalchemy==1.3.16', 'oedialect==0.1.1', 'requests', 'jmespath', 'omi', 'click'], # Optional
project_urls={ # Optional
'Bug Reports': 'https://github.com/OpenEnergyPlatform/oem2orm/issues',
'Source': 'https://github.com/OpenEnergyPlatform/oem2orm/tree/develop/oem2orm',