Tkakar/CAT-1100-add-container-for-reading-image-metadata (#145)
* Added container

* Added workflow

* Fixed container name in readme
tkakar authored Jan 29, 2025
1 parent 02b059a commit f97102a
Showing 14 changed files with 199 additions and 0 deletions.
26 changes: 26 additions & 0 deletions containers/ome-tiff-metadata/Dockerfile
@@ -0,0 +1,26 @@
# We just need to use --file to point at it, instead of assuming it is in context.

# Using Conda because pyarrow did not install easily on python base images.
FROM --platform=linux/amd64 continuumio/miniconda3:24.5.0-0

# For tiff packages
RUN apt-get --allow-releaseinfo-change update && \
    apt-get install -y gcc python3-dev libhdf5-dev pkg-config python3-numcodecs

RUN pip install --upgrade pip setuptools

COPY requirements-freeze.txt .
RUN pip install -r ./requirements-freeze.txt

# In development, you may want to pin a single dependency in requirements.txt,
# without throwing away the entire cache layer from requirements-freeze.txt.
# (But once it works, you should check in an updated freeze!)

COPY requirements.txt .
RUN pip install -r ./requirements.txt

COPY . .

CMD [ "python", "main.py", \
    "--input_dir", "/input", \
    "--output_dir", "/output" ]
25 changes: 25 additions & 0 deletions containers/ome-tiff-metadata/README.md
@@ -0,0 +1,25 @@
# ome-tiff-metadata

This Docker container creates a JSON object containing metadata (physical sizes and units) extracted from each TIFF in an input directory. The metadata is needed in particular when laying segmentation masks over base images: if the physical sizes of the two images differ, the mask will be misaligned. Note that misalignment can occur even when both images have the same pixel dimensions, as long as their physical sizes differ. The metadata is used to scale the segmentation mask when it is visualized with Vitessce.
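As a rough illustration of how this metadata could be used, the sketch below computes per-axis scale factors for a mask from two metadata objects with the keys this container emits. The function name `scale_factors` and the example values are hypothetical, and how Vitessce actually applies the scaling is not described here; treat this as an assumption-laden sketch rather than the portal's implementation.

```python
def scale_factors(mask_meta, base_meta):
    """Return (sx, sy), the factors by which mask pixel coordinates
    would be multiplied to line up with the base image.

    Hypothetical helper: assumes both dicts use the keys written by
    this container and that both images share the same units.
    """
    for axis in ("X", "Y"):
        key = f"PhysicalSizeUnit{axis}"
        if mask_meta[key] != base_meta[key]:
            raise ValueError(f"Unit mismatch on {axis}: "
                             f"{mask_meta[key]} vs {base_meta[key]}")
    return (
        mask_meta["PhysicalSizeX"] / base_meta["PhysicalSizeX"],
        mask_meta["PhysicalSizeY"] / base_meta["PhysicalSizeY"],
    )


# Made-up example values: one mask pixel spans 1.0 µm, one base pixel 0.5 µm,
# so each mask pixel covers two base pixels along each axis.
mask_meta = {"PhysicalSizeX": 1.0, "PhysicalSizeY": 1.0,
             "PhysicalSizeUnitX": "µm", "PhysicalSizeUnitY": "µm"}
base_meta = {"PhysicalSizeX": 0.5, "PhysicalSizeY": 0.5,
             "PhysicalSizeUnitX": "µm", "PhysicalSizeUnitY": "µm"}

factors = scale_factors(mask_meta, base_meta)
print(factors)
```

The unit check is deliberately strict: converting between units (for example nm to µm) is out of scope for this sketch.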

## Input

The input to the container is one or more OME-TIFF image files.

## Output

For every input OME-TIFF image, the output is a JSON file containing an object with the following structure:

```
{"PhysicalSizeX": 0.5, "PhysicalSizeY": 0.5, "PhysicalSizeUnitX": "\u00b5m", "PhysicalSizeUnitY": "\u00b5m"}
```
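The `\u00b5` in the unit strings is the JSON escape for the micro sign (µ), which is what `json.dumps` produces by default for non-ASCII characters. A minimal sketch of reading such an object back with the standard library follows; the variable names are illustrative only.

```python
import json

# The example object from above, as it appears on disk. The raw string keeps
# the backslash escape intact so that json.loads does the decoding.
raw = (r'{"PhysicalSizeX": 0.5, "PhysicalSizeY": 0.5, '
       r'"PhysicalSizeUnitX": "\u00b5m", "PhysicalSizeUnitY": "\u00b5m"}')

meta = json.loads(raw)
unit_x = meta["PhysicalSizeUnitX"]  # decoded to the two characters "µm"

# Re-serializing with the default ensure_ascii=True restores the escape;
# ensure_ascii=False would write the literal µ character instead.
round_tripped = json.dumps(meta)
readable = json.dumps(meta, ensure_ascii=False)
```

Either serialization parses to the same Python string, so consumers do not need to special-case the escaped form.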

## Normalization

None

## Example

An example of a HuBMAP dataset that uses this container to generate metadata for Vitessce visualization would be
`TODO`.
1 change: 1 addition & 0 deletions containers/ome-tiff-metadata/VERSION
@@ -0,0 +1 @@
0.0.1
72 changes: 72 additions & 0 deletions containers/ome-tiff-metadata/context/main.py
@@ -0,0 +1,72 @@
import argparse
from glob import glob
from pathlib import Path
from os import makedirs
from itertools import chain
import json
from ome_types import from_tiff


def get_metadata(tiff_file_path):
    extracted_metadata = {}
    try:
        ome_data = from_tiff(tiff_file_path)
        image = ome_data.images[0]
        pixels = image.pixels

        physical_size_x = pixels.physical_size_x
        physical_size_y = pixels.physical_size_y

        physical_size_unit_x = pixels.physical_size_x_unit
        physical_size_unit_y = pixels.physical_size_y_unit


        physical_size_unit_x = physical_size_unit_x.value if hasattr(physical_size_unit_x, 'value') else physical_size_unit_x
        physical_size_unit_y = physical_size_unit_y.value if hasattr(physical_size_unit_y, 'value') else physical_size_unit_y
        extracted_metadata = {
            "PhysicalSizeX": physical_size_x,
            "PhysicalSizeY": physical_size_y,
            "PhysicalSizeUnitX": physical_size_unit_x,
            "PhysicalSizeUnitY": physical_size_unit_y,
        }
        print(f"Extracted metadata: {extracted_metadata}")
    except FileNotFoundError:
        print(f"Error: The file {tiff_file_path} does not exist.")
    except IndexError:
        print("Error: The TIFF file does not contain any images.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

    return extracted_metadata


def main(input_dir, output_dir):
    makedirs(output_dir, exist_ok=True)

    # Find all OME.TIFFs in the input directory.
    tiffs = list(chain(input_dir.glob('**/*.ome.tif'), input_dir.glob('**/*.ome.tiff')))
    if not tiffs:
        raise Exception(f'No OME TIFFs found in {input_dir}')
    for input_path in tiffs:
        metadata = get_metadata(str(input_path))

        # Create output path for each OME.TIFF:
        new_output_dir = (output_dir / input_path.relative_to(input_dir)).parent
        new_output_dir.mkdir(parents=True, exist_ok=True)

        # Set output filename for JSON file and dump to disk:
        output_path = str(output_dir / input_path.relative_to(input_dir).with_suffix('').with_suffix(''))+'.metadata.json'
        with open(output_path, 'w') as f:
            f.write(json.dumps(metadata))

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Create a JSON metadata file for each OME-TIFF')
    parser.add_argument(
        '--input_dir', required=True, type=Path,
        help='Directory containing ome-tiff files to read')
    parser.add_argument(
        '--output_dir', required=True, type=Path,
        help='Directory where ome-tiff metadata should be written')
    args = parser.parse_args()
    main(args.input_dir, args.output_dir)
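
The output filename logic in `main` relies on calling `with_suffix('')` twice to strip the double extension `.ome.tif` or `.ome.tiff`. The sketch below isolates that step so its behaviour can be checked without any TIFF files; the helper name `metadata_output_path` and the example paths are hypothetical, and `PurePosixPath` is used only to keep the result platform-independent.

```python
from pathlib import PurePosixPath


def metadata_output_path(input_dir, output_dir, input_path):
    """Mirror the path arithmetic in main(): keep the sub-directory layout,
    drop the two-part '.ome.tif(f)' extension, append '.metadata.json'."""
    relative = input_path.relative_to(input_dir)
    # First call removes '.tiff' (or '.tif'), second removes '.ome'.
    stem = relative.with_suffix('').with_suffix('')
    return str(output_dir / stem) + '.metadata.json'


nested = metadata_output_path(
    PurePosixPath('/input'),
    PurePosixPath('/output'),
    PurePosixPath('/input/sub/image.ome.tiff'),
)

# Only the last two dot-separated parts are removed, so an extra dot in the
# base name survives.
dotted = metadata_output_path(
    PurePosixPath('/input'),
    PurePosixPath('/output'),
    PurePosixPath('/input/sample.v2.ome.tif'),
)
print(nested)
print(dotted)
```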
42 changes: 42 additions & 0 deletions containers/ome-tiff-metadata/context/requirements-freeze.txt
@@ -0,0 +1,42 @@
anaconda-anon-usage==0.4.4
annotated-types==0.7.0
archspec==0.2.3
boltons==23.0.0
Brotli==1.0.9
certifi==2024.6.2
cffi==1.16.0
charset-normalizer==2.0.4
conda==24.5.0
conda-content-trust==0.2.0
conda-libmamba-solver==24.1.0
conda-package-handling==2.3.0
conda_package_streaming==0.10.0
cryptography==42.0.5
distro==1.9.0
frozendict==2.4.2
idna==3.7
jsonpatch==1.33
jsonpointer==2.1
libmambapy==1.5.8
menuinst==2.1.1
ome-types==0.5.2
packaging==23.2
pip==24.2
platformdirs==3.10.0
pluggy==1.0.0
pycosat==0.6.6
pycparser==2.21
pydantic==2.10.5
pydantic-compat==0.1.2
pydantic_core==2.27.2
PySocks==1.7.1
requests==2.32.2
ruamel.yaml==0.17.21
setuptools==75.1.0
tqdm==4.66.4
truststore==0.8.0
typing_extensions==4.12.2
urllib3==2.2.2
wheel==0.43.0
xsdata==24.3.1
zstandard==0.22.0
1 change: 1 addition & 0 deletions containers/ome-tiff-metadata/context/requirements.txt
@@ -0,0 +1 @@
ome-types==0.5.2
Binary file not shown.
@@ -0,0 +1 @@
{"PhysicalSizeX": 0.5, "PhysicalSizeY": 0.5, "PhysicalSizeUnitX": "\u00b5m", "PhysicalSizeUnitY": "\u00b5m"}
7 changes: 7 additions & 0 deletions ome-tiff-metadata-manifest.json
@@ -0,0 +1,7 @@
[
  {
    "pattern": "image_metadata/(.*)\\.metadata\\.json",
    "description": "JSON object containing metadata about image physical size extracted from TIFF file to help in scaling for visualization.",
    "edam_ontology_term": "EDAM_1.24.format_3464"
  }
]
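
As a quick sanity check of the manifest `pattern`, the following sketch tests it against a path of the shape the CWL tool writes under `image_metadata`. Using `re.fullmatch` is an assumption made for illustration; how the consuming pipeline actually applies the pattern is not stated in this commit, and the example paths are made up.

```python
import re

# The pattern after JSON decoding: each "\\." in the manifest becomes "\.".
PATTERN = r"image_metadata/(.*)\.metadata\.json"

hit = re.fullmatch(PATTERN, "image_metadata/sub/image.metadata.json")
miss = re.fullmatch(PATTERN, "image_metadata/sub/image.offsets.json")

# The capture group holds the path relative to image_metadata, without the
# '.metadata.json' ending.
captured = hit.group(1) if hit else None
print(captured)
```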
19 changes: 19 additions & 0 deletions ome-tiff-metadata.cwl
@@ -0,0 +1,19 @@
#!/usr/bin/env cwl-runner

cwlVersion: v1.2
class: CommandLineTool
# TODO: Make main.py executable?
baseCommand: ['python', '/main.py', '--output_dir', 'image_metadata', '--input_dir']
hints:
  DockerRequirement:
    dockerPull: hubmap/portal-container-ome-tiff-metadata:0.0.1
inputs:
  input_directory:
    type: Directory
    inputBinding:
      position: 6
outputs:
  json:
    type: Directory
    outputBinding:
      glob: image_metadata
1 change: 1 addition & 0 deletions workflows/ome-tiff-metadata/README.md
@@ -0,0 +1 @@
See [containers/ome-tiff-metadata](https://github.com/hubmapconsortium/portal-containers/blob/master/containers/ome-tiff-metadata/README.md).
Binary file not shown.
3 changes: 3 additions & 0 deletions workflows/ome-tiff-metadata/test-job.yml
@@ -0,0 +1,3 @@
input_directory:
  class: Directory
  path: test-input
@@ -0,0 +1 @@
{"PhysicalSizeX": 0.5, "PhysicalSizeY": 0.5, "PhysicalSizeUnitX": "\u00b5m", "PhysicalSizeUnitY": "\u00b5m"}
