Add spatial decomposition #175

Closed
wants to merge 5 commits into from
17 changes: 17 additions & 0 deletions src/tasks/spatial_decomposition/README.md
@@ -0,0 +1,17 @@
# Spatial Decomposition/Deconvolution

Spatial decomposition (also often referred to as spatial deconvolution) is
applicable to spatial transcriptomics data where the transcription profile of
each capture location (spot, voxel, bead, etc.) does not share a bijective
relationship with the cells in the tissue, i.e., multiple cells may contribute
to the same capture location. The task of spatial decomposition then refers to
estimating the composition of cell types/states that are present at each capture
location. The cell type/state estimates are presented as proportion values,
representing the proportion of the cells at each capture location that belong to
a given cell type.

We distinguish between _reference-based_ decomposition and _de novo_
decomposition, where the former leverages external data (e.g., scRNA-seq or
scNuc-seq) to guide the inference process, while the latter works only with the
spatial data. We require that all datasets have an associated reference
single-cell dataset, but methods are free to ignore this information.
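
To make the notion of proportion values concrete, here is a minimal sketch that turns per-location cell counts into proportions. The function name, the cell type names, and the toy counts are all hypothetical, and plain Python lists stand in for the arrays a real dataset would use.

```python
def counts_to_proportions(cell_counts):
    """Convert per-location cell counts into proportions (each row sums to 1)."""
    proportions = []
    for row in cell_counts:
        total = sum(row)
        if total == 0:
            # Assumption: a capture location without cells gets all-zero proportions.
            proportions.append([0.0 for _ in row])
        else:
            proportions.append([count / total for count in row])
    return proportions


# Hypothetical toy data: 3 capture locations (rows) x 3 cell types (columns).
cell_type_names = ["alpha", "beta", "delta"]
cell_counts = [
    [2, 2, 0],
    [0, 5, 5],
    [1, 0, 3],
]
proportions = counts_to_proportions(cell_counts)
```

Each row of `proportions` corresponds to one capture location and each column to one entry of `cell_type_names`.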
8 changes: 8 additions & 0 deletions src/tasks/spatial_decomposition/api/README.md
@@ -0,0 +1,8 @@
# Component and file format specifications

This folder contains specifications for file formats and component
interfaces.

These are not only used for documentation (i.e. to document the file
format of inputs and outputs of a component), but also for unit testing
and validation of output files.
8 changes: 8 additions & 0 deletions src/tasks/spatial_decomposition/api/README.qmd
@@ -0,0 +1,8 @@
---
title: Component and file format specifications
format: gfm
---

This folder contains specifications for file formats and component interfaces.

These are not only used for documentation (i.e. to document the file format of inputs and outputs of a component), but also for unit testing and validation of output files.
37 changes: 37 additions & 0 deletions src/tasks/spatial_decomposition/api/comp_control_method.yaml
@@ -0,0 +1,37 @@
functionality:
  namespace: "spatial_decomposition/control_methods"
  info:
    type: control_method
    type_info:
      label: Control method
      summary: Quality control methods for verifying the pipeline.
      description: |
        Control methods have the same interface as the regular methods
        but also receive the solution object as input. They serve as a
        starting point to test the relative accuracy of new methods in
        the task, and also as a quality control for the metrics defined
        in the task.
  arguments:
    - name: "--input_single_cell"
      __merge__: anndata_single_cell.yaml
      direction: input
      required: true
    - name: "--input_spatial_masked"
      __merge__: anndata_spatial_masked.yaml
      direction: input
      required: true
    - name: "--input_solution"
      __merge__: anndata_solution.yaml
      direction: input
      required: true
    - name: "--output"
      __merge__: anndata_output.yaml
      direction: output
      required: true
  test_resources:
    - type: python_script
      path: /src/common/comp_tests/check_method_config.py
    - type: python_script
      path: /src/common/comp_tests/run_and_check_adata.py
    - path: /resources_test/spatial_decomposition/pancreas
      dest: resources_test/spatial_decomposition/pancreas
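
As an illustration of the control-method interface above, the following sketch shows a positive control that receives the solution and returns its true proportions unchanged. It is not a component from this PR: plain dictionaries stand in for AnnData objects, and the function name and the `method_id` value are hypothetical.

```python
def true_proportions_control(
    input_single_cell,
    input_spatial_masked,
    input_solution,
    method_id="true_proportions",  # hypothetical identifier
):
    """Hypothetical positive control: copy the true proportions into the output."""
    # The single-cell reference is part of the interface but is not needed here.
    return {
        "layers": {"counts": input_spatial_masked["layers"]["counts"]},
        "obsm": {
            "coordinates": input_spatial_masked["obsm"]["coordinates"],
            "proportions": [list(row) for row in input_solution["obsm"]["proportions"]],
        },
        "uns": {
            "cell_type_names": input_spatial_masked["uns"]["cell_type_names"],
            "dataset_id": input_spatial_masked["uns"]["dataset_id"],
            "method_id": method_id,
        },
    }


# Toy stand-ins for the masked spatial data and its solution (2 spots, 2 cell types).
masked = {
    "layers": {"counts": [[3, 0], [1, 4]]},
    "obsm": {"coordinates": [[0.0, 0.0], [1.0, 0.0]]},
    "uns": {"cell_type_names": ["alpha", "beta"], "dataset_id": "toy"},
}
solution = {
    "layers": masked["layers"],
    "obsm": {
        "coordinates": masked["obsm"]["coordinates"],
        "proportions": [[1.0, 0.0], [0.25, 0.75]],
    },
    "uns": masked["uns"],
}
output = true_proportions_control({}, masked, solution)
```

A control of this kind should score at or near the best achievable value, which makes it a quick sanity check for a newly added metric.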
28 changes: 28 additions & 0 deletions src/tasks/spatial_decomposition/api/comp_method.yaml
@@ -0,0 +1,28 @@
functionality:
  namespace: "spatial_decomposition/methods"
  info:
    type: method
    type_info:
      label: Method
      summary: A spatial decomposition method.
      description: "Method to estimate cell type proportions from spatial and single-cell data"
  arguments:
    - name: "--input_single_cell"
      __merge__: anndata_single_cell.yaml
      direction: input
      required: true
    - name: "--input_spatial"
      __merge__: anndata_spatial_masked.yaml
      direction: input
      required: true
    - name: "--output"
      __merge__: anndata_output.yaml
      direction: output
      required: true
  test_resources:
    - type: python_script
      path: /src/common/comp_tests/check_method_config.py
    - type: python_script
      path: /src/common/comp_tests/run_and_check_adata.py
    - path: /resources_test/spatial_decomposition/pancreas
      dest: resources_test/spatial_decomposition/pancreas
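
The method interface above takes two inputs and produces one output. A minimal sketch of a trivial baseline that fits this shape could look as follows. It assigns the same proportion to every cell type and ignores the single-cell reference. Plain dictionaries stand in for AnnData objects, and the function name and `method_id` value are hypothetical rather than taken from this PR.

```python
def uniform_baseline(input_single_cell, input_spatial, method_id="uniform_baseline"):
    """Hypothetical baseline: every cell type gets the same proportion at every spot."""
    cell_type_names = input_spatial["uns"]["cell_type_names"]
    n_spots = len(input_spatial["obsm"]["coordinates"])
    n_types = len(cell_type_names)
    # One row per spot, one column per cell type; each row sums to 1.
    proportions = [[1.0 / n_types] * n_types for _ in range(n_spots)]
    return {
        "layers": {"counts": input_spatial["layers"]["counts"]},
        "obsm": {
            "coordinates": input_spatial["obsm"]["coordinates"],
            "proportions": proportions,
        },
        "uns": {
            "cell_type_names": cell_type_names,
            "dataset_id": input_spatial["uns"]["dataset_id"],
            "method_id": method_id,
        },
    }


# Toy masked spatial input: 3 spots, 2 cell types.
spatial = {
    "layers": {"counts": [[3, 0], [1, 4], [2, 2]]},
    "obsm": {"coordinates": [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]},
    "uns": {"cell_type_names": ["alpha", "beta"], "dataset_id": "toy"},
}
output = uniform_baseline({}, spatial)
```

The returned dictionary mirrors the slots an output file is expected to carry: `counts`, `coordinates`, `proportions`, `cell_type_names`, `dataset_id` and `method_id`.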
30 changes: 30 additions & 0 deletions src/tasks/spatial_decomposition/api/comp_metric.yaml
@@ -0,0 +1,30 @@
functionality:
  namespace: "spatial_decomposition/metrics"
  info:
    type: metric
    type_info:
      label: Metric
      summary: A spatial decomposition metric.
      description: |
        A metric for evaluating the accuracy of cell type proportion estimates.
  arguments:
    - name: "--input_method"
      __merge__: anndata_output.yaml
      direction: input
      required: true
    - name: "--input_solution"
      __merge__: anndata_solution.yaml
      direction: input
      required: true
    - name: "--output"
      __merge__: anndata_score.yaml
      direction: output
      required: true
  test_resources:
    - type: python_script
      path: /src/common/comp_tests/check_metric_config.py
    - type: python_script
      path: /src/common/comp_tests/run_and_check_adata.py
    - path: /resources_test/spatial_decomposition/pancreas
      dest: resources_test/spatial_decomposition/pancreas
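
To illustrate the metric interface above, here is a sketch of a metric that compares estimated and true proportions with a root-mean-squared error. The choice of RMSE, the function name, and the `rmse` identifier are assumptions made for illustration; the spec itself does not name a metric. Plain dictionaries stand in for AnnData objects.

```python
import math


def rmse_metric(input_method, input_solution):
    """Hypothetical metric: RMSE over all entries of the proportions matrices."""
    estimated = input_method["obsm"]["proportions"]
    true = input_solution["obsm"]["proportions"]
    if len(estimated) != len(true):
        raise ValueError("estimated and true proportions differ in number of spots")
    squared_errors = [
        (e - t) ** 2
        for est_row, true_row in zip(estimated, true)
        for e, t in zip(est_row, true_row)
    ]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    # Score layout: parallel lists of metric identifiers and metric values.
    return {
        "uns": {
            "dataset_id": input_method["uns"]["dataset_id"],
            "method_id": input_method["uns"]["method_id"],
            "metric_ids": ["rmse"],
            "metric_values": [rmse],
        }
    }


# Toy data: 2 spots, 2 cell types.
method_output = {
    "obsm": {"proportions": [[0.5, 0.5], [1.0, 0.0]]},
    "uns": {"dataset_id": "toy", "method_id": "toy_method"},
}
solution = {"obsm": {"proportions": [[1.0, 0.0], [1.0, 0.0]]}}
score = rmse_metric(method_output, solution)
```

Lower values are better for this particular sketch; a real metric component would need to state its direction explicitly.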

31 changes: 31 additions & 0 deletions src/tasks/spatial_decomposition/api/comp_process_dataset.yaml
@@ -0,0 +1,31 @@
functionality:
  namespace: "spatial_decomposition"
  info:
    type: process_dataset
    type_info:
      label: Data processor
      summary: A spatial decomposition dataset processor.
      description: |
        Prepare a common dataset for the spatial_decomposition task.
  arguments:
    - name: "--input"
      __merge__: /src/datasets/api/anndata_common_dataset.yaml
      direction: input
      required: true
    - name: "--output_single_cell"
      __merge__: anndata_single_cell.yaml
      direction: output
      required: true
    - name: "--output_spatial_masked"
      __merge__: anndata_spatial_masked.yaml
      direction: output
      required: true
    - name: "--output_solution"
      __merge__: anndata_solution.yaml
      direction: output
      required: true
  test_resources:
    - type: python_script
      path: /src/common/comp_tests/run_and_check_adata.py
    - path: /resources_test/common/pancreas
      dest: resources_test/common/pancreas
35 changes: 35 additions & 0 deletions src/tasks/spatial_decomposition/api/file_output.yaml
@@ -0,0 +1,35 @@
type: file
example: "resources_test/spatial_decomposition/pancreas/anndata_output.h5ad"
info:
  label: Output
  summary: "Spatial data with estimated proportions."
  description: "Spatial data file with estimated cell type proportions."
  slots:
    layers:
      - type: integer
        name: counts
        description: Raw counts
        required: true
    obsm:
      - type: double
        name: coordinates
        description: XY coordinates for each spot
        required: true
      - type: double
        name: proportions
        description: Estimated cell type proportions for each spot
        required: true
    uns:
      - type: string
        name: cell_type_names
        description: Cell type names corresponding to columns of `proportions`
        required: true
      - type: string
        name: dataset_id
        description: "A unique identifier for the dataset"
        required: true
      - type: string
        name: method_id
        description: "A unique identifier for the method"
        required: true
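
The relationship between `proportions` and `cell_type_names` in this specification can be checked mechanically. The sketch below is a hypothetical validator, not one of the repository's test scripts. It works on a plain dictionary standing in for the AnnData file, and it assumes that each row of `proportions` is non-negative and sums to 1, which the specification does not state explicitly.

```python
def check_output(output, tol=1e-6):
    """Return a list of problems found in a dict-based stand-in for the output file."""
    problems = []
    names = output["uns"]["cell_type_names"]
    proportions = output["obsm"]["proportions"]
    coordinates = output["obsm"]["coordinates"]
    if len(proportions) != len(coordinates):
        problems.append("proportions and coordinates differ in number of spots")
    for i, row in enumerate(proportions):
        # One column per entry of cell_type_names.
        if len(row) != len(names):
            problems.append(f"spot {i}: expected {len(names)} values, got {len(row)}")
        # Assumption: proportions are non-negative and sum to 1 per spot.
        if any(value < 0 for value in row):
            problems.append(f"spot {i}: negative proportion")
        elif abs(sum(row) - 1.0) > tol:
            problems.append(f"spot {i}: proportions sum to {sum(row)}, not 1")
    return problems


good = {
    "obsm": {
        "coordinates": [[0.0, 0.0], [1.0, 0.0]],
        "proportions": [[0.5, 0.5], [1.0, 0.0]],
    },
    "uns": {"cell_type_names": ["alpha", "beta"]},
}
bad = {
    "obsm": {
        "coordinates": [[0.0, 0.0], [1.0, 0.0]],
        "proportions": [[0.5, 0.6], [1.0]],
    },
    "uns": {"cell_type_names": ["alpha", "beta"]},
}
good_problems = check_output(good)
bad_problems = check_output(bad)
```

Returning a list of messages instead of raising on the first failure makes it easier to report every violation in one pass.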

25 changes: 25 additions & 0 deletions src/tasks/spatial_decomposition/api/file_score.yaml
@@ -0,0 +1,25 @@
type: file
example: "resources_test/spatial_decomposition/pancreas/anndata_score.h5ad"
info:
  label: "Score"
  summary: Metric score file.
  slots:
    uns:
      - type: string
        name: dataset_id
        description: "A unique identifier for the dataset"
        required: true
      - type: string
        name: method_id
        description: "A unique identifier for the method"
        required: true
      - type: string
        name: metric_ids
        description: "One or more unique metric identifiers"
        multiple: true
        required: true
      - type: double
        name: metric_values
        description: "The metric values obtained for the given prediction. Must be of same length as 'metric_ids'."
        multiple: true
        required: true
25 changes: 25 additions & 0 deletions src/tasks/spatial_decomposition/api/file_single_cell.yaml
@@ -0,0 +1,25 @@
type: file
example: "resources_test/spatial_decomposition/pancreas/anndata_single_cell.h5ad"
info:
  label: "Single cell data"
  summary: "The single cell data file"
  slots:
    layers:
      - type: integer
        name: counts
        description: Raw counts
        required: true
    obs:
      - type: integer
        name: label
        description: Cell type label IDs
        required: true
    uns:
      - type: string
        name: cell_type_names
        description: Cell type names corresponding to values in `label`
        required: true
      - type: string
        name: dataset_id
        description: "A unique identifier for the dataset"
        required: true
29 changes: 29 additions & 0 deletions src/tasks/spatial_decomposition/api/file_solution.yaml
@@ -0,0 +1,29 @@
type: file
example: "resources_test/spatial_decomposition/pancreas/anndata_solution.h5ad"
info:
  label: Solution
  summary: "The solution spatial data file"
  slots:
    layers:
      - type: integer
        name: counts
        description: Raw counts
        required: true
    obsm:
      - type: double
        name: coordinates
        description: XY coordinates for each spot
        required: true
      - type: double
        name: proportions
        description: True cell type proportions for each spot
        required: true
    uns:
      - type: string
        name: cell_type_names
        description: Cell type names corresponding to columns of `proportions`
        required: true
      - type: string
        name: dataset_id
        description: "A unique identifier for the dataset"
        required: true
25 changes: 25 additions & 0 deletions src/tasks/spatial_decomposition/api/file_spatial_masked.yaml
@@ -0,0 +1,25 @@
type: file
example: "resources_test/spatial_decomposition/pancreas/anndata_spatial_masked.h5ad"
info:
  label: "Masked"
  summary: "The masked spatial data file"
  slots:
    layers:
      - type: integer
        name: counts
        description: Raw counts
        required: true
    obsm:
      - type: double
        name: coordinates
        description: XY coordinates for each spot
        required: true
    uns:
      - type: string
        name: cell_type_names
        description: Cell type names corresponding to columns of `proportions` in output
        required: true
      - type: string
        name: dataset_id
        description: "A unique identifier for the dataset"
        required: true
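
Compared with the solution file specification, the only slot the masked file lacks is the `proportions` entry in `obsm`. A minimal sketch of that masking step is shown below. Whether the actual dataset processor in this PR works this way is an assumption; the function name is hypothetical and a plain dictionary stands in for the AnnData object.

```python
import copy


def mask_solution(solution):
    """Hypothetical masking step: drop the true proportions, keep everything else."""
    masked = copy.deepcopy(solution)  # leave the solution itself untouched
    masked["obsm"].pop("proportions", None)
    return masked


# Toy solution: 2 spots, 2 cell types.
solution = {
    "layers": {"counts": [[3, 0], [1, 4]]},
    "obsm": {
        "coordinates": [[0.0, 0.0], [1.0, 0.0]],
        "proportions": [[1.0, 0.0], [0.25, 0.75]],
    },
    "uns": {"cell_type_names": ["alpha", "beta"], "dataset_id": "toy"},
}
masked = mask_solution(solution)
```

Working on a deep copy keeps the true proportions available for the metrics while methods only ever see the masked version.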
26 changes: 26 additions & 0 deletions src/tasks/spatial_decomposition/api/task_info.yaml
@@ -0,0 +1,26 @@
name: spatial_decomposition
label: "Spatial decomposition"
summary: "Estimation of cell type proportions per spot in 2D space from spatial transcriptomic data coupled with corresponding single-cell data"
motivation: |
  Spatial decomposition (also often referred to as spatial deconvolution) is
  applicable to spatial transcriptomics data where the transcription profile of
  each capture location (spot, voxel, bead, etc.) does not share a bijective
  relationship with the cells in the tissue, i.e., multiple cells may contribute
  to the same capture location. The task of spatial decomposition then refers to
  estimating the composition of cell types/states that are present at each capture
  location. The cell type/state estimates are presented as proportion values,
  representing the proportion of the cells at each capture location that belong to
  a given cell type.
description: |
  We distinguish between _reference-based_ decomposition and _de novo_
  decomposition, where the former leverages external data (e.g., scRNA-seq or
  scNuc-seq) to guide the inference process, while the latter works only with the
  spatial data. We require that all datasets have an associated reference
  single-cell dataset, but methods are free to ignore this information.
authors:
  - name: "Giovanni Palla"
    roles: [ author, maintainer ]
    info: { github: giovp }
  - name: "Scott Gigante"
    roles: [ maintainer ]
    info: { github: scottgigante, orcid: "0000-0002-4544-2764" }