This repository contains code to reproduce analyses presented in the paper 'Identification of methylation-sensitive human transcription factors using meSMiLE-seq'.

Note

Prerequisites

Before running the code in this repo, you must obtain the rights to use the ProBound Suite (Rube et al., 'Prediction of protein-ligand binding affinity from sequencing data with interpretable machine learning', doi:10.1038/s41587-022-01307-0) and a material transfer agreement. To use the ProBound Suite from Python, the pyProBound operator package must also be installed. Its source code is in the probound_operator folder.
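As a quick sanity check before running anything, the environment variable PROBOUND_JAR_FULL_PATH (described under Contents) can be validated from Python. The following is a minimal sketch, not part of the repository: the function name `find_probound_jar` is hypothetical, and it only checks that the variable is set to an absolute path of an existing file. It does not verify that the file is a working ProBound installation.

```python
import os
from pathlib import Path


def find_probound_jar(env=None):
    """Return the path stored in PROBOUND_JAR_FULL_PATH, or raise RuntimeError.

    Hypothetical helper: `env` defaults to the process environment and can be
    replaced by a plain dict for testing.
    """
    env = os.environ if env is None else env
    raw = env.get("PROBOUND_JAR_FULL_PATH")
    if not raw:
        raise RuntimeError("PROBOUND_JAR_FULL_PATH is not set")
    jar = Path(raw)
    if not jar.is_absolute():
        raise RuntimeError(f"PROBOUND_JAR_FULL_PATH must be absolute, got {raw!r}")
    if not jar.is_file():
        raise RuntimeError(f"No file found at {jar}")
    return jar


if __name__ == "__main__":
    try:
        print(f"ProBound JAR found at {find_probound_jar()}")
    except RuntimeError as err:
        print(f"Prerequisite check failed: {err}")
```

Passing a dict instead of reading `os.environ` directly keeps the check easy to test without touching the real environment.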

Structure

Each folder contains subfolders with example data and scripts, plus an output folder. The analysis can be run in full by calling 'run_pipeline.sh', or it can be split into separate parts.

Contents

  1. SMiLEseq
  • analysis of classical SMiLE-seq experiments by using a Fisher's exact test to prefilter raw sequencing reads before de novo motif discovery via ProBound
  2. meSMiLEseq
  • analysis of methylation-sensitive SMiLE-seq data to create methylation-aware binding models and k-mer scatterplots
  3. WGBS_analysis
  • exemplary analysis pipeline using PRDM13 to show CG methylation patterns at individual motif occurrences in cells by intersecting ChIP-seq and WGBS data
  4. Motifs
  • folder contains all TF binding motifs showcased in Figure 1 as 'position-specific affinity matrices' and DNA logos
  5. probound_operator
  • folder contains the source code of a package for operating ProBound from Python, provided ProBound is installed and its absolute path is stored in the environment variable PROBOUND_JAR_FULL_PATH. You can build the package using the following code:
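# Point the operator package at the ProBound JAR (use the absolute path)
export PROBOUND_JAR_FULL_PATH="path/to/probound/jar"
cd probound_operator
# Requires the PyPA 'build' package (pip install build)
python3 -m build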
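
The SMiLEseq entry above mentions prefiltering reads with Fisher's exact test. To illustrate the kind of test involved, the sketch below computes a one-sided p-value for a 2x2 contingency table using only the standard library. It is not the repository's script. The function name, the table layout (reads with or without a k-mer in a bound versus an input library), and the counts are all assumptions made for illustration. With SciPy, which the analysis lists as a dependency, `scipy.stats.fisher_exact(table, alternative="greater")` computes the same quantity.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with all margins fixed,
    i.e. the probability of seeing at least `a` counts in the top-left cell.
    """
    row1 = a + b          # e.g. all reads in the bound library (assumed layout)
    row2 = c + d          # e.g. all reads in the input library (assumed layout)
    col1 = a + c          # all reads containing the k-mer
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)  # largest value the top-left cell can take
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: k-mer found in 30 of 100 bound reads, 10 of 100 input reads
    p_value = fisher_exact_greater(30, 70, 10, 90)
    print(f"one-sided p-value: {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large read counts, at the cost of speed. For real datasets the SciPy routine is the more practical choice.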

Miscellaneous information

Libraries and versions used in the analysis:

  • Python (3.9.5)
  • pandas (2.0.3)
  • NumPy (1.26.4)
  • SciPy (1.13.1)
  • Matplotlib (3.6.2)
  • Logomaker (0.8)
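
To compare a local environment against the versions listed above, something along the following lines may help. This is a sketch rather than a repository script: the `EXPECTED` mapping simply restates the list, the distribution names are assumed to match the PyPI names, and the comparison is a plain string match, so an installed version reported as `0.8.0` would be flagged even though it may be equivalent to `0.8`.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Versions restated from the list above; keys are assumed PyPI distribution names.
EXPECTED = {
    "pandas": "2.0.3",
    "numpy": "1.26.4",
    "scipy": "1.13.1",
    "matplotlib": "3.6.2",
    "logomaker": "0.8",
}


def compare_versions(expected):
    """Return {name: (expected, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    print(f"Python {platform.python_version()} (analysis used 3.9.5)")
    for name, (wanted, found) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```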