IPyRSSA

IPyRSSA (Integrative Python library for RNA Secondary Structure Analysis) is a Python library for analyzing RNA secondary structure and SHAPE data.

New

Python 3 is supported now. Python 2 is not supported.

Update your local library

git pull origin

General module

`import General`

| Function name | Usage |
| --- | --- |
| load_fasta | Read a fasta file |
| write_fasta | Write a fasta file |
| load_dot | Read a dot-bracket file |
| write_dot | Write a dot-bracket file |
| load_shape | Read a SHAPE .out file |
| load_SHAPEMap | Read a SHAPEmap file |
| load_ct | Read a .ct file |
| write_ct | Write a .ct file |
| init_pd_rect | Build a dataframe |
| init_list_rect | Build a list matrix |
| find_all_match | Find all regions matching a regex |
| bi_search | Binary search |
| calc_shape_gini | Calculate the SHAPE Gini index |
| calc_shape_structure_ROC | Calculate ROC points from a structure and SHAPE scores |
| calc_AUC | Calculate AUC from ROC points |
| calc_AUC_v2 | Calculate AUC from a dot-bracket structure and a SHAPE list |
| seq_entropy | Calculate the entropy of a sequence |
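
To illustrate what a Gini index over SHAPE scores measures, here is a minimal standalone sketch in plain Python. It is not the implementation behind `calc_shape_gini`: the name `gini_index` is hypothetical, and the sketch assumes the input is a list of non-negative numbers with missing values already removed.

```python
def gini_index(values):
    """Gini coefficient of non-negative values.

    0.0 means all values are equal; values close to 1.0 mean the
    signal is concentrated in a few positions.
    Assumption: missing values were filtered out by the caller.
    """
    data = sorted(float(v) for v in values)
    n = len(data)
    total = sum(data)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending data:
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(data, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    print(gini_index([0.1, 0.1, 0.1, 0.1]))  # evenly distributed reactivity
    print(gini_index([0.0, 0.0, 0.0, 1.2]))  # reactivity concentrated at one base
```

A higher value indicates that reactivity is unevenly distributed along the transcript, which is commonly read as a sign of more defined structure.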

Colors module

`import Colors`

| Function name | Usage |
| --- | --- |
| format or f | Format colored text |
| color_SHAPE | Convert a SHAPE list to colored blocks |
| color_Seq_SHAPE | Color a sequence by its SHAPE scores |
| browse_shape | Print and compare single or multiple SHAPE score lists |
| browse_multi_shape | Align multiple sequences and print their SHAPE scores |

Cluster module

`import Cluster`

Warning: This module can only be used on loginviewxx/mgtxx

| Function name | Usage |
| --- | --- |
| new_job | Get a job handle |
| handle.set_job_depends | Run the job only after the jobs passed as parameters have finished |
| handle.submit | Submit the job to the queue |
| handle.has_finish | Return True if the job has finished |
| handle.job_status | Return one of Not_Found, DONE, RUN, PEND, EXIT |
| handle.wait | Wait for the job to finish |
| handle.kill | Kill the job |

Seq module

`import Seq`

Prerequisites: pyliftover, pysam

| Function name | Usage |
| --- | --- |
| reverse_comp | Get the reverse complement of a sequence |
| flat_seq | Wrap a long sequence into multiple lines |
| format_gene_type | Map the raw gene type in an annotation to a common gene type |
| Class:seqClass | A class to fetch sequences from a large genome |
| lift_genome | Convert coordinates between genome versions (hg19 => hg38) |
| search_subseq_from_genome | Search for a pattern in a genome region |
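
The two simplest operations in this table, reverse complementing and line wrapping, can be sketched in a few lines of plain Python. This is an independent illustration, not the code behind `reverse_comp` or `flat_seq`: the names `reverse_complement` and `wrap_sequence`, the DNA-only alphabet, and the default width of 60 are all assumptions made for the example.

```python
# Translation table for DNA bases (assumption: DNA alphabet plus N, both cases).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def wrap_sequence(seq, width=60):
    """Break a long sequence into lines of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


if __name__ == "__main__":
    print(reverse_complement("ATGCCN"))
    print(wrap_sequence("ACGTACGTAC", width=4))
```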

Structure module

`import Structure`

| Function name | Usage |
| --- | --- |
| predict_structure | Predict secondary structure, with or without SHAPE data |
| bi_fold | Predict RNA interaction |
| search_TT_cross_linking | Search for TT cross-linking sites in a structure |
| dyalign | Predict a common secondary structure for two sequences |
| multialign | Predict a common secondary structure for multiple sequences |
| estimate_energy | Calculate the folding free energy change of a structure |
| partition | Calculate the partition function |
| maxExpect | Calculate the max-expect structure |
| evaluate_dot | Evaluate the sensitivity and PPV of a predicted structure relative to a target structure |
| calc_structure_similarity | Calculate the similarity/distance between structures |
| dot2ct | Dot-bracket to list |
| dot2bpmap | Dot-bracket to dictionary |
| parse_pseudoknot | Parse pseudoknots from a ctList |
| ct2dot | ctList to dot-bracket |
| write_ctFn | Save a dot-bracket structure to a .ct file |
| dot2align | Convert a secondary structure to an aligned sequence |
| dot_from_ctFile | Read a dot-bracket structure from a .ct file |
| trim_stem | Trim a stem loop |
| find_stem_loop | Find stem loops in a secondary structure |
| find_bulge_interiorLoop | Find bulges and interior loops in a secondary structure |
| calcSHAPEStructureScore | Calculate the structure-SHAPE agreement score for a stem loop |
| sliding_score_stemloop | Find stem loops in an RNA with a sliding window |
| multi_alignment | Multiple sequence alignment with muscle |
| kalign_alignment | Multiple sequence alignment with kalign |
| global_search | Globally align short sequences to multiple long sequences |
| align_find | Find the region of an unaligned sequence within an aligned sequence |
| locate_homoseq | Locate homologous regions in multiple sequences |
| dot_to_alignDot | Dot-bracket to aligned dot-bracket |
| shape_to_alignSHAPE | SHAPE list to aligned SHAPE list |
| annotate_covariation | Color a raw sequence to highlight covariation sites |
| dot_F1 | Compare a predicted structure with the true structure and calculate the F1 score |
| parse_structure | Given a dot-bracket structure, classify its bases into the different kinds of single-stranded and paired bases |
| refine_structure_interior | Check for unpaired canonical base pairs in interior loops and pair them |
| refine_structure_stackingclosing | Check for unpaired canonical base pairs at stacking ends and pair them |
| refine_structure_hairpinclosing | Check for unpaired canonical base pairs at hairpin closing ends and pair them |
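
Several entries above (`dot2bpmap`, `evaluate_dot`, `dot_F1`) revolve around turning a dot-bracket string into base pairs and comparing two pair sets. The following is a minimal standalone sketch of that idea, not the library's own code: the names `dot_to_pairs` and `compare_structures` are hypothetical, positions are assumed to be 1-based, and only round brackets are handled (no pseudoknot notation).

```python
def dot_to_pairs(dot):
    """Return the set of (i, j) base pairs (1-based) encoded by a dot-bracket string."""
    stack = []
    pairs = set()
    for pos, char in enumerate(dot, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def compare_structures(pred_dot, true_dot):
    """Return (sensitivity, PPV, F1) of predicted pairs against reference pairs."""
    pred = dot_to_pairs(pred_dot)
    true = dot_to_pairs(true_dot)
    shared = len(pred & true)
    sensitivity = shared / len(true) if true else 0.0
    ppv = shared / len(pred) if pred else 0.0
    denom = sensitivity + ppv
    f1 = 2 * sensitivity * ppv / denom if denom else 0.0
    return sensitivity, ppv, f1


if __name__ == "__main__":
    print(sorted(dot_to_pairs("((..))")))
    print(compare_structures("((..))", "(....)"))
```

Sensitivity is the fraction of reference pairs that were predicted, PPV is the fraction of predicted pairs that are in the reference, and F1 is their harmonic mean.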

Visual module

`import Visual`

Prerequisites: java, VARNA (http://varna.lri.fr)

| Function name | Usage |
| --- | --- |
| Plot_RNAStructure_Shape | Plot the RNA structure combined with SHAPE scores |
| Plot_RNAStructure_Base | Plot the RNA structure with a different color for each of A, T, C, G |
| Plot_RNAStructure_highlight | Plot the RNA structure and highlight selected regions |
| Map_rRNA_Shape | Output the rRNA structure in PostScript format |
| get_rRNA_refseq | Return the reference rRNA sequence |

Rosetta module

`from D3 import Rosetta `

Prerequisites: ROSETTA. This module can only be run on the cluster.

| Function name | Usage |
| --- | --- |
| pred_3D_rosetta | Predict RNA 3D structure with ROSETTA |

MCSym module

`from D3 import MCSym`

| Function name | Usage |
| --- | --- |
| upload_MCSym_job | Upload an MCSym RNA 3D structure prediction job |
| get_MCSym_status | Get the status of the job |
| minimize_MCSym_newThread | Minimize the PDBs |
| score_MCSym_newThread | Score and rank the PDBs |
| fetch_top_MCSym_pdb_newThread | Download the top-scored PDBs |

HDOCK module

`from D3 import HDOCK`

| Function name | Usage |
| --- | --- |
| upload_HDOCK_job | Upload an HDOCK RNA-protein docking job |
| get_HDOCK_status | Get the status of the job |
| guess_HDOCK_time_left | Estimate the time left for the job |
| fetch_HDOCK_results | Download all results |
| fetch_HDOCK_top10_results | Download the top 10 results |

Figures module

`import Figures`

| Function name | Usage |
| --- | --- |
| stackedBarPlot | Plot a stacked bar figure |
| violinPlot | Plot a violin figure |
| piePlot | Plot a pie figure |
| boxPlot | Plot a box figure |
| cdf | Plot a CDF curve |

GPU module

`import GPU`

| Function name | Usage |
| --- | --- |
| get_gpu_processes | Get handles of the processes running on the GPUs |
| get_gpu_list | Get a list of available GPUs |
| get_free_gpus | Get a list of IDs of GPUs with no processes running on them |

Alignment

`import Alignment`

| Function name | Usage |
| --- | --- |
| blast_seq | Use blastn to search for a sequence |
| annotate_seq | Given a sequence and a blastdb, search for and annotate the sequence |

Covariation

`import Covariation`

| Function name | Usage |
| --- | --- |
| dot2sto | Convert a dot-bracket structure to a Stockholm file |
| cmbuild | Create a .cm file from a Stockholm alignment |
| cmcalibrate | Calibrate a .cm file |
| cmsearch | Call the cmsearch program to search aligned sequences against a CM model |
| R_scape | Run R-scape to call covarying base pairs |
| read_RScape_result | Read the R-scape result |
| get_alignedPos2cleanPos_dict | Get a dictionary {align_pos: raw_pos} |
| call_covariation | Given a sequence and a dot-bracket structure, run the covariation pipeline |
| calc_MI | Calculate the mutual information for aligned sequences |
| calc_RNAalignfold | Calculate the RNAalifold covariation score for aligned sequences |
| calc_RNAalignfold_stack | Calculate the RNAalifold covariation score (considering stacking) for aligned sequences |
| collect_columns | Given a multiple alignment, return the alignment columns |
| calc_covBP_from_sto | Given a multiple alignment, return the covariation score for each column pair |
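
For readers unfamiliar with mutual information as a covariation measure, the sketch below computes it for two alignment columns in plain Python. It is only an illustration of the general technique and may differ from what `calc_MI` does: the name `column_mutual_information`, the choice of log base 2, and the rule of skipping any row with a gap in either column are assumptions.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j, gap_chars="-."):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold one character per aligned sequence.
    Assumption: rows with a gap in either column are ignored.
    """
    pairs = [(a, b) for a, b in zip(col_i, col_j)
             if a not in gap_chars and b not in gap_chars]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left[a] * right[b])
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    # Two columns that always change together (G-C swapped for C-G).
    print(column_mutual_information("GGCC", "CCGG"))
    # Two fully conserved columns carry no covariation signal.
    print(column_mutual_information("AAAA", "UUUU"))
```

Columns that vary together score high, while fully conserved columns score zero even if they are paired, which is one reason covariation pipelines often combine several scores.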