Add custom domain information #23

Open
sdhutchins opened this issue Jan 17, 2025 · 0 comments

Hi, is it possible to add custom domain information? Below is the script I'm using, but I'm not sure whether the package already has a way to do this.

The `domain_data_formatted` variable is what I'd like to pass in.

@phoeguo @zhangrener

# Load necessary libraries
library(dplyr)
library(g3viz)
library(jsonlite)

# Define the UniProt REST API URL for BMPR2
uniprot_url <- "https://rest.uniprot.org/uniprotkb/Q13873.json"  # BMPR2 UniProt ID

# Fetch and parse the JSON data directly
parsed_data <- fromJSON(uniprot_url)

# Extract relevant domain information.
# Assumes jsonlite simplifies each position into a nested data frame with
# `value` and `modifier` columns, so the coordinate is read from `$value`.
domains <- parsed_data$features %>%
  filter(type %in% c("Signal", "Domain", "Transmembrane", "Topological domain")) %>%
  transmute(
    Description = description,
    Start = as.numeric(location$start$value),  # Start position of the feature
    End = as.numeric(location$end$value),      # End position of the feature
    Type = type
  )

# Print the extracted domains
print(domains)

# Format retrieved domain data for g3viz
domain_data_formatted <- list(
  length = parsed_data$sequence$length,  # Total length of BMPR2 protein
  domainType = "Custom Domains",
  details = domains %>%
    transmute(
      start = Start,
      end = End,
      name = Description
    )
)

# Print formatted domain data
print(domain_data_formatted)
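
A minimal sketch, not part of the original script, of sanity-checking a domain list of this shape before passing it to any plotting code. It uses a hand-written data frame in place of the UniProt response (the coordinates and length are illustrative, not real BMPR2 annotations), and `check_domains()` is a hypothetical helper. Whether g3viz accepts a list like this is exactly the open question here; the sketch only validates the structure and serializes it for inspection.

```r
library(jsonlite)

# Hypothetical stand-in for the `domains` table built above; the
# coordinates are illustrative only, not real BMPR2 annotations.
example_domains <- data.frame(
  name  = c("Signal", "Kinase"),
  start = c(1, 200),
  end   = c(25, 500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stop if any domain is reversed or out of bounds.
check_domains <- function(details, protein_length) {
  stopifnot(
    is.numeric(details$start),
    is.numeric(details$end),
    all(details$start >= 1),
    all(details$start <= details$end),
    all(details$end <= protein_length)
  )
  invisible(TRUE)
}

example_formatted <- list(
  length = 1000,  # illustrative protein length
  domainType = "Custom Domains",
  details = example_domains
)

check_domains(example_formatted$details, example_formatted$length)

# Serialize to JSON to inspect the structure; auto_unbox keeps scalars
# such as `length` from being wrapped in one-element arrays.
domain_json <- toJSON(example_formatted, auto_unbox = TRUE, pretty = TRUE)
cat(domain_json, "\n")
```

Running the same check on the real `domain_data_formatted$details` and `domain_data_formatted$length` would catch reversed or out-of-range coordinates early.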

[bmpr2_clinvar_pathogenic.csv](https://github.com/user-attachments/files/18461874/bmpr2_clinvar_pathogenic.csv)

# Load your data (assuming it's saved as a CSV file)
data <- read.csv("data/bmpr2_clinvar_pathogenic.csv")

# Prepare data for g3Lollipop
processed_data <- data %>%
  mutate(
    Hugo_Symbol = GeneSymbol,  # Map GeneSymbol to Hugo_Symbol
    Mutation_Class = case_when(
      Type == "SNV" & consequence == "Stop gain" ~ "Nonsense_Mutation",
      Type == "SNV" & consequence == "Missense" ~ "Missense_Mutation",
      TRUE ~ "Other"
    ),
    Protein_Change = paste0("p.", ref_aa, pos_aa, alt_aa),  # Create Protein_Change column
    # Extract AA_Position from pos_aa
    AA_Position = as.numeric(pos_aa)
  ) %>%
  select(Hugo_Symbol, Mutation_Class, Protein_Change, AA_Position)  # Select required columns

# Verify processed data
print(head(processed_data))
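
For reference, a small sketch of how the mapping above behaves on made-up rows. The column names match the script, but the values are hypothetical and not taken from the attached CSV. It also drops rows whose `pos_aa` cannot be converted to a number, which is an added assumption about what should happen to such rows, not something the original script does.

```r
library(dplyr)

# Hypothetical rows using the same column names as the script above;
# the values are made up for illustration.
toy <- data.frame(
  GeneSymbol  = "BMPR2",
  Type        = c("SNV", "SNV", "Deletion"),
  consequence = c("Stop gain", "Missense", "Frameshift"),
  ref_aa      = c("R", "C", "K"),
  pos_aa      = c("100", "200", "unknown"),
  alt_aa      = c("*", "Y", "fs"),
  stringsAsFactors = FALSE
)

toy_processed <- toy %>%
  mutate(
    Mutation_Class = case_when(
      Type == "SNV" & consequence == "Stop gain" ~ "Nonsense_Mutation",
      Type == "SNV" & consequence == "Missense" ~ "Missense_Mutation",
      TRUE ~ "Other"
    ),
    Protein_Change = paste0("p.", ref_aa, pos_aa, alt_aa),
    # Non-numeric positions become NA; the coercion warning is silenced here
    AA_Position = suppressWarnings(as.numeric(pos_aa))
  ) %>%
  filter(!is.na(AA_Position))  # keep only rows that have a usable position

print(toy_processed[, c("Mutation_Class", "Protein_Change", "AA_Position")])
```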

plot.options <- g3Lollipop.theme(theme.name = "cbioportal",
                                 title.text = "BMPR2 gene",
                                 y.axis.label = "# of BMPR2 Mutations")

# Use processed data with g3viz
g3Lollipop(
  processed_data,
  gene.symbol = "BMPR2",  # Specify target gene
  gene.symbol.col = "Hugo_Symbol",  # Specify gene symbol column
  aa.pos.col = "AA_Position",  # Specify amino acid position column
  protein.change.col = "Protein_Change",  # Specify protein change column
  factor.col = "Mutation_Class",  # Specify mutation classification column
  plot.options = plot.options,  # Use the cbioportal theme defined above
  output.filename = "BMPR2_ClinVar_Visualization"  # Save output
)