Welcome to my repository for the HackBio Internship 2024, a collaborative and immersive experience aimed at advancing bioinformatics skills through projects using real-world data.
HackBio is a bioinformatics internship founded to bridge the gap between knowledge and application in computational biology. With a strong focus on mentorship, collaboration, and hands-on learning, HackBio gave us, aspiring scientists and bioinformaticians, the opportunity to work on significant projects with real-life implications.
For this project, my teammates and I worked on the Antimicrobial Resistance (AMR) in Cancer track.
Antimicrobial resistance refers to the ability of microbes (such as bacteria, fungi, viruses, and parasites) to evolve and resist the effects of drugs that were once effective against them. It is a particular challenge in cancer treatment: patients undergoing chemotherapy are often immunocompromised, so infections caused by resistant pathogens can lead to serious complications.
Throughout the internship, I used Python, R, and Bash.
Goal: Write an essay (400 words) on Antimicrobial Resistance (AMR) in cancer using references.
Implication: This step involves understanding the intersection between cancer treatment and AMR, where resistance to antibiotics affects treatment outcomes in cancer patients. For example, AMR in the gut microbiome can influence immunotherapy efficacy. This stage focuses on research synthesis and scientific communication, explaining the challenges of managing infections in cancer patients while balancing antibiotic use.
Goal: Analyze the paper "Gut Resistome of NSCLC Patients Treated with Immunotherapy" & make a short video about it.
Implication: The purpose here is to understand how gut microbiomes contribute to AMR in cancer treatments, particularly for non-small cell lung cancer (NSCLC) patients using immunotherapy. The task helps highlight how AMR genes can reduce the effectiveness of cancer treatments, providing insights into future personalized cancer therapies by managing resistance factors in the gut microbiome. Collaborative team engagement and research presentation are crucial here.
Goal: Download and clean the AMR Products Dataset, perform statistical analysis, visualize data, and generate key insights about AMR trends.
Implication: The task is aimed at developing data analysis skills by cleaning, processing, and interpreting an AMR dataset. By visualizing trends in AMR product development, it’s possible to identify promising drug candidates and understand AMR dynamics in different regions. This stage focuses on technical skills in data science (e.g., using Python or R), and critical thinking to extract insights that can influence public health strategies.
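The cleaning and summarizing steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in the project: the inline sample data and the column names (`product`, `phase`, `region`) are hypothetical stand-ins for the real AMR Products Dataset schema.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the AMR products file; the column
# names and values are assumptions made for illustration only.
SAMPLE_CSV = """product,phase,region
Drug A,Phase I,Europe
drug a,Phase I,Europe
Drug B, phase ii ,Asia
Drug C,,Africa
Drug D,Phase III,Asia
"""


def clean_rows(text):
    """Trim whitespace, normalize the phase label, and drop incomplete or duplicate rows."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {key: (value or "").strip() for key, value in row.items()}
        if not all(record.values()):
            continue  # skip rows with any missing field
        record["phase"] = record["phase"].upper()
        key = (record["product"].lower(), record["phase"], record["region"].lower())
        if key in seen:
            continue  # skip case-insensitive duplicates
        seen.add(key)
        cleaned.append(record)
    return cleaned


def count_by(rows, column):
    """Count how many cleaned rows fall into each value of the given column."""
    return Counter(row[column] for row in rows)


rows = clean_rows(SAMPLE_CSV)
for region, total in count_by(rows, "region").most_common():
    print(f"{region}: {total}")
```

On a real file, the same functions would take the text read from disk instead of `SAMPLE_CSV`, and the counts could then be passed to a plotting library for visualization.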
Goal: Introduce a simulated wet-lab biologist to bioinformatics through Bash scripting, and analyze a WHO dataset on cholera outbreaks using R.
Implication: This phase covers foundational bioinformatics tasks, with an emphasis on Bash scripting for sequencing data analysis and file navigation. Simple yet powerful scripts let wet-lab biologists automate tasks like genome assembly, variant calling, and data preprocessing. The parallel analysis of the cholera outbreak dataset illustrates the importance of bioinformatics tools in handling global health data, and shows how insights from large datasets can inform public health decisions and antimicrobial resistance (AMR) strategies.
Goal: Build an NGS analysis pipeline for real datasets using Bash or a workflow manager (e.g., Nextflow, Snakemake).
Implication: This final stage integrates several bioinformatics tools (FastQC, fastp, BWA, FreeBayes) into a cohesive pipeline for NGS data analysis. By automating quality control, trimming, read alignment, and variant calling, it provides a reproducible framework that researchers can apply to multiple datasets. This step demonstrates technical proficiency in building scalable and efficient pipelines for genomics research.
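The order of steps in such a pipeline can be sketched as below. This is a hedged illustration rather than the pipeline built during the internship: it only assembles and prints the commands (a dry run), the sample and file names are hypothetical, single-end reads are assumed, and `samtools` is an added assumption for sorting and indexing the alignment.

```python
import shlex
import subprocess
from pathlib import PurePosixPath


def build_commands(sample, reads, reference, outdir="results"):
    """Return the shell commands for one single-end sample, in run order."""
    out = PurePosixPath(outdir)
    trimmed = str(out / f"{sample}.trimmed.fastq.gz")
    bam = str(out / f"{sample}.sorted.bam")
    vcf = str(out / f"{sample}.vcf")
    q = shlex.quote  # protect paths containing spaces or shell metacharacters
    return [
        f"mkdir -p {q(str(out))}",
        f"fastqc {q(reads)} -o {q(str(out))}",          # quality control report
        f"fastp -i {q(reads)} -o {q(trimmed)}",         # adapter and quality trimming
        f"bwa index {q(reference)}",                    # index the reference once
        f"bwa mem {q(reference)} {q(trimmed)} | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        f"freebayes -f {q(reference)} {q(bam)} > {q(vcf)}",  # variant calling
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(command)
        if not dry_run:
            subprocess.run(command, shell=True, check=True)


# Hypothetical sample and file names, used only to show the generated commands.
commands = build_commands("sample1", "sample1.fastq.gz", "reference.fasta")
run_pipeline(commands)
```

Calling `run_pipeline(commands, dry_run=False)` would execute the steps, provided the tools are installed and on the `PATH`. A workflow manager such as Nextflow or Snakemake adds resuming, parallelism, and dependency tracking on top of this basic ordering.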
The HackBio Internship 2024 gave me a unique opportunity to develop bioinformatics skills by working with real-world data on AMR in cancer and cholera outbreaks. Through collaborative projects, I gained hands-on experience using Python, R, and Bash to analyze complex datasets and build automated workflows.
This experience has been unlike any other course I’ve taken, primarily due to the level of autonomy we had. While our mentors were available for support when needed, they didn’t hand us any code, just project guidelines, and that was fantastic.
At first, starting an analysis from scratch felt intimidating, especially after being used to highly guided projects where everything was essentially "spoon-fed." But now, working with real datasets feels much more approachable, and I’m incredibly grateful for that growth. Keep an eye out for new projects popping up on my GitHub!
For those interested, HackBio will open another round of applications next year. Be sure to follow them on LinkedIn and take advantage of this incredible learning experience!
Shoutout to all my fellow interns!
Authors

No. | Name | GitHub | LinkedIn |
---|---|---|---|
1 | Haseeb Manzoor | https://github.com/haseebmanzur | https://www.linkedin.com/in/haseebmanzoor |
2 | Yetunde Alo Mary | https://github.com/aloyetunde | https://www.linkedin.com/in/yetunde-alo-mary |
3 | Nada ElSayed Ahmed | https://github.com/Nada-EA | https://www.linkedin.com/in/nada-elsayed-ahmed |
4 | Nada ElHadidy | https://github.com/nada-elhadidy | https://www.linkedin.com/in/nada-elhadidy |
5 | Merna Salem | https://github.com/MernaSalem | https://www.linkedin.com/in/merna-salem |