A parser that extracts GO, KEGG, and DOI identifiers from each protein entry in UniProt files. It can be used from the command line or through a GUI.

DevAhmed-py/uniprot_parser

Uniprot-Parser

Overview

The Uniprot-Parser is a software tool developed by Ahmed Tijani Akinfalabi in 2024. It extracts Gene Ontology (GO) IDs, Kyoto Encyclopedia of Genes and Genomes (KEGG) IDs, and Digital Object Identifiers (DOIs) from UniProt data files. It can be run either from the command line or through a graphical user interface (GUI).

Purpose

The Uniprot-Parser grew out of the need for a reliable and efficient way to extract specific data from UniProt files. These files contain biological sequence information that is central to bioinformatics research and analysis. I wrote this software to streamline that extraction, so that researchers and scientists can reach the information they need more easily.

Functionality

The Uniprot-Parser extracts specific identifiers (GO IDs, KEGG IDs, and DOIs) from the protein entries in UniProt files. Its main features include:

  • Extraction of GO IDs, KEGG IDs, and DOIs from UniProt files
  • Support for both compressed and uncompressed UniProt data files
  • Command-line interface with optional arguments for specific data extraction
  • Graphical user interface (GUI) for user-friendly interaction
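The extraction step listed above can be sketched roughly as follows. This is a minimal illustration, not the Uniprot-Parser's own code: it assumes the input is in the UniProt flat-file (`.dat`) text format, where GO and KEGG cross-references appear on `DR` lines, DOIs appear on `RX` lines, and each entry ends with `//`. The function name `parse_entries`, the regular expressions, and the `SAMPLE` record layout are all hypothetical choices made for this example.

```python
import re

# Assumed line layouts of the UniProt flat-file format (illustrative only,
# not taken from the Uniprot-Parser source).
ID_RE = re.compile(r"^ID\s+(\S+)")
GO_RE = re.compile(r"^DR\s+GO;\s+(GO:\d{7});")
KEGG_RE = re.compile(r"^DR\s+KEGG;\s+([^;]+);")
DOI_RE = re.compile(r"^RX\s.*?DOI=([^;\s]+);")

FIELDS = (("go", GO_RE), ("kegg", KEGG_RE), ("doi", DOI_RE))


def parse_entries(lines):
    """Yield one dict per entry with keys 'id', 'go', 'kegg', and 'doi'."""
    entry = None
    for raw in lines:
        line = raw.rstrip("\n")
        id_match = ID_RE.match(line)
        if id_match:
            # An ID line starts a new entry.
            entry = {"id": id_match.group(1), "go": [], "kegg": [], "doi": []}
            continue
        if entry is None:
            continue  # skip anything that appears before the first ID line
        if line.startswith("//"):
            # The '//' terminator closes the current entry.
            yield entry
            entry = None
            continue
        for key, pattern in FIELDS:
            match = pattern.match(line)
            if match:
                entry[key].append(match.group(1))
    if entry is not None:
        # Emit a final entry that lacks a '//' terminator.
        yield entry


# A small hand-written record in the assumed layout, for demonstration.
SAMPLE = """\
ID   001R_FRG3G              Reviewed;         256 AA.
AC   Q6GZX4;
RX   PubMed=15165820; DOI=10.1016/j.virol.2004.02.019;
DR   KEGG; vg:2947773; -.
DR   GO; GO:0046782; P:regulation of viral transcription; IEA:InterPro.
//
"""

if __name__ == "__main__":
    for record in parse_entries(SAMPLE.splitlines()):
        print(record)
```

Because `parse_entries` accepts any iterable of lines, the same function would work on an open file handle, which keeps memory use low for large UniProt files. One known limitation of this sketch is that a DOI containing a semicolon would be cut off at that character.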

Learning Experience

During the development of the Uniprot-Parser, I gained valuable insights and skills in:

  • Parsing and extracting data from text-based files
  • Handling command-line arguments and user input
  • Building a graphical user interface (GUI) using Tkinter for enhanced user experience
  • Understanding the structure and content of UniProt data files

Unique Features

What sets the Uniprot-Parser apart from other similar tools is its:

  • Versatility: Capable of handling various types of UniProt data files and providing options for customized data extraction
  • Adaptability: Supports both command-line and GUI modes, catering to different user preferences and requirements
  • Ease of Use: Simple and intuitive interface, making it accessible to users with varying levels of technical expertise

Usage

To use the Uniprot-Parser, simply follow these steps:

  1. Download and install the software from this repository.
  2. Run the program using the command-line interface or launch the GUI.
  3. Specify the desired optional arguments (--go, --doi, --kegg) along with the UniProt data file(s).
  4. View the extracted information or save it for further analysis.

For detailed usage instructions and additional information, refer to the provided help page within the software.
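As a rough illustration of steps 2 and 3, the sketch below shows one way a command-line front end with the `--go`, `--doi`, and `--kegg` options could be wired up with Python's standard `argparse` module, along with a helper that opens both plain and compressed files. It is not the tool's actual interface. The assumptions are that the three options are boolean switches, that omitting all of them selects every field, and that "compressed" means gzip files with a `.gz` suffix. The names `build_parser`, `selected_fields`, `open_text`, and the program name are hypothetical.

```python
import argparse
import gzip


def open_text(path):
    """Open a plain or gzip-compressed file as text, judged by its suffix.

    Assumption: compressed inputs are gzip files whose names end in '.gz'.
    """
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def build_parser():
    """Build a hypothetical command-line parser using the README's options."""
    parser = argparse.ArgumentParser(
        prog="uniprot-parser-sketch",
        description="Extract GO IDs, KEGG IDs, and DOIs from UniProt files.",
    )
    parser.add_argument("--go", action="store_true", help="extract GO IDs")
    parser.add_argument("--kegg", action="store_true", help="extract KEGG IDs")
    parser.add_argument("--doi", action="store_true", help="extract DOIs")
    parser.add_argument("files", nargs="+", help="UniProt data file(s)")
    return parser


def selected_fields(args):
    """Return the requested fields, or all three if none was requested."""
    chosen = [name for name in ("go", "kegg", "doi") if getattr(args, name)]
    return chosen or ["go", "kegg", "doi"]


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    print("fields:", selected_fields(cli_args))
    for file_path in cli_args.files:
        with open_text(file_path) as handle:
            print(file_path, "first line:", handle.readline().rstrip())
```

Saved under a hypothetical name such as `cli_sketch.py`, this could be invoked as `python cli_sketch.py --go --doi some_file.dat.gz`. The real program's invocation and defaults may differ, so its built-in help page remains the authoritative reference.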

Conclusion

In summary, the Uniprot-Parser extracts GO IDs, KEGG IDs, and DOIs from UniProt data files. It handles both compressed and uncompressed input and offers command-line and GUI modes, which makes it suitable for researchers and scientists with varying levels of technical expertise. By developing this software, I aimed to contribute to bioinformatics research and to make this data easier to access.
