JelinR/README.md

Hi, I'm Jelin Raphael Akkara, a Master's student in Physics of Data at the University of Padua. My interests span Natural Language Processing, Computer Vision, and their intersection in fields like Embodied AI. Below is a selection of projects, both major and minor, that have influenced my growth and shaped my interests. Feel free to reach out using the provided contact information.

Personal Site: https://jelinr.github.io/

Major Projects

These are projects in which I improved upon pre-existing models or methods using innovative approaches, often producing surprisingly good results.

  • Lightweight CNN for Speech Keyword Detection: Developed a lightweight CNN model (31k parameters) to classify speech keywords with 90.27% categorical accuracy on the Google V12 Speech Commands Dataset. The model has a 39 ms average inference time and uses 90 KB memory, competing with SoTA models like TDNN (250k parameters, 94% accuracy).

  • YOLOv8n Object Detection using Blob Enhancers: Improved YOLOv8n detection of small human figures (distant or occluded persons) by 1.1% by adding a pre-processing layer that enhances regions of interest (through blob detection) before fine-tuning the model. The preprocessing time increased by only 2 ms (from 7 ms to 9 ms).

  • Fake News Recognition with Naive Bayes: Built an efficient text classifier in R for fake news detection using Multinomial Naive Bayes. A SQL-based approach enabled handling term-context matrices efficiently, training on 20,800 rows (average 4,544 words) in under 30 seconds.

  • Dask Distributed Analysis with Big Data: Implemented anomaly detection on a large industrial dataset (~5 GB) using a virtual cluster of 3 worker nodes. Used low-level map-reduce parallelization to process the data efficiently.
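
As an illustration of the technique named in the fake news project, the following is a minimal Python sketch of a Multinomial Naive Bayes text classifier with Laplace smoothing. It is not the project's R/SQL implementation: the function names (`train_multinomial_nb`, `predict`), the whitespace tokenizer, and the `alpha` parameter are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per class.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    class_docs = Counter(labels)          # number of documents per class
    word_counts = defaultdict(Counter)    # per-class term counts
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"vocab": vocab, "log_prior": {}, "log_like": {}}
    n_docs = len(docs)
    for label, n in class_docs.items():
        # Denominator: total tokens in the class plus smoothing mass.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_prior"][label] = math.log(n / n_docs)
        model["log_like"][label] = {
            w: math.log((word_counts[label][w] + alpha) / total)
            for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest posterior log score."""
    # Words never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    scores = {
        label: model["log_prior"][label]
        + sum(model["log_like"][label][t] for t in tokens)
        for label in model["log_prior"]
    }
    return max(scores, key=scores.get)
```

Working in log space avoids numerical underflow when documents contain thousands of words, which matters at the document lengths mentioned above.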

Minor Projects

These are projects I used to get comfortable with a concept or tool. The focus is on becoming familiar with the tool rather than on deriving benchmark-worthy results.

  • Learning Kant using LLM and RAG: Tested RAG's capability using Llama-2 as the chatbot LLM, FAISS as the vector store, and HuggingFace for the pipeline on four influential works of Immanuel Kant, optimizing prompts and parameters for best results.

  • Audio Generation using Variational Autoencoders: Built a continuously varying latent space using a VAE to generate speech keyword audio samples. This was done by optimizing the latent dimension and the encoder and decoder parameters, and by experimenting with asymmetric structures.

  • Predicting Plasma Crashes with Transformers: Conducted anomaly detection (magnetic crashes) on a highly imbalanced time series dataset (plasma evolution) using three architectures: 1-D CNN, DNN, and Transformer.

  • Muon Pair Detection: Identified rare muon events using a signal selection strategy with linear interpolation and count thresholds on a four-layer detector, modeled after CMS drift tubes.
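
One possible reading of a selection that combines a straight-line model with count thresholds, as in the muon project, is sketched below in Python. This is an assumption-laden illustration, not the project's actual strategy: the hit format `(z, x)`, the function names, and the `min_hits` and `max_residual` parameters are all hypothetical.

```python
def fit_line(hits):
    """Least-squares fit x = slope * z + intercept through (z, x) hits."""
    n = len(hits)
    mean_z = sum(z for z, _ in hits) / n
    mean_x = sum(x for _, x in hits) / n
    var_z = sum((z - mean_z) ** 2 for z, _ in hits)
    cov = sum((z - mean_z) * (x - mean_x) for z, x in hits)
    slope = cov / var_z
    return slope, mean_x - slope * mean_z


def is_track_candidate(hits, min_hits=3, max_residual=1.0):
    """Accept an event if enough layers fired and the hits lie on a line.

    hits: list of (z, x) pairs, where z identifies the detector layer
    and x is the measured position (hypothetical units).
    min_hits must be at least 2 so that the line fit is well defined.
    """
    layers = {z for z, _ in hits}
    if len(layers) < min_hits:   # count threshold on distinct layers
        return False
    slope, intercept = fit_line(hits)
    # Every hit must sit within max_residual of the fitted line.
    return all(
        abs(x - (slope * z + intercept)) <= max_residual for z, x in hits
    )
```

For example, four hits at `(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0)` pass the selection, while an event with only two fired layers, or one whose hits zigzag far from any straight line, is rejected.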

Popular repositories

  1. Fake_News_Detection (Public)

    Built an efficient text classifier in R for fake news detection, training on 20,800 rows (average 4,544 words) in under 30 seconds.

    Jupyter Notebook 1

  2. Plasma_Evolution_Analysis (Public)

    Analyzing plasma evolution (time series) with Transformers to detect anomalies.

    Jupyter Notebook 1

  3. ultralytics_forked (Public)

    Forked from ultralytics/ultralytics

    NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

    Python 1 2

  4. Speech_KWS_Lightweight (Public)

    Building a lightweight model (CNN or DNN) that maintains a high degree of accuracy in classifying speech keyword audio samples.

    Jupyter Notebook 1

  5. LLM_RAG_Learning_Kant (Public)

    Test a RAG system's ability to provide concise answers through prompt engineering and parameter optimization.

    Jupyter Notebook 1

  6. VAE_Audio_Generation (Public)

    Approaching audio generation with a Variational Autoencoder by building a rich, continuous latent space.

    Jupyter Notebook 1