## Intro

A lightweight implementation of a decoder-only language model trained on the TinyStories dataset. The project features custom Triton kernels for optimized performance on NVIDIA GPUs.

## Features

- Transformer-based language model architecture
- Custom Triton kernels for key operations (a minimal sketch of one kernel follows this list):
  - Softmax
  - RMS Normalization
  - Cross Entropy Loss
  - Rotary Position Embeddings (RoPE)
- Custom tokenizer training using SentencePiece
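To give a sense of what these kernels involve, below is a minimal sketch of an RMSNorm kernel in Triton. It is illustrative only: the kernel and wrapper names, signatures, and block-size handling are assumptions, not this repo's actual implementation.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def rmsnorm_kernel(x_ptr, w_ptr, out_ptr, n_cols, eps, BLOCK_SIZE: tl.constexpr):
    # Each program instance normalizes one row of a contiguous 2D input.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)
    # RMSNorm: y = x / sqrt(mean(x^2) + eps) * weight
    rms = tl.sqrt(tl.sum(x * x, axis=0) / n_cols + eps)
    w = tl.load(w_ptr + cols, mask=mask, other=0.0).to(tl.float32)
    tl.store(out_ptr + row * n_cols + cols, x / rms * w, mask=mask)

def rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Hypothetical wrapper: one program per row, one block spanning the row.
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    rmsnorm_kernel[(n_rows,)](
        x, weight, out, n_cols, eps,
        BLOCK_SIZE=triton.next_power_of_2(n_cols),
    )
    return out
```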

## Prerequisites

```bash
pip install -r requirements.txt
```

## Usage

Download the TinyStories dataset and train the tokenizer:

```bash
python train_vocab.py
```
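`train_vocab.py` presumably drives the SentencePiece trainer; a sketch along those lines is below. The input path, vocabulary size, and model type here are assumptions, not the script's actual settings.

```python
import sentencepiece as spm

# Hypothetical path and hyperparameters; the real script may differ.
spm.SentencePieceTrainer.train(
    input="data/TinyStories-train.txt",  # raw training text
    model_prefix="tokenizer",            # writes tokenizer.model and tokenizer.vocab
    vocab_size=4096,                     # a small vocabulary suits TinyStories
    model_type="bpe",                    # byte-pair encoding
    character_coverage=1.0,
)
```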
Preprocess the data:

```bash
python preprocess.py
```
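A plausible shape for `preprocess.py`, assuming it encodes the raw text with the trained tokenizer and dumps a flat array of token ids; the paths and file format are assumptions.

```python
import numpy as np
import sentencepiece as spm

# Hypothetical paths; the real script's I/O may differ.
sp = spm.SentencePieceProcessor(model_file="tokenizer.model")

ids = []
with open("data/TinyStories-train.txt") as f:
    for line in f:
        ids.extend(sp.encode(line))  # encode each story to token ids
        ids.append(sp.eos_id())      # mark story boundaries with EOS

# uint16 is enough for any vocabulary under 65,536 tokens.
np.array(ids, dtype=np.uint16).tofile("data/train.bin")
```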

Train the model:

```bash
python train.py
```
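For orientation, a bare-bones version of the training step: sample random windows from the token stream and minimize next-token cross-entropy. The stand-in model, paths, and hyperparameters are placeholders for whatever `train.py` actually uses.

```python
import numpy as np
import torch
import torch.nn.functional as F

# Placeholder hyperparameters and data path.
block_size, batch_size, vocab_size, d_model = 256, 32, 4096, 128
data = torch.from_numpy(
    np.fromfile("data/train.bin", dtype=np.uint16).astype(np.int64)
)

def get_batch():
    # Random windows; targets are the inputs shifted by one token.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + 1 + block_size] for i in ix])
    return x, y

# Trivial stand-in for the repo's decoder-only transformer.
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, d_model),
    torch.nn.Linear(d_model, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(1000):
    x, y = get_batch()
    logits = model(x)  # (batch, block, vocab)
    loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```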

Generate text samples with the trained model:

```bash
python sample.py --prompt "your prompt"
```
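`sample.py` presumably decodes autoregressively from the trained model; here is a sketch of that loop with temperature sampling. `model` stands in for the transformer loaded from a checkpoint, and the tokenizer path is an assumption.

```python
import torch
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="tokenizer.model")  # assumed path

@torch.no_grad()
def generate(model, prompt: str, max_new_tokens: int = 200, temperature: float = 0.8) -> str:
    ids = torch.tensor([sp.encode(prompt)], dtype=torch.long)  # (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature  # logits at the last position
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample one token
        if next_id.item() == sp.eos_id():
            break
        ids = torch.cat([ids, next_id], dim=1)
    return sp.decode(ids[0].tolist())
```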

## Reference