A lightweight implementation of a decoder-only Transformer language model trained on the TinyStories dataset. The project features custom Triton kernels for optimized performance on NVIDIA GPUs.
- Transformer-based language model architecture
- Custom Triton kernels for key operations (see the sketch after this list):
  - Softmax
  - RMS Normalization (RMSNorm)
  - Cross-Entropy Loss
  - Rotary Position Embeddings (RoPE)
- Custom tokenizer training using SentencePiece
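To give a flavor of what these kernels look like, here is a minimal sketch of a row-wise RMSNorm Triton kernel; the names (`rmsnorm_kernel`, `rmsnorm`), the float32 math, and the one-program-per-row launch are illustrative assumptions, not the repo's actual code:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def rmsnorm_kernel(x_ptr, w_ptr, out_ptr, n_cols, eps, BLOCK_SIZE: tl.constexpr):
    # Sketch: each program instance normalizes one row of a (n_rows, n_cols) input.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)
    # RMSNorm: y = x / sqrt(mean(x^2) + eps) * weight
    rms = tl.sqrt(tl.sum(x * x, axis=0) / n_cols + eps)
    w = tl.load(w_ptr + cols, mask=mask, other=0.0).to(tl.float32)
    tl.store(out_ptr + row * n_cols + cols, x / rms * w, mask=mask)

def rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    rows = x.reshape(-1, x.shape[-1]).contiguous()
    out = torch.empty(rows.shape, device=x.device, dtype=torch.float32)
    # BLOCK_SIZE must be a power of two that covers the whole row.
    rmsnorm_kernel[(rows.shape[0],)](
        rows, weight, out, rows.shape[-1], eps,
        BLOCK_SIZE=triton.next_power_of_2(rows.shape[-1]),
    )
    return out.reshape(x.shape)
```

Keeping one program per row avoids cross-program reductions; the mask handles rows whose length is not a power of two.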
```bash
pip install -r requirements.txt
```
```bash
# Download the TinyStories dataset and train the tokenizer
python train_vocab.py
```
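`train_vocab.py` is not reproduced here, but SentencePiece tokenizer training typically reduces to a single call; the corpus path, vocabulary size, and model type below are assumptions, not the script's real settings:

```python
import sentencepiece as spm

# Sketch of SentencePiece BPE training; all values here are illustrative.
spm.SentencePieceTrainer.train(
    input="data/tinystories.txt",  # assumed: raw corpus, one story per line
    model_prefix="tokenizer",      # writes tokenizer.model and tokenizer.vocab
    vocab_size=4096,               # assumed size for a small model
    model_type="bpe",
    character_coverage=1.0,
)
```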
```bash
# Preprocess the data
python preprocess.py
```
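A common shape for this step (the exact format `preprocess.py` uses may differ) is to encode the corpus with the trained tokenizer and dump the token ids as one flat binary file; the paths and the `uint16` dtype (valid while the vocabulary stays under 65,536) are assumptions:

```python
import numpy as np
import sentencepiece as spm

# Sketch: encode every story and write the ids as one flat uint16 array.
sp = spm.SentencePieceProcessor(model_file="tokenizer.model")
ids = []
with open("data/tinystories.txt") as f:
    for line in f:
        ids.extend(sp.encode(line) + [sp.eos_id()])  # EOS separates stories
np.array(ids, dtype=np.uint16).tofile("data/train.bin")
```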
```bash
# Train the model
python train.py
```
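The training loop itself lives in `train.py`; batch sampling for next-token prediction plausibly follows the usual pattern of cutting random windows out of the token file, sketched below under the assumed `data/train.bin` layout from the previous step:

```python
import numpy as np
import torch

# Sketch: random (input, target) windows from the flat token file; targets
# are the inputs shifted right by one position for next-token prediction.
data = np.memmap("data/train.bin", dtype=np.uint16, mode="r")

def get_batch(block_size=256, batch_size=32, device="cuda"):
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([torch.from_numpy(data[i:i + block_size].astype(np.int64)) for i in ix])
    y = torch.stack([torch.from_numpy(data[i + 1:i + 1 + block_size].astype(np.int64)) for i in ix])
    return x.to(device), y.to(device)
```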
```bash
# Generate text samples with the trained model
python sample.py --prompt "your prompt"
```
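Under the hood, sampling is an autoregressive loop; a minimal sketch follows, where the `generate` helper, temperature, and context length are illustrative, and `model` is assumed to return `(batch, time, vocab)` logits:

```python
import torch

@torch.no_grad()
def generate(model, ids, max_new_tokens=200, temperature=0.8, block_size=256):
    # ids: (1, T) prompt token ids. Append one sampled token per step.
    for _ in range(max_new_tokens):
        logits = model(ids[:, -block_size:])[:, -1, :]       # next-token logits
        probs = torch.softmax(logits / temperature, dim=-1)  # assumed temperature
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

Cropping the context to the last `block_size` tokens keeps the input within the model's trained context window.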