It is quick and easy to build a shiny PoC with LLMs, but hard to turn it into a production-grade LLM application. To succeed you need a robust evaluation framework that you use both during development and after deployment of your LLM-based app.
This workshop focuses on understanding evaluation-driven development and the architecture of an LLM-based app, building an evaluation framework for it, establishing a test suite with evals, and laying the monitoring foundations for it, all by leveraging open-source Python libraries.
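To make "evals" concrete before the setup steps: an eval is a repeatable check of model output against an expectation. Below is a minimal sketch in plain Python, assuming a hypothetical `generate(prompt)` wrapper around your LLM client; the function names and the pass criterion are illustrative, not the workshop's API.

```python
# Minimal eval sketch. `generate` is a hypothetical stand-in for a real
# LLM call; replace it with your own client wrapper.

def generate(prompt: str) -> str:
    # Placeholder response so the sketch runs without an API key.
    return "Amsterdam is the capital of the Netherlands."

def eval_contains_expected(prompt: str, expected: str) -> bool:
    """Pass if the model output mentions the expected substring."""
    return expected.lower() in generate(prompt).lower()

if __name__ == "__main__":
    cases = [
        ("What is the capital of the Netherlands?", "Amsterdam"),
        ("Which city is the Dutch capital?", "Amsterdam"),
    ]
    passed = sum(eval_contains_expected(p, e) for p, e in cases)
    print(f"{passed}/{len(cases)} evals passed")  # a score to track over time
```

Running checks like this in CI during development, and on sampled traffic after deployment, is the essence of evaluation-driven development.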
- basic Python knowledge
- basic understanding of ML testing
- basic understanding of ML monitoring
- uv for dependency management
- Google account if you want to use Google Colab
Run the following commands:

```bash
git clone https://github.com/pyladiesams/eval-llm-based-apps-jan2025.git
cd eval-llm-based-apps-jan2025
# create a virtual environment and install dependencies
uv sync
```
- Visit [Google Colab](https://colab.research.google.com)
- In the top left corner select "File" → "Open Notebook"
- Under "GitHub", enter the URL of the repo of this workshop
- Select one of the notebooks within the repo.
- At the top of the notebook, add a Code cell and run the following code:
```
!git clone https://github.com/pyladiesams/eval-llm-based-apps-jan2025.git
%cd eval-llm-based-apps-jan2025
!pip install -r requirements.txt
```
Re-watch this YouTube stream
This workshop was set up by @pyladiesams and @una-ai-mlops-agency
To ensure our code looks beautiful, PyLadies Amsterdam uses pre-commit hooks. You can enable them by running `pre-commit install`.