From de0ec1fba26f728e188bd78755d63f459b132448 Mon Sep 17 00:00:00 2001
From: Morteza Hosseini
Date: Mon, 13 May 2024 21:22:22 +0100
Subject: [PATCH] Update README.md and add tokenization article for LLMs

---
 README.md              | 16 ++++++++++------
 llm/tokenization.ipynb |  2 +-
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index d5b975ce..560c20dd 100644
--- a/README.md
+++ b/README.md
@@ -25,6 +25,12 @@ In addition to these projects, I regularly share my insights and learnings on th
 - [Understanding Hashing](data-structure/hashing.ipynb): Dive into the world of hashing, its applications, and Python implementation.
 - [Sorting Algorithms](data-structure/sorting-popular.ipynb): A comprehensive guide to understanding and implementing popular sorting algorithms in Python.
 
+## :art: Data Visualization
+
+- [lets-plot](visualization/lets-plot/codebook.ipynb): Create stunning plots with [lets-plot](https://lets-plot.org/index.html), a Python port of the R's [ggplot2](https://ggplot2.tidyverse.org/) library.
+- [Pitfalls](visualization/pitfalls/pitfalls.ipynb): Avoid common pitfalls in data visualization.
+- [QR Code](visualization/qrcode.ipynb): Generate QR codes with ease.
+
 ## :mag: EDA (Exploratory Data Analysis)
 
 - [Data Balancing](eda/data-balancing.ipynb): Learn techniques to balance imbalanced datasets.
@@ -40,6 +46,10 @@ In addition to these projects, I regularly share my insights and learnings on th
 - [KerasTuner](hypertune/kerasTuner.ipynb): Optimize your models with hyperparameter tuning using the [KerasTuner](https://keras.io/keras_tuner/) library.
 - [Optuna](hypertune/optuna.ipynb): Enhance your models with hyperparameter tuning using the [Optuna](https://optuna.org/) library.
 
+## :brain: LLM (Large Language Model)
+
+- [Tokenization](llm/tokenization.ipynb): Explore the tokenization of text data.
+
 ## :robot: Machine Learning
 
 - [Best Threshold for Logistic Regression](machine-learning/threshold-logistic-regression.ipynb): Explore different methods to find the optimal threshold for logistic regression.
@@ -76,12 +86,6 @@ In addition to these projects, I regularly share my insights and learnings on th
 - [Forecasting with sktime](time-series/sktime.ipynb): Forecast time-series data using the [sktime](https://github.com/sktime/sktime) library.
 - [Prevent Overfitting](time-series/prevent-overfitting.ipynb): Learn techniques to prevent overfitting in time series forecasting.
 
-## :art: Data Visualization
-
-- [lets-plot](visualization/lets-plot/codebook.ipynb): Create stunning plots with [lets-plot](https://lets-plot.org/index.html), a Python port of the R's [ggplot2](https://ggplot2.tidyverse.org/) library.
-- [Pitfalls](visualization/pitfalls/pitfalls.ipynb): Avoid common pitfalls in data visualization.
-- [QR Code](visualization/qrcode.ipynb): Generate QR codes with ease.
-
 ## :spider_web: Web Scraping
 
 - [jobinventory](scrape/jobinventory.com/tutorial.ipynb): Scrape job listings from jobinventory.com using Python.
diff --git a/llm/tokenization.ipynb b/llm/tokenization.ipynb
index 32354956..53db2914 100644
--- a/llm/tokenization.ipynb
+++ b/llm/tokenization.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Click [here]() to access the associated Medium article."
+    "Click [here](https://morihosseini.medium.com/from-characters-to-context-tokenization-in-llms-09b20abc42ed) to access the associated Medium article."
    ]
   },
   {