Update README.md and add tokenization article for LLMs
smortezah committed May 13, 2024
1 parent 23961b0 commit de0ec1f
Showing 2 changed files with 11 additions and 7 deletions.
16 changes: 10 additions & 6 deletions README.md
@@ -25,6 +25,12 @@ In addition to these projects, I regularly share my insights and learnings on th
- [Understanding Hashing](data-structure/hashing.ipynb): Dive into the world of hashing, its applications, and Python implementation.
- [Sorting Algorithms](data-structure/sorting-popular.ipynb): A comprehensive guide to understanding and implementing popular sorting algorithms in Python.

+## :art: Data Visualization
+
+- [lets-plot](visualization/lets-plot/codebook.ipynb): Create stunning plots with [lets-plot](https://lets-plot.org/index.html), a Python port of the R's [ggplot2](https://ggplot2.tidyverse.org/) library.
+- [Pitfalls](visualization/pitfalls/pitfalls.ipynb): Avoid common pitfalls in data visualization.
+- [QR Code](visualization/qrcode.ipynb): Generate QR codes with ease.
+
## :mag: EDA (Exploratory Data Analysis)

- [Data Balancing](eda/data-balancing.ipynb): Learn techniques to balance imbalanced datasets.
@@ -40,6 +46,10 @@ In addition to these projects, I regularly share my insights and learnings on th
- [KerasTuner](hypertune/kerasTuner.ipynb): Optimize your models with hyperparameter tuning using the [KerasTuner](https://keras.io/keras_tuner/) library.
- [Optuna](hypertune/optuna.ipynb): Enhance your models with hyperparameter tuning using the [Optuna](https://optuna.org/) library.

+## :brain: LLM (Large Language Model)
+
+- [Tokenization](llm/tokenization.ipynb): Explore the tokenization of text data.
+
## :robot: Machine Learning

- [Best Threshold for Logistic Regression](machine-learning/threshold-logistic-regression.ipynb): Explore different methods to find the optimal threshold for logistic regression.
@@ -76,12 +86,6 @@ In addition to these projects, I regularly share my insights and learnings on th
- [Forecasting with sktime](time-series/sktime.ipynb): Forecast time-series data using the [sktime](https://github.com/sktime/sktime) library.
- [Prevent Overfitting](time-series/prevent-overfitting.ipynb): Learn techniques to prevent overfitting in time series forecasting.

-## :art: Data Visualization
-
-- [lets-plot](visualization/lets-plot/codebook.ipynb): Create stunning plots with [lets-plot](https://lets-plot.org/index.html), a Python port of the R's [ggplot2](https://ggplot2.tidyverse.org/) library.
-- [Pitfalls](visualization/pitfalls/pitfalls.ipynb): Avoid common pitfalls in data visualization.
-- [QR Code](visualization/qrcode.ipynb): Generate QR codes with ease.
-
## :spider_web: Web Scraping

- [jobinventory](scrape/jobinventory.com/tutorial.ipynb): Scrape job listings from jobinventory.com using Python.
2 changes: 1 addition & 1 deletion llm/tokenization.ipynb
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Click [here]() to access the associated Medium article."
+"Click [here](https://morihosseini.medium.com/from-characters-to-context-tokenization-in-llms-09b20abc42ed) to access the associated Medium article."
]
},
{