This document outlines the Flask backend (`app.py`) and HTML frontend (`ChatClient.html`) for a chatbot application. Here's a breakdown of the functionality and structure:
- Initialization: The Flask app is initialized, and CORS (Cross-Origin Resource Sharing) is enabled to allow requests from `http://127.0.0.1:5500`. Logging is set up to track the application's activity. (A consolidated sketch of the backend appears after this breakdown.)
- Model Loading: The application uses the `BloomForCausalLM` model from the Hugging Face Transformers library. The model is either downloaded from Hugging Face or loaded from a local directory (`data/bloom-1b7`). The model and tokenizer are loaded onto the available device (GPU if available, otherwise CPU).
- Chat Endpoint: The `/chat` endpoint accepts POST requests with JSON data containing a user message. The message is processed by the Bloom model, which generates a response. The response is returned as JSON.
- Health Check: A simple health check endpoint (`/`) returns a message indicating that the chatbot is running.
- Running the App: The Flask app runs on port 5000 by default, or on a port specified in the `PORT` environment variable.
- Structure: The HTML file defines a simple chat interface with a header, a message display area, and an input field with a send button.
- Styling: CSS is used to style the chat interface, including message bubbles, loading animations, and button states.
- JavaScript Functionality: The `sendMessage` function sends user input to the Flask backend and handles the response. User messages are displayed in the chat window, and a loading animation is shown while waiting for the bot's response. The bot's response is displayed in the chat window once received. The Enter key can also be used to send messages.
- Error Handling: Errors during the fetch request are caught and displayed as a bot message indicating that something went wrong.

In summary, the frontend sends user messages to the `/chat` endpoint of the Flask backend using a POST request. The backend processes the message using the Bloom model and returns a generated response, which the frontend displays in the chat window.
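Taken together, the backend pieces described above can be summarized in one minimal sketch. This is a hedged reconstruction rather than the verbatim `app.py`: the routes and error codes follow the descriptions in this document, while the generation parameters (`max_new_tokens`, `top_p`) are illustrative assumptions.

```python
import logging
import os

import torch
from flask import Flask, jsonify, request
from flask_cors import CORS
from transformers import BloomForCausalLM, BloomTokenizerFast

# Basic logging so requests and model events are traceable.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Flask(__name__)
# Allow the frontend served at 127.0.0.1:5500 to call this API.
CORS(app, origins="http://127.0.0.1:5500", supports_credentials=True)

MODEL_PATH = "data/bloom-1b7"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

if os.path.isdir(MODEL_PATH):
    # Reuse the locally saved copy.
    tokenizer = BloomTokenizerFast.from_pretrained(MODEL_PATH)
    model = BloomForCausalLM.from_pretrained(MODEL_PATH)
else:
    # First run: download from the Hugging Face Hub, then cache locally.
    tokenizer = BloomTokenizerFast.from_pretrained("bigscience/bloom-1b7")
    model = BloomForCausalLM.from_pretrained("bigscience/bloom-1b7")
    tokenizer.save_pretrained(MODEL_PATH)
    model.save_pretrained(MODEL_PATH)

model.to(device)
model.eval()

@app.route("/", methods=["GET"])
def health():
    return "Chatbot is running!"

@app.route("/chat", methods=["POST"])
def chat():
    if not request.is_json:
        return jsonify({"error": "Content-Type must be application/json"}), 415
    message = (request.get_json().get("message") or "").strip()
    if not message:
        return jsonify({"error": "No message provided"}), 400
    try:
        inputs = tokenizer(message, return_tensors="pt").to(device)
        with torch.no_grad():
            output_ids = model.generate(**inputs, max_new_tokens=100,
                                        do_sample=True, top_p=0.9)
        reply = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return jsonify({"response": reply})
    except Exception:
        logger.exception("Inference failed")
        return jsonify({"error": "Internal server error"}), 500

if __name__ == "__main__":
    # Port 5000 by default, overridable via the PORT environment variable.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 5000)))
```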
- Ensure Python and the required libraries (`flask`, `torch`, `transformers`, and `flask-cors`) are installed. Run the Flask app with `python app.py`.
- Open the `ChatClient.html` file in a web browser (e.g., by serving it with a local server like `http-server`, or by opening it directly via the `file://` protocol).
- Type a message in the input field and press Enter or click the Send button to interact with the chatbot.
The BloomForCausalLM models are part of the BLOOM (BigScience Large Open-science Open-access Multilingual) family of language models. They are designed for causal language modeling, predicting the next token in a sequence, which makes them well suited to text generation tasks.
BLOOM models come in various sizes. Here’s a comparison of some popular variants:
| Model Name | Parameters | Layers | Heads | Hidden Size | Context Window | Multilingual Support | Use Case |
|---|---|---|---|---|---|---|---|
| bloom-560m | 560 million | 24 | 16 | 1024 | 2048 tokens | Yes (46 languages) | Lightweight, fast inference, suitable for low-resource environments. |
| bloom-1b1 | 1.1 billion | 24 | 16 | 1536 | 2048 tokens | Yes (46 languages) | Balanced performance, good for general-purpose text generation. |
| bloom-1b7 | 1.7 billion | 24 | 16 | 2048 | 2048 tokens | Yes (46 languages) | Improved performance over 1b1, suitable for more complex tasks. |
| bloom-3b | 3 billion | 30 | 32 | 2560 | 2048 tokens | Yes (46 languages) | Higher capacity, better for nuanced text generation and larger contexts. |
| bloom-7b1 | 7.1 billion | 30 | 32 | 4096 | 2048 tokens | Yes (46 languages) | Strong performance for advanced tasks, requires more computational resources. |
| bloom-176b | 176 billion | 70 | 112 | 14336 | 2048 tokens | Yes (46 languages) | State-of-the-art, massive scale, requires significant computational power. |
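Switching between variants is usually just a matter of changing the checkpoint name, since the loading API is identical across sizes. A brief sketch (note that the 176B flagship is published on the Hugging Face Hub as plain `bigscience/bloom`, without a size suffix):

```python
from transformers import BloomForCausalLM, BloomTokenizerFast

# Smaller variants: bigscience/bloom-560m, -1b1, -1b7, -3b, -7b1.
variant = "bigscience/bloom-560m"

tokenizer = BloomTokenizerFast.from_pretrained(variant)
model = BloomForCausalLM.from_pretrained(variant)
```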
- bloom-560m: Ideal for lightweight applications or environments with limited computational resources. Suitable for simple text generation tasks or prototyping.
- bloom-1b7: A good balance between performance and resource requirements. Suitable for general-purpose text generation, chatbots, and more complex tasks.
- bloom-3b and bloom-7b1: Better for advanced tasks requiring higher accuracy and nuance. They require more computational power but offer significantly better performance.
- bloom-176b: State-of-the-art performance for research and large-scale applications. Requires specialized hardware (e.g., multiple GPUs or TPUs) and is not practical for most users.
- Hardware Requirements: Smaller models like bloom-560m can run on CPUs or low-end GPUs. Larger models like bloom-1b7 and above require GPUs for efficient inference. The bloom-176b model requires distributed computing infrastructure.
- Inference Speed: Smaller models are faster but may produce less coherent or nuanced text. Larger models are slower but generate higher-quality responses.
- Memory Usage: Larger models consume significantly more memory, which can be a bottleneck for deployment (a back-of-envelope estimate follows this list).
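As a rule of thumb, the weight footprint alone is the parameter count multiplied by the bytes per parameter: 4 bytes for fp32 and 2 bytes for fp16, before activations, the KV cache, and framework overhead. A quick back-of-envelope check:

```python
# Approximate weight-only memory for selected BLOOM variants.
params = {"bloom-560m": 560e6, "bloom-1b7": 1.7e9,
          "bloom-7b1": 7.1e9, "bloom-176b": 176e9}

for name, n in params.items():
    print(f"{name}: ~{n * 4 / 1e9:.1f} GB fp32, ~{n * 2 / 1e9:.1f} GB fp16")
```

By this estimate, bloom-1b7 needs roughly 6.8 GB in fp32 or 3.4 GB in fp16 for the weights alone.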
The choice of BLOOM model depends on your specific use case, available hardware, and performance requirements. For lightweight applications, bloom-560m is a good starting point, while bloom-1b7 offers a balance between performance and resource usage. For advanced tasks, larger models like bloom-7b1 or bloom-176b are recommended, though they require significant computational resources.
This project implements a chatbot using the Bloom-1.7B language model from Hugging Face's `transformers` library. The chatbot is served via a Flask web application, allowing users to interact with the model through a simple API endpoint. The application supports CORS (Cross-Origin Resource Sharing) for seamless integration with frontend applications.
- Bloom-1.7B Model: Utilizes the powerful Bloom-1.7B causal language model for generating human-like responses.
- Flask API: Provides a RESTful API endpoint (`/chat`) for sending user messages and receiving model-generated responses.
- CORS Support: Enables cross-origin requests from a specified frontend origin (e.g., `http://127.0.0.1:5500`).
- Health Check: Includes a health check endpoint (`/`) to verify that the chatbot is running.
- Error Handling: Robust error handling for invalid requests, model loading issues, and inference errors.
- Device Optimization: Automatically uses GPU if available, otherwise falls back to CPU.
Before running the application, ensure you have the following installed:

- Python 3.8 or higher
- `pip` (Python package manager)
- Clone the repository:

  ```bash
  git clone https://github.com/attributeyielding/Smart_Chat_Bot.git
  cd bloom-chatbot
  ```

- Create a virtual environment (optional but recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  The `requirements.txt` file should include:

  ```text
  flask
  torch
  transformers
  flask-cors
  ```
- Download the Bloom-1.7B model:
  - The application will automatically download the model if it is not already present in the `data/bloom-1b7` directory.
  - Ensure you have sufficient disk space (approximately 5-10 GB) for the model.
- Start the Flask server:

  ```bash
  python app.py
  ```

- The application will run on `http://0.0.0.0:5000` by default. You can access the health check endpoint at `http://127.0.0.1:5000/`.
- Interact with the chatbot: Send a POST request to the `/chat` endpoint with a JSON payload containing the user's message:

  ```json
  { "message": "Hello, how are you?" }
  ```

  Example using `curl`:

  ```bash
  curl -X POST http://127.0.0.1:5000/chat \
    -H "Content-Type: application/json" \
    -d '{"message": "Hello, how are you?"}'
  ```

  The response will be in JSON format:

  ```json
  { "response": "I am doing well, thank you! How can I assist you today?" }
  ```
- CORS Origins: By default, the application allows requests from `http://127.0.0.1:5500`. To modify this, update the `origins` parameter in the `CORS` initialization:

  ```python
  CORS(app, origins="http://your-frontend-url.com", supports_credentials=True)
  ```

- Model Path: The model is saved and loaded from the `data/bloom-1b7` directory. You can change this by modifying the `MODEL_PATH` variable in the code.

- Device: The application automatically detects and uses a GPU if available. To force CPU usage, modify the `device` variable (an environment-variable variant of these settings is sketched after this list):

  ```python
  device = torch.device("cpu")
  ```
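If you prefer not to edit the source, a common alternative, sketched here as a suggestion rather than something the project implements, is to read these settings from environment variables:

```python
import os

import torch

# Hypothetical environment-driven overrides for the settings listed above.
MODEL_PATH = os.environ.get("MODEL_PATH", "data/bloom-1b7")
FRONTEND_ORIGIN = os.environ.get("FRONTEND_ORIGIN", "http://127.0.0.1:5500")
FORCE_CPU = os.environ.get("FORCE_CPU", "0") == "1"

device = torch.device(
    "cpu" if FORCE_CPU or not torch.cuda.is_available() else "cuda"
)
```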
- Endpoint: `GET /`
- Description: Verifies that the chatbot is running.
- Response: Plain text string: `"Chatbot is running!"`

- Endpoint: `POST /chat`
- Description: Accepts a user message and returns a model-generated response.
- Request Body:

  ```json
  { "message": "Your input message here" }
  ```

- Response:

  ```json
  { "response": "Model-generated response here" }
  ```

- Error Responses:
  - `400 Bad Request`: If no message is provided.
  - `415 Unsupported Media Type`: If the `Content-Type` header is not `application/json`.
  - `500 Internal Server Error`: If an error occurs during model inference.
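As a usage illustration, the same request can be made from Python with the `requests` package (an assumed extra dependency, not listed in `requirements.txt`):

```python
import requests

resp = requests.post(
    "http://127.0.0.1:5000/chat",
    json={"message": "Hello, how are you?"},  # sends Content-Type: application/json
    timeout=120,  # generation can be slow, especially on CPU
)
resp.raise_for_status()
print(resp.json()["response"])
```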
- Model Download Issues: Ensure you have a stable internet connection and sufficient disk space. If the download fails, manually download the model using:

  ```python
  from transformers import BloomForCausalLM, BloomTokenizerFast

  tokenizer = BloomTokenizerFast.from_pretrained("bigscience/bloom-1b7")
  model = BloomForCausalLM.from_pretrained("bigscience/bloom-1b7")
  tokenizer.save_pretrained("data/bloom-1b7")
  model.save_pretrained("data/bloom-1b7")
  ```
- GPU Not Detected: If you have a GPU but it is not being used, ensure that `torch` is installed with CUDA support (a quick detection check is sketched after this list):

  ```bash
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  ```
- CORS Errors: Ensure the frontend URL is correctly specified in the `CORS` configuration.
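To verify that PyTorch can actually detect the GPU, run the following generic check (not specific to this project):

```python
import torch

print(torch.__version__)                  # installed PyTorch build
print(torch.cuda.is_available())          # True if a CUDA GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the detected GPU
```

If `is_available()` prints `False` despite a CUDA-capable GPU being present, the installed wheel is most likely CPU-only; reinstall with the CUDA index URL shown above.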
This project is licensed under the MIT License.
- Hugging Face for the `transformers` library and the Bloom model.
- Flask for the web framework.
- PyTorch for the deep learning framework.