This script cleans the text in a DataFrame column named "Review" and then runs sentiment analysis on it. It uses NLTK for text cleaning and NLTK's VADER analyzer to generate sentiment scores.
- Text Cleaning: Converts text to lowercase, removes URLs, HTML tags, punctuation, and numbers, and applies stemming.
- Sentiment Analysis: Calculates positive, negative, and neutral sentiment scores for each review and determines the overall sentiment of the dataset.
- Install Dependencies: Ensure you have NLTK and pandas installed (for example, `pip install nltk pandas`).
- Download NLTK Resources: Download the VADER lexicon with `nltk.download("vader_lexicon")`.
- Clean and Analyze Text: Run the script to clean the text, compute sentiment scores for each review, and display the overall sentiment.
- Input DataFrame: Contains a column "Review" with text data.
- Output: A DataFrame containing each original review alongside its positive, negative, and neutral sentiment scores.
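
The steps above can be sketched roughly as follows. This is a minimal illustration, not the script itself: the function names (`clean_text`, `score_reviews`, `overall_sentiment`, `demo`), the regular expressions, the choice of `PorterStemmer`, the output column names, and the rule for picking the overall sentiment (largest summed score) are all assumptions made for this example.

```python
import re
import string

import nltk
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.stem import PorterStemmer

# Assumption: a Porter stemmer; the actual script may use a different one.
_stemmer = PorterStemmer()


def clean_text(text):
    """Lowercase, strip HTML tags, URLs, punctuation, and digits, then stem."""
    text = str(text).lower()
    # Remove HTML tags first so URLs inside attributes go with the tag.
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)
    return " ".join(_stemmer.stem(word) for word in text.split())


def score_reviews(df, column="Review"):
    """Return a DataFrame with each review and its VADER pos/neg/neu scores."""
    analyzer = SentimentIntensityAnalyzer()  # requires the vader_lexicon resource
    scores = df[column].apply(lambda t: analyzer.polarity_scores(str(t)))
    return pd.DataFrame(
        {
            "Review": df[column],
            "Positive": scores.apply(lambda s: s["pos"]),
            "Negative": scores.apply(lambda s: s["neg"]),
            "Neutral": scores.apply(lambda s: s["neu"]),
        }
    )


def overall_sentiment(scored):
    """Assumed rule: the label whose scores sum to the largest total wins."""
    totals = {
        label: scored[label].sum()
        for label in ("Positive", "Negative", "Neutral")
    }
    return max(totals, key=totals.get)


def demo():
    """Hypothetical end-to-end run on a tiny hand-written DataFrame."""
    nltk.download("vader_lexicon")  # one-time download of the lexicon
    df = pd.DataFrame(
        {"Review": ["I <b>loved</b> it! See http://example.com", "Terrible, 0 stars."]}
    )
    df["Review"] = df["Review"].apply(clean_text)
    scored = score_reviews(df)
    print(scored)
    print("Overall sentiment:", overall_sentiment(scored))
```

Calling `demo()` downloads the lexicon, cleans the two sample reviews, prints the scored DataFrame, and reports the overall label. The sample reviews are invented for illustration only.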
This project is licensed under the MIT License.