This repository contains supplementary materials for the research project "Research on Social Media Style Classification Based on Quantitative Indicators". It includes corpora, code for corpus processing, and data analysis tools.
The repository is an attachment to my undergraduate graduation project.
- Corpus/: Self-built corpora in txt format, named as "topic-form-platform"
- Code/: Python and R scripts for corpus cleaning and data analysis
- Data/: Quantitative linguistic analysis results from the corpora
- Report/: (Upcoming) Interactive reports with detailed corpus introductions and data visualizations
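The "topic-form-platform" naming scheme can be read back programmatically. The following is a minimal sketch, not part of the repository's scripts: the names `CorpusName` and `parse_corpus_name` are hypothetical, and it assumes the three parts are separated by single hyphens and that no part contains a hyphen itself.

```python
from pathlib import Path
from typing import NamedTuple


class CorpusName(NamedTuple):
    """The three parts of a corpus file name (hypothetical helper type)."""
    topic: str
    form: str
    platform: str


def parse_corpus_name(path: str) -> CorpusName:
    """Split a file name like 'topic-form-platform.txt' into its three parts.

    Assumption: parts are separated by single hyphens and contain none.
    """
    stem = Path(path).stem  # drop the directory and the .txt extension
    parts = stem.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected 'topic-form-platform', got {stem!r}")
    return CorpusName(*parts)
```

For a made-up file name such as `Corpus/sports-post-weibo.txt`, `parse_corpus_name` would return the fields `topic="sports"`, `form="post"`, and `platform="weibo"`. If real file names use hyphens inside a part, the split rule would need to be adjusted.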
Clone the repository to your local machine:
git clone https://github.com/exusiaiwei/supp-social-media-style-2023.git
cd supp-social-media-style-2023
Ensure you have the necessary dependencies installed (list to be provided).
- Navigate to the Code/ directory
- Modify file paths in the scripts as needed
- Run the desired Python or R scripts
For detailed instructions on each script, please refer to the comments within the code files.
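When adjusting file paths, one option is to build them from a single repository root rather than editing each path by hand. The sketch below only illustrates that idea and is not taken from the repository's scripts: `list_corpus_files` is a hypothetical helper, and it assumes the corpora are `.txt` files somewhere under `Corpus/`.

```python
from pathlib import Path
from typing import List


def list_corpus_files(repo_root: Path) -> List[Path]:
    """Return every .txt file under <repo_root>/Corpus, sorted by path.

    Assumption: corpora live in Corpus/ (possibly in subfolders).
    """
    corpus_dir = repo_root / "Corpus"
    # rglob also searches subdirectories, in case the corpora are nested
    return sorted(corpus_dir.rglob("*.txt"))
```

With this approach only the value passed as `repo_root` changes between machines, for example `list_corpus_files(Path("supp-social-media-style-2023"))` after cloning.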
The Data/ folder contains the results of quantitative linguistic analysis performed on the corpora. This data is available for replication studies by other researchers.
Contributions are welcome! Please feel free to submit a Pull Request.
- Tieba_Spider: Modified for thesis data collection
- WeiboSpider: Used for thesis data collection
This project is licensed under the MIT License - see the LICENSE file for details.