Gianina Morales | Fall 2022 | gcm31@pitt.edu
A perennial interest of researchers is understanding the history and characteristics of their fields. Topic modeling is a data mining and machine learning technique that automatically analyzes texts to identify latent topic structures. I used topic modeling techniques to analyze the trends over time in literacy research and scholarship in one leading journal and a conference papers journal in the field of Literacy education. My research questions are:
- What are the trends in topics of literacy education research and scholarship over more than five decades (1969-2022) of the focal journals?
- How do the topics have changed over time?
3,131 journal articles that represent all the publications between 1969 and 2022 in the focal journals. I obtained the data as part of a major research project and for a confidential agreement, I cannot share the raw data. However, I put the tidy data that GitHub allowed (by weight) after mining the texts in this folder.
-
Final report: document with the results of my study.
-
Analysis: folder with the
Rmd
files where I developed all the processing. -
Data product samples: folder with the files resulting from text mining and analysis.
-
Images: folder with the figures and tables of the project.
-
Progress report: detail of the advances in the process of wrangling and analysis of the data.
-
Project plan: project proposal. Includes my initial ideas for the project.
-
Project presentation: PowerPoint used in the presentation of the project in the course Data Science for Linguists.
-
README: this document with the generalities of the project.
-
LICENSE: Attribution-NonCommercial-ShareAlike 4.0 International