# Week 6: Unsupervised learning (topic models)

This week builds on the scaling techniques we explored in Week 5 and turns to another unsupervised approach: topic modelling.

The substantive articles by @nelson_computational_2020 and @alrababah_authoritarian_2020 each show how topic models can be used to categorize the thematic content of text.

The article by @ying_topics_2021 provides a valuable overview of, and complement to, the earlier work of @denny_text_2018. Both are useful when thinking about how we validate our findings and test the robustness of any inferences we draw from these models.

Questions:

1.  What assumptions underlie topic modelling approaches?
2.  Can we develop structural models of text?
3.  Is topic modelling a discovery or measurement strategy?
4.  How do we validate any model?

**Required reading**:

-   @nelson_computational_2020
-   @parthasarathy2019
-   @ying_topics_2021

**Further reading**:

-   @chang_reading_2009
-   @alrababah_authoritarian_2020
-   @grimmer_general_2011
-   @denny_text_2018
-   @smith_automatic_2021
-   @boyd_characterizing_2018

**Slides**:

-   Week 6 [Slides](https://docs.google.com/presentation/d/1SeL25sA0a7OoJhPOy5lvYuvqOZAUJBkh17VRTG5_VAw/edit?usp=sharing)