Code used for my Master's thesis: ‘The impact of data ordering in Large Language Models: a study on Curriculum Learning’.
The repository contains the code for each phase of the experiment:
- Pretraining BERT with Masked Language Modeling (MLM)
- Applying the probing task approach
- Calculating plausibility
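
For readers unfamiliar with the pretraining phase, the following is a minimal sketch of the token-masking step behind Masked Language Modeling. It is not taken from the thesis code. The function name `mask_tokens`, the 15% masking rate, and the 80/10/10 replacement split are assumptions based on the standard BERT recipe, and the toy vocabulary is purely illustrative.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask symbol, as in the standard BERT vocabulary


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) using a BERT-style 80/10/10 rule.

    labels[i] holds the original token at positions selected for prediction
    and None everywhere else. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Select each position for prediction with probability mask_prob.
        if rng.random() >= mask_prob:
            continue
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN          # 80%: replace with the mask symbol
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: leave the original token in place
    return masked, labels


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    toy_vocab = sorted(set(sentence))
    print(mask_tokens(sentence, toy_vocab, seed=0))
```

The model is then trained to recover `labels` from `masked`. In a curriculum learning setting, only the order in which sentences are fed to this step would change, not the masking itself.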