Utility scripts for building training sets and preparing data for Stable Diffusion training.
Creates an n-gram frequency list, sorted by frequency, of all words in the .txt files in the given directory.
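
As a rough illustration of the idea (not the script itself), the counting step could look like the sketch below. The function name `ngram_counts`, the whitespace tokenization, and the non-recursive directory scan are all assumptions made for this example; the real script may split on commas or handle punctuation differently.

```python
from collections import Counter
from pathlib import Path


def ngram_counts(directory, n=1):
    """Count word n-grams across the .txt files directly inside `directory`.

    Hypothetical helper: words are split on whitespace, which is an assumption.
    """
    counts = Counter()
    for path in sorted(Path(directory).glob("*.txt")):
        words = path.read_text(encoding="utf-8").split()
        # Slide a window of n words over the file and count each n-gram.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    # most_common() returns (ngram, count) pairs, highest count first.
    return counts.most_common()
```

Calling `ngram_counts("captions", n=2)` on a hypothetical `captions` directory would return bigrams with their counts, most frequent first.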
A general find/replace script for caption files. See the comments in the script for usage.
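
A minimal sketch of what a caption find/replace pass might look like follows. The name `replace_in_captions` and the plain substring matching are assumptions for illustration; the actual script's options are described in its own comments.

```python
from pathlib import Path


def replace_in_captions(directory, find, replace):
    """Replace `find` with `replace` in every .txt file inside `directory`.

    Hypothetical helper: uses literal substring matching, not regex.
    Returns the number of files that were modified.
    """
    changed = 0
    for path in sorted(Path(directory).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        new_text = text.replace(find, replace)
        # Only rewrite files whose content actually changed.
        if new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed += 1
    return changed
```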
A processor for Hydrus-exported .txt files that makes them usable as Stable Diffusion captions. Supports adding prefixes and suffixes and removing tokens.
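
The prefix, suffix, and token-removal steps can be sketched as below. This assumes the exported file holds one tag per line and that the desired output is a single comma-separated caption; both are assumptions, as is the function name `build_caption`. The real script may behave differently.

```python
def build_caption(raw, prefix="", suffix="", remove=()):
    """Turn newline-separated tags into one comma-separated caption.

    Hypothetical helper: the one-tag-per-line input format is assumed.
    """
    # Keep non-empty lines, trimmed of surrounding whitespace.
    tags = [line.strip() for line in raw.splitlines() if line.strip()]
    drop = set(remove)
    kept = [tag for tag in tags if tag not in drop]
    # Skip empty parts so no stray separators appear in the caption.
    parts = [part for part in (prefix, ", ".join(kept), suffix) if part]
    return ", ".join(parts)
```

For example, `build_caption("1girl\nsolo\nrating:safe\n", prefix="photo of", remove=["rating:safe"])` drops the rating tag and prepends the prefix.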
Renames .txt files that carry a double extension such as .jpg.txt (or similar) to plain .txt.
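
One way such a rename could be done is sketched here. The set of image extensions, the function name `strip_image_ext`, and the choice to skip a file when the target name already exists are all assumptions for this example.

```python
from pathlib import Path

# Assumed list of inner extensions to strip; adjust as needed.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def strip_image_ext(directory):
    """Rename e.g. photo.jpg.txt to photo.txt; return the new file names."""
    renamed = []
    for path in sorted(Path(directory).glob("*.txt")):
        inner = Path(path.stem)  # "photo.jpg" for "photo.jpg.txt"
        if inner.suffix.lower() in IMAGE_EXTS:
            target = path.with_name(inner.stem + ".txt")
            # Never overwrite an existing caption file.
            if not target.exists():
                path.rename(target)
                renamed.append(target.name)
    return renamed
```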
Simple find/replace on file names.
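
A file-name find/replace might be sketched as follows. The name `rename_files`, the literal substring matching, and the skip-on-collision behavior are assumptions for illustration rather than a description of the actual script.

```python
from pathlib import Path


def rename_files(directory, find, replace):
    """Replace `find` with `replace` in the names of files in `directory`.

    Hypothetical helper: returns a list of (old_name, new_name) pairs.
    """
    renamed = []
    # sorted() snapshots the listing before any file is renamed.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and find in path.name:
            target = path.with_name(path.name.replace(find, replace))
            # Skip if the new name is already taken.
            if not target.exists():
                path.rename(target)
                renamed.append((path.name, target.name))
    return renamed
```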