Addition of the tool scandirpdf2txt

@albion2000 albion2000 released this 18 Feb 14:08
· 17 commits to master since this release

This release includes four tools:

naming_conventions.py and naming_conventions_do_rename.py, to enforce strict naming rules on the directory names in a file tree

check_jpegs, for a fast sanity check of a tree of JPEG files, and check_jpegs_full, for a deeper and slower sanity check

scandir2pdf, for bulk conversion of JPEGs to PDFs.

New: scandirpdf2txt, for bulk conversion of OCRed PDFs to text files, so that their content can be searched quickly with dedicated full-text search tools (Google or others).
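To illustrate what enforcing naming rules over a directory tree can look like, here is a minimal sketch. It is not the code of naming_conventions.py: the rule itself (lowercase letters, digits, underscores and hyphens only) and the names `NAME_RULE` and `find_bad_dir_names` are assumptions made up for this example. It only reports offending directories and renames nothing.

```python
import os
import re

# Hypothetical rule, chosen for illustration only:
# lowercase letters, digits, underscores and hyphens.
NAME_RULE = re.compile(r"[a-z0-9_-]+")


def find_bad_dir_names(root, rule=NAME_RULE):
    """Return the sorted relative paths of directories whose name breaks the rule."""
    bad = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            # fullmatch requires the whole name to satisfy the rule
            if not rule.fullmatch(name):
                full_path = os.path.join(dirpath, name)
                bad.append(os.path.relpath(full_path, root))
    return sorted(bad)
```

Calling `find_bad_dir_names("/path/to/scans")` returns a list to review before any renaming is attempted, which mirrors the split between a checking script and a separate renaming script.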

Validated on 27K+ JPEG files and 1.2K PDFs.
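As an illustration of what a fast sanity check on a large tree of JPEG files might involve, the sketch below only verifies that each file starts with the JPEG start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9), without decoding the image. This is an assumed approach, not necessarily what check_jpegs does, and the function names are hypothetical. Some valid files carry extra bytes after the final marker, so a hit means "worth a closer look", not "certainly corrupt".

```python
import os

SOI = b"\xff\xd8"  # JPEG Start Of Image marker
EOI = b"\xff\xd9"  # JPEG End Of Image marker


def looks_like_complete_jpeg(path):
    """Cheap check: file begins with SOI and ends with EOI (no decoding)."""
    if os.path.getsize(path) < 4:
        return False
    with open(path, "rb") as f:
        head = f.read(2)
        f.seek(-2, os.SEEK_END)  # jump to the last two bytes
        tail = f.read(2)
    return head == SOI and tail == EOI


def find_suspect_jpegs(root):
    """Return the sorted paths of .jpg/.jpeg files under root that fail the check."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".jpg", ".jpeg")):
                path = os.path.join(dirpath, name)
                if not looks_like_complete_jpeg(path):
                    suspects.append(path)
    return sorted(suspects)
```

Because only four bytes per file are read, a check of this kind stays cheap even on tens of thousands of files; a deeper check would fully decode each image at a much higher cost.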