This dataset is intended for Baybayin word recognition systems and other related studies.
Because of the 25 MB per-file upload limit, the dataset is split into two zip files. The first 500 Baybayin word images are compressed in Baybayin Word Images_1st500.zip, while the other 500 are in Baybayin Word Images_2nd500.zip.
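Since the images arrive in two archives, it may be convenient to unpack both into a single folder before use. The following is a minimal sketch using only the Python standard library; the function name extract_archives and the destination folder name are illustrative assumptions, and the internal folder layout of the zip files is not described here, so the sketch simply collects every extracted file.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_paths, dest_dir):
    """Extract each zip archive into dest_dir and return the extracted files.

    Hypothetical helper for illustration; it is not part of the dataset.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for archive in archive_paths:
        # Each archive is unpacked into the same destination folder.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    # Walk the destination recursively, because the archives may contain subfolders.
    return sorted(p for p in dest.rglob("*") if p.is_file())
```

Calling extract_archives(["Baybayin Word Images_1st500.zip", "Baybayin Word Images_2nd500.zip"], "baybayin_word_images") from the download folder should leave all images under one directory. If both archives contain files with the same relative path, the later archive overwrites the earlier one.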
The spreadsheet file, Baybayin Word Images List.xlsx, contains the list of provided Tagalog words with Baybayin images.
A document file, Baybayin Word References.docx, is also provided. It lists the sources from which we took and cropped most of the Baybayin word images.
Tagalog_words_74419+.xlsx is a spreadsheet file that contains 74490 Tagalog words (and some default phrases) collected from publicly available Tagalog word archives on the internet.
These datasets were used to assess our proposed Baybayin word recognition system.
If you have any questions regarding the dataset provided, just email me at rbpino@up.edu.ph. You can read the full paper where we utilized these datasets here.
A big thanks to the Baybayin Facebook Public Group named Baybayin - Philippine National Writing System, where I found and cropped most of the collected Baybayin word images. The group aims to spread and restore the Baybayin writing system throughout the Filipino nation and beyond. You can visit their page at https://www.facebook.com/groups/Baybayin.PhilippineNationalWritingSystem. Mabuhay!
These datasets are part of the ongoing restoration of the Baybayin script in the Philippines. Others may use them for Baybayin-related research.