RTR Text reflow #49
Comments
Sorry for the delay, and thank you for working on this. Here are a few notes that are already published for you to take a look at. Let me know if I can help.
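I have had a couple of false starts at an algorithm here. The observations so far:
* The label field at the top has all of the text in the right order but no bounding boxes.
* The words field has all of the bounding boxes but the words may be out of order and are hard to reliably sort.

Probably need to make a third attempt here where I trust the label field and then try to build a bounding box around each line using fuzzy matching against the words field.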
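A minimal sketch of that third attempt, for discussion only. It assumes the label is a newline-separated string and that each entry in the words field carries its text and a box; the `Box` and `WordBox` shapes, the `lineBoxes` name, and the 0.6 threshold are all hypothetical and not taken from the plugin or from IRecognitionElement. Fuzzy matching here is plain normalized Levenshtein similarity.

```typescript
// Hypothetical shapes: the real recognition data may use different field names.
interface Box { x: number; y: number; width: number; height: number; }
interface WordBox { text: string; box: Box; }

// Levenshtein edit distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // value of D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value of D[i-1][j]
      row[j] = Math.min(
        above + 1,                               // deletion
        row[j - 1] + 1,                          // insertion
        diag + (a[i - 1] === b[j - 1] ? 0 : 1),  // substitution or match
      );
      diag = above;
    }
  }
  return row[b.length];
}

// Case-insensitive similarity in [0, 1]; 1 means identical.
function similarity(a: string, b: string): number {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - editDistance(x, y) / longest;
}

// Smallest box that covers every box in a non-empty list.
function union(boxes: Box[]): Box {
  const left = Math.min(...boxes.map(b => b.x));
  const top = Math.min(...boxes.map(b => b.y));
  const right = Math.max(...boxes.map(b => b.x + b.width));
  const bottom = Math.max(...boxes.map(b => b.y + b.height));
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Trust the label for text and order. For each token on a line, claim the
// most similar unused word box, then union the claimed boxes for that line.
function lineBoxes(
  label: string,
  words: WordBox[],
  threshold = 0.6, // assumed cut-off, would need tuning on real notes
): { text: string; box: Box | null }[] {
  const used = new Set<number>();
  return label
    .split("\n")
    .filter(line => line.trim() !== "")
    .map(text => {
      const claimed: Box[] = [];
      for (const token of text.trim().split(/\s+/)) {
        let best = -1;
        let bestScore = -1;
        for (let i = 0; i < words.length; i++) {
          if (used.has(i)) continue;
          const score = similarity(token, words[i].text);
          if (score > bestScore) { best = i; bestScore = score; }
        }
        if (best >= 0 && bestScore >= threshold) {
          used.add(best);
          claimed.push(words[best].box);
        }
      }
      return { text, box: claimed.length > 0 ? union(claimed) : null };
    });
}
```

The greedy claim keeps the sketch short, but repeated words on one page can be assigned to the wrong line; a real implementation would likely also weigh the position of neighbouring matches.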
Thank you for your work on this. The new PDF attach feature is awesome; I have been working through my current notes.
> I have had a couple of false starts at an algorithm here. The observations so far:
> * The label field at the top has all of the text in the right order but no bounding boxes.
> * The words field has all of the bounding boxes but the words may be out of order and are hard to reliably sort.
>
> Probably need to make a third attempt here where I trust the label field and then try to build a bounding box around the lines using fuzzy matching from the words fields for each line.
Problem: Recognized text isn't really formatted into paragraphs in a way that is nice for markdown.
Solution: There needs to be a heuristic algorithm that groups the bounding boxes (see IRecognitionElement) and their text into reflowed text.
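One possible heuristic, sketched under assumptions rather than taken from the plugin: given lines that are already in reading order and each have a bounding box, start a new paragraph whenever the vertical gap to the previous line is large compared with the typical line height. The `Line` shape, the `reflow` name, and the `gapFactor` default of 0.8 are hypothetical and would need tuning against real notes.

```typescript
// Hypothetical shapes: not the plugin's actual types.
interface Box { x: number; y: number; width: number; height: number; }
interface Line { text: string; box: Box; }

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Join lines into Markdown paragraphs. Lines must already be in reading order.
function reflow(lines: Line[], gapFactor = 0.8): string {
  if (lines.length === 0) return "";
  const typicalHeight = median(lines.map(l => l.box.height));
  const paragraphs: string[][] = [[lines[0].text]];
  for (let i = 1; i < lines.length; i++) {
    const prev = lines[i - 1].box;
    // Vertical white space between the previous line's bottom and this line's top.
    const gap = lines[i].box.y - (prev.y + prev.height);
    if (gap > gapFactor * typicalHeight) paragraphs.push([]);
    paragraphs[paragraphs.length - 1].push(lines[i].text);
  }
  // Lines inside a paragraph are joined with spaces; paragraphs with a blank line.
  return paragraphs.map(p => p.join(" ")).join("\n\n");
}
```

Using the median line height rather than a fixed pixel value keeps the threshold independent of handwriting size, though lists, headings, and indented text would need extra rules.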
How you can help: Please upload simple single-page RTR `.note` files to this issue and include a copy of the text formatted in the way you wish it was formatted so I can generate some test cases. For example:
Note: rtr.note.zip
Screenshot (optional):![Image](https://private-user-images.githubusercontent.com/33544/406674938-39bbbf22-b28a-4e96-aa53-9f0fd0810d93.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkxODY4ODYsIm5iZiI6MTczOTE4NjU4NiwicGF0aCI6Ii8zMzU0NC80MDY2NzQ5MzgtMzliYmJmMjItYjI4YS00ZTk2LWFhNTMtOWYwZmQwODEwZDkzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEwVDExMjMwNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTc4YzcyY2MwN2Y5ZDIwNjZkNWNkZmZiMDg0ODMzNjFjNzI3Zjc3MzA4N2Q5YmIwYTVjYWFkMTA2ODExOTYxZTEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.mVlvDRoX1BAZa3dvnAlPdf2HGic9RgYZ9zKKOo1y7_Y)
Should output this: