WCAG 4.1.2 tests 9 and 14 possible false positives #75

Closed
jzuidweg opened this issue Aug 11, 2021 · 6 comments

@jzuidweg

This PDF document generates the following errors that appear to be false positives:

  • WCAG 4.1.2 test 9 "Missing table elements" points to text that is clearly a real table. It is probably also tagged as one, because Swink indicated that this is a well-tagged file. It's not clear why the algorithm thinks this is not a table.
  • WCAG 4.1.2 test 15 "Paragraph is incorrectly tagged as a numbered heading" is reported for two occurrences. At least the second (the line "1. Bezoekers") is obviously a real heading. Even more curious, the algorithm does not report this error for other, similar headings in the document such as "2. Bezoeken", "3. Acties", etc. So the algorithm not only gives a false positive, it also behaves inconsistently.
@bdoubrov
Collaborator

bdoubrov commented Aug 25, 2021

The issue with false "Missing table elements" is fixed.

As for the incorrect numbered heading, the problem is that both H2 (Definities) and H3 (1. Bezoekers) on page 2 have an identical font style. This is why they are recognized not as headings but as ordinary paragraphs.

We can enhance the heading-detection heuristics to take into account, for example, a leading number in a one-line paragraph.
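A minimal sketch of the leading-number idea, in Python, assuming a paragraph is available as plain text together with its line count. The function name `looks_like_numbered_heading` and the regular expression are illustrative assumptions, not taken from the actual implementation.

```python
import re

# Hypothetical pattern: "1.", "2.3" or "4.1.2." followed by whitespace and a title.
_LEADING_NUMBER = re.compile(r"^\s*\d+(?:\.\d+)*\.?\s+\S")


def looks_like_numbered_heading(text: str, line_count: int = 1) -> bool:
    """Return True for a one-line paragraph that starts with a section number."""
    if line_count != 1:
        # Multi-line paragraphs are not considered heading candidates here.
        return False
    return _LEADING_NUMBER.match(text) is not None


if __name__ == "__main__":
    for sample in ("1. Bezoekers", "2. Bezoeken", "Definities"):
        print(sample, looks_like_numbered_heading(sample))
```

A check like this would also match a one-line sentence that happens to begin with a number, so it is probably better treated as one signal among several rather than as a decision on its own.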

@jzuidweg
Author

jzuidweg commented Sep 10, 2021

@bdoubrov, I think it would be a good idea to enhance the heuristics along the lines you described. For now, I see that the algorithm still flags the headings as wrongly tagged.

@bdoubrov
Collaborator

bdoubrov commented Nov 3, 2021

We'll implement two changes in heading recognition:

  • slightly increase the probability of a heading if the author tagged it as a heading
  • check for leading numbering at the start of a line
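The two changes above could be combined roughly as follows. This is only a sketch in Python: the function name `heading_probability`, the idea of a style-based `base` score, and both boost values are assumptions made for illustration. The thread only says the tag-based increase should be slight.

```python
def heading_probability(
    base: float,
    tagged_as_heading: bool,
    has_leading_number: bool,
    tag_boost: float = 0.05,      # assumed value; "slightly" per the thread
    number_boost: float = 0.15,   # assumed value, not from the thread
) -> float:
    """Combine a style-based score with two extra signals, clamped to [0, 1]."""
    score = base
    if tagged_as_heading:
        # The author's own tag is treated as weak supporting evidence.
        score += tag_boost
    if has_leading_number:
        # A leading section number such as "1." adds further evidence.
        score += number_boost
    return max(0.0, min(1.0, score))


if __name__ == "__main__":
    print(heading_probability(0.5, tagged_as_heading=True, has_leading_number=True))
```

Keeping the tag boost small means a paragraph that looks nothing like a heading is still flagged even when it is tagged as one, which is the purpose of the check.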

@bdoubrov
Collaborator

The second item is implemented in #105.

@jzuidweg
Author

jzuidweg commented Dec 15, 2021

This issue is marked as fixed-in-dev, but WCAG 4.1.2 test 9 "Missing table elements" continues to give false positives. I'm not closing this issue yet.

@jzuidweg
Author

Closing this issue, but opening new issue #273 for recognition of headers.
