WCAG 2.0 Parsing Criterion is a PITA

Posted on Friday, 20 November 2015 by Steve Faulkner

The WCAG 2.0 Parsing Criterion is a Pain In The Ass (PITA) because checking it throws up lots of potential errors which, if they all have to be fixed, may result in a lot of extra work (in some cases busy work) for developers. This is largely due to the lack of robust tools for producing the specific set of issues that actually require fixing.

I have discussed the parsing criterion previously in WCAG 2.0 parsing error bookmarklet, which also provides a bookmarklet that helps to filter out some HTML conformance checker errors that are definitely (maybe) not potential accessibility issues.

IMPORTANT NOTE:

I am not saying here that checking and fixing HTML conformance errors is not an important and useful part of the web development process, only that fixing all HTML conformance errors is not a requirement for accessibility. There are good reasons to validate your HTML as part of the development process.

What the WCAG parsing criterion requires?

It really only requires fixing a very limited subset of the errors and warnings that may be produced by the only tools available for testing the WCAG parsing criterion (i.e. HTML conformance checkers). You can use an HTML conformance checker to find such errors, but the errors that need fixing for accessibility purposes can often be needles in a haystack.

1. Complete start and end tags

note: but only when this is required by the specification

Examples of what happens:

This:

fieldset><input></fieldset>

Displays this on page:

fieldset>

or

This:

<img src="HTML5_Logo.png" alt="HTML5"
<p>test</p>

Produces this in DOM:

<img <p="" alt="HTML5" src="HTML5_Logo.png"> test
<p></p>

i.e. an unintended empty p element that does not contain the intended text, and a mutant attribute <p="" sprouted on the img element.

What this requirement does not mean

Adding end tags to every element:

Not this! <input></input>
...
Not this!<li>list item </li>
...

or self-closing elements that have no end tags:

Not this! <input /> <img />

HTML has rules detailing which elements require end tags and under what circumstances: see Optional tags in the HTML specification. You can also find this information under Tag omission in text/html in the definition of each element in HTML, for example:

4.5.9 The abbr element

Tag omission in text/html:

Neither tag is omissible

The good news is that most code errors of this type will be fairly obvious, as they will show up as text strings in the rendered page, affect the style/positioning of content, or produce funky attributes in the DOM.
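
The tag omission rules above can be roughly approximated in a few lines. What follows is only a sketch in Python, using the standard library's html.parser, which is not a browser-grade HTML parser. The names VOID, OPTIONAL_END and find_unclosed are made up for this illustration, and the two sets are my reading of the void elements and the elements whose end tag may be omitted, so treat them as assumptions to check against the specification.

```python
from html.parser import HTMLParser

# Assumption: void elements, which never take an end tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

# Assumption: elements whose end tag may be omitted in some circumstances.
OPTIONAL_END = {"li", "dt", "dd", "p", "rt", "rp", "optgroup", "option",
                "colgroup", "caption", "thead", "tbody", "tfoot", "tr",
                "td", "th", "html", "head", "body"}


class UnclosedTagFinder(HTMLParser):
    """Tracks open elements and records those left without a required end tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.unclosed = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Stray end tags are ignored in this sketch.
        if tag in VOID or tag not in self.open_tags:
            return
        # Pop back to the matching start tag, noting anything skipped over.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            if top not in OPTIONAL_END:
                self.unclosed.append(top)


def find_unclosed(markup):
    """Return names of elements that needed an end tag but never got one."""
    finder = UnclosedTagFinder()
    finder.feed(markup)
    finder.close()
    leftovers = [t for t in reversed(finder.open_tags) if t not in OPTIONAL_END]
    return finder.unclosed + leftovers
```

With this, find_unclosed("<div><span>text</div>") reports the span, while a list written with omitted li end tags, or an input with no end tag, is left alone.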

2. Malformed attribute and attribute values

quoted attributes

Any attributes that take text strings or a set of space-separated tokens or a set of comma-separated tokens or a valid list of integers, need to be quoted:

Do this:

<p class="poot pooter">some text about poot</p>
<img alt="The Etiology of poot." src="poot.png">

Not this:

//missing end quote on class attribute with multiple values: 
Not this!<p class="poot pooter>some text about poot</p>

//no quotes on class attribute with multiple values: 
Not this!<p class=poot pooter>some text about poot</p>

//missing start quote on alt attribute
Not this!<img alt=The Etiology of poot." src="poot.png">

//no quotes on alt attribute
Not this!<img alt=The Etiology of poot. src="poot.png">

Note: although some attributes do not require quoted values, the safest and sanest thing to do is quote all attributes.
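
To see why the quotes matter for multi-value attributes, it can help to look at what a parser actually extracts. This is a minimal sketch using Python's standard html.parser rather than a browser, so it only illustrates the general behaviour. The names AttributeDump and parsed_attributes are made up for the example.

```python
from html.parser import HTMLParser


class AttributeDump(HTMLParser):
    """Records every start tag together with the attributes the parser found."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def parsed_attributes(markup):
    """Return a list of (tag, [(attribute name, value), ...]) tuples."""
    parser = AttributeDump()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Quoted: one class attribute carrying two tokens.
quoted = parsed_attributes('<p class="poot pooter">')

# Unquoted: the second token is treated as a separate, valueless attribute.
unquoted = parsed_attributes("<p class=poot pooter>")
```

In the unquoted case the element ends up with class set to poot plus a stray attribute named pooter, which is not what the author intended.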

Spaces between attributes

Do this:

<p class="poot" id="pooter">some text about poot</p>
<img alt="The Etiology of poot." src="poot.png">

Not this:

//no space between class and id attributes: 
Not this!<p class="poot"id="pooter">some text about poot</p>

//no space between alt and src attributes:
Not this!<img alt="The Etiology of poot."src="poot.png">

Further reading on attributes: Failure of Success Criterion 4.1.1 due to incorrect use of start and end tags or attribute markup
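
A missing space between attributes can also be hunted for in the raw source. The following is a rough heuristic in Python, not a parser: it assumes start tags contain no stray < or > characters and that quotes are balanced, so it can miss cases or raise false alarms. The function name tags_missing_attribute_space is made up for this sketch.

```python
import re

# Heuristic: anything that looks like a start tag.
START_TAG = re.compile(r"<[a-zA-Z][^<>]*>")

# A quoted attribute value immediately followed by something other than
# whitespace, "/" or ">", i.e. the next attribute with no space before it.
VALUE_THEN_NO_SPACE = re.compile(r"""=\s*(?:"[^"]*"|'[^']*')(?=[^\s/>])""")


def tags_missing_attribute_space(markup):
    """Return the start tags where a quoted value runs into the next attribute."""
    return [
        match.group(0)
        for match in START_TAG.finditer(markup)
        if VALUE_THEN_NO_SPACE.search(match.group(0))
    ]
```

Run against the examples above, it flags the two "Not this" start tags and passes the correctly spaced ones.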

3. Elements are nested according to their specifications

What this requirement means is that you cannot do something silly like having a list item li without it having a ul or ol as a parent:

<div>
Not this!<li>list item</li> 
Not this!<li>list item</li>
</div>

or multiple controls inside a label element:

<label>
first name <input type="text"> 
Not this!last name <input type="text"> 
</label>

Examples of what happens:

For “a list item li without it having a ul or ol as a parent”, depending on the browser, the semantics of the list item, including the role, list size and position of the item in the list, are lost. It also results in funky rendering across browsers.

For “multiple controls inside a label element” depending on the browser, the accessible name for each of the controls is a concatenation of the text inside the label, so in the example case, each control has an accessible name of “first name last name”. Also clicking, with the mouse, on either text label will move focus to the first control in the label element.
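
Both of the nesting examples above can be checked mechanically. Here is a small Python sketch using the standard html.parser. It covers only these two rules and is nowhere near a full content model check. The sets VOID, LIST_PARENTS and LABELABLE and the function nesting_problems are names invented for this example, and the sets reflect my assumptions about the relevant elements.

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
LIST_PARENTS = {"ul", "ol", "menu"}  # assumed valid parents for li
LABELABLE = {"button", "input", "meter", "output", "progress",
             "select", "textarea"}  # assumed form controls a label can name


class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.controls_in_label = 0
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # A new li implies the end of a directly preceding open li.
            if self.open_tags and self.open_tags[-1] == "li":
                self.open_tags.pop()
            parent = self.open_tags[-1] if self.open_tags else None
            if parent not in LIST_PARENTS:
                self.problems.append(f"li inside {parent or 'no element'}")
        if tag == "label":
            self.controls_in_label = 0
        if tag in LABELABLE and "label" in self.open_tags:
            self.controls_in_label += 1
            if self.controls_in_label == 2:
                self.problems.append("label contains more than one control")
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def nesting_problems(markup):
    """Return a list of human-readable nesting problems found in the markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems
```

Feeding it the div-wrapped list items reports each orphaned li, and the two-input label is reported once.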

4. Elements do not contain duplicate attributes

Pretty simple, don’t do this:

<img alt="html5" Not this! alt="html6">

Note: although this is a requirement in the WCAG criteria and an HTML conformance requirement, it causes no harm accessibility-wise unless the second instance of the duplicate attribute is the one that exposes required information. The usual processing behaviour for duplicate attributes is that the first instance is used and further instances are ignored.
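
Duplicate attributes are easy to find with a script. This is a minimal Python sketch using the standard html.parser. Unlike a browser, it hands back every instance of an attribute, so you can see which value would be dropped. The names DuplicateAttributeFinder and duplicate_attributes are made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class DuplicateAttributeFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.duplicates = []

    def handle_starttag(self, tag, attrs):
        # attrs lists every attribute in source order, repeats included.
        counts = Counter(name for name, _ in attrs)
        for name, count in counts.items():
            if count > 1:
                values = [value for key, value in attrs if key == name]
                self.duplicates.append((tag, name, values))


def duplicate_attributes(markup):
    """Return (tag, attribute name, [values in source order]) for each repeat."""
    finder = DuplicateAttributeFinder()
    finder.feed(markup)
    finder.close()
    return finder.duplicates
```

For the img above this returns the alt attribute with both values in source order, so the value that gets ignored is the second one in the list.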

5. Any IDs are unique

Again, pretty simple, don’t do this:

<body>
...
<p id="IAmUnique">
...
 <div Not this! id="IAmUnique"> 
... 
</body>

Note: although this is a requirement in the WCAG criteria and an HTML conformance requirement, it causes no harm accessibility-wise unless the id value is being referenced by a relationship attribute such as for, headers or aria-labelledby.
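
Since the duplicates that matter most are the ones being referenced, a check can report both facts together. This is a rough Python sketch using the standard html.parser. ID_REFERENCE_ATTRS is my assumed, incomplete list of relationship attributes, and duplicate_ids is a name invented for the example.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumption: a partial list of attributes whose values reference ids.
ID_REFERENCE_ATTRS = {"for", "headers", "aria-labelledby", "aria-describedby"}


class IdCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.ids = Counter()
        self.referenced = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if name == "id":
                self.ids[value] += 1
            elif name in ID_REFERENCE_ATTRS:
                # These attributes may hold several space-separated ids.
                self.referenced.update(value.split())


def duplicate_ids(markup):
    """Map each duplicated id to True if a relationship attribute references it."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return {
        value: value in collector.referenced
        for value, count in collector.ids.items()
        if count > 1
    }
```

A result of True marks a duplicate that is likely to affect accessible names or relationships. False marks one that is a conformance error but probably harmless for accessibility.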

Some further examples of HTML conformance errors that ARE NOT WCAG parsing criterion fails

  • Unrecognized attributes:

    Error: Attribute event not allowed on element a at this point.

  • Unrecognized Elements:

    Error: Element poot not allowed as child of element body in this context.

  • Bad attribute values:

    Error: Bad value grunt for attribute type on element input.

  • Missing attribute values:

    Error: Element meta is missing one or more of the following attributes: content, property.

  • Obsolete elements and attributes:

    Error: The align attribute on the td element is obsolete.
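
A first pass over checker output can set aside messages of the kinds listed above. This is a hedged Python sketch: the patterns are hypothetical, written only to match the example messages in this post, and the real wording may differ between checkers and versions. Anything that does not match is kept for manual review. That includes "not allowed as child" messages, because the same wording can also describe a genuine nesting problem such as an li without a list parent.

```python
import re

# Hypothetical patterns modelled on the example messages above.
NOT_PARSING_FAILS = [
    re.compile(r"Attribute \S+ not allowed on element \S+ at this point\."),
    re.compile(r"Bad value .+ for attribute \S+ on element \S+\."),
    re.compile(r"Element \S+ is missing one or more of the following attributes"),
    re.compile(r"The \S+ attribute on the \S+ element is obsolete\."),
]


def triage(messages):
    """Split messages into (set_aside, needs_review) lists."""
    set_aside, needs_review = [], []
    for message in messages:
        if any(pattern.search(message) for pattern in NOT_PARSING_FAILS):
            set_aside.append(message)
        else:
            needs_review.append(message)
    return set_aside, needs_review


example_messages = [
    "Error: Attribute event not allowed on element a at this point.",
    "Error: Element poot not allowed as child of element body in this context.",
    "Error: Bad value grunt for attribute type on element input.",
    "Error: Element meta is missing one or more of the following attributes: content, property.",
    "Error: The align attribute on the td element is obsolete.",
]
set_aside, needs_review = triage(example_messages)
```

Setting a message aside only means it is unlikely to be a parsing criterion fail. It may still be worth fixing for other reasons.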

 

About Steve Faulkner

Steve is the Technical Director at TPG. He joined The Paciello Group in 2006 and was previously a Senior Web Accessibility Consultant at Vision Australia. He is the creator and lead developer of the Web Accessibility Toolbar accessibility testing tool. Steve is a member of several groups, including the W3C Web Platforms Working Group and the W3C ARIA Working Group. He is an editor of several specifications at the W3C including HTML 5.1, ARIA in HTML, Notes on Using ARIA in HTML and HTML5: Techniques for providing useful text alternatives. He also develops and maintains HTML5accessibility.

Comments

  1. Agree – a pain.
    And the only one that is really detectable in modern web-apps where half of the app is added/modified with scripting is duplicate IDs. The browser tends to “correct” most of the rest of these errors anyway.

  2. Hi James,

    “And the only one that is really detectable in modern web-apps where half of the app is added/modified with scripting is duplicate IDs.”

    You can check the serialized DOM via a bookmarklet, which is helpful in catching errors that the browser does not correct.

  3. Which of the issues described in this article cannot be solved by setting up a proper tool chain for your web content? Which CMS systems are so broken that they can’t produce “well-formed” markup? (I know that “well-formed” is meaningless in non-XML languages; it just serves as a shortcut here.) Which JavaScript libraries are unable to produce content that meets these very basic criteria?

  4. Christophe,

    I’m having a little trouble understanding your questions. At TPG we very often encounter many of these examples Steve has discussed in this post. For instance, by volume the issue of duplicate IDs represents 3% of the issues that Tenon.io has found on Fortune 500 home pages. List items without proper parent elements are quite common as well. Whether these are the result of bad copy-paste, bad frameworks, or bad content management systems doesn’t really matter. What matters is that these are real issues that do occur with high frequency.

  5. Karl,

    I’m not denying that these issues occur very frequently. The point of my questions is which types of issues can simply be solved by fixing the tool chain (frameworks, JavaScript libraries, CMS, …)? Do these issues even have to exist?
    If the answer is no, then the problem is not in the WCAG success criterion (cf. the article’s title and intro) but in the tools.

  6. @Christophe

    perhaps you did not read the intro closely enough

    “because the checking of it throws up lots of potential errors that if required to fix, may result in a lot of extra work (in some cases busy work) for developers. This is largely due to the lack of robust tools for producing a set of specific issues that require fixing.”

  7. I’m wondering if the actual point of this blog post was unclear (looking at Christophe and David’s replies above). The too long; didn’t read here is: 1) not all validation errors have an impact on accessibility 2) there’s no clear list of which do and which don’t.

    To give a concrete example: a blind screen reader user will not suddenly be unable to read/use a page because of a single unescaped ampersand, or because of a well-formed, but invalid additional attribute on an element. These validation errors have no impact on the ability of a modern browser to correctly parse markup and generate a consistent DOM tree and accessibility tree.

    If you just naively run a page through the W3C validator and officially FAIL it under 4.1.1 if the validator reports anything other than complete validity, you’re doing it wrong.

  8. @Steve
    Sure, but that’s just one phrase in an article that focuses on SC 4.1.1 (with many links about validation, conformance, a bookmarklet, etc.) instead of issues in tools. As a consequence, I read the whole thing as criticism of the SC instead of those tools. So which is it?

  9. Christophe, it’s quite clearly (to me at least) a criticism of the SC – as I noted in my comment above, it’s not clearly listed/specified which validation errors are in fact failures of 4.1.1 – many devs naively seem to assume that ANY validation errors result in a FAIL, which is not true.

    Also, when Steve talks about “tools” in the article, he’s referring to validation/checking tools, not frameworks/CMSs/etc.

  10. Your original question about why frameworks/CMSs/etc can’t avoid errors that lead to failures of 4.1.1 in the first place is orthogonal to the point of this article, which is about how to test, and what to test, to determine if validation errors are in fact failures of 4.1.1.

  11. Some of the bad attribute value messages (especially the ARIA ones) are parsing fails. For example:

    <label id="theLabel">Label</label> <input type="text" id="foo" aria-labelledby="" />

    gives the message:

    Bad value for attribute aria-labelledby on element input:
    An IDREFS value must contain at least one non-whitespace character.

    I’d also include missing HTML comment end as a parsing error – causes even worse problems than missing end tags. Without a comment end the parser has to guess where the comment is supposed to end:

    <!-- a comment with a missing end
    Does the comment end on the next newline?
    <label>Or the next element start?</label>
    <!-- or the next well-formed comment end -->
    Or the end of the file</html>

  12. Patrick,

    If these errors that are discussed in the article are generated by authoring tools that can be fixed so they produce valid code, I don’t see how the authoring tools are “orthogonal”. It’s about the root cause of the issue, whereas checkers work at the level of the symptoms.

  13. Christophe … seriously: which part of “The WCAG 2.0 Parsing Criterion is a Pain In The Ass (PITA) because the checking of it throws up lots of potential errors” gives you the impression that we’re NOT talking about checkers? This is about AUDITING, not about remediation/fixing of the root causes.

  14. “Let’s discuss how to test against WCAG 2.0 SCs” “Why not just make web content accessible in the first place? Then we don’t need to worry about WCAG 2.0 SCs…”
