Why are headers not tagged for accessibility?

I am trying to convert a Microsoft Word 2010 document to a PDF file using Acrobat XI Pro, and have the following output problems.
1. Headers throughout the document are not being tagged within the PDF file. That is, when pressing "h" JAWS says there are no headers in the document. The source MS-Word file contains several Headers of type h1, h2, and h3.
2. It appears that words, in the PDF file, are being randomly concatenated. Visually on the screen there is a space between the words, but JAWS is reading them as a single word and the braille display shows no space between the words.

MSW2010 preferences PDFMaker Settings
< > View Adobe PDF result
<x> Prompt for Adobe PDF file name
<x> Convert Document Information
PDF/A Compliance: PDF/A-3b
< > Attach source file
<x> Create Bookmarks
<x> Add Links
<x> Enable Accessibility and Reflow with tagged Adobe PDF
<x> Enable advanced tagging


David Best


5 Answers

Good to know -- ISO 14289-1 in conjunction with 14.8 of ISO 32000-1 defines what an accessible PDF is. And, ya know, AT software does not. Have you walked the structure tree to evaluate what Acrobat's Full Check tells you, what PAC 2 tells you, and how things stack up against the Matterhorn Protocol?

Are the headings in the Word authoring file the built-in Word headings, or something from some style that's packaged by styling to appear to be a heading? When Word's built-in headings are used, both the Microsoft PDF creation/tagging and PDFMaker's PDF creation/tagging provide tags in the structure tree that properly role map to the appropriate PDF Heading element (in part 1 of 32000 and 14289 these are exclusively H1 through H6). Note that proper role mapping of all PDF elements is a requirement for accessible PDF (see the ISO Standards). When you are walking the PDF's structure tree it is possible to view each element's (tag's) role mapping. An element titled "H1", "h1", "Heading1", "Heading 1" or any sort of variation that is not role mapped to PDF's "H1" element will never be "seen" by AT as a heading of level one.

ISO 32000-1
This document is an ISO-approved copy of the ISO 32000-1 Standards document. By agreement with ISO, Adobe Systems Incorporated is allowed to offer this version of the ISO standard as a free PDF file on their own Web site.
It is not an official ISO document, but the technical content is identical; the page and section numbers are also preserved.

http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf

ISO 14289-1
Purchase from the ANSI web store.

http://webstore.ansi.org/RecordDetail.aspx?sku=ISO+14289-1%3A2012

Acrobat Accessibility Page (link to guidelines)

http://www.adobe.com/accessibility/products/acrobat.html

Matterhorn Protocol at the PDF Association's PDF/UA Competency Center.

http://www.pdfa.org/2014/06/matterhorn-protocol-1-02-now-available/

PDF/UA Reference Suite

http://www.pdfa.org/publication/pdfua-reference-suite/

Resources at AIIM.

http://www.aiim.org/Research-and-Publications/Standards/Committees/PDFUA

PDF Accessibility Checker

http://www.access-for-all.ch/en/pdf-lab/pdf-accessibility-checker-pac.html

axesPDF for Word (2007 and 2010) Beta

http://www.axespdf.com/

Be well...


David Austin   

David, thanks for your detailed response. Very helpful!
However, I am still unable to find a solution. I have two issues, but will deal with them one at a time.

Using the Adobe Acrobat XI trial with the JAWS 15 screen reader, I am not able to find any Heading tags within the PDF document. I have performed the following:

1. In Acrobat with the document opened, open "View > Show/Hide > Navigation > Content".
2. All Header text items are contained within "Container <h1>", "Container <h2>", and "Container <h3>" elements.
3. Close Container View and return focus to the PDF document text.
4. Using JAWS 15, press "h" navigation key to move focus to next Header. JAWS says "there are no Headings on this page", when in fact there should be <h1> and <h2> elements on the page.
5. Move Virtual Cursor focus to the Heading text and press JawsInsert+f. JAWS says "Font is bold", but does not identify it as a Heading element.
6. With focus on the PDF document, press JawsInsert+v to open Adobe Acrobat Settings. Type "head" in the search field. Verify that the "Headings Announce" parameter is set to "Headings and Level" option.

What else am I missing? Any assistance is appreciated.


David Best   

The Content panel has no bearing on a PDF's accessibility or interaction with AT. A PDF's accessibility is established by / defined by the structure tree, which lives within the Tags panel.

The structure tree (in the Tags panel) has an <h1>. What does it role map to? ISO 32000-1 identifies that PDF's Heading elements (tags) are explicitly: H1, H2, H3, H4, H5 and H6. Note the uppercase "H".

Your <h1>, <h2>, etc. are what's been used to denote the styling in the authoring file as a "heading". For authoring applications that have an appropriately robust tag management process during PDF creation, the built-in Heading / Paragraph tags must be used to obtain the required role mapping.

For MS Word, a "Heading 1" is role mapped to PDF's <H1> when using Acrobat's PDFMaker. Similarly, this occurs when using FrameMaker or InDesign. A "custom" build named "h1" will role map to an appropriate PDF element/tag other than one of the PDF heading elements. What this will be is a function of what Word style was used. So, something based on Normal will role map to the PDF element (tag) <P>. And, of course, <P> isn't a PDF heading element (tag). Consequently AT will (correctly) identify that there are no headings.
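
To make the role-mapping point concrete, here is a minimal sketch in Python. It is not Acrobat's or any AT's actual logic; the role_map dictionary, its entries, and the function names are made up for illustration. It only shows the idea that a tag name counts as a heading when following its role map entries ends at one of PDF's H1 through H6, and that the comparison is case sensitive.

```python
# Heading structure types named in this thread (ISO 32000-1): H1 through H6.
STANDARD_HEADINGS = {"H1", "H2", "H3", "H4", "H5", "H6"}


def resolve_role(tag, role_map):
    """Follow role map entries from a tag name until no further mapping exists."""
    seen = set()
    # The 'seen' set guards against a role map that loops back on itself.
    while tag in role_map and tag not in seen:
        seen.add(tag)
        tag = role_map[tag]
    return tag


def is_heading(tag, role_map):
    """True only if the tag ends up at one of PDF's own heading elements."""
    return resolve_role(tag, role_map) in STANDARD_HEADINGS


# Hypothetical role map: a built-in Word heading versus a custom "h1"
# style that was based on Normal and therefore ends up as a paragraph.
role_map = {"Heading 1": "H1", "h1": "P"}

print(is_heading("Heading 1", role_map))  # built-in style maps to H1
print(is_heading("h1", role_map))         # custom style maps to P
print(is_heading("h1", {}))               # lowercase name with no mapping at all
```

With these assumed entries, only "Heading 1" is reported as a heading. A lowercase "h1" fails whether it maps to P or has no mapping at all.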

When the deliverable is to be an accessible PDF, it is essential to understand what the role mapping of the authoring application's paragraph tags / styles to PDF elements will be.

For headings, use what is built-in. If something else is used then you must establish the correct role mapping manually using Acrobat Pro. Each individual occurrence of a "<h1>" would be manually role mapped to PDF's <H1>. Same for each "<h2>", "<h3>", etc.

Best practice is to "get it done" in the authoring environment. For best-of-class how-to when using MS Word, use Karen McCall's books. A fallback could be what is out there in Microsoft's webspace (much of it provided by Karen).

For your file(s) you could remediate back in the authoring files or remediate in the PDF file(s) manually.

Be well...


David Austin   

Well, the conversion is more complex than I expected. If Microsoft text is identified as Headers by using the Quick Styles (Alt+h,l), and I can navigate the Headers using JAWS (JawsInsert+z,h), then I would expect these elements to automatically be mapped to PDF Header elements when selecting the Acrobat PDFMaker accessibility preferences (Alt+b,s) for PDF/A-3b. I will have to investigate further. Note, the Acrobat XI Pro Toolbars appear not to be very screen reader friendly. I cannot view the Tags tree (Alt+v,s,n,g), and Acrobat seems to freeze and I lose JAWS speech. When activating Accessibility (Alt+t,a) JAWS simply says "This file claims compliance with the PDF/A standard and has been opened read-only to prevent modification.", and I have no keyboard control.

Anyway, possibly you can help with my second issue. When reading the PDF document, JAWS is concatenating some words. Although visually there is a space on the screen, JAWS ignores the space. When viewing Content Container <p> (Alt+v,s,n,n) elements, it appears that a space on a separate line by itself is ignored by JAWS.

Example one Container <p> content:
"Th
is

is
te
st

Te
xt
."
Is read by JAWS as "Thisis testtext.", as only the space appearing on a line with text is recognized.

Example two Container <p> content:
"Th
is
is
te
st
Te
xt
."
Is read by JAWS as "This is test text.", as no space appears on a line without text.
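
One way to picture the pattern in the two examples is the following minimal Python sketch. It models only my observation, not how JAWS or Acrobat actually process the content. The lists of runs, the function names, and in particular where the trailing spaces sit inside the runs are assumptions chosen to match what was read aloud.

```python
def read_runs(runs):
    """Join text runs, dropping any run that consists only of whitespace."""
    return "".join(run for run in runs if run.strip())


def isolated_space_positions(runs):
    """Indices of whitespace-only runs: candidate spots for joined words."""
    return [i for i, run in enumerate(runs) if run and not run.strip()]


# Example one: two spaces sit on lines of their own (assumed layout).
example_one = ["Th", "is", " ", "is ", "te", "st", " ", "Te", "xt", "."]

# Example two: every space shares a line with text (assumed layout).
example_two = ["Th", "is ", "is ", "te", "st ", "Te", "xt", "."]

print(read_runs(example_one))                 # Thisis testText.
print(read_runs(example_two))                 # This is test Text.
print(isolated_space_positions(example_one))  # [2, 6]
```

Under these assumptions, the words run together exactly where a space has a line to itself, which matches the behaviour in both examples.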

This is the only thing I can see in the PDF document that correlates the JAWS concatenated words with the position of spaces. Any suggestions on how to fix this would be greatly appreciated. Thanks!


David Best   

I take your comment to mean that my MSW/PDF experience is not normal or expected conversion behaviour. In which case, I agree that it is time to try another AT. Thanks!


David Best   

