Web Authoring Strategies for Voice Browsers

By Kynn Bartlett
Vice President of Marketing and Outreach,
HTML Writers Guild

Revision 980922

Document Status: This is a position paper published by the HTML Writers Guild for the W3C's Workshop on Voice Browsers, Tuesday 13th October 1998, Cambridge, Massachusetts.

1. Introduction

The World Wide Web has been incorrectly described by some as a "visual medium", and to the casual observer this would appear to be true. However, the Web is far more than simply a way to display images in a visually appealing manner -- it's more properly described as an "information medium."

Web authors often lock themselves into the strictly visual metaphor when designing for the web; by doing so, they not only limit their audience but also the type of information that they can provide. By viewing the World Wide Web as only what's displayed on a screen, they fail to take advantage of the full potential of this powerful information medium.

Voice browsers offer the potential to expand the reach of the Web beyond the desktop or laptop and offer information in ways that most content authors have never imagined: telephone access to web pages; browsers for the visually impaired; hands-free web surfing while driving a car; reading and language instruction for children and adult learners; and intelligent alarm clocks that parse the day's news and present summaries upon verbal request.

The Web will reach this state soon -- it's not a matter of if, but when. The web author's key to unlocking this potential is the concept of Universally Accessible Design -- creating pages which not only "look good" in today's browsers, but which are usable by both yesterday's simpler user agents and the diverse network access devices that will characterize the Web of the 21st century.

The techniques used in Universally Accessible Design provide many other benefits in web design, including cross-browser compatibility, accessibility for the disabled, support for legacy machines and software, intelligent parsing of content by automated user agents, access via mobile computing devices, and support for other specialized web browsers built for specific needs.

A properly constructed web document is an accessible web document.

The HTML Writers Guild is committed to developing, distributing, and teaching principles of Universally Accessible Design to our members and the web authoring community. The following strategies recommend specific ways in which these principles can be applied to designing pages that are usable by voice browsers.

2. Design Strategies for Voice Browsers

Authoring a web page for any specific type of user agent or system configuration should never be a completely separate subject with arcane new techniques developed for each special need, but rather an application of the common set of Universally Accessible Design principles that should be part of every web author's repertoire.

With few exceptions, pages should never be designed "for" certain types (or brands) of browsers, but should instead be designed for all uses (and potential uses) of the information. All web documents should be as accessible to voice browsers as they are to visual user agents.

The Guild's studies, and discussions with web authors, have shown that the primary obstacle to universal accessibility is ignorance. There are few cases where a conscious decision has been made to produce a generally inaccessible web page; rather, the author is simply unaware of the need to create accessible pages and the techniques by which that is done. Once enlightened, most web authors eagerly embrace the concept of universal accessibility, since the benefits are many and obvious.

Therefore, this paper will briefly list some of the primary techniques of Universally Accessible Design as they relate to voice browsers, and offer ideas as to how authors can implement these considerations when designing for the web.

a. Aural Style Sheets

Aural Style Sheets are part of the Cascading Style Sheets, Level 2 [CSS2] specification, and provide for a level of control in "spoken" text roughly analogous to that for displayed/printed text. The use of an aural style sheet (or aural style sheet properties included in a general style sheet document) allows the author to specify characteristics of the spoken text such as volume, pitch, speed, and stress; indicate pauses and insert audio "icons" (sound files); and show how certain phrases, acronyms, punctuation, and numbers should be voiced.

Combined with the @media rule for media types, a well-crafted aural style sheet can greatly increase the accessibility of a web document in a voice browser. Further investigation is encouraged, especially the development of example aural style sheets and suggested authoring techniques.
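
As a rough illustration of the properties described above, the following sketch embeds a small aural style sheet in an HTML document using an @media block for the CSS2 "aural" media type. The property names and values come from the CSS2 aural style sheet properties; the page content, the class name "decoration", and the sound file chime.au are hypothetical placeholders, and actual support will vary from one voice browser to another.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
    "http://www.w3.org/TR/REC-html40/strict.dtd">
<html lang="en">
<head>
<title>Guild News</title>
<style type="text/css">
/* Rules inside this block apply only when the page is rendered as speech. */
@media aural {
  /* Headings: louder and slower, after a pause and an audio "icon". */
  h1, h2 {
    voice-family: female;
    volume: loud;
    speech-rate: slow;
    pause-before: 1s;
    cue-before: url("chime.au"); /* hypothetical sound file */
  }
  /* Emphasis is carried by pitch and stress instead of italics. */
  em { pitch: high; stress: 80; }
  /* Acronyms are spelled out letter by letter. */
  acronym { speak: spell-out; }
  /* Code samples: punctuation is spoken and numbers are read digit by digit. */
  code { speak-punctuation: code; speak-numeral: digits; }
  /* Purely decorative content is not spoken at all. */
  .decoration { speak: none; }
}
</style>
</head>
<body>
<h1>Guild News</h1>
<p>The <acronym title="HyperText Markup Language">HTML</acronym> 4.0
specification is <em>now available</em>.</p>
<p class="decoration">* * *</p>
</body>
</html>
```

Because the rules sit inside the @media block, a visual browser that follows CSS2 should ignore them, so the same document can carry both visual and aural presentation hints.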

b. Rich Meta-Content

HTML 4.0 gives the author the ability to embed a great deal of meta-content into a document, specifying information which expands on the semantic meaning of the content and allows for specialized rendering by the user agent.

In other words, by using features found in HTML 4.0 (and to a limited extent, in other versions of HTML), an author can give better information to the browser, which can then make the document easier to use.

As an example, the LONGDESC attribute can be used to provide a link to extended information about a visual image. The TITLE attribute should be used extensively to indicate logical divisions in the document. The ACRONYM and ABBR elements can be used to identify text that has further meaning beyond the simple letters of the word. LINK elements can form logical connections between pages or groups of pages.

Judicious and ample use of meta-content within a document allows the author to not simply specify the content, but also suggest the meaning and relationship of that content in the context of the document. Intelligent user agents -- such as voice browsers -- can then use that meta-information as appropriate for their presentation and structural needs.
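
A minimal sketch of how the attributes and elements named above might appear together in one HTML 4.0 page. All file names, titles, and text are hypothetical; only the element and attribute names (LINK, TITLE, ACRONYM, ABBR, LONGDESC) are taken from the discussion above.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
    "http://www.w3.org/TR/REC-html40/strict.dtd">
<html lang="en">
<head>
<title>Chapter 2: Regional Sales</title>
<!-- LINK elements describe how this page relates to its neighbours. -->
<link rel="contents" href="toc.html" title="Table of contents">
<link rel="prev" href="chapter1.html" title="Chapter 1: Overview">
<link rel="next" href="chapter3.html" title="Chapter 3: Forecast">
</head>
<body>
<h1>Chapter 2: Regional Sales</h1>

<!-- TITLE labels a logical division of the document. -->
<div title="Summary of regional sales">
  <p>Figures were compiled by Example Widgets
  <abbr title="Incorporated">Inc.</abbr> following the
  <acronym title="World Wide Web Consortium">W3C</acronym>
  recommendations for accessible documents.</p>
</div>

<!-- ALT gives a short replacement; LONGDESC links to a full description. -->
<p><img src="sales-chart.gif"
        alt="Bar chart of sales by region"
        longdesc="sales-chart-description.html"></p>
</body>
</html>
```

A voice browser could, for instance, read the TITLE values as section names, expand the abbreviation and acronym from their TITLE text, and offer the LONGDESC page in place of the chart.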

c. Planned Abstraction

One use for meta-content information is the development of pages which are designed to be abstracted. A typical web document can be quite lengthy; finding information by listening to a web page read aloud takes longer than visually scanning it, especially when most web pages are designed for visual use.

Thus, most voice browsers will provide a method for abstracting a page: presenting one or more outlines of the page's content based on a semantic interpretation of the document.

Examples of potential or valid abstraction techniques include:

  • Listing all the links and link text on a page
  • Forming a structure based on the H1, H2, ... H6 headers
  • Summarizing table data
  • Scanning for TITLE attributes in elements and presenting a list of options for expansion
  • Vocalizing any "bold" or emphasized text
  • Digesting the entire document into a summary based on keywords as some search engines provide

Voice browser programmers have any number of other options for providing short, easily digestible versions of web content to the user. The web author should therefore provide as much meta-content as possible and use HTML elements carefully, for their intended purposes. Specific techniques include:

  • Useful choices for link text (e.g., "the report is available" instead of "click here")
  • Appropriate use of heading tags to define document structure, not simply for size/formatting
  • Use of the SUMMARY attribute for tables
  • Intelligent use of TITLE, including TITLEs on some elements that may not be considered, such as HR or P
  • Use of STRONG and EM where appropriate, providing benefits for both vocal and visual "scannability"
  • Use of META elements with KEYWORDS and SUMMARY content
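
The techniques listed above can be sketched in a single short page. The content, file names, and table figures below are placeholders invented for illustration. The META name "description" is used here as the commonly recognized way to supply summary text, which is an assumption about what a given user agent or search engine will read.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
    "http://www.w3.org/TR/REC-html40/strict.dtd">
<html lang="en">
<head>
<title>Membership Report</title>
<!-- Keywords and a short summary for automated digests. -->
<meta name="keywords" content="membership, report, regions">
<meta name="description"
      content="Summary of membership totals, with a breakdown by region.">
</head>
<body>
<!-- Headings express structure, not just text size. -->
<h1>Membership Report</h1>

<!-- Link text makes sense when read out of context. -->
<p title="Key finding">Membership <strong>grew</strong> this year.
<a href="report.html">The full membership report</a> is available.</p>

<!-- TITLE on an element that is often left unlabelled. -->
<hr title="End of summary; detailed figures follow">

<h2>Membership by Region</h2>
<!-- SUMMARY describes the table for users who cannot scan it visually. -->
<table summary="Member counts for two regions, one row per region.">
  <tr><th scope="col">Region</th><th scope="col">Members</th></tr>
  <tr><td>Region A (placeholder)</td><td>100</td></tr>
  <tr><td>Region B (placeholder)</td><td>200</td></tr>
</table>
</body>
</html>
```

From markup like this, a voice browser could build an outline from the H1 and H2 headings, read the link list without hearing "click here", and announce the table summary before the cell data.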

d. Alternative Content for Unsupported Media Types

The "poster child" for web accessibility is the ALT attribute, which allows alternative text to be specified for images; if a user agent cannot display the visual image, the ALT text can be used instead.

Widespread use of the ALT attribute by all sites on the Internet would likely double the accessibility of the World Wide Web with one simple change. Web authors who do not correctly use ALT text are seriously damaging the usability of the entire medium!

For voice browsers, ALT text is vitally important since images cannot be represented aurally at all. Especially when an image is used as part of a link, alternative content must be provided so that the voice browser can accurately render the page in a manner useful to the user.

In addition to the ALT attribute for IMG elements, HTML 4.0 provides a number of other ways of specifying alternative content that a browser can use when it encounters an unsupported media type. Some of those include:

  • ALT attributes for image map AREAs, APPLETs, and image INPUT buttons
  • Text captions and transcripts for multimedia (video and audio)
  • NOSCRIPT elements when including scripting languages, as voice browsers may be unable to process JavaScript instructions
  • NOFRAMES elements when using framesets, as frames are a very visually-oriented method of document display
  • Use of nested OBJECT elements to include a wide variety of alternative content for many media types
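
Several of the mechanisms listed above are sketched together below. This is an illustrative page, not a complete recipe: every file name, URL, and piece of text is hypothetical, and NOFRAMES is omitted because it belongs in a separate frameset document.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
    "http://www.w3.org/TR/REC-html40/strict.dtd">
<html lang="en">
<head>
<title>Alternative content examples</title>
</head>
<body>

<!-- Image used as a link: the ALT text serves as the link text. -->
<p><a href="report.html"><img src="report-button.gif"
   alt="Read the annual report"></a></p>

<!-- Image map: each AREA carries its own ALT text. -->
<p><img src="navbar.gif" alt="Site navigation" usemap="#navbar">
<map name="navbar">
  <area shape="rect" coords="0,0,99,39" href="news.html" alt="News">
  <area shape="rect" coords="100,0,199,39" href="members.html" alt="Members">
</map></p>

<!-- Image submit button with ALT text. -->
<form action="/search" method="get">
  <p><label for="q">Search terms</label>
  <input type="text" name="q" id="q">
  <input type="image" src="go.gif" alt="Search"></p>
</form>

<!-- Script output with a NOSCRIPT fallback for agents without scripting. -->
<script type="text/javascript">
  document.write("<p>Today is " + new Date().toDateString() + ".<\/p>");
</script>
<noscript>
  <p>See the <a href="calendar.html">calendar page</a> for today's date.</p>
</noscript>

<!-- Nested OBJECTs: try the video, then a still image, then plain text. -->
<div>
  <object data="tour.mpg" type="video/mpeg">
    <object data="tour.gif" type="image/gif">
      <p>A short tour of the Guild offices. A
      <a href="tour-transcript.html">text transcript</a> is available.</p>
    </object>
  </object>
</div>

</body>
</html>
```

In each case the fallback is ordinary text or markup, so a voice browser that cannot render the image, script, or video still has something meaningful to speak.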

3. Further References

HTML 4.0
CSS 2 [includes Aural Style Sheets]
WAI Guidelines: Page Authoring
W3C: Mobile Access
W3C Note: Voice Browsers
Productivity Works [creators of pwWebSpeak browser]

4. Summary

Authoring pages for use by voice browsers is not difficult -- as long as care is taken to design for universal accessibility. Increased awareness of the need for Universally Accessible Design among the web authoring community is critical to the success of the web's expansion beyond desktop/laptop computer units.

This page is maintained by site contacts. Last updated on 22 September 2002.
Copyright © 2002 by the International Webmasters Association/HTML Writers Guild.