Web Authoring Strategies for Voice Browsers
By Kynn Bartlett <site contacts>
Vice President of Marketing and Outreach,
HTML Writers Guild
This is a position paper published by the
HTML Writers Guild for the
Workshop on Voice
Browsers, Tuesday 13th October 1998, Cambridge, Massachusetts.
The World Wide Web has been incorrectly described by some as a "visual
medium", and to the casual observer this would appear to be true.
However, the Web is far more than simply a way to display images in
a visually appealing manner -- it's more properly described as an
Web authors often lock themselves into the strictly visual metaphor
when designing for the web; by doing so, they not only limit their
audience but also the type of information that they can provide. By
viewing the World Wide Web as only what's displayed on a screen,
they fail to take advantage of the full potential of this powerful
medium.
Voice browsers offer the potential to expand the reach of the Web
beyond the desktop or laptop and offer information in ways that most
content authors have never imagined. Possibilities include telephone
access to web pages; browsers for the visually impaired; hands-free
web surfing while driving a car; reading and language instruction for
children and adult learners; and intelligent alarm clocks that parse
the day's news and present summaries upon verbal request.
The Web will reach this state, soon -- it's not a matter of if, but
when. The web author's key to unlocking this potential is the
concept of Universally Accessible Design -- creating pages which
not only "look good" in today's browsers, but which are usable by
both yesterday's simpler user agents and the diverse network
access devices that will characterise the Web of the 21st century.
The techniques used in Universally Accessible Design provide many
other benefits in web design, including cross-browser compatibility,
accessibility for the disabled, support for legacy machines and
software, intelligent parsing of content by automated user agents,
access via mobile computing devices, and other specialized web
browsers for specific needs.
A properly constructed web document is an accessible web document.
The HTML Writers Guild is committed to developing, distributing,
and teaching principles of Universally Accessible Design to our
members and the web authoring community. The following strategies
recommend specific ways in which these principles can be applied
to designing pages that are usable by voice browsers.
2. Design Strategies for Voice Browsers
Authoring a web page for any specific type of user agent or system
configuration should never be a completely separate subject with
arcane new techniques developed for each special need, but rather
an application of the common set of Universally Accessible Design
principles that should be part of every web author's repertoire.
With few exceptions, pages should never be designed "for" certain
types (or brands) of browsers, but should instead be designed for
all uses (and potential uses) of the information. All web
documents should be as accessible to voice browsers as they are to
visual user agents.
The Guild's studies, and discussions with web authors, have shown
that the primary obstacle to universal accessibility is ignorance.
There are few cases where a conscious decision has been made to
produce a generally inaccessible web page; rather, the author is
simply unaware of the need to create accessible pages and the
techniques by which that is done. Once enlightened, most web authors
eagerly embrace the concept of universal accessibility, since the
benefits are many and obvious.
Therefore, this paper will briefly list some of the primary techniques
of Universally Accessible Design as they relate to voice browsers,
and offer ideas as to how authors can implement these considerations
when designing for the web.
a. Aural Style Sheets
Aural Style Sheets are part of the Cascading Style Sheets, Level
2 [CSS2] specification, and provide for a level of control in
"spoken" text roughly analogous to that for displayed/printed
text. The use of an aural style sheet (or aural style sheet
properties included in a general style sheet document) allows the
author to specify characteristics of the spoken text such as volume,
pitch, speed, and stress; indicate pauses and insert audio
"icons" (sound files); and show how certain phrases, acronyms,
punctuation, and numbers should be voiced.
Combined with the @media rule for media types, a well-crafted
aural style sheet can greatly increase the accessibility of a web
document in a voice browser. Further investigation is encouraged,
especially into example aural style sheets and suggested authoring
techniques.
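As a minimal sketch of the idea, an aural style sheet might be
embedded in a page as shown below. The property names and the "aural"
media type are taken from CSS2; the selectors, the values chosen, the
class name "phone", and the sound file "chime.au" are illustrative
assumptions, and actual support will vary from one voice browser to
another.

```html
<style type="text/css">
@media aural {
  /* Headings: a distinct voice, a pause, and an audio "icon" */
  h1, h2 {
    voice-family: male;
    pitch: low;
    pause-before: 1s;
    cue-before: url("chime.au");  /* hypothetical sound file */
  }

  /* Emphasis carried by stress and volume rather than by typeface */
  em     { stress: 70; }
  strong { stress: 80; volume: loud; }

  /* Spell acronyms letter by letter instead of pronouncing them */
  acronym { speak: spell-out; }

  /* Read telephone numbers as individual digits ("phone" is a
     hypothetical class name) */
  .phone { speak-numeral: digits; }

  /* Read punctuation aloud in code samples */
  pre { speak-punctuation: code; speech-rate: slow; }
}
</style>
```

Because the rules sit inside an @media block, visual browsers should
ignore them, so the same document can serve both kinds of user agent.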
b. Rich Meta-Content
HTML 4.0 gives the author the ability to embed a great deal of
meta-content into a document, specifying information which expands
on the semantic meaning of the content and allows for specialized
rendering by the user agent.
In other words, by using features found in HTML 4.0 (and to a limited
extent, in other versions of HTML), an author can give better
information to the browser, which can then make the document easier
As an example, the LONGDESC attribute can be used to provide
a link to extended information about a visual image. The TITLE
attribute should be used extensively to indicate logical divisions
in the document. The ACRONYM and ABBR elements can be used to
identify text that has further meaning beyond the simple letters
of the word. LINK elements can form logical connections between
pages or groups of pages.
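The fragment below sketches how these features might appear together
in a single HTML 4.0 document. All file names, titles, and text are
hypothetical placeholders chosen for illustration; only the element
and attribute names come from the HTML 4.0 specification.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html lang="en">
<head>
<title>Guild Membership Report</title>
<!-- LINK elements describe how this page relates to others -->
<link rel="contents" href="index.html" title="Table of contents">
<link rel="next" href="part2.html" title="Part 2: Survey results">
</head>
<body>
<!-- TITLE names a logical division of the document -->
<div title="Membership overview">
  <h1>Membership Overview</h1>
  <p>The
  <acronym title="World Wide Web Consortium">W3C</acronym>
  workshop was attended by several Guild members, including our
  <abbr title="Vice President">VP</abbr> of Marketing.</p>

  <!-- ALT gives a short replacement; LONGDESC links to a full
       description of the chart on a separate page -->
  <p><img src="growth.gif"
          alt="Chart of membership growth"
          longdesc="growth-description.html"></p>
</div>
</body>
</html>
```

A voice browser that understands these attributes could, for example,
announce "Membership overview" on entering the division, expand "W3C"
on request, and offer the long description in place of the chart.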
Judicious and ample use of meta-content within a document allows
the author to not simply specify the content, but also suggest the
meaning and relationship of that content in the context of the
document. Intelligent user agents -- such as voice browsers -- can
then use that meta-information as appropriate for their presentation
and structural needs.
c. Planned Abstraction
One use for meta-content information is the development of pages
which are designed to be abstracted. A typical web document can
be quite lengthy, and finding information by listening to a web
page read aloud takes longer than visually
scanning a page, especially when most web pages are designed for
Thus, most voice browsers will provide a method for abstracting
a page: presenting one or more outlines of the page's content based
on a semantic interpretation of the document.
Examples of potential or valid abstraction techniques include:
- Listing all the links and link text on a page
- Forming a structure based on the H1, H2, ... H6 headers
- Summarizing table data
- Scanning for TITLE attributes in elements and presenting a list
of options for expansion
- Vocalizing any "bold" or emphasized text
- Digesting the entire document into a summary based on keywords
as some search engines provide
There are any number of other options available for voice browser
programmers to use to provide short, easily-digestible versions of
web content to the user. This suggests that the web author should
provide as much meta-content as possible and take care to use HTML
elements in their proper manner. Specific recommendations include:
- Useful choices for link text (e.g., "the report is available"
instead of "click here")
- Appropriate use of heading tags to define document structure,
not simply for size/formatting
- Use of the SUMMARY attribute for tables
- Intelligent use of TITLE, including TITLEs on some elements
that may not be considered, such as HR or P
- Use of STRONG and EM where appropriate, providing benefits for
both vocal and visual "scannability"
- Use of META elements with KEYWORDS and SUMMARY content
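The recommendations above can be sketched in one short page. The
content, file names, and figures below are invented placeholders, not
real data. One assumption should be noted: HTML 4.0 does not define a
fixed set of META names, so the sketch uses "description", the name
conventionally read by search engines, to carry the summary content.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html lang="en">
<head>
<title>Example Widget Report</title>
<meta name="keywords" content="widgets, sales, example report">
<meta name="description"
      content="Placeholder summary: widget sales by region.">
</head>
<body>
<!-- Headings mark structure, not font size -->
<h1>Example Widget Report</h1>

<p title="Overview">Sales were <strong>higher than expected</strong>
in <em>every</em> region. The
<a href="full-report.html">full report is available</a>.</p>

<!-- TITLE on HR tells a listener what the break signifies -->
<hr title="End of overview">

<h2>Sales by Region</h2>
<table summary="Two columns: region name and widget sales.
                One row for each of two regions.">
  <caption>Widget sales by region (placeholder figures)</caption>
  <tr><th>Region</th><th>Widgets sold</th></tr>
  <tr><td>North</td><td>100</td></tr>
  <tr><td>South</td><td>200</td></tr>
</table>
</body>
</html>
```

From this markup a voice browser could build an outline from the H1
and H2 headings, read the table summary before the cell data, and
list "full report is available" as a link that makes sense out of
context.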
d. Alternative Content for Unsupported Media Types
The "poster child" for web accessibility is the ALT attribute,
which allows alternative text to be specified for images; if a user
agent cannot display the visual image, the ALT text can be used
instead.
Widespread use of the ALT attribute by all sites on the Internet
would likely double the accessibility of the World Wide Web with
one simple change. Web authors who do not correctly use ALT text
are seriously damaging the usability of the entire medium!
For voice browsers, ALT text is vitally important since images
cannot be represented aurally at all. Especially when used as
part of a link, alternative content must be provided so that the
voice browser can accurately render the page in a manner useful
to the user.
In addition to the ALT attribute for IMG elements, HTML 4.0 provides a number
of other ways for specifying alternative content that can be
used by a browser if an unsupported media type is provided. Some
of those include:
- ALT attributes for image map AREAs, APPLETs, and image INPUT
- Text captions and transcripts for multimedia (video and audio)
- NOSCRIPT elements when including scripting languages, as voice
- NOFRAMES elements when using framesets, as frames are a very
visually-oriented method of document display
- Use of nested OBJECT elements to include a wide variety of
alternative content for many media types
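The body fragment below sketches several of these techniques side by
side. File names, coordinates, script content, and link targets are
hypothetical and chosen only for illustration; the elements and
attributes themselves are those defined in HTML 4.0.

```html
<!-- Image used as a link: ALT describes the link's destination -->
<p><a href="home.html"><img src="home.gif" alt="Home page"></a></p>

<!-- Client-side image map: each AREA carries its own ALT text -->
<p><img src="nav.gif" alt="Site navigation" usemap="#nav"></p>
<map name="nav">
  <area shape="rect" coords="0,0,80,30"
        href="news.html" alt="News">
  <area shape="rect" coords="80,0,160,30"
        href="contact.html" alt="Contact us">
</map>

<!-- Image submit button: ALT names the action -->
<form action="search.cgi" method="get">
  <p><input type="text" name="q" title="Search terms">
     <input type="image" src="go.gif" alt="Search"></p>
</form>

<!-- Scripted content with a static fallback -->
<script type="text/javascript">
  document.write("Last updated: " + document.lastModified);
</script>
<noscript>
  <p>See the <a href="changes.html">change log</a> for the date of
  the latest update.</p>
</noscript>

<!-- Nested OBJECTs: the user agent renders the outermost type it
     supports and otherwise falls through to the inner content -->
<object data="tour.mpg" type="video/mpeg">
  <object data="tour.gif" type="image/gif">
    <p>A <a href="tour.html">text transcript of the tour</a> is
    available.</p>
  </object>
</object>
```

In each case a voice browser that cannot render the visual or
scripted content still has text it can speak, and no link or form
control is left unlabeled.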
3. Further References
- HTML 4.0
- CSS 2 [includes Aural Style Sheets]
- WAI Guidelines: Page Authoring
- W3C: Mobile Access
- W3C Note: Voice Browsers
- Productivity Works [creators of pwWebSpeak browser]
Authoring pages for use by voice browsers is not difficult -- as long as
care is taken to design for universal accessibility. Increased awareness
of the need for Universally Accessible Design among the web authoring
community is critical to the success of the web's expansion beyond
desktop/laptop computer units.