Reprinted from Webmonkey
by Steve Champeon, Chief Technical Officer
Back in March of 2003, Nick Finck and I stunned the Web design world at the South by Southwest Interactive conference in Austin. How? Well, despite a late night spent chowing down fish tacos and swilling Shiner bock, we actually managed to show up early Sunday morning to deliver our presentation.
OK, so maybe our audience was stunned by more than just the early hour. Maybe they were actually impressed by the presentation, which offered a comprehensive approach to Web document design that both embraced the future and drew heavily upon the past. It also restored markup to its rightful place, showed how presentation could be requested only by those browsers that wanted it, and demonstrated how much money and time could be saved by expanding your audience and embracing the medium.
And now you don't even have to get up early to enjoy the magic - it's all in the pages that follow. (The Shiner bock is up to you, though.)
In my last article, I defined some terms you may find useful and relevant for this discussion. So, if you didn't read it then, here's your chance. We'll be here when you get back. OK? OK.
Back in the late eighties and early nineties, we faced a very different situation when publishing electronic documents. SGML was complex, expensive, and lacked a comprehensive presentation solution. As a result, it was only used by governments, big business, lawyers, and industry and trade organizations. These organizations considered SGML a long-term solution for document storage, not a final output format. The result was printed as a book, turned into an electronic technical manual, or imported into a database.
Each SGML document type was crafted to meet the needs of a given industry or application (such as marking up legal documents, or books, or aircraft maintenance manuals). Like Adobe Illustrator or Photoshop's native file formats, which are often turned into JPEGs or GIFs for publication online, or EPS for insertion into print documents, these SGML document types were just intermediate storage formats, but extremely powerful and customized.
Remember, SGML is primarily just a set of rules that help people create markup languages, not a markup language on its own. It is often called a "metalanguage" to distinguish it from the markup languages it is used to create.
The introduction of the Web, with its one simple markup language, and browsers with hardcoded display defaults, eliminated two of the biggest problems with SGML in one fell swoop.
You see, browsers didn't have to deal with all of the possibilities inherent in SGML - they just had to deal with HTML. Documents in other formats could be (and were expected to be) transformed into HTML. By introducing hardcoded presentation defaults, HTML let Web authors focus on the content, rather than on the presentation of that content, leaving that up to the end user (or whoever wrote the end user's browser).
Sadly, the next six to eight years saw a slew of corruptions to HTML, in order to force HTML to become more of a presentation language, to serve the needs of designers, not the preferences of the end user. But to be fair, the lack of a fast and widely implemented presentation solution made extending HTML an obvious choice. The other option was to develop a suitable presentation language. Those who were part of the early days remember the mad rush, and those who remember SGML may actually prefer the rush to the glacial pace of SGML development. But as we'll see, both paths were followed.
Fast forward to 2003. We've had eight to ten years of extensions to HTML. We have XML, a new, more robust and simplified version of SGML. We have nearly seven years of Cascading Style Sheets support in the browsers, with excellent support in most, if not all, modern graphical desktop browsers.
In other words, we have a great foundation on which to build a new strategy for Web design - we call it progressive enhancement.
Before we dive into the basics behind progressive enhancement, though, let's talk about so-called "graceful degradation".
In principle, graceful degradation is a great idea - it assumes that we care to reach everyone in our audience. It also assumes that any information we provide is to be made accessible, and that we use redundant forms to deliver any content that relies on more advanced technologies. If those technologies aren't supported, or are disabled by the user for speed or security reasons, they still get the whole content enchilada.
In practice, though, "graceful degradation" has come to mean "it looks okay in the previous version of Internet Explorer for Windows", or, at best, "I can view the site in Lynx, and the alt attributes on my images don't get in the way of the main content". Rather than tailoring a site in successive stages, adding richer content to a sparse, but already information-rich skeleton, most just build it for a target browser and hope for the best in the others.
Most Web designers simply build sites quickly, using all the tricks they learned during the terrible browser wars. Tables for layout, font tags, hundreds of embedded images, single pixel GIFs, and so on - all get used, or abused, and whoever maintains the site usually makes it worse, unless they are highly disciplined.
All of this extra presentational cruft just adds to the weight of a Web document, without adding to the message for anyone but users of modern graphical desktop browsers. Most cell phones, PDAs, and other mobile devices simply can't display such pages, because the markup itself is too big. Such devices don't have the memory, or the bandwidth, to show such huge pages.
So, in the real and ever expanding world of non-desktop Web browsers, sites supposedly built to degrade gracefully simply fail. And make no mistake, the old approach to Web design - where we assumed that all browsers would get better, faster, and more powerful; where we assumed that everyone would have fat pipes; where we assumed that the only drag on innovation was the existence of legacy browsers - has been shown to be a myth, like the jet cars from 1950s technology fairs.
Put another way, the assumptions - that browsers will always get more powerful, available bandwidth will always increase, and degradation is all about older, less capable browsers - are false.
So, then, how do we work around these limitations of new devices?
On the next few pages, we will show you how to build documents that should work in any browser that can read simple HTML, and how to tie in styles to the semantics and functions of your markup. We'll show you how to target some styles to certain browsers and avoid styling content altogether in other browsers. We'll talk about how to achieve cost savings by stripping your markup down to the bare minimum without sacrificing the richness of its display, in the process making your sites more accessible.
It may be easier to start from scratch after a review of your content, in terms of both the specific and generic. Have a sense of what is commonly found in your documents, and what its meaning or purpose is. And it goes without saying (but I'll say it anyway) that you should always work on a copy, or better yet, a version-controlled file, of your document.
Be sure to keep track of the size of your document before and after you've removed all of this unnecessary cruft. You'd be surprised at how much bandwidth you'll save. The recent redesign of Wired News, for example, saved almost eighty percent in file size and bandwidth per page, on average. Don't forget - in most browsers, any externally referenced files, such as scripts and styles, are cached, so they only need to be downloaded once. If you're doing things right, you can apply most of these stylesheets and scripts across your entire site, and spread the savings across the whole site.
The first thing we do is remove all presentational markup from our document. This includes not only the obvious stuff, like font tags and tables, but also the markup whose presentation can't be defined by stylesheets, and the style attributes of any other markup. The result is a stark, unstyled framework that is essentially just your content and basic navigation. Don't worry if the document shows some style if you view it in a browser. Remember, browsers contain hardwired presentation defaults. Just make sure that your markup is appropriate for its content (i.e., don't use <blockquote> or <ul> just to get a cheap indent, or <h5> for really small text).
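As a rough illustration of that cleanup, here is a before-and-after sketch. The heading text and all attribute values are invented for the example; your own cruft will look different.

```html
<!-- Before: presentation is baked into the markup (hypothetical example) -->
<table width="100%" cellpadding="0" cellspacing="0" border="0">
  <tr>
    <td bgcolor="#ffffcc">
      <font face="Arial, Helvetica, sans-serif" size="5"><b>Site News</b></font>
    </td>
  </tr>
</table>

<!-- After: only structure and content remain; a stylesheet can restore the look -->
<h1>Site News</h1>
```

The second version is a fraction of the size, and it still says what the content is: a top-level heading.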
Move scripts and stylesheet rulesets outside of the document. Don't use so-called inline or embedded styles or scripts; put all of your presentation and behavior outside the document, using the mechanisms available in HTML4 and XHTML. (In fact, XHTML's increased strictness makes this a necessity for all but the most simplistic of scripts, if you don't want to deal with "CDATA sections" and the like.)
The result is a line between structure and content on the one hand, and style/presentation and behavior on the other.
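A minimal sketch of what that separation might look like in an HTML 4.01 document follows. The file names site.css and site.js are assumptions made up for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Example page</title>
  <!-- Presentation lives in an external stylesheet (hypothetical file name) -->
  <link rel="stylesheet" type="text/css" href="site.css">
  <!-- Behavior lives in an external script (hypothetical file name) -->
  <script type="text/javascript" src="site.js"></script>
</head>
<body>
  <h1>Example page</h1>
  <p>No style attributes, no embedded rulesets, no inline scripts.</p>
</body>
</html>
```

Because both files are referenced rather than embedded, a browser that caches them only has to fetch them once for the whole site.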
OK, now you've got the basic markup, your content, and little else in the HTML document. You're not done with it, however. Next, think about the meaning of that content. It's easier for pages that fit a certain type, or profile: press releases, bios, news pages, weblogs, and so on. Print out a few pages, and see if you can identify the common elements.
For a blog, you might find posts, dates, permalinks, comments, links, quotes, and the like. You'll notice that HTML doesn't have elements for "post" or "comment" or "permalink", but it does provide for more generic items like paragraphs, or anchors, or quoted text.
So, the trick is to take what HTML (and, by extension, XHTML) gives you for semantic markup, and add to it. Until we have true XML in the browser, we can't just define our own elements. But we can define our own semantics, using styles, and by building on what HTML provides.
After all, a permalink is just a type of anchor. An ingredients list in a recipe is just a type of unordered list, and the steps for mixing it into the batter may well be an ordered list. A quoted song lyric is just a type of blockquote, and the note that identifies the author is just a type of cite. There probably isn't an HTML element for every situation, but do what you can with what you have before resorting to the generic elements <div> and <span>.
A simple paragraph, with the addition of a stylesheet rule, becomes a paragraph in a weblog post. A div, the most generic of all HTML elements, becomes a div containing a post. A blockquote becomes a weblog comment, or quoted text from some other source. A simple anchor becomes a reference to your favorite blog, or a link to read later, or a link to the article you cadged the great technique from.
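To make that concrete, here is one way a weblog post might be marked up. This is only a sketch: every class name, the link target, and the placeholder text are hypothetical, and you should pick names that match what you found in your own content review.

```html
<!-- A weblog post built from generic HTML elements; all class names are hypothetical -->
<div class="post">
  <h2>Post title goes here</h2>
  <p class="date">Date of the post</p>
  <p>A paragraph in a weblog post is still just a paragraph.</p>
  <blockquote class="lyric">
    <p>A quoted line from a song.</p>
    <p><cite>The songwriter</cite></p>
  </blockquote>
  <p><a class="permalink" href="/archive/example-post.html">Permanent link</a></p>
  <blockquote class="comment">
    <p>Text of a reader's comment.</p>
    <p><cite>The reader's name</cite></p>
  </blockquote>
</div>
```

A stylesheet can then address selectors such as div.post, a.permalink, or blockquote.comment, while a browser without CSS still sees ordinary headings, paragraphs, quotes, and links.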
But how do you make sure your cutting-edge visual design shows up in the latest graphical desktop browser, while the content still shows up in the order and with the emphasis you want, in Lynx, or a PDA?
An interesting thing about software: it has bugs. Most of the time, this is reason to curse, but sometimes we can turn these bugs to our advantage.
Many enterprising souls on various mailing lists and elsewhere have found bugs in the CSS and HTML parsers built into certain browsers. I know what you're thinking - "What!? A bug in a browser? No way!" - but hear me out. The interesting thing is that the bugs can be used to selectively hide, or show, styles to whatever browser has that bug.
This is especially interesting when that browser is the only one that has the bug. And, fortunately for us, there are many of these bugs.
Another interesting thing about software: it has features. Most of the time, we concern ourselves with the features software does have, rather than those it does not have. But here again, in our pursuit of Web design excellence, we care whether a given browser doesn't have support for a feature. This means we can use that feature to exclude that browser from ever seeing a particular stylesheet, in whole, or in part.
The most basic browsers, with very limited features and some strong constraints on what they will accept, require you to strip things down to the bare bones. You can always add them back later via more advanced technologies that these browsers don't support.
Instead of trying to mix presentation into markup that lowest common denominator browsers can't handle anyway, strip it out. Make sure only capable browsers even request it in the first place. The more we know about what browser supports what, the better off we are when it comes down to setting up our markup and styles.
Design the stylesheets to deliver custom styles to one browser or a class of browsers, and you localize and reduce your regression testing requirements (a fancy way of saying that you don't need to test those fixes in anything but the browser you're targeting).
In the end, we get a sort of layering effect - all the way up from "baseline" (browsers with no CSS support and little or no support for graphics) to "legacy" (browsers with little to no CSS support) to "midrange" (decent basic CSS1 support) to "modern" (decent CSS2 support) and beyond.
The dividing line is drawn by Tantek Çelik's high pass filter, which excludes any browser that fails Section 7.1 of the CSS Test Suite. We base our current definition of "modern" on this line.
We have a few special cases, as well: Netscape Navigator 4.x, which has enough bugs in its CSS parser for us to want to selectively hide and show styles only to it; Internet Explorer 5 for the Macintosh, which supports one more way to import stylesheets than any other browser (and hence we can deliver special styles through that method); and Opera, which many hacks target.
In addition, by incorporating all of our relevant hacks and filters, we can use the same set of styles to provide a top-level tableless layout, or a section specific set of styles, or a global set of styles for use on the whole website.
So, how do we do this?
We start with a basic, "baseline.css" stylesheet, which we link into the page by way of the <link> element. This stylesheet provides little more than what any CSS-capable browser can handle: a few font, color, and perhaps some basic formatting rules. But it may also contain styles for Netscape Navigator 4.x, using a hack known as Fabrice's Inversion, which shows styles only to NN4, and is itself named in reference to Caio's Hack, which hides from NN4 only.
Then, we link in our "filter.css" stylesheet, which simply uses Tantek's High Pass Filter to import yet another stylesheet. This second sheet contains two lines; one is a dummy, and the next is the sheet we want to import. The imported stylesheet contains our rules for modern browsers, and is appropriately named "modern.css".
So, with just three sheets, one of negligible size, we've provided a way to target styles to NN4, to everyone but NN4, and (as a group) to IE6+ for Windows, IE5+ for Macintosh, Netscape 6/Mozilla or later, Opera 6, and Safari/Konqueror.
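Here is a sketch of how the three sheets might hang together in the head of a document. The file names come from the description above; the notes about what filter.css contains are a paraphrase and an assumption, not the exact text of the filter, so check the published version of the High Pass Filter before copying anything.

```html
<head>
  <title>Example page</title>
  <!-- Step 1: every CSS-capable browser fetches the baseline sheet -->
  <link rel="stylesheet" type="text/css" href="baseline.css">
  <!-- Step 2: every CSS-capable browser fetches the tiny filter sheet -->
  <link rel="stylesheet" type="text/css" href="filter.css">
  <!--
    filter.css is assumed to hold just two lines (CSS, summarized here in a comment):
      line 1: a dummy @import, written so that less capable parsers stumble on it
      line 2: @import "modern.css";
    Only browsers that get past line 1 ever request modern.css.
  -->
</head>
```

Note that modern.css never appears in the markup at all. A browser that can't handle the filter doesn't just ignore the modern rules; it never downloads them.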
There are, of course, other bugs and workarounds. The CSS-Discuss list Wiki, a great resource for this sort of thing, lists or links to dozens of documented bugs and discussions of how, and when, to exploit them. And more are being found every passing day.
Okay, so now you have a document that contains only structural or semantic markup, that uses CSS selectors, the id and class attributes of HTML, and so forth to provide a basic, but meaningful container for your content. You've separated out all of your presentation rules into the appropriate stylesheet or sheets. You've labeled your markup in terms of semantics, and you've also labeled those containers that don't have a semantic, so much as a functional, value.
Now it's time to think about a print-only stylesheet. The greatest thing about the Web is that it allows for so many different kinds of media to coexist more or less peacefully. So, now that we know how to make your site work on the Web, we're going back to paper.
Printed documents are peculiar - you can't click on them for more information. If there's any information, notices, copyright or policy statements, and the like that you need to associate with a given document, here's how you do it. Put the policy into its own <p> or <div> and then use the CSS display property to hide it in all but the print stylesheet. This content will still show up in non-CSS browsers, but you can include a disclaimer for those browsers and then hide it in all stylesheets, as well, including your print stylesheet.
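A minimal sketch of that arrangement might look like the following. The file names screen.css and print.css and the class name print-only are assumptions for the example, and the CSS rules are summarized in a comment rather than shown in their own files.

```html
<head>
  <title>Example page</title>
  <!-- File names and the class name below are hypothetical -->
  <link rel="stylesheet" type="text/css" media="screen" href="screen.css">
  <link rel="stylesheet" type="text/css" media="print" href="print.css">
  <!--
    screen.css is assumed to contain:  .print-only { display: none; }
    print.css simply leaves .print-only alone, so it prints normally.
  -->
</head>
<body>
  <p>Main content of the page.</p>
  <div class="print-only">
    <p>Copyright and policy text that should travel with the printed page.</p>
  </div>
</body>
```

The media attribute on each <link> element is what keeps the screen rules away from the printer and the print rules away from the screen.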
Other things to consider: do users of less capable browsers (say, those who need to tab from link to link, or endure a spoken version of a long navigation menu) want a different experience than those who can quickly skate over to the body content and navigate from there? Consider using the CSS display trick above to provide a "skip navigation" link, or even move your main content to the top of the file and position it on screen using CSS. You can then provide a "skip to nav" link that skips the main content, for those who need to get somewhere else on your site quickly.
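The first of those options, a "skip navigation" link, could be sketched like this. The id values, the class name skip, and the link targets are hypothetical, and the hiding rule mentioned in the comment is assumed to live in one of your external stylesheets.

```html
<body>
  <!-- Hidden in CSS browsers by a rule such as .skip { display: none; } (assumed) -->
  <p class="skip"><a href="#content">Skip navigation</a></p>
  <ul id="nav">
    <li><a href="/">Home</a></li>
    <li><a href="/archive/">Archive</a></li>
    <li><a href="/about/">About</a></li>
  </ul>
  <div id="content">
    <h1>Page title</h1>
    <p>Body content starts here.</p>
  </div>
</body>
```

In a browser without CSS, the link is the first thing the user reaches, and following it jumps straight past the menu to the element whose id is content.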
Finally, test your page out in a few different browsers. I usually test in Lynx as my baseline browser, then Netscape Navigator 4.x, as my legacy browser, then IE5 or IE5.5 for Windows as my midrange browser, and then any of the various modern graphical desktop browsers (IE5/Mac, IE6/Windows, Mozilla, or Safari). Your testing needs may vary, and I encourage you to try testing from cell phones, PDAs, and other browsers. Maybe you'll find the next interesting hack!
As it stands, you're probably groaning at the length and mass of this article. But I do want to sum up the basic approach and lay the groundwork for some future directions.
We've talked about how style can be correlated to basic HTML frameworks, customized and targeted at certain browsers, and how to make sure others don't see it at all. We showed you how to achieve bandwidth savings and make your pages more accessible, through the use of sparse markup, caching of stylesheets and scripts, and structure geared towards the lowest common denominator.
Some of you may object that any approach that depends on bugs is probably not terribly reliable, but let me remind you that the approach is also founded on capabilities. Older browsers, and those that lack a certain level of support for CSS, will never see the high end presentation rules you define, because - bugs or not - they simply won't know how, and therefore will not be able, to fetch them.
Obsolete browsers are not being maintained, and certainly aren't having new features or more robust CSS parsers retrofitted into their codebases. So old bugs are reasonably reliable as indicators (provided the same bugs don't show up in so-called "modern" browsers, that is).
Finally, the modern browsers themselves also march ahead, adding new features and capabilities, as newer specifications are developed. There will be a day when we'll use the high pass filter as our lower bound. Hopefully, a new mechanism will arrive that allows for the tagging of a rule or stylesheet as only worth downloading if a client supports adjacent sibling attribute selectors or whatnot.
The main point I want to leave you with is that this method makes it even harder to avoid separating your markup and presentation. By way of a clear argument for inclusiveness, sensitivity to client capabilities, and suitability, as well as a pretty compelling suggestion of cost savings and improved user experience, the approach we call progressive enhancement seems like a no-brainer, too.