Retooling Slashdot with Web Standards

{Part I of a two-part series.}


Ask an IT person if they know what Slashdot’s tagline is and they’ll reply, “News for Nerds. Stuff that Matters.” Slashdot is a very prominent site, but underneath the hood you will find an old jalopy that could benefit from a web standards mechanic.

In this article we will show how an engine overhaul could take place by converting a single Slashdot page from their current HTML 3.2 code, nested tables, and invalid, nonsemantic markup, to a finely tuned web standards racing engine. The goal is not to change Slashdot, but to rebuild it with web standards and show the benefits of the transition.

Before you panic because I’m picking on Slashdot, let me inform you that I asked Rob “CmdrTaco” Malda, the guru behind Slashdot, for permission to post this information, and he stated in his reply email:

Have fun. Feel free to submit patches back to us if you come up with anything useful. Slashdot’s source code is open source and available at www.slashcode.com.

The breakdown

We started by freezing a copy of Slashdot on Tuesday July 22, 2003. Once we had a copy of the page, the first step was to remove all non-essential tags. The only tags left were anchors, lists, forms, images, scripts and header information. From the stripped down version, the code was converted to XHTML 1.0 Transitional, and validated. At this point, the page looks like a minefield of links in a sea of information. It’s valid, but not pretty to look at, so on to the next step.

The semantic conversion

While viewing the now-valid markup, it became apparent that most of the information would be better described as lists. For example, the images for the categories of Slashdot were now just sitting next to each other. Essentially, anything there were more than two of was put into a list: login, sections, help, stories, about, services, and so on. Lists can be styled and positioned in any way that we want, and by stating that elements are part of a list, we are describing their relationship to each other.
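As a rough sketch of what that looks like in markup, the sections menu could become a plain unordered list. The id, the link targets, and the choice of section names below are placeholders for illustration, not copied from the converted page.

```html
<!-- Hypothetical markup: the id and hrefs are illustrative only -->
<ul id="sections">
  <li><a href="/apache/">Apache</a></li>
  <li><a href="/apple/">Apple</a></li>
  <li><a href="/askslashdot/">Ask Slashdot</a></li>
  <li><a href="/yro/">YRO</a></li>
</ul>
```

With no style sheet applied, this still reads as a bulleted menu, which is the point of describing it as a list.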

The next step was to use heading tags. The page has lots of titles and information, but none of it was described appropriately, and nothing explained how one piece related to another. So we gave the title of an article an <h1>, the author information an <h2>, the department received an <h3>, and the “read more” area an <h4>. This uniquely identified each part of an article while describing the relationship between those parts. Then came the simple part: identify paragraphs, and clean up the code.
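A single story marked up this way might look roughly like the following. Every piece of text and the link target are placeholders, and the order of the elements is an assumption.

```html
<!-- Hypothetical story: all text and the href are placeholders -->
<h1>Example Story Title</h1>
<h2>Posted by an-editor on Tuesday July 22</h2>
<h3>from the example-department dept.</h3>
<p>First paragraph of the story summary goes here.</p>
<h4><a href="/article-placeholder">Read More...</a></h4>
```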

The benefit of the semantic conversion is that we are using tags for what they were meant for. It is clear that a list of objects belongs together, and there is a title hierarchy. Another benefit is that this also helps with search engine optimization.

Boxes everywhere

What we have now is a jumbled mess of well-described information. The information needs to be bound together, with relationships to other information. To begin with, each article is placed into a <div class="node">. Now all information about an article belongs together and all articles are equal, with hierarchy established by the physical order they are placed in.

Next, we uniquely identify each remaining information group, and encapsulate them in their own <div>. For instance:

  • <div id="advertisement">
  • <div id="header">
  • <div id="leftcolumn">
  • <div id="centercolumn">
  • <div id="footer">

The purpose of boxing the information into <div>s is to group it logically, which makes shaping the information easier. The CSS can now address each information group and assign attributes to it, such as layout and design. It’s not necessarily semantic, but it is necessary for the presentation. Here’s the semantically organized example. There’s no CSS layout yet; it’s just structured markup.
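Put together, the boxed page could be sketched as the skeleton below. The ids and the node class come from the text above; the nesting, the order of the boxes, and the placeholder content are assumptions.

```html
<!-- Sketch only: nesting and order are assumed, content is placeholder -->
<div id="advertisement"><!-- banner --></div>
<div id="header"><!-- logo and topic icons --></div>
<div id="leftcolumn"><!-- lists of section and help links --></div>
<div id="centercolumn">
  <div class="node">
    <h1>First story title</h1>
    <p>First story summary.</p>
  </div>
  <div class="node">
    <h1>Second story title</h1>
    <p>Second story summary.</p>
  </div>
</div>
<div id="footer"><!-- site links and legal text --></div>
```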

The reconstruction of the skeleton

Now that each information group is identified by a <div>, the page is shaped with CSS so that the design matches the old look and feel. This is a matter of time, patience, and practice. The first goal is not to mimic the old site, but to get things to position themselves correctly in the three-column format with an overall header / masthead and footer. This becomes the first CSS file: layout.css.

The benefit of positioning a page with a single CSS file is simple: you know where to look if there is a positioning problem. When something goes wrong, it is usually the positioning. In this step, we were mindful of the page’s behavior in a variety of browsers, so we chose to use the @import feature, as any browsers that don’t support that directive will not get the layout. This includes web-enabled cell phones, PDA devices, old browsers, and other Internet devices. Here’s the page with the positioning CSS applied.
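To give a feel for what a positioning-only style sheet contains, here is a minimal float-based sketch. It is not the contents of the real layout.css: the widths, the float technique, and the rightcolumn id are all assumptions, and the rules are shown inline only so the example is self-contained.

```html
<!-- Rules of this kind could live in styles/layout.css.
     Widths, technique, and the rightcolumn id are assumptions. -->
<style type="text/css">
  #header       { width: 100%; }
  #leftcolumn   { float: left;  width: 10em; }
  #rightcolumn  { float: right; width: 12em; }
  /* Margins keep the center text clear of both floated columns */
  #centercolumn { margin-left: 11em; margin-right: 13em; }
  /* The footer drops below whichever column is tallest */
  #footer       { clear: both; }
</style>
```

For the floats to sit beside the center column, both side columns would need to come before the center column in the markup. Note that nothing here mentions fonts or colors; those belong to the skin.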

Applying the skin

Now we have the page displaying the correct layout, but it still doesn’t look like Slashdot. The second CSS file attached is markup.css, which contains information about fonts, colors, background images, and the way lists are displayed. Here’s the final example.
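A skin file of this kind might hold rules along the following lines. The selectors reuse ids and classes from the article, but every font, color, and size is an illustrative guess rather than a value taken from the real markup.css.

```html
<!-- Hypothetical skin rules, inlined for illustration -->
<style type="text/css">
  body           { font-family: Verdana, Arial, sans-serif;
                   color: #000; background: #fff; }
  /* Story titles as light text on a dark bar */
  .node h1       { font-size: 1.2em; color: #fff; background: #066; }
  /* Navigation lists without bullets or indentation */
  #leftcolumn ul { list-style: none; margin: 0; padding: 0; }
</style>
```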

We also have the ability to add a second skin if we want to give the user an option on how they want to view the page. The second skin doesn’t have to duplicate all of the layout information, which should already be cached from the layout.css file.
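One way to offer a second skin is the alternate style sheet mechanism from HTML 4, sketched here under the assumption that skins are attached with link elements rather than @import. The titles and the second file name are hypothetical.

```html
<link rel="stylesheet" type="text/css"
      href="styles/layout.css" media="screen" />
<!-- Preferred skin: rel="stylesheet" plus a title -->
<link rel="stylesheet" type="text/css"
      href="styles/markup.css" title="Classic" />
<!-- Optional skin: hypothetical file, off until the user selects it -->
<link rel="alternate stylesheet" type="text/css"
      href="styles/markup-alt.css" title="Alternate" />
```

Browsers with a built-in style menu list both titles; elsewhere, a switcher script or a server-side preference would be needed. Either way, layout.css is shared by both skins.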

The CSS link

We link the CSS files in the header to complete the transition.

<link rel="stylesheet" type="text/css"
href="styles/layout.css" media='screen' />
<style type="text/css">
@import "styles/markup.css";
</style>

In this example, the layout.css file is linked with a media type of screen. This is intentional. The information there matters only for display on a screen; it has no benefit for print or for any other media type (aural, tv, braille, etc.). The markup.css file, which is the “skin” of the page, is imported, and thus hidden from noncompliant web devices because some of its features could be harmful or interpreted incorrectly.
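The same media attribute is what makes a print-only style sheet possible. A minimal sketch, assuming hypothetical print rules that hide the advertising and navigation and let the stories fill the page:

```html
<!-- Hypothetical print rules, inlined for illustration -->
<style type="text/css" media="print">
  #advertisement, #leftcolumn { display: none; }
  #centercolumn { margin: 0; }
</style>
```

Because the block is scoped to print, screen browsers ignore it, and because layout.css is scoped to screen, the printer never sees the column positioning.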

Benefits!

The page will now correctly render in standards-compliant browsers, just as it did before, and will fail gracefully for non-standard browsers. So, while the design is not as pretty in very old browsers, the content is still available to their users. It is also much cleaner and more predictable with screen readers. By having the CSS fail gracefully, content is even available to PDAs and web phones. Plus, there are no horizontal scroll bars! Finally, there is also a printer-friendly version using only CSS (no separate “printer-friendly” page). Perhaps the biggest benefit of this particular example is the bandwidth savings:

  • Savings per page without caching the CSS file: ~2KB per request
  • Savings per page with caching the CSS file: ~9KB per request

Though a few KB doesn’t sound like a lot of bandwidth, let’s add it up. Slashdot’s FAQ, last updated 13 June 2000, states that they serve 50 million pages in a month. When you break down the figures, that’s ~1,612,900 pages per day or ~18 pages per second. Bandwidth savings are as follows:

  • Savings per day without caching the CSS files: ~3.15 GB bandwidth
  • Savings per day with caching the CSS files: ~14 GB bandwidth

Most Slashdot visitors would have the CSS file cached, so we could ballpark the daily savings at ~10 GB bandwidth. Bandwidth bought in high volume from an ISP could cost anywhere from $1 to $5 per GB of transfer, but let’s calculate it at $1 per GB for an entire year. For this example, the total yearly savings for Slashdot would be: $3,650 USD!
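For readers who want to check the figures, the arithmetic can be laid out as follows. It assumes a 31-day month, which is what the per-day figure implies, and the exact GB values shift slightly depending on whether a GB is counted as 1000 or 1024 MB.

```latex
\begin{align*}
\text{pages per day} &\approx \frac{50{,}000{,}000}{31} \approx 1{,}612{,}900 \\
\text{pages per second} &\approx \frac{1{,}612{,}900}{86{,}400} \approx 18.7 \\
\text{daily savings, CSS not cached} &\approx 1{,}612{,}900 \times 2\,\text{KB}
  \approx 3.2 \times 10^{6}\,\text{KB} \approx 3\,\text{GB} \\
\text{daily savings, CSS cached} &\approx 1{,}612{,}900 \times 9\,\text{KB}
  \approx 14.5 \times 10^{6}\,\text{KB} \approx 14\,\text{GB} \\
\text{yearly savings} &\approx 10\,\text{GB/day} \times 365\,\text{days}
  \times \$1/\text{GB} = \$3{,}650
\end{align*}
```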

Remember: this calculation is based on the number of pages served as of 13 June, 2000. I believe that Slashdot’s traffic is much heavier now, but even using this three-year-old figure, the money saved is impressive.

Everything explained so far is discussed in more detail at the University of Wisconsin – Platteville’s Slashdot Web Standards example site.

The challenge

I now challenge the ALA community. We need a good web standards mechanic (or team of mechanics) to dig through Slashdot’s engine, Slashcode, and make it web-standards-compliant. CmdrTaco has encouraged us to submit patches, and I know we can show the benefits! The challenge is there — any takers?

Next time: printer-friendly and handheld-friendly Slashdot with a few simple additions.

94 Reader Comments

  1. I noticed that, when looking at the sample sheet with MS IE (6?) installed with AdShield (http://www.allstarss.com/store/adshield.html), the top images float out of position if an ad is blocked.

    When mozilla blocks an image from displaying, the area is left as a ‘blank’ space (well, you can still link to the ad if you click in the blank space), but when Adshield tanks an image on Internet Explorer, it is often the case that the dynamic space set up for the image gets removed, so it is as if the image was never there.

    This ‘disappearance’ of the entire ad banner space is causing the ‘slashdot’ logo and topic images to be displayed over the text of the topmost story.

  2. I have a Palm Tungsten C and I regularly surf slashdot.org/palm on my 320×320 screen. It’s not just about presentation; they also do nice things like show the top ranked replies as a sort of discussion summary. Would that be easy to accommodate in your rework?

  3. What’s with all the people posting ‘this seems faster to me!’ It’s a static page on a different server; you cannot compare the speeds empirically until it is running live, with all the DB calls, perl evaluation and BW usage that entails.

  4. Organizationally the lists of items (such as: “YRO”, “Older Stuff”) look ugly. Especially for lines that wrap. I’d expect to see lines that start on the left line, and then wrapped text be indented instead of the other way around. But other than that I like the look. (I hate the italicized news header text in the second example style though, hard to read)

  5. I definitely like what you did. The code is a lot cleaner and leverages existing standards. Something I always try to do when possible.

    But it doesn’t solve my problem – and I expect the problem for most people here. We have to convince the people who pay the bills that this is worth doing – rewrite legacy HTML. Otherwise it’s just another academic exercise.

    The bandwidth savings of 3-4000 USD / year are not significant for most sites that have as much usage as slashdot. Most sites and web applications out there have a lot less usage and hence will see a lot less savings.

    How many person-hours were put into this? Counting discussions and planning as well as the coding? How many hours would be spent putting it in slashdot?

    I expect the ‘cost’ to implement this would be at least $30-40,000 USD. That means you have a 10 year return on this work. No business person would consider it. Not to mention the risk of doing it in the first place and the potential revenue lost of having the people work on something else (that has a better return).

    What I would like to see is how much this saves in maintenance and bad fixes for the slashdot site. That might make a better business case.

  6. Ya know, I think this very same thing should be done with phpBB. I’m just waiting for version 2.2…

    (Spirit)

  7. Thank you for a great article!

    This is not only a useful insight into the “retooling” of an existing website but should also be of interest to people planning future website projects.

    I look forward to your followup article and have a small request/suggestion, in-line with several comments already posted:

    It would be useful to mention what needs to be considered, and/or how you allow, the graceful support of magnification. Many CSS-driven sites allow only a small amount of magnification before becoming unreadable.

  8. Here’s one thing you didn’t think of. Users can now use their own custom CSS to disable the advertisements:

    #advertisement {
    display: none;
    visibility: collapse;
    }

    (I think I got that right)

  9. I can only use slashdot in low-bandwidth format (both for visual and crap dialup reasons). So preserving that is high on my personal priorities list 🙂

    This CSS is pretty well-behaved in that the non-CSS version degrades to very nearly the normal low-bandwidth appearance. The only notable diff being that the headlines are in a larger font than I’m accustomed to seeing. Of course, how the comment pages degrade remains to be seen. 🙂

    In my browser, which does not do CSS (and has js and images disabled), the “skip to content” link does not work. Maybe an ordinary A NAME tag instead of whatever you’ve got right now?

    (This form doesn’t wrap… makes it hard to see what I wrote… grrr!)

    ~REZ~

  10. It is neat to see an entire page that does not include tables.

    What software did you use to redo Slashdot’s page or was it done with just a text editor?
    Please include any validating software tools too.

    What CSS/XHTML reference materials/books/web sites do you recommend?

  11. I would really like to create a standards compliant site but I have problems doing it, even with reading many tutorials and such. If someone is willing to help me get started on creating a good looking layout with only css (I know the basics of css already) then it would be much appreciated!

    admin [at] sandwiregames [dot] com

    thanks!

  12. In response to the gentleman who claimed the ROI was 10 years:

    Your calculations for the labor involved are _Way_ off.

    #1) The real work has already been done by ALA, saving slashdot significantly. Even if done by a real developer, since it is a translation of an existing design, it would only take 24 hours, at the most, to come up with the XHTML and CSS in the form ALA has it.

    #2) With the grunt work done, all that remains is template modifications. I downloaded slashcode and got about 25% done with modifying the templates in only 4 hours. Assume I’m faster than average and/or some ‘snags’ would be hit – you’re still looking at 20 hours. Add 4 for some testing (or do it the slashdot way and put up a test server and let your audience test). Even with work inflation (someone taking their time or documenting things according to a process slashdot has in place) and testing, you’re looking at no more than 26 or so hours.

    In this post-boom world, a coder can be had for fairly low prices – especially when it’s the cookie-cutter work of #2. Assume you pay the high rate of 80$/hour. This may appear low to you if you live in the valley, but it’s a rate I see all the time in the real world. Except I see it being charged by companies – I could see a true independent charging less.

    So, 80 * 30 (let’s inflate more and take into account the time needed to find someone to do it – which won’t be much with slashdot) results in 2400$. Add another 400$ for the time of the people approving the result and such.

    2800$ – less than the bandwidth savings, so less than a year ROI. Even if the overhead of the internal processes and approvals were higher, you’re looking at a year ROI.

    If # 1 were still involved, you’d have a much different picture… but still less than 2 years ROI.

    Now, most companies won’t see the bandwidth savings slashdot would get – simply because they don’t have the traffic. However, the resulting XHTML/CSS is easier to modify (content _and_ design), reducing costs to update the site in the future. Without a case study or access to a company’s books (to see the amount spent currently on maintaining and modifying a website), the ROI is much harder to determine. Many companies get redesigns every 2-3 years anyway, so the costs could just be absorbed during one of these upgrades.

    Slashdot has a distinct advantage in this case – people will climb all over each other to do the work for free.

    Even with the overhead of rolling the changes out, the ROI on near-0$ of changes is hard to argue with.

  13. Have a look at http://software.lottadot.com/article.pl?sid=03/06/17/224243&mode=thread&tid=10&tid=11

    Way back when, I took a fresh check-out of slashcode, broke down its layout, and started converting to CSS.

    You can grab it out of our CVS server if you want to have a look. It’s being used on sites such as the Bookiejoint http://bookiejoint.org (which, if you’re curious about how comments look on a CSS’d site, check it out).

    The benefits of CSS’ing, well, all that’s been said numerous times, and we’ve demonstrated that by taking a story that has many thousands of comments and running it under our CSS’d templates. The bandwidth and speed benefits are just awesome.

    Anyway, pty started doing one at http://strict.openflows.org and there’s mine over at http://www.lottadot.com/. Those are the only 2 publicly available that I know of. The one at openflows isn’t geared to the standard 3-column slashcode look, whereas mine was intended to be a direct drop-in replacement.

  14. Your code seems to have one mistake:

    <link rel="stylesheet" type="text/css"
    href="styles/layout.css" media='screen' />
    <style type="text/css">
    @import "styles/markup.css";
    </style>

    Shouldn’t the layout.css be in the @import statement so that a browser that doesn’t support the @import rule doesn’t try to position the page?

    Also, like someone noticed before, I don’t really like the leftcolumn class either, not generic enough for my taste. I would also like the main text to be on top but it’s always a challenge with a 3 column layout.

  15. “We gave the title of an article an <h1>, the author information an <h2>, the department received an <h3>, and the “read more” area an <h4>.”

    The heading tags are for, well, headings. Using them for the author, department and “read more” sections makes no semantic sense, and is little better than the bad old days of using these tags whenever you wanted big bold text.

    Much better (IMO) to use something like

    instead.

  16. Of course they’re headings; they provide information about the section that follows (or precedes). They’re not paragraphs of text; making them paragraphs is no better than using

    s.

    If there’s any semantic problem with them, it’s that each of <h1> through <h4> all apply to the same section. If you’re going to be a zealot about saying “Deeper levels of headings MUST apply to deeper sections,” then the solution is that all FOUR of those headings should be

    . They all apply to the same section, so they all get the same number.

    I personally don’t think it’s necessary to be a zealot about such things, though. There’s nothing, even in the spec, that says

    HAS to be the heading of a section exactly one level deeper than

    . I see no problem with making it the subheading of an

    as long as you’re consistent.

    I have a slight problem with

    being used for the information in the links, though. That kind of puts the links on the same “level” as the articles, which I don’t think is good.

    Also, I think the markup for the “Sections” links is pretty bad. The “n more” lines are treated as list items, which they certainly aren’t. The only things that are <li>s there are the section links themselves; “n more” lines would be better off as s set to display: block;. Put “3 more” in parentheses and it makes perfect sense styled or unstyled.

    Not to be too negative, though. It’s great to see this done. 🙂

  18. You say, “Two years later, almost to the day, SlashCode’s (sic) default templates still have not been updated and made valid, and neither has SlashDot’s (sic) template. (sic)”

    (One would think you would have learned to spell “Slashcode” and “Slashdot” in the intervening years.)

    Did you have a point beyond, “I already said this?” It almost seems like you are saying that it hasn’t been fixed is evidence of something in particular, but as it is not evidence of anything in particular — except that they’ve not been “made valid” — your words are a bit confusing.

  19. While we are on the topic, has anyone ever taken note to how bad the html is behind javadocs, which are now sprawled all over the web? Maybe someone should kick off an initiative to fix those things up and send it over to Sun!

  20. I do all my coding by hand using EditPlus. A majority of serious web coders would probably use HomeSite but I find it bloated and buggy.

    Hand coding is the ideal as it gives you an appreciation of the underlying structure of what you are doing, and that’s what markup is mostly about; structuring and presenting a document.

    Dreamweaver is ultimately for people who don’t really know what they are doing.

    Oh, and Fireworks is the way to go for web graphics.

  21. Nice article, the idea was there but it needs to go a bit deeper. Ideally Slashdot, and its underlying SlashCode, need to be rebuilt from the ground up, but as stated in the FAQ under “Have you considered PHP?”, which I believe sums up the rebuilding on the whole, “the effort involved in rewriting it would be prohibitive…”

    If possible, the site would benefit from being rebuilt from backend to stylesheets, not just to comply with standards, but also to address usability and accessibility issues.

    Another thing, not entirely on topic, but after reading the article, I wondered if in the future, instead of “retooling” a site, what about doing an article(s) on creating a site from scratch, from idea to final product, using xhtml, css, php and maybe mysql.

    I believe it would be good for people to see how things can be done, from start to finish, and in turn learning some lessons on the process/journey/struggle in developing a site.

  22. Fantastic article and great food for thought for those large and popular sites that think it may be ‘too hard’ to keep up with standards.

    Thank you for an excellent example to show my clients.

  23. Hello,

    I wanted to redesign our institution’s webpage using CSS-P Layout upon knowing its advantage over table-based layout design. I tried starting a page from the templates that come with Dreamweaver MX 2004, Page Designs(CSS). I previewed the resulting new layout design using Opera, Netscape 7.1 & Explorer locally in my computer and was amazed by its control in the overall design. All my three browsers were successful in viewing the sample page.

    However, when I uploaded the page into my server, I was surprised to see why it did not show up in my Netscape browser the same way as I previewed it locally in my computer. While Opera & Explorer on the other hand have no problem viewing the page from the server at all.

    I tried uploading the same files again to another server and was successful in viewing the page using the three browsers this time.

    What seems to be the problem here? A Netscape bug or a server problem? Does a server need to be configured to parse a CSS page design?

    Please do check the page:
    (this site does not work)
    http://wwww.cdc-cdh.edu/css_site/csstest.htm

    (this site works)
    http://www.feu-alumni.com/css_site/csstest.htm

    Regards and hoping you can enlighten me,

    Raoul

  24. Raoul: Sounds like a server problem. Mozilla (and thus probably Netscape 6+) won’t read CSS files that aren’t sent out by the server with the MIME type “text/css”, at least not if the HTML document has a fairly recent doctype. This is not a bug in Mozilla, but I think it’s unnecessarily strict.

    To fix it, the server should just be instructed to send out files ending with “.css” as “text/css”. Under Apache, this can be done either with a .htaccess file (I don’t remember the exact syntax, though. Anyone?) or in the global configuration (which I’m not familiar with at all), if you have access to that. Under other servers, I’m not sure how to do it.

  25. It looks good – but as with many of these layouts, when you reduce the width of the browser window all the elements start colliding and looking a bit odd. I’ve stuck with tables for “stretching” layouts so far because of effects like this, despite a strong desire to switch to a div based layout. Any ideas?

  26. “Next week: printer-friendly and handheld-friendly Slashdot with a few simple additions.”

    so, it already 9 days since the article came out. where’s part two?

  27. Honest question here, but how is that so many “standards” examples and practices involve pages that make such heavy use of javascript?

    Not only does it make the site useless for those visitors without javascript (either disabled due to paranoia or weak mobile browsers), but for all the talk about bandwidth savings javascript remains a client-side service that requires every client to download it — even if it doesn’t apply to a particular client.

    Surely there must be a server-side platform/language out there that can manage all the benefits of providing “web standards” without requiring the use and download of javascript?

  28. I agree with Greenie above, I don’t know why people use Javascript for so many things.

    A website should still be functional without javascript. There are more people out there with javascript disabled/not available than there are people using 4.0 browsers, and still people use Javascript without even thinking about it.

    I’m all for standards-compliance and all, but there’s so many silly things possible with javascript that I always leave it disabled for fear of being annoyed by the current (stupid and useless) funny craze (such as images following your cursor).

    If you can’t make a website that works without javascript, a lot of people won’t even bother enabling it; they’ll just go elsewhere.

    In my opinion, Javascript is only useful for forms and similar cases.

  29. Greenie wrote: “Honest question here, but how is that so many “standards” examples and practices involve pages that make such heavy use of javascript?”

    Don’t know what you’re talking about. The answer is that, to the best of my knowledge, many “standards” examples and practices don’t make any use at all of JavaScript.

    Or did you mean to reply to the “JavaScript Image Replacement” article? Well, that’s kinda like the exception that confirms the rule. If you look around the websites of people who take web standards seriously, you’ll find that they’re avoiding JavaScript as far as humanly possible, even in cases where JavaScript has previously been considered the only solution.

    When said people resort to JavaScript, there usually aren’t any other options. One example that springs to mind is PNG images. IE/Win doesn’t support alpha channels in PNG by default, but there is a way to make it, using JavaScript a la Microsoft. There simply isn’t any other way to achieve it. Certainly not server-side. But of course, there’s always the option of not using PNG alpha at all, or not caring about it working properly in IE/Win.

  30. “Did you have a point beyond, “I already said this?” It almost seems like you are saying that it hasn’t been fixed is evidence of something in particular, but as it is not evidence of anything in particular — except that they’ve not been “made valid” — your words are a bit confusing”

    Yep, I had a point, and I think everyone else might have seen it: Despite the issue being raised several times over the years, including by me two years ago, SlashDot’s administrators still haven’t fixed their site’s templates. Given the claims on the SlashCode web site that this is easy, the fact that it hasn’t been done for SlashDot is evidence that either the SlashDot administrators don’t care enough to fix them, or they don’t know enough about HTML/CSS to know how to do so. I suspect the former, but I’m willing to believe the latter.

    [sorry about the delay in replying, I’ve been on the road AGAIN]

  31. flaimo wrote:

    >>”Next week: printer-friendly and handheld-friendly Slashdot with a few simple additions.”

    >>so, it already 9 days since the article came out. where’s part two?

    delayed by the u.s. thanksgiving holiday i should think

  32. Precisely. We celebrated Thanksgiving instead of publishing another issue of ALA.

    The next ALA issue comes out Friday 4 December and will include Part II of the Slashdot article.

    I just changed the copy to read “Next time” instead of “Next week.”

  33. hi,
    there were complaints, about that the css enabled page does not scale well, if using very large fonts.
    I’m not a CSS expert, but by playing around with the proper display value, it should be possible to make content, leftcolumn, centercolumn, and rightcolumn behave like a table.
    This way the font-scaling problem might get solved.

  34. Quote:
    It looks horrible with iCab. Hopefully iCab will improve its CSS support soon, but until then we still need to check the website look in every browser and avoid using unsupported CSS features.

    Dude, iCab has had zero CSS support for the past four years. It is for all intents and purposes dead. The only people I can see continuing to use it are die-hard Microsoft haters running Mac OS 9. On OS X, there’s no shortage of standards-compliant browsers. At this point iCab can be considered a browser with roughly the functionality of lynx.

  35. “The next ALA issue comes out Friday 4 December and will include Part II of the Slashdot article.”

    Friday 5 December? 😉

  36. Am I the only one that thinks markup should never contain words like “leftcolumn,” “centercolumn,” and “rightcolumn?” What if an alternate stylesheet has the columns switched? What if a future version of the page doesn’t use columns at all?

    Yet I see this all the time. Sometimes the markup is as horrible as “bluetext,” “bold,” etc. This is not what CSS is all about.

  37. The slashdot redesign contains the following code:

    Just wondering if you used a guideline for whether id was used in place of class. I suppose I would base my decision to use id for structural elements that do not repeat. Is there any other reasons?

  38. Ref: http://www.alistapart.com/discuss/slashdot/2/#c5678

    Peter,
    I know this is a few months after the event but as a hosting company that’s been trying to deliver a solution around the basis of the /. site these three things that you cite are really important:
    1). Fix non-standard /. markup
    2). ????????
    3). Profit!
    They are all things that we’ve had months of hair pulling and a majority of it is the inability of the coders to use any form of standardisation, let alone WS.
    No’3 is the hardest for us to come to terms with as virtually no-one this year is going to look at re-tooling unless there’s visible profit!
    McQ
