A LIST Apart: For People Who Make Websites

No. 275

Discuss: Semantics in HTML 5

11 Creating New Elements

As Jeremy Keith mentioned in comment 2, the document.createElement() solution works for IE 6 and 7; and the latest non-public technical release of IE8 has fixed the bug that prevented that from working in previous betas. And given that the update cycle for Firefox is significantly faster than that of IE, waiting for Firefox 2 usage to diminish significantly enough and for users to upgrade to at least Firefox 3, which handles unknown elements reasonably well, isn’t such a big deal.
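
The workaround mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the helper name registerElements and the list of element names are assumptions, and the document object is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper (name and element list are illustrative).
// Calling createElement once per unknown tag name is the trick that lets
// IE 6 and 7 treat those tags as styleable elements when they are parsed later.
function registerElements(doc, names) {
  var registered = [];
  for (var i = 0; i < names.length; i++) {
    doc.createElement(names[i]); // the returned node is deliberately discarded
    registered.push(names[i]);
  }
  return registered;
}

// In a page this would run in the <head>, before the new elements are parsed:
// registerElements(document, ["header", "footer", "nav", "section", "article"]);
```

The call has to happen before the parser reaches the new elements, which is why such a script would normally sit in the head of the page.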

So all the problems with introducing new elements in HTML5 either have workarounds or are rapidly becoming insignificant.

The advantage of using elements instead of attributes is that from an authoring perspective, it can make reading markup a lot easier, since so many <div> and </div> are replaced by more meaningful names. In particular, by replacing numerous nested divs, it makes associating end tags with start tags easier (as long as authors don’t simply naively replace all divs with sections, leaving the same “divitis” problem with a different name.)

posted at 10:58 am on January 6, 2009 by Lachlan Hunt

12 Semantics of semantics

John,

you make an interesting case, yet as the comments — no matter how smart they may be too — clearly show, you fall into a common trap: you propose a solution to an unclear problem.

Pretty much everyone in the Web community agrees that “semantics are yummy, and will get you cookies”, and that’s probably true. But once you start digging a little bit further, it becomes clear that very few people can actually articulate a reason why.

So before we all go another round on this, I have to ask: what’s it you wanna do with them darn semantics?

The general answer is “to repurpose content”. That’s fine on the surface, but you quickly reach a point where you have to ask “repurpose to what?”. For instance, if you want to render pages to a small screen (a form of repurposing) then <nav> or <footer> tell you that those bits aren’t content, and can be folded away; but if you’re looking into legal issues digging inside <footer> with some heuristics won’t help much.
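
As a rough sketch of the small-screen case, assuming a page is modelled as a tree of plain objects with a tag and optional children: the function name, the tree shape and the list of foldable tags are all made up for illustration.

```javascript
// Tags assumed to be "not main content" on a small screen (illustrative list).
var FOLDABLE_TAGS = ["nav", "footer"];

// Walks a plain-object tree ({ tag, children }) and returns a copy in which
// foldable subtrees are marked collapsed and their children are left out.
function foldForSmallScreen(node) {
  if (FOLDABLE_TAGS.indexOf(node.tag) !== -1) {
    return { tag: node.tag, collapsed: true, children: [] };
  }
  var children = (node.children || []).map(function (child) {
    return foldForSmallScreen(child);
  });
  return { tag: node.tag, collapsed: false, children: children };
}
```

Nothing in this sketch helps with the legal-issues case: knowing that something is a footer says nothing about what is inside it.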

I think HTML should add only elements that either expose functionality that would be pretty much meaningless otherwise (e.g. <canvas>) or that provide semantics that help repurpose for Web browsing uses.

We can debate the specifics, but seen in this light the existing additions to HTML5 pretty much make sense. This is very definitely not to say that there shouldn’t be extensibility hooks, rather I aim to indicate which extensibility approach should go where.

That also tells you why looking at DocBook isn’t such a great idea. DocBook is for technical documentation, HTML is far more general in its purpose. DocBook has hundreds of elements and attributes that would make no sense in HTML.

So before we look into other ways of including semantics in HTML, we need to look at use cases. Want to extract some triples? Maybe GRDDL can cut it. Want to do semantic decoration of your tree? Looking into RDFa could perhaps be an option. Want to have your content stick as cleanly as possible to the semantics of your model, and render that separately? It could be a job for arbitrary XML with some XBL (we’re talking 2020 here).

I really am not the use case fascist most of the time, but when the word “semantics” comes up it helps to reach for the bullwhip.

posted at 12:20 pm on January 6, 2009 by Robin Berjon

13 Untitled

It’s not good to complain about attempts to update HTML because it’s already been updated several times in the past. So why suddenly can’t we update it any more? Because of IE6?

And if HTML5 is not the answer, we are left with two things:

1. HTML4 as it stands, and using methods like this article suggests to bolt new concepts on to it. That sounds like trying to improve VHS instead of moving to DVD just for the sake of backwards compatibility. Things have to move on at some point and often that means a big change that will leave existing users behind. (Anything that no longer works simply has to be converted, like VHS tapes are copied onto DVDs.)

2. XHTML. This allows any tags we want, and appeals to me because we don’t have to go with a set list of tags we are given if we don’t like them. So long as they remain readable and have a DTD to let the browser know how to handle them, what’s the problem?

HTML5 may not be perfect but we can’t stick with HTML4 forever. I’m looking forward to HTML6 and beyond.

posted at 12:23 pm on January 6, 2009 by Chris Hester

14 Untitled

Good piece, Jon – especially as I’m currently weeping in frustration while trying to style html 5 reliably.

There are some weirdnesses in HTML 5, I agree. hr has always annoyed me, as it’s presentational and should really be “separator” (so Japanese people can have a vertical rule). But backwards compatibility is a design principle of HTML 5: http://www.w3.org/TR/html-design-principles/

header and footer have caused woe and anguish to some; to me, they’re well-established terms for “branding stuff” and “stuff at the end in small type where you hide the legal stuff and the accessibility link”.

I had a fist-fight with Lachlan only yesterday because I can’t mark up dialogues using blockquotes, but must use the crazy dt/dd pairs (and can’t mark up stage directions in a dialogue).

But HTML 5 is probably the best thing we’ve got right now. It gives lots of new ways to mark up web apps, so pushes a pitchfork in the face of the advancing Silverlight and Flash onslaught. (Do they have much richer semantics?)

Devil’s advocate: you claim that microformats show that we need more semantics. But apart from hCalendar and hCard, nobody really uses them that I know of. Are they a solution in search of a problem?

posted at 12:24 pm on January 6, 2009 by bruce lawson

15 Untitled

.. and apologies for spelling your name wrong.

Buce.

posted at 12:33 pm on January 6, 2009 by bruce lawson

16 The holy grail

I’ve often wondered about defining semantics in the same way that style is defined using CSS. You could have “Semantic Sheets” which you could attach to your HTML document in much the same way that you do CSS. That would remove the whole semantics issue from HTML. But then, really, that is what XML and DTDs are supposed to do.
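
Purely as a thought experiment, such a “Semantic Sheet” might be pictured like this. Nothing of the kind is standardised; the sheet format, the class names and the function name below are all invented for the sketch.

```javascript
// A made-up "semantic sheet": class name -> meaning.
var semanticSheet = {
  "site-footer": "footer",
  "main-menu": "navigation"
};

// Returns the meanings that the sheet assigns to an element's classes,
// much as a stylesheet assigns presentation to the same hooks.
function meaningsFor(el, sheet) {
  var classes = (el.className || "").split(/\s+/);
  var meanings = [];
  for (var i = 0; i < classes.length; i++) {
    if (Object.prototype.hasOwnProperty.call(sheet, classes[i])) {
      meanings.push(sheet[classes[i]]);
    }
  }
  return meanings;
}
```

The markup would stay plain HTML4, with the meaning layered on from outside in the same way presentation is.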

Maybe we are searching for the holy grail of web authoring: a mechanism for marking up content in a semantically rich way that is easy to use, easy to write, easy to edit, easy to extend, backward compatible and forward compatible. Perhaps this is not possible? Maybe we should just accept that HTML4, whilst not perfect, is good enough?

@Lachlan

But isn’t HTML5, at its core, simply replacing one fixed set of tags with another? The new tags might be meaningful now, but will they be in 30 years time?

I personally feel too much time is wasted on creating a “better” HTML when really it is still incredibly difficult for non-technical people to write valid, meaningful and uncluttered HTML. I still know of no editor (WYSIWYG or otherwise) which I could give to my mother so that she can write a valid, semantically correct and uncluttered HTML document.

posted at 12:46 pm on January 6, 2009 by Robin Massart

17 thoughtful, but

Thoughtful, certainly, but I will argue for a scorched-internet policy. We need to move forward, and screw the browsers that refuse to come along with us into the future. Sure, users will suffer at first, until they Get It. Word will spread soon enough that there are browsers out there that JPW.

Does anyone seriously believe that we can get by just on minor modifications to the current HTML spec for the next decade?? Of course not! Sooner or later, we’re going to wish we had a continuous supply of new elements, where such things make sense, universally speaking.

Damn the torpedoes – full speed ahead!

posted at 12:47 pm on January 6, 2009 by jeremy jarratt

18

Hi John,

The main problem you describe, of having a need to add semantics at a number of different layers within a document, is exactly what motivated me to write the early drafts of RDFa, a few years ago. And the combination of RDFa and @role achieves pretty much everything you rightly suggest we need.

The essential difference between the attributes added by RDFa and the role attribute is the level at which the semantics are applied. As you rightly point out in your examples, role allows an author to say something about the document itself. For example, they might indicate that one part of their document is a footer, whilst another is a menu.

The RDFa attributes, on the other hand, relate to the content of the document. Here an author can indicate that a link to another document is actually a link to a license. Or they could indicate that the text in an <h2> is actually the name of the author of the article.

By the way, the @content attribute is used to provide a more accurate representation of some inline text, so your example:

<span equivalent="2009-05-01">May Day next year</span>

could be written as:

<span content="2009-05-01">May Day next year</span>

You can also indicate the datatype, if needed:

<span content="2009-05-01" datatype="xs:date">May Day next year</span>
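
How a consumer might read such a span can be sketched as below. This is a simplified illustration, not an RDFa processor: the element is modelled as a plain object with attributes and text, and the function name is an assumption.

```javascript
// Simplified, hypothetical reader for the spans above.
// When a content attribute is present it stands in for the inline text;
// datatype, if given, says how the value should be interpreted.
function literalValue(el) {
  var attrs = el.attributes || {};
  var hasContent = Object.prototype.hasOwnProperty.call(attrs, "content");
  return {
    value: hasContent ? attrs.content : el.text,
    datatype: attrs.datatype || null
  };
}
```

A human reader still sees “May Day next year”, while a program using something like this would work with the date value instead.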

Regards,

Mark

http://webBackplane.com/mark-birbeck

posted at 01:10 pm on January 6, 2009 by MarkBirbeck

19 Stability

To address the commenters who ask (rhetorically, I think) why we bother with backwards compatibility when that was never a concern in the late 90s: Do any of you actually remember the late 90s? It sucked. There were no standards, browser compatibility was very much up in the air, and the leading edge was so erratic and inconsistent that nothing interesting or valuable could be discovered or formalized.

Modern Ajax techniques are based on ten-year-old technology. XMLHttpRequest and IFRAME methods have been possible for a very long time. JavaScript hasn’t moved forward appreciably since v1.5. It took ten years of stability to develop the mature scripting frameworks and responsible development processes that we now take for granted.

The last thing we want to do is throw out the old because we’ve outgrown it to introduce a whole new set of instabilities. The internet platform has become too mature for that to be a reasonable proposition. Instead, we should be examining processes like those proposed in this article, which emphasize small steps in the right general direction but without breaking everything that’s come before.

posted at 01:15 pm on January 6, 2009 by Nick Husher

20 XML, DTDs, Oh My!

Sounds like we all agree that we need a markup language that is flexible, something extensible. Yeah! An eXtensible markup language. Then we just need something that allows us to define the custom types in our documents. A document type definition.

Boy it’d be great if this “XML” language existed along with support for “DTDs”…

If we all stopped this absurd argument over which HTML 5 tags are worthwhile and moved to full XML support styled using current CSS techniques we could stop versioning the markup language used to structure the web.

The only downside is that without a standard set of markup tags you make spidering the web much more challenging and open the doors to “What the hell are you doing?! Why is there a <footer_for_the_homepage_with_red_border> tag?!”

posted at 01:16 pm on January 6, 2009 by Michael Thompson
