I'm sure all this has been said before, but it's a minor obsession of mine. Ignore HTML5 for a second, and all the interesting things happening with JavaScript and CSS. I'd like to discuss the kids in leather jackets smoking cigarettes behind the gymnasium of web development: <b> and <i>.
And WordPress is going to make my first point for me. I'm typing this in their WYSIWYG and the buttons for bold and italic are a little b and a little i. When I toggle one on, it becomes /b or /i. And yet the HTML that gets inserted is <strong> or <em>. Why?
Point 1: The bold and italic tags do exactly what the fuck it says on the tin.
If you want something in a division on a page, there's a tag for that. If you want something to be a heading, there's a tag for that, too. These tags, like all others, accomplish what they do only through conventions of browser manufacturers, but can nonetheless be relied upon to get you some semblance of what you want even if your CSS should be mysteriously and tragically abducted in the middle of the night. Like the bold and italic tags, they tell you very little about text they contain. But nothing in HTML tells you about the text it contains. It does, however, sort of tell you what it's going to look like in a webpage and describe the role it will play in the page's structure.
Point 2: If you want semantic markup, use XML.
XML is a very handy little language that will let you describe your precious digitized text as precisely as you want. You can style it and present it on the web and get exactly the same thing you would from a document full of non-semantic <div>s and <span>s. Or, hey, put even more <span>s in and give them all seventeen class names and call it a microformat. I hope you have a really, really fun time doing that. If, on the other hand, you're willing to consider that your consumable data and your pretty website need not be the same beast, you could just keep the XML somewhere else and if anyone ever actually wants to scrape all the air dates and titles of the Buffy the Vampire Slayer episodes you recorded on VHS, let 'em just grab that.
Point 3: Why is this webpage shouting at me?!
You know what strong and emphasis refer to? They refer to verbal inflections. You know what most people's web browsers don't do? Talk to them. You know what would annoy the shit out of me if I were using a screen reader to read a bibliography on the internet? Listening to a goofy-sounding computer voice try to emphasize every single word of Journal of the Study of Obscure and Mostly-in-Latin Canine Diseases Affecting Generally the Respiratory System but Also Sometimes the Lymph Nodes or something. I don't know that this is still a problem, but the idea is ridiculous in and of itself. Italic does not always mean emphasis, nor does bold always mean MAKE THIS LOUD. Certain conventions of formatting require the use of bold or italics, but actual writing rarely does. If you need to place emphasis on something, you can do that with word choice. (Or anyway I hear it's possible.)
Point 4: Character count
When I first started doing professional web development, lots of things weren't digitized. Therefore I spent a lot of time marking things up, enough that I began to have those horrible almost-nightmares where you're doing the same task over and over and you can't stop doing it or thinking about it. If semantic markup had been a buzzword back then and every <b> had to be a <span style="font-weight:bold;">, I probably would have gone back to working at Chevron. The bold and italic tags are short. The emphasis tag isn't bad, but the strong tag is way too long to be a good value per keystroke. This is to say nothing of what you should actually be doing if you want to keep your presentation and your tags altogether separate, using a span with a CSS class describing the content and not merely the fact that you want it extra heavy. I mean, sure, I will always use CSS to bold or italicize something if there's already a tag around it. Nine times out of ten a bold or italic tag would be inappropriate there anyway as the font weight or style is purely decorative. On the other hand, if something needs to be weighted or styled differently and has no containing HTML element, fuck it, I'm using <b> and <i>. They're shorter.
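For anyone who wants the keystroke argument as arithmetic, here is a rough sketch in JavaScript. The markupCost helper and the costs object are made up for illustration; they just add up the characters in an opening and closing tag, which is only one crude way of measuring typing effort.

```javascript
// Hypothetical helper: number of characters of markup needed to wrap
// a run of text, i.e. the opening tag plus the closing tag.
function markupCost(openTag, closeTag) {
  return openTag.length + closeTag.length;
}

// The tags compared in the text above.
const costs = {
  b: markupCost('<b>', '</b>'),
  i: markupCost('<i>', '</i>'),
  em: markupCost('<em>', '</em>'),
  strong: markupCost('<strong>', '</strong>'),
  span: markupCost('<span style="font-weight:bold;">', '</span>'),
};

// Print the tags from cheapest to most expensive.
Object.keys(costs)
  .sort(function (a, b) { return costs[a] - costs[b]; })
  .forEach(function (tag) { console.log(tag + ': ' + costs[tag]); });
```

Multiply the difference by however many thousand phrases you're marking up by hand and the appeal of the short tags gets pretty obvious.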
Point 5: Dirty tricks
I add <b> and <i> to things I don't necessarily want bolded or italicized. I do it because they're some of the shortest elements available and they provide a hook to style child elements in larger repeated widgets with very precise, complex layouts. If I have to wrap every word of text in a tag just to accomplish some goofy layout, I'm using the smallest thing available. Especially if the widget in question is something I'm inserting dynamically and I want to keep it in a string without spaces and still be able to sort of recognize what's in it without needing four monitors just to view the whole thing at once. Again, if Something Better exists, I always do that first. But there are times when nothing you can do is going to be semantic, screen-reader friendly, or anything you'd want your mother finding out you'd done.
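One way to picture this trick, as a sketch and not the code behind any real widget: a repeated row built as a compact string, with <b> and <i> acting purely as hooks for child selectors. The renderRow and escapeHtml helpers, the ep class name, and the CSS in rowCss are all hypothetical names invented for this example.

```javascript
// Hypothetical escaping helper so that dynamic text can't break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical widget row: <b> and <i> are just the shortest available
// wrappers here, not a request for bold or italic text.
function renderRow(title, date) {
  return '<li class="ep"><b>' + escapeHtml(title) + '</b><i>' +
    escapeHtml(date) + '</i></li>';
}

// Assumed companion CSS: it cancels the default bold/italic rendering and
// uses the two tags only as layout hooks inside the row.
const rowCss =
  '.ep b{font-weight:normal;float:left;width:70%}' +
  '.ep i{font-style:normal;float:right}';

console.log(renderRow('Some Episode', 'some air date'));
```

The whole row still fits on one line, and you can still more or less read what's in it, which is the entire appeal.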
One of the most beautiful things about the web is its imperfections. We don't write firmware for missile guidance systems; we write applications that let you pretend to steal your friends' cartoon cows. There are places for flawlessness and obsessive semantic control, but this isn't one. When you can write elegant code, it's awesome, and I think most of the time everyone tries to. But it's important to realize that at the end of the day, this is all kind of a series of epic hacks, and sometimes the most sane thing to do is just embrace it.