Wednesday, June 27, 2007

Websites as graphs

Every day, we look at dozens of websites. The structure of these websites is defined in HTML, the lingua franca for publishing information on the web. Your browser's job is to render the HTML according to the specs (most of the time, at least). You can look at the code behind any website by selecting the "View source" option somewhere in your browser's menu.

HTML consists of so-called tags, like the A tag for links, IMG tag for images and so on. Since tags are nested in other tags, they are arranged in a hierarchical manner, and that hierarchy can be represented as a graph. I've written a little app that visualizes such a graph, and here are some screenshots of websites that I often look at.
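To make the idea of "tags nested in tags" concrete, here is a minimal sketch of how such a graph could be extracted from a page. It is not the code behind my app; it assumes Python's standard `html.parser` module, and the names `TagGraphBuilder`, `build_tag_graph` and `VOID_TAGS` are made up for this illustration. Every tag becomes a node, and every parent-child nesting becomes an edge.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so they are never kept "open".
# (Illustrative list, an assumption of this sketch.)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "param", "source", "track", "wbr"}


class TagGraphBuilder(HTMLParser):
    """Collects one node per tag and one edge per parent/child nesting."""

    def __init__(self):
        super().__init__()
        self.nodes = []  # node id -> tag name
        self.edges = []  # (parent id, child id)
        self.stack = []  # ids of the tags that are currently open

    def _add_node(self, tag):
        node_id = len(self.nodes)
        self.nodes.append(tag)
        if self.stack:
            # The innermost open tag is the parent of the new node.
            self.edges.append((self.stack[-1], node_id))
        return node_id

    def handle_starttag(self, tag, attrs):
        node_id = self._add_node(tag)
        if tag not in VOID_TAGS:
            self.stack.append(node_id)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: a node, but never a parent.
        self._add_node(tag)

    def handle_endtag(self, tag):
        # Close the nearest matching open tag; ignore stray closing tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.nodes[self.stack[i]] == tag:
                del self.stack[i:]
                break


def build_tag_graph(html):
    """Return (nodes, edges) for the tag hierarchy of an HTML string."""
    builder = TagGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges
```

Calling `build_tag_graph(page_source)` returns the list of tag names and the list of edges; `len(nodes)` is one way to arrive at a tag count like the ones quoted for the sites in this post. Real-world pages are often malformed, so a sketch like this only approximates what a browser would build.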

I've used color to indicate the most frequently used tags, in the following way:

blue: for links (the A tag)
red: for tables (TABLE, TR and TD tags)
green: for the DIV tag
violet: for images (the IMG tag)
yellow: for forms (FORM, INPUT, TEXTAREA, SELECT and OPTION tags)
orange: for line breaks, paragraphs and blockquotes (BR, P, and BLOCKQUOTE tags)
black: the HTML tag, the root node
gray: all other tags
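
The color scheme above boils down to a simple lookup table. The following is a small sketch, again in Python and again only an illustration rather than the app's actual code; the names `TAG_COLORS` and `color_for_tag` are hypothetical.

```python
# Tag name -> node color, following the legend above.
TAG_COLORS = {
    "a": "blue",
    "table": "red", "tr": "red", "td": "red",
    "div": "green",
    "img": "violet",
    "form": "yellow", "input": "yellow", "textarea": "yellow",
    "select": "yellow", "option": "yellow",
    "br": "orange", "p": "orange", "blockquote": "orange",
    "html": "black",
}


def color_for_tag(tag):
    """Return the node color for a tag name; unknown tags are gray."""
    return TAG_COLORS.get(tag.lower(), "gray")
```

For example, `color_for_tag("TD")` gives `"red"`, while a tag that is not in the table, such as `span`, falls back to `"gray"`.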

Here are a couple of screenshots.

Update: Here it is:

CNN has the complicated but typical tag structure of a portal: lots of links, lots of images, and about equal use of divs and tables for layout purposes. (1316 tags)


boingboing, my favorite blog, has a very simple tag structure: there seems to be one essential container that contains all other tags, essentially links (lots!), images, and tags to lay out the text. A typical content-driven website. (1056 tags)


As always, simplicity rules at Apple's website. A few images and links, that's it. Note the large yellow cluster, representing a dropdown menu. (350 tags)


Yahoo seems to be stuck in the old days of HTML style: most of the tags are tables, used for layout - no divs. Very uncommon these days. (952 tags)


The complete opposite of Yahoo - this site uses almost no tables at all, only divs (green). It's nice to see how the div tags are holding the other elements, like links and images, together. (454 tags)


Surprisingly, at least to me, Microsoft's portal is very much div-driven. Also of note is its very sparse use of images. (633 tags)


Today, Google is everywhere, but if somebody had asked me 5 years ago why I was using Google, and wanted a visual answer, here it is (88 tags):


I finish with two of my own projects:

What can I say? I like it ;-) No tables, lots of links, simple structure. A typical Movable Type site, I guess. (372 tags)


My personal art project. Although I programmed the site myself, I'm surprised by the simplicity of its tag structure. It shows that you can make beautiful websites with just a few tags ;-) (88 tags)


That's it. You can play around with the app, and take a fresh look at websites - here's the applet.