Saturday, April 19, 2008

Thoughts on (1 of 2)

The FreeBSD website has been based on XML/SGML structured content since index.sgml was first checked into CVS in 1996. Since then we've realized a number of benefits of this emphasis on structured content, but there is a cost to this setup and I'm not convinced we're getting the full benefit of our infrastructure. First, consider what we gain by our current setup:

  • XSLT stylesheets and CSS to separate content from presentation, so new technical pages can be added quickly in the same look and feel as the rest of the website.

  • Dynamic generation of HTML pages from XML files of events and other structured data, so that we can always have a list of "upcoming" and "past" events that doesn't need to be manually updated once the event has occurred.

  • Text-based changes can be kept in the same revision control system we use for code. This lowers the barrier for our developers to make website contributions, and it lets our translators easily see the textual differences in the English-language pages that need to be translated.

  • Complete change history, allowing us to borrow content from previous revisions for cyclical activity like releases and Summer of Code.

  • RSS feeds automatically generated for most XML content on the site, allowing the content to be syndicated to other sites or read more conveniently through feedreaders or mobile devices.

  • Easy distribution of the source for the website, allowing mirrors all over the world to host copies of the content.
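
As a rough illustration of the "upcoming" and "past" events point above, here is a minimal Python sketch of the idea. It is not how the site is actually built (that is done with XSLT stylesheets), and the element names, event names, and dates are all made up for the example; the real events file may be structured quite differently.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical event data. The element names, event names, and dates are
# illustrative assumptions, not the schema or content of the real site.
EVENTS_XML = """
<events>
  <event id="example-a">
    <name>Example Conference A</name>
    <enddate>2008-03-30</enddate>
  </event>
  <event id="example-b">
    <name>Example Conference B</name>
    <enddate>2008-05-17</enddate>
  </event>
</events>
"""


def split_events(xml_text, today):
    """Return (upcoming, past) lists of event names relative to `today`."""
    root = ET.fromstring(xml_text)
    upcoming, past = [], []
    for event in root.findall("event"):
        name = event.findtext("name")
        end = date.fromisoformat(event.findtext("enddate"))
        # An event counts as upcoming until its end date has passed.
        if end >= today:
            upcoming.append(name)
        else:
            past.append(name)
    return upcoming, past


if __name__ == "__main__":
    upcoming, past = split_events(EVENTS_XML, date.today())
    print("Upcoming:", upcoming)
    print("Past:", past)
```

Because the split depends only on the date at build time, nobody has to move an entry by hand once the event is over; regenerating the page is enough.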

Despite these advantages, there are also costs associated with this system.

  • Communication is entirely unidirectional. There is no way for users of the website to add comments or additions to newsflash stories, to attach photos to events on the events page, or to comment on the utility of feature requests on the ideas page.

  • Manual editing of XML files is tedious and error-prone compared to using a web-based content management system. The XML files, and especially the stylesheets, can be daunting to those who haven't spent significant time working with these technologies, which presents a barrier to more frequent updates and more regular improvements to the site.

I've been thinking recently about these deficiencies and how we can address them while still making the most of our existing presence on the web. I will follow up on this post with some specific ideas.
