...I've finally done what I should have done a long time ago and run the current parts of the site through HTML Tidy. Most of the legacy stuff, whose markup was written years ago, should now be valid HTML 4.01 Transitional. My aim is to move over gradually to an XHTML standard in future, and to go back and put in more descriptive alt text and/or titles for images where appropriate.
For those wondering what HTML 4.01 and XHTML are, or what the difference is, they're languages which describe pages on the web. HTML stands for 'HyperText Markup Language', and it has a version number because the language has changed over time. Keeping it simple, HTML 3.2 had fewer commands (tags, properly speaking) than HTML 4.01 and some of the syntax was different. The idea is that if people label their pages with the version of HTML they were written in (and then write them correctly, of course) then web browsers such as Firefox and Internet Explorer will continue to render them as their authors intended even if the commands and syntax change in later versions.
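To make "labelling a page with its version" concrete, here is a minimal sketch of a page declared as HTML 4.01 Transitional. The label is the DOCTYPE declaration on the first line; the title and paragraph text are just placeholders of mine, not anything from this site.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <!-- Placeholder title; HTML 4.01 requires a title element. -->
  <title>Example page</title>
</head>
<body>
  <!-- Everything below should follow the rules of the version named above. -->
  <p>This page says it is HTML 4.01 Transitional, and is written that way.</p>
</body>
</html>
```

An XHTML page starts with a different DOCTYPE naming the XHTML version instead, and a validator or HTML Tidy can then check the rest of the page against whichever version was declared.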
In particular, there's an emphasis in more recent versions of HTML and XHTML on separating the presentation of pages from their content and structure. What this means is that rather than writing into your pages that you want big red text in an italic font every time you want that, you can write that in one place and tell your pages to reference that place. It's then easy to change in future, as well as keeping file sizes smaller, helping those with poor sight or other disabilities to make the display they see suit them, etc. This is the concept of Cascading Style Sheets, often abbreviated to CSS.
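As a rough illustration of the big red italic text example, the sketch below shows the old inline approach next to the style sheet approach. The class name "warning" and the exact sizes are made up for the example, and I've put the style rules in the page's head to keep it self-contained.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>Style sheet sketch</title>
  <style type="text/css">
    /* Written once; every element with class="warning" picks it up. */
    .warning {
      color: red;
      font-size: 150%;
      font-style: italic;
    }
  </style>
</head>
<body>
  <!-- Old approach: the presentation is spelled out every time. -->
  <p><font color="red" size="5"><i>Big red italic text</i></font></p>

  <!-- Style sheet approach: the markup only names the role. -->
  <p class="warning">Big red italic text</p>
</body>
</html>
```

On a real site the rules would more usually live in a separate .css file that each page references with a link element in its head, so that changing the one file restyles every page at once.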
It's not quite that simple, particularly for those of us who've been writing HTML for five years or more. It's like everything else: if you get into bad habits early on (or ideas about how things should be done change), it can take a fair amount of time and effort to make changes. There's also the fact that the relative complexity of recent versions of HTML can baffle newcomers a little, encouraging them to learn by writing in older versions and then, later on, to mix in features of current versions, ending up with code which is a confused mess. This is one of the main reasons many websites display strangely or break when viewed in different browsers, or even in different versions of the same browser.
It doesn't help that one of the most widely used browsers (Internet Explorer) is also the worst written when it comes to interpreting different versions of HTML. However, even IE is capable of displaying most pages correctly as long as people label their pages to tell it which language version is being used, and then write in that version. :)