HTMLTidy is one of those wonderfully efficient little tools in the *nix tradition. Like the other tools of this type, it does only one thing, but does it well. It differs from most of the others in one respect though - its name clearly describes what it does: tidy up messy HTML.
Have you ever worked on a project that involved editing HTML pages from a variety of sources? Ever had to work with an MSWord document saved as HTML? What about the chunderous mess that's spewed from some of those WYSIWYG web tools? Or maybe you create your pages the
Tidy fixes a number of common, and not so common, mistakes in HTML files. It does this by analyzing the markup in a file and comparing it to the HTML 4.01 specification. Depending on the options you specify, Tidy can fix the problems it finds or it can generate a log detailing the errors.
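As a rough illustration of those two modes, here is a minimal Python sketch that builds a command line for either reporting or fixing. It assumes a `tidy` executable is on the PATH and relies on the command-line switches `-q` (quiet), `-e` (report errors and warnings only), `-m` (modify the file in place) and `-f` (write the report to a file). The helper names `tidy_command` and `run_tidy` are made up for this example.

```python
import shutil
import subprocess


def tidy_command(path, fix=False, log_path=None):
    """Build a tidy command line for one HTML file.

    fix=False -> only report problems (-e)
    fix=True  -> rewrite the file in place (-m)
    log_path  -> if given, send the error report to that file (-f)
    """
    cmd = ["tidy", "-q"]
    cmd.append("-m" if fix else "-e")
    if log_path is not None:
        cmd += ["-f", log_path]
    cmd.append(path)
    return cmd


def run_tidy(path, fix=False, log_path=None):
    """Run tidy if it is installed; return its exit status, or None if absent."""
    if shutil.which("tidy") is None:
        return None
    result = subprocess.run(
        tidy_command(path, fix, log_path),
        capture_output=True,
        text=True,
    )
    return result.returncode
```

For example, `run_tidy("page.html", log_path="errors.txt")` would leave the page untouched and write the report to `errors.txt`, while `run_tidy("page.html", fix=True)` would rewrite the page. Tidy's exit status is normally 0 for a clean document, 1 when there were warnings and 2 when there were errors, but check this against the version you have installed.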
The range of problems Tidy can fix is impressive. It can add missing or mismatched end tags, correct tags that are in the wrong order, insert quotes around attributes, and even add a missing > to a tag. One of the few things Tidy can't do is add SUMMARY
Features:
* Suggests fixes and improvements for common errors found in HTML, XHTML and XML documents.
* Check multiple documents through the Batch Action Wizard.
* Ability to read settings from a default Tidy config file [new].
* Convert documents to XHTML and XML formats.
* Upgrade FONT tags to style sheets.
* Remove optional end tags.
* Indent / beautify tags, attributes and/or content.
* Change tags and/or attributes to uppercase or lowercase.
* Strip surplus tags in HTML documents generated using Word.
* Check for accessibility.
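Several of the features in the list correspond to options that Tidy itself reads from a config file, one `name: value` pair per line. The sketch below writes such a file from a Python dict. The selection of options is only an illustrative assumption, the helper name `tidy_config` is made up, and the option names should be checked against the quick reference for your Tidy version.

```python
def tidy_config(options):
    """Render a dict of option names and values as Tidy config-file text."""
    lines = []
    for name, value in options.items():
        # Tidy config files spell booleans as yes/no.
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical selection mirroring a few of the listed features.
EXAMPLE_OPTIONS = {
    "output-xhtml": True,     # convert documents to XHTML
    "clean": True,            # replace FONT and similar tags with style rules
    "indent": "auto",         # indent / beautify the markup
    "uppercase-tags": False,  # keep tag names lowercase
    "word-2000": True,        # strip surplus markup from Word-generated HTML
}

if __name__ == "__main__":
    print(tidy_config(EXAMPLE_OPTIONS), end="")
```

Saved as, say, `my.cfg`, the result could be passed to the command-line tool with `tidy -config my.cfg page.html`.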
The documentation would have to improve considerably to be atrocious, the module has some head-scratching limitations, and it is slow.
Installation on Darwin 8.5.0/perl 5.8.6 was a nightmare of dependency resolution, whether by hand or by cpan. The author might have mentioned in the instructions that the htmltidy source and headers have to be present before installation. While the documentation does say to "tell the makefile that you're using ranlib", that convoluted set of instructions doesn't actually address the problem I had.
That aside, on
Web Developers
If you are comfortable using Tidy and the command prompt already, all you need is this exe. If you are a beginner you may want this installer that includes Dave Raggett's overview and a quick reference to all of Tidy's options.
Getting Started
New to Tidy? Install one of the packages above, then open up a command (DOS) window. If you're under Windows 2000/XP, you can find this under Start->All Programs->Accessories->Command Prompt. When it opens, run tidy -h to get a quick and dirty reference on how to use it. You can also
HTMLtidy (otherwise known as Tidy) is an Amiga port of Dave Raggett's Tidy, which is a program designed to tidy up your HTML source files. It can adjust the layout of the source to make it easier for you to read, wrapping lines and indenting tables and lists etc.
However, that is not all that HTMLtidy does - it can also fix some of the more common HTML mistakes, including
* Missing or unmatched tags;
* End tags in the wrong order;
* Missing quotes round attributes;
* Missed / in end tags;
* Missing > closing tags;
and many
I’ve finally enabled a subset of HTML in my comments. In doing so, I had several requirements that needed to be fulfilled:
1. Entered markup must be valid XHTML Strict, to stop comments from breaking validation and keep things nice and tidy.
2. No presentational markup! I want to maintain control over how things look via my stylesheets—comments posted should only be able to use structural HTML elements.
3. Attributes should be restricted to those that add semantic meaning. JavaScript event attributes and CSS-related attributes should not be
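To make requirements 2 and 3 concrete, here is a minimal whitelist filter sketched with Python's standard `html.parser` module. It is not the mechanism used on this site. The `ALLOWED` table, the `CommentFilter` class and the `filter_comment` function are all hypothetical names chosen for the example. It drops any tag or attribute that is not on the list but keeps the text inside dropped tags. It does not by itself guarantee valid XHTML Strict (requirement 1) and does not check `href` values, so a real comment system would need more than this.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: structural elements mapped to the
# semantic attributes allowed on each one.
ALLOWED = {
    "p": set(), "em": set(), "strong": set(), "code": set(),
    "blockquote": {"cite"}, "a": {"href", "title"},
    "ul": set(), "ol": set(), "li": set(),
}


class CommentFilter(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # presentational or unknown element: drop the tag itself
        kept = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name in ALLOWED[tag]  # event and style attributes fall out here
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always kept, with markup characters escaped.
        self.out.append(escape(data, quote=False))


def filter_comment(markup):
    """Return the comment markup with non-whitelisted tags/attributes removed."""
    parser = CommentFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

The filtered output could then be run through Tidy to balance and nest the remaining tags before it is stored.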