text 2 HTML without the lock down

In his post The CMS and inline HTML, Jay Allen points to an issue I have been giving some thought recently as I consider switching from Textile to Markdown.

Both of these tools are written to speed up the process of writing HTML by removing the need to type those repetitive tags. Markdown puts more emphasis on the readability of its shorthand, but at this point doesn’t have all the bells and whistles of Textile. There are alternatives out there that I have not looked into; these two just happen to integrate nicely as formatting options into Movable Type.

The lock down Jay writes about highlights the need for tools that automatically translate HTML to our shorthand of choice. The shorthand scripts themselves provide a means to create HTML from the shorthand my entries were originally written in, but not back again. A means to translate efficiently in the opposite direction would make switching from one shorthand language to another a 3 step process: Shorthand-A to HTML to Shorthand-B, or vice versa.
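A minimal sketch of that Shorthand-A to HTML to Shorthand-B route, in Python. The two converters here are toy stand-ins I made up for illustration: they only handle strong and emphasis markup, and they are not the real Textile or Markdown implementations. The point is just that once an HTML-to-shorthand function exists, switching is a matter of chaining two calls.

```python
import re


def textile_to_html(text):
    """Toy Textile subset (assumption): only *strong* and _emphasis_."""
    text = re.sub(r"\*(.+?)\*", r"<strong>\1</strong>", text)
    return re.sub(r"_(.+?)_", r"<em>\1</em>", text)


def html_to_markdown(html):
    """Toy inverse for the same two tags, emitting Markdown shorthand."""
    html = re.sub(r"<strong>(.+?)</strong>", r"**\1**", html)
    return re.sub(r"<em>(.+?)</em>", r"*\1*", html)


def switch_shorthand(text, to_html, from_html):
    """Shorthand-A -> HTML -> Shorthand-B, with both converters passed in."""
    return from_html(to_html(text))


entry = "A *bold* claim and an _emphatic_ one."
html = textile_to_html(entry)
markdown = switch_shorthand(entry, textile_to_html, html_to_markdown)

print(html)
print(markdown)
```

Passing the converters in as arguments keeps the pipeline indifferent to which shorthand sits at either end, so going the other way is the same call with a different pair of functions.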

For text that is ultimately only ever going to be viewed as HTML, maybe there is no need for the last step. Freedom in this scenario would be provided by a means to translate entries to HTML within the Movable Type database: something like a smart search and replace that switches off the automatic formatting for each entry once it is converted. Readability of the entry would only be hampered in the rare event of revisiting an entry for editing.
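Roughly what I have in mind, sketched in Python against an in-memory SQLite database. Everything named here is an assumption for illustration: the `entries` table, its `body` and `text_filter` columns, and the toy converter are hypothetical and do not reflect Movable Type's actual schema or the real Textile script.

```python
import re
import sqlite3


def toy_textile_to_html(text):
    # Stand-in converter (assumption): handles *strong* only.
    return re.sub(r"\*(.+?)\*", r"<strong>\1</strong>", text)


def freeze_entries(conn, filter_name, to_html):
    """Render each entry that uses `filter_name` to HTML, then switch
    its automatic formatting off so it is not converted a second time."""
    rows = conn.execute(
        "SELECT id, body FROM entries WHERE text_filter = ?", (filter_name,)
    ).fetchall()
    for entry_id, body in rows:
        conn.execute(
            "UPDATE entries SET body = ?, text_filter = 'none' WHERE id = ?",
            (to_html(body), entry_id),
        )
    conn.commit()
    return len(rows)


# Hypothetical schema, not Movable Type's real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, body TEXT, text_filter TEXT)"
)
conn.executemany(
    "INSERT INTO entries (body, text_filter) VALUES (?, ?)",
    [("A *bold* claim.", "textile"), ("<p>Already HTML.</p>", "none")],
)

converted = freeze_entries(conn, "textile", toy_textile_to_html)
result = conn.execute(
    "SELECT body, text_filter FROM entries ORDER BY id"
).fetchall()
print(converted, result)
```

Because the filter flag is flipped in the same pass, running the function again finds nothing left to convert, which is what makes it safer than a blind search and replace.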

I’m not aware of an easy way of making these in-situ transformations, but since I first made some notes towards this entry, some tools have come to my attention which address this need for flexibility. Firstly, Aaron Swartz, John Gruber’s co-conspirator on Markdown, has made a Python script (html2text) that converts:

a page of HTML into clean, easy-to-read plain ASCII. Better yet, that ASCII also happens to be valid Markdown…

The script is easily applied with BBEdit or to HTML displayed in my browser with the provided favelet.

The existence of this tool for Markdown is in keeping with its goal of making plain text and HTML better friends.

Secondly, Dean Allen, Brad Choate’s co-conspirator on Textile, mentions Detextile:

Semiholy grail 2: Alongside Textile now is Detextile, an inverted mirror of its conversion algorithms. This was the final piece of the puzzle. Now both text and HTML versions of articles are synchronized: make a change in one, it shows up in the other. I expect for most people this’ll never be needed, but to the markup-obsessed it should prove useful.

Unfortunately, this appears to be specific to Textpattern. Maybe the next step would allow me to make these transformations to my Movable Type database directly.