Converting Word to ThML

The conversion from Microsoft Word to an eventual XML document takes several steps. First, the document is converted to XML using the Transdoc DTD. This is done with the free rtf2xml package (v0.4 or v0.5).

Next, these XML documents are converted to SGML conforming to the ThML DTD by custom Perl scripts. The scripts are development-level: they are not entirely finished and some bugs are known to exist, but they have successfully converted several documents. To keep each script small and manageable, I split the work into several smaller jobs:

You can download these Perl scripts as a zip file:

Some jobs that remain: