Structure and Presentation
Jan. 27th, 2005 01:36 pm
Recently I have been working on my website, trying to convert it to PmWiki2 and bring it up to snuff, so that I have something nice to show off to potential employers. In the process, I found that the Nethack spoilers (a 150+ page Word document) are broken. All is not lost, because Word can open it with the repair option, but I'll still have to reformat it, because I've lost the nethack.dot file that held a lot of the styles used. Now I have to waste a lot of time fixing this, or drop this piece from my portfolio, which I really don't want to do, because it's my only book-length project.
Writing standards-compliant XHTML and CSS has taught me to write documents in a way that separates structure from presentation. And here I have this gigantic Word document to deal with, which does nothing of the sort. The web standards approach is the opposite of Word's: Word is a giant toolkit for making your documents look the way you want them to. Structure is handled entirely in presentation! If I use "clear formatting" on a heading, it turns back into body text. Word's approach only works for short documents, probably fewer than 3 pages. Beyond that, inconsistency will creep in despite the best of intentions. Even if you use Word's Styles feature, sooner or later you're going to slip, and some bit of ordinary-looking text turns out to be Something Else with its distinctive presentation altered. It's another type of lock-in.
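To make the distinction concrete, here is a minimal sketch in Python, not tied to any particular tool. The block kinds, the sample text, and the function names `to_html` and `to_text` are all made up for illustration. The point is only that the document records what each piece of text is, and each output decides separately how it should look.

```python
from html import escape

# Structure only: each block says what the text *is*, not how it looks.
# The block kinds ("heading", "para") and the sample text are
# illustrative assumptions, not taken from any real document.
DOCUMENT = [
    ("heading", "Chapter One"),
    ("para", "Body text & more."),
]


def to_html(blocks):
    """Render the structure as HTML; appearance is left to a stylesheet."""
    tags = {"heading": "h2", "para": "p"}
    return "\n".join(
        f"<{tags[kind]}>{escape(text)}</{tags[kind]}>" for kind, text in blocks
    )


def to_text(blocks):
    """Render the same structure as plain text, underlining headings."""
    lines = []
    for kind, text in blocks:
        if kind == "heading":
            lines += [text, "=" * len(text), ""]
        else:
            lines += [text, ""]
    return "\n".join(lines).rstrip()


if __name__ == "__main__":
    print(to_html(DOCUMENT))
    print()
    print(to_text(DOCUMENT))
```

Changing how headings look means editing one renderer (or the stylesheet behind it), and the heading is still a heading afterwards. That is exactly what "clear formatting" in Word cannot promise.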
So here I am with this huge document, where what I really want to do is rewrite the stylesheet... but I can't, because this is Word, not HTML. If I "clear formatting", I also "clear semantics and structure". Not what I want. I'm trying to figure out how best to deal with this, and I think I'm going to turn it into a wiki. That will allow for the kind of structure and linking that I want, and I'll probably be able to work out decent presentation. Maybe I'll even be able to give the project away to the Nethack community, so that my work won't go to waste and (hopefully) I won't have to host or maintain it.
Maybe I should start putting everything that isn't a complex creation into a wiki. That still leaves the problem of finding a better text/thought processor for the complex creations. What I want:
- single-source, multiple-output documents: export to .doc, .rtf, .html, .txt, .pdf, or more as required, with as little loss of structure and presentation as possible in the output medium.
- able to format for print, screen, or both, preferably with little fuss.
- encourages me to add structure and semantics to documents.
- separates structure from presentation.
- allows me to control presentation tightly if I so choose, by creating my own stylesheets, and supports this in the UI.
- stored in an open format so that my data is not locked in forever.
- not overly difficult to learn. I'm willing to invest time in this, but it should not take days of dedicated work to get the basics.
- doesn't make my document any more complicated than necessary.
- runs quickly and stably -- preferably on WinXP, or Cygwin-with-X if necessary. (Buying a Mac just isn't in the cards right now.)
Any recommendations?