[LINK] This is enough to make any e-librarian's head spin
Craig Sanders
cas@taz.net.au
Sat Nov 23 17:19:54 EST 2002
On Sat, Nov 23, 2002 at 03:08:58PM +1100, Rick Welykochy wrote:
> A challenge to any Linkers who wish to try:
> -----------------------------------------------------------------------------
> Download the 50 KB document from
> <http://central.dot.net.au/~rick/STROUST1.doc>,
> convert it to plain text or HTML.
i wasn't able to load it in abiword, openoffice, or kword.
the command-line tool catdoc had no problem with it, and happily
converted it to both plain text and TeX format - from TeX, it is an easy
step to html or many other formats.
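
for anyone who wants to script the catdoc step, here is a minimal sketch in
python. it assumes catdoc is installed and on the PATH, and that its -a
(plain text) and -t (TeX) output switches behave as the man page describes.
the function names and the STROUST1.doc filename in the usage note are only
illustrative, not part of catdoc itself.

```python
import shutil
import subprocess


def build_catdoc_command(doc_path, tex=False):
    """Return the argument list for one catdoc run.

    Assumption: -a selects plain text output and -t selects TeX output,
    as documented in the catdoc man page.
    """
    fmt_flag = "-t" if tex else "-a"
    return ["catdoc", fmt_flag, doc_path]


def convert(doc_path, tex=False):
    """Run catdoc on doc_path and return the converted text.

    Returns None when catdoc is not installed, so the caller can fall
    back to another converter (word2x, for example).
    """
    if shutil.which("catdoc") is None:
        return None
    result = subprocess.run(
        build_catdoc_command(doc_path, tex),
        capture_output=True,  # catdoc writes the converted text to stdout
        text=True,
        check=True,           # raise CalledProcessError if catdoc fails
    )
    return result.stdout
```

calling convert("STROUST1.doc") would give the plain text version, and
convert("STROUST1.doc", tex=True) the TeX version, provided the file has
been downloaded into the current directory first.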
word2x also managed to convert it to both plain text and LaTeX with no
hassles.
i can't find it now (or even remember what it's called) but i've got
another one which will convert direct from MSWord to HTML. i've used it
in the past.
> If you can do this, please tell me the software and version of same
> you used to accomplish this task :)
Package: catdoc
Version: 0.91.5-1
Description: MS-Word to TeX or plain text converter
This program extracts text from MS-Word files, trying to preserve
as many special printable characters as possible. catdoc supports
everything up to Word-97.
.
It doesn't even try to preserve fancy Word formatting, because
Word users usually don't care about document structure, and it is
this very thing which is important to LaTeX users.
.
Also provided is xls2csv, which extracts data from Excel spreadsheets
and outputs it in comma-separated-value format.
.
This package suggests tk because it also includes wordview, an
optional Tk-based GUI for catdoc. The MIME config provided in this
package will use wordview if X is running, or catdoc directly if it
is not.
Package: word2x
Version: 0.005-4.1
Description: Translates Word files into ascii text or LaTeX
Takes Word files and transforms them into ascii text or LaTeX
craig
ps: i've commented many times on the fact that using proprietary
document formats is in direct contradiction with the various record
keeping rules that all levels of government are required to adhere to.
IMO, it's one of the reasons why govts should never use MS Word or store
any other data in proprietary document/data formats. records must be
kept for much longer than the lifecycle of any software product.
--
craig sanders <cas@taz.net.au>
Fabricati Diem, PVNC.
-- motto of the Ankh-Morpork City Watch