
"Antiword is a free MS Word reader for Linux and RISC OS. There are ports to FreeBSD, BeOS, OS/2, Mac OS X, Amiga, VMS, NetWare, Plan9, EPOC, Zaurus PDA, MorphOS, Tru64/OSF and DOS. Antiword converts the binary files from Word 2, 6, 7, 97, 2000, 2002 and 2003 to plain text and to PostScript."

- http://www.winfield.demon.nl/



There is an Apache project, POI, that works across all of the binary MS Office formats.

http://poi.apache.org/


Yeah, that'll work too. The point is that you need to leverage someone else's work to do it. Focus on your core, find shortcuts for everything else.


Well, given the release of these documents, as well as the existence of the Office Open XML format, there's nothing left to reverse engineer.

Granted, it's no picnic implementing the specs these documents outline, but it's certainly better than having to figure it all out from a binary file.


Well, it is a picnic: a picnic in the park. Jurassic Park, that is.



