Detecting And Exploiting Semantic Overlap

Hitaext FAQ

What is the status of Hitaext?

Hitaext has been successfully used to align news articles, paragraphs, and sentences during the development of the Daeso corpus, a monolingual (Dutch) parallel treebank of over 1 million words. Although some minor issues remain and the documentation is incomplete, we think it is stable enough for release to a wider audience. In the spirit of open source software, we hope that feedback from other users will help us improve Hitaext further.

What is the support for Hitaext?

Hitaext will be supported for the duration of the Daeso project (until October 2009), after which support will be taken over by the TST centrale. Support means we will do our best to provide help, fix bugs, and implement requested features. However, keep in mind that Hitaext is academic open source software, so we cannot offer support at a commercial level.

Can Hitaext handle any XML markup?

Hitaext can read any syntactically well-formed XML document. It does not care about DTDs, schemas, or other definitions, and it will not attempt to validate the documents.
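
To see whether a document meets this requirement before loading it, a well-formedness check without validation is enough. The sketch below uses Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the sample element names are assumptions made for illustration, and this is not necessarily how Hitaext itself reads documents.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML.

    Hypothetical helper for illustration. Only the syntax is checked;
    no DTD or schema validation takes place.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Well-formed, even though no DTD or schema declares <article>, <p> or <s>.
ok = is_well_formed("<article><p><s>Een zin.</s></p></article>")

# Not well-formed: <p> is still open when </article> is reached.
bad = is_well_formed("<article><p><s>Een zin.</s></article>")
```

A document that passes this kind of check is syntactically acceptable, whatever element names it uses.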

Hitaext assumes inline markup, where the text is contained in the elements that annotate it. It cannot handle standoff annotation, which uses pointers or offsets to link text and markup stored in separate files.
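
The difference between the two styles can be illustrated with a small sketch. The element names, the id, start and end attributes, and the helper element_texts are all assumptions made for this example; they are not taken from Hitaext or from any particular annotation scheme.

```python
import xml.etree.ElementTree as ET

# Inline markup: the text sits inside the elements that annotate it.
inline_doc = "<p><s id='s1'>Eerste zin.</s> <s id='s2'>Tweede zin.</s></p>"

# Standoff annotation: text and markup are kept apart, and the markup
# points into the text by character offsets (hypothetical attributes).
standoff_text = "Eerste zin. Tweede zin."
standoff_markup = (
    "<p><s id='s1' start='0' end='11'/>"
    "<s id='s2' start='12' end='23'/></p>"
)


def element_texts(xml_text: str, tag: str) -> dict:
    """Map each element's id to the text found inside that element."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): "".join(el.itertext()) for el in root.iter(tag)}


# With inline markup, each sentence element carries its own text.
inline_texts = element_texts(inline_doc, "s")

# With standoff markup, the same elements are empty; their text can only
# be recovered by resolving the offsets against the separate text.
standoff_texts = element_texts(standoff_markup, "s")
```

A tool that reads text directly from the elements finds the sentences in the inline document, but only empty elements in the standoff one.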