Algraeph is a tool for manual alignment of linguistic graphs, such as phrase structure trees or dependency structures, where each node corresponds to a subsequence of the analyzed input sentence. It allows you to express the similarity between two graphs by aligning their nodes and attaching relation labels to these alignments.
Graphs are read from one or more graphbanks (or treebanks). Algraeph currently supports graphs in the general GraphML format and in the Alpino format (for Dutch). Alignment relations are user-defined. The alignments are stored in a simple XML format, which can be used for further processing. The result, a parallel graph corpus, is a useful data set for many tasks in computational linguistics and natural language processing, such as automatic summarization, automatic translation, paraphrase extraction, and recognizing textual entailment.
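As a rough illustration of the kind of data involved, the sketch below reads two toy GraphML trees with Python's standard library and records labeled node alignments in a dictionary. It is not Algraeph's own code or file format: the helper names (`read_graphml`, `make_tree`, `align`), the node ids, and the in-memory representation of alignments are all assumptions made for this example. Only the GraphML namespace and the "equals" relation come from existing sources.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace; everything else below is hypothetical.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_graphml(text):
    """Return (node ids, edges) of the first <graph> in a GraphML string."""
    graph = ET.fromstring(text).find("g:graph", GRAPHML_NS)
    nodes = [n.get("id") for n in graph.findall("g:node", GRAPHML_NS)]
    edges = [(e.get("source"), e.get("target"))
             for e in graph.findall("g:edge", GRAPHML_NS)]
    return nodes, edges


def make_tree(graph_id, root, leaves):
    """Build a toy GraphML document: one root node dominating the leaves."""
    node_xml = "".join('<node id="%s"/>' % i for i in [root] + leaves)
    edge_xml = "".join('<edge source="%s" target="%s"/>' % (root, leaf)
                       for leaf in leaves)
    return ('<graphml xmlns="http://graphml.graphdrawing.org/xmlns">'
            '<graph id="%s" edgedefault="directed">%s%s</graph></graphml>'
            % (graph_id, node_xml, edge_xml))


def align(alignments, from_nodes, to_nodes, source, target, relation, relations):
    """Record a labeled alignment after checking the nodes and the relation."""
    if source not in from_nodes or target not in to_nodes:
        raise ValueError("unknown node: %r or %r" % (source, target))
    if relation not in relations:
        raise ValueError("unknown relation: %r" % relation)
    alignments[(source, target)] = relation


# User-defined set of alignment relations (here just one).
RELATIONS = {"equals"}

en_nodes, en_edges = read_graphml(make_tree("en", "np", ["spam", "and", "eggs"]))
nl_nodes, nl_edges = read_graphml(make_tree("nl", "np", ["smac", "en", "eieren"]))

# Alignments map a (source node, target node) pair to a relation label.
alignments = {}
for source, target in [("np", "np"), ("spam", "smac"),
                       ("and", "en"), ("eggs", "eieren")]:
    align(alignments, en_nodes, nl_nodes, source, target, "equals", RELATIONS)
```

Keying the dictionary on node pairs keeps each alignment tied to exactly one relation label, which matches the idea of attaching a label to every aligned pair of nodes.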
Algraeph is implemented in the Python programming language using the wxPython GUI toolkit. It has been tested on Mac OS X, GNU/Linux and MS Windows, but should run on any platform that is supported by Python, wxPython and Graphviz.
A screenshot of an Algraeph window (under Mac OS X) which contains two simple phrase structure trees. The tree on the left is for the English phrase "spam and eggs"; the tree on the right is for the Dutch translation "smac en eieren". These full phrases are shown in the text boxes at the top. Aligned nodes, as indicated by green lines, are translations of each other. The selected nodes and their alignments are shown in yellow. The token sequences corresponding to these selected nodes are shown in the text boxes at the bottom. The alignment relation, which is simply "equals" here, is shown in between.
Release of Algraeph version 1.0
First public release of Algraeph (version 0.6.0)