Command-line interface to EuroWordNet for Dutch, written in Python
Compute similarity measures for a pair of Dutch words
This demo computes three corpus-based word similarity measures for a
given pair of Dutch words:
The words must be lemmas. That is, "kat" (cat) and "eten" (eat) are
recognized, but inflected forms like "katten" (cats) or "eet" (eats)
are not. You can use plain words like "kat" (cat) and
"hond" (dog), but you can also add a part-of-speech (verb, noun or adj)
as in "varen:noun" or "varen:verb", and even a particular sense as in
If a word is unknown (i.e. not present in the Cornetto database) or
has a count of zero, the similarity measure may become undefined, and
the value "None" is returned. For the technical details of the
implementation, see the relevant part of the Pycornetto API
documentation.
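
The "None" behaviour described above can be illustrated with a minimal Python sketch. It assumes that the measures depend on a logarithm of relative corpus frequency (information content), which this page does not state explicitly. The names information_content and format_score are hypothetical and are not part of the Pycornetto API.

```python
import math
from typing import Optional


def information_content(count: int, total: int) -> Optional[float]:
    """Return -log(count / total), or None when it is undefined.

    Hypothetical helper for illustration only. A zero count would
    require log(0), which is undefined, so None is returned instead.
    """
    if count <= 0 or total <= 0:
        return None
    return -math.log(count / total)


def format_score(score: Optional[float]) -> str:
    """Render a score for display, treating None as 'undefined'."""
    return "undefined" if score is None else f"{score:.3f}"


if __name__ == "__main__":
    # Made-up counts, purely to exercise both code paths.
    print(format_score(information_content(1, 100)))
    print(format_score(information_content(0, 100)))
```

Code that calls the demo should likewise test for None before comparing or sorting scores, since None cannot be ordered against numbers in Python 3.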
The implementation relies on three resources:
Please don't abuse this demo. If you want to compute similarity for
a substantial number of words, get a legal copy of the Cornetto
database from the TST centrale and use it with our free Pycornetto
software to process your own data.