BaseX Similarity Module
based on the SimMetrics library
http://sourceforge.net/projects/simmetrics/

Daniël Knippers, University of Twente
(C) 2012
GPL License

Installation
-------------
In BaseX (command line client):

    REPO INSTALL StringSimilarity.jar

Usage
-------------
In a query, import the module and call a function like so:

    import module namespace s = "org.basex.modules.StringSimilarity";
    s:Levenshtein("house", "home")

Functions
-------------
Generic:

    similarity(String metric, String s1, String s2):
        Return the normalized similarity value between s1 and s2, using metric.

    similarity_raw(String metric, String s1, String s2):
        Return the unnormalized similarity value between s1 and s2, using metric.

Metric specific:

    <metric_name>(String s1, String s2):
        Return the normalized similarity value between s1 and s2, using metric_name.

    <metric_name>_raw(String s1, String s2):
        Return the unnormalized similarity value between s1 and s2, using metric_name.

Metrics
-------------
 1. BlockDistance
 2. ChapmanLengthDeviation
 3. ChapmanMeanLength
 4. ChapmanOrderedNameCompound
 5. Cosine
 6. DamerauLevenshtein
 7. Dice
 8. EuclideanDistance
 9. HammingDistance
10. Jaccard
11. Jaro
12. JaroWinkler
13. Levenshtein
14. MatchingCoefficient
15. NeedlemanWunch
16. OverlapCoefficient
17. QGrams
18. SmithWaterman
19. Soundex
20. TagLinkToken
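
Example: normalized vs. raw values
-------------
The difference between the normalized functions and their _raw counterparts can be illustrated with a small standalone sketch for the Levenshtein metric. This is not the module's code. It assumes, for illustration only, that the raw value is the unit-cost edit distance and that the normalized value is 1 - distance / max(length of s1, length of s2); the values returned by the module itself may be defined differently. The function names levenshtein_raw and levenshtein_normalized are hypothetical.

```python
def levenshtein_raw(s1: str, s2: str) -> int:
    """Edit distance with unit cost for insert, delete and substitute."""
    # previous[j] holds the distance between the processed prefix of s1
    # and the first j characters of s2.
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def levenshtein_normalized(s1: str, s2: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein_raw(s1, s2) / longest


if __name__ == "__main__":
    print(levenshtein_raw("house", "home"))
    print(levenshtein_normalized("house", "home"))
```

Under these assumptions, "house" and "home" are two edits apart (substitute u with m, delete s), so the raw value is 2 and the normalized value is about 0.6. A normalized value is comparable across string pairs of different lengths, while a raw value is not.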