Yunqa • The Delphi Inspiration

Delphi Components and Applications

YuStemmer: Version History

YuStemmer is a natural language stemming library for 15 languages. It reduces an inflected word to a common root form. YuStemmer is algorithmic, which makes it small and fast.

YuStemmer v6.0.1 – 19 Jan 2024

  • Fix Win64 access violation in Latin stemmers.

YuStemmer v6.0.0 – 13 Dec 2023

  • New stemmers:
    1. Estonian stemmer: TYuStemmer_Estonian_8, TYuStemmer_Estonian_16.
  • Modified stemmers (existing indexes need updating):
    1. The German stemmer is replaced by the German2 stemmer, and the separate German2 stemmer is removed.
    2. The Romanian stemmer is fixed to work with the Unicode alphabet in modern use.
    3. Swedish stemmer: Improve handling of -öst.
  • Removed stemmers:
    1. German2 stemmer (it now replaces the German stemmer, see above).
    2. Romanian stemmer ISO-8859-2 (TYuStemmer_Romanian). The encoding no longer provides all required characters, see above. Use TYuStemmer_Romanian_8 or TYuStemmer_Romanian_16 instead.
  • Latin stemmer has new methods StemNoun() and StemVerb().
  • Tamil stemmer runs about 40 percent faster.

YuStemmer v5.8.0 – 22 Nov 2023

  • Support Delphi 12 Athens Win32 and Win64.

YuStemmer v5.7.0 – 4 Jan 2023

  • TYuStemmer_German2 stemmers: Fix handling of 'qu' to match algorithm description.
  • TYuStemmer_Italian stemmers: Fix overstemming of 'divano'.
  • Improvements to Arabic, Greek, Hindi, Irish, Turkish, and Yiddish stemmers.

YuStemmer 5.6.0 – 16 Sep 2021

  • Support Delphi 11 Alexandria Win32 and Win64.

YuStemmer 5.5.0 – 16 Jul 2021

  • General stemming algorithm improvements.
  • Yiddish stemmer fixes.
  • Arabic and Greek stemmer efficiency tweaks.

YuStemmer 5.4.0 – 19 Nov 2020

  • New stemmers:
    1. Yiddish: TYuStemmer_Yiddish_8, TYuStemmer_Yiddish_16.
  • Fix decoding of 4-byte UTF-8 sequences.
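
For context on the fix above: code points outside the Basic Multilingual Plane are encoded in UTF-8 as four bytes. The Python sketch below is not YuStemmer code (the library itself is written in Delphi); `decode_utf8_char` is a hypothetical helper that only illustrates how the lead byte determines the sequence length. It is a minimal sketch and does not reject overlong or surrogate encodings.

```python
def decode_utf8_char(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one code point starting at data[pos].

    Returns (code_point, number_of_bytes_consumed).
    Hypothetical helper for illustration; not part of YuStemmer.
    """
    lead = data[pos]
    if lead < 0x80:               # 0xxxxxxx: 1 byte (ASCII)
        return lead, 1
    if lead >> 5 == 0b110:        # 110xxxxx: 2-byte sequence
        length, cp = 2, lead & 0x1F
    elif lead >> 4 == 0b1110:     # 1110xxxx: 3-byte sequence
        length, cp = 3, lead & 0x0F
    elif lead >> 3 == 0b11110:    # 11110xxx: 4-byte sequence
        length, cp = 4, lead & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte")
    if pos + length > len(data):
        raise ValueError("truncated UTF-8 sequence")
    for b in data[pos + 1 : pos + length]:
        if b >> 6 != 0b10:        # continuation bytes are 10xxxxxx
            raise ValueError("invalid UTF-8 continuation byte")
        cp = (cp << 6) | (b & 0x3F)  # append 6 payload bits
    return cp, length
```

For example, `decode_utf8_char("\U0001F600".encode("utf-8"))` returns `(0x1F600, 4)`.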

YuStemmer 5.3.0 – 5 Jun 2020

  • Support Delphi 10.4 Sydney Win32 and Win64.

YuStemmer 5.2.0 – 10 Jan 2020

  • New stemmers:
    1. Serbian: TYuStemmer_Serbian_8, TYuStemmer_Serbian_16.
  • Minor optimisation to Dutch Kraaij Pohlmann stemmer: TYuStemmer_Kraaij_Pohlmann, TYuStemmer_Kraaij_Pohlmann_8, TYuStemmer_Kraaij_Pohlmann_16.

YuStemmer 5.1.0 – 8 Oct 2019

  • New stemmers:
    1. Hindi: TYuStemmer_Hindi_8, TYuStemmer_Hindi_16.
  • Add demo project to stem a word with all available stemmers at once. The precompiled executable binary is included in the Demo edition.
  • Enlarge test cases. The precompiled test executable binary is included in the Demo edition.

YuStemmer 5.0.0 – 21 Mar 2019

  • New stemmers:
    1. Greek: TYuStemmer_Greek_8, TYuStemmer_Greek_16.
    2. Indonesian: TYuStemmer_Indonesian, TYuStemmer_Indonesian_8, TYuStemmer_Indonesian_16.
    3. Lithuanian: TYuStemmer_Lithuanian_8, TYuStemmer_Lithuanian_16.
    4. Nepali: TYuStemmer_Nepali_8, TYuStemmer_Nepali_16.
  • Danish and Finnish stemmers no longer mangle numbers. They now define “consonant” more tightly than just “not a vowel”, which means numbers don't get truncated and foreign words tend to be left alone. Search indexes need updating.
  • French stemmer recognizes suffixes that begin with diaereses. Search indexes need updating.
  • Latin stemmer uses a single tab (instead of multiple whitespace characters) to separate the noun and verb form of the result. Code might need updating.
  • Russian stemmer normalizes 'ё' and maps it to 'е'. Search indexes need updating.
  • Turkish stemmer runs up to 11% faster by checking for 'ad' or 'soyad' more efficiently. Search indexes need updating.
  • Fix handling of 3-byte UTF-8 sequences, plus handle 4-byte UTF-8 sequences.
  • Simplify the code of several stemmers and improve them.
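
Two of the 5.0.0 changes above touch calling code directly: the tab-separated Latin result and the Russian 'ё' to 'е' mapping. The Python sketch below is not YuStemmer's Delphi API. `split_latin_result` and `normalize_yo` are hypothetical helpers, and the sample strings are made up for illustration. It shows how a result could be split on the single tab, and why terms indexed before the change may no longer match: the same mapping has to apply to both indexed text and queries.

```python
def split_latin_result(result: str) -> tuple[str, str]:
    """Split a 'noun<TAB>verb' result string into its two forms.

    Hypothetical helper; assumes exactly one tab separates the forms
    and raises ValueError otherwise.
    """
    noun, verb = result.split("\t")
    return noun, verb


def normalize_yo(word: str) -> str:
    """Map 'ё' (U+0451) to 'е' (U+0435), as the changelog describes."""
    return word.replace("\u0451", "\u0435")
```

With the made-up sample `"aquil\taquila"`, `split_latin_result` yields the pair `("aquil", "aquila")`. Code that previously split on runs of spaces would need a similar adjustment.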

YuStemmer 4.1.0 – 24 Dec 2018

  • Support Delphi 10.3 Rio Win32 and Win64.

YuStemmer 4.0.0 – 3 Apr 2017

  • Support Delphi 10.2 Tokyo Win32 and Win64.
  • New stemmers:
    • Arabic: TYuStemmer_Arabic_8, TYuStemmer_Arabic_16.
    • Kraaij Pohlmann (Dutch): TYuStemmer_Kraaij_Pohlmann, TYuStemmer_Kraaij_Pohlmann_8, TYuStemmer_Kraaij_Pohlmann_16.
    • Latin: TYuStemmer_Latin, TYuStemmer_Latin_8, TYuStemmer_Latin_16.
    • Lovins (English): TYuStemmer_Lovins, TYuStemmer_Lovins_8, TYuStemmer_Lovins_16.
    • Slovene: TYuStemmer_Slovene_8, TYuStemmer_Slovene_16.
    • Tamil: TYuStemmer_Tamil_8, TYuStemmer_Tamil_16.
  • Fix TYuStemmer_Czech_8 and TYuStemmer_Czech_16 to handle Unicode properly.
  • Portuguese stemmer fix: Replace Spanish suffixes with Portuguese ones.
  • Greatly expand test cases.

YuStemmer 3.7.0 – 7 May 2016

  • Support Delphi 10.1 Berlin Win32 and Win64.

YuStemmer 3.6.2 – 15 Sep 2015

  • Support Delphi 10 Seattle Win32 and Win64.

YuStemmer 3.6.1 – 25 Apr 2015

  • Add support for Delphi XE8 Win32 and Win64.

YuStemmer 3.6.0 – 3 Oct 2014

  • Support Delphi XE7 Win32 and Win64.
  • New stemmers:
    • Armenian: TYuStemmer_Armenian_8, TYuStemmer_Armenian_16.
    • Basque: TYuStemmer_Basque, TYuStemmer_Basque_8, TYuStemmer_Basque_16.
    • Catalan: TYuStemmer_Catalan, TYuStemmer_Catalan_8, TYuStemmer_Catalan_16.
    • Czech: TYuStemmer_Czech, TYuStemmer_Czech_8, TYuStemmer_Czech_16.
    • Irish: TYuStemmer_Irish, TYuStemmer_Irish_8, TYuStemmer_Irish_16.
  • Hungarian stemmer TYuStemmer_Hungarian now expects ISO 8859-2 instead of ISO 8859-1 encoded strings.

YuStemmer 3.5.0 – 28 Apr 2014

  • Support Delphi XE6 Win32 and Win64.

YuStemmer 3.0.0 – 25 Sep 2013

  • Support Delphi XE5 Win32 and Win64.

YuStemmer 2.6.0 – 14 Jun 2013

  • Support Delphi XE4 Win32 and Win64.

YuStemmer 2.5.0 – 4 Oct 2012

  • Support Delphi XE3 Win32 and Win64.

YuStemmer 2.1.0 – 8 Nov 2011

  • Support Delphi XE2 Win64.

YuStemmer 2.0.0 – 15 Oct 2011

  • Support Delphi XE2 Win32.

YuStemmer 1.1.0 – 28 Sep 2010

  • Support Delphi XE.
  • German stemmers: Add a new rule to reduce -nisse (and -nissen and -nisses) to -nis. This improves the stemming of “Kürbisse”, for example, which was reduced to “Kürbiss” and not “Kürbis”. Database tokenizers need to rebuild their indexes if they use any of the German stemmers.
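
The suffix reduction named above can be sketched in a few lines. This Python sketch only illustrates reducing -nisse, -nissen and -nisses to -nis; it is not the library's Delphi implementation. `reduce_nisse` is a hypothetical name, and a real stemmer applies such a rule alongside many others and under additional conditions.

```python
def reduce_nisse(word: str) -> str:
    """Reduce a trailing -nisse, -nissen or -nisses to -nis.

    Hypothetical, simplified illustration of the rule; expects a
    lowercase word and leaves all other words unchanged.
    """
    for suffix in ("nissen", "nisses", "nisse"):
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + "nis"
    return word
```

Because stems produced before and after such a rule differ, terms stored in an existing index no longer match newly stemmed queries, which is why the indexes have to be rebuilt.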

YuStemmer 1.0.0 – 5 Dec 2009

  • First release.

Last modified: 2024/01/20 10:34