Stemming

Orama can analyze input text and stem it, reducing each word to its root form. Because inflected variants of a word collapse into a single index entry, the engine can match queries against fewer, more general terms and the index takes less space.
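
To illustrate why stemming helps, here is a minimal, self-contained sketch. It does not use Orama: `toyStem` and `buildIndex` are hypothetical helpers written for this example, and the suffix-stripping rule is a deliberate oversimplification of what a real stemmer does.

```typescript
// Toy suffix-stripping stemmer. NOT Orama's analyzer, only an illustration.
function toyStem(token: string): string {
  for (const suffix of ["ing", "ed", "s"]) {
    // Keep at least three characters so short words stay intact.
    if (token.endsWith(suffix) && token.length - suffix.length >= 3) {
      return token.slice(0, -suffix.length);
    }
  }
  return token;
}

// Build an inverted index: term -> set of document ids.
function buildIndex(
  docs: string[],
  stem: (token: string) => string
): Map<string, Set<number>> {
  const index = new Map<string, Set<number>>();
  docs.forEach((doc, id) => {
    const tokens = doc.toLowerCase().split(/\W+/).filter(Boolean);
    for (const token of tokens) {
      const key = stem(token);
      if (!index.has(key)) index.set(key, new Set<number>());
      index.get(key)!.add(id);
    }
  });
  return index;
}

const docs = ["She walks home", "He walked home", "They are walking"];

// Index the same documents without and with stemming.
const plain = buildIndex(docs, (token) => token);
const stemmed = buildIndex(docs, toyStem);

// Query terms go through the same stemmer as the indexed text.
const hits = stemmed.get(toyStem("walking")) || new Set<number>();

console.log(plain.size, stemmed.size, hits.size);
```

In this sketch the stemmed index holds fewer distinct terms than the unstemmed one, and a query for "walking" reaches all three documents, because "walks", "walked", and "walking" share one entry.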

When stemming is enabled, Orama uses the English language analyzer by default. We can override this behavior by importing a stemmer for another language and setting the language and stemmer properties of the tokenizer at database initialization.

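import { create } from "@orama/orama";
// Each stemmer package exports the stemmer function and its language name.
import { stemmer, language } from "@orama/stemmers/italian";

const db = create({
  schema: {
    author: "string",
    quote: "string",
  },
  components: {
    tokenizer: {
      stemming: true, // enable stemming
      language, // replaces the default English analyzer
      stemmer, // the Italian stemmer imported above
    },
  },
});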

Right now, Orama supports 30 languages and stemmers out of the box:

Arabic
Armenian
Bulgarian
Chinese (Mandarin - stemmer not supported)
Danish
Dutch
English
Finnish
French
German
Greek
Hindi
Hungarian
Indonesian
Irish
Italian
Mandarin (stemmer not supported)
Nepali
Norwegian
Portuguese
Romanian
Russian
Sanskrit
Serbian
Slovenian
Spanish
Swedish
Tamil
Turkish
Ukrainian