Plugin Embeddings

To perform vector and hybrid search, you need to convert your text data into embeddings.

Orama Cloud manages this for you, but with open-source Orama you need to generate the embeddings for your documents yourself.

This plugin generates embeddings for your documents at insert and search time, allowing you to perform vector and hybrid searches on your documents.
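To make the idea of searching over embeddings concrete, here is a minimal sketch of cosine similarity, a common way to compare two embedding vectors. It is illustrative only: the function name and the toy 3-dimensional vectors are assumptions, and this page does not state how Orama scores vector matches internally.

```javascript
// Illustrative helper (not part of Orama): cosine similarity between two vectors.
// Returns a value close to 1 for vectors pointing the same way and 0 for orthogonal ones.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Toy 3-dimensional stand-ins for embeddings.
const query = [1, 0, 1]
const docA = [1, 0, 1] // same direction as the query
const docB = [0, 1, 0] // orthogonal to the query

const scoreA = cosineSimilarity(query, docA)
const scoreB = cosineSimilarity(query, docB)
```

In this toy example docA scores higher than docB, so it would rank first for the query.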

Installation

You can install the plugin using any major Node.js package manager.

npm install @orama/plugin-embeddings

yarn add @orama/plugin-embeddings

pnpm install @orama/plugin-embeddings

Important note: to use this plugin, you’ll also need to install one of the following TensorFlow.js backends:

@tensorflow/tfjs
@tensorflow/tfjs-node
@tensorflow/tfjs-backend-webgl
@tensorflow/tfjs-backend-cpu
@tensorflow/tfjs-node-gpu
@tensorflow/tfjs-backend-wasm

For example, if you’re running Orama in the browser, we highly recommend using @tensorflow/tfjs-backend-webgl:

npm install @tensorflow/tfjs-backend-webgl

yarn add @tensorflow/tfjs-backend-webgl

pnpm install @tensorflow/tfjs-backend-webgl

If you’re using Orama in Node.js, we recommend using @tensorflow/tfjs-node:

npm install @tensorflow/tfjs-node

yarn add @tensorflow/tfjs-node

pnpm install @tensorflow/tfjs-node
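If the same code has to run in both environments, the two recommendations above could be applied at runtime. The following is a rough sketch, not an official recipe: recommendedBackend, detectRuntime, and loadBackend are hypothetical helper names, and detecting the browser through a global window object is an assumption that may not hold in every runtime.

```javascript
// Hypothetical helper: returns the backend package recommended above for a runtime.
function recommendedBackend(runtime) {
  if (runtime === 'browser') return '@tensorflow/tfjs-backend-webgl'
  if (runtime === 'node') return '@tensorflow/tfjs-node'
  throw new Error(`Unknown runtime: ${runtime}`)
}

// Assumption: a global `window` object means the code is running in a browser.
function detectRuntime() {
  return typeof window !== 'undefined' ? 'browser' : 'node'
}

// Loads the backend through a side-effect import.
// The chosen package must already be installed.
async function loadBackend() {
  await import(recommendedBackend(detectRuntime()))
}
```

Bundlers generally cannot resolve a computed import specifier, so in a bundled browser app a static import of the chosen backend is likely the safer option.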

Usage

Register the plugin when you create the database, and add a vector[512] property to the schema to hold the generated embeddings. Documents can then be inserted and searched as usual, with embeddings generated along the way.

import { create, insert, search } from '@orama/orama'
import { pluginEmbeddings } from '@orama/plugin-embeddings'
import '@tensorflow/tfjs-node' // Or any other appropriate TensorFlow.js backend

const plugin = await pluginEmbeddings({
  embeddings: {
    // Property used to store generated embeddings. Must be defined in the schema.
    defaultProperty: 'embeddings',
    onInsert: {
      // Generate embeddings at insert-time.
      // Turn off if you're inserting documents with embeddings already generated.
      generate: true,
      // Properties to use for generating embeddings at insert time.
      // These properties will be concatenated and used to generate embeddings.
      properties: ['description'],
      verbose: true,
    }
  }
})

const db = create({
  schema: {
    description: 'string',
    // Orama generates 512-dimensional vectors.
    // When using this plugin, use `vector[512]` as a type.
    embeddings: 'vector[512]'
  },
  plugins: [plugin]
})

// When using this plugin, document insertion becomes async
await insert(db, { description: 'The quick brown fox jumps over the lazy dog' })
await insert(db, { description: "I've seen a lazy dog dreaming of jumping over a quick brown fox" })

// When using this plugin, search becomes async
// The result is named `results` so it does not shadow the imported `search` function
const results = await search(db, {
  term: 'Dreaming of a quick brown fox',
  mode: 'vector'
})
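The object returned by the search call can then be inspected. The sketch below assumes the usual Orama result shape, with a hits array whose entries carry an id, a score, and the matched document. describeHits is a hypothetical helper, and the sample object is hand-written with made-up scores so the snippet runs without a database.

```javascript
// Hypothetical helper: turns a search result object into printable lines.
function describeHits(results) {
  return results.hits.map(
    (hit) => `${hit.score.toFixed(3)}  ${hit.document.description}`
  )
}

// Hand-written stand-in for a search result. The ids and scores are made up for illustration.
const sampleResults = {
  count: 2,
  hits: [
    {
      id: '2',
      score: 0.91,
      document: { description: "I've seen a lazy dog dreaming of jumping over a quick brown fox" }
    },
    {
      id: '1',
      score: 0.85,
      document: { description: 'The quick brown fox jumps over the lazy dog' }
    }
  ]
}

const lines = describeHits(sampleResults)
lines.forEach((line) => console.log(line))
```

For hybrid search, the same search call should work with mode set to 'hybrid' instead of 'vector'.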