Create a new Orama instance
We can create a new instance (referred to as a database from now on) with an indexing schema.
The schema represents the searchable properties of the document to be inserted.
Not every property needs to be indexed, only those we want to search on.
If you want to learn more and see real-world examples, check out this blog post we wrote about schema optimization.
Schema properties and types
The schema is an object where the keys are the property names and the values are the property types.
Orama supports the following types:
Type | Description | Example |
---|---|---|
string | A string of characters. | 'Hello world' |
number | A numeric value, either float or integer. | 42 |
boolean | A boolean value. | true |
enum | An enum value. | 'drama' |
geopoint | A geopoint value. | { lat: 40.7128, lon: 74.0060 } |
string[] | An array of strings. | ['red', 'green', 'blue'] |
number[] | An array of numbers. | [42, 91, 28.5] |
boolean[] | An array of booleans. | [true, false, false] |
enum[] | An array of enums. | ['comedy', 'action', 'romance'] |
vector[<size>] | A vector of numbers to perform vector search on. | [0.403, 0.192, 0.830] |
A database can be as simple as:
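As a minimal sketch, assuming the `create` function exported by the `@orama/orama` package: the schema below indexes a single string property. The `word` property and the `createSimpleDB` wrapper are illustrative names, and the dynamic `import()` is used only so the snippet loads in both CommonJS and ES module files.

```javascript
// Illustrative schema: one searchable string property.
const simpleSchema = {
  word: 'string',
}

// Hypothetical wrapper; requires `@orama/orama` to be installed when called.
async function createSimpleDB() {
  const { create } = await import('@orama/orama')
  // `await` is harmless whether `create` returns a value or a promise.
  return await create({ schema: simpleSchema })
}
```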
or more variegated:
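A sketch of a schema that mixes several of the types from the table above, again assuming `create` from `@orama/orama`. The movie-style property names and the `createMovieDB` wrapper are made up for illustration.

```javascript
// Illustrative schema mixing string, number, boolean and enum properties.
const movieSchema = {
  title: 'string',
  director: 'string',
  plot: 'string',
  year: 'number',
  isFavorite: 'boolean',
  genre: 'enum',
}

// Hypothetical wrapper; requires `@orama/orama` to be installed when called.
async function createMovieDB() {
  const { create } = await import('@orama/orama')
  return await create({ schema: movieSchema })
}
```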
Nested properties
Orama supports nested properties natively. Just add them as you would typically do in any JavaScript object:
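For example, a nested `cast` object could be declared as follows. This is a sketch assuming `create` from `@orama/orama`; the property names and the `createNestedDB` wrapper are illustrative.

```javascript
// Illustrative schema with a nested `cast` object.
const nestedSchema = {
  title: 'string',
  plot: 'string',
  cast: {
    director: 'string',
    leading: 'string',
  },
  year: 'number',
  isFavorite: 'boolean',
}

// Hypothetical wrapper; requires `@orama/orama` to be installed when called.
async function createNestedDB() {
  const { create } = await import('@orama/orama')
  return await create({ schema: nestedSchema })
}
```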
Vector properties
Since version 1.2.0, Orama supports vector search.
To run vector queries, you first need to initialize a vector property in the schema:
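A minimal sketch, assuming `create` from `@orama/orama` and the `vector[<size>]` type listed in the table above. The `embedding` property name, the size of 384, and the `createVectorDB` wrapper are illustrative choices.

```javascript
// Illustrative schema: a string property plus a 384-element vector property.
const vectorSchema = {
  title: 'string',
  embedding: 'vector[384]',
}

// Hypothetical wrapper; requires `@orama/orama` to be installed when called.
async function createVectorDB() {
  const { create } = await import('@orama/orama')
  return await create({ schema: vectorSchema })
}
```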
Please note that the size of the vector must be specified in the schema.
The size is the number of elements the vector contains. Make sure to specify the correct size: searching across vectors of different sizes produces unpredictable and mostly wrong results.
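One way to guard against size mismatches is to check each vector against the declared size before using it. The sketch below is plain JavaScript and not part of Orama's API; `EMBEDDING_SIZE` and `checkEmbedding` are hypothetical names, and 384 is an illustrative size.

```javascript
// Hypothetical constant: keep the declared size in one place.
const EMBEDDING_SIZE = 384

// The schema derives its vector type from the same constant.
const guardedSchema = {
  title: 'string',
  embedding: `vector[${EMBEDDING_SIZE}]`,
}

// Hypothetical helper: throws if the vector does not match the declared size.
function checkEmbedding(vector) {
  if (!Array.isArray(vector) || vector.length !== EMBEDDING_SIZE) {
    const got = Array.isArray(vector) ? vector.length : typeof vector
    throw new RangeError(`Expected ${EMBEDDING_SIZE} elements, got ${got}`)
  }
  return vector
}
```

Calling `checkEmbedding` on every vector before it is inserted or used in a query makes a wrong size fail loudly instead of silently degrading results.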
For performance reasons, we recommend using one vector property per database, even though it’s possible to have multiple vector properties in the same Orama instance.
If you’re using vector properties to search through embeddings, we highly recommend using HuggingFace’s gte-small model, which has a vector size of 384.
There is a great article written by Supabase explaining why it might be a better option than OpenAI’s text-embedding-ada-002 model: https://supabase.com/blog/fewer-dimensions-are-better-pgvector.
Instance ID
Every Orama instance has a unique id property, which can be used to identify a given instance when working with multiple databases.
You can customize it by passing an id property during the creation of the instance:
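A sketch of this, assuming `create` from `@orama/orama` accepts `id` alongside `schema` and exposes it on the returned instance as described above. The id value and the `createNamedDB` wrapper are illustrative.

```javascript
// Illustrative options: a custom id next to the schema.
const namedOptions = {
  id: 'my-orama-instance',
  schema: {
    word: 'string',
  },
}

// Hypothetical wrapper; requires `@orama/orama` to be installed when called.
async function createNamedDB() {
  const { create } = await import('@orama/orama')
  const db = await create(namedOptions)
  // The instance id is expected to match the one passed in.
  return db.id
}
```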