Skip to main content

secondary-index

Introduction

DefraDB provides a powerful and flexible secondary indexing system that enables efficient document lookups and queries.

About

The following sections provide an overview of performance considerations, indexing related objects, and JSON field indexing.

Performance considerations

Indexes can greatly improve query performance, but they also impact system performance during writes. Each index adds write overhead since every document update must also update the relevant indexes. Despite this, the boost in read performance for indexed queries usually makes this trade-off worthwhile.

To optimize performance:

  • Choose indexes based on your query patterns. Focus on fields frequently used in query filters to maximize efficiency.
  • Avoid indexing rarely queried fields. Doing so adds unnecessary overhead.
  • Be cautious with unique indexes. These require extra validation, making their performance impact more significant.

Plan your indexes carefully to balance read and write performance.

DefraDB supports indexing relationships between documents, allowing for efficient queries across related data.

Example schema: Users and addresses

type User {
name: String
age: Int
address: Address @primary @index
}

type Address {
user: User
city: String @index
street: String
}

Key indexes in this schema:

  • City field in address: Indexed to enable efficient queries by city.
  • Relationship between user and address: Indexed to support fast lookups based on relationships.

Query example

The following query retrieves all users living in Montreal:

query {
User(filter: {
address: {city: {_eq: "Montreal"}}
}) {
name
}
}

How indexing improves efficiency

Without indexes:

  • Fetch all user documents.
  • For each user, retrieve the corresponding Address. This approach becomes slow with large datasets.

With indexes:

  • Fetch address documents matching the city value directly.
  • Retrieve the corresponding User documents. This method is much faster because indexes enable direct lookups.

Enforcing relationship cardinality

Indexes can also enforce one-to-one relationships. For instance, to ensure each User has exactly one unique Address:

type User {
name: String
age: Int
address: Address @primary @index(unique: true)
}

type Address {
user: User
city: String @index
street: String
}

Here, the @index(unique: true) constraint ensures no two Users can share the same Address. Without it, the relationship defaults to one-to-many, allowing multiple Users to reference a single Address.

By combining relationship indexing with cardinality constraints, you can create highly efficient and logically consistent data structures.

JSON field indexing

DefraDB offers a specialized indexing system for JSON fields, designed to handle their hierarchical structure efficiently.

JSON indexing overview

JSON fields differ from other field types (e.g., Int, String, Bool) because they are structured hierarchically. DefraDB uses a path-aware system to manage these complexities, enabling traversal and indexing of all leaf nodes in a JSON document.

JSON Interface

DefraDB's JSON interface, defined in client/json.go, is essential for managing JSON fields. It allows the system to:

Traverse all leaf nodes in a JSON document. Represent a JSON value as either a complete document or a single node within the structure. Each JSON value also stores its path information, which is crucial for creating accurate and efficient indexes.

Example JSON Document
{
"user": {
"device": {
"model": "iPhone"
}
}
}

Here, the iPhone value is represented with its complete path: [user, device, model]. This path-aware representation ensures that the system knows not just the value, but where it resides within the document.

Inverted Indexes for JSON

DefraDB uses inverted indexes for JSON fields. These indexes reverse the traditional "document-to-value" relationship by starting with a value and quickly locating all documents containing that value.

Key Format for JSON Indexes

<collection_id>/<index_id>(/<json_path>/<json_value>)+/<doc_id>
How It Differs
  • Regular fields map to a single index entry.
  • JSON fields generate multiple entries—one for each leaf node, incorporating both the path and the value.

During indexing, the system traverses the entire JSON structure, creating these detailed index entries.

Value normalization in JSON

DefraDB normalizes JSON leaf values to ensure consistency in ordering and comparisons. For example:

  • JSON values include their normalized value and path information.
  • Scalar types (e.g., integers) are normalized to a standard type, such as int64.

This ensures that operations like filtering and sorting are reliable and efficient.

How indexing works

When indexing a document with JSON fields, the system:

  1. Traverses the JSON structure using the JSON interface.
  2. Generates index entries for every leaf node, combining path and normalized value.
  3. Stores entries efficiently, enabling direct querying.
Query example

Retrieve documents where the model is "iPhone":

query {
Collection(filter: {
jsonField: {
user: {
device: {
model: {_eq: "iPhone"}
}
}
}
})
}

With indexes, the system directly retrieves matching documents, avoiding the need to scan and parse the JSON during queries.

Benefits of JSON field indexing

  • Efficient queries: Leverages inverted indexes for fast lookups, even in deeply nested structures.
  • Precise path tracking: Maintains path information for accurate indexing and retrieval.
  • Scalable structure: Handles complex JSON documents with minimal performance overhead.

Usage

The @index directive can be used on GraphQL schema objects and field definitions to configure indexes.

@index(name: String, unique: Bool, direction: ORDERING, includes: [{ field: String, direction: ORDERING }])

name

Sets the index name. Defaults to concatenated field names with direction.

unique

Makes the index unique. Defaults to false.

direction

Sets the default index direction for all fields. Can be one of ASC (ascending) or DESC (descending). Defaults to ASC.

If a field in the includes list does not specify a direction the default direction from this value will be used instead.

includes

Sets the fields the index is created on.

When used on a field definition and the field is not in the includes list it will be implicitly added as the first entry.

Examples

Field level usage

Creates an index on the User name field with DESC direction.

type User {
name: String @index(direction: DESC)
}

Schema level usage

Creates an index on the User name field with default direction (ASC).

type User @index(includes: {field: "name"}) {
name: String
age: Int
}

Unique index

Creates a unique index on the User name field with default direction (ASC).

type User {
name: String @index(unique: true)
}

Composite index

Creates a composite index on the User name and age fields with default direction (ASC).

type User @index(includes: [{field: "name"}, {field: "age"}]) {
name: String
age: Int
}

Relationship index

Creates a unique index on the User relationship to Address. The unique index constraint ensures that no two Users can reference the same Address document.

type User {
name: String
age: Int
address: Address @primary @index(unique: true)
}

type Address {
user: User
city: String
street: String
}