# Vector Index

Apache Pinot supports vector indexes for efficient approximate nearest-neighbor (ANN) search on embedding columns. This document covers all supported index types, configuration options, quantizers, query patterns, and runtime tuning.

## Overview

Vector indexes accelerate similarity search by organizing vectors into clusters (the IVF family) or a navigable graph (HNSW), so a query examines only a subset of vectors instead of scanning all of them. Pinot supports four vector index types:

* **HNSW** (Hierarchical Navigable Small World): Graph-based, excellent accuracy, moderate memory
* **IVF\_FLAT**: Inverted File with flat quantization, fast index build
* **IVF\_PQ**: Inverted File with Product Quantization, balanced speed/memory
* **IVF\_ON\_DISK**: Disk-backed Inverted File, unlimited scale without the 2 GB JVM limit
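
The clustering idea behind the IVF family can be sketched in a few lines of Python. This is a toy illustration, not Pinot's implementation: the centroids are hard-coded rather than trained, and `build_ivf` and `search_ivf` are hypothetical names. The `nprobe` parameter mirrors the `vectorNprobe` query option described later on this page.

```python
import math

def l2(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector ID to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for doc_id, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(doc_id)
    return lists

def search_ivf(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [doc_id for c in probe for doc_id in lists[c]]
    return sorted(candidates, key=lambda d: l2(query, vectors[d]))[:k]

# Hypothetical data: two well-separated clusters (centroids assumed, not trained).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.1, 0.2], [0.3, 0.1], [9.8, 10.1], [10.2, 9.9]]
lists = build_ivf(vectors, centroids)
print(lists)
print(search_ivf([0.0, 0.0], vectors, centroids, lists, k=1, nprobe=1))
```

With `nprobe=1` only one of the two clusters is scanned, which is where the sub-linear behavior comes from; probing more clusters trades latency for recall.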

## Index Configuration

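Vector indexes are configured per column in the table config's `fieldConfigList`, with `encodingType` set to `RAW`. The HNSW examples below place the settings under `indexes.vector`; the IVF examples use the `indexType` and `properties` form.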

### Minimal HNSW Configuration

```json
{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexes": {
        "vector": {
          "vectorIndexType": "HNSW",
          "vectorDimension": 512,
          "vectorDistanceFunction": "COSINE",
          "version": 1
        }
      }
    }
  ]
}
```

### Full HNSW Configuration with Tuning

```json
{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexes": {
        "vector": {
          "vectorIndexType": "HNSW",
          "vectorDimension": 1536,
          "vectorDistanceFunction": "COSINE",
          "version": 1,
          "properties": {
            "maxCon": "16",
            "beamWidth": "200"
          }
        }
      }
    }
  ]
}
```

### IVF\_FLAT Configuration

```json
{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "IVF_FLAT",
        "vectorDimension": 768,
        "vectorDistanceFunction": "EUCLIDEAN",
        "version": 1,
        "nlist": "128",
        "trainSampleSize": "20000",
        "quantizer": "SQ8"
      }
    }
  ]
}
```

### IVF\_PQ Configuration

```json
{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "IVF_PQ",
        "vectorDimension": 768,
        "vectorDistanceFunction": "EUCLIDEAN",
        "version": 1,
        "nlist": "256",
        "trainSampleSize": "50000",
        "pqM": "32",
        "pqNbits": "8",
        "quantizer": "PQ"
      }
    }
  ]
}
```

### IVF\_ON\_DISK Configuration

`IVF_ON_DISK` is a disk-backed IVF index that reads data through `FileChannel` random-access reads, so index size is not bounded by the 2 GB JVM limit. It supports all quantizer types and full filter-aware ANN.

```json
{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "IVF_ON_DISK",
        "vectorDimension": 768,
        "vectorDistanceFunction": "EUCLIDEAN",
        "version": 1,
        "nlist": "256",
        "trainSampleSize": "50000",
        "quantizer": "SQ4"
      }
    }
  ]
}
```

## Distance Functions

| Function         | Use Case                                    | Range   |
| ---------------- | ------------------------------------------- | ------- |
| **COSINE**       | Normalized text embeddings (OpenAI, BERT)   | \[0, 2] |
| **EUCLIDEAN**    | Unnormalized embeddings or geometric data   | \[0, ∞) |
| **DOT\_PRODUCT** | Pre-normalized, higher score = more similar | (-∞, ∞) |
| **L2**           | Alias for EUCLIDEAN                         | \[0, ∞) |
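
The ranges in the table can be checked with a small Python sketch. It assumes cosine distance is defined as `1 - cosine similarity` (which matches the `[0, 2]` range above) and that Euclidean distance is the plain, non-squared form; neither definition is confirmed by this page, and the function names are illustrative only.

```python
import math

def dot(a, b):
    """Dot product: unbounded, higher means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity, giving a range of [0, 2]."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / norm

def euclidean_distance(a, b):
    """Assumed non-squared L2 distance, giving a range of [0, inf)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

q = [1.0, 0.0]
print(cosine_distance(q, [2.0, 0.0]))     # same direction
print(cosine_distance(q, [0.0, 3.0]))     # orthogonal
print(cosine_distance(q, [-1.0, 0.0]))    # opposite direction
print(euclidean_distance(q, [4.0, 4.0]))  # grows with magnitude
print(dot(q, [-5.0, 0.0]))                # can be negative
```

Cosine distance ignores vector magnitude, which is why it suits normalized text embeddings, while Euclidean distance and dot product are both sensitive to it.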

## Quantizers

Pinot supports a generic quantizer framework for trading memory consumption against search speed. Quantizers apply to IVF-family indexes (`IVF_FLAT`, `IVF_PQ`, `IVF_ON_DISK`).

| Quantizer | Memory per dimension | Speed     | Use Case                                       |
| --------- | -------------------- | --------- | ---------------------------------------------- |
| **FLAT**  | 4 bytes              | Fastest   | High memory budget, maximum accuracy           |
| **SQ8**   | 1 byte               | Fast      | 8-bit scalar quantization                      |
| **SQ4**   | 0.5 bytes            | Very fast | 4-bit scalar quantization, maximum compression |
| **PQ**    | Variable             | Medium    | Large-scale with product quantization          |

SQ8 and SQ4 are fully integrated through the IVF creator, reader, and search paths — they are real backend capabilities, not validation-only features.
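
To make the memory column concrete, the following Python sketch estimates the encoded size of a single vector. The per-dimension figures come from the table above. The PQ formula (`pqM` codes of `pqNbits` bits each) is the textbook product-quantization code size and is an assumption here, not a confirmed description of Pinot's on-disk layout. Centroids, codebooks, and row IDs are ignored, and `bytes_per_vector` is a hypothetical helper.

```python
def bytes_per_vector(dimension, quantizer, pq_m=None, pq_nbits=None):
    """Estimate encoded bytes for one vector, excluding index overhead."""
    per_dimension = {"FLAT": 4.0, "SQ8": 1.0, "SQ4": 0.5}  # from the table above
    if quantizer in per_dimension:
        return dimension * per_dimension[quantizer]
    if quantizer == "PQ":
        # Assumption: standard PQ stores pq_m sub-vector codes of pq_nbits bits.
        return pq_m * pq_nbits / 8
    raise ValueError(f"unknown quantizer: {quantizer}")

# Settings taken from the configuration examples on this page.
for name in ("FLAT", "SQ8", "SQ4"):
    print(name, bytes_per_vector(768, name))
print("PQ", bytes_per_vector(768, "PQ", pq_m=32, pq_nbits=8))
```

Under these assumptions a 768-dimension vector shrinks from 3072 bytes (FLAT) to 768 bytes (SQ8) or 384 bytes (SQ4), and to 32 bytes with `pqM = 32` and `pqNbits = 8`.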

## SQL Functions

### VECTOR\_SIMILARITY — Top-K ANN Search

Returns the `k` nearest neighbors using the configured vector index:

```sql
SELECT ProductId,
       cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY(embedding, ARRAY[0.12, 0.34, 0.56, ...], 10)
ORDER BY dist ASC
LIMIT 10;
```

### VECTOR\_SIMILARITY\_RADIUS — Distance-Based Search

Returns all vectors within a distance threshold, without requiring a fixed top-K:

```sql
SELECT ProductId,
       cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY_RADIUS(embedding, ARRAY[0.12, 0.34, 0.56, ...], 0.3)
ORDER BY dist ASC;
```

`VECTOR_SIMILARITY_RADIUS` automatically falls back to a brute-force scan on segments without a vector index. Approximate radius search is advertised only for backends that implement real index-assisted radius search.
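
The brute-force fallback amounts to computing the distance to every vector and keeping those within the threshold. A minimal Python sketch, assuming cosine distance is `1 - cosine similarity` and that the threshold is inclusive (both are assumptions; `radius_scan` is a hypothetical name):

```python
import math

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def radius_scan(query, vectors, radius):
    """Return row IDs whose distance to the query is within radius, nearest first."""
    scored = [(cosine_distance(query, v), row_id) for row_id, v in enumerate(vectors)]
    return [row_id for dist, row_id in sorted(scored) if dist <= radius]

# Hypothetical 2-dimensional vectors.
vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
print(radius_scan([1.0, 0.0], vectors, 0.3))
```

Unlike top-K search, the result size depends entirely on the data: a tight radius may return nothing, and a loose one may return most of the segment.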

## Filter-Aware ANN

When a query combines a vector predicate with metadata filters, Pinot can pre-filter vectors using a bitmap before the ANN lookup. This improves recall compared to post-ANN filtering.

```sql
SELECT ProductId,
       Brand,
       cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY(embedding, ARRAY[0.12, 0.34, 0.56, ...], 50)
  AND category = 'electronics'
  AND inStock = true
ORDER BY dist ASC
LIMIT 10;
```

**How it works:**

1. The metadata filters (`category = 'electronics'` and `inStock = true` in the query above) build a bitmap of matching row IDs.
2. The bitmap is passed to the vector index reader via `FilterAwareVectorIndexReader`.
3. The index prunes vectors before ANN traversal using the bitmap.
4. Only matching vectors are considered — improving recall on selective filters.
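
The steps above can be illustrated with a small Python sketch. A brute-force scan stands in for the ANN traversal and a Python set stands in for the bitmap; `top_k` and the sample data are hypothetical and say nothing about the internals of `FilterAwareVectorIndexReader`.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, vectors, k, allowed=None):
    """Brute-force stand-in for the ANN lookup.

    allowed is the set of row IDs that passed the metadata filter (the bitmap);
    None means no pre-filtering.
    """
    row_ids = range(len(vectors)) if allowed is None else allowed
    return sorted(row_ids, key=lambda i: l2(query, vectors[i]))[:k]

# Hypothetical rows: the nearest vectors to the query are all in the wrong category.
vectors = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1]]
category = ["toys", "toys", "toys", "toys", "electronics", "electronics"]
query = [0.0]

# Step 1: the metadata filter builds the set of matching row IDs.
matching = {i for i, c in enumerate(category) if c == "electronics"}

# Steps 2-4: pass the set to the search so only matching rows are considered.
pre_filtered = top_k(query, vectors, k=2, allowed=matching)

# For contrast: search first, then filter the top-k results.
post_filtered = [i for i in top_k(query, vectors, k=2) if i in matching]

print(pre_filtered)
print(post_filtered)
```

In this example post-filtering returns no rows because the two nearest vectors are both filtered out, while pre-filtering returns the two matching rows. That gap is the recall improvement on selective filters.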

**When to use filter-aware ANN:**

* Selective filters that remove 70% or more of rows
* Combine with exact reranking for best accuracy

`IVF_ON_DISK` has full `FILTER_THEN_ANN` support with pre-filter bitmap computation, explain/debug reporting showing filter selectivity, and consistent behavior with in-memory IVF\_FLAT and IVF\_PQ.

## HNSW Runtime Tuning

The following query options control HNSW search behavior at runtime without rebuilding the index. They apply to both mutable (consuming) and immutable (offline) segments.

### `vectorEfSearch` — Search Beam Width

Controls how many nodes HNSW visits during graph traversal:

```sql
SET vectorEfSearch = 500;

SELECT ProductId,
       cosineDistance(embedding, ARRAY[...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY(embedding, ARRAY[...], 10)
ORDER BY dist ASC
LIMIT 10;
```

**Typical values:**

* `100–150`: Low latency (real-time applications)
* `200–300`: Balanced (default)
* `400–800`: High recall (semantic search)

Higher `vectorEfSearch` values improve recall at the cost of query latency.

### `vectorUseRelativeDistance` — Competitive Pruning

Enables or disables competitive pruning during HNSW graph traversal. Disabling it can improve recall on some data distributions. The example below also disables the bounded top-K collector with `vectorUseBoundedQueue`:

```sql
SET vectorEfSearch = 128;
SET vectorUseRelativeDistance = false;
SET vectorUseBoundedQueue = false;

SELECT cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY(embedding, ARRAY[0.12, 0.34, 0.56], 10)
ORDER BY dist ASC LIMIT 10;
```

## Adaptive Query Planner

Pinot automatically selects an execution mode based on filter selectivity (read here as the fraction of rows that pass the metadata filter) via `VectorSearchStrategy` in `FilterPlanNode`:

| Filter Selectivity | Mode              | Strategy                     |
| ------------------ | ----------------- | ---------------------------- |
| None               | `ANN_TOP_K`       | Pure ANN — no pre-filtering  |
| Low (<30%)         | `FILTER_THEN_ANN` | Build bitmap → pass to ANN   |
| High (>70%)        | `ANN_THEN_FILTER` | ANN candidates → post-filter |
| No index           | `EXACT_SCAN`      | Brute-force full scan        |

No configuration is required — the planner chooses the strategy per segment.
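
The decision table can be written out as a short Python sketch. It assumes selectivity means the fraction of rows that pass the filter, and it uses only the thresholds shown above. The table does not say what happens between 30% and 70%, so the sketch returns `None` for that range rather than guessing. `choose_mode` is a hypothetical function, not the `VectorSearchStrategy` API.

```python
def choose_mode(has_vector_index, selectivity=None):
    """Map the planner table to a mode name.

    selectivity: assumed fraction (0.0-1.0) of rows passing the metadata
    filter, or None when the query has no metadata filter.
    """
    if not has_vector_index:
        return "EXACT_SCAN"
    if selectivity is None:
        return "ANN_TOP_K"
    if selectivity < 0.30:
        return "FILTER_THEN_ANN"
    if selectivity > 0.70:
        return "ANN_THEN_FILTER"
    return None  # 30%-70% is not specified by the table above

print(choose_mode(True))
print(choose_mode(True, 0.05))
print(choose_mode(True, 0.90))
print(choose_mode(False, 0.05))
```

The intuition is that a filter passing few rows leaves a small candidate set that is cheap to restrict up front, while a filter passing most rows removes little, so filtering the ANN candidates afterwards loses few results.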

## Query Options

| Option                      | Default           | Description                                             |
| --------------------------- | ----------------- | ------------------------------------------------------- |
| `vectorNprobe`              | `4`               | Clusters to probe (IVF\_FLAT, IVF\_PQ, IVF\_ON\_DISK)   |
| `vectorExactRerank`         | `true` (IVF\_PQ)  | Override for exact reranking of ANN candidates          |
| `vectorMaxCandidates`       | `topK * 10`       | Cap on ANN candidates considered                        |
| `vectorDistanceThreshold`   | Not set           | Distance threshold on raw Pinot vector distance         |
| `vectorEfSearch`            | From index config | HNSW only: visit budget for search beam                 |
| `vectorUseRelativeDistance` | `true`            | HNSW only: toggle relative-distance competitive pruning |
| `vectorUseBoundedQueue`     | `true`            | HNSW only: toggle bounded top-K collector               |
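
When queries are generated by an application, the options can be prepended as `SET` statements in the same form used by the SQL examples on this page. The Python helper below is a minimal sketch: `with_options` is a hypothetical name, it performs no validation or escaping, and it only formats text.

```python
def with_options(sql, **options):
    """Prefix a query with one SET statement per option, in the order given."""
    lines = []
    for name, value in options.items():
        if isinstance(value, bool):
            # Render Python booleans as lowercase SQL literals.
            value = "true" if value else "false"
        lines.append(f"SET {name} = {value};")
    return "\n".join(lines + [sql])

query = with_options(
    "SELECT doc_id FROM my_table "
    "WHERE VECTOR_SIMILARITY(embedding, ARRAY[1.0, 2.0, 3.0], 20) LIMIT 20;",
    vectorNprobe=16,
    vectorMaxCandidates=500,
    vectorExactRerank=True,
)
print(query)
```

The resulting string can be submitted through whichever client the application already uses.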

## Vector Search Metrics

`VectorSearchMetrics` tracks the following server-side counters:

| Metric                         | Description                                       |
| ------------------------------ | ------------------------------------------------- |
| `vectorAnnCandidatesRetrieved` | Number of ANN candidates retrieved from the index |
| `vectorExactRerankCount`       | Vectors re-ranked with exact distance computation |
| `vectorFilteredOutCount`       | Vectors eliminated by the pre-filter bitmap       |
| `vectorSearchLatencyMs`        | End-to-end search latency                         |

## Index Type Comparison

| Index         | Memory | Build Time | Query Speed | Recall    | Quantization         | Disk-Backed |
| ------------- | ------ | ---------- | ----------- | --------- | -------------------- | ----------- |
| HNSW          | Medium | Moderate   | Fast        | Excellent | —                    | No          |
| IVF\_FLAT     | High   | Fast       | Medium      | Good      | FLAT/SQ8/SQ4         | No          |
| IVF\_PQ       | Low    | Moderate   | Medium      | Fair      | Product Quantization | No          |
| IVF\_ON\_DISK | Low    | Moderate   | Medium      | Good      | FLAT/SQ8/SQ4/PQ      | Yes         |

## Complete Example: Semantic Product Search

### Schema

```json
{
  "schemaName": "products",
  "dimensionFieldSpecs": [
    { "name": "ProductId", "dataType": "STRING" },
    { "name": "Category", "dataType": "STRING" },
    {
      "name": "embedding",
      "dataType": "FLOAT",
      "singleValueField": false
    }
  ]
}
```

### Table Configuration

```json
{
  "tableName": "products_OFFLINE",
  "fieldConfigList": [
    {
      "name": "Category",
      "indexes": { "inverted": {} }
    },
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexes": {
        "vector": {
          "vectorIndexType": "HNSW",
          "vectorDimension": 1536,
          "vectorDistanceFunction": "COSINE",
          "version": 1,
          "properties": {
            "maxCon": "32",
            "beamWidth": "200",
            "efConstruction": "400"
          }
        }
      }
    }
  ]
}
```

### Basic Top-K Query

```sql
SELECT ProductId,
       cosineDistance(embedding, ARRAY[-0.0013, -0.0110, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY(embedding, ARRAY[-0.0013, -0.0110, ...], 10)
ORDER BY dist ASC
LIMIT 10;
```

### Filter-Aware ANN Query

```sql
SELECT ProductId,
       Category,
       cosineDistance(embedding, ARRAY[-0.0013, -0.0110, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY(embedding, ARRAY[-0.0013, -0.0110, ...], 50)
  AND Category = 'Electronics'
ORDER BY dist ASC
LIMIT 10;
```

### Radius Search

```sql
SELECT ProductId,
       cosineDistance(embedding, ARRAY[-0.0013, -0.0110, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY_RADIUS(embedding, ARRAY[-0.0013, -0.0110, ...], 0.25)
ORDER BY dist ASC;
```

### IVF with Exact Reranking

```sql
SET vectorNprobe = 16;
SET vectorMaxCandidates = 500;
SET vectorExactRerank = true;

SELECT l2Distance(embedding, ARRAY[1.0, 2.0, 3.0]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY(embedding, ARRAY[1.0, 2.0, 3.0], 20)
ORDER BY dist ASC LIMIT 20;
```
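
Conceptually, exact reranking takes the candidate row IDs produced by the approximate index and re-sorts them using the full-precision vectors. The Python sketch below shows that idea only; `exact_rerank`, the candidate list, and the stored vectors are hypothetical, and plain Euclidean distance is assumed.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_rerank(query, candidates, vectors, k):
    """Re-sort ANN candidate row IDs by exact distance and keep the best k."""
    return sorted(candidates, key=lambda row_id: l2(query, vectors[row_id]))[:k]

# Hypothetical full-precision vectors keyed by row ID.
vectors = {
    3: [1.0, 2.0, 3.0],
    7: [1.0, 2.0, 3.1],
    9: [4.0, 4.0, 4.0],
}
# Candidate order as an approximate (quantized) index might return it.
candidates = [7, 9, 3]

print(exact_rerank([1.0, 2.0, 3.0], candidates, vectors, k=2))
```

Reranking can only reorder rows that are already in the candidate set, which is why the query above raises `vectorNprobe` and `vectorMaxCandidates` together with enabling `vectorExactRerank`.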

### Distance Threshold Without Fixed Top-K

```sql
SET vectorDistanceThreshold = 0.75;
SET vectorMaxCandidates = 500;

SELECT l2Distance(embedding, ARRAY[1.0, 2.0, 3.0]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY(embedding, ARRAY[1.0, 2.0, 3.0], 200)
ORDER BY dist ASC LIMIT 200;
```

## Related Pages

* [Vector / Similarity Functions](https://docs.pinot.apache.org/functions/vector) — SQL function reference
* [Vector Query Execution Semantics](https://docs.pinot.apache.org/build-with-pinot/querying-and-sql/sql-syntax/vector-query-execution) — Execution modes
* [Query Options](https://docs.pinot.apache.org/build-with-pinot/querying-and-sql/query-execution-controls/query-options) — Full query options reference
* [Schema and Table Configuration](https://docs.pinot.apache.org/reference/configuration-reference/table) — Configuration reference


