# jsonextractindex

Evaluates `'jsonPath'` on `jsonField` and returns the result as the type given by `'resultsType'`. The optional `defaultValue` is returned when the value is null or cannot be parsed.

The function closely mirrors JSONEXTRACTSCALAR. However, because it extracts values from a JSON index, it allows additional filtering to be pushed down.

Compared to JSONEXTRACTSCALAR, this function is most useful when extraction is required on a large number of docs or on large docs. When filtering is highly specific, the original JSONEXTRACTSCALAR implementation is usually faster. For a simple comparison, see the initial PR [#11739](https://github.com/apache/pinot/pull/11739).
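As a rough illustration of how closely the two functions mirror each other, the sketch below writes the same extraction both ways. It assumes the `githubEvents` table from the Batch JSON Quick Start used in the usage examples further down, and that the `repo` column has a JSON index for `JSONEXTRACTINDEX` to read from. The column aliases are illustrative.

```sql
-- Same JSON path and result type in both calls.
-- JSONEXTRACTSCALAR parses the JSON value of each row,
-- JSONEXTRACTINDEX reads the value from the JSON index on `repo`.
SELECT id,
       JSONEXTRACTSCALAR(repo, '$.name', 'STRING') AS nameFromScalar,
       JSONEXTRACTINDEX(repo, '$.name', 'STRING') AS nameFromIndex
FROM githubEvents
WHERE id = 7044874109
```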

## Signature

> JSONEXTRACTINDEX(jsonField, 'jsonPath', 'resultsType', \[defaultValue], \[filter])

| Arguments       | Description                                                                                                                                                             |
| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `jsonField`     | An **Identifier**/**Expression** that contains JSON documents.                                                                                                          |
| `'jsonPath'`    | Follows [JsonPath Syntax](https://goessner.net/articles/JsonPath/) to read values from JSON documents.                                                                  |
| `'resultsType'` | One of the Pinot supported data types: **`INT, LONG, FLOAT, DOUBLE, BOOLEAN, TIMESTAMP, STRING, INT_ARRAY, LONG_ARRAY, FLOAT_ARRAY, DOUBLE_ARRAY, STRING_ARRAY`**.       |
| `defaultValue`  | Optional. Returned when the extracted value is null or cannot be parsed.                                                                                                |
| `'filter'`      | Optional. Pushes down a filter to avoid extracting values that do not match, e.g. **`"$.arrField[*].f2" > 2`**.                                                         |

{% hint style="warning" %}
**`'jsonPath'`** and **`'resultsType'`** are **literals**. Pinot uses single quotes to distinguish them from **identifiers**.
{% endhint %}
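To make the literal versus identifier distinction concrete, here is a minimal sketch of a typed extraction with a default. It assumes the quick start `githubEvents` table with a JSON index on `repo`; the `0` default and the `repoId` alias are illustrative choices.

```sql
-- `repo` is an identifier (a column); '$.id' and 'LONG' are string literals.
-- The fourth argument (0) is the default used when '$.id' is missing
-- or cannot be parsed as a LONG.
SELECT id, jsonextractindex(repo, '$.id', 'LONG', 0) AS repoId
FROM githubEvents
WHERE id = 7044874109
```

Because the optional arguments are positional in the signature, a `filter` can only be passed when a `defaultValue` is also supplied.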

## Usage Examples

The examples in this section are based on the [Batch JSON Quick Start](/start-here/quick-start.md#batch-json). In particular we'll be querying the row `WHERE id = 7044874109`:

```sql
SELECT repo
FROM githubEvents
WHERE id = 7044874109
```

| repo                                                                                             |
| ------------------------------------------------------------------------------------------------ |
| {"id":115911530,"name":"LimeVista/Tapes","url":"https://api.github.com/repos/LimeVista/Tapes"}   |

The following examples show how to use the `JSONEXTRACTINDEX` function:

```sql
SELECT id, jsonextractindex(repo, '$.name', 'STRING') AS name
FROM githubEvents 
WHERE id = 7044874109
```

| id         | name            |
| ---------- | --------------- |
| 7044874109 | LimeVista/Tapes |

```sql
SELECT id, jsonextractindex(repo, '$.foo', 'STRING', 'dummyValue') AS name
FROM githubEvents 
WHERE id = 7044874109
```

| id         | name       |
| ---------- | ---------- |
| 7044874109 | dummyValue |

```sql
SELECT id, jsonextractindex(repo, '$.name', 'STRING', 'dummyValue', '"$.id" < 10') AS name
FROM githubEvents 
WHERE id = 7044874109
```

| id         | name            |
| ---------- | --------------- |
| 7044874109 | LimeVista/Tapes |
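Since the function is aimed at extraction across many documents, a common pattern is grouping on an extracted value rather than looking up a single row. The following is a sketch under the same assumptions (quick start `githubEvents` table, JSON index on `repo`). The `'unknown'` default and the aliases are illustrative, and no output is shown because it depends on the loaded data.

```sql
-- Count events per repository name, extracting the name from the JSON index.
-- Rows without a '$.name' value fall into the 'unknown' bucket.
SELECT jsonextractindex(repo, '$.name', 'STRING', 'unknown') AS repoName,
       COUNT(*) AS events
FROM githubEvents
GROUP BY jsonextractindex(repo, '$.name', 'STRING', 'unknown')
ORDER BY COUNT(*) DESC
LIMIT 10
```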

### Array Extraction

The examples below use the following `arrayField`:

```sql
SELECT repo
FROM myTable 
WHERE id = 123
```

| repo                                                                                                 |
| ---------------------------------------------------------------------------------------------------- |
| {"id":"xyz","arrayField": \[{"f1": 1, "f2": 2}, {"f1": 3, "f2": 4}, {"f2": 6}, {"f1": 0, "f2": 5}]}  |

Extract array values as a multi-value (MV) result:

```sql
SELECT id, jsonextractindex(repo, '$.arrayField[*].f1', 'STRING_ARRAY', '[]') AS arrayValues
FROM myTable 
WHERE id = 123
```

| id  | arrayValues |
| --- | ----------- |
| 123 | \[1, 3, 0]  |

Filtering is also allowed on array values:

```sql
SELECT id, jsonextractindex(repo, '$.arrayField[*].f1', 'STRING_ARRAY', '[]', '"$.arrayField[*].f2" > 2') AS arrayValues
FROM myTable 
WHERE id = 123
```

| id  | arrayValues |
| --- | ----------- |
| 123 | \[3, 0]     |

```sql
SELECT id, jsonextractindex(repo, '$.arrayField[*].f1', 'STRING_ARRAY', '[]', '"$.arrayField[*].nonExistant" > 99999') AS arrayValues
FROM myTable 
WHERE id = 123
```

| id  | arrayValues |
| --- | ----------- |
| 123 | \[]         |
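The numeric array result types listed in the signature table can be requested in the same way. A hedged sketch, assuming the same `myTable` document shown above and that every matched `f2` value is an integer; the `f2Values` alias is illustrative.

```sql
-- Extract every f2 value under arrayField as a multi-value INT result.
-- '[]' is the default returned when nothing can be extracted.
SELECT id, jsonextractindex(repo, '$.arrayField[*].f2', 'INT_ARRAY', '[]') AS f2Values
FROM myTable
WHERE id = 123
```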

