# Querying JSON data

To see how JSON data can be queried, assume that we have the following table:

```
Table myTable:
  id        INTEGER
  jsoncolumn    JSON 

Table data:
101,{"name":{"first":"daffy"\,"last":"duck"}\,"score":101\,"data":["a"\,"b"\,"c"\,"d"]}
102,{"name":{"first":"donald"\,"last":"duck"}\,"score":102\,"data":["a"\,"b"\,"e"\,"f"]}
103,{"name":{"first":"mickey"\,"last":"mouse"}\,"score":103\,"data":["a"\,"b"\,"g"\,"h"]}
104,{"name":{"first":"minnie"\,"last":"mouse"}\,"score":104\,"data":["a"\,"b"\,"i"\,"j"]}
105,{"name":{"first":"goofy"\,"last":"dwag"}\,"score":104\,"data":["a"\,"b"\,"i"\,"j"]}
106,{"person":{"name":"daffy duck"\,"companies":[{"name":"n1"\,"title":"t1"}\,{"name":"n2"\,"title":"t2"}]}}
107,{"person":{"name":"scrooge mcduck"\,"companies":[{"name":"n1"\,"title":"t1"}\,{"name":"n2"\,"title":"t2"}]}}
```

We also assume that "jsoncolumn" has a [Json Index](https://docs.pinot.apache.org/basics/indexing/json-index) on it. Note that the last two rows in the table have different structure than the rest of the rows. In keeping with JSON specification, a JSON column can contain any valid JSON data and doesn't need to adhere to a predefined schema. To pull out the entire JSON document for each row, we can run the query below:

```json
SELECT id, jsoncolumn 
FROM myTable
```

| id    | jsoncolumn                                                                                                  |
| ----- | ----------------------------------------------------------------------------------------------------------- |
| "101" | "{"name":{"first":"daffy","last":"duck"},"score":101,"data":\["a","b","c","d"]}"                            |
| 102"  | "{"name":{"first":"donald","last":"duck"},"score":102,"data":\["a","b","e","f"]}                            |
| "103" | "{"name":{"first":"mickey","last":"mouse"},"score":103,"data":\["a","b","g","h"]}                           |
| "104" | "{"name":{"first":"minnie","last":"mouse"},"score":104,"data":\["a","b","i","j"]}"                          |
| "105" | "{"name":{"first":"goofy","last":"dwag"},"score":104,"data":\["a","b","i","j"]}"                            |
| "106" | "{"person":{"name":"daffy duck","companies":\[{"name":"n1","title":"t1"},{"name":"n2","title":"t2"}]}}"     |
| "107" | "{"person":{"name":"scrooge mcduck","companies":\[{"name":"n1","title":"t1"},{"name":"n2","title":"t2"}]}}" |

To drill down and pull out specific keys within the JSON column, we simply append the JsonPath expression of those keys to the end of the column name.

```json
SELECT id, jsoncolumn.name.last, jsoncolumn.name.first, jsoncolumn.data[1] 
FROM myTable
```

| id    | jsoncolumn.name.last | jsoncolumn.name.first | jsoncolumn.data\[1] |
| ----- | -------------------- | --------------------- | ------------------- |
| "101" | "duck"               | "daffy"               | "b"                 |
| "102" | "duck"               | "donald"              | "b"                 |
| "103" | "mouse"              | "mickey"              | "b"                 |
| "104" | "mouse"              | "minnie"              | "b"                 |
| "105" | "dwag"               | "goofy"               | "b"                 |
| "106" | "null"               | "null"                | "null"              |
| "107" | "null"               | "null"                | "null"              |

Note that the third column (jsoncolumn.data\[1]) is null for rows with id 106 and 107. This is because these rows have JSON documents that don't have a key with JsonPath jsoncolumn.data\[1]. We can filter out these rows.

```json
SELECT id, jsoncolumn.name.last, jsoncolumn.name.first, jsoncolumn.data[1] 
FROM myTable 
WHERE jsoncolumn.data[1] IS NOT NULL
```

| id    | jsoncolumn.name.last | jsoncolumn.name.first | jsoncolumn.data\[1] |
| ----- | -------------------- | --------------------- | ------------------- |
| "101" | "duck"               | "daffy"               | "b"                 |
| "102" | "duck"               | "donald"              | "b"                 |
| "103" | "mouse"              | "mickey"              | "b"                 |
| "104" | "mouse"              | "minnie"              | "b"                 |
| "105" | "dwag"               | "goofy"               | "b"                 |

Notice that certain last names (duck and mouse for example) repeat in the data above. We can get a count of each last name by running a GROUP BY query on a JsonPath expression.

```
SELECT jsoncolumn.name.last, count(*) 
FROM myTable 
WHERE jsoncolumn.data[1] IS NOT NULL 
GROUP BY jsoncolumn.name.last 
ORDER BY 2 DESC
```

| jsoncolumn.name.last | count(\*) |
| -------------------- | --------- |
| "mouse"              | "2"       |
| "duck"               | "2"       |
| "dwag"               | "1"       |

Also there is numerical information (jsconcolumn.score) embeded within the JSON document. We can extract those numerical values from JSON data into SQL and sum them up using the query below.

```json
SELECT jsoncolumn.name.last, sum(jsoncolumn.score) 
FROM myTable 
WHERE jsoncolumn.name.last IS NOT NULL 
GROUP BY jsoncolumn.name.last
```

| jsoncolumn.name.last | sum(jsoncolumn.score) |
| -------------------- | --------------------- |
| "mouse"              | "207"                 |
| "dwag"               | "104"                 |
| "duck"               | "203"                 |

In short, JSON querying support in Pinot will allow you to use a JsonPath expression whereever you can use a column name with the only difference being that to query a column with data type JSON, you must append a JsonPath expression after the name of the column.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.pinot.apache.org/release-0.9.0/users/user-guide-query/json-queries.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
