arrow-left

All pages
gitbookPowered by GitBook
1 of 1

Loading...

Visualize data with Redash

  1. Install Redash and start a running instance, following the Docker Based Developer Installation Guidearrow-up-right.

  2. Configure Redash to query Pinot, by doing the following:

    1. Add pinotdb dependency

  3. Create visualizations, by doing the following:

hashtag
Add pinot db dependency

Apache Pinot provides a Python client library pinotdb to query Pinot from Python applications. Install pinotdb inside the Redash worker instance to make network calls to Pinot.

  1. Navigate to the root directory where you’ve cloned Redash. Run the following command to get the name of the Redash worker container (by default, redash_worker_1):

docker-compose ps

  1. Run the following command (change redash_worker_1 to your own Redash worker container name, if applicable):

  1. Restart Docker.

hashtag
Add Python data source for Pinot

  1. In Redash, select Settings > Data Sources.

  2. Select New Data Source, and then select Python from the list.

  3. On the Redash Settings - Data Source page, add Pinot as the name of the data source, enter pinotdb

hashtag
Start Pinot

Run the following command in a new terminal to spin up an Apache Pinot Docker container in the quick start mode with a baseball stats dataset built in.

hashtag
Run a query in Redash

  1. In Redash, select Queries > New Query, and then select the Python data source you created in .

  2. Add Python code to query data. For more information, see the .

  3. Click Execute to run the query and view results.

You can also include libraries like Pandas to perform more advanced data manipulation on Pinot’s data and visualize the output with Redash.

For more information, see in Redash documentation.

hashtag
Example Python queries

hashtag
Query top 10 teams by total runs

The following query connects to Pinot and queries the baseballStats table to retrieve the top ten players with the highest scores. The results are transformed into a dictionary format supported by Redash.

hashtag
Query top 10 teams by total runs

hashtag
Query total strikeouts by year

hashtag
Add a visualization and dashboard in Redash

hashtag
Add a visualization

In Redash, after you've ran your query, click the New Visualization tab, and select the type of visualization your want to create, for example, Bar Chart. The Visualization Editor appears with your chart.

For example, you may want to create a bar chart to view the top 10 players with highest scores.

You may want to create a line chart to view the total variation in strikeouts over time.

For more information, see .

hashtag
Add a dashboard

Create a dashboard with one or more visualizations (widgets).

  1. In Redash, go to Dashboards > New Dashboards.

  2. Add the widgets to your dashboard. For example, by adding the three visualizations from the above, you create a Baseball stats dashboard.

For more information, see in the Redash documentation.

in the
Modules to import prior to running the script
field.
  • Enter the following optional fields as needed:

    • AdditionalModulesPaths: Enter a comma-separated list of absolute paths on the Redash server to Python modules to make available when querying from Redash. Useful for private modules unavailable in pip.

    • AdditionalBuiltins: Specify additional built-in functions as needed. By default, Redash automatically includes 25 Python built-in functions.

  • Click Save.

  • Add a Python data source for Pinot
    Start Pinot
    Query in Redash
    Add a Python data source for Pinot
    Python query runnerarrow-up-right
    Queryingarrow-up-right
    Visualizationsarrow-up-right
    three example queries
    Dashboardsarrow-up-right
    Bar chart configuration
    Baseball stats dashboard
    docker exec -it redash_worker_1 /bin/sh                                
    pip install pinotdb
    docker run \
      --name pinot-quickstart \
      -p 2123:2123 \
      -p 9000:9000 \
      -p 8000:8000 \
      apachepinot/pinot:0.9.3 QuickStart -type batch
    from pinotdb import connect
    
    conn = connect(host='host.docker.internal', port=8000, path='/query/sql', scheme='http')
    curs = conn.cursor()
    curs.execute("""
        select 
    playerName, sum(runs) as total_runs
    from baseballStats
    group by playerName
    order by total_runs desc
    limit 10
    """)
    
    result = {}
    result['columns'] = [
        {
          "name": "player_name",
          "type": "string",
          "friendly_name": "playerName"
        },
        {
          "name": "total_runs",
          "type": "integer",
          "friendly_name": "total_runs"
        }
      ]
    
    rows = []
    
    for row in curs:
        record = {}
        record['player_name'] = row[0]
        record['total_runs'] = row[1]
    
    
        rows.append(record)
    
    result["rows"] = rows
    from pinotdb import connect
    
    conn = connect(host='host.docker.internal', port=8000, path='/query/sql', scheme='http')
    curs = conn.cursor()
    curs.execute("""
        select 
    teamID, sum(runs) as total_runs
    from baseballStats
    group by teamID
    order by total_runs desc
    limit 10
    """)
    
    result = {}
    result['columns'] = [
        {
          "name": "teamID",
          "type": "string",
          "friendly_name": "Team"
        },
        {
          "name": "total_runs",
          "type": "integer",
          "friendly_name": "Total Runs"
        }
      ]
    
    rows = []
    
    for row in curs:
        record = {}
        record['teamID'] = row[0]
        record['total_runs'] = row[1]
    
    
        rows.append(record)
    
    result["rows"] = rows
    from pinotdb import connect
    
    conn = connect(host='host.docker.internal', port=8000, path='/query/sql', scheme='http')
    curs = conn.cursor()
    curs.execute("""
        select 
    yearID, sum(strikeouts) as total_so
    from baseballStats
    group by yearID
    order by yearID asc
    limit 1000
    """)
    
    result = {}
    result['columns'] = [
        {
          "name": "yearID",
          "type": "integer",
          "friendly_name": "Year"
        },
        {
          "name": "total_so",
          "type": "integer",
          "friendly_name": "Total Strikeouts"
        }
      ]
    
    rows = []
    
    for row in curs:
        record = {}
        record['yearID'] = row[0]
        record['total_so'] = row[1]
    
    
        rows.append(record)
    
    result["rows"] = rows
    Add a visualization and dashboard in Redash
    Redash Settings - Data Sources