arrow-left

All pages
gitbookPowered by GitBook
1 of 1

Loading...

Lookup UDF Join

For more information about using JOINs with the multi-stage query engine, see JOINs.

circle-info

Lookup UDF Join is only supported with the single-stage query engine (v1). For more information about using JOINs with the multi-stage query engine, see JOINs.

Lookup UDF is used to get dimension data via primary key from a dimension table allowing a decoration join functionality. Lookup UDF can only be used with a dimension table in Pinot.

hashtag
Syntax

The UDF function syntax is listed as below:

  • dimTable Name of the dim table to perform the lookup on.

  • dimColToLookUp The column name of the dim table to be retrieved to decorate our result.

  • dimJoinKey

Noted that:

  1. all the dim-table-related expressions are expressed as literal strings, this is the LOOKUP UDF syntax limitation: we cannot express column identifier which doesn't exist in the query's main table, which is the factTable table.

  2. the syntax definition of [ '''dimJoinKey''', factJoinKey ]* indicates that if there are multiple dim partition columns, there should be multiple join key pair expressed.

hashtag
Examples

Here are some of the examples

hashtag
Single-partition-key-column Example

Consider the table baseballStats

Column
Type

and dim table dimBaseballTeams

Column
Type

several acceptable queries are:

hashtag
Dim-Fact LOOKUP example

playerName
teamID
teamName
teamAddress

hashtag
Self LOOKUP example

teamID
nameFromLocal
nameFromLookup

hashtag
Complex-partition-key-columns Example

Consider a single dimension table with schema:

BILLING SCHEMA

Column
Type

hashtag
Self LOOKUP example

customerId
missedPayment
lookedupCity

hashtag
Usage FAQ

  • The data return type of the UDF will be that of the dimColToLookUp column type.

  • when multiple primary key columns are used for the dimension table (e.g. composite primary key), ensure that the order of keys appearing in the lookup() UDF is the same as the order defined in the primaryKeyColumns from the dimension table schema.

The column name on which we want to perform the lookup i.e. the join column name for dim table.
  • factJoinKey The column name on which we want to perform the lookup against e.g. the join column name for fact table

  • INT

    numberOfGames

    INT

    numberOfGamesAsBatter

    INT

    AtBatting

    INT

    runs

    INT

    Seattle Mariners (since 1977) or Seattle Pilots (1969)

    1250 First Avenue South, Seattle, WA

    David Allan

    SEA

    Seattle Mariners (since 1977) or Seattle Pilots (1969)

    1250 First Avenue South, Seattle, WA

    Baltimore Orioles (original- 1901–1902 current- since 1954)

    Baltimore Orioles (original- 1901–1902 current- since 1954)

    STRING

    maritalStatus

    STRING

    buildingType

    STRING

    missedPayment

    STRING

    billingMonth

    STRING

    Paid

    Cupertino

    435

    Paid

    Cupertino

    playerID

    STRING

    yearID

    INT

    teamID

    STRING

    league

    STRING

    playerName

    STRING

    teamID

    STRING

    teamName

    STRING

    teamAddress

    STRING

    David Allan

    BOS

    Boston Red Caps/Beaneaters (from 1876–1900) or Boston Red Sox (since 1953)

    4 Jersey Street, Boston, MA

    David Allan

    CHA

    null

    null

    David Allan

    ANA

    Anaheim Angels

    Anaheim Angels

    ARI

    Arizona Diamondbacks

    Arizona Diamondbacks

    ATL

    Atlanta Braves

    Atlanta Braves

    customerId

    INT

    creditHistory

    STRING

    firstName

    STRING

    lastName

    STRING

    isCarOwner

    BOOLEAN

    341

    Paid

    Palo Alto

    374

    Paid

    Mountain View

    398

    Paid

    Palo Alto

    playerStint

    SEA

    BAL

    city

    427

    lookupUDFSpec:
        LOOKUP
        '('
        '''dimTable'''
        '''dimColToLookup'''
        [ '''dimJoinKey''', factJoinKey ]*
        ')'
    SELECT 
      playerName, 
      teamID, 
      LOOKUP('dimBaseballTeams', 'teamName', 'teamID', teamID) AS teamName, 
      LOOKUP('dimBaseballTeams', 'teamAddress', 'teamID', teamID) AS teamAddress
    FROM baseballStats 
    SELECT 
      teamID, 
      teamName AS nameFromLocal,
      LOOKUP('dimBaseballTeams', 'teamName', 'teamID', teamID) AS nameFromLookup
    FROM dimBaseballTeams
    select 
      customerId,
      missedPayment, 
      LOOKUP('billing', 'city', 'customerId', customerId, 'creditHistory', creditHistory) AS lookedupCity 
    from billing