Batch

Segment Fetchers

When pinot segment files are created in external systems (hadoop/spark/etc), there are several ways to push those data to pinot Controller and Server:

  1. push segment to shared NFS and let pinot pull segment files from the location of that NFS.

  2. push segment to a Web server and let pinot pull segment files from the Web server with http/https link.

  3. push segment to HDFS and let pinot pull segment files from HDFS with hdfs location uri.

  4. push segment to other system and implement your own segment fetcher to pull data from those systems.

The first two options should be supported out of the box with pinot package. As long your remote jobs send Pinot controller with the corresponding URI to the files it will pick up the file and allocate it to proper Pinot Servers and brokers. To enable Pinot support for HDFS, you will need to provide Pinot Hadoop configuration and proper Hadoop dependencies.

Last updated