File Systems
This section contains a collection of short guides to show you how to import from a Pinot supported file system.
Last updated
Was this helpful?
This section contains a collection of short guides to show you how to import from a Pinot supported file system.
Last updated
Was this helpful?
FileSystem is an abstraction provided by Pinot to access data in distributed file systems (DFS).
Pinot uses distributed file systems for the following purposes:
Batch Ingestion Job - To read the input data (CSV, Avro, Thrift, etc.) and to write generated segments to DFS
Controller - When a segment is uploaded to the controller, the controller saves it in the DFS configured.
Server - When a server(s) is notified of a new segment, the server copies the segment from remote DFS to their local node using the DFS abstraction.
Pinot lets you choose a distributed file system provider. The following file systems are supported by Pinot:
To use a distributed file system, you need to enable plugins. To do that, specify the plugin directory and include the required plugins -
Now, You can proceed to change the filesystem in the controller
and server
config as shown below:
scheme
refers to the prefix used in the URI of the filesystem. e.g. for the URI s3://bucket/path/to/file
, the scheme is s3
You can also change the filesystem during ingestion. In the ingestion job spec, specify the filesystem with the following config: