Reference: Kafka support in Presto clusters

Description

Qubole packages a Kafka connector for Presto clusters.

The configuration files and the table definition files (mentioned in the tutorials) need to be created under the "Override Presto Configuration" option under "Presto Settings" on the Cluster Configuration page in the UI.

Pushing these configuration settings to a running cluster is not supported, so add the configuration first and then restart the cluster.

 

Links

• Information on configuring the Kafka connector can be found here:

-> https://prestodb.io/docs/0.142/connector/kafka.html

• Kafka tutorials can be found here:

-> https://prestodb.io/docs/0.142/connector/kafka-tutorial.html

 

Example

1. Adding the following to the Presto overrides in the cluster configuration:

->

catalog/kafka.properties:

connector.name=kafka

kafka.table-names=table1,table2

kafka.nodes=host1:port,host2:port

creates the etc/catalog/kafka.properties file described in:

-> https://prestodb.io/docs/0.142/connector/kafka.html
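
As a rough illustration of the override layout in step 1, the Python sketch below renders a relative file name and its key/value pairs into that shape. The function name render_properties_override and the broker addresses are hypothetical, and the exact whitespace the override box accepts is an assumption based on the snippet above.

```python
def render_properties_override(path, props):
    """Render one file section: a 'path:' header followed by key=value lines."""
    lines = [f"{path}:"]
    for key, value in props.items():
        # Join list values with commas, as in kafka.table-names and kafka.nodes.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        lines.append(f"{key}={value}")
    return "\n".join(lines)


kafka_override = render_properties_override(
    "catalog/kafka.properties",
    {
        "connector.name": "kafka",
        "kafka.table-names": ["table1", "table2"],
        # Placeholder broker addresses; 9092 is only an example port.
        "kafka.nodes": ["host1:9092", "host2:9092"],
    },
)
print(kafka_override)
```

The printed text is what would be pasted into the overrides box. The cluster still has to be restarted afterwards, as noted in the description.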

 

2. Similarly, the etc/kafka/tpch.customer.json file described in:

-> https://prestodb.io/docs/0.142/connector/kafka-tutorial.html

can be created by adding the following to the Presto overrides:

->

kafka/tpch.customer.json:

{ "tableName": "customer", "schemaName": "tpch", .... }
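
Because hand-edited JSON is easy to break, one option is to generate the table definition section from a dictionary. The Python sketch below is a minimal illustration, not a complete definition. The helper render_json_override is hypothetical, the topicName value is an assumed topic name, and the key and message field mappings that a real table needs are left to the tutorial linked above.

```python
import json


def render_json_override(path, document):
    """Render one file section: a 'path:' header followed by single-line JSON."""
    # json.dumps takes care of quoting, commas, and braces.
    return f"{path}:\n{json.dumps(document)}"


# Hypothetical, deliberately incomplete table definition; see the tutorial
# for the key and message field mappings a real table needs.
customer_table = {
    "tableName": "customer",
    "schemaName": "tpch",
    "topicName": "tpch.customer",  # assumed topic name
}

table_override = render_json_override("kafka/tpch.customer.json", customer_table)
print(table_override)
```

The output keeps the JSON on a single line, matching the shape of the snippet above. Whether the overrides box also accepts multi-line JSON is not covered here.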

 

More information about the Presto overrides section is available here:

-> http://docs.qubole.com/en/latest/user-guide/presto/configuring-presto-cluster.html#custom-configuration

 

Because the Kinesis connector is similar to the Kafka connector, the following post can also be useful for working out how to write such configurations in the cluster config:

-> https://blogs.aws.amazon.com/bigdata/post/Tx2DDFNHXSAAH2G/Presto-Amazon-Kinesis-Connector-for-Interactively-Querying-Streaming-Data
