Reference: Difference between StreamX and Secor


StreamX is an ingestion service released under an Apache open source license. It is built on the Kafka Connect framework, described below, and uses it to stream data scalably and reliably between Apache Kafka and other systems.

Kafka Connect

Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. It makes it simple to quickly define connectors that move large collections of data into and out of Kafka. Kafka Connect can ingest entire databases or collect metrics from all your application servers into Kafka topics, making the data available for stream processing with low latency. An export job can deliver data from Kafka topics into secondary storage and query systems or into batch systems for offline analysis.
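As an illustration of how a connector is defined, the sketch below builds the JSON body that Kafka Connect's REST interface accepts on `POST /connectors`. It is a minimal sketch under stated assumptions, not a reference configuration: the worker address, the connector name, the bucket, and in particular the `connector.class` value and the `s3.url` key are assumptions for illustration and should be checked against the StreamX release in use. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: a Kafka Connect worker listening on its default REST port.
CONNECT_URL = "http://localhost:8083"


def build_connector_request(name, topics, s3_url, base_url=CONNECT_URL):
    """Build (but do not send) a POST /connectors request for Kafka Connect."""
    payload = {
        "name": name,
        "config": {
            # Assumed class name and storage key; verify against your StreamX build.
            "connector.class": "com.qubole.streamx.s3.S3SinkConnector",
            "s3.url": s3_url,
            # Standard Kafka Connect sink settings.
            "tasks.max": "1",
            "topics": ",".join(topics),
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical connector name, topic, and bucket.
request = build_connector_request(
    "clickstream-to-s3", ["clickstream"], "s3://example-bucket/streamx"
)

# To submit it to a running worker:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping the connector definition as data rather than a local properties file is what allows jobs to be created, inspected, and removed on a running cluster through the same REST interface.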


Secor is a zero data loss log persistence service that persists Kafka logs to long-term storage such as Amazon S3. It’s not affected by S3’s weak eventual consistency model, incurs no data loss, scales horizontally, and optionally partitions data based on date. Please note that Secor is being deprecated.


Though StreamX and Secor both move data from Kafka to other systems, there are many differences between them. The following table presents the key differences between StreamX and Secor:






Feature                StreamX                            Secor
---------------------  ---------------------------------  -----------------------------------
Built on               Kafka Connect
Job Scheduling         REST API of Kafka Connect          Configuration files (to be checked)
Delivery Semantics     Exactly once                       At least once
Persist Destination    Currently S3, other object stores  S3 only
Output Formats         Avro, Parquet                      CSV, Avro
Scalability            Scalable by default                Scalable by default
Partition add in Hive
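The delivery semantics entries are the ones most likely to affect downstream consumers. The following is a minimal sketch of the general idea, not the actual code of StreamX or Secor: under at-least-once delivery a retried write can store the same records twice, whereas a sink that names each output object after the Kafka offsets it contains makes the retry overwrite the earlier copy. The function names, key layout, and in-memory `dict` standing in for object storage are all hypothetical.

```python
def naive_write(store, records, attempt):
    """At-least-once style: every attempt writes under a fresh object name."""
    key = f"logs/batch-attempt-{attempt}"
    store[key] = [value for _, value in records]


def offset_named_write(store, topic, partition, records):
    """Idempotent style: the object name is derived from the offset range."""
    start_offset = records[0][0]
    end_offset = records[-1][0]
    key = f"{topic}/{partition}/{start_offset:010d}-{end_offset:010d}"
    store[key] = [value for _, value in records]


# One batch of (offset, value) pairs read from a single partition.
batch = [(100, "a"), (101, "b"), (102, "c")]

duplicating_store = {}
idempotent_store = {}

# The second pass simulates a retry after a lost acknowledgement.
for attempt in (1, 2):
    naive_write(duplicating_store, batch, attempt)
    offset_named_write(idempotent_store, "clickstream", 0, batch)

stored_naive = sum(len(values) for values in duplicating_store.values())
stored_idempotent = sum(len(values) for values in idempotent_store.values())
print(stored_naive, stored_idempotent)  # 6 3
```

In the first store the retry doubles the record count, so consumers must deduplicate; in the second the retry is harmless because both attempts resolve to the same object name.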




StreamX-as-a-Service from Qubole

Qubole is also adding support for StreamX as a managed service on the Qubole Data Service (QDS), a self-service platform for big data analytics that runs on the Amazon Web Services, Google Compute Engine and Microsoft Azure clouds.

