Hive Streaming Support with Parquet File Format

Hi, we want to write data from Kafka into HDFS in Parquet file format, with the files mapped to a Hive table. Using Hive Streaming is one option for avoiding the small-files problem. However, it supports only the ORC file format (this is a Hive Streaming limitation, not a StreamSets limitation). Kafka has an HDFS Sink Connector that achieves this. Does StreamSets have a similar option?

Regards,
Ravi.P
