Drift Synchronization Solution for Hive

Is it possible to build the same solution on the AWS ecosystem?

  1. Instead of HDFS (Hadoop FS or MapR FS destination), is it possible to use AWS S3?
  2. Is it possible to use AWS Glue Metastore instead of Hive Metastore?

Is it possible to customize the existing solution? For example, are there certain operators that could help me achieve the same outcome, even if it takes more work? My constraint is that we already have a functional ingestion system, and I am doing a POC with StreamSets to see whether it can improve our existing solution.

Reference: https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Hive_Drift_Solution/HiveDriftSolution_title.html
