Looking at the Amazon EMR documentation, it says "The AWS Glue Data Catalog provides a unified metadata repository across a variety of data sources and data formats, integrating with Amazon EMR as well as Amazon RDS, Amazon Redshift, Redshift Spectrum, Athena, and any application compatible with the Apache Hive metastore." [my emphasis].

I would start with a working Hadoop/Hive pipeline, replace the HDFS destination with an Amazon S3 destination, then point the Hive Metastore destination at the AWS Glue Hive endpoint, and see what happens.
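For the EMR side of that experiment, the AWS documentation describes pointing Hive at the Glue Data Catalog through a `hive-site` configuration classification. Below is a minimal sketch that builds that configuration as JSON. The helper name `glue_metastore_configurations` and its `include_spark` flag are made up for illustration, and I have not verified that the Hive Metastore destination in the pipeline honors this setting.

```python
import json

# Client factory class that the AWS EMR documentation gives for using the
# Glue Data Catalog as the Hive metastore.
GLUE_CLIENT_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_metastore_configurations(include_spark=False):
    """Build an EMR 'Configurations' list that points Hive at Glue.

    Hypothetical helper: the name and the include_spark flag are
    illustrative, not part of any AWS SDK.
    """
    properties = {"hive.metastore.client.factory.class": GLUE_CLIENT_FACTORY}
    configurations = [
        {"Classification": "hive-site", "Properties": dict(properties)}
    ]
    if include_spark:
        # Spark SQL on EMR reads the same property from spark-hive-site.
        configurations.append(
            {"Classification": "spark-hive-site", "Properties": dict(properties)}
        )
    return configurations


if __name__ == "__main__":
    # Print JSON that could be supplied as cluster configurations at launch.
    print(json.dumps(glue_metastore_configurations(), indent=2))
```

The printed JSON would be supplied as the cluster's software configuration when the EMR cluster is created; the S3 destination and the rest of the pipeline are unaffected by it.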