Hive Metadata Processor doesn't see changes in Hive Metastore

asked 2018-01-19 10:42:22 -0600

Roh

updated 2018-01-19 16:16:46 -0600

metadaddy

I have a pipeline with a JDBC origin and Hive as the destination.

I have a table in Hive with the columns below:

Table X
         customername string 
         address string 
         phonenumer int
         address string

The table coming from the JDBC Consumer has a different schema for the same table (more columns and different column names), something like this:

Table X
           customername string
           postalcode string
           homephonenumber string
           network xyz

When the schema differs for the same table, I expect the pipeline to fail or throw an error, and it did. I then deleted the table in Hive and ran msck repair table <tableName>. At that point I expected the StreamSets pipeline to create a new table with the schema coming from JDBC, but it did not create the table and kept complaining that the source and destination schemas differ.

After I stopped and restarted the pipeline, it was able to create the table.

Please note that I'm not directly connecting to JDBC, I have an RPC between two environments.

Source pipeline: JDBCORIGIN >>> RPCDESTINATION
Destination pipeline: RPCORIGIN >>> HIVEMETADATA >>> HIVE & HDFS


1 Answer


answered 2018-01-19 16:15:56 -0600

metadaddy

updated 2018-01-19 16:17:12 -0600

This is expected behavior. To improve performance, the Hive Metadata processor caches responses from the Hive Metastore. If you make a change directly to the Hive Metastore, you must restart the pipeline so that the processor flushes its cache.
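To illustrate why the processor keeps reporting the old schema, here is a minimal Python sketch of the general caching pattern. It is not StreamSets code: FakeMetastore and CachingMetadataProcessor are hypothetical names made up for this example, and the real processor's cache is more involved. The sketch only shows that a cached lookup does not notice a table dropped behind its back, while a fresh instance (the equivalent of a pipeline restart) does.

```python
class FakeMetastore:
    """Hypothetical stand-in for the Hive Metastore: table name -> columns."""

    def __init__(self):
        self.tables = {}

    def get_schema(self, table):
        # Returns None when the table does not exist.
        return self.tables.get(table)


class CachingMetadataProcessor:
    """Hypothetical processor that caches metastore lookups for speed."""

    def __init__(self, metastore):
        self.metastore = metastore
        self.cache = {}

    def schema_for(self, table):
        # Only the first lookup per table reaches the metastore.
        if table not in self.cache:
            self.cache[table] = self.metastore.get_schema(table)
        return self.cache[table]


metastore = FakeMetastore()
metastore.tables["X"] = {"customername": "string", "address": "string"}

processor = CachingMetadataProcessor(metastore)
before_drop = processor.schema_for("X")

# Drop the table directly in the metastore, outside the pipeline.
del metastore.tables["X"]
after_drop = processor.schema_for("X")  # still answered from the cache

# A restart is modelled here as a new processor with an empty cache.
restarted = CachingMetadataProcessor(metastore)
after_restart = restarted.schema_for("X")  # the table is now seen as missing
```

In this sketch after_drop still holds the old columns, which mirrors the "schema is different" complaint, and after_restart is None, which is the point at which a pipeline could decide to create the table from the incoming schema.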

