Suspecting a potential bug in 3.0 JDBC > HIVE

I have a pipeline with a JDBC origin and Hive as the destination.

I have a table in Hive with the columns below:

Table X
         customername string 
         address string 
         phonenumer int
         address string

The table coming from the JDBC Consumer has a different schema for the same table (more columns and different column names), something like this:

Table X
           customername string
           postalcode string
           homephonenumber string
           network xyz
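
For reference, the existing table definition can be checked on the Hive side with something like the following (a minimal HiveQL sketch, assuming the table really is named X; the output is what I compare against the schema coming from JDBC):

           -- Show the columns Hive currently has for the table,
           -- so they can be compared with what the JDBC origin sends.
           DESCRIBE FORMATTED X;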

When the table schema is different for the same table, the pipeline is expected to fail or throw an error, and it did. I then deleted the table in Hive and ran msck repair table <tableName>. The expected behaviour from the StreamSets pipeline at that point is to create a new table with the schema coming from JDBC, but it does not actually create the table; it keeps complaining that the schema is different between source and destination.
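
The manual steps look roughly like this in HiveQL (a sketch; <tableName> stands for the actual table name, and the repair only makes sense once the table exists again):

           -- Drop the Hive table whose schema no longer matches the incoming records.
           DROP TABLE IF EXISTS <tableName>;

           -- After the table has been recreated (expected to come from the pipeline),
           -- re-register any partitions already on HDFS but missing from the metastore.
           MSCK REPAIR TABLE <tableName>;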

Only when I stop and restart the pipeline is it able to create the table.

Please note that I'm not connecting to JDBC directly; I have an RPC link between the two environments.

Source pipeline: JDBCORIGIN >>> RPCDESTINATION
Destination pipeline: RPCORIGIN >>> HIVEMETADATA >> HIVE & HDFS
