How do you configure a Hive / Impala JDBC driver for Data Collector?
Which driver JAR/class is supported, and how is the JDBC URI configured?
StreamSets comes bundled with the open-source Hive JDBC driver. With the default driver, URLs look like the following (in order: no authentication, username/password, Kerberos, and Kerberos with SSL):
jdbc:hive2://hive-server2-host.company.com:10000/dbName
jdbc:hive2://hive-server2-host.company.com:10000/dbName;user=username;password=*
jdbc:hive2://hive-server2-host.company.com:10000/dbName;principal=hive/hive-server2-host.company.com@COMPANY.COM
jdbc:hive2://hive-server2-host.company.com:10000/dbName;principal=hive/hive-server2-host.company.com@COMPANY.COM;ssl=true;sslTrustStore=/path/to/truststore.jks
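The four URLs above share one shape: a `jdbc:hive2://host:port/database` prefix followed by `;key=value` pairs. A minimal sketch of that pattern in Python follows; the helper name `hive2_url` is hypothetical and not part of Data Collector or the Hive driver, and it only assembles the string, it does not open a connection.

```python
def hive2_url(host, port=10000, database="", **session_vars):
    """Assemble a jdbc:hive2:// URL from its parts (illustrative helper only)."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    # Each keyword argument becomes one ';key=value' pair, in the order given.
    for key, value in session_vars.items():
        url += f";{key}={value}"
    return url


host = "hive-server2-host.company.com"

# No authentication.
print(hive2_url(host, database="dbName"))

# Kerberos with SSL, matching the last URL in the list above.
print(hive2_url(
    host,
    database="dbName",
    principal=f"hive/{host}@COMPANY.COM",
    ssl="true",
    sslTrustStore="/path/to/truststore.jks",
))
```

The resulting string is what goes into the JDBC connection string field of the stage.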
Cloudera also provides its own Hive JDBC driver. To install it, install one of the CDH stage libraries. With the Cloudera Hive driver, the equivalent URLs are:
jdbc:hive2://hive-server2-host.company.com:10000/dbName
jdbc:hive2://hive-server2-host.company.com:10000/dbName;AuthMech=3;UID=username;PWD=*
jdbc:hive2://hive-server2-host.company.com:10000/dbName;AuthMech=1;KrbRealm=COMPANY.COM;KrbHostFQDN=hive-server2-host.company.com;KrbServiceName=hive
jdbc:hive2://hive-server2-host.company.com:10000/dbName;AuthMech=1;KrbRealm=COMPANY.COM;KrbHostFQDN=hive-server2-host.company.com;KrbServiceName=hive;SSL=1;SSLKeyStore=/path/to/truststore.jks
Cloudera also has an Impala JDBC driver. Download it from Cloudera and install it into Data Collector (SDC). With the Cloudera Impala driver, URLs look like this:
jdbc:impala://impala-daemon-host.company.com:21050/dbName
jdbc:impala://impala-daemon-host.company.com:21050/dbName;AuthMech=3;UID=username;PWD=*
jdbc:impala://impala-daemon-host.company.com:21050/dbName;AuthMech=1;KrbRealm=COMPANY.COM;KrbHostFQDN=impala-daemon-host.company.com;KrbServiceName=impala
jdbc:impala://impala-daemon-host.company.com:21050/dbName;AuthMech=1;KrbRealm=COMPANY.COM;KrbHostFQDN=impala-daemon-host.company.com;KrbServiceName=impala;SSL=1;SSLKeyStore=/path/to/truststore.jks
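The Cloudera Hive and Impala drivers use the same property names and differ only in scheme, port, and Kerberos service name. In these drivers `AuthMech=3` selects username/password and `AuthMech=1` selects Kerberos. The sketch below rebuilds the URLs above from those pieces; the function `cloudera_url` and its parameter names are made up for illustration, and it only builds the string.

```python
def cloudera_url(scheme, host, port, database, user=None, password=None,
                 kerberos_realm=None, service_name=None, keystore=None):
    """Build a Cloudera-driver-style JDBC URL (illustrative helper only)."""
    props = []
    if kerberos_realm:
        # AuthMech=1: Kerberos
        props += [("AuthMech", "1"), ("KrbRealm", kerberos_realm),
                  ("KrbHostFQDN", host), ("KrbServiceName", service_name)]
    elif user:
        # AuthMech=3: username and password
        props += [("AuthMech", "3"), ("UID", user), ("PWD", password)]
    if keystore:
        props += [("SSL", "1"), ("SSLKeyStore", keystore)]
    suffix = "".join(f";{key}={value}" for key, value in props)
    return f"jdbc:{scheme}://{host}:{port}/{database}{suffix}"


# Hive with username/password.
print(cloudera_url("hive2", "hive-server2-host.company.com", 10000, "dbName",
                   user="username", password="*"))

# Impala with Kerberos and SSL.
print(cloudera_url("impala", "impala-daemon-host.company.com", 21050, "dbName",
                   kerberos_realm="COMPANY.COM", service_name="impala",
                   keystore="/path/to/truststore.jks"))
```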
All Hadoop distributions ship with the hive-jdbc driver pre-packaged. The examples in Jeff's answer work not only for Cloudera but for any distribution where you use the pre-packaged Hive JDBC driver.
You can also use the hive-jdbc driver to connect directly to Impala:
Unsecured: jdbc:hive2://myhost.example.com:21050/;auth=noSasl
Kerberos: jdbc:hive2://myhost.example.com:21050/;principal=impala/myhost.example.com@H2.EXAMPLE.COM
LDAP Auth: jdbc:hive2://myhost.example.com:21050/test_db;user=fred;password=xyz123
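When a connection fails, it can help to split the URL back into its parts and check the port and the authentication properties. The following is a rough sketch, assuming the URL has the `jdbc:<scheme>://host:port/db;key=value` shape used throughout this thread; `split_jdbc_url` is a hypothetical helper, not a driver API.

```python
def split_jdbc_url(url):
    """Split 'jdbc:<scheme>://host:port/db;k=v;...' into its parts."""
    scheme, rest = url[len("jdbc:"):].split("://", 1)
    # Split on the first '/' only, since a Kerberos principal also contains '/'.
    authority, _, tail = rest.partition("/")
    host, _, port = authority.partition(":")
    database, *pairs = tail.split(";")
    params = dict(pair.split("=", 1) for pair in pairs if pair)
    return {
        "scheme": scheme,
        "host": host,
        "port": int(port) if port else None,
        "database": database,
        "params": params,
    }


parts = split_jdbc_url("jdbc:hive2://myhost.example.com:21050/;auth=noSasl")
print(parts["port"])    # 21050, the Impala daemon port rather than HiveServer2's 10000
print(parts["params"])  # {'auth': 'noSasl'}
```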
If you do decide to use the Impala driver, make sure you install and configure it in the External Directory for Data Collector.
Asked: 2017-05-08 16:48:26 -0600
Last updated: Sep 05 '17