
User sdc does not have privileges for DESCDATABASE

asked 2019-05-16 14:26:49 -0500

Prasanna

Hello

We are trying out StreamSets in our Hadoop cluster and are having some difficulty with access.

Details on cluster

  • CDH 5.15.x Kerberized/LDAP
  • Sentry enabled
  • JKS/JTS set up (key & trust stores)
  • SSL Enabled
  • Encryption at rest enabled

We added the StreamSets parcel through Cloudera Manager and have it up and running. We are able to read from HDFS (Hadoop FS Standalone origin) and write to HDFS.

  • sdc has been added to the Hive proxy user list
  • hadoop.kms.proxyuser.sdc.users and hadoop.kms.proxyuser.sdc.hosts have been set in the KMS config
  • sdc has been added to the Sentry admins and valid connected users lists
  • sdc has been allowed to decrypt data from encryption zones

However, sdc is not in any LDAP group, so we cannot add it to a Sentry role. Should sdc be an LDAP user?

sdc is another service in our cluster, similar to (say) Spark or Impala. We do not have Spark entries in Sentry, yet Spark works fine with Hive tables. What are we missing to let sdc access our Hive data?

So far we are not able to read from or write to any of the Hive tables. Here is the error:

com.streamsets.pipeline.api.base.OnRecordErrorException: HIVE_23 - TBL Properties 'com.streamsets.pipeline.stage.lib.hive.exceptions.HiveStageCheckedException: HIVE_20 - Error executing SQL: DESCRIBE DATABASE `db_sdc`, Reason:Error while compiling statement: FAILED: SemanticException No valid privileges
 User sdc does not have privileges for DESCDATABASE
 The required privileges: Server=server1->Db=db_sdc->action=select;Server=server1->Db=db_sdc->action=insert;' Mismatch: Actual: {} , Expected: {}
    at com.streamsets.pipeline.stage.processor.hive.HiveMetadataProcessor.process(HiveMetadataProcessor.java:589)
    at com.streamsets.pipeline.api.base.RecordProcessor.process(RecordProcessor.java:52)
    at com.streamsets.pipeline.api.base.configurablestage.DProcessor.process(DProcessor.java:35)
    at com.streamsets.datacollector.runner.StageRuntime.lambda$execute$2(StageRuntime.java:286)
    at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:235)
    at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:298)
    at com.streamsets.datacollector.runner.StagePipe.process(StagePipe.java:219)
    at com.streamsets.datacollector.runner.preview.PreviewPipelineRunner.lambda$runSourceLessBatch$0(PreviewPipelineRunner.java:348)
    at com.streamsets.datacollector.runner.PipeRunner.acceptConsumer(PipeRunner.java:221)
    at com.streamsets.datacollector.runner.PipeRunner.executeBatch(PipeRunner.java:142)
    at com.streamsets.datacollector.runner.preview.PreviewPipelineRunner.runSourceLessBatch(PreviewPipelineRunner.java:344)
    at com.streamsets.datacollector.runner.preview.PreviewPipelineRunner.processBatch(PreviewPipelineRunner.java:269)
    at com.streamsets.datacollector.runner.StageRuntime$3.run(StageRuntime.java:370)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.streamsets.datacollector.runner.StageRuntime.processBatch(StageRuntime.java:366)
    at com.streamsets.datacollector.runner.StageContext.processBatch(StageContext.java:270)
    at com.streamsets.pipeline.lib.dirspooler.SpoolDirRunnable.produce(SpoolDirRunnable.java:303)
    at com.streamsets.pipeline.lib.dirspooler.SpoolDirRunnable.run(SpoolDirRunnable.java:142)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at com.streamsets.pipeline.lib ...

1 Answer


answered 2019-05-17 07:22:44 -0500

Mark Brooks

updated 2019-05-21 15:13:43 -0500

[This first answer has been edited to reflect the info from the full thread below]

SDC supports Hadoop impersonation (as described here), which covers direct interaction with HDFS but does not apply to Hive. Hive impersonation, if enabled, must be configured in the Hive JDBC driver URL (see below).

If Hive impersonation is disabled (as is the case when Sentry is used), SDC submits Hive queries as the "sdc" user. If the sdc user does not have Sentry permissions, the result is the error you encountered, and the fix is to grant the sdc user Sentry permissions.
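To see exactly which privileges Sentry is asking for, the "required privileges" part of the error can be pulled apart mechanically. The following is a minimal Python sketch, assuming the message has the same shape as the one in the question; the function name `required_privileges` is made up for illustration, and the pattern only handles database-level entries like the ones shown there.

```python
import re

# Matches entries such as "Server=server1->Db=db_sdc->action=select".
# Assumption: database-level entries only, as in the error from the question.
PRIVILEGE_PATTERN = re.compile(r"Server=(\w+)->Db=(\w+)->action=(\w+)")


def required_privileges(message):
    """Return (server, database, action) tuples found in a Sentry error message."""
    return PRIVILEGE_PATTERN.findall(message)


# Text copied from the error in the question.
message = (
    "User sdc does not have privileges for DESCDATABASE\n"
    " The required privileges: Server=server1->Db=db_sdc->action=select;"
    "Server=server1->Db=db_sdc->action=insert;"
)

for server, database, action in required_privileges(message):
    print(f"{action.upper()} on database {database} (server {server})")
```

For the error above, this yields SELECT and INSERT on db_sdc, which are the privileges the sdc user's role would need.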

To grant Sentry permissions to the sdc user, first find out which groups the sdc user belongs to, typically with the Linux "groups" command. Then execute a Sentry statement like "GRANT ROLE [SENTRY-ROLE] TO GROUP [SDC-GROUP]".
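As a rough illustration of the statements involved, here is a small Python sketch that builds the Sentry SQL for one role and one group. The role name `sdc_role` and the group name `sdc` are hypothetical placeholders; `db_sdc` and the SELECT/INSERT privileges come from the error in the question. The statements would be run by a Sentry admin, for example through Beeline.

```python
def sentry_grant_statements(role, group, database, privileges=("SELECT", "INSERT")):
    """Build the Sentry SQL that gives a group access to one database via a role."""
    statements = [
        f"CREATE ROLE {role}",
        f"GRANT ROLE {role} TO GROUP {group}",
    ]
    for privilege in privileges:
        statements.append(
            f"GRANT {privilege} ON DATABASE {database} TO ROLE {role}"
        )
    return statements


# Hypothetical role and group names; substitute the group that sdc really belongs to.
for statement in sentry_grant_statements("sdc_role", "sdc", "db_sdc"):
    print(statement + ";")
```

Sentry grants privileges to roles and roles to groups, never directly to users, which is why the group membership of sdc matters here.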

If Sentry is not used and Hive Impersonation is enabled, one can simply use Hive impersonation.

Hive impersonation configuration depends on the driver used:

1) For the bundled Hive JDBC Driver, specify the following JDBC property: "hive.server2.proxy.user" described here

2) For the Cloudera JDBC Driver, use the DelegationUID property instead as described here
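To make the difference between the two drivers concrete, here is a hedged Python sketch that appends the matching impersonation property to a JDBC URL. The helper name `with_impersonation`, the host `hs2.example.com`, and the user `alice` are assumptions for illustration only; the property names are the two mentioned above. It assumes the usual semicolon-separated `key=value` URL layout and places the property before any `?` or `#` section.

```python
# Impersonation property per driver, as named in the answer above.
PROPERTY_BY_DRIVER = {
    "apache": "hive.server2.proxy.user",  # bundled Hive JDBC driver
    "cloudera": "DelegationUID",          # Cloudera JDBC driver
}


def with_impersonation(jdbc_url, user, driver="apache"):
    """Return jdbc_url with the driver's impersonation property added."""
    prop = PROPERTY_BY_DRIVER[driver]
    # Session properties must come before any '?' or '#' section of the URL.
    markers = [i for i in (jdbc_url.find("?"), jdbc_url.find("#")) if i != -1]
    cut = min(markers, default=len(jdbc_url))
    head, tail = jdbc_url[:cut], jdbc_url[cut:]
    return f"{head.rstrip(';')};{prop}={user}{tail}"


# Hypothetical host and end user.
print(with_impersonation("jdbc:hive2://hs2.example.com:10000/db_sdc", "alice"))
print(with_impersonation("jdbc:hive2://hs2.example.com:10000;AuthMech=1", "alice", "cloudera"))
```

In SDC the same property can also be supplied as an additional JDBC configuration property instead of being embedded in the URL.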

Services like Impala use the OS user ID of the client submitting a query as the basis for evaluating privileges. The service's own OS account (like "impala") does not need Sentry permissions.


Comments

Hello Mark, thanks for the reply. I tried the 'DelegationUID=' property both as part of the JDBC URL and under 'Additional JDBC configuration properties', and neither does the trick. The user accessing Hive is still 'sdc' and the same error persists.

Prasanna (2019-05-17 11:59:01 -0500)

Hi Prasanna, please inspect the Hive config property "HiveServer2 Enable Impersonation" in CM. It is likely disabled as a Sentry best practice to avoid opening a security hole. If that is the case, would it be possible to grant the sdc user Sentry permissions so that it can function as a "service account"?

Mark Brooks (2019-05-17 12:36:58 -0500)

Yes, it is disabled because of Sentry. That is why I asked whether "sdc" has to be a valid LDAP ID. Since Sentry is configured to work with LDAP, we can only add user IDs that belong to some LDAP group; otherwise Sentry doesn't recognize them as valid users. Are there any other possible workarounds?

Prasanna (2019-05-17 13:36:38 -0500)

Hi Prasanna, Sorry it took me so long to get around to your original question. Given that the StreamSets Service is kerberized, isn't it the case that the sdc user is already in AD?

Mark Brooks (2019-05-17 13:49:12 -0500)

Yes and no. I am not an AD person, so I am not sure how it works. It doesn't create an AD account with "sdc" as the ID; it creates a Kerberos principal with sdc as the name, but not an AD account with the same name.

Prasanna (2019-05-17 14:39:17 -0500)