READ from Azure Data Lake Store

asked 2018-06-11 08:02:39 -0500 by werners

Hi, I am trying to use the Hadoop FS Standalone origin to read files from Azure Data Lake Store. Unfortunately it does not work (we uploaded the ADLS SDK as an external library). Is this supported, or do we have to use Hadoop FS in cluster mode even though we do not run a Hadoop cluster?


1 Answer


answered 2018-07-25 08:44:30 -0500 by werners

I found the solution. Here we go:

  1. Add the following jars as external libraries on the hdp_2_6_lib stage library:

    • azure-data-lake-store-sdk-2.2.5.jar
    • hadoop.azure.datalake.1.0.jar
    • jackson-core-7.4.jar
    • slf4j-api-1.7.21.jar

    (I am not sure whether the last two jars are necessary, so you could try without them.)

  2. Use Hadoop FS Standalone HDP 2.6.2.1-1 as the origin.

  3. On the Hadoop FS tab:

    • Hadoop FS URI = adl://<your adls account name>.azuredatalakestore.net
    • Hadoop FS Configuration: add the following key/value pairs (there is a verification sketch after this list):
      • fs.adl.oauth2.access.token.provider.type : ClientCredential (yes, the literal string 'ClientCredential')
      • fs.adl.oauth2.refresh.url : <your token endpoint from Azure AD>
      • fs.adl.oauth2.client.id : <your application id for your ADLS OAuth2 registration in Azure AD>
      • fs.adl.oauth2.credential : <your OAuth2 credential (client secret) for your ADLS registration in Azure AD>

  4. Enter the file path and the files you want to ingest on the Files tab.

Done
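
If you want to sanity-check the credentials outside of Data Collector first, here is a minimal Java sketch against the Hadoop FileSystem API using the same four properties. It assumes the jars from step 1 are on the classpath; the class name, the token endpoint URL, application id, secret, and directory path are all placeholders you must replace with your own values:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    import java.net.URI;

    public class AdlsSmokeTest {
        public static void main(String[] args) throws Exception {
            // Same four properties as in the Hadoop FS Configuration above
            Configuration conf = new Configuration();
            conf.set("fs.adl.oauth2.access.token.provider.type", "ClientCredential");
            // Placeholder: your Azure AD token endpoint
            conf.set("fs.adl.oauth2.refresh.url",
                    "https://login.microsoftonline.com/<tenant id>/oauth2/token");
            conf.set("fs.adl.oauth2.client.id", "<application id>");   // placeholder
            conf.set("fs.adl.oauth2.credential", "<client secret>");   // placeholder

            // Same adl:// URI as in the Hadoop FS URI field
            FileSystem fs = FileSystem.get(
                    new URI("adl://<your adls account name>.azuredatalakestore.net"), conf);

            // List the directory you plan to point the origin at (placeholder path)
            for (FileStatus status : fs.listStatus(new Path("/your/input/dir"))) {
                System.out.println(status.getPath());
            }
        }
    }

If the listing prints your files, the same values should work in the origin.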
