Hadoop FS destination can NOT write data files

asked 2017-09-26 05:26:48 -0500

casel.chen

I'm trying out "Replicating Relational Databases with StreamSets Data Collector". The pipeline runs quietly without errors, but the Hadoop FS destination only creates the database and table folders in HDFS; no data files are generated in the table directory. Why?

I'm sure the MySQL table contains many records. I also checked the Hive Metastore destination, and the Hive table was created successfully.


2 Answers


answered 2019-02-22 11:40:29 -0500

Hello, I am facing an issue similar to the one you describe in your post. Our pipeline was working properly a couple of days ago and then suddenly stopped working. We granted full permissions on the table's HDFS folder (the path shown in the LOCATION property of show create table [database].[table], which is /user/hive/warehouse/[database]/[table]).

Our job completes without any error in both Preview and Run modes. When we check HDFS from the command line (hdfs dfs -ls /user/hive/warehouse/[database]/), we can see that the folder was created successfully, but no file (AVRO format) was added to it. In HUE, SELECT * FROM [database].[table] returns an empty result.

Have you resolved this? Do you have any ideas or pointers that could help us figure it out?


Comments

We have found a solution to this issue: in the Hadoop FS destination, under Configuration > Output Files, the "Directory in Header" flag needed to be checked (it is not checked by default). Without it, the AVRO files were not being added after the temporary table was created in the HDFS directory /user/hive/warehouse.

Leonardo Muniz (2019-02-25 05:58:38 -0500)

answered 2017-09-26 05:28:00 -0500

casel.chen

updated 2017-09-26 05:31:38 -0500

I set dfs.permission = false and ran chmod w+r on the /user folder in HDFS.

By the way, how can I extract a MySQL datetime field as yyyyMMdd and use it as the partition field? Thanks!
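
To make the conversion being asked about concrete, here is a minimal Python sketch of turning a MySQL datetime string into a yyyyMMdd value. It only illustrates the formatting step and is not StreamSets configuration. The function name to_partition_value is hypothetical, and the sketch assumes the value arrives as text in MySQL's usual 'YYYY-MM-DD HH:MM:SS' form.

```python
from datetime import datetime


def to_partition_value(mysql_datetime: str) -> str:
    """Convert a MySQL DATETIME string to a yyyyMMdd string.

    Hypothetical helper for illustration only; assumes the input
    looks like 'YYYY-MM-DD HH:MM:SS' with no fractional seconds.
    """
    parsed = datetime.strptime(mysql_datetime, "%Y-%m-%d %H:%M:%S")
    # %Y%m%d in Python corresponds to the yyyyMMdd pattern in the question.
    return parsed.strftime("%Y%m%d")


if __name__ == "__main__":
    print(to_partition_value("2017-09-26 05:26:48"))
```

Inside a pipeline the same idea would be applied per record, writing the result to a new field that the partition configuration then refers to.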
