[JDBC Query Consumer to Hadoop FS] Handling a string-typed date in the SQL query and the Hadoop directory path

asked 2019-06-26 23:43:52 -0500

John Seok

updated 2019-06-27 00:32:28 -0500

I built a JDBC Query Consumer to Hadoop FS pipeline, and I am having trouble processing a string-typed timestamp in it.

1st Question : SQL Query on JDBC Query Consumer

The database is PostgreSQL and I want to use incremental mode for ingestion. The date_info column type is character varying, with values like '20190601000000'.

  • SQL Query : SELECT * FROM TWEDAS_RAW_DATA WHERE date_info > '${OFFSET}' ORDER BY date_info

  • Initial Offset : 20190601000000

  • Offset Column : date_info
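Because date_info is a fixed-width, digits-only yyyyMMddHHmmss string, comparing it as text should order rows the same way as comparing the timestamps themselves, which is what the `date_info > '${OFFSET}'` filter relies on. A minimal Java sketch of that property follows; the sample values other than the initial offset are hypothetical, and `isAfterOffset` is an illustrative helper, not part of StreamSets or PostgreSQL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class OffsetOrder {
    // True when candidate sorts strictly after offset as plain text,
    // mirroring the string comparison in: date_info > '${OFFSET}'
    public static boolean isAfterOffset(String candidate, String offset) {
        return candidate.compareTo(offset) > 0;
    }

    // Keeps only values after the offset and returns them in ascending text order,
    // like WHERE date_info > offset ORDER BY date_info
    public static List<String> nextBatch(List<String> values, String offset) {
        return values.stream()
                .filter(v -> isAfterOffset(v, offset))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String offset = "20190601000000"; // initial offset from the question
        // Hypothetical date_info values, deliberately out of order
        List<String> rows = Arrays.asList(
                "20190619160631", "20190531235959", "20190601000001");
        System.out.println(nextBatch(rows, offset));
        // prints [20190601000001, 20190619160631]
    }
}
```

This only holds while every value has the same width and contains digits only; a value such as '2019061' or one with separators would no longer sort chronologically as text.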

2nd Question : Directory Template in Hadoop FS, built from a time basis that uses record:value

  • Time Basis : ${record:value("/date_info")}

  • Directory Template : /user/hhiwedas/raw/${YYYY()}-${MM()}-${DD()}

The pipeline runs, but fails with an error like 'CTRCMN_0100 - Error evaluating expression ${record:value("/date_info")}: javax.servlet.jsp.el.ELException: Attempt to convert String "20190619160631" to type "java.util.Date", but there is no PropertyEditor for that type'

I want to know how to convert the string to the appropriate type.

I hope for a quick response.

Thanks,


1 Answer


answered 2019-06-26 23:59:54 -0500

metadaddy

The problem is you're providing a string where the stage is expecting a Java Date. You can configure the time basis like this to convert the string to a Date:

${time:extractDateFromString(record:value('/date_info'),'yyyyMMddHHmmss')}
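To make the conversion concrete, here is a minimal plain-Java sketch of what the expression is asked to do: parse the string with the pattern 'yyyyMMddHHmmss' into a java.util.Date, then derive a yyyy-MM-dd directory from it. It assumes the pattern letters behave as in java.text.SimpleDateFormat; the class and method names are hypothetical, and this is not StreamSets' actual implementation.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class DateInfoPath {
    // Parses a date_info string such as "20190619160631" into a java.util.Date
    public static Date parseDateInfo(String value) {
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        in.setLenient(false); // reject impossible values such as month 13
        try {
            return in.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMddHHmmss value: " + value, e);
        }
    }

    // Builds the directory the template /user/hhiwedas/raw/${YYYY()}-${MM()}-${DD()}
    // would resolve to for the given time (same default time zone as parsing)
    public static String directoryFor(Date time) {
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        return "/user/hhiwedas/raw/" + out.format(time);
    }

    public static void main(String[] args) {
        Date parsed = parseDateInfo("20190619160631"); // value from the error message
        System.out.println(directoryFor(parsed));
        // prints /user/hhiwedas/raw/2019-06-19
    }
}
```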

Comments

I modified it as in your answer, but got an error like 'HADOOPFS_12 - The record 'SELECT * FROM TWEDAS_RAW_DATA::rowCount:909' is late'. The pipeline validates and runs, but it doesn't ingest data because of that error. Do you have any suggestions? Thanks,

Youngjun John Seok (2019-06-27 18:23:13 -0500)

Oh.. while writing the above comment about the error. But the collection speed is very slow, and the data acquisition rate is very low too (after 2 hours of running, input data is 70 million and output data is 1.8 million). Is there any way to solve this problem? I'm very much looking forward to your response. Thanks,

Youngjun John Seok (2019-06-27 18:25:56 -0500)

Is the date_info column indexed?

metadaddy (2019-06-28 17:39:01 -0500)

No, it's not indexed.

John Seok (2019-06-30 19:27:18 -0500)