Can the Hive Metadata table name take multiple expressions?

asked 2017-11-03 09:50:19 -0500

Roh

updated 2017-11-03 14:03:31 -0500

I have data for multiple tables coming in from SDC RPC. I need partitions for only a few of the tables, not all of them, because some tables' data doesn't meet our minimum partition size of 256 MB. For the ones I don't want to partition, I set the table name expression in the Hive Metadata config to ${record:attribute('jdbc.tables1')}, and for the ones I do want to partition it is ${record:attribute('jdbc.tables')}. I don't see an option for adding multiple expressions in one Hive Metadata processor and partitioning accordingly.

So I built the pipeline shown in the first picture, and it works fine. The challenge is that both Hive Metadata processors receive the same data, so even though they can create the tables and load the data, we see the error shown in the second screenshot. How can I get around this?

[Image 1: pipeline screenshot]

[Image 2: error screenshot]

Side note: I can get around it by broadcasting the data to another port and reading it from there, but I don't want to do that.

What have I tried so far? Broadcasting to the same port but changing the SDC RPC ID, which didn't work :(

1 Answer

answered 2017-11-03 10:33:08 -0500

jeff

If I understand what you're trying to do correctly, you can add a Stream Selector processor before your Hive Metadata processors. Configure the Stream Selector with a condition such as ${record:attribute('jdbc.tables') != null} so that only records carrying the attribute you want pass through to that stream, and send the others to trash or some other location.
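To make the routing idea concrete, here is a minimal sketch in plain Python. It is not StreamSets code and uses no StreamSets API: the record shape (a dict with an "attributes" entry standing in for record header attributes), the function name `route_by_attribute`, and the sample table names are all assumptions for illustration. It only mimics what a condition like the one above would do: records that carry the attribute go to one stream, everything else goes to the default stream.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical record shape for illustration only: "attributes" stands in
# for SDC record header attributes, "value" for the record's fields.
Record = Dict[str, Any]


def route_by_attribute(
    records: List[Record], attribute_name: str = "jdbc.tables"
) -> Tuple[List[Record], List[Record]]:
    """Split records into (matched, default) streams.

    A record is "matched" when the named attribute is present and not None,
    which mirrors a condition of the form attribute != null.
    """
    matched: List[Record] = []
    default: List[Record] = []
    for record in records:
        if record.get("attributes", {}).get(attribute_name) is not None:
            matched.append(record)
        else:
            default.append(record)
    return matched, default


# Sample data (made up): one record per attribute name from the question.
records = [
    {"attributes": {"jdbc.tables": "orders"}, "value": {"id": 1}},
    {"attributes": {"jdbc.tables1": "lookup_codes"}, "value": {"id": 2}},
]

partitioned, unpartitioned = route_by_attribute(records)
```

With routing like this, each Hive Metadata processor would only see the records meant for it, rather than both receiving the same data.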

However, I wonder if even this is overkill for what you're trying to do. Bear in mind that jdbc.tables is just a preconfigured attribute name coming out of the origin. You are not bound to use that name in your Hive processor. You could add an Expression Evaluator processor to "unify" the attribute names, using whatever logic is required, into a single common one (e.g. table_name_final) that the Hive Metadata processor expects.
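The "unify" step can be sketched the same way in plain Python. Again, this is only an illustration of the logic, not StreamSets code: the dict-based record shape, the function name `unify_table_name`, and the choice to check jdbc.tables before jdbc.tables1 are assumptions. The idea is simply to copy whichever source attribute is present into one common attribute, so a single table name expression can reference it.

```python
from typing import Any, Dict, Sequence

# Hypothetical record shape for illustration only: "attributes" stands in
# for SDC record header attributes.
Record = Dict[str, Any]


def unify_table_name(
    record: Record,
    candidates: Sequence[str] = ("jdbc.tables", "jdbc.tables1"),
    target: str = "table_name_final",
) -> Record:
    """Copy the first non-empty candidate attribute into the target attribute.

    The record is modified in place and also returned. If none of the
    candidates is set, the target attribute is left untouched.
    """
    attributes = record.setdefault("attributes", {})
    for name in candidates:
        value = attributes.get(name)
        if value is not None:
            attributes[target] = value
            break
    return record


# Sample data (made up): each record carries only one of the two names.
a = unify_table_name({"attributes": {"jdbc.tables": "orders"}})
b = unify_table_name({"attributes": {"jdbc.tables1": "lookup_codes"}})
```

After this step, both records expose the table name under table_name_final, whichever attribute the origin set.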



Got It :) Thank you so much @jeff

Roh (2017-11-03 14:57:02 -0500)