
Trying to convert multiple records with a list map into a single record

asked 2019-01-31 10:17:20 -0500

srinath_222

updated 2019-03-18 17:50:59 -0500

metadaddy

Problem statement:

I am trying to convert multiple records, each containing a list map, into a single record:

ID 1 list map a
ID 1 list map b
ID 1 list map c

into

ID 1 list map a, b, c

This is basically merging multiple records with a list map into a single record, which I have been trying to do for some time.

If anyone has faced a similar issue and found a solution, please post it here; it would be a real help.

Thanks in advance.

Sample input:

[image: sample input]

Desired output in Elasticsearch:

"ROWID_OBJECT": "100008",
"PSI": [
    {
        "SYSTEM_ID": "417008634"
    },        
    {
        "SYSTEM_ID": "901785927"
    },        
    {
        "SYSTEM_ID": "901787068"
    }
]

Comments

For clarity, do you mind updating your question with sample input record--and preferably a screenshot of it in preview mode? Thanks!

iamontheinet (2019-01-31 11:29:46 -0500)

So your origin and destination are both Elasticsearch?

iamontheinet (2019-01-31 14:03:10 -0500)

The origin is SQL Server, read through the JDBC consumer, and the destination is Elasticsearch.

srinath_222 (2019-01-31 14:17:45 -0500)

1 Answer


answered 2019-03-18 16:19:47 -0500

vel

updated 2019-03-18 16:35:00 -0500

metadaddy

This can be achieved using Jython code. The code I used is below.

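# Maps each ROWID_OBJECT to the batch index of the first record seen with that ID.
first_index = {}

for i, record in enumerate(records):
    try:
        row_id = record.value['ROWID_OBJECT']

        if row_id not in first_index:
            # First record for this ID: remember it and wrap every map field in a list.
            first_index[row_id] = i
            for k, v in record.value.items():
                if isinstance(v, dict):
                    record.value[k] = [v]
        else:
            # Later record for this ID: append its map fields to the first
            # record's lists, skipping exact duplicates.
            target = records[first_index[row_id]].value
            for k, v in record.value.items():
                if isinstance(v, dict) and v not in target[k]:
                    target[k].append(v)

    except Exception as e:
        error.write(record, str(e))

# Write only the merged (first) record of each group, in batch order.
for i in sorted(first_index.values()):
    output.write(records[i])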
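
To try the grouping logic outside a pipeline, here is a minimal standalone sketch in plain Python. It assumes each record is an ordinary dict and that each input row carries one `PSI` map; that input shape is a guess based on the desired output in the question. `merge_records` is an illustrative name, not part of any StreamSets API.

```python
def merge_records(rows, key="ROWID_OBJECT"):
    """Merge rows sharing the same key; map-valued fields become lists of maps."""
    merged = {}  # key value -> merged row
    order = []   # key values in first-seen order
    for row in rows:
        row_id = row[key]
        if row_id not in merged:
            # First row for this ID: copy it and wrap map fields in lists.
            merged[row_id] = dict(
                (k, [v] if isinstance(v, dict) else v) for k, v in row.items()
            )
            order.append(row_id)
        else:
            target = merged[row_id]
            for k, v in row.items():
                # Append map fields, skipping exact duplicates.
                if isinstance(v, dict) and v not in target.setdefault(k, []):
                    target[k].append(v)
    return [merged[row_id] for row_id in order]


# Assumed input shape: one row per SYSTEM_ID, all sharing the same ROWID_OBJECT.
rows = [
    {"ROWID_OBJECT": "100008", "PSI": {"SYSTEM_ID": "417008634"}},
    {"ROWID_OBJECT": "100008", "PSI": {"SYSTEM_ID": "901785927"}},
    {"ROWID_OBJECT": "100008", "PSI": {"SYSTEM_ID": "901787068"}},
]
result = merge_records(rows)
```

With these three rows, `result` holds a single record whose `PSI` field is a list of the three maps.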

Comments

Nice solution! Note that you will need additional logic if there's a chance that the groups of records might be split across multiple batches.

metadaddy (2019-03-18 17:48:44 -0500)
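
One possible shape for that additional logic, sketched in plain Python: hold back the last group of each batch until a record with a different ID arrives. This assumes records arrive sorted by ROWID_OBJECT (for example via an ORDER BY in the JDBC query). `BatchMerger` is a hypothetical helper; in a real pipeline the held-back group would have to live in whatever cross-batch state the stage provides, and be flushed when the pipeline stops.

```python
class BatchMerger(object):
    """Merges consecutive rows with the same key, even across batch boundaries."""

    def __init__(self, key="ROWID_OBJECT"):
        self.key = key
        self.pending = None  # merged row for the group still open at batch end

    def process(self, batch):
        """Return the groups completed by this batch; keep the last one open."""
        done = []
        for row in batch:
            if self.pending is not None and self.pending[self.key] == row[self.key]:
                # Same ID as the open group: append map fields, skipping duplicates.
                for k, v in row.items():
                    if isinstance(v, dict) and v not in self.pending.setdefault(k, []):
                        self.pending[k].append(v)
            else:
                # New ID: the previous group (if any) is complete.
                if self.pending is not None:
                    done.append(self.pending)
                self.pending = dict(
                    (k, [v] if isinstance(v, dict) else v) for k, v in row.items()
                )
        return done

    def flush(self):
        """Return the final open group (call once when the input ends)."""
        done = [self.pending] if self.pending is not None else []
        self.pending = None
        return done


merger = BatchMerger()
out = merger.process([
    {"ROWID_OBJECT": "1", "PSI": {"SYSTEM_ID": "a"}},
    {"ROWID_OBJECT": "1", "PSI": {"SYSTEM_ID": "b"}},
])
out += merger.process([
    {"ROWID_OBJECT": "1", "PSI": {"SYSTEM_ID": "c"}},
    {"ROWID_OBJECT": "2", "PSI": {"SYSTEM_ID": "d"}},
])
out += merger.flush()
```

The trade-off is latency: the last group of every batch is only emitted once the next batch (or the final flush) shows it is complete.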

I had the same issue and handled it with additional logic in my ETL. Since my destination is Elasticsearch, I write the records as updates (update with doc_as_upsert), so a repeated record is just an update to the existing document. The same approach would need to be adapted for a different destination.

vel (2019-03-19 08:54:05 -0500)
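
For reference, a sketch of what an update with `doc_as_upsert` looks like as an Elasticsearch bulk request body, built in plain Python. The index name `my_index` and the helper `bulk_upsert_lines` are made up for illustration, and older Elasticsearch versions may also expect a `_type` in the action line. One caveat worth checking against your version: as far as I know, a partial `doc` update merges objects but replaces array fields wholesale, so the document sent should already contain the full merged list.

```python
import json


def bulk_upsert_lines(docs, index="my_index", id_field="ROWID_OBJECT"):
    """Build newline-delimited JSON for the _bulk API, one update action per doc."""
    lines = []
    for doc in docs:
        # Action line: which document to update (ID taken from the record).
        lines.append(json.dumps({"update": {"_index": index, "_id": doc[id_field]}}))
        # Payload line: apply doc as a partial update, or insert it if missing.
        lines.append(json.dumps({"doc": doc, "doc_as_upsert": True}))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


body = bulk_upsert_lines([
    {"ROWID_OBJECT": "100008", "PSI": [{"SYSTEM_ID": "417008634"}]},
])
```

Using the record's ROWID_OBJECT as the document ID is what makes a repeated record land on the same document instead of creating a new one.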