
How to sync data from CSV file to Kafka Producer in Avro message with Confluent Schema Registry?

asked 2018-01-16 20:09:30 -0600 by casel.chen

updated 2018-01-18 18:33:09 -0600 by metadaddy

I want to read data from a CSV file (100 lines in total) and send it to a Kafka producer as Avro messages with Confluent Schema Registry, but the pipeline reports errors like "AVRO_GENERATOR_00 - Record 'zhima.csv::2255' is missing required avro field ''". How should I configure the pipeline?

The CSV file content looks like:

"1","Jack","IDENTITY_CARD","xxxxx32","","268800222596185591781808705","732","1","ZM201708113000000107000563167056","201708111830006580000000000025","3","8","Rose","2017-08-11 18:30:00","sys","2017-08-11 18:30:00","sys"
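For reference, a row like this can be parsed with Python's csv module and zipped onto the schema's field names. This is a minimal sketch; the column order is assumed to match the schema below, which the original post does not confirm:

```python
import csv
import io

# Field names taken from the Avro schema below; the assumption that
# the CSV columns appear in this exact order is NOT confirmed by the post.
FIELDS = ["id", "name", "cert_type", "cert_no", "phone", "open_id",
          "credit_score", "status", "biz_no", "transactionid",
          "tenant_id", "operator_id", "operator_name",
          "created_at", "created_by", "updated_at", "updated_by"]

row = ('"1","Jack","IDENTITY_CARD","xxxxx32","",'
      '"268800222596185591781808705","732","1",'
      '"ZM201708113000000107000563167056","201708111830006580000000000025",'
      '"3","8","Rose","2017-08-11 18:30:00","sys","2017-08-11 18:30:00","sys"')

# csv.reader handles the quoted fields; zip maps positions to names.
record = dict(zip(FIELDS, next(csv.reader(io.StringIO(row)))))
print(record["name"])  # Jack
```

Note the file has no header row, so the Data Collector's delimited parser will assign its own column names unless told otherwise, and those generated names must line up with the schema's field names.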

Here is a screenshot of the pipeline (image not reproduced).

Here is the Avro schema I used:

{"namespace": "sample.zhima.avro",
  "type": "record",
  "name": "Zhima",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name",  "type": "string"},
    {"name": "cert_type",  "type": "string"},
    {"name": "cert_no",  "type": "string"},
    {"name": "phone",  "type": ["null","string"], "default": null},
    {"name": "open_id",  "type": "string"},
    {"name": "credit_score",  "type": ["null","int"], "default": null},
    {"name": "status",  "type": "int"},
    {"name": "biz_no",  "type": "string"},
    {"name": "transactionid",  "type": "string"},
    {"name": "tenant_id",  "type": "string"},
    {"name": "operator_id",  "type": ["null","string"], "default": null},
    {"name": "operator_name",  "type": ["null","string"], "default": null},
    {"name": "created_at",  "type": "long"},
    {"name": "created_by",  "type": "string"},
    {"name": "updated_at",  "type": ["null","long"], "default": null},
    {"name": "updated_by",  "type": ["null","string"], "default": null}
  ]
}
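One quick sanity check is to parse the schema as JSON and list its field names for comparison against the pipeline's field paths. A minimal sketch (only the first two fields are inlined here for brevity):

```python
import json

# Abbreviated copy of the schema above -- only two fields shown.
schema = json.loads("""
{"namespace": "sample.zhima.avro",
 "type": "record",
 "name": "Zhima",
 "fields": [
   {"name": "id", "type": "long"},
   {"name": "name", "type": "string"}
 ]}
""")

names = [f["name"] for f in schema["fields"]]
print(names)  # ['id', 'name'] -- compare against the record's field names
```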


What do lines 2255 and 2256 of your CSV look like? Is one of them a blank line by chance?

tmcgrath ( 2018-01-17 07:40:23 -0600 )

I don't think the CSV file has a problem, because I could ingest it with my own program before using StreamSets Data Collector. BTW, the CSV file I used has only 100 lines.

casel.chen ( 2018-01-18 18:09:51 -0600 )

1 Answer


answered 2018-01-18 18:32:25 -0600 by metadaddy

It looks like the problem is case sensitivity - ID is not the same as id. You'll need to change your schema to match your data, or rename the fields in the pipeline to match the schema.
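The case-sensitive matching can be illustrated outside the pipeline. This is a minimal sketch, not StreamSets or Avro library code, using a hypothetical three-field subset of the schema:

```python
# Avro field names are case-sensitive: a record keyed "ID" does NOT
# satisfy a required schema field named "id". Field set is illustrative.
schema_fields = {"id", "name", "cert_type"}

def missing_required(record: dict) -> list:
    """Return schema fields absent from the record (case-sensitive)."""
    return sorted(schema_fields - record.keys())

print(missing_required({"ID": 1, "name": "Jack", "cert_type": "X"}))
# ['id'] -- the upper-case key "ID" leaves required field "id" unmatched
```

Renaming the record's fields (or lower-casing them) before the Avro data format stage would make this check pass.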

