Field partitioner does not support multiple event type feature with schema references #170

gokhansari · 2021-01-04T09:56:59Z

Since version 5.5, Confluent supports multiple event types in the same topic. Based on this official documentation and this blog page, I tried to use this feature. I wrote a Kafka Streams application that produces Avro messages with different schemas by using Schema Registry schema references. I also consumed these messages with another Kafka Streams application to test the multiple event type functionality, and that worked.
This is the union Avro schema I used in my tests. The eventA and eventB schemas are also registered in Schema Registry:

[
    "com.xxx.xxx.eventA",
    "com.xxx.xxx.eventB"
]

Everything worked up to this point. Then I tried to sink these messages to HDFS with the Kafka Connect HDFS Sink Connector using FieldPartitioner, and set the relevant configuration properties in the connector settings. This is the field I want to partition by:

"partition.field.name" : "field1"

The connector started successfully and read records from Kafka, but it failed with errors when it reached the partitioning step. The field partitioner looks for field1 at the root of the record, but field1 is not there. Because of the multiple event type functionality, the record has a wrapper root field named eventA. (I think this wrapper is created by the toConnectSchema method in the AvroData class of Confluent's kafka-schema-registry-parent repository.) This is the resulting record value:

Struct{eventA=Struct{field1=val1,field2=val2,field3=val3}}

So partition.field.name would have to be set to "eventA.field1". But this is not a workable approach, because the root field name changes with each event type (eventA, eventB, and so on), and a single fixed value cannot cover all of them. In effect, the multiple event type feature breaks field partitioning in Kafka Connect.

As a workaround, should I implement a custom field partitioner, or is there a consistent solution that I missed?
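
To make the custom partitioner idea concrete, here is a minimal sketch of the lookup logic such a partitioner might use. It is not the connector's API: nested java.util.Map values stand in for the Connect Struct, and the class name WrappedFieldLookup and its methods are hypothetical. The sketch assumes the partition field sits either at the root or exactly one level down inside the event wrapper, and that the partition path uses a name=value layout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: plain Maps model the record value, not Kafka Connect Structs.
class WrappedFieldLookup {

    // Returns the field value from the root, or from the first nested record
    // that contains it. Returns null when the field is not found (or is null).
    static Object findField(Map<String, Object> root, String fieldName) {
        if (root.containsKey(fieldName)) {
            return root.get(fieldName);
        }
        for (Object child : root.values()) {
            // Union members that are not set would be null and are skipped here.
            if (child instanceof Map) {
                Map<?, ?> nested = (Map<?, ?>) child;
                if (nested.containsKey(fieldName)) {
                    return nested.get(fieldName);
                }
            }
        }
        return null;
    }

    // Builds a partition path segment in an assumed name=value layout.
    static String encodePartition(Map<String, Object> root, String fieldName) {
        Object value = findField(root, fieldName);
        if (value == null) {
            throw new IllegalArgumentException("Field not found: " + fieldName);
        }
        return fieldName + "=" + value;
    }

    public static void main(String[] args) {
        // Mirrors Struct{eventA=Struct{field1=val1,field2=val2,field3=val3}}
        Map<String, Object> eventA = new LinkedHashMap<>();
        eventA.put("field1", "val1");
        eventA.put("field2", "val2");
        eventA.put("field3", "val3");

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("eventA", eventA);

        System.out.println(encodePartition(root, "field1"));
    }
}
```

With this approach partition.field.name could stay "field1" regardless of which event type wraps the record. A real implementation would need to apply the same lookup to the Connect Struct and schema inside a partitioner class, which this sketch does not attempt.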


gokhansari changed the title ~~Field partitioner does not support multiple event type feature~~ Field partitioner does not support multiple event type feature with schema references Jan 5, 2021

gokhansari mentioned this issue Jan 6, 2021

Small object files problem for multi schema for a single topic confluentinc/kafka-connect-hdfs#537

Open
