Add delta-vs-avro blog #483
base: main
Signed-off-by: Avril Aysha <[email protected]>
Both Avro and Delta Lake can be used for streaming and real-time processing.

Avro is often used for ingestion of raw streaming data, such as log files, events or sensor data. Its lightweight, row-based structure makes it ideal for high-throughput writes.
I would remind the reader here again that avro is row-based.
The row-based format lends itself better to initial ingestion, but because it is not a columnar format, there is a higher cost when using avro for analytical stream processing.
On a side note, I've been bitten by avro and streaming. Spark provides better support for permissive-mode now, but ArrayIndexOutOfBoundsExceptions used to plague my days on PagerDuty. While it "can be forwards-compatible", when it isn't, it hurts :)
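To make the row-based vs. columnar point concrete, here is a minimal sketch in plain Python. It is only a toy: the lists stand in for on-disk layouts, the `events` data and the function names are made up for illustration, and it does not use the Avro or Delta Lake libraries. It shows that appending a record to a row layout is a single write, while computing an aggregate over one field reads every field of every record.

```python
# Toy illustration only: Python lists stand in for storage layouts.
# The events and field names below are hypothetical sample data.
events = [
    {"sensor_id": 1, "temperature": 20.5, "status": "ok"},
    {"sensor_id": 2, "temperature": 21.0, "status": "ok"},
    {"sensor_id": 3, "temperature": 19.5, "status": "warn"},
]

# Row-based layout: each record is stored whole, so ingesting an event
# is a single append.
row_store = []
for event in events:
    row_store.append((event["sensor_id"], event["temperature"], event["status"]))

# Columnar layout: each field is stored contiguously, so ingesting an
# event means one append per column.
column_store = {"sensor_id": [], "temperature": [], "status": []}
for event in events:
    for name, value in event.items():
        column_store[name].append(value)


def mean_temperature_rows(rows):
    """Average one field from the row layout; every field of every row is read."""
    values_touched = 0
    total = 0.0
    for row in rows:
        values_touched += len(row)  # the whole record is read to reach one field
        total += row[1]
    return total / len(rows), values_touched


def mean_temperature_columns(columns):
    """Average one field from the columnar layout; only that column is read."""
    temps = columns["temperature"]
    return sum(temps) / len(temps), len(temps)


row_mean, row_touched = mean_temperature_rows(row_store)
col_mean, col_touched = mean_temperature_columns(column_store)
print(f"row layout:      mean={row_mean:.2f}, values read={row_touched}")
print(f"columnar layout: mean={col_mean:.2f}, values read={col_touched}")
```

Both layouts give the same answer, but the row layout reads three times as many values for this three-field record. That gap grows with the number of fields, which is the analytical cost mentioned above.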
thanks!
The comments I left on the blog post are mainly food for thought. I don't see any problems with the post as is.
thanks @newfront, super helpful! Changes have been incorporated, so this should be good to go.
This PR adds a blog comparing Delta Lake vs Avro files.
Signed-off-by: Avril Aysha [email protected]