This repository has been archived by the owner on Apr 15, 2024. It is now read-only.
Conceptual Design
Error handling is too often an afterthought in the design of a library. Rust provides a magnificent syntax to remove the noise of error handling from the happy path. However, alongside all its benefits, this syntax also promotes the mindset that errors can be dealt with at a later moment. In general this is good for productivity, but it carries the danger of delaying that moment until it is too late for a conscious and holistic design.
Designing the error handling of a library is much more difficult than we usually give it credit for. As can be seen in various libraries similar to Streams [1] [2], one of the primary issues lies in the mindset with which a library's errors are designed: from a bottom-up perspective. They answer the question “how can I wrap this low-level error so that I can forward it downstream?”, an approach strongly incentivized by the question-mark syntax.
In design, we know that there are no hard truths. Design by inertia is dangerous, and one should always feel compelled to re-evaluate past assumptions and question their applicability in a specific context. Let’s first revisit the foundations of a good error-handling design, and how we want to apply them to Streams.
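To make the bottom-up pattern concrete, here is a minimal sketch of what it typically looks like in Rust. The names `BottomUpError` and `parse_length` are hypothetical and are not taken from Streams or from the cited libraries; only standard-library types are used.

```rust
use std::fmt;
use std::num::ParseIntError;
use std::str::Utf8Error;

// Hypothetical bottom-up error: one variant per low-level failure source.
#[derive(Debug)]
pub enum BottomUpError {
    Utf8(Utf8Error),
    Parse(ParseIntError),
}

// The `From` impls are what let `?` wrap and forward the low-level errors.
impl From<Utf8Error> for BottomUpError {
    fn from(source: Utf8Error) -> Self {
        BottomUpError::Utf8(source)
    }
}

impl From<ParseIntError> for BottomUpError {
    fn from(source: ParseIntError) -> Self {
        BottomUpError::Parse(source)
    }
}

impl fmt::Display for BottomUpError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BottomUpError::Utf8(source) => write!(f, "invalid UTF-8: {}", source),
            BottomUpError::Parse(source) => write!(f, "invalid number: {}", source),
        }
    }
}

impl std::error::Error for BottomUpError {}

// Hypothetical helper: each `?` converts a low-level error and returns early.
pub fn parse_length(bytes: &[u8]) -> Result<usize, BottomUpError> {
    let text = std::str::from_utf8(bytes)?;
    let length = text.trim().parse::<usize>()?;
    Ok(length)
}
```

The happy path stays clean, but the variants tell the caller where the failure originated, not what to do about it.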
The purpose of errors
When designing a library, and specifically its error handling, the first premise to keep in mind is that errors are the artefact that a user of the library receives when something does not go as planned. Errors have 2 main purposes:
log errors for tracing
handle errors for business logic
For tracing, the most important quality of errors is that they are expressive and specific. There's nothing more annoying than a log that says "an error occurred during fetching the message. Try again later". It is equally annoying to have 10 lines of logs explaining "an error occurred during fetching: Error in data layer: [...] warning, something has failed: [404] Error 404: document not found". We need to strive for error messages that are as concrete and as condensed as possible, always remembering that they are supposed to be read by a human at some point, and that she is supposed to know intuitively how to act on them. As for the structure of the error, the tracing purpose has no particular need for it; we could return plain strings and be done with it.
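As a small illustration of a concrete and condensed message, the sketch below implements `Display` for a single error type. The type `MessageNotFound`, its `address` field, and the wording of the message are assumptions made for this example, not part of Streams.

```rust
use std::fmt;

// Hypothetical error type carrying the one detail a reader needs to act on.
#[derive(Debug)]
pub struct MessageNotFound {
    pub address: String,
}

impl fmt::Display for MessageNotFound {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // One line, one fact: what was missing and where it was looked for.
        write!(f, "message not found at address {}", self.address)
    }
}

impl std::error::Error for MessageNotFound {}
```

A single line like this avoids both the vague "try again later" message and the multi-layer log chain.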
Structure becomes important when we consider errors as a mechanism for application engineers to implement business logic around them. And this is the main question that the structure should try to answer: “what business logic will the application engineer want to trigger in each particular error case?”
In a network communication protocol like Streams, we can classify the different errors by the action that the user should take upon encountering them:
Retry the operation again without changing anything
Correct the data being sent and retry the operation
Correct the environment and retry the operation
Desist from executing the operation
Regardless of how the Error enum is structured, when an operation returns an error, the user will match against it in an attempt to decide which of these 4 actions to take. Therefore, it is of utmost importance that the enum projects this classification as directly as possible.
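The matching described above can be sketched as follows, assuming an error enum whose top-level variants mirror the 4 actions one to one. Every name here (`Action`, `Error` and its variants, `next_action`, `retry_transient`) is hypothetical and only serves the illustration.

```rust
// The four actions a caller can take after a failure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Retry,
    CorrectData,
    CorrectEnvironment,
    Desist,
}

// Hypothetical error enum: one top-level variant per action.
#[derive(Debug)]
pub enum Error {
    Transient(String),
    InvalidData(String),
    Environment(String),
    Fatal(String),
}

// With a one-to-one projection, the caller's decision is a flat `match`.
pub fn next_action(error: &Error) -> Action {
    match error {
        Error::Transient(_) => Action::Retry,
        Error::InvalidData(_) => Action::CorrectData,
        Error::Environment(_) => Action::CorrectEnvironment,
        Error::Fatal(_) => Action::Desist,
    }
}

// Retries only transient failures, up to `max_attempts` calls in total;
// any other result (success or non-transient error) is returned as-is.
pub fn retry_transient<F>(mut operation: F, max_attempts: u32) -> Result<(), Error>
where
    F: FnMut() -> Result<(), Error>,
{
    let mut attempt = 1;
    loop {
        match operation() {
            Err(Error::Transient(_)) if attempt < max_attempts => attempt += 1,
            other => return other,
        }
    }
}
```

If the variants instead mirrored low-level causes, each caller would have to rebuild this classification on their own.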
Error model
In Streams the operations that can fail are:
Send a message
Fetch a message
Create a stream
Connect to a stream
Create a branch
Change the permissions of a branch
For each of these operations, the following model tries to map its failures onto the 4 actions described above:
| Operation \ action | Retry the operation again without changing anything | Correct the data being sent and retry the operation | Correct the environment and retry the operation | Desist from executing the operation |
| --- | --- | --- | --- | --- |
| Send a message | Transient network failure | Payload too big | User has not created or received the announcement message of the stream | User has readonly permission over this topic; user does not have an identity |
| Fetch a message | Transient network failure | NA | User has not created or received the announcement message of the stream; spongos state of linked message not in store | Message not found; user is not allowed to read from this topic; message data cannot be unwrapped; message signature is not valid; message data is not compatible with this version of Streams |
| Create a stream | Transient network failure | Stream with this topic already exists (should this be upserting instead?) | ? | User does not have an identity |
| Create a branch | Transient network failure | Branch with this topic already exists (should this be upserting instead?) | User has not created or received the announcement message of the stream | User does not have an identity; user has readonly permission over the parent topic |
| Change the permissions of a branch | Transient network failure | Unknown PSK | User has not created or received the announcement message of the stream | User does not have an identity; user is not admin of the branch |
Enum
With the previous model in mind, the proposed Error enum skeleton could look something like this (variant fields to be validated and possibly extended during implementation):
Self-note: make sure the classification is granular enough to be able to store data within each Error variant that can be relevant for the error-handling logic
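One possible reading of the model and of the self-note, as a minimal sketch and not the proposal's own enum: the top-level variants follow the 4 actions, and nested variants carry the data that handling logic may need. All type, variant, and field names are hypothetical, and only a subset of the failures listed in the model is covered.

```rust
use std::fmt;

// Hypothetical top-level error: one variant per action the caller can take.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Error {
    // Retry the operation without changing anything.
    Transient { reason: String },
    // Correct the data being sent and retry.
    Data(DataError),
    // Correct the environment and retry.
    Environment(EnvironmentError),
    // Desist from executing the operation.
    Desist(DesistError),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DataError {
    PayloadTooBig { size: usize, max: usize },
    TopicAlreadyExists { topic: String },
    UnknownPsk,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EnvironmentError {
    AnnouncementMissing,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DesistError {
    NoIdentity,
    ReadOnly { topic: String },
    NotBranchAdmin { topic: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Transient { reason } => write!(f, "transient network failure: {}", reason),
            Error::Data(DataError::PayloadTooBig { size, max }) => {
                write!(f, "payload of {} bytes exceeds the maximum of {} bytes", size, max)
            }
            Error::Data(DataError::TopicAlreadyExists { topic }) => {
                write!(f, "topic '{}' already exists", topic)
            }
            Error::Data(DataError::UnknownPsk) => write!(f, "unknown PSK"),
            Error::Environment(EnvironmentError::AnnouncementMissing) => {
                write!(f, "announcement message of the stream not created or received")
            }
            Error::Desist(DesistError::NoIdentity) => write!(f, "user does not have an identity"),
            Error::Desist(DesistError::ReadOnly { topic }) => {
                write!(f, "user has readonly permission over topic '{}'", topic)
            }
            Error::Desist(DesistError::NotBranchAdmin { topic }) => {
                write!(f, "user is not admin of branch '{}'", topic)
            }
        }
    }
}

impl std::error::Error for Error {}

// Hypothetical helper: data stored in a variant drives the corrective logic.
pub fn bytes_to_trim(error: &Error) -> Option<usize> {
    match error {
        Error::Data(DataError::PayloadTooBig { size, max }) => Some(size.saturating_sub(*max)),
        _ => None,
    }
}
```

Because `PayloadTooBig` stores both sizes, a caller can compute how much to cut without parsing the message text, which is the kind of granularity the self-note asks for.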