This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

Error-handling #269

Open
arnauorriols opened this issue Jul 8, 2022 · 1 comment

arnauorriols commented Jul 8, 2022

Conceptual Design

Error handling is too often an afterthought in the design of a library. Rust provides a magnificent syntax to remove the noise of error handling from the happy path. However, for all its benefits, this syntax also promotes the mindset that errors can be dealt with later. In general this is good for productivity, but it carries the danger of delaying that moment until it is too late for a conscious and holistic design.

Designing the error handling of a library is much more difficult than we usually give it credit for. As can be seen in various libraries similar to Streams [1] [2], one of the primary issues is the mindset with which the errors are designed: from a bottom-up perspective. They answer the question of “how can I wrap this low-level error so that I can forward it downstream”, an approach strongly incentivized by the question-mark syntax.
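To make the bottom-up pattern concrete, here is a minimal sketch of it. `LibError` and `parse_sequence_number` are hypothetical names for illustration only, not part of Streams:

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical library error designed bottom-up: one variant per low-level cause.
#[derive(Debug)]
enum LibError {
    Parse(ParseIntError),
}

// This impl is all the `?` operator needs to wrap the low-level error.
impl From<ParseIntError> for LibError {
    fn from(e: ParseIntError) -> Self {
        LibError::Parse(e)
    }
}

impl fmt::Display for LibError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LibError::Parse(e) => write!(f, "parse error: {}", e),
        }
    }
}

impl std::error::Error for LibError {}

// The happy path stays clean: `?` converts and forwards the error downstream.
fn parse_sequence_number(raw: &str) -> Result<u64, LibError> {
    let n = raw.trim().parse::<u64>()?;
    Ok(n)
}
```

The wrapping costs almost nothing to write, which is exactly the incentive: the resulting variant tells the caller which internal step failed, but not what they are expected to do about it.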

In design, we know that there are no hard truths. Design by inertia is dangerous, and one should always feel compelled to re-evaluate past assumptions and question their applicability in a specific context. Let’s first revisit the foundations of a good error-handling design, and how we want to apply them to Streams.

The purpose of errors

When designing a library, and specifically its error handling, the first premise to keep in mind is that errors are the artefact that a user of the library receives when something does not go as planned. Errors have 2 main purposes:

  • log errors for tracing
  • handle errors for business logic

For tracing, the most important quality of errors is that they are expressive and specific. There's nothing more annoying than a log that says "an error occurred during fetching the message. Try again later". It is equally annoying to have 10 lines of logs explaining "an error occurred during fetching: Error in data layer: [...] warning, something has failed: [404] Error 404: document not found". We need to strive for error messages that are as concrete and as condensed as possible, always remembering that they are supposed to be read by a human at some point, who should intuitively know how to act on them. As for the structure of the error, the tracing purpose does not need any in particular; we could be returning strings with the error and be done with it.

Structure becomes important when we consider errors as a mechanism for application engineers to implement business logic around them. And this is the main question that the structure should try to answer: “what business logic will the application engineer want to trigger in each particular error case?”

In a network communication protocol like Streams, we can classify the different errors by the action that the user should take upon encountering them:

  1. Retry the operation again without changing anything
  2. Correct the data being sent and retry the operation
  3. Correct the environment and retry the operation
  4. Desist from executing the operation

Regardless of how the Error enum is structured, when an operation returns an error, the user will match against it in an attempt to decide which of these 4 actions to take. Therefore, it is of utmost importance that the enum projects this classification as directly as possible.
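As a rough sketch of what "projecting the classification directly" could mean for the caller, assume a hypothetical error type `OpError` whose variants mirror the 4 actions one-to-one. All names below are illustrative assumptions, not a proposed API:

```rust
// The four reactions listed above, as the application engineer sees them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Retry,
    FixDataAndRetry,
    FixEnvironmentAndRetry,
    Desist,
}

// Hypothetical error type: one variant per action class.
#[derive(Debug)]
enum OpError {
    Transient(String),
    InvalidData(String),
    Environment(String),
    Fatal(String),
}

// When the enum mirrors the classification, the caller's decision is a flat match.
fn next_action(err: &OpError) -> Action {
    match err {
        OpError::Transient(_) => Action::Retry,
        OpError::InvalidData(_) => Action::FixDataAndRetry,
        OpError::Environment(_) => Action::FixEnvironmentAndRetry,
        OpError::Fatal(_) => Action::Desist,
    }
}
```

The point is the shape of the match rather than the names: no nested unwrapping of low-level causes is needed to reach a decision.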

Error model

In Streams the operations that can fail are:

  • Send a message
  • Fetch a message
  • Create a stream
  • Connect to a stream
  • Create a branch
  • Change the permissions of a branch

For each of these operations, the following model tries to map the possible failures onto the 4 actions described above:

| Operation \ Action | Retry the operation again without changing anything | Correct the data being sent and retry the operation | Correct the environment and retry the operation | Desist from executing the operation |
| --- | --- | --- | --- | --- |
| Send a message | Transient network failure | Payload too big | User has not created or received the announcement message of the stream | User has readonly permission over this topic; user does not have an identity |
| Fetch a message | Transient network failure | NA | User has not created or received the announcement message of the stream; spongos state of linked message not in store | Message not found; user is not allowed to read from this topic; message data cannot be unwrapped; message signature is not valid; message data is not compatible with this version of streams |
| Create a stream | Transient network failure | Stream with this topic already exists (should this be upserting instead?) | ? | User does not have an identity |
| Create a branch | Transient network failure | Branch with this topic already exists (should this be upserting instead?) | User has not created or received the announcement message of the stream | User does not have an identity; user has readonly permission over the parent topic |
| Change the permissions of a branch | Transient network failure | Unknown PSK | User has not created or received the announcement message of the stream | User does not have an identity; user is not admin of the branch |

Enum

With the previous model in mind, the proposed Error enum skeleton could look something like this (variant fields to be validated and possibly extended during implementation):

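enum Error {
    // The trait is fully qualified: inside this enum, a bare `Error` would
    // name the enum itself rather than `std::error::Error`.
    NetworkFailure(String, Box<dyn std::error::Error>),
    DataError(String),
    SetupError(String),
    PermissionError(String),
    MessageNotFound(String),
    FatalError(String),
}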
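As a sketch of how a caller might use this skeleton, the helper below retries an operation only when it fails with `NetworkFailure` (the "transient network failure" cases of the model) and returns every other error untouched. `retry_on_network_failure` is a hypothetical helper written for illustration, not an existing Streams function, and the enum is repeated so the snippet stands on its own:

```rust
#[derive(Debug)]
enum Error {
    NetworkFailure(String, Box<dyn std::error::Error>),
    DataError(String),
    SetupError(String),
    PermissionError(String),
    MessageNotFound(String),
    FatalError(String),
}

// Runs `op` up to `max_attempts` times, retrying only on `NetworkFailure`.
// Any other error, or the last network failure, is returned to the caller.
fn retry_on_network_failure<T, F>(max_attempts: u32, mut op: F) -> Result<T, Error>
where
    F: FnMut() -> Result<T, Error>,
{
    let mut attempt = 1;
    loop {
        match op() {
            Err(Error::NetworkFailure(_, _)) if attempt < max_attempts => attempt += 1,
            other => return other,
        }
    }
}
```

A real implementation would probably add a backoff between attempts; it is left out here to keep the sketch focused on the match.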
@arnauorriols

Self-note: make sure the classification is granular enough to be able to store data within each Error variant that can be relevant for the error-handling logic
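A minimal sketch of what "data within each variant" could look like, taking two cases from the model ("payload too big" and "branch with this topic already exists"). The enum name, the variant names, and all fields are assumptions for illustration, still to be validated during implementation:

```rust
// Hypothetical granular variants carrying data the handling logic can act on.
#[derive(Debug)]
enum GranularError {
    // Assumed fields: actual payload size and the maximum accepted, in bytes.
    PayloadTooBig { size: usize, max_size: usize },
    // Assumed field: the topic that collided.
    BranchAlreadyExists { topic: String },
}

// With structured fields the caller can compute a correction instead of
// parsing it out of an error message.
fn excess_bytes(err: &GranularError) -> Option<usize> {
    match err {
        GranularError::PayloadTooBig { size, max_size } => Some(size.saturating_sub(*max_size)),
        GranularError::BranchAlreadyExists { .. } => None,
    }
}
```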

@arnauorriols arnauorriols removed their assignment May 30, 2023