Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DynamoDbv2 bug for incorrect DateTime epoch serialization when date falls out of epoch supported range #3442

Open
wants to merge 3 commits into
base: main-staging
Choose a base branch
from

Conversation

sander1095
Copy link

@sander1095 sander1095 commented Aug 15, 2024

Description

This PR fixes a bug in the DynamoDb SDK when converting a DateTime to an epoch (integer).

DateTime's that fall out of the epoch supported range (2038-01-19T03:14:07UTC) will fail to get converted to epoch and simply get sent to DynamoDb as a DateTime, resulting in inconsistent storage in DynamoDb.

I believe this is a pretty critical bug, because retrieving dates that should be stored as epoch but are returned as DateTime could definitely cause lots of bugs or exceptions in client code.. scanamo, a DynamoDb SDK for scala, ALWAYS converts to epoch, for example, so I can expect their SDK to throw errors when it would encounter a datetime when deserializing when expecting an epoch integer.

Motivation and Context

Take a look at the following code that sets a property for storage in DynamoDbv2.

    [DynamoDBProperty("expiry", StoreAsEpoch = true)]
    public DateTime Expiry { get; set; }

You'd expect this to ALWAYS serialize a DateTime to epoch, and convert it back to DateTime when deserializing.

However, when a datetime is stored which is later than 2038-01-19T03:14:07UTC, it will store it as a DateTime anyway. This can be seen in the code edited in this PR; an exception is thrown about the date not fitting in an int32. It is caught and logged but the original entry is returned:

https://github.com/aws/aws-sdk-net/blob/051edf21c5968387334d27747c3188a3cb38f62b/sdk/src/Services/DynamoDBv2/Custom/DocumentModel/Document.cs#L254C1-L277C10

This PR this behaviour by calling AWSSDKUtils.ConvertToUnixEpochSecondsString(dateTime); instead, which will use a string-ified version of int64 instead of int32. This won't throw and epoch conversions will work better!

Important

  • I didn't change the "fall back to original entry" behavior. In case an exception happens in the code from this PR, it will STILL store the property as a datetime even though it should be stored in epoch.
    • I would propose to throw an exception here, but this would be a big breaking change and might need its own issue, research, and PR. I'd be happy to discuss this and contribute.
  • Instead of storing far-future dates as datetime, this PR fixes this behavior and sticks to epoch anyway. Though this is the correct approach, it is breaking!
    • The breaking change nature of this PR must be discussed! I can come up with 2 ways to fix this.
      • Create a public API change, like creating an extra property called StoreAsEpochLong in DynamoDBProperty, for example. This would require refactoring from consumers of the package.

Funnily enough, dates that are stored as DateTime even though they should be stored as an epoch DO deserialize back correctly. Perhaps that code exists on purpose or it just works by accident. This doesn't reduce the critical-ness of this bug, in my eyes, though! If someone uses a different SDK Client and their SDK does expect an integer epoch instead of a datetime, it could still cause issues.

fixes #3443

Testing

I've tested this locally and it works! :)
At first glance I couldn't find any unit tests for this code to base new regressions tests on top of, but I do think some should be written to prevent regressions.

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project
  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have read the README document
  • I have added tests to cover my changes
  • All new and existing tests passed

License

  • I confirm that this pull request can be released under the Apache 2 license

values outside of 2038 will silently be converted to its original value, in this case, a datetime.
@sander1095 sander1095 changed the title Fix DynamoDbv2 bug for incorrect DateTime epoch parsing when date falls out of epoch range Fix DynamoDbv2 bug for incorrect DateTime epoch serialization when date falls out of epoch supported range Aug 15, 2024
@dscpinheiro dscpinheiro changed the base branch from main to main-staging August 16, 2024 16:22
@bhoradc bhoradc requested a review from normj August 23, 2024 16:47
@normj
Copy link
Member

normj commented Aug 24, 2024

Hi @sander1095, because this would break any readers of the data that haven't updated the SDK if we started persisting the epoch as a long we need to make this an opt-in change.

I think adding the StoreAsEpochLong as you suggest works. If both StoreAsEpoch and StoreAsEpochLong are set to true I think we should throw an exception. We can mark the StoreAsEpoch as Obsolete with a message explaining how you should use StoreAsEpochLong but be sure the readers of the data are using a recent version of the SDK that understands long epoch dates.

@sander1095
Copy link
Author

sander1095 commented Aug 24, 2024

Hey @normj , thanks for your reply!

I understand your comment and meaning for introducing a new property. However, I would challenge it with the following arguments:

  • As a consumer of the package, I see this issue as a bug. Creating a new property creates a little bit of work for tons of users, which isn't very friendly and also annoying.
  • You say the following: because this would break any readers of the data that haven't updated the SDK if we started persisting the epoch as a long we need to make this an opt-in change. The SDK is currently able to deal with incorrectly serialized epochs. This means that this would only be a breaking change for people that serialize with the .NET sdk and deserialize with a different SDK where epoch/datetime is expected and would then throw errors. I believe this is a minor issue and a bit of an implementation detail, as every SDK works differently. Perhaps some research can be done to check how other popular dynamodb sdks deal with this issue, to back this point up.
  • We could increase the MAJOR version of this sdk so consumers know about the breaking change. This might be more annoying on the short term, but better in the long term.

If we do choose for a new property, I doubt that "long" in the name is a good choice. Using long, double, float or anything else is a bit of an implementation detail. Perhaps it should have "64bit" in the name?


Finally, there's something that is also very important that still needs to be discussed. The cause of this issue is that an exception is thrown and caught and no conversion will take place. Switching from int to long will ensure this particular exception does not happen anymore, but if there are any other exceptions, we will encounter the same bug again. To summarize, adding support for epoch after 2038 is great, but its only a bandaid for an underlying issue.

Therefore I would propose that the error handling of the existing code should be changed. Any exceptions/failures to serialize, deserialize or in any way transform data should "never" fall back to simply using the original input, which is what happens here. Instead, I believe it should throw! This could be considered a breaking change in the code that the related PR touches, though it doesn't really matter because I believe currently not many other exceptions can be thrown.

This means other parts of the SDK should be analyzed for these kinds of transformation issues :).

@sander1095
Copy link
Author

@normj any update or thoughts on this :)?

@sander1095
Copy link
Author

@normj Any update :)?

@normj
Copy link
Member

normj commented Oct 18, 2024

@sander1095 Apologize for my delayed response.

If any of this was new code the my opinion would be more aligned with you. But the fact is this code is over 10 years old and I can't measure how much code out there in the wild depends on the current behavior. Making runtime breaking behavior changes is really hard to do for this code. Especially since we are talking about data persistence and more then one application could be accessing the data and each of this applications will have different update cycles.

I thought maybe we could do what you suggest in our upcoming V4 but that has the problem I just mentioned that every application that accesses the data has to migrate to V4 at the same time. I'm not comfortable forcing users of the libraries to have synchronize their V4 updates across all of their applications. Our goal for V4 is an easy update process to V4.

I have to stand by my original comment since this is a runtime breaking change behavior for data persistence we have to make this an opt-in change. Naming is hard but I do prefer your suggestion of StoreAsEpoch64bit over StoreAsEpochLong.

@Dreamescaper
Copy link
Contributor

Dreamescaper commented Oct 19, 2024

I thought maybe we could do what you suggest in our upcoming V4 but that has the problem I just mentioned that every application that accesses the data has to migrate to V4 at the same time

@normj
Correct me if I'm wrong, but I don't think that's true in this case.
This PR doesn't affect reading the data in any way, only saving. Therefore, if V4 saves the data, then V3 is still able to read it. And vice versa - if V3 saves a value as string (incorrectly), then V4 is still able to read it anyway.

@sander1095
Copy link
Author

@Dreamescaper You are correct :)! The v4 approach would still have my vote, as adding a new property and adding comments to the old one to hint devs to use the new property is a bit weird...

Again, we would still need to discuss the current code!

If there are any other (de)serialization issues in the future, the code would still fall back to using datetime instead of epoch. I believe this must be changed to throwing an exception, but that would definitely be a breaking change (though a worthy one, I believe..) If we are going to do this, v4 would be a worthy choice :)

@normj
Copy link
Member

normj commented Oct 22, 2024

If applications today saves a date past 2038 the date is written as a formatted string. If V4 switched that so it was an 64bit epoch seconds then any code still using V3 will fail to convert the 64bit epoch into a DateTime. That means if we made this change in V4 without it being an opt-in mechanism all applications reading the data have to update at the same time. Otherwise there is a potential of a runtime failure reading the DDB dates.

As for the exception I don't see a problem changing the catch block to be more specific and catching System.OverflowException. I don't see a reason any other exception should be triggered in this code. Although I would do that change in V4 just to be safe.

@Dreamescaper
Copy link
Contributor

@normj
You're probably right. Do you think it would be acceptable to make changes to V3 to allow it to read longs in this case? So that V3's save behavoir would be unchanged, but it would be able to read longs if V4 stores it as long?

@sander1095
You have no changes for reading the data. And as far as I can see, it tries to read it as int, therefore, per my understanding, it would fail if you save the data, and then try to read it.

@normj
Copy link
Member

normj commented Oct 23, 2024

@Dreamescaper We can and should change V3 to understand long epoch but there is no guarantee customers will update their V3 SDK version so that doesn't alleviate the backwards compatible concerns that make this need to be an opt-in experience.

@Dreamescaper
Copy link
Contributor

So in order to have a breaking scenario, one would need to

  • Access the same table from multiple services
  • Update only one service to V4
  • Do not update the other service to V4, and to the latest V3
  • Save the date after 2034
  • Use StoreAsEpoch, but don't care if it is actually stored as a number (otherwise it's already broken)
  • Only use DynamoDbContext for access (otherwise it's already broken)

I just feel that the possibility of such breaking change is fully justified by a major version change.
And it does not justify adding another property, which would confuse everyone.

Not up to me to decide, of course

@normj
Copy link
Member

normj commented Oct 24, 2024

@Dreamescaper It might be rare but there is no way for me to measure how many users would be impacted. In the absence of data I have to be conservative. If this was your data that had a lot of consumers accessing the data and we broke you due to this change you would be pretty unhappy dealing with a production down issue and having to coordinate with all the consumers to push to prod new versions now to get production back up.

As long as we mark StoreAsEpoch as Obsolete with warning message explaining the difference in how epochs are stored I really don't think it will be that confusing what property to use.

@sander1095
Copy link
Author

Hey @normj

As for the exception I don't see a problem changing the catch block to be more specific and catching System.OverflowException. I don't see a reason any other exception should be triggered in this code.

We could do this. However, the core of the problem is that if there would ever be any other type of exception in the future, serialization would fail and it would fall back to storing it as datetime instead of epoch, but now with a different cause.

I believe that serialization should either succeed or fail, it should not fall back to a different strategy, unless a user acknowledges and allows it. What's your view on this?

Although I would do that change in V4 just to be safe.

To fix the issue I mentioned above, I believe any exception during (de)serialization should be "final" and blow up. This would need more investigation throughout the entire SDK to ensure this philosophy is applied everywhere. I am not planning on performing that in this PR, though. I do not know much of this v3 -> v4 upgrade you talk about, so perhaps it's a better fit for v4 then?

In any case, if you agree, would you mind creating an issue for this :D?


To summarize, this PR needs the following changes:

  • Mark StoreAsEpoch as obsolete
  • Create a new property StoreAsEpochLong and ensure that this does support 64 bit, which this PR adds.
  • ** What do we do about possible errors occuring during this de(serialization) process? Even with a new property, it would still fall back to storing as datetime...**

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants