Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FirmwareInfo has wrong sizes for fields #411

Open
thetek42 opened this issue Apr 11, 2024 · 13 comments
Open

FirmwareInfo has wrong sizes for fields #411

thetek42 opened this issue Apr 11, 2024 · 13 comments
Labels
bug Something isn't working

Comments

@thetek42
Copy link
Contributor

Currently, FirmwareInfo is defined as follows:

pub struct FirmwareInfo {
    pub version: heapless::String<24>,
    pub released: heapless::String<24>,
    pub description: Option<heapless::String<128>>,
    pub signature: Option<heapless::Vec<u8, 32>>,
    pub download_id: Option<heapless::String<128>>,
}

However, the fields version and released are too small. The esp_app_desc_t struct defines version to be 32 characters long. time and date are both 16 bytes long, which would require 33 bytes in total for released (because released is currently formatted as "{date} {time}").

typedef struct {
    uint32_t magic_word;        /*!< Magic word ESP_APP_DESC_MAGIC_WORD */
    uint32_t secure_version;    /*!< Secure version */
    uint32_t reserv1[2];        /*!< reserv1 */
    char version[32];           /*!< Application version */
    char project_name[32];      /*!< Project name */
    char time[16];              /*!< Compile time */
    char date[16];              /*!< Compile date*/
    char idf_ver[32];           /*!< Version IDF */
    uint8_t app_elf_sha256[32]; /*!< sha256 of elf file */
    uint32_t reserv2[20];       /*!< reserv2 */
} esp_app_desc_t;

This is currently an issue, because if the version contained in the binary is longer than 24 bytes, this line will cause a panic.

Changing the length probably requires changing it in embedded-svc as well. The alternative (but worse) solution would be to trim both version and release date info.

@ivmarkov
Copy link
Collaborator

Can you give an example of a value in date and a value in time that takes all the 16 characters? My assumption for date is that it should be no longer than 10 characters (as in yyyy-mm-dd) and time should be no longer than 12 chars (hh:mm:ss.mmm) where mmm is the milliseconds range. All of that in UTC of course. This makes for a total of 22 characters, with two extra left as a 0s.

@thetek42
Copy link
Contributor Author

A time longer than that will likely never happen. But in theory (i.e. when the fields are manually overwritten) it would be possible. The version field does however provide a problem (e.g. from git describe: 1.23.45-67-g1234567-dirty would be 25 characters long).

@ivmarkov
Copy link
Collaborator

ivmarkov commented Apr 12, 2024

Excuse my ignorance, but from where is 1.23.45-67-g1234567-dirty is coming? In my mind, the git version should either be a tag, a branch name (not ideal!), or a sha1 hash. On tag and branch names you have control. On the hash, you don't, but it won't fit in the 32 chars of the native esp idf struct either unless you encode the 40 hexadecimal digits of the sha1 hash as two per byte. However if you do that, the hash would fit in the 24 bytes version in FirmwareInfo as well?

@thetek42
Copy link
Contributor Author

By default, ESP-IDF uses git describe --always for filling the version field (see https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/misc_system_api.html#app-version).

There are three options of how to avoid this issue:

  • manually specify a version (but then, git describe cannot be used)
  • restrain yourself to keeping the <tag>-<commit-count> part to <10 characters (but then, tags such as 3.10.22 or v0.1.23 would not be feasible)
  • never (re)compile esp-idf(-sys) when there are uncommited changes in order to not have the -dirty at the end of the version string

Those "solutions" aren't really solutions tough. If esp-idf supports 32 characters for a version string, esp-idf-svc needs to be able to deal with that somehow.

@ivmarkov
Copy link
Collaborator

But we also can't just extend the existing fields of FirmwareInfo as that might break older upgrades?

Open to ideas how we van extend the version, and possibly date and time fields in a backwards compatible way...

@Vollbrecht Vollbrecht added the bug Something isn't working label Jun 21, 2024
@AVee
Copy link

AVee commented Jul 2, 2024

Firstly, I'd really like to see the unwrap() calls disappear there, right now I can't use esp-idf-svc to check what is on my OTa partitions as I cannot be sure it won't crash my device. I feel a call to something like EspOta::get_running_slot() should just never panic but return errors where relevant.

There are three options of how to avoid this issue:
...
Or ensure the version is always set to the Cargo version as esp_app_desc!() does? In the context of a rust application I think that would be the correct value to use. And the current behavior were the version only get's updated when esp-idf-sys happens to be recompiled is kinda broken anyway, as it can be out of sync with the version of the application being build.

But we also can't just extend the existing fields of FirmwareInfo as that might break older upgrades?

I'd think that just increasing the lengths of the strings is likely to be limited as heapless::String<24> and heapless::String<32> will largely support all the same operations. And where it doesn't there probably is a compile error which is trivial to fix. Additionally, given that a default build causes crashes and/or incorrect version numbers I doubt it's used much anyway.

Personally, given the choice between having backwards compatibility or risk a device bootlooping because one of the partitions contains a app with the default version produced by IDF I'd take the breaking change.

@Vollbrecht
Copy link
Collaborator

For a possible blast-radius i checked esp-hal, and there usage of embedded-svc. They are currently only using it in esp-wifi and don't use the ota stuff if i see correctly. So at least we would not break there anything.

But yeah as a minimal solution we should at least add something to the impl From implementation so we don't unwrap on an input that is a valid esp_app_desc_t field value. A dirty fix where we would not introduce a breaking change would be to clamp whatever we get from unsafe { from_cstr_ptr(&app_desc.version as *const _) } .try_into() .unwrap(), to 24 chars and than we could issue a warning to the user.

@ivmarkov
Copy link
Collaborator

ivmarkov commented Jul 2, 2024

But we also can't just extend the existing fields of FirmwareInfo as that might break older upgrades?

Open to ideas how we van extend the version, and possibly date and time fields in a backwards compatible way...

Did you guys read this?

@AVee
Copy link

AVee commented Jul 2, 2024

But we also can't just extend the existing fields of FirmwareInfo as that might break older upgrades?
Open to ideas how we van extend the version, and possibly date and time fields in a backwards compatible way...

Did you guys read this?

Yes, I did, but I might be missing something. This is how I think thing would work out if FirmwareInfo struct would have some fields be extended, please correct me if I'm wrong:

I'd expect the new code to be compile time incompatible with existing rust code in a few usage scenarios, I'd expect the compiler to catch these cases forcing the code changes needed to happen. As such it could be a breaking change at compile time for some users (but I doubt it).

But, as far as I can see, it should not negatively affect the actual upgrading of devices as it does not affect the format of the actual binary in any way (that is dictated by IDF). The current code will deal with application images and partitions as long as the version field does not exceed 24 bytes. Changed code will handle those just fine as well, but also deal with images containing a longer version field. Extending the field in FirmwareInfo will not change the way version numbers are generated in any way, so if OTA updates are currently working they should still work both for upgrades and downgrades after that change.

So unless you refer to something else when you say it might break 'older upgrades', I really don't see anything bad happening if those fields are extended.

@ivmarkov
Copy link
Collaborator

ivmarkov commented Jul 3, 2024

But, as far as I can see, it should not negatively affect the actual upgrading of devices as it does not affect the format of the actual binary in any way (that is dictated by IDF).

I agree. The one and only problem with simply extending the fields is that it makes it too easy for folks which already have production deployments of firmware with shorter fields to eventually forget that even though they now have fields which can take longer data - they must use shorter data in those forever. Or at least until they can guarantee that the last device they have ever sold/deployed in the field is running with a newer firmware that supports reading the longer fields.

This is not something that you can necessarily catch during testing, because - after a few release iterations you might forget that you might still be having devices with old firmware that can only read shorter fields. Then you'll release firmware with long data in the fields, and suddenly any laggers out there in the field which are like 10 versions behind your latest firmware will not be upgradable to it.

That's all.

I think we can live with that though, as long as we do something along the lines of what @Vollbrecht suggests - at compile time - issue a warning if a firmware is generated which puts longer data in the fields than the current limits. This warning might be annoying for newcomers which are just about to release their first production firmware, but might help folks who are already in production.

Also: I might be extra paranoid here, but this is OTA, and we must be careful. I've been yelled at twice already for other OTA changes, which are not even changing the firmware format (just changing the API).

The other thing I would like that we are exploring is getting rid (if at all possible) from the heapless fields in the FirmwareInfo structure and replacing these with regular references (as in FirmwareInfo<'a>). We should at least think what that would entail and whether we should do it. Because - again - we would be doing a breaking change, so we should try to push this too.

If it is not clear, having heapless in the API is not ideal, as heapless is not at version "1" yet.

@AVee
Copy link

AVee commented Jul 3, 2024

I appreciate the paranoia, OTA is one of those 'better safe then sorry' things ;)

I've done some testing with the most basic implementation of an OTA update, something like this:

let update = self.ota.initiate_update().unwrap();
update.write_all(data).unwrap();
update.complete()
restart()

That does not create any FirmwareInfo objects but just calls into IDF directly. A quick test confirms this code does successfully update to a version containing a 29 bytes version string. It's only when the get_*_slot() methods are used on a partition when I run into problems. That said, I can see someone using those methods to check what was done after writing the update so the warning is still a good idea.

Another thing to be aware of: The docs for the ESP app image format have this note:

The maximum length is 32 characters, including null-termination character. For example, if the length of PROJECT_NAME exceeds 31 characters, the excess characters will be disregarded.

So the actual length would be 31 characters, not 32.

I like the idea of getting rid of heapless and it makes sense to do the breaking changes in one go. We could consider just using byte arrays in the struct and providing helper functions to produce an &str when requested? Like this:

pub struct FirmwareInfo {
    pub version: [u8;32],
    ...
}

impl FirmwareInfo {
    pub fn version(&self) -> Result<&str, Utf8Error> {
        let len = self.version.iter().position(|&c| c == b'\0').unwrap_or(self.version.len());
        std::str::from_utf8(&self.version[..len])        
    }
}

This would have the added benefit that creating the FirmwareInfo can never fail as it just takes the bytes. Any error handling is moved into the accessor functions and there is the option to inspect the raw data if needed.

And once you get into the idea of accessor functions, maybe FirmwareInfo can just be a trait implemented on esp_app_desc_t?

@ivmarkov
Copy link
Collaborator

ivmarkov commented Jul 8, 2024

I like the idea of getting rid of heapless and it makes sense to do the breaking changes in one go. We could consider just using byte arrays in the struct and providing helper functions to produce an &str when requested? Like this:

pub struct FirmwareInfo {
    pub version: [u8;32],
    ...
}

impl FirmwareInfo {
    pub fn version(&self) -> Result<&str, Utf8Error> {
        let len = self.version.iter().position(|&c| c == b'\0').unwrap_or(self.version.len());
        std::str::from_utf8(&self.version[..len])        
    }
}

Well this is a bit like re-inventing heapless::String except based on ASCIIZ terminator instead of a length prefix, so not sure how I feel about it! :) And the problem is, this version array is still in embedded-svc which is supposed to be MCU-agnostic and yet it still contains an ESP-IDF-specific thing, which is the length of the version.

I was rather thinking (but that might be impossible) to lifetime the FirmwareInfo struct by 'a and then the struct should return &str references into another object. But that might be impossible OR it might require weird API, in that we should be required to receive the FirmwareInfo in a user-provided callback if the underlying esp_app_desc_t is a temporary object.

And once you get into the idea of accessor functions, maybe FirmwareInfo can just be a trait implemented on esp_app_desc_t?

Yes, this seems much cleaner! Also cleaner than my FirmwareInfo<'a> suggestion, but would also require that Slot is becoming a trait...

I think we need to experiment with this a bit (that is - making Slot and FirmwareInfo traits). Might or might not work, and might require a few iterations.

The big benefit is that if we need to keep Slot or FirmwareInfo or both as copies of the actual ESP-IDF structures, we can still use heapless::String to power those internally, but heapless will be out of the API surface.

Would you like to drive this?

@AVee
Copy link

AVee commented Jul 9, 2024

Would you like to drive this?

I wouldn't mind giving it a go, but it will be end of august at the earliest as I'll be on vacation. I somebody else wants to pick it up before that I'm fine with that too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Todo
Development

No branches or pull requests

4 participants