-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: bind to a specific local IPv6 address during pair verify #272
base: main
Are you sure you want to change the base?
Conversation
Cross reference aiocoap#298 |
I'm running this locally now. I've been seeing dozens of 4.04 errors a day. Let's see if this cuts that out. |
Codecov ReportBase: 66.74% // Head: 66.59% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #272 +/- ##
==========================================
- Coverage 66.74% 66.59% -0.15%
==========================================
Files 71 71
Lines 6697 6718 +21
==========================================
+ Hits 4470 4474 +4
- Misses 2227 2244 +17
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
This will prevent future traffic from picking the "best" address to communicate with an accessory. Current products tie sessions to the original IPv6 address and return a CoAP error code 4.04 when a new address attempts to continue the session. It is likely that this could explode in fun, new ways: the DHCPv6 server returning a new address, the host selecting a new random SLAAC address, etc.
Wrap "get local IPv6 address" function in try/except. This prevents network errors from marking the device as not being provided by the integration. This does seem to have eliminated frequent 404s from my environment. |
A CoAP error code of 4.04 can indicate that the accessory has no record of the HAP session. This will certainly be the case if a device has lost power. Shutdown the session on our side so we can trigger a reconnect.
This seems like a good fix, provided the library can react to local address changes. From the code, I wonder if the logic could be made to prioritize ULAs or link-local versus say an ISP publicly routable address that is more subject to change (especially when using privacy-preserving randomized addresses). |
The ip module has various attributes that could help with that: https://docs.python.org/3/library/ipaddress.html#ipaddress.IPv6Address.is_private |
In terms of detecting failed connections, hoping we can get around to testing if the coap transport publishes a sane s# value in mdns. That number should go up if there is a state change that no one observed. So if we see it go up, we know we should restart the session. |
It's nice that we get a new source address every time we pair verify. My instinct is that it'd be nice if we could have the events service bind to * but set the source address on outgoing traffic. But I don't think that helps much in practice - if the local ip changes the device won't know it and if the remote ip changes we handle it through mdns. So let's not bother. In terms of detecting our own ip changing, ideally I'd want to defer that to HA. Presumably they have to handle that to update their mdns record. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
openthread/openthread#1520 is another case where openthread code had to be mindful of source address.
At the moment I'm tempted to merge the 4.04 handling and recovering only, as I'd like to try and understand the situations where the source address is wrong before deploying this work around. And I just don't think I'm going to get in front of a computer before the Jan release is out. I'm struggling to think of scenarios where the source address flips if we have picked the thread ULA of the device as the destination address, the best one i came up with is if your HA loses its thread route but your home router doesn't. Default route would get it from HA to your main router, then your router delivers it to the thread router. But home routers don't generally accept thread routes out of the box, and would rely on your thread devices doing DHCP6, not just your thread router. That would be surprising as they are different L2 segments aren't they? If we do go ahead with this work around I want to test what does happen if the ipv6 did address of HA did change - hopeful any pending recvs would get cancelled, and bubble up asyncio into our code? |
2.4.2 handles |
And PR to land it in HA - home-assistant/core#84700. Hopefully I can find time with all my devices (test trashing the network etc ti test recovery) to test the full PR but can't say it's likely before mid jan. |
I upgraded HA which wiped out this full by-hand change. I don't "see" the reconnects any more from the perspective of the frontend but they are still happening w/o the full fix. |
Yep, still on my radar to loop back round to here and test out the full fix when I've got more time to try and break things. |
This will prevent future traffic from picking the "best" address to communicate with an accessory. Current products tie sessions to the original IPv6 address and return a CoAP error code 4.04 when a new address attempts to continue the session.
It is likely that this could explode in fun, new ways: the DHCPv6 server returning a new address, the host selecting a new random SLAAC address, etc.