Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORE-629: Fix use-after-free in BRPeerManager/BRPeer #295

Draft
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

michael-brd
Copy link
Contributor

I have some tests in testBwm.c that abuse the BRPeerManager (via the BRWalletManager). They call BRPeerManagerConnect and BRPeerManagerDisconnect, as well as BRPeerManagerDisconnect+BRPeerManagerFree. They also randomly set if the network is reachable or not.

Using these tests, I can reliably trigger user-after-frees on an ASAN build.

To get the discussion started on how to fix, I've got this PoC branch that "fixes" the issues. Fixes in quotations because this is probably not how we want to fix it.

Some of the issues/changes:

  • BRPeerManagerConnect used to explicitly call _peerDisconnected after a race-y check against the BRPeer's thread; removed this
  • Added asserts to capture pthread_create failures; these could/should be changed to the function returning success/failure and handling
  • We close the BRPeer's socket in BRPeerDisconnect, but the BRPeer thread closes it too. We can't do that, as the file descriptor could possibly be reused. Since the BRPeer thread owns the socket (and has a close), removed the close in BRPeerDisconnect. Instead, just a shutdown is performed to signal our intent to close
  • The BRPeerManager code sort of assumes that a BRPeer has a thread (pre-emptively increments peerThreadCount, only calls BRPeerFree in _peerDisconnected, etc.) but if the network is unreachable, no BRPeer thread is created. I removed the waitingForNetwork BRPeer field and rolled it into the status. On BRPeerDisconnect, if we never launched the thread (i.e. in the BRPeerStatusWaiting state), launch a thread that does the same actions a "real" BRPeer thread would do (i.e. call disconnect, do the thread cleanup).
  • Added asserts to capture future issue with peer thread count "underflow"-ing

@michael-brd michael-brd requested review from voisine and EBGToo October 8, 2019 18:37
Comment on lines +1121 to +1125
assert (0 == array_count(ctx->pongCallback));
assert (NULL == ctx->mempoolCallback);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These probably aren't needed. Added them due to my unfamiliarity with this aspect of the BTC code (assert would catch if we needed to handle this).

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely the most controversial part; the spawning of a "cleanup"-like thread. Also, the dual nature of this function...

@@ -93,7 +93,8 @@ extern "C" {
typedef enum {
BRPeerStatusDisconnected = 0,
BRPeerStatusConnecting,
BRPeerStatusConnected
BRPeerStatusConnected,
BRPeerStatusWaiting
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roll the separate waitingForNetwork flag into this.

@@ -839,7 +839,7 @@ static void _peerDisconnected(void *info, int error)
if (manager->connectFailureCount > MAX_CONNECT_FAILURES) manager->connectFailureCount = MAX_CONNECT_FAILURES;
}

if (! manager->isConnected && manager->connectFailureCount == MAX_CONNECT_FAILURES) {
if (! manager->isConnected && manager->connectFailureCount >= MAX_CONNECT_FAILURES) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd seen cases where this was greater than MAX_CONNECT_FAILURES. In BRPeerManagerConnect and BRPeerManagerPublishTx, we checked for >=, mirrored that here as I had seen cases where it was greater than at this point.

@@ -1679,7 +1673,7 @@ void BRPeerManagerDisconnect(BRPeerManager *manager)
p = manager->connectedPeers[i - 1];
manager->connectFailureCount = MAX_CONNECT_FAILURES; // prevent futher automatic reconnect attempts
BRPeerDisconnect(p);
if (BRPeerConnectStatus(p) == BRPeerStatusConnecting) manager->peerThreadCount--; // waiting for network
while (BRPeerConnectStatus(p) == BRPeerStatusConnecting) BRPeerDisconnect(p);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me leery (endless loop anyone?) but the logic should be that this only happens when the thread is being spawned. Once spawned, the BRPeer thread will set the status to connected or disconnected. In either case, this will break out (fingers crossed but also seen in testing).

@michael-brd michael-brd changed the title CORE-629: Fix memory corruption issues CORE-629: Fix use-after-free issue in BRPeerManager/BRPeer Oct 8, 2019
@michael-brd michael-brd changed the title CORE-629: Fix use-after-free issue in BRPeerManager/BRPeer CORE-629: Fix use-after-free in BRPeerManager/BRPeer Oct 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants