Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential issue with timeout #34

Open
gterzian opened this issue Jul 28, 2019 · 6 comments
Open

Potential issue with timeout #34

gterzian opened this issue Jul 28, 2019 · 6 comments

Comments

@gterzian
Copy link
Member

gterzian commented Jul 28, 2019

In the recv_timeout at

if let Ok(msg) = crate::recv_timeout(&self.receiver, TIMEOUT) {

Since it's called as part of perform_updates on the main-thread.

I think it might be possible for script to hijack the main-thread and make it unresponsive to user input in the XR device, since if script keeps sending messages within 5ms(for example using requestAnimationFrame every 5 ms, or even less since the main-thread will be a bit busy in between handling each message), the main-thread will be stuck there and I don't think it could then handle new input. In the meantime the script could keep running and the user wouldn't be able to do anything about it.

For example in the Hololens code, the loop would get stuck at https://github.com/servo/servo/blob/b52bfbe68aed9b0abb39a6e994c5dad37cf7d33c/support/hololens/ServoApp/BrowserPage.cpp#L168

cc @paulrouget

@gterzian
Copy link
Member Author

gterzian commented Jul 29, 2019

Ah no it's not true in all cases, since cancelling the raf callback still results in sending a SessionMsg::RenderAnimationFrame, which increments the timestamp and would break the main-thread out of the while loop.

https://github.com/servo/servo/blob/89bade45b5d33b2f9df20888b00fd95e1c08191d/components/script/dom/xrsession.rs#L227

So if script were to send a raf every 5 ms, and immediately cancel the callback, you'd still break out of the loop when the raf callback, cancelled or not, is executed and a message is sent back to the main-thread.

However, that requires the callback to execute, which means script has to run the task enqueued at https://github.com/servo/servo/blob/89bade45b5d33b2f9df20888b00fd95e1c08191d/components/script/dom/xrsession.rs#L310

So if script were to just start looping in a task, and send a raf every 5 ms, it could keep the main-thead looping and prevent the raf callback from executing by remaining in a long-running task.

However it would be obvious a script was long-running, so the UA could actually stop it.

So we can still say that while script is running inside the same task, and that task sends a raf every 5 ms, Servo would remain unresponsive to user input, since the main-thread would be stuck in a loop handling raf requests.


The "bad scenario" would look something like:

While the "main script" would stay inside a long-running task, sending raf every 5ms keeping the main-thread looping, it could have already spin-up a dedicated worker and thereby retaining access to a normally running event-loop, which means it could use fetch and other async apis from the worker.

All the while, the user is stuck in a frozen XR session, and clicking "stop" doesn't do anything.

If the constellation were to send a "ask the user to kill a script" message to the main-thread(which we currently don't have I think), that message wouldn't be handled since the main-thread is stuck in a raf loop. So the constellation would have to decide to kill the script on it's own, without confirmation from the user.

Now if the constellation blocks on the response from the "ask the user to kill a script" message, it would actually deadlock, all the while script can keep running until the user takes off the headset and manually turns off the machine or kills the all browser-related processes.

The better scenario sees the constellation suspend script first, then send a message to the main-thread asking whether to kill it, which would eventually be handled since the main-thread would have been "freed" from the raf loop that was kept going by script.

@asajeffrey
Copy link
Member

In order to perform this attack, script would have to create an rAF callback that never ended (in order to block the RenderAnimationFrame message) but in that case script wouldn't send another RequestAnimationFrame message, and the timeout would expire.

That said, the code is a bit brittle, it might be better to use recv_deadline rather than recv_timeout.

@gterzian
Copy link
Member Author

gterzian commented Jul 30, 2019

In order to perform this attack, script would have to create an rAF callback that never ended, but in that case script wouldn't send another RequestAnimationFrame message, and the timeout would expire.

The way I read RequestAnimationFrame

https://github.com/servo/servo/blob/9043f247d9b031ed285e880e4b90aa523d4a63ae/components/script/dom/xrsession.rs#L287

is that it doesn't prevent the currently running task in script from making an arbitrary number of calls, and each call sends a SessionMsg, via Session.request_animation_frame:

pub fn request_animation_frame(&mut self, dest: Sender<(HighResTimeStamp, Frame)>) {

So the attack doesn't so much require a never ending raf callback, it requires a long-running task making calls to RequestAnimationFrame every 5 ms.

Each call sends a message to the main-thread, which resets the timeout, but doesn't increment the timestamp(that requires the raf callback to execute).

The long-running task prevents raf callback from executing, since they require xr_raf_callback task enqueued by the router to executed on the loop, which isn't going to happen so long as another task is keeping the loop busy.

https://github.com/servo/servo/blob/9043f247d9b031ed285e880e4b90aa523d4a63ae/components/script/dom/xrsession.rs#L310

This could be solved on the script side by requiring a RequestAnimationFrame to only queue a callback if the last callback was executed, unfortunately that doesn't appear spec compliant...
https://immersive-web.github.io/webxr/#dom-xrsession-requestanimationframe

Maybe you could fix this by only sending a SessionMsg::RequestAnimationFrame when a raf callback has executed since the last call to RequestAnimationFrame?

You need only one raf callback to executed all currently enqueued callbacks, so you need only to send one SessionMsg::RequestAnimationFrame per arbitrary number of raf requests, since all callbacks when execute when the response comes. So you only need to send a new message after a raf callback executes.

I've opened a PR with a potential fix servo/servo#23891

@asajeffrey
Copy link
Member

We might also want to replace recv_timeout by recv_deadline as a belt-and-braces solution?

bors-servo pushed a commit to servo/servo that referenced this issue Jul 30, 2019
WebXr: allow for only a single raf message, until callbacks execute

<!-- Please describe your changes on the following line: -->

See servo/webxr#34

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [ ] `./mach test-tidy` does not report any errors
- [ ] These changes fix #___ (GitHub issue number if applicable)

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because ___

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/23891)
<!-- Reviewable:end -->
@gterzian
Copy link
Member Author

gterzian commented Jul 31, 2019

We might also want to replace recv_timeout by recv_deadline as a belt-and-braces solution?

Yes I guess in crossbeam terms the below could be done. And the IPC implementation would be doing try_recv in a loop while checking the timeout?

The main point is that the timeout is not reset each time a message is received....

let deadline = after(timeout);
while true {
    select! {
        recv(r) -> msg => handle_msg(msg),
        recv(deadline) -> _ => break,
    }
}

Also I think it's worth re-highlighting that the underlying problem is that WebXR is running a "sub message-loop" inside perform_updates, which itself is one part of the "main device loop". And while the sub-loop is running, the device loop is not, which means user input is not handled(At least in Hololens).

And the danger is that since the sub-loop is driven by messages coming from script, it could find a way to keep the loop running and prevent UI events from being handled.

@asajeffrey
Copy link
Member

Well yes, really we should just have one main thread loop, but I'm not sure how to achieve that without a drastic rewrite. I'm going to open a "main thread" metabug over in servo/servo to try to capture the discussion, which is currently happening in lots of different issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants