TCP bufferbloat if WebSocket server keeps pushing data quickly to a slow client #170

Open
yli-cpr opened this issue Feb 15, 2018 · 8 comments

Comments

@yli-cpr

yli-cpr commented Feb 15, 2018

This could probably be solved by registering a Twisted producer: when Twisted calls pauseProducing, Daphne can simply disconnect the client.

Autobahn exposes this interface through its WebSocket protocols.

@andrewgodwin
Member

Could you flesh this out a bit more with how you detected this? That would be valuable for whoever picks this up, so they can verify a fix.

@yli-cpr
Author

yli-cpr commented Feb 15, 2018

I just managed to reproduce this issue.

  1. Have the worker send many messages to the reply channel.
  2. Have the client enter an infinite loop on the first event.
  3. Put nothing else, such as a load balancer, in between.
  4. Set Daphne's ping timeout to a larger value.

I observed that Daphne's memory usage kept growing (by megabytes). Reducing the ping timeout might help, but that assumes memory usage won't grow too high before the timeout fires.

@yli-cpr
Author

yli-cpr commented Feb 16, 2018

Also, the ping timeout won't work, because "last_data" is updated even on server-side sends!

I think that's another bug: the server sending data doesn't mean the client or the connection is healthy.
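The point above can be illustrated with a small sketch that tracks inbound and outbound activity separately, so that server-side sends never keep a silent client alive. This is not Daphne's code: `LivenessTracker`, `FakeClock`, and their attribute names are hypothetical, and the clock is injected only so the behaviour can be shown without waiting.

```python
import time


class LivenessTracker:
    """Hypothetical helper: judges liveness only by what the peer sends."""

    def __init__(self, ping_timeout, clock=time.monotonic):
        self.ping_timeout = ping_timeout
        self._clock = clock
        now = clock()
        self.last_received = now  # updated only by inbound frames or pongs
        self.last_sent = now      # informational, never used for liveness

    def on_receive(self):
        self.last_received = self._clock()

    def on_send(self):
        self.last_sent = self._clock()

    def is_timed_out(self):
        # Only silence from the peer counts towards the timeout.
        return self._clock() - self.last_received > self.ping_timeout


class FakeClock:
    """Manually advanced clock, used here instead of real time."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
tracker = LivenessTracker(ping_timeout=30, clock=clock)

# The server keeps sending for 50 seconds while the client stays silent.
for _ in range(10):
    clock.advance(5)
    tracker.on_send()

print(tracker.is_timed_out())  # True
```

With a single shared timestamp, the ten sends above would have reset the timer each time and the stalled client would never time out.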

@yli-cpr
Author

yli-cpr commented Feb 16, 2018

I tried installing a push producer and captured the pauseProducing call from Twisted. One trick: I had to unregister the previous producer (HTTPChannel) first.

# in onConnect:
self.transport.unregisterProducer()
self.registerProducer(PushProducer(self), True)

Ideally, Daphne should forward pauseProducing and resumeProducing to the worker.
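The snippet above does not show `PushProducer` itself. A minimal sketch of what such a class might look like follows, written without importing Twisted so it runs on its own. Everything here is an assumption for illustration: `FakeTransport` stands in for a real TCP transport, the `forward` callback is a hypothetical hook for passing pause/resume on to a worker, and in real Twisted code the class would normally be declared with `@implementer(IPushProducer)`.

```python
from types import SimpleNamespace


class FakeTransport:
    """Stand-in for a Twisted TCP transport (illustration only)."""

    def __init__(self):
        self.aborted = False

    def abortConnection(self):
        # The real abortConnection drops the socket without waiting for
        # buffered writes to flush, which matters when the client is stalled.
        self.aborted = True


class PushProducer:
    """Duck-typed push producer with the three IPushProducer methods."""

    def __init__(self, protocol, forward=None):
        self.protocol = protocol
        self.forward = forward  # hypothetical hook towards the worker
        self.paused = False

    def pauseProducing(self):
        self.paused = True
        if self.forward is not None:
            self.forward("pause")
        else:
            # Simplest policy: a client that cannot keep up is dropped.
            self.protocol.transport.abortConnection()

    def resumeProducing(self):
        self.paused = False
        if self.forward is not None:
            self.forward("resume")

    def stopProducing(self):
        self.paused = True


# Policy 1: disconnect the slow client as soon as Twisted asks for a pause.
protocol = SimpleNamespace(transport=FakeTransport())
PushProducer(protocol).pauseProducing()
print(protocol.transport.aborted)  # True

# Policy 2: forward the signals instead of disconnecting.
events = []
forwarding = PushProducer(SimpleNamespace(transport=FakeTransport()), forward=events.append)
forwarding.pauseProducing()
forwarding.resumeProducing()
print(events)  # ['pause', 'resume']
```

The first policy matches the original suggestion of disconnecting on pauseProducing; the second only records the signals, since how they would actually reach a worker is left open in this thread.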

@cpina

cpina commented Dec 15, 2022

I've hit what I think is the same or a similar problem while testing django/django#16384, when generating lots of data from Django (which we do in one project, generating files on the fly).

Carlton had some code:
django/django#16384 (comment)

Given a view such as:

import asyncio

from django.http import StreamingHttpResponse


async def generate():
    gb_to_send = 5

    chunk_size = 5 * 1024 * 1024

    total_sent = 0
    count = 0

    while total_sent < gb_to_send * 1024 * 1024 * 1024:
        data = f"{count % 10}" * chunk_size
        total_sent += len(data)
        count += 1
        await asyncio.sleep(0.000001) # change it to make slower / faster
        yield data


async def a_streaming_view(request):
    return StreamingHttpResponse(generate())

Then request it with curl and suspend curl (Control+Z in Linux/Mac shells) or even quit it (Control+C): data keeps being generated, using lots of RAM. I can provide a better example if needed.
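For contrast, here is a minimal sketch of what backpressure would look like at the application level: a producer that writes into a bounded `asyncio.Queue` is suspended once the queue is full, so a stalled consumer caps memory use. This is not how Daphne or `StreamingHttpResponse` work; the `produce` and `main` functions and the queue size are assumptions chosen only to show the behaviour that is missing in the report above.

```python
import asyncio


async def produce(queue, n_chunks, chunk_size, stats):
    for i in range(n_chunks):
        data = str(i % 10) * chunk_size
        # put() suspends the producer while the queue is full.
        await queue.put(data)
        stats["produced"] += 1


async def main():
    queue = asyncio.Queue(maxsize=4)  # at most 4 chunks held in memory
    stats = {"produced": 0}
    task = asyncio.create_task(produce(queue, 100, 1024, stats))

    # Simulate a stalled client: nothing is ever taken from the queue.
    await asyncio.sleep(0.05)
    produced_while_stalled = stats["produced"]

    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return produced_while_stalled


print(asyncio.run(main()))  # 4
```

The producer stops after four chunks instead of generating all one hundred, which is the opposite of the unbounded growth seen with the streaming view.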

@carltongibson
Member

Hey @cpina - yes please. If you're able to focus in on what's happening here that would be amazing. (Current plan is to swing back here after Django 4.2a1, so any work before then would be extra handy 🎁)

@cpina

cpina commented Dec 15, 2022

👍 I will prepare a self-contained example and write up my findings in the Daphne/Twisted code, which will hopefully help!

@cpina

cpina commented Dec 15, 2022

Self-contained example to see the memory increase:
https://gist.github.com/cpina/fe1e3fa982d09997a5957441b97c5d0c

This is the first time I've dived into Daphne and Twisted, so take the following hypothesis with a pinch of salt!

What I think is the amount of data still waiting to be sent to the client can be inspected in Daphne via (horrible):
self.channel.transport._tempDataLen
in daphne/http_protocol.py line 265, just before http.Request.write(self, message.get("body", b"")).

Also, it seems that Twisted does try to pause the producer: the method _maybePauseProducer in twisted/internet/abstract.py is executed and, if self._isSendBufferFull() returns True, it calls self.producer.pauseProducing() (twisted/web/http.py HTTPChannel.pauseProducing), but that cannot stop the producer. At this point I don't know what the "producer" should be, how it should be stopped, how Daphne should set it, or whether this is a red herring or the right path.
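The hypothesis above can be modelled in a few lines. The sketch below is a simplified stand-in, not Twisted's real implementation: `ModelTransport`, its snake_case method names, and the 64 KiB threshold are assumptions. It shows how a transport that only asks the producer to pause, while still accepting every write, keeps growing if the producer ignores the request.

```python
class ModelTransport:
    """Simplified model of a transport's send buffer (not Twisted's code)."""

    buffer_size = 2 ** 16  # assumed high-water mark, in bytes

    def __init__(self):
        self.pending = 0  # bytes accepted but not yet sent to the client
        self.producer = None
        self.producer_paused = False

    def register_producer(self, producer):
        self.producer = producer

    def write(self, data):
        # Data is always accepted; a full buffer only triggers a pause request.
        self.pending += len(data)
        self._maybe_pause_producer()

    def _is_send_buffer_full(self):
        return self.pending > self.buffer_size

    def _maybe_pause_producer(self):
        if (
            self.producer is not None
            and not self.producer_paused
            and self._is_send_buffer_full()
        ):
            self.producer_paused = True
            self.producer.pauseProducing()


class IgnoringProducer:
    """Producer that records the pause request but keeps writing anyway."""

    def __init__(self):
        self.pause_calls = 0

    def pauseProducing(self):
        self.pause_calls += 1


transport = ModelTransport()
producer = IgnoringProducer()
transport.register_producer(producer)

# The client never reads, so nothing drains; the application keeps writing.
chunk = b"x" * 65536
for _ in range(10):
    transport.write(chunk)

print(transport.pending)     # 655360
print(producer.pause_calls)  # 1
```

In this model the pause request fires exactly once and is then ignored, which is consistent with the observation that pauseProducing is called but memory keeps growing.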

Hopefully this helps somehow! I'm happy to test any possible changes or try to fix it (I need to familiarise myself with the related code first).


4 participants