LLM Load balance #252

Open
Oceania2018 opened this issue Jan 15, 2024 · 1 comment

Comments

@Oceania2018
Member

Inspired by this load balancing idea.
Load balancing should work across multiple models, providers, and API keys, so that no single key or provider hits its token/rate limits.
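
For illustration, a minimal Python sketch of the basic idea (the deployment entries and key names below are hypothetical, not tied to any specific provider SDK): keep a pool of deployments for the same logical model and rotate requests across them round-robin so traffic and token usage are spread out.

import itertools

# Hypothetical pool of deployments: the same logical model served by
# different providers/keys so no single key absorbs all the traffic.
deployments = [
    {"provider": "openai", "model": "gpt-4", "api_key": "KEY_A"},
    {"provider": "azure",  "model": "gpt-4", "api_key": "KEY_B"},
    {"provider": "azure",  "model": "gpt-4", "api_key": "KEY_C"},
]

# Round-robin iterator: consecutive requests go to different deployments.
rotation = itertools.cycle(deployments)

def pick_deployment():
    """Return the next deployment in round-robin order."""
    return next(rotation)

for i in range(5):
    d = pick_deployment()
    print(f"request {i} -> {d['provider']} / {d['model']}")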

@Oceania2018 Oceania2018 self-assigned this Jan 22, 2024
Oceania2018 added a commit that referenced this issue Jan 22, 2024
@ishaan-jaff

Hi @Oceania2018, I'm the maintainer of LiteLLM - we provide an open-source proxy for load balancing across Azure, OpenAI, Bedrock, Vertex, and 100+ LLMs.

It can process 500+ requests/second.

From this thread it looks like you're trying to maximize throughput - I hope our solution makes it easier for you. (I'd love feedback if you're trying to do this.)

Here's the quick start on using LiteLLM Proxy for load balancing

Doc: https://docs.litellm.ai/docs/proxy/reliability

Step 1: Create a config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key: 
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/

Step 2: Start the litellm proxy:

litellm --config /path/to/config.yaml

Step 3: Make a request to the LiteLLM proxy:

curl --location 'http://0.0.0.0:8000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
      "model": "gpt-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'
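
The same request can also be made from Python with the OpenAI client pointed at the proxy - a minimal sketch, assuming the proxy from Step 2 is running locally on port 8000 and exposes an OpenAI-compatible API as in the curl example (the api_key passed to the client is a placeholder; the proxy holds the real provider keys):

from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:8000",  # LiteLLM proxy started in Step 2
    api_key="anything",              # placeholder; real keys live in config.yaml
)

response = client.chat.completions.create(
    model="gpt-4",  # routed across the gpt-4 deployments in config.yaml
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(response.choices[0].message.content)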
