LLM Load balance #252

Open
Oceania2018 opened this issue Jan 15, 2024 · 1 comment

Comments

@Oceania2018
Member

Inspired by this load balancing idea.
Load balancing should work across multiple models, providers, and API keys, so that no single key or provider hits its token/rate limits.
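
For illustration, a minimal Python sketch of the basic idea (the deployment entries and key names below are hypothetical, not tied to any specific provider SDK): keep a pool of deployments for the same logical model and rotate requests across them round-robin so traffic and token usage are spread out.

import itertools

# Hypothetical pool of deployments: the same logical model served by
# different providers/keys so no single key absorbs all the traffic.
deployments = [
    {"provider": "openai", "model": "gpt-4", "api_key": "KEY_A"},
    {"provider": "azure",  "model": "gpt-4", "api_key": "KEY_B"},
    {"provider": "azure",  "model": "gpt-4", "api_key": "KEY_C"},
]

# Round-robin iterator: consecutive requests go to different deployments.
rotation = itertools.cycle(deployments)

def pick_deployment():
    """Return the next deployment in round-robin order."""
    return next(rotation)

for i in range(5):
    d = pick_deployment()
    print(f"request {i} -> {d['provider']} / {d['model']}")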

@Oceania2018 Oceania2018 self-assigned this Jan 22, 2024
Oceania2018 added a commit that referenced this issue Jan 22, 2024
@ishaan-jaff

Hi @Oceania2018, I'm the maintainer of LiteLLM - we provide an open-source proxy for load balancing across Azure, OpenAI, Bedrock, Vertex, and 100+ LLMs.

It can process 500+ requests/second.

From this thread it looks like you're trying to maximize throughput - I hope our solution makes it easier for you. (I'd love feedback if you're trying to do this.)

Here's the quick start on using LiteLLM Proxy for load balancing

Doc: https://docs.litellm.ai/docs/proxy/reliability

Step 1: Create a config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key: 
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/

Step 2: Start the litellm proxy:

litellm --config /path/to/config.yaml

Step 3: Make a request to the LiteLLM proxy:

curl --location 'http://0.0.0.0:8000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
      "model": "gpt-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'
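
The same request can also be made from Python with the OpenAI client pointed at the proxy - a minimal sketch, assuming the proxy from Step 2 is running locally on port 8000 and exposes an OpenAI-compatible API as in the curl example (the api_key passed to the client is a placeholder; the proxy holds the real provider keys):

from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:8000",  # LiteLLM proxy started in Step 2
    api_key="anything",              # placeholder; real keys live in config.yaml
)

response = client.chat.completions.create(
    model="gpt-4",  # routed across the gpt-4 deployments in config.yaml
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(response.choices[0].message.content)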
