
Feature Request: Cost Calculation from usage (Based on Provider + Model prices) #3932

Open
brennanmceachran opened this issue Nov 28, 2024 · 0 comments
Labels
enhancement New feature or request

brennanmceachran commented Nov 28, 2024

Feature Description

The number of AI providers is growing, the list of models is growing, and the nuances of per-token pricing are growing too; we'll probably never be able to estimate costs prior to an API call again. And while the increasingly complex pricing structures have generally brought costs down... they've made actual spend hard to know!

Computing the cost of usage from actual data has become a daunting task due to the nuances of token-based billing. E.g.:

  • Model × provider variability: each provider uses distinct rates for each of its models, with variations in token costs for prompts, completions, and additional features.
  • Granular token pricing rules: providers like OpenAI charge differently for cached tokens, multimodal tokens (e.g., audio, vision), and even rejected prediction tokens, which adds layers of complexity.
  • Lack of standardized calculators: currently, developers manually map usage data to pricing rates (at least that's what I'm doing).

Furthermore, token counts aren’t directly comparable across providers, making it impossible to evaluate usage across models in a meaningful way. However, converting token usage into USD provides a clear, apples-to-apples comparison.

For example, consider the following usage report from OpenAI:

{
  "usage": {
    "prompt_tokens": 8100,
    "completion_tokens": 3900,
    "total_tokens": 12000,
    "prompt_tokens_details": { 
      "cached_tokens": 2100,
      "audio_tokens": 0 
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 1800,
      "rejected_prediction_tokens": 1000
    }
  }
}

Mapping this data to different model prices means we're spending between $0.003 and $0.339 (*I think!), a 100x difference, and that's before accounting for the o1 models' reasoning tokens (which add a few hundred to tens of thousands of output tokens).

Calcs
1. gpt-4o: $0.056625
  - cached prompt tokens: 2100 * ($1.25 / million)
  - uncached prompt tokens: 6000 * ($2.50 / million)
  - output tokens: 3900 * ($10.00 / million)
2. gpt-4o-mini: $0.0033975
  - cached prompt tokens: 2100 * ($0.075 / million)
  - uncached prompt tokens: 6000 * ($0.150 / million)
  - output tokens: 3900 * ($0.600 / million)
3. o1-preview: $0.33975
  - cached prompt tokens: 2100 * ($7.50 / million)
  - uncached prompt tokens: 6000 * ($15.00 / million)
  - output tokens: 3900 * ($60.00 / million)
4. o1-mini: $0.06795
  - cached prompt tokens: 2100 * ($1.50 / million)
  - uncached prompt tokens: 6000 * ($3.00 / million)
  - output tokens: 3900 * ($12.00 / million)
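The per-model arithmetic above can be sketched as a small helper. Everything below (the Usage/Rates shapes and the costUSD function) is illustrative, not existing AI SDK API, and the rates shown are the gpt-4o prices assumed in calc 1:

```typescript
// Sketch: compute USD cost from a usage report plus per-million-token rates.
interface Usage {
  promptTokens: number;        // total prompt tokens (cached + uncached)
  cachedPromptTokens: number;  // subset billed at the cached rate
  completionTokens: number;
}

interface Rates {
  inputPerM: number;        // USD per 1M uncached prompt tokens
  cachedInputPerM: number;  // USD per 1M cached prompt tokens
  outputPerM: number;       // USD per 1M completion tokens
}

function costUSD(usage: Usage, rates: Rates): number {
  const uncached = usage.promptTokens - usage.cachedPromptTokens;
  return (
    usage.cachedPromptTokens * rates.cachedInputPerM +
    uncached * rates.inputPerM +
    usage.completionTokens * rates.outputPerM
  ) / 1_000_000;
}

// The usage report from above, priced at (assumed) gpt-4o rates:
const usage: Usage = { promptTokens: 8100, cachedPromptTokens: 2100, completionTokens: 3900 };
const gpt4o: Rates = { inputPerM: 2.5, cachedInputPerM: 1.25, outputPerM: 10 };
console.log(costUSD(usage, gpt4o)); // 0.056625, matching calc 1
```

Swapping in the other three rate sets reproduces calcs 2 through 4, which is exactly the lookup-then-multiply work developers are currently doing by hand.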

While the AI SDK makes it easy to see the differences between each model's unique benefits, it leaves cost opaque. Surfacing cost would let developers do cost/benefit analysis as they build.

Feature request:
Parse CompletionTokenUsage, apply provider/model-specific pricing, and return the total cost on generation API calls; or include the cost within the CompletionTokenUsage object.

This would offer developers a reliable, standardized way to compute costs accurately from actual usage data.
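A minimal sketch of what this could look like: a pricing table keyed by provider/model that the SDK consults to attach a cost field to the usage it already returns. All names here (PRICING, attachCost, costUSD) are hypothetical, not real AI SDK API, and the rates are examples that would need to track actual provider pricing:

```typescript
// Hypothetical sketch: a pricing registry plus a helper that augments a usage
// object with costUSD when the model's rates are known, and leaves it
// untouched otherwise.
type Rates = { inputPerM: number; cachedInputPerM: number; outputPerM: number };

// Example rates only; a real table must track provider price changes.
const PRICING: Record<string, Rates> = {
  "openai/gpt-4o":      { inputPerM: 2.5,  cachedInputPerM: 1.25,  outputPerM: 10  },
  "openai/gpt-4o-mini": { inputPerM: 0.15, cachedInputPerM: 0.075, outputPerM: 0.6 },
};

interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  cachedPromptTokens?: number; // optional: not every provider reports this
  costUSD?: number;            // the proposed new field
}

function attachCost(modelId: string, usage: TokenUsage): TokenUsage {
  const rates = PRICING[modelId];
  if (!rates) return usage; // unknown model: cost stays undefined rather than guessing
  const cached = usage.cachedPromptTokens ?? 0;
  const costUSD =
    (cached * rates.cachedInputPerM +
      (usage.promptTokens - cached) * rates.inputPerM +
      usage.completionTokens * rates.outputPerM) / 1_000_000;
  return { ...usage, costUSD };
}
```

Leaving costUSD undefined for unknown models (instead of throwing or guessing) means the feature degrades gracefully for providers whose pricing isn't in the table yet.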

Use Cases

Cost Transparency for Developers
Developers need to know exactly how much specific API calls cost to evaluate the trade-offs between accuracy, speed, and price when choosing a model. By parsing the CompletionTokenUsage data and calculating costs, this feature would provide developers with immediate feedback on pricing, mapped directly to actual usage.

Customer Usage Billing
Businesses offering AI services to customers can use this feature to bill their customers accurately, or with a markup.

Provider and Model Comparison
Developers evaluating different models (e.g., OpenAI’s GPT-4 vs. a smaller, cheaper model) can use this feature to compare the true costs of identical workloads across providers or models. This empowers informed decision-making based on both performance and cost.

Invoice Validation
Teams can reconcile monthly provider invoices by comparing their usage logs against calculated costs. This ensures transparency and accuracy in billing.

Debugging/Logging Costs
Developers can analyze sudden spikes in costs by breaking down specific requests. For example, they can identify if multimodal usage (e.g., audio tokens) or rejected tokens contributed to an unexpected expense and adjust accordingly.

Additional context

No response

@brennanmceachran brennanmceachran added the enhancement New feature or request label Nov 28, 2024
@brennanmceachran brennanmceachran changed the title Cost Calculation for Token Usage Based on Provider + Model Prices Feature Request: Cost Calculation from usage (Based on Provider + Model prices) Nov 28, 2024