
Feature Request: Cost Calculation from usage (Based on Provider + Model prices) #3932

Open
brennanmceachran opened this issue Nov 28, 2024 · 0 comments
Labels
enhancement New feature or request

brennanmceachran commented Nov 28, 2024

Feature Description

The number of AI providers is growing, the list of models is growing, and the nuances of per-token pricing are growing too; we'll probably never be able to estimate costs prior to an API call again. And while the increasingly complex pricing structures have generally brought costs down... they've made actual spend hard to know!

Computing the cost of usage from actual data has become a daunting task due to the nuances of token-based billing. E.g.:

  • Model × provider variability: each provider uses distinct rates for each of its models, with variations in token costs for prompts, completions, and additional features.
  • Granular token pricing rules: providers like OpenAI charge differently for cached tokens, multimodal tokens (e.g., audio, vision), and even rejected prediction tokens, which adds layers of complexity.
  • Lack of standardized calculators: currently, developers manually map usage data to pricing rates (at least that's what I'm doing).

Furthermore, token counts aren’t directly comparable across providers, making it impossible to evaluate usage across models in a meaningful way. However, converting token usage into USD provides a clear, apples-to-apples comparison.

For example, consider the following usage report from OpenAI:

{
  "usage": {
    "prompt_tokens": 8100,
    "completion_tokens": 3900,
    "total_tokens": 12000,
    "prompt_tokens_details": { 
      "cached_tokens": 2100,
      "audio_tokens": 0 
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 1800,
      "rejected_prediction_tokens": 1000
    }
  }
}

Mapping this data to different model prices means we're spending between $0.003 and $0.339 (*I think!), a 100x difference, and that's before accounting for the o1 models' reasoning tokens (which add a few hundred to tens of thousands of output tokens).

Calcs
1. gpt-4o: $0.056625
  - cached prompt tokens: 2100 * ($1.25 / million)
  - uncached prompt tokens: 6000 * ($2.50 / million)
  - output tokens: 3900 * ($10.00 / million)
2. gpt-4o-mini: $0.0033975
  - cached prompt tokens: 2100 * ($0.075 / million)
  - uncached prompt tokens: 6000 * ($0.150 / million)
  - output tokens: 3900 * ($0.600 / million)
3. o1-preview: $0.33975
  - cached prompt tokens: 2100 * ($7.50 / million)
  - uncached prompt tokens: 6000 * ($15.00 / million)
  - output tokens: 3900 * ($60.00 / million)
4. o1-mini: $0.06795
  - cached prompt tokens: 2100 * ($1.50 / million)
  - uncached prompt tokens: 6000 * ($3.00 / million)
  - output tokens: 3900 * ($12.00 / million)
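The per-model arithmetic above can be sketched as a small helper. Everything below (the Usage/Rates shapes and the costUSD function) is illustrative, not existing AI SDK API, and the rates shown are the gpt-4o prices assumed in calc 1:

```typescript
// Sketch: compute USD cost from a usage report plus per-million-token rates.
interface Usage {
  promptTokens: number;        // total prompt tokens (cached + uncached)
  cachedPromptTokens: number;  // subset billed at the cached rate
  completionTokens: number;
}

interface Rates {
  inputPerM: number;        // USD per 1M uncached prompt tokens
  cachedInputPerM: number;  // USD per 1M cached prompt tokens
  outputPerM: number;       // USD per 1M completion tokens
}

function costUSD(usage: Usage, rates: Rates): number {
  const uncached = usage.promptTokens - usage.cachedPromptTokens;
  return (
    usage.cachedPromptTokens * rates.cachedInputPerM +
    uncached * rates.inputPerM +
    usage.completionTokens * rates.outputPerM
  ) / 1_000_000;
}

// The usage report from above, priced at (assumed) gpt-4o rates:
const usage: Usage = { promptTokens: 8100, cachedPromptTokens: 2100, completionTokens: 3900 };
const gpt4o: Rates = { inputPerM: 2.5, cachedInputPerM: 1.25, outputPerM: 10 };
console.log(costUSD(usage, gpt4o)); // 0.056625, matching calc 1
```

Swapping in the other three rate sets reproduces calcs 2 through 4, which is exactly the lookup-then-multiply work developers are currently doing by hand.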

While the AI SDK makes it easy to see the differences between each model's unique benefits, it leaves cost opaque. Surfacing cost would let developers do cost/benefit analysis as they build.

Feature request:
Parse CompletionTokenUsage, apply provider/model-specific pricing, and return the total cost on generation API calls; or include the cost within the CompletionTokenUsage object.

This would offer developers a reliable, standardized way to compute costs accurately from actual usage data.
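A minimal sketch of what this could look like: a pricing table keyed by provider/model that the SDK consults to attach a cost field to the usage it already returns. All names here (PRICING, attachCost, costUSD) are hypothetical, not real AI SDK API, and the rates are examples that would need to track actual provider pricing:

```typescript
// Hypothetical sketch: a pricing registry plus a helper that augments a usage
// object with costUSD when the model's rates are known, and leaves it
// untouched otherwise.
type Rates = { inputPerM: number; cachedInputPerM: number; outputPerM: number };

// Example rates only; a real table must track provider price changes.
const PRICING: Record<string, Rates> = {
  "openai/gpt-4o":      { inputPerM: 2.5,  cachedInputPerM: 1.25,  outputPerM: 10  },
  "openai/gpt-4o-mini": { inputPerM: 0.15, cachedInputPerM: 0.075, outputPerM: 0.6 },
};

interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  cachedPromptTokens?: number; // optional: not every provider reports this
  costUSD?: number;            // the proposed new field
}

function attachCost(modelId: string, usage: TokenUsage): TokenUsage {
  const rates = PRICING[modelId];
  if (!rates) return usage; // unknown model: cost stays undefined rather than guessing
  const cached = usage.cachedPromptTokens ?? 0;
  const costUSD =
    (cached * rates.cachedInputPerM +
      (usage.promptTokens - cached) * rates.inputPerM +
      usage.completionTokens * rates.outputPerM) / 1_000_000;
  return { ...usage, costUSD };
}
```

Leaving costUSD undefined for unknown models (instead of throwing or guessing) means the feature degrades gracefully for providers whose pricing isn't in the table yet.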

Use Cases

Cost Transparency for Developers
Developers need to know exactly how much specific API calls cost to evaluate the trade-offs between accuracy, speed, and price when choosing a model. By parsing the CompletionTokenUsage data and calculating costs, this feature would provide developers with immediate feedback on pricing, mapped directly to actual usage.

Customer Usage Billing
Businesses offering AI services to customers can use this feature to bill their customers accurately, or with a markup.

Provider and Model Comparison
Developers evaluating different models (e.g., OpenAI’s GPT-4 vs. a smaller, cheaper model) can use this feature to compare the true costs of identical workloads across providers or models. This empowers informed decision-making based on both performance and cost.

Invoice Validation
Teams can reconcile monthly provider invoices by comparing their usage logs against calculated costs. This ensures transparency and accuracy in billing.

Debugging/Logging Costs
Developers can analyze sudden spikes in costs by breaking down specific requests. For example, they can identify if multimodal usage (e.g., audio tokens) or rejected tokens contributed to an unexpected expense and adjust accordingly.

Additional context

No response

@brennanmceachran brennanmceachran added the enhancement New feature or request label Nov 28, 2024
@brennanmceachran brennanmceachran changed the title Cost Calculation for Token Usage Based on Provider + Model Prices Feature Request: Cost Calculation from usage (Based on Provider + Model prices) Nov 28, 2024