
Invest in Strategies to Save LLM Cost

9 min read · Aug 6, 2024

Not all large language models (LLMs) are created equal; some are more resource-intensive than others. Factors such as the size of the model, the number of requests made, and the computational resources required can significantly affect cost.

Additionally, different providers charge different rates for their services. Organizations and individuals alike therefore benefit from deliberately adopting cost-saving strategies.

In this article, let’s examine some emerging approaches that are nearly a ‘free lunch’ to implement.

Enable Cost-Based Routing

What is it?

The idea is simple: not every LLM call requires an expensive large model to get the same answer. Directing simple or less complex queries to a low-cost model helps maximize savings.

For example, the price difference between the OpenAI models GPT-4o and GPT-4o-mini is significant. See the latest price list here: https://openai.com/api/pricing/
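To make the routing idea concrete, here is a minimal sketch of a rule-based router. The two model names come from the example above; everything else (the `route` function, the `COMPLEX_HINTS` keywords, and the 40-word threshold) is a hypothetical heuristic chosen for illustration, not a recommended or official policy. A production router would more likely rely on a trained classifier or on evaluation data.

```python
from dataclasses import dataclass

# Model names are taken from the article's example.
CHEAP_MODEL = "gpt-4o-mini"
EXPENSIVE_MODEL = "gpt-4o"

# Hypothetical keywords that suggest a query needs deeper reasoning.
COMPLEX_HINTS = ("prove", "analyze", "step by step", "compare", "debug")


@dataclass
class RoutingDecision:
    model: str
    reason: str


def route(query: str, max_simple_words: int = 40) -> RoutingDecision:
    """Pick a model for the query using simple, assumed heuristics."""
    text = query.lower()

    # Rule 1: a complexity keyword sends the query to the larger model.
    for hint in COMPLEX_HINTS:
        if hint in text:
            return RoutingDecision(EXPENSIVE_MODEL, f"contains hint '{hint}'")

    # Rule 2: long queries are assumed to be more demanding.
    if len(text.split()) > max_simple_words:
        return RoutingDecision(EXPENSIVE_MODEL, "long query")

    # Default: everything else goes to the low-cost model.
    return RoutingDecision(CHEAP_MODEL, "short query with no complexity hints")


if __name__ == "__main__":
    for q in ("What is the capital of France?", "Please debug this function"):
        decision = route(q)
        print(f"{q!r} -> {decision.model} ({decision.reason})")
```

The returned model name can then be passed to whichever client library makes the actual LLM call, so the routing logic stays independent of any one provider.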

Written by Minyang Chen
