Invest in Strategies to Save LLM Cost

Minyang Chen
9 min read · Aug 6, 2024

As we all know, not all large language models (LLMs) are created equal; some are more resource-intensive than others. Factors such as model size, the number of requests made, and the computational resources required can significantly impact costs.

Additionally, different providers charge different amounts for their services. It is therefore worth implementing cost-saving strategies, whether you are an organization or an individual.

In this article, let’s examine some emerging approaches that are nearly a ‘free lunch’ to implement.

Enable Cost-Based Routing

What is it?

The idea is simple: not every LLM call needs an expensive large model to get the right answer. Directing simple or less complex queries to a low-cost model, and reserving the large model for the hard ones, helps maximize savings.

Figure 1: Cost-Based LLM Routing
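To make the routing idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production router: the complexity heuristic, the keyword list, and the 0.5 threshold are all hypothetical choices, and real routers often use a trained classifier instead. Only the two model names come from the OpenAI lineup discussed in this article.

```python
# Hypothetical routing targets: a low-cost model and a large model.
CHEAP_MODEL = "gpt-4o-mini"
EXPENSIVE_MODEL = "gpt-4o"

# Assumed keywords hinting that a query needs deeper reasoning.
# This is a crude substring match, chosen only for illustration.
COMPLEX_HINTS = ("prove", "analyze", "step by step", "refactor", "compare")


def estimate_complexity(query: str) -> float:
    """Return a rough score in [0, 1]; higher means a harder query."""
    text = query.lower()
    # Longer prompts are treated as more complex, capped at 200 words.
    length_score = min(len(text.split()) / 200, 1.0)
    # Any reasoning keyword pushes the score to the maximum.
    hint_score = 1.0 if any(hint in text for hint in COMPLEX_HINTS) else 0.0
    return max(length_score, hint_score)


def route(query: str, threshold: float = 0.5) -> str:
    """Pick the model name to call for this query."""
    if estimate_complexity(query) >= threshold:
        return EXPENSIVE_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "Analyze the trade-offs between these two designs step by step."):
        print(f"{route(q):12s} <- {q}")
```

The returned name can be passed as the model parameter of whatever client you already use. The threshold is the knob to tune: lowering it sends more traffic to the large model, raising it saves more but risks weaker answers on borderline queries.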

For example, the price difference between the OpenAI models GPT-4o and GPT-4o mini is significant. See the latest price list here: https://openai.com/api/pricing/

Figure 2: OpenAI GPT-4o cost
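The effect of a price gap like this can be estimated with simple arithmetic. The sketch below is a rough calculator; the prices in the usage example are made-up placeholders, not OpenAI's actual rates, so substitute the current numbers from the pricing page before drawing conclusions. The function names and the fixed per-request token counts are also assumptions for illustration.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def routing_savings(n_requests: int, cheap_fraction: float,
                    cost_large: float, cost_small: float) -> float:
    """USD saved by sending `cheap_fraction` of requests to the small model."""
    baseline = n_requests * cost_large
    routed = n_requests * (cheap_fraction * cost_small
                           + (1 - cheap_fraction) * cost_large)
    return baseline - routed


if __name__ == "__main__":
    # Placeholder prices (USD per 1M tokens), NOT real rates.
    large = request_cost(1000, 500, input_price_per_m=10.0, output_price_per_m=30.0)
    small = request_cost(1000, 500, input_price_per_m=1.0, output_price_per_m=3.0)
    saved = routing_savings(1000, cheap_fraction=0.7,
                            cost_large=large, cost_small=small)
    print(f"large: ${large:.4f}/req, small: ${small:.4f}/req, saved: ${saved:.2f}")
```

Under these placeholder numbers, routing 70% of 1,000 requests to the small model cuts the bill from $25.00 to $9.25. The real saving depends on actual prices and on how many queries the small model can answer well.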

Written by Minyang Chen

Enthusiastic in AI, Cloud, Big Data and Software Engineering. Sharing insights from my own experiences.