Intuitive Thinker — Introducing Guided Mental Models to Enhance Small LLM Reasoning

Minyang Chen
13 min read · Sep 27, 2024

At a high level, Large Language Models (LLMs) function as statistical models that predict the most likely next word(s) in a sequence based on learned token probabilities. While they excel at parsing tokens and building extensive neural networks connecting them, assigning meaningful interpretations to those tokens remains a significant challenge. Sure, we can improve this by training on larger volumes of high-quality data, including synthetic data, to make the model better.
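To make the "predict the next token" idea concrete, here is a minimal sketch using the Hugging Face transformers library, with GPT-2 as a stand-in model (both are my illustrative choices, not something this article depends on). It shows the model assigning a probability to every token in its vocabulary, then surfacing the top candidates for the next word:

```python
# A minimal sketch of next-token prediction, assuming the Hugging Face
# `transformers` library and GPT-2 as an illustrative stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Large Language Models predict the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Turn the logits for the last position into a probability distribution
# over the whole vocabulary: the model's belief about the next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the model's top-5 guesses for the next word.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item()):>12}  p={prob:.3f}")
```

Greedy decoding simply takes the highest-probability token at each step and appends it to the sequence; nothing in this loop "understands" the text, which is exactly the limitation discussed above.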

Figure 1: Time to Learn, Think, Plan, Evaluate and Reflect

Indeed, a winning commercial strategy for LLM vendors involves continually developing larger and larger models, such as Meta scaling the Llama model up to 405 billion parameters or Mistral's introduction of Mistral Large 2. Empirical evidence demonstrates that larger models often outperform their predecessors, as evidenced by benchmark comparisons:

Figure 2: source: https://aimlapi.com/blog/mistral-large-2-beats-llama-3-1-405b

As illustrated, larger LLMs typically exhibit superior performance due to their enhanced ability to discern intricate patterns within data and generate more accurate responses, including textual reasoning.

Written by Minyang Chen

Enthusiastic about AI, Cloud, Big Data, and Software Engineering. Sharing insights from my own experiences.