If you’ve been following along from my previous blog post, you’ll remember that fine-tuning is essentially teaching a smaller LLM to do exactly what you need. But how does it actually work? Is it some kind of magic, or perhaps complex mathematics involving quantum physics that’s really hard to understand? Fortunately, it’s neither. The core idea behind fine-tuning is relatively straightforward. Here it is:
You just give the LLM examples of questions and expected answers
The internal machinery behind LLM fine-tuning is not that simple, but that doesn’t matter here. I like the following analogy: you can teach students a subject without having a good idea (or almost any idea) of how their brains actually work.
So, how do we approach fine-tuning? How do we teach (or, as we call it, fine-tune) an LLM? You need to prepare a list of examples that represent what you want it to learn. Let’s try to build such examples.
Let’s assume that in your company it is always preferable to paint products blue, and that our products are boxes and forks. Here’s how we can represent that preference in the form of examples:
Q: What color should I prefer to paint the product?
A: Blue
Q: What color is better for forks?
A: Blue
Q: What color is preferable to paint the box – blue or green?
A: Blue
Q: Is it a good idea to paint the box in red?
A: No
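In practice, question/answer pairs like these are usually collected into a machine-readable training file before fine-tuning. Below is a minimal sketch in Python that writes our examples as chat-style JSONL records; the field names ("messages", "role", "content") and the file name are illustrative assumptions, since different fine-tuning tools expect slightly different formats.

```python
import json

# Our "prefer blue" examples as (question, answer) pairs.
EXAMPLES = [
    ("What color should I prefer to paint the product?", "Blue"),
    ("What color is better for forks?", "Blue"),
    ("What color is preferable to paint the box – blue or green?", "Blue"),
    ("Is it a good idea to paint the box in red?", "No"),
]

def to_jsonl(pairs, path):
    """Write question/answer pairs as chat-style JSONL records (format is illustrative)."""
    with open(path, "w", encoding="utf-8") as f:
        for question, answer in pairs:
            record = {
                "messages": [
                    {"role": "user", "content": question},
                    {"role": "assistant", "content": answer},
                ]
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    to_jsonl(EXAMPLES, "blue_preference_train.jsonl")
```

The exact layout depends on the tool you use, but the idea is always the same: each line is one question paired with the answer we want the model to learn to give.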
We could invent a few more such examples, but I hope the idea is clear: if somebody learns how to answer such questions, he/she/it will internalize the idea that blue is the preferred color, especially for boxes and forks. That is exactly what we need.
How do we check whether we succeeded? This simple question is subtle. Why? Because we need to distinguish whether our LLM has indeed “internalized” this knowledge and will use it in different contexts, as opposed to merely “remembering” the examples we gave it. “Internalized” here means the LLM has genuinely learned and understood the underlying principle (like “prefer blue”) rather than just memorizing the specific examples, so it can apply this knowledge to new situations it hasn’t seen before.
Just remembering examples can hide misunderstanding
So, how can we distinguish between remembering and “understanding”? Unsurprisingly, we do it in much the same way we do with humans: we hold some examples out of the teaching process and use them for testing.
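To make the hold-out idea concrete, here is a minimal sketch of splitting the examples into a training set and a test set and then checking the fine-tuned model on the held-out questions. The ask_model function is a hypothetical placeholder for whatever client you use to query your fine-tuned model, and the split ratio and string check are illustrative.

```python
import random

# Reusing the (question, answer) pairs from the sketch above.
EXAMPLES = [
    ("What color should I prefer to paint the product?", "Blue"),
    ("What color is better for forks?", "Blue"),
    ("What color is preferable to paint the box – blue or green?", "Blue"),
    ("Is it a good idea to paint the box in red?", "No"),
]

def split_examples(pairs, holdout_fraction=0.25, seed=42):
    """Hold out a fraction of examples for testing; fine-tune only on the rest."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[cut:], shuffled[:cut]  # (train, test)

def evaluate(test_pairs, ask_model):
    """Score the model on examples it never saw during fine-tuning."""
    correct = 0
    for question, expected in test_pairs:
        answer = ask_model(question)  # hypothetical call to the fine-tuned model
        if expected.lower().rstrip(".") in answer.lower():
            correct += 1
    return correct / len(test_pairs)

train_set, test_set = split_examples(EXAMPLES)
# accuracy = evaluate(test_set, ask_model)  # plug in your own model client here
```

The key point is that the test questions never appear in the training file, so a good score on them suggests the model generalized the “prefer blue” principle rather than memorizing the exact wording of the training examples.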
So our goal is to make the LLM generalize from our examples to other questions or tasks which may arise in the future.
How to build such a set of examples is a good question, and it is more of a pedagogical task than a technical one. We need to create a set of examples from which a sufficiently smart model can generalize. We will discuss this in future blogs, and that’s exactly what KDCube is doing.
To sum things up: fine-tuning an LLM happens by having it “read and learn” examples of the behavior we want from it. It is super important to verify the absorbed knowledge on held-out examples.
David is a seasoned technology leader with over 17 years of experience in AI and big data engineering. He currently serves as Chief Technology Officer at Liminal Health, where he focuses on unlocking the full value of healthcare claims through intelligent automation. In addition to his technical background, David has a strong interest in comparing how humans and Artificial Neural Networks (ANNs) learn and perform - how they differ and how they are similar.
- David Gruzman, https://kdcube.tech/author/davidgruzman/