Part 3 of 58
The Prediction Line
By Madhav Kaushish · Ages 12+
Trviksha decided to start with something she could see.
She had seven factors and two hundred and fourteen patients. Trying to handle all seven at once had produced confusion. So she stripped the problem down to one factor and one outcome: age and sickness severity.
The Grid
She cleared a large stone slab and marked it as a grid. Along the bottom edge, she carved notches for age — one notch per decade, from ten to ninety. Along the left edge, she carved notches for sickness severity — a score Grothvik had assigned to each patient based on how ill they became, from zero (perfectly healthy through the rainy season) to ten (critically ill).
For each of the two hundred and fourteen patients, she placed a pebble on the grid at the intersection of that patient's age and that patient's sickness score. Some positions had clusters of pebbles. Some had single pebbles. Large stretches of the grid were empty.
Blortz: What are you looking at?
Trviksha: Whether age and sickness have a relationship. If older patients tend to score higher, the pebbles should drift upward as you move right across the grid.
Blortz: Do they?
The pebbles were scattered — not randomly, but messily. A thirty-year-old had scored 7. A sixty-year-old had scored 2. Individual patients defied any clean pattern. But viewed from a distance, squinting, there was a drift. The cloud of pebbles tilted upward from left to right. Younger patients were, on average, lower on the grid. Older patients were, on average, higher. Not always. Not reliably. But on average.
The Stick
Trviksha: I want to summarize that drift. Not the individual pebbles — the overall direction.
She took a straight stick — a thin branch, roughly the length of the grid — and placed it across the stone slab, tilted to follow the general direction of the pebble cloud. Some pebbles sat above the stick. Some sat below. She adjusted the angle until roughly equal numbers were above and below, and the stick seemed to pass through the "middle" of the cloud.
Blortz: You are drawing a line through the mess.
Trviksha: The mess has a direction. The line captures the direction. Any individual patient might be above or below the line, but the line tells me: for a patient of this age, the typical sickness score is approximately here.
She tested it. A new patient, age forty-five. She found the forty-five notch on the bottom edge, traced up to the stick, and read across to the left edge. The stick said: approximately 4.2. She checked Grothvik's records — the three patients aged forty-five in the data had scores of 3, 5, and 6. Average: 4.7. The stick's estimate of 4.2 was in the right area — not exact, but reasonable.
Trviksha: The stick does not know about any particular patient. It knows about the trend. Given an age, it gives me the trend's prediction, not the truth.
Blortz: And the distance between the pebble and the stick?
Trviksha: That is how much the patient differs from the trend. Some patients are sicker than their age would predict. Some are healthier. The stick captures what age alone can tell me. Everything the stick misses must be explained by other factors — diet, location, water — or by pure chance.
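Trviksha's check on the forty-five-year-olds can be redone as a short calculation. The sketch below uses only the numbers given in the story (scores of 3, 5, and 6, and a stick reading of 4.2). The variable names are illustrative, not anything from Grothvik's records.

```python
# Scores of the three forty-five-year-olds in Grothvik's records (from the story).
scores_at_45 = [3, 5, 6]
# What the stick read above the forty-five notch (from the story).
stick_estimate = 4.2

average = sum(scores_at_45) / len(scores_at_45)

# Gap between each pebble and the stick.
# Positive: sicker than the trend predicts. Negative: healthier.
gaps = [score - stick_estimate for score in scores_at_45]

print(f"average of recorded scores: {average:.1f}")
for score, gap in zip(scores_at_45, gaps):
    side = "above" if gap > 0 else "below"
    print(f"score {score}: {abs(gap):.1f} notches {side} the stick")
```

Each gap is the part of a patient's score that age alone does not account for. One pebble sits below the stick and two sit above it, which is the scatter Blortz asked about.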

The Fit
The stick's position was not arbitrary. Trviksha wanted the stick to be as close as possible to as many pebbles as possible. She experimented with different angles and positions, eyeballing the fit each time.
Blortz: You are choosing the position by feel. Is there a systematic way?
Trviksha: I could measure the distance from each pebble to the stick and try to make the total distance as small as possible. The position that minimises the total distance is the best fit.
She tried this, with one refinement. For each position of the stick, she measured how far each pebble sat above or below it, counting notches straight up or down the severity scale. She squared each distance, so that a pebble two notches away counted four times as much as a pebble one notch away and large misses mattered more than small ones, and summed the squares. Then she moved the stick slightly and repeated the measurement. After many adjustments, she found a position where the total squared distance was smallest. Moving the stick in any direction made the total worse.
Blortz: You found the position where the stick disagrees with the data the least.
Trviksha: The best fit. Not a perfect fit — the pebbles are scattered, and no straight stick will pass through all of them. But this stick disagrees less than any other stick could.
Blortz: And for a new patient whose age you know but whose sickness you do not?
Trviksha: Read the prediction off the stick. It will be wrong for that individual patient — the pebbles scatter. But it will be the best guess a straight line can make, given what the data shows.
This was the core idea: the line did not describe any single patient. It described the relationship between age and sickness across all patients. It was a summary — imperfect but useful — that could be applied to new cases.
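The nudging procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Trviksha's actual records: the seven pebbles are invented, the stick is described by its tilt and by its height above the age-fifty notch (a choice made here for convenience), and the function names are hypothetical. The textbook least-squares formula is included only as a cross-check on where the nudging ends up.

```python
# Invented pebbles as (age, sickness score) pairs. These are NOT Grothvik's
# records; they only mimic a messy cloud that tilts upward.
PEBBLES = [(20, 2), (30, 7), (40, 3), (45, 5), (60, 2), (70, 6), (80, 8)]
MIDDLE_AGE = 50  # describe the stick by its height above this notch (an assumption)


def stick_reading(tilt, height, age):
    """Sickness score the stick shows above a given age notch."""
    return height + tilt * (age - MIDDLE_AGE)


def total_squared_distance(tilt, height):
    """Sum of squared up-or-down gaps between every pebble and the stick."""
    return sum((score - stick_reading(tilt, height, age)) ** 2
               for age, score in PEBBLES)


def fit_by_nudging(step=1.0, smallest_step=1e-6):
    """Nudge tilt and height; keep a nudge only if it lowers the total."""
    tilt, height = 0.0, 0.0
    best = total_squared_distance(tilt, height)
    while step > smallest_step:
        improved = False
        for d_tilt, d_height in [(step, 0), (-step, 0), (0, step), (0, -step)]:
            trial = total_squared_distance(tilt + d_tilt, height + d_height)
            if trial < best:
                tilt, height, best = tilt + d_tilt, height + d_height, trial
                improved = True
        if not improved:
            step /= 2  # no nudge this size helps, so try finer nudges
    return tilt, height


def fit_by_formula():
    """Textbook least-squares line, used here only as a cross-check."""
    n = len(PEBBLES)
    mean_age = sum(age for age, _ in PEBBLES) / n
    mean_score = sum(score for _, score in PEBBLES) / n
    tilt = (sum((age - mean_age) * (score - mean_score) for age, score in PEBBLES)
            / sum((age - mean_age) ** 2 for age, _ in PEBBLES))
    height = mean_score + tilt * (MIDDLE_AGE - mean_age)
    return tilt, height


tilt, height = fit_by_nudging()
formula_tilt, formula_height = fit_by_formula()
print(f"nudged stick:  tilt {tilt:.4f} per year, height {height:.3f} at age {MIDDLE_AGE}")
print(f"formula stick: tilt {formula_tilt:.4f} per year, height {formula_height:.3f}")
print(f"reading for a new 45-year-old: {stick_reading(tilt, height, 45):.2f}")
```

The search stops once no nudge of any size it tries lowers the total, which matches the point in the story where moving the stick in any direction made the total worse. On these invented pebbles the nudged stick and the formula stick should agree to several decimal places.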