Part 22 of 58
Yesterday's Weather
By Madhav Kaushish · Ages 12+
Kvrothja's blight detection system was operational, and GlagalCloud had its first spatial pattern recogniser. But the next customer brought a different kind of problem.
The Forecaster
Vrothjelka was Sonhlagot's chief weather recorder. She had tablets going back thirty years — daily readings of temperature, humidity, wind direction, cloud cover, and rainfall. She wanted predictions.
Vrothjelka: I have thirty years of daily weather data. I want to know: given the last seven days, what will tomorrow's weather be?
Trviksha: I can build a network. Seven days of readings, five measurements per day — thirty-five inputs. One output: tomorrow's rainfall.
She built a standard network: thirty-five inputs, twelve hidden velociraptors, one output. She trained it on Vrothjelka's historical data.
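As a rough illustration of the shape of the network just described, here is a minimal sketch in plain Python. The story gives only the layer sizes (thirty-five inputs, twelve hidden units, one output). The bias terms, the tanh activation, the random initialisation, and every function name below are assumptions made for the example, not details from Trviksha's build.

```python
import math
import random

N_INPUTS = 7 * 5   # seven days, five measurements per day
N_HIDDEN = 12      # the twelve hidden "velociraptors"

def init_network(seed=0):
    """Create small random weights. The initialisation scheme is an assumption."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS)]
          for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2

def forward(network, x):
    """Map 35 input numbers to one predicted rainfall value."""
    w1, b1, w2, b2 = network
    # Each hidden unit takes a weighted sum of all 35 inputs (tanh is assumed).
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def count_parameters(network):
    """Count every adjustable number in the network."""
    w1, b1, w2, _b2 = network
    return sum(len(row) for row in w1) + len(b1) + len(w2) + 1

network = init_network()
print(count_parameters(network))  # 35*12 + 12 + 12 + 1 = 445
```

Under these assumptions the network has 445 adjustable numbers, 420 of which are the weights connecting the inputs to the hidden layer.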
The results were mediocre. Whatever the last seven days looked like, the model predicted roughly the average weather for that time of year. It captured the seasonal trends (wetter in the monsoon, drier in winter) but missed the day-to-day patterns that Vrothjelka needed.
Vrothjelka: I can predict seasonal averages myself. I need you to predict whether tomorrow will rain, given that it rained today and yesterday.
The Order Problem
Trviksha examined the network's behaviour more carefully. She noticed something troubling. The thirty-five inputs were arranged as: Day 1 temperature, Day 1 humidity, Day 1 wind, Day 1 cloud cover, Day 1 rainfall, Day 2 temperature, Day 2 humidity... and so on, through Day 7.
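The input layout described above can be sketched as follows. This is an illustrative sketch only: the measurement names, the representation of each day as a dictionary of numbers (including wind direction as a single number), and the helper names are assumptions made for the example.

```python
MEASUREMENTS = ("temperature", "humidity", "wind", "cloud_cover", "rainfall")
WINDOW = 7  # days of history per prediction

def flatten_window(days):
    """Turn a list of daily readings into one flat list, day by day.

    Resulting order: Day 1 temperature, Day 1 humidity, ..., Day 7 rainfall.
    """
    return [day[name] for day in days for name in MEASUREMENTS]

def make_examples(history, window=WINDOW):
    """Slide a fixed window over the history.

    Each example pairs `window` days of readings (flattened) with the
    rainfall on the day that follows them.
    """
    examples = []
    for start in range(len(history) - window):
        inputs = flatten_window(history[start:start + window])
        target = history[start + window]["rainfall"]
        examples.append((inputs, target))
    return examples
```

Once the window is flattened, nothing in the list itself records that position 0 and position 5 are the same measurement on consecutive days. That relationship exists only in the order in which the numbers were written down.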
The network treated these thirty-five numbers as an unordered collection of features. Mathematically, it could learn different weights for "Day 1 temperature" and "Day 7 temperature" — but it had to discover on its own that these represented the same measurement at different times.
Trviksha: The network does not know that Day 1 comes before Day 2. It does not know that these inputs represent a sequence. It sees thirty-five numbers with no structure.
Blortz: But there is structure. Yesterday matters more than five days ago. A cold front yesterday means more than a cold front last week. The sequence is the information.
Trviksha: Exactly. And I am throwing the sequence away by feeding all seven days in at once, as a flat list.
This was the same problem she had encountered with Kvrothja's fields — spatial structure being lost when data was flattened. But now the structure was temporal, not spatial. The question was not "which plots are neighbours?" but "which days came first?"

The Deeper Issue
There was a worse problem. Vrothjelka's data covered thirty years of daily readings — roughly eleven thousand days. For each prediction, the model used a window of seven days. But what if the relevant history was longer than seven days?
Vrothjelka: Monsoons build over weeks. A drought develops over months. Seven days of context is often not enough. I need the model to consider longer history.
Trviksha: I could extend the window to thirty days. That would be a hundred and fifty inputs. Or sixty days — three hundred inputs.
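Blortz: Each time you extend the window, the network grows with it. Every extra day adds five inputs, and each of those inputs needs its own weight to every hidden velociraptor. And you have to choose the window size in advance. What if the relevant history is longer than whatever window you pick?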
Trviksha: Then I would always be throwing away information. The network would see a fixed window sliding across the data, blind to everything that came before the window.
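The arithmetic behind this exchange can be checked with a short sketch. It assumes the hidden layer stays at twelve units as the window grows, which the story does not state. The function names are made up for the example.

```python
MEASUREMENTS_PER_DAY = 5
HIDDEN_UNITS = 12  # assumed to stay fixed as the window grows

def input_count(window_days):
    """Number of inputs for a window of the given length."""
    return window_days * MEASUREMENTS_PER_DAY

def first_layer_weights(window_days, hidden=HIDDEN_UNITS):
    """Weights between the inputs and the hidden layer."""
    return input_count(window_days) * hidden

for window in (7, 30, 60):
    print(f"{window} days: {input_count(window)} inputs, "
          f"{first_layer_weights(window)} first-layer weights")
```

This prints 35, 150, and 300 inputs for the three windows, with 420, 1800, and 3600 first-layer weights. Both counts grow in step with the window, and no window length chosen in advance lets the network see anything that happened before the window began.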
The convolutional filters from Kvrothja's problem also slid a fixed window — but across space, not time. And they assumed the pattern looked the same everywhere (translation invariance), which was reasonable for spatial patterns like blight. But weather patterns were not the same at every position in a sequence. A cold front on Day 1 of a monsoon and a cold front on Day 30 meant very different things. The spatial trick would not work.
Trviksha: I need a network that processes the sequence one day at a time, but somehow carries information forward from previous days. Not a fixed window. A running memory.
Glagalbagal: A memory. You want to teach the pebbles to remember.
Trviksha: I want to build a system where each day's processing includes a summary of every day that came before it. Not by making the input larger — by making the network carry information forward as it goes.
She did not yet know how to build this. But she knew what she needed: a velociraptor that could look at today's weather while simultaneously remembering what it had learned from yesterday, and last week, and last month — a velociraptor with a past.