The Sliding Window

Trviksha spent two days looking at Kvrothja's field grids. The blight always appeared as clusters — a patch of three, five, or eight adjacent blighted plots. The clusters had a characteristic shape: a core of fully blighted plots surrounded by a ring of partially affected ones. The shape was the same whether the cluster sat in the upper-left, the centre, or the lower-right of the field.

Trviksha: The pattern is local. A 3×3 patch of plots contains all the information needed to decide whether the centre plot is at risk. If the surrounding plots are blighted, the centre is in danger. If the surrounding plots are healthy, the centre is probably safe.

Blortz: Then examine the field one 3×3 patch at a time, instead of all four hundred plots at once.

The Scanning Team

Trviksha assigned a team of three velociraptors to the task — not to examine the whole grid at once, but to examine a small window of it.

The team looked at a 3×3 patch: nine plots. Each plot's status was an input. The team computed a weighted sum of the nine inputs, applied an activation function, and produced a single output: a number indicating how much the centre of the patch looked like a blight cluster.
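The team's computation on a single patch can be sketched in a few lines of Python. This is only an illustration: the encoding (1.0 for blighted, 0.0 for healthy), the sigmoid activation, the bias term, and all numeric values are assumptions, since the story does not specify them.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def patch_response(patch, weights, bias=0.0):
    """Weighted sum of the nine plots in a 3x3 patch, then an activation."""
    total = bias
    for r in range(3):
        for c in range(3):
            total += weights[r][c] * patch[r][c]
    return sigmoid(total)

# Assumed encoding: 1.0 = blighted, 0.0 = healthy.
patch = [[1.0, 1.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]

# Placeholder weights and bias, chosen only to make the example run.
weights = [[0.3, 0.3, 0.3],
           [0.3, 0.3, 0.3],
           [0.3, 0.3, 0.3]]
score = patch_response(patch, weights, bias=-1.0)  # one number per patch
```

Nine inputs go in and one number comes out, which is all the team ever does at any position.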

Then the team slid one position to the right and examined the next 3×3 patch. Same weights. Same computation. Same activation function. Just a different set of nine plots.

They slid across the entire first row, one position at a time. Then they moved down one row and slid across again. The team scanned the entire 20×20 grid, one 3×3 window at a time, applying the same computation everywhere.
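The scanning pattern can be sketched as two nested loops over window positions. The sketch assumes the window is only placed where it fits entirely inside the field (no padding at the edges), which the story does not state; under that assumption a 20×20 field yields an 18×18 grid of outputs. The activation function is left out to keep the focus on the scan, and the function name `slide_window` is made up for this example.

```python
def slide_window(grid, weights):
    """Apply one 3x3 weight set at every position where the window fits."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for top in range(rows - 2):          # move down one row at a time
        row_out = []
        for left in range(cols - 2):     # slide right one position at a time
            total = 0.0
            for r in range(3):
                for c in range(3):
                    total += weights[r][c] * grid[top + r][left + c]
            row_out.append(total)
        out.append(row_out)
    return out

# Toy field: all healthy (0.0) except one blighted plot (1.0).
grid = [[0.0] * 20 for _ in range(20)]
grid[5][7] = 1.0

# Placeholder weights: every plot in the window counts equally.
weights = [[1.0] * 3 for _ in range(3)]

response = slide_window(grid, weights)
shape = (len(response), len(response[0]))  # (18, 18) without padding
```

The same `weights` list is reused at every position; nothing inside the loops depends on `top` or `left` except which nine plots are read.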

Drysska: The same weights, every time?

Trviksha: The same weights, every time. The team is not learning different patterns for different positions. It is learning one pattern — "what does a blight cluster look like in a 3×3 patch?" — and applying that same pattern everywhere.

Drysska: Then a blight cluster in the upper-left will trigger the same response as a blight cluster in the lower-right.

Trviksha: Exactly. The network no longer cares where in the field the pattern appears. It only cares what the local patch looks like.

Illustration: A 20×20 stone grid with a small 3×3 wooden frame being slid across it by a team of velociraptors. The frame highlights nine plots at a time. As the frame moves from position to position, the same velociraptor team performs the same computation at each location. Arrows show the scanning path across and down the grid.

Training the Window

The nine weights of the 3×3 window were trained by the usual method: backpropagation. The training data consisted of thousands of 3×3 patches extracted from the forty training fields, each labelled with whether the centre plot was blighted or healthy. The weights were adjusted to minimise the loss on these patches.
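A minimal sketch of this training loop, for a single window with a sigmoid output and a cross-entropy loss, might look as follows. Everything here is an assumption made for illustration: the loss, the learning rate, the bias term, and especially the toy data, which is randomly generated and labelled by a made-up rule rather than taken from Kvrothja's fields.

```python
import math
import random

def predict(weights, bias, patch):
    """Sigmoid of the weighted sum over a flattened 3x3 patch (9 values)."""
    z = bias + sum(w * x for w, x in zip(weights, patch))
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(weights, bias, patches, labels):
    """Average cross-entropy loss over all patches."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for patch, y in zip(patches, labels):
        p = predict(weights, bias, patch)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(patches)

def train_step(weights, bias, patches, labels, lr=0.5):
    """One gradient-descent update of the nine weights and the bias."""
    grad_w = [0.0] * 9
    grad_b = 0.0
    for patch, y in zip(patches, labels):
        err = predict(weights, bias, patch) - y  # gradient of loss w.r.t. the weighted sum
        for i in range(9):
            grad_w[i] += err * patch[i]
        grad_b += err
    n = len(patches)
    new_weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b / n

# Toy data: random patches, 1.0 = blighted, 0.0 = healthy.
random.seed(0)
patches = [[float(random.random() < 0.4) for _ in range(9)] for _ in range(200)]
# Made-up labelling rule: positive if at least 4 of the 8 neighbours are blighted.
labels = [1.0 if sum(p) - p[4] >= 4 else 0.0 for p in patches]

weights, bias = [0.0] * 9, 0.0
loss_before = mean_loss(weights, bias, patches, labels)
for _ in range(200):
    weights, bias = train_step(weights, bias, patches, labels)
loss_after = mean_loss(weights, bias, patches, labels)
```

With only one layer, backpropagation reduces to this single gradient step. In a full convolutional network the shared weights would instead collect gradient contributions from every window position in every field, but the principle of nudging nine numbers to reduce the loss is the same.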

After training, Trviksha examined the learned weights. The nine numbers formed a recognisable shape: the centre weight was strongly negative, and the surrounding eight weights were moderately positive. The filter was computing, roughly, "are the neighbours blighted while the centre is not yet?" — an early-warning detector for blight spreading inward.
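To see why this shape acts as an early-warning detector, it helps to apply such a filter to a few patches by hand. Only the sign pattern below comes from the story; the magnitudes (0.5 for each neighbour, -4.0 for the centre) are invented so that the arithmetic is easy to follow.

```python
# Assumed magnitudes; only the signs (negative centre, positive ring) are from the text.
KERNEL = [[0.5, 0.5, 0.5],
          [0.5, -4.0, 0.5],
          [0.5, 0.5, 0.5]]

def weighted_sum(patch):
    """Raw filter response for one 3x3 patch, before any activation."""
    return sum(KERNEL[r][c] * patch[r][c] for r in range(3) for c in range(3))

# Assumed encoding: 1.0 = blighted, 0.0 = healthy.
surrounded = [[1.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],   # healthy centre ringed by blight
              [1.0, 1.0, 1.0]]
fully_blighted = [[1.0] * 3 for _ in range(3)]
all_healthy = [[0.0] * 3 for _ in range(3)]

at_risk_score = weighted_sum(surrounded)       # 8 * 0.5 = 4.0
lost_score = weighted_sum(fully_blighted)      # 4.0 - 4.0 = 0.0
safe_score = weighted_sum(all_healthy)         # 0.0
```

With these made-up values the filter responds strongly only to the healthy plot ringed by blight. A centre that is already blighted cancels the contribution of its neighbours, and an all-healthy patch produces nothing at all.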

Trviksha: The window learned to detect the boundary of an advancing cluster. Plots surrounded by blight but not yet blighted themselves are the ones at highest risk. The window found this pattern on its own.

Kvrothja: That matches my experience. The edge of a cluster is where the blight moves next.

What Changed

The sliding-window approach achieved 88% accuracy on held-out fields — a substantial improvement over the 76% of the fully connected network. But the improvement was not just in accuracy. The errors were different.

The fully connected network had made bizarre errors — flagging healthy areas in one field because blight happened to appear in the same grid position in another field. The sliding-window network did not make this kind of error. Because it examined local patches with shared weights, it could not learn position-specific patterns. It could only learn what blight looked like locally, regardless of where in the field the locality was.

Trviksha: The old network confused "where" with "what." It learned that position 17 was dangerous because it had seen blight there before. The new network only knows "what" — what does a local patch of blight look like — and applies that knowledge everywhere.

Blortz: The same eyes, looking at every part of the field.

Trviksha has invented the convolutional filter (also called a kernel), the fundamental building block of convolutional neural networks (CNNs). A small set of weights slides across the entire input grid, applying the same computation at every position. Strictly speaking, this gives translation equivariance, which is often loosely called translation invariance: when the pattern moves, the filter's response moves with it but is otherwise unchanged. A blight cluster in the corner triggers the same response as one in the centre. Full invariance, where the final answer ignores position altogether, usually needs a later step such as pooling. The filter is small (3×3 = 9 weights) and the grid is large (20×20 = 400 plots), yet the same 9 weights are applied at every window position. Weight sharing of this kind was a key idea in Yann LeCun's 1989 work on handwritten digit recognition, which applied convolutional networks to handwritten zip code digits. Think about how you read text: you recognise the letter "e" whether it appears at the start or end of a word, at the top or bottom of a page. You do not have a separate "e-detector" for each possible position; you have one, and it works everywhere.
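The saving from sharing weights can be made concrete with a little arithmetic. The sketch assumes the window is only placed where it fits entirely inside the field (no padding), and the "unshared" alternative is a hypothetical design, not one described in the story, in which every window position learns its own nine weights.

```python
GRID_SIZE = 20
WINDOW_SIZE = 3

# Number of places a 3x3 window fits along one side of a 20x20 grid.
positions_per_side = GRID_SIZE - WINDOW_SIZE + 1   # 18
window_positions = positions_per_side ** 2         # 324

shared_weights = WINDOW_SIZE * WINDOW_SIZE         # 9, reused everywhere
# Hypothetical alternative: separate weights for each window position.
unshared_weights = window_positions * shared_weights  # 2916
```

Nine numbers do the work that would otherwise take thousands, and because they are trained on patches from every part of every field, they cannot memorise anything about one particular position.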