generaleducators

The Art of Defining: Why Most Arguments Are About Words, Not Facts

Madhav Kaushish

Is a square a rectangle?

If you ask a room of people this question, you will get strong opinions on both sides. Some will say obviously yes. Others will insist they are different things. The debate can go on for a while. But here is the thing: either answer can be correct, depending on your definition of rectangle.

If a rectangle is any quadrilateral with four right angles, then a square is a rectangle — it has four right angles and happens to also have equal sides. If your definition of rectangle requires that adjacent sides be unequal, then a square is not a rectangle. There is nothing to disagree about once you make the definitions explicit. The question "is a square a rectangle?" is really a question about what we mean by the word "rectangle."

This is not a peculiar edge case. It turns out that a remarkably large number of disagreements, in mathematics and outside of it, are really about definitions. The disagreement has no substance until the terms are clarified. Once they are, the disagreement either dissolves or reveals itself to be about something deeper than it first appeared.

In what follows, I want to explore what mathematicians have learned about the art of defining — how to choose good definitions, what makes one definition better than another, and why this matters far beyond mathematics.

Not all definitions are equal

Consider even numbers. We could define them as numbers whose last digit is 0, 2, 4, 6, or 8 when written in base 10. Or we could define them as numbers divisible by 2. For ordinary purposes, both definitions give the same results. If you want to check whether 374 is even, the first definition is quicker — just look at the last digit. So in some practical sense, the first definition is more useful.

But the second definition tells you something the first does not. It tells you what evenness actually is. The first definition depends on the arbitrary choice of base 10. Write numbers in base 3, and the last-digit criterion falls apart entirely: three is written 10 and ends in 0, yet it is odd. The divisibility definition does not care what base you use. It captures something about the concept of evenness that is independent of how we happen to represent numbers.
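To make the base dependence concrete, here is a minimal sketch in Python. The function names are my own, and the "last-digit rule" is read as "the final digit is an even digit", which is an assumption about how the rule would carry over to other bases.

```python
def to_base(n: int, base: int) -> list[int]:
    """Digits of a non-negative integer n in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    return digits[::-1]


def even_by_last_digit(n: int, base: int = 10) -> bool:
    """Last-digit rule: call n even when its final digit in `base` is an even digit."""
    return to_base(n, base)[-1] % 2 == 0


def even_by_divisibility(n: int) -> bool:
    """Divisibility definition: n is even when 2 divides it. No base is involved."""
    return n % 2 == 0


def disagreements(base: int, limit: int) -> list[int]:
    """Numbers below `limit` on which the two definitions give different answers."""
    return [n for n in range(limit)
            if even_by_last_digit(n, base) != even_by_divisibility(n)]
```

Under these assumptions, `disagreements(10, 1000)` is empty, while `disagreements(3, 10)` already contains 3, 4, 5 and 9. The divisibility test never looks at digits at all, which is the sense in which it is independent of representation.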

This distinction between a definition that is useful for a particular purpose and a definition that gets at the heart of a concept is one of the things mathematicians care about. A good definition is not just one that correctly identifies the objects in question. It is one that illuminates them.

Take the case of parallelograms. There are at least three ways to define them:

  1. Quadrilaterals with both pairs of opposite sides parallel
  2. Quadrilaterals with both pairs of opposite sides equal
  3. Quadrilaterals whose diagonals bisect each other

These three definitions are equivalent — any shape satisfying one satisfies all three. So for the purpose of identifying parallelograms, it does not matter which one you use. But there is something interesting here. Most people find it harder to see, at a glance, why the third definition should entail the other two. The connection between "diagonals bisect each other" and "opposite sides are parallel" is not obvious. If you are teaching someone what a parallelogram is, or if you are building a theory and need a starting point, you might reasonably prefer one of the first two definitions over the third.

[Figure: Three parallelograms, each illustrating one of the equivalent definitions: opposite sides parallel, opposite sides equal, and diagonals bisecting each other.]
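The three characterisations can be checked side by side on concrete coordinates. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the four vertices are given in order around a simple, non-degenerate quadrilateral with integer coordinates, so that all comparisons are exact.

```python
Point = tuple[int, int]


def _vec(p: Point, q: Point) -> Point:
    """Vector from p to q."""
    return (q[0] - p[0], q[1] - p[1])


def _cross(u: Point, v: Point) -> int:
    """Zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


def _sq_len(u: Point) -> int:
    """Squared length, which avoids square roots and rounding."""
    return u[0] ** 2 + u[1] ** 2


def opposite_sides_parallel(a: Point, b: Point, c: Point, d: Point) -> bool:
    return _cross(_vec(a, b), _vec(d, c)) == 0 and _cross(_vec(b, c), _vec(a, d)) == 0


def opposite_sides_equal(a: Point, b: Point, c: Point, d: Point) -> bool:
    return (_sq_len(_vec(a, b)) == _sq_len(_vec(d, c))
            and _sq_len(_vec(b, c)) == _sq_len(_vec(a, d)))


def diagonals_bisect(a: Point, b: Point, c: Point, d: Point) -> bool:
    # The midpoints of AC and BD coincide exactly when a + c == b + d.
    return (a[0] + c[0], a[1] + c[1]) == (b[0] + d[0], b[1] + d[1])


def all_three(a: Point, b: Point, c: Point, d: Point) -> tuple[bool, bool, bool]:
    """Results of the three definitions, in the order listed in the text."""
    return (opposite_sides_parallel(a, b, c, d),
            opposite_sides_equal(a, b, c, d),
            diagonals_bisect(a, b, c, d))


parallelogram = ((0, 0), (4, 0), (5, 3), (1, 3))
trapezium_only = ((0, 0), (4, 0), (3, 2), (1, 2))
```

For the sample parallelogram all three checks agree that it qualifies, and for the sample trapezium all three agree that it does not. Notice that the diagonal test is the shortest to compute even though it is the least obvious to picture, which is one way the "best" definition can depend on what you are doing with it.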

So even when definitions are equivalent, there can be reasons to prefer one over another: how easy it is to work with, how clearly it connects to the mental picture we have, and how naturally it leads to the consequences we care about.

Why we study some things and not others

Why do mathematicians care about parallelograms at all? Compare parallelograms with "quadrilaterals that have any two sides equal." Parallelograms have a rich set of properties: the three equivalent characterisations above, the fact that each diagonal divides the parallelogram into two congruent triangles, and various area relationships. About "quadrilaterals with any two sides equal" there is, as far as I can tell, not much of interest to say that is not already true of quadrilaterals in general.

The philosopher Jamie Tappenden discusses this in terms of "fruitfulness." We include objects in our inventory — we give them names and study them — when they lead to significant consequences. Significant does not just mean "lots of consequences," since you can always generate trivially many consequences from any true statement. It means consequences that reveal structure or explain connections.

A similar comparison: right-angled triangles versus triangles with one 47.3-degree angle. On the face of it, both are specified by fixing one angle of a triangle, so why should we study one and not the other? There are a few reasons. One is that right angles have less arbitrariness than other angles — given a point and a line, there is exactly one perpendicular you can draw, but for any other angle that uniqueness fails. Another is that theorems about right triangles tend to be cleaner. The Pythagorean theorem is a beautiful equation. You could write out the law of cosines for 47.3-degree triangles, but there would be extra coefficients, and nobody would find the result illuminating.
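The contrast can be written out directly. In the notation below, which is my own and not from the original discussion, a and b are the two sides enclosing the fixed angle γ and c is the side opposite it. The numerical coefficient is rounded to three decimal places.

```latex
% Law of cosines for a triangle with sides a, b enclosing the angle \gamma:
c^2 = a^2 + b^2 - 2ab\cos\gamma

% Right-angled case: \cos 90^\circ = 0, so the cross term vanishes.
\gamma = 90^\circ: \quad c^2 = a^2 + b^2

% The 47.3-degree case: the cross term survives with an awkward coefficient.
\gamma = 47.3^\circ: \quad c^2 = a^2 + b^2 - 2ab\cos 47.3^\circ \approx a^2 + b^2 - 1.356\,ab
```

Both lines are true statements about a family of triangles. Only the first one has a form that invites further questions.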

The ability to ask "is this concept worth naming?" is not limited to mathematics. In any field, and in everyday life, we are constantly deciding which distinctions are worth making and which categories are worth having. Some of these decisions are more thoughtful than others.

The consequences of classification

Return to the square-rectangle question. This is not just a matter of labelling. The choice of classificatory system has consequences for how much work you have to do and how much structure you can see.

If squares are rectangles (and rectangles are parallelograms, and parallelograms are trapeziums), you get a nested hierarchy. The definitions become simple — each level adds one new condition to the level above it:

  • Trapezium: quadrilateral with at least one pair of parallel sides
  • Parallelogram: trapezium whose parallel sides are equal in length
  • Rectangle: parallelogram with one right angle
  • Square: rectangle with adjacent sides equal

Each definition is short because it builds on the previous one. And you get logical inheritance: anything proved about parallelograms automatically applies to rectangles and squares. You prove it once and get it for free at every level below.
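Readers who program may recognise this as the same mechanism as class inheritance. The sketch below is only an analogy, under assumptions of my own: the class names mirror the list above, while the parameterisation (parallel sides `p` and `q`, the distance `h` between them, an angle in degrees) is hypothetical. The area formula is stated once, for trapeziums, and every class below it gets it for free.

```python
import math


class Trapezium:
    """Quadrilateral with at least one pair of parallel sides (lengths p and q, distance h apart)."""

    def __init__(self, p: float, q: float, h: float) -> None:
        self.p, self.q, self.h = p, q, h

    def area(self) -> float:
        # Stated once, at the top of the hierarchy.
        return (self.p + self.q) / 2 * self.h


class Parallelogram(Trapezium):
    """Trapezium whose parallel sides are equal."""

    def __init__(self, base: float, side: float, angle_deg: float) -> None:
        # The distance between the parallel sides follows from the slanted side and the angle.
        super().__init__(base, base, side * math.sin(math.radians(angle_deg)))


class Rectangle(Parallelogram):
    """Parallelogram with a right angle."""

    def __init__(self, width: float, height: float) -> None:
        super().__init__(width, height, 90.0)


class Square(Rectangle):
    """Rectangle with adjacent sides equal."""

    def __init__(self, side: float) -> None:
        super().__init__(side, side)
```

Here `Square(3).area()` uses the trapezium formula without `Square` mentioning area at all, and `isinstance(Square(3), Parallelogram)` is true. In a flat scheme with four unrelated classes, each one would need its own `area` method.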

[Figure: Two classification systems for quadrilaterals side by side: a hierarchical system where each category nests inside the one above it, and a flat system where all categories are separate.]

Now suppose instead you treat squares, rectangles, parallelograms, and trapeziums as completely separate categories with no overlap. The definitions get much more complicated. A rectangle becomes "a quadrilateral with opposite sides equal, one right angle, and adjacent sides not equal." You need that last clause — "adjacent sides not equal" — to exclude squares. A parallelogram becomes "a quadrilateral with opposite sides parallel and no right angle" to exclude rectangles. Every definition has to explicitly exclude the things it is not, and every theorem has to be reproved for each category separately.

You see the same thing in biology. We could say humans are primates, which are mammals, which are vertebrates, which are animals. In that case, things established about mammals (warm-bloodedness, hair, nursing the young with milk, and so on) are automatically inherited by humans. Or we could treat "human" and "animal" as separate categories. But then we lose that inheritance, and we have to specify much more about what a human is from scratch. The considerations are remarkably similar: logical inheritance and simplicity of definitions push us toward hierarchical classification in both mathematics and biology.

When students in my course discussed the classification of triangles, exactly this sort of reasoning came up. Is an equilateral triangle a type of isosceles triangle? It depends on whether your definition of isosceles is "a triangle with two equal sides" or "a triangle with exactly two equal sides." The first allows equilateral triangles to be isosceles; the second does not. Once we settled on the first definition, one student asked: "Does that mean we would want isosceles triangles to be types of scalene triangles?" That is exactly the right question, and the answer reveals the limits of the hierarchical approach. If "scalene" were widened to include isosceles triangles, it would simply mean "triangle," and nothing would be left to inherit. At some point the inheritance is not useful, and we do not want to push it further.

When definitions break outside their home

One of the most interesting things about definitions is what happens when you try to use them in a context they were not designed for. In many cases, a concept has several equivalent definitions in its original setting. When you move to a new setting, those equivalences can break down, and you have to choose which definition to keep.

The canonical example is prime numbers. The elementary school definition is: a prime number is a whole number greater than 1 that is divisible only by 1 and itself. There is another definition: a number greater than 1 is prime if, whenever it divides a product of two numbers, it divides at least one of them. For ordinary whole numbers, these are equivalent. But when mathematicians generalised from whole numbers to other number systems, such as the rings of integers of algebraic number theory, the two definitions came apart. There are elements in certain rings that satisfy the first definition but not the second; these are called irreducible rather than prime. Mathematicians chose to keep the second definition as the "real" definition of prime, because the important theorems about primes depend on that property, not on the elementary-school characterisation.
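A standard textbook instance, which I am supplying here rather than taking from the discussion above, is the number system of all a + b√-5 with a and b whole numbers. In it, 2 cannot be factored in any non-trivial way, yet 2 divides the product (1 + √-5)(1 - √-5) = 6 without dividing either factor. The sketch below checks this; the pair `(a, b)` standing for a + b√-5 and all function names are my own conventions.

```python
Elem = tuple[int, int]  # (a, b) stands for a + b*sqrt(-5)


def mul(x: Elem, y: Elem) -> Elem:
    """Product of two elements, using sqrt(-5)**2 == -5."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)


def norm(x: Elem) -> int:
    """Multiplicative size measure: norm(x * y) == norm(x) * norm(y)."""
    a, b = x
    return a * a + 5 * b * b


def divides(x: Elem, y: Elem) -> bool:
    """True when y == x * q for some q in the system (x must be nonzero)."""
    a, b = x
    c, d = y
    n = norm(x)
    # y / x equals y * conjugate(x) / norm(x); both parts must be whole numbers.
    return (c * a + 5 * d * b) % n == 0 and (d * a - c * b) % n == 0


two: Elem = (2, 0)
left: Elem = (1, 1)    # 1 + sqrt(-5)
right: Elem = (1, -1)  # 1 - sqrt(-5)

# A non-trivial factorisation of 2 would need a factor of norm 2,
# since norm(2) == 4. Any such factor has a*a <= 2 and 5*b*b <= 2,
# so this small search is exhaustive.
has_norm_two = any(norm((a, b)) == 2 for a in range(-2, 3) for b in range(-2, 3))
```

Under these conventions, `has_norm_two` is false, so 2 passes the elementary-school test. But `divides(two, mul(left, right))` is true while `divides(two, left)` and `divides(two, right)` are both false, so 2 fails the product test. The two definitions really do pick out different things here.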

I explored something similar with students in my course. In the module on discrete geometry, I asked them to work in worlds with finitely many points. To do anything in these worlds, they needed to borrow concepts from Euclidean geometry — things like "straight line" and "bisection." But straight lines in Euclidean geometry have many properties. They are paths in the same direction. They are the shortest paths between two points. Given two points, there is exactly one straight line between them. In Euclidean geometry, all of these properties go together. In a world with six points, they do not.

Students had to decide which property to preserve. We did not have a useful concept of direction in these worlds, so the direction-based definition was out. But we did have a concept of distance, so defining straight lines as shortest paths worked. The catch was that shortest paths in these worlds are not unique — there might be several shortest paths between two points — which is very different from Euclidean geometry. So by choosing the distance-based definition, students gained existence (there are always shortest paths) but lost uniqueness (there might be more than one). By choosing the direction-based definition, they would have had uniqueness (if a straight line exists, it is the only one) but lost existence (there may be no straight line between two given points).

[Figure: Euclidean geometry versus discrete geometry: in Euclidean space there is exactly one straight line between two points, while in a six-point world there can be multiple shortest paths.]
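To see the loss of uniqueness in a concrete case, take a hypothetical six-point world: six points arranged in a ring, each adjacent to its two neighbours, with distance measured as the number of steps. This particular world is an assumption for illustration and not necessarily one the students worked with. The sketch uses breadth-first search to list every shortest path between two points.

```python
from collections import deque


def shortest_paths(adj: dict[int, list[int]], start: int, goal: int) -> list[list[int]]:
    """Return every shortest path from start to goal in an unweighted graph."""
    dist = {start: 0}
    parents: dict[int, list[int]] = {start: []}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is reached: record its distance and predecessor.
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # v is reached again by an equally short route.
                parents[v].append(u)
    if goal not in dist:
        return []

    def build(v: int) -> list[list[int]]:
        if v == start:
            return [[start]]
        return [path + [v] for p in parents[v] for path in build(p)]

    return build(goal)


# Hypothetical six-point world: points 0..5 in a ring.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

In this world, `shortest_paths(ring, 0, 1)` gives a single path, but `shortest_paths(ring, 0, 3)` gives two, one going each way round the ring. A "straight line" defined as a shortest path always exists here, but between opposite points it is not unique.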

There is no "right answer". It depends on what you want to do and which consequences matter to you. This kind of trade-off is present whenever you extend a concept beyond its original context — in mathematics, in science, in law, and in everyday language.

Definitions in everyday life

I have been talking mostly about mathematics, but I think the ideas here are broadly applicable. Consider a few everyday examples.

When people debate whether a particular action is "violent," they are often working with different definitions. For some, violence requires physical harm. For others, destroying property counts. For still others, certain kinds of speech count as violence. These are not just semantic quibbles — the definition you choose determines whether particular actions fall inside or outside the category, which in turn affects moral and legal judgements. The disagreement about "violence" is in large part a disagreement about definition, and recognising that is the first step toward having a productive conversation about it.

Or take the concept of "democracy." Different people mean very different things by this word. Some mean a system where leaders are chosen by popular vote. Others include constraints on majority rule, protections for minorities, freedom of the press, and so on. Debates about whether a particular country is "really a democracy" often turn out to be debates about definition. That does not make them unimportant — which definition we use has consequences for how we evaluate political systems — but it means we should be explicit about what we mean before arguing about whether a particular case fits.

Classification operates similarly. Is a virus alive? It depends on your definition of life. The interesting question is not whether the answer is yes or no but what each answer entails. If we define life in a way that includes viruses, we may have to revise other things we believe about living things. If we exclude them, we need to explain why they share so many features with things we do consider alive. The classification has consequences.

What a classroom taught me

When I ran a course on theory building with 12-to-15-year-old students in Pune, India, one of the modules was essentially a game about defining. I made up a word — "podgon" — and held two secret definitions of it. Students proposed shapes and I told them whether each shape was a podgon according to each of my two definitions. Their job was to figure out the definitions.

What made this interesting was watching students negotiate the relationship between a definition and the examples it generates. A student named Tarini, after learning that a square was not a podgon for either definition but an equilateral triangle was, did not jump to the conclusion that podgons must be triangles. Instead, she asked about an equilateral pentagon. When told it was a podgon, she immediately said: "Odd number of sides." She was not just pattern-matching. She was forming a conjecture about the definition and then designing a test to evaluate it.

On the other hand, there were moments where students drew conclusions that did not follow. A student named Imran, having established that one open shape was not a podgon, concluded that all podgons must be closed. That is a common error — generalising from a single case. But the structure of the activity made the error visible. I could draw a different open shape and ask: "How do you know this one is not a podgon?" The game created a situation where the gap in reasoning was concrete and discussable.

These are not just mathematical skills. The ability to distinguish between "this example is consistent with my definition" and "this example proves my definition" is the same skill you need when evaluating evidence in any context. The habit of asking "what would count as a counterexample to my belief?" is the heart of critical thinking.

Definitions are choices

Perhaps the most important thing I have learned from thinking about definitions, both in mathematics and in teaching, is that definitions are choices. This sounds obvious, but it goes against a deep intuition most people have, which is that words have fixed meanings that you can look up in a dictionary or a textbook.

In mathematics, the same object can have many valid definitions. Some are equivalent, some are not. Even among equivalent definitions, some are more useful, more illuminating, or more central to the concept than others. The choice of definition shapes what theorems you can prove, what structure you can see, and what generalisations are possible. It is not a preliminary step before the "real" mathematics begins. It is a core part of the mathematical enterprise.

Outside of mathematics, definitions are even more obviously choices, but we tend to forget this. When someone says "that is not real art" or "that is not a real sport" or "that does not count as work," they are enforcing a particular definition as though it were a fact about the world. Noticing that the disagreement is about the definition, not about the case at hand, is a skill that can be taught. It is a skill that makes conversations more productive and thinking clearer.

I do not mean that all definitions are equally good, or that "it is just a matter of definition" is always a satisfying response. Some definitions are better than others, for reasons I have tried to articulate: fruitfulness, centrality to the concept, simplicity, generalisability, what they preserve and what they lose. But recognising that a definition is a choice — and that a different choice would lead to different consequences — is the starting point for thinking carefully about anything.