Hugo's Blog

C++
haskell

A Boomerang Proof

Introduction

My mother is a teacher at a primary school where she often works with the gifted kids that yearn for an extra challenge. One of the challenges she provided them was called the Boomerang Problem1. The goal of the challenge is to find the largest number that cannot be represented as the sum of multiples of two given numbers (called the boomerangs). So for example, given the number 4 and 5, you can construct 17 as \(3 * 4 + 1 * 5\). However, the number 11 cannot be represented that way. In fact, it turns out to be the largest number for which that holds given boomerangs 4 and 5.

Impressed with the speed with which some kids were able to answer the question she also wanted to test me. And so, after doing a bunch of arithmetic to solve the problem I naturally wanted to find an easier, lazier, and more elegant way to generate an answer.

In this post we will try to formulate a generic method for finding this largest number given any two boomerangs. We do this for entertainment, to aid with sleeping now that one more question in the universe is answered, and unbeknownst to most people: because proofing is a creative process. In the final paragraph I will shed a few words on what makes the process so interesting and worthwhile.

Euclid would like to interject for a moment

Before we try to answer the question of the largest number we must asks ourselves first: does such a number always exist? And in fact, this happens to not always be the case. The easiest counter example that you can think of is when the two boomerangs are even numbers. After all, both the sum and a multiple of even numbers remain even. As a result, two even boomerangs will never be able to express an odd number, making the question "What is the largest number that cannot be expressed?", rather nonsensical.

Even numbers carry with them a property that troubles the water in the same way that 6 and 9 are two boomerangs for which the question becomes nonsensical. Namely, they are both a multiple of a number other than 1. This number is known as the Greatest Common Divider (Or GCD for short). We know that if the GCD of two numbers is some number \(z\), that \(k * z+1\) can never be expressed for the following reason:

  1. Given 2 boomerangs \(x\) and \(y\) such that \(GCD(x,y) = z\).
  2. We can rewrite \(y = n1 * z\) and \(x = n2 * z\) for some \(n1, n2 \in \mathbb{N}\)
  3. Our construction expression then becomes \(a * n2 * z + b * n1 * z\)
  4. Which can be factorized as \((a * n1 + b * n2) * z\)
  5. This implies that only multiples of \(z\) can be constructed.
  6. Finally we can conclude that there is no such thing as the largest inexpressible number if \(z > 1\)

What we have learned is that before we start asking the question we first calculate the GCD to check if it is in fact 1. Luckily there is a relatively simple method for determining the GCD which is known as the Euclidean Algorithm2.

Developing an intuition by force: throwing a computer against it.

While it is not always necessary, usually a proof attempts to verify an already developed idea or hypothesis. Furthermore, it forms a great excuse to include some computer science concepts in the mix regarding solving puzzles algorithmically.

In essence, what we want is a program that given a number z and two boomerangs x and y, will return an a and b that satisfies our puzzle, or tell us that it is impossible of course. Then we will run this program for any number between 1 and say 100000. More than likely, no new solutions will pop up after a while and we can reasonable assume that we have found the highest number. Note that this does not yet constitute a proof, but it can give us a good understanding and new insights that will help us get there eventually.

Now this is the part where we will dive into some computer science, if you are here just for the math, don't worry, it won't affect our proof. The thing is, we want to construct a program that is able to find a solution in a reasonable amount of time. To do this, we will use a concept known as dynamic programming3 (or DP for short). What this method helps us do, is create an efficient algorithm for problems that can be expressed as a sub problem:

$$ T(x,y,0) = True \\ T(x,y,z) = z > 0 \wedge (T(x,y,z-x) \vee T(x,y,z-y)) $$

What this says is that the number z can be expressed if either \(z-x\) or \(z-y\) can be expressed. As base case we also state that the number 0 can always be expressed. In haskell, the mathematical notation translates quite well:

test :: Int -> Int -> Int -> Bool
test x y z
  | z < 0 = False
  | z == 0 = True
  | otherwise = test x y (z-x) || test x y (z-y)

However, this implementation can run out of hand quite quickly since it can require up to 2 recursive calls per iteration, meaning that the execution stack grows exponentially. But this is where the concept of DP kicks in, now that we have stated our problem in terms of the solution to a smaller version we can start to attack it from the other side, eliminating recursion and ensuring that we never have to calculate something twice. Furthermore, we also save the information needed to actually reconstruct a and b once we know that the answer has been yes. For this we will employ c++:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <algorithm>

int main(int argc, char** argv) {
  // x and y are our boomerangs, we can vary these to test
  int x = 28;
  int y = 19;
  int z = 114;

  // Allocate space to save the answers for the sub problems
  int* data = (int*)malloc((z+1) * sizeof(int));
  // Initialize the memory to 0
  memset(data, 0, (z+1)*sizeof(int));

  // base case
  data[0] = 1;

  // solve all the problems up to and including z
  for(int i = 1; i<z+1; ++i) {
    if (i >= x && data[i-x] ==1)      data[i] = 1;
    else if (i >= y && data[i-y] ==1) data[i] = 1;
  }

  // Answer was not found
  if (data[z] == 0) {
    printf("%i = NO ANSWER\n", z);
    return 0;
  }

  // Reconstruct a and b
  // Start from the target number
  int i = z;
  int a = 0;
  int b = 0;
  while(i>0) {
    // the number minus x can be expressed (thus add an x)
    if (i >= x && data[i-x] == 1) {
      a++;
      i -= x;
    }
    // the number minus y can be expressed (thus add a y)
    else if (i >= y && data[i-y] == 1) {
      b++;
      i -= y;
    }
  }
  printf("%i = %i * %i + %i * %i\n", z, a, x, b, y);
}

Now, let's run this for a number of boomerangs and see what the largest number is for which the program outputs NO ANSWER.

x y largest z
4 5 11
5 6 19
7 10 53
28 19 485

Admittedly, it takes either luck or time to spot the pattern, but eventually it can be done. I was sitting in the garden, thinking about what these numbers might mean and where they came from and then it suddenly hit me: the formula for the largest number is a follows. $$ L(x,y) = x * y - x - y $$ It seems we have found a simple and elegant formula that can tell us the answer right away; No programming languages, no sub problems, often not even a calculator is needed. But where does this formula come from? And it is always correct? Perhaps some number in the billions is waiting out there to show our hypothesis is wrong... Clearly, sleep is not yet an option.

If you want to play around yourself with some numbers I've taken that same c++ code and converted it to javascript for you to play with:

Proof that the formula always produces an inexpressible number

  1. Let us define \(z = x * y\)
  2. Because \(GCD(x,y) = 1\) we know that there are exactly 2 ways to express \(z\):
    \(z = y * x + 0 * y\)
    \(z = 0 * x + x * y\)
  3. If we then subtract \(x\) from \(z\) we can only fix the first expression to remain true:
    \(z-x = (y-1) * x + 0 * y \)
  4. If we then again subtract \(y\) from \(z\) we cannot fix the expression, concluding that for \(z = x*y-x-y\) there exists no solution.

Proof that any larger number can be expressed

Let's first think about how someone might proof such a thing. Our computer program was not a valid proof because we could never evaluate every single number. What we have to come up with is something known as an induction proof. We aren't going to proof that it holds for some number, we are instead going to proof that we can always increment a number by 1 and still express the number of the sum of the two boomerangs. That way, our statement suddenly holds for all numbers to the stars and beyond!

  1. Given numbers x,y,z,a,b such that \(z = a * x + b * y\)
  2. We can decrement a by n such that \(n*x \mod y = y-1\), leading to the following expression: \((a-n) * x + (b - \frac{n*x}{y} + 1) * y = z + 1\)
  3. Similarly we can decrement b by n such that \(n*y \mod x = x-1\), leading to the following expression: \((a - \frac{n*y}{x} + 1) * x + (b-n)*y = z + 1\)
  4. Now, it is not possible to always find such an n. But since \(GCD(x,y)=1\) we know that
    \(n*x \mod y = 0 \Rightarrow n = k * y\)
    \(n*y \mod x = 0 \Rightarrow n = k * x\)
    In other words, the sequence of rests always ends with 0 and only when n is a multiple of the divider
  5. We also know that the sequences of rests cannot contain a loop for any \(n < x\) or \(n < y\) respectively
  6. From 4. and 5. we can conclude that \(y-1\) and \(x-1\) must always appear in the sequences of rests when we increment n to \(x-1\) or \(y-1\) in step 2 and 3 respectively.
  7. From step 6. we can conclude that we can always rearrange the expression to become 1 more iff \(a ≥ y-1\) and/or \(b ≥ x-1\). The smallest number for this holds is \(x*y-x-y\).
  8. Finally, after we have applied either of these rules we will always have enough x's and y's left to apply the other rule at the next iteration.
  9. Therefore we can express every number above \(x*y-x-y\) as a boomerang number.

An example

To highlight the rewrite rules we have conjured up in our proof let us walk through an example with 4 and 5 as a boomerangs. We already know now that 11 is the largest number that cannot be expressed, but we can start with the rewrite rule from there.

z a b elaboration
11 - - -
12 3 0 -
13 2 1 \(n*4 \mod 5 = (5-1)\) holds for \(n=1)\) since \(4 \mod 5=4)\). Now we can subtract a by 1 and increment b by \(\frac{4}{5} + 1 = 1\)
14 1 2 \(n*4 \mod 5 = (5-1)\) holds for \(n=1)\) since \(4 \mod 5=4)\). Now we can subtract a by 1 and increment b by \(\frac{4}{5} + 1 = 1\)
15 0 3 \(n*4 \mod 5 = (5-1)\) holds for \(n=1)\) since \(4 \mod 5=4)\). Now we can subtract a by 1 and increment b by \(\frac{4}{5} + 1 = 1\)
16 4 0 \(n*4 \mod 5 = (5-1)\) holds for \(n=1)\) since \(4 \mod 5=4)\). But, \(1 > 0\) so we proceed to rewrite rule 2. \(n*5 \mod 4 = (4-1)\) holds for \(n=3\) since \(15 \mod 4 = 3\). So we decrement b by 3 and increment a by \(\frac{15}{4}+1=4\).

Summarizing

And there we have it! Proof! First we created a computer program to gain some intuition about what the solution might be. Then we proved that the number produced by our solution would at least never be a expressible by our boomerangs. Finally, using induction we proved that after that number we could always rewrite the expression to make the total 1 more, thereby proving that we can express any number up to infinity larger than \(x*y-x-y\)!.

Math as a creative process, why computers can't do everything

I don't think proofs like this interest most people and I think for a large part that has to do with not realizing that math is a fundamentally creative process. In school you usually experience a cycle in which the teacher explains a concept, rule, or formula after which you go ahead and apply this newfound knowledge to a number of examples.

While this is of course very necessary to teach everyone the basics of math it often fails to highlight the fact that those rules and concepts were not carved in stone by a deity and gifted to us; Instead we discovered them! And this is were things get ironic since the word discovery implies some sort of unspecified human process, which most people do not associated with a field like math that is often crowned with the virtue of being precise and unambiguous.

In fact, it is precisely because developing new theories in math requires human creativity that it attracts people. Mathematicians don't view themselves as efficient computers, but as creative beings that are almost creating art, albeit under very strict rules.

It is true that computer programs exist that can tell you whether certain conjectures are true or not (SAT solvers for example) but they are currently quite limited. It seems that every time we try to create a magical auto proofer, we stumble upon some missing information that requires creativity to solve.

All this leads to the most existential question in math: 'Is math something you discover or something you create?'

References

  1. Boomerang Problem by Erik van Haren's MathPlay (Dutch)
  2. Wikipedia page about the Euclidean Algorithm
  3. Wikipedia page about Dynamic Programming
last modified: May 29, 2023 at 13:07