This page gives an overview of a technique attributed to A. Karatsuba for multiplying large
numbers which can be used to improve the performance of multiplying numbers with
a large number of digits (such as `BigInteger`).
Essentially, the technique is as follows:

- break the multiplication of two numbers down into the multiplication of four constituent numbers;
- for these four multiplications, employ a trick that means only three multiplications actually need to be performed;
- for large numbers, the multiplication is broken down recursively.

When dealing with large numbers, multiplication is an expensive operation compared to addition, and so the number of underlying multiplications effectively becomes the rate-determining factor. Reducing just one in four underlying multiplications has a considerable effect on how the method scales with more and more digits.

The technique is a great example of how the application of a simple bit of high-school algebra can give a significant improvement. It stems from the basic observation of how to multiply together two sums in brackets. In effect, all four possible combinations of terms are added together:

(*a* + *b*)(*c* + *d*) ≡ *ac* + *ad* + *bc* + *bd*

Now, given two large numbers to multiply together, we can break this down into two *sums* of
numbers to multiply together, each of which has fewer significant digits. For example:

384775 * 992614 ≡ (384000 + 775) * (992000 + 614)

This breaks the problem down because in reality, multiplying 384000 by 992000 is no harder than multiplying 384 by 992 and then shifting the result left by 6 decimal places. So we can rewrite the above as:

384775 * 992614 ≡ (1000a + b) * (1000c + d)

≡ 1000000*ac + 1000*ad + 1000*bc + bd

≡ 1000000*ac + 1000*(ad+bc) + bd

where *a* = 384, *b* = 775 etc.

Now, on the surface, there are four actual multiplications that we need to perform here,
corresponding to the pairs of letters *ac*, *ad*, *bc*, *bd*, since multiplying
by 1000 and 1000000 is simply a question of shifting left by 3 or 6 decimal places respectively.

Imagine that we first do
the first and last of the four multiplications, i.e. we calculate *ac* and *bd*.
Then, the trick concerns how we perform the middle two multiplications *ad* and *bc*.
We don't actually care about the individual values of *ad* and *bc*, only about
the *sum* of these two multiplications. And there is a way that we can derive this sum
by doing only one more multiplication.
Instead of performing the
two multiplications *a*d* and *b*c*, we perform the single multiplication
(*a+b*) * (*c+d*). Recall from above that this is equivalent to:

Now, rearranging things, this means that:

≡ (

Notice how the underlined parts cancel one another out.
In other words, we can derive the sum that we actually want— the sum of
the second and third multiplications— by instead performing a single multiplication
*(a+b)***(c+d)* and then subtracting from this the results of the two mutliplications
that we already performed. Overall, we have performed three mutliplications instead of four.
We will also perform some extra additions and have to perform the "shifts" corresponding
to multiplication by 1000 and 1000000, but on large numbers, additions and shifts left are
relatively fast comapred to multiplication.

In our example, the number was just six digits long for illustration. In reality, there would be little point in applying the algorithm to such a small number since our processor could multiply two numbers of this magnitude in a single instruction taking one or two clock cycles. But on numbers that are hundreds and thousands of digits long, the saving is dramatic.