Karatsuba's algorithm for multiplication
This page gives an overview of a technique attributed to A. Karatsuba for multiplying large
numbers which can be used to improve the performance of multiplying numbers with
a large number of digits (such as BigInteger).
Essentially, the technique is as follows:
- break the multiplication of two numbers down into the multiplication of four constituent numbers;
- for these four multiplications, employ a trick that means only three multiplications actually need
to be performed;
- for large numbers, the multiplication is broken down recursively.
When dealing with large numbers, multiplication is an expensive operation compared to
addition, and so the number of underlying multiplications effectively becomes the rate-determining
factor. Reducing just one in four underlying multiplications has a considerable effect on
how the method scales with more and more digits.
The technique is a great example of how the
application of a simple bit of high-school algebra can give a significant improvement. It stems
from the basic observation of how to multiply together two sums in brackets. In effect,
all four possible combinations of terms are added together:
(a + b)(c + d) ≡ ac + ad + bc + bd
Now, given two large numbers to multiply together, we can break this down into two sums of
numbers to multiply together, each of which has fewer significant digits. For example:
384775 * 992614 ≡ (384000 + 775) * (992000 + 614)
This breaks the problem down because in reality, multiplying 384000 by 992000 is no harder
than multiplying 384 by 992 and then shifting the result left by 6 decimal places. So we can rewrite
the above as:
384775 * 992614 ≡ (1000a + b) * (1000c + d)
≡ 1000000*ac + 1000*ad + 1000*bc + bd
≡ 1000000*ac + 1000*(ad+bc) + bd
where a = 384, b = 775 etc.
Now, on the surface, there are four actual multiplications that we need to perform here,
corresponding to the pairs of letters ac, ad, bc, bd, since multiplying
by 1000 and 1000000 is simply a question of shifting left by 3 or 6 decimal places respectively.
Imagine that we first do
the first and last of the four multiplications, i.e. we calculate ac and bd.
Then, the trick concerns how we perform the middle two multiplications ad and bc.
We don't actually care about the individual values of ad and bc, only about
the sum of these two multiplications. And there is a way that we can derive this sum
by doing only one more multiplication.
Instead of performing the
two multiplications a*d and b*c, we perform the single multiplication
(a+b) * (c+d). Recall from above that this is equivalent to:
ac + ad + bc + bd
Now, rearranging things, this means that:
a*d + b*c ≡ [ ac + ad + bc + bd ] - ac - bd
≡ (a+b) * (c+d) - ac - bd
Notice how the underlined parts cancel one another out.
In other words, we can derive the sum that we actually want— the sum of
the second and third multiplications— by instead performing a single multiplication
(a+b)*(c+d) and then subtracting from this the results of the two mutliplications
that we already performed. Overall, we have performed three mutliplications instead of four.
We will also perform some extra additions and have to perform the "shifts" corresponding
to multiplication by 1000 and 1000000, but on large numbers, additions and shifts left are
relatively fast comapred to multiplication.
In our example, the number was just six digits long for illustration. In reality, there would
be little point in applying the algorithm to such a small number since our processor could
multiply two numbers of this magnitude in a single instruction taking one or two clock
cycles. But on numbers that are hundreds and thousands of digits long, the saving is dramatic.