April | 2023 | On software efficiency

Introduction

Working with big factorials can be a challenge. For instance if we want to calculate the number of combinations one uses the following formula:

nCr = n!/(r!(n-r)!)

A naïve approach causes n * r * (n – r) multiplications and one division.

The problem is that it’s easy to cause overflow of traditional built-in number types.

16! overflows a 32 bit number.

21! overflows a 64 bit number.

There exist libraries that allow programmers to circumvent these limitations like BigInteger in Java. But once these numbers become big they become inefficient to calculate.

Another solution is the one of Mark Dominus https://blog.plover.com/math/choose.html who adopted an interesting approach from https://en.wikipedia.org/wiki/L%C4%ABl%C4%81vat%C4%AB.

Other than the two solution above I did not immediately find another approach that lends the flexibility I was searching for.

A solution based on prime factor vectors

This post however describes another approach.

It is based on the fundamental theorem of arithmetic: Every integer greater than 1 is either a prime number or can be written as a product of its prime factors.

Once we have the arrays of prime factors, calculating the result is quite simple. Given variables in brackets indicate an array or vector of prime factors, we can rewrite the above formula as follows:

[nCr] = [n!] - ([r!] + [(n-r)!])

Where + and - are simple vector additions/subtraction.

The challenge is how to efficiently convert a n! into a vector of prime factors.

A simple approach calculates the prime factors of n! by calculating factors of each component of the result.

[n!] = [n] + [n – 1] + … +[2]

This is costly, given it needs to factor every multiple of n!.

The following approach however is more efficient:

calculate_for_factorial(n):
   Calculate the primes with a sieve of eratosthenes
   N = new int[primes <= n.length]
   For the ith prime in primes <= n: 
      Calculate count for one prime

To find the factor for one prime p we count the amount of pⁱ that fits in n.

This calculates the vector in ~ log(n) * n / ln(n) steps since it needs log_p(n) per prime. And there are n/ln(n) primes smaller than n.

The upside is that it avoids overflows under the following limits:

For 32 bit numbers the maximum is 2,147,483,647!.

For 64 bit numbers this is 9,223,372,036,854,775,807!.

Here is a java implementation

public static List<Integer> calculate_for_factorial(int n)
    {
        List<Integer> allPrimes = Primes.getAllPrimes(n);
        List<Integer> N = new ArrayList<>(allPrimes.size());
        for (int i = 0; i < allPrimes.size(); i++) {
            var prime = allPrimes.get(i);
            int c = 0;
            int r = prime;
            while(r <= n) {
                c+= n/r;
                r*=prime;
            }
            N.add(c);
        }
        return N;
    }

This approach can also be used for other combination problems like permutations with multisets or multinomial coefficient problems.

On software efficiency

Monthly Archives: April 2023

Handling factorials using prime factor vectors

Introduction

A solution based on prime factor vectors