The power function \(x^y\) is integral to many DSP applications, such as dB-to-linear gain conversion (\(y = 10^\frac{x}{20}\)) and semitone-to-Hz conversion (\(f_t = f_0 \cdot 2^{\frac{t}{12}}\)). When studying the code in Dexed, an FM synth modeled after the DX7, I found many uses of the `exp2` function (\(2^x\)), especially in the amplitude envelope calculation.

In this post, we will look at how \(2^x\), the `exp2` function, can be approximated for speed-intensive, precision-tolerant use cases. Note that we only discuss `exp2`, because 2 is a convenient base in floating-point representation (more on this later), and it extends easily to the generic power function \(x^y\). Given \(f(k) = 2^k\), we can transform the power function by multiplying the input by the constant \(\log_{2}{x}\) to make use of \(f(\cdot)\):

$$x^y = 2^{y \cdot \log_{2}{x}} = f(y \cdot \log_{2}{x})$$
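As a quick sanity check, the identity can be exercised with the standard library's `exp2` before we build any approximation (a minimal sketch; the helper names are ours):

```cpp
#include <cmath>

// x^y computed through exp2, valid for x > 0:
// x^y = 2^(y * log2(x))
inline float pow_via_exp2(float x, float y) {
    return std::exp2(y * std::log2(x));
}

// dB-to-linear gain, 10^(x/20), routed through the same identity
inline float db_to_gain(float db) {
    return pow_via_exp2(10.0f, db / 20.0f);
}
```

Swapping `std::exp2` here for any of the approximations below turns every power computation in the signal path into the same fast kernel.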

### Initial ideas

A straightforward approach is to truncate the **Taylor series** of \(2^x\) up to the \(n\)-th term. One can get the Taylor series of \(2^x\) as:

$$2^x = e^{x \ln 2} = 1 + \frac{x \ln 2}{1!} + \frac{(x \ln 2)^2}{2!} + \frac{(x \ln 2)^3}{3!} + \cdots $$

However, to get a good approximation across a wide input range, a high polynomial order is required, which is computationally expensive.
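To make the range problem concrete, here is a sketch of an order-\(n\) truncated Taylor series (function name ours); it is accurate near zero but drifts quickly as \(|x|\) grows:

```cpp
#include <cmath>

// Truncated Taylor series of 2^x = e^(x ln 2) around 0,
// summing the terms (x ln 2)^k / k! for k = 0..n
inline float exp2_taylor(float x, int n) {
    const float xl = x * 0.69314718f;  // x * ln 2
    float term = 1.0f, sum = 1.0f;
    for (int k = 1; k <= n; ++k) {
        term *= xl / (float)k;
        sum += term;
    }
    return sum;
}
```

At order 3, the error near \(x = 0.5\) is already around \(10^{-3}\), while at \(x = 5\) the truncated sum is nowhere near \(2^5 = 32\); restricting the input range is what makes a low-order approximation viable.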

Another idea, from Dexed, is to use a finite-range lookup table and fixed-point arithmetic; however, this method is usable only in fixed-point systems.

To get a more precise and efficient implementation in floating point, we need to first understand the floating point representation.

### Separating the integer and decimal part

Let's say we want to implement an `exp2` approximation for a single-precision (32-bit) floating-point system. The IEEE-754 single-precision representation consists of 1 sign bit, 8 exponent bits, and 23 mantissa (or fraction) bits, as depicted in the diagram:

The corresponding formula of single-precision floating point is \((-1)^{S} \times 1.M \times 2^{(E - 127)}\). From this formula, we can observe that: **given an integer input, calculating exp2 is essentially bit-shifting to get the exponent bits \(E\)**. We also need to add the bias value in the exponent bits before bit-shifting. For single-precision, the bias value is 127 or 0x7f, as shown in the formula above.
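For illustration, here is the scalar version of this observation (a sketch, assuming the input stays within the normal exponent range, roughly \([-126, 127]\); the function name is ours):

```cpp
#include <cstdint>
#include <cstring>

// 2^n for integer n, built directly from the IEEE-754 bit layout:
// write (n + 127) into the 8 exponent bits, i.e. shifted past the
// 23 mantissa bits, leaving sign and mantissa zero
inline float exp2_int(int n) {
    uint32_t bits = (uint32_t)(n + 127) << 23;
    float out;
    std::memcpy(&out, &bits, sizeof out);  // reinterpret the bits as a float
    return out;
}
```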

This gives us an idea of how we can tackle the approximation separately, given an input \(x\):

- for the integer part \(\lfloor x \rfloor\), bit-shift to the exponent bits;
- for the decimal part \(x - \lfloor x \rfloor\), use a rational approximation;
- multiply the outputs of both parts: \(2^{x} = 2^{\lfloor x \rfloor} \cdot 2^{x - \lfloor x \rfloor}\) (in C++, we can use `ldexp`).

### Rational approximation of `exp2f`

Depending on the rounding mode used to extract the integer part, the decimal part falls within either \([-0.5, 0.5]\) or \([0, 1)\). With this, we only need an approximation precise enough within this range, which is far more achievable.
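In scalar code, the two splits look like this (a minimal sketch; `nearbyint` assumes the default round-to-nearest mode):

```cpp
#include <cmath>

// round-to-nearest: decimal part falls in [-0.5, 0.5]
inline float frac_nearest(float x) { return x - std::nearbyint(x); }

// floor: decimal part falls in [0, 1)
inline float frac_floor(float x) { return x - std::floor(x); }
```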

There are a myriad of ideas on how this approximation could be achieved. We can start from an \(n\)-th order polynomial approximation. For example, with the help of `np.polyfit`, we can get a 3rd-order polynomial approximation:

$$ 2^{x} \approx 0.05700169x^{3}\ + 0.24858144x^{2} + 0.69282515x + 0.9991608, \quad x \in [-1, 1]$$

This is actually quite close to the Taylor expansion at order 3:

$$ 2^{x} \approx \frac{(x \ln 2)^3}{3!} + \frac{(x \ln 2)^2}{2!} + \frac{x \ln 2}{1!} + 1 $$

$$ \quad \quad \quad \quad \quad = 0.0555041x^{3}\ + 0.2402265x^{2} + 0.693147x + 1 $$
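Putting the pieces together in scalar form, using the fitted coefficients above and `ldexp` for the integer part (a sketch; the function name is ours):

```cpp
#include <cmath>

// exp2 approximation: round-to-nearest split, then a 3rd-order
// polynomial (Horner's scheme) on the decimal part in [-0.5, 0.5],
// finally scaled by 2^(integer part) via ldexp
inline float fast_exp2(float x) {
    const float n = std::nearbyint(x);
    const float f = x - n;
    float p = 0.05700169f;
    p = p * f + 0.24858144f;
    p = p * f + 0.69282515f;
    p = p * f + 0.9991608f;
    return std::ldexp(p, (int)n);
}
```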

The Cephes library uses a Padé approximant of the form:

$$ 2^{x} \approx 1 + 2x \frac{P(x^2)}{Q(x^2) - xP(x^2)}, \quad x \in [-0.5, 0.5]$$

$$ P(x) = 0.023093x^{2}+20.202x+1513.907 $$

$$ Q(x) = x^{2}+233.184x+4368.212 $$

From a blog post by Paul Mineiro, it seems the author also uses something similar to a Padé approximant, but with a lower polynomial order:

$$ 2^{x} \approx 1 + \frac{27.7280233}{4.84252568 - x} - 0.49012907x - 5.7259425, \quad x \in [0, 1)$$
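A scalar sketch of this lower-order kernel, using a floor split so the decimal part lands in \([0, 1)\) (function name ours):

```cpp
#include <cmath>

// low-order rational kernel on [0, 1), per the formula above,
// scaled by 2^(integer part) via ldexp
inline float exp2_mineiro(float x) {
    const float n = std::floor(x);
    const float f = x - n;               // in [0, 1)
    const float k = 1.0f + 27.7280233f / (4.84252568f - f)
                    - 0.49012907f * f - 5.7259425f;
    return std::ldexp(k, (int)n);
}
```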

### Timing and Accuracy

We report the absolute error of each approximation method within a given input range. Test script here.

Within input range of \([0, 1)\), 10000 sample points:

| | max | min | avg |
|---|---|---|---|
| 3rd-order polynomial | \(2.423 \times 10^{-3}\) | \(1.192 \times 10^{-7}\) | \(6.736 \times 10^{-4}\) |
| Mineiro's method | \(5.829 \times 10^{-5}\) | \(0\) | \(2.267 \times 10^{-5}\) |
| Cephes' method | \(2.384 \times 10^{-7}\) | \(0\) | \(2.501 \times 10^{-8}\) |

Within input range of \([-0.5, 0.5]\), 10000 sample points:

| | max | min | avg |
|---|---|---|---|
| 3rd-order polynomial | \(8.423 \times 10^{-4}\) | \(5.960 \times 10^{-8}\) | \(4.764 \times 10^{-4}\) |
| Mineiro's method | \(4.995 \times 10^{-5}\) | \(0\) | \(1.623 \times 10^{-5}\) |
| Cephes' method | \(1.192 \times 10^{-7}\) | \(0\) | \(1.798 \times 10^{-8}\) |

We also measure the total time taken to run on 10000 sample points, averaged across 5 runs:

| | time (secs) |
|---|---|
| 3rd-order polynomial | \(4.747 \times 10^{-5}\) |
| Mineiro's method | \(8.229 \times 10^{-5}\) |
| Cephes' method | \(4.854 \times 10^{-4}\) |

We can see that Cephes provides the best accuracy, while the 3rd-order polynomial approximation provides the best speed. Mineiro's method keeps the absolute error within the order of magnitude of \(10^{-5}\), while using only ~20% of the time needed by Cephes.

### Code example in SIMD

SIMD is commonly used to provide further computation speedup on CPU. The aim of this post is also to find an efficient SIMD implementation of `exp2`, which is still lacking in common SIMD operation sets. Below we will look at an example of an `exp2` approximation implemented using SSE3. We use the 3rd-order polynomial approximation:

```cpp
#include <emmintrin.h>

__m128 fast_exp_sse (__m128 x) {
    // split x into integer and decimal parts; _mm_cvtps_epi32
    // rounds to nearest by default, so the decimal part falls
    // in [-0.5, 0.5]
    __m128i xi = _mm_cvtps_epi32 (x);
    __m128  xf = _mm_sub_ps (x, _mm_cvtepi32_ps (xi));

    // 3rd-order polynomial on the decimal part (Horner's scheme)
    const __m128 c3 = _mm_set1_ps (0.05700169f);
    const __m128 c2 = _mm_set1_ps (0.24858144f);
    const __m128 c1 = _mm_set1_ps (0.69282515f);
    const __m128 c0 = _mm_set1_ps (0.9991608f);
    __m128 p = _mm_add_ps (_mm_mul_ps (c3, xf), c2);
    p = _mm_add_ps (_mm_mul_ps (p, xf), c1);
    p = _mm_add_ps (_mm_mul_ps (p, xf), c0);

    // 2^(integer part): add the bias 0x7f, shift into the
    // exponent bits, and reinterpret the result as a float
    __m128i e = _mm_slli_epi32 (_mm_add_epi32 (xi, _mm_set1_epi32 (0x7f)), 23);
    __m128  s = _mm_castsi128_ps (e);

    return _mm_mul_ps (p, s);
}
```

Some notes to discuss:

- For the integer rounding part, `_mm_cvtps_epi32` is used, which is a float-to-int conversion. To control the rounding mode explicitly, we can use `_mm_round_ps`, but it is only supported in SSE4.1 and above.
- There is a difference between **type conversion** (`_mm_cvtepi32_ps`) and **reinterpret casting** (`_mm_castsi128_ps`). Type conversion converts a fixed-point integer representation to a floating-point representation and retains its value. Reinterpret casting takes the byte pattern of the fixed-point input and reinterprets it based on the floating-point representation.
- A Padé approximant can be used by replacing the polynomial evaluation, though it also requires the division operator `_mm_div_ps`.
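The conversion-versus-cast distinction is easy to see in scalar code (a small illustration; the helper names are ours):

```cpp
#include <cstdint>
#include <cstring>

// value-preserving conversion: int 8 becomes float 8.0
inline float convert_int(int i) { return (float)i; }

// bit-preserving reinterpretation: the byte pattern of int 8
// decodes as a tiny denormal float, not 8.0
inline float reinterpret_int(int i) {
    float f;
    std::memcpy(&f, &i, sizeof f);
    return f;
}
```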

### References

- Creating a Compiler Optimized Inlineable Implementation of Intel Svml Simd Intrinsics
- Added vectorized implementation of the exponential function for ARM/NEON
- Fastest Implementation of the Natural Exponential Function Using SSE
- Fast Approximate Logarithm, Exponential, Power, and Inverse Root