How to tune Elliptic Curves resource usage
Like most parts of Mbed TLS, the implementation of elliptic curve operations can be tuned using various compilation flags. This page explains the two parameters that control tradeoffs between performance and footprint in more detail than our general documentation on reducing footprint.
Performance and RAM figures
Since this page discusses performancefootprint tradeoffs, it’s useful to have some performance figures. For convenience, the figures quoted in this article were collected with Mbed TLS 2.2 on a machine with a 64bit CPU. You can reproduce those results on your machine using scripts/eccheap.sh from the Mbed TLS sources.
The RAM figures only include heap usage, not the stack. This is a limitation of the measurement script. However, these should still be useful, as most memory used by elliptic curve operations will be on the heap. Remember, however, that RAM figures may be slightly lower on a 32bit machine.
Which curves?
This page refers to “short Weierstrass” form curves. This includes NIST, Brainpool, and “Koblitz” curves such as the one used by Bitcoin—in short, all except newer curves like Curve25519 and Curve448.
On the reference machine, Curve25519 performed 358 ECDHE/s (ephemeral Elliptic Curve DiffieHellman key exchanges per second) using around 1500 bytes on the heap.
MBEDTLS_ECP_WINDOW_SIZE
The main operation on elliptic curves is multiplication of a point by an integer, which is performed using additions and doubling. This is similar to fast exponentiation algorithms you may be familiar with. It is possible to boost performance by precomputing some wellchosen multiples of the input point just before you multiply it by the chosen integer. This is the comb method.)
RAM usage increases with the number of multiples you compute and store before performing the actual multiplication. However, while precomputing a few multiples gives better performance than precomputing none, you don’t want to compute too many either. The optimal size of the precomputation table depends on the size of the curve.
By default, our multiplication function will select the table size that gives the best performance for the curve used. However, you might want to set an upper bound for the table size: this is what MBEDTLS_ECP_WINDOW_SIZE
does. More specifically the value of this macro is the log of the maximum number of points that will be precomputed.
For example, for NIST P256, the performance and RAM figures for various values of this parameter (with fixed point optimization disabled, see next section) are as follows:
Window Size 
2 
3 
4 
5 
6 

ECDHE/s 
44 
50 
53 
53 
53 
Heap bytes 
2064 
2448 
3592 
3624 
3680 
As you can see, when moving from 2 to 4, performance increases with RAM usage; larger values aren’t helpful for performance. The code selects 4 as the effective value, and doesn’t change RAM usage as much.
MBEDTLS_ECP_FIXED_POINT_OPTIM
The first paragraph of the previous section said some multiples are precomputed just before computation, suggesting that they are discarded once the operation is done, and computed again for the next operation. This is true, but incomplete.
Elliptic curves each come with a standard “base point” (also know as generator, and usually denoted by the letter G
), for when the protocol requires a point known by all parties. For example, in ECDHE, one party generates a secret exponent a
, computes aG
, send the result to its peer, receives the peer’s public share Q
then computes the shared secret aQ
.
Notice that for the first part (but not the second), the point to be multiplied is known in advance: G
. When you first multiply a G
by an integer, you precompute wellchosen multiples of G
as usual. After that step, you may want to store those multiples in order to speed up subsequent multiplications of G
by other integers. Keeping this information, however, uses RAM. There is a performancememory tradeoff.
Set
MBEDTLS_ECP_FIXED_POINT_OPTIM
to 1 in order to keep the table, and 0 to discard it.
For example, the performance and RAM figures for NIST P256 without and with fixed point optimization (with max window size set to 4, see previous section) are as follows:
0 
1 


ECDHE/s 
53 
76 
Heap bytes 
3592 
5360 
Conclusion
The ECC implementation in Mbed TLS uses some memoryperformance tradeoffs. The defaults values are chosen with performance in mind, but you can adjust them to reduce RAM usage for the tradeoff that best fits your particular needs.
It should be noted that newer curves like Curve25519, with a simpler implementation that doesn’t use any tradeoffs, manage to deliver superior performance while using less RAM compared to NIST curves.