Bleichenbacher's Ghost

Hey man, computer security is hard.

Notes on Padmé Padding

Posted on — Oct 18, 2020;   Reading Time — 2 minutes

I made some quick notes on the Padmé paper this morning and I’m dumping them here for posterity.

First, here is a link to the paper. These notes are not complete; please read the paper for the details.

Nikitin, Kirill, Ludovic Barman, Wouter Lueks, Matthew Underwood, Jean-Pierre Hubaux, and Bryan Ford. “Reducing Metadata Leakage from Encrypted Files and Communication with PURBs”, Proceedings on Privacy Enhancing Technologies 2019, 4 (2019): 6-33, doi: https://doi.org/10.2478/popets-2019-0056, arXiv:1806.03160 [cs.CR]

Fix a message space P of binary strings of length at most M.

Define a surjective padding function

f: {0, 1, ..., M} ----> R
              |p| |-->  |c|

which takes a message length |p| and produces a padded length |c| in the set R of possible padded lengths.

Define the leakage to be ceil(log2(|R|)). Notice that if |R| = 1 then the leakage is 0: if we pad everything to one fixed size, no bits are leaked. If |R| = 2 then the leakage is 1: there exist two plaintexts that can be distinguished by their padded lengths.

From this definition, it is clear that if we want to do something more efficient than padding everything to one size, we need to leak some information.

1. Padding to a multiple of constant

If we padded every message to a multiple of some constant b, that is

R = {0, b, 2*b, 3*b,...}; |R| = ceil(M/b)
f(n) = b * ceil(n/b)

This leaks

ceil(log2(ceil(M/b))) = O(log M)

bits.
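As a concrete sketch of this scheme in Python (the function name and example parameters are mine, not the paper's):

```python
import math

def pad_to_multiple(n: int, b: int) -> int:
    """Pad a message length n up to the next multiple of b: f(n) = b * ceil(n/b)."""
    return b * math.ceil(n / b)

# Example: with M = 1024 and b = 16 there are ceil(M/b) = 64 possible
# padded lengths, so the scheme leaks ceil(log2(64)) = 6 bits.
M, b = 1024, 16
leakage = math.ceil(math.log2(math.ceil(M / b)))  # 6
```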

2. Padding to the nearest power of 2

If we padded every message to a power of 2, that is

R = {1, 2, 2^2, 2^3,...}; |R| = ceil(log2(M))
f(n) = 2^(ceil(log2(n)))

This leaks

ceil(log2(ceil(log2(M)))) = O(log log M)

bits.

This is asymptotically great, but in the worst case it adds an overhead of ~100%: if n = 2^x + 1 for some x, then f(n) = 2^(x+1), nearly double the original length.
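A quick sketch of this rule (the function name is mine; bit tricks avoid floating-point log2):

```python
def pad_to_power_of_two(n: int) -> int:
    """Pad a message length n up to the next power of 2: f(n) = 2^ceil(log2(n))."""
    # For n >= 2, (n - 1).bit_length() equals ceil(log2(n)), computed in integers.
    return 1 if n <= 1 else 1 << (n - 1).bit_length()

# Worst case: n = 2^x + 1 pads to 2^(x+1), e.g. 1025 pads to 2048.
```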

3. Padmé

Notice that one can map the allowed lengths to a binary representation with log(log(M)) + 1 bits

|    log(log(M)) + 1 bits    |

where the binary string encoding n denotes a padded length of 2^n.

Padmé defines the allowed lengths to have a representation that is twice as long:

|    log(log(M)) + 1 bits    |    log(log(M)) + 1 bits    |

where the first log(log(M)) + 1 bits represent the exponent and the last log(log(M)) + 1 bits represent the mantissa.

However, Padmé also adds the constraint that the mantissa must not be longer than the exponent (this keeps the real-world overhead low, below 12%). For example, 9 has a 3-bit mantissa and a 2-bit exponent, so it is rounded up to 10, which has a 2-bit mantissa and a 2-bit exponent.

[Figure: Table 1 from the arXiv version of the paper, reproduced under CC BY-SA, illustrating the idea described in the paragraph 'Algorithm' of the paper.]

These ideas allow Padmé to achieve the <12% overhead while keeping the leakage at O(log log M). See the “Leakage and Overhead” section of the paper for details.
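Putting this together, the Padmé rounding can be sketched in Python following the pseudocode in the paper's 'Algorithm' paragraph (variable names are mine; assumes L >= 1):

```python
def padme(L: int) -> int:
    """Round a message length L up to the nearest Padmé-allowed length."""
    E = L.bit_length() - 1      # E = floor(log2(L)), the exponent
    S = E.bit_length()          # S = floor(log2(E)) + 1, bits needed to write E
    last_bits = E - S           # low-order mantissa bits that must be zeroed
    bit_mask = (1 << last_bits) - 1
    return (L + bit_mask) & ~bit_mask

# padme(9) == 10, matching the example above; padme(8) == 8 (already allowed).
```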