I made some quick notes on the Padmé paper this morning and I’m dumping them here for posterity.

First, here is the reference for the paper. These notes are not complete; please read the paper for the details.

Nikitin, Kirill, Ludovic Barman, Wouter Lueks, Matthew Underwood, Jean-Pierre Hubaux, and Bryan Ford. “Reducing Metadata Leakage from Encrypted Files and Communication with PURBs”, Proceedings on Privacy Enhancing Technologies 2019, 4 (2019): 6-33, doi: https://doi.org/10.2478/popets-2019-0056, arXiv:1806.03160 [cs.CR]

Fix a message space `P` of binary strings of length at most `M`.

Define a surjective *padding* function

```
f: P ----> R
|p| |-> |c|
```

which takes a message length `|p|` and produces a padded length `|c|` in the set `R` of possible padded lengths.

Define *leakage* to be `ceil(log2(|R|))`. Notice that if `|R| = 1`, then `leakage = 0`: if we pad everything to some fixed size, no bits are leaked. If `|R| = 2`, then `leakage = 1`: there exist two plaintexts that can be distinguished by their padded lengths.
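
As a quick sanity check of this definition, here is a sketch in Python (mine, not from the paper):

```python
import math

# Leakage as defined above: ceil(log2(|R|)) bits,
# where R is the set of possible padded lengths.
def leakage(R):
    return math.ceil(math.log2(len(R)))

leakage({1024})       # one fixed size leaks nothing: 0 bits
leakage({512, 1024})  # two sizes leak one distinguishing bit
```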

From this definition, it is clear that if we want to do something more efficient than padding everything to one size, we need to leak some information.

If we pad every message up to a multiple of some constant `b`, that is
```
R = {0, b, 2*b, 3*b,...}; |R| = ceil(M/b)
f(n) = b * ceil(n/b)
```

This leaks

```
ceil(log2(ceil(M/b))) = O(log M)
```

bits.
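
A sketch of this scheme in Python (the function names are mine, not the paper's):

```python
import math

# f(n) = b * ceil(n/b): round n up to the next multiple of b.
def pad_to_multiple(n, b):
    return b * math.ceil(n / b)

# Leakage of the scheme: ceil(log2(ceil(M/b))) bits,
# where M is the maximum message length.
def multiple_leakage(M, b):
    return math.ceil(math.log2(math.ceil(M / b)))
```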

If instead we pad every message up to a power of 2, that is

```
R = {0, 2, 2^2, 2^3,...}; |R| = ceil(log2(M))
f(n) = 2^(ceil(log2(n)))
```

This leaks

```
ceil(log2(ceil(log2(M)))) = O(log log M)
```

bits.

This is asymptotically great, but in the worst case it adds an overhead of ~100% (if `n = 2^x + 1` bits for some `x`, then `f(n) = 2^(x+1)`, almost twice `n`).
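
The power-of-2 scheme and its worst case, again as a sketch:

```python
import math

# f(n) = 2^ceil(log2(n)): round n up to the next power of two.
def pad_pow2(n):
    return 2 ** math.ceil(math.log2(n))

# Worst case n = 2^x + 1: the padded size is almost twice the message.
pad_pow2(2**10 + 1)  # 2048, an overhead of ~100%
```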

Notice that one can map the allowed lengths to a binary representation with `log(log(M)) + 1` bits

```
| log(log(M)) + 1 bits |
```

where the binary representation of `n` means a padding of `2^n`.

Padmé defines the allowed lengths to have a representation that is twice as long

```
| log(log(M)) + 1 bits | log(log(M)) + 1 bits |
```

where the first `log(log(M)) + 1` bits represent the exponent and the last `log(log(M)) + 1` bits represent the mantissa.

However, Padmé also adds the constraint that the mantissa should not be longer than the exponent (this is what keeps the real-world overhead low, at `<12%`). So 9, which has a 3-bit mantissa and a 2-bit exponent, is rounded up to 10, which has a 2-bit mantissa and a 2-bit exponent.
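
Putting it together, here is a Python sketch of the Padmé rounding as I read it from the paper's pseudocode:

```python
import math

# Padmé (for L >= 2): round L up so that only the top S mantissa bits
# may be set, where S is the number of bits needed to write the exponent E.
def padme(L):
    E = math.floor(math.log2(L))      # exponent of the highest set bit
    S = math.floor(math.log2(E)) + 1  # mantissa bits allowed (= bits of E)
    z = E - S                         # low bits that must be zero
    m = (1 << z) - 1                  # bitmask of those low bits
    return (L + m) & ~m               # round up, then clear the low bits

padme(9)  # 10, matching the example above
```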

These ideas allow Padmé to achieve the `<12%` overhead while keeping the leakage at `O(log log M)`. See the “Leakage and Overhead” section of the paper for details.