Red envelopes

Long time no see! I just survived Eurocrypt 2020 submission deadline. Hooray and here’re some maths!

Bande-annonce : I will be posting a series of entries about my recent research on attribute-based encryption (ABE) in Chinese (English speakers should just read the paper, right?). Keep tuned.

Red envelopes are a form of (monetary) gifts (learn more by tapping the link below the hero image), and modern forms of red envelopes include electronic ones, most commonly^{[citation needed]} seen on WeChat.

On WeChat, one can put a certain amount of money (no more than 200 CNY per envelope) and distribute it

to a specific person;
to a specific number of persons evenly; or
to a specific number of persons randomly.

We are interested in how the amount of money each person receives is sampled in the last case. It would be totally rational to assume WeChat uses crafted distributions so that there are lower/upper bounds on the amount of money each person who opens the envelope receives, but we (shamelessly self-identified) mathematicians can confine ourselves in the simpler case — the uniform distribution without bounds except the natural ones.

Formally speaking, let the amount of money be $N$ pennies and the number of people be $m,$ where $N, m \in N, N \geq m > 0 .$ Consider the set $Ω_{m, N} = {(x_{1}, \dots, x_{m}) ∣ ∣ ∣ x_{1} + \dots + x_{m} = N, 1 \leq x_{1}, \dots, x_{m} \in N},$ we wish to draw a sample from the uniform distribution over $Ω_{m, N} .$

Incorrect Characterizations

Before talking about sampling from the distribution, it is actually interesting to see how one can characterize the distribution. There are 3 obvious incorrect attempts.

The first attempt is to say:

The marginal distributions for each person are identical.
Conditioned on any person, say receiving $x_{i}$ pennies, the rest are distributed identically as if there were $(N - x_{i})$ pennies and $(m - 1)$ people.

This attempt fails as soon as there are 2 people and 3 pennies: The requirement only mandates that the distribution of the first person’s amount be symmetric around the center, but not necessarily uniform.

Now let’s consider a second attempt:

For each ordered subset of ${1, \dots, m}$ , the marginal distributions of these people only depend on the size of the ordered subset.
Conditioned on any person…

Or a third one:

For each ordered subset of…
Conditioned on any ordered subset of…

Of course, they both fail as soon as there are 2 people and 3 pennies.

Sampling

Why did I stop talking about characterizations? Well, I just became tired of it. Let’s do the actual sampling. It’s pretty simple — consider $N$ pennies arrange in a row and separated by $(N - 1)$ gaps, we can use reservoir sampling to put $(m - 1)$ separators in those gaps. This takes time $Ω (N lo g N),$ exponential in the input length.

Of course, WeChat doesn’t have to face this exponential problem even if it chooses to sample from this distribution — there is a small upper bound on money (20000 pennies). There is also a small upper bound on distributees (100 people, I just checked the FAQ).

Correct Characterization

The following characterization is correct:

If there are two people, the amount for the first person is uniformly distributed.
The marginal distributions for each person are identical.
Conditioned on any person…

This can be proven by induction. If there is 1 or 2 people, the proposition holds trivially. If there are $m > 2$ people, let the probability masses of $x_{m}$ be $p_{1}, \dots, p_{N + 1 - m} .$ Combining the second and the third conditions with the induction hypothesis, we have that for all $j,$ $p_{j} = Pr [x_{m} = j] = Pr [x_{1} = j] = k = 1 \sum N + 2 - m - j Pr [x_{m} = k] Pr [x_{1} = j ∣ x_{m} = k] = k = 1 \sum N + 2 - m - j p_{k} \frac{( m - 3 N - k - j - 1 )}{( m - 2 N - k - 1 )} .$ Enter the theory of Markov chains — there is only one natural Markov chain to consider and I’m not going to explicitly define it (since it’ll be a mouthful). It’s easy to check that this Markov chain is regular: Any state can transit to another in 2 steps (via $p_{1}).$ Therefore, it has exactly one steady state, which is the unique solution to the above linear system. Note that the correct distribution (uniform over $Ω_{m, N})$ ‘also’ satisfies the system, so the unique solution must be the correct distribution. This completes the inductive step.

N.B. The above inductive step does not go through for $m = 2,$ because each state in that Markov chain absorbs itself and any distribution is a steady state.