Hard derangement interview problem

✌️

Natasha recently shared to me an interesting problem, sometimes asked in statistical interviews, which is surprisingly close to my own research field of graph theory. This particular problem appeared to be asked by Snapchat.

Consider nnn﻿ boys and nnn﻿ girls, each boy secretly loving a girl and vice versa. What is the probability that no couple could be formed?

An alternative mathematical formulation would be

Given two finite sets AAA﻿ and BBB﻿, each containing nnn﻿ elements, how many pairs of functions f ⁣:A→Bf \colon A \to Bf:A→B﻿ and g ⁣:B→Ag \colon B \to Ag:B→A﻿ are there, such that their composition h=g∘fh = g \circ fh=g∘f﻿ has no fixed points?

The problem appears to be on OEIS and has two math.stackexchange entries with several very nice attempts. Apparently, my approach with generating functions seems to be new.

An example of a pair of functions with two mutual friendships 1-1 and 4-5. Each mutual friendship corresponds to a fixed point of the composition.

Previous works

Here is the link to the OEIS sequence of the first several terms

A284458 - OEIS

The OEIS is supported by the many generous donors to the OEIS Foundation.

https://oeis.org/A284458

This OEIS entry also contains two links to Math StackExchange discussions

An unexpected application of non-trivial combinatorics

PROBLEM STATEMENT Given two finite sets $A$ and $B$, each containing $s \in \mathbb N$ elements, how many pairs of functions $f \colon A \rightarrow B$ and $g \colon B \rightarrow A$ are there, ...

https://math.stackexchange.com/questions/713370/an-unexpected-application-of-non-trivial-combinatorics

Couple Probability

The problem states that there are 12 boys and 12 girls. Each boy chooses a girl at random and each girl chooses a boy at random. If a boy and a girl choose each other, they form a couple. It then a...

https://math.stackexchange.com/questions/982134/couple-probability

I personally love such kinds of problems because nnn^nnn﻿ denoting the total number of endofunctions (i.e. functions from a set to itself) is also the number of rooted trees with one marked vertices. There are nice combinatorial bijections which can be used for asymptotic analysis.

Derangements

Let us start with an easier problem.

Given a finite set AAA﻿ of size nnn﻿, how many functions f ⁣:A→Af \colon A \to Af:A→A﻿ are there without fixed points?

Clearly, for each of the nnn﻿ points there are (n−1)(n-1)(n−1)﻿ allowed possibilities of where the function fff﻿ would be pointing to. The number of functions is therefore (n−1)n(n-1)^n(n−1)n﻿. The probability that a randomly chosen function does not have fixed points is asymptotically

\lim_{n \to \infty} \dfrac{(n-1)^n}{n^n} = \lim_{n \to \infty} \left( 1 - \dfrac{1}{n} \right)^n = e^{-1} \, .

In this case, all the fixed points are formed independently, which is not the case in our problem. 

Approach 1: exact formula and asymptotic analysis

Furthermore, if we look at the exact formula from OEIS, we observe

\sum_{k=0}^n (-1)^k \binom{n}{k}^2 k! n^{2(n-k)}

which I guess could be obtained by some kind of inclusion-exclusion. However, asymptotic analysis of such formulae may be sometimes difficult, while can still be useful for numerical simulations. In this case, however, asymptotic analysis is possible because the summands do not give increasingly divergent contributions.

Let us take a closer look at each of the summands as n→∞n \to \inftyn→∞﻿.

S_{k,n} := (-1)^k \binom{n}{k}^2 k! n^{2(n-k)} = (-1)^k\binom{n}k\frac{n!}{n^{2k}(n-k)!}

Note that (nk)\binom{n}{k}(kn​)﻿ is asymptotically nk/k!n^k / k!nk/k!﻿, i.e. a polynomial of degree kkk﻿. The ratio n!/(n−k)!n! / (n-k)!n!/(n−k)!﻿ is again a polynomial of degree kkk﻿, so when divided by n2kn^{2k}n2k﻿ the remaining part asymptotically does not depend on nnn﻿. The first several terms of this sum are approximately

1 - \dfrac{1}{2!} + \dfrac{1}{3!} - \ldots \approx e^{-1} \, .

With some extra work to make this proof rigorous we arrive to the answer e−1e^{-1}e−1﻿.

Source: https://math.stackexchange.com/a/983293/208535

Approach 2: conditional probabilities

Suppose that the items from the set BBB﻿ receive, accordingly, k1,k2,…,knk_1, k_2, \ldots, k_nk1​,k2​,…,kn​﻿ incoming edges. 

In this example, particular values of kik_iki​﻿ are given.

We can notice that since k1+…+kn=nk_1 + \ldots + k_n = nk1​+…+kn​=n﻿, we have Eki=1\mathbb E k_i = 1Eki​=1﻿ for each iii﻿. We could also compute Eki2\mathbb E k_i^2Eki2​﻿ and make sure it is finite (asymptotically does not depend on nnn﻿). In fact each kik_iki​﻿ is distributed according to a Binomial distribution (although not independently) Binom(n,1n)\mathrm{Binom}(n,\frac{1}{n})Binom(n,n1​)﻿. Its variance is Ek12−(Ek1)2=1−1n\mathbb E k_1^2 - (\mathbb E k_1)^2 = 1 - \frac1nEk12​−(Ek1​)2=1−n1​﻿, and therefore, Ek12∼2\mathbb E k_1^2 \sim 2Ek12​∼2﻿ as n→∞n \to \inftyn→∞﻿.

First of all, suppose that all kik_iki​﻿ are equal to 1. Then, the probability that there are no mutual friendships is the same as in the derangements problem

\prod_{i=1}^n \left( 1 - \dfrac{1}{n} \right) \sim e^{-1}

Now, in the case of general kik_iki​﻿, we replace the 1’s in the product by kik_iki​﻿’s:

\prod_{i=1}^n \left( 1 - \dfrac{k_i}{n} \right) = e^{\sum_{i=1}^n\log(1 - k_i/n)}

By developing the Taylor series of the log, we obtain

e^{-1 + O(\sum k_i)/n^2} = e^{-1}(1 + O(n^{-1}))

This requires a better justification but heuristics says that on average, the answer does not depend on kik_iki​﻿ unless some of the kik_iki​﻿ is too large, which does not happen too often. By decomposing the probability up to conditional probabilities, we obtain

\mathbb P(\text{no mutual friendships}) = \sum_{k_1,\ldots,k_n} \mathbb P(\text{no mutual friendships}|k_1, \ldots, k_n) \mathbb P(k_1, \ldots, k_n) \sim \sum_{k_1,\ldots,k_n} e^{-1} \mathbb P(k_1, \ldots, k_n) = e^{-1} \, .

Approach 3: generating functions

Let us solve the derangement problem in a hard way first. From each node there is an arrow to another node. We look at the ensemble of arrows and nodes as a whole and try to find some structure in it.

It turns out that some nodes assemble into cycles, and others attach to nodes assembled as cycles. Everything outside the cycles forms trees ascending to one of the nodes of a cycle.

With some help of the exponential generating functions, we can express the combinatorial property that the endofunctions are sets of these cycle-trees (unicycles).

\sum_{n=0}^\infty \dfrac{n^n}{n!} z^n = e^{U(z)}

where U(z)U(z)U(z)﻿ is yet-to-be-determined exponential GF of unicycles. Since the exponential GF of cycles is just simply

\mathsf{CYC}(z) = \sum_{n=0}^\infty \dfrac{(n-1)!}{n!} z^n = \sum_{n=0}^\infty \dfrac{1}{n} z^n = \log \dfrac{1}{1 - z}

the GF of unicycles could be obtained by a functional composition with the cycle function

U(z) = \mathsf{CYC}(T(z)) = \log \dfrac{1}{1 - T(z)} \, ,

where the generating function of rooted trees T(z)T(z)T(z)﻿ is defined by a recursive relation “a rooted tree is composed of a root and a set of rooted trees”:

T(z) = z e^{T(z)} \, .

This gives a surprising identity

\sum_{n=0}^\infty \dfrac{n^n}{n!} z^n = \dfrac{1}{1 - T(z)} = e^{U(z)}

✌️

I admit that being comfortable with this approach requires some experience. This is the reason I am trying to provide some more context and give more examples, which also takes more space. But the actual heuristic solution is very short, once all the intuitions are gained.

Remark. Joyal’s proof for the number of labelled trees nn−2n^{n-2}nn−2﻿.

The above approach allows to easily obtain the celebrated formula for the number of unrooted labelled trees Tn=nn−2T_n = n^{n-2}Tn​=nn−2﻿. From the functional definition of the function T(z)T(z)T(z)﻿, we get, by taking a pointed derivative

z \dfrac{d}{dz} T(z) = z e^{T(z)} + z^2 \dfrac{d}{dz}T(z) e^{T(z)}

which is a linear equation that can be solved:

z \dfrac{d}{dz} T(z) = \dfrac{T(z)}{1 - T(z)} = \dfrac{1}{1 - T(z)} - 1

Therefore,

\sum_{n=1}^\infty \dfrac{n^n}{n!} z^n = z \dfrac{d}{dz} T(z)

and it easily follows that the number of rooted trees is nn−1n^{n-1}nn−1﻿, so therefore, the number of non-rooted trees is nn−1n^{n-1}nn−1﻿ divided by nnn﻿, which is nn−2n^{n-2}nn−2﻿.

Isolated cycles have Poisson distribution

The form of the generating functions allows one to obtain the Poisson distribution for the number of isolated cycles with almost no extra effort. Since the GF for all cycles is

\mathsf{CYC}(z) = \sum_{n=1}^\infty \dfrac{1}{n} z^n = \log \dfrac{1}{1 - z},

we can mark the cycles of length 1 by introducing an extra variable xxx﻿:

\mathsf{CYC}(z, x) = xz + \sum_{n=2}^\infty \dfrac{1}{n} z^n = \log \dfrac{1}{1 - z} - z + xz \, .

The generating function of all the endofunctions with isolated cycles marked is now

e^{\mathsf{CYC}(T(z), xT(z))} = \dfrac{1}{1 - T(z)} e^{T(z)(x-1)}

The heuristics goes as follows:

Probability GF of the Poisson(λ)\mathsf{Poisson}(\lambda)Poisson(λ)﻿ distribution is proportional to eλxe^{\lambda x}eλx﻿. To be fully rigorous, the probability GF is in fact eλx−λ=eλxeλe^{\lambda x-\lambda} = \frac{e^{\lambda x}}{e^\lambda}eλx−λ=eλeλx​﻿, but the constant is distracting.

We would therefore expect that eT(z)(x−1)e^{T(z)(x-1)}eT(z)(x−1)﻿ encodes a Poisson(T(z))\mathsf{Poisson}(T(z))Poisson(T(z))﻿ distribution, but it is not really clear what zzz﻿ should be.

The asymptotic behaviour of generating functions is dictated by their singularities. In the general cases, these singularities play with each other, making the analysis more difficult, but in this particular case, we only need to know the value of T(z)T(z)T(z)﻿ at the singularity.

Since the singular expansion of T(z)T(z) T(z)﻿ gives T(z)∼1−21−zeT(z) \sim 1 - \sqrt{2}\sqrt{1 - ze}T(z)∼1−2​1−ze​﻿, we therefore, conclude that T(e−1)=1T(e^{-1}) = 1T(e−1)=1﻿ and e−1e^{-1}e−1﻿ is the singularity closest to zero.

✌️

We often use the notation [zn]F(z)[z^n] F(z)[zn]F(z)﻿ to denote nnn﻿th coefficient of the generating function F(z)F(z)F(z)﻿.

But wait, we almost forgot about the factor 11−T(z)\dfrac{1}{1 - T(z)}1−T(z)1​﻿ in front and it is terrifying! But in order to get the probability generating function for the number of cycles, we can divide the counting GF by the total GF, just like in combinatorial probability, where we divide the number of “good” outcomes by the number of “total” outcomes.

\mathbb P(\mathsf{event}) = \dfrac{\#\mathsf{good\, outcomes}}{\#\mathsf{all\, outcomes}}

\mathsf{pgf}(x) = \dfrac{[z^n]\mathsf{endofunctions}(z, x)}{[z^n]\mathsf{endofunctions}(z)}

and we know that the multiple 11−T(z)\dfrac{1}{1 - T(z)}1−T(z)1​﻿ will introduce some perturbation, but exe^{x}ex﻿ is a tiny guy (with asymptotically discrete distribution), so the final answer will not be changed. We, therefore, obtain, by some hand-waving

\mathsf{pgf}(x) \sim e^{T(e^{-1}) (x-1)} = e^{x-1} = \mathsf{pgf}[\mathsf{Poisson}(1)](x)

Remark. Further statistics by transfer theorems

The form of the generating functions also allows one to get the asymptotic distribution of previously unaccessible patterns. For example, we could ask for the distribution of the number of unicycles in the above graph. The answer comes by marking the unicycles with an extra variable:

e^{x \cdot U(z)}

Not so fast: the singular expansion of T(z)T(z) T(z)﻿ gives T(z)∼1−21−zeT(z) \sim 1 - \sqrt{2}\sqrt{1 - ze}T(z)∼1−2​1−ze​﻿, and therefore, U(z)=log⁡11−T(z)U(z) = \log \dfrac{1}{1 - T(z)}U(z)=log1−T(z)1​﻿ turns to infinity at z=e−1z = e^{-1}z=e−1﻿, making the approach impossible. It would be still available if U(e−1)U(e^{-1})U(e−1)﻿ were a constant though!

We could still get the expected number of the cycles by using this approach.

Expected number of cycles in random function

begingroup$ Let $a_n$ denote the expected number of cycles in a random endofunction on a set of $n$ elements. Marko Riedel has already posted the derivation of the asymptotic result $a_n\sim\frac12\log n$ by P. Flajolet and A. Odlyzko in a comment.

https://math.stackexchange.com/questions/2849169/expected-number-of-cycles-in-random-function

\left.x\dfrac{d}{dx} e^{x \cdot U(z)} \right|_{x=1} = U(z) e^{U(z)} = \dfrac{1}{1 - T(z)} \log \dfrac{1}{1 - T(z)} \, .

The asymptotics of this generating function can be obtained by applying transfer theorems of Flajolet-Odlyzko. This is more or less a routine procedure and requires only the knowledge of these generating functions around their respective singularities.

I invite the curious reader to check out a paper of Flajolet and Odlyzko entitled “Random mapping statistics” where they squeeze out everything that is possible to know about random mappings by a repeated application of the transfer theorem and the above symbolic method.

hal.inria.fr

https://hal.inria.fr/inria-00075445/document

Composition of two mappings

Let us return to the original problem. It seems to be more difficult than just treating one mapping because now we have two mappings. But we will see that there is no need in applying transfer theorems at all, and a simple approach will do.

Now we have vertices of two colours, and we have to make sure that the arrows go between alternating colours.

We will use a variable z1z_1z1​﻿ for the vertices of the first colour, and z2z_2z2​﻿ for the vertices of the second.

The generating function of trees rooted at, respectively, the first or the second colour, would be

T_1(z_1, z_2) = z_1 e^{T_2(z_1, z_2)} \quad \text{and} \quad T_2(z_1, z_2) = z_2 e^{T_1(z_1, z_2)} \, .

They, still, share the same singularity, but now it is trickier because in two dimensions, the singularity is not a single point, but a manifold. Well, if we require z1=z2z_1 = z_2z1​=z2​﻿ then the singularity would still be a point z1=z2=e−1z_1 = z_2 = e^{-1}z1​=z2​=e−1﻿. In the end we need to make sure that the number of vertices of each colour is equal, which indirectly points to the symmetry and to the fact that z1=z2z_1 = z_2z1​=z2​﻿ in the heuristics.

Since in every cycle we have an even number of vertices, and black and white trees alternate, we therefore, have the generating function of cycles

U(z_1, z_2) = \log \dfrac{1}{1 - T_1(z_1, z_2) T_2(z_1, z_2)}

Excluding the cycles of length 2 would mean cropping the first term in the expansion of the logarithm:

U(z_1, z_2, x) = \log \dfrac{1}{1 - T_1 T_2} + T_1 T_2(x-1)

and the composition of two mappings now has a generating function with xxx﻿ marking all the cycles of size 222﻿, also corresponding to fixed points:

\dfrac{1}{1 - T_1 T_2} e^{(x-1) T_1 T_2}

Note that the singular value of T1T2T_1 T_2T1​T2​﻿ is again equal to 1 at the point z1=z2=e−1z_1 = z_2 = e^{-1}z1​=z2​=e−1﻿.

As in the derangement exercise, we extract the coefficient [z1nz2n][z_1^n z_2^n][z1n​z2n​]﻿ and this gives us the probability generating function of the number of fixed points:

\mathsf{pgf}(x) = \dfrac{[z_1^n z_2^n]\mathsf{compositions}(z_1, z_2, x)}{[z_1^n z_2^n]\mathsf{compositions}(z_1, z_2)} \sim e^{x-1}

Final remarks

The trick above would not work if the number of black and white points is not equal. In this case, there is no guarantee that z1=z2z_1 = z_2z1​=z2​﻿, and therefore, obtaining the singular point requires some more effort. No worries, saddle point analysis can help in this case.

Apparently, no paper analysing mapping statistics of the composition of two (or more?) mappings is published yet.

The same idea could be applied to the analysis of random graphs in stochastic block model. This is one of my research projects and I hope it can lead to some amazing discoveries. Even in the case of two parts (black points and white points?) there are vast unexplored terrains.