The Geometry and Statistics of Noodles
January 07, 2013
An Experiment
Get out a sheet of paper and draw parallel lines on it spaced two inches apart. Take a one-inch long needle and repeatedly toss it onto the sheet of paper, counting the total number of needle tosses and the number of times the needle touches one of the lines. What do you expect the ratio of the number of line intersections to the number of tosses to be? In other words, what is the probability that a randomly tossed needle intersects a line? I don’t think the answer is obvious, but it turns out to be $\frac{1}{\pi}$, the reciprocal of the ubiquitous number $\pi \sim 3.14159…$ You can in principle use this to experimentally calculate $\pi$, though you unfortunately need to toss the needle an impractical number of times in order to get a reasonable estimate.
This experiment is known as Buffon’s Needle Experiment; in this post I’m going to explore more general and seemingly more difficult phenomenon called Buffon’s Noodle Experiment. The setup is the same as before, only instead of tossing a needle (a line segment), we’ll toss a rigid “noodle” in the shape of any desired plane curve. We’ll find that the statistics of noodle crossings is determined just by the length, and not the specific shape, of the noodle in question. Thus the noodle experiment makes a profound connection between geometry and probability theory, a connection which helps solve difficult problems in both areas. This is encapsulated by a beautiful tool called Crofton’s Formula, the first result in an area of mathematics called “Integral Geometry” (or alternatively “Geometric Probability,” depending on whom you ask).
Part of the beauty of the integral geometry approach to Buffon’s needle experiment is that it involves almost no calculations. There are other approaches that involve writing out probability density functions and calculating double integrals, but the argument that I will give below involves only basic (but surprisingly subtle) ideas in probability theory and calculus. It’s a great example of a tricky problem that can be solved through careful abstract thought.
The Explanation
To come to grips with the needle and noodle experiments, we will try to answer the following question: what is the expected number of times that a randomly thrown noodle will cross lines on lined paper with line spacing $d$? We count with multiplicities: if the noodle intersects the same line twice, that counts as two line intersections.
Let $X$ be the random variable which represents the number of line intersections for a given noodle. Recall that the expected value of $X$ is given by:
\[E(X) = \sum_{n=0}^\infty n P(X = n)\]where $P(X = n)$ represents the probability that the number of line intersections is exactly $n$. Here are a few basic observations about these expectations in the case where the noodle is actually a needle (i.e. a line segment).
-
The expectation for a needle of length smaller than $d$ (the line spacing) is just $P(X = 1)$ because such a needle can cross at most one line.
-
The expectation for any needle depends only on the length of the needle. Thus if $X_\ell$ denotes the random variable which represents the number of line intersections for a needle of length $\ell$, we can define a function $f$ by $f(\ell) = E(X_\ell)$.
Now, consider a noodle made up of exactly two needles of lengths $\ell_1$ and $\ell_2$ joined rigidly end-to-end. Denote the random variables representing the number of crossings for the two needles by $X_1$ and $X_2$, respectively; then the random variable representing the number of line intersections for the noodle is just $X_1 + X_2$. Note that $X_1$ and $X_2$ are not independent since the needles are joined, but it is nevertheless true that
\[E(X_1 + X_2) = E(X_1) + E(X_2) = f(\ell_1) + f(\ell_2)\]A priori the expectation $E(X_1 + X_2)$ depends on the lengths $\ell_1$ and $\ell_2$ of the two needles as well as the angle at which they are joined, but the calculation above shows that the expectation is actually independent of the angle. Therefore we can calculate the expectation just by considering the case where angle is such that the two needles form a line segment, i.e. a needle of length $\ell_1 + \ell_2$. From this we conclude that
\[f(\ell_1 + \ell_2) = f(\ell_1) + f(\ell_2)\]We can iterate this argument for any noodle made by chaining together $n$ needles of lengths $\ell_1, \ldots, \ell_n$ to conclude that:
\[f(\sum_{i=1}^n \ell_i) = \sum_{i=1}^n f(\ell_i)\]In particular, if the needles all have the same length, we conclude that $f(n \ell) = n f(\ell)$ for any positive integer $n$ and any positive real number $\ell$. By dividing a needle into $n$ equal pieces, this also shows that $f(\frac{1}{n} \ell) = \frac{1}{n}f(\ell)$. Combining these two facts, we conclude that $f(a \ell) = a f(\ell))$ for any rational number $a$. The expected number of crossings for a longer line segment is at least as large as the expected number of crossings for a shorter one, so we also know that the function $f$ is non-decreasing. Also, we have that $f(0) = 0$ (i.e. the line segment of length zero doesn’t intersect any lines). Basic calculus tells us that the only non-decreasing function $f$ which satisfies $f(0) = 0$, $f(\ell_1 + \ell_2) = f(\ell_1) + f(\ell_2)$, and $f(a \ell) = a f(\ell)$ for every rational number $a$ is the function
\[f(\ell) = C \ell\]where $C$ is some constant. This is already a pretty strong statement about the expected number of crossings for a needle, but we can do even better. Take any noodle of total length $\ell$ made up of $n$ line segments of lengths $\ell_1, \ldots, \ell_n$ and let $X$ be the random variable representing the number of crossings for that noodle; as above, we have:
\[E(X) = \sum_{i=1}^n f(\ell_i) = \sum_{i=1}^n C \ell_i = C \sum_{i=1}^n \ell_i = C \ell\]Thus the expected number of crossings for piecewise linear noodles is still just $C$ times the length of the noodle.
Now take any noodle which is in the shape of a piecewise smooth curve $\gamma$. Borrowing another fact from basic calculus, $\gamma$ is the uniform limit of a sequence of piecewise linear curves $\gamma_n$ whose lengths converge to the length of $\gamma$ (note that this second condition is not automatically implied by the first!) Since the expectations for each of the piecewise linear noodles is given by the formula $C \ell$, this formula holds for any piecewise smooth noodle. Thus we have proved:
Let $X$ denote the random variable corresponding the number of line crossings for a rigid piecewise smooth noodle of length $\ell$ tossed at lined sheet of paper. Then $E(X) = C \ell$ for some universal constant $C$.
Already we have proven something that wasn’t really obvious at the outset: the expected number of intersections depends only on the length, and not on the shape, of the noodle! It remains only to calculate the constant $C$; since this constant is the same for any noodle (of any shape and any length), it suffices to work out just one example. There is one particular noodle for which this calculation is especially easy: the circle whose diameter is $d$ (the spacing of the lines). This is because no matter how you drop such a circle on a sheet of lined paper whose lines are spaced $d$ apart, the circle must intersect exactly two lines and therefore the expected number of intersections is simply $2$. The length of the circle of diameter $d$ is $\pi d$, so we have $2 = \pi d C$ and hence $C = \frac{2}{\pi d}$. Thus the expectation formula is $E(X) = \frac{2 \ell}{\pi d}$.
In the example at the beginning of this post, we considered a needle of length $1$ (so that $\ell = 1$) tossed at a sheet of lined paper with line spacing $d = 2$. According to our formula, this means that the expected number of line crossings is simply $\frac{1}{\pi}$. But as we observed above, the expected number of crossings for a needle which is shorter than the line spacing of the paper is simply the probability that at least one crossing will occur. Therefore this probability is $\frac{1}{\pi}$. According to the law of large numbers, this means that the ratio of the number of needle tosses to the number of crossings approaches $\pi$ as the number of tosses tends to infinity.
Crofton’s Formula
Having accounted for Buffon’s needle and noodle experiments, let us reflect on what we have done. We set out to answer a question about the statistics of tossing noodles at paper, and we found that the answer to the question is an explicit formula involving only the length of the noodle and some constants. Flipping this formula around, notice that this gives us a surprising way to calculate the length of the noodle:
$\ell = \frac{\pi d E(X)}{2}$
In other words, length is a quantity which can be measured and manipulated using statistical techniques. You might not be convinced right away that this is a useful way to think about length, but in fact there are a variety of difficult theorems in geometry which have remarkably easy proofs when they are translated into this language.
Actually, it’s useful to think about all of this in a slightly different way. Instead of throwing the noodle at the paper, we’ll imagine throwing the paper at the noodle. In other words, we’ll ask a slightly different question: what is the average number of times that a random line in the plane intersects a given plane curve? This question is conceptually a little more problematic than the old question because it is not completely clear what the phrase “random line” should mean; Bertrand’s Paradox is a good illustration of the subtleties involved.
Here is the right meaning of the phrase for our purposes. The set of all oriented lines in the plane can be parametrized by two coordinates: the (signed) distance $r$ from a line to the origin (a number from $-\infty$ to $\infty$) and the direction $\theta$ (an angle) in which it points (a number from $0$ to $2\pi$). With this parametrization, we can interpret a “random line” to simply be a random point $(r,\theta)$ in the strip $(-\infty,\infty) \times [0,2\pi]$. (To placate the highly mathematically literate members of my readership, I’ll remark that the space of lines in the plane is topologically a homogeneous space which has a unique translation invariant measure, and this space differs from the strip with Lebesgue measure by a set of measure zero.)
Now, associated to any piecewise smooth curve $\gamma$ in the plane is a function $n_{\gamma}(r,\theta)$ which represents the number of times that the line determined by the values $(r,\theta)$ intersects $\gamma$. Crofton’s formula relates the average value of this function to the length of $\gamma$:
Crofton’s Formula:
\[\text{Length}(\gamma) = \frac{1}{4} \iint n_{\gamma}(r,\theta)\, dr\, d\theta\]This formula can be proved using more or less the same procedure that we used to calculate the expected number of crossings for a noodle thrown at a sheet of lined paper: argue that the integral on the right-hand side is additive for line segments attached end-to-end, use an approximation argument to show that it agrees with length up to a multiplicative constant, and fix the constant by calculating a single explicit example (the circle is once again a good choice).
As I alluded in the beginning of this post, Crofton’s formula is the tip of a very deep and very beautiful iceberg - I may explore it further in future posts.