Now, to make the problem more difficult: the poison takes effect only after 10 hours, and you need to find the poisoned bottle within a day. (If the poison worked instantly, you could easily find the poisoned bottle by making just one prisoner test them all one by one. If you had unlimited time, you could cut the number of suspect bottles in half every 10 hours by making a single prisoner test half of them, and 10 such rounds would be enough to find the right one. But you do not have time for this.)

You may stop here and try to find the solution on your own if you want to, but the puzzle is considered hard, and it will most likely take time. My intent is to simplify the solution a bit and make it more transparent.

Here’s the standard solution: number all 1000 bottles with binary numbers from 0 to 1111100111 (that is, 999) and make prisoner number i try bottle number k iff the ith digit of k is 1. After 10 hours some prisoners will die (or perhaps no one will, if the poisoned bottle was the zeroth one) and then you will be able to reconstruct the binary number of the poisoned bottle.

Let’s see how it works when there are 3 prisoners (A, B and C) and 8 bottles:

bottle number 0: 0 0 0

bottle number 1: 0 0 1

bottle number 2: 0 1 0

bottle number 3: 0 1 1

bottle number 4: 1 0 0

bottle number 5: 1 0 1

bottle number 6: 1 1 0

bottle number 7: 1 1 1

______________________

prisoners:……….A B C

A will try bottles 4, 5, 6 and 7, B will try 2, 3, 6 and 7, and finally C will try 1, 3, 5 and 7. If, say, A and C die, you will immediately know that bottle number 5 was the poisoned one.
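The assignment and the decoding step can be sketched in a few lines of Python. This is only an illustration: the function names `tasters` and `poisoned_bottle` are made up for this sketch, and prisoners are indexed from the least significant binary digit, so in the three-prisoner example prisoner 2 plays the role of A, prisoner 1 of B and prisoner 0 of C.

```python
def tasters(bottle, prisoners=10):
    """Indices of the prisoners who taste this bottle: the positions of its 1-bits."""
    return {i for i in range(prisoners) if (bottle >> i) & 1}


def poisoned_bottle(dead):
    """Rebuild the bottle number from the set of prisoners who died."""
    return sum(1 << i for i in dead)


if __name__ == "__main__":
    # 3 prisoners, 8 bottles, as in the table above
    for bottle in range(8):
        print(bottle, sorted(tasters(bottle, prisoners=3)))
    # A (index 2) and C (index 0) died
    print(poisoned_bottle({0, 2}))  # prints 5
```

Since every bottle number below 1024 has its own pattern of 1-bits, no two bottles are tasted by the same group of prisoners, which is what makes the decoding unambiguous.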

Here’s a somewhat clearer formulation of the same solution: don’t just use prisoners to find the bottle – use the subsets of prisoners. The set of 10 prisoners has 1024 subsets, more than enough. Assign each bottle to its own subset, and make every prisoner in that subset try it. (Naturally, since each prisoner belongs to more than one subset, she will have to try more than one bottle.) After 10 hours the prisoners who have died will form exactly one of these subsets. Then you will know that the bottle you assigned to it was the one.

Of course, if you decide to implement this scheme you will inevitably have to use something like the binary number trick described above, but I think my explanation is better, deeper in some sense, and helps to understand not only *how*, but *why*. Or maybe not.

]]>

]]>

You already knew all of this, or nearly all, but have you ever wondered what does the plot of look like? Well, wonder no more:

What do we see here? First, the surface, or whatever you want to call it, is a fractal. It’s asymmetric, because  is asymmetric with respect to its arguments. Each of its level sets consists of a single point, and it’s easy to see why. You may also notice that I cheated a bit, and this is not really the plot of the function I described in the beginning – instead this is a slightly modified version that uses binary expansions instead of decimal ones. It makes no difference in theory, but the plot looks better this way.

]]>

]]>

**Fact: **Suppose we have a function $f(x, y)$ of two variables and its restriction to every straight line going through the origin has a strict local maximum at $(0, 0)$. This does not necessarily imply that $f$ has a local maximum at the origin.

Funny, isn’t it – all the (straight) roads lead down, and yet you are not on top.

**Counterexample**: . Verifying that a) this function does not have a local maximum at $(0, 0)$ and b) its restriction to every straight line through the origin has a strict local maximum at $(0, 0)$ is trivial, boring, and will be left as an exercise for the reader.

**What’s going on here?** Let’s plot this function:

Well, maybe this time it doesn’t help much. But if we look at this surface from above, things become clearer:

What we have here is a kind of distorted saddle at the origin (big black point). The function is equal to zero there. There are points where the function is positive (red region) in every neighborhood of the origin, no matter how small, so it’s definitely not a local maximum. But at the same time, every straight line going through the origin has to cross the blue region (where the function is negative) before it reaches the red region, and therefore the restriction of the function to that line has to decrease for a little while as you move away from the origin, which gives it a local maximum there. No matter which straight line you choose, you can’t get to the red region before you get to the blue one. Now, if you were allowed to choose a parabola…
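The blue-before-red picture can be checked numerically. The sketch below is in Python and assumes a Peano-type function, $f(x, y) = -(y - x^2)(y - 2x^2)$, which fits the description (zero at the origin, positive only in the thin region between two parabolas). It is an assumption made for illustration and not necessarily the exact formula plotted here. The helper names `safe_radius` and `negative_on_line` are also made up for this sketch.

```python
import math


def f(x, y):
    # Assumed example: positive only between the parabolas y = x^2 and y = 2x^2
    return -(y - x**2) * (y - 2 * x**2)


def safe_radius(theta):
    """A distance from the origin within which the line at angle theta avoids the positive region."""
    c, s = math.cos(theta), math.sin(theta)
    if abs(s) < 1e-12:  # the x-axis: f = -2x^4 < 0 away from the origin
        return 1.0
    return min(1.0, abs(s) / (4 * c * c + 1e-12))


def negative_on_line(theta, radius, samples=50):
    """True if f < 0 at sampled points 0 < |t| <= radius on the line with direction theta."""
    c, s = math.cos(theta), math.sin(theta)
    ts = [radius * k / samples for k in range(1, samples + 1)]
    return all(f(t * c, t * s) < 0 and f(-t * c, -t * s) < 0 for t in ts)


if __name__ == "__main__":
    for k in range(12):
        theta = k * math.pi / 12
        print(round(theta, 3), negative_on_line(theta, safe_radius(theta)))
    # Along the parabola y = 1.5 x^2 the function is positive arbitrarily close to the origin
    print([f(t, 1.5 * t * t) > 0 for t in (1e-1, 1e-2, 1e-3)])
```

Note that the safe radius shrinks as the line gets closer to the x-axis direction: each line has its own interval on which the function is negative, but no single neighborhood of the origin works for all of them at once.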

But even then I could modify the example so that even parabolic roads wouldn’t help. I think you can now guess how – by replacing squares with 4-th powers: . Now, we could take not one, but two families of lines, each completely covering the plane – all the straight lines and all the parabolas , and still all the restrictions would have strict local maxima, while the function itself wouldn’t. (We could take this even further, by using such functions as , but then we would have to dance the singularity-patching dance, and that’s boring.)

]]>

**Step 1.** Trick the victim into graphing this slightly modified version of Tupper’s Self-Referential Formula:

in the region, say , , where n=

381025769433284440392319964900539302335

336506037124716737876327885349255244053

654529126222910795091774837392244419620

805572082941110711057252345724567742835

679828220923525384477934725633019757953

249733463908820850208792361145516746285

337177182564212162731442.

**Step 2.** Existential terror.

(Graphing the formula in that rectangle gives you the message “help! I’m trapped in a Universe factory”, which of course is a reference to this classic xkcd)

]]>

What makes it interesting is that if you plot the set of points $(x, y)$ for which the inequality holds in the rectangle $0 \le x < 106$, $n \le y < n + 17$, you will get this:

Turns out that the graph of the formula resembles the formula itself, which at first sounds pretty incredible. Oh, wait: I didn’t specify the value of n. It should be equal to:

960939379918958884971672962127852754715004339660129306

651505519271702802395266424689642842174350718121267153

782770623355993237280874144307891325963941337723487857

735749823926629715517173716995165232890538221612403238

855866184013235585136048828693337902491454229288667081

096184496091705183454067827731551705405381627380967602

565625016981482083418783163849115590225610003652351370

343874461848378737238198224849863465033159410054974700

593138339226497249461751545728366702369745461014655997

933798537483143786841806593422227898388722980000748404719.

Read more about it in Wikipedia.

I spent a couple of evenings trying to figure out how it works, and intended to write a long post about it, but I see that it has already been done here much better than I could, so I’ll only give you the main idea: the graph of the formula contains all possible combinations of pixels of size $106 \times 17$, and what you need to do is to locate the right fragment of the graph by choosing the value of n. Fortunately, the formula is constructed in such a way that it can be done rather easily. For instance, if you set n to be equal to:

770137614616740349659457383109219250729889213023196245

754997397474539522884572693040838328048207848222847384

255800207585680609541957604841348860558024608884092791

814186507612666983299083984568308963916568266347954249

445046515417964282372279838338100838479512127675315663

107944860681436794491747199511098181076928

you will get a fragment of the graph saying “hello world!”. (Oh, and it will be upside down, as pointed out in the blog post I already mentioned)
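Here is a rough Python sketch of how choosing n locates a fragment. It assumes the standard form of Tupper's formula, in which the pixel at column x and height y is lit exactly when bit number 17x + (y mod 17) of ⌊y/17⌋ is 1, and it ignores orientation and mirroring issues such as the upside-down effect mentioned above. The names `encode` and `decode` are invented for this illustration.

```python
WIDTH, HEIGHT = 106, 17


def encode(pixels):
    """Turn a HEIGHT x WIDTH grid of 0/1 values (row 0 at y = n) into the number n."""
    value = 0
    for col in range(WIDTH):
        for row in range(HEIGHT):
            if pixels[row][col]:
                value |= 1 << (HEIGHT * col + row)
    # n is a multiple of 17, so floor(y / 17) == value for every y in [n, n + 17)
    return HEIGHT * value


def decode(n):
    """Recover the grid that the formula draws in the strip n <= y < n + 17."""
    value = n // HEIGHT
    return [[(value >> (HEIGHT * col + row)) & 1 for col in range(WIDTH)]
            for row in range(HEIGHT)]


if __name__ == "__main__":
    # A diagonal stripe pattern as a small demonstration
    picture = [[1 if (row + col) % 5 == 0 else 0 for col in range(WIDTH)]
               for row in range(HEIGHT)]
    n = encode(picture)
    print(len(str(n)), "decimal digits")
    print(decode(n) == picture)
```

Python integers have arbitrary precision, so numbers with hundreds of digits like the ones above need no special handling.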

]]>

The idea is that if you generate a lot of gaussian bumps with randomly distributed parameters and add them all together, you will get a surface that looks like some natural landscape – a mountain, perhaps, or an island – and then by playing around with the parameters of the distribution you can make your landscape look more or less rugged.

So, I started with a plain, added some bumps, and got this:

This doesn’t look very natural, does it? But then I added several more layers of smaller bumps, and created The Jelly Mountain:

Jelly Mountain looks pretty realistic, especially if you are willing to look past the hideousness of the color scheme. When I got slightly better at choosing parameters, I created The Candy Archipelago:

This has no theoretical importance whatsoever, but it’s still fun making something life-like out of equations and matrices.

Here’s some of my code for the curious and the masochistic:

    restart;
    with(LinearAlgebra): with(plots): with(Statistics):

    # N bumps on an S x S grid covering the square [-1, 1] x [-1, 1]; s is the grid center
    N:=10000; disp:=1.5; S:=200; s:=S/2;

    # start with a flat plain
    M:=matrix(S, S);
    for i from 1 to S do for j from 1 to S do M[i, j]:=0; od; od;

    # a gaussian bump of width d and height h centered at the origin
    f:=(x, y, d, h)->h*exp(-(x^2+y^2)/d^2);

    for i from 1 to N do
      # random center (xx, yy) and random size d; d is also used as the height, so it may be negative
      xx:=evalf(Sample(RandomVariable(Uniform(-1, 1)), 1)[1], 3):
      yy:=evalf(Sample(RandomVariable(Uniform(-1, 1)), 1)[1], 3):
      d:=evalf(Sample(RandomVariable(Normal(0, 1/300)),1)[1], 3):
      evalf(i/N*100, 3);  # progress in percent
      # only update the grid cells within disp*|d| of the bump's center
      for j from max(1, round(xx*100+s-disp*abs(d)*100)) to min(S, round(xx*100+s+disp*abs(d)*100)) do
        for k from max(1, round(yy*100+s-disp*abs(d)*100)) to min(S, round(yy*100+s+disp*abs(d)*100)) do
          M[j, k]:=M[j, k]+evalf(f((j-s)/100-xx, (k-s)/100-yy, d, d), 3):
        od;
      od;
    od:

    matrixplot(M, style=surface, lightmodel=light2);

]]>

**What does it really mean? **It means: if you take some positive numbers $a_1, a_2, \dots, a_k$, raise each to the power of n, add them all together, and take the n-th root of the sum, then as you increase n the result will approach the greatest of the numbers you started with.

**Why is that weird? **Because the procedure is symmetric with respect to all the numbers $a_1, a_2, \dots, a_k$, and yet the result depends only on one of them, the greatest one (let’s call it $a_{\max}$). All the other k-1 numbers are pretty much completely irrelevant. You can set them all to zero and it will not change the outcome, because all the work is done by $a_{\max}$, and by $a_{\max}$ alone. How does the procedure “determine” which number is the greatest?

**How do we prove this amazing fact? **Like this:

We already defined $a_{\max} = \max(a_1, a_2, \dots, a_k)$. Now:

$$a_{\max} = \sqrt[n]{a_{\max}^n} \le \sqrt[n]{a_1^n + a_2^n + \dots + a_k^n} \le \sqrt[n]{k\, a_{\max}^n} = \sqrt[n]{k}\; a_{\max}$$

and

$\lim_{n \to \infty} \sqrt[n]{k} = 1$, therefore $\lim_{n \to \infty} \sqrt[n]{a_1^n + a_2^n + \dots + a_k^n} = a_{\max}$, the end.
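For the skeptical, here is a small numeric check in Python of the "raise to the power n, add, take the n-th root" procedure. The function name `root_of_power_sum` is made up for this sketch, and the greatest number is factored out before raising to a power only to keep floating-point values from overflowing; it does not change the result.

```python
def root_of_power_sum(numbers, n):
    """n-th root of the sum of n-th powers of the given positive numbers."""
    largest = max(numbers)
    # (sum a_i^n)^(1/n) = largest * (sum (a_i / largest)^n)^(1/n)
    return largest * sum((a / largest) ** n for a in numbers) ** (1.0 / n)


if __name__ == "__main__":
    numbers = [1.0, 2.0, 3.0]
    for n in (1, 2, 5, 10, 100, 1000):
        print(n, root_of_power_sum(numbers, n))
```

The printed values start at the plain sum for n = 1 and settle down toward the greatest of the three numbers as n grows, squeezed between that number and the k-th-root-style upper bound from the proof.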

**Does it work with other families of functions or only with powers and roots? **The proof only uses some monotonicity and the fact that , so it should also work, for example, when , meaning that:

I can’t think of a good way to completely characterize the class of functions, for which the proof holds. Can you?

**What does this mean geometrically? **It means that as n increases, the level surfaces of the function $\sqrt[n]{|x_1|^n + |x_2|^n + \dots + |x_k|^n}$, defined in a k-dimensional space, will start to resemble a k-dimensional cube, because a cube is a level surface of the function $\max(|x_1|, |x_2|, \dots, |x_k|)$. You might remember that we already spoke about this in my previous post titled “On Norms”.

**Have you made a cool-looking but completely pointless animation to demonstrate this phenomenon? **I thought you’d never ask:

(you may or may not need to click on it)

**Does this fact have any useful applications? **Not that I know of, which proves that this is some high quality math, and not some applied rubbish. (Just kidding, applied math can be beautiful too. Once in a thousand years.)

]]>

**Why is it weird? **Because the sum of two clearly non-constant functions is constant. Well, more or less so.

**How to prove it? **Differentiate.

**How can I use it? **You can confuse people with it: “Hello, clerk, I want  tickets to this movie.” Also, it works as a very fancy, if slightly incorrect, way to write the signum function.

]]>