Friday, May 27, 2016

Nature Notes from Massachusetts: How the Land has Changed

I've lived in Massachusetts for 8 years now, and I've always been struck by the density and variety of trees here – maples, oaks, birches, beeches, chestnuts, hickories, white pines, pitch pines, hemlocks, firs. Look in any direction and your view is likely to be blocked by a tangle of trees: in the winter and early spring crisscrossing, leafless branches form a haze of brown and gray; in the summer, when the leaves have returned, there is a lush, impenetrable wall of green.

Apparently this wasn't always the case: in the mid 1800s, the naturalist and writer Henry David Thoreau, the author of Walden, was "able to look out of his back door in Concord [now on the outskirts of Boston] and see all the way to Mount Monadnock in New Hampshire because there were so few trees to block his view." In Natural History of Western Massachusetts, Stan Freeman writes: 
"in the early 1800s Massachusetts may have looked much like a farm state in the Midwest, such as Kansas and Indiana. Farm fields, barren of trees, stretched from horizon to horizon…"
Also consider this. In 1871, when the US Department of Agriculture (USDA) surveyed the stone fences that European farmers in the Northeast had constructed, they found 33,000 miles of such fences in Massachusetts alone! That number should make clear just how much land was put under the plough.

Things changed quickly, though. As the United States expanded westward in the 19th century, fulfilling its so-called Manifest Destiny, the Midwest emerged as a major player in agriculture. Midwestern crops could be sent back east by railroad. The farmers of New England, unable to compete, abandoned their lands. The forests grew back, hiding the thousands of miles of stone fences.

In 1893, forest land in Massachusetts was about 30% of the land area of the state. By 1998, it had increased to 60%. This still holds true -- see 2014 USDA map. The six million residents of Massachusetts are concentrated in a few cities and suburbs, and despite the resurgence of local farms, much of what the state needs is supplied from outside. Travel west of Boston (along I-90 or Route 2 or back roads such as MA-9) and the towns are never very big. At the edges of these towns – with their abandoned mills, red brick buildings, the odd convenience store, gas station, a church or two – are miles and miles of thick forests, winding brooks and wetlands. Even the exceptionally busy Mass Pike (Interstate 90) runs through land that has simply been left alone. Driving by at 70 miles an hour, I remember once spotting a blue heron resting among cattails in a small pond.

In Amherst, which is in the western part of the state, residential areas are continuously interspersed with a patchwork of conservation lands. One of my favorite spots is called Lawrence Swamp. Much of it looks like this picture I took a couple of weeks ago. I love how still the water is! You can follow even the smallest of ripples – created, say, by an insect skimming the surface. The mound you see adjacent to the dead pine tree is an active beaver lodge. A flooded landscape with dead trees, broken stumps and floating logs looks very haphazard, but to ecologists such features constitute a habitat structure, an arrangement of the physical space that allows diverse species to thrive. In March and April, red-winged blackbirds perch on the stumps, punctuating the silence with their screeches. Occasionally a pileated woodpecker will knock its beak against a tree trunk, not just once but continuously, creating an eerie drumming rhythm that can be heard from afar.

When Thoreau was having his simple, back-to-nature Walden experience in the 19th century, many species I can easily spot now were less prevalent or even completely absent. For example, the last wild turkey in Massachusetts was shot in 1851. Now they've made a huge comeback; I see them regularly in groups of 6-10, foraging in meadows. Moose were absent then but are now around. Beavers had been eliminated in the 17th and 18th centuries thanks to the profit-driven excesses of the fur trade. In the 1930s, they were re-introduced, and have transformed the wetlands of Massachusetts, creating swamp-like habitats that benefit a host of other species. Just to give two examples: blue herons and pileated woodpeckers make use of small tree islands in these swamps; with the increase in beaver-engineered landscapes, their numbers have risen in the last century. 

The return of forests, wetlands, and once-missing or threatened animals: how counterintuitive these trends are at a time when habitats and species elsewhere are being lost rapidly!

References: My primary source for this piece has been Natural History of Western Massachusetts, but also David Foster's Thoreau's Country: Journey Through a Transformed Landscape. The map of the state of Massachusetts comes from this USDA report. Here's a related column on beavers I did for 3QD last year. 

Sunday, March 20, 2016

Where probability meets literature and language: Markov models for text analysis

Is probabilistic analysis of any use in analyzing text – sequences of letters or sequences of words? Can a computer generate meaningful sentences by learning statistical properties such as how often certain strings of words or sentences occur in succession? What other uses could there be of such analysis? These were some questions I had this year as I collected material to teach a course on a special class of probability models called Markov chains. The models owe their name to the Russian mathematician Andrey Markov, who first proposed them in a 1906 paper titled "Extension of the law of large numbers to dependent quantities".

The key phrase, as we shall see, is 'dependent quantities'. Broadly speaking, Markov models are applications of that basic rule of conditional probability, P(A|B): the probability of Event A happening, given that B occurs. The uses of Markov chains are many and varied – from the transmission of genes through generations, to the analysis of queues in telecommunication networks, to the movements of particles in physics. In 2006 – the 100th anniversary of Markov's paper – Philipp von Hilgers and Amy Langville summarized the five greatest applications of Markov chains. These include one that most of us use unknowingly on a daily basis: every time we search on the internet, the ranking of webpages is based on the solution to a massive Markov chain.

The focus of this piece, however, is the analysis of letter and word sequences as they appear in text. In what follows, I'll look at four examples where Markov models play a role.

1. Vowel and Consonant Pairs in Pushkin's Eugene Onegin

The first such example was demonstrated by Andrey Markov himself in 1913. To illustrate his theory on dependent quantities, Markov had collected data – painstakingly, by hand! – on the first 20,000 letters of Alexander Pushkin's popular novel in verse, Eugene Onegin. He was interested in counts of vowels and consonants and the order in which they appeared. Of the first 20,000 letters in Eugene Onegin, 8638 were vowels and 11362 were consonants. The overall probability estimate that a letter is a vowel is therefore 8638/20000 = 0.43. For a consonant, the same estimate is 11362/20000 = 0.57.
Suppose the probability that a letter is a vowel or consonant is independent of what the previous letter was – in the same way that the outcome of a coin toss is independent of the previous toss. Just as the probability of a heads following a heads is 0.5*0.5 = 0.25, we can calculate the probability that: (1) a vowel is followed by a vowel (0.43*0.43 = 0.185), (2) a vowel is followed by a consonant (0.43*0.57 = 0.245), (3) a consonant is followed by a vowel (0.57*0.43 = 0.245) and (4) a consonant is followed by a consonant (0.57*0.57 = 0.325).

If these 4 probabilities (which sum to 1) were correct, we would expect that in the 19,999 successive letter pairs of Eugene Onegin we should find approximately 0.43*0.43*19,999 ≈ 3698 pairs where a vowel is followed by a vowel.
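The arithmetic above can be checked with a few lines of Python. This is only a sketch under the independence assumption; the variable names are my own, and the probabilities are rounded to two decimals to match the figures quoted in the text.

```python
# Letter counts reported in the text for the first 20,000 letters of Eugene Onegin.
VOWELS = 8638
CONSONANTS = 11362
TOTAL = VOWELS + CONSONANTS   # 20,000 letters
PAIRS = TOTAL - 1             # 19,999 successive letter pairs

# Overall probabilities, rounded to two decimals as in the text.
p_v = round(VOWELS / TOTAL, 2)      # 0.43
p_c = round(CONSONANTS / TOTAL, 2)  # 0.57

# Under independence, P(first, second) = P(first) * P(second).
independent = {
    "vv": p_v * p_v,
    "vc": p_v * p_c,
    "cv": p_c * p_v,
    "cc": p_c * p_c,
}

# Expected number of each pair type among the 19,999 pairs.
expected = {pair: prob * PAIRS for pair, prob in independent.items()}

for pair in independent:
    print(f"{pair}: probability {independent[pair]:.3f}, expected count {expected[pair]:.0f}")
```

The vowel-vowel row comes out at roughly 3698 expected pairs, the figure that Markov's actual counts are compared against.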

But it's not hard to see that the independence assumption is strange. A vowel is more likely to be succeeded by a consonant than it is by a vowel. Markov's counts based on 19,999 pairs of successive letters demonstrated this clearly. The number of pairs where a vowel is followed by a vowel is 1104, less than a third of the number (3698) estimated assuming independence. Here are the same four pairs we discussed above, but now with conditional probabilities based on the counts actually observed in Onegin:

v-v count: 1104, P (second letter is v, given that the first is a v) = 1104/8638 = 0.128
v-c count: 7534, P (second letter is c, given that the first is a v) = 7534/8638 = 0.872
c-v count: 7534, P (second letter is v, given that the first is a c) = 7534/11362 = 0.663
c-c count: 3827, P (second letter is c, given that the first is a c) = 3827/11362 = 0.337

[My reference is this article, and the figure above comes from here.]

What we see above is a simple illustration of dependent quantities. In this case, the probability that a letter is a consonant or vowel depends only on what the previous letter was, and on nothing more than that.
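Here is a minimal Python sketch of how those conditional probabilities follow from the pair counts. The counts and denominators are the ones quoted above; the function name and data layout are hypothetical choices of mine, not anything Markov used.

```python
# Pair counts reported in the text: (first letter, second letter) -> count.
pair_counts = {
    ("v", "v"): 1104,
    ("v", "c"): 7534,
    ("c", "v"): 7534,
    ("c", "c"): 3827,
}

# Letter totals used as denominators in the text.
letter_totals = {"v": 8638, "c": 11362}


def transition_probabilities(counts, totals):
    """Estimate P(second | first) as pair count / count of the first letter."""
    return {
        (first, second): n / totals[first]
        for (first, second), n in counts.items()
    }


P = transition_probabilities(pair_counts, letter_totals)
for (first, second), p in P.items():
    print(f"P(second is {second} | first is {first}) = {p:.3f}")
```

These four numbers are the entries of a two-state transition table: given the current letter type, they say how likely each type is to come next. Compare P(v | v) = 0.128 with the 0.43 you would use if the previous letter did not matter.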

Markov's application of probability to letters in a text must have seemed quaint at the time. What practical value could the analysis of vowels and consonants have? Andrey Kolmogorov (1903-1987), another Russian mathematician – who came up with the axioms of probability – felt that Markov chose Eugene Onegin because he was somewhat isolated in Russia and therefore wasn't able to apply his ideas to the exciting discoveries in physics that Western Europe was abuzz with in the first decades of the 20th century.

But what is quaint in one era can suddenly become important in another. As David Link notes in his article, Traces of the Mouth, Markov's efforts in retrospect "represent an early and momentous attempt to understand the phenomenon of language in mathematical terms." It's not an exaggeration to say that Markov's analysis of text is in principle similar to what Google and other firms now routinely carry out on a massive scale: analyzing words in books and internet documents, the order in which the words occur, analyzing search phrases, detecting spam and so on. 

[Read more here. The fourth and last part on the Indus symbols is below: ]

4. Do Ancient Symbols Constitute a Written Script?

Now to a detection problem of a different kind. If archaeological excavations have unearthed a large corpus of symbols, how do we know that these symbols are evidence of a written script? The symbols, although they appear in a sequence, could be some type of religious or artistic expression, not necessarily a linguistic script. If someone in the distant future excavated samples of printed DNA sequences, which consist of the four letters A, G, C and T, could they prove or disprove that the sequence is a written script? Similarly, what would the conclusion be if samples of Fortran programming code were excavated?

These are precisely the type of questions that this 2009 Science paper attempted to answer using the conditional probability principles that underlie Markov models. The corpus they applied it to was the excavated symbols of the Indus Valley Civilization, "which stretched from what is now eastern Pakistan and northwestern India" from around 2600-1900 BCE. There are over 3800 such inscriptions made up of 417 symbols. The average length of each inscription (the analogy that comes to my mind is word length) is around 5 symbols. The largest consists of 17 symbols.

The Indus script has not yet been deciphered. Indeed, because it is yet undeciphered, there still remains a question whether it represents a language at all!

If the Indus collection is indeed a language, then we should see general patterns that we see in other languages. In the same way that vowels and consonants do not occur independently of each other, letters of an alphabet do not occur independently either. Some letters occur more frequently than others in written text (see the Zipf distribution). In English, the letter pair 'th' occurs very frequently since the word 'the' is the most frequently used word, but you'll be hard-pressed to find the letter pair 'wz' in English.

Thus there is a kind of imbalance that can be observed in languages. A measure called information entropy, which was proposed in Claude Shannon's paper we discussed earlier, quantifies this imbalance based on the observed counts/frequencies of letter pairs in a language. If the relative frequencies of pairs of Indus symbols exhibit similarities to the frequencies observed in other linguistic systems, then that provides supporting (but certainly not conclusive) evidence that the symbols constitute a written script.
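To make the idea concrete, here is a small Python sketch of one common pair-based measure, the conditional entropy of the next symbol given the current one, measured in bits. The toy sequences and function names are my own assumptions for illustration; this is not the Indus data, and it is not claimed to be the exact procedure used in the Science paper.

```python
import math
from collections import Counter


def pair_counts_from_sequence(seq):
    """Count successive symbol pairs, e.g. 'abab' -> ab: 2, ba: 1."""
    return Counter(zip(seq, seq[1:]))


def conditional_entropy(pair_counts):
    """Entropy (in bits) of the second symbol given the first.

    H = sum over pairs of P(first, second) * log2(1 / P(second | first)),
    with both probabilities estimated from the pair counts.
    """
    total = sum(pair_counts.values())
    first_totals = Counter()
    for (first, _second), n in pair_counts.items():
        first_totals[first] += n
    return sum(
        (n / total) * math.log2(first_totals[first] / n)
        for (first, _second), n in pair_counts.items()
    )


# A rigid toy sequence: the next symbol is fully determined by the current one.
rigid = pair_counts_from_sequence("abababab")

# A toy table where every pair is equally common: no structure at all.
uniform = {("a", "a"): 5, ("a", "b"): 5, ("b", "a"): 5, ("b", "b"): 5}

print(conditional_entropy(rigid))    # 0.0 bits: perfectly predictable
print(conditional_entropy(uniform))  # 1.0 bit: a fair coin toss each time
```

Natural languages fall between these two extremes: some pairs are far more common than others, but the next symbol is never fully determined by the current one.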

This is what the Science paper is claiming. The entropy of the Indus symbols was closer to that of languages - Sumerian, Old Tamil, Sanskrit and English - than it was to the entropy of non-linguistic systems such as DNA sequences, protein sequences and programming languages such as Fortran.

Around 7 years ago when these results were published, I remember they were heavily circulated on social media. It's a cool story for sure – mathematics revealing patterns of an ancient, undeciphered script in the hotly contested ground that is Indian history. However, Richard Sproat, a computational linguist, raised concerns that provide an important counterpoint. As late as June 2014, Sproat was still doggedly pointing out technical issues in the original Science paper! 

Whatever the concerns, I did find this type of work intriguing - a clever use of probabilistic approaches. If the data and parameters used in the calculations were made public, it should be possible to replicate the findings and debate the conclusions if necessary.

Monday, January 18, 2016

A note on peppers

The Indian subcontinent is well known for its spices, and one of its stellar contributions is the ubiquitous black pepper. Native to South India and Southeast Asia (see unripe green fruits in picture), it’s been around for thousands of years, making its way very early to Europe and other parts of Asia by trade. Black pepper and the related long pepper may have been the most prevalent hot spices east of the Atlantic. That was until Columbus blundered onto the Americas in 1492, inadvertently connecting the Americas – which at the time had a unique ecological and cultivation history because of their isolation – to Europe, Africa and Asia.

In the newly globalized world since 1492, American ‘peppers’, better known as chilies, began to make their way to the rest of the world and took hold quickly. Indeed, all the chili peppers that the world uses today, without exception – from the mild bell peppers used primarily for their deep flavors to the hot ones that Indian, Thai, Chinese, Korean and other cuisines take for granted – are descended from the varieties cultivated for millennia by pre-Hispanic farmers in southern North America (Mexico primarily) and northern South America (Peru and Bolivia have many varieties). The fiery habanero, which scores high on the Scoville Heat Scale, is originally from the Amazon, from where it reached Mexico.

While traveling in Oaxaca (southern Mexico) last week, I saw and tasted the dizzying variety of chili peppers, small and large, fresh, dried and smoked, each imparting a different color, flavor and odor to the salsas, the region’s famous moles, and other Mexican classics such as poblano rajas. At one restaurant dozens of dried chilies, types I had never seen before, were patched to the wall.

Etymology provides some interesting clues. The word ‘pepper’ apparently has its roots in a South Indian word pippali, referring to the long pepper plant, whereas ‘chili’ is from Nahuatl, a pre-Hispanic Mexican language (Nahuatl, though diminished since the Spanish conquest of 1521, is still spoken in Mexico). The word for chilies in Tamil, my mother tongue, is milagai – a modification of the word milagu, the word for black pepper. It makes sense that this new entrant and competitor for creating heat should be linked by name to its older rival. Both milagu and milagai now co-exist in South Indian cuisine. The introduced chilies haven’t diminished the use of the peppercorns at all. Indeed, the potent garam masala, a signature mix of spices widely prevalent in India – Abbas has a recipe for it in his new book – uses only peppercorns for heat and not chilies.

All said, it's hard to imagine Indian cooking without chilies today. If somebody had asked me about the origin of chilies in high school or college, I would have claimed them as Indian without a second thought. It was a huge surprise when I learned, in my mid-twenties, that chilies were introduced, that before the 16th or 17th centuries, they were not part of the cuisine at all! K.T. Achaya, the author of The Story of Our Food, notes that "in one of the sections of the Ain-i-Akbari, written in 1590, there is a list of 50 dishes cooked in the [emperor] Akbar's court: all of them use only [black] pepper to impart spiciness." Similarly, the red chili paste and sauces that you find in so many Korean dishes and Thai curries are relatively recent. Of course, chilies are not unique in this regard. The same idea applies to tomatoes, potatoes, a lot of grains -- the list could go on and on.

It is fascinating how things that were once foreign can integrate so seamlessly and become so familiar that they now feel ‘native’, as if they were timelessly associated with a place and people.

Cross-posted here.

Wednesday, December 23, 2015

Reflections on War and Peace, and the Inner Work of Pierre Bezukhov

First published over at 3 Quarks Daily.

I finished reading War and Peace recently. It took me three years but I did try to read it carefully. Tolstoy defined art "as that human activity which consists in one person's consciously conveying to others, by certain external signs, the feelings he or she has experienced, and in others being infected by those feelings and also experiencing them." This is a wonderfully robust definition – especially because it does not impose which types of "human activity" or "external signs" qualify. And I was certainly infected by the themes of War and Peace: I felt on many occasions that the book was speaking especially to me. I took notes and copied down everything that struck me.

War and Peace operates in two distinct parts. There's the story of two upper class Russian families and individuals – the Bolkonskys, the Rostovs and the inimitable Pierre Bezukhov – whose lives are directly affected by the Napoleonic wars from 1805-1812, including the French invasion of and subsequent retreat from Moscow. Here the narrative flows so seamlessly from one character to another, from one high society intrigue to the next, and so clear is the psychological detailing that it never feels like anything is being overdone. This despite the fact that Tolstoy likes to intervene constantly. His style goes against the "show but don't tell" advice that is nowadays given to writers. He takes great pains to tell us what's going on in each character's mind, how things have changed since we last met this or that person. Everything, internal or external – estates, battlegrounds, soirees, dinners, military offices, forests – is described with great precision. Sudden twists are not Tolstoy's style; the suspense instead comes from how a character will respond to changes in her circumstances.

The other part of War and Peace consists of what can only be called the author's own essays. Tolstoy inserts them throughout the book at regular intervals, having put the story on pause. The essays, though long-winded and difficult to get through, are nevertheless an integral part of the book. Tolstoy uses them to continually emphasize how difficult it is to attribute causes to events in history, how the so-called "big men" such as Napoleon (whom Tolstoy particularly dislikes) do not have the kind of agency that historians like to credit them with.

The gist of these essays is best illustrated by an analogy Tolstoy uses. In classical mechanics, Tolstoy notes, the continuous motion of an object or a combination of objects is accurately described and predicted by the integration of infinitesimally small quantities. The development of calculus in the 17th century made this possible. Likewise, history too is continuous and can only be approached as an integral, as "the sum of all individual wills". The historian's typical approach, however, is to isolate discrete events or periods, assume that they are independent, and assign proximate discrete causes to the events. By this method, powerful individuals such as Napoleon are said to cause events and drive history. But are such conclusions really correct? What of the wills of the hundreds of thousands of soldiers and other citizens across Europe and Russia who were involved? In Tolstoy's view "only by admitting an infinitesimal unit for observation – a differential of history, that is, the uniform strivings of people – and attaining to the art of integrating them (taking the sums of these infinitesimal quantities) can we hope to comprehend the laws of history."

Tolstoy wrote this in the 1860s. In 2015, the laws of history are still not clear. There seems to be no way to define a "differential of history" let alone integrate "individual wills". We still have lengthy, inconclusive debates on what exactly caused an event. We can sense, intuitively, that there are innumerable causes which we cannot fully list, all of which interact in complex ways. Nassim Nicholas Taleb described it well in The Black Swan: "History is opaque. You see what comes out, not the script that produces events, the generator of history."

The Inner Work of Pierre Bezukhov

There's a lot more one can say about the analytical or theoretical parts of War and Peace. But the main focus of this piece is Pierre Bezukhov.

Pierre Bezukhov and two pairs of siblings – Natasha and Nikolai Rostov; Marya and Andrei Bolkonsky –  make up the five major characters of the book. Each has a different personality but they share important features. They are all extremely sincere. They introspect a lot, learn lessons from the major events in their lives and are aware of their flaws. They continuously seek happiness, the kind of happiness that does not depend on external circumstances. At least three of them – Pierre, Andrei and Marya – are engaged in some kind of religious or spiritual search or a search for meaning and wisdom.

The phrase that comes up in the book a few times is "inner work". And I felt the inner work of Pierre Bezukhov especially crystallizes what Tolstoy is trying to convey. In what follows, I provide a compressed chronological version of Pierre's development in three parts along with key quotes. I can't claim that what I present is original. War and Peace has been endlessly analyzed and I may well be repeating what more qualified readers and critics have already noted. Also there are spoilers here, though I tried to minimize them by mainly focusing on Pierre's questions. All the quotes are from the acclaimed Pevear-Volokhonsky translation. The artistic rendition of Pierre Bezukhov by D. Shmarinov is from this website. The collage of Napoleon's invasion and retreat from Russia is from here.  

"What for? Why? What's going on in the world?"
When War and Peace begins in 1805, Pierre Bezukhov, the illegitimate son of a wealthy count, has just returned from Europe. He is good-natured but bumbling, absent-minded and somewhat naïve. He admires Napoleon. He is not particularly interested in wealth but loves the good life. Physically, he is big and fat; he eats and drinks a lot. His father's exceptional wealth, which he accidentally inherits, brings him naturally into the orbit of Russian high society. He is introduced to Elena, the daughter of the well-connected Prince Kuragin. Infatuated with Elena's beauty, he marries her. But quickly it becomes clear there is no real connection. When Elena flirts with a Russian officer, Dolokhov, Pierre nonetheless becomes jealous and challenges Dolokhov to a duel. He injures Dolokhov in the leg but the matter is hushed up, and Pierre escapes any consequences. Afterwards Pierre has a quarrel with Elena, who taunts him, and they separate.

This is exactly the point at which Pierre's inner work begins. While traveling, he has a chance meeting with a man who belongs to the "brotherhood of Freemasons". Pierre has no belief in God or religious abstractions. In the past he even made fun of Masonic beliefs. But Pierre is fascinated by this stranger who argues convincingly that "the supreme wisdom is not based on reason alone" and can only be obtained by purifying oneself inwardly. With his life in disarray, Pierre is eager to embrace something that will give him purpose.  He becomes a Mason, putting himself through the cultish initiation rituals of the brotherhood. Despite the strangeness of these rituals, Pierre is rejuvenated by the message of the Masons that "the source of blessedness is not outside, but inside us."

Moments like this, however, are always fleeting in Tolstoy's world. Like life itself everything moves and changes. Just when you think there is some kind of stability, it begins to disappear. So it is with Pierre's Masonic moment. Even as he becomes an advocate of his new beliefs, Pierre notices that his excesses in food, wine and the amusements of "bachelor parties" (Tolstoy's phrase for the company of women) continue as before.

As he participates in events of the society around him and leads a dissipated life, a doubt keeps nagging him:
"What for? Why? What's going on in the world?"
He also notices that everyone around him seems to be doing something to distract themselves so as to fill the gaps in their life:
"Sometimes Pierre remembered stories he had heard about how soldiers at war, taking cover under enemy fire, when there is nothing to do, try to find some occupation for themselves so as to endure the danger more easily. And to Pierre all people seemed to be such soldiers, saving themselves from life; some with ambition, some with cards, some with drafting laws, some with women, some with playthings, some with horses, some with politics, some with hunting, some with wine, some with affairs of the state. 'Nothing is trivial or important, it's all the same; only save yourself from it as best you can!' thought Pierre. 'Only not to see it, that dreadful it.'" [Tolstoy's italics.]
How relevant these observations are even today! As if the activities of our "physical self" aren't enough – all the occupations and hobbies Pierre mentions above – we now have the innumerable pleasures and distractions of a life online! I was also struck by the claim: "Nothing is trivial or important; it's all the same." 

"A limit to suffering and a limit to freedom…"

In 1811 and 1812 – the years the Great Comet could be seen in the night sky – Pierre is caught up in the Russian resistance to the looming French advance. It endows Pierre, whose life had been drifting aimlessly, with a new purpose. He is not capable of serving as a soldier. But he attends meetings where funds are being raised for the militia; he cooks up occult theories that suggest that he himself will somehow obstruct Napoleon's apocalyptic advance. He feels a need to "undertake something and sacrifice something" though he cannot articulate "what he wanted to sacrifice it for".  

This begins a fascinating phase where the clumsy and militarily clueless Pierre walks straight into the war when all other citizens are trying to escape. We see the great Battle of Borodino through Pierre who, in good humor, blunders onto the most dangerous parts of the battlefield. Initially the soldiers consider him a nuisance, but they slowly take a liking to this strangely dressed Russian count unexpectedly in their midst. We see him on the retreat along with the soldiers. We see the burning of Moscow after the city has emptied out and Napoleon's army occupies it. Pierre stays on in Moscow, has comical plans of assassinating Napoleon with a pistol he possesses, ends up rescuing those trapped in fires, gets arrested for arson (something he was never guilty of), observes the harrowing public execution of fellow prisoners and himself narrowly escapes being executed. Finally, he travels as a prisoner along with others under the harshest physical conditions as Napoleon's army begins to retreat from Moscow.
It is in these challenging external circumstances – the three-week walk in captivity, away from Moscow – that Pierre gains his deepest insights.

He learns "not with his mind, but with his whole being". He notices, to his own surprise, his ability to adapt to the difficulties very well. Depleted French reserves mean that Pierre is fed horsemeat, which he finds "tasty and nutritious" and "the saltpeter bouquet of gunpowder they used instead of salt was even agreeable". It is fall, the weather is cold, but walking keeps him warm and even "the lice that ate him warmed his body pleasantly". His feet are full of sores and are frightful to look at, but Pierre simply and very naturally thinks of other things. This teaches him "the saving power of the shifting of attention that has been put in man, similar to the safety valve in steam engines, which releases the extra steam as soon as the pressure exceeds a certain norm".

A fellow prisoner, a peasant foot-soldier named Platon Karataev, inspires Pierre with his genuine simplicity and cheer.  

Pierre realizes that "as there is no situation in the world in which a man can be happy and perfectly free, so there is no situation in which he can be perfectly unhappy and unfree." Further:    
"He had learned that there is a limit to suffering and a limit to freedom, and those limits are very close; that a man who suffers because one leaf is askew in his bed of roses, suffers as much as he now suffered falling asleep on the bare, damp ground, one side getting cold as the other warmed up; that when he used to put on his tight ballroom shoes, he suffered just as much as now, when he walked quite barefoot (his shoes had long since worn out) and his feet were covered with sores. He learned that when, by his own will, as it had seemed to him, he had married his wife, he had been no more free than now, when he was locked in a stable for the night."
What really elevates these sentences is the quality of the examples and the contrasts they set up. The claims are simple yet striking. They are those truths that we perhaps know intuitively but have not articulated yet.  

"People must join hands…"

Pierre is eventually rescued, and with the war finally reaching its end, he returns to normal life. Even though he falls ill, he is filled with joy and recovers. When, "by old habit", he asks himself: "Well, and what then? What am I going to do?" immediately the answer comes to him: "Nothing. I'll live. Ah, how nice!"

The search for a purpose, Pierre has realized, is precisely that which keeps one unhappy. The purpose seems simply to live, to get on with things cheerfully if possible, rather than looking for abstractions. Pierre has emerged a renewed man.  

But just because we've gained some wisdom does not necessarily mean that we will adhere to it all the time. We see this again and again in War and Peace. (It also works the other way: a lack of enthusiasm for life never lasts either and a person finds himself revived one way or another.) Prince Andrei, Pierre's friend, keeps experiencing blissful moments when he feels that the world has been transcended. Such as when, lying injured at the Battle of Austerlitz, he glimpses something indescribably special in the "lofty sky", something that renders everything else insignificant. But however profound such moments may be, they always fade. Prince Andrei's sister, Marya Bolkonsky, who unlike her atheist brother and father, is devout, has an unshakeable faith, and tries very hard to elevate her character through religion – Marya discovers again and again that despite her best efforts and sincere intentions all kinds of irritations and jealousies torment her.  

Pierre changes at the end too, but it's a lot more subtle. The Epilogue is set a few years after the war. Pierre is happily married and has children. He retains much of his newfound joy in life; people still love to be around him. You would think this would be a good way to finish, literally a "happily ever after" ending. But somehow, inexplicably, Pierre now decides to participate in political intrigue. He has just returned from an important meeting in Petersburg. He feels the current administration in Petersburg is not doing the right things, there's "thievery in the courts", "what is young and honest, they destroy". So "people must join hands, in order to avoid the general catastrophe". 

War and Peace ends with Pierre hinting at the creation of a rebel group – something's cooking, and it will eventually lead to the Decembrist revolt of 1825. So Pierre, who had learned from his experiences in war a few years back that there is no need for abstract purposes, now ends up again arguing for and participating in one.

To the very end, Tolstoy remains faithful to the fact that not even the most profound realizations withstand the dynamism and change that is life.

Monday, May 11, 2015

Unconditioned by the past

Exploring the Memoryless property of the Exponential Distribution. Cross posted over at 3 Quarks Daily.
1. Waiting For the Next Customer

Suppose you run a small business, a barber shop or a small restaurant that takes walk-ins only. A customer has just left, your place is empty, and you are waiting for the next customer to come in. You've figured out that on average the time between two successive arrivals is 15 minutes. However, there is variation and the variation follows the Exponential probability curve shown in the figure below. This is not an arbitrary choice: time between successive random and independent arrivals does actually follow the Exponential. The average time between arrivals depends on whether it is a busy or slow time of the day, but the general shape of the Exponential curve keeps showing up again and again when empirical data is plotted (one example here).    
The height of the curve is an indicator of where the greatest probability densities are. Most arrivals happen in quick succession (the curve is tall when t is small), but there will be occasions when a long time elapses before the next arrival happens. At t=0, when the last customer just left, if you calculated the probability of the next customer arriving within 5 minutes (0 < t < 5) you would get the value 0.283. Equivalently you could say that the probability you will wait 5 minutes or more is (1 - 0.283) = 0.717. 

Now here's the interesting part. Suppose twenty minutes have now passed and the next customer still hasn't arrived. You are starting to get a little impatient; after all you don't want your productive time to be idle. So at t=20, you again calculate the probability of a customer arriving in the next 5 minutes (20 < t < 25), given that no one has come so far. You would think this new probability, based on how much time has elapsed, should be higher than 0.283. But, surprisingly, the probability that a customer will arrive in the next 5 minutes, given that twenty idle minutes have passed, is still 0.283! And the probability that you will wait 5 minutes or more is still 0.717.

This is precisely the Memoryless property of the Exponential: the past has been forgotten; the probability of when the next event will happen remains unconditioned by when the last event happened. Fast forward even more: let's say you've waited for half an hour. No one has shown up so far. Frustrated, you recalculate the probability of someone arriving in the next 5 minutes (30 < t < 35). Still 0.283!
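For readers who want to check the numbers above, here is a minimal sketch in Python. It assumes only what the example states (a mean gap of 15 minutes between arrivals); the function names are my own and purely illustrative.

```python
import math

MEAN_GAP = 15.0          # average minutes between arrivals (from the example)
RATE = 1.0 / MEAN_GAP    # rate parameter of the Exponential

def prob_wait_longer_than(t):
    """P(T > t): chance the next arrival takes more than t minutes."""
    return math.exp(-RATE * t)

def prob_arrival_in_next(window, elapsed=0.0):
    """P(elapsed < T <= elapsed + window, given that T > elapsed)."""
    still_waiting = prob_wait_longer_than(elapsed)
    arrives_in_window = still_waiting - prob_wait_longer_than(elapsed + window)
    return arrives_in_window / still_waiting

# Same five-minute window, asked after 0, 20 and 30 idle minutes.
for elapsed in (0, 20, 30):
    p = prob_arrival_in_next(5, elapsed)
    print(f"after {elapsed:>2} idle minutes: {p:.3f}")   # 0.283 each time
```

The conditioning step is the division by `still_waiting`: we restrict attention to the cases where no one has arrived yet, and the answer comes out the same regardless of how long that has been.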

The behavior that we see in the Exponential is because your customers are arriving independently of one another -- remember that you allow only walk-ins. There is no "memory" or predetermined schedule connecting any two successive arrivals (in the same way that the outcome of a coin that is tossed now has no memory or connection with the outcome of a coin tossed at some point of time in the past). A barber shop, a small restaurant, a shoe-shop, a cab-driver, a car mechanic, a self-employed person who earns a living doing Japanese-English translation requests – one can find many contexts that experience the Memoryless property. Bigger retail firms also experience the same problem, but they hire (and fire) many people, cross-train their employees to do multiple tasks and thereby have ways to reduce the risk of staying idle.

In small businesses, the wait for the next customer is felt far more personally and acutely. Recently, I spoke to Abel (not his real name), an Ethiopian man who had started a restaurant in a small Midwestern town. The Ethiopian dishes I tasted were excellent. Yet Abel said there were many difficult evenings when he would be alone, waiting for someone to come in. To cut costs, he was both the cook and the server on such slow days. But Abel noted that he would, unexpectedly, get busy. This is the flip side of the Exponential: a string of closely spaced arrivals is very likely since the probability densities are front-heavy, as seen in the shape of the curve. So you can go from being idle for an hour to suddenly having a long line of people waiting. Now you have a different problem – you are too busy and your customers are unhappy!
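The front-heavy shape described above can be made concrete with a rough sketch. It keeps the 15-minute mean from the earlier example; the seed and the sample size are arbitrary choices of mine, not anything measured at Abel's restaurant.

```python
import math
import random

MEAN_GAP = 15.0  # average minutes between arrivals (from the earlier example)

# Exact share of gaps that are shorter than the average gap: 1 - e^(-1).
share_short_exact = 1 - math.exp(-1)

# Simulated check with a fixed seed (seed and sample size are arbitrary).
rng = random.Random(0)
gaps = [rng.expovariate(1 / MEAN_GAP) for _ in range(100_000)]
share_short_sim = sum(g < MEAN_GAP for g in gaps) / len(gaps)
longest_gap = max(gaps)

print(f"exact share of gaps under {MEAN_GAP:.0f} min: {share_short_exact:.3f}")
print(f"simulated share: {share_short_sim:.3f}")
print(f"longest simulated gap: {longest_gap:.0f} min")
```

Roughly 63% of gaps are shorter than the average, which is why arrivals tend to bunch up, while the occasional very long gap pulls the average back to 15 minutes.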

2. A Visual Illustration

Let's look more closely at why exactly the Memoryless property holds true for the Exponential. Instead of showing the algebra, I'll try illustrating visually. I struggled with the Memoryless property myself for many years; so at the very least, I'll put my own thoughts in order. Please let me know if something does not sound right.

The Exponential is a continuous distribution used to characterize the probability of time durations, such as the time between two successive randomly occurring events. Naturally the smallest possible value is 0. The Exponential curve is asymptotic – a fancy word for the idea that the probability curve keeps dipping as we move to the right and gets closer and closer to the x-axis, but never quite dips enough to touch it. The dipping curve stretches to infinity. So very long times between events (long periods of idleness) are theoretically possible, although in practice they are very, very unlikely. The area under the Exponential probability curve, if we calculate the limit, tends to 1 (as it must for any continuous probability distribution).
Let's return to the original example. Time between successive arrivals follows an Exponential Distribution with a mean of 15 minutes. Currently, twenty minutes have passed since the last arrival, so we are at t=20. We are trying to find out the probability that an arrival will happen in the next 5 minutes -- in the interval 20 < t < 25. To do so, we now only need to consider the area under the Exponential curve to the right of the t=20 mark. The total area under the curve to the right of t=20 is 0.263. We "rescale" this area such that 0.263 now becomes equivalent to an area of 1 -- we do this because this is the relevant conditional probability space we are now interested in. Further, the x-axis is re-scaled: t=20 becomes t=0; t=25 becomes t=5; and so on, so that in the newly re-scaled or conditioned area we can calculate the probability of an event happening in the next 5 minutes.

Surprisingly, the re-scaled area is exactly the original Exponential probability curve! Even the height of the curve corresponding to every time value on the x-axis is exactly as it was when the last customer left.
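The rescaling described above can also be checked numerically. This is only a sketch under the example's assumptions (mean of 15 minutes, 20 idle minutes so far); the helper names are hypothetical.

```python
import math

RATE = 1.0 / 15.0   # arrivals per minute (mean gap of 15 minutes)
ELAPSED = 20.0      # idle minutes so far

def density(t):
    """Height of the original Exponential curve at time t."""
    return RATE * math.exp(-RATE * t)

def tail_area(t):
    """Area under the curve to the right of t."""
    return math.exp(-RATE * t)

def rescaled_density(s):
    """Height of the conditioned curve s minutes after the t=20 mark."""
    return density(ELAPSED + s) / tail_area(ELAPSED)

print(f"area to the right of t=20: {tail_area(ELAPSED):.4f}")
for s in (0, 5, 15, 60):
    # The two columns agree at every point we try.
    print(s, round(density(s), 6), round(rescaled_density(s), 6))
```

Dividing by the tail area is the "rescale to 1" step, and shifting the argument by `ELAPSED` is the relabeling of the x-axis.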

It's not a precise analogy, but just as the same pattern keeps repeating itself in a fractal no matter how much you magnify the original, so the same exact Exponential curve we started with keeps appearing again and again upon re-scaling no matter how much time has elapsed. It does not matter if 10, 30, 100 or 2000 minutes have passed without an arrival; the probability that an arrival will happen in the next five minutes will always be 0.283. This makes mathematical calculations very straightforward -- the past does not need to be kept track of, and the same formulas can be used at any stage.

There is something about how the curve decays or dips, more specifically the rate at which it decays, which gives the Exponential this unique property among continuous probability distributions. In fact, if you know that time durations follow the Memoryless property, you can work backwards and prove that the original probability curve has to be Exponential.

As a contrast, other well known continuous distributions, say the Normal or Lognormal, do not have the Memoryless property. The Lognormal distribution is a more relevant comparison since, like the Exponential, it allows only values greater than 0, unlike the Normal, which allows negative values and is therefore not always appropriate for modeling time durations. In the Lognormal and Normal, the probability of a future event does depend on how much time has passed. This means that you have to keep track of the past when you calculate the possibility of a future event -- and this quickly gets very cumbersome and computationally expensive.

For a further contrast, I've created a couple of roughly equivalent images, Figure 1 and Figure 2, for the continuous uniform distribution -- a relatively simple, bounded distribution with a flat curve. Here we see that the probability of an event happening in the next five minutes was originally 0.1667; after twenty minutes, it went up to 0.5.  
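The two uniform numbers above can be reproduced with a short sketch. The bounds are my assumption: a uniform distribution on 0 to 30 minutes is the one consistent with 0.1667 (5/30) and 0.5 (5/10), but the figures themselves are not reproduced here.

```python
UPPER = 30.0  # assumed upper bound in minutes, inferred from 5/30 = 0.1667

def uniform_prob_arrival_in_next(window, elapsed=0.0):
    """P(elapsed < T <= elapsed + window, given T > elapsed), T uniform on [0, UPPER]."""
    remaining = UPPER - elapsed
    if remaining <= 0:
        raise ValueError("elapsed time is outside the distribution's range")
    # Only the part of the window that fits inside the remaining range counts.
    return min(window, remaining) / remaining

print(round(uniform_prob_arrival_in_next(5, 0), 4))    # 0.1667
print(round(uniform_prob_arrival_in_next(5, 20), 4))   # 0.5
```

Here the past matters: as the remaining range shrinks, the same five-minute window takes up a larger share of it.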

Among discrete distributions, the Geometric distribution has the Memoryless property.
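A small sketch of the discrete case, assuming the Geometric counts independent trials until the first success. The success probability of 0.3 is a made-up value for illustration only.

```python
P_SUCCESS = 0.3  # hypothetical per-trial success probability

def prob_no_success_in(n):
    """P(X > n): the first n trials are all failures."""
    return (1 - P_SUCCESS) ** n

def prob_success_within(n, failures_so_far=0):
    """P(success in the next n trials, given that the earlier trials all failed)."""
    survived = prob_no_success_in(failures_so_far)
    return (survived - prob_no_success_in(failures_so_far + n)) / survived

# The chance of a success in the next 3 trials, after 0, 4 and 10 failures.
for m in (0, 4, 10):
    print(m, round(prob_success_within(3, m), 4))
```

As with the Exponential, the answer does not change with the number of failures already seen.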

3. Lifetime of a Device

I'd like to end the piece by raising a couple of questions. Probability textbooks routinely mention that the Exponential distribution can be used to model the lifetime of a device: time from when the device is put into operation to its failure. Here the Memoryless property seems puzzling to me.

If a device has worked for 3000 hours, the probability that it will work for another 1000 hours is exactly the same as when the device started operating. I find that quite amazing. Such a property is possible only if the failure of the device has nothing to do with wear and tear caused by the passage of time. Otherwise, the longer the device works, the more likely it is to fail in the next time interval -- just as at the age of 70, the probability that we will die in the next 10 years is much higher than the same probability calculated at the age of 40. From what I've read, the lifetime of semiconductor components follows exponential time to failure distributions. But then how is it that these devices escape wear and tear over time? And are there other examples?
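To make the puzzle concrete, here is a hedged sketch comparing a memoryless lifetime with one that wears out. The mean lifetime of 5000 hours is hypothetical, and the Weibull curve with shape 2 is simply one common way to model aging that I am using for contrast; neither comes from data on real devices.

```python
import math

MEAN_LIFE = 5000.0  # hypothetical mean lifetime in hours (not from any real device)

def exponential_survival(t):
    """P(lifetime > t) when failures are memoryless."""
    return math.exp(-t / MEAN_LIFE)

def wearing_out_survival(t, shape=2.0, scale=5000.0):
    """P(lifetime > t) for a Weibull model; shape > 1 means failure risk rises with age."""
    return math.exp(-((t / scale) ** shape))

def prob_lasts_another(survival, extra_hours, age_hours):
    """P(lifetime > age + extra, given lifetime > age)."""
    return survival(age_hours + extra_hours) / survival(age_hours)

for age in (0, 3000):
    memoryless = prob_lasts_another(exponential_survival, 1000, age)
    wearing = prob_lasts_another(wearing_out_survival, 1000, age)
    print(f"age {age:>4} h: memoryless {memoryless:.3f}, wearing out {wearing:.3f}")
```

In the memoryless column the chance of lasting another 1000 hours is identical for a new device and a 3000-hour-old one; in the wearing-out column it drops with age, which is the behavior we intuitively expect.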

Tuesday, April 14, 2015

The Traveling Moose # 6

I was reminded of the traveling salesman problem (TSP) in an unlikely context, thanks to a display at the New York State Museum in Albany. The display showed the zigzagging journeys made not by a salesman, but by a radio-collared moose, simply called #6. The moose seems to have traveled a lot, but the explanatory note on the display said otherwise: "Most New York moose have settled into limited travel routines...From February 1998 to November 2000 radio-collared moose #6 ranged over 350 square kilometers (217 square miles) east of Great Sacandaga Lake."

Tuesday, February 17, 2015

A mobile surgical unit in Ecuador

Latest 3 Quarks Daily column is about my healthcare-themed trip to Cuenca, Ecuador last October. Full essay is here. This is how it begins: 
Since 1994, a small team of clinicians has been bringing elective surgeries to Ecuador's remotest towns or villages, places that do not have hospitals in close proximity. From the city of Cuenca – Ecuador's third largest town, where they are based – the team drives a surgical truck to a distant village or town. Though Ecuador is a small country by area, the barrier of the Andes slices it into three distinct geographic regions: the Pacific coast in the west; the mountainous spine that runs through the middle; and the tremendously bio-diverse but also oil rich jungle expanse to the east, El Oriente, home to many indigenous tribes. Apart from a few major cities – Quito, Guayaquil, Cuenca – towns and villages tend to be small and remote.
Isuzu Truck 2

Each year the team goes on 12 surgical missions, roughly one per month. A trip lasts around 4 days: a day's drive to get to the place; 2 days to conduct 20-30 surgeries (sometimes more, sometimes less); and then a day to return. Patients pay a nominal/reduced fee if they can: the surgeries are done irrespective of the patient's ability to pay. The clinicians belong to a foundation called Cinterandes (Centro Interandino de Desarrollo – Center for Inter-Andean Development).
Amazingly, the very same Isuzu truck (see above) has been in use for more than 850 missions and has seen 7458 surgeries from 1994-2014! The truck itself is not very large; in fact, it cannot be, because it has to reach places that do not have good roads. The mobile surgery program has the lowest rates of infection in the country. Not a single patient has been lost. The cases to be operated on have to be carefully chosen. Because of the lack of major facilities nearby, only surgeries with a low risk of complication can be done. Hernias and removal of superficial tumors are the most common. Hernias can be debilitating, yet patients may simply choose to live with them for many years rather than visit a far-off urban hospital. For many, leaving work for a few days and traveling to get a health problem fixed is not an option.

A sky full of monarchs

A sky full of the famous monarch butterflies in Michoacan, Mexico. It's hard to capture this special phenomenon -- millions of butterflies congregating, after a 2000 mile journey from Canada and the northern US, in a few fir/oyamel forests in Central Mexico -- on camera. This picture I took last week is not very good, but every black speck in the sky, however faint, is a butterfly. This year's monarch numbers seem to be better than last year's, although still well below the average across two decades.

Monday, January 19, 2015

Birds seen this winter

It's hard to spot new birds during Massachusetts winters (I don't own a house with a yard or a bird feeder, which makes it doubly hard). The hundreds of species that make their home or pass through here are more easily observed in spring, summer and early fall. But last Tuesday – a bone chillingly cold but sunny day in Amherst – I ran into four species all at once. I had come out for a walk in a quiet part of town, a dead end street where an unpaved hiking trail leads to a pond. The unusually high levels of noise in the trees suggested that a lot of birds were active. The repeated deep thuds I was hearing indicated that woodpeckers were around, hammering on tree trunks.


So here are the species that I spotted, from left to right (picture assembled from Wikipedia images): the eastern bluebird; the black-capped chickadee; the female downy woodpecker (the male has slight red marks on the head); and the red-bellied woodpecker, misleadingly named because the prominent red or orange patch is actually on the bird's curved head. The chickadee is the smallest of the four, and the red-bellied woodpecker the largest. Overall, nothing really surprising here – these are all common winter birds. But as an amateur bird watcher, I felt a special joy stumbling upon them; it felt, at least in those few moments, as if some special secret of nature had been unexpectedly revealed.

Some other things I've noticed this winter: (1) starlings, dozens of them somersaulting gracefully in the air in unison, literally a dance to avoid death, an attempt to disorient hawks that are hunting them (something similar to what's happening in this video; on a different note, the 150 million starlings in North America today are descended from the 60-odd European starlings that were deliberately introduced to New York's Central Park in 1890 by "a small group of people with a passion to introduce all of the animals mentioned in the works of William Shakespeare" -- talk about literature influencing ecology!); (2) young wild turkeys, moving black specks from a distance, foraging in a snow covered meadow (here's a previous piece on wild turkey); and (3) a few weeks ago, at twilight, the mysterious, round faced barred owl, the only owl I've ever seen, well camouflaged against the bark of a tree, very similar to this picture.

(This piece was cross-posted here.)

Tuesday, December 30, 2014

The resettlement of refugee farmers in East Punjab after Partition

A different version of this piece was published back in 2009, in the OR/MS Today magazine.

It is well known that the partition of the Indian subcontinent in 1947 had terrible consequences. Tens of thousands of people died. Millions were displaced and lost their cherished ancestral homes: Muslims left India for Pakistan, and Hindus and Sikhs left Pakistan for India. It was the greatest mass migration in history. But what is less understood is the manner in which the vast numbers of refugees were accommodated and settled into the newly divided regions. The greatest mass migration in history inevitably became the largest resettlement operation in the world.

How was this monumental task achieved? That question might take up many books, and perhaps many have already been written. But we get a glimpse of how it was done on the Indian side of the Punjab (East Punjab) in Refugees and the Republic, a chapter in Ramachandra Guha’s post-independence historical narrative India after Gandhi.

Guha’s chapter appealed to me for a different reason. My specialization is the field of operations research: the quantitative or optimization methods that are now used widely in the attempt to make service systems more efficient. The resettlement or land allocation problem set up by Guha in the chapter seemed to fall squarely in the realm of optimization. I was curious to know how the reallocation of land had been carried out in practice.  

Punjab was one of the partitioned provinces; the eastern part found itself in India while the western in Pakistan. A large number of Muslims had left East Punjab for Pakistan. But there was an even greater influx of Hindus and Sikhs into the east from Pakistan. Most of these refugees were farmers. Together they had abandoned 2.7 million hectares of land in West Punjab, but across the border in India, where they now had to make a living, only 1.9 million hectares had been left behind by Muslim farmers who had fled the opposite way. The problem was made more complex by three additional factors:
  • Each refugee family had a claim on how much they had owned prior to emigrating.
  • The fertility of the land differed; there were dry, unirrigated districts as well as lush, irrigated regions.
  • There were demands that families and neighbors be relocated in the same way as they had been in West Punjab. If possible entire village communities had to be recreated.
From the comfort of hindsight – and given how far computing power has grown in the last 6-7 decades – I can imagine formulating the land allocation problem as a large scale mathematical optimization model. The decisions (the x variables in the problem, which the model would be solved for) would be how much land to allocate to each family and where to allocate. The objective, expressed as a function of the decisions, would be to minimize the difference between claims and actual allocations. The constraints would be equations that expressed limits such as the total land allocated could not exceed the total land available. There could be other constraints to ensure that families and neighbors are relocated together. The mathematical model would also have to capture the spatial aspect, the variation in fertility and relate them somehow to the allocation decisions.
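One way the model described above might be written down is sketched here. This is only my own illustrative formulation, and every symbol is an assumption: x_{ij} is the land given to family i in district j, c_i is the family's verified claim (in fertility-adjusted units), A_j is the land available in district j, f_j is a fertility weight for district j, and d_i is the shortfall between claim and allotment.

```latex
\begin{aligned}
\min_{x,\,d}\quad & \sum_{i} d_i \\
\text{subject to}\quad
& d_i \;\ge\; c_i - \sum_{j} f_j\, x_{ij} && \text{for every family } i, \\
& \sum_{i} x_{ij} \;\le\; A_j && \text{for every district } j, \\
& x_{ij} \ge 0, \quad d_i \ge 0 && \text{for all } i, j.
\end{aligned}
```

This version leaves out the hardest part, keeping families and neighbors together, which would probably need yes/no (integer) variables for assigning whole groups to villages and would make the problem far harder to solve.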

Even by today’s standards this is a very difficult optimization problem; it also has human or qualitative dimensions that are difficult to capture mathematically. The unenviable task of reallocating land fell upon the Indian government and its civil service workers. As a first step, they assigned each family of refugee farmers 4 hectares irrespective of its past holding; they also gave loans to buy seed and equipment. Viewed from an optimization lens, this 4-hectare allotment is an “initial solution” – a feasible assignment to the decisions (the x’s) to get things going, but very far from optimal.

As families began to sustain themselves, applications were invited for them to claim more land, depending on what they had owned in West Punjab. Within a month, there were 500,000 claims. These claims were then “verified in open assemblies consisting of other migrants from the same village. As each claim was read out by a government official, the assembly approved, amended or rejected it.” Refugees tended to exaggerate of course, but were deterred by the open assembly method; if a claim turned out to be false they were punished by a reduction in land.

Sardar Tarlok Singh of the Indian Civil Service and a graduate of the London School of Economics led the rehabilitation operation. He used two simple but interesting rules (we call them heuristics) for allocating land, and this is where the pragmatism in the whole operation comes most clearly to light. Though claims had been filed, because of the reduced acreage, none of the refugees could be assigned as much land as they'd originally owned. Everybody’s claim had to be reduced by a certain percentage. Plus, there had to be some way of accounting for the differing fertility of land.

Sardar Tarlok Singh came up with two measures, the standard acre and the graded cut, which dealt with these issues: 

“A standard acre was defined as that amount of land which could yield ten to eleven maunds of rice. (A maund is about 40 kilograms.) In the dry, unirrigated districts of the east, four physical acres were equivalent to one standard acre; but in the lush “canal colonies” [where irrigation was strong], one physical acre was about equal to one standard acre. The innovative concept of the standard acre took care of the variations in soil and climate across the province.

The idea of the graded cut, meanwhile, helped overcome the large discrepancy between the land left behind by the refugees and the land now available to them – a gap that was close to a million acres. For the first ten acres of any claim, a cut of 25% was implemented – thus one got only 7.5 acres instead of ten. For higher claims the cuts were steeper: 30% between ten and 30 acres, and on upward, so that those having more than 500 acres were taxed at the rate of 95%.”

With this rule, there clearly were losers, and the losers, of course, were those who had once owned huge tracts of land: “The biggest single loser was a woman named Vidyawati who had inherited (and lost) her husband’s estate of 11,500 acres spread across thirty-five villages of the Gujranwala and Sialkot districts. In compensation she was allotted a mere 835 acres in a single village of Karnal.”
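The two heuristics quoted above can be sketched in a few lines of Python. This is a rough illustration, not a reconstruction of the actual procedure: it uses only the two brackets the quoted passage spells out (25% on the first ten acres, 30% between ten and 30), so it deliberately refuses claims above 30 acres, and the function names are my own.

```python
# Brackets given in the quoted passage: (upper limit in acres, cut fraction).
# The brackets above 30 acres are not spelled out there, so they are left out.
KNOWN_BRACKETS = [(10.0, 0.25), (30.0, 0.30)]

def allotment_after_graded_cut(claim_acres, brackets=KNOWN_BRACKETS):
    """Apply each bracket's cut to the slice of the claim that falls inside it."""
    top = brackets[-1][0]
    if claim_acres < 0 or claim_acres > top:
        raise ValueError(f"only claims between 0 and {top:g} acres are covered")
    allotted = 0.0
    lower = 0.0
    for upper, cut in brackets:
        portion = max(0.0, min(claim_acres, upper) - lower)
        allotted += portion * (1.0 - cut)
        lower = upper
    return allotted

def standard_acres(physical_acres, physical_per_standard):
    """Convert physical acres to standard acres (4 per standard acre in dry districts, 1 in canal colonies)."""
    return physical_acres / physical_per_standard

print(allotment_after_graded_cut(10))   # 7.5, matching the quoted example
print(standard_acres(4, 4))             # 1.0 standard acre of dry land
```

Like a progressive income tax, the cut applies slice by slice, so a larger claim always yields a larger allotment even though the percentage lost keeps rising.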

It’s unclear what analysis motivated Tarlok Singh to come up with the specific ranges and the cuts. It’s possible that the taxing mechanism might have left too much land unassigned, and many claimants dissatisfied. Or, given that the overall reduction in total land was about 38 percent (the farmers had left 2.7 million hectares behind and now were being resettled on 1.7 million hectares), the taxing may not have been strict enough. The exact details are unknown.

What is known, though, is that by November 1949, a year and a half after the resettlement began, “Tarlok Singh had made 250,000 allotments distributed equitably across the districts of East Punjab”. Even the soft constraints, such as settling families and neighbors together, were met to a large extent, though “the recreation of entire village communities proved impossible.” The resettlements were so successful that “by 1950, a depopulated countryside was alive once again.” Tarlok’s heuristics might have been simple, but they helped solve a complex, large-scale allotment problem in the aftermath of a traumatic event.

Halfway across the world, in the summer of 1947 — the same year that the partition of the Indian subcontinent took place — George Dantzig conceived the famous Simplex Algorithm. In the next decades the Simplex Algorithm, helped by advances in computing capability, would provide fast solutions to large linear optimization problems. Each fall I teach the algorithm to graduate students from many disciplines. But in 1947, the Simplex algorithm could only tackle a small version of the classical diet problem: optimally choosing food items for a family to minimize costs while ensuring minimum nutritional needs are met. That particular diet problem had only 9 constraints and 77 variables, but took 120 man-days to solve; worksheets were glued together and spread out like a tablecloth to assist in determining the optimum solution [1].

This puts the enormity of the resettlement problem in perspective. With half a million people making claims – which means there would be hundreds of thousands of variables and constraints – even an awareness of Dantzig’s algorithm could not have helped the Indian Civil Service.

Nearly 7,000 officials were needed for the resettlement effort; they constituted a refugee city of their own. The problem occupied them for a period of three years. Imagine the paperwork and the records that had to be kept and retrieved; imagine the disputes among the refugees, the flared tempers and the jealousies. But imagine also the perseverance of everyone involved. It made me think about what it is exactly that makes certain large scale operations successful and others not. It’s hard to make generalizations, but one feature seems to be effective coordination across large groups of people: largely error-free lunch deliveries by the Dabbawalas of Mumbai is an example that comes to mind.

Related notes:

1. Guha ends the chapter in his book poignantly. The resettlement, Guha says, may have been successful, but the general sense of loss could not be undone. The migrating Sikhs had left behind a beloved place of worship, Nankana Sahib, the birthplace of the founder of their faith, Guru Nanak. Muslims migrating from East Punjab too had left behind the town of Qadian, the center of the Ahmadiya sect of Islam; the Ahmadiya mosque was visible for miles around. Very few Muslims now lived in Qadian, which was full of Hindu and Sikh refugees. Guha quotes the editor of the Calcutta newspaper Statesman, who wrote that in both Qadian and Nankana Sahib there was “the conspicuous dearth of daily worshippers, the aching emptiness, the sense of waiting, of hope and…of faith fortified by humbling affliction.”

2. The picture shows a boy at a Delhi refugee camp in 1947. Here is the source. The largest refugee camp, though, was at Kurukshetra, consisting of nearly 300,000 people. For their entertainment, film projectors were brought in and Disney specials featuring Mickey Mouse and Donald Duck were screened at night. It was, as one social worker described it, a “two-hour break from reality”. 

1. Bazaraa, M., Jarvis, J., and Sherali, H., 2005, Linear Programming and Network Flows, Wiley 3rd Edition

2. All quoted parts in the piece are from Ramachandra Guha’s India After Gandhi