Thursday, November 22, 2012

My PhD dissertation summary

Today I submitted my PhD dissertation for examination. The title of my dissertation is: ‘Measuring and Influencing Sequential Joint Agent Behaviours.’

The essential thesis of my research is that:

Algorithmically designed reward functions can influence groups of learning agents toward measurable desired sequential joint behaviours. 

The thesis is demonstrated with research explaining how to measure a particular sequential joint behaviour (turn-taking), how to identify rewards that are conducive (or prohibitive) to turn-taking by learning agents in a simulated context, and how to design rewards that incentivise arbitrary sequential joint behaviours in multi-agent stochastic games.

Informally, the thesis is about activities performed together through time by a group of agents that figure out how to do things better as they go. An agent could be a person, a robot or a computer program. We mathematically explain how to get the overall outcomes we want by telling the agents what they should individually want. Because we do this mathematically, we need to measure the things we want our group of agents to do. This dissertation explains some new ideas about how we can measure how well a group of agents is taking turns, how we can guess whether or not pairs of a certain kind of robot-like computer programs will take turns, and how we can tell individual agents what they should want so that they collectively end up doing something that we want, for some situations.

My dissertation includes most of two journal papers that I published, plus other material that I’m planning to submit as another journal paper.

One of the things I studied was simulated agents communicating and learning from rewards.

Friday, November 16, 2012

Nash and Bourne

Violent intrigue can be analysed mathematically
Recently, I finished reading "The Bourne Ultimatum" by Robert Ludlum, an energetic, intriguing tale of violent, manipulative men head-to-head in a struggle of life and death. My definite impression was that Robert Ludlum's Bourne series is a realistic portrayal of the total opposition that can be mathematically modelled as a 'Nash equilibrium.' Read on to get the details.

Many of you will know of Jason Bourne from the series of movies in which Matt Damon plays the amnesic protagonist. However, the book series has a different plot and a different mood. In both cases, Jason Bourne is an intelligent man of violence who outguns and outwits his opponents in deadly struggles.

Matt Damon plays Jason Bourne
At the same time as I read the Bourne books, I was studying mathematical ways to understand groups of opposing agents. The concept of a 'Nash equilibrium' is one mathematical way to understand the theory behind violent strategies. A Nash equilibrium is a situation where each person is acting so as to maximise his or her own good, given the actions of everyone else. A Nash equilibrium can be a good thing, where everyone is helping each other, or a Nash equilibrium can be a bad thing, where each person is at the others' throats. In the case of Jason Bourne, the Nash equilibrium is always one where the two masters of intrigue are trying to kill each other. There can only be one winner in the Bourne series.
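
To make the definition concrete, here is a minimal Python sketch that checks every pair of actions in a small two-player game and keeps the pairs where neither player can do better by changing only their own action. The function name and the 'help or fight' payoff numbers are made up for illustration; they are not taken from the books or from my research. With these assumed payoffs, the game has both a 'good' equilibrium (both help) and a 'bad' one (both fight).

```python
from itertools import product


def pure_nash_equilibria(payoffs_a, payoffs_b):
    """Return all (row, col) action pairs where neither player gains by deviating alone.

    payoffs_a[r][c] is the row player's payoff and payoffs_b[r][c] is the
    column player's payoff when row plays r and column plays c.
    """
    n_rows, n_cols = len(payoffs_a), len(payoffs_a[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player cannot improve by switching rows while column stays at c.
        row_ok = all(payoffs_a[r][c] >= payoffs_a[alt][c] for alt in range(n_rows))
        # Column player cannot improve by switching columns while row stays at r.
        col_ok = all(payoffs_b[r][c] >= payoffs_b[r][alt] for alt in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical payoffs (assumption): action 0 = help, action 1 = fight.
# Both helping pays 3 each, both fighting pays 1 each, and fighting a helper
# pays 2 to the fighter and 0 to the helper.
help_fight_a = [[3, 0], [2, 1]]
help_fight_b = [[3, 2], [0, 1]]

print(pure_nash_equilibria(help_fight_a, help_fight_b))  # [(0, 0), (1, 1)]
```

Both listed outcomes are stable in the Nash sense, even though one is clearly better for everyone than the other.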

Without giving too much of the books away, Jason Bourne opposes Carlos the Jackal, the leading international assassin. One will set a trap for the other, the other will 'reverse' the trap, and the first narrowly escapes. Neither man's appearance is clearly known by the other and they both enlist pawns to fight against each other. Jason Bourne threatens, bribes and takes every possible extreme measure in order to defeat the Jackal. If you have only seen the movies, then consider how Bourne tricks and outwits his opponents for his own ends.
John Nash was a revolutionary mathematician
The overall impression I got from the books was 'Oh, this is what a purely competitive Nash equilibrium really looks like.' The winner is not the strongest, the fastest or the man with the best weapons, but the man who thinks the extra step ahead. If you can foresee what your opponent will do, then you can defeat him. But your opponent will try to foresee what you will do. So you must think N+1 steps ahead. In fact, to win, you must be unpredictable. The mathematical solution of a purely competitive game is often to randomise over your possible actions. In practice, you become an unstable psychopath who commits apparently arbitrary acts of violence without any discernible pattern. I read a lot of action books, but most of the bad guys are stupid or have some other drastic failings. Jason Bourne's opponents seem much more closely matched.
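
The value of being unpredictable can be sketched with 'matching pennies', a standard textbook example of a purely competitive game (my choice of example, not something from the books). The short Python sketch below computes the payoff a player can guarantee when the opponent foresees their strategy and responds as harshly as possible. The function name `guaranteed_value` is a hypothetical helper introduced for illustration.

```python
def guaranteed_value(payoffs, p_row):
    """Expected payoff the row player secures if the opponent knows p_row.

    payoffs[i][j] is the row player's payoff in a zero-sum game, and p_row[i]
    is the probability that the row player picks action i. The opponent is
    assumed to pick the column that is worst for the row player.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    column_values = [
        sum(p_row[i] * payoffs[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    return min(column_values)


# Matching pennies: the row player wins 1 if the two choices match, else loses 1.
MATCHING_PENNIES = [[1, -1], [-1, 1]]

print(guaranteed_value(MATCHING_PENNIES, [1.0, 0.0]))  # -1.0: fully predictable, always beaten
print(guaranteed_value(MATCHING_PENNIES, [0.8, 0.2]))  # negative: a lean can still be exploited
print(guaranteed_value(MATCHING_PENNIES, [0.5, 0.5]))  # 0.0: nothing left to exploit
```

In this game, only the even coin flip leaves a far-sighted opponent with nothing to gain from thinking one step further ahead.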

Life can approach the theoretical abstraction of a Nash equilibrium, but game theoretic methods often provide exact answers to slightly the wrong questions. This can make game theory blindingly addictive to some, as Venkatesh Rao observes. In the areas of game theory that I've studied, researchers often assume that the players have a finite number of actions available to them. However, in practice, the number of devious things that you can do to someone else is limited only by your imagination. Game theory can give us some intuition, but it's probably best to let Robert Ludlum fill in the details. (Actually, stop that thought process right there, just in case you think of a new way of inflicting harm.)

In "How the Mind Works," Steven Pinker gives some good explanations of why people are emotional and do crazy, violent things that seem irrational from some perspectives. In "The Better Angels of Our Nature: Why Violence Has Declined," Pinker explains why people have become much less violent with time. Contrary to the constant whine of the alarmists, not all our morals are bad and getting worse. Part of the reason for the decline in violence is a change in the prevailing Nash equilibrium. We are now more incentivised to be peaceful and non-violent, which is good for all of us. Centrally administered justice helps us all be more charitable. The real-life Carlos the Jackal is serving time in a French prison and Jason Bourne is best approximated by Matt Damon, who co-founded a charity to help the poor get better access to water.

Sunday, November 4, 2012

How to be an Awesome Postgrad Student

Here are a few tips on getting through a PhD or a Master’s, based on my experience doing a PhD in Electrical Engineering at the University of Canterbury. I’ve had a good time so far, and I hope that you can have the best experience possible. This document is based on ideas that I gathered from other written works on how to be an awesome postgrad, my supervisors, friends, parents, wife and various talks that I attended as a postgrad. I hope you can translate any discipline-specific details into your own field. Your PhD experience will probably be unlike mine, especially if you are not doing electrical engineering. Get advice from wise scholars in your discipline on how to be an awesome postgrad student in your context.