Journal Updates
As part of the Canadian Distributed Mentorship Program, here is where I will be posting my journal entries relating to the work done.
I | II | III | IV | V | VI | VII | VIII | IX | X | XI | XII | XIII | XIV | XV | XVI | XVII | XVIII
Week Three: May 14 - May 19
"Busy, busy, busy"!
Universite du Quebec a Montreal (UQAM, plus or minus some accents) held the annual Association for Symbolic Logic meeting.
Like a true dork, I spent Wednesday through Friday mornings there, listening to brilliant people talk about topics that are bound to come up in crossword puzzles: quantum computation, sequents, observational integration theory, and Zermelo's axiom of choice. Originally I had planned to take viciously clear notes, to be read at home, just to "make sure I was still on the ball". That plan was abandoned six minutes into the first lecture of the morning, when tensors charmingly took the stage; I couldn't understand what muscles had to do with intuitionistic logic, and I was so lost I wasn't even sure there was a ball to be on anymore.
My favourite lecture, by far, was Elisabeth Bouscaren's, on "Finitely axiomatizable theories, old and new questions". It was the kind of thing that, although you understand very little of it, absorbs you instantly. Like ice cream.
On the work side of the cube, I worked on the MountainCar environment for the reinforcement learning competition and benchmarking event. I had to modify Barto and Sutton's original problem, where an underpowered car has to be driven up a steep mountain road, and add a sensorimotor delay: rather than receiving the actual state of the environment, the agent receives a delayed one, from k seconds earlier, which makes the learning problem slightly hairier.
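To give a rough feel for the delay mechanism, here is a minimal sketch of a delayed-observation wrapper. This is not the competition's actual code: the Gym-style reset/step interface, the class name, and measuring the delay k in time steps rather than seconds are all assumptions of mine.

```python
from collections import deque

class DelayedObservationWrapper:
    """Wrap an environment so the agent sees the state from k steps ago.

    Hypothetical sketch: assumes env.reset() returns a state and
    env.step(action) returns (state, reward, done).
    """

    def __init__(self, env, k):
        self.env = env
        self.k = k
        self.buffer = deque()

    def reset(self):
        state = self.env.reset()
        # Until k real steps have elapsed, keep showing the initial state.
        self.buffer = deque([state] * (self.k + 1), maxlen=self.k + 1)
        return self.buffer[0]

    def step(self, action):
        state, reward, done = self.env.step(action)
        self.buffer.append(state)            # newest state enters on the right
        return self.buffer[0], reward, done  # oldest (k steps stale) goes to the agent
```

With k = 0 the wrapper is transparent; larger k means the car's position and velocity arrive increasingly stale, while the reward still reflects what is happening right now.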
I've also started reading two new books: Ethem Alpaydin's Introduction to Machine Learning, and Neuro-Dynamic Programming by Bertsekas and Tsitsiklis. Before I can read about decision trees, I need some prerequisite knowledge of basic supervised learning concepts, such as classes and models. I'm on it like white on rice.
And the week's update is still not over! On Thursday, Michael James from the Toyota Research Labs came by to talk about Predictive State Representations (PSRs). Having read Littman and Sutton's paper before, I actually understood all of it. This was a pleasant surprise, as our lab meetings are often about topics that build on a massive amount of information I know nothing about, but am trying to learn as fast as possible. Since I can talk about this one, I will.

Take a dynamical system represented by a set of states, actions and observations, where the state space is not directly observable (i.e. after performing an action, the agent never quite knows where in the world it is). Usually, such a system is modelled by a Partially Observable Markov Decision Process (POMDP), where the current state of the system is represented by a belief state: a probability distribution over the hidden states. The problem is that POMDPs assume a perfect dynamics model, and then try to estimate the state based on it. No model, and there's something fishy in the state of Denmark. A PSR, however, is a vector of predictions for a specially selected set of action-observation sequences, called tests. They have their own problems, but more on those at a later time...
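For contrast, here is roughly what the POMDP side of that story looks like in code. The array names and shapes are my own illustrative choices, not anything from the PSR paper; the point is just that one belief update folds the assumed dynamics model and the latest action-observation pair into a new distribution over hidden states.

```python
import numpy as np

def belief_update(b, T, O, a, o):
    """One step of the standard POMDP belief update:
    b'(s') is proportional to O[a][s', o] * sum_s T[a][s, s'] * b(s).

    b: current belief over hidden states, shape (S,)
    T: transition model, T[a][s, s'] = P(s' | s, a)
    O: observation model, O[a][s', o] = P(o | s', a)
    """
    b_next = O[a][:, o] * (b @ T[a])  # predict through the model, then condition on o
    return b_next / b_next.sum()      # renormalize to a proper distribution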
"Busy, busy, busy, is what we Bokononists whisper whenever we think of how complicated and unpredictable the machinery of life really is." - Kurt Vonnegut, Cat's Cradle.