Journal Updates
As part of the Canadian Distributed Mentorship Program, here is where I will be posting my journal entries relating to the work done.
Week Sixteen: August 14 - August 18
Early in the week I decided I'd had it with the Rupert bug. So, armed with a litre of coffee, courtesy of my friend Mike from upstairs, I rewrote the entire thing in what I affectionately call an "all-nighter". Well, rewrite is a bit of a misnomer: I kept some of the main pieces. But apart from that, I took the code, and Rupert along with it, to town. My own probability distribution classes and operator overloading? Very yes. C++, I like you. You're mean, but I like you.
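For the curious, here's roughly the flavour of the thing. The class name, the operators, and the normalize() helper below are my guesses for the sake of illustration, not the actual code from the rewrite; the point is that multiplying two distributions and renormalizing ends up reading much like the math does.

```cpp
#include <map>
#include <string>

// A minimal sketch of a discrete probability distribution class with
// operator overloading. Names and operations are illustrative guesses.
class Distribution {
    std::map<std::string, double> p;  // outcome -> probability mass
public:
    double& operator[](const std::string& outcome) { return p[outcome]; }

    // Pointwise product of two distributions, handy for Bayesian-style
    // updates; outcomes missing from either side get zero mass.
    Distribution operator*(const Distribution& other) const {
        Distribution result;
        for (const auto& [outcome, mass] : p) {
            auto it = other.p.find(outcome);
            if (it != other.p.end())
                result.p[outcome] = mass * it->second;
        }
        return result;
    }

    // Rescale so the masses sum to one.
    void normalize() {
        double total = 0.0;
        for (const auto& [outcome, mass] : p) total += mass;
        if (total > 0.0)
            for (auto& [outcome, mass] : p) mass /= total;
    }
};
```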
Which was just in time, because on Tuesday, Susanna Still arrived in town, and we got to talk about the code, the algorithm, and what results I should expect. We decided that, in order to compare our results to those from the PSR literature, we should fix the action-picking policy to be uniformly random. The code likes this a lot: the main problem I had encountered was that the P(a) distribution would never converge within the j-iterations needed to solve the system of equations, and a fixed uniform policy sidesteps that convergence problem entirely.
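The fixed policy is about as simple as policies get: P(a) = 1/|A| for every action, so there is nothing left to converge. A tiny sketch, with illustrative names of my own choosing:

```cpp
#include <random>
#include <vector>

// Uniformly random action selection: every action is equally likely.
// Assumes a non-empty action set.
int pick_uniform_action(const std::vector<int>& actions, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> dist(0, actions.size() - 1);
    return actions[dist(rng)];
}
```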
First, Susanna and I wanted to see whether the pasts were getting approximated correctly. In other words, whether the internal representation is a sufficient statistic for the available information. Below is a graph of P(future|state), in red, vs. P(future|past), in black. The easiest way to check that this is working is to see whether the red lines decently approximate the black ones. And they do!
Resizing the image makes it very pixely, so click here for graph awesomeness!
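If you'd rather put a number on "decently approximates" than eyeball the graph, one natural choice is the total variation distance between the two conditional distributions for each state. The flat vector-of-probabilities representation here is my simplification, not how the actual code stores things:

```cpp
#include <cmath>
#include <vector>

// Total variation distance between P(future|state) and P(future|past),
// both given as probability vectors over the same set of futures.
// 0 means identical distributions, 1 means disjoint support.
double total_variation(const std::vector<double>& p_given_state,
                       const std::vector<double>& p_given_past) {
    double tv = 0.0;
    for (std::size_t i = 0; i < p_given_state.size(); ++i)
        tv += std::abs(p_given_state[i] - p_given_past[i]);
    return 0.5 * tv;
}
```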
In the middle of the week, we came up with a second version of the algorithm, which calculates P(state|past) in a slightly different way. Thus, for the remainder of the week, I coded in the modification and began to run comparison tests...