Journal Updates
As part of the Canadian Distributed Mentorship Program, here is where I will be posting my journal entries relating to the work done.
Week Nine: June 26 - June 30
Worst week ever. I'm sure Lemony Snicket can take it to town and write the fourteenth installment of "A Series of Unfortunate Events"...
Monday started promising. By lunch, I thought I had made some advances, and had the data structures for the algorithm figured out. I was
on it like snakes on a plane.
For-loops galore, I simulated the past (and reached printf zen...my console looked like a Christmas tree), but it quickly became
obvious that calculating p( future | past, action ) wasn't going to work. At all. Not even if Morgan Freeman asked it nicely. I did figure
out how to compute p( past ), but that wasn't really the issue here. I was storing the past as an array of length T, where each entry
was the action:observation pair observed at the given time. Then, once I simulated a future, I calculated p( future, past, action ) for each entry in the past array,
which is decidedly wrong.
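For the curious, here is a minimal sketch of the difference between conditioning on the whole past and working entry by entry. Everything in it is an assumption made for illustration: the `Step` struct, the tiny two-action, two-observation `MODEL` table, and the idea that an observation depends only on the previous observation and the current action. None of it is the actual algorithm or environment from this week. It only shows that the past enters as one sequence, through the ratio p(past followed by future) / p(past), with all probabilities being those of the observations given the actions taken.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ACTIONS 2
#define NUM_OBS 2
#define START_OBS 0   /* assumed "previous observation" before time 0 */

/* One entry of the past: the action taken and the observation it produced. */
typedef struct {
    int action;
    int observation;
} Step;

/* Hypothetical toy model, not the journal's environment:
 * MODEL[prev][a][o] = chance of observing o after action a,
 * given that the previous observation was prev. Each row sums to 1. */
static const double MODEL[NUM_OBS][NUM_ACTIONS][NUM_OBS] = {
    { {0.9, 0.1},  {0.5, 0.5}   },
    { {0.2, 0.8},  {0.25, 0.75} },
};

/* Chain-rule product over a sequence, starting from *prev_obs and
 * leaving *prev_obs at the last observation seen. */
double sequence_prob(const Step *steps, size_t n, int *prev_obs)
{
    double p = 1.0;
    for (size_t t = 0; t < n; t++) {
        assert(steps[t].action >= 0 && steps[t].action < NUM_ACTIONS);
        assert(steps[t].observation >= 0 && steps[t].observation < NUM_OBS);
        p *= MODEL[*prev_obs][steps[t].action][steps[t].observation];
        *prev_obs = steps[t].observation;
    }
    return p;
}

/* p(future | past) = p(past followed by future) / p(past),
 * with the past treated as one whole sequence. */
double future_given_past(const Step *past, size_t past_len,
                         const Step *future, size_t future_len)
{
    int prev = START_OBS;
    double p_past = sequence_prob(past, past_len, &prev);
    if (p_past <= 0.0) {
        return 0.0;   /* impossible past: conditional undefined, report 0 */
    }
    /* The future continues from wherever the past left off. */
    double p_joint = p_past * sequence_prob(future, future_len, &prev);
    return p_joint / p_past;
}
```

With a one-step past of (action 0, observation 1) and a one-step future of (action 1, observation 0), this toy table gives a conditional of about 0.25. The number itself means nothing; the point is that the past shows up once, as a single sequence.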
No time for sulking: the ICML workshop was here. Tuesday was supposed to be an in-and-out job. Run the environments, run the agents, get results, plot them. So, I ran the environment, I ran the agents...for about 5 seconds, until the cat-and-mouse environment segfaulted.
Segmentation faults are much like the Kübler-Ross five stages of grief: Denial, Anger, Bargaining, Depression, Acceptance. Every computer science undergrad knows this. Actually, if you're careful, late at night in a lab, you can play the "Denial or Anger" game, and figure out how far down the rabbit hole someone is. You see the segmentation fault, and immediately you pretend you didn't see it, that you gave the wrong input data, or that it is the compiler's fault. Every computer science undergrad also knows that all these reasons are statistically wrong, and that re-compiling and re-running the same thing will segfault again. If you're lucky, it happens regularly and predictably. If you're not, it will lurk in the night until you think you've won; then it will strike again, and you'll go down in flames like the best of them. You never, ever win.
I was supposed to let the agent run for 1000 episodes on the cat-and-mouse environment. Consistently, after 100 episodes, things hit the fan.
I slid through Bargaining and Depression with a large ice cream, and moved right on to Acceptance. Debugging. The fourth circle of hell.
Fixing what seems to be a bug breaks something in a completely different place, adding a comment seems to fix things, and
generally, nothing ever makes sense. Sometimes, you run the code in front of someone else, that lurking segfault happens, and everyone thinks
you've lost your mind. The sensible approach at this time is to turn your computer off. There is nothing they can teach you to prepare you for this.
Five hours later, I had found the bug. Tech babble aside, I had under-allocated memory for a message, which then wrote data into other people's yards.
Allow any code to do this for a long enough period of time, and it will segfault. I fixed the bug, had some congratulatory ice cream,
and started running the environment again. For the same 5 seconds, until it segfaulted. Again. Thank Odin, this time it was the
fault of the agent, not mine. I emailed its owner, and moved on to running the mountain car environment...
Which sums up the rest of Tuesday. And of Wednesday. You can't really do much else when benchmarking, because you're hogging the CPU. And when one run of a given test takes 2 hours, you *really* don't want to slow down the CPU. At about 5:30am on Thursday, all the results were done and plotted, MATLAB was my new best friend, and you can see them here.
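Back to Tuesday's under-allocated message for a moment. The journal does not say exactly how the buffer came up short, so what follows is only a sketch of one common way it happens: sizing a text buffer from the number of characters and forgetting the terminating NUL. The `make_message` function and its "action:observation" format are hypothetical, not the workshop's real message protocol. The sketch asks `snprintf` for the required length first and allocates one extra byte.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical message format "action:observation"; an assumption for
 * illustration, not the real protocol used in the benchmarks. */
char *make_message(int action, int observation)
{
    /* With a NULL buffer and size 0, snprintf only reports how many
     * characters the text needs, not counting the terminating NUL. */
    int needed = snprintf(NULL, 0, "%d:%d", action, observation);
    if (needed < 0) {
        return NULL;
    }

    /* The bug class in question: malloc(needed) leaves no room for the
     * NUL, so the final write lands one byte past the allocation. */
    char *msg = malloc((size_t)needed + 1);
    if (msg == NULL) {
        return NULL;
    }

    snprintf(msg, (size_t)needed + 1, "%d:%d", action, observation);
    return msg;   /* the caller is responsible for free(msg) */
}
```

An off-by-one like this can run quietly for a long time before it tramples something that matters, which would fit the "fine for 100 episodes" pattern. Building with `-fsanitize=address` on GCC or Clang is one way to get such a write reported where it happens instead of at some later crash.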
Late Thursday (going to bed at 6:30 takes a leetle bit of recovery) and Friday were back-to-the-code time. Nothing particularly good or bad happened; there was just a general dissatisfaction with my data structures. I found an apparent solution, but it involved a 5-dimensional sparse matrix, which was very, very gross. Writing out past[i][j][k][l][m] makes me want to eat lampshades.
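One possible way to take some of the grossness out of a sparse 5-dimensional array is to store only the entries that exist, keyed by their five indices. The sketch below is an assumption-laden illustration, not the structure actually used here: `Sparse5`, the fixed `CAPACITY`, and the FNV-style hash are all hypothetical choices, and there is no resizing or deletion. Missing entries simply read back as 0.

```c
#include <stddef.h>
#include <stdint.h>

#define CAPACITY 1024   /* hypothetical fixed size; must be a power of two */

typedef struct {
    int idx[5];     /* the five indices i, j, k, l, m */
    double value;
    int used;       /* 0 while the slot is empty */
} Cell;

/* Must start zero-initialised (e.g. a static or {0}-initialised object). */
typedef struct {
    Cell cells[CAPACITY];
    size_t count;   /* number of stored entries */
} Sparse5;

/* FNV-1a style mixing applied to whole ints rather than bytes. */
static size_t hash5(const int idx[5])
{
    uint32_t h = 2166136261u;
    for (int d = 0; d < 5; d++) {
        h ^= (uint32_t)idx[d];
        h *= 16777619u;
    }
    return (size_t)(h & (CAPACITY - 1));
}

static int same_key(const int a[5], const int b[5])
{
    for (int d = 0; d < 5; d++) {
        if (a[d] != b[d]) return 0;
    }
    return 1;
}

/* Linear probing: returns the slot holding idx, or the first empty slot
 * on its probe path, or NULL if the table is completely full. */
static Cell *find_slot(Sparse5 *m, const int idx[5])
{
    size_t start = hash5(idx);
    for (size_t probe = 0; probe < CAPACITY; probe++) {
        Cell *c = &m->cells[(start + probe) & (CAPACITY - 1)];
        if (!c->used || same_key(c->idx, idx)) return c;
    }
    return NULL;
}

/* Store value at (i, j, k, l, n); returns 0 on success, -1 if full. */
int sparse5_set(Sparse5 *m, int i, int j, int k, int l, int n, double value)
{
    const int idx[5] = {i, j, k, l, n};
    Cell *c = find_slot(m, idx);
    if (c == NULL) return -1;
    if (!c->used) {
        for (int d = 0; d < 5; d++) c->idx[d] = idx[d];
        c->used = 1;
        m->count++;
    }
    c->value = value;
    return 0;
}

/* Read the value at (i, j, k, l, n); entries never set read as 0.0. */
double sparse5_get(Sparse5 *m, int i, int j, int k, int l, int n)
{
    const int idx[5] = {i, j, k, l, n};
    Cell *c = find_slot(m, idx);
    return (c != NULL && c->used) ? c->value : 0.0;
}
```

The calling code then reads `sparse5_get(&table, i, j, k, l, m)` instead of `past[i][j][k][l][m]`, and memory grows with the number of entries actually stored rather than with the product of five dimensions.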