«

»

Nov 18 2012

edX 6.00x week 7 – plots and simulations

After completing week 6, where the fundamentals of OOP were brain-dumped in half a dozen short videos, week 7 apparently changed tack. The first lecture sequence briefly introduced the NumPy and SciPy libraries and was followed by a lecture sequence on simulations and random walks.

I say that the course apparently changed tack as although problem set 7 was a simulation task, being successful relied heavily on applying the OOP concepts taught in week 6. I suspect that this will be true for the rest of the course. Although it may have seemed to some students (judging by the reaction on the forums) that spending more time on the introduction to OOP was needed, in reality I suspect that we’re going to get lots more practice as the course progresses.

I must be starting to feel more comfortable with the idea myself, as dealing with the OOP aspects of this week’s problem set didn’t seem to intrude on the main task, which was to create a couple of different classes of robots to clean tiles in a rectangular room using slightly different strategies. The ‘StandardRobot’ used a strategy of randomly changing direction when it was about to hit a wall, with the ‘RandomWalkRobot’ using a strategy of randomly changing direction before every move.

Probably the most interesting part of the task was deciding how best to represent the room and the tiles (cleaned or not). The discussion forums have seen students use two-dimensional lists of clean and dirty tiles, dictionaries, sets, multiple lists … all kinds of inventive solutions. Initially, I chose to record the position of each tile as it was cleaned in an unordered list, iterating through it each time a robot landed on the next tile and adding that tile if it hadn’t already been cleaned. It worked (and got me through the grader) – but it was only when I started playing with the solution on my Raspberry Pi that I realised how awful the representation I’d chosen was. It’s not a good representation as it takes longer to check the list as more tiles are cleaned – at the same time that it becomes less likely for the robot to land on a tile that hasn’t already been cleaned.

An obvious alternative that I’m sure is better would be to reverse the logic of data structure so that the position of the dirty (all) tiles are initially stored  and then removed as they are cleaned. However, in the end I settled for using a list of length room height* room width which simply contained True or False values representing clean and dirty tiles respectively. As each tile is cleaned, the position (y value*room width) + x value is set True (and if it wasn’t already True, the number of cleaned tiles in the room is incremented).

The results of the change were impressive:

Execution times for 1,000 trials, all tiles cleaned, speed 1, using a single StandardRobot on an Intel 2.5GHz x86 Family 6 Model 23 running Windows XP

Original data structure

Revised data structure

5×5 room

4.4 seconds

1.7 seconds

10×10 room

70.6 seconds

8.1 seconds

15×15 room

401.5 seconds

20.2 seconds

20×20 room

1374.9 seconds

39.1 seconds

Execution times for 1,000 trials, all tiles cleaned, speed 1, using a single StandardRobot on a 256Mb Raspberry Pi Model B running Raspbian

Original data structure

Revised data structure

5×5 room

62.4 seconds

24.1 seconds

10×10 room

961.7 seconds

113.3 seconds

15×15 room

 -

287.3 seconds

20×20 room

 -

551.7 seconds

Unfortunately, the automated graders used by 6.00x don’t give marks for style or efficiency(*) so I haven’t bothered to resubmit my answer. I suspect that on the actual 6.00 course run at MIT, marks are awarded for such considerations. They certainly were on my computing degree in the early 1980s. This therefore illustrates one of the limitations inherent in this type of online course. And sometimes even the efficiency with which the result is obtained isn’t a good predictor of elegance. I suspect that the most efficient answer to the 6,9,20 problem from Midterm 1 was the one which simply returned a True or False value for the function depending on whether the value was in a hard-coded list. Clever perhaps, but not a general solution to the three box problem!

(*) Well, provided that the answer you give produces the correct result against the 6.00x test suites in under 30 seconds of execution time I believe.

Talk back to me ...