This evening I did a quick Python examination of historical temperatures in Minneapolis/St Paul. We’ve been having some very warm days this November, and I was wondering how far from the norm they are. So I got data from 1970 until yesterday (didn’t go further because I had some trouble with the API) and
- figured out how to read it correctly as a time series in Python using pandas
- figured out how to select data using a nice selector technique
- and then plotted pictures & figured out stats.
There were some quirks to the date-time format that meant I had to look carefully at the options for read_csv, learning about the parse_dates option and making sure my header and index_col options were also right. Then, because I wanted to pick only the November 8 temperatures from the last 45 years, I needed a different type of data slicing than I’m most used to: I made a date range d and then picked out (d.month == 11) & (d.day == 8). That was a very neat solution (thanks, stackoverflow people!). This resulted in the following distribution:
That’s just max temperatures on November 8 since 1970 binned into 12 bins. 12 bins looks nice. It looks so “normal”!
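For anyone who wants to try the same steps, here is a minimal sketch of the read, select, and bin sequence. It does not use the real data: the CSV is a synthetic stand-in with made-up temperatures, and the column names "date" and "tmax", the end date of 2015-11-08, and the random numbers are all assumptions for illustration. The real file's layout will differ, so the read_csv options would need adjusting.

```python
import io

import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded file: one made-up daily high per day.
# The "date"/"tmax" column names and the end date are assumptions, not the real file.
days = pd.date_range("1970-01-01", "2015-11-08", freq="D")
rng = np.random.default_rng(0)
fake = pd.DataFrame({"date": days, "tmax": rng.normal(46, 13, size=len(days)).round(1)})
csv_text = fake.to_csv(index=False)

# header=0 takes column names from the first row; index_col plus parse_dates
# turns the date column into a DatetimeIndex.
temps = pd.read_csv(io.StringIO(csv_text), header=0, index_col="date", parse_dates=True)

# Boolean mask on the index: keep only the rows dated November 8.
d = temps.index
nov8 = temps.loc[(d.month == 11) & (d.day == 8), "tmax"]

# Count the selected highs in 12 equal-width bins.
counts, edges = np.histogram(nov8, bins=12)

print(len(nov8), "November 8 highs")
print(counts)
```

With matplotlib installed, nov8.plot.hist(bins=12) should draw the same 12 bins as a picture.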
To sum things up, the average high on November 8 is 46.46 degrees, and the standard deviation is 12.8 degrees. Yesterday’s high of 59 or 60 degrees (depends what source you look at) is about one standard deviation above average!
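The "standard deviation above average" claim is easy to check by hand. A quick sketch using only the numbers quoted above (the variable names are mine):

```python
mean_high = 46.46  # average November 8 high, from the post
std_high = 12.8    # standard deviation, from the post

# Distance from the mean, measured in standard deviations, for each reported high.
z_scores = {high: (high - mean_high) / std_high for high in (59, 60)}

for high, z in z_scores.items():
    print(f"{high} degrees is {z:.2f} standard deviations above the mean")
# 59 degrees is 0.98 standard deviations above the mean
# 60 degrees is 1.06 standard deviations above the mean
```

So whichever source you believe, yesterday lands almost exactly one standard deviation out.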
Nothing groundbreaking, but a good exercise. Maybe a fun thing to do with a class at school — but getting the data into a nice format is non-trivial for your average teacher. What to do about that? Hmph.