Minnesota’s weather is “normal”

The bulk of my math-brain today was focused on finishing a write-up of an analysis of Minneapolis-St. Paul daily temperatures. It’s a nice time series that I found at the University of Dayton’s temperature archive, conveniently available as a csv file. Minnesota temperatures look like this:

Daily temperatures at MSP

Mildly sinusoidal. Turns out there’s not much linear trend over this time period, even though in many cities there has been a significant warming trend over the last 15-20 years. (I wonder if this needs more analysis for MSP — I just did some basic trend analysis.) And remarkably, you can do a pretty good Fourier series fit for this data!
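That basic trend check can be as simple as regressing temperature on time. A minimal sketch in R (the file and column names are placeholders, not the Dayton archive's actual layout):

    library(forecast)

    # Placeholder file/column names for the MSP series from the Dayton archive
    msp   <- read.csv("msp_daily_temps.csv")
    temps <- ts(msp$AvgTemp, frequency = 365.25)   # daily data, one-year seasonality

    # Basic linear-trend check: regress temperature on time and inspect the slope
    trend_fit <- tslm(temps ~ trend)
    summary(trend_fit)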

Data with Fourier series forecast at the end

The blue part at the end is a forecast of temperatures from the Fourier series I fit (using the “forecast” package in R).
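For the curious, the fit-and-forecast step looks roughly like this with the forecast package; the number of Fourier terms K and the one-year horizon here are just illustrative, not necessarily what produced the plot above.

    library(forecast)

    msp   <- read.csv("msp_daily_temps.csv")       # placeholder file name, as above
    temps <- ts(msp$AvgTemp, frequency = 365.25)   # daily series with a yearly cycle

    # Harmonic regression: a few sine/cosine pairs as Fourier regressors
    K   <- 2
    fit <- tslm(temps ~ fourier(temps, K = K))

    # Forecast a year ahead by supplying the future Fourier terms
    fc <- forecast(fit, newdata = data.frame(fourier(temps, K = K, h = 365)))
    plot(fc)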

Obviously the temperature data is much fuzzier than the Fourier series: what’s the noise? We can see what it looks like by plotting the residuals:
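In code, that's just the residuals of the fit from the sketch above:

    # Residuals: observed temperature minus the fitted Fourier-series value
    res <- na.omit(residuals(fit))
    plot(res, ylab = "Residual", main = "Residuals of the Fourier fit")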

Residuals plot

Hm. Looks like sometimes our weather is much colder than predicted — what a surprise. How much? Let’s look at a density plot of those residuals:
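Something like this, with a matching normal density overlaid for reference (res as above):

    # Kernel density of the residuals, plus a normal curve with the same mean and sd
    plot(density(res), main = "Density of residuals")
    curve(dnorm(x, mean = mean(res), sd = sd(res)), add = TRUE, lty = 2)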

Density plot of residuals

Dang — that almost looks normal! And hence the title.

Of course it’s not quite normal: this is the land of Lake Wobegon. If all the children are above average, it’s because

QQ plot of standardized residuals from rugarch

we’re close to normal, but a little bit skewed…. (Read about QQ plots at Wikipedia or NIST.)
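The plot above uses standardized residuals from a rugarch fit; a plain base-R version of the same diagnostic on the Fourier-fit residuals (res as above) is just:

    # Normal QQ plot: points hug the line if the residuals were really Gaussian,
    # and drift away from it in the tails if they are skewed or heavy-tailed
    qqnorm(res)
    qqline(res)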

Chicago Monday

What did I learn today?

  • The ropa vieja platter at Cafecito in Chicago is delicious, but now I am really over-full. I shouldn’t have eaten it all, I guess.
  • Cyclic but not periodic…. Heard a talk by Yuliy Baryshnikov on “scribbles and doodles” — extracting information from the noise of data, not just the main features. Cool ideas (keep in mind this is an exercise in recalling what I can without looking at my several pages of notes): the shape of the noise in data is itself important. We model financial time series as Brownian motion with drift, right, but we know it’s not exactly that. Look at the up-turns and down-turns of a random walk, look at persistence diagrams; you can tell Brownian motion and real financial data apart. I think I’m getting too tired to write this up well. The next part of the talk was about cyclic but non-periodic time series data.
  • Talk by Katharine Turner on cone fields — using vector fields to reconstruct a manifold from a point cloud of sample points, for instance. I am curious about seeing this in action — some clever ideas for dealing with corners and furry data.
  • Sayan Mukherjee: butterflies and dogs. I mean, Grassmannians and Stiefel manifolds. Looking at Grassmannians Gr(k,n) with different k, because what if your underlying manifold is a union of linear subspaces of different dimensions? Fun to see a Stiefel manifold show up in this context, in a talk by a statistician!
  • Omer Bobrowski: random geometric complexes. Some interesting things about Čech complexes, for instance. I need to think about this and how it relates to the algebraic geometry I learned in grad school.

More sleep will make for more intelligent conversation. I believe there is now a family of three in the closet-sized hostel room next to me, and it turns out angry two-year-olds are louder than sleepy Singaporean teenaged tourists. But I have earplugs 🙂

Tuesday

What did I learn today?

  • A nice idea for using random matrices in portfolio optimization. In classical Markowitz-style portfolio optimization you’re trying to get the best return for a given risk, or the lowest risk for a given return. You can set this up with Lagrange multipliers and just solve the optimization problem, using the correlation matrix estimated from the assets in your portfolio. The minimum-variance solution ends up giving the largest weights to the eigenvectors with the smallest eigenvalues. However, the smallest eigenvalues tend to correspond to noise. So one suggested method is to compare your correlation matrix with a random matrix and replace everything in the “noise band” with a scaled identity component. This apparently improves the risk estimate substantially (or at least shrinks the underestimate of the risk). Original paper here. There’s a rough sketch of the cleaning step after this list.
  • Reading Robinson’s paper on an equivariant Pieri rule, and just trying to do calculations. Learned something about the pattern of the calculations, but not yet sure how that will help.
  • Learned that kale fried in duck fat is pretty tasty. Definitely worth remembering.
  • Learned that I cannot at this point follow double kettlebell cleans and squats with multiple presses without a break. I can do the presses alone, but not right after swings, cleans, and squats with no break.
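Here is a rough sketch of the eigenvalue-cleaning idea from the portfolio bullet above, in base R. It's a generic version of the “clipping” recipe rather than the exact procedure from the paper, and returns is a hypothetical T-by-N matrix of asset returns.

    # Clean a correlation matrix by replacing the "noise band" of eigenvalues
    # (everything below the Marchenko-Pastur upper edge) with their average,
    # i.e. a scaled identity on that subspace; the trace is preserved.
    clean_correlation <- function(returns) {
      T_obs <- nrow(returns)
      N     <- ncol(returns)
      C <- cor(returns)
      e <- eigen(C, symmetric = TRUE)

      # Upper edge of the Marchenko-Pastur distribution for a purely random
      # correlation matrix of the same dimensions
      lambda_plus <- (1 + sqrt(N / T_obs))^2

      vals  <- e$values
      noise <- vals < lambda_plus
      if (any(noise)) vals[noise] <- mean(vals[noise])

      C_clean <- e$vectors %*% diag(vals) %*% t(e$vectors)

      # Rescale so the diagonal is exactly 1 again
      d <- 1 / sqrt(diag(C_clean))
      diag(d) %*% C_clean %*% diag(d)
    }

    # Minimum-variance weights from the Lagrange-multiplier solution:
    # w is proportional to solve(C) %*% 1, normalized to sum to 1
    min_var_weights <- function(C) {
      w <- solve(C, rep(1, ncol(C)))
      w / sum(w)
    }

Comparing min_var_weights(cor(returns)) with min_var_weights(clean_correlation(returns)) gives a quick sense of how much the noise band was driving the “optimal” weights.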

What did I learn yesterday?

  • A large carb-heavy lunch makes me fall asleep in the afternoon. And maybe too much Roquefort doesn’t help.
  • That sleepiness does not lend itself to learning or accomplishing other work.