SideCar's Kalman Filter models San Francisco brunch

gpcz · on March 5, 2014

Typically, a Kalman filter is only as accurate as your mathematical model of the underlying phenomena (for example, your flight dynamics on an airplane), and the filter is mainly useful to mitigate the noise from your real-world sensor observations. Is there more public information available about the mathematical model being used within their state transition matrix? Depending on that information, this could be really clever or a glorified low-pass filter.

lnanek2 · on March 5, 2014

But they say they are looking for people to go to brunch, wait in line, then return. That's two short spikes of activity next to each other once a day. A low-pass filter would filter that out as high-frequency noise.

Even basic stuff, like how long it takes someone in SF to drive to Oakland before they can contribute to increased demand there, seems more complex than a basic filter and more in line with a Kalman filter.

gpcz · on March 5, 2014

A Kalman filter assumes that measurement noise follows a Gaussian distribution, and it continuously updates its estimate of the covariance based on previous observations. Therefore, if you gave it a bunch of very similar observations (like differing by 0.1) for a long time, the covariance would get very narrow. Once the activity spikes appeared, they would not have much influence on the state estimate because the probability distribution would imply they were extremely unlikely events. This would look very similar to a low-pass filter.

Although the Kalman filter retains some aggregate data about past states in its iterated covariance estimate, it is still primarily a recurrence relation where the future state depends on the immediate present, much like a discretized low-pass filter. This is part of why I'm intrigued by the parent article's use of a Kalman filter for this application.
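
A minimal sketch of the low-pass comparison above, assuming a scalar state with a random-walk model (state transition and observation both equal to 1). The function name and the noise variances `q` and `r` are illustrative assumptions, not anything from SideCar's model. It feeds the filter a long run of near-identical observations followed by a two-sample spike:

```python
def scalar_kalman(observations, q=1e-4, r=0.01, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model (F = H = 1).

    q and r are assumed process and measurement noise variances.
    Returns the state estimates and the Kalman gain at each step.
    """
    x, p = x0, p0
    estimates, gains = [], []
    for z in observations:
        # Predict: the state is carried over, its variance grows by q.
        p = p + q
        # Update: blend prediction and observation using gain k.
        # In this standard form k depends only on p, q and r,
        # not on the observed values themselves.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
        gains.append(k)
    return estimates, gains


# 200 readings alternating between 0.95 and 1.05, then a two-sample spike.
obs = [1.0 + 0.1 * ((i % 2) - 0.5) for i in range(200)] + [5.0, 5.0] + [1.0] * 20
est, gains = scalar_kalman(obs)

print("gain before spike:", gains[199])
print("estimate before spike:", est[199])
print("peak estimate:", max(est))
```

Under these assumptions the gain settles to a constant, at which point the update `x + k * (z - x)` is the same recurrence as an exponential moving average, and the spike to 5.0 only moves the estimate part of the way. A richer state transition model would behave differently.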

michaelmior · on March 6, 2014

I don't have much knowledge of Kalman filters, but the Wikipedia article[1] claims the assumption of Gaussian error is a common misconception. A quick skim of the original paper[2] seems to confirm this.

[1] http://en.wikipedia.org/wiki/Kalman_filter

[2] http://www.cs.unc.edu/~welch/kalman/media/pdf/Kalman1960.pdf

gpcz · on March 6, 2014

I didn't know that -- that's very interesting! Thank you for showing me that.

I learned about Kalman filters in a mobile robotics course that made explicit Gaussian assumptions early on for the primary topic (SqrtSAM), and they covered Kalman filters in a separate "this is how they used to do SLAM" lecture. Considering the large amount of overlap in the methods, such as the use of linear covariance projections, I guess I assumed that Kalman filters shared the same Gaussian assumption.

michaelmior · on March 6, 2014

Sounds like a pretty interesting course. Really all I know about Kalman filters is the use case.

tel · on March 6, 2014

It sounded a lot like the interviewer didn't have enough expertise to do much beyond keying on "Kalman filter" as sounding vaguely big data-ey. Too bad.

gpcz · on March 6, 2014

That may be a part of it. The problem was only described in vague terms, so there's probably a reason why a Kalman filter makes sense, but as it was described in the interview it seemed more like a regression problem to me.

ericwaller · on March 6, 2014

All you Kalman filter fans out there will be happy to hear that you can grab a ride from SideCar and some Giants tickets from SeatGeek[1] for a truly algorithmic afternoon.

1. http://chairnerd.seatgeek.com/using-a-kalman-filter-to-predi...

nullc · on March 6, 2014

Expected math. Was disappointed.

dfc · on March 5, 2014

I thought all models literally model the subject that they model.

caycep · on March 6, 2014

Not an expert in this field, but this is amusing to me in that the only other place I've had the fortune of encountering Kalman filters was in groups trying to analyze neuro data from Blackrock Utah microelectrode arrays.

Brain signals, brunch, they all look the same...

tel · on March 6, 2014

They show up all over the place in time series and more general random signals. Those just aren't incredibly popular domains, in no small part, I feel, due to the greater tool sophistication needed to make a dent.

spinlock · on March 5, 2014

I think it's more fun that they model SFers standing in line for brunch.