When we measure our performance at anything in life, whether as a call center worker or a race driver, the data and metrics we use may not be helping us as much as they should. How do we even know? I backed my way into this problem and am now trying to find a way out.
I've always been a fan of numbers and measurement. I have loved crunching data ever since I had to learn to analyze reams of log file data from the web servers of an online banking application I helped build back in the late '90s. Wow, that makes me sound ancient! We were trying to chase down what was happening in the application by looking at line after line of activity-based data. Each time someone did something, it got logged to the data file. With enough crazy shell scripting and awesome Unix commands, we could recreate the sequence of actions that someone performed in the application. And then we tried to recreate issues and fix them. It could be mind-numbing, but it was the only real way to understand what we hoped was reality.
These days, just about everything is measured and can be analyzed. We know how much energy our house uses courtesy of Nest thermostats, and we know how our body is performing via apps on our phones, with more to come from all the fancy integrated watches headed our way. Data is becoming ever more important, mostly because so much of it is now available. But are we getting the right story from the data? That is a big and very important question to keep asking ourselves. I've always posited that deriving "wisdom" from data is far harder and far more important than just performing analytics. And here in my own quest to be a better driver, I have found a simple but poignant example.
My simple goal is to go as fast as my car and my physiology will allow - the limit. The hard part is knowing where that limit is. And it's also hard to know how to get there. I've spent lots of time looking for ways to get there this year. One of those ways is recording data during each session and reviewing it later, comparing different laps to see where I was better or worse. We'd all guess that having accurate data is important. But accuracy is only one aspect. The thing I'm finding now is that frequency is just as important.
What I mean by frequency is how often the data is recorded. In a situation where things happen very quickly, infrequent data can lead to some misleading "wisdom" takeaways. Infrequent data forces you to gloss over some of the details, and those details can make all the difference. Non-detail-oriented folks tell me all the time to look at the "big picture". I'm here to repeat that you can't see the (correct) big picture unless you verify that the little pictures that make up the big one are actually accurate!
Getting back to my getting-faster goal, I have session data that I record via some handy apps on my phone. I primarily use TrackAddict HD, but I also use video from my GoPro and dabble with Harry's Lap Timer. These apps can suck in data over WiFi from a little dongle that attaches to the OBDII port in my car. You probably have one of those too under your dashboard. Most cars since about 1996 have had to have it. That little port takes a big, fat connector, and devices that attach to it can get access to a bit of data about your car and how it's functioning. Each car differs in what it spits out. But it's also important to note the frequency at which the data comes out. I didn't worry too much about that before, as any data at all was more than I had. True enough, but also misleading. The OBDII port pushes out data at roughly 1Hz (once per second) at most. The link itself tops out at 10.4 kBaud, or 10,400 symbols or data flashes per second, and the updates come even slower if the car is busy doing something else when the data is requested.
I got more than the nothing I had at first. Now I (thought I) had vehicle speed, RPM, and throttle position. I could get a few more parameters too if I wanted to - fuel-air ratios, cylinder bank temps, other useless engine diagnostic stuff - but that runs into another limit of OBDII. The total data flow is constant, so the more data you request on each refresh, the longer each refresh takes and the fewer refreshes you get over the same period. So now your 1Hz effective rate may be down to 0.1Hz - which in the land of cars is very slow... If your car is from 2008 or later, the US has mandated that the CAN bus standard be implemented, which is way better. The CAN bus keeps data flowing all the time, and much more of it. But my car is from 2007 and I can't get the CAN info from the OBDII port like 2008+ cars can. I can get it from elsewhere, but going that route is a bigger project.
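To put rough numbers on that trade-off, here is a small Python sketch. It assumes the dongle polls parameters one at a time, so the refresh rate simply divides by the number of parameters requested. That is a simplification of how real adapters behave, and the function names and the 60 mph figure are made up for illustration rather than taken from any OBDII tool.

```python
def effective_rate_hz(base_rate_hz: float, num_parameters: int) -> float:
    """Refresh rate per parameter, assuming simple round-robin polling (an assumption)."""
    if num_parameters < 1:
        raise ValueError("need at least one parameter")
    return base_rate_hz / num_parameters


def distance_between_samples_ft(speed_mph: float, rate_hz: float) -> float:
    """How far the car travels between two consecutive readings."""
    feet_per_second = speed_mph * 5280 / 3600
    return feet_per_second / rate_hz


# One parameter at 1 Hz versus ten parameters sharing the same link.
one_param = effective_rate_hz(1.0, 1)
ten_params = effective_rate_hz(1.0, 10)

for label, rate in [("1 parameter", one_param), ("10 parameters", ten_params)]:
    gap = distance_between_samples_ft(60, rate)
    print(f"{label}: {rate:.1f} Hz, about {gap:.0f} ft between readings at 60 mph")
```

At 60 mph the car covers 88 feet every second, so even the best-case 1Hz rate leaves a gap of 88 feet between readings, and at 0.1Hz most corners would fit entirely between two data points.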
Why does this frequent data matter, you ask? What I want to know is how fast I enter corners, go through corners, and exit from corners. At the least, I would like to know the rough speed differential between entry and exit. This info provides insight into how well I am taking the corner - and how close to a potential "limit" I am. The limit here would be the adhesion of my tires to the track and the max speed for each corner. Go beyond that and you're off the track and into the run-off or the wall. In most ways, you don't really know the limit until you surpass it. Since I pay for my car, I don't plan on surpassing the limit unless there is no danger of damage. Autocross is great for exploring these limits since the only things you hit are cones when you go over the limit. Most tracks have some leeway with run-off, but I'm not winning any money or trophies (yet?) so there's no point in risking a shunt there. What's important right now is this speed data. I thought I had it in hand and just needed to review it, but alas, not so much. The data I had in hand was giving me a false sense of hope.
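The entry/exit comparison described above is simple to compute once you have timestamped speed readings. The Python sketch below is only an illustration: the `corner_speeds` helper, the corner's start and end times, and the sample values are all hypothetical, not output from TrackAddict or any other app. It assumes the readings are already sorted by time.

```python
def corner_speeds(samples, start_s, end_s):
    """Summarize one corner from (time_s, speed_mph) readings sorted by time.

    Returns the first reading in the window as the entry speed, the last as the
    exit speed, the slowest reading in between, and exit minus entry.
    """
    in_corner = [(t, v) for t, v in samples if start_s <= t <= end_s]
    if len(in_corner) < 2:
        raise ValueError("need at least two readings inside the corner")
    entry = in_corner[0][1]
    exit_speed = in_corner[-1][1]
    slowest = min(v for _, v in in_corner)
    return {"entry": entry, "min": slowest, "exit": exit_speed, "delta": exit_speed - entry}


# Hypothetical 1 Hz readings around a single corner (made-up numbers).
readings = [(10.0, 62.0), (11.0, 55.0), (12.0, 48.0), (13.0, 51.0), (14.0, 58.0), (15.0, 66.0)]
summary = corner_speeds(readings, 11.0, 14.0)
print(summary)
```

Note that the helper refuses to answer with fewer than two readings in the window, and at one reading per second a short corner may only contain a handful of them.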
As you may have seen, I've delved into my data before. I noted then that the speed data looked a bit choppy. I expected more of a smooth line. Speed is fluid, isn't it? You can't really go directly from 2mph to 5mph. You pass through the velocities in between, and acceleration is a measure of how quickly that happens. But if all you have are the two data points, how do you know you didn't hit 10mph in the middle with amazing acceleration and deceleration in between? You don't. And that is the crux of the problem. If you don't have enough data points, taken frequently, the average may look pretty lackluster. But the reality may have been quite the opposite!
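Here is that 2mph-to-5mph example as a small Python sketch. The speed trace is invented purely to match the scenario above (2mph at the start, a spike to 10mph halfway through, 5mph at the end of one second), and the `sample` helper is a hypothetical stand-in for a data logger running at a given rate.

```python
def speed_mph(t: float) -> float:
    """Made-up speed trace: 2 mph at t=0, peak of 10 mph at t=0.5 s, 5 mph at t=1 s."""
    if t <= 0.5:
        return 2 + (10 - 2) * (t / 0.5)
    return 10 - (10 - 5) * ((t - 0.5) / 0.5)


def sample(rate_hz: int, duration_s: float = 1.0) -> list:
    """Read the trace at evenly spaced times, like a logger running at rate_hz."""
    count = int(round(duration_s * rate_hz))
    return [speed_mph(i / rate_hz) for i in range(count + 1)]


slow = sample(1)    # two readings: the start and the end
fast = sample(10)   # eleven readings over the same second

print("1 Hz readings:", slow, "-> max seen:", max(slow))
print("10 Hz max seen:", max(fast))
```

The 1Hz logger only ever sees 2mph and 5mph, so the 10mph spike never shows up in its data, while the 10Hz logger catches it.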
I started by pulling out my data from my last on-track session at Thompson Speedway's new road course. It wasn't my fastest session, but I was most familiar with the run since it was my last one. I wanted to see if I could compare the laps as I went around the track and see which corners showed big deviations in speeds. Overlaying all the speed data from the laps on top of one another didn't really do much for me. Data visualization is a tricky area. Edward Tufte taught this well long ago. Way too much going on here.
So I backed it down to overlaying only three laps. That helped somewhat.