Can Your Data Hide How You Are Performing?

When measuring our performance at anything in life, whether as a call center worker or a race driver, the data and metrics we use may not be helping us as much as they should. How do we even know? I backed my way into this problem and am now trying to find a way out.

I've always been a fan of numbers and measurement. I have loved crunching data ever since I had to learn to analyze reams of log file data from the web servers of an online banking application I helped build back in the late '90s. Wow, that makes me sound ancient! We were trying to chase down answers about what was happening in the application by looking at line after line of activity-based data. Each time someone did something, it got logged to the data file. With enough crazy shell scripting and awesome Unix commands, we could recreate the sequence of actions that someone performed in the application. And then we tried to recreate issues and fix them. It could be mind-numbing, but it was the only real way to understand what we hoped was reality.

These days, just about everything is measured and can be analyzed. We know how much energy our house uses courtesy of Nest thermostats, and we know how our body is performing via apps on our phones, with more to come from all the fancy integrated watches headed our way. Data is becoming ever more important, mostly because so much of it is now available. But are we getting the right story from the data? That is a big and very important question to keep asking ourselves. I've always posited that deriving "wisdom" from data is far harder and far more important than just performing analytics. And here in my own quest to be a better driver, I have found a simple but poignant example.

My simple goal is to go as fast as my car and my physiology will allow - the limit. The hard part is knowing where that limit is. And it's also hard to know how to get there. I've spent lots of time looking for ways to get there this year. One of the ways is by recording data during each session and reviewing it later, comparing different laps to see where I was better or worse. We'd all guess that having accurate data is important. But accuracy is only one aspect. The thing I'm finding now is that frequency is just as important.

What I mean by frequency is how often the data is recorded. In a situation where things happen very quickly, having data infrequently can provide some misleading "wisdom" takeaways. The Mars rover had some major issues when it couldn't report data frequently enough. Infrequent data forces you to gloss over some of the details, and those details can make all the difference. Non-detail-oriented folks will tell me all the time to look at the "big picture". I'm here to repeat that you can't see the (correct) big picture unless you verify that the little pictures that make up the big one are actually accurate!

Getting back to my getting-faster goal, I have session data that I record via some handy apps on my phone. I primarily use TrackAddict HD, but I also use video from my GoPro and dabble with Harry's Lap Timer. These apps can suck in data over WiFi from a little dongle that attaches to the OBDII port in my car. You probably have one of those too under your dashboard. Most cars since about 1996 have had to have it. That little port takes a big, fat connector, and devices that attach to it can get access to a bit of data about your car and how it's functioning. Each car differs in what it spits out. But it's also important to note the frequency at which the data comes out. I didn't worry too much about that before, as it seemed any data was more than I had. True enough, but also misleading. The OBDII port pushes out data at roughly 1Hz (once per second) at best - the link itself tops out at 10.4 kBaud, or 10,400 symbols or data flashes per second - and even slower if the car is busy doing something else when the data is requested.

I got more than the nothing I had at first. Now I (thought I) had vehicle speed, RPM, and throttle position. I could get a few more parameters too if I wanted to - fuel-air ratios, cylinder bank temps, other useless engine diagnostic stuff - but another limit of OBDII is that the more data you request on each refresh, the longer each refresh takes. The total data flow is constant, so the more parameters that have to come through each time, the fewer times each one can be sent over the same period. So now your 1Hz effective rate may be down to 0.1Hz - which in the land of cars is very slow... If your car is from 2008 or later, the US has mandated that the CAN bus standard be implemented, which is way better. The CAN bus keeps data flowing all the time, and much more of it. But my car is from 2007 and I can't get the CAN info from the OBDII port like 2008+ cars can. I can get it from elsewhere, but going that route is a bigger project.
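To put rough numbers on that trade-off, here is a minimal sketch. It assumes the port delivers a fixed number of readings per second and that the requested parameters simply take turns; real OBDII timing is messier, and the function name is made up for illustration.

```python
def effective_rate_hz(base_rate_hz: float, num_params: int) -> float:
    """Per-parameter refresh rate when num_params share one fixed-rate link.

    Assumption: readings come back one at a time, round-robin, so each
    parameter gets an equal slice of the base rate.
    """
    if num_params < 1:
        raise ValueError("need at least one parameter")
    return base_rate_hz / num_params


# Hypothetical 1Hz link, polled for 1, 3, and 10 parameters.
for n in (1, 3, 10):
    rate = effective_rate_hz(1.0, n)
    print(f"{n:>2} parameters -> {rate:.2f} Hz each, one update every {1 / rate:.0f} s")
```

Under those assumptions, asking for ten parameters on a 1Hz link means each one refreshes only once every ten seconds, which is the 0.1Hz figure above.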

Why does this frequent data matter, you ask? What I want to know is how fast I enter corners, go through corners, and exit from corners. At the least, I would like to know the rough speed differential between entry and exit. This info provides insight into how well I am taking the corner - and how close to a potential "limit" I am. The limit here would be the adhesion of my tires to the track and the max speed for each corner. Go beyond that and you're off the track and into the run-off or wall. In most ways, you don't really know the limit until you surpass it. Since I pay for my car, I don't plan on surpassing the limit unless there is no danger of damage. Autocross is great for exploring these limits since the only things you hit when you go over the limit are cones. Most tracks have some leeway with run-off, but I'm not winning any money or trophies (yet?) so there's no point in risking a shunt there. What's important right now is this speed data. I thought I had it in hand and just needed to review it, but alas, not so much. The data I had in hand was giving me a false sense of hope.

As you may have seen, I've delved into my data before. I noted then that the speed data looked a bit choppy. I expected more of a smooth line. Speed is fluid, isn't it? You can't really jump directly from 2mph to 5mph. You pass through the velocities in between, and acceleration is a measure of how quickly that happens. But if all you have are the two data points, how do you know you didn't hit 10mph in the middle with amazing acceleration and deceleration in between? You don't. And that is the crux of the problem. If you don't have enough data points, taken frequently, the average may look pretty lackluster. But the reality may have been quite the opposite!
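The 2mph-to-5mph example can be played out in a few lines. This is only a sketch: the speed trace is invented (2mph at the start, a spike to 10mph at the half-second mark, 5mph at one second), and the function names are hypothetical.

```python
def true_speed_mph(t: float) -> float:
    """Made-up 'real' speed: 2mph at t=0, 10mph at t=0.5s, 5mph at t=1s."""
    if t <= 0.5:
        return 2.0 + 16.0 * t           # hard acceleration
    return 10.0 - 10.0 * (t - 0.5)      # then hard braking


def sample(rate_hz: int, duration_s: float = 1.0) -> list[float]:
    """Read the speed rate_hz times per second, including both endpoints."""
    n = int(duration_s * rate_hz)
    return [true_speed_mph(i / rate_hz) for i in range(n + 1)]


slow = sample(1)     # one reading per second: only the two endpoints
fast = sample(10)    # ten readings per second

print("1Hz  peak seen:", max(slow))
print("10Hz peak seen:", max(fast))
```

At 1Hz the log holds just the 2mph and 5mph readings, so the spike to 10mph never shows up. At 10Hz one of the samples lands on it.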

I started by pulling out my data from my last on-track session at Thompson Speedway's new road course. It wasn't my fastest session, but I was most familiar with the run since it was my last one. I wanted to see if I could compare the laps as I went around the track and see which corners showed big deviations in speeds. Overlaying all the speed data from the laps on top of one another didn't really do much for me. Data visualization is a tricky area. Edward Tufte taught this well long ago. Way too much going on here.

So I backed it down to overlaying only three laps. That helped somewhat. 

The truth started to hit me. I wanted to isolate differences so I derived the differences from the raw speed data and plotted that info.
And then it all hit me. The data was garbage. The step-function look of the data gave it away. Even though I had lots of data points, the information in each data point was only changing every 10th or 20th data point. And the steps showing here made it clear. All the data I've collected to date has been lacking and probably isn't even very useful. If only I knew then what I know now...
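One quick way to spot this kind of stair-stepping without plotting anything is to count how many consecutive rows repeat the same value. A minimal sketch, with an invented log and a hypothetical helper name:

```python
from itertools import groupby


def run_lengths(values):
    """Length of each run of identical consecutive values."""
    return [len(list(group)) for _, group in groupby(values)]


# Invented speed log: 40 rows, but the reading only changes twice.
logged_mph = [42.0] * 10 + [45.0] * 20 + [44.0] * 10

runs = run_lengths(logged_mph)
print(runs)
print(f"{len(logged_mph)} rows, but only {len(runs)} distinct readings")
```

Long runs are only a hint, since a car holding a truly constant speed would produce them too. But runs of 10 or 20 rows through a braking zone are a good sign the app is repeating stale values rather than logging fresh ones.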

There was one other tip-off to the sad data I had in hand. When looking at the map overlays where my car's location gets plotted over a satellite image of the course, it looked like I had some off-track excursions. Or one too many. I was hoping to see the differences in the lines I drove on track. But it was clear that the resolution of my location via GPS was so far off (due to extrapolation) that I wouldn't get any meaningful comparisons. Another canary in my coal mine.

After a bit of research, I found that the makers of TrackAddict HD and Harry's Lap Timer do mention in their documentation (TrackAddict, Harry's) that using an external GPS device is preferable to using the native GPS in one's phone. I'm under no delusion that my iPhone is all things to all people, so I expected to hit some limits. In this case it's the refresh rate of the GPS communicating my location that is sub-par. The iPhone does 5Hz at best (often slower) while external GPS devices can get closer to 10Hz. I wasn't sure this was going to be enough of an improvement, but it would certainly be better than what I had natively in the phone, and further research said I'd basically need military clearance to get much better. It seemed worth a shot. So I got the Dual XGPS150A for not much coin and got to test it out today. It actually helps with two problems for me. I get more frequent GPS location data - so I'm suddenly not crashing into the grandstands at the track each lap! - and this GPS data also allows for a far more frequent extrapolation of vehicle speed, it seems. The vehicle speed is updated every couple of data points instead of every 10 or 20. TrackAddict HD seems to think it's creating data points somewhere between every thousandth and every half-tenth of a second (a pretty wide range, unfortunately), so there is still some room for missing important data points, but it's far closer now than before. Here is a look at the same data collected from GPS (blue line) and from OBDII (red line) along with the delta between them (green line - values multiplied by 10 for easier display).
There are a couple of instances where the differential is 9 to 14mph! For track work, that level of imprecision is pretty much unacceptable. Knowing when you are braking from the data becomes impossible when the frequency is too low. And quick changes in speed become completely invisible. 
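A delta like that can be computed by pairing each GPS reading with the most recent OBDII reading available at that moment. The sketch below assumes that "hold the last value" pairing and uses invented numbers for a braking zone; it is not how TrackAddict HD itself lines up the two sources, and the names are hypothetical.

```python
from bisect import bisect_right


def hold_last(times, values, t):
    """Most recent value logged at or before time t (stale until refreshed)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("t is earlier than the first sample")
    return values[i]


# Invented braking zone: OBDII speed once per second, GPS speed every 0.2s.
obd_t = [0.0, 1.0, 2.0]
obd_mph = [60.0, 48.0, 40.0]
gps_t = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
gps_mph = [60.0, 58.0, 55.0, 52.0, 50.0, 48.0]

deltas = [g - hold_last(obd_t, obd_mph, t) for t, g in zip(gps_t, gps_mph)]
worst = max(abs(d) for d in deltas)

print(deltas)
print("largest gap (mph):", worst)
```

The gap grows the whole time the slow source sits on its old reading and snaps back to zero when it finally refreshes, which matches the sawtooth shape of a delta line between a fast and a slow source.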

Hopefully this small upgrade will be enough to get the data I need to do a better job analyzing my own performance. The next steps are a bit more pricey, but certainly appealing. The next step appears to be the AIM Solo or the AIM Solo DL. Both appear to be nice units and are almost the same, except the DL will allow for connecting to the car's CAN bus to get much (!) more frequent data from the car itself. That would solve the accuracy and frequency issue for vehicle speed and also log the GPS location data with more accuracy, since they are 10Hz devices. And it would all be in one compact unit rather than two or three little things that need to be mounted, charged, and tended. But with 4x-6x the investment over what I have now, I can't justify the jump yet. Still better to spend my money on brake pads and tires for now, it seems. Christmas is coming, right?

How might this apply to your life if you aren't a race car driver? A friend who is an attorney has noticed it when partner compensation is calculated. Their firm averages performance over three years. Three years! The result is that some folks apparently sandbag completely for up to a couple of years. And then they get their buddies, who may not be on the same cycle, to throw them some bones so they can have the one good year that keeps their average decent. The firm gets to carry these folks for two years of sub-par performance (what does it do for morale to see these folks never working?) before it bothers to take a look at the reality. If the firm ever goes to an annual review model, the real performance of these people will become visible to all. Data can certainly mislead. Someone could have two banner years and then one tough one - and they may be deemed a failure even after two great performances, based mostly on periodicity! So watch out how you are measured - and how often and when - to see if you may be benefiting or suffering from a lack of "wisdom" in your data!
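The arithmetic behind the three-year average is easy to check. The figures below are invented, in arbitrary units, purely to show how two very different patterns collapse into the same number:

```python
from statistics import mean

# Invented yearly results, arbitrary units.
sandbagger = [40, 40, 160]   # two weak years, then one boosted year
steady = [80, 80, 80]        # the same output every year

print("three-year averages:", mean(sandbagger), mean(steady))
print("worst single year:  ", min(sandbagger), min(steady))
```

Measured once every three years, the two look identical. Measured annually, the difference is obvious in the first year.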


1 response
Very nice and comprehensive analysis. I will be on the next ADSI session so we can discuss it live. I was using the same GPS for aviation, but I upgraded to a more precise unit (Stratus 2), that usually gives me a 1m precision. I will test to see if it is compatible with Track addict and bring it for the next session so we can do some tests. Fred