The safest hands in the business

Success in Formula 1 is as much about consistency as outright speed. History offers plenty of examples of drivers being outscored by their teammates across a season despite being much the stronger driver over a single lap. In 1984, Prost outqualified his champion teammate Lauda 15-1, but was narrowly beaten by Lauda to the title. Prost took this lesson to heart and reformed his driving style, focusing on race pace and consistently bringing home points. Things came full circle when Senna outqualified Prost 28-4 across 1988 and 1989, but Prost outscored Senna 186-163 (and 154-150 in points that counted towards the championship). The difference was partly down to bad luck, but also down to Prost’s more pragmatic approach. A more recent example is Button vs. Hamilton at McLaren from 2010 to 2012. While Hamilton dominated qualifying 44-14, he was ultimately outscored 672-657 by Button, through a combination of bad luck and inconsistency.

Drivers who keep out of trouble may not attract the same attention as flashier drivers like Gilles Villeneuve, Ayrton Senna, and Ronnie Peterson, but they understand the demands of the sport. Points are not awarded for qualifying and, since 1960, they have not been awarded for setting fastest laps either.

So who are the cleanest drivers in the sport’s history, and who are most likely to end the race in a wall? I’m sure some names immediately spring to mind.

To answer this, I compiled data on how often drivers fail to finish for driver-related reasons. I defined a driver-related DNF as a race in which any of the following happened to the driver:

  • DNF due to a crash or collision.
  • Disqualification from the race due to driver conduct (e.g., black-flagged for actions on track).
  • Running out of fuel within the last 5 laps of a race — I included this because fuel management has been an important skill at several points in the history of the sport.
  • Voluntarily withdrawing during a race without any mechanical or other problems.
  • Non-classified finishes (i.e., failing to complete a satisfactory fraction of the total race distance) that were not attributed to mechanical problems.

The vast majority of driver-related DNFs were DNFs due to crashes or collisions. I didn’t attempt to attribute fault when it came to collisions (since this can become subjective), nor did I include data on races where the driver crashed or went off track but was able to finish the race.

I ran these statistics for all drivers who have scored at least 3 wins and all drivers who are currently active. I also picked two special examples for reference:

  1. Ukyo Katayama, who has the greatest number of driver-related DNFs per start of any driver with 50 or more starts.
  2. Andrea de Cesaris, who has a reputation as one of the most crash-prone drivers in the sport’s history.

For each driver, I calculated the average number of starts per driver-related DNF. The results are shown in the graph below.

Four of the current drivers (Chilton, Ericsson, Kvyat, and Magnussen) have not yet had a single driver-related DNF, so they do not appear on the chart.

The crash champion is Ukyo Katayama, with one crash every 3.2 starts. Famous for his incredible start-line crash at Estoril in 1995, he puts even Andrea de Cesaris to shame. Among the world champions, “Hunt the Shunt” is a clear leader, with one driver-related DNF every 4.6 starts, although Damon Hill is not far behind with one driver-related DNF every 6.1 starts.

At the other end of the spectrum, Juan Manuel Fangio emerges as one of the safest drivers in history, crashing out of a race only once in his 50 starts. In fact, he may not have even been to blame for that crash (at Spa 1953), as some sources attribute the crash to a steering problem. In an era where a single crash could easily be fatal, Fangio usually drove within his limits. The same could be said of Clark, Gurney, Stewart, McLaren and Hulme. Clark and McLaren nevertheless lost their lives to crashes caused by mechanical failures.

A few stereotypes are also put to rest. Senna and Schumacher were aggressive drivers, but they were not especially crash-prone, despite their reputations among many newer fans. Mansell, Piquet, Lauda, and Hakkinen were all more likely to end races at their own hands. Farina, who was considered a dangerous driver who often put other drivers out of races, had fewer driver-related DNFs than Ascari, Moss, Hawthorn, Brabham, Collins, and Brooks. Among the modern drivers, Maldonado and Grosjean are quite crash-prone, but Sutil has them both beaten, with one driver-related DNF every 5.2 starts.

Hamilton is the most crash-prone of the current world champions, with one driver-related DNF every 11.0 starts. However, he isn’t as far behind Button as one might expect, and he has a similar crash rate to Prost. Alonso stands out as the safest of the current world champions, with just one driver-related DNF every 19.9 starts.

Bianchi and Bottas rank ahead of Alonso, but there is significant uncertainty in their crash rates, due to their small number of starts. To give some more robust estimates, we can see how the current drivers would rank if they were to each have 2 driver-related DNFs in the 16 remaining races this season.

[Figure: starts per driver-related DNF for current drivers, robust estimate]
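The arithmetic behind both versions of the metric is simple enough to sketch. The function names below are my own, the "rookie" figures are made up for illustration, and the Schumacher figures are the ones quoted in this article.

```python
def starts_per_dnf(starts, driver_dnfs):
    """Average number of starts per driver-related DNF (infinite if none)."""
    if driver_dnfs == 0:
        return float("inf")
    return starts / driver_dnfs


def robust_starts_per_dnf(starts, driver_dnfs, extra_starts=16, extra_dnfs=2):
    """Same metric after adding hypothetical future races and DNFs."""
    return starts_per_dnf(starts + extra_starts, driver_dnfs + extra_dnfs)


# Schumacher's career figures as quoted in this article: 288 starts, 30 DNFs.
print(round(starts_per_dnf(288, 30), 1))  # 9.6

# Hypothetical rookie (made-up numbers): 10 starts, no driver-related DNFs.
print(starts_per_dnf(10, 0))         # inf
print(robust_starts_per_dnf(10, 0))  # 13.0
```

Adding the same hypothetical 2 DNFs in 16 races to everyone barely moves drivers with long careers, but it pulls drivers with only a handful of starts back towards a plausible rate.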

A note regarding uncertainty

As fans of a sport, we often think of results in an absolute sense. There is no question who won the 2008 season or how many times Schumacher crashed in his career. These are just facts. The graphs of starts per driver-related DNF are based on these facts.

However, from a statistical perspective, it makes sense to think about uncertainty in these measurements. Michael Schumacher had 30 driver-related DNFs in his 288 starts, giving him one driver-related DNF every 9.6 starts. By comparison, Mika Hakkinen had one driver-related DNF every 9.5 starts. Is that difference meaningful, or might it just be down to random chance (variation in the data sample)? If we could somehow rerun history many times, how often would we expect Schumacher to have a higher crash rate than Hakkinen?

To estimate the uncertainty in our measurements, we need to make some assumptions about the underlying statistical distribution from which these samples are drawn. A reasonable assumption is that each driver’s number of driver-related DNFs follows a Poisson distribution, with the mean of that distribution varying from driver to driver. Using the Poisson distribution, we can calculate an exact confidence interval for the mean rate. For a 95% confidence interval (i.e., we are 95% sure that the “true” mean lies within this interval), Schumacher has between 6.7 and 14.2 starts per driver-related DNF, while Hakkinen has between 5.9 and 16.3 starts per driver-related DNF.
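For readers who want to try this themselves, here is a minimal sketch of one way to compute such an interval. It assumes the standard exact two-sided interval for a Poisson count, found here by bisection on the Poisson cumulative distribution rather than with a statistics library. The function names are hypothetical and the approach is an illustration, not necessarily the exact procedure behind the figures above.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


def solve_mu(k, target):
    """Bisection for the mu at which poisson_cdf(k, mu) equals target.

    The CDF decreases as mu grows. The search range is a heuristic that is
    ample for career-sized counts (a few hundred DNFs at most).
    """
    lo, hi = 0.0, 2.0 * k + 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def starts_per_dnf_interval(dnfs, starts, conf=0.95):
    """Exact confidence interval for starts per driver-related DNF."""
    alpha = 1 - conf
    # Smallest plausible mean count: P(X >= dnfs) = alpha / 2.
    mu_low = 0.0 if dnfs == 0 else solve_mu(dnfs - 1, 1 - alpha / 2)
    # Largest plausible mean count: P(X <= dnfs) = alpha / 2.
    mu_high = solve_mu(dnfs, alpha / 2)
    upper = float("inf") if mu_low == 0 else starts / mu_low
    return starts / mu_high, upper


low, high = starts_per_dnf_interval(30, 288)  # Schumacher's figures from above
print(f"{low:.1f} to {high:.1f} starts per driver-related DNF")
```

With Schumacher's 30 driver-related DNFs in 288 starts, the result should land close to the 6.7 to 14.2 range quoted above. A driver with no driver-related DNFs gets an interval with no upper limit, which is one way of seeing why such drivers cannot be ranked reliably.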

In other words, it is very difficult to be certain — in a statistical sense — whether one driver was objectively more crash-prone than another in most cases, even for drivers with relatively long careers. We can be almost certain that Alonso (with a 95% confidence interval of 11.1 to 39.9) is a safer driver than Sutil (with a 95% confidence interval of 3.6 to 9.2), but we can’t generally say much more than that based on these data alone!

Adjusting for other DNFs

One factor that could skew the statistics is car reliability. A driver with an unreliable car might break down before they have the chance to crash. This is particularly important given the trend towards much greater reliability over the past decade. Andrea de Cesaris had 103 non-driver DNFs in his 208 starts, which may have significantly reduced his number of driver-related DNFs.

It’s impossible to know for sure how many driver-related DNFs were prevented by non-driver DNFs (i.e., all other types of DNFs), but we can make a quick and dirty approximation. Let’s assume that driver-related DNFs and non-driver DNFs are independent events, with respective probabilities of Pd and Pf. In any given race, there is the chance of neither of these events occurring, one of these events occurring, or both of these events occurring. In the event that both would have occurred, the actual DNF would be due to whichever event occurred first.

We’ll now make the slightly naughty assumption that driver-related DNFs and non-driver DNFs occur with the same distribution with respect to laps into the race. This is a bit naughty because accidents are probably skewed towards the beginning of the race (when cars are running closer together) and mechanical failures are probably slightly skewed towards the end of the race (after the car has been running for a long time). Nevertheless, if we use this assumption for a first approximation, it means that in races where both a driver-related DNF and a non-driver DNF were going to occur, each type of DNF has a 50% chance of occurring first.

In this case, the total probability of a driver-related DNF is

P(driver failure and no non-driver failure) + 0.5P(driver failure and non-driver failure) = Pd(1-Pf) + 0.5PdPf,

and the total probability of a non-driver DNF is

P(non-driver failure and no driver failure) + 0.5P(non-driver failure and driver failure) = Pf(1-Pd) + 0.5PdPf.

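These equations can be solved for Pd and Pf for each driver, taking the two total probabilities to be the observed fractions of starts that ended in each type of DNF. We can then estimate an expected number of driver-related DNFs for each driver, if they had never suffered any non-driver DNFs. This is just NPd, where N is the driver’s number of starts.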
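As a sketch of how the solution can be carried out: writing a and b for the observed fractions of starts ending in driver-related and non-driver DNFs, the two equations become a = Pd(1 - Pf/2) and b = Pf(1 - Pd/2). Eliminating one unknown leaves a quadratic, and the smaller root is the one that lies between 0 and 1. The function name is hypothetical, and the example uses the Heidfeld figures quoted in the comments below (183 starts, 16 driver DNFs, 31 non-driver DNFs).

```python
import math


def underlying_probabilities(starts, driver_dnfs, other_dnfs):
    """Solve a = Pd*(1 - Pf/2), b = Pf*(1 - Pd/2) for (Pd, Pf)."""
    a = driver_dnfs / starts  # observed driver-related DNF fraction
    b = other_dnfs / starts   # observed non-driver DNF fraction
    # Eliminating Pf gives Pd^2 - (2 + a - b)*Pd + 2a = 0; take the smaller root.
    s = 2 + a - b
    pd = (s - math.sqrt(max(0.0, s * s - 8 * a))) / 2
    # By symmetry, Pf^2 - (2 - a + b)*Pf + 2b = 0.
    t = 2 - a + b
    pf = (t - math.sqrt(max(0.0, t * t - 8 * b))) / 2
    return pd, pf


starts, driver_dnfs, other_dnfs = 183, 16, 31  # Heidfeld, from the comments
pd, pf = underlying_probabilities(starts, driver_dnfs, other_dnfs)
print("raw starts per driver-related DNF:     ", round(starts / driver_dnfs, 1))
print("adjusted starts per driver-related DNF:", round(1 / pd, 1))
print("expected driver-related DNFs (N * Pd): ", round(starts * pd, 1))
```

Because Pd is always at least as large as the observed fraction a, the adjustment can only lower a driver’s starts per driver-related DNF, and it does so most for drivers with very unreliable cars.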

I used this adjusted estimate to recompute the number of starts per driver-related DNF for each driver, as shown below.

The overall ordering of the drivers is largely preserved by this adjustment. At the lower end, Hunt is still just below de Cesaris (a driver he held in very low esteem), but things are improved somewhat for Sutil.

For drivers who have relatively infrequent driver-related DNFs, the differences may seem unimportant — especially given the uncertainty discussed above — but remember that a single extra DNF can easily decide a championship. For example, Webber or Hamilton could have won the title in 2010, if not for untimely crashes. Alonso and Vettel have each averaged about one driver-related DNF per season, whereas Hamilton typically has two, and Raikkonen falls somewhere in the middle.

Championship years and yearly fluctuations

For a given driver, there can be significant fluctuations in the number of driver-related DNFs from year to year. For example, Nigel Mansell had 1 driver-related DNF in 1987, but then 5 driver-related DNFs in 1988. This is not unexpected, given we are dealing with statistics of small numbers. A plausible model for the frequency of driver-related DNFs is the Poisson distribution, for which the coefficient of variation (i.e., the standard deviation divided by the mean) is inversely proportional to the square root of the mean.
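To get a feel for how large these fluctuations can be, here is a small sketch under the Poisson assumption. The mean of 3 driver-related DNFs per season is a made-up figure chosen purely for illustration, not Mansell’s actual rate.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected number is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean = 3.0  # hypothetical seasonal average, for illustration only

# For a Poisson distribution the standard deviation is sqrt(mean),
# so the coefficient of variation is 1 / sqrt(mean).
cv = 1 / math.sqrt(mean)

p1 = poisson_pmf(1, mean)  # chance of a season with exactly 1 DNF
p5 = poisson_pmf(5, mean)  # chance of a season with exactly 5 DNFs

print(f"coefficient of variation: {cv:.2f}")
print(f"P(1 DNF) = {p1:.2f}, P(5 DNFs) = {p5:.2f}")
```

Under this assumed mean, seasons with 1 DNF and seasons with 5 DNFs each turn up roughly 10 to 15% of the time, so a swing like Mansell’s needs no special explanation.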

We might expect crash frequency to vary across a driver’s career in some cases, due to changes in experience, ability, or driving style. For example, Felipe Massa had one driver-related DNF every 5.0 starts in his first two years in the sport, but since 2005 has had one driver-related DNF every 17.7 starts. Meanwhile, Michael Schumacher had one driver-related DNF every 6.5 starts from 1991-1995, but then settled down to one driver-related DNF every 12.2 starts from 1996-2006. On his return from 2010-2012, he increased to a rate of one driver-related DNF every 9.7 starts.

One interesting observation from the data is that in 50 of the 64 seasons of Formula 1, the champion has had fewer driver-related DNFs than their personal career average. This is likely due to an interplay of several factors that are difficult to disentangle. First, a driver is naturally more likely to win the title in a year in which they lose fewer points through crashes. Second, fewer crashes may be a sign of stronger form, which will also improve the likelihood of a driver winning the title. Third, a driver who is often running near the front may have less opportunity to tangle with other drivers.

Traditional metrics for driver success have focused on the peaks of achievement, such as the greatest number of wins or poles. Equally important, I would argue, is minimizing the number of poor performances.

17 comments

  1. Great article. But where’s Berger? He won more than three races.

  2. Good catch! Berger was an accidental omission — I’ve uploaded new graphs including him.

  3. Bob McMurray

    I am sure you will have many suggestions / comments but how about Chris Amon? He drove a good few GPs and for the biggest names around.

    1. Amon was a wonderful and terribly unlucky driver. I didn’t include him in this analysis because he didn’t satisfy the criterion I imposed (at least 3 wins). I just ran his stats and he had one driver-related DNF every 12.0 starts.

      1. Bob McMurray

        Thanks for the reply re Chris Amon.

  4. […] failures, so for each driver I first excluded all races with mechanical or other non-driver DNFs, just like in my last post. The remaining starts (including finishes as well as races where a driver crashed or otherwise […]

  5. […] or failed to finish due to a ‘driver failure’ (using the same definition as I used for driver-related DNFs in my previous post) — are called counting […]

  6. Among current top drivers Alonso seems to be the safest one. It is confirmed by this graph:
    https://theansweris27.com/lap-times-for-the-2014-f1-german-grand-prix/
    In the Finish Status section, we can see he’s the only one who has finished every GP on the same lap as the winner.

  7. […] of a driver finishing a race without crashing can be computed from their prior race history, as I did in my previous post. From this, the probability of crashing on each lap can be estimated. The probability of a […]

  9. Matt Karshis

    Regarding late-race accidents that cause the driver to drop in the standings but still are counted as being over 90% – example Perez & Massa in Canada? I see they are figured as not counting but you mention running out of fuel in the last 5 laps – that would also typically mean they are finishing as classified. Or are you only referring to races where running out of fuel in the last 5 means a DNF?

    Are you going to update this in the offseason – Chilton and Ericsson now on the list and Sutil’s lead further lengthened with 4 more driver related DNF!

    1. You raise a good point regarding late race accidents. In general, I didn’t record these if the driver was still classified. That might not be the most sensible way of doing the analysis.

      I’ve been updating the data (Sutil’s series of accidents did not go unnoticed!), so I will consider publishing an update, perhaps worked into another article on a variety of driver stats.

  10. […] also holds the dubious record of crashing out of races more often than any other current driver, with 1 crash every 5 starts. […]

  11. It would perhaps be useful to add a separate variable here.
    1. Single driver incidents
    2. Accidents caused by interaction with another driver

    and then divide the accidents of category 2 by the number of times the car was overtaken or overtakes on track. For example half the reason Chilton never gets in incidents is because he is never battling for position. Same for when Vettel/Hamilton/Schumacher win a race leading the entire time by 10+ seconds. This would give us a better understanding of who is really a safer driver and better overtaker.

    I suppose if this was going to be counted, we should also include non DNF causing incidents. But this would be an entirely different article altogether! good work

    1. Indeed, all good points. Maldonado is probably the expert when it comes to non-DNF-causing incidents! I like the first idea, although it is sometimes difficult to find these data for old races. Overtaking data are difficult to obtain even for modern races, and unfortunately pretty much impossible to find going further back. They can sometimes be estimated from lap-charts though.

  12. Palle Hellemann

    For the probability calculations, do an analysis of the variation of when DNF’s happen due to drivers error and due to technical defect. I’m sure the median of DNF’s due to driver errors will be at an earlier lap than the median of the DNF’s due to technical defects, but I might be wrong. This data can be used to more precisely predict which ever occurs first, instead of just presuming they are 50-50.
    Another issue, which I think is missing in this analysis, is the fact that some drivers are prone to cause other drivers to have DNF’s and continue the race themselves.
    A third issue is – how many unscheduled pitstops to shift a frontwing, or shift a punctured tire, does a driver have? This is also an indication of his stability/ability to stay out of trouble.
    Fourth, but not least: Please put in stats from Nick Heidfeld, who had a record of 41 classifications in a row brought to an end due to a collision with Sutil in 2009. Heidfeld was the only driver who finished all races in 2008. But probably this overcautious driving was exactly what prevented Heidfeld from becoming a top driver in F1 – he was too willing to yield, didn’t fight hard enough for his position. And this is also why his data is interesting to compare with.

    1. All very fair points. Regarding Heidfeld, these are his stats: 183 starts, 16 driver DNFs, 31 non-driver DNFs.
