Car reliability has always played a central role in Formula 1 — an inevitable consequence of teams not wanting to leave any potential performance unexploited. Mechanical DNFs at inopportune times have decided numerous championships. Who can forget Mansell’s puncture at Adelaide 1986 or Hamilton’s shock engine failure at Malaysia 2016? Many of the sport’s most memorable upset results have also come as a result of reliability problems or weather conditions eliminating much of the field.
Today, it is commonly assumed that Formula 1 drivers have little influence on the reliability of their cars, and that mechanical failures are down to a roll of the dice. Sophisticated electronics and monitoring tools carefully protect sensitive components from inappropriate inputs. Drivers are fed constant information on their brake temperatures, fluid pressures, etc., with instructions from the pitwall on how to manage issues as soon as they appear on the telemetry trace. Gone are the days of a driver accidentally missing a gear shift or overrevving the engine. Drivers today can still put undue wear on the car in other ways, such as running too aggressively over kerbs, but tracks have become relatively sanitized and the cars ever more robust. In this article, I’ll investigate whether there is evidence for drivers, past or present, affecting their own car’s reliability.
Mechanical reliability across F1 history
Before going into this analysis, it’s important to note that car failure rates have varied over the course of Formula 1 history. To explore this, I used my database of Formula 1 race results, in which I have previously coded all DNFs as either driver DNFs (e.g., crashes) or non-driver DNFs (e.g., engine failures). This is the same database I use for my model-based driver rankings and update annually for my end-of-season performance rankings.
I note that this database only includes drivers who completed at least three “counting races” (i.e., races without a non-driver DNF) in at least one season in their career. It therefore excludes drivers with very short careers or drivers who rarely managed to qualify. Since these excluded drivers tended to be in poor machinery, the season-average DNF rates shown below are probably slight underestimates compared to the full historical record, but they will nevertheless capture the historical trends.
The graph below shows how the percentages of starts ending in different types of DNFs have varied over time.
Across Formula 1 history, driver DNFs (chiefly, crashes) have ranged between ~5-20% of starts, perhaps reflecting a combination of varying difficulty of tracks and cars, along with changing levels of driver skills between eras. On the other hand, non-driver DNFs (chiefly, mechanical failures) have ranged between ~10-50% of starts. Since almost all non-driver DNFs are mechanical DNFs (rare exceptions include technical disqualifications), I will refer to non-driver DNFs as mechanical DNFs from here onward in this article.
Reliability issues and total DNF rates both peaked in the mid-1960s and mid-1980s. Since 2007, we have experienced a historically low level of DNFs, driven mostly by a reduction in mechanical DNFs. Each season in the past decade has had mechanical DNFs in ≤15% of starts. This trend has had the following outcomes:
- Upset results are generally less common now. In the 1980s, a new driver got their first career win every 11 races on average, and a new driver got their first podium every 8 races. From 2007 to 2017, a new winner was crowned every 20 races on average, and a new podium finisher emerged every 12 races.
- Given the relative rarity of DNFs today, as well as the need to count all races, a single DNF for a championship competitor is now viewed as a more grave penalty. Before 1991, a driver’s worst 2-7 results (rules varying by year) did not count, meaning some DNFs had no actual consequence.
Whether these are positive or negative outcomes from a sporting standpoint depends on the aspects of the sport you find most entertaining.
Meaningful differences (and avoiding jumping to conclusions)
First, let’s consider an example that illustrates how poorly intuition can serve us when dealing with statistics of small numbers (a similar point was made in a previous article on driver crash rates).
In the image below, imagine that each box represents a race in a 20-race season. A red box is a mechanical DNF. Clearly, Driver 2 had much worse reliability in this season, with 8 mechanical DNFs to Driver 1’s 2.
[Note that such cases certainly occur in practice, and this example is similar to the 2004 McLaren drivers, who experienced 8 and 2 mechanical DNFs respectively in 18 starts.]
Can we say with confidence that these two samples are significantly different in a statistical sense? To pose the question another way, how unlikely would this season result be if Drivers 1 and 2 were actually identical in their likelihood of experiencing a mechanical DNF, and the observed difference were all down to luck?
Perhaps surprisingly, the p-value under this null hypothesis is 0.065, meaning we would expect to see a difference at least this large between identical teammates in 6.5% of seasons. Using the customary (arbitrary) cut-off of p<0.05, we should conclude that there is no statistically significant difference here. We would need to see such a difference play out over a longer time period before concluding that there is likely a systematic difference in the rates of mechanical failure for these drivers. To put it another way, a single season does not give us enough statistical power to conclude otherwise.
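The figure for the 2-versus-8 example can be reproduced with a two-sided Fisher’s exact test, the test named later in this article. The sketch below is a minimal pure-Python version; the function name and argument layout are my own, and that the quoted figure was produced in exactly this way is an assumption.

```python
from math import comb

def fisher_exact_two_sided(dnf_a, starts_a, dnf_b, starts_b):
    """Two-sided Fisher's exact test p-value for two drivers' DNF counts."""
    total_starts = starts_a + starts_b
    total_dnfs = dnf_a + dnf_b
    denom = comb(total_starts, total_dnfs)

    def prob(k):
        # Chance that exactly k of the pooled DNFs land on driver A
        # if both drivers share the same underlying failure rate.
        return comb(starts_a, k) * comb(starts_b, total_dnfs - k) / denom

    lowest = max(0, total_dnfs - starts_b)
    highest = min(starts_a, total_dnfs)
    p_observed = prob(dnf_a)
    # Add up every split that is no more likely than the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )

# Driver 1: 2 DNFs in 20 starts; Driver 2: 8 DNFs in 20 starts.
p = fisher_exact_two_sided(2, 20, 8, 20)
print(f"p = {p:.3f}")  # prints "p = 0.065"
```

The same function can be applied to any pair of DNF tallies, for example a full career against the pooled record of all teammates.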
Analysis of driver reliability rates
Since reliability rates differ greatly between years and between teams, we can’t look at absolute reliability rates if we are trying to establish a driver’s influence on reliability. Instead, we can compare a driver’s reliability to their teammates’ to see if there are systematic relative differences. For this analysis, I used all drivers in my database who debuted in 1980 or later. This choice simplifies the analysis: before 1980, teams often ran more than two cars and customer cars were common, both of which would complicate teammate comparisons.
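As a rough illustration of the teammate comparison, the sketch below tallies a driver’s mechanical DNFs and starts against those of their teammates. The record layout (`Result`), the function name, and the sample rows are all hypothetical, and the rule of counting only races where a teammate also started is my assumption about how such a comparison could be set up, not a description of the actual database.

```python
from collections import namedtuple

# Hypothetical layout: one row per car per race start.
Result = namedtuple("Result", "race team driver mech_dnf")

def teammate_table(results, driver):
    """Return (driver_dnfs, driver_starts, mate_dnfs, mate_starts),
    counting only races where the driver had a teammate on the grid."""
    by_race_team = {}
    for r in results:
        by_race_team.setdefault((r.race, r.team), []).append(r)

    d_dnf = d_starts = m_dnf = m_starts = 0
    for entries in by_race_team.values():
        own = [r for r in entries if r.driver == driver]
        mates = [r for r in entries if r.driver != driver]
        if not own or not mates:
            continue  # driver absent, or no teammate started this race
        d_starts += len(own)
        d_dnf += sum(r.mech_dnf for r in own)
        m_starts += len(mates)
        m_dnf += sum(r.mech_dnf for r in mates)
    return d_dnf, d_starts, m_dnf, m_starts

# Made-up rows purely for illustration.
results = [
    Result(1, "X", "A", True), Result(1, "X", "B", False),
    Result(2, "X", "A", False), Result(2, "X", "B", False),
    Result(3, "X", "A", True),  # no teammate started, so this race is skipped
]
print(teammate_table(results, "A"))  # prints (1, 2, 0, 2)
```

The four numbers form the 2x2 table (DNF or no DNF, driver or teammates) that a test of proportions would then be run on.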
For each driver, I computed a p-value using Fisher’s exact test, giving the probability that a record at least this lopsided could be observed by chance. The table below presents the drivers with the most extreme p-values.
[Note: The statistics used below assume that each driver’s record is an independent sample from others’. In reality, this is not completely valid, as one driver’s reliability record also influences the tallies of their teammates. For drivers with longer careers and multiple teammates, we can reasonably assume that they see a sample of drivers with differing effects and are thus being compared to an estimate of the population mean. For drivers with very short careers, this is potentially problematic, however. For instance, a driver with only one teammate might appear to have a positive effect on reliability if that one teammate was particularly hard on their car. Or they might have their own positive effect masked if their only teammate had an equal positive effect on reliability. A full treatment of this statistical dependence would require a much more complicated statistical model, so here I proceed with a simplified approach to determine the general results, under a non-ideal assumption.]
Since we are performing multiple comparisons here (178 drivers in total), we need to be alert to the possibility of finding improbable results just due to the sheer number of comparisons. Even if the universal null hypothesis is true (i.e., every single driver has the same probability of DNFs as their teammates), a standard threshold of p<0.05 is expected to yield significant results for ~1 in 20 cases. A more stringent cut-off is therefore needed here.
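To put a number on this, the short calculation below estimates how many of the 178 comparisons would cross p<0.05 by chance alone. It treats the comparisons as independent, which is a simplifying assumption (the note above explains why it does not strictly hold), and the variable names are my own.

```python
n_drivers = 178   # number of comparisons, from the text
alpha = 0.05      # conventional per-test threshold

# Expected count of "significant" results if every null hypothesis is true.
expected_false_positives = n_drivers * alpha

# Chance of at least one false positive, assuming independent tests.
p_at_least_one = 1 - (1 - alpha) ** n_drivers

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one): {p_at_least_one:.4f}")
```

Under these assumptions roughly nine drivers would look "significant" at p<0.05 even if no driver had any real effect, which is why an uncorrected threshold is not usable here.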
In a large dataset such as this, we can correct for multiple comparisons using the Benjamini-Hochberg method. Allowing a typical false discovery rate of 10%, we find that only one of these comparisons can be considered statistically significant: Alain Prost’s significantly lower rate of mechanical DNFs than his teammates. Michele Alboreto’s higher rate of DNFs than his teammates is close to statistical significance. All other drivers in the sample are well over the significance threshold, meaning their results are easily accounted for by chance.
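For readers unfamiliar with it, the Benjamini-Hochberg procedure can be sketched in a few lines. This is a generic textbook version rather than the exact code used for this analysis, and the p-values in the example are illustrative placeholders: only the 0.0005 value and the count of 178 echo figures quoted in this article, and everything else is made up.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected

# Illustrative input: one very small p-value, one moderately small,
# and 176 unremarkable ones (placeholders, not real driver data).
p_values = [0.0005, 0.002] + [0.5] * 176
rejected = benjamini_hochberg(p_values, fdr=0.10)
print(sum(rejected))  # prints 1
```

With 178 comparisons and a 10% false discovery rate, the smallest p-value must fall below 0.10/178 ≈ 0.00056 to be declared significant, and the second smallest below about 0.0011. That is a far stricter bar than p<0.05, which helps explain why so few drivers clear it.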
Although not statistically significant, Gabriele Tarquini’s record is worthy of a brief comment just for its absurdity. Tarquini drove for several of the least competitive teams of the late 1980s and early 1990s, including Osella, Coloni, Fondmetal, and AGS. This is reflected in his record of 79 race entries but only 38 starts, due to frequent difficulties qualifying. In some races he was the only entrant for his team, but in general his teammates did a much worse job of qualifying, reflected in the fact that he has only 9 starts alongside a teammate. To add insult to injury, his car broke down in 6 of those.
Michele Alboreto’s very poor reliability record is surprising. As far as I’m aware, Alboreto did not acquire a reputation as a car destroyer, although his 1985 title challenge was notably affected by poor reliability. That year, Alboreto had 7 mechanical DNFs (including 5 consecutive failures in the last 5 races) to title rival Alain Prost’s 3, whereas Alboreto’s teammate Johansson also had only 3 mechanical DNFs. Looking through Alboreto’s career, lower reliability than his teammates is a consistent feature. Was it all incredible misfortune or was Alboreto actually too hard on his equipment? This one is difficult to call.
Alain Prost: a unique case
In Alain Prost’s case, and his case alone, it is extremely likely that we are seeing a systematic difference in mechanical reliability, rather than a difference that could be attributed to chance.
As to the cause of this difference, we can consider two possibilities. One is Prost’s widely held reputation as a driver who was exceptionally gentle with his machinery and never pushed more than was required. The other is the possibility that Prost was given superior equipment to his teammates. The latter possibility can be tested (and rejected) by comparing Prost’s DNF rates against his various teammates, and by comparing Prost’s record to other top drivers.
The table below shows how Prost’s reliability compared to that of each of his teammates. As we can see from this, Prost’s reliability was consistently better, even against teammates such as Lauda, Rosberg, and Senna, whom he faced on relatively equal terms. Moreover, his reliability advantage was not generally larger in cases where he faced more junior teammates, such as Johansson and Alesi, where he would be expected to benefit most from number 1 driver status.
It’s interesting to note that Rene Arnoux also has a very favorable career reliability record, with 43 mechanical DNFs to his teammates’ 67 (p=0.004). He is not included in this analysis sample, due to debuting before 1980, but a quick-and-dirty analysis shows that in terms of the p-value he would rank only behind Alain Prost (41-73, p=0.0005) and Jean-Pierre Jabouille (a car destroyer: 23-9, p=0.0008) if the analysis were formally extended to include drivers who debuted from 1950-1979.
Next, we can look at Prost’s record alongside other world champions from the sample, many of whom enjoyed strict number 1 driver status for most of their careers.
As we can see from this, only Prost has a record that is well outside the range of chance. The next closest in terms of statistical significance is Nigel Mansell, who actually trends in the other direction (more mechanical failures than his teammates).
We are therefore left with the inescapable conclusion that Alain Prost had a significant positive influence as a driver on the reliability of his own cars.
Other interesting cases
Using this approach, it’s interesting to investigate some other drivers who are perceived as frequently breaking down or experiencing abnormal levels of misfortune.
Andrea de Cesaris was a renowned DNF specialist, holding the record for most total DNFs and most consecutive DNFs. My previous analysis of crashes also ranked him among the most crash-prone drivers (living up to his unflattering nickname), which helped to significantly boost his DNF tally. While the cars he drove across his career were highly unreliable, his reliability record was not significantly worse than that of his teammates: 90 mechanical DNFs to 79 for his teammates (p=0.30).
Mark Webber was often considered prone to poor reliability throughout his Formula 1 career, especially during his days at Jaguar and Red Bull. An analysis of his career reliability failures shows no bias, however. Webber had 34 mechanical DNFs to 36 for his teammates. This analysis doesn’t include cases where a driver had mechanical issues yet was able to finish the race, which perhaps afflicted Webber to some degree.
Jean Alesi is another name often mentioned in discussions of the most unlucky Formula 1 drivers. His decision to drive for Ferrari over Williams in 1991 was surely an unfortunate one, and he is well remembered for near misses on the way to his first win. As one of my previous analyses showed, he might have won the 1992 championship title, had he been at Williams. As far as reliability went, however, he was luckier than his teammates if anything, with 57 mechanical DNFs to 70 for his teammates (p=0.20).
To my knowledge, this is the first formal analysis of drivers’ reliability rates compared to their teammates across a large part of Formula 1 history. The key findings are:
- Since about 2007, reliability has been much higher in Formula 1. It is reasonable to assume that drivers today have less influence on mechanical reliability than drivers in the 1980s and earlier.
- Caution is needed whenever trying to draw conclusions about luck or driver influence from a single season — pure chance can cause large discrepancies in small samples.
- In the great majority of cases, career differences in reliability between drivers and their teammates are explainable by chance alone (i.e., the null hypothesis). This is not to say that drivers cannot influence mechanical reliability, but it does imply that if there is an effect of the driver it must generally be quite small (too small to detect from a typical Formula 1 career). This is consistent with an assumption of some of my modeling, which is that non-driver DNFs are mostly down to chance.
- Alain Prost stands out as a singular example of a driver who positively influenced the reliability of his cars. In models that ignore non-driver DNFs altogether, he is therefore going to be slightly underrated due to this virtue being neglected.