The 2017 season is, unfortunately, over, and with it ends the first season in which F1-predictor published qualifying and race predictions. In this post I'd like to describe my experience from this first year of keeping this blog and, most importantly, provide an objective assessment of the 2017 predictions!
In hindsight, creating a personal blog is something I should have done a few years ago! This blog gave me back much more than I expected! I had actually been building the F1-predictor engine since late 2015, but I kept postponing publishing a blog, partly because I thought I wouldn't have time to maintain it and partly because I wasn't confident enough to discuss Data Science and Machine Learning publicly.
To reduce any hesitation about creating this blog, I had drafted about eight blog posts and kept them ready to publish at any time. This took away many of my doubts about how much time I would have available after launching the blog, and it really helped.
Writing about Data Science concepts like feature engineering and predictive modeling forced me to gain a deeper understanding of them, since I had to explain them as simply as possible. And, of course, doing so helped me gain a lot of confidence!
There were some other ideas I blogged about, like the F1 sentiment analysis dashboard and the F1-predictor API, that I hadn't initially planned but ended up implementing and publishing. Both projects helped me brush up my R and PHP skills, respectively.
OK, enough about my story. Let’s see how F1-predictor performed!
In the following section I assess the F1-predictor engine using an evaluation metric I haven't blogged about yet. It's a slightly modified version of the well-known RMSE, so you can interpret it similarly.
Let's start by comparing the predictions for each race against some benchmarks. For qualifying predictions, the benchmark is the previous race's qualifying results, while for the race the benchmark is the starting position (i.e. 'predict' that every driver will finish the race in the position they start from). In other words, I'll calculate the modified RMSE for both these benchmarks and for my predictions, and then compare them. Both benchmarks are a bit naive, but they make complete sense (at least to me)!
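To make the benchmark comparison concrete, here is a minimal Python sketch. The exact modification to RMSE isn't described in this post, so this uses plain RMSE, and the driver positions below are made-up illustrative numbers, not real 2017 data.

```python
import math

def rmse(predicted, actual):
    """Plain RMSE between two equal-length lists of finishing positions.
    (The blog uses a slightly modified RMSE; the modification isn't
    described here, so this is the standard version.)"""
    assert len(predicted) == len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted))

# Hypothetical five-driver race (positions only, not real data):
grid = [1, 2, 3, 4, 5]    # starting positions (the race benchmark)
model = [2, 3, 1, 4, 5]   # model's predicted finishing order
result = [2, 3, 1, 5, 4]  # actual finishing positions

benchmark_error = rmse(grid, result)   # 'every driver finishes where they start'
model_error = rmse(model, result)

# Improvement over the benchmark, as a percentage:
improvement = (benchmark_error - model_error) / benchmark_error * 100
```

The same comparison works for qualifying: swap `grid` for the previous race's qualifying order and `result` for this race's qualifying classification.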
So, here is the evaluation for each race in 2017:
| GP name | Qualifying results benchmark | Qualifying results performance | Race results benchmark | Race results performance |
| --- | --- | --- | --- | --- |
| United States GP | 5.97 | 5.35 | 3.12 | 1.37 |
| Abu Dhabi GP | 5.21 | 1.38 | 1.70 | 2.11 |
As you can see, F1-predictor did much better than these benchmarks (I wouldn't have created this blog if it hadn't :P). Specifically, there's a 24.7% average improvement in qualifying predictions and an 18.3% average improvement in race predictions. There were a few races where the benchmarks were more accurate than the ML model, but these were rather few.
Apart from this metric, I also have some other evidence regarding the performance of the model! There's an online game called f1-forecast.com where anyone can post their F1 predictions before each race. Specifically, players predict the top-5 qualifying drivers, the top-10 race finishers, and the driver who will set the fastest lap of the race. Players are then awarded points based on how well they predicted all of these.
You can imagine that F1-predictor could not miss this! The only thing missing was the fastest-lap forecast, which I predicted manually (although I plan to build another ML model to do it automatically). Last year, 547 competitors participated in the game, making predictions throughout the season.
Here’s the final standing after the 2017 season ended:
Yaaaaay!!! A completely automated solution finished #23, not very far in points from the top spots! Next year I hope to do even better!
That was all for 2017! No more race predictions till March 2018. Of course, I’ll keep publishing some Data Science posts every now and then. See you next season!