2017 season overview

The 2017 season is, unfortunately, over, and with it ends the first season in which F1-predictor published qualifying and race predictions. In this post I would like to describe my experience from this first year of keeping the blog and – most importantly – provide an objective assessment of the 2017 predictions!

Personal View

In hindsight, creating a personal blog is something I should have done a few years ago! This blog gave me back much more than I expected! I had actually been building the F1-predictor engine since late 2015, but I kept postponing publishing a blog because I thought I wouldn’t have time to maintain it or, perhaps, because I wasn’t confident enough to discuss Data Science and Machine Learning publicly.

In order to reduce any possible hesitation about creating this blog, I had drafted about eight blog posts and had them ready to publish at any time. This took away many of the doubts regarding the time I would have available after publishing the blog – and it really helped.

Discussing and writing about Data Science concepts like feature engineering and predictive modeling gave me a deeper understanding of the concepts themselves, since I had to explain them as simply as possible. And of course, doing so helped me gain a lot of confidence!

There were some other ideas I blogged about, like the F1 sentiment analysis dashboard and the F1-predictor API, that I hadn’t initially planned but ended up implementing and publishing. These projects helped me brush up my R and PHP skills respectively.

OK, enough about my story. Let’s see how F1-predictor performed!

Model Assessment

In the following section I’m assessing the F1-predictor engine using an evaluation metric I haven’t blogged about yet. It’s a slightly modified version of the well-known RMSE, so you can interpret it similarly: lower is better.
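For reference, the standard RMSE over the $n$ drivers in a session, with $p_i$ the predicted and $a_i$ the actual position of driver $i$, is

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(p_i - a_i\right)^2}$$

The exact modification F1-predictor applies isn’t described in this post, but the “lower is better” interpretation carries over.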

Firstly, let’s compare the predictions for each race against some benchmarks. For qualifying, the benchmark is the previous race’s qualifying result, while for the race it is the starting position (i.e. ‘predict’ that every driver will finish the race in the position they start from). In other words, I’ll calculate the modified RMSE for both these benchmarks and for my predictions, and then compare them. Both benchmarks are a bit naive, but they make complete sense (at least to me)!
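As a minimal sketch of this comparison (plain RMSE stands in for the modified version, which isn’t spelled out in this post, and the position data below is made up purely for illustration):

```python
import math

def rmse(predicted, actual):
    """Plain RMSE over predicted vs. actual positions.

    Stand-in for the modified RMSE used on the blog; both share the
    'lower is better' interpretation.
    """
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical 5-driver race, positions by driver:
grid_positions   = [1, 2, 3, 4, 5]   # race benchmark: finish where you start
model_prediction = [1, 3, 2, 4, 6]   # what a model like F1-predictor might output
actual_finish    = [2, 3, 1, 4, 5]

benchmark_score = rmse(grid_positions, actual_finish)
model_score = rmse(model_prediction, actual_finish)
print(benchmark_score, model_score)  # the lower score wins the comparison
```

A race-level comparison like the table below is then just this calculation repeated per Grand Prix.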

So, here is the evaluation for each race in 2017:

| GP name | Qualifying results benchmark | Qualifying results performance | Race results benchmark | Race results performance |
|---|---|---|---|---|
| Australian GP | N/A | 3.52 | 0.88 | 1.11 |
| Chinese GP | 5.19 | 4.67 | 4.26 | 3.14 |
| Bahrain GP | 5.49 | 3.49 | 2.59 | 2.17 |
| Russian GP | 4.31 | 3.27 | 1.46 | 0.87 |
| Spanish GP | 3.64 | 3.54 | 3.92 | 3.72 |
| Monaco GP | 5.01 | 4.22 | 2.63 | 2.48 |
| Canadian GP | 4.59 | 1.95 | 2.48 | 2.40 |
| Azerbaijan GP | 3.63 | 2.83 | 3.89 | 3.82 |
| Austrian GP | 5.08 | 3.97 | 2.48 | 2.06 |
| British GP | 5.26 | 5.29 | 4.06 | 2.28 |
| Hungarian GP | 5.43 | 4.10 | 2.22 | 3.05 |
| Belgian GP | 3.84 | 3.15 | 3.55 | 3.65 |
| Italian GP | 7.27 | 6.28 | 4.00 | 2.40 |
| Singapore GP | 8.07 | 4.73 | 1.58 | 1.91 |
| Malaysian GP | 5.37 | 5.00 | 4.47 | 2.94 |
| Japanese GP | 5.60 | 3.32 | 2.61 | 1.83 |
| United States GP | 5.97 | 5.35 | 3.12 | 1.37 |
| Mexican GP | 6.11 | 3.85 | 2.88 | 2.88 |
| Brazilian GP | 5.53 | 5.78 | 4.00 | 1.84 |
| Abu Dhabi GP | 5.21 | 1.38 | 1.70 | 2.11 |
| Average | 5.29 | 3.98 | 2.94 | 2.40 |

As you can see, F1-predictor did much better than these benchmarks (I wouldn’t have created this blog if it didn’t :P). Specifically, there’s a 24.7% average improvement in qualifying predictions and an 18.3% improvement in race predictions. There were a few races where the benchmarks were more accurate than the ML model, but these were rather few.
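Those improvement figures follow directly from the averages in the table (recomputing from the rounded averages gives values a hair higher, since the post’s percentages were presumably derived from the unrounded per-race scores):

```python
def improvement(benchmark_avg, model_avg):
    """Relative improvement of the model over a benchmark, in percent.

    Lower scores are better, so improvement is the benchmark's score
    minus the model's, as a fraction of the benchmark's.
    """
    return (benchmark_avg - model_avg) / benchmark_avg * 100

# Averages from the table above (rounded to 2 decimals):
print(round(improvement(5.29, 3.98), 1))  # qualifying: ~24.8 (post reports 24.7%)
print(round(improvement(2.94, 2.40), 1))  # race: ~18.4 (post reports 18.3%)
```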

Apart from this metric, I also have some other evidence regarding the model’s performance! There’s an online game called f1-forecast.com where anyone can post their F1 predictions before each race. Specifically, you predict the top-5 qualifying drivers, the top-10 race finishers and the driver who’ll set the fastest lap in the race. Players are then given points based on how well they predict all of these.

You can imagine that F1-predictor could not miss this! The only thing missing was the fastest-lap forecast, which I predicted manually (although I plan to build another ML model to do it automatically). In this game, 547 competitors participated last year, making predictions throughout the season.

Here’s the final standing after the 2017 season ended:

[Image: F1-forecast.com standings]

Yaaaaay!!! A completely automated solution finished #23, not very far from the top spots in terms of points! Next year I hope to do even better!

That was all for 2017! No more race predictions till March 2018. Of course, I’ll keep publishing some Data Science posts every now and then. See you next season!

11 thoughts on “2017 season overview”

  1. Very good results! Congratulations!

    Keep going, I think there’s still room to grow.

    btw – I achieved 0.815 accuracy by simply changing NAs to a maximum time, and I’ve tried adding gaps as additional parameters; a post about it will be up on my blog soon. It will be interesting to hear your thoughts about it.

  2. I’ve just found this blog by luck. I read all the posts and I am very grateful to Stergios. I’ve learnt many things. I am just starting out in data science with an aerospace engineering background, and I am also an F1 fan, so I found everything I needed in this blog.

    Thank you Stergios, keep working!

    1. Thank you for your nice words Houcem. Your background sounds great for a data science career!

      Feel free to subscribe to the newsletter to get all the updates.
