Olympic Sages by DataDruids

Introduction

We are Team Data Druids, and we were tasked with predicting the winners of the 2024 Paris Olympics based on previous data. A significant portion of our data comes from the 2021 Tokyo Olympics. We analyzed historical data, visualized our results, and hypothesized how this data could be used to predict the outcomes of the 2024 Olympics. Our final step was to generate predictions for the 2024 Paris Olympics.

Preliminary Analysis

We wanted to understand how data from previous Olympics could be used to predict future Games. This section presents some of the analysis we conducted using a dataset from the 2021 Tokyo Olympics.

We have created four interactive pie charts showcasing the top ten countries in various medal categories. These charts allow you to see the number of medals each country won by hovering over the respective segments. Additionally, you can toggle countries on and off to observe how this affects the overall medal distribution.

Radar Chart

We have created a radar chart displaying the top five countries with the most medals. This chart illustrates the distribution of gold, silver, and bronze medals relative to the total number of medals each country won. You can toggle each category on and off to gain a clearer understanding of how these countries compare to one another.
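
The normalization behind a chart like this can be sketched in a few lines. This is a minimal illustration rather than the notebook's actual plotting code: the function name `medal_shares` and the placeholder counts are assumptions made up for the example.

```python
def medal_shares(gold, silver, bronze):
    """Return each medal type as a fraction of a country's total medals."""
    total = gold + silver + bronze
    if total == 0:
        # A country with no medals has no meaningful distribution.
        return {"gold": 0.0, "silver": 0.0, "bronze": 0.0}
    return {
        "gold": gold / total,
        "silver": silver / total,
        "bronze": bronze / total,
    }


# Placeholder counts for a hypothetical country (not real Olympic results).
shares = medal_shares(gold=10, silver=6, bronze=4)
print(shares)
```

Expressing each category as a share of the total lets countries with very different medal counts be compared on the same axes.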

Stacked Bar Chart

We have created a stacked bar chart featuring all the countries that participated in the 2021 Tokyo Olympics, displaying their respective gold, silver, and bronze medals. You can toggle each medal category to see how the countries compare to one another.

Predictions for the 2024 Paris Olympics

While analyzing data from previous Olympics, we came up with several ways to predict which countries will win the most medals at the 2024 Olympics.

Linear Regression Model

This chart shows the 10 countries with the highest predicted total medal counts from a linear regression model, based on the number of athletes each country sent to the 2021 Tokyo Olympics. The model's mean absolute error is 2.8 and its R-squared value is -0.8. A negative R-squared means the model fits worse than simply predicting the mean medal count for every country, so the model did not perform well and needs improvement. The chart is interactive: hover over a section to see more information.
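
To see how an R-squared value can fall below zero, both metrics can be computed by hand. The sketch below uses made-up numbers, not our Tokyo data, and the helper names `mae` and `r_squared` are our own for this example. It compares a set of poor predictions against a baseline that always predicts the mean.

```python
def mae(actual, predicted):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """R-squared = 1 - SS_res / SS_tot.

    The value is 0 for a model that always predicts the mean of `actual`
    and negative for a model that does worse than that baseline.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only.
actual = [1, 2, 3, 4]
bad_predictions = [4, 3, 2, 1]          # gets the trend backwards
mean_baseline = [2.5, 2.5, 2.5, 2.5]    # always predicts the mean

print(mae(actual, bad_predictions))        # 2.0
print(r_squared(actual, bad_predictions))  # -3.0
print(r_squared(actual, mean_baseline))    # 0.0
```

A score below zero therefore signals that the feature and model combination explains less of the variation in medal counts than a constant guess would.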

Random Forest Regressor Model

This chart also shows the 10 countries with the highest predicted total medal counts, this time from a random forest regressor based on the number of athletes and coaches each country sent to the 2021 Tokyo Olympics. We made several improvements over the previous model: the coaches data provides a more comprehensive view, and grid search was used to tune the model's hyperparameters. The model's mean absolute error is 0.5 and its R-squared value is 0.9, indicating that it performs well. The chart is interactive: hover over a section to see more information.
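
For readers unfamiliar with grid search, here is a minimal sketch of how a random forest regressor might be tuned with scikit-learn. It is not the code from our notebook: the data is synthetic, and the hyperparameter grid, fold count, and scoring choice are all assumptions chosen to keep the example small.

```python
import random

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT real Olympic figures): each row is
# [athletes, coaches] for a hypothetical delegation.
rng = random.Random(0)
X = [[rng.randint(1, 600), rng.randint(0, 30)] for _ in range(80)]
# Invented relationship, only so the model has something to fit.
y = [0.1 * athletes + 0.5 * coaches + rng.gauss(0, 2) for athletes, coaches in X]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed hyperparameter grid; the values are illustrative.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

# Try every combination in the grid with 3-fold cross-validation and
# keep the one with the lowest mean absolute error.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

# Evaluate the best model on data it has not seen during tuning.
predictions = search.predict(X_test)
test_mae = mean_absolute_error(y_test, predictions)
test_r2 = r2_score(y_test, predictions)
print(search.best_params_, test_mae, test_r2)
```

Scoring the tuned model on a held-out test split, as done here, keeps the reported error separate from the data used to choose the hyperparameters.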

If you want more insight into our predictions for the 2024 Summer Olympics, you can follow this link to the Jupyter Notebook.