World Cup and Oxford University: How does the data show that Brazil will be champion?

Yasmim Restum - November 25, 2022

After a victorious debut for Brazil at the 2022 World Cup in Qatar, anticipation is mounting over the possibility of a world title. But even the best pundits could not crunch as many numbers as the predictive models created at Oxford University.

Although the team coached by Tite is one of the favorites to win the cup, the mathematical prediction that Brazil will become six-time champion catches the eye of the curious and the attention of the most skeptical. After all, how do you build a mathematical model for something as variable as soccer?

Changing tactical schemes, substitutes swapping with star players, the heat from the crowd, an injury on the field, the players' emotions. How can mathematics digest all this and come up with the most likely answer for a single champion?

In the same way that companies face market fluctuations, inflation shocks, and currency swings, among other factors, soccer also involves variables that statistics and predictive analysis can handle.

If you want to understand how the renowned British institution built the study that is making the most noise in this Cup, and the intelligence behind it, keep reading.

Oxford University: research shows Brazil will be champion

Translating the puzzle box of soccer into statistics is a job that requires repetition. To reach the study's conclusions, professor and researcher Joshua Bull, from Oxford University's Mathematical Institute, brought together a team of data experts who analyzed the possible variables and built models capable of estimating the most likely future outcomes.

Here we are talking not only about predictive models, which estimate what is most likely to happen, but also about prescriptive models, which account for unlikely situations and their possible outcomes, as well as the widest possible range of hypothetical scenarios.

The statistical models assembled by Bull's team became true World Cup simulators: they ran more than 1 million simulations of the group matches, selected the most likely outcomes for the knockout phase, and then simulated each knockout game a further 100,000 times.
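To give a feel for what "simulating a game 100,000 times" means, here is a minimal sketch of a four-team knockout bracket replayed over and over. It is not the Oxford team's code: the team names, the strength values, and the rule that a team wins in proportion to its strength are all assumptions made up for illustration.

```python
import random
from collections import Counter

# Hypothetical team strengths, invented for illustration only.
STRENGTH = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}

def play(team_1, team_2, rng):
    """Return the winner of one knockout game (no draws).

    Assumption: a team wins with probability proportional to its strength.
    """
    p_team_1 = STRENGTH[team_1] / (STRENGTH[team_1] + STRENGTH[team_2])
    return team_1 if rng.random() < p_team_1 else team_2

def simulate_bracket(n_sims=100_000, seed=0):
    """Replay both semifinals and the final n_sims times.

    Returns the share of simulated tournaments won by each team.
    """
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        finalist_1 = play("A", "D", rng)
        finalist_2 = play("B", "C", rng)
        titles[play(finalist_1, finalist_2, rng)] += 1
    return {team: titles[team] / n_sims for team in STRENGTH}

title_share = simulate_bracket()
for team, share in sorted(title_share.items(), key=lambda item: -item[1]):
    print(f"Team {team}: {share:.1%} of simulated titles")
```

The share of simulated tournaments a team wins is read as its chance of being champion. That is why even a favorite ends up with a percentage well below 100%.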


Now, how do you simulate a game that will happen in the future?

It can't just be "guesswork," right? What the Oxford mathematicians did was use historical data (from 2018) to estimate which team had the best chance of winning, based on its performance in each match.

To do this, they used a metric well known in soccer analytics called xG (expected goals), which estimates the probability that a shot by a player or team ends in a goal. It takes into account who the player is and where the shot is taken from to tell whether a goal is more or less likely.

With another tool, called Elo Ratings, they measured the strength of the opponents in each game. There are thousands of other calculations possible with xG, but Bull chose to feed the model with the teams' past games.
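For readers who have never met Elo ratings, the sketch below shows the standard Elo expected-score formula, which turns a rating gap into an expected result. The ratings in the example are hypothetical, and the article does not say whether Bull's model uses this exact formula or only the ratings themselves.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score of team A under the standard Elo formula.

    The score is on a 0-to-1 scale: 1 = win, 0.5 = draw, 0 = loss.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Hypothetical ratings, chosen only to illustrate the scale.
even_match = elo_expected_score(1800, 1800)      # 0.5
clear_favorite = elo_expected_score(2000, 1500)  # about 0.947
print(even_match, round(clear_favorite, 3))
```

Two equally rated teams each get an expected score of 0.5, and the two teams' expected scores always add up to 1.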

In business terms, thinking about retailers, it would be like looking at your best sellers, customer base, and promotion reach to structure logistics and inventory and to define which product will be the highlight of Black Friday, for example.

Cross-referencing the Elo Ratings with xG reveals a correlation between them: if Team A has 500 more Elo points than Team B, we can expect about 1.8 xG for Team A, that is, about 2 goals by the end of the match. In addition, the teams' most recent matches carry more weight in the calculation than older ones, and it is possible to add an "unexpected" variable to the models, as discussed above.
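The ideas in this paragraph can be sketched in a few lines. The only number taken from the text is the pair "+500 Elo, about 1.8 xG". Everything else is an assumption made for illustration and is not the Oxford model: the straight-line shape of the relationship, the 1.2 xG baseline for evenly matched teams, the 0.8 recency decay, and the use of a Poisson distribution to turn xG into a goal count.

```python
import math
import random

# Assumption: a team facing an equally rated opponent creates about 1.2 xG.
# This baseline is invented for illustration, not taken from the Oxford model.
BASELINE_XG = 1.2
# The only point taken from the text: +500 Elo corresponds to about 1.8 xG.
SLOPE = (1.8 - BASELINE_XG) / 500

def expected_goals_from_elo(elo_diff):
    """Map an Elo difference to xG with a straight line (assumed shape)."""
    return max(0.1, BASELINE_XG + SLOPE * elo_diff)

def recency_weighted_xg(xg_history, decay=0.8):
    """Average past xG values, most recent first, with shrinking weights.

    The decay factor of 0.8 is a hypothetical choice.
    """
    weights = [decay ** age for age in range(len(xg_history))]
    return sum(w * x for w, x in zip(weights, xg_history)) / sum(weights)

def sample_goals(xg, rng):
    """Draw a goal count from a Poisson distribution with mean xg (Knuth's method)."""
    threshold = math.exp(-xg)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals

rng = random.Random(42)
xg_a = expected_goals_from_elo(500)
scores = [sample_goals(xg_a, rng) for _ in range(20_000)]
average_goals = sum(scores) / len(scores)  # should land close to xg_a
print(f"xG for Team A: {xg_a:.2f}, average simulated goals: {average_goals:.2f}")
```

The random draw plays the role of the "unexpected" variable: the same xG can produce a goalless game in one simulation and a rout in the next, while the average over many simulations stays close to the xG value.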

If you want to understand this in more depth, a video published on the Oxford Mathematics channel walks through the predictions with mathematician Joshua Bull.

Statistics and Predictive Analysis

Beyond the fun challenge of predicting soccer results, analytical models create great competitive advantages in many sectors, whether sports, tourism, retail, insurance and banking, or even entertainment. Below are 3 examples of data modeling applications.

Credit Granting

This is probably the example most people have come across: it seeks to statistically predict the chance that a future customer of a financial product will default.
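As a rough illustration of what "predicting the chance of default" looks like in practice, here is a minimal sketch of a logistic scoring function, one common way to turn customer data into a probability. The two input variables and all coefficients are hypothetical. A real model would estimate them from historical repayment data.

```python
import math

# Hypothetical coefficients, invented for illustration only.
INTERCEPT = -2.0
COEF_DEBT_TO_INCOME = 3.0
COEF_LATE_PAYMENTS = 0.5

def default_probability(debt_to_income, late_payments):
    """Estimate the probability of default with a logistic function."""
    score = (INTERCEPT
             + COEF_DEBT_TO_INCOME * debt_to_income
             + COEF_LATE_PAYMENTS * late_payments)
    return 1.0 / (1.0 + math.exp(-score))

low_risk = default_probability(debt_to_income=0.1, late_payments=0)
high_risk = default_probability(debt_to_income=0.6, late_payments=3)
print(f"low risk: {low_risk:.1%}, high risk: {high_risk:.1%}")
```

The lender can then approve, decline, or price the credit according to where the estimated probability falls.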


Experimenting, observing, learning, and correcting can be a good strategy, but predictive models can be extremely useful for doing this with large amounts of data.

The results can also be faster and more accurate, helping to design numerous marketing actions geared toward success.

With the proper use of models, it is possible to determine the appropriate time to start campaigns, target audience, estimated return, and work efficiency.

Portfolio monetization

A company's relationship with its customer portfolio is certainly one of its greatest assets - regardless of the business vertical: retail, bank, insurance company, or industry.

Thus, a good understanding of your audience's behavior and needs can lead to high levels of satisfaction, generating loyalty and profitability.

Propensity models for other products (upsell and cross-sell), churn prevention, or even adaptation of the product already in use (a limit adjustment, for example) can be important tools in maintaining the relationship with customers.

So, will Brazil win the World Cup or not?

The Oxford researchers predict that Brazil is the most likely champion of the World Cup in Qatar, with a 14.7% chance of winning the title. The Brazilian team would face Belgium in the final on December 18 and win, with a 61% probability.

But before celebrating, it is important to note that Argentina comes in right after Brazil, trailing by a narrow margin of less than 0.4 percentage points. Statistically, the two are in a technical tie.

But since not everything that happens on the field can be translated into numbers, upsets have appeared once again at the World Cup. This time, the victim was Germany, who lost 2-1 to Japan in their opening game. Also by 2-1, another favorite defied its track record: Argentina lost to Saudi Arabia in the first round of Group C, with 3 Argentina goals disallowed in the first half alone.

In other words, although Argentina put the ball in the net more often than Saudi Arabia, as the model predicted, fewer of its goals counted. Perhaps the model should be refined with additional on-field variables to make it more and more accurate.

Not only in soccer, but in any business, the models need to be constantly revised and have their methodologies questioned to keep up with the business objectives, the market oscillations, the competition, and the changes in consumer desires.

Therefore, it is not possible to say that Brazil will definitely be champion of the 2022 World Cup, but there is a good chance. In any case, there will be only 1 champion, and the result will be an excellent test of the effectiveness of these Oxford analytical models, along with their strengths and limitations.

Thus, the use of analytical models to set up tactical schemes and anticipate tournament results is another way to popularize the scientific resource and bring it closer to society as a whole.

