From Players to Champions:
A Generalizable Machine-Learning Approach
for Match-Outcome Prediction with Insights from the FIFA World Cup

University of Michigan-Dearborn

Abstract

Accurate prediction of FIFA World Cup match outcomes is valuable to analysts, coaches, bettors, and fans. We introduce a machine-learning framework that fuses player-level performance metrics (goals, assists, passing accuracy, tackles, …) with team-level historical data to forecast match winners. Multi-year information is distilled into year-specific team profiles, dimensionality reduction keeps the feature count manageable, and an ensemble of classifiers is tuned via cross-validation. On the 2022 FIFA World Cup dataset, our model outperforms a head-to-head baseline, which underscores the importance of granular player attributes and offers insights into player synergy and strategic match-ups.


Project Overview

The proposed pipeline (Figure 1) comprises:

  • Feature engineering that aggregates yearly player data into team vectors.
  • Scaling and PCA to combat high-dimensional noise.
  • An ensemble of Logistic Regression, Random Forest, Gradient Boosting, AdaBoost, and k-Nearest Neighbors.
  • Majority voting to yield a single probability of victory for each team.
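The first and last steps above can be illustrated with a minimal sketch using only the Python standard library. It is not the released implementation. The record layout, the use of a per-metric mean as the aggregation, the difference-of-vectors match representation, and the names `team_profiles`, `match_features`, and `majority_vote` are all assumptions made for illustration, and the player rows are placeholders rather than real data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Placeholder player rows: (team, year, goals, assists, pass_accuracy, tackles).
# Team names and numbers are made up for illustration, not taken from the dataset.
PLAYERS = [
    ("Team A", 2022, 3, 1, 0.80, 4),
    ("Team A", 2022, 1, 3, 0.90, 2),
    ("Team B", 2022, 2, 0, 0.70, 6),
    ("Team B", 2022, 0, 2, 0.60, 8),
]


def team_profiles(players):
    """Average every numeric player metric within each (team, year) group."""
    grouped = defaultdict(list)
    for team, year, *metrics in players:
        grouped[(team, year)].append(metrics)
    return {key: [mean(col) for col in zip(*rows)] for key, rows in grouped.items()}


def match_features(profiles, team_a, team_b, year):
    """Represent a match as the element-wise difference of two team vectors."""
    return [a - b for a, b in zip(profiles[(team_a, year)], profiles[(team_b, year)])]


def majority_vote(labels):
    """Return the most frequent label and the share of votes it received."""
    winner, votes = Counter(labels).most_common(1)[0]
    return winner, votes / len(labels)


profiles = team_profiles(PLAYERS)
features = match_features(profiles, "Team A", "Team B", 2022)

# Stand-in votes from five classifiers; a real run would call each fitted model.
votes = ["Team A", "Team A", "Team B", "Team A", "Team B"]
winner, share = majority_vote(votes)
print(features, winner, share)
```

With an odd number of voters and two possible labels a tie cannot occur. The vote share returned alongside the winner is only a rough confidence indicator, not a calibrated probability.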

Key Figures

Figure 1 – End-to-End Framework

Diagram of the predictive pipeline showing data ingestion, preprocessing, model ensemble, and majority voting.
End-to-end ML framework for match-outcome prediction (adapted from page 1 of the paper).

Figure 2 – Training Pipeline

Block diagram of the training pipeline with feature extraction, PCA, cross-validation and model selection
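The cross-validation and model-selection stage of this pipeline can be sketched as follows. This is a hedged, standard-library illustration rather than the project's training code: k-Nearest Neighbors stands in for the full ensemble because it is short to write, and the fold count, the candidate values of `k`, the fixed seed, and the toy data are all assumptions.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices once and deal them into disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, x, k):
    """Label x by majority vote among its k nearest training points."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in by_distance[:k]).most_common(1)[0][0]


def cv_accuracy(X, y, k, n_folds=5, seed=0):
    """Mean held-out accuracy of k-NN across the folds."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def select_k(X, y, candidates=(1, 3, 5)):
    """Pick the candidate k with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(X, y, k))


# Toy data: two well-separated clusters standing in for PCA-reduced match features.
X = [[i * 0.1, 0.0] for i in range(10)] + [[5 + i * 0.1, 1.0] for i in range(10)]
y = [0] * 10 + [1] * 10
best_k = select_k(X, y)
print(best_k, cv_accuracy(X, y, best_k))
```

The same loop applies to any classifier with a fit-and-predict interface. In practice, scaling and PCA should be fitted on the training folds only, so that no information from the held-out fold leaks into preprocessing.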

Figure 3 – Performance Comparison

Table comparing overall, high-scoring, and low-scoring accuracy versus a baseline
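A breakdown of this kind can be computed with a few lines of Python. The sketch below is illustrative only: this page does not define "high-scoring", so the `goal_threshold` of 3 total goals, the dictionary keys, and the match rows are hypothetical placeholders, not the paper's definitions or results.

```python
def accuracy(rows):
    """Fraction of rows whose predicted winner matches the actual winner."""
    if not rows:
        return float("nan")
    return sum(r["predicted"] == r["actual"] for r in rows) / len(rows)


def accuracy_breakdown(matches, goal_threshold=3):
    """Report accuracy overall and separately for high- and low-scoring matches."""
    high = [m for m in matches if m["total_goals"] >= goal_threshold]
    low = [m for m in matches if m["total_goals"] < goal_threshold]
    return {
        "overall": accuracy(matches),
        "high_scoring": accuracy(high),
        "low_scoring": accuracy(low),
    }


# Placeholder rows for illustration; these are not results from the paper.
MATCHES = [
    {"predicted": "Team A", "actual": "Team A", "total_goals": 5},
    {"predicted": "Team B", "actual": "Team A", "total_goals": 4},
    {"predicted": "Team C", "actual": "Team C", "total_goals": 1},
    {"predicted": "Team D", "actual": "Team D", "total_goals": 2},
]

report = accuracy_breakdown(MATCHES)
print(report)
```

Running the same function on the baseline's predictions gives the second column of such a comparison.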

License

The code and pre-processed datasets are released under the MIT License. See the LICENSE file for details.


Acknowledgments

We thank the FIFA data community and the authors of prior work in sports analytics for inspiration. This project was supported in part by the ECE and CIS departments at the University of Michigan-Dearborn.