Women’s football analyzed: interpretable expected goals models for women
Population of Study
Five major domestic league competitions each for men and women, covering four seasons from 2017/18 to 2020/21.
[Men’s = English Premier League, French Ligue 1, German Bundesliga, Italian Serie A, Spanish La Liga; Women’s = English WSL, French Division 1, German Bundesliga, Spanish Primera Division, American NWSL]
Summary
Explores whether expected goals (xG) models perform better when trained on combined or on separate data from men’s and women’s football, and how shooting differs between the top men’s and women’s leagues. The second question is examined with an interpretable statistical modelling technique, Generalized Additive Models (GAMs).
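To illustrate why a Generalized Additive Model counts as interpretable: for a binary outcome such as goal or no goal, a GAM with a logit link generally takes the form sketched below. The features x_j are placeholders. Shot distance and shot angle would be typical choices for an xG model, but this summary does not list the feature set the paper actually uses.

```latex
\[
\log \frac{P(\text{goal} \mid x)}{1 - P(\text{goal} \mid x)}
  = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
\]
```

Each smooth function f_j depends on a single feature, so it can be plotted on its own to show how that feature shifts the log-odds of scoring.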
The paper notes “Due to the rapid evolution of the women’s game, historical shot data might not be representative for the game as it is played today. Therefore, we do not go further back in time than the 17/18 season and the 2017 season for the NWSL.”
Abstract
Technical data such as event or optical tracking data from men’s football (soccer) matches have been extensively analysed using techniques from AI on a variety of different levels. However, there has been very little analysis of the women’s game. In this work we take an initial step towards analysing professional women’s football. Using event data covering a number of seasons from the top women’s leagues, we perform two analyses.
First, we perform an exploratory analysis by computing several technical indicators (e.g., goal scoring rates over the season, conversion rates, shot locations) and then compare and contrast them with the indicators for comparable men’s leagues, finding several intriguing differences. Second, we assess whether xG models trained on data from one gender are applicable to data from the other gender.
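The second analysis can be pictured with a small sketch: train a shot model on one group, on the other group, and on the pooled data, then score all three on held-out shots from one group. Everything here is an assumption made for illustration. The data is synthetic, the coefficients are invented, the groups are simply called A and B, and a one-feature logistic regression on shot distance stands in for the paper's GAM-based xG models.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_shots(n, b0, b1, rng):
    """Draw n synthetic shots: distance (in tens of metres) and a 0/1 goal flag."""
    distances, goals = [], []
    for _ in range(n):
        d = rng.uniform(0.3, 3.5)
        distances.append(d)
        goals.append(1 if rng.random() < sigmoid(b0 + b1 * d) else 0)
    return distances, goals


def fit_logistic(distances, goals, lr=0.05, epochs=300):
    """Fit P(goal) = sigmoid(b0 + b1 * distance) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(distances)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for d, y in zip(distances, goals):
            err = sigmoid(b0 + b1 * d) - y
            g0 += err
            g1 += err * d
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def log_loss(model, distances, goals):
    """Mean negative log-likelihood of the observed outcomes under the model."""
    b0, b1 = model
    eps = 1e-12
    total = 0.0
    for d, y in zip(distances, goals):
        p = min(max(sigmoid(b0 + b1 * d), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(distances)


# Invented "true" coefficients for two hypothetical groups of shots.
rng = random.Random(0)
train_a = make_shots(500, 0.5, -1.2, rng)
train_b = make_shots(500, 0.3, -1.0, rng)
test_b = make_shots(500, 0.3, -1.0, rng)

model_a = fit_logistic(*train_a)
model_b = fit_logistic(*train_b)
model_pooled = fit_logistic(train_a[0] + train_b[0], train_a[1] + train_b[1])

# Score every model on the same held-out shots from group B.
results = {
    "trained on A": log_loss(model_a, *test_b),
    "trained on B": log_loss(model_b, *test_b),
    "trained on A+B": log_loss(model_pooled, *test_b),
}
for name, loss in results.items():
    print(f"{name}: log loss on held-out B = {loss:.4f}")
```

A lower log loss on the held-out shots means better-calibrated goal probabilities. Comparing the three numbers is one simple way to judge whether a model transfers across groups or whether pooling helps.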