Bottom-Up Player Impact Model – Step 2

Our previous model (detailed breakdown) took a top-down approach to predict the winner of a game. The model didn't care about any individual players on any of the teams—it predicted the winner based on the history of the team, home ground advantage, team stability, and the average age of the players.

This attempt at building out a model takes a bottom-up approach: what if we evaluate every player on every team, add them all together, and then make a prediction about which team will win? Just like with my previous model, if you want to see the detailed work (e.g. the Python code and a deeper analysis of the analytical tools used), feel free to look here or at the GitHub repo.

A player is good if they contribute to their team winning the game.

To figure this out, I split every player into different role types[1]; different types of players will contribute in different ways. We’d expect midfielders to average more disposals than forwards, and forwards to average more goals than midfielders.[2]

I then looked at every individual game performance from 2010 to round 9, 2025 and analysed which numbers for each role type correlate with wins. This means I can get an idea of what a "good" player looks like for each role type. Below is a list of the top 5 attributes[3] by absolute correlation value for each role type (some variables have a negative correlation):

Top 5 Attributes by Role

[Interactive table: select a role to view its top 5 attributes by correlation]
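As a rough illustration of the correlation step, here is a minimal sketch in pandas. It assumes one row per player-game with a role label, a handful of per-game stats, and a win flag; the column names and file name are my own stand-ins, not the project's actual code.

```python
import pandas as pd

# Hypothetical layout: one row per player-game, with the player's role,
# some per-game stats, and whether their team won that game.
df = pd.read_csv("player_games.csv")  # stand-in file name

stat_cols = ["disposals", "goals", "tackles", "marks", "clearances"]

# For each role, correlate every stat with the win flag and keep the
# five strongest relationships by absolute correlation (sign preserved).
for role, games in df.groupby("role"):
    corr = games[stat_cols].corrwith(games["won"].astype(float))
    top5 = corr.reindex(corr.abs().sort_values(ascending=False).index).head(5)
    print(f"\n{role}:")
    print(top5.round(3))
```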

What did this tell me? At a minimum, it all seemed to make sense.

After identifying what a "good" player looks like, I also used a polynomial model that looked at whether combinations of the variables have a larger effect (i.e., if a forward has high amounts of tackles combined with goals, does this give a better idea of what correlates to a win?). There was a slight uplift in using the polynomial model (see detailed page), but it was not a game changer.
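For the curious, interaction terms like these are straightforward to bolt onto a linear model with scikit-learn. The sketch below is my own illustration on synthetic data, not the project's code; the stats and thresholds are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in: per-game forward stats (goals, tackles) where the
# win odds improve when both stats are high *together*.
X = rng.poisson(lam=[2.0, 3.0], size=(500, 2)).astype(float)
y = (X[:, 0] * X[:, 1] + rng.normal(0, 3, size=500) > 6).astype(int)

# interaction_only=True adds pairwise products (e.g. goals x tackles)
# on top of the raw stats, without squared terms.
model = make_pipeline(
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)
model.fit(X, y)
print(f"training accuracy: {model.score(X, y):.3f}")
```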

My models now allowed me to identify what a good player looks like in each role type. I wanted to sense-check whether the model actually identifies good players, so I ran it over the last few years to see the players it spat out as the top 5 in each position[4]:

Top 5 Players by Role (2022–2025)

[Interactive table: select a role to view its top 5 players]
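A sketch of how a ranking like this might be assembled, assuming per-game impact scores already exist; the column and file names are hypothetical:

```python
import pandas as pd

# Hypothetical: per-game impact scores from 2022 to round 8, 2025,
# one row per player-game with the role label attached.
scores = pd.read_csv("impact_scores_2022_2025.csv")  # stand-in file name

per_player = (
    scores.groupby(["role", "player"])["impact"]
    .agg(avg_impact="mean", games="count")
    .reset_index()
)

# Per footnote 4: require 15+ games so one-off performances don't skew results.
eligible = per_player[per_player["games"] >= 15]

# Top 5 players in each role by average impact.
top5 = (
    eligible.sort_values("avg_impact", ascending=False)
    .groupby("role")
    .head(5)
)
print(top5)
```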

Honestly, I was pretty happy with how this picked up the players widely regarded as the best over the past few years. There was a bit of noise, though.

The way my models work is this: each player’s impact score is averaged over recent games (using a rolling window) to reflect current form, then summed at the team level to predict results. For each game, the models predict whether the home team wins, based on the combined recent form of all players selected for that game.
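In pandas terms, that pipeline might look something like the sketch below. The 5-game window, column names, and file name are all assumptions for illustration; the real models differ in the details.

```python
import pandas as pd

# Hypothetical layout: one row per player-game with a model-derived
# impact score, a game id, and whether the player was on the home side.
games = pd.read_csv("player_impacts.csv")  # stand-in file name
games = games.sort_values(["player_id", "date"])

# Rolling form: each player's impact averaged over their previous games
# (a 5-game window here), shifted one game so a prediction never sees
# the game it is predicting.
games["form"] = games.groupby("player_id")["impact"].transform(
    lambda s: s.shift(1).rolling(window=5, min_periods=1).mean()
)

# Team strength for a game is the summed form of its selected players.
strength = games.groupby(["game_id", "is_home"])["form"].sum().unstack("is_home")

# Predict a home win whenever the home side's combined form is higher.
strength["predict_home_win"] = strength[True] > strength[False]
```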

The real question was: could they predict the winner of a game? Well, after running the models over the history of games, the best one picked the winner 65.8% of the time! This meant that, at a minimum, we didn't go backwards from our simplistic top-down model:

Model Performance

| Model | Accuracy (%) |
| --- | --- |
| Home team only | 56.6 |
| Top-down (Model v1) | 61.7 |
| Player-based sum (Model v2a) | 63.7 |
| Player-based sum (Model v2b) | 65.8 |
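Accuracy here is just the share of games where the prediction matched the result; the "home team only" baseline predicts a home win every time, so its accuracy is simply the historical home win rate. A toy check, with invented values:

```python
import pandas as pd

# Toy results frame: one row per game, the actual outcome plus the
# model's prediction (values invented for illustration).
results = pd.DataFrame({
    "home_won":         [True, False, True, True, False],
    "predict_home_win": [True, True,  True, False, False],
})

baseline = results["home_won"].mean()  # "home team only" accuracy
model_acc = (results["predict_home_win"] == results["home_won"]).mean()
print(f"baseline {baseline:.1%} vs model {model_acc:.1%}")
```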

Next up: looking at age profiles and home ground advantage in more detail.

Footnotes

  1. Source was a Kaggle dataset augmented with data from footywire and Wikipedia.
  2. The different roles are: forward, midfield, midfield/forward, ruck, tall defender, and small defender.
  3. I am limited to publicly available statistics; for the full list of attributes used, see the detailed project page. If I had access to Champion Data, I’d be able to put more attributes into the model.
  4. Games from 2022 to round 8, 2025 inclusive; to avoid one-off performances skewing results, only players with 15+ games are included.