For those who are interested in the details: we model the strength of teams usin...

antouank · on June 9, 2016

But how does that make sense in a game where you can have a draw? By definition 100% cannot cover only home/away. Draw must be factored in. So doesn't that make all the Elo assumptions false? Just wondering how you can take draw into consideration...

lum · on June 9, 2016

Excellent point - this one is definitely on our todo list. There are several simple extensions of the Elo model that take draws into consideration (i.e., give a non-zero probability to draws), for example the Rao-Kupper model. There are only minimal changes needed w.r.t. the original model, but still we didn't manage to make the changes in time for this version of the site.

In short: at its core, the "Elo assumption" postulates that every team can be represented by a real number (that can be interpreted as the strength of the team), and that the probability of the outcome depends on the difference in strength. In the vanilla Elo model, the outcome is binary, but it's easy to make it ternary.
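As a rough illustration of the ternary extension mentioned above, here is a minimal sketch of the Rao-Kupper model next to the vanilla Elo formula. The function names, the base-10 / 400-point scale and the value of `theta` are assumptions chosen for the example; this is not the code behind the site.

```python
def elo_win_probability(rating_a, rating_b, scale=400.0):
    """Vanilla (binary) Elo: probability that A beats B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def rao_kupper_probabilities(rating_a, rating_b, theta=1.5, scale=400.0):
    """Rao-Kupper extension: returns (P(A wins), P(draw), P(B wins)).

    theta >= 1 controls how likely draws are; theta == 1 gives no draws
    and recovers the binary Elo probabilities. The default 1.5 is an
    arbitrary illustrative value, not a fitted one.
    """
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    # Only the rating difference matters, so B's strength is fixed at 1.
    pi_a = 10.0 ** ((rating_a - rating_b) / scale)
    pi_b = 1.0
    p_a = pi_a / (pi_a + theta * pi_b)
    p_b = pi_b / (pi_b + theta * pi_a)
    p_draw = 1.0 - p_a - p_b
    return p_a, p_draw, p_b
```

For two equally rated teams and `theta=1.5`, this gives 0.4 / 0.2 / 0.4 for win / draw / loss. In practice `theta` would have to be estimated from match data together with the ratings.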

antouank · on June 9, 2016

The thing is that football is a time-based sport, so a draw is an outcome with pretty good chances. Usually weak teams will try to delay as much as they can to get the draw. Given enough time, the strong team would have much bigger chances to win. Also, depending on the context (points needed by each team), a team might have a bigger motive to go for a draw than a win.

Predicting outcome probabilities in football is a very complicated story; I doubt it can be solved in a simple way like Elo-ranking the players or teams.

That said, the knock-out phase might be more suitable for that model.

Kudos for the effort anyway, and nice UI.

tfgg · on June 9, 2016

Have you considered using a TrueSkill-alike with extensions for scores/teams, such as PoissonOD?

http://research.microsoft.com/pubs/193839/sbsl_ecml2012.pdf

A few years ago I had a similar idea for trying to build team models, so you can make a better guess at the performance of national teams, since they don't play very often, or for league teams due to transfers at the start/during the season, but hadn't got as far as you :)

lum · on June 9, 2016

We did some preliminary experiments in this direction. Basically, we tried to do a regression on the score difference instead of using only binary outcomes. In our experiments it didn't improve the predictive accuracy - but there are many more things to try. It does feel a bit wasteful not to take score data into account.

Nice that you had some similar ideas :-)

heinrich5991 · on June 9, 2016

In chess you can also have a draw.

Either by having the players agree to a draw, or by capturing every piece other than the kings.

antouank · on June 9, 2016

Right. But then again, you cannot say it's 100% that player A or player B will win, right? There's a chance of a draw that is not negligible.

psuter · on June 9, 2016

One common way to represent the probability of outcomes in chess as a single number (e.g. in comparing opening lines) is to say "white gets 55% of the points", which aggregates wins and draws. (So for instance if white wins 50%, draws 25% and loses 25% of the games, it gets 0.5 * 1 + 0.25 * 0.5 + 0.25 * 0 = 62.5% of the points.)
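The aggregation described above can be written out as a small helper. This is only a sketch: the function name is made up for illustration, and it assumes the usual chess scoring of 1 point for a win, 0.5 for a draw and 0 for a loss.

```python
def expected_score(p_win, p_draw, p_loss):
    """Expected share of points under 1 / 0.5 / 0 scoring."""
    if abs(p_win + p_draw + p_loss - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return p_win * 1.0 + p_draw * 0.5 + p_loss * 0.0


# The example from the comment: 50% wins, 25% draws, 25% losses.
white_share = expected_score(0.50, 0.25, 0.25)  # 0.625, i.e. 62.5% of the points
```

Note that this mapping loses information: very different win/draw/loss splits can produce the same single number.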

umanwizard · on June 9, 2016

In fact, at top levels of play, draw rates are 40% or higher.

rainforest · on June 9, 2016

Is there a reason you opted to use Elo rather than other rating systems commonly used in baseball analysis like log5 or Pythagorean (e.g. https://summerofjeff.wordpress.com/2010/12/05/serving-agains...)?
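For readers who have not met log5, a minimal sketch of the formula follows. The function name is hypothetical, and the inputs are assumed to be each side's overall winning fraction against an average opponent.

```python
def log5(p_a, p_b):
    """log5 estimate of the probability that A beats B.

    p_a and p_b are the winning fractions of A and B against an
    average opponent, each strictly between 0 and 1.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("winning fractions must lie strictly between 0 and 1")
    return (p_a - p_a * p_b) / (p_a + p_b - 2.0 * p_a * p_b)
```

This is the same functional form as a Bradley-Terry (and hence Elo-style) model in which each side's strength is its odds `p / (1 - p)`, so the two approaches differ mainly in how the strengths are estimated and updated rather than in the head-to-head formula.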

I'd also be interested in knowing a bit more about point 2 as well.

galfarragem · on June 9, 2016

One specific criticism regarding Portugal-Iceland: the main squad will not be exactly as considered there. E.g. Cristiano Ronaldo is considered a substitute. Does it make a difference to your predictions?

lum · on June 9, 2016

You are correct about Portugal: it is not exactly the squad that will likely play against Iceland. We used the starting lineup of the last official game (I believe in this case it was Portugal vs England on June 2nd), which did not include e.g. Cristiano Ronaldo.

The actual lineup does impact the predictions - we will update the lineups before the start of each game, when the lineup is announced.

In a future version, we would like to make it possible for visitors to change the players in the team, and automatically update the prediction.

dmoo · on June 9, 2016

I think the same is also true for Ireland. The last friendly match was used to view mainly fringe players. I guess 7/8 of the likely starters are here as substitutes. Perhaps you should look at the last competitive game in the qualifiers.

mlla · on June 9, 2016

Have you had any chance to compare your player strength values against other models to check how well they are aligned? You can find the "Player Performance Index" on the Premier League's web site, e.g. for last season the top three players according to PL PPI were Harry Kane, Riyad Mahrez and Jamie Vardy.

Another obvious question: have you checked your model against the odds on betting sites that provide "Draw-no-bet" bets, since you are not yet taking draws into account?

peeky · on June 9, 2016

I'm interested in how you model the strength of players: is that for the whole squad or the expected starting eleven? The prediction that stood out for me was Wales vs Slovakia, which FIFA rankings and betting odds would both suggest will be closer. I would love to hear more about the factors behind that particular prediction.

benrjackson · on June 9, 2016

The Wales starting lineup is missing Gareth Bale (arguably in the top 5 best players in the world). His 'kickscore' is listed as higher than that of any of the other Welsh players, so I don't understand why the model has excluded him from the starting lineup. See also Portugal and Ronaldo (kickscore 100!).

Interesting but perhaps needs some tweaking to match the expected starting lineups.

vicow · on June 9, 2016

Hi! I'm Victor, one of the researchers behind this project. We used the most recent lineups (up to yesterday) to do the predictions you see on the web page. We will update them with the latest friendlies (e.g., Portugal indeed) and before every game, as soon as we have the official lineups!

alexroan · on June 9, 2016

It may be worth taking competitive matches into account over friendly matches, as the purpose of competitive matches is to win at all cost. Friendly matches, on the other hand, tend to be used primarily for match fitness (especially leading up to tournaments), and for trialling new tactics before the competitive games begin.

benrjackson · on June 9, 2016

Thanks for the clarification and great project!

It will be interesting to see how things change when you get the official lineups. As I'm sure you are aware, teams often rest their 'star' players in friendlies immediately prior to a tournament in order to minimize the risk of injury.

ancymon · on June 9, 2016

Where did you get player/club data from?

lum · on June 9, 2016

We scraped it from soccerway.com.

JD557 · on June 9, 2016

Are there any plans to make the clean dataset public? It would be nice to be able to play with it.

octonion · on June 9, 2016

You can use the FIFA data in my soccer GitHub repo - https://github.com/octonion/soccer/tree/master/fifa

vicow · on June 9, 2016

Could be, yes! We'll see after the Euro ;)

tremon · on June 9, 2016

Has there been any work (by you or others) towards factoring in the performance effect of the coaches on the teams?

vicow · on June 9, 2016

We tried to consider it as an extra player, but it didn't help.

octonion · on June 9, 2016

What's a reference for the connection between Elo and Gaussian process classification that you mention?