
For those who are interested in the details: we model the strength of teams using the well-known Elo model (used, e.g., for official chess ratings).

We innovated on two aspects of the traditional Elo model, in which every team has an independent parameter:

1) We actually model the strength of players instead of teams. This makes it possible to learn from games that were played in championships between clubs, and transfer this knowledge to games between countries.

2) There is a not-so-widely-known connection between Elo-type comparison models and Gaussian process classification. We leverage this, and get a full posterior distribution for each team's strength. (Information on the uncertainty of our estimates helps a lot in coming up with sensible predictions.)
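
To make point 1 concrete, here is a minimal sketch. It assumes a team's strength is simply the sum of its players' strengths and that the win probability is a logistic function of the strength difference; neither detail is stated above, and all names and numbers are made up for illustration.

```python
import math

def team_strength(player_strength, lineup):
    """Sum the (hypothetical) strength values of the players in a lineup."""
    return sum(player_strength[name] for name in lineup)

def win_probability(player_strength, lineup_a, lineup_b):
    """Probability that lineup A beats lineup B under a logistic (Elo-style) link."""
    diff = (team_strength(player_strength, lineup_a)
            - team_strength(player_strength, lineup_b))
    return 1.0 / (1.0 + math.exp(-diff))

# Made-up player strengths; in a real model these would be learned from results.
player_strength = {"p1": 0.4, "p2": 0.1, "p3": -0.2, "p4": 0.3}

# The same player parameters serve a "club" game and a "national" game,
# which is what lets knowledge transfer between the two settings.
club_game = win_probability(player_strength, ["p1", "p3"], ["p2", "p4"])
national_game = win_probability(player_strength, ["p1", "p2"], ["p3", "p4"])
print(club_game, national_game)
```

Because the parameters are attached to players, any game a player appears in informs every team that player is later part of.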

If anyone wants to know more (explanations on the web page are very superficial at the moment), please drop me a line!



But how does that make sense in a game where you can have a draw? By definition, home win and away win alone cannot add up to 100%; the draw must be factored in. So doesn't that make all the Elo assumptions false? Just wondering how you can take draws into consideration...


Excellent point - this one is definitely on our todo list. There are several simple extensions of the Elo model that take draws into consideration (i.e., give a non-zero probability to draws), for example the Rao-Kupper model. There are only minimal changes needed w.r.t. the original model, but still we didn't manage to make the changes in time for this version of the site.

In short: at its core, the "Elo assumption" postulates that every team can be represented by a real number (that can be interpreted as the strength of the team), and that the probability of the outcome depends on the difference in strength. In the vanilla Elo model, the outcome is binary, but it's easy to make it ternary.
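
As an illustration of the ternary case, here is a rough sketch of the Rao-Kupper probabilities. It assumes strengths on a natural-log scale, and the draw parameter `theta` (which must be at least 1) is picked arbitrarily; this is not the site's implementation.

```python
import math

def rao_kupper(strength_a, strength_b, theta=1.5):
    """Return (P(A wins), P(draw), P(B wins)) under the Rao-Kupper model.

    theta >= 1 controls how likely draws are; theta == 1 recovers the
    binary model with no draws.
    """
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    diff = strength_a - strength_b
    win_a = 1.0 / (1.0 + theta * math.exp(-diff))
    win_b = 1.0 / (1.0 + theta * math.exp(diff))
    draw = 1.0 - win_a - win_b
    return win_a, draw, win_b

# Equal teams with theta = 2: each outcome gets probability 1/3.
print(rao_kupper(0.0, 0.0, theta=2.0))
```

The outcome still depends only on the difference in strength, so the core Elo assumption is unchanged; only the link function gains a third outcome.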


The thing is that football is a time-based sport, so a draw is an outcome with pretty good chances. Usually weak teams will try to delay as much as they can to get the draw. Given enough time, the strong team would have much bigger chances to win. Also, depending on the context (points needed by each team), a team might have a bigger motive to go for a draw than a win.

Predicting outcome probabilities in football is a very complicated story; I doubt it can be solved in a simple way like Elo-ranking the players or teams.

That said, the knock-out phase might be more suitable for that model.

Kudos for the effort anyway, and nice UI.


Have you considered using a TrueSkill-alike with extensions for scores/teams, such as PoissonOD?

http://research.microsoft.com/pubs/193839/sbsl_ecml2012.pdf

A few years ago I had a similar idea for trying to build team models, so you can make a better guess at the performance of national teams, since they don't play very often, or for league teams due to transfers at the start/during the season, but hadn't got as far as you :)


We did some preliminary experiments in this direction. Basically, we tried to do a regression on the score difference instead of using only binary outcomes. In our experiments it didn't improve the predictive accuracy - but there are many more things to try. It does feel a bit wasteful not to take score data into account.

Nice that you had some similar ideas :-)


In chess you can also have a draw.

Either by having the players agree to a draw, or by capturing every piece other than the kings.


Right. But then again, you cannot say it's 100% that player A or player B will win, right? There's a chance of a draw that is not negligible.


One common way to represent the probability of outcomes in chess as a single number (e.g. in comparing opening lines) is to say "white gets 55% of the points", which aggregates wins and draws. (So for instance if white wins 50%, draws 25% and loses 25% of the games, it gets 0.5 * 1 + 0.25 * 0.5 + 0.25 * 0 = 62.5% of the points.)
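
The same arithmetic as a tiny function (the name `expected_points` is just for illustration), using the usual chess scoring of 1 for a win, 0.5 for a draw and 0 for a loss:

```python
def expected_points(p_win, p_draw, p_loss):
    """Expected share of points with win = 1, draw = 0.5, loss = 0."""
    total = p_win + p_draw + p_loss
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return p_win * 1.0 + p_draw * 0.5 + p_loss * 0.0

# The example above: 50% wins, 25% draws, 25% losses.
print(expected_points(0.50, 0.25, 0.25))
```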


In fact, at top levels of play, draw rates are 40% or higher.


Is there a reason you opted to use Elo rather than other rating systems commonly used in baseball analysis like log5 or Pythagorean (e.g. https://summerofjeff.wordpress.com/2010/12/05/serving-agains...)?

I'd also be interested in knowing a bit more about point 2 as well.
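
For readers who don't know log5, here is a short sketch of the standard formula. It takes each side's winning percentage against an average opponent and returns the chance that A beats B; the function name is only for illustration.

```python
def log5(p_a, p_b):
    """Probability that A beats B, given each side's win rate vs. an average opponent."""
    denominator = p_a + p_b - 2.0 * p_a * p_b
    if denominator == 0.0:
        raise ValueError("undefined when both rates are 0 or both are 1")
    return (p_a - p_a * p_b) / denominator

# A .600 team against an average (.500) team should win 60% of the time.
print(log5(0.6, 0.5))
```

Writing each win rate as odds p / (1 - p) turns this into a ratio of strengths, which is why log5 is usually seen as belonging to the same family of paired-comparison models as Elo.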


One specific criticism regarding Portugal-Iceland: the main squad will not be exactly as shown there. E.g. Cristiano Ronaldo is listed as a substitute. Does it make a difference to your predictions?


You are correct about Portugal: this is not exactly the squad that will likely play against Iceland. We used the starting lineup of the last official game (I believe in this case it was Portugal vs England on June 2nd), which did not include e.g. Cristiano Ronaldo.

The actual lineup does impact the predictions - we will update the lineups before the start of each game, when the lineup is announced.

In a future version, we would like to make it possible for visitors to change the players in the team, and automatically update the prediction.


I think the same is also true for Ireland. The last friendly match was used to view mainly fringe players. I guess 7/8 of the likely starters are here as substitutes. Perhaps you should look at the last competitive game in the qualifiers.


Have you had any chance to compare your player strength values against other models to check how well they are aligned? You can find from Premier League's web site "Player Performance Index", e.g. for last season the top three players according to PL PPI were Harry Kane, Riyad Mahrez and Jamie Vardy.

Another obvious question: have you checked your model against the odds on betting sites that offer "Draw-no-bet" bets, since you are not yet taking draws into account?


I'm interested in how you model the strength of players: is that for the whole squad or the expected starting eleven? The prediction that stood out for me was Wales vs Slovakia, which FIFA rankings and betting odds would both suggest will be closer. I would love to hear more about the factors behind that particular prediction.


The Wales starting lineup is missing Gareth Bale (arguably one of the top 5 players in the world). His 'kickscore' is listed as higher than that of any of the other Welsh players, so I don't understand why the model has excluded him from the starting lineup. See also Portugal and Ronaldo (kickscore 100!).

Interesting but perhaps needs some tweaking to match the expected starting lineups.


Hi! I'm Victor, one of the researchers behind this project. We used the most recent lineups (up to yesterday) to do the predictions you see on the web page. We will update them with the latest friendlies (e.g., Portugal indeed) and before every game, as soon as we have the official lineups!


It may be worth taking competitive matches into account over friendly matches, as the purpose of competitive matches is to win at all cost. Friendly matches, on the other hand, tend to be used primarily for match fitness (especially leading up to tournaments), and for trialling new tactics before the competitive games begin.


Thanks for the clarification and great project!

It will be interesting to see how things change when you get the official lineups. As I'm sure you are aware, teams often rest their 'star' players in friendlies immediately prior to a tournament in order to minimize the risk of injury.


Where did you get player/club data from?


We scraped it from soccerway.com.


Are there any plans to make the clean dataset public? It would be nice to be able to play with it.


You can use the FIFA data in my soccer GitHub repo - https://github.com/octonion/soccer/tree/master/fifa


Could be, yes! We'll see after the Euro ;)


Has there been any work (by you or others) towards factoring in the performance effect of the coaches on the teams?


We tried to consider it as an extra player, but it didn't help.


What's a reference for the connection between Elo and Gaussian process classification that you mention?



