Monday, 7 July 2014

Cricket WASP

There's been a lot of talk on television, Twitter and cricket betting forums recently about the cricket prediction tool WASP, so I thought I'd give some thoughts on the subject.

I've read far more about WASP than is probably good for me and will start by saying the guys who've developed it are smart, know way more about stats than I ever will, and have made a useful contribution, if only a stepping stone, to the modelling of cricket. I doff my cap to their work.

I'll also say from the outset though that I more or less ignore it and would expect any competent cricket trader to profit handsomely if allowed to bet against its predictions.

I don't use the tool as the basis for any bets and will go as far as saying I wouldn't use WASP to place bets with your money, let alone mine - unless I was self-matching your bets!

But I'm getting ahead of myself. And being more than a little unfair. Let's take a proper look at WASP.

What is WASP

WASP stands for Winning And Score Predictor. It aims to do two things in a cricket match:

1) In the first innings it gives a prediction of the score the bat first team will achieve.

2) In the second innings it gives a probability of the chasing team winning the match.

Crucially, and I'll come back to this, WASP does all this with the assumption that an average batting team is playing an average bowling team.

WASP works in limited overs cricket only and its outputs are based on historical stats between the top eight international teams in both ODI and Twenty 20 formats.

It was developed by Dr Scott Brooker and Dr Seamus Hogan, Brooker's Ph.D supervisor, while Brooker was taking a Ph.D in economics at New Zealand's University of Canterbury. Like I said. These guys know their way around stats. And Hogan is a self-confessed "cricket tragic" to boot.

WASP first came to the attention of the public in November 2012 when Sky Sport in New Zealand used it during coverage of domestic matches. Sky Sports in the UK has since picked up on it and this season WASP is a constant feature in televised limited overs matches in the UK.

How WASP works

I don't want to get bogged down in the intricacies of WASP - I'd no doubt get something wrong and besides this post would become very boring very quickly.

Instead I'll provide further reading both in and at the bottom of this post for anyone who is especially interested and for now will just give a brief summary.

WASP is a dynamic programme. It predicts first innings runs by calculating the expected additional runs that will be made given the resources remaining, that is, balls and wickets. It references historical match data to calculate this.
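For anyone who likes to see the idea in code, here is a minimal Python sketch of that kind of calculation. It is not WASP's actual code. The function names and the constant runs-per-ball and wicket probabilities are my own assumptions for illustration, where the real model takes its values for each game state from historical match data.

```python
def expected_runs_table(max_balls, max_wickets,
                        runs_per_ball=1.3, wicket_prob=0.05):
    """Expected additional runs for every (balls left, wickets in hand) state.

    runs_per_ball and wicket_prob are made-up constants for illustration;
    a real model would estimate them separately for each state from
    historical data.
    """
    # table[b][w] = expected further runs with b balls and w wickets left.
    # With no balls or no wickets left the innings is over, so those stay 0.
    table = [[0.0] * (max_wickets + 1) for _ in range(max_balls + 1)]
    for b in range(1, max_balls + 1):
        for w in range(1, max_wickets + 1):
            table[b][w] = (
                runs_per_ball
                + wicket_prob * table[b - 1][w - 1]      # a wicket falls
                + (1 - wicket_prob) * table[b - 1][w]    # no wicket
            )
    return table


def predicted_total(runs_so_far, balls_left, wickets_left, table):
    """Predicted innings total: runs already scored plus expected extra."""
    return runs_so_far + table[balls_left][wickets_left]


# Twenty 20 innings: 120 balls, 10 wickets.
t20 = expected_runs_table(120, 10)
# Example state: 57 runs on the board, 60 balls left, 6 wickets in hand.
print(round(predicted_total(57, 60, 6, t20), 1))
```

The table is filled from the end of the innings backwards, so each state only needs the two states that can follow it: one where a wicket falls and one where it doesn't.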

WASP rightly attempts to take into account playing conditions by having commentators set a par score for any match about to be played which it uses as a base figure for calculations.

It further attempts to take past conditions into account by including in its modelling an adjustment for ease of batting conditions (PDF file) in historical games. The paper in the link is well worth a read though I'm sure it will raise accuracy issues from a betting perspective for many.

The second innings output - the probability of the bat second team winning - is based on a similar, if more complicated, methodology: predictions are made from the target score, the number of balls and wickets remaining, and the runs already scored.
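The same backwards-filling idea can be sketched for a chase. Again this is only my own illustration in Python and not how WASP is actually implemented: the per-ball outcome probabilities below are invented, fixed numbers, where a real model would vary them by game state using historical data.

```python
# Hypothetical per-ball outcome probabilities (runs scored -> probability),
# plus a wicket probability. Invented for illustration; together they sum to 1.
RUN_PROBS = {0: 0.35, 1: 0.40, 2: 0.06, 4: 0.10, 6: 0.04}
WICKET_PROB = 0.05


def chase_win_probability(runs_needed, balls_left, wickets_left):
    """Probability the chasing side scores runs_needed before running out
    of balls or wickets, under the fixed outcome probabilities above."""
    if runs_needed <= 0:
        return 1.0
    # win[r][w]: chance of winning when r runs are needed with w wickets in
    # hand, for the number of balls handled so far (starting from zero balls,
    # where any positive target is a certain loss).
    win = [[0.0] * (wickets_left + 1) for _ in range(runs_needed + 1)]
    for _ in range(balls_left):
        nxt = [[0.0] * (wickets_left + 1) for _ in range(runs_needed + 1)]
        for r in range(1, runs_needed + 1):
            for w in range(1, wickets_left + 1):
                p = WICKET_PROB * win[r][w - 1]          # wicket falls
                for runs, prob in RUN_PROBS.items():
                    # Target reached on this ball, or carry on needing fewer.
                    p += prob * (1.0 if runs >= r else win[r - runs][w])
                nxt[r][w] = p
        win = nxt
    return win[runs_needed][wickets_left]


# Example state: 40 needed from 30 balls with 6 wickets in hand.
print(round(chase_win_probability(40, 30, 6), 3))
```

Note that nothing in the sketch knows who is batting or bowling. Every team gets the same outcome probabilities, which is the "average team v average team" assumption in miniature.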

If this is all starting to sound similar to Duckworth-Lewis it's interesting to note that WASP grew out of research into finding a better solution to reduced overs cricket matches than the Duckworth-Lewis method.

Problems with WASP

Much of the criticism WASP receives in betting circles is simply based on the prediction probability for the bat 2nd team winning being far removed from the reality of the actual market price.

This is a key issue and I'll come back to it shortly in this article but first a look at some of the other problems with WASP:

1) Setting the par score

This key piece of information is reliant on a commentator's opinion before a match starts. I've already looked at the issue of commentators confusing average and par scores so this is not a great starting point.

That WASP co-creator Seamus Hogan states the following himself is a little worrying too:

"I believe this judgement (determining the par score) is just a recent historical average for that ground" (Penultimate paragraph)

To be fair Hogan does go on to say "the method of determining par may evolve" and, more recently, commentators have made a real effort in having a go at setting a proper par score.

The trouble is commentators seem to misunderstand what a par score for WASP is. Again, in the words of Seamus Hogan, a WASP par score is:

"A judgement made on what the average first innings score would be for the average batting team playing the average bowling team in those conditions". (Penultimate paragraph)

So when Nick Knight set the WASP par score at a very high 180 in the recent Nottinghamshire v Yorkshire Twenty 20, based on the fine batting line-ups on display, WASP was set incorrectly from the off.

2) Par score can't be changed

While WASP takes into account what is happening in a match to make a changing innings runs prediction it always starts from a fixed par score.

The question is, what if that initial par score was totally wrong?

In the Nick Knight example above, with the score at 27/3 after 5 overs and finally some acknowledgement that the match was being played on a glued pitch in grey overcast conditions, the talk on Sky Sports was that par should have been 150. (Nick Knight himself thought "140 to 150" by the end of the innings.)

Of course, after 5 overs, punters had the advantage of seeing that the glued pitch was very two-paced too and could take this into account. WASP can make adjustments but is still stuck with a 180 par score.

The result is that at 57/4 from 10 overs WASP, as you'd expect, was clearly adjusting its prediction for the number of innings runs down. As had, of course, the bookmakers.

But WASP was still predicting 6.5 runs higher than the highest bookmaker line I could find at that point. (And I'm sure I'd have been knocked if I'd tried to go under that line!)

This isn't a one off example either. The following day in the Durham v Derbyshire Twenty 20 David "Bumble" Lloyd set the WASP par at 166. He later commented:

"We set WASP wrong. We thought it was a good pitch but there's something in it. On reflection I'd set it at 145 and not 166."

Lloyd and Knight are experienced former international cricketers. They know what they are talking about and this just illustrates how difficult it can be to set a par score before you see a ball bowled.
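The size of those two misses is easy to put a number on. The short Python snippet below is just arithmetic on the figures quoted above, taking Sky's revised 150 for the Nottinghamshire match (Knight's own range was 140 to 150). The function name is mine, and it says nothing about how a par error feeds through into WASP's outputs.

```python
def par_overestimate_pct(pre_match_par, revised_par):
    """How far the pre-match par sat above the revised par, in percent."""
    return 100 * (pre_match_par - revised_par) / revised_par


examples = [
    ("Nottinghamshire v Yorkshire", 180, 150),
    ("Durham v Derbyshire", 166, 145),
]
for match, pre_match, revised in examples:
    pct = par_overestimate_pct(pre_match, revised)
    print(f"{match}: par set {pct:.1f}% too high")
```

In both games the pre-match par was well into double-digit percentages above where the commentators themselves later put it.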

3) WASP and averages

We've already seen that WASP assumes an average batting team is playing an average bowling team. The trouble, of course, is not all teams are average. I'm sure you can draw your own conclusions.

A separate issue with averages comes from the historical adjustment for ease of batting conditions I linked to earlier which goes into the WASP modelling. There's some very clever statistical work going on in there. But ultimately everything starts boiling down to averages to such an extent that we get this:

"Accordingly, in this paper we adopt the strategy of assuming that there is no useful information from knowing the ground at which the game was played". (Page 5)

Strictly speaking the clever stat work behind that comment can be proved I'm sure. However, for a punter averages can be dangerous. We all know some grounds are more conducive to runs than others:


If I'm betting on a match at one of these two grounds I don't care about the average runs scored across all matches from a particular point in an innings in the WASP database sample.

I'm more interested in specific ground history. And I'll trust my own models / intuition / judgement of teams and conditions on what the first innings score will be over WASP any day of the week.

4) WASP win prediction at odds with actual market

I mentioned earlier much of the criticism aimed at WASP is based on the prediction probability (and indicative odds) for the bat 2nd team to win bearing no relation to the reality of the actual betting markets.

A typical recent example was in the Surrey v Kent Twenty 20 match where, early in the second innings, WASP had the fielding team Kent as favourites to win the match.

At the time, as was noted by cricket punters on Twitter, WASP was giving Surrey a 47% chance of winning when the actual markets had Surrey with an 85% chance of winning.
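To see that gap in betting terms, the Python sketch below turns both probabilities into fair decimal odds, ignoring any bookmaker margin. The function names are my own, and the profit figure assumes, purely for illustration, that the market's 85% was the true chance and that someone would lay you Surrey at the WASP-implied price.

```python
def fair_decimal_odds(win_probability):
    """Decimal odds with no bookmaker margin for a given win probability."""
    if not 0 < win_probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1 / win_probability


def expected_profit_per_unit(true_probability, decimal_odds):
    """Average profit per 1 unit staked when backing at decimal_odds."""
    return true_probability * decimal_odds - 1


wasp_price = fair_decimal_odds(0.47)    # Surrey priced off WASP's 47%
market_price = fair_decimal_odds(0.85)  # Surrey priced off the market's 85%
print(round(wasp_price, 2), round(market_price, 2))          # 2.13 1.18
print(round(expected_profit_per_unit(0.85, wasp_price), 2))  # 0.81
```

Under those assumptions, backing Surrey at the WASP price would return an average profit of roughly 0.81 units for every unit staked.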

Now that is some difference. And I hope by now it is apparent why. A huge part of the reason is, of course, because the two teams were not average. The Kent team had key players out, including gun bowler Doug Bollinger, while Surrey had a batting line up that would be the envy of any county team.

Far from average batting against average bowling here we had very strong batting against weak bowling. The betting market was right and Surrey cruised to victory 2 wickets down with 13 balls to spare.

In Defence of WASP

First up I said earlier I was being more than a little unfair on WASP. And I am. Why? Because much of the criticism of WASP on social media and betting forums, some of which I've elaborated on in this article, is simply based on misunderstanding.

1) WASP is not a betting tool

To put it bluntly WASP is not designed as a betting tool or to aid punters. As Hogan says himself:

"The first thing to note is that the predictions are not forecasts that could be used to set TAB betting odds." (4th paragraph. Note: Punters can place bets with TABs in New Zealand and Australia)

I know this will create confusion. After all Hogan is also on the record saying:

"In the second innings it (WASP) gives a probability of the batting team winning the match." (2nd paragraph)

So I hear the question: how can it give a probability of winning but not in a way that can be used to set odds?

The answer is subtle. I'll let Hogan explain:

"WASP is a way of calculating who is winning, rather than a prediction of who will win." (See point 1)

It seems contradictory, and I'm sure many will argue it is, but with some thought, and bearing in mind the average team v average team discussion above, it's possible to see what he is saying. I'll provide links at the bottom of the article where Hogan explains the principle, with examples, in more detail.

Not surprisingly such delicate intricacies appear lost on the Sky Sports commentators and definitely are on the general viewing public. Hence we see confusion and, sometimes, derision. But it seems the way WASP is presented to the viewing public is not entirely how it should be.

2) WASP creators aware of limitations

Although Hogan and Brooker have put vast amounts of work into WASP it's only fair to acknowledge that they themselves are also aware of its limitations.

Although they believe it is an improvement on Duckworth-Lewis the pair freely admit there are areas that need further work. Again, links are provided at the bottom for full discussion of these but I'll have a quick look at two.

The first is the issue of setting the par score. Hogan states:

"The Par Score method is not perfect as it is subjective, but it is a big improvement on simply assuming that all games are played in the same conditions." (Why have Par Score answer)

A second, and the main area for further work, regards the issue that WASP assumes you have average teams playing against each other. In the words of Hogan and Brooker:

"The second area where a useful extension is possible is in allowing for predictable  differences in ability between teams...Making this adjustment remains for later work, as it would require building up a dynamic dataset of team ability." (Page 29)


WASP as a possible betting tool

Despite everything above, and my evident reluctance to base any bets on WASP outputs, I was going to give a suggestion about how WASP might be useful from a betting perspective.

This article has now got overly long however so I'll do a separate post on WASP as a possible betting tool at a later date. (Update: Article published here)

Final Thoughts

I'll let the stattos battle it out regarding WASP v Duckworth-Lewis and the relative merits of the methodologies they each use. I just think it's great that cricket attracts sharp minds capable of creating such innovative tools.

I've mentioned before that cricket is just so damned hard to model, with the benefits this brings to punters, and the discussion above helps highlight some of the difficulties.

WASP is a serious attempt by some smart cricket enthusiasts to model the game. It's a great piece of work, acknowledges some of its own weaknesses, and provides suggestions for future improvements.

As a tool of general interest to the public it is a first. People watching televised cricket now have a consistent real time opinion on what a team will score and how the chasing team is doing. This said I'm not sure WASP is presented correctly to the viewing public.

Given my interest in cricket betting it is not surprising a lot of the criticism I come across regarding WASP is from punters. But it should be remembered WASP is not a betting tool.

From a betting perspective the first innings runs prediction is probably more useful than the bat 2nd win probability output. But even then a seasoned innings run punter should comfortably outperform the tool.

WASP relies on assumptions of average teams playing average teams at, it could be argued, average grounds. Furthermore it relies on an accurate initial par score input and simply can't react enough to often quite dramatic changes in playing conditions during an innings or match.

A punter knows the relative strengths of the teams, is able to adjust their initial views on par, take into account every change in conditions, weigh up the value as well as the number of wickets that fall, knows which bowlers have overs left and, as such, should back their own view over that of WASP.



WASP Further Reading

WASP discussion by co-creator Seamus Hogan. (includes some history, explanation and workings of WASP)

Further WASP discussion by Seamus Hogan (includes some clarification and further explanation of outputs)

Seamus Hogan replies to Twitter / general feedback (Interesting piece on irrational expectations in cricket)

WASP FAQ by Seamus Hogan (More detailed attempt to explain WASP issue by issue)

Pitch discussion by Seamus Hogan (Brief critique of a Cricinfo piece on pitches and introduction to historical estimation of pitch quality)

Full paper on inferring batting conditions in historical ODI cricket. (Working paper by Brooker and Hogan which has an input into final WASP product. Is a PDF file)

WASP Wikipedia 


Other non WASP cricket blog posts by Seamus Hogan:
Do catches win matches
(Ir)rational expectations in cricket
Declarations and nightwatchmen
Are wickets more likely on hat-trick balls

4 comments:

  1. I appreciate your balanced comments on the WASP. Like you, I wouldn't use WASP as a betting tool, for all the reasons you mention. Let me clarify a couple of points:
    1. On "We all know some grounds are more conducive to runs than others." Yes we do. But there have been a surprisingly large number of ODI games played at grounds that have hosted only a small number of games, where past averages would not provide much useful information. There is also quite a high variability in conditions from day-to-day at a single ground. These two facts mean that knowing the ground at which a game is played only adds a small amount of information when inferring how conditions affect the likely outcomes. On the other hand, knowing the ground does provide useful information about the likely par score in a particular match.

    2. The key word in the last sentence, however, is "likely", which brings me to the second point. There is no reason why the commentators could not adjust the par score during a match, partly as a result of observing what is happening (e.g. how much the ball is swinging, how quickly the ball is coming on to the bat, etc.) and partly to make an adjustment at the start of the second innings if conditions are likely to be easier in one innings than the other. It is not strictly correct to say that the "par score can't be changed", but I think for reasons of simplicity of communication, the Sky commentators have chosen to not ever reset the par score. This is a bit frustrating for Scott and me, but it's not our decision. If nothing else, if they are never going to reset the par score, we would prefer that they delay revealing it until, say, 5-10 overs have been bowled in the first innings so that they at least have some on-the-day information on which to base their judgement.

  2. Firstly may I say I’m honoured you found my little space on the web and took the time to give such a considered reply. It really is very much appreciated.

    On reflection it makes total sense that the WASP par score can be reset and I value the correction. I agree with your reasoning why Sky choose not to do this. Additionally, from a practical point of view, they are also broadcasting a little slot on WASP and the par score the commentators have decided on, and why, before each televised game. I guess it is easier to reveal the par score then than, as you would prefer if they don’t reset it, during the game, when on-field events take precedence. I can understand the frustration of yourself and Scott at WASP effectively not being used to its full potential. Nevertheless, casual viewers and cricket enthusiasts will still find the outputs of interest. It’s perhaps only cricket punters like me, who try to be as accurate as possible in our predictions, who realise when some of the outputs are off kilter.

    Regarding runs I agree totally that there are a large number of grounds that have only hosted a handful of ODI matches, meaning averages from such small samples are next to useless.

    You touch on an interesting issue though, and one I only left out from the original article for reasons of it getting too long. Especially in Twenty 20 there are many more domestic than international games each year. Several grounds now have more than 50 full Twenty 20 matches and some more than 70. I’ve wondered if the addition of domestic data to the existing international results would be useful or not.

    Certainly from a betting perspective it is. Grounds can change over time but it is also true that the extra data makes it very clear some grounds do tend towards higher or lower scoring games in general. And at some grounds it is also possible to profile the types of games a ground is likely to produce – albeit clearly sometimes condition related. As a punter one of my aims is to watch carefully and decide which, if any profile, a match might be following.

  3. Have you checked out & analysed COW (Chance of Winning)? Felt it to be better than various other tools around. I guess this was developed way back in 2009 by a company called Cartwheel Creative for their cricket site holdingwilley.com. They have not revealed much as to how they do it, but seemed to be very free to discuss when asked about it.

  4. Where can I find the detailed calculation for WASP, the original paper, or the detailed procedure used to calculate WASP?

    It would be very helpful for me.
    Thank you in advance.

