“Statistical Modelling” appears to be an all-encompassing term: it looks like almost anything can be modelled statistically. Over the past 3-4 years, I have learnt about different statistical modelling techniques and their applications in natural language processing and other areas. However, it never occurred to me that all of that could also have applications in sports. (Hmm… I have no option but to acknowledge my peanut brain!)

A couple of days back, I was reading the book “IPL: Cricket & Commerce, an inside story” by T.R. Vivek and Alam Srinivas. Apart from providing news, gossip, financial details, sport, spirit and the emotional experiences of IPL-1 and 2, the book also had an interesting story to tell. It was about Rajasthan Royals’ success in IPL-1 and its comparison with a US baseball team, the Oakland A’s, whose success in 2003 was described in a book called “Moneyball”. I have not read that book, only what Vivek and Srinivas wrote about it in theirs.

Basically, the Oakland A’s manager Billy Beane redefined the rules of team selection by basing it on rigorous statistics. As a result, the Oakland A’s were very successful that season. Apart from collecting a lot of statistics on the minute details of the game, the Oakland A’s also developed their own statistical model of the game and its players, taking a cue from a firm called AVM Systems.

In terms of cricket, if such a model were created, backed by a huge amount of statistics, it should be able to answer questions like: which player will react in what way in a particular situation? It seems that “a highly improved statistical model can actually predict the value of players in a more pinpointed fashion”. If that’s actually possible, statistics is as important as the strategy discussion before every match, perhaps more so.

However, I am clueless about how a statistical model for cricket would work. I don’t think there are any working models at the moment (since the book does not mention any).

Anyone who happens to see this post and knows how statistical modelling can be applied to cricket, what kind of statistics need to be collected for it, etc., can drop a comment here and explain 🙂 Somehow, the “HOW?” part has been bugging me a lot for the past few days.

The only kind of model you can apply to cricket is a statistical model or, to be more specific, a stochastic model.

One example:

You keep observing the performances of all the people under different conditions, against different oppositions and for various teams.

You can make a table like

Player | Played for | Played against | Performance (bowling or batting) | Batting performance against specific bowling attack | Bowling against specific batting strength | etc.

Now, once you have the statistics, you are likely to find patterns – something like Ganguly playing well against spinners, Gavaskar playing well against fast bowlers, or Sehwag playing well at the opening spot, or whatever.

When you select a team, you can use these patterns to build a probability distribution that gives an idea of how a player is likely to fare against a given type of bowling attack in particular conditions, and that should help you in selecting a team!
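To make the idea concrete, here is a minimal sketch of turning such a table into a probability distribution. Everything in it is an assumption for illustration: the records are made up, the score buckets are arbitrary, and the names `INNINGS`, `bucket` and `score_distribution` are hypothetical. It simply counts how often a player landed in each score bucket against a given bowling type, with add-one smoothing so that unseen buckets do not get zero probability.

```python
from collections import Counter

# Hypothetical records: (player, bowling type faced, runs scored).
# The numbers are invented purely for illustration.
INNINGS = [
    ("Player A", "spin", 62),
    ("Player A", "spin", 75),
    ("Player A", "spin", 15),
    ("Player A", "spin", 48),
    ("Player A", "pace", 5),
    ("Player A", "pace", 12),
    ("Player A", "pace", 30),
]

# Arbitrary score buckets (an assumption, not a cricketing standard).
BUCKETS = ["0-19", "20-49", "50+"]


def bucket(runs):
    """Map a raw score to one of the coarse buckets."""
    if runs < 20:
        return "0-19"
    if runs < 50:
        return "20-49"
    return "50+"


def score_distribution(innings, player, bowling_type, alpha=1.0):
    """Estimate P(score bucket | player, bowling type) from observed counts.

    alpha is an add-alpha (Laplace) smoothing constant.
    """
    counts = Counter(
        bucket(runs)
        for p, b, runs in innings
        if p == player and b == bowling_type
    )
    total = sum(counts.values()) + alpha * len(BUCKETS)
    return {name: (counts[name] + alpha) / total for name in BUCKETS}


if __name__ == "__main__":
    for bowling in ("spin", "pace"):
        print(bowling, score_distribution(INNINGS, "Player A", bowling))
```

A selector could compare such distributions across candidate players for the bowling attack they expect to face. A real model would also have to condition on pitch, opposition and so on, which quickly needs far more data than a simple table of counts.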

@Rowdy: Thanks!! 🙂

You should at least acknowledge me first .. something like “thanks to my brother I get to read such books”! 😛 ..

http://cricket.rediff.com/report/2010/mar/29/ipl-2010-most-valuable-player-yusuf-pathan.htm

That’s another way of looking at it …

Well Haley, that’s more arithmetic and less modeling. When you build a model, it should be able to predict how the thing being modeled behaves.

Unless I am missing something, your table shows only statistics but no models.

If you look at sites like Cricinfo and Cricbuzz while a match is going on, they give you a lot of statistics at various levels: for batsmen (average, number of ducks, number of times caught out, centuries, etc.), bowlers (number of wickets, economy, etc.), keepers (number of catches, stumpings, etc.), fielders (number of catches), etc.

You can definitely make a statistical model from these and get a good estimate of the strength of a team…

By the way, has any IPL owner contacted you to develop such a statistical model? If yes, I have to keep all these things secret 🙂 🙂

@Halley and Ravichandra: That’s exactly my point. How are statistics used to build a statistical model, especially in the case of cricket? Are there any known models already?

🙂

Good post

Btw, what are the model parameters?

hmm…

I haven’t read the post… just skimmed… however, as far as I understood your question, I don’t see why one can’t use stats in cricket? I don’t know where you are stuck at all… it could just be a regression model, no… which can predict the performance of a player…

Anyways, stats gave one beast – the Duckworth-Lewis method – to cricket… why do you need more?

@Vamsi: Don’t you think modelling players and teams takes much more statistical modelling than the Duckworth-Lewis model does? Looked at this way, the D/L method has only two main variables:

“The essence of the D/L method is ‘resources’. Each team is taken to have two ‘resources’ to use to make as many runs as possible: the number of overs they have to receive; and the number of wickets they have in hand. At any point in any innings, a team’s ability to score more runs depends on the combination of these two resources. Looking at historical scores, there is a very close correspondence between the availability of these resources and a team’s final score, a correspondence which D/L exploits” (Source: Wiki)

Don’t you think modelling a player… and then a team… are much more complex processes?
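To show how small the two-variable “resources” idea quoted above is, here is a rough sketch. As far as I know, the published form of the method models the average further runs with u overs left and w wickets lost as an exponential curve, Z(u, w) = Z0(w) * (1 - exp(-b(w) * u)). The constants and the per-wicket scaling below are made up for illustration and are NOT the real D/L values, and the function names are hypothetical.

```python
import math

# Made-up constants for illustration only; these are NOT the real
# Duckworth-Lewis parameters.
Z0_FULL = 280.0  # assumed asymptotic total with all 10 wickets in hand
B_FULL = 0.035   # assumed decay constant with all 10 wickets in hand


def expected_runs(overs_left, wickets_lost):
    """Average further runs: Z(u, w) = Z0(w) * (1 - exp(-b(w) * u)).

    Z0(w) and b(w) are scaled by a crude wickets-in-hand fraction,
    which is an assumption of this sketch.
    """
    frac = (10 - wickets_lost) / 10
    if frac <= 0 or overs_left <= 0:
        return 0.0  # no wickets or no overs means no resources
    z0 = Z0_FULL * frac
    b = B_FULL / frac
    return z0 * (1.0 - math.exp(-b * overs_left))


def resource_fraction(overs_left, wickets_lost, full_overs=50):
    """Resources remaining, relative to a full innings with no wickets lost."""
    return expected_runs(overs_left, wickets_lost) / expected_runs(full_overs, 0)


if __name__ == "__main__":
    for overs, wickets in [(50, 0), (20, 0), (20, 5), (5, 8)]:
        share = resource_fraction(overs, wickets)
        print(f"{overs} overs left, {wickets} down: {share:.1%} of resources")
```

The whole model is one curve in two inputs. Modelling an individual player, and then a team built out of such players, has no equally compact description.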

What are you talking about? :)

As long as data is available, I don’t think it’s a complex process compared with many other complex stat applications (say, stock predictions)… a simple naive model would be a function of previous match performance :)… in fact, I am sure statisticians (not the Cricinfo ones, statisticians from academia) would have played with it :)… give me some time.. will let you know
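For what it’s worth, the naive version could be sketched as below: an ordinary least-squares line that predicts a player’s next score from the previous one. The function names `fit_line` and `predict_next` are hypothetical and the scores are made up. It is only meant to show the shape of a “function of previous match performance”, not a model anyone should pick a team with.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_next(scores):
    """Predict the next score by regressing each score on the one before it."""
    previous = scores[:-1]
    following = scores[1:]
    intercept, slope = fit_line(previous, following)
    return intercept + slope * scores[-1]


if __name__ == "__main__":
    # Made-up recent scores for one hypothetical batsman.
    recent = [34, 12, 56, 40, 8, 71, 25]
    print(round(predict_next(recent), 1))
```

With only one predictor and a handful of innings, the fitted line will be extremely noisy, which is roughly the objection to calling this a usable model.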

No – Naive models are not commercially viable 😉

If it were so naive, there would have already been some model in commercial cricket – IMHO.

One previous match performance, one career batting average, one strike rate: all these direct statistics amount to nothing in designing a statistical model that will help you choose a cricket team and predict its performance. Which is why I wrote this post as an open discussion, asking for people’s opinions on what stats are needed, how they are to be connected using a model, how the model is to be used for predicting future performances with some accuracy, etc.

check this out:)

http://www.stat.sfu.ca/~tim/papers/cricketsim.pdf

I agree with what Malakpet Rowdy said. I think building a statistical model for cricket is very difficult. It needs a great deal of data. For example, suppose a player cannot play the short ball or the inswinger, can play on the off side, and cannot play on the leg side. It is very hard for a model to discover that, yet even an average cricket fan who has simply watched the matches can tell you. An estimate can only be reached by combining some computer-dug information with a human’s (analyst/coach …) assessment based on what they have seen of a player. A purely computer-based model is very difficult, in my view.