METHODOLOGY | ML-TRAINED

How CricketIQ Works

A transparent methodology that measures real match impact — not just stats. Every score is explainable and data-driven.

CORE FORMULA

Impact = P × C × W × K

P = Performance, C = Context, W = Pressure, K = Knockout

All weights learned from data via XGBoost on 1,169 matches.
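As a rough illustration of the core formula, the sketch below multiplies the four factors for a single innings. The function name `impact` and the sample values are assumptions chosen for illustration, not part of the CricketIQ codebase.

```python
def impact(performance: float, context: float, pressure: float, knockout: float) -> float:
    """Impact = P x C x W x K for one innings."""
    return performance * context * pressure * knockout


# Hypothetical innings: raw performance score of 40, death-overs batting (1.4x),
# required run rate above 10 (1.08x), knockout match (1.15x).
example = impact(40.0, 1.4, 1.08, 1.15)
print(round(example, 2))
```

Because the factors multiply, a weight of 1.0 leaves the score unchanged, so a league match with no pressure modifiers reduces to P × C.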


Rule-Based vs ML-Learned Weights

FACTOR | RULE-BASED (v1) | ML-LEARNED (v2)
Batting score | runs x 0.5 + SR x 0.2 + 4s x 2 + 6s x 3 | Sum of 20 weighted features
Bowling score | wickets x 15 + dots x 1 + eco bonus | Sum of 16 weighted features
Powerplay (batting) | 1.0x | 1.0x
Powerplay (bowling) | 1.0x | 1.3x
Middle (batting) | 1.2x | 1.15x
Middle (bowling) | 1.2x | 1.0x
Death (batting) | 1.5x | 1.4x
Death (bowling) | 1.5x | 1.5x
Chasing bonus (bat only) | 1.1x | 1.1x
High RRR (>10) | 1.4x | 1.08x
Medium RRR (>8) | 1.2x | 1.04x
Low wickets (<4) | +0.2 | +0.09
Knockout multiplier | Not included | 1.15x
Defending pressure | Not included | Economy bonus for low totals

Purple values = significant changes discovered by ML
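For reference, the rule-based (v1) scores in the table can be written out directly. This is a minimal sketch: the function names are hypothetical, and because the v1 economy bonus rule is not spelled out here, it is passed in as a plain number rather than computed.

```python
def batting_score_v1(runs: int, strike_rate: float, fours: int, sixes: int) -> float:
    """Rule-based batting score: runs x 0.5 + SR x 0.2 + 4s x 2 + 6s x 3."""
    return runs * 0.5 + strike_rate * 0.2 + fours * 2 + sixes * 3


def bowling_score_v1(wickets: int, dot_balls: int, eco_bonus: float = 0.0) -> float:
    """Rule-based bowling score: wickets x 15 + dots x 1 + eco bonus.

    The eco bonus formula is not given in the table, so the caller supplies it.
    """
    return wickets * 15 + dot_balls * 1 + eco_bonus


# Hypothetical innings: 82 runs off 50 balls (SR 164.0) with 8 fours and 4 sixes.
bat = batting_score_v1(82, 164.0, 8, 4)
# Hypothetical spell: 3 wickets and 10 dot balls, no economy bonus.
bowl = bowling_score_v1(3, 10)
```

In v2 these fixed coefficients are replaced by weights learned from data.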

XGBOOST MODEL PERFORMANCE

BATTING R²: 0.7797 (about 78% of the variance in the target explained)

BATTING MAE: 2.85 (average error of about 2.9 points)

BOWLING R²: 0.5971 (about 60% of the variance in the target explained)

BOWLING MAE: 3.42 (average error of about 3.4 points)

Model: XGBoost (max_depth=6, n=400, lr=0.05) | 17,708 batting + 13,878 bowling innings | 5-fold cross-validated
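To make the two reported metrics concrete, here is a small sketch of how R² and MAE are conventionally computed from true and predicted scores. The helper names and sample numbers are illustrative assumptions; the figures above come from the trained model, not from this snippet.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Made-up scores purely for demonstration.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 39]
mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```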

What the Model Learned (Feature Importance)

These charts show which stats the XGBoost model considers most important for predicting winning performances. Taller bar = more important.

[Bar charts: BATTING FEATURES and BOWLING FEATURES]

Step-by-Step Pipeline (Steps 1-7)

1. Feature Engineering

Raw ball-by-ball CSV data is processed into 39 player-level features:

Batting (21 features)

  • runs, balls_faced, strike_rate
  • fours, sixes, boundary_pct
  • runs in powerplay/middle/death
  • batting_position, dot_ball_pct, run_rate_diff

Bowling (18 features)

  • wickets, economy, dot_balls
  • wickets in powerplay/middle/death
  • runs conceded by phase
  • dot_ball_pct, balls_bowled, balls by phase

Context

  • over phase (PP/MID/DEATH)
  • chasing or defending
  • run_rate_differential
  • match_stage_pct

Pressure & Stakes

  • required_run_rate
  • wickets_remaining
  • is_knockout (playoff/final)
  • first_innings_total (defending)
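The feature lists above can be sketched for the batting side as follows. The input schema (a list of dicts with `over` and `runs` keys) and the exact definitions of `boundary_pct` and `dot_ball_pct` (share of balls faced) are assumptions made for this example; the real pipeline may define them differently, and extras such as wides are ignored here.

```python
def phase_of(over: int) -> str:
    """Map an over number (1-20) to its phase."""
    if over <= 6:
        return "powerplay"
    if over <= 15:
        return "middle"
    return "death"


def batting_features(deliveries):
    """Aggregate one batter's deliveries into innings-level features."""
    balls = len(deliveries)
    runs = sum(d["runs"] for d in deliveries)
    fours = sum(1 for d in deliveries if d["runs"] == 4)
    sixes = sum(1 for d in deliveries if d["runs"] == 6)
    dots = sum(1 for d in deliveries if d["runs"] == 0)
    phase_runs = {"powerplay": 0, "middle": 0, "death": 0}
    for d in deliveries:
        phase_runs[phase_of(d["over"])] += d["runs"]
    return {
        "runs": runs,
        "balls_faced": balls,
        "strike_rate": 100.0 * runs / balls if balls else 0.0,
        "fours": fours,
        "sixes": sixes,
        "boundary_pct": 100.0 * (fours + sixes) / balls if balls else 0.0,
        "dot_ball_pct": 100.0 * dots / balls if balls else 0.0,
        "runs_powerplay": phase_runs["powerplay"],
        "runs_middle": phase_runs["middle"],
        "runs_death": phase_runs["death"],
    }


# Five hypothetical deliveries faced by one batter.
sample = [
    {"over": 1, "runs": 4},
    {"over": 3, "runs": 0},
    {"over": 8, "runs": 1},
    {"over": 17, "runs": 6},
    {"over": 18, "runs": 1},
]
features = batting_features(sample)
```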
2. Performance Score

XGBoost learns optimal feature weights from match outcome data:

BATTING

score = Sum(feature x ML_weight) -- calibrated by XGBoost

BOWLING

score = Sum(feature x ML_weight) + defending_bonus

DEFENDING PRESSURE BONUS (NEW)

If defending < 150 & eco < 6: bonus = (6 - eco) x 2
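The defending pressure rule translates almost directly into code. In this sketch, "defending" is read as the first-innings total being defended, which is an assumption based on the `first_innings_total` feature listed earlier; the function name is hypothetical.

```python
def defending_bonus(first_innings_total: int, economy: float) -> float:
    """Bonus for economical bowling while defending a low total.

    Applies only when the total is under 150 and economy is under 6.
    """
    if first_innings_total < 150 and economy < 6:
        return (6 - economy) * 2
    return 0.0


# Defending 140 at an economy of 4.5 earns (6 - 4.5) x 2 = 3.0 bonus points.
bonus = defending_bonus(140, 4.5)
```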
3. Context Weight

Phase weights are calculated differently for batting and bowling — the same phase has different difficulty levels depending on the role:

BATTING

Powerplay (overs 1-6): 1.0x (field restrictions help)
Middle (overs 7-15): 1.15x (rebuilding / accelerating)
Death (overs 16-20): 1.4x (big hitting, survival is hard)

+Chasing: 1.1x when batting second

BOWLING

Powerplay (overs 1-6): 1.3x (early wickets are gold)
Middle (overs 7-15): 1.0x (containment, standard)
Death (overs 16-20): 1.5x (hardest phase to bowl)

No chasing bonus: bowlers get the defending pressure bonus instead
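A lookup-table sketch of the context weights follows. It assumes the chasing bonus multiplies the phase weight, which the page does not state explicitly, and it scores a single phase at a time; how weights are combined for an innings that spans several phases is not covered here. The names are hypothetical.

```python
PHASE_WEIGHTS = {
    "batting": {"powerplay": 1.0, "middle": 1.15, "death": 1.4},
    "bowling": {"powerplay": 1.3, "middle": 1.0, "death": 1.5},
}
CHASING_BONUS = 1.1  # batting only


def context_weight(role: str, phase: str, chasing: bool = False) -> float:
    """Return the context weight for a role ('batting'/'bowling') and phase."""
    weight = PHASE_WEIGHTS[role][phase]
    if role == "batting" and chasing:
        weight *= CHASING_BONUS  # assumption: the bonus is multiplicative
    return weight


death_chase = context_weight("batting", "death", chasing=True)
```

Keeping the weights in a dictionary keyed by role makes the asymmetry between batting and bowling easy to see and to update if the model is retrained.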

4. Pressure Weight

CONDITION | WEIGHT
RRR > 10 | 1.08x
RRR > 8 | 1.04x
RRR > 6 | 1.02x
Wickets < 4 | +0.09
Wickets < 6 | +0.1
Late stage | +0.1

Capped at 2.0x maximum.
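One plausible way to combine the table into a single weight is sketched below. How the multipliers and additive bonuses interact is not spelled out, so this example assumes the highest matching RRR tier sets the base, only one wickets tier applies, the bonuses are added on top, and "late stage" arrives as a boolean flag. All names are hypothetical.

```python
def pressure_weight(required_run_rate: float, wickets_remaining: int,
                    late_stage: bool = False) -> float:
    """Combine RRR tier, wickets bonus, and late-stage bonus, capped at 2.0."""
    # Base multiplier from the highest matching required-run-rate tier.
    if required_run_rate > 10:
        weight = 1.08
    elif required_run_rate > 8:
        weight = 1.04
    elif required_run_rate > 6:
        weight = 1.02
    else:
        weight = 1.0
    # Additive bonus for few wickets in hand (assumed mutually exclusive tiers).
    if wickets_remaining < 4:
        weight += 0.09
    elif wickets_remaining < 6:
        weight += 0.10
    if late_stage:
        weight += 0.10
    return min(weight, 2.0)


# RRR of 11 with 3 wickets left late in the innings: 1.08 + 0.09 + 0.10.
tense_finish = pressure_weight(11.0, 3, late_stage=True)
```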

5. Knockout Multiplier

Playoff matches carry higher weight because the stakes are real — lose and go home.

LEAGUE MATCH

1.0x

Standard weight

KNOCKOUT MATCH

1.15x

Qualifiers, eliminators, finals

71 knockout matches identified across 2017-2024 (last 4 matches per season).
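The "last 4 matches per season" rule could be implemented along these lines. The tuple layout `(season, match_date, match_id)` and the function names are assumptions for this sketch, and dates are taken to be ISO strings so they sort chronologically.

```python
from collections import defaultdict


def flag_knockouts(matches, per_season: int = 4):
    """Return the set of match ids that are among the last N of their season.

    matches: iterable of (season, match_date, match_id) tuples.
    """
    by_season = defaultdict(list)
    for season, match_date, match_id in matches:
        by_season[season].append((match_date, match_id))
    knockout_ids = set()
    for games in by_season.values():
        games.sort()  # chronological order within the season
        knockout_ids.update(match_id for _, match_id in games[-per_season:])
    return knockout_ids


def knockout_multiplier(is_knockout: bool) -> float:
    """1.15x for knockout matches, 1.0x for league matches."""
    return 1.15 if is_knockout else 1.0
```

Using date order rather than an explicit stage label keeps the rule simple, at the cost of mislabelling any season whose playoff format differs.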

6. Normalization & Rolling Average

SIGMOID NORMALIZATION

100 / (1 + e^(-steepness × (raw - median)))

BAT: median=40, steepness=0.026 | BOWL: median=28, steepness=0.030
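The sigmoid above can be checked with a few lines of Python. The medians and steepness values come from the line above; the function and dictionary names are hypothetical.

```python
import math

# (median, steepness) per role, as listed above.
SIGMOID_PARAMS = {
    "batting": (40.0, 0.026),
    "bowling": (28.0, 0.030),
}


def normalize(raw: float, role: str) -> float:
    """Map a raw impact value onto a 0-100 scale with a sigmoid."""
    median, steepness = SIGMOID_PARAMS[role]
    return 100.0 / (1.0 + math.exp(-steepness * (raw - median)))


at_median = normalize(40.0, "batting")   # a raw score at the median maps to 50
strong = normalize(100.0, "batting")     # well above the median, approaches 100
```

A raw score equal to the median always lands at exactly 50, and the steepness controls how quickly scores saturate towards 0 or 100.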

EXPONENTIAL RECENCY DECAY (10 INNINGS)

Latest (1st): 1.00x
3rd most recent: 0.67x
5th most recent: 0.45x
10th (oldest): 0.17x

Formula: weight = 0.82^(distance from latest). Latest innings has ~6x the weight of the oldest.
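The decay weights and a recency-weighted average over the last 10 innings can be sketched as below. Dividing by the sum of the weights (a weighted mean) is an assumption, since only the weight formula is given; the function names are hypothetical.

```python
def recency_weights(n: int = 10, decay: float = 0.82):
    """Weights for the n most recent innings, latest first: decay ** distance."""
    return [decay ** distance for distance in range(n)]


def rolling_impact(scores_latest_first, decay: float = 0.82, window: int = 10) -> float:
    """Recency-weighted mean of up to `window` most recent innings scores."""
    recent = list(scores_latest_first)[:window]
    if not recent:
        raise ValueError("at least one innings score is required")
    weights = recency_weights(len(recent), decay)
    return sum(s * w for s, w in zip(recent, weights)) / sum(weights)


weights = recency_weights()
# weights[0], weights[2], weights[4], weights[9] correspond to the rows above.
```

For example, a player with a 100 in the latest innings and a 0 in the one before averages 100 / 1.82, or roughly 55, rather than 50.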

SCORE SCALE

0-35 | 35-50 | 50-60 | 60-75 | 75+

7. Data Pipeline

1. Ball-by-Ball CSV: 1,169 IPL matches (2017-2024), every delivery
2. Feature Extraction: 21 batting + 18 bowling features + knockout flags
3. XGBoost Training: 80/20 split, 5-fold cross-validation, SHAP explainability
4. Per-Innings Impact: P x C x W x K (ML weights) + defending bonus
5. Normalization: Scale to 0-100
6. Rolling Average: Recency-weighted last 10 innings
7. Player Impact Score: Final composite metric with full SHAP breakdown

Use It In Your Project

Add CricketIQ impact scoring to any app — 3 ways to integrate

The trained model is fully portable. Copy the model-template/ folder into your project and start scoring innings in minutes.

1. Python Import

Use directly in your Python code

    # In your Python app
    from features import calculate_impact

    # player_data holds one player's innings stats
    result = calculate_impact(player_data, 'batting')
    # result['impact_score'] → 0-100

2. REST API

Call from any language or frontend

    # Start the server
    python run_model.py --mode api

    # Then from anywhere
    POST localhost:5000/score
    { "runs": 82, "type": "batting" }
    # → { "impact_score": 93.0 }
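A client for the endpoint above might look like this, using only the Python standard library. It assumes the server accepts a JSON body with a `Content-Type: application/json` header and replies with JSON, which matches the example shown but is not otherwise documented here; the function names are hypothetical.

```python
import json
import urllib.request


def build_score_request(innings: dict, base_url: str = "http://localhost:5000"):
    """Build (but do not send) a POST request to the /score endpoint."""
    body = json.dumps(innings).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_innings(innings: dict, base_url: str = "http://localhost:5000",
                  timeout: float = 5.0) -> dict:
    """Send the innings to a running API server and return the parsed reply."""
    request = build_score_request(innings, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# With the server running: score_innings({"runs": 82, "type": "batting"})
```

Separating request construction from sending makes the payload easy to inspect or log before any network call is made.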

3. CSV Batch

Process hundreds of innings at once

    # Score an entire dataset
    python run_model.py \
      --input your_data.csv \
      --output scores.csv \
      --type both
    # → scores.csv with all results

HOW TO ADD TO YOUR PROJECT

1. Copy: Copy the model-template/ folder into your project
2. Install: Run pip install -r requirements.txt
3. Send Data: Pass player innings stats via Python, API, or CSV
4. Get Score: Receive 0-100 impact score with full breakdown

🏏 Cricket Apps: Add impact scores to any cricket scoring or analytics app

📊 Data Analysis: Score historical innings in bulk for research or fantasy leagues

🔗 Any Backend: Call the API from Node.js, Java, Go, or any language via HTTP