METHODOLOGY | ML-TRAINED

How CricketIQ Works

A transparent methodology that measures real match impact — not just stats. Every score is explainable and data-driven.

CORE FORMULA

Impact = P×C×W×K

P = Performance, C = Context, W = Pressure, K = Knockout

All weights learned from data via XGBoost on 1,169 matches.
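As a rough illustration of the core formula, the sketch below multiplies the four factors for a single innings. The function name `impact` and the sample values are assumptions made for this example; only the P x C x W x K product itself comes from the methodology.

```python
def impact(performance: float, context: float, pressure: float, knockout: float) -> float:
    """Raw (pre-normalization) impact for one innings: P x C x W x K."""
    return performance * context * pressure * knockout


# Hypothetical innings: performance score 45, death-overs batting (1.4x),
# required run rate above 10 (1.08x), knockout match (1.15x).
raw = impact(45.0, 1.4, 1.08, 1.15)
print(round(raw, 2))
```

Because the factors multiply, a neutral factor is simply 1.0, so a league match with no pressure leaves the performance score scaled by context alone.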

Rule-Based vs ML-Learned Weights

FACTOR	RULE-BASED (v1)	ML-LEARNED (v2)
Batting score	runs x 0.5 + SR x 0.2 + 4s x 2 + 6s x 3	Sum of 20 weighted features
Bowling score	wickets x 15 + dots x 1 + eco bonus	Sum of 16 weighted features
Powerplay (batting)	1.0x	1.0x
Powerplay (bowling)	1.0x	1.3x
Middle (batting)	1.2x	1.15x
Middle (bowling)	1.2x	1.0x
Death (batting)	1.5x	1.4x
Death (bowling)	1.5x	1.5x
Chasing bonus (bat only)	1.1x	1.1x
High RRR (>10)	1.4x	1.08x
Medium RRR (>8)	1.2x	1.04x
Low wickets (<4)	+0.2	+0.09
Knockout multiplier	Not included	1.15x
Defending pressure	Not included	Economy bonus for low totals

Purple values = significant changes discovered by ML

XGBOOST MODEL PERFORMANCE

METRIC	VALUE	INTERPRETATION
BATTING R2	0.7797	~78% of target variance explained
BATTING MAE	2.85	~2.9 pts avg error
BOWLING R2	0.5971	~60% of target variance explained
BOWLING MAE	3.42	~3.4 pts avg error
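For readers less familiar with these two metrics, here is a minimal sketch of how R2 and MAE are computed from actual and predicted scores. The helper names and the four sample values are made up for illustration; they are not taken from the CricketIQ training data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of the target's variance that the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up scores for four innings, purely for illustration.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(mean_absolute_error(actual, predicted))
print(r_squared(actual, predicted))
```

An R2 of 1.0 would mean perfect predictions, while 0.0 would mean the model does no better than always predicting the average.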

Model: XGBoost (max_depth=6, n=400, lr=0.05) | 17,708 batting + 13,878 bowling innings | 5-fold cross-validated

What the Model Learned (Feature Importance)

These charts show which stats the XGBoost model considers most important for predicting winning performances. Taller bar = more important.

[Charts: BATTING FEATURES | BOWLING FEATURES]

1-7

Step-by-Step Pipeline

Feature Engineering

Raw ball-by-ball CSV data is processed into 39 player-level features:

Batting (21 features)

runs, balls_faced, strike_rate
fours, sixes, boundary_pct
runs in powerplay/middle/death
batting_position, dot_ball_pct, run_rate_diff

Bowling (18 features)

wickets, economy, dot_balls
wickets in powerplay/middle/death
runs conceded by phase
dot_ball_pct, balls_bowled, balls by phase

Context

over phase (PP/MID/DEATH)
chasing or defending
run_rate_differential
match_stage_pct

Pressure & Stakes

required_run_rate
wickets_remaining
is_knockout (playoff/final)
first_innings_total (defending)
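The feature lists above can be sketched roughly as follows for the batting side. The delivery schema (`over`, `runs`) and the function names are hypothetical, extras such as wides are assumed to be filtered out beforehand, and percentages are expressed on a 0-100 scale as an assumption. The phase boundaries (overs 1-6, 7-15, 16-20) follow the ranges used elsewhere in this methodology.

```python
def phase_of(over: int) -> str:
    """Map a 1-based over number to its phase."""
    if over <= 6:
        return "powerplay"
    if over <= 15:
        return "middle"
    return "death"


def batting_features(deliveries):
    """Aggregate one batter's deliveries into a few innings-level features.

    Each delivery is a dict with hypothetical keys 'over' (1-based) and
    'runs' (runs off the bat).
    """
    balls = len(deliveries)
    runs = sum(d["runs"] for d in deliveries)
    dots = sum(1 for d in deliveries if d["runs"] == 0)
    features = {
        "runs": runs,
        "balls_faced": balls,
        "strike_rate": 100.0 * runs / balls if balls else 0.0,
        "fours": sum(1 for d in deliveries if d["runs"] == 4),
        "sixes": sum(1 for d in deliveries if d["runs"] == 6),
        "dot_ball_pct": 100.0 * dots / balls if balls else 0.0,
        "runs_powerplay": 0,
        "runs_middle": 0,
        "runs_death": 0,
    }
    # Split runs by the phase in which each ball was faced.
    for d in deliveries:
        features["runs_" + phase_of(d["over"])] += d["runs"]
    return features


sample = [
    {"over": 3, "runs": 4},
    {"over": 3, "runs": 0},
    {"over": 9, "runs": 1},
    {"over": 18, "runs": 6},
    {"over": 18, "runs": 1},
]
feats = batting_features(sample)
print(feats)
```

Bowling features would follow the same pattern, aggregating wickets, runs conceded, and balls bowled per phase.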

Performance Score

XGBoost learns optimal feature weights from match outcome data:

BATTING

score = Sum(feature x ML_weight) -- calibrated by XGBoost

BOWLING

score = Sum(feature x ML_weight) + defending_bonus

DEFENDING PRESSURE BONUS (NEW)

If defending < 150 & eco < 6: bonus = (6 - eco) x 2
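A minimal sketch of the defending pressure rule, assuming "defending" refers to the first-innings total being defended and "eco" to the bowler's economy rate. The function name is chosen for this example.

```python
def defending_bonus(first_innings_total: int, economy: float) -> float:
    """Bonus of (6 - eco) x 2 when defending under 150 with economy under 6."""
    if first_innings_total < 150 and economy < 6:
        return (6 - economy) * 2
    return 0.0


print(defending_bonus(140, 4.5))  # low total, tight spell: bonus applies
print(defending_bonus(180, 4.5))  # total too high: no bonus
```

The bonus grows linearly as economy drops, so a spell at 4.5 runs per over while defending 140 earns 3.0 extra points.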

Context Weight

Phase weights are calculated differently for batting and bowling — the same phase has different difficulty levels depending on the role:

BATTING

PHASE (OVERS)	WEIGHT	NOTES
Powerplay (1-6)	1.0x	Field restrictions help
Middle (7-15)	1.15x	Rebuilding / accelerating
Death (16-20)	1.4x	Big hitting, survival is hard

+Chasing: 1.1x when batting second

BOWLING

PHASE (OVERS)	WEIGHT	NOTES
Powerplay (1-6)	1.3x	Early wickets are gold
Middle (7-15)	1.0x	Containment, standard
Death (16-20)	1.5x	Hardest phase to bowl

No chasing bonus — bowlers get defending pressure bonus instead
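The role-specific phase weights can be expressed as a simple lookup. This is a sketch under two assumptions that the page does not spell out: the weight is looked up for a single over, and the chasing bonus multiplies the batting phase weight. The names `PHASE_WEIGHTS` and `context_weight` are illustrative.

```python
PHASE_WEIGHTS = {
    "batting": {"powerplay": 1.0, "middle": 1.15, "death": 1.4},
    "bowling": {"powerplay": 1.3, "middle": 1.0, "death": 1.5},
}
CHASING_BONUS = 1.1  # applies to batting only


def context_weight(role: str, over: int, chasing: bool = False) -> float:
    """Phase weight for a role at a given 1-based over."""
    if over <= 6:
        phase = "powerplay"
    elif over <= 15:
        phase = "middle"
    else:
        phase = "death"
    weight = PHASE_WEIGHTS[role][phase]
    # Assumption: the chasing bonus is multiplicative and ignored for bowlers.
    if role == "batting" and chasing:
        weight *= CHASING_BONUS
    return weight


print(context_weight("batting", 18, chasing=True))
print(context_weight("bowling", 3))
```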

Pressure Weight

CONDITION	WEIGHT
RRR > 10	1.08x
RRR > 8	1.04x
RRR > 6	1.02x
Wickets < 4	+0.09
Wickets < 6	+0.1
Late stage	+0.1

Capped at 2.0x maximum.
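One way to read the table above is sketched below. The combination rules are assumptions, since the page lists the conditions without saying how they stack: the highest matching RRR tier sets the base multiplier (1.0 otherwise), only one wickets tier applies, "wickets" means wickets remaining, additive bonuses are added to the multiplier, and "late stage" is passed in as a flag because its threshold is not given.

```python
PRESSURE_CAP = 2.0


def pressure_weight(required_run_rate: float, wickets_remaining: int,
                    late_stage: bool = False) -> float:
    """Pressure weight from RRR tier plus additive bonuses, capped at 2.0."""
    # Assumption: RRR tiers are exclusive; the highest matching tier wins.
    if required_run_rate > 10:
        weight = 1.08
    elif required_run_rate > 8:
        weight = 1.04
    elif required_run_rate > 6:
        weight = 1.02
    else:
        weight = 1.0
    # Assumption: wickets tiers are exclusive too.
    if wickets_remaining < 4:
        weight += 0.09
    elif wickets_remaining < 6:
        weight += 0.1
    if late_stage:
        weight += 0.1
    return min(weight, PRESSURE_CAP)


print(pressure_weight(11.0, 3, late_stage=True))
print(pressure_weight(5.0, 8))
```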

Knockout Multiplier

Playoff matches carry higher weight because the stakes are real — lose and go home.

LEAGUE MATCH

1.0x

Standard weight

KNOCKOUT MATCH

1.15x

Qualifiers, eliminators, finals

71 knockout matches identified across 2017-2024 (last 4 matches per season).
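The "last 4 matches per season" rule could be implemented along these lines. The match record layout (`match_id`, `season`, `date`) and the function names are hypothetical; dates are assumed to be ISO strings so that they sort chronologically.

```python
from collections import defaultdict

KNOCKOUT_MULTIPLIER = 1.15
KNOCKOUTS_PER_SEASON = 4


def knockout_match_ids(matches):
    """Return the ids of the last four matches (by date) in each season."""
    by_season = defaultdict(list)
    for match in matches:
        by_season[match["season"]].append(match)
    ids = set()
    for season_matches in by_season.values():
        ordered = sorted(season_matches, key=lambda m: m["date"])
        ids.update(m["match_id"] for m in ordered[-KNOCKOUTS_PER_SEASON:])
    return ids


def knockout_multiplier(match_id, knockout_ids) -> float:
    """1.15x for knockout matches, 1.0x for league matches."""
    return KNOCKOUT_MULTIPLIER if match_id in knockout_ids else 1.0


# Toy season of six matches on consecutive days.
matches = [
    {"match_id": i, "season": 2024, "date": f"2024-05-{i:02d}"}
    for i in range(1, 7)
]
ids = knockout_match_ids(matches)
print(sorted(ids))
```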

Normalization & Rolling Average

SIGMOID NORMALIZATION

100 / (1 + e^(-steepness × (raw - median)))

BAT: median=40, steepness=0.026 | BOWL: median=28, steepness=0.030
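The sigmoid above can be sketched in a few lines using the stated medians and steepness values. The dictionary and function names are chosen for this example.

```python
import math

SIGMOID_PARAMS = {
    "batting": {"median": 40.0, "steepness": 0.026},
    "bowling": {"median": 28.0, "steepness": 0.030},
}


def normalize(raw: float, role: str) -> float:
    """Map a raw impact value onto the 0-100 scale with a sigmoid."""
    params = SIGMOID_PARAMS[role]
    exponent = -params["steepness"] * (raw - params["median"])
    return 100.0 / (1.0 + math.exp(exponent))


print(normalize(40.0, "batting"))  # a raw score at the median maps to 50.0
print(normalize(90.0, "batting"))
```

A raw score equal to the median lands exactly at 50, and the curve flattens toward 0 and 100 so that extreme innings cannot run off the scale.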

EXPONENTIAL RECENCY DECAY (10 INNINGS)

Latest (1st)	1.00x
3rd most recent	0.67x
5th most recent	0.45x
10th (oldest)	0.17x

Formula: weight = 0.82^(distance from latest). Latest innings has ~6x the weight of the oldest.
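The decay weights and the resulting rolling score can be sketched as below. Dividing by the sum of the weights (a weighted average) is an assumption, as is the latest-first ordering of the input list; the function names are illustrative.

```python
DECAY = 0.82
WINDOW = 10


def recency_weights(n: int):
    """Weights 0.82^0, 0.82^1, ... for the n most recent innings."""
    return [DECAY ** distance for distance in range(n)]


def rolling_impact(scores_latest_first):
    """Recency-weighted average of up to the last 10 normalized scores."""
    recent = scores_latest_first[:WINDOW]
    if not recent:
        raise ValueError("at least one innings score is required")
    weights = recency_weights(len(recent))
    return sum(w * s for w, s in zip(weights, recent)) / sum(weights)


weights = recency_weights(WINDOW)
print([round(w, 2) for w in weights])
print(rolling_impact([80.0, 40.0]))
```

With only two innings of 80 (latest) and 40, the result sits closer to 80 than the plain mean of 60, which is the intended effect of the decay.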

SCORE SCALE

0-35 | 35-50 | 50-60 | 60-75 | 75+

Data Pipeline

Ball-by-Ball CSV

1,169 IPL matches (2017-2024), every delivery

Feature Extraction

21 batting + 18 bowling features + knockout flags

XGBoost Training

80/20 split, 5-fold cross-validation, SHAP explainability

Per-Innings Impact

P x C x W x K (ML weights) + defending bonus

Normalization

Scale to 0-100

Rolling Average

Recency-weighted last 10 innings

Player Impact Score

Final composite metric with full SHAP breakdown

Use It In Your Project

Add CricketIQ impact scoring to any app — 3 ways to integrate

The trained model is fully portable. Copy the model-template/ folder into your project and start scoring innings in minutes.

Python Import

Use directly in your Python code

# In your Python app
from features import calculate_impact

result = calculate_impact(player_data, 'batting')
# result['impact_score'] → 0-100

REST API

Call from any language or frontend

# Start the server
python run_model.py --mode api

# Then from anywhere
POST localhost:5000/score
{ "runs": 82, "type": "batting" }
# → { "impact_score": 93.0 }
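From Python, the same call could be made with the standard library alone. This is a sketch: the `http://` scheme, the JSON content-type header, and the helper names are assumptions, while the host, port, path, and payload fields are the ones shown above. `fetch_score` only works while the API server is running.

```python
import json
import urllib.request

SCORE_URL = "http://localhost:5000/score"  # scheme is an assumption


def build_score_request(payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /score endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        SCORE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_score(payload: dict, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON reply (server must be running)."""
    req = build_score_request(payload)
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_score_request({"runs": 82, "type": "batting"})
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect or log before anything goes over the wire.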

CSV Batch

Process hundreds of innings at once

# Score an entire dataset
python run_model.py \
  --input your_data.csv \
  --output scores.csv \
  --type both
# → scores.csv with all results

HOW TO ADD TO YOUR PROJECT

Copy

Copy the model-template/ folder into your project

Install

Run pip install -r requirements.txt

Send Data

Pass player innings stats via Python, API, or CSV

Get Score

Receive 0-100 impact score with full breakdown

🏏

Cricket Apps

Add impact scores to any cricket scoring or analytics app

📊

Data Analysis

Score historical innings in bulk for research or fantasy leagues

🔗

Any Backend

Call the API from Node.js, Java, Go, or any language via HTTP