Wooble
DATADECODE - The Insight Nobody Is Looking At
DataDecode: where data stops being numbers and starts becoming news. Dive into AISHE datasets, uncover a story no one’s telling, and turn complex education data into a clear, publish-ready insight.
Evaluation closed
Data Analysis, Data Analytics, Data Preprocessing, Data Science, Data Visualization, Design Documentation
What you get
Rewards and recognition for participating in this challenge.
Participation certificate
Receive a certificate when you complete the challenge.
Problem statement
The Setup
Meera Iyer is a journalist at a regional news outlet in Chennai. She covers education and employment. Every year, the Ministry of Education releases the AISHE report — All India Survey on Higher Education — a 200-page PDF with enrollment numbers, faculty counts, infrastructure data, pass rates, and placement figures across every college category in India. Every year, Meera downloads it. Every year, she writes the same story: "Enrollment up. Employability still a concern."
She knows there is a better story in that data. She does not have the skills to find it.
You do.
The Exact Problem You Are Solving
You will analyse the AISHE dataset (publicly available at aishe.gov.in — download the last 3 years of data) and produce one dashboard that tells one story that Meera could publish tomorrow as a data-backed news piece.
The story must be non-obvious. "Enrollment increased" is not a story. "States with the highest female STEM enrollment have not seen a corresponding increase in female faculty at those institutions — and here is the exact gap, state by state" is a story.
The Dataset
Primary source: AISHE Annual Reports 2021–22, 2022–23, 2023–24 from aishe.gov.in. You must use data from at least 2 of the 3 years. You may enrich with one secondary source (NSSO, NCRB, RBI DBIE, or data.gov.in) but the core finding must come from AISHE.
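As a rough illustration of the kind of state-by-state gap described above, the sketch below computes the female share of students minus the female share of faculty for each state and year, then confirms that at least two survey years are present. The column names and every number in the sample are placeholders invented for this example, not real AISHE fields or figures; the actual AISHE tables would need to be mapped into this shape first.

```python
import csv
import io

# Placeholder rows, NOT real AISHE figures. Column names are assumptions.
SAMPLE_CSV = """state,year,female_stem_students,total_stem_students,female_faculty,total_faculty
State A,2021-22,400,1000,150,500
State A,2022-23,450,1000,150,500
State B,2021-22,300,1000,200,500
State B,2022-23,310,1000,210,500
"""


def share(part, whole):
    """Return part/whole as a percentage, or None when the denominator is zero."""
    return None if whole == 0 else 100.0 * part / whole


def gap_by_state_year(csv_text):
    """Map (state, year) to student share minus faculty share, in percentage points."""
    gaps = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        student_share = share(int(row["female_stem_students"]),
                              int(row["total_stem_students"]))
        faculty_share = share(int(row["female_faculty"]),
                              int(row["total_faculty"]))
        if student_share is None or faculty_share is None:
            continue  # skip rows where a share cannot be computed
        gaps[(row["state"], row["year"])] = round(student_share - faculty_share, 1)
    return gaps


gaps = gap_by_state_year(SAMPLE_CSV)
years = sorted({year for _, year in gaps})

# The brief requires data from at least 2 of the 3 survey years.
if len(years) < 2:
    raise ValueError("need at least two AISHE years")

for (state, year), gap in sorted(gaps.items()):
    print(state, year, gap)
```

A positive gap here means women make up a larger share of students than of faculty; whether that is the right definition of "gap" for a given story is a choice the methodology note should state.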
Mandatory Deliverables
Deliverable 1: The One Finding
State your finding in exactly one sentence before anything else. This sentence must be falsifiable: someone should be able to look at your data and either confirm or dispute it. "The data suggests interesting patterns" is not a finding. "Private engineering colleges in Tier 2 cities show 34% higher placement rates than government colleges in the same cities, but charge 6x the fees, concentrated in 4 states" is a finding.
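One way to keep a finding falsifiable is to recompute every number in the sentence from its inputs. The sketch below shows the arithmetic behind a "34% higher" and "6x" style claim. The rates and fees are placeholders chosen to make the arithmetic visible, not real figures, and the function names are hypothetical.

```python
def percent_higher(value, baseline):
    """Relative difference of value over baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (value - baseline) / baseline


def fee_multiple(fee, baseline_fee):
    """How many times larger fee is than baseline_fee."""
    if baseline_fee == 0:
        raise ValueError("baseline_fee must be non-zero")
    return fee / baseline_fee


# Placeholder inputs, NOT real data.
private_placement_rate = 67.0      # percent of students placed
government_placement_rate = 50.0   # percent of students placed
private_fee = 600_000              # annual fee, rupees
government_fee = 100_000           # annual fee, rupees

# The numbers stated in the one-sentence finding.
claim = {"percent_higher": 34, "fee_multiple": 6}

computed = {
    "percent_higher": percent_higher(private_placement_rate, government_placement_rate),
    "fee_multiple": fee_multiple(private_fee, government_fee),
}

# The finding holds only if the recomputed values match the sentence.
finding_holds = all(round(computed[key]) == claim[key] for key in claim)
print(computed, finding_holds)
```

With these inputs, 67% against 50% is a gap of 17 percentage points but 34% in relative terms, so the finding sentence should make clear which of the two it means.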
Deliverable 2: The Dashboard
A live, public dashboard (Streamlit, Looker Studio, Observable, or Tableau Public; your choice) that visualises the finding. The dashboard must have:
- a headline (the finding in one sentence)
- supporting charts (minimum 3, maximum 6)
- a methodology note (how did you get here)
- a limitations section (what can this data not tell you)
Every chart must have a plain English caption. No chart without a caption.
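The dashboard requirements can be checked mechanically before publishing. The sketch below assumes a hypothetical dict-based description of the dashboard (the keys `headline`, `charts`, `caption`, `methodology`, and `limitations` are invented for this example and are not tied to any dashboard tool) and reports which requirements are not yet met.

```python
def check_dashboard(spec):
    """Return a list of problems; an empty list means these checks pass."""
    problems = []
    if not spec.get("headline", "").strip():
        problems.append("missing headline")
    charts = spec.get("charts", [])
    if not 3 <= len(charts) <= 6:
        problems.append(f"expected 3 to 6 charts, found {len(charts)}")
    for position, chart in enumerate(charts, start=1):
        if not chart.get("caption", "").strip():
            problems.append(f"chart {position} has no caption")
    if not spec.get("methodology", "").strip():
        problems.append("missing methodology note")
    # The hard constraints ask for at least 2 acknowledged limitations.
    if len(spec.get("limitations", [])) < 2:
        problems.append("fewer than 2 limitations listed")
    return problems


# Placeholder draft with two deliberate gaps.
draft = {
    "headline": "Placeholder one-sentence finding.",
    "charts": [
        {"title": "Chart 1", "caption": "Plain English caption for chart 1."},
        {"title": "Chart 2", "caption": ""},
        {"title": "Chart 3", "caption": "Plain English caption for chart 3."},
    ],
    "methodology": "Placeholder methodology note.",
    "limitations": ["Placeholder limitation 1."],
}

problems = check_dashboard(draft)
print(problems)
```

A check like this only confirms that the pieces exist. Whether a caption is actually plain English still needs a human reader.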
Deliverable 3: The Journalist Brief
A 300-word document written for Meera, not for a data scientist. No jargon. No methodology details. Just: here is what the data shows, here is why it matters, and here are the questions it raises that a journalist should investigate further. If Meera cannot understand it, it fails.
Deliverable 4: The Code
A GitHub repo with: a raw data download script or instructions, a cleaning notebook (every cleaning step commented), an analysis notebook, and the visualisation code. A judge must be able to run your code and reproduce your finding exactly.
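Exact reproduction depends on the judge starting from the same raw files. One possible approach, sketched below, is to record a SHA-256 checksum for every raw file and compare against it before running the notebooks. The directory layout, file name, and function names are assumptions made for this example; the demo writes a throwaway file rather than touching real data.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir):
    """Map each file name in raw_dir to its SHA-256 digest."""
    return {
        path.name: sha256_of_file(path)
        for path in sorted(Path(raw_dir).iterdir())
        if path.is_file()
    }


def changed_files(raw_dir, manifest):
    """Names of files that were added, removed, or modified since the manifest."""
    current = build_manifest(raw_dir)
    names = set(current) | set(manifest)
    return sorted(name for name in names if current.get(name) != manifest.get(name))


# Demo on a temporary directory with a placeholder file.
with tempfile.TemporaryDirectory() as raw_dir:
    sample = Path(raw_dir) / "aishe_sample.csv"
    sample.write_text("state,year,value\nState A,2021-22,1\n", encoding="utf-8")
    manifest = build_manifest(raw_dir)
    unchanged = changed_files(raw_dir, manifest)

    # Simulate a silent edit to the raw data.
    sample.write_text("state,year,value\nState A,2021-22,2\n", encoding="utf-8")
    after_edit = changed_files(raw_dir, manifest)

print(unchanged, after_edit)
```

Committing the manifest alongside the download instructions lets a judge detect a re-published or hand-edited source file before the numbers start to differ.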
Hard Constraints
- Your finding must be about India, not a global comparison.
- You cannot use a finding that has already been published in a major Indian newspaper in the last 2 years. If it has been reported, it is not a new story.
- Every number on your dashboard must be traceable to a specific row or column in the raw data.
- The dashboard must load in under 5 seconds on a standard connection.
- You must acknowledge at least 2 limitations of your analysis. Claiming your analysis is definitive is an automatic penalty.
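The traceability constraint can be made concrete by storing, next to each displayed number, the file, row, and column it came from, and then checking it against the raw data. The sketch below is one hypothetical way to do that; the `TracedNumber` record, the file name, the column names, and the sample values are all invented for illustration.

```python
import csv
import io
from dataclasses import dataclass

# Placeholder raw data, NOT real AISHE figures.
RAW_CSV = """state,year,female_faculty
State A,2021-22,150
State B,2021-22,200
"""


@dataclass(frozen=True)
class TracedNumber:
    label: str        # what the dashboard calls this number
    value: float      # the value shown on the dashboard
    source_file: str  # raw file the value was read from
    row_key: tuple    # (state, year) identifying the source row
    column: str       # source column name


def lookup(csv_text, row_key, column):
    """Return the raw value at the given row and column, or raise KeyError."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row["state"], row["year"]) == row_key:
            return float(row[column])
    raise KeyError(f"no row for {row_key}")


def is_traceable(number, csv_text):
    """True when the dashboard value matches the raw cell it points to."""
    try:
        return lookup(csv_text, number.row_key, number.column) == number.value
    except KeyError:
        return False


good = TracedNumber("Female faculty, State A", 150.0,
                    "raw/sample.csv", ("State A", "2021-22"), "female_faculty")
bad = TracedNumber("Female faculty, State B", 201.0,
                   "raw/sample.csv", ("State B", "2021-22"), "female_faculty")

print(is_traceable(good, RAW_CSV), is_traceable(bad, RAW_CSV))
```

Derived numbers such as ratios or gaps would need one such record per input, plus the formula, for the trail back to the raw data to stay complete.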
Deliverables
The One Question Your Submission Must Answer
- "So what? Who should care about this finding, and what should they do differently because of it?"
- If your submission cannot answer this, it is an academic exercise, not a piece of work.
Evaluation criteria
- Is the finding genuinely non-obvious? Would a domain expert be surprised? (30%)
- Is it reproducible? Does a judge who runs the code arrive at the same number? (25%)
- Can a non-technical person understand the dashboard in 60 seconds? (25%)
- Are limitations honestly stated? (10%)