Query PISA data in plain English.
A simple interface for exploring international education data. The app turns plain-English questions into validated SQL queries, runs them in BigQuery, and produces charts with the underlying data.
Available data
Years
PISA waves are available for 2003, 2006, 2009, 2012, 2015, 2018, and 2022.
Countries
The data covers 24 countries: the 23 OECD countries with non-missing PISA scores from 2003 to 2022, plus the United States. They are Australia, Belgium, Canada, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Latvia, Mexico, New Zealand, Norway, Poland, Portugal, Sweden, Switzerland, and the United States of America.
Variables
Available grouping variables include subject (math, reading, and science), gender, school type (public and private), immigrant status (native vs. first- or second-generation), language at home, parental education, and derived country groups such as continent or English-speaking vs. non-English-speaking.
How it works
This app uses GPT-4.1 mini to parse natural-language PISA questions into structured JSON: subject, country, year/time mode, filters, grouping variables, country groups, and query type. The model does not generate executable SQL. The backend validates the JSON intent, maps it to a fixed SQL template, and runs the query against BigQuery.
Missing values in grouping variables are excluded by default unless explicitly requested. Within-country averages use PISA student-level weights and plausible values. Cross-country averages use senate weights, giving each country equal weight. Generated SQL and returned rows are shown alongside the output for transparency.
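To make the validate-then-template step concrete, here is a minimal Python sketch of how a parsed intent could be checked against whitelists and mapped to a fixed query. It is an illustration under stated assumptions, not the app's actual backend: the table name, column names, intent fields, and helper functions are all hypothetical, the country set is only a subset, and the weighted mean is simplified to a single score column rather than a set of plausible values.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple

# Hypothetical whitelists. The real schema and column names are not documented.
SCORE_COLUMNS = {"math": "math_score", "reading": "reading_score", "science": "science_score"}
GROUP_COLUMNS = {"gender": "gender", "school_type": "school_type"}
COUNTRIES = {"Finland", "Japan", "Mexico"}  # illustrative subset only
YEARS = {2003, 2006, 2009, 2012, 2015, 2018, 2022}


@dataclass(frozen=True)
class Intent:
    subject: str
    country: str
    year: int
    group_by: Optional[str] = None


def _require(value: Any, allowed, field: str):
    # The type guard keeps unhashable values (lists, dicts) from raising TypeError.
    if not isinstance(value, (str, int)) or isinstance(value, bool) or value not in allowed:
        raise ValueError(f"unsupported {field}: {value!r}")
    return value


def validate_intent(raw: Dict[str, Any]) -> Intent:
    """Reject any field value that is not on a whitelist."""
    group_by = raw.get("group_by")
    if group_by is not None:
        _require(group_by, GROUP_COLUMNS, "group_by")
    return Intent(
        subject=_require(raw.get("subject"), SCORE_COLUMNS, "subject"),
        country=_require(raw.get("country"), COUNTRIES, "country"),
        year=_require(raw.get("year"), YEARS, "year"),
        group_by=group_by,
    )


def build_query(intent: Intent) -> Tuple[str, Dict[str, Any]]:
    """Fill a fixed template. Identifiers come only from the whitelists above;
    values travel separately as named parameters (@country, @year)."""
    score_col = SCORE_COLUMNS[intent.subject]
    group_col = GROUP_COLUMNS[intent.group_by] if intent.group_by else None
    select_group = f"{group_col}, " if group_col else ""
    # Rows with a missing grouping value are dropped by default.
    missing_filter = f"\n  AND {group_col} IS NOT NULL" if group_col else ""
    group_clause = f"\nGROUP BY {group_col}" if group_col else ""
    sql = (
        f"SELECT {select_group}"
        f"SUM({score_col} * student_weight) / SUM(student_weight) AS avg_score\n"
        "FROM `pisa.student_scores`\n"  # hypothetical table name
        "WHERE country = @country\n  AND year = @year"
        f"{missing_filter}{group_clause}"
    )
    return sql, {"country": intent.country, "year": intent.year}


if __name__ == "__main__":
    raw = {"subject": "math", "country": "Finland", "year": 2018, "group_by": "gender"}
    sql, params = build_query(validate_intent(raw))
    print(sql)
    print(params)
```

The design point is that model output never reaches the SQL string directly: column names are looked up from fixed dictionaries, and user-supplied values are bound as query parameters at execution time, which BigQuery supports through named parameters.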