Mock Pupil-History Dataset — Learning Path Agent Hackathon
Hand-crafted, internally consistent JSON dataset for the BoostAI Learning
Path Agent hackathon challenge. Mirrors the production SQLAlchemy schema
in elevenplus-backend/src/app/models/ so an agent built on this data
transfers cleanly to the real database after the hackathon.
Reference date: 2026-05-01 (timestamps in the data are relative to
this — adjust TODAY in generate.py to shift them).
What's in here
| File | Contents |
|---|---|
| classroom.json | One Year-6 Maths class + tutor + student-classroom links |
| students.json | 12 students (with _persona annotation per student) |
| question_bank.json | 50 unique 11+ Maths questions |
| assignments.json | 8 assignments (5 CLOSED, 2 PUBLISHED, 1 DRAFT) |
| assignment_questions.json | 64 assignment-question rows (links to question_bank) |
| assignment_assignees.json | 85 (student × assignment) rows with status/scores |
| student_answers.json | 593 per-question answer records (THE primary file) |
| activity_logs.json | 593 activity log rows (timestamps, durations, solve_mode) |
| dataset.json | All of the above bundled into one object |
| bonus_early_warning_input.json | 3 classes × 10 students for the bonus EWS challenge |
| bonus_early_warning_output.json | Expected morning alert output (top 3 per class) |
| generate.py | Deterministic generator (re-run anytime) |
The agent's main input is dataset.json (or student_answers.json plus
question_bank.json if you'd rather load files separately).
For the bonus challenge shown in the brief, use
bonus_early_warning_input.json as the class-level monitor input (ordered
topics plus per-student topic understanding scores) and
bonus_early_warning_output.json as the expected ranked risk list, which
gives a specific weak topic and recommended action for each student.
Schema fidelity
Field names and enum values match the production models exactly:
| Production model | This dataset |
|---|---|
| users.py (role=student) | students.json |
| classrooms.py + classroom_student_rs.py | classroom.json |
| assignments.py (status: DRAFT/PUBLISHED/CLOSED) | assignments.json |
| assignment_questions.py | assignment_questions.json |
| assignment_assignees.py (status: NOT_STARTED/IN_PROGRESS/SUBMITTED) | assignment_assignees.json |
| question_bank.py (difficulty: EASY/MEDIUM/HARD; source: BOOST) | question_bank.json |
| student_assignment_answers.py (graded_marks, marks_awarded, grading_status=GRADED) | student_answers.json |
| assignment_activity_logs.py (activity_type, duration_seconds, extra_data) | activity_logs.json |
| question_metrics.py (explanation_type) | embedded as _solve_mode on each answer |
Timestamps are Unix milliseconds (BigInteger) per the project convention.
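Because timestamps are stored as Unix milliseconds, most trend analysis starts by converting them to datetimes. The following is a minimal sketch, not part of the dataset tooling: the helper names `ms_to_datetime` and `days_before_reference` are hypothetical, and treating the reference date as midnight UTC is an assumption (the source does not state a timezone).

```python
from datetime import datetime, timezone

# Reference date from this README; midnight UTC is an assumption.
REFERENCE = datetime(2026, 5, 1, tzinfo=timezone.utc)

def ms_to_datetime(ts_ms: int) -> datetime:
    """Convert a Unix-millisecond timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)

def days_before_reference(ts_ms: int) -> float:
    """Days between a timestamp and the reference date (positive = in the past)."""
    return (REFERENCE - ms_to_datetime(ts_ms)).total_seconds() / 86400

# Example: a timestamp exactly 8 days before the reference date.
eight_days_ago_ms = int(REFERENCE.timestamp() * 1000) - 8 * 86_400_000
print(days_before_reference(eight_days_ago_ms))  # 8.0
```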
Hackathon annotations (not in production schema)
Fields prefixed with _ are hackathon-only annotations. Strip them if
you ever seed this data into the real DB.
| Annotation | Where | Meaning |
|---|---|---|
| _persona | students.json | Engineered behaviour persona (see below) |
| _solve_mode | student_answers.json | One of just_answer, step_by_step, solve_together, handwritten |
| _time_on_task_seconds | student_answers.json | Seconds spent on this question |
| _is_correct | student_answers.json | Boolean correctness (already implied by graded_marks) |
| _misconception_tag | student_answers.json | Set when a wrong answer matches a known misconception (e.g. add_tops_add_bottoms) |
| _question_topic / _sub_topic / _difficulty | student_answers.json | Denormalised from question_bank for convenience |
| _answered_at | student_answers.json | Same as created_at, just a clearer name |
The mock exports now also include newer review-style fields used by the current app schema, such as:
- per-question: is_correct, ai_feedback, review_needs_attention, review_issue_reason, review_correctness_score, review_understanding_score, review_question_score, review_confidence, review_tags
- per-assignee: overall_score, ai_feedback, next_step_outcome
These make it easier to seed or backfill historical closed homework into the newer review surfaces.
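Stripping the hackathon-only annotations before seeding a real database can be done with a small recursive filter. This is a sketch under the assumption that every annotation key starts with an underscore and that no production field does; `strip_annotations` is a hypothetical helper and the sample record values are made up for illustration.

```python
def strip_annotations(obj):
    """Recursively drop dict keys that start with an underscore."""
    if isinstance(obj, dict):
        return {k: strip_annotations(v) for k, v in obj.items()
                if not k.startswith('_')}
    if isinstance(obj, list):
        return [strip_annotations(item) for item in obj]
    return obj  # scalars pass through unchanged

# Illustrative record; the values are invented, not taken from the dataset.
sample = {'id': 1, 'graded_marks': 1,
          '_is_correct': True, '_solve_mode': 'just_answer'}
print(strip_annotations(sample))  # {'id': 1, 'graded_marks': 1}
```

Because the function recurses through lists and dicts, it should also work on the whole bundled object loaded from dataset.json.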
The 12 students
Five students have engineered misconception personas; the other seven are
realistic noise. The agent should ideally identify the personas from the
data itself — _persona is included only so you can grade the agent.
| ID | Name | Persona | What you'll see in the data |
|---|---|---|---|
| 201 | Aisha Khan | fraction_inversion | ~12% on Fractions Add/Subtract/Multiply, ~78% elsewhere. Wrong answers show the add-tops-add-bottoms pattern (e.g. ½+⅓ → ⅖). |
| 202 | Ben Carter | place_value_gaps | Fails multi-digit subtraction with borrowing & decimal alignment. Strong on single-digit ops. |
| 203 | Chen Wei | rushed_careless | Right method when in step_by_step; wrong final answer in just_answer. Time-on-task drops week-over-week. No activity in the last 8 days, which also drives Early Warning. |
| 204 | Daniela Rossi | solve_together_dependent | Solve-Together share rises 21% → 88% across the period. Independent accuracy degrading. |
| 205 | Elif Demir | word_problem_weak | 0% on word problems, ~90% on bare computation of the same operations. |
| 206 | Felix Brown | stable_strong | ~84% overall (baseline noise). |
| 207 | Grace Park | stable_strong | ~85% overall (baseline noise). |
| 208–210 | Singh / Nakamura / Williams | stable_mid | ~65% overall (baseline noise). |
| 211–212 | Patel / O'Connor | stable_weak | ~50% overall (baseline noise). |
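Since _persona exists only for grading, an agent's guesses can be scored against it. The sketch below assumes each student record carries an `id` key and that the agent emits a mapping of student ID to predicted persona; both the output format and the helper `grade_personas` are hypothetical.

```python
# Persona labels that mark baseline-noise students (taken from the table above).
NOISE = {'stable_strong', 'stable_mid', 'stable_weak'}

def grade_personas(students, predictions):
    """Return (hits, total, missed_ids) for the engineered personas only."""
    engineered = [s for s in students if s['_persona'] not in NOISE]
    hits, missed = 0, []
    for s in engineered:
        if predictions.get(s['id']) == s['_persona']:
            hits += 1
        else:
            missed.append(s['id'])
    return hits, len(engineered), missed

# Illustrative subset of students and a made-up agent output.
students = [
    {'id': 201, '_persona': 'fraction_inversion'},
    {'id': 203, '_persona': 'rushed_careless'},
    {'id': 206, '_persona': 'stable_strong'},
]
predictions = {201: 'fraction_inversion', 203: 'place_value_gaps'}
print(grade_personas(students, predictions))  # (1, 2, [203])
```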
The 8 assignments
| ID | Name | Topic | Due | Status | Why it matters |
|---|---|---|---|---|---|
| 3001 | HW1 — Place Value Warmup | Place Value | -28d | CLOSED | Baseline data |
| 3002 | HW2 — Arithmetic Practice | Arithmetic | -22d | CLOSED | |
| 3003 | HW3 — Fractions Foundations | Fractions | -16d | CLOSED | First fraction signal |
| 3004 | HW4 — Negatives & BIDMAS | BIDMAS | -10d | CLOSED | |
| 3005 | HW5 — Geometry Basics | Geometry | -6d | CLOSED | |
| 3006 | HW6 — Algebra & Sequences | Algebra | +2d | PUBLISHED | In flight |
| 3007 | HW7 — Adding Fractions (test prep) | Fractions | +5d | PUBLISHED | Curriculum deadline anchor for the bonus EWS |
| 3008 | HW8 — Mixed Revision | Mixed | +12d | DRAFT | No activity yet |
Bonus / Early Warning signals embedded in the data
| Signal | Where it lives | Who exhibits it |
|---|---|---|
| Drop in attempt rate (last 7 days) | student_answers.json timestamps | Student 203 (no recent activity); 211 reduced volume |
| Increasing Solve-Together reliance | _solve_mode distribution over time | Student 204 (21% → 88%) |
| Declining score trend on deadline topic | Fractions accuracy across HW3 → HW7 | Student 201 (very weak on Fractions; HW7 is fractions, due in 5 days) |
| Time since last session | activity_logs.json last timestamp | Student 203 (≥ 8 days) |
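Two of these signals can be computed directly from the documented `_solve_mode` and `_answered_at` annotations. The sketch below is one possible approach, not the reference implementation: the helper names, the single early/late split point, and the sample rows are all assumptions made for illustration.

```python
DAY_MS = 86_400_000  # milliseconds per day

def solve_together_share(answers, split_ms):
    """Share of solve_together answers before and after split_ms."""
    def share(rows):
        if not rows:
            return None  # no data in this window
        hits = sum(r['_solve_mode'] == 'solve_together' for r in rows)
        return hits / len(rows)
    early = [a for a in answers if a['_answered_at'] < split_ms]
    late = [a for a in answers if a['_answered_at'] >= split_ms]
    return share(early), share(late)

def days_since_last(answers, now_ms):
    """Whole days between the most recent answer and now_ms."""
    last = max(a['_answered_at'] for a in answers)
    return (now_ms - last) // DAY_MS

# Made-up rows for one student; timestamps are day offsets times DAY_MS.
now_ms = 100 * DAY_MS
rows = [
    {'_solve_mode': 'just_answer', '_answered_at': 70 * DAY_MS},
    {'_solve_mode': 'solve_together', '_answered_at': 72 * DAY_MS},
    {'_solve_mode': 'just_answer', '_answered_at': 75 * DAY_MS},
    {'_solve_mode': 'just_answer', '_answered_at': 78 * DAY_MS},
    {'_solve_mode': 'solve_together', '_answered_at': 88 * DAY_MS},
    {'_solve_mode': 'solve_together', '_answered_at': 91 * DAY_MS},
]
print(solve_together_share(rows, 85 * DAY_MS))  # (0.25, 1.0)
print(days_since_last(rows, now_ms))            # 9
```

A finer-grained version could bucket answers per assignment or per week instead of using one split point.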
Expected agent output for the bonus monitor on this dataset:
| Rank | Student | Topic driving risk | Suggested action |
|---|---|---|---|
| 1 | 203 Chen Wei | Engagement collapse | Reach out + light re-entry assignment; check for blockers |
| 2 | 201 Aisha Khan | Fractions / Add — HW7 in 5 days | 3 targeted fraction-add questions on the add-tops-add-bottoms misconception |
| 3 | 204 Daniela Rossi | Increasing scaffold dependence (any topic) | Pair with step_by_step mode + scheduled just_answer checkpoint |
Running / re-running the generator
```shell
cd boost-ai-eval/mock-data
python3 generate.py
```
The RNG is seeded (20260501), so output is byte-stable across runs unless
you change the source. To shift dates, edit TODAY at the top of
generate.py.
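One way to check the byte-stability claim is to hash an output file before and after re-running the generator. This is a minimal sketch; `sha256_of` is a hypothetical helper and is not part of generate.py.

```python
import hashlib

def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

Record `sha256_of('dataset.json')`, re-run `python3 generate.py`, and compare the two digests; they should match if the source is unchanged.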
Quick agent-side recipes
Load everything:
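```python
import json

# Load the bundled dataset (all tables in one object).
with open('boost-ai-eval/mock-data/dataset.json', encoding='utf-8') as f:
    data = json.load(f)

students = data['students']
answers = data['student_answers']
qbank = {q['id']: q for q in data['question_bank']}  # question id -> question
```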
Pull all answers for a single student:
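```python
def answers_for(student_id):
    """Return every answer record submitted by one student."""
    # Answers link to students through their assignment_assignees rows.
    aa_ids = {aa['id'] for aa in data['assignment_assignees']
              if aa['student_id'] == student_id}
    return [a for a in data['student_answers'] if a['assignee_id'] in aa_ids]
```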
Compute topic-level mastery snapshot:
```python
from collections import defaultdict

topic_acc = defaultdict(lambda: [0, 0])  # topic -> [correct, total]
for a in answers_for(201):
    topic_acc[a['_question_topic']][1] += 1
    topic_acc[a['_question_topic']][0] += int(a['_is_correct'])
# {'Fractions': [2, 16], 'Place Value': [3, 4], ...}
```
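The [correct, total] counts can then be turned into accuracy rates and sorted to surface the weakest topics. This is a sketch: `weakest_topics` and its `min_attempts` threshold are hypothetical, and the sample counts are the two shown in the comment above.

```python
def weakest_topics(topic_acc, min_attempts=3):
    """Return (topic, accuracy) pairs sorted from weakest to strongest."""
    rates = [(topic, correct / total)
             for topic, (correct, total) in topic_acc.items()
             if total >= min_attempts]  # skip topics with too little evidence
    return sorted(rates, key=lambda pair: pair[1])

# Counts copied from the snapshot comment above.
sample_acc = {'Fractions': [2, 16], 'Place Value': [3, 4]}
print(weakest_topics(sample_acc))
# [('Fractions', 0.125), ('Place Value', 0.75)]
```

The threshold keeps a topic with only one or two attempts from dominating the ranking; tune it to the volume of answers per student.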