Mock Pupil-History Dataset — Learning Path Agent Hackathon

A hand-crafted, internally consistent JSON dataset for the BoostAI Learning Path Agent hackathon challenge. It mirrors the production SQLAlchemy schema in elevenplus-backend/src/app/models/, so an agent built on this data should transfer cleanly to the real database after the hackathon.

Reference date: 2026-05-01 (timestamps in the data are relative to this — adjust TODAY in generate.py to shift them).

What's in here

classroom.json              One Year-6 Maths class + tutor + student-classroom links
students.json               12 students (with _persona annotation per student)
question_bank.json          50 unique 11+ Maths questions
assignments.json            8 assignments (5 CLOSED, 2 PUBLISHED, 1 DRAFT)
assignment_questions.json   64 assignment-question rows (links to question_bank)
assignment_assignees.json   85 (student × assignment) rows with status/scores
student_answers.json        593 per-question answer records — THE primary file
activity_logs.json          593 activity log rows (timestamps, durations, solve_mode)
dataset.json                All of the above bundled into one object
bonus_early_warning_input.json   3 classes × 10 students for the bonus EWS challenge
bonus_early_warning_output.json  Expected morning alert output (top 3 per class)
generate.py                 Deterministic generator (re-run anytime)

The agent's main input is dataset.json (or student_answers.json plus question_bank.json if you'd rather load files separately).
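
If you prefer the separate files, a minimal sketch of a loader could look like the following. The helper name `load_tables` is hypothetical, and the sketch assumes only that each `<name>.json` file sits in one directory and parses as JSON; it makes no assumption about the shape inside each file.

```python
import json
from pathlib import Path


def load_tables(base_dir, names=("student_answers", "question_bank")):
    """Load <name>.json for each requested name into a dict keyed by name.

    Hypothetical helper: it only assumes the files live side by side in base_dir.
    """
    base = Path(base_dir)
    tables = {}
    for name in names:
        with open(base / f"{name}.json", encoding="utf-8") as fh:
            tables[name] = json.load(fh)
    return tables


# Example call (not executed here, since it needs the real files):
# tables = load_tables("boost-ai-eval/mock-data")
```

Passing a longer `names` tuple (for example adding `"assignment_assignees"`) loads more files with the same call.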

For the bonus challenge shown in the brief, use bonus_early_warning_input.json as the class-level monitor input: it holds ordered topics plus per-student topic understanding scores. bonus_early_warning_output.json is the expected result: a ranked risk list with a specific weak topic and a recommended action for each student.

Schema fidelity

Field names and enum values match the production models exactly:

Production model	This dataset
`users.py` (role=student)	`students.json`
`classrooms.py` + `classroom_student_rs.py`	`classroom.json`
`assignments.py` (status: DRAFT/PUBLISHED/CLOSED)	`assignments.json`
`assignment_questions.py`	`assignment_questions.json`
`assignment_assignees.py` (status: NOT_STARTED/IN_PROGRESS/SUBMITTED)	`assignment_assignees.json`
`question_bank.py` (difficulty: EASY/MEDIUM/HARD; source: BOOST)	`question_bank.json`
`student_assignment_answers.py` (graded_marks, marks_awarded, grading_status=GRADED)	`student_answers.json`
`assignment_activity_logs.py` (activity_type, duration_seconds, extra_data)	`activity_logs.json`
`question_metrics.py` (explanation_type)	embedded as `_solve_mode` on each answer

Timestamps are Unix milliseconds (BigInteger) per the project convention.
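
Because every timestamp is in milliseconds rather than seconds, it is easy to be off by a factor of 1000 when converting. The sketch below shows one way to convert in both directions. It assumes UTC, which this README does not state, and the helper names are illustrative.

```python
from datetime import datetime, timezone


def ms_to_datetime(ms):
    """Convert Unix milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def datetime_to_ms(dt):
    """Convert a timezone-aware datetime to Unix milliseconds."""
    return int(dt.timestamp() * 1000)


# Reference date from this README, taken as midnight UTC (an assumption).
reference = datetime(2026, 5, 1, tzinfo=timezone.utc)
reference_ms = datetime_to_ms(reference)
```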

Hackathon annotations (not in production schema)

Fields prefixed with _ are hackathon-only annotations. Strip them if you ever seed this data into the real DB.
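
One possible way to strip the annotations before seeding is a small recursive filter, sketched below. The function name is hypothetical and the sample record is made up for illustration; only the underscore-prefix rule comes from this README.

```python
def strip_annotations(value):
    """Return a copy of value with every dict key that starts with '_' removed.

    Recurses through nested dicts and lists; other values pass through unchanged.
    """
    if isinstance(value, dict):
        return {
            key: strip_annotations(item)
            for key, item in value.items()
            if not key.startswith("_")
        }
    if isinstance(value, list):
        return [strip_annotations(item) for item in value]
    return value


# Illustrative record, not taken from the dataset.
sample = [{"id": 1, "graded_marks": 1, "_is_correct": True, "_solve_mode": "just_answer"}]
clean = strip_annotations(sample)
```

The function returns new objects, so the loaded dataset keeps its annotations for analysis.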

Annotation	Where	Meaning
`_persona`	`students.json`	Engineered behaviour persona (see below)
`_solve_mode`	`student_answers.json`	One of `just_answer`, `step_by_step`, `solve_together`, `handwritten`
`_time_on_task_seconds`	`student_answers.json`	Seconds spent on this question
`_is_correct`	`student_answers.json`	Boolean correctness (already implied by `graded_marks`)
`_misconception_tag`	`student_answers.json`	Set when wrong answer matches a known misconception (e.g. `add_tops_add_bottoms`)
`_question_topic` / `_sub_topic` / `_difficulty`	`student_answers.json`	Denormalised from `question_bank` for convenience
`_answered_at`	`student_answers.json`	Same as `created_at`, just clearer name

The mock exports now also include newer review-style fields used by the current app schema, such as:

per-question: is_correct, ai_feedback, review_needs_attention, review_issue_reason, review_correctness_score, review_understanding_score, review_question_score, review_confidence, review_tags
per-assignee: overall_score, ai_feedback, next_step_outcome

These make it easier to seed or backfill historical closed homework into the newer review surfaces.

The 12 students

Five students have engineered misconception personas; the other seven are realistic noise. The agent should ideally identify the personas from the data itself — _persona is included only so you can grade the agent.

ID	Name	Persona	What you'll see in the data
201	Aisha Khan	`fraction_inversion`	~12% on Fractions Add/Subtract/Multiply, ~78% elsewhere. Wrong answers show add-tops-add-bottoms pattern (e.g. ½+⅓ → ⅖).
202	Ben Carter	`place_value_gaps`	Fails multi-digit subtraction with borrowing & decimal alignment. Strong on single-digit ops.
203	Chen Wei	`rushed_careless`	Right method when in `step_by_step`; wrong final answer in `just_answer`. Time-on-task drops week-over-week. No activity in the last 8 days — also drives Early Warning.
204	Daniela Rossi	`solve_together_dependent`	Solve-Together share rises 21% → 88% across the period. Independent accuracy degrading.
205	Elif Demir	`word_problem_weak`	0% on word problems, ~90% on bare computation of the same operations.
206	Felix Brown	`stable_strong`	~84% overall — baseline noise.
207	Grace Park	`stable_strong`	~85% overall — baseline noise.
208–210	Singh / Nakamura / Williams	`stable_mid`	~65% overall — baseline noise.
211–212	Patel / O'Connor	`stable_weak`	~50% overall — baseline noise.
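
To grade an agent against `_persona`, one option is to measure how many of the engineered personas it recovers. The sketch below is only an illustration: it assumes each student row carries an `id` key, that noise personas all start with `stable_` (true of the table above), and that the agent's guesses arrive as a hypothetical `{student_id: persona}` dict.

```python
def persona_recall(students, predicted, noise_prefix="stable_"):
    """Fraction of engineered-persona students whose persona was predicted exactly.

    students:  list of dicts with 'id' and '_persona' (the 'id' key is assumed).
    predicted: hypothetical agent output, mapping student id -> persona string.
    """
    engineered = {
        s["id"]: s["_persona"]
        for s in students
        if not s["_persona"].startswith(noise_prefix)
    }
    if not engineered:
        return 0.0
    hits = sum(1 for sid, persona in engineered.items() if predicted.get(sid) == persona)
    return hits / len(engineered)


# Illustrative subset of the table above, with one correct and one wrong guess.
sample_students = [
    {"id": 201, "_persona": "fraction_inversion"},
    {"id": 205, "_persona": "word_problem_weak"},
    {"id": 206, "_persona": "stable_strong"},
]
sample_predicted = {201: "fraction_inversion", 205: "place_value_gaps"}
recall = persona_recall(sample_students, sample_predicted)
```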

The 8 assignments

ID	Name	Topic	Due	Status	Why it matters
3001	HW1 — Place Value Warmup	Place Value	-28d	CLOSED	Baseline data
3002	HW2 — Arithmetic Practice	Arithmetic	-22d	CLOSED
3003	HW3 — Fractions Foundations	Fractions	-16d	CLOSED	First fraction signal
3004	HW4 — Negatives & BIDMAS	BIDMAS	-10d	CLOSED
3005	HW5 — Geometry Basics	Geometry	-6d	CLOSED
3006	HW6 — Algebra & Sequences	Algebra	+2d	PUBLISHED	In flight
3007	HW7 — Adding Fractions (test prep)	Fractions	+5d	PUBLISHED	Curriculum deadline anchor for the bonus EWS
3008	HW8 — Mixed Revision	Mixed	+12d	DRAFT	No activity yet

Bonus / Early Warning signals embedded in the data

Signal	Where it lives	Who exhibits it
Drop in attempt rate (last 7 days)	`student_answers.json` timestamps	Student 203 (no recent activity); Student 211 (reduced volume)
Increasing Solve-Together reliance	`_solve_mode` distribution over time	Student 204 (21% → 88%)
Declining score trend on deadline topic	Fractions accuracy across HW3 → HW7	Student 201 (very weak on Fractions; HW7 is fractions, due in 5 days)
Time since last session	`activity_logs.json` last timestamp	Student 203 (≥ 8 days)
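
Two of these signals can be sketched directly from one student's answer records. The sketch relies on the `_solve_mode` and `_answered_at` annotations described earlier. It uses answer timestamps as a stand-in for the last session (the table points at `activity_logs.json` for that), and splitting the history into halves is an arbitrary illustrative choice.

```python
from collections import Counter

MS_PER_DAY = 86_400_000


def solve_together_share(answers):
    """Share of answers whose _solve_mode is 'solve_together' (0.0 if empty)."""
    if not answers:
        return 0.0
    modes = Counter(a["_solve_mode"] for a in answers)
    return modes["solve_together"] / len(answers)


def share_trend(answers):
    """Return (earlier-half share, later-half share), ordered by _answered_at."""
    ordered = sorted(answers, key=lambda a: a["_answered_at"])
    mid = len(ordered) // 2
    return solve_together_share(ordered[:mid]), solve_together_share(ordered[mid:])


def days_since_last(answers, now_ms):
    """Whole days between the most recent _answered_at and now_ms."""
    last = max(a["_answered_at"] for a in answers)
    return (now_ms - last) // MS_PER_DAY


# Made-up records for illustration; now_ms is 2026-05-01 00:00 UTC in milliseconds.
now_ms = 1_777_593_600_000
sample_answers = [
    {"_answered_at": now_ms - 20 * MS_PER_DAY, "_solve_mode": "just_answer"},
    {"_answered_at": now_ms - 15 * MS_PER_DAY, "_solve_mode": "solve_together"},
    {"_answered_at": now_ms - 10 * MS_PER_DAY, "_solve_mode": "solve_together"},
    {"_answered_at": now_ms - 9 * MS_PER_DAY, "_solve_mode": "solve_together"},
]
early_share, late_share = share_trend(sample_answers)
idle_days = days_since_last(sample_answers, now_ms)
```

A rising share between the two halves would flag a pattern like student 204's, and a large `idle_days` one like student 203's. The thresholds are left to the agent.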

Expected agent output for the bonus monitor on this dataset:

Rank	Student	Topic driving risk	Suggested action
1	203 Chen Wei	Engagement collapse	Reach out + light re-entry assignment; check for blockers
2	201 Aisha Khan	Fractions / Add — HW7 in 5 days	3 targeted fraction-add questions on the add-tops-add-bottoms misconception
3	204 Daniela Rossi	Increasing scaffold dependence (any topic)	Pair with `step_by_step` mode + scheduled `just_answer` checkpoint

Running / re-running the generator

cd boost-ai-eval/mock-data
python3 generate.py

The RNG is seeded (20260501), so output is byte-stable across runs unless you change the source. To shift dates, edit TODAY at the top of generate.py.
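
If you want to confirm that a re-run really is byte-stable, one approach is to hash the JSON files before and after and compare the results. The helpers below are a hypothetical sketch and are not part of generate.py.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(base_dir, pattern="*.json"):
    """Map each matching file name in base_dir to its SHA-256 digest."""
    return {p.name: file_sha256(p) for p in sorted(Path(base_dir).glob(pattern))}


# Example workflow (not executed here, since it needs the real files):
# before = snapshot("boost-ai-eval/mock-data")
# ... re-run generate.py ...
# after = snapshot("boost-ai-eval/mock-data")
# changed = [name for name in after if before.get(name) != after[name]]
```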

Quick agent-side recipes

Load everything:

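import json

# Path is relative to the directory that contains boost-ai-eval/.
with open('boost-ai-eval/mock-data/dataset.json', encoding='utf-8') as f:
    data = json.load(f)

students = data['students']
answers = data['student_answers']
qbank = {q['id']: q for q in data['question_bank']}  # question id -> question row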

Pull all answers for a single student:

def answers_for(student_id):
    aa_ids = {aa['id'] for aa in data['assignment_assignees']
              if aa['student_id'] == student_id}
    return [a for a in data['student_answers'] if a['assignee_id'] in aa_ids]

Compute topic-level mastery snapshot:

from collections import defaultdict
topic_acc = defaultdict(lambda: [0, 0])  # [correct, total]
for a in answers_for(201):
    topic_acc[a['_question_topic']][1] += 1
    topic_acc[a['_question_topic']][0] += int(a['_is_correct'])
# {'Fractions': [2, 16], 'Place Value': [3, 4], ...}
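
To turn the `[correct, total]` counts into a ranked list of weak topics, a sketch like the following may help. The `min_attempts` cut-off is a hypothetical parameter for ignoring topics with too few answers, and the sample input reuses the illustrative counts from the comment above.

```python
def weakest_topics(topic_acc, min_attempts=3):
    """Return (topic, accuracy, attempts) tuples sorted from weakest to strongest.

    topic_acc maps topic -> [correct, total], as built in the recipe above.
    Topics with fewer than min_attempts answers are skipped (arbitrary threshold).
    """
    rows = [
        (topic, correct / total, total)
        for topic, (correct, total) in topic_acc.items()
        if total >= min_attempts
    ]
    return sorted(rows, key=lambda row: row[1])


# Counts copied from the illustrative comment in the recipe above.
sample_acc = {"Fractions": [2, 16], "Place Value": [3, 4]}
ranked = weakest_topics(sample_acc)
```

In practice you would pass the `topic_acc` built by the recipe instead of `sample_acc`.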
