BoostAI/Mock-Data/README.md
2026-05-25 17:05:06 +01:00

Mock Pupil-History Dataset — Learning Path Agent Hackathon

A hand-crafted, internally consistent JSON dataset for the BoostAI Learning Path Agent hackathon challenge. It mirrors the production SQLAlchemy schema in elevenplus-backend/src/app/models/ so that an agent built on this data transfers cleanly to the real database after the hackathon.

Reference date: 2026-05-01. Timestamps in the data are relative to this date; adjust TODAY in generate.py to shift them.

What's in here

classroom.json              One Year-6 Maths class + tutor + student-classroom links
students.json               12 students (with _persona annotation per student)
question_bank.json          50 unique 11+ Maths questions
assignments.json            8 assignments (5 CLOSED, 2 PUBLISHED, 1 DRAFT)
assignment_questions.json   64 assignment-question rows (links to question_bank)
assignment_assignees.json   85 (student × assignment) rows with status/scores
student_answers.json        593 per-question answer records — THE primary file
activity_logs.json          593 activity log rows (timestamps, durations, solve_mode)
dataset.json                All of the above bundled into one object
bonus_early_warning_input.json   3 classes × 10 students for the bonus EWS challenge
bonus_early_warning_output.json  Expected morning alert output (top 3 per class)
generate.py                 Deterministic generator (re-run anytime)

The agent's main input is dataset.json (or student_answers.json plus question_bank.json if you'd rather load files separately).

For the bonus challenge described in the brief, use bonus_early_warning_input.json as the class-level monitor input: it holds ordered topics plus per-student topic understanding scores. bonus_early_warning_output.json holds the expected ranked risk list, with a specific weak topic and a recommended action for each student.

Schema fidelity

Field names and enum values match the production models exactly:

| Production model | This dataset |
| --- | --- |
| users.py (role=student) | students.json |
| classrooms.py + classroom_student_rs.py | classroom.json |
| assignments.py (status: DRAFT/PUBLISHED/CLOSED) | assignments.json |
| assignment_questions.py | assignment_questions.json |
| assignment_assignees.py (status: NOT_STARTED/IN_PROGRESS/SUBMITTED) | assignment_assignees.json |
| question_bank.py (difficulty: EASY/MEDIUM/HARD; source: BOOST) | question_bank.json |
| student_assignment_answers.py (graded_marks, marks_awarded, grading_status=GRADED) | student_answers.json |
| assignment_activity_logs.py (activity_type, duration_seconds, extra_data) | activity_logs.json |
| question_metrics.py (explanation_type) | embedded as _solve_mode on each answer |

Timestamps are Unix milliseconds (BigInteger) per the project convention.
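As a minimal sketch of working with these timestamps, the helpers below convert Unix milliseconds to UTC datetimes and measure how far a timestamp lies from the reference date. The helper names are illustrative, and treating the reference date as midnight UTC is an assumption; check generate.py for the exact value of TODAY.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Reference date from this README, taken as midnight UTC (the time zone is an assumption).
REFERENCE_DATE = datetime(2026, 5, 1, tzinfo=timezone.utc)

def ms_to_datetime(ts_ms: int) -> datetime:
    """Convert Unix milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ts_ms)

def datetime_to_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix milliseconds."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def days_before_reference(ts_ms: int) -> float:
    """Days between a timestamp and the reference date (positive means in the past)."""
    return (REFERENCE_DATE - ms_to_datetime(ts_ms)) / timedelta(days=1)

eight_days_ago = datetime_to_ms(REFERENCE_DATE - timedelta(days=8))
print(ms_to_datetime(eight_days_ago).date())  # 2026-04-23
print(days_before_reference(eight_days_ago))  # 8.0
```

Integer timedelta arithmetic is used instead of dividing by 1000 so that no floating-point rounding creeps into the conversion.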

Hackathon annotations (not in production schema)

Fields prefixed with _ are hackathon-only annotations. Strip them if you ever seed this data into the real DB.

| Annotation | Where | Meaning |
| --- | --- | --- |
| _persona | students.json | Engineered behaviour persona (see below) |
| _solve_mode | student_answers.json | One of just_answer, step_by_step, solve_together, handwritten |
| _time_on_task_seconds | student_answers.json | Seconds spent on this question |
| _is_correct | student_answers.json | Boolean correctness (already implied by graded_marks) |
| _misconception_tag | student_answers.json | Set when wrong answer matches a known misconception (e.g. add_tops_add_bottoms) |
| _question_topic / _sub_topic / _difficulty | student_answers.json | Denormalised from question_bank for convenience |
| _answered_at | student_answers.json | Same as created_at, just clearer name |
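Stripping the annotations before seeding could look like the sketch below. The function name strip_annotations is hypothetical, and the sample row is made up for illustration; it only handles top-level keys, which is an assumption about how the annotated rows are shaped.

```python
def strip_annotations(record: dict) -> dict:
    """Return a copy of one row without keys that start with an underscore."""
    return {key: value for key, value in record.items() if not key.startswith("_")}

# Synthetic row for illustration only; real rows carry more fields.
row = {"id": 1, "graded_marks": 1, "_is_correct": True, "_solve_mode": "just_answer"}
print(strip_annotations(row))  # {'id': 1, 'graded_marks': 1}
```

Applying it to a whole file is then a list comprehension over the rows, for example `[strip_annotations(r) for r in rows]`.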

The mock exports now also include newer review-style fields used by the current app schema, such as:

  • per-question: is_correct, ai_feedback, review_needs_attention, review_issue_reason, review_correctness_score, review_understanding_score, review_question_score, review_confidence, review_tags
  • per-assignee: overall_score, ai_feedback, next_step_outcome

These make it easier to seed or backfill historical closed homework into the newer review surfaces.

The 12 students

Five students have engineered misconception personas; the other seven are realistic noise. The agent should ideally identify the personas from the data itself — _persona is included only so you can grade the agent.

| ID | Name | Persona | What you'll see in the data |
| --- | --- | --- | --- |
| 201 | Aisha Khan | fraction_inversion | ~12% on Fractions Add/Subtract/Multiply, ~78% elsewhere. Wrong answers show add-tops-add-bottoms pattern (e.g. ½+⅓ → ⅖). |
| 202 | Ben Carter | place_value_gaps | Fails multi-digit subtraction with borrowing & decimal alignment. Strong on single-digit ops. |
| 203 | Chen Wei | rushed_careless | Right method when in step_by_step; wrong final answer in just_answer. Time-on-task drops week-over-week. No activity in the last 8 days, which also drives Early Warning. |
| 204 | Daniela Rossi | solve_together_dependent | Solve-Together share rises 21% → 88% across the period. Independent accuracy degrading. |
| 205 | Elif Demir | word_problem_weak | 0% on word problems, ~90% on bare computation of the same operations. |
| 206 | Felix Brown | stable_strong | ~84% overall; baseline noise. |
| 207 | Grace Park | stable_strong | ~85% overall; baseline noise. |
| 208–210 | Singh / Nakamura / Williams | stable_mid | ~65% overall; baseline noise. |
| 211–212 | Patel / O'Connor | stable_weak | ~50% overall; baseline noise. |
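Grading an agent against the _persona labels can be sketched as a simple exact-match score. The function name persona_accuracy, the shape of the predicted mapping, and the id key on student rows are assumptions for illustration; the two sample rows reuse personas from the table above.

```python
def persona_accuracy(students: list, predicted: dict) -> float:
    """Fraction of students whose predicted persona equals the _persona label."""
    if not students:
        return 0.0
    hits = sum(1 for s in students if predicted.get(s["id"]) == s["_persona"])
    return hits / len(students)

# Two rows shaped like students.json (only the keys used here).
sample_students = [
    {"id": 201, "_persona": "fraction_inversion"},
    {"id": 206, "_persona": "stable_strong"},
]
# Hypothetical agent output: student id -> predicted persona.
agent_guess = {201: "fraction_inversion", 206: "stable_mid"}
print(persona_accuracy(sample_students, agent_guess))  # 0.5
```

Exact match is deliberately strict; a more forgiving grader might only score the five engineered personas and ignore the noise students.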

The 8 assignments

| ID | Name | Topic | Due | Status | Why it matters |
| --- | --- | --- | --- | --- | --- |
| 3001 | HW1 — Place Value Warmup | Place Value | -28d | CLOSED | Baseline data |
| 3002 | HW2 — Arithmetic Practice | Arithmetic | -22d | CLOSED | |
| 3003 | HW3 — Fractions Foundations | Fractions | -16d | CLOSED | First fraction signal |
| 3004 | HW4 — Negatives & BIDMAS | BIDMAS | -10d | CLOSED | |
| 3005 | HW5 — Geometry Basics | Geometry | -6d | CLOSED | |
| 3006 | HW6 — Algebra & Sequences | Algebra | +2d | PUBLISHED | In flight |
| 3007 | HW7 — Adding Fractions (test prep) | Fractions | +5d | PUBLISHED | Curriculum deadline anchor for the bonus EWS |
| 3008 | HW8 — Mixed Revision | Mixed | +12d | DRAFT | No activity yet |
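The Due offsets in the table are days relative to the reference date. As a small worked example, the sketch below turns an offset such as -28d or +5d into a calendar date. The function name due_date is hypothetical, and the offset strings are this README's notation; the JSON files presumably store absolute timestamps instead.

```python
from datetime import date, timedelta

# Reference date stated at the top of this README.
REFERENCE_DATE = date(2026, 5, 1)

def due_date(offset: str) -> date:
    """Convert a README offset like '-28d' or '+5d' into a calendar date."""
    return REFERENCE_DATE + timedelta(days=int(offset.rstrip("d")))

print(due_date("-28d"))  # 2026-04-03 (HW1)
print(due_date("+5d"))   # 2026-05-06 (HW7)
```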

Bonus / Early Warning signals embedded in the data

| Signal | Where it lives | Who exhibits it |
| --- | --- | --- |
| Drop in attempt rate (last 7 days) | student_answers.json timestamps | Student 203 (no recent activity); 211 reduced volume |
| Increasing Solve-Together reliance | _solve_mode distribution over time | Student 204 (21% → 88%) |
| Declining score trend on deadline topic | Fractions accuracy across HW3 → HW7 | Student 201 (very weak on Fractions; HW7 is fractions, due in 5 days) |
| Time since last session | activity_logs.json last timestamp | Student 203 (≥ 8 days) |
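Two of the signals above can be sketched directly from answer rows: the Solve-Together share per week and the time since the last answer. This is one possible reading, not the reference monitor. The function names are hypothetical, weeks are counted backwards from a reference timestamp in Unix milliseconds, and _answered_at is used as a stand-in for the activity_logs.json timestamp. The sample rows are synthetic.

```python
from collections import defaultdict

MS_PER_DAY = 86_400_000

def solve_together_share_by_week(answers, reference_ms):
    """Map weeks before the reference (0 = most recent 7 days) to the solve_together share."""
    counts = defaultdict(lambda: [0, 0])  # week -> [solve_together, total]
    for a in answers:
        week = (reference_ms - a["_answered_at"]) // (7 * MS_PER_DAY)
        counts[week][1] += 1
        counts[week][0] += int(a["_solve_mode"] == "solve_together")
    return {week: st / total for week, (st, total) in sorted(counts.items())}

def days_since_last_answer(answers, reference_ms):
    """Whole days between the newest answer and the reference; None if there are no answers."""
    if not answers:
        return None
    return (reference_ms - max(a["_answered_at"] for a in answers)) // MS_PER_DAY

# Synthetic rows: answers 20, 19, 9 and 8 days before an arbitrary reference time.
ref = 100 * MS_PER_DAY
rows = [
    {"_answered_at": ref - 20 * MS_PER_DAY, "_solve_mode": "just_answer"},
    {"_answered_at": ref - 19 * MS_PER_DAY, "_solve_mode": "solve_together"},
    {"_answered_at": ref - 9 * MS_PER_DAY, "_solve_mode": "solve_together"},
    {"_answered_at": ref - 8 * MS_PER_DAY, "_solve_mode": "solve_together"},
]
print(solve_together_share_by_week(rows, ref))  # {1: 1.0, 2: 0.5}
print(days_since_last_answer(rows, ref))        # 8
```

A share that grows as the week index falls (older to newer) indicates rising scaffold dependence; a large value from the second function indicates disengagement.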

Expected agent output for the bonus monitor on this dataset:

| Rank | Student | Topic driving risk | Suggested action |
| --- | --- | --- | --- |
| 1 | 203 Chen Wei | Engagement collapse | Reach out + light re-entry assignment; check for blockers |
| 2 | 201 Aisha Khan | Fractions / Add (HW7 due in 5 days) | 3 targeted fraction-add questions on the add-tops-add-bottoms misconception |
| 3 | 204 Daniela Rossi | Increasing scaffold dependence (any topic) | Pair with step_by_step mode + scheduled just_answer checkpoint |

Running / re-running the generator

cd boost-ai-eval/mock-data
python3 generate.py

The RNG is seeded (20260501), so output is byte-stable across runs unless you change the source. To shift dates, edit TODAY at the top of generate.py.
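One way to check the byte-stability claim is to hash every JSON file before and after re-running the generator and compare the results. The sketch below uses only the standard library; the function name digest_json_files is hypothetical.

```python
import hashlib
from pathlib import Path

def digest_json_files(directory) -> dict:
    """Return {filename: SHA-256 hex digest} for every .json file in a directory."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(directory).glob("*.json"))
    }

# Usage sketch: snapshot, re-run generate.py, snapshot again, compare.
#   before = digest_json_files(".")
#   ... run `python3 generate.py` ...
#   after = digest_json_files(".")
#   assert before == after, "generator output changed"
```

Hashing raw bytes rather than parsed JSON is intentional: it also catches changes in key order or whitespace, which a parsed comparison would miss.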

Quick agent-side recipes

Load everything:

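import json

# Load the bundled dataset (adjust the path to your working directory).
with open('boost-ai-eval/mock-data/dataset.json', encoding='utf-8') as f:
    data = json.load(f)
students = data['students']
answers = data['student_answers']
# Index the question bank by id for direct lookups.
qbank = {q['id']: q for q in data['question_bank']}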

Pull all answers for a single student:

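def answers_for(student_id):
    """Return every answer record belonging to one student."""
    # Answers link to students through their assignment_assignees rows.
    aa_ids = {aa['id'] for aa in data['assignment_assignees']
              if aa['student_id'] == student_id}
    return [a for a in data['student_answers'] if a['assignee_id'] in aa_ids]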

Compute topic-level mastery snapshot:

from collections import defaultdict
topic_acc = defaultdict(lambda: [0, 0])  # [correct, total]
for a in answers_for(201):
    topic_acc[a['_question_topic']][1] += 1
    topic_acc[a['_question_topic']][0] += int(a['_is_correct'])
# {'Fractions': [2, 16], 'Place Value': [3, 4], ...}
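The [correct, total] counts can then be turned into accuracy fractions and a weakest topic. The sketch below is illustrative: the function name accuracy_by_topic is hypothetical, and the sample counts are the two entries shown in the comment above.

```python
def accuracy_by_topic(topic_acc: dict) -> dict:
    """Convert {topic: [correct, total]} counts into {topic: fraction correct}."""
    return {topic: correct / total
            for topic, (correct, total) in topic_acc.items() if total}

# The two example entries from the snapshot comment.
sample_counts = {"Fractions": [2, 16], "Place Value": [3, 4]}
accuracy = accuracy_by_topic(sample_counts)
print(accuracy)                             # {'Fractions': 0.125, 'Place Value': 0.75}
print(min(accuracy, key=accuracy.get))      # Fractions
```

Topics with zero attempts are skipped rather than reported as 0%, so an untested topic is not mistaken for a weak one.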