BoostAI/Mock-Data/README.md
2026-05-25 17:05:06 +01:00

# Mock Pupil-History Dataset — Learning Path Agent Hackathon
Hand-crafted, internally-consistent JSON dataset for the BoostAI **Learning
Path Agent** hackathon challenge. Mirrors the production SQLAlchemy schema
in `elevenplus-backend/src/app/models/` so an agent built on this data
transfers cleanly to the real database after the hackathon.
**Reference date:** `2026-05-01` (timestamps in the data are relative to
this — adjust `TODAY` in `generate.py` to shift them).
## What's in here
```
classroom.json                    One Year-6 Maths class + tutor + student-classroom links
students.json                     12 students (with _persona annotation per student)
question_bank.json                50 unique 11+ Maths questions
assignments.json                  8 assignments (5 CLOSED, 2 PUBLISHED, 1 DRAFT)
assignment_questions.json         64 assignment-question rows (links to question_bank)
assignment_assignees.json         85 (student × assignment) rows with status/scores
student_answers.json              593 per-question answer records — THE primary file
activity_logs.json                593 activity log rows (timestamps, durations, solve_mode)
dataset.json                      All of the above bundled into one object
bonus_early_warning_input.json    3 classes × 10 students for the bonus EWS challenge
bonus_early_warning_output.json   Expected morning alert output (top 3 per class)
generate.py                       Deterministic generator (re-run anytime)
```
The agent's main input is `dataset.json` (or `student_answers.json` plus
`question_bank.json` if you'd rather load files separately).
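
If you prefer the separate-file route, a minimal loading sketch could look like the following. It assumes each file holds a JSON array at the top level and that question records carry an `id` field; the helper names `load_json` and `load_primary_inputs` are hypothetical, not part of the repo.

```python
import json
from pathlib import Path


def load_json(path):
    """Read one JSON file as UTF-8 and return the parsed object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_primary_inputs(data_dir="."):
    """Load the two main files and index the question bank by id.

    Assumption: both files contain a top-level JSON array.
    """
    data_dir = Path(data_dir)
    answers = load_json(data_dir / "student_answers.json")
    questions = load_json(data_dir / "question_bank.json")
    qbank = {q["id"]: q for q in questions}  # question id -> question record
    return answers, qbank
```

Call `load_primary_inputs("path/to/mock-data")` with whatever directory holds your copy of the files.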
For the bonus challenge shown in the brief, use
`bonus_early_warning_input.json` as a class-level monitor input with ordered
topics plus per-student topic understanding scores, and
`bonus_early_warning_output.json` as the expected ranked risk list with a
specific weak topic and recommended action for each student.
## Schema fidelity
Field names and enum values match the production models exactly:
| Production model | This dataset |
|---|---|
| `users.py` (role=student) | `students.json` |
| `classrooms.py` + `classroom_student_rs.py` | `classroom.json` |
| `assignments.py` (status: DRAFT/PUBLISHED/CLOSED) | `assignments.json` |
| `assignment_questions.py` | `assignment_questions.json` |
| `assignment_assignees.py` (status: NOT_STARTED/IN_PROGRESS/SUBMITTED) | `assignment_assignees.json` |
| `question_bank.py` (difficulty: EASY/MEDIUM/HARD; source: BOOST) | `question_bank.json` |
| `student_assignment_answers.py` (graded_marks, marks_awarded, grading_status=GRADED) | `student_answers.json` |
| `assignment_activity_logs.py` (activity_type, duration_seconds, extra_data) | `activity_logs.json` |
| `question_metrics.py` (explanation_type) | embedded as `_solve_mode` on each answer |
Timestamps are **Unix milliseconds** (BigInteger) per the project convention.
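
A small sketch for converting between these millisecond values and `datetime` objects is shown below. It assumes the reference date is interpreted as midnight UTC, which the README does not state; the helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reference date is taken as midnight UTC.
REFERENCE_DATE = datetime(2026, 5, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms: int) -> datetime:
    """Convert Unix milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def datetime_to_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix milliseconds."""
    return int(dt.timestamp() * 1000)


def days_before_reference(ms: int) -> float:
    """Days between a timestamp and the reference date (positive = in the past)."""
    return (REFERENCE_DATE - ms_to_datetime(ms)) / timedelta(days=1)
```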
### Hackathon annotations (not in production schema)
Fields prefixed with **`_`** are hackathon-only annotations. Strip them if
you ever seed this data into the real DB.
| Annotation | Where | Meaning |
|---|---|---|
| `_persona` | `students.json` | Engineered behaviour persona (see below) |
| `_solve_mode` | `student_answers.json` | One of `just_answer`, `step_by_step`, `solve_together`, `handwritten` |
| `_time_on_task_seconds` | `student_answers.json` | Seconds spent on this question |
| `_is_correct` | `student_answers.json` | Boolean correctness (already implied by `graded_marks`) |
| `_misconception_tag` | `student_answers.json` | Set when wrong answer matches a known misconception (e.g. `add_tops_add_bottoms`) |
| `_question_topic` / `_sub_topic` / `_difficulty` | `student_answers.json` | Denormalised from `question_bank` for convenience |
| `_answered_at` | `student_answers.json` | Same as `created_at`, just clearer name |
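
Stripping the `_`-prefixed annotations before seeding could be done with a recursive filter along these lines. This is a sketch: `strip_annotations` is a hypothetical helper, and it assumes every key starting with `_` is safe to drop.

```python
def strip_annotations(obj):
    """Return a copy of obj with every dict key starting with '_' removed.

    Recurses through nested dicts and lists; other values pass through unchanged.
    """
    if isinstance(obj, dict):
        return {
            key: strip_annotations(value)
            for key, value in obj.items()
            if not key.startswith("_")
        }
    if isinstance(obj, list):
        return [strip_annotations(item) for item in obj]
    return obj
```

Because it returns a new structure, the loaded dataset keeps its annotations for grading while the stripped copy goes to the database.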
The mock exports now also include newer review-style fields used by the current app schema, such as:
- per-question: `is_correct`, `ai_feedback`, `review_needs_attention`, `review_issue_reason`, `review_correctness_score`, `review_understanding_score`, `review_question_score`, `review_confidence`, `review_tags`
- per-assignee: `overall_score`, `ai_feedback`, `next_step_outcome`
These make it easier to seed or backfill historical closed homework into the newer review surfaces.
## The 12 students
Five students have engineered misconception personas; the other seven are
realistic noise. The agent should ideally identify the personas from the
data itself — `_persona` is included only so you can grade the agent.
| ID | Name | Persona | What you'll see in the data |
|---|---|---|---|
| 201 | Aisha Khan | `fraction_inversion` | ~12% on Fractions Add/Subtract/Multiply, ~78% elsewhere. Wrong answers show **add-tops-add-bottoms** pattern (e.g. ½+⅓ → ⅖). |
| 202 | Ben Carter | `place_value_gaps` | Fails multi-digit subtraction with borrowing & decimal alignment. Strong on single-digit ops. |
| 203 | Chen Wei | `rushed_careless` | Right method when in `step_by_step`; wrong final answer in `just_answer`. Time-on-task drops week-over-week. **No activity in the last 8 days** — also drives Early Warning. |
| 204 | Daniela Rossi | `solve_together_dependent` | Solve-Together share rises **21% → 88%** across the period. Independent accuracy degrading. |
| 205 | Elif Demir | `word_problem_weak` | 0% on word problems, ~90% on bare computation of the same operations. |
| 206 | Felix Brown | `stable_strong` | ~84% overall — baseline noise. |
| 207 | Grace Park | `stable_strong` | ~85% overall — baseline noise. |
| 208-210 | Singh / Nakamura / Williams | `stable_mid` | ~65% overall — baseline noise. |
| 211-212 | Patel / O'Connor | `stable_weak` | ~50% overall — baseline noise. |
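
One way an agent might surface a persona such as `rushed_careless` without reading `_persona` is to compare accuracy across solve modes. The sketch below relies only on the `_solve_mode` and `_is_correct` annotations; the sample records are made up to show the shape and are not taken from the dataset.

```python
from collections import defaultdict


def accuracy_by_solve_mode(answers):
    """Return {solve_mode: fraction correct} for a list of answer records."""
    totals = defaultdict(lambda: [0, 0])  # mode -> [correct, total]
    for a in answers:
        bucket = totals[a["_solve_mode"]]
        bucket[0] += int(a["_is_correct"])
        bucket[1] += 1
    return {mode: correct / total for mode, (correct, total) in totals.items()}


# Illustrative records only (not real rows from student_answers.json).
sample = [
    {"_solve_mode": "step_by_step", "_is_correct": True},
    {"_solve_mode": "step_by_step", "_is_correct": True},
    {"_solve_mode": "just_answer", "_is_correct": False},
    {"_solve_mode": "just_answer", "_is_correct": True},
]
print(accuracy_by_solve_mode(sample))  # {'step_by_step': 1.0, 'just_answer': 0.5}
```

A large gap between `step_by_step` and `just_answer` accuracy points at method-is-fine, execution-is-rushed behaviour rather than a topic gap.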
## The 8 assignments
| ID | Name | Topic | Due | Status | Why it matters |
|---|---|---|---|---|---|
| 3001 | HW1 — Place Value Warmup | Place Value | -28d | CLOSED | Baseline data |
| 3002 | HW2 — Arithmetic Practice | Arithmetic | -22d | CLOSED | |
| 3003 | HW3 — Fractions Foundations | Fractions | -16d | CLOSED | First fraction signal |
| 3004 | HW4 — Negatives & BIDMAS | BIDMAS | -10d | CLOSED | |
| 3005 | HW5 — Geometry Basics | Geometry | -6d | CLOSED | |
| 3006 | HW6 — Algebra & Sequences | Algebra | +2d | PUBLISHED | In flight |
| **3007** | **HW7 — Adding Fractions (test prep)** | **Fractions** | **+5d** | **PUBLISHED** | **Curriculum deadline anchor for the bonus EWS** |
| 3008 | HW8 — Mixed Revision | Mixed | +12d | DRAFT | No activity yet |
## Bonus / Early Warning signals embedded in the data
| Signal | Where it lives | Who exhibits it |
|---|---|---|
| Drop in attempt rate (last 7 days) | `student_answers.json` timestamps | Student 203 (no recent activity); 211 reduced volume |
| Increasing Solve-Together reliance | `_solve_mode` distribution over time | Student 204 (21% → 88%) |
| Declining score trend on deadline topic | Fractions accuracy across HW3 → HW7 | Student 201 (very weak on Fractions; HW7 is fractions, due in 5 days) |
| Time since last session | `activity_logs.json` last timestamp | Student 203 (≥ 8 days) |
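
Two of these signals can be approximated from one student's answer records alone, as sketched below. The code uses `_answered_at` (Unix milliseconds) as a stand-in for the last activity-log timestamp and splits the records into an earlier and a later half to show the Solve-Together trend; the function names and the half-split heuristic are assumptions, not the reference implementation.

```python
MS_PER_DAY = 86_400_000


def days_since_last_answer(answers, now_ms):
    """Days between now_ms and the most recent answer, or None if there are none."""
    if not answers:
        return None
    last = max(a["_answered_at"] for a in answers)
    return (now_ms - last) / MS_PER_DAY


def solve_together_share(answers):
    """Fraction of answers given in solve_together mode (0.0 for an empty list)."""
    if not answers:
        return 0.0
    hits = sum(1 for a in answers if a["_solve_mode"] == "solve_together")
    return hits / len(answers)


def solve_together_trend(answers):
    """Return (share in earlier half, share in later half), ordered by time."""
    ordered = sorted(answers, key=lambda a: a["_answered_at"])
    mid = len(ordered) // 2
    return solve_together_share(ordered[:mid]), solve_together_share(ordered[mid:])
```

Pass in the records for a single student; a rising second value in `solve_together_trend` corresponds to the scaffold-dependence signal, and a large `days_since_last_answer` to the inactivity signal.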
**Expected agent output for the bonus monitor on this dataset:**
| Rank | Student | Topic driving risk | Suggested action |
|---|---|---|---|
| 1 | 203 Chen Wei | Engagement collapse | Reach out + light re-entry assignment; check for blockers |
| 2 | 201 Aisha Khan | Fractions / Add — HW7 in 5 days | 3 targeted fraction-add questions on the add-tops-add-bottoms misconception |
| 3 | 204 Daniela Rossi | Increasing scaffold dependence (any topic) | Pair with `step_by_step` mode + scheduled `just_answer` checkpoint |
## Running / re-running the generator
```bash
cd boost-ai-eval/mock-data
python3 generate.py
```
The RNG is seeded (`20260501`), so output is byte-stable across runs unless
you change the source. To shift dates, edit `TODAY` at the top of
`generate.py`.
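
To confirm that a re-run really left the files unchanged, you could compare content hashes before and after. This is a sketch using only the standard library; `digest_json_files` is a hypothetical helper and is not part of `generate.py`.

```python
import hashlib
from pathlib import Path


def digest_json_files(directory="."):
    """Map each *.json file name in directory to the SHA-256 hex digest of its bytes."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(directory).glob("*.json"))
    }
```

Take one snapshot with `digest_json_files()` before running the generator and one after; equal dictionaries mean the output was byte-identical.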
## Quick agent-side recipes
**Load everything:**
```python
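import json

# Use a context manager so the file handle is closed after parsing.
with open('boost-ai-eval/mock-data/dataset.json', encoding='utf-8') as f:
    data = json.load(f)

students = data['students']
answers = data['student_answers']
qbank = {q['id']: q for q in data['question_bank']}  # question id -> question record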
```
**Pull all answers for a single student:**
```python
def answers_for(student_id):
    # Collect this student's assignee-row ids, then keep only matching answers.
    aa_ids = {aa['id'] for aa in data['assignment_assignees']
              if aa['student_id'] == student_id}
    return [a for a in data['student_answers'] if a['assignee_id'] in aa_ids]
```
**Compute topic-level mastery snapshot:**
```python
from collections import defaultdict

topic_acc = defaultdict(lambda: [0, 0])  # [correct, total]
for a in answers_for(201):
    topic_acc[a['_question_topic']][1] += 1
    topic_acc[a['_question_topic']][0] += int(a['_is_correct'])
# {'Fractions': [2, 16], 'Place Value': [3, 4], ...}
```
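
A possible follow-up step is to turn the `[correct, total]` counts into accuracy rates and flag weak topics. In the sketch below, the `threshold` and `min_attempts` defaults are arbitrary illustrative values, and the sample dictionary only mirrors the shape of a topic-level snapshot.

```python
def weak_topics(topic_acc, threshold=0.5, min_attempts=3):
    """Return (topic, accuracy) pairs below threshold, weakest first.

    Topics with fewer than min_attempts answers are skipped as too noisy.
    """
    rates = {
        topic: correct / total
        for topic, (correct, total) in topic_acc.items()
        if total >= min_attempts
    }
    flagged = [(topic, rate) for topic, rate in rates.items() if rate < threshold]
    return sorted(flagged, key=lambda pair: pair[1])


# Illustrative snapshot in the [correct, total] shape.
sample = {'Fractions': [2, 16], 'Place Value': [3, 4]}
print(weak_topics(sample))  # [('Fractions', 0.125)]
```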