Platform technology
A domain-specific AI engine for educational misconception diagnosis.
01 · Research problem
The problem we are solving
Student misconceptions in mathematics and language are well-documented in educational research. A student who consistently makes the same error is not lacking practice — they have a specific, identifiable cognitive error requiring targeted remediation.
Standard adaptive learning systems address this through content sequencing: give students more practice on the topic they got wrong. Edvora's engine addresses this at a deeper level — it identifies which specific reasoning error caused the wrong answer, then generates an explanation that starts from that exact error.
02 · Pipeline today
The diagnostic pipeline (current)
Today, the misconception classification engine runs as a structured prompting pipeline across three hosted models: Llama 3.3 70B via Groq (primary), Google Gemini 2.0 Flash (secondary), and Anthropic Claude Haiku 4.5 (tertiary). The 47-type misconception taxonomy is supplied as in-context guidance, each wrong answer is classified against it, and a tailored explanation is generated from the matched misconception.
The taxonomy itself was developed by the Edvora team from analysis of 140 official Australian examination past papers and validated against established educational research on student misconceptions. The taxonomy — not a fine-tuned model — is the core piece of intellectual property today.
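To make the pipeline concrete, the sketch below shows a minimal in-context classification call against the primary tier, using Groq's OpenAI-compatible Python client. The taxonomy excerpt, IDs, and prompt wording are illustrative assumptions rather than Edvora's production prompt, and the Gemini and Claude fallback tiers are omitted.

```python
# Minimal sketch of the in-context classification step (primary tier only).
# Taxonomy entries and prompt wording are illustrative, not the real prompt.
from groq import Groq  # OpenAI-compatible client; reads GROQ_API_KEY

# Two hypothetical taxonomy entries standing in for the 47-type taxonomy.
TAXONOMY_EXCERPT = """\
M-07: Adding numerators and denominators independently when adding fractions.
M-21: Reversing inequality direction when dividing by a negative."""

def classify_wrong_answer(question: str, wrong_answer: str) -> str:
    """Ask the primary model which misconception best explains the error."""
    client = Groq()
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        temperature=0.0,  # deterministic classification, no creative drift
        messages=[
            {"role": "system",
             "content": ("You classify student errors against a misconception "
                         "taxonomy. Reply with a single taxonomy ID.\n"
                         + TAXONOMY_EXCERPT)},
            {"role": "user",
             "content": f"Question: {question}\nStudent answer: {wrong_answer}"},
        ],
    )
    return response.choices[0].message.content.strip()
```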
Current technical stack
- Primary inference
- Groq (llama-3.3-70b-versatile)
- Secondary fallback
- Google Gemini 2.0 Flash
- Tertiary fallback
- Anthropic Claude Haiku 4.5
- Classification basis
- 47-type taxonomy in-context
- Question corpus
- 635 validated templates
- Source documents
- 140 official AU exam papers
- Quality gate
- 3,057 automated content tests
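As an illustration of the quality gate, a single automated content test might look like the sketch below. The template fields and assertions are hypothetical; the real suite runs 3,057 such checks against the validated template corpus.

```python
# Hypothetical shape of one automated content test (pytest style).
# Field names and checks are illustrative assumptions about the suite.
import pytest

TEMPLATES = [  # stand-in for the validated question-template corpus
    {"id": "frac-add-01", "answer": "5/6",
     "distractors": ["2/5", "1/5", "2/6"],
     "misconception_ids": ["M-07"]},
]

@pytest.mark.parametrize("template", TEMPLATES, ids=lambda t: t["id"])
def test_distractors_are_diagnosable(template):
    # Every wrong option must differ from the answer and trace back to at
    # least one taxonomy entry, so the diagnostic step can never dead-end.
    assert template["answer"] not in template["distractors"]
    assert template["misconception_ids"]
```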
02b · Roadmap
Domain-specific fine-tune (in development)
Our roadmap target is a fine-tuned Qwen3-8B classifier built specifically for the Australian K-12 misconception taxonomy. The gold labels collected through the research program will serve as the training corpus.
The approach follows two precedents: a silver-medal solution from the Kaggle Eedi competition, which fine-tuned Qwen2.5-32B with LoRA (r=16, alpha=32) to classify student wrong answers against a structured misconception taxonomy, and the April 2026 EduQwen paper (arxiv.org/html/2604.06385), in which a fine-tuned Qwen3-32B achieved 96.52% accuracy on the Pedagogy Benchmark.
We will publish the methodology and results when the fine-tune meets our internal accuracy threshold (target Q3 2026). Until then, the production system uses the prompting pipeline above.
Planned fine-tune specifications
- Target base model
- Qwen3-8B (Apache 2.0 licence)
- Method
- QLoRA 4-bit quantisation
- LoRA rank
- r=16, alpha=32
- Training framework
- Unsloth + HuggingFace TRL SFTTrainer
- Target inference
- 2× speed, 60% less VRAM vs base
- Required gold labels
- 500 (research program output)
- Target release
- Q3 2026
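The specifications above map onto roughly the following Unsloth + TRL configuration. This is a sketch under the stated specs, not the production training script: the gold-label file path is a placeholder, and some trainer argument names (for example tokenizer vs. processing_class) vary between TRL releases.

```python
# Configuration sketch for the planned Qwen3-8B QLoRA fine-tune.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Placeholder path; the real corpus is the ~500 gold labels from the
# research program, formatted as one {"text": ...} record per example.
gold_label_dataset = load_dataset(
    "json", data_files="gold_labels.jsonl")["train"]

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B",
    max_seq_length=2048,
    load_in_4bit=True,           # QLoRA: 4-bit quantised base weights
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                        # LoRA rank per the spec above
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,         # `processing_class` in newer TRL releases
    train_dataset=gold_label_dataset,
    args=SFTConfig(
        output_dir="qwen3-8b-misconception-lora",
        dataset_text_field="text",
        num_train_epochs=3,
        per_device_train_batch_size=4,
    ),
)
trainer.train()
```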
03 · Taxonomy
The 47-type misconception taxonomy
The taxonomy classifies student reasoning errors across three subject areas. Examples from each:
Mathematics
Adding numerators and denominators independently when adding fractions (e.g. concluding 1/2 + 1/3 = 2/5) · applying 2D area formulas to 3D problems · reversing inequality direction when dividing by negatives · confusing place value in decimal multiplication.
English
Selecting explicitly stated details rather than implied inferences · confusing author purpose with character motivation · over-generalising from single examples.
General Ability (verbal, spatial, logical)
Applying 2D rotation logic to 3D spatial problems · confusing reflection with rotation · committing the affirming-the-consequent fallacy in logical deduction.
Each entry was derived from the 140-paper examination corpus described below and validated against established educational research on student misconceptions.
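In code, a single taxonomy entry could be represented as below. The field names and example values are illustrative assumptions, not Edvora's actual schema.

```python
# Illustrative shape of one taxonomy entry; not the production schema.
from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class Misconception:
    id: str                       # stable taxonomy identifier
    subject: Literal["maths", "english", "general_ability"]
    description: str              # the reasoning error itself
    remediation_hint: str         # seed for the tailored explanation

EXAMPLE = Misconception(
    id="M-07",
    subject="maths",
    description=("Adds numerators and denominators independently when "
                 "adding fractions (e.g. 1/2 + 1/3 = 2/5)."),
    remediation_hint="Rewrite both fractions over a common denominator first.",
)
```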
04 · Infrastructure
Production inference architecture
The platform uses a 3-tier AI fallback chain:
- Tier 1 · Groq (llama-3.3-70b-versatile) · sub-100ms inference for real-time diagnostic response
- Tier 2 · Google Gemini 2.0 Flash · automatic failover if Groq is unavailable
- Tier 3 · Anthropic Claude Haiku 4.5 · final fallback ensuring 99.9%+ availability
497 of 498 question templates have pre-generated cached AI explanations, delivering near-zero runtime AI cost for the majority of student interactions. Dynamic generation is reserved for novel error patterns not matched by the cache.
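The control flow for this cache-first chain can be sketched as follows. The provider callables are hypothetical stand-ins for the real Groq, Gemini, and Claude clients; only the failover logic is the point here.

```python
# Sketch of the cache-first, 3-tier failover described above.
from typing import Callable

def diagnose(question_id: str, wrong_answer: str,
             cache: dict[str, str],
             providers: list[Callable[[str, str], str]]) -> str:
    # 1. Cache hit: serve the pre-generated explanation at near-zero AI cost.
    cached = cache.get(f"{question_id}:{wrong_answer}")
    if cached is not None:
        return cached

    # 2. Cache miss: walk the fallback chain in tier order.
    last_error: Exception | None = None
    for call_provider in providers:
        try:
            return call_provider(question_id, wrong_answer)
        except Exception as err:   # timeout, rate limit, provider outage
            last_error = err       # fail over to the next tier
    raise RuntimeError("all inference tiers unavailable") from last_error
```

A caller would pass the providers in tier order, e.g. providers=[call_groq, call_gemini, call_claude], so the chain degrades exactly as listed above.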
05 · Corpus
Training and validation corpus
The corpus comprises 140 official Australian examination documents, including NSW Department of Education Selective High School and Opportunity Class practice tests, NAPLAN numeracy and literacy papers (Years 3, 5, 7, 9), ACER scholarship sample examinations, WA ASET official samples, VIC SEAL examination materials, and ACARA NAPLAN Technical Reports 2024–2025.
All content is validated against the relevant official frameworks and spans 16 content categories.
06 · Scale
Global scalability design
The diagnostic engine is designed for curriculum-agnostic deployment. The misconception taxonomy maps to reasoning error patterns — cognitive processes largely consistent across educational systems. Curriculum-specific components (question content, subject taxonomy) are modular.
Deploying in a new curriculum context requires: (1) domain-specific fine-tuning data, (2) curriculum-aligned question templates, (3) localised misconception taxonomy validation. The infrastructure, inference chain, parent reporting, and adaptive scheduling require no modification.
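The shared/curriculum-specific split could be expressed as a deployment configuration like the sketch below; the class and field names are illustrative assumptions, not the platform's actual config format.

```python
# Illustrative deployment config for a new curriculum context. Only the
# three curriculum-specific inputs change; the inference chain, parent
# reporting, and adaptive scheduling are shared and untouched.
from dataclasses import dataclass

@dataclass(frozen=True)
class CurriculumDeployment:
    locale: str                  # e.g. "en-AU"
    fine_tune_dataset: str       # (1) domain-specific fine-tuning data
    template_pack: str           # (2) curriculum-aligned question templates
    taxonomy_validation: str     # (3) localised taxonomy validation report

AU_K12 = CurriculumDeployment(
    locale="en-AU",
    fine_tune_dataset="datasets/au_k12_gold_labels",
    template_pack="templates/au_k12",
    taxonomy_validation="reports/au_k12_taxonomy_validation",
)
```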
Research partnerships, institutional licensing, or international deployment enquiries:
support@edvora.com.au