Platform technology
A domain-specific AI engine for educational misconception diagnosis.
01 · Research problem
The problem we are solving
Student misconceptions in mathematics and language are well-documented in educational research. A student who consistently makes the same error is not lacking practice — they have a specific, identifiable cognitive error requiring targeted remediation.
Standard adaptive learning systems address this through content sequencing: give students more practice on the topic they got wrong. Edvora's engine addresses this at a deeper level — it identifies which specific reasoning error caused the wrong answer, then generates an explanation that starts from that exact error.
02 · Pipeline today
The diagnostic pipeline (current)
Today, the misconception classification engine runs as a structured prompting pipeline across three hosted models: Llama 3.3 70B via Groq (primary), Google Gemini 2.0 Flash (secondary), and Anthropic Claude Haiku 4.5 (tertiary). The 47-type misconception taxonomy is supplied as in-context guidance, each wrong answer is classified against it, and a tailored explanation is generated from the matched misconception.
The taxonomy itself was developed by the Edvora team from analysis of 140 official Australian examination past papers and validated against established educational research on student misconceptions. The taxonomy — not a fine-tuned model — is the core piece of intellectual property today.
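To make the pipeline concrete, the sketch below shows a minimal in-context classification call against the primary tier, using Groq's OpenAI-compatible Python client. The taxonomy excerpt, IDs, and prompt wording are illustrative assumptions rather than Edvora's production prompt, and the Gemini and Claude fallback tiers are omitted.

```python
# Minimal sketch of the in-context classification step (primary tier only).
# Taxonomy entries and prompt wording are illustrative, not the real prompt.
from groq import Groq  # OpenAI-compatible client; reads GROQ_API_KEY

# Two hypothetical taxonomy entries standing in for the 47-type taxonomy.
TAXONOMY_EXCERPT = """\
M-07: Adding numerators and denominators independently when adding fractions.
M-21: Reversing inequality direction when dividing by a negative."""

def classify_wrong_answer(question: str, wrong_answer: str) -> str:
    """Ask the primary model which misconception best explains the error."""
    client = Groq()
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        temperature=0.0,  # deterministic classification, no creative drift
        messages=[
            {"role": "system",
             "content": ("You classify student errors against a misconception "
                         "taxonomy. Reply with a single taxonomy ID.\n"
                         + TAXONOMY_EXCERPT)},
            {"role": "user",
             "content": f"Question: {question}\nStudent answer: {wrong_answer}"},
        ],
    )
    return response.choices[0].message.content.strip()
```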
Current technical stack
- Primary inference
- Groq (llama-3.3-70b-versatile)
- Secondary fallback
- Google Gemini 2.0 Flash
- Tertiary fallback
- Anthropic Claude Haiku 4.5
- Classification basis
- 47-type taxonomy in-context
- Question corpus
- 635 validated templates
- Source documents
- 140 official AU exam papers
- Quality gate
- 3,057 automated content tests
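As an illustration of the quality gate, a single automated content test might look like the sketch below. The template fields and assertions are hypothetical; the real suite runs 3,057 such checks against the validated template corpus.

```python
# Hypothetical shape of one automated content test (pytest style).
# Field names and checks are illustrative assumptions about the suite.
import pytest

TEMPLATES = [  # stand-in for the validated question-template corpus
    {"id": "frac-add-01", "answer": "5/6",
     "distractors": ["2/5", "1/5", "2/6"],
     "misconception_ids": ["M-07"]},
]

@pytest.mark.parametrize("template", TEMPLATES, ids=lambda t: t["id"])
def test_distractors_are_diagnosable(template):
    # Every wrong option must differ from the answer and trace back to at
    # least one taxonomy entry, so the diagnostic step can never dead-end.
    assert template["answer"] not in template["distractors"]
    assert template["misconception_ids"]
```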
02b · Roadmap
Domain-specific fine-tune (in development)
Our roadmap target is a fine-tuned Qwen3-8B classifier built specifically for the Australian K-12 misconception taxonomy. The gold labels collected through the research program will serve as the training corpus.
The approach follows two precedents: a silver-medal solution from the Kaggle Eedi competition, which fine-tuned Qwen2.5-32B with LoRA (r=16, alpha=32) to classify student wrong answers against a structured misconception taxonomy, and the April 2026 EduQwen paper (arxiv.org/html/2604.06385), in which a fine-tuned Qwen3-32B achieved 96.52% accuracy on the Pedagogy Benchmark.
We will publish the methodology and results when the fine-tune meets our internal accuracy threshold (target Q3 2026). Until then, the production system uses the prompting pipeline above.
Planned fine-tune specifications
- Target base model
- Qwen3-8B (Apache 2.0 licence)
- Method
- QLoRA 4-bit quantisation
- LoRA rank
- r=16, alpha=32
- Training framework
- Unsloth + HuggingFace TRL SFTTrainer
- Target inference
- 2× speed, 60% less VRAM vs base
- Required gold labels
- 500 (research program output)
- Target release
- Q3 2026
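The specifications above map onto roughly the following Unsloth + TRL configuration. This is a sketch under the stated specs, not the production training script: the gold-label file path is a placeholder, and some trainer argument names (for example tokenizer vs. processing_class) vary between TRL releases.

```python
# Configuration sketch for the planned Qwen3-8B QLoRA fine-tune.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Placeholder path; the real corpus is the ~500 gold labels from the
# research program, formatted as one {"text": ...} record per example.
gold_label_dataset = load_dataset(
    "json", data_files="gold_labels.jsonl")["train"]

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B",
    max_seq_length=2048,
    load_in_4bit=True,           # QLoRA: 4-bit quantised base weights
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                        # LoRA rank per the spec above
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,         # `processing_class` in newer TRL releases
    train_dataset=gold_label_dataset,
    args=SFTConfig(
        output_dir="qwen3-8b-misconception-lora",
        dataset_text_field="text",
        num_train_epochs=3,
        per_device_train_batch_size=4,
    ),
)
trainer.train()
```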
03 · Taxonomy
The 47-type misconception taxonomy
The taxonomy classifies student reasoning errors across three subject areas. Examples from each:
Mathematics
Adding numerators and denominators independently when adding fractions (e.g. concluding 1/2 + 1/3 = 2/5) · applying 2D area formulas to 3D problems · reversing inequality direction when dividing by negatives · confusing place value in decimal multiplication.
English
Selecting explicitly stated details rather than implied inferences · confusing author purpose with character motivation · over-generalising from single examples.
General Ability (verbal, spatial, logical)
Applying 2D rotation logic to 3D spatial problems · confusing reflection with rotation · committing the affirming-the-consequent fallacy in logical deduction.
Each entry was derived from the 140-paper examination corpus described below and validated against established educational research on student misconceptions.
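In code, a single taxonomy entry could be represented as below. The field names and example values are illustrative assumptions, not Edvora's actual schema.

```python
# Illustrative shape of one taxonomy entry; not the production schema.
from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class Misconception:
    id: str                       # stable taxonomy identifier
    subject: Literal["maths", "english", "general_ability"]
    description: str              # the reasoning error itself
    remediation_hint: str         # seed for the tailored explanation

EXAMPLE = Misconception(
    id="M-07",
    subject="maths",
    description=("Adds numerators and denominators independently when "
                 "adding fractions (e.g. 1/2 + 1/3 = 2/5)."),
    remediation_hint="Rewrite both fractions over a common denominator first.",
)
```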
04 · Infrastructure
Production inference architecture
The platform uses a 3-tier AI fallback chain:
- Tier 1 · Groq (llama-3.3-70b-versatile) · sub-100ms inference for real-time diagnostic response
- Tier 2 · Google Gemini 2.0 Flash · automatic failover if Groq is unavailable
- Tier 3 · Anthropic Claude Haiku 4.5 · final fallback ensuring 99.9%+ availability
497 of 498 question templates have pre-generated cached AI explanations, delivering near-zero runtime AI cost for the majority of student interactions. Dynamic generation is reserved for novel error patterns not matched by the cache.
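The control flow for this cache-first chain can be sketched as follows. The provider callables are hypothetical stand-ins for the real Groq, Gemini, and Claude clients; only the failover logic is the point here.

```python
# Sketch of the cache-first, 3-tier failover described above.
from typing import Callable

def diagnose(question_id: str, wrong_answer: str,
             cache: dict[str, str],
             providers: list[Callable[[str, str], str]]) -> str:
    # 1. Cache hit: serve the pre-generated explanation at near-zero AI cost.
    cached = cache.get(f"{question_id}:{wrong_answer}")
    if cached is not None:
        return cached

    # 2. Cache miss: walk the fallback chain in tier order.
    last_error: Exception | None = None
    for call_provider in providers:
        try:
            return call_provider(question_id, wrong_answer)
        except Exception as err:   # timeout, rate limit, provider outage
            last_error = err       # fail over to the next tier
    raise RuntimeError("all inference tiers unavailable") from last_error
```

A caller would pass the providers in tier order, e.g. providers=[call_groq, call_gemini, call_claude], so the chain degrades exactly as listed above.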
05 · Corpus
Training and validation corpus
The corpus comprises 140 official Australian examination documents, including NSW Department of Education Selective High School and Opportunity Class practice tests, NAPLAN numeracy and literacy papers (Years 3, 5, 7, 9), ACER scholarship sample examinations, WA ASET official samples, VIC SEAL examination materials, and ACARA NAPLAN Technical Reports 2024–2025.
All content is validated against the relevant official frameworks and spans 16 content categories.
06 · Scale
Global scalability design
The diagnostic engine is designed for curriculum-agnostic deployment. The misconception taxonomy maps to reasoning error patterns — cognitive processes largely consistent across educational systems. Curriculum-specific components (question content, subject taxonomy) are modular.
Deploying in a new curriculum context requires: (1) domain-specific fine-tuning data, (2) curriculum-aligned question templates, (3) localised misconception taxonomy validation. The infrastructure, inference chain, parent reporting, and adaptive scheduling require no modification.
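The shared/curriculum-specific split could be expressed as a deployment configuration like the sketch below; the class and field names are illustrative assumptions, not the platform's actual config format.

```python
# Illustrative deployment config for a new curriculum context. Only the
# three curriculum-specific inputs change; the inference chain, parent
# reporting, and adaptive scheduling are shared and untouched.
from dataclasses import dataclass

@dataclass(frozen=True)
class CurriculumDeployment:
    locale: str                  # e.g. "en-AU"
    fine_tune_dataset: str       # (1) domain-specific fine-tuning data
    template_pack: str           # (2) curriculum-aligned question templates
    taxonomy_validation: str     # (3) localised taxonomy validation report

AU_K12 = CurriculumDeployment(
    locale="en-AU",
    fine_tune_dataset="datasets/au_k12_gold_labels",
    template_pack="templates/au_k12",
    taxonomy_validation="reports/au_k12_taxonomy_validation",
)
```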
Research partnerships, institutional licensing, or international deployment enquiries:
support@edvora.com.au