Release Notes

Stay up to date with every InceptBench release — new features, improvements, and fixes.

InceptBench Release Notes

`v2.3.6` — Integrity Checks & AP Content Support

March 17, 2026

New Features

Added integrity_check metric to detect prompt injection and score manipulation attempts in submitted content.
Added support to evaluate AP (Advanced Placement) content.

Bug Fixes

Fixed an issue causing blank PNGs to be generated when converting SVGs to PNG format.
Fixed substandard matching logic that was incorrectly aligning content to the wrong curriculum standards.

`v2.3.5` — Evaluator Accuracy & LLM Consolidation

March 10, 2026

Improvements

Migrated to Gemini 3, improving response quality and evaluation consistency.
Consolidated LLM factory for cleaner model management across evaluators.
Integrated Langfuse for better observability and tracing of evaluation runs.

Bug Fixes

Fixed evaluator image analysis incorrectly counting visual elements, leading to inaccurate scores.
Fixed difficulty alignment misassessment where content was being assigned incorrect difficulty bands.
Fixed curriculum alignment misinterpretation causing content to be mapped to the wrong standards.
Fixed SVG rendering issues affecting image-based evaluation accuracy.

`v2.3.3` — Curriculum Versioning & Image Evaluation

Feb 10, 2026

New Features

Added support for versioning in the curriculum API, enabling callers to pin evaluations to a specific curriculum version.

Improvements

Evaluator now more strictly follows curriculum metadata, reducing drift between question content and expected standards alignment.
Significant improvements to image evaluation accuracy and consistency.

`v2.3.2` — Content Type Enforcement & Image Evaluation

February 2026

New Features

Evaluator now penalizes when generated content type doesn’t match the requested type (e.g. request is fill-in-the-blank but generated content is MCQ).
Evaluator now penalizes when an image is expected in the content but not present.

Improvements

Improved image evaluation using Gemini Agentic for more accurate visual content analysis.
Reduced hallucinations in curriculum assessment boundary detection.

`v2.3.0` — Evaluator Flexibility

Jan 10, 2026

Improvements

Major evaluator improvements including enhanced flexibility around curriculum stems, allowing for more natural question variations while maintaining alignment with standards.