InceptBench Evaluators & API

One unified evaluator for questions, quizzes, reading passages, and articles.
Automatically classifies content and routes to specialized assessment methods.


What Gets Evaluated?

InceptBench automatically classifies your content and evaluates it across 8-11 quality dimensions depending on content type:

Questions

MCQs, fill-in, match — 11 metrics including distractor quality

Quizzes

Multiple questions — concept coverage, difficulty distribution

Reading Passages

Fiction & nonfiction — reading level, engagement, accuracy

Every evaluation returns:

  • Content type — Automatically classified (question, quiz, fiction_reading, nonfiction_reading, other)
  • Overall score (0.0-1.0) — Holistic quality assessment
  • Dimension scores — Individual scores with reasoning for each metric
  • Suggested improvements — Actionable recommendations

Quickstart

Evaluate educational content in seconds with our evaluator:


API Usage

bash

curl -X POST "https://api.inceptbench.com/evaluate" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --data '{
    "generated_content": [
      {
        "id": "q1",
        "content": "What is 2+2? A) 3 B) 4 C) 5 D) 6"
      }
    ]
  }'

Note: To obtain an API key, please reach out to the InceptBench team.
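The same request can be issued from Python with the standard library alone. This is a minimal sketch: the endpoint, headers, and payload shape come from the curl example above, while the build_request helper is an illustrative name, not part of any official SDK.

```python
import json
import urllib.request

API_URL = "https://api.inceptbench.com/evaluate"

def build_request(items, api_key):
    """Assemble a POST request for the /evaluate endpoint."""
    payload = json.dumps({"generated_content": items}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# To actually send it (requires a valid key and network access):
#   req = build_request(
#       [{"id": "q1", "content": "What is 2+2? A) 3 B) 4 C) 5 D) 6"}],
#       "YOUR_API_KEY",
#   )
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```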


Input Format

InceptBench uses a single unified input format. The content field accepts any format — plain text, JSON object, or structured data. The evaluator automatically classifies and routes your content.

Content Item Schema

json

{
  "generated_content": [
    {
      "id": "optional-unique-id",
      "curriculum": "common_core",
      "request": {
        "grade": "7",
        "subject": "mathematics",
        "type": "mcq",
        "difficulty": "medium",
        "locale": "en-US",
        "skills": {
          "lesson_title": "Solving Linear Equations",
          "substandard_id": "CCSS.MATH.7.EE.A.1"
        },
        "instructions": "Create a linear equation problem"
      },
      "content": "Your content here (string, JSON object, or any format)"
    }
  ]
}

Content Item Fields

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| content | Yes | — | Content to evaluate. Accepts any format: plain text, JSON object, or structured data |
| id | No | Auto-generated | Unique identifier for the content item |
| curriculum | No | common_core | Curriculum for alignment evaluation |
| request | No | null | Optional metadata about the generation request (see below) |

Request Metadata Fields (all optional)

| Field | Description | Example Values |
| --- | --- | --- |
| grade | Grade level | "K", "1", "7", "12" |
| subject | Subject area | "mathematics", "english", "science" |
| type | Content type hint | "mcq", "fill-in", "article", "quiz" |
| difficulty | Difficulty level | "easy", "medium", "hard" |
| locale | Language-region code | "en-US", "en-AE", "ar-AE" |
| skills | Skills information | JSON object or string with lesson/standard info |
| instructions | Generation prompt | The instruction/prompt used to generate this content |

Flexible Content: The content field has no enforced schema. You can pass plain text like "What is 2+2?" or structured JSON with question, answer, options, etc. InceptBench will automatically parse and evaluate whatever format you provide.
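Because content is schema-free, a thin helper can normalize whatever your generator produces into valid items. A sketch under that assumption — the make_item name and its keyword defaults are illustrative, not part of the API:

```python
def make_item(content, item_id=None, curriculum=None, request=None):
    """Wrap arbitrary content (plain text or a structured dict) into a
    generated_content item, omitting fields that were not provided."""
    item = {"content": content}
    if item_id is not None:
        item["id"] = item_id
    if curriculum is not None:
        item["curriculum"] = curriculum
    if request is not None:
        item["request"] = request
    return item

payload = {
    "generated_content": [
        # plain-text content
        make_item("What is 2+2? A) 3 B) 4 C) 5 D) 6", item_id="q1"),
        # structured content with request metadata
        make_item(
            {"question": "Which figure shows equal parts?", "answer": "A"},
            request={"grade": "3", "subject": "math", "type": "mcq"},
        ),
    ]
}
```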

Content Examples

Here are examples of different content types. These are examples, not required schemas — you can structure your content however you prefer.

Multiple Choice Question

json

{
  "generated_content": [
    {
      "id": "mcq-example",
      "request": {
        "grade": "3",
        "subject": "math",
        "type": "mcq",
        "difficulty": "easy"
      },
      "content": {
        "question": "Which figure shows equal parts?",
        "answer": "A",
        "answer_explanation": "Only A has all equal parts",
        "answer_options": [
          { "key": "A", "text": "Figure A" },
          { "key": "B", "text": "Figure B" },
          { "key": "C", "text": "Figure C" },
          { "key": "D", "text": "Figure D" }
        ]
      }
    }
  ]
}

Plain Text Question

json

{
  "generated_content": [
    {
      "content": "What is the value of x in 3x + 7 = 22? A) 3 B) 4 C) 5 D) 6. The answer is C because subtracting 7 from both sides gives 3x = 15, then dividing by 3 gives x = 5."
    }
  ]
}

Reading Passage

json

{
  "generated_content": [
    {
      "id": "reading-1",
      "request": {
        "grade": "5",
        "subject": "english",
        "type": "article"
      },
      "content": "# The Water Cycle\n\nWater is always moving on Earth. It goes from oceans to clouds to rain and back again. This is called the water cycle...\n\n## Questions\n\n1. What powers the water cycle?\n2. Where does most evaporation occur?"
    }
  ]
}

Quiz (Multiple Questions)

json

{
  "generated_content": [
    {
      "id": "quiz-1",
  "request": {
        "grade": "4",
        "subject": "math",
        "type": "quiz"
  },
  "content": {
        "title": "Fractions Assessment",
        "questions": [
          {
            "question": "What is 1/2 + 1/4?",
            "answer": "3/4",
            "options": ["1/2", "3/4", "1/4", "1"]
          },
          {
            "question": "Which fraction is larger: 2/3 or 3/5?",
            "answer": "2/3",
            "options": ["2/3", "3/5", "They are equal"]
          }
        ]
      }
    }
  ]
}

Images in Content

Images are automatically detected from the content string. No separate image_url field is required.

Supported formats:

| Format | Example |
| --- | --- |
| Direct URL | https://example.com/image.png |
| Markdown | ![description](https://example.com/image.png) |
| HTML | <img src="https://example.com/image.png"> |

Example with image:

json

{
  "generated_content": [
    {
      "content": "Look at the triangle below:\n\n![triangle](https://example.com/triangle.png)\n\nWhat is the area of the triangle if the base is 6 cm and height is 4 cm?"
    }
  ]
}

When images are detected:

  • They are sent to vision-capable models for analysis
  • Object counting is performed automatically
  • Visual properties are analyzed for educational relevance

Example Response

json

{
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "evaluations": {
    "q1": {
      "content_type": "question",
      "overall": {
        "score": 0.85,
        "reasoning": "Well-constructed question with clear answer options and appropriate difficulty.",
        "suggested_improvements": "Consider adding a visual stimulus for enhanced engagement."
      },
      "factual_accuracy": {
        "score": 1.0,
        "reasoning": "The mathematical content and answer are correct.",
        "suggested_improvements": null
      },
      "educational_accuracy": {
        "score": 1.0,
        "reasoning": "Fulfills educational intent for the target grade level.",
        "suggested_improvements": null
      },
      "curriculum_alignment": {
        "score": 0.9,
        "reasoning": "Aligns well with CCSS.MATH.3.OA standards.",
        "suggested_improvements": null
      },
      "clarity_precision": {
        "score": 0.85,
        "reasoning": "Clear wording appropriate for grade 3 students.",
        "suggested_improvements": null
      },
      "weighted_score": 0.8387
    }
  },
  "evaluation_time_seconds": 12.34,
  "inceptbench_version": "2.2.0",
  "failed_items": null
}
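Since evaluations is keyed by item id, a short helper can flatten a response into one summary row per item. The field paths follow the example response above; the summarize name is illustrative:

```python
def summarize(response):
    """One summary row per evaluated item: id, detected content type,
    overall score, and the aggregate weighted score."""
    return [
        {
            "id": item_id,
            "content_type": ev["content_type"],
            "overall_score": ev["overall"]["score"],
            "weighted_score": ev.get("weighted_score"),
        }
        for item_id, ev in response["evaluations"].items()
    ]
```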

Content Types & Metrics

The type field in your request can be any string (e.g., "mcq", "fill-in", "article"). The evaluator will automatically classify your content and map it to one of these internal types:

| Content Type | Description | Key Metrics |
| --- | --- | --- |
| question | Single educational question | curriculum_alignment, clarity_precision, distractor_quality, difficulty_alignment, stimulus_quality |
| quiz | Multiple questions together | concept_coverage, difficulty_distribution, non_repetitiveness, test_preparedness |
| fiction_reading | Fictional narrative passages | reading_level_match, engagement, accuracy_and_logic, question_quality |
| nonfiction_reading | Informational passages | reading_level_match, topic_focus, accuracy_and_logic, question_quality |
| other | General educational content | educational_value, content_appropriateness, clarity_and_organization |

Universal Metrics (all content types)

| Metric | Type | Description |
| --- | --- | --- |
| overall | 0.0-1.0 | Holistic quality score |
| factual_accuracy | Binary | Content is factually correct |
| educational_accuracy | Binary | Fulfills educational intent |

API Reference

Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /evaluate | Evaluate content items (1-100 per request) |
| GET | /health | Health check |
| GET | /curriculums | List available curriculums |

Authentication

Include your API key in the Authorization header:

bash

Authorization: Bearer YOUR_API_KEY

Rate Limits

  • 100 items per request maximum
  • 10 concurrent requests per API key
  • Typical evaluation time: 10-30 seconds per item
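To stay under these limits when evaluating large sets, split the items into request-sized batches on the client. A minimal sketch (the batched helper is an illustrative name; pair it with a semaphore or worker pool capped at 10 to respect the concurrency limit):

```python
from itertools import islice

MAX_ITEMS_PER_REQUEST = 100  # documented per-request ceiling

def batched(items, size=MAX_ITEMS_PER_REQUEST):
    """Yield successive batches of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk
```

Each yielded batch becomes one generated_content array; with a typical 10-30 seconds per item, set request timeouts accordingly.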

Interactive API Docs

Explore the full API with our interactive documentation:


CLI Reference

Installation

bash

pip install inceptbench

Environment Variables

Before using the CLI, set the following environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| OPENAI_API_KEY | Yes | OpenAI API key (powers evaluations) |
| INCEPT_API_KEY | Yes | InceptBench API key (for curriculum search) |
| GEMINI_API_KEY | No | Google Gemini API key (for image analysis) |
| ANTHROPIC_API_KEY | No | Anthropic API key (for image analysis fallback) |

Commands

| Command | Description |
| --- | --- |
| inceptbench example | Create a sample content.json file |
| inceptbench evaluate <file.json> | Evaluate content from a JSON file |
| inceptbench evaluate --raw "content" | Evaluate a raw content string |
| inceptbench --version | Show version |

Options

| Option | Short | Description |
| --- | --- | --- |
| --output FILE | -o | Save results to a JSON file |
| --verbose | -v | Show verbose/debug output |
| --raw | | Evaluate raw content (string or .txt/.md file) |
| --curriculum NAME | | Curriculum for evaluation (with --raw only) |
| --generation-prompt | | Generation prompt (with --raw only) |

Examples

bash

# Create sample input file
inceptbench example

# Evaluate from JSON file
inceptbench evaluate content.json

# Evaluate and save results
inceptbench evaluate content.json -o results.json

# Evaluate raw content
inceptbench evaluate --raw "What is 2+2? A) 3 B) 4 C) 5 D) 6"

# Evaluate with curriculum context
inceptbench evaluate --raw "Solve for x: 2x + 5 = 15" \
  --curriculum common_core \
  --generation-prompt "Grade 7 algebra"

# Verbose mode
inceptbench evaluate content.json -v

PyPI Package