THE CHALLENGE /

Grading essays at scale is slow and mentally tiring. Teachers spend hours reading responses, applying rubrics, and making sure grades are fair. The real challenge wasn't just making grading faster; it was designing an AI tool that teachers could trust to support their judgment, not replace it.

RESEARCH INSIGHTS /

Grading requires judgment, not just scoring

Teachers look at nuance, intent, and effort: qualities AI alone can't fully understand.

Lack of transparency breaks trust

If instructors can’t see why a score was given, it’s hard for them to trust or rely on it.

Sensitive content is easier to miss at scale

Bulk review increases the risk of overlooking self-harm or distress signals.

Human review should be a core interaction

Reviewing and editing should feel central to the workflow and not an exception path.

THE SOLUTION /

To help with grading at scale, I designed a human-centered AI essay grading proof of concept that supports teachers instead of replacing them. The AI helps speed up rubric-based grading, but teachers stay in control. It reviews essays first, points out areas that need attention, and suggests feedback, while instructors make the final grading decisions.

Clear entry point for bulk grading

A simple upload flow with clear rules helps users know what to expect and prevents errors before grading starts.

Making AI work visible, not opaque

Showing each grading step helps people understand what the AI is doing and makes it feel like a helpful assistant.

Transparent rubric-driven scoring

Each score is clearly linked to the rubric and explained in plain language to support trust and consistency.
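As an illustration of what "linked to the rubric" can mean in data terms, a tool like this might represent each AI-suggested score as a small record tied to a named rubric criterion with a plain-language rationale. This is a minimal sketch with hypothetical field names, not the project's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class CriterionScore:
    """One AI-suggested score, always tied to a named rubric criterion."""
    criterion: str   # rubric criterion being scored
    score: int       # suggested points
    max_score: int   # points available for this criterion
    rationale: str   # plain-language explanation shown to the instructor

def format_for_review(scores: list[CriterionScore]) -> str:
    """Render each score beside its criterion and rationale for instructor review."""
    return "\n".join(
        f"{s.criterion}: {s.score}/{s.max_score} ({s.rationale})" for s in scores
    )

# Example records (illustrative content only)
suggested = [
    CriterionScore("Thesis clarity", 4, 5, "The thesis is stated early but not revisited."),
    CriterionScore("Use of evidence", 5, 5, "Each claim is supported with a cited source."),
]
print(format_for_review(suggested))
```

Because every score carries its criterion and rationale, nothing reaches the instructor as an unexplained number.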

Sensitive content is highlighted with care

Potential self-harm content is highlighted for instructors using calm, review-only signals.
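One way to keep such signals review-only is to attach flags alongside, rather than inside, the grading result, so a flag can never alter a score automatically. A minimal sketch of that separation, where the phrase list and field names are hypothetical placeholders rather than the project's actual detection method:

```python
from dataclasses import dataclass, field

# Hypothetical placeholder phrases; a real system would use a vetted classifier.
REVIEW_PHRASES = ["hurt myself", "no reason to go on"]

@dataclass
class EssayResult:
    essay_id: str
    score: int                                              # AI-suggested score, untouched by flags
    review_flags: list[str] = field(default_factory=list)   # surfaced to the instructor only

def flag_for_review(result: EssayResult, text: str) -> EssayResult:
    """Attach review-only flags for potentially sensitive content; never modifies the score."""
    for phrase in REVIEW_PHRASES:
        if phrase in text.lower():
            result.review_flags.append(f"Flagged for instructor review: '{phrase}'")
    return result
```

Keeping flags in a separate field makes the design intent explicit: the system surfaces concerns, and only a human acts on them.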

Human judgment is built into every step

Editing scores is easy and expected, keeping instructors in control of the final results.

THE IMPACT /

While this project was designed as a proof of concept, it successfully demonstrated how a human-centered approach to AI can meaningfully support academic assessment.

Under 30s per essay

Sub-30-second evaluation time for AI-assisted grading.

90% human alignment

AI scores matched human graders in 9 out of 10 cases.

Reduced cognitive load

AI reasoning helped instructors focus where it mattered.

Sensitive content visibility

Potential self-harm language surfaced for instructor review.

WHAT I LEARNED /

This project showed me that trust matters more than speed in AI design. Educators felt confident when AI decisions were transparent and when they stayed in control. I also learned that AI works best when it highlights where attention is needed, rather than making final judgments.

Kansas University

AI-Powered Assessment Tool

Exploring human-centered AI for academic essay assessment

Kansas University wanted to explore AI-assisted grading, so I led the end-to-end product design of a proof-of-concept essay assessment tool focused on trust, transparency, and human oversight.

Client

Kansas University

Domain

EdTech

Role

UX Designer

Timeline

8 weeks

Impact

Achieved 90% agreement with human graders and evaluated essays in under 30 seconds.