Tools · AWS ML Blog
Multimodal evaluators: MLLM-as-a-judge for image-to-text tasks in Strands Evals
The AWS ML Blog describes multimodal evaluators in Strands Evals that use a multimodal large language model (MLLM) as a judge for image-to-text tasks. The approach is intended to verify whether a generated text output is grounded in the source image, for use cases such as visual shopping, document understanding, and chart analysis.
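To make the judge pattern concrete, here is a minimal Python sketch of a grounding check. It is not the Strands Evals API: `JUDGE_PROMPT`, `Verdict`, `evaluate_grounding`, and `stub_judge` are all hypothetical names, and the JSON reply format is an assumption chosen for illustration. The judge is passed in as a plain callable so that any multimodal model client could be substituted for the stub.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt; the real evaluators may phrase and structure this differently.
JUDGE_PROMPT = (
    "You are grading an image-to-text output.\n"
    "Given the attached image and the candidate text below, decide whether "
    "every claim in the text is supported by the image.\n"
    'Reply with JSON: {{"grounded": true|false, "reason": "<short reason>"}}\n\n'
    "Candidate text:\n{candidate}"
)


@dataclass
class Verdict:
    grounded: bool
    reason: str


# A judge is any callable that takes (image_bytes, prompt) and returns the
# model's raw text reply. This signature is an assumption for the sketch.
Judge = Callable[[bytes, str], str]


def evaluate_grounding(image: bytes, candidate: str, judge: Judge) -> Verdict:
    """Ask the judge whether `candidate` is supported by `image`."""
    prompt = JUDGE_PROMPT.format(candidate=candidate)
    raw = judge(image, prompt)
    try:
        data = json.loads(raw)
        # Only a literal JSON `true` counts as grounded.
        return Verdict(data["grounded"] is True, str(data.get("reason", "")))
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        # Treat an unparseable reply as a failed check rather than guessing.
        return Verdict(False, "judge reply could not be parsed")


def stub_judge(image: bytes, prompt: str) -> str:
    # Stand-in for a real multimodal model call; always returns a fixed verdict.
    return '{"grounded": false, "reason": "chart shows 12%, text says 21%"}'


if __name__ == "__main__":
    verdict = evaluate_grounding(b"<image bytes>", "Revenue grew 21%.", stub_judge)
    print(verdict)
```

Failing closed on an unparseable reply is a design choice in this sketch: a judge that cannot produce a verdict is treated as not having confirmed grounding.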