Optimize AI agents with fine-tuning

Analyze real agent quality problems, compare fine-tuning methods, and match the right approach to each scenario.

⚙️ Advanced (400) ⏱ ~15 minutes 🧭 Choose your own path

Choose a scenario to explore

Adventure Works agent quality problems

Adventure Works operates a Trail Guide Agent for outdoor trip planning. The agent is experiencing three distinct quality problems. Each scenario presents real evaluation metrics and a quality challenge, then asks you to identify the best fine-tuning approach.

You can work through the scenarios in any order. Start with the one that matches a challenge you've encountered in your own work.

Scenario 1

Inconsistent response format

Gear specification responses vary unpredictably: some are vague, others detailed. Customers who ask similar questions receive answers of very different quality.

Coherence 2.8/5 ⚠️ · Fluency 3.1/5 ⚠️ · Groundedness 4.2/5 ✓ · Relevance 4.1/5 ✓
Scenario 2

Inappropriate tone for sensitive topics

Safety recommendations swing between alarmist and dismissive. Neither extreme supports Adventure Works' goal of encouraging safe outdoor activities.

Coherence 3.0/5 ⚠️ · Content Safety 7% flagged ⚠️ · Groundedness 4.3/5 ✓ · Relevance 4.2/5 ✓
Scenario 3

Illogical reasoning in complex planning

Trip plans ignore how constraints interact, recommending wild camping for beginners or exposed terrain in bad weather instead of reasoning through the trade-offs.

Tool Call Accuracy 2.7/5 ⚠️ · Relevance 2.8/5 ⚠️ · Groundedness 4.1/5 ✓ · Fluency 4.3/5 ✓
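
Before choosing a scenario, it can help to see which metrics are weak in each one. The following is a minimal sketch, not part of the Adventure Works tooling. It copies the 1-5 scores from the three scenarios above and lists the ones that fall below a cutoff. The names `SCENARIOS`, `PASS_THRESHOLD`, and `weak_metrics` are hypothetical, and the 4.0 cutoff is an assumption chosen only because it separates the flagged scores from the passing ones. Content Safety (7% flagged) is left out because it is a rate, not a 1-5 score.

```python
# Scores copied from the three scenarios (1-5 scale).
# Content Safety (7% flagged) is omitted: it is a rate, not a 1-5 score.
SCENARIOS = {
    "Inconsistent response format": {
        "Coherence": 2.8, "Fluency": 3.1, "Groundedness": 4.2, "Relevance": 4.1,
    },
    "Inappropriate tone for sensitive topics": {
        "Coherence": 3.0, "Groundedness": 4.3, "Relevance": 4.2,
    },
    "Illogical reasoning in complex planning": {
        "Tool Call Accuracy": 2.7, "Relevance": 2.8,
        "Groundedness": 4.1, "Fluency": 4.3,
    },
}

# Assumption: 4.0 is an illustrative cutoff, not a threshold given in the source.
PASS_THRESHOLD = 4.0


def weak_metrics(scores, threshold=PASS_THRESHOLD):
    """Return the names of metrics scoring below the threshold, lowest first."""
    below = [name for name, score in scores.items() if score < threshold]
    return sorted(below, key=scores.get)


if __name__ == "__main__":
    for title, scores in SCENARIOS.items():
        print(f"{title}: {', '.join(weak_metrics(scores))}")
```

Groundedness passes in all three scenarios, so the weak metrics point to problems with form, tone, and reasoning rather than with factual grounding.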