Trail Guide Agent Technical Plan

Architecture Overview

The Trail Guide Agent is implemented as a Python command-line application that uses Azure OpenAI Service to provide conversational assistance for hiking trip planning. The architecture follows a simple, educational design optimized for individual learners.

High-Level Flow:

  1. User launches the agent from the command line
  2. Agent initializes connection to Azure OpenAI Service using the Microsoft Foundry SDK
  3. Agent displays welcome message and enters interactive loop
  4. User enters questions about trails or gear in natural language
  5. Agent maintains conversation history to preserve context
  6. Agent sends user message + conversation history to Azure OpenAI
  7. Azure OpenAI generates response based on system instructions and conversation context
  8. Agent displays response to user
  9. Loop continues until user exits (e.g., types “exit” or “quit”)
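
The loop in steps 3 through 9 is compact enough to sketch up front. In this sketch, get_agent_reply is a hypothetical placeholder; the actual Foundry SDK call is shown under Phase 2.

```python
# Minimal sketch of the interactive loop (steps 3-9). get_agent_reply is a
# hypothetical placeholder standing in for the Foundry SDK call (Phase 2).

def get_agent_reply(history: list[dict]) -> str:
    """Placeholder: send history to the deployed agent and return its reply."""
    raise NotImplementedError

def main() -> None:
    history: list[dict] = []  # in-memory context; cleared on exit
    print("Welcome to the Trail Guide Agent! Type 'exit' or 'quit' to leave.")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in {"exit", "quit"}:  # step 9: exit commands
            break
        if not user_input:
            continue  # ignore empty input
        history.append({"role": "user", "content": user_input})  # step 5
        reply = get_agent_reply(history)                          # steps 6-7
        history.append({"role": "assistant", "content": reply})
        print(f"Agent: {reply}")                                  # step 8

if __name__ == "__main__":
    main()
```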

Components:

  • Main application (trail_guide_agent.py): Entry point, conversation loop, user I/O
  • Azure AI project client: Connects to the Foundry project and handles LLM interactions via the Microsoft Foundry SDK
  • Conversation manager: Maintains message history for context
  • System prompt: Defines agent behavior, persona, and capabilities
  • Configuration: Environment variables for the Foundry project connection string and agent ID

This architecture keeps the implementation minimal—focused on demonstrating core GenAIOps concepts without unnecessary complexity.

Technology Stack and Key Decisions

Core Technologies

Programming Language: Python 3.11+

  • Rationale: Aligns with constitution requirement and is standard for AI/ML projects
  • Educational benefit: Accessible to most learners, rich ecosystem of Azure SDKs

Azure OpenAI Service: LLM provider

  • Rationale: Required by constitution (Azure-only resources)
  • Deployment model: GPT-4 or GPT-4o for high-quality conversational responses
  • Educational benefit: Industry-standard LLM service with robust documentation

Microsoft Foundry SDK (azure-ai-projects): Primary SDK

  • Rationale: Constitution mandates using latest Foundry SDK
  • Package: azure-ai-projects (latest stable version)
  • Educational benefit: Teaches modern Azure AI development patterns
  • Alternative packages NOT used: openai Python library, azure-ai-inference (older SDK)

Authentication: DefaultAzureCredential

  • Rationale: Simplifies authentication for individual learners
  • Method: Azure CLI authentication via az login
  • Educational benefit: No need to manage service principals or complex auth flows
  • Secrets: API keys avoided; uses Azure identity-based access
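
A sketch of the resulting auth code, assuming only the azure-identity package:

```python
# Sketch: identity-based access. For individual learners,
# DefaultAzureCredential resolves to the Azure CLI token obtained via
# `az login`; no API keys are read or stored anywhere.
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
# The credential is handed to the SDK client (see Phase 2); a token is
# requested lazily on the first service call.
```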

Supporting Libraries

Python Standard Library: For environment variables, command-line arguments, and type hints

  • os: Environment variable access
  • sys: Command-line argument handling
  • typing: Type hints for code clarity

No additional dependencies unless strictly necessary

  • Rationale: Minimal approach per constitution
  • Educational benefit: Reduces setup friction and troubleshooting

Infrastructure Provisioning Approach

Azure Developer CLI (azd): Primary deployment tool

  • Students provision Azure resources using azd up command
  • Rationale: Simplifies infrastructure setup while teaching modern deployment patterns
  • Benefits:
    • Provisions Azure AI Foundry hub + project
    • Deploys agent to Microsoft Foundry
    • Auto-generates .env file with connection details
    • Repeatable and version-controlled (uses Bicep under the hood)
    • Integrated VS Code experience

Alternative considered: Manual Bicep deployment

  • Decision: Rejected in favor of azd, which wraps the same Bicep templates and provides a better student experience
  • Rationale: Students learn infrastructure-as-code without Bicep complexity
  • Constitutional compliance: azd uses Bicep templates internally

Alternative considered: Azure Portal (clickops)

  • Decision: Rejected; portal-based setup is not repeatable or teachable at scale
  • Rationale: azd teaches automation while remaining simple

Configuration Approach

Environment Variables: Managed via .env file

  • .env file auto-generated by azd up during provisioning
  • Contains:
    • AZURE_PROJECT_CONNECTION_STRING: Connection to AI Foundry project
    • AZURE_AGENT_ID: ID of deployed agent in Foundry
    • Generated automatically; students do not configure these manually
  • .env file in .gitignore to prevent secret exposure
  • .env.example provided as template showing required variables
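
A minimal sketch of how the agent consumes this configuration at startup, using python-dotenv (listed later in requirements.txt):

```python
# Sketch: load the azd-generated .env and read the two expected variables.
import os
from dotenv import load_dotenv

load_dotenv()  # no-op if the variables are already set in the environment
connection_string = os.environ["AZURE_PROJECT_CONNECTION_STRING"]
agent_id = os.environ["AZURE_AGENT_ID"]
```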

Authentication: DefaultAzureCredential

  • Students authenticate via az login before running azd up
  • Same credentials used at runtime to connect to agent
  • No API keys or secrets in code or config

Data Storage

No persistent storage required for MVP

  • Conversation history: In-memory only (clears on exit)
  • Trail data: Embedded in system prompt or future knowledge base
  • Rationale: Minimizes setup complexity
  • Future enhancement: Add RAG with Azure AI Search (stretch module)

User Interface

Command-line interface (CLI)

  • Input: Standard input (keyboard)
  • Output: Standard output (terminal)
  • Rationale: Simplest possible interface; focuses learning on AI behavior, not UI
  • Educational benefit: Works on all platforms; no web framework required
  • Future enhancement: Jupyter notebook interface for interactive learning

Implementation Sequence

Student Tasks vs. Pre-Built Code

This plan distinguishes between:

  • Student tasks: Activities learners perform as part of the lab
  • Pre-built code: Agent implementation provided in the repository

The educational goal is for students to:

  1. Provision Azure infrastructure
  2. Configure their environment
  3. Run and interact with the pre-built agent
  4. Understand how it works through code review
  5. (Optional) Modify and experiment with the agent

Phase 1: Student Setup Tasks

Student performs these tasks in the lab:

  1. Clone the repository
    • Fork/clone this repository to their local machine
    • Open the repository in VS Code
  2. Authenticate with Azure
    • Install Azure CLI (if not already installed)
    • Run az login to authenticate
    • Verify access to their Azure subscription
  3. Install Azure Developer CLI
    • Install azd CLI tool
    • Verify installation: azd version
  4. Provision Azure resources
    • Navigate to repository root
    • Run azd up to provision:
      • Azure AI Foundry hub
      • Azure AI Foundry project
      • Azure OpenAI Service (GPT-4 deployment)
      • Trail Guide Agent deployed to Foundry
    • azd creates .env file with connection details
    • Estimated time: 5-10 minutes (automated deployment)
  5. Set up Python environment
    • Create Python virtual environment: python -m venv venv
    • Activate virtual environment
    • Install dependencies: pip install -r requirements.txt
  6. Run the agent
    • Navigate to src/agents/trail_guide_agent/
    • Run: python trail_guide_agent.py
    • Interact with the agent via CLI

Why this approach:

  • Students experience full deployment workflow
  • Teaches infrastructure provisioning without manual configuration
  • Generates working environment in <15 minutes
  • Students can immediately start using the agent
  • Focus shifts to understanding agent behavior and GenAIOps concepts

Phase 2: Pre-Built Agent Code (Provided in Repo)

The repository includes ready-to-run agent code:

  1. Infrastructure as Code (/infrastructure)
    • azure.yaml: azd configuration file
    • /bicep/main.bicep: Azure resource definitions
      • AI Foundry hub
      • AI Foundry project
      • Azure OpenAI Service
      • GPT-4 deployment
    • /bicep/agent.bicep: Agent definition and deployment
  2. Agent Implementation (/src/agents/trail_guide_agent/)
    • trail_guide_agent.py: Main application file
    • system_prompt.txt: Agent persona and instructions
    • README.md: Setup and usage instructions

    Code structure (a sketch follows this list):

    • Initialize Azure AI Projects client from .env connection string
    • Load agent by ID from environment variable
    • Implement conversation loop (get input → send to agent → display response)
    • Maintain conversation thread for context
    • Handle exit commands and errors gracefully
  3. Configuration Files
    • requirements.txt: Python dependencies
      • azure-ai-projects (latest)
      • azure-identity
      • python-dotenv
    • .env.example: Template showing required variables
    • .gitignore: Ensures .env not committed
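
A minimal sketch of that code structure, assuming the preview azure-ai-projects agents API; method names vary between preview releases, so treat this as a reading aid rather than the repository's exact code.

```python
# Sketch of trail_guide_agent.py using the preview azure-ai-projects agents
# API (create_thread / create_message / create_and_process_run); exact names
# may differ in the SDK version you install.
import os

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv()  # .env generated by azd up

project = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ["AZURE_PROJECT_CONNECTION_STRING"],
)
agent_id = os.environ["AZURE_AGENT_ID"]
thread = project.agents.create_thread()  # one thread = one conversation

print("Trail Guide Agent ready. Type 'exit' or 'quit' to leave.")
while True:
    user_input = input("You: ").strip()
    if user_input.lower() in {"exit", "quit"}:
        break
    project.agents.create_message(
        thread_id=thread.id, role="user", content=user_input
    )
    run = project.agents.create_and_process_run(
        thread_id=thread.id, agent_id=agent_id
    )
    if run.status == "failed":
        print(f"Agent run failed: {run.last_error}")
        continue
    messages = project.agents.list_messages(thread_id=thread.id)
    # Newest message comes first in the preview SDK; content extraction is
    # simplified here, so see the repo code for robust handling.
    print(f"Agent: {messages.data[0].content}")
```

The thread is what preserves multi-turn context: every message is appended to the same thread, so the agent sees the whole conversation on each run.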

Students focus on:

  • Reading and understanding the pre-built code
  • Running the agent and testing different queries
  • Observing how conversation context is maintained
  • Experimenting with modifications (stretch tasks)

Phase 3: Understanding and Exploration (Student Learning)

After the agent is running, students:

  1. Code walkthrough
    • Review trail_guide_agent.py to understand:
      • How Azure AI Projects SDK is used
      • How conversation threads work
      • How agent responses are generated
    • Review system_prompt.txt to understand agent behavior
  2. Testing and interaction
    • Test various trail and gear queries
    • Verify multi-turn conversation context
    • Test edge cases (out-of-scope questions, empty input)
    • Measure response times
  3. Validation against spec
    • Check that agent meets all acceptance criteria
    • Document any gaps or unexpected behaviors

Phase 4: Optional Stretch Tasks (Advanced Students)

  1. Modify system prompt
    • Edit system_prompt.txt to change agent personality
    • Redeploy agent with new prompt: azd deploy
    • Compare response quality before/after
  2. Add conversation logging
    • Implement logging to capture interactions (a sketch follows this list)
    • Save conversations to local file
    • Prepare data for evaluation module
  3. Experiment with agent configuration
    • Modify temperature, max_tokens in agent definition
    • Observe impact on response quality
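
For stretch task 2, a sketch of local conversation logging; the file name and record shape are illustrative choices, not prescribed by this plan.

```python
# Sketch: append each exchange to a local JSONL file so the data can feed
# the later evaluation module.
import json
from datetime import datetime, timezone

def log_exchange(user_msg: str, agent_msg: str,
                 path: str = "conversations.jsonl") -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_msg,
        "agent": agent_msg,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```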

Constitution Verification

This technical plan aligns with the project constitution as follows:

Azure-Only Cloud Resources ✅

  • Requirement: All cloud resources must be hosted on Microsoft Azure
  • Compliance: Uses Azure OpenAI Service exclusively; no other cloud providers
  • Verification: No AWS, GCP, or other cloud service dependencies

Python Implementation ✅

  • Requirement: Primary language Python 3.11+
  • Compliance: Entire agent implemented in Python
  • Verification: No other programming languages used

Latest Microsoft Foundry SDK ✅

  • Requirement: Use latest Microsoft Foundry SDK (azure-ai-projects)
  • Compliance: Uses azure-ai-projects for all Azure AI interactions
  • Verification: Does not call the openai library directly or use older SDKs such as azure-ai-inference

Minimal, Lightweight Approach ✅

  • Requirement: Opt for most minimal, lightweight, fastest approach
  • Compliance:
    • CLI interface (simplest UI)
    • In-memory conversation history (no database)
    • Minimal dependencies (azure-ai-projects, azure-identity, python-dotenv only)
    • No complex infrastructure (no VNets, App Services, etc.)
  • Verification: Setup time <15 minutes; implementation <200 lines of code

Educational Purpose ✅

  • Requirement: Designed for individual learners with own Azure account
  • Compliance:
    • Runs locally in VS Code
    • No team collaboration features
    • Clear README with step-by-step setup
    • Well-commented code for learning
  • Verification: Learner can complete setup independently

Simple Authentication ✅

  • Requirement: Easy authentication for individual students
  • Compliance: Uses DefaultAzureCredential + Azure CLI (az login)
  • Verification: No service principals, managed identities, or complex auth flows required

No Secrets in Source Code ✅

  • Requirement: Store no secrets in source code
  • Compliance: All credentials via environment variables or Azure authentication
  • Verification: No hardcoded API keys, connection strings, or passwords in .py files

Bicep for Infrastructure ✅

  • Requirement: Preference for Bicep templates for infrastructure
  • Compliance: azd up provisions all resources from the Bicep templates in /infrastructure/bicep (main.bicep, agent.bicep)
  • Verification: Every Azure resource is defined in version-controlled Bicep files; no portal-based setup required

Assumptions and Open Questions

Assumptions

  1. Azure OpenAI Access: Learner has access to Azure OpenAI Service (subscription approved)
  2. GPT-4 Availability: Learner’s Azure region supports GPT-4 or GPT-4o deployment
  3. Azure CLI Installed: Learner has Azure CLI installed and configured (az login works)
  4. Python Environment: Learner can create Python virtual environments
  5. Terminal Access: Learner is comfortable running Python scripts from command line
  6. Trail Knowledge: Agent has general knowledge about hiking trails from GPT-4 training data
    • No custom knowledge base required for MVP
    • Agent may not have detailed information about all trails
  7. No RAG Required: MVP uses LLM’s built-in knowledge; vector search deferred to stretch module
  8. Token Limits: GPT-4 context window (8K or 128K) is sufficient for conversation history
  9. Response Time: Azure OpenAI responses typically <3 seconds with standard tier
  10. Single User: Agent handles one conversation at a time (no concurrency requirements)

Open Questions

  1. System Prompt Complexity
    • How detailed should trail recommendation examples be in the system prompt?
    • Should we include specific trail names/regions in the system prompt or rely on LLM knowledge?
    • Recommendation: Start minimal; add detail if responses lack specificity (an illustrative prompt follows this list)
  2. Conversation History Length
    • What’s the optimal number of exchanges to keep in context?
    • How do we handle very long conversations that exceed token limits?
    • Recommendation: Keep last 10 exchanges; summarize and reset if needed
  3. Gear Recommendations
    • Should we embed a gear catalog in the system prompt or rely on general knowledge?
    • Do we need to mention real Adventure Works products vs. generic gear?
    • Recommendation: Use generic gear for MVP; specific catalog can be added with RAG
  4. Error Recovery
    • Should the agent automatically retry failed API calls or prompt the user?
    • Recommendation: Simple retry with exponential backoff; max 3 retries
  5. Logging and Evaluation
    • Should we log conversations for later evaluation?
    • Where should logs be stored (local file, Azure Storage)?
    • Recommendation: Add optional logging to local file for stretch module on evaluation
  6. Weather Information
    • How should the agent handle weather questions given no real-time API?
    • Recommendation: Agent states it provides general seasonal guidance, not real-time weather
  7. Out-of-Scope Handling
    • Should we implement explicit intent detection or rely on LLM to decline gracefully?
    • Recommendation: Rely on system prompt instructions; LLM is capable of declining out-of-scope requests
  8. Deployment Model Selection
    • Should we specify GPT-4, GPT-4o, or allow learner to choose?
    • Recommendation: Document both options; recommend GPT-4o for speed/cost balance
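
Pulling together the recommendations from questions 1, 6, and 7, an illustrative starting point for system_prompt.txt (the repository's actual prompt may differ):

```
You are the Trail Guide Agent for Adventure Works, a friendly and
professional assistant for hiking trip planning.

- Recommend trails based on the user's stated difficulty level and location.
- Recommend gear appropriate to the planned activities and season.
- For weather questions, offer general seasonal guidance and state clearly
  that you do not have real-time weather data.
- Politely decline questions unrelated to hiking trails or gear.
```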

Technical Risks and Mitigations

Risk: Azure OpenAI Service Unavailable

  • Impact: Agent cannot function
  • Mitigation: Clear error message with troubleshooting steps; retry logic with exponential backoff
  • Fallback: None (service is required); learner checks Azure status page

Risk: DefaultAzureCredential Fails

  • Impact: Authentication errors prevent API access
  • Mitigation:
    • README includes az login verification steps
    • Document a temporary API-key fallback via environment variable (never hardcoded) if identity-based authentication cannot be configured
    • Clear error messages with setup instructions

Risk: Rate Limiting

  • Impact: Requests rejected during high usage
  • Mitigation: Implement exponential backoff retry logic
  • Educational note: Teaches learners about rate limiting and resilience patterns
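
A sketch of the backoff logic, capped at 3 retries per the recommendation in open question 4; HttpResponseError is an assumption about which exception is worth retrying, and a real implementation would check for throttling (HTTP 429) specifically.

```python
# Sketch: retry with exponential backoff (1s, 2s, 4s; max 3 retries).
import time
from azure.core.exceptions import HttpResponseError

def with_retries(call, max_retries: int = 3):
    delay = 1.0
    for attempt in range(max_retries + 1):
        try:
            return call()
        except HttpResponseError as exc:
            if attempt == max_retries:
                raise  # out of retries; surface the error to the caller
            print(f"Request failed ({exc.__class__.__name__}); "
                  f"retrying in {delay:.0f}s...")
            time.sleep(delay)
            delay *= 2

# Usage, wrapping the hypothetical agent call from the Phase 2 sketch:
# run = with_retries(lambda: project.agents.create_and_process_run(...))
```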

Risk: Poor Response Quality

  • Impact: Agent provides irrelevant or unhelpful recommendations
  • Mitigation:
    • Iteratively refine system prompt based on testing
    • Add evaluation module in stretch labs to measure quality
    • Provide examples in README of good vs. poor queries

Risk: Token Limit Exceeded

  • Impact: Conversation history causes API errors
  • Mitigation:
    • Trim old messages from history
    • Monitor token usage (add counter if needed)
    • Reset conversation gracefully with user notification
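
A sketch of the trimming approach for the in-memory history (with Foundry threads the service stores the conversation, so the equivalent there is resetting or summarizing the thread):

```python
# Sketch: keep only the most recent exchanges (the plan suggests 10) so the
# history stays within the model's context window.
MAX_EXCHANGES = 10  # one exchange = a user message plus the agent's reply

def trim_history(history: list[dict]) -> list[dict]:
    max_messages = MAX_EXCHANGES * 2
    if len(history) > max_messages:
        print("(Older messages trimmed to stay within token limits.)")
        return history[-max_messages:]
    return history
```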

Risk: Learner Configuration Mistakes

  • Impact: Agent fails to run due to incorrect setup
  • Mitigation:
    • Comprehensive README with step-by-step instructions
    • Configuration validation at startup with helpful error messages
    • Example .env.example file with placeholder values
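
A sketch of the startup validation, assuming the two variables generated by azd up:

```python
# Sketch: fail fast with actionable guidance when configuration is missing.
import os
import sys

REQUIRED_VARS = ("AZURE_PROJECT_CONNECTION_STRING", "AZURE_AGENT_ID")

def validate_config() -> None:
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        print("Missing configuration: " + ", ".join(missing))
        print("Run 'azd up' to provision resources and generate .env, or "
              "copy .env.example to .env and fill in the values.")
        sys.exit(1)
```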

Future Enhancements (Out of Current Plan)

These features are NOT implemented in the initial version but are documented for potential stretch modules:

  1. Retrieval-Augmented Generation (RAG)
    • Add Azure AI Search for vector storage of trail data
    • Embed trail descriptions and search for relevant context
    • Improves factual accuracy and detail
  2. Conversation Logging and Evaluation
    • Log conversations to local file or Azure Storage
    • Implement quality evaluators (relevance, groundedness, coherence)
    • Teaches GenAIOps evaluation practices
  3. Azure Key Vault Integration
    • Move API keys and connection strings to Key Vault
    • Demonstrates enterprise secret management
  4. Jupyter Notebook Interface
    • Alternative interface for interactive exploration
    • Better for demonstrating prompt engineering
  5. Multi-Agent Architecture
    • Separate agents for trail recommendations vs. gear recommendations
    • Demonstrates agent orchestration patterns
  6. Real-Time Weather Integration
    • Connect to weather API for current conditions
    • Requires API key management and external service integration
  7. Extended Bicep Templates
    • Extend the existing templates with additional resources (e.g., Azure AI Search for the RAG module)
    • Deepens infrastructure-as-code practice
  8. Monitoring and Observability
    • Add Azure Application Insights telemetry
    • Track token usage, response times, error rates

Success Metrics

The implementation is successful when:

  1. Functional Requirements Met:
    • ✅ Agent responds to trail and gear queries conversationally
    • ✅ Conversation context maintained across 5+ exchanges
    • ✅ Trail recommendations filtered by difficulty and location
    • ✅ Gear recommendations based on activities and weather
    • ✅ Out-of-scope queries declined gracefully
    • ✅ Response time <3 seconds average
  2. Educational Goals Achieved:
    • ✅ Learner can set up and run agent in <15 minutes
    • ✅ Code is clear and well-documented for learning
    • ✅ README provides comprehensive guidance
    • ✅ Demonstrates GenAIOps principles (prompting, evaluation, monitoring)
  3. Constitution Compliance:
    • ✅ All Azure resources; no other cloud providers
    • ✅ Python 3.11+ implementation
    • ✅ Latest Microsoft Foundry SDK used
    • ✅ Minimal, lightweight architecture
    • ✅ Simple authentication for individual learners
    • ✅ No secrets in source code
  4. Quality Indicators:
    • ✅ Agent provides helpful, relevant responses in manual testing
    • ✅ No hallucinations of non-existent trails (when trail knowledge available)
    • ✅ Conversational tone is friendly and professional
    • ✅ Error messages are clear and actionable

Implementation Checklist

Before proceeding to task generation, verify:

  • Microsoft Foundry SDK documentation reviewed
  • Azure OpenAI endpoint and deployment identified
  • Python 3.11+ environment available
  • Azure CLI installed and az login completed
  • System prompt drafted with agent persona and capabilities
  • README outline prepared with setup instructions
  • Error handling strategy defined
  • Testing approach planned (manual + acceptance criteria validation)
  • Constitution compliance verified (all checkboxes ✅ above)

This technical plan provides the foundation for generating implementation tasks and writing code. The plan prioritizes educational clarity, minimal complexity, and alignment with constitution principles.