Trail Guide Agent Technical Plan

Architecture Overview

The Trail Guide Agent is implemented as a Python command-line application that uses Azure OpenAI Service to provide conversational assistance for hiking trip planning. The architecture follows a simple, educational design optimized for individual learners.

High-Level Flow:

  1. User launches the agent from the command line
  2. Agent initializes connection to Azure OpenAI Service using the Microsoft Foundry SDK
  3. Agent displays welcome message and enters interactive loop
  4. User enters questions about trails or gear in natural language
  5. Agent maintains conversation history to preserve context
  6. Agent sends user message + conversation history to Azure OpenAI
  7. Azure OpenAI generates response based on system instructions and conversation context
  8. Agent displays response to user
  9. Loop continues until user exits (e.g., types “exit” or “quit”)
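
The loop in steps 3 through 9 is compact enough to sketch up front. In this sketch, get_agent_reply is a hypothetical placeholder; the actual Foundry SDK call is shown under Phase 2.

```python
# Minimal sketch of the interactive loop (steps 3-9). get_agent_reply is a
# hypothetical placeholder standing in for the Foundry SDK call (Phase 2).

def get_agent_reply(history: list[dict]) -> str:
    """Placeholder: send history to the deployed agent and return its reply."""
    raise NotImplementedError

def main() -> None:
    history: list[dict] = []  # in-memory context; cleared on exit
    print("Welcome to the Trail Guide Agent! Type 'exit' or 'quit' to leave.")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in {"exit", "quit"}:  # step 9: exit commands
            break
        if not user_input:
            continue  # ignore empty input
        history.append({"role": "user", "content": user_input})  # step 5
        reply = get_agent_reply(history)                          # steps 6-7
        history.append({"role": "assistant", "content": reply})
        print(f"Agent: {reply}")                                  # step 8

if __name__ == "__main__":
    main()
```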

Components:

  • Main application (trail_guide_agent.py): Entry point, conversation loop, user I/O
  • Azure AI project client: Connects to the Foundry project and handles LLM interactions via the Microsoft Foundry SDK
  • Conversation manager: Maintains message history for context
  • System prompt: Defines agent behavior, persona, and capabilities
  • Configuration: Environment variables for the Foundry project connection string and agent ID

This architecture keeps the implementation minimal—focused on demonstrating core GenAIOps concepts without unnecessary complexity.

Technology Stack and Key Decisions

Core Technologies

Programming Language: Python 3.11+

  • Rationale: Aligns with constitution requirement and is standard for AI/ML projects
  • Educational benefit: Accessible to most learners, rich ecosystem of Azure SDKs

Azure OpenAI Service: LLM provider

  • Rationale: Required by constitution (Azure-only resources)
  • Deployment model: GPT-4 or GPT-4o for high-quality conversational responses
  • Educational benefit: Industry-standard LLM service with robust documentation

Microsoft Foundry SDK (azure-ai-projects): Primary SDK

  • Rationale: Constitution mandates using latest Foundry SDK
  • Package: azure-ai-projects (latest stable version)
  • Educational benefit: Teaches modern Azure AI development patterns
  • Alternative packages NOT used: openai Python library, azure-ai-inference (older SDK)

Authentication: DefaultAzureCredential

  • Rationale: Simplifies authentication for individual learners
  • Method: Azure CLI authentication via az login
  • Educational benefit: No need to manage service principals or complex auth flows
  • Secrets: API keys avoided; uses Azure identity-based access
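
A sketch of the resulting auth code, assuming only the azure-identity package:

```python
# Sketch: identity-based access. For individual learners,
# DefaultAzureCredential resolves to the Azure CLI token obtained via
# `az login`; no API keys are read or stored anywhere.
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
# The credential is handed to the SDK client (see Phase 2); a token is
# requested lazily on the first service call.
```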

Supporting Libraries

Python Standard Library: For environment variables, command-line arguments, and type hints

  • os: Environment variable access
  • sys: Command-line argument handling
  • typing: Type hints for code clarity

No additional dependencies unless strictly necessary

  • Rationale: Minimal approach per constitution
  • Educational benefit: Reduces setup friction and troubleshooting

Infrastructure Provisioning Approach

Azure Developer CLI (azd): Primary deployment tool

  • Students provision Azure resources using azd up command
  • Rationale: Simplifies infrastructure setup while teaching modern deployment patterns
  • Benefits:
    • Provisions Azure AI Foundry hub + project
    • Deploys agent to Microsoft Foundry
    • Auto-generates .env file with connection details
    • Repeatable and version-controlled (uses Bicep under the hood)
    • Integrated VS Code experience

Alternative considered: Manual Bicep deployment

  • Decision: Rejected in favor of azd, which wraps the same Bicep templates and provides a better student experience
  • Rationale: Students learn infrastructure-as-code without Bicep complexity
  • Constitutional compliance: azd uses Bicep templates internally

Alternative considered: Azure Portal (clickops)

  • Decision: Rejected; portal-based setup is not repeatable or teachable at scale
  • Rationale: azd teaches automation while remaining simple

Configuration Approach

Environment Variables: Managed via .env file

  • .env file auto-generated by azd up during provisioning
  • Contains:
    • AZURE_PROJECT_CONNECTION_STRING: Connection to AI Foundry project
    • AZURE_AGENT_ID: ID of deployed agent in Foundry
    • Generated automatically; students do not configure these manually
  • .env file in .gitignore to prevent secret exposure
  • .env.example provided as template showing required variables
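
A minimal sketch of how the agent consumes this configuration at startup, using python-dotenv (listed later in requirements.txt):

```python
# Sketch: load the azd-generated .env and read the two expected variables.
import os
from dotenv import load_dotenv

load_dotenv()  # no-op if the variables are already set in the environment
connection_string = os.environ["AZURE_PROJECT_CONNECTION_STRING"]
agent_id = os.environ["AZURE_AGENT_ID"]
```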

Authentication: DefaultAzureCredential

  • Students authenticate via az login before running azd up
  • Same credentials used at runtime to connect to agent
  • No API keys or secrets in code or config

Data Storage

No persistent storage required for MVP

  • Conversation history: In-memory only (clears on exit)
  • Trail data: Embedded in system prompt or future knowledge base
  • Rationale: Minimizes setup complexity
  • Future enhancement: Add RAG with Azure AI Search (stretch module)

User Interface

Command-line interface (CLI)

  • Input: Standard input (keyboard)
  • Output: Standard output (terminal)
  • Rationale: Simplest possible interface; focuses learning on AI behavior, not UI
  • Educational benefit: Works on all platforms; no web framework required
  • Future enhancement: Jupyter notebook interface for interactive learning

Implementation Sequence

Student Tasks vs. Pre-Built Code

This plan distinguishes between:

  • Student tasks: Activities learners perform as part of the lab
  • Pre-built code: Agent implementation provided in the repository

The educational goal is for students to:

  1. Provision Azure infrastructure
  2. Configure their environment
  3. Run and interact with the pre-built agent
  4. Understand how it works through code review
  5. (Optional) Modify and experiment with the agent

Phase 1: Student Setup Tasks

Student performs these tasks in the lab:

  1. Clone the repository
    • Fork/clone this repository to their local machine
    • Open the repository in VS Code
  2. Authenticate with Azure
    • Install Azure CLI (if not already installed)
    • Run az login to authenticate
    • Verify access to their Azure subscription
  3. Install Azure Developer CLI
    • Install azd CLI tool
    • Verify installation: azd version
  4. Provision Azure resources
    • Navigate to repository root
    • Run azd up to provision:
      • Azure AI Foundry hub
      • Azure AI Foundry project
      • Azure OpenAI Service (GPT-4 deployment)
      • Trail Guide Agent deployed to Foundry
    • azd creates .env file with connection details
    • Estimated time: 5-10 minutes (automated deployment)
  5. Set up Python environment
    • Create Python virtual environment: python -m venv venv
    • Activate virtual environment
    • Install dependencies: pip install -r requirements.txt
  6. Run the agent
    • Navigate to src/agents/trail_guide_agent/
    • Run: python trail_guide_agent.py
    • Interact with the agent via CLI

Why this approach:

  • Students experience full deployment workflow
  • Teaches infrastructure provisioning without manual configuration
  • Generates working environment in <15 minutes
  • Students can immediately start using the agent
  • Focus shifts to understanding agent behavior and GenAIOps concepts

Phase 2: Pre-Built Agent Code (Provided in Repo)

The repository includes ready-to-run agent code:

  1. Infrastructure as Code (/infrastructure)
    • azure.yaml: azd configuration file
    • /bicep/main.bicep: Azure resource definitions
      • AI Foundry hub
      • AI Foundry project
      • Azure OpenAI Service
      • GPT-4 deployment
    • /bicep/agent.bicep: Agent definition and deployment
  2. Agent Implementation (/src/agents/trail_guide_agent/)
    • trail_guide_agent.py: Main application file
    • system_prompt.txt: Agent persona and instructions
    • README.md: Setup and usage instructions

    Code structure (a sketch follows this list):

    • Initialize Azure AI Projects client from .env connection string
    • Load agent by ID from environment variable
    • Implement conversation loop (get input → send to agent → display response)
    • Maintain conversation thread for context
    • Handle exit commands and errors gracefully
  3. Configuration Files
    • requirements.txt: Python dependencies
      • azure-ai-projects (latest)
      • azure-identity
      • python-dotenv
    • .env.example: Template showing required variables
    • .gitignore: Ensures .env not committed
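
A minimal sketch of that code structure, assuming the preview azure-ai-projects agents API; method names vary between preview releases, so treat this as a reading aid rather than the repository's exact code.

```python
# Sketch of trail_guide_agent.py using the preview azure-ai-projects agents
# API (create_thread / create_message / create_and_process_run); exact names
# may differ in the SDK version you install.
import os

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv()  # .env generated by azd up

project = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ["AZURE_PROJECT_CONNECTION_STRING"],
)
agent_id = os.environ["AZURE_AGENT_ID"]
thread = project.agents.create_thread()  # one thread = one conversation

print("Trail Guide Agent ready. Type 'exit' or 'quit' to leave.")
while True:
    user_input = input("You: ").strip()
    if user_input.lower() in {"exit", "quit"}:
        break
    project.agents.create_message(
        thread_id=thread.id, role="user", content=user_input
    )
    run = project.agents.create_and_process_run(
        thread_id=thread.id, agent_id=agent_id
    )
    if run.status == "failed":
        print(f"Agent run failed: {run.last_error}")
        continue
    messages = project.agents.list_messages(thread_id=thread.id)
    # Newest message comes first in the preview SDK; content extraction is
    # simplified here, so see the repo code for robust handling.
    print(f"Agent: {messages.data[0].content}")
```

The thread is what preserves multi-turn context: every message is appended to the same thread, so the agent sees the whole conversation on each run.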

Students focus on:

  • Reading and understanding the pre-built code
  • Running the agent and testing different queries
  • Observing how conversation context is maintained
  • Experimenting with modifications (stretch tasks)

Phase 3: Understanding and Exploration (Student Learning)

After the agent is running, students:

  1. Code walkthrough
    • Review trail_guide_agent.py to understand:
      • How Azure AI Projects SDK is used
      • How conversation threads work
      • How agent responses are generated
    • Review system_prompt.txt to understand agent behavior
  2. Testing and interaction
    • Test various trail and gear queries
    • Verify multi-turn conversation context
    • Test edge cases (out-of-scope questions, empty input)
    • Measure response times
  3. Validation against spec
    • Check that agent meets all acceptance criteria
    • Document any gaps or unexpected behaviors

Phase 4: Optional Stretch Tasks (Advanced Students)

  1. Modify system prompt
    • Edit system_prompt.txt to change agent personality
    • Redeploy agent with new prompt: azd deploy
    • Compare response quality before/after
  2. Add conversation logging
    • Implement logging to capture interactions (a sketch follows this list)
    • Save conversations to local file
    • Prepare data for evaluation module
  3. Experiment with agent configuration
    • Modify temperature, max_tokens in agent definition
    • Observe impact on response quality
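
For stretch task 2, a sketch of local conversation logging; the file name and record shape are illustrative choices, not prescribed by this plan.

```python
# Sketch: append each exchange to a local JSONL file so the data can feed
# the later evaluation module.
import json
from datetime import datetime, timezone

def log_exchange(user_msg: str, agent_msg: str,
                 path: str = "conversations.jsonl") -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_msg,
        "agent": agent_msg,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```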

Constitution Verification

This technical plan aligns with the project constitution as follows:

Azure-Only Cloud Resources ✅

  • Requirement: All cloud resources must be hosted on Microsoft Azure
  • Compliance: Uses Azure OpenAI Service exclusively; no other cloud providers
  • Verification: No AWS, GCP, or other cloud service dependencies

Python Implementation ✅

  • Requirement: Primary language Python 3.11+
  • Compliance: Entire agent implemented in Python
  • Verification: No other programming languages used

Latest Microsoft Foundry SDK ✅

  • Requirement: Use latest Microsoft Foundry SDK (azure-ai-projects)
  • Compliance: Uses azure-ai-projects for all Azure AI interactions
  • Verification: Does not call the openai library directly or use older SDKs such as azure-ai-inference

Minimal, Lightweight Approach ✅

  • Requirement: Opt for most minimal, lightweight, fastest approach
  • Compliance:
    • CLI interface (simplest UI)
    • In-memory conversation history (no database)
    • Minimal dependencies (azure-ai-projects, azure-identity, python-dotenv only)
    • No complex infrastructure (no VNets, App Services, etc.)
  • Verification: Setup time <15 minutes; implementation <200 lines of code

Educational Purpose ✅

  • Requirement: Designed for individual learners with own Azure account
  • Compliance:
    • Runs locally in VS Code
    • No team collaboration features
    • Clear README with step-by-step setup
    • Well-commented code for learning
  • Verification: Learner can complete setup independently

Simple Authentication ✅

  • Requirement: Easy authentication for individual students
  • Compliance: Uses DefaultAzureCredential + Azure CLI (az login)
  • Verification: No service principals, managed identities, or complex auth flows required

No Secrets in Source Code ✅

  • Requirement: Store no secrets in source code
  • Compliance: All credentials via environment variables or Azure authentication
  • Verification: No hardcoded API keys, connection strings, or passwords in .py files

Bicep for Infrastructure ✅

  • Requirement: Preference for Bicep templates for infrastructure
  • Compliance: azd up provisions all resources from the Bicep templates in /infrastructure/bicep (main.bicep, agent.bicep)
  • Verification: Every Azure resource is defined in version-controlled Bicep files; no portal-based setup required

Assumptions and Open Questions

Assumptions

  1. Azure OpenAI Access: Learner has access to Azure OpenAI Service (subscription approved)
  2. GPT-4 Availability: Learner’s Azure region supports GPT-4 or GPT-4o deployment
  3. Azure CLI Installed: Learner has Azure CLI installed and configured (az login works)
  4. Python Environment: Learner can create Python virtual environments
  5. Terminal Access: Learner is comfortable running Python scripts from command line
  6. Trail Knowledge: Agent has general knowledge about hiking trails from GPT-4 training data
    • No custom knowledge base required for MVP
    • Agent may not have detailed information about all trails
  7. No RAG Required: MVP uses LLM’s built-in knowledge; vector search deferred to stretch module
  8. Token Limits: GPT-4 context window (8K or 128K) is sufficient for conversation history
  9. Response Time: Azure OpenAI responses typically <3 seconds with standard tier
  10. Single User: Agent handles one conversation at a time (no concurrency requirements)

Open Questions

  1. System Prompt Complexity
    • How detailed should trail recommendation examples be in the system prompt?
    • Should we include specific trail names/regions in the system prompt or rely on LLM knowledge?
    • Recommendation: Start minimal; add detail if responses lack specificity (an illustrative prompt follows this list)
  2. Conversation History Length
    • What’s the optimal number of exchanges to keep in context?
    • How do we handle very long conversations that exceed token limits?
    • Recommendation: Keep last 10 exchanges; summarize and reset if needed
  3. Gear Recommendations
    • Should we embed a gear catalog in the system prompt or rely on general knowledge?
    • Do we need to mention real Adventure Works products vs. generic gear?
    • Recommendation: Use generic gear for MVP; specific catalog can be added with RAG
  4. Error Recovery
    • Should the agent automatically retry failed API calls or prompt the user?
    • Recommendation: Simple retry with exponential backoff; max 3 retries
  5. Logging and Evaluation
    • Should we log conversations for later evaluation?
    • Where should logs be stored (local file, Azure Storage)?
    • Recommendation: Add optional logging to local file for stretch module on evaluation
  6. Weather Information
    • How should the agent handle weather questions given no real-time API?
    • Recommendation: Agent states it provides general seasonal guidance, not real-time weather
  7. Out-of-Scope Handling
    • Should we implement explicit intent detection or rely on LLM to decline gracefully?
    • Recommendation: Rely on system prompt instructions; LLM is capable of declining out-of-scope requests
  8. Deployment Model Selection
    • Should we specify GPT-4, GPT-4o, or allow learner to choose?
    • Recommendation: Document both options; recommend GPT-4o for speed/cost balance
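
Pulling together the recommendations from questions 1, 6, and 7, an illustrative starting point for system_prompt.txt (the repository's actual prompt may differ):

```
You are the Trail Guide Agent for Adventure Works, a friendly and
professional assistant for hiking trip planning.

- Recommend trails based on the user's stated difficulty level and location.
- Recommend gear appropriate to the planned activities and season.
- For weather questions, offer general seasonal guidance and state clearly
  that you do not have real-time weather data.
- Politely decline questions unrelated to hiking trails or gear.
```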

Technical Risks and Mitigations

Risk: Azure OpenAI Service Unavailable

  • Impact: Agent cannot function
  • Mitigation: Clear error message with troubleshooting steps; retry logic with exponential backoff
  • Fallback: None (service is required); learner checks Azure status page

Risk: DefaultAzureCredential Fails

  • Impact: Authentication errors prevent API access
  • Mitigation:
    • README includes az login verification steps
    • Document a temporary API-key fallback via environment variable (never hardcoded) if identity-based authentication cannot be configured
    • Clear error messages with setup instructions

Risk: Rate Limiting

  • Impact: Requests rejected during high usage
  • Mitigation: Implement exponential backoff retry logic
  • Educational note: Teaches learners about rate limiting and resilience patterns
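
A sketch of the backoff logic, capped at 3 retries per the recommendation in open question 4; HttpResponseError is an assumption about which exception is worth retrying, and a real implementation would check for throttling (HTTP 429) specifically.

```python
# Sketch: retry with exponential backoff (1s, 2s, 4s; max 3 retries).
import time
from azure.core.exceptions import HttpResponseError

def with_retries(call, max_retries: int = 3):
    delay = 1.0
    for attempt in range(max_retries + 1):
        try:
            return call()
        except HttpResponseError as exc:
            if attempt == max_retries:
                raise  # out of retries; surface the error to the caller
            print(f"Request failed ({exc.__class__.__name__}); "
                  f"retrying in {delay:.0f}s...")
            time.sleep(delay)
            delay *= 2

# Usage, wrapping the hypothetical agent call from the Phase 2 sketch:
# run = with_retries(lambda: project.agents.create_and_process_run(...))
```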

Risk: Poor Response Quality

  • Impact: Agent provides irrelevant or unhelpful recommendations
  • Mitigation:
    • Iteratively refine system prompt based on testing
    • Add evaluation module in stretch labs to measure quality
    • Provide examples in README of good vs. poor queries

Risk: Token Limit Exceeded

  • Impact: Conversation history causes API errors
  • Mitigation:
    • Trim old messages from history
    • Monitor token usage (add counter if needed)
    • Reset conversation gracefully with user notification
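
A sketch of the trimming approach for the in-memory history (with Foundry threads the service stores the conversation, so the equivalent there is resetting or summarizing the thread):

```python
# Sketch: keep only the most recent exchanges (the plan suggests 10) so the
# history stays within the model's context window.
MAX_EXCHANGES = 10  # one exchange = a user message plus the agent's reply

def trim_history(history: list[dict]) -> list[dict]:
    max_messages = MAX_EXCHANGES * 2
    if len(history) > max_messages:
        print("(Older messages trimmed to stay within token limits.)")
        return history[-max_messages:]
    return history
```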

Risk: Learner Configuration Mistakes

  • Impact: Agent fails to run due to incorrect setup
  • Mitigation:
    • Comprehensive README with step-by-step instructions
    • Configuration validation at startup with helpful error messages
    • Example .env.example file with placeholder values
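
A sketch of the startup validation, assuming the two variables generated by azd up:

```python
# Sketch: fail fast with actionable guidance when configuration is missing.
import os
import sys

REQUIRED_VARS = ("AZURE_PROJECT_CONNECTION_STRING", "AZURE_AGENT_ID")

def validate_config() -> None:
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        print("Missing configuration: " + ", ".join(missing))
        print("Run 'azd up' to provision resources and generate .env, or "
              "copy .env.example to .env and fill in the values.")
        sys.exit(1)
```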

Future Enhancements (Out of Current Plan)

These features are NOT implemented in the initial version but are documented for potential stretch modules:

  1. Retrieval-Augmented Generation (RAG)
    • Add Azure AI Search for vector storage of trail data
    • Embed trail descriptions and search for relevant context
    • Improves factual accuracy and detail
  2. Conversation Logging and Evaluation
    • Log conversations to local file or Azure Storage
    • Implement quality evaluators (relevance, groundedness, coherence)
    • Teaches GenAIOps evaluation practices
  3. Azure Key Vault Integration
    • Move API keys and connection strings to Key Vault
    • Demonstrates enterprise secret management
  4. Jupyter Notebook Interface
    • Alternative interface for interactive exploration
    • Better for demonstrating prompt engineering
  5. Multi-Agent Architecture
    • Separate agents for trail recommendations vs. gear recommendations
    • Demonstrates agent orchestration patterns
  6. Real-Time Weather Integration
    • Connect to weather API for current conditions
    • Requires API key management and external service integration
  7. Extended Bicep Templates
    • Extend the existing templates with additional resources (e.g., Azure AI Search for the RAG module)
    • Deepens infrastructure-as-code practice
  8. Monitoring and Observability
    • Add Azure Application Insights telemetry
    • Track token usage, response times, error rates

Success Metrics

The implementation is successful when:

  1. Functional Requirements Met:
    • ✅ Agent responds to trail and gear queries conversationally
    • ✅ Conversation context maintained across 5+ exchanges
    • ✅ Trail recommendations filtered by difficulty and location
    • ✅ Gear recommendations based on activities and weather
    • ✅ Out-of-scope queries declined gracefully
    • ✅ Response time <3 seconds average
  2. Educational Goals Achieved:
    • ✅ Learner can set up and run agent in <15 minutes
    • ✅ Code is clear and well-documented for learning
    • ✅ README provides comprehensive guidance
    • ✅ Demonstrates GenAIOps principles (prompting, evaluation, monitoring)
  3. Constitution Compliance:
    • ✅ All Azure resources; no other cloud providers
    • ✅ Python 3.11+ implementation
    • ✅ Latest Microsoft Foundry SDK used
    • ✅ Minimal, lightweight architecture
    • ✅ Simple authentication for individual learners
    • ✅ No secrets in source code
  4. Quality Indicators:
    • ✅ Agent provides helpful, relevant responses in manual testing
    • ✅ No hallucinations of non-existent trails (when trail knowledge available)
    • ✅ Conversational tone is friendly and professional
    • ✅ Error messages are clear and actionable

Implementation Checklist

Before proceeding to task generation, verify:

  • Microsoft Foundry SDK documentation reviewed
  • Azure OpenAI endpoint and deployment identified
  • Python 3.11+ environment available
  • Azure CLI installed and az login completed
  • System prompt drafted with agent persona and capabilities
  • README outline prepared with setup instructions
  • Error handling strategy defined
  • Testing approach planned (manual + acceptance criteria validation)
  • Constitution compliance verified (all checkboxes ✅ above)

This technical plan provides the foundation for generating implementation tasks and writing code. The plan prioritizes educational clarity, minimal complexity, and alignment with constitution principles.