This app is designed as a learning aid for anyone seeking to become familiar with Generative AI apps and agents. It's modeled on the user interface of the Microsoft Foundry portal, but it does not use any Azure cloud services.
The app uses AI language models in both GPU and CPU modes. In GPU mode, it runs Microsoft's Phi-3-mini via WebLLM (which requires a GPU and the WebGPU API). In CPU mode, it runs SmolLM2 via wllama/WebAssembly (which works on any device). If WebLLM fails to load, the app automatically falls back to CPU mode. You can switch between modes using the Model dropdown.
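The GPU-to-CPU fallback described above can be sketched as follows. The helper names (`pickMode`, `initEngine`) and the loader callbacks are hypothetical illustrations, not the app's actual code; in a browser, WebGPU availability would be detected via `"gpu" in navigator`.

```javascript
// Sketch of the GPU → CPU fallback logic (hypothetical helper names).
// Decide which inference backend to prefer based on WebGPU availability.
function pickMode(hasWebGPU) {
  // Phi-3-mini via WebLLM needs the WebGPU API; SmolLM2 via
  // wllama/WebAssembly runs on any device.
  return hasWebGPU ? "gpu" : "cpu";
}

// Try the GPU engine first; if it fails to load, fall back to CPU.
// loadWebLLM / loadWllama stand in for the real engine loaders.
async function initEngine(hasWebGPU, loadWebLLM, loadWllama) {
  if (pickMode(hasWebGPU) === "gpu") {
    try {
      return { mode: "gpu", engine: await loadWebLLM() }; // Phi-3-mini
    } catch {
      // WebLLM failed to load: fall through to CPU mode automatically.
    }
  }
  return { mode: "cpu", engine: await loadWllama() }; // SmolLM2
}
```

In the real app, the Model dropdown would simply override the detected mode before `initEngine` runs.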
The app also uses the MobileNetV3 model and the TensorFlow.js framework to implement image classification.
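MobileNet-style classifiers return a ranked list of label/probability pairs. The sketch below shows how such output might be post-processed into a single description; the `describeImage` helper and the `minProbability` threshold are illustrative assumptions, and the browser-only `classify` call is shown only in a comment.

```javascript
// In the browser, classification with @tensorflow-models/mobilenet
// would look roughly like (not run here):
//   const model = await mobilenet.load();
//   const predictions = await model.classify(imgElement);
//
// `classify` resolves to an array like:
//   [{ className: "tabby, tabby cat", probability: 0.82 }, ...]

// Pick the best label, or report uncertainty below a confidence threshold.
function describeImage(predictions, minProbability = 0.2) {
  if (!predictions.length) return "No prediction";
  const best = predictions.reduce((a, b) =>
    b.probability > a.probability ? b : a
  );
  return best.probability >= minProbability
    ? `${best.className} (${Math.round(best.probability * 100)}%)`
    : "Not confident enough to classify this image";
}
```

A threshold like this is one way to surface the "may not accurately identify all images" caveat noted under Known issues.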
Known issues
- The initial download of the Microsoft Phi model may take a few minutes - particularly on low-bandwidth connections. Subsequent downloads should be quicker.
- Some GPU-enabled computers (particularly those with ARM-based processors) do not support WebGPU without enabling the Unsafe WebGPU Support browser flag. If your browser fails to load the Microsoft Phi model, you can try enabling this flag at edge://flags on Microsoft Edge or chrome://flags on Google Chrome. Disable it again when finished!
- Response times in CPU mode (SmolLM2) are slower than in GPU mode (Phi-3-mini). SmolLM2 is a smaller, 360M-parameter model optimized for in-browser performance. Use the Model dropdown to select your preferred mode.
- Image analysis is based on MobileNetV3, which is a general-purpose image classification model. It may not accurately identify all images or provide detailed descriptions.
- Voice mode depends on browser support for speech recognition. Speech recognition fails with a network error in Microsoft Edge on ARM hardware.
- Voice mode performance may vary based on the device's microphone quality and ambient noise levels. For best results, use a good quality microphone in a quiet environment.
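Browser support for speech recognition can be detected before enabling Voice mode. The sketch below is a hypothetical helper, not the app's actual code; the `win` parameter stands in for the browser `window` object so the check can be exercised outside a browser.

```javascript
// The Web Speech API exposes SpeechRecognition (standard name) or
// webkitSpeechRecognition (Chromium-based browsers such as Edge/Chrome).
function getSpeechRecognition(win) {
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// Usage in the app (browser only):
//   const Recognition = getSpeechRecognition(window);
//   if (!Recognition) { /* hide or disable Voice mode */ }
//   else { const recognizer = new Recognition(); recognizer.start(); }
```

Note that a constructor being present does not guarantee recognition will succeed; as mentioned above, Edge on ARM hardware can still fail at runtime with a network error.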