GPU mode uses WebLLM with the Phi-3-mini model, which requires WebGPU support in
your browser for fast, GPU-accelerated inference.
CPU mode uses wllama with Phi-2 when GPU mode is unavailable, or when you prefer an option with
broader browser compatibility. Basic mode skips model inference entirely and returns matching
content directly from the built-in knowledge base.
If you are using a Windows computer with an ARM64 CPU (such as a Snapdragon processor) that has an
integrated GPU, you may need to explicitly enable WebGPU support in your browser's settings:
- In Microsoft Edge, go to `edge://flags`, search for "WebGPU", and enable the relevant flag.
- In Google Chrome, go to `chrome://flags`, search for "WebGPU", and enable the relevant flag.

Remember to disable the flag again when you're finished!
If the GPU and CPU models are both unavailable, the app automatically switches to Basic mode so you
can still search the built-in AI content.
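The fallback order described above (GPU, then CPU, then Basic) can be sketched as a small capability check. This is an illustrative sketch only: the function and mode names are hypothetical, not the app's actual API.

```typescript
// Hypothetical sketch of the GPU -> CPU -> Basic fallback; names are illustrative.
type Mode = "gpu" | "cpu" | "basic";

// Pick the best available mode given the detected browser capabilities.
function selectMode(hasWebGPU: boolean, hasWasm: boolean): Mode {
  if (hasWebGPU) return "gpu"; // WebLLM with Phi-3-mini
  if (hasWasm) return "cpu";   // wllama with Phi-2
  return "basic";              // knowledge-base lookup only, no inference
}

// In a browser, the capabilities would be detected roughly like this:
//   const hasWebGPU = "gpu" in navigator;
//   const hasWasm = typeof WebAssembly === "object";

console.log(selectMode(true, true));   // "gpu"
console.log(selectMode(false, true));  // "cpu"
console.log(selectMode(false, false)); // "basic"
```

Note that `navigator.gpu` being present does not guarantee a usable adapter; a robust check would also request an adapter and fall back if none is returned.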