Potato Assistant is a modern, voice-enabled AI chat interface built with Rust. It mimics the layout of professional tools like Gemini or ChatGPT but runs natively on your desktop with high performance and low resource usage.
The goal of this project is to provide a seamless Voice-to-Voice and Text-to-Text experience, bridging the gap between local desktop environments and Large Language Models (LLMs).
- 🗣️ Full Voice Interaction:
  - Speech-to-Text (STT): Speak naturally to the assistant.
  - Text-to-Speech (TTS): The AI answers back with audio.
- 💬 Modern Chat Interface:
  - Split View Layout: Sidebar for conversation history (left) and main chat area (right).
  - Streaming Responses: Watch the AI's answer appear character by character in real-time.
- 🧠 AI Backend: Designed to connect with major providers (OpenAI, Anthropic) or local models (Ollama).
- 🚀 Native Performance: Built in Rust for blazing fast startup and minimal memory footprint compared to Electron apps.
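As a rough illustration of the streaming feature listed above, the sketch below extracts text payloads from Server-Sent Events framing using only the standard library. The function name `sse_data_payloads` is hypothetical, and the `[DONE]` end marker is an assumption based on OpenAI-style streaming APIs; other providers may signal completion differently. Real payloads are usually JSON and would need further decoding.

```rust
/// Extracts the payload of every `data:` line in a block of SSE text.
/// Hypothetical helper; assumes an OpenAI-style `[DONE]` line ends the stream.
fn sse_data_payloads(raw: &str) -> Vec<&str> {
    let mut payloads = Vec::new();
    for line in raw.lines() {
        // Blank lines separate events and lines starting with ':' are comments;
        // neither starts with "data:", so both are skipped here.
        let Some(rest) = line.strip_prefix("data:") else {
            continue;
        };
        // The SSE format allows one optional space after the colon.
        let payload = rest.strip_prefix(' ').unwrap_or(rest);
        if payload == "[DONE]" {
            break;
        }
        payloads.push(payload);
    }
    payloads
}
```

Each returned payload can be appended to the chat view as soon as it arrives, which is what produces the incremental display.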
This project relies on established Rust crates for multimedia and async tasks:
- GUI: iced (Model-View-Update architecture).
- Network: reqwest for handling API streams.
- Audio Input: cpal for low-level microphone access.
- Audio Output: rodio for playing AI responses.
- Async Runtime: tokio.
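To make the Model-View-Update idea mentioned above concrete, here is a minimal sketch in plain Rust. It deliberately does not use the iced API (whose trait and function signatures vary between versions); the `Chat` and `Message` types and the `update` and `view` functions are all illustrative names, not the project's actual code.

```rust
/// Application state (the "Model"). Hypothetical fields for illustration.
#[derive(Debug, Default)]
struct Chat {
    draft: String,
    messages: Vec<String>,
}

/// Events the UI or background tasks can emit.
#[derive(Debug, Clone)]
enum Message {
    DraftChanged(String),
    Send,
    TokenReceived(String),
}

/// The "Update" step: the only place where state is mutated.
fn update(state: &mut Chat, message: Message) {
    match message {
        Message::DraftChanged(text) => state.draft = text,
        Message::Send => {
            if !state.draft.trim().is_empty() {
                // Move the draft into the history and open an empty AI reply.
                let text = std::mem::take(&mut state.draft);
                state.messages.push(format!("You: {text}"));
                state.messages.push(String::from("AI: "));
            }
        }
        Message::TokenReceived(token) => {
            // Streamed tokens extend the most recent message.
            if let Some(last) = state.messages.last_mut() {
                last.push_str(&token);
            }
        }
    }
}

/// The "View" step: renders the state without changing it.
fn view(state: &Chat) -> String {
    state.messages.join("\n")
}
```

Keeping all mutation inside `update` means streamed tokens and user input go through the same path, so the view only ever has to re-render the current state.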
Since this project handles Audio I/O, you need specific system libraries installed.
On Arch Linux / Manjaro:
    sudo pacman -S alsa-lib openssl
On Ubuntu / Debian:
    sudo apt install libasound2-dev libssl-dev pkg-config
To use the AI features, you need to set up your API keys.
- Create a .env file in the root directory:
    touch .env
- Add your API Key (Example for OpenAI or local URL):
    AI_API_KEY=sk-your-api-key-here
    # AI_MODEL=gpt-4o
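For readers curious how such a file might be consumed, the following is a minimal, dependency-free sketch of parsing `KEY=VALUE` lines. The helper names `parse_env` and `model_or_default` are hypothetical, and treating `gpt-4o` as the fallback model is an assumption drawn from the commented-out line in the example; the project itself may load configuration differently.

```rust
use std::collections::HashMap;

/// Parses `KEY=VALUE` lines, skipping blank lines and `#` comments.
/// Hypothetical helper; does not handle quoting or `export` prefixes.
fn parse_env(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Returns the configured model, or an assumed default when AI_MODEL is unset.
fn model_or_default(vars: &HashMap<String, String>) -> &str {
    vars.get("AI_MODEL").map(String::as_str).unwrap_or("gpt-4o")
}
```

With the example file above, `AI_API_KEY` is read and the commented `AI_MODEL` line is ignored, so the fallback applies until the line is uncommented.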
- Clone the project:
    git clone https://github.com/your-username/potato_assistant.git
    cd potato_assistant
- Run in Release Mode: For the best audio performance and UI smoothness, always run in release mode.
    cargo run --release
The application manages two heavy asynchronous streams simultaneously without blocking the UI. The source tree is organized as follows:
src/
├── main.rs               # UI Entry point (Iced Application).
├── audio/
│   ├── microphone.rs     # Handles cpal input stream.
│   └── speaker.rs        # Handles rodio output queue.
├── api/
│   └── client.rs         # Manages HTTP requests and SSE (Server-Sent Events) streaming.
└── ui/
    ├── chat.rs           # The main chat view component.
    └── sidebar.rs        # The history sidebar component.
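The idea of running two background streams without blocking the UI can be sketched with standard-library threads and a channel. This is a simplified stand-in, not the project's code: it swaps tokio tasks for `std::thread`, fakes the microphone and network sources with fixed data, and uses hypothetical names (`Event`, `spawn_workers`, `poll_events`).

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;

/// Events that background workers send to the UI thread.
#[derive(Debug, PartialEq)]
enum Event {
    AudioFrame(Vec<f32>),
    Token(String),
}

/// Spawns two stand-in workers and returns the receiving end of their shared channel.
fn spawn_workers() -> Receiver<Event> {
    let (tx, rx) = mpsc::channel();
    let audio_tx = tx.clone();
    thread::spawn(move || {
        // Stand-in for microphone capture: three silent frames.
        for _ in 0..3 {
            let _ = audio_tx.send(Event::AudioFrame(vec![0.0; 4]));
        }
    });
    thread::spawn(move || {
        // Stand-in for a streamed API response.
        for token in ["Hel", "lo"] {
            let _ = tx.send(Event::Token(token.to_string()));
        }
    });
    rx
}

/// Drains every event that is ready right now and never waits.
/// Returns false once all workers have finished and the channel is empty.
fn poll_events(rx: &Receiver<Event>, out: &mut Vec<Event>) -> bool {
    loop {
        match rx.try_recv() {
            Ok(event) => out.push(event),
            Err(TryRecvError::Empty) => return true,
            Err(TryRecvError::Disconnected) => return false,
        }
    }
}
```

A UI loop would call `poll_events` once per frame and then redraw; because `try_recv` returns immediately, a slow network response or a quiet microphone never stalls rendering.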
We welcome "Potato" enthusiasts! If you want to improve the voice detection or add support for more local AI models:
- Fork the project.
- Create your feature branch (git checkout -b feature/BetterVoice).
- Commit your changes.
- Push to the branch.
- Open a Pull Request.
Because even a Potato can be smart with enough Rust. 🥔