Local-first · No cloud account · MIT licensed

A voice assistant that lives on your desktop —
not in the cloud.

Voxa is a frameless, always-on-top orb you tap to start a realtime voice conversation. It searches and saves your notes, calls tools from a local connector harness, and can even build its own connectors by voice. Your config, your notes, and your keys stay on your machine.

  • 🪟 Windows
  • 🍎 macOS
  • 🐧 Linux
  • ⚙️ Tauri v2
  • 🔑 Bring your own model


What it is

Tap. Talk. It talks back.

Voxa sits quietly in a corner of your screen as a small glowing orb with a dock panel. Tap it and you're in an open-mic, low-latency voice session — with barge-in, so you can interrupt it mid-sentence like a real conversation.

The model can call tools, and those tools come from a small local connector harness that runs alongside the orb. Out of the box you get a local Markdown brain — search, read, and save your own notes by voice, offline, with no API key — plus a whole shelf of connectors.

There is no proprietary backend. Ever.

The Voxa dock — the glowing orb next to its status panel reading “Idle — tap the orb to start”
The dock: orb + status panel. One tap to start a session, one gear for everything else.

Features

Everything on board. Nothing phoning home.

🗣️

Realtime voice

Open-mic, low-latency, barge-in. Choose Gemini Live, OpenAI Realtime, or a local daemon you run yourself — switchable in Settings.

🧠

Local brain

A folder of .md files Voxa can search / read / save by voice — offline, no API key. Point it at an Obsidian vault if you like.

🔌

Connectors

Weather, web search, crypto, GitHub, Hacker News, Wikipedia, timers, lists, Spotify and more. Each one is a single small ES module — no build step.

🛠️

Self-extension

Say “build me a connector for the OpenWeather API” and the forge writes one from a declarative, data-only spec — safely, with SSRF guards.

🎭

Personas & skins

Pick a built-in “soul”, edit its system prompt, and theme the orb with 10 skins × 8 palettes — live, by voice, or in config.

🪟

Cross-platform

One Tauri v2 app, built on Windows, macOS, and Linux in CI on every push. The harness is plain Node and runs anywhere Node 18+ runs.

Space-grade privacy.

Your voice goes to the model provider you chose — or never leaves the machine at all with a local daemon. Notes, keys, and config live in local files you can open in any editor.

Ambient footage: “Stars of Cepheus”, NASA / JPL-Caltech Spitzer Space Telescope

How it works

Your machine. A tiny HTTP contract. That's the whole stack.

The orb reads a local voxa-config.json and talks to the harness over GET/POST /api/voice/tools. Any server speaking that contract is a valid tool source — point Voxa at your own backend if you want.

🙂 You tap & talk
Your machine (no backend):
  🔮 Orb (Tauri): reads voxa-config.json
  🧰 Connector harness: :3010 · loopback only
    🧠 memory: .md notes
    ⛅ weather · 🔎 search · 🐙 github …
    🛠 forge: build-agent
Realtime model: Gemini Live · OpenAI · local daemon
Links: voice / audio (orb ↔ realtime model) · tool calls over GET/POST /api/voice/tools (orb ↔ harness)
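
As a rough illustration of the contract from the caller's side, the sketch below lists the tools and runs one using Node's global fetch. Only the route and the { result } / { error } reply shape come from this page. The { name, args } request body, the bare-array list response, and the helper names are assumptions made for the example.

```javascript
// Sketch of a client for the tool contract. Assumptions (not confirmed by
// this page): the POST body is { name, args } and GET returns a JSON array.
const DEFAULT_HARNESS = "http://localhost:3010"; // harness address used on this page

// Build the URL and fetch options for running one tool (pure, easy to test).
function buildRunRequest(baseUrl, name, args = {}) {
  return {
    url: new URL("/api/voice/tools", baseUrl).href,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ name, args }),
    },
  };
}

// Ask the tool source which tools it offers.
async function listTools(baseUrl = DEFAULT_HARNESS) {
  const res = await fetch(new URL("/api/voice/tools", baseUrl), {
    signal: AbortSignal.timeout(5000), // give up after 5 s
  });
  if (!res.ok) throw new Error(`Tool source replied ${res.status}`);
  return res.json();
}

// Run one tool and return whatever JSON the tool source sends back.
async function runTool(baseUrl, name, args) {
  const { url, init } = buildRunRequest(baseUrl, name, args);
  const res = await fetch(url, { ...init, signal: AbortSignal.timeout(5000) });
  return res.json(); // expected: { result } or { error }
}
```

With the harness running, `runTool(DEFAULT_HARNESS, "dice_roll", { sides: 20 })` would be one way to exercise it, provided the body shape assumed here matches what the harness expects.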

Get started

From clone to conversation in three steps.

Prerequisites: Rust (stable) and Node 18+. On Linux, also install libwebkit2gtk-4.1-dev, libappindicator3-dev, librsvg2-dev, and patchelf.

Get Voxa on GitHub

  1. Build & run the orb

    git clone https://github.com/przemekzur/voxa.git
    cd voxa/packages/orb
    npm install
    npm run tauri dev    # dev build (or: npm run tauri build)

    That's it — the orb auto-starts the connector harness (tools + memory brain, all connectors enabled) and stops it when you quit. Prefer to run it yourself? cd packages/harness && npm start — the orb detects it and won't start a second one.

  2. Paste a key, allow the mic

    Tap the orb and paste a Gemini API key when prompted — it's stored locally, never in the config file.

    💡 Gemini Live is free to use in Google AI Studio: create a key at no cost and the realtime voice model runs on the free tier. No billing setup required.

  3. Talk

    “Remember that the standup moved to 10am.”
    Saved to your notes. 📝
    “…what time is standup?”
    Your standup is at 10am. You moved it earlier today.

Voxa Settings window: voice provider, model and voice, API keys, persona editor, brain folder, and connector harness URL
Settings: provider, voice, keys, persona, brain folder, harness URL — saved to voxa-config.json.

Configure it

One gear. Everything behind it.

Everything is one tap from the orb's gear:

  • Skins, palettes, layouts live right in the panel.
  • ⚙ Settings… opens voice provider (Gemini / OpenAI / local daemon), voice model & voice, API keys, persona, and the brain folder — with an Open folder button.
  • 🔌 Connectors… opens the connector manager as an app window (also at http://localhost:3010). Every connector ships enabled by default; Enable all / Disable all flips the whole set.
  • Settings are written to voxa-config.json in your app-data dir — plain JSON you can edit by hand.

Using the agent

Just say it.

No wake-word ceremony, no app switching. Tap the orb and speak; the model picks the right tool. Try saying:

🎙️ “Remember that the standup moved to 10am.”

🧠 Memory

“Remember the wifi password is hunter2.” · “Search my notes for the tax deadline.” · “Read my note about the Berlin trip.”

Notes are Markdown files in your brain folder — greppable, syncable, yours.

⏱️ Timers & lists

“Set a pasta timer for 9 minutes.” · “Add oat milk to the shopping list.” · “What's on my list?”

Quick utilities that keep working while you keep talking.

🌍 The world

“What's the weather in Kraków tomorrow?” · “What's Bitcoin at?” · “What's on the Hacker News front page?”

Weather, web search, crypto, currency, GitHub, Wikipedia, news — one connector each.

🎨 Appearance

“Set skin to reactor.” · “Use the ice palette.”

The orb restyles itself live, mid-conversation.

🛠️ Self-extension

“Build me a connector for the OpenWeather API.”

The forge writes a new connector from a declarative HTTP spec. Tap the orb to reload — the new tools are live.

🎭 Personas

Pick or edit a “soul” in Settings — each has a system prompt and a recommended voice.

Your edits are saved per-persona, so experiments never wreck the preset.

Connectors

Three ways to teach Voxa a new trick.

A connector gives Voxa new voice-callable tools. It's a single ES module — no build step, no dependencies beyond Node's stdlib and global fetch. The harness auto-discovers anything in packages/harness/connectors/<id>/index.mjs.
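
Because a connector relies only on Node's stdlib and global fetch, an outbound call can stay very small. The sketch below shows one hypothetical action that fetches JSON with a timeout and maps failures to a short spoken error. The endpoint, the "quote" connector id, and the toToolError helper are placeholders invented for this example, not part of Voxa.

```javascript
// Hypothetical connector action showing an outbound call with a timeout.
// The endpoint below is a placeholder, not a real API.
const QUOTE_URL = "https://example.com/api/quote";

// Turn a thrown error into the short, speakable { error } shape.
function toToolError(err) {
  if (err && err.name === "TimeoutError") {
    return { error: "The service took too long to answer." };
  }
  const detail = err && err.message ? err.message : "unknown error";
  return { error: `Request failed: ${detail}` };
}

const quoteAction = {
  name: "quote_today", // prefixed with the (hypothetical) connector id "quote"
  description: "Fetch today's quote. Use when the user asks for a quote of the day.",
  parameters: { type: "object", properties: {} },
  async handler() {
    try {
      // AbortSignal.timeout aborts the fetch with a TimeoutError after 4 s.
      const res = await fetch(QUOTE_URL, { signal: AbortSignal.timeout(4000) });
      if (!res.ok) return { error: `The service replied with status ${res.status}.` };
      const data = await res.json();
      return { result: String(data.text ?? "No quote available.") };
    } catch (err) {
      return toToolError(err);
    }
  },
};
```

In a real connector file, an action like this would sit in the actions array of the default-exported manifest.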

🛠️ The forge

Enable the forge connector and say:

“Build me a connector for the OpenWeather API.”

Forge writes a new connector from a declarative, data-only HTTP spec — it never executes model-written code, and it's guarded against SSRF. Great for any public REST API.

Tap the orb to reload, and the new tools are live.
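
To make "declarative, data-only" and "guarded against SSRF" more concrete, here is a sketch of what such a spec and a basic URL guard could look like. The forge's real spec format and guard logic are not documented on this page, so every field name and the isUrlAllowed helper are assumptions. The guard inspects only the literal hostname; a production guard would also need to resolve DNS and re-check the resulting addresses.

```javascript
// Hypothetical shape of a declarative, data-only connector spec. The real
// forge format is not documented here; every field name is an assumption.
const exampleSpec = {
  id: "catfacts",
  action: "catfacts_random",
  method: "GET",
  url: "https://example.com/api/fact",
};

// Reject URLs that point at loopback, link-local, or private IPv4 ranges.
// Only the literal hostname is checked; DNS rebinding is out of scope here.
function isUrlAllowed(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return false; // IPv6 literals: refused in this sketch
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    if (a === 0 || a === 10 || a === 127) return false; // this-host, private, loopback
    if (a === 169 && b === 254) return false;           // link-local
    if (a === 172 && b >= 16 && b <= 31) return false;  // private
    if (a === 192 && b === 168) return false;           // private
  }
  return true;
}
```

Because the spec is plain data, a guard like this can run before any request is made, which is the property that lets a forge avoid executing model-written code.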

⌨️ A connector is one file

Default-export a manifest with actions. Drop it in, hit Reload in the harness UI (orb gear → 🔌 Connectors…, or localhost:3010), fill in any config, and Test. Tools appear on the orb's next session.

Rules that matter

  1. Tool names are global — prefix every action name with the connector id.
  2. description is the model's only guide — write it for an LLM deciding when to call.
  3. parameters is flat JSON Schema — typed; avoid $ref/anyOf.
  4. handler returns { result } or { error }; result is a short string the model reads aloud.
  5. Secrets go in config with secret: true — stored server-side, never sent to the browser.
  6. Never block forever — wrap outbound calls in AbortSignal.timeout(ms).
// packages/harness/connectors/dice/index.mjs
export default {
  id: "dice",
  name: "Dice",
  description: "Roll dice.",
  icon: "🎲",
  config: [],               // optional config/secret fields
  async test() { return { ok: true, message: "Ready." }; },
  actions: [
    {
      name: "dice_roll",   // GLOBALLY unique — prefix with the id
      description: "Roll an N-sided die and return the result.",
      parameters: {         // JSON Schema for the args (flat, typed)
        type: "object",
        properties: { sides: { type: "number",
          description: "Number of sides (default 6)." } },
      },
      async handler(args) {
        const n = Math.max(2, Math.floor(args.sides || 6));
        return { result: `You rolled a ${1 + Math.floor(Math.random() * n)} (d${n}).` };
      },
    },
  ],
};
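
Several of the rules above can be checked mechanically before a connector is dropped in. The linter below is a sketch of that idea and is not part of the harness: lintConnector, its messages, the 10-character description threshold, and the `<id>_` underscore convention (inferred from names like dice_roll) are all assumptions.

```javascript
// Sketch of a linter for the connector rules. It is not part of the harness;
// lintConnector and its messages are illustrative only.
function lintConnector(manifest) {
  const problems = [];
  for (const action of manifest.actions ?? []) {
    const label = action.name ?? "(unnamed action)";
    // Rule 1: tool names are global, so they carry the connector id prefix.
    // The "_" separator is inferred from examples such as dice_roll.
    if (!String(action.name ?? "").startsWith(`${manifest.id}_`)) {
      problems.push(`${label}: name should start with "${manifest.id}_"`);
    }
    // Rule 2: the description is what the model reads to decide on a call.
    if (!action.description || action.description.trim().length < 10) {
      problems.push(`${label}: description is missing or too short`);
    }
    // Rule 3: parameters stay flat, with no $ref / anyOf keys anywhere inside.
    const schema = JSON.stringify(action.parameters ?? {});
    if (/"(\$ref|anyOf)"\s*:/.test(schema)) {
      problems.push(`${label}: parameters should avoid $ref and anyOf`);
    }
    // Rule 4 starts with having a handler at all.
    if (typeof action.handler !== "function") {
      problems.push(`${label}: handler must be a function`);
    }
  }
  return problems;
}
```

An empty array means none of these checks fired. It does not prove that the harness will accept the connector.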

🧠 Memory / “bring your own brain”

The default memory connector is Voxa's local brain — a folder of Markdown notes exposed as memory_search / memory_save / memory_read / memory_list.

Point its brainDir at an Obsidian vault and talk to your existing notes.
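
For a feel of how a folder of Markdown notes can back a tool like memory_search, here is a small keyword-search sketch. How the real memory connector ranks results or walks the folder is not documented on this page; searchNotes, loadNotes, and the scoring rule are assumptions chosen for brevity.

```javascript
// Sketch of keyword search over Markdown notes held as { path, text } pairs.
// The real memory connector's ranking and file handling are not documented
// here; this is only one plausible approach.
function searchNotes(notes, query, limit = 3) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return [];
  return notes
    .map((note) => {
      const haystack = note.text.toLowerCase();
      // Score = how many query terms occur in the note at least once.
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { path: note.path, score };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score || a.path.localeCompare(b.path))
    .slice(0, limit);
}

// Load every .md file directly inside brainDir (non-recursive, for brevity).
async function loadNotes(brainDir) {
  const fs = await import("node:fs/promises");
  const path = await import("node:path");
  const names = (await fs.readdir(brainDir)).filter((n) => n.endsWith(".md"));
  return Promise.all(
    names.map(async (name) => ({
      path: name,
      text: await fs.readFile(path.join(brainDir, name), "utf8"),
    })),
  );
}
```

Since the notes are plain files, the same folder stays usable from grep, a sync tool, or Obsidian while a search like this runs over it.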

🌐 …or bring your own tool source

Any server that speaks the tool contract — GET/POST /api/voice/tools — can be a tool source. Set it as the harness URL in Settings and Voxa will happily use your backend instead.

# the whole contract
GET  /api/voice/tools   # → list of tool manifests
POST /api/voice/tools   # → run a tool, return { result }
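
A custom tool source can be very small. The sketch below serves the two routes with Node's built-in http module. It assumes that GET returns a JSON array of manifests and that POST receives { name, args }, neither of which is spelled out on this page. The echo_say tool and port 4010 are made up for the example.

```javascript
// Minimal tool source sketch. Assumptions: GET returns a JSON array of
// manifests and POST receives { name, args }; neither is confirmed here.
const tools = [
  {
    name: "echo_say", // hypothetical tool
    description: "Repeat the given text back to the user.",
    parameters: { type: "object", properties: { text: { type: "string" } } },
    async handler(args) {
      return { result: `You said: ${args.text ?? "nothing"}` };
    },
  },
];

// Manifests without the handler functions, safe to serialise.
function listManifests(all) {
  return all.map(({ handler, ...manifest }) => manifest);
}

// Run a tool by name and always resolve to { result } or { error }.
async function runTool(all, name, args = {}) {
  const tool = all.find((t) => t.name === name);
  if (!tool) return { error: `Unknown tool: ${name}` };
  try {
    return await tool.handler(args);
  } catch (err) {
    return { error: `Tool failed: ${err.message}` };
  }
}

// Start the server; 4010 is an arbitrary port that avoids the harness's 3010.
async function serve(port = 4010) {
  const http = await import("node:http");
  const server = http.createServer(async (req, res) => {
    const { pathname } = new URL(req.url, "http://localhost");
    if (pathname !== "/api/voice/tools") return res.writeHead(404).end();
    let payload;
    if (req.method === "GET") {
      payload = listManifests(tools);
    } else if (req.method === "POST") {
      req.setEncoding("utf8");
      let body = "";
      for await (const chunk of req) body += chunk;
      let parsed;
      try {
        parsed = JSON.parse(body || "{}");
      } catch {
        return res.writeHead(400).end(); // malformed JSON body
      }
      payload = await runTool(tools, parsed.name, parsed.args);
    } else {
      return res.writeHead(405).end();
    }
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify(payload));
  });
  server.listen(port, "127.0.0.1"); // loopback only
  return server;
}
```

Calling `serve()` and entering `http://localhost:4010` as the harness URL in Settings would be the way to try it, assuming the payload shapes guessed here match what the orb sends.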

Skins & palettes

10 skins × 8 palettes. Try them right here.

A skin is the orb's shape — sphere style, rings, flare, scanline. A palette is its colour set. They're independent: mix any skin with any palette, live from the gear, by voice (“set skin to reactor”, “use the ice palette”), or in config.

Ten built-in Voxa orb skins — Orbit, Halo, Reactor, Lens, Holo Dock, Minimal, Nebula, Desktop Handoff, Spectrum Grid, and Crystal Cage
The ten built-in skins, each themeable across eight palettes.

Make your own

Custom skins — no recompile.

Declare custom skins and palettes in voxa-config.json under appearance. They're validated and merged on launch, then appear in the picker and respond to voice like the built-ins.

Each palette is six colour roles as [r, g, b]: core (central glow), accent (rings), hot (speaking heat), deep (shadowed depth), line (wireframe), white (specular).

Want a brand-new sphere or ring style, not just a new combination? Those are drawn by the canvas renderer in packages/orb/src/js/orb.js — add a case there and an entry in skins.js.

{
  "appearance": {
    "palettes": [{
      "id": "midnight", "name": "Midnight",
      "core":   [120, 160, 255], "accent": [200, 120, 255],
      "hot":    [210, 230, 255], "deep":   [20, 30, 80],
      "line":   [150, 190, 255], "white":  [240, 245, 255]
    }],
    "skins": [{
      "id": "myskin", "name": "My Skin",
      "sphere": "wire",
      "ring":   "reactor",
      "flare":  true,
      "scan":   true,
      "brackets": false,
      "defaultPalette": "midnight"
    }]
  }
}

Allowed values: "sphere" is one of wire, soft, lens; "ring" is one of none, orbit, halo, reactor, spectrum.
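
Custom palettes are validated on launch, and it can help to run a similar check before editing the config by hand. The sketch below tests only what this page describes: an id plus six colour roles, each an [r, g, b] triple. The app's real validation rules are not documented here, and treating each channel as an integer from 0 to 255 is an assumption.

```javascript
// Sketch of palette validation. The app's real rules are not documented
// here; the integer 0-255 channel range is an assumption.
const PALETTE_ROLES = ["core", "accent", "hot", "deep", "line", "white"];

// True when value looks like [r, g, b] with integer channels in 0..255.
function isRgb(value) {
  return (
    Array.isArray(value) &&
    value.length === 3 &&
    value.every((c) => Number.isInteger(c) && c >= 0 && c <= 255)
  );
}

// Return a list of human-readable problems; an empty list means it passed.
function validatePalette(palette) {
  const errors = [];
  if (typeof palette.id !== "string" || palette.id === "") {
    errors.push("id must be a non-empty string");
  }
  for (const role of PALETTE_ROLES) {
    if (!isRgb(palette[role])) {
      errors.push(`${role} must be [r, g, b] with integers 0-255`);
    }
  }
  return errors;
}
```

Running `validatePalette` over each entry of `appearance.palettes` returns an empty list for the Midnight example on this page.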

Platforms

One app. Three operating systems.

Voxa is a Tauri v2 app. Every push builds the orb on Windows, macOS, and Linux in CI.

Platform | WebView | Notes
🪟 Windows | WebView2 | Primary dev platform. Release builds have no console window.
🍎 macOS | WKWebView | Transparent orb needs macOSPrivateApi (set). Mic permission ships in Info.plist.
🐧 Linux | WebKitGTK 4.1 | Needs libwebkit2gtk-4.1 at runtime; voice needs WebKitGTK ≥ 2.38 (getUserMedia).

cd packages/orb
npm install
npm run tauri build              # release installers for the host OS
npm run tauri build -- --no-bundle   # compile only (what CI runs)

Early but real.

Voice across three providers, the local brain, 21 connectors, the forge, personas, skins, and cross-platform CI builds all work today. On the roadmap: the build-agent's gated code path and a web cockpit.
