Architecture
How Folio is built, from the Rust core through the Tauri shell to the data flow that moves a meeting from microphone to markdown.
The stack
Folio is a Rust program with a desktop face. About 70 percent of the code is a Rust core that owns audio capture, storage, and transcription. That core is wrapped by a Tauri 2 desktop binary, which gives it a native window, OS permissions, and a menu-bar presence without shipping a full browser engine.
The interface is a React 18 frontend written in TypeScript and styled with Tailwind. The two halves talk over Tauri IPC. The types that cross that boundary are not written twice. They are defined once in Rust and generated to TypeScript with ts-rs, so the frontend always sees the same shapes the core returns.
Repository layout
The repository is a single Cargo workspace plus the frontend and the packaging it ships with. The Rust toolchain is pinned, the crates are split by responsibility, and the Homebrew cask lives in the repository itself, so the project doubles as its own tap.
folio/
  Cargo.toml            workspace root
  rust-toolchain.toml   Rust 1.88, both Apple targets
  crates/
    folio-core/         audio capture, storage, transcription, diarization
    folio-cli/          CLI test harness
    folio-mcp/          local MCP stdio server (notes, tasks, memories)
  src-tauri/            Tauri 2 desktop binary
  src/                  React 18 + TypeScript + Tailwind frontend
  Casks/                Homebrew cask (the repo doubles as its own tap)
  docs/                 repo-local documentation
The Rust core
The workspace holds three crates. Each has one job and a clear edge against the others.
folio-core (crate): The framework-agnostic core. It owns audio capture, storage, transcription, diarization, the agent, and the memory and task stores. It is embeddable by the Tauri app, by the CLI, or by a future Swift app via UniFFI.
folio-cli (crate): A CLI test harness. It exercises the core from the terminal without the desktop shell, which makes audio devices and recording easy to probe in isolation.
folio-mcp (crate): The local MCP stdio server. It exposes notes, tasks, and memories to MCP-aware tools with read-only access and no network hop.
A few rules hold across the core. FolioError is the single public error type, so callers handle one shape instead of a dozen. Logging goes through the tracing crate, never println!. Audio callbacks are allocation-free hot paths, and nothing logs inside a callback body. macOS-specific code is gated behind #[cfg(target_os = "macos")], with stubs for other targets so the whole workspace still builds everywhere.
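The two rules that shape the public surface, one error type and cfg-gated platform code with stubs, can be sketched as follows. This is a minimal illustration using only the standard library. The variant names and the start_system_audio function are hypothetical and are not taken from folio-core.

```rust
use std::fmt;
use std::io;

/// Hypothetical sketch of a single public error type.
/// The variants are assumptions, not Folio's actual definition.
#[derive(Debug)]
pub enum FolioError {
    Io(io::Error),
    Audio(String),
    Unsupported(&'static str),
}

impl fmt::Display for FolioError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FolioError::Io(e) => write!(f, "io error: {e}"),
            FolioError::Audio(msg) => write!(f, "audio error: {msg}"),
            FolioError::Unsupported(what) => {
                write!(f, "{what} is not supported on this platform")
            }
        }
    }
}

impl std::error::Error for FolioError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            FolioError::Io(e) => Some(e),
            _ => None,
        }
    }
}

// Lets `?` convert an io::Error into FolioError at call sites.
impl From<io::Error> for FolioError {
    fn from(e: io::Error) -> Self {
        FolioError::Io(e)
    }
}

/// macOS build: a real implementation would start system-audio capture here.
/// The body is a placeholder for this sketch.
#[cfg(target_os = "macos")]
pub fn start_system_audio() -> Result<(), FolioError> {
    Ok(())
}

/// Stub for every other target, so the same call site compiles everywhere.
#[cfg(not(target_os = "macos"))]
pub fn start_system_audio() -> Result<(), FolioError> {
    Err(FolioError::Unsupported("system audio capture"))
}
```

With this shape, a caller matches on one enum regardless of which subsystem failed, and non-macOS builds get a clear runtime error instead of a compile failure.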
Capture pipeline
When a meeting starts, Folio records two independent streams. cpal captures the microphone. ScreenCaptureKit captures system audio, which is everyone else on the call. Keeping the streams separate is what keeps the rest of the pipeline honest: your voice and everyone else's never get mixed into one track that cannot be pulled apart afterwards.
The two streams rarely share a sample rate, so rubato resamples them to a common rate, and hound writes the result to WAV files on disk. The microphone track is always labelled You, which the transcription and diarization steps then rely on.
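To make the resampling step concrete, here is a small standard-library sketch that converts a mono buffer from one sample rate to another by linear interpolation. It is a deliberately simpler stand-in for rubato, not how rubato works internally, and the function name resample_linear is hypothetical.

```rust
/// Resample mono f32 samples from `from_hz` to `to_hz` by linear interpolation.
/// Illustration only: there is no low-pass filter, so downsampling can alias.
pub fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    if from_hz == to_hz {
        return input.to_vec();
    }
    // Number of output samples that cover the same duration as the input.
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // How far the read position advances in the input per output sample.
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let left = (pos.floor() as usize).min(last);
            let right = (left + 1).min(last);
            let frac = (pos - left as f64) as f32;
            // Blend the two neighbouring input samples.
            input[left] + (input[right] - input[left]) * frac
        })
        .collect()
}
```

Once both tracks have passed through a step like this, they share a rate and can be written out as WAV files with matching headers.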
Transcription and diarization
Transcription runs locally by default. The bundled backend is whisper.cpp through the whisper-rs bindings, Metal-accelerated on Apple Silicon. This is the primary path and it needs no network once the weights are present. The OpenAI Whisper API is an opt-in fallback for faster cloud transcription on long meetings. It needs an OpenAI key and it is never the default.
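The "local by default, cloud only on opt-in" rule can be sketched as a small selection function. The type and field names below are hypothetical, and falling back to the local backend when the cloud is requested without a key is an assumption of this sketch rather than documented behavior.

```rust
/// Which transcription backend to use. Names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Backend {
    /// Bundled whisper.cpp, runs on-device.
    LocalWhisper,
    /// OpenAI Whisper API, needs a key and an explicit opt-in.
    OpenAiApi { api_key: String },
}

/// Hypothetical user settings. `Default` gives the local-only configuration.
#[derive(Debug, Default, Clone)]
pub struct TranscriptionSettings {
    pub use_cloud: bool,
    pub openai_api_key: Option<String>,
}

/// The cloud backend is chosen only when the user opted in AND a non-empty
/// key is present. Every other combination resolves to the local backend.
pub fn choose_backend(settings: &TranscriptionSettings) -> Backend {
    match (&settings.openai_api_key, settings.use_cloud) {
        (Some(key), true) if !key.trim().is_empty() => Backend::OpenAiApi {
            api_key: key.clone(),
        },
        _ => Backend::LocalWhisper,
    }
}
```

Making the local variant the result of every fall-through arm means a missing or blank key can never route audio to the network by accident.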
Diarization runs on-device against the system-audio track. It uses a pyannote-segmentation-3.0 model plus a WeSpeaker speaker-embedding model, both run through sherpa-onnx, then clusters the voices into Speaker 1, Speaker 2, Speaker 3 and so on. The microphone is always You, and no cloud is involved in diarization at all.
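The clustering step can be pictured with a toy example. The sketch below assigns each speaker embedding to the closest existing cluster by cosine similarity, or opens a new cluster when nothing is similar enough. This greedy scheme is an illustrative assumption, not the algorithm sherpa-onnx or Folio actually uses, and label_speakers and its threshold parameter are hypothetical.

```rust
/// Cosine similarity between two vectors. Returns 0.0 for a zero vector.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Greedy clustering: one label per embedding, in input order.
/// Each cluster keeps the running sum of its members. Cosine similarity
/// ignores scale, so the sum behaves like the mean for comparisons.
pub fn label_speakers(embeddings: &[Vec<f32>], threshold: f32) -> Vec<String> {
    let mut clusters: Vec<Vec<f32>> = Vec::new();
    let mut labels = Vec::with_capacity(embeddings.len());
    for emb in embeddings {
        // Find the most similar existing cluster, if any.
        let best = clusters
            .iter()
            .enumerate()
            .map(|(i, sum)| (i, cosine(sum, emb)))
            .max_by(|a, b| a.1.total_cmp(&b.1));
        let idx = match best {
            Some((i, sim)) if sim >= threshold => {
                for (s, e) in clusters[i].iter_mut().zip(emb) {
                    *s += *e;
                }
                i
            }
            _ => {
                clusters.push(emb.clone());
                clusters.len() - 1
            }
        };
        // Labels are 1-based, matching "Speaker 1, Speaker 2, ...".
        labels.push(format!("Speaker {}", idx + 1));
    }
    labels
}
```

Only the system-audio track would pass through a step like this. The microphone track skips clustering entirely because it is already known to be You.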
The IPC contract
Together, the Tauri commands are the contract between the core and the frontend. The argument and return types of those commands are defined in folio-core and generated to TypeScript by ts-rs, so there is no second source of truth to keep in sync. Running cargo test regenerates the bindings, and CI catches any drift between what Rust declares and what TypeScript expects, so a changed signature cannot silently reach the UI.
Two-phase writes
File-backed stores write in two phases. The canonical on-disk file lands first. That is the .md for notes and the .json for tasks. The derived index is written second. The index is always rebuildable from the files, so the files are the source of truth and the index is just a fast read path you can throw away and regenerate.
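A minimal sketch of the two phases for notes, using only the standard library, might look like the following. The index.tsv file, the temp-file-then-rename step, and the function names are assumptions made for illustration. The real index format and write path are not described on this page.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Phase 1 writes the canonical .md file. Phase 2 refreshes the derived index.
pub fn save_note(dir: &Path, id: &str, body: &str) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    // Phase 1: write to a temp file and rename it into place, so a reader
    // never sees a half-written note.
    let tmp = dir.join(format!("{id}.md.tmp"));
    fs::write(&tmp, body)?;
    fs::rename(&tmp, dir.join(format!("{id}.md")))?;
    // Phase 2: the index is derived data. For simplicity this sketch rebuilds
    // it in full; a real store would more likely update it incrementally.
    rebuild_index(dir)
}

/// Scan the canonical files and derive (note id -> title) from each one.
pub fn scan_notes(dir: &Path) -> io::Result<BTreeMap<String, String>> {
    let mut index = BTreeMap::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|e| e.to_str()) != Some("md") {
            continue; // skips index.tsv and leftover .tmp files
        }
        let Some(id) = path.file_stem().and_then(|s| s.to_str()) else {
            continue;
        };
        let text = fs::read_to_string(&path)?;
        // Title = first line with any leading '#' markers removed.
        let first_line = text.lines().next().unwrap_or("");
        let title = first_line.trim_start_matches('#').trim().to_string();
        index.insert(id.to_string(), title);
    }
    Ok(index)
}

/// Regenerate the index from the files alone. Safe to run at any time.
pub fn rebuild_index(dir: &Path) -> io::Result<()> {
    let lines: String = scan_notes(dir)?
        .iter()
        .map(|(id, title)| format!("{id}\t{title}\n"))
        .collect();
    fs::write(dir.join("index.tsv"), lines)
}
```

If the process dies between the two phases, the note is already safe on disk and the index is merely stale, which a call to rebuild_index repairs.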
Data flow
Top to bottom, a request starts in React, crosses the IPC boundary as JSON, lands in the Tauri commands layer, and calls into folio-core, which is the only layer that talks to the OS, to OpenAI, to whisper.cpp, and to SQLite.
React (src/)
  features/* + Zustand stores + shared/lib/ipc.ts
        |  invoke: JSON over Tauri IPC
        v
src-tauri/
  commands/* + app/state.rs + folio-core re-exports
        |  direct function calls
        v
folio-core
  audio:: + llm:: + memory:: + storage:: + transcription::
        |  OS APIs + OpenAI + whisper.cpp + SQLite
        v
Disk + Hardware + Network
This page is the working summary. For the full account, including the module boundaries inside folio-core and the reasoning behind each layer, read the architecture document in the repository.