tellmebaby

What it does

One key turns your voice into a tool.

Six small skills, one tiny pill. Designed to feel like a friend hanging around in the corner, not a dashboard taking over your life.

Dictate, anywhere

Hold your hotkey, talk, release. The words land at your cursor. Slack, VS Code, Outlook, your terminal — anything that takes typed text takes voice now.

Read it back

Highlight a paragraph, hit a hotkey, hear it. Same for the clipboard, or any chunk of text you paste in. Your computer talks back, in a real voice that doesn't sound robotic.

Edit by voice

Select text, press the edit hotkey, say "translate to French" or "make this shorter and less formal." The selection rewrites in place — no copy-paste shuffling.

Custom dictionary

Got a name the recognizer keeps butchering? Add it. Snippets map voice phrases to fixed text — addresses, signatures, prompts you reuse. The longer you use it, the smarter it gets at sounding like you.

Per-app modes

Clean up filler words in Slack but keep them verbatim in your terminal. Different apps get different treatment automatically — no settings to remember when you switch windows.

Maximum-quality models

Parakeet TDT for English. Whisper Large v3 Turbo for everything else. INT8 on CPU, GPU acceleration where there is one. No "lightweight model" excuses for bad accuracy.

Choreography

Six hotkeys. Zero menus.

Once your hands learn the choreography, you stop noticing your keyboard altogether. Pick the chords that fit your fingers — these are the defaults.

Ctrl+Win Dictate. Hold to talk, release when you're done. Words appear at the cursor.

Ctrl+Shift+R Read selection. Highlight first, then press. The computer reads it through your speakers.

Ctrl+Shift+V Read clipboard. Whatever you copied last gets read aloud, no app switch needed.

Ctrl+Shift+E Edit selection. Select text, press, say what to change. Rewritten in place.

Ctrl+Shift+S Stop talking. Cuts off any read-aloud immediately. Useful for long paragraphs.

Esc Cancel. Press while the pill is active to throw away whatever you just said.

All six are remappable in Settings → Activation.

It's actually different

Built around your hands, not your eyes.

The pill is the whole UI.

One small surface, bottom-center of your screen. It tells you what's happening — listening, transcribing, done — without stealing your attention. No window, no panel, no dashboard. Hover it for the hotkey hint or click to open the actual app.

It learns the words you actually say.

Names of coworkers, internal tool names, that one client whose name nobody can spell. Add them once to your dictionary; the recognizer biases toward them forever. No model retraining, no cloud roundtrip.

// dictionary

Aghil → Aghil

Tellmebaby → tellmebaby

Cloudflare → Cloudflare

K8s → Kubernetes
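To make the mapping above concrete, here is a minimal sketch of how entries like these could be applied to a finished transcript. It is not how tellmebaby does it: the app biases the recognizer itself, while this example only does a find-and-replace afterwards. The names `DICTIONARY` and `apply_dictionary` are made up for illustration.

```python
import re

# Hypothetical dictionary: spoken form (lowercase) -> text to type.
# Entries mirror the examples shown above.
DICTIONARY = {
    "aghil": "Aghil",
    "tellmebaby": "tellmebaby",
    "cloudflare": "Cloudflare",
    "k8s": "Kubernetes",
}


def apply_dictionary(transcript: str, dictionary: dict[str, str]) -> str:
    """Replace whole-word matches (case-insensitive) with the dictionary's text."""
    if not dictionary:
        return transcript
    # Longest keys first so longer phrases win over their own prefixes.
    keys = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    # Keys are stored lowercase, so look up the lowercased match.
    return pattern.sub(lambda m: dictionary[m.group(1).lower()], transcript)


if __name__ == "__main__":
    print(apply_dictionary("deploy Tellmebaby to K8s behind cloudflare", DICTIONARY))
```

Matching on whole words keeps the replacement from firing inside longer words that happen to contain a dictionary entry.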

Your voice stays here.

The recognizer runs entirely on your CPU or GPU. No microphone audio is ever sent over the network. We say "100% local" and we mean it — pull your network cable mid-sentence and tellmebaby keeps working.

// network log

● initial model fetch — once

● update check — every 6h

○ microphone audio — never

○ transcript text — never

○ telemetry — never

vs. the alternatives

Why local at all?

Cloud dictation is fast and accurate. So is tellmebaby — without sending your voice anywhere. Here's the actual difference.

Cloud dictation

× Uploads your audio for every utterance
× Subscription, eventually
× Stops working without internet
× Privacy policy you'll never read
× Can disappear when funding runs out

tellmebaby

✓ Audio never leaves your machine
✓ Free, no account, no signup
✓ Works on a plane, in a tunnel, anywhere
✓ No data to leak because we don't have any
✓ Open enough to fork — your install is forever

Built for

People who'd rather talk than type.

If you've ever finished an email and realized your hands are tired, tellmebaby is for you. If you haven't, it's for you anyway.

Writers — drafting at speech speed

You think faster than you type. Talk through your draft, then clean it up with the keyboard. tellmebaby gets the messy first version out so editing is the only work left.

Devs — for the parts that aren't code

Commit messages, docstrings, Slack replies, code reviews. The 40% of dev time that's English instead of code is where dictation pays off: your hands stay on the keyboard, your voice does the typing.

Multitaskers — for when your hands are busy

Eating lunch. Holding the baby. Pet on the lap. Whatever's happening, you can still get a paragraph out. tellmebaby doesn't care if your fingers are sticky.

RSI / accessibility — hands-light, by design

The hotkey is "any modifier you can mash with one finger." Custom dictionary, fast cancel, everything works one-handed. Edit Mode means you don't even need to retype to revise.

Real questions

The stuff people actually ask.

Why does Windows say "Windows protected your PC" when I run it?

That's SmartScreen. It warns about apps it hasn't seen before. Code-signing certificates cost a few hundred dollars a year, and we haven't bought one yet. Click More info → Run anyway and the install proceeds normally. A SHA256 hash is published right next to the download button, so you can verify the file you downloaded against it.
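If you'd like to check the SHA256 yourself, a few lines of Python will do it. This is a generic sketch using only the standard library; the function names are made up for illustration, and the installer path and published hash are whatever you downloaded and copied from the page.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 of a file, reading it in chunks to keep memory low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's hash with the published one, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

Call `matches_published_hash` with the path to your downloaded installer and the hash copied from the download page. If it returns `False`, don't run the file.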

Is my voice actually private? For real?

Yes. Speech recognition runs entirely on your CPU or GPU using locally-stored model files. Pull your network cable mid-sentence and tellmebaby keeps working. The only network traffic is the initial speech-model download (so you can pick languages) plus a JSON poll every six hours to see if there's an update. That's the entire network footprint.

How accurate is it, really?

English: very good. Parakeet TDT 0.6B v2 is one of the best open-weight ASR models available right now, INT8-quantized but still benchmark-competitive with closed cloud APIs. Multilingual: Whisper Large v3 Turbo, also INT8. We pick the right model automatically based on the language you choose during onboarding.

Where does it store my recordings and transcripts?

%USERPROFILE%\.tellmebaby — recordings as .wav, transcripts in a SQLite database, settings as JSON. Nothing's compressed, nothing's encrypted, nothing's hidden. You can browse it all directly in File Explorer. Deleting that folder is a complete reset.
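Since everything sits in one plain folder, you can inspect it with a short script as well as with File Explorer. The sketch below only counts files by extension. It assumes nothing about file names or the database schema, and `summarize_data_folder` is a name made up for this example.

```python
from collections import Counter
from pathlib import Path


def summarize_data_folder(root: Path) -> dict[str, int]:
    """Count files under root by lowercase extension ('' for files without one)."""
    counts = Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())
    return dict(counts)


if __name__ == "__main__":
    # On Windows, Path.home() resolves to %USERPROFILE%.
    data_dir = Path.home() / ".tellmebaby"
    if data_dir.is_dir():
        for ext, n in sorted(summarize_data_folder(data_dir).items()):
            print(f"{ext or '(no extension)'}: {n}")
    else:
        print(f"{data_dir} not found")
```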

Does it auto-update?

Yes. The app polls a signed manifest a few hours after launch. When a new version is out, you get a banner across the top of the main window with a one-click Install + restart. Updates are signed with an ed25519 key — you can't be tricked into installing a fake one.

What if I want it to ignore my microphone in some apps?

That's what per-app modes are for. Set "passthrough" mode for, say, your password manager — the hotkey doesn't do anything when that app is focused. You can also pause the hotkey globally with one click in the sidebar.
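One way to picture per-app modes is a lookup table from the focused app to a treatment. The sketch below is an illustration under assumptions, not the app's real configuration: the executable names, the "clean" and "verbatim" mode names, and the filler-word list are all hypothetical. Only "passthrough" comes from the description above.

```python
import re
from typing import Optional

# Hypothetical mode table keyed by lowercase executable name.
APP_MODES = {
    "slack.exe": "clean",
    "windowsterminal.exe": "verbatim",
    "keepass.exe": "passthrough",
}
DEFAULT_MODE = "clean"

# Illustrative filler list: "um"/"uh" as whole words, plus a trailing comma and spaces.
FILLERS = re.compile(r"\b(?:um+|uh+)\b,?\s*", re.IGNORECASE)


def process_for_app(app: str, transcript: str) -> Optional[str]:
    """Return the text to type for the focused app, or None if dictation is off there."""
    mode = APP_MODES.get(app.lower(), DEFAULT_MODE)
    if mode == "passthrough":
        return None  # hotkey does nothing in this app
    if mode == "verbatim":
        return transcript  # keep every word exactly as spoken
    cleaned = FILLERS.sub("", transcript)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

Keeping the decision in one table means switching windows changes the treatment without touching any setting by hand.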

macOS / Linux?

Not yet. The Windows build uses WASAPI for audio capture and Win32 keyboard hooks for the global hotkey — porting those to macOS/Linux is real engineering work, not a config flip. Windows-only for now; other platforms once the Windows experience is rock-solid.

What does it cost?

Free. Forever, on Windows. There's no paid tier planned, no "pro" features locked behind a paywall, no telemetry to monetize. If that ever changes, anyone who installed before the change keeps the version they had at the price they had.

Is there a hidden catch?

No. tellmebaby is a personal project that exists because the maintainer wanted a local-first dictation tool with a brain and couldn't find one. You're welcome to use it, share it, and fork the source if you want to. There's no "actually it phones home for analytics" footnote.

Talk to your PC.
It listens.

Hold the hotkey. Talk. Watch the words show up.
