
Welcome to Todd Audio

** Full-Length Draft Audiobooks **

Just a freebie side project to batch out expressive, extended-length AI-voiced narration for fellow literary hobbyists who, like Todd, want to avoid the cost of premium TTS.



Beta Invitation - February 25, 2026

Todd the Kitty is looking for beta testers, so let us know if you're interested. Buckshee. Gratis.

How it works: 1) choose a voice: either pick one from the LibriTTS samples below or provide your own WAV; 2) provide your manuscript as a Markdown file.

We run it (offline, during the Beta): the file is parsed and passed to Coqui XTTS v2 on our local Ubuntu box (with a CUDA GPU) to generate the audio in WAV and MP3.

What's Todd up to here? Pick one: a) He got tired of blowing his monthly Slurpie budget on ElevenLabs. There was a spare GPU sitting in a box, so he threw it into a desktop and installed XTTS v2, and now that it's working, he wonders if it might be useful to other members of the Codex Literati; or b) He's a fuddy-duddy cat with too much time on his hands.

How will files be exchanged during the Beta? For each draft markdown, there'll be a temp folder on my Proton Drive. I'll notify you when the WAV/MP3 is ready and uploaded, and once you retrieve it, all files will be deleted and the folder dropped. In the future, I'm planning to set up an automation so it will be "drop and go".

Contact us on Discord or by email.

Email: rufuslaguerre@proton.me

Voices

Coqui XTTS v2 requires a short (~10 seconds) voice sample to capture the speaker’s vocal fingerprint (tone, pitch, cadence, and texture), then applies those characteristics when generating the audiobook so that the narration approximates that voice. You can provide your own voice sample if you prefer.

Todd likes Voice #260 when relaxing with a cognac and Milton's Paradise Lost.

sourced from LibriTTS-R / LibriVox (Public Domain, for non-commercial use)
Voice# Gender Preview Link
121 Female 121.wav
4446 Female 4446.wav
4507 Female 4507.wav
4992 Female 4992.wav
6829 Female 6829.wav
5142 Neutral 5142.wav
8463 Neutral 8463.wav
1089 Male 1089.wav
1320 Male 1320.wav
2300 Male 2300.wav
260 Male 260.wav
5105 Male 5105.wav
672 Male 672.wav
7021 Male 7021.wav
7176 Male 7176.wav
8224 Male 8224.wav
8230 Male 8230.wav

Audio Samples (MP3)



FAQ: Frequently Asked Questions



> Is there a limit on the word count of the input Markdown file I provide? Not really. I mean, don't send us the Mahābhārata or anything by Marcel Proust, but it'd be fun to try Joyce's Ulysses, or War and Peace. For now, let's keep the generation runtimes within reason (under 4 hours), so for testing, the maximum is 150,000 words. For reference, a sample run of 46,000 words (The Turn of the Screw, by Henry James) took 65 minutes to generate 5 hours and 23 minutes of audio as WAV (958 MB) and MP3 (124 MB).
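
For a rough feel of what a given word count means, the reference run above can be scaled up. This is only a back-of-the-envelope sketch: it assumes generation time and audio length grow linearly with word count, which is a guess rather than something measured across multiple runs. The function name is made up for illustration.

```python
# Reference run from the FAQ answer: 46,000 words took 65 minutes to
# generate and produced 5 h 23 min of audio.
REF_WORDS = 46_000
REF_GEN_MINUTES = 65
REF_AUDIO_MINUTES = 5 * 60 + 23

def estimate_minutes(word_count):
    """Return (generation_minutes, audio_minutes), assuming linear scaling."""
    scale = word_count / REF_WORDS
    return scale * REF_GEN_MINUTES, scale * REF_AUDIO_MINUTES

gen, audio = estimate_minutes(150_000)
print(f"generation: ~{gen / 60:.1f} h, audio: ~{audio / 60:.1f} h")
```

Under that assumption, the 150,000-word cap lands at roughly three and a half hours of generation, which is where the "under 4 hours" figure fits.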


> Why use Markdown for input? Markdown is a common near-plain-text format. For example, Google Docs and Novelcrafter can export directly to Markdown. The convention of marking Act and Chapter breaks with asterisks is used to split a big input file into separate chapters, which are then submitted to the TTS. A pre-process step also applies substitutions and fixes to reduce pronunciation problems and random artifacts. XTTS v2 does not specifically “support” or “understand” Markdown. It just sees plain text.
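
Here is a minimal sketch of what chapter splitting could look like. It is not the actual intake script. The break marker is an assumption: a line containing nothing but three or more asterisks (for example `***` or `* * *`). The function name and sample text are hypothetical too.

```python
import re

# ASSUMPTION: a chapter break is a line holding only three or more
# asterisks, optionally separated by spaces. The real convention may differ.
BREAK_LINE = re.compile(r"^\s*\*(?:\s*\*){2,}\s*$")

def split_chapters(markdown_text):
    """Split text into chapters at break lines, dropping empty chunks."""
    chapters, current = [], []
    for line in markdown_text.splitlines():
        if BREAK_LINE.match(line):
            if any(part.strip() for part in current):
                chapters.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):
        chapters.append("\n".join(current).strip())
    return chapters

sample = "Chapter one text.\n\n***\n\nChapter two, with **bold** words.\n\n* * *\n\nChapter three."
chapters = split_chapters(sample)
print(len(chapters), "chapters")
```

Inline emphasis such as `**bold**` is left alone because the pattern only matches lines made up entirely of asterisks.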


> Will this produce a perfect, commercial-quality audiobook for me? Nope, no way, and that's not the intended use. It's a throw-away, meant as an additional aid during editing and review. Also, depending on the voice and contents, the file is for personal (non-commercial) use only, not for upload to Audible. It's probably okay to send it to your mom (as long as you don't charge her).


> Can this generate audio in other languages besides English? Yes, Coqui XTTS v2 supports many (16+) languages, but I have no experience with them so far. Regardless, Todd says he's happy to give it a try. Some testing will be required, and the intake prep script will need to be updated to support UTF-8 encoded TXT files.


> What's the turn-around time? No idea. I'll try for just a day or two, but we're running this manually for the time being. I hope to set up a fully automated "drop and go" mechanism, but no ETA on that.


> Can I provide my own narrator voice? Yes. This is one of the advantages of XTTS v2, but you'll want a high-fidelity 10-second voice sample. Contact us for some guidelines on this. XTTS v2 has some specific requirements for the voice sample WAV. The better the quality and specs of the sample file, the better the resulting audiobook. I am not a digital audiophile, so the guidance I have (on using ffmpeg) comes straight from a bot.


> Is there a way to correct TTS word pronunciation glitches? Yes, but this is a work in progress. The script already expands a small set of common abbreviations (e.g. Mr. to "Mister"). It can also use a book-specific lexicon so that un-phonetic words or ones with accented characters (e.g. protégé) are handled better by XTTS v2.
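
To give an idea of how this kind of clean-up can work, here is a small sketch. It is not the real pre-process script. Both tables are hypothetical, and the phonetic respelling of "protégé" is invented for the example, so whatever actually sounds right in XTTS v2 would have to be found by ear.

```python
import re
import unicodedata

# Hypothetical tables for illustration only.
ABBREVIATIONS = {"Mr.": "Mister", "Mrs.": "Missus"}
LEXICON = {"protégé": "pro-teh-zhay"}  # made-up respelling

def normalize(text, abbreviations=ABBREVIATIONS, lexicon=LEXICON):
    """Expand abbreviations, then swap lexicon words for respellings."""
    # Compose accented characters so "é" matches however it was typed.
    text = unicodedata.normalize("NFC", text)
    for abbr, full in abbreviations.items():
        # (?<!\w) stops a match in the middle of a longer word.
        text = re.sub(r"(?<!\w)" + re.escape(abbr), full, text)
    for word, respelling in lexicon.items():
        key = unicodedata.normalize("NFC", word)
        text = re.sub(r"\b" + re.escape(key) + r"\b", respelling, text)
    return text

fixed = normalize("Mr. Darcy met his protégé.")
print(fixed)
```

Keeping the lexicon as a plain per-book dictionary means a re-run only needs the one entry that sounded wrong to be changed.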


> How is the voice and audio quality? Mostly okay IMHO. This is another ongoing work in progress (I know, that’s redundant, but accurate). The quality of the ten-second voice sample WAV that seeds XTTS v2 seems to matter a lot. We can query the technical specs of the file with ffmpeg to look at general characteristics, but I'm sure there are other, more subtle technical factors that determine whether the sample works well. We need a professional.
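
If you want to check the basic specs of a sample before sending it, here is one way to do it without ffmpeg. This sketch swaps in Python's standard `wave` module, which only reads uncompressed PCM WAV files. The demo file is generated silence, and its settings (mono, 16-bit, 22,050 Hz) are arbitrary demo values, not a statement of what XTTS v2 requires.

```python
import os
import tempfile
import wave

def wav_specs(path):
    """Read channel count, sample rate, bit depth, and duration from a WAV header."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wf.getsampwidth() * 8,
            "duration_s": frames / rate,
        }

# Demo input: 10 seconds of silence written to a temp file (arbitrary settings).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)      # 2 bytes = 16-bit samples
    wf.setframerate(22050)
    wf.writeframes(b"\x00\x00" * 22050 * 10)

specs = wav_specs(demo_path)
print(specs)
```

These numbers only cover the general characteristics. They say nothing about noise, room echo, or the other subtle factors mentioned above.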


> Can we do revisions to improve the results? Absolutely, although the turnaround time will vary. There will definitely be tuning and fix opportunities. (In the future we're planning an automated "revise and drop it in" for re-runs.)


> Is a full novel delivered as a single MP3? That's not the default behavior. The script looks for chapter breaks and divvies up the text so that each chapter goes into a separate WAV file when generating audio. When the run is finished, we create MP3 files from the WAVs since they are about one tenth the size. If you want to combine them, that should be easy with a utility like ffmpeg.
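
For anyone who does want one big file, here is a sketch of one way to do it with ffmpeg's concat demuxer. The Python below only writes the list file and builds the command; it does not run ffmpeg. The chapter file names and the helper name are hypothetical, so use whatever names your delivered files actually have.

```python
import os
import tempfile

def build_concat_job(mp3_paths, output_path, list_path):
    """Write an ffmpeg concat list file and return the matching command."""
    with open(list_path, "w", encoding="utf-8") as fh:
        for path in mp3_paths:
            # The concat list format wraps paths in single quotes;
            # a literal quote inside a path is written as '\''.
            escaped = os.path.abspath(path).replace("'", "'\\''")
            fh.write(f"file '{escaped}'\n")
    # -c copy joins the streams without re-encoding.
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output_path]

work = tempfile.mkdtemp()
chapter_files = [os.path.join(work, f"chapter_{n:02d}.mp3") for n in (1, 2, 3)]
cmd = build_concat_job(chapter_files,
                       os.path.join(work, "book.mp3"),
                       os.path.join(work, "chapters.txt"))
print(" ".join(cmd))
```

To actually run it, pass `cmd` to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed. Stream copy should work when all chapters come from the same run with the same MP3 settings; if they differ, re-encoding instead of `-c copy` is the safer route.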


> Does the file you attach or upload have to be "text only" and Markdown? Sort of. At this time, we verify the file as ASCII/UTF-8. I mean, it's a bit of a concern if you're on this site and can't manage to get your manuscript into text. And anyway, do you really want a juiced up cat converting your docx / html / epub / pdf files?
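
You can run the same sort of check yourself before uploading. This is a guess at what such a check might look like, not the real intake code. The null-byte test is an extra heuristic added here to catch binary files that happen to decode cleanly.

```python
def is_utf8_text(data: bytes) -> bool:
    """True if the bytes decode as UTF-8 (ASCII is a subset) and contain no NUL bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return b"\x00" not in data

ok_plain = is_utf8_text(b"Chapter One\n\nIt was a dark and stormy night.")
ok_accent = is_utf8_text("her protégé".encode("utf-8"))
bad_binary = is_utf8_text(b"PK\x03\x04\xff\xfe binary junk")
print(ok_plain, ok_accent, bad_binary)
```

To check a real file, read it in binary mode first, for example `is_utf8_text(open("manuscript.md", "rb").read())`, where `manuscript.md` stands in for your own file name.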


