
This blog post is a test for something new I’m working on. I think a lot of people feel overwhelmed that AI is everywhere right now. Personal blogs should lean into the human voice. There’s a new play button on the permalink for this post on the web, which plays a recording that I’ve uploaded.

Vincent

Amazing!

Gunnar

Love it!

jabel

Cool!

Aeryn

Woah! I likey

Phil :prami:

@manton how many people are going to say that’s really good training data for a voice model… But it was nice to hear your personal voice, as another human.

Vincent

@phils there will be these people... sigh.

Reese Armstrong :verified_su:

nifty!

Manton Reece

@phils Thanks! Yep, it's a great question. Voice models are about to get very, very good. But they're still not me.

Jim Mitchell

Pretty wild. Looking forward to seeing more on this one.

Pratik

How do we know you recorded that audio and are not using AI that used your voice to train it? 🙃 With your podcasts, I bet it’s pretty easy to do, and to do quite accurately.

Manton Reece

@james I had a bug with dark mode that I just fixed. Maybe that was it?

Manton Reece

@pratik Heh. Well, the mistakes I didn't edit out this time. 🙂 But yeah, synthetic voices are getting incredibly good.

John Spurlock
Manton Reece

@james Oops, I see what happened. The “transcript” link messed up the post on Mastodon. Fixed going forward.

Pratik

But jokes aside, if I train an AI model with my voice and use it for such post narration, is that not using AI "correctly"? It will save me tons of time and yet personalize my blog posts. Heck, I don't mind (in fact, prefer) if it preserves my stutter.

Manton Reece

@pratik Maybe it’s like alt text: hand-written alt text is great, but AI-generated text is so good why not use it so that it’s accessible to more people? Likewise maybe in the future all posts will have synthetic narration by default but humans can override as needed to provide their own audio.

Andrew Canion

I love this. What a cool feature; hope it’s able to be released soon!

Numeric Citizen

I'm anxious to try it as soon as my visual theme is updated by @Mtt and @ericgregorich !

Eric Gregorich

@numericcitizen cool feature. I’ll update the Cards Theme with support for it when it’s released.

Matt Langford

@numericcitizen Tiny will definitely support it.

Pratik

Yup. Auto-generated (in my voice) but the option to override specific posts in case I want to do it in a different tone/mood/impression.

Jim Mitchell

This is cool work. Out of curiosity, in your example post, is AI transcribing your voice input? To me it sounded very much like you were reading the post as if it were already written. If it was transcribed, how did you handle the screenshot image insertion? Was it the little “screenshot here” aside? Finally, does the transcription save as a draft to clean up and then manually post? Sorry for the many questions.

Manton Reece

@jimmitchell I wrote the blog post myself, then read it, but funnily enough after I posted it Micro.blog did transcribe the audio back to text with AI. In this case, that was redundant, so the transcript can be tossed.

Jarrod Blundy

@pratik That's exactly what I've been dreaming of.

Pratik

So would it make sense to first make an audio “post” and then have AI transcribe it? ☺️ @jimmitchell

Jim Mitchell

Interesting. I can't seem to find any transcriptions for my tests on my test micro.blog. Are you dumping them as part of the feature, or have I missed something? Will the feature ever get to the place of being able to upload audio and the post gets transcribed?

Manton Reece

@jimmitchell One limitation that you might be hitting: we only transcribe one audio file per day. I was worried about the costs, but it has been fine so I'm going to raise that limit this week. If you click Transcripts, you can delete previous transcripts to work around the limitation for now. Also, check Account → View logs. It has log entries specifically for transcripts.

Manton Reece

@pratik Yes, I actually do this pretty often! Upload audio to my test blog, grab the transcript, then delete it.

Manton Reece

@jimmitchell Also check out the other limitations in the documentation. For example, it only works with MP3 files.

Jim Mitchell

What? RTFM? Never! 🤣

Thanks. That clears things up a lot and explains why I didn't see a transcript, since I uploaded in .m4a format.

Pratik

I'm more of a writer than a talker, so the opposite would work for me, but for people who do what you do, perhaps it can be made more automated?

Manton Reece @manton