
A little follow-up on this. Tonight I had a look at what it generated. It produced 2 files: a .wav and a .ass. The latter apparently contains subtitles that sync to the audio. But how do you play them together?
After searching around online, the general consensus seemed to be that you need to make a video file that throws it all together. For the background image I used a still of the book cover art. Then I ran an ffmpeg command that looked something like this:
ffmpeg -loop 1 -i cover.jpg -i abogen_file.wav -vf subtitles=abogen_file.ass -shortest audio_book.mov
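If you want to try this yourself, here's a rough sketch of the same idea wrapped in a small shell function. To be clear, this isn't exactly what I ran: the function name make_video and the DRY_RUN switch are made up for illustration, and the explicit encoder flags (libx264 with the stillimage tune, yuv420p pixel format, AAC audio) are my assumption about settings that tend to play in most players. The file names are just the ones from the command above.

```shell
#!/bin/sh
# Sketch: combine a still image, an audio file, and .ass subtitles into one video.
# Usage: make_video COVER AUDIO SUBS OUT
# Set DRY_RUN=1 to print the ffmpeg command instead of running it.
# make_video and DRY_RUN are hypothetical names, not part of ffmpeg or abogen.
make_video() {
    cover=$1; audio=$2; subs=$3; out=$4

    # With DRY_RUN=1, prefix the command with "echo" so it is only printed.
    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"

    # -loop 1        repeat the single cover image as the video stream
    # -vf subtitles  burn the .ass subtitles into the picture
    # -tune/-pix_fmt assumed settings for a still image and broad player support
    # -shortest      stop when the audio ends, since the looped image never does
    $run ffmpeg -loop 1 -i "$cover" -i "$audio" \
        -vf "subtitles=$subs" \
        -c:v libx264 -tune stillimage -pix_fmt yuv420p \
        -c:a aac -b:a 192k \
        -shortest "$out"
}
```

Running it as make_video cover.jpg abogen_file.wav abogen_file.ass audio_book.mov should give roughly the same result as the one-liner. One caveat: with yuv420p, libx264 wants even pixel dimensions, so a cover image with an odd width or height may need resizing first.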
It sounds pretty awesome and looks like this while it’s playing!
This one’s Japanese (which has a lot of onomatopoeia btw), but ki-ki-ki is the sound of a violin being played badly. I put my poor family through a lot of that when I was taking lessons as a kid. I could hear my aunts in the background saying “mata ki-ki-ki yatteru.” That’s basically Japanese for “there he goes sawing away again”.