Text extraction of speech from video and audio files

Developer https://www.devze.com 2022-12-11 18:33 Source: Internet
What are the best libraries for doing this, and is the quality good enough to rely on? It will not be possible to train the system with the speakers' voices or to use a dictionary of terms to improve results.


On Windows, you want to use SAPI (the Speech API). There are multiple implementations. Microsoft includes a free one with Windows. Dragon NaturallySpeaking is a non-free one that I've seen used in the past for similar tasks (with effort). If the speakers are speaking clearly (and not overlapping or interrupting each other), the lack of training isn't so crippling.

You won't get a good transcript, though: the accuracy will be poor enough that the output is useful only for indexing. Long words and distinctive phrases will pop out nicely, especially if you create a custom dictionary (which I know you said isn't possible here). For instance, you could find all the news segments that mention 'Pelosi' and 'public option'.
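To make the indexing idea concrete, here is a minimal sketch in Python. It assumes you already have recognizer output as a list of (start time in seconds, recognized text) segments. The `SEGMENTS` data and the helper names `build_index` and `find_phrase` are made up for illustration and are not part of SAPI or any other speech library.

```python
import re
from collections import defaultdict

# Hypothetical recognizer output: (start_seconds, recognized_text) per segment.
# Real untrained output will contain misrecognized words, like "whether" below.
SEGMENTS = [
    (0.0, "speaker pelosi said the house will vote on the bill"),
    (42.5, "the public option remains the sticking point"),
    (90.0, "whether forecast for the weekend looks clear"),
    (131.0, "pelosi defended the public option again today"),
]

def normalize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(segments):
    """Map each word to the set of segment positions that contain it."""
    index = defaultdict(set)
    for i, (_, text) in enumerate(segments):
        for word in normalize(text):
            index[word].add(i)
    return index

def find_phrase(segments, index, phrase):
    """Return start times of segments containing the phrase as consecutive words."""
    words = normalize(phrase)
    if not words:
        return []
    # Only segments containing every word can contain the whole phrase.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    hits = []
    n = len(words)
    for i in sorted(candidates):
        tokens = normalize(segments[i][1])
        if any(tokens[j:j + n] == words for j in range(len(tokens) - n + 1)):
            hits.append(segments[i][0])
    return hits

INDEX = build_index(SEGMENTS)
```

With this in place, `find_phrase(SEGMENTS, INDEX, "public option")` gives the start times of the segments worth reviewing by hand. Exact matching like this will miss anything the recognizer got wrong, which is the limitation described above.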
