This is my experience
Last week, Pablinux told you about the new version of Kdenlive, the video editing tool
from the KDE project. As I once commented, I prefer OpenShot, which has a gentler
learning curve, but as I was very interested in the speech-to-text tool that this new
version incorporates, I decided to take a look at it.
I've said it in the past and I stand by it (come at me one at a time): free and open
source software has libraries for multimedia work that make Adobe and Blackmagic
products look like mere toys. The big problem is that nobody was interested in putting
these tools together behind a simple, attractive interface with complete,
easy-to-understand documentation. Although Kdenlive is still far from achieving that
goal, its developers are on the right track.
To convert speech to text, Kdenlive relies on two tools available from the Python
Package Index.
Vosk is an open source, offline speech recognition toolkit. It offers speech
recognition models for 17 languages and dialects: English, Indian English, German,
French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch,
Catalan, Arabic, Greek, Farsi, and Filipino.
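To give an idea of what happens between recognition and subtitles, here is a minimal sketch, not Kdenlive's actual code, of how timed words could be grouped into SubRip (SRT) subtitle entries. It assumes the recognizer reports each word with `word`, `start` and `end` fields (times in seconds), which is the shape Vosk uses when word timings are enabled. The sample data is made up, and the function names and the words-per-entry limit are hypothetical.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group timed words into numbered SRT entries of at most max_words each."""
    entries = []
    starts = range(0, len(words), max_words)
    for index, i in enumerate(starts, start=1):
        chunk = words[i:i + max_words]
        # An entry spans from its first word's start to its last word's end.
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        entries.append(f"{index}\n{start} --> {end}\n{text}\n")
    # SRT entries are separated by a blank line.
    return "\n".join(entries)


# Made-up sample data in the shape of timed recognition results.
sample_words = [
    {"word": "free", "start": 0.0, "end": 0.42},
    {"word": "software", "start": 0.42, "end": 1.05},
    {"word": "is", "start": 1.05, "end": 1.2},
    {"word": "on", "start": 1.2, "end": 1.35},
    {"word": "the", "start": 1.35, "end": 1.5},
    {"word": "right", "start": 1.5, "end": 1.8},
    {"word": "track", "start": 1.8, "end": 2.25},
]

print(words_to_srt(sample_words, max_words=4))
```

Splitting purely by word count is the crudest possible choice; a real tool would more likely break on pauses or on a maximum duration per subtitle.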
Kdenlive checks whether these modules are installed. If they are not, first install
the python3-pip package from your distribution's repositories and then use pip to
install them.
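As a rough sketch of that kind of check, the following Python snippet reports which modules are missing and prints the pip3 command that would install them. The module names are partly an assumption: the text names only Vosk, and `srt` is taken from Kdenlive's own documentation as the second module. The function name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be imported by this Python interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed module names; only Vosk is named explicitly in the article.
required = ["vosk", "srt"]

missing = missing_modules(required)
if missing:
    print("Missing modules, install them with:")
    print("  pip3 install " + " ".join(missing))
else:
    print("All speech-to-text modules are available.")
```

Keep in mind that the result only applies to the interpreter that runs the snippet, so run it with the same python3 that your distribution provides.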
Next, we have to install the voice models. To do so, open Kdenlive and go to
Settings → Configure Kdenlive → Speech to Text.
To load the models you have two options: either download the models from this page
and load them manually (you must first check the Custom models folder box), or
paste the link from the list shown on that same page.
I compared Speech to Text with the free version of a cloud tool, and I have also seen
auto-captioned videos on YouTube and on paid course platforms. I have to say that it is
not perfect, but it is no worse than those alternatives. It has problems when
the speakers do not have good diction or talk over music or some other sound.
But, anticipating the question you are about to ask me: yes, it can be used to subtitle
a series or movie. Because of the limitations mentioned, though, the subtitles may have
to be finished by hand.
And if the folks at Kdenlive get their act together and integrate a translation module,
it would be perfect.
There are things that could be improved. Today, if you want to change the
appearance of the subtitles, you have to insert code. And there is no way to
export them: you can only see them embedded in the video.
But, as I said above, without a doubt the project is on the right track.