One of our teachers shared this:
From Web First, we moved to Mobile First, now the age of Voice First Apps is almost here. Get Ready.— Zia U. Khan (@ziakhan) September 25, 2017
Indeed, tech giants like Apple and Microsoft have moved toward voice-first apps. As a developer, have you considered implementing voice-based interaction in your app? No? Why not? You should consider it right now! For this, you need to understand how to convert the user's words into actions. But wait: before that, you need to understand how to convert the user's voice to text, and only then text to action.
Remember the sequence:
"user voice to text, then text to an action".
In this post, we are going to deal with the first part, i.e., user voice to text. I am not an AI expert, nor have I designed an algorithm to share with you. Instead, we are going to look at cloud services built for this purpose.
If I ask you to suggest a speech-to-text service or API for my project, what will you do? Maybe you will google first and then answer, or you will refer me to Google Speech API, IBM Watson Speech to Text, or Bing Speech Service. That is fine, but are these services free?
Maybe to some extent (like a free trial or limited free access), but they are not free for production work where you have to handle tons of requests!
What if I tell you about a surprising service that provides speech-to-text at no cost? Sounds amazing! But wait, first let's look at the top speech-to-text services that a Google search proudly presents:
Watson Speech to Text - IBM
IBM provides a one-month free trial, and its basic pricing model is given below:
- Usage < 250K minutes a month: $0.02 per audio minute transmitted
- Any usage > 250K to 500K minutes a month: $0.015 per audio minute transmitted
- Any usage > 501K to 1MM minutes a month: $0.0125 per audio minute transmitted
- Any usage > 1MM minutes a month: $0.01 per audio minute transmitted
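To get a feel for what these tiers mean for a monthly bill, here is a small Python sketch. It rests on my own reading of the list: I assume the tiers are graduated (each rate applies only to the minutes that fall inside its band), and I treat the third band as starting right at 500K. The function name `watson_monthly_cost` is made up for this example, so check IBM's pricing page before relying on the numbers.

```python
# Upper bound of each band (in minutes) and its per-minute rate,
# copied from the pricing list above.
# Assumption: graduated tiers, i.e. each rate only applies to the
# minutes that fall inside its own band.
TIERS = [
    (250_000, 0.02),
    (500_000, 0.015),
    (1_000_000, 0.0125),
    (float("inf"), 0.01),
]


def watson_monthly_cost(minutes):
    """Estimate the monthly cost in USD for a number of audio minutes."""
    cost = 0.0
    lower = 0  # lower bound of the current band
    for upper, rate in TIERS:
        if minutes <= lower:
            break  # no minutes left for this band or any later one
        billable = min(minutes, upper) - lower
        cost += billable * rate
        lower = upper
    return cost


print(watson_monthly_cost(300_000))
```

Under these assumptions, 300,000 minutes in a month comes to about $5,750: the first 250K minutes at $0.02 plus the remaining 50K at $0.015.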
You can also check its pricing details here. I have used its trial version; it's good, but its language support is very limited (only 8 languages). [There is no support for German, and this is actually what drove me to look for a free speech-to-text service and write this article.]
The languages supported by IBM Speech to Text are Brazilian Portuguese, French, Japanese, Mandarin Chinese, Modern Standard Arabic, Spanish, UK English, and US English.
Speech API - Speech Recognition | Google Cloud Platform
Compared to the IBM Watson service, Google supports more languages: the API recognizes over 110 languages and variants, but it is slightly more expensive than the IBM service.
Bing Speech API—Speech Recognition | Microsoft Azure
It supports more languages than IBM Watson; you can find the supported languages here.
TIER | FEATURES | UNIT | PRICE |
---|---|---|---|
Bing Speech API – Free | | Transactions | 5,000 transactions free per month |
Bing Speech-to-Text API | Bing Speech-to-Text API | Transactions | $4 per 1,000 transactions |
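The same kind of back-of-the-envelope estimate works for the Bing table. This is only a sketch under my own assumptions: I treat the free and paid rows as two separate tiers (the 5,000 free transactions are not deducted from a paid bill), and I assume the paid tier is pro-rated per transaction rather than billed in whole blocks of 1,000. The function name `bing_monthly_cost` is hypothetical.

```python
# Figures taken from the pricing table above.
FREE_MONTHLY_TRANSACTIONS = 5_000
PAID_PRICE_PER_1000 = 4.0  # USD


def bing_monthly_cost(transactions, tier="paid"):
    """Estimate the monthly cost in USD for a number of transactions.

    Assumption: the free tier simply stops at its monthly cap, and the
    paid tier charges proportionally for every transaction.
    """
    if tier == "free":
        if transactions > FREE_MONTHLY_TRANSACTIONS:
            raise ValueError("free tier covers at most 5,000 transactions per month")
        return 0.0
    return transactions / 1000 * PAID_PRICE_PER_1000


print(bing_monthly_cost(100_000))
```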
Now that we have briefly introduced the giant speech-to-text services, it is time to reveal the free one: wit.ai.
wit.ai
wit.ai lets you convert speech to text (the only feature I have tested so far) free of charge and without any strict restrictions, as stated on their FAQ page:
"wit.ai is free, including for commercial use. So both private and public Wit apps are free and are governed by our terms"
"We don't have a strict rate limit. Use common sense and please contact us if you plan to hit our API quite heavily (sustained rate of 1 request/sec)."
Amazingly, it supports almost 50 languages, and what excites me most is that it also supports German (not 100% accurate, but acceptable):
- Albanian
- Arabic
- Azerbaijani
- Bengali
- Bosnian
- Bulgarian
- Burmese
- Catalan
- Chinese
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Finnish
- French
- Georgian
- German
- Greek
- Hebrew
- Hindi
- Hungarian
- Icelandic
- Indonesian
- Italian
- Japanese
- Korean
- Latin
- Lithuanian
- Macedonian
- Malay
- Norwegian
- Persian
- Polish
- Portuguese
- Romanian
- Russian
- Serbian
- Slovak
- Slovenian
- Spanish
- Swahili
- Swedish
- Tagalog
- Tamil
- Thai
- Turkish
- Ukrainian
- Vietnamese
It was super easy to implement; I implemented it using C# and the Unity3D game engine (I will share the details in a future post).
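Until then, here is a rough Python sketch of what a speech-to-text call can look like. It is not my C#/Unity code. It assumes the wit.ai HTTP endpoint `https://api.wit.ai/speech`, a bearer token taken from your app's settings, and a WAV file as input; the helper names `build_speech_request` and `transcribe` are mine. The response format depends on the API version, so the sketch just returns the raw response body. Check the wit.ai docs for the current request and response details.

```python
import urllib.request

# Speech endpoint of the wit.ai HTTP API as I understand it;
# verify it against the official docs.
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def build_speech_request(token, audio_bytes, content_type="audio/wav"):
    """Build (but do not send) a POST request carrying raw audio."""
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=audio_bytes,
        headers={
            "Authorization": f"Bearer {token}",  # access token of your wit.ai app
            "Content-Type": content_type,        # must match the audio you send
        },
        method="POST",
    )


def transcribe(token, audio_bytes):
    """Send the audio and return the raw response body as text."""
    request = build_speech_request(token, audio_bytes)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Example usage (needs a real token and a real recording):
# with open("hello.wav", "rb") as f:
#     print(transcribe("YOUR_WIT_TOKEN", f.read()))
```

Keeping the request building separate from the network call makes it easy to inspect the headers without spending a real request.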
Undoubtedly, Google Speech API, IBM Watson, and Bing Speech API each have their own strengths, but they come at a price. If you are looking for a free solution, wit.ai can be a suitable option, but remember to do some research before you start programming :). Cost is not the only concern: you also have to think about request limits, supported languages, customization options, and so on.
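On the request-limit point, a tiny client-side throttle is one way to stay polite toward a free API. The sketch below simply spaces calls at least `min_interval` seconds apart. The `Throttle` class is my own illustration, not something wit.ai provides, and which interval counts as "common sense" is for you to decide from the FAQ guidance.

```python
import time


class Throttle:
    """Space out calls so they are at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None    # time of the previous call; None before the first one

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Example usage: call throttle.wait() right before every API request.
throttle = Throttle(min_interval=2.0)
throttle.wait()  # the first call returns immediately
```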