Pricing Per User
$0.055
==========================================================
The average speaking rate for adults in English is around 125 to 150 words per minute
(WPM). The estimate here uses 155 WPM, slightly above that range.
Time to speak 15 words (in seconds) = (15 words / 155 WPM) * 60 seconds per minute
Time to speak 15 words (in seconds) ≈ 5.81 seconds
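The speaking-time arithmetic above can be checked with a short sketch. The function name `speaking_time_seconds` is hypothetical and used only for illustration; the 137.5 WPM case is included as a comparison because it is the midpoint of the quoted 125 to 150 range.

```python
def speaking_time_seconds(words: int, wpm: float) -> float:
    """Seconds needed to speak `words` words at `wpm` words per minute."""
    return words / wpm * 60.0


# Rate used in the notes (155 WPM).
print(f"{speaking_time_seconds(15, 155):.2f} s")
# Midpoint of the quoted 125-150 WPM range, for comparison.
print(f"{speaking_time_seconds(15, 137.5):.2f} s")
```

At 155 WPM this gives about 5.81 seconds; at 137.5 WPM the same 15 words take about 6.55 seconds.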
======================STT COSTS============================
Nvidia A40 GPU
Starting point: the cost of processing 15 words (approximately 5.81 seconds of audio)
on the Nvidia A40 GPU.
Scaled up, the cost of processing 10 hours of audio per user per month on the Nvidia
A40 GPU using the Bark model would be approximately $32.01.
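The notes give only the monthly A40 total, so the per-second rate behind it can only be back-calculated. The sketch below assumes cost scales linearly with audio duration; the derived values are implied by the $32.01 figure and are not quoted anywhere in the notes.

```python
SECONDS_PER_HOUR = 3600

monthly_cost_usd = 32.01   # figure quoted in the notes
audio_hours = 10           # hours of audio per user per month
utterance_seconds = 5.81   # 15 words at 155 WPM, from the notes

# Assumption: cost is proportional to seconds of audio processed.
implied_rate_per_second = monthly_cost_usd / (audio_hours * SECONDS_PER_HOUR)
implied_rate_per_hour = implied_rate_per_second * SECONDS_PER_HOUR
implied_cost_15_words = implied_rate_per_second * utterance_seconds

print(f"Implied cost per second of audio: ${implied_rate_per_second:.7f}")
print(f"Implied cost per hour of audio:   ${implied_rate_per_hour:.3f}")
print(f"Implied cost for 15 words:        ${implied_cost_15_words:.5f}")
```

Under that assumption the implied rate is roughly $3.20 per hour of audio, or about half a cent per 15-word utterance.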
-----------------------------------------------------------------------------------
Nvidia A100 GPU
The cost per second on A100 80GB GPU = $0.0004972222
The average speaking rate for adults in English is around 125 to 150 words per
minute. We'll use the average of these two values,
which is 137.5 words per minute.
The average voice note is around 6 seconds long and takes up about 16 KB of output
audio. Processing time is unknown!
Now, the cost of processing 10 hours of audio per user per month.
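As a rough sketch of that monthly figure, the per-second A100 rate quoted above can be multiplied out over 10 hours. Because the processing time is unknown, the example assumes the GPU is busy for one second per second of audio; that real-time factor is an assumption, not a measurement, and the helper name `monthly_gpu_cost` is hypothetical.

```python
A100_COST_PER_SECOND = 0.0004972222  # USD, rate quoted in the notes
SECONDS_PER_HOUR = 3600


def monthly_gpu_cost(audio_hours: float,
                     cost_per_second: float,
                     gpu_seconds_per_audio_second: float = 1.0) -> float:
    """Estimated monthly GPU cost in USD for `audio_hours` of audio.

    `gpu_seconds_per_audio_second` is an assumed real-time factor;
    1.0 means one GPU second is billed per second of audio.
    """
    gpu_seconds = audio_hours * SECONDS_PER_HOUR * gpu_seconds_per_audio_second
    return gpu_seconds * cost_per_second


cost = monthly_gpu_cost(10, A100_COST_PER_SECOND)
print(f"10 hours on the A100: ${cost:.2f}")
print(f"Hourly rate: ${A100_COST_PER_SECOND * SECONDS_PER_HOUR:.2f}")
```

With a real-time factor of 1.0 the estimate comes to about $17.90 per user per month. A measured processing time would replace the default factor and scale the result proportionally.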
The Nvidia A100 GPU running a custom, faster Bark is expected to process requests
faster than AWS Transcribe, IBM TTS, and Google Cloud Speech-to-Text.
Speech to Text

Service                        Transcription time (seconds)
Whisper on Nvidia A100 GPU     54
AWS Transcribe                 120
IBM STT                        150
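To make the comparison easier to read, the transcription times can be expressed relative to the Whisper figure. This is a small sketch using only the three numbers in the table; the notes do not say how long the test audio was, so the ratios show relative speed only. The dictionary keys are shortened labels chosen for the example.

```python
# Transcription times in seconds, taken from the table above.
transcription_seconds = {
    "Whisper (GPU)": 54,
    "AWS Transcribe": 120,
    "IBM STT": 150,
}

baseline = transcription_seconds["Whisper (GPU)"]

# Ratio of each service's time to the Whisper time (1.00 = same speed).
relative = {name: secs / baseline for name, secs in transcription_seconds.items()}

for name, ratio in relative.items():
    print(f"{name:<15} {transcription_seconds[name]:>4} s  {ratio:.2f}x Whisper time")
```

On these figures AWS Transcribe takes about 2.2 times as long as Whisper, and IBM STT about 2.8 times as long.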
===================================================================================
The effective cost per second of actual processing will be higher if the GPU sits idle
for part of the billed time.
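One way to quantify the idle-time effect is to divide the billed rate by the fraction of time the GPU does useful work. This is a minimal sketch, assuming idle seconds are billed at the same rate as busy ones; the function name and the utilization values are illustrative, not figures from the notes.

```python
def effective_cost_per_second(billed_rate: float, utilization: float) -> float:
    """Cost per second of useful GPU work.

    `utilization` is the fraction of billed time spent processing (0 < u <= 1).
    Assumes idle time is billed at the same rate as busy time.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the interval (0, 1]")
    return billed_rate / utilization


A100_RATE = 0.0004972222  # USD per second, rate quoted in the notes

for utilization in (1.0, 0.75, 0.5, 0.25):
    rate = effective_cost_per_second(A100_RATE, utilization)
    print(f"utilization {utilization:.0%}: ${rate:.7f} per useful second")
```

At 50% utilization the effective rate doubles, and at 25% it quadruples, so any per-user estimate depends on how fully the GPU is kept busy.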