{"data":{"site":{"siteMetadata":{"title":"Lime Brains","description":"We are The Software House where business questions meet software answers.","url":"https://limebrains.com"}},"markdownRemark":{"html":"<h1>Problem 😱</h1>\n<p>You want to transcribe an audio file into text.</p>\n<hr>\n<h1>Solution 🤓</h1>\n<p>We will use the Google Cloud Speech API, which can transcribe audio files into text.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token comment\"># Set GOOGLE_APPLICATION_CREDENTIALS=./gapi-auth.json in the environment before running</span>\n<span class=\"token keyword\">import</span> json\n<span class=\"token keyword\">import</span> subprocess\n\n<span class=\"token keyword\">from</span> google<span class=\"token punctuation\">.</span>cloud <span class=\"token keyword\">import</span> speech\n<span class=\"token keyword\">from</span> google<span class=\"token punctuation\">.</span>cloud<span class=\"token punctuation\">.</span>proto<span class=\"token punctuation\">.</span>speech<span class=\"token punctuation\">.</span>v1 <span class=\"token keyword\">import</span> cloud_speech_pb2\n\nspeech_client <span class=\"token operator\">=</span> speech<span class=\"token punctuation\">.</span>SpeechClient<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n\n\npath_to_file <span class=\"token operator\">=</span> <span class=\"token string\">'./input.mp4'</span>\noutput_path_to_file <span class=\"token operator\">=</span> <span class=\"token string\">'./out.flac'</span>\n\n\n<span class=\"token comment\"># Convert the input to a mono FLAC file with ffmpeg</span>\nsubprocess<span class=\"token punctuation\">.</span>call<span class=\"token punctuation\">(</span><span class=\"token punctuation\">[</span><span class=\"token string\">'ffmpeg'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'-i'</span><span class=\"token punctuation\">,</span> <span 
class=\"token string\">'{input_file}'</span><span class=\"token punctuation\">.</span><span class=\"token builtin\">format</span><span class=\"token punctuation\">(</span>input_file<span class=\"token operator\">=</span>path_to_file<span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'-ac'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'1'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'-ar'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'44100'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'-c:a'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'flac'</span><span class=\"token punctuation\">,</span> output_path_to_file<span class=\"token punctuation\">]</span><span class=\"token punctuation\">)</span>\n\n\n<span class=\"token keyword\">def</span> <span class=\"token function\">parse_speech_recognition_result</span><span class=\"token punctuation\">(</span>speech_recognition_result<span class=\"token punctuation\">)</span><span class=\"token punctuation\">:</span>\n    data <span class=\"token operator\">=</span> <span class=\"token punctuation\">{</span><span class=\"token string\">'data'</span><span class=\"token punctuation\">:</span> <span class=\"token punctuation\">[</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">}</span>\n    <span class=\"token keyword\">for</span> result <span class=\"token keyword\">in</span> speech_recognition_result<span class=\"token punctuation\">.</span>results<span class=\"token punctuation\">:</span>\n        <span class=\"token keyword\">for</span> alternative <span class=\"token keyword\">in</span> result<span class=\"token punctuation\">.</span>alternatives<span class=\"token punctuation\">:</span>\n            data<span class=\"token punctuation\">[</span><span class=\"token string\">'data'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>append<span class=\"token punctuation\">(</span><span class=\"token 
punctuation\">{</span>\n                <span class=\"token string\">'transcript'</span><span class=\"token punctuation\">:</span> alternative<span class=\"token punctuation\">.</span>transcript<span class=\"token punctuation\">,</span>\n                <span class=\"token string\">'confidence'</span><span class=\"token punctuation\">:</span> alternative<span class=\"token punctuation\">.</span>confidence<span class=\"token punctuation\">,</span>\n            <span class=\"token punctuation\">}</span><span class=\"token punctuation\">)</span>\n    <span class=\"token keyword\">return</span> data\n\n\n<span class=\"token keyword\">with</span> <span class=\"token builtin\">open</span><span class=\"token punctuation\">(</span>output_path_to_file<span class=\"token punctuation\">,</span> <span class=\"token string\">'rb'</span><span class=\"token punctuation\">)</span> <span class=\"token keyword\">as</span> recording_file<span class=\"token punctuation\">:</span>\n    recording_bytes <span class=\"token operator\">=</span> recording_file<span class=\"token punctuation\">.</span>read<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n    audio <span class=\"token operator\">=</span> cloud_speech_pb2<span class=\"token punctuation\">.</span>RecognitionAudio<span class=\"token punctuation\">(</span>content<span class=\"token operator\">=</span>recording_bytes<span class=\"token punctuation\">)</span>\n    config <span class=\"token operator\">=</span> cloud_speech_pb2<span class=\"token punctuation\">.</span>RecognitionConfig<span class=\"token punctuation\">(</span>encoding<span class=\"token operator\">=</span><span class=\"token string\">\"FLAC\"</span><span class=\"token punctuation\">,</span> sample_rate_hertz<span class=\"token operator\">=</span><span class=\"token number\">44100</span><span class=\"token punctuation\">,</span> language_code<span class=\"token operator\">=</span><span class=\"token string\">\"en-US\"</span><span 
class=\"token punctuation\">)</span>\n    speech_recognition_result <span class=\"token operator\">=</span> speech_client<span class=\"token punctuation\">.</span>recognize<span class=\"token punctuation\">(</span>config<span class=\"token operator\">=</span>config<span class=\"token punctuation\">,</span> audio<span class=\"token operator\">=</span>audio<span class=\"token punctuation\">)</span>\n    result <span class=\"token operator\">=</span> parse_speech_recognition_result<span class=\"token punctuation\">(</span>speech_recognition_result<span class=\"token punctuation\">)</span>\n\n    dump_result <span class=\"token operator\">=</span> json<span class=\"token punctuation\">.</span>dumps<span class=\"token punctuation\">(</span>result<span class=\"token punctuation\">)</span>\n\n    <span class=\"token keyword\">print</span><span class=\"token punctuation\">(</span>dump_result<span class=\"token punctuation\">)</span></code></pre></div>","excerpt":"Problem 😱 You want to transcribe an audio file into text. Solution 🤓 We will use the Google Cloud Speech API, which can transcribe audio files into text.","frontmatter":{"title":"How to process audio into text?","subtitle":"How to process audio into text?","date":"2017-11-04 23:50","seo":{"title":"How to process audio into text?","description":"How to process audio into text?","noindex":false}},"fields":{"slug":"/blog/2017-11-04T23:50-how-to-process-audio-into-text/"}}},"pageContext":{"slug":"/blog/2017-11-04T23:50-how-to-process-audio-into-text/","indexable":false}}