TABLE OF CONTENTS
1. Overview
2. Introduction to AWS Transcribe
3. AWS Transcribe
4. S3 trigger configuration
5. Lambda code for Transcribing and Storing the Text in S3
6. Architecture Diagram
7. Conclusion
8. Use Cases
9. CloudThat
10. FAQs

1. Overview
Audio-to-text conversion turns audio files into text. Computer programs cannot easily analyze audio files or extract important data from them, so the audio must be converted to text before it can be analyzed.
Software providers offer many speech-to-text tools. This article discusses AWS Transcribe, a service that converts speech to text.
2. Introduction to AWS Transcribe
Amazon Transcribe is an automatic speech recognition service that generates time-stamped transcripts from audio files. It is fully managed and continuously trained, and it lets developers add speech-to-text capabilities to their applications. Because computers cannot search or analyze audio data directly, recorded speech must be converted to text before applications can use it. Previously, customers had to work with transcription providers that required lengthy contracts and were difficult to integrate into their technology stacks. These providers often used outdated technology that adapted poorly to different situations, such as the low-fidelity phone audio common in contact centers, which resulted in poor accuracy.
3. AWS Transcribe
S3 triggers will be used to automate transcription from start to finish. This article will provide a detailed overview.
Create a Lambda role with access to the S3, CloudWatch, and AWS Transcribe services.
Create an S3 input bucket as well as an output bucket for AWS Transcribe.
Create a Lambda function in Python that triggers AWS Transcribe when a new .mp3 file is added to the input S3 bucket.
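The role described in the first step can be expressed as an IAM policy document. The sketch below is one possible least-privilege version, not the policy used in the article: the function name `build_lambda_policy` and the bucket names are hypothetical, and the exact set of actions your function needs may differ.

```python
import json


def build_lambda_policy(input_bucket: str, output_bucket: str) -> dict:
    """Return an IAM policy document for the Lambda execution role (illustrative)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the uploaded audio from the input bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{input_bucket}/*",
            },
            {
                # Write the transcript text file to the output bucket.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{output_bucket}/*",
            },
            {
                # Start a transcription job and read back its status.
                "Effect": "Allow",
                "Action": [
                    "transcribe:StartTranscriptionJob",
                    "transcribe:GetTranscriptionJob",
                ],
                "Resource": "*",
            },
            {
                # Let the function write its logs to CloudWatch Logs.
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names, for illustration only.
    print(json.dumps(build_lambda_policy("my-audio-input", "my-transcripts-output"), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy on the Lambda role.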
4. Set up a Trigger for S3
Click the ‘Add Trigger’ option on the Lambda function. Select ‘S3’ as the source and ‘PUT’ as the Event Type. The prefix refers to the folder and the suffix refers to the file type. For the demo, we only accept .mp3 files.
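For readers who prefer to script this step, the same trigger can be described as an S3 notification configuration. This is a sketch under assumptions rather than the article's setup: the helper names and the example ARN are hypothetical, and `s3_client` is assumed to be a `boto3.client('s3')` instance.

```python
def build_notification_config(lambda_arn: str, prefix: str = "", suffix: str = ".mp3") -> dict:
    """Build a notification configuration that fires on PUT for matching keys."""
    rules = []
    if prefix:
        rules.append({"Name": "prefix", "Value": prefix})  # the folder
    rules.append({"Name": "suffix", "Value": suffix})      # the file type
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:Put"],
                "Filter": {"Key": {"FilterRules": rules}},
            }
        ]
    }


def apply_notification(s3_client, bucket: str, config: dict) -> None:
    """Attach the configuration to the bucket.

    Note: this call replaces the bucket's existing notification configuration.
    """
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=config
    )
```

When the trigger is added through the console, the permission that lets S3 invoke the function is set up for you; a scripted setup would need to grant it separately.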
5. Lambda Code to Transcribe the Audio and Store the Text File in S3
First, we import the required libraries: boto3, requests, and json.
In the function's Configuration settings, you can increase the Lambda timeout; the default is 3 seconds.
The code reads the event and retrieves its bucket name, file name, and event ID.
Next, we create the S3 URL that the Transcribe job will use.
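These two steps can be sketched in isolation as follows. The helper names `parse_s3_event` and `build_object_url` are hypothetical. The sketch assumes a standard S3 notification event, in which `Records` is a list and object keys arrive URL-encoded (a space shows up as `+`).

```python
from typing import Tuple
from urllib.parse import unquote_plus


def parse_s3_event(event: dict) -> Tuple[str, str]:
    """Return (bucket, key) for the first record of an S3 notification event."""
    record = event["Records"][0]          # 'Records' is a list, not a dict
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])  # decode '+' and %XX escapes
    return bucket, key


def build_object_url(bucket: str, key: str) -> str:
    """Build the object URL in the same form the article's code uses."""
    return "https://s3.amazonaws.com/{0}/{1}".format(bucket, key)


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "audio-in"}, "object": {"key": "calls/demo.mp3"}}}
        ]
    }
    bucket, key = parse_s3_event(sample_event)
    print(build_object_url(bucket, key))
```

Amazon Transcribe also accepts media locations written as `s3://bucket/key`, which may be simpler for keys that contain spaces or other special characters.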
We start the transcription job and then get its details.
We call functions for starting the job and getting the transcription details.
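A minimal sketch of what such helper functions might look like is shown here. It is not the article's implementation: the function names, the polling interval, the `en-US` language code, and the job-name sanitizing are all assumptions. The `transcribe` argument is assumed to be a `boto3.client('transcribe')` instance.

```python
import re
import time


def make_job_name(key: str) -> str:
    """Derive a job name from the object key.

    Assumption: job names are limited to letters, digits, '.', '_' and '-',
    so other characters (such as '/') are replaced with '_'.
    """
    return re.sub(r"[^0-9a-zA-Z._-]", "_", key)[:200]


def start_transcription_job(transcribe, job_name: str, media_uri: str,
                            media_format: str = "mp3", language_code: str = "en-US") -> dict:
    """Start a transcription job for the given media file."""
    return transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        MediaFormat=media_format,
        LanguageCode=language_code,
    )


def wait_for_job(transcribe, job_name: str, poll_seconds: float = 5.0, max_polls: int = 60) -> dict:
    """Poll the job until it is COMPLETED or FAILED, then return the last response."""
    for _ in range(max_polls):
        response = transcribe.get_transcription_job(TranscriptionJobName=job_name)
        status = response["TranscriptionJob"]["TranscriptionJobStatus"]
        if status in ("COMPLETED", "FAILED"):
            return response
        time.sleep(poll_seconds)
    raise TimeoutError("Transcription job {0} did not finish in time".format(job_name))
```

Because the function waits for the job inside the handler, the Lambda timeout has to cover the whole transcription; the maximum Lambda timeout is 15 minutes. Job names must also be unique within the account, so uploading the same file name twice would need a suffix such as a timestamp.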
We retrieve the transcript file URL as well as other details from the JSON response.
We use requests to fetch the transcribed data from the URL.
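Once the transcript file has been downloaded and parsed as JSON, the spoken text sits under `results.transcripts`. The sketch below shows one way to pull it out. The function name `extract_transcript_text` is hypothetical, and how the payload is fetched (for example with `requests.get(url).json()`) is an assumption, since that part of the original code is not shown.

```python
def extract_transcript_text(payload: dict) -> str:
    """Join the transcript strings found in a Transcribe result document."""
    transcripts = payload.get("results", {}).get("transcripts", [])
    return " ".join(t.get("transcript", "") for t in transcripts).strip()


if __name__ == "__main__":
    # A trimmed, hand-written example of the result layout (not real output).
    sample_payload = {
        "jobName": "demo",
        "results": {
            "transcripts": [{"transcript": "Hello and welcome."}],
            "items": [],
        },
        "status": "COMPLETED",
    }
    print(extract_transcript_text(sample_payload))
```

Using `.get` with defaults means a payload without a transcript produces an empty string rather than a `KeyError`.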
Next, we create a text file and upload it to S3.
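The upload step could look roughly like this. The helper names are hypothetical, and the choice to name the text file after the audio file with a `.txt` extension is an assumption. `s3_client` is assumed to be a `boto3.client('s3')` instance.

```python
import os


def transcript_key_for(audio_key: str) -> str:
    """Swap the audio file's extension for .txt (e.g. calls/demo.mp3 -> calls/demo.txt)."""
    base, _ext = os.path.splitext(audio_key)
    return base + ".txt"


def upload_transcript(s3_client, bucket: str, audio_key: str, text: str) -> str:
    """Write the transcript text to the output bucket and return the key used."""
    key = transcript_key_for(audio_key)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=text.encode("utf-8"),
        ContentType="text/plain; charset=utf-8",
    )
    return key
```

Writing the object directly with `put_object` avoids creating a temporary file in the Lambda's `/tmp` directory.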
After the code is executed, we will see a text file in S3 and a transcription job in the AWS Transcribe service.
import boto3
import json
import requests

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    try:
        # 'Records' is a list; the uploaded object is described by its first entry
        file_bucket = event['Records'][0]['s3']['bucket']['name']
        file_name = event['Records'][0]['s3']['object']['key']
        # Build the object URL that is passed to the transcription job
        object_url = 'https://s3.amazonaws.com/{0}/{1}'.format(file_bucket, file_name)
        transcriptionJobDetails = startTranscriptionJob(file_name, object_url)
        status = getTranscriptionJob(file_name)
        url = status['TranscriptionJob']['Transcript']['TranscriptFileUri']
        Text_Data = (requests.