October 13th 2023

Sergei SavvovFollowBetter Programming--1ListenShareHave you ever thought about the necessity of an assistant that could answer an interviewer’s question for you? With the current advancements in AI, this has become a reality!In this article, we will build a small Application using Whisper for voice recognition and ChatGPT for text generation. We will also wrap our application in a simple UI that will help you succeed in any job interview.Disclamer: I strongly advise against using the created application for its direct purpose. The goal of this article is to demonstrate how you can build in one evening a working prototype that was only a dream a year ago.Our application, at the press of a button, will record audio — it is crucial to activate it promptly during a question. Subsequently, we will transcribe speech from the audio using OpenAI’s Whisper model. After this, we will ask ChatGPT to compose a response to the question and display it on the screen. This process will take a bit of time, so you will be able to respond to the interviewer’s question virtually without hesitation.What are we waiting for? Let’s get started!Since we aim to develop an application that operates independently of the platform through which you’re conducting your calls — be it Google Meet, Zoom, Skype, etc., we can’t leverage the APIs of these applications. Therefore, we need to capture audio directly on our computer.It’s crucial to note that we will record the audio stream not via the microphone but the one flowing through the speakers. After a bit of googling, I found a soundcard library. The authors claim it to be cross-platform, so you should encounter no issues with it.The only drawback for me was the need to explicitly pass the range of time for how long the recording will take. However, this problem is solvable, as the recording function returns the audio data in a numpy array, which can be concatenated.So, the simple code to record audio streams through speakers would look like this:Subsequently, we can save it as a .wav file using the soundfile library:Here you will find the code related to audio recording.This step is relatively simple, as we will be utilizing the API of the pre-trained Whisper model from OpenAI:During testing, I found the quality to be excellent. Transcribing the audio recordings was also quick, which is why I chose this model. In addition, this model can process languages other than English, allowing you to go through interviews in any language.If you prefer not to use the API, you can run it locally. I would advise you to use whisper.cpp. It is a high-performance solution that doesn’t consume a lot of resources (the library’s author ran the model on an iPhone 13 device).Here you will find the Whisper API documentation.To generate an answer to the interviewer’s question, we will be using ChatGPT. Although using the API seems like a simple task, we will need to solve two additional issues:To manage this, we will clearly specify in the system prompt that we are using potentially imperfect audio transcriptions:To accelerate the generation, we will make two simultaneous requests to ChatGPT. This concept closely resembles the approach outlined in the Skeleton-of-Thought article and is visually represented below:The initial request will generate a quick response, consisting of no more than 70 words. This will help continue the interview without any awkward pauses:The second request will return a more detailed answer. This is necessary to support further engagement in the conversation:It’s worth noting that the prompt employs the structure “take a deep breath and think step by step”, a method proven by recent studies to provide superior response quality.Here you will find the code related to the ChatGPT API.To visualize the response from ChatGPT, we need to create a simple GUI application. After exploring several frameworks, I decided to settle on PySimpleGUI. It allows for the creation GUI applications trivially with a full set of widgets. In addition, I need the following functionality:Below is an example of code for creating a simple application that sends requests to OpenAI API in a separate thread using perfrom_long_peration:Here you will find the code related to the GUI application.Now that we have explored all the necessary components, it’s time to assemble our application. We’ve delved into acquiring audio, translating it into text using Whisper, and generating responses with ChatGPT. We also discussed creating a simple GUI:Let’s now take a look at a demonstration video to see how all these components work together. This will provide a clearer understanding of how you can utilize or modify this application for your needs.Here is a demonstration:If you wish to advance and enhance this solution, here are a few ways for improvement:We’ve been on an interesting journey to see how artificial intelligence, specifically Whisper and ChatGPT, can become a handy assistant during job interviews. This application is a peek into the future, showing us how seamlessly technology can fit into our everyday tasks, making our lives a little easier. However, the purpose of this app is to explore and understand the possibilities of Generative AI, and it is important to use such technology ethically and responsibly.In wrapping up, the possibilities with AI are truly endless and exciting. This prototype is just a stepping stone and there is much more to explore. For those looking to delve deeper into the world, the road is open and who knows what incredible innovations are waiting around the corner.If you have any questions or suggestions, feel free to connect on LinkedIn.----1Better ProgrammingI'm a Machine Learning Engineer, chemist, and nomad https://www.linkedin.com/in/sergey-savvov/Sergei SavvovinBetter Programming--13Benoit RuizinBetter Programming--204VinitainBetter Programming--34Sergei SavvovinBetter Programming--4Adrian H. RaudaschlinTowards Data Science--11Allen HeltoninBetter Programming--3Thomas SmithinThe Generator--40Diana DovgopolinArtificial Corner--29Jesus Rodriguez--WhitespectreinWhitespectre Ideas--1HelpStatusBlogCareersPrivacyTermsAboutText to speechTeams

The Ultimate Guide to Cloud Gaming: D…
best projectors for home

This post first appeared on VedVyas Articles, please read the originial post: here

People also like

The Ultimate Guide to Cloud Gaming: Discover the Best Services

best projectors for home

Hack Your Next Interview with Generative AI

Related Articles

Hack Your Next Interview with Generative AI

Related Articles

Share the post

Subscribe to Vedvyas Articles

Thank you for your subscription