September 26th 2023

OpenAI has introduced significant updates to ChatGPT, adding voice conversations and image recognition. The AI-powered chatbot can now discuss images shared by users and hold back-and-forth spoken conversations using one of five distinct voices.

The new image recognition feature lets ChatGPT interpret and provide information about images that users submit on any platform where the chatbot is available. ChatGPT can also hold voice conversations using OpenAI's Whisper speech recognition tool and a new text-to-speech (TTS) technology developed by the company. The TTS technology is designed to produce audio responses that closely resemble human speech in the ChatGPT mobile app.

OpenAI has confirmed that its image recognition capability will be accessible on all platforms, while voice conversations will initially be available as an opt-in feature on iOS and Android. These features will be accessible to ChatGPT Plus and Enterprise subscribers. There is currently no information regarding whether these enhancements will be extended to users on the free tier in the future.

To enable voice conversations on ChatGPT, users can navigate to the Settings menu and select "New Features," where they can toggle the option for voice conversations. Users can choose from five different voices, all developed in collaboration with professional voice actors. The chatbot converts a spoken query into text, generates a response to that text, and then converts the response into human-like audio using the new TTS technology.
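The three-stage flow described above (speech to text, text to response, response to audio) can be pictured with a minimal Python sketch. This is not OpenAI's implementation: `run_voice_turn`, `VoiceTurn`, the `fake_*` stand-in functions, and the voice name `"voice-1"` are all hypothetical and exist only to show how the stages chain together. In a real app, the three stages would be calls to a speech recognizer, a chat model, and a TTS engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One spoken exchange: what was heard, the text answer, and the audio answer."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    voice: str = "voice-1",  # hypothetical voice identifier
) -> VoiceTurn:
    # Stage 1: speech recognition turns the spoken query into text.
    transcript = transcribe(audio_in)
    # Stage 2: the chat model answers the transcribed text.
    reply_text = generate_reply(transcript)
    # Stage 3: text-to-speech renders the answer in the selected voice.
    reply_audio = synthesize(reply_text, voice)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_reply(text: str) -> str:
    return f"You asked: {text}"


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


turn = run_voice_turn(
    b"How do I fix a bike chain?", fake_transcribe, fake_reply, fake_synthesize
)
print(turn.reply_text)
```

Passing the stages in as functions keeps the sketch independent of any particular speech or language service; swapping the selected voice only changes the argument handed to the final stage.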

OpenAI's TTS technology is not exclusive to ChatGPT. Spotify recently revealed an AI-based voice translation tool for podcast creators that uses the same technology to automatically translate podcasts from English into French, German, and Spanish. The tool is currently being tested with select podcast hosts and will subsequently be made available to Spotify users worldwide.

The image recognition tool integrated into ChatGPT operates on the company's multimodal GPT-3.5 and GPT-4 models, enabling it to analyze images and text contained within photos, screenshots, and documents. Users can either capture a new image or share an existing one from their mobile device with ChatGPT to receive insights and information from the chatbot.

ChatGPT also supports sharing multiple images in a single discussion. Users can use the built-in drawing tool to highlight specific areas of interest within an image so that ChatGPT can focus its guidance on them. For instance, outlining a problem area in a shared photo, such as a dislodged bicycle chain, may prompt the chatbot to suggest ways to fix it.

This post first appeared on Technical News.