OpenAI Boosts ChatGPT's Functionality with Verbal Conversation and Image Recognition Capabilities

OpenAI, a prominent player in artificial intelligence, is expanding the capabilities of its assistant, ChatGPT. Originally designed as a text-based chatbot, ChatGPT will now offer voice and image processing abilities, creating a more interactive experience for its users.

Since its introduction approximately nine months ago, ChatGPT has become a phenomenon in the technology industry. It is widely used to compose essays, create poems, and summarize long texts from simple text prompts. The assistant is now set to become more engaging: it will listen to users, allowing for spoken interactions.

Users will be able to hold a voice conversation with ChatGPT. For instance, the assistant could be asked to tell an impromptu bedtime story guided by verbal cues from the user. Simple questions can also be put to the assistant, and the answers will be delivered in spoken language.

Image-based queries are also being introduced. Users can upload an image and ask ChatGPT to identify or explain what it shows, or to give directions for achieving a specific goal.

The voice capability is powered by a new text-to-speech model that can produce human-like audio from text and a short speech sample. OpenAI said it collaborated with professional voice actors to create five voices. The organization's open-source Whisper speech recognition system converts the user's speech to text.

Spotify has stepped in as a launch partner. It has introduced a feature that lets podcasters translate their shows from English into Spanish, French, or German while preserving the sound of their own voice. Access to this technology is not universal, however. For the initial launch it is available only to select podcasters, including Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.

In a blog post, OpenAI acknowledged the risks associated with its new voice technology, in particular that malicious actors could use it for fraud or impersonation. The company says it is therefore taking a cautious approach to the release.

The new features will roll out over the next two weeks, initially to Plus and Enterprise subscribers. To use voice, users go to 'settings' in the app, select 'new features', opt in to voice conversations, tap the headphone button in the top-right corner, and pick their preferred voice.

At first, voice conversations will be available only in the ChatGPT Android and iOS apps, as an opt-in beta. The image feature, however, will be available on all platforms by default.

Many no-code platforms, such as AppMaster, are waiting to see the range of applications the enhanced ChatGPT will enable. Building enterprise software without code often relies on this kind of AI assistance to improve interactivity and user experience.
