What can OpenAI’s new GPT-4o AI model do?

Days after denying rumours of a new AI search engine and a GPT-5 release, OpenAI livestreamed the launch of its new flagship AI model, GPT-4o, which can accept audio and visual inputs and generate output almost flawlessly. The ‘o’ in GPT-4o stands for “omni,” meaning the model can receive multimodal inputs through text, audio, and images, unlike the early days of ChatGPT, when users had to submit text to receive a text response.

OpenAI claims GPT-4o can respond to audio input in as little as 232 milliseconds, with an average response time of 320 milliseconds. The AI interface uses the usual fillers, or sometimes repeats part of the question, to cover for this latency.

While users could already use tools to speak with ChatGPT, that feature worked by chaining three separate models: one to turn the user’s voice into text, one to carry out operations on that text, and one to return an audio-based result. With GPT-4o, a single neural network handles all of these layers, so the model can respond faster and glean more insights from the user and their surroundings.
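The difference between the two designs can be illustrated with a toy sketch. Everything here is hypothetical: `AudioInput`, the stub functions, and the string-based "replies" are illustrative stand-ins, not OpenAI's implementation or API. The sketch only shows why a transcript-only pipeline loses cues such as tone of voice or background sounds, while a single model that receives the raw input can still use them.

```python
from dataclasses import dataclass


@dataclass
class AudioInput:
    """Hypothetical stand-in for a spoken request."""
    words: str        # what the user said
    tone: str         # how they said it, e.g. "nervous"
    background: str   # what else is audible, e.g. "dog barking"


def transcribe(audio: AudioInput) -> str:
    # Stage 1 of the older pipeline: speech to text.
    # Only the words survive; tone and background are dropped here.
    return audio.words


def text_model(prompt: str) -> str:
    # Stage 2: a text-only model that never sees the original audio.
    return f"reply to: {prompt}"


def synthesize(text: str) -> dict:
    # Stage 3: text back to speech (represented as a dict for simplicity).
    return {"speech": text}


def cascaded_voice_mode(audio: AudioInput) -> dict:
    """Three separate models chained together."""
    return synthesize(text_model(transcribe(audio)))


def end_to_end_voice_mode(audio: AudioInput) -> dict:
    """One model receives the whole input, so every cue is available."""
    reply = (
        f"reply to: {audio.words} "
        f"(tone={audio.tone}, background={audio.background})"
    )
    return {"speech": reply}


if __name__ == "__main__":
    request = AudioInput("how do I look?", "nervous", "dog barking")
    print(cascaded_voice_mode(request))
    print(end_to_end_voice_mode(request))
```

In the chained version, the reply can only refer to the words; in the single-model version, the tone and the background sound are still available when the reply is formed.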


What all can GPT-4o do?

OpenAI ran several demos to show off the diverse abilities of GPT-4o across audio, images, and text. The AI interface, based on a user’s instructions, can turn a picture of a man into a caricature, create and manipulate a 3D logo, or attach a logo to an object. It can also generate meeting notes based on an audio recording, design a cartoon character, and even make a stylised movie poster with real people’s photos.

In promotional video clips, GPT-4o assessed a man’s readiness for an interview and joked about his being dressed too casually, thus demonstrating its visual understanding. In others, it helped set up a game, assisted a child in solving a math problem, recognised real-life objects in Spanish, and even expressed sarcasm.

OpenAI did not shy away from praising the new model, claiming that it beat existing rivals such as Claude 3 Opus and Gemini Ultra 1.0, as well as its own GPT-4 offering, in several text and vision understanding evaluations.

What can’t it do?

While GPT-4o can process text, audio, and images, one noticeable omission is video generation – despite the model’s vision understanding capability. So, users cannot ask GPT-4o to give them a fleshed-out movie trailer, but they can ask the model questions about their surroundings by making the AI see the user’s environment through their smartphone’s camera.

Furthermore, GPT-4o made some slip-ups and errors when demonstrating its abilities. For example, when converting two portraits into a crime movie-style poster, the model initially produced gibberish instead of text. Though the results were later refined, the final product also had a slightly raw AI-generated feel.

GPT-4o comes at a crucial time for the ChatGPT-maker, which is now in competition with other Big Tech firms fine-tuning their own models or turning them into business tools.

While companies like Google are offering free chatbots that access information in real time, OpenAI fell behind as it put in place a knowledge cut-off for the most basic, free version of ChatGPT. This means non-paying users were receiving outdated information from a less developed model when compared to users trying out cutting-edge offerings from rivals.

It remains to be seen how far GPT-4o will enhance the ChatGPT experience for non-paying users.

Who can use this AI model?

ChatGPT will immediately be getting GPT-4o’s text and image capabilities, said OpenAI. Significantly, even non-paying users of ChatGPT will be able to experience GPT-4o. ChatGPT Plus users will get increased message limits along with the upgrade, while a new version of Voice Mode is also planned for them.

“GPT-4o is 2x faster, half the price, and has 5x higher rate limits compared to GPT-4 Turbo. We plan to launch support for GPT-4o’s new audio and video capabilities to a small group of trusted partners in the API in the coming weeks,” said OpenAI in its post.
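To make the quoted ratios concrete, here is a small arithmetic sketch. The baseline figures are placeholders chosen purely for illustration, not OpenAI's actual GPT-4 Turbo prices, limits, or latencies, and the function name is hypothetical. It also assumes that "2x faster" means half the response time, which is one reading of the claim.

```python
def apply_gpt4o_ratios(baseline_price: float,
                       baseline_rate_limit: float,
                       baseline_latency: float) -> dict:
    """Scale placeholder GPT-4 Turbo figures by the ratios in OpenAI's quote."""
    return {
        "price": baseline_price * 0.5,           # "half the price"
        "rate_limit": baseline_rate_limit * 5,   # "5x higher rate limits"
        "latency": baseline_latency / 2,         # "2x faster" (assumed: half the time)
    }


if __name__ == "__main__":
    # Placeholder baseline: 10 price units, 100 requests per minute, 2 seconds.
    print(apply_gpt4o_ratios(10.0, 100, 2.0))
```

With those placeholder inputs, the scaled figures come out to 5 price units, 500 requests per minute, and 1 second.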

What safeguards are in place for GPT-4o?

As generative AI systems grow more advanced and organic with improved response times, there are fears they will be misused for purposes such as carrying out scam calls, threatening people, impersonating non-consenting individuals, creating false but believable news media, etc.

OpenAI said that GPT-4o had been tested but that the company would continue to investigate risks and address them quickly, in addition to limiting certain audio features at launch.

“GPT-4o has safety built-in by design across modalities, through techniques such as filtering training data and refining the model’s behaviour through post-training. We have also created new safety systems to provide guardrails on voice outputs,” said OpenAI, adding that over 70 experts across fields such as social psychology, bias/fairness, and misinformation had carried out red-team testing.

What does GPT-4o have to do with the Hollywood film ‘Her’?

When announcing the launch of GPT-4o, OpenAI CEO Sam Altman posted the word “her” on X.

This was taken to be a reference to the 2013 Hollywood sci-fi romance film written and directed by Spike Jonze, in which the protagonist played by Joaquin Phoenix grows infatuated with an AI assistant played by Scarlett Johansson.

In most of the demo clips shared by OpenAI, GPT-4o’s “voice” sounded female. Unlike more basic iterations, the voices in OpenAI’s latest model were expressive, friendly, and even affectionate, sounding more like a friend – or someone closer – than a machine-generated voice.

The GPT-4o voice reacted in typically human ways, such as cooing at an adorable dog, giving a man fashion advice, and guiding a student working on a math problem.
