Grok gets glasses to see what you’re talking about
X’s AI chatbot can explain all those memes, if not succinctly
X (formerly Twitter) Premium subscribers can now ask the Grok AI assistant to describe images, not just make them. The Elon Musk-owned company xAI unveiled a new feature for visual content analysis, giving Grok the ability to describe photos, diagrams, and other snapshots using the Grok-2 AI model, which powers the AI chatbot and its Flux AI image creation.
The feature brings Grok to parity with ChatGPT, Gemini, and other rivals. If you subscribe to one of X’s paid plans, you can try it out now by clicking on a button in an image post within X and asking Grok questions about the image or just for a straight descriptive analysis.
In tandem with the new feature, xAI showed off a new benchmark called RealWorldQA that is supposed to show how well a model can describe a real-world image, including the space between objects. The company claims RealWorldQA shows Grok to be as good as or better than its rivals at explaining images, even though it’s still in development. You can see an example of how it works below, shared on X by Elon Musk.
Grok now understands images, even explaining the meaning of a joke. This is an early version. It will rapidly improve. https://t.co/gQ5BBISVRc (October 28, 2024)
See and Grok
As the screenshot illustrates, Grok is capable of breaking down a complex multi-stage image and explaining what happens in it. It can then extrapolate the humor of the joke, though, as is almost always the case, explaining the joke makes it much less funny. Still, it’s a sign that xAI is not done with putting out new features for Grok, especially multimodal tools. This could be a step toward Grok being able to explain audio and video content the same way it does with visuals.
One element not mentioned is how Grok’s visual analysis might describe the output of the chatbot’s own freewheeling image creation, which seems to have little or no compunction about copyright issues. It’s something that users making images of Mario faced when Nintendo’s copyright infringement hunter Tracer went after them. Whether an AI image of Mario or any other intellectual property would be described as such or in more generic terms would be interesting to discover.
xAI’s owner being who he is, there’s also very obvious potential for the feature in other Musk-owned technology companies. Tesla’s semi-autonomous driving would certainly benefit from being able to identify people and objects around it and how they are spaced apart. The same goes for the long-promised humanoid robots Tesla’s had under development for the last few years.
Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of the world and technology. For the last five years, he served as head writer for Voicebot.ai and was on the leading edge of reporting on generative AI and large language models. He’s since become an expert on the products of generative AI models, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, and every other synthetic media tool. His experience runs the gamut of media, including print, digital, broadcast, and live events. Now, he’s continuing to tell the stories people want and need to hear about the rapidly evolving AI space and its impact on their lives. Eric is based in New York City.