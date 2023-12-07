For Google, 2023 would have been a forgettable year, as it struggled to keep pace with OpenAI's generative Artificial Intelligence (Gen AI) bot ChatGPT. But, to everyone's surprise, the search engine giant's DeepMind team on Wednesday (December 6) announced Gemini, a new advanced multi-modal Large Language Model (LLM). It is touted to be better than OpenAI's GPT-4 in most performance benchmarks.
Google is betting big on Gemini and believes it can bring about a revolution in the fields of science and engineering. And, as Google plans to bring it to Bard, Search and Pixel phones through updates, the company expects it to significantly improve people's productivity.
"This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company. I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere," said Sundar Pichai, CEO of Alphabet Inc. and Google.
Google's Gemini 1.0 LLM is being offered in three forms: Ultra, Pro, and Nano. Gemini Ultra, the most advanced of the three, is suited for big data centres and research studies, while Gemini Pro is meant for computers and Google Search, for all users.
Gemini Nano, meanwhile, is tailor-made for smaller gadgets such as smartphones. It is capable of running on-device without any need for internet connectivity.
Google DeepMind's Gemini comes in three forms: Ultra, Pro and Nano.
Credit: Google
Google's Gemini teasers have become a rage on social media. In the three-minute video demo, Gemini was able to see things and respond to voice and visual inputs with human-like reasoning and suggestions, a marvellous feat, which ChatGPT can't do.
It is capable of understanding five different types of information: text, code, audio, image, and video. So, users can interact seamlessly with Gemini using any of the five aforementioned modalities.
This way, Gemini can understand nuanced information and answer questions or explain complicated subjects such as math and physics.
For instance, when a Google employee asked Gemini for ideas on how to celebrate his daughter's birthday, it first asked what she likes and, based on his answers, switched to visual output, offering multiple outdoor themes (all images generated on the spot) and going on to offer tips on food options and more. With the intuitive user interface, users can go deeper to get a step-by-step guide to baking a cupcake too.
Google also collaborated with YouTuber Mark Rober, popular for his innovative science and engineering experiments for children. Rober got early access to the Gemini Pro-infused Bard AI chatbot and was able to conceptualise a fascinating paper airplane flight-test experiment and bring it to reality.
From designing the plane's wing configuration to testing the accuracy of the flight and finally executing the experiment, Rober was able to finish it in far less time than usual.
Throughout, Rober interacted with Bard through text prompts and was able to get practical suggestions, and even solutions whenever the experiment hit an obstacle. During the flight accuracy test, the big paper plane failed to pass through the fire ring. Due to low pressure under the fire, the plane fell right in front of the ring. After the pressure on the ground was increased, the plane finally went through the fire ring. Rober noted that it usually takes a year to turn a concept into reality, but the new Gemini Pro-powered Bard helped him finish it in three weeks.
Gemini Ultra is said to be even better, and the company is conducting safety checks to weed out any human-like biases and ensure it serves people for good purposes only.
"With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine, and ethics for testing both world knowledge and problem-solving abilities," Google said.
For now, users can try out Bard with Gemini Pro for text-based prompts, with support for other modalities coming soon. It will be available in English in more than 170 countries and territories to start, and will come to more languages and places, such as Europe, shortly.
The company has announced that it will bring several of Gemini's advanced capabilities to Bard, Google Search and the Pixel 8 Pro through the latest software updates.
"Every technology shift is an opportunity to advance scientific discovery, accelerate human progress, and improve lives. I believe the transition we are seeing right now with AI will be the most profound in our lifetimes, far bigger than the shift to mobile or to the web before it. AI has the potential to create opportunities — from the everyday to the extraordinary — for people everywhere. It will bring new waves of innovation and economic progress and drive knowledge, learning, creativity, and productivity on a scale we haven’t seen before. That’s what excites me: the chance to make AI helpful for everyone, everywhere in the world," said Sundar Pichai, CEO, Alphabet & Google.