Gemini Unveiled: Key Insights into Google's Answer to OpenAI's GPT

Google claims that Gemini 1.0 was trained on its AI-optimised infrastructure using its in-house-designed Tensor Processing Units (TPUs) v4 and v5e

Preeti Anand
Google Gemini

In 2023, Google, long regarded as the top dog in AI, was beaten at its own game by OpenAI, a startup few had heard of before its remarkable rise. Caught off guard, Google hurried to catch up with its own AI chatbot service, Bard, but users found it underwhelming compared with the competition. Now the search giant appears to be preparing to dethrone OpenAI with a fresh start: a new family of AI models named Gemini that appears to have been built from the ground up. According to Google's own comparisons, the model outperforms GPT-4 on numerous multimodal benchmarks. The Pixel 8 Pro and Bard have already received versions of it, but this is just the beginning.

What Google has in store with Gemini, the most serious threat to ChatGPT

Gemini perceives and communicates like an actual human

Gemini, like GPT-4, is an AI model that users do not interact with directly. Instead, it serves as a foundation on which Google and, eventually, other developers can build products. According to Google, the model was designed from the ground up to be multimodal, meaning it can work with and combine many kinds of information, such as text, audio, images, code, and video. In Google's demo, it recognises images, converses in real time, and even solves physics problems.

Multimodality alone does not distinguish Gemini from GPT-4, which was also billed as multimodal. What sets it apart is its versatility: Gemini is a family of models rather than a single one, and it can run on everything from data centres to mobile devices.

Tensor Processing Units were used for training

Google claims that Gemini 1.0 was trained on its AI-optimised infrastructure using its in-house-designed Tensor Processing Units (TPUs) v4 and v5e. If the name seems familiar, it is because Google's Pixel phones use a chipset branded Tensor, which includes a mobile TPU, although the data-centre TPUs used for training are different hardware. According to Google, Gemini runs faster on TPUs than earlier, smaller, and less capable models.

Can run on almost anything

Gemini 1.0 comes in three sizes: Ultra, Pro, and Nano. Gemini Ultra is Google's most powerful LLM to date and is intended for enterprise applications involving "highly complex tasks." Gemini Pro is the most versatile of the three and has already been integrated into Bard for prompts requiring sophisticated reasoning, planning, and comprehension. Beginning 13 December, developers and enterprise clients can access this model through the Gemini API in Google AI Studio or Google Cloud Vertex AI. Meanwhile, the Pixel 8 Pro includes Gemini Nano, marketed as the most efficient model for on-device tasks, to handle features such as summarisation and Smart Reply.

Safety inspection

Reasoning ability and accuracy are two of the most essential characteristics of a 'good' AI model, yet they are only meaningful if backed by sufficient safety checks. To that end, Google claims to have used "best-in-class adversarial testing techniques" before releasing Gemini. The company says it has implemented safeguards and built dedicated safety classifiers to help the model avoid harms such as bias, toxicity, and content that encourages violence.

In benchmarks, Gemini Ultra outperforms GPT-4

Google claims that Gemini outperforms the competition across tasks, citing a research paper in which Gemini Ultra led the field in six of eight benchmarks. When multimodal capabilities such as natural image, audio, and video understanding are included, Gemini Ultra exceeds state-of-the-art results on 30 of the 32 benchmarks used in large language model (LLM) development. However, the paper's benchmarks also showed that only Gemini Ultra exceeded GPT-4, while the consumer-oriented Gemini Pro sat comfortably between GPT-3.5 and GPT-4.

Benchmarks are only benchmarks, but if the results translate to real-world use, Gemini Pro, which like GPT-3.5 is free to use, could benefit the typical user because it appears to be better than GPT-3.5 at many tasks.

Bard gets a new lease of life

Gemini is also coming to Bard. Beginning on Wednesday, the AI chatbot will use a modified version of Gemini Pro for sophisticated reasoning, planning, comprehension, and other tasks in English. A second version of the chatbot, called Bard Advanced, will be released early next year, granting access to the company's most advanced models, such as Gemini Ultra. Like ChatGPT Plus, this will likely require a subscription.

The Pixel 8 Pro will also benefit

The Pixel 8 Pro is another Google product that will benefit from Gemini, using the technology to power several on-device experiences. These include Summarise in the Recorder app, which generates a summary of your recorded conversations, interviews, and more, even when you are not connected to the internet. The model also powers Smart Reply in Gboard, available as a developer preview, suggesting "high-quality responses with conversational awareness." According to Google, this will be offered first in WhatsApp, with more apps to follow next year.
