Generative AI models like ChatGPT have competition

A few weeks ago in IT Matters I wrote about the detractors of Foundation Models, a completely new approach to artificial intelligence (AI). These are also called “generative AI models”. Generative models have become popular since they outperform traditional methods of training AI programs with smaller data sets. The salient feature of generative models is that they examine nearly every bit of information that is available on the web, a data repository that doubles in size every two years, and then use the results to train AI programs to generate output.

OpenAI, heavily backed by Microsoft, has two such models: one called GPT-3, which is primarily for text, and another called DALL-E, which focuses on images. GPT-3 analyzed thousands of digital books and almost a billion words published on blogs, social networks and the rest of the Internet. Its competitor is Google, whose own offering in generative AI is called BERT. Most industry observers expected generative AI to move on to newer models, such as a potential GPT-4 in 2023. However, it seems that OpenAI and other companies aren’t done playing with their old models just yet. In early December, the San Francisco company released a demo of a new model called ChatGPT, a derivative of GPT-3 that is designed to answer questions in a back-and-forth dialogue. This can drive industry applications such as chatbots, which are widely used in customer service.

What is surprising is that ChatGPT is able to produce short texts that are remarkable for how coherent and eloquent they seem to be. It has a variety of daily use cases and its versatility is amazing. It gained 1 million users in just five days.

A scientist friend sent in a sample last week after asking ChatGPT to create a short explanation of itself. He was rewarded with what reads like a well-written essay by a Class 6 student. It did an adequate job of describing itself and, to me, was clearly an advertorial for the product. This second part is the problem with generative models: they regurgitate the garbage or hyperbole that has been fed to them.

By contrast, more focused cognitive models have smaller data sets (some of them even filled with dummy data) that are used to train AI programs for specific use cases. For example, a medical-radiological system would be limited to X-rays, MRIs, and other similar medical images. It probably would not be trained on poetry, music or other information that has no relevance to the task at hand.

Interestingly, however, Big Tech companies are not the only players at the forefront of generative AI. There has been an open source revolution to match, and sometimes surpass, what the best-funded labs are doing. 2022 saw the first community-built large multilingual language model, called BLOOM (BigScience Large Open-science Open-access Multilingual Language Model). We also saw an explosion of innovation around the open source Stable Diffusion text-to-image AI model, which rivaled OpenAI’s DALL-E.

Earlier this year, MIT Technology Review (bit.ly/3FSTkh6) reported that a group of more than 1,000 AI researchers was working on a multilingual large language model that is larger than GPT-3, and that this community of researchers planned to release its model for free. This was BLOOM, which is designed to be as transparent as possible, with researchers sharing details about the data it was trained on, the challenges in its development, and how they evaluated its performance. By contrast, OpenAI and Google have not shared their code or made their models publicly available, and outside parties have very little understanding of how these models are trained. While you can sign up to use any of these models, you can’t look under the hood. This means you cannot understand their built-in biases, and so cannot correct for them when you build your own system that uses generative AI to solve, say, a functional business problem.

The reason this new open source community matters is that Big Tech companies, which have historically been the biggest spenders on AI research, are now suddenly facing what promises to be a difficult 2023. Many are implementing layoffs and hiring freezes as the global economy appears headed for a recession. AI research is undoubtedly expensive. As Big Tech companies look to save money, they will have to be very careful when choosing which projects to invest in. It stands to reason that they will choose the ones that have the potential to make them the most money, rather than the most innovative or experimental ones.

Meta, which owns Facebook, has already made it clear that it intends to cut back. In a post on ai.facebook.com (bit.ly/3Gj39q3), the firm says it is reorganizing its AI research team, splitting it up and transferring its members to teams that actually create products. Meta and Facebook have already been hit hard this year, with falling ad revenue, so this move isn’t a surprise. In 2023, other Big Tech companies are likely to tighten their belts when it comes to AI research.

The good news for a venture capitalist like me is that some of this work will probably shift to startups. The old argument against, say, vernacular models for India, that Google Translate or a similar Big Tech online service will kill a startup in that space, may now be less valid. Ergo, startups may still have a chance.

Siddharth Pai is a co-founder of Siana Capital, a venture fund manager.
