Apr 20 2023

AI: What is the new direction for researchers?

Anna Nguyen (Translated)


In everything, one must take a risk.

Since the end of 2012, a small group in the AI community has believed that neural networks could become a "general-purpose" technology. By the end of 2022 and the beginning of 2023, neural networks such as GPT-4 and similar models could be considered to have reached that level. Some even believe that machines have passed the Turing test, in which people are fooled into thinking they are chatting with a real person.

This bet has paid off.

Of course, not everyone believed in it. In the computer vision community in 2012-2014 and in natural language processing in 2014-2018, many refused to bet, and some were forced out of the game.


Near the end of last year, a DeepMind lead posted a tweet teasing the AI community, saying "stop what you're doing and focus on making big models. Big is enough." A big model, like GPT-4 or PaLM-E, can handle countless tasks that previously had to be done by separate systems, such as text comprehension, image processing, playing games, giving commands to robots, and so on.

Models like GPT-4 or PaLM-E are truly huge. There is no official number, but GPT-4 is rumored to have over a trillion parameters. For comparison, the Standard Model of particle physics has only 19 parameters, enough to describe the world of elementary particles, the foundation of the physical world we live in. In daily life, we often use one-parameter models, like "that person has a high forehead, so they must be intelligent." Numerology, fortune-telling, and palm reading rarely use models with more than 5 parameters, just as our hands have only 5 fingers. An elder once proposed a model of a happy marriage with only 2 parameters: the number of times each spouse interrupts or argues with the other.

More powerful and scientific than these are the physiological models used in medicine, which rarely have more than 10 parameters. Simply put, no one can understand a model with 100 parameters. Even BMI uses only height and weight; add waist circumference and you still have just 3 parameters, and with them you can estimate the probability of obesity even when you are 50 years old and diabetic, according to some WHO model.
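To make the point about tiny models concrete, here is a minimal sketch of BMI as it is commonly defined: weight in kilograms divided by the square of height in meters. The function name and the sample values are made up for illustration, and waist circumference is a separate measurement that does not enter this formula.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical person, values chosen only for illustration.
value = bmi(70.0, 1.75)
print(round(value, 1))
```

Two inputs and one division: about as far from a trillion parameters as a model can get.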

When a language model becomes large enough, it suddenly acquires some very nice properties that small models lack, such as "in-context learning" (an "emergent" ability). This means the model can learn on the spot from a few examples placed in its input, without needing to be fine-tuned (which is very expensive) on a new dataset. Some estimate that one such example is worth about 100 training samples.
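As a rough illustration of what "a few examples" means in practice, the sketch below assembles a few-shot prompt as plain text. The `Input:`/`Output:` layout, the function name, and the sentiment examples are assumptions made for illustration, not a format prescribed by any particular model. No model is called here; the code only builds the text a model would be asked to complete.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format labeled examples plus a new query as one text prompt."""
    blocks = [f"Input: {text}\nOutput: {label}" for text, label in examples]
    # The final block leaves the output blank for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Hypothetical sentiment examples, invented for illustration only.
examples = [
    ("The food was wonderful.", "positive"),
    ("I waited an hour and left hungry.", "negative"),
]
prompt = build_few_shot_prompt(examples, "The staff were friendly.")
print(prompt)
```

The model's weights never change. The "learning" happens entirely through the examples placed in front of the query.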

Having many examples is great, but even when there are none, a large language model behaves like a computer that seems to understand the intent of the person making the request.


In the 1940s and 1950s, when neural network models had only one layer, people predicted that this would be the electronic brain, that one day it would replace the biological brain and be able to do countless tasks. Few thought that day would come within a human lifetime. Sixty years later, no one believed in the electronic brain any more. Even a few years ago, no one thought that day would arrive within a few decades.

At the beginning of 2023, the birth of GPT-4 caused a rare earthquake. Almost immediately, people called for a pause on developing technology more powerful than GPT-4. The pause was to last 6 months, to give the world time to think about what to do next and how to minimize the damage. Of course, no one listened, especially those lagging behind. Rumor has it that Musk, the owner of Twitter, is plotting to set up a company to compete directly with OpenAI, which he once publicly helped fund. Where would he find people when all the talent had already been scooped up by the big players? By poaching from other companies, of course!

At the end of 2017, I predicted that problems built on internet data, like image processing and natural language, would become a game for the big players. The academic community had no chance. At that time, I didn't know about the Transformer (presented in December 2017 at NeurIPS - then still called NIPS), and BERT - the spark that signaled the GPT fire - had not yet been born. Of course, no one believed it, except for the start-ups, because they knew how to fight giants with their bare hands.

Then a few days ago, two colleagues wrote a fairly serious essay with an attention-grabbing title, something like: "depressed" by big AI, what now? Give up? It's not a bad option. Of course, they suggested, be like us: keep working relentlessly on something people don't yet care about but that can't be ignored.

The important thing is to run to the right position.

In recent weeks, the weekly newsletter I read has opened with "in the past 10 days, more than 2,000 AI articles have been published." This is clearly a kind of distributed denial-of-service (DDoS) attack on the research community's brains.

In soccer, the ball is passed very quickly and no one can run after it forever. The most important thing is to run to the right position so the ball can come to you. The same goes for AI research.

A colleague recently asked me how to keep up with the latest research trends in the field. I said it's impossible. I focus on what I'm interested in and concentrate on that, just like focusing on running to the right position.

Five years ago, the Conference on Robot Learning (CoRL) appeared. When people used to talk about AI, they imagined a human-like robot with emotions, or even something like the Terminator. Today's AI, such as ChatGPT, has nothing to do with robots. In fact, the robotics community has to solve very difficult hardware problems in electronics and mechanics, and has little time left for the problems of machine learning from data.

One of the CoRL founders said at the time that computer vision problems had become too easy to solve with AI. The problem worth solving is robotics, because it is very difficult. Even if you have integrated motor skills, planning ability, vision, and learning ability, the amount of trial-and-error data needed for learning is too large to collect in the real world. The effective solution is to train the model in simulation first and then fine-tune it in the real world (the Sim2Real method).
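The idea of training in simulation and then fine-tuning on scarce real data can be sketched with a deliberately tiny toy, assuming a one-parameter model y = w * x in place of a robot policy. Everything here is hypothetical: the "simulator" uses a slope of 2.0, the "real world" a slope of 2.3, and the function name and step counts are chosen only for illustration. Real Sim2Real pipelines involve physics simulators and far richer models.

```python
def fit_slope(xs, ys, w=0.0, lr=0.01, steps=200):
    """Gradient descent on mean squared error for the model y = w * x."""
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


# Cheap, plentiful "simulated" data (assumed slope 2.0, slightly wrong).
sim_xs = [i / 10 for i in range(1, 101)]
sim_ys = [2.0 * x for x in sim_xs]

# Expensive, scarce "real" data (assumed true slope 2.3).
real_xs = [1.0, 2.0, 3.0, 4.0, 5.0]
real_ys = [2.3 * x for x in real_xs]

w_sim = fit_slope(sim_xs, sim_ys)                         # pretrain in simulation
w_real = fit_slope(real_xs, real_ys, w=w_sim, steps=20)   # short real-world fine-tune
w_scratch = fit_slope(real_xs, real_ys, w=0.0, steps=20)  # same budget, no pretraining

print(w_sim, w_real, w_scratch)
```

With the same small budget of real-world steps, the run that starts from the simulated estimate ends closer to the real slope than the run that starts from zero, which is the whole appeal of the approach.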

This researcher has a very impressive strategy of running to the right position. He moved from speech processing in the early 2000s to robotics in the late 2010s.


The original article is from the Facebook page of Professor Truyen Tran of Deakin University, who has rich research and development experience in AI.


