Is DeepMind on its way to Artificial General Intelligence with ‘Gato’?




DeepMind’s Gato solves many tasks, but none of them very well. Does the new AI system nonetheless point the way to artificial general intelligence?

Shortly after OpenAI’s DALL-E 2, Google’s PaLM and LaMDA 2, and DeepMind’s own Chinchilla and Flamingo, the London-based AI company is showcasing another major AI model that outperforms existing systems.

But DeepMind’s Gato is different: the model is no better than other AI systems at writing text, describing images, playing Atari, controlling robotic arms, or orienting itself in 3D spaces. Instead, Gato can do a little of everything.


DeepMind’s Gato takes what it can get

DeepMind trained the Transformer-based multi-talent on images, text, proprioception (the sense of one’s own body position), joint torques, button presses, and other “discrete and continuous observations and actions”. In the training phase, all of this data is processed by a Transformer network as a single token sequence, similar to a large language model.

Gato is based on the Transformer architecture and is trained on many different data modalities. | Photo: DeepMind
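To make the idea of a single token sequence more concrete, here is a minimal Python sketch of how mixed inputs could be flattened into one list of integers. It is not DeepMind's implementation: the vocabulary size, the token offset, the companding constants, and the helper names (`tokenize_text`, `tokenize_continuous`, `build_sequence`) are all assumptions chosen for illustration, and the text tokenizer is a toy byte-level stand-in.

```python
import math

# All constants below are illustrative assumptions, not values from the article.
TEXT_VOCAB_SIZE = 256                # toy byte-level text vocabulary
NUM_BINS = 1024                      # number of bins for continuous values
CONTINUOUS_OFFSET = TEXT_VOCAB_SIZE  # continuous tokens are placed after text tokens
MU = 100.0                           # companding strength (assumed)
SCALE = 256.0                        # companding scale (assumed)


def tokenize_text(text: str) -> list[int]:
    """Toy text tokenizer: one token per UTF-8 byte (0..255)."""
    return list(text.encode("utf-8"))


def mu_law(x: float) -> float:
    """Compress a real value logarithmically while keeping its sign."""
    magnitude = math.log(abs(x) * MU + 1.0) / math.log(SCALE * MU + 1.0)
    return math.copysign(magnitude, x)


def tokenize_continuous(values: list[float]) -> list[int]:
    """Map floats (e.g. joint torques) to integer tokens via uniform binning."""
    tokens = []
    for v in values:
        squashed = max(-1.0, min(1.0, mu_law(v)))  # clip to [-1, 1]
        # Map [-1, 1] onto bin indices 0..NUM_BINS-1.
        bin_index = min(NUM_BINS - 1, int((squashed + 1.0) / 2.0 * NUM_BINS))
        tokens.append(CONTINUOUS_OFFSET + bin_index)
    return tokens


def build_sequence(instruction: str, observation: list[float], action: list[float]) -> list[int]:
    """Concatenate text, observation, and action tokens into one flat sequence."""
    return (
        tokenize_text(instruction)
        + tokenize_continuous(observation)
        + tokenize_continuous(action)
    )


sequence = build_sequence("stack", [0.0, 0.5, -0.5], [1.0])
print(len(sequence), sequence)
```

The point of the sketch is only that text and continuous values end up in one shared integer vocabulary, so a single sequence model can be trained on all of them at once. Images are left out here to keep the example short.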

The team then tested Gato on 604 different tasks. In more than 450 of them, the AI model achieves about 50 percent of the performance of an expert in the benchmark. That is still far behind specialized AI models, which can reach expert level.

Gato and scaling laws

At just 1.18 billion parameters, Gato is small compared with the 175-billion-parameter GPT-3, the giant 540-billion-parameter PaLM, or even the “small” 70-billion-parameter Chinchilla.
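The size gap is easy to quantify with a few lines of Python. The parameter counts below are the ones quoted in the article; the dictionary and the helper name `size_ratio` are only illustrative.

```python
# Parameter counts as quoted in the article, in billions.
PARAMS_BILLION = {
    "Gato": 1.18,
    "Chinchilla": 70.0,
    "GPT-3": 175.0,
    "PaLM": 540.0,
}


def size_ratio(model: str, baseline: str = "Gato") -> float:
    """Return how many times larger `model` is than `baseline`."""
    return PARAMS_BILLION[model] / PARAMS_BILLION[baseline]


for name in ("Chinchilla", "GPT-3", "PaLM"):
    print(f"{name} is about {size_ratio(name):.0f}x the size of Gato")
```

By this count, Chinchilla, GPT-3, and PaLM are roughly 59, 148, and 458 times larger than Gato.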

According to the team, this is mainly due to the response time required by the Sawyer robotic arm used in the experiments: a larger model would be too slow for the robot tasks on current hardware and with the current architecture.

The team says these limitations could easily be overcome with new hardware and engineering. A larger Gato model could be trained on more data and would potentially perform better on many tasks.

DeepMind’s Gato can solve quite a few tasks. Is Artificial General Intelligence just around the corner? | Photo: DeepMind

In the end, a generalist AI model could come to replace specialized models, as the history of AI research also suggests. The team refers here to AI researcher Richard Sutton, who called this the “bitter lesson” of AI research: “Historically, general models that make better use of computing power have prevailed over more specific and domain-specific methods.”

DeepMind also shows a clear increase in Gato’s performance as the number of parameters grows: in addition to the large model, the team trained two smaller ones with 79 million and 364 million parameters. Average performance on the tested benchmarks rises linearly with model size.

Gato’s performance scales with the number of parameters. | Photo: DeepMind
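A trend like this can be checked with a simple least-squares fit. The following Python sketch is only an illustration: the article gives the three model sizes but no benchmark scores, so none are included, and fitting against the logarithm of the parameter count is an assumption borrowed from how scaling curves are commonly plotted. The names `fit_log_linear` and `predict` are hypothetical.

```python
import math

# Parameter counts of the three Gato variants named in the article.
GATO_PARAMS = [79e6, 364e6, 1.18e9]


def fit_log_linear(params, scores):
    """Least-squares fit of score = slope * log10(params) + intercept."""
    xs = [math.log10(p) for p in params]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(param_count, slope, intercept):
    """Extrapolate the fitted line to another model size."""
    return slope * math.log10(param_count) + intercept
```

Given three measured average scores, `fit_log_linear(GATO_PARAMS, scores)` returns the slope per tenfold increase in parameters, and `predict` extrapolates the line to a larger model. With only three data points, any such extrapolation should be read with caution.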

This phenomenon is already known from large language models and was examined in detail in the paper “Scaling Laws for Neural Language Models” in early 2020.

With its Chinchilla paper, DeepMind recently added to these scaling laws the importance of larger amounts of training data. More data leads to better performance.


Game over or the next level of machine intelligence?

Will scaling a system like Gato one day enable AGI? Not everyone shares the hope placed in the principle of scaling: in a new post on Substack, cognitive scientist and AI researcher Gary Marcus describes Gato as an example of a failed “scaling everything” approach. All current large AI models such as GPT-3, PaLM, Flamingo, or Gato, he argues, combine moments of brilliance with absolute incomprehension.

While humans are also prone to making mistakes, any honest person would realize that these kinds of mistakes show that something is very wrong right now. “If any of my kids made mistakes like this on a regular basis, without exaggeration, I would just quit everything I’m doing and take them to the neurologist straight away,” Marcus said.

Marcus refers to this line of research as “Alt Intelligence”: “Alt Intelligence is not about building machines that solve problems in a way that has anything to do with human intelligence. It is about using massive amounts of data, often derived from human behavior, as a substitute for intelligence.”

This approach is not new in itself, Marcus says. What is new is the hubris of believing that artificial general intelligence can be achieved simply by scaling this method.

Marcus also responds to a Twitter post by Nando de Freitas, Director of Research at DeepMind: “It’s all about scale now! The game is over,” de Freitas wrote in the context of Gato. What remains, according to de Freitas, is to make the models more compute-efficient, faster at sampling, and smarter in their use of memory, with more modalities, innovative data, and on/offline training.

De Freitas sees DeepMind on its way to artificial general intelligence, provided the scaling challenges he describes are solved. Sutton’s lesson is not a bitter one but a sweet one, according to the DeepMind researcher. Marcus writes that de Freitas is expressing here what many in the industry think.

But there are also more cautious voices at DeepMind: “Maybe scaling will suffice. Maybe,” writes principal scientist Murray Shanahan. However, he sees very little in Gato to suggest that scaling alone will lead to human-level generalization. DeepMind, he adds, is working in several directions.

