One of the most tantalising thoughts in machine learning, and AI in general, is what’s known as the one algorithm theory. The theory states that the human brain, despite consisting of specialised areas, learns different things using a single algorithm. In other words, the mechanism our brain uses to learn how to ride a bicycle is the same as the one we use to learn how to solve a math problem, understand a foreign language, or recognise a cat in a photo.
Many of the top researchers in AI subscribe to the one algorithm theory, and rightly so — there’s compelling evidence to suggest that our brains indeed use a single set of instructions to learn things. A Master Algorithm. If we believe the master algorithm exists, the next steps are natural: we need to figure out how this mythical algorithm works, and approximate it as well as we can using a computer. If we can do that, we’ll have a learning algorithm that can, theoretically, learn to do anything a human being can do. For me, that is one of the most powerful thoughts ever conceived.
And the fun doesn’t stop there: given the rise in computing power, the master algorithm implemented in code could run much faster than our brains do, allowing us to build systems that can not only learn to do anything a human can do but can do so in a fraction of the time. That’s an exciting prospect, to say the least.
Currently, the learning algorithms we have are custom-built to solve a particular type of problem. For an image recognition task, you use a convolutional neural network. For natural language processing, you use a recurrent one. And even though the base architectures might be well understood, you still have to fix a number of hyperparameters, such as the number of neurons and layers, the regularisation technique and more. For most real-world problems, using machine learning is akin to having a large pile of linear algebra and twiddling the knobs until things work. You may have an intuition of how to twist said knobs, but that doesn’t stop you from having to twist them in the first place.
In order to get a couple of steps closer to uncovering the master algorithm, we should have a rethink. We need a single architecture that can be used regardless of the task we are trying to solve. We also need to be able to transfer what we’ve learned across domains — a human can take what they’ve learned studying a foreign language and apply that to programming, for example. These aren’t small steps, mind you. On the contrary, they are two major pieces of a puzzle whose solution is poised to change the world as we know it. They are by no means the only pieces of the puzzle, but they are undoubtedly the most important ones.
How long will it take for the community to discover all the pieces of the puzzle required to make the master algorithm? Depending on who you ask, ten to thirty years. Most researchers agree that we’re not close. Not even remotely.
That is, most researchers agreed. Since last Friday, I suspect it’s time for us all to revise our estimates. Drastically.
That’s because last Friday, Google quietly published a research paper entitled One Model To Learn Them All. It describes a rather interesting neural network: one that can not only learn to solve many different problems concurrently using a domain-agnostic core, but can also apply what it has learned in one domain to another, without any major side effects. I won’t go into technical specifics, but in my estimation, Google has taken its first real shot at the master algorithm. It’s version 0.1, but that doesn’t matter. What matters is that it has happened far sooner than anyone imagined.
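For readers who want a concrete picture of what a "domain-agnostic core" could look like, here is a minimal toy sketch. It is not the architecture from the paper: the class name, layer sizes and task names are all made up for illustration, the weights are random, and there is no training loop. It only shows the general shape of the idea, where each task gets its own small input and output adapter while every task passes through the same shared weights in the middle.

```python
import random

# Toy illustration only: HIDDEN, MultiTaskModel and the task names below are
# hypothetical and are not taken from the paper.
HIDDEN = 4  # size of the shared representation


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


class MultiTaskModel:
    """Per-task adapters wrapped around one shared core."""

    def __init__(self, task_shapes, seed=0):
        rng = random.Random(seed)
        # A single set of core weights, reused by every task.
        self.core = make_matrix(HIDDEN, HIDDEN, rng)
        # Task-specific adapters: inputs -> shared space -> outputs.
        self.encoders = {
            name: make_matrix(HIDDEN, n_in, rng)
            for name, (n_in, _) in task_shapes.items()
        }
        self.decoders = {
            name: make_matrix(n_out, HIDDEN, rng)
            for name, (_, n_out) in task_shapes.items()
        }

    def forward(self, task, inputs):
        shared = relu(matvec(self.encoders[task], inputs))  # task-specific
        shared = relu(matvec(self.core, shared))            # shared by all tasks
        return matvec(self.decoders[task], shared)          # task-specific


# Two made-up tasks with different input and output sizes.
model = MultiTaskModel({"image": (6, 3), "text": (5, 2)})
image_out = model.forward("image", [0.1] * 6)
text_out = model.forward("text", [0.2] * 5)
```

Because both calls go through `model.core`, anything a training procedure changed in those weights for one task would also affect the other. That shared middle is the part where knowledge could, in principle, carry over between domains.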
We are getting closer to uncovering the true master algorithm, and doing so at a rate that I don’t think anyone could have envisioned. And the reward couldn’t be any greater: fast, humanlike intelligence, democratised and ready to be used by anyone, anywhere.
If that notion doesn’t make a computer scientist’s heart race, I don’t know what will.
Special thanks to Mikko Leppänen for proofreading.