One possible approach to develop the best possible general learning algorithm

Author: Martí Llopart

**Introduction**

In the world we live in, there are many problems that urgently need to be solved: climate change[1], the risk of a new pandemic[2], wealth inequality[3], the aging population[4] and more. All of these problems are hugely complex, and solving them requires vast amounts of intelligence. To address them, one tool comes in very handy: Artificial Intelligence. We are all familiar with AI but, put briefly, it is the intelligence demonstrated by machines[5]. In recent years, AI has proven to be a game changer for industries such as healthcare and finance, but most experts agree that we are just at the beginning of what has been termed "The AI revolution"[6].

The culmination of this AI revolution would be the creation of a Superintelligence, a hypothetical AI agent that possesses intelligence far surpassing that of the brightest and most gifted human minds. If this Superintelligence is aligned with our goals, a solution could be found to the most complex scientific, social and economic problems, leading to huge benefits for humanity.

But before creating Superintelligence, General Intelligence must be reached. Artificial General Intelligence (AGI) is the hypothetical intelligence of a machine that has the capacity to understand or learn any intellectual task that a human being can[7]. It is just one step below Superintelligence.

So, how do we get to AGI? In this essay, I’ll try to figure out one of the key steps:

**Step 1: Creating the best general ML algorithm**

Our brains are general problem solvers. Their objective is to use the information from the senses to accurately predict future events and to act according to these predictions in order to survive[8]. If we want to develop General Intelligence, we must first create the best general learning algorithm (GLA), an algorithm capable of performing well at learning any type of task.
The best GLA would be the one that performs perfectly at learning any task. However, there’s a big problem in developing such an algorithm: research in machine learning lacks a theoretical backbone. In the last decade, most of the advances in AI have been powered by neural networks, but the theory behind them remains poorly understood. Although the math behind each individual neuron is well understood, the mathematical theory of the emergent behaviour of the network is still largely unknown. The Universal Approximation Theorem gives us some guidance from mathematics, but much of the background theory remains unexplored[9]. Therefore, in the same way that we wouldn’t be able to efficiently design planes or bridges without the strong mathematical principles of fluid dynamics and solid mechanics, we can’t yet develop smart designs for machine learning algorithms, because we don’t have such strong guiding principles.

One solution to this problem could be to advance our theoretical knowledge of the machine learning field until we reach a much more solid basis. However, there seems to be a much better approach. A recent paper from Google AI[10] has shown that it is possible to evolve machine learning algorithms from scratch using only basic mathematical operations as building blocks. Given a series of tasks, the proposed approach starts from empty programs and applies an evolutionary algorithm to automatically find the code of complete ML algorithms that achieve the best performance. Given small image classification problems, the method rediscovered fundamental ML techniques, such as 2-layer neural networks with backpropagation, linear regression and the like, which have been invented by researchers over the years.
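To make the idea of evolving programs from basic operations more concrete, here is a minimal toy sketch in Python. It is not the AutoML-Zero search space or code: the three-operation instruction set, the four registers, the target function x² + x and all names (`run`, `loss`, `mutate`, `evolve`) are assumptions chosen purely for illustration. The search starts from empty programs and uses tournament selection with removal of the oldest individual, which is a simplified form of evolutionary search.

```python
import math
import random
from collections import deque

# Hypothetical toy instruction set: each op combines two registers into one.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
NUM_REGS = 4   # r0 = input x, r1 = prediction, r2 = constant 1.0, r3 = scratch
MAX_LEN = 6    # maximum number of instructions in a program
# Illustrative regression task (an assumption): learn y = x^2 + x.
DATA = [(x, x * x + x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]


def run(program, x):
    """Execute a list of (op, src_a, src_b, dst) instructions on input x."""
    regs = [x, 0.0, 1.0, 0.0]
    for op, a, b, dst in program:
        regs[dst] = OPS[op](regs[a], regs[b])
    return regs[1]


def loss(program):
    """Sum of squared errors over DATA; non-finite values count as infinite."""
    total = 0.0
    for x, y in DATA:
        diff = run(program, x) - y
        total += diff * diff
    return total if math.isfinite(total) else math.inf


def random_instruction(rng):
    return (rng.choice(sorted(OPS)), rng.randrange(NUM_REGS),
            rng.randrange(NUM_REGS), rng.randrange(NUM_REGS))


def mutate(program, rng):
    """Return a copy with one instruction removed, replaced or inserted."""
    child = list(program)
    choice = rng.random()
    if child and choice < 0.3:
        del child[rng.randrange(len(child))]
    elif child and choice < 0.6:
        child[rng.randrange(len(child))] = random_instruction(rng)
    elif len(child) < MAX_LEN:
        child.insert(rng.randrange(len(child) + 1), random_instruction(rng))
    return child


def evolve(generations=2000, population_size=50, tournament_size=10, seed=0):
    """Evolve programs starting from empty ones; return (best_program, best_loss)."""
    rng = random.Random(seed)
    population = deque([] for _ in range(population_size))  # all programs start empty
    best, best_loss = [], loss([])
    for _ in range(generations):
        contenders = rng.sample(list(population), tournament_size)
        child = mutate(min(contenders, key=loss), rng)
        population.append(child)
        population.popleft()  # the oldest individual is removed
        child_loss = loss(child)
        if child_loss < best_loss:
            best, best_loss = child, child_loss
    return best, best_loss


if __name__ == "__main__":
    program, error = evolve()
    print(error, program)
```

Calling `evolve()` returns the best program seen during the run together with its loss. A loss of zero would mean the search has found a program equivalent to x² + x, although a short run is not guaranteed to get there.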
The AutoML-Zero result demonstrates the plausibility of automatically discovering novel ML algorithms that address harder problems, such as discovering the best GLA.

Another problem in designing the best possible GLA is that, because there’s no strong mathematical basis on which to base the design, the only way to check whether a GLA is truly the best is to test its performance across all possible learning tasks. Obviously, this is impossible: there is an infinite number of learning tasks and a limited amount of time and computational resources. Hence, a solution to this problem could be to create a benchmark, a set of problems that is representative enough of most learning tasks.

The company DeepMind, a British subsidiary of Alphabet, is working on developing safe AGI. As part of their approach, they are also trying to develop better GLAs. One of their latest feats was MuZero, a program that is able to master games without previously knowing their rules. In its 2019 release, MuZero was able to outperform humans at Go, chess, shogi and a standard suite of Atari games. This set of games could be used as a starting benchmark for testing the performance of GLAs developed in the future[11].
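As a rough illustration of what "beating a benchmark" could mean in practice, the sketch below represents a benchmark as a list of tasks, each with a human reference score, and checks whether an agent exceeds that reference on every task. This is only one possible reading of the idea. The names `Task`, `superhuman_on` and `beats_benchmark`, the task identifiers and all scores are hypothetical placeholders, with scores assumed to be normalised so that 1.0 is the human reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    human_score: float  # placeholder reference value, not a measured result


def superhuman_on(tasks, agent_scores):
    """Names of the tasks on which the agent exceeds the human reference."""
    return [
        task.name
        for task in tasks
        if agent_scores.get(task.name, float("-inf")) > task.human_score
    ]


def beats_benchmark(tasks, agent_scores):
    """Assumed criterion: the agent must be superhuman on every task."""
    return len(superhuman_on(tasks, agent_scores)) == len(tasks)


# Hypothetical benchmark with scores normalised so that 1.0 = human reference.
BENCHMARK = [
    Task("chess", 1.0),
    Task("go", 1.0),
    Task("shogi", 1.0),
    Task("atari_pong", 1.0),
]

if __name__ == "__main__":
    made_up_scores = {"chess": 1.2, "go": 1.5, "shogi": 1.1, "atari_pong": 0.9}
    print(superhuman_on(BENCHMARK, made_up_scores))
    print(beats_benchmark(BENCHMARK, made_up_scores))
```

Requiring superhuman scores on every task is a deliberately strict choice. A looser criterion, such as a mean or median over tasks, would be equally easy to express and may be closer to how game suites are usually summarised.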

The following steps are what I propose to create the best general ML algorithm:

**1- To create a benchmark:** The first step consists in creating a set of learning tasks that is diverse and representative enough of all possible learning tasks. Currently, the benchmark used by the pioneering company DeepMind to test learning algorithms consists of the Atari57 suite of games as well as chess, Go and shogi. As previously mentioned, the MuZero program has beaten this benchmark at a superhuman level.

**2- To record the performance of the intelligent agent at beating the benchmark (attaining superhuman performance):** The second step of the process is to record the performance of the best artificially intelligent agent that has beaten the benchmark with superhuman performance. Currently, this program is the MuZero algorithm developed by DeepMind. Three parameters will be recorded:

2.1 - Computational resources: The hardware and software configurations of the supercomputer used to run the experiment.

2.2 - Compute time: Using the specified hardware and software configurations, how much time it took for the learning algorithm to beat the benchmark.

2.3 - Length of the algorithm: The size of the file encoding the learning algorithm in the programming language or languages used.

**3- To design a Machine Learning algorithm generator:** Machine learning algorithms are specific combinations of basic logical and mathematical units. As previously mentioned, in 2020 the Google AI team developed a novel methodology to automatically generate machine learning algorithms from scratch using only basic mathematical operations as building blocks, reaching sophisticated structures such as neural networks. The idea behind this step is to design a framework of basic logical and mathematical operations as a starting point for the automatic design of the general machine learning algorithm.
The Google AI team has already created such a framework in its AutoML-Zero paper, but it only contains some logical and mathematical operators. The idea behind this point is to design a similar but broader framework for the creation of a general learning algorithm. Such a framework should have as many basic logical and mathematical units as possible, so that all options are present. This framework would be a true tabula rasa for the development of general learning algorithms.

**4- To evolve a better algorithm:** With the framework of step 3, and using either an evolutionary algorithm or another method such as deep reinforcement learning, a program is searched for with the following properties:

4.1 - Using the same type of computational resources (software and hardware configurations), the algorithm reaches superhuman performance at the benchmark in equal or less time than the gold standard (at the time of writing, MuZero).

4.2 - The size of the file or files encoding the algorithm has to be equal to or smaller than that of the original algorithm. Why? Because the simpler the algorithm, the less information it has encoded, which makes it less likely to be specific to any particular task. It pushes the algorithm to contain only the information that is indispensable to learn anything.

If one of the two parameters remains equal to that of the last best algorithm, the other has to be strictly better. The shorter the time to achieve superhuman performance at the benchmark and the shorter the length of the algorithm, the better.

**5- To test the newly generated algorithm on unseen games:** The new algorithm is tested on unseen games and compared to the original algorithm (MuZero), to check whether it really is a better general learning algorithm or whether it just performs better at learning the benchmark. If it is, the process can continue.

**6- To check if the new algorithm discovers an equally good or better version of itself substantially sooner than the evolutionary algorithm did:** In the same way that knowledge of the laws of mechanics and dynamics facilitates the design of planes and bridges, understanding the laws that govern the design of learning algorithms would also speed up their creation. The MuZero algorithm is a deep-learning-based algorithm that uses reinforcement learning to understand and exploit the rules of different learning environments. For instance, given the game of chess, the algorithm will be able to understand its rules just by playing the game, and it will then use them to win against adversaries.
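Looking back at steps 4 and 5, the acceptance rule can be written down in a few lines. The sketch below is only one way to formalise it: the names `Record`, `improves_on` and `better_on_unseen` are hypothetical, and the reading of step 5 as "at least as good on every unseen game and strictly better on at least one" is an assumption, since the text does not fix a specific comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    compute_hours: float  # time to superhuman performance on fixed hardware
    size_bytes: int       # size of the file(s) encoding the algorithm


def improves_on(candidate, gold):
    """Step 4: no worse on both parameters and strictly better on at least one."""
    no_worse = (candidate.compute_hours <= gold.compute_hours
                and candidate.size_bytes <= gold.size_bytes)
    strictly_better = (candidate.compute_hours < gold.compute_hours
                       or candidate.size_bytes < gold.size_bytes)
    return no_worse and strictly_better


def better_on_unseen(candidate_scores, gold_scores):
    """Step 5 (assumed reading): compare per-game scores on held-out games."""
    games = gold_scores.keys()
    never_worse = all(candidate_scores[g] >= gold_scores[g] for g in games)
    sometimes_better = any(candidate_scores[g] > gold_scores[g] for g in games)
    return never_worse and sometimes_better


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    gold = Record(compute_hours=100.0, size_bytes=50_000)
    print(improves_on(Record(90.0, 50_000), gold))
    print(improves_on(Record(90.0, 60_000), gold))
```

Comparing scores game by game avoids averaging over games whose score scales are not comparable, at the cost of being a stricter test.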
If MuZero is able to understand and use the rules of previously unknown environments through experience, an improved version of MuZero such as the one developed in these steps might be able to quickly pick up the unknown rules that govern the development of learning algorithms and use them to develop the best possible learning algorithm in an intelligent way, just as it would learn to play chess. If this is the case and, over many repetitions, the improved algorithm discovers itself or a better version in less time than the evolutionary algorithm took, it would mean that the improved algorithm has picked up some rules that allow for a more efficient design of general learning algorithms. Such a discovery would be of great importance for the machine learning field and could also allow for an exponential speedup in the discovery of the best possible general machine learning algorithm.

**7-** From here on, the cycle can continue indefinitely, creating better and better general learning algorithms in a self-improving loop. If step 6 has been successful, the improved algorithm will take the place of the original search algorithm so that better GLAs can be discovered quickly. Moreover, the new algorithm will replace the original algorithm as the gold standard to improve upon. Therefore, when step 1 starts again, the design process has the goal of improving on the new algorithm instead of on MuZero.

**Note nº 1:** It might be the case that, when the automatic algorithms are designing the best GLA, they tailor the design to the provided computational resources. If so, different computational resources would yield different designs of the best possible GLA. Therefore, ideally the computer used to develop and test the algorithms should match the capabilities of the human brain as closely as possible and should not be changed. This way, the algorithm designed will be the best GLA that the brain’s power can handle.
It might also be the case that the automatically designed best GLA performs best with any amount of computational resources. We can only know this through experimentation.

**Note nº 2:** We don’t know yet if there’s an endpoint to the described process. In other words, we don’t know if there’s a GLA that can’t be improved upon. Maybe this process reaches a plateau in the progression curve. Again, we can only find this out via experimentation. However, it is important to note that if the AutoML-Zero experiment by Google AI quickly arrived at neural networks using only evolutionary algorithms, it is to be expected that an improved version could arrive at a better version of deep learning and the like. The promise of automatic discovery is vast and must be explored.

**References**

[1] O’Neill, B.C., Carter, T.R., Ebi, K., Harrison, P.A., Kemp-Benedict, E., Kok, K., Kriegler, E., Preston, B.L., Riahi, K., Sillmann, J. and van Ruijven, B.J., 2020. Achievements and needs for the climate change scenario framework. Nature Climate Change, 10(12), pp.1074-1084.

[2] Piret, J. and Boivin, G., 2020. Pandemics throughout history. Frontiers in Microbiology, 11.

[3] Zucman, G., 2019. Global wealth inequality. Annual Review of Economics, 11, pp.109-138.

[4] Li, J., Han, X., Zhang, X. and Wang, S., 2019. Spatiotemporal evolution of global population ageing from 1960 to 2017. BMC Public Health, 19(1), pp.1-15.

[5] McCarthy, J., 2007. What is artificial intelligence?

[6] Makridakis, S., 2017. The forthcoming Artificial Intelligence (AI) revolution: Its impact on society and firms. Futures, 90, pp.46-60.

[7] Goertzel, B., 2014. Artificial general intelligence: concept, state of the art, and future prospects. Journal of Artificial General Intelligence, 5(1), p.1.

[8] Raichle, M.E., 2010. Two views of brain function. Trends in Cognitive Sciences, 14(4), pp.180-190.

[9] Zhang, Z., Beck, M.W., Winkler, D.A., Huang, B., Sibanda, W. and Goyal, H., 2018. Opening the black box of neural networks: methods for interpreting neural network models in clinical applications. Annals of Translational Medicine, 6(11).

[10] Real, E., Liang, C., So, D. and Le, Q., 2020, November. AutoML-Zero: Evolving machine learning algorithms from scratch. In International Conference on Machine Learning (pp. 8007-8019). PMLR.