'AI winter' caused by pessimism about machine learning effectiveness.
1980s
Rediscovery of backpropagation causes a resurgence in machine learning research.
1990s
Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions – or "learn" – from the results.[2] Support-vector machines (SVMs) and recurrent neural networks (RNNs) become popular.[3] The fields of computational complexity via neural networks and super-Turing computation emerge.[4]
2010s
Deep learning becomes feasible, which leads to machine learning becoming integral to many widely used software services and applications. Deep learning spurs huge advances in vision and text processing.
2020s
Generative AI leads to revolutionary models, creating a proliferation of foundation models, both proprietary and open source, and notably enabling products such as ChatGPT (text-based) and Stable Diffusion (image-based). Machine learning and AI enter the wider public consciousness. The commercial potential of AI based on machine learning causes large increases in valuations of companies linked to AI.
Pierre-Simon Laplace publishes Théorie Analytique des Probabilités, in which he expands upon the work of Bayes and defines what is now known as Bayes' Theorem.[10]
1913
Discovery
Markov Chains
Andrey Markov first describes techniques he used to analyse a poem. The techniques later become known as Markov chains.[11]
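As a rough illustration of the kind of analysis described above, the following sketch estimates first-order transition probabilities between vowels and consonants in a text. It is not Markov's own procedure: the function names, the English vowel set, and the toy input are all assumptions made for this example.

```python
from collections import Counter, defaultdict

VOWELS = set("aeiou")  # assumption: English vowels only


def letter_classes(text):
    """Map each alphabetic character to 'V' (vowel) or 'C' (consonant)."""
    return ["V" if ch in VOWELS else "C" for ch in text.lower() if ch.isalpha()]


def transition_probabilities(states):
    """Estimate first-order transition probabilities from a state sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


sample = "banana"  # hypothetical toy input, not Markov's data
probs = transition_probabilities(letter_classes(sample))
```

In a Markov chain the probability of the next state depends only on the current state, which is why counting adjacent pairs is enough to estimate the model.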
Warren McCulloch and Walter Pitts develop a mathematical model that imitates the functioning of a biological neuron: the artificial neuron, which is considered to be the first neural model invented.[12]
1950
Turing's Learning Machine
Alan Turing proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows genetic algorithms.[13]
1951
First Neural Network Machine
Marvin Minsky and Dean Edmonds build the SNARC, the first neural network machine able to learn.[14]
1952
Machines Playing Checkers
Arthur Samuel joins IBM's Poughkeepsie Laboratory and begins working on some of the first machine learning programs, first creating programs that play checkers.[15]
The nearest neighbour algorithm is created, marking the start of basic pattern recognition. The algorithm is used to map routes.[2]
1969
Limitations of Neural Networks
Marvin Minsky and Seymour Papert publish their book Perceptrons, describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance to research into neural networks.[19]
1970
Automatic Differentiation (Backpropagation)
Seppo Linnainmaa publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.[20][21] This corresponds to the modern version of backpropagation, but is not yet named as such.[22][23][24][25]
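The idea of differentiating a network of nested differentiable functions can be sketched in a few lines. The following is a minimal scalar reverse-mode example, not Linnainmaa's original formulation; the `Var` class, the `sin` and `backward` helpers, and the test function are hypothetical names chosen for illustration.

```python
import math


class Var:
    """A scalar node that records how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    """Sine of a Var, with local derivative cos(x)."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node."""
    order, seen = [], set()

    def visit(node):
        # Depth-first traversal yields a topological order of the graph.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output back to the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


x, y = Var(2.0), Var(3.0)
z = x * y + sin(x)  # z = x*y + sin(x)
backward(z)
```

After the call, `x.grad` holds `y + cos(x)` and `y.grad` holds `x`, obtained in a single backward pass over the recorded graph. Training a neural network by backpropagation applies the same pass to a loss function of the weights.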
1976
Discovery
Transfer Learning
Stevo Bozinovski and Ante Fulgosi introduce the transfer learning method in neural network training.[26][27]
1979
Stanford Cart
Students at Stanford University develop a cart that can navigate and avoid obstacles in a room.[2]
Stevo Bozinovski demonstrates supervised learning in a neural network for the recognition of 40 linearly dependent patterns: 26 letters, 10 numbers, and 4 special symbols from a computer terminal.[31]
1981
Explanation Based Learning
Gerald DeJong introduces Explanation Based Learning, in which a computer algorithm analyses data, creates a general rule it can follow, and discards unimportant data.[2]
Stevo Bozinovski develops a self-learning paradigm in which an agent learns using internal state evaluations and does not use external reinforcements. Internal state evaluations are represented by emotions. He introduces the Crossbar Adaptive Array (CAA) architecture, which solves the delayed reinforcement learning challenge.[33][34][35]
Kurt Hornik proves that standard multilayer feedforward networks are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.
Torch, a software library for machine learning, is first released.[44]
2006
The Netflix Prize
The Netflix Prize competition is launched by Netflix. The aim of the competition is to use machine learning to beat, by at least 10%, the accuracy of Netflix's own recommendation software in predicting a user's rating for a film given their ratings for previous films.[45] The prize was won in 2009.
2009
Achievement
ImageNet
ImageNet is created. ImageNet is a large visual database envisioned by Fei-Fei Li from Stanford University, who realized that the best machine learning algorithms would not work well if the data did not reflect the real world.[46] For many, ImageNet was the catalyst for the AI boom[47] of the 21st century.
2010
Project
Kaggle Competition
Kaggle, a website that serves as a platform for machine learning competitions, is launched.[48]
The Google Brain team, led by Andrew Ng and Jeff Dean, creates a neural network that learns to recognize cats by watching unlabeled images taken from frames of YouTube videos.[50][51]
2012
Discovery
Visual Recognition
The AlexNet paper and algorithm achieve breakthrough results in image recognition on the ImageNet benchmark. This popularizes deep neural networks.[52]
2013
Discovery
Word Embeddings
A widely cited paper nicknamed word2vec shows how each word can be converted into a sequence of numbers (a word embedding). The use of these vectors revolutionizes text processing in machine learning.
2014
Achievement
Leap in Face Recognition
Facebook researchers publish their work on DeepFace, a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.[53]
2014
Sibyl
Researchers from Google detail their work on Sibyl,[54] a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.[55]
2016
Achievement
Beating Humans in Go
Google's AlphaGo program becomes the first Computer Go program to beat an unhandicapped professional human player[56] using a combination of machine learning and tree search techniques.[57] It is later improved as AlphaGo Zero and then, in 2017, generalized to chess and more two-player games with AlphaZero.
2017
Discovery
Transformer
A team at Google Brain invents the transformer architecture,[58] which allows for faster parallel training of neural networks on sequential data like text.
AlphaFold 2: a team using AlphaFold 2 (2020) repeats the earlier placement in the CASP competition in November 2020. The team achieves a level of accuracy much higher than any other group. It scores above 90 for around two-thirds of the proteins in CASP's global distance test (GDT), a test that measures the degree to which a structure predicted by a computational program is similar to the structure determined by lab experiment, with 100 being a complete match within the distance cutoff used for calculating GDT.[60]
Solomonoff, R.J. (June 1964). "A formal theory of inductive inference. Part II". Information and Control. 7 (2): 224–254. doi:10.1016/S0019-9958(64)90131-7.
O'Connor, J J; Robertson, E F. "Pierre-Simon Laplace". School of Mathematics and Statistics, University of St Andrews, Scotland. Retrieved 15 June 2016.
Langston, Nancy (2013). "Mining the Boreal North". American Scientist. 101 (2): 1. doi:10.1511/2013.101.1. Delving into the text of Alexander Pushkin's novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin's poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction.
McCulloch, Warren S.; Pitts, Walter (December 1943). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259.
Turing, A. M. (1 October 1950). "I.—COMPUTING MACHINERY AND INTELLIGENCE". Mind. LIX (236): 433–460. doi:10.1093/mind/LIX.236.433.
Bozinovski, Stevo; Fulgosi, Ante (1976). "The influence of pattern similarity and transfer learning upon training of a base perceptron" (original in Croatian). Proceedings of Symposium Informatica 3-121-5, Bled.
Bozinovski, Stevo (2020). "Reminder of the first paper on transfer learning in neural networks, 1976". Informatica. 44: 291–302.
Fukushima, Kunihiko (October 1979). "位置ずれに影響されないパターン認識機構の神経回路のモデル --- ネオコグニトロン ---" [Neural network model for a mechanism of pattern recognition unaffected by shift in position — Neocognitron —]. Trans. IECE (in Japanese). J62-A (10): 658–665.
Fukushima, Kunihiko (April 1980). "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". Biological Cybernetics. 36 (4): 193–202. doi:10.1007/BF00344251. PMID 7370364. S2CID 206775608.
Bozinovski, S. (1981). "Teaching space: A representation concept for adaptive pattern classification". COINS Technical Report No. 81-28, Computer and Information Science Department, University of Massachusetts at Amherst, MA. UM-CS-1981-028.pdf
Bozinovski, S. (1982). "A self-learning system using secondary reinforcement". In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North-Holland. pp. 397–402. ISBN 978-0-444-86488-8.
Bozinovski, S. (1999). "Crossbar Adaptive Array: The first connectionist network that solved the delayed reinforcement learning problem". In A. Dobnikar, N. Steele, D. Pearson, R. Albert (eds.). Artificial Neural Networks and Genetic Algorithms. Springer Verlag. pp. 320–325. ISBN 3-211-83364-1.
Bozinovski, S.; Bozinovska, L. (2001). "Self-learning agents: A connectionist theory of emotion based on crossbar value judgment". Cybernetics and Systems. 32 (6): 637–667.
Tesauro, Gerald (March 1995). "Temporal difference learning and TD-Gammon". Communications of the ACM. 38 (3): 58–68. doi:10.1145/203330.203343. S2CID 8763243.
Tin Kam Ho (1995). "Random decision forests". Proceedings of 3rd International Conference on Document Analysis and Recognition. Vol. 1. pp. 278–282. doi:10.1109/ICDAR.1995.598994. ISBN 0-8186-7128-9.
Canini, Kevin; Chandra, Tushar; Ie, Eugene; McFadden, Jim; Goldman, Ken; Gunter, Mike; Harmsen, Jeremiah; LeFevre, Kristen; Lepikhin, Dmitry; Llinares, Tomas Lloret; Mukherjee, Indraneel; Pereira, Fernando; Redstone, Josh; Shaked, Tal; Singer, Yoram. "Sibyl: A system for large scale supervised machine learning" (PDF). Jack Baskin School of Engineering. UC Santa Cruz. Archived from the original (PDF) on 15 August 2017. Retrieved 8 June 2016.