GPE, successor features, and related approaches
Improving generalization for temporal difference learning: The successor representation. Peter Dayan. Neural Computation, 1993.
Apprenticeship learning via inverse reinforcement learning. Pieter Abbeel and Andrew Y. Ng. Proceedings of the International Conference on Machine Learning (ICML), 2004.
Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White. Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011.
Multi-timescale nexting in a reinforcement learning robot. Joseph Modayil, Adam White, Richard S. Sutton. From Animals to Animats, 2012.
Universal value function approximators. Tom Schaul, Dan Horgan, Karol Gregor, David Silver. Proceedings of the International Conference on Machine Learning (ICML), 2015.
Deep successor reinforcement learning. Tejas D. Kulkarni, Ardavan Saeedi, Simanta Gautam, Samuel J. Gershman. arXiv, 2017.
Visual semantic planning using deep successor representations. Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi. IEEE International Conference on Computer Vision (ICCV), 2017.
Deep reinforcement learning with successor features for navigation across similar environments. Jingwei Zhang, Jost Tobias Springenberg, Joschka Boedecker, Wolfram Burgard. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
Universal successor representations for transfer reinforcement learning. Chen Ma, Junfeng Wen, Yoshua Bengio. arXiv, 2018.
Eigenoption discovery through the deep successor representation. Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell. International Conference on Learning Representations (ICLR), 2018.
Successor options: An option discovery framework for reinforcement learning. Rahul Ramesh, Manan Tomar, Balaraman Ravindran. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2019.
Successor uncertainties: Exploration and uncertainty in temporal difference learning. David Janz, Jiri Hron, Przemysław Mazur, Katja Hofmann, José Miguel Hernández-Lobato, Sebastian Tschiatschek. Advances in Neural Information Processing Systems (NeurIPS), 2019.
Successor features combine elements of model-free and model-based reinforcement learning. Lucas Lehnert, Michael L. Littman. arXiv, 2019.
Count-based exploration with the successor representation. Marlos C. Machado, Marc G. Bellemare, Michael Bowling. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020.
GPI, hierarchical RL, and related approaches
A robust layered control system for a mobile robot. R. Brooks. IEEE Journal on Robotics and Automation, 1986.
Feudal reinforcement learning. Peter Dayan and Geoffrey E. Hinton. Advances in Neural Information Processing Systems (NIPS), 1992.
Action selection methods using reinforcement learning. Mark Humphrys. Dissertation, University of Cambridge, Cambridge, UK, 1997.
Learning to solve multiple goals. Jonas Karlsson. Dissertation, University of Rochester, Rochester, New York, 1997.
Reinforcement learning with hierarchies of machines. Ronald Parr and Stuart J. Russell. Advances in Neural Information Processing Systems (NIPS), 1997.
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Richard S. Sutton, Doina Precup, Satinder Singh. Artificial Intelligence, 1999.
Hierarchical reinforcement learning with the MAXQ value function decomposition. Thomas G. Dietterich. Journal of Artificial Intelligence Research, 2000.
Multiple-goal reinforcement learning with modular Sarsa(0). Nathan Sprague and Dana Ballard. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2003.
Q-decomposition for reinforcement learning agents. Stuart J. Russell and Andrew Zimdars. Proceedings of the International Conference on Machine Learning (ICML), 2003.
Compositionality of optimal control laws. E. Todorov. Advances in Neural Information Processing Systems (NIPS), 2009.
Linear Bellman combination for control of character animation. M. da Silva, F. Durand, and J. Popović. ACM Transactions on Graphics, 2009.
Hierarchy through composition with multitask LMDPs. A. M. Saxe, A. C. Earle, and B. Rosman. Proceedings of the International Conference on Machine Learning (ICML), 2017.
Hybrid reward architecture for reinforcement learning. Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, and Jeffrey Tsang. Advances in Neural Information Processing Systems (NIPS), 2017.
FeUdal networks for hierarchical reinforcement learning. Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu. Proceedings of the International Conference on Machine Learning (ICML), 2017.
Composable deep reinforcement learning for robotic manipulation. T. Haarnoja, V. Pong, A. Zhou, M. Dalal, P. Abbeel, and S. Levine. IEEE International Conference on Robotics and Automation (ICRA), 2018.
Composing value functions in reinforcement learning. Benjamin van Niekerk, Steven James, Adam Earle, Benjamin Rosman. Proceedings of the International Conference on Machine Learning (ICML), 2019.
Planning in hierarchical reinforcement learning: Guarantees for using local policies. Tom Zahavy, Avinatan Hasidim, Haim Kaplan, Yishay Mansour. International Conference on Algorithmic Learning Theory (ALT), 2020.
GPE + GPI, transfer learning and related approaches
Transfer of learning by composing solutions of elemental sequential tasks. Satinder Singh. Machine Learning, 1992.
Transfer learning for reinforcement learning domains: A survey. Matthew E. Taylor and Peter Stone. Journal of Machine Learning Research, 2009.
Transfer in variable-reward hierarchical reinforcement learning. Neville Mehta, Sriraam Natarajan, Prasad Tadepalli, Alan Fern. Machine Learning, 2008.
Learning and transfer of modulated locomotor controllers. Nicolas Heess, Greg Wayne, Yuval Tassa, Timothy Lillicrap, Martin Riedmiller, David Silver. arXiv, 2016.
Learning to reinforcement learn. Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick. arXiv, 2016.
RL2: Fast reinforcement learning via slow reinforcement learning. Yan Duan, John Schulman, Xi Chen, Peter L. Bartlett, Ilya Sutskever, Pieter Abbeel. arXiv, 2016.
Model-agnostic meta-learning for fast adaptation of deep networks. Chelsea Finn, Pieter Abbeel, Sergey Levine. Proceedings of the International Conference on Machine Learning (ICML), 2017.
Successor features for transfer in reinforcement learning. André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado van Hasselt, David Silver. Advances in Neural Information Processing Systems (NIPS), 2017.
Transfer in deep reinforcement learning using successor features and generalised policy improvement. André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel Mankowitz, Augustin Žídek, Rémi Munos. Proceedings of the International Conference on Machine Learning (ICML), 2018.
Composing entropic policies using divergence correction. Jonathan Hunt, André Barreto, Timothy Lillicrap, Nicolas Heess. Proceedings of the International Conference on Machine Learning (ICML), 2019.
Universal successor features approximators. Diana Borsa, André Barreto, John Quan, Daniel Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul. International Conference on Learning Representations (ICLR), 2019.
The option keyboard: Combining skills in reinforcement learning. André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup. Advances in Neural Information Processing Systems (NeurIPS), 2019.
Transfer learning in deep reinforcement learning: A survey. Zhuangdi Zhu, Kaixiang Lin, Jiayu Zhou. arXiv, 2020.
Fast task inference with variational intrinsic successor features. Steven Hansen, Will Dabney, André Barreto, Tom Van de Wiele, David Warde-Farley, Volodymyr Mnih. International Conference on Learning Representations (ICLR), 2020.
Fast reinforcement learning with generalized policy updates. André Barreto, Shaobo Hou, Diana Borsa, David Silver, Doina Precup. Proceedings of the National Academy of Sciences, 2020.
Successor representation in neuroscience
The hippocampus as a predictive map. Kimberly Stachenfeld, Matthew Botvinick, Samuel Gershman. Nature Neuroscience, 2017.
The successor representation in human reinforcement learning. I. Momennejad, E. M. Russek, J. H. Cheong, M. M. Botvinick, N. D. Daw, S. J. Gershman. Nature Human Behaviour, 2017.
Predictive representations can link model-based reinforcement learning to model-free mechanisms. E. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, N. D. Daw. PLOS Computational Biology, 2017.
The successor representation: Its computational logic and neural substrates. Samuel J. Gershman. Journal of Neuroscience, 2018.
Better transfer learning with inferred successor maps. Tamas J. Madarasz, Timothy E. Behrens. Advances in Neural Information Processing Systems (NeurIPS), 2019.
Multi-task reinforcement learning in humans. Momchil S. Tomov, Eric Schulz, and Samuel J. Gershman. bioRxiv, 2019.
A neurally plausible model learns successor representations in partially observable environments. Eszter Vertes, Maneesh Sahani. Advances in Neural Information Processing Systems (NeurIPS), 2019.
Neurobiological successor features for spatial navigation. William de Cothi, Caswell Barry. Hippocampus, 2020.
Linear reinforcement learning: Flexible reuse of computation in planning, grid fields, and cognitive control. Payam Piray, Nathaniel D. Daw. bioRxiv, 2020.