Communication in Multi-Agent Reinforcement Learning: A Survey

Rimsha Khan; Nageen Khan; Tauqir Ahmad

doi:10.71330/thenucleus.2023.1303

Authors

R. Khan, Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan
N. Khan, Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan
T. Ahmad, Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan

Abstract

Agents can use communication to coordinate their actions and achieve their goals. In multi-agent reinforcement learning (MARL), agents can improve their overall learning performance by learning to communicate. They can transmit various types of messages to all agents or to particular groups over diverse communication channels. Research on MARL with communication (Comm-MARL) is expanding, yet there is currently no systematic approach for differentiating and categorizing existing Comm-MARL systems. This article surveys recent research in the Comm-MARL domain and examines the communication aspects that can be incorporated into MARL systems. It proposes nine dimensions along which Comm-MARL systems can be analyzed, developed, and compared: communication type, communication policy, communicated messages, message combination, inner integration, communication constraints, communication learning, training schemes, and controlled goals. By examining these dimensions, the study aims to shed light on the dynamics of agent interaction in complex environments. The review emphasizes the significance of effective communication strategies in achieving common objectives among agents and highlights the importance of factors such as context awareness, adaptability, and learning from past experiences. These insights offer guidance for improving collaboration and communication strategies across multi-agent systems and applications.
