Real-Time Drone Communication System Using ROS 2 and GStreamer with YOLOv8-Seg for Face Segmentation

Husam Salah Mahdi; K. Raja Kumar; K. John David Christopher

doi:10.22399/ijcesen.2123

Authors

Husam Salah Mahdi Department of CS & SE, Andhra University College of Engineering, Andhra University, India.
K. Raja Kumar
K. John David Christopher

Keywords:

ROS 2, Face Segmentation, UAVs, GStreamer, Telemetry

Abstract

This paper presents the design and implementation of a ROS 2-based UAV system for real-time video streaming and intelligent
ground station processing. The proposed architecture integrates a Raspberry Pi 3 onboard computer with a Jetson Orin Nano
ground station over a wireless network. Video is captured and encoded using GStreamer on the UAV, streamed over UDP, and
decoded on the ground station for real-time object detection using the YOLOv8-seg model. ROS 2 middleware facilitates
synchronized telemetry and camera communication between the UAV and ground station via DDS topics. The system
demonstrates low-latency video transmission (∼105 ms), high streaming frame rate (30 FPS), and real-time object detection at
28–30 FPS with an average precision of 81.2%. The modularity of ROS 2 enables easy integration of additional perception,
control, and autonomous decision-making modules. Experimental results validate the system’s performance for surveillance and
inspection tasks, showcasing the potential of open-source middleware and embedded AI for edge-enhanced UAV applications.
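
The streaming path described in the abstract (capture and encode on the UAV, UDP transport, decode on the ground station) can be sketched as a pair of GStreamer pipeline descriptions. This is a minimal illustration under stated assumptions, not the authors' configuration: the abstract does not name the codec, payloader, camera device, port, or bitrate, so H.264 over RTP, software encoding with x264enc, /dev/video0, port 5000, and 1500 kbit/s are all placeholders chosen here. The helper names are hypothetical.

```python
# Sketch only: codec (H.264), RTP payloading, x264enc, device path, port and
# bitrate are assumptions for illustration; the paper's abstract specifies
# only "GStreamer", "UDP", and the 30 FPS / ~105 ms figures.


def sender_pipeline(host: str, port: int = 5000, width: int = 640,
                    height: int = 480, fps: int = 30,
                    bitrate_kbps: int = 1500) -> str:
    """Build a gst-launch-style description for the UAV (onboard) side."""
    return " ! ".join([
        "v4l2src device=/dev/video0",                      # assumed camera device
        f"video/x-raw,width={width},height={height},framerate={fps}/1",
        "videoconvert",
        # zerolatency tuning disables encoder look-ahead to keep delay low
        f"x264enc tune=zerolatency speed-preset=ultrafast bitrate={bitrate_kbps}",
        "rtph264pay config-interval=1 pt=96",              # resend SPS/PPS every second
        f"udpsink host={host} port={port} sync=false",
    ])


def receiver_pipeline(port: int = 5000) -> str:
    """Build a gst-launch-style description for the ground station side."""
    caps = "application/x-rtp,media=video,encoding-name=H264,payload=96"
    return " ! ".join([
        f'udpsrc port={port} caps="{caps}"',
        "rtph264depay",
        "avdec_h264",
        "videoconvert",
        # keep only the newest frame so the detector never works on stale data
        "appsink drop=true max-buffers=1 sync=false",
    ])


def latency_in_frames(latency_ms: float, fps: float) -> float:
    """Express an end-to-end latency as a number of frame intervals."""
    return latency_ms / (1000.0 / fps)


if __name__ == "__main__":
    print(sender_pipeline("192.168.1.10"))   # hypothetical ground station address
    print(receiver_pipeline())
    print(latency_in_frames(105, 30))
```

With the figures reported in the abstract, a latency of about 105 ms at 30 FPS corresponds to roughly three frame intervals. In this sketch the appsink is set to drop older buffers, a common choice when a downstream detector should always see the most recent frame rather than a backlog.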
