Content for TR 22.874 Word version: 18.2.0


0 Introduction

This document covers use cases and potential requirements for 5G system support of Artificial Intelligence (AI)/Machine Learning (ML) model distribution and transfer (download, upload, updates, etc.). The TR on AI/ML services covers three aspects: AI/ML operation splitting between AI/ML endpoints, AI/ML model/data distribution and sharing over 5G system, and distributed/Federated Learning over 5G system.

1 Scope

This report captures the study of the use cases and the potential performance requirements for 5G system support of Artificial Intelligence (AI)/Machine Learning (ML) model distribution and transfer (download, upload, updates, etc.), and identifies traffic characteristics of AI/ML model distribution, transfer and training for various applications, e.g. video/speech recognition, robot control, automotive, other verticals.

The aspects addressed include:

AI/ML operation splitting between AI/ML endpoints;
AI/ML model/data distribution and sharing over 5G system;
Distributed/Federated Learning over 5G system.
Study of the AI/ML models themselves is not in the scope of the TR.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
For a specific reference, subsequent revisions do not apply.
For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1]

TR 21.905: "Vocabulary for 3GPP Specifications".

[2]

TR 22.891: Feasibility Study on New Services and Markets Technology Enablers

[3]

TR 22.863: Feasibility study on new services and markets technology enablers for enhanced mobile broadband

[4]

TS 22.261: Service requirements for the 5G system

[5]

TS 22.104: Service requirements for cyber-physical control applications in vertical domains

[6]

TS 23.273: 5G System (5GS) Location Services (LCS); Stage 2

[7]

A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks", in Proc. NIPS, 2012, pp. 1097-1105.

[8]

K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556. [Online]. Available: https://arxiv.org/abs/1409.1556

[9]

C. Szegedy, et al., "Going deeper with convolutions", in Proc. CVPR, 2015, pp. 1-9.

[10]

Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, Junshan Zhang, "Edge intelligence: Paving the last mile of artificial intelligence with edge computing", Proceedings of the IEEE, 2019, Volume 107, Issue 8.

[11]

Jiasi Chen, Xukan Ran, "Deep learning with edge computing: A review", Proceedings of the IEEE, 2019, Volume 107, Issue 8.

[12]

I. Stoica et al., "A Berkeley view of systems challenges for AI", 2017, arXiv:1712.05855. [Online]. Available: https://arxiv.org/abs/1712.05855

[13]

Y. Kang et al., "Neurosurgeon: Collaborative intelligence between the cloud and mobile edge", ACM SIGPLAN Notices, vol. 52, no. 4, pp. 615-629, 2017.

[14]

E. Li, Z. Zhou, and X. Chen, "Edge intelligence: On-demand deep learning model co-inference with device-edge synergy", in Proc. Workshop Mobile Edge Commun. (MECOMM), 2018, pp. 31-36.

[15]

TR 38.913: Study on Scenarios and Requirements for Next Generation Access Technologies (Release 15)

[16]

B. Kehoe, S. Patil, P. Abbeel, and K. Goldberg, "A survey of research on cloud robotics and automation," IEEE Transactions on automation science and engineering, vol. 12, no. 2, pp. 398-409, 2015.

[17]

Huaijiang Zhu, Manali Sharma, Kai Pfeiffer, Marco Mezzavilla, Jia Shen, Sundeep Rangan, and Ludovic Righetti, "Enabling Remote Whole-body Control with 5G Edge Computing", to appear, in Proc. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Available at: https://arxiv.org/pdf/2008.08243.pdf

[18]

K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE CVPR, Jun. 2016, pp. 770-778.

[19]

A. G. Howard et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," 2017, arXiv:1704.04861. [Online]. Available: https://arxiv.org/abs/1704.04861

[20]

B. Taylor, V. S. Marco, W. Wolff, Y. Elkhatib, and Z. Wang, "Adaptive deep learning model selection on embedded systems," in Proc. ACM LCTES, 2018, pp. 31-43.

[21]

G. Shu, W. Liu, X. Zheng, and J. Li, "IF-CNN: Image-aware inference framework for CNN with the collaboration of mobile devices and cloud", IEEE Access, vol. 6, pp. 621-633, 2018.

[22]

D. Stamoulis et al., "Designing adaptive neural networks for energy-constrained image classification", in Proc. ACM ICCAD, 2018, Art. no. 23.

[23]

Sergey Ioffe and Christian Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", in Proc. ICML, 2015.

[24]

C.-J. Wu et al., "Machine learning at Facebook: Understanding inference at the edge," in Proc. IEEE Int. Symp. High Perform. Comput. Archit. (HPCA), Feb. 2019, pp. 331-344.

[25]

Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel S. Emer, "Efficient processing of deep neural networks: A tutorial and survey", Proceedings of the IEEE, 2017, Volume 105, Issue 12.

[26]

Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, May 2015.

[27]

"An All-Neural On-Device Speech Recognizer", March 12, 2019, Posted by Johan Schalkwyk, https://ai.googleblog.com/2019/03/an-all-neural-on-device-speech.html

[28]

Yanzhang He et al., "Streaming End-to-end Speech Recognition for Mobile Devices", 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019).

[29]

TS 22.243: "Speech recognition framework for automated voice services; Stage 1".

[30]

H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Y. Arcas, "Communication-efficient learning of deep networks from decentralized data", Proc. of the International Conference on Artificial Intelligence and Statistics, Apr. 2017. [Online]. Available: https://arxiv.org/abs/1602.05629

[31]

"Federated Learning", https://justmachinelearning.com/2019/03/10/federated-learning/

[32]

T. Nishio and R. Yonetani, "Client selection for federated learning with heterogeneous resources in mobile edge", 2018, arXiv:1804.08333. [Online]. Available: https://arxiv.org/abs/1804.08333

[33]

E. Park et al., "Big/little deep neural network for ultra low power inference", in Proc. 10th Int. Conf. Hardw./Softw. Codesign Syst. Synth., 2015, pp. 124-132.

[34]

Nguyen H. Tran, Wei Bao, Albert Zomaya, Minh N. H. Nguyen, Choong Seon Hong, "Federated Learning over Wireless Networks: Optimization Model Design and Analysis", in Proc. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications.

[35]

Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A Horowitz, and William J Dally. "EIE: efficient inference engine on compressed deep neural network", In 43rd International Symposium on Computer Architecture, IEEE Press, 243-254.

[36]

V. Sze, "Efficient Computing for Deep Learning, AI and Robotics," Dept EECS, MIT, Available online at https://lexfridman.com/files/slides/2020_01_15_vivienne_sze_efficient_computing.pdf

[37]

V. Sze, Y. Chen, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey" Proc. of IEEE, 2017, Available online at: https://www.semanticscholar.org/paper/Efficient-Processing-of-Deep-Neural-Networks%3A-A-and-Sze-Chen/3f116042f50a499ab794bcc1255915bee507413c

[38]

Stanford University, CS231n - Lecture 5-7: CNN, Training NNs, Available at YouTube.com

[39]

S. Han, J. Pool, J. Tran, and W. J. Dally, "Learning both weights and connections for efficient neural networks", NIPS, May 2015.

[40]

P. A. Merolla, et al., "A million spiking-neuron integrated circuit with a scalable communication network and interface", Science, vol. 345, no. 6197, pp. 668-673, Aug. 2014.

[41]

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, "Natural language processing (almost) from scratch," J. Mach. Learn. Res., vol. 12 pp. 2493-2537, Aug. 2011.

[42]

T. N. Sainath, A.-R. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR", in Proc. ICASSP, 2013, pp. 8614-8618.

[43]

L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey", J. Artif. Intell. Res., vol. 4, no. 1, pp. 237-285, Jan. 1996.

[44]

3 AI Trends for Enterprise Computing. [Online]. Available: https://www.gartner.com/smarterwithgartner/3-ai-trends-for-enterprise-computing/

[45]

Shiming Ge, Zhao Luo, Shengwei Zhao, Xin Jin, Xiao-Yu Zhang, "Compressing deep neural networks for efficient visual inference", in Proc. 2017 IEEE International Conference on Multimedia and Expo (ICME).

[46]

TS 22.186: Enhancement of 3GPP support for V2X scenarios; Stage 1 (Release 16) v16.2.0

[47]

"Develop Smaller Speech Recognition Models with NVIDIA's NeMo Framework", https://developer.nvidia.com/blog/develop-smaller-speech-recognition-models-with-nvidias-nemo-framework/

[48]

TS 23.501: System architecture for the 5G System (5GS)

[49]

TS 23.502: Procedures for the 5G System (5GS)

[50]

S. Ren, K. He, R. Girshick, J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"

[51]

J. Redmon, A. Farhadi, "YOLOv3: An Incremental Improvement"

[52]

W. Sun, Z. Chen, "Learned Image Downscaling for Upscaling using Content Adaptive Resampler"

[53]

C. Ledig et al., "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network"

[54]

A. G. Howard et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," 2017, arXiv:1704.04861. [Online]. Available: https://arxiv.org/abs/1704.04861

[55]

SSD-ResNet34 - https://github.com/IntelAI/models/tree/master/benchmarks/object_detection/tensorflow/ssd-resnet34

[56]

MLCommons Mobile Inference Benchmark v0.7 - https://mlcommons.org/en/inference-mobile-07/

[57]

Mask R-CNN - https://arxiv.org/abs/1703.06870

[58]

https://ai.facebook.com/blog/dlrm-an-advanced-open-source-deep-learning-recommendation-model/

[59]

TR 28.809: Study on enhancement of management data analytics

[60]

Mingzhe Chen, "A Joint Learning and Communications Framework for Federated Learning over Wireless Networks", Oct 2020

3 Definitions, symbols and abbreviations

3.1 Definitions

For the purposes of the present document, the terms and definitions given in TR 21.905 and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905.

communication service availability:

percentage value of the amount of time the end-to-end communication service is delivered according to an agreed QoS, divided by the amount of time the system is expected to deliver the end-to-end service according to the specification in a specific area.

end-to-end latency:

the time that it takes to transfer a given piece of information from a source to a destination, measured at the communication interface, from the moment it is transmitted by the source to the moment it is successfully received at the destination.

reliability:

in the context of network layer packet transmissions, percentage value of the amount of sent network layer packets successfully delivered to a given system entity within the time constraint required by the targeted service, divided by the total number of sent network layer packets.

user experienced data rate:

the minimum data rate required to achieve a sufficient quality experience, with the exception of scenarios for broadcast-like services, where the given value is the maximum that is needed.
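The definitions of communication service availability and reliability above are both ratios expressed as percentages. The following is a minimal illustrative sketch of that arithmetic, not part of the TR. The function names, the use of seconds as the time unit, and the convention of marking a lost packet with `None` are assumptions made for the example.

```python
from typing import Optional, Sequence


def communication_service_availability(time_delivered_s: float,
                                       time_expected_s: float) -> float:
    """Percentage of the expected service time during which the end-to-end
    service was delivered according to the agreed QoS."""
    if time_expected_s <= 0:
        raise ValueError("expected service time must be positive")
    return 100.0 * time_delivered_s / time_expected_s


def reliability(latencies_s: Sequence[Optional[float]],
                time_constraint_s: float) -> float:
    """Percentage of sent network layer packets delivered within the time
    constraint of the targeted service.

    latencies_s holds one entry per sent packet: the delivery latency in
    seconds, or None if the packet was never delivered (assumed convention).
    """
    if not latencies_s:
        raise ValueError("at least one sent packet is required")
    on_time = sum(
        1 for t in latencies_s if t is not None and t <= time_constraint_s
    )
    return 100.0 * on_time / len(latencies_s)


if __name__ == "__main__":
    # Service met the agreed QoS for 90 s out of an expected 100 s.
    print(communication_service_availability(90.0, 100.0))
    # Four packets sent, 10 ms constraint: two arrive in time,
    # one is lost, one arrives late.
    print(reliability([0.004, 0.009, None, 0.012], 0.010))
```

Note that a packet delivered after the time constraint counts against reliability in the same way as a lost packet, which follows from the wording "successfully delivered ... within the time constraint".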

3.2 Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.

AI

Artificial Intelligence

CNN

Convolutional Neural Network

DNN

Deep Neural Network

FL

Federated Learning

GPU

Graphics Processing Unit

IDC

Internet Data Center

ML

Machine Learning