DEEP LEARNING ON EDGE COMPUTING DEVICES
Design Challenges of Algorithm and Architecture

XICHUAN ZHOU
HAIJUN LIU
CONG SHI
JI LIU
Elsevier Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom 50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States Copyright © 2022 Tsinghua University Press. Published by Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. Library of Congress Cataloging-in-Publication Data A catalog record for this book is available from the Library of Congress British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library ISBN: 978-0-323-85783-3 For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals Publisher: Mara Conner Acquisitions Editor: Glyn Jones Editorial Project Manager: Naomi Robertson Production Project Manager: Selvaraj Raviraj Designer: Christian J. Bilbow Typeset by VTeX
Contents

Preface vii
Acknowledgements ix

PART 1 Introduction
1. Introduction 3
   1.1. Background 3
   1.2. Applications and trends 5
   1.3. Concepts and taxonomy 8
   1.4. Challenges and objectives 13
   1.5. Outline of the book 14
   References 16
2. The basics of deep learning 19
   2.1. Feedforward neural networks 19
   2.2. Deep neural networks 22
   2.3. Learning objectives and training process 29
   2.4. Computational complexity 33
   References 34

PART 2 Model and algorithm
3. Model design and compression 39
   3.1. Background and challenges 39
   3.2. Design of lightweight neural networks 40
   3.3. Model compression 47
   References 56
4. Mix-precision model encoding and quantization 59
   4.1. Background and challenges 59
   4.2. Rate-distortion theory and sparse encoding 61
   4.3. Bitwise bottleneck quantization methods 65
   4.4. Application to efficient image classification 67
   References 73
5. Model encoding of binary neural networks 75
   5.1. Background and challenges 75
   5.2. The basic of binary neural network 77
   5.3. The cellular binary neural network with lateral connections 79
   5.4. Application to efficient image classification 84
   References 92

PART 3 Architecture optimization
6. Binary neural network computing architecture 97
   6.1. Background and challenges 97
   6.2. Ensemble binary neural computing model 98
   6.3. Architecture design and optimization 102
   6.4. Application of binary computing architecture 105
   References 108
7. Algorithm and hardware codesign of sparse binary network on-chip 111
   7.1. Background and challenges 111
   7.2. Algorithm design and optimization 115
   7.3. Near-memory computing architecture 120
   7.4. Applications of deep adaptive network on chip 124
   References 135
8. Hardware architecture optimization for object tracking 139
   8.1. Background and challenges 139
   8.2. Algorithm 140
   8.3. Hardware implementation and optimization 143
   8.4. Application experiments 147
   References 152
9. SensCamera: A learning-based smart camera prototype 155
   9.1. Challenges beyond pattern recognition 155
   9.2. Compressive convolutional network model 159
   9.3. Hardware implementation and optimization 164
   9.4. Applications of SensCamera 166
   References 175
Index 179
Preface

We first started working in the field of edge computing-based machine learning in 2010. With project funding, we tried to accelerate support vector machine algorithms on integrated circuit chips to support embedded applications such as fingerprint recognition. In recent years, with the development of deep learning and integrated circuit technology, artificial intelligence applications based on edge computing devices, such as intelligent terminals, autonomous driving, and AIoT, have been emerging one after another. However, realizing an embedded artificial intelligence application involves multidisciplinary knowledge of mathematics, computing science, computer architecture, and circuit and system design. Therefore we arrived at the idea of writing a monograph focusing on the research progress of the relevant technologies, so as to facilitate the understanding and learning of graduate students and engineers in related fields.

Deep learning application development on embedded devices faces the theoretical bottleneck of the high complexity of deep neural network algorithms. Making the rapidly evolving deep learning models lightweight is one of the keys to realizing pervasive AIoT artificial intelligence in the future. In recent years, we have been focusing on the development of automated deep learning tools for embedded devices. This book covers some of the cutting-edge technologies currently developing in embedded deep learning and introduces some core algorithms, including lightweight neural network design, model compression, and model quantization, aiming to provide a reference for readers designing embedded deep learning algorithms.

Deep learning application development on embedded devices also faces the technical challenge of the limited progress of integrated circuit technology in the post-Moore era. To address this challenge, in this book we propose and elaborate a new paradigm of algorithm-hardware codesign to optimize the energy efficiency and performance of neural network computing in embedded devices. The DANoC sparse coding neural network chip developed by us is taken as an example to introduce the new technology of near-memory computing, hoping to give inspiration to embedded design experts. We believe that, in the post-Moore era, the system collaborative design method across multiple levels of algorithms, software, and hardware will gradually become the mainstream of embedded intelligent design, meeting the requirements of high real-time performance and low power consumption under the condition of limited hardware resources.
Due to time constraints and the authors' limited knowledge, there may be some omissions in the content, and we apologize to the readers for this.

Xichuan Zhou
Acknowledgements

First of all, we would like to thank all the students who participated in the relevant work for their contributions to this book, including Shuai Zhang, Kui Liu, Rui Ding, Shengli Li, Songhong Liang, Yuran Hu, and others. We would also like to take this opportunity to thank our families, friends, and colleagues for their support in the course of writing this monograph.

We would also like to thank our organization, the School of Microelectronics and Communication Engineering, Chongqing University, for providing supportive conditions for research on intelligent edge computing.

The main content of this book is compiled from a series of research works, partly supported by the National Natural Science Foundation of China (Nos. 61971072 and 62001063).

We are most grateful to the editorial staff and artists at Elsevier and Tsinghua University Press for giving us all the support and assistance needed in the course of writing this book.
PART 1 Introduction
CHAPTER 1
Introduction

1.1 Background

At present, human society is rapidly entering the era of the Internet of Everything, and applications of the Internet of Things based on smart embedded devices are exploding. The report "The Mobile Economy 2020" released by the GSM Association (GSMA) showed that the total number of connected devices in the global Internet of Things reached 12 billion in 2019 [1]. It is estimated that by 2025 the total number of connected devices in the global Internet of Things will reach 24.6 billion. Applications such as smart terminals, smart voice assistants, and smart driving will dramatically improve the organizational efficiency of human society and change people's lives. With the rapid development of artificial intelligence technology toward pervasive intelligence, smart terminal devices will penetrate even more deeply into human society.

Looking back at the development of artificial intelligence, at a key time point in 1936, British mathematician Alan Turing proposed an ideal computer model, the universal Turing machine, which provided a theoretical basis for the ENIAC (Electronic Numerical Integrator And Computer) born ten years later. Around the same period, inspired by the behavior of the human brain, John von Neumann wrote the monograph "The Computer and the Brain" [2] and proposed an improved stored-program computer for ENIAC, i.e., the von Neumann architecture, which became the prototype for computers and even artificial intelligence systems.

The earliest description of artificial intelligence can be traced back to the Turing test [3] in 1950. Turing pointed out that "if a machine talks with a person through a specific device without communication with the outside, and the person cannot reliably tell whether the other party is a machine or a person, this machine has humanoid intelligence". The term "artificial intelligence" itself appeared at the Dartmouth workshop organized by John McCarthy in 1956 [4]. The "father of artificial intelligence" defined it as "the science and engineering of making intelligent machines". The proposal of artificial intelligence opened up a new field, and since then academia has successively presented new research results in artificial intelligence.
Figure 1.1 Relationship diagram of deep learning and related research fields.

After several historical cycles of development, artificial intelligence has now entered a new era of machine learning. As shown in Fig. 1.1, machine learning is a subfield of theoretical research on artificial intelligence that has developed rapidly in recent years. Arthur Samuel proposed the concept of machine learning in 1959 and conceived of establishing a theoretical method "to allow the computer to learn and work autonomously without relying on certain coded instructions" [5]. A representative method in the field of machine learning is the support vector machine (SVM) [6] proposed by Russian statistician Vladimir Vapnik in 1995. As a data-driven method, the statistics-based SVM has solid theoretical support and excellent model generalization ability and is widely used in scenarios such as face recognition.

The artificial neural network (ANN) is one of the approaches to realizing machine learning. An ANN uses the structural and functional features of biological neural networks to build mathematical models for estimating or approximating functions. ANNs are computing systems inspired by the biological neural networks that constitute animal brains; an ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. The concept of the artificial neural network can be traced back to the neuron model (MP model) [7] proposed by Warren McCulloch and Walter Pitts in 1943. In this model, the input multidimensional data are multiplied by the corresponding weight parameters and accumulated, and the accumulated value is passed through a specific threshold function to output the prediction result.
Later, in 1958, Frank Rosenblatt built a perceptron system [8] with two layers of neurons, but the perceptron model and its subsequent improvements had limitations in solving high-dimensional nonlinear problems. It was not until 1986 that Geoffrey Hinton, a professor in the Department of Computer Science at the University of Toronto, proposed the back propagation algorithm [9] for parameter estimation of artificial neural networks and realized the training of multilayer neural networks.

As a branch of neural network technology, deep learning has achieved great success in recent years. The algorithmic milestone appeared in 2006, when Hinton proposed the deep belief network, built from restricted Boltzmann machines, and successfully alleviated the problem [10] of vanishing gradients in training multilayer neural networks. From then on, the artificial neural network officially entered the "deep" era. In 2012, the convolutional neural network [11], pioneered by Professor Yann LeCun of New York University, and its variants greatly improved the classification accuracy of machine learning methods on large-scale image databases and reached and then surpassed human image recognition performance in the following years, which laid the technical foundation for the large-scale industrial application of deep learning technology. At present, deep learning technology continues to develop rapidly and has achieved great success in subfields such as machine vision [12] and voice processing [13]. In particular, in 2016 the AlphaGo artificial intelligence built by Demis Hassabis's team on deep learning technology defeated Lee Sedol, the international Go champion, by 4:1, which marked the entry of artificial intelligence into a new era of rapid development.

1.2 Applications and trends

The Internet of Things technology is considered to be one of the important forces leading the next wave of industrial change. The term "Internet of Things" was coined by Kevin Ashton of MIT, who pointed out that "the computer can observe and understand the world by RF transmission and sensor technology, i.e., empower computers with their own means of gathering information" [14]. After the massive data collected by various sensors are connected to the network, the connection between human beings and everything is enhanced, thereby expanding the boundaries of the Internet and greatly increasing industrial production efficiency. In the new wave of industrial technological change, smart terminal devices will undoubtedly play an important role.
As a carrier connecting the Internet of Things, the smart perception terminal device not only performs data collection but also has front-end, local data processing capabilities, which enable the protection of data privacy and the extraction and analysis of perceived semantic information.

With the emergence of smart terminal technology, the fields of Artificial Intelligence (AI) and the Internet of Things (IoT) have gradually merged into the artificial intelligence Internet of Things (AI&IoT or AIoT). On one hand, the application scale of artificial intelligence has gradually expanded and penetrated into more fields by relying on the Internet of Things; on the other hand, Internet of Things devices require embedded smart algorithms to extract valuable information from sensor data at the collection front end. The concept of the artificial intelligence Internet of Things (AIoT) was proposed by the industrial community around 2018 [15], aiming to realize the digitization and intelligence of all things based on edge computing at Internet of Things terminals. AIoT-oriented smart terminal applications have entered a period of rapid development. According to a third-party report from iResearch, the total amount of AIoT financing in the Chinese market from 2015 to 2019 was approximately $29 billion, with an increase of 73%.

The first characteristic of AIoT smart terminal applications is high data volume, because the edge hosts a large number of devices producing large amounts of data. Gartner's report showed that there were approximately 340,000 autonomous vehicles in the world in 2019, and it is expected that in 2023 there will be more than 740,000 autonomous vehicles with data collection capabilities running in various application scenarios. Taking Tesla as an example, with eight external cameras and one powerful system on chip (SoC) [16], its vehicles can support end-to-end machine vision image processing to perceive road conditions, surrounding vehicles, and the environment. It is reported that a front camera with a resolution of 1280 × 960 in a Tesla Model 3 can generate about 473 GB of image data in one minute. According to the statistics, Tesla has collected more than 1 million video clips and labeled the distance, acceleration, and speed of 6 billion objects in the videos. The data amount is as high as 1.5 PB, which provides a good data basis for improving the performance of the autonomous driving artificial intelligence model.

The second characteristic of AIoT smart terminal applications is high latency sensitivity. For example, the vehicle-mounted ADAS of autonomous vehicles has strict requirements on the response time from image acquisition and processing to decision making.
The average response time of the Tesla Autopilot emergency braking system is about 0.3 s (300 ms), whereas a skilled driver typically needs approximately 0.5 to 1.5 s. With data-driven machine learning algorithms, the vehicle-mounted system HW3 released by Tesla in 2019 processes 2300 frames per second (fps), 21 times the 110 fps image processing capacity of HW2.5.

The third characteristic of AIoT smart terminal applications is high energy efficiency. Because wearable smart devices and smart speakers in embedded artificial intelligence applications [17] are mainly battery-driven, power consumption and battery life are particularly critical. Most smart speakers use a voice wake-up mechanism, which switches the device from the standby state to the working state upon recognition of human voice keywords. Based on an embedded voice recognition artificial intelligence chip with high power efficiency, a modern smart speaker can achieve wake-on-voice at a standby power consumption of 0.05 W. In typical offline human-machine voice interaction scenarios, the power consumption of the chip can also be kept within 0.7 W, which allows battery-driven systems to work for a long time. For example, Amazon smart speakers can achieve 8 hours of battery life in the always-listening mode, and optimized smart speakers can achieve up to 3 months of battery life.

From the perspective of future development trends, the goal of the artificial intelligence Internet of Things is to achieve ubiquitous pervasive intelligence [18]. Pervasive intelligence technology aims to solve the core technical challenges of high data volume, high time sensitivity, and high power efficiency of embedded smart devices and finally to realize the digitization and intelligence of all things [19]. The basis of this development is to understand the legal and ethical relationship between the efficiency improvement brought by artificial intelligence technology and the protection of personal privacy, so as to improve the efficiency of social production and the convenience of people's lives while guaranteeing personal privacy. We believe that pervasive intelligent computing for the artificial intelligence Internet of Things will become a key technology promoting a new wave of industrial technological revolution.
Figure 1.2 Global data growth forecast.

1.3 Concepts and taxonomy

1.3.1 Preliminary concepts

Data, computing power, and algorithms are regarded as the three elements promoting the development of artificial intelligence, and the development of these three elements has become a booster for the explosion of deep learning technology. First of all, the ability to acquire data, especially large-scale labeled data, is a prerequisite for the development of deep learning technology. According to the statistics, the size of global Internet data exceeded 30 ZB in 2020 [20]. Without data optimization and compression, the estimated storage cost alone would exceed RMB 6 trillion, which is equivalent to the combined GDP of Norway and Austria in 2020. With the further development of the Internet of Things and 5G technology, more data sources and capacity enhancements at the transmission level will appear. It is foreseeable that the total amount of data will continue to grow rapidly at an even higher speed; it is estimated that the total amount of data will reach 175 ZB by 2025, as shown in Fig. 1.2. The increase in data size provides a good foundation for the performance improvement of deep learning models; on the other hand, the rapidly growing data size also imposes higher computing performance requirements for model training.

The second element of the development of artificial intelligence is the computing system, which refers to the hardware computing devices required to realize an artificial intelligence system. The computing system is sometimes described as the "engine" that supports the application of artificial intelligence. In the deep learning era of artificial intelligence, the computing system has become an infrastructure resource. When Google's artificial intelligence AlphaGo [21] defeated the Korean Go player Lee Sedol in 2016, people marveled at the powerful artificial intelligence, but the huge "payment" behind it was little known: 1202 CPUs, 176 high-performance GPUs, and an astonishing 233 kW of power consumed in a single game.
Figure 1.3 Development trend of transistor counts.

From the perspective of the development of computing systems, the development of VLSI chips is the fundamental driving force behind the improvement of AI computing performance. The good news is that, although the development of the semiconductor industry fluctuates periodically, the well-known Moore's law [22] has withstood the test of 50 years (Fig. 1.3). Moore's law is still maintained in the field of VLSI chips, largely because the rapid development of GPUs has made up for the slower development of CPUs. We can see from the figure that since 2010 the number of transistors in GPUs has grown faster than in CPUs, whose transistor counts have begun to lag behind Moore's law; in addition, the development of hardware technologies [23] such as ASICs dedicated to deep learning and FPGA-based heterogeneous AI computing accelerators has injected new fuel into the growth of artificial intelligence computing power.

Last but not least, the third element of artificial intelligence development is the algorithm. An algorithm is a finite sequence of well-defined, computer-implementable instructions, typically used to solve a class of specific problems in finite time. Performance breakthroughs in deep learning algorithms and applications over the past 10 years are an important reason for the milestone development of AI technology. So, what is the future development trend of deep learning algorithms in the era of the Internet of Everything? This is one of the core questions discussed in academia and industry. A general consensus is that deep learning algorithms will develop toward high efficiency.
Figure 1.4 Comparison of computing power demands and algorithms for deep learning models.

OpenAI, an open artificial intelligence research organization, has pointed out that "the computing resource required by advanced artificial intelligence doubles approximately every three and a half months". The computing resource for training a large AI model has increased by 300,000 times since 2012, an average annual increase of 11.5 times, whereas the growth rate of hardware computing performance has only reached an average annual increase of 1.4 times. On the other hand, improvements in the efficiency of deep learning algorithms save on average only about 1.7 times the computing resource per year. This means that, as we continue to pursue improvements in algorithm performance, the increase in computing resource demand potentially exceeds the growth of hardware computing performance, as shown in Fig. 1.4. A practical example is the deep learning model GPT-3 [24] for natural language processing released in 2020: the cost of model training and computing resource deployment alone reached about 13 million dollars. If the computing resource cost keeps increasing exponentially, then sustainable development is difficult to achieve. How to solve this problem is one of the key questions in the development of artificial intelligence toward pervasive intelligence.

1.3.2 Two stages of deep learning: training and inference

Deep learning is generally divided into two stages, training and inference. First, the process of estimating the parameters of the neural network model based on known data is called training. Training is sometimes also known as parameter learning. In this book, to avoid ambiguity, we use the word "training" to describe the parameter estimation process.
The data required in the training process are called the training dataset. The training algorithm is usually described as an optimization task: the model parameters with the smallest prediction error on the labels of the training sample set are estimated through gradient descent [25], and a neural network model with better generalization is obtained through regularization [26]. In the second stage, the trained neural network model is deployed in the system to predict the labels of unknown data obtained by the sensors in real time. This process is called inference. Training and inference are like two sides of the same coin: they belong to different stages but are closely related, and the training quality of the model determines its inference accuracy. For the convenience of understanding the subsequent content of this book, we summarize the main machine learning concepts involved in training and inference as follows.

Dataset. The dataset is a collection of known data with similar attributes or features together with their labels. In deep learning, signals such as voices and images acquired by sensors are usually converted into data representations such as vectors, matrices, or tensors. The dataset is usually split into a training dataset and a test dataset, which are used for estimating the parameters of the neural network model and for evaluating its inference performance, respectively.

Deep learning model. In this book, we refer to a function f(x; θ) from the known data x to the label y to be estimated as the model, where θ is the collection of internal parameters of the neural network. It is worth mentioning that in deep learning the parameters and functional forms of the model are diverse and large in scale, and it is usually difficult to write down the analytical form of the function; only a formal definition is provided here.

Objective function. The process of training a deep learning model is defined as an optimization problem. The objective function of the optimization problem generally consists of two parts, a loss function and a regularization term. The loss function describes the average error of the label predictions of the neural network model on the training samples; it is minimized to improve the accuracy of the model on the training sample set. The regularization term is usually used to control the complexity of the model, so as to improve the accuracy of the model on unknown data labels, i.e., the generalization performance of the model.
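To make these concepts concrete, the following minimal sketch (ours, not from the book) uses a plain linear model with a squared loss and an L2 regularization term as the objective function and applies a few gradient descent steps. The function names, the finite-difference gradient, and the toy data are illustrative assumptions only; in practice the model f(x; θ) is a deep neural network and the gradient is computed analytically by back propagation.

```python
import numpy as np

def objective(theta, X, y, lam=1e-3):
    """Objective = loss term + regularization term (here: mean squared error + L2 penalty)."""
    predictions = X @ theta                    # toy linear model f(x; theta) standing in for a neural network
    loss = np.mean((predictions - y) ** 2)     # loss: average prediction error on the training samples
    penalty = lam * np.sum(theta ** 2)         # regularization: discourages overly complex models
    return loss + penalty

def numerical_gradient(f, theta, eps=1e-6):
    """Finite-difference gradient, used here only to keep the sketch self-contained."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Toy training dataset and a short run of gradient descent
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                                        # 100 training samples, 5 features
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
theta = np.zeros(5)                                                  # model parameters to be estimated
for _ in range(200):
    grad = numerical_gradient(lambda t: objective(t, X, y), theta)
    theta -= 0.1 * grad                                              # one gradient descent step (learning rate 0.1)
print(objective(theta, X, y))                                        # objective value after training
```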
Figure 1.5 Application scenarios of cloud and edge.

1.3.3 Cloud and edge devices

Edge computing [27] refers to a distributed architecture in which the large-scale computation of the central node is decomposed into smaller, easier-to-manage parts that are dispersed to edge nodes for processing. The edge nodes are closer to the terminal devices and offer higher transmission speed and lower latency. As shown in Fig. 1.5, the cloud refers to central servers far away from users; users can access these servers anytime and anywhere through the Internet to realize information query and sharing. The edge refers to base stations or servers close to the user side. We can see from the figure that terminal devices [28] such as monitoring cameras, mobile phones, and smart watches are closer to the edge. For deep learning applications, if the inference stage can be completed at the edge, then the problem of transmission delay can be alleviated; moreover, edge computing provides services near the data sources or users, which avoids the problem of privacy disclosure. Data show that cloud computing power will grow roughly linearly in the coming years, with a compound annual growth rate of 4.6%, whereas demand at the edge grows exponentially, with a compound annual growth rate of 32.5%.

The edge computing terminal refers to smart devices that focus on real-time, secure, and efficient data analysis for specific scenarios on user terminals. The edge computing terminal has huge development prospects in the field of the artificial intelligence Internet of Things (AIoT). A large number of sensor devices in the Internet of Things industry need to collect various types of data at high frequency.
Edge computing devices can integrate data collection, computation, and execution, effectively avoiding the cost and delay of uploading data to the cloud for processing and improving the security and privacy protection of user data. According to an IDC survey, 45% of the data generated by the Internet of Things industry in 2020 will be processed at the edge of the network, and this proportion will keep expanding in the coming years. The "2021 Edge Computing Technology White Paper" points out that typical application scenarios of edge computing smart terminals include smart vehicle networking/autonomous driving, the industrial Internet, and smart logistics. The values of ultralow latency, massive data, edge intelligence, data security, and cloud collaboration will prompt more enterprises to choose edge computing.

1.4 Challenges and objectives

In recent years, deep learning has made breakthroughs in the fields of machine vision and voice recognition. However, because the training and inference of standard deep neural networks involve a large number of parameters and floating-point computations, they usually need to run on resource-intensive cloud servers and devices. This solution faces the following two challenges.

(1) Privacy problems. Sending user data (such as photos and voice) to the cloud can cause serious privacy disclosure problems. The European Union, the United States, and other jurisdictions have set up strict legal management and monitoring systems for sending user data to the cloud.

(2) High delay. Many smart terminal applications have extremely high requirements on the end-to-end delay from data collection to completion of processing, but in the end-cloud collaborative architecture the data transmission delay is uncertain, making it difficult to meet the needs of highly time-sensitive smart applications such as autonomous driving.

Edge computing effectively addresses the above problems and has gradually become a research hotspot. Recently, edge computing has made some technological breakthroughs. On one hand, algorithm design companies have begun to seek more efficient and lightweight deep learning models (such as MobileNet and ShuffleNet). On the other hand, hardware technology companies, especially chip companies, have invested heavily in the development of dedicated neural network computing acceleration chips (such as NPUs).
How to minimize resource consumption by optimizing algorithms and hardware architectures on edge devices with limited resources is of great significance to the development and application of AIoT in the 5G and even 6G era.

Deep learning edge computing technology based on smart terminals will effectively address the above technical challenges of deep learning cloud computing. This book focuses on deep learning edge computing technology and introduces how to design, optimize, and deploy efficient neural network models on embedded smart terminals from the three levels of algorithms, hardware, and applications. On the algorithm side, neural network algorithms for edge deep learning are introduced, including lightweight neural network structure design, pruning, and compression techniques. On the hardware side, the book details the hardware design and optimization methods of edge deep learning, including algorithm-hardware collaborative design, near-memory computing, and hardware implementation of ensemble learning. Each part also briefly introduces the corresponding applications. In addition, as a comprehensive example, the application of smart monitoring cameras, which integrates algorithm innovation and hardware architecture innovation, is introduced as a separate part at the end of this book.

1.5 Outline of the book

This book aims to comprehensively cover the latest progress in edge-based neural computing, including algorithm models and hardware design. To reflect the needs of the market, we attempt to systematically summarize the related technologies of edge deep learning, including algorithm models, hardware architectures, and applications. The performance of deep learning models can be maximized on edge computing devices through collaborative algorithm-hardware-code design. The structure of this book is as follows. It includes three parts and nine chapters: Part 1 is Introduction, including two chapters (Chapters 1-2); Part 2 is Model and Algorithm, including three chapters (Chapters 3-5); and Part 3 is Architecture Optimization, including four chapters (Chapters 6-9).

The first chapter (Introduction) mainly describes the development process, related applications, and development prospects of artificial intelligence, provides some basic concepts and terms in the field of deep learning, and finally presents the research content and contributions of this book.
The second chapter (The Basics of Deep Learning) explains the relevant foundations of deep learning, including the architectures of feedforward neural networks, convolutional neural networks, and recurrent neural networks, the training process of network models, and the performance and challenges of deep neural networks on AIoT devices.

Chapter 3 (Model Design and Compression) discusses current lightweight model design and compression methods, covering efficient lightweight network design through some classical lightweight models and model compression through two typical methods, model pruning and knowledge distillation.

Chapter 4 (Mix-Precision Model Encoding and Quantization) proposes a mixed-precision quantization and encoding method, the bitwise bottleneck, from the perspective of quantization and encoding of neural network activations based on signal compression theory in wireless communication; it quantizes neural network activations from floating-point to low-precision fixed-point types. Experiments on ImageNet and other datasets show that, by minimizing the quantization distortion of each layer, the bitwise bottleneck encoding method achieves state-of-the-art performance with low-precision activations.

Chapter 5 (Model Encoding of Binary Neural Networks) focuses on the binary neural network model and proposes a hardware-friendly method to improve the performance of efficient deep neural networks with binary weights and activations. The cellular binary neural network consists of multiple parallel binary neural networks whose lateral connections are optimized through group sparse regularization and knowledge distillation. Experiments on the CIFAR-10 and ImageNet datasets show that, by introducing optimized group-sparse lateral paths, the cellular binary neural network obtains better performance than other binary deep neural networks.

Chapter 6 (Binary Neural Network Computing Architecture) proposes a fully pipelined BNN accelerator from the perspective of hardware acceleration design, which has a bagging ensemble unit aggregating multiple BNN pipelines to achieve better model precision. Compared with other methods, this design greatly reduces the memory footprint and improves power efficiency on the MNIST dataset.

Chapter 7 (Algorithm and Hardware Codesign of Sparse Binary Network-on-Chip) proposes a hardware-oriented deep learning algorithm, the deep adaptive network, from the perspective of algorithm-hardware collaborative design to exploit the sparsity of connections in neural networks.
To make full use of the advantages of algorithm optimization, we propose an efficient hardware architecture based on sparsely mapped memory. Unlike traditional network-on-chip architectures, the deep adaptive network on chip (DANoC) closely combines communication and computation to avoid the massive power loss caused by parameter transmission between onboard memory and on-chip computing units. The experimental results show that, compared with state-of-the-art methods, the system achieves higher precision and efficiency.

Chapter 8 (Hardware Architecture Optimization for Object Tracking) proposes a low-cost, high-speed VLSI system for object tracking, designed from the perspective of algorithm-hardware codesign and based on texture and dynamic compressive sensing features and an ellipse matching algorithm. The system introduces a memory-centric architecture, multistage pipelines, and parallel processing circuits to achieve high frame rates while consuming minimal hardware resources. On an FPGA prototype system running at a clock frequency of 100 MHz, a processing speed of 600 frames per second is achieved while maintaining stable tracking results.

Chapter 9 (SensCamera: A Learning-Based Smart Camera Prototype) provides an example of an edge computing terminal, a smart monitoring camera prototype system designed from the perspective of algorithm-hardware codesign, which integrates algorithm innovation and hardware architecture innovation. First, we propose a hardware-friendly algorithm, an efficient convolutional neural network unifying object detection and image compression. The algorithm uses convolution computation to perform near-isometric compressive sensing and introduces a new noncoherent convolution method to learn the sampling matrix so as to realize the near-isometric property of compressive sensing. Finally, through hardware-oriented algorithm optimization, a smart camera prototype built with independent hardware can perform object detection and image compression on 20 to 25 video frames per second with a power consumption of 14 watts.

References
[1] Intelligence-GSMA, The mobile economy 2020, Tech. rep., GSM Association, London, 2020.
[2] J. Von Neumann, The Computer and the Brain, Yale University Press, 2012.
[3] B.G. Buchanan, A (very) brief history of artificial intelligence, AI Magazine 26 (2005) 53–60.
[4] J. McCarthy, M. Minsky, N. Rochester, C.E. Shannon, A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955, AI Magazine 27 (2006) 12–14.
[5] A.L. Samuel, Some studies in machine learning using the game of checkers, IBM Journal of Research and Development 3 (1959) 210–229.
[6] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3) (1995) 273–297.
[7] W.S. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics 5 (4) (1943) 115–133.
[8] F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain, Psychological Review 65 (6) (1958) 386.
[9] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors, Nature 323 (6088) (1986) 533–536.
[10] G.E. Hinton, S. Osindero, Y.-W. Teh, A fast learning algorithm for deep belief nets, Neural Computation 18 (7) (2006) 1527–1554.
[11] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86 (11) (1998) 2278–2324.
[12] A. Voulodimos, N. Doulamis, A. Doulamis, E. Protopapadakis, Deep learning for computer vision: A brief review, Computational Intelligence and Neuroscience (2018).
[13] H. Purwins, B. Li, T. Virtanen, J. Schlüter, S.-y. Chang, T. Sainath, Deep learning for audio signal processing, IEEE Journal of Selected Topics in Signal Processing 13 (2019) 206–219.
[14] K. Ashton, et al., That 'internet of things' thing, RFID Journal 22 (7) (2009) 97–114.
[15] A. Ghosh, D. Chakraborty, A. Law, Artificial intelligence in internet of things, CAAI Transactions on Intelligence Technology 3 (4) (2018) 208–218.
[16] S. Ingle, M. Phute, Tesla autopilot: semi autonomous driving, an uptick for future autonomy, International Research Journal of Engineering and Technology 3 (9) (2016) 369–372.
[17] B. Sudharsan, S.P. Kumar, R. Dhakshinamurthy, AI vision: Smart speaker design and implementation with object detection custom skill and advanced voice interaction capability, in: Proceedings of International Conference on Advanced Computing, 2019, pp. 97–102.
[18] D. Saha, A. Mukherjee, Pervasive computing: a paradigm for the 21st century, Computer 36 (3) (2003) 25–31.
[19] M. Satyanarayanan, Pervasive computing: Vision and challenges, IEEE Personal Communications 8 (4) (2001) 10–17.
[20] D. Reinsel, J. Gantz, J. Rydning, Data age 2025: the evolution of data to life-critical, don't focus on big data; focus on the data that's big, Tech. rep., IDC, Seagate, 2017.
[21] S.D. Holcomb, W.K. Porter, S.V. Ault, G. Mao, J. Wang, Overview on DeepMind and its AlphaGo Zero AI, in: Proceedings of the International Conference on Big Data and Education, 2018, pp. 67–71.
[22] R. Schaller, Moore's law: past, present and future, IEEE Spectrum 34 (1997) 52–59.
[23] D. Han, S. Zhou, T. Zhi, Y. Chen, T. Chen, A survey of artificial intelligence chip, Journal of Computer Research and Development 56 (1) (2019) 7.
[24] T.B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, et al., Language models are few-shot learners, arXiv:2005.14165.
[25] J. Zhang, Gradient descent based optimization algorithms for deep learning models training, arXiv:1903.03614.
[26] J. Kukacka, V. Golkov, D. Cremers, Regularization for deep learning: A taxonomy, arXiv:1710.10686.
[27] W. Shi, J. Cao, Q. Zhang, Y. Li, L. Xu, Edge computing: Vision and challenges, IEEE Internet of Things Journal 3 (5) (2016) 637–646.
[28] J. Chen, X. Ran, Deep learning with edge computing: A review, Proceedings of the IEEE 107 (8) (2019) 1655–1674.
CHAPTER 2
The basics of deep learning

2.1 Feedforward neural networks

A feedforward neural network (or fully connected neural network) is one of the earliest neural network models invented in the field of artificial intelligence [1]. It is able to learn autonomously from the input data to complete specific tasks. Here we take image classification [2], one of the core problems in the field of computer vision, as an example to illustrate the principle of a feedforward neural network. The so-called classification problem consists in assigning a label to each input datum given a fixed set of classification labels. The task of a feedforward neural network is to predict the classification label of a given image. The prediction is made by giving the scores (prediction probabilities) of the image under each classification label in the form of a vector, which is also the output of the feedforward neural network. The label with the highest score is the category to which the network predicts that the image belongs. As shown in Fig. 2.1(b), the prediction is a simple linear mapping combined with an activation function σ,

f(x; W, b) = σ(Wx + b),   (2.1)

where the image data x ∈ R^d, and d is the number of pixel elements of the image. The parameters of this linear function are the matrix W ∈ R^{c×d} and the column vector b ∈ R^c, where c is the number of categories. The parameter W is called the weight, and b is called the bias vector. Obviously, the weight and bias affect the performance of the feedforward neural network, and correct prediction is closely related to the values of these two quantities. According to the rules of matrix multiplication, the output is a column vector of size c × 1, i.e., the scores of the c categories mentioned earlier.
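As a quick numerical illustration of Eq. (2.1), the short sketch below (ours, not from the book) computes the score vector σ(Wx + b) for a toy input; the dimensions, the random weights, and the use of the sigmoid as the activation σ are assumptions made only for this example.

```python
import numpy as np

def sigmoid(z):
    """One possible choice for the activation function sigma."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_scores(x, W, b):
    """Eq. (2.1): f(x; W, b) = sigma(W x + b), one score per category."""
    return sigmoid(W @ x + b)

d, c = 4, 3                                # d pixel values per image, c categories (toy sizes)
x = np.array([0.2, 0.8, 0.5, 0.1])         # a flattened "image" with d elements
W = np.random.randn(c, d)                  # weight matrix, shape c x d
b = np.zeros(c)                            # bias vector of length c
scores = predict_scores(x, W, b)           # column vector of c scores
predicted_label = int(np.argmax(scores))   # the label with the highest score wins
```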
Figure 2.1 The correspondence between the neuronal structure of the human brain and the artificial neural network.

The structure of the feedforward neural network is inspired by the neuronal system of the human brain [3]. The basic unit of computation in the brain is the neuron. There are about 80 billion neurons in the human neuronal system, connected by approximately 10^14 to 10^15 synapses. Fig. 2.1(a) shows a biological neuron. As shown in the figure, each neuron receives input signals from its dendrites and then generates output signals along its unique axon. The axon branches off gradually at the end and is connected to the dendrites of other neurons through synapses. In the computational model of an artificial neuron, signals propagating along the axon (equivalent to the input x) interact with the dendrites of other neurons (equivalent to the matrix operation Wx) based on the synaptic strength of the synapses (equivalent to the weight W). The synaptic strength controls both the strength of the influence of one neuron on another and its direction: to excite (positive weight) or suppress (negative weight) that neuron. The dendrites transmit signals to the cell body, where the signals are added up.

According to what has been said so far, the human brain system works in a way similar to the linear mapping we just mentioned, but then comes the crucial point: a neuron activates and outputs an electrical pulse to its axon only if the sum in the cell body is above a certain threshold. In neuronal dynamics, the Leaky Integrate-and-Fire (LIF) model [4] is commonly used to describe this process. The model describes the membrane potential of the neuron in terms of the synaptic input and the injected current it receives. Simply speaking, the communication between two neurons requires a spike as a mark.
When the synapse of the previous neuron delivers a current, the membrane potential rises. Once the membrane potential exceeds a given threshold, a spike is generated, and the membrane potential is reset. This process of spike generation is similar to a threshold-based function: if the current is below the threshold, then there is no spike, and if the current is above the threshold, then a spike is emitted, which resembles the behavior of a step function.

The concept of the activation function [5] was proposed in light of this characteristic of human brain neurons. The activation function makes the neural network nonlinear, so that problems that linear regression cannot handle can be solved. The step function just mentioned can handle the binary classification problem (outputting "yes" or "no"). For more categories, we need an intermediate activation value, or an accurate description of the degree of activation, rather than a simple division into 100% or 0. In this context, traditional activation functions such as the sigmoid were proposed, which normalizes the input to (0, 1), achieves nonlinearity, and provides intermediate activation values. The formulations and curves are shown in Fig. 2.2.

Figure 2.2 The illustration of two activation functions, the step and sigmoid functions.
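The following small sketch (ours, not from the book) contrasts the two activation functions of Fig. 2.2: the hard step function, which mimics the all-or-nothing spiking threshold, and the sigmoid, which gives graded activation values in (0, 1). The threshold value and sample inputs are arbitrary.

```python
import numpy as np

def step(z, threshold=0.0):
    """Step activation: outputs 1 only when the input exceeds the threshold, otherwise 0."""
    return (z > threshold).astype(float)

def sigmoid(z):
    """Sigmoid activation: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-6.0, 6.0, 7)   # sample pre-activation values
print(step(z))                  # hard 0/1 decisions, like a spike / no-spike threshold
print(sigmoid(z))               # smooth, graded degrees of activation
```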
Normally, a typical feedforward neural network has one or more additional layers of neurons between the input and output layers, which are called hidden layers. The hidden layers exist to identify and divide the features of the input data in greater detail [6], so as to make correct predictions. We divide a classification problem into multiple subproblems based on physical features, and each neuron in the hidden layers is responsible for dealing with such a subproblem.

Figure 2.3 Schematic diagram of a three-layer feedforward neural network.

Fig. 2.3 shows a three-layer feedforward neural network. The first layer is the input layer, containing three neurons; the second layer is a hidden layer, containing four neurons; and the third layer is the output layer, containing two neurons. Neurons in two adjacent layers are connected to each other, and neurons in the same layer are not connected. Taking the input layer and the hidden layer of the network as an example, the fully connected computation is

ŷ_j = sigmoid(∑_{i=1}^{3} w_{ij} x_i + b_j),   (2.2)

where w_{ij} is the weight between the ith neuron in the input layer and the jth neuron in the hidden layer, and x_i is the value of the ith element of the input column vector. These values are multiplied and summed, the bias b_j of the jth hidden neuron is added, and the result is passed through the sigmoid activation function, yielding the output ŷ_j of each neuron in the hidden layer. In practical applications, the number of hidden layers can be increased, and the way the hidden layers are connected can also change, which leads to the deep neural networks discussed later.
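To connect Eq. (2.2) with Fig. 2.3, the sketch below (ours, with randomly chosen placeholder weights) runs a forward pass through a 3-4-2 network: three input neurons, four hidden neurons computed exactly as in Eq. (2.2), and two output neurons.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass of the three-layer network of Fig. 2.3."""
    h = sigmoid(W1 @ x + b1)   # hidden layer, Eq. (2.2): y_j = sigmoid(sum_i w_ij * x_i + b_j)
    y = sigmoid(W2 @ h + b2)   # output layer: two category scores
    return y

x = np.array([0.5, -1.2, 0.3])               # three input neurons
W1, b1 = np.random.randn(4, 3), np.zeros(4)  # input -> hidden layer (4 neurons)
W2, b2 = np.random.randn(2, 4), np.zeros(2)  # hidden -> output layer (2 neurons)
print(forward(x, W1, b1, W2, b2))            # two output scores
```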
2.2 Deep neural networks

2.2.1 Convolutional neural networks

As we mentioned in the previous section, the input to a feedforward neural network is vector data, which undergo feature extraction in multiple hidden layers. Each hidden layer is made up of several neurons, and each neuron is connected to all neurons in the previous layer. The final fully connected layer is called the output layer, and in classification problems its output values are considered to be the scores of the different categories. Such a network structure has obvious defects when facing large-size image inputs. The fully connected structure between hidden layers leads to a sharp increase in the number of network parameters, which not only greatly reduces the training speed, but may also lead to overfitting and severely damage the model performance. The fitting accuracy can be improved by increasing the number of network layers, but with more layers, problems such as vanishing gradients tend to appear, making it difficult for the network training to converge.

The way the human brain recognizes images has been found to be instructive for improving the structure of artificial neural networks. The human brain first perceives each local feature in a picture and then integrates them at a higher level to obtain global information. This exploits the sparse connectivity of the observed objects in the image: local pixels in the image are closely related, whereas the correlation between pixels that are farther apart is weak. Like the human brain, we only need to perceive local features of an image in the hidden layers and then integrate the local information at a higher layer to recognize the complete image.

In recent years, it has been found that the convolution operator, which is widely used in the field of signal processing, can accomplish such a process. For one-dimensional time series signals, convolution is a special integral operation. When extended to a two-dimensional image, a matrix called the convolutional kernel takes the place of one of the signals participating in the convolution in the one-dimensional case [7]. Each convolutional kernel in the hidden layer is connected to only one local area of the input data, and the spatial size of this connection is called the receptive field of the neuron.

Figure 2.4 Schematic diagram of the receptive field of the convolution operation.
24 Deep Learning on Edge Computing Devices Figure 2.5 The illustration of a typical convolutional neural network structure. The receptive field can be understood as the size of the area seen by a neu- ron [8]. The deeper the neuron, the larger the input area that the neuron can see. As shown in Fig. 2.4, the receptive field of each neuron in the first hidden layer is 3, the receptive field of each neuron in the second hidden layer is 5, and the receptive field of each neuron in the third hidden layer is 7. The further away the hidden layer is from the input layer, the more features can be obtained, realizing the control from local features to the whole perception. The convolutional kernel is equivalent to a mapping rule in which the value of an original image pixel point is multiplied by the value of the convolutional kernel at the corresponding location, and then the resulting values are added according to the weights. This process is similar to the search for a class of patterns in an image to extract the features of the image. Obviously, such a filter is not able to extract all features, and a set of different filters is required. Convolutional neural network is a kind of feedforward neural networks with convolution operation and deep structure [9]. Its structure, as shown in Fig. 2.5, includes multiple convolutional layers for feature extraction, pool- ing layers for reducing the amount of computation, and a fully connected neural network layer for classification. We will elaborate on the principles of each layer below. A convolutional layer is a hidden layer that contains several convolu- tion units in a convolutional neural network, which is used for feature extraction. As mentioned above, the convolution is characterized by sparse connection and parameter sharing. The structure of the convolutional layer is shown in Fig. 2.6. The square window on the left is the previously men-
The basics of deep learning 25 Figure 2.6 The diagram of convolution operation, where c is the number of input fea- ture map channels, n is the number of convolutional kernels, and k × k is the size of a convolutional kernel. tioned convolutional kernel, also known as a filter or weight. The weight window slides on the input feature map and computes the convolution with the feature map to obtain the output feature map. The filter can be shared, and only the feature map data in the filter window participate in the operation during each convolution computation. This is based on two considerations of image vision: first, only local neurons are important, and connections between neurons that are further apart can be ignored; sec- ond, patterns of the same picture may show any position of the image, and the same weight needs to be used for computation by sliding on the input data. The sparse connection and parameter sharing mechanism reduces the number of parameters of the model and improves the trainability of deep neural network. In a convolutional layer, the size of the output feature map is determined by the convolutional kernel size, stride, and zero-padding. The stride is the value of the pixel that the convolutional kernel moves each time it slides. Zero-padding means that the edge of the input feature map is filled with 0, which controls the spatial size of the output data volume. It is worth noting that these three variables are all hyperparameters of a deep neural network model. Different from parameters that can be obtained through training (such as the weight and bias), hyperparameters are variables determined by experience, such as the convolutional kernel size and convolutional ker- nel depth mentioned here, as well as the model training learning rate and iteration hyperparameters to be mentioned later.
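The relationship among kernel size, stride, and zero-padding described above determines the spatial size of the output feature map. The helper below uses the commonly cited formula (input − kernel + 2·padding)/stride + 1; the formula and the concrete sizes are an illustrative assumption, since the text does not state them explicitly.

def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    # Spatial size of a convolutional layer's output feature map:
    # floor((input - kernel + 2 * padding) / stride) + 1
    return (input_size - kernel_size + 2 * padding) // stride + 1

# Example: a 224 x 224 input with a 3 x 3 kernel, stride 1, and zero-padding 1
# keeps the spatial size unchanged.
print(conv_output_size(224, 3, stride=1, padding=1))   # 224
# A 4 x 4 map with a 2 x 2 window and stride 2 shrinks to 2 x 2, matching the
# pooling example given in the next subsection.
print(conv_output_size(4, 2, stride=2, padding=0))     # 2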
26 Deep Learning on Edge Computing Devices Figure 2.7 The illustration of two different types of pooling, max pooling and average pooling. The pooling layer is a hidden layer used to abstract information in a deep neural network. Pooling is also a method to reduce the amount of com- putation of the model, which can increase the receptive field and reduce the difficulty and parameters of optimization. Fig. 2.7 shows two common pooling operations [10]. For each feature map channel, a window sliding operation is performed to realize max pooling (take the maximum value) or average pooling (take the average value) of data in the window to reduce the amount of data, prevent overfitting, and improve the generalization ability of the model. Generally, the pooling stride is greater than 1, which is used to reduce the scale of feature map. As shown in Fig. 2.7, the 4 × 4 feature map passes through a 2 × 2 pooling layer, whose stride is 2, and the output size is 2 × 2. A convolutional neural network, as an important supporting technol- ogy of deep learning, promotes the development of artificial intelligence. Convolution operators can effectively extract spatial information and are widely used in the field of visual images, including image recognition [11], image segmentation [12], target detection [13], etc. In the model inference stage, the image data is input into the network, multilevel feature extrac- tion is carried out through computation of multiple macromodules, and the prediction results of categories are output using a fully connected layer. In the model training stage, for a given input data, the error between the predicted result of a label and the real label is computed. Then the error gradient with respect to each parameter is computed by the back propa- gation algorithm. Finally, the parameters are updated by using the gradient descent algorithm. The above iterative steps are repeated for many times to gradually reduce the neural network prediction error until it converges. Compared with traditional feedforward neural networks, the convolutional
The basics of deep learning 27 neural network generally has better prediction accuracy and is one of the most important deep neural network structures. 2.2.2 Recurrent neural networks Both the deep feedforward and convolutional neural networks mentioned above have a characteristic that their network structures are arranged in order, neurons in the lth layer receive only signals from neurons in the (l − 1)th layer, and there is no feedback structure. However, in a particular task, to better capture the time-sequential features of the input vector, it is sometimes necessary to combine the sequential inputs. For example, in speech signal processing, if an exact translation of a sentence is required, then it is obviously impossible to translate each word separately. Instead, the words need to be connected together to form a sequence, and then the entire time-sequential sequence is processed. The recurrent neural network (RNN) [14] described in this section is a neural network structure that processes time-sequential data and has feedback. RNN, originated in the 1980s and 1990s [15], is a recursive neural network that takes sequence data as input, adds feedback in the evolu- tion direction of the sequence, and links all nodes in a chain. It is difficult for a traditional feedforward neural network to establish a time-dependent model, whereas RNN can integrate information from input unit and pre- vious time node, allowing information to continue to function across the time node. This means that the network has a memory function, which is very useful in natural language processing, translation, speech recognition, and video processing. Fig. 2.8 shows the basic structure of a standard RNN. On the left, there is a folded diagram, and on the right, there is the structure diagram ex- panded in chronological order. We can see that the loop body structure of RNN is located in the hidden layer. This network structure reveals the essence of RNN: the network information of the previous moment will act on that of the next moment, that is, the historical information of the previ- ous moment will be connected to the neuron of the next moment through weights. As shown in the figure, in an RNN network, x represents an in- put, h represents a hidden layer unit, o represents an output, y represents a training label, t represents time, U is a parameter from the input layer to hidden layer, V is a parameter from the hidden layer to output layer, and W is a recurrent layer parameter. As we can see from the previous description, the performance of h at the moment of t is not only determined by the
28 Deep Learning on Edge Computing Devices Figure 2.8 The schematic diagram of RNN, and its corresponding unfolded form. input at that moment, but also influenced by the previous moment: h(t) = tanh(Ux(t) + Wh(t−1) + b). (2.3) The output at the moment of t is o(t) = Vh(t) + c. (2.4) The final output predicted by the model is y = σ(o(t) ). (2.5) It is worth noting that for excessively long speech sequences, RNN only has short-term memory due to the problem of gradient vanishing during the training of back propagation model. Long short-term memory (LSTM) [16] and gate recurrent unit (GRU) [17] are two solutions to short-term memory of RNN, which introduce a gating mechanism to regulate infor- mation flow. Take the LSTM structure as an example. It contains a forget gate, an input gate, and an output gate, which are used to reserve or delete the incoming information and to record key information. LSTM performs well in long-term memory tasks, but the structure also leads to more pa- rameters, making the training more difficult. Compared with LSTM, GRU with similar structure uses a single gate to complete the forget and se- lection information stages, reducing parameters while achieving the same performance as LSTM, which is widely used under the condition of limited computing resource and time cost.
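A single recurrent step, as in Eqs. (2.3)–(2.5), can be sketched in a few lines of NumPy. The parameter matrices U, V, W and biases b, c follow the notation of Fig. 2.8; the layer sizes and the use of a softmax for σ are illustrative assumptions rather than choices made in the text.

import numpy as np

def softmax(z):
    z = z - np.max(z)          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def rnn_step(x_t, h_prev, U, W, V, b, c):
    # h(t) = tanh(U x(t) + W h(t-1) + b), Eq. (2.3)
    h_t = np.tanh(U @ x_t + W @ h_prev + b)
    # o(t) = V h(t) + c, Eq. (2.4); y = sigma(o(t)), Eq. (2.5)
    y_t = softmax(V @ h_t + c)
    return h_t, y_t

rng = np.random.default_rng(1)
input_dim, hidden_dim, output_dim = 8, 16, 4   # illustrative sizes
U = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
V = rng.normal(scale=0.1, size=(output_dim, hidden_dim))
b, c = np.zeros(hidden_dim), np.zeros(output_dim)

h = np.zeros(hidden_dim)                       # initial hidden state
for x_t in rng.normal(size=(5, input_dim)):    # a toy sequence of length 5
    h, y = rnn_step(x_t, h, U, W, V, b, c)     # hidden state carries memory across steps
print(y.shape)                                 # (4,) class probabilities at the final step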
2.3 Learning objectives and training process

2.3.1 Loss function

Deep learning model training and parameter estimation are generally based on the optimization of specific loss functions (or objective functions, collectively referred to as loss functions in this book). In model optimization theory, a loss function is a function that maps the values of one or more variables to the real number field. For the training of a neural network model, we use the loss function to measure the degree of inconsistency between the predicted value and the ground-truth label; it is a nonnegative real-valued function, and the loss over all samples of the training set is usually expressed as

J(\theta) = \mathbb{E}_{(x,y)\sim P_{\mathrm{data}}} L(x, y; \theta) = \mathbb{E}_{(x,y)\sim P_{\mathrm{data}}} L(f(x; \theta), y),   (2.6)

where L is the loss function of each sample, f(x; θ) is the output predicted by the model when x is the input, P_data is the empirical distribution, E represents expectation, y is the vector of data labels, and θ represents all parameters of the neural network. The smaller the output of the loss function, the smaller the gap between the predicted value and the data label, and the better the performance of the model. Most importantly, the loss function is differentiable and can therefore be used in solving the optimization problem. In the model training stage, the predicted value is obtained through forward propagation after data is fed into the model, and the loss function then computes the difference between the predicted value and the data label, i.e., the loss value. The model updates the parameters by back propagation to reduce the loss value, so that the predicted value generated by the model approaches the ground-truth label of the data, thereby achieving the purpose of learning. In the following we mainly introduce the loss functions of two classical prediction tasks, classification and regression.

The cross entropy loss function [18] is one of the most commonly used classification objective functions in current deep neural networks; it belongs to the loss functions based on probability distribution measurements. In information theory, entropy is used to describe the measurement of uncertainty. Cross entropy was originally used to estimate the average coding length, and in machine learning it is used to evaluate the difference between the probability distribution obtained by the current training and the real distribution. For a single sample, the cross entropy loss function takes the form
L = -\sum_{k=1}^{C} y_k \log \hat{y}_k,   (2.7)

where C represents the number of output categories, ŷ_k represents the kth output element (k ∈ {1, 2, ..., C}) of the neural network, and y_k is the corresponding data label: y_k is 1 if k is the true category and 0 otherwise. Different from the binary nature of data labels in a classification task (discrete type), each dimension of a data label in a regression task is a real number (continuous type). In regression tasks the prediction error is frequently used to measure how close the predicted value of a model is to the data label. Assuming that the real label corresponding to the ith input feature x_i in a regression problem is y_i = [y_{i1}, ..., y_{ik}, ..., y_{iM}]^T, where M is the total dimension of the label vector, the prediction error between the network's regression prediction ŷ_{ik} and its real label y_{ik} in the kth dimension is

L_{ik} = y_{ik} - \hat{y}_{ik}.   (2.8)

Loss functions frequently used in regression tasks are the L1 [19] and L2 [20] losses. The L1 loss function for N samples is defined as

L = \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{M} |L_{ik}|.   (2.9)

There are many kinds of loss functions, including loss functions designed for specific tasks, such as the classification and regression tasks mentioned above, and loss functions based on distance measurement or probability distribution; for example, the mean square error loss function [21] and the L1 and L2 loss functions are based on distance measurement, whereas the cross entropy loss function and the Softmax loss function [22] are based on probability distribution. The selection of a loss function needs to consider the data features, and in some cases regularization terms should be added to improve the performance of the model.
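The cross entropy loss of Eq. (2.7) and the L1 loss of Eq. (2.9) can be sketched directly in NumPy. This is a minimal illustration; the labels and predictions below are made up for the example, and the classification label is assumed to be one-hot.

import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Eq. (2.7): L = -sum_k y_k * log(y_hat_k), with y one-hot.
    return -np.sum(y_true * np.log(y_pred + eps))

def l1_loss(y_true, y_pred):
    # Eq. (2.9): mean over N samples of the summed absolute errors per dimension.
    return np.mean(np.sum(np.abs(y_true - y_pred), axis=1))

# Classification: 3 categories, the true class is the second one.
y_true_cls = np.array([0., 1., 0.])
y_pred_cls = np.array([0.2, 0.7, 0.1])           # predicted class probabilities
print(cross_entropy(y_true_cls, y_pred_cls))     # ~0.357

# Regression: N = 2 samples, M = 2 output dimensions.
y_true_reg = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred_reg = np.array([[0.9, 2.1], [2.5, 4.5]])
print(l1_loss(y_true_reg, y_pred_reg))           # 0.6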
2.3.2 Regularization

In model training, the loss value on the training samples can be continuously decreased by increasing the number of training iterations or adjusting hyperparameter settings. However, while the prediction accuracy on the training samples may keep improving, the prediction accuracy on the testing samples may decrease instead of rising, which is called overfitting. Therefore regularization methods should be used to improve the generalization ability of the model and avoid overfitting.

Regularization is a class of methods designed to reduce the generalization error, i.e., the error of the model on testing samples, and thereby correct the model. In traditional machine learning algorithms, the generalization ability is mainly improved by limiting the complexity of the model. Generally speaking, the model complexity is linearly related to the amount of weight parameters W: the larger the volume of W, the greater the complexity, and the more complex the model. Therefore, to limit the complexity of the model, it is natural to reduce the number of weight parameters W, that is, to make some elements of W zero or to limit the number of nonzero elements. Let the parameter θ of the neural network contain the weight coefficients of all layers of the network. The complexity of the model parameters can be limited by adding a parameter penalty Ω(θ) to the loss function. The regularized loss function is denoted as

\tilde{L}(x, y; \theta) = L(x, y; \theta) + \alpha \Omega(\theta),   (2.10)

where α ∈ [0, ∞) is the hyperparameter weighing the relative contribution of the regularization term against the standard objective function L(x, y; θ). Setting α to 0 indicates that there is no regularization; the larger the α, the greater the corresponding regularization contribution. By introducing regularization terms we hope to limit the number of nonzero elements in W, so that the weight parameters are as small as possible and close to 0. The most frequently used regularization penalty is the L2 norm, which suppresses large weight values by applying an element-by-element squared penalty to all parameters. The L2 parameter norm penalty is also known as weight decay [23], a regularization strategy that pulls the weights closer to the origin by adding the regularization term \Omega = \frac{1}{2}\|W\|_2^2 to the objective function.
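As a minimal sketch of Eq. (2.10) with the L2 penalty, the snippet below adds the weight-decay term to a per-sample loss and then applies the gradient update discussed in the next subsection. The quadratic loss, learning rate, and weight values are illustrative assumptions only, not choices made by the text.

import numpy as np

def l2_penalty(W):
    # Omega = 0.5 * ||W||_2^2, the weight-decay term.
    return 0.5 * np.sum(W ** 2)

def regularized_loss(W, x, y, alpha):
    # L~ = L + alpha * Omega, Eq. (2.10), with a simple squared-error L for illustration.
    y_pred = W @ x
    return 0.5 * np.sum((y_pred - y) ** 2) + alpha * l2_penalty(W)

def regularized_grad(W, x, y, alpha):
    # Gradient of the regularized loss w.r.t. W; the penalty contributes alpha * W.
    y_pred = W @ x
    return np.outer(y_pred - y, x) + alpha * W

W = np.array([[0.5, -0.2], [0.3, 0.8]])   # toy 2 x 2 weight matrix
x = np.array([1.0, 2.0])
y = np.array([0.0, 1.0])
alpha, lr = 0.01, 0.1                     # regularization strength and learning rate
for _ in range(100):
    W -= lr * regularized_grad(W, x, y, alpha)   # gradient step (see next subsection)
print(regularized_loss(W, x, y, alpha))          # loss decreases as W is pulled toward small values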
2.3.3 Gradient-based optimization method

As mentioned earlier, model training in machine learning is achieved by minimizing a loss function. Under normal circumstances, the loss function is very complicated, and it is difficult to solve for an analytic expression of its minimum. Gradient descent [24] is designed to solve this kind of problem. For ease of understanding, let us take an example and regard the process of solving for the minimum value of a loss function as "standing somewhere on a slope and looking for the lowest point". We do not know the exact location of the lowest point; the gradient descent strategy is to take a small step in the direction of the downward slope, and after a long downward walk there is a high probability of ending up near the lowest point. We select the direction of the downward slope to be the negative direction of the gradient, because the negative gradient at each point is the steepest descent direction of the function at that point. Deep neural networks usually use gradient descent to update parameters, and by introducing randomness, adjusting the learning rate, and other methods it is hoped that the networks can avoid falling into poor local minimum points and converge to better points. This is the traditional idea of gradient descent.

Stochastic gradient descent (SGD) is one of the most frequently used methods for updating parameters. In this method, the gradient of the loss function with respect to the parameters is computed using a mini-batch of random samples from the whole data set. SGD typically divides the whole data set into several small batches of sample data, then iterates over the input, computes losses and gradients, and finally updates the parameters. Let θ be the neural network parameters, and collect a small batch {x_1, x_2, ..., x_N} of N samples from the training set, where x_i corresponds to the label y_i. The following equations show the computation principle of gradient descent [25]. Gradient computation:

\hat{g} \leftarrow \frac{1}{N} \nabla_{\theta} \sum_{i=1}^{N} L(f(x_i; \theta), y_i).   (2.11)

Parameter update:

\theta \leftarrow \theta - \varepsilon \hat{g},   (2.12)

where ĝ represents the gradient of the loss function with respect to the parameter θ, and ε is the learning rate, a hyperparameter that controls the step size of the parameter update. A learning rate that is too large causes fluctuation near the minimum without convergence, whereas a learning rate that is too small leads to spending more time on convergence, so the convergence speed is slow. The learning rate can be adjusted by experience or by algorithms; for example, the learning process may be slower when a flat or high-curvature area is encountered. A momentum algorithm can be added to SGD to improve
The basics of deep learning 33 the convergence speed. At present, there are also adaptive algorithms such as Adaptive Moment Estimation (Adam) algorithm [26] and RMSProp al- gorithm [27], which can make an optimization from both the gradient and the learning rate to achieve good results. In deep learning, gradient computation is complicated because of the large number of network layers and parameters. Back propagation algo- rithm [28] is widely used in gradient computation of neural network parameters. The principle of the back propagation algorithm is to compute the gradient of the loss function with respect to each weight parameter layer by layer through the chain rule. Then based on the chain rule, the reverse iteration is performed from the last layer, and the weight parameters of the model are updated at the end of each iteration. In the process of model training, weight parameters are constantly updated by inputting dif- ferent batches of data until the loss function values converge to get a better parameter solution. 2.4 Computational complexity From the perspective of complexity, two considerations should be taken into account when designing a CNN network. One is the amount of computation required by the network, and the other is the scale of the parameters of the model and the input and output features of each layer. The former determines the speed of network training or inferring, usu- ally measured by time complexity, and the latter determines how much memory a computing device needs, usually measured by space complexity. The time complexity of an algorithm is a function that describes the running time of an input algorithm of a given size. It can describe the trend of change in code execution time with the increase in data size. Gen- erally speaking, the time complexity of the algorithm can be understood as the total time spent completing a set of specific algorithms. On a spe- cific device the time is determined by the total amount of computation required by the execution of the algorithm. The frequently used units for the measurement of amount of computation of deep learning algorithms are required floating-point operations and FLOPS. Floating point opera- tions per second (FLOPS) is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate measure than measuring instructions per second. At present, the total amount of computation of most convolutional neural networks can reach dozens or even hundreds of GigaFLOPs, such
34 Deep Learning on Edge Computing Devices as the common convolutional neural network models MobileNet-V2 [29] and ResNet-50 [30], with a total amount of computation of 33.6 GFLOPs to 109.8 GFLOPs, which makes it difficult for neural networks deployed at the edge to complete real-time reasoning. The space complexity refers to the amount of memory space required to solve an instance of the computational problem as a function of char- acteristics of the input, which is usually measured in units of computer memory. Inside a computer, information is stored, computed, and trans- mitted in binary form. The most basic units of storage are bits and bytes. In convolutional neural networks the space complexity is mainly de- termined by the size of parameters at all layers. The parameters of the convolutional layer are mainly determined by the size and number of con- volutional kernels, whereas the parameters of the fully connected layer are determined by the number of input neurons and output neurons. Take AlexNet [31], the champion model of 2012 ImageNet Image Classifica- tion Challenge, for example, which contains five convolutional layers with parameter sizes of 35 KB, 307 KB, 884 KB, 1.3 MB, and 442 KB, respec- tively, and three fully connected layers with parameter sizes of 37 MB, 16 MB, and 4 MB. The total size of parameters in AlexNet is about 60 MB, among which the fully connected structure undoubtedly increases the size of parameters, contributing 57 MB of parameters. Some CNN models that emerged after AlexNet performed better, but they were difficult to deploy on edge computing terminals due to their high space complexity. References [1] D. Svozil, V. Kvasnicka, J. Pospichal, Introduction to multi-layer feed-forward neural networks, Chemometrics and Intelligent Laboratory Systems 39 (1) (1997) 43–62. [2] B.D. Ripley, Neural networks and related methods for classification, Journal of the Royal Statistical Society: Series B (Methodological) 56 (3) (1994) 409–437. [3] R. Sylwester, A Celebration of Neurons: An Educator’s Guide to the Human Brain, ERIC, 1995. [4] A.N. Burkitt, A review of the integrate-and-fire neuron model: I. Homogeneous synaptic input, Biological Cybernetics 95 (1) (2006) 1–19. [5] F. Agostinelli, M. Hoffman, P. Sadowski, P. Baldi, Learning activation functions to improve deep neural networks, arXiv:1412.6830 [abs]. [6] G. Huang, Y. Chen, H.A. Babri, Classification ability of single hidden layer feedfor- ward neural networks, IEEE Transactions on Neural Networks 11 (3) (2000) 799–801. [7] Y. Pang, M. Sun, X. Jiang, X. Li, Convolution in convolution for network in net- work, IEEE Transactions on Neural Networks and Learning Systems 29 (5) (2017) 1587–1597. [8] W. Luo, Y. Li, R. Urtasun, R. Zemel, Understanding the effective receptive field in deep convolutional neural networks, in: Proceedings of International Conference on Neural Information Processing Systems, 2016, pp. 4905–4913.
The basics of deep learning 35 [9] S. Albawi, T.A. Mohammed, S. Al-Zawi, Understanding of a convolutional neural network, in: Proceedings of International Conference on Engineering and Technol- ogy, 2017, pp. 1–6. [10] D. Yu, H. Wang, P. Chen, Z. Wei, Mixed pooling for convolutional neural networks, in: Proceedings of International Conference on Rough Sets and Knowledge Technol- ogy, 2014. [11] S. Hijazi, R. Kumar, C. Rowen, et al., Using convolutional neural networks for image recognition, Cadence Design Systems (2015) 1–12. [12] H. Ajmal, S. Rehman, U. Farooq, Q.U. Ain, F. Riaz, A. Hassan, Convolutional neural network based image segmentation: a review, in: Proceedings of Pattern Recognition and Tracking XXIX, 2018. [13] Z. Wang, J. Liu, A review of object detection based on convolutional neural network, in: Proceedings of Chinese Control Conference, 2017, pp. 11104–11109. [14] Z.C. Lipton, A critical review of recurrent neural networks for sequence learning, arXiv:1506.00019 [abs]. [15] J.J. Hopfield, Neural networks and physical systems with emergent collective com- putational abilities, Proceedings of the National Academy of Sciences 79 (8) (1982) 2554–2558. [16] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735–1780. [17] R. Dey, F.M. Salem, Gate-variants of gated recurrent unit (GRU) neural networks, in: Proceedings of IEEE International Midwest Symposium on Circuits and Systems, 2017, pp. 1597–1600. [18] D.M. Kline, V. Berardi, Revisiting squared-error and cross-entropy functions for train- ing neural network classifiers, Neural Computing Applications 14 (2005) 310–318. [19] M.W. Schmidt, G. Fung, R. Rosales, Fast optimization methods for L1 regularization: A comparative study and two new approaches, in: Proceedings of European Confer- ence on Machine Learning, 2007. [20] P. Bühlmann, B. Yu, Boosting with the L2 loss: regression and classification, Journal of the American Statistical Association 98 (462) (2003) 324–339. [21] S. Singh, D. Singh, S. Kumar, Modified mean square error algorithm with reduced cost of training and simulation time for character recognition in backpropagation neu- ral network, in: Proceedings of International Conference on Frontiers in Intelligent Computing: Theory and Applications, 2013. [22] W. Liu, Y. Wen, Z. Yu, M. Yang, Large-margin softmax loss for convolutional neural networks, arXiv:1612.02295 [abs]. [23] A. Krogh, J. Hertz, A simple weight decay can improve generalization, in: Proceedings of International Conference on Neural Information Processing Systems, 1991. [24] E. Dogo, O. Afolabi, N. Nwulu, B. Twala, C. Aigbavboa, A comparative analysis of gradient descent-based optimization algorithms on convolutional neural networks, in: Proceedings of International Conference on Computational Techniques, Electronics and Mechanical Systems, 2018, pp. 92–99. [25] Y. Bengio, I. Goodfellow, A. Courville, Deep Learning, vol. 1, MIT press, Mas- sachusetts, USA, 2017. [26] D.P. Kingma, J. Ba Adam, A method for stochastic optimization, arXiv:1412.6980 [abs]. [27] G. Hinton, N. Srivastava, K. Swersky, RMSProp: Divide the gradient by a running average of its recent magnitude, Neural Networks for Machine Learning, Coursera lecture 6e (2012) 13. [28] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back- propagating errors, Nature 323 (6088) (1986) 533–536.
36 Deep Learning on Edge Computing Devices [29] M. Sandler, A.G. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, MobileNetV2: Inverted residuals and linear bottlenecks, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. [30] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. [31] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolu- tional neural networks, Communications of the ACM 60 (2012) 84–90.
PART 2
Model and algorithm
CHAPTER 3
Model design and compression

3.1 Background and challenges

Although convolutional neural networks have achieved good results in such fields as computer vision and natural language processing, they are daunting to many embedded device-based applications due to their massive parameters. At present, deep learning models require large amounts of computing resources and memory, often accompanied by huge energy consumption. Large models become the biggest bottleneck when we need to deploy models on terminal devices with limited computing resources for real-time inference. The training and inference of deep neural networks usually rely heavily on GPUs with high computing ability. The huge scale of features and the deluge of model parameters also greatly increase the training time. Take as an example AlexNet [1], a network containing 60 million parameters: it takes two to three days to train the entire model on the ImageNet data set using an NVIDIA K40. In fact, Denil et al. [9] have shown that deep neural networks are facing severe overparameterization, and a small subset of the parameters can completely reconstruct the remaining parameters. For example, ResNet-50 [10], which has 50 convolutional layers, requires more than 95 MB of storage memory and more than 3.8 billion floating-point multiplication operations to process images. After some redundant weights are discarded, the network still works as usual, but more than 75% of the parameters and 50% of the computation time can be saved. This indicates that there is huge redundancy in the parameters of the model, which reveals the feasibility of model compression. In the field of deep neural network study, model compression and acceleration have received great attention from researchers, and great progress has been made in the past few years. Significant advances in intelligent wearable devices and AIoT in recent years have created unprecedented opportunities for researchers to address the fundamental challenges of deploying deep learning systems to portable devices with limited resources, such as memory, CPU, energy, and bandwidth. A highly efficient deep learning method can have a significant impact on distributed systems, embedded devices, and FPGAs for artificial intelligence. For the design and compression of highly efficient deep neural network models, in this chapter, we will first analyze
Another Random Document on Scribd Without Any Related Topics
a grander scheme of chiaroscuro; men comprehend where they once scoffed. It was thus that Yeoland rose inspired, like a spiritual Venus from a sea of dreams. As molten glass is shaped speedily into fair and exquisite device, so the red wax of her heart had taken the impress of the hour. Gilderoy had stirred her like a blazoned page of romance. Fulviac caught the girl's half glance at him; read in measure the meaning of her mood. Her lips were half parted as though she had words upon her tongue, but still hesitated from some scruple of pride. He straightened in the saddle, and waited for her to unbosom to him with a confident reserve. Well? he said at length, since she still lingered in her silence. How much one may learn in a day, she answered, drawing her white palfrey nearer to his horse. Fulviac agreed with her. The man on the end of the rope, he said, learns in two minutes that which has puzzled philosophers since Adam loved Eve. She turned to him with an eagerness that was almost passionate even in its suppressed vigour. How long was it before you came to pity your fellows? Some minutes, not more. And the conversion? Shall satisfy you one day. For the present I will buckle up so unsavoury a fable in my bosom. Tell me what you have learnt at Gilderoy. Yeoland looked at the moon. The man saw great sadness upon her face, but also an inspired radiance that made its very beauty the more remarkable. He foresaw in an instant that they were coming to
deeper matters. Superficialities, the mannerisms of life, were falling away. The girl's heart beat near to his; he felt a luminous sympathy of spirit rise round them like the gold of a Byzantine background. Come, he said, with a burst of beneficence, you are beginning to understand me. She jerked a swift glance at him, like the look of a half-tamed falcon. You are a man, for all your sneers and vapourings. I had a heart once. Call me an oak, broken, twisted, aged, but an oak still. Yeoland drew quite close to him, so that her skirt almost brushed his horse's flank. Fulviac's shadow fell athwart her. Only her face shone clear in the moonlight. I have ceased, she said, to look upon life as a stretch of blue, a laughing dawn. Good. I have learnt that woe is the crown of years. Good again. That life is full of violence and wrong. A platitude. Yes. Life consists in learning platitudes. I am only one woman among thousands. A revelation. You jeer. Not so. Few women learn the truth of your proverb. Lastly, my trouble is not the only woe in the world. That it is an error to close up grief in the casket of self. Fulviac flapped his bridle, and looked far ahead into the cavern of the night. He was silent awhile in thought. When he spoke again,
he delivered himself of certain curt cogitations, characteristic confessions that were wholly logical. I am a selfish vagabond, he said; I appeal to Peter's keys whether all ambition is not selfish. I am an egotist for the good of others. The stronger my ambition, the stronger the hope of the land in generous justice. I live to rule, to rule magnanimously, yet with an iron sceptre. There, you have my creed. And God? she asked him. Is a most useful subordinate. You do not mean that? I do not. She saw again the mutilated beings in the catacombs, aye, even her own home flaming to the sky, and the white face of her dead father. Faith and devotion were great in her for the moment. Divine vengeance beaconed over the world, a torch borne aloft by the hand of Pity. It is God's war, she said to him with a finer solemnity sounding in her voice; you have stirred the woman in me. Is that enough? Enough, he answered her. And the rest? God shall make all plain in due season. Gilderoy had dwindled into the east; its castle's towers still netted the moonlight from afar. The meadowlands had ceased, and trees strode down in multitudes to guard the track. The night was still and calm, with a whisper of frost in the crisp, sparkling air. The world seemed roofed with a dome of dusky steel. Before them a shallow valley lay white in the light of the moon. Around climbed the glimmering turrets of the trees, rank on rank,
solemn and tumultuous. The bare gable ends of a ruined chapel rose in the valley. Fulviac drew aside by a bridle path that ran amid rushes. To the left, from the broken wall of the curtilage, a great beech wood ascended, its boughs black against the sky, its floor ankle-deep with fallen leaves. The chapel stood roofless under the moon. Hollies, a sable barrier that glistened in the moonlight, closed the ruin on the south. Yews cast their gloom about the walls. A tall cross in the forsaken graveyard stretched out its mossy arms east and west. The armed groom took the horses and tethered them under a clump of pines by the wall. Fulviac and the girl Yeoland passed up through weeds and brambles to the porch. A great briar rose had tangled the opening with a thorny web, as though to hold the ruin from the hand of man. The tiled floor was choked with grass; a rickety door drooped rotten on its rusty hinges. Fulviac pushed through and beckoned the girl to follow. Within, all was ruinous and desolate, the roof fallen, the casements broken. We must find harbour here, said the man, our horses go far to-morrow. A cheerful hostel, this. Its wildness makes it safe. You fear the cold. I'll see to that. No. I am hungry. The high altar still stood below the small rose window in the east, where the rotting fragments of a triptych hid the stonework. There was a great carved screen of stone on either side, curiously recessed as though giving access to an ambulatory. The altar stood in dense shadow, with broken timber and a tangle of briars ringing a barrier about its steps. On the southern side of the nave, a patch of
tiled flooring still stood riftless, closed in by two fallen pillars. The groom came in with two horse-cloaks, and Fulviac spread them on the tiles. He also gave her a small flask of wine, and a silver pyx holding meat and bread. We crusaders must not grumble at the rough lodging, he said to her; wrap yourself in these cloaks, and play the Jacob with a stone pillow. She smiled slightly in her eyes. The groom brought in a saddle, ranged it with a saddle cloth covering it, that it might rest her head. And you? she said to Fulviac. Damian and I hold the porch. You will be cold. I have a thick hide. The Lady of Geraint give you good rest! He threaded his way out amid the fallen stones and pillars, and closed the rickety gate. The groom, a tall fellow in a battered bassinet and a frayed brigantine, stood by the yew trees, as on guard. Fulviac gestured to him. The man moved away towards the eastern end of the chapel, where laurels grew thick and lusty about the walls. When he returned Fulviac was sitting hunched on a fallen stone in the corner of the porch, as though for sleep. The man dropped a guttural message into his master's ear, and propped himself in the other angle of the porch. An hour passed; the moon swam past the zenith towards the west; a vast quiet watched over the world, and no wind rippled in the woods. In the sky the stars shivered, and gathered more closely their silver robes. In the curtilage the ruined tombs stared white and desolate at the moon.
An owl's cry sounded in the woods. Sudden and strange, as though dropped from the stars, faint music quivered on the frost- brilliant air. It gathered, died, grew again, with a mysterious flux of sweetness, as of some song stealing from the Gardens of the Dead. Flute, cithern, and viol were sounding under the moon, merging a wizard chant into the magic of the hour. Angels, crimson-winged, in green attire, seemed to descend the burning stair of heaven. A sudden great radiance lit the ruin, a glory of gold streaming from the altar. Cymbals clashed; waves of shimmering light surged over the broken walls. Incense, like purple smoke, curled through the casements. The music rushed in clamorous rapture to the stars. A voice was heard crying in the chapel, elfin and wild, yet full of a vague rich sanctity. It ceased sudden as the brief moan of a prophecy. The golden glow elapsed; the music sank to silence. Nought save the moonlight poured in silver omnipotence over the ruin. From the chapel came the sound of stumbling footsteps amid the stones. A hand clutched at the rotting door, jerked it open, as in terror. The girl Yeoland came out into the porch, and stood swaying white-faced in the shadow. Fulviac. Her voice was hoarse and whispering, strained as the overwrought strings of a lute. The man did not stir. She bent down, dragged at his cloak, calling to him with a quick and gathering vehemence. He shook himself, as from the thongs of sleep, stood up and stared at her. The groom still crouched in the dark corner. Fulviac.
She thrust her way through the briars into the moonlight. Her hood had fallen back, her hair loose upon her shoulders; her eyes were full of a supernatural stupor, and she seemed under the spell of some great shock of awe. She trembled so greatly, that Fulviac followed her, and held her arm. Speak. What has chanced to you? She still shook like some flower breathed upon by the oracular voice of God. Her hands were torn and bloody from the thorns. The Virgin has appeared to me. Are you mad? The Virgin. Some ghost or phantom. No, no, hear me. She stretched out her hands like one smitten blind, and took breath swiftly in sudden gasps. Hear me, I was but asleep, woke, and heard music. The Virgin came out upon the altar, her face like the moon, her robes white as the stars. There was great light, great glory. And she spoke to me. Mother of God, what am I that I should be chosen thus! Speak. Can this be true? The truth, the truth! Fulviac fell on his knees with a great gesture of awe. The girl, her face turned to the moon, stood quivering like a reed, her lips moving as if in prayer. Her message, child? Ah, it was this: 'Go forth a virgin, and lead the hosts of the Lord.'
Fulviac's face was in shadow. He thrust up his hands to the heavens, but would not so much as glance at the girl above him. His voice rang out in the silence of the night:-- Gloria tibi, Sancta Maria! Gloria tibi, Domine! IX Faith, golden crown of the Christian! Self-mesmerism, subtle alchemy of the mind! How the balance of belief swings between these twain! A spiritual conception born in a woman's brain is as a savour of rich spices sweetening all the world. How great a power of obstinacy stirs in one small body! A pillar of fire, a shining grail. She will bring forth the finest gems that hang upon her bosom, the ruby of heroism, the sapphire of pity. She will cast all her store of gold into the lap of Fate. Give to her some radiant dream of hope, and she may prove the most splendid idealist, even if she do not prove a wise one. Remember the women who watched about the Cross of Christ. There had been trickery in the miracle, a tinge of flesh in the vision. The Virgin, in the ruck of religion, had suffered herself to be personated by a clever little player from Gilderoy, aided and idealised by a certain notorious charlatan who dealt in magic, was not above aiding ecclesiastical mummeries on occasions, and conspiring for the solemn production of miracles. A priest's juggling box, a secret door at the back of the altar used in bygone days for
the manipulation of a wonder-working image, musicians, incense, and Greek fire. These had made the portent possible. As for Fulviac, rugged plotter, he was as grave as an abbot over the business; his words were wondrous beatific; he spoke of the interventions of Heaven with bated breath. It was a superstitious age, touched with phantasy and gemmed with magic. Relics were casketed in gold and silver; holy blood amazed with yearly liquefactions the souls of the devout; dreamers gazed into mirrors, crystals, finger-nails, for visions of heaven. Jewels were poured in scintillant streams at the white feet of the Madonna. It was all done with rare mysticism, colour, and rich music. The moon ruled marriage, corn, and kine. The saints, like a concourse of angels, walked with melancholy splendour through the wilds. As for the girl Yeoland, she had the heart of a woman in the noblest measure, a red heart, pure yet passionate. The world waxed prophetic that shrill season. She was as full of dreams and phantasies as an astrologer's missal. Nothing amazed her, and yet all earth was mysterious. The wind spoke in magic syllables; the trees were oracular; the stars, white hands tracing symbols in the sky. She was borne above herself on the pinions of ecstasy, heard seraph wings sweep the air, saw the glimmer of their robes passing the portals of the night. Mysticism moved through the world like the sound of lutes over a moonlit sea. One March morning, Fulviac came to her in the northern chamber of the cliff. Yeoland had masses of scarlet cloth and threads of gold upon her knees, for she was broidering a banner, the banner of the Maid of Gilderoy. Her eyes were full of violet shadow. She
wore a cross over her bosom, emeralds set in silver; a rosary, dangling on her wrist, told how her prayers kept alternate rhythm with her fingers. Fulviac crooked the knee to the crucifix upon the wall, sat down near her on a rich bench of carved cedar wood. The man was in a beneficent mood, and beamed on her like a lusty summer. He had tidings on his tongue, tidings that he hoarded with the craft of an epicure. It was easy to mark when the world trundled well with his humour. He put forth smiles like a great oak whose boughs glisten in the sun. You will tire yourself, little sister. She looked at him with one of her solemn glances, a glance that spoke of vigils, soul-searchings, and prayer. My fingers tire before my heart, she said to him. Rest, rest. Do I seem weary to you? Nay, you are fresh as the dawn. He brushed back the tawny hair from off his forehead, and the lines about his mouth softened. I have news from the west. Ah! We gather and spread like fire in a forest. The mountain men are with us, ready to roll down from the hills with hauberk and sword. In two months Malgo will have sent the bloody cross through all the west. The golden thread ran through the girl's white fingers; the beads of her rosary rattled; she seemed to be weaving the destiny of a kingdom into the device upon her banner. How is it with us here? she asked him.
I have a thousand stout men and true camped upon the cliff. Levies are coming in fast, like steel to a magnet. In a month we shall outbulk a Roman legion. And Gilderoy? Gilderoy and Geraint will give us a score thousand pikemen. The stars fight for us. Fulviac took her lute from the carved bench and began to thrum the chords of an old song. Spears crash, and swords clang, Fame maddens the world. Come battle and love. Iseult-- Ah, Iseult. He broke away with a last snap at the strings, and set the lute aside. Bear with me, he said. Her dark eyes questioned him over her banner. I offer you the first victim. Ah! Flavian of Gambrevault. An indefinite shadow descended upon the girl's face. The inspired radiance seemed dimmed for the moment; the crude realism of her thoughts rang in discord to her dreams. She lost the glimmering thread from her needle. Her hands trembled a little as she played with the scarlet folds of the banner. Well?
A lad of mine bears news--a black-eyed rogue from the hills of Carlyath, sharp as a sword's point, quaint as an elf. I sent him gleaning, and he has done bravely. You would hear his tale from his own lips? She nodded and seemed distraught. Yes. Bring him in to me, she said. Fulviac left her, to return with a slim youth sidling in behind him like a shadow. The lad had a nut-brown skin and ruddy cheeks, a pair of twinkling eyes, a thatch of black hair over his forehead. Bred amid the hills of Carlyath, where the women were scarlet Eves, and the land a paradise, he had served in Gilderoy as apprentice to an armourer. Carlyath's wilds and the city's roguery had mingled in him fantastic strains of extravagant sentiment and cunning. Half urchin, half elf, he stood with bent knees and slouched shoulders, his black eyes alert on Fulviac, his lord. The man thrust him forward by the collar, with an eloquent gesture. The whole tale. Try your wit. The Carlyath lad advanced one foot, and with an impudent southern smirk, remarked-- This, madame, is an infatuated world. Thus, sententiously delivered, he plunged into a declamation with a picturesque and fanciful extravagance that he had imbibed from the strolling romancers of his own land. In the city of Gilderoy, he said, speaking very volubly and with many gestures, there lives a lady of surpassing comeliness. Her eyes are as the sky, her cheeks as June roses, her hair a web of gold. She is a right fair lady, and daily she sits at her broad
casement, singing, and plaiting her hair into shackles of gold. She has bound the Lord Flavian of Gambrevault in a net starred with poppies, scarlet poppies of the field, so that he ever dreams dreams of scarlet, and sees visions of lips warm as wine. Daily the Lord Flavian scours the country between Avalon and the fair city of Gilderoy, till the very dust complains of his fury, and the green grass curses his horse's heels. But the lady with the hair of gold compasses him like the sunset; she has stolen the eyes of heaven, and the stars are blind. Fulviac smiled over the extreme subtlety of the rendering. It was a delicate matter, delicately handled. The Carlyath lad had wit, and a most seraphic tongue. What more? There is yet another lady at Avalon. Well? A lady whose name is Duessa, a lady with black hair and a blacker temper. Lord Flavian has a huge horror of her tongue. Therefore he rides like a thief, without trumpets, to Gilderoy. Yet more. The lad spread his hands with an inimitable gesture, shrugged, and heaved a most Christian sigh. The Lady Duessa is the Lord Flavian's wife, he said. Surely. Therefore, sire, he is a coward. The lad drew back with a bow and a scrape of the foot, keeping his eyes on the floor with the discretion of a veteran lackey. At a sign from Fulviac, he slipped away, and left Yeoland and the man alone.
The girl's hands were idle in her lap; the great scarlet banner trailed in rich folds about her feet. There was a white mask of thought upon her face, and her eyes searched the distance with an oblivious stare. All the strong discords of the past rushed clamorous to her brain; her consecrated dreams were as so many angels startled by the assaults of hell. She rose from her chair, cast the casement wide, and stood gazing over the forest. Youth seemed in the breeze, and the clear voice of the Spring. The green woods surged with liberty; the strong zest of life breathed in their bosoms. In the distance the pines seemed to beckon to her, to wave their caps in windy exultation. Fulviac had stood watching her with the calm scrutiny of one wise in the passionate workings of the soul. He suffered her to possess her thoughts in silence for a season, to come by a steady comprehension of the past. Presently he gathered the red banner, and hung it on the frame, went softly to her and touched her sleeve. Shall they kill him on the road? he asked. She pondered a moment, and did not answer him. It is easy, he said, and a matter of sheer justice. The words seemed to steel her decision. No, she said, let them bring him here--to me. So be it, he answered her. Fulviac found her cold and taciturn, desirous of solitude. He humoured the mood, and she was still staring from the window when he left her. The woodland had melted before her into an oblivious mist. In its stead she saw a tower flaming amid naked trees, a white face staring heavenwards with the marble tranquillity of death.
X Down through the woods of Avalon rode the Lord Flavian of Gambrevault, down towards the forest track in the grey face of the dawn. In the meadows and beyond the orchards, water shone, and towers stood mistily. The voice of Spring pulsed in the air, songs of green woods, the wild wine of violets, pavements of primrose gold. Birds piped lustily in wood and thicket, and the ascending sun lavished his glittering archery from the chariots of the clouds. The Lord Flavian was inordinately cheerful that morning, as he rode in green and red through the prophetic woods. Heart and weather were in kindred keeping, and his youth sang like a brook after April rains. The woods danced in dew. Far on its rocky hill the towers of Gilderoy would soon beckon him above the trees. Beneath the shadow of the cathedral tower stood a gabled house with gilded vanes and roofs of generous red. There in Gilderoy, in a room hung with cloth of purple and gold, white arms waited, and the bosom of a golden Helen held love like a red rose in a pool of milky spikenard. Picture a slim but muscular man with the virile figure of a young David, a keen, smooth face, a halo of brown hair, eyes eloquent as a woman's. Picture a good grey horse trapped in red and green, full of fettle as a colt, burly as a bull. Picture the ermined borderings, the jewelled clasps, brigantine of quilted velvet, fur-lined bassinet bright as a star. Youth, clean, adventurous, aglow to the last finger-tip, impetuous to the tune of thirty breaths a minute. Youth with all its
Contents

Preface
Acknowledgements

PART 1  Introduction
1. Introduction
   1.1. Background
   1.2. Applications and trends
   1.3. Concepts and taxonomy
   1.4. Challenges and objectives
   1.5. Outline of the book
   References
2. The basics of deep learning
   2.1. Feedforward neural networks
   2.2. Deep neural networks
   2.3. Learning objectives and training process
   2.4. Computational complexity
   References

PART 2  Model and algorithm
3. Model design and compression
   3.1. Background and challenges
   3.2. Design of lightweight neural networks
   3.3. Model compression
   References
4. Mix-precision model encoding and quantization
   4.1. Background and challenges
   4.2. Rate-distortion theory and sparse encoding
   4.3. Bitwise bottleneck quantization methods
   4.4. Application to efficient image classification
   References
5. Model encoding of binary neural networks
   5.1. Background and challenges
   5.2. The basic of binary neural network
   5.3. The cellular binary neural network with lateral connections
   5.4. Application to efficient image classification
   References

PART 3  Architecture optimization
6. Binary neural network computing architecture
   6.1. Background and challenges
   6.2. Ensemble binary neural computing model
   6.3. Architecture design and optimization
   6.4. Application of binary computing architecture
   References
7. Algorithm and hardware codesign of sparse binary network on-chip
   7.1. Background and challenges
   7.2. Algorithm design and optimization
   7.3. Near-memory computing architecture
   7.4. Applications of deep adaptive network on chip
   References
8. Hardware architecture optimization for object tracking
   8.1. Background and challenges
   8.2. Algorithm
   8.3. Hardware implementation and optimization
   8.4. Application experiments
   References
9. SensCamera: A learning-based smart camera prototype
   9.1. Challenges beyond pattern recognition
   9.2. Compressive convolutional network model
   9.3. Hardware implementation and optimization
   9.4. Applications of SensCamera
   References
Index
Preface

We first started working in the field of edge computing-based machine learning in 2010. With project funding, we tried to accelerate support vector machine algorithms on integrated circuit chips to support embedded applications such as fingerprint recognition. In recent years, with the development of deep learning and integrated circuit technology, artificial intelligence applications based on edge computing devices, such as intelligent terminals, autonomous driving, and AIoT, have been emerging one after another. However, realizing an embedded artificial intelligence application involves multidisciplinary knowledge of mathematics, computing science, computer architecture, and circuit and system design. We therefore arrived at the idea of writing a monograph focusing on the research progress of the relevant technologies, so as to facilitate the understanding and learning of graduate students and engineers in related fields.

Deep learning application development on embedded devices faces the theoretical bottleneck of the high complexity of deep neural network algorithms. Making the rapidly evolving deep learning models lightweight is one of the keys to realizing AIoT pervasive artificial intelligence in the future. In recent years, we have been focusing on the development of automated deep learning tools for embedded devices. This book covers some of the cutting-edge technologies currently developing in embedded deep learning and introduces core algorithms, including lightweight neural network design, model compression, and model quantization, aiming to provide a reference for readers designing embedded deep learning algorithms.

Deep learning application development on embedded devices also faces the technical challenge of the limited progress of integrated circuit technology in the post-Moore era. To address this challenge, in this book we propose and elaborate a new paradigm of algorithm-hardware codesign to optimize the energy efficiency and performance of neural network computing in embedded devices. The DANoC sparse coding neural network chip developed by us is taken as an example to introduce the new technology of near-memory computing, hoping to give inspiration to embedded design experts. We believe that, in the post-Moore era, system-level collaborative design across algorithms, software, and hardware will gradually become the mainstream of embedded intelligent design to meet the requirements of high real-time performance and low power consumption under the condition of limited hardware resources.

Due to time constraints and the authors' limited knowledge, there may be some omissions in the content, and we apologize to the readers for this.

Xichuan Zhou
Acknowledgements

First of all, we would like to thank all the students who participated in the relevant work for their contributions to this book, including Shuai Zhang, Kui Liu, Rui Ding, Shengli Li, Songhong Liang, Yuran Hu, and others. We would also like to take this opportunity to thank our families, friends, and colleagues for their support in the course of writing this monograph.

We would also like to thank our organization, the School of Microelectronics and Communication Engineering, Chongqing University, for providing supportive conditions for research on intelligent edge computing.

The main content of this book is compiled from a series of research projects, partly supported by the National Natural Science Foundation of China (Nos. 61971072 and 62001063).

We are most grateful to the editorial staff and artists at Elsevier and Tsinghua University Press for giving us all the support and assistance needed in the course of writing this book.
CHAPTER 1
Introduction

1.1 Background

At present, human society is rapidly entering the era of the Internet of Everything, and applications of the Internet of Things based on smart embedded devices are exploding. The report "The Mobile Economy 2020" released by the GSM Association (GSMA) shows that the total number of connected devices in the global Internet of Things reached 12 billion in 2019 [1] and estimates that by 2025 it will reach 24.6 billion. Applications such as smart terminals, smart voice assistants, and smart driving will dramatically improve the organizational efficiency of human society and change people's lives. With the rapid development of artificial intelligence technology toward pervasive intelligence, smart terminal devices will penetrate even more deeply into human society.

Looking back at the development of artificial intelligence, at a key time point in 1936, British mathematician Alan Turing proposed an ideal computer model, the general Turing machine, which provided a theoretical basis for the ENIAC (Electronic Numerical Integrator And Computer) born ten years later. During the same period, inspired by the behavior of the human brain, American scientist John von Neumann wrote the monograph "The Computer and the Brain" [2] and proposed an improved stored-program computer for the ENIAC, i.e., the von Neumann architecture, which became a prototype for computers and even for artificial intelligence systems.

The earliest description of artificial intelligence can be traced back to the Turing test [3] in 1950. Turing pointed out that "if a machine talks with a person through a specific device without communication with the outside, and the person cannot reliably tell that the talk object is a machine or a person, this machine has humanoid intelligence". The term "artificial intelligence" actually appeared at the Dartmouth symposium held by John McCarthy in 1956 [4]. The "father of artificial intelligence" defined it as "the science and engineering of manufacturing smart machines". The proposal of artificial intelligence opened up a new field, and since then academia has successively presented research results in artificial intelligence. After several historical cycles of development, artificial intelligence has now entered a new era of machine learning.

Figure 1.1 Relationship diagram of deep learning related research fields.

As shown in Fig. 1.1, machine learning is a subfield of theoretical research on artificial intelligence that has developed rapidly in recent years. Arthur Samuel proposed the concept of machine learning in 1959 and conceived the establishment of a theoretical method "to allow the computer to learn and work autonomously without relying on certain coded instructions" [5]. A representative method in the field of machine learning is the support vector machine (SVM) [6] proposed by Russian statistician Vladimir Vapnik in 1995. As a data-driven method, the statistics-based SVM has solid theoretical support and excellent model generalization ability and is widely used in scenarios such as face recognition.

The artificial neural network (ANN) is one of the methods to realize machine learning. An ANN uses the structural and functional features of biological neural networks to build mathematical models for estimating or approximating functions. ANNs are computing systems inspired by the biological neural networks that constitute animal brains: an ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. The concept of the artificial neural network can be traced back to the neuron model (MP model) [7] proposed by Warren McCulloch and Walter Pitts in 1943. In this model, the multidimensional input data are multiplied by the corresponding weight parameters and accumulated, and the accumulated value is passed through a specific threshold function to output the prediction result. Later, Frank Rosenblatt built a perceptron system [8] with two layers of neurons in 1958, but the perceptron model and its subsequent improvements had limitations in solving high-dimensional nonlinear problems. It was not until 1986 that Geoffrey Hinton, a professor in the Department of Computer Science at the University of Toronto, invented the back propagation algorithm [9] for parameter estimation of artificial neural networks and realized the training of multilayer neural networks.

As a branch of neural network technology, deep learning has been a great success in recent years. The algorithmic milestone appeared in 2006, when Hinton proposed the deep belief network and successfully addressed the problem [10] of vanishing gradients in training multilayer neural networks. From then on, the artificial neural network officially entered the "deep" era. In 2012, the convolutional neural network [11] invented by Professor Yann LeCun and its variants greatly improved the classification accuracy of machine learning methods on large-scale image databases, reaching and then surpassing human image recognition levels in the following years, which laid the technical foundation for the large-scale industrial application of deep learning. At present, deep learning continues to develop rapidly and has achieved great success in subfields such as machine vision [12] and voice processing [13]. In particular, in 2016 the AlphaGo artificial intelligence built by Demis Hassabis's team on deep learning technology defeated the international Go champion Lee Sedol by 4:1, which marked the entry of artificial intelligence into a new era of rapid development.

1.2 Applications and trends

The Internet of Things technology is considered to be one of the important forces leading the next wave of industrial change. The concept of the Internet of Things was first proposed by Kevin Ashton of MIT, who pointed out that "the computer can observe and understand the world by RF transmission and sensor technology, i.e., empower computers with their own means of gathering information" [14]. After the massive data collected by various sensors are connected to the network, the connection between human beings and everything is enhanced, thereby expanding the boundaries of the Internet and greatly increasing industrial production efficiency. In this new wave of industrial technological change, smart terminal devices will undoubtedly play an important role. As a carrier for the connection of the Internet of Things, the smart perception terminal device not only performs data collection but also has front-end, local data processing capabilities, which enable the protection of data privacy and the extraction and analysis of perceived semantic information.

With the emergence of smart terminal technology, the fields of Artificial Intelligence (AI) and the Internet of Things (IoT) have gradually merged into the artificial intelligence Internet of Things (AI&IoT or AIoT). On one hand, the application scale of artificial intelligence has gradually expanded and penetrated into more fields by relying on the Internet of Things; on the other hand, the devices of the Internet of Things require embedded smart algorithms to extract valuable information from the sensor data collected at the front end. The concept of AIoT was proposed by the industrial community around 2018 [15], aiming to realize the digitization and intelligence of all things based on edge computing at the Internet of Things terminals. AIoT-oriented smart terminal applications are in a period of rapid development. According to a third-party report from iResearch, the total amount of AIoT financing in the Chinese market from 2015 to 2019 was approximately $29 billion, with an increase of 73%.

The first characteristic of AIoT smart terminal applications is high data volume, because the edge has a large number of devices producing large amounts of data. Gartner's report showed that there were approximately 340,000 autonomous vehicles in the world in 2019, and it is expected that in 2023 there will be more than 740,000 autonomous vehicles with data collection capabilities running in various application scenarios. Taking Tesla as an example, with eight external cameras and one powerful system on chip (SoC) [16], the autonomous vehicles can support end-to-end machine vision image processing to perceive road conditions, surrounding vehicles, and the environment. It is reported that a front camera with a resolution of 1280 × 960 in the Tesla Model 3 can generate about 473 GB of image data in one minute. According to the statistics, Tesla has so far collected more than 1 million videos and labeled the distance, acceleration, and speed of 6 billion objects in them. The data amount is as high as 1.5 PB, which provides a good data basis for improving the performance of the autonomous driving artificial intelligence model.

The second characteristic of AIoT smart terminal applications is high latency sensitivity. For example, the vehicle-mounted ADAS of autonomous vehicles has strict requirements on the response time from image acquisition and processing to decision making. The average response time of the Tesla Autopilot emergency brake system is 0.3 s (300 ms), whereas a skilled driver needs approximately 0.5 s to 1.5 s. Driven by data-driven machine learning algorithms, the vehicle-mounted system HW3 introduced by Tesla in 2019 processes 2300 frames per second (fps), about 21 times the 110 fps image processing capacity of HW2.5.

The third characteristic of AIoT smart terminal applications is high energy efficiency. Because wearable smart devices and smart speakers in embedded artificial intelligence applications [17] are mainly battery-driven, power consumption and endurance are particularly critical. Most smart speakers use a voice wake-up mechanism, which switches the device from the standby state to the working state upon recognition of human voice keywords. Based on an embedded voice recognition artificial intelligence chip with high power efficiency, a novel smart speaker can achieve wake-on-voice at a standby power consumption of 0.05 W. In typical offline human–machine voice interaction scenarios, the power consumption of the chip can also be kept within 0.7 W, which allows battery-driven systems to work for a long time. For example, Amazon smart speakers can achieve 8 hours of battery endurance in the always-listening mode, and optimized smart speakers can achieve up to 3 months of endurance.

From the perspective of future development trends, the development goal of the artificial intelligence Internet of Things is ubiquitous pervasive intelligence [18]. Pervasive intelligence technology aims to solve the core technical challenges of high data volume, high time sensitivity, and high power efficiency of embedded smart devices and finally to realize the digitization and intelligence of all things [19]. The basis of this development is to understand the legal and ethical relationship between the efficiency improvement brought by artificial intelligence technology and the protection of personal privacy, so as to improve the efficiency of social production and the convenience of people's lives under the premise of guaranteeing personal privacy. We believe that pervasive intelligence computing for the artificial intelligence Internet of Things will become a key technology promoting a new wave of industrial technological revolution.
Figure 1.2 Global data growth forecast.

1.3 Concepts and taxonomy

1.3.1 Preliminary concepts

Data, computing power, and algorithms are regarded as the three elements that promote the development of artificial intelligence, and the development of these three elements has become a booster for the explosion of deep learning technology.

First of all, the ability to acquire data, especially large-scale labeled data, is a prerequisite for the development of deep learning technology. According to the statistics, the size of the global Internet data in 2020 exceeded 30 ZB [20]. Without data optimization and compression, the estimated storage cost alone would exceed RMB 6 trillion, which is equivalent to the combined GDP of Norway and Austria in 2020. The further development of the Internet of Things and 5G technology will bring more data sources and capacity enhancements at the transmission level, so it is foreseeable that the total amount of data will continue to grow at an even higher speed; it is estimated to reach 175 ZB by 2025, as shown in Fig. 1.2. The increase in data size provides a good foundation for improving the performance of deep learning models; on the other hand, the rapidly growing data size also places higher computing performance requirements on model training.

The second element of the development of artificial intelligence is the computing system, that is, the hardware computing devices required to realize an artificial intelligence system. The computing system is sometimes described as the "engine" that supports the application of artificial intelligence, and in the deep learning era it has become an infrastructure resource. When Google's AlphaGo [21] defeated the Korean Go player Lee Sedol in 2016, people marveled at the power of artificial intelligence, but the huge "bill" behind it was little known: 1202 CPUs, 176 high-performance GPUs, and an astonishing 233 kW of power consumed in a single game of Go.

Figure 1.3 Development trend of transistor quantity.

From the perspective of the computing system, the development of VLSI chips is the fundamental driver of the improvement in AI computing performance. The good news is that, although the semiconductor industry develops with periodic fluctuations, the well-known Moore's law [22] has stood the test of 50 years (Fig. 1.3). Moore's law is still maintained in the field of VLSI chips, largely because the rapid development of GPUs has made up for the slower development of CPUs. We can see from the figure that since 2010 the number of GPU transistors has grown faster than that of CPUs, while CPU transistor counts have begun to lag behind Moore's law; in addition, the development of hardware technologies [23] such as dedicated ASICs for deep learning and FPGA-based heterogeneous AI computing accelerators has injected new fuel into the growth of artificial intelligence computing power.

Last but not least, the third element of artificial intelligence development is the algorithm. An algorithm is a finite sequence of well-defined, computer-implementable instructions, typically used to solve a class of specific problems in finite time. The performance breakthroughs of deep learning algorithms and applications in the past 10 years are an important reason for the milestone development of AI technology. So, what is the future development trend of deep learning algorithms in the era of the Internet of Everything? This is one of the core questions discussed in academia and industry. A general consensus is that deep learning algorithms will develop toward high efficiency.

Figure 1.4 Comparison of computing power demands and algorithms for deep learning models.

OpenAI, an open artificial intelligence research organization, has pointed out that "the computing resource required by advanced artificial intelligence doubles approximately every three and a half months". The computing resource for training a large AI model has increased by 300,000 times since 2012, an average annual increase of 11.5 times, whereas the growth of hardware computing performance has only reached an average annual increase of 1.4 times. On the other hand, the improvement of algorithmic efficiency saves on average about 1.7 times the computing resource per year. This means that, as we continue to pursue improvements in algorithm performance, the increase in computing resource demands potentially exceeds the development speed of hardware computing performance, as shown in Fig. 1.4. A practical example is the deep learning model GPT-3 [24] for natural language processing released in 2020: the cost of model training and computing resource deployment alone reached about 13 million dollars. If the computing resource cost keeps increasing exponentially, it will be difficult to achieve sustainable development. How to solve this problem is one of the key issues in the development of artificial intelligence toward pervasive intelligence.

1.3.2 Two stages of deep learning: training and inference

Deep learning is generally divided into two stages, training and inference. First, the process of estimating the parameters of the neural network model based on known data is called training; it is sometimes also known as parameter learning. In this book, to avoid ambiguity, we use the word "training" to describe the parameter estimation process. The data required in the training process are called the training dataset. The training algorithm is usually described as an optimization task: the model parameters with the smallest prediction error on the labels of the training sample set are estimated through gradient descent [25], and a neural network model with better generalization is obtained through regularization [26]. In the second stage, the trained neural network model is deployed in the system to predict the labels of unknown data obtained by the sensor in real time; this is called the inference process. Training and inference are like two sides of the same coin: they belong to different stages but are closely related, and the training quality of the model determines its inference accuracy. For the convenience of understanding the subsequent content of this book, we summarize the main machine learning concepts involved in the training and inference process as follows.

Dataset. The dataset is a collection of known data with similar attributes or features, together with their labels. In deep learning, signals such as voices and images acquired by the sensor are usually converted into vectors, matrices, or tensors. The dataset is usually split into a training dataset and a test dataset, which are used for estimating the parameters of the neural network model and for evaluating its inference performance, respectively.

Deep learning model. In this book, we call a function f(x; θ) from the known data x to the label y to be estimated the model, where θ is the collection of internal parameters of the neural network. It is worth mentioning that in deep learning the parameters and functional forms of the model are diverse and large in scale, so it is usually difficult to write down the analytical form of the function; only a formal definition is provided here.

Objective function. The training of a deep learning model is defined as an optimization problem. The objective function of the optimization problem generally includes two parts, a loss function and a regularization term. The loss function describes the average error of the model's label predictions on the training samples and is minimized to enhance the accuracy of the model on the training sample set. The regularization term is usually used to control the complexity of the model so as to improve its accuracy on unknown data labels, i.e., the generalization performance of the model.
Figure 1.5 Application scenarios of cloud and edge.

1.3.3 Cloud and edge devices

Edge computing [27] refers to a distributed architecture that decomposes the large-scale computing of the central node into smaller, easier-to-manage parts and disperses them to edge nodes for processing. The edge nodes are closer to the terminal devices and offer higher transmission speed and lower time delay. As shown in Fig. 1.5, the cloud refers to central servers far away from users, which users can access anytime and anywhere through the Internet for information query and sharing, whereas the edge refers to base stations or servers close to the user side. Terminal devices [28] such as monitoring cameras, mobile phones, and smart watches are closer to the edge. For deep learning applications, if the inference stage can be completed at the edge, then the problem of transmission time delay may be solved; moreover, edge computing provides services near the data sources or users and therefore does not cause the problem of privacy disclosure. Data show that cloud computing power will grow relatively slowly in the coming years, with a compound annual growth rate of 4.6%, whereas demand at the edge grows exponentially, with a compound annual growth rate of 32.5%.

The edge computing terminal refers to smart devices that focus on real-time, secure, and efficient scenario-specific data analysis on the user side. It has huge development prospects in the field of the artificial intelligence Internet of Things (AIoT). A large number of sensor devices in the Internet of Things industry need to collect various types of data at high frequency. Edge computing devices can integrate data collection, computation, and execution to effectively avoid the cost and time delay of uploading the data to the cloud and to improve the security and privacy protection of user data. According to an IDC survey, 45% of the data generated by the Internet of Things industry in 2020 will be processed at the edge of the network, and this proportion will expand in the coming years. The "2021 Edge Computing Technology White Paper" points out that typical application scenarios of edge computing smart terminals include smart vehicle networking/autonomous driving, the industrial Internet, and smart logistics. The values of ultralow time delay, massive data, edge intelligence, data security, and cloud collaboration will prompt more enterprises to choose edge computing.
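The latency benefit of edge inference mentioned above can be illustrated with a simple back-of-the-envelope model. The sketch below is purely illustrative: the function names and all bandwidth, round-trip, and inference-time figures are assumptions made for this example rather than measurements from this book; it only shows why on-device inference can beat cloud offloading once the sensor payload is large relative to the uplink bandwidth, even though the edge processor itself is slower.

# Toy end-to-end latency model: cloud offloading versus edge (on-device) inference.
# All figures below are illustrative assumptions, not measurements.

def cloud_latency_ms(payload_mb, uplink_mbps=10.0, server_infer_ms=15.0, rtt_ms=60.0):
    # Upload the raw sensor data, run inference on a fast cloud server,
    # and pay one network round trip for the (small) result.
    upload_ms = payload_mb * 8.0 / uplink_mbps * 1000.0
    return rtt_ms + upload_ms + server_infer_ms

def edge_latency_ms(device_infer_ms=80.0):
    # On-device inference: no transmission, only slower local compute.
    return device_infer_ms

if __name__ == "__main__":
    frame_mb = 1.2  # assumed size of one camera frame
    print("cloud offloading:", round(cloud_latency_ms(frame_mb)), "ms")  # roughly 1000 ms
    print("edge inference:", round(edge_latency_ms()), "ms")             # 80 ms

With these assumed numbers, uploading a single 1.2 MB frame over a 10 Mbps uplink already costs nearly a second, so the slower local processor still finishes an order of magnitude earlier; the gap widens further when the network delay is variable, which is exactly the uncertainty problem discussed in Section 1.4.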
1.4 Challenges and objectives

In recent years, deep learning has made breakthroughs in the fields of machine vision and voice recognition. However, because the training and inference of standard deep neural networks involve a large number of parameters and floating-point computations, they usually need to run on resource-intensive cloud servers and devices. This solution faces two challenges.

(1) Privacy. Sending user data (such as photos and voice) to the cloud can cause serious privacy disclosure problems. The European Union, the United States, and other jurisdictions have set up strict legal management and monitoring systems for sending user data to the cloud.

(2) High delay. Many smart terminal applications have extremely high requirements on the end-to-end delay from data collection to completion of processing. However, in the end-cloud collaborative architecture the data transmission delay is uncertain, which makes it difficult to meet the needs of highly time-sensitive smart applications such as autonomous driving.

Edge computing effectively addresses these problems and has gradually become a research hotspot, with some recent technical breakthroughs. On one hand, algorithm design companies have begun to seek more efficient and lightweight deep learning models (such as MobileNet and ShuffleNet). On the other hand, hardware technology companies, especially chip companies, have invested heavily in the development of dedicated neural network computing acceleration chips (such as NPUs). How to minimize resource consumption by optimizing algorithms and hardware architectures on edge devices with limited resources is of great significance to the development and application of AIoT in the 5G and even 6G era.

Deep learning edge computing technology based on smart terminals will effectively address the above technical challenges of deep learning cloud computing. This book focuses on deep learning edge computing technology and introduces how to design, optimize, and deploy efficient neural network models on embedded smart terminals at the three levels of algorithms, hardware, and applications. On the algorithm side, neural network algorithms for edge deep learning are introduced, including lightweight neural network structure design, pruning, and compression techniques. On the hardware side, the book details the hardware design and optimization methods of edge deep learning, including algorithm and hardware collaborative design, near-memory computing, and the hardware implementation of ensemble learning. Each part also briefly introduces the corresponding applications. In addition, as a comprehensive example, the application of smart monitoring cameras, which integrates algorithm innovation and hardware architecture innovation, is introduced as a separate part at the end of this book.

1.5 Outline of the book

This book aims to comprehensively cover the latest progress in edge-based neural computing, including algorithm models and hardware design. To reflect the needs of the market, we attempt to systematically summarize the related technologies of edge deep learning, including algorithm models, hardware architectures, and applications. The performance of deep learning models can be maximized on edge computing devices through collaborative algorithm-hardware codesign.

The structure of this book is as follows. It consists of three parts and nine chapters: Part 1 is Introduction, including two chapters (Chapters 1–2); Part 2 is Model and Algorithm, including three chapters (Chapters 3–5); and Part 3 is Architecture Optimization, including four chapters (Chapters 6–9).

The first chapter (Introduction) mainly describes the development process, related applications, and development prospects of artificial intelligence, provides some basic concepts and terms in the field of deep learning, and finally presents the research content and contributions of this book.

The second chapter (The Basics of Deep Learning) explains the relevant basics of deep learning, including the architectures of feedforward neural networks, convolutional neural networks, and recurrent neural networks, as well as the training process of the network models and the performance and challenges of deep neural networks on AIoT devices.

Chapter 3 (Model Design and Compression) discusses current lightweight model design and compression methods: it presents some classical lightweight network designs and details two typical model compression methods, model pruning and knowledge distillation.

Chapter 4 (Mix-Precision Model Encoding and Quantization) proposes a mixed-precision quantization and encoding method, the bitwise bottleneck, from the perspective of quantization and encoding of neural network activations, based on signal compression theory in wireless communication; the method quantizes neural network activations from floating point to a low-precision fixed-point type. Experiments on ImageNet and other datasets show that, by minimizing the quantization distortion of each layer, the bitwise bottleneck encoding method achieves state-of-the-art performance with low-precision activations.
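As background for the activation quantization theme of Chapter 4, the following minimal sketch illustrates the general idea of mapping 32-bit floating-point activations to low-precision fixed-point values. It shows only generic uniform symmetric quantization; it is not the bitwise bottleneck method developed in that chapter, and the function names are invented for this example.

# Generic uniform symmetric quantization of an activation tensor to signed 8-bit
# fixed point. Illustrative only; not the bitwise bottleneck method of Chapter 4.
import numpy as np

def quantize_activations(x, num_bits=8):
    # Choose a scale so that the largest activation magnitude maps to the largest code.
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8-bit signed values
    scale = float(np.max(np.abs(x))) / qmax
    if scale == 0.0:                        # all-zero input: avoid division by zero
        scale = 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximate floating-point tensor from the integer codes.
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)    # toy activation tensor
q, s = quantize_activations(x)
print("max abs reconstruction error:", float(np.max(np.abs(x - dequantize(q, s)))))

The reconstruction error of such a uniform quantizer grows as the bit width shrinks, which is why Chapter 4 frames the problem in rate-distortion terms and minimizes the quantization distortion of each layer instead of using a single fixed precision everywhere.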
    16 Deep Learningon Edge Computing Devices deep adaptive network method from the perspective of algorithm and hard- ware collaborative design to explore the sparsity between neural network connections. To make full use of the advantages of algorithm optimization, we propose an efficient hardware architecture based on a sparsely mapped memory. Unlike the traditional network architecture on chip, the deep adaptive network on chip (DANoC) closely combines communication and calculation to avoid massive power loss caused by parameter transmission between the onboard memory and the on-chip computing unit. The ex- perimental results show that compared with the most advanced method, the system has higher precision and efficiency. Chapter 8 (Hardware Architecture Optimization for Object Tracking) proposes a low-cost and high-speed VLSI system for object tracking from the perspective of algorithm and hardware collaborative design based on texture and dynamic compression perception features and ellipse matching algorithm. The system introduces a memory-centric architecture mode, multistage pipelines, and parallel processing circuits to achieve high frame rates while consuming minimal hardware resources. Based on the FPGA prototype system, at a clock frequency of 100 MHz, a processing speed of 600 frames per second is realized, and stable tracking results are main- tained. Chapter 9 (SensCamera: A Learning based Smart Camera Prototype) provides an example of edge computing terminals, a smart monitoring camera prototype system from the perspective of algorithm and hardware collaborative design, and integrates algorithm innovation and hardware ar- chitecture innovation. First, we propose a hardware-friendly algorithm, which is an efficient convolutional neural network for unifying object detection and image compression. The algorithm uses convolution com- putation to perform near-isometric compressed perception and invents a new noncoherent convolution method to learn the sampling matrix to re- alize the near-isometric characteristics of compressed perception. Finally, through hardware-oriented algorithm optimization, a smart camera pro- totype built with independent hardware can be used to perform object detection and image compression of 20 to 25 frames of video images per second with power consumption of 14 watts. References [1] Intelligence-GSMA, The mobile economy 2020, Tech. rep., GSM Association, Lon- don, 2020. [2] J. Von Neumann, The Computer and the Brain, Yale University Press, 2012.
    Introduction 17 [3] B.G.Buchanan, A (very) brief history of artificial intelligence, AI Magazine 26 (2005) 53–60. [4] J. McCarthy, M. Minsky, N. Rochester, C.E. Shannon, A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955, AI Magazine 27 (2006) 12–14. [5] A.L. Samuel, Some studies in machine learning using the game of checkers, IBM Journal of Research and Development 3 (1959) 210–229. [6] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3) (1995) 273–297. [7] W.S. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics 5 (4) (1943) 115–133. [8] F. Rosenblatt, The perceptron: a probabilistic model for information storage and or- ganization in the brain, Psychological Review 65 (6) (1958) 386. [9] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back- propagating errors, Nature 323 (6088) (1986) 533–536. [10] G.E. Hinton, S. Osindero, Y.-W. Teh, A fast learning algorithm for deep belief nets, Neural Computation 18 (7) (2006) 1527–1554. [11] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to doc- ument recognition, Proceedings of the IEEE 86 (11) (1998) 2278–2324. [12] A. Voulodimos, N. Doulamis, A. Doulamis, E. Protopapadakis, Deep learning for computer vision: A brief review, Computational Intelligence and Neuroscience (2018). [13] H. Purwins, B. Li, T. Virtanen, J. Schlüter, S.-y. Chang, T. Sainath, Deep learning for audio signal processing, IEEE Journal of Selected Topics in Signal Processing 13 (2019) 206–219. [14] K. Ashton, et al., That ‘internet of things’ thing, RFID Journal 22 (7) (2009) 97–114. [15] A. Ghosh, D. Chakraborty, A. Law, Artificial intelligence in internet of things, CAAI Transactions on Intelligence Technology 3 (4) (2018) 208–218. [16] S. Ingle, M. Phute, Tesla autopilot: semi autonomous driving, an uptick for future autonomy, International Research Journal of Engineering and Technology 3 (9) (2016) 369–372. [17] B. Sudharsan, S.P. Kumar, R. Dhakshinamurthy, AI vision: Smart speaker design and implementation with object detection custom skill and advanced voice interaction ca- pability, in: Proceedings of International Conference on Advanced Computing, 2019, pp. 97–102. [18] D. Saha, A. Mukherjee, Pervasive computing: a paradigm for the 21st century, Com- puter 36 (3) (2003) 25–31. [19] M. Satyanarayanan, Pervasive computing: Vision and challenges, IEEE Personal Com- munications 8 (4) (2001) 10–17. [20] D. Reinsel, J. Gantz, J. Rydning, Data age 2025: the evolution of data to life-critical don’t focus on big data; focus on the data that’s big, Tech. rep., IDC, Seagate, 2017. [21] S.D. Holcomb, W.K. Porter, S.V. Ault, G. Mao, J. Wang, Overview on DeepMind and its AlphaGo Zero AI, in: Proceedings of the International Conference on Big Data and Education, 2018, pp. 67–71. [22] R. Schaller, Moore’s law: past, present and future, IEEE Spectrum 34 (1997) 52–59. [23] D. Han, S. Zhou, T. Zhi, Y. Chen, T. Chen, A survey of artificial intelligence chip, Journal of Computer Research and Development 56 (1) (2019) 7. [24] T.B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, et al., Language models are few-shot learners, arXiv:2005.14165 [abs]. [25] J. Zhang, Gradient descent based optimization algorithms for deep learning models training, arXiv:1903.03614 [abs]. [26] J. Kukacka, V. Golkov, D. Cremers, Regularization for deep learning: A taxonomy, arXiv:1710.10686 [abs].
    18 Deep Learningon Edge Computing Devices [27] W. Shi, J. Cao, Q. Zhang, Y. Li, L. Xu, Edge computing: Vision and challenges, IEEE Internet of Things Journal 3 (5) (2016) 637–646. [28] J. Chen, X. Ran, Deep learning with edge computing: A review, Proceedings of the IEEE 107 (8) (2019) 1655–1674.
CHAPTER 2 The basics of deep learning
2.1 Feedforward neural networks
A feedforward neural network (or fully connected neural network) is one of the earliest neural network models invented in the field of artificial intelligence [1]. It is able to learn autonomously from the input data to complete specific tasks. Here we take image classification [2], one of the core problems in the field of computer vision, as an example to illustrate the principle of a feedforward neural network. The so-called classification problem consists in allocating a label to each input data point from a fixed set of classification labels. The task of a feedforward neural network is to predict the classification label of a given image. The prediction is made by giving scores (prediction probabilities) of the image under each classification label in the form of a vector, which is also the output of the feedforward neural network. The label with the highest score is the category to which the network predicts that the image belongs. As shown in Fig. 2.1(b), the process of prediction is a simple linear mapping combined with an activation function σ,
f(x; W, b) = σ(Wx + b), (2.1)
where the image data x ∈ R^d, and d is the number of pixel elements of the image. The parameters of this linear function are the matrix W ∈ R^(c×d) and the column vector b ∈ R^c, where c represents the number of categories. The parameter W is called the weight, and b is called the bias vector. Obviously, the weight and bias affect the performance of the feedforward neural network, and correct prediction is closely related to the values of these two parameters. According to the operational rules of matrices, the output will be a column vector of size c × 1, i.e., the scores of the c categories mentioned earlier.
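To make the mapping of Eq. (2.1) concrete, the short NumPy sketch below scores a flattened image against c categories, using the sigmoid function as the activation σ. The dimensions (a 28 × 28 image, 10 classes) and the random parameter values are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic activation: sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def predict_scores(x, W, b):
    # Eq. (2.1): f(x; W, b) = sigma(W x + b)
    # x: (d,) flattened image, W: (c, d) weight matrix, b: (c,) bias vector
    return sigmoid(W @ x + b)

# Illustrative sizes: a 28x28 grayscale image (d = 784) and c = 10 categories.
rng = np.random.default_rng(0)
d, c = 28 * 28, 10
x = rng.random(d)                  # flattened input image
W = rng.normal(0, 0.01, (c, d))    # weight matrix
b = np.zeros(c)                    # bias vector

scores = predict_scores(x, W, b)   # vector of c scores, one per category
print(scores.shape, int(np.argmax(scores)))  # predicted label = highest score
```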
Figure 2.1 The correspondence between the neuronal structure of the human brain and the artificial neural network.
The structure of the feedforward neural network is inspired by the neuronal system of the human brain [3]. The basic unit of computation in the brain is the neuron. There are about 80 billion neurons in the human neuronal system, which are connected by approximately 10^14 to 10^15 synapses. Fig. 2.1(a) shows a biological neuron. As shown in the figure, each neuron receives input signals from its dendrites and then generates output signals along its unique axon. The axon branches off gradually at the end and is connected to dendrites of other neurons through synapses. In the computational model of an artificial neuron, signals propagating along the axon (equivalent to the input x) interact with the dendrites of other neurons (equivalent to the matrix operation Wx) based on the synaptic strength of the synapses (equivalent to the weight W). The synaptic strength controls the strength of the influence of one neuron on another, as well as the direction of the influence: to excite (positive weight) or suppress (negative weight) that neuron. Dendrites transmit signals to the cell body, where the signals are added up. According to what has been said so far, the human brain system works in a way similar to the linear mapping we just mentioned, but then the crucial point comes. A neuron activates and outputs an electrical pulse along its axon only if the sum in the cell body is above a certain threshold. In neuronal dynamics, the Leaky Integrate-and-Fire (LIF) model [4] is commonly used to describe this process. The model describes the membrane potential of the neuron based on its synaptic input and the injection current it receives. Simply speaking, the communication between two neurons requires a spike as a mark. When the synapse of the previous neuron sends
    The basics ofdeep learning 21 Figure 2.2 The illustration of two activation functions, step and sigmoid functions. out a current, the membrane potential rises. Once the membrane potential exceeds a given threshold, a spike will be generated, and the membrane potential will be reset. Obviously, this process of spike generation is similar to a threshold-based function. If the current is lower than the threshold, then there will be no spike, and if the current is higher than the threshold, then there will be a spike, which is similar to the characteristics of a step function. The concept of activation function [5] was proposed in the light of this characteristic of human brain neurons. The activation function makes the neural network nonlinear, so that some problems that linear regression can- not handle can be solved. The step function just mentioned can handle the binary classification problem (outputting “yes” or “no”). For more cate- gories, we need an intermediate activation value or an accurate description of the degree of activation, rather than a simple division into 100% or 0. In such a context, traditional activation functions such as sigmoid were pro- posed, which normalizes the input to (0, 1), achieves nonlinearity, and has an intermediate activation value. The formulations and curves are shown in Fig. 2.2. Normally, a typical feedforward neural network has one or more addi- tional layers of neurons between the input and output layers, which are called hidden layers. The hidden layers exist to identify and divide the features of the input data in greater detail [6], so as to make correct predic- tions. We divide a classification problem into multiple subproblems based on physical features, and each neuron in the hidden layers is responsible for dealing with such a subproblem. Fig. 2.3 shows a three-layer feedforward neural network. The first layer is an input layer, containing three neurons;
the second layer is a hidden layer, containing four neurons; and the third layer is an output layer, containing two neurons. Neurons in two adjacent layers are connected to each other, and neurons in the same layer are not connected.
Figure 2.3 Schematic diagram of a three-layer feedforward neural network.
The input layer and hidden layer of the network are used as an example to describe the fully connected computation:
ŷj = sigmoid(∑_{i=1}^{3} wij xi + bj), (2.2)
where wij represents the weight value between the ith neuron in the input layer and the jth neuron in the hidden layer, and xi represents the numerical value of the ith element in the input column vector. These two values are multiplied, summed over i, and added to the bias bj of each neuron in the hidden layer, and the resulting value is processed by the activation function sigmoid. Thus the output ŷj of each neuron in the hidden layer is obtained. In practical applications, the number of hidden layers can be increased, and the way that the hidden layers are connected will also change, which leads to the deep neural networks discussed later.
2.2 Deep neural networks
2.2.1 Convolutional neural networks
As we mentioned in the previous section, the input to the feedforward neural network is vector data, which is subjected to feature extraction in multiple hidden layers. Each hidden layer is made up of several neurons, and each neuron is connected to all neurons in the previous layer. The final fully connected layer is called the output layer, and its output values are considered to be the score values for different categories in classification
    The basics ofdeep learning 23 Figure 2.4 Schematic diagram of convolutional operation receptive field. problems. Such a kind of network structure has obvious defects when facing large-size image input. The fully connected structure between hidden layers leads to a sharp increase in network parameter numbers, which not only greatly reduces the training speed, but also may lead to overfitting of the network and greatly damage the model performance. The fitting accuracy can be improved by increasing the number of network layers, but with the increase in the number of layers, problems such as gradient vanishing are easy to appear, making it difficult for the network to train convergence. Patterns of image recognition by human brain have been found to be instructive for the improvement of the structure of artificial neural net- works. The human brain first perceives each local feature in the picture and then performs a higher level of integration to obtain global information. This is to make use of the sparse connectivity of the observed objects in the image, that is, local pixels in the image are closely related, whereas the correlation between pixels that are further apart is weak. Like the human brain, we only need to perceive local features of an image at the hidden lay- ers and then integrate the local information at a higher layer to recognize a complete image. In recent years, it has been found that the convolution operator, which is widely used in the field of signal processing, can complete such a process. For one-dimensional time series signals, convolution is a special integral operation. When extended to a two-dimensional image, a matrix called convolutional kernel will be used to replace a signal participating in convo- lution in the one-dimensional case [7]. We have each convolutional kernel in the hidden layer connected to only one local area of the input data, and the spatial size of the connection is called the receptive field of the neuron.
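As a rough illustration of this local connectivity, the NumPy sketch below slides a single k × k kernel over a one-channel feature map with stride 1 and no padding, so that each output value is computed only from a k × k window of the input. The input size and kernel values are made up for the example.

```python
import numpy as np

def conv2d_single_channel(feature_map, kernel):
    """Naive valid convolution of one input channel with one k x k kernel.

    Each output element is computed from a k x k local window of the input,
    i.e., the receptive field of that output neuron in the previous layer.
    """
    h, w = feature_map.shape
    k = kernel.shape[0]
    out = np.zeros((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = feature_map[i:i + k, j:j + k]   # local receptive field
            out[i, j] = np.sum(window * kernel)      # weighted sum with shared weights
    return out

# Illustrative 8x8 input and a simple 3x3 vertical-edge-like kernel.
rng = np.random.default_rng(0)
x = rng.random((8, 8))
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])
y = conv2d_single_channel(x, kernel)
print(y.shape)  # (6, 6): each output neuron sees only a 3x3 patch of the input
```

Stacking two such 3 × 3 layers lets each neuron in the second hidden layer see a 5 × 5 region of the original input, which matches the receptive-field growth described next.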
    24 Deep Learningon Edge Computing Devices Figure 2.5 The illustration of a typical convolutional neural network structure. The receptive field can be understood as the size of the area seen by a neu- ron [8]. The deeper the neuron, the larger the input area that the neuron can see. As shown in Fig. 2.4, the receptive field of each neuron in the first hidden layer is 3, the receptive field of each neuron in the second hidden layer is 5, and the receptive field of each neuron in the third hidden layer is 7. The further away the hidden layer is from the input layer, the more features can be obtained, realizing the control from local features to the whole perception. The convolutional kernel is equivalent to a mapping rule in which the value of an original image pixel point is multiplied by the value of the convolutional kernel at the corresponding location, and then the resulting values are added according to the weights. This process is similar to the search for a class of patterns in an image to extract the features of the image. Obviously, such a filter is not able to extract all features, and a set of different filters is required. Convolutional neural network is a kind of feedforward neural networks with convolution operation and deep structure [9]. Its structure, as shown in Fig. 2.5, includes multiple convolutional layers for feature extraction, pool- ing layers for reducing the amount of computation, and a fully connected neural network layer for classification. We will elaborate on the principles of each layer below. A convolutional layer is a hidden layer that contains several convolu- tion units in a convolutional neural network, which is used for feature extraction. As mentioned above, the convolution is characterized by sparse connection and parameter sharing. The structure of the convolutional layer is shown in Fig. 2.6. The square window on the left is the previously men-
    The basics ofdeep learning 25 Figure 2.6 The diagram of convolution operation, where c is the number of input fea- ture map channels, n is the number of convolutional kernels, and k × k is the size of a convolutional kernel. tioned convolutional kernel, also known as a filter or weight. The weight window slides on the input feature map and computes the convolution with the feature map to obtain the output feature map. The filter can be shared, and only the feature map data in the filter window participate in the operation during each convolution computation. This is based on two considerations of image vision: first, only local neurons are important, and connections between neurons that are further apart can be ignored; sec- ond, patterns of the same picture may show any position of the image, and the same weight needs to be used for computation by sliding on the input data. The sparse connection and parameter sharing mechanism reduces the number of parameters of the model and improves the trainability of deep neural network. In a convolutional layer, the size of the output feature map is determined by the convolutional kernel size, stride, and zero-padding. The stride is the value of the pixel that the convolutional kernel moves each time it slides. Zero-padding means that the edge of the input feature map is filled with 0, which controls the spatial size of the output data volume. It is worth noting that these three variables are all hyperparameters of a deep neural network model. Different from parameters that can be obtained through training (such as the weight and bias), hyperparameters are variables determined by experience, such as the convolutional kernel size and convolutional ker- nel depth mentioned here, as well as the model training learning rate and iteration hyperparameters to be mentioned later.
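The spatial size of the output feature map follows a standard relation among these three hyperparameters, sketched below; the concrete numbers in the example are assumptions chosen only to show the effect of stride and zero-padding.

```python
def conv_output_size(in_size, kernel_size, stride=1, padding=0):
    # Output spatial size of a convolution along one dimension:
    # out = floor((in + 2 * padding - kernel) / stride) + 1
    return (in_size + 2 * padding - kernel_size) // stride + 1

# Example: with a 224-pixel input, a 3x3 kernel, stride 1, and zero-padding 1,
# the spatial size is preserved; stride 2 halves it.
print(conv_output_size(224, kernel_size=3, stride=1, padding=1))  # 224
print(conv_output_size(224, kernel_size=3, stride=2, padding=1))  # 112
```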
    26 Deep Learningon Edge Computing Devices Figure 2.7 The illustration of two different types of pooling, max pooling and average pooling. The pooling layer is a hidden layer used to abstract information in a deep neural network. Pooling is also a method to reduce the amount of com- putation of the model, which can increase the receptive field and reduce the difficulty and parameters of optimization. Fig. 2.7 shows two common pooling operations [10]. For each feature map channel, a window sliding operation is performed to realize max pooling (take the maximum value) or average pooling (take the average value) of data in the window to reduce the amount of data, prevent overfitting, and improve the generalization ability of the model. Generally, the pooling stride is greater than 1, which is used to reduce the scale of feature map. As shown in Fig. 2.7, the 4 × 4 feature map passes through a 2 × 2 pooling layer, whose stride is 2, and the output size is 2 × 2. A convolutional neural network, as an important supporting technol- ogy of deep learning, promotes the development of artificial intelligence. Convolution operators can effectively extract spatial information and are widely used in the field of visual images, including image recognition [11], image segmentation [12], target detection [13], etc. In the model inference stage, the image data is input into the network, multilevel feature extrac- tion is carried out through computation of multiple macromodules, and the prediction results of categories are output using a fully connected layer. In the model training stage, for a given input data, the error between the predicted result of a label and the real label is computed. Then the error gradient with respect to each parameter is computed by the back propa- gation algorithm. Finally, the parameters are updated by using the gradient descent algorithm. The above iterative steps are repeated for many times to gradually reduce the neural network prediction error until it converges. Compared with traditional feedforward neural networks, the convolutional
    The basics ofdeep learning 27 neural network generally has better prediction accuracy and is one of the most important deep neural network structures. 2.2.2 Recurrent neural networks Both the deep feedforward and convolutional neural networks mentioned above have a characteristic that their network structures are arranged in order, neurons in the lth layer receive only signals from neurons in the (l − 1)th layer, and there is no feedback structure. However, in a particular task, to better capture the time-sequential features of the input vector, it is sometimes necessary to combine the sequential inputs. For example, in speech signal processing, if an exact translation of a sentence is required, then it is obviously impossible to translate each word separately. Instead, the words need to be connected together to form a sequence, and then the entire time-sequential sequence is processed. The recurrent neural network (RNN) [14] described in this section is a neural network structure that processes time-sequential data and has feedback. RNN, originated in the 1980s and 1990s [15], is a recursive neural network that takes sequence data as input, adds feedback in the evolu- tion direction of the sequence, and links all nodes in a chain. It is difficult for a traditional feedforward neural network to establish a time-dependent model, whereas RNN can integrate information from input unit and pre- vious time node, allowing information to continue to function across the time node. This means that the network has a memory function, which is very useful in natural language processing, translation, speech recognition, and video processing. Fig. 2.8 shows the basic structure of a standard RNN. On the left, there is a folded diagram, and on the right, there is the structure diagram ex- panded in chronological order. We can see that the loop body structure of RNN is located in the hidden layer. This network structure reveals the essence of RNN: the network information of the previous moment will act on that of the next moment, that is, the historical information of the previ- ous moment will be connected to the neuron of the next moment through weights. As shown in the figure, in an RNN network, x represents an in- put, h represents a hidden layer unit, o represents an output, y represents a training label, t represents time, U is a parameter from the input layer to hidden layer, V is a parameter from the hidden layer to output layer, and W is a recurrent layer parameter. As we can see from the previous description, the performance of h at the moment of t is not only determined by the
    28 Deep Learningon Edge Computing Devices Figure 2.8 The schematic diagram of RNN, and its corresponding unfolded form. input at that moment, but also influenced by the previous moment: h(t) = tanh(Ux(t) + Wh(t−1) + b). (2.3) The output at the moment of t is o(t) = Vh(t) + c. (2.4) The final output predicted by the model is y = σ(o(t) ). (2.5) It is worth noting that for excessively long speech sequences, RNN only has short-term memory due to the problem of gradient vanishing during the training of back propagation model. Long short-term memory (LSTM) [16] and gate recurrent unit (GRU) [17] are two solutions to short-term memory of RNN, which introduce a gating mechanism to regulate infor- mation flow. Take the LSTM structure as an example. It contains a forget gate, an input gate, and an output gate, which are used to reserve or delete the incoming information and to record key information. LSTM performs well in long-term memory tasks, but the structure also leads to more pa- rameters, making the training more difficult. Compared with LSTM, GRU with similar structure uses a single gate to complete the forget and se- lection information stages, reducing parameters while achieving the same performance as LSTM, which is widely used under the condition of limited computing resource and time cost.
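To make Eqs. (2.3)–(2.5) concrete, the following NumPy sketch steps a plain RNN cell over a short input sequence. The layer sizes are arbitrary, and using the sigmoid function for the output activation σ is an assumption for illustration; the actual output nonlinearity depends on the task.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(xs, U, W, V, b, c):
    """Run a vanilla RNN over a sequence xs of input vectors.

    h_t = tanh(U x_t + W h_{t-1} + b)   # Eq. (2.3)
    o_t = V h_t + c                     # Eq. (2.4)
    y_t = sigma(o_t)                    # Eq. (2.5)
    """
    h = np.zeros(W.shape[0])            # initial hidden state h_0
    ys = []
    for x_t in xs:
        h = np.tanh(U @ x_t + W @ h + b)   # hidden state carries past information
        o_t = V @ h + c
        ys.append(sigmoid(o_t))
    return np.stack(ys), h

# Illustrative sizes: 4 time steps, 8-dim inputs, 16 hidden units, 3 outputs.
rng = np.random.default_rng(0)
T, d_in, d_h, d_out = 4, 8, 16, 3
xs = rng.random((T, d_in))
U = rng.normal(0, 0.1, (d_h, d_in))
W = rng.normal(0, 0.1, (d_h, d_h))
V = rng.normal(0, 0.1, (d_out, d_h))
b, c = np.zeros(d_h), np.zeros(d_out)

ys, h_last = rnn_forward(xs, U, W, V, b, c)
print(ys.shape, h_last.shape)  # (4, 3) per-step outputs, (16,) final hidden state
```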
2.3 Learning objectives and training process
2.3.1 Loss function
Deep learning model training and parameter estimation are generally based on the optimization of specific loss functions (or objective functions, collectively referred to as loss functions in this book). In model optimization theory, loss functions are functions that map the values of one or more variables to the real number field. For training a neural network model, we use the loss function to measure the degree of inconsistency between the predicted value and the ground-truth label; it is a nonnegative real-valued function, and the loss over all samples of the training set is usually expressed as
J(θ) = E_{(x,y)∼Pdata}[L(x,y;θ)] = E_{(x,y)∼Pdata}[L(f(x;θ), y)], (2.6)
where L is the loss function of each sample, f(x;θ) is the output predicted by the model when x is the input, Pdata is the empirical distribution, E represents expectation, y is the vector of data labels, and θ represents all parameters of the neural network. The smaller the output of the loss function, the smaller the gap between the predicted value and the data label, and the better the performance of the model. Most importantly, the loss function is differentiable and can be used to solve optimization problems.
In the model training stage, the predicted value is obtained through forward propagation after data is fed into the model, and then the loss function computes the difference between the predicted value and the data label, i.e., the loss value. The model updates the parameters by back propagation to reduce the loss value, so that the predicted value generated by the model approaches the ground-truth label of the data, thereby achieving the purpose of learning. In the following, we mainly introduce the loss functions of two classical prediction tasks, classification and regression.
The cross entropy loss function [18] is one of the most commonly used classification objective functions in current deep neural networks and is a loss function based on probability distribution measurements. In information theory, entropy is used to describe the measurement of uncertainty. Cross entropy was originally used to estimate the average coding length, and in machine learning it is used to evaluate the difference between the probability distribution obtained by the current training and the real distribution. For a single sample, the cross entropy loss function takes the
form
L = −∑_{k=1}^{C} yk log ŷk, (2.7)
where C represents the number of output categories, ŷk represents the kth output element (k ∈ {1,2,...,C}) of the neural network, and the corresponding data label is yk; yk is 1 if k is the true category and 0 otherwise.
Different from the binary nature of data labels in a classification task (discrete type), each dimension of a data label in a regression task is a real number (continuous type). In regression tasks the prediction error is frequently used to measure how close the predicted value of a model is to the data label. Assuming that the real label corresponding to the ith input feature xi in a regression problem is yi = [yi1,...,yik,...,yiM]^T, where M is the total dimension of the label vector, the prediction error between the network's regression prediction ŷik and its real label yik in the kth dimension is
Lik = yik − ŷik. (2.8)
Loss functions frequently used in regression tasks are L1 [19] and L2 [20]. The L1 loss function for N samples is defined as
L = (1/N) ∑_{i=1}^{N} ∑_{k=1}^{M} |Lik|. (2.9)
There are many kinds of loss functions, including loss functions based on specific tasks, such as the classification and regression tasks mentioned above, and loss functions based on distance measurement and probability distribution; for example, the mean square error loss function [21] and the L1 and L2 loss functions are based on distance measurement, whereas the cross entropy loss function and the Softmax loss function [22] are based on probability distribution. The selection of a loss function needs to consider the data features, and, in some cases, regularization terms should be added to improve the performance of the model.
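Both per-task losses translate directly into code. The sketch below evaluates the cross entropy loss of Eq. (2.7) for a one-hot label and the L1 loss of Eq. (2.9) averaged over N samples; the toy predictions and labels are made up for illustration.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Eq. (2.7): L = -sum_k y_k * log(y_hat_k), with y_true one-hot over C classes.
    return -np.sum(y_true * np.log(y_pred + eps))

def l1_loss(y_true, y_pred):
    # Eq. (2.9): mean over N samples of the summed absolute errors |y_ik - y_hat_ik|.
    return np.mean(np.sum(np.abs(y_true - y_pred), axis=1))

# Classification example with C = 3 categories (values are illustrative).
y_cls = np.array([0., 1., 0.])        # one-hot label
p_cls = np.array([0.2, 0.7, 0.1])     # predicted probabilities
print(cross_entropy(y_cls, p_cls))    # -log(0.7), about 0.357

# Regression example with N = 2 samples and M = 2 label dimensions.
y_reg = np.array([[1.0, 2.0], [0.5, -1.0]])
p_reg = np.array([[0.8, 2.5], [0.5, -0.5]])
print(l1_loss(y_reg, p_reg))          # (0.7 + 0.5) / 2 = 0.6
```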
2.3.2 Regularization
In model training, the loss value on the training samples can be continuously decreased by increasing the number of training iterations or adjusting the hyperparameter settings. However, while the prediction accuracy on the training samples may keep improving, the prediction accuracy on the testing samples may decrease instead of rising, which is called overfitting. Therefore regularization methods should be used to improve the generalization ability of the model and avoid overfitting.
Regularization is a class of methods designed to reduce the generalization error, i.e., the error of the model on testing samples, and thereby to correct the model. In traditional machine learning algorithms, the generalization ability is mainly improved by limiting the complexity of the model. Generally speaking, the model complexity is linearly related to the amount of weight parameters W: the larger the number of weights, the greater the complexity of the model. Therefore, to limit the complexity of the model, it is quite natural to reduce the number of weight parameters W, that is, to make some elements in W be zero or to limit the number of nonzero elements. Let the parameter θ of the neural network contain the weight coefficients of all layers. The complexity of the model parameters can then be limited by adding a parameter penalty Ω(θ) to the loss function. The regularized loss function is denoted as
L̃(x,y;θ) = L(x,y;θ) + αΩ(θ), (2.10)
where α ∈ [0,∞) is the hyperparameter weighing the relative contribution of the regularization term against the standard objective function L(x,y;θ). Setting α to 0 indicates that there is no regularization; the larger the α, the greater the corresponding regularization contribution.
By introducing regularization terms, we hope to limit the number of nonzero elements in W, so that the weight parameters are as small as possible and close to 0. The most frequently used regularization penalty is the L2 norm, which suppresses weights of large values by applying an element-by-element squared penalty to all parameters. The L2 parameter norm penalty is also known as weight decay [23], a regularization strategy that makes the weights closer to the origin by adding the regularization term Ω = (1/2)‖W‖₂² to the objective function.
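A minimal sketch of the regularized objective of Eq. (2.10) with the L2 (weight decay) penalty Ω = (1/2)‖W‖₂² is given below; the task loss value, the weight matrix, and the choice α = 0.01 are placeholders rather than recommended settings.

```python
import numpy as np

def l2_penalty(W):
    # Weight decay term: Omega(W) = 0.5 * ||W||_2^2 (half the sum of squared weights).
    return 0.5 * np.sum(W ** 2)

def regularized_loss(task_loss, W, alpha):
    # Eq. (2.10): L_tilde = L + alpha * Omega(theta); alpha = 0 disables regularization.
    return task_loss + alpha * l2_penalty(W)

# Illustrative numbers: a task loss of 0.42 and a small weight matrix.
W = np.array([[0.5, -1.0],
              [2.0,  0.0]])
print(l2_penalty(W))                          # 0.5 * (0.25 + 1 + 4) = 2.625
print(regularized_loss(0.42, W, alpha=1e-2))  # 0.42 + 0.02625 = 0.44625
```

The gradient of this penalty with respect to W is simply αW, which is why it pulls ("decays") the weights toward the origin during training.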
2.3.3 Gradient-based optimization method
As mentioned earlier, model training in machine learning is achieved by minimizing loss functions. Under normal circumstances, the loss function is very complicated, and it is difficult to derive an analytic expression for its minimum. Gradient descent [24] is designed to solve this kind of problem. For ease of understanding, let us take an example and regard the process of solving for the minimum value of a loss function as "standing somewhere on a slope and looking for the lowest point". We do not know the exact location of the lowest point; the gradient descent strategy is to take a small step in the downhill direction, and after a long downward walk, there is a high probability that we end up near the lowest point. We select the downhill direction to be the negative direction of the gradient, because the negative gradient direction at each point is the steepest descent direction of the function at that point. Deep neural networks usually use gradient descent to update parameters, and by introducing randomness, adjusting the learning rate, and other methods, it is hoped that the networks can avoid falling into poor local minima and converge to better points. This is the traditional idea of gradient descent.
Stochastic gradient descent (SGD) is one of the most frequently used methods for updating parameters. In this method, the gradient of the loss function with respect to the parameters is computed by using a mini-batch of random samples from the whole data set. SGD typically divides the whole data set into several small batches of sample data, then iterates over the input, computes losses and gradients, and finally updates the parameters. Let θ be the neural network parameters, and collect a small batch {x1,x2,...,xN} containing N samples from the training set, where xi corresponds to the label yi. The following equations show the computation principle of gradient descent [25]. Gradient computation:
ĝ ← (1/N) ∇θ ∑_{i=1}^{N} L(f(xi;θ), yi). (2.11)
Parameter update:
θ ← θ − εĝ, (2.12)
where ĝ represents the gradient of the loss function with respect to the parameters θ, and ε is called the learning rate, a hyperparameter that controls the update stride of the parameters. Too large a learning rate causes the parameters to fluctuate near the minimum but fail to converge, whereas too small a learning rate means more time is needed for convergence, so the convergence speed is slow. The learning rate can be adjusted by experience or by algorithms; for example, the learning process may be slower when a flat or high-curvature area is encountered. A momentum algorithm can be added to SGD to improve the convergence speed.
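The update rules of Eqs. (2.11)–(2.12) can be sketched for a simple linear regression model with a squared-error loss, as below. The data, model, batch size, and learning rate are illustrative assumptions; in a deep network, the gradient in Eq. (2.11) would be obtained by the back propagation algorithm discussed below.

```python
import numpy as np

def sgd_step(theta, x_batch, y_batch, lr):
    """One SGD step for linear regression f(x; theta) = x @ theta with squared error.

    Eq. (2.11): g_hat = (1/N) * grad_theta sum_i L(f(x_i; theta), y_i)
    Eq. (2.12): theta <- theta - lr * g_hat
    """
    n = x_batch.shape[0]
    preds = x_batch @ theta
    # Gradient of the mean squared error (1/N) * sum_i (pred_i - y_i)^2
    g_hat = (2.0 / n) * x_batch.T @ (preds - y_batch)
    return theta - lr * g_hat

# Illustrative data: y = 3*x0 - 2*x1 plus a little noise; SGD recovers the coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 2))
y = X @ np.array([3.0, -2.0]) + 0.01 * rng.normal(size=256)

theta = np.zeros(2)
for epoch in range(200):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), 32):        # mini-batches of N = 32 samples
        batch = idx[start:start + 32]
        theta = sgd_step(theta, X[batch], y[batch], lr=0.05)
print(theta)  # approaches [3, -2]
```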
At present, there are also adaptive algorithms such as the Adaptive Moment Estimation (Adam) algorithm [26] and the RMSProp algorithm [27], which optimize both the gradient and the learning rate to achieve good results.
In deep learning, gradient computation is complicated because of the large number of network layers and parameters. The back propagation algorithm [28] is widely used for computing the gradients of neural network parameters. The principle of the back propagation algorithm is to compute the gradient of the loss function with respect to each weight parameter layer by layer through the chain rule: the reverse iteration is performed from the last layer, and the weight parameters of the model are updated at the end of each iteration. In the process of model training, the weight parameters are constantly updated by inputting different batches of data until the loss function values converge, yielding a better parameter solution.
2.4 Computational complexity
From the perspective of complexity, two considerations should be taken into account when designing a CNN. One is the amount of computation required by the network, and the other is the scale of the parameters of the model and of the input and output features of each layer. The former determines the speed of network training or inference, usually measured by time complexity, and the latter determines how much memory a computing device needs, usually measured by space complexity.
The time complexity of an algorithm is a function that describes its running time for an input of a given size. It can describe the trend of change in execution time as the data size increases. Generally speaking, the time complexity can be understood as the total time spent completing a specific algorithm. On a specific device, this time is determined by the total amount of computation required by the execution of the algorithm. The frequently used measures of the amount of computation of deep learning algorithms are the number of required floating-point operations (FLOPs) and FLOPS. Floating-point operations per second (FLOPS) is a measure of computer performance, useful in fields of scientific computation that require floating-point calculations; for such cases, it is a more accurate measure than instructions per second. At present, the total amount of computation of most convolutional neural networks can reach dozens or even hundreds of GigaFLOPs, such
    34 Deep Learningon Edge Computing Devices as the common convolutional neural network models MobileNet-V2 [29] and ResNet-50 [30], with a total amount of computation of 33.6 GFLOPs to 109.8 GFLOPs, which makes it difficult for neural networks deployed at the edge to complete real-time reasoning. The space complexity refers to the amount of memory space required to solve an instance of the computational problem as a function of char- acteristics of the input, which is usually measured in units of computer memory. Inside a computer, information is stored, computed, and trans- mitted in binary form. The most basic units of storage are bits and bytes. In convolutional neural networks the space complexity is mainly de- termined by the size of parameters at all layers. The parameters of the convolutional layer are mainly determined by the size and number of con- volutional kernels, whereas the parameters of the fully connected layer are determined by the number of input neurons and output neurons. Take AlexNet [31], the champion model of 2012 ImageNet Image Classifica- tion Challenge, for example, which contains five convolutional layers with parameter sizes of 35 KB, 307 KB, 884 KB, 1.3 MB, and 442 KB, respec- tively, and three fully connected layers with parameter sizes of 37 MB, 16 MB, and 4 MB. The total size of parameters in AlexNet is about 60 MB, among which the fully connected structure undoubtedly increases the size of parameters, contributing 57 MB of parameters. Some CNN models that emerged after AlexNet performed better, but they were difficult to deploy on edge computing terminals due to their high space complexity. References [1] D. Svozil, V. Kvasnicka, J. Pospichal, Introduction to multi-layer feed-forward neural networks, Chemometrics and Intelligent Laboratory Systems 39 (1) (1997) 43–62. [2] B.D. Ripley, Neural networks and related methods for classification, Journal of the Royal Statistical Society: Series B (Methodological) 56 (3) (1994) 409–437. [3] R. Sylwester, A Celebration of Neurons: An Educator’s Guide to the Human Brain, ERIC, 1995. [4] A.N. Burkitt, A review of the integrate-and-fire neuron model: I. Homogeneous synaptic input, Biological Cybernetics 95 (1) (2006) 1–19. [5] F. Agostinelli, M. Hoffman, P. Sadowski, P. Baldi, Learning activation functions to improve deep neural networks, arXiv:1412.6830 [abs]. [6] G. Huang, Y. Chen, H.A. Babri, Classification ability of single hidden layer feedfor- ward neural networks, IEEE Transactions on Neural Networks 11 (3) (2000) 799–801. [7] Y. Pang, M. Sun, X. Jiang, X. Li, Convolution in convolution for network in net- work, IEEE Transactions on Neural Networks and Learning Systems 29 (5) (2017) 1587–1597. [8] W. Luo, Y. Li, R. Urtasun, R. Zemel, Understanding the effective receptive field in deep convolutional neural networks, in: Proceedings of International Conference on Neural Information Processing Systems, 2016, pp. 4905–4913.
    The basics ofdeep learning 35 [9] S. Albawi, T.A. Mohammed, S. Al-Zawi, Understanding of a convolutional neural network, in: Proceedings of International Conference on Engineering and Technol- ogy, 2017, pp. 1–6. [10] D. Yu, H. Wang, P. Chen, Z. Wei, Mixed pooling for convolutional neural networks, in: Proceedings of International Conference on Rough Sets and Knowledge Technol- ogy, 2014. [11] S. Hijazi, R. Kumar, C. Rowen, et al., Using convolutional neural networks for image recognition, Cadence Design Systems (2015) 1–12. [12] H. Ajmal, S. Rehman, U. Farooq, Q.U. Ain, F. Riaz, A. Hassan, Convolutional neural network based image segmentation: a review, in: Proceedings of Pattern Recognition and Tracking XXIX, 2018. [13] Z. Wang, J. Liu, A review of object detection based on convolutional neural network, in: Proceedings of Chinese Control Conference, 2017, pp. 11104–11109. [14] Z.C. Lipton, A critical review of recurrent neural networks for sequence learning, arXiv:1506.00019 [abs]. [15] J.J. Hopfield, Neural networks and physical systems with emergent collective com- putational abilities, Proceedings of the National Academy of Sciences 79 (8) (1982) 2554–2558. [16] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735–1780. [17] R. Dey, F.M. Salem, Gate-variants of gated recurrent unit (GRU) neural networks, in: Proceedings of IEEE International Midwest Symposium on Circuits and Systems, 2017, pp. 1597–1600. [18] D.M. Kline, V. Berardi, Revisiting squared-error and cross-entropy functions for train- ing neural network classifiers, Neural Computing Applications 14 (2005) 310–318. [19] M.W. Schmidt, G. Fung, R. Rosales, Fast optimization methods for L1 regularization: A comparative study and two new approaches, in: Proceedings of European Confer- ence on Machine Learning, 2007. [20] P. Bühlmann, B. Yu, Boosting with the L2 loss: regression and classification, Journal of the American Statistical Association 98 (462) (2003) 324–339. [21] S. Singh, D. Singh, S. Kumar, Modified mean square error algorithm with reduced cost of training and simulation time for character recognition in backpropagation neu- ral network, in: Proceedings of International Conference on Frontiers in Intelligent Computing: Theory and Applications, 2013. [22] W. Liu, Y. Wen, Z. Yu, M. Yang, Large-margin softmax loss for convolutional neural networks, arXiv:1612.02295 [abs]. [23] A. Krogh, J. Hertz, A simple weight decay can improve generalization, in: Proceedings of International Conference on Neural Information Processing Systems, 1991. [24] E. Dogo, O. Afolabi, N. Nwulu, B. Twala, C. Aigbavboa, A comparative analysis of gradient descent-based optimization algorithms on convolutional neural networks, in: Proceedings of International Conference on Computational Techniques, Electronics and Mechanical Systems, 2018, pp. 92–99. [25] Y. Bengio, I. Goodfellow, A. Courville, Deep Learning, vol. 1, MIT press, Mas- sachusetts, USA, 2017. [26] D.P. Kingma, J. Ba Adam, A method for stochastic optimization, arXiv:1412.6980 [abs]. [27] G. Hinton, N. Srivastava, K. Swersky, RMSProp: Divide the gradient by a running average of its recent magnitude, Neural Networks for Machine Learning, Coursera lecture 6e (2012) 13. [28] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back- propagating errors, Nature 323 (6088) (1986) 533–536.
    36 Deep Learningon Edge Computing Devices [29] M. Sandler, A.G. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, MobileNetV2: Inverted residuals and linear bottlenecks, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. [30] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. [31] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolu- tional neural networks, Communications of the ACM 60 (2012) 84–90.
PART 2 Model and algorithm
CHAPTER 3 Model design and compression
3.1 Background and challenges
Although convolutional neural networks have achieved good results in such fields as computer vision and natural language processing, their massive numbers of parameters make them daunting for many embedded device-based applications. At present, deep learning models require large amounts of computing resources and memory, often accompanied by huge energy consumption. Large models become the biggest bottleneck when we need to deploy models on terminal devices with limited computing resources for real-time inference. The training and inference of deep neural networks usually rely heavily on GPUs with high computing ability. The huge scale of the features and the deluge of model parameters also greatly increase the training time. Take as an example AlexNet [1], a network containing 60 million parameters: it takes two to three days to train the entire model on the ImageNet data set using an NVIDIA K40. In fact, Denil et al. [9] have shown that deep neural networks face severe overparameterization and that a small subset of the parameters can completely reconstruct the remaining parameters. For example, ResNet-50 [10], which has 50 convolutional layers, requires more than 95 MB of storage memory and more than 3.8 billion floating-point multiplication operations to process an image. After some redundant weights are discarded, the network still works as usual, but more than 75% of the parameters and 50% of the computation time can be saved. This indicates that there is huge redundancy in the parameters of the model, which reveals the feasibility of model compression.
In the field of deep neural network research, model compression and acceleration have received great attention from researchers, and great progress has been made in the past few years. Significant advances in intelligent wearable devices and AIoT in recent years have created unprecedented opportunities for researchers to address the fundamental challenges of deploying deep learning systems to portable devices with limited resources, such as memory, CPU, energy, and bandwidth. A highly efficient deep learning method can have a significant impact on distributed systems, embedded devices, and FPGAs for artificial intelligence. For the design and compression of highly efficient deep neural network models, in this chapter, we will first analyze
  • 55.
    Another Random Documenton Scribd Without Any Related Topics
  • 56.
    a grander schemeof chiaroscuro; men comprehend where they once scoffed. It was thus that Yeoland rose inspired, like a spiritual Venus from a sea of dreams. As molten glass is shaped speedily into fair and exquisite device, so the red wax of her heart had taken the impress of the hour. Gilderoy had stirred her like a blazoned page of romance. Fulviac caught the girl's half glance at him; read in measure the meaning of her mood. Her lips were half parted as though she had words upon her tongue, but still hesitated from some scruple of pride. He straightened in the saddle, and waited for her to unbosom to him with a confident reserve. Well? he said at length, since she still lingered in her silence. How much one may learn in a day, she answered, drawing her white palfrey nearer to his horse. Fulviac agreed with her. The man on the end of the rope, he said, learns in two minutes that which has puzzled philosophers since Adam loved Eve. She turned to him with an eagerness that was almost passionate even in its suppressed vigour. How long was it before you came to pity your fellows? Some minutes, not more. And the conversion? Shall satisfy you one day. For the present I will buckle up so unsavoury a fable in my bosom. Tell me what you have learnt at Gilderoy. Yeoland looked at the moon. The man saw great sadness upon her face, but also an inspired radiance that made its very beauty the more remarkable. He foresaw in an instant that they were coming to
  • 57.
    deeper matters. Superficialities,the mannerisms of life, were falling away. The girl's heart beat near to his; he felt a luminous sympathy of spirit rise round them like the gold of a Byzantine background. Come, he said, with a burst of beneficence, you are beginning to understand me. She jerked a swift glance at him, like the look of a half-tamed falcon. You are a man, for all your sneers and vapourings. I had a heart once. Call me an oak, broken, twisted, aged, but an oak still. Yeoland drew quite close to him, so that her skirt almost brushed his horse's flank. Fulviac's shadow fell athwart her. Only her face shone clear in the moonlight. I have ceased, she said, to look upon life as a stretch of blue, a laughing dawn. Good. I have learnt that woe is the crown of years. Good again. That life is full of violence and wrong. A platitude. Yes. Life consists in learning platitudes. I am only one woman among thousands. A revelation. You jeer. Not so. Few women learn the truth of your proverb. Lastly, my trouble is not the only woe in the world. That it is an error to close up grief in the casket of self. Fulviac flapped his bridle, and looked far ahead into the cavern of the night. He was silent awhile in thought. When he spoke again,
  • 58.
    he delivered himselfof certain curt cogitations, characteristic confessions that were wholly logical. I am a selfish vagabond, he said; I appeal to Peter's keys whether all ambition is not selfish. I am an egotist for the good of others. The stronger my ambition, the stronger the hope of the land in generous justice. I live to rule, to rule magnanimously, yet with an iron sceptre. There, you have my creed. And God? she asked him. Is a most useful subordinate. You do not mean that? I do not. She saw again the mutilated beings in the catacombs, aye, even her own home flaming to the sky, and the white face of her dead father. Faith and devotion were great in her for the moment. Divine vengeance beaconed over the world, a torch borne aloft by the hand of Pity. It is God's war, she said to him with a finer solemnity sounding in her voice; you have stirred the woman in me. Is that enough? Enough, he answered her. And the rest? God shall make all plain in due season. Gilderoy had dwindled into the east; its castle's towers still netted the moonlight from afar. The meadowlands had ceased, and trees strode down in multitudes to guard the track. The night was still and calm, with a whisper of frost in the crisp, sparkling air. The world seemed roofed with a dome of dusky steel. Before them a shallow valley lay white in the light of the moon. Around climbed the glimmering turrets of the trees, rank on rank,
  • 59.
    solemn and tumultuous.The bare gable ends of a ruined chapel rose in the valley. Fulviac drew aside by a bridle path that ran amid rushes. To the left, from the broken wall of the curtilage, a great beech wood ascended, its boughs black against the sky, its floor ankle-deep with fallen leaves. The chapel stood roofless under the moon. Hollies, a sable barrier that glistened in the moonlight, closed the ruin on the south. Yews cast their gloom about the walls. A tall cross in the forsaken graveyard stretched out its mossy arms east and west. The armed groom took the horses and tethered them under a clump of pines by the wall. Fulviac and the girl Yeoland passed up through weeds and brambles to the porch. A great briar rose had tangled the opening with a thorny web, as though to hold the ruin from the hand of man. The tiled floor was choked with grass; a rickety door drooped rotten on its rusty hinges. Fulviac pushed through and beckoned the girl to follow. Within, all was ruinous and desolate, the roof fallen, the casements broken. We must find harbour here, said the man, our horses go far to-morrow. A cheerful hostel, this. Its wildness makes it safe. You fear the cold. I'll see to that. No. I am hungry. The high altar still stood below the small rose window in the east, where the rotting fragments of a triptych hid the stonework. There was a great carved screen of stone on either side, curiously recessed as though giving access to an ambulatory. The altar stood in dense shadow, with broken timber and a tangle of briars ringing a barrier about its steps. On the southern side of the nave, a patch of
  • 60.
    tiled flooring stillstood riftless, closed in by two fallen pillars. The groom came in with two horse-cloaks, and Fulviac spread them on the tiles. He also gave her a small flask of wine, and a silver pyx holding meat and bread. We crusaders must not grumble at the rough lodging, he said to her; wrap yourself in these cloaks, and play the Jacob with a stone pillow. She smiled slightly in her eyes. The groom brought in a saddle, ranged it with a saddle cloth covering it, that it might rest her head. And you? she said to Fulviac. Damian and I hold the porch. You will be cold. I have a thick hide. The Lady of Geraint give you good rest! He threaded his way out amid the fallen stones and pillars, and closed the rickety gate. The groom, a tall fellow in a battered bassinet and a frayed brigantine, stood by the yew trees, as on guard. Fulviac gestured to him. The man moved away towards the eastern end of the chapel, where laurels grew thick and lusty about the walls. When he returned Fulviac was sitting hunched on a fallen stone in the corner of the porch, as though for sleep. The man dropped a guttural message into his master's ear, and propped himself in the other angle of the porch. An hour passed; the moon swam past the zenith towards the west; a vast quiet watched over the world, and no wind rippled in the woods. In the sky the stars shivered, and gathered more closely their silver robes. In the curtilage the ruined tombs stared white and desolate at the moon.
  • 61.
    An owl's crysounded in the woods. Sudden and strange, as though dropped from the stars, faint music quivered on the frost- brilliant air. It gathered, died, grew again, with a mysterious flux of sweetness, as of some song stealing from the Gardens of the Dead. Flute, cithern, and viol were sounding under the moon, merging a wizard chant into the magic of the hour. Angels, crimson-winged, in green attire, seemed to descend the burning stair of heaven. A sudden great radiance lit the ruin, a glory of gold streaming from the altar. Cymbals clashed; waves of shimmering light surged over the broken walls. Incense, like purple smoke, curled through the casements. The music rushed in clamorous rapture to the stars. A voice was heard crying in the chapel, elfin and wild, yet full of a vague rich sanctity. It ceased sudden as the brief moan of a prophecy. The golden glow elapsed; the music sank to silence. Nought save the moonlight poured in silver omnipotence over the ruin. From the chapel came the sound of stumbling footsteps amid the stones. A hand clutched at the rotting door, jerked it open, as in terror. The girl Yeoland came out into the porch, and stood swaying white-faced in the shadow. Fulviac. Her voice was hoarse and whispering, strained as the overwrought strings of a lute. The man did not stir. She bent down, dragged at his cloak, calling to him with a quick and gathering vehemence. He shook himself, as from the thongs of sleep, stood up and stared at her. The groom still crouched in the dark corner. Fulviac.
  • 62.
    She thrust herway through the briars into the moonlight. Her hood had fallen back, her hair loose upon her shoulders; her eyes were full of a supernatural stupor, and she seemed under the spell of some great shock of awe. She trembled so greatly, that Fulviac followed her, and held her arm. Speak. What has chanced to you? She still shook like some flower breathed upon by the oracular voice of God. Her hands were torn and bloody from the thorns. The Virgin has appeared to me. Are you mad? The Virgin. Some ghost or phantom. No, no, hear me. She stretched out her hands like one smitten blind, and took breath swiftly in sudden gasps. Hear me, I was but asleep, woke, and heard music. The Virgin came out upon the altar, her face like the moon, her robes white as the stars. There was great light, great glory. And she spoke to me. Mother of God, what am I that I should be chosen thus! Speak. Can this be true? The truth, the truth! Fulviac fell on his knees with a great gesture of awe. The girl, her face turned to the moon, stood quivering like a reed, her lips moving as if in prayer. Her message, child? Ah, it was this: 'Go forth a virgin, and lead the hosts of the Lord.'
Fulviac's face was in shadow. He thrust up his hands to the heavens, but would not so much as glance at the girl above him. His voice rang out in the silence of the night:-- Gloria tibi, Sancta Maria! Gloria tibi, Domine!

IX

Faith, golden crown of the Christian! Self-mesmerism, subtle alchemy of the mind! How the balance of belief swings between these twain! A spiritual conception born in a woman's brain is as a savour of rich spices sweetening all the world. How great a power of obstinacy stirs in one small body! A pillar of fire, a shining grail. She will bring forth the finest gems that hang upon her bosom, the ruby of heroism, the sapphire of pity. She will cast all her store of gold into the lap of Fate. Give to her some radiant dream of hope, and she may prove the most splendid idealist, even if she do not prove a wise one. Remember the women who watched about the Cross of Christ. There had been trickery in the miracle, a tinge of flesh in the vision. The Virgin, in the ruck of religion, had suffered herself to be personated by a clever little player from Gilderoy, aided and idealised by a certain notorious charlatan who dealt in magic, was not above aiding ecclesiastical mummeries on occasions, and conspiring for the solemn production of miracles. A priest's juggling box, a secret door at the back of the altar used in bygone days for
the manipulation of a wonder-working image, musicians, incense, and Greek fire. These had made the portent possible. As for Fulviac, rugged plotter, he was as grave as an abbot over the business; his words were wondrous beatific; he spoke of the interventions of Heaven with bated breath. It was a superstitious age, touched with phantasy and gemmed with magic. Relics were casketed in gold and silver; holy blood amazed with yearly liquefactions the souls of the devout; dreamers gazed into mirrors, crystals, finger-nails, for visions of heaven. Jewels were poured in scintillant streams at the white feet of the Madonna. It was all done with rare mysticism, colour, and rich music. The moon ruled marriage, corn, and kine. The saints, like a concourse of angels, walked with melancholy splendour through the wilds. As for the girl Yeoland, she had the heart of a woman in the noblest measure, a red heart, pure yet passionate. The world waxed prophetic that shrill season. She was as full of dreams and phantasies as an astrologer's missal. Nothing amazed her, and yet all earth was mysterious. The wind spoke in magic syllables; the trees were oracular; the stars, white hands tracing symbols in the sky. She was borne above herself on the pinions of ecstasy, heard seraph wings sweep the air, saw the glimmer of their robes passing the portals of the night. Mysticism moved through the world like the sound of lutes over a moonlit sea. One March morning, Fulviac came to her in the northern chamber of the cliff. Yeoland had masses of scarlet cloth and threads of gold upon her knees, for she was broidering a banner, the banner of the Maid of Gilderoy. Her eyes were full of violet shadow. She
wore a cross over her bosom, emeralds set in silver; a rosary, dangling on her wrist, told how her prayers kept alternate rhythm with her fingers. Fulviac crooked the knee to the crucifix upon the wall, sat down near her on a rich bench of carved cedar wood. The man was in a beneficent mood, and beamed on her like a lusty summer. He had tidings on his tongue, tidings that he hoarded with the craft of an epicure. It was easy to mark when the world trundled well with his humour. He put forth smiles like a great oak whose boughs glisten in the sun. You will tire yourself, little sister. She looked at him with one of her solemn glances, a glance that spoke of vigils, soul-searchings, and prayer. My fingers tire before my heart, she said to him. Rest, rest. Do I seem weary to you? Nay, you are fresh as the dawn. He brushed back the tawny hair from off his forehead, and the lines about his mouth softened. I have news from the west. Ah! We gather and spread like fire in a forest. The mountain men are with us, ready to roll down from the hills with hauberk and sword. In two months Malgo will have sent the bloody cross through all the west. The golden thread ran through the girl's white fingers; the beads of her rosary rattled; she seemed to be weaving the destiny of a kingdom into the device upon her banner. How is it with us here? she asked him.
I have a thousand stout men and true camped upon the cliff. Levies are coming in fast, like steel to a magnet. In a month we shall outbulk a Roman legion. And Gilderoy? Gilderoy and Geraint will give us a score thousand pikemen. The stars fight for us. Fulviac took her lute from the carved bench and began to thrum the chords of an old song. Spears crash, and swords clang, Fame maddens the world. Come battle and love. Iseult-- Ah, Iseult. He broke away with a last snap at the strings, and set the lute aside. Bear with me, he said. Her dark eyes questioned him over her banner. I offer you the first victim. Ah! Flavian of Gambrevault. An indefinite shadow descended upon the girl's face. The inspired radiance seemed dimmed for the moment; the crude realism of her thoughts rang in discord to her dreams. She lost the glimmering thread from her needle. Her hands trembled a little as she played with the scarlet folds of the banner. Well?
A lad of mine bears news--a black-eyed rogue from the hills of Carlyath, sharp as a sword's point, quaint as an elf. I sent him gleaning, and he has done bravely. You would hear his tale from his own lips? She nodded and seemed distraught. Yes. Bring him in to me, she said. Fulviac left her, to return with a slim youth sidling in behind him like a shadow. The lad had a nut-brown skin and ruddy cheeks, a pair of twinkling eyes, a thatch of black hair over his forehead. Bred amid the hills of Carlyath, where the women were scarlet Eves, and the land a paradise, he had served in Gilderoy as apprentice to an armourer. Carlyath's wilds and the city's roguery had mingled in him fantastic strains of extravagant sentiment and cunning. Half urchin, half elf, he stood with bent knees and slouched shoulders, his black eyes alert on Fulviac, his lord. The man thrust him forward by the collar, with an eloquent gesture. The whole tale. Try your wit. The Carlyath lad advanced one foot, and with an impudent southern smirk, remarked-- This, madame, is an infatuated world. Thus, sententiously delivered, he plunged into a declamation with a picturesque and fanciful extravagance that he had imbibed from the strolling romancers of his own land. In the city of Gilderoy, he said, speaking very volubly and with many gestures, there lives a lady of surpassing comeliness. Her eyes are as the sky, her cheeks as June roses, her hair a web of gold. She is a right fair lady, and daily she sits at her broad
casement, singing, and plaiting her hair into shackles of gold. She has bound the Lord Flavian of Gambrevault in a net starred with poppies, scarlet poppies of the field, so that he ever dreams dreams of scarlet, and sees visions of lips warm as wine. Daily the Lord Flavian scours the country between Avalon and the fair city of Gilderoy, till the very dust complains of his fury, and the green grass curses his horse's heels. But the lady with the hair of gold compasses him like the sunset; she has stolen the eyes of heaven, and the stars are blind. Fulviac smiled over the extreme subtlety of the rendering. It was a delicate matter, delicately handled. The Carlyath lad had wit, and a most seraphic tongue. What more? There is yet another lady at Avalon. Well? A lady whose name is Duessa, a lady with black hair and a blacker temper. Lord Flavian has a huge horror of her tongue. Therefore he rides like a thief, without trumpets, to Gilderoy. Yet more. The lad spread his hands with an inimitable gesture, shrugged, and heaved a most Christian sigh. The Lady Duessa is the Lord Flavian's wife, he said. Surely. Therefore, sire, he is a coward. The lad drew back with a bow and a scrape of the foot, keeping his eyes on the floor with the discretion of a veteran lackey. At a sign from Fulviac, he slipped away, and left Yeoland and the man alone.
The girl's hands were idle in her lap; the great scarlet banner trailed in rich folds about her feet. There was a white mask of thought upon her face, and her eyes searched the distance with an oblivious stare. All the strong discords of the past rushed clamorous to her brain; her consecrated dreams were as so many angels startled by the assaults of hell. She rose from her chair, cast the casement wide, and stood gazing over the forest. Youth seemed in the breeze, and the clear voice of the Spring. The green woods surged with liberty; the strong zest of life breathed in their bosoms. In the distance the pines seemed to beckon to her, to wave their caps in windy exultation. Fulviac had stood watching her with the calm scrutiny of one wise in the passionate workings of the soul. He suffered her to possess her thoughts in silence for a season, to come by a steady comprehension of the past. Presently he gathered the red banner, and hung it on the frame, went softly to her and touched her sleeve. Shall they kill him on the road? he asked. She pondered a moment, and did not answer him. It is easy, he said, and a matter of sheer justice. The words seemed to steel her decision. No, she said, let them bring him here--to me. So be it, he answered her. Fulviac found her cold and taciturn, desirous of solitude. He humoured the mood, and she was still staring from the window when he left her. The woodland had melted before her into an oblivious mist. In its stead she saw a tower flaming amid naked trees, a white face staring heavenwards with the marble tranquillity of death.
X

Down through the woods of Avalon rode the Lord Flavian of Gambrevault, down towards the forest track in the grey face of the dawn. In the meadows and beyond the orchards, water shone, and towers stood mistily. The voice of Spring pulsed in the air, songs of green woods, the wild wine of violets, pavements of primrose gold. Birds piped lustily in wood and thicket, and the ascending sun lavished his glittering archery from the chariots of the clouds. The Lord Flavian was inordinately cheerful that morning, as he rode in green and red through the prophetic woods. Heart and weather were in kindred keeping, and his youth sang like a brook after April rains. The woods danced in dew. Far on its rocky hill the towers of Gilderoy would soon beckon him above the trees. Beneath the shadow of the cathedral tower stood a gabled house with gilded vanes and roofs of generous red. There in Gilderoy, in a room hung with cloth of purple and gold, white arms waited, and the bosom of a golden Helen held love like a red rose in a pool of milky spikenard. Picture a slim but muscular man with the virile figure of a young David, a keen, smooth face, a halo of brown hair, eyes eloquent as a woman's. Picture a good grey horse trapped in red and green, full of fettle as a colt, burly as a bull. Picture the ermined borderings, the jewelled clasps, brigantine of quilted velvet, fur-lined bassinet bright as a star. Youth, clean, adventurous, aglow to the last finger-tip, impetuous to the tune of thirty breaths a minute. Youth with all its
splendid waywardness, its generosities, its immense self-intoxications. Youth with the voice of a Golden Summer in its heart, and for its plume the gorgeous fires of eve. Wealth often breeds apathy and parsimonious instincts. It is the beggar whose purse bursts with joy, whose soul blazes generous red upon the clouds. As for Flavian of Gambrevault and Avalon, he was rich but no miser, proud yet not haughty, sanguine but not vicious. Like many a man inspired by an instinctive idealism, his heart ran before his reason: they not having come cheek by jowl as in later years. He was very devout, yet very worldly; very ardent, yet over hasty. Mark him then, a lovable fool in the eyes of philosophy; a cup of mingled wine, both white and red. He was a great lord; yet his serfs loved him. The Lady Duessa's parents, good folk, had been blessed with aspirations. Gambrevault and Avalon had bulked very gloriously under the steel-blue vault of pride. Moreover, their daughter was a sensuous being, who panted for poetic surroundings, and lived to music. A boy of twenty; a passionate, dark-eyed, big-bosomed houri of twenty and five; bell, book, and ring--such had been the bridal bargain consummated on church principles five years ago or more. A youth of twenty is not supremely wise concerning the world, or his own heart. The Lord Flavian's marriage had not proved a magic blessing to him. Parentally sealed marriage deeds are the edicts of the devil. Quickly are the mighty fallen, and the chalices of love broken. It was no mere chance ambuscade that waited open-mouthed for Flavian, Lord of Gambrevault and Avalon, Warden of the Southern Marches, Knight of the Order of the Rose, as he rode that morning
to Gilderoy, a disciple of Venus. In a certain perilous place, the road ran betwixt walls of rock, and under the umbrage of overhanging trees. Twenty men with pike and gisarme swarming out of the woods; a short scuffle and a stabbed horse; a gag in the mouth, a bandage over the eyes, a mule's back, half a dozen thongs of stout leather. That same evening the Lord Flavian was brought like a bale of merchandise into Fulviac's guard-room, and tumbled on a heap of straw in a corner. They were grim men, these forest rangers, not given to pity, or the light handling of a feud. A poniard point was their pet oath, a whip of the sword the best word with an enemy. They bit their thumb nails at creation, and were not gentle in the quest of a creed. Fulviac heard their news, and commended them. They were like the ogres of the old fables; the red blood of a lusty aristocrat smelt fresh for the sword's supper. The girl Yeoland was at her prayer-desk with a blazoned breviary under her fingers, when Fulviac came to her with tidings of the day's capture. She knelt with her hands crossed upon her bosom, as Fulviac stood in the darkened doorway. To the man she appeared as the Madonna in some picture of the Annunciation, the yellow light from the lamp streaming down upon her with a lustre of sanctity. They have brought the boar home. Dead? Nay; but his corpse candle walks the cavern. For the girl it was a descent from spiritual themes to the stark realism of life. She left her prayer-desk with a little sigh. Her hands trembled as she drew a scarlet cloak about her, and fastened it with
a girdle of green leather. Her eyes dwelt on Fulviac's face with a species of dusky pain. Come, he said to her. Whither? To judge him. Not before all, not in the guard-room. Leave it to me, he said. Be forewarned. We deal with no mere swashbuckler. They went together to Fulviac's parlour, where a great brazen lamp hung from the roof, and a book bound in black leather lay chained on the table. Yeoland took the man's carved chair, while he stood behind her leaning on the rail. She was paler than was her wont. Now and again she pressed a hand to her breast, as though to stay the too rapid beating of her heart. Two guards bearing partisans came in from the guard-room with a man bound and blindfold between them. A third followed, bearing a two-handed sword naked over his shoulder. He was known as Nord of the Hammer, an armourer like to a Norse Volund, burly, strong as a bear. The door was barred upon them. One of the guards plucked the cloth from the bound man's face. In the malicious imagery of thought, Yeoland had often pictured to herself this Flavian of Gambrevault, a coarse, florid ruffian, burly and brutal, a fleshly demigod in the world of feudalism. So much for conjecture. What she beheld was a straight-lipped, clean-limbed man, slim as a cypress, supple as good steel. The face was young yet strong, the grey eyes clear and fearless. Moreover there was a certain lonely look about him that invoked pity, and angered her in
an enigmatic way. She was wroth with him for being what he was, for contradicting the previous imaginings of her mind. Flavian of Gambrevault stood bound before her, an aristocrat of aristocrats, outraged in pride, yet proud beyond complaint. The self-mastery of his breeding kept him a stately figure despite his tumbling and his youth, one convinced of lordship and the powerful splendour of his name. The whole affair to him was illogical, preposterous, insolent. A gentleman of the best blood in the kingdom could not be hustled out of his dignity by the horse-play of a bevy of cut-throats. Possibly the first vision to snare the man's glance was the elfin loveliness of the girl, who sat throned in the great chair as on a judgment seat. He marked the rose-white beauty of her skin, her sapphire eyes gleaming black in certain lights, her ebon hair bound with a fillet of sky-blue leather. Moreover, it was plain to the man in turn that this damoisel in the red gown was deciphering his features with a curiosity that was no vapid virtue. As for Fulviac, he watched them both with his amber-brown eyes, eyes that missed no movement in the mask of life. To him the scene under the great brazen lamp was a study in moods and emotions. The aristocrat was the first to defy the silence. He had stared round the room at his leisure, and at each of its motionless figures in turn. The great sword, slanted in gleaming nakedness over Nord's shoulder, appeared to fascinate him for the moment. Despite his ambiguous sanctity, he showed no badge of panic or distress. Ignoring the woman, he challenged Fulviac, who leant upon the chair rail, watching him with an enigmatic smile.
Goodman in the red doublet, quoth he, when you have stared your fill at me, I will ask you to read me the moral of this fable. Fulviac stroked his chin with the air of a man who holds an adversary at some subtle disadvantage. Messire, he said, address yourself to madame--here; you are her affair in the main. The Warden of the Southern Marches bowed as by habit. His grey eyes reverted to Yeoland's face, searching it with a certain courteous curiosity that took her beauty for its justification. The woman was an enigma to him, a most magical sphinx whose riddle taunted his reason. Madame, he began. The girl stiffened in her chair at the word. You hold me at a disadvantage, seeing that I am ignorant of sin or indiscretion against you. If it is a question of gold---- Messire! He swept her exclamation suavely aside and ran on mellifluously. If it is a question of gold, let me beseech you to be frank with me. I will covenant with you instanter. My seneschal at Gambrevault will unbolt my coffers, and ease your greed. Pray be outspoken. I will renounce the delight of lodging here for a purse of good rose nobles. There was the faintest tinge of insolence in the man's voice, an insolence that exaggerated to the full the charge of plunder in his words. Whether he hinted at blood money or no, there was sufficient poison in the sneer to fire the brain and scorch the heart to vengeance.
The woman had risen from her chair, and stood gripping the carved woodwork with a passion that set her arms quivering like bands of tightened steel. The milk-white calm had melted from her face. Wrath ran riot in her blood. So large were her pupils that her eyes gleamed red. Ha, messire, I bring you to justice, and you offer me gold. The man stared; his eyes did not quail from hers. Justice, madame! Of what sin then am I accused? On my soul, I know not who you are. She calmed herself a little, shook back her hair from her shoulders, fingered her throat, breathing fast the while. My name, messire? Ha, you shall have it. I am Yeoland, daughter of that Rual of Cambremont whom you slaughtered at the gate of his burning house. I--am the sister of those fair sons whom you did to death. Blood money, forsooth! God grant, messire, that you are in honest mind for heaven, for you die to-night. The man had bent to catch her words. He straightened suddenly like a tree whose throat is loosed from the grim grip of the wind. He went grey as granite, flushed red again as a dishonoured girl. The words had touched him with the iron of truth. Hear me, he said to her. Ah, you would lie. By Heaven, no; give me an hour's justice. Murderer. Before God, you wrong me. He stood with twitching lips, shackled hands twisting one within the other. For the instant words eluded him, like fruit jerked from the
mouth of a thirst-maddened Tantalus. Anon, his manhood gathered in him, rushed forth redly like blood from a stricken throat. Daughter of Rual, hear me, I tell you the truth. I, Flavian of Gambrevault, had in my pay a company of hired 'spears,' rough devils from the north. The braggarts served me against John of Brissac, were half their service drunk and mutinous. When Lententide had come, their captain swore to me, 'Lording, pay us and let us go. We have spilt blood near Gilderoy,' scullion blood he swore, 'give us good bounty, and let us march.' So at his word I gave them largesse, and packed them from Gambrevault with pennons flying. Methought they and their brawlings were at an end. Before God and the saints, I never knew of this. Yeoland considered him, strenuous as he seemed towards truth. He was young, passionate, sanguine; for one short moment she pitied him, and pondered his innocence in her heart. It was then that Fulviac plucked at her sleeve, spoke in her ear, words that hardened her like a winter frost. She stared in the man's eyes, as she gave him his death-thrust with the sureness of hate. Blood for blood, were her words to him. Is this justice! I have spoken. Monstrously. Hear me---- Messire, make your peace with Heaven, I give you till daylight. The man stumbled against the table, white as the moon. Youth strove in him, the crimson fountain of life's wine, the wild cry of the dawn. His eyes were great with a superhuman hunger. Fulviac's strong voice answered him.
Hence, hence. At dawn, Nord, do your duty.

XI

Give doubt the password, and the outer battlements are traitorously stormed. Parley with pity, and the white banner flutters on the keep. Provided her emotions inspire her, a woman is strong; let her take to logic, and she is a rushlight wavering in the wind. In her red heart lies her divinity; her feet are of clay when reason rules her head. The girl Yeoland took doubt to her chamber that night, a malicious sprite, sharp of wit and wild of eye. All the demons of discord were loosed in the silence of the night. Pandora's box stood open, and the hours were void of sleep; faces crowded the shadows, voices wailed in the gloom. Her thoughts rioted like frightened bats fluttering and squeaking round a torch. Sleep, like a pale Cassandra, stood aloof and watched the mask of these manifold emotions. Turn and twist as she would amid her fevered pillows, a wild voice haunted her, importunate and piteous. As the cry of one sinking in a stormy sea, it rang out with a passionate vehemence. Moreover, there was a subtle echo in her own heart, a strong appeal that did not spare her, toss and struggle as she would. Decision fluttered like a wounded bird. Malevolence rushed back as an ocean billow from the bastion of a cliff that emblemed mercy. With a beating of wings and a discordant clamour, a screech-owl buffeted the casement. A lamp still burnt beneath the crucifix; the
glow had beaconed the bird out of the night. Starting up with a shiver of fear, she quenched the lamp, and crept back to bed. The darkness seemed to smother her like a cloak; the silence took to ghostly whisperings; a death-watch clicked against the wall. The night crawled on like a funeral cortège. Baffled, outfaced, sleepless, she rose from her tumbled bed, and paced the room as in a fever. Still wakefulness and a thousand dishevelled thoughts that hung about her like her snoodless hair. Again and again, she heard the distant whirr and rattle of wheels, the clangour of the wire, as the antique clock in Fulviac's chamber smote away the hours of night. Each echo of the sound seemed to spur to the quick her wavering resolution. Time was flying, jostling her thoughts as in a mill race. With the dawn, the Lord Flavian would die. Anon she flung the casement wide and stared out into the night. A calm breeze moved amid the masses of ivy, and played upon her face. She bared her breast to its breath, and stood motionless with head thrown back, her white throat glimmering amid her hair. Below, the sombre multitudes of the trees showed dim and ghostly, deep with mystery. A vague wind stirred the branches; the dark void swirled with unrest, breaking like a midnight sea upon a cliff. A few straggling stars peeped through the lattice of the sky. She leant against the sill, rested her chin upon her palms, and brooded. Thoughts, fierce, passionate, and clamorous, came crying like gusts of wind through a ruined house. Death and dead faces, blood, the yawn of sepulchres, life and the joy of it, all these passed as visions of fire before her fancy. Vengeance and pity agonised her soul. She answered yea and nay with the same breath; condemned and pardoned with contradicting zeal. Youth lifted up its face to her,
piteous and beautiful. Death reached out a rattling hand into her bosom. Presently, a far glow began to creep into the sky; a gradual greyness absorbed the shadows of the night. The day was dawning. From the forest, the trembling orisons of the birds thrilled like golden light into the air. Unutterable joy seemed to flood forth from the piping throats. Even the trees seemed to quiver to the sound. With a rush of bitter passion, she closed the casement, cast herself upon her bed, and strove to pray. Again came the impotent groping into nothingness. A dense mist seemed to rise betwixt her soul and the white face of the Madonna. Aspiration lessened like an afterglow, and dissolved away into a dark void of doubt. Prayer eluded her; the utterances of her heart died in a miserable endeavour, and she could not think. The spiritual storm wore itself away as the dawn streamed in with a glimmer of gold. Yeoland lay and stared at the casement, and the figure of Sebastian rendered radiant by the dawn, the whiteness of his limbs tongued with dusky rills of blood, where the barbs had smitten into the flesh. Sombre were the eyes, and shadowy with suffering. A halo of gold gilded the youthful face. The painted glass about him blazed like a shower of gems. The Sebastian of the casement recalled to her with wizard power the face of the man whom death claimed at dawn. The thought woke no new passion in her. The night's vigil had left her reason like a skein of tangled silk, and with the day she verged towards a wearied apathy. The voice of pity in her waned to an infrequent whisper that came like the rustling of leaves on a summer
night. She realised that it had dawned an hour or more; that the man had knelt and fallen to Nord's sword. Suddenly the silence was snapped by a far outcry sounding in the bowels of the cliff. Gruff voices seemed to echo and re-echo like breakers in a cavern. A horn blared. She heard the thudding of a door, the shrilling of mail, the clangour of iron steps passing up the gallery. Shivering, she raised herself upon her elbow to listen. Were they bringing her the man's head, grey and blood-dabbled, with closed lids and mangled neck? She fell back again upon her pillows, pressed her hands to her face with a great revulsion of pity, for the image had burnt in upon her brain. The clangour of harness drew near, with an iron rhythm as of the march of destiny. It ceased outside the door. A heavy hand beat upon the panelling. Who knocks? Her own voice, strained and shrill, startled her like an owl's hoot. Fulviac's deep bass answered her from the passage. Unbar to me, I must speak with you. She started up from the bed in passionless haste, ran to a closet, drew out a cloak and wrapped it about her shoulders. Her bare feet showed white under her night-gear as she slid the bolt from its socket, and let the man in. He was fully armed save for his salade, which he carried in the hollow of his arm. His red cloak swept his heels. A tower of steel, there was a clangorous bluster about him that bespoke action. The girl had drawn apart, shivering, and gathering her cloak about her, for in the gloom of the place she had thought for an
instant that Fulviac carried a mangled head. A rider has brought news, he said to her. John of Brissac's men have taken Prosper the Preacher, to hang him, as their lord has vowed, over the gate of Fontenaye. They are on the march home from Gilderoy, ten lances and a company of arbalestiers. I ride to ambuscado them. Prosper shall not hang! She stood with her back to the casement, and looked at him with a restless stare. Her thoughts were with the man whose grey eyes had pleaded with her through the night. Her fears clamoured like captives at the gate of a dungeon. What is more, this vagabond of Avalon has been begging twelve hours' grace to scrape his soul clean for Peter. Ah! she said, with a sudden stark earnestness. I will give him till sunset---- If I suffer it---- The dog has spirit. I would thrust no man into the dark till he has struck a bargain with his own particular saints. She drew back, sank down into a chair with her hair half hiding her face. You are right in being merciful, she said very slowly. Magic riddle of life; rare roseate rod of love. Was it youth leaping towards youth, the cry of the lark to the dawn, the crimson flowering of a woman's pity? The air seemed woven through with gold. A thousand lutes had sounded in the woods. Voiceless, she sat with flickering lids, amazed at the alchemy that had wrought ruth out of hate. Fulviac had drawn back into the gloom of the gallery. He turned suddenly upon his heel, and his scabbard smote and rang against
the rock. I take all the men I have, he said to her, even the dotard Jaspar, for he knows the ways. Gregory and Adrian I leave on guard; they are tough gentlemen, and loyal. As for the lordling, he is well shackled. Yeoland was still cowering in her chair with the mysterious passions of the moment. You will return? she asked him. By nightfall, if we prosper; as we shall. He moved two paces, stayed again in his stride, and flung a last message to her from the black throat of the passage. Remember, there is no recantation over this business. The man is my affair as well as yours. He is a power in the south, and would menace us. Remember, he must die. He turned and left her without more palaver. She heard him go clanging down the gallery, heard the thunder of a heavy door, the braying of a horn. A long while she sat motionless, still as stone, her hands lying idle in her lap. When an hour had passed, the sun smote in, and found her kneeling at her prayer-desk, her breviary dewed with tears.

XII

Fulviac passed away that morning into the forest, a shaft of red amid the mournful glooms. Colour and steel streamed after him