Efficient Processing of Deep Neural Networks. Vivienne Sze

Title: Efficient Processing of Deep Neural Networks
Author: Vivienne Sze
Series: Synthesis Lectures on Computer Architecture
Publisher: Morgan & Claypool Publishers
Year: 2020
ISBN: 9781681738338




       1.3 Development History

       1.4 Applications of DNNs

       1.5 Embedded versus Cloud

       2 Overview of Deep Neural Networks

       2.1 Attributes of Connections Within a Layer

       2.2 Attributes of Connections Between Layers

       2.3 Popular Types of Layers in DNNs

       2.3.1 CONV Layer (Convolutional)

       2.3.2 FC Layer (Fully Connected)

       2.3.3 Nonlinearity

       2.3.4 Pooling and Unpooling

       2.3.5 Normalization

       2.3.6 Compound Layers

       2.4 Convolutional Neural Networks (CNNs)

       2.4.1 Popular CNN Models

       2.5 Other DNNs

       2.6 DNN Development Resources

       2.6.1 Frameworks

       2.6.2 Models

       2.6.3 Popular Datasets for Classification

       2.6.4 Datasets for Other Tasks

       2.6.5 Summary

       PART II Design of Hardware for Processing DNNs

       3 Key Metrics and Design Objectives

       3.1 Accuracy

       3.2 Throughput and Latency

       3.3 Energy Efficiency and Power Consumption

       3.4 Hardware Cost

       3.5 Flexibility

       3.6 Scalability

       3.7 Interplay Between Different Metrics

       4 Kernel Computation

       4.1 Matrix Multiplication with Toeplitz

       4.2 Tiling for Optimizing Performance

       4.3 Computation Transform Optimizations

       4.3.1 Gauss’ Complex Multiplication Transform

       4.3.2 Strassen’s Matrix Multiplication Transform

       4.3.3 Winograd Transform

       4.3.4 Fast Fourier Transform

       4.3.5 Selecting a Transform

       4.4 Summary

       5 Designing DNN Accelerators

       5.1 Evaluation Metrics and Design Objectives

       5.2 Key Properties of DNN to Leverage

       5.3 DNN Hardware Design Considerations

       5.4 Architectural Techniques for Exploiting Data Reuse

       5.4.1 Temporal Reuse

       5.4.2 Spatial Reuse

       5.5 Techniques to Reduce Reuse Distance

       5.6 Dataflows and Loop Nests

       5.7 Dataflow Taxonomy

       5.7.1 Weight Stationary (WS)

       5.7.2 Output Stationary (OS)

       5.7.3 Input Stationary (IS)

       5.7.4 Row Stationary (RS)

       5.7.5 Other Dataflows

       5.7.6 Dataflows for Cross-Layer Processing

       5.8 DNN Accelerator Buffer Management Strategies

       5.8.1 Implicit versus Explicit Orchestration

       5.8.2 Coupled versus Decoupled Orchestration

       5.8.3 Explicit Decoupled Data Orchestration (EDDO)

       5.9 Flexible NoC Design for DNN Accelerators

       5.9.1 Flexible Hierarchical Mesh Network

       5.10 Summary

       6 Operation Mapping on Specialized Hardware

       6.1 Mapping and Loop Nests

       6.2 Mappers and Compilers

       6.3 Mapper Organization

       6.3.1 Map Spaces and Iteration Spaces

       6.3.2 Mapper Search

       6.3.3 Mapper Models and Configuration Generation

       6.4 Analysis Framework for Energy Efficiency

       6.4.1 Input Data Access Energy Cost

       6.4.2 Partial Sum Accumulation Energy Cost