Efficient Processing of Deep Neural Networks. Vivienne Sze

Title: Efficient Processing of Deep Neural Networks
Author: Vivienne Sze
Genre: Software
Series: Synthesis Lectures on Computer Architecture
Publisher: Software
isbn 9781681738338




Binary Modification: Tools, Techniques, and Applications

      Kim Hazelwood

      2011

      Quantum Computing for Computer Architects, Second Edition

      Tzvetan S. Metodi, Arvin I. Faruque, and Frederic T. Chong

      2011

      High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities

      Dennis Abts and John Kim

      2011

      Processor Microarchitecture: An Implementation Perspective

      Antonio González, Fernando Latorre, and Grigorios Magklis

      2010

      Transactional Memory, Second Edition

      Tim Harris, James Larus, and Ravi Rajwar

      2010

      Computer Architecture Performance Evaluation Methods

      Lieven Eeckhout

      2010

      Introduction to Reconfigurable Supercomputing

      Marco Lanzagorta, Stephen Bique, and Robert Rosenberg

      2009

      On-Chip Networks

      Natalie Enright Jerger and Li-Shiuan Peh

      2009

      The Memory System: You Can’t Avoid It, You Can’t Ignore It, You Can’t Fake It

      Bruce Jacob

      2009

      Fault Tolerant Computer Architecture

      Daniel J. Sorin

      2009

      The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines

      Luiz André Barroso and Urs Hölzle

      2009

      Computer Architecture Techniques for Power-Efficiency

      Stefanos Kaxiras and Margaret Martonosi

      2008

      Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency

      Kunle Olukotun, Lance Hammond, and James Laudon

      2007

      Transactional Memory

      James R. Larus and Ravi Rajwar

      2006

      Quantum Computing for Computer Architects

      Tzvetan S. Metodi and Frederic T. Chong

      2006

      Copyright © 2020 by Morgan & Claypool

      All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations in printed reviews, without the prior permission of the publisher.

      Efficient Processing of Deep Neural Networks

      Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer

       www.morganclaypool.com

      ISBN: 9781681738314 paperback

      ISBN: 9781681738321 ebook

      ISBN: 9781681738338 hardcover

      DOI 10.2200/S01004ED1V01Y202004CAC050

      A Publication in the Morgan & Claypool Publishers series

       SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE

      Lecture #50

      Series Editors: Natalie Enright Jerger, University of Toronto

      Margaret Martonosi, Princeton University

      Founding Editor Emeritus: Mark D. Hill, University of Wisconsin, Madison

      Series ISSN

      Print 1935-3235 Electronic 1935-3243

      For book updates, sign up for mailing list at

      http://mailman.mit.edu/mailman/listinfo/eems-news

       Efficient Processing of Deep Neural Networks

      Vivienne Sze, Yu-Hsin Chen, and Tien-Ju Yang

      Massachusetts Institute of Technology

      Joel S. Emer

      Massachusetts Institute of Technology and Nvidia Research

       SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE #50


       ABSTRACT

      This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, that accuracy comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of DNNs to improve key metrics (such as energy efficiency, throughput, and latency) without sacrificing accuracy or increasing hardware costs are critical to enabling the wide deployment of DNNs in AI systems.

      The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as formalization and organization of key concepts from contemporary work that provide insights that may spark new ideas.

       KEYWORDS

      deep learning, neural network, deep neural networks (DNN), convolutional neural networks (CNN), artificial intelligence (AI), efficient processing, accelerator architecture, hardware/software co-design, hardware/algorithm co-design, domain-specific accelerators

       Contents

       Preface

       Acknowledgments

       PART I Understanding Deep Neural Networks

       1 Introduction

       1.1 Background on Deep Neural Networks

       1.1.1 Artificial Intelligence and Deep Neural Networks

       1.1.2 Neural Networks and Deep Neural Networks

       1.2 Training versus Inference

       1.3