Tesla V100技术架构白皮书
- 格式:docx
- 大小:12.25 MB
- 文档页数:59
Board Specification Tesla C1060 Computing ProcessorBoardDocument Change HistoryVersion Date Responsible Description of Change01 July 10, 2008 SG, SM Preliminary Release02 July 15, 2008 SG, SM Minor text updatesUpdated Support Information section 03 September 22, 2008 SG, SM Initial ReleaseUpdated thermal informationMinor text updates04 December 8, 2008 SG, SM Updated fan flow and power information05 January 22, 2009 SG, SM Updated board powerTable of ContentsTesla C1060 Overview (1)Key Features (1)Computing Processor Description (2)Configuration (3)Mechanical Specifications (4)PCI Express System (4)Standard I/O Connector Placement (5)Internal Connectors and Headers (6)External PCI Express Power Connectors (6)4-Pin Fan Connector (9)Power Specifications (11)Power by Rail (12)Thermal Specifications (13)Thermal Qualification Summary (13)Cooling Solution (15)Support Information (17)Languages (17)Certificates and Agencies (18)Agencies (18)List of FiguresFigure 1. Tesla C1060 Block Diagram (2)Figure 2. Tesla C1060 Computing Processor Board (4)Figure 3. Tesla C1060 Bracket (No Connectors) (5)Figure 4. 6-Pin PCI Express Power Connector (7)Figure 5. 8-Pin PCI Express Power Connector (8)Figure 6. 4-Pin Fan Connector (10)Figure 7. TM72 Active Fan Sink (15)List of TablesTable 1. Board Configuration (3)Table 2. 6-Pin PCI Express Power Connector Pinout (9)Table 3. 8-Pin PCI Express Power Connector Pinout (9)Table 4. 4 Wire Thermal Control Pinout (10)Table 5. Configuration with External PCI Express Connectors (11)Table 6. Power by Rail (12)Table 7. Test Setup and Configuration (14)Table 8. Sample Thermal Results and Specification (14)Table 9. Fan Specifications and Conditions (16)Table 10. Environmental Specifications and Conditions (16)Table 11. Languages Supported (17)Tesla C1060 Overview The NVIDIA® Tesla™ C1060 computing processor board is a PCI Express 2.0 full-height (4.376 inches by 10.50 inches) form factor computing add-in card based onthe NVIDIA Tesla T10 graphics processing unit (GPU). This board is targeted ashigh-performance computing (HPC) solution for PCI Express systems.The Tesla C1060 is capable of 933 GFLOPs/s of processing performance andcomes standard with 4 GB of GDDR3 memory at 102 GB/s bandwith.Key FeaturesGPUNumber of processor cores: 240Processor core clock: 1.296 GHzVoltage: 1.1875 VPackage size: 45.0 mm × 45.0 mm 2236-pin flip-chip ball grid array (FCBGA)BoardFourteen layer printed circuit board (PCB)PCI Express 2.0 ×16 system interfacePhysical dimensions: 4.376 inches × 10.50 inches, dual slotBoard power dissipation: 187.8 WExternal ConnectorsNoneInternal Connectors and HeadersOne 6-pin PCI Express power connectorOne 8-pin PCI Express power connector4-pin fan connectorMemory800 MHz512-bit memory interface4 GB: Thirty-two pieces 32M × 32 GDDR3 136-pin BGA, SDRAMBIOS1Mbit Serial ROMComputing ProcessorDescriptionFigure 1 is a block diagram of the Tesla C1060 computing processor.Tesla C1060 Computing Processor-Figure 1. Tesla C1060 Block DiagramConfigurationThere is one configuration available (Table 1) for the Tesla C1060 board.Table 1. Board ConfigurationSpecification DescriptionGeneric SKU reference 900-20607-0000-000Chip Tesla T10 GPUPackage size: GPU 45.0 x 45.0 mmProcessor clock 1296 MHzMemory clock 800 MHzMemory size 4 GBMemory I/O 512-bit GDDR3Memory configuration 32 pcs 32M × 32 GDDR3 SDRAMExternal connectors NoneInternal connectors and headers 8-pin PCI Express power connector6-pin PCI Express power connector4-pin fan connectorBoard power 187.8 W (idle power = 57.65 W)Thermal cooling solution TM72 active fan sinkMechanical SpecificationsPCI Express SystemThe Tesla C1060 computing processor board (Figure 2) conforms to the PCI Express full height (4.376 inches by 10.50 inches) form factor.Figure 2. Tesla C1060 Computing Processor Board4.376 inches10.50 inchesStandard I/O ConnectorPlacementAs shown in Figure 3, the Tesla C1060 does not include any external I/Oconnectors.Figure 3. Tesla C1060 Bracket (No Connectors)Internal Connectors andHeadersThe Tesla C1060 board supports the following internal connectors and headers.8-pin PCI Express power connector (can be used with a 6-pin power cable)6-pin PCI Express power connector4-pin fan connectorExternal PCI Express Power ConnectorsThe Tesla C1060 is a performance-optimized, high-end board and utilizes powerfrom the PCI Express connector as well as external power connectors. The boardcan be used in two different ways.One 8-pin PCI Express power connector orTwo 6-pin PCI Express power connectorsNote:When connecting two 6-pin power cables to the two power connectors on theTesla C1060 board, ensure that both power cables come from the same powerrail. For example, the same 12 V power supply.Figure 4 and Figure 5 show the specifications, and Table 2 and Table 3 show thepinouts for the 6-pin and 8-pin external PCI Express power connectors,respectively.Figure 4. 6-Pin PCI Express Power ConnectorFigure 5. 8-Pin PCI Express Power ConnectorTable 2. 6-Pin PCI Express Power Connector PinoutPin Number DescriptionV1 +12V2 +12V3 +124 GND5 Sense6 GNDTable 3. 8-Pin PCI Express Power Connector PinoutPin Number DescriptionV1 +12V2 +12V3 +124 Sense15 GND6 Sense07 GND8 GND4-Pin Fan ConnectorThe Tesla C1060 board uses a 4-pin fan to control the fan speed of the thermalsolution. The details of the connector (P/N: PH-T-4) are given in Figure 6. Thispart is a 2.0 mm (0.079") pitch disconnectable connector.Table 4 lists the pin assignments for this connector.Figure 6. 4-Pin Fan ConnectorTable 4.4 Wire Thermal Control PinoutPin Number Description 1 PWM (to fan) 2TACH (from fan)3 +12 V4 GNDPowerSpecifications The Tesla C1060 computing processor is a performance optimized high-end board solution. Power is taken from the PCI Express host bus as well as either one 8-pin or two 6-pin PCI Express power connectors.Note:When connecting two 6-pin power cables to the two power connectors on the Tesla C1060 board, ensure that both power cables come from the same powerrail. For example, the same 12 V power supply.Without auxiliary power provided to the Tesla C1060 board, the board will not boot and LED lights on the board will light up as listed in Table 5. This table outlines the different possible scenarios as well as the resulting behaviors.Table 5. Configuration with External PCI ExpressConnectors8-pin Power Connector 6-pin PowerConnectorResultConnected(either 8-pin or 6-pin) Connected Full Power – LED light on the bracketis GREEN by default8-pin Connected Not Connected Full Power – LED light on the bracketis GREEN by default6-pin Connected Not Connected LED light is RED – board will not bootto OSNot Connected Connected LED light is RED – board will not bootto OSNot Connected Not Connected LED light is RED – board will not bootTesla C1060 Computing Processor BoardPower by RailTable 6 lists the power by rail numbers for the Tesla C1060 board.Table 6. Power by RailPEX12V PEX3V3 EX12V Total Board PowerW62.01 2.82124.25 187.82 8-pin Connected Not connected Full Power – LED light onthe bracket is GREEN bydefaultThermalSpecificationsThermal Qualification SummaryThe information contained in this summary report is intended to provide users ofthe Tesla C1060 computing processor with thermal information necessary to assistin thermal management efforts. This information is not intended to provide aspecific thermal management solution. However, it does show an approach thatresult in the reliable operation of the Tesla C1060.The product and cooling solutions used are:Device product: Tesla C1060 boardCooling solution: Fan sink solution, Cooler Master TM72 NV P/N: 580-10607-2000-000. The cooling solution assembly includes a heat sink, fan, backplate,thermal grease interface material, and screws.Result: Under the operating conditions described in the following tables, theTesla C1060 passed thermal qualification.Table 7. Test Setup and ConfigurationSystem Part ConfigurationPC Motherboard attached to a chassis frame – entireunit placed in an acrylic boxMotherboard NVIDIAnForce® 790i Ultra SLIPower Supply ThermalTake 1000 WCPU Intel Core 2 Extreme QX9650 Yorkfield 3.0 GHz12 MB L2 Cach LGA 775 130 W Quad-CoreprocessorSDRAM DDR3 1333; 2 – 1 GB OCZ memory cardsPC Operating System Windows XP 32-bitGPU Computing ProcessingBoard Tesla C1060BIOS 62.00.1E.00.00 Display Driver 177.83GPU TeslaT10Clock Speed 1.296 GHz (core) 800 MHz (mclk)Table 8. Sample Thermal Results and SpecificationTest Application Tjunction(°C)* TA(°C)** CoolingSolutionTest 1:3DMark06/GT2Firefly87 45.4GPU junctionmaximumtemperature specification under any operating conditions. 102At anyambientTemperatureFan sink solution,NV P/N 580-10607-2000-000* Junction temperature is reported by NVIDIA thermal sensor** Ambient air temperature – average of 3 sensors positioned at the inlet to the GPU fanCooling SolutionNVIDIA will utilize a CoolerMaster TM72 active fan sink (Figure 7) to cool theGPU, memories and power supply components. For fan and environmentalspecifications refer to Table 9 and Table 10.Figure 7. TM72 Active Fan SinkTable 9. Fan Specifications and ConditionsSpecifications ConditionsRated voltage 12 V DCOperating voltage 5 V ~ 13.2 V DCStart up voltage 5.0 V (at 100% duty cycle at 25 o C power on/off) Rated current 0.48 AmpRated power 5.76 WSpeed 2600±10% RPM (At 25 o C to record speed afterfan running normal. This time is about 3-5minutes)Speed (with customer’s pillow) 2600 RPM ± 10%14.08 CFM (minimum 12.67 CFM)Air flow (at zero static pressurewith customer’s pillow)11.21 mm water (minimum 9.08 mm water)Air pressure (at zero air flow withcustomer’s pillow)Acoustic noise 39.4 dB(A); maximum 43.3 dB(A)Noise (with customer’s pillow) 46.0 dB(A); Max 50.6 dB(A)Life expectance 70,000 hours continuous operation at 40 °C with15 – 65 % relative humidityTable 10. Environmental Specifications and ConditionsSpecifications ConditionsOperating temperature 0 °C to 45 °CStorage temperature All function shall be normal after 500 hours at -10 °Cto 70 °C at normal humidity with a 24 hours recoveryperiod at room temperatureOperating humidity 5% to 90 % RHStorage humidity 5% to 95 % RHJanuary 22, 2009 | BD-04111-001_v0517Support InformationLanguagesTable 11.Languages SupportedWinXPLinuxEnglish (US) x x English (UK) x Arabic xChinese, Simplified x Chinese, TraditionalxDanish x Dutch x Finnish x French x French (Canada)xGerman x Italian x Japanese x Korean x Norwegian x Portuguese (Brazil)xRussian x Spanish x Spanish (Latin America)xSwedish x Thai x NOTE: NVIDIA’s CUDA™ software is only supported in English (U.S.)Tesla C1060 Computing Processor BoardCertificates and AgenciesAgenciesBureau of Standards, Metrology, and Inspection (BSMI)C-TickConformité Européenne (CE)Federal Communications Commission (FCC)Interference-Causing Equipment Standard (ICES)Ministry of Information and Communication (MIC)Underwriters Laboratories (UL)Voluntary Control Council for Interference (VCCI)January 22, 2009 | BD-04111-001_v0518NoticeALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, ANDOTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIAMAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THEMATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT,MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes noresponsibility for the consequences of use of such information or for any infringement of patents or otherrights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject tochange without notice. This publication supersedes and replaces all information previously supplied. NVIDIACorporation products are not authorized for use as critical components in life support devices or systemswithout express written approval of NVIDIA Corporation.TrademarksNVIDIA, the NVIDIA logo, CUDA, HybridPower, nForce, SLI and Tesla are trademarks or registeredtrademarks of NVIDIA Corporation in the United States and other countries. Other company and productnames may be trademarks of the respective companies with which they are associated.Copyright© 2008, 2009 NVIDIA Corporation. All rights reserved.。
深信服 aBos 一体机技术白皮书缩写和约定英文缩写英文全称中文解释Hypervisor Hypervisor虚拟机管理器(和 VMM 同义)VMM VMM Virtual Machine Manager虚拟机监视器HA HighAvailability高可用性vMotion vMotion实时迁移DRS Distributed Resource Scheduler分布式资源调度RAID Redundant Arrays ofIndependent Disks磁盘阵列IOPS Input/Output Operations PerSecond每秒读写(I/O)操作的次数VM Virtual Machine虚拟机SDN Software Defined Network软件定义网络NFV Network FunctionVirtualization网络功能虚拟化目录1 前言 (2)1.1 边缘IT 时代变革 (2)1.2 白皮书总览 (2)2 深信服aBos 一体机架构技术 (4)2.1 aBos 超融合架构概述 (4)2.1.1 aBos 超融合架构的定义 (4)2.2 深信服aBos 超融合架构组成模块 (4)2.2.1 系统总体架构 (4)2.3 aSV 计算虚拟化 (4)2.3.1 计算虚拟化概述 (4)2.3.2 aSV 技术原理 (5)2.3.3 深信服aSV 的技术特性 (14)2.3.4 aSV 的特色技术 (17)2.4 aNet 网络设备虚拟化 (18)2.4.1 网络设备虚拟化概述 (18)2.4.2 aNET 网络虚拟化技术原理 (19)2.4.3 aNet 功能特性 (22)2.4.4 深信服aNet 的特色技术 (28)2.5 aSAN 存储虚拟化 (29)2.5.1 存储虚拟化概述 (29)2.5.2 aSAN 技术原理 (29)2.5.3 aSAN 存储数据可靠性保障 (38)2.5.4 深信服aSAN 功能特性 (38)3 深信服aBos 一体机核心价值 (40)3.1 高稳定性 (40)3.2 简化分支机构IT (40)3.3 简化运维、集中管理 (40)3.4 灵活部署、扩展性好 (40)4 超融合架构最佳实践 (41)1 前言1.1 边缘 IT 时代变革随着在物联网的构建过程中传感器的大量使用,企业客户及员工使用的移动设备和云服务的大量增加,企业边缘的概念也正在被重新定义。
Increase Mobility, Deliver Great User Experiences, andReduce Downtime with NVIDIA Virtual GPU SolutionsNVIDIA VIRTUAL GPU | BROCHURE | JAN22The financial services industry is comprised of several sectors, ranging from investment banking and trading to retail and insurance. Today, all are facing challenges to improve scalability and mobility while also meeting stringent security and regulatory compliance requirements. To stay ahead of the market and competitors, professionals in the financial services industry needto access their workspace from anywhere, on any device, with a great experience. This becomes even more difficult with the advent of Windows 10, which impacts every sector in the financial industry. For example, financial analysts and advisors routinely scroll through countless screens of data. Without graphics acceleration, common operations in business applications like scrolling through a 300-page PDF, are plagued with significant lag time which reduces productivity.The trading floors, in particular, have unique challenges that can only be solved by virtualized solutions. Traders need mobilityand are often moved around, along with their systems, to work closely with groups specializing in equity, commodities, or risk income. They also need to be up and running constantly, working on physically small desk areas. Multi-monitor support is the desired approach, as some professionals may have up to 15 applications open at a time. In addition to safeguarding data from information breaches and insider trading, data must also be preserved in the event of natural and man-made disasters to ensure the trading floor can be restored in no time.Because every second of downtime translates to lost revenue, the financial services industry is very conservative and requires stable systems. It’s not uncommon that some organizations are stillon Windows XP because they haven’t had time to upgrade their systems and cannot afford downtime.>Brokers can lose $4 million in revenue per millisecond if their electronic trading platform is 5ms behind the competition.1>The financial services industry is the one most-breached sector, with an average total cost of data breach of $18.37 million.2>The financial services sector, including banking and insurance, is the largest contributor to the desktop virtualization revenue forecast through 2020 - security being one of the major driving factors.31Quantifying the Impact of Virtual GPUs, August 2019.2 Cost of Cybercrime, 20193BusinessWire. Research and Markets: Desktop Virtualization Market - Forecast and Trends (2015 - 2020).NVIDIA VIRTUAL GPU | BROCHURE | JAN22NVIDIA VIRTUALIZATION TECHNOLOGYDELIVERS A MOBILE DIGITAL WORKPLACETHAT REDUCES DOWNTIME AND BOOSTSSECURITYFinancial services organizations are embracing virtualizationsolutions to increase mobility, ensuring anytime access to datawhile also enabling improved security. In addition, they need toincrease the quality of performance and user experiences inmodern office applications that are substantially more graphicsintensive. By adding NVIDIA virtual GPU solutions to their VDIenvironments, organizations are centralizing apps and data,delivering cost-effective VDI performance that scales. Plus, theycan provide virtual workspaces for knowledge workers, powerusers, and mobile professionals that offer improved management,security, and productivity. The benefits of virtual GPU aresignificant:>Enhance Productivity and User Experience.Financial services professionals cannow access their workspace from anywhere, on any device with a native PC-likeexperience. With graphics acceleration, financial services organizations can takefull advantage of Windows 10 and modern business apps—including key apps suchas Bloomberg and homegrown, customized apps—with significantly lower latency.In addition, NVIDIA virtualization solutions can satisfy unique financial servicesproductivity requirements such as multi-monitor support for brokerage systems andlarger frame buffers for better data visualization and pattern recognition.>Increase Manageability and Scalability.In all sectors, financial services organizationsoften have to support hundreds and thousands of users with requirements ranging,from rolling out systems to quickly resolving issues. Moreover, on trading floors,every second of downtime equates to thousands of dollars in lost revenue. Now,financial services organizations can centralize data and applications in the datacenter, delivering virtual workspaces with improved manageability, security, andperformance while reducing downtime and support costs. IT can also easily managelarge-scale virtualization deployments with end-to-end visibility of the organization’sinfrastructure and proactive monitoring.>Bolster Security and Regulatory Compliance. As a heavily regulated industry,financial services organizations must safeguard data against information breachesand insider trading or face serious consequences. By securely hosting sensitivefinancial information within the data center, organizations can improve their overallsecurity while simultaneously protecting data in the event of disaster. Not only doesvirtualization allow more users to securely access more applications, it also enablessecure work-from-anywhere workstyles.NVIDIA VIRTUAL GPU | BROCHURE | JAN22NVIDIA VIRTUAL GPU | BROCHURE | JAN22CUSTOMER EXAMPLES4Assumes cost of subscription, NVIDIA GRID vGPU software, and hardware, with three-year amortization of hardware of two M10 cards supporting 87 vApps users.KEY FINANCIAL SERVICES USER GROUPSNVIDIA VIRTUAL GPU | BROCHURE | JAN22HOW NVIDIA VIRTUAL GPU WORKSIn a VDI environment powered by NVIDIA virtual GPUs, the NVIDIA virtual GPU software is installed at the virtualization layer along with the hypervisor. This software creates virtual GPUs that let every virtual machine (VM) share the physical GPU installed on the server. The NVIDIA virtualization software includes a graphics driver for every VM. vWS includes, for example, the powerful Quadro driver. Because work that is typically done by the CPU is offloaded to the GPU, demanding engineering and creativeapplications are supported in a virtualized and cloud environment,delivering a much better user experience.WHAT MAKES NVIDIA VIRTUAL GPU POWERFULThe ability to support both compute and graphics workloads for every vGPU EXCEPTIONAL USER EXPERIENCEConsistent performance with guaranteed quality of service, whether on-premises or in the cloud PREDICTABLE PERFORMANCEThe industry's highest user density solution with 2x the user density with A16 compared to the previous generation M10, reducing the amount of hardware resources needed and lowering your TCO BEST USER DENSITYEnd-to-end management and monitoring to deliver real-time insight into GPU performance, plus broad partner integrations so you can use the tools you know and loveOPTIMAL MANAGEMENT AND MONITORING Regular cadence of new software releases that ensures you stay on top of the latest features and enhancementsCONTINUOUS INNOVATIONSupport for all major hypervisors and the most extensive portfolio of professional apps certifications with Quadro driversBROADEST ECOSYSTEM SUPPORTFor more information, visit /virtualgpu© 2022 NVIDIA Corporation. All rights reserved. NVIDIA, and the NVIDIA logo are trademarks and/or registered trademarks of NVIDIA Corporation. All company and product names are trademarks or registered trademarks of the respective owners with which they are associated. Features, pricing, availability, and specifications are all subject to change without notice.Cover by: Raiffeisenverband Salzburg reg. Gen. m. b. H., Schwarzstr . 13-15, 5024 Salzburg (https:///wiki/File:RVS_Handelsraum.jpg), …RVS Handelsraum“, https:///licenses/by/2.0/at/deed.en。
NVIDIA在中国正式发布Quadro V100,实时光线追踪和AI技术成为热点作者:齐健来源:《智能制造》 2018年第5期齐健日前,NVIDIA 在2018 中国国际视听集成设备与技术展(InfoComm China 2018)期间正式面向中国用户发布了NVIDIA 旗下搭载NVIDIARTX 技术的最新Volta 架构专业显卡NVIDIA QuadroGV100 GPU。
NVIDIA 专业可视化业务高级总监Sandeep Gupte 介绍说,NVIDIA Quadro GV100 主要有四大技术亮点:首先,基于NVIDIA Volta 架构的GPU,可提供每秒7.4 万亿次浮点运算的双精度性能、每秒14.8 万亿次浮点运算的单精度性能、以及每秒118.5 万亿次浮点运算的深度学习性能;第二,内置5120 个CUDA 内核,236 TFLOPS 张量内核和32GBHMB2 高速大容量显存;第三,双向NVLink 互联技术可以通过并联两块Quadro GPU 将显存扩展至64GB;第四就是嵌入加速人工智能的计算单元Tensor Cores。
NVIDIA Quadro GV100 适用于制造业、传媒娱乐、医疗、建筑、工程和施工、油气、科学可视化和软件开发等领域,能为创意和技术专业人士、软件开发人员及数据科学家提供更高的计算、渲染和深度学习性能,以及卓越的显存容量及可扩展性,满足新一代设计和可视化工作流程的需求。
NVIDIA Quadro GV100 首次亮相于NVIDIA 2018GTC 技术大会,其最为突出的一项技术特点就是与NVIDIARTX 技术结合,能够在运行专业设计及内容创作类应用程序的同时,实现实时的计算密集型光线追踪。
借助真实世界般的光线和物理属性,能够实现栩栩如生的光照、反射和阴影效果,并以更快的速度实现电影级画质的渲染。
真实感渲染是很多设计行业中不可或缺的要素,利用传统的视频渲染方法实现真实的阴影效果需要用到环境光遮蔽、烘焙光照等技术去模拟光影。
SCALAR i500技术白皮书中端智能磁带库平台智能化磁带库1到18台驱动器36到402个磁带槽位模块化扩展,机械臂连续运行按需提供存储容量iPLATFORM体系结构和iLAYER管理系统特点和优势:∙系结构和iLayer管理系统方案将智能管理功能集成到磁带库平台之中。
∙主动监控和远程诊断可以降低50%的服务呼叫,并且将解决问题的时间缩短了30%。
∙将灵活的、模块化的结构与单一的、连续运行的机械臂合成在一起,提高了扩展性、性能和可靠性。
∙新型的扩展模式和按需提供存储容量的增长方式使得磁带库很容易地随着数据量的增长而扩展。
∙内置对SMI-S的支持允许系统被领先的SRM工具直接管理,包括EMC ControlCenter™。
∙其设计结构可以使它很容易地与领先的磁盘备份解决方案集成。
∙Quantum的服务方案将远程诊断功能扩展到了存储生态环境当中∙带有增强服务选项的一年保修服务Scalar® i500是一个智能磁带库平台,可以为增长之中的中端存储环境带来更快、更简单和更可靠的数据保护。
Scalar i500将模块化设计和机械臂连续运行方式结合起来,提供了业界领先的扩展性、性能和可靠性。
加入了QUANTUM的iPlatform 体系结构和iLayer管理方案的设计后,Scalar i500可以让备份管理更简单。
它的主动监控和远程诊断功能可以降低50%的服务呼叫,并且将解决问题的时间缩短了30%。
其按需提供存储容量的扩展模式让它可以在不中断用户处理数据的情况下进行扩展。
而且,Scalar i500设计得可以很容易地与磁盘备份整合,使其成为下一代备份体系结构的理想磁带库产品。
拥有Scalar i500,IT经理们可以放心,他们将拥有可靠和高性能的备份以及可靠的恢复能力。
同时,无论他们的存储需求如何演进,有效的长期数据保护将会使他们轻松进入未来。
照片说明:Scalar i500将模块化设计和机械臂连续运行方式结合起来,提供了业界领先的扩展性、性能可靠性和价值。
Huawei CX110 Switch Module V100R001C10White PaperIssue06Date2017-03-27Copyright © Huawei Technologies Co., Ltd. 2017. All rights reserved.No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.Trademarks and Permissionsand other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.All other trademarks and trade names mentioned in this document are the property of their respective holders.NoticeThe purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.Huawei Technologies Co., Ltd.Address:Huawei Industrial BaseBantian, LonggangShenzhen 518129People's Republic of ChinaWebsite:About This DocumentPurposeThis document describes the E9000 CX110 GE switch module (CX110 for short) in terms ofits functions, advantages, appearance, specifications, internal networking, standards andcertifications. You can learn about the CX110 by reading this document.The product features and commands for the ethernet switching plane of the switch modulesvary according to the software version. For details, see the documents listed in the followingtable.Intended AudienceThis document is intended for:l Huawei presales engineersl Channel partner presales engineersl Enterprise presales engineersSymbol ConventionsThe symbols that may be found in this document are defined as follows:Change HistoryIssue 06 (2017-03-17)This issue is the third official release.Issue 05 (2017-02-17)This issue is the fifth official release.Issue 04 (2016-11-22)This issue is the fourth official release.Issue 03 (2016-05-12)This issue is the third official release.Issue 02 (2015-07-17)This issue is the second official release.Issue 01 (2015-02-16)This issue is the first official release.White Paper ContentsContentsAbout This Document (ii)1 Introduction (1)1.1 Function (2)1.2 Advantages (8)1.3 Appearance (9)1.4 Ports (13)1.5 Indicator (17)1.6 Internal Chassis Networking (19)1.7 Software and Hardware Compatibility (21)1.8 Technical specifications (23)2 Standards and Certifications (26)2.1 Standards Compliance (27)2.2 Certifications (29)1 Introduction About This Chapter1.1 FunctionThis topic describes the functions, protocols, and ports of the CX110 GE switch module.1.2 AdvantagesThe CX110 provides various ports (GE/10GE/40GE) and high specifications, and supportslarge data center networks, high-performance stacking, and various data center features. Inaddition, the CX110 switch module can be easily deployed and maintained.1.3 AppearanceThis topic describes the CX110 in terms of its appearance, panel, and installation positions inthe chassis.1.4 PortsThis topic describes the features, number rules, names, types, quantities, subcard numbers,and port numbers of the CX110 ports.1.5 IndicatorThis topic describes the indicators on the CX110.1.6 Internal Chassis NetworkingThis topic describes connection relationships between the CX110 and mezz modules oncompute nodes.1.7 Software and Hardware CompatibilityThis topic describes mezz modules that can work with the CX110 and pluggable modules andcables supported by ports on the CX110 panel.1.8 Technical specificationsThis topic describes the physical, environmental, power, and network switching specificationsof the CX110.1.1 FunctionThis topic describes the functions, protocols, and ports of the CX110 GE switch module.The CX110 GE switch module (CX110 for short) is a switch control unit that provides dataswitching function for service slots in the system and centrally provides service andmanagement ports for external devices.The CX110 is installed in the rear slot of the E9000 chassis and connected to compute nodes,storage nodes, management modules through the E9000 midplane. The CX110 performsswitching of internal data packets and control management packets to provide high-speed datatransmission.Table 1-1 describes the functions of the CX110.Table 1-1 GE switching plane function description1.2 AdvantagesThe CX110 provides various ports (GE/10GE/40GE) and high specifications, and supportslarge data center networks, high-performance stacking, and various data center features. Inaddition, the CX110 switch module can be easily deployed and maintained.Various Ports (GE/10GE/40GE)Underpinned by the leading hardware platform, the CX110 provides high-density ports andand a line-speed forwarding capability.The CX110 provides four 10GE ports and 12 GE electrical ports for connecting upstream toconvergence/core switches, 34 GE ports for interconnecting with high-performance computenodes, and two 40GE ports for interconnecting with and stacking switch boards.High Specifications and Support for Large Data Center NetworksThe CX110 provides the highest specifications in the industry. It supports a maximum of131,072 MAC addresses, a maximum of 16,384 forwarding information bases (FIBs), and amaximum of 4,096 multicast prefix tables.High-Performance Stacking, Easy Deployment and MaintenanceThe CX110 supports stacking of four devices. It has the following advantages:l High performance: A single stacking system can provide eight 10GE and 24 GE uplink ports (two devices are stacked).l High bandwidth: The CX110 supports 80GE stacking bandwidth. The stacking system has no bandwidth bottlenecks.l Easy deployment and maintenance:–Pre-deployment and offline configuration are supported. The system can be pre-planned and pre-configured. Devices can be added as required, supporting plug andplay and Pay As You Grow.–The slot ID of a device is the ID in a stacking system, facilitating deviceidentification and maintenance.–Indicators on the front panel indicate the role and status of a stacking system. Thestacking system can be maintained without a terminal.l Simple upgrade operations: The stacking system supports quick and automatic software upgrades, simplifying upgrade operations and reducing upgrade workload.Various Data Center Featuresl Virtual/virtual machine (VM) access–Supports virtualized servers, improving data center utilization.–Supports virtual resource discovery. During migration of VMs, VM networkpolicies can be automatically migrated using the virtual resource discovery functionso that network resources can be allocated as required. Working with the large-scalelayer 2 network, VMs can be freely migrated inside the whole data center.l Transparent Interconnection of Lots of Links (TRILL) protocol–Complying with the Internet Engineering Task Force (IETF) standard, the TRILLprotocol supports ultra-large networks and flexible networking modes.–The TRILL protocol supports load balancing by paths, so that traffic can be sharedbetween multiple paths according to service requirements.–The TRILL protocol supports sub-second network convergence. Any changes onthe network can be quickly sensed and then fast convergence is performed.1.3 AppearanceThis topic describes the CX110 in terms of its appearance, panel, and installation positions inthe chassis.AppearanceFigure 1-1 shows the CX110.Figure 1-1 AppearanceInstallation PositionsThe CX110 can be installed in the four slots at the rear of the E9000 chassis. The four slots are 1E, 2X, 3X, and 4E, as shown in Figure 1-2.Figure 1-2 Installation positions and slot numberingPanelFigure 1-3 shows the CX110 panel.Figure 1-3 Panel1Product model 2Customization label (with an ESN label)3Stacking status indicator 4Health status indicator 5Offline button/indicator 6BMC serial port 7GE electrical port810GE optical port9Data transmission status indicator of the 10GE optical port 10Connection status indicator of the 10GE optical port 11GE electrical port indicator12SYS serial portThe numbers on the left side are port serial numbers. The arrow direction of a triangle indicates the direction of a port.ESNsAn Equipment Serial Number (ESN) is a string that uniquely identifies a server. An ESN is required when you apply for technical support from Huawei.Figure 1-4 shows the ESN format.Figure 1-4 ESN example1.4 PortsThis topic describes the features, number rules, names, types, quantities, subcard numbers,and port numbers of the CX110 ports.The CX110 Ethernet ports are numbered in Slot number/Subcard number/Port numberformat.l Slot number: indicates the slot number of the current switch module. The value ranges from 1 to 4, mapping to 1E, 2X, 3X, and 4E slot respectively from left to right on thepanel.l Subcard number: indicates the number of a subcard supported by service ports. Thevalue ranges from 1 to 20. Table 1-2 and Table 1-3 describe subcard numbers.l Port number: indicates the sequence number of a port on a subcard. Table 1-2 and Table 1-3 describe port numbers and subcards.For example, if the CX110 is in slot 2X, the first GE port on the upper right on the panel isnumbered as GE 2/17/12, as shown in Figure 1-5.Figure 1-5 Port naming rulesTable 1-2 describes the external ports on the CX110.Table 1-2 External portsTable 1-3 describes the internal ports on the CX110. Table 1-3 Internal ports1.5 IndicatorThis topic describes the indicators on the CX110.You can observe the indicators to determine the current operating status of the CX110. Table1-4 describes the indicators.Table 1-4 Indicator description1.6 Internal Chassis NetworkingThis topic describes connection relationships between the CX110 and mezz modules oncompute nodes.For details about the networking of the CX110 and Mezz cards on compute nodes, see E9000Blade Server Mezz Module-Switch Module Interface Mapping Tool.Figure 1-6 shows the internal chassis networking for the CX110 and compute nodes. Ports oncompute nodes for connecting to the CX110 are provided by two mezz modules as follows:l Mezz 1 connects to GE switching planes of the CX110s in slots 2X and 3X.l Mezz 2 connects to GE switching planes of the CX110s in slots 1E and 4E.Figure 1-6 Mapping between the CX110 and mezz modules on compute nodesThe following describes the mapping between the CX110s and mezz modules. For example,the CX110s are installed in slots 2X and 3X and connect to Mezz 1.Port Mapping Between a Switch Module and a Mezz ModuleMapping between the CX110 and ports on the MZ110The MZ110 provides four GE ports, including ports 1, 2, 3, and 4. Ports 1 and 2 map to theGE switching plane of the CX110 in slot 2X, and ports 3 and 4 map to the GE switching planeof the CX110 in slot 3X, as shown in Figure 1-7.Figure 1-7 Mapping between the CX110 and ports on the MZ110Mapping between the CX110 and ports on the MZ111The MZ111 provides four GE ports, including ports 1, 2, 3, and 4. Ports 1 and 3 map to theGE switching plane of the CX110 in slot 2X, and ports 2 and 4 map to the GE switching planeof the CX110 in slot 3X, as shown in Figure 1-8.Figure 1-8 Mapping between the CX110 and ports on the MZ1111.7 Software and Hardware CompatibilityThis topic describes mezz modules that can work with the CX110 and pluggable modules andcables supported by ports on the CX110 panel.For details about the software and hardware that are compatible with the CX110, see HuaweiServer Compatibility Checker.Supported Mezz ModulesThe CX110 connects to mezz modules of compute nodes. Table 1-5 describes models andspecifications of the supported mezz modules.Table 1-5 Supported mezz modulesSupported Pluggable Modules and CablesTable 1-6 Supported pluggable modules and cablesCX110 supports multiple pluggable optical modules, fibers, and network cables. You canchoose the modules and cables based on site requirements.l The CX110 provides the following functions for uplink GE applications:–Provides SFP+ optical ports and supports single-mode and multi-mode SFP opticalmodules.–Provides RJ45 ports, supports 10/100/1000 Mbit/s autonegotiation, and uses twistedpair cables for connection.l The CX110 provides the following functions for uplink 10GE applications:–Provides SFP+ optical ports and supports single-mode and multi-mode SFP+optical modules.–Supports SFP+ 10GE cables, which can be 7 m or 10 m active high-speed cables or1 m, 3 m, or 5 m passive high-speed cables.1.8 Technical specificationsThis topic describes the physical, environmental, power, and network switching specificationsof the CX110.Table 1-7 describes the technical specifications of the CX110, and Table 1-8 describes thenetwork switching specifications of the CX110.Table 1-7 Technical specificationsTable 1-8 Network switching specificationsWhite Paper 1 Introduction2 Standards and Certifications About This Chapter2.1 Standards ComplianceThis topic describes the international and industrial standards and communication protocolsthat the CX110 complies with.2.2 CertificationsThis topic describes the certifications that the E9000 has passed.2.1 Standards ComplianceThis topic describes the international and industrial standards and communication protocolsthat the CX110 complies with.International StandardsTable 2-1 lists the international standards.Table 2-1 Standards and protocol complianceIndustrial StandardsTable 2-2 lists the industrial standards.Table 2-2 Industrial standardsCommunication ProtocolsTable 2-3 lists the communication protocols.Table 2-3 Communication protocols2.2 CertificationsThis topic describes the certifications that the E9000 has passed.Table 2-4 lists the certifications.Table 2-4 Certifications。
NVIDIA TESLA V100 GPUGPUAI HPCNVIDIA Tesla V100 GPU ................................................................................................. 1 Tesla V100 AIHPC (2) (2) (5)NVIDIA GPU 6 (6)GPU (7)GV100 GPU (8) (11)Volta ................................................................................................................................ 12 Tensor .......................................................................................................................... 14 L1 (17)FP32INT32 (18) (18)NVLink (19) (19) (20)HBM2 ............................................................................................................................... 22 ECC (23) (23)Tesla V100 ........................................................................................................................ 24 GV100 CUDA .. (26) (27)NVIDIAGPU SIMT (27)Volta SIMT (28) (30)VOLTA (31) (33) (34)........................................................................................................................... 37 Tesla V100NVIDIA DGX-1 (38)NVIDIA DGX-1 .................................................................................................................. 39 DGX-1 .............................................................................................................................. 40 NVIDIA DGX-AI (42) (44).................................................................................................................. 44 .. (45) (45)NVIDIA GPU (48) (49)...................................................................................... 50 (51)............................................................................................................. 52 .. (53) (54)AB AI CGPU84 SMVolta GV100 GPU NVIDIA Tesla V100 SXM2 (1)2. Tesla V100 ................................................................................................... 4 TensorTesla V100 (5)Volta GV100 GPU (9)5. Volta GV100 (SM) (13)6. cuBLAS (FP32) (14)7. cuBLAS FP16FP32 (15)8. Tensor4x4 (15)9. Tensor ........................................................................................................ 16 10. Pascal Volta 4x4 ...................................................................................... 16 11. Pascal Volta .. (17)V100DGX-1”NVLink ...................................... 20 13. V100 NVLinkGPUGPUGPUCPU (21)NVLink ............................................................................................. 21 15. V100HBM2P100 (22)16. Tesla V100 .................................................................................................. 24 17. Tesla V100 .................................................................................................. 24 18. NVIDIA Tesla V100 SXM2 - . (25)CUDA (26)20. PascalGPUSIMT (27)21. Volta .......................................................................................................... 28 22. Volta . (29) (29)1. 3. 4.12. “ 14. 19.23. (30)25. Pascal MPSVoltaMPS (32)Volta MPS (33) (36)28. NVIDIA DGX-1 ............................................................................................. 38 GP100DGX-13 (39)30. NVIDIA DGX-1 ............................................................................................. 41 Tesla V100 DGX Station.. (42)32. NVIDIA DGX 47 ...................................................... 43 (46) (48)................................................................................................49 (50).............................................................................................51 NVIDIA . (52)39. NVIDIA DriveNet (53)1. NVIDIA TeslaGPU (10)GK180 GM200 GP100GV100 (18)3. NVIDIA DGX-1 (39)4. DGX ........................................................................................................... 4 324. 26. 27.29.31.33. 34. 35. 36. 37. 38.2.NVIDIA TESLA V100 GPUTesla V100Volta GV100 GPUVolta GV100 GPU NVIDIA Tesla V100 SXM210CUDA GPUNVIDIA® GPUGPUNVIDIA GPU NVIDIA GPU(HPC) (AI)NVIDIA GPUNVIDIA® Tesla® V100 GV1001Pascal GP100 GPUVolta GV100 GPUHPC1.TESLA V100 AIHPCNVIDIA Tesla V100GV100 GPU 211815 mm 2 NVIDIATSMC 12 nm FFN (FinFET NVIDIA)Pascal GPU GV100GPUGV100GPUGV100Tesla V100④HPC(SM) Volta GPU PascalSMVolta SM 50%Tensor TFLOPSFP32 FP6412TFLOPS6VoltaSMVolta L1④④④④④④ NVIDIA NVLink™NVIDIA NVLinkGPU/CPUGPU Volta GV100NVLinkNVLink 300GB/s GP100NVLinkNVLink160 GB/s IBM Power 9 CPU CPU V100 NVIDIA DGX-1 AIHBM2VoltaSamsung HBM21.5 16 GB HBM2Volta900 GB/sPascal GP10095%VoltaVolta (MPS) Volta GV100GPUCUDA MPS(QoS) Volta MPS48 MPS 3Pascal16 VoltaGV100IBM Power (ATS) GPU CPUTesla V100 300 W TDPTesla V100GPUAPICUDA 9NVIDIA GPUCUDAPascal VoltaKeplerAPI Volta④VoltaCaffe2 MXNet CNTK TensorFlowVolta Volta GPUcuDNN cuBLAS TensorRT Volta GV100(HPC) NVIDIA CUDA 9.0API Volta2 Tesla V1002.Tesla V100Tesla V100 AIHPCHPCTesla V1003Tesla V100Tensor④ 7.8 TFLOPS 1 (FP64)④ 15.7 TFLOPS 1 (FP32)④ 125 Tensor TFLOPS 1Tensor Tesla V1001 基于 GPU 加速时钟.AI3.NVIDIA GPU -GPU GPU GPUNVIDIA Pascal GPUCPU NVIDIA Tesla V100 GPUNVLink GPUGPU NVIDIA GPUAIAI(DNN)45 CGPUNVIDIA GPUNVIDIA GPU CPUGPUCPU GPUGPUTFLOPSGPUVoltaAlexNet650,0002012 ImageNet(CNN)ResNet-152150Pascal GP100 GPUGV100 GPU G (TPC)(SM)● 14SM ④● 64 INT32● 32 FP64 ● 8 Tensor● 4④ 8512 409684 SMGV100 GPU 5376FP32 5376 INT322688 FP64672Tensor 336HBM2 DRAMGV100 GPU6144 KBL2484SM④ 6GPC GPC ● 7TPC NVIDIA Tesla V100HPCVolta GV100 GPUGV100GV100 GPUPU (GPC)GV100 GPUSM84Volta SM SM ● 64 FP32GV100 GPU GV100 Tesla V100 80 SM 1 NVIDIA Tesla GPU4. 84 SMVolta GV100 GPUNVIDIA TeslaGPUTeslaTesla K40Tesla M40Tesla P100Tesla V100GPU GK180 (Kepler) GM200 (Maxwell) GP100 (Pascal) GV100 (Volta) SM 15 24 56 80 TPC 15 24 28 40 FP32 /SM 192 128 64 64 FP32 /GPU 2880 3072 3584 5120 FP64 /SM 64 4 32 32 FP64 /GPU 960 96 1792 2560 Tensor /SM NA NA NA 8 Tensor /GPU NANA NA 640 GPU810/875 MHz 1114 MHz 1480 MHz 1530 MHz FP32 TFLOPS 1 5 6.8 10.6 15.7 FP64 TFLOPS1 1.7 0.21 5.3 7.8 Tensor TFLOPS1NA NA NA 125240 192 224 320384GDDR5 384GDDR5 4096 HBM24096 HBM212 GB24 GB16 GB 16 GB L21536 KB 3072 KB 4096 KB 6144 KB/SM 16 KB/32 KB/48 KB 96 KB 64 KBKB 96/SM256 KB 256 KB 256 KB 256KB/GPU 3840 KB 6144 KB 14336 KB 20480 KB TDP235 W250 W300 W300 W7180 153 211GPU 551 mm 601 610 mm815 mm28 nm28 nm 16 nm FinFET+ 12 nm FFN1TFLOPSGPU1.100 Tesla V100(TDP)NVIDIA GPUTesla VTesla V100V100// 50%60%TDPGPUGPU75%85%NVMLNVIDIA-SMICAPITesla OEMGPU Tesla V100300 W TDPGPUTesla V100300 WVOLTAVolta(SM)④TensorGP10012TFLOPS④ L1SIMT SIMDGV100 SM Pascal GP100 SM GV100 GPU SML1Volta SM96 KbGP10064 KB④ 50%Pascal GP100 GV100 SMGV100 SM SM 1664FP3232 FP64 GP100 SM32FP32 Tensor8FP64FP32 128 KB16L0FP64GV100 SM 16INT32 L0 64 KB NVIDIA GPU5Volta SM④SIMTGPU GV1005.Volta GV100 (SM)TensorTensorVolta GV100 GPUTesla V100 GPU 640Tensor2 Volta GV100Tensor SM 8 SM 64 FMASM8Tensor512FMA 1024Tesla V100TensorFP32 Tesla V100TFLOPS6TFLOPSP100125 Tensor TFLOPS TensorFP16P100 12V100 Tensor相比于配备 CUDA 8 的 Tesla P100,单精度 (FP32) 矩阵-矩阵乘法在配备 CUDA 9 的 Tesla V100 上的速度快 1.8 倍cuBLAS (FP32)NVIDIA Maxwell Kepler Tesla P100- (GEMM)6CUDA 8 Tesla P100CUDA 9 Tesla V10071.8FP32VoltaTensorFP16 P100 96.cuBLAS FP16 FP32相比于配备 CUDA 8 的 Tesla P100 上的 FP32 矩阵乘法,混合度矩阵-矩阵乘法在配备 CUDA 9 的Tesla V100 上的速度快 9 倍TensorTensor 4x4D = A B + CA B CD C FP16D 4x4 FP328A B FP168Tensor4x4TensorFP16FP32FP16 4x4x49FP32Tensor7.8.Tensor10. PascalVolta 4x410 4x44x44x4 Tensor64Volta V100Pascal Tesla P10012Volta TensorCUDA 9 C++ APICUDA-C++API TensorCUDA16x16 32 Tensor CUDA-C++cuBLAScuDNN NVIDIACaffe2Tensor MXNetNVIDIATensor TensorVolta GPU 9.L1Volta 的 L1 数据缓存可缩小将数据保存于共享内存(需手动调整)的应用程序与直接访问设备内存数据的应用程序之间的差距。
NVIDIA TESLA V100 GPUGPUAI HPCNVIDIA Tesla V100 GPU ................................................................................................. 1 Tesla V100 AIHPC (2) (2) (5)NVIDIA GPU 6 (6)GPU (7)GV100 GPU (8) (11)Volta ................................................................................................................................ 12 Tensor .......................................................................................................................... 14 L1 (17)FP32INT32 (18) (18)NVLink (19) (19) (20)HBM2 ............................................................................................................................... 22 ECC (23) (23)Tesla V100 ........................................................................................................................ 24 GV100 CUDA .. (26) (27)NVIDIAGPU SIMT (27)Volta SIMT (28) (30)VOLTA (31) (33) (34)........................................................................................................................... 37 Tesla V100NVIDIA DGX-1 (38)NVIDIA DGX-1 .................................................................................................................. 39 DGX-1 .............................................................................................................................. 40 NVIDIA DGX-AI (42) (44).................................................................................................................. 44 .. (45) (45)NVIDIA GPU (48) (49)...................................................................................... 50 (51)............................................................................................................. 52 .. (53) (54)AB AI CGPU84 SMVolta GV100 GPU NVIDIA Tesla V100 SXM2 (1)2. Tesla V100 ................................................................................................... 4 TensorTesla V100 (5)Volta GV100 GPU (9)5. Volta GV100 (SM) (13)6. cuBLAS (FP32) (14)7. cuBLAS FP16FP32 (15)8. Tensor4x4 (15)9. Tensor ........................................................................................................ 16 10. Pascal Volta 4x4 ...................................................................................... 16 11. Pascal Volta .. (17)V100DGX-1”NVLink ...................................... 20 13. V100 NVLinkGPUGPUGPUCPU (21)NVLink ............................................................................................. 21 15. V100HBM2P100 (22)16. Tesla V100 .................................................................................................. 24 17. Tesla V100 .................................................................................................. 24 18. NVIDIA Tesla V100 SXM2 - . (25)CUDA (26)20. PascalGPUSIMT (27)21. Volta .......................................................................................................... 28 22. Volta . (29) (29)1. 3. 4.12. “ 14. 19.23. (30)25. Pascal MPSVoltaMPS (32)Volta MPS (33) (36)28. NVIDIA DGX-1 ............................................................................................. 38 GP100DGX-13 (39)30. NVIDIA DGX-1 ............................................................................................. 41 Tesla V100 DGX Station.. (42)32. NVIDIA DGX 47 ...................................................... 43 (46) (48)................................................................................................49 (50).............................................................................................51 NVIDIA . (52)39. NVIDIA DriveNet (53)1. NVIDIA TeslaGPU (10)GK180 GM200 GP100GV100 (18)3. NVIDIA DGX-1 (39)4. DGX ........................................................................................................... 4 324. 26. 27.29.31.33. 34. 35. 36. 37. 38.2.NVIDIA TESLA V100 GPUTesla V100Volta GV100 GPUVolta GV100 GPU NVIDIA Tesla V100 SXM210CUDA GPUNVIDIA® GPUGPUNVIDIA GPU NVIDIA GPU(HPC) (AI)NVIDIA GPUNVIDIA® Tesla® V100 GV1001Pascal GP100 GPUVolta GV100 GPUHPC1.TESLA V100 AIHPCNVIDIA Tesla V100GV100 GPU 211815 mm 2 NVIDIATSMC 12 nm FFN (FinFET NVIDIA)Pascal GPU GV100GPUGV100GPUGV100Tesla V100④HPC(SM) Volta GPU PascalSMVolta SM 50%Tensor TFLOPSFP32 FP6412TFLOPS6VoltaSMVolta L1④④④④④④ NVIDIA NVLink™NVIDIA NVLinkGPU/CPUGPU Volta GV100NVLinkNVLink 300GB/s GP100NVLinkNVLink160 GB/s IBM Power 9 CPU CPU V100 NVIDIA DGX-1 AIHBM2VoltaSamsung HBM21.5 16 GB HBM2Volta900 GB/sPascal GP10095%VoltaVolta (MPS) Volta GV100GPUCUDA MPS(QoS) Volta MPS48 MPS 3Pascal16 VoltaGV100IBM Power (ATS) GPU CPUTesla V100 300 W TDPTesla V100GPUAPICUDA 9NVIDIA GPUCUDAPascal VoltaKeplerAPI Volta④VoltaCaffe2 MXNet CNTK TensorFlowVolta Volta GPUcuDNN cuBLAS TensorRT Volta GV100(HPC) NVIDIA CUDA 9.0API Volta2 Tesla V1002.Tesla V100Tesla V100 AIHPCHPCTesla V1003Tesla V100Tensor④ 7.8 TFLOPS 1 (FP64)④ 15.7 TFLOPS 1 (FP32)④ 125 Tensor TFLOPS 1Tensor Tesla V1001 基于 GPU 加速时钟.AI3.NVIDIA GPU -GPU GPU GPUNVIDIA Pascal GPUCPU NVIDIA Tesla V100 GPUNVLink GPUGPU NVIDIA GPUAIAI(DNN)45 CGPUNVIDIA GPUNVIDIA GPU CPUGPUCPU GPUGPUTFLOPSGPUVoltaAlexNet650,0002012 ImageNet(CNN)ResNet-152150Pascal GP100 GPUGV100 GPU G (TPC)(SM)● 14SM ④● 64 INT32● 32 FP64 ● 8 Tensor● 4④ 8512 409684 SMGV100 GPU 5376FP32 5376 INT322688 FP64672Tensor 336HBM2 DRAMGV100 GPU6144 KBL2484SM④ 6GPC GPC ● 7TPC NVIDIA Tesla V100HPCVolta GV100 GPUGV100GV100 GPUPU (GPC)GV100 GPUSM84Volta SM SM ● 64 FP32GV100 GPU GV100 Tesla V100 80 SM 1 NVIDIA Tesla GPU4. 84 SMVolta GV100 GPUNVIDIA TeslaGPUTeslaTesla K40Tesla M40Tesla P100Tesla V100GPU GK180 (Kepler) GM200 (Maxwell) GP100 (Pascal) GV100 (Volta) SM 15 24 56 80 TPC 15 24 28 40 FP32 /SM 192 128 64 64 FP32 /GPU 2880 3072 3584 5120 FP64 /SM 64 4 32 32 FP64 /GPU 960 96 1792 2560 Tensor /SM NA NA NA 8 Tensor /GPU NANA NA 640 GPU810/875 MHz 1114 MHz 1480 MHz 1530 MHz FP32 TFLOPS 1 5 6.8 10.6 15.7 FP64 TFLOPS1 1.7 0.21 5.3 7.8 Tensor TFLOPS1NA NA NA 125240 192 224 320384GDDR5 384GDDR5 4096 HBM24096 HBM212 GB24 GB16 GB 16 GB L21536 KB 3072 KB 4096 KB 6144 KB/SM 16 KB/32 KB/48 KB 96 KB 64 KBKB 96/SM256 KB 256 KB 256 KB 256KB/GPU 3840 KB 6144 KB 14336 KB 20480 KB TDP235 W250 W300 W300 W7180 153 211GPU 551 mm 601 610 mm815 mm28 nm28 nm 16 nm FinFET+ 12 nm FFN1TFLOPSGPU1.。