MIAOW_Architecture_Whitepaper
- 格式:pdf
- 大小:464.81 KB
- 文档页数:12
MIAOWWhitepaper
HardwareDescriptionandFourResearchCaseStudies
Abstract
GPUbasedgeneralpurposecomputingisdevelopingas
aviablealternativetoCPUbasedcomputinginmanydo-
mains.Today’stoolsforGPUanalysisincludesimulatorslike
GPGPU-Sim,Multi2SimandBarra.Whileusefulformodeling
first-ordereffects,thesetoolsdonotprovideadetailedview
ofGPUmicroarchitectureandphysicaldesign.Further,as
GPGPUresearchevolves,designideasandmodificationsde-
manddetailedestimatesofimpactonoverallareaandpower.
Fueledbythisneed,weintroduceMIAOW,anopensourceRTL
implementationoftheAMDSouthernIslandsGPGPUISA,ca-
pableofrunningunmodifiedOpenCL-basedapplications.We
presentourdesignmotivatedbyourgoalstocreatearealistic,
flexible,OpenCLcompatibleGPGPUcapableofemulatinga
fullsystem.WefirstexploreifMIAOWisrealisticandthenuse
fourcasestudiestoshowthatMIAOWenablesthefollowing:
physicaldesignperspectiveto“traditional”microarchitecture,
newtypesofresearchexploration,validation/calibrationof
simulator-basedcharacterizationofhardware.Thefindings
andideasarecontributionsintheirownright,inadditionto
MIAOW’sutilityasatoolforothers’research.
1.Introduction
ThereisactiveandwidespreadongoingresearchonGPU
architectureandmorespecificallyonGPGPUarchitecture.
Toolsarenecessaryforsuchexplorations.First,wecompare
andcontrastGPUtoolswithCPUtools.
OntheCPUside,toolsspanperformancesimulators,em-
ulators,compilers,profilingtools,modelingtools,andmore
recentlyamultitudeofRTL-levelimplementationsofmicro-
processors-theseincludeOpenSPARC[39],OpenRISC[38],
IllinoisVerilogModel[56],LEON[18],andmorerecently
FabScalar[11]andPERSim[7].Inotherefforts,cleanslate
CPUdesignshavebeenbuilttodemonstrateresearchideas.
TheseRTL-levelimplementationsallowdetailedmicroarchi-
tectureexploration,understandingandquantifyingeffectsof
areaandpower,technology-drivenstudies,prototypebuilding
studiesonCPUs,exploringpower-efficientdesignideasthat
spanCADandmicroarchitecture,understandingtheeffects
oftransientfaultsonhardwarestructures,analyzingdi/dt
noise,andhardwarereliabilityanalysis.Somespecificexam-
pleresearchideasincludethefollowing:Argus[30]showed–
withaprototypeimplementationonOpenRISChowtobuild
lightweightfaultdetectors;Blueshift[19]andpowerbal-
ancedpipelines[46]considertheOpenRISCandOpenSPARC
pipelinesfornovelCAD/microarchitecturework.
OntheGPUside,anumberofperformancesimula-
tors[5,2,12,28],emulators[53,2],compilers[29,13,54],
Ultra-threadedDispatcher
CU
L2 Cache
M0CU
CU
CU
M1M2CU
CU
CU
CU
(a) GPU ArchitectureFetch Decode Schedule
LSUVector ALUInteger+FPScalar ALU
Vector General Purpose Registers (VGPR)Scalar GPRDispatcher
Memories
(b) Compute UnitHost Processor
Device
Memory
LDS
MemoryLogicIPMemory Controller
Figure1:CanonicalGPUOrganization
profilingtools[13,37],andmodelingtools[22,21,47,26]
areprevalent.HoweverRTL-levelimplementationsandlow-
leveldetailedmicroarchitecturespecificationislacking.As
discussedbyothers[10,52,42,23,32],GPUsarebeginning
toseemanyofthesametechnologyanddevice-reliabilitychal-
lengesthathavedrivensomeoftheaforementionedRTL-based
CPUresearchtopics.ThelackofanRTLlevelimplementa-
tionofaGPUhamperssimilareffortsintheGPUspace.As
CPUapproachesdonotdirectlytranslatetoGPUs,adetailed
explorationofsuchideasisrequired.Hence,wearguethatan
RTLlevelGPUframeworkwillprovidesignificantvaluein
explorationofnovelideasandisnecessaryforGPUevolution
complementingthecurrenttoolsecosystem.
Thispaperreportsonthedesign,development,character-
ization,andresearchutilityofanRTLimplementationofa
GPGPUcalledMIAOW(acronymizedandanonymizedfor
blindreviewasMany-coreIntegratedAcceleratorOfthe
Waterdeep).Figure1showsacanonicalGPGPUarchitec-
tureresemblingwhatMIAOWtargets(borrowedfromthe
AMDSIspecification,wedefineafewGPUtermsinTable3).
Specifically,wefocusonadesignthatdeliversGPUcom-
putecapabilityandignoresgraphicsfunctionality.MIAOWis
drivenbythefollowingkeygoalsandnon-goals.
GoalsTheprimarydrivinggoalsforMIAOWare:i)Re-
alism:itshouldbearealisticimplementationofaGPUre-
semblingprinciplesandimplementationtradeoffsinindustry
GPUs;ii)Flexible:itshouldbeflexibletoaccommodate
researchstudiesofvarioustypes,theexplorationofforward-
lookingideas,andformanend-to-endopensourcetool;iii)
Software-compatible:Itshouldusestandardandwidelyavail-
ablesoftwarestackslikeOpenCLorCUDAcompilerstoen-
ableexecutingvariousapplicationsandnotbetiedtoin-house
compilertechnologiesandlanguages.