Article title
Thermal-aware 3D Microarchitectural Floorplanning
Table of contents
Introduction
Problem formulation
3D thermal analysis
Simulation infrastructure
Thermal-aware 3D partitioning and floorplanning
Conclusion
Excerpt from the article
The 3D floorplanner produces a new floorplan for the current thermal profile and the module netlist, subject to the constraints. The new floorplan may have different interconnect lengths between modules. The interconnect power is therefore recomputed from the new lengths and added to the previously accumulated dynamic power consumption.
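The power update described in this excerpt can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: interconnect power is assumed to grow linearly with Manhattan wirelength and with profiled activity, vertical (inter-layer) distance is ignored, and the names `Net`, `interconnect_power`, `total_power`, and `power_per_unit_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Net:
    """A two-pin connection between modules (hypothetical structure)."""
    src: str
    dst: str
    activity: float  # profiled accesses per cycle; assumed to be given


def interconnect_power(nets, positions, power_per_unit_length):
    """Estimate interconnect power for one floorplan.

    Assumption: power is proportional to Manhattan length times activity.
    The excerpt does not state the actual power model.
    """
    total = 0.0
    for net in nets:
        x1, y1 = positions[net.src]
        x2, y2 = positions[net.dst]
        length = abs(x1 - x2) + abs(y1 - y2)
        total += power_per_unit_length * length * net.activity
    return total


def total_power(dynamic_power, nets, positions, power_per_unit_length):
    """Add the recomputed interconnect power to the accumulated dynamic power."""
    return dynamic_power + interconnect_power(
        nets, positions, power_per_unit_length
    )
```

Calling `total_power` again with the positions of each new floorplan gives the updated power figure that would be handed back to the thermal analysis.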
Thermal-aware 3D Microarchitectural Floorplanning

Mongkol Ekpanyapong, Michael B. Healy, Chinnakrishnan S. Ballapuram, Sung Kyu Lim, and Hsien-Hsin S. Lee
School of Electrical and Computer Engineering
Georgia Institute of Technology

Gabriel H. Loh
College of Computing
Georgia Institute of Technology

Abstract: Next generation deep submicron processor design will need to take into consideration many performance limiting factors. Flip-flops are inserted in order to prevent global wire delay from becoming nonlinear, enabling deeper pipelines and higher clock frequency. The move to 3D ICs will also likely be used to further shorten wirelength. This will cause thermal issues to become a major bottleneck to performance improvement. In this paper we propose a floorplanning algorithm which takes into consideration both thermal issues and profile-weighted wirelength using mathematical programming. Our profile-driven objective improves performance by 20% over the wirelength-driven objective, while the thermal-driven objective improves temperature by 24% on average over the profile-driven case.

I. INTRODUCTION

In next generation deep submicron processor design it is likely that repeaters will be inserted frequently on global wires to prevent wire delay from becoming non-linear [1]. Flip-flop insertion is a technique used to alleviate the impact of wire delay to achieve a target clock frequency. A deeper pipeline enabled by flip-flop insertion results in a higher clock frequency and higher BIPS (billions of instructions per second) [2]. Nevertheless, the improvement cannot always be anticipated: especially for designs with small feature sizes, flip-flop insertion may cause IPC (Instructions per Cycle) degradation from the increased latency. Therefore, inserting flip-flops without careful analysis does not guarantee an overall performance improvement.

One technique that can alleviate IPC degradation resulting from wire delay is communication-aware floorplanning [3], [4], [5], [6].
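A profile-weighted wirelength of the kind mentioned in the abstract can be illustrated with a small sketch. It assumes the metric is the sum, over communicating module pairs, of profiled traffic times the Manhattan distance between module centers; the paper's exact objective is not given in this section, and the function name, module names, and numbers are hypothetical.

```python
def profile_weighted_wirelength(centers, traffic):
    """Sum of traffic weight times Manhattan distance per module pair.

    centers: module name -> (x, y) center coordinates
    traffic: (module_a, module_b) -> profiled communication count
    Assumed metric for illustration only.
    """
    total = 0.0
    for (a, b), weight in traffic.items():
        xa, ya = centers[a]
        xb, yb = centers[b]
        total += weight * (abs(xa - xb) + abs(ya - yb))
    return total


# Hypothetical traffic profile: fetch talks to decode far more than to l2.
traffic = {("fetch", "decode"): 100.0, ("fetch", "l2"): 2.0}

# Two candidate placements of the same three modules.
far = {"fetch": (0, 0), "decode": (10, 0), "l2": (1, 0)}
near = {"fetch": (0, 0), "decode": (1, 0), "l2": (10, 0)}

cost_far = profile_weighted_wirelength(far, traffic)    # 1002.0
cost_near = profile_weighted_wirelength(near, traffic)  # 120.0
```

Both placements have the same unweighted wirelength (11 units), so a purely wirelength-driven objective cannot tell them apart, while the weighted metric prefers the one that keeps the heavily communicating pair adjacent.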
Floorplanners that account for wire delay by moving heavily communicating modules closer together can shorten the latency on those paths and improve performance. Another technique is to move to three-dimensional integrated circuits, or 3D ICs. By moving to 3D ICs, total wirelength can be reduced and clock speed can be increased, as shown in [7]. One bottleneck to the adoption of 3D ICs is heat dissipation. The structure of 3D ICs inherently implies that moving heat from the center of the chip will be more difficult. This can result in more complex cooling devices, circuit malfunctions, and shorter circuit lifetime. When designing ICs with many layers of transistors stacked together, thermal issues become a large concern. In this paper, we propose a floorplanning algorithm that considers performance, area, and thermal issues using a mathematical programming approach utilizing information gathered from cycle-accurate simulation.

Some recent works on wire-delay issues in microarchitectural design include [8], [5], [9], [2], [10], [11], [6]. Recent work on physical design for microarchitecture includes [12], [4], [3]. Recent work on thermal-aware physical design algorithms includes [13], [14], [15], [16], [17], [18].

The structure of this paper is as follows: Section II presents the problem formulation. Section III details our 3D thermal analysis technique. Section IV shows our infrastructure for cycle-accurate simulation. Section V presents our floorplanning algorithm. Finally, Section VI shows our experimental results and we conclude in Section VII.

II. PROBLEM FORMULATION

A. Design Flow

An overview of our profile-driven microarchitectural floorplanning is shown in Figure 1. Our framework combines technology scaling parameters and the execution profiling information of applications to guide the floorplanning step of a given microarchitecture design.
First, a machine description is provided as input to the microarchitecture simulator, which is instrumented with profiling counters that record module-to-module communication. Then a cycle-accurate simulation is performed using SimpleScalar [19] to collect and extract the amount of interconnection traffic between modules for a given benchmark program. The microarchitecture simulator was integrated with Wattch [20] to provide the power numbers that are used to drive the 3D-thermal analyzer. For cache-like or buffer-like structures, the area and module delay are estimated using an industry tool from HP Western Research Labs called CACTI [21]. For scaling other structures such as ALUs, we use GENESYS [22], developed at the Georgia Institute of Technology.

After the timing, area, and access frequency information of each module is collected, we feed the module-level netlist, statistical interconnection traffic, and a target processor frequency into our thermal/profile-guided floorplanner. The power consumption of all the functional units is fed to the 3D-thermal analyzer to generate the thermal profile. The 3D-floorplanner takes in the netlist and the temperature information to generate a floorplan that maximizes the performance under the thermal and frequency constraints. The new floorplan is fed back to the 3D-thermal analyzer, along with the power numbers, to generate a new thermal profile. With the new latency values from the floorplan, architectural performance simulation is performed to obtain more realistic and accurate IPC and BIPS numbers. A few iterations take place before an optimum floorplan for the given constraint is achieved.

B. Problem Formulation

Given a set of microarchitectural modules and a netlist that specifies the connectivity among these modules, our thermal- and profile-driven microarchitectural floorplanner tries to place each module such that (i) there is no overlap among the modules, and (ii) a user-specified clock period constraint is satisfied.
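Constraint (i) can be checked mechanically once each module has a layer and a rectangle. The sketch below is an illustration under assumptions, not the paper's formulation: modules are modeled as axis-aligned rectangles, two modules conflict only when they sit on the same layer and share interior area, and the names `Block`, `overlaps`, and `overlapping_pairs` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Block:
    """A placed module: lower-left corner, size, and device layer."""
    name: str
    layer: int
    x: float
    y: float
    width: float
    height: float


def overlaps(a, b):
    """True if two blocks on the same layer share interior area.

    Blocks that only touch along an edge do not count as overlapping.
    """
    if a.layer != b.layer:
        return False
    return (
        a.x < b.x + b.width and b.x < a.x + a.width
        and a.y < b.y + b.height and b.y < a.y + a.height
    )


def overlapping_pairs(blocks):
    """Return the names of every pair of blocks that violate non-overlap."""
    return [
        (a.name, b.name)
        for a, b in combinations(blocks, 2)
        if overlaps(a, b)
    ]
```

A floorplan satisfies the non-overlap constraint under this model exactly when `overlapping_pairs` returns an empty list. A mathematical-programming formulation would encode the same condition as constraints rather than checking it after placement.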
Our objective is to minimize the maximum temperature among all blocks and the overall execution time of a given processor. Because clock frequency is fixed, IPC (Instructions Per Cycle) is used for the performance measurement. IPC represents the average number of instructions that can be issued in one clock cycle. In VLSI circuit design clock period