Challenges in 45-nm Physical Design

Contributor: Mentor Graphics Corp.

November 6, 2008 -- With each new generation of ICs, previously manageable challenges in physical implementation emerge as extremely disruptive discontinuities. In response, new generations of design tools become necessary to realize the benefits of moving to new process nodes and new IC architectures. For example, at 180nm, timing closure was a disruptive challenge, which led to new physical synthesis technology. Then, at 130nm, signal integrity (SI) closure was the main discontinuity. The new generation of challenges started at 65nm, is in full force at 45nm, and will get worse as ICs venture into 32/22nm. Chip makers working at these nodes need to fully understand the new discontinuities and start planning for them. Based on past trends and current research, IC designers, fabs and EDA companies generally agree on four key discontinuities that affect physical implementation of digital ICs: manufacturing variability, an exploding number of design scenarios, ultra-low power consumption, and tool capacity and runtime.
Dealing with manufacturing variability during design

Manufacturing variations arise from the fact that current 193-nm light lithography cannot print 45-nm patterns without significant distortion. This results in catastrophic chip failures, and in changes to timing, signal integrity (SI) and power characteristics. Because design-for-manufacturing (DFM) is applied so late in the flow, designs are coming out of final routing with an unmanageable number of violations that impact chip yield, reliability and time-to-market. Manufacturing effects should be considered during the place-and-route phase, when the layout can be optimized for performance and manufacturability at the same time. Place-and-route tools designed for the previous generation of challenges struggle with the effects of manufacturing variations during layout. For example, older routers haven’t kept up with the growing complexity and number of design rules. They often assume "as drawn" features, rather than modeling the actual "as manufactured" shapes and geometries for devices and interconnects. It is nearly impossible to retrofit aging place-and-route platforms to control the effects of manufacturing variability. Particularly for designs at 45nm and below, IC design teams must consider putting a solution in place before they face catastrophic yield failures or missed market windows.

Handling an exploding number of design scenarios

A second major discontinuity at 45nm is the explosion in the number of mode and corner scenarios, which often have conflicting closure requirements. A mode/corner scenario is any combination of cell and interconnect variations, design constraints, library and operational modes that needs to be considered during physical design. Figure 1 illustrates the trend in sign-off requirements: an ever-expanding set of corners that must be considered for any given mode.
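To make the notion of a mode/corner scenario concrete, the following minimal sketch enumerates scenarios as the cross product of modes and corners. The mode names are the ones this article uses for low-power operation; the corner names are hypothetical placeholders, not taken from any library or sign-off specification.

```python
from itertools import product

# Operational modes (named in this article) and process/voltage/temperature
# corners (hypothetical names, for illustration only).
modes = ["ACTIVE", "STANDBY", "SLEEP"]
corners = ["slow_hot", "slow_cold", "typical", "fast_hot", "fast_cold"]

# Each (mode, corner) pair is one scenario that must close timing.
scenarios = list(product(modes, corners))

print(len(scenarios))  # 15
print(scenarios[0])    # ('ACTIVE', 'slow_hot')
```

Even this small example yields 15 scenarios, each of which may carry its own constraints and closure requirements.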
The traditional approach to multicorner/multimode designs is to merge constraints, add design margins, and use worst-case corner conditions early in the flow. Later in the flow, designers manually analyze each scenario in an iterative and potentially non-convergent process. This expensive and unpredictable approach frequently leads to missed schedules or reduced chip performance. The discontinuity of process and design variation requires concurrent multicorner-multimode (MCMM) analysis, which incumbent place-and-route tools can’t do because their core timing engines were architected to represent only one or two mode/corner scenarios at a time.

Design for ultra-low power consumption

The low-power discontinuity in physical design arises from several weaknesses in previous-generation design tools:
The use of multiple voltage domains is an extremely effective approach for reducing power usage, particularly leakage power. However, previous-generation place-and-route engines can’t always honor the multi-voltage domain specifications, including inserting special cells and routing secondary power connections. This forces designers to use clumsy work-arounds to complete the physical implementation. Another weakness in design tools for low-power design is the lack of MCMM timing and deep co-optimization of all design requirements. Each voltage domain causes the number of timing analysis mode/corner scenarios to double when all the min/max voltage combinations are considered. For example, SLEEP, ACTIVE, and STANDBY modes must be analyzed and optimized at multiple process corners for timing, leakage, SI, and manufacturability. The third weakness is poor clock tree synthesis (CTS) capability. Clocks are the single largest source of dynamic power usage, and clock tree synthesis and optimization is a good place to achieve power savings. Yet existing tools are unable to find new opportunities to reduce power consumption in increasingly complex clocks. Reducing clock power by lowering overall capacitance and minimizing switching activity requires advanced skew balancing and switching analysis that depends on MCMM clock synthesis and optimization. With single-mode CTS, there is no guarantee of converging on an optimum solution across all scenarios.

Tool architectures to provide MCMM, high capacity and fast runtime

The number of gates in a typical IC has quadrupled since the 180-nm node. Limitations in design tool capacity force designers to chop the design into smaller parts, which can delay schedules, increase engineering costs, and play havoc with full-chip closure. To work around the capacity limitations of physical design tools, designers use black-box models or interface logic models. This requires a separate abstraction model for each mode/corner scenario.
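The doubling effect of voltage domains on the scenario count can be made concrete with a little arithmetic. The sketch below assumes the baseline count is simply modes times corners and that each voltage domain contributes an independent min/max choice; the function name and the example numbers are illustrative and do not come from any tool.

```python
def scenario_count(modes: int, corners: int, voltage_domains: int) -> int:
    """Scenarios when every voltage domain is checked at min and max voltage.

    Assumption: each domain doubles the baseline modes * corners count.
    """
    return modes * corners * 2 ** voltage_domains


for domains in range(4):
    print(domains, scenario_count(modes=3, corners=3, voltage_domains=domains))
# 0 9
# 1 18
# 2 36
# 3 72
```

Under these assumptions, three voltage domains raise the count from 9 to 72, and a flow built on black-box or interface logic models would need a separate abstraction model for each of those scenarios.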
As data sizes increase, so do computational loads. If design tools cannot efficiently use multicore platforms, our concept of reasonable time-to-market will have to change. Even with a very efficient data model, some tasks, particularly timing-related functions, simply take too long. Runtime of timing tasks is of particular concern because timing is the fundamental "cost optimization function" for most routing decisions, and virtually every change in a layout will impact timing in complex ways. Timing analysis and optimization consume as much as 60% to 70% of the total run time during the place-and-route flow. Figure 2 illustrates the different stages of the design flow and the percentage of a place-and-route tool’s computing time spent on timing analysis and optimization for each stage.
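Amdahl's law gives a rough feel for what that timing share means for multicore speedup. The sketch below assumes, optimistically, that the timing portion scales perfectly with core count while the rest of the flow stays serial; the 70% figure is the upper end of the range quoted above, and the function name is illustrative.

```python
def overall_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup when only parallel_fraction of runtime scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


for cores in (2, 4, 8, 16):
    print(cores, round(overall_speedup(0.70, cores), 2))
# 2 1.54
# 4 2.11
# 8 2.58
# 16 2.91
```

Under these assumptions, the remaining serial 30% caps the overall gain at about 3.3x no matter how many cores are added, so parallel timing helps most when the rest of the flow can also use multiple cores.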
Parallelizing timing analysis is extremely difficult, but would greatly speed up overall design times. Traditional parallelization techniques result in very complex synchronization schemes that can produce wrong results, data corruption, and software instability. The heavy synchronization overhead also limits the gains achievable by increasing the number of CPUs or cores.

Solutions to the 45-nm discontinuities

Last-generation design tools are not architected to address the new discontinuities at 45nm. New products for physical implementation offer innovations to address the variability, low-power, capacity, and runtime challenges of advanced process nodes. For example, to address the challenges of manufacturing variability, new tools must fully integrate design-for-manufacturing (DFM) capabilities, such as lithography and CMP analysis, to generate a correct-by-construction layout. To manage the explosion in corners and modes, new tools must have robust MCMM timing analysis and optimization. Only a timing engine specifically designed for MCMM can work within the full context of macro-level functional complexity issues (multiple operational modes) and micro-level process and manufacturing issues (multiple design corners). For the low-power discontinuities, designers need tools that are compliant with standard power formats, can handle special cells, automatically connect secondary power lines, and respect voltage island boundaries during routing and optimization. MCMM timing and optimization is also important for low-power designs, particularly during CTS. Figure 3 shows how a single-corner CTS implementation compares with a 9-corner CTS implementation for a 9-corner design.
The capacity and runtime challenges call for physical design tools specifically architected to manage data efficiently. They must be able to represent designs of 100+ million gates in hierarchical or flat design methodologies. To control runtimes for all phases of the place-and-route flow, particularly timing analysis and optimization, physical design tools must exploit the full power of multicore compute platforms. Recent advancements in parallelizing timing analysis enable near-linear scaling across any number of CPUs. This ability requires a natively parallel software architecture, and innovative data dependency analysis and task synchronization algorithms.
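One textbook way to picture data dependency analysis in a timing engine is levelization: nodes of the timing graph are grouped so that every node depends only on earlier levels, and all nodes within a level can then be evaluated concurrently without locks. The sketch below illustrates that general idea only; it is not a description of any vendor's algorithm, the graph and delays are hypothetical, and CPython threads will not deliver real speedup on this CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical timing graph: node -> list of (fanin node, edge delay).
fanin = {
    "in1": [],
    "in2": [],
    "g1": [("in1", 2.0), ("in2", 3.0)],
    "g2": [("in2", 1.0)],
    "out": [("g1", 1.5), ("g2", 4.0)],
}


def levelize(graph):
    """Group nodes into levels so each node's fanins sit in earlier levels.

    Assumes the graph is acyclic and small enough for recursion.
    """
    level = {}

    def depth(node):
        if node not in level:
            level[node] = 1 + max((depth(src) for src, _ in graph[node]), default=-1)
        return level[node]

    for node in graph:
        depth(node)
    levels = [[] for _ in range(max(level.values()) + 1)]
    for node, lvl in level.items():
        levels[lvl].append(node)
    return levels


def arrival_times(graph, workers=4):
    """Propagate latest arrival times level by level; a level's nodes run in parallel."""
    arrival = {}

    def compute(node):
        return max((arrival[src] + delay for src, delay in graph[node]), default=0.0)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for nodes in levelize(graph):
            # Fanin arrivals are final here, so tasks only read `arrival`.
            results = list(pool.map(compute, nodes))
            arrival.update(zip(nodes, results))
    return arrival


print(arrival_times(fanin)["out"])  # 5.0
```

The only synchronization point is the barrier between levels, which is what keeps the scheme simple; a production engine would also have to handle incremental updates and multiple scenarios.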
By Sudhakar Jilla

Sudhakar Jilla is the marketing director for place-and-route products at Mentor Graphics Corp. Over the past 15 years, he has held various application engineering, marketing, and management roles in the EDA industry. He was previously responsible for the rollout of several market-leading products and initiatives, such as Pinnacle, Olympus-SoC, and Design-for-Variability at Sierra Design Automation, and Physical Compiler and Galaxy-SI at Synopsys.
Reprinted from SOCcentral.com, your first stop for ASIC, FPGA, EDA, and IP news and design information.