May 18, 2006 -- When IC devices are produced and shipped to customers, it's important that they function as specified in the application environment. A number of strategies and practices can be used to statistically sample devices and predict how they will operate over time. The practices outlined here are believed to be best-in-class techniques for a successful product launch. These strategies will most likely point to sensitivities in devices that cause intermittent failures, or to process weaknesses that cause hard failures. If any of the outlined methods are skipped in the pre-production phase, failures may need to be analyzed later to prevent such occurrences in the future.
At eSilicon, we insert design-for-test (DFT) structured logic to detect silicon defects and reduce DPPM levels. We develop a composite chip test coverage for DFT and other test methods employed. The composite chip coverage helps to predict expected DPPM levels before measured data is obtained, and can help to find coverage “holes” in the chip design before tapeout. High test coverage from DFT is important because relatively small changes in the high end of test coverage can result in large changes for DPPM.
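The sensitivity of DPPM to test coverage can be illustrated with a short calculation. The sketch below assumes the widely cited Williams-Brown defect level model, DL = 1 - Y^(1 - T), where Y is yield and T is fault coverage. The article does not say which model eSilicon uses for its composite coverage, so the model choice, the function name and the sample numbers are all assumptions made for illustration.

```python
def defect_level_dppm(yield_fraction: float, fault_coverage: float) -> float:
    """Estimate shipped defect level in DPPM.

    Assumes the Williams-Brown model: DL = 1 - Y ** (1 - T).
    yield_fraction (Y) and fault_coverage (T) are fractions, not percentages.
    """
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    if not 0.0 <= fault_coverage <= 1.0:
        raise ValueError("fault_coverage must be in [0, 1]")
    defect_level = 1.0 - yield_fraction ** (1.0 - fault_coverage)
    return defect_level * 1e6


# Hypothetical 90% yield, swept across the high end of test coverage.
for coverage in (0.95, 0.98, 0.99, 0.995):
    dppm = defect_level_dppm(0.90, coverage)
    print(f"coverage {coverage:.1%}: about {dppm:,.0f} DPPM")
```

Under this model the predicted DPPM is nearly proportional to the uncovered fraction (1 - T), so moving from 98% to 99% coverage roughly halves the predicted defect level, even though the coverage figure changed by only one point.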
We also add DFT logic to actively improve the IC yield. This is typically used with memory built-in self-test (BIST) and fuse-based repair. In this scenario, we detect faulty memory bit-cells with memory BIST, and then map in new memory cells with laser-blown fuses. Memory repair can result in dramatic improvements in yield, which is reflected in the IC cost.
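To see why repair can move yield so much, consider a deliberately simplified model. The sketch below assumes that faulty bit-cells in a memory follow a Poisson distribution and that any die with no more than a fixed number of faulty cells can be fully repaired. Real redundancy schemes are constrained by spare row and column allocation, so this is an optimistic illustration rather than a description of eSilicon's repair flow. The function name and sample numbers are hypothetical.

```python
import math


def memory_yield(mean_defects: float, repairable: int = 0) -> float:
    """Probability that a memory has at most `repairable` faulty cells.

    Assumes a Poisson defect count with the given mean. With
    repairable=0 this reduces to the classic Poisson yield exp(-mean).
    """
    if mean_defects < 0 or repairable < 0:
        raise ValueError("mean_defects and repairable must be non-negative")
    return sum(
        math.exp(-mean_defects) * mean_defects ** k / math.factorial(k)
        for k in range(repairable + 1)
    )


# Hypothetical memory averaging 0.5 faulty cells per die.
for spares in (0, 1, 2):
    print(f"repairable cells = {spares}: yield {memory_yield(0.5, spares):.1%}")
```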
Device and process characterization
Full characterization of a device in the fabrication process is typically used to understand how fab process corner conditions interact with the IC design, ATE test and application function. Units are obtained from each fab process corner extreme and the center point, or typical process conditions expected, and tested across desired voltage and temperature ranges. This methodology captures the edge boundaries of the wafer fabrication process to statistically map the possible variables experienced in the wafer fabrication process in production.
Example of a typical process characterization analysis graph.
By completing the characterization step, device and fab process corner factors are evaluated in relation to voltage and temperature in both the application environment and ATE test environment. If this step is omitted from a production launch, the PVT sensitivities may be missed, which will impact the production ramp. It is possible that without this characterization, yield and performance may not be optimally centered for the production ramp, resulting in yield loss or unstable yield. At this stage, a full understanding of the ATE test limits is achieved and the test program limits are finalized.
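The characterization plan described in this section amounts to a full matrix of process, voltage and temperature (PVT) conditions. The following is a minimal sketch of how such a matrix might be enumerated and how the worst-case margin to a test limit might be located. The corner names, voltages, temperatures and the measured values are illustrative assumptions, not values from this article.

```python
from itertools import product

# Assumed example conditions; a real plan uses the fab's corner lots
# and the product's specified operating range.
CORNERS = ["SS", "TT", "FF"]      # slow, typical, fast process corners
VOLTAGES = [1.08, 1.20, 1.32]     # nominal 1.2 V, plus and minus 10%
TEMPS_C = [-40, 25, 125]


def characterization_matrix(corners, voltages, temps_c):
    """Return every (corner, voltage, temperature) combination to test."""
    return list(product(corners, voltages, temps_c))


def worst_case(measurements, lower_limit):
    """Find the condition with the smallest margin above a lower test limit.

    `measurements` maps a (corner, voltage, temperature) tuple to a
    measured value, for example maximum operating frequency in MHz.
    """
    condition = min(measurements, key=measurements.get)
    return condition, measurements[condition] - lower_limit


matrix = characterization_matrix(CORNERS, VOLTAGES, TEMPS_C)
print(f"{len(matrix)} PVT conditions to characterize")
```

Ranking conditions by margin in this way is one simple route to seeing whether the test limits are centered on the process or sit close to one corner.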
Customer application correlation
At the point at which the ATE test process is defined and its parameters are set, it's also important that a thorough correlation is completed in the application setting. It's recommended that units from all process corners be evaluated and that a true correlation between the ATE test program and the application functions be performed.
When this process is completed, if any units pass the ATE test but fail in the application when exposed to other factors, it is important to identify the areas of ATE test coverage that can be improved. If the ATE environment cannot duplicate certain application conditions, it's important to identify and, if possible, implement test parameters that can be linked to the application function.
To fully understand the reliability of the wafer fabrication, package assembly and device design, reliability tests are performed. These tests are designed to test the robustness of the combinations of manufacturing processes, product and design. A series of tests are outlined and performed to emulate the device under stress over a period of time.
Upon completion of these tests, predictions can be made about how well a device will perform as it is aged in a system or application environment. Examples of these tests are high-temperature operating life, ESD, latch-up, temperature cycling, highly accelerated stress, and many other tests, depending on the end application market. Conditions are set during these tests to best duplicate the stress a device will endure during operation in the application, and life expectancy predictions are made based on the ability to meet such testing conditions.
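Life predictions from temperature-accelerated stress such as high-temperature operating life are commonly based on an acceleration factor. The sketch below assumes the Arrhenius model, AF = exp((Ea / k) * (1 / T_use - 1 / T_stress)), with temperatures in kelvin. The article does not name a model, and the 0.7 eV activation energy and the temperatures shown are placeholder assumptions. The appropriate activation energy depends on the failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(use_temp_c: float, stress_temp_c: float,
                           activation_energy_ev: float = 0.7) -> float:
    """Acceleration factor of a stress temperature relative to use temperature.

    Assumes a single thermally activated failure mechanism (Arrhenius model).
    The default activation energy is an assumed placeholder.
    """
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


def equivalent_use_hours(stress_hours: float, acceleration_factor: float) -> float:
    """Translate hours under stress into equivalent hours at use conditions."""
    return stress_hours * acceleration_factor


# Hypothetical example: 1,000 hours at 125 C, with a 55 C use temperature.
af = arrhenius_acceleration(use_temp_c=55.0, stress_temp_c=125.0)
print(f"acceleration factor: {af:.1f}")
print(f"equivalent use hours: {equivalent_use_hours(1000.0, af):,.0f}")
```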
The ATE test environment is used before and after the reliability stress tests and environmental exposure extremes to ensure the device performance is intact and functional. It is important at this phase of pre-production that the ATE test program be “Production Ready” to prevent discrepancies at production release.
Intermittent failure conditions
Understanding an intermittent failure requires the necessary and sufficient information to establish a strong relationship between the failure characteristics and the application environment. In most cases, IC devices are tested in an ATE environment that best duplicates the worst case of the environment in which they are expected to operate.
When a device displays a failure that is not catastrophic and has some intermittency, it can be very difficult to reproduce and understand. There are preventative measures and techniques used at the pre-production phase that assist in the correlation between the ATE test environment and the system application.
This necessary and sufficient information could include test data, IV curves, shmoo plots, parametric data logs, environmental history, etc., and could therefore be either electrical or physical in nature. The scope of application may be time-based, lot-based, package-based, design-based, application setting-based, etc. It is common that exposure to temperature or voltage can induce or duplicate an application failure. Completing an ATE test margin review of the parameter of interest may explain how a device could pass in the ATE or application environment sometimes and fail at other times.
If a device exhibits an intermittent failure mode, it is unlikely that physical damage will be seen on the device using traditional failure analysis (FA) techniques. It is also possible that a parameter in the ATE test program may not have adequate test coverage to fully duplicate an application condition. Again, in these cases, correlation between the ATE test and the application becomes very important.
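A margin review of this kind is often visualized as a pass/fail map over two swept parameters. The following is a minimal sketch of a text-based map over supply voltage and clock period. The `toy_device` function is a made-up stand-in for a real ATE measurement, and all of the names and numbers are assumptions for illustration.

```python
def shmoo(passes, voltages, periods_ns):
    """Build a text pass/fail map: one row per voltage, one column per period.

    `passes(voltage, period_ns)` must return True when the device passes.
    """
    rows = []
    for voltage in voltages:
        cells = "".join("P" if passes(voltage, p) else "." for p in periods_ns)
        rows.append(f"{voltage:4.2f} V  {cells}")
    return rows


def toy_device(voltage: float, period_ns: float) -> bool:
    """Hypothetical device: the minimum clock period shrinks as voltage rises."""
    return period_ns >= 10.0 / voltage


voltages = [1.32, 1.20, 1.08]
periods_ns = [7.0, 8.0, 9.0, 10.0, 11.0]
for row in shmoo(toy_device, voltages, periods_ns):
    print(row)
```

A unit whose pass/fail boundary sits close to the application operating point is a natural suspect for passing sometimes and failing at other times, since small shifts in voltage or temperature can move it across the boundary.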
Valid or hard device failures
If at any time there is a failure that causes the device not to fully operate in the application or in the ATE test environment, it is important to understand the mechanisms that cause the failure. This method of analysis by inference may be applied to customer returns, failures from quality conformance testing, reliability failures, qualification failures, and devices from engineering experiments and yield issues.
Using failure analysis does not necessarily imply that the root cause is known or understood, nor does it necessarily imply that a corrective action will or should take place, but FA can be used in conjunction with other programs that address these needs when a hard failure is seen. Many physically destructive tests are available at laboratories that can identify where there is breakdown in the silicon or circuit that would cause a failure.
Examples of failure analysis methods are:
- Basic failure mode verification; electrical test
- Non-destructive inspection
- External visual inspection (EVI)
- Real-time X-ray (RTX)
- Scanning acoustic microscopy/tomography (SAM/SAT)
- Mechanical/chemical decapsulation/delid
- Internal visual inspection (IVI)
- Basic defect visual localization
- Basic defect visual characterization (if exposed)
- Electrical overstress/electrostatic discharge analysis (EOS/ESD)
- Root cause analysis (RCA)
- Final FA report including corrective action (CA) initiation
BGA interconnect with a printed circuit board failure.
De-cap and SEM analysis of wire bond and solder ball failures.
More extensive failure analysis techniques are available if the root cause cannot be identified using these methods, although most failures can be found without resorting to more expensive and time-consuming techniques.
eSilicon partners with customers to ensure a mutually successful production ramp with minimal risk to yield, application or field failures. To fully understand the process, product and application variables, all of the aforementioned process steps are required. eSilicon strongly believes that all of the production release steps are essential to fully understand the design or process limitations, risks and product performance aspects – otherwise the risks of failure are high.
Releasing a device to volume production is an important step in gaining market share and credibility with the customer base. Introducing a product without any performance, delivery or reliability problems greatly increases the success of the launch. The strategies outlined in this article will lead to a successful launch to production with minimal risk and maximum product performance.
By Donna Black. Black is Senior Director, Corporate Quality & Reliability, eSilicon Corp.
Prior to joining eSilicon in 2002, Donna was Senior Director of Corporate Quality for LSI Logic. Donna, who has more than 25 years of experience in the semiconductor industry, has held senior management positions at C-Cube Microsystems, Adaptec, Leemak Training Systems and Xidex Magnetics. She holds a BS in Organizational Behavior from the University of San Francisco, and a BS in Computer Science & Engineering from San Jose State University. She is also TQM certified and is an ISO Lead Auditor.
Copyright 2006 eSilicon Corp. All rights reserved. This publication is protected by copyright and international treaty. No part of this publication may be reproduced in any form by any means without prior written authorization from eSilicon Corp.