# An On-Chip Test Clock Control Scheme for Circuit Aging Monitoring

Hyunbean Yi

Abstract-In highly reliable and durable systems, failures due to aging might result in catastrophes. Aging monitoring techniques that predict such failures are therefore required. Aging can be monitored by performing a delay test in the field at clocks faster than the functional clock and determining the current delay state from the test clock frequencies at which the delay test passes or fails. In this paper, we focus on a test clock control scheme for a system-on-chip (SoC) with multiple clock domains. We describe the limitations of existing at-speed test clock control methods and present an on-chip faster-than-at-speed test clock control scheme for intra/inter-clock-domain test. We report simulation results and an area analysis. With a simple control scheme, low area overhead, and no modification of the scan architecture, the proposed method enables faster-than-at-speed test of SoCs with multiple clock domains.

*Index Terms*—Aging test, design-for-testability, design-for-reliability, scan test, test clock, multiple clock domains

## **I. INTRODUCTION**

Transistor aging has been a major concern in deep-submicron technologies. It is well known that transistor performance degrades over time due to failure mechanisms such as negative/positive bias temperature instability (NBTI/PBTI), hot carrier injection (HCI), and time-dependent dielectric breakdown (TDDB). In the latest CMOS process technologies, the most dominant failure mechanism is NBTI, which causes slow delay degradation and may eventually cause a failure [1-4]. In applications requiring high field reliability, such as medical equipment, satellites, aircraft, and power plants, performance degradation and failures can trigger a life-threatening disaster.

Aging monitoring techniques have been presented to predict failures due to aging. The goal of aging monitoring is to warn the system user or to conduct self-repair before a failure occurs. Delay test techniques are used for aging monitoring. However, aging cannot be monitored by simply conducting an existing at-speed test. A faster-than-at-speed test is required because a failure due to aging must be predicted before it occurs. The faster test clock frequency is chosen with reference to the guard-band interval. The guard-band interval, shown in Fig. 1, is the timing margin provided to account for the expected performance loss over the device lifetime [5, 6].

Fig. 1. Guard-band interval.

M. Agarwal et al. [6, 7] and T. Nakura et al. [8] designed aging sensors that check whether the signal transition of the combinational logic output occurs out of the guard-band time interval. These techniques allow aging to be observed concurrently during normal operation by directly checking the aging of actual data paths. As an on-line self-test architecture, Y. Li et al. [9] introduced concurrent autonomous chip self-test using stored test patterns (CASP), which performs on-line delay testing using test patterns pre-stored in a non-volatile memory in the system. To improve the delay measurement accuracy, H. Yi et al. [10] referred to the measured voltage and temperature, which are uncontrollable in the field.

In this paper, we consider an SoC with multiple clock domains. The above aging monitoring techniques performed faster-than-at-speed delay tests only for each core at a single clock. However, in a multiple-clock environment, inter-clock logic as well as intra-clock logic has to be monitored. We propose an on-chip intra/inter-clock-domain delay test clock generation scheme for scan-based delay testing that can be applied in existing aging monitoring architectures. We adopt on-die clock shrink (ODCS) logic [11-13], with which we can generate a faster clock and adjust the duty cycle. Using the ODCS logic, we present how to generate intra/inter-clock-domain delay test clock patterns. The rest of the paper is organized as follows. Section II reviews related work, and Section III describes our proposed on-chip test clock generation scheme. Section IV shows experimental results, and Section V concludes the paper.

Manuscript received May 5, 2012; revised Oct. 30, 2012.

Dept. of Computer Engineering/Graduate School of Information & Communications, Hanbat National University, San 16-1, Dukmyung-dong, Yuseong-gu, Daejeon, South Korea. E-mail: bean@hanbat.ac.kr

## **II. RELATED WORK**

Several test control schemes have been proposed for intra- and inter-clock-domain at-speed test. A simple model of a circuit with multiple clock domains, shown in Fig. 2, is used to clarify the ideas. In this model, there are two clock domains and an inter-clock logic. Each clock domain has its own scan chain, and the clock frequencies of CLK1 and CLK2 may be different. The scan-enable signals, SE1 and SE2, may be connected to a global scan-enable signal. The on-chip PLL is used to generate at-speed test clocks for the purpose of reducing test cost or performing built-in self-test (BIST). For intra-clock-domain (Clock Domain 1 and 2) at-speed test, conventional scan-based delay testing methods can be applied. In order to generate test patterns using a commercial Automatic Test Pattern Generation (ATPG) tool, one external test clock and multiplexers at each clock input may need to be designed so that one test clock can be used in test mode.

Fig. 2. A circuit model with intra/inter-clock-domain logics.

Traditionally, scan-based delay tests are based on the launch-off-shift (LOS) approach [14], shown in Fig. 3(a), and the broad-side approach [15], shown in Fig. 3(b). The LOS approach can achieve higher fault coverage than the broad-side approach with a smaller number of test patterns. The broad-side approach, however, is more commonly used because of the well-known scan-enable signal skew problem [16, 17]. This problem makes physical implementation much more difficult in aging monitoring, where faster-than-at-speed test is required. Therefore, in this paper, we exclude LOS-based methods.

**Fig. 3.** LOS approach and broad-side approach (a) LOS approach, (b) broad-side approach.

Intra-clock-domain at-speed test clock generation can be easily implemented by gating out two consecutive clock pulses (a launch pulse and a capture pulse) from the same clock source while the scan-enable signal is '0'.
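The gating described above can be illustrated with a small cycle-level model. This is only a sketch under simplifying assumptions: the function name `gate_two_pulses` and the one-value-per-source-clock-cycle representation are hypothetical, and the gate is assumed to open on the first cycle after the scan-enable signal falls, whereas a real controller would typically wait for the scan-enable signal to settle first.

```python
def gate_two_pulses(scan_enable):
    """Return a per-cycle clock-gate enable list (hypothetical model).

    scan_enable holds one 0/1 value per source-clock cycle. The gate passes
    exactly the first two cycles (launch, then capture) of every run in
    which scan_enable is 0, and blocks all other cycles.
    """
    enable = []
    pulses_passed = 0
    for se in scan_enable:
        if se == 1:
            pulses_passed = 0   # shift mode: re-arm for the next capture window
            enable.append(0)    # shift clocks are assumed to use a separate slow path
        elif pulses_passed < 2:
            pulses_passed += 1  # first pass is the launch pulse, second the capture pulse
            enable.append(1)
        else:
            enable.append(0)    # block any further pulses in this capture window
    return enable


if __name__ == "__main__":
    print(gate_two_pulses([1, 1, 0, 0, 0, 0, 1, 1]))
```

Because both pulses come from the same clock source, the launch-to-capture interval in this model equals one period of that source.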



**Fig. 4.** Concept of inter-clock-domain at-speed test control (a) Test control for inter-clock logic from Clock Domain 1 to 2, (b) Test control for inter-clock logic from Clock Domain 2 to 1.

However, inter-clock-domain delay test requires more complicated clock generation control. Fig. 4 shows test control examples for inter-clock-domain at-speed test. The inter-clock-domain logic can be divided into two parts, one going from Clock Domain 1 to Clock Domain 2 and the other going from Clock Domain 2 to Clock Domain 1. If CLK1 is faster than CLK2, the two waveforms in Fig. 4(a) and 4(b) must be generated for at-speed testing of the inter-clock-domain logic. To generate these waveforms, the existing methods perform carefully timed clock gating that takes into account the timing relation between the two clocks [18-22]. They adjust the clock gating enable timing using a delay control register or a shift register to produce the launch clock pulse in one clock domain and the capture clock pulse in the other.

However, as mentioned in Section I, at-speed delay testing techniques are not sufficient for aging monitoring. To perform a faster-than-at-speed test, methods that reduce the interval between the launch pulse and the capture pulse have been presented [23, 24]. They use delay control logic that allows fine-grained delay control. However, they need to modify scan cells or insert dummy flip-flops in the middle of the scan chain to use the delay control logic. Therefore, these techniques cannot be applied to conventional scan architectures.

An on-die clock shrink (ODCS) circuit can be considered as a faster-than-at-speed clock generator. The clock phase or cycle can temporarily be stretched or shrunk by loading a control value serially through the test access port (TAP). ODCS circuits have been used to support timing debug [11-13] and to measure delay variation accurately [10]. To the best of our knowledge, there is no existing work that performs a faster-than-at-speed test of inter-clock logic based on a conventional scan architecture [25] without any modification of the scan cell or scan architecture.

## **III. ON-CHIP TEST CLOCK CONTROL SCHEME**

#### 1. Problem Statement

As mentioned in the previous sections, a faster-than-at-speed clock is required for aging monitoring, and we adopt ODCS logic as an on-chip fast test clock generator. The ODCS logic can shrink the low phase, the high phase, or the period of the clock at a specified clock cycle. The range of the ODCS used in [11] is 200 ps in 14 linear steps. Fig. 5 shows a waveform for the period-shrink function of the ODCS, which is used in our scheme.

The problem to be solved is how to generate and distribute launch and capture pulses with the right timing for intra- and inter-clock-domain test. When a clock domain enters test mode, if the clock source path is switched from the PLL to the ODCS logic so that the test clock generated by the ODCS logic goes to the domain under test, launch and capture pulses for intra-clock-domain test can be easily generated using the existing at-speed test clock generation techniques described in Section II. To generate launch and capture pulses for inter-clock-domain test, two consecutive clock pulses have to be split and distributed so that the first pulse (launch pulse) goes to one clock domain and the next pulse (capture pulse) goes to the other clock domain. A straightforward approach to distributing each clock pulse to a different clock domain is to add a counter and sequentially perform test clock gating and multiplexing for each clock domain according to the counter value. However, as the number of clock domains increases, clock gating logic must be added, which increases area and complicates clock timing control. In this paper, we present a simple and easily controllable launch and capture pulse generator for inter-clock-domain test, the details of which are given in the next subsection.

Fig. 5. Period shrink function of ODCS (at the (N+2)th clock cycle).
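To make the period-shrink function concrete, the following sketch computes rising-edge times for a clock in which one chosen cycle is shortened. It is an illustrative timing model only, not the ODCS circuit itself: the function name `rising_edges_ps`, its parameters, and the choice of picoseconds as the unit are assumptions made for this example.

```python
def rising_edges_ps(period_ps, num_cycles, shrink_cycle, shrink_ps):
    """Return rising-edge times (in ps) for num_cycles clock cycles.

    Every cycle lasts period_ps except the cycle with index shrink_cycle
    (0-based), which is shortened by shrink_ps. Hypothetical model of a
    period-shrink operation applied at one specified cycle.
    """
    if not 0 <= shrink_ps < period_ps:
        raise ValueError("shrink_ps must be in [0, period_ps)")
    edges = [0]
    for cycle in range(num_cycles):
        length = period_ps - shrink_ps if cycle == shrink_cycle else period_ps
        edges.append(edges[-1] + length)
    return edges


if __name__ == "__main__":
    # A 1 GHz clock (1000 ps period) whose third cycle is shrunk by 160 ps,
    # so the two edges bounding that cycle are 840 ps apart.
    edges = rising_edges_ps(period_ps=1000, num_cycles=4, shrink_cycle=2, shrink_ps=160)
    print(edges)
    print([b - a for a, b in zip(edges, edges[1:])])
```

If the two edges that bound the shrunken cycle are used as launch and capture edges, the launch-to-capture interval becomes shorter than the functional period, which is what a faster-than-at-speed test needs.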

#### 2. Proposed Scheme

Fig. 6 shows our proposed intra/inter-clock-domain test clock distribution controller for aging monitoring. For production test, the clock, scan-enable, scan-in, and scan-out signals should be accessible from external test pins. For simplicity, this paper describes only the internal test control signal architecture. To clarify our idea, we assume that there are two clock domains, as shown in Fig. 2. During normal operation, *fclk1* and *fclk2* go to CLK1 and CLK2, respectively. During on-line test mode, the programmed clock (*pclk*) generated by the ODCS logic is used as the test clock. The Test Application Controller (TAC) is in charge of setting the test clock frequency in the ODCS logic through the TAP controller and applying test patterns. Test patterns can be either stored in advance in a non-volatile memory or automatically generated by a pattern generator (e.g., a Pseudo-Random Pattern Generator (PRPG)). The aging test has to be applied in the field. The existing aging test architectures [9, 10] utilize idle time or power-on/-off time for test mode. Once one or more parts of an SoC enter test mode, an aging test controller (or a processor core) enables the TAC and transfers domain information, including the locations of the parts, whether the test is intra- or inter-clock-domain, and the test clock frequency. Then, the TAC creates scan-based delay test signals such as the test clock (*test\_clk*), the scan enable signal (*scan\_enable*), and the launch and capture pulses (*launch\_pulse* and *capture\_pulse*) by programming the ODCS logic and controlling the clock gating cells (CGCs), multiplexers, and AND gates.

Fig. 6. Intra/inter-clock-domain delay test clock distribution controller.

A waveform example for intra-clock-domain test is shown in Fig. 7(a). In this case, two domains are in test mode concurrently. During the shift operation, test patterns and responses are shifted in and out of the two domains at the same time using a low-speed clock. Then, while the scan enable signal is '0', two test clock pulse pairs are transferred to CLK1 and CLK2, respectively. Every time the clock frequency and the test clock path need to be changed, the TAC programs the ODCS logic and controls the CGCs, multiplexers, and AND gates with the appropriate timing. This is similar to a logic built-in self-test (LBIST) clock control scheme called "staggered double-capture" [18].

Fig. 7. Waveforms for (a) intra-clock-domain faster-than-at-speed test, (b) inter-clock-domain faster-than-at-speed test.

For inter-clock-domain test control, we add the Inter-Clock-Domain Test Pulse Generator (ICDTPG). The basic idea is as follows. From the point of view of a launch-capture pulse distributor, in an inter-clock-domain test the launch pulse and the capture pulse go to different clock domains (different scan chains). We can therefore perform the launch and capture operations by creating two consecutive signal edges and distributing one to each clock domain, instead of gating the test clock pulses out to each clock domain one after another. The consecutive signal edges are created by the ICDTPG and distributed by the multiplexers. This makes the clock control easy because no clock gating control is needed. As a result, we obtain the waveform shown in Fig. 7(b). When the scan enable signal, *scan\_enable*, is '1', FF1 and FF2 are cleared. When *scan\_enable* is '0', two clock pulses (*test\_clk*) are applied to the flip-flops. At the first test clock, the FF1 output goes to '1', which produces the launch rising edge, and at the second test clock, the FF2 output goes to '1', which produces the capture rising edge. FF1 returns to '0' at the second test clock, and FF2 returns to '0' when *scan\_enable* becomes '1'. In this way, the launch pulse and the capture pulse are created simply and automatically from the scan enable signal.
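The FF1/FF2 behavior described above can be checked with a short behavioral model. This is a sketch, not the gate-level ICDTPG of Fig. 6: the next-state expressions are an assumption chosen only to reproduce the described sequence (FF1 rises at the first test clock and falls at the second, FF2 rises at the second test clock and holds until the scan enable signal returns to '1'), and the function name `icdtpg_trace` is hypothetical.

```python
def icdtpg_trace(scan_enable_per_clk):
    """Return the (FF1, FF2) values after each test-clock rising edge.

    scan_enable_per_clk holds the scan enable value seen at each edge.
    Assumed behavior: scan_enable == 1 clears both flip-flops; otherwise
    FF1 becomes 1 only from the cleared state, and FF2 becomes 1 once FF1
    has been 1 and then holds its value.
    """
    ff1, ff2 = 0, 0
    trace = []
    for se in scan_enable_per_clk:
        if se == 1:
            ff1, ff2 = 0, 0  # shift mode clears both flip-flops
        else:
            # Both next-state values are computed from the old state.
            ff1, ff2 = int(not ff1 and not ff2), int(ff1 or ff2)
        trace.append((ff1, ff2))
    return trace


if __name__ == "__main__":
    # Shift, then two test clocks with scan enable low, then shift again.
    for step, (ff1, ff2) in enumerate(icdtpg_trace([1, 0, 0, 1])):
        print(step, "FF1 (launch) =", ff1, "FF2 (capture) =", ff2)
```

In this model the rising edge of FF1 plays the role of the launch edge and the rising edge of FF2 the role of the capture edge, so their spacing is one test clock period.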

## **IV. EXPERIMENTAL RESULTS**

For the simulations, we set the clock frequencies and launch-to-capture intervals as shown in Table 1. Based on Fig. 6, we assumed that the frequencies of CLK1 (*fclk1*) and CLK2 (*fclk2*) are 1 GHz and 500 MHz, respectively, and that the two clocks are synchronized. We also assumed that the target circuit uses a frequency guard-band of 20% [5]. This means that clock domains 1 and 2 are originally able to operate at maximum clock frequencies of 1.2 GHz and 600 MHz, respectively. Accordingly, the launch-to-capture interval for faster-than-at-speed test of clock domain 1 should be set to a value that is less than 1 ns (=1/1 GHz) and greater than 833 ps (=1/1.2 GHz). We set the launch-to-capture interval for clock domain 1 to 840 ps and the launch-to-capture interval for clock domain 2 to 1.68 ns, which likewise lies between 1.667 ns (=1/600 MHz) and 2 ns (=1/500 MHz). Since CLK1 and CLK2 are synchronized as shown in Fig. 8, d1 is equal to d2. Therefore, we set the launch-to-capture interval between the two clock domains to 840 ps.
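The interval bounds quoted above follow directly from the functional frequency and the guard-band. The sketch below reproduces that arithmetic; the function name `interval_window_ps` and the convention of expressing the guard-band as a fraction (0.20 for 20%) are assumptions made for this example.

```python
def interval_window_ps(f_func_hz, guard_band):
    """Return (lower, upper) bounds in ps for a launch-to-capture interval.

    upper is the functional clock period; lower is the period at the
    maximum frequency f_func_hz * (1 + guard_band). A faster-than-at-speed
    interval should lie strictly between the two bounds.
    """
    f_max_hz = f_func_hz * (1 + guard_band)
    upper_ps = 1e12 / f_func_hz
    lower_ps = 1e12 / f_max_hz
    return lower_ps, upper_ps


if __name__ == "__main__":
    # Values taken from the text: 1 GHz and 500 MHz domains, 20% guard-band,
    # chosen intervals of 840 ps and 1.68 ns.
    for name, freq_hz, chosen_ps in [("CLK1", 1e9, 840.0), ("CLK2", 500e6, 1680.0)]:
        lower, upper = interval_window_ps(freq_hz, 0.20)
        print(f"{name}: {lower:.1f} ps < {chosen_ps:.1f} ps < {upper:.1f} ps ->",
              lower < chosen_ps < upper)
```

Both chosen intervals fall just above the lower bound, so the test runs close to the assumed maximum frequency of each domain.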

**Table 1.** Clock frequencies and launch-to-capture intervals

| Parameter                                                                      | Value   |
|--------------------------------------------------------------------------------|---------|
| Frequency of CLK1                                                              | 1 GHz   |
| Frequency of CLK2                                                              | 500 MHz |
| Scan shift clock frequency                                                     | 200 MHz |
| Intra-clock-domain test: launch-to-capture interval for clock domain 1         | 840 ps  |
| Intra-clock-domain test: launch-to-capture interval for clock domain 2         | 1.68 ns |
| Inter-clock-domain test: launch-to-capture interval between two clock domains  | 840 ps  |

Fig. 8. Inter clock intervals.

Fig. 9 shows our simulation result. The signal *pclk* is the clock signal generated by the ODCS circuit. It is gated by *pclk\_en* to generate shift pulses and launch-capture pulses at the proper times. When the scan-in operation is over, the TAC de-asserts *pclk\_en* to set the ODCS to generate the *Shrunken Clock*. The TAC then de-asserts *scan\_enable* and asserts *pclk\_en* so that the ICDTPG can generate the launch pulse and the capture pulse using the *Shrunken Clock*. The simulation result shows that the launch pulse and the capture pulse are transferred to CLK1 and CLK2, respectively, by the control signals.

Among the existing works [9, 10], the area of the on-line aging test controller in [9] was less than 0.01% of the total area of a relatively large SoC, OpenSPARC T1, which includes 16 processor cores, and the area increase due to the aging monitoring circuits in [10] was 8.8% in a small SoC that includes a processor core and 10 peripheral cores. Each of them internally uses its own intra-clock-domain test scheme. To apply our proposed technique to them, only our ICDTPG needs to be added. Therefore, the additional area increase will be trivial. Aging monitoring techniques are targeted at electronic life-support systems or relatively large, high-end systems. In such systems, some increase in chip size and price is acceptable if the reliability and stability of the systems are improved.

## **V. CONCLUSIONS**

Highly reliable systems such as medical equipment, satellites, aircraft, and power plants cannot tolerate a failure during system operation, since it might result in a disaster. Failures in the field can occur because of transistor aging. Therefore, one way to avoid disasters is to predict a failure by monitoring the aging of devices and to take action before the failure occurs. To predict a failure in the field, an on-chip faster-than-at-speed test technique is required. In this paper, we proposed an on-chip test clock control scheme for intra/inter-clock-domain faster-than-at-speed test. We utilized ODCS logic for fast clock generation and designed circuits to generate and distribute launch and capture pulses. The proposed scheme enables faster-than-at-speed test of inter/intra-clock logic i) with a simple control scheme, ii) with low area overhead, and iii) without any modification of the scan architecture.

## **ACKNOWLEDGMENTS**

This work was supported by the intramural research program 2011 of Hanbat National University (HNU), South Korea.



Fig. 9. Simulation result of launch pulse and capture pulse generation control for inter-clock domain test.

## **REFERENCES**

- [1] C. H. Tung, "Process-Structure-Property Relationship and its Impact on Microelectronics Device Reliability and Failure Mechanism," *Journal of Semiconductor Technology and Science (JSTS)*, Vol. 3, No. 3, pp. 107-113, Sep. 2003.
- [2] W. Wang et al., "Compact Modeling and Simulation of Circuit Reliability for 65-nm CMOS Technology," *IEEE Trans. on Device and Material Reliability*, Vol. 7, No. 4, pp. 509-517, Dec. 2007.
- [3] T. W. Chen et al., "Gate-Oxide Early Failure Prediction," *Proc. IEEE VLSI Test Symp.*, pp. 111-118, Apr. 2008.
- [4] M. Noda et al., "On Estimation of NBTI-Induced Delay Degradation," *IEEE European Test Symp.*, pp. 107-111, May 2010.
- [5] O. Khan and S. Kundu, "A Self-Adaptive System Architecture to Address Transistor Aging," *Proc. Design Automation and Test in Europe*, pp. 81-86, Mar. 2009.
- [6] M. Agarwal et al., "Circuit Failure Prediction and Its Application to Transistor Aging," *Proc. IEEE VLSI Test Symp.*, pp. 277-284, May 2007.
- [7] M. Agarwal et al., "Optimized circuit failure prediction for aging: practicality and promise," *Proc. Int'l Test Conf.*, no. 26.1, Oct. 2008.
- [8] T. Nakura et al., "Fine Grain Redundant Logic Using Defect-Prediction Flip-Flops," *IEEE Int'l Solid-State Circuits Conf.*, pp. 402-403, Feb. 2007.
- [9] Y. Li et al., "CASP: Concurrent Autonomous Chip Self-Test Using Stored Test Patterns," *Proc. Design Automation and Test in Europe*, pp. 885-890, Mar. 2008.
- [10] H. Yi et al., "A Failure Prediction Strategy for Transistor Aging," *IEEE Trans. on VLSI Systems*, pp. 1-9, Oct. 2011.
- [11] S. Rusu and S. Tam, "Clock Generation and Distribution for the First IA-64 Microprocessor," *IEEE Int'l Solid-State Circuits Conf.*, TA 10.6, Feb. 2000.
- [12] D. D. Josephson et al., "Debug Methodology for the McKinley Processor," *Proc. Int'l Test Conf.*, pp. 451-460, Nov. 2001.
- [13] S. Tam et al., "Clock Generation and Distribution for the 130-nm Itanium 2 Processor With 6-MB On-Die L3 Cache," *IEEE Journal of Solid-State Circuits*, Vol. 39, No. 4, pp. 636-642, Apr. 2004.
- [14] J. Savir and S. Patil, "Scan-Based Transition Test," *IEEE Trans. on Computer-Aided Design of Integrated Circuit and System*, Vol. 12, Aug. 1993.
- [15] J. Savir and S. Patil, "Broad-Side Delay Test," *IEEE Trans. on Computer-Aided Design of Integrated Circuit and System*, Vol. 13, Aug. 1994.
- [16] J. Saxena et al., "Scan-Based Transition Fault Testing – Implementation and Low Cost Test Challenges," *Proc. Int'l Test Conf.*, pp. 1120-1129, Oct. 2002.
- [17] S. Wang et al., "Hybrid Delay Scan: A Low Hardware Overhead Scan-Based Delay Test Technique for High Fault Coverage and Compact Test Sets," *Proc. Design Automation and Test in Europe*, pp. 1296-1301, Oct. 2004.
- [18] K. Hatayama et al., "At-Speed Built-in Test for Logic Circuits with Multiple Clocks," *Proc. IEEE Asian Test Symp.*, pp. 18-20, Nov. 2002.
- [19] L.-T. Wang et al., "At-Speed Logic BIST Architecture for Multi-Clock Designs," *Proc. IEEE Int'l Conf. on Computer Design: VLSI in Computers and Processors*, pp. 475-478, Oct. 2005.
- [20] H. Furukawa et al., "A Novel and Practical Control Scheme for Inter-Clock At-Speed Testing," *Proc. Int'l Test Conf.*, pp. 1-10, Oct. 2006.
- [21] X. Fan et al., "An On-Chip Test Clock Control Scheme for Multi-Clock At-Speed Testing," *Proc. IEEE Asian Test Symp.*, pp. 341-346, Oct. 2007.
- [22] K. Y. Cho and R. Srinivasan, "A Scan Cell Architecture for Inter-Clock At-Speed Delay Testing," *IEEE VLSI Test Symp.*, pp. 213-218, May 2011.
- [23] R. Tayade and J. A. Abraham, "On-Chip Programmable Capture for Accurate Path Delay Test and Characterization," *Proc. IEEE Int'l Test Conf.*, paper 6.2, Oct. 2008.
- [24] S. Pei et al., "An On-Chip Clock Generation Scheme for Faster-than-at-Speed Delay Testing," *Proc. Design Automation and Test in Europe*, pp. 1353-1356, Mar. 2010.
- [25] M. Kim et al., "High Speed Pulse-based Flip-Flop with Pseudo MUX-type Scan for Standard Cell Library," *Journal of Semiconductor Technology and Science (JSTS)*, Vol. 6, No. 2, pp. 74-78, June 2006.



**Hyunbean Yi** received the B.S., M.S., and Ph.D. degrees in Computer Science and Engineering from Hanyang University, Korea, in 2001, 2003, and 2007, respectively. He was with Korea Electronics Technology Institute (KETI) from 2002 to 2007. He was a Research Scholar at the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, USA, from 2007 to 2009, and a Postdoctoral Researcher at the Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Japan, from 2009 to 2011. Currently, he is a Professor in the Department of Computer Engineering/Graduate School of Information & Communications, Hanbat National University, South Korea. His research interests include high-speed communication system design, SoC/NoC testing/debugging, Design-for-Testability (DfT), and Design-for-Reliability (DfR).