# **MTCMOS Post-Mask Performance Enhancement**

Kyosun Kim\*, Hyo-Sig Won\*\*, and Kwang-Ok Jeong\*\*

Abstract—In this paper, we motivate a post-mask performance enhancement technique combined with the Multi-Threshold Voltage CMOS (MTCMOS) leakage current suppression technology, and integrate the new design issues related to the MTCMOS technology into the ASIC methodology. The issues include the prevention of short-circuit current and sneak leakage current. To validate the proposed techniques, a Personal Digital Assistant (PDA) processor has been implemented using the methodology and a 0.18um process. The fabricated PDA processor operates at 333MHz, an improvement of about 23% at no additional cost of redesign and masks, and consumes about 2uW of standby mode leakage power, which could have been three orders of magnitude larger if the MTCMOS technology had not been applied.

Index Terms—MTCMOS, ASIC design methodology, leakage current, low power, post-mask performance enhancement

## I. MOTIVATION

The International Technology Roadmap for Semiconductors (ITRS) has, since 2001, been forecasting 0.1 W peak power and 2.1 mW standby power as the System-On-Chip Low Power (SOC-LP) PDA system specifications, which will demand the integration of multiple technologies including Low Operating Power (LOP), Low STandby Power (LSTP), and High Performance (HP) devices [1]. The threshold voltages (V<sub>th</sub>) of these devices in a process technology generation are determined so that the driving current is maximized while satisfying the leakage current constraints.

Table 1. Process Attributes of Low V<sub>th</sub> and High V<sub>th</sub> Transistors in a Typical 0.18um ASIC Technology

|                               | Low V<sub>th</sub> | High V<sub>th</sub> |
|-------------------------------|---------------------|----------------------|
| Threshold Voltage             | 0.45 V              | 0.6 V                |
| Sub-Threshold Leakage Current | 32 pA/um            | 1 pA/um              |
| Saturation Drive Current      | 540 uA/um           | 450 uA/um            |



Fig. 1. Performance and Leakage Tradeoff in Accordance with Threshold Voltage Variation

Manuscript received November 23, 2004; revised December 8, 2004.

<sup>\*</sup>Department of Electronic Engineering, University of Incheon, Incheon, Korea. E-mail: kkim@incheon.ac.kr

<sup>\*\*</sup>CAE Center, Samsung Electronics, Yongin, Korea. E-mail: {hs.won, kwang}@samsung.com

Table 1 summarizes the transistor attributes in a typical 0.18um ASIC process technology with a 1.8V supply voltage. Whereas the driving current of low $V_{th}$ transistors (LOP devices) is 20% larger than that of high $V_{th}$ transistors (LSTP devices), their sub-threshold leakage current is, unfortunately, about thirty times larger.

To motivate the MTCMOS power gating technology [3] combined with post-mask-tooling performance enhancement, we look more closely at the saturation (on) current and the sub-threshold leakage (off) current of low V<sub>th</sub> NMOS and PMOS transistors in a 0.18um process technology, as shown in Figure 1. If the threshold voltage is scaled down aggressively, to as low as 0.3V, the driving current increases by about 20% while the leakage current becomes forty times larger. This exponential increase of the leakage current has so far prevented aggressive scaling of the threshold voltage. Notice, however, that this exponentially increased leakage current is still about 300 times smaller than the saturation current, and is therefore dominant only in the standby mode. Since the high V<sub>th</sub> power switch in the MTCMOS technology effectively suppresses this standby mode leakage current, aggressive V<sub>th</sub> scaling now becomes available for extra performance enhancement.
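The exponential dependence of leakage on threshold voltage can be illustrated with a first-order sub-threshold model in which the off current is proportional to 10^(-V<sub>th</sub>/S). The sketch below is only an illustration: the model, the swing parameter S, and the function names are assumptions, not taken from this work. Only the 0.45V to 0.3V scaling and the factor of forty come from the text.

```python
import math

def implied_swing_mv_per_decade(vth_from, vth_to, leakage_ratio):
    """Effective swing S (mV/decade) implied by a leakage increase of
    `leakage_ratio` when Vth is lowered from vth_from to vth_to (volts).

    Assumes I_off is proportional to 10 ** (-Vth / S); this is a
    simplified model, not a fit to measured data.
    """
    delta_mv = (vth_from - vth_to) * 1000.0
    return delta_mv / math.log10(leakage_ratio)

def leakage_multiplier(vth_from, vth_to, swing_mv):
    """Leakage increase factor predicted by the same assumed model."""
    delta_mv = (vth_from - vth_to) * 1000.0
    return 10.0 ** (delta_mv / swing_mv)

# Values quoted in the text: Vth lowered from 0.45 V to 0.3 V, leakage x40.
swing = implied_swing_mv_per_decade(0.45, 0.30, 40.0)
print(f"implied swing: {swing:.1f} mV/decade")
print(f"model check: x{leakage_multiplier(0.45, 0.30, swing):.1f}")
```

Under this assumed model, each further reduction of V<sub>th</sub> by one swing multiplies the leakage by ten, which is why the leakage grows much faster than the roughly linear gain in drive current.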

This performance improvement technique provides an additional opportunity to speed up SOC designs, since the V<sub>th</sub> that is usually fixed during standard dual V<sub>th</sub> process development can be further scaled down even after the mask-tooling step. The ITRS forecasts that the total chip power using only LSTP devices will reach 1.5W in 2016, and that almost all of this will be dynamic power [2]. One of the major sources of this increase in dynamic power is the power supply voltage of the LSTP devices, which is about ten to fifty percent larger than that of the LOP devices. This is mainly due to the need to maintain sufficient voltage over-drive (VDD - V<sub>th</sub>) to allow sufficient circuit switching noise margin (at least 2 times the threshold voltage). The proposed performance improvement technique does not hinder VDD scaling, and consequently will significantly alleviate the quadratic increase of the dynamic power in future LSTP devices.
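The quadratic dependence of dynamic power on the supply voltage can be made concrete with the usual first-order switching power relation, P = a * C * VDD^2 * f. The following is a minimal sketch, assuming that the activity factor, capacitance, and frequency are equal for both device types; the function name is illustrative only.

```python
def dynamic_power_ratio(vdd_ratio):
    """Relative dynamic power when only the supply voltage changes.

    Assumes P_dyn = activity * C * VDD**2 * f with everything except
    VDD held constant (a first-order approximation).
    """
    return vdd_ratio ** 2

# Supply voltage ten to fifty percent larger, as quoted in the text.
for extra in (0.10, 0.50):
    ratio = dynamic_power_ratio(1.0 + extra)
    print(f"VDD +{extra:.0%}: dynamic power x{ratio:.2f}")
```

Under these assumptions, a supply that is ten to fifty percent larger corresponds to roughly 1.2 to 2.25 times the dynamic power.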

The rest of this paper is organized as follows. Section 2 introduces the MTCMOS design issues such as the short-circuit current due to floating inputs, sneak leakage prevention, and timing closure. Section 3 shows the test results of the fabricated PDA processor. Finally, concluding remarks are given in Section 4.

## II. MTCMOS DESIGN ISSUES

### 1. Sleepless IPs

An SOC design includes a variety of IPs such as processors, memories, and analog components. Some of these IPs may not be implemented using the MTCMOS technology. Whereas the high performance embedded processors which provide power saving modes are turned off in the sleep mode, the memories, which are implemented using high V<sub>th</sub>, low leakage transistors, are always turned on to preserve their data. Also, house-keeping circuits such as the real time clock and the power manager should be operational during the sleep mode. These non-MTCMOS IPs are directly powered by VDD and GND, and are therefore sleepless (always awake even in the sleep mode). If a sleepless IP is soft, it should be implemented using SleepLess (SL) cells, which consist of high V<sub>th</sub> transistors and are directly connected to GND.

### 2. Floating Input Induced Short-Circuit Current



Fig. 2. Floating Prevention Circuit

Since the output nodes of all MTCMOS gates float when the VGND floats in the sleep mode, the floating inputs to the *sleepless* IPs can cause a very large short-circuit current that flows directly from VDD to GND. To eliminate this current, we insert a data-holding circuit, composed of a tri-state buffer and a level holder, at the output port of each MTCMOS logic gate that drives an input of a *sleepless* IP, as shown in Figure 2. This data-holding circuit is *sleepless* as well, and is called the Floating Prevention Circuit (FPC) [4]. In the sleep mode, the Sleep Control Bar (SCB) signal goes high, the output of the tri-state buffer goes to the high impedance state, and the state on the latch is preserved.
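The behavior of the FPC described above can be summarized by a small behavioral model. This is a sketch under stated assumptions, not the circuit itself: the class name, the `step` method, and the use of `None` to represent a floating node are all hypothetical conventions introduced for illustration.

```python
class FloatingPreventionCircuit:
    """Behavioral model: a tri-state buffer followed by a level holder."""

    def __init__(self, initial=0):
        # Value currently stored by the level holder (latch).
        self.held = initial

    def step(self, data_in, scb):
        """Return the value seen by the sleepless IP.

        data_in: 0, 1, or None (None models a floating MTCMOS output).
        scb: 1 in the sleep mode (buffer in high impedance), 0 otherwise.
        """
        if scb == 0:
            # Active mode: the buffer drives the latch with the input.
            if data_in is None:
                raise ValueError("floating input while the buffer is enabled")
            self.held = data_in
        # Sleep mode: the buffer is in high impedance, so the floating
        # input is ignored and the latch keeps the last driven value.
        return self.held


fpc = FloatingPreventionCircuit()
active_value = fpc.step(1, scb=0)     # MTCMOS gate drives a '1'
sleep_value = fpc.step(None, scb=1)   # gate output floats in sleep mode
print(active_value, sleep_value)
```

The point of the model is that the sleepless IP never observes an undefined level: in the sleep mode it always sees the last valid value, so neither of its input transistors is left partially on.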

### 3. Sneak Leakage Prevention

A sneak leakage path is any current path from VDD to GND that continues to draw high current relative to a cut-off path during sleep mode. Calhoun *et al.* presented several design rules to remedy typical sneak leakage path patterns, most of which are related to transmission gates [5]. Figure 3 shows a typical sneak leakage path induced by a transmission gate logic multiplexer. The dashed line indicates the sneak leakage path from an MTCMOS inverter (a symbol drawn with a bold line denotes a gate consisting of low $V_{th}$ transistors) through an MTCMOS multiplexer to the high $V_{th}$ inverter whose output value is '0'.

Transmission gate logic provides an efficient and economical implementation of multiplexers and XOR gate dominant arithmetic units. However, due to its lack of driving capability and its vulnerability to interconnect delay, the state-of-the-art VDSM ASIC technologies, in which the interconnect delay is dominant, allow transmission gate logic to be used only inside cells and/or hard macros, and prohibit the exposure of transmission gates to the external interconnect. Multiplexers are implemented in the sum-of-products form, and transmission-gated inputs to an arithmetic unit are buffered. This effectively and thoroughly obviates Calhoun's patterns in ASIC designs.



Fig. 3. Transmission Gate Logic Induced Sneak Leakage Path

Although such prohibition eliminates the transmission gate logic induced leakage, sneak leakage paths still exist in ASIC designs. On a tri-state bus, a small latch, the so-called level holder, is required to retain the value until a new value is presented by one of the bus drivers, even after the driver that currently presents the value on the bus is disabled and enters the high-impedance state. Usually, the level holders are manually instantiated on the nets that are connected to tri-state bi-directional ports in the RTL design phase. If MTCMOS cells are instantiated, a sneak leakage path is formed from the VGND, through the pull-down transistor of an inverter in the latch and the pull-down transistors of the tri-state inverter, and finally to the GND, as shown in Figure 4. This path bypasses the power switches and connects the VGND directly to the GND, annulling all the efforts to suppress the leakage current. Fortunately, the leakage path can also be removed simply by replacing the MTCMOS cells with high $V_{th}$ cells.



Fig. 4. Sneak Leakage Path Induced by a Level Holder on a Tri-State Bus

An interesting point is the similarity between Calhoun's leakage path patterns and ours. While Calhoun's path is created by the MTCMOS transmission gate logic as shown in Figure 5(a), ours is created by the high $V_{th}$ tri-state inverter. However, since a tri-state inverter can be represented by an inverter with a CMOS transmission gate at the output, Figure 4 can be simplified to Figure 5(b), which is almost identical to Figure 5(a) except for the type of the CMOS transmission gate.



(a) Transmission Gate Logic Induced Sneak Path

(b) Tri-State Buffer Induced Sneak Path

Fig. 5. Sneak Leakage Path Patterns

Putting it all together, it can be observed that a sneak leakage path is created by one or a series of transmission gates with one end connected to the output of an MTCMOS gate, and the other end connected to the output of a high V<sub>th</sub> gate. Simple yet intuitive remedies that can be easily applied to the conventional ASIC design methodology are cell replacement and buffer insertion, as illustrated in Figure 6. The sneak leakage path shown in Figure 5(a) can be removed by the techniques shown in Figures 6(a) and (b). Similarly, the sneak leakage path shown in Figure 5(b) can be removed by the techniques shown in Figures 6(c) and (d). The basic rule is to make the types of the cells connected to the two ends of the transmission gate the same as that of the intervening transmission gate. While we have to rely on buffer insertion at the interface to a hard macro, cell replacement is applicable anywhere in soft blocks. Since both techniques affect the timing, either degrading or improving it, they must be combined with the timing closure techniques.
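The observation above suggests a simple netlist check: group the nets that are joined by transmission gates, and flag any group that is driven by both an MTCMOS gate and a high V<sub>th</sub> gate. The following is a minimal sketch of such a check. The netlist representation (`drivers`, `tgates`) and the function name are hypothetical, and the type of the transmission gate itself is ignored for brevity; this is not the toolkit described in this paper.

```python
from collections import defaultdict

def find_sneak_paths(drivers, tgates):
    """Flag groups of nets where transmission gates link an MTCMOS
    gate output to a high-Vth gate output.

    drivers: {net: "mtcmos" | "high_vth"}, type of the gate driving a net.
    tgates:  list of (net_a, net_b) pairs joined by a transmission gate.
    Returns a list of sorted net groups that violate the rule.
    """
    # Build an undirected graph whose edges are transmission gates.
    adjacency = defaultdict(set)
    for a, b in tgates:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, violations = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first search collects one chain of transmission gates.
        group, stack = set(), [start]
        while stack:
            net = stack.pop()
            if net in group:
                continue
            group.add(net)
            stack.extend(adjacency[net] - group)
        seen |= group
        driver_types = {drivers[n] for n in group if n in drivers}
        if {"mtcmos", "high_vth"} <= driver_types:
            violations.append(sorted(group))
    return violations


# Hypothetical example: n1 (MTCMOS output) reaches n3 (high-Vth output)
# through two transmission gates; n4 and n5 are both MTCMOS outputs.
drivers = {"n1": "mtcmos", "n3": "high_vth", "n4": "mtcmos", "n5": "mtcmos"}
tgates = [("n1", "n2"), ("n2", "n3"), ("n4", "n5")]
print(find_sneak_paths(drivers, tgates))
```

Each flagged group is a candidate for cell replacement or buffer insertion; a fuller check would also compare the cell type of each transmission gate against the gates at its two ends, as the basic rule requires.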

### 4. Timing Closure

Since the MTCMOS leakage suppression techniques affect the timing, while logic optimization may create circuit patterns that violate the rules for preventing short circuits and sneak leakage, the MTCMOS design issues cannot be separated from timing closure and fixed independently. Therefore, the timing optimization should be able to distinguish the sleepless parts from the MTCMOS parts and apply precisely the corresponding MTCMOS rules. In particular, the physical synthesis should be able to apply the MTCMOS rules to the design in a flat fashion. Although state-of-the-art commercial physical synthesis tools do not have ready-to-use functions to satisfy the MTCMOS constraints, the timing optimization can at least be guided so that the MTCMOS rules are not violated. Circuit objects such as cells, nets, ports, and modules can be selectively frozen (marked not to be changed during automatic optimization) by annotated properties, and the forbidden cell types for the buffers to be inserted can be defined appropriately for each execution of the timing optimization. We developed an MTCMOS rule compliance toolkit which (i) detects and fixes the rule violations by FPC insertion, cell replacement, and buffer insertion in the netlist, and (ii) generates property annotation scripts to guide the timing optimization. This toolkit enables the extended use of conventional ASIC design tools for MTCMOS design. However, the sequential nature of the design process incurs iterations. Therefore, MTCMOS-constraint-aware physical synthesis is required as an ultimate solution.



(c) High V<sub>th</sub> Buffer Insertion

(d) Cell Replacement

Fig. 6. Sneak Leakage Prevention

## III. EXPERIMENTAL RESULTS

### 1. PDA Application

The proposed MTCMOS design techniques have been validated on a 32-bit RISC microprocessor for handheld devices like PDAs and general applications with low power and high performance requirements. The design has been fabricated and fully tested.



Fig. 7. Photograph of the Chip

Figure 7 shows a microphotograph of the chip. About two million gates are implemented on a 5.7mm x 5.7mm chip which consumes 270mW.

Table 2. Performance Enhancement Due to V<sub>th</sub> Scaling

| Implementation | V<sub>th</sub> (V) | Speed (MHz) | I<sub>off</sub> (uA)<br>(PS off) | I<sub>off</sub> (uA)<br>(PS on) |
|----------------|---------------------|-------------|-----------------------------------|----------------------------------|
| Non MTCMOS     | 0.45                | 270         | 80                                |                                  |
| MTCMOS         | 0.45                | 262         | 1.0                               | 80                               |
| MTCMOS         | 0.38                | 285         | 1.0                               | 531                              |
| MTCMOS         | 0.3                 | 333         | 1.1                               | 6437                             |

To validate the MTCMOS leakage current suppression technology combined with post-mask-tooling performance enhancement, we fabricated three MTCMOS implementations, each with a different channel ion implantation and therefore a different threshold voltage. The leakage suppression effects of the Power Switch (PS) under different threshold voltage conditions are observed by measuring the leakage current in the sleep mode when the PS is turned off as originally intended, and when the PS is artificially turned on by using an additional testing circuit. The difference clearly demonstrates the suppression effects. The comparison of the MTCMOS implementations together with the non-MTCMOS implementation is summarized in Table 2. The first column shows the implementation technique. The next two columns show the threshold voltage and the maximum speed, respectively. The last two columns show the leakage current when the PS is off and on, respectively. The MTCMOS implementation is about 3% slower than the non-MTCMOS implementation due to the PS ground bounce. However, as the threshold voltage goes down, the speed reaches 333MHz while the leakage current is kept below 2uA, which could have been more than 6mA if the MTCMOS technology had not been applied. The speedup of 23% has been achieved at no design and mask costs.
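The figures quoted in this section follow directly from the entries of Table 2, as the short calculation below shows. It is a sketch for checking the arithmetic only: the variable and function names are illustrative, and the standby power estimate assumes that the leakage current is drawn from the 1.8V supply mentioned for the 0.18um process.

```python
VDD = 1.8            # volts; assumed chip supply, as quoted for the process
BASELINE_MHZ = 270   # speed of the non-MTCMOS implementation (Table 2)

# (Vth in V, speed in MHz, Ioff with PS off in uA, Ioff with PS on in uA)
mtcmos_rows = [
    (0.45, 262, 1.0, 80),
    (0.38, 285, 1.0, 531),
    (0.30, 333, 1.1, 6437),
]

def summarize(speed_mhz, ioff_ps_off_ua, ioff_ps_on_ua):
    """Return (speed change in %, leakage suppression ratio, standby uW)."""
    speed_change_pct = (speed_mhz / BASELINE_MHZ - 1.0) * 100.0
    suppression = ioff_ps_on_ua / ioff_ps_off_ua
    standby_uw = ioff_ps_off_ua * VDD
    return speed_change_pct, suppression, standby_uw

for vth, speed, ioff_off, ioff_on in mtcmos_rows:
    change, suppression, standby = summarize(speed, ioff_off, ioff_on)
    print(f"Vth={vth:.2f} V: speed {change:+.1f}%, "
          f"suppression x{suppression:.0f}, standby {standby:.2f} uW")
```

For the 0.3V row this gives a speed gain of about 23% over the non-MTCMOS baseline, a suppression ratio of more than three orders of magnitude, and a standby power of about 2uW, consistent with the values stated in the abstract.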

## IV. CONCLUSIONS

An MTCMOS post-mask performance enhancement technique has been developed and validated on a PDA processor. The test results of the fabricated PDA processor show a 23% performance enhancement, while achieving a three orders of magnitude reduction of the leakage current in the sleep mode.

## ACKNOWLEDGEMENT

This work has been supported by the 2003 new faculty fund from University of Incheon.

## REFERENCES

- [1] International Technology Roadmap for Semiconductors, 2001 Edition, System Drivers, http://public.itrs.net, 2001.
- [2] International Technology Roadmap for Semiconductors, 2001 Edition, Process Integration, Devices, & Structures, http://public.itrs.net, 2001.
- [3] K-T. Park, H-S. Won, et al., "Low-Power Data-Preserving Complementary Pass-Transistor-Based Circuit for Power-Down Circuit Scheme," SSDM, 2001.
- [4] H-S. Won, K. Kim, et al., "An MTCMOS Design Methodology and Its Application to Mobile Computing," ISLPED, pp. 110-115, August 2003.
- [5] B. H. Calhoun, et al., "Design Methodology for Fine-Grained Leakage Control in MTCMOS," ISLPED, pp. 104-109, August 2003.



**Kyosun Kim** received the BS and MS degrees in Electronic Engineering from Yonsei University, Seoul, Korea in 1986 and 1988, respectively, and the PhD degree in Electrical and Computer Engineering from the University of Massachusetts, Amherst, in 1998. He is an assistant professor in the Department of Electronic Engineering at the University of Incheon, where he has been since 2003. From 1988 to 2003, he was with the Semiconductor R&D Center, Samsung Electronics, Yong-In, Korea. His research interests are in high-level synthesis, fault-tolerant systems, reconfigurable computing, embedded systems, low-power design, and quantum-dot cellular automata.



**Hyo-sig Won** received the B.S. degree from Ajou University, Korea, in 1989, and the M.S. and Ph.D. degrees in electrical engineering from the University of Tohoku, Japan, in 1992 and 1997, respectively. His Ph.D. dissertation focused on low power design for neuro-chips. From 1997 to 1999, he was with Fujitsu, Co., Ltd., Japan, where he focused on low power and high performance design for SPARC processors. In 1999, he joined Samsung Electronics, Co., Ltd., Korea. His research interests include high-speed low-power circuit design for mobile applications and the general areas of VLSI Design and Design Automation. He has served as a program committee member of ISLPED.



**Kwangok Jeong** received the B.S. and M.S. degrees in Electrical Engineering from Hanyang University in 1997 and 1999, respectively. In 1999, he joined Samsung Electronics, Korea, where he has since been engaged in research on on-chip IR-drop analysis and low power SoC design. His research interests include power estimation, leakage reduction, dynamic IR-drop analysis, and power routing optimization methodologies.