# A Multi-Point Sense Amplifier and High-Speed Bit-Line Scheme for Embedded SRAM

II-Kwon Chang and Kae-Dal Kwack

Manuscript received January 12, 1998; accepted March 6, 1998. The authors are with the Department of Electronic Engineering, Hanyang University.

## Abstract

This paper describes a new sense amplifier with a fast sensing delay of 0.54 ns and a 32-kb CMOS embedded SRAM with a 4.67-ns access time at a 3-V power supply. These results were achieved by combining a sense amplifier that uses a multiple-point sensing scheme with a high-speed bit-line scheme. The sense amplifier saves 25% of the power dissipation compared with the conventional one while maintaining a very short sensing delay. The SRAM uses a 0.5-µm double-polysilicon, triple-metal CMOS process technology. The die size is $1.78\,\mathrm{mm} \times 2.13\,\mathrm{mm}$.

## I. Introduction

As hand-held computers and personal digital assistants (PDAs) have been developed, the demand for low-power, high-speed memories has increased. Embedded memories with high performance have been reported [1]-[3]. The access time of an embedded SRAM is determined by the critical path from address input to data output, which consists of address buffers, decoders, memory cells, sense amplifiers, and output buffers. The address buffer and decoder circuits are composed of typical gates, which determine the delay from the address input to the word line. Therefore, an effective way to reduce the delay of these circuits is to improve the switching speed of the primitive gates and to optimize the gate sizes. For the path from the word line to the data output, on the other hand, various circuit technologies have been proposed to improve the speed [4]-[7].

This paper deals with circuit technologies for the bit-line peripheral circuits and the sense amplifier; in particular, a high-speed sense amplifier was the key to achieving a high-speed embedded SRAM. Section II presents a novel Multi-Point Sense Amplifier, and Section III describes its advantages together with the applied bit-line schemes. Section IV presents simulation results. Finally, Section V summarizes the paper.

## II. Sense Amplifier Circuit and Bit-Line Scheme

As the density of CMOS SRAM increases, fast access time is becoming an important factor. To reduce the access time during a read operation, the swing on the bit lines should be as small as possible. The bit-line voltage swing versus local data-bus delay is shown in Fig. 1. However, since this voltage swing becomes the input signal to the sense amplifier, a small bit-line voltage swing causes additional delay in the sense amplifier.

Fig. 1. Bit-line Voltage Swing versus Local Data-bus Delay Time

A conventional PMOS cross-coupled sense amplifier, shown in Fig. 2(a), has already been reported [8][9]; it reduces power without degrading the access speed [10]. Transistors MN1 and MN2 convert the bit-line voltage difference into a difference between the currents flowing through MN1 and MN2, which results in a voltage difference between *Out* and $\overline{Out}$. Both outputs, *Out* and $\overline{Out}$, are precharged to Vdd. Because the voltage gain of MN1 and MN2 is lower than one, the sensing performance of this amplifier degrades at low slew rates after the *Sense* signal activates MN3.

Fig. 2. Conventional and Proposed Sense Amplifiers: (a) Conventional PMOS Cross-coupled Sense Amplifier; (b) Proposed Multi-Point Sense Amplifier

The proposed Multi-Point Sense Amplifier (MPSA) is shown in Fig. 2(b). The MPSA consists of a CMOS cross-coupled pair, an equalizing transistor, and sensing input transistors. The source nodes of the PMOS transistors MP1-MP4 of the CMOS cross-coupled pair are connected to the bit lines, so that the bit-line voltage difference is transferred before the *Sense* signal is activated. The equalizing transistor MP5 is connected between X and $\overline{X}$. The operation of this circuit is as follows.
When the *Sense* signal is at ground level, transistors MP3 and MP4 transfer the bit-line voltage difference from the two bit lines to the nodes *Out* and $\overline{Out}$, which splits the voltage levels of the output nodes. The voltages at X and $\overline{X}$ are equalized by MP5. This equalizing scheme accelerates sensing. When the *Sense* signal rises to Vdd, the current flowing through MN5 lowers the voltage at the drain node of MN5. Once the gate-source voltage of MN1 or MN2 is large enough to turn the transistor on, the bit-line voltage difference is amplified.

Consequently, the sensing delay is reduced because the MPSA senses the bit-line input at multiple sensing points. First, the output nodes sense the bit lines through MP3 and MP4. Second, MN1 and MN2 amplify the input difference when the *Sense* signal is activated. Finally, MP1 and MP2 transfer the bit-line voltage to the output nodes.

Fig. 3 shows the sensing delay of the conventional sense amplifier and the proposed MPSA as a function of the bit-line slew rate. The sensing delay is defined as the time interval from the crossing point of the input voltage to the point at which the output voltage swing reaches 1 V. The sensing delay of both sense amplifiers increases gradually as the slew rate decreases. The proposed MPSA shows a delay that is shorter by 0.35 ns at 100 mV/ns and a smaller dependence on the bit-line slew rate than the conventional one. This small dependence on the bit-line slew rate also reduces the sensitivity of the access time at low slew rates. Fig. 4 shows the dependence of the sensing delay on the sense amplifier active current, ISA, simulated with a 50-mV input voltage swing and a 1-V output voltage swing.
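The sensing-delay definition above can be applied directly to sampled simulation waveforms. The following is a minimal sketch under stated assumptions: the function names `first_crossing` and `sensing_delay` are hypothetical, the "output voltage swing" is interpreted as the differential output voltage, and the ramp waveforms are synthetic test data, not results from this work.

```python
def first_crossing(t, y, level):
    """Return the first time y reaches or crosses `level`, using linear interpolation."""
    for i in range(1, len(t)):
        y0, y1 = y[i - 1], y[i]
        if (y0 - level) * (y1 - level) <= 0 and y0 != y1:
            return t[i - 1] + (level - y0) * (t[i] - t[i - 1]) / (y1 - y0)
    return None


def sensing_delay(t, bl, blb, out, outb, swing=1.0):
    """Delay from the bit-line crossing point to a differential output swing of `swing` volts."""
    d_in = [a - b for a, b in zip(bl, blb)]
    t_in = first_crossing(t, d_in, 0.0)          # crossing point of the input voltages
    if t_in is None:
        return None
    start = next(i for i, ti in enumerate(t) if ti >= t_in)
    d_out = [abs(a - b) for a, b in zip(out, outb)]
    t_out = first_crossing(t[start:], d_out[start:], swing)
    return None if t_out is None else t_out - t_in


# Synthetic example (assumed values): bit lines cross at 0.5 ns with a
# differential slew rate of 100 mV/ns; one output starts falling at 0.6 ns
# at 2 V/ns while the other stays at 3 V.
t = [i * 0.01 for i in range(201)]                      # time in ns
bl = [3.0 + 0.05 * (ti - 0.5) for ti in t]
blb = [3.0 - 0.05 * (ti - 0.5) for ti in t]
out = [3.0 for _ in t]
outb = [3.0 - max(0.0, 2.0 * (ti - 0.6)) for ti in t]

delay_ns = sensing_delay(t, bl, blb, out, outb)
print(f"sensing delay = {delay_ns:.2f} ns")
```

Sweeping the bit-line slew rate or the tail current in a circuit simulator and applying such a measurement to each run would give curves of the kind plotted in Fig. 3 and Fig. 4.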
The conventional sense amplifier has a large dependence on the active current, especially at low current, whereas the proposed MPSA has a very small dependence on the active current even at low current levels. This means that the proposed MPSA dissipates less power than the conventional one. Using the proposed sense amplifier, we obtained a sensing delay of 0.48 ns with a sense amplifier current of 180 µA. The conventional sense amplifier needs as much as 240 µA to attain the same delay. The MPSA therefore saves the extra power that the conventional amplifier requires to attain the same short delay, a 25% reduction in sense amplifier current.

Fig. 3. Sensing Delay vs. Slew-rate of Bit Lines

Fig. 4. Sensing Delay vs. Active Current, ISA

## III. Bit-Line Schemes

### 1. Bit-Line Scheme

The bit-line and local-data-bus peripheral circuits are shown in Fig. 5. Precharging and equalizing circuits are connected to the column selecting circuit. Conventional fast SRAM designs use a bit-line precharge-and-equalize technique. However, this technique is not suitable for a large SRAM because the total gate capacitance of the precharge transistors is too large for fast operation. A precharge circuit driven by *Pre\_charge* is therefore used instead of static precharge transistors. The precharge and equalize operation of this bit-line circuit is the same as that of the conventional circuit, without sacrificing any precharge capability. The sense amplifier is followed by the column selecting circuit. The *Col\_sel* signal controls the column access. If a column is not selected, its read and write access remains off while the bit-line precharge devices are on, acting as bit-line clamps. As a consequence, dc power is consumed in all unselected columns of the memory during a read or write cycle.

Fig. 5. The Bit-line and Local-data-bus Peripheral Circuits

Fig. 6. Comparison Waveforms using MPSA & Conventional Sense Amp.

A read cycle was simulated with SmartSpice from Silvaco; the resulting waveforms are shown in Fig. 6.
This figure shows two cases: one using the conventional sense amplifier and one using the MPSA. With the conventional sense amplifier, the *Sense* signal rises only after the bit-line swing is large enough to activate the sense amplifier. Waiting for the bit-line voltage swing therefore adds access delay. With the MPSA, the *Sense* signal can rise quickly because the output nodes of the MPSA are already precharged to the bit-line voltage level. Consequently, no additional delay occurs.

### 2. Output Circuits

The key issue in designing a high-speed SRAM with a byte-wide organization is noise reduction. There are two kinds of noise: Vdd noise and GND noise. In such an SRAM, when the output transistors drive a large load capacitance, noise is generated and multiplied by 16 because 16 outputs may change simultaneously. This is an especially serious problem for the data-zero output. That is, when the output NMOS transistor drives the large load capacitance, the GND potential of the chip rises because of the peak current and the parasitic inductance of the GND line. The address buffer and the ATD circuit are then influenced by the GND bounce, and unnecessary signals are generated. Because the valid data are delayed, the access time becomes longer in the worst case. A new two-step drive scheme was therefore proposed, and good operation was confirmed.

Fig. 7 shows the output circuit. In the conventional circuit, the high-speed driving of transistor MN1 raises the GND potential, and the valid data are delayed by the output ringing. The new noise-reduction output circuit, on the other hand, consists of one PMOS transistor, two NMOS transistors, one NAND gate, and a delay part. The operation of this circuit is explained as follows. The control signals (Wen, $\overline{Wen}$) are at low level in a read operation and at high level in a write operation.
When a data-zero output at logical high level is transferred to Read data and Wen and $\overline{Wen}$ are activated for a read operation, transistor MN4 is cut off, and MN3 raises node X to the middle level. Because MN1 is driven by this middle level, the peak current that flows into the GND line through MN1 is reduced to less than one half of that of the conventional circuit. After a delay counted from the beginning of the middle level, transistor MP3 raises node X to the Vdd level. As a result, the conductance of MN1 becomes maximum, but the peak current is small because the output voltage is already low. Therefore, the increase in GND potential is small, and output ringing does not appear. For example, when a 10-pF load capacitance is driven, the valid data appear approximately 0.52 ns earlier than with the conventional circuit.

Fig. 7. Output Circuit with Noise Reduction Circuit

## IV. Simulation Results

To demonstrate the proposed sense amplifier, the MPSA is applied to a 32-kb high-speed embedded SRAM. The SRAM uses a 0.5-µm CMOS process technology. The MOSFETs have a gate length of 0.5 µm for a 3-V power supply. The memory cells use a polysilicon word line strapped with a metal layer every four columns to reduce the word-line delay. Fig. 8 shows the layout of the SRAM with the MPSA for testing. The typical characteristics of the 32-kb SRAM are shown in Table I. The die measures 1.78 mm × 2.13 mm. The cell array consists of 128 rows by 256 columns. The address buffers, pre-decoders, and ATD circuits are laid out such that signal line lengths do not exceed 1.7 mm, which keeps the signal delay below 0.5 ns. As a result, a word-line selection time of 3.29 ns is achieved.

The simulation waveforms for the access time of the 32-kb SRAM are shown in Fig. 9. Fig. 10 compares the total access time of the 32-kb memory core with that of the conventional one. The sensing delay is reduced from 0.91 ns to 0.54 ns by the multi-point sensing scheme, and the bit-line delay is reduced from 1.65 ns to 0.32 ns.
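As a quick arithmetic check of the two component delays quoted above, the relative reductions can be computed directly. This is a minimal sketch: the helper name `reduction_percent` is hypothetical, and only the four delay values are taken from the text.

```python
def reduction_percent(before_ns, after_ns):
    """Relative reduction of a delay, in percent of the original value."""
    return 100.0 * (before_ns - after_ns) / before_ns


# Component delays quoted in the text (conventional -> this work), in ns.
sensing = reduction_percent(0.91, 0.54)
bit_line = reduction_percent(1.65, 0.32)
saved_ns = (0.91 - 0.54) + (1.65 - 0.32)

print(f"sensing delay reduced by  {sensing:.1f}%")
print(f"bit-line delay reduced by {bit_line:.1f}%")
print(f"time saved in these two stages: {saved_ns:.2f} ns")
```

On these figures the sensing delay falls by roughly 41% and the bit-line delay by roughly 81%, which together save about 1.7 ns in these two stages.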
The total access time of the memory is 4.67 ns in this work, a 34% reduction compared with the conventional design.

Fig. 8. SRAM layout with MPSA

| Parameter     | Value                                       |
|---------------|---------------------------------------------|
| Technology    | 0.5-µm twin-well 2-poly triple-metal CMOS   |
| Configuration | 2k × 16b                                    |
| Power supply  | 3 V                                         |
| Access time   | 4.67 ns                                     |
| Chip size     | 1.78 mm × 2.13 mm                           |

Table I. Typical Characteristics of 32kb SRAM

Fig. 9. Simulation waveforms for 32k SRAM with the MPSA

Fig. 10. Comparison of the access time

## V. Conclusion

A new multi-point sense amplifier is proposed to meet the requirements of fast embedded SRAM. It can amplify a small input voltage swing with a low active current. Consequently, a sense amplifier delay of 0.54 ns was achieved with a sense amplifier current of 150 µA. The MPSA saves power while attaining a very short sensing delay. With the MPSA, the bit-line delay is reduced by about 80% compared with that obtained with the conventional sense amplifier.

## References

- [1] A. L. Silburt, R. S. Phillips, G. F. R. Gibson, S. W. Wood, A. G. Bluschke, J. S. Fujimoto, S. P. Kornachuk, B. Nadeau-Dositie, and R. K. Verma, A 180MHz 0.8µm BiCMOS Modular Memory Family of DRAM and Multiport SRAM, *IEEE Journal of Solid-State Circuits*, vol. 28, no. 3, pp. 222-232, 1993.
- [2] M. Izumikawa, K. Suzuki, M. Nomura, H. Igura, H. Abiko, K. Okabe, A. Ono, T. Nakayame, M. Yamashina, and H. Yamada, A 400MHz, 300mW, 8kb, CMOS SRAM Macro with a Current Sensing Scheme, IEEE Custom Integrated Circuits Conference, pp. 595-598, 1994.
- [3] J. S. Caravella, A Low Voltage SRAM For Embedded Applications, *IEEE Journal of Solid-State Circuits*, vol. 32, no. 3, pp. 428-432, 1997.
- [4] M. Matsumiya, S. Kawashima, M. Sakata, M. Oojura, T. Miybo, T. Koga, K. Itabashi, K. Mizutani, H. Shimada, and N. Suzuki, A 15ns 16Mb CMOS SRAM with Interdigitated Bit-Line Architecture, *IEEE Journal of Solid-State Circuits*, vol. 27, no. 11, pp. 1497-1502, 1992.
- [5] K. Sasaki, K. Ishibashi, K. Ueda, K. Komiyaji, T. Yamanaka, N. Hashimoto, H. Toyoshima, F. Kojima, and A. Shimizu, A 7ns 140mW 1Mb CMOS SRAM with Current Sense Amplifier, *IEEE Journal of Solid-State Circuits*, vol. 27, no. 11, pp. 1511-1517, 1992.
- [6] K. Seno, K. Knorpp, L. L. Shu, F. Miyaji, M. Sasaki, M. Takeda, T. Yokioyama, K. Fjita, T. Kimura, Y. Tomo, P. Chuang, and K. Kobayshi, A 9ns 16Mb CMOS SRAM with Offset Reduced Current Sense Amplifier, ISSCC Dig. Tech. Papers, pp. 248-249, 1993.
- [7] K. Sasaki, K. Ishibashi, K. Shimohigashi, T. Yamanak, N. Moriwaki, S. Honjo, S. Ikeda, A. Koike, S. Meguro, and O. Minato, A 23-ns 4-Mb CMOS SRAM with 0.2µA Standby Current, *IEEE Journal of Solid-State Circuits*, vol. 25, no. 8, pp. 1075-1080, 1990.
- [8] T. Kobayshi, K. Nogmi, T. Shirotori, Y. Fujmoto, and O. Watanabe, A Current-mode Latch Sense Amplifier and a Static Power Saving Input Buffer for Low-Power Architecture, Symp. VLSI Circuits, Dig. Tech. Papers, pp. 28-29, 1992.
- [9] T. Seki, E. Itoh, C. Furukawa, I. Maeno, T. Ozawa, H. Sano, and N. Suzuki, A 6ns 1Mb CMOS SRAM with Latched Sense Amplifier, *IEEE Journal of Solid-State Circuits*, vol. 27, no. 4, pp. 1075-1080, 1990.
- [10] K. Sasaki, K. Ishibashi, K. Shimohigashi, T. Yamanak, N. Moriwaki, T. Nishida, K. Shimohigashi, S. Hanamura, and S. Honjo, A 9-ns 1-Mb CMOS, *IEEE Journal of Solid-State Circuits*, vol. 24, no. 5, pp. 1219-1225, 1990.

II-Kwon Chang was born in Seoul, Korea, in 1971. He received the B.S. and M.S. degrees in electronic engineering from Hongik University and Hanyang University, Korea, in 1996 and 1998, respectively. He is currently working toward the Ph.D. degree in electronic engineering at Hanyang University. Since 1996 he has been a Research Assistant at the Hanyang University Semiconductor Laboratory. His research interests include low-power memory circuit design and low-voltage analog circuit design.

Kae-Dal Kwack was born in Daegu, Korea, in 1950. He received the B.S. and M.S. degrees in electronic engineering from Hanyang University, Seoul, Korea, in 1974 and 1976, respectively, and the Ph.D. degree in electronic engineering from the Institute National Polytechnique de Toulouse, E.N.S.E.E.H.T., Toulouse, France, in 1980. He is currently a professor in the Department of Electronic Engineering at Hanyang University. He has headed the Hanyang Advanced Semiconductor Center since 1992. His research currently focuses on semiconductor device modeling, device simulators, and low-power circuit design.