# A Low-area and Low-power 512-point Pipelined FFT Design Using Radix-$2^{4}$-$2^{3}$ for OFDM Applications

Jian Yu*, Kyung-Ju Cho**


#### Abstract

In OFDM-based systems, the FFT is a critical component because it occupies a large area and consumes considerable power. In this paper, we present a low hardware-cost and low-power 512-point pipelined FFT design method for OFDM applications. To reduce the number of twiddle factors and keep the architecture simple, the radix-$2^{4}$-$2^{3}$ algorithm is exploited. For twiddle factor multiplication, we propose a new canonical signed digit (CSD) complex multiplier design method that minimizes the hardware cost. In a hardware implementation on an Intel FPGA, the proposed FFT design achieves about $28 \%$ reduction in gate count and $18 \%$ reduction in power consumption compared to the previous approaches.


Key Words : low-power, pipelined, FFT, OFDM, constant complex multiplier

## 1. Introduction

The FFT processor is one of the high-complexity components in the physical layer of OFDM-based applications such as IEEE 802.11a/g/n, WPAN, and LTE systems. Thus, many FFT design approaches have been developed to reduce the computational complexity [1][2].

The radix-2 algorithm is popular because its butterfly is simple to implement. However, it needs more complex multiplications. The radix-4 algorithm reduces the number of complex multiplications at the cost of a more complex butterfly. To achieve a simple butterfly and reduce the number of twiddle factor multiplications, radix-$2^{k}$ ($k=2 \sim 5$) FFT algorithms have been proposed in [3-5].

Among the various FFT architectures, pipelined architectures provide high throughput at the cost of reasonable hardware overhead. There are two types of pipelined FFT architecture: feedforward and feedback. Feedforward architectures can be classified into single-path delay commutator and multi-path delay commutator. Feedback architectures can be classified into single-path delay feedback (SDF) and multi-path delay feedback [6].

For twiddle factor multiplication in [2], CSD multipliers are adopted to efficiently design the complex multipliers, each composed of four multiplications and two additions.

In this paper, we present a low hardware-cost and low-power 512-point FFT with a radix-$2^{4}$-$2^{3}$ SDF architecture. To reduce the hardware cost of twiddle factor multiplication, we propose new CSD complex multipliers that remove the need for a ROM to store twiddle factors.

## 2. Design Issues of 512-point FFT

The discrete Fourier transform $X(k)$ of an $N$-point input signal $x(n)$ is defined as

[^0]

$$
\begin{equation*}
X(k)=\sum_{n=0}^{N-1} x(n) W_{N}^{n k}, \quad 0 \leq k \leq N-1, \tag{1}
\end{equation*}
$$

where the twiddle factor $W_{N}^{n k}=e^{-j 2 \pi n k / N}$.

The direct computation of (1) requires large hardware resources and computation time. To overcome this shortcoming, radix-$2^{k}$ algorithms with a reduced number of complex multiplications have been presented.
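As a point of reference for (1), the following minimal sketch evaluates the DFT directly with the twiddle factor $W_{N}^{nk}=e^{-j2\pi nk/N}$. The function names `twiddle` and `dft` are illustrative and do not come from the paper; the double loop makes the $N^{2}$ complex multiplications of the direct form explicit.

```python
import cmath


def twiddle(n_points, exponent):
    """Twiddle factor W_N^exponent = exp(-j*2*pi*exponent/N), as defined below Eq. (1)."""
    return cmath.exp(-2j * cmath.pi * exponent / n_points)


def dft(x):
    """Direct evaluation of Eq. (1): N outputs, each a sum of N complex products."""
    n_points = len(x)
    return [
        sum(x[n] * twiddle(n_points, n * k) for n in range(n_points))
        for k in range(n_points)
    ]


# Small illustrative inputs (hypothetical data, not taken from the paper).
impulse_spectrum = dft([1, 0, 0, 0])
constant_spectrum = dft([1, 1, 1, 1])
```

For $N=512$ this direct form needs $512^{2}$ complex products per transform, which is the cost the radix-$2^{k}$ decompositions are meant to avoid.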

The 512-point FFT computation with radix-$2^{k}$ algorithms is composed of nine stages. Table 1 shows the sequence of twiddle factors at each stage for $N=512$ and the radix-$2^{k}$ algorithms, where $-j$ means a trivial multiplication and '\#CM' denotes the required number of complex multiplications excluding $-j$.

The radix-$2^{5}$ and radix-$2^{4}$-$2^{3}$ algorithms have fewer complex multiplications and simpler twiddle factors than the others. The radix-$2^{4}$-$2^{3}$ algorithm is simpler than the radix-$2^{5}$ algorithm in butterfly control. Thus, the radix-$2^{4}$-$2^{3}$ algorithm is the preferred candidate for the FFT design.

Among the various pipelined FFT architectures, we adopt the SDF approach based on the radix-$2^{k}$ algorithm for its low cost and high efficiency [1].

Table 1. Base number of the twiddle factor at each stage

| Algorithms | Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 | Stage 6 | Stage 7 | Stage 8 | \#CM |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| radix-$2^{2}$ | $-j$ | $W_{512}$ | $-j$ | $W_{128}$ | $-j$ | $W_{32}$ | $-j$ | $W_{8}$ | 1196 |
| radix-$2^{3}$ | $-j$ | $W_{8}$ | $W_{512}$ | $-j$ | $W_{8}$ | $W_{64}$ | $-j$ | $W_{8}$ | 1208 |
| radix-$2^{4}$ | $-j$ | $W_{16}$ | $-j$ | $W_{512}$ | $-j$ | $W_{16}$ | $-j$ | $W_{32}$ | 1200 |
| radix-$2^{5}$ | $-j$ | $W_{8}$ | $W_{32}$ | $-j$ | $W_{512}$ | $-j$ | $W_{16}$ | $-j$ | 1168 |
| radix-$2^{4}$-$2^{3}$ | $-j$ | $W_{16}$ | $-j$ | $W_{512}$ | $-j$ | $W_{8}$ | $W_{32}$ | $-j$ | 1168 |

Fig. 1. 512-point radix-$2^{4}$-$2^{3}$ SDF FFT architecture.

Fig. 1 shows the proposed architecture of the radix-$2^{4}$-$2^{3}$ 512-point SDF FFT. To present the proper data at each butterfly input, two types of butterfly (BF1 and BF2 in [3]) and several delay buffers of different sizes are used for data shuffling. The control signal (ctrl) switches between the butterfly types and also provides the proper control for twiddle factor multiplication.
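The delay buffers of different sizes can be illustrated with a small sketch. It assumes the usual single-path delay feedback arrangement, in which the butterfly at stage $s$ uses a feedback buffer of $N/2^{s}$ samples. The paper does not list the buffer lengths of Fig. 1, so both this sizing rule and the helper name `sdf_buffer_sizes` are assumptions.

```python
def sdf_buffer_sizes(n_points):
    """Assumed feedback-buffer length per radix-2 butterfly stage: N / 2**s for s = 1..log2(N)."""
    stages = n_points.bit_length() - 1  # log2(N) when N is a power of two
    return [n_points >> s for s in range(1, stages + 1)]


sizes = sdf_buffer_sizes(512)
total_delay_elements = sum(sizes)
```

Under this assumption the nine stages of a 512-point design use buffers of 256 down to 1 samples, for $N-1$ delay elements in total.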

## 3. Proposed FFT Design

### 3.1 Proposed CSD complex multiplier for $W_{8}^{i}$, $W_{16}^{i}$ and $W_{32}^{i}$

To design constant complex multipliers for the twiddle factors $W_{8}^{i}$, $W_{16}^{i}$ and $W_{32}^{i}$, we first determine the constant values that these twiddle factors require.

The twiddle factors $W_{8}^{i}$ at stage 6 need only 4 factors ($i=0 \sim 3$). By using $W_{N}^{N / 4}=-j$ and the symmetry property of the complex sinusoidal function, only one twiddle factor, $W_{8}^{1}$, is required. For this twiddle factor, only the constant $\operatorname{Re}\left\{W_{8}^{1}\right\}$ is needed since $\operatorname{Re}\left\{W_{8}^{1}\right\}=-\operatorname{Im}\left\{W_{8}^{1}\right\}$, where $\operatorname{Re}\{t\}$ and $\operatorname{Im}\{t\}$ denote the real part and imaginary part of $t$, respectively.
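Written out with $W_{N}=e^{-j 2 \pi / N}$ from (1), the four factors reduce to a single non-trivial constant $c=\operatorname{Re}\left\{W_{8}^{1}\right\}=1/\sqrt{2}$. This is only a worked restatement of the argument above, not an additional result.

$$
\begin{aligned}
W_{8}^{0} &= 1, \\
W_{8}^{1} &= c\,(1-j), \\
W_{8}^{2} &= -j, \\
W_{8}^{3} &= W_{8}^{2}\, W_{8}^{1} = -j\,c\,(1-j) = -c\,(1+j).
\end{aligned}
$$

For an input $a+jb$, the product $(a+jb)\,W_{8}^{1}=c\,[(a+b)+j(b-a)]$ therefore needs only two real multiplications by the same constant $c$.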

There are 7 twiddle factors $W_{16}^{i}$ ($i=0 \sim 4, 6, 9$) at stage 2. Applying rules similar to those for $W_{8}^{i}$, these twiddle factors can be expressed using only three constant values: $\operatorname{Re}\left\{W_{16}^{1}\right\}$, $\operatorname{Re}\left\{W_{16}^{2}\right\}$ and $\operatorname{Re}\left\{W_{16}^{3}\right\}$.

The twiddle factors $W_{32}^{i}$ at stage 7 need 16 factors ($i=0 \sim 10, 12, 14, 15, 18, 21$). In the same way, these factors can be expressed by the 7 values $\operatorname{Re}\left\{W_{32}^{1}\right\}$, $\operatorname{Re}\left\{W_{32}^{2}\right\}$, $\operatorname{Re}\left\{W_{32}^{3}\right\}$, $\operatorname{Re}\left\{W_{32}^{4}\right\}$, $\operatorname{Re}\left\{W_{32}^{5}\right\}$, $\operatorname{Re}\left\{W_{32}^{6}\right\}$, and $\operatorname{Re}\left\{W_{32}^{7}\right\}$, as shown in Table 2.

To design the constant multiplications efficiently, the CSD representation and the common sub-expression (CSE) sharing algorithm in [8] are adopted. Table 3 shows the CSD representations of the 7 coefficients. The CSE '101' (or '-10-1') is marked by the red solid line. Also, the CSEs '10-1' (or '-101') and '1000-1' (or '-10001') are marked by the blue dashed line and the purple dotted line, respectively. The 7 CSD multipliers can be obtained by using 16 shifters and 13 additions as

$$
\begin{align*}
& \mathrm{CSE1}=d+(d \gg 2)  \tag{2}\\
& \mathrm{CSE2}=d-(d \gg 2) \\
& \mathrm{CSE3}=d-(d \gg 4) \\
& d \times \operatorname{Re}\left\{W_{32}^{1}\right\}=d-(\mathrm{CSE1} \gg 6) \\
& d \times \operatorname{Re}\left\{W_{32}^{2}\right\}=d-(\mathrm{CSE1} \gg 4)+(d \gg 9) \\
& d \times \operatorname{Re}\left\{W_{32}^{3}\right\}=\mathrm{CSE2}+(\mathrm{CSE1} \gg 4)+(\mathrm{CSE2} \gg 8) \\
& d \times \operatorname{Re}\left\{W_{32}^{4}\right\}=\mathrm{CSE2}-(d \gg 4)+(\mathrm{CSE1} \gg 6) \\
& d \times \operatorname{Re}\left\{W_{32}^{5}\right\}=(d \gg 1)+(d \gg 4)-(\mathrm{CSE3} \gg 7) \\
& d \times \operatorname{Re}\left\{W_{32}^{6}\right\}=(\mathrm{CSE2} \gg 1)+(\mathrm{CSE3} \gg 7) \\
& d \times \operatorname{Re}\left\{W_{32}^{7}\right\}=(\mathrm{CSE2} \gg 2)+(\mathrm{CSE3} \gg 7)
\end{align*}
$$

where $d$ and $\gg t$ stand for the multiplicand of the twiddle factor and the right-shift operation by $t$ bits, respectively. Note that the CSD constant complex multiplier consists only of adders, shifters and multiplexers, and therefore requires fewer hardware resources than a general complex multiplier.
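The shift-and-add expressions in (2) can be checked numerically. The sketch below is an illustrative software model, not the authors' hardware description: it applies the same shifts and additions to an integer operand and compares each result with $\cos(2\pi k/32)$. The function name `csd_multiply_w32` and the test operand `D` are assumptions chosen for the example.

```python
import math


def csd_multiply_w32(d):
    """Return {k: d * Re{W_32^k}} for k = 1..7 using only shifts and adds, following Eq. (2)."""
    cse1 = d + (d >> 2)  # pattern '101'    -> 1.25 * d
    cse2 = d - (d >> 2)  # pattern '10-1'   -> 0.75 * d
    cse3 = d - (d >> 4)  # pattern '1000-1' -> 0.9375 * d
    return {
        1: d - (cse1 >> 6),
        2: d - (cse1 >> 4) + (d >> 9),
        3: cse2 + (cse1 >> 4) + (cse2 >> 8),
        4: cse2 - (d >> 4) + (cse1 >> 6),
        5: (d >> 1) + (d >> 4) - (cse3 >> 7),
        6: (cse2 >> 1) + (cse3 >> 7),
        7: (cse2 >> 2) + (cse3 >> 7),
    }


# Hypothetical operand: a large power of two, so that no shift discards bits
# and only the coefficient quantization contributes to the error.
D = 1 << 20
products = csd_multiply_w32(D)
errors = {
    k: abs(p / D - math.cos(2 * math.pi * k / 32)) for k, p in products.items()
}
```

With a 12-bit input the right shifts also drop low-order bits, so a real datapath can show somewhat larger errors than this idealized check.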

Table 2. Representation of $W_{32}^{i}$

| Twiddle factor | Value | Twiddle factor | Value |
| :---: | :---: | :---: | :---: |
| $W_{32}^{0}$ | 1 | $W_{32}^{8}$ | $-j$ |
| $W_{32}^{1}$ | $\operatorname{Re}\left\{W_{32}^{1}\right\}-j \operatorname{Re}\left\{W_{32}^{7}\right\}$ | $W_{32}^{9}$ | $-\operatorname{Re}\left\{W_{32}^{7}\right\}-j \operatorname{Re}\left\{W_{32}^{1}\right\}$ |
| $W_{32}^{2}$ | $\operatorname{Re}\left\{W_{32}^{2}\right\}-j \operatorname{Re}\left\{W_{32}^{6}\right\}$ | $W_{32}^{10}$ | $-\operatorname{Re}\left\{W_{32}^{6}\right\}-j \operatorname{Re}\left\{W_{32}^{2}\right\}$ |
| $W_{32}^{3}$ | $\operatorname{Re}\left\{W_{32}^{3}\right\}-j \operatorname{Re}\left\{W_{32}^{5}\right\}$ | $W_{32}^{12}$ | $-\operatorname{Re}\left\{W_{32}^{4}\right\}-j \operatorname{Re}\left\{W_{32}^{4}\right\}$ |
| $W_{32}^{4}$ | $\operatorname{Re}\left\{W_{32}^{4}\right\}-j \operatorname{Re}\left\{W_{32}^{4}\right\}$ | $W_{32}^{14}$ | $-\operatorname{Re}\left\{W_{32}^{2}\right\}-j \operatorname{Re}\left\{W_{32}^{6}\right\}$ |
| $W_{32}^{5}$ | $\operatorname{Re}\left\{W_{32}^{5}\right\}-j \operatorname{Re}\left\{W_{32}^{3}\right\}$ | $W_{32}^{15}$ | $-\operatorname{Re}\left\{W_{32}^{1}\right\}-j \operatorname{Re}\left\{W_{32}^{7}\right\}$ |
| $W_{32}^{6}$ | $\operatorname{Re}\left\{W_{32}^{6}\right\}-j \operatorname{Re}\left\{W_{32}^{2}\right\}$ | $W_{32}^{18}$ | $-\operatorname{Re}\left\{W_{32}^{2}\right\}+j \operatorname{Re}\left\{W_{32}^{6}\right\}$ |
| $W_{32}^{7}$ | $\operatorname{Re}\left\{W_{32}^{7}\right\}-j \operatorname{Re}\left\{W_{32}^{1}\right\}$ | $W_{32}^{21}$ | $-\operatorname{Re}\left\{W_{32}^{5}\right\}+j \operatorname{Re}\left\{W_{32}^{3}\right\}$ |
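The entries of Table 2 can be cross-checked against the definition $W_{32}^{i}=e^{-j 2 \pi i / 32}$. In the following sketch each entry is encoded as two signs and two indices into the seven constants $\operatorname{Re}\left\{W_{32}^{k}\right\}$. The dictionary layout and the names `TABLE2`, `re32` and `w32_from_table` are illustrative choices, not part of the paper.

```python
import cmath
import math


def re32(k):
    """Real constant Re{W_32^k} = cos(2*pi*k/32)."""
    return math.cos(2 * math.pi * k / 32)


# i -> (sign_re, k_re, sign_im, k_im), meaning
# W_32^i = sign_re * Re{W_32^k_re} + j * sign_im * Re{W_32^k_im}
TABLE2 = {
    1: (+1, 1, -1, 7), 2: (+1, 2, -1, 6), 3: (+1, 3, -1, 5), 4: (+1, 4, -1, 4),
    5: (+1, 5, -1, 3), 6: (+1, 6, -1, 2), 7: (+1, 7, -1, 1),
    9: (-1, 7, -1, 1), 10: (-1, 6, -1, 2), 12: (-1, 4, -1, 4),
    14: (-1, 2, -1, 6), 15: (-1, 1, -1, 7),
    18: (-1, 2, +1, 6), 21: (-1, 5, +1, 3),
}


def w32_from_table(i):
    """Rebuild W_32^i from the seven real constants, with the trivial cases handled separately."""
    if i == 0:
        return 1 + 0j
    if i == 8:
        return -1j
    s_re, k_re, s_im, k_im = TABLE2[i]
    return complex(s_re * re32(k_re), s_im * re32(k_im))


indices = [0, 8, *TABLE2]
max_err = max(
    abs(w32_from_table(i) - cmath.exp(-2j * cmath.pi * i / 32)) for i in indices
)
```

In hardware terms, the sign and index pair for each $i$ is what the multiplexers have to select; the seven constants themselves are shared.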

Table 3. CSD representation for $W_{32}^{i}$ with 12 bits

| Coefficient | $2^{0}$ | $2^{-1}$ | $2^{-2}$ | $2^{-3}$ | $2^{-4}$ | $2^{-5}$ | $2^{-6}$ | $2^{-7}$ | $2^{-8}$ | $2^{-9}$ | $2^{-10}$ | $2^{-11}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $\operatorname{Re}\left\{W_{32}^{1}\right\}$ | 1 | 0 | 0 | 0 | 0 | 0 | -1 | 0 | -1 | 0 | 0 | 0 |
| $\operatorname{Re}\left\{W_{32}^{2}\right\}$ | 1 | 0 | 0 | 0 | -1 | 0 | -1 | 0 | 0 | 1 | 0 | 0 |
| $\operatorname{Re}\left\{W_{32}^{3}\right\}$ | 1 | 0 | -1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | -1 | 0 |
| $\operatorname{Re}\left\{W_{32}^{4}\right\}$ | 1 | 0 | -1 | 0 | -1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| $\operatorname{Re}\left\{W_{32}^{5}\right\}$ | 0 | 1 | 0 | 0 | 1 | 0 | 0 | -1 | 0 | 0 | 0 | 1 |
| $\operatorname{Re}\left\{W_{32}^{6}\right\}$ | 0 | 1 | 0 | -1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | -1 |
| $\operatorname{Re}\left\{W_{32}^{7}\right\}$ | 0 | 0 | 1 | 0 | -1 | 0 | 0 | 1 | 0 | 0 | 0 | -1 |
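For readers unfamiliar with CSD, the sketch below converts a coefficient in $[0, 1]$ into signed digits with no two adjacent non-zero digits, using the standard non-adjacent-form recoding after rounding to a chosen number of fractional bits. It is a generic illustration and an assumption about how such digits can be produced. It is not the procedure of [8] and will not necessarily reproduce every row of Table 3, since the digits there also reflect the CSE sharing. The names `to_csd` and `from_csd` are hypothetical.

```python
import math


def to_csd(value, frac_bits):
    """CSD digits, MSB first, with weights 2**0 .. 2**-frac_bits, for 0 <= value <= 1."""
    if not 0 <= value <= 1:
        raise ValueError("value must lie in [0, 1]")
    n = round(value * (1 << frac_bits))  # quantize to an integer
    digits = []                          # collected LSB first
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)              # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    digits += [0] * (frac_bits + 1 - len(digits))  # pad up to the 2**0 position
    return digits[::-1]


def from_csd(digits):
    """Evaluate MSB-first digits whose weights are 2**0, 2**-1, 2**-2, ..."""
    return sum(d * 2.0 ** -i for i, d in enumerate(digits))


three_quarters = to_csd(0.75, 2)  # the '10-1' pattern used as CSE2 in Eq. (2)
coeff_digits = {k: to_csd(math.cos(2 * math.pi * k / 32), 11) for k in range(1, 8)}
```

Each non-zero digit corresponds to one shifted term in a shift-and-add multiplier, which is why representations with few non-zero digits are attractive.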



Fig. 2. Proposed CSD constant multipliers for $W_{32}^{i}$.

Fig. 2 (a) shows the proposed CSD complex multiplier structure for $W_{32}^{i}$. The detailed structure of the CSD multipliers is shown in Fig. 2 (b). To select the proper twiddle factor multiplication result in Table 2, two types of multiplexers, controlled by two signals (sel1 and sel2), are needed. Note that the CSD complex multipliers for $W_{8}^{i}$ and $W_{16}^{i}$ can be designed using $\operatorname{Re}\left\{W_{32}^{2}\right\}$, $\operatorname{Re}\left\{W_{32}^{4}\right\}$, and $\operatorname{Re}\left\{W_{32}^{6}\right\}$.

### 3.2 Cascade CSD complex multiplier for $W_{512}^{i}$

As shown in Fig. 1, the output signals at stage 4 are multiplied by the proper twiddle factors $W_{512}^{i}$ ($i=0 \sim 511$). The complexity of a CSD complex multiplier increases as the base number of the twiddle factor increases. Thus, the approach described above is not practical for $W_{512}^{i}$. To reduce the required number of constants for $W_{512}^{i}$, we utilize the $1/8$ symmetry property and the decomposition of the twiddle factor [2]. The proposed CSD complex multiplier design procedure is as follows:

1. Divide the index $i$ of $W_{512}^{i}$ into 8 regions, each indexed by $k$ ($k=0 \sim 64$), using the $1/8$ symmetry property.
2. Decompose $k$ into $8 i_{1}$ ($i_{1}=0 \sim 8$) and $i_{2}$ ($i_{2}=0 \sim 7$) as $W_{512}^{k}=W_{512}^{8 i_{1}+i_{2}}=W_{512}^{8 i_{1}} W_{512}^{i_{2}}$.
3. Make the CSD coefficient tables for $W_{512}^{8 i_{1}}$ and $W_{512}^{i_{2}}$, and find the optimized CSEs.
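The first two steps of the procedure can be sketched in floating point as follows. This is a behavioral illustration under stated assumptions, not the authors' circuit: the octant mapping table `OCTANT` and the function names are hypothetical, and the coarse and fine factors are computed exactly here, where the hardware would use the CSD constants of step 3.

```python
import cmath

N = 512


def twiddle(exponent):
    """W_512^exponent = exp(-j*2*pi*exponent/512)."""
    return cmath.exp(-2j * cmath.pi * exponent / N)


def first_octant(k):
    """Step 2: W_512^k for k = 0..64 as a coarse factor times a fine factor."""
    i1, i2 = divmod(k, 8)  # k = 8*i1 + i2, with i1 = 0..8 and i2 = 0..7
    return twiddle(8 * i1) * twiddle(i2)


# Step 1: for each of the 8 regions, (swap, sign_re, sign_im) applied to
# a = cos(phi), b = sin(phi), where phi is the first-octant angle 2*pi*k/512.
OCTANT = [
    (False, +1, -1), (True, +1, -1), (True, -1, -1), (False, -1, -1),
    (False, -1, +1), (True, -1, +1), (True, +1, +1), (False, +1, +1),
]


def w512(i):
    """Rebuild W_512^i for i = 0..511 from a first-octant value via 1/8 symmetry."""
    region, r = divmod(i, 64)
    k = r if region % 2 == 0 else 64 - r  # odd regions run backwards through the octant
    base = first_octant(k)
    a, b = base.real, -base.imag          # base = a - j*b with a, b >= 0
    swap, s_re, s_im = OCTANT[region]
    return complex(s_re * b, s_im * a) if swap else complex(s_re * a, s_im * b)


max_err = max(abs(w512(i) - twiddle(i)) for i in range(N))
```

Only the 9 coarse values $W_{512}^{8 i_{1}}$ and the 8 fine values $W_{512}^{i_{2}}$ appear as multiplier constants in this model, and $W_{512}^{0}=1$ is shared between the two sets; everything else is index arithmetic, swapping and sign changes.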

By applying this procedure, the required number of constant values can be reduced to 16. Table 4 shows the CSD representation of the 16 different values. The CSEs '101' (or '-10-1') and '10-1' (or '-101') are marked by the red solid line and the blue dashed line, respectively. Also, the CSEs '1001' (or '-100-1'), '100-1' (or '-1001') and '1000-1' are marked by the yellow dash-double-dotted line, the green dash-single-dotted line and the purple dotted line, respectively.

Table 4. CSD representation of 16 values

| $i_{1}$ | $\operatorname{Re}\left\{W_{512}^{8 i_{1}}\right\}$ |  |  |  |  |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | 1 | 0 |  |  | - | 0 | 0 |  |  |  | $0-1$ | -1 |  |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | -1 | 10 | $0-1$ | 0 | 0 | 0 | 0 |
| 3 | 1 | 0 | 0 | 0 | -1 | 10 | 1 | 10 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1 | 0 | 0 | 0 | -1 | 10 | -1 | 10 | 0 | 1 | 10 | 0 | 0 |
| 5 | 1 | 0 | 0 | -1 | 0 | 0 | 0 | 01 | 10 | 0 | $0-1$ | -1 | 0 |
| 6 | 1 | 0 | -1 | 0 | 1 | 0 | 1 | 10 | 0 | 0 | 0 | 0 | -1 |
| 7 | 1 | 0 | - 1 | 0 | 0 | 1 | 0 | $0-1$ | $-10$ | 0 | 0 | 0 | -1 |
| 8 | 1 | 0 | -1 | 0 | -1) | 10 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| $i_{2}$ | $\operatorname{Re}\left\{W_{512}^{i_{2}}\right\}$ |  |  |  |  |  |  |  |  |  |  |  |  |
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |  | 0 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 00 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 00 | 0 | $0-1$ | -1 | 0 |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $0-1$ | -1 | 0 |
| 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | 1 | 0 | 0 |
| 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $0-1$ | 0 | 0 | 1 | 0 |
| 7 |  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0-1] | 0 | 0 | 0 | 0 |
| $i_{1}$ | $-\operatorname{Im}\left\{W_{512}^{8 i_{1}}\right\}$ |  |  |  |  |  |  |  |  |  |  |  |  |
| 1 | 0 | 0 | 0 |  | 0 | -1. | 10 | 0 | 0 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 1 | 0 | -1 | 10 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0.1 | 1.0 | 0 | 0 | 1 | 0 |
| 4 | 0 | 1 | 0 |  | 0 | 0 | 0 | 0.1 |  | 0 | 0 | 0 | 0 |
| 5 | 0 |  | 0 | 0 | 0 | -1 | 0 | 0 | $0 \cdot$ | 0 | 0. -1 | -1 | 0 |
| 6 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | $0-1$ | -1 0 | 0 | 0 | 1 | 0 |
| 7 | 0 |  | 0 | 1 | 0 | 0 | - | 01 | 10 |  | 1 | 0 | -1 |
| 8 | - | 0 | -1. | 0 | -1 | 0 | 1 | 10 | 0 | 0 | 0 | 0 | 0 |
| $i_{2}$ | $-\operatorname{Im}\left\{W_{512}^{i_{2}}\right\}$ |  |  |  |  |  |  |  |  |  |  |  |  |
| 0 | 0 | 0 | 0 | 0 | 0 |  | 0 |  | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 |  | 1 | 10 | 0 -1 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | -1 | $-1_{1} 1_{0}$ | 0 | 0 |  | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 01 | 1 | -1 | -1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 |  | 0 | $-1$ | 10 | 0 | 1 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 0 | 0 | $0-1$ | -1 | 0 |
| 6 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 10 | $0-1$ | 0 | $0-1$ | -1 | 0 |
| 7 | 0 | 0 | 0 |  | 0 | -1 | 0 | -1 | -1 0 | 0 | 0 | 0 | 0 |



Fig. 3. Proposed CSD multiplier structure.

Fig. 3 shows the proposed cascade CSD complex multiplier for $W_{512}^{i}$, which is composed of coarse and fine multiplication stages. Pipelining can be used to shorten the critical path. The detailed architecture of the coarse and fine multipliers is shown in Fig. 4.

## 4. Results and Comparison

The proposed and previous approaches for the 512-point FFT were designed using Verilog HDL. These designs were synthesized for an Intel Cyclone 10 LP FPGA using the Quartus Prime design tool. The input and output word lengths are 12 bits and 25 bits, respectively.

The approach in [7] employs the CSD complex multiplier for $W_{16}^{i}$ and conventional complex multipliers (CCMs) for $W_{32}^{i}$ and $W_{512}^{i}$. The approach in [5] employs CSD complex multipliers for $W_{8}^{i}$, $W_{16}^{i}$ and $W_{32}^{i}$, and a CCM for $W_{512}^{i}$. To implement each CCM, 4 modified Booth multipliers and 2 ripple-carry adders are used. In the proposed approach, only CSD complex multipliers are used to implement twiddle factor multiplication, which eliminates the ROM for storing the twiddle factors.

The performance comparison of the proposed approach and the previous approaches is summarized in Table 5. Note that the proposed design achieves $28 \%$ gate count (logic element) reduction and $34 \%$ memory reduction compared to the radix-$2^{4}$ approach. In addition, the proposed approach consumes $18 \%$ less power than the radix-$2^{4}$ approach.
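The quoted percentages follow directly from the raw entries of Table 5. The short sketch below recomputes them; the dictionary keys and the helper name `reduction_percent` are illustrative, while the numbers are copied from the table.

```python
# Raw values copied from Table 5; keys are illustrative labels.
TABLE5 = {
    "Radix-2^4 [7]": {"logic_elements": 7823, "registers": 635, "memory_bits": 32568, "power_mw": 149.8},
    "Radix-2^5 [5]": {"logic_elements": 6730, "registers": 575, "memory_bits": 26356, "power_mw": 136.7},
    "Proposed": {"logic_elements": 5669, "registers": 523, "memory_bits": 21608, "power_mw": 122.2},
}


def reduction_percent(metric, baseline="Radix-2^4 [7]", design="Proposed"):
    """Percentage by which `design` is smaller than `baseline` for the given metric."""
    return 100.0 * (1.0 - TABLE5[design][metric] / TABLE5[baseline][metric])


reductions = {m: reduction_percent(m) for m in TABLE5["Proposed"]}
```

Rounded to whole percentages, the logic element, memory bit and power entries of `reductions` give the 28, 34 and 18 percent figures stated in the text.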

Fig. 4. Detailed structure of CSD complex multiplication for $W_{512}^{i}$.

Table 5. Hardware comparison results of 512-point FFT designs

| Approaches | Multiplier types | Logic elements | Registers | Memory bits | Power (mW) |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Radix-$2^{4}$ [7] | CSD and CCM | 7,823 (1) | 635 (1) | 32,568 (1) | 149.8 (1) |
| Radix-$2^{5}$ [5] | CSD and CCM | 6,730 (0.86) | 575 (0.91) | 26,356 (0.81) | 136.7 (0.92) |
| Proposed | CSD | 5,669 (0.72) | 523 (0.82) | 21,608 (0.66) | 122.2 (0.82) |

## 5. Conclusion

We proposed a hardware-efficient and low-power 512-point pipelined FFT based on the radix-$2^{4}$-$2^{3}$ algorithm. To reduce the hardware cost and power consumption, we proposed CSD complex multipliers that replace the conventional complex multipliers and remove the ROM for storing twiddle factors. In the FPGA implementation, the proposed FFT design achieves about $28 \%$ reduction in gate count and $18 \%$ reduction in power consumption compared to the previous approaches.

## REFERENCES

[1] C. Yu, M. H. Yen and S. J. Chen, "A low-power 64-point pipeline FFT/IFFT processor for OFDM applications", IEEE Consum. Electron., vol. 5, pp. 40-45, 2011.
[2] J. Yu and K. J. Cho, "An area-efficient 256-point FFT design for WiMAX systems", KIIECT, vol. 11, no. 3, pp. 270-276, 2018.
[3] S. He and M. Torkelson, "Designing pipeline FFT processor for OFDM (de)modulation", Proc. URSI Int. Symp. Signals, Syst., Electron., pp. 257-262, 1998.
[4] J. Y. Oh and M. S. Lim, "New radix-2 to the 4th power pipeline FFT processor", IEICE Trans. Electron., vol. E88-C, no. 8, pp. 1740-1746, 2005.
[5] T. S. Cho and H. H. Lee, "A high-speed low-complexity modified radix-$2^{5}$ FFT processor for high rate WPAN applications", IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, pp. 187-191, 2013.
[6] M. Garrido, et al., "Pipelined radix-$2^{k}$ feedforward FFT architectures", IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, pp. 23-32, 2013.
[7] C. Yu and M. H. Yen, "Area-efficient 128- to 2048/1536-point pipeline FFT processor for LTE and mobile WiMAX systems", IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 9, pp. 1793-2015, 2015.
[8] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. Wiley-Interscience, 1999.

## Author Biography

Jian Yu
[Member]


- Jun. 2001: Hebei Normal Univ., Electronic Engr., BA
- Mar. 2008: Tianjin Polytechnic Univ., Electronic Engr., MS
- Mar. 2016 ~ current: Wonkwang Univ., Dept. of Electronic Engr., PhD course

<Research Interests> VLSI Design

Kyung-Ju Cho
[Member]

- Aug. 2006: Chonbuk National Univ., Info. \& Comm. Engr., PhD
- Mar. 2012 ~ current: Wonkwang Univ., Dept. of Electronic Engr., Professor

<Research Interests> VLSI Design, SOC


[^0]:    This paper was supported by Wonkwang University in 2018.
    *Department of Electronic Engineering Wonkwang University
    **Corresponding Author: Department of Electronic Engineering Wonkwang University (kjcho@wku.ac.kr)
    Received October 11, 2018 Revised October 12, 2018
    Accepted October 19, 2018

