VHDL Implementation of an Optimised 16-Point FFT processor using CORDIC in pipeline architecture for OFDM system.
Volume 3


Ms. Rakshanda Walke

Department of Electronics and Telecommunication

Jhulelal Institute of Technology Nagpur, India


Contact no: – 7304424919

Ms. Kalyani Kalamkar

Department of Electronics and Telecommunication

Jhulelal Institute of Technology Nagpur, India



Ms. Snehal Raghatate

Department of Electronics and Telecommunication

Jhulelal Institute of Technology Nagpur, India




The Fast Fourier Transform (FFT) and inverse FFT (IFFT) processors are key components in many communication systems. This paper presents an optimized implementation of a 16-point FFT processor that uses the radix-4 algorithm with CORDIC in a pipelined architecture. The butterfly Processing Element (PE) used in the 16-point FFT processor reduces the multiplicative complexity by using a real constant multiplication in one method, and eliminates it by using add and shift operations in the other proposed method. The CORDIC unit is described in VHDL; its main task is to convert a given angle into its trigonometric values (sine and cosine), an operation widely used in digital computers. The pipelined structure reduces the delay of the system. The input to the system is of integer type only.


The Fast Fourier Transform is an efficient algorithm for computing the Discrete Fourier Transform (DFT) and requires fewer computations than direct evaluation of the DFT. It has several applications in signal processing. Because the FFT is computationally demanding, various FFT algorithms have been proposed over the last decades to meet real-time processing requirements and to reduce hardware complexity. To reduce hardware complexity and improve processing performance, this project proposes a hardware design of an FFT processor using the CORDIC algorithm, to be synthesized on a Field Programmable Gate Array (FPGA). The purpose of the project is to obtain an area-efficient description of an FFT processor; to achieve this, a radix-4 FFT processor will be developed using the CORDIC algorithm. The Coordinate Rotation Digital Computer (CORDIC) algorithm is a well-known iterative technique for performing various basic arithmetic operations, including the computation of trigonometric functions, vector magnitude estimation and polar-to-rectangular transformation. Choosing CORDIC to realize the FFT butterfly operation eliminates the need to store twiddle factors and angles, which saves a lot of hardware compared with other techniques. For FFT processors, the butterfly operation is the most computationally demanding stage. Traditionally, a butterfly unit is composed of complex adders and multipliers, and the multiplier is usually the speed bottleneck in the pipeline of the FFT processor. CORDIC is an alternative method that realizes the butterfly operation without any dedicated multiplier hardware.

Fig: DIT FFT
Fig: DIF FFT
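
To make the relationship between the DFT and a radix-4 FFT concrete, the following is a minimal software sketch, not the hardware design of this paper. It assumes a recursive decimation-in-time formulation and floating-point arithmetic; the function names `dft` and `fft_radix4` are illustrative only. The 4-point butterfly inside it uses only additions and multiplications by ±1 and ±j, while the twiddle multiplication is the step a CORDIC rotator would take over in hardware.

```python
import cmath

def dft(x):
    """Direct evaluation of the DFT, O(N^2) complex multiplications."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    # Transform the four decimated subsequences x[r], x[r+4], x[r+8], ...
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Twiddle multiplication W_N^(r*k); this is the rotation step
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # 4-point butterfly: only add/subtract and multiplication by +-1, +-j
        out[k]         = t[0] + t[1] + t[2] + t[3]
        out[k + q]     = t[0] - 1j * t[1] - t[2] + 1j * t[3]
        out[k + 2 * q] = t[0] - t[1] + t[2] - t[3]
        out[k + 3 * q] = t[0] + 1j * t[1] - t[2] - 1j * t[3]
    return out
```

For a 16-point input the recursion has two levels, which corresponds to the two radix-4 stages of a 16-point processor.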


Ms. Rakshanda Walke and Kalyani Kalamkar have discussed a “CORDIC Based Radix-4 FFT Processor”.

Technique: This project studies the FFT and IFFT using a pipelined radix-4 algorithm built on a butterfly structure. Because radix-4 is used instead of radix-2, the computation speed is expected to roughly double. The butterfly structures, BFI and BFII, are executed using the DFT algorithm. Twiddle factor multipliers (TFM) are implemented using the Co-ordinate Rotation Digital Computer (CORDIC) algorithm. The remaining units, such as the delay units and control units, use an FSM-based design.

CORDIC: It converts the given angle into trigonometric form, that is, its sine and cosine values. Pipeline: Because a pipelined structure is used, the overall clock delay is reduced compared with a non-pipelined system.

Observation: In this project, the radix-4 FFT processor architecture is studied. The CORDIC algorithm is used to generate the twiddle factors. The parts of the FFT architecture, such as BFI, BFII, the control unit and the delay-feedback model, are discussed. In the next phase of this paper, the actual implementation of the FFT processor on an FPGA will be done using Verilog.


Normally, the architecture of a radix-4 FFT processor is built around twiddle-factor-based butterfly computation. At each stage of butterfly computation, the input sequence is multiplied by the twiddle factor, and each stage requires a RAM to store the twiddle factor angles.

Choosing the CORDIC algorithm to realize the FFT butterfly operation eliminates the need to store twiddle factors and angles, which saves a lot of hardware compared with other techniques.

Compared with radix-2, the radix-4 FFT requires fewer computational resources; a 16-point transform, for example, needs two radix-4 stages instead of four radix-2 stages.

The complete CORDIC-based radix-4 FFT will be implemented on a field programmable gate array, which offers high efficiency, low cost, convenient implementation and a short development cycle, and its performance is found to be satisfactory. The work proposed in this project combines these characteristics with the advantages of both the radix-4 FFT and CORDIC.

Need of CORDIC

The FFT is a computationally intensive block that uses many multipliers and adders.

The use of multipliers increases area and delay.

To reduce hardware complexity and increase computational efficiency, the CORDIC algorithm is used in place of the multipliers.

What is CORDIC?

 CORDIC (COordinate Rotation DIgital Computer)

  • Introduced in 1959 by Jack E. Volder
  • Rotates the vector (1, 0) by φ to obtain (cos φ, sin φ)
  • Can evaluate many functions
  • Rotation is reduced to shift-and-add operations
  • Iterative convergence method: N iterations for N-bit accuracy
  • Delay and hardware costs are comparatively reduced
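
The points above can be illustrated with a small rotation-mode CORDIC sketch. It is a software model under stated assumptions, not the VHDL entities of this paper: the iteration count, the 16-bit fixed-point fraction and the name `cordic_sin_cos` are all chosen for illustration. The loop uses only shifts, additions and a small arctangent table. Because each micro-rotation stretches the vector, the start vector is (1/K, 0) rather than (1, 0), where K is the accumulated CORDIC gain.

```python
import math

ITERATIONS = 16   # assumed iteration count (roughly one per bit of accuracy)
FRAC_BITS = 16    # assumed fixed-point fraction width

# arctan(2^-i) table in fixed-point radians, computed once (a ROM in hardware)
ATAN_TABLE = [round(math.atan(2.0 ** -i) * (1 << FRAC_BITS))
              for i in range(ITERATIONS)]

# CORDIC gain K = product of sqrt(1 + 2^-2i); starting at 1/K cancels it
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))
X_START = round((1 << FRAC_BITS) / GAIN)

def cordic_sin_cos(angle_rad):
    """Return (cos, sin) for |angle_rad| <= pi/2 using shift-and-add steps."""
    x, y = X_START, 0
    z = round(angle_rad * (1 << FRAC_BITS))   # residual angle to rotate away
    for i in range(ITERATIONS):
        if z >= 0:   # rotate counter-clockwise by arctan(2^-i)
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:        # rotate clockwise by arctan(2^-i)
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    scale = float(1 << FRAC_BITS)             # conversion for display only
    return x / scale, y / scale
```

Calling `cordic_sin_cos(math.radians(30))` returns values close to (0.866, 0.5); the remaining error depends on the assumed iteration count and word width.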


Top Entity name: sc_corproc

Entity name: p2r_cordic

Entity name: p2r_cordicpipe



  • Here it can be seen in the simulation that the input is given in hexadecimal format and so is the output.
  • 1555 hex = 5461 dec
  • ain => 16’0000h = 0 degree
  • ain => 16’1555h = 30 degree
  • ain => 16’2000h = 45 degree
  • ain => 16’2AAAh = 60 degree
  • ain => 16’4000h = 90 degree
  • ain => 16’271Ah = 55 degree
  • Using the formula, the output values can first be converted to decimal and then to their respective values. On the rising edge of the clock, if the enable signal is high, the sine and cosine values appear at the 14th clock pulse.
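
The angle values listed above are consistent with a 16-bit angle word in which the full range of 65536 counts represents 360 degrees (for example, 4000 hex = 16384 = 90 degrees). The sketch below shows that conversion. The scaling is inferred from the listed values rather than stated in the text, and the function names are hypothetical. The scaling of the sine and cosine outputs is not given here, so it is not modelled.

```python
ANGLE_BITS = 16                  # width of ain in the listed examples
FULL_CIRCLE = 1 << ANGLE_BITS    # assumption: 65536 counts = 360 degrees

def ain_to_degrees(ain):
    """Convert a 16-bit angle word to degrees."""
    return ain * 360.0 / FULL_CIRCLE

def degrees_to_ain(deg):
    """Convert degrees to a 16-bit angle word (fraction truncated)."""
    return int(deg * FULL_CIRCLE / 360.0) % FULL_CIRCLE

# Truncation reproduces the listed words for 30, 45, 60 and 90 degrees.
# The word listed for 55 degrees, 271A hex, decodes to about 54.99 degrees.
for deg in (0, 30, 45, 60, 90):
    print(deg, format(degrees_to_ain(deg), "04X"))
```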



The design flow of the FFT processor using CORDIC is shown in the figure above. The selector block is a memory path buffer that determines the memory path for the input samples.

When the Active signal is asserted and input data are present, the address generator block assigns a memory position to each input sample.

When the dual-port RAM receives the write address signal from the address generator block, it stores the memory path along with the corresponding input samples.

The 4-point FFT block contains the butterfly unit.

The entire model consists of the address generation unit, the control unit, the dual-port RAM unit, the 4-point butterfly unit and the CORDIC twiddle factor generation unit.

The model is parameterized: the number of sampling points and the accuracy can be set to meet actual needs.

A dual-port RAM is employed so that these operations can be performed concurrently. The control unit handles the timing of data storage, reading and writing, so that the data and the corresponding rotation factor coefficients flow into the butterfly and CORDIC computing units in sequence during the FFT operation.

The data and address of the twiddle factor can be generated easily by a counter. The address generation logic is very simple and does not limit the throughput of the system.
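
As an illustration of how a counter can supply the twiddle factor, the sketch below maps a 4-bit count to the rotation angle of W_16^(r·k) for a 16-point radix-4 transform. The split of the counter into a branch index r and a sample index k, the 16-bit angle word (65536 counts = 360 degrees) and the function names are assumptions made for this example, not details taken from the design. The resulting word is the kind of value that could be fed to a CORDIC rotator in place of a stored twiddle factor.

```python
import cmath

N = 16
ANGLE_BITS = 16                   # assumed angle word width
STEP = (1 << ANGLE_BITS) // N     # angle increment of W_16^1, here 4096 counts
MASK = (1 << ANGLE_BITS) - 1

def twiddle_angle_word(count):
    """Map a 4-bit counter value to the angle word of its twiddle factor.

    Assumed layout: upper two bits = branch r, lower two bits = index k.
    The twiddle W_16^(r*k) is a rotation by -360*r*k/16 degrees, so the
    word is the negated multiple of STEP, wrapped to 16 bits.
    """
    r, k = count >> 2, count & 3
    return (-(r * k) * STEP) & MASK

def word_to_twiddle(word):
    """Reference conversion of an angle word back to a complex twiddle."""
    return cmath.exp(2j * cmath.pi * word / (1 << ANGLE_BITS))
```

Only a multiplication of two 2-bit fields and a shift are involved, which is consistent with the remark that the address generation logic is simple.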


For simulation, the 16-point input is {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, with real part = 1 and imaginary part = 0. The data width is chosen as 8 bits.

Reset is initially asserted (logic ‘1’) and then de-asserted (logic ‘0’). After that, start is asserted for one clock pulse, and from that clock pulse the inputs are accepted by the cfft processor.

The cfft takes some clock pulses to process the input and produce the transformed output. The output is considered valid when outdataen is asserted; it stays asserted for 16 clock pulses because the input has 16 points. The output is {4,0,0,0,4,0,0,0,4,0,0,0,4,0,0,0}. outposition is also verified.
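
As a software cross-check for this test vector, the sketch below evaluates the DFT of the all-ones input directly. It is a floating-point reference only and does not model the cfft core, its 8-bit data width or its output ordering; the helper names are hypothetical. It computes both a single 16-point transform and four 4-point transforms taken over consecutive groups of four samples, so that either can be compared with the simulated sequence. The grouped 4-point case produces the repeating (4, 0, 0, 0) pattern, whereas a single 16-point transform places the value 16 in bin 0.

```python
import cmath

def dft(x):
    """Direct DFT used as a floating-point reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def rounded_real(values):
    """Round real parts to integers; imaginary parts are ~0 for this input."""
    return [round(v.real) for v in values]

ones16 = [1 + 0j] * 16

# Single 16-point transform of the all-ones input
full = rounded_real(dft(ones16))

# Four 4-point transforms over consecutive groups of four samples
blocks = []
for start in range(0, 16, 4):
    blocks += rounded_real(dft(ones16[start:start + 4]))

print(full)
print(blocks)
```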


  • Xilinx ISE 14.5
  • Altera QUARTUS II


  • It is faster than approaches that rely on dedicated multipliers.
  • The proposed architecture gives an advantage in terms of area and hardware.
  • It provides a complex-multiplication reduction approach and a pipelining method for large-point FFTs.
  • Due to the CORDIC algorithm, the need to store twiddle factors is eliminated.
  • Low cost and less complexity.

The FFT requires less time and consumes less power.


  • Calculation of trigonometric, hyperbolic and logarithmic functions.
  • Real and complex multiplication, division and square root calculation.
  • Solution of linear equations.
  • Eigenvalue estimation.
  • Used in signal and Image Processing.
  • Communication system.
  • Robotics.
  • 3D Graphics.
  • Bio-Medical Application.


The 16-point FFT processor will be designed using the above design goals and strategies and will be coded in VHDL. The synthesis and simulation results will be obtained using Xilinx ISE 14.5. Spartan-3A DSP, XC3SD1800A and FG676 are selected as the Family, Device and Package respectively.


  • Anand, Survey paper on FPGA implementation of CORDIC algorithms.
  • Anand Kumar (Basics of Algorithm).
  • S. S. Limaye (CORDIC Algorithm).
  • J. W. Cooley, J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series,” 1965.
  • Ajay S. Padekar, S. S. Belsare, “Study of a CORDIC Based Radix-4 FFT Processor,” International Journal of Electrical, Electronics and Data Communication, ISSN: 2320-2084, Volume 2, Issue 3, March 2014.
  • K. Sreekanth Yadav, V. Charishma, Neelima Koppala, “Design and simulation of 64 point FFT using Radix 4 algorithm for FPGA Implementation,” International Journal of Engineering Trends and Technology, Volume 4, Issue 2, 2013.
