Swati V. Kakde¹, Mayuri Chawla²
¹ M.Tech Student, Dept. of VLSI, Jhulelal Institute of Technology, Nagpur, India
² Professor, Dept. of VLSI, Jhulelal Institute of Technology, Nagpur, India
Abstract: In the field of computer architecture, researchers compare architectures by simulating them on a common platform with common benchmark programs. This paper has three objectives: to design a high-level cache architecture aimed at high-performance, low-power computing; to compare the new design with existing designs through software simulation; and to conclude whether the design outperforms existing cache designs in high-performance, low-power computing, which in turn determines the new cache configuration.
In this paper, we present the steps of our methodology for adjusting caches to a specific MPSoC application. The methodology generates a configuration file that identifies the best configuration, in terms of size, speed, and associativity, for the given application.
Keywords: cache memory, proposed methodology, VHDL, survey on paper.
I. INTRODUCTION
Our society depends more heavily on computers and microprocessors with each passing year. Building a microprocessor requires organizing a large number of transistors onto a complex integrated circuit (IC). The high transistor density of modern microprocessors forces computer architects to consider both power consumption and performance. Shutting down parts of the microprocessor is among the easiest and most effective mechanisms for conserving power.
Many general-purpose processors utilize cache memory. Because the caching structures on microprocessors use a large percentage of the transistors, shutting down parts of the cache would save a considerable amount of power. However, the size of the cache greatly affects performance, or the time needed to execute programs. The optimal high-performance, low power cache will minimize energy consumption, or the product of power and execution time.
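The trade-off between power and execution time can be made concrete with a small calculation. The following Python sketch evaluates energy as the product of average power and execution time for three hypothetical cache configurations. The configuration names and all power and time figures are illustrative assumptions, not measurements from this work.

```python
# Illustrative sketch: every number below is an assumed value, not a measurement.

def energy_joules(power_watts: float, exec_time_s: float) -> float:
    """Energy is the product of average power and execution time."""
    return power_watts * exec_time_s

# (average power in W, execution time in s) for hypothetical configurations
configs = {
    "full_cache": (2.0, 1.0),      # whole cache powered: fastest, most power
    "half_cache": (1.5, 1.25),     # half of the cache shut down
    "quarter_cache": (1.25, 2.0),  # too small: extra misses stretch run time
}

energies = {name: energy_joules(p, t) for name, (p, t) in configs.items()}
best = min(energies, key=energies.get)

for name, e in energies.items():
    print(f"{name}: {e:.3f} J")
print("lowest energy:", best)
```

Under these assumed figures, shutting down half of the cache lowers energy, while shutting down three quarters raises it again because the longer execution time outweighs the power saving.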
For the past 37 years, Moore’s law has accurately predicted that the number of transistors on a single IC will double every 18 months. Increased transistor density has increased operating speeds at the same rate, but has also increased power consumption. This increased power consumption generates undesired heat, which can degrade performance, destroy the IC, or injure the user. Historically, computer architects have designed processors either for high performance or for low power, depending on the application. For example, a cell phone needs low power consumption so that it will not burn the user’s hand.
As transistor density increases, the demand for processors that deliver high performance and conserve power will increase. This thesis project describes a caching technique that aims to conserve power while maintaining high performance.
II. CACHE MEMORY
Cache memory, also called CPU memory, is random access memory (RAM) that a computer microprocessor can access more quickly than it can access regular RAM. This memory is typically integrated directly with the CPU chip or placed on a separate chip that has a separate bus interconnect with the CPU. The basic purpose of cache memory is to store program instructions that are frequently re-referenced by software during operation. Fast access to these instructions increases the overall speed of the software program.
As the microprocessor processes data, it looks first in the cache memory; if it finds the instructions there (from a previous reading of data), it does not have to do a more time-consuming reading of data from larger memory or other data storage devices.
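The lookup described above can be sketched in a few lines of Python. The model below is a direct-mapped cache that only tracks tags and counts hits and misses. The class name `DirectMappedCache`, the cache geometry, and the address trace are hypothetical choices made for illustration; they are not the design proposed in this paper.

```python
class DirectMappedCache:
    """Minimal direct-mapped cache model that tracks tags only (no data)."""

    def __init__(self, num_lines: int, block_size: int):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        block = address // self.block_size  # which memory block is addressed
        index = block % self.num_lines      # the single line it may occupy
        tag = block // self.num_lines       # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag              # fetch from the larger memory
        self.misses += 1
        return False


cache = DirectMappedCache(num_lines=4, block_size=16)
trace = [0, 4, 8, 64, 0, 4]  # assumed byte addresses
results = [cache.access(a) for a in trace]
print(results, cache.hits, cache.misses)
```

In this trace, addresses 0, 4, and 8 share one block, so only the first of them misses. Address 64 maps to the same line and evicts that block, so the next access to address 0 misses again.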
Figure 1 shows a simplified diagram of a system with cache. In this system, every time the CPU performs a read or write, the cache may intercept the bus transaction, allowing the cache to decrease the response time of the system. Before discussing this cache model, let us define some of the common terms used when talking about cache.
III. CACHE MEMORY LEVELS
Cache memory is fast and expensive. Traditionally, it is categorized as “levels” that describe its closeness and accessibility to the microprocessor:
- Level 1 (L1) cache is extremely fast but relatively small, and is usually embedded in the processor chip (CPU).
- Level 2 (L2) cache is often more capacious than L1; it may be located on the CPU or on a separate chip or coprocessor with a high-speed alternative system bus interconnecting the cache to the CPU, so as not to be slowed by traffic on the main system bus.
- Level 3 (L3) cache is typically specialized memory that works to improve the performance of L1 and L2. It can be significantly slower than L1 or L2, but is usually double the speed of RAM. In the case of multicore processors, each core may have its own dedicated L1 and L2 cache, but share a common L3 cache. When an instruction is referenced in the L3 cache, it is typically elevated to a higher tier cache.
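The effect of the hierarchy on access time can be estimated with the standard average memory access time (AMAT) recurrence, in which each level contributes its hit time plus its local miss rate multiplied by the cost of going one level further out. The Python sketch below applies that recurrence. The latencies and miss rates are assumed round numbers chosen for illustration, with main memory set to twice the L3 latency to mirror the description above.

```python
def amat(levels, memory_latency):
    """Average memory access time for a multi-level cache hierarchy.

    levels: list of (hit_time, local_miss_rate) pairs, ordered from L1 outward.
    memory_latency: access time of main memory, in the same unit as hit_time.
    """
    penalty = memory_latency
    # Work inward from the last cache level: each level costs its own hit
    # time plus, on a miss, the average cost of the next level out.
    for hit_time, miss_rate in reversed(levels):
        penalty = hit_time + miss_rate * penalty
    return penalty


# Assumed values in cycles; not taken from any measured system.
hierarchy = [
    (1, 0.125),  # L1: 1-cycle hit, 12.5% of accesses miss
    (8, 0.25),   # L2: 8-cycle hit, 25% of L1 misses also miss here
    (32, 0.5),   # L3: 32-cycle hit, 50% of L2 misses also miss here
]
average_cycles = amat(hierarchy, memory_latency=64)
print(average_cycles)
```

With these assumed values the average access costs 4 cycles, far closer to the L1 hit time than to the 64-cycle memory latency.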
IV. RELATED WORK
There have been several proposals for reducing the power consumption of on-chip caches. The MDM (Multiple-Divided Module) cache attempts to reduce power consumption by partitioning the cache into several small sub-caches, but it requires a great amount of hardware modification. Block buffering, filter cache, and L-cache achieve low power consumption by adding a very small L0 cache between the processor and the L1 cache. The advantage of these L0-cache approaches decreases when memory reference locality is low and cache replacement happens frequently between the L0 and L1 caches.
Growing demands on computer architecture require designers to tune processor parameters to avoid wasting energy. Tuning on a per-application basis allows greater savings in energy consumption without a noticeable degradation in performance. On-chip caches often consume a significant fraction of the total energy budget and are therefore prime candidates for adaptation.
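The dependence of the L0-cache benefit on locality can be illustrated with a simple first-order model: every access probes the L0 cache, and only L0 misses also pay for an L1 access. The Python sketch below uses that model. The per-access energy values are assumed relative units, and the model ignores the cost of refilling the L0 cache, so it is a rough illustration and not the analysis used in the cited proposals.

```python
def energy_per_access(e_l0: float, e_l1: float, l0_hit_rate: float) -> float:
    """First-order energy model: L0 is always probed, L1 only on an L0 miss."""
    return e_l0 + (1.0 - l0_hit_rate) * e_l1


E_L0 = 0.25  # assumed energy of one L0 access (relative units)
E_L1 = 1.0   # assumed energy of one L1 access (relative units)

high_locality = energy_per_access(E_L0, E_L1, l0_hit_rate=0.875)
low_locality = energy_per_access(E_L0, E_L1, l0_hit_rate=0.5)
no_locality = energy_per_access(E_L0, E_L1, l0_hit_rate=0.0)
print(high_locality, low_locality, no_locality)
```

Without an L0 cache, every access costs `E_L1`. In this model the L0 cache saves energy only while its hit rate exceeds `E_L0 / E_L1`; below that point the extra L0 probe costs more than it saves.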
1. Seungcheol Baek, Hyung Gyu Lee, Junghee Lee, and Jongman Kim
“Size-Aware Cache Management For Compressed Cache Architectures”
This article introduces the concept of size-aware cache management as a way to maximize the performance of compressed caches. One approach to increasing the effective cache capacity without increasing the physical capacity is to compress the last-level cache (LLC). The authors propose to further improve the performance and energy consumption of compressed LLCs.
2. A. Bengueddach, B. Senouci, S. Nier, B. Beldjilali
“Energy Consumption In Reconfiguration MPSoC Architecture: Two-Level Cache Optimization Oriented Approach”
In this paper, the authors investigate the estimation of energy consumption in embedded MPSoC systems. They propose an efficient solution to reduce the energy used by cache memories.
3. Kenji Kanazawa, Tsutomu Maruyama
“FPGA Acceleration Of SAT/Max-SAT Solving Using Variable-way Cache”
In this paper, the authors show that performance can be improved by using on-chip block RAMs as a variable-way associative cache memory to cache the clause lists. With the variable-way cache memory, the access delay to the DRAMs can be efficiently hidden. The cache aims to hold the whole clause when it is small enough, and only the head portion when it is large, in order to hide the DRAM access delay. With this cache, up to 60% of the DRAM access delay can be hidden and performance can be improved by up to 26%.
4. D. Benitez, J. C. Moure, D. Rexachs
“A Reconfigurable Cache Memory With Heterogeneous Banks”
This paper presents the Amorphous Cache (AC), a reconfigurable L2 on-chip cache aimed at improving performance as well as reducing energy consumption. The results show that the combination of AC and the novel reconfiguration algorithm provides the best power consumption and performance.
5. Karthik T. Sundararajan, Timothy M. Jones, Nigel Topham
“Smart Cache: A Self-Adaptive Cache Architecture For Energy Efficiency”
In this paper, the authors present a Set and way Management cache Architecture for Run-Time reconfiguration (Smart cache), a cache architecture that allows reconfiguration of both its size and its associativity. Results show that the energy-delay of the Smart cache is on average 14% better than that of state-of-the-art cache reconfiguration architectures.
“An Efficient Memory Block Selection Strategy To Improve The Performance Of Cache Memory Subsystem”
This paper observes that although cache improves performance by reducing the speed gap between the CPU and main memory, it also increases timing unpredictability because of its dynamic nature. The authors propose a simple but efficient memory block selection strategy to enhance cache locking, cache replacement, and overall cache memory subsystem performance.
V. PROPOSED RESEARCH METHODOLOGY
In the field of computer architecture, researchers compare architectures by simulating them on a common platform with common benchmark programs. This thesis project accomplished the following objectives:
- Designed a high-level cache architecture with the goal of improving high-performance, low-power computing.
- Compared the new design to existing designs through software simulation.
- Concluded whether or not the design outperformed existing cache designs in regard to high-performance, low-power computing, and determined the new cache configuration.
In this section, we present the steps of our methodology for adjusting caches to a specific MPSoC application. The methodology generates a configuration file that identifies the best configuration, in terms of size, speed, and associativity, for the given application. The modules are designed in VHDL.
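The search for a best configuration can be sketched as an exhaustive sweep over candidate cache parameters. The Python sketch below simulates a set-associative cache with LRU replacement on an address trace, scores each (size, associativity) pair with an energy model, and keeps the lowest-energy pair. The energy model, its constants, the candidate values, and the trace are all hypothetical assumptions made for illustration; speed is left out for brevity, and the sketch is not the VHDL implementation described in this paper.

```python
from collections import OrderedDict
from itertools import product


def simulate(trace, size_bytes, ways, block_size=16):
    """Count the misses of a set-associative LRU cache on an address trace."""
    num_sets = size_bytes // (block_size * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // block_size
        lines = sets[block % num_sets]
        tag = block // num_sets
        if tag in lines:
            lines.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used tag
            lines[tag] = True
    return misses


def energy(num_accesses, misses, size_bytes, ways):
    """Hypothetical energy model in arbitrary units (assumed constants)."""
    access_cost = size_bytes / 256 + ways  # larger, more associative: costlier
    miss_cost = 100                        # assumed cost of going off-chip
    return num_accesses * access_cost + misses * miss_cost


def best_configuration(trace, sizes, ways_options):
    """Score every (size, ways) pair and return the cheapest with all scores."""
    results = {}
    for size, ways in product(sizes, ways_options):
        misses = simulate(trace, size, ways)
        results[(size, ways)] = energy(len(trace), misses, size, ways)
    return min(results, key=results.get), results


# Assumed trace: two addresses that collide in every direct-mapped candidate.
trace = [0, 1024] * 100
best, results = best_configuration(
    trace, sizes=[64, 128, 256, 512], ways_options=[1, 2, 4]
)
print("best (size_bytes, ways):", best, "energy:", results[best])
```

For this trace, every direct-mapped candidate misses on each access because the two addresses map to the same set, so the smallest two-way cache gives the lowest energy under the assumed model. A different application trace would generally favor a different configuration, which is the reason for tuning per application.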
VI. PROPOSED PLAN OF WORK
- Basic study of cache memory.
- In this phase, we will survey different papers.
- In this module, we will design a high-level architecture with the goal of improving performance.
- Analysis and testing will be performed in this module, including a comparison of the proposed approach with existing ones.
VII. CONCLUSION
In this paper, we have surveyed techniques for high-speed, low-energy memory systems. The best way to improve performance and energy efficiency is to achieve fast, low-energy access at each level of the memory hierarchy and to concentrate memory accesses on the level closest to the processor. We have compared the new design to existing designs with the help of software simulation in VHDL.