Assuming a LUT of say 512KB of 64-bit double types. Generally speaking, how does the CPU cache the structure in L1 or L2?
For example: I access the middle element, does it attempt to cache the whole LUT or just some of it - say the middle element and then n subsequent elements?
What kind of algorithms does the CPU use to determine what it keeps in L2 cache? Is there a certain look-ahead strategy it follows?
Note: I'm assuming x86, but I'd be interested in knowing how other architectures work: POWER, SPARC, etc.
It depends on the data structure you use for the LUT (look-up table?).
Caches are at their best with data that is laid out contiguously in memory (e.g. as arrays or std::vectors) rather than scattered around.
In simple terms, when you access a memory location, a block of RAM (a "cache line" worth -- 64 bytes on x86) is loaded into cache, possibly evicting some previously-cached data.
Generally, there are several levels of cache, forming a hierarchy. With each level, access times increase but so does capacity.
Yes, there is lookahead, which is limited by rather simplistic algorithms and the inability to cross page boundaries (a memory page is typically 4KB in size on x86.)
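Where the hardware prefetcher falls short, you can also hint at upcoming accesses yourself. This is a sketch using the GCC/Clang `__builtin_prefetch` intrinsic; `sum_with_prefetch` and the 16-element lookahead distance are made-up illustrations, not tuned values.

```cpp
#include <cstddef>
#include <vector>

// Sum a LUT while prefetching a couple of cache lines ahead.
// A sketch, not a tuned kernel: hardware prefetch often makes this redundant
// for a simple sequential scan like this one.
double sum_with_prefetch(const std::vector<double>& lut) {
    const std::size_t ahead = 16;  // elements ahead; ~2 lines at 64 B/line
    double total = 0.0;
    for (std::size_t i = 0; i < lut.size(); ++i) {
        if (i + ahead < lut.size())
            __builtin_prefetch(&lut[i + ahead]);  // hint: we'll need this soon
        total += lut[i];
    }
    return total;
}
```

The prefetch is purely a hint, so the function computes the same sum with or without it.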
I suggest that you read What Every Programmer Should Know About Memory. It has lots of great info on the subject.
Caches are generally organized as a collection of cache lines. Each line covers a block of memory aligned to the line size; for example, a cache with 128-byte lines will only hold data for address ranges that start on 128-byte boundaries.
CPU caches generally use an (approximate) LRU eviction policy -- least recently used, i.e. on a cache miss the stalest line in a set is evicted -- combined with a mapping from each memory address to a particular set of cache lines. (This mapping is the source of the x86 aliasing problems -- often lumped in with false sharing -- that show up when you read from multiple addresses aligned on a 4K or 16M boundary: they all compete for the same set.)
So, when you have a cache miss, the CPU will read in a cache line of memory that includes the address range missed. If you happen to read across a cache line boundary, that means you will read in two cache lines.