CPU is faster than memory.
their speeds have been diverging: CPU speed has been improving at a much faster rate, while memory speed has fallen significantly behind.
there are techniques to slightly speed up memory access (such as interleaving, which uses several parallel buses at the cost of considerable complexity on the circuit board), but these techniques only increase speed linearly, not geometrically, which is what would be needed to keep pace with CPU speeds.
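to make the linear-vs-geometric point concrete, a small sketch in C with made-up numbers: memory throughput that scales with the number of interleaved banks (one extra bank per generation) against CPU speed that doubles each generation.

    /* Illustrative sketch (hypothetical numbers): interleaving grows memory
     * throughput linearly with the number of banks, while CPU speed grows
     * geometrically across generations, so the gap keeps widening. */
    #include <stdio.h>

    int main(void) {
        for (int gen = 0; gen <= 5; gen++) {
            int banks = gen + 1;              /* one bank added per generation (linear) */
            double mem = 1.0 * banks;         /* relative memory throughput             */
            double cpu = 1.0;                 /* relative CPU speed: x2 per generation  */
            for (int i = 0; i < gen; i++) cpu *= 2.0;
            printf("gen %d: cpu %5.1fx  memory %4.1fx  gap %5.1fx\n",
                   gen, cpu, mem, cpu / mem);
        }
        return 0;
    }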
locality: a program spends a large proportion of its time in blocks of relatively small size.
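a minimal C sketch of locality: nearly all execution time is spent in one small loop over a small array, so a modest cache can capture most of the accesses.

    /* Minimal sketch of locality: almost all execution time is spent in one
     * small loop (temporal locality) touching consecutive array elements
     * (spatial locality), so a small cache serves most accesses. */
    #include <stdio.h>

    #define N 1024

    int main(void) {
        static int table[N];
        long sum = 0;

        /* the "hot" block: a few instructions and a small working set,
           executed many times over */
        for (int pass = 0; pass < 10000; pass++) {
            for (int i = 0; i < N; i++) {
                sum += table[i];   /* sequential accesses reuse cache lines */
            }
        }
        printf("%ld\n", sum);
        return 0;
    }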
knowing the size of the cache alone is not an accurate indicator of its performance.
performance also depends on how the cache is mapped, and so on.
of course, hit rate and miss rate depend on the specific program being run, so such measures must be determined in the context of specific benchmarks (i.e. concrete example programs).
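as a rough illustration (not one of the lecture's benchmarks), a tiny direct-mapped cache model in C; the line size, cache size and access patterns are assumptions, but it shows how the same cache gives very different hit rates for different programs.

    /* Sketch: a direct-mapped cache model demonstrating that hit rate depends
     * on the access pattern of the program, not just on the cache size.
     * All sizes and traces below are illustrative assumptions. */
    #include <stdio.h>

    #define LINES     64        /* number of cache lines (direct-mapped) */
    #define LINE_SIZE 16        /* bytes per line                        */

    static long tags[LINES];
    static int  valid[LINES];

    /* returns 1 on hit, 0 on miss */
    static int access_cache(long addr) {
        long block = addr / LINE_SIZE;
        int  index = (int)(block % LINES);
        long tag   = block / LINES;
        if (valid[index] && tags[index] == tag) return 1;
        valid[index] = 1;        /* miss: fill the line */
        tags[index]  = tag;
        return 0;
    }

    static double hit_rate_for_stride(int stride, int n) {
        int hits = 0;
        for (int i = 0; i < LINES; i++) valid[i] = 0;   /* start cold */
        for (int i = 0; i < n; i++)
            hits += access_cache((long)(i % 256) * stride);
        return (double)hits / n;
    }

    int main(void) {
        /* same cache, different "programs" (access patterns) */
        printf("sequential (stride 4):  hit rate %.2f\n", hit_rate_for_stride(4, 100000));
        printf("strided (stride 1024):  hit rate %.2f\n", hit_rate_for_stride(1024, 100000));
        return 0;
    }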
in this lecture, the focus is on L1 cache: L1 is tied closely to the CPU, while L2 sits closer to main memory (main store RAM).
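one way to tie L1, L2 and hit/miss rates together is the standard average memory access time (AMAT) formula for a two-level hierarchy; the cycle counts below are assumed for illustration, not figures from the lecture.

    /* Sketch with assumed latencies:
     *   AMAT = t_L1 + m_L1 * (t_L2 + m_L2 * t_mem)
     * where m_x is the miss rate of level x. */
    #include <stdio.h>

    int main(void) {
        double t_l1  = 1.0;    /* L1 hit time (cycles), close to the CPU */
        double t_l2  = 10.0;   /* L2 hit time (cycles)                   */
        double t_mem = 100.0;  /* main store (RAM) access time (cycles)  */
        double m_l2  = 0.5;    /* assumed L2 miss rate                   */

        double l1_miss_rates[] = { 0.01, 0.05, 0.10, 0.20 };
        for (int i = 0; i < 4; i++) {
            double m_l1 = l1_miss_rates[i];
            double amat = t_l1 + m_l1 * (t_l2 + m_l2 * t_mem);
            printf("L1 miss rate %.2f -> AMAT %.1f cycles\n", m_l1, amat);
        }
        return 0;
    }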