Locality of reference: Difference between revisions
imported>Drew R. Smith No edit summary |
imported>Daniel Mietchen (remove Article-of-the-Week transclusions) |
||
(8 intermediate revisions by 4 users not shown) | |||
Line 2: | Line 2: | ||
{{TOC|right}} | {{TOC|right}} | ||
{{seealso|Locality of networks}} | {{seealso|Locality of networks}} | ||
'''Locality of reference''' is a commonly observed pattern in [[RAM|memory]] accesses by a [[computer program]] over time. The idea is, that memory accesses that happen close to one another in time tend to occur close to one another in space (memory address). Locality of reference is one of the primary motivations for copying data ''in chunks'' from slower memory to faster memory in the [[memory hierarchy]]. Examples are [[cache|memory caches]], which attempt to load a range of main memory contents at a time, and [[paging|memory paging]] where pages of 1024 bytes or more are stored on and retrieved from harddisks in one go. Each time the assumption is made that the excess memory contents will be needed soon after. | |||
Locality of reference can be exploited by a computer's [[memory controller]] for drastic improvements in memory access times. In general, whenever a memory access takes place, the memory controller will attempt to read a larger section of memory which contains the target address. In the common case, subsequent memory accesses will likely target memory addresses that have been loaded into the cache by that same read. | Locality of reference can be exploited by a computer's [[memory controller]] for drastic improvements in memory access times. In general, whenever a memory access takes place, the memory controller will attempt to read a larger section of memory which contains the target address. In the common case, subsequent memory accesses will likely target memory addresses that have been loaded into the cache by that same read. | ||
== Thought | == Thought experiment: fetch-execute cycle == | ||
When a typical [[computer]] is executing a program, it repeatedly reads the next instruction in memory and then executes it. Typically, those instructions are placed in sequential memory addresses, with exceptions for branches that occur for [[control structure|control structures]] such as [[loop|loops]], [[conditional|conditionals]] and [[function invocation|function or method invocations]]. | When a typical [[computer]] is executing a program, it repeatedly reads the next instruction in memory and then executes it. Typically, those instructions are placed in sequential memory addresses, with exceptions for branches that occur for [[control structure|control structures]] such as [[loop|loops]], [[conditional|conditionals]] and [[function invocation|function or method invocations]]. | ||
== Thought | == Thought experiment: array algorithms == | ||
Suppose we had an [[algorithm]] which was to select the largest number in an [[array]]. One straight-forward way to accomplish this (indeed, the optimal solution for an unsorted flat array) is to [[iteration|iterate]] over each element of the array in order, and check | Suppose we had an [[algorithm]] which was to select the largest number in an [[array]]. One straight-forward way to accomplish this (indeed, the optimal solution for an unsorted [[flat array]]) is to [[iteration|iterate]] over each element of the array in order, and check whether each one is the largest so far. Thus, at time T=0, we check element 0, at T=1, we check element 1, and so on. Without a cache, the processor would need to spend a little bit of time during each instruction cycle to fetch the array element from main memory. But, if the processor employs a cache, we can achieve a speed-up as follows. | ||
Once the algorithm attempts its first read, the processor's memory controller will fetch not just that element, but the entire cache line which contains that element. The processor must wait for that element before it can proceed, but the memory controller can continue fetching the rest while the processor moves on to the next instruction. As a result, for the next few elements we can avoid a memory stall with each array access. | Once the algorithm attempts its first read, the processor's memory controller will fetch not just that element, but the entire cache line which contains that element. The processor must wait for that element before it can proceed, but the memory controller can continue fetching the rest while the processor moves on to the next instruction. As a result, for the next few elements we can avoid a memory stall with each array access. | ||
== Problematic | == Problematic access patterns == | ||
Any memory access which does not follow a linear pattern is problematic for mechanisms exploiting locality of reference. They will have spend resources to fetch more than what was immediately needed but the additional memory content fetched is now useless and the additional resources spend thus wasted. | Any memory access which does not follow a linear pattern is problematic for mechanisms exploiting locality of reference. They will have spend resources to fetch more than what was immediately needed but the additional memory content fetched is now useless and the additional resources spend thus wasted. | ||
Typically this occurs with referential [[data structure|data structures]] like [[linked list|linked lists]] if the elements are spread over memory, if index structure and content are stored together, and if the accessing algorithm is only interested in the index structure but not the indexed content. Thus traversing a linked list to find its tail or following all references in memory to detect unreachable objects for [[garbage collection]] are often very inefficient from a locality of reference point of view. | Typically this occurs with referential [[data structure|data structures]] like [[linked list|linked lists]] if the elements are spread over memory, if index structure and content are stored together, and if the accessing algorithm is only interested in the index structure but not the indexed content. Thus traversing a linked list to find its tail or following all references in memory to detect unreachable objects for [[garbage collection]] are often very inefficient from a locality of reference point of view. | ||
But also data structures which are typically assumed to be linear such as an array can prove problematic for locality of reference if access to them is non-linear. The typical example for this are [[hash table|hash tables]], which are often realized as arrays. Due to the hash algorithm access to any element of that array gives basically no preference of any following access to require a neighboring element. | But also data structures which are typically assumed to be linear such as an array can prove problematic for locality of reference if access to them is non-linear. The typical example for this are [[hash table|hash tables]], which are often realized as arrays. Due to the hash algorithm access to any element of that array gives basically no preference of any following access to require a neighboring element. |
Revision as of 04:17, 9 March 2010
- See also: Locality of networks
Locality of reference is a commonly observed pattern in memory accesses by a computer program over time. The idea is, that memory accesses that happen close to one another in time tend to occur close to one another in space (memory address). Locality of reference is one of the primary motivations for copying data in chunks from slower memory to faster memory in the memory hierarchy. Examples are memory caches, which attempt to load a range of main memory contents at a time, and memory paging where pages of 1024 bytes or more are stored on and retrieved from harddisks in one go. Each time the assumption is made that the excess memory contents will be needed soon after.
Locality of reference can be exploited by a computer's memory controller for drastic improvements in memory access times. In general, whenever a memory access takes place, the memory controller will attempt to read a larger section of memory which contains the target address. In the common case, subsequent memory accesses will likely target memory addresses that have been loaded into the cache by that same read.
Thought experiment: fetch-execute cycle
When a typical computer is executing a program, it repeatedly reads the next instruction in memory and then executes it. Typically, those instructions are placed in sequential memory addresses, with exceptions for branches that occur for control structures such as loops, conditionals and function or method invocations.
Thought experiment: array algorithms
Suppose we had an algorithm which was to select the largest number in an array. One straight-forward way to accomplish this (indeed, the optimal solution for an unsorted flat array) is to iterate over each element of the array in order, and check whether each one is the largest so far. Thus, at time T=0, we check element 0, at T=1, we check element 1, and so on. Without a cache, the processor would need to spend a little bit of time during each instruction cycle to fetch the array element from main memory. But, if the processor employs a cache, we can achieve a speed-up as follows.
Once the algorithm attempts its first read, the processor's memory controller will fetch not just that element, but the entire cache line which contains that element. The processor must wait for that element before it can proceed, but the memory controller can continue fetching the rest while the processor moves on to the next instruction. As a result, for the next few elements we can avoid a memory stall with each array access.
Problematic access patterns
Any memory access which does not follow a linear pattern is problematic for mechanisms exploiting locality of reference. They will have spend resources to fetch more than what was immediately needed but the additional memory content fetched is now useless and the additional resources spend thus wasted.
Typically this occurs with referential data structures like linked lists if the elements are spread over memory, if index structure and content are stored together, and if the accessing algorithm is only interested in the index structure but not the indexed content. Thus traversing a linked list to find its tail or following all references in memory to detect unreachable objects for garbage collection are often very inefficient from a locality of reference point of view.
But also data structures which are typically assumed to be linear such as an array can prove problematic for locality of reference if access to them is non-linear. The typical example for this are hash tables, which are often realized as arrays. Due to the hash algorithm access to any element of that array gives basically no preference of any following access to require a neighboring element.