Load-link/store-conditional
In computer science, load-linked/store-conditional[1] (LL/SC), sometimes known as load-reserved/store-conditional[2] (LR/SC), are a pair of instructions used in multithreading to achieve synchronization. Load-link returns the current value of a memory location, while a subsequent store-conditional to the same memory location will store a new value only if no updates have occurred to that location since the load-link. Together, this implements a lock-free, atomic, read-modify-write operation. "Load-linked" is also known as load-link,[3] load-reserved,[2] and load-locked.[citation needed] LL/SC was originally[4] proposed by Jensen, Hagensen, and Broughton for the S-1 AAP multiprocessor[1] at Lawrence Livermore National Laboratory. Comparison of LL/SC and compare-and-swapIf any updates have occurred, the store-conditional is guaranteed to fail, even if the value read by the load-link has since been restored. As such, an LL/SC pair is stronger than a read followed by a compare-and-swap (CAS), which will not detect updates if the old value has been restored (see ABA problem). Real implementations of LL/SC do not always succeed even if there are no concurrent updates to the memory location in question. Any exceptional events between the two operations, such as a context switch, another load-link, or even (on many platforms) another load or store operation, will cause the store-conditional to spuriously fail. Older implementations will fail if there are any updates broadcast over the memory bus. This is called weak LL/SC by researchers, as it breaks many theoretical LL/SC algorithms.[5] Weakness is relative, and some weak implementations can be used for some algorithms. LL/SC is more difficult to emulate than CAS. Additionally, stopping running code between paired LL/SC instructions, such as when single-stepping through code, can prevent forward progress, making debugging tricky.[6] Nevertheless, LL/SC is equivalent to CAS in the sense that either primitive can be implemented in terms of the other, in O(1) and in a wait-free manner.[7] ImplementationsLL/SC instructions are supported by:
Some CPUs[which?] require the address being accessed exclusively to be configured in write-through mode. Typically, CPUs track the load-linked address at a cache-line or other granularity, such that any modification to any portion of the cache line (whether via another core's store-conditional or merely by an ordinary store) is sufficient to cause the store-conditional to fail. All of these platforms provide weak[clarification needed] LL/SC. The PowerPC implementation allows an LL/SC pair to wrap loads and even stores to other cache lines (although this approach is vulnerable to false cache line sharing). This allows it to implement, for example, lock-free reference counting in the face of changing object graphs with arbitrary counter reuse (which otherwise requires double compare-and-swap, DCAS). RISC-V provides an architectural guarantee of eventual progress for LL/SC sequences of limited length. Some ARM implementations define platform dependent blocks, ranging from 8 bytes to 2048 bytes, and an LL/SC attempt in any given block fails if there is between the LL and SC a normal memory access inside the same block. Other ARM implementations fail if there is a modification anywhere in the whole address space. The former implementation is the stronger and most practical. LL/SC has two advantages over CAS when designing a load–store architecture: reads and writes are separate instructions, as required by the design philosophy (and pipeline architecture); and both instructions can be performed using only two registers (address and value), fitting naturally into common 2-operand ISAs. CAS, on the other hand, requires three registers (address, old value, new value) and a dependency between the value read and the value written. x86, being a CISC architecture, does not have this constraint; though modern chips may well translate a CAS instruction into separate LL/SC micro-operations internally. ExtensionsHardware LL/SC implementations typically do not allow nesting of LL/SC pairs.[17] A nesting LL/SC mechanism can be used to provide a MCAS primitive (multi-word CAS, where the words can be scattered).[18] In 2013, Trevor Brown, Faith Ellen, and Eric Ruppert implemented in software a multi-address LL/SC extension (which they call LLX/SCX) that relies on automated code generation;[19] they have used it to implement one of the best-performing concurrent binary search tree (actually a chromatic tree), slightly beating the JDK CAS-based skip list implementation.[20] See alsoReferences
|
Portal di Ensiklopedia Dunia