A Study of Consistency Mechanisms in Distributed Systems (part 3)

Leases: Time Based Locking
As seen in chapter 1, caching introduces the problem of ensuring consistency between the cached data and its primary location of storage. By consistent, we mean that the behavior is equivalent to there being only a single (uncached) copy of the data except for the performance benefit of the cache. With large caches, the traffic required to maintain consistency can be the dominant factor in cache performance.
Cache consistency protocols have been extensively studied in the work on shared memory multiprocessor architectures; this work relies on reliable, synchronous broadcast communication as provided by the system bus. A distributed system, however, can experience partial failures: a host may crash or messages may be lost. Existing approaches to consistency for file caches fall into two categories: those that assume reliable broadcast, and so do not tolerate communication failures, and those that require a consistency check for every read, and so fail to deliver good performance.
Leases are proposed as a consistency protocol that handles host and communication failures using physical clocks.

4.1. Leases and Cache Consistency
A lease is a contract that gives its holder specified rights over property for a limited period of time. In the context of caching, a lease grants to its holder control over writes to the covered datum during the term of the lease, such that the server must obtain the approval of the leaseholder before the datum may be written. When a leaseholder grants approval for a write, it invalidates its local copy of the datum.
A cache using leases requires a valid lease on the datum (in addition to holding the datum) before it returns the datum in response to a read, or modifies the datum in response to a write. When a datum is fetched from the server (the primary storage site of the datum), the server also returns a lease guaranteeing that any client will not write the data during the lease term unless the server first obtains the approval of this leaseholder. If the datum is read again within the term of the lease (and the datum is still in the cache), the cache provides immediate access to the datum without communicating with the server. After the lease expires, a read of the datum requires that the cache first extend the lease on the datum, updating the cache if the datum has been modified since the lease expired. When a client writes a datum, the server must defer the request until each leaseholder has granted approval or the term of its lease has expired.
Write-through gives clean failure semantics: no write that has been made visible to any client can be lost; applications must otherwise be prepared to recover from lost writes. Though the cost of write-through for file caches is considered prohibitive by some, the cost can be largely eliminated by giving special handling to temporary files, since they receive the majority of writes.

4.1.1. Example
To illustrate the operation of a file cache using leases, consider a diskless workstation being used for document production. When the workstation executes latex for the first time, it obtains a lease on the binary file containing latex for a term of (say) 10 seconds. Another access to the same file 5 seconds later can use the cached version of this file without checking with the file server. An access to this file after the lo-second term has expired requires the cache to check with the server. When a new version of latex is installed, the write is delayed until every leaseholder has approved the write. If some host holding a lease for this file is unreachable, the delay continues until the lease expires.
In the preceding example, the relevant reads and writes are not limited to operations on the contents of the file. In order to support a repeated open, the cache must also hold the name-to-file binding and permission information, and it needs a lease over this information in order to use that information to perform the open. Similarly, modification of this information, such as renaming the file, would constitute a write.
In the following sections we take look at the importance of the lease term, conducts to manage leases and fault tolerance. Section 5 presents the case study of the NQNFS approach and Section 6 describes some variants to the Leases mechanism.

4.2. Significance of the Lease Term
4.2.1. Short-term leases
Short lease terms have several advantages:
Minimize the delay resulting from client and server failures (and partitioning communication failures). When the server cannot communicate with a client, the server must delay writes to a file for which the failed client holds a lease until that lease expires.] When a server is recovering after crashing, it must honor the leases it granted before it crashed. This is most easily done if it remembers the maximum term for which it had granted a lease, and it delays writes to all tiles for that period, effectively increasing the time to fully recover by the maximum term. Alternately, the server can maintain a more detailed record of leases on persistent storage, but the additional I/O traffic is unlikely to be justified unless terms of leases are much longer than the time to recover.
Minimize the false write-sharing that occurs. False sharing refers here to a lease conflict when no actual conflict in file access exists. Specifically false sharing occurs when a client writes to a file, which is covered by a lease held by another client when the other client is not currently accessing the tie. False sharing introduces the overhead of a callback to the leaseholder(s) (thereby delaying the requesting client and loading the leaseholder and server) in a situation where without leases there would be no conflict. In the extreme, a lease term should be set to zero if a client is not going to access the file before it is modified by another client.
Reduce the storage requirements at the server, since the record of expired leases could be reclaimed. However, the storage overhead for the server to keep track of the leases it has granted is modest. The server requires a record of each leaseholder’s identity and a list of the leases it holds; each lease requires only a couple of pointers. For a client holding about one hundred leases, the total is around one kilobyte per client. Even if this were a problem, recording leases at a larger granularity could reduce it, so that each client holds few leases, at the expense of some increase in contention.

4.2.2. Longer-term leases
Long lease terms are significantly more efficient both for the client and server on files that are accessed repeatedly and have relatively little write-sharing. This may be observed in the Andrew file system project, which went from using a lease term of zero in the prototype to effectively a lease term of infinity in the revised version [5].

4.3. Options for Lease Management
Lease management in the server admits several options that may be exploited to improve performance. The server controls the term of the leases it grants; it is also free to wait for a lease to expire instead of seeking approval of a write. The client is free in deciding when to request extension of leases, when to relinquish them, and when to approve a write. The combinations of these options give different trade-offs between load and response time.
For example, the client may anticipate the expiration of its leases and request extension before the covered file is accessed. Doing so improves response time by eliminating the added delay for reads, but it does so at the cost of increased load for the server. In particular, an idle client continues to request extensions even when files are not being accessed, and because the cache continues to hold leases it may increase the level of contention due to false sharing.
The server can set the lease term based on the file access characteristics for the requested file as well as the propagation delay to the client. In particular, a heavily write-shared file might be given a lease term of zero. A lease given to a distant client could be increased to compensate for the amount the lease term is reduced by the propagation delay and for the extra delay incurred by the client to extend the lease. In general, a server can dynamically pick lease terms on a per file and per client cache basis using the analytic model, assuming the necessary performance parameters am monitored by the server.

4.4. Fault-Tolerance
Leases ensure consistency provided that the hosts and network do not suffer certain Byzantine failures including clock failure. More specifically, consistency is maintained in spite of message loss, and client or server failures (assuming writes are persistent at the server across a crash). Moreover, availability is not reduced by the caches because an unreachable client at most briefly delays write access by other clients.
Leases depend on well-behaved clocks. In particular, a server clock that advances too quickly can cause errors because it may allow a write before the term of a lease held by a previous client has expired at that client. Similarly, if a client clock fails by advancing too slowly, it may continue using a lease which the server regards as having expired The opposite errors-a slow server clock or fast client clock-do not result in inconsistencies, but do generate extra traffic since a client will regard leases to have expired before the server does, Such failures are much less common than either crashes or communication failures; they can be detected quickly by either a synchronization protocol or by including explicit timestamps in lease-related messages.
It is regarded as a reasonable assumption that clocks at the nodes of a distributed system are synchronized within fraction which is small relative to the lease terms of several seconds. Synchronized time is required for other aspects of file access as well, such as the file-modified times used by the Unix make facility [5]. As a minimum, the correct functioning of leases requires only that c1ock.s have a known bounded drift, in which case the lease term can be communicated.

4.5. Not Quite NFS: A Case Study
The NQNFS [6] cache consistency protocol is based on Leases. The basic principle is that the server disables client caching of files whenever concurrent write sharing could occur, by performing a server-to-client callback, forcing the client to flush its caches and to do all subsequent I/O on the file with synchronous RPCs. NQNFS, uses a short-term lease that expires due to timeout after a maximum of one minute, unless explicitly renewed by the client. The key-point is that an NQNFS client must keep renewing a lease to use cached data. Using leases permits the server to remain "stateless," since the soft state information, which consists of the set of current leases, is moot after one minute, when all the leases expire.
Whenever a client wishes to access a file’s data it must hold one of three types of lease:
read-caching
write-caching
non-caching.
The latter type requires that all file operations be done synchronously with the server via the appropriate RPCs.
The following sub-sections give timeline charts for read-caching, write-caching and shared write leases.

4.5.1. Read-caching Lease

A read-caching lease allows for client data caching but no modifications may be done. It may, however, be shared between multiple clients. Figure 4.1 shows a typical read-caching scenario. The vertical solid black lines depict the lease records. Note that the time lines are nowhere near to scale, since a client/server interaction will normally take less than one hundred milliseconds, whereas the normal lease duration is thirty seconds.

4.5.2. Write-caching Lease

A write-caching lease permits delayed write caching, but requires that all data be pushed to the server when the lease expires or is terminated by an eviction callback. When a write-caching lease has almost expired, the client will attempt to extend the lease if the file is still open, but is required to push the delayed writes to the server if renewal fails, as depicted by Figure 4.2. The writes may not arrive at the server until after the write lease

has expired on the client, but this does not result in a consistency problem, so long as the write lease is still valid on the server. Note that, in Figure 4.2, the lease record on the server remains current after the expiry time, due to the conditions mentioned in section 5. If a write RPC is done on the server after the write lease has expired on the server, this could be considered an error since consistency could be lost, but it is not handled as such by NQNFS.

4.5.3. Eviction of a Lease

Figure 4.3 depicts how read and write leases are replaced by a non-caching lease when there is the potential for write sharing. A write-caching lease is not used in the Stanford V Distributed System, since synchronous writing is always used. A side effect of this change is that the five to ten second lease duration recommended by Gray [5] was found to be insufficient to achieve good performance for the write-caching lease. Experimentation showed that thirty seconds was about optimal for cases where the client and server are connected to the same local area network, so thirty seconds is the default lease duration for NQNFS. A maximum of twice that value is permitted, since Gray showed that for some network topologies, a larger lease duration functions better. Although there is an explicit get_lease RPC defined for the protocol, most lease requests are piggybacked onto the other RPCs to minimize the additional overhead introduced by leasing.

Variations of the Leases Protocol

4.6.1. Volume Leases

Traditional leases provide good performance when the cost of renewing leases is amortized over many reads. Unfortunately, for many WAN workloads, reads of an object may be spread over seconds or minutes, requiring long leases in order to amortize the cost of renewals. To make leases practical for these workloads, an algorithm is proposed which uses a combination of object leases, which are associated with individual data objects, and volume leases, which are associated with a collection of related objects on the same server [8]. In this scheme a client reads data from its cache only if both its object and volume leases for that data are valid, and a server can modify data as soon as either lease has expired. By making object leases long and volume short, we overcome the limitations of traditional leases: long object leases have low overhead, while short volume leases allow servers to modify data without long delays. Furthermore, if there is spatial locality within a volume, the overhead of renewing short leases on volumes is amortized across many objects.

4.6.1.1. Basic algorithm

The basic algorithm is simple:

Reading Data. Clients read cached data only if they hold valid object and volume leases on the corresponding objects. Expired leases are renewed by contacting the appropriate servers. When granting a lease for an object to a client, if has been modified since the last time held a valid lease on then the server piggybacks the current data on the lease renewal.

Writing Data. Before modifying an object, a server sends invalidation messages to all clients that hold valid leases on the object. The server delays the write until it receives acknowledgments from all clients, or until the volume or object leases expire. After modifying the object, the server increments the object’s version number.

Cooperative Leases: A Cache Consistency Mechanism for CDNs

A consistency mechanism employed by a Content Distributive Network (CDN) should satisfy two key requirements:

scalability: the approach should scale to a large number of proxies employed by the CDN and should impose low overheads on the origin servers and proxies

flexibility: the approach should support different levels of consistency guarantees. Co-operative leases, based on a generalization of leases, satisfy these requirements.

In the original leases approach, the server grants a lease to each request from a proxy. The lease denotes the interval of time during which the server agrees to notify the proxy if the object is modified. After the expiration of the lease, the proxy must send a message requesting a lease renewal.

The leases approach has two drawbacks from the perspective of a CDN:

Leases provide strong consistency semantics by virtue of notifying a proxy of all updates to an object. As argued earlier, not all objects cached within a CDN need such stringent guarantees.

Leases require the server to maintain state for each proxy caching an object; the resulting state space overhead can be excessive for large CDN’s. Thus, leases do not scale well to busy servers and large CDN’s.

To alleviate these drawbacks, Co-operative Leases [10] are proposed allow a server to grant a single lease collectively to a group of proxies, instead of issuing a separate lease to each individual proxy. For each cached object, the proxy group designates an invalidation proxy, referred to as the leader that is responsible for all lease-related interactions with the server. The leader of a group manages the lease on behalf of all the proxies in the group. Since a leader is selected per object no single proxy becomes the bottleneck. Moreover, the server only notifies the leader upon an update to the object; the leader is then responsible for propagating this notification to other proxies in the group that are caching the object. Such an approach has two significant advantages:

It reduces the the amount of state maintained at a server (by using a single lease to represent a proxy group instead of an individual proxy)

It reduces the number of notifications that need to be sent by the server (by offloading some of notification burden to leader proxies).

About Unknown

This is a short description in the author block about the author. You edit it by entering text in the "Biographical Info" field in the user admin panel.

A Study of Consistency Mechanisms in Distributed Systems (part 3)

About Unknown

0 comments:

Post a Comment

A Study of Consistency Mechanisms in Distributed Systems (part 3)

About Unknown

RELATED POSTS

0 comments:

Post a Comment