Chapter 9: Distributed File Systems Flashcards

1
Q

CAP Theorem

A
  • Brewer's conjecture [1]
    • You can have at most two of these properties for any shared-data system
  • Proof of the conjecture [2]
    • Impossible to reliably provide atomic, consistent data when there are partitions in the network
  • Consistency
    • Any read operation that begins after a write operation must contain that value or that of a later write operation
  • Availability
    • Every request received must result in a response
  • Partition tolerant
    • Even when network failures occur, every request must terminate
    • E.g., if Datacenter A cannot connect to Datacenter B, the leader election algorithm should not elect a new leader in each datacenter
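A minimal sketch of the trade-off during a partition (illustrative only, not from the lecture): two replicas that cannot reach each other must either refuse requests to stay consistent or answer from possibly stale local state to stay available.

```python
# Hypothetical sketch: two replicas during a network partition must choose
# between refusing requests (consistent, not available) and answering from
# local state (available, not consistent).

class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.peer_reachable = True    # becomes False during a network partition

    def write(self, value, mode):
        if not self.peer_reachable and mode == "CP":
            raise RuntimeError("partition: refusing write to stay consistent")
        self.value = value            # AP mode: accept the write, replicas diverge

    def read(self, mode):
        if not self.peer_reachable and mode == "CP":
            raise RuntimeError("partition: refusing read to stay consistent")
        return self.value             # AP mode: may return stale data

a, b = Replica("A"), Replica("B")
a.peer_reachable = b.peer_reachable = False   # simulate the partition

a.write("x=1", mode="AP")                     # stays available, but ...
print(a.read("AP"), b.read("AP"))             # ... B still returns None: inconsistent
```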
2
Q

POSIX guarantees in distributed systems?

A
  • Real world distributed systems
    • Higher latencies over network than local access
    • Unreliable network
    • Limited bandwidth
    • Topology changes
    • Heterogeneous environments
  • CAP theorem
    • It is proven that systems which are simultaneously Consistent, Available, and Partition tolerant are not possible
    • => Exposing the POSIX API with all of its required atomic guarantees (which would need CAP) over the network is not possible, hence we call it an abstraction which leaks.
  • => Cloud storage API
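A hedged illustration of why the weaker interface is feasible: a cloud-storage-style API only promises whole-object put/get, while POSIX promises byte-granular, atomically visible operations. The `ObjectStore` class below is purely hypothetical.

```python
# Hypothetical object-store sketch: whole-object put/get only, no seek(),
# no partial in-place writes, no rename/locking semantics to keep atomic
# over an unreliable network (POSIX would have to guarantee all of those).

class ObjectStore:
    def __init__(self):
        self._objects = {}

    def put(self, key, data: bytes):
        self._objects[key] = data      # replaces the whole object in one step

    def get(self, key) -> bytes:
        return self._objects[key]      # in a real replicated store this may lag

store = ObjectStore()
store.put("bucket/report.txt", b"objects are rewritten as a whole")
print(store.get("bucket/report.txt"))
```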
3
Q

Network File System (NFS)

A
  • Sharing of data between computers
  • Protocol designed for local LANs
  • NFS creates a remote access layer for file systems
    • Remote machines can then handle inodes
  • NFS is built on multiple protocols
    • NFS (File creation, reading, writing, searching, authentication, stats)
    • Mountd (Mounting of exported filesystems)
    • Nsm (Network Status Monitor, monitors client/server status)
    • Nlm (Network Lock Manager, provides locking capabilities)
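A rough sketch of the remote-access-layer idea (class names are made up; real NFS does this via RPC plus the protocols listed above): the client forwards file operations over the network and the server executes them against its local file system.

```python
# Hypothetical stubs illustrating the remote access layer: the client sends the
# operation, the server performs it locally and returns the result.

class NfsServerStub:
    def __init__(self, exported_files):
        self.files = exported_files          # stands in for the local file system

    def handle(self, op, path):
        if op == "read":
            return self.files[path]          # server-side local access

class NfsClientStub:
    def __init__(self, server):
        self.server = server                 # stands in for the network connection

    def read(self, path):
        return self.server.handle("read", path)   # remote call instead of local I/O

server = NfsServerStub({"/export/readme.txt": b"shared over the LAN"})
print(NfsClientStub(server).read("/export/readme.txt"))
```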
4
Q

Caching in distributed filesystems

A
5
Q

Definitions - Caching

A
  • Cooperative caching
    • A request goes through multiple hierarchies. Cooperative caching is used to form a single unified cache from multiple servers.
  • Cache coherence
    • A cache is coherent if, whenever a process writes to any location, a subsequent read by any process sees the modification
  • Prefetching
    • Caching data blocks which will probably be accessed in the near future
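A small sketch of prefetching, assuming sequential access and an invented `fetch_from_server` helper: on a cache miss, the cache also pulls the next few blocks.

```python
# Hypothetical read-ahead sketch: on a miss for block i, also fetch the next
# few blocks, betting that access is sequential (prefetching).

PREFETCH_WINDOW = 2   # assumed number of blocks fetched ahead of the request

def fetch_from_server(block_id):
    return f"data-of-block-{block_id}"          # stands in for an expensive network read

cache = {}

def read_block(block_id):
    if block_id not in cache:                   # miss: go to the server
        for b in range(block_id, block_id + 1 + PREFETCH_WINDOW):
            cache[b] = fetch_from_server(b)     # prefetch likely-next blocks too
    return cache[block_id]

print(read_block(0))   # miss, fetches blocks 0..2
print(read_block(1))   # hit, served from the local cache
```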
6
Q

Lease Manager (LMGR)

A
  • File systems use locking mechanisms to control access to the disk
  • Leases are locks with an expiration period set up in advance
    • Needed in case of network outage
    • Causes additional overhead, lease renewal
  • Implementation
    • Each OSD has one LMGR which acquires and renews the major lease
    • All leases for objects are managed by the OSD LMGR
    • OSD knows the network address of the current holder of the major lease
    • The LMGR grants exclusive leases for objects residing in the OSD
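A minimal sketch of a lease (period and class are illustrative, not the actual LMGR implementation): a lock that expires unless the holder keeps renewing it, so a failed holder cannot block the object forever.

```python
# Hypothetical lease sketch: a lock with an expiration period set up in advance.

import time

LEASE_PERIOD = 30.0   # seconds; assumed value agreed on up front

class Lease:
    def __init__(self, holder):
        self.holder = holder
        self.expires_at = time.monotonic() + LEASE_PERIOD

    def valid(self):
        return time.monotonic() < self.expires_at

    def renew(self):
        # renewal is the additional overhead mentioned above
        self.expires_at = time.monotonic() + LEASE_PERIOD

lease = Lease(holder="client-42")
assert lease.valid()
lease.renew()          # holder keeps renewing while it works on the object
# if the holder disappears, the lease simply runs out and can be granted again
```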
7
Q

File Manager (FMGR)

A
  • Each opened file is managed by a single FMGR; open() creates an instance of a file manager
  • FMGR keeps track of each accomplished open() and read() request
  • When an open() request arrives at the FMGR, it checks whether the file has already been opened by another client; otherwise a proper exclusive lease is acquired from the LMGR
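A hypothetical sketch of that open() path (the helper names are invented): reuse the existing file-manager state when some client already opened the file, otherwise acquire the exclusive lease from the LMGR first.

```python
# Hypothetical FMGR open() sketch: first opener acquires the lease, later
# openers are simply tracked against the existing instance.

open_files = {}          # path -> simplified file-manager state

def lmgr_acquire_exclusive_lease(path):
    return f"lease-for-{path}"          # stands in for a call to the LMGR

def fmgr_open(path, client):
    if path not in open_files:                      # first opener
        lease = lmgr_acquire_exclusive_lease(path)
        open_files[path] = {"lease": lease, "clients": set()}
    open_files[path]["clients"].add(client)         # track the accomplished open()
    return open_files[path]

fmgr_open("/data/a.txt", "client-1")    # acquires the lease
fmgr_open("/data/a.txt", "client-2")    # file already open, no new lease needed
```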
8
Q

Transaction Server (TSVR)

A
  • Responsibility
    • Directory operations are implemented as distributed transactions
  • Example
    • Creating a new file means creating a new entry in the parent directory and creating the file itself
    • Potential failures
      • Creating entry in parent directory
      • Creating file
      • Initiating host can fail
  • Works on a per-operation basis
    • Acquires leases and performs the action
    • Holds acquired leases for as long as possible
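A toy sketch of a directory operation done as a transaction, with made-up in-memory structures: both steps of the file creation are applied, and the directory entry is undone if the second step fails.

```python
# Hypothetical transaction sketch: acquire the lease, apply both steps,
# roll back step 1 if step 2 fails so the namespace stays consistent.

directories = {"/docs": set()}     # parent directory -> names it contains
files = set()                      # existing files, by full path

def acquire_lease(path):
    return f"lease-for-{path}"     # stands in for the real LMGR call

def create_file(parent, name, fail_second_step=False):
    acquire_lease(parent)                      # would be held as long as possible
    directories[parent].add(name)              # step 1: new entry in the parent
    try:
        if fail_second_step:
            raise IOError("host failed while creating the file")
        files.add(f"{parent}/{name}")          # step 2: create the file itself
    except Exception:
        directories[parent].discard(name)      # undo step 1 on failure
        raise

create_file("/docs", "a.txt")                  # both steps succeed
try:
    create_file("/docs", "b.txt", fail_second_step=True)
except IOError:
    pass
print(directories, files)                      # b.txt never shows up anywhere
```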
9
Q

GFS (The Google File System)

A
  • Fault tolerance while running on inexpensive commodity hardware
  • Introduces an API which is designed to be implemented scalably
  • Mutation
    • The GFS paper uses the word "mutation" for any modification done to a file. This can be an in-place update, an append, or a file creation; hence we also use it in the slides.
  • Leases
    • Ownership for a specified length of time
    • Leases can be renewed or extended by the owner
    • When the owner fails, the lease expires
    • The owner can end the lease early
10
Q

GFS Design assumptions

A
  • Inexpensive commodity components that often fail
  • Typical file size 100MB or larger, no need to optimize for small files
  • Read workload
    • Large streaming reads
    • Small random reads
  • Write workload
    • Append to files
    • Modification supported but not a design goal
    • Hundreds of concurrently appending clients
  • Bandwidth is more important than low latency
11
Q

Architecture GFS

A
  • Files
    • Divided into fixed-size chunks
    • Identified by an immutable and unique id (chunk handle)
  • Single master
    • Maintains file system metadata
      • Namespace, access control information, mapping from files to chunks, location of chunks
    • Garbage collection (deferred deletion of files)
    • Sends heartbeat messages to chunk servers
  • Multiple chunk servers
    • Chunks are stored on disks as files
    • Each chunk is replicated to multiple chunk servers (depending on the replication factor of the region)
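A hedged sketch of how a client would resolve a byte offset (the metadata tables and names are invented; the 64 MB chunk size is the one from the GFS paper): the client computes the chunk index, asks the master for the chunk handle and replica locations, then talks to a chunk server directly.

```python
# Hypothetical chunk-addressing sketch: byte offset -> chunk index -> chunk
# handle -> chunk servers holding replicas.

CHUNK_SIZE = 64 * 1024 * 1024      # 64 MB chunks, as in the GFS paper

# metadata the master would keep (simplified, in memory)
file_to_chunks = {"/logs/web.log": ["handle-0001", "handle-0002"]}
chunk_locations = {
    "handle-0001": ["chunkserver-a", "chunkserver-b", "chunkserver-c"],
    "handle-0002": ["chunkserver-b", "chunkserver-c", "chunkserver-d"],
}

def lookup(path, byte_offset):
    chunk_index = byte_offset // CHUNK_SIZE          # the client computes this itself
    handle = file_to_chunks[path][chunk_index]       # one round-trip to the master
    return handle, chunk_locations[handle]           # then read from a chunk server

print(lookup("/logs/web.log", 70 * 1024 * 1024))     # offset falls into the second chunk
```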
12
Q

Guarantees provided by GFS

A
  • File namespace mutations are atomic
    • e.g., file creation, …
    • Uniquely handled by master
    • Master's operation log defines a global total order of the operations
  • File manipulations
    • The state of a file region after a data mutation depends on the type of the action
  • Definitions
    • Consistent: all clients see the same data, regardless of which replica they read from
    • Defined: consistent, and additionally clients see what the mutation writes in its entirety
    • Inconsistent: different clients may see different data
13
Q

How is fault-tolerance achieved? GFS

A
  • Master
    • Operation Log, Replication to shadow master
  • Chunk server
    • All chunks are versioned
    • Version number updated when a lease is granted
    • Chunks with old versions are not served and are deleted
  • Chunks
    • Re-replication triggered by master maintains replication factor
    • Rebalancing
    • Data integrity checks
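A small sketch of the versioning rule (data structures invented for illustration): the master bumps a chunk's version whenever it grants a lease, and replicas holding an older version are treated as stale.

```python
# Hypothetical chunk-versioning sketch: stale replicas are detected by
# comparing the replica's version with the master's authoritative version.

master_version = {"handle-0001": 3}        # authoritative version per chunk
replica_version = {                        # versions actually stored on servers
    "chunkserver-a": 3,
    "chunkserver-b": 2,                    # missed the last lease grant -> stale
}

def grant_lease(handle):
    master_version[handle] += 1            # version bumped on every lease grant

def may_serve(server, handle):
    return replica_version[server] == master_version[handle]

print(may_serve("chunkserver-a", "handle-0001"))   # True
print(may_serve("chunkserver-b", "handle-0001"))   # False: stale, will be deleted
```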
14
Q

How is high-availability achieved? GFS

A
  • Fast recovery of master
    • Checkpointing and operation log
  • Heartbeat messages
    • Include piggybacked information
    • Can trigger re-replication
    • Share current load
    • Can trigger garbage collection
  • => Chunkservers can fail at any time
  • Diagnostic tools
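A hypothetical sketch of a heartbeat with piggybacked information (function names and the message format are made up): the chunk server reports what it holds and its load, and the master reacts by scheduling re-replication or garbage collection.

```python
# Hypothetical heartbeat sketch: piggybacked state lets the master decide on
# re-replication (too few copies) and garbage collection (unreferenced chunks).

def heartbeat(server_id, chunks_held, load):
    return {"server": server_id, "chunks": chunks_held, "load": load}

def master_handle(hb, expected_chunks, replication_factor=3):
    actions = []
    for handle, replicas in expected_chunks.items():
        if len(replicas) < replication_factor:
            actions.append(("re-replicate", handle))       # too few live copies
    for handle in hb["chunks"]:
        if handle not in expected_chunks:
            actions.append(("garbage-collect", handle))    # no longer referenced
    return actions

hb = heartbeat("chunkserver-a", ["handle-1", "handle-9"], load=0.4)
print(master_handle(hb, {"handle-1": ["chunkserver-a", "chunkserver-b"]}))
```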
15
Q

Summary on GFS

A
  • Highly concurrent reads and appends
  • Highly scalable
  • On cheap commodity hardware
  • Built for map-reduce kind of workloads
    • Reads
    • Appends
  • Developers have to understand the limitations and may have to use other tools to work around them
16
Q

HDFS

A
  • The Hadoop Distributed File System
  • Originally built as infrastructure for the Apache Nutch web search engine
  • Quite similar to GFS
  • Facebook
    • One cluster with 8800 cores and 12 PB raw storage
  • LinkedIn
    • Multiple clusters, also with PBs of data
  • Twitter
  • Powerset (bought by Microsoft)
17
Q

F4: Facebook‘s warm BLOB storage

A
  • Storage efficiency
    • Reduce replication factor
  • Suitable for warm BLOBs
  • Fault tolerance
    • Drive failures
    • Host failures
    • Rack failures
    • Datacenter failures
  • Design goals similar to GFS
18
Q

Observation - files can be hot, warm or cold F4

A
19
Q

Properties F4

A
  • BLOBs are stored in volumes which use distributed erasure coding
    • Fewer bytes than triple replication
    • Reed-Solomon codes
    • XOR coding in the wide area to ensure resilience to datacenter failures
  • Volumes
    • Has multiple states
      • Unlocked, which means open for modification
      • Locked, only reads and deletes
    • When unlocked, supports reads, creates (appends) and deletes
    • Once full (100 GB), the volume is locked
    • Comprises 3 files
      • Index, journal and data
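A tiny sketch of the XOR part only (Reed-Solomon within a datacenter is omitted; block contents are made up): with block A in one datacenter, block B in another and A XOR B in a third, any single lost block can be rebuilt from the other two.

```python
# Hypothetical XOR-coding sketch across datacenters: the parity block lets us
# rebuild any one lost block from the two survivors.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

block_a = b"datacenter-1 bytes"
block_b = b"datacenter-2 bytes"
parity  = xor(block_a, block_b)        # stored in a third datacenter

# datacenter 1 is lost: rebuild its block from the survivor and the parity
recovered_a = xor(block_b, parity)
assert recovered_a == block_a
```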
20
Q

Ori + design goals

A
  • Ori replicated file system
  • Think of git combined with Dropbox

Design goals:

  • Availability
    • Data should be available at any time despite nodes going down
  • Accessibility
    • Data should be accessible from everywhere
  • Durability
    • Data should not be lost
  • Usability
    • Built for collaboration and version control
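A very rough sketch of the "git combined with Dropbox" idea (not Ori's actual on-disk format): snapshots are identified by a hash over their contents and their parent, so replicas can compare and exchange versions.

```python
# Hypothetical content-addressed snapshot sketch: each change yields a new
# version id that any replica can compute and compare.

import hashlib

def snapshot(files, parent):
    # hash the file contents plus the parent snapshot id -> a version identifier
    h = hashlib.sha256()
    h.update((parent or "").encode())
    for name in sorted(files):
        h.update(name.encode())
        h.update(files[name])
    return h.hexdigest()

v1 = snapshot({"notes.txt": b"first draft"}, parent=None)
v2 = snapshot({"notes.txt": b"second draft"}, parent=v1)
print(v1 != v2)    # every change produces a new, globally comparable version id
```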