Chapter 4 Threads & Concurrency Flashcards

1
Q

What are threads?

A

A thread is a basic unit of CPU utilization; it comprises a thread ID, a program counter (PC), a register set, and a stack.

2
Q

What do threads share with other threads?

A

A thread shares with other threads belonging to the same process its code section, data section, and other operating-system resources, such as open files and signals.

3
Q

How many threads does a traditional process have?

A

One

4
Q

What is the use of threads?

A

With multiple threads, a process can perform more than one task at a time.

5
Q

Examples of Threading

A

1.) An application that creates photo thumbnails from a collection of images may use a separate thread to generate a thumbnail from each separate image.

2.) A web browser might have one thread display images or text while another thread retrieves data from the network.

3.) A word processor may have a thread for displaying graphics, another thread for responding to keystrokes from the user, and a third thread for performing spelling and grammar checking in the background.

6
Q

Why are threads used?

A

They can be used to leverage processing capabilities on multicore systems. Such applications can perform several CPU-intensive tasks in parallel across the multiple computing cores.

7
Q

Before threads, what method was used to service multiple requests at the same time?

A

Process creation was the method used before threads: when a server received a request, it created a new process to service it. However, process creation is time-consuming and resource-intensive, so this method was abandoned in favor of threads.

8
Q

The 4 Major Benefits of Threads

A

Responsiveness
Resource sharing
Economy
Scalability

9
Q

1st Benefit: Responsiveness

A

Multithreading an interactive application may allow a program to continue running even if part of it is blocked or is performing a lengthy operation, thereby increasing responsiveness to the user.

A single-threaded application would be unresponsive to the user until the operation had been completed. In contrast, if the time-consuming operation is performed in a separate, asynchronous thread, the application remains responsive to the user.

10
Q

2nd Benefit: Resource Sharing

A

Processes can share resources only through techniques such as shared memory and message passing. Such techniques must be explicitly arranged by the programmer.

However, threads share the memory and the resources of the process to which they belong by default. The benefit of sharing code and data is that it allows an application to have several different threads of activity within the same address space.

11
Q

3rd Benefit: Economy

A

Because threads share the resources of the process to which they belong, it is more economical to create and context-switch threads. In general, thread creation consumes less time and memory than process creation. Additionally, context switching is typically faster between threads than between processes.

12
Q

4th Benefit: Scalability

A

The benefits of multithreading can be even greater in a multiprocessor architecture, where threads may be running in parallel on different processing cores.

13
Q

What is a multicore system?

A

Multiple computing cores on a single processing chip, where each core appears as a separate CPU to the operating system.

14
Q

How do multithreading and multicore relate?

A

Multithreaded programming provides a mechanism for more efficient use of the multiple computing cores in the multicore system and improved concurrency.

15
Q

Concurrency on a Single-Core System

A

On a system with a single computing core, concurrency merely means that the execution of the threads will be interleaved over time.

16
Q

Concurrency on a Multicore System

A

On a system with multiple cores, concurrency means that some threads can run in parallel because the system can assign a separate thread to each core.

17
Q

Concurrent System

A

A concurrent system supports more than one task by allowing all the tasks to make progress.

18
Q

Parallel System

A

A parallel system can perform more than one task simultaneously. A concurrent system, by contrast, only needs to allow all tasks to make progress; thus, it is possible to have concurrency without parallelism.

19
Q

Challenges for Programming in Multicore Systems

A

1.) Identifying Tasks
2.) Balance
3.) Data Splitting
4.) Data Dependency
5.) Testing and Debugging

20
Q

1st Problem: Identifying Tasks

A

This involves examining applications to find areas that can be divided into separate, concurrent tasks. Ideally, tasks are independent of one another and thus can run in parallel on individual cores.

21
Q

2nd Problem: Balance

A

While identifying tasks that can run in parallel, programmers must also ensure that the tasks perform equal work of equal value. Some tasks may not contribute as much value to the overall process as others, so using a separate core to run such a task may not be worth the cost.

22
Q

3rd Problem: Data Splitting

A

Just as applications are divided into separate tasks, the data accessed and manipulated by the tasks must be divided to run on separate cores.

23
Q

4th Problem: Data Dependency

A

The data accessed by the tasks must be examined for dependencies between two or more tasks. When one task depends on data from another, programmers must ensure that the execution of the tasks is synchronized to accommodate the data dependency.

24
Q

5th Problem: Testing and Debugging

A

When a program is running in parallel on multiple cores, many different execution paths are possible, making testing and debugging more difficult.

25
Q

What are the two types of parallelism?

A

Data Parallelism
Task Parallelism

26
Q

Data Parallelism

A

Focuses on distributing subsets of the same data across multiple computing cores and performing the same operation on each core.

Data parallelism involves the distribution of data across multiple cores.

27
Q

Data Parallelism Example

A

Summing the contents of an array of size N. On a single-core system, one thread would simply sum the elements [0] … [N − 1]. On a dual-core system, however, thread A, running on core 0, could sum the elements [0] … [N/2 − 1] while thread B, running on core 1, could sum the elements [N/2] … [N − 1].
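
Below is a minimal sketch of this dual-core summation using two Pthreads. The names (sum_worker, Part) and the fixed array size are illustrative assumptions, not from the text.

#include <pthread.h>
#include <stdio.h>

#define N 8

typedef struct { int *data; int lo, hi; long sum; } Part;

/* Each thread sums its half of the array into its own Part. */
void *sum_worker(void *arg) {
    Part *p = (Part *)arg;
    for (int i = p->lo; i < p->hi; i++)
        p->sum += p->data[i];
    return NULL;
}

int main(void) {
    int v[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    Part a = {v, 0, N / 2, 0};   /* thread A: elements [0] .. [N/2 - 1] */
    Part b = {v, N / 2, N, 0};   /* thread B: elements [N/2] .. [N - 1] */
    pthread_t ta, tb;

    pthread_create(&ta, NULL, sum_worker, &a);
    pthread_create(&tb, NULL, sum_worker, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    printf("sum = %ld\n", a.sum + b.sum);   /* prints 36 */
    return 0;
}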

28
Q

Task Parallelism

A

Involves distributing not data but tasks (threads) across multiple computing cores. Each thread is performing a unique operation. Different threads may be operating on the same data, or they may be operating on different data.

Task parallelism involves the distribution of tasks across multiple cores.

29
Q

Task Parallelism Example

A

An example of task parallelism might involve two threads, each performing a unique statistical operation on the array of elements. The threads again are operating in parallel on separate computing cores, but each is performing a unique operation.

30
Q

User Threads

A

User threads are supported above the kernel and are managed without kernel support.

31
Q

Kernel Threads

A

Kernel threads are supported and managed directly by the operating system.

32
Q

Many-to-One Model

A

The many-to-one model maps many user-level threads to one kernel thread. Thread management is done by the thread library in user space, so it is efficient.

However, the entire process will block if a thread makes a blocking system call. Also, because only one thread can access the kernel at a time, multiple threads are unable to run in parallel on multicore systems.

Prevents Parallelism.

(Ex.) Green Threads (Early Java)

33
Q

One-to-One Model

A

The one-to-one model maps each user thread to a kernel thread. It provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call.

It also allows multiple threads to run in parallel on multiprocessors. The only drawback to this model is that creating a user thread requires creating the corresponding kernel thread, and a large number of kernel threads may burden the performance of a system.

Allows greater concurrency, but too many kernel threads can slow the system down.

(Ex.) Linux Systems

34
Q

Many-to-Many Model

A

The many-to-many model multiplexes many user-level threads to a smaller or equal number of kernel threads. The number of kernel threads may be specific to either a particular application or a particular machine.

It is difficult to implement even while being the most flexible.

35
Q

Two-Level Model

A

One variation on the many-to-many model still multiplexes many user-level threads to a smaller or equal number of kernel threads but also allows a user-level thread to be bound to a kernel thread.

36
Q

What are Thread Libraries?

A

A thread library provides the programmer with an API for creating and managing threads.

37
Q

Two ways to Create Thread Libraries

A

The first approach is to provide a library entirely in user space with no kernel support. All code and data structures for the library exist in user space.

The second approach is to implement a kernel-level library supported directly by the operating system. In this case, code and data structures for the library exist in kernel space.

38
Q

Three Main Thread Libraries

A

POSIX Pthreads: Pthreads, the threads extension of the POSIX standard, may be provided as either a user-level or a kernel-level library.

Windows: The Windows thread library is a kernel-level library available on Windows systems.

Java: The Java thread API allows threads to be created and managed directly in Java programs. However, because in most instances the JVM is running on top of a host operating system, the Java thread API is generally implemented using a thread library available on the host system.

39
Q

How is data shared in POSIX and Windows?

A

For POSIX and Windows threading, any data declared globally—that is, declared outside of any function—are shared among all threads belonging to the same process.

40
Q

Asynchronous Threading

A

With asynchronous threading, once the parent creates a child thread, the parent resumes its execution, so that the parent and child execute concurrently and independently of one another.

Because the threads are independent, there is typically little data sharing between them.

Typically used for designing responsive user interfaces.

41
Q

Synchronous threading

A

Synchronous threading occurs when the parent thread creates one or more children and then must wait for all of its children to terminate before it resumes.

Here, the threads created by the parent perform work concurrently, but the parent cannot continue until this work has been completed. Once each thread has finished its work, it terminates and joins with its parent. Only after all of the children have joined can the parent resume execution.

Typically, synchronous threading involves significant data sharing among threads.

42
Q

Pthreads

A

Pthreads refers to the POSIX standard (IEEE 1003.1c) defining an API for thread creation and synchronization. This is a specification for thread behavior, not an implementation.

43
Q

pthread_attr_t attr;

A

The pthread_attr_t attr declaration represents the attributes for the thread. We set the attributes in the function call pthread_attr_init(&attr). Because we did not explicitly set any attributes, we use the default attributes provided.

44
Q

pthread_create()

A

A separate thread is created with the pthread_create() function call. In addition to passing the thread identifier and the attributes for the thread, we also pass the name of the function where the new thread will begin execution—in this case, the runner() function. Last, we pass the integer parameter that was provided on the command line, argv[1].

45
Q

runner()

A

In a Pthreads program, separate threads begin execution in a specified function. In Figure 4.11, this is the runner() function. When this program begins, a single thread of control begins in main(). After some initialization, main() creates a second thread that begins control in the runner() function. Both threads share the global data sum.

46
Q

Final Chunk of Code for 4.11

A

At this point, the program has two threads: the initial (or parent) thread in main() and the summation (or child) thread performing the summation operation in the runner() function. This program follows the thread create/join strategy, whereby after creating the summation thread, the parent thread will wait for it to terminate by calling the pthread_join() function. The summation thread will terminate when it calls the function pthread_exit(). Once the summation thread has returned, the parent thread will output the value of the shared data sum.
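
A condensed sketch of that create/join program, in the spirit of Figure 4.11 (argument checking omitted):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum;                        /* data shared by both threads */

/* The child thread begins control in this function. */
void *runner(void *param) {
    int upper = atoi(param);
    sum = 0;
    for (int i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);            /* the summation thread terminates */
}

int main(int argc, char *argv[]) {
    pthread_t tid;              /* the thread identifier */
    pthread_attr_t attr;        /* attributes for the thread */

    pthread_attr_init(&attr);                      /* default attributes */
    pthread_create(&tid, &attr, runner, argv[1]);  /* create the thread */
    pthread_join(tid, NULL);    /* parent waits for the child to finish */

    printf("sum = %d\n", sum);  /* output the shared data */
    return 0;
}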

47
Q

Implicit Threading

A

Implicit threading addresses the difficulties of designing multithreaded programs by transferring the creation and management of threading from application developers to compilers and run-time libraries.

48
Q

What are the problems of a multithreaded server that creates a new thread for every request?

A

The first issue concerns the amount of time required to create the thread, together with the fact that the thread will be discarded once it has completed its work. The second issue is more troublesome. If we allow each concurrent request to be serviced in a new thread, we have not placed a bound on the number of threads concurrently active in the system. Unlimited threads could exhaust system resources, such as CPU time or memory.

49
Q

What is a thread pool?

A

The general idea behind a thread pool is to create a number of threads at start-up and place them into a pool, where they sit and wait for work.

50
Q

How does a thread pool work?

A

When a server receives a request, rather than creating a thread, it instead submits the request to the thread pool and resumes waiting for additional requests. If there is an available thread in the pool, it is awakened, and the request is serviced immediately. If the pool contains no available thread, the task is queued until one becomes free. Once a thread completes its service, it returns to the pool and awaits more work. Thread pools work well when the tasks submitted to the pool can be executed asynchronously.
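
A minimal sketch of this idea in Pthreads. The queue layout and the names (task_t, pool_submit, pool_init) are assumptions for illustration, not a standard API.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_THREADS 4
#define QUEUE_MAX   16

typedef struct { void (*fn)(void *); void *arg; } task_t;

static task_t queue[QUEUE_MAX];
static int head, tail, count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  work = PTHREAD_COND_INITIALIZER;

/* Each pool thread sleeps until a task is queued, then services it. */
static void *worker(void *unused) {
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&work, &lock);   /* wait for work */
        task_t t = queue[head];
        head = (head + 1) % QUEUE_MAX;
        count--;
        pthread_mutex_unlock(&lock);
        t.fn(t.arg);                           /* service the request */
    }
    return NULL;
}

/* Submit a request to the pool instead of creating a new thread. */
int pool_submit(void (*fn)(void *), void *arg) {
    pthread_mutex_lock(&lock);
    if (count == QUEUE_MAX) {                  /* queue full: reject */
        pthread_mutex_unlock(&lock);
        return -1;
    }
    queue[tail] = (task_t){fn, arg};
    tail = (tail + 1) % QUEUE_MAX;
    count++;
    pthread_cond_signal(&work);                /* wake an idle thread */
    pthread_mutex_unlock(&lock);
    return 0;
}

void pool_init(void) {
    pthread_t tid;
    for (int i = 0; i < NUM_THREADS; i++)      /* create threads at start-up */
        pthread_create(&tid, NULL, worker, NULL);
}

static void greet(void *arg) { printf("servicing %s\n", (char *)arg); }

int main(void) {
    pool_init();
    pool_submit(greet, "request 1");
    pool_submit(greet, "request 2");
    sleep(1);   /* crude demo: give the pool time to run, then exit */
    return 0;
}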

51
Q

Benefits of Thread Pool

A

Servicing a request with an existing thread is often faster than waiting to create a thread.

A thread pool limits the number of threads that exist at any one point. This is particularly important on systems that cannot support a large number of concurrent threads.

Separating the task to be performed from the mechanics of creating the task allows us to use different strategies for running the task. For example, the task could be scheduled to execute after a time delay or to execute periodically.

52
Q

How is the number of threads in a thread pool set?

A

The number of threads in the pool can be set heuristically based on factors such as the number of CPUs in the system, the amount of physical memory, and the expected number of concurrent client requests.

They can also be dynamically adjusted.

53
Q

Fork Join

A

The main parent thread creates (forks) one or more child threads and then waits for the children to terminate and join with it, at which point it can retrieve and combine their results. This synchronous model is often characterized as explicit thread creation, but it is also an excellent candidate for implicit threading.

54
Q

OpenMP

A

A set of compiler directives as well as an API for programs written in C, C++, or FORTRAN that provides support for parallel programming in shared-memory environments.

OpenMP identifies parallel regions as blocks of code that may run in parallel.

Application developers insert compiler directives into their code at parallel regions, and these directives instruct the OpenMP runtime library to execute the region in parallel.
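
A minimal sketch mirroring the canonical OpenMP example (compile with an OpenMP flag such as gcc -fopenmp):

#include <omp.h>
#include <stdio.h>

int main(void) {
    /* The directive below marks the block as a parallel region; the
       OpenMP run-time library executes it with one thread per core. */
    #pragma omp parallel
    {
        printf("I am a parallel region.\n");
    }
    return 0;
}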

55
Q

What does a compiler directive placed above a parallel region do?

A

It creates as many threads as there are processing cores in the system. Thus, for a dual-core system, two threads are created; for a quad-core system, four are created; and so forth. All the threads then simultaneously execute the parallel region. As each thread exits the parallel region, it is terminated.

56
Q

Grand Central Dispatch

A

Grand Central Dispatch (GCD) is a technology developed by Apple for its macOS and iOS operating systems. It is a combination of a run-time library, an API, and language extensions that allow developers to identify sections of code (tasks) to run in parallel.

57
Q

How does GCD schedule tasks?

A

GCD schedules tasks for run-time execution by placing them on a dispatch queue. When it removes a task from a queue, it assigns the task to an available thread from a pool of threads that it manages. GCD identifies two types of dispatch queues: serial and concurrent.

58
Q

Concurrent Queue

A

Tasks placed on a concurrent queue are also removed in FIFO order, but several tasks may be removed at a time, thus allowing multiple tasks to execute in parallel.

59
Q

Serial Queues

A

Tasks placed on a serial queue are removed in FIFO order, and a task must complete execution before another task is removed. Each process has its own serial queue (known as its main queue), and developers can create additional serial queues that are local to a particular process.

60
Q

QOS_CLASS_USER_INTERACTIVE

A

The user-interactive class represents tasks that interact with the user, such as the user interface and event handling, to ensure a responsive user interface. Completing a task belonging to this class should require only a small amount of work.

61
Q

QOS_CLASS_USER_INITIATED

A

The user-initiated class is similar to the user-interactive class in that tasks are associated with a responsive user interface; however, user-initiated tasks may require longer processing times. Opening a file or a URL is a user-initiated task, for example. Tasks belonging to this class must be completed for the user to continue interacting with the system, but they do not need to be serviced as quickly as tasks in the user-interactive queue.

62
Q

QOS_CLASS_UTILITY

A

The utility class represents tasks that require a longer time to complete but do not demand immediate results. This class includes work such as importing data.

63
Q

QOS_CLASS_BACKGROUND

A

Tasks belonging to the background class are not visible to the user and are not time sensitive. Examples include indexing a mailbox system and performing backups.

64
Q

First way to express tasks submitted to dispatch queues.

A

For the C, C++, and Objective-C languages, GCD identifies a language extension known as a block, which is simply a self-contained unit of work. A block is specified by a caret ^ inserted in front of a pair of braces { }. Code within the braces identifies the unit of work to be performed. A simple example of a block is shown below:
^{ printf("I am a block"); }
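
A hedged sketch of submitting such a block to a dispatch queue on an Apple platform; the choice of queue class here is illustrative.

#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void) {
    /* Obtain a system-managed concurrent queue at a given QOS class. */
    dispatch_queue_t queue =
        dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0);

    /* Submit the block; dispatch_sync waits for it to complete. */
    dispatch_sync(queue, ^{ printf("I am a block\n"); });
    return 0;
}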

65
Q

Intel Thread Building Block (TBB)

A

Intel threading building blocks (TBB) is a template library that supports designing parallel applications in C++. As this is a library, it requires no special compiler or language support. Developers specify tasks that can run in parallel, and the TBB task scheduler maps these tasks onto underlying threads.

Furthermore, the task scheduler provides load balancing and is cache aware, meaning that it will give precedence to tasks that likely have their data stored in cache memory and thus will execute more quickly. TBB provides a rich set of features, including templates for parallel loop structures, atomic operations, and mutual exclusion locking.

In addition, it provides concurrent data structures, including a hash map, queue, and vector, which can serve as equivalent thread-safe versions of the C++ standard template library data structures.

66
Q

Second way to express tasks submitted to dispatch queues.

A

For the Swift programming language, a task is defined using a closure, which is similar to a block in that it expresses a self-contained unit of functionality. Syntactically, a Swift closure is written in the same way as a block, minus the leading caret.

67
Q

Parallel For Loops

A

Initially, assume there is a function named apply(float value) that performs an operation on the parameter value. If we had an array v of size n containing float values, we could use the following serial for loop to pass each value in v to the apply() function:

for (int i = 0; i < n; i++) {
    apply(v[i]);
}

68
Q

How can the serial for loop be rewritten as a parallel for loop?

A

Alternatively, a developer could use TBB, which provides a parallel_for template that expects two values:

parallel_for(range, body)

where range refers to the range of elements that will be iterated (known as the iteration space) and body specifies an operation that will be performed on a subrange of elements. We can now rewrite the above serial for loop using the TBB parallel_for template as follows:

parallel_for(size_t(0), n, [=](size_t i) {apply(v[i]);});

69
Q

fork() and exec() system calls

A

If one thread in a program calls fork(), does the new process duplicate all threads, or is the new process single-threaded? Some UNIX systems have chosen to have two versions of fork(), one that duplicates all threads and another that duplicates only the thread that invoked the fork() system call.

The exec() system call typically works in the same way as described in Chapter 3. That is, if a thread invokes the exec() system call, the program specified in the parameter to exec() will replace the entire process—including all threads.

Which of the two versions of fork() to use depends on the application. If exec() is called immediately after forking, then duplicating all threads is unnecessary, as the program specified in the parameters to exec() will replace the process.
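
A minimal sketch of the fork-then-exec pattern; the child image is replaced entirely, so the program to run (ls here) is an arbitrary choice.

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        execlp("ls", "ls", (char *)NULL);   /* replaces the child process */
        perror("execlp");                   /* reached only if exec fails */
        return 1;
    }
    wait(NULL);                             /* parent waits for the child */
    return 0;
}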

70
Q

Signal

A

A signal is used in UNIX systems to notify a process that a particular event has occurred. A signal may be received either synchronously or asynchronously, depending on the source of and the reason for the event being signaled.

71
Q

Signal Patterns

A

1.) A signal is generated by the occurrence of a particular event.
2.) The signal is delivered to a process.
3.) Once delivered, the signal must be handled.

72
Q

Signal Handler Types

A

A signal may be handled by one of two possible handlers:
1. A default signal handler
2. A user-defined signal handler

73
Q

Default Signal Handler

A

Every signal has a default signal handler that the kernel runs when handling that signal.

74
Q

User-defined Signal Handler

A

The default action can be overridden by a user-defined signal handler that is called to handle the signal.
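
A minimal sketch of installing a user-defined handler that overrides the default action for SIGINT:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void handler(int sig) {
    (void)sig;
    /* write() is async-signal-safe, unlike printf(). */
    write(STDOUT_FILENO, "caught SIGINT\n", 14);
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;        /* user-defined signal handler */
    sigaction(SIGINT, &sa, NULL);   /* override the default handler */

    pause();                        /* wait until a signal is delivered */
    return 0;
}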

75
Q

Options for Signals in Multithreading

A
  1. Deliver the signal to the thread to which the signal applies.
  2. Deliver the signal to every thread in the process.
  3. Deliver the signal to certain threads in the process.
  4. Assign a specific thread to receive all signals for the process.
76
Q

Standard UNIX function for delivering a signal.

A

kill(pid_t pid, int signal)

77
Q

Standard UNIX function for delivering a signal to a specific thread

A

pthread_kill(pthread_t tid, int signal)

78
Q

Thread Cancellation

A

Thread cancellation involves terminating a thread before it has completed. For example, if multiple threads are concurrently searching through a database and one thread returns the result, the remaining threads might be canceled.

79
Q

What is the name of the thread to be canceled?

A

Target Thread

80
Q

Cancellation of a target thread may occur in two different scenarios

A
  1. Asynchronous cancellation. One thread immediately terminates the target thread.
  2. Deferred cancellation. The target thread periodically checks whether it should terminate, allowing it an opportunity to terminate itself in an orderly fashion.
81
Q

Cancellation Point

A

The default cancellation type is deferred cancellation. However, cancellation occurs only when a thread reaches a cancellation point.

82
Q

Cleanup Handler

A

Additionally, Pthreads allows a function known as a cleanup handler to be invoked if a thread is canceled. This function allows any resources a thread may have acquired to be released before the thread is terminated.
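
A sketch of deferred cancellation with a cleanup handler; the worker and its heap buffer are illustrative assumptions.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Releases the resource the thread acquired before it terminates. */
static void cleanup(void *arg) {
    free(arg);
    printf("cleanup handler ran\n");
}

static void *worker(void *unused) {
    (void)unused;
    void *buf = malloc(64);              /* resource acquired by the thread */
    pthread_cleanup_push(cleanup, buf);

    for (int i = 0; i < 100000000; i++)
        pthread_testcancel();            /* explicit cancellation points */

    pthread_cleanup_pop(1);              /* runs cleanup if never canceled */
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);
    pthread_cancel(tid);                 /* request deferred cancellation */
    pthread_join(tid, NULL);             /* cleanup handler runs on cancel */
    return 0;
}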

83
Q

Thread-Local Storage

A

Threads belonging to a process share the data of the process. However, in some circumstances, each thread might need its own copy of certain data. We call such data thread-local storage (or TLS).

84
Q

Differences between TLS and local variables

A

It is easy to confuse TLS with local variables. However, local variables are visible only during a single function invocation, whereas TLS data are visible across function invocations.
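
A minimal sketch using C11 _Thread_local (gcc and clang also accept __thread): each thread gets its own counter, and the value persists across function calls within that thread.

#include <pthread.h>
#include <stdio.h>

_Thread_local int counter = 0;          /* one copy per thread */

static void bump(void) { counter++; }   /* TLS persists across calls */

static void *worker(void *name) {
    bump();
    bump();
    printf("%s: counter = %d\n", (char *)name, counter);  /* always 2 */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, "thread 1");
    pthread_create(&t2, NULL, worker, "thread 2");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}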

85
Q

Lightweight Process (LWP)

A

Many systems implementing either the many-to-many or the two-level model place an intermediate data structure between the user and kernel threads. This data structure—typically known as a lightweight process, or LWP—appears to the user-thread library as a virtual processor on which the application can schedule a user thread to run.

86
Q

Scheduler Activation

A

It works as follows: The kernel provides an application with a set of virtual processors (LWPs), and the application can schedule user threads onto an available virtual processor. Furthermore, the kernel must inform an application about certain events. This procedure is known as an upcall. Upcalls are handled by the thread library with an upcall handler, and upcall handlers must run on a virtual processor.

87
Q

clone()

A

Linux provides the fork() system call with the traditional functionality of duplicating a process, as described in Chapter 3. Linux also provides the ability to create threads using the clone() system call. However, Linux does not distinguish between processes and threads. In fact, Linux uses the term task—rather than process or thread— when referring to a flow of control within a program.
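
A hedged sketch of clone() on Linux: the flag set below shares the address space, file-system information, and open files with the parent, so the new task behaves like a thread. The stack size is an illustrative choice.

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)

static int child_fn(void *arg) {
    (void)arg;
    printf("child task sharing the parent's address space\n");
    return 0;
}

int main(void) {
    char *stack = malloc(STACK_SIZE);
    /* Stacks grow downward on most architectures, so pass the top. */
    int pid = clone(child_fn, stack + STACK_SIZE,
                    CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, NULL);
    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}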