I . chp 2. Getting Started Flashcards

1
Q

Objective

A

This chapter will familiarize you with the framework we shall use throughout the
book to think about the design and analysis of algorithms. We begin by examining the insertion sort algorithm to solve the sorting problem.Having specified the insertion sort algorithm, we then argue that it
correctly sorts, and we analyze its running time. The analysis introduces a notation
that focuses on how that time increases with the number of items to be sorted. Following our discussion of insertion sort, we introduce the divide-and-conquer
approach to the design of algorithms and use it to develop an algorithm called
merge sort. We end with an analysis of merge sort’s running time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Keys in sorting problem:

A

The numbers that we wish to sort are also known as the keys.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Pseudo Code Vs Real Code.

A

What separates pseudocode from “real” code is that in
pseudocode, we employ whatever expressive method is most clear and concise to
specify a given algorithm. Sometimes, the clearest method is English.
Another difference between pseudocode and real code
is that pseudocode is not typically concerned with issues of software engineering.
Issues of data abstraction, modularity, and error handling are often ignored in order
to convey the essence of the algorithm more concisely.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Insertion Sort:

A
INSERTION-SORT(A)
1 for j = 2 to A.length
2     key = A[j]
3 // Insert A[j]  into the sorted sequence A[1 ... j - 1].
4     i = j - 1
5     while i>0 and A[i] > key
6         A[i + 1] = A[i]
7          i = i - 1
8 A[i + 1] = key
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Loop Invariant? Why do we need it?

A

invariant:

a function, quantity, or property which remains unchanged when a specified transformation is applied.

We use loop invariants to help us understand why an algorithm is correct.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Loop Invariant of Insertion Sort:

A

At the start of each iteration of the for loop of lines 1–8, the subarray
A[1 , , , j -1 ] consists of the elements originally in
A[ 1 , , , j - 1 ], but in sorted order

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What things must be shown about loop invariant:

(3)

How to show Algo is correct?

A

Initialization: It is true prior to the first iteration of the loop.
Maintenance: If it is true before an iteration of the loop, it remains true before the
next iteration.
Termination: When the loop terminates, the invariant gives us a useful property
that helps show that the algorithm is correct.

When the first two properties hold, the loop invariant is true prior to every iteration
of the loop.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Pseudocode Conventions:

  1. block structure:
  2. Comments?
A
  1. Indentation indicates block structure.

3. // Indicates that the remainder of the line is a comment.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

2.Loop Pseudocode Conventions:

A

2.The looping constructs while, for, and repeat-until and the if-else conditional
construct have interpretations similar to those in C, ..etc.
In this book, the loop counter retains its value after exiting the loop.
keyword downto when a for loop
decrements its loop counter,
“to” when increment.
When the loop counter changes by an amount
greater than 1, the amount of change follows the optional keyword “by”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Psedocode

j = e, i = j
different allowed assignment form.

A

4.A multiple assignment of the form i = j = e assigns to both variables i and j
the value of expression e; it should be treated as equivalent to the assignment
j = e followed by the assignment i = j

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Scope of variables in Pseudocode:

A
  1. Variables (such as i, j , and key) are local to the given procedure. We shall not
    use global variables without explicit indication.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Array Convention Pseudocode:

A
  1. We access array elements by specifying the array name followed by the index in square brackets. For example, A[i] indicates the ith element of the
    array A. The notation “. .” is used to indicate a range of values within an array. Thus, A [1 . . j] indicates the subarray of A consisting of the j elements
    A[1], A[2], . . , A[j] .
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Compund data types rule, pointers to them.

Notations:

A
  1. We typically organize compound data into objects, which are composed of
    attributes.
    We access a particular attribute using the syntax : the object name, followed by a dot,
    followed by the attribute name.
    Example : Array A is treated as object and its attribute length, A.length.
    We treat a variable representing an array or object as a pointer to it.
    For all attributes f of an object x, setting y = x
    causes y.f = x.f
    If x.f = 3, then y.f = 3 as well. I
    x and y point to the same object after the assignment y = x.
    Our attribute notation can “cascade.” For example, suppose that the attribute f
    is itself a pointer to some type of object that has an attribute g.
    Then the notation
    x.f.g is implicitly parenthesized as (x.f).g.
    If y = x.f , then x.f.g is the same as y.g.

When pointer refers to no object at all, it has value NIL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Passing paramenters, Calling Convention:

A
  1. We pass parameters to a procedure by value: the called procedure receives its
    own copy of the parameters, and if it assigns a value to a parameter, the change
    is not seen by the calling procedure. When objects are passed, the pointer to
    the data representing the object is copied, but the object’s attributes are not. For
    example, if x is a parameter of a called procedure, the assignment x = y within
    the called procedure is not visible to the calling procedure. The assignment
    x.f = 3, however, is visible. Similarly, arrays are passed by pointer, so that a pointer to the array is passed, rather than the entire array, and changes to
    individual array elements are visible to the calling procedure.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Return Statements:

A

9.A return statement immediately transfers control back to the point of call in
the calling procedure. Most return statements also take a value to pass back to
the caller. Our pseudocode differs from many programming languages in that
we allow multiple values to be returned in a single return statement.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Boolean Operators:

A

The boolean operators “and” and “or” are short circuiting. That is, when we
evaluate the expression “x and y” we first evaluate x. If x evaluates to FALSE,
then the entire expression cannot evaluate to TRUE, and so we do not evaluate y.
If, on the other hand, x evaluates to TRUE, we must evaluate y to determine the
value of the entire expression. Similarly, in the expression “x or y” we evaluate the expression y only if x evaluates to FALSE. Short-circuiting operators
allow us to write boolean expressions such as “x != NIL and x.f = y” without
worrying about what happens when we try to evaluate x.f when x is NIL.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

error keyword:

A

The keyword error indicates that an error occurred because conditions were
wrong for the procedure to have been called. The calling procedure is responsible for handling the error, and so we do not specify what action to take

18
Q

Analysing an Algorithm:

A

Analyzing an algorithm has come to mean predicting the resources that the algorithm requires. Occasionally, resources such as memory, communication bandwidth, or computer hardware are of primary concern, but most often it is computational time that we want to measure.

19
Q

Random Acssess Machine(RAM) model:

Define this model briefly

A

The RAM model contains instructions commonly found in real computers:
arithmetic (such as add, subtract, multiply, divide, remainder, floor, ceiling), data
movement (load, store, copy), and control (conditional and unconditional branch,
subroutine call and return). Each such instruction takes a constant amount of time.

The data types in the RAM model are integer and floating point (for storing real
numbers).

20
Q

size of each word of data?

I DONT UNDERSTAND THISS!!

What is a word?

A

In computing, a word is the natural unit of data used by a particular processor design. A word is a fixed-sized piece of data handled as a unit by the instruction set or the hardware of the processor. The number of bits in a word (the word size, word width, or word length) is an important characteristic of any specific processor design or computer architecture.

We also assume a limit on the size
of each word of data.
For example, when working with inputs of size n, we typically assume that integers are represented by
c lg n bits for some constant c >= 1.
We require c >= 1 so that each word can hold the value of n, enabling us to index the
individual input elements, and we restrict c to be a constant so that the word size
does not grow arbitrarily. (If the word size could grow arbitrarily, we could store
huge amounts of data in one word and operate on it all in constant time—clearly
an unrealistic scenario.)

21
Q

exponentiation instruction:

A

it takes several instructions to compute x^y
when x and y are real numbers.
Many computers have a “shift left” instruction, which
in constant time shifts the bits of an integer by k positions to the left. In most
computers, shifting the bits of an integer by one position to the left is equivalent
to multiplication by 2, so that shifting the bits by k positions to the left is equivalent to multiplication by 2^k. Therefore, such computers can compute 2^k in one
constant-time instruction by shifting the integer 1 by k positions to the left, as long
as k is no more than the number of bits in a computer word. We will endeavor to
avoid such gray areas in the RAM model, but we will treat computation of 2^k as a
constant-time operation when k is a small enough positive integer

22
Q

Gray Area in RAM model:

A

Real computers contain instructions not listed above, and such instructions represent a gray area in the RAM model.

23
Q

memory hierarchy and RAM model:

A

In the RAM model, we do not attempt to model the memory hierarchy that is
common in contemporary computers. That is, we do not model caches or virtual
memory.
Models that include
the memory hierarchy are quite a bit more complex than the RAM model, and so
they can be difficult to work with. Moreover, RAM-model analyses are usually
excellent predictors of performance on actual machines.

24
Q

Even though we typically select only one machine model to analyze a given algorithm, we still face many choices in deciding how to express our analysis.

A

We
would like a way that is simple to write and manipulate, shows the important characteristics of an algorithm’s resource requirements, and suppresses tedious details.

25
Q

Why we need to define running time and input size?

A

In general, the time taken
by an algorithm grows with the size of the input, so it is traditional to describe the
running time of a program as a function of the size of its input. To do so, we need
to define the terms “running time” and “size of input” more carefully.

26
Q

Input Size:

3 examples

A

The best notion for input size depends on the problem being studied. For many
problems, such as sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input—for example, the array size n
for sorting. For many other problems, such as multiplying two integers, the best
measure of input size is the total number of bits needed to represent the input in
ordinary binary notation. Sometimes, it is more appropriate to describe the size of
the input with two numbers rather than one. For instance, if the input to an algorithm is a graph, the input size can be described by the numbers of vertices and
edges in the graph. We shall indicate which input size measure is being used with
each problem we study.

27
Q

Running TIme:

Assumptions?

A

The running time of an algorithm on a particular input is the number of primitive
operations or “steps” executed. It is convenient to define the notion of step so
that it is as machine-independent as possible.

A constant amount of time is required to execute each line of our
pseudocode. One line may take a different amount of time than another line, but
we shall assume that each execution of the ith line takes time ci, where ci is a
constant. This viewpoint is in keeping with the RAM model, and it also reflects
how the pseudocode would be implemented on most actual computers.

we separate the process of calling the
subroutine—passing parameters to it, etc.—from the process of executing the subroutine.
A statement that calls a subroutine takes constant time,
though the subroutine, once invoked, may take more.

28
Q

How to find running time:

A

The running time of the algorithm is the sum of running times for each statement executed; a statement that takes ci steps to execute and executes n times will
contribute cin to the total running time.6

29
Q

Linear Function of n:

A

(best case)
T (n) = c1n + c2(n - 1) + c4(n - 1) + c5(n - 1) + c8(n - 1)
= (c1 + c2 + c4 + c5 C c8)n - (c2 + c4 + c5 + c8)

We can express this running time as an + b for constants a and b that depend on
the statement costs ci; it is thus a linear function of n

30
Q

Insertion Sort:

Best case and worst case:

A

Best : Already sorted.
For each j = 2, 3, .. ,n, we then find
that A[i] key in line 5 when i has its initial value of j - 1. Thus tj = 1 for
j = 2, 3,…, n.

Worst: Reverse Sorted.
Compare A[j] with each element of subarray so
tj = j.

31
Q

Quadratic Function of n:

A

T(n) = (c5/2 + c6/2 + c7/2)n^2 +
(c1 + c2 + c4 + c5/2 - c6/2 - c7/2 + c8)n
- (c2 + c4 + c5 + c8)

We can express this worst-case running time as
an^2 + bn + c for constants a, b,
and c that again depend on the statement costs ci; it is thus a quadratic function
of n.

32
Q

Why are we concentrating on finding worst case:

A
  1. The worst-case running time of an algorithm gives us an upper bound on the
    running time for any input. Knowing it provides a guarantee that the algorithm
    will never take any longer. We need not make some educated guess about the
    running time and hope that it never gets much worse.
  2. For some algorithms, the worst case occurs fairly often.
  3. The “average case” is often roughly as bad as the worst case.
33
Q

Considering Average Case

consequences and what to consider:

A

In some particular cases, we shall be interested in the average-case running time
of an algorithm; we shall see the technique of probabilistic analysis applied to
various algorithms throughout this book. The scope of average-case analysis is
limited, because it may not be apparent what constitutes an “average” input for
a particular problem. Often, we shall assume that all inputs of a given size are
equally likely. In practice, this assumption may be violated, but we can sometimes
use a randomized algorithm, which makes random choices, to allow a probabilistic
analysis and yield an expected running time. We explore randomized algorithms
more in Chapter 5 and in several other subsequent chapters.

34
Q

Order of growth and Abstraction:

A

We used some simplifying abstractions to ease our analysis of the INSERTIONSORT procedure. First, we ignored the actual cost of each statement, using the
constants ci to represent these costs. This gave us more detail so we expressed the worst-case running time
as an^2 + bn + c for some constants a, b, and c that depend on the statement
costs ci. We thus ignored not only the actual statement costs, but also the abstract
costs ci.

We shall now make one more simplifying abstraction: it is the rate of growth,
or order of growth, of the running time that really interests us.

We consider only leading term, lower order term are relatively insignificant for large values of n.
We also ignore the leading term’s constant coefficient, since constant factors are less significant than the rate of growth
in determining computational efficiency for large inputs.

35
Q

Whats u can say a disadvantage of considering order of growth:

A

Due to constant factors and lowerorder terms, an algorithm whose running time has a higher order of growth might
take less time for small inputs than an algorithm whose running time has a lower order of growth. But for large enough inputs, a Theta(n^2) algorithm, for example, will
run more quickly in the worst case than a theta(n^3) algorithm.

36
Q

Selection Sort:

A

Consider sorting n numbers stored in array A by first finding the smallest element
of A and exchanging it with the element in AŒ1. Then find the second smallest
element of A, and exchange it with AŒ2. Continue in this manner for the first n1
elements of A. Write pseudocode for this algorithm, which is known as selection
sort. What loop invariant does this algorithm maintain? Why does it need to run
for only the first n 1 elements, rather than for all n elements? Give the best-case
and worst-case running times of selection sort in ‚-notation.

37
Q

Designing Algo, what technique did we use in Insertion Sort, explain.

A

For insertion
sort, we used an incremental approach: having sorted the subarray A[ . . j - 1],
we inserted the single element A[j ] into its proper place, yielding the sorted
subarray A[. . j].

38
Q

recursive in structure:

A

to solve a given problem, they

call themselves recursively one or more times to deal with closely related subproblems.

39
Q

Just say about divide n conquer:

A

Recursive algorithms typically follow a divide-and-conquer approach: they
break the problem into several subproblems that are similar to the original problem but smaller in size, solve the subproblems recursively, and then combine these
solutions to create a solution to the original problem.

40
Q

The divide-and-conquer paradigm involves three steps at each level of the recursion:

A

Divide the problem into a number of subproblems that are smaller instances of the
same problem.

Conquer the subproblems by solving them recursively. If the subproblem sizes are
small enough, however, just solve the subproblems in a straightforward manner.

Combine the solutions to the subproblems into the solution for the original problem.

41
Q

The Merger Sort Algo Divide and Conquer steps:

A

Divide: Divide the n-element sequence to be sorted into two subsequences of n=2
elements each.
Conquer: Sort the two subsequences recursively using merge sort.
Combine: Merge the two sorted subsequences to produce the sorted answer.