Chapter 28 - Hashing Flashcards

1
Q

A map (data structure) is also called ___

A

A map is also called a dictionary, a hash table, or an associative array.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

The array that stores the values is called a __(1)__. The function that maps a key to an index in the __(1)__ is called a __(2)__.

A

(1) hash table

(2) hash function

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is hashing?

A

Hashing is a technique that retrieves the value using the index obtained from the key without performing a search.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a perfect hash function? What is a collision?

A

Ideally, we would like to design a function that maps each search key to a different index in the hash table. Such a function is called a perfect hash function. However, it is difficult to find a perfect hash function. When two or more keys are mapped to the same hash value, we say that a collision has occurred. Although there are ways to deal with collisions, it is better to avoid collisions in the first place. Thus, you should design a fast and easy-to-compute hash function that minimizes collisions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What class has the hashCode method? What does the method return by default?

A

The Object class, Java’s big daddy class.

hashCode returns the memory address for the object by default.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the general contract for the hashCode method?

A
  1. You should override the hashCode method whenever the equals method is overridden to ensure that two equal objects return the same hash code.
  2. During the execution of a program, invoking the hashCode method multiple times returns the same integer, provided that the object’s data are not changed.
  3. Two unequal objects may have the same hash code, but you should implement the hashCode method to avoid too many such cases.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What does “the contract of a class” mean?

A
Wouldn't it be nice if all Java classes that you use, including your own, lived up to their promises? In fact, wouldn't it be nice if you actually knew exactly what a given class promises?
It's an agreement that the class will expose certain methods, certain properties, and certain behaviors.
The contract specifies what that component expects of clients and what clients can expect of it.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is hash code? What is the hash code for Byte, Short, Integer, and Character?

A

A typical hash function first converts a search key to an integer value called a hash code, and then compresses the hash code into an index to the hash table.
For a search key of the type byte, short, int, and char, simply cast it to int. So two different search keys of any one of these types will have different hash codes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How is the hash code for a Float object computed?

A

For a search key of the type float, use Float.floatToIntBits(key) as the hash code. Note that floatToIntBits(float f) returns an int value whose bit representation is the same as the bit representation for the floating number f. So, two different search keys of the float type will have different hash codes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How is the hash code for a Long object computed?

A

For a search key of the type long, simply casting it to int would not be a good choice, because all keys that differ in only the first 32 bits will have the same hash code. To take the first 32 bits into consideration, divide the 64 bits into two halves and perform the exclusive or operator to combine the two halves. This process is called folding.
So, the hashing code is:
int hashCode = (int)(key ^ (key&raquo_space; 32));
Note that&raquo_space; is the right-shift operator that shifts the bits 32 positions to the right.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How is the hash code for a Double object computed?

A

For a search key of the type double, first convert it to a long value using doubleToLongBits, then perform a folding as follows:
long bits = Double.doubleToLongBits(key);
int hashCode = (int)(bits ^ (bits&raquo_space; 32));

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How is the hash code for a String object computed?

A

The hash code for a string in Java is:
(…((s0*b + s1)b + s2)b + … + sN-2)b + sN-1
where b is 2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

How is a hash code compressed to an integer representing the index in a hash table?

A

The hash code for a key can be a large integer that is out of the range for the hash table index. You need to scale it down to fit in the range of the index. Assume the index for a hash table is between 0 and N-1. The most common way to scale an integer to between 0 and N-1 is to use:
h(hashCode) = hashCode % N
To ensure that the indices are spread evenly, choose N to be a prime number greater than 2.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are the two ways of handling collisions?

A

open addressing and separate chaining

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is open addressing?

A

Open addressing is the process of finding an open location in the hash table in the event of a collision. Open addressing has several variations: linear probing, quadratic probing, and double hashing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

In relation to open addressing collision-handling, what is linear probing, and how does it work?

A

When a collision occurs during the insertion of an entry to a hash table, linear probing finds the next available location sequentially. For example, if a collision occurs at hashTable[k % N], check whether hashTable[(k+1) % N] is available. If not, check hashTable[(k+2) % N] and so on, until an available cell is found.
When probing reaches the end of the table, it goes back to the beginning of the table. Thus, the hash table is treated as if it were circular.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

How do you remove an entry from a hash table that uses linear probing?

A

Search the entry that matches the key. If the entry is found, place a special marker to denote that the entry is available. Each cell in the hash table has three possible states: occupied, marked, or empty. Note that a marked cell is also available for insertion.

18
Q

What is the big disadvantage of linear probing?

A

Linear probing tends to cause groups of consecutive cells in the hash table to be occupied. Each group is called a cluster. Each cluster is actually a probe sequence that you must search when retrieving, adding, or removing an entry. As clusters grow in size, they may merge into even larger clusters, further slowing down the search time.

19
Q

What is quadratic probing?

A

Quadratic probing can avoid the clustering problem that can occur in linear probing. Linear probing looks at the consecutive cells beginning at index k. Quadratic probing, on the other hand, looks at the cells at indices (k + j^2) % n, for j above 0,that is, (k + 1) % n, (k + 4) % n, (k + 9) % n, …, and so on.
Quadratic probing avoids linear probing’s clustering problem, but it has its own clustering problem, called secondary clustering; that is, the entries that collide with an occupied entry use the same probe sequence.
Linear probing guarantees that an available cell can be found for insertion as long as the table is not full. However, there is no such guarantee for quadratic probing.

20
Q

What is double hashing?

A

Starting from the initial index k, both linear probing and quadratic probing add an increment to k to define a search sequence. The increment is 1 for linear probing, and j^2 for quadratic probing. These increments are independent of the keys. Double hashing uses a secondary hash function on the keys to determine the increments to avoid the clustering problem. For example, let the primary hash function h and secondary hash function h ‘ on a hash table of size 11 be defined as follows:
h(k) = k % 11;
h ‘ (k) = 7 - k % 7;

For a search key of 12, we have:
h(12) = 12 % 11 = 1;
h ‘ (12) = 7 - 12 % 7 = 2;

For this given key, the first attempt is index 1. If index 1 is full, the function will try new indexes at an increment of 2.

21
Q

What is collision handling by separate chaining?

A

The separate chaining scheme places all entries with the same hash index in the same location, rather than finding new locations. Each location in the separate chaining scheme uses a bucket to hold multiple entries. You can implement a bucket using an array, ArrayList, or LinkedList.
You can view each cell in the hash table as the reference to the head of a list, and elements in the list are chained starting from the head.

22
Q

In regards to hash tables, what is load factor?

A

The load factor measures how full a hash table is. It is the ratio of the number of elements to the size of the hash table, that is, LF = n/N, where n denotes the number of elements and N the number of locations in the hash table.
For the open addressing scheme, LF is between 0 and 1 (1 if hash table is full). For the separate chaining scehme, LF can be any value.
Studies show that you should maintain the load factor under 0.5 for the open addressing scheme, and under 0.9 for the separate chaining scheme.
Keeping the load factor under a certain threshold is important for the performance of hashing.

23
Q

What is “rehashing”?

A

The act of making a larger hash table, and “copying” the entries in the old hash table into the new one is called rehashing, because they need to be given new hash values.
Notice that you need to change the hash functions, since the hash-table size has been changed. To reduce the likelihood of rehashing, since it is costly, you should at least double the hash-table size.

24
Q

What is the time complexity of the containsKey(key : key) method in HashMap?

A

O(1)

25
Q

What is the time complexity of the containsValue(value: V) method in HashMap?

A

O(capacity)

26
Q

What does 1 &laquo_space;30 mean?

What about 1 &laquo_space;1, 1 &laquo_space;2, 1 &laquo_space;3?

A

2^30

2^1, 2^2, 2^3

27
Q
What are the integers resulted from:
32 >> 1
32 >> 2
32 >> 3
32 >> 4
?
A

16 – (32 / 2^1)
8 – (32 / 2^2)
4 – (32 / 2^3)
2 – (32 / 2^4)

28
Q

Why can you use a for-each loop to traverse the elements in a set?

A

Because Set implements Iterable, and HashSet contains implementation of the iterator.

29
Q

If each key is mapped to a different index in the hash table, it is called __

A. normal hashing
B. perfect hashing

A

B. perfect hashing

30
Q

A collision occurs ____

A. when two or more keys are mapped to the same hash value.
B. when two elements have the same key value.
C. when two elements are mapped to the same key.

A

A. when two or more keys are mapped to the same hash value.

31
Q

True or False? Every object has the hashCode() method.

A. True
B. False

A

A. True

32
Q

What is the return type value for the hashCode() method?

A. byte
B. short
C. int
D. long

A

C. int

33
Q

True or False? Two objects are equal if their hashCodes are the same.

A. True
B. False

A

B. False

34
Q

True or False? Two objects have the same hashCodes if they are equal.

A. True
B. False

A

A. True

35
Q

True or False? If two strigns are equal, the two strings have the same hashCodes.

A. True
B. False

A

A. True

36
Q

For an Integer object with value 20, what is its hashCode?

A. 10
B. 20
C. 30
D. 40

A

B. 20

37
Q

1 &laquo_space;2 is ___

A. 2
B. 3
C. 4
D. 5

A

C. 4

1 &laquo_space;2 is the same as 2^2

38
Q

___ is to find an open location in the hash table in the event of collision.

A. Open addressing
B. Separate chaining

A

A. Open addressing

39
Q

When a collision occurs during the insertion of an entry to a hash table, __ finds the next available location sequentially.

A. linear probing
B. quadratic probing
C. double hashing.

A

A. linear probing

40
Q

The __ places all entries with the same hash index into the same location, rather than finding new locations.

A. Open addressing scheme
B. separate chaining scheme

A

B. separate chaining scheme

41
Q

___ measures how full the hash table is.

A. Load factor
B. Threshold

A

A. Load factor

42
Q

What does the bitwise operator ^ do?

A
It's called the "exclusive or" operator. Works like this:
Num1 : 00101010   42
Num2 : 00001111   15
=================
XOR  : 00100101   37

Every bit in the two numbers are compared to each other. If they are the same, the resulting bit is 0. If they are not the same, the resulting bit is 1.