Inno 3.3 Data Structures Flashcards
(50 cards)
Q: What is List<T> and how is it implemented internally?
A: List<T> is a dynamic array backed by a contiguous memory block. When more space is needed, it automatically resizes, usually by doubling its capacity.
Access by index is O(1), insert/remove at the end is amortized O(1), but insert/remove in the middle is O(n) due to shifting elements.
AUTOMATICALLY RESIZES BY DOUBLING CAPACITY
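A minimal sketch of these costs in practice (values chosen arbitrarily for illustration):

```csharp
using System;
using System.Collections.Generic;

var list = new List<int> { 10, 20, 30 };

// O(1): direct index access into the backing array.
Console.WriteLine(list[1]); // 20

// Amortized O(1): appended into spare capacity at the end.
list.Add(40);

// O(n): every element at index 1 and beyond must shift right.
list.Insert(1, 15);

Console.WriteLine(string.Join(",", list)); // 10,15,20,30,40
```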
Q: What is Dictionary<TKey, TValue> and how does it work under the hood?
A: Dictionary uses a hash table under the hood.
Keys are hashed, then mapped to a bucket index.
Reading and writing are fast (O(1) on average), but this depends on the quality of the hash function and the number of collisions. Collisions are handled with chaining or probing.
If the load factor threshold (commonly around 75%) is exceeded, the table is rehashed: a new array with roughly double the number of buckets is allocated and the entries are copied into it.
BUCKET, LOAD FACTOR, REHASHING
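A simplified sketch of the hash-to-bucket mapping; the bucket count of 8 is a made-up illustration, and the real Dictionary<TKey,TValue> adds collision chains, prime-sized tables, and rehashing on top of this idea:

```csharp
using System;
using System.Collections.Generic;

int bucketCount = 8; // hypothetical bucket array size for illustration
string key = "apple";

int hash = key.GetHashCode() & 0x7FFFFFFF; // strip the sign bit
int bucketIndex = hash % bucketCount;      // map hash -> bucket slot
Console.WriteLine($"'{key}' lands in bucket {bucketIndex} of {bucketCount}");

// The built-in type does all of this internally:
var dict = new Dictionary<string, int> { ["apple"] = 1 };
Console.WriteLine(dict["apple"]); // 1
```

(Note that string hash codes are randomized per process in modern .NET, so the printed bucket index varies between runs.)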
Q: Why are read/write operations in Dictionary fast?
A: Because the key is first hashed, and then the hash determines the index in an internal array.
If the hash function distributes keys evenly, lookup is nearly constant time. Collisions degrade performance but are rare with good hashing.
Q: What’s the difference between Array, ArrayList, and List<T>?
A: Array is fixed-size and type-safe.
ArrayList is non-generic and stores object, causing boxing/unboxing.
List<T> is generic, type-safe, resizable, and faster than ArrayList.
Q: What are the time complexities for common operations in List<T>?
A: Index access: O(1)
Add (end): amortized O(1)
Insert (middle): O(n)
Remove (by index): O(n)
Q: What is a Queue<T> and what are its time complexities?
A: Queue<T> is a FIFO (First-In-First-Out) structure backed by a circular array. Enqueue and Dequeue are both amortized O(1). Good for producer-consumer scenarios.
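FIFO order in a short sketch (the job names are arbitrary):

```csharp
using System;
using System.Collections.Generic;

// FIFO: the first item enqueued is the first one dequeued.
var jobs = new Queue<string>();
jobs.Enqueue("first");  // O(1): written at the tail of the circular array
jobs.Enqueue("second");
jobs.Enqueue("third");

Console.WriteLine(jobs.Dequeue()); // first  (O(1): removed from the head)
Console.WriteLine(jobs.Peek());    // second (inspect without removing)
Console.WriteLine(jobs.Count);     // 2
```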
Q: What is a Stack<T> and what are its time complexities?
A: Stack<T> is LIFO (Last-In-First-Out). Push and Pop are O(1). Internally it uses a resizable array similar to List<T>.
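LIFO order in a short sketch, using browser history as a hypothetical use case:

```csharp
using System;
using System.Collections.Generic;

// LIFO: the last item pushed is the first one popped.
var history = new Stack<string>();
history.Push("page1"); // O(1): appended at the top of the backing array
history.Push("page2");
history.Push("page3");

Console.WriteLine(history.Pop());  // page3 (most recently pushed)
Console.WriteLine(history.Peek()); // page2
Console.WriteLine(history.Count);  // 2
```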
Q: What is ConcurrentStack<T>?
A: A thread-safe version of Stack<T>. It uses lock-free synchronization (compare-and-swap operations) under the hood to support concurrent access with high performance.
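A small sketch of concurrent use; note the Try* pattern, which returns false instead of throwing when the stack is empty:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Many threads push concurrently; no explicit lock is needed.
var stack = new ConcurrentStack<int>();
Parallel.For(0, 1000, i => stack.Push(i));
Console.WriteLine(stack.Count); // 1000 - no pushes were lost

// TryPop returns false rather than throwing on an empty stack.
if (stack.TryPop(out int top))
    Console.WriteLine($"popped {top}");
```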
Q: What is LinkedList<T> and when is it better than List<T>?
A: LinkedList<T> is a doubly-linked list. Insertion and deletion are O(1) if you have the node reference, but lookup is O(n). It is better than List<T> when you frequently insert/delete in the middle or want to avoid reallocating memory.
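A sketch showing the node-reference distinction: inserting next to a node you already hold is O(1), while finding a node is O(n):

```csharp
using System;
using System.Collections.Generic;

var chain = new LinkedList<string>();
chain.AddLast("a");
var middle = chain.AddLast("b"); // keep the LinkedListNode reference
chain.AddLast("c");

// O(1): inserting next to a node we already hold a reference to.
chain.AddAfter(middle, "b2");

// O(n): finding a node requires walking the list from the head.
var found = chain.Find("c");
Console.WriteLine(found!.Value); // c

Console.WriteLine(string.Join("->", chain)); // a->b->b2->c
```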
Q: What is a Hashtable?
A: A non-generic dictionary that stores keys/values as object. Uses hashing internally. Slower than generic Dictionary<TKey,TValue> due to boxing/unboxing and lack of compile-time safety.
STORES AS AN OBJECT => USES BOXING/UNBOXING
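The boxing difference in a short sketch (the key name is arbitrary):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hashtable stores everything as object: the int 42 is boxed onto
// the heap, and reading it back requires a runtime cast.
var table = new Hashtable();
table["answer"] = 42;                  // boxing: int -> object
int fromTable = (int)table["answer"]!; // unboxing; wrong cast fails at runtime

// Dictionary<string,int> keeps the int unboxed and checks the
// types at compile time instead.
var dict = new Dictionary<string, int> { ["answer"] = 42 };
int fromDict = dict["answer"];

Console.WriteLine(fromTable == fromDict); // True
```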
Q: What is SortedList<TKey, TValue> and how is it different from SortedDictionary<TKey, TValue>?
A: SortedList uses arrays internally and maintains keys in sorted order. Fast index access, but insertion is O(n) due to shifting.
SortedDictionary uses a self-balancing binary search tree (Red-Black Tree), with O(log n) insertion/lookup and better scalability for large datasets.
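A sketch of the key behavioral difference: both keep keys sorted, but only SortedList offers positional access through its Keys/Values indexers:

```csharp
using System;
using System.Collections.Generic;

// Keys inserted out of order end up sorted in both collections.
var sl = new SortedList<int, string> { [3] = "c", [1] = "a", [2] = "b" };
var sd = new SortedDictionary<int, string> { [3] = "c", [1] = "a", [2] = "b" };

// SortedList: O(1) access by sorted position (array-backed).
Console.WriteLine(sl.Keys[0]);   // 1
Console.WriteLine(sl.Values[2]); // c

// SortedDictionary: enumerates in key order, but has no positional indexer.
Console.WriteLine(string.Join(",", sd.Keys)); // 1,2,3
```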
Q: What is ConcurrentDictionary<TKey, TValue> and how does it handle thread safety?
A: It partitions data internally (using segments) and applies fine-grained locking or lock-free techniques for concurrent operations. It avoids full locking for high performance in multithreaded environments.
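A sketch of why this matters: 1000 parallel updates to one key, with AddOrUpdate applying each update atomically so no increments are lost:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var counters = new ConcurrentDictionary<string, int>();

// 1000 threads race on the same key; AddOrUpdate retries its update
// delegate atomically, so the final count is exact.
Parallel.For(0, 1000, _ =>
    counters.AddOrUpdate("hits", 1, (_, current) => current + 1));

Console.WriteLine(counters["hits"]); // 1000
```

With a plain Dictionary<TKey,TValue> the same loop would corrupt internal state or lose updates.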
Q: What are the trade-offs between List<T> and LinkedList<T>?
A: List<T> is better for indexed access and sequential data.
LinkedList<T> is better for frequent inserts/removals when you already have a reference to the node.
List<T> has better cache locality and lower memory overhead.
Q: How does resizing in List<T> work?
A: When capacity is exceeded, List<T> allocates a new array (usually double the size), copies all elements, and replaces the internal array. This makes appending amortized O(1), but a single resize is O(n).
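You can observe the growth by watching the Capacity property (the backing array size) as Count increases. The exact growth sequence is an implementation detail of the runtime, so it is not guaranteed:

```csharp
using System;
using System.Collections.Generic;

var list = new List<int>();
int lastCapacity = -1;

for (int i = 0; i < 40; i++)
{
    list.Add(i);
    if (list.Capacity != lastCapacity) // a resize (or first allocation) happened
    {
        Console.WriteLine($"Count={list.Count,2} Capacity={list.Capacity}");
        lastCapacity = list.Capacity;
    }
}
// Current .NET runtimes typically print capacities 4, 8, 16, 32, 64:
// doubling each time the backing array fills up.
```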
Q: What is cache locality and how does it affect performance in collections?
A: Cache locality describes how well data fits into and is accessed from the CPU cache, which is much faster than RAM. When your code accesses memory, the CPU loads nearby data into its cache to speed things up, so programs that access data stored close together in memory benefit from better cache locality.
Arrays, List<T>: ✅ great locality – elements stored contiguously in memory.
LinkedList<T>, Dictionary<K,V>: ❌ poor locality – nodes/entries are scattered in memory (heap allocated).
HOW WELL DATA FITS INTO THE CPU CACHE
Q: When is it better to use a SortedList vs Dictionary?
A: Use SortedList when you need to maintain order and fast access by index.
Use Dictionary when order doesn’t matter but fast insertion and lookup are needed.
For large datasets, SortedDictionary scales better than SortedList.
Q: Why is Dictionary not 100% O(1) lookup time?
A: Because hash collisions can occur. When multiple keys hash to the same bucket, collisions are handled via chaining or probing, which adds overhead. A poorly designed hash function or excessive load factor can degrade performance.
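A deliberately terrible key type makes the degradation concrete: every instance hashes to the same value, so all entries land in one bucket and lookups fall back to O(n) Equals comparisons. This is illustrative only; never write GetHashCode like this:

```csharp
using System;
using System.Collections.Generic;

var dict = new Dictionary<BadKey, string>();
for (int i = 0; i < 5; i++)
    dict[new BadKey(i)] = $"value{i}";

// Still correct (Equals disambiguates within the bucket), just slower:
Console.WriteLine(dict[new BadKey(3)]); // value3

class BadKey
{
    public int Id;
    public BadKey(int id) => Id = id;

    public override int GetHashCode() => 0; // every key collides
    public override bool Equals(object? obj) => obj is BadKey k && k.Id == Id;
}
```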
Q: Why is List<T> generally faster than LinkedList<T> for most scenarios?
A: List<T> is backed by a contiguous memory block (array), which gives it excellent cache locality. The CPU can load sequential data into cache lines quickly, making iteration and index-based access very fast. In contrast, LinkedList<T> stores elements as separate nodes with pointers, often scattered in memory. This leads to frequent cache misses and slower traversal, despite O(1) insertions and deletions when a node reference is known.
Q: Why is ArrayList slower and less safe than List<T>?
A: ArrayList is a non-generic collection that stores elements as object, requiring boxing and unboxing for value types. This adds memory overhead and runtime performance penalties. List<T> is generic and avoids boxing, resulting in type safety, faster access, and better memory efficiency. Generic collections also benefit from compile-time type checks and JIT optimizations.
Q: Why does Dictionary<TKey, TValue> provide fast key lookups on average?
A: Dictionary uses a hash table internally. The key is hashed (via GetHashCode), and the result is used to calculate an index in an internal array (bucket). If the hash function distributes keys uniformly and there are few collisions, lookup is close to O(1). Collisions are resolved using chaining or open addressing. Performance may degrade to O(n) in worst-case scenarios, but a good hash function and resizing strategy keep it fast on average.
Q: What makes ConcurrentDictionary<TKey, TValue> thread-safe and performant?
A: ConcurrentDictionary avoids global locks by partitioning the data into internal segments. It applies fine-grained locking or uses lock-free techniques like Interlocked.CompareExchange. This design allows multiple threads to read/write concurrently with minimal contention, making it highly scalable for multi-threaded environments.
Q: Why is SortedList<TKey, TValue> slower on inserts compared to SortedDictionary<TKey, TValue>?
A: SortedList uses two internal sorted arrays (for keys and values). When inserting a new item, it must perform a binary search to find the correct index (O(log n)), but then shift elements to maintain sorted order (O(n)). In contrast, SortedDictionary uses a self-balancing binary search tree (Red-Black Tree), which guarantees O(log n) insertion and avoids shifting, making it faster for large or dynamic datasets.
Q: What is a Hashtable, and why is it slower than Dictionary<TKey, TValue>?
A: Hashtable is a non-generic key/value collection that internally uses hashing. It stores everything as object, which introduces boxing for value types and requires runtime type casting. This increases GC pressure and introduces potential for runtime errors. Dictionary<TKey, TValue> is generic, type-safe, and faster due to less boxing/unboxing, better memory alignment, and JIT optimizations.
Q: Why does LinkedList<T> have worse read performance than List<T>?
A: LinkedList<T> requires sequential traversal to access a specific element (O(n)), as it does not support index-based access. Each node is a separate object in memory with references to the next/previous nodes, which causes pointer chasing and poor cache locality. In contrast, List<T> stores elements in contiguous memory and supports O(1) indexing, making reads much faster.