(W10) Data Structures Flashcards
(11 cards)
What is a prefix of a string?
S[1..j] where 1 ≤ j ≤ n
What is the suffix of a string?
S[i..n] where 1 ≤ i ≤ n
What is a useful observation to make about a substring of S?
It will always be a PREFIX of some SUFFIX of S
What is the main way you can modify how a prefix trie works? (+3 ways)
The way that a node stores it’s children
Use an array: allows O(1) lookup, which leads to O(n) lookup over the entire trie. But bad space complexity O(T*D) t(total number of letters), d(domain)
Use a BST: good space complexity, only store a child if it actually exists O(T). But makes lookups O(n * log(D)
Use a hashtable: (don’t use this in an exam). theoretically lookup will be O(1) and space will be O(T), but it restricts our ability to perform more complicated operations
How to use a Prefix Trie to do prefix matching?
can inspect the trie to see if any words have matching prefixes.
How to use a Prefix trie to sort a list of words?
Insert all the words into the tree, and traverse it in lexigraphical order.
With an array, it’s O(length of all words * domain if not constant)
With a bst, it’s O(T log |domain|)
What are compact Trie or Trees?
If you have a chain of characters, merge them into a single edge
What are Suffix Trees and what problem do they solve?
Add every suffix of the word into the tree.
can be used to solve the pattern matching problem
How can we use a Suffix Tree to find if a pattern exists in a string?
Remember - a subset of a string is just a prefix of a suffix!
so look for the pattern in the suffix tree, and if you can find it, then there is a substring. O(substring length)
How to build a Suffix Tree?
In FIT2004, build the suffix trie O(n^2) and compress it into a suffix tree
How to solve the longest repeated substring problem?
Traverse a suffix tree, and look for the deepest internal node (non-leaf node)
Note - if using a suffix trie, have to have special handling for internal nodes that correspond to a single word. Maybe degree of node >2?