nearest neighbor Flashcards
don't forget kd trees (13 cards)
what do you do in nearest neighbors classification?
identify what training neighbor is most similar to the target and take the class label from it
what are some advantages of nearest neighbor classification?
simplicity, interprtability, non-linearity
what is the issue with nearest neighbor classification?
devising the right distance function
what are the mathematical properties of distance measures?
positive, if the distance is 0 the points must be equal, the distance from x to y must be the distance from y to x bc symmetry, triangle inequality
what is the generalized idea of nearest neighbor?
function interpolation by averaging the values of the k nearest points
how can you adjust nearest neighbors classification for more robust classification or interpolation?
using the k closet neighbors
what happens to the boundaries as k gets larger in k nearest neighbors?
smoother boundaries
what is the time complexity for to find NN given n points in d dimensions?
O(nd)
what are voronoi diagrams?
diagrams that partition space into regions sharing nearest neighbors
what are kd trees?
efficient algorithm that helps to find nearest neighbors in low dimensions.
what is locally sensitive hashing?
could speed nearest neighbor search if nearby points got hashed to the same bucket. loally sensitive hashing works on hashing points so that points/ vectors that are close to each other are hashed intot he same buckets. (do this by picking planes through the origin and cut through them)
what are some examples of network data?
social networks, www, product/ customer networks, genetic networks
what is page rank?
algorithm to find the probability that a random walk on G ends up at vertex v. looks at hyperlinks to different sites. allows for random jumps (damping factor)