Principles of AI Programming (Theory) Flashcards
(79 cards)
[HEURISTICS] What does the User-Item Matrix represent in collaborative filtering recommendation systems?
Users’ preferences or ratings for various items
The User-Item Matrix is a fundamental data structure in collaborative filtering systems that represents how each user has rated or interacted with different items. Each row typically represents a user, each column represents an item, and each cell contains the user’s rating or preference for that item.
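As a rough illustration of the layout described above, the matrix can be sketched as a list of rows. The user names, item names, and ratings here are made-up toy data, and using `None` for "not rated" is an assumption of this sketch rather than a fixed convention.

```python
# Hypothetical toy data: rows are users, columns are items,
# and None marks a cell the user has not rated.
users = ["alice", "bob", "carol"]
items = ["item_a", "item_b", "item_c", "item_d"]
ratings = [
    [5, 3, None, 1],     # alice
    [4, None, None, 1],  # bob
    [None, 1, 5, 4],     # carol
]


def get_rating(user, item):
    """Look up one cell of the matrix by user name and item name."""
    return ratings[users.index(user)][items.index(item)]


def rated_items(user):
    """Return {item: rating} for the items this user has rated."""
    row = ratings[users.index(user)]
    return {item: r for item, r in zip(items, row) if r is not None}
```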
[HEURISTICS] What popular metric is used to calculate similarity between users in collaborative filtering?
Cosine similarity
Cosine similarity is a popular metric used to measure the similarity between two users based on their rating patterns. It calculates the cosine of the angle between their rating vectors, which effectively measures how similarly they rate items regardless of the magnitude of their ratings.
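A minimal sketch of the calculation, assuming both users' ratings are given as equal-length lists over the same items with no missing entries (real systems usually have to handle unrated items first). Returning 0.0 for an all-zero vector is a choice made here for convenience.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: treat an all-zero vector as "no similarity"
    return dot / (norm_u * norm_v)
```

Scaling one vector by a positive constant, for example `[1, 2, 3]` versus `[2, 4, 6]`, leaves the result at approximately 1, which is the "regardless of magnitude" property.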
[HEURISTICS] What is the “cold start” problem in recommendation systems?
The difficulty of making recommendations for new users or items with little data
The cold start problem refers to the challenge of providing accurate recommendations for new users who haven’t provided many ratings or new items that haven’t been rated much. Without sufficient data, it’s difficult for collaborative filtering to find meaningful patterns or similarities.
[HEURISTICS] How does the Nearest Neighbor heuristic work when applied to the Travelling Salesman Problem (TSP)?
It constructs a path by repeatedly visiting the nearest unvisited city
The Nearest Neighbor heuristic is a greedy algorithm for solving the TSP. It works by starting at one city and repeatedly visiting the nearest unvisited city until all cities have been visited, then returning to the starting city. While fast and intuitive, it doesn’t guarantee an optimal solution.
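The procedure can be sketched as follows, assuming cities are given as (x, y) points and distances are straight-line. The function name and the tie-breaking rule (lowest index wins) are choices made for this illustration.

```python
import math


def nearest_neighbor_tour(cities, start=0):
    """Greedy tour: from the current city, always go to the closest unvisited one.

    cities is a list of (x, y) tuples. Returns a list of city indices that
    begins and ends at `start`.
    """
    unvisited = set(range(len(cities)))
    unvisited.remove(start)
    tour = [start]
    current = start
    while unvisited:
        # sorted() makes ties deterministic: the lowest index wins.
        nearest = min(sorted(unvisited),
                      key=lambda j: math.dist(cities[current], cities[j]))
        unvisited.remove(nearest)
        tour.append(nearest)
        current = nearest
    tour.append(start)  # return to the starting city
    return tour
```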
[HEURISTICS] How is Euclidean distance used in recommendation systems?
Calculating the similarity between users’ preference profiles
Euclidean distance is used to measure the similarity or dissimilarity between users based on their ratings. A smaller Euclidean distance indicates that two users have similar preferences, which is useful for finding like-minded users whose ratings can inform recommendations.
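A small sketch of the idea, assuming each user's ratings are stored as a dict from item to rating and that only items rated by both users are compared. The conversion `1 / (1 + d)` is one common way to turn a distance into a similarity score, not the only one.

```python
import math


def euclidean_distance(ratings_a, ratings_b):
    """Distance between two users over the items both have rated.

    Returns None when the users share no rated items.
    """
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return None
    return math.sqrt(sum((ratings_a[i] - ratings_b[i]) ** 2 for i in shared))


def distance_to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1]; smaller distance, higher score."""
    return 1.0 / (1.0 + distance)
```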
[HEURISTICS] How does user-based collaborative filtering generate recommendations?
By finding similar users and recommending items they rated highly
User-based collaborative filtering works by first identifying users who have similar preferences to the target user (neighbors). It then recommends items that these similar users have rated highly but the target user hasn’t yet rated or interacted with.
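The two steps can be sketched as below. This is an illustration under several assumptions: ratings are stored as nested dicts, similarity is cosine over co-rated items, and unrated items are scored by a similarity-weighted average of the neighbors' ratings. The function names and the parameters `k` and `top_n` are hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity over the items both users rated (dicts item -> rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, ratings, k=2, top_n=3):
    """Rank items the target has not rated, using the k most similar users."""
    others = [(similarity(ratings[target], r), user)
              for user, r in ratings.items() if user != target]
    neighbors = sorted(others, reverse=True)[:k]

    scores, weights = {}, {}
    for sim, user in neighbors:
        if sim <= 0:
            continue  # ignore users with no measurable similarity
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim

    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]
```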
[HEURISTICS] What major limitation does the Nearest Neighbor heuristic have when applied to the TSP?
It can produce suboptimal routes by making locally optimal choices
The Nearest Neighbor heuristic is a greedy algorithm that makes the locally optimal choice at each step. However, this doesn’t guarantee a globally optimal solution. It may lead to inefficient routes, especially when the best overall route requires temporarily moving away from nearby cities.
[HEURISTICS] What is Mean Squared Error (MSE) used for in recommendation systems?
To measure the accuracy of predicted ratings
Mean Squared Error (MSE) measures the average squared difference between the predicted ratings and the actual ratings given by users. A lower MSE indicates more accurate predictions, which is crucial for effective recommendation systems.
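A minimal sketch of the calculation, assuming predictions and actual ratings are given as two equal-length lists in the same order.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)
```

For predictions `[4, 3, 5]` against actual ratings `[5, 3, 3]`, the squared errors are 1, 0, and 4, so the MSE is 5/3.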
[HEURISTICS] How does item-based collaborative filtering differ from user-based collaborative filtering?
It finds similar items rather than similar users
Item-based collaborative filtering identifies relationships between items based on user ratings. Instead of finding similar users, it finds items similar to those the user has already rated positively, and then recommends those similar items to the user.
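To illustrate the shift from users to items, the sketch below compares items by the ratings they received, that is, by the columns of the user-item matrix. It assumes ratings are stored as nested dicts and uses cosine similarity over users who rated both items; the function names are hypothetical.

```python
import math


def item_vector(item, ratings):
    """Ratings an item received, as {user: rating}."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def item_similarity(i, j, ratings):
    """Cosine similarity between two items over users who rated both."""
    vi, vj = item_vector(i, ratings), item_vector(j, ratings)
    shared = set(vi) & set(vj)
    if not shared:
        return 0.0
    dot = sum(vi[u] * vj[u] for u in shared)
    norm_i = math.sqrt(sum(vi[u] ** 2 for u in shared))
    norm_j = math.sqrt(sum(vj[u] ** 2 for u in shared))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def most_similar_items(item, ratings):
    """All other items, ordered from most to least similar to `item`."""
    all_items = {i for r in ratings.values() for i in r}
    scored = [(item_similarity(item, other, ratings), other)
              for other in all_items if other != item]
    return [other for _, other in sorted(scored, reverse=True)]
```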
[HEURISTICS] Why is distance calculation important in solving the Travelling Salesman Problem?
To evaluate and compare candidate routes by their total path length
Distance calculation is essential in TSP as it forms the basis for evaluating different possible routes. By calculating distances between cities, the algorithm can determine which routes are shorter and thus more efficient for the salesperson to travel.
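As a sketch of how a route is evaluated, the function below sums the straight-line distances along a closed tour. It assumes cities are (x, y) points and that the route is a list of city indices; both are choices made for this example.

```python
import math


def tour_length(cities, order):
    """Total length of the closed tour that visits cities in `order`
    and then returns to the first city."""
    total = 0.0
    for k in range(len(order)):
        here = cities[order[k]]
        there = cities[order[(k + 1) % len(order)]]  # wraps back to the start
        total += math.dist(here, there)
    return total
```

With cities at the corners of a 4 by 3 rectangle, going around the perimeter gives a length of 14, while a route that crosses both diagonals gives 18, so the first route is the better of the two.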
[HEURISTICS] What is content-based filtering in recommendation systems?
Recommending items based on item features and user preferences
Content-based filtering makes recommendations by comparing the features of items with the user’s preferences. Unlike collaborative filtering, which uses patterns in user ratings, content-based approaches analyze item attributes (like genre, actors, or keywords for movies) and match them to user preference profiles.
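A minimal sketch of the matching step, assuming each item is described by a set of feature tags and the user's preference profile is built by counting the features of items the user liked. The scoring rule (sum of profile counts for an item's features) and the names are illustrative choices, not a standard formula.

```python
from collections import Counter


def build_profile(liked_items, item_features):
    """Count how often each feature appears among the items the user liked."""
    profile = Counter()
    for item in liked_items:
        profile.update(item_features[item])
    return profile


def rank_unseen(liked_items, item_features):
    """Order the items the user has not liked yet by how well their
    features match the user's profile."""
    profile = build_profile(liked_items, item_features)
    candidates = [i for i in item_features if i not in liked_items]
    return sorted(candidates,
                  key=lambda i: sum(profile[f] for f in item_features[i]),
                  reverse=True)
```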
[HEURISTICS] What is a “sparse” User-Item Matrix in collaborative filtering?
A matrix where most cells contain zero values or are empty
A sparse User-Item Matrix contains many empty entries because most users have only rated a small fraction of all available items. This sparsity is a common challenge in recommendation systems, making it difficult to find reliable patterns and similarities.
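Sparsity is often summarized as the fraction of empty cells. The sketch below assumes the matrix is a list of rows with `None` marking an unrated cell.

```python
def sparsity(matrix):
    """Fraction of cells in the user-item matrix that hold no rating."""
    total = sum(len(row) for row in matrix)
    missing = sum(1 for row in matrix for cell in row if cell is None)
    return missing / total if total else 0.0
```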
[HEURISTICS] What key difference distinguishes collaborative filtering from content-based filtering?
Collaborative filtering uses ratings data, while content-based uses item features
The fundamental difference is that collaborative filtering makes recommendations based on user rating patterns and similarities between users or items, without needing to know anything about the items themselves. Content-based filtering, in contrast, relies on item features (e.g., genre, actors, keywords) and user preferences for those features.
[HEURISTICS] What is serendipity in recommendation systems?
The ability to recommend surprising but relevant items users might not have discovered
Serendipity in recommendation systems refers to recommending items that are both unexpected and relevant to the user. Good recommendation systems balance accuracy (recommending items users are very likely to enjoy) with serendipity (helping users discover new items they wouldn’t have found themselves).
[HEURISTICS] What is precision as an evaluation metric in recommendation systems?
The fraction of recommended items that are relevant to the user
Precision is an evaluation metric that measures the proportion of recommended items that are truly relevant to the user. High precision means the system is making accurate recommendations without suggesting many irrelevant items. It’s often used alongside recall (the proportion of relevant items that were successfully recommended).
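Both metrics can be sketched in a few lines, assuming the recommended items and the truly relevant items are available as collections of item identifiers. Returning 0.0 when a collection is empty is a convention chosen for this example.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are relevant
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

If four items are recommended and two of them are among the user's three relevant items, precision is 2/4 and recall is 2/3.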
[CELLULAR AUTOMATA] What is a cellular automaton?
A discrete model of computation based on a grid of cells with simple rules
A cellular automaton is a computational model consisting of a grid of cells, each with a finite number of states. The cells evolve over discrete time steps according to fixed rules based on the states of neighboring cells, creating complex patterns from simple rules.
[CELLULAR AUTOMATA] What does the state “S” typically represent in a disease spread simulation?
Susceptible individuals
In epidemic simulations, cells often follow the SIR model: Susceptible (S), Infected (I), and Recovered (R). “S” represents individuals who are currently healthy but susceptible to catching the disease if exposed to infected neighbors.
[CELLULAR AUTOMATA] What is the “neighborhood” concept in cellular automata?
The set of cells that influence a cell’s next state
The neighborhood in cellular automata defines which surrounding cells affect the future state of a given cell. Common neighborhoods include the von Neumann neighborhood (the four orthogonally adjacent cells) and the Moore neighborhood (all eight surrounding cells).
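The two neighborhoods can be written as lists of (row, column) offsets. In this sketch, cells beyond the edge of the grid are simply ignored; that boundary rule is an assumption, and wrapping around the edges is another common choice.

```python
# Offsets relative to the cell itself.
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOORE = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def neighbors(row, col, n_rows, n_cols, offsets):
    """Coordinates of the in-bounds neighbors of (row, col)."""
    result = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            result.append((r, c))
    return result
```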
[CELLULAR AUTOMATA] What is Conway’s Game of Life?
A famous cellular automaton with simple rules that create complex patterns
Conway’s Game of Life is one of the best-known cellular automata. Each cell is either alive or dead, and its next state depends on its eight surrounding cells (the Moore neighborhood): a dead cell with exactly three live neighbors becomes alive (birth); a live cell with two or three live neighbors stays alive (survival); every other cell dies or stays dead. These simple rules create remarkably complex and sometimes unpredictable patterns.
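One generation of these rules can be sketched as follows, assuming the grid is a list of rows holding 1 for a live cell and 0 for a dead one. Cells beyond the edge are treated as dead, which is a simplification; the original game is defined on an unbounded grid.

```python
def life_step(grid):
    """Apply the birth and survival rules once and return the new grid."""
    n_rows, n_cols = len(grid), len(grid[0])

    def live_neighbors(r, c):
        # Count live cells among the eight surrounding positions.
        return sum(
            grid[r + dr][c + dc]
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < n_rows
            and 0 <= c + dc < n_cols
        )

    new_grid = []
    for r in range(n_rows):
        new_row = []
        for c in range(n_cols):
            n = live_neighbors(r, c)
            if grid[r][c] == 1:
                new_row.append(1 if n in (2, 3) else 0)  # survival
            else:
                new_row.append(1 if n == 3 else 0)       # birth
        new_grid.append(new_row)
    return new_grid
```

A row of three live cells (a "blinker") flips between horizontal and vertical on each step, which makes it a convenient check.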
[CELLULAR AUTOMATA] What does “infection probability” represent in epidemic spread simulations?
The likelihood that an infected cell will transmit the disease to a neighboring susceptible cell
Infection probability is a parameter that determines how likely it is for a susceptible individual (cell) to become infected when exposed to an infected neighbor. This parameter models the transmissibility of the disease and significantly affects how quickly the epidemic spreads in the simulation.
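A sketch of how this parameter might enter a simulation step. It assumes a grid of "S", "I", and "R" strings, the von Neumann neighborhood, and one independent infection chance per infected neighbor; recovery is left out to keep the focus on transmission. The function name and the `rng` parameter are hypothetical.

```python
import random


def spread_step(grid, infection_prob, rng=random):
    """Return a new grid after one round of transmission.

    Each susceptible cell gets one chance of infection, with probability
    infection_prob, for every infected orthogonal neighbor.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # read from the old grid, write to the copy
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != "S":
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                in_bounds = 0 <= rr < n_rows and 0 <= cc < n_cols
                if in_bounds and grid[rr][cc] == "I":
                    if rng.random() < infection_prob:
                        new_grid[r][c] = "I"
                        break
    return new_grid
```

With `infection_prob=1.0` every susceptible cell next to an infected one becomes infected in a single step; with `0.0` nothing spreads.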
[CELLULAR AUTOMATA] How might vaccination be modeled as an intervention strategy in epidemic simulations?
By making some cells immune to infection (moving them directly to the recovered state)
Vaccination can be modeled by transitioning some susceptible cells directly to a recovered or immune state, bypassing the infected state. This simulates how vaccination protects individuals from becoming infected even when exposed to the disease.
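One way to sketch this intervention, assuming a grid of "S", "I", and "R" strings and a vaccination campaign that reaches a random fraction of the susceptible cells before the simulation step. The function name, the `fraction` parameter, and rounding the count down are assumptions made for illustration.

```python
import random


def vaccinate(grid, fraction, rng=random):
    """Return a new grid in which a random share of susceptible cells
    has been moved straight to the recovered (immune) state."""
    new_grid = [row[:] for row in grid]
    susceptible = [(r, c)
                   for r, row in enumerate(grid)
                   for c, state in enumerate(row)
                   if state == "S"]
    n_vaccinated = int(fraction * len(susceptible))  # rounds down
    for r, c in rng.sample(susceptible, n_vaccinated):
        new_grid[r][c] = "R"  # skips the infected state entirely
    return new_grid
```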
[CELLULAR AUTOMATA] What state is NOT typically included in the basic SIR model for epidemic spread?
Treated
The standard SIR model includes Susceptible (S), Infected (I), and Recovered (R) states. “Treated” is not part of the basic model, though extended variants (such as one labeled SIRT) can add treatment as a separate state or process.
[CELLULAR AUTOMATA] What visualization technique works best for showing disease progression in cellular automata models?
A grid with different colors representing different cell states at each time step
Visualizing the grid directly with color-coded cells (e.g., green for susceptible, red for infected, blue for recovered) provides an intuitive way to see how the disease spreads spatially over time in the cellular automaton model.
[CELLULAR AUTOMATA] What additional state is often added to extend the basic SIR model for greater realism?
Exposed
The SEIR model adds an “Exposed” state between Susceptible and Infected to represent individuals who have contracted the disease but are not yet infectious. This better models diseases with a delay (a latent or incubation period) between infection and the point at which an individual can infect others.
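A sketch of how the extra state might be handled for a single cell, assuming each cell tracks how many time steps it has spent in its current state and that the exposed and infectious stages last a fixed number of steps. The durations, names, and the fixed-length assumption are all illustrative; the move from S to E would be handled separately by the infection rule.

```python
def advance(state, steps_in_state, latent_steps=2, infectious_steps=3):
    """Advance one cell by one time step.

    Returns (new_state, new_steps_in_state). Only E -> I and I -> R are
    handled here; S and R cells simply age by one step.
    """
    if state == "E" and steps_in_state + 1 >= latent_steps:
        return "I", 0  # latent period over: the cell becomes infectious
    if state == "I" and steps_in_state + 1 >= infectious_steps:
        return "R", 0  # infectious period over: the cell recovers
    return state, steps_in_state + 1
```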