Exam 2020 August Flashcards

Question

9c. Compare degree with more complex measures of centrality. What pros and cons do the different measures have in the context of identifying the most important genes?

Answer 1

There are several ways of measuring centrality, such as degree, closeness, eigenvector, and betweenness centrality.

Degree: The simplest centrality measure, defined as the number of connections attached to a node. In a directed network, in-degree is the number of edges arriving at a node, while out-degree is the number of edges leaving it.

Closeness: Based on the average shortest-path distance from the node to all other nodes. It is usually computed as the inverse of that average, so a central node, with high closeness, is close to all other nodes in the network.

Eigenvector: Ranks a node by the importance of its neighbors. A node scores highly when it is itself well connected and is linked to many other central nodes.

Betweenness: Measures the number of shortest paths between other nodes that pass through the node.
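As a rough illustration of how the four measures differ, the sketch below computes them on a small, made-up undirected gene network. The gene names, the edge list, and all function names are hypothetical; the formulas are the common textbook variants (closeness as the inverse of the average distance, betweenness summed over unordered node pairs, eigenvector centrality by power iteration), not necessarily the exact definitions used in the course.

```python
from collections import deque

# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

def build_graph(edges):
    """Store an undirected graph as {node: set of neighbors}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def bfs(graph, source):
    """Return (distance, number of shortest paths) from source to every reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def degree(graph):
    return {v: len(nbrs) for v, nbrs in graph.items()}

def closeness(graph):
    """Inverse of the average shortest-path distance (assumes a connected graph)."""
    result = {}
    for v in graph:
        dist, _ = bfs(graph, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness(graph):
    """For every unordered pair (s, t), add the fraction of shortest paths through v."""
    info = {v: bfs(graph, v) for v in graph}
    nodes = sorted(graph)
    score = {v: 0.0 for v in graph}
    for i, s in enumerate(nodes):
        dist_s, sigma_s = info[s]
        for t in nodes[i + 1:]:
            if t not in dist_s:
                continue
            for v in nodes:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score

def eigenvector(graph, iterations=100):
    """Power iteration on A + I (the shift avoids oscillation), scaled so the top node is 1."""
    score = {v: 1.0 for v in graph}
    for _ in range(iterations):
        new = {v: score[v] + sum(score[w] for w in graph[v]) for v in graph}
        norm = max(new.values())
        score = {v: x / norm for v, x in new.items()}
    return score

graph = build_graph(EDGES)
for name, measure in [("degree", degree), ("closeness", closeness),
                      ("betweenness", betweenness), ("eigenvector", eigenvector)]:
    print(name, {v: round(x, 3) for v, x in sorted(measure(graph).items())})
```

In this toy network degree cannot separate A, B, and D (all have two connections), while betweenness singles out D as a bridge to E and gives A and B a score of zero. This is the kind of trade-off the question asks about: degree is cheap and local, whereas the other measures use the global structure at a higher computational cost.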

Answer 2

A network community is a set of nodes that are densely connected to each other but sparsely connected to the rest of the network. Nodes in the same community are often involved in the same pathways, regulatory mechanisms, or other biological processes. For example, communities in protein interaction networks can correspond to:

Signaling pathways: Proteins can interact with each other to transmit signals within a cell or between cells. These signaling pathways are critical for many cellular processes, including cell growth, differentiation, and apoptosis.

Metabolic pathways: Proteins can also interact with each other to catalyze the biochemical reactions that make up metabolic pathways. These pathways are responsible for the breakdown and synthesis of molecules that are essential for cell function, such as carbohydrates, lipids, and amino acids.

Transcriptional regulation: Proteins can interact with DNA to regulate gene expression. This can involve direct interactions between transcription factors and DNA, as well as indirect interactions through intermediary proteins.
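The "dense inside, sparse outside" definition can be checked numerically. The following is a minimal sketch on a made-up protein network of two triangles joined by a single bridge; the protein names and function names are hypothetical and only serve the illustration.

```python
# Hypothetical toy network: two triangles (P1-P2-P3 and P4-P5-P6) joined by one edge.
EDGES = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
         ("P3", "P4"),
         ("P4", "P5"), ("P4", "P6"), ("P5", "P6")]

def edge_counts(edges, community):
    """Count edges fully inside the community and edges crossing its boundary."""
    members = set(community)
    internal = boundary = 0
    for u, v in edges:
        inside = (u in members) + (v in members)
        if inside == 2:
            internal += 1
        elif inside == 1:
            boundary += 1
    return internal, boundary

def internal_density(edges, community):
    """Fraction of all possible member-to-member edges that are actually present."""
    n = len(set(community))
    possible = n * (n - 1) // 2
    internal, _ = edge_counts(edges, community)
    return internal / possible if possible else 0.0

candidate = {"P1", "P2", "P3"}
print(edge_counts(EDGES, candidate))       # internal and boundary edge counts
print(internal_density(EDGES, candidate))  # 1.0 means every possible internal edge exists
```

A set such as P1, P2, P3 has many internal edges and few boundary edges, which is what marks it as a community; a mixed set such as P3, P4 would show the opposite pattern.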

Answer 3

Modules are sets of nodes that are densely connected to each other but sparsely connected to nodes outside the module. The disease module hypothesis states that a complex disease is often caused not by the malfunction of a single gene but by the disruption of a disease module, that is, a group of densely connected genes. This means that multiple genes and pathways are affected and together cause the disease.
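One way to probe whether a set of disease genes behaves like a module is to ask whether the genes sit together in the network more than randomly chosen genes do. The sketch below is an assumption on my part rather than something stated in the answer: it measures the largest connected component formed by the disease genes and compares it with random gene sets of the same size. The network, gene names, and function names are all hypothetical.

```python
import random

# Hypothetical toy interactome: a triangle G1-G2-G3 followed by a chain up to G10.
EDGES = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G3", "G4"), ("G4", "G5"),
         ("G5", "G6"), ("G6", "G7"), ("G7", "G8"), ("G8", "G9"), ("G9", "G10")]

GRAPH = {}
for u, v in EDGES:
    GRAPH.setdefault(u, set()).add(v)
    GRAPH.setdefault(v, set()).add(u)

def largest_component_size(graph, genes):
    """Size of the largest connected component in the subgraph induced by the genes."""
    genes = set(genes) & set(graph)
    seen, best = set(), 0
    for start in genes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in graph[u]:
                if w in genes and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def empirical_p_value(graph, genes, trials=1000, seed=0):
    """Share of random gene sets whose largest component is at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = largest_component_size(graph, genes)
    nodes = sorted(graph)
    k = len(set(genes) & set(graph))
    hits = sum(largest_component_size(graph, rng.sample(nodes, k)) >= observed
               for _ in range(trials))
    return (hits + 1) / (trials + 1)

disease_genes = ["G1", "G2", "G3", "G4"]  # hypothetical disease-associated genes
print(largest_component_size(GRAPH, disease_genes))
print(empirical_p_value(GRAPH, disease_genes))
```

A small empirical p-value would suggest the disease genes cluster more than expected by chance, which is in line with the hypothesis; a real analysis would also need to control for node degree and network bias.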

Answer 4

To potentially falsify the disease module hypothesis in the study of one disease of interest, one possible test could be a randomized controlled trial (RCT). An RCT is a study design in which participants are randomly assigned to either a treatment group or a control group. The treatment group receives the intervention being tested, while the control group does not. If the disease module hypothesis is correct, we would expect to see a significant difference in disease outcomes between the treatment and control groups. However, if the disease module hypothesis is false, we would not expect to see a significant difference between the two groups.

Answer 5

There are several options; MCODE is one example of a clique-based algorithm. MCODE (Molecular Complex Detection) is designed to identify densely connected subgraphs (modules) in protein-protein interaction networks. It works by scoring each node in the network based on its local connectivity and then recursively expanding from highly scored nodes into a dense subgraph. The algorithm consists of the following steps:

Node scoring: The algorithm assigns a local score to each node based on the density of its immediate neighborhood. In the published method, the score is the density of the highest k-core of the node's neighborhood multiplied by k. The higher the score, the more likely the node is to be part of a densely connected module.

Seed selection: The algorithm selects the highest-scoring node that has not yet been assigned to a module as the seed node.

Module expansion: Starting from the seed, the algorithm recursively adds neighboring nodes whose score lies within a pre-defined percentage of the seed's score (the node score cutoff), until no more neighbors qualify.

Module scoring: The algorithm calculates a score for each module as its density multiplied by its number of nodes.

Output: The algorithm outputs the modules ranked by score, and low-scoring modules can be filtered out with a cutoff.

MCODE has been widely used to find biologically relevant subgraphs in protein-protein interaction networks, and it has been applied to a variety of biological systems, including cancer and infectious diseases.
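The scoring and expansion steps can be sketched in a few lines. This is a simplified illustration, not the reference MCODE implementation: it assumes the node score is the density of the highest k-core of the node's closed neighborhood multiplied by k (my reading of the published method), uses a hypothetical cutoff of 0.2, and leaves out any post-processing. The network and all function names are made up.

```python
# Hypothetical toy network: a 4-clique A-B-C-D with a tail D-E-F.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("E", "F")]

GRAPH = {}
for u, v in EDGES:
    GRAPH.setdefault(u, set()).add(v)
    GRAPH.setdefault(v, set()).add(u)

def neighborhood_subgraph(graph, v):
    """Subgraph induced by v and its direct neighbors."""
    nodes = graph[v] | {v}
    return {u: graph[u] & nodes for u in nodes}

def highest_k_core(sub):
    """Peel low-degree nodes for increasing k; return (k, subgraph) of the last non-empty core."""
    sub = {u: set(nbrs) for u, nbrs in sub.items()}
    best_k, best_core = 0, {u: set(nbrs) for u, nbrs in sub.items()}
    k = 1
    while sub:
        changed = True
        while changed:
            low = [u for u, nbrs in sub.items() if len(nbrs) < k]
            changed = bool(low)
            for u in low:
                for w in sub.pop(u):
                    if w in sub:
                        sub[w].discard(u)
        if sub:
            best_k, best_core = k, {u: set(nbrs) for u, nbrs in sub.items()}
            k += 1
    return best_k, best_core

def density(sub):
    """Edges present divided by edges possible (no self-loops)."""
    n = len(sub)
    edges = sum(len(nbrs) for nbrs in sub.values()) // 2
    return edges / (n * (n - 1) / 2) if n > 1 else 0.0

def node_scores(graph):
    scores = {}
    for v in graph:
        k, core = highest_k_core(neighborhood_subgraph(graph, v))
        scores[v] = k * density(core)
    return scores

def grow_module(graph, scores, seed, cutoff=0.2):
    """Recursively add neighbors scoring at least (1 - cutoff) * seed score (assumed rule)."""
    threshold = scores[seed] * (1 - cutoff)
    module, stack = {seed}, [seed]
    while stack:
        u = stack.pop()
        for w in graph[u]:
            if w not in module and scores[w] >= threshold:
                module.add(w)
                stack.append(w)
    return module

scores = node_scores(GRAPH)
seed = max(sorted(scores), key=scores.get)  # ties broken alphabetically
module = grow_module(GRAPH, scores, seed)
print(scores)
print(seed, sorted(module), density({u: GRAPH[u] & module for u in module}) * len(module))
```

On this toy network the four clique members receive the top score and form the module, while the tail nodes E and F fall below the threshold and are left out. The last printed value is the module score under the density-times-size definition.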

(29 cards)