Exam August 2020 Flashcards
(29 cards)
- What can be measured in an ODE model?
Yhat
What characterizes model parameters?
Model parameters stay constant over time and consist of x(0), k, and yhat
What characterizes model states?
States change over time; they are x1, x2, etc.
Q1: Use Euler forward to compute the concentration of B after one time step with the step
size Δt = 0.1, i.e. compute B(Δt). Assume the following values for the kinetic rate
constants: k1 = 3, k2 = 2, k3 = 1
The Euler forward method uses the formula:
x(Δt) = x(0) + dx/dt(0) · Δt
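A minimal sketch of one Euler forward step. The reaction scheme from the exam figure is not reproduced in these cards, so the model below (A -> B with rate k1*A, B -> A with rate k2*B, B degraded with rate k3*B) and the initial values A(0) = 1, B(0) = 0 are assumptions made only to show the arithmetic.

```python
def euler_forward_step(f, x0, dt):
    """Return x(dt) = x(0) + dx/dt(0) * dt for a list of states x0."""
    dxdt = f(x0)
    return [x + dx * dt for x, dx in zip(x0, dxdt)]


# Rate constants from the question
k1, k2, k3 = 3.0, 2.0, 1.0


def rhs(x):
    """Hypothetical ODEs (NOT the exam model): A <-> B, B degraded."""
    A, B = x
    dA = -k1 * A + k2 * B
    dB = k1 * A - k2 * B - k3 * B
    return [dA, dB]


x0 = [1.0, 0.0]  # assumed initial concentrations A(0), B(0)
A_dt, B_dt = euler_forward_step(rhs, x0, dt=0.1)
print("B after one step:", B_dt)
```

With these assumed values, dB/dt(0) = 3, so B(Δt) is approximately 0 + 3 · 0.1 = 0.3.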
Answer the questions below
(a) Formulate the null hypothesis underlying a likelihood ratio test! (1 point)
H0 = There is no significant difference in how well the two models fit the data
H1 = One model fits the data better than the other
What do you conclude when you cannot reject the null hypothesis in a χ²-test?
H0 = The residuals are small / there is no difference between the model and the data
H1 = The residuals are big / there is a difference between the model and the data
If H0 cannot be rejected, the model cannot be rejected: it agrees with the data well enough.
What do you conclude when you reject the null hypothesis in a whiteness test?
H0 = The residuals are not too correlated
H1 = The residuals are too correlated
Rejecting H0 means the residuals are too correlated, so the model is rejected.
Give an example of a situation when cross validation is useful in small-scale systems
biology!
When we believe we have overfitted our data
3e) Which test would you use to reject the model in Figure 1? Motivate your answer! (2
points)
Chi2 test
What is hypothesis driven modeling and how does this approach relate to data driven
modeling? Give an example of a question where you would use hypothesis driven modeling
and motivate why hypothesis driven modeling is more useful than data driven modeling for that
question.
Overall, while both hypothesis-driven and data-driven modeling approaches have their strengths and limitations, hypothesis-driven modeling is often more useful for testing specific hypotheses about biological mechanisms or for validating experimental results, while data-driven modeling is often more useful for identifying patterns or generating predictions based on large datasets.
Describe the steps taken to evaluate if a small-scale mechanistic model is in agreement with
experimental data!
We start with a visual inspection and a Chi2 test, and then perform further statistical tests depending on the model: for example, cross validation to see which of several models fits best, or a whiteness test to see whether the residuals are correlated.
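The Chi2 step can be sketched as below. The data, simulated values and measurement uncertainties are made up for illustration, and the degrees of freedom are assumed to equal the number of data points (some courses subtract the number of fitted parameters). The 95% threshold for 5 degrees of freedom, about 11.07, is taken from a standard Chi2 table.

```python
def chi2_cost(y, yhat, sigma):
    """Sum of squared residuals, each weighted by its measurement uncertainty."""
    return sum(((yi - yhi) / si) ** 2 for yi, yhi, si in zip(y, yhat, sigma))


def reject_model(y, yhat, sigma, threshold):
    """Reject the model if the cost exceeds the Chi2 threshold."""
    return chi2_cost(y, yhat, sigma) > threshold


# Made-up example values (not from the exam)
y = [1.0, 2.1, 2.9, 4.2, 5.0]      # measured data
yhat = [1.1, 2.0, 3.0, 4.0, 5.1]   # model simulation at the same time points
sigma = [0.2] * 5                  # standard deviation of each measurement

CHI2_95_DF5 = 11.07  # tabulated 95% critical value for 5 degrees of freedom

cost = chi2_cost(y, yhat, sigma)
rejected = reject_model(y, yhat, sigma, CHI2_95_DF5)
print("cost:", cost, "rejected:", rejected)
```

Here the cost is about 2.0, well below the threshold, so this made-up model would not be rejected.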
Choose a biological network of choice, define what is a node in this particular
network, what interactions do exist, and what types are the underlying
interactions (motivate your answer). (1p)
Nodes: Each node in the network represents a unique protein, which may be involved in a variety of different biological processes. Nodes are typically labeled with the name or identifier of the protein they represent.
Edges: Each edge in the network represents an interaction between two proteins, which may take a variety of forms. For example, an edge may represent a physical binding interaction between two proteins, or it may represent a functional interaction in which one protein regulates the activity of another.
Underlying reactions: The interactions between proteins in the network are often based on underlying biochemical reactions, such as protein-protein binding or enzyme-substrate interactions. These reactions can be represented as edges in the network, with the nodes representing the proteins or other molecules involved in the reaction.
7b. Draw the graph of the network defined by the following adjacency matrix (2p)
Draw this
c. Is the network directed, and/or weighted? (1p)
Directed network: If the network is directed, then the adjacency matrix will be asymmetric.
Weighted network: If the network is weighted, the adjacency matrix entries take values other than 0 and 1, representing the strength or weight of the connections between nodes.
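Both checks can be read directly off the matrix. A small sketch, using a made-up 3x3 adjacency matrix (the exam matrix is not reproduced in these cards):

```python
def is_directed(adj):
    """A network is treated as directed if its adjacency matrix is asymmetric."""
    n = len(adj)
    return any(adj[i][j] != adj[j][i] for i in range(n) for j in range(n))


def is_weighted(adj):
    """A network is treated as weighted if any entry is something other than 0 or 1."""
    return any(value not in (0, 1) for row in adj for value in row)


# Hypothetical example matrix: row i, column j = edge from node i to node j
adj = [
    [0, 2, 0],
    [0, 0, 1],
    [1, 0, 0],
]

print("directed:", is_directed(adj))   # asymmetric, so directed
print("weighted:", is_weighted(adj))   # contains a 2, so weighted
```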
7d. Calculate this: What is the average shortest path of this network?
average shortest path = (sum of shortest path distances for all node pairs) / (total number of node pairs)
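A sketch of this calculation for an unweighted, undirected, connected network, using breadth-first search to get the shortest path distances. The four-node path graph 1-2-3-4 is a made-up example, not the exam network.

```python
from collections import deque


def bfs_distances(adj_list, source):
    """Shortest path length (number of edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj_list[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_shortest_path(adj_list):
    """Sum of distances over all ordered node pairs / number of ordered pairs.

    For an undirected graph this equals the average over unordered pairs.
    Assumes the graph is connected.
    """
    n = len(adj_list)
    total = sum(sum(bfs_distances(adj_list, s).values()) for s in adj_list)
    return total / (n * (n - 1))


# Hypothetical example: path graph 1 - 2 - 3 - 4
path_graph = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
avg = average_shortest_path(path_graph)
print("average shortest path:", avg)
```

For the path graph the six pair distances are 1, 2, 3, 1, 2, 1, so the average is 10 / 6, about 1.67.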
7e. What is the clustering coefficient of this network? (1p)
The clustering coefficient of a whole network is also known as global transitivity (the entire graph is analyzed).
It is closed triplets / (closed + open triplets).
Each triangle counts as three closed triplets, while an open triplet counts as one.
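The triplet counting can be sketched as follows for an undirected network. The example graph (a triangle 1-2-3 with an extra node 4 attached to node 3) is made up for illustration.

```python
from itertools import combinations


def global_transitivity(adj_list):
    """Closed triplets / all connected triplets, counted around each centre node."""
    closed = 0
    total = 0
    for centre, neighbours in adj_list.items():
        # Every pair of neighbours of the centre node forms one triplet
        for a, b in combinations(sorted(neighbours), 2):
            total += 1
            if b in adj_list[a]:  # the two neighbours are also connected
                closed += 1
    return closed / total if total else 0.0


# Hypothetical example: triangle 1-2-3 plus node 4 attached to node 3
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
transitivity = global_transitivity(graph)
print("global clustering coefficient:", transitivity)
```

The single triangle is found once from each of its three corners (three closed triplets), and node 3 contributes two open triplets, giving 3 / 5 = 0.6.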
8 Centrality in network. All questions relate to the network below. (tot 4p)
a. Which node has the highest degree?
The simplest centrality measure is the degree centrality, which is defined by the number of connections attached to each node.
In-degree represents the number of directed connections reaching a node, while out-degree represents the number of directed edges leaving a node.
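A short sketch of counting in-degree and out-degree from a directed edge list. The edges and node names are hypothetical, not the exam network.

```python
from collections import Counter

# Hypothetical directed edges (source, target)
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]

out_degree = Counter(src for src, _ in edges)   # edges leaving each node
in_degree = Counter(dst for _, dst in edges)    # edges reaching each node
degree = out_degree + in_degree                 # total connections per node

highest = max(degree, key=degree.get)
print("in-degree:", dict(in_degree))
print("out-degree:", dict(out_degree))
print("highest total degree:", highest)
```

In this example node C has in-degree 3 and out-degree 0, so it has the highest total degree.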
8b. Is the network likely to come from a random process? If so describe what random process, if not motivate why.
Typical random graph models include the Barabási-Albert model (or “scale-free”), the Erdős-Rényi model (or “random”), and the Watts-Strogatz model (or “small world”).
The “random” graph model creates a graph with nodes and edges completely at random, and the result often looks quite dense (see computer lab 6, Graph models, for examples).
The scale-free model adds edges by preferential attachment: new nodes are more likely to connect to nodes that already have many connections. The result looks the most like a possible protein-protein network, with a few highly connected hubs.
The small-world model often creates ring-like structures. Most nodes can be reached from every other node in a small number of steps.
If the network is not from a random process but is, for example, a protein-protein interaction network, it is more likely to have a few highly connected proteins (hubs), while most proteins have relatively few connections. This is called a scale-free network, which follows the same principle as the scale-free random network generator.
8c. Which node has the highest clustering coefficient?
For a single node, use the local clustering coefficient: the number of edges between the node’s neighbours divided by the number of possible edges between them, k(k-1)/2 for a node with k neighbours.
This differs from the global clustering coefficient (global transitivity), which is closed triplets / (closed + open triplets) for the entire graph.
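Because this question asks about individual nodes, the per-node (local) clustering coefficient is the relevant quantity: edges between a node's neighbours divided by the k(k-1)/2 possible edges. A sketch on a made-up undirected graph (triangle 1-2-3 with node 4 attached to node 3):

```python
from itertools import combinations


def local_clustering(adj_list, node):
    """Fraction of possible edges between the node's neighbours that exist."""
    neighbours = adj_list[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no neighbour pairs, coefficient defined as 0 here
    links = sum(1 for a, b in combinations(sorted(neighbours), 2) if b in adj_list[a])
    return links / (k * (k - 1) / 2)


# Hypothetical example graph
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
coefficients = {node: local_clustering(graph, node) for node in graph}
best = max(coefficients, key=coefficients.get)
print(coefficients)
```

Nodes 1 and 2 score 1.0 (their two neighbours are connected), node 3 scores 1/3, and node 4 scores 0.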
8d. Which node has the highest betweenness centrality?
Betweenness centrality ranks a vertex or an edge by the number of shortest paths going through it.
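One way to count shortest paths through each node is Brandes' algorithm. The sketch below is for an unweighted, undirected graph and returns unnormalised values; when several shortest paths exist between a pair, each node gets the fraction of them that pass through it. The example graph is made up.

```python
from collections import deque


def betweenness(adj_list):
    """Unnormalised node betweenness for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj_list}
    for s in adj_list:
        order = []                               # nodes in order of distance from s
        preds = {v: [] for v in adj_list}        # predecessors on shortest paths
        n_paths = {v: 0 for v in adj_list}       # number of shortest paths from s
        dist = {v: -1 for v in adj_list}
        n_paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                             # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj_list[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj_list}
        for w in reversed(order):                # accumulate from the farthest nodes
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != s:
                score[w] += dependency[w]
    # Each unordered pair was counted from both ends
    return {v: value / 2 for v, value in score.items()}


# Hypothetical example: triangle 1-2-3 plus node 4 attached to node 3
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
centrality = betweenness(graph)
print(centrality)
```

Node 3 lies on the shortest paths 1-4 and 2-4, so it gets a score of 2; all other nodes get 0.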
8e. Which genes form the clique of maximal size (the maximum clique) in this network?
One of the multiple methods for the detection of modules in a graph is based on the identification of cliques. A set of nodes forms a clique (or complete subgraph) if all possible connections between the nodes exist. A two-node clique is simply two connected nodes. A three-node clique is also known as a triangle.
Graphs also contain maximal cliques, which are complete subgraphs such that no other node can be added while maintaining completeness. The maximum clique is the largest of the maximal cliques.
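Maximal cliques can be enumerated with the basic Bron-Kerbosch recursion (without pivoting), and the largest one picked afterwards. This is a sketch for small graphs only; the example graph and its node labels are made up.

```python
def maximal_cliques(adj_list):
    """List all maximal cliques of an undirected graph (basic Bron-Kerbosch)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can be added: clique is maximal
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj_list[v], excluded & adj_list[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj_list), set())
    return cliques


# Hypothetical example: triangle 1-2-3 plus node 4 attached to node 3
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cliques = maximal_cliques(graph)
largest = max(cliques, key=len)
print("maximal cliques:", cliques)
print("largest clique:", largest)
```

The example has two maximal cliques, {1, 2, 3} and {3, 4}; the first is the largest.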
8f. What is the clustering coefficient of node 8?
Use the local clustering coefficient of node 8: count the edges between its neighbours and divide by k(k-1)/2, where k is the degree of node 8.
- Consider the human protein-protein interaction network. (tot 5p)
a. Sketch the degree distribution.
The degree distribution of a protein-protein network in humans is expected to follow a power-law distribution, also known as a scale-free distribution. This means that there are a few highly connected proteins (hubs) in the network, while most proteins have relatively few connections.
9b. How is this degree distribution different from a degree distribution of a randomly generated network? Where are we expected to find the highest fraction of disease-associated genes, please motivate why this is likely. (2p)
In a protein-protein network, the degree distribution is often described as “scale-free”, meaning that it follows a power-law distribution. This means that the majority of nodes in the network have relatively few connections, while a small number of nodes (known as “hubs”) have a very large number of connections. This type of distribution is characteristic of many real-world networks, including social networks and biological networks.
In contrast, a randomly generated network typically has a degree distribution that follows a Poisson distribution. In this type of distribution, the number of nodes with a given degree is expected to follow a bell-shaped curve, with the majority of nodes having a degree close to the average degree of the network.
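The contrast can be seen by simulating both kinds of network and comparing their degrees. This is a rough sketch using only the standard library; the network size (1000 nodes), the average degree of about 4, and the seed are arbitrary choices for illustration.

```python
import random


def erdos_renyi_degrees(n, p, rng):
    """Degrees of a random graph where each node pair is linked with probability p."""
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                deg[i] += 1
                deg[j] += 1
    return deg


def preferential_attachment_degrees(n, m, rng):
    """Degrees of a graph grown by attaching each new node to m existing nodes,
    chosen with probability proportional to their current degree."""
    deg = [0] * n
    endpoints = []  # each node appears once per edge end, so choice ~ degree
    for i in range(m + 1):            # start from a small complete graph
        for j in range(i + 1, m + 1):
            deg[i] += 1
            deg[j] += 1
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            deg[new] += 1
            deg[t] += 1
            endpoints += [new, t]
    return deg


rng = random.Random(0)
n, m = 1000, 2
random_deg = erdos_renyi_degrees(n, 2 * m / (n - 1), rng)
scale_free_deg = preferential_attachment_degrees(n, m, rng)

print("random graph: mean", sum(random_deg) / n, "max", max(random_deg))
print("scale-free:   mean", sum(scale_free_deg) / n, "max", max(scale_free_deg))
```

Both networks have a similar mean degree, but the preferential-attachment network should show a much larger maximum degree (its hubs), while the random network's degrees stay close to the mean.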
Degree correlates with lethality: a node with a high degree is more likely to be lethal when disrupted and to be a disease-associated gene. This is because a gene that is used a lot and is involved in many pathways will, if something goes wrong, cause problems in many places, giving higher lethality and more disease association.