2 - Virtual Screening & QSAR Modeling Flashcards
(15 cards)
What is Virtual Screening (VS)?
A computational drug-discovery method that rapidly searches large molecule libraries for the compounds most likely to bind a target; in effect, it tries to predict bioactivity before any experiment is run.
Two main types: ligand-based, structure-based
Why is Virtual Screening used?
To reduce cost and effort by narrowing down compound libraries before wet-lab testing.
What are ligand-based virtual screening methods?
Uses known active molecules (ligands) to find new ones with similar structure or properties, assuming similar molecules have similar activity.
These methods use only ligand information and include similarity searching and ML model-based approaches.
Used when: the structure of the target protein is unknown
What is structure-based virtual screening?
Uses the 3D structure of a target protein to identify molecules that can bind to it.
These methods use both the ligand and the protein structure; examples include docking, scoring, and molecular dynamics.
Used when: the protein structure is known (e.g., from X-ray crystallography or AlphaFold).
SBVS helps find molecules that are likely to act as inhibitors or drugs.
What are molecular fingerprints?
Vector representations of molecules, typically fixed-length bit vectors in which each bit encodes the presence or absence of a substructure.
Name three similarity metrics used in virtual screening.
Tanimoto, Dice, and Cosine similarity.
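As a minimal sketch of how these three metrics compare two binary fingerprints, the example below represents each fingerprint as a Python set of "on" bit indices. The fingerprints fp_a and fp_b are made-up illustration data, not output from any real fingerprinting tool.

```python
from math import sqrt

def tanimoto(a: set, b: set) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def dice(a: set, b: set) -> float:
    """Twice the shared bits divided by the total bits set in both."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0

def cosine(a: set, b: set) -> float:
    """Shared bits divided by the geometric mean of the two bit counts."""
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0

# Hypothetical fingerprints given as sets of on-bit indices
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8, 9, 12}

print(round(tanimoto(fp_a, fp_b), 3))  # 3 shared / 6 in union -> 0.5
print(round(dice(fp_a, fp_b), 3))      # 2*3 / (4+5) -> 0.667
print(round(cosine(fp_a, fp_b), 3))    # 3 / sqrt(4*5) -> 0.671
```

All three range from 0 (no shared bits) to 1 (identical fingerprints); for the same pair, Tanimoto is never larger than Dice.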
What are PAINS in drug discovery?
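Pan-Assay INterference compoundS: molecules that register as hits across many unrelated bioassays by interfering with the assay readout rather than binding the target specifically, so they often cause false positives.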
What is QSAR modeling?
Quantitative Structure-Activity Relationship modeling; predicts bioactivity from molecular structure.
What are common ML models used in QSAR?
Random Forest, SVM, Gradient Boosting, Neural Networks, GNNs.
Why is pIC50 used instead of IC50?
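Because pIC50 = -log10(IC50 in mol/L) is log-scaled: it compresses potencies that span several orders of magnitude into a narrow range, which is numerically more stable and easier for regression tasks. A higher pIC50 means a more potent compound.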
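A short worked conversion may help here. The sketch below assumes IC50 values are reported in nanomolar (a common convention, but an assumption; check the units of your own data), and the function name ic50_nm_to_pic50 is illustrative.

```python
import math

def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Since 1 nM = 1e-9 M, this equals 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)

# IC50 values spanning four orders of magnitude map to a narrow range
for ic50 in (1.0, 100.0, 10000.0):
    print(ic50, "nM ->", ic50_nm_to_pic50(ic50))
```

A 1 nM compound maps to pIC50 9, a 100 nM compound to 7, and a 10 µM compound to 5, so each tenfold loss in potency lowers pIC50 by one unit.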
What is scaffold-based data splitting?
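Splitting data by core molecular structure (scaffold) so that all compounds sharing a scaffold land in the same subset, which avoids information leakage between the train and test sets.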
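One possible way to implement such a split is sketched below. It assumes each compound already has a scaffold key computed elsewhere (for example, a scaffold SMILES from a cheminformatics toolkit); the compound IDs and scaffold labels are hypothetical, and the greedy "largest groups first" assignment is one design choice among several.

```python
from collections import defaultdict

def scaffold_split(records, train_frac=0.8):
    """Split (compound_id, scaffold_key) pairs so no scaffold spans both sets."""
    groups = defaultdict(list)
    for compound_id, scaffold in records:
        groups[scaffold].append(compound_id)

    # Largest scaffold groups first; ties broken by key for determinism
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    cutoff = train_frac * len(records)
    train, test = [], []
    for _, ids in ordered:
        # A whole group goes to train if it still fits, otherwise to test
        if len(train) + len(ids) <= cutoff:
            train.extend(ids)
        else:
            test.extend(ids)
    return train, test

# Hypothetical data: scaffold labels stand in for real scaffold keys
records = [
    ("m1", "benzene"), ("m2", "benzene"), ("m3", "benzene"),
    ("m4", "pyridine"), ("m5", "pyridine"),
    ("m6", "indole"), ("m7", "indole"),
    ("m8", "quinoline"), ("m9", "furan"), ("m10", "thiophene"),
]
train_ids, test_ids = scaffold_split(records)
print(train_ids)
print(test_ids)
```

Because whole scaffold groups are assigned together, the test set contains only scaffolds the model never saw during training, which is a harder and usually more realistic evaluation than a random split.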
What is compound series bias?
When structurally similar compounds (e.g., from the same chemical series) appear in both the train and test sets, leading to overoptimistic performance estimates.
What is multitask QSAR?
A model predicting multiple bioactivities (targets) from a single molecule input.
What is drug synergy?
When the combined effect of drugs is greater than the effect expected from their individual effects (i.e., more than additive).
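To make "greater than expected" concrete, a reference model for the expected combined effect is needed. The sketch below uses Bliss independence, one commonly used reference model; this choice is an assumption for illustration and is not specified by the card. Effects are fractional inhibitions between 0 and 1, and the numbers are made up.

```python
def bliss_excess(effect_a: float, effect_b: float, effect_combo: float) -> float:
    """Observed combination effect minus the Bliss-independence expectation.

    Expected effect if the drugs act independently: Ea + Eb - Ea * Eb.
    A positive excess suggests synergy; a negative one suggests antagonism.
    """
    expected = effect_a + effect_b - effect_a * effect_b
    return effect_combo - expected

# Hypothetical fractional inhibitions: drug A alone, drug B alone, both together
excess = bliss_excess(0.30, 0.40, 0.75)
print(round(excess, 2))  # expected 0.58, observed 0.75 -> excess 0.17
```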
Why must models for drug synergy be order-invariant?
Because a drug combination is an unordered set: swapping the input order of the drugs must not change the predicted synergy, i.e., f(A, B) = f(B, A).
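One simple way to get order invariance is to combine the two drug feature vectors with symmetric operations before prediction. The sketch below uses element-wise sum and absolute difference feeding a linear scorer; the feature vectors and weights are hypothetical, and real models may instead use other symmetric poolings or average the predictions over both input orders.

```python
def pair_features(x_a, x_b):
    """Build symmetric pair features: element-wise sum and absolute difference."""
    summed = [a + b for a, b in zip(x_a, x_b)]
    abs_diff = [abs(a - b) for a, b in zip(x_a, x_b)]
    return summed + abs_diff

def predict_synergy(x_a, x_b, weights, bias=0.0):
    """Linear score on symmetric features, so swapping the drugs changes nothing."""
    feats = pair_features(x_a, x_b)
    return sum(w * f for w, f in zip(weights, feats)) + bias

# Hypothetical drug descriptors and model weights
drug_a = [0.2, 1.5, -0.7]
drug_b = [1.1, 0.3, 0.4]
weights = [0.5, -0.2, 0.1, 0.3, 0.8, -0.4]

print(predict_synergy(drug_a, drug_b, weights) == predict_synergy(drug_b, drug_a, weights))
```

Plain concatenation of the two vectors would not have this property, since swapping the drugs would move each feature to a position with a different weight.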