lecture 2 Flashcards
(35 cards)
Why health and biomedical informatics?
- modern healthcare and biomedical research are information intensive activities
- the crossover/intersection between health care/biomedical research and information technology
What is the demand for specialised workforce?
- people with the ability to combine ICT and biomedical skills and knowledge are in high demand in Australia and around the world
- A job market study from June 2012 showed that job posts in Health Informatics have increased ten times faster than other health related jobs in recent years
- need people with specific skills: people able to understand both the language of medicine and the language of information technology
- well paid in many countries
What is the rationale?
Background
• health care, biomedical research and public health are information intensive activities:
- medical images and clinical records
- DNA sequencing, molecular data
- literature and public databases
- clinical trials, biobanks, GWAS
• new data types (extremely complex and heterogeneous) are being generated at an unprecedented pace
How long did/does it take to decode the human genome?
- first decoded in 2003 after one decade of work
- nowadays it takes one day
What is Pubmed?
- growing exponentially
- 5000 biomedical research articles are published daily
- over 22 million articles
What projects has the Human Genome Project led to?
- human microbiome project
- exposome alliance project
- ENCODE genome regulation
- Human Epigenome Project
- phenome levels (proteome, metabolome)
- inter and intra individual genetic variation: 1000 genomes, mapping human genetic variation
What is Big Data?
• global size of “Big Data” in Healthcare stands at roughly 150 Exabytes (1 Exabyte = 10^18 bytes) in 2001, increasing at a rate between 1.2 and 2.4 Exabytes per year (SA)
defined by the 4 Vs:
• volume: Data at Rest, terabytes to exabytes of existing data to process
• velocity: data in motion, streaming data, milliseconds to seconds to respond
• variety: data in many forms, structured, unstructured, text, multimedia
• veracity: data in doubt, uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations
What is small data?
- our individual digital traces
- personal devices specifically designed for self-tracking (Fitbit)
- social networks, search engines, mobile operators, online games, and e-commerce sites that we access every day
- our everyday behaviours are becoming data
- data that are just about me, over time
- data about us, but not being provided to us
- from chronic pain to depression to memory enhancement and Crohn’s Disease
- generating evidence where n=me
- issues: open data, privacy, standards and tools
What are issues with informatics?
- how can we efficiently collect, store, search, integrate, analyse and visualise all of this information?
- how can we use research data to model and simulate human physiology and pathology?
- how can we facilitate the translation of research findings into clinical solutions?
- which new information processing methods will be needed to respond to the emerging research approaches?
- the tools you use are the result of our research
How do we connect levels of biomedical information?
- connecting different levels of biomedical information
- bottom-up approaches (from gene/molecule to environment)
- top-down approaches (reverse)
- need to link health informatics (population data, clinical data, patient generated data) with bioinformatics (genomic data, gene expression data, proteomics and metabolomics data)
- relationship between genotype, phenotype
- everything must also be connected with the complex interplay between itself and the environment
Health vs bioinformatics?
- bioinformatics is different from health informatics
- increasing opportunities for interaction
- bioinformatics and computational systems biology
- health and biomedical informatics
Why is working with clinical data so hard? Why is healthcare data different?
Humans are a result of evolution - not perfect - many levels
Data about humans that arises from a growing number of sources and contexts:
- clinical research
- clinical practice - EHRs
- patient and disease registries
- mHealth apps
- Smart devices and sensors
- Environmental data
- Social media data
Why different?
• distributed (EMR, clinical departments)
• different formats (text, images, numeric, videos)
• same data exists in different systems
• patient generated data
• data is structured and unstructured
• inconsistent/variable definitions
• new data coming out every day
• complexity of data (the human body)
• changing regulatory requirements
• privacy issues
What is biomedical informatics?
- informatics is the science of information
- information is data plus meaning
- biomedical informatics is the science of information as applied to or studied in the context of biomedicine
- informaticians study information (data + meaning, in contrast to focusing exclusively on data)
- thus, practitioners must understand the context or domain (biomedicine)
- IT is different from biomedical informatics
- IT is basically the technology that we use to process information
- informatics is about providing meaning to data
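The "information is data plus meaning" idea above can be illustrated with a minimal Python sketch. The `Observation` class, its field names and the example values are hypothetical, chosen only for illustration; they do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """A data value plus the context that gives it meaning (hypothetical structure)."""
    value: float
    quantity: str  # what was measured
    unit: str      # unit of measurement
    subject: str   # who it was measured on

    def describe(self) -> str:
        return f"{self.quantity} of {self.subject}: {self.value} {self.unit}"


# Data alone: a bare number that cannot be interpreted.
raw_data = 7.2

# Information: the same number with its context attached (illustrative values).
information = Observation(value=7.2, quantity="fasting glucose",
                          unit="mmol/L", subject="patient A")
print(information.describe())
```

The number is identical in both cases; only the second form tells a reader what it refers to, which is the distinction the card draws between IT (processing the value) and informatics (supplying the meaning).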
What are some of the big challenges currently being addressed?
- phenome –> genome * exposome
- human → environmental sensors, phenomic sensors, genomic sensors
- environmental sensors → environmental risk factors (pollution, radiation, toxic agents,…)
- phenomic sensors → physiological, biochemical parameters (cholesterol, temperature, glucose, heart rate…)
- genomic sensors: biomarkers (DNA sequence, proteins, gene expression, epigenetics)
- all of these combined → integrated personal health record
challenge:
• how can we measure environmental exposure?
• e.g. diabetes
• measuring the exposome
• environment-wide association study on Type 2 Diabetes mellitus
• 266 environmental factors
• future: combined: GWAS-EWAS?
What is an example of a new way of presenting/visualising data?
- microarrays: analysis of gene expression in cancer
- how this is translated into survival curve
- ontologies: systems that are expressed in a standardised way, so you could analyse data from two different places
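As a rough illustration of why a standardised vocabulary lets you analyse data from two different places, the sketch below maps each site's local labels onto shared term identifiers before counting. The site labels, the `TERM:...` identifiers and the `standardise` helper are all invented for this example; a real ontology would also encode relationships between terms, which this sketch leaves out.

```python
from collections import Counter

# Hypothetical mappings from each site's local label to a shared term identifier.
SITE_A_MAP = {"T2DM": "TERM:0001", "HTN": "TERM:0002"}
SITE_B_MAP = {"diabetes type II": "TERM:0001", "high blood pressure": "TERM:0002"}

# Invented records as each site might store them.
site_a_records = ["T2DM", "HTN", "T2DM"]
site_b_records = ["diabetes type II", "high blood pressure"]


def standardise(records, mapping):
    """Translate local labels into shared term identifiers."""
    return [mapping[label] for label in records]


# Once both sites use the same identifiers, their records can be pooled.
combined = Counter(
    standardise(site_a_records, SITE_A_MAP)
    + standardise(site_b_records, SITE_B_MAP)
)
print(combined)
```

Without the mapping step, "T2DM" and "diabetes type II" would be counted as two unrelated conditions.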
How can we extract knowledge from the literature?
- text mining
- automatically scan through abstracts and extract complex networks of interrelationships
- way of filtering information for clinician/researcher
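One very simple form of the text mining described above is term co-occurrence: count how often two terms of interest appear in the same abstract, then treat the counts as edges in a network. The sketch below assumes a small hand-written vocabulary and three invented sentences standing in for abstracts; real systems use far richer language processing.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical vocabulary of terms to look for (an assumption for this sketch).
TERMS = {"braf", "melanoma", "pik3ca", "breast cancer"}

# Invented sentences standing in for real abstracts.
ABSTRACTS = [
    "BRAF mutations are frequent in melanoma.",
    "PIK3CA mutations occur in breast cancer.",
    "Melanoma patients with BRAF mutations were studied.",
]


def terms_in(text):
    """Return the vocabulary terms that occur as whole words in the text."""
    lowered = text.lower()
    return {t for t in TERMS if re.search(r"\b" + re.escape(t) + r"\b", lowered)}


def cooccurrence(abstracts):
    """Count how many abstracts mention each pair of terms together."""
    counts = Counter()
    for abstract in abstracts:
        found = sorted(terms_in(abstract))
        counts.update(combinations(found, 2))
    return counts


network = cooccurrence(ABSTRACTS)
for (term_a, term_b), weight in network.most_common():
    print(f"{term_a} -- {term_b}: {weight}")
```

Pairs with high counts are candidates for a clinician or researcher to look at first, which is the filtering role the card mentions.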
What is increasing every day?
- need for patient-specific decision support assistance
- number of facts is out of capacity of any human mind
- traditional health care (i.e. decisions by clinical phenotype) vs decisions that also take into account structural genetics (e.g. SNPs, haplotypes) and functional genetics (gene expression profiles, proteomics and other effector molecules)
- need to deliver systems that can overcome this, function as reminders, alerts, that can send messages to clinicians saying ‘hey, be careful, this patient could have an adverse drug reaction because there is an incompatibility between x and y’
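The alerting behaviour described above can be sketched as a lookup against a table of known incompatibilities. This is a minimal sketch only: the `INTERACTIONS` table, the placeholder drug names `drug_x`, `drug_y` and `drug_z`, and the `check_prescription` function are hypothetical and carry no clinical meaning.

```python
# Hypothetical interaction table; the pair and the message are placeholders only.
INTERACTIONS = {
    frozenset({"drug_x", "drug_y"}): "possible adverse reaction between drug_x and drug_y",
}


def check_prescription(current_drugs, new_drug):
    """Return an alert for every known incompatibility with the new drug."""
    alerts = []
    for drug in current_drugs:
        # frozenset makes the lookup independent of the order of the pair.
        message = INTERACTIONS.get(frozenset({drug, new_drug}))
        if message:
            alerts.append(f"ALERT: {message}")
    return alerts


# A patient already on drug_x and drug_z is about to be prescribed drug_y.
for alert in check_prescription(["drug_x", "drug_z"], "drug_y"):
    print(alert)
```

A patient-specific system would extend the table with genetic information about the patient, as the card suggests, rather than drug pairs alone.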
What is the role of informatics in new taxonomy of disease?
- stratification of disease - ICD 11 - US Nat Academy - Towards precision medicine
- new taxonomy based on human molecular biology
- exposome
- signs and symptoms
- genome
- epigenome
- microbiome
- other types of patient data
- individual patients
- e.g. skin, colon, parathyroid - BRAF mutation
- MD Anderson CC - Breast, Ovarian, Uterine, Cervical – PIK3CA Mutation trial
- in the future will classify cancers not in terms of where they are seen but by what is causing them
What is network and systems medicine?
- has a role in informatics at all levels
- personalised and participatory medicine
- preventative medicine
- social component of disease
How do we access information about genetic diseases?
- there are many online bioinformatics resources that offer updated and reliable information on the molecular causes of genetic diseases
- different methods of search can be used (catalogues, search engines, databases)
- navigation across the large number of resources is not straightforward and discerning their quality and reliability poses challenges for clinicians and biomedical researchers
What is another layer of complexity?
- all/most of the databases are interconnected
* spaghetti
What are questions you can get databases to answer?
- what are the main features of the disease?
- are there any drugs for the disease?
- are there any gene therapies or clinical trials for the disease?
- what laboratories perform genetic tests for the disease?
- what genes cause the disease?
- on which chromosomes are these genes located?
- what mutations have been found in these genes?
- what names are used to refer to these genes?
- what are the proteins coded by these genes?
- what are the functions of the gene product?
- what is the 3D structure for these proteins?
- what are the enzymes associated with these proteins?
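Questions such as "what genes cause the disease?" can also be put to a database programmatically. The sketch below only builds a search URL for the NCBI E-utilities `esearch` endpoint and does not send it. Using E-utilities at all is an assumption of this example, not something the lecture prescribes, and the gene name and field tags in the query are illustrative.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database, term, max_results=5):
    """Build (but do not send) an esearch request URL."""
    params = {
        "db": database,          # which database to search, e.g. "gene" or "pubmed"
        "term": term,            # the search expression
        "retmax": max_results,   # maximum number of identifiers to return
        "retmode": "json",       # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Illustrative query: records for the human BRAF gene.
url = build_esearch_url("gene", "BRAF[Gene Name] AND human[Organism]")
print(url)
```

Opening the printed URL in a browser, or fetching it with an HTTP client, should return a list of matching record identifiers that can then be looked up in detail.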
What are the main centres?
- US Gov - NIH - NLM - NCBI (similar to EBI, offer information across the whole spectrum, genes, genetic information, proteins, metabolites, 3D structures)
- EC – EBI
- DDBJ (DNA Data Bank of Japan) - focussed on nucleotide sequence data (Japan)
- Switzerland - SIB - Expasy - protein data
- usually will offer you a window where you can make a search
- often offer training resources, online short courses
• in principle, everything you can get from these centres is reliable, and the centres themselves are well funded and expected to last for a long time
What is NAR?
- second strategy in searching for information
- catalogues of resources
- Nucleic Acids Research
- free to access issue on databases
- peer reviewed
- high reliability, high quality
- easy to navigate
- tree