Lecture 4: Clinical Outcomes Studies Flashcards
(28 cards)
There are two kinds of outcome measures. What are they?
Questionnaire-Based
Performance-Based
Which kind of outcome measure is completed by the therapist or the patient (self-reported)?
* Is it scored by preselecting a qualitative or quantitative value for the possible answers?
Questionnaire-Based (we're asking the patient - this is a subjective measurement based on their experience / opinion of their limitations)
Quantitative value (because it's just scoring the numbers they selected)
Which kind of outcome measure requires the patient to perform a set of tasks or movements? Can be scored via objective measurement (distance or time)
* Are its values assigned qualitatively or quantitatively?
Performance-Based outcome measure
Qualitatively (though it depends on the test)
* Think of the walking test: there are no #'s here, you're just watching their gait, so it's qualitative. However, something like the Berg Balance Scale has times attached to it, so it would be quantitative
This is the LESS (Landing Error Scoring System) test; it falls under performance-based outcome measures. It evaluates their landing mechanics (qualitative examination)
However, it's not subjective, because the patient's performance provides objective data
The score essentially equates to their risk of getting hurt in sport
What makes a clinical outcome measure useful? (9)
* What are all these things together called?
1) Can use it as a baseline measurement and compare it to something else later (test, intervention, retest)
2) High validity - the ability of the test to measure the construct we're talking about
* If I want a test to measure balance, I'm not going to do an MMT
3) Reliable - we want to make sure the outcome measure gives consistent results every time
4) Generalizable across age ranges (I can use it in a wide variety of people)
5) Accounts for ceiling/floor effects
6) Direct relationship to treatment goals (does it relate to what I'm trying to accomplish?)
7) Responsiveness - does a big change need to happen for it to mean anything?
8) Performance-based is always better than questionnaire-based
9) Assesses risk/success (are they more likely to … or less likely to … if they meet this score)
* Helps us w/ their prognosis / chances of getting better based on their score
All together = psychometric properties
What's better, a performance-based or questionnaire-based outcome measure?
Performance = more objective results
The statistical measurements collectively are known as the measure's _
Psychometric properties (basically talking about an outcome measure here)
* Or maybe more broadly any test and all its statistical measurements
“The technical qualities of a test or assessment tool that indicate its statistical strength and usefulness”
all of these things
KNOW: You can find studies looking at the psychometric properties of a test
* Because in order to establish the validity/reliability of a test, we need a research article / study to tell us
* Some will look at reliability, some at validity, and some at clinical meaningfulness - meaning: is this a test that's only good if we have a baseline to compare against later, is it good where we just need one score to relate to future risk, or do we have other data we can compare it to?
this slide is kind of a duplicate of the next ones
Studies measuring outcome measures look at what 3 things?
1) Reliability - do we get consistent results?
2) Validity - Are we measuring what we intend to?
3) Clinical Meaningfulness - How useful is this test in what it intends to do?
* Track improvement
* Compare to normative data
* Assess risk
What are our 4 types of validity?
How are these measured (i.e., what correlation coefficient do we use)?
1) Criterion
2) Face
3) Construct
4) Content
Measured w/ Spearman's correlation
What is criterion validity?
Comparing the test/outcome measure to the gold standard/reference standard
* How well does it measure what we're intending to measure, according to the best that's out there right now
EX: When testing balance / fall risk, we could compare the Tinetti to whatever the gold standard is for balance
What is face validity?
* How is it measured?
Taking the outcome measure/test at face value. Just looking at this test, does it make sense that it's measuring what we think it's measuring?
The degree to which an assessment or test subjectively appears to measure the variable or construct that it is supposed to measure
Think: a gait/walking test wouldn't make sense for measuring UE strength, so it would have low face validity
Measured w/ Spearman's coefficient (r)
What is a construct?
* How is it measured?
An abstract variable
* Variable = anything we're trying to measure
* Can be abstract like “health and wellness” - how do we measure this?
Fall risk is a construct, so is balance
ROM = direct, so not a construct (it's not abstract)
* We can take out a goni and measure it
Construct validity: does the test accurately reflect the theoretical (abstract) concept it's supposed to measure?
* EX: measuring depression using a specific questionnaire, the Beck Depression Inventory (BDI)
* Concept: Depression is the construct we want to measure. It encompasses symptoms like sadness, loss of interest, and fatigue
* Hypothesis: If the BDI accurately measures depression, it should correlate with other established measures of depression, such as the Hamilton Depression Rating Scale (HDRS)
* Researchers would administer both the BDI and the HDRS to the same group of people. If the scores from both assessments show a strong correlation, this supports the construct validity of the BDI
* Additionally, researchers might check whether BDI scores differentiate between groups known to have different levels of depression (individuals diagnosed w/ depression vs those w/o). If the BDI effectively distinguishes these groups, it further supports its construct validity
Measured w/ Spearman's coefficient (r)
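A minimal sketch of the BDI vs HDRS check above, in plain Python. The scores are made up for illustration, and the helper names (average_ranks, spearman) are hypothetical, not from the lecture. Spearman's coefficient is computed here as the Pearson correlation of the ranks.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's coefficient = Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores for six people on the two depression measures.
bdi_scores = [5, 12, 19, 24, 31, 40]
hdrs_scores = [3, 9, 8, 15, 20, 27]
rho = spearman(bdi_scores, hdrs_scores)
print(f"Spearman correlation between BDI and HDRS: {rho:.2f}")
```

A value close to 1 would support construct validity in this made-up sample; a value near 0 would not.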
What is construct validity?
* How is it measured?
How well the test/outcome measure captures that abstract thing
* How well does it measure the thing we're trying to measure
EX: If I'm trying to measure obesity, the standard would be BMI; however, it doesn't account for muscle mass
* So the construct validity is diminished for the BMI test
Measured w/ Spearman's coefficient (r)
What is content validity?
* How is it measured?
Do the individual pieces of the test/outcome measure actually measure what we're trying to measure?
EX: ROM is 1 measure that's done 1 time and we're done. However, w/ the DGI there are different parts (eyes open, eyes closed, picking something up off the ground - these are different items within the test). Do all these items measure some part of that construct? Well, all those items do relate to walking/balance, which is what the test is trying to measure (functional dynamic balance), so it would have high content validity
Measured w/ Spearman's coefficient (r)
* Do these scores correlate w/ one another (do people w/ bad balance score badly on this test?)
In which group can't a test with a floor effect detect change?
Unable to detect change in the low performers
* Basically the low performers all score near the bottom, so the test can't distinguish among them or show when one of them improves
* The test is too challenging
* Think of taking a bunch of people w/ SCI and telling them to walk - if none of them can walk, we wouldn't know who's recovering better
In which group can't a test with a ceiling effect detect change?
Can't detect change in the high performing group
* This is because the test is too easy, so the high performers max out the score and can end up scoring the same as the low performing group
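One rough way to spot a floor or ceiling effect is to count how many people land on the lowest or highest possible score. This is a sketch with made-up scores; the function name floor_ceiling_share and the 0-10 scoring range are assumptions for illustration.

```python
def floor_ceiling_share(scores, min_score, max_score):
    """Return the fraction of people at the lowest and highest possible score."""
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_score) / n
    at_ceiling = sum(1 for s in scores if s == max_score) / n
    return at_floor, at_ceiling

# Hypothetical walking test scored 0-10, given to ten people with SCI.
sci_scores = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]
floor_share, ceiling_share = floor_ceiling_share(sci_scores, 0, 10)
print(f"At floor: {floor_share:.0%}, at ceiling: {ceiling_share:.0%}")
```

When most of the group sits at the floor like this, the test is too hard for them and can't show who is recovering better.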
The ability of the test to detect changes over time in the construct being measured
Responsiveness
It's basically how readily a test picks up change. Think inches vs. cm: cm is more responsive because the finer unit picks up smaller changes, so change shows up sooner
* More responsive = detects more change in less time
Think of measuring gait speed. If my clock only measures down to the second, it would be less responsive than one that goes down to a 10th of a second.
NOTE: we also want to know how much change matters. In gait speed, does a 2 second change matter? If the time had to change by 10 seconds to put them in a different category, that test would be considered less responsive (i.e., a large change in gait speed needs to occur before the outcome the test predicts changes) - we need a big change on the test to see an actual change
Inter-rater reliability = consistent results between raters
Intra-rater reliability = consistent results within one rater
Reliable = same result over and over
How well all the items relate to one another
Internal consistency
How well a test's items correlate with each other
NOTE: content validity = all these different items and how well they relate to the construct. Internal consistency relates all the items in that one test to one another (think about a test w/ multiple items and how well they all relate to one another)
Think about having a test with questions about both math and reading. Those test items don't have good internal consistency because the questions don't really relate to one another. However, they have high construct validity because they're measuring the abstract variable (which is your overall knowledge leaving high school)
* However, if you only look at the math questions in the math section, they would have high internal consistency
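Internal consistency is commonly summarized with Cronbach's alpha. That statistic is not named in these notes, so treat this as an outside assumption. A minimal sketch with made-up scores for five students; math_items and mixed_items are hypothetical data mirroring the math/reading example.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per test item, each holding every person's score on that item."""
    k = len(item_scores)
    # Total test score for each person (sum across items).
    totals = [sum(person) for person in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical: three math items that rise and fall together across five students.
math_items = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 3, 3, 5, 5]]
# Hypothetical: one math item and one reading item that don't track each other.
mixed_items = [[1, 2, 3, 4, 5], [3, 5, 1, 4, 2]]

alpha_math = cronbach_alpha(math_items)
alpha_mixed = cronbach_alpha(mixed_items)
print(f"math only: {alpha_math:.2f}, math + reading: {alpha_mixed:.2f}")
```

The math-only items give a much higher alpha than the mixed set, which matches the idea that items need to relate to one another to be internally consistent.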
The error that tends to occur naturally
Standard error of measurement
How off you're likely to be due to chance (think 5 degrees for a goni)
* Think of a stopwatch: maybe it's one second. You're likely to be a second off the actual time in either direction -> that's the standard error of measurement
The smallest change that we can measure that's unlikely to be due to random error
* The amount of change that needs to occur for us to know it wasn't error/chance, but that a true change has occurred
Minimal Detectable Change (MDC)
Should be bigger than the standard error of measurement
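The notes don't give formulas, but two commonly used ones are SEM = SD x sqrt(1 - reliability) and MDC95 = 1.96 x sqrt(2) x SEM. A minimal sketch under that assumption; the SD and reliability values below are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the sample standard deviation and a reliability coefficient (0 to 1)."""
    return sd * sqrt(1 - reliability)

def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 corresponds to 95%)."""
    return z * sqrt(2) * sem

# Hypothetical goniometry data: SD of 10 degrees, reliability of 0.91.
sem = standard_error_of_measurement(sd=10, reliability=0.91)
mdc = minimal_detectable_change(sem)
print(f"SEM: {sem:.1f} degrees, MDC95: {mdc:.1f} degrees")
```

With these formulas the MDC always comes out bigger than the SEM, since it is the SEM multiplied by a factor greater than 1.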
The change that's needed for the pt to say that they feel somewhat better or somewhat worse - this is enough of a change to make a difference in the pt's life
Minimal Clinically Important Difference (MCID)
The smallest change in a treatment outcome that a patient considers important and would warrant a change in their management
If I'm measuring ROM and my standard error of measurement is 5 and my minimal detectable change is 7 - if I get a change in elbow flexion of 7 degrees, that might not mean much to the pt even though it's the minimal detectable change - they still don't have enough to really tell it's different. But if they had a 12-15 degree change, now maybe they can use that arm better and are able to see those changes (i.e., "yeah, I feel somewhat better") = Minimal Clinically Important Difference (MCID)
Standard error of measurement, minimal detectable change, and minimal clinically important difference are all important because we need to know what a significant change is when we do test, treat, retest (otherwise we wouldn't really have a baseline and know if it changed)
These are also fantastic for writing goals off of. So we should definitely write goals based on at least the minimal detectable change (not on a change that could be due to chance alone)
Maybe make the short term goal the MDC and the long term goal something greater than the MCID
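A small sketch of how a retest change could be read against these thresholds. The MDC of 7 degrees comes from the elbow flexion example; the MCID of 12 degrees is an assumed cutoff picked from the 12-15 degree range mentioned there, and interpret_change is a hypothetical helper name.

```python
def interpret_change(change, mdc, mcid):
    """Label a test-retest change against the MDC and MCID thresholds."""
    size = abs(change)
    if size < mdc:
        return "within measurement error"
    if size < mcid:
        return "real change, but may not matter to the patient"
    return "clinically important change"

# Elbow flexion example: MDC = 7 degrees, assumed MCID = 12 degrees.
for change in (4, 7, 13):
    print(change, "degrees:", interpret_change(change, mdc=7, mcid=12))
```

This lines up with the goal-writing idea: a short term goal at the MDC lands in the middle label, and a long term goal past the MCID lands in the last one.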