Lecture 4: Clinical Outcomes Studies Flashcards by Tyler Winn

There are two kinds of outcome measures. What are they?

Questionnaire-Based
Performance-Based

Which kind of outcome measure is completed by the therapist or the patient (self-reported)?
* Is it scored by preselecting a qualitative or a quantitative value for the possible answers?

Questionnaire-based (we're asking the patient; this is a subjective measurement based on their experience and their opinion of their limitations)

Quantitative value (because it's just scoring the numbers they selected)

Which kind of outcome measure requires the patient to perform a set of tasks or movements? It can be scored via objective measurement (distance or time).
* Are its values assigned qualitatively or quantitatively?

Performance-based outcome measure

Qualitatively
* Think of the walking test: there are no numbers here, you're just watching their gait, so it's qualitative. However, something like the Berg Balance Scale has times attached to it, so it would be quantitative.

This is the LESS test; it falls under performance-based outcome measures. It evaluates landing mechanics (a qualitative examination).

However, it's not subjective, because the patient is providing objective data.

The score essentially equates to their risk of getting hurt in sport.

What makes a clinical outcome measure useful? (9)
* What are all these things together called?

1) Can use it as a baseline measurement and compare it to something else later (test, intervention, retest)

2) We need to make sure it has high validity (the ability of the test to measure the construct we're talking about)
* If I want a test to measure balance, I'm not going to do an MMT

3) Reliable - we want to make sure the outcome measure also gives consistent results every time

4) generalized across age ranges (I can use it in a wide variety of people)

5) Accounting for ceiling/floor effects

6) Direct relationship to treatment goals (does it relate to what I'm trying to accomplish?)

7) Responsiveness - does a big change need to happen for it to mean anything?

8) Performance-based is always better than questionnaire-based

9) Assess risk/success (are they more likely to … or less likely to … if they meet this score?)
* Helps us w/ their prognosis / chances of getting better based on their score

All together = psychometric properties

Which is better, a performance-based or a questionnaire-based outcome measure?

Performance = more objective results

The statistical measurements are collectively known as the measure's _

Psychometric properties (basically talking about an outcome measure here)
* Or maybe, more broadly, any test and all its statistical measurements

“The technical qualities of a test or assessment tool that indicate its statistical strength and usefulness”

all of these things

KNOW: You can find studies looking at the psychometric properties of a test
* Because in order to establish the validity/reliability of a test, we need a research article / study to tell us
* Some will look at reliability, some will look at validity, and some will look at clinical meaningfulness. Clinical meaningfulness asks: is this a test that's only good if we have a baseline to compare against later, is it one where a single score is enough to relate to future risk, or do we have other data we can compare it to?

this slide is kind of a duplicate of the next ones

Studies measuring outcome measures look at what 3 things?

1) Reliability - do we get consistent results?

2) Validity - Are we measuring what we intend to?

3) Clinical Meaningfulness - How useful is this test in what it intends to do?
* Track improvement
* Compare to normative data
* Assess risk

What are our 4 types of validity?

How are these measured (i.e., what correlation coefficient do we use)?

1) Criterion
2) Face
3) Construct
4) Content

Measured w/ Spearman's correlation

What is criterion validity?

Comparing the test/outcome measure against the gold standard/reference standard
* How well does it measure what we're intending to measure, according to the best that's out there right now?

EX: When testing balance / fall risk, we could compare the Tinetti to whatever the gold standard is for balance

What is face validity?
* How is it measured?

Taking the outcome measure/test at face value: just looking at this test, does it make sense that it's measuring what we think it's measuring?

The degree to which an assessment or test subjectively appears to measure the variable or construct that it is supposed to measure

Think: a gait/walking test wouldn't make sense for UE strength, so it would have low face validity

Measured w/ Spearman's coefficient (r)

What is a construct (as in construct validity)?
* How is it measured?

An abstract variable
* Variable = anything we're trying to measure
* Can be abstract like “health and wellness” - how do we measure this?

Fall risk is a construct; so is balance

ROM = direct, so not a construct (it's not abstract)
* We can take out a goni and measure it

Construct validity: does the test accurately reflect the theoretical (abstract) concept it's supposed to measure?
* EX: measurement of depression using a specific questionnaire
* Concept: Depression is the construct we want to measure. It encompasses symptoms like sadness, loss of interest, and fatigue
* Hypothesis: If the BDI accurately measures depression, it should correlate with other established measures of depression, such as the Hamilton Depression Rating Scale (HDRS)
* Researchers would administer both the BDI and the HDRS to the same group of people. If the scores from both assessments show a strong correlation, this supports the construct validity of the BDI
* Additionally, researchers might check whether BDI scores differentiate between groups known to have different levels of depression (individuals diagnosed w/ depression vs those w/o). If the BDI effectively distinguishes these groups, it further supports its construct validity

Measured w/ Spearman's coefficient (r)
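A rough sketch of the "give both questionnaires to the same people and correlate the scores" idea. The scores below are made up for illustration (they are not real BDI or HDRS data), and the helper names are hypothetical. It relies on the fact that Spearman's coefficient is the Pearson correlation computed on the ranks of the scores.

```python
def rank(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical scores for six people on two depression questionnaires.
bdi = [5, 12, 19, 23, 30, 41]
hdrs = [3, 8, 10, 17, 16, 25]
rho = spearman(bdi, hdrs)
print(round(rho, 3))
```

A value close to 1 would support the idea that the two questionnaires are ordering people in the same way.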

What is construct validity?
* How is it measured?

How well the test/outcome measure captures that abstract thing
* How well does it measure the thing we're trying to measure?

EX: If I'm trying to measure obesity, the standard would be BMI; however, it doesn't account for muscle mass
* So the construct validity of BMI is diminished

Measured w/ Spearman's coefficient (r)

What is content validity?
* How is it measured?

Do the individual pieces of the test/outcome measure actually measure what we're trying to measure?

EX: ROM is 1 measure that's done 1 time and we're done. However, w/ the DGI there are different parts (eyes open, eyes closed, picking something up off the ground - these are different items within the test). Do all these items measure some part of that construct? Well, all those items do relate to walking/balance, which is what the test is trying to measure (functional dynamic balance), so it would have high content validity

Measured w/ Spearman's coefficient (r)
* Do these scores correlate w/ one another (do people w/ bad balance score badly on this test)?

Which group's change can't be detected when there is a floor effect?

Unable to detect change in the low performers
* Basically, the high performers would do so badly that the test wouldn't really delineate them from the low-level performers
* The test is too challenging
* Think of taking a bunch of people w/ SCI and telling them to walk: if none of them can walk, we wouldn't know who's recovering better

In which group can't change be detected when there is a ceiling effect?

can’t detect change in the high performing group
* This is because the test is too easy, so they will score the same as the low performing group
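One simple way to spot a floor or ceiling effect in a set of scores is to count how many people land on the lowest or the highest possible score. This is only a sketch: the function name, the 0-10 scale, and the scores are all made up for illustration.

```python
def floor_ceiling_share(scores, min_score, max_score):
    """Return the fraction of scores at the scale minimum and at the maximum."""
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_score) / n
    at_ceiling = sum(1 for s in scores if s == max_score) / n
    return at_floor, at_ceiling


# Hypothetical scores on a 0-10 test that is too hard for this group.
hard_test = [0, 0, 0, 0, 1, 0, 2, 0]
floor_share, ceiling_share = floor_ceiling_share(hard_test, 0, 10)
print(floor_share, ceiling_share)
```

When most of the group sits at the bottom score (floor) or the top score (ceiling), the test cannot separate those people from each other or show their change over time.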

The ability of the test to detect changes over time in the construct being measured

Responsiveness

It's basically how quickly a test picks up change. Think inches vs cm: the cm is more responsive because it can detect more change in a smaller amount of time than inches
* More responsive = detects more change in less time

Think of measuring gait speed. If my clock only measures down to the second, it would be less responsive than one that goes down to the 10th of a second.

NOTE: we also want to know how much change matters. Does a 2-second change in gait speed matter? If it took a change of 10 seconds to put them in a different category, that test would be considered less responsive (i.e., a large change in gait speed needs to occur before the test's prediction changes) - we need a big change on the test to see an actual change

Inter-rater reliability = consistent results between raters

Intra-rater reliability = consistent results within one rater

Reliable = same result over and over

how well all the items relate to one another

Internal consistency

How well a test's items correlate with each other

NOTE: content validity = how well all the different items relate to the construct, while internal consistency = how well all the items in that one test relate to one another (think about a test w/ multiple items and how well they all relate to one another)

Think about a test with questions about both math and reading. Those test items don't have good internal consistency, because the questions don't really relate to one another. However, they have high construct validity, because they're measuring the abstract variable (your overall knowledge leaving high school)
* However, if you only look at the math questions in the math section, they would have high internal consistency
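The cards don't name a statistic for internal consistency. One commonly used option is Cronbach's alpha, sketched here as an assumption rather than as the lecture's method. The item scores are made up: two "math" items that move together, and a "math" item paired with a "reading" item that is unrelated to it.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per test item, each covering the same people
    in the same order.
    """
    k = len(items)
    # Total score per person = sum of that person's score on every item.
    totals = [sum(person) for person in zip(*items)]
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical scores for four people.
math_1 = [1, 2, 3, 4]
math_2 = [2, 3, 4, 5]   # rises and falls with math_1
reading = [3, 1, 4, 2]  # unrelated to math_1

alpha_math = cronbach_alpha([math_1, math_2])
alpha_mixed = cronbach_alpha([math_1, reading])
print(round(alpha_math, 2), round(alpha_mixed, 2))
```

The two math items give an alpha at the top of the range, while mixing a math item with an unrelated reading item drops alpha to about zero, which matches the math-and-reading example.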

The error that tends to occur naturally

Standard error of measurement (SEM)

How far off you're likely to be due to chance (think 5 degrees for a goni)
* Think of a stopwatch: maybe it's one second. You're likely to be a second off the actual time in either direction -> that's the standard error of measurement

The smallest change we can measure that's unlikely to be due to random error
* The amount of change that needs to occur for us to know it wasn't error/chance, and that a true change has occurred

Minimal Detectable Change (MDC)

Should be bigger than the standard error of measurement
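The cards don't give formulas for these. A commonly used convention, shown here as an assumption, is SEM = SD x sqrt(1 - reliability) and MDC = z x sqrt(2) x SEM, with z = 1.96 for a 95% confidence level. The SD and reliability values below are made up for illustration.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement from the score SD and a reliability
    coefficient between 0 and 1 (e.g. a test-retest coefficient)."""
    return sd * math.sqrt(1 - reliability)


def mdc(sem_value, z=1.96):
    """Minimal detectable change; z = 1.96 corresponds to 95% confidence."""
    return z * math.sqrt(2) * sem_value


# Hypothetical values: SD of 10 degrees, reliability of 0.75.
sem_deg = sem(10, 0.75)
mdc95_deg = mdc(sem_deg)
print(round(sem_deg, 2), round(mdc95_deg, 2))
```

Because z x sqrt(2) is greater than 1 for any usual confidence level, the MDC computed this way always comes out larger than the SEM. Also note that higher reliability gives a smaller SEM, and so a smaller MDC.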

The change that's needed for the pt to say that they feel somewhat better or somewhat worse - this is enough of a change to make a difference in the pt's life

Minimal Clinically Important Difference (MCID)

The smallest change in a treatment outcome that a patient considers important and that would warrant a change in their management

If I'm measuring ROM and my standard error of measurement is 5 and my minimal detectable change is 7, then a 7-degree change in elbow flexion might not mean much to the pt even though it meets the minimal detectable change: it's still not enough for them to really notice a difference. But with a 12-15 degree change, maybe they can now use that arm better and can see those changes (i.e., “yeah, I feel somewhat better”) = minimal clinically important difference (MCID)
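The elbow example can be written as a small decision rule. This is a sketch only: the function name and labels are made up, the MDC of 7 degrees comes from the example, and the MCID of 12 degrees is an assumption taken from the low end of the "12-15 degree" range. It also assumes the MCID is at least as large as the MDC.

```python
def interpret_change(change, mdc, mcid):
    """Label an observed change against the MDC and MCID thresholds.

    Assumes mcid >= mdc; the direction of the change is ignored.
    """
    size = abs(change)
    if size < mdc:
        return "within measurement error"
    if size < mcid:
        return "real change, but may not matter to the patient"
    return "clinically important change"


# Elbow flexion ROM in degrees: MDC = 7 (from the example), MCID = 12 (assumed).
for change in (4, 7, 13):
    print(change, "->", interpret_change(change, mdc=7, mcid=12))
```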

Standard error of measurement, minimal detectable change, and minimal clinically important difference are all important because we need to know what counts as a significant change when we test, treat, and retest (otherwise we wouldn't know whether the baseline really changed)

These are also fantastic for writing goals. We should definitely write goals to at least the minimal detectable change (not to a change that could be due to chance alone)

Maybe make the short-term goal the MDC and the long-term goal something greater than the MCID

If we have a high standard error of measurement, what happens to reliability?

Goes down
* If the test has lots of error, those results are not going to be consistent

What happens to validity w/ a high MCID?

It increases
* A high MCID means that the changes measured by the tool reflect improvements or declines that are noticeable and impactful to patients. This ensures that the tool captures clinically significant changes rather than just statistical variations.

Questions to answer to assess the quality of an article.

Clinical meaningfulness = MCID

(28 cards)