Observation Flashcards

(57 cards)

1
Q

two basic classes / times for studies and evaluation

A
  • formative:
    at the beginning to inform about context and to study possible options
  • summative:
    to judge the impact of an HCI design
    (a summative evaluation of a design might be a formative one for the next step)
2
Q

why, what, where and when to evaluate

A
why: 
study question (check users' requirements, that they can use the product, and that they like it)

what:
a conceptual model, early prototypes of a new system and later, more complete prototypes, human behaviour…

where:
in natural and laboratory settings

when:
* formative: throughout design;
* summative: finished products can be evaluated to collect information to inform new products

3
Q

three classes of measures

A

user effectiveness

user efficiency

user satisfaction

4
Q

evaluation classes

A
  • setting
  • evaluation time
  • evaluation partner
  • result type
5
Q

controlled settings

A
  • setting conditions are controlled
  • non-controllable conditions are measured
  • e.g. lab experiments, living labs
6
Q

natural settings

A
  • study in ‘everyday’ and natural conditions that cannot be controlled
  • some, but not all non-controllable conditions can be measured
  • e.g. field studies, in-the-wild studies
7
Q

types of evaluation time

A

inspective:
* inspection / evaluation during the run of an experiment or during use

retrospective:
* evaluation after the run of the experiment or after use

short term: short session
long term: long session

8
Q

evaluation partners

A

the user:

  • gives direct feedback e.g. for use
  • best for gaining new insights into context
  • if it's an experiment: called “subject”

the expert:

  • allows for best practice information
  • reported expert experience may require many users / test subjects to be collected
9
Q

Result types

A

subjective:
* results cannot be directly compared between subjects

objective:
* results can be directly compared between subjects

quantitative:
* results are numbers

qualitative:
* results are text

10
Q

Interviews - Five key issues

A
  1. setting goals
    decide how to analyze data once collected
  2. Identifying participants
    decide who to gather data from
  3. relationship with participants
    clear and professional, informed consent when appropriate
  4. Triangulation
    look at data from more than one perspective
    collect more than one type of data, e.g. quantitative from experiments and qualitative from interviews
  5. Pilot studies
    small trial of main study
11
Q

Data recording

A
  • notes, audio, video, photographs can be used individually or in combination
  • always use a visual impression
  • different challenges and advantages with each combination
12
Q

three types of interviews

A

structured interviews

  • pre-developed questions
  • strictly following the wording
  • easy to carry out - but limited to the question set
  • more precise to evaluate

semi-structured interviews
* structured part + ‘open’ questions

unstructured interviews

  • used when little background information available
  • minimizes the influence of the questioner
13
Q

Running the interview - structure

A

Introduction - introduce yourself, explain the goals of the interview, reassure about the ethical issues, ask to record, present the informed consent form

warm-up - make first questions easy and non-threatening

main body - present questions in a logical order

a cool-off period - include a few easy questions to defuse tension at the end

closure - thank the interviewee, signal the end, e.g. switch off the recorder

14
Q

encouraging a good response

A
  • make sure purpose of study is clear
  • promise anonymity
  • ensure questionnaire is well designed
  • follow-up with emails, phone calls, letters
  • provide an incentive
  • 40% response rate is good, 20% is often acceptable
15
Q

Standard questionnaires used in HCI

A

SUS - system usability scale

TLX - NASA task load index

QUIS - Questionnaire for User interface satisfaction

CSUQ - Computer system usability questionnaire

16
Q

SUS - benefits and restrictions

A

+ very easy to scale (likert)
+ useful in small sample sizes with o.k. results
+ validity o.k. (you see differences in bad and good design)

  • Score 0-100 -> tends to be misread as a percentage
  • not diagnostic, just to classify
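
A minimal sketch of how the 0-100 SUS score is usually computed from the ten 1-5 Likert answers: odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5. The function name `sus_score` and the example answers are assumptions made for illustration.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten 1-5 answers."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - response
    return total * 2.5


# Hypothetical answer sheets, for illustration only.
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
neutral = sus_score([3] * 10)
print(best, neutral)
```

The result is a score on a 0-100 scale, not a percentage of anything, which is why it serves to classify a design rather than to diagnose it.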
17
Q

problems with online questionnaires

A
  • sampling is problematic if population size is unknown
  • preventing individuals from responding more than once can be a problem
  • individuals have also been known to change questions in email questionnaires
18
Q

Types of observation

A

direct observation in the field

  • structuring frameworks
  • degree of participation
  • ethnography

direct observation in controlled environments

indirect observation: tracking users’ activities

  • diaries, experience sampling method
  • interaction logging
  • video and photographs collected remotely by drones or other equipment
19
Q

Planning and conducting observation in the field

A
  • decide on how involved you will be: passive observer to active participant
  • how to gain acceptance
  • how to handle sensitive topics, e.g. culture, private spaces, etc.
  • how to collect the data:
    • what data to collect - what equipment to use - when to stop observing
20
Q

Ethnography

A

Goal: to experience the participant and their context

Ethnographers immerse themselves in the culture that they study

analyzing video and data logs can be time-consuming

collections of comments, incidents and artifacts are made

co-operation of people being observed is required

informants are useful

data analysis is continuous

interpretivist technique

questions get refined as understanding grows

reports usually contain examples

21
Q

online ethnography

A

interaction online differs from face-to-face

virtual worlds have persistence that physical worlds do not have

ethical considerations and presentations of results are different

22
Q

observations and materials that might be collected

A
  • activity or job descriptions
  • rules and procedures
  • descriptions of activities
  • recordings
  • informal interviews
  • diagrams (of the physical layout,…)
  • photographs, videos, workflow diagrams, process maps, …
23
Q

observation in a controlled environment

A

direct observation

  • think aloud techniques
  • also used in conjunction with other interview and questionnaire techniques

indirect observation

  • diaries
  • interaction logs
  • web analytics

video, audio, photos, notes are used to capture data in both types of observation

24
Q

Think Aloud

A

While using an application, the user continuously explains what they are thinking and what they are doing

Quality of the evaluation depends on

  • selection of test candidates
  • appropriate preparation of the candidates
  • appropriate setting so that a natural usage can be guaranteed
25
Think aloud preparation
  • explain the system
  • explain the setting
  • explain expectations

  1. using the scenarios prepared earlier, write a draft list of tasks
  2. try out the tasks and estimate how long they will take a participant to complete
  3. prepare a task sheet for the participants
  4. get ready for the test session
  5. tell the participants that it is the system that is under test, not them; explain and introduce the tasks
  6. participants start the tasks. Have them give you a running commentary on what they are doing, why they are doing it, and any difficulties or uncertainties they encounter
  7. encourage participants to keep talking
  8. when the participants have finished, interview them briefly about the usability of the prototype and the session itself. Thank them
  9. write up your notes as soon as possible and incorporate them into a usability report
26
Think aloud evaluation
  • mostly qualitative and subjective
  • ethnographic; delivers to-the-point experience for specific issues / problems
  • generalisations are very difficult and require a high level of experience
  • interpretations can be based on various psychological theories and models
27
Living labs
  • people's use of technology in their everyday lives can be evaluated in living labs
  • such evaluations are too difficult to do in a usability lab
28
Ubicomp Studies
  • are field studies, not lab studies
  • in situ: the results include measurements of the context
  • context and situation are not controlled
  • such studies are more expensive
  • more likely to yield novel insights and experiences
  • Ubicomp studies require additional effort
  • Ubicomp studies normally also require e.g. control conditions, pre-studies, calculation of the number of participants, selection of participants, data selection and statistics
29
3 main types of ubicomp field studies
study current behavior:
  • what are people doing now?

proof-of-concept studies:
  • does my technology function in the real world?

experience studies:
  • how does using my prototype change people's behaviour or allow them to do new things?
30
Wizard of Oz studies
  • good for proof of concept
  • a person simulates and controls the system from behind the scenes
  • uses a mock interface to interact with users
  • good for simulating a system that would be difficult to build
31
Experience Studies
Surveys
  • often used as a pre-study
  • carried out after any change of condition in a between-subject study
  • regular in-between surveys during a study to measure changes in participants' reactions

Logging
  • use the mobile device to also collect data about usage
32
Logging - design considerations
  • how will you use the logged data?
  • select appropriate data to log (at the right frequency)
  • make a list of specific questions that you expect to answer from the log data
  • will your logging help you know if the study is going smoothly?
33
Logging - web analytics
A system of tools and techniques for optimizing web usage by measuring, collecting, analyzing and reporting web data. It typically focuses on the number of web visitors and page views.
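
As an illustration of the two typical measures, the sketch below counts page views and unique visitors from a list of logged records. The record layout `(visitor_id, page)`, the function name `summarize_page_views` and the sample log are assumptions, not part of any particular analytics tool.

```python
from collections import Counter


def summarize_page_views(records):
    """Summarize an iterable of (visitor_id, page) tuples, one per page view."""
    records = list(records)
    views_per_page = Counter(page for _, page in records)
    visitors = {visitor for visitor, _ in records}
    return {
        "page_views": len(records),
        "unique_visitors": len(visitors),
        "views_per_page": dict(views_per_page),
    }


# Hypothetical log data, for illustration only.
log = [
    ("u1", "/home"),
    ("u2", "/home"),
    ("u1", "/pricing"),
    ("u3", "/home"),
    ("u1", "/home"),
]
summary = summarize_page_views(log)
print(summary)
```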
34
Experience Sampling Methodology (ESM)
  • ESM is a study method using questionnaires
  • participants are asked to fill out short questionnaires at various points throughout the day
  • this gives a different picture than asking participants to recall later

Considerations:
  • how often to ask the participant
  • how many questions
  • collect experience or sensor information
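
One possible way to decide when to prompt participants is to split the day into equal blocks and pick one random time per block, so prompts are spread out but not predictable. This is only a sketch of that idea; the function name `esm_schedule`, the 09:00-21:00 window and the five prompts per day are assumptions.

```python
import random
from datetime import datetime, timedelta


def esm_schedule(day_start, day_end, prompts, seed=0):
    """Return one random prompt time inside each of `prompts` equal blocks."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    block = (day_end - day_start) / prompts
    times = []
    for i in range(prompts):
        # Random offset within the i-th block of the day.
        offset = timedelta(seconds=rng.random() * block.total_seconds())
        times.append(day_start + i * block + offset)
    return times


# Hypothetical study day, for illustration only.
start = datetime(2024, 1, 15, 9, 0)
end = datetime(2024, 1, 15, 21, 0)
schedule = esm_schedule(start, end, prompts=5)
for moment in schedule:
    print(moment.strftime("%H:%M"))
```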
35
Study Design
For any study:
  A. start with a concrete research question
  B. answer the following questions:
    • what will your participants do during the study?
    • what data will you collect?
    • how long will the study be?
36
steps to a successful study
  1. Have a clear research goal and question
  2. Create a study design document containing:
    1. Research question / hypothesis
    2. Detailed participant profile
    3. Detailed method description (what will participants do)
    4. Detailed timeline description
    5. Types of data you collect
    6. Analysis method
    7. How you draw conclusions / validate the hypothesis
37
How long should your study be
Depends on type of study
  • experience studies (several weeks) are longer than proof-of-concept studies (several days)
  • studies of current behaviour may range from hours to weeks

Depends on novelty
  • usage of novel systems is often very different at the start (enthusiasm or scepticism) and after a longer period of use

Practical considerations
  • if the study requires much effort from the participants, you have to restrict the measurement time
  • frequency of need for interaction with participants: high frequency means shorter measurement time

Frequency of use
  • high frequency of use reduces the measurement time required
38
Things to consider when interpreting data
  • Reliability: does the method produce the same results on separate occasions?
  • Validity: does the method measure what it is intended to measure? (internal validity, external validity)
  • Ecological validity: does the environment of the evaluation distort the results? Is the result transferable to a general environment?
  • Biases: are there biases that distort the results?
  • Scope: how generalizable are the results?
39
selecting participants
Three questions to answer before you start:
  • representation of participants relative to the intended user group
  • grouping of participants
  • data sampling strategy
40
Representation of study participants
  • Representative participant set
  • Non-representative participant set
  • Be careful: many statistics assume a representative set
41
Grouping Participants
one group only or multiple groups

group selection based on
  • self-reported experience
  • frequency of use
  • amount of experience
  • demographics
  • different activities the participants have to perform
42
Sampling Strategy
Random sampling
  • everyone on a list has an equal probability of being selected as a participant

Systematic sampling
  • based on predefined criteria, e.g. every 10th person entering the ECE Center

Stratified sampling
  • additionally, select people so that the sample reflects the distribution in your intended user group, e.g. make sure your final set contains 50% male and 50% female participants

Samples of convenience
  • volunteer-based; must be adjusted to the wanted user group
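
The first three strategies can be sketched in a few lines. The participant list, the `gender` field, the quota of five per stratum and all function names below are hypothetical and only serve the illustration.

```python
import random


def random_sample(people, k, seed=0):
    """Random sampling: every person on the list is equally likely to be picked."""
    return random.Random(seed).sample(people, k)


def systematic_sample(people, step):
    """Systematic sampling: take every `step`-th person, e.g. every 10th."""
    return people[step - 1::step]


def stratified_sample(people, key, quotas, seed=0):
    """Stratified sampling: draw a fixed number of people from each stratum."""
    rng = random.Random(seed)
    sample = []
    for stratum, count in quotas.items():
        members = [p for p in people if key(p) == stratum]
        sample.extend(rng.sample(members, count))
    return sample


# Hypothetical list of 100 candidates, alternating between two strata.
people = [{"id": i, "gender": "f" if i % 2 == 0 else "m"} for i in range(1, 101)]

by_chance = random_sample(people, 10)
every_tenth = systematic_sample(people, 10)
balanced = stratified_sample(people, lambda p: p["gender"], {"f": 5, "m": 5})
print([p["id"] for p in every_tenth])
```

A sample of convenience has no selection step to code: whoever volunteers is in, which is why it has to be adjusted afterwards.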
43
Sample Size
Depends on acceptable error!
  • major problems can be identified by 3-4 people
  • early-stage designs require fewer participants
  • but this is an oversimplification

--> Perform a pre-test in which participants first have to detect known usability issues, then calculate the average percentage of found usability issues over all participants. This gives you the percentage of issues found on average.
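
The pre-test calculation can be sketched as follows. The first function computes the average share of known issues found per participant, as described above. The second and third functions go one step further and use the discovery model 1 - (1 - p)^n, a commonly used simplification that the card itself does not state; it assumes every issue is equally likely to be found by every participant. All names and the pre-test data are hypothetical.

```python
import math


def average_detection_rate(found):
    """found[i][j] is True if participant i detected known issue j."""
    rates = [sum(row) / len(row) for row in found]
    return sum(rates) / len(rates)


def expected_share_found(p, n):
    """Assumed model: share of issues found by n participants, 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n


def participants_needed(p, target):
    """Smallest n for which the assumed model reaches the target share."""
    if not 0 < p < 1 or not 0 < target < 1:
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - p))


# Hypothetical pre-test: 3 participants, 4 known issues.
pretest = [
    [True, False, True, False],
    [True, True, False, False],
    [False, False, True, True],
]
p = average_detection_rate(pretest)
print(p, expected_share_found(p, 3), participants_needed(p, 0.85))
```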
44
Test order
Participants learn fast - test order may have a significant influence on the outcome of the experiment

--> Change the order of tasks for each participant
  • this is not necessary with unrelated tasks
  • sometimes it is impossible because tasks depend on each other
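
One simple way to vary the order is to rotate the task list by one position per participant, so each task appears in each position equally often. This is a sketch of that one scheme, not the only option; the function name `rotated_orders` and the task labels are assumptions.

```python
def rotated_orders(tasks, participants):
    """Give participant i the task list rotated by i positions."""
    n = len(tasks)
    return [tasks[i % n:] + tasks[:i % n] for i in range(participants)]


# Hypothetical tasks and four participants, for illustration only.
orders = rotated_orders(["A", "B", "C"], 4)
for number, order in enumerate(orders, start=1):
    print(number, order)
```

Rotation balances the position of each task but not which task precedes which, so it does not remove every carry-over effect.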
45
Types of Evaluation without user
  • experts use their knowledge of users & technology to review software usability
  • expert critiques can be formal or informal
  • heuristic evaluation is a review guided by a set of heuristics
  • walkthroughs involve stepping through a pre-planned scenario, noting potential problems
46
Revised version of Nielsen's original heuristics
  • visibility of system status
  • match between system and real world
  • user control and freedom
  • consistency and standards
  • error prevention
  • recognition rather than recall
  • flexibility and efficiency of use
  • aesthetic and minimalist design
  • help users recognize, diagnose, recover from errors
  • help and documentation
47
3 stages for doing heuristic evaluation
  1. Briefing session to tell experts what to do
  2. Evaluation period of 1-2 hours in which
    • each expert works separately
    • take one pass to get a feel for the product
    • take a second pass to focus on specific features
  3. Debriefing session in which experts work together to prioritize problems
48
advantages & disadvantages of heuristic evaluation
+ few ethical problems (no users involved)
+ few practical problems (no users involved)
- can be difficult to find experts
- important problems may get missed
- many trivial problems are often identified
- experts have biases
49
Cognitive Walkthroughs
  • focus on ease of learning and/or usage
  • the designer presents an aspect of the design & usage scenarios
  • the expert is told the assumptions about user population, context of use, task details
  • one or more experts walk through the design prototype with the scenario
  • experts are guided by questions
50
cognitive walkthrough questions
  1. Will the correct action be sufficiently evident to the user?
  2. Will the user notice that the correct action is available?
  3. Will the user associate and interpret the response from the action correctly?
  4. If the correct action is performed, will the user see that progress is made towards their goals?
51
Pluralistic walkthrough
  • variation on the cognitive walkthrough, performed by a carefully managed team
  • the panel of experts begins by working separately
  • then there is a managed discussion that leads to agreed decisions
  • the approach lends itself well to participatory design
52
Criteria for creating a measure of mental workload
Sensitivity
  • index must be sensitive to changes in task difficulty or resource demand

Selectivity
  • index should NOT be sensitive to changes unrelated to resource demands

Diagnosticity
  • index should indicate not just that workload is varying but also the cause of the variation

(Un)obtrusiveness
  • index should not interfere with or contaminate the primary task being assessed

Reliability (Reproducibility)
  • index should produce the same estimate for a given task and operator

Bandwidth
  • index should respond to high-frequency changes in workload
53
4 primary approaches to workload assessment
  1. primary task: direct
  2. secondary task: indirect
  3. physiological correlates
  4. subjective ratings: do not interfere with the task, but are subjective
54
workload assessment - primary task
Measure performance metrics:
  • time
  • speed
  • strength

Derived workload metrics:
  • no absolute value
  • a difference in performance may indicate a difference in workload
55
workload assessment - secondary task
popular types:
  • rhythmic tapping task
  • random number generation
  • probe reaction time task
  • time estimation
  • time production
56
workload assessment - physiological measurements
  • Heart rate (ECG), muscle activity (EMG), brain activity (EEG)
  • Respiration, skin conductance (GSR)
  • Oxygen uptake
  • Eye tracking

In principle precise, but
  • difficult to set up
  • needs extensive physiological conditioning to bring subjects to the same conditioning level
  • difficult to compare between subjects due to high variation in physiological condition

Conditioning:
  • Baseline phase
  • Interaction phase
  • Recovery phase
57
workload assessment - subjective ratings - NASA TLX - 6 dimensions
Mental demand
  • how mentally demanding was the task?

Physical demand
  • how physically demanding was the task?

Temporal demand
  • how hurried or rushed was the pace of the task?

Performance
  • how successful were you in accomplishing what you were asked to do?

Effort
  • how hard did you have to work to accomplish your level of performance?

Frustration
  • how insecure, discouraged, irritated, stressed and annoyed were you?
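
A minimal sketch of combining the six ratings into one workload number. It uses the unweighted "Raw TLX" variant (the plain mean of the six ratings) rather than the original weighted procedure with pairwise comparisons. The 0-100 rating range, the assumption that all ratings are oriented so that higher means more workload, the function name `raw_tlx` and the example ratings are all choices made for illustration.

```python
DIMENSIONS = (
    "mental",
    "physical",
    "temporal",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted mean of the six subscale ratings (assumed 0-100)."""
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(ratings[d] for d in DIMENSIONS) / len(DIMENSIONS)


# Hypothetical ratings from one participant, for illustration only.
example = {
    "mental": 60,
    "physical": 20,
    "temporal": 70,
    "performance": 40,
    "effort": 50,
    "frustration": 30,
}
workload = raw_tlx(example)
print(workload)
```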