Block 4 - Unit 1: An evaluation framework Flashcards Preview

Flashcards in Block 4 - Unit 1: An evaluation framework Deck (36)

Key points for evaluation. (4)

Evaluation is a key activity in ID lifecycle.

An essential requirement for any interactive product is understanding users' needs.

Needs of users can be usefully expressed as goals for the product - both usability and UX goals.

Purpose of evaluation - check users can use the product and like it. Assess how well goals have been satisfied in a design.


3 main approaches to evaluation?

Usability testing.

Field studies.

Analytical evaluation.

(Each differs according to its theories, philosophies (beliefs) and practices for evaluation.)


Methods (def and 6 examples)

Practical techniques to answer questions set in relation to an evaluation goal, include:

Observing users.

Asking users their opinions.

Asking experts their opinions.

Testing users' performance.

Modelling users' task performance.


Opportunistic evaluation.

Designers informally and quickly get feedback from users / consultants to confirm their ideas are in line with users' needs, and are liked.

Generally used early and needs only low resources.

'Quick and dirty'.


Evaluation and accessibility. (Include example).

If the system should be usable by disabled users, must evaluate both 'technical accessibility' (can the user physically use it?) and usability.

Eg. Blind user - a screen reader might technically access data in a table, but the user also needs to read cells in a meaningful and useful way, eg. access contextual info about cells - relate cells to rows / columns.


6 evaluation case studies (SB)

Early design ideas for a mobile device for rural Indian nurses.

Cell phones for different world markets.

Affective issues - collaborative immersive game.

Improving a design - HutchWorld patient support system.

Multiple methods help ensure good usability - Olympic Messaging System.

Evaluating a new kind of interaction - an ambient system.


DECIDE intro.

Well-planned evaluations are driven by 'goals' which aim to seek answers to clear 'questions', whether these are stated up front or emerge during the evaluation.

Questions help determine the kind of 'evaluation approach' and 'methods' used.

'Practical issues' also impact decisions.

'Ethical issues' must also be considered.

Evaluators must have enough time and expertise to evaluate, analyse, interpret and present the 'data' they collect.


DECIDE framework checklist.

Determine the 'goals'.

Explore the 'questions'.

Choose the 'evaluation approach and methods'.

Identify the 'practical issues'.

Decide how to deal with the 'ethical issues'.

Evaluate, analyse, interpret and present the 'data'.

(Common to think about and deal with items iteratively, moving backwards and forwards between them. Each is related to the others).


Determine the goals and Explore the questions. (3 points)

Determine 'why' you are evaluating - high-level goals.

If evaluating a prototype the focus should match the purpose of the prototype.

Goals identify the scope of the evaluation and need to be specific rather than general; identifying questions based on these goals clarifies the intention of the evaluation further.


Example of general -> specific goal.

'Help clarify that users' needs have been met in an early design sketch.'

More specific goal statement:

'Identify the best representation of the metaphor on which the design will be based.'


How to make goals operational (DECIDE)

We must clearly articulate questions to be answered.
Eg. what are customers' attitudes to e-tickets (over paper)?

Questions can be broken down to very specific sub-questions to make evaluation more fine-grained.
Eg. 'Is interface poor?' to '... difficult to navigate?', '... terminology inconsistent?', '... response slow?', etc.


What will an evaluation be focused on?

Guided by key questions, and any other questions based on the usability criteria to see how well usability goals have been satisfied.

Usability criteria - specific quantified objectives to assess if goal is met.

Also, how well UX goals have been satisfied - how interaction / experience feels to the user (subjective).

UX usually evaluated qualitatively, eg. 'users shopping online should be able to order an item easily without assistance.'
Possible to use specific quantified objectives for UX goals, eg. '85% + should be able to order without assistance.'
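
A quantified objective like the 85% example above can be checked directly against observed results. The sketch below is only an illustration: the function name `meets_criterion` and the participant data are made up, and the 0.85 target is taken from the example on the card.

```python
def meets_criterion(outcomes, target=0.85):
    """Return (success_rate, passed) for a list of True/False outcomes.

    Each outcome is True if that participant ordered without assistance.
    """
    if not outcomes:
        raise ValueError("no observations recorded")
    rate = sum(outcomes) / len(outcomes)
    return rate, rate >= target


# Made-up data for illustration: 18 of 20 participants succeeded unaided.
observed = [True] * 18 + [False] * 2
rate, passed = meets_criterion(observed)
print(f"{rate:.0%} ordered without assistance; criterion met: {passed}")
```

With a small sample the rate is only a rough indicator, so the result is best read alongside the qualitative findings.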


What affects the choice of approaches / methods of evaluation?

Approach influences the kinds of methods used.
Eg. analytical evaluation - methods directly involving users won't be used.

Choice of methods:
Where you are in the lifecycle.
Goals being assessed.
Practical issues - time, money, technology, appropriate participants.

WHAT you are evaluating and type of data being collected.
Eg. low-fi prototypes - any time in lifecycle, but predominantly useful for qualitative data, or assessing certain UX goals or interface features (eg. underlying metaphor).


Why use more than one evaluation approach / method?

Often choosing just one approach is too restrictive for evaluation.
Take a broader view - mix and match approaches / methods according to goals, questions and practical / ethical issues.
Eg. methods used in field studies tend to involve observation, interviews or informal discussions.

Combining methods for evaluation study, especially if complementary, can give different perspectives for evaluation, and may help to find more usability problems than a single method might.


Usability defect (problem)

A difficulty in using an interactive product that affects the users' satisfaction and the system's effectiveness and efficiency.
Usability defects can lead to confusion, error, delay or outright failure to complete a task on the part of the user. They make the product less usable for the target users.


Identify practical issues.

Many issues - important to identify as many as possible before study.
Pilot study is useful to discover surprise events.

Issues include:
- users
- facilities and equipment
- schedule and budget constraints
- evaluators' expertise

May need to compromise, eg. fewer users for a shorter period (budget).


Users (practical issues) (3 points)

Where possible, a sample of real (prospective / current) users should be used, but sometimes representatives of the user group are necessary (identified in requirements).
Eg. experience level, age / gender, culture, education, personality.

How will users be involved?
Tasks in lab should represent those for which the product is designed.
Users should get frequent breaks (every 20 mins) and feel at ease - it is the product that is being tested, not them.

Field studies - onus is on evaluators to fit in with users and cause as little disturbance as possible.


Facilities and equipment (practical issues) (3 points)

Video - where to place cameras? - can change behaviour.

If you're not confident of observer skills, ask participant to use 'think-aloud' protocol.
Or ask them to pause when you need to write notes.

Audio / video takes time to analyse - approximately 6 hours for 1 hour of video.
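
The 6:1 rule of thumb above makes it easy to budget analysis time before a study. A minimal sketch, assuming every session is recorded in full; the function name and the participant numbers are hypothetical.

```python
# Rule of thumb from the card: about 6 hours of analysis per recorded hour.
ANALYSIS_HOURS_PER_RECORDED_HOUR = 6


def estimated_analysis_hours(participants, session_minutes,
                             ratio=ANALYSIS_HOURS_PER_RECORDED_HOUR):
    """Estimate analysis effort for a set of equally long recorded sessions."""
    recorded_hours = participants * session_minutes / 60
    return recorded_hours * ratio


# Hypothetical study: 8 participants, 45-minute sessions.
hours = estimated_analysis_hours(participants=8, session_minutes=45)
print(f"About {hours:.0f} hours of analysis")
```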


Schedule and budget constraints (practical issues). (1 point)

Usually have to compromise according to resources and time available.


Expertise (practical issues).

Different requirements for different evaluation methods.

Eg. user tests - knowledge of experimental design and video recording.
If results need to be analysed using statistical measures, consult a statistician before the study and again during data collection / analysis.


Accessibility and evaluation methods - practical issues. (2)

Asking users:
- interviews for deaf or speech-impaired users - written questions / answers, or sign language.
- questionnaires for blind - describe designs, emulate screen reader.

Asking experts:
- may need to provide information not easily attained by users.
Eg. descriptions of visual images - expert may be best judge of accuracy of description.


Why are ethical issues important to evaluation?

Users are in unfamiliar situations.

Privacy should be protected - name not associated with data.


4 principles for ethical issues. (3 ACM + 1)

Ensure users and those who will be affected by a system have their needs clearly articulated during the assessment of requirements.

Articulate and support policies that protect the dignity of users and others affected by a computing system.

Honour confidentiality.

Ask users' permission in advance to quote them, promise anonymity and offer to show the report before it's disclosed.


Some other ethical issues.

Web use - activity can be logged, possibly without the user knowing.
Privacy, confidentiality, informed consent.

Children - legal issues; may need parent / teacher present.

English not first language - may miss nuances, possibly causing bias.

Speech impairment, learning difficulty, etc. - may need helper / interpreter.
Should address remarks to participant, not intermediary.

Cultural constraints may make criticism difficult - eg. Japanese 'polite'.


Evaluate, analyse, interpret and present the data (overview).

Need to decide what data needed to answer study questions, how it will be analysed and how findings are presented.

Method used often determines the type of data collected, but there are still choices.
Eg. should data be treated statistically?


General question areas for final 'E'. (5)

Reliability. (Consistency)

Validity.

Biases.

Scope.

Ecological validity.


Reliability (consistency) - final 'E'.

How well the method produces the same (or similar) results under the same circumstances, even if done by a different evaluator.

Different methods have different degrees of reliability.
Eg. carefully controlled experiment - high;
observation in natural settings - variable;
unstructured interview - low.
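
One simple way to probe reliability across evaluators is to have two of them code the same sessions and see how often they agree. This is an assumption added for illustration, not a method named on the card; the category labels and data below are made up.

```python
def percent_agreement(codes_a, codes_b):
    """Fraction of observations that two evaluators coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both evaluators must code the same observations")
    if not codes_a:
        raise ValueError("no observations to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical problem categories assigned to five observed incidents.
evaluator_1 = ["navigation", "terminology", "none", "navigation", "speed"]
evaluator_2 = ["navigation", "terminology", "navigation", "navigation", "speed"]
agreement = percent_agreement(evaluator_1, evaluator_2)
print(f"Evaluators agreed on {agreement:.0%} of incidents")
```

Raw agreement ignores matches that happen by chance, so it is only a first check.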


Validity - final 'E'.

Does the evaluation method measure what it is intended to measure?

Covers method and how performed.
Eg. goal to find how product is used in homes - don't plan a lab session.


Biases - final 'E'.

Occurs when results are distorted. Eg:

experts may be more sensitive to design flaws than others;

observers may not notice certain behaviours they don't deem important, ie. selectively gather data;

interviewers may unconsciously influence responses - tone, expression, phrasing of questions.


Scope - final 'E'.

How much can evaluation's findings be generalised?

Eg. some modelling methods, such as the keystroke model, have a narrow, precise scope. The keystroke model predicts expert, error-free behaviour, so its results can't be used to describe novices learning to use the system.
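
To show why the scope is so narrow, here is a rough sketch of a keystroke-level prediction. The operator times are commonly cited approximate values from the keystroke-level model literature, not figures from this deck, and the task sequence is hypothetical. The prediction is just a sum of fixed expert timings, with no room for errors or learning.

```python
# Approximate operator times in seconds (commonly cited values; assumptions here).
OPERATOR_SECONDS = {
    "K": 0.2,   # press a key (skilled typist)
    "P": 1.1,   # point at a target with a mouse
    "H": 0.4,   # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def predict_seconds(operators):
    """Sum the fixed time for each operator in an error-free expert sequence."""
    return sum(OPERATOR_SECONDS[op] for op in operators)


# Hypothetical task: think, point at a field, move to keyboard, type 4 characters.
task = "MPH" + "K" * 4
print(f"Predicted expert time: {predict_seconds(task):.2f} s")
```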