Performance management Flashcards

1
Q

Definition of PA

A

PA = managerial evaluation of an employee’s performance, often annual, in which the manager assesses the extent to which desired behaviors have been observed or achieved. Done for a variety of reasons such as decision making, feedback, and providing a basis for pay decisions.

2
Q

Taxonomies

A

• Most cited is Campbell’s 8 factors (1993). Seemed to be the granddaddy

o	Job-specific task proficiency
o	Non-job-specific task proficiency
o	Written and oral communication
o	Demonstrating effort
o	Maintaining personal discipline
o	Facilitating peer and team performance
o	Supervision/leadership
o	Management/administration

• Then Borman & Brush (1992) taxonomic structure – 18 factors of managerial performance developed from critical incidents

• Viswesvaran (1996) - general factor (p)
o Reliability only .52 for performance

• Borman & Motowidlo (1993; 1997) - they are the contextual perf boys

Spector and Fox = CWB

3
Q

Importance of PM

A

We should care about performance management because it provides the basis for our criterion measures used in the evaluation of employees, but also in the evaluation of programs we design (e.g., training) as well as selection systems (Austin & Villanova, 1992). Performance management has a rich history that has progressed from looking specifically at rating formats to reduce rater error, to rater training, to a broadening criterion space (e.g., OCB and CWB), to a world that requires management of teams and cross-cultural groups more than ever before (Austin & Crespin). While there might not be one best performance management system, there are many aspects that can be used to evaluate the system and make it more effective (Bowman). Performance is most typically measured using supervisory ratings that are subjective in nature. Multisource ratings (i.e., 360-degree ratings) have also become more prevalent.

4
Q

How do orgs feel about PA?

A

Regardless of the system used, most orgs are unhappy w/ it
Employees often feel they are not fairly assessed and many believe ratings are biased
Managers feel they receive inadequate training and don’t like being placed in the role of “judge.”
Often a feeling that appraisals are done b/c they have to be done, but that nothing productive comes out of the process (Murphy and Cleveland; Pulakos)

5
Q

Trends

A

o Drive to concentrate on the context of the PA rather than just looking at raters as a source of variance.
o Adaptive performance (Pulakos, 2000)
o Drive to look at ratee reactions and motivation to improve rather than looking @ errors or accuracy
o Other?

6
Q

How do we do PAs?

A

No one best way. Context determines what you should use. MBO, checklists, different rating scales,
critical incidents, BARS, BOS (behavior observation scales), compare individuals, ranking, & diaries.

New! FORS (frame-of-reference scales) and CAT (computer adaptive rating)

7
Q

The criterion problem (long, sorry)

A

Austin and Villanova, 1992

Has always been prevalent (should we update this to, “1917-2014”???)

Issues with semantics – what does the term even mean?

-We focus on measuring our predictors but not sufficiently on measuring the criteria, and then we just assume it’s fine.

We need to consider the entire process

Need to consider how to evaluate performance:
(absolute/comparative; Judgment vs objective, hard vs soft criteria, multidim. nature)
Different methods: supervisor, self, 360 deg
Subjective – prone to biases, takes rater time, can capture things that can’t be directly observed (via inferences). May capture things that objective measures are deficient in.
Contamination, Deficiency, Relevance (to job performance– important at a fundamental level)

o Making decisions that are based on incorrect info
o Legal issues with devices loaded w/ biases and incorrect info (Title VII of Civil Rights Act)
o Different ways of looking @ it, examine at multiple levels of analysis

  1. Other: Halo, leniency, central tendency, recency & first impression, contrast effects (tendency to rate away from the previous ratee’s level), & assimilation effects (opposite of contrast effects).
8
Q

Critical considerations (1) - the “Why”: purpose

A

a. Research
b. Feedback development
c. Training development (similar to feedback develop)
d. Performance evaluation (can also be at team level)
e. Organization planning (big picture performance that can lead to high-level decisions)

9
Q

Critical considerations (2) - the “What”: content

A

a. Conceptual
b. Criteria
c. Multiple dimensions

10
Q

Critical considerations (3) - the “When” - timing

A

a. Mid perf.
b. Post perf.
c. Repeated measures
* Performance is dynamic and changes over time
* End of year performance reviews and archival data may provide an inaccurate or outdated view of performance

11
Q

Critical considerations (4) - the “Where”: fidelity

A

a. “similarity between the . . . situation and the operational situation which is simulated” (p. 50). In the case of PM, this refers to how closely the measurement setting replicates the actual performance situation it is intended to represent.
b. Two dimensions:
(a) the physical characteristics of the measurement environment (i.e., the look and feel of the equipment and environment)
(b) the functional characteristics (i.e., the functional aspects of the task and equipment).
c. Depending on the purpose and nature of the performance measures, different levels of fidelity are more or less appropriate.
i. Example: If measurement is intended to capture day-to-day performance and the job is highly dependent on changes that occur in a fast-paced, dynamic environment, a strictly controlled laboratory setting with a low level of fidelity may produce misleading findings

12
Q

Critical considerations (5) - the “How”

A

a. Questionnaires
b. Observations
c. Simulations

13
Q

Categories of errors

A
  1. Distributional errors – incorrect representations of perf. distributions across employees (see the computational sketch after this list)
    a. Rating means: severity, leniency
    b. Rating variance: range restriction, central tendency
  2. Illusory halo – correlations between ratings of two different dimensions being higher (or lower) than the correlation between the actual behaviors reflecting those dimensions
  3. And other (e.g. similar-to-me error, first impression error)
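A minimal sketch (my own illustration, not from the readings) of how these indices are often computed, assuming one rater’s ratings plus expert “true scores” laid out as made-up ratee x dimension arrays:

```python
# Hypothetical sketch (not from the readings): simple indices for distributional
# errors and illusory halo, for one rater. The arrays are made up: rows = ratees,
# columns = performance dimensions; "true_scores" stand in for expert ratings.
import numpy as np

rng = np.random.default_rng(0)
true_scores = rng.normal(3.0, 1.0, size=(20, 4))            # 20 ratees, 4 dimensions
overall = true_scores.mean(axis=1, keepdims=True)           # rater's global impression of each ratee
ratings = np.clip(0.5 * true_scores + 0.5 * overall + 0.8   # halo and leniency built in
                  + rng.normal(0, 0.3, size=(20, 4)), 1, 5)

# Distributional errors: compare the rating distribution with the true distribution
leniency_severity = ratings.mean() - true_scores.mean()     # > 0 leniency, < 0 severity
range_restriction = ratings.std() / true_scores.std()       # < 1 suggests central tendency / restriction

# Illusory halo: dimension intercorrelations in ratings minus those in the true scores
iu = np.triu_indices(4, k=1)
illusory_halo = (np.corrcoef(ratings, rowvar=False)[iu].mean()
                 - np.corrcoef(true_scores, rowvar=False)[iu].mean())

print(round(leniency_severity, 2), round(range_restriction, 2), round(illusory_halo, 2))
```

Per Murphy and Balzer (later in this deck), a nonzero halo index computed this way is not automatically an “error,” so treat these as descriptive checks rather than accuracy measures.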
14
Q

Why is context important

A
  • Other variables, such as situational factors, may constrain the translation from predictors to behaviors to results (Wildman, 2011)
  • Also, see Murphy IOP
15
Q

Has there been an overemphasis on accuracy? Explain. What else should we look @.

A

• Accuracy is important, but by focusing on it we’ve introduced difficult-to-understand systems and designed systems to reduce bias that leave raters confused about what they’re rating. Perceived fairness (pull in justice lit.: Colquitt) is a more important goal (DeNisi and Sonesh)
o Issues with our bunny hole expedition of errors (halo isn’t always bad! Murphy and Balzer)
o Landy and Farr say changing rating formats doesn’t change ratings much
o All of these efforts (and more) were focused on perf. appraisal, not on performance management and actual performance improvement (little attn paid to affect, motivation, etc.)
o But, these goals were still about accuracy (mot. and affect → accuracy)
o Ilgen, 1993 says accuracy might be the wrong idea altogether (also DeNisi)
o This led to shift of focus on perf. improvement and whether Es are motivated to improve; thus shift to whether Es perceive PAs as accurate and fair

16
Q

Name the 4 operationalizations of accuracy and define them.

A

Elevation accuracy - avg for a rater across ratings (rater diffs); overall accuracy

Diff. elevation - avg for each target across scores (differentiate b/w ratees across dimensions)

* above two about ratees (elevation in title; ppl on elevators) * most important in administrative ratings (Hoffman et al., 2012)

Stereotype accuracy - avg for each dim. across targets (differentiate b/w dims across ratees)
*above about dimensions

Diff. accuracy - sensitivity to patterns of performance (diff both b/w and w/in ratee scores); ratee*dimension interaction; rater identifies patterns of strengths and weaknesses

* combo of ratee and dimension * most important in developmental ratings (Hoffman et al., 2012; see the computational sketch below)
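A minimal computational sketch of these four components (my own, following the Cronbach-style squared-difference operationalizations; the ratee x dimension array layout is an assumption). Smaller values mean more accurate:

```python
# Hypothetical sketch of the four Cronbach (1955)-style accuracy components,
# assuming ratings and expert true scores are ratee x dimension numpy arrays.
import numpy as np

def accuracy_components(ratings, true_scores):
    x, t = np.asarray(ratings, float), np.asarray(true_scores, float)
    x_gm, t_gm = x.mean(), t.mean()                      # grand means
    x_ratee, t_ratee = x.mean(axis=1), t.mean(axis=1)    # each ratee averaged over dimensions
    x_dim, t_dim = x.mean(axis=0), t.mean(axis=0)        # each dimension averaged over ratees

    elevation = (x_gm - t_gm) ** 2
    diff_elevation = np.mean(((x_ratee - x_gm) - (t_ratee - t_gm)) ** 2)
    stereotype = np.mean(((x_dim - x_gm) - (t_dim - t_gm)) ** 2)
    # ratee x dimension interaction: patterns of strengths and weaknesses
    x_res = x - x_ratee[:, None] - x_dim[None, :] + x_gm
    t_res = t - t_ratee[:, None] - t_dim[None, :] + t_gm
    diff_accuracy = np.mean((x_res - t_res) ** 2)
    return elevation, diff_elevation, stereotype, diff_accuracy
```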
17
Q

Definition of CWB

A

Voluntary behavior that violates organizational norms and threatens the wellbeing of the org, its members or both (Robinson and Bennett)

2 dimensions:
o	Abuse against others (emotion-based; e.g., incivility)
o	Production deviance
	- Sabotage
	- Theft
	- Withdrawal

Spector and Fox some of first to do work in this area

18
Q

What are the two main approaches to CWB?

A

 - Justice approach—CWBs are a form of retaliation - Greenberg and colleagues
 - Emotion—events that create negative emotions lead to revenge - Spector and colleagues

***Note - Still need to look at CWB from a cognitive focus too (like Organ & Konovsky with OCBs)

19
Q

What are the two ways of dealing with CWB?

A

o Non-punitive: alignment, corrective/constructive feedback, self-management training, positive discipline, EAPs
o Punitive—generally viewed as effective, but moderated by perceptions of control, just world beliefs, and negative affect
 - Also can lead to bad feelings, power play perceptions, etc.
 - Effects on observers can be negative depending on situation
 - Progressive discipline
 - Terminate as a last resort
• Consider at-will status

20
Q

Common measurement issues in PA (Kline & Sulsky, 2009)

A

o Absolute vs comparative judgments
• Depends on purpose– developmental vs promotions; expect diff user reactions

o Two ways efforts have gone to improve ratings?
• 1. Rater training
Rater error training, beh. obs. training, FOR
• 2. Rating formats – lots in the BARS format family
Behaviorally based, graphic

o Meaning of work perf – the criterion problem; cites back to A&V, 92

o Likes Campbell et al formulation or taxonomy of relevant perf dims

o Most theory/research focused on
 - Delineation of perf dims
 - Expected associated perf standards

o Rating Quality
 - Cronbach’s accuracy indices

o Includes team perf appraisal – messy and difficult

21
Q

OCB guide

A

o Barnard, 1938 - willingness to contribute
o Organ, 1983 - narrow def. of perf –> OCB
o Williams and Anderson, 1991 - O vs. I; scale
o Borman and Motowidlo, 1993 - contextual perf. distinguished from task perf.
o Organ, 1997 - ok, OCB can be rewarded. Same as contextual
o Hanson and Borman (2006) say they’re identical
o Using same toothbrush
o Chiaburu - in-role vs. extra-role distinction

22
Q

OCB and contextual perf. - original defs.

A

o OCB definition: “individual behavior that is discretionary, not directly or explicitly recognized by the formal reward system, and that in the aggregate promotes the effective functioning of the org.” (Organ, 1988)

o CP definition: known for its more proximal effect, enhancing and sustaining the social, psychological, and org context of the cooperative system (Borman & Motowidlo,1993)

o CP uses the terms interpersonal facilitation and job dedication to denote OCB-I/OCB-O; tends to have more consistent factorial structure than OCB

23
Q

Name a couple of Williams and Anderson’s dimensions

A

e.g., courtesy, sportsmanship, peacemaking, civic virtue; and they did the OCB-I/O distinction

24
Q

Lowdown on CWB- can you think of any studies we read??!

A
  • In 2000, Bennett and Robinson created a scale to measure workplace deviance. Deviance happens!
  • So, what causes it? We know that OCBs are more related to cognitive than affective (thank you Organ & Konovsky), but Spector & Fox say that we have to consider emotions in the formation of CWBs, and they proposed a model (but didn’t test it). But what about raters? Do they have any influence on ratings of CWB?
  • Why yes, they do. Lievens et al found that culture and type of rater affected ratings. So raters rate differently (for task, OCB, and CWB) and this could partially be due to the culture and role they have.
  • We also know that there are problems when people rate themselves. Stewart et al (2009) developed an other-report, multi-rater instrument, established its validity, and compared it to the self-report it was based on. Most employees are engaging in some type of subtle deviant behavior, but not frequently.
  • We’ve learned that OCBs have a lot to do with social exchange, and CWBs do too, particularly with the breach of psychological contracts. Jensen et al were able to expand on which types of breaches were related to which type of CWB. Additionally, while we know that breach relates to CWBs, it appears that org policies have no effect on them.
25
Q

Lowdown on OCB. Can you think of any studies we read?!

A
  • (Organ & Konovsky, 1989) OCBs more likely due to cognitive process than affective reactions. They are deliberate and controlled, and a result of an exchange (social and economic). Trust is very important and when it is violated the individual will switch to a strictly economic exchange (reduce OCBs).
  • (Allen & Rush, 1998) because they are voluntary, they are more likely to be noticed and seen as a sign of commitment and altruism, if the behavior is consistent. Managers attribute different motives to OCBs and they are related to greater liking response.
  • (Hoffman et al incl Meriac, 2007) We can say that OCB has a general factor which correlates mod-strongly with task performance (.74) and has two dims (I and O) that are highly correlated. Leaves us with questions about the correlation between OCB and task – is it halo?
  • Hanson and Borman (2006) wrote a review of citizenship.
  • Many researchers have tried to create a taxonomy for OCB, but LePine’s (2001) meta-analysis suggests that citizenship is a single latent construct with ratings made on various dimensions (similarly to Hoffman et al.)
  • The way people define OCBs (extra- or in-role) is important and Chiaburu & Byrne found that trust and the exchange ideologies people have (strong or weak) are important predictors of OCB role definition. This is important because it is another article that includes social exchange and trust with OCBs. Also interesting is that people can have job sat but not trust their org.
  • And finally, Stone-Romero et al. looked at how we are defining the behaviors and found that the voluntary or extra-role nature is not as clear-cut as we may think, which may be an issue for validity. There was a lot of overlap between contextual items and job description items, and ambiguity. Also, many contextual behaviors actually are rewarded, contrary to our definition. They assert that OCBs are part of the formal reward system
26
Q

OCB and personality

A

 - Personality is more highly related to contextual performance than task performance
 - Conscientiousness and OCB are moderately related (.16-.24)
 - Agreeableness is also related (from .08 to .17) and facet matters (up to .28)
 - Emotional stability (.13 and .16)
 - Not Extraversion and Openness to Experience
 - PA and NA have small correlations
 - Different personality traits are likely to be more strongly related to certain dimensions of OCBs than others (e.g., conscientiousness related to generalized compliance or conscientiousness aspects of OCB)
 - Personality is more strongly related to OCB in weak situations

27
Q

How does strength of a situation play a role w/ OCB?

A

o Personality is more strongly related to OCB in weak situations (e.g., the effects of conscientiousness can be wiped out by demand characteristics (like desire for promotion) of the situation)
o Type of job
 - OCB not as distinct from task perf in managerial jobs
 - Teams vs independent work
• OCBs more important when working in teams
• Also more difficult to distinguish OCB from task perf

28
Q

Trends or new developments w/ OCB?

A
  • Borman developed a CARS for rating OCBs! Resulted in lower errors and higher validity estimates
  • Rating OCB in interviews (Podsakoff et al.)

• Issue of OCBs being distinct from task perf.
Hoffman et al. found high correlation
Stone-Romero et al found that, generally speaking, OCBs are part of the formal reward system

• Other?

29
Q

What does Vis (2000) say about performance?

A

each perf dim is complexly determined (jointly between cog ability and personality) and that it is impossible to specify a sole cause or antecedent of a particular dim. BUT, the general factor implies some common determinants across dims.

Over 50% of the variance is shared across the different dimensions; pos. manifold

30
Q

From Campbell, 1993. What are the main determinants of perf.? Of his 8 dims, which ones are relevant to all jobs?

A
•	Determinants of performance
o	Declarative knowledge
o	Procedural knowledge
o	Motivation
o	Situational constraints (Austin & Crespin, 2006)

-relevant to all jobs
job specific task proficiency
maintaining personal discipline
demonstrating effort

31
Q

Adaptive performance

A

o Pulakos et al. (2000) argue that adaptability is like the 9th factor in Campbell’s model
o 8 Dimensions
 - Handling emergencies/crisis situations
 - Handling work stress
 - Solving problems creatively
 - Dealing with uncertain and unpredictable situations
 - Learning work tasks, technologies, and procedures
 - Demonstrating physical adaptability
 - Demonstrating interpersonal adaptability
 - Demonstrating cross-cultural adaptability

o Johnson would argue that these dimensions can be incorporated already, but that uncertainty is the only adaptive dimension distinct from contextual performance
o Important due to changing technology, automation, mergers, corporate restructuring, and living in a global economy

32
Q

Personality and adaptive performance (new this year)

A

Huang et al., 2014; JAP; Personality and adaptive performance at work: A meta-analytic investigation.

  • A meta-analysis demonstrated that emotional stability and ambition are both related to overall adaptive performance.
  • Openness, however, does not contribute to the prediction of adaptive performance.

• Analysis of predictor importance suggests:
o ambition is the most important predictor for proactive forms of adaptive performance
o whereas emotional stability is the most important predictor for reactive forms of adaptive performance.

• Job level (managers vs. employees) moderates the effects of personality traits: Ambition and emotional stability exert stronger effects on adaptive performance for managers as compared to employees.

33
Q

New study ideas for OCB?

A

• Is personality more strongly correlated with quality of OCB than with quantity? (Hanson says probably quality)
• Not much is known about why OCBs are related to evals and rewards
o Could be making the supervisor’s job easier, demonstrating a desire to help the org, reciprocity, liking, etc.

34
Q

Lowdown on rating formats. Can you think of any studies we read?!!!

A

Week overview: Here’s what was in store for this week: Rating formats have long been used as a means of measuring performance. Landy and Farr’s seminal article (1980) essentially called for a moratorium on rating formats, basically demonstrating that the type of format has little effect on ratings. However, while calling for a moratorium for the time being, they never intended to say that rating formats do not matter, as many seem to have gleaned from this piece. Thus, the moratorium phrase was overblown. In the pursuit of being “error” free, format was not the answer, but it doesn’t mean we shouldn’t discuss it anymore.

So, it’s okay to lift this moratorium in order to look at some important issues such as information processing and how we no longer just look at errors. Subsequent articles this week touched on a number of different issues relevant to rating: 1) raters’ reactions to forced distributions (Schleicher et al., 2009), 2) impact of using frequencies to evaluate perf. (Kane & Woehr, 2006), 3) using computer adaptive rating scales (Schneider et al., 2003, originated by Borman), 4) relative percentile approach (Goffin et al., 2009), and 5) the use of frame-of-reference scales (Hoffman et al., 2012)

35
Q

Lowdown of FORT. What is it?

A
  • Traditional rater error training facilitated the learning of a new rating response set, which reduced leniency and halo errors, but also inadvertently lowered levels of rating accuracy in some instances (Bernardin & Pence, 1980; Landy & Farr, 1980). Issue w/ halo and removing true variance
    • Bernardin and Buckley (1981) concluded we needed to develop new rater-training programs that increase rating accuracy, and they proposed FOR training as an alternative strategy.
    • FOR training attunes raters to a common frame of reference so that worker behaviors can be similarly assessed by different raters, increasing the accuracy of evals (McIntyre et al., 1984; Athey & McIntyre, 1987).

Specifically, it involves:
o (1) Matching ratee behaviors to their appropriate performance dimensions
o (2) Correctly judging the effectiveness levels of specific ratee behaviors (Sulsky & Day, 1992, 1994).

In sum, a theory of performance for each individual performance dimension is imparted to raters to assist them in the accurate evaluation of ratee performance.

36
Q

Lowdown of training. Can you remember any articles!?

A

This week we learned that BARS, as opposed to FOR, is not detailed enough to generate a frame-of-reference. In BARS, we throw out items that don’t load onto a competency, but in FOR, these poor items are the very behaviors we may need to help us reach a common conceptualization of performance (Hauenstein et al., 1989).

Stamoulis et al. pointed out that past FORT studies have been flawed in that they often haven’t used controls or pre/post-test designs to test their hypotheses. Overall, they found support for the superiority of FORT on rating accuracy, but also found that control training led to improvements for differential accuracy.

Another study (Noonan et al., 2001) looked at yet another program and compared it to FORT, behavioral observation training (BOT). BOT training’s goal is to improve the observation of behaviors as opposed to the evaluation of behaviors. They found that combining BOT and FORT led to similar results. Given how costly and time consuming FORT is, future studies should continue to delve into alternative ways of training raters (hint FORS perhaps!).

A few studies we read added to the theory behind WHY FORT seems to work.

Schleicher and Day (1998) pointed out that FORT should help convey organizational values, build agreement with those values, and create accurate cognitive representations, thus improving accuracy. Ultimately, this should hopefully lead to improved ratings (e.g. via increased motivation).

Another showed how FORT is effective in its ability to build rater schemas (Gorman et al., 2009).

Finally, using more studies and looking at moderators including protocols, Roch et al. (2012) conducted an updated meta-analysis on FORT, finding an overall effect size of .50. They found that while FORT clearly influences rater accuracy, it doesn’t relate to different accuracy operationalizations the same (most related to differential accuracy).

Another theme I saw is the issue of how we know an expert’s true score. Ultimately, it lies in one’s judgment.

37
Q

Multisource and feedback lowdown. Can you remember any articles?!!

A

Multisource ratings are popular! (Lots of stuff out there on 360). This week can be summed up w/ the following: an important meta-analysis (Conway & Huffcutt, 1997) found that different sources yield different ratings (supervisor highest, followed by peer, and then subordinate), and that because no single source had high reliability estimates, multisource may help in this regard (see the quick reliability illustration below). Also, while ratings are different b/w sources, in general, these different rating sources still seem to hold similar conceptualizations of performance, as demonstrated in a study looking at measurement invariance of ratings (Facteau & Craig, 2001).
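A quick, hypothetical illustration of the reliability point: the Spearman-Brown prophecy applied to Viswesvaran’s ~.52 single-rater reliability (the rater counts below are made up, and the formula assumes interchangeable raters):

```python
# Hypothetical illustration of why pooling raters can help: Spearman-Brown prophecy
# applied to Viswesvaran's ~.52 single-rater (interrater) reliability. The rater
# counts are made up, and the formula assumes interchangeable (parallel) raters.
def pooled_reliability(single_rater_r: float, k: int) -> float:
    return (k * single_rater_r) / (1 + (k - 1) * single_rater_r)

for k in (1, 2, 4, 8):
    print(k, round(pooled_reliability(0.52, k), 2))   # 0.52, 0.68, 0.81, 0.9
```

Of course, if Lance is right that different sources capture genuinely different parts of the criterion space, then treating them as parallel raters (as this formula does) understates what multisource ratings actually add.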

Soooo ratings differ by source, what’s going on here? Well, some would say the lack of convergence means we’re not rating well. Others that we read about this week pulled in a methodological reason (range restriction), while others pulled in a substantive reason (diff. peeps see diff. things) for the low convergence. LeBreton found evidence that the low relationships are likely due to range restriction in our ratings, rather than a difference in what individuals are seeing. But, Lance (2008) would strongly argue (since he likes to ruffle feathers, check on the AC review!) that these discrepant ratings aren’t necessarily due to poor ratings or biases. Based on his findings from a CFA approach, he asserts that different raters capture different aspects of the performance criterion space (I dig his argument).

What about how to relay these ratings in the form of feedback? We also read a seminal article on feedback by Kluger and DeNisi (1996), which delineates the process by which feedback interventions can increase learning and performance. Overall, FI is more effective when promoting elaborate vs. shallow learning and transfer of learning to similar tasks and when it doesn’t result in deeper questions about self-worth and self-meaning.

Finally, Anseel et al. (2012) looked at goal orientation (mastery vs. performance), concluding that orgs need to strike a balance between encouraging learning and encouraging performance, as too much emphasis on comparative performance may be detrimental to employees’ reactions and rate of performance improvement.

38
Q

Contextual issues lowdown!

A
  • The design of PA matters (e.g., higher ratings may follow negative ratings)
  • The purpose of the PA matters and affects leniency. Raters bias their ratings to avoid giving negative feedback and other negative consequences or to get positive consequences. Administrative ratings were more lenient than research ratings

• Social context of PA matters.
o PA takes place in a social context which must be identified, measured, and accounted for when studying PA. We look at the rater or the ratee, but don’t look at them simultaneously enough.
o Relationship quality is an important predictor of appraisal reactions (perceptions of accuracy, fairness, utility, satisfaction, and motivation to improve), even after controlling for rating favorability and participation. Relationship may be more important than decision control or outcome favorability. So, relationship quality is more powerful than even getting a good rating.

• Audience and accountability matters. Accountability can improve accuracy and leniency, depending on audience and form of accounting. Higher status audience = more accurate. Face-to-face accounting = more positive bias.

• We can look at performance over time to examine maximal and typical performance. Typical should be the level that a person most usually achieves (their normal). We may not see as much of a difference in on-the-job performance, as max and typ were highly correlated
o They suggest incorporating direct measures of motivation if possible as well as other dispositional factors that may affect variation. (unreported sources of variance)
o Max perf may indicate potential more than typical does

39
Q

Processing info: errors and accuracy lowdown!

A
  • The social context matters, and this includes how others act toward someone (not just how the other person acts). Ratings will be affected by others’ behavior (in addition to the target’s behavior), especially depending on the affordance offered to others by the target (perceptions of their utility to others). This is a fancy way of saying that we use others as objects of knowledge in our social reality, and that this is influenced by what we perceive they offer to us (employment, promos, etc). This info from others may previously have been categorized as error - John says error doesn’t necessarily reduce accuracy (more on that in a bit).
  • The purpose of the PA also matters. Is PA memory or judgment based? Looks like both! Remembered more in evaluative condition, but organization wasn’t there. It appears that when people think of behaviors before making ratings, they will rely on those behaviors instead of previous judgments. Also, this relationship between memory and judgment can be influenced by context (like everything)
  • Affect is important, too. Affect does affect encoding, weighting, and cumulative impact on rating through a consistency bias (stronger when not neutral; Robbins & DeNisi), and they found evidence that past performance builds up and has an assimilation effect on future ratings.
  • And you thought halo was bad! Here we were, all this time, thinking that the word “error” meant, well an “error”, and Murphy and Balzer pull the wool from our eyes! Halo is not a good proxy for error, in fact it appears that halo contributes to accuracy. M&B say stop using halo to indicate anything but the construct of halo (not error, not accuracy)

o But Murphy isn’t done. Murphy et al. goes on to ask if we can separate true and illusory halo. In the field, no.
• OK, so we use halo as an error, but can we make PA more accurate? Jelley et al examined parallel and serial processing with behavioral obs scales and found that with accountability parallel could be as good as serial (but more cog taxing). Sticking with our regular serial BOS is good, but adding FOR and diary keeping as part of an overall PM is a good idea.

40
Q

Evaluating Criteria: Construct V and reactions lowdown

A

• Viswesvaran (1996) is a seminal article. Use interrater reliabilities because they treat rater-specific variance as error, so corrections based on them won’t yield downwardly biased estimates of validity. If you do use interrater reliabilities, use the dim ratings first and consider FOR training. Always correct your estimates (will be better than if you didn’t, or if your estimates were coming from a single primary study). Average interrater reliability for supervisor ratings of performance = .52 (see the correction sketch at the end of this card).
• Construct validity is important. James (one of John’s favorites) says that how we validate it has implications for how and what we measure as outcomes. We need to use methods like nomological net, MTMM, and factor analysis to ensure criterion adequately represents performance domain. Selection of job perf criterion should be based somewhat on empirical relationships between job behaviors and criteria that reflect org goals
o Scullen et al. did a construct validity study of managerial performance (multirater)
• What about employee reactions?
o Fairness is important
 - Greenberg found that equity theory applies to PAs; both procedural and distributive justice matter for perceived fairness
o Keeping et al:
 - There are a lot of inconsistencies in how reactions are studied (most atheoretical)
 - We often confound satisfaction, fairness, utility, and accuracy
 - Affect added little
o Korsgaard et al: Instrumental and noninstrumental voice are not interchangeable in their effects on PA.
 - Instrumental voice is more important than non-instrumental when it comes to allocation decisions, but noninstrumental is more important to reactions to mgmt and org. Trust was only related to non-instrumental voice.
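To make the “always correct your estimates” point concrete, a minimal sketch (my own illustration, not from the article) of the classic correction for attenuation using the .52 interrater reliability; the observed validity of .30 is invented:

```python
# Hypothetical sketch of "always correct your estimates": classic correction for
# attenuation on the criterion side, using the ~.52 interrater reliability for
# supervisory ratings. The observed validity of .30 is made up for illustration.
from math import sqrt

def correct_for_criterion_unreliability(r_observed: float, r_criterion: float) -> float:
    return r_observed / sqrt(r_criterion)

print(round(correct_for_criterion_unreliability(0.30, 0.52), 2))   # ~0.42
```

A fuller correction would also divide by the square root of predictor reliability, but the criterion-side correction is the piece Viswesvaran’s .52 estimate feeds into.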