Selection Flashcards
(41 cards)
Definition of Selection:
A definition of selection is “a systematic process of deciding who to hire, promote or move to other jobs”. This definition indicates that selection is an important part of staffing, which is a broader field of HR, and that selection processes and decisions aren’t always about hiring new employees into the firm.
What is the ultimate purpose of any selection tool or method? What do we hope it will do?
It should do one thing, and do that one thing really well: predict future job performance.
Personality Testing
As an example of a selection method, we can consider personality testing. Personality tests are interesting, in part, because many laypeople are skeptical of them and think of them as not very useful. Keep in mind how we should measure “usefulness”; we will discuss this shortly. Some common arguments against using personality tests are that people can fake their answers, that the tests are inaccurate, and that personality changes from situation to situation, so it can’t be useful as a way to select personnel. Are these true or false? If tests were inaccurate, they wouldn’t be very useful for personnel selection because they wouldn’t predict anything, such as future job performance. In fact, none of these arguments is true. We will consider this further when we talk about personality in personnel selection, but I wanted to introduce it early in this unit to demonstrate that your preconceived notions about different selection methods may or may not be consistent with what the current science has to say.
Faking answers does not change the validity of a personality test.
5 Standards for Evaluating Selection Methods
- Reliability
- Validity
- Generalizability
- Utility
- Legality
Before we talk about specific selection methods, it is important to discuss how we evaluate the extent to which a particular measure is a good measure. All personnel selection methods and, more broadly, processes should meet the criterion of legality, which is to say they should conform to prevailing laws and legislation. Although we won’t focus on this as part of this particular unit, if you are going to work in HR you should familiarize yourself with the various laws related to personnel selection, some of which we touched on in the section on Legal Issues and EEO. Selection experts also evaluate the reliability, validity, generalizability and utility of various selection methods.
Reliability
Powerpoint:
Reliability is a property of a measurement approach or selection method, which represents the extent to which that measure is free from random error.
Selection methods first need to be evaluated in terms of their reliability. If a particular method isn’t reliable, it won’t be valid and therefore won’t be very useful. We will consider why a method has to be reliable for it to be valid in a moment. But first, let’s talk about what reliability is. Almost any measure that we use to represent some actual or true score, such as measuring height with a ruler, will be imperfect. That is, different measurements will produce slightly different results. Measures almost never perfectly represent their true scores, even when those true scores are stable, as with height. One’s height doesn’t change from month to month, at least not for adults, yet measurements of height will differ slightly from one measurement to the next.
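One standard way to formalize this, borrowed from classical test theory (the notes don’t state it explicitly, but it underlies the definition above), treats an observed score as a stable true score plus random error; reliability is then the share of observed-score variance that is not random error:

```latex
% Classical test theory: observed score X = stable true score T + random error E
X = T + E, \qquad \operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)

% Reliability = proportion of observed variance that is true-score variance,
% i.e., the extent to which the measure is free from random error
\rho_{XX'} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
           = 1 - \frac{\operatorname{Var}(E)}{\operatorname{Var}(X)}
```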
If a particular selection method has a high degree of random error, which is to say it is unreliable, it won’t be very valid, which is to say it won’t be very good at predicting anything. For instance, let’s say a measure of academic aptitude – the GRE test – isn’t very reliable, which would mean, for instance, that your score on the test changes a lot from Time 1 to Time 2, perhaps from your junior year to your senior year of college. If this were the case, this would mean that the measure represents different things at different times. If the measure represents different things at different times, so X at Time 1 and then X plus or minus something at Time 2, it can’t be used to predict anything because the measure is unstable. Since personnel selection is all about using different methods to measure different things that are meant to predict the extent to which individuals will be successful on the job, reliability is a necessary but insufficient criterion for a method to be worthwhile in a selection process. How do we measure reliability?
Test-retest reliability
Internal consistency reliability
Test-Retest Reliability
The degree to which a measure correlates with itself at two different times, which is demonstrated in the preceding GRE example. In this case, test-retest reliability is a correlation coefficient. It is important to understand what a correlation coefficient is and the different levels of practical significance or strength of a correlation. As a brief review, correlations below .10 are trivial; .10 to .29 are small; .30 to .49 are moderate and .50 and above are considered large.
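As a minimal sketch of what a test-retest correlation looks like in practice, the following uses invented GRE-style scores for the same ten people at two points in time; the numbers and variable names are purely illustrative.

```python
import numpy as np

# Hypothetical scores for the same 10 people tested twice (e.g., junior
# and senior year). These numbers are invented for illustration only.
time1 = np.array([151, 158, 163, 145, 170, 155, 160, 149, 166, 152])
time2 = np.array([153, 156, 165, 147, 168, 158, 159, 150, 164, 154])

# Test-retest reliability is simply the Pearson correlation between the
# two administrations: high values mean scores are stable over time.
r = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest reliability r = {r:.2f}")

# Benchmarks from the notes: < .10 trivial, .10-.29 small,
# .30-.49 moderate, >= .50 large (reliability should generally be large).
```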
Internal Consistency Reliability
When a particular measure is based on a survey format and it has multiple survey items, such as with personality or IQ tests, reliability can also be measured based on internal consistency reliability, which is essentially the extent to which different items in the same test are consistent with one another. Although this type of reliability is not represented by a correlation coefficient, it is still a representation of the extent to which a measure is free from random error. Reliability is always about consistency – whether it be consistency of a measure across time, or consistency of different items on a test with one another. How reliable should a selection method be? There aren’t hard and fast rules for cutoff scores for the reliability of a measure. That said, reliability should, in general, be strong (.50 or above).
What is internal consistency reliability? The same underlying trait is asked about with multiple questions worded in different ways; if answers to those differently worded items are consistent with one another, the test is reliable (internally consistent).
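Internal consistency is most often summarized with Cronbach’s alpha (the notes don’t name a specific statistic, so take this as one common choice); here is a minimal sketch with made-up item responses.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 5 people x 4 differently worded items that are
# all meant to tap the same trait (e.g., conscientiousness).
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")  # closer to 1 = more consistent
```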
For a selection tool like an IQ test, personality test, or work sample test to do its job, it has to be reliable.
Explain reliability coming before validity:
If a measure is unreliable, it represents different things at different times, so it can’t predict anything; and if it can’t predict anything, it can’t be valid.
Validity:
Reliability is a necessary but insufficient condition for a measure to be valid—but what is validity? Validity comes in different forms, but in general it is the extent to which a measure assesses relevant – and only relevant – aspects of a particular criterion, such as job performance. In general, when we talk about validity, we are talking about the extent to which a measure measures what it is supposed to. When we talk about criterion-related validity, we are talking about the extent to which a measure measures job performance. Job performance is sort of an ultimate criterion in personnel selection. There are two broad types of validity that we should consider: content validity and criterion-related validity. Criterion-related validity comes in two forms: concurrent and predictive, each of which we will discuss.
Content Validation
A test-validation strategy performed by demonstrating that the items, questions, or problems posed by a test are a representative sample of the kinds of situations or problems that occur on the job. In other words, a test must be deemed to measure the content it is supposed to measure and only that content. Measures can be deficient or contaminated, and either would indicate a lack of content validity. If a measure is deficient, it is not measuring all of the features of the variable it is supposed to measure. So, for instance, if aptitude as measured by a GRE test is meant to measure quantitative, verbal and analytical aptitudes, but a particular test only measures quantitative and verbal aptitude, it is deficient. If a measure is contaminated, it is measuring something that is not meant to be part of the variable it is intended to measure. So, for instance, if a particular GRE test measures social skills in addition to quantitative, verbal and analytical aptitudes, then the measure would be contaminated. Of course, GRE tests do not measure social skills, even though social skills are indeed important for academic performance in graduate school, which is what the measure is meant to predict. The point is that a particular test should measure only what it is intended to measure. How is content validity determined? It is determined through expert judgment. That is, a panel of subject matter experts carefully reviews the properties of a given test – the questions on a personality test, for instance – to determine the extent to which they measure what they are supposed to measure.
Criterion-Related Validity - Predictive vs Concurrent
Criterion-related validity is, as it sounds, meant to demonstrate the extent to which a given measure (X) is predictive of some criterion (Y). In general, we think of job performance as the ultimate criterion in personnel selection. After all, once we have a qualified applicant pool, what we want to do then is to select the most qualified among the pool, which is to say the person or persons who will perform best on the job. There are two types of criterion-related validity: predictive and concurrent, both of which are represented by a correlation coefficient. Let’s start with concurrent because it is generally simpler than predictive. With concurrent validity, job incumbents are exposed to a particular selection method, let’s say some sort of test, and their scores are then correlated with job performance. Assuming there is some variability in both the test and the criterion, the variability in the test can be used to explain variability in the criterion. Concurrent validation is easier than predictive validation in the sense that it is less time consuming and resource intensive, but it also has some limitations. Because only current job incumbents are exposed to the test, scores on the test may be more similar across incumbents than if job applicants were tested. This is because persons who are actually selected into an organization will be more similar to one another than a group of applicants, due to factors such as socialization and training. Thus, restriction in the range of test scores can affect the correlations that are observed. Predictive validation is preferable to concurrent validation, but it is also more involved, so to speak. It involves measuring all applicants (at least, all applicants at a certain stage of the selection process) on a particular test, then selecting individuals who are thought to be the most qualified and the best future performers, and then correlating scores on the test pre-hiring with job performance scores post-hiring. In this sense, scores on the test can actually be used to predict performance. This is ideal because we want to be able to say that scores on our test are not only correlated with, but in fact predict, job performance. That said, since scores from all applicants need to be obtained, this involves much more time and effort, and it is not often done in business organizations; it is less common than concurrent validation studies. In general, tests need to demonstrate validity for them to be used in personnel selection. If a test is not valid, it has no purpose being used in a selection process. Moreover, if a particular test results in some negative consequence, such as adverse impact, and it isn’t valid, then its use cannot be defended, which we talked about briefly in the context of Legal Issues and EEO. Validities – i.e., correlations – vary across different personnel selection methods, as we will see, and there is no hard and fast rule here about the strength of correlation needed. That said, even small correlations (.10 and above) can be useful in selection.
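A small simulation can make the concurrent-versus-predictive contrast and the range-restriction problem concrete. Everything below is invented for illustration: the “incumbents” are simply simulated applicants with above-average test scores, which mimics the restricted range you get when only hired people are tested.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate applicants: test scores and (later) job performance that
# depends partly on the test score. Entirely made-up for illustration.
n = 500
test = rng.normal(50, 10, n)
performance = 0.5 * test + rng.normal(0, 10, n)

# Predictive validation: test everyone at application time, hire, then
# correlate pre-hire test scores with post-hire performance.
predictive_r = np.corrcoef(test, performance)[0, 1]

# Concurrent validation: only current incumbents are tested. Proxy for
# incumbents here: people with above-average test scores, which
# restricts the range of observed scores.
incumbent = test > test.mean()
concurrent_r = np.corrcoef(test[incumbent], performance[incumbent])[0, 1]

print(f"predictive r = {predictive_r:.2f}")   # larger, full range of scores
print(f"concurrent r = {concurrent_r:.2f}")   # shrunk by range restriction
```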
Lecture:
Explain the concept of construct validity: Subject matter experts (SMEs) review a test; if they conclude the questions aren’t really gauging personality, they question its construct validity. Construct validity is established through content validation.
What is criterion-related validity? The extent to which a test predicts a criterion such as job performance. It is difficult to measure in a predictive sense. Tests are defended by showing they have criterion-related validity.
Generalizability
The degree to which the validity of a selection method established in one context extends to other contexts.
3 contexts:
- different situations (jobs, organizations)
- different samples of people
- different time periods
Most of the time, we want the validity of a measure to be generalizable across different situations, different samples or groups of people, and across time. So, for instance, if we use a personality test to predict job performance, ideally that test will predict job performance in different jobs (engineers, marketing coordinators, lawyers and so on), across different subgroups of people (men and women, minorities and majority members, persons of different ages and so on), as well as across time. In general, the validities of most of the measures we use in personnel selection are generalizable. Validities for personality tests – correlations between test scores and job performance – tend to be pretty stable over time, and they are consistent across jobs, organizations and, for the most part, across persons from different backgrounds. That said, as we will see shortly, some important personnel selection methods have different validities for different groups of people, which is not only a problem for predicting performance, but can also present legal challenges.
What is generalizability? Generalizability is the degree to which a test is useful across contexts (situations, candidates, jobs)
Utility
Utility is the degree to which the information provided by selection methods enhances the effectiveness of selecting personnel in organizations. In other words, it is the usefulness of a particular selection method. Although this is a separate and somewhat independent dimension from the other means by which we evaluate selection tools, such as reliability, the utility of a particular method is impacted by reliability, validity and generalizability. That is to say, the more reliable, valid or generalizable a particular method, the higher its utility. That said, other factors are indicative of utility. For instance, when a selection ratio is lower, the utility of a test is higher, all else equal. We want to have a large proportion of applicants relative to the number of persons we are selecting. Other factors such as the cost of a test, the extent to which the test is related to turnover and so on are all considered in relation to the utility of a particular test or measure.
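The notes don’t give a utility formula, but one common way to put numbers on these ideas is the Brogden-Cronbach-Gleser utility model, which combines validity, the dollar value of performance differences, the quality of the people actually hired (driven partly by the selection ratio), and testing costs. All figures below are hypothetical.

```python
# Hypothetical Brogden-Cronbach-Gleser utility estimate (one common way
# to quantify utility; not the only approach, and all numbers are invented).
n_hired = 10          # people selected
tenure_years = 3      # expected tenure of each hire
validity = 0.40       # correlation between test scores and job performance
sd_y = 15000          # dollar value of one SD of job performance
mean_z_hired = 1.0    # avg standardized test score of those hired
                      #   (higher when the selection ratio is lower)
n_applicants = 100    # everyone who had to be tested
cost_per_test = 50    # cost of administering the test once

gain = n_hired * tenure_years * validity * sd_y * mean_z_hired
cost = n_applicants * cost_per_test
print(f"estimated utility = ${gain - cost:,.0f}")

selection_ratio = n_hired / n_applicants   # lower ratio -> higher mean_z_hired
print(f"selection ratio = {selection_ratio:.2f}")
```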
Lecture:
What is utility? Usefulness. Utility depends not only on reliability but also on validity and generalizability. IQ is highly relevant (generalizable) across jobs, but not equally across groups of persons: IQ tests can cause disparate impact because some minority groups tend to score lower on average. It is hard to cheat on an IQ test.
Types of Selection Methods
Types of selection methods used to assess a person for employment include:
- interviews
- honesty tests and drug tests
- work samples
- personality inventories
- cognitive ability tests
We don’t have time to consider every possible personnel selection method, but let’s consider some of the more widely used methods in a bit more detail. In general, there are selection methods that tend to occur in initial stages of selection, and those that tend to occur in later stages of selection, as well as those that occur during a final screening. Evaluation of resumes to determine candidates’ fit with a position occurs early in the selection process. Related to this, the use of biographical data and different types of tests tend to occur relatively early in the selection process – they can be used to screen out candidates, and they are relatively faster and cheaper than other selection methods. Other methods, such as interviews, occur later in the selection process once candidates have been carefully screened. Reference checks and drug testing tend to occur in final screening of applicants.
Resume Screening
Perhaps the first step in a selection process is for a qualified HR professional, such as an HR coordinator, to screen resumes so as to sort candidates into two “piles,” so to speak: those that meet the basic qualifications for a job, or the job req’s, and those that do not. Those that do will be moved on to other stages of the selection process. Tips from the book “Hiring Great People” for persons screening resumes include being “open” and avoiding discriminatory information. Existing research, including research that I have conducted myself, shows that decision makers, such as hiring managers, will make hiring decisions based on candidates’ demographic information, such as gender, sexual orientation and so on – so it is important that persons screening resumes be trained professionals.
Biographical Data (Bio-data)
The basic premise behind biodata is that past behavior is a good (maybe the best) predictor of future behavior, a principle of behavioral consistency. In general, questions developed by subject matter experts about situations that are likely to have occurred in one’s past – and how one behaved in these situations – are used to predict job performance. Biodata measures tend to have 10 to 30 items; a single biodata question will not be a good predictor of future performance.
Biographical data from applicants (hobbies, experiences in school, preferred supervisor) are learned through a questionnaire or from a resume. The information is compared to information from the firm’s successful employees. For instance, potential leadership ability could be predicted by previous leadership experience. Firms must ensure questions are job related and attempt to verify the information provided. The biggest concern with the use of biographical data is that applicants who supply the information may be motivated to misrepresent themselves. Thus, it is important to control distortion by, for instance, warning applicants that the measure includes a lie-detection scale and/or asking applicants to elaborate on their answers. In general, validities tend to be practically significant, in the moderate range (around .30). Biodata items always need to be job related (meaning they are related to job content and/or they predict job performance) and should be designed not to unfairly discriminate against protected classes, i.e., cause disparate impact.
Biodata items require extensive time and effort in terms of development, but they are relatively inexpensive to administer and score (often automated scoring procedures are used).
Cognitive Ability Tests
3 dimensions of cognitive ability tests:
- verbal comprehension: a person’s capacity to understand and use written and spoken language.
- quantitative ability: speed and accuracy with which one can solve arithmetic problems.
- reasoning ability: a person’s capacity to invent solutions to diverse problems.
Psychologists and neuroscientists have been researching human intelligence for many decades now, and attempts to measure intelligence date back at least to the early part of the 20th century. As such, today we have quite a good understanding – at least, relatively speaking – of what intelligence is and how to measure it. Intelligence is measured using cognitive ability tests, or intelligence quotient (IQ) tests. These tests measure ability – not skills or knowledge – along three different dimensions: verbal, quantitative and reasoning. These are similar to the dimensions measured in academic aptitude tests, such as the GRE, and scores on academic aptitude tests tend to correlate strongly with IQ tests (around .80). Unlike academic aptitude tests, which tend to be influenced by academic experience and training, IQ tests are meant to measure raw ability and should be able to do so in some sense regardless of one’s academic experiences. Of course, measures of verbal comprehension require language proficiency and some level of reading skill. In any event, what is relevant to the current discussion is that measures of intelligence are also measures of general mental ability. That is, there is a generalized factor that is represented by the three different dimensions of intelligence, and that general factor – which we call ‘g’ or general mental ability (or GMA) – is very useful in personnel selection for reasons we will describe next.
Validity for cognitive ability (IQ) tests: general mental ability correlates with job performance at about .6 to .7, which is strong and highly generalizable across jobs. IQ is related to learning: higher-IQ individuals learn more effectively and rapidly.
Intelligence “g” and Job Performance
Research has consistently – and in most social scientists’ opinions, unequivocally – shown that GMA is correlated strongly with job performance across different types of jobs. That is, intelligence is correlated with job performance for janitors, salespeople, executives and so on. In this sense, the validity for IQ tests is high. Why is this the case – why is intelligence related to job performance? If you think about what “being smart” means, it means an ability to learn and to solve problems. IQ predicts job performance, in part, because it leads to more rapid learning and thus to more job knowledge, and IQ is particularly relevant to more rapid learning when information is complex. That said, there is a direct relationship between IQ and job performance even when job knowledge is held constant. So, for instance, let’s say we are considering hiring two individuals who have about the same level of previous experience in sales for a sales job—they will have very similar levels of job knowledge. All else equal and on average, intelligent individuals will outperform less intelligent individuals when job knowledge is similar. Why? Because smarter individuals are more able to solve problems. Persons in all jobs – including janitors – experience job problems for which they have little experience or training to rely on, and this is where intelligence can come into play. What’s more, intelligence is not correlated with many other selection methods, such as personality scores. Persons who are smarter are not necessarily more conscientious and vice versa. What this means is that these two methods can be combined to make an overall better assessment of candidates’ qualifications for a particular job. Intelligence is a robust predictor not only of job performance, but a variety of work related criteria, such as objective career success (defined here as number of promotions and final salary).
IQ tests put into practice
In practice, how can this principle work?
1) Organization must be able to be selective in hiring
a) If it is a tight job market, and the org. can’t be selective, the benefits of hiring on “g” or other abilities are minimized
2) Organization must be able to measure “g”
a) Need a standardized employment test, such as the Wonderlic Personnel Test.
b) This is a valid and cost-effective test
3) Validity of the test for predicting job performance must be greater than 0.
a) Or, there must be differences between workers in quality and quantity of output
b) This condition is “always met” – applies across jobs
To put IQ tests into practice, organizations must be able to be selective in hiring; they must be able to measure GMA; and a measure of GMA must be valid. In general, these conditions are met. Under most circumstances, firms can be selective in who they hire, although selection ratios will differ across firms and across jobs, of course. We can measure GMA cheaply and efficiently using existing and well-researched measures, such as the Wonderlic Personnel Test. And there is extensive research about the validity of IQ tests. Again, they are highly valid. In fact, the validities for IQ tests tend to be higher than most any other selection method or test.
Issues with using “g” in selection:
Of course, the content domain of an intelligence test is limited. It measures intellectual ability and intellectual ability alone. It is not intended to measure other important types of intelligence, so to speak, such as emotional intelligence, and it has little to do with personality or other individual differences. As such, it is important to combine IQ tests with other tests, such as personality tests, or selection methods, such as structured interviews. This will increase the overall validity of a selection decision. Another reason to combine other selection methods with IQ tests is to reduce the adverse impact of decisions based on intelligence tests alone. IQ tests do tend to result in adverse impact when used alone – they tend to favor Whites and Asians and disfavor Blacks and Hispanics. There is some debate about why this occurs, but in my professional opinion, cultural biases of IQ tests – in terms of how they are developed and the way in which some questions are asked – are the most widely accepted explanation. Nevertheless, IQ tests should generally not be used alone or as a primary variable in a hiring decision.
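One common heuristic for spotting the adverse impact mentioned here is the four-fifths (80%) rule from the EEOC’s Uniform Guidelines: if the selection rate for a protected group is less than 80% of the rate for the highest-scoring group, adverse impact is presumed and the test’s use must be defensible (valid). A minimal sketch with hypothetical counts:

```python
# Quick check for adverse impact using the four-fifths (80%) rule, a
# common screening heuristic from the EEOC's Uniform Guidelines.
# The applicant counts below are hypothetical.
def selection_rate(hired: int, applicants: int) -> float:
    return hired / applicants

rate_group_a = selection_rate(hired=30, applicants=100)  # e.g., majority group
rate_group_b = selection_rate(hired=15, applicants=100)  # e.g., protected group

impact_ratio = rate_group_b / rate_group_a
print(f"impact ratio = {impact_ratio:.2f}")
if impact_ratio < 0.80:
    print("Potential adverse impact: the test's use must be defensible (valid).")
```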
Personality
Personality is an enduring pattern of thoughts, feelings and behaviors. Personality is generally stable over the life course. For instance, if you are more conscientious than another individual at say 16 years old, you may both become more conscientious over time, but your rank order won’t change; that is, you will most probably still be more conscientious than that individual as adults. It took researchers many years – many decades – studying personality to come up with a useful and validated model of what personality is. If you think about it, there are many different words or concepts that describe human personality. In fact, there are thousands of such words. Personality researchers developed what is known today as the Big Five model of personality using a lexical approach, which is to say a word-based approach, where they grouped together similar words, like synonyms, to try to come up with a concise model of the major dimensions of human personality. Today, researchers generally agree that adult personality is represented by five major factors, each with several sub-factors or dimensions.
Big Five Model of Personality Dimensions
These are the five major factors, which can be easily remembered using the acronym OCEAN – for openness, conscientiousness, extroversion, agreeableness and neuroticism (which is conceptually the reverse of emotional stability). If you try to think of a personality dimension – or trait – that isn’t represented in some way here, you will probably find it challenging. The full spectrum of human personality is represented here – at least, it is for normal personality or normal psychology. The model isn’t intended to represent abnormal or clinical psychological issues. What is useful about this model is that it is generally comprehensive, yet it is concise. This helps us have a common language and understanding of what personality is. It also allows researchers to develop measures of these personality traits. Again, since personality is stable, measures of personality should be – and are – stable over time, which is to say they demonstrate high levels of test-retest reliability.
Conscientiousness and Emotional Stability
Well researched, validated measures of some of the big five personality traits have proven useful in personnel selection. If you think about which of the big five would be most likely to predict job performance across all jobs, what would you say they are? Conscientiousness and emotional stability are related to job performance across all jobs, although to different degrees.
Conscientiousness correlates with job performance at .24. Emotional stability correlates with job performance at .15. While these correlations are small to moderate in size, these traits are correlated at .0 with intelligence (not at all). So, using these traits in addition to “g” for selection will increase the likelihood of selecting ee’s who perform well.
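Because these traits are essentially uncorrelated with intelligence, their validities add (in squared terms) when combined with “g”. A quick arithmetic sketch using the correlations from these notes, and assuming for simplicity that the two personality traits are also uncorrelated with each other:

```python
import math

# Validities from the notes: g ~ .60 (low end of the .6-.7 range),
# conscientiousness = .24, emotional stability = .15. The notes say the
# traits are correlated ~0 with intelligence; treating the two traits as
# uncorrelated with each other is a simplifying assumption of this sketch.
r_g, r_consc, r_stab = 0.60, 0.24, 0.15

# With uncorrelated predictors, squared validities simply add.
r_combined = math.sqrt(r_g**2 + r_consc**2 + r_stab**2)
print(f"validity of g alone:         {r_g:.2f}")
print(f"validity of g + personality: {r_combined:.2f}")
```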
Conscientious ee’s are more dedicated, careful and reliable. This is related to increased effort and motivation, which is related to better work performance. That is, conscientiousness is related to performance primarily through “self-regulatory” processes and “will do” performance factors, such as: the amount of effort exerted (time on task); quality of work (being careful, thorough and detail-oriented); more citizenship behaviors and less counterproductive work behavior; improved self-efficacy (conscientious employees develop more positive beliefs about their abilities); and goal setting (they are likely to set goals and remain committed to them).
Emotionally stable ee’s are less neurotic. This is related to not being stress prone, doubting one’s abilities, and worrying, which is related to better job performance.
Conscientious and emotionally stable employees not only perform better, they are also more likely to be on-time, to stay with and be committed to organizations; to engage in more citizenship behavior; to not engage in conflict; and to avoid alcohol and drug abuse.
Other three in Big Five:
Although conscientiousness and emotional stability have validities that are highly generalizable, the “other” Big Five traits also have practically significant validities, but they are more limited in their generalizability. For instance, openness to new experience is relevant to jobs which require extensive training, rotations across new or different contexts, or international assignments. Agreeableness is particularly useful for selection in customer service and for team-oriented work roles. Extraversion is one of the best predictors of success in management and sales positions.