Artificial intelligence (AI) and the ability to predict outcomes based on analysis of patterns are helping advance almost every area of human society, ranging from autonomous vehicles to predictive medicine. The business world derives great value from AI-driven tools and leverages data in almost every function.
Perhaps most interesting is the recent proliferation of AI tools in the Human Resources field that address hiring, internal mobility, and promotion, along with the possible effects deploying these technologies can have on the business overall. These tools can offer great value to HR professionals: they aim to save time, lower recruiting costs, reduce manual labor, and collect vast amounts of data to inform decisions while helping avoid biases in human decision-making.
Companies must comply with strict legal and ethical requirements, and it’s incumbent upon HR leaders to understand how incorrectly designed and deployed AI tools can also be a liability.
The real challenge for HR leaders is that most AI-driven tools are “black box” technologies, meaning algorithm design and logic are not transparent. Without full insight into “the box,” it’s impossible for HR leaders to evaluate the degree to which such tools expose an employer to risk.
This article will briefly review some of the dangers of utilizing AI for people decisions; provide examples of how algorithms can be biased when they are trained to imitate human decisions; highlight the promise of AI for people-related decisions; and explore how AI can facilitate these decisions while addressing compliance, adverse impact, and diversity and inclusion concerns.
The Dangers of AI-Driven People Decisions
“Black box” algorithm design. Algorithms that leverage machine learning can both make decisions and “learn” from previous decisions; their power and accuracy come from their ability to aggregate and analyze large amounts of data efficiently and make predictions on new data they receive.
However, the challenge of algorithm design is deciding which factors, variables, or elements should be given more “weight,” meaning which data points should be given relative priority when an algorithm makes a decision. For example, if not taken into careful consideration, factors such as gender, ethnicity, area of residence, etc., can influence an algorithm’s output, biasing the decision and negatively affecting certain groups in the population.
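To make the notion of “weight” concrete, here is a minimal, hypothetical sketch (scikit-learn assumed, all data invented) showing how a screening model’s learned coefficients reveal which factors drive its scores, and how a seemingly neutral field such as a ZIP-code-based income index can act as a proxy for a protected characteristic even though no protected attribute appears in the feature list.

```python
# Hypothetical sketch: inspecting which factors a screening model weighs.
# The candidate data below are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy candidate features: [years_experience, skills_test_score, zip_income_index]
X = np.array([
    [5, 88, 0.9],
    [2, 75, 0.4],
    [7, 92, 0.8],
    [1, 70, 0.3],
    [4, 85, 0.7],
    [3, 60, 0.2],
])
# 1 = labeled a "successful hire" in the training data, 0 = not
y = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X, y)

# The learned coefficients are the "weights" referred to above. A large
# weight on zip_income_index means the model leans on a factor that can
# proxy for ethnicity or socioeconomic status.
for name, weight in zip(
    ["years_experience", "skills_test_score", "zip_income_index"], model.coef_[0]
):
    print(f"{name}: {weight:+.3f}")
```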
Recently, the Electronic Privacy Information Center (EPIC) filed a joint complaint with the Federal Trade Commission (FTC) claiming that a large HR-tech company providing AI-based analysis of video interviews (voice, facial movements, word selection, etc.) is engaging in deceptive trade practices. EPIC claims that such a system can unfairly score candidates and, moreover, cannot be made fully transparent to candidates because even the vendor cannot clearly articulate how the algorithms work.
This company claims to collect “tens of thousands” of biometric data points from candidate video interviews and inputs these data points into secret “predictive algorithms” that allegedly evaluate the strength of the candidate. Because the company collects “intrusive” data, uses them in a manner that can cause “substantial and widespread harm,” and cannot specifically articulate the algorithm’s mechanism, EPIC claims that such a system can “unfairly score someone based on prejudices” and cause harm.
Mimicking, rather than improving, human decisions. In theory, algorithms should be free from unconscious biases that affect human decision-making in hiring and selection. However, some algorithms are designed to mimic human decisions. As a result, these algorithms may continue to perpetuate, and even exaggerate, the mistakes recruiters may make.
Training algorithms on actual employee performance (i.e., retention, sales, customer satisfaction, quotas, etc.) helps ensure the algorithms weigh job-related factors more heavily and that potentially biasing factors (ethnicity, age, gender, education, assumed socioeconomic status, etc.) are controlled for.
Without such controls, the data these algorithms learn from will sometimes reflect and perpetuate long-ingrained stereotypes and assumptions about gender and race. One study found that natural language processing (NLP) tools can learn to associate African-American names with negative sentiments and female names with domestic work rather than professional or technical occupations.
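Researchers typically measure these learned associations by comparing how close name vectors sit to “pleasant” versus “unpleasant” words in an embedding space. The sketch below is a simplified, hypothetical version of that kind of association test; the vectors and labels are invented, and real studies use full word embeddings (such as word2vec or GloVe) trained on large text corpora.

```python
# Hypothetical sketch of a word-embedding association check.
# The 3-dimensional vectors below are invented for illustration; real
# embeddings have hundreds of dimensions.
import numpy as np

embeddings = {
    "name_group_a": np.array([0.9, 0.1, 0.2]),  # stand-in for a name from one demographic group
    "name_group_b": np.array([0.2, 0.8, 0.3]),  # stand-in for a name from another group
    "pleasant":     np.array([0.8, 0.2, 0.1]),
    "unpleasant":   np.array([0.1, 0.9, 0.2]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pleasantness_association(word):
    # Positive score: the word's vector sits closer to "pleasant" than to "unpleasant".
    return cosine(embeddings[word], embeddings["pleasant"]) - cosine(
        embeddings[word], embeddings["unpleasant"]
    )

for name in ("name_group_a", "name_group_b"):
    print(name, round(pleasantness_association(name), 3))
```

If a tool produces systematically different association scores for names drawn from different groups, it has absorbed exactly the kind of stereotype the study describes.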
Onetime calibration. Most HR-tech companies that support hiring decisions using AI conduct an initial calibration, or training, of their models on a company’s best-performing employees to identify the traits, characteristics, and features of top performers and then look for those same factors in candidates.
The rationale behind this process is valid, so long as the company’s measures of performance are neutral, job-related, and free from bias based on protected characteristics such as gender and ethnicity. However, performing this calibration only once undermines the long-term goal.
In today’s business context, in which companies are constantly evolving their strategy to address dynamic market conditions and competition, the key performance indicators (KPIs) used to measure employee success and the definition of roles change frequently. The top performers of today may not necessarily be the top performers of tomorrow, and algorithms must account for this, continuously readjusting and learning from these changes.
What Does the Law Say?
Every jurisdiction has its own legislation and case law, so it’s critical that HR leaders consult with legal advisors before making any decision about using AI for people decisions. However, U.S. employment law provides a good example of the level of care and detail required when thinking about deploying AI in your workforce.
Under Title VII of the Civil Rights Act of 1964 and related federal statutes, applicants and employees are protected from discrimination based on race, color, religion, sex, and national origin, as well as age, disability, and genetic information. On top of these federal protections, states and localities protect additional categories from employment discrimination. AI vendors, and companies that engage them, should be mindful of compliance with these intersecting laws.
Fairness and validity. Aside from complying with the law, companies using AI-driven assessments must demonstrate that the assessments are valid, meaning that the tools, in fact, test candidates for the skills or knowledge needed to be successful in the role. Companies must be able to demonstrate how a specific question is related to a job’s requirements.
For example, is a general knowledge question relevant for an operations position? Or is a question that asks candidates to distinguish between colors biased against colorblind individuals?
In addition, as previously discussed, AI tends to base its decisions and “learn” from the status quo. For example, if managers within an organization have traditionally been white men, will an assessment discriminate against an African-American woman simply because she doesn’t fit the profile that has been associated with “successful” managers in the past?
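One common screen for this kind of disparity is the four-fifths rule of thumb from the EEOC’s Uniform Guidelines on Employee Selection Procedures: if one group’s selection rate falls below 80% of the highest group’s rate, the process warrants closer statistical and legal review. The sketch below, using invented counts, shows the basic arithmetic; it is a rough screen, not a legal conclusion.

```python
# Hypothetical adverse-impact screen using invented numbers.
# Four-fifths (80%) rule of thumb: flag any group whose selection rate
# is less than 80% of the highest group's rate for closer review.

selections = {
    # group: (candidates assessed, candidates advanced by the AI tool)
    "group_a": (200, 90),
    "group_b": (150, 45),
}

rates = {g: advanced / assessed for g, (assessed, advanced) in selections.items()}
highest = max(rates.values())

for group, rate in rates.items():
    impact_ratio = rate / highest
    flag = "review" if impact_ratio < 0.8 else "ok"
    print(f"{group}: selection rate {rate:.2%}, impact ratio {impact_ratio:.2f} -> {flag}")
```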
How to Use AI Ethically and Legally
AI can provide significant advantages to HR leaders who want to leverage technology to strengthen their selection processes. But given the equally significant risks, there are a number of critical things to keep in mind or evaluate before moving forward with an AI-driven solution.
1. AI is not a fix for a broken hiring process. Even the most advanced AI tools are only as good as the data we feed them. If the data we collect in hiring and selection processes are not reliable and quantifiable, it’s impossible to build a repeatable process that is subject to validation and improvement.
Hiring practices should rely on proven methodologies scientifically shown to predict job success and reduce bias, such as biographical and personality questionnaires, behavioral-structured interviews, integrity tests, and situational judgment tests.
2. Ensure algorithm transparency. As an employer, you are obligated to own and understand the data being used to design and train the algorithms. For liability reasons, algorithm design and data sources should be clear and transparent so you can justify and prove that decisions are unbiased. This means ensuring the data are reliable and collected methodically to ensure uniformity.
Here are a few questions employers can ask AI vendors to ensure transparency:
- Which variables go into the algorithm, and what is their relative weight?
- How do you test and control for bias in your algorithms?
- How big is the data set you use to train your algorithms?
- Are the algorithms trained on employee data or recruiter decisions?
3. Algorithms should not be trained on human decisions. It’s imperative that algorithms not be trained to replicate human decisions, as human decision-making can be biased. Accordingly, algorithms should not be trained on what the recruiter and hiring manager consider a successful hire but should instead learn from actual employee performance. Doing so enables employers to ensure the algorithms are trained to predict the success of future candidates using examples of actual successful employees rather than simply simulating recruiters’ or hiring managers’ opinions.
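In practice, the difference comes down to which label the model is fit against. The hypothetical sketch below (invented data, scikit-learn assumed) contrasts training on the recruiter’s original yes/no call with training on an outcome measured after hire.

```python
# Hypothetical sketch contrasting two training targets for the same candidates.
# All data are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Candidate features: [structured_interview_score, work_sample_score]
X = np.array([[80, 70], [65, 90], [90, 60], [55, 85], [75, 80], [60, 65]])

# Label A: whether the recruiter advanced the candidate (may encode human bias)
recruiter_decision = np.array([1, 0, 1, 0, 1, 0])

# Label B: whether the employee later met performance goals
# (retention, sales, customer satisfaction, etc.)
met_performance_goals = np.array([0, 1, 1, 1, 1, 0])

# Training on Label A teaches the model to imitate the recruiter;
# training on Label B teaches it to predict actual job success.
imitation_model = LogisticRegression().fit(X, recruiter_decision)
outcome_model = LogisticRegression().fit(X, met_performance_goals)

print("weights learned from recruiter decisions:", imitation_model.coef_[0])
print("weights learned from job performance:   ", outcome_model.coef_[0])
```

A real pipeline must also contend with the fact that performance outcomes are only observed for candidates who were actually hired, a selection-bias problem this sketch ignores.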
4. Continuous calibration. Because company strategy is ever-evolving, it’s important that algorithms be continuously adjusted based on real-life employee results and business context. In more practical terms, this means that an employee who is successful today and who is able to succeed in the current environment may not necessarily be successful in a few years’ time if the company environment or KPI benchmarks evolve.
Algorithm-driven decision-making can account for such changes, but it’s incumbent on HR leaders to make sure the tools used have the capability to naturally adjust to these changes over time.
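One minimal way to operationalize continuous calibration, sketched below with an invented data structure and retraining routine (scikit-learn assumed), is to periodically refit the scoring model on a rolling window of recent employee outcomes rather than freezing the weights learned at the initial calibration.

```python
# Hypothetical sketch of continuous recalibration on a rolling window.
# The data structures and refresh cadence are invented for illustration.
from dataclasses import dataclass
from datetime import date, timedelta

import numpy as np
from sklearn.linear_model import LogisticRegression

@dataclass
class EmployeeRecord:
    features: np.ndarray   # assessment scores captured at hire
    met_current_kpis: int  # 1 if the employee meets today's KPIs, else 0
    as_of: date            # when the outcome was measured

def recalibrate(records, window_days=365):
    """Refit the scoring model only on outcomes measured recently,
    so the definition of 'success' tracks the current business context."""
    cutoff = date.today() - timedelta(days=window_days)
    recent = [r for r in records if r.as_of >= cutoff]
    X = np.array([r.features for r in recent])
    y = np.array([r.met_current_kpis for r in recent])
    return LogisticRegression().fit(X, y)
```

On whatever cadence fits the business (quarterly, for example), the refreshed model replaces the one scoring new candidates, and its output can be re-run through the same bias and adverse-impact checks described above.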
Shiran Danoch is the Cofounder and Chief of Behavioral Science at Empirical, a prehire assessment platform that uses AI and machine learning to empower managers to make data-driven hiring decisions. Danoch is a licensed industrial and organizational psychologist based out of Israel.

Gal Sagy is the Cofounder and CEO of Empirical. Sagy is located in Los Angeles.

Aaron Crews is Littler’s Chief Data Analytics Officer, based out of the firm’s Sacramento office. He leads the firm’s data analytics practice and Big Data strategy, working with clients and case teams to harness the power of data to build superior legal strategies and improve legal outcomes.

Matt Scherer is an associate in Littler’s Portland office. He is a member of the firm’s Big Data Initiative, focusing on leveraging data science and statistical analysis to gain insights into legal issues that employers face.