In recruitment, the ultimate objective of Artificial Intelligence is to ensure that tens, hundreds or even thousands of applications are processed as objectively and efficiently as possible:
- based on a certain amount of data,
- by applying a consistent data processing method from one application to another,
- by breaking free from all representations/stereotypes and cognitive biases.
The formalisation of this systematic processing applied to data is called an algorithm.
The algorithms in question…
An algorithm is (according to Wikipedia): “A finite and unambiguous sequence of operations or instructions to solve a class of problems.”
At first glance, this might seem like a rather “mechanical”, even simplistic, way of operating, especially when it comes to processing applications (with real people behind each one)…
One can indeed easily imagine a somewhat simplistic algorithm of the type:
- If the candidate presents characteristic A (e.g. Education = Business School), then 20 points are added to their candidacy.
- If they then have characteristic B (e.g. experience in the same sector as the company that wants to recruit), then 40 points are added.
- And so on…
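To make this concrete, here is a minimal sketch in Python of such a rule-based scoring algorithm. The field names, values and point weights are purely illustrative assumptions, not taken from any real system.

```python
# Minimal sketch of a rule-based scoring algorithm (illustrative fields and values only).

def score_application(candidate: dict, company_sector: str) -> int:
    """Return a simple additive score for one application."""
    score = 0
    # Characteristic A: education (e.g. Business School) -> +20 points
    if candidate.get("education") == "Business School":
        score += 20
    # Characteristic B: experience in the same sector as the recruiting company -> +40 points
    if candidate.get("experience_sector") == company_sector:
        score += 40
    # ... further rules would be added in the same way
    return score

candidates = [
    {"name": "A", "education": "Business School", "experience_sector": "Retail"},
    {"name": "B", "education": "University", "experience_sector": "Banking"},
]
for c in candidates:
    print(c["name"], score_application(c, company_sector="Retail"))
```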
The important thing to understand is that algorithms can also be extremely sophisticated and elegant. For example, they can include “nested logics” such as:
- If A is between 70 and 100
- AND B has a value between 40 and 60
- AND C…
- AND D…
- AND/OR…
- AND so on, almost ad infinitum.
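The same “nested logic” idea might look like the short sketch below, where the variables A, B, C, D and every threshold are placeholders chosen for the example.

```python
# Sketch of a nested-logic rule: all names and thresholds are placeholders.

def matches_profile(a: float, b: float, c: float, d: float) -> bool:
    return (
        70 <= a <= 100           # A is between 70 and 100
        and 40 <= b <= 60        # AND B is between 40 and 60
        and c > 3                # AND C exceeds some threshold
        and (d == 1 or b > 55)   # AND/OR further conditions, almost ad infinitum
    )

print(matches_profile(a=85, b=50, c=5, d=1))  # True
print(matches_profile(a=60, b=50, c=5, d=1))  # False
```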
And yet, even though we have already far exceeded the data-processing capacity of the human brain (100% of us, recruiters included), we are still at a fairly basic level of algorithm!
When parameterising an algorithm, one can easily integrate other types of processing, such as computing correlations or linear regressions on a pool of data in order to extract threshold values, which can then be applied to the processing of a particular set of candidates for a given post.
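As a rough illustration of that kind of processing, the sketch below fits a linear regression of a performance measure against a candidate feature and derives a simple threshold from it; the data, the feature, and the target performance level are all assumptions made up for the example.

```python
# Sketch: deriving a threshold from historical data via correlation / regression.
# All numbers and names are invented for the example.
import numpy as np

# Historical pool: one feature (e.g. a test score) and an observed performance measure.
test_scores = np.array([52, 61, 70, 74, 80, 85, 90, 95], dtype=float)
performance = np.array([2.1, 2.4, 3.0, 3.2, 3.8, 4.0, 4.4, 4.7])

# Correlation tells us whether the feature is related to performance at all.
corr = np.corrcoef(test_scores, performance)[0, 1]

# Simple linear regression: performance ~ slope * score + intercept.
slope, intercept = np.polyfit(test_scores, performance, deg=1)

# Threshold: the test score predicted to reach an (arbitrary) target performance of 3.5.
target = 3.5
threshold = (target - intercept) / slope
print(f"correlation={corr:.2f}, threshold score = {threshold:.1f}")

# That threshold can then be applied when screening a new set of candidates for the post.
```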
Note that artificial intelligence in recruitment (predictive recruitment) is also extremely useful for identifying – before the actual recruitment phase – the factors that contribute to success and commitment in a given position, in a specific context.
Basically, the A.I. can help list the qualities required to be engaged and successful in a particular job. It may not seem like much, but these are still the essential preconditions for a successful recruitment. Without these factors, good luck finding your golden egg!
Machine Learning: How Machines Can Learn On Their Own!
Machine Learning can be defined as: “A set of mathematical and statistical approaches for a computer system that give it the capacity to learn from data.”
Machine Learning, even if it is clearly underexploited (or not exploited at all) by most HR Tech solutions today, nevertheless constitutes one of the most promising branches of Artificial Intelligence as it applies to recruitment.
Basically, Machine Learning in recruiting is what allows you to “learn from previous recruitments” in an objective way.
Applied systematically, Machine Learning has the power to gradually refine its selection criteria – for a given position – leading to increasingly powerful predictive capacities on the success and commitment of people in a given post.
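A minimal sketch of that idea, assuming we have labelled data on past recruitments (the feature names and values below are hypothetical, and scikit-learn's logistic regression stands in for whatever model a real system would use):

```python
# Sketch: learning from previous recruitments with a simple classifier.
# Features and labels are hypothetical; in practice they would come from past hires.
from sklearn.linear_model import LogisticRegression

# Each row: [cognitive_test_score, structured_interview_score]; label: 1 = successful hire.
X_past = [[72, 3.5], [88, 4.2], [65, 2.8], [91, 4.5], [58, 2.5], [80, 4.0]]
y_past = [0, 1, 0, 1, 0, 1]

model = LogisticRegression()
model.fit(X_past, y_past)

# Each new recruitment round adds fresh examples, so refitting gradually refines the criteria.
new_candidates = [[85, 4.1], [60, 2.9]]
print(model.predict_proba(new_candidates)[:, 1])  # estimated probability of success
```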
What factors impact the quality of machine learning?
There are two main types of factors that determine the quality of a machine learning system and whether or not it is discriminatory:
- The type of data that is used,
- The representativeness of the database on which machine learning is conducted.
The type of data used in machine learning
It goes without saying that, in order to avoid introducing negative effects, Machine Learning must be run on data that does not carry potentially discriminatory information, whether direct (age, gender, etc.) or indirect (address, schools attended, past experience, etc.).
Direct criteria act as exclusion criteria: they directly exclude certain categories of individuals without any established link to actual job performance.
Criteria are said to be “indirect” because, even if they do not seem to target a specific category of people, they can still very much do so in practice. Take addresses, for example: if I exclude all the applications from a certain neighborhood or area, I may well be excluding a group of people (perhaps it is a lower-income area, or one with a large immigrant population)…
Ideally, when selecting candidates during hiring, you should run the machine learning system on data that is not impacted (or is impacted as little as possible) by either of these types of criteria, which national or regional legislation considers discriminatory factors in hiring.
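One crude way to apply this idea is simply to keep the directly and indirectly sensitive fields out of the training data altogether. Here is a sketch with pandas and hypothetical column names (it does not remove every possible proxy, but it reflects the principle described above):

```python
# Sketch: excluding directly and indirectly discriminatory fields before training.
# Column names and values are hypothetical.
import pandas as pd

applications = pd.DataFrame({
    "age": [24, 41], "gender": ["F", "M"],          # direct criteria
    "address": ["district A", "district B"],         # indirect criteria
    "school": ["School X", "School Y"],
    "cognitive_score": [78, 85],                     # job-related signal
    "structured_interview": [3.9, 4.2],
})

direct = ["age", "gender"]
indirect = ["address", "school"]
training_data = applications.drop(columns=direct + indirect)
print(training_data.columns.tolist())  # only the non-sensitive features remain
```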
The specific case of psychological and behavioural variables
On the other hand, psychological and behavioural characteristics offer a particularly interesting alternative. Put simply, these characteristics tend to be very evenly distributed across pretty much any criterion that might be used to define a population.
If I choose to preselect my candidates on the basis of their cognitive abilities, and to assess those abilities with a standardised test rather than through criteria such as “attended top X schools”, I inevitably neutralise the factor “was raised by socially successful parents”. That alone is a step forward for fairness in recruitment.
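A sketch of such a preselection rule, assuming a standardised cognitive test score is available for every candidate (the cutoff value and the candidate data are invented for the example):

```python
# Sketch: preselecting on a standardised cognitive test score instead of schools attended.
# Cutoff and candidate data are invented for the example.
candidates = [
    {"name": "A", "school": "Top X school", "cognitive_score": 62},
    {"name": "B", "school": "Local university", "cognitive_score": 84},
    {"name": "C", "school": "No degree", "cognitive_score": 79},
]

CUTOFF = 75  # derived from an analysis of the job, not from past hires' schools
shortlist = [c["name"] for c in candidates if c["cognitive_score"] >= CUTOFF]
print(shortlist)  # ['B', 'C'] -- the school attended plays no role
```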
Likewise, if I run Machine Learning on a set of data that includes this information (individuals’ cognitive abilities) rather than information on schools attended, the system will ultimately tend to offer me applications that are much more diverse in terms of gender, origin and age.
The representativeness of the database on which the learning is carried out
This criterion is particularly interesting to study. If my company has tended to recruit, for a specific position, mainly men who went through an engineering school, and I then run a Machine Learning algorithm on their CVs… what do you think the outcome will be?
There is a good chance that among the criteria that emerge will be the fact of having attended this or that engineering school. However, many top engineering schools are disproportionately made up of young white men from privileged families.
If diversity is something important to me (besides just being a legal obligation), then I may have more interest in integrating other factors that – as seen above – are less likely to be impacted by applicants’ social background.
If, on the other hand, I analyse the data resulting from the cognitive tests taken by my current engineers, the machine learning algorithm will undoubtedly bring out criteria such as: “superior cognitive abilities”.
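To illustrate the contrast, the sketch below trains the same kind of model twice on a deliberately homogeneous (and entirely fictional) sample: once on a CV-derived feature (attended a given engineering school) and once on a cognitive test score, then inspects which feature carries the weight. It is only a toy illustration of the reasoning, not a real pipeline.

```python
# Sketch: which criterion a model 'brings out' depends on the features it is given.
# The sample below is deliberately homogeneous and entirely fictional.
from sklearn.linear_model import LogisticRegression

# Past employees for the post: [attended_engineering_school (0/1), cognitive_test_score]
X = [[1, 85], [1, 90], [1, 82], [1, 88], [0, 78], [0, 74]]
y = [1, 1, 1, 1, 0, 0]  # 1 = successful in the post

cv_model = LogisticRegression().fit([[row[0]] for row in X], y)    # CV-derived feature only
test_model = LogisticRegression().fit([[row[1]] for row in X], y)  # cognitive test score only

print("weight on 'engineering school':", round(cv_model.coef_[0][0], 2))
print("weight on 'cognitive score':   ", round(test_model.coef_[0][0], 2))
# Both look predictive on this sample, but the school criterion merely mirrors past hiring,
# whereas the cognitive-score criterion can be applied to candidates from any background.
```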
What should be understood is that if the analysis criteria are not impacted by potentially discriminatory variables, then the representativeness of the sample becomes less important.
Why? Basically, because when I run the analysis within my company, it doesn’t matter that my team is homogeneous at the moment: what emerges as a criterion is “superior cognitive abilities”, and the proportion of people with superior cognitive abilities is just as high among women as among men, and among people of any ethnicity, nationality, background or social status.
If I choose to apply this “cognitive aptitude” criterion in the future, while dropping the “engineering school” criterion, I don’t risk being pushed back towards the same group that made up my original sample. On the contrary, I will mechanically increase the diversity within my teams! (Provided, of course, that I also diversify my sourcing.)
Admittedly, the base sample (my company’s workforce) was “strongly characterised”, but it made it possible to highlight a distinguishing characteristic that is not specific to this particular population. In a way, the exercise allowed a “universal” characteristic to emerge.
All this to say that the argument that “it takes a huge database to do machine learning, and the base population has to be sufficiently diverse, otherwise it doesn’t work”… is simply incorrect!