When it comes to a school’s remote proctoring approach for online exams, the systemic bias of fully automated AI-based approaches is at least as concerning as the subjective bias of human proctoring. Today, bias in automated proctoring is known to disproportionately impact women and people of color, as well as students with certain medical conditions, resulting in widespread distrust of remote proctoring technology. As Shea Swauger put it in the MIT Technology Review, “[A]lgorithmic proctoring is a modern surveillance technology that reinforces white supremacy, sexism, ableism, and transphobia.”
To ensure a fair online testing environment that supports student success, the school’s proctoring system should combine the automation of an AI-based solution with human oversight. But human intervention alone is not enough; the AI models in human-in-the-loop (HITL) systems must be built to prevent bias and create a more level playing field for students.
From a machine learning perspective, the term “bias” is used in many different contexts, including sample bias, prejudice bias, measurement bias, and algorithmic bias.
While all of these can have an impact on AI-based remote proctoring, sample bias is the root of many of the most significant problems presented by online invigilation systems.
AI systems are trained by processing large datasets that serve as the foundation for their models. If these datasets don’t accurately represent a diverse population, the resulting models may become very good at analyzing overrepresented populations while remaining unable to offer accurate insights about others.
Most online proctoring companies do not develop their own AI models and don’t train their models on industry-specific data. Instead, they rely on off-the-shelf machine learning models and general datasets that may not represent end users. In practice, this means that the datasets used by most proctoring systems skew white, Western, male, and able-bodied, and the resulting models are often inadequate for other populations.
The result is that many AI-based remote proctoring approaches can cause profound damage, from shutting students out of exams when biased facial recognition software fails to identify test-takers with dark complexions, to flagging the ordinary movements of test-takers with certain disabilities as suspicious. Additionally, most datasets are not specific to the testing environment, producing models that do not account for the unique conditions in which they are deployed and reducing overall reliability.
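To make the sample-bias problem concrete, one simple audit is to disaggregate a model’s error rate by demographic group. The sketch below is illustrative only: the column names, the example data, and the idea of comparing false-flag rates are assumptions for the sake of the example, not a description of any vendor’s pipeline.

```python
# Illustrative audit: compute the share of legitimate sessions a proctoring
# model incorrectly flags, broken down by demographic group. A large spread
# between groups suggests some populations are underrepresented in training.
import pandas as pd

def false_flag_rates(results: pd.DataFrame, group_col: str) -> pd.Series:
    """False-flag rate per group.

    `results` is assumed to have a boolean `flagged` column (model output)
    and a boolean `violation` column (ground truth from human review).
    """
    legit = results[~results["violation"]]          # keep only sessions with no real violation
    return (
        legit.groupby(group_col)["flagged"].mean()  # fraction of legit sessions flagged
        .sort_values(ascending=False)
    )

# Example usage with a made-up results table:
results = pd.DataFrame({
    "group":     ["A", "A", "B", "B", "B", "C"],
    "flagged":   [False, True, False, False, True, False],
    "violation": [False, False, False, False, False, False],
})
print(false_flag_rates(results, "group"))
```

If the rate for one group is several times higher than for another, the model is effectively holding those test-takers to a different standard, which is exactly the failure mode described above.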
Respect for students and for the learning process itself should lie at the heart of proctoring no matter how or where tests are administered. As such, the best remote proctoring approach seeks to eliminate bias whenever possible. At Rosalyn, that begins with the dataset.
Unlike most invigilation systems, Rosalyn uses its own proprietary database of approximately 250,000 sessions representing test-takers from all over the world. This means AI models can be purpose-built with exposure to test-takers of diverse skin tones, movement patterns, gender identities, and socio-economic backgrounds, captured across the true range of bandwidth and connection quality found in real-life testing environments. As a result, proctoring can be applied more equitably, without giving certain demographics an unfair advantage.
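One common way to work toward that kind of balance is stratified resampling, so that no single group dominates training. The sketch below illustrates the general technique under the assumption of a simple tabular session log with a `group` column; it is not a description of Rosalyn’s actual data pipeline.

```python
# Illustrative balancing step: downsample every demographic group to the size
# of the smallest one so the training set is not dominated by any group.
import pandas as pd

def balance_by_group(sessions: pd.DataFrame, group_col: str, seed: int = 0) -> pd.DataFrame:
    """Return a training set with an equal number of sessions per group."""
    min_size = sessions[group_col].value_counts().min()
    parts = [
        group.sample(n=min_size, random_state=seed)   # same number of rows from each group
        for _, group in sessions.groupby(group_col)
    ]
    return pd.concat(parts).reset_index(drop=True)
```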
Rosalyn’s system is also dynamic; rather than relying on a static machine learning model, it continually learns from new test sessions, becoming more intelligent over time as the dataset grows and the models are updated with new information.
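Conceptually, this resembles incremental learning, where newly reviewed sessions are folded into the model rather than training once on a frozen snapshot. The sketch below uses scikit-learn’s `partial_fit` API as a stand-in; the features, labels, and classifier choice are assumptions for illustration, not Rosalyn’s actual stack.

```python
# Illustrative incremental-learning loop: fold each batch of newly reviewed
# sessions into an existing model instead of retraining on a static dataset.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss="log_loss", random_state=0)
classes = np.array([0, 1])  # 0 = no violation, 1 = human-confirmed violation

def update_model(feature_batch: np.ndarray, label_batch: np.ndarray) -> None:
    """Fold one batch of reviewed sessions into the existing model."""
    model.partial_fit(feature_batch, label_batch, classes=classes)

# Each time a batch of sessions is reviewed, update the model with it.
update_model(np.random.rand(32, 8), np.random.randint(0, 2, size=32))
```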
State-of-the-art AI can be powerful and open up new opportunities for developing fair, consistent, scalable remote proctoring approaches. However, it cannot replicate the wisdom of the human observer. That’s why Rosalyn combines AI with human intervention that allows educators to retain decision-making authority.
With Rosalyn, AI is the first line of defense. When potential violations are flagged, a human proctor reviews the flagged activity, either in real time or post-session, to determine whether a violation has actually occurred. This can greatly reduce false positives and help both students and educators feel confident in the proctoring process.
Of course, human observers open up another potential source of bias, one that can be significantly more complex than the bias that arises in machine learning. In fact, the same populations that tend to be most affected by poorly designed AI systems are often disadvantaged by educators and educational institutions. Rosalyn’s advanced HITL system corrects for this with a workflow that decouples proctors from the students being monitored: proctors only see potential violations flagged by the AI, which provides an objective basis for decisions and minimizes the risk of discriminatory practices.
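In outline, such a workflow can be thought of as a review queue that receives only AI-flagged segments, with records that reference a session rather than a student’s identity. The sketch below is a simplified illustration under those assumptions; the field names and confidence threshold are hypothetical, not Rosalyn’s actual schema.

```python
# Illustrative human-in-the-loop review queue: only AI-flagged segments reach
# a proctor, and each record carries an opaque session reference rather than
# the student's identity, so reviews are grounded in the flagged evidence.
from dataclasses import dataclass
from typing import List

@dataclass
class Flag:
    session_id: str     # opaque reference, not the student's identity
    timestamp_s: float  # where in the recording the event occurred
    reason: str         # e.g. "second person detected"
    confidence: float   # model score, shown to the reviewer for context

def build_review_queue(flags: List[Flag], min_confidence: float = 0.5) -> List[Flag]:
    """Queue flagged segments above a confidence floor, highest-confidence first."""
    return sorted(
        (f for f in flags if f.confidence >= min_confidence),
        key=lambda f: f.confidence,
        reverse=True,
    )
```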
It’s in the best interest of students and educators alike to ensure that test results accurately reflect the student’s knowledge of the subject matter. As education increasingly moves online, that requires taking a remote proctoring approach that simultaneously addresses the biases inherent to many invigilation systems and to human proctors. Rosalyn’s advanced system allows all students to feel comfortable and trust that no test-taker has an unfair advantage—or disadvantage.