AI-resistance in aptitude testing is real, but it varies significantly by test design. At the vulnerable end sit static, text-based question banks where a candidate can copy a question, paste it into ChatGPT, and receive a correct answer within seconds. At the resistant end sit assessments where AI assistance is either technically impossible or so slow and unreliable that it actively harms the candidate's performance.
The problem is that "AI-resistant" has become a standard marketing label, applied freely by providers whose assessments sit much closer to the vulnerable end than they'd have you believe.
Monitoring measures are not AI-resistant, as they only flag the behaviour after it has occurred
A number of providers position behavioural monitoring - tab-switch detection, unusual-input flags, or webcam proctoring - as their primary answer to AI cheating. This conflates detection with resistance.
Monitoring tools flag suspicious behaviour after it has happened; they don't prevent the candidate from using AI in the first place. A candidate who uses AI carefully - on a second device, for example - can still bypass many of these anti-cheat measures. The traditional assessment format itself is the problem, and there are limits to how closely you can monitor an unsupervised online assessment.
Providers who lead with detection as their AI strategy are implicitly acknowledging that their test format can be exploited. Genuine AI-resistance removes the underlying vulnerability. The better question to ask any provider is whether their test design makes AI assistance ineffective in the first place - not whether they can catch it after the fact, by which point you may already be screening out genuinely strong candidates who chose to boost their score.
Genuine AI-resistance comes from deliberate test design characteristics that render AI ineffective
Tests that combine the following characteristics are as close to AI-proof as anything currently available.
Interactive, changing test content
The most structurally AI-resistant assessments require continuous, multi-step physical interaction with dynamic content. There is no static block of question text that AI can easily read.
An example is Pipe Puzzle, one of Test Partnership's MindmetriQ assessments. The candidate is shown a grid of pipe pieces and must rearrange the moveable tiles to complete a continuous route from the start piece to the end piece, with pipe ends and empty tiles fixed in place. Even if a candidate screenshots the puzzle, AI tools struggle significantly with spatial logic tasks, often hallucinating routes that don't work or can't be built with the available pieces. And in the unlikely event that AI suggested a correct route, the candidate would still need to locate and move each piece themselves within the 10-15 second time limit.

Attempting to use AI on a task like this doesn't just fail to help; it consumes the time needed to actually solve the puzzle - making Pipe Puzzle a genuinely AI-proof measure of logical reasoning.
Short time limits
Traditional assessments commonly have time limits of 1-2 minutes per question - even those with anti-cheat measures in place. Within that window, the full workflow of copying the question into ChatGPT, receiving an answer, interpreting it and submitting it yourself is entirely feasible: a fast, organised candidate can complete it in as little as 15-20 seconds.
With a time limit of 10-15 seconds per question, none of those steps is feasible: even the fastest 15-20 second workflow overruns the window before the candidate has submitted an answer.
You obviously can't apply a 10-second time limit to a traditional assessment question, as candidates need substantial time to process large blocks of text or charts. The test format itself therefore has to differ from traditional assessments. Time pressure works best in combination with interactive formats. Together, they close off AI assistance from two directions at once: the format removes the clean input AI needs, and the time limit removes the window to use it even if one existed. Assessments that have both operate at a meaningfully different level of resistance from those that rely on time pressure alone.
Test Partnership's Number Racer is built around this principle. The candidate is shown a stream of falling number blocks and a target number. They must collect only the blocks that, when added together, reach the target - while actively avoiding the rest, in real time. To help, AI would need to read the live game state, identify a valid combination from only a handful of possible solutions, and communicate the answer back to the candidate before the relevant blocks have fallen past. Under strict time limits, that workflow is not feasible, and attempting it will leave the candidate scoring worse than if they had simply played the game themselves.

Progressive content reveal
One of the most common ways candidates use AI is by screenshotting a question and feeding the image into the AI tool. Assessments that never display the full question on screen at once remove that option entirely.
This can work in several ways:
- Information that appears only on click and disappears once viewed
- Content that reveals progressively as the candidate moves through a task
- Question elements that are never all visible simultaneously
In each case, there's no complete snapshot to capture and submit. A screenshot shows only a fragment - not enough for AI to reason about the full problem, let alone provide a useful answer. Candidates would have to take multiple screenshots and feed them all into the AI tool, which adds drastically more time to the process.
This characteristic adds a meaningful layer of resistance even in formats that aren't fully gamified. Combined with interactivity and time pressure, it closes off the screenshot-to-AI workflow that a significant proportion of candidates currently rely on.
In-person supervised testing
It's worth noting that requiring a candidate to sit an aptitude test in person also removes the threat of AI. Candidates are physically present and supervised, with no access to external tools.
The problem is that in-person testing isn't viable as an early screening tool, which is where aptitude assessments deliver the most value: running large candidate volumes through an in-person format is costly and logistically impractical. In-person testing works best as a verification stage - later in the process, to confirm the results of an earlier online assessment before a final hiring decision is made.
Gamified assessments are the closest thing to an AI-proof test
Gamified cognitive ability assessments combine the strongest AI-resistant design characteristics.
The format requires real-time physical interaction with dynamic content, imposes much shorter time limits, and presents a constantly changing interface with nothing to copy and paste. Candidates who attempt to use AI assistance during a well-designed gamified assessment will perform worse than those who simply engage with the task directly.
However, not all gamified assessments are created equal. A number of gamified formats on the market measure only personality traits, which are a considerably weaker predictor of job performance than cognitive ability: personality measures have a predictive validity of around 0.22, versus 0.65 for cognitive ability (Schmidt, Oh & Shaffer, 2016). General cognitive ability (g) is the strongest single predictor of job performance in the research literature (Schmidt & Hunter, 1998).
Pro Tip
When evaluating providers, confirm that their gamified assessments are designed to measure cognitive ability, not just behavioural traits. Test Partnership's MindmetriQ series of gamified tests are validated measures of general cognitive ability, with AI-resistance built into the format itself rather than added on top.
Conclusion and next steps
AI-resistance in aptitude testing is real and achievable, but it requires the right test design. The assessments that sit closest to AI-proof share three design characteristics:
- Continuous interaction with dynamic content
- Time limits tight enough to make the AI-assistance workflow impossible
- No point at which the full question is visible on screen at once
In-person testing removes AI entirely, but it belongs at the verification stage, not early screening.
When evaluating providers, apply a simple practical test - try to cheat their assessment yourself with ChatGPT and see how far you get. If you can get a usable answer back before time runs out, you have your answer.
Gamified cognitive ability assessments are currently the strongest available answer to the AI cheating problem. You can demo our AI-resistant MindmetriQ gamified tests and explore the science behind them.
