5 command terms that cost AI candidates marks on Papers 1 and 2

IB Math AI assessment rewards modelling thinking over procedural recall. This article maps the specific command terms, rubric expectations, and Paper 3 inquiry habits that separate a 5 from a 6 —…

IB Courses | June 1, 2026 | 17 min read

IB Mathematics: Applications and Interpretation (AI) assesses a fundamentally different mathematical competence from its sister course, Analysis and Approaches. Where AA rewards analytical elegance and procedural fluency, AI prioritises the ability to construct, solve, and interpret mathematical models of real-world situations. Most candidates who plateau at a 5 in AI have strong computational skills but struggle with the contextual layer that the course's command terms explicitly demand. This article isolates the modelling-cycle thinking that examiners look for across AI Papers 1 and 2, explains how Paper 3 (HL) tests this under inquiry conditions, and provides a targeted revision framework built around the specific demands of the Applications and Interpretation course.

What AI actually assesses: the modelling cycle, not the answer

The AI syllabus document states explicitly that students should be able to 'construct, use and interpret mathematical models.' This is not a vague aspiration — it defines how every question on every paper is written, marked, and grouped into assessment objectives. A candidate who arrives at the correct numerical answer via a shortcut they cannot explain has answered a different question from the one the examiner set.

The modelling cycle has four identifiable stages: formulating the problem and identifying relevant variables; constructing a model by choosing appropriate relationships; solving the model using technology or algebra; and interpreting the results in context, including a judgement about whether the model is valid. Each stage maps to specific rubric categories. Candidates who skip stages 1 and 4 — the most common pattern among plateaued scorers — leave 15-20% of available marks uncollected simply because they answered the mathematics without explaining why it was the right mathematics to use.

The three assessment objectives and how AI distributes them

AO1 (Knowledge and understanding): Selecting and applying appropriate techniques and processes. In AI this includes GDC operations — chi-squared tests, matrix multiplication, regression calculations — as well as algebraic manipulation.
AO2 (Reasoning and communication): Justifying procedures, explaining choices, interpreting results. This is where the modelling cycle is most visible in the rubric and where AI candidates lose the most marks relative to their computational ability.
AO3 (Applications and modelling): Formulating models, evaluating their validity, using technology appropriately. This objective carries significant weight on both papers and is tested with particular rigour in Paper 3 (HL).

Most revision time goes to AO1. The highest-yield preparation for a 6 or 7 involves systematic development of AO2 and AO3 habits — not additional practice on AO1-type questions, which most candidates at this level handle reliably already.

Five command terms that define AI's rubric expectations

AI's command terms are not interchangeable with AA's. A question using 'solve' in AI often carries an expectation of technological solution followed by contextual interpretation, whereas 'solve' in AA frequently asks for an exact analytical result. The following five command terms appear with high frequency in AI papers and are frequently mishandled by candidates who read them as procedural instructions rather than modelling tasks.

Interpret

'Interpret' is the most consequential command term in AI. It asks candidates to explain what a mathematical result means in the context of the problem — not merely to state the result. A chi-squared test yielding a p-value of 0.003 requires the candidate to say something like: 'The p-value is below the significance level of 0.05, so we reject the null hypothesis and conclude that there is a statistically significant association between the variables.' The conclusion is a statement about the context, not about the chi-squared value itself. Candidates who respond with 'the result is significant' without explicitly connecting the p-value to the stated significance level lose the interpretation mark.
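The structure of a full interpretation can be sketched in a few lines of Python. This is only an illustration of the sentence pattern, not a mark-scheme template: the function name `interpret_p_value` and the context "study method and exam grade" are hypothetical, and the wording is one of several acceptable phrasings.

```python
def interpret_p_value(p_value: float, alpha: float, variables: str) -> str:
    """Build a contextual conclusion for a chi-squared test of independence."""
    if p_value < alpha:
        # Compare explicitly with the stated level, then conclude in context.
        return (f"p = {p_value} < {alpha}, so we reject the null hypothesis: "
                f"there is sufficient evidence that {variables} are associated.")
    return (f"p = {p_value} >= {alpha}, so we do not reject the null hypothesis: "
            f"there is insufficient evidence that {variables} are associated.")


# Hypothetical context, chosen only for illustration.
print(interpret_p_value(0.003, 0.05, "study method and exam grade"))
```

Both branches name the significance level and the variables, which is the link that a bare "the result is significant" leaves out.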

Describe

'Describe' asks for a characterisation of a mathematical feature — a distribution, a relationship, a pattern — in context. In AI this frequently appears alongside statistical output from the GDC. A candidate asked to describe the relationship shown in a residual plot should identify the pattern (e.g., 'the residuals show systematic curvature, suggesting the linear model is not appropriate') and link it to the underlying assumption being violated. Providing only a numerical description — 'the residual standard error is 2.3' — satisfies neither the mathematics nor the rubric.
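To see what "systematic curvature" looks like numerically, the sketch below fits a straight line to data that actually follow a curve and lists the residuals. The dataset and the helper `linear_fit` are hypothetical and chosen only so the arithmetic is easy to follow; on a GDC the same residuals would come from the regression output.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]  # hypothetical data that really follow y = x^2

slope, intercept = linear_fit(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
signs = "".join("+" if r > 0 else "-" for r in residuals)

print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
print(signs)      # +---+
```

The residuals run positive, negative, positive rather than scattering randomly about zero. That pattern is what the description should name and connect to the linear model being inappropriate.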

Find

'Find' in AI is not simply 'calculate and write the answer.' AI rubric criteria frequently award marks for finding a value that requires justification of the method as well as the result. If a question says 'find the expected frequency' in a chi-squared test, the candidate needs to show that expected frequencies are being used (i.e., the correct matrix from the GDC), not simply copy a number from a calculator screen. Showing the formula alongside the GDC output demonstrates the reasoning that AO2 rewards.
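The formula behind the GDC's expected-frequency matrix is (row total × column total) ÷ grand total. A minimal sketch, assuming a hypothetical 2 × 2 table of observed counts, shows the calculation that the calculator performs:

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


observed = [[10, 20], [30, 40]]  # hypothetical observed counts
print(expected_frequencies(observed))  # [[12.0, 18.0], [28.0, 42.0]]
```

Writing one entry out by hand, for example 30 × 40 ÷ 100 = 12, next to the GDC matrix is usually enough to show where the number came from.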

Verify

'Verify' asks candidates to check the validity of a model or result against the given conditions. In AI this often means substituting values back into a regression equation and commenting on the goodness of fit, or checking whether a differential equation solution satisfies the initial condition. Candidates who calculate correctly but skip verification leave AO3 marks unclaimed.
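As a worked illustration, take a hypothetical model (not from any past paper) and check both the initial condition and the differential equation itself:

```latex
\begin{aligned}
\text{Model: } & \frac{dy}{dx} = 2y, \quad y(0) = 3, \qquad \text{candidate solution: } y = 3e^{2x} \\
\text{Initial condition: } & y(0) = 3e^{0} = 3 \\
\text{Equation: } & \frac{dy}{dx} = 6e^{2x} = 2\left(3e^{2x}\right) = 2y
\end{aligned}
```

Both checks pass, so the candidate solution is verified. Stating each check explicitly is what makes the verification visible to the examiner.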

Calculate

Of all the command terms, 'calculate' comes closest to a purely procedural instruction in AI — but even here, the expectation includes presenting the result with appropriate precision and, where relevant, units. A candidate calculating a probability from a normal distribution should state the result as '0.234 (3 significant figures)' or '23.4%', not simply write '0.2343' without context. Units and precision matter because AI's real-world framing makes both meaningful.
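The precision convention can be sketched as follows. The distribution N(50, 10²) and the helper `round_sig` are assumptions made for illustration; `statistics.NormalDist` is part of the Python standard library and plays the role of the GDC's normal cdf function here.

```python
from math import floor, log10
from statistics import NormalDist


def round_sig(value: float, figures: int = 3) -> float:
    """Round a value to the given number of significant figures."""
    if value == 0:
        return 0.0
    return round(value, figures - 1 - floor(log10(abs(value))))


# Hypothetical model: X ~ N(50, 10^2); calculate P(X < 45).
p = NormalDist(mu=50, sigma=10).cdf(45)

print(round_sig(p))           # 0.309
print(f"{round_sig(p):.1%}")  # 30.9%
```

Either printed form is an acceptable final answer; the unrounded calculator display on its own is not.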

Paper 1: where the modelling demands appear earliest

Paper 1 is technology-permitted throughout. This means every question can be solved using the GDC — the examiner has designed the questions with this in mind. The practical consequence is that Paper 1 tests whether candidates can select the right GDC function, use it correctly, and then interpret the output within the problem's context. Candidates who rely on memorised procedures without understanding which GDC operation corresponds to each modelling step find Paper 1 harder than expected — the calculator does not compensate for conceptual confusion about what you are trying to model.

The sustained problem contexts in Paper 1 typically present 6-8 short questions tied to a single scenario. A common pattern involves a statistical investigation: defining a hypothesis, calculating a test statistic, finding a p-value, interpreting the result, evaluating a model, and commenting on limitations. Each question tests a different stage of the modelling cycle within the same underlying context. Candidates who treat each question as an isolated problem miss the coherence the examiner has built into the rubric: AO2 marks are frequently awarded for showing awareness of how one stage relates to the next.

The GDC as an assessment tool, not just a calculation aid

In AI Paper 1, the GDC is part of the assessment. Candidates who cannot navigate their calculator to produce a specific regression output, perform a specific hypothesis test, or find the intersection of two curves quickly will run out of time. Paper 1 runs for 90 minutes at SL and 120 minutes at HL. A candidate who spends more than 90 seconds locating a function on the GDC during the exam has already created a time deficit that compounds through the paper.

The specific GDC skills that AI Paper 1 requires include: navigating between spreadsheet and graph views efficiently, using the statistics package for regression and tests, storing and recalling values for use in subsequent calculations, and using the equation solver for non-standard equations that cannot be solved analytically. These are not optional efficiencies — they are load-bearing skills for the time-pressured conditions of the examination.
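For readers who want to see what an equation solver is doing with a non-standard equation, here is a minimal bisection sketch. It is an assumption that bisection is a fair stand-in: GDC solvers use their own numerical methods, and the equation e^(-x) = x is a hypothetical example with no algebraic solution.

```python
from math import exp


def bisect(f, low, high, tol=1e-8):
    """Find a root of f in [low, high] by repeatedly halving the interval."""
    if f(low) * f(high) > 0:
        raise ValueError("f must change sign on [low, high]")
    while high - low > tol:
        mid = (low + high) / 2
        if f(low) * f(mid) <= 0:
            high = mid  # the sign change lies in the left half
        else:
            low = mid   # the sign change lies in the right half
    return (low + high) / 2


# Solve e^(-x) = x by finding where e^(-x) - x crosses zero.
root = bisect(lambda x: exp(-x) - x, 0, 1)
print(round(root, 3))  # 0.567
```

The same idea explains why a solver needs a sensible starting interval or guess: without a sign change in range, it has nothing to close in on.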

Paper 2: sustained contexts and the interpretation mark sequence

Paper 2 differs from Paper 1 in two structural ways that reshape the modelling demands. First, although Paper 2 is technology-permitted throughout, just like Paper 1, its questions are extended-response rather than short-response, so a bare GDC output without supporting working and interpretation earns little credit. Second, the sustained problem contexts are longer and more complex, often involving multi-stage modelling where the output of one stage becomes the input for the next.

The multi-stage structure means a single error at an early stage propagates through subsequent questions. This is by design — Paper 2 rubric criteria include a 'follow-through' provision where a mathematically consistent continuation of an earlier error earns marks even if the final numerical answer is wrong. But this provision only helps candidates who have shown their working clearly and logically. Candidates who omit steps in early questions leave examiners with no basis for awarding follow-through marks, even if their approach was sound.

Common calculation errors under Paper 2 time pressure

Paper 2's sustained contexts generate a specific pattern of error. Because the problems are longer and involve more variables, candidates experience higher cognitive load. The most frequent errors I see in scripts include misreading the axes on a provided diagram, substituting values into the wrong variable in a multi-variable model, and rounding intermediate values too aggressively, which compounds to a final answer that falls outside the accepted range. Final answers in Paper 2 are normally expected to be exact or correct to three significant figures. This tolerance is generous, but it does not protect against conceptual errors or omitted working.

A practical habit that prevents this: whenever you complete a stage in a multi-stage problem, write the result in the margin of your script and circle it. Before moving to the next stage, check that the value makes sense in context. This takes 15 seconds and catches the majority of propagation errors that cost 2-3 marks each.
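The compounding effect of early rounding is easy to demonstrate. The sketch below uses a hypothetical growth model (initial value 1000, continuous rate 0.0462 per year, 20 years) and compares carrying the full growth factor with rounding it to three significant figures before raising it to a power.

```python
from math import exp, floor, log10


def round_sig(value: float, figures: int = 3) -> float:
    """Round a value to the given number of significant figures."""
    if value == 0:
        return 0.0
    return round(value, figures - 1 - floor(log10(abs(value))))


initial = 1000   # hypothetical starting value
rate = 0.0462    # hypothetical continuous growth rate per year
years = 20

factor = exp(rate)                                  # about 1.04728
full = initial * factor ** years                    # full precision carried through
early = initial * round_sig(factor) ** years        # factor rounded to 1.05 first

print(round_sig(full))   # 2520.0
print(round_sig(early))  # 2650.0
```

Rounding one intermediate value shifts the final answer by more than 100, well outside three-significant-figure agreement. Storing the unrounded value on the GDC and rounding only at the end avoids this.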

Paper 3 (HL): the inquiry format that catches unprepared candidates

Paper 3 is unique to the HL programme and is the most distinctive assessment in the IB Mathematics suite. It presents two extended problems, each with several sub-parts that build on the previous one. The time allocation is 60 minutes for 55 marks, roughly 1.1 minutes per mark, which is the same rate as HL Papers 1 and 2 (120 minutes for 110 marks each). The difference lies in how that time is expected to be spent: on genuine mathematical thinking, which means proposing extensions to the problem, testing conjectures, and justifying reasoning.

The inquiry format of Paper 3 means that candidates who have learned only to answer questions — not to ask them — struggle. The final parts of a Paper 3 question typically invite candidates to pursue an extension or generalisation that is not fully specified. Candidates who have not encountered this format in practice often freeze at this point because they do not know what 'good enough' looks like. The rubric for the extension component rewards mathematical justification and a clear link to the original problem, not necessarily a correct final answer.

Topics that Paper 3 HL candidates must handle with fluency

Based on the published Paper 3 specimens and recent examination sessions, the following topic areas appear with sufficient frequency to warrant priority in revision: differential equations and Euler's method, matrix applications including eigenvalues and eigenvectors, iterative methods for solving equations, and extended statistical inference problems. Within each of these, the modelling cycle applies — candidates must be able to explain why the technique is appropriate, execute it correctly using the GDC where needed, and interpret the output within the problem's context.

One specific Paper 3 skill that AI HL candidates often underdevelop is the ability to validate a numerical solution from an iterative method against the analytical result (where one exists). The two approaches give slightly different numerical results, mainly because the iterative method approximates the solution in a finite number of steps (in Euler's method the error depends on the step size) and partly because of rounding. Commenting on this discrepancy and explaining why it occurs is an AO2 skill that Paper 3 rewards explicitly. Candidates who only provide the numerical answer and miss this interpretive step leave marks on the paper.
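A short sketch of Euler's method makes the discrepancy concrete. The differential equation dy/dx = y with y(0) = 1 is a hypothetical example chosen because its exact solution, y = e^x, is known; the function name `euler` is likewise illustrative.

```python
from math import e


def euler(f, x0, y0, step, n_steps):
    """Approximate y after n_steps of Euler's method for dy/dx = f(x, y)."""
    x, y = x0, y0
    for _ in range(n_steps):
        y += step * f(x, y)  # move along the tangent at the current point
        x += step
    return y


def growth(x, y):
    """Right-hand side of dy/dx = y; the exact solution gives y(1) = e."""
    return y


coarse = euler(growth, 0, 1, 0.1, 10)   # step size 0.1
fine = euler(growth, 0, 1, 0.05, 20)    # step size 0.05

print(round(coarse, 4), round(fine, 4), round(e, 4))  # 2.5937 2.6533 2.7183
```

Both estimates fall below the exact value, and halving the step size moves the estimate closer to it. A sentence saying exactly that, in the context of the problem, is the kind of commentary the interpretive step calls for.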

Common pitfalls and how to avoid them

The following error patterns appear consistently across AI examination scripts. Each has a specific cause and a specific remediation strategy that takes less than ten minutes to implement.

Pitfall 1: procedural fluency without contextual awareness

This is the most common cause of plateaued scores in AI. The candidate can execute all the required techniques — hypothesis tests, regression calculations, differential equations — but cannot reliably explain why a particular technique is appropriate or what the result means in context. The fix is to add one sentence to every practice answer: 'This means [result] in the context of [problem], because [relevant criterion from the problem].' Making this a habit transforms AO2 performance without requiring additional mathematical content.

Pitfall 2: calculator dependency without calculator understanding

Candidates who use the GDC as a 'magic box', entering values and copying outputs without knowing which function is performing which operation, are vulnerable to subtle errors that the GDC cannot detect. The most common: entering data into the statistics package with the wrong category coding, which produces a correct-looking but meaningless regression output. The fix is to test the GDC result against a known simple case before relying on it in the examination. If a linear regression on a few points taken from a known line, such as y = 2x + 1, returns that slope and intercept with a perfect fit, the calculator is set up correctly. This takes 30 seconds and prevents an entire category of error.

Pitfall 3: ignoring the model evaluation requirement in regression questions

AI Paper 1 and Paper 2 both test regression analysis with some regularity. The modelling cycle requires candidates to evaluate whether the chosen model is appropriate — typically by examining residuals or reporting the coefficient of determination (R²). Candidates who provide the regression equation and coefficients but omit the residual analysis have answered only the AO1 component of the question. The AO3 component, which asks for an evaluation of the model's validity, goes unanswered.

AI versus AA: understanding the assessment difference in concrete terms

The choice between AI and AA is partly determined by university course requirements and partly by mathematical preference — but understanding the assessment differences helps explain why a candidate who scores 6 in AI HL might have scored differently in AA HL, and vice versa. The table below maps the key structural and emphasis differences that affect preparation strategy.

Assessment feature	AI HL / SL	AA HL / SL
Primary mathematical emphasis	Modelling, statistics, interpretation	Calculus, algebra, proof
GDC role	Central — technology-permitted throughout Paper 1	Limited — analytical methods prioritised; GDC restricted on Paper 1
Statistics content	Extensive: hypothesis tests, regression, distributions	Minimal: basic probability only
Calculus content	Functional; applications-oriented	Deep: integration techniques, proof, differential equations
Paper 3 (HL) format	Extended problem solving, inquiry, modelling	Proof-based problem solving, formal mathematics
Typical IA/Exploration topic focus	Statistics, modelling, real-world data analysis	Calculus investigation, abstract mathematics

Strategic preparation: where to focus for maximum score impact

Effective AI preparation is not about covering more content — it is about developing the modelling habits that the assessment rewards. Based on the rubric distribution and the error patterns described above, the following priorities produce the highest return per hour of study.

Build the modelling cycle habit: For every practice question, write a brief note identifying which stage of the modelling cycle each part of the question is testing. This takes two minutes and builds the contextual awareness that AO2 and AO3 require.
Practise command-term responses explicitly: Select five past paper questions that use 'interpret' and write the response without looking at the mark scheme. Then compare. The gap between your response and the mark scheme answer will almost certainly be a gap in contextual interpretation, not mathematical content.
Develop Paper 3 (HL) inquiry skills: Work through one previous Paper 3 question per week in the months before the examination. The inquiry format requires practice — it is qualitatively different from answering structured questions. Focus on the final parts of each question, where candidates are asked to extend or evaluate the model.
Strengthen GDC fluency systematically: Identify the five GDC operations you use least fluently (e.g., matrix multiplication, iterative equation solving, residual analysis). Practise each operation on a known dataset until you can complete it without consulting the manual. In the examination, speed at the GDC translates directly into time for AO2 and AO3 writing.
Target Paper 2 sustained contexts: These are the questions where time pressure and multi-stage complexity combine to produce the most lost marks. Practise them under timed conditions, using the margin-note habit described above to prevent propagation errors.

The modelling emphasis that defines IB Mathematics: Applications and Interpretation also defines what makes AI distinctive as an IB mathematics course. It is the mathematics of real-world systems, data-informed decisions, and technology-supported reasoning — and it rewards a specific set of habits that procedural fluency alone does not develop. For most candidates reading this, the highest-leverage change is not learning more mathematical content. It is developing the discipline to apply the modelling cycle consistently: to state what you are modelling before you model it, and to state what your result means after you have found it. That habit, practised across every past paper question between now and the examination, moves scores in ways that content revision alone cannot.



Frequently asked questions

How is IB Math AI Paper 1 different from Paper 2 in terms of what gets rewarded?

Paper 1 is technology-permitted throughout and consists of shorter, more focused questions that test individual stages of the modelling cycle. Paper 2 is also technology-permitted, but its extended-response questions sit within longer sustained contexts, requiring candidates to manage multi-stage modelling where the output of one stage feeds into the next. Paper 2 also places more weight on follow-through marks, which reward logically consistent working even when an earlier numerical error occurs. The key difference for preparation is that Paper 1 rewards GDC fluency and contextual interpretation of individual results, while Paper 2 rewards sustained logical chains and error-management habits.

What does 'interpret' mean specifically in an AI examination context?

In AI, 'interpret' requires candidates to explain what a mathematical result means in the problem's specific context. This goes beyond stating the result — it requires connecting the numerical output to the real-world situation described in the question. For a hypothesis test, this means identifying the null and alternative hypotheses, stating the decision rule, reporting the p-value, and drawing a conclusion that refers back to the context of the original claim. A common error is stating 'the result is significant' without explicitly linking the p-value to the significance level that was stated in the problem.

How does Paper 3 HL work and what makes it different from Papers 1 and 2?

Paper 3 presents two extended problems, each with multiple interconnected sub-questions. The inquiry format means that later parts of a question often invite candidates to propose extensions or generalisations that are not fully specified. Candidates have 60 minutes for 55 marks, about the same time per mark as HL Papers 1 and 2. That time is expected to go on genuine mathematical thinking rather than procedural execution. The most challenging aspect for unprepared candidates is the open-ended final parts, where the rubric rewards justified reasoning even if the mathematical result is incomplete or approximate.

Why do AI candidates sometimes score higher on AO1 but plateau on AO2 and AO3?

AO1 (knowledge and technique) is the most practised component in most preparation programmes. AO2 (reasoning and communication) and AO3 (applications and modelling) require different habits that are not developed through formula revision alone. AO2 demands that candidates justify their choices and explain their reasoning in context, which is a writing habit as much as a mathematical habit. AO3 requires candidates to formulate models and evaluate their validity, which means engaging with the problem at the modelling stage rather than jumping directly to calculation. The plateau occurs because candidates who are strong at AO1 have less incentive to develop these habits, and because AO2 and AO3 feedback is less immediate than AO1 feedback: it is harder to self-assess contextual interpretation than to check whether a calculation was correct.

How should I structure AI revision to address the modelling cycle specifically?

The most effective approach is to add a modelling-cycle review to every past paper question you attempt. After completing a question, identify which modelling-cycle stage each part tested (formulate, construct, solve, interpret) and write one sentence explaining how your answer addressed that stage. This takes two to three minutes per question but builds the contextual awareness that AO2 and AO3 reward. Complement this with explicit command-term practice: select questions that use 'interpret', 'describe', and 'verify', write your response without the mark scheme, then compare. The gap will almost always be in contextual writing, not in mathematical content.
