How Accurate Are the Step 2 CK NBMEs?

by Alec Palmerton, MD in Plan, Step 2 CK/Shelf

If you’re studying for Step 2 CK, you may wonder how accurate the NBMEs are. In other words, how close will your final score be to your predicted score? How much of a range of possibilities might there be for your final Step 2 CK score? And what can the predictive ability of Step 2 CK NBMEs tell us about how to maximize our Step 2 score?

It’s even more important to know the accuracy of the Step 2 CK NBMEs, given that Step 1 is pass-fail. In this article, you will learn:

  • How many Step 2 CK NBMEs there are,
  • How historically accurate Step 2 CK NBMEs are relative to the real test – based on actual data from the NBME itself,
  • The accuracy of Step 1 vs. Step 2 CK NBMEs,
  • Reasons why Step 2 CK performance may be more varied,
  • Implications for how you can maximize your Step 2 CK score, and
  • Much more

How Many Step 2 CK NBMEs Are There?

Currently, there are four NBME practice exams for Step 2 CK. These are:

  • Comprehensive Clinical Science Self-Assessments (CCSSA) Forms 9, 10, 11, and 12

Forms 9, 10, and 11 were introduced on July 28, 2021, and replaced Forms 6-8. Form 12 arrived in June 2022.

Step 2 CK Performance Is More Variable Than Step 1

For years, it’s been clear that NBME scores for Step 2 CK vary much more than those for Step 1. Combined with the fact that there have usually been far fewer practice tests for Step 2 CK than for Step 1, few exams cause more test anxiety than the Step 2 CK NBMEs.

The NBME has tightened up reporting on how accurate the predicted scores are (read: they stopped telling us). However, looking at the spread in prior years between expected performance and final score, we see that Step 2 CK NBMEs have (traditionally) much less predictive value. (Note: these data were available for CCSSA Forms 6-8; current CCSSAs do NOT provide this data).

Actual Step 1 and Step 2 CK Scores Minus Score Predicted by NBME Within 1 Week of Exam

Step 2 CK NBMEs Have Traditionally Underpredicted Final Scores (Often By a Lot)

Viewed side-by-side, Step 2 CK NBMEs likely underpredict your final score. (And probably by a lot). Here are some interesting statistics, according to the NBME’s data:

The likelihood that the final score is equal to or higher than the score predicted by the last NBME taken within one week of the exam:

  • Step 1: 68%
  • Step 2 CK: 77%

The likelihood that the final score is ≥ 10 points higher than the score predicted by the NBME taken within one week of the exam:

  • Step 1: 31%
  • Step 2 CK: 50%

The likelihood that the final score is ≥ 20 points higher than the score predicted by the NBME taken within one week of the exam:

  • Step 1: 7%
  • Step 2 CK: 23%

Those statistics are pretty remarkable, particularly for Step 2 CK. They imply that roughly 1 in 4 students will score at least 20 points higher on their actual test than predicted by the Step 2 CK NBME. (By comparison, far fewer Step 1 students will accomplish this feat).

But Step 2 CK Scores Can Still Be Lower

We can do the same analysis for the likelihood that your score will be lower than predicted.

The likelihood that the final score is at least 1 point lower than the score predicted by the last NBME taken within one week of the exam:

  • Step 1: 32%
  • Step 2 CK: 23%

The likelihood that the final score is at least 11 points lower than the score predicted by the last NBME taken within one week of the exam:

  • Step 1: 9%
  • Step 2 CK: 8%

The likelihood that the final score is at least 21 points lower than the score predicted by the last NBME taken within one week of the exam:

  • Step 1: 3%
  • Step 2 CK: 3%

Here, we can see a < 1/3 chance your NBME will overpredict your Step 1 score. (It’s even lower for Step 2 CK).
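The cumulative figures above can also be broken into non-overlapping bands, which makes the difference between the two exams easier to see. The sketch below is only arithmetic on the percentages quoted in this article. It assumes the thresholds are whole score points and that "lower" means the actual score fell at least that many points below the prediction. The names `AT_LEAST`, `AT_MOST`, and `band_percentages` are made up for illustration.

```python
# Cumulative percentages quoted in the article (difference = actual - predicted).
# Keys are thresholds in points; values are (Step 1 %, Step 2 CK %).
AT_LEAST = {0: (68, 77), 10: (31, 50), 20: (7, 23)}   # actual is k or more points higher
AT_MOST = {-1: (32, 23), -11: (9, 8), -21: (3, 3)}    # actual is |k| or more points lower


def band_percentages(exam_index):
    """Convert the cumulative figures into non-overlapping bands for one exam.

    exam_index 0 selects Step 1, exam_index 1 selects Step 2 CK.
    """
    up = {k: v[exam_index] for k, v in AT_LEAST.items()}
    down = {k: v[exam_index] for k, v in AT_MOST.items()}
    return {
        "21+ lower": down[-21],
        "11-20 lower": down[-11] - down[-21],
        "1-10 lower": down[-1] - down[-11],
        "0-9 higher": up[0] - up[10],
        "10-19 higher": up[10] - up[20],
        "20+ higher": up[20],
    }


step1 = band_percentages(0)
step2ck = band_percentages(1)

for label in step1:
    print(f"{label:>13}: Step 1 {step1[label]:>2}% | Step 2 CK {step2ck[label]:>2}%")
```

Under these assumptions, the bands add up to 100% for both exams. About half of Step 2 CK test-takers (27% + 23%) land 10 or more points above their prediction, compared with 31% (24% + 7%) for Step 1.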

What About the Predictive Value of the Newest Step 2 CK NBMEs (CCSSAs)?

Unfortunately, as noted above, the NBME stopped publishing granular breakdowns of the accuracy of the NBMEs. That said, the NBME still gives us a rough estimate of the accuracy of the NBMEs.

For example, this student had a predicted score of 238 on CCSSA 11.

Like prior NBMEs, the new Step 2 CK NBMEs (CCSSAs) appear to underpredict Step 2 CK scores on average

Note that the NBME report states:

We anticipate that your actual performance on Step 2 CK will fall in the range of 233-261 about two-thirds of the time. This range is based on students who took CCSSA within one week before taking Step 2 CK.

First of all, note that it is an insane range. For example, 233 is roughly the 20th percentile, whereas 261 would be approximately the 83rd percentile. It is bonkers that there could be such a dramatic range on such a high-stakes exam.

Second, note that the Step 2 CK NBMEs likely underpredict one’s score. The lower bound is 233, only five points below the predicted score. However, the upper bound is 261, 23 points higher than estimated.

Given how vital Step 2 CK is now that Step 1 is pass-fail, you may wonder why predicting Step 2 CK performance is so tricky.

Why Is Step 2 CK Performance More Variable?

Why such a significant difference between the predicted and actual Step 2 CK scores? And why are Step 2 CK scores so much more variable than Step 1? Unfortunately, we’ll likely never know with certainty.

However, my opinion is that Step 2 scores are more variable due to the character of the test itself. Step 1 is a much more content-heavy exam. There are lots of smaller – and a few essential – concepts that you must apply in clinical settings. This concept-heavy focus means that there are more questions you may have little chance of getting – and/or ones where you will almost always get it right.

Step 2 CK is different, however. Because so much of it relies on question interpretation – analyzing extended vignettes for clinical clues – there is much more variability in expected performance.

For example, if I’m tired and stop reading carefully, my chances of interpreting a complicated vignette drop dramatically. However, for Step 1, fatigue has a much less detrimental effect on my ability to answer a question about the expected vitals/physiologic parameters in shock. Once I understand it is a shock question, whether I can answer it correctly will depend on my ability to apply the concept P1 – P2 = Flow x Resistance (often written as MAP = CO x TPR in this context), among other things.

Knowing that your Step 2 CK score is likely harder to predict can be frustrating. However, the fact that it is so dependent on question interpretation provides the opportunity to improve in a short period.

What Is Question Interpretation (QI)?

What is question interpretation (QI)? QI is analyzing the meaning behind every sentence in a vignette – knowing what it means and why the question writer put it there.

QI is easiest to understand when you consider what it is not. QI is NOT scanning for buzzwords or reading the last sentence/question first. Instead, it is looking at the 56-year-old man with crushing chest pain and recognizing that his smoking is likely a coronary artery disease risk factor. QI would also mean knowing that smoking in a different context – a 36-year-old woman on OCPs with sudden onset chest pain – likely points to a risk factor for DVT and pulmonary embolism.

To read more about question interpretation, read this article.

What Can We Do to Maximize Our Step 2 CK Scores?

If we accept that Step 2 CK depends much more on question interpretation, there are several important things we can do to maximize our Step 2 CK performance.

Improve Question Interpretation

The most obvious way to improve our Step 2 CK scores is to improve our ability to interpret questions. The more we understand what every sentence means – and how it fits in the broader clinical context – the easier it is to answer any question.

Question interpretation is essentially a skill. So rather than cramming many more facts – most of which you may not see on your test – QI can help on most questions for your test.

As we’ve discussed before, here is a student in Never Forget who just took his Step 2 CK. Note that these practice exams were all taken within a week of each other.

Form 10 (7/16): 223 (65%)
Form 9 (7/19): 220 (63%)
Form 11 (7/23): 242 (73%)
Step 2 CK Score (7/28): 245

Note that within 12 days, he improved by 22 points (and by 19 points between his first and last NBMEs).
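For anyone who wants to run the same check on their own practice scores, here is a minimal sketch of the arithmetic behind those numbers. The dates and scores are the ones listed above. The year is a placeholder, since the source gives only month and day, and it does not affect day counts within July. The variable names are made up for illustration.

```python
from datetime import date

YEAR = 2021  # hypothetical placeholder; the article gives only month/day

# (exam, date taken, three-digit score) as listed in the article
timeline = [
    ("Form 10", date(YEAR, 7, 16), 223),
    ("Form 9", date(YEAR, 7, 19), 220),
    ("Form 11", date(YEAR, 7, 23), 242),
    ("Step 2 CK", date(YEAR, 7, 28), 245),
]

first_nbme, last_nbme, real_exam = timeline[0], timeline[2], timeline[3]

nbme_span_days = (last_nbme[1] - first_nbme[1]).days   # days between first and last NBME
days_to_exam = (real_exam[1] - first_nbme[1]).days     # days from first NBME to the real test
nbme_gain = last_nbme[2] - first_nbme[2]               # change across the practice exams
total_gain = real_exam[2] - first_nbme[2]              # change from first NBME to final score

print(f"NBMEs span {nbme_span_days} days; gain across NBMEs: {nbme_gain} points")
print(f"First NBME to test day: {days_to_exam} days; total gain: {total_gain} points")
```

To use it with your own results, swap in your dates and scores in `timeline` and keep the entries in chronological order, with the real exam last.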

Rest to Improve Focus

What goes along with question interpretation for Step 2 CK performance is rest. Fatigue affects test performance generally. However, being tired has a much more significant effect on exams where active thinking and concentration are needed.

Being tired on Step 1 may hurt you some. However, given that so much more of it is concept mastery, most people find their scores fluctuate less when tired.

For Step 2 CK? Students have reported wild swings (sometimes as much as 30-50%) based on their energy level/fatigue. Rest accordingly.

Stay Calm

Like fatigue, anxiety can have a massive effect on exams, none more so than Step 2 CK. Because so much of the test requires concentration, the calmer you can be, the better.

See this article for more on how to address – and overcome – test-taking anxiety.

Concluding Thoughts

Many of us have heard that Step 2 CK NBMEs underpredict the final score. On the other hand, some people believe that the UWorld Self-Assessments (UWSAs) – particularly UWSA 2 – are more accurate.

The unfortunate truth is that neither the NBME nor UWorld publishes much data regarding the accuracy of their exams. However, available data suggest that the actual performance on your real exam may be higher – and potentially significantly higher – than predicted. This variability in the final score is even more significant if you’ve only taken a single practice test.

Can you bank on scoring higher on the actual exam than your CCSSA? And what should you do to try and maximize your performance?

Ultimately, there’s no guarantee on what your final Step 2 CK score will be. This lack of predictability may be terrifying – particularly given the raised stakes of Step 2 CK.

That said, the fact that Step 2 CK practice tests are such poor predictors of the final score offers a glimmer of hope. Specifically, it suggests that there are things you can do – including improving your question interpretation, resting, and staying calm – to maximize your Step 2 CK performance.

What do you think? Are the Step 2 CK NBMEs good predictors of the final score? How do you feel about taking your test? Let us know in the comments!

  1. EZ says:

    Hey Alec,

    First of all, thanks for all of your content – I’ve been following your blog since I was an M1 back in 2019, and it’s very good quality – probably one of the best USMLE blogs on the Internet.

    At the risk of sounding super neurotic, in your experience, what do your students with practice exam scores similar to mine usually score? NBME 9: 250, NBME 10: 261, NBME 11: 259, NBME 12: 245 (last exam), Free 120: 75% (individual blocks were 75%, 70%, 80%, which I attribute to getting freaked out after checking my answers after 1st block, then rallying on the 3rd block).

    Thanks a lot!

    1. Yousmle says:

      Hi EZ,

      Thanks for your kind words! Obviously there is no way to predict with certainty one’s score although the NBMEs are reasonable means to get a general estimate – I assume you are feeling more nervous since NBME 12 was significantly below your prior NBMEs? Do you know how many questions you missed on NBME 12? FWIW, if you can tell me, I can give you an estimate of how much each question was worth on the test, to see if the test is weighted differently than the others.

      Dr. P

      1. EZ says:

        Don’t have the exact numbers, but probably around 50 questions for form 12 (75% correct).

        1. Yousmle says:

          If you click to review your incorrect questions, you can just count the number of wrong answers manually by clicking through each question. Once you have it, we should be able to tell you how much each question is worth.


