Competency-based medical education and the McNamara fallacy: Assessing the important or making the assessed important?
T Singh1, N Shah2
1 Center for Health Professions Education, Adesh University, Bathinda, Punjab, India
2 Department of Psychiatry, Smt. NHL Municipal Medical College and SVPIMSR, Ahmedabad, Gujarat, India
Correspondence Address: Source of Support: None, Conflict of Interest: None DOI: 10.4103/jpgm.jpgm_337_22
Keywords: Assessment, competency, McNamara fallacy, narrative
Robert McNamara studied at Harvard Business School and then worked for the Ford Motor Company, helping it recover from losses and reach great heights. He achieved this through meticulous data analysis. He and his team members were regarded as the “whiz kids” who could turn around any situation with suggestions based on data analysis. Given his track record, he was appointed the US Secretary of Defense during the Vietnam War.
McNamara applied the same mathematical models to war analysis. For example, he considered the “dead body counts” of the enemy side versus his own side, and the area under occupation, as signs of winning the war, while turning a blind eye to the actual dynamics of the war, such as the spirit of the people fighting it, their resistance and guerrilla warfare, the feelings of the rural Vietnamese people, and the local forest conditions. As a result, for a long time he kept conveying to the USA that it was winning the war, although the ground reality in Vietnam was entirely different. Many years have passed, but this thinking continues to haunt us.
Several years later, Daniel Yankelovich, a sociologist, coined the phrase “The McNamara Fallacy,” which is also known as the quantitative fallacy: relying heavily on metrics and numbers to draw conclusions while ignoring the ground-level realities. He described four stages of belief: measuring all that is easily measurable, disregarding what is not easily measurable, considering that as unimportant, and finally treating it as non-existent! The problem is magnified whenever we deal with issues that are non-linear, complex, and unpredictable. Like war, medicine and medical education are classic examples of non-linearity, complexity, and unpredictability. Medical education in the psychometric era appears to be highly influenced by McNamara's thinking. [Table 1] depicts how, for various decisions, we depend on numbers while ignoring important aspects. Even though we have seen the fallibility of such a practice, the tendency to believe in numbers seems to be the norm.
Competency-based medical education (CBME): A new era
CBME was introduced in India in 2019 with the goal of producing an Indian Medical Graduate (IMG) who is competent to fulfil the health needs of society. Five roles of the IMG have been defined: Clinician, Communicator, Leader and member of the health care team, Professional, and Lifelong learner. The global and subject-level competencies have been listed, and the curriculum provides specific learning opportunities for the various roles, for example, the learner-doctor program in clinical postings to function as a member of the health care team, self-directed learning to impart skills for lifelong learning, and the Attitude, Ethics, and Communication module that spans the curriculum.
However, defining roles and competencies and providing learning opportunities will not, on their own, lead us to the goal if our assessment is not “fit for purpose.” Assessment in CBME is concerned not just with the product but also with the process; not just with proving but also with improving. The major obstacle to this is going to be our overdependence on numbers – making only the measurable important rather than measuring the important. The obsession with timetables, teaching hours and proportions of integration, checklists, and marks illustrates this well.
Although psychometric rigor is presumed to have advantages, such as ensuring fairness, standardization, and comparability, it is interesting to note that of the five envisaged roles of the IMG, at least four (leader, lifelong learner, professional, and communicator) are not amenable to measurement by numbers. If we continue assessing the numeric way, we will end up assessing only a part of the student's learning, and that too inaccurately. Just as McNamara, while counting the dead bodies, continued to dwell in the illusion that his side was winning the war, we too may assume that, by the yardstick of numbers, we are producing competent IMGs, although the reality may manifest itself only many years down the line.
The consequences of being obsessed with numbers: A deep trap
Let us look at what exactly happens when the entire focus of assessment is on generating marks: we undermine the richness of information and reduce it to plain hollow numbers. Schuwirth and van der Vleuten have explained well how we transition from rich information about students' performance to only dichotomous pass/fail decisions.
In addition, the focus on structure, standardization, and uniformity hinders the assessment of competency in the real sense. Competency is a “habitual and judicious use of knowledge, skills, attitudes and communication for the welfare of a patient.” This means that competency assessment is more than stacking marks in individual components. There is a vast difference, for example, between inserting a nasogastric tube in a mannequin (skill) and doing so in a struggling child brought in by anxious parents after ingesting a poison (competency). Although both skill and competency literally mean the “ability” to do something, educationally, competency is contextual and contributes to positive patient outcomes. It includes a component of using one's values and responsiveness. Thus, competency assessment must focus, in addition to technical details, on the context and outcome, both of which vary in different situations. It must be much more than a checklist-based or a structured exercise. The value of checklists, however, can be improved by providing narrative comments. This is especially true for skills/competencies that are not objectively measurable.
The focus on numbers alone gives the wrong message to the students as well – misdirecting their efforts from actual learning to collecting marks – promoting test-wiseness and performance orientation. There also remains a risk of cheating or manipulation. Sometimes, the numbers are intentionally manipulated (e.g., using Item Response Theory framework and/or removing items to improve reliability). Unfortunately, manipulating the numbers does not change the reality.
That numbers could be manipulated is one problem, but believing that numbers, if accurate, could give us a complete picture is a bigger one. If it were so, there would be no need for any “Discussion Section” in research papers, and everyone would be able to comprehend and conclude exactly what the authors wanted to convey!
Even then, the commitment to numbers is so deeply ingrained in our psyche that when it becomes apparent that numbers capture only certain aspects of the desired qualities, instead of exploring other ways of assessing, these other aspects are somehow measured and forcibly fitted into numbers. This is called “objectification”: a process by which abstract concepts are treated as if they were physical things. This makes matters worse, as it only gives rise to a higher level of test-wiseness, with students focusing only on scoring well and moving farther away from the real attainment of competencies.
This exemplifies Goodhart's law, which states that when a measure becomes a target, it ceases to be a real measure. Originally, the purpose of measurement is to find out something worthwhile, but the fact that it is going to be measured gives rise to distortions/corruptions to ensure that the measurement “looks good” or “criteria are somehow met.” This behavior defeats the original purpose of measurement. The focus on numbers changes a decision-making issue to a measurement issue. All these aspects of focusing on numbers are depicted in [Figure 1].
How to redeem ourselves from this fallacy in medical education
Let us look at a very common scenario of a girl being taught how to cook by her mother. After the first few sessions, the mother stands with the daughter, observing her as she cooks. She does not just leave the kitchen saying “4 out of 10”; rather, she tastes the food and tells her to add a bit more salt, reduce the flame, boil it longer, and serve with garnishing. A father does something similar while coaching his daughter to drive a car. We have all learnt and taught many things very well like this, but we fall back on numbers alone when teaching medicine to students.
We know that what is not assessed is not learnt. The need to ensure that the students become competent in performing all five roles cannot be overstated. To believe that we can assess the role of clinician by various numeric assessments would be level one of the fallacy – measuring all that can be measured and believing that it would suffice. The other roles of communicator, leader, lifelong learner, and professional are not amenable to measurement (at least with our current understanding). To disregard these roles because of their unquantifiability, or to consider them unimportant or non-existent, would be falling prey to levels two, three, and four of the fallacy, respectively, which is dangerous. These roles are major contributors to the quality of patient care and are decisive for success or failure in practice. To describe them as being of “core” importance but then not assess them, or to consider them “non-core” as far as assessment goes, is a sure recipe for killing the competencies corresponding to these roles. Some assessment – even if it is by narratives – is better than no assessment, as it conveys the importance of various competencies to the students. Fortunately, the role of such assessments in CBME is increasingly being recognized.
The new curriculum provides many opportunities to move beyond numbers in the form of formative and internal assessment. The grades/scores in these assessments are not to be added to the final scores, and this opens many untrodden paths, allowing a meaningful process of collecting rich information about the students' learning and then guiding them appropriately to facilitate the attainment of competencies. Many medical schools in India are moving in this direction. We came across an example of a case-based blended learning eco-system in an Indian medical school, where the undergraduate students post their reflections on the cases they have seen on a blog, and these are utilized for formative assessment, contributing to internal marks.
Our initial perception of anything is qualitative rather than numerical and examiners first make a qualitative assessment before converting it to marks. Furthermore, when the student converts these marks to meaningful information to understand his or her performance, there is loss of fidelity. It would probably make more sense to convey qualitative information directly in the first place rather than getting into dual conversion.
The choice of the tools should be based on their educational impact rather than psychometric properties alone. With some tools, we may be able to generate and defend numbers (marks) very easily. But they promote rote learning/surface learning. Tools such as mini-Clinical Evaluation Exercise, on the other hand, have great educational impact because of the expert feedback based on direct observation in a real-life setting. A toolbox of assessment methods to cover all the roles of an IMG has recently been published.
Assessment in artificial or standardized settings may be good in the initial phase of learning or to practice skills; however, the real-life practice scenario is never a “standard” one. Hence, it is important to assess performance in a real-life scenario – noisy OPD, uncooperative patient, inconsistent history, confusing laboratory reports, diagnostic dilemma, and so on. How a student handles such situations is a matter of qualitative judgement. Such assessment can be done using the various workplace-based assessment tools. Because assessment using such tools happens in unstandardized workplace-associated situations, a narrative provides better data for feedback and learning. This also helps students become reflective learners themselves, which in turn improves their performance, in contrast to a marksheet or score that only tells whether a student achieved pass marks or not.
The new curriculum mandates the use of a logbook and recommends the use of portfolio. Maintaining such documents and indulging in authentic self-assessments would provide an opportunity for meaningful teacher–student interactions, optimizing students' learning. The assessment of reflections recorded in the students' portfolio would give the teachers deep insight into their learning trajectory and needs.
To redeem ourselves from the McNamara fallacy, we shall have to embrace the qualitative methods of assessment and embed them in the system. There is plenty of evidence to suggest that narratives without grades/marks have a better influence on student learning. Individualized narrative seems to work even better.
[Figure 2] lists the key points we need to keep in mind to avoid becoming victims of the McNamara fallacy.
Challenges and the way forward
Incorporating a qualitative component in assessment in a system that has relied heavily on numbers for many decades is not going to be an easy task. The following are the important points to consider:
An inevitable entrant to any discussion on non-numerical assessment is the debate on subjectivity and objectivity. Although we do not want to enter that debate now, enough has been written on the role and utility of expert subjective judgments and their equivalence to objective measures.
It requires a lot of time and resources. Assessors shall have to be trained in the task. They will need expertise not only in the competency being assessed but also in the observational and recording task intrinsic to the assessor's role. They need to keep in mind the goals of assessment and be familiar with the various tools being used for the purpose. Students shall have to be sensitized too, for which the foundation course can provide a useful opportunity.
Qualitative assessments may involve observing a lot of tasks being performed in the workplace and/or reviewing a lot of data in a portfolio. The use of technology may help with data management, for example, having videotapes of clinical consultations or maintaining e-portfolios instead of physical copies.
We are so accustomed to giving marks that it may be a challenge to shift to grades. The problem with marks is that some examiners may use only the “45–60” part of the scale, whereas others may use “10–70.” This shrinking or expanding of the scale results in “contamination” of marks. Using grades in place of marks is an effective intervention to cut down the problem of numbers. It would make the interpretation of results clearer and allow comparability between different subjects, students, and universities.
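The “contamination” described above can be made concrete with a small sketch. It is an illustration only: the two ranges (45–60 and 10–70) come from the text, while the linear mapping, the pass mark of 50, and the three performance levels are assumptions chosen for the example, not data from any real examination.

```python
# Hypothetical illustration of scale "contamination" between two examiners.
# Assumptions (not from the article): each examiner maps the same judged
# performance linearly onto the part of the scale he or she habitually uses,
# and the pass mark is 50.

PASS_MARK = 50  # assumed pass mark


def to_marks(performance_pct, low, high):
    """Map a judged performance (0-100 %) onto the examiner's used range."""
    return low + performance_pct * (high - low) / 100


# Ranges quoted in the text; the labels are made up for this sketch.
examiners = {"narrow": (45, 60), "wide": (10, 70)}

# The same three hypothetical students are seen by both examiners.
performances = [20, 50, 80]

results = {}
outcomes = {}
for name, (low, high) in examiners.items():
    marks = [to_marks(p, low, high) for p in performances]
    results[name] = marks
    outcomes[name] = ["pass" if m >= PASS_MARK else "fail" for m in marks]

for name in examiners:
    print(name, results[name], outcomes[name])
```

Under these assumptions, the middle student receives 52.5 from the “narrow” examiner and 40 from the “wide” one, so an identical performance passes with one examiner and fails with the other. The sketch shows only the problem with raw marks; it does not model how grades would be assigned.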
It is important to develop a vocabulary to describe advancements in competence. For example, the Uniformed Services University has a “RIME” framework describing various levels as Reporter, Interpreter, Manager, and Educator, with respect to the clinical competence of the students. The use of rubrics may also help in this regard. Virk et al. have described the characteristics of an effective rubric and how to construct rubrics to enhance the rigor and acceptability of subjective assessments. Documentation of qualitative assessments must include the rules, the evidence, the thought process followed, and the reasons for the decision. This is especially important when qualitative assessments are used to make high-stakes decisions.
Qualitative assessments can capture certain nuances that even the most psychometrically sophisticated and mathematically robust assessments cannot. However, it is only “numbers,” despite all their flaws, and “standardization,” despite its significantly higher cost, that have so far had an unconditional appeal to all the stakeholders. This perspective needs to change. Narrative comments alone have been shown to be an extremely reliable means of assessing residents in a competency-based internal medicine program. In fact, rich and meaningful comments could reveal weaknesses not otherwise picked up by various “scores,” thereby helping to overcome the well-described phenomenon of “failure to fail” and giving learners more guidance on how to improve.
To be more accepting of qualitative methods, Valentine et al. suggested that the lens with which the assessment is viewed as appropriate or not should change from objectivity to fairness. Additionally, to add rigor to qualitative assessments, we shall have to apply the criteria of qualitative research methods. It is possible to achieve this by practical steps such as clarifying the intent of assessment and its justification, considering the reflexivity of assessors, appropriate sampling strategy, ensuring richness of raw data and synthesis, using multiple and different settings and observers, use of clear methodological framework, and demonstrating the impact on learner, assessor, and system.
Robert McNamara died an old man in 2009 and had an opportunity to reflect on his long life. His obituary in The Economist records:
“He was haunted by the thought that amid all the objective-setting and evaluating, the careful counting and the cost-benefit analysis, stood the ordinary human beings. They simply behaved unpredictably.”
We are at a crucial moment in history. We have already taken the plunge into a competency-based model without making a corresponding change in assessment. We need a paradigm shift in our perspective on assessment, to steer clear of number dependency and to evaluate the dynamics of the students' learning and attainment of competencies qualitatively. We need a system in which such assessment is documented, defended, and considered not only fair, transparent, and credible but also highly valuable. This would be critical to the success of CBME. At the same time, this should not be seen as a plea to discard numbers; they have their place. This is only an attempt to emphasize the value of narratives in assessment and to encourage their use, especially in formative and internal assessment.
It is worthwhile remembering that “not everything that can be counted counts, and not everything that counts can be counted.”
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.