Performance Measurement
In a recent New England Journal of Medicine Perspective article, MacLean and associates reviewed the state of performance measurement in medicine.[1] They began by noting that CMS hopes to have 90% of fee-for-service payments linked in some way to assessments of physician quality by the end of the year, yet there are more than 2,500 performance measures listed in the National Quality Measures Clearinghouse. Not surprisingly, a recent survey showed that 63% of U.S. physicians did not believe the current measures capture physician quality, even as practices spend an average of $40,000 per physician yearly to comply with payor mandates to report quality data.

The American College of Physicians Performance Measurement Committee has published a method for “evaluating the benefits and harms of medical intervention,” which the authors applied to some of the 271 measures included in Medicare’s Merit-based Incentive Payment System/Quality Payment Program. Their focus was on the 86 measures considered relevant, or potentially relevant, to ambulatory general internal medicine. Of these, 32 (37%) were rated as valid, 30 as not valid, and 24 as of uncertain validity. When they compared measures by source, they found that 58% of measures developed by the National Committee for Quality Assurance (NCQA) were deemed valid, versus 48% of those endorsed by the National Quality Forum (NQF) and 27% of non-endorsed measures. Looking at the measures judged not valid, the authors found the following:
“Notably, among the 30 measures rated as not valid, 19 were judged to have insufficient evidence to support them…Another characteristic of measures rated as not valid by our method was inadequately specified exclusions, resulting in a requirement that a process or outcome occur across broad groups of patients, including patients who might not benefit…We also identified measures that were directed at important, evidence-based quality concepts but had poor specifications that might misclassify high-quality care as low-quality care.”

The authors ended by examining the different processes used and the composition of the various committees, to understand why different conclusions were reached. Their summary: “We believe the next generation of performance measurements should not be limited by the use of easy-to-obtain (e.g., administrative) data or function as a stand-alone, retrospective exercise. Instead, it should be fully integrated into care delivery, where it would effectively and efficiently address the most pressing performance gaps and direct quality improvement. For now, we need a time out during which to assess and revise our approach to physician performance.”

Lest this appear an “inside baseball” argument with little practical importance, I want to consider a current issue we are struggling with at my hospital. Like all hospitals, we have been responding to the mandate to focus on the processes used to care for patients with sepsis, to ensure early identification, rapid assessment, and prompt, aggressive care, all of which are associated with reduced mortality. We have been using electronic prompts and monitoring compliance with the “sepsis bundle” in those patients.
We have found the alerts to be sensitive but not particularly specific, and generally more useful in the Emergency Room, where they help identify patients who need to jump to the head of the queue, than in the Intensive Care Unit, where there are many confounders amid ongoing care. Data show increased adherence to the “sepsis bundle,” which presumably reflects better care.

At the same time, we are having problems, again like most hospitals, with antibiotic stewardship: minimizing the use of antibiotics to reduce the spread of drug-resistant organisms. Data show we have more “antibiotic days” in our patients than comparator hospitals. So far, despite efforts to call attention to the problem, we have not seen much reduction in antibiotic usage, which presumably reflects less good care.

The challenge is that both goals are worthy and clinically important, and both are financially important, albeit in different ways. To the frontline clinician, though, the balance boils down to a wager argument: “If the patient has sepsis and I don’t treat it, she will likely die. If she doesn’t have sepsis and I do treat her, she will likely survive.” Given this balance, the cautious clinician will err on the side of giving antibiotics. The harm from delay is immediate, obvious, and measurable. The harm from overuse is delayed, obscure, and hard to measure.

My suggestion has been to measure and track two additional indicators. First, how often do clinicians correctly decide that the sepsis alert is wrong and the patient does not need antibiotics? If we want to encourage correct decision making, we must measure both options, not just the one CMS has focused on. And we must track them together. Second, given the wager argument, we must accept some “over-treatment,” but we need to make internal decisions about how much is acceptable. I am not so worried about the first dose of antibiotics as I am about the last dose.
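Both proposed indicators could, in principle, be computed from patient-level data. Here is a minimal sketch of the idea; the record fields (`alert`, `overrode`, `infection`, `abx_start`, `abx_stop`) are illustrative assumptions, not the schema of any real EHR or pharmacy system.

```python
from datetime import date

# Hypothetical patient-level records (field names are assumptions for
# illustration only).
patients = [
    # Alert fired, clinician correctly overrode it; only a first dose given.
    {"alert": True, "overrode": True, "infection": False,
     "abx_start": date(2018, 5, 1), "abx_stop": date(2018, 5, 1)},
    # Alert fired, true sepsis: antibiotics justified by a confirmed infection.
    {"alert": True, "overrode": False, "infection": True,
     "abx_start": date(2018, 5, 1), "abx_stop": date(2018, 5, 8)},
    # Alert fired, antibiotics continued despite no confirmed infection.
    {"alert": True, "overrode": False, "infection": False,
     "abx_start": date(2018, 5, 2), "abx_stop": date(2018, 5, 7)},
]

def alert_override_rate(records):
    """Indicator 1: fraction of sepsis alerts the clinician judged to be
    false positives (patient did not need antibiotics)."""
    alerts = [r for r in records if r["alert"]]
    return sum(r["overrode"] for r in alerts) / len(alerts)

def days_on_abx_without_diagnosis(records):
    """Indicator 2: per-patient days on antibiotics with no confirmed
    infection diagnosis, among patients kept on antibiotics after an alert."""
    return [
        (r["abx_stop"] - r["abx_start"]).days
        for r in records
        if r["alert"] and not r["overrode"] and not r["infection"]
    ]
```

The point of the sketch is the data requirement: the first indicator needs only the alert log and the override decision, but the second requires joining pharmacy start/stop dates to individual diagnoses, which is exactly why it is harder to fund than a simple “antibiotic days” count pulled from the pharmacy database.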
In the heat of initial presentation, I expect all clinicians will err on the side of overtreatment and start antibiotics. But in most cases the issue has clarified itself within a couple of days. So maybe the better indicator is the length of time a given patient remains on antibiotics without a diagnosis of infection, rather than “antibiotic days.” The latter is easier to track, given that it comes out of the pharmacy database; the former is more difficult, because you need to analyze individual patient-level data.

The problem is that there are immediate and direct financial penalties for the hospital for failing to meet goals for the sepsis bundle, but there are no direct penalties (or rewards) for ruling out sepsis or curtailing antibiotic administration, so there are problems with allocating funds to monitor the latter two indicators. Of course, professional organizations could continue to examine the “alert” thresholds to reduce the number of false positives, and to reexamine the components of the bundle. We also need to consider how to decide whether a given hospital simply treats more patients with sepsis: how do we determine, at the hospital level, whose patients are sicker? Current efforts mostly serve to identify those organizations that spend more time on coding processes. Here again, the reward is for using the system to advantage, not for tracking meaningful differences and outcomes.

But CMS has decided, as a matter of policy, to pay for “quality” over “volume.” The fact that they can’t really do it is irrelevant; the issue is controlling dollars in the Medicare budget. My concern is that commingling payment with quality will ultimately hurt our ability to improve our care processes in ways that benefit patients, because we will spend all our efforts “meeting the numbers.” If CMS is serious about improving quality, it needs to resist the urge to squeeze one side of the balloon without considering negative effects elsewhere.

14 May 2018

[1] MacLean CH, Kerr EA, Qaseem A. Time Out—Charting a Path for Improving Performance Measurement. N Engl J Med 2018;378(19):1757-1761. doi:10.1056/NEJMp1802595.
Further Reading
Another Look at the Value Proposition: A review of published data shows pay-for-performance programs have not impacted either cost of care or health outcomes.

Confronting The Quality Paradox - Part 1

Measuring Teamwork: Measuring teamwork is difficult, but important if healthcare systems are to invest in their development. This article reviews the literature and provides suggestions for action now.

Quality Metrics

Variation in Health Care: Is variation in health care good, bad, or inevitable? The answer may determine future medical practice.