{"id":6690,"date":"2011-03-02T17:22:03","date_gmt":"2011-03-02T22:22:03","guid":{"rendered":"http:\/\/blogs.nejm.org\/cardioexchange\/?p=6690"},"modified":"2011-07-19T17:44:19","modified_gmt":"2011-07-19T21:44:19","slug":"a-dose-of-reality-the-challenges-of-comparing-effectiveness","status":"publish","type":"post","link":"https:\/\/blogs.nejm.org\/cardioexchange\/2011\/03\/02\/a-dose-of-reality-the-challenges-of-comparing-effectiveness\/","title":{"rendered":"A DOSE of Reality: The Challenges of Comparing Effectiveness"},"content":{"rendered":"<p>An ideal paper for your next journal club \u2014 <a href=\"http:\/\/www.nejm.org\/doi\/full\/10.1056\/NEJMoa1005419?query=featured_home\">\u201cDiuretic Strategies in Patients with Acute Decompensated Heart Failure\u201d<\/a> \u2014 was just published in <em>NEJM<\/em>, by the NHLBI Heart Failure Clinical Research Network. \u00a0In this study (called DOSE), patients hospitalized with heart failure were randomized to receive different diuretic regimens based on dose and mode of administration. The authors concluded that \u201cthere were no significant differences in patients\u2019 global assessment of symptoms or in the change in renal function when diuretic therapy was administered by bolus as compared with continuous infusion or at a high dose as compared with a low dose.\u201d <a href=\"http:\/\/www.nejm.org\/doi\/full\/10.1056\/NEJMe1014162\">The editorialist<\/a> stated, \u201cSince a high-dose regimen may relieve dyspnea more quickly without adverse effects on renal function, that regimen is preferable to a low-dose regimen.\u201d<\/p>\n<p>There are at least five issues here that might be worthy of your attention.<\/p>\n<p><strong>1. The topic.<\/strong> Diuretics were introduced in the 20th century to treat heart failure, replacing the rigid Southey\u2019s tubes that were inserted subcutaneously to drain fluid. Thiazide diuretics were introduced in 1958, and furosemide, the first loop diuretic, was approved in 1966. 
This agent is now standard therapy for patients with heart failure. What is remarkable \u2014 and worth some reflection \u2014 is that despite using loop diuretics for 45 years now, we still do not have essential evidence to support decisions about dosing and mode of administration. This study responds to the clamor for comparative effectiveness trials \u2014 and should give us pause about what else we are doing in clinical practice with little evidence to guide us.<\/p>\n<p><strong>2. The primary outcomes.<\/strong> The primary efficacy endpoint (there was also a safety endpoint) was the patient\u2019s global assessment of symptoms, which was quantified as the area under the curve (AUC) of serial assessments from baseline to 72 hours. The authors describe the assessment as follows:<\/p>\n<blockquote><p>Patients were asked to self assess both their general well being (PGA) and their level of dyspnea using a visual analog scale (VAS) method. For PGA, patients marked their global well being on a 10 cm vertical line, with the top labeled \u201cbest you have ever felt\u201d and the bottom labeled \u201cworst you have ever felt.\u201d For dyspnea, the labels were \u201cI am not breathless at all\u201d and \u201cI am as breathless I have ever been.\u201d The VAS was scored from 0 to 100 by measuring the distance in millimeters from the bottom of the line. The patient was unaware of the numerical value of their response.<\/p><\/blockquote>\n<p>Kudos to the research team for caring about patient-reported outcomes and attempting to translate those outcomes into a useful metric. However, interpreting the results is a challenge. For the comparison of bolus versus continuous infusion, the mean AUCs were 4236 and 4373, respectively (<em>P<\/em>=0.47). For the comparison of high- versus low-dose therapy, the values were 4430 and 4171, respectively (<em>P<\/em>=0.06). 
The authors considered a 600-point difference in AUC to be clinically important based on prior studies and thus concluded that there were no significant between-group differences in the primary efficacy endpoint, either statistically or clinically. Although these conclusions seem appropriate, it\u2019s difficult to know what a 600-point difference really means in terms of the patient\u2019s experience. A few examples from the authors would have been helpful.<\/p>\n<p><strong>3. The power calculation.<\/strong> With only 308 patients, this study was small by the standards of most RCTs measuring patient outcomes in heart failure. The sample size calculation was based on 88% power to detect a 600-point difference in the AUC of global assessment scores \u2014 and on 88% power to detect a difference of 0.2 mg\/dL in the change in creatinine level between groups (the primary safety endpoint). I have no quibble with these calculations (though I do wonder why they picked 88%), but the small number of patients makes it difficult to do much with exploratory analyses by subgroup or of different outcomes. The investigators adjusted the significance level for the primary outcomes, stating that the threshold would be a <em>P<\/em> value of &lt;0.025. In doing this, they treated each trial within the 2&#215;2 factorial design as a separate study with two endpoints.<\/p>\n<p>It would have been useful to see the confidence intervals for the between-group differences in the primary endpoints, so that we could see what kind of differences cannot be excluded based on the results. Remember, the study was not designed to show that the groups were similar; it was designed to test whether they were different. The conclusion, appropriately enough, was that there were no significant differences, but your questions now might be: \u201cAre the treatment groups similar? What kind of differences can be excluded?\u201d<\/p>\n<p><strong>4. 
The secondary analysis and adjustment for multiple comparisons. <\/strong>The investigators conducted many secondary analyses and, for these, set a <em>P<\/em> value of 0.05 as the threshold for statistical significance. Most of these endpoints did not differ between the groups, but there were some findings that bear discussion:<\/p>\n<ul>\n<li>In the high- versus low-dose comparison, the difference in the area under the curve at 72 hours for dyspnea met the criteria for statistical significance (4668 vs. 4478, respectively; <em>P<\/em>=0.04) \u2014 a point highlighted by the authors in the discussion. However, with so many comparisons conducted, a <em>P<\/em> value of 0.04 should hardly be considered significant. Furthermore, the difference between groups was &lt;200 points, far below the authors\u2019 predefined threshold for a clinically meaningful result.<\/li>\n<li>Change in body weight favored the continuous-infusion and high-dose groups, as might be expected.<\/li>\n<li>The high-dose group had a significantly higher proportion of patients with creatinine increases of &gt;0.3 mg\/dL than did the low-dose group (23% vs. 14%; <em>P<\/em>=0.04). Again, we should be careful about interpreting the statistical significance of this, but an absolute difference of 9 percentage points for a potent risk factor like worsening renal function is hard to ignore and does raise concerns.<\/li>\n<\/ul>\n<p><strong>5. The recommendation by the editorialist.<\/strong> The editorialist came out with a strong endorsement for the high-dose regimen, arguing that it reduces dyspnea without worsening renal function. My interpretation is a bit different, but I tend to require better evidence to justify using more of a medication. Here is where the journal club should get interesting: What do you think are the implications of this study? Do you agree with the editorialist that it should change practice? 
Was the trial designed to address the question you have about how to use diuretics? If not, what would you have done differently? How should the guidelines incorporate this new information, if at all?<\/p>\n<p>I look forward to your thoughts.<\/p>\n<p><em>For more on the DOSE study, check out Anju Nohria&#8217;s <a href=\"http:\/\/blogs.nejm.org\/cardioexchange\/questioning-the-dose\/\">Voices blog<\/a>.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>An ideal paper for your next journal club \u2014 \u201cDiuretic Strategies in Patients with Acute Decompensated Heart Failure\u201d \u2014 was just published in NEJM, by the NHLBI Heart Failure Clinical Research Network. \u00a0In this study (called DOSE), patients hospitalized with heart failure were randomized to receive different diuretic regimens based on dose and mode of [&hellip;]<\/p>\n","protected":false},"author":211,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1,14],"tags":[674,721,287],"class_list":["post-6690","post","type-post","status-publish","format-standard","hentry","category-general","category-heart-failure","tag-diuretics","tag-furosemide","tag-heart-failure-2"],"_links":{"self":[{"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/posts\/6690","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/users\/211"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/comments?post=6690"}],"version-history":[{"count":0,"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/posts\/6690\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.nejm.org\/cardioexchange\
/wp-json\/wp\/v2\/media?parent=6690"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/categories?post=6690"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.nejm.org\/cardioexchange\/wp-json\/wp\/v2\/tags?post=6690"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}