Patterns and Appropriateness of Thrombophilia Testing in an Academic Medical Center
Thrombophilia is a prothrombotic state, either acquired or inherited, leading to a thrombotic predisposition.1 The most common heritable thrombophilias include factor V Leiden (FVL) and prothrombin G20210A. The most common acquired thrombophilia is the presence of phospholipid antibodies.1 Thrombotic risk varies with thrombophilia type. For example, deficiencies of antithrombin, protein C, and protein S, as well as the presence of phospholipid antibodies, confer higher risk than FVL and prothrombin G20210A.2-5 Other thrombophilias (eg, methylenetetrahydrofolate reductase mutation, increased factor VIII activity) are relatively uncommon, and/or their impact on thrombosis risk appears to be either minimal or unknown.1-6 There is little clinical evidence that testing for thrombophilia impacts subsequent thrombosis prevention.5,7,8 Multiple clinical guidelines and medical societies recommend against the routine and indiscriminate use of thrombophilia testing.8-13 In general, thrombophilia testing should be considered only if the result would lead to changes in anticoagulant initiation, intensity, and/or duration, or might inform interventions to prevent thrombosis in asymptomatic family members.8-13 However, thrombophilia testing rarely changes the acute management of a thrombotic event and may harm patients and their family members because positive results may unnecessarily increase anxiety and negative results may provide false reassurance.6,14-18 The cost-effectiveness of thrombophilia testing is unknown; economic models have sought to quantify it, but the conclusions of these studies are limited.7
The utility of thrombophilia testing in emergency department (ED) and inpatient settings is further limited because patients are often treated and discharged before thrombophilia test results are available. Additionally, in these settings, multiple factors increase the risk of false-positive or false-negative results (eg, acute thrombosis, acute illness, pregnancy, and anticoagulant therapy).19,20 The purpose of this study was to systematically assess thrombophilia testing patterns among ED and hospitalized patients at an academic medical center and to quantify the proportion of tests associated with minimal clinical utility. We hypothesized that the majority of thrombophilia tests completed in the inpatient setting would be associated with minimal clinical utility.
METHODS
Setting and Patients
This study was conducted at University of Utah Health Care (UUHC) University Hospital, a 488-bed academic medical center with a level I trauma center, primary stroke center, and 50-bed ED. Laboratory services for UUHC, including thrombophilia testing, are provided by a national reference laboratory, Associated Regional and University Pathologists Laboratories. This study included patients ≥18 years of age who received thrombophilia testing (Supplementary Table 1) during an ED visit or inpatient admission at University Hospital between July 1, 2014 and December 31, 2014. There were no exclusion criteria. An institutional electronic data repository was used to identify patients matching inclusion criteria. All study activities were reviewed and approved by the UUHC Institutional Review Board with a waiver of informed consent.
Outcomes
An electronic database query was used to identify patients and to collect demographic information and test characteristics. Each patient’s electronic medical record was manually reviewed to collect all other outcomes. The indication for thrombophilia testing was identified by manual review of provider notes. Thrombophilia tests occurring in situations associated with minimal clinical utility were defined as tests meeting at least one of the following criteria: patient discharged before test results were available for review; test type not recommended for thrombophilia testing by published guidelines or by UUHC Thrombosis Service physicians (Supplementary Table 2); test performed in a situation associated with decreased accuracy; test was a duplicate resulting from different thrombophilia panels containing identical tests; and test followed a provoked venous thromboembolism (VTE). Situations associated with decreased accuracy are summarized in Supplementary Table 3 and included at least one of the following at the time of the test: anticoagulant therapy, acute thrombosis, pregnancy or <8 weeks postpartum, and receipt of estrogen-containing medications. Only test types known to be affected by the respective situation were included. Testing following a provoked VTE was defined as testing prompted by an acute thrombosis and performed within 3 months following major surgery (defined administratively as any surgery performed in an operating room), during pregnancy, <8 weeks postpartum, or while on estrogen-containing medications. Thrombophilia testing during anticoagulant therapy was defined as testing within 4 half-lives of anticoagulant administration based on medication administration records. Anticoagulant therapy changes were identified by comparing prior-to-admission and discharge medication lists.
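To make these criteria concrete, the classification can be expressed as a simple rule set. The following Python sketch is illustrative only: the field names (eg, result_date, accuracy_flags) and example test names are hypothetical and do not correspond to the study’s data dictionary, and the study applied these rules by manual chart review rather than programmatically.

from dataclasses import dataclass, field
from datetime import date
from typing import List, Set

@dataclass
class ThrombophiliaTest:
    test_type: str                       # eg, "cardiolipin IgG antibodies"
    result_date: date                    # date the result became available
    discharge_date: date
    accuracy_flags: Set[str] = field(default_factory=set)  # situations known to affect this test type
    is_duplicate: bool = False           # identical test ordered via another panel
    followed_provoked_vte: bool = False

# Hypothetical examples of non-recommended test types (cf. Supplementary Table 2)
NOT_RECOMMENDED = {"beta2-glycoprotein 1 IgA antibodies", "phosphatidylserine IgG antibodies"}

def minimal_utility_reasons(t: ThrombophiliaTest) -> List[str]:
    """Return each criterion the test meets; a non-empty list marks the test
    as occurring in a situation associated with minimal clinical utility."""
    reasons = []
    if t.result_date > t.discharge_date:
        reasons.append("patient discharged before result available")
    if t.test_type in NOT_RECOMMENDED:
        reasons.append("test type not recommended")
    if t.accuracy_flags:
        reasons.append("performed in situation associated with decreased accuracy")
    if t.is_duplicate:
        reasons.append("duplicate test across panels")
    if t.followed_provoked_vte:
        reasons.append("testing following a provoked VTE")
    return reasons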
Data Analysis
Patient and laboratory characteristics were summarized using descriptive statistics, including mean and standard deviation (SD) for continuous variables and proportions for categorical variables. Data analysis was performed using Excel (version 2013; Microsoft Corporation, Redmond, Washington).
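Although the analysis was performed in Excel, the same summary statistics are straightforward to reproduce in script form. The snippet below is a minimal illustration using made-up values, not the study data.

from statistics import mean, stdev

tests_per_patient = [12, 5, 9, 14, 3]    # hypothetical per-patient test counts
print(f"mean ± SD: {mean(tests_per_patient):.1f} ± {stdev(tests_per_patient):.1f}")

sex = ["F", "F", "M", "F", "M"]          # hypothetical categorical variable
print(f"proportion female: {sex.count('F') / len(sex):.0%}")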
RESULTS
During the 6-month study period, 163 patients received at least 1 thrombophilia test during an ED visit or inpatient admission. Patient characteristics are summarized in Table 1. Tested patients were most commonly inpatients (96%) and female (71%). A total of 1451 thrombophilia tests were performed, with a mean (± SD) of 8.9 ± 6.0 tests per patient. Testing characteristics are summarized in Table 2. Of the 39 different test types performed, the most commonly ordered were cardiolipin IgG and IgM antibodies (9% each), lupus anticoagulant (9%), and β2-glycoprotein 1 IgG and IgM antibodies (8% each). When combined with testing for phosphatidyl antibodies, antiphospholipid tests accounted for 70% of all tests. Overall, 134 (9%) test results were positive. The mean time for results to become available was 2.2 ± 2.5 days. The frequency of test types, corresponding positivity rates, and mean times for results to become available are summarized in Supplementary Table 4.
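The reported summary figures are internally consistent, as a quick arithmetic check shows (illustrative code, not part of the study analysis):

total_tests, n_patients, n_positive = 1451, 163, 134
print(round(total_tests / n_patients, 1))     # 8.9 tests per patient
print(round(100 * n_positive / total_tests))  # 9 (% of results positive)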
The indications for thrombophilia testing are summarized in Table 3. Ischemic stroke was the most common indication for testing (50% of tests; 35% of patients), followed by VTE (21% of tests; 21% of patients), and pregnancy-related conditions (eg, preeclampsia, intrauterine fetal demise; 15% of tests; 25% of patients). Overall, 911 tests (63%) occurred in situations associated with minimal clinical utility, with 126 patients (77%) receiving at least one of these tests (Table 4).
Anticoagulant therapy was changed in 43 patients (26%) in the following ways: initiated in 35 patients (21%), transitioned to a different anticoagulant in 6 patients (4%), and discontinued in 2 patients (1%). Of the 35 patients initiating anticoagulant therapy, 29 had documented thrombosis (24 had VTE, 4 had cerebral venous sinus thrombosis [CVST], and 1 had basilar artery thrombosis). Overall, 2 instances were identified in which initiation of anticoagulant therapy at discharge was in response to thrombophilia test results. In the first instance, warfarin without a parenteral anticoagulant bridge was initiated for a 54-year-old patient with a cryptogenic stroke who tested positive for β2-glycoprotein 1 IgG antibodies, lupus anticoagulant, and protein S deficiency. In the second instance, warfarin with an enoxaparin bridge was initiated for a 26-year-old patient with a cryptogenic stroke who tested positive for β2-glycoprotein 1 IgG and IgM antibodies, cardiolipin IgG antibodies, lupus anticoagulant, protein C deficiency, and antithrombin deficiency. Of the 163 patients receiving thrombophilia testing, only 2 patients (1%) had clear documentation of being offered genetic consultation.
DISCUSSION
In this retrospective analysis, 1451 thrombophilia tests were performed in 163 patients over 6 months. Tested patients were relatively young, which is likely explained by the number of patients tested for pregnancy-related conditions and the fact that a stroke or VTE in younger patients more frequently prompted providers to suspect thrombophilia. Nearly three-fourths of patients were female, which is likely due to testing for pregnancy-related conditions and possibly diagnostic suspicion bias given the comparative predilection of antiphospholipid syndrome for women. The patient characteristics in our study are consistent with other studies evaluating thrombophilia testing.21,22
Thrombophilia testing was most frequently prompted by stroke, VTE, and pregnancy-related conditions. Only 26% of patients had acute thrombosis identified during the admission, primarily because of the high proportion of tests for cryptogenic strokes and pregnancy-related conditions. Thrombophilia testing is recommended in patients who have had a stroke when the stroke is considered to be cryptogenic after a standard stroke evaluation.23 Thrombophilia testing in pregnancy-related conditions is controversial but is often considered in situations such as stillbirths with severe placental pathology and/or significant growth restriction, or in mothers with a personal or family history of thrombosis.24 The proportion of testing for pregnancy-related conditions may be greater than at other institutions because UUHC Maternal Fetal Medicine is a referral center for women with conditions associated with hypercoagulability. Anticoagulant therapy was initiated in 21% of patients, but specifically in response to thrombophilia testing in only 2 instances; in most cases, anticoagulant therapy was initiated regardless of thrombophilia test results.
The results of this study confirm our hypothesis: the majority of thrombophilia tests occurred in situations associated with minimal clinical utility. Testing in these situations was not isolated to specific patients or medical services, as 77% of tested patients received at least 1 test associated with minimal clinical utility. Our study took a conservative approach in defining scenarios associated with minimal clinical utility: other situations that can also affect testing accuracy (eg, hepatic disease, nephrotic syndrome) were not included in our analysis of this outcome.
The results of this study highlight opportunities to improve thrombophilia testing practices at our institution and may be generalizable to institutions with similar testing patterns. Because multiple medical services order thrombophilia tests, no single intervention is likely to suffice, and strategies to improve testing practices are still being determined. The results of this study can serve as a baseline for comparison after strategies are implemented. The most common situation associated with minimal clinical utility was the use of test types not generally recommended by guidelines or UUHC Thrombosis Service physicians for thrombophilia testing (eg, β2-glycoprotein 1 IgA antibodies, phosphatidyl antibodies). We intend to require a hematology or thrombosis specialty consult prior to ordering these tests; this intervention alone could potentially decrease unnecessary testing by a third. Another consideration is to require a specialty consult prior to any inpatient thrombophilia testing, a strategy that has been found to decrease inappropriate testing at other institutions.21 We also intend to streamline the available thrombophilia testing panels because a poorly designed panel can lead to ordering of multiple tests associated with minimal clinical utility. At least 12 different thrombophilia panels are currently available in our computerized physician order entry system (see Supplementary Table 5). We hypothesize that current panel designs contribute to providers inadvertently ordering unintended or duplicate tests, and that reducing the number of available panels and clearly delineating the tests contained in each panel is likely to reduce unnecessary testing. Other strategies being considered include using electronic clinical decision support tools, implementing strict ordering criteria for all inpatient testing, and establishing a thrombosis stewardship program.
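A consult-gated ordering rule of the kind described above could be implemented as a simple check in the order entry pathway. The sketch below is purely hypothetical; the restricted test names and the consult flag are illustrative and do not reflect our actual computerized physician order entry logic.

# Hypothetical consult-gated ordering rule (illustrative only)
RESTRICTED_TESTS = {"beta2-glycoprotein 1 IgA antibodies", "phosphatidylserine IgG antibodies"}

def order_allowed(test_type: str, has_specialty_consult: bool) -> bool:
    """Accept unrestricted tests; require an active hematology/thrombosis
    consult before accepting a restricted test order."""
    if test_type in RESTRICTED_TESTS:
        return has_specialty_consult
    return True

assert order_allowed("lupus anticoagulant", has_specialty_consult=False)
assert not order_allowed("beta2-glycoprotein 1 IgA antibodies", has_specialty_consult=False)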
Our study was unique in at least 2 ways. First, previous studies describing thrombophilia testing have described testing patterns for patients with specific indications (eg, VTE), whereas our study described all thrombophilia tests regardless of indication. This allows for testing pattern comparisons across indications and medical services, increasing the generalizability of our results. Second, this study quantifies tests occurring in situations associated with a practical definition of minimal clinical utility.
Our study has several limitations: (1) Many variables relied on provider notes and other documentation, which allows for potential misclassification. (2) It was not always possible to determine the ultimate utility of each test in clinical management decisions, and our study did not investigate the impact of thrombophilia testing on duration of anticoagulant therapy. Additionally, select situations could benefit from testing regardless of whether anticoagulant therapy is altered (eg, informing contraceptive choices). (3) Testing performed following a provoked acute thrombosis was defined as testing within 3 months following administratively defined major surgery. This definition could have included some minor procedures that do not substantially increase VTE risk, resulting in underestimated clinical utility. (4) The UUHC University Hospital serves as a referral hospital for a large geographical area, and investigators did not have access to outpatient records for a large proportion of discharged patients. As a result, the frequency of repeat testing could not be assessed, possibly resulting in overestimated clinical utility. (5) In categorizing indications for testing, testing for CVST was subcategorized under testing for ischemic stroke based on presenting symptoms rather than on underlying pathophysiology. The rationale for this categorization is that patients with CVST were often tested based on presenting symptoms; additionally, tests for CVST were ordered by the neurology service, which also ordered tests for all other ischemic stroke indications. (6) The purpose of our study was to investigate the subset of the hospital’s patient population that received thrombophilia testing, and patients were identified by tests received rather than by diagnosis codes. As a result, we are unable to report the proportion of all patients treated at the hospital for specific conditions who were tested (eg, the proportion of stroke patients who received thrombophilia testing). (7) Current practice guidelines do not recommend testing for phosphatidyl antibodies, even when traditional antiphospholipid testing is negative.25-27 Although expert panels continue to explore associations between phosphatidyl antibodies and pregnancy morbidity and thrombotic events, the low level of evidence is insufficient to guide clinical management.28 Therefore, we categorized all phosphatidyl testing as associated with minimal clinical utility.
CONCLUSIONS
In a large academic medical center, the majority of thrombophilia tests occurred in situations associated with minimal clinical utility. Strategies to improve thrombophilia testing practices are needed to minimize potentially inappropriate testing, provide more cost-effective care, and promote value-driven outcomes.
Disclosure
S.W. received financial support for this submitted work via a Bristol-Myers Squibb grant. G.F. received financial support from Portola Pharmaceuticals for consulting and lectures that were not related to this submitted work.
1. Franco RF, Reitsma PH. Genetic risk factors of venous thrombosis. Hum Genet. 2001;109(4):369-384.
2. Ridker PM, Hennekens CH, Lindpaintner K, Stampfer MJ, Eisenberg PR, Miletich JP. Mutation in the gene coding for coagulation factor V and the risk of myocardial infarction, stroke, and venous thrombosis in apparently healthy men. N Engl J Med. 1995;332(14):912-917.
3. Koster T, Rosendaal FR, de Ronde H, Briët E, Vandenbroucke JP, Bertina RM. Venous thrombosis due to poor anticoagulant response to activated protein C: Leiden Thrombophilia Study. Lancet. 1993;342(8886-8887):1503-1506.
4. Margaglione M, Brancaccio V, Giuliani N, et al. Increased risk for venous thrombosis in carriers of the prothrombin G-->A20210 gene variant. Ann Intern Med. 1998;129(2):89-93.
5. De Stefano V, Martinelli I, Mannucci PM, et al. The risk of recurrent deep venous thrombosis among heterozygous carriers of both factor V Leiden and the G20210A prothrombin mutation. N Engl J Med. 1999;341:801-806.
6. Dickey TL. Can thrombophilia testing help to prevent recurrent VTE? Part 2. JAAPA. 2002;15(12):23-24, 27-29.
7. Simpson EL, Stevenson MD, Rawdin A, Papaioannou D. Thrombophilia testing in people with venous thromboembolism: systematic review and cost-effectiveness analysis. Health Technol Assess. 2009;13(2):iii, ix-x, 1-91.
8. National Institute for Health and Clinical Excellence. Venous thromboembolic disease: the management of venous thromboembolic diseases and the role of thrombophilia testing. NICE clinical guideline 144. https://www.nice.org.uk/guidance/cg144. Accessed June 30, 2017.
9. Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Working Group. Recommendations from the EGAPP Working Group: routine testing for factor V Leiden (R506Q) and prothrombin (20210G>A) mutations in adults with a history of idiopathic venous thromboembolism and their adult family members. Genet Med. 2011;13(1):67-76.
10. Kearon C, Akl EA, Comerota AJ, et al. Antithrombotic therapy for VTE disease: antithrombotic therapy and prevention of thrombosis, 9th ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest. 2012;141(2 Suppl):e419S-494S.
11. Baglin T, Gray E, Greaves M, et al. Clinical guidelines for testing for heritable thrombophilia. Br J Haematol. 2010;149(2):209-220.
12. Hicks LK, Bering H, Carson KR, et al. The ASH Choosing Wisely® campaign: five hematologic tests and treatments to question. Hematology Am Soc Hematol Educ Program. 2013;2013:9-14.
13. Stevens SM, Woller SC, Bauer KA, et al. Guidance for the evaluation and treatment of hereditary and acquired thrombophilia. J Thromb Thrombolysis. 2016;41(1):154-164.
14. Christiansen SC, Cannegieter SC, Koster T, Vandenbroucke JP, Rosendaal FR. Thrombophilia, clinical factors, and recurrent venous thrombotic events. JAMA. 2005;293(19):2352-2361.
15. Prandoni P, Lensing AW, Cogo A, et al. The long-term clinical course of acute deep venous thrombosis. Ann Intern Med. 1996;125(1):1-7.
16. Miles JS, Miletich JP, Goldhaber SZ, Hennekens CH, Ridker PM. G20210A mutation in the prothrombin gene and the risk of recurrent venous thromboembolism. J Am Coll Cardiol. 2001;37(1):215-218.
17. Eichinger S, Weltermann A, Mannhalter C, et al. The risk of recurrent venous thromboembolism in heterozygous carriers of factor V Leiden and a first spontaneous venous thromboembolism. Arch Intern Med. 2002;162(20):2357-2360.
18. Mazzolai L, Duchosal MA. Hereditary thrombophilia and venous thromboembolism: critical evaluation of the clinical implications of screening. Eur J Vasc Endovasc Surg. 2007;34(4):483-488.
19. Merriman L, Greaves M. Testing for thrombophilia: an evidence-based approach. Postgrad Med J. 2006;82(973):699-704.
20. Favaloro EJ, McDonald D, Lippi G. Laboratory investigation of thrombophilia: the good, the bad, and the ugly. Semin Thromb Hemost. 2009;35(7):695-710.
21. Shen YM, Tsai J, Taiwo E, et al. Analysis of thrombophilia test ordering practices at an academic center: a proposal for appropriate testing to reduce harm and cost. PLoS One. 2016;11(5):e0155326.
22. Meyer MR, Witt DM, Delate T, et al. Thrombophilia testing patterns amongst patients with acute venous thromboembolism. Thromb Res. 2015;136(6):1160-1164.
23. Saver JL. Clinical practice: cryptogenic stroke. N Engl J Med. 2016;374(21):2065-2074.
24. ACOG practice bulletin no. 102: management of stillbirth. Obstet Gynecol. 2009;113(3):748-761.
25. Miyakis S, Lockshin MD, Atsumi T, et al. International consensus statement on an update of the classification criteria for definite antiphospholipid syndrome (APS). J Thromb Haemost. 2006;4(2):295-306.
26. Keeling D, Mackie I, Moore GW, Greer IA, Greaves M, British Committee for Standards in Haematology. Guidelines on the investigation and management of antiphospholipid syndrome. Br J Haematol. 2012;157(1):47-58.
27. Committee on Practice Bulletins—Obstetrics, American College of Obstetricians and Gynecologists. Practice bulletin no. 132: antiphospholipid syndrome. Obstet Gynecol. 2012;120(6):1514-1521.
28. Bertolaccini ML, Amengual O, Andreoli L, et al. 14th International Congress on Antiphospholipid Antibodies Task Force. Report on antiphospholipid syndrome laboratory diagnostics and trends. Autoimmun Rev. 2014;13(9):917-930.
Thrombophilia is a prothrombotic state, either acquired or inherited, leading to a thrombotic predisposition.1 The most common heritable thrombophilias include factor V Leiden (FVL) and prothrombin G20210A. The most common acquired thrombophilia is the presence of phospholipid antibodies.1 Thrombotic risk varies with thrombophilia type. For example, deficiencies of antithrombin, protein C and protein S, and the presence of phospholipid antibodies, confer higher risk than FVL and prothrombin G20210A.2-5 Other thrombophilias (eg, methylenetetrahydrofolate reductase mutation, increased factor VIII activity) are relatively uncommon and/or their impact on thrombosis risk appears to be either minimal or unknown.1-6 There is little clinical evidence that testing for thrombophilia impacts subsequent thrombosis prevention.5,7,8 Multiple clinical guidelines and medical societies recommend against the routine and indiscriminate use of thrombophilia testing.8-13 In general, thrombophilia testing should be considered only if the result would lead to changes in anticoagulant initiation, intensity, and/or duration, or might inform interventions to prevent thrombosis in asymptomatic family members.8-13 However, thrombophilia testing rarely changes the acute management of a thrombotic event and may have harmful effects on patients and their family members because positive results may unnecessarily increase anxiety and negative results may provide false reassurance.6,14-18 The cost-effectiveness of thrombophilia testing is unknown. Economic models have sought to quantify cost-effectiveness, but conclusions from these studies are limited.7
The utility of thrombophilia testing in emergency department (ED) and inpatient settings is further limited because patients are often treated and discharged before thrombophilia test results are available. Additionally, in these settings, multiple factors increase the risk of false-positive or false-negative results (eg, acute thrombosis, acute illness, pregnancy, and anticoagulant therapy).19,20 The purpose of this study was to systematically assess thrombophilia testing patterns in the ED and hospitalized patients at an academic medical center and to quantify the proportion of tests associated with minimal clinical utility. We hypothesize that the majority of thrombophilia tests completed in the inpatient setting are associated with minimal clinical utility.
METHODS
Setting and Patients
This study was conducted at University of Utah Health Care (UUHC) University Hospital, a 488-bed academic medical center with a level I trauma center, primary stroke center, and 50-bed ED. Laboratory services for UUHC, including thrombophilia testing, are provided by a national reference laboratory, Associated Regional and University Pathologists Laboratories. This study included patients ≥18 years of age who received thrombophilia testing (Supplementary Table 1) during an ED visit or inpatient admission at University Hospital between July 1, 2014 and December 31, 2014. There were no exclusion criteria. An institutional electronic data repository was used to identify patients matching inclusion criteria. All study activities were reviewed and approved by the UUHC Institutional Review Board with a waiver of informed consent.
Outcomes
An electronic database query was used to identify patients, collect patient demographic information, and collect test characteristics. Each patient’s electronic medical record was manually reviewed to collect all other outcomes. Indication for thrombophilia testing was identified by manual review of provider notes. Thrombophilia tests occurring in situations associated with minimal clinical utility were defined as tests meeting at least one of the following criteria: patient discharged before test results were available for review; test type not recommended by published guidelines or by UUHC Thrombosis Service physicians for thrombophilia testing (Supplementary Table 2); test performed in situations associated with decreased accuracy; test was a duplicate test as a result of different thrombophilia panels containing identical tests; and test followed a provoked venous thromboembolism (VTE). Testing in situations associated with decreased accuracy are summarized in Supplementary Table 3 and included at least one of the following at the time of the test: anticoagulant therapy, acute thrombosis, pregnant or <8 weeks postpartum, and receiving estrogen-containing medications. Only test types known to be affected by the respective situation were included. Testing following a provoked VTE was defined as testing prompted by an acute thrombosis and performed within 3 months following major surgery (defined administratively as any surgery performed in an operating room), during pregnancy, <8 weeks postpartum, or while on estrogen-containing medications. Thrombophilia testing during anticoagulant therapy was defined as testing within 4 half-lives of anticoagulant administration based on medication administration records. Anticoagulant therapy changes were identified by comparing prior-to-admission and discharge medication lists.
Data Analysis
Patient and laboratory characteristics were summarized using descriptive statistics, including mean and standard deviation (SD) for continuous variables and proportions for categorical variables. Data analysis was performed using Excel (Version 2013, Microsoft Corporation. Redmond, Washington).
RESULTS
During the 6-month study period, 163 patients received at least 1 thrombophilia test during an ED visit or inpatient admission. Patient characteristics are summarized in Table 1. Tested patients were most commonly inpatients (96%) and female (71%). A total of 1451 thrombophilia tests were performed with a mean (± SD) of 8.9 ± 6.0 tests per patient. Testing characteristics are summarized in Table 2. Of the 39 different test types performed, the most commonly ordered were cardiolipin IgG and IgM antibodies (9% each), lupus anticoagulant (9%), and β2-glycoprotein 1 IgG and IgM antibodies (8% each). When combined with testing for phosphatidyl antibodies, antiphospholipid tests accounted for 70% of all tests. Overall, 134 (9%) test results were positive. The mean time for results to become available was 2.2 ± 2.5 days. The frequency of test types with corresponding positivity rates and mean time for results to become available are summarized in Supplementary Table 4.
The indications for thrombophilia testing are summarized in Table 3. Ischemic stroke was the most common indication for testing (50% of tests; 35% of patients), followed by VTE (21% of tests; 21% of patients), and pregnancy-related conditions (eg, preeclampsia, intrauterine fetal demise; 15% of tests; 25% of patients). Overall, 911 tests (63%) occurred in situations associated with minimal clinical utility, with 126 patients (77%) receiving at least one of these tests (Table 4).
Anticoagulant therapy was changed in 43 patients (26%) in the following ways: initiated in 35 patients (21%), transitioned to a different anticoagulant in 6 patients (4%), and discontinued in 2 patients (1%). Of the 35 patients initiating anticoagulant therapy, 29 had documented thrombosis (24 had VTE, 4 had cerebral venous sinus thrombosis [CVST], and 1 had basilar artery thrombosis). Overall, 2 instances were identified in which initiation of anticoagulant therapy at discharge was in response to thrombophilia test results. In the first instance, warfarin without a parenteral anticoagulant bridge was initiated for a 54-year-old patient with a cryptogenic stroke who tested positive for β2-glycoprotein 1 IgG antibodies, lupus anticoagulant, and protein S deficiency. In the second instance, warfarin with an enoxaparin bridge was initiated for a 26-year-old patient with a cryptogenic stroke who tested positive for β2-glycoprotein 1 IgG and IgM antibodies, cardiolipin IgG antibodies, lupus anticoagulant, protein C deficiency, and antithrombin deficiency. Of the 163 patients receiving thrombophilia testing, only 2 patients (1%) had clear documentation of being offered genetic consultation.
DISCUSSION
In this retrospective analysis, 1451 thrombophilia tests were performed in 163 patients over 6 months. Tested patients were relatively young, which is likely explained by the number of patients tested for pregnancy-related conditions and the fact that a stroke or VTE in younger patients more frequently prompted providers to suspect thrombophilia. Nearly three-fourths of patients were female, which is likely due to testing for pregnancy-related conditions and possibly diagnostic suspicion bias given the comparative predilection of antiphospholipid syndrome for women. The patient characteristics in our study are consistent with other studies evaluating thrombophilia testing.21,22
Thrombophilia testing was most frequently prompted by stroke, VTE, and pregnancy-related conditions. Only 26% of patients had acute thrombosis identified during the admission, primarily because of the high proportion of tests for cryptogenic strokes and pregnancy-related conditions. Thrombophilia testing is recommended in patients who have had a stroke when the stroke is considered to be cryptogenic after a standard stroke evaluation.23 Thrombophilia testing in pregnancy-related conditions is controversial but is often considered in situations such as stillbirths with severe placental pathology and/or significant growth restriction, or in mothers with a personal or family history of thrombosis.24 The proportion of testing for pregnancy-related conditions may be greater than at other institutions because UUHC Maternal Fetal Medicine is a referral center for women with conditions associated with hypercoagulability. Anticoagulant therapy was initiated in 21% of patients, but specifically in response to thrombophilia testing in only 2 instances; in most cases, anticoagulant therapy was initiated regardless of thrombophilia test results.
The results of this study confirm our hypothesis because the majority of thrombophilia tests occurred in situations associated with minimal clinical utility. Testing in these situations was not isolated to specific patients or medical services because 77% of tested patients received at least 1 test associated with minimal clinical utility. Our study took a conservative approach in defining scenarios associated with minimal clinical utility because other situations can also affect testing accuracy (eg, hepatic disease, nephrotic syndrome) but were not included in our analysis of this outcome.
The results of this study highlight opportunities to improve thrombophilia testing practices at our institution and may be generalizable to institutions with similar testing patterns. Because multiple medical services order thrombophilia tests, strategies to improve testing practices are still being determined. The results of this study can serve as a baseline for comparison after strategies are implemented. The most common situation associated with minimal clinical utility was the use of test types not generally recommended by guidelines or UUHC Thrombosis Service physicians for thrombophilia testing (eg, β2-glycoprotein 1 IgA antibodies, phosphatidyl antibodies). We intend to require a hematology or thrombosis specialty consult prior to ordering these tests. This intervention alone could potentially decrease unnecessary testing by a third. Another consideration is to require a specialty consult prior to any inpatient thrombophilia testing. This strategy has been found to decrease inappropriate testing at other institutions.21 We also intend to streamline available thrombophilia testing panels because a poorly designed panel could lead to ordering of multiple tests associated with minimal clinical utility. At least 12 different thrombophilia panels are currently available in our computerized physician order entry system (see Supplementary Table 5). We hypothesize that current panel designs contribute to providers inadvertently ordering unintended or duplicate tests and that reducing the number of available panels and clearly delineating what tests are contained in each panel is likely to reduce unnecessary testing. Other strategies being considered include using electronic clinical decision support tools, implementing strict ordering criteria for all inpatient testing, and establishing a thrombosis stewardship program.
Our study was unique in at least 2 ways. First, previous studies describing thrombophilia testing have described testing patterns for patients with specific indications (eg, VTE), whereas our study described all thrombophilia tests regardless of indication. This allows for testing pattern comparisons across indications and medical services, increasing the generalizability of our results. Second, this study quantifies tests occurring in situations associated with a practical definition of minimal clinical utility.
Our study has several limitations: (1) Many variables were reliant on provider notes and other documentation, which allows for potential misclassification of variables. (2) It was not always possible to determine the ultimate utility of each test in clinical management decisions, and our study did not investigate the impact of thrombophilia testing on duration of anticoagulant therapy. Additionally, select situations could benefit from testing regardless if anticoagulant therapy is altered (eg, informing contraceptive choices). (3) Testing performed following a provoked acute thrombosis was defined as testing within 3 months following administratively defined major surgery. This definition could have included some minor procedures that do not substantially increase VTE risk, resulting in underestimated clinical utility. (4) The UUHC University Hospital serves as a referral hospital for a large geographical area, and investigators did not have access to outpatient records for a large proportion of discharged patients. As a result, frequency of repeat testing could not be assessed, possibly resulting in overestimated clinical utility. (5) In categorizing indications for testing, testing for CVST was subcategorized under testing for ischemic stroke based on presenting symptoms rather than on underlying pathophysiology. The rationale for this categorization is that patients with CVST were often tested based on presenting symptoms. Additionally, tests for CVST were ordered by the neurology service, which also ordered tests for all other ischemic stroke indications. (6) The purpose of our study was to investigate the subset of the hospital’s patient population that received thrombophilia testing, and patients were identified by tests received and not by diagnosis codes. As a result, we are unable to provide the proportion of total patients treated at the hospital for specific conditions who were tested (eg, the proportion of stroke patients that received thrombophilia testing). (7) Current practice guidelines do not recommend testing for phosphatidyl antibodies, even when traditional antiphospholipid testing is negative.25-27 Although expert panels continue to explore associations between phosphatidyl antibodies and pregnancy morbidity and thrombotic events, the low level of evidence is insufficient to guide clinical management.28 Therefore, we categorized all phosphatidyl testing as associated with minimal clinical utility.
CONCLUSIONS
In a large academic medical center, the majority of tests occurred in situations associated with minimal clinical utility. Strategies to improve thrombophilia testing practices are needed in order to minimize potentially inappropriate testing, provide more cost-effective care, and promote value-driven outcomes.
Disclosure
S.W. received financial support for this submitted work via a Bristol-Myers-Squibb grant. G.F. received financial support from Portola Pharmaceuticals for consulting and lectures that were not related to this submitted work.
Thrombophilia is a prothrombotic state, either acquired or inherited, leading to a thrombotic predisposition.1 The most common heritable thrombophilias include factor V Leiden (FVL) and prothrombin G20210A. The most common acquired thrombophilia is the presence of phospholipid antibodies.1 Thrombotic risk varies with thrombophilia type. For example, deficiencies of antithrombin, protein C and protein S, and the presence of phospholipid antibodies, confer higher risk than FVL and prothrombin G20210A.2-5 Other thrombophilias (eg, methylenetetrahydrofolate reductase mutation, increased factor VIII activity) are relatively uncommon and/or their impact on thrombosis risk appears to be either minimal or unknown.1-6 There is little clinical evidence that testing for thrombophilia impacts subsequent thrombosis prevention.5,7,8 Multiple clinical guidelines and medical societies recommend against the routine and indiscriminate use of thrombophilia testing.8-13 In general, thrombophilia testing should be considered only if the result would lead to changes in anticoagulant initiation, intensity, and/or duration, or might inform interventions to prevent thrombosis in asymptomatic family members.8-13 However, thrombophilia testing rarely changes the acute management of a thrombotic event and may have harmful effects on patients and their family members because positive results may unnecessarily increase anxiety and negative results may provide false reassurance.6,14-18 The cost-effectiveness of thrombophilia testing is unknown. Economic models have sought to quantify cost-effectiveness, but conclusions from these studies are limited.7
The utility of thrombophilia testing in emergency department (ED) and inpatient settings is further limited because patients are often treated and discharged before thrombophilia test results are available. Additionally, in these settings, multiple factors increase the risk of false-positive or false-negative results (eg, acute thrombosis, acute illness, pregnancy, and anticoagulant therapy).19,20 The purpose of this study was to systematically assess thrombophilia testing patterns in the ED and hospitalized patients at an academic medical center and to quantify the proportion of tests associated with minimal clinical utility. We hypothesize that the majority of thrombophilia tests completed in the inpatient setting are associated with minimal clinical utility.
METHODS
Setting and Patients
This study was conducted at University of Utah Health Care (UUHC) University Hospital, a 488-bed academic medical center with a level I trauma center, primary stroke center, and 50-bed ED. Laboratory services for UUHC, including thrombophilia testing, are provided by a national reference laboratory, Associated Regional and University Pathologists Laboratories. This study included patients ≥18 years of age who received thrombophilia testing (Supplementary Table 1) during an ED visit or inpatient admission at University Hospital between July 1, 2014 and December 31, 2014. There were no exclusion criteria. An institutional electronic data repository was used to identify patients matching inclusion criteria. All study activities were reviewed and approved by the UUHC Institutional Review Board with a waiver of informed consent.
Outcomes
An electronic database query was used to identify patients, collect patient demographic information, and collect test characteristics. Each patient’s electronic medical record was manually reviewed to collect all other outcomes. Indication for thrombophilia testing was identified by manual review of provider notes. Thrombophilia tests occurring in situations associated with minimal clinical utility were defined as tests meeting at least one of the following criteria: patient discharged before test results were available for review; test type not recommended by published guidelines or by UUHC Thrombosis Service physicians for thrombophilia testing (Supplementary Table 2); test performed in situations associated with decreased accuracy; test was a duplicate test as a result of different thrombophilia panels containing identical tests; and test followed a provoked venous thromboembolism (VTE). Testing in situations associated with decreased accuracy are summarized in Supplementary Table 3 and included at least one of the following at the time of the test: anticoagulant therapy, acute thrombosis, pregnant or <8 weeks postpartum, and receiving estrogen-containing medications. Only test types known to be affected by the respective situation were included. Testing following a provoked VTE was defined as testing prompted by an acute thrombosis and performed within 3 months following major surgery (defined administratively as any surgery performed in an operating room), during pregnancy, <8 weeks postpartum, or while on estrogen-containing medications. Thrombophilia testing during anticoagulant therapy was defined as testing within 4 half-lives of anticoagulant administration based on medication administration records. Anticoagulant therapy changes were identified by comparing prior-to-admission and discharge medication lists.
Data Analysis
Patient and laboratory characteristics were summarized using descriptive statistics, including mean and standard deviation (SD) for continuous variables and proportions for categorical variables. Data analysis was performed using Excel (Version 2013, Microsoft Corporation. Redmond, Washington).
RESULTS
During the 6-month study period, 163 patients received at least 1 thrombophilia test during an ED visit or inpatient admission. Patient characteristics are summarized in Table 1. Tested patients were most commonly inpatients (96%) and female (71%). A total of 1451 thrombophilia tests were performed with a mean (± SD) of 8.9 ± 6.0 tests per patient. Testing characteristics are summarized in Table 2. Of the 39 different test types performed, the most commonly ordered were cardiolipin IgG and IgM antibodies (9% each), lupus anticoagulant (9%), and β2-glycoprotein 1 IgG and IgM antibodies (8% each). When combined with testing for phosphatidyl antibodies, antiphospholipid tests accounted for 70% of all tests. Overall, 134 (9%) test results were positive. The mean time for results to become available was 2.2 ± 2.5 days. The frequency of test types with corresponding positivity rates and mean time for results to become available are summarized in Supplementary Table 4.
The indications for thrombophilia testing are summarized in Table 3. Ischemic stroke was the most common indication for testing (50% of tests; 35% of patients), followed by VTE (21% of tests; 21% of patients), and pregnancy-related conditions (eg, preeclampsia, intrauterine fetal demise; 15% of tests; 25% of patients). Overall, 911 tests (63%) occurred in situations associated with minimal clinical utility, with 126 patients (77%) receiving at least one of these tests (Table 4).
Anticoagulant therapy was changed in 43 patients (26%) in the following ways: initiated in 35 patients (21%), transitioned to a different anticoagulant in 6 patients (4%), and discontinued in 2 patients (1%). Of the 35 patients initiating anticoagulant therapy, 29 had documented thrombosis (24 had VTE, 4 had cerebral venous sinus thrombosis [CVST], and 1 had basilar artery thrombosis). Overall, 2 instances were identified in which initiation of anticoagulant therapy at discharge was in response to thrombophilia test results. In the first instance, warfarin without a parenteral anticoagulant bridge was initiated for a 54-year-old patient with a cryptogenic stroke who tested positive for β2-glycoprotein 1 IgG antibodies, lupus anticoagulant, and protein S deficiency. In the second instance, warfarin with an enoxaparin bridge was initiated for a 26-year-old patient with a cryptogenic stroke who tested positive for β2-glycoprotein 1 IgG and IgM antibodies, cardiolipin IgG antibodies, lupus anticoagulant, protein C deficiency, and antithrombin deficiency. Of the 163 patients receiving thrombophilia testing, only 2 patients (1%) had clear documentation of being offered genetic consultation.
DISCUSSION
In this retrospective analysis, 1451 thrombophilia tests were performed in 163 patients over 6 months. Tested patients were relatively young, which is likely explained by the number of patients tested for pregnancy-related conditions and the fact that a stroke or VTE in younger patients more frequently prompted providers to suspect thrombophilia. Nearly three-fourths of patients were female, which is likely due to testing for pregnancy-related conditions and possibly diagnostic suspicion bias given the comparative predilection of antiphospholipid syndrome for women. The patient characteristics in our study are consistent with other studies evaluating thrombophilia testing.21,22
Thrombophilia testing was most frequently prompted by stroke, VTE, and pregnancy-related conditions. Only 26% of patients had acute thrombosis identified during the admission, primarily because of the high proportion of tests for cryptogenic strokes and pregnancy-related conditions. Thrombophilia testing is recommended in patients who have had a stroke when the stroke is considered to be cryptogenic after a standard stroke evaluation.23 Thrombophilia testing in pregnancy-related conditions is controversial but is often considered in situations such as stillbirths with severe placental pathology and/or significant growth restriction, or in mothers with a personal or family history of thrombosis.24 The proportion of testing for pregnancy-related conditions may be greater than at other institutions because UUHC Maternal Fetal Medicine is a referral center for women with conditions associated with hypercoagulability. Anticoagulant therapy was initiated in 21% of patients, but specifically in response to thrombophilia testing in only 2 instances; in most cases, anticoagulant therapy was initiated regardless of thrombophilia test results.
The results of this study confirm our hypothesis because the majority of thrombophilia tests occurred in situations associated with minimal clinical utility. Testing in these situations was not isolated to specific patients or medical services because 77% of tested patients received at least 1 test associated with minimal clinical utility. Our study took a conservative approach in defining scenarios associated with minimal clinical utility because other situations can also affect testing accuracy (eg, hepatic disease, nephrotic syndrome) but were not included in our analysis of this outcome.
The results of this study highlight opportunities to improve thrombophilia testing practices at our institution and may be generalizable to institutions with similar testing patterns. Because multiple medical services order thrombophilia tests, strategies to improve testing practices are still being determined. The results of this study can serve as a baseline for comparison after strategies are implemented. The most common situation associated with minimal clinical utility was the use of test types not generally recommended by guidelines or UUHC Thrombosis Service physicians for thrombophilia testing (eg, β2-glycoprotein 1 IgA antibodies, phosphatidyl antibodies). We intend to require a hematology or thrombosis specialty consult prior to ordering these tests. This intervention alone could potentially decrease unnecessary testing by a third. Another consideration is to require a specialty consult prior to any inpatient thrombophilia testing. This strategy has been found to decrease inappropriate testing at other institutions.21 We also intend to streamline available thrombophilia testing panels because a poorly designed panel could lead to ordering of multiple tests associated with minimal clinical utility. At least 12 different thrombophilia panels are currently available in our computerized physician order entry system (see Supplementary Table 5). We hypothesize that current panel designs contribute to providers inadvertently ordering unintended or duplicate tests and that reducing the number of available panels and clearly delineating what tests are contained in each panel is likely to reduce unnecessary testing. Other strategies being considered include using electronic clinical decision support tools, implementing strict ordering criteria for all inpatient testing, and establishing a thrombosis stewardship program.
Our study was unique in at least 2 ways. First, previous studies describing thrombophilia testing have described testing patterns for patients with specific indications (eg, VTE), whereas our study described all thrombophilia tests regardless of indication. This allows for testing pattern comparisons across indications and medical services, increasing the generalizability of our results. Second, this study quantifies tests occurring in situations associated with a practical definition of minimal clinical utility.
Our study has several limitations: (1) Many variables were reliant on provider notes and other documentation, which allows for potential misclassification of variables. (2) It was not always possible to determine the ultimate utility of each test in clinical management decisions, and our study did not investigate the impact of thrombophilia testing on duration of anticoagulant therapy. Additionally, select situations could benefit from testing regardless if anticoagulant therapy is altered (eg, informing contraceptive choices). (3) Testing performed following a provoked acute thrombosis was defined as testing within 3 months following administratively defined major surgery. This definition could have included some minor procedures that do not substantially increase VTE risk, resulting in underestimated clinical utility. (4) The UUHC University Hospital serves as a referral hospital for a large geographical area, and investigators did not have access to outpatient records for a large proportion of discharged patients. As a result, frequency of repeat testing could not be assessed, possibly resulting in overestimated clinical utility. (5) In categorizing indications for testing, testing for CVST was subcategorized under testing for ischemic stroke based on presenting symptoms rather than on underlying pathophysiology. The rationale for this categorization is that patients with CVST were often tested based on presenting symptoms. Additionally, tests for CVST were ordered by the neurology service, which also ordered tests for all other ischemic stroke indications. (6) The purpose of our study was to investigate the subset of the hospital’s patient population that received thrombophilia testing, and patients were identified by tests received and not by diagnosis codes. As a result, we are unable to provide the proportion of total patients treated at the hospital for specific conditions who were tested (eg, the proportion of stroke patients that received thrombophilia testing). (7) Current practice guidelines do not recommend testing for phosphatidyl antibodies, even when traditional antiphospholipid testing is negative.25-27 Although expert panels continue to explore associations between phosphatidyl antibodies and pregnancy morbidity and thrombotic events, the low level of evidence is insufficient to guide clinical management.28 Therefore, we categorized all phosphatidyl testing as associated with minimal clinical utility.
CONCLUSIONS
In a large academic medical center, the majority of tests occurred in situations associated with minimal clinical utility. Strategies to improve thrombophilia testing practices are needed in order to minimize potentially inappropriate testing, provide more cost-effective care, and promote value-driven outcomes.
Disclosure
S.W. received financial support for this submitted work via a Bristol-Myers-Squibb grant. G.F. received financial support from Portola Pharmaceuticals for consulting and lectures that were not related to this submitted work.
1. Franco RF, Reitsma PH. Genetic risk factors of venous thrombosis. Hum Genet. 2001;109(4):369-384.
2. Ridker PM, Hennekens CH, Lindpaintner K, Stampfer MJ, Eisenberg PR, Miletich JP. Mutation in the gene coding for coagulation factor V and the risk of myocardial infarction, stroke, and venous thrombosis in apparently healthy men. N Engl J Med. 1995;332(14):912-917.
3. Koster T, Rosendaal FR, de Ronde H, Briët E, Vandenbroucke JP, Bertina RM. Venous thrombosis due to poor anticoagulant response to activated protein C: Leiden Thrombophilia Study. Lancet. 1993;342(8886-8887):1503-1506.
4. Margaglione M, Brancaccio V, Giuliani N, et al. Increased risk for venous thrombosis in carriers of the prothrombin G-->A20210 gene variant. Ann Intern Med. 1998;129(2):89-93.
5. De Stefano V, Martinelli I, Mannucci PM, et al. The risk of recurrent deep venous thrombosis among heterozygous carriers of both factor V Leiden and the G20210A prothrombin mutation. N Engl J Med. 1999;341:801-806.
6. Dickey TL. Can thrombophilia testing help to prevent recurrent VTE? Part 2. JAAPA. 2002;15(12):23-24, 27-29.
7. Simpson EL, Stevenson MD, Rawdin A, Papaioannou D. Thrombophilia testing in people with venous thromboembolism: systematic review and cost-effectiveness analysis. Health Technol Assess. 2009;13(2):iii, ix-x, 1-91.
8. National Institute for Health and Clinical Excellence. Venous thromboembolic disease: the management of venous thromboembolic diseases and the role of thrombophilia testing. NICE clinical guideline 144. https://www.nice.org.uk/guidance/cg144. Accessed June 30, 2017.
9. Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Working Group. Recommendations from the EGAPP Working Group: routine testing for factor V Leiden (R506Q) and prothrombin (20210G>A) mutations in adults with a history of idiopathic venous thromboembolism and their adult family members. Genet Med. 2011;13(1):67-76.
10. Kearon C, Akl EA, Comerota AJ, et al. Antithrombotic therapy for VTE disease: antithrombotic therapy and prevention of thrombosis, 9th ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest. 2012;141(2 Suppl):e419S-494S.
11. Baglin T, Gray E, Greaves M, et al. Clinical guidelines for testing for heritable thrombophilia. Br J Haematol. 2010;149(2):209-220.
12. Hicks LK, Bering H, Carson KR, et al. The ASH Choosing Wisely® campaign: five hematologic tests and treatments to question. Hematology Am Soc Hematol Educ Program. 2013;2013:9-14.
13. Stevens SM, Woller SC, Bauer KA, et al. Guidance for the evaluation and treatment of hereditary and acquired thrombophilia. J Thromb Thrombolysis. 2016;41(1):154-164.
14. Christiansen SC, Cannegieter SC, Koster T, Vandenbroucke JP, Rosendaal FR. Thrombophilia, clinical factors, and recurrent venous thrombotic events. JAMA. 2005;293(19):2352-2361.
15. Prandoni P, Lensing AW, Cogo A, et al. The long-term clinical course of acute deep venous thrombosis. Ann Intern Med. 1996;125(1):1-7.
16. Miles JS, Miletich JP, Goldhaber SZ, Hennekens CH, Ridker PM. G20210A mutation in the prothrombin gene and the risk of recurrent venous thromboembolism. J Am Coll Cardiol. 2001;37(1):215-218.
17. Eichinger S, Weltermann A, Mannhalter C, et al. The risk of recurrent venous thromboembolism in heterozygous carriers of factor V Leiden and a first spontaneous venous thromboembolism. Arch Intern Med. 2002;162(20):2357-2360.
18. Mazzolai L, Duchosal MA. Hereditary thrombophilia and venous thromboembolism: critical evaluation of the clinical implications of screening. Eur J Vasc Endovasc Surg. 2007;34(4):483-488.
19. Merriman L, Greaves M. Testing for thrombophilia: an evidence-based approach. Postgrad Med J. 2006;82(973):699-704.
20. Favaloro EJ, McDonald D, Lippi G. Laboratory investigation of thrombophilia: the good, the bad, and the ugly. Semin Thromb Hemost. 2009;35(7):695-710.
21. Shen YM, Tsai J, Taiwo E, et al. Analysis of thrombophilia test ordering practices at an academic center: a proposal for appropriate testing to reduce harm and cost. PLoS One. 2016;11(5):e0155326.
22. Meyer MR, Witt DM, Delate T, et al. Thrombophilia testing patterns amongst patients with acute venous thromboembolism. Thromb Res. 2015;136(6):1160-1164.
23. Saver JL. Clinical practice: cryptogenic stroke. N Engl J Med. 2016;374(21):2065-2074.
24. ACOG practice bulletin no. 102: management of stillbirth. Obstet Gynecol. 2009;113(3):748-761.
25. Miyakis S, Lockshin MD, Atsumi T, et al. International consensus statement on an update of the classification criteria for definite antiphospholipid syndrome (APS). J Thromb Haemost. 2006;4(2):295-306.
26. Keeling D, Mackie I, Moore GW, Greer IA, Greaves M, British Committee for Standards in Haematology. Guidelines on the investigation and management of antiphospholipid syndrome. Br J Haematol. 2012;157(1):47-58.
27. Committee on Practice Bulletins—Obstetrics, American College of Obstetricians and Gynecologists. Practice bulletin no. 132: antiphospholipid syndrome. Obstet Gynecol. 2012;120(6):1514-1521.
28. Bertolaccini ML, Amengual O, Andreoli L, et al. 14th International Congress on Antiphospholipid Antibodies Task Force. Report on antiphospholipid syndrome laboratory diagnostics and trends. Autoimmun Rev. 2014;13(9):917-930.
© 2017 Society of Hospital Medicine
A Randomized Controlled Trial of a CPR Decision Support Video for Patients Admitted to the General Medicine Service
Discussions about cardiopulmonary resuscitation (CPR) can be difficult due to their association with end of life. The Patient Self-Determination Act (H.R.4449 — 101st Congress [1989-1990]) and institutional standards mandate collaboration between care providers and patients regarding goals of care in emergency situations such as cardiopulmonary arrest. The default option is to provide CPR, which may involve chest compressions, intubation, and/or defibrillation. Yet numerous studies show that a significant number of patients have no code preference documented in their medical chart, and even fewer report a conversation with their care provider about their wishes regarding CPR.1-3 CPR is an invasive and potentially painful procedure with a higher chance of failure than success,4 and yet many patients report that their provider did not discuss with them the risks and benefits of resuscitation.5,6 Further underscoring the importance of individualized discussions about CPR preferences is the reality that factors such as age and disease burden skew the likelihood of survival after cardiopulmonary arrest.7
Compounding the lack of adequate provider-patient discussion of the risks and benefits of resuscitation are widespread misunderstandings about CPR in the lay population. Patients routinely overestimate the likelihood of survival following CPR.8,9 This may be partially due to the portrayal of CPR in the lay media as highly efficacious.10 Other factors known to hinder effective provider-patient discussions about CPR preferences are providers’ discomfort with the subject11 and perceived time constraints.12
Informational videos have been developed to assist patients with decision making about CPR and have been shown to influence patients’ choices in the setting of life-limiting diseases such as advanced cancer,13,14 serious illness with a prognosis of less than 1 year,15 and dementia.16 While discussion of code status is vitally important in end-of-life planning for seriously ill individuals, delayed discussion of CPR preferences is associated with a significant increase in the number of invasive procedures performed at the end of life, increased length of stay in the hospital, and increased medical cost.17 Despite clear evidence that earlier discussion of resuscitation options is valuable, no studies have examined the impact of a video about code status options in the general patient population.
Here we present the findings of a randomized trial in patients hospitalized on the general medicine wards who were 65 years of age or older, regardless of illness severity or diagnosis. The video tool was a supplement to, rather than a replacement for, standard provider-patient communication about code preferences, and we compared patients who watched the video against controls who had standard discussions with their providers. Our video detailed the process of chest compressions and intubation during CPR and explained the differences between the code statuses: full code, do not resuscitate (DNR), and do not resuscitate/do not intubate (DNR/DNI). We found a significant difference between the 2 groups: more individuals in the video group chose DNR/DNI. These findings suggest that video support tools may be a useful supplement to traditional provider discussions about code preferences in the general patient population.
METHODS
We enrolled patients from the general medicine wards at the Minneapolis VA Hospital from September 28, 2015 to October 23, 2015. Eligibility criteria included age 65 years or older, ability to provide informed consent, and ability to communicate in English. Study recruitment and data collection were performed by a study coordinator who was a house staff physician with no role in the care of the participants. The medical charts of all general medicine patients were reviewed to determine whether they met the age criterion. The physician of record for each potential participant was contacted to assess whether the patient was able to provide informed consent and communicate in English. Eligible patients were approached, and informed consent was obtained from those who chose to participate. After obtaining informed consent, patients were randomized to the intervention or usual-care arm using a random number generator.
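The report specifies only that a random number generator was used for allocation. Purely as an illustration, a minimal simple-randomization sketch in Python might look like the following; the arm labels and participant IDs are hypothetical, not taken from the study:

```python
import secrets

# Hypothetical arm labels; the trial's actual allocation software is not described.
ARMS = ("usual_care", "video_intervention")

def allocate() -> str:
    """Assign one consented participant to an arm with equal probability.

    Simple (unblocked) 1:1 randomization: arm sizes can drift apart by
    chance, which is consistent with the 60 vs 59 split reported here.
    """
    return secrets.choice(ARMS)

if __name__ == "__main__":
    for participant_id in ("P001", "P002", "P003"):  # hypothetical IDs
        print(participant_id, "->", allocate())
```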
Those assigned to the intervention arm watched a 6-minute video explaining the code-preference choices of full code, DNR, or DNR/DNI. Full code was described as possibly including CPR, intubation, and/or defibrillation depending on the clinical situation. Do not resuscitate was described as meaning no CPR or defibrillation but possible intubation in the case of respiratory failure. Do not resuscitate/do not intubate was explained as meaning no CPR, no defibrillation, and no intubation, but rather permitting “natural death” to occur. The video showed a mock code with chest compressions, defibrillation, and intubation on a mannequin, as well as palliative care specialists who discussed potential complications and survival rates of in-hospital resuscitation.
The video was created at the University of Minnesota with the departments of palliative care and internal medicine (www.mmcgmeservices.org/codestat.html). After viewing the video, participants in the intervention arm completed a questionnaire designed to assess their knowledge and beliefs about CPR and their trust in their medical care providers. They were asked to circle their code preference. Participants’ medical teams were made aware of the stated code preference and were counseled to discuss it further if it differed from the previously documented preference.
Participants in the control arm were assigned to usual care. At the institution where this study occurred, a discussion about code preferences between the patient and their medical team is considered the standard of care. After informed consent was obtained, participants filled out the same questionnaire as the participants in the intervention arm. They were asked to circle their code status preference. If they chose to ask questions about resuscitation, these were answered, but the study coordinator did not volunteer information about resuscitation or intervene in the medical care of the participants in any way.
All participants’ demographic characteristics and outcomes were described using proportions for categorical variables and means ± standard deviation for continuous variables. The primary outcome was participants’ stated code preference (full code, DNR, or DNR/DNI). Secondary outcomes included comparison of trust in medical providers, resuscitation beliefs, and desire for life-prolonging interventions as obtained from the questionnaire.
We compared code preferences between the intervention and control groups using the Fisher exact test and compared questionnaire responses between the 2 groups using analysis of variance (ANOVA). All reported P values are 2-sided, with P < 0.05 considered significant. The project originally targeted a sample size of 194 participants for 80% power to detect a 20% difference in code preference choices between the intervention and control groups. Given the short time frame available to enroll participants, the target sample size was not reached; however, the observed effect size was greater than originally expected.
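The baseline proportions assumed in the published power calculation are not reported. As an illustration of the machinery only, the following Python sketch computes a two-sided, two-proportion sample size with statsmodels under invented assumptions (50% vs 70% choosing full code, a 20% absolute difference); it is not expected to reproduce the published target of 194:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Invented illustrative proportions; the authors' actual assumptions
# behind the 194-participant target are not stated in the paper.
effect_size = proportion_effectsize(0.50, 0.70)  # Cohen's h

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,              # two-sided, matching P < 0.05 in the paper
    power=0.80,
    ratio=1.0,               # 1:1 allocation
    alternative="two-sided",
)
print(f"required sample size: {n_per_arm:.1f} per arm")
# Under these invented inputs this yields about 46 per arm; the published
# total of 194 implies different input assumptions.
```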
RESULTS
Study Participants
A total of 273 potentially eligible patients were approached to participate, and 119 (44%) enrolled (Figure 1). Of the 154 patients who were not enrolled, 42 were unable to give consent because of the severity of their illness or their mental status, and 112 declined participation, citing reasons such as disinterest in the consent paperwork, desire to spend time with visitors, and unease with the subject matter. Patients who declined participation did not differ significantly by age, sex, or race from those enrolled in the study.
Among the 119 participants, 60 were randomized to the control arm and 59 to the intervention arm. Participants in the 2 arms did not differ significantly in age, sex, or race (P > 0.05), although all 4 women in the study were randomized to the intervention arm. Eighty-seven percent of the study population identified as white; the remainder identified as black, Asian, Pacific Islander, or Native American, or declined to answer. The mean age was 75.8 years in the control arm vs 75.2 years in the intervention arm.
Primary diagnoses in the study group ranged widely, from relatively minor skin infections to acute pancreatitis. The control and intervention arms did not differ significantly in the incidence of heart failure, pulmonary disease, renal dialysis, cirrhosis, stroke, or active cancer (P > 0.05). Patients were considered to have had a stroke if they suffered a stroke during the hospital admission or had long-term sequelae of a prior stroke. Patients were considered to have active cancer if they were currently undergoing treatment or had metastases. Participants were considered to have multiple morbidities if they had 2 or more of the listed conditions. There was no significant difference between the arms in the number of participants with multiple morbidities (27% in the control group and 24% in the video group).
Code Status Preference
There was a significant difference in code status preferences between the intervention and control arms (P < 0.00001; Figure 2). In the control arm, 71% of participants chose full code, 12% chose DNR, and 17% chose DNR/DNI. In the intervention arm, only 37% chose full code, 7% chose DNR, and 56% chose DNR/DNI.
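As a rough consistency check, approximate counts can be back-calculated from the reported arm sizes and percentages and tested. The authors used the Fisher exact test, which SciPy implements only for 2×2 tables, so this Python sketch substitutes the chi-square approximation; an exact test on the full 2×3 table would require other software (e.g., R's fisher.test):

```python
from scipy.stats import chi2_contingency

# Approximate counts reconstructed from the reported arm sizes and
# percentages (control n=60: 71%/12%/17%; video n=59: 37%/7%/56%).
# Columns: full code, DNR, DNR/DNI.
table = [
    [43, 7, 10],   # control arm
    [22, 4, 33],   # intervention arm
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.1e}")
# With these rounded counts, p is on the order of 5e-5, comparable in
# magnitude to the reported P < 0.00001; the Fisher exact value the
# authors computed will differ slightly.
```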
Secondary Outcomes
Participants in the control and intervention arms were asked about their trust in their medical team (Question 1, Figure 3). There was no significant difference, but a trend toward less trust in the intervention group (P = 0.083): 93% of the control arm and 76% of the intervention arm agreed with the statement “My doctors and healthcare team want what is best for me.”
Question 2, “If I choose to avoid resuscitation efforts, I will not receive care,” was designed to assess participants’ knowledge and perceptions about the care they would receive if they chose DNR/DNI as their code status. No significant difference was seen between the control and intervention arms, with 28% of the control group agreeing with the statement compared to 22% of the video group.
For Question 3, participants were asked to respond to the statement “I would like to live as long as possible, even if I never leave the hospital.” No significant differences were seen between the control and intervention arms, with 22% of both groups agreeing with the statement.
When we examined participant responses by the code status chosen, a significantly higher percentage of participants who chose full code agreed with the statement in Question 3 (P = 0.0133): 27% of participants who chose full code agreed, compared to 18% of those who chose DNR and 12% of those who chose DNR/DNI. There was no significant association (P > 0.05) between code status choice and responses to Question 1 or Question 2.
DISCUSSION
This study examined the effect of a video about CPR and intubation on the code status preferences of hospitalized patients. Participants who viewed the video were more likely to choose to forgo these treatments. Participants who chose CPR and intubation were more likely to agree that they would want to live as long as possible even if that time were spent in a medical setting.
To our knowledge, this is the first study to examine the role of a video decision support tool about code choices in the general hospital population, regardless of prognosis. Previous work has trialed video support tools in hospitalized patients with a prognosis of less than 1 year,15 patients admitted to the ICU,18 outpatients with cancer,13,14 and patients with dementia.16 Unlike previous studies, our study included patients across a wide range of illness severity.
Discussions about resuscitation are important for all adults admitted to the hospital because of the unpredictable nature of illness and the importance of providing high-quality care at the end of life. A recent study indicates that in-hospital cardiopulmonary arrest occurs at a rate of almost 1 per 1000 hospital days.19 These discussions are particularly salient for patients 65 years and older because of the higher incidence of death in this group. Inpatient admission is often the result of a change in health status, making it an important time for patients to reassess their resuscitation preferences in light of their physical state and known comorbidities.
Video tools supplement the traditional code status discussion in several key ways. They provide a visual simulation of the procedures that occur during a typical resuscitation, helping patients understand what CPR and intubation entail and conveying information that might be missed in verbal discussions. Visual media are now a common way for patients to obtain medical information20-22 and may be particularly helpful to patients who have low health literacy.23 Video tools also help ensure that patients receive all the facts about resuscitation irrespective of how busy their provider may be or how comfortable the provider is with the topic. Lastly, video tools can reinforce information shared in the initial code status discussion. The significant difference in code status preference between our control and video arms suggests that the video tool had a substantial impact on patient choices.
While our study indicates the utility of video tools in code status discussions with hospitalized patients, it has limitations. The study enrolled participants who were predominantly white and male, all recruited from the Minneapolis Veterans Affairs Health Care System in Minnesota; this relatively homogeneous study population may limit generalizability. Another potential limitation was the large proportion of eligible patients who declined to participate (41%), many citing an unwillingness to sign the consent paperwork. Additionally, the study coordinator was not blinded to participants’ randomization, which could introduce ascertainment bias. Also of concern was a trend, albeit nonsignificant, toward less trust in the healthcare team in the video group. Because the study was not designed to assess trust in the healthcare team both before and after the intervention, it is unclear whether this difference was a result of the video.
Another area of potential concern is that visual images can be edited to sway viewers’ opinions based on the way content is presented. In our video, we included input from palliative care and internal medicine specialists. Cardiopulmonary resuscitation and intubation were performed on a CPR mannequin. The risks and benefits of CPR and intubation were discussed, as were the implications of choosing DNR or DNR/DNI code statuses.
The questionnaire that we used to assess participants’ knowledge and beliefs about resuscitation showed no differences between the control and intervention arms. We were surprised that a substantial number of participants in the intervention group agreed with the statement, “If I choose to avoid resuscitation efforts, I will not receive care.” Our video specifically addressed the common belief that choosing DNR/DNI or DNR code status means that a patient will no longer receive medical care. It is possible that participants were confused by the wording of the question or understood it to apply only to care received after a cardiopulmonary arrest had occurred.
This study and several others14-16 show that video tools influence participants’ code status preferences. There is clinical and humanistic importance in helping patients make informed decisions about whether they would want CPR and/or intubation if their heart were to stop or if they were to stop breathing. Our data suggest that video tools are an efficient way to improve patient care and should be made widely available.
Disclosures: The authors report no conflicts of interest.
1. Dunn RH, Ahn J, Bernstein J. End-of-life care planning and fragility fractures of the hip: are we missing a valuable opportunity? Clin Orthop Relat Res. 2016;474(7):1736-1739.
2. Warren MB, Lapid MI, McKean AJ, Cha SS, Stevens MA, Brekke FM, et al. Code status discussions in psychiatric and medical inpatients. J Clin Psychiatry. 2015;76(1):49-53.
3. Bhatia HL, Patel NR, Choma NN, Grande J, Giuse DA, Lehmann CU. Code status and resuscitation options in the electronic health record. Resuscitation. 2015;87:14-20.
4. Singh S, Namrata, Grewal A, Gautam PL, Luthra N, Kaur A. Evaluation of cardiopulmonary resuscitation (CPR) for patient outcomes and their predictors. J Clin Diagn Res. 2016;10(1):UC01-UC04.
5. Anderson WG, Chase R, Pantilat SZ, Tulsky JA, Auerbach AD. Code status discussions between attending hospitalist physicians and medical patients at hospital admission. J Gen Intern Med. 2011;26(4):359-366.
6. Einstein DJ, Einstein KL, Mathew P. Dying for advice: code status discussions between resident physicians and patients with advanced cancer--a national survey. J Palliat Med. 2015;18(6):535-541.
7. Piscator E, Hedberg P, Göransson K, Djärv T. Survival after in-hospital cardiac arrest is highly associated with the Age-combined Charlson Co-morbidity Index in a cohort study from a two-site Swedish University hospital. Resuscitation. 2016;99:79-83.
8. Zijlstra TJ, Leenman-Dekker SJ, Oldenhuis HK, Bosveld HE, Berendsen AJ. Knowledge and preferences regarding cardiopulmonary resuscitation: a survey among older patients. Patient Educ Couns. 2016;99(1):160-163.
9. Wilson ME, Akhoundi A, Krupa AK, Hinds RF, Litell JM, Gajic O, Kashani K. Development, validation, and results of a survey to measure understanding of cardiopulmonary resuscitation choices among ICU patients and their surrogate decision makers. BMC Anesthesiol. 2014;14:15.
10. Harris D, Willoughby H. Resuscitation on television: realistic or ridiculous? A quantitative observational analysis of the portrayal of cardiopulmonary resuscitation in television medical drama. Resuscitation. 2009;80(11):1275-1279.
11. Mills LM, Rhoads C, Curtis JR. Medical student training on code status discussions: how far have we come? J Palliat Med. 2016;19(3):323-325.
12. Binder AF, Huang GC, Buss MK. Uninformed consent: do medicine residents lack the proper framework for code status discussions? J Hosp Med. 2016;11(2):111-116.
13. Volandes AE, Levin TT, Slovin S, Carvajal RD, O’Reilly EM, et al. Augmenting advance care planning in poor prognosis cancer with a video decision aid: a preintervention-postintervention study. Cancer. 2012;118(17):4331-4338.
14. El-Jawahri A, Podgurski LM, Eichler AF, Plotkin SR, Temel JS, Mitchell SL, et al. Use of video to facilitate end-of-life discussions with patients with cancer: a randomized controlled trial. J Clin Oncol. 2010;28(2):305-310.
15. El-Jawahri A, Mitchell SL, Paasche-Orlow MK, Temel JS, Jackson VA, Rutledge RR, et al. A randomized controlled trial of a CPR and intubation video decision support tool for hospitalized patients. J Gen Intern Med. 2015;30(8):1071-1080.
16. Volandes AE, Paasche-Orlow MK, Barry MJ, Gillick MR, Minaker KL, Chang Y, et al. Video decision support tool for advance care planning in dementia: randomised controlled trial. BMJ. 2009;338:b2159.
17. Celso BG, Meenrajan S. The triad that matters: palliative medicine, code status, and health care costs. Am J Hosp Palliat Care. 2010;27(6):398-401.
18. Wilson ME, Krupa A, Hinds RF, Litell JM, Swetz KM, Akhoundi A, et al. A video to improve patient and surrogate understanding of cardiopulmonary resuscitation choices in the ICU: a randomized controlled trial. Crit Care Med. 2015;43(3):621-629.
19. Overdyk FJ, Dowling O, Marino J, Qiu J, Chien HL, Erslon M, et al. Association of opioids and sedatives with increased risk of in-hospital cardiopulmonary arrest from an administrative database. PLoS One. 2016;11(2):e0150214.
20. Stacey D, Samant R, Bennett C. Decision making in oncology: a review of patient decision aids to support patient participation. CA Cancer J Clin. 2008;58(5):293-304.
21. Lin GA, Aaronson DS, Knight SJ, Carroll PR, Dudley RA. Patient decision aids for prostate cancer treatment: a systematic review of the literature. CA Cancer J Clin. 2009;59(6):379-390.
22. O’Brien MA, Whelan TJ, Villasis-Keever M, Gafni A, Charles C, Roberts R, et al. Are cancer-related decision aids effective? A systematic review and meta-analysis. J Clin Oncol. 2009;27(6):974-985.
23. Sudore RL, Landefeld CS, Pérez-Stable EJ, Bibbins-Domingo K, Williams BA, Schillinger D. Unraveling the relationship between literacy, language proficiency, and patient-physician communication. Patient Educ Couns. 2009;75(3):398-402.
Discussions about cardiopulmonary resuscitation (CPR) can be difficult due to their association with end of life. The Patient Self Determination Act (H.R.4449 — 101st Congress [1989-1990]) and institutional standards mandate collaboration between care providers and patients regarding goals of care in emergency situations such as cardiopulmonary arrest. The default option is to provide CPR, which may involve chest compressions, intubation, and/or defibrillation. Yet numerous studies show that a significant number of patients have no code preference documented in their medical chart, and even fewer report a conversation with their care provider about their wishes regarding CPR.1-3 CPR is an invasive and potentially painful procedure with a higher chance of failure than success4, and yet many patients report that their provider did not discuss with them the risks and benefits of resuscitation.5,6 Further highlighting the importance of individual discussions about CPR preferences is the reality that factors such as age and disease burden further skew the likelihood of survival after cardiopulmonary arrest.7
Complicating the lack of appropriate provider and patient discussion of the risks and benefits of resuscitation are significant misunderstandings about CPR in the lay population. Patients routinely overestimate the likelihood of survival following CPR.8,9 This may be partially due to the portrayal of CPR in the lay media as highly efficacious.10 Other factors known to prevent effective provider-and-patient discussions about CPR preferences are providers’ discomfort with the subject11 and perceived time constraints.12
Informational videos have been developed to assist patients with decision making about CPR and have been shown to impact patients’ choices in the setting of life-limiting diseases such as advanced cancer,13-14 serious illness with a prognosis of less than 1 year,15 and dementia.16 While discussion of code status is vitally important in end-of-life planning for seriously ill individuals, delayed discussion of CPR preferences is associated with a significant increase in the number of invasive procedures performed at the end of life, increased length of stay in the hospital, and increased medical cost.17 Despite clear evidence that earlier discussion of resuscitation options are valuable, no studies have examined the impact of a video about code status options in the general patient population.
Here we present our findings of a randomized trial in patients hospitalized on the general medicine wards who were 65 years of age or older, regardless of illness severity or diagnosis. The video tool was a supplement for, rather than a replacement of, standard provider and patient communication about code preferences, and we compared patients who watched the video against controls who had standard discussions with their providers. Our video detailed the process of chest compressions and intubation during CPR and explained the differences between the code statuses: full code, do not resuscitate (DNR), and do not resuscitate/do not intubate (DNR/DNI). We found a significant difference between the 2 groups, with significantly more individuals in the video group choosing DNR/DNI. These findings suggest that video support tools may be a useful supplement to traditional provider discussions about code preferences in the general patient population.
METHODS
We enrolled patients from the general medicine wards at the Minneapolis VA Hospital from September 28, 2015 to October 23, 2015. Eligibility criteria included age 65 years or older, ability to provide informed consent, and ability to communicate in English. Study recruitment and data collection were performed by a study coordinator who was a house staff physician and had no role in the care of the participants. The medical charts of all general medicine patients were reviewed to determine if they met the age criteria. The physician of record for potential participants was contacted to assess if the patient was able to provide informed consent and communicate in English. Eligible patients were approached and informed consent was obtained from those who chose to participate in the study. After obtaining informed consent, patients were randomized using a random number generator to the intervention or usual-care arm of the study.
Those who were assigned to the intervention arm watched a 6-minute long video explaining the code-preference choices of full code, DNR, or DNR/DNI. Full code was described as possibly including CPR, intubation, and/or defibrillation depending on the clinical situation. Do not resuscitate was described as meaning no CPR or defibrillation but possible intubation in the case of respiratory failure. Do not resuscitate/do not intubate was explained as meaning no CPR, no defibrillation, and no intubation but rather permitting “natural death” to occur. The video showed a mock code with chest compressions, defibrillation, and intubation on a mannequin as well as palliative care specialists who discussed potential complications and survival rates of inhospital resuscitation.
The video was created at the University of Minnesota with the departments of palliative care and internal medicine (www.mmcgmeservices.org/codestat.html). After viewing the video, participants in the intervention arm filled out a questionnaire designed to assess their knowledge and beliefs about CPR and trust in their medical care providers. They were asked to circle their code preference. The participants’ medical teams were made aware of the code preferences and were counseled to discuss code preferences further if it was different from their previously documented code preference.
Participants in the control arm were assigned to usual care. At the institution where this study occurred, a discussion about code preferences between the patient and their medical team is considered the standard of care. After informed consent was obtained, participants filled out the same questionnaire as the participants in the intervention arm. They were asked to circle their code status preference. If they chose to ask questions about resuscitation, these were answered, but the study coordinator did not volunteer information about resuscitation or intervene in the medical care of the participants in any way.
All participants’ demographic characteristics and outcomes were described using proportions for categorical variables and means ± standard deviation for continuous variables. The primary outcome was participants’ stated code preference (full code, DNR, or DNR/DNI). Secondary outcomes included comparison of trust in medical providers, resuscitation beliefs, and desire for life-prolonging interventions as obtained from the questionnaire.
We analyzed code preferences between the intervention and control groups using Fisher exact test. We used analysis of variance (ANOVA) to compare questionnaire responses between the 2 groups. All reported P values are 2-sided with P < 0.05 considered significant. The project originally targeted a sample size of 194 participants for 80% power to detect a 20% difference in the code preference choices between intervention and control groups. Given the short time frame available to enroll participants, the target sample size was not reached. Propitiously, the effect size was greater than originally expected.
RESULTS
Study Participants
A total of 273 potentially eligible patients were approached to participate and 119 (44%) enrolled. (Figure 1). Of the 154 patients that were deemed eligible after initial screening, 42 patients were unable to give consent due to the severity of their illness or because of their mental status. Another 112 patients declined participation in the study, citing reasons such as disinterest in the consent paperwork, desire to spend time with visitors, and unease with the subject matter. Patients who declined participation did not differ significantly by age, sex, or race from those enrolled in the study.
Among the 119 participants, 60 were randomized to the control arm, and 59 were randomized to the intervention arm. Participants in the 2 arms did not differ significantly in age, sex, or race (P > 0.05), although all 4 women in the study were randomized to the intervention arm. Eighty-seven percent of the study population identified as white with the remainder including black, Asian, Pacific Islander, Native American, or declining to answer. The mean age was 75.8 years in the control arm vs. 75.2 years in the intervention arm.
Primary diagnoses in the study group ranged widely from relatively minor skin infections to acute pancreatitis. The control arm and the intervention arm did not differ significantly in the incidence of heart failure, pulmonary disease, renal dialysis, cirrhosis, stroke, or active cancer (P > 0.05). Patients were considered as having a stroke if they had suffered a stroke during their hospital admission or if they had long-term sequelae of prior stroke. Patients were considered as having active cancer if they were currently undergoing treatment or had metastases. Participants were considered as having multiple morbidities if they possessed 2 or more of the listed conditions. Between the control arm and the intervention arm, there was no significant difference in the number of participants with multiple morbidities (27% in the control group and 24% in the video group).
Code Status Preference
There was a significant difference in the code status preferences of the intervention arm and the control arm (P < 0.00001; Figure 2). In the control arm, 71% of participants chose full code, 12% chose DNR, and 17% chose DNR/DNI. In the intervention arm, only 37% chose full code, 7% chose DNR, and 56% chose DNR/DNI.
Secondary outcomes
Participants in the control and intervention arms were asked about their trust in their medical team (Question 1, Figure 3). There was no significant difference, but a trend towards less trust in the intervention group (P = 0.083) was seen with 93% of the control arm and 76% of the intervention arm agreeing with the statement “My doctors and healthcare team want what is best for me.”
Question 2, “If I choose to avoid resuscitation efforts, I will not receive care,” was designed to assess participants’ knowledge and perception about the care they would receive if they chose DNR/DNI as their code status. No significant difference was seen between the control and the interventions arms, with 28% of the control group agreeing with the statement, compared to 22% of the video group.
For question 3, participants were asked to respond to the statement “I would like to live as long as possible, even if I never leave the hospital.” No significant differences were seen between the control and the intervention arms, with 22% of both groups agreeing with the statement.
When we examined participant responses by the code status chosen, a significantly higher percentage of participants who chose full code agreed with the statement in question 3 (P = 0.0133). Of participants who chose full code, 27% agreed with the statement, compared to 18% of participants who chose DNR and 12% of participants who chose DNR/DNI. There was no significant difference (P > 0.05) between participant code status choice and either Question 1 or 2.
DISCUSSION
This study examined the effect of watching a video about CPR and intubation on the code status preferences of hospitalized patients. Participants who viewed a video about CPR and intubation were more likely to choose to forgo these treatments. Participants who chose CPR and intubation were more likely to agree that they would want to live as long as possible even if that time were spent in a medical setting.
To our knowledge, this is the first study to examine the role of a video decision support tool about code choices in the general hospital population, regardless of prognosis. Previous work has trialed the use of video support tools in hospitalized patients with a prognosis of less than 1 year,15 patients admitted to the ICU,18 and outpatients with cancer18 and those with dementia.16 Unlike previous studies, our study included a variety of illness severity.
Discussions about resuscitation are important for all adults admitted to the hospital because of the unpredictable nature of illness and the importance of providing high-quality care at the end of life. A recent study indicates that in-hospital cardiopulmonary arrest occurs in almost 1 per 1000 hospital days.19 These discussions are particularly salient for patients 65 years and older because of the higher incidence of death in this group. Inpatient admission is often a result of a change in health status, making it an important time for patients to reassess their resuscitation preferences based on their physical state and known comorbidities.
Video tools supplement the traditional code status discussion in several key ways. They provide a visual simulation of the procedures that occur during a typical resuscitation. These tools can help patients understand what CPR and intubation entail and transmit information that might be missed in verbal discussions. Visual media is now a common way for patients to obtain medical information20-22 and may be particularly helpful to patients who have low health literacy.23Video tools also help ensure that patients receive all the facts about resuscitation irrespective of how busy their provider may be or how comfortable the provider is with the topic. Lastly, video tools can reinforce information that is shared in the initial code status discussion. Given the significant differences in code status preference between our control and video arms, it is clear that the video tool has a significant impact on patient choices.
While we feel that our study clearly indicates the utility of video tools in code status discussion in hospitalized patients, there are some limitations. The current study enrolled participants who were predominantly white and male. All participants were recruited from the Minneapolis Veterans Affairs Health Care System, Minnesota. The relatively homogenous study population may impact the study’s generalizability. Another potential limitation of our study was the large number of eligible participants who declined to participate (41%), with many citing that they did not want to sign the consent paperwork. Additionally, the study coordinator was not blinded to the randomization of the participants, which could result in ascertainment bias. Also of concern was a trend, albeit nonsignificant, towards less trust in the healthcare team in the video group. Because the study was not designed to assess trust in the healthcare team both before and after the intervention, it is unclear if this difference was a result of the video.
Another area of potential concern is that visual images can be edited to sway viewers’ opinions based on the way content is presented. In our video, we included input from palliative care and internal medicine specialists. Cardiopulmonary resuscitation and intubation were performed on a CPR mannequin. The risks and benefits of CPR and intubation were discussed, as were the implications of choosing DNR or DNR/DNI code statuses.
The questionnaire that we used to assess participants’ knowledge and beliefs about resuscitation showed no differences between the control and the intervention arms of the study. We were surprised that a significant number of participants in the intervention group agreed with the statement, “If I choose to avoid resuscitation efforts, I will not receive care.” Our video specifically addressed the common belief that choosing DNR/DNI or DNR code statuses means that a patient will not continue to receive medical care. It is possible that participants were confused by the way the question was worded or that they understood the question to apply only to care received after a cardiopulmonary arrest had occurred.
This study and several others14-16 show that the use of video tools impacts participants’ code status preferences. There is clinical and humanistic importance in helping patients make informed decisions regarding whether or not they would want CPR and/or intubation if their heart were to stop or if they were to stop breathing. The data suggest that video tools are an efficient way to improve patient care and should be made widely available.
Disclosures: The authors report no conflicts of interest.
Discussions about cardiopulmonary resuscitation (CPR) can be difficult due to their association with end of life. The Patient Self Determination Act (H.R.4449 — 101st Congress [1989-1990]) and institutional standards mandate collaboration between care providers and patients regarding goals of care in emergency situations such as cardiopulmonary arrest. The default option is to provide CPR, which may involve chest compressions, intubation, and/or defibrillation. Yet numerous studies show that a significant number of patients have no code preference documented in their medical chart, and even fewer report a conversation with their care provider about their wishes regarding CPR.1-3 CPR is an invasive and potentially painful procedure with a higher chance of failure than success4, and yet many patients report that their provider did not discuss with them the risks and benefits of resuscitation.5,6 Further highlighting the importance of individual discussions about CPR preferences is the reality that factors such as age and disease burden further skew the likelihood of survival after cardiopulmonary arrest.7
Complicating the lack of appropriate provider and patient discussion of the risks and benefits of resuscitation are significant misunderstandings about CPR in the lay population. Patients routinely overestimate the likelihood of survival following CPR.8,9 This may be partially due to the portrayal of CPR in the lay media as highly efficacious.10 Other factors known to prevent effective provider-and-patient discussions about CPR preferences are providers’ discomfort with the subject11 and perceived time constraints.12
Informational videos have been developed to assist patients with decision making about CPR and have been shown to impact patients’ choices in the setting of life-limiting diseases such as advanced cancer,13-14 serious illness with a prognosis of less than 1 year,15 and dementia.16 While discussion of code status is vitally important in end-of-life planning for seriously ill individuals, delayed discussion of CPR preferences is associated with a significant increase in the number of invasive procedures performed at the end of life, increased length of stay in the hospital, and increased medical cost.17 Despite clear evidence that earlier discussion of resuscitation options are valuable, no studies have examined the impact of a video about code status options in the general patient population.
Here we present our findings of a randomized trial in patients hospitalized on the general medicine wards who were 65 years of age or older, regardless of illness severity or diagnosis. The video tool was a supplement for, rather than a replacement of, standard provider and patient communication about code preferences, and we compared patients who watched the video against controls who had standard discussions with their providers. Our video detailed the process of chest compressions and intubation during CPR and explained the differences between the code statuses: full code, do not resuscitate (DNR), and do not resuscitate/do not intubate (DNR/DNI). We found a significant difference between the 2 groups, with significantly more individuals in the video group choosing DNR/DNI. These findings suggest that video support tools may be a useful supplement to traditional provider discussions about code preferences in the general patient population.
METHODS
We enrolled patients from the general medicine wards at the Minneapolis VA Hospital from September 28, 2015 to October 23, 2015. Eligibility criteria included age 65 years or older, ability to provide informed consent, and ability to communicate in English. Study recruitment and data collection were performed by a study coordinator who was a house staff physician and had no role in the care of the participants. The medical charts of all general medicine patients were reviewed to determine if they met the age criteria. The physician of record for potential participants was contacted to assess if the patient was able to provide informed consent and communicate in English. Eligible patients were approached and informed consent was obtained from those who chose to participate in the study. After obtaining informed consent, patients were randomized using a random number generator to the intervention or usual-care arm of the study.
Those who were assigned to the intervention arm watched a 6-minute long video explaining the code-preference choices of full code, DNR, or DNR/DNI. Full code was described as possibly including CPR, intubation, and/or defibrillation depending on the clinical situation. Do not resuscitate was described as meaning no CPR or defibrillation but possible intubation in the case of respiratory failure. Do not resuscitate/do not intubate was explained as meaning no CPR, no defibrillation, and no intubation but rather permitting “natural death” to occur. The video showed a mock code with chest compressions, defibrillation, and intubation on a mannequin as well as palliative care specialists who discussed potential complications and survival rates of inhospital resuscitation.
The video was created at the University of Minnesota with the departments of palliative care and internal medicine (www.mmcgmeservices.org/codestat.html). After viewing the video, participants in the intervention arm filled out a questionnaire designed to assess their knowledge and beliefs about CPR and trust in their medical care providers. They were asked to circle their code preference. The participants’ medical teams were made aware of the code preferences and were counseled to discuss code preferences further if it was different from their previously documented code preference.
Participants in the control arm were assigned to usual care. At the institution where this study occurred, a discussion about code preferences between the patient and their medical team is considered the standard of care. After informed consent was obtained, participants filled out the same questionnaire as the participants in the intervention arm. They were asked to circle their code status preference. If they chose to ask questions about resuscitation, these were answered, but the study coordinator did not volunteer information about resuscitation or intervene in the medical care of the participants in any way.
All participants’ demographic characteristics and outcomes were described using proportions for categorical variables and means ± standard deviation for continuous variables. The primary outcome was participants’ stated code preference (full code, DNR, or DNR/DNI). Secondary outcomes included comparison of trust in medical providers, resuscitation beliefs, and desire for life-prolonging interventions as obtained from the questionnaire.
We analyzed code preferences between the intervention and control groups using Fisher exact test. We used analysis of variance (ANOVA) to compare questionnaire responses between the 2 groups. All reported P values are 2-sided with P < 0.05 considered significant. The project originally targeted a sample size of 194 participants for 80% power to detect a 20% difference in the code preference choices between intervention and control groups. Given the short time frame available to enroll participants, the target sample size was not reached. Propitiously, the effect size was greater than originally expected.
RESULTS
Study Participants
A total of 273 potentially eligible patients were approached to participate and 119 (44%) enrolled. (Figure 1). Of the 154 patients that were deemed eligible after initial screening, 42 patients were unable to give consent due to the severity of their illness or because of their mental status. Another 112 patients declined participation in the study, citing reasons such as disinterest in the consent paperwork, desire to spend time with visitors, and unease with the subject matter. Patients who declined participation did not differ significantly by age, sex, or race from those enrolled in the study.
Among the 119 participants, 60 were randomized to the control arm, and 59 were randomized to the intervention arm. Participants in the 2 arms did not differ significantly in age, sex, or race (P > 0.05), although all 4 women in the study were randomized to the intervention arm. Eighty-seven percent of the study population identified as white with the remainder including black, Asian, Pacific Islander, Native American, or declining to answer. The mean age was 75.8 years in the control arm vs. 75.2 years in the intervention arm.
Primary diagnoses in the study group ranged widely from relatively minor skin infections to acute pancreatitis. The control arm and the intervention arm did not differ significantly in the incidence of heart failure, pulmonary disease, renal dialysis, cirrhosis, stroke, or active cancer (P > 0.05). Patients were considered as having a stroke if they had suffered a stroke during their hospital admission or if they had long-term sequelae of prior stroke. Patients were considered as having active cancer if they were currently undergoing treatment or had metastases. Participants were considered as having multiple morbidities if they possessed 2 or more of the listed conditions. Between the control arm and the intervention arm, there was no significant difference in the number of participants with multiple morbidities (27% in the control group and 24% in the video group).
Code Status Preference
There was a significant difference in the code status preferences of the intervention arm and the control arm (P < 0.00001; Figure 2). In the control arm, 71% of participants chose full code, 12% chose DNR, and 17% chose DNR/DNI. In the intervention arm, only 37% chose full code, 7% chose DNR, and 56% chose DNR/DNI.
Secondary outcomes
Participants in the control and intervention arms were asked about their trust in their medical team (Question 1, Figure 3). There was no significant difference, but a trend towards less trust in the intervention group (P = 0.083) was seen with 93% of the control arm and 76% of the intervention arm agreeing with the statement “My doctors and healthcare team want what is best for me.”
Question 2, “If I choose to avoid resuscitation efforts, I will not receive care,” was designed to assess participants’ knowledge and perception about the care they would receive if they chose DNR/DNI as their code status. No significant difference was seen between the control and the interventions arms, with 28% of the control group agreeing with the statement, compared to 22% of the video group.
For question 3, participants were asked to respond to the statement “I would like to live as long as possible, even if I never leave the hospital.” No significant differences were seen between the control and the intervention arms, with 22% of both groups agreeing with the statement.
When we examined participant responses by the code status chosen, a significantly higher percentage of participants who chose full code agreed with the statement in question 3 (P = 0.0133). Of participants who chose full code, 27% agreed with the statement, compared to 18% of participants who chose DNR and 12% of participants who chose DNR/DNI. There was no significant difference (P > 0.05) between participant code status choice and either Question 1 or 2.
DISCUSSION
This study examined the effect of watching a video about CPR and intubation on the code status preferences of hospitalized patients. Participants who viewed a video about CPR and intubation were more likely to choose to forgo these treatments. Participants who chose CPR and intubation were more likely to agree that they would want to live as long as possible even if that time were spent in a medical setting.
To our knowledge, this is the first study to examine the role of a video decision support tool about code choices in the general hospital population, regardless of prognosis. Previous work has trialed the use of video support tools in hospitalized patients with a prognosis of less than 1 year,15 patients admitted to the ICU,18 outpatients with cancer,14 and those with dementia.16 Unlike previous studies, ours included patients across a wide range of illness severity.
Discussions about resuscitation are important for all adults admitted to the hospital because of the unpredictable nature of illness and the importance of providing high-quality care at the end of life. A recent study indicates that in-hospital cardiopulmonary arrest occurs at a rate of almost 1 per 1000 hospital days.19 These discussions are particularly salient for patients 65 years and older because of the higher incidence of death in this group. Inpatient admission is often the result of a change in health status, making it an important time for patients to reassess their resuscitation preferences in light of their physical state and known comorbidities.
Video tools supplement the traditional code status discussion in several key ways. They provide a visual simulation of the procedures that occur during a typical resuscitation. These tools can help patients understand what CPR and intubation entail and can transmit information that might be missed in verbal discussions. Visual media is now a common way for patients to obtain medical information20-22 and may be particularly helpful to patients who have low health literacy.23 Video tools also help ensure that patients receive all the facts about resuscitation, irrespective of how busy their provider may be or how comfortable the provider is with the topic. Lastly, video tools can reinforce information that is shared in the initial code status discussion. The significant difference in code status preference between our control and video arms indicates that the video tool had a substantial impact on patient choices.
While we feel that our study clearly indicates the utility of video tools in code status discussions in hospitalized patients, there are some limitations. The study enrolled participants who were predominantly white and male, all recruited from the Minneapolis Veterans Affairs Health Care System in Minnesota; this relatively homogeneous study population may limit generalizability. Another potential limitation was the large number of eligible participants who declined to participate (41%), many citing that they did not want to sign the consent paperwork. Additionally, the study coordinator was not blinded to participants’ randomization, which could result in ascertainment bias. Also of concern was a trend, albeit nonsignificant, toward less trust in the healthcare team in the video group. Because the study was not designed to assess trust in the healthcare team both before and after the intervention, it is unclear whether this difference was a result of the video.
Another area of potential concern is that visual images can be edited to sway viewers’ opinions based on the way content is presented. In our video, we included input from palliative care and internal medicine specialists. Cardiopulmonary resuscitation and intubation were performed on a CPR mannequin. The risks and benefits of CPR and intubation were discussed, as were the implications of choosing DNR or DNR/DNI code statuses.
The questionnaire that we used to assess participants’ knowledge and beliefs about resuscitation showed no differences between the control and intervention arms. We were surprised that a substantial proportion of participants in the intervention group agreed with the statement, “If I choose to avoid resuscitation efforts, I will not receive care.” Our video specifically addressed the common belief that choosing DNR/DNI or DNR code status means that a patient will no longer receive medical care. It is possible that participants were confused by the wording of the question, or that they understood it to apply only to care received after a cardiopulmonary arrest had occurred.
This study and several others14-16 show that the use of video tools impacts participants’ code status preferences. There is clinical and humanistic importance in helping patients make informed decisions regarding whether or not they would want CPR and/or intubation if their heart were to stop or if they were to stop breathing. The data suggest that video tools are an efficient way to improve patient care and should be made widely available.
Disclosures: The authors report no conflicts of interest.
1. Dunn RH, Ahn J, Bernstein J. End-of-life care planning and fragility fractures of the hip: are we missing a valuable opportunity? Clin Orthop Relat Res. 2016;474(7):1736-1739. PubMed
2. Warren MB, Lapid MI, McKean AJ, Cha SS, Stevens MA, Brekke FM, et al. Code status discussions in psychiatric and medical inpatients. J Clin Psychiatry. 2015;76(1):49-53. PubMed
3. Bhatia HL, Patel NR, Choma NN, Grande J, Giuse DA, Lehmann CU. Code status and resuscitation options in the electronic health record. Resuscitation. 2015;87:14-20. PubMed
4. Singh S, Namrata, Grewal A, Gautam PL, Luthra N, Kaur A. Evaluation of cardiopulmonary resuscitation (CPR) for patient outcomes and their predictors. J Clin Diagn Res. 2016;10(1):UC01-UC04. PubMed
5. Anderson WG, Chase R, Pantilat SZ, Tulsky JA, Auerbach AD. Code status discussions between attending hospitalist physicians and medical patients at hospital admission. J Gen Intern Med. 2011;26(4):359-366. PubMed
6. Einstein DJ, Einstein KL, Mathew P. Dying for advice: code status discussions between resident physicians and patients with advanced cancer--a national survey. J Palliat Med. 2015;18(6):535-541. PubMed
7. Piscator E, Hedberg P, Göransson K, Djärv T. Survival after in-hospital cardiac arrest is highly associated with the Age-combined Charlson Co-morbidity Index in a cohort study from a two-site Swedish University hospital. Resuscitation. 2016;99:79-83. PubMed
8. Zijlstra TJ, Leenman-Dekker SJ, Oldenhuis HK, Bosveld HE, Berendsen AJ. Knowledge and preferences regarding cardiopulmonary resuscitation: A survey among older patients. Patient Educ Couns. 2016;99(1):160-163. PubMed
9. Wilson ME, Akhoundi A, Krupa AK, Hinds RF, Litell JM, Gajic O, Kashani K. Development, validation, and results of a survey to measure understanding of cardiopulmonary resuscitation choices among ICU patients and their surrogate decision makers. BMC Anesthesiol. 2014;14:15. PubMed
10. Harris D, Willoughby H. Resuscitation on television: realistic or ridiculous? A quantitative observational analysis of the portrayal of cardiopulmonary resuscitation in television medical drama. Resuscitation. 2009;80(11):1275-1279. PubMed
11. Mills LM, Rhoads C, Curtis JR. Medical student training on code status discussions: how far have we come? J Palliat Med. 2016;19(3):323-325. PubMed
12. Binder AF, Huang GC, Buss MK. Uninformed consent: do medicine residents lack the proper framework for code status discussions? J Hosp Med. 2016;11(2):111-116. PubMed
13. Volandes AE, Levin TT, Slovin S, Carvajal RD, O’Reilly EM, et al. Augmenting advance care planning in poor prognosis cancer with a video decision aid: a preintervention-postintervention study. Cancer. 2012;118(17):4331-4338. PubMed
14. El-Jawahri A, Podgurski LM, Eichler AF, Plotkin SR, Temel JS, Mitchell SL, et al. Use of video to facilitate end-of-life discussions with patients with cancer: a randomized controlled trial. J Clin Oncol. 2010;28(2):305-310. PubMed
15. El-Jawahri A, Mitchell SL, Paasche-Orlow MK, Temel JS, Jackson VA, Rutledge RR, et al. A randomized controlled trial of a CPR and intubation video decision support tool for hospitalized patients. J Gen Intern Med. 2015;30(8):1071-1080. PubMed
16. Volandes AE, Paasche-Orlow MK, Barry MJ, Gillick MR, Minaker KL, Chang Y, et al. Video decision support tool for advance care planning in dementia: randomised controlled trial. BMJ. 2009;338:b2159. PubMed
17. Celso BG, Meenrajan S. The triad that matters: palliative medicine, code status, and health care costs. Am J Hosp Palliat Care. 2010;27(6):398-401. PubMed
18. Wilson ME, Krupa A, Hinds RF, Litell JM, Swetz KM, Akhoundi A, et al. A video to improve patient and surrogate understanding of cardiopulmonary resuscitation choices in the ICU: a randomized controlled trial. Crit Care Med. 2015;43(3):621-629. PubMed
19. Overdyk FJ, Dowling O, Marino J, Qiu J, Chien HL, Erslon M, et al. Association of opioids and sedatives with increased risk of in-hospital cardiopulmonary arrest from an administrative database. PLoS One. 2016;11(2):e0150214. PubMed
20. Stacey D, Samant R, Bennett C. Decision making in oncology: a review of patient decision aids to support patient participation. CA Cancer J Clin. 2008;58(5):293-304. PubMed
21. Lin GA, Aaronson DS, Knight SJ, Carroll PR, Dudley RA. Patient decision aids for prostate cancer treatment: a systematic review of the literature. CA Cancer J Clin. 2009;59(6):379-390. PubMed
22. O’Brien MA, Whelan TJ, Villasis-Keever M, Gafni A, Charles C, Roberts R, et al. Are cancer-related decision aids effective? A systematic review and meta-analysis. J Clin Oncol. 2009;27(6):974-985. PubMed
23. Sudore RL, Landefeld CS, Pérez-Stable EJ, Bibbins-Domingo K, Williams BA, Schillinger D. Unraveling the relationship between literacy, language proficiency, and patient-physician communication. Patient Educ Couns. 2009;75(3):398-402. PubMed
© 2017 Society of Hospital Medicine
Referral Patterns for Chronic Groin Pain and Athletic Pubalgia/Sports Hernia: Magnetic Resonance Imaging Findings, Treatment, and Outcomes
The past 3 decades have seen an evolution in the understanding, diagnosis, and treatment of groin pain, both chronic and acute, in athletes and non-athletes alike. Groin pain and groin injury are common. Most cases are transient, with patients returning to their activities within weeks or months. However, there has been increasing awareness of a distinct population of patients who do not get better, or who improve and then plateau before reaching their preinjury level of performance.1-3 Several authors have brought more attention to the injury, introducing vocabulary, theories, diagnostic testing, and diagnoses that now constitute a knowledge base.1,3-5
As noted in almost every article on groin pain and its diagnosis, the field has lacked a cohesive vocabulary and consistent protocols and procedures, leaving general understanding and agreement in this area inconsistent.1,6-8 In this article, members of a tertiary-care group specializing in chronic groin pain, athletic pubalgia (sports hernia), and inguinal herniorrhaphy outline their clinical examination, diagnostic algorithm, imaging protocol, treatment strategy, and outcomes for a population of patients referred by physicians and allied health professionals for a suspected diagnosis of athletic pubalgia.
Background
The pubic symphysis acts as a stabilizing central anchor with elaborate involvement of the anterior structures, including the rectus abdominis, adductor longus, and inguinal ligaments.3,7,9 Literature from Europe, Australia, and the United States has described groin pain, mostly in professional athletes, involving these pubic structures and attachments. Several publications have addressed chronic groin pain, each with its own diagnostic algorithm, imaging protocol, and treatment strategy.3,6,9-18
Terminology specific to groin pain in athletes is not new and has a varied history dating to the early 20th century. Terms such as sportsman hernia19 and, subsequently, sports hernia20 have recently been embraced by the lay population. In 1999, Gibbon21 described shearing of the common adductor–rectus abdominis anatomical and functional unit and referenced a 1902 anatomical text that describes vertical ligamentous fibers contiguous with the rectus sheath and adductor muscles, both attaching to the pubis. Injury to this region is the basis of pubalgia, a term originally used in 1984 by Brunet to describe a pain syndrome at the pubis.22
Many authors have proposed replacing sports hernia with athletic pubalgia.1,3,6,7,10,14,18,23 These terms refer to a group of musculoskeletal processes that occur in and around the pubic symphysis and that share similar mechanisms of injury and common clinical manifestations. The condition was originally described in high-performance athletes, and at one point the term sports hernia was reserved for this patient population.5 According to many authors, presence of an inguinal hernia excludes the diagnosis.1,2,5 Magnetic resonance imaging (MRI) has helped to advance and define our understanding of the injury.10 As the history of the literature suggests, earlier concepts of chronic pain focused either on the medial aspect of the inguinal canal and its structures or on the pubic attachments. Many specialists in the area have concluded that the chronic groin pain injury can and often does embody both elements.3,9 Correlation with MRI findings, injury seen during surgical procedures, and cadaveric studies have directed our understanding to a structure, the pre-pubic aponeurotic complex (P-PAC), or rectus aponeurotic plate.12,24,25 Anatomically, the P-PAC, which has several fascial components, attaches posteriorly to the pubic bone and, to a degree, the pubic symphyseal cartilaginous disc. Major contributions to the P-PAC are fibers from the rectus abdominis tendon, the medial aspect of the transversalis and internal oblique muscles (the conjoint tendon, according to some), the inguinal ligament, and the adductor longus tendon.26 When communicating with referring physicians, we use the term athletic pubalgia to indicate a specific injury. The athletic pubalgia injury can be defined as serial microtearing,1 or complete tearing, of the posterior attachment of the P-PAC off the anterior pubis.3,10 Complete tearing or displacement can occur unilaterally or across the midline to the other side. As athletic pubalgia is a specific anatomical injury rather than a broad category of findings, an additional pathologic diagnosis, such as inguinal hernia, does not exclude the diagnosis of athletic pubalgia. Unfortunately, the terms sports hernia and sportsman hernia, commonly used in the media and in professional communities, have largely confused the broader understanding of nuances and of the differences between the specific injuries and MRI findings.18
Our Experience
In our practice, we see patients with groin pain referred by internists, physiatrists, physical therapists, trainers, general surgeons, urologists, gynecologists, and orthopedic surgeons. In many cases, patients have been through several consultations and work-ups because their pain syndrome does not fall under a specific category. Patients without inguinal hernia, hip injury, or urologic or gynecologic issues typically are referred to a physiatrist or a physical therapist. Often, there are marginal improvements with physical therapy, but in some cases the injury never completely resolves, and the patient continues to have pain with activity or on return to sports.
Most of our patients are nonprofessional athletes, men and women who range widely in age and participate casually or regularly in sporting events. Most lack the rigorous training, conditioning, and close supervision that professional athletes receive. Many other patients are nonprofessional but elite athletes who train 7 days a week for marathons, ultramarathons, triathlons, obstacle course races (“mudders”), and similar events.
Work-Up
A single algorithm is used for all patients initially referred to the surgeon’s office for pelvic or groin pain. The initial interview directs attention to injury onset and mechanism, duration of rest or physical therapy after injury, pain quality and pain levels, and antagonistic movements and positions. Examination starts with assessment for inguinal, femoral, and umbilical hernias. Resisted sit-up, leg-raise, adduction, and hip assessment tests are performed. The P-PAC is examined with a maneuver similar to the one used for inguinal hernia, as it allows for better assessment of the transversalis fascia (over the direct space) to determine if the inguinal canal floor is attenuated and bulges forward with the Valsalva maneuver. Then, the lateral aspect of the rectus muscle is assessed for pain, usually with the head raised to contract the muscle, to determine tenderness along the lateral border. The rectus edge is traced down to the pubis at its attachment, the superolateral border of the P-PAC. Examination proceeds medially, over the rectus attachment, toward the pubic symphysis, continuing the assessment for tenderness. Laterally, the conjoint tendon and inguinal ligament medial attachments are assessed at the level of the pubic tubercle, which represents the lateral border of the P-PAC. Finally, the examination continues to the inferior border with assessment of the adductor longus attachment, which is best performed with the leg in an adducted position. In the acute or semiacute setting (pain within 1 year of injury onset), tenderness is often elicited. With long-standing injuries, tenderness is often not elicited on examination, but the patient experiences pain along this axis during or after activity.
Patients with positive history and physical examination findings proceed through an MRI protocol designed to detect pathology of the pubic symphysis, hips, and inguinal canals (Figures 1A-1D).
Treatment
Patients who report sustaining an acute groin injury within the previous 6 months are treated nonoperatively. A combination of rest, nonsteroidal anti-inflammatory drugs, and physical therapy is generally recommended.2,10 When nonoperative management fails, patients are evaluated for surgery. No single operation is recommended for all patients.1,6,14,27,28 (Larson26 recently reviewed results from several trials involving a variety of surgical repairs and found return-to-sports rates ranging from 80% to 100%.) Findings from the physical examination and from the properly protocolled MRI examination are used in planning surgery to correct any pathology that could be contributing to symptoms or to destabilization of the structures attaching to the pubis. Disruption of the P-PAC from the pubis would be repaired, for example. Additional injuries, such as partial or complete detachment of the conjoint tendon or inguinal ligament, may be repaired as well. If the transversalis fascia is attenuated and bulging forward, the inguinal floor is closed. Adductor longus tendon pathology is addressed, most commonly with partial tendinolysis. Often, concomitant inguinal hernias are found; these may be repaired in open fashion while other maneuvers are being performed, or laparoscopically.
Materials and Methods
After receiving study approval from our Institutional Review Board, we retrospectively searched for all MRIs performed by our radiology department between March 1, 2011 and March 31, 2013 on patients referred for an indication of groin pain, sports hernia, or athletic pubalgia. Patients were excluded if they were younger than 18 years at any time during their care. Some patients previously or subsequently underwent computed tomography or ultrasonography. MRIs were reviewed, and positive findings were compiled in a database. Charts were reviewed to identify which patients in the dataset underwent surgery, after MRI, to address their presenting chief complaint. Surgery date and procedure(s) performed were recorded. Patients were interviewed by telephone as part of the in-office postoperative follow-up.
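As a minimal sketch of the cohort-selection step, assuming a hypothetical flat export of the radiology database (the file, column names, and age rule below are illustrative, not the institution's actual schema):

```python
# Hypothetical cohort selection; file and column names are assumptions.
import pandas as pd

mris = pd.read_csv("mri_exams.csv", parse_dates=["exam_date", "birth_date"])

indications = ["groin pain", "sports hernia", "athletic pubalgia"]
in_window = mris["exam_date"].between("2011-03-01", "2013-03-31")
matching = mris["indication"].str.lower().isin(indications)

# Approximate the exclusion of patients younger than 18 years at any time
# during their care as age < 18 years at the exam date.
age_years = (mris["exam_date"] - mris["birth_date"]).dt.days / 365.25
adults = age_years >= 18

cohort = mris[in_window & matching & adults]
print(f"{len(cohort)} MRIs in the study window")
```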
Results
One hundred nineteen MRIs were performed on 117 patients (97 men, 83%). Mean age was 39.8 years. Seventy-nine patients (68%) had an MRI finding of athletic pubalgia, 67 (57%) had an acetabular labral tear in one or both hip joints, and 41 (35%) had a true inguinal hernia. Concomitant findings were common: 47 cases of athletic pubalgia and labral tear(s), 28 cases of athletic pubalgia and inguinal hernia, and 15 cases of all 3 (athletic pubalgia, labral tear, inguinal hernia).
Use of breath-hold axial single-shot fast spin-echo sequences with and without the Valsalva maneuver increased sensitivity in detecting pathologies—inguinal hernia and Gilmore groin in particular. On 24 of the 119 MRIs, the Valsalva maneuver either revealed the finding or made it significantly more apparent.
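To put the incremental yield of the Valsalva sequence in perspective, 24 of 119 examinations is roughly 20%. As an illustration of our own (not an analysis reported in the study), a Wilson 95% confidence interval for that proportion can be computed as follows:

```python
# Illustrative only: Wilson 95% CI for the proportion of MRIs (24/119)
# on which the Valsalva maneuver revealed or clarified a finding.
from math import sqrt

def wilson_ci(successes, n, z=1.96):
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

lo, hi = wilson_ci(24, 119)
print(f"24/119 = {24/119:.1%}, 95% CI {lo:.1%} to {hi:.1%}")  # ~13.9% to 28.3%
```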
Of all patients referred for MRI for chronic groin pain, 48 (41%) subsequently underwent surgery. In 29 surgeries, the rectus abdominis, adductor longus, and/or pre-pubic aponeurotic plate were repaired; in 13 cases, herniorrhaphy was performed as well; in 2 cases, masses involving the spermatic cord were removed.
The most common surgery (30 cases) was herniorrhaphy, performed as a single procedure, as one of multiple procedures, or in combination with procedures not related to a true hernia. Eighteen patients underwent surgery for hernia repair only.
Of the 79 patients with MRI-positive athletic pubalgia, 39 subsequently underwent surgery, and 31 (79%) of these were followed up by telephone. Mean duration of rest after surgery was 6.2 weeks. Twelve patients (39%) had physical therapy after surgery, some starting as early as 4 weeks postoperatively, and some continuing therapy at the time of follow-up. Of the 31 patients who were followed up after surgery, 23 (74%) resumed previous activity levels, taking a mean of 17.9 weeks to do so. When asked if outcomes satisfied their expectations, 28 patients (90%) said yes and 3 said no.
Forty patients with MRI-positive athletic pubalgia were treated nonoperatively, and 28 (70%) of these patients were followed up. In this group, mean duration of rest was 6.9 weeks. Thirteen patients (46%) participated in physical therapy, for a mean duration of 10.8 weeks. Of the patients followed up, 19 (68%) returned to their previous activity level. Twenty-one patients (75%) were satisfied with their outcome.
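These follow-up outcomes are reported descriptively. As a hedged illustration only (the study did not report this comparison), a Fisher exact test on the return-to-activity counts suggests the operative and nonoperative groups were not detectably different at these sample sizes:

```python
# Illustrative only: comparison not performed in the study.
from scipy.stats import fisher_exact

#        returned  did not return
table = [[23, 8],   # operative patients followed up (n = 31)
         [19, 9]]   # nonoperative patients followed up (n = 28)

odds_ratio, p = fisher_exact(table)
print(f"OR = {odds_ratio:.2f}, P = {p:.2f}")  # P well above 0.05
```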
Discussion
Diagnosis and treatment of chronic groin pain have had a long, confusing, and frustrating history for both patients and the medical professionals who care for them.3,6,7,10 Historically, the problem has been, in part, a lack of diagnostic capabilities. Currently, however, a dedicated pubalgia MRI protocol allows the exact pathology to be demonstrated.3 As already noted, concomitant hip pathology or inguinal hernia is not unusual8; any structural abnormality in the area is a potential destabilizer of the structures attached to the pubis.18 Solving only one of these issues may offer only partial resolution of symptoms and thereby reduce the rate of successful treatment of groin pain.
Diagnostic algorithms are being developed. In addition, nonoperative treatments are being tried for some of these issues. Physicians are giving diagnostic and therapeutic steroid injections in the pubic cleft, along the rectus abdominis/adductor longus complex, or posterior to the P-PAC. Platelet-rich plasma injection therapy has had limited success.29 This article provides a snapshot of what a tertiary-care group of physicians specializing in chronic groin pain sees in an unfiltered setting. We think this is instructive for several reasons.
First, many patients in our population have visited a multitude of specialists without receiving a diagnosis or an appropriate referral. Simply put, many specialists do not know the next step in treating groin pain and thus do not make the appropriate referral. Until recently, the literature has not been helpful: it has poorly described the constellation of injuries that constitute chronic groin pain and, more significantly, has presented groin injuries as ambiguous injuries lacking effective treatment. Over the past decade, however, abundant literature correlating these injuries with specific MRI findings has made the case otherwise.
Second, a specific MRI pubalgia protocol is needed. Inability to make a correct diagnosis, because of improper MRI, continues to add to the confusion surrounding the injury and undoubtedly prolongs the general medical community’s thinking that diagnosis and treatment of chronic groin pain are elusive. Our data support this point in many ways. Although all patients in this study were seen by a medical professional before coming to our office, none had received a diagnosis of occult hernia or attenuated transversalis fascia; nevertheless, we identified inguinal hernia, Gilmore groin, or both in 44% of these patients. These findings are not surprising, as MRI was the crucial link in diagnosis. In addition, the point made by other groin pain specialists—that a hernia precludes a pubalgia diagnosis1,2,5—is not supported by our data. Inguinal hernia can and does exist in conjunction with pubalgia. More than half the patients in our study had a combined diagnosis. We contend that, much as hip labral pathology occurs concomitantly with pubalgia,23 inguinal hernia may be a predisposing factor as well. A defect in the direct or indirect space can destabilize the area and place additional strain on the pubic attachments.
In our experience, the dynamic Valsalva sequence improves detection of true hernias and anterior abdominal wall deficiencies and should be included in each protocol for the evaluation of acute or chronic groin pain.
Shear forces and injury at the pubis can occur outside professional athletics. Our patient population consists of nonprofessional athletes, from teenagers to retirees, and all can develop athletic pubalgia. Ninety percent of surveyed patients who received a diagnosis and were treated surgically were satisfied with their outcomes.
Am J Orthop. 2017;46(4):E251-E256. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
1. Meyers WC, Lanfranco A, Castellanos A. Surgical management of chronic lower abdominal and groin pain in high-performance athletes. Curr Sports Med Rep. 2002;1(5):301-305.
2. Ahumada LA, Ashruf S, Espinosa-de-los-Monteros A, et al. Athletic pubalgia: definition and surgical treatment. Ann Plast Surg. 2005;55(4):393-396.
3. Omar IM, Zoga AC, Kavanagh EC, et al. Athletic pubalgia and “sports hernia”: optimal MR imaging technique and findings. Radiographics. 2008;28(5):1415-1438.
4. Gilmore OJA. Gilmore’s groin: ten years experience of groin disruption—a previously unsolved problem in sportsmen. Sports Med Soft Tissue Trauma. 1991;3:12-14.
5. Meyers WC, Foley DP, Garrett WE, Lohnes JH, Mandlebaum BR. Management of severe lower abdominal or inguinal pain in high-performance athletes. PAIN (Performing Athletes with Abdominal or Inguinal Neuromuscular Pain Study Group). Am J Sports Med. 2000;28(1):2-8.
6. Kavanagh EC, Koulouris G, Ford S, McMahon P, Johnson C, Eustace SJ. MR imaging of groin pain in the athlete. Semin Musculoskelet Radiol. 2006;10(3):197-207.
7. Cunningham PM, Brennan D, O’Connell M, MacMahon P, O’Neill P, Eustace S. Patterns of bone and soft-tissue injury at the symphysis pubis in soccer players: observations at MRI. AJR Am J Roentgenol. 2007;188(3):W291-W296.
8. Zoga AC, Kavanagh EC, Omar IM, et al. Athletic pubalgia and the “sports hernia”: MR imaging findings. Radiology. 2008;247(3):797-807.
9. Koulouris G. Imaging review of groin pain in elite athletes: an anatomic approach to imaging findings. AJR Am J Roentgenol. 2008;191(4):962-972.
10. Albers SL, Spritzer CE, Garrett WE Jr, Meyers WC. MR findings in athletes with pubalgia. Skeletal Radiol. 2001;30(5):270-277.
11. Brennan D, O’Connell MJ, Ryan M, et al. Secondary cleft sign as a marker of injury in athletes with groin pain: MR image appearance and interpretation. Radiology. 2005;235(1):162-167.
12. Robinson P, Salehi F, Grainger A, et al. Cadaveric and MRI study of the musculotendinous contributions to the capsule of the symphysis pubis. AJR Am J Roentgenol. 2007;188(5):W440-W445.
13. Schilders E, Talbot JC, Robinson P, Dimitrakopoulou A, Gibbon WW, Bismil Q. Adductor-related groin pain in recreational athletes. J Bone Joint Surg Am. 2009;91(10):2455-2460.
14. Davies AG, Clarke AW, Gilmore J, Wotherspoon M, Connell DA. Review: imaging of groin pain in the athlete. Skeletal Radiol. 2010;39(7):629-644.
15. Mullens FE, Zoga AC, Morrison WB, Meyers WC. Review of MRI technique and imaging findings in athletic pubalgia and the “sports hernia.” Eur J Radiol. 2012;81(12):3780-3792.
16. Zoga AC, Meyers WC. Magnetic resonance imaging for pain after surgical treatment for athletic pubalgia and the “sports hernia.” Semin Musculoskelet Radiol. 2011;15(4):372-382.
17. Beer E. Periostitis of symphysis and descending rami of pubes following suprapubic operations. Int J Med Surg. 1924;37(5):224-225.
18. MacMahon PJ, Hogan BA, Shelly MJ, Eustace SJ, Kavanagh EC. Imaging of groin pain. Magn Reson Imaging Clin N Am. 2009;17(4):655-666.
19. Malycha P, Lovell G. Inguinal surgery in athletes with chronic groin pain: the ‘sportsman’s’ hernia. Aust N Z J Surg. 1992;62(2):123-125.
20. Hackney RG. The sports hernia: a cause of chronic groin pain. Br J Sports Med. 1993;27(1):58-62.
21. Gibbon WW. Groin pain in athletes. Lancet. 1999;353(9162):1444-1445.
22. Brunet B, Brunet-Geudj E, Genety J. La pubalgie: syndrome “fourre-tout” pour une plus grande rigueur diagnostique et thérapeutique [Pubalgia: a “catch-all” syndrome demanding greater diagnostic and therapeutic rigor]. Instantanés Médicaux. 1984;55:25-30.
23. Lischuk AW, Dorantes TM, Wong W, Haims AH. Imaging of sports-related hip and groin injuries. Sports Health. 2010;2(3):252-261.
24. Gibbon WW, Hession PR. Diseases of the pubis and pubic symphysis: MR imaging appearances. AJR Am J Roentgenol. 1997;169(3):849-853.
25. Gamble JG, Simmons SC, Freedman M. The symphysis pubis. Anatomic and pathologic considerations. Clin Orthop Relat Res. 1986;(203):261-272.
26. Larson CM. Sports hernia/athletic pubalgia: evaluation and management. Sports Health. 2014;6(2):139-144.
27. Maffulli N, Loppini M, Longo UG, Denaro V. Bilateral mini-invasive adductor tenotomy for the management of chronic unilateral adductor longus tendinopathy in athletes. Am J Sports Med. 2012;40(8):1880-1886.
28. Schilders E, Dimitrakopoulou A, Cooke M, Bismil Q, Cooke C. Effectiveness of a selective partial adductor release for chronic adductor-related groin pain in professional athletes. Am J Sports Med. 2013;41(3):603-607.
29. Scholten PM, Massimi S, Dahmen N, Diamond J, Wyss J. Successful treatment of athletic pubalgia in a lacrosse player with ultrasound-guided needle tenotomy and platelet-rich plasma injection: a case report. PM R. 2015;7(1):79-83.
The past 3 decades have seen an evolution in the understanding, diagnosis, and treatment of groin pain, both chronic and acute, in athletes and non-athletes alike. Groin pain and groin injury are common. Most cases are transient, with patients returning to their activities within weeks or months. There has also been increasing awareness of a definitive population of patients who do not get better, or who improve and plateau before reaching preinjury level of performance.1-3 Several authors have brought more attention to the injury, introducing vocabulary, theories, diagnostic testing, and diagnoses, which now constitute a knowledge base.1,3-5
As stated in almost every article on groin pain and diagnosis, lack of cohesive agreement and vocabulary, and consistent protocols and procedures, has abounded, making general understanding and agreement in this area inconsistent.1,6-8In this article, members of a tertiary-care group specializing in chronic groin pain, athletic pubalgia (sports hernia), and inguinal herniorrhaphy outline their clinical examination, diagnostic algorithm, imaging protocol, treatment strategy, and outcomes for a population of patients referred by physicians and allied health professionals for a suspected diagnosis of athletic pubalgia.
Background
The pubic symphysis acts as a stabilizing central anchor with elaborate involvement of the anterior structures, including the rectus abdominis, adductor longus, and inguinal ligaments.3,7,9 Literature from Europe, Australia, and the United States has described groin pain, mostly in professional athletes, involving these pubic structures and attachments. Several publications have been addressing chronic groin pain, and each has its own diagnostic algorithm, imaging protocol, and treatment strategy.3,6,9-18
Terminology specific to groin pain in athletes is not new, and has a varied history dating to the early 20th century. Terms such as sportsman hernia19 and subsequently sports hernia20, have recently been embraced by the lay population. In 1999, Gibbon21 described shearing of the common adductor–rectus abdominis anatomical and functional unit and referenced a 1902 anatomical text that describes vertical ligamentous fibers contiguous with rectus sheath and adductor muscles, both attaching to the pubis. Injury to this region is the basis of pubalgia, a term originally used in 1984 by Brunet to describe a pain syndrome at the pubis.22
Many authors have proposed replacing sports hernia with athletic pubalgia.1,3,6,7,10,14,18,23 These terms refer to a group of musculoskeletal processes that occur in and around the pubic symphysis and that share similar mechanisms of injury and common clinical manifestations. The condition was originally described in high-performance athletes, and at one point the term sports hernia was reserved for this patient population.5 According to many authors, presence of an inguinal hernia excludes the diagnosis.1,2,5Magnetic resonance imaging (MRI) has helped to advance and define our understanding of the injury.10 As the history of the literature suggests, earlier concepts of chronic pain focused either on the medial aspect of the inguinal canal and its structures or on the pubic attachments. Many specialists in the area have concluded that the chronic groin pain injury can and often does embody both elements.3,9 Correlation with MRI findings, injury seen during surgical procedures, and cadaveric studies have directed our understanding to a structure, the pre-pubic aponeurotic complex (P-PAC), or rectus aponeurotic plate.12,24,25 Anatomically, the P-PAC, which has several fascial components, attaches posteriorly to the pubic bone and, to a degree, the pubic symphyseal cartilaginous disc. Major contributions to the P-PAC are fibers from the rectus abdominis tendon, the medial aspect of the transversalis and internal oblique muscles (the conjoint tendon, according to some), the inguinal ligament, and the adductor longus tendon.26When communicating with referring physicians, we use the term athletic pubalgia to indicate a specific injury. The athletic pubalgia injury can be defined as serial microtearing,1 or complete tearing, of the posterior attachment of the P-PAC off the anterior pubis.3,10 Complete tearing or displacement can occur unilaterally or across the midline to the other side. As athletic pubalgia is a specific anatomical injury rather than a broad category of findings, an additional pathologic diagnosis, such as inguinal hernia, does not exclude the diagnosis of athletic pubalgia. Unfortunately, the terms sports hernia and sportsman hernia, commonly used in the media and in professional communities, have largely confused the broader understanding of nuances and of the differences between the specific injuries and MRI findings.18
Our Experience
In our practice, we see groin pain patients referred by internists, physiatrists, physical therapists, trainers, general surgeons, urologists, gynecologists, and orthopedic surgeons. In many cases, patients have been through several consultations and work-ups, as their pain syndrome does not fall under a specific category. Patients without inguinal hernia, hip injury, urologic, or gynecologic issues typically are referred to a physiatrist or a physical therapist. Often, there are marginal improvements with physical therapy, but in some cases the injury never completely resolves, and the patient continues to have pain with activity or return to sports.
Most of our patients are nonprofessional athletes, men and women who range widely in age and participate casually or regularly in sporting events. Most lack the rigorous training, conditioning, and close supervision that professional athletes receive. Many other patients are nonprofessional but elite athletes who train 7 days a week for marathons, ultramarathons, triathlons, obstacle course races (“mudders”), and similar events.
Work-Up
A single algorithm is used for all patients initially referred to the surgeon’s office for pelvic or groin pain. The initial interview directs attention to injury onset and mechanism, duration of rest or physical therapy after surgery, pain quality and pain levels, and antagonistic movements and positions. Examination starts with assessment for inguinal, femoral, and umbilical hernias. Resisted sit-up, leg-raise, adduction, and hip assessment tests are performed. The P-PAC is examined with a maneuver similar to the one used for inguinal hernia, as it allows for better assessment of the transversalis fascia (over the direct space) to determine if the inguinal canal floor is attenuated and bulges forward with the Valsalva maneuver. Then, the lateral aspect of the rectus muscle is assessed for pain, usually with the head raised to contract the muscle, to determine tenderness along the lateral border. The rectus edge is traced down to the pubis at its attachment, the superolateral border of the P-PAC. Examination proceeds medially, over the rectus attachment, toward the pubic symphysis, continuing the assessment for tenderness. Laterally, the conjoint tendon and inguinal ligament medial attachments are assessed at the level of the pubic tubercle, which represents the lateral border of the P-PAC. Finally, the examination continues to the inferior border with assessment of the adductor longus attachment, which is best performed with the leg in an adducted position. In the acute or semiacute setting (pain within 1 year of injury onset), tenderness is often elicited. With long-standing injuries, pain is often not elicited, but the patient experiences pain along this axis during activity or afterward.
Patients with positive history and physical examination findings proceed through an MRI protocol designed to detect pathology of the pubic symphysis, hips, and inguinal canals (Figures 1A-1D).
Treatment
Patients who report sustaining an acute groin injury within the previous 6 months are treated nonoperatively. A combination of rest, nonsteroidal anti-inflammatory drugs, and physical therapy is generally recommended.2,10 In cases of failed nonoperative management, patients are evaluated for surgery. No single operation is recommended for all patients.1,6,14,27,28 (Larson26 recently reviewed results from several trials involving a variety of surgical repairs and found return-to-sports rates ranging from 80% to 100%.) Findings from the physical examination and from the properly protocolled MRI examination are used in planning surgery to correct any pathology that could be contributing to symptoms or destabilization of the structures attaching to the pubis. Disruption of the P-PAC from the pubis would be repaired, for example. Additional injuries, such as partial or complete detachment of the conjoint tendon or inguinal ligament, may be repaired as well. If the transversalis fascia is attenuated and bulging forward, the inguinal floor is closed. Adductor longus tendon pathology is addressed, most commonly with partial tendinolysis. Often, concomitant inguinal hernias are found, and these may be repaired in open fashion while other maneuvers are being performed, or laparoscopically.
Materials and Methods
After receiving study approval from our Institutional Review Board, we retrospectively searched for all MRIs performed by our radiology department between March 1, 2011 and March 31, 2013 on patients referred for an indication of groin pain, sports hernia, or athletic pubalgia. Patients were excluded if they were younger than 18 years any time during their care. Some patients previously or subsequently underwent computed tomography or ultrasonography. MRIs were reviewed and positive findings were compiled in a database. Charts were reviewed to identify which patients in the dataset underwent surgery, after MRI, to address their presenting chief complaint. Surgery date and procedure(s) performed were recorded. Patients were interviewed by telephone as part of the in-office postoperative follow-up.
Results
One hundred nineteen MRIs were performed on 117 patients (97 men, 83%). Mean age was 39.8 years. Seventy-nine patients (68%) had an MRI finding of athletic pubalgia, 67 (57%) had an acetabular labral tear in one or both hip joints, and 41 (35%) had a true inguinal hernia. Concomitant findings were common: 47 cases of athletic pubalgia and labral tear(s), 28 cases of athletic pubalgia and inguinal hernia, and 15 cases of all 3 (athletic pubalgia, labral tear, inguinal hernia).
Use of breath-hold axial single-shot fast spin-echo sequences with and without the Valsalva maneuver increased sensitivity in detecting pathologies—inguinal hernia and Gilmore groin in particular. On 24 of the 119 MRIs, the Valsalva maneuver either revealed the finding or made it significantly more apparent.
Of all patients referred for MRI for chronic groin pain, 48 (41%) subsequently underwent surgery. In 29 surgeries, the rectus abdominis, adductor longus, and/or pre-pubic aponeurotic plate were repaired; in 13 cases, herniorrhaphy was performed as well; in 2 cases, masses involving the spermatic cord were removed.
The most common surgery (30 cases) was herniorrhaphy, which was performed as a single procedure, multiple procedures, or in combination with procedures not related to a true hernia. Eighteen patients underwent surgery only for hernia repair.
Of the 79 patients with MRI-positive athletic pubalgia, 39 subsequently underwent surgery, and 31 (79%) of these were followed up by telephone. Mean duration of rest after surgery was 6.2 weeks. Twelve patients (39%) had physical therapy after surgery, some as early as 4 weeks, and some have continued their therapy since surgery. Of the 31 patients who were followed up after surgery, 23 (74%) resumed previous activity levels. Return to previous activity level took these patients a mean of 17.9 weeks. When asked if outcomes satisfied their expectations, 28 patients (90%) said yes, and 3 said no.
Forty patients with MRI-positive athletic pubalgia were nonoperatively treated, and 28 (70%) of these patients were followed up. In this group, mean duration of rest after surgery was 6.9 weeks. Thirteen patients (46%) participated in physical therapy, for a mean duration of 10.8 weeks. Of the patients followed up, 19 (68%) returned to previous activity level. Twenty-one patients (75%) were satisfied with their outcome.
Discussion
Diagnosis and treatment of chronic groin pain have had a long, confusing, and frustrating history for both patients and the medical professionals who provide them with care.3,6,7,10 Historically, the problem has been, in part, the lack of diagnostic capabilities. Currently, however, pubalgia MRI protocol allows the exact pathology to be demonstrated.3 As already noted, concomitant hip pathology or inguinal hernia is not unusual8; any structural abnormality in the area is a potential destabilizer of the structures attached to the pubis.18 Solving only one of these issues may offer only partial resolution of symptoms and thereby reduce the rate of successful treatment of groin pain.
Diagnostic algorithms are being developed. In addition, nonoperative treatments are being tried for some of the issues. Physicians are giving diagnostic and therapeutic steroid injections in the pubic cleft, along the rectus abdominis/adductor longus complex, or posterior to the P-PAC. Platelet-rich plasma injection therapy has had limited success.29This article provides a snapshot of what a tertiary-care group of physicians specializing in chronic groin pain sees in an unfiltered setting. We think this is instructive for several reasons.
First, many patients in our population have visited a multitude of specialists without receiving a diagnosis or being referred appropriately. Simply, many specialists do not know the next step in treating groin pain and thus do not make the appropriate referral. Until recently, the literature has not been helpful. It has poorly described the constellation of injuries comprising chronic groin pain. More significantly, groin injuries have been presented as ambiguous injuries lacking effective treatment. Over the past decade, however, abundant literature on the correlation of these injuries with specific MRI findings has made the case otherwise.
Second, a specific MRI pubalgia protocol is needed. Inability to make a correct diagnosis, because of improper MRI, continues to add to the confusion surrounding the injury and undoubtedly prolongs the general medical community’s thinking that diagnosis and treatment of chronic groin pain are elusive. Our data support this point in many ways. Although all patients in this study were seen by a medical professional before coming to our office, none had received a diagnosis of occult hernia or attenuated transversalis fascia; nevertheless, we identified inguinal hernia, Gilmore groin, or both in 44% of these patients. These findings are not surprising, as MRI was the crucial link in diagnosis. In addition, the point made by other groin pain specialists—that a hernia precludes a pubalgia diagnosis1,2,5—is not supported by our data. Inguinal hernia can and does exist in conjunction with pubalgia. More than half the patients in our study had a combined diagnosis. We contend that, much as hip labral pathology occurs concomitantly with pubalgia,23 inguinal hernia may be a predisposing factor as well. A defect in the direct or indirect space can destabilize the area and place additional strain on the pubic attachments.
In our experience, the dynamic Valsalva sequence improves detection of true hernias and anterior abdominal wall deficiencies and should be included in each protocol for the evaluation of acute or chronic groin pain.
Shear forces and injury at the pubis can occur outside professional athletics. Our patient population is nonprofessional athletes, teenagers to retirees, and all can develop athletic pubalgia. Ninety percent of surveyed patients who received a diagnosis and were treated surgically were satisfied with their outcomes.
Am J Orthop. 2017;46(4):E251-E256. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
The past 3 decades have seen an evolution in the understanding, diagnosis, and treatment of groin pain, both chronic and acute, in athletes and non-athletes alike. Groin pain and groin injury are common. Most cases are transient, with patients returning to their activities within weeks or months. There has also been increasing awareness of a definitive population of patients who do not get better, or who improve and plateau before reaching preinjury level of performance.1-3 Several authors have brought more attention to the injury, introducing vocabulary, theories, diagnostic testing, and diagnoses, which now constitute a knowledge base.1,3-5
As stated in almost every article on groin pain and diagnosis, lack of cohesive agreement and vocabulary, and consistent protocols and procedures, has abounded, making general understanding and agreement in this area inconsistent.1,6-8In this article, members of a tertiary-care group specializing in chronic groin pain, athletic pubalgia (sports hernia), and inguinal herniorrhaphy outline their clinical examination, diagnostic algorithm, imaging protocol, treatment strategy, and outcomes for a population of patients referred by physicians and allied health professionals for a suspected diagnosis of athletic pubalgia.
Background
The pubic symphysis acts as a stabilizing central anchor with elaborate involvement of the anterior structures, including the rectus abdominis, adductor longus, and inguinal ligaments.3,7,9 Literature from Europe, Australia, and the United States has described groin pain, mostly in professional athletes, involving these pubic structures and attachments. Several publications have been addressing chronic groin pain, and each has its own diagnostic algorithm, imaging protocol, and treatment strategy.3,6,9-18
Terminology specific to groin pain in athletes is not new, and has a varied history dating to the early 20th century. Terms such as sportsman hernia19 and subsequently sports hernia20, have recently been embraced by the lay population. In 1999, Gibbon21 described shearing of the common adductor–rectus abdominis anatomical and functional unit and referenced a 1902 anatomical text that describes vertical ligamentous fibers contiguous with rectus sheath and adductor muscles, both attaching to the pubis. Injury to this region is the basis of pubalgia, a term originally used in 1984 by Brunet to describe a pain syndrome at the pubis.22
Many authors have proposed replacing sports hernia with athletic pubalgia.1,3,6,7,10,14,18,23 These terms refer to a group of musculoskeletal processes that occur in and around the pubic symphysis and that share similar mechanisms of injury and common clinical manifestations. The condition was originally described in high-performance athletes, and at one point the term sports hernia was reserved for this patient population.5 According to many authors, presence of an inguinal hernia excludes the diagnosis.1,2,5Magnetic resonance imaging (MRI) has helped to advance and define our understanding of the injury.10 As the history of the literature suggests, earlier concepts of chronic pain focused either on the medial aspect of the inguinal canal and its structures or on the pubic attachments. Many specialists in the area have concluded that the chronic groin pain injury can and often does embody both elements.3,9 Correlation with MRI findings, injury seen during surgical procedures, and cadaveric studies have directed our understanding to a structure, the pre-pubic aponeurotic complex (P-PAC), or rectus aponeurotic plate.12,24,25 Anatomically, the P-PAC, which has several fascial components, attaches posteriorly to the pubic bone and, to a degree, the pubic symphyseal cartilaginous disc. Major contributions to the P-PAC are fibers from the rectus abdominis tendon, the medial aspect of the transversalis and internal oblique muscles (the conjoint tendon, according to some), the inguinal ligament, and the adductor longus tendon.26When communicating with referring physicians, we use the term athletic pubalgia to indicate a specific injury. The athletic pubalgia injury can be defined as serial microtearing,1 or complete tearing, of the posterior attachment of the P-PAC off the anterior pubis.3,10 Complete tearing or displacement can occur unilaterally or across the midline to the other side. As athletic pubalgia is a specific anatomical injury rather than a broad category of findings, an additional pathologic diagnosis, such as inguinal hernia, does not exclude the diagnosis of athletic pubalgia. Unfortunately, the terms sports hernia and sportsman hernia, commonly used in the media and in professional communities, have largely confused the broader understanding of nuances and of the differences between the specific injuries and MRI findings.18
Our Experience
In our practice, we see groin pain patients referred by internists, physiatrists, physical therapists, trainers, general surgeons, urologists, gynecologists, and orthopedic surgeons. In many cases, patients have been through several consultations and work-ups, as their pain syndrome does not fall under a specific category. Patients without inguinal hernia, hip injury, urologic, or gynecologic issues typically are referred to a physiatrist or a physical therapist. Often, there are marginal improvements with physical therapy, but in some cases the injury never completely resolves, and the patient continues to have pain with activity or return to sports.
Most of our patients are nonprofessional athletes, men and women who range widely in age and participate casually or regularly in sporting events. Most lack the rigorous training, conditioning, and close supervision that professional athletes receive. Many other patients are nonprofessional but elite athletes who train 7 days a week for marathons, ultramarathons, triathlons, obstacle course races (“mudders”), and similar events.
Work-Up
A single algorithm is used for all patients initially referred to the surgeon’s office for pelvic or groin pain. The initial interview directs attention to injury onset and mechanism, duration of rest or physical therapy after surgery, pain quality and pain levels, and antagonistic movements and positions. Examination starts with assessment for inguinal, femoral, and umbilical hernias. Resisted sit-up, leg-raise, adduction, and hip assessment tests are performed. The P-PAC is examined with a maneuver similar to the one used for inguinal hernia, as it allows for better assessment of the transversalis fascia (over the direct space) to determine whether the inguinal canal floor is attenuated and bulges forward with the Valsalva maneuver. Then the lateral border of the rectus muscle is assessed for tenderness, usually with the head raised to contract the muscle. The rectus edge is traced down to its attachment on the pubis, the superolateral border of the P-PAC. Examination proceeds medially, over the rectus attachment, toward the pubic symphysis, continuing the assessment for tenderness. Laterally, the medial attachments of the conjoint tendon and inguinal ligament are assessed at the level of the pubic tubercle, which represents the lateral border of the P-PAC. Finally, the examination continues to the inferior border with assessment of the adductor longus attachment, which is best performed with the leg adducted. In the acute or semiacute setting (pain within 1 year of injury onset), tenderness is often elicited. With long-standing injuries, pain often is not elicited on examination, but the patient experiences pain along this axis during or after activity.
Patients with positive history and physical examination findings proceed through an MRI protocol designed to detect pathology of the pubic symphysis, hips, and inguinal canals (Figures 1A-1D).
Treatment
Patients who report sustaining an acute groin injury within the previous 6 months are treated nonoperatively. A combination of rest, nonsteroidal anti-inflammatory drugs, and physical therapy is generally recommended.2,10 In cases of failed nonoperative management, patients are evaluated for surgery. No single operation is recommended for all patients.1,6,14,27,28 (Larson26 recently reviewed results from several trials involving a variety of surgical repairs and found return-to-sports rates ranging from 80% to 100%.) Findings from the physical examination and from the properly protocolled MRI examination are used in planning surgery to correct any pathology that could be contributing to symptoms or destabilization of the structures attaching to the pubis. Disruption of the P-PAC from the pubis would be repaired, for example. Additional injuries, such as partial or complete detachment of the conjoint tendon or inguinal ligament, may be repaired as well. If the transversalis fascia is attenuated and bulging forward, the inguinal floor is closed. Adductor longus tendon pathology is addressed, most commonly with partial tendinolysis. Often, concomitant inguinal hernias are found, and these may be repaired in open fashion while other maneuvers are being performed, or laparoscopically.
Materials and Methods
After receiving study approval from our Institutional Review Board, we retrospectively searched for all MRIs performed by our radiology department between March 1, 2011, and March 31, 2013, on patients referred for an indication of groin pain, sports hernia, or athletic pubalgia. Patients were excluded if they were younger than 18 years at any time during their care. Some patients previously or subsequently underwent computed tomography or ultrasonography. MRIs were reviewed, and positive findings were compiled in a database. Charts were reviewed to identify which patients in the dataset underwent surgery, after MRI, to address their presenting chief complaint. Surgery date and procedure(s) performed were recorded. Patients were interviewed by telephone as part of the in-office postoperative follow-up.
Results
One hundred nineteen MRIs were performed on 117 patients (97 men, 83%). Mean age was 39.8 years. Seventy-nine patients (68%) had an MRI finding of athletic pubalgia, 67 (57%) had an acetabular labral tear in one or both hip joints, and 41 (35%) had a true inguinal hernia. Concomitant findings were common: 47 cases of athletic pubalgia and labral tear(s), 28 cases of athletic pubalgia and inguinal hernia, and 15 cases of all 3 (athletic pubalgia, labral tear, inguinal hernia).
Use of breath-hold axial single-shot fast spin-echo sequences with and without the Valsalva maneuver increased sensitivity in detecting pathologies—inguinal hernia and Gilmore groin in particular. On 24 of the 119 MRIs, the Valsalva maneuver either revealed the finding or made it significantly more apparent.
Of all patients referred for MRI for chronic groin pain, 48 (41%) subsequently underwent surgery. In 29 surgeries, the rectus abdominis, adductor longus, and/or pre-pubic aponeurotic plate were repaired; in 13 cases, herniorrhaphy was performed as well; in 2 cases, masses involving the spermatic cord were removed.
The most common surgery (30 cases) was herniorrhaphy, which was performed as a single procedure, multiple procedures, or in combination with procedures not related to a true hernia. Eighteen patients underwent surgery only for hernia repair.
Of the 79 patients with MRI-positive athletic pubalgia, 39 subsequently underwent surgery, and 31 (79%) of these were followed up by telephone. Mean duration of rest after surgery was 6.2 weeks. Twelve patients (39%) had physical therapy after surgery, some starting as early as 4 weeks, and some have continued their therapy since surgery. Of the 31 patients followed up after surgery, 23 (74%) resumed previous activity levels, taking a mean of 17.9 weeks to do so. When asked if outcomes met their expectations, 28 patients (90%) said yes, and 3 said no.
Forty patients with MRI-positive athletic pubalgia were treated nonoperatively, and 28 (70%) of these patients were followed up. In this group, mean duration of rest was 6.9 weeks. Thirteen patients (46%) participated in physical therapy, for a mean duration of 10.8 weeks. Of the patients followed up, 19 (68%) returned to their previous activity level. Twenty-one patients (75%) were satisfied with their outcome.
Discussion
Diagnosis and treatment of chronic groin pain have had a long, confusing, and frustrating history for both patients and the medical professionals who care for them.3,6,7,10 Historically, the problem has been, in part, a lack of diagnostic capabilities. Currently, however, a dedicated pubalgia MRI protocol allows the exact pathology to be demonstrated.3 As already noted, concomitant hip pathology or inguinal hernia is not unusual8; any structural abnormality in the area is a potential destabilizer of the structures attached to the pubis.18 Addressing only one of these issues may offer only partial resolution of symptoms and thereby reduce the rate of successful treatment of groin pain.
Diagnostic algorithms are being developed. In addition, nonoperative treatments are being tried for some of these issues. Physicians are giving diagnostic and therapeutic steroid injections in the pubic cleft, along the rectus abdominis/adductor longus complex, or posterior to the P-PAC. Platelet-rich plasma injection therapy has had limited success.29
This article provides a snapshot of what a tertiary-care group of physicians specializing in chronic groin pain sees in an unfiltered setting. We think this is instructive for several reasons.
First, many patients in our population have visited a multitude of specialists without receiving a diagnosis or an appropriate referral. Simply put, many specialists do not know the next step in treating groin pain and thus do not make the appropriate referral. Until recently, the literature was not helpful: it poorly described the constellation of injuries that constitute chronic groin pain and, more significantly, presented groin injuries as ambiguous and lacking effective treatment. Over the past decade, however, abundant literature correlating these injuries with specific MRI findings has made the case otherwise.
Second, a specific MRI pubalgia protocol is needed. Inability to make a correct diagnosis because of an improper MRI protocol continues to add to the confusion surrounding the injury and undoubtedly prolongs the general medical community’s impression that diagnosis and treatment of chronic groin pain are elusive. Our data support this point in several ways. Although all patients in this study were seen by a medical professional before coming to our office, none had received a diagnosis of occult hernia or attenuated transversalis fascia; nevertheless, we identified inguinal hernia, Gilmore groin, or both in 44% of these patients. These findings are not surprising, as MRI was the crucial link in diagnosis. In addition, the point made by other groin pain specialists, that a hernia precludes a pubalgia diagnosis,1,2,5 is not supported by our data. Inguinal hernia can and does exist in conjunction with pubalgia; more than half the patients in our study had a combined diagnosis. We contend that, much as hip labral pathology occurs concomitantly with pubalgia,23 inguinal hernia may be a predisposing factor as well: a defect in the direct or indirect space can destabilize the area and place additional strain on the pubic attachments.
In our experience, the dynamic Valsalva sequence improves detection of true hernias and anterior abdominal wall deficiencies and should be included in each protocol for the evaluation of acute or chronic groin pain.
Shear forces and injury at the pubis can occur outside professional athletics. Our patient population consists of nonprofessional athletes, from teenagers to retirees, and all can develop athletic pubalgia. Ninety percent of surveyed patients who received a diagnosis and were treated surgically were satisfied with their outcomes.
Am J Orthop. 2017;46(4):E251-E256. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
1. Meyers WC, Lanfranco A, Castellanos A. Surgical management of chronic lower abdominal and groin pain in high-performance athletes. Curr Sports Med Rep. 2002;1(5):301-305.
2. Ahumada LA, Ashruf S, Espinosa-de-los-Monteros A, et al. Athletic pubalgia: definition and surgical treatment. Ann Plast Surg. 2005;55(4):393-396.
3. Omar IM, Zoga AC, Kavanagh EC, et al. Athletic pubalgia and “sports hernia”: optimal MR imaging technique and findings. Radiographics. 2008;28(5):1415-1438.
4. Gilmore OJA. Gilmore’s groin: ten years experience of groin disruption—a previously unsolved problem in sportsmen. Sports Med Soft Tissue Trauma. 1991;3:12-14.
5. Meyers WC, Foley DP, Garrett WE, Lohnes JH, Mandelbaum BR. Management of severe lower abdominal or inguinal pain in high-performance athletes. PAIN (Performing Athletes with Abdominal or Inguinal Neuromuscular Pain Study Group). Am J Sports Med. 2000;28(1):2-8.
6. Kavanagh EC, Koulouris G, Ford S, McMahon P, Johnson C, Eustace SJ. MR imaging of groin pain in the athlete. Semin Musculoskelet Radiol. 2006;10(3):197-207.
7. Cunningham PM, Brennan D, O’Connell M, MacMahon P, O’Neill P, Eustace S. Patterns of bone and soft-tissue injury at the symphysis pubis in soccer players: observations at MRI. AJR Am J Roentgenol. 2007;188(3):W291-W296.
8. Zoga AC, Kavanagh EC, Omar IM, et al. Athletic pubalgia and the “sports hernia”: MR imaging findings. Radiology. 2008;247(3):797-807.
9. Koulouris G. Imaging review of groin pain in elite athletes: an anatomic approach to imaging findings. AJR Am J Roentgenol. 2008;191(4):962-972.
10. Albers SL, Spritzer CE, Garrett WE Jr, Meyers WC. MR findings in athletes with pubalgia. Skeletal Radiol. 2001;30(5):270-277.
11. Brennan D, O’Connell MJ, Ryan M, et al. Secondary cleft sign as a marker of injury in athletes with groin pain: MR image appearance and interpretation. Radiology. 2005;235(1):162-167.
12. Robinson P, Salehi F, Grainger A, et al. Cadaveric and MRI study of the musculotendinous contributions to the capsule of the symphysis pubis. AJR Am J Roentgenol. 2007;188(5):W440-W445.
13. Schilders E, Talbot JC, Robinson P, Dimitrakopoulou A, Gibbon WW, Bismil Q. Adductor-related groin pain in recreational athletes. J Bone Joint Surg Am. 2009;91(10):2455-2460.
14. Davies AG, Clarke AW, Gilmore J, Wotherspoon M, Connell DA. Review: imaging of groin pain in the athlete. Skeletal Radiol. 2010;39(7):629-644.
15. Mullens FE, Zoga AC, Morrison WB, Meyers WC. Review of MRI technique and imaging findings in athletic pubalgia and the “sports hernia.” Eur J Radiol. 2012;81(12):3780-3792.
16. Zoga AC, Meyers WC. Magnetic resonance imaging for pain after surgical treatment for athletic pubalgia and the “sports hernia.” Semin Musculoskelet Radiol. 2011;15(4):372-382.
17. Beer E. Periostitis of symphysis and descending rami of pubes following suprapubic operations. Int J Med Surg. 1924;37(5):224-225.
18. MacMahon PJ, Hogan BA, Shelly MJ, Eustace SJ, Kavanagh EC. Imaging of groin pain. Magn Reson Imaging Clin N Am. 2009;17(4):655-666.
19. Malycha P, Lovell G. Inguinal surgery in athletes with chronic groin pain: the ‘sportsman’s’ hernia. Aust N Z J Surg. 1992;62(2):123-125.
20. Hackney RG. The sports hernia: a cause of chronic groin pain. Br J Sports Med. 1993;27(1):58-62.
21. Gibbon WW. Groin pain in athletes. Lancet. 1999;353(9162):1444-1445.
22. Brunet B, Brunet-Guedj E, Genety J. La pubalgie: syndrome “fourre-tout” pour une plus grande rigueur diagnostique et thérapeutique. Instantanés Médicaux. 1984;55:25-30.
23. Lischuk AW, Dorantes TM, Wong W, Haims AH. Imaging of sports-related hip and groin injuries. Sports Health. 2010;2(3):252-261.
24. Gibbon WW, Hession PR. Diseases of the pubis and pubic symphysis: MR imaging appearances. AJR Am J Roentgenol. 1997;169(3):849-853.
25. Gamble JG, Simmons SC, Freedman M. The symphysis pubis. Anatomic and pathologic considerations. Clin Orthop Relat Res. 1986;(203):261-272.
26. Larson CM. Sports hernia/athletic pubalgia: evaluation and management. Sports Health. 2014;6(2):139-144.
27. Maffulli N, Loppini M, Longo UG, Denaro V. Bilateral mini-invasive adductor tenotomy for the management of chronic unilateral adductor longus tendinopathy in athletes. Am J Sports Med. 2012;40(8):1880-1886.
28. Schilders E, Dimitrakopoulou A, Cooke M, Bismil Q, Cooke C. Effectiveness of a selective partial adductor release for chronic adductor-related groin pain in professional athletes. Am J Sports Med. 2013;41(3):603-607.
29. Scholten PM, Massimi S, Dahmen N, Diamond J, Wyss J. Successful treatment of athletic pubalgia in a lacrosse player with ultrasound-guided needle tenotomy and platelet-rich plasma injection: a case report. PM R. 2015;7(1):79-83.
The TEND (Tomorrow’s Expected Number of Discharges) Model Accurately Predicted the Number of Patients Who Were Discharged from the Hospital the Next Day
Hospitals typically allocate beds based on historical patient volumes. If funding decreases, hospitals will usually try to maximize resource utilization by allocating beds to attain occupancies close to 100% for significant periods of time. This invariably causes days on which hospital occupancy exceeds capacity, at which point critical entry points (such as the emergency department and operating room) become blocked, raising significant concerns about the quality of patient care.
Hospital administrators have very few options when hospital occupancy exceeds 100%. They can postpone admissions for “planned” cases, bring in additional staff to increase capacity, or implement additional measures to increase hospital discharges, such as expanding care resources in the community. All of these options are costly, disruptive, or impossible to action immediately. The need for them could be minimized if hospital administrators knew the likely number of discharges in the next 24 hours, enabling more informed decisions about hospital bed management.
Predicting the number of people who will be discharged in the next day can be approached in several ways. One approach is to calculate each patient’s expected length of stay and then use the variation around that estimate to calculate each day’s discharge probability. Several studies have attempted to model hospital length of stay using a broad assortment of methodologies, but a mechanism to accurately predict this outcome has been elusive1,2 (with Verburg et al.3 concluding in their study’s abstract that “…it is difficult to predict length of stay…”). A second approach is to use survival analysis methods to generate each patient’s hazard of discharge over time, which can be converted directly to an expected daily risk of discharge. However, this approach is complicated by the concurrent need to include time-dependent covariates and to account for the competing risk of death in hospital.4,5 A third approach is a longitudinal analysis using marginal models to predict the daily probability of discharge,6 but this method quickly overwhelms computer resources with large datasets.
In this study, we used nonparametric methods to predict the daily number of hospital discharges. We first identified patient groups with distinct discharge patterns. We then calculated the conditional daily discharge probability for patients in each of these groups. Finally, we summed these conditional daily discharge probabilities for each hospital day to generate the expected number of discharges in the next 24 hours. This paper details the methods we used to create our model and the accuracy of its predictions.
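In other words, the expected count for a given day is the expectation of a sum of per-patient Bernoulli discharge indicators. A minimal sketch (Python; the probabilities below are made up for illustration):

```python
# Expected number of next-day discharges: the sum, over all patients
# currently in hospital, of each patient's conditional probability of
# being discharged on his or her current hospital day.
daily_probs = [0.11, 0.35, 0.08, 0.22]   # one illustrative probability per patient
expected_discharges = sum(daily_probs)   # expectation of a sum of Bernoulli indicators
print(expected_discharges)               # 0.76
```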
METHODS
Study Setting and Databases Used for Analysis
The study took place at The Ottawa Hospital, a 1000-bed teaching hospital with 3 campuses that is the primary referral center in our region. The study was approved by our local research ethics board.
The Patient Registry Database records the date and time of each patient’s admission (defined as the moment the patient’s admission request is registered in the patient registration system) and discharge (defined as the time the patient’s discharge from hospital is entered into the patient registration system) for hospital encounters. Emergency department encounters were also identified in the Patient Registry Database, along with admission service, patient age and sex, and patient location throughout the admission. The Laboratory Database records all laboratory studies and results on all patients at the hospital.
Study Cohort
We used the Patient Registry Database to identify all people aged 1 year or more who were admitted to the hospital between January 1, 2013, and December 31, 2015. This time frame was selected to (i) ensure that data were complete and (ii) provide complete calendar years of data for both the derivation (patient-days in 2013-2014) and validation (2015) cohorts. Patients who were observed in the emergency room without admission to hospital were not included.
Study Outcome
The study outcome was the number of patients discharged from the hospital each day. For the analysis, the reference point for each day was 1 second past midnight; therefore, values for time-dependent covariates up to and including midnight were used to predict the number of discharges in the next 24 hours.
Study Covariates
Baseline (ie, time-independent) covariates included patient age and sex, admission service, hospital campus, whether or not the patient was admitted from the emergency department (all determined from the Patient Registry Database), and the Laboratory-based Acute Physiological Score (LAPS). LAPS was derived by Escobar and colleagues7 to measure severity of illness and was subsequently validated in our hospital.8 It was calculated with the Laboratory Database using results for 14 tests (arterial pH, PaCO2, PaO2, anion gap, hematocrit, total white blood cell count, serum albumin, total bilirubin, creatinine, urea nitrogen, glucose, sodium, bicarbonate, and troponin I) measured in the 24 hours preceding hospitalization. The number of points assigned to each laboratory value reflects the independent association of that perturbation with risk of death in hospital, and the total LAPS is the sum of these points. Time-dependent covariates included weekday in hospital and whether or not the patient was in the intensive care unit.
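For illustration only, a points-based score of this kind can be sketched as follows; the cut points and weights below are placeholders, not the published LAPS weights, which appear in Escobar and colleagues7 and are not reproduced here.

```python
# Illustrative sketch of a points-based severity score like LAPS.
# The bands and point values are placeholders, NOT the published
# Escobar weights; only 2 of the 14 tests are shown.
def assign_points(test: str, value: float) -> int:
    """Map one lab result to points (placeholder banding)."""
    if test == "sodium":          # mmol/L
        return 0 if 135 <= value <= 145 else 2
    if test == "creatinine":      # umol/L
        return 0 if value < 110 else 3
    return 0                      # unscored in this sketch

def laps(results: dict) -> int:
    """`results` maps test name -> most recent value in the 24 hours
    before admission; the score is the sum of per-test points."""
    return sum(assign_points(test, value) for test, value in results.items())

print(laps({"sodium": 128.0, "creatinine": 180.0}))  # 5 with these placeholder weights
```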
Analysis
We used 3 stages to create a model predicting the daily expected number of discharges. In the first stage, we identified discharge risk strata containing patients with similar discharge patterns, using data from the derivation cohort. In the second stage, we generated the preliminary probability of discharge by determining the daily discharge probability within each discharge risk stratum. In the third stage, we modified this preliminary probability based on the weekday and admission service, and summed the modified probabilities to generate the expected number of discharges on a particular date.
The first stage identified discharge risk strata based on the covariates listed above. This was done with a survival tree approach9 in which proportional hazards regression models generated the “splits.” These models were offered all covariates listed in the Study Covariates section. Admission service was clustered within 4 departments (obstetrics/gynecology, psychiatry, surgery, and medicine), and day of week was “binarized” into weekday/weekend-holiday (because categorical variables with large numbers of groups can “stunt” regression trees, owing to the small numbers of patients, and therefore low statistical power, in each subgroup). The proportional hazards model identified the covariate having the strongest association with time to discharge (based on the Wald χ2 value divided by its degrees of freedom). This variable was then used to split the cohort into subgroups (with continuous covariates categorized into quartiles). The proportional hazards model was then repeated within each subgroup (with the previous splitting variable[s] excluded from the model). This process continued until no variable was associated with time to discharge at a P value less than .0001. The resulting survival tree was used to cluster all patients into distinct discharge risk strata.
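A compact sketch of this recursive partitioning follows. It assumes a pandas DataFrame with numeric (dummy-coded) covariates, a length-of-stay column, and a discharge indicator, and it uses the lifelines package for the Cox fits; the function names, column names, and the minimum-subgroup guard are ours, not the authors’.

```python
# Sketch of the stage-1 survival-tree splitting (illustrative, not the
# authors' code). Each split fits a Cox model, picks the covariate with
# the largest Wald chi-square per degree of freedom, and recurses until
# no covariate reaches P < .0001.
import pandas as pd
from lifelines import CoxPHFitter

P_SPLIT = 1e-4  # the paper's stopping threshold

def best_split_covariate(df, covariates):
    """Return the covariate with the largest Wald chi-square (z squared;
    1 df per coefficient here), or None if none has P < .0001."""
    cph = CoxPHFitter()
    cph.fit(df[covariates + ["los_days", "discharged"]],
            duration_col="los_days", event_col="discharged")
    summary = cph.summary                       # per-covariate z and p
    significant = summary[summary["p"] < P_SPLIT]
    if significant.empty:
        return None
    return (significant["z"] ** 2).idxmax()

def build_strata(df, covariates, path=()):
    """Recursively split the cohort; each returned leaf is one discharge
    risk stratum. The minimum-size guard is a practical addition."""
    var = best_split_covariate(df, covariates) if covariates else None
    if var is None or len(df) < 100:
        return [(path, df)]
    if df[var].nunique() > 4:                   # continuous: split at quartiles
        groups = pd.qcut(df[var], 4, duplicates="drop")
    else:
        groups = df[var]
    remaining = [c for c in covariates if c != var]
    leaves = []
    for level, subgroup in df.groupby(groups, observed=True):
        leaves += build_strata(subgroup, remaining, path + ((var, str(level)),))
    return leaves
```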
In the second stage, we generated the preliminary probability of discharge for a specific date. This was calculated by assigning all patients in hospital to their discharge risk strata (Appendix A). We then measured the probability of discharge on each hospitalization day in each discharge risk stratum using data from the previous 180 days (we used only the prior 180 days of data to account for temporal changes in hospital discharge patterns). For example, consider a 75-year-old patient on her third hospital day under obstetrics/gynecology on December 19, 2015 (a Saturday). This patient would be assigned to risk stratum #133 (Appendix A). We then measured the probability of discharge of all patients in this discharge risk stratum hospitalized in the previous 6 months (ie, between June 22, 2015, and December 18, 2015) on each hospital day. For risk stratum #133, the probability of discharge on hospital day 3 was 0.1111; therefore, our sample patient’s preliminary expected discharge probability was 0.1111.
To attain stable daily discharge probability estimates, a minimum of 50 patients per discharge risk stratum-hospitalization day combination was required. If there were fewer than 50 patients for a particular hospitalization day in a particular discharge risk stratum, we grouped hospitalization days in that risk stratum together until a minimum of 50 patients was reached.
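A sketch of this second stage for a single stratum, under the same assumed columns as above; the pooling rule here is one simple reading of the 50-patient minimum, not necessarily the authors’ exact implementation.

```python
# Stage-2 sketch (illustrative): the conditional probability of discharge
# on hospital day d is (discharges on day d) / (patients still in hospital
# at the start of day d), estimated from a 180-day lookback window for one
# discharge risk stratum. Days are pooled until >= 50 patients accumulate.
import pandas as pd

MIN_N = 50

def daily_discharge_probability(lookback: pd.DataFrame) -> dict:
    """`lookback` holds one stratum's patients from the prior 180 days
    (assumed non-empty), with integer `los_days` and a 0/1 `discharged` flag."""
    probs, start_day = {}, 1
    pooled_at_risk = pooled_discharged = 0
    max_day = int(lookback["los_days"].max())
    for day in range(1, max_day + 1):
        pooled_at_risk += int((lookback["los_days"] >= day).sum())
        pooled_discharged += int(((lookback["los_days"] == day)
                                  & (lookback["discharged"] == 1)).sum())
        if pooled_at_risk >= MIN_N or day == max_day:
            p = pooled_discharged / max(pooled_at_risk, 1)
            for d in range(start_day, day + 1):
                probs[d] = p          # pooled days share one estimate
            pooled_at_risk = pooled_discharged = 0
            start_day = day + 1
    return probs
```

For the sample patient above, the stratum #133 lookup for hospital day 3 would return the 0.1111 used in the example.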
The third (and final) stage accounted for the lack of granularity introduced when we created the discharge risk strata in the first stage. As mentioned above, admission service was clustered into 4 departments and day of week was clustered into weekday/weekend-holiday. However, important variations in discharge probabilities could still exist within departments and between particular days of the week.10 Therefore, we created a correction factor to adjust the preliminary expected number of discharges based on the admission division and day of week. For each division-day of week grouping, we calculated the expected daily number of discharges (using the methods above) over the 180 days prior to the analysis date; the correction factor was the relative difference between the observed and expected number of discharges within that grouping.
For example, to calculate the correction factor for our sample patient presented above (75-year-old patient on hospital day 3 under obstetrics/gynecology on Saturday, December 19, 2015), we measured the observed number of discharges from gynecology on Saturdays between June 22, 2015, and December 18, 2015 (n = 206), and the expected number of discharges (n = 195.255), resulting in a correction factor of (observed − expected)/expected = (206 − 195.255)/195.255 = 0.05503. Therefore, the final expected discharge probability for our sample patient was 0.1111 + 0.1111 × 0.05503 = 0.1172. The expected number of discharges on a particular date was the preliminary expected number of discharges on that date (generated in the second stage) multiplied by (1 + correction factor) for the corresponding division-day of week group.
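The worked example reduces to a few lines of arithmetic (values taken from the text above):

```python
# Reproducing the correction-factor arithmetic from the example above.
observed = 206         # gynecology discharges on Saturdays in the lookback
expected = 195.255     # summed preliminary discharge probabilities

correction = (observed - expected) / expected     # 0.05503
prelim_prob = 0.1111                              # stage-2 probability, day 3
final_prob = prelim_prob * (1 + correction)       # 0.1172

print(round(correction, 5), round(final_prob, 4))
```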
RESULTS
There were 192,859 admissions of patients more than 1 year of age who spent at least part of their hospitalization between January 1, 2013, and December 31, 2015 (Table). Patients were middle-aged and slightly female predominant, with about half admitted from the emergency department. Approximately 80% of admissions were to surgical or medical services. More than 95% of admissions ended with a discharge from hospital; the remainder ended in death. Almost 30% of hospitalization days occurred on weekends or holidays. Hospitalizations in the derivation (2013-2014) and validation (2015) groups were essentially the same, except for a slight drop in hospital length of stay (from a median of 4 days to 3 days) between the 2 periods.
Patient and hospital covariates strongly influenced the daily conditional probability of discharge (Figure 1). Patients admitted to the obstetrics/gynecology department were markedly more likely to be discharged from hospital, with no influence from the day of week. In contrast, the probability of discharge decreased notably on weekends in the other departments. Patients on the ward were much more likely to be discharged than those in the intensive care unit, with increasing age associated with a decreased discharge likelihood in the former but not the latter. Finally, discharge probabilities varied only slightly between campuses of our hospital, with discharge likelihood decreasing as severity of illness (as measured by LAPS) increased.
The TEND model contained 142 discharge risk strata (Appendix A). Weekend-holiday status had the strongest association with discharge probability (ie, it was the first splitting variable). The most complex discharge risk strata contained 6 covariates. The daily conditional probability of discharge during the first 2 weeks of hospitalization varied extensively between discharge risk strata (Figure 2). Overall, the conditional discharge probability increased from the first to the second day, remained relatively stable for several days, and then slowly decreased over time. However, this pattern and day-to-day variability differed extensively between risk strata.
The observed daily number of discharges in the validation cohort varied extensively (median 139; interquartile range [IQR] 95-160; range 39-214). The TEND model accurately predicted the daily number of discharges, with the expected daily number strongly associated with the observed number (adjusted R2 = 89.2%; P < 0.0001; Figure 3). Calibration decreased but remained significant when we limited the analyses by hospital campus (General: R2 = 46.3%; P < 0.0001; Civic: R2 = 47.9%; P < 0.0001; Heart Institute: R2 = 18.1%; P < 0.0001). The expected number of daily discharges was an unbiased estimator of the observed number of discharges (its parameter estimate in a linear regression model with the observed number of discharges as the outcome variable was 1.0005; 95% confidence interval, 0.9647-1.0363). The difference between the observed and expected daily number of discharges was small (median 1.6; IQR −6.8 to 9.4; range −37 to 63.4), as was the relative difference (median 1.4%; IQR −5.5% to 7.1%; range −40.9% to 43.4%). The expected number of discharges was within 20% of the observed number on 95.1% of days in 2015.
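These summary statistics are straightforward to recompute. A sketch of the validation checks (our variable names; synthetic data stand in for the real daily counts):

```python
# Sketch of the validation checks reported above: regress observed daily
# discharge counts on expected counts (slope near 1 suggests an unbiased
# estimator) and summarize the daily differences.
import numpy as np
from scipy import stats

def validate(observed: np.ndarray, expected: np.ndarray) -> None:
    fit = stats.linregress(expected, observed)   # observed ~ expected
    print(f"R-squared: {fit.rvalue ** 2:.3f}")
    print(f"slope: {fit.slope:.4f} (SE {fit.stderr:.4f})")
    diff = observed - expected
    print(f"median difference: {np.median(diff):.1f}")
    within = np.mean(np.abs(diff) / observed <= 0.20)
    print(f"days with expected within 20% of observed: {within:.1%}")

# Synthetic stand-in for the 365 daily counts of 2015:
rng = np.random.default_rng(0)
expected = rng.uniform(90, 180, size=365)
observed = expected + rng.normal(0, 10, size=365)
validate(observed, expected)
```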
DISCUSSION
Knowing how many patients will soon be discharged from the hospital should greatly facilitate hospital planning. This study showed that the TEND model used simple patient and hospitalization covariates to accurately predict the number of patients who will be discharged from hospital in the next day.
We believe that this study has several notable findings. First, using a nonparametric approach to predict the daily number of discharges substantially increased accuracy. This approach allowed us to generate expected likelihoods based on actual discharge probabilities at our hospital, over the most recent 6 months of hospitalization-days, among patients with discharge patterns very similar to those of the patient in question (ie, discharge risk strata, Appendix A). This ensured that trends in hospitalization habits were accounted for without the need for a period variable in our model. In addition, the absence of fitted parameters in the model should make it easier to transfer to other hospitals. Second, the accuracy of the predictions was remarkable given the relative “crudeness” of our predictors. Using relatively simple factors, the TEND model output accurate predictions of the number of daily discharges (Figure 3).
This study joins several others that have attempted the difficult task of predicting the number of hospital discharges using digitized data. Barnes et al.11 created a model using regression random forest methods in a single medical service within a hospital to predict the daily number of discharges with impressive accuracy (mean daily number of discharges observed 8.29, expected 8.51). Interestingly, their model was more accurate at predicting discharge likelihood than physicians were. Levin et al.12 derived a model using discrete-time logistic regression to predict the likelihood of discharge from a pediatric intensive care unit, finding that physician orders (captured via electronic order entry) could be categorized and used to significantly increase the accuracy of discharge likelihood estimates. These studies demonstrate the potential of health-related data from hospital data warehouses to improve prediction. We believe that continued work in this field will result in the increased use of digital data to help hospital administrators manage patient beds more efficiently and effectively than with the resource-intensive manual methods currently used.13,14
Several issues should be kept in mind when interpreting our findings. First, our analysis is limited to a single institution in Canada. It will be important to determine whether the TEND model methodology generalizes to other hospitals in different jurisdictions; external validation, especially in multiple hospitals, will be important to show that the methodology works in other facilities. Hospitals could implement the TEND model if they are able to record daily values for each of the variables required to assign patients to a discharge risk stratum (Appendix A) and to calculate the daily probability of discharge within each stratum. Hospitals could also derive their own discharge risk strata to account for covariates that we did not include in our study but that could be influential, such as insurance status. These discharge risk estimates could be incorporated into the electronic medical record or hospital dashboards (as long as the data required to generate the estimates are available). Such interventions would permit the expected number of hospital discharges (and even the patient-level probability of discharge) to be calculated on a daily basis. Second, 2 potential biases could have influenced the identification of our discharge risk strata (Appendix A). In this process, we used survival tree methods to separate patient-days into clusters with progressively more homogeneous discharge patterns. Each split was determined by a proportional hazards model that ignored the competing risk of death in hospital. In addition, the model expressed age and LAPS as continuous variables, whereas these covariates had to be categorized to create our risk strata groupings; the strength of a covariate’s association with an outcome decreases when a continuous variable is categorized.15 Both of these issues might have biased our final risk strata categorization (Appendix A). Third, we limited our model to simple covariates whose values can be determined relatively easily within most hospital administrative data systems. While this increases generalizability to other hospital information systems, we believe that introducing other covariates, such as daily vital signs, laboratory results, medications, or time since an operation, could increase prediction accuracy. Finally, it is uncertain whether knowing the predicted number of discharges will improve the efficiency of bed management within the hospital. It seems logical that an accurate prediction of the number of beds that will become available in the next day should improve decisions about the number of patients who could be admitted electively, but it remains to be seen whether this truly happens.
In summary, we found that the TEND model used a handful of patient and hospitalization factors to accurately predict the expected number of discharges from hospital in the next day. Further work is required to implement this model in our institution’s data warehouse and to determine whether its predictions improve the efficiency of bed management at our hospital.
Disclosure: CvW is supported by a University of Ottawa Department of Medicine Clinician Scientist Chair. The authors have no conflicts of interest.
1. Austin PC, Rothwell DM, Tu JV. A comparison of statistical modeling strategies for analyzing length of stay after CABG surgery. Health Serv Outcomes Res Methodol. 2002;3:107-133.
2. Moran JL, Solomon PJ. A review of statistical estimators for risk-adjusted length of stay: analysis of the Australian and New Zealand intensive care adult patient database, 2008-2009. BMC Med Res Methodol. 2012;12:68.
3. Verburg IWM, de Keizer NF, de Jonge E, Peek N. Comparison of regression methods for modeling intensive care length of stay. PLoS One. 2014;9:e109684.
4. Beyersmann J, Schumacher M. Time-dependent covariates in the proportional subdistribution hazards model for competing risks. Biostatistics. 2008;9:765-776.
5. Latouche A, Porcher R, Chevret S. A note on including time-dependent covariate in regression model for competing risks data. Biom J. 2005;47:807-814.
6. Fitzmaurice GM, Laird NM, Ware JH. Marginal models: generalized estimating equations. In: Applied Longitudinal Analysis. 2nd ed. Hoboken, NJ: John Wiley & Sons; 2011:353-394.
7. Escobar GJ, Greene JD, Scheirer P, Gardner MN, Draper D, Kipnis P. Risk-adjusting hospital inpatient mortality using automated inpatient, outpatient, and laboratory databases. Med Care. 2008;46:232-239.
8. van Walraven C, Escobar GJ, Greene JD, Forster AJ. The Kaiser Permanente inpatient risk adjustment methodology was valid in an external patient population. J Clin Epidemiol. 2010;63:798-803.
9. Bou-Hamad I, Larocque D, Ben-Ameur H. A review of survival trees. Stat Surv. 2011;5:44-71.
10. van Walraven C, Bell CM. Risk of death or readmission among people discharged from hospital on Fridays. CMAJ. 2002;166:1672-1673.
11. Barnes S, Hamrock E, Toerper M, Siddiqui S, Levin S. Real-time prediction of inpatient length of stay for discharge prioritization. J Am Med Inform Assoc. 2016;23:e2-e10.
12. Levin SRP, Harley ETB, Fackler JCM, et al. Real-time forecasting of pediatric intensive care unit length of stay using computerized provider orders. Crit Care Med. 2012;40:3058-3064.
13. Resar R, Nolan K, Kaczynski D, Jensen K. Using real-time demand capacity management to improve hospitalwide patient flow. Jt Comm J Qual Patient Saf. 2011;37:217-227.
14. de Grood A, Blades K, Pendharkar SR. A review of discharge prediction processes in acute care hospitals. Healthc Policy. 2016;12:105-115.
15. van Walraven C, Hart RG. Leave ’em alone - why continuous variables should be analyzed as such. Neuroepidemiology. 2008;30:138-139.
In summary, we found that the TEND model used a handful of patient and hospitalization factors to accurately predict the expected number of discharges from hospital in the next day. Further work is required to implement this model into our institution’s data warehouse and then determine whether this prediction will improve the efficiency of bed management at our hospital.
Disclosure: CvW is supported by a University of Ottawa Department of Medicine Clinician Scientist Chair. The authors have no conflicts of interest
Hospitals typically allocate beds based on historical patient volumes. If funding decreases, hospitals will usually try to maximize resource utilization by allocating beds to attain occupancies close to 100% for significant periods of time. This will invariably produce days on which hospital occupancy exceeds capacity, at which point critical entry points (such as the emergency department and operating room) become blocked. This creates significant concerns about the quality of patient care.
Hospital administrators have very few options when hospital occupancy exceeds 100%. They can postpone admissions for "planned" cases, bring in additional staff to increase capacity, or implement additional measures to increase hospital discharges, such as expanding care resources in the community. All of these options are costly, disruptive, or cannot be implemented immediately. The need for them could be minimized by enabling hospital administrators to make more informed bed-management decisions based on the likely number of discharges in the next 24 hours.
Predicting the number of people who will be discharged in the next day can be approached in several ways. One approach would be to calculate each patient’s expected length of stay and then use the variation around that estimate to calculate each day’s discharge probability. Several studies have attempted to model hospital length of stay using a broad assortment of methodologies, but a mechanism to accurately predict this outcome has been elusive1,2 (with Verburg et al.3 concluding in their study’s abstract that “…it is difficult to predict length of stay…”). A second approach would be to use survival analysis methods to generate each patient’s hazard of discharge over time, which could be directly converted to an expected daily risk of discharge. However, this approach is complicated by the concurrent need to include time-dependent covariates and consider the competing risk of death in hospital, which can complicate survival modeling.4,5 A third approach would be the implementation of a longitudinal analysis using marginal models to predict the daily probability of discharge,6 but this method quickly overwhelms computer resources when large datasets are present.
In this study, we decided to use nonparametric models to predict the daily number of hospital discharges. We first identified patient groups with distinct discharge patterns. We then calculated the conditional daily discharge probability of patients in each of these groups. Finally, these conditional daily discharge probabilities were then summed for each hospital day to generate the expected number of discharges in the next 24 hours. This paper details the methods we used to create our model and the accuracy of its predictions.
METHODS
Study Setting and Databases Used for Analysis
The study took place at The Ottawa Hospital, a 1000-bed teaching hospital with 3 campuses that is the primary referral center in our region. The study was approved by our local research ethics board.
The Patient Registry Database records the date and time of admission (defined as the moment a patient's admission request is registered in the patient registration system) and discharge (defined as the moment the patient's discharge from hospital is entered into the system) for each hospital encounter. Emergency department encounters were also identified in the Patient Registry Database, along with admission service, patient age and sex, and patient location throughout the admission. The Laboratory Database records all laboratory studies and results on all patients at the hospital.
Study Cohort
We used the Patient Registry Database to identify all people aged 1 year or more who were admitted to the hospital between January 1, 2013, and December 31, 2015. This time frame was selected to (i) ensure that the data were complete and (ii) provide complete calendar years of data for both the derivation (patient-days in 2013-2014) and validation (2015) cohorts. Patients who were observed in the emergency room without admission to hospital were not included.
Study Outcome
The study outcome was the number of patients discharged from the hospital each day. For the analysis, the reference point for each day was 1 second past midnight; therefore, values for time-dependent covariates up to and including midnight were used to predict the number of discharges in the next 24 hours.
Study Covariates
Baseline (ie, time-independent) covariates included patient age and sex, admission service, hospital campus, whether or not the patient was admitted from the emergency department (all determined from the Patient Registry Database), and the Laboratory-based Acute Physiological Score (LAPS). The latter, which was calculated with the Laboratory Database using results for 14 tests (arterial pH, PaCO2, PaO2, anion gap, hematocrit, total white blood cell count, serum albumin, total bilirubin, creatinine, urea nitrogen, glucose, sodium, bicarbonate, and troponin I) measured in the 24-hour time frame preceding hospitalization, was derived by Escobar and colleagues7 to measure severity of illness and was subsequently validated in our hospital.8 The independent association of each laboratory perturbation with risk of death in hospital is reflected by the number of points assigned to each lab value with the total LAPS being the sum of these values. Time-dependent covariates included weekday in hospital and whether or not patients were in the intensive care unit.
Analysis
We used 3 stages to create a model to predict the daily expected number of discharges. In the first stage, we identified discharge risk strata containing patients with similar discharge patterns, using data from the derivation cohort. In the second stage, we generated a preliminary probability of discharge by determining the daily discharge probability within each discharge risk stratum. In the third stage, we modified this probability based on the weekday and admission service and summed these probabilities across patients to create the expected number of discharges on a particular date.
The first stage identified discharge risk strata based on the covariates listed above. This was done by using a survival tree approach9 with proportional hazards regression models to generate the "splits." These models were offered all covariates listed in the Study Covariates section. Admission service was clustered within 4 departments (obstetrics/gynecology, psychiatry, surgery, and medicine) and day of week was "binarized" into weekday/weekend-holiday (because the use of categorical variables with large numbers of groups can "stunt" regression trees due to small numbers of patients, and therefore low statistical power, in each subgroup). The proportional hazards model identified the covariate having the strongest association with time to discharge (based on the Wald χ2 value divided by the degrees of freedom). This variable was then used to split the cohort into subgroups (with continuous covariates categorized into quartiles). The proportional hazards model was then repeated in each subgroup (with the previous splitting variable[s] excluded from the model). This process continued until no variable was associated with time to discharge at a P value less than .0001. The resulting survival tree was used to cluster all patients into distinct discharge risk strata.
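To make the splitting step concrete, the following is a minimal Python sketch (our illustration, not the authors' code) using the lifelines package. For brevity it scores each candidate covariate in its own Cox model, whereas the paper offered all covariates to a single model; the DataFrame and column names are hypothetical.

```python
# One splitting step of the survival-tree procedure described above.
# Assumes a pandas DataFrame with one row per admission: 'los_days'
# (time to discharge), 'discharged' (1 = discharged, 0 = censored),
# plus candidate covariates (continuous ones pre-cut into quartiles).
import pandas as pd
from lifelines import CoxPHFitter

def best_split(df: pd.DataFrame, candidates: list) -> str:
    """Return the candidate with the largest Wald chi-square per degree
    of freedom, or None if no candidate reaches P < .0001."""
    best, best_score = None, 0.0
    for cov in candidates:
        dummies = pd.get_dummies(df[cov], prefix=cov, drop_first=True).astype(float)
        data = pd.concat([df[["los_days", "discharged"]], dummies], axis=1)
        cph = CoxPHFitter()
        cph.fit(data, duration_col="los_days", event_col="discharged")
        wald = (cph.summary["z"] ** 2).sum()         # Wald chi-square
        score = wald / max(len(dummies.columns), 1)  # ...per degree of freedom
        if score > best_score and (cph.summary["p"] < 1e-4).any():
            best, best_score = cov, score
    return best

# e.g., best_split(df, ["weekend_holiday", "department", "icu",
#                       "age_quartile", "laps_quartile", "from_ed"])
```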
In the second stage, we generated the preliminary probability of discharge for a specific date. This was calculated by assigning all patients in hospital to their discharge risk strata (Appendix A). We then measured the probability of discharge on each hospitalization day in all discharge risk strata using data from the previous 180 days (we only used the prior 180 days of data to account for temporal changes in hospital discharge patterns). For example, consider a 75-year-old patient on her third hospital day under obstetrics/gynecology on December 19, 2015 (a Saturday). This patient would be assigned to risk stratum #133 (Appendix A). We then measured the probability of discharge of all patients in this discharge risk stratum hospitalized in the previous 6 months (ie, between June 22, 2015, and December 18, 2015) on each hospital day. For risk stratum #133, the probability of discharge on hospital day 3 was 0.1111; therefore, our sample patient's preliminary expected discharge probability was 0.1111.
To attain stable daily discharge probability estimates, a minimum of 50 patients per discharge risk stratum-hospitalization day combination was required. If there were less than 50 patients for a particular hospitalization day in a particular discharge risk stratum, we grouped hospitalization days in that risk stratum together until the minimum of 50 patients was collected.
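A minimal sketch of this second stage (ours, not the authors' code) follows, assuming a pandas DataFrame of patient-days from the prior 180 days with hypothetical column names; how the under-populated tail of hospital days is pooled is also our assumption.

```python
# Empirical conditional probability of discharge per hospital day
# within one discharge risk stratum, pooling adjacent hospital days
# until at least 50 patients contribute to each estimate.
import pandas as pd

def daily_discharge_probs(stratum_days: pd.DataFrame, min_n: int = 50) -> dict:
    """stratum_days: one row per patient-day with 'hosp_day' (1, 2, ...)
    and 'discharged_today' (0/1). Returns {hosp_day: P(discharge)}."""
    counts = (stratum_days.groupby("hosp_day")
              .agg(n=("discharged_today", "size"), d=("discharged_today", "sum"))
              .sort_index())
    probs, pool_n, pool_d, pooled = {}, 0, 0, []
    for day, row in counts.iterrows():
        pool_n += row["n"]; pool_d += row["d"]; pooled.append(day)
        if pool_n >= min_n:               # enough patients: emit one estimate
            for d in pooled:
                probs[d] = pool_d / pool_n
            pool_n, pool_d, pooled = 0, 0, []
    for d in pooled:                      # leftover tail (our assumption):
        probs[d] = pool_d / pool_n        # pool the remaining days together
    return probs
```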
The third (and final) stage accounted for the lack of granularity when we created the discharge risk strata in the first stage. As we mentioned above, admission service was clustered into 4 departments and the day of week was clustered into weekend/weekday. However, important variations in discharge probabilities could still exist within departments and between particular days of the week.10 Therefore, we created a correction factor to adjust the preliminary expected number of discharges based on the admission division and day of week. This correction factor used data from the 180 days prior to the analysis date within which the expected daily number of discharges was calculated (using the methods above). The correction factor was the relative difference between the observed and expected number of discharges within each division-day of week grouping.
For example, to calculate the correction factor for our sample patient presented above (a 75-year-old patient on hospital day 3 under obstetrics/gynecology on Saturday, December 19, 2015), we measured the observed number of discharges from obstetrics/gynecology on Saturdays between June 22, 2015, and December 18, 2015 (n = 206) and the expected number of discharges (n = 195.255), yielding a correction factor of (observed − expected)/expected = (206 − 195.255)/195.255 = 0.05503. Therefore, the final expected discharge probability for our sample patient was 0.1111 + 0.1111 × 0.05503 = 0.1172. The expected number of discharges on a particular date was the preliminary expected number of discharges on that date (generated in the second stage) adjusted by the correction factor for the corresponding division-day of week group (ie, multiplied by 1 plus the correction factor).
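The correction-factor arithmetic from this worked example, as a quick sketch (values copied from the text):

```python
# Correction factor and final discharge probability for the sample
# patient (stratum #133, hospital day 3, obstetrics/gynecology, Saturday).
observed, expected = 206, 195.255            # discharges in prior 180 days
correction = (observed - expected) / expected
print(round(correction, 5))                  # 0.05503

prelim_prob = 0.1111                         # stage-2 probability
final_prob = prelim_prob * (1 + correction)
print(round(final_prob, 4))                  # 0.1172
```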
RESULTS
There were 192,859 admissions involving patients more than 1 year of age who spent at least part of their hospitalization between January 1, 2013, and December 31, 2015 (Table). Patients were middle-aged and slightly predominantly female, with about half admitted from the emergency department. Approximately 80% of admissions were to surgical or medical services. More than 95% of admissions ended with a discharge from the hospital; the remainder ended in death. Almost 30% of hospitalization days occurred on weekends or holidays. Hospitalizations in the derivation (2013-2014) and validation (2015) groups were essentially the same, except for a slight drop in median hospital length of stay (from 4 days to 3 days) between the 2 periods.
Patient and hospital covariates strongly influenced the daily conditional probability of discharge (Figure 1). Patients admitted to the obstetrics/gynecology department were notably more likely to be discharged, regardless of the day of week. In contrast, the probability of discharge decreased notably on weekends in the other departments. Patients on the ward were much more likely to be discharged than those in the intensive care unit, and increasing age was associated with a decreased likelihood of discharge in the former but not the latter. Finally, discharge probabilities varied only slightly between the campuses of our hospital, and discharge risk decreased as severity of illness (as measured by LAPS) increased.
The TEND model contained 142 discharge risk strata (Appendix A). Weekend-holiday status had the strongest association with discharge probability (ie, it was the first splitting variable). The most complex discharge risk strata contained 6 covariates. The daily conditional probability of discharge during the first 2 weeks of hospitalization varied extensively between discharge risk strata (Figure 2). Overall, the conditional discharge probability increased from the first to the second day, remained relatively stable for several days, and then slowly decreased over time. However, this pattern and day-to-day variability differed extensively between risk strata.
The observed daily number of discharges in the validation cohort varied extensively (median 139; interquartile range [IQR] 95-160; range 39-214). The TEND model accurately predicted the daily number of discharges with the expected daily number being strongly associated with the observed number (adjusted R2 = 89.2%; P < 0.0001; Figure 3). Calibration decreased but remained significant when we limited the analyses by hospital campus (General: R2 = 46.3%; P < 0.0001; Civic: R2 = 47.9%; P < 0.0001; Heart Institute: R2 = 18.1%; P < 0.0001). The expected number of daily discharges was an unbiased estimator of the observed number of discharges (its parameter estimate in a linear regression model with the observed number of discharges as the outcome variable was 1.0005; 95% confidence interval, 0.9647-1.0363). The absolute difference in the observed and expected daily number of discharges was small (median 1.6; IQR −6.8 to 9.4; range −37 to 63.4) as was the relative difference (median 1.4%; IQR −5.5% to 7.1%; range −40.9% to 43.4%). The expected number of discharges was within 20% of the observed number of discharges in 95.1% of days in 2015.
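For readers wishing to reproduce these summaries, here is a minimal sketch (ours, not the authors' code; variable names hypothetical) of the validation calculations:

```python
# Calibration regression of observed on expected daily discharges,
# absolute and relative differences, and share of days within 20%.
import numpy as np
import statsmodels.api as sm

def summarize_validation(observed: np.ndarray, expected: np.ndarray) -> None:
    fit = sm.OLS(observed, sm.add_constant(expected)).fit()
    print(f"adjusted R2: {fit.rsquared_adj:.3f}")
    print(f"slope on expected: {fit.params[1]:.4f}, 95% CI {fit.conf_int()[1]}")
    diff = observed - expected
    print(f"median difference: {np.median(diff):.1f}")
    print(f"median relative difference: {np.median(diff / expected * 100):.1f}%")
    within = np.mean(np.abs(expected - observed) / observed <= 0.20)
    print(f"days with expected within 20% of observed: {within:.1%}")
```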
DISCUSSION
Knowing how many patients will soon be discharged from the hospital should greatly facilitate hospital planning. This study showed that the TEND model used simple patient and hospitalization covariates to accurately predict the number of patients who will be discharged from hospital in the next day.
We believe that this study has several notable findings. First, we think that using a nonparametric approach to predict the daily number of discharges importantly increased accuracy. This approach allowed us to generate expected likelihoods based on actual discharge probabilities at our hospital over the most recent 6 months of hospitalization-days, within patients having discharge patterns very similar to the patient in question (ie, discharge risk strata, Appendix A). This ensured that trends in hospitalization practices were accounted for without the need for a period variable in our model. In addition, the lack of parameters in the model should make it easier to transplant to other hospitals. Second, we think that the accuracy of the predictions was remarkable given the relative "crudeness" of our predictors. Using relatively simple factors, the TEND model produced accurate predictions of the number of daily discharges (Figure 3).
This study joins several others that have attempted the difficult task of predicting the number of hospital discharges using digitized data. Barnes et al.11 created a model using regression random forest methods in a single medical service within a hospital to predict the daily number of discharges with impressive accuracy (mean daily number of discharges observed 8.29, expected 8.51); interestingly, their model was more accurate at predicting discharge likelihood than physicians. Levin et al.12 derived a model using discrete-time logistic regression to predict the likelihood of discharge from a pediatric intensive care unit, finding that physician orders (captured via electronic order entry) could be categorized and used to significantly increase the accuracy of discharge likelihood estimates. These studies demonstrate the potential of health-related data in hospital data warehouses to improve prediction. We believe that continued work in this field will lead to the increased use of digital data to help hospital administrators manage patient beds more efficiently and effectively than the resource-intensive manual methods currently used.13,14
Several issues should be kept in mind when interpreting our findings. First, our analysis is limited to a single institution in Canada. It will be important to determine whether the TEND model methodology generalizes to other hospitals in different jurisdictions; external validation, especially in multiple hospitals, will be important to show that the methodology works in other facilities. Hospitals could implement the TEND model if they are able to record daily values for each of the variables required to assign patients to a discharge risk stratum (Appendix A) and calculate the daily probability of discharge within each stratum. Hospitals could also derive their own discharge risk strata to account for covariates that we did not include in our study but that could be influential, such as insurance status. These discharge risk estimates could be incorporated into the electronic medical record or hospital dashboards (as long as the data required to generate the estimates are available). Such interventions would permit the expected number of hospital discharges (and even the patient-level probability of discharge) to be calculated on a daily basis. Second, 2 potential biases could have influenced the identification of our discharge risk strata (Appendix A). In this process, we used survival tree methods to separate patient-days into clusters with progressively more homogeneous discharge patterns. Each split was determined by a proportional hazards model that ignored the competing risk of death in hospital. In addition, the model expressed age and LAPS as continuous variables, whereas these covariates had to be categorized to create our risk strata groupings; the strength of a covariate's association with an outcome decreases when a continuous variable is categorized.15 Both of these issues might have biased our final risk strata categorization (Appendix A). Third, we limited our model to simple covariates whose values can be determined relatively easily within most hospital administrative data systems. While this increases generalizability to other hospital information systems, we believe that adding other covariates to the model (such as daily vital signs, laboratory results, medications, or time from operations) could increase prediction accuracy. Finally, it is uncertain whether knowing the predicted number of discharges will improve the efficiency of bed management within the hospital. It seems logical that an accurate prediction of the number of beds that will become available in the next day should improve decisions about the number of patients who could be admitted electively; it remains to be seen whether this truly happens.
In summary, we found that the TEND model used a handful of patient and hospitalization factors to accurately predict the expected number of discharges from hospital in the next day. Further work is required to implement this model into our institution’s data warehouse and then determine whether this prediction will improve the efficiency of bed management at our hospital.
Disclosure: CvW is supported by a University of Ottawa Department of Medicine Clinician Scientist Chair. The authors have no conflicts of interest.
1. Austin PC, Rothwell DM, Tu JV. A comparison of statistical modeling strategies for analyzing length of stay after CABG surgery. Health Serv Outcomes Res Methodol. 2002;3:107-133.
2. Moran JL, Solomon PJ. A review of statistical estimators for risk-adjusted length of stay: analysis of the Australian and New Zealand intensive care adult patient database, 2008-2009. BMC Med Res Methodol. 2012;12:68.
3. Verburg IWM, de Keizer NF, de Jonge E, Peek N. Comparison of regression methods for modeling intensive care length of stay. PLoS One. 2014;9:e109684.
4. Beyersmann J, Schumacher M. Time-dependent covariates in the proportional subdistribution hazards model for competing risks. Biostatistics. 2008;9:765-776.
5. Latouche A, Porcher R, Chevret S. A note on including time-dependent covariate in regression model for competing risks data. Biom J. 2005;47:807-814.
6. Fitzmaurice GM, Laird NM, Ware JH. Marginal models: generalized estimating equations. In: Applied Longitudinal Analysis. 2nd ed. John Wiley & Sons; 2011:353-394.
7. Escobar GJ, Greene JD, Scheirer P, Gardner MN, Draper D, Kipnis P. Risk-adjusting hospital inpatient mortality using automated inpatient, outpatient, and laboratory databases. Med Care. 2008;46:232-239.
8. van Walraven C, Escobar GJ, Greene JD, Forster AJ. The Kaiser Permanente inpatient risk adjustment methodology was valid in an external patient population. J Clin Epidemiol. 2010;63:798-803.
9. Bou-Hamad I, Larocque D, Ben-Ameur H. A review of survival trees. Statist Surv. 2011;44-71.
10. van Walraven C, Bell CM. Risk of death or readmission among people discharged from hospital on Fridays. CMAJ. 2002;166:1672-1673.
11. Barnes S, Hamrock E, Toerper M, Siddiqui S, Levin S. Real-time prediction of inpatient length of stay for discharge prioritization. J Am Med Inform Assoc. 2016;23:e2-e10.
12. Levin SRP, Harley ETB, Fackler JCM, et al. Real-time forecasting of pediatric intensive care unit length of stay using computerized provider orders. Crit Care Med. 2012;40:3058-3064.
13. Resar R, Nolan K, Kaczynski D, Jensen K. Using real-time demand capacity management to improve hospitalwide patient flow. Jt Comm J Qual Patient Saf. 2011;37:217-227.
14. de Grood A, Blades K, Pendharkar SR. A review of discharge prediction processes in acute care hospitals. Healthc Policy. 2016;12:105-115.
15. van Walraven C, Hart RG. Leave 'em alone - why continuous variables should be analyzed as such. Neuroepidemiology. 2008;30:138-139.
© 2018 Society of Hospital Medicine
Impact of Displaying Inpatient Pharmaceutical Costs at the Time of Order Entry: Lessons From a Tertiary Care Center
In response to rising healthcare costs in the United States, broad efforts are underway to identify and reduce waste in the health system.1,2 A recent systematic review showed that many physicians inaccurately estimate the cost of medications.3 Raising awareness of medication costs among prescribers is one potential way to promote high-value care.
Some evidence suggests that cost transparency may help prescribers understand how medication orders drive costs. In a previous study carried out at the Johns Hopkins Hospital, fee data were displayed to providers for diagnostic laboratory tests.4 An 8.6% decrease (95% confidence interval [CI], –8.99% to –8.19%) in test ordering was observed when costs were displayed vs a 5.6% increase (95% CI, 4.90% to 6.39%) in ordering when costs were not displayed during a 6-month intervention period (P < 0.001). Conversely, a similar study that investigated the impact of cost transparency on inpatient imaging utilization did not demonstrate a significant influence of cost display.5 This suggests that cost transparency may work in some areas of care but not in others. A systematic review that investigated price-display interventions for imaging, laboratory studies, and medications reported 10 studies that demonstrated a statistically significant decrease in expenditures without an effect on patient safety.6
Informing prescribers of institution-specific medication costs within and between drug classes may enable the selection of less expensive, therapeutically equivalent drugs. Prior studies investigating the effect of medication cost display were conducted in a variety of patient care settings, including ambulatory clinics,7 urgent care centers,8 and operating rooms,9,10 with some yielding positive results in terms of ordering and cost11,12 and others having no impact.13,14 Currently, there is little evidence specifically addressing the effect of cost display for medications in the inpatient setting.
As part of an institutional initiative to control pharmaceutical expenditures, informational messaging for several high-cost drugs was initiated at our tertiary care hospital in April 2015. The goal of our study was to assess the effect of these medication cost messages on ordering practices. We hypothesized that the display of inpatient pharmaceutical costs at the time of order entry would result in a reduction in ordering.
METHODS
Setting, Intervention, and Participants
As part of an effort to educate prescribers about the high cost of medications, 9 intravenous (IV) medications were selected by the Johns Hopkins Hospital Pharmacy and Therapeutics Committee as targets for drug cost messaging. The intention of the committee was to implement a rapid, low-cost, proof-of-concept, quality-improvement project that was not designed as prospective research. Representatives from the pharmacy and clinicians from relevant clinical areas participated in preimplementation discussions to help identify medications that were subjectively felt to be overused at our institution and potentially modifiable through provider education. The criteria for selecting drug targets included a variety of factors, such as medications infrequently ordered but representing a significant cost per dose (eg, eculizumab and ribavirin), frequently ordered medications with less expensive substitutes (eg, linezolid and voriconazole), and high-cost medications without direct therapeutic alternatives (eg, calcitonin). From April 10, 2015, to October 5, 2015, the computerized Provider Order Entry System (cPOE), Sunrise Clinical Manager (Allscripts Corporation, Chicago, IL), displayed the cost for targeted medications. Seven of the medication alerts also included a reasonable therapeutic alternative and its cost. There were no restrictions placed on ordering; prescribers were able to choose the high-cost medications at their discretion.
Despite the fact that this initiative was not designed as a research project, we felt it was important to formally evaluate the impact of the drug cost messaging effort to inform future quality-improvement interventions. Each medication was compared to its preintervention baseline utilization dating back to January 1, 2013. For the 7 medications with alternatives offered, we also analyzed use of the suggested alternative during these time periods.
Data Sources and Measurement
Our study utilized data obtained from the pharmacy order verification system and the cPOE database. Data were collected over a period of 143 weeks, from January 1, 2013, to October 5, 2015, to allow for a baseline period (January 1, 2013, to April 9, 2015) and an intervention period (April 10, 2015, to October 5, 2015). Data elements extracted included drug characteristics (dosage form, route, cost, strength, name, and quantity), patient characteristics (race, gender, and age), clinical setting (facility location, inpatient or outpatient), and billing information (provider name, doses dispensed from pharmacy, order number, revenue or procedure code, record number, date of service, and unique billing number) for each admission. Using these elements, we generated the following 8 variables for our analyses: week, month, period identifier, drug name, dosage form, weekly orders, weekly patient days, and number of weekly orders per 10,000 patient days. Average wholesale price (AWP), referred to as medication cost in this manuscript, was used for all drug costs and associated cost calculations. While the actual acquisition cost and the price charged to the patient may vary based on several factors, including manufacturer and payer, we chose AWP as a generalizable estimate of the hospital's cost of acquiring each drug.
Variables
“Week” and “month” were defined as the week and month of our study, respectively. The “period identifier” was a binary variable that distinguished the time periods before and after the intervention. “Weekly orders” was defined as the total number of new orders placed per week for each drug included in our study; for example, if a patient received 2 discrete new orders for a medication in a given week, 2 orders were counted toward the “weekly orders” variable. “Patient days,” the daily count of patients being treated at our facility, was summed over each week of our study to yield “weekly patient days.” To derive the “number of weekly orders per 10,000 patient days,” we divided weekly orders by weekly patient days and multiplied the result by 10,000.
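As a small worked sketch of the rate derivation (our illustration; names hypothetical):

```python
# Weekly orders per 10,000 patient days, per the definition above.
import pandas as pd

def orders_per_10k(weekly_orders: pd.Series, weekly_patient_days: pd.Series) -> pd.Series:
    return weekly_orders / weekly_patient_days * 10_000

# e.g., 24 new orders in a week with 12,000 patient days -> 20.0
```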
Statistical Analysis
Segmented regression, a form of interrupted time series analysis, is a quasi-experimental design that was used to determine the immediate and sustained effects of the drug cost messages on the rate of medication ordering.15-17 The model enabled the use of comparison groups (alternative medications, as described above) to enhance internal validity.
In time series data, outcomes may not be independent over time. Autocorrelation of the error terms can arise when outcomes at nearby time points are more similar than outcomes at distant time points; failure to account for it leads to underestimated standard errors. Autocorrelation, assessed with the Durbin-Watson statistic, was significant in our data. To adjust for this, we used Prais-Winsten estimation to adjust the error term (ε_t) in our models.
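As a rough illustration (not the authors' STATA code), the following Python sketch checks residuals for autocorrelation and refits with an AR(1) error structure. Note that statsmodels' GLSAR, an iterative Cochrane-Orcutt-type estimator, is used here as a close stand-in for Prais-Winsten estimation, which statsmodels does not provide directly; variable names are hypothetical.

```python
# Detect AR(1) autocorrelation with the Durbin-Watson statistic, then
# refit the regression allowing an AR(1) error term.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

def fit_with_ar1(y: np.ndarray, X: np.ndarray):
    ols = sm.OLS(y, X).fit()
    dw = durbin_watson(ols.resid)   # values near 2 suggest no autocorrelation
    print(f"Durbin-Watson statistic: {dw:.2f}")
    return sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
```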
Two segmented linear regression models were used to estimate trends in ordering before and after the intervention. The presence or absence of a comparator drug determined which model was to be used. When only single medications were under study, as in the case of eculizumab and calcitonin, our regression model was as follows:
$$Y_t = \beta_0 + \beta_1(\text{Time}_t) + \beta_2(\text{Intervention}_t) + \beta_3(\text{Post-Intervention Time}_t) + \varepsilon_t$$
In our single-drug model, Y_t denoted the number of orders per 10,000 patient days at week t. Time_t was a continuous variable indicating the number of weeks before or after the study intervention (April 10, 2015) and ranged from −116 to 27 weeks. Intervention_t was an indicator equal to 0 before the intervention and 1 afterward. Post-Intervention Time_t was a continuous variable denoting the number of weeks since the start of the intervention, coded as 0 for all time periods before the intervention. β0 was the estimated baseline number of orders per 10,000 patient days at the beginning of the study; β1 was the weekly trend in orders per 10,000 patient days during the preintervention period; β2 was the estimated change in the number of orders per 10,000 patient days immediately after the intervention; β3 was the difference between the preintervention and postintervention slopes; and ε_t was the error term, which captured autocorrelation and random variability in the data.
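To make the coding of these terms concrete, here is a minimal sketch of the single-drug design matrix (our illustration in Python; the authors used STATA, and the exact indexing of the intervention week is our assumption):

```python
# Segmented-regression design: Time (weekly index), Intervention
# (0/1 step at the intervention start), Post-Intervention Time
# (weeks since the start, 0 beforehand).
import numpy as np
import statsmodels.api as sm

weeks = np.arange(-116, 28)                  # week index around April 10, 2015
intervention = (weeks >= 0).astype(float)    # step: 0 before, 1 after
post_time = np.where(weeks >= 0, weeks, 0)   # ramp after the intervention

X = sm.add_constant(np.column_stack([weeks, intervention, post_time]))
# y = weekly orders per 10,000 patient days; then, for example:
# fit = sm.OLS(y, X).fit()   # or the AR(1) refit shown earlier
```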
As mentioned previously, alternative dosage forms of 7 medications included in our study were utilized as comparison groups. In these instances (when multiple drugs were included in our analyses), the following regression model was applied:
$$Y_t = \beta_0 + \beta_1(\text{Time}_t) + \beta_2(\text{Intervention}_t) + \beta_3(\text{Post-Intervention Time}_t) + \beta_4(\text{Cohort}) + \beta_5(\text{Cohort}\times\text{Time}_t) + \beta_6(\text{Cohort}\times\text{Intervention}_t) + \beta_7(\text{Cohort}\times\text{Post-Intervention Time}_t) + \varepsilon_t$$
Here, 4 coefficients were added (β4-β7) to describe an additional cohort of orders. Cohort, a binary indicator variable, held a value of 1 when the model described the treatment group and 0 when it described the comparison group, so the coefficients β4-β7 described the treatment group relative to the comparison group described by β0-β3. β4 was the difference in the number of baseline orders per 10,000 patient days between the treatment and comparison groups; β5 was the difference between the estimated ordering trends of the treatment and comparison groups; β6 was the difference between the 2 groups in the immediate change in the number of orders per 10,000 patient days following the intervention; and β7 was the difference between the 2 groups in the change in slope from the preintervention to the postintervention period.
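A sketch of how the cohort interactions extend the earlier design (again our illustration, with Cohort coded 1 for the treatment drug and 0 for the comparator):

```python
# Two-group segmented-regression design; columns map to β0..β7 in the
# equation above (const, Time, Intervention, Post-Intervention Time,
# Cohort, and the three Cohort interactions).
import numpy as np
import statsmodels.api as sm

def two_group_design(weeks: np.ndarray, cohort: np.ndarray) -> np.ndarray:
    """weeks: week index per observation; cohort: 1 = treatment drug,
    0 = comparison drug (our assumed coding)."""
    intervention = (weeks >= 0).astype(float)
    post_time = np.where(weeks >= 0, weeks, 0).astype(float)
    base = np.column_stack([weeks.astype(float), intervention, post_time])
    return sm.add_constant(np.column_stack([base, cohort, base * cohort[:, None]]))
```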
The number of orders per week was recorded for each medicine, which enabled a large number of data points to be included in our analyses. This allowed for more accurate and stable estimates to be made in our regression model. A total of 143 data points were collected for each study group, 116 before and 27 following each intervention.
All analyses were conducted by using STATA version 13.1 (StataCorp LP, College Station, TX).
RESULTS
Initial results pertaining to 9 IV medications were examined (Table). Following the implementation of cost messaging, no significant changes were observed in order frequency or trend for IV formulations of eculizumab, calcitonin, levetiracetam, linezolid, mycophenolate, ribavirin, voriconazole, and levothyroxine (Figures 1 and 2). However, a significant decrease in the number of oral ribavirin orders (Figure 2), the control group for the IV form, was observed (–16.3 orders per 10,000 patient days; P = .004; 95% CI, –27.2 to –5.31).
DISCUSSION
Our results suggest that the passive strategy of displaying cost alone was not effective in altering prescriber ordering patterns for the selected medications. This may be due to a lack of awareness regarding direct financial impact on the patient, importance of costs in medical decision-making, or a perceived lack of alternatives or suitability of recommended alternatives. These results may prove valuable to hospital and pharmacy leadership as they develop strategies to curb medication expense.
Changes observed in IV pantoprazole ordering are instructive. Due to a national shortage, the IV form of this medication underwent a restriction, which required approval by the pharmacy prior to dispensing. This restriction was instituted independently of our study and led to a 73% decrease from usage rates prior to policy implementation (Figure 3). Ordering was restricted according to defined criteria for IV use. The restriction did not apply to oral pantoprazole, and no significant change in ordering of the oral formulation was noted during the evaluated period (Figure 3).
The dramatic effect of policy changes, as observed with pantoprazole and voriconazole, suggests that a more active strategy may have a greater impact on prescriber behavior when it comes to medication ordering in the inpatient setting. It also highlights several potential sources of confounding that may introduce bias to cost-transparency studies.
This study has multiple limitations. First, as with all observational study designs, causation cannot be established with certainty from our results. While we were able to compare medications to their preintervention baselines, the data could have been affected by longitudinal or seasonal trends in medication ordering, driven, for example, by seasonal variability in disease prevalence, changes in resistance patterns, and the annual cycling of house staff in an academic medical center. While there appear to be potential seasonal patterns in prescribing for some of the medications in this analysis, we believe the linear regressions adequately capture the overall trends. Nonstationarity, or trends in the mean and variance of the outcome that are not related to the intervention, may bias the interpretation of our findings. However, we believe the parameters included in our models, namely the immediate change in the intercept following the intervention and the change in the trend of the rate of prescribing from pre- to postintervention, provide substantial protection from faulty interpretation; our models are limited to the extent that these parameters do not account for nonstationarity. Additionally, we did not collect data on dosing frequency or duration of treatment, which depend on factors that are not readily quantified, such as indication, clinical rationale, and patient response. Thus, we were not able to evaluate the impact of the intervention on these factors.
Although intended to enhance internal validity, comparison groups were also subject to external influence. For example, we observed a significant, short-lived rise in oral ribavirin (a control medication) ordering during the preintervention baseline period that appeared to be independent of our intervention and may speak to the unaccounted-for longitudinal variability detailed above.
Finally, the clinical indication and setting may be important. Previous studies performed at the same hospital with price displays showed a reduction in laboratory ordering but no change in imaging.18,19 One might speculate that ordering fewer laboratory tests is viewed by providers as eliminating waste rather than choosing a less expensive option to accomplish the same diagnostic task at hand. Therapeutics may be more similar to radiology tests, because patients presumably need the treatment and often do not have the option of simply not ordering without a concerted effort to reevaluate the treatment plan. Additionally, in a tertiary care teaching center such as ours, a junior clinician, oftentimes at the behest of a more senior colleague, enters most orders. In an environment in which the ordering prescriber has more autonomy or when the order is driven by a junior practitioner rather than an attending (such as daily laboratories), results may be different. Additionally, institutions that incentivize prescribers directly to practice cost-conscious care may experience different results from similar interventions.
We conclude that, in the case of medication cost messaging, a strategy of displaying cost information alone was insufficient to affect prescriber ordering behavior. Coupling cost transparency with educational interventions and active stewardship to impact clinical practice is worthy of further study.
Disclosures: The authors state that there were no external sponsors for this work. The Johns Hopkins Hospital and University “funded” this work by paying the salaries of the authors. The author team maintained independence and made all decisions regarding the study design, data collection, data analysis, interpretation of results, writing of the research report, and decision to submit it for publication. Dr. Shermock had full access to all the study data and takes responsibility for the integrity of the data and accuracy of the data analysis.
1. Berwick DM, Hackbarth AD. Eliminating waste in US health care. JAMA. 2012;307(14):1513-1516.
2. PricewaterhouseCoopers' Health Research Institute. The Price of Excess: Identifying Waste in Healthcare Spending. http://www.pwc.com/us/en/healthcare/publications/the-price-of-excess.html. Accessed June 17, 2015.
3. Allan GM, Lexchin J, Wiebe N. Physician awareness of drug cost: a systematic review. PLoS Med. 2007;4(9):e283.
4. Feldman LS, Shihab HM, Thiemann D, et al. Impact of providing fee data on laboratory test ordering: a controlled clinical trial. JAMA Intern Med. 2013;173(10):903-908.
5. Durand DJ, Feldman LS, Lewin JS, Brotman DJ. Provider cost transparency alone has no impact on inpatient imaging utilization. J Am Coll Radiol. 2013;10(2):108-113.
6. Silvestri MT, Bongiovanni TR, Glover JG, Gross CP. Impact of price display on provider ordering: a systematic review. J Hosp Med. 2016;11(1):65-76.
7. Ornstein SM, MacFarlane LL, Jenkins RG, Pan Q, Wager KA. Medication cost information in a computer-based patient record system: impact on prescribing in a family medicine clinical practice. Arch Fam Med. 1999;8(2):118-121.
8. Guterman JJ, Chernof BA, Mares B, Gross-Schulman SG, Gan PG, Thomas D. Modifying provider behavior: a low-tech approach to pharmaceutical ordering. J Gen Intern Med. 2002;17(10):792-796.
9. McNitt JD, Bode ET, Nelson RE. Long-term pharmaceutical cost reduction using a data management system. Anesth Analg. 1998;87(4):837-842.
10. Horrow JC, Rosenberg H. Price stickers do not alter drug usage. Can J Anaesth. 1994;41(11):1047-1052.
11. Guterman JJ, Chernof BA, Mares B, Gross-Schulman SG, Gan PG, Thomas D. Modifying provider behavior: a low-tech approach to pharmaceutical ordering. J Gen Intern Med. 2002;17(10):792-796.
12. McNitt JD, Bode ET, Nelson RE. Long-term pharmaceutical cost reduction using a data management system. Anesth Analg. 1998;87(4):837-842.
13. Ornstein SM, MacFarlane LL, Jenkins RG, Pan Q, Wager KA. Medication cost information in a computer-based patient record system: impact on prescribing in a family medicine clinical practice. Arch Fam Med. 1999;8(2):118-121.
14. Horrow JC, Rosenberg H. Price stickers do not alter drug usage. Can J Anaesth. 1994;41(11):1047-1052.
15. Jandoc R, Burden AM, Mamdani M, Levesque LE, Cadarette SM. Interrupted time series analysis in drug utilization research is increasing: systematic review and recommendations. J Clin Epidemiol. 2015;68(8):950-956.
16. Linden A. Conducting interrupted time-series analysis for single- and multiple-group comparisons. Stata J. 2015;15(2):480-500.
17. Linden A, Adams JL. Applying a propensity score-based weighting model to interrupted time series data: improving causal inference in programme evaluation. J Eval Clin Pract. 2011;17(6):1231-1238.
18. Feldman LS, Shihab HM, Thiemann D, et al. Impact of providing fee data on laboratory test ordering: a controlled clinical trial. JAMA Intern Med. 2013;173(10):903-908.
19. Durand DJ, Feldman LS, Lewin JS, Brotman DJ. Provider cost transparency alone has no impact on inpatient imaging utilization. J Am Coll Radiol. 2013;10(2):108-113.
This study has multiple limitations. First, as with all observational study designs, causation cannot be drawn with certainty from our results. While we were able to compare medications to their preintervention baselines, the data could have been impacted by longitudinal or seasonal trends in medication ordering, which may have been impacted by seasonal variability in disease prevalence, changes in resistance patterns, and annual cycling of house staff in an academic medical center. While there appear to be potential seasonal patterns regarding prescribing patterns for some of the medications included in this analysis, we also believe the linear regressions capture the overall trends in prescribing adequately. Nonstationarity, or trends in the mean and variance of the outcome that are not related to the intervention, may introduce bias in the interpretation of our findings. However, we believe the parameters included in our models, namely the immediate change in the intercept following the intervention and the change in the trend of the rate of prescribing over time from pre- to postintervention, provide substantial protections from faulty interpretation. Our models are limited to the extent that these parameters do not account for nonstationarity. Additionally, we did not collect data on dosing frequency or duration of treatment, which would have been dependent on factors that are not readily quantified, such as indication, clinical rationale, or patient response. Thus, we were not able to evaluate the impact of the intervention on these factors.
Although intended to enhance internal validity, comparison groups were also subject to external influence. For example, we observed a significant, short-lived rise in oral ribavirin (a control medication) ordering during the preintervention baseline period that appeared to be independent of our intervention and may speak to the unaccounted-for longitudinal variability detailed above.
Finally, the clinical indication and setting may be important. Previous studies performed at the same hospital with price displays showed a reduction in laboratory ordering but no change in imaging.18,19 One might speculate that ordering fewer laboratory tests is viewed by providers as eliminating waste rather than choosing a less expensive option to accomplish the same diagnostic task at hand. Therapeutics may be more similar to radiology tests, because patients presumably need the treatment and often do not have the option of simply not ordering without a concerted effort to reevaluate the treatment plan. Additionally, in a tertiary care teaching center such as ours, a junior clinician, oftentimes at the behest of a more senior colleague, enters most orders. In an environment in which the ordering prescriber has more autonomy or when the order is driven by a junior practitioner rather than an attending (such as daily laboratories), results may be different. Additionally, institutions that incentivize prescribers directly to practice cost-conscious care may experience different results from similar interventions.
We conclude that, in the case of medication cost messaging, a strategy of displaying cost information alone was insufficient to affect prescriber ordering behavior. Coupling cost transparency with educational interventions and active stewardship to impact clinical practice is worthy of further study.
Disclosures: The authors state that there were no external sponsors for this work. The Johns Hopkins Hospital and University “funded” this work by paying the salaries of the authors. The author team maintained independence and made all decisions regarding the study design, data collection, data analysis, interpretation of results, writing of the research report, and decision to submit it for publication. Dr. Shermock had full access to all the study data and takes responsibility for the integrity of the data and accuracy of the data analysis.
In response to rising healthcare costs in the United States, broad efforts are underway to identify and reduce waste in the health system.1,2 A recent systematic review showed that many physicians inaccurately estimate the cost of medications.3 Raising awareness of medication costs among prescribers is one potential way to promote high-value care.
Some evidence suggests that cost transparency may help prescribers understand how medication orders drive costs. In a previous study carried out at the Johns Hopkins Hospital, fee data were displayed to providers for diagnostic laboratory tests.4 An 8.6% decrease (95% confidence interval [CI], –8.99% to –8.19%) in test ordering was observed when costs were displayed vs a 5.6% increase (95% CI, 4.90% to 6.39%) in ordering when costs were not displayed during a 6-month intervention period (P < 0.001). Conversely, a similar study that investigated the impact of cost transparency on inpatient imaging utilization did not demonstrate a significant influence of cost display.5 This suggests that cost transparency may work in some areas of care but not in others. A systematic review that investigated price-display interventions for imaging, laboratory studies, and medications reported 10 studies that demonstrated a statistically significant decrease in expenditures without an effect on patient safety.6
Informing prescribers of institution-specific medication costs within and between drug classes may enable the selection of less expensive, therapeutically equivalent drugs. Prior studies investigating the effect of medication cost display were conducted in a variety of patient care settings, including ambulatory clinics,7 urgent care centers,8 and operating rooms,9,10 with some yielding positive results in terms of ordering and cost11,12 and others having no impact.13,14 Currently, there is little evidence specifically addressing the effect of cost display for medications in the inpatient setting.
As part of an institutional initiative to control pharmaceutical expenditures, informational messaging for several high-cost drugs was initiated at our tertiary care hospital in April 2015. The goal of our study was to assess the effect of these medication cost messages on ordering practices. We hypothesized that the display of inpatient pharmaceutical costs at the time of order entry would result in a reduction in ordering.
METHODS
Setting, Intervention, and Participants
As part of an effort to educate prescribers about the high cost of medications, 9 intravenous (IV) medications were selected by the Johns Hopkins Hospital Pharmacy and Therapeutics Committee as targets for drug cost messaging. The intention of the committee was to implement a rapid, low-cost, proof-of-concept, quality-improvement project that was not designed as prospective research. Representatives from the pharmacy and clinicians from relevant clinical areas participated in preimplementation discussions to help identify medications that were subjectively felt to be overused at our institution and potentially modifiable through provider education. The criteria for selecting drug targets included a variety of factors, such as medications infrequently ordered but representing a significant cost per dose (eg, eculizumab and ribavirin), frequently ordered medications with less expensive substitutes (eg, linezolid and voriconazole), and high-cost medications without direct therapeutic alternatives (eg, calcitonin). From April 10, 2015, to October 5, 2015, the computerized Provider Order Entry System (cPOE), Sunrise Clinical Manager (Allscripts Corporation, Chicago, IL), displayed the cost for targeted medications. Seven of the medication alerts also included a reasonable therapeutic alternative and its cost. There were no restrictions placed on ordering; prescribers were able to choose the high-cost medications at their discretion.
Although this initiative was not designed as a research project, we felt it was important to formally evaluate the impact of the drug cost messaging effort to inform future quality-improvement interventions. Each medication was compared with its preintervention baseline utilization dating back to January 1, 2013. For the 7 medications with alternatives offered, we also analyzed use of the suggested alternative during these time periods.
Data Sources and Measurement
Our study utilized data obtained from the pharmacy order verification system and the cPOE database. Data were collected over 143 weeks, from January 1, 2013, to October 5, 2015, to allow for a baseline period (January 1, 2013, to April 9, 2015) and an intervention period (April 10, 2015, to October 5, 2015). Data elements extracted included drug characteristics (dosage form, route, cost, strength, name, and quantity), patient characteristics (race, gender, and age), clinical setting (facility location, inpatient or outpatient), and billing information (provider name, doses dispensed from pharmacy, order number, revenue or procedure code, record number, date of service, and unique billing number) for each admission. Using these elements, we generated the following 8 variables for our analyses: week, month, period identifier, drug name, dosage form, weekly orders, weekly patient days, and number of weekly orders per 10,000 patient days. Average wholesale price (AWP), referred to as medication cost in this manuscript, was used for all drug costs and associated cost calculations. While the actual acquisition cost and the price charged to the patient may vary based on several factors, including manufacturer and payer, we chose AWP as a generalizable estimate of the hospital's cost of acquiring the drug.
Variables
“Week” and “month” were defined as the week and month of our study, respectively. The “period identifier” was a binary variable distinguishing the time periods before and after the intervention. “Weekly orders” was defined as the total number of new orders placed per week for each specified drug included in our study; for example, if a patient received 2 discrete new orders for a medication in a given week, 2 orders were counted toward this variable. “Patient days,” defined as the number of patients treated at our facility each day, was summed over each week of our study to yield “weekly patient days.” To derive the “number of weekly orders per 10,000 patient days,” we divided weekly orders by weekly patient days and multiplied the result by 10,000.
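To make the derivation concrete, the following is a minimal sketch of how these variables could be assembled in Python with pandas. The input files, tables, and column names (orders.csv, census.csv, order_date, drug_name, date, patients) are hypothetical stand-ins, not the study's actual data structures.

```python
import pandas as pd

# Hypothetical inputs: one row per new medication order, and a daily census
# table from which patient days are derived.
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
census = pd.read_csv("census.csv", parse_dates=["date"])

# "Weekly orders": count of new orders per drug per calendar week.
weekly_orders = (
    orders
    .groupby([pd.Grouper(key="order_date", freq="W"), "drug_name"])
    .size()
    .rename("weekly_orders")
    .reset_index()
)

# "Weekly patient days": daily patient counts summed over each week.
weekly_patient_days = (
    census
    .groupby(pd.Grouper(key="date", freq="W"))["patients"]
    .sum()
    .rename("weekly_patient_days")
)

rates = weekly_orders.merge(
    weekly_patient_days, left_on="order_date", right_index=True
)

# "Number of weekly orders per 10,000 patient days," as defined above.
rates["orders_per_10k_patient_days"] = (
    rates["weekly_orders"] / rates["weekly_patient_days"] * 10_000
)
```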
Statistical Analysis
Segmented regression, a form of interrupted time series analysis, is a quasi-experimental design; we used it to determine the immediate and sustained effects of the drug cost messages on the rate of medication ordering.15-17 The model enabled the use of comparison groups (the alternative medications described above) to enhance internal validity.
In time series data, outcomes may not be independent over time. Autocorrelation of the error terms can arise when outcomes at time points close together are more similar than outcomes at time points further apart, and failing to account for it may lead to underestimated standard errors. Autocorrelation, assessed with the Durbin-Watson statistic, was significant in our data. To adjust for it, we employed Prais-Winsten estimation to adjust the error term (εt) in our models.
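As an illustration of this workflow, the sketch below simulates an AR(1) weekly series, computes the Durbin-Watson statistic on ordinary least squares residuals, and refits with statsmodels' GLSAR, an iterative Cochrane-Orcutt-type feasible GLS correction closely related to the Prais-Winsten estimation used here (available in Stata as the prais command). The simulated data are placeholders, not study values.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Placeholder weekly series with AR(1) errors, standing in for the observed
# orders per 10,000 patient days (143 weeks, as in the study period).
rng = np.random.default_rng(0)
n = 143
time = np.arange(n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.5 * e[t - 1] + rng.normal()
y = 50.0 - 0.05 * time + e

X = sm.add_constant(time)
ols_fit = sm.OLS(y, X).fit()

# Values near 2 suggest no first-order autocorrelation; values well below 2
# suggest positive autocorrelation of the residuals.
print("Durbin-Watson:", durbin_watson(ols_fit.resid))

# AR(1)-corrected fit; standard errors then account for serial correlation.
gls_fit = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
print(gls_fit.params, gls_fit.bse)
```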
Two segmented linear regression models were used to estimate trends in ordering before and after the intervention. The presence or absence of a comparator drug determined which model was to be used. When only single medications were under study, as in the case of eculizumab and calcitonin, our regression model was as follows:
Yt = (β0) + (β1)(Timet) + (β2)(Interventiont) + (β3)(Post-Intervention Timet) + (εt)
In our single-drug model, Yt denoted the number of orders per 10,000 patient days at week “t.” Timet was a continuous variable indicating the number of weeks before or after the start of the intervention (April 10, 2015) and ranged from –116 to 27 weeks. Interventiont was an indicator variable coded as 0 before the intervention and 1 thereafter. Post-Intervention Timet was a continuous variable denoting the number of weeks since the start of the intervention, coded as 0 for all time periods before the intervention. β0 was the estimated baseline number of orders per 10,000 patient days at the beginning of the study; β1 was the trend in orders per 10,000 patient days per week during the preintervention period; β2 was the estimated change in the number of orders per 10,000 patient days immediately after the intervention; β3 was the difference between the preintervention and postintervention slopes; and εt was the error term, representing autocorrelation and random variability of the data.
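A minimal sketch of this single-group model, using placeholder outcomes and the statsmodels formula interface, could look as follows; the intercept and the coefficients on time, intervention, and post_time correspond to β0 through β3. The column names and simulated outcome are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Weeks relative to the intervention: 116 preintervention and 27
# postintervention weekly points, matching the 143 observations in the study.
weeks = np.arange(-116, 27)
df = pd.DataFrame({
    "time": weeks,
    "intervention": (weeks >= 0).astype(int),     # 0 before, 1 after
    "post_time": np.where(weeks >= 0, weeks, 0),  # weeks since intervention start
})
df["y"] = np.random.default_rng(1).normal(40, 3, len(df))  # placeholder outcome

# y ~ beta0 + beta1*time + beta2*intervention + beta3*post_time
fit = smf.ols("y ~ time + intervention + post_time", data=df).fit()
print(fit.params)
```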
As mentioned previously, alternative dosage forms of 7 medications included in our study were utilized as comparison groups. In these instances (when multiple drugs were included in our analyses), the following regression model was applied:
Yt = (β0) + (β1)(Timet) + (β2)(Interventiont) + (β3)(Post-Intervention Timet) + (β4)(Cohort) + (β5)(Cohort)(Timet) + (β6)(Cohort)(Interventiont) + (β7)(Cohort)(Post-Intervention Timet) + (εt)
Here, 4 coefficients were added (β4-β7) to describe an additional cohort of orders. Cohort, a binary indicator variable, was coded as 1 when the model described the treatment group and 0 for the comparison group; thus, β0-β3 described the comparison group, and β4-β7 described the treatment-group differences. β4 was the difference in the number of baseline orders per 10,000 patient days between the treatment and comparison groups; β5 represented the difference between the estimated preintervention ordering trends of the treatment and comparison groups; β6 indicated the difference between the 2 groups in the immediate change in the number of orders per 10,000 patient days following the intervention; and β7 denoted the difference between the 2 groups in the change in slope from the preintervention to the postintervention period.
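Under the same assumptions as the single-group sketch above, the two-group model can be written as a full interaction with the cohort indicator: the main effects recover β0-β3 for the comparison group, and the cohort terms recover β4-β7. The simulated series are placeholders.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

weeks = np.arange(-116, 27)
rng = np.random.default_rng(2)
frames = []
for cohort in (0, 1):  # 0 = comparison (alternative drug), 1 = treatment (targeted drug)
    frames.append(pd.DataFrame({
        "time": weeks,
        "intervention": (weeks >= 0).astype(int),
        "post_time": np.where(weeks >= 0, weeks, 0),
        "cohort": cohort,
        "y": rng.normal(40.0 - 5.0 * cohort, 3.0, len(weeks)),  # placeholder series
    }))
df = pd.concat(frames, ignore_index=True)

# Main effects give beta0-beta3 (comparison group); the cohort interactions
# give beta4-beta7, the treatment-group differences described in the text.
fit = smf.ols("y ~ (time + intervention + post_time) * cohort", data=df).fit()
print(fit.params)
```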
The number of orders per week was recorded for each medication, which enabled a large number of data points to be included in our analyses and allowed more accurate and stable estimates in our regression models. A total of 143 data points were collected for each study group: 116 before and 27 following the intervention.
All analyses were conducted using Stata version 13.1 (StataCorp LP, College Station, TX).
RESULTS
Initial results pertaining to 9 IV medications were examined (Table). Following the implementation of cost messaging, no significant changes were observed in order frequency or trend for IV formulations of eculizumab, calcitonin, levetiracetam, linezolid, mycophenolate, ribavirin, voriconazole, and levothyroxine (Figures 1 and 2). However, a significant decrease in the number of oral ribavirin orders (Figure 2), the control group for the IV form, was observed (–16.3 orders per 10,000 patient days; P = .004; 95% CI, –27.2 to –5.31).
DISCUSSION
Our results suggest that the passive strategy of displaying cost alone was not effective in altering prescriber ordering patterns for the selected medications. This may reflect a lack of prescriber awareness of the direct financial impact on the patient or of the importance of cost in medical decision-making, or a perceived lack of alternatives or of the suitability of the recommended alternatives. These results may prove valuable to hospital and pharmacy leadership as they develop strategies to curb medication expense.
Changes observed in IV pantoprazole ordering are instructive. Due to a national shortage, the IV form of this medication underwent a restriction, which required approval by the pharmacy prior to dispensing. This restriction was instituted independently of our study and led to a 73% decrease from usage rates prior to policy implementation (Figure 3). Ordering was restricted according to defined criteria for IV use. The restriction did not apply to oral pantoprazole, and no significant change in ordering of the oral formulation was noted during the evaluated period (Figure 3).
The dramatic effect of policy changes, as observed with pantoprazole and voriconazole, suggests that a more active strategy may have a greater impact on prescriber behavior when it comes to medication ordering in the inpatient setting. It also highlights several potential sources of confounding that may introduce bias to cost-transparency studies.
This study has multiple limitations. First, as with all observational study designs, causation cannot be drawn with certainty from our results. While we were able to compare medications with their preintervention baselines, the data could have been affected by longitudinal or seasonal trends in medication ordering, driven by factors such as seasonal variability in disease prevalence, changes in resistance patterns, and the annual cycling of house staff in an academic medical center. While there appear to be potential seasonal patterns in prescribing for some of the medications included in this analysis, we believe the linear regressions adequately capture the overall prescribing trends. Nonstationarity, or trends in the mean and variance of the outcome that are unrelated to the intervention, may bias the interpretation of our findings. We believe the parameters included in our models, namely the immediate change in the intercept following the intervention and the change in the prescribing trend from the pre- to the postintervention period, provide substantial protection from faulty interpretation; however, our models are limited to the extent that these parameters do not account for nonstationarity. Additionally, we did not collect data on dosing frequency or duration of treatment, which depend on factors that are not readily quantified, such as indication, clinical rationale, and patient response; thus, we were unable to evaluate the impact of the intervention on these factors.
Although intended to enhance internal validity, comparison groups were also subject to external influence. For example, we observed a significant, short-lived rise in oral ribavirin (a control medication) ordering during the preintervention baseline period that appeared to be independent of our intervention and may speak to the unaccounted-for longitudinal variability detailed above.
Finally, the clinical indication and setting may be important. Previous studies performed at the same hospital with price displays showed a reduction in laboratory ordering but no change in imaging.18,19 One might speculate that ordering fewer laboratory tests is viewed by providers as eliminating waste, rather than as choosing a less expensive option to accomplish the same diagnostic task. Therapeutics may be more similar to radiology tests: patients presumably need the treatment, and forgoing an order usually requires a concerted effort to reevaluate the treatment plan. Additionally, in a tertiary care teaching center such as ours, most orders are entered by a junior clinician, oftentimes at the behest of a more senior colleague. In an environment in which the ordering prescriber has more autonomy, or when the order is driven by a junior practitioner rather than an attending (such as daily laboratories), results may differ. Similarly, institutions that directly incentivize prescribers to practice cost-conscious care may experience different results from similar interventions.
We conclude that, in the case of medication cost messaging, a strategy of displaying cost information alone was insufficient to affect prescriber ordering behavior. Coupling cost transparency with educational interventions and active stewardship to impact clinical practice is worthy of further study.
Disclosures: The authors state that there were no external sponsors for this work. The Johns Hopkins Hospital and University “funded” this work by paying the salaries of the authors. The author team maintained independence and made all decisions regarding the study design, data collection, data analysis, interpretation of results, writing of the research report, and decision to submit it for publication. Dr. Shermock had full access to all the study data and takes responsibility for the integrity of the data and accuracy of the data analysis.
1. Berwick DM, Hackbarth AD. Eliminating Waste in US Health Care. JAMA. 2012;307(14):1513-1516. PubMed
2. PricewaterhouseCoopers’ Health Research Institute. The Price of Excess: Identifying Waste in Healthcare Spending. http://www.pwc.com/us/en/healthcare/publications/the-price-of-excess.html. Accessed June 17, 2015.
3. Allan GM, Lexchin J, Wiebe N. Physician awareness of drug cost: a systematic review. PLoS Med. 2007;4(9):e283. PubMed
4. Feldman LS, Shihab HM, Thiemann D, et al. Impact of providing fee data on laboratory test ordering: a controlled clinical trial. JAMA Intern Med. 2013;173(10):903-908. PubMed
5. Durand DJ, Feldman LS, Lewin JS, Brotman DJ. Provider cost transparency alone has no impact on inpatient imaging utilization. J Am Coll Radiol. 2013;10(2):108-113. PubMed
6. Silvestri MT, Bongiovanni TR, Glover JG, Gross CP. Impact of price display on provider ordering: A systematic review. J Hosp Med. 2016;11(1):65-76. PubMed
7. Ornstein SM, MacFarlane LL, Jenkins RG, Pan Q, Wager KA. Medication cost information in a computer-based patient record system. Impact on prescribing in a family medicine clinical practice. Arch Fam Med. 1999;8(2):118-121. PubMed
8. Guterman JJ, Chernof BA, Mares B, Gross-Schulman SG, Gan PG, Thomas D. Modifying provider behavior: A low-tech approach to pharmaceutical ordering. J Gen Intern Med. 2002;17(10):792-796. PubMed
9. McNitt JD, Bode ET, Nelson RE. Long-term pharmaceutical cost reduction using a data management system. Anesth Analg. 1998;87(4):837-842. PubMed
10. Horrow JC, Rosenberg H. Price stickers do not alter drug usage. Can J Anaesth. 1994;41(11):1047-1052. PubMed
11. Guterman JJ, Chernof BA, Mares B, Gross-Schulman SG, Gan PG, Thomas D. Modifying provider behavior: A low-tech approach to pharmaceutical ordering. J Gen Intern Med. 2002;17(10):792-796. PubMed
12. McNitt JD, Bode ET, Nelson RE. Long-term pharmaceutical cost reduction using a data management system. Anesth Analg. 1998;87(4):837-842. PubMed
13. Ornstein SM, MacFarlane LL, Jenkins RG, Pan Q, Wager KA. Medication cost information in a computer-based patient record system. Impact on prescribing in a family medicine clinical practice. Arch Fam Med. 1999;8(2):118-121. PubMed
14. Horrow JC, Rosenberg H. Price stickers do not alter drug usage. Can J Anaesth. 1994;41(11):1047-1052. PubMed
15. Jandoc R, Burden AM, Mamdani M, Levesque LE, Cadarette SM. Interrupted time series analysis in drug utilization research is increasing: Systematic review and recommendations. J Clin Epidemiol. 2015;68(8):950-956. PubMed
16. Linden A. Conducting interrupted time-series analysis for single- and multiple-group comparisons. Stata J. 2015;15(2):480-500.
17. Linden A, Adams JL. Applying a propensity score-based weighting model to interrupted time series data: improving causal inference in programme evaluation. J Eval Clin Pract. 2011;17(6):1231-1238. PubMed
18. Feldman LS, Shihab HM, Thiemann D, et al. Impact of providing fee data on laboratory test ordering: a controlled clinical trial. JAMA Intern Med. 2013;173(10):903-908. PubMed
19. Durand DJ, Feldman LS, Lewin JS, Brotman DJ. Provider cost transparency alone has no impact on inpatient imaging utilization. J Am Coll Radiol. 2013;10(2):108-113. PubMed
Successive Potassium Hydroxide Testing for Improved Diagnosis of Tinea Pedis
The gold standard for diagnosing dermatophytosis is direct microscopic examination together with fungal culture.1 However, in the last 2 decades, molecular techniques that are now available worldwide have improved the diagnostic process.2,3 In dermatologic practice, potassium hydroxide (KOH) testing is a commonly used method for the diagnosis of superficial fungal infections.4 The sensitivity and specificity of KOH testing in patients with tinea pedis have been reported as 73.3% and 42.5%, respectively.5 Repetition of this test after an initial negative result is recommended if the clinical picture strongly suggests a fungal infection.6,7 Similarly, several repetitions of direct microscopic examination have been proposed for detecting other microorganisms; for example, 3 negative sputum smears traditionally are recommended to exclude a diagnosis of pulmonary tuberculosis.8 However, after numerous investigations in various regions of the world, the World Health Organization reduced the recommended number of these specimens from 3 to 2 in 2007.9
The literature suggests that successive mycological tests, both with direct microscopy and with fungal culture, improve the diagnosis of onychomycosis.1,10,11 As more such investigations accumulate, recommendations for successive mycological testing may become more reliable. In the current study, we aimed to investigate the value of successive KOH testing in the management of patients with clinically suspected tinea pedis.
Methods
Patients and Clinical Evaluation
One hundred thirty-five consecutive patients (63 male; 72 female) with clinical symptoms suggestive of intertriginous, vesiculobullous, and/or moccasin-type tinea pedis were enrolled in this prospective study. The mean age (SD) of patients was 45.9 (14.7) years (range, 11–77 years). Almost exclusively, the clinical symptoms suggestive of tinea pedis were desquamation or maceration in the toe webs, blistering lesions on the soles, and diffuse or patchy scaling or keratosis on the soles. A single dermatologist (B.F.K.) clinically evaluated the patients and found only 1 region showing different patterns suggestive of tinea pedis in 72 patients, 2 regions in 61 patients, and 3 regions in 2 patients. Therefore, 200 lesions from the 135 patients were chosen for the KOH test. The dermatologist recorded her level of suspicion for a fungal infection as low or high for each lesion, depending on the absence or presence of signs (eg, unilateral involvement, a well-defined border). None of the patients had used topical or systemic antifungal therapy for at least 1 month prior to the study.12
Clinical Sampling and Direct Microscopic Examination
The dermatologist took 3 samples of skin scrapings from each of the 200 lesions. All 3 samples from a given lesion were obtained from sites with the same clinical symptoms in a single session. Special attention was paid to samples from the active advancing borders of the lesions and the roofs of blisters if they were present.13 Upon completion of every 15 samples from every 5 lesions, the dermatologist randomized the order of the samples (https://www.random.org/). She then gave the samples, without the identities of the patients or any clinical information, to an experienced laboratory technician for direct microscopic examination. The technician prepared and examined the samples as described elsewhere5,7,14 and recorded the results as positive if hyphal elements were present or negative if they were not. The study was reviewed and approved by the Çukurova University Faculty of Medicine Ethics Committee (Adana, Turkey). Informed consent was obtained from each patient or from his/her guardian(s) prior to initiating the study.
Statistical Analysis
Statistical analysis was conducted using the χ2 test in SPSS software version 20.0; the McNemar test was used for analysis of the paired data.
Results
Among the 135 patients, lesions were suggestive of the intertriginous type of tinea pedis in 24 patients, moccasin type in 50 patients, and both intertriginous and moccasin type in 58 patients. Among the remaining 3 patients, 1 had lesions suggestive of the vesiculobullous type, and another patient had both the vesiculobullous and intertriginous types; the last patient demonstrated lesions that were inconsistent with any of these 3 subtypes of tinea pedis, and a well-defined eczematous plaque was observed on the dorsal surface of the patient’s left foot.
Among the 200 lesions from which skin scrapings were taken for KOH testing, 83 were in the toe webs, 110 were on the soles, and 7 were on the dorsal surfaces of the feet. Of these 7 dorsal lesions, 6 were extensions from lesions on the toe webs or soles and 1 was inconsistent with the 3 subtypes of tinea pedis. Among the 200 lesions, the main clinical symptom was maceration in 38 lesions, desquamation or scaling in 132 lesions, keratosis in 28 lesions, and blistering in 2 lesions. The dermatologist recorded the level of suspicion for tinea pedis as low in 68 lesions and high in 132.
According to the order in which the dermatologist took the 3 samples from each lesion, the KOH test was positive in 95 of the first set of 200 samples, 94 of the second set, and 86 of the third set; however, from the second set, the incremental yield (ie, the number of lesions in which the first KOH test was negative and the second was positive) was 10. The number of lesions in which the first and the second tests were negative and the third was positive was only 4. Therefore, the number of lesions with a positive KOH test was significantly increased from 95 to 105 by performing the second KOH test (P=.002). This number again increased from 105 to 109 when a third test was performed; however, this increase was not statistically significant (P=.125)(Table 1).
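These P values can be reproduced from the counts reported above: comparing positivity on the first test with cumulative positivity after the second test (10 newly positive lesions, none lost), and then cumulative positivity after 2 tests with that after all 3 (4 newly positive), using an exact McNemar test. A sketch with statsmodels follows; the paired tables are built directly from the reported figures.

```python
from statsmodels.stats.contingency_tables import mcnemar

# Paired 2x2 tables from the reported counts (rows: positive/negative on the
# earlier definition; columns: positive/negative on the cumulative one).
first_vs_first_two = [[95, 0],    # no lesion lost positivity
                      [10, 95]]   # 10 newly positive; 95 negative on both
first_two_vs_all_three = [[105, 0],
                          [4, 91]]

print(mcnemar(first_vs_first_two, exact=True).pvalue)      # ~ .002
print(mcnemar(first_two_vs_all_three, exact=True).pvalue)  # = .125
```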
According to an evaluation that was not stratified by the dermatologist’s order of sampling, 72 lesions (36.0%) showed KOH test positivity in all 3 samples, 22 (11.0%) were positive in 2 samples, 15 (7.5%) were positive in only 1 sample, and 91 (45.5%) were positive in none of the samples (Table 2). When the data were subdivided based on the sites of the lesions, the toe web lesions (n=83) showed rates of 41.0%, 9.6%, and 4.8% for 3, 2, and 1 positive KOH tests, respectively. For the sole lesions (n=110), the rates were somewhat different at 31.8%, 11.8%, and 10.0%, respectively, but the difference was not statistically significant (P=.395).
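For the site comparison, lesion counts can be back-calculated from the reported percentages (toe web: 34, 8, 4, and 37 lesions with 3, 2, 1, and 0 positive tests; sole: 35, 13, 11, and 51) and compared with a χ2 test. The sketch below, using scipy and these back-calculated counts, reproduces the reported P value.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: toe web (n=83) and sole (n=110) lesions; columns: 3, 2, 1, and 0
# positive KOH tests, with counts back-calculated from the percentages above.
table = np.array([[34, 8, 4, 37],
                  [35, 13, 11, 51]])
chi2, p, dof, expected = chi2_contingency(table)
print(round(p, 3))  # ~ .395, matching the reported comparison
```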
For the subgroups based on the main clinical symptoms, the percentage of lesions having at least 1 positive KOH test from the 3 samples was 35.7% for the keratotic lesions (n=28). This rate was lower than those for the macerated lesions (n=38) and the desquamating or scaling lesions (n=132), which were 52.6% and 59.1%, respectively (Table 2). On the other hand, the percentage of lesions that produced only 1 or 2 positive KOH tests from the 3 samples was 25.0% for the keratotic lesions, higher than the rates for the macerated lesions and the desquamating or scaling lesions (13.1% and 18.9%, respectively). In particular, the difference between the keratotic lesions and the desquamating or scaling lesions in the distribution of 0, 1, 2, and 3 positive KOH tests was statistically significant (P=.019). Examples of macerated, desquamating or scaling, keratotic, and blistering lesions are presented in the Figure.
If the dermatologist indicated a high suspicion of fungal infection, it was more likely that at least 1 of 3 KOH test results was positive. The rate of at least 1 positive test was 64.4% for the highly suspicious lesions (n=132) and 35.3% for the lesions with low suspicion of a fungal infection (n=68)(Table 2). The difference was statistically significant (P<.001). Conversely, if the suspicion was low, it was more likely that only 1 or 2 KOH tests were positive. The percentages of lesions having 3, 2, or 1 positive KOH tests were 14.7%, 8.8%, and 11.8%, respectively, for the low-suspicion lesions and 47.0%, 12.1%, and 5.3%, respectively, for the high-suspicion lesions. The difference was statistically significant (P<.001).
Comment
In the current study, we aimed to investigate if successive KOH tests provide an incremental diagnostic yield in the management of patients with clinically suspected tinea pedis and if these results differ among the subgroups of patients. Both in the evaluation taking into account the order of sampling and in the evaluation disregarding this order, we found that the second sample was necessary for all subgroups, and even the third sample was necessary for patients with keratotic lesions. The main limitation of the study was that we lacked a gold-standard technique (eg, a molecular-based technique); therefore, we are unable to comment on the false-negative and false-positive results of the successive KOH testing.
Summerbell et al11 found that, in initial specimens from toenails with apparent lesions taken from 473 patients, the KOH test was 73.8% sensitive for dermatophytes, a rate only slightly lower than that of culture (74.6%). Arabatzis et al2 investigated 92 skin, nail, and hair specimens from 67 patients with suspected dermatophytosis and found that the KOH test was superior to culture for the detection of dermatophytes (43% vs 33%). More importantly, they noted that a real-time polymerase chain reaction (PCR) assay yielded an even higher detection rate (51%).2 In another study, Wisselink et al3 examined 1437 clinical samples and demonstrated a marked increase in the detection of dermatophytes using a real-time PCR assay (48.5%) compared with culture (26.9%). However, PCR may not reflect active disease and could lead to false-positive results.2,3 The aforementioned weakness of our study could therefore be addressed in further studies comparing successive KOH testing with a molecular-based assay, such as real-time PCR.
In this study, repeating the KOH test improved the diagnostic yield for tinea pedis in a large number of samples from clinically suspected lesions. Additionally, the distribution of 3, 2, or 1 positive results on the 3 KOH tests differed among the subgroups of lesions. Overall, positivity was less frequent in the keratotic lesions than in the macerated or the desquamating or scaling lesions, and positivity on all 3 tests also was less frequent in the keratotic lesions. Inversely, the frequency of samples with only 1 or 2 positive results was higher in this subgroup, so the need for a second, and even a third, test was greatest for keratotic lesions.
Our findings were consistent with the results of the studies performed with successive mycological tests on the nail specimens. Meireles et al1 repeated 156 mycological nail tests 3 times and found the rate of positivity in the first test to be 19.9%. When the results of the first and second tests were combined, this rate increased to 28.2%, and when the results of all 3 tests were combined, it increased to 37.8%.1 Gupta10 demonstrated that even a fourth culture provided an incremental diagnostic yield in the diagnosis of onychomycosis, yet 4 cultures may not be clinically practical. Furthermore, periodic acid–Schiff staining is a more effective measure of positivity in onychomycosis.15
Although the overall rate of positivity on the 3 tests in our study was unsurprisingly higher in lesions rated highly suspicious for a fungal infection, the rate of only 1 or 2 positive tests was surprisingly somewhat higher in low-suspicion lesions, which suggested that repeating the KOH test would be beneficial, even if the clinical suspicion for tinea pedis was low. The novel contribution of this study includes the finding that mycological information was markedly improved in highly suspicious tinea pedis lesions regardless of the infection site (Table 1) by using 3 successive KOH tests; the percentage of lesions with 1, 2, or 3 positive KOH tests was 5.3%, 12.1%, and 47.0%, respectively (Table 2). A single physician from a single geographical location introduces a limitation to the study for a variety of reasons, including bias in the cases chosen and possible overrepresentation of the causative organism due to region-specific incidence. It is unknown how different causative organisms affect KOH results. The lack of fungal culture results limits the value of this information.
Conclusion
In this study, we investigated the benefit of successive KOH testing in the laboratory diagnosis of tinea pedis and found that the use of a second sample in particular provided a substantial increase in diagnostic yield. In other words, successive KOH testing markedly improved the diagnosis of tinea pedis. Therefore, we suggest that at least 2 samples of skin scrapings be taken for the diagnosis of tinea pedis and that at least 3 samples be taken for keratotic lesions. However, further study using a gold-standard method, such as a molecular-based assay, and taking samples at daily or weekly intervals is recommended to achieve more reliable results.
Acknowledgment
The authors would like to thank Gökçen Şahin (Adana, Turkey) for providing technical support in direct microscopic examination.
1. Meireles TE, Rocha MF, Brilhante RS, et al. Successive mycological nail tests for onychomycosis: a strategy to improve diagnosis efficiency. Braz J Infect Dis. 2008;2:333-337.
2. Arabatzis M, Bruijnesteijn van Coppenraet LE, Kuijper EJ, et al. Diagnosis of dermatophyte infection by a novel multiplex real-time polymerase chain reaction detection/identification scheme. Br J Dermatol. 2007;157:681-689.
3. Wisselink GJ, van Zanten E, Kooistra-Smid AM. Trapped in keratin; a comparison of dermatophyte detection in nail, skin and hair samples directly from clinical samples using culture and real-time PCR. J Microbiol Methods. 2011;85:62-66.
4. Kurade SM, Amladi SA, Miskeen AK. Skin scraping and a potassium hydroxide mount. Indian J Dermatol Venereol Leprol. 2006;72:238-241.
5. Levitt JO, Levitt BH, Akhavan A, et al. The sensitivity and specificity of potassium hydroxide smear and fungal culture relative to clinical assessment in the evaluation of tinea pedis: a pooled analysis [published online June 22, 2010]. Dermatol Res Pract. 2010;2010:764843.
6. Brodell RT, Helms SE, Snelson ME. Office dermatologic testing: the KOH preparation. Am Fam Physician. 1991;43:2061-2065.
7. McKay M. Office techniques for dermatologic diagnosis. In: Walker HK, Hall WD, Hurst JW, eds. Clinical Methods: The History, Physical, and Laboratory Examinations. 3rd ed. Boston, MA: Butterworths; 1990:540-543.
8. Wilmer A, Bryce E, Grant J. The role of the third acid-fast bacillus smear in tuberculosis screening for infection control purposes: a controversial topic revisited. Can J Infect Dis Med Microbiol. 2011;22:E1-E3.
9. World Health Organization. Same-day diagnosis of tuberculosis by microscopy: WHO policy statement. http://www.who.int/tb/publications/2011/tb_microscopy_9789241501606/en/. Published 2011. Accessed July 24, 2017.
10. Gupta A. The incremental diagnostic yield of successive re-cultures in patients with a clinical diagnosis of onychomycosis. J Am Acad Dermatol. 2005;52:P129.
11. Summerbell RC, Cooper E, Bunn U, et al. Onychomycosis: a critical study of techniques and criteria for confirming the etiologic significance of nondermatophytes. Med Mycol. 2005;43:39-59.
12. Miller MA, Hodgson Y. Sensitivity and specificity of potassium hydroxide smears of skin scrapings for the diagnosis of tinea pedis. Arch Dermatol. 1993;129:510-511.
13. Ilkit M, Durdu M. Tinea pedis: the etiology and global epidemiology of a common fungal infection. Crit Rev Microbiol. 2015;41:374-388.
14. McGinnis MR. Laboratory Handbook of Medical Mycology. New York, NY: Academic Press, Inc; 1980.
15. Jeelani S, Ahmed QM, Lanker AM, et al. Histopathological examination of nail clippings using PAS staining (HPE-PAS): gold-standard in diagnosis of onychomycosis. Mycoses. 2015;58:27-32.
The gold standard for diagnosing dermatophytosis is the use of direct microscopic examination together with fungal culture.1 However, in the last 2 decades, molecular techniques that currently are available worldwide have improved the diagnosis procedure.2,3 In the practice of dermatology, potassium hydroxide (KOH) testing is a commonly used method for the diagnosis of superficial fungal infections.4 The sensitivity and specificity of KOH testing in patients with tinea pedis have been reported as 73.3% and 42.5%, respectively.5 Repetition of this test after an initial negative test result is recommended if the clinical picture strongly suggests a fungal infection.6,7 Alternatively, several repetitions of direct microscopic examinations also have been proposed for detecting other microorganisms. For example, 3 negative sputum smears traditionally are recommended to exclude a diagnosis of pulmonary tuberculosis.8 However, after numerous investigations in various regions of the world, the World Health Organization reduced the recommended number of these specimens from 3 to 2 in 2007.9
The literature suggests that successive mycological tests, both with direct microscopy and fungal cultures, improve the diagnosis of onychomycosis.1,10,11 Therefore, if such investigations are increased in number, recommendations for successive mycological tests may be more reliable. In the current study, we aimed to investigate the value of successive KOH testing in the management of patients with clinically suspected tinea pedis.
Methods
Patients and Clinical Evaluation
One hundred thirty-five consecutive patients (63 male; 72 female) with clinical symptoms suggestive of intertriginous, vesiculobullous, and/or moccasin-type tinea pedis were enrolled in this prospective study. The mean age (SD) of patients was 45.9 (14.7) years (range, 11–77 years). Almost exclusively, the clinical symptoms suggestive of tinea pedis were desquamation or maceration in the toe webs, blistering lesions on the soles, and diffuse or patchy scaling or keratosis on the soles. A single dermatologist (B.F.K.) clinically evaluated the patients and found only 1 region showing different patterns suggestive of tinea pedis in 72 patients, 2 regions in 61 patients, and 3 regions in 2 patients. Therefore, 200 lesions from the 135 patients were chosen for the KOH test. The dermatologist recorded her level of suspicion for a fungal infection as low or high for each lesion, depending on the absence or presence of signs (eg, unilateral involvement, a well-defined border). None of the patients had used topical or systemic antifungal therapy for at least 1 month prior to the study.12
Clinical Sampling and Direct Microscopic Examination
The dermatologist took 3 samples of skin scrapings from each of the 200 lesions. All 3 samples from a given lesion were obtained from sites with the same clinical symptoms in a single session. Special attention was paid to samples from the active advancing borders of the lesions and the roofs of blisters if they were present.13 Upon completion of every 15 samples from every 5 lesions, the dermatologist randomized the order of the samples (https://www.random.org/). She then gave the samples, without the identities of the patients or any clinical information, to an experienced laboratory technician for direct microscopic examination. The technician prepared and examined the samples as described elsewhere5,7,14 and recorded the results as positive if hyphal elements were present or negative if they were not. The study was reviewed and approved by the Çukurova University Faculty of Medicine Ethics Committee (Adana, Turkey). Informed consent was obtained from each patient or from his/her guardian(s) prior to initiating the study.
Statistical Analysis
Statistical analysis was conducted using the χ2 test in the SPSS software version 20.0. McNemar test was used for analysis of the paired data.
Results
Among the 135 patients, lesions were suggestive of the intertriginous type of tinea pedis in 24 patients, moccasin type in 50 patients, and both intertriginous and moccasin type in 58 patients. Among the remaining 3 patients, 1 had lesions suggestive of the vesiculobullous type, and another patient had both the vesiculobullous and intertriginous types; the last patient demonstrated lesions that were inconsistent with any of these 3 subtypes of tinea pedis, and a well-defined eczematous plaque was observed on the dorsal surface of the patient’s left foot.
Among the 200 lesions from which skin scrapings were taken for KOH testing, 83 were in the toe webs, 110 were on the soles, and 7 were on the dorsal surfaces of the feet. Of these 7 dorsal lesions, 6 were extensions from lesions on the toe webs or soles and 1 was inconsistent with the 3 subtypes of tinea pedis. Among the 200 lesions, the main clinical symptom was maceration in 38 lesions, desquamation or scaling in 132 lesions, keratosis in 28 lesions, and blistering in 2 lesions. The dermatologist recorded the level of suspicion for tinea pedis as low in 68 lesions and high in 132.
According to the order in which the dermatologist took the 3 samples from each lesion, the KOH test was positive in 95 of the first set of 200 samples, 94 of the second set, and 86 of the third set; however, from the second set, the incremental yield (ie, the number of lesions in which the first KOH test was negative and the second was positive) was 10. The number of lesions in which the first and the second tests were negative and the third was positive was only 4. Therefore, the number of lesions with a positive KOH test was significantly increased from 95 to 105 by performing the second KOH test (P=.002). This number again increased from 105 to 109 when a third test was performed; however, this increase was not statistically significant (P=.125)(Table 1).
According to an evaluation that was not stratified by the dermatologist’s order of sampling, 72 lesions (36.0%) showed KOH test positivity in all 3 samples, 22 (11.0%) were positive in 2 samples, 15 (7.5%) were positive in only 1 sample, and 91 (45.5%) were positive in none of the samples (Table 2). When the data were subdivided based on the sites of the lesions, the toe web lesions (n=83) showed rates of 41.0%, 9.6%, and 4.8% for 3, 2, and 1 positive KOH tests, respectively. For the sole lesions (n=110), the rates were somewhat different at 31.8%, 11.8%, and 10.0%, respectively, but the difference was not statistically significant (P=.395).
The gold standard for diagnosing dermatophytosis is direct microscopic examination together with fungal culture.1 However, in the last 2 decades, molecular techniques that are now available worldwide have improved the diagnostic process.2,3 In the practice of dermatology, potassium hydroxide (KOH) testing is a commonly used method for the diagnosis of superficial fungal infections.4 The sensitivity and specificity of KOH testing in patients with tinea pedis have been reported as 73.3% and 42.5%, respectively.5 Repetition of this test after an initial negative result is recommended if the clinical picture strongly suggests a fungal infection.6,7 Similarly, several repetitions of direct microscopic examination have been proposed for detecting other microorganisms. For example, 3 negative sputum smears traditionally are recommended to exclude a diagnosis of pulmonary tuberculosis.8 However, after numerous investigations in various regions of the world, the World Health Organization reduced the recommended number of these specimens from 3 to 2 in 2007.9
The literature suggests that successive mycological tests, both with direct microscopy and fungal cultures, improve the diagnosis of onychomycosis.1,10,11 Therefore, if such investigations are increased in number, recommendations for successive mycological tests may be more reliable. In the current study, we aimed to investigate the value of successive KOH testing in the management of patients with clinically suspected tinea pedis.
Methods
Patients and Clinical Evaluation
One hundred thirty-five consecutive patients (63 male; 72 female) with clinical symptoms suggestive of intertriginous, vesiculobullous, and/or moccasin-type tinea pedis were enrolled in this prospective study. The mean age (SD) of patients was 45.9 (14.7) years (range, 11–77 years). The clinical symptoms suggestive of tinea pedis were, almost exclusively, desquamation or maceration in the toe webs, blistering lesions on the soles, and diffuse or patchy scaling or keratosis on the soles. A single dermatologist (B.F.K.) clinically evaluated the patients and identified only 1 region with a pattern suggestive of tinea pedis in 72 patients, 2 regions in 61 patients, and 3 regions in 2 patients. Therefore, 200 lesions from the 135 patients were selected for KOH testing. The dermatologist recorded her level of suspicion for a fungal infection as low or high for each lesion, depending on the absence or presence of suggestive signs (eg, unilateral involvement, a well-defined border). None of the patients had used topical or systemic antifungal therapy for at least 1 month prior to the study.12
Clinical Sampling and Direct Microscopic Examination
The dermatologist took 3 samples of skin scrapings from each of the 200 lesions. All 3 samples from a given lesion were obtained in a single session from sites with the same clinical symptoms. Special attention was paid to sampling the active advancing borders of the lesions and the roofs of any blisters present.13 After every 15 samples (ie, every 5 lesions), the dermatologist randomized the order of the samples (https://www.random.org/). She then gave the samples, without the identities of the patients or any clinical information, to an experienced laboratory technician for direct microscopic examination. The technician prepared and examined the samples as described elsewhere5,7,14 and recorded each result as positive if hyphal elements were present or negative if they were not. The study was reviewed and approved by the Çukurova University Faculty of Medicine Ethics Committee (Adana, Turkey). Informed consent was obtained from each patient or his/her guardian(s) prior to initiating the study.
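As a rough illustration of this blinding step (the study drew its random order from random.org; the local shuffle and label scheme below are stand-ins):

```python
import random

# After every 5 lesions (3 scrapings each, 15 samples), shuffle the
# batch before handing it, unlabeled, to the technician.
batch = [(lesion_id, sample_no)
         for lesion_id in range(1, 6)
         for sample_no in (1, 2, 3)]
random.shuffle(batch)
print(batch)  # reading order for the blinded examiner
```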
Statistical Analysis
Statistical analysis was conducted with the χ2 test using SPSS software (version 20.0). The McNemar test was used for analysis of paired data.
Results
Among the 135 patients, lesions were suggestive of the intertriginous type of tinea pedis in 24 patients, moccasin type in 50 patients, and both intertriginous and moccasin type in 58 patients. Among the remaining 3 patients, 1 had lesions suggestive of the vesiculobullous type, and another patient had both the vesiculobullous and intertriginous types; the last patient demonstrated lesions that were inconsistent with any of these 3 subtypes of tinea pedis, and a well-defined eczematous plaque was observed on the dorsal surface of the patient’s left foot.
Among the 200 lesions from which skin scrapings were taken for KOH testing, 83 were in the toe webs, 110 were on the soles, and 7 were on the dorsal surfaces of the feet. Of these 7 dorsal lesions, 6 were extensions from lesions on the toe webs or soles and 1 was inconsistent with the 3 subtypes of tinea pedis. Among the 200 lesions, the main clinical symptom was maceration in 38 lesions, desquamation or scaling in 132 lesions, keratosis in 28 lesions, and blistering in 2 lesions. The dermatologist recorded the level of suspicion for tinea pedis as low in 68 lesions and high in 132.
According to the order in which the dermatologist took the 3 samples from each lesion, the KOH test was positive in 95 of the first set of 200 samples, 94 of the second set, and 86 of the third set; however, from the second set, the incremental yield (ie, the number of lesions in which the first KOH test was negative and the second was positive) was 10. The number of lesions in which the first and the second tests were negative and the third was positive was only 4. Therefore, the number of lesions with a positive KOH test was significantly increased from 95 to 105 by performing the second KOH test (P=.002). This number again increased from 105 to 109 when a third test was performed; however, this increase was not statistically significant (P=.125)(Table 1).
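The reported P values can be reproduced with an exact McNemar test: because combined positivity can only gain lesions as tests are added, all discordant pairs lie in one direction. A minimal sketch (scipy assumed available; the helper name is ours), using the counts from the paragraph above:

```python
from scipy.stats import binomtest

def mcnemar_exact(gained, lost):
    """Exact McNemar test: a 2-sided binomial test on the
    discordant pairs (lesions whose combined status changed)."""
    return binomtest(gained, gained + lost, 0.5).pvalue

# Adding the second test converted 10 negative lesions (95 -> 105).
print(round(mcnemar_exact(10, 0), 3))  # 0.002, matching the reported P
# Adding the third test converted 4 more (105 -> 109).
print(round(mcnemar_exact(4, 0), 3))   # 0.125, matching the reported P
```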
According to an evaluation that was not stratified by the dermatologist’s order of sampling, 72 lesions (36.0%) showed KOH test positivity in all 3 samples, 22 (11.0%) were positive in 2 samples, 15 (7.5%) were positive in only 1 sample, and 91 (45.5%) were positive in none of the samples (Table 2). When the data were subdivided based on the sites of the lesions, the toe web lesions (n=83) showed rates of 41.0%, 9.6%, and 4.8% for 3, 2, and 1 positive KOH tests, respectively. For the sole lesions (n=110), the rates were somewhat different at 31.8%, 11.8%, and 10.0%, respectively, but the difference was not statistically significant (P=.395).
For the subgroups based on the main clinical symptoms, the percentage of lesions having at least 1 positive KOH test from the 3 samples was 35.7% for the keratotic lesions (n=28). This rate was lower than the rates for the macerated lesions (n=38) and the desquamating or scaling lesions (n=132), which were 52.6% and 59.1%, respectively (Table 2). On the other hand, the percentage of lesions that produced only 1 or 2 positive KOH tests from the 3 samples was 25.0% for the keratotic lesions, which was higher than the rates for the macerated lesions and the desquamating or scaling lesions (13.1% and 18.9%, respectively). In particular, the difference between the keratotic lesions and the desquamating or scaling lesions in the distribution of 0, 1, 2, and 3 positive KOH tests was statistically significant (P=.019). Examples of macerated, desquamating or scaling, keratotic, and blistering lesions are presented in the Figure.
If the dermatologist indicated a high suspicion of fungal infection, it was more likely that at least 1 of 3 KOH test results was positive. The rate of at least 1 positive test was 64.4% for the highly suspicious lesions (n=132) and 35.3% for the lesions with low suspicion of a fungal infection (n=68)(Table 2). The difference was statistically significant (P<.001). Conversely, if the suspicion was low, it was more likely that only 1 or 2 KOH tests were positive. The percentages of lesions having 3, 2, or 1 positive KOH tests were 14.7%, 8.8%, and 11.8%, respectively, for the low-suspicion lesions and 47.0%, 12.1%, and 5.3%, respectively, for the high-suspicion lesions. The difference was statistically significant (P<.001).
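For reference, the suspicion-level comparison can be reconstructed as a 2×4 contingency table, with cell counts back-calculated from the percentages above (they resolve exactly to whole lesions). A sketch assuming scipy and numpy are available:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: low- vs high-suspicion lesions; columns: 0, 1, 2, and 3
# positive KOH tests. Counts back-calculated from reported percentages.
table = np.array([
    [44,  8,  6, 10],   # low suspicion,  n = 68
    [47,  7, 16, 62],   # high suspicion, n = 132
])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, P = {p:.1e}")  # P < .001
```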
Comment
In the current study, we aimed to investigate whether successive KOH tests provide an incremental diagnostic yield in the management of patients with clinically suspected tinea pedis and whether these results differ among subgroups of patients. In both the evaluation that accounted for the order of sampling and the evaluation that disregarded it, we found that a second sample was necessary for all subgroups and that a third sample was necessary for patients with keratotic lesions. The main limitation of the study was the lack of a gold-standard comparator (eg, a molecular-based technique); therefore, we are unable to comment on false-negative and false-positive results of the successive KOH testing.
Summerbell et al11 found that in initial specimens of toenails with apparent lesions taken from 473 patients, the KOH test was 73.8% sensitive for dermatophytes, and this rate was only somewhat higher for cultures (74.6%). Arabatzis et al2 investigated 92 skin, nail, and hair specimens from 67 patients with suspected dermatophytosis and found that the KOH test was superior to culture for the detection of dermatophytes (43% vs 33%). More importantly, they noted that a real-time polymerase chain reaction (PCR) assay yielded a higher detection rate (51%).2 In another study, Wisselink et al3 examined 1437 clinical samples and demonstrated a marked increase in the detection of dermatophytes using a real-time PCR assay (48.5%) compared to culture (26.9%). However, PCR may not reflect active disease and could lead to false-positive results.2,3 Therefore, the aforementioned weakness of our study could be addressed in further studies comparing successive KOH testing against a molecular-based assay, such as real-time PCR.
In this study, repeating the KOH test improved the diagnostic yield for tinea pedis across a large number of samples from clinically suspected lesions. Additionally, the distribution of 3, 2, or 1 positive results on the 3 KOH tests differed among the subgroups of lesions. Overall, positivity was less frequent in the keratotic lesions than in the macerated or the desquamating or scaling lesions, and positivity on all 3 tests also was less frequent in the keratotic lesions. Conversely, the frequency of samples with only 1 or 2 positive results was higher in this subgroup; the need for a second, and even a third, test was therefore greatest for keratotic lesions.
Our findings were consistent with the results of studies performing successive mycological tests on nail specimens. Meireles et al1 repeated 156 mycological nail tests 3 times and found a positivity rate of 19.9% on the first test. When the results of the first and second tests were combined, this rate increased to 28.2%, and when the results of all 3 tests were combined, it increased to 37.8%.1 Gupta10 demonstrated that even a fourth culture provided an incremental diagnostic yield in the diagnosis of onychomycosis, yet 4 cultures may not be clinically practical. Furthermore, periodic acid–Schiff staining is a more effective measure of positivity in onychomycosis.15
Although the overall rate of positivity on the 3 tests in our study was, unsurprisingly, higher in lesions rated highly suspicious for a fungal infection, the rate of only 1 or 2 positive tests was somewhat higher in low-suspicion lesions, suggesting that repeating the KOH test is beneficial even when the clinical suspicion for tinea pedis is low. The novel contribution of this study is the finding that mycological information was markedly improved in highly suspicious tinea pedis lesions, regardless of the infection site (Table 1), by using 3 successive KOH tests; the percentages of lesions with 1, 2, or 3 positive KOH tests were 5.3%, 12.1%, and 47.0%, respectively (Table 2). Reliance on a single physician in a single geographic location is a limitation of the study for several reasons, including bias in case selection and possible overrepresentation of particular causative organisms due to region-specific incidence. It is unknown how different causative organisms affect KOH results, and the lack of fungal culture results further limits the value of this information.
Conclusion
In this study, we investigated the benefit of successive KOH testing in the laboratory diagnosis of tinea pedis and found that a second sample in particular provided a substantial increase in diagnostic yield. In other words, successive KOH testing remarkably improved the diagnosis of tinea pedis. Therefore, we suggest that at least 2 samples of skin scrapings be taken for the diagnosis of tinea pedis and that at least 3 samples be taken for keratotic lesions. However, further studies using a gold-standard method, such as a molecular-based assay, and taking samples at daily or weekly intervals are recommended to achieve more reliable results.
Acknowledgment
The authors would like to thank Gökçen Şahin (Adana, Turkey) for providing technical support in direct microscopic examination.
- Meireles TE, Rocha MF, Brilhante RS, et al. Successive mycological nail tests for onychomycosis: a strategy to improve diagnosis efficiency. Braz J Infect Dis. 2008;12:333-337.
- Arabatzis M, Bruijnesteijn van Coppenraet LE, Kuijper EJ, et al. Diagnosis of dermatophyte infection by a novel multiplex real-time polymerase chain reaction detection/identification scheme. Br J Dermatol. 2007;157:681-689.
- Wisselink GJ, van Zanten E, Kooistra-Smid AM. Trapped in keratin; a comparison of dermatophyte detection in nail, skin and hair samples directly from clinical samples using culture and real-time PCR. J Microbiol Methods. 2011;85:62-66.
- Kurade SM, Amladi SA, Miskeen AK. Skin scraping and a potassium hydroxide mount. Indian J Dermatol Venereol Leprol. 2006;72:238-241.
- Levitt JO, Levitt BH, Akhavan A, et al. The sensitivity and specificity of potassium hydroxide smear and fungal culture relative to clinical assessment in the evaluation of tinea pedis: a pooled analysis [published online June 22, 2010]. Dermatol Res Pract. 2010;2010:764843.
- Brodell RT, Helms SE, Snelson ME. Office dermatologic testing: the KOH preparation. Am Fam Physician. 1991;43:2061-2065.
- McKay M. Office techniques for dermatologic diagnosis. In: Walker HK, Hall WD, Hurst JW, eds. Clinical Methods: The History, Physical, and Laboratory Examinations. 3rd ed. Boston, MA: Butterworths; 1990:540-543.
- Wilmer A, Bryce E, Grant J. The role of the third acid-fast bacillus smear in tuberculosis screening for infection control purposes: a controversial topic revisited. Can J Infect Dis Med Microbiol. 2011;22:E1-E3.
- World Health Organization. Same-day diagnosis of tuberculosis by microscopy: WHO policy statement. http://www.who.int/tb/publications/2011/tb_microscopy_9789241501606/en/. Published 2011. Accessed July 24, 2017.
- Gupta A. The incremental diagnostic yield of successive re-cultures in patients with a clinical diagnosis of onychomycosis. J Am Acad Dermatol. 2005;52:P129.
- Summerbell RC, Cooper E, Bunn U, et al. Onychomycosis: a critical study of techniques and criteria for confirming the etiologic significance of nondermatophytes. Med Mycol. 2005;43:39-59.
- Miller MA, Hodgson Y. Sensitivity and specificity of potassium hydroxide smears of skin scrapings for the diagnosis of tinea pedis. Arch Dermatol. 1993;129:510-511.
- Ilkit M, Durdu M. Tinea pedis: the etiology and global epidemiology of a common fungal infection. Crit Rev Microbiol. 2015;41:374-388.
- McGinnis MR. Laboratory Handbook of Medical Mycology. New York, NY: Academic Press, Inc; 1980.
- Jeelani S, Ahmed QM, Lanker AM, et al. Histopathological examination of nail clippings using PAS staining (HPE-PAS): gold-standard in diagnosis of onychomycosis. Mycoses. 2015;58:27-32.
Practice Points
- At least 2 samples should be taken for potassium hydroxide examination when tinea pedis is suspected clinically.
- The number of samples should be at least 3 if keratotic lesions are present.
Severity Weighting of Postoperative Adverse Events in Orthopedic Surgery
Take-Home Points
- Studies of AEs after orthopedic surgery commonly use composite AE outcomes.
- These types of outcomes treat AEs with different clinical significance similarly.
- This study created a single severity-weighted outcome that can be used to characterize the overall severity of a given patient’s postoperative course.
- Future studies may benefit from using this new severity-weighted outcome score.
Recently there has been an increase in the use of national databases for orthopedic surgery research.1-4 Studies commonly compare rates of postoperative adverse events (AEs) across different demographic, comorbidity, and procedural characteristics.5-23 Their conclusions often highlight different modifiable and/or nonmodifiable risk factors associated with the occurrence of postoperative events.
The several dozen AEs that have been investigated range from very severe (eg, death, myocardial infarction, coma) to less severe (eg, urinary tract infection [UTI], anemia requiring blood transfusion). A common approach for these studies is to consider many AEs together in the same analysis, asking a question such as, “What are risk factors for the occurrence of ‘adverse events’ after spine surgery?” Such studies test for associations with the occurrence of “any adverse event,” the occurrence of any “serious adverse event,” or similar composite outcomes. How common this type of study has become is indicated by the fact that in 2013 and 2014, at least 12 such studies were published in Clinical Orthopaedics and Related Research and the Journal of Bone and Joint Surgery,5-14,21-23 and many more in other orthopedic journals.15-20 However, there is a problem in using this type of composite outcome to perform such analyses: AEs with highly varying degrees of severity have identical impacts on the outcome variable, changing it from negative (“no adverse event”) to positive (“at least one adverse event”). As a result, the system may treat a very severe AE such as death and a very minor AE such as UTI similarly. Even in studies that use the slightly more specific composite outcome of “serious adverse events,” death and a nonlethal thromboembolic event would be treated similarly. Failure to differentiate these AEs in terms of their clinical significance detracts from the clinical applicability of conclusions drawn from studies using these types of composite AE outcomes.
In one of many examples that can be considered, a retrospective cohort study compared general and spinal anesthesia used in total knee arthroplasty.10 The rate of any AEs was higher with general anesthesia than with spinal anesthesia (12.34% vs 10.72%; P = .003). However, the only 2 specific AEs that had statistically significant differences were anemia requiring blood transfusion (6.07% vs 5.02%; P = .009) and superficial surgical-site infection (SSI; 0.92% vs 0.68%; P < .001). These 2 AEs are of relatively low severity; nevertheless, because these AEs are common, their differences constituted the majority of the difference in the rate of any AEs. In contrast, differences in the more severe AEs, such as death (0.11% vs 0.22%; P > .05), septic shock (0.14% vs 0.12%; P > .05), and myocardial infarction (0.20% vs 0.20%; P > .05), were small and not statistically significant. Had more weight been given to these more severe events, the outcome of the study likely would have been “no difference.”
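A quick arithmetic check, using only the rates quoted above, illustrates how the 2 common, low-severity AEs dominate the unweighted composite (a sketch; the variable names are ours):

```python
# Difference in "any AE" rates between general and spinal anesthesia,
# in percentage points, from the study cited above.
gap_any = 12.34 - 10.72            # 1.62
gap_transfusion = 6.07 - 5.02      # 1.05, anemia requiring transfusion
gap_superficial_ssi = 0.92 - 0.68  # 0.24, superficial SSI

share = (gap_transfusion + gap_superficial_ssi) / gap_any
print(f"{share:.0%} of the composite gap")  # ~80%, from 2 low-severity AEs
```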
To address this shortcoming in orthopedic research methodology, we created a severity-weighted outcome score that can be used to determine the overall “severity” of any given patient’s postoperative course. We also tested this novel outcome score for correlation with procedure type and patient characteristics using orthopedic patients from the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP). Our intention is for database investigators to be able to use this outcome score in place of the composite outcomes that are dominating this type of research.
Methods
Generation of Severity Weights
Our method is generally described as utility weighting: assigning value weights, reflective of overall impact, to differing outcome states.24 Parallel methods have been used to generate the disability weights used to determine disability-adjusted life years for the Global Burden of Disease project25 and in many other areas of health, economic, and policy research.
All orthopedic faculty members at 2 geographically disparate, large US academic institutions were invited to participate in a severity-weighting exercise. Each surgeon who agreed to participate performed the exercise independently.
- STEP 1: Please reorder the AE cards by your perception of “severity” for a patient experiencing that event after an orthopedic procedure.
- STEP 2: Once your cards are in order, please determine how many postoperative occurrences of each event you would “trade” for 1 patient experiencing postoperative death. Place this number of occurrences in the box in the upper right corner of each card.
- NOTES: As you consider each AE:
- Please consider an “average” occurrence of that AE, but note that in no case does the AE result in perioperative death.
- Please consider only the “severity” for the patient. (Do not consider the extent to which the event may be related to surgical error.)
- Please consider that the numbers you assign are relative to each other. Hence, if you would trade 20 of “event A” for 1 death, and if you would trade 40 of “event B” for 1 death, the implication is that you would trade 20 of “event A” for 40 of “event B.”
- You may readjust the order of your cards at any point.
Participants’ responses were recorded. For each number provided by each participant, the inverse (reciprocal) was taken and multiplied by 100%. This new number was taken to be the percentage severity of death that the given participant considered the given AE to embody. For example, as a hypothetical on one end of the spectrum, if a participant reported 1 (he/she would trade 1 AE X for 1 death), the severity would be 1/1 × 100% = 100% of death, a very severe AE. Conversely, if a participant reported a very large number like 100,000 (he/she would trade 100,000 AEs X for 1 death), the severity would be 1/100,000 × 100% = 0.001% of death, a very minor AE. More commonly, a participant would report a number like 25, which translates to 4% of death (1/25 × 100% = 4%). For each AE, the weights were then averaged across participants to derive a mean severity weight used to generate a novel composite outcome score.
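As a concrete sketch of this arithmetic (the survey responses below are invented for illustration; the helper name is ours):

```python
def mean_severity_weight(trade_numbers):
    """Mean severity weight as a percentage of death: the average of
    (1/n) x 100% over each participant's reported trade number n."""
    return sum(100.0 / n for n in trade_numbers) / len(trade_numbers)

# Invented example: 3 surgeons would trade 400, 500, and 250 UTIs
# for 1 postoperative death.
print(round(mean_severity_weight([400, 500, 250]), 2))  # 0.28 (% of death)
```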
Definition of Novel Composite Outcome Score
The novel composite outcome score would be expressed as a percentage to be interpreted as percentage severity of death, which we termed severity-weighted outcome relative to death (SWORD). For each patient, SWORD was defined as no AE (0%) or postoperative death (100%), with other AEs assigned mean severity weights based on faculty members’ survey responses. A patient with multiple AEs would be assigned the weight for the more severe AE. This method was chosen over summing the AE weights because in many cases the AEs were thought to overlap; hence, summing would be inappropriate. For example, generally a deep SSI would result in a return to the operating room, and one would not want to double-count this AE. Similarly, it would not make sense for a patient who died of a complication to have a SWORD of >100%, which would be the summing result.
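In code, the assignment rule reduces to taking the maximum weight over a patient's AEs (0% with no AE, 100% with death). A minimal sketch using a few of the mean weights reported in the Results:

```python
# Subset of mean severity weights (% of death); death is 100 by definition.
WEIGHTS = {
    "death": 100.0,
    "deep SSI": 1.45,
    "return to operating room": 0.91,
    "UTI": 0.23,
}

def sword(patient_aes):
    """SWORD for one patient: the weight of the single most severe AE.
    Weights are not summed, so overlapping AEs are not double-counted."""
    return max((WEIGHTS[ae] for ae in patient_aes), default=0.0)

print(sword([]))                                        # 0.0  (no AE)
print(sword(["deep SSI", "return to operating room"]))  # 1.45 (the max)
```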
Application to ACS-NSQIP Patients
ACS-NSQIP is a surgical registry that prospectively identifies patients undergoing major surgery at any of >500 institutions nationwide.26,27 Patients are characterized at baseline and are followed for AEs over the first 30 postoperative days.
First, mean SWORD was calculated and reported for patients undergoing each of the 8 procedures. Analysis of variance (ANOVA) was used to test for associations of mean SWORD with type of procedure both before and after multivariate adjustment for demographics (sex; age in years, <40, 40-49, 50-59, 60-69, 70-79, 80-89, ≥90) and comorbidities (diabetes, hypertension, chronic obstructive pulmonary disease, exertional dyspnea, end-stage renal disease, congestive heart failure).
Second, patients undergoing the procedure with the highest mean SWORD (hip fracture surgery) were examined in depth. Among only these patients, multivariate ANOVA was used to test for associations of mean SWORD with the same demographics and comorbidities.
All statistical tests were 2-tailed. Significance was set at α = 0.05 (P < .05).
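One way to carry out this kind of adjusted analysis is an ordinary least squares model followed by a type II ANOVA. The sketch below runs on synthetic data with hypothetical column names (pandas and statsmodels assumed; only a subset of the covariates is shown) and is not the study's actual pipeline:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Synthetic stand-in for the registry: one row per patient.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "sword": rng.exponential(2.0, n),  # SWORD, in % of death
    "procedure": rng.choice(["hip fracture", "TKA", "ACDF"], n),
    "age_group": rng.choice(["<40", "40-49", "50-59", "60-69"], n),
    "sex": rng.choice(["M", "F"], n),
    "diabetes": rng.integers(0, 2, n),
})

# Mean SWORD by procedure, adjusted for demographics and a comorbidity.
model = smf.ols(
    "sword ~ C(procedure) + C(age_group) + C(sex) + diabetes", data=df
).fit()
print(sm.stats.anova_lm(model, typ=2))  # per-factor association with SWORD
```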
All 23 institution A faculty members (100%) and 24 (89%) of the 27 institution B faculty members completed the exercise.
In the ACS-NSQIP database, 85,109 patients were identified on the basis of the initial inclusion criteria.
Results
Figure 1 shows mean severity weights and standard errors generated from faculty responses. Mean (standard error) severity weight for UTI was 0.23% (0.08%); blood transfusion, 0.28% (0.09%); pneumonia, 0.55% (0.15%); hospital readmission, 0.59% (0.23%); wound dehiscence, 0.64% (0.17%); deep vein thrombosis, 0.64% (0.19%); superficial SSI, 0.68% (0.23%); return to operating room, 0.91% (0.29%); progressive renal insufficiency, 0.93% (0.27%); graft/prosthesis/flap failure, 1.20% (0.34%); unplanned intubation, 1.38% (0.53%); deep SSI, 1.45% (0.38%); failure to wean from ventilator, 1.45% (0.48%); organ/space SSI, 1.76% (0.46%); sepsis without shock, 1.77% (0.42%); peripheral nerve injury, 1.83% (0.47%); pulmonary embolism, 2.99% (0.76%); acute renal failure, 3.95% (0.85%); myocardial infarction, 4.16% (0.98%); septic shock, 7.17% (1.36%); stroke, 8.73% (1.74%); cardiac arrest requiring cardiopulmonary resuscitation, 9.97% (2.46%); and coma, 15.14% (3.04%).
Among ACS-NSQIP patients, mean SWORD ranged from 0.2% (elective anterior cervical decompression and fusion) to 6.0% (hip fracture surgery) (Figure 2).
Discussion
The use of national databases in studies has become increasingly common in orthopedic surgery.1-4
The academic orthopedic surgeons who participated in our severity-weighting exercise perceived the various AEs as having markedly different severities. The least severe AE (UTI) was considered 0.23% as severe as postoperative death, with other events spanning the range up to 15.14% as severe as death. This wide range demonstrates the problem with composite outcomes that implicitly treat all AEs as similarly severe. Use of these markedly disparate weights in the development of SWORD makes this outcome more clinically applicable than outcomes such as “any adverse event.”
SWORD was highly associated with procedure type both before and after adjustment for demographics and comorbidities. Among patients undergoing the highest SWORD procedure (hip fracture surgery), SWORD was also associated with age, sex, and 4 of 6 tested comorbidities. Together, our findings show how SWORD is intended to be used in studies: to identify demographic, comorbidity, and procedural risk factors for an adverse postoperative course. We propose that researchers use our weighted outcome as their primary outcome—it is more meaningful than the simpler composite outcomes commonly used.
Outside orthopedic surgery, a small series of studies has addressed severity weighting of postoperative AEs.25,28-30 However, their approach was very different, as they were not designed to generate weights that could be transferred to future studies; rather, they simply compared severities of postoperative courses for patients within each individual study. In each study, a review of each original patient record was required, as the severity of each patient’s postoperative course was characterized according to the degree of any postoperative intervention—from no intervention to minor interventions such as placement of an intravenous catheter and major interventions such as endoscopic, radiologic, and surgical procedures. Only after the degree of intervention was defined could an outcome score be assigned to a given patient. However, databases do not depict the degree of intervention with nearly enough detail for this type of approach; they typically identify only occurrence or nonoccurrence of each event. Our work, which arose independently from this body of literature, enables an entirely different type of analysis. SWORD, which is not based on degree of intervention but on perceived severity of an “average” event, enables direct application of severity weights to large databases that store simple information on occurrence and nonoccurrence of specific AEs.
This study had several limitations. Most significantly, the generated severity weights were based on surgeons’ subjective perceptions of severity, not on definitive assessments of the impacts of specific AEs on actual patients. We did not query the specialists who treat these complications, nor did we incorporate data on the costs and disabilities that may arise from these AEs. In addition, to develop our severity-weighting scale, we queried faculty at only 2 institutions; a survey of surgeons throughout the United States would be more representative and would minimize selection bias. This is a potential area for future research. Another limitation is that scoring was subjective, based on surgeons’ perceptions of patients, in contrast to the Global Burden of Disease project, in which severity was based more objectively on epidemiologic data from >150 countries.
Orthopedic database research itself has often-noted limitations, including inability to sufficiently control for confounders, potential inaccuracies in data coding, limited follow-up, and lack of orthopedic-specific outcomes.1-4,31-33 However, this research also has much to offer, has increased tremendously over the past several years, and is expected to continue to expand. Many of the limitations of database studies cannot be entirely reversed. In providing a system for weighting postoperative AEs, our study fills a methodologic void. Future studies in orthopedics may benefit from using the severity-weighted outcome score presented here. Other fields with growth in database research may consider using similar methods to create severity-weighting systems of their own.
Am J Orthop. 2017;46(4):E235-E243. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
1. Bohl DD, Basques BA, Golinvaux NS, Baumgaertner MR, Grauer JN. Nationwide Inpatient Sample and National Surgical Quality Improvement Program give different results in hip fracture studies. Clin Orthop Relat Res. 2014;472(6):1672-1680.
2. Bohl DD, Russo GS, Basques BA, et al. Variations in data collection methods between national databases affect study results: a comparison of the Nationwide Inpatient Sample and National Surgical Quality Improvement Program databases for lumbar spine fusion procedures. J Bone Joint Surg Am. 2014;96(23):e193.
3. Bohl DD, Grauer JN, Leopold SS. Editor’s spotlight/Take 5: Nationwide Inpatient Sample and National Surgical Quality Improvement Program give different results in hip fracture studies. Clin Orthop Relat Res. 2014;472(6):1667-1671.
4. Levin PE. Apples, oranges, and national databases: commentary on an article by Daniel D. Bohl, MPH, et al.: “Variations in data collection methods between national databases affect study results: a comparison of the Nationwide Inpatient Sample and National Surgical Quality Improvement Program databases for lumbar spine fusion procedures.” J Bone Joint Surg Am. 2014;96(23):e198.
5. Duchman KR, Gao Y, Pugely AJ, Martin CT, Callaghan JJ. Differences in short-term complications between unicompartmental and total knee arthroplasty: a propensity score matched analysis. J Bone Joint Surg Am. 2014;96(16):1387-1394.
6. Edelstein AI, Lovecchio FC, Saha S, Hsu WK, Kim JY. Impact of resident involvement on orthopaedic surgery outcomes: an analysis of 30,628 patients from the American College of Surgeons National Surgical Quality Improvement Program database. J Bone Joint Surg Am. 2014;96(15):e131.
7. Belmont PJ Jr, Goodman GP, Waterman BR, Bader JO, Schoenfeld AJ. Thirty-day postoperative complications and mortality following total knee arthroplasty: incidence and risk factors among a national sample of 15,321 patients. J Bone Joint Surg Am. 2014;96(1):20-26.
8. Martin CT, Pugely AJ, Gao Y, Mendoza-Lattes S. Thirty-day morbidity after single-level anterior cervical discectomy and fusion: identification of risk factors and emphasis on the safety of outpatient procedures. J Bone Joint Surg Am. 2014;96(15):1288-1294.
9. Martin CT, Pugely AJ, Gao Y, Wolf BR. Risk factors for thirty-day morbidity and mortality following knee arthroscopy: a review of 12,271 patients from the National Surgical Quality Improvement Program database. J Bone Joint Surg Am. 2013;95(14):e98 1-10.
10. Pugely AJ, Martin CT, Gao Y, Mendoza-Lattes S, Callaghan JJ. Differences in short-term complications between spinal and general anesthesia for primary total knee arthroplasty. J Bone Joint Surg Am. 2013;95(3):193-199.
11. Odum SM, Springer BD. In-hospital complication rates and associated factors after simultaneous bilateral versus unilateral total knee arthroplasty. J Bone Joint Surg Am. 2014;96(13):1058-1065.
12. Yoshihara H, Yoneoka D. Trends in the incidence and in-hospital outcomes of elective major orthopaedic surgery in patients eighty years of age and older in the United States from 2000 to 2009. J Bone Joint Surg Am. 2014;96(14):1185-1191.
13. Lin CA, Kuo AC, Takemoto S. Comorbidities and perioperative complications in HIV-positive patients undergoing primary total hip and knee arthroplasty. J Bone Joint Surg Am. 2013;95(11):1028-1036.
14. Mednick RE, Alvi HM, Krishnan V, Lovecchio F, Manning DW. Factors affecting readmission rates following primary total hip arthroplasty. J Bone Joint Surg Am. 2014;96(14):1201-1209.
15. Pugely AJ, Martin CT, Gao Y, Ilgenfritz R, Weinstein SL. The incidence and risk factors for short-term morbidity and mortality in pediatric deformity spinal surgery: an analysis of the NSQIP pediatric database. Spine. 2014;39(15):1225-1234.
16. Haughom BD, Schairer WW, Hellman MD, Yi PH, Levine BR. Resident involvement does not influence complication after total hip arthroplasty: an analysis of 13,109 cases. J Arthroplasty. 2014;29(10):1919-1924.
17. Belmont PJ Jr, Goodman GP, Hamilton W, Waterman BR, Bader JO, Schoenfeld AJ. Morbidity and mortality in the thirty-day period following total hip arthroplasty: risk factors and incidence. J Arthroplasty. 2014;29(10):2025-2030.
18. Bohl DD, Fu MC, Golinvaux NS, Basques BA, Gruskay JA, Grauer JN. The “July effect” in primary total hip and knee arthroplasty: analysis of 21,434 cases from the ACS-NSQIP database. J Arthroplasty. 2014;29(7):1332-1338.
19. Bohl DD, Fu MC, Gruskay JA, Basques BA, Golinvaux NS, Grauer JN. “July effect” in elective spine surgery: analysis of the American College of Surgeons National Surgical Quality Improvement Program database. Spine. 2014;39(7):603-611.
20. Babu R, Thomas S, Hazzard MA, et al. Morbidity, mortality, and health care costs for patients undergoing spine surgery following the ACGME resident duty-hour reform: clinical article. J Neurosurg Spine. 2014;21(4):502-515.
21. Lovecchio F, Beal M, Kwasny M, Manning D. Do patients with insulin-dependent and noninsulin-dependent diabetes have different risks for complications after arthroplasty? Clin Orthop Relat Res. 2014;472(11):3570-3575.
22. Pugely AJ, Gao Y, Martin CT, Callaghan JJ, Weinstein SL, Marsh JL. The effect of resident participation on short-term outcomes after orthopaedic surgery. Clin Orthop Relat Res. 2014;472(7):2290-2300.
23. Easterlin MC, Chang DG, Talamini M, Chang DC. Older age increases short-term surgical complications after primary knee arthroplasty. Clin Orthop Relat Res. 2013;471(8):2611-2620.
24. Morimoto T, Fukui T. Utilities measured by rating scale, time trade-off, and standard gamble: review and reference for health care professionals. J Epidemiol. 2002;12(2):160-178.
25. Salomon JA, Vos T, Hogan DR, et al. Common values in assessing health outcomes from disease and injury: disability weights measurement study for the Global Burden of Disease Study 2010. Lancet. 2012;380(9859):2129-2143.
26. American College of Surgeons National Surgical Quality Improvement Program. User Guide for the 2011 Participant Use Data File. https://www.facs.org/~/media/files/quality%20programs/nsqip/ug11.ashx. Published October 2012. Accessed December 1, 2013.
27. Molina CS, Thakore RV, Blumer A, Obremskey WT, Sethi MK. Use of the National Surgical Quality Improvement Program in orthopaedic surgery. Clin Orthop Relat Res. 2015;473(5):1574-1581.
28. Strasberg SM, Hall BL. Postoperative Morbidity Index: a quantitative measure of severity of postoperative complications. J Am Coll Surg. 2011;213(5):616-626.
29. Beilan J, Strakosha R, Palacios DA, Rosser CJ. The Postoperative Morbidity Index: a quantitative weighing of postoperative complications applied to urological procedures. BMC Urol. 2014;14:1.
30. Porembka MR, Hall BL, Hirbe M, Strasberg SM. Quantitative weighting of postoperative complications based on the Accordion Severity Grading System: demonstration of potential impact using the American College of Surgeons National Surgical Quality Improvement Program. J Am Coll Surg. 2010;210(3):286-298.
31. Golinvaux NS, Bohl DD, Basques BA, Fu MC, Gardner EC, Grauer JN. Limitations of administrative databases in spine research: a study in obesity. Spine J. 2014;14(12):2923-2928.
32. Golinvaux NS, Bohl DD, Basques BA, Grauer JN. Administrative database concerns: accuracy of International Classification of Diseases, Ninth Revision coding is poor for preoperative anemia in patients undergoing spinal fusion. Spine. 2014;39(24):2019-2023.
33. Bekkers S, Bot AG, Makarawung D, Neuhaus V, Ring D. The National Hospital Discharge Survey and Nationwide Inpatient Sample: the databases used affect results in THA research. Clin Orthop Relat Res. 2014;472(11):3441-3449.
To address this shortcoming in orthopedic research methodology, we created a severity-weighted outcome score that can be used to determine the overall “severity” of any given patient’s postoperative course. We also tested this novel outcome score for correlation with procedure type and patient characteristics using orthopedic patients from the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP). Our intention is for database investigators to be able to use this outcome score in place of the composite outcomes that are dominating this type of research.
Methods
Generation of Severity Weights
Our method is a form of utility weighting: assigning value weights, reflective of overall impact, to differing outcome states.24 Parallel methods have been used to generate the disability weights used to determine disability-adjusted life years for the Global Burden of Disease project25 and in many other areas of health, economic, and policy research.
All orthopedic faculty members at 2 geographically disparate, large US academic institutions were invited to participate in a severity-weighting exercise. Each surgeon who agreed to participate performed the exercise independently, working through a set of cards (one per AE) according to the following instructions:
- STEP 1: Please reorder the AE cards by your perception of “severity” for a patient experiencing that event after an orthopedic procedure.
- STEP 2: Once your cards are in order, please determine how many postoperative occurrences of each event you would “trade” for 1 patient experiencing postoperative death. Place this number of occurrences in the box in the upper right corner of each card.
- NOTES: As you consider each AE:
- Please consider an “average” occurrence of that AE, but note that in no case does the AE result in perioperative death.
- Please consider only the “severity” for the patient. (Do not consider the extent to which the event may be related to surgical error.)
- Please consider that the numbers you assign are relative to each other. Hence, if you would trade 20 of “event A” for 1 death, and if you would trade 40 of “event B” for 1 death, the implication is that you would trade 20 of “event A” for 40 of “event B.”
- You may readjust the order of your cards at any point.
Participants’ responses were recorded. For each number provided by each participant, the inverse (reciprocal) was taken and multiplied by 100%. This new number was taken to be the percentage severity of death that the given participant considered the given AE to embody. For example, as a hypothetical at one end of the spectrum, if a participant reported 1 (he/she would trade 1 occurrence of AE X for 1 death), the severity would be 1/1 × 100% = 100% of death, a very severe AE. Conversely, if a participant reported a very large number, such as 100,000 (he/she would trade 100,000 occurrences of AE X for 1 death), the severity would be 1/100,000 × 100% = 0.001% of death, a very minor AE. More commonly, a participant reported a number such as 25, which translates to 4% of death (1/25 × 100% = 4%). For each AE, weights were then averaged across participants to derive a mean severity weight to be used to generate a novel composite outcome score.
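To make the arithmetic concrete, the following minimal sketch reproduces the reciprocal-and-average calculation; the participant responses shown are hypothetical, not the study's data.

```python
# Sketch of the severity-weight derivation described above.
# Each response is the number of occurrences of an AE a participant
# would "trade" for 1 postoperative death (values are hypothetical).
responses = {
    "UTI": [400, 500, 435],
    "myocardial infarction": [20, 25, 30],
}

severity_weights = {}
for ae, trades in responses.items():
    # Reciprocal of each response, expressed as a percentage of death,
    # then averaged across participants.
    per_participant = [1.0 / n * 100.0 for n in trades]
    severity_weights[ae] = sum(per_participant) / len(per_participant)

# A response of 25 translates to 1/25 x 100% = 4% of death.
print(severity_weights)
```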
Definition of Novel Composite Outcome Score
The novel composite outcome score would be expressed as a percentage to be interpreted as percentage severity of death, which we termed severity-weighted outcome relative to death (SWORD). For each patient, SWORD was defined as 0% for no AE or 100% for postoperative death, with other AEs assigned the mean severity weights derived from faculty members’ survey responses. A patient with multiple AEs was assigned the weight of the most severe AE. This method was chosen over summing the AE weights because in many cases the AEs were thought to overlap; hence, summing would be inappropriate. For example, a deep SSI generally results in a return to the operating room, and one would not want to double-count this AE. Similarly, it would not make sense for a patient who died of a complication to have a SWORD of >100%, which summing would produce.
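A minimal sketch of this scoring rule follows, with weights taken from the study's reported means; the function and input format are ours, not the authors'.

```python
# Sketch of per-patient SWORD assignment: 0% for no AE, 100% for
# death, otherwise the weight of the single most severe AE
# (weights are not summed, to avoid double-counting overlapping AEs).
def sword(patient_aes, weights):
    if not patient_aes:
        return 0.0
    if "death" in patient_aes:
        return 100.0
    return max(weights[ae] for ae in patient_aes)

weights = {"deep SSI": 1.45, "return to operating room": 0.91}
# A deep SSI with an associated return to the OR scores 1.45%, not 2.36%.
print(sword(["deep SSI", "return to operating room"], weights))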
Application to ACS-NSQIP Patients
ACS-NSQIP is a surgical registry that prospectively identifies patients undergoing major surgery at any of >500 institutions nationwide.26,27 Patients are characterized at baseline and are followed for AEs over the first 30 postoperative days.
First, mean SWORD was calculated and reported for patients undergoing each of the 8 procedures. Analysis of variance (ANOVA) was used to test for associations of mean SWORD with type of procedure both before and after multivariate adjustment for demographics (sex; age in years, <40, 40-49, 50-59, 60-69, 70-79, 80-89, ≥90) and comorbidities (diabetes, hypertension, chronic obstructive pulmonary disease, exertional dyspnea, end-stage renal disease, congestive heart failure).
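As an illustration of this type of adjusted analysis, here is a toy sketch using statsmodels; the data are synthetic, only a subset of the comorbidities is included, and the column names are our own invention.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Synthetic stand-in for the ACS-NSQIP extract (hypothetical columns).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "sword": rng.exponential(2.0, n),
    "procedure": rng.choice(["ACDF", "hip fracture", "TKA"], n),
    "sex": rng.choice(["F", "M"], n),
    "age_group": rng.choice(["<40", "40-49", "50-59", "60-69"], n),
    "diabetes": rng.integers(0, 2, n),
    "hypertension": rng.integers(0, 2, n),
})

# Association of SWORD with procedure, before and after adjustment.
unadjusted = smf.ols("sword ~ C(procedure)", data=df).fit()
adjusted = smf.ols(
    "sword ~ C(procedure) + C(sex) + C(age_group) + diabetes + hypertension",
    data=df,
).fit()
print(anova_lm(unadjusted, typ=2))
print(anova_lm(adjusted, typ=2))
```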
Second, patients undergoing the procedure with the highest mean SWORD (hip fracture surgery) were examined in depth. Among only these patients, multivariate ANOVA was used to test for associations of mean SWORD with the same demographics and comorbidities.
All statistical tests were 2-tailed. Significance was set at α = 0.05 (P < .05).
Results
All 23 institution A faculty members (100%) and 24 (89%) of the 27 institution B faculty members completed the exercise.
In the ACS-NSQIP database, 85,109 patients were identified on the basis of the initial inclusion criteria.
Figure 1 shows mean severity weights and standard errors generated from faculty responses. Mean (standard error) severity weight for UTI was 0.23% (0.08%); blood transfusion, 0.28% (0.09%); pneumonia, 0.55% (0.15%); hospital readmission, 0.59% (0.23%); wound dehiscence, 0.64% (0.17%); deep vein thrombosis, 0.64% (0.19%); superficial SSI, 0.68% (0.23%); return to operating room, 0.91% (0.29%); progressive renal insufficiency, 0.93% (0.27%); graft/prosthesis/flap failure, 1.20% (0.34%); unplanned intubation, 1.38% (0.53%); deep SSI, 1.45% (0.38%); failure to wean from ventilator, 1.45% (0.48%); organ/space SSI, 1.76% (0.46%); sepsis without shock, 1.77% (0.42%); peripheral nerve injury, 1.83% (0.47%); pulmonary embolism, 2.99% (0.76%); acute renal failure, 3.95% (0.85%); myocardial infarction, 4.16% (0.98%); septic shock, 7.17% (1.36%); stroke, 8.73% (1.74%); cardiac arrest requiring cardiopulmonary resuscitation, 9.97% (2.46%); and coma, 15.14% (3.04%).
Among ACS-NSQIP patients, mean SWORD ranged from 0.2% (elective anterior cervical decompression and fusion) to 6.0% (hip fracture surgery) (Figure 2).
Discussion
The use of national databases in studies has become increasingly common in orthopedic surgery.1-4
The academic orthopedic surgeons who participated in our severity-weighting exercise thought the various AEs have markedly different severities. The least severe AE (UTI) was considered 0.23% as severe as postoperative death, with other events spanning the range up to 15.14% as severe as death. This wide range of severities demonstrates the problem with composite outcomes that implicitly consider all AEs similarly severe. Use of these markedly disparate weights in the development of SWORD enables this outcome to be more clinically applicable than outcomes such as “any adverse events.”
SWORD was highly associated with procedure type both before and after adjustment for demographics and comorbidities. Among patients undergoing the highest SWORD procedure (hip fracture surgery), SWORD was also associated with age, sex, and 4 of 6 tested comorbidities. Together, our findings show how SWORD is intended to be used in studies: to identify demographic, comorbidity, and procedural risk factors for an adverse postoperative course. We propose that researchers use our weighted outcome as their primary outcome—it is more meaningful than the simpler composite outcomes commonly used.
Outside orthopedic surgery, a small series of studies has addressed severity weighting of postoperative AEs.25,28-30 However, their approach was very different, as they were not designed to generate weights that could be transferred to future studies; rather, they simply compared severities of postoperative courses for patients within each individual study. In each study, a review of each original patient record was required, as the severity of each patient’s postoperative course was characterized according to the degree of any postoperative intervention—from no intervention to minor interventions such as placement of an intravenous catheter and major interventions such as endoscopic, radiologic, and surgical procedures. Only after the degree of intervention was defined could an outcome score be assigned to a given patient. However, databases do not depict the degree of intervention with nearly enough detail for this type of approach; they typically identify only occurrence or nonoccurrence of each event. Our work, which arose independently from this body of literature, enables an entirely different type of analysis. SWORD, which is not based on degree of intervention but on perceived severity of an “average” event, enables direct application of severity weights to large databases that store simple information on occurrence and nonoccurrence of specific AEs.
This study had several limitations. Most significantly, the generated severity weights were based on the surgeons’ subjective perceptions of severity, not on definitive assessments of the impacts of specific AEs on actual patients. We did not query the specialists who treat these complications, nor did we incorporate data on the costs and disabilities that may arise from these AEs. In addition, to develop our severity-weighting scale, we queried faculty at only 2 institutions. A survey of surgeons throughout the United States would be more representative and would minimize selection bias; this is a potential area for future research. Another limitation is that scoring was subjective, based on surgeons’ perceptions of patients, in contrast to the Global Burden of Disease project, in which severity was based more objectively on epidemiologic data from >150 countries.
Orthopedic database research itself has often-noted limitations, including inability to sufficiently control for confounders, potential inaccuracies in data coding, limited follow-up, and lack of orthopedic-specific outcomes.1-4,31-33 However, this research also has much to offer, has increased tremendously over the past several years, and is expected to continue to expand. Many of the limitations of database studies cannot be entirely overcome. In providing a system for weighting postoperative AEs, our study fills a methodologic void. Future studies in orthopedics may benefit from using the severity-weighted outcome score presented here. Other fields with growth in database research may consider using similar methods to create severity-weighting systems of their own.
Am J Orthop. 2017;46(4):E235-E243. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
1. Bohl DD, Basques BA, Golinvaux NS, Baumgaertner MR, Grauer JN. Nationwide Inpatient Sample and National Surgical Quality Improvement Program give different results in hip fracture studies. Clin Orthop Relat Res. 2014;472(6):1672-1680.
2. Bohl DD, Russo GS, Basques BA, et al. Variations in data collection methods between national databases affect study results: a comparison of the Nationwide Inpatient Sample and National Surgical Quality Improvement Program databases for lumbar spine fusion procedures. J Bone Joint Surg Am. 2014;96(23):e193.
3. Bohl DD, Grauer JN, Leopold SS. Editor’s spotlight/Take 5: Nationwide Inpatient Sample and National Surgical Quality Improvement Program give different results in hip fracture studies. Clin Orthop Relat Res. 2014;472(6):1667-1671.
4. Levin PE. Apples, oranges, and national databases: commentary on an article by Daniel D. Bohl, MPH, et al.: “Variations in data collection methods between national databases affect study results: a comparison of the Nationwide Inpatient Sample and National Surgical Quality Improvement Program databases for lumbar spine fusion procedures.” J Bone Joint Surg Am. 2014;96(23):e198.
5. Duchman KR, Gao Y, Pugely AJ, Martin CT, Callaghan JJ. Differences in short-term complications between unicompartmental and total knee arthroplasty: a propensity score matched analysis. J Bone Joint Surg Am. 2014;96(16):1387-1394.
6. Edelstein AI, Lovecchio FC, Saha S, Hsu WK, Kim JY. Impact of resident involvement on orthopaedic surgery outcomes: an analysis of 30,628 patients from the American College of Surgeons National Surgical Quality Improvement Program database. J Bone Joint Surg Am. 2014;96(15):e131.
7. Belmont PJ Jr, Goodman GP, Waterman BR, Bader JO, Schoenfeld AJ. Thirty-day postoperative complications and mortality following total knee arthroplasty: incidence and risk factors among a national sample of 15,321 patients. J Bone Joint Surg Am. 2014;96(1):20-26.
8. Martin CT, Pugely AJ, Gao Y, Mendoza-Lattes S. Thirty-day morbidity after single-level anterior cervical discectomy and fusion: identification of risk factors and emphasis on the safety of outpatient procedures. J Bone Joint Surg Am. 2014;96(15):1288-1294.
9. Martin CT, Pugely AJ, Gao Y, Wolf BR. Risk factors for thirty-day morbidity and mortality following knee arthroscopy: a review of 12,271 patients from the National Surgical Quality Improvement Program database. J Bone Joint Surg Am. 2013;95(14):e98.
10. Pugely AJ, Martin CT, Gao Y, Mendoza-Lattes S, Callaghan JJ. Differences in short-term complications between spinal and general anesthesia for primary total knee arthroplasty. J Bone Joint Surg Am. 2013;95(3):193-199.
11. Odum SM, Springer BD. In-hospital complication rates and associated factors after simultaneous bilateral versus unilateral total knee arthroplasty. J Bone Joint Surg Am. 2014;96(13):1058-1065.
12. Yoshihara H, Yoneoka D. Trends in the incidence and in-hospital outcomes of elective major orthopaedic surgery in patients eighty years of age and older in the United States from 2000 to 2009. J Bone Joint Surg Am. 2014;96(14):1185-1191.
13. Lin CA, Kuo AC, Takemoto S. Comorbidities and perioperative complications in HIV-positive patients undergoing primary total hip and knee arthroplasty. J Bone Joint Surg Am. 2013;95(11):1028-1036.
14. Mednick RE, Alvi HM, Krishnan V, Lovecchio F, Manning DW. Factors affecting readmission rates following primary total hip arthroplasty. J Bone Joint Surg Am. 2014;96(14):1201-1209.
15. Pugely AJ, Martin CT, Gao Y, Ilgenfritz R, Weinstein SL. The incidence and risk factors for short-term morbidity and mortality in pediatric deformity spinal surgery: an analysis of the NSQIP pediatric database. Spine. 2014;39(15):1225-1234.
16. Haughom BD, Schairer WW, Hellman MD, Yi PH, Levine BR. Resident involvement does not influence complication after total hip arthroplasty: an analysis of 13,109 cases. J Arthroplasty. 2014;29(10):1919-1924.
17. Belmont PJ Jr, Goodman GP, Hamilton W, Waterman BR, Bader JO, Schoenfeld AJ. Morbidity and mortality in the thirty-day period following total hip arthroplasty: risk factors and incidence. J Arthroplasty. 2014;29(10):2025-2030.
18. Bohl DD, Fu MC, Golinvaux NS, Basques BA, Gruskay JA, Grauer JN. The “July effect” in primary total hip and knee arthroplasty: analysis of 21,434 cases from the ACS-NSQIP database. J Arthroplasty. 2014;29(7):1332-1338.
19. Bohl DD, Fu MC, Gruskay JA, Basques BA, Golinvaux NS, Grauer JN. “July effect” in elective spine surgery: analysis of the American College of Surgeons National Surgical Quality Improvement Program database. Spine. 2014;39(7):603-611.
20. Babu R, Thomas S, Hazzard MA, et al. Morbidity, mortality, and health care costs for patients undergoing spine surgery following the ACGME resident duty-hour reform: clinical article. J Neurosurg Spine. 2014;21(4):502-515.
21. Lovecchio F, Beal M, Kwasny M, Manning D. Do patients with insulin-dependent and noninsulin-dependent diabetes have different risks for complications after arthroplasty? Clin Orthop Relat Res. 2014;472(11):3570-3575.
22. Pugely AJ, Gao Y, Martin CT, Callaghan JJ, Weinstein SL, Marsh JL. The effect of resident participation on short-term outcomes after orthopaedic surgery. Clin Orthop Relat Res. 2014;472(7):2290-2300.
23. Easterlin MC, Chang DG, Talamini M, Chang DC. Older age increases short-term surgical complications after primary knee arthroplasty. Clin Orthop Relat Res. 2013;471(8):2611-2620.
24. Morimoto T, Fukui T. Utilities measured by rating scale, time trade-off, and standard gamble: review and reference for health care professionals. J Epidemiol. 2002;12(2):160-178.
25. Salomon JA, Vos T, Hogan DR, et al. Common values in assessing health outcomes from disease and injury: disability weights measurement study for the Global Burden of Disease Study 2010. Lancet. 2012;380(9859):2129-2143.
26. American College of Surgeons National Surgical Quality Improvement Program. User Guide for the 2011 Participant Use Data File. https://www.facs.org/~/media/files/quality%20programs/nsqip/ug11.ashx. Published October 2012. Accessed December 1, 2013.
27. Molina CS, Thakore RV, Blumer A, Obremskey WT, Sethi MK. Use of the National Surgical Quality Improvement Program in orthopaedic surgery. Clin Orthop Relat Res. 2015;473(5):1574-1581.
28. Strasberg SM, Hall BL. Postoperative Morbidity Index: a quantitative measure of severity of postoperative complications. J Am Coll Surg. 2011;213(5):616-626.
29. Beilan J, Strakosha R, Palacios DA, Rosser CJ. The Postoperative Morbidity Index: a quantitative weighing of postoperative complications applied to urological procedures. BMC Urol. 2014;14:1.
30. Porembka MR, Hall BL, Hirbe M, Strasberg SM. Quantitative weighting of postoperative complications based on the Accordion Severity Grading System: demonstration of potential impact using the American College of Surgeons National Surgical Quality Improvement Program. J Am Coll Surg. 2010;210(3):286-298.
31. Golinvaux NS, Bohl DD, Basques BA, Fu MC, Gardner EC, Grauer JN. Limitations of administrative databases in spine research: a study in obesity. Spine J. 2014;14(12):2923-2928.
32. Golinvaux NS, Bohl DD, Basques BA, Grauer JN. Administrative database concerns: accuracy of International Classification of Diseases, Ninth Revision coding is poor for preoperative anemia in patients undergoing spinal fusion. Spine. 2014;39(24):2019-2023.
33. Bekkers S, Bot AG, Makarawung D, Neuhaus V, Ring D. The National Hospital Discharge Survey and Nationwide Inpatient Sample: the databases used affect results in THA research. Clin Orthop Relat Res. 2014;472(11):3441-3449.
Clinical Outcomes After Conversion from Low-Molecular-Weight Heparin to Unfractionated Heparin for Venous Thromboembolism Prophylaxis
From the Anne Arundel Health System Research Institute, Annapolis, MD.
Abstract
- Objective: To measure clinical outcomes associated with heparin-induced thrombocytopenia (HIT) and acquisition costs of heparin after implementing a new order set promoting unfractionated heparin (UFH) use instead of low-molecular-weight heparin (LMWH) for venous thromboembolism (VTE) prophylaxis.
- Methods: This was a single-center, retrospective, pre-post intervention analysis utilizing pharmacy, laboratory, and clinical data sources. Subjects were patients receiving VTE thromboprophylaxis with heparin at an acute care hospital. Usage rates for UFH and LMWH, acquisition costs for heparins, number of HIT assays, best practice advisories for HIT, and confirmed cases of HIT and HIT with thrombosis were assessed.
- Results: After the order set intervention, UFH use increased from 43% of all prophylaxis orders to 86%. Net annual savings in acquisition costs for VTE prophylaxis were $131,000. After the intervention, HIT best practice advisories and the number of monthly HIT assays fell 35% and 15%, respectively. In the 9-month pre-intervention period, HIT and HITT occurred in zero of 6717 patients receiving VTE prophylaxis. In the 25 months of post-intervention follow-up, HIT occurred in 3 of 44,240 patients receiving VTE prophylaxis (P = 0.86), 2 of whom had HITT, all after receiving UFH. The median duration of UFH and LMWH use was 3.0 and 3.5 days, respectively.
- Conclusion: UFH use in hospitals can be safely maintained or increased among patient subpopulations that are not at high risk for HIT. A more nuanced approach to prophylaxis, taking into account individual patient risk and expected duration of therapy, may provide desired cost savings without provoking HIT.
Key words: heparin; heparin-induced thrombocytopenia; venous thromboembolism prophylaxis; cost-effectiveness.
Heparin-induced thrombocytopenia (HIT) and its more severe clinical complication, HIT with thrombosis (HITT), complicate the use of heparin products for venous thromboembolism (VTE) prophylaxis. The clinical characteristics and time course of thrombocytopenia in relation to heparin are well characterized (typically a 30%–50% drop in platelet count 5–10 days after exposure), if not absolute. Risk calculation tools help to judge the clinical probability and guide ordering of appropriate confirmatory tests [1]. The incidence of HIT is higher with unfractionated heparin (UFH) than with low-molecular-weight heparin (LMWH). A meta-analysis of 5 randomized or prospective nonrandomized trials indicated a risk of 2.6% (95% CI, 1.5%–3.8%) for UFH and 0.2% (95% CI, 0.1%–0.4%) for LMWH [2], though the analysis was heavily weighted by studies of orthopedic surgery patients, a high-risk group. However, not all patients are at equal risk for HIT, suggesting that LMWH may not be necessary for all patients [3]. Unfortunately, LMWH is considerably more expensive for hospitals to purchase than UFH, raising costs for a prophylactic treatment that is widely utilized. However, the higher incidence of HIT and HITT associated with UFH can erode any cost savings because of the additional cost of diagnosing HIT and the need for temporary or long-term treatment with even more expensive alternative anticoagulants. Indeed, a recent retrospective study suggested that the excess costs of evaluating and treating HIT were approximately $267,000 per year in Canadian dollars [4]. But contrary data have also been reported. A retrospective study of the consequences of increased prophylactic UFH use found no increase in ordered HIT assays, in the results of HIT testing, or in inferred positive cases despite a 71% growth in the number of patients receiving UFH prophylaxis [5].
In 2013, the pharmacy and therapeutics (P&T) committee decided to encourage the use of UFH over LMWH (enoxaparin) for VTE prophylaxis by modifying the relevant order sets. Given the uncertainty about the excess risk of HIT, a monitoring work group was created to assess for any increase in either HIT or HITT that might follow, including any patient readmitted with thrombosis within 30 days of discharge. In this paper, we report the impact of a hospital-wide conversion to UFH for VTE prophylaxis on the incidence of VTE, HIT, and HITT; on acquisition costs of UFH and LMWH; and on use of alternative prophylactic anticoagulant medications.
Methods
Setting
Anne Arundel Medical Center is a 383-bed acute care hospital with about 30,000 adult admissions and 10,000 inpatient surgeries annually. The average length of stay is approximately 3.6 days, with a median patient age of 59 years. Caucasians comprise 75.3% of the admitted population and African Americans 21.4%. Most patients are on Medicare (59%), while 29.5% have private insurance, 6.6% are on Medicaid, and 4.7% self-pay. The 9 most common medical principal diagnoses are sepsis, heart failure, chronic obstructive pulmonary disease, pneumonia, myocardial infarction, ischemic stroke, urinary tract infection, cardiac arrhythmia, and other infection. The 6 most common procedures include newborn delivery (with and without caesarean section), joint replacement surgery, bariatric procedures, cardiac catheterizations, other abdominal surgeries, and thoracotomy. The predominant medical care model is internal medicine and physician assistant acute care hospitalists attending both medicine and surgical patients. Obstetrical hospitalists care for admitted obstetric patients. Patients admitted to the intensive care units were attended only by critical care trained physician specialists. No trainees cared for the patients described in this study.
P&T Committee
The P&T committee is a multidisciplinary group of health care professionals selected for appointment by the chairs of the committee (the chair of medicine and the director of pharmacy) and approved by the president of the medical staff. The committee has oversight responsibility for all medication policies and order sets involving medications, as well as for the monitoring of clinical outcomes as they relate to medications.
Electronic Medical Record and Best Practice Advisory
Throughout the study period, both pre- and post-intervention, the EMR in use was Epic (Verona, WI), used for all ordering and laboratory results. A best practice advisory (BPA) was in place in the EMR that alerted providers to all cases of thrombocytopenia (platelet count < 100,000/mm3) when there was a concurrent order for any heparin. The BPA highlighted the thrombocytopenia, advised providers to consider HIT as a diagnosis and to order confirmatory tests if clinically appropriate, and provided a direct link to the HIT assay order screen. The BPA did not access information from prior admissions, where heparin might have been used, nor did it determine the percentage drop from the baseline platelet count.
HIT Case Definition and Assays
The 2 laboratory tests for HIT on which this study is based are the heparin-induced platelet antibody test (also known as anti-PF4) and the serotonin release assay (SRA). The heparin-induced platelet antibody test is an enzyme-linked immunosorbent assay (ELISA) that detects IgG, IgM, and IgA antibodies against the platelet factor 4/heparin complex (PF4/heparin). This test was reported as positive if the optical density was 0.4 or higher, and a positive result generated an automatic request for an SRA, a functional assay that measures heparin-dependent platelet activation. The SRA was therefore a “reflex” test, ordered without any knowledge of the clinical characteristics of the case. The HIT assays were performed by a reference laboratory, Quest Diagnostics, in its Chantilly, VA facility. HIT was said to be present both when a characteristic pattern of thrombocytopenia occurring after heparin use was seen [1] and when the confirmatory SRA was positive at a level of > 20% release.
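The two-step rule above can be summarized in a short sketch; the thresholds come from the text, while the function names and inputs are ours.

```python
# Step 1: anti-PF4 ELISA positive at optical density >= 0.4
# automatically reflexes to a serotonin release assay (SRA).
def elisa_reflexes_to_sra(optical_density: float) -> bool:
    return optical_density >= 0.4

# Step 2: HIT is counted only when a characteristic post-heparin
# thrombocytopenia pattern is present AND the SRA shows > 20% release.
def hit_confirmed(characteristic_pattern: bool, sra_release_pct: float) -> bool:
    return characteristic_pattern and sra_release_pct > 20.0

print(elisa_reflexes_to_sra(0.55))   # True: reflex SRA is ordered
print(hit_confirmed(True, 35.0))     # True: counted as HIT
print(hit_confirmed(False, 35.0))    # False: not counted as HIT
```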
Order Set Modifications
After the P&T committee decision in October 2013 to emphasize UFH for VTE prophylaxis, the relevant electronic order sets were altered to present UFH as the first choice for VTE prophylaxis. The order sets still allowed LMWH (enoxaparin) or alternative anticoagulants at the prescriber’s discretion but indicated that they were a second choice. Doses of UFH and LMWH in the order sets were standardized based upon weight and estimated creatinine clearance and, in the case of dosing frequency for UFH, upon the risk of VTE. Order sets for the therapeutic treatment of VTE were not changed.
Data Collection and Analysis
The clinical research committee, the local oversight board for research and performance improvement analyses, reviewed this project and determined that it qualified as a performance improvement analysis based upon the standards of the U.S. Office for Human Research Protections. Some data were extracted from patient medical records and stored in a customized, password-protected database. Access to the database was limited to members of the analysis team, and the database was stripped of all patient identifiers, under the HIPAA privacy rule standard for de-identification (45 CFR 164.514(b)), immediately following the collection of all data elements from the medical record.
An internal pharmacy database was used to determine the volume and actual acquisition cost of prophylactic anticoagulant doses administered during both the pre- and post-intervention periods. To determine whether clinical suspicion for HIT increased after the intervention, a definitive listing of all ordered HIT assays was obtained from laboratory billing records for the 9 months before the conversion (January 2013–September 2013) and for 25 months after the intervention (beginning in November 2013 so as not to include the conversion month). To determine whether the HIT assays were associated with a higher risk score, we identified all cases in which the HIT assay was ordered and retroactively calculated the probability score known as the 4T score [1]. Simultaneously, separate clinical work groups reviewed all cases of hospital-acquired thrombosis, whatever their cause, including patients readmitted with thrombosis up to 30 days after discharge and episodes of bleeding due to anticoagulant use. A chi-square analysis of the incidence of HIT pre- and post-intervention was performed.
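For orientation, retroactively calculated 4T totals map onto standard pretest probability bands; a minimal sketch follows (bands per Lo et al [1]; scoring of the 4 individual components is omitted, and the function is ours).

```python
# Sketch: interpreting a total 4T score (0-8) as a pretest
# probability band for HIT, per Lo et al [1].
def four_t_band(total: int) -> str:
    if not 0 <= total <= 8:
        raise ValueError("4T total must be between 0 and 8")
    if total <= 3:
        return "low"
    if total <= 5:
        return "intermediate"
    return "high"

print(four_t_band(2))  # low
print(four_t_band(5))  # intermediate
print(four_t_band(7))  # high
```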
Results
Heparin Use and Acquisition Costs
HIT Assays and Incidence of HIT and HITT
In the 9 months pre-intervention, HIT and HITT occurred in zero of 6717 patients receiving at least 1 dose of VTE prophylaxis. In the 25 months of post-intervention follow-up, 44,240 patients received prophylaxis with either form of heparin. HIT (clinical suspicion with a positive antibody test and confirmatory SRA) occurred in 3 patients, 2 of whom had HITT, all after UFH. This difference in incidence was not statistically significant by chi-square analysis (P = 0.86).
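The reported P value can be reproduced from the counts above; a minimal sketch using scipy (the 2 × 2 table layout is ours):

```python
# Chi-square comparison of HIT incidence, pre- vs post-intervention.
from scipy.stats import chi2_contingency

table = [[0, 6717 - 0],      # pre-intervention:  [HIT, no HIT]
         [3, 44240 - 3]]     # post-intervention: [HIT, no HIT]
# chi2_contingency applies the Yates continuity correction to
# 2x2 tables by default.
chi2, p, dof, expected = chi2_contingency(table)
print(round(p, 2))  # ~0.86, matching the reported P value
```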
Discussion
Because the efficacy of UFH and LMWH for VTE prophylaxis is equivalent [6], choosing between them involves many factors, including patient-level risk factors such as renal function and risk of bleeding, as well as other considerations such as nursing time, patient preference, risk of HIT, and acquisition cost. Indeed, the most recent version of the American College of Chest Physicians guidelines for prophylaxis against VTE notes that both drugs are recommended with an evidence grade of 1B [7]. Cost is among the considerations regarded as appropriate in choosing among agents. The difference in acquisition costs of > $20 per patient per day can have a major financial impact on a hospital’s pharmacy budget and may be decisive. But a focus only on acquisition cost is shortsighted, as the 2 medications have different complication rates with regard to HIT. Thus the need to track HIT incidence after protocol changes are made is paramount.
In our study, we did not measure thrombocytopenia as an endpoint because acquired thrombocytopenia is too common and too multifactorial to be meaningful. Rather, we used the clinical suspicion for HIT, as measured both by the number of times the BPA fired warnings of low platelets in the setting of recent heparin use and by the number of times clinicians suspected HIT enough to order a HIT assay. We also used actual outcomes (clinically adjudicated cases of HIT and HITT). Our data show substantial compliance among clinicians with the voluntary conversion to UFH, with an immediate and sustained shift such that UFH was used in 86% of patients. Corresponding cost savings were achieved in heparin acquisition. Unlike in some prior reports, there was a minimal burden of HIT: the numbers of BPAs and monthly HIT assays did not increase, and the clinical risk (4T) scores among patients in whom the test was ordered were unchanged pre- and post-intervention. HIT rates were not statistically different after the order set conversion took effect.
Our results and study design are similar but not identical to those of Zhou et al, who found that a campaign to increase VTE prophylaxis resulted in a 71% increase in UFH use over 5 years but no increase in the number of HIT assays ordered or in the distribution of HIT assay results, both surrogate endpoints [5]. But not all analyses of heparin order interventions show similar results. A recent study of a heparin avoidance program in a Canadian tertiary care hospital showed reductions of 79% and 91% in adjudicated cases of HIT and HITT, respectively [4]. Moreover, hospital-related expenditures for HIT decreased by nearly $267,000 (Canadian dollars) per year, though the additional acquisition costs of LMWH were not stated. A small retrospective heparin avoidance protocol among orthopedic surgery patients showed a reduction in HIT incidence from 5.2% with UFH to 0% after universal substitution of LMWH for UFH [8]. A recent systematic review identified only 3 prospective studies, involving a total of 1398 postoperative surgical patients, that measured HIT and HITT as outcomes [9]. The review authors, in a pooled analysis, found a lower incidence of HIT and HITT with LMWH postoperatively but downgraded the evidence to “low quality” due to methodologic issues and concerns over bias. A nested case-control study of adult medical patients found that HIT was 6 times more common with UFH than with LMWH and that the cost of admissions associated with HIT was 3.5 times higher than for those without HIT, though this increase in costs is not necessarily due to the HIT diagnosis itself and may instead mark patients with more severe illness [10]. The duration of heparin therapy was not stated.
There are several potential reasons that our data differ from some of the previous reports described above. We used a strict definition of HIT, requiring the serotonin release assay to be positive in the appropriate clinical setting, and did not rely solely upon antibody tests to make the diagnosis, a less rigorous standard found in some studies. Furthermore, our results may differ from previous reports because of differences in patient risk and duration of therapy. Our institution does not perform cardiac surgery, and its very large orthopedic surgery programs do not generally use heparin. Another potentially important difference from prior studies is that many of the patients treated at this institution did not receive heparin long enough to be considered at risk; only a quarter were treated for longer than 5 days, generally considered a minimum [11]. This is less than half the duration of therapy in the studies included in the meta-analysis of HIT incidence [2].
We do not contend that UFH is as safe as LMWH with regard to HIT for all populations, but rather that the increased risk is not manifest in all patient populations and settings, so the increased cost may not be justified in low-risk patients. Indeed, while variability in HIT risk among patients is well documented [3,12], the guidelines for prophylaxis do not generally take this into account when recommending particular VTE prophylaxis strategies. Clinical practice guidelines do, however, recommend different degrees of platelet count monitoring based on the risk of HIT.
Our study had limitations, chief of which is the retrospective nature of the analysis; however, the methodology we used was similar to that of previous publications [4,5,8]. We may have missed some cases of HIT if clinicians did not order the assay in all appropriate patients, but there is no reason to think that likelihood was any different pre- and post-intervention. In addition, though we reviewed every case of hospital-acquired thrombosis, it is possible that the clinical reviewers missed cases of HITT, especially if the thrombosis occurred before a substantial drop in the platelet count, which is rare but possible. Here too, the chance of missing actual cases did not change between the pre- and post-intervention periods. Our study examined prophylactic, not therapeutic, uses of heparin. Finally, while noting the acquisition cost reduction achieved with conversion to UFH, we were not able to calculate any excess expense attributable to the rare cases of HIT and HITT that occurred. We believe our results are generalizable to hospitals with similar patient profiles.
The idea that patients with different risk factors might do well with different prophylaxis strategies needs to be better appreciated. Such information could guide a more individualized prophylaxis strategy, aided by clinical decision support embedded within the EMR. In this way, the benefit of LMWH in avoiding HIT could be reserved for those patients at greatest risk of HIT, while simultaneously allowing hospitals not to overspend on prophylaxis for patients who will not benefit from LMWH. Such a strategy would need to be tested prospectively before widespread adoption.
As a result of our internal analysis, we have altered our EMR-based best practice alert to conform to the 2013 American Society of Hematology guidelines [15]; the revised alert is more informative than our original BPA. Specifically, the old alert warned only if the platelet count was < 100,000/mm3 in association with heparin. The revised alert fires if there is a > 30% fall from the baseline platelet count, regardless of the absolute count, and informs prescribers of the 4T score to encourage more appropriate use of the HIT assay, discouraging its use for low risk scores and encouraging its use for moderate to high risk scores. We are also strengthening the emphasis that moderate to high risk 4T patients receive alternative anticoagulation until results of the HIT assay are available, as we found this not to be universal practice. We recommend similar self-inspection to other institutions.
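A minimal sketch of the old versus revised trigger logic described above; the thresholds are from the text, while the function names and inputs are ours.

```python
# Original BPA: fire only on an absolute platelet count
# < 100,000/mm3 with a concurrent heparin order.
def original_bpa(platelets_per_mm3: int, on_heparin: bool) -> bool:
    return on_heparin and platelets_per_mm3 < 100_000

# Revised BPA: fire on a > 30% fall from the baseline count,
# regardless of the absolute value.
def revised_bpa(baseline: int, current: int, on_heparin: bool) -> bool:
    if not on_heparin or baseline <= 0:
        return False
    return (baseline - current) / baseline > 0.30

# A fall from 320,000 to 180,000 (a 44% drop) is missed by the old
# rule but caught by the revision.
print(original_bpa(180_000, True))           # False
print(revised_bpa(320_000, 180_000, True))   # True
```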
Corresponding author: Barry R. Meisenberg, MD, Anne Arundel Medical Center, 2001 Medical Parkway, Annapolis, MD 21401, Meisenberg@aahs.org.
Financial disclosures: None.
Author contributions: conception and design, JR, BRM; analysis and interpretation of data, KW, JR, BRM; drafting of article, JR, BRM; critical revision of the article, KW, JR, BRM; statistical expertise, KW, JR; administrative or technical support, JR; collection and assembly of data, KW, JR.
1. Lo GK, Juhl D, Warkentin TE, et al. Evaluation of pretest clinical score (4T’s) for the diagnosis of heparin-induced thrombocytopenia in two clinical settings. J Thromb Haemost 2006;4:759–65.
2. Martel N, Lee J, Wells PS. Risk for heparin-induced thrombocytopenia with unfractionated and low-molecular-weight heparin thromboprophylaxis: a meta-analysis. Blood 2005;106:2710–5.
3. Warkentin TE, Sheppard JI, Horsewood P, et al. Impact of the patient population on the risk for heparin-induced thrombocytopenia. Blood 2000;96:1703–8.
4. McGowan KE, Makari J, Diamantouros A, et al. Reducing the hospital burden of heparin-induced thrombocytopenia: impact of an avoid-heparin program. Blood 2016;127:1954–9.
5. Zhou A, Winkler A, Emamifar A, et al. Is the incidence of heparin-induced thrombocytopenia affected by the increased use of heparin for VTE prophylaxis? Chest 2012;142:1175–8.
6. Mismetti P, Laporte-Simitsidis S, Tardy B, et al. Prevention of venous thromboembolism in internal medicine with unfractionated or low-molecular-weight heparins: a meta-analysis of randomised clinical trials. Thromb Haemost 2000;83:14–19.
7. Guyatt GH, Akl EA, Crowther M, et al; for the American College of Chest Physicians Antithrombotic Therapy and Prevention of Thrombosis Panel. Antithrombotic therapy and prevention of thrombosis. 9th ed. American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest 2012;141(2 Suppl):7S–47S.
8. Greinacher A, Eichler P, Lietz T, Warkentin TE. Replacement of unfractionated heparin by low-molecular-weight heparin for postorthopedic surgery antithrombotic prophylaxis lowers the overall risk of symptomatic thrombosis because of a lower frequency of heparin-induced thrombocytopenia. Blood 2005;106:2921–2.
9. Junqueira DRG, Zorzela LM, Perini E. Unfractionated heparin versus low molecular weight heparin for avoiding heparin-induced thrombocytopenia in postoperative patients. Cochrane Database Syst Rev 2017;4:CD007557.
10. Creekmore FM, Oderda GM, Pendleton RC, Brixner DI. Incidence and economic implications of heparin-induced thrombocytopenia in medical patients receiving prophylaxis for venous thromboembolism. Pharmacotherapy 2006;26:1438–45.
11. Warkentin TE, Kelton JG. Temporal aspects of heparin-induced thrombocytopenia. N Engl J Med 2001;344:1286–92.
12. Warkentin TE, Sheppard JA, Sigouin CS, et al. Gender imbalance and risk factor interactions in heparin-induced thrombocytopenia. Blood 2006;108:2937–41.
13. Camden R, Ludwig S. Prophylaxis against venous thromboembolism in hospitalized medically ill patients: Update and practical approach. Am J Health Syst Pharm 2012;71:909–17.
14. Linkins LA, Dans AL, Moores LK, et al. Treatment and prevention of heparin-induced thrombocytopenia. Antithrombotic therapy and prevention of thrombosis. 9th ed. American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest 2012;141(2 Suppl):e495S–e530S.
15. Cuker A, Crowther MA. 2013 clinical practice guideline on the evaluation and management of adults with suspected heparin-induced thrombocytopenia. Accessed May 19, 2017, at www.hematology.org/search.aspx?q=heparin+induced+thrombocytopenia.
From the Anne Arundel Health System Research Institute, Annapolis, MD.
Abstract
- Objective: To measure clinical outcomes associated with heparin-induced thrombocytopenia (HIT) and acquisition costs of heparin after implementing a new order set promoting unfractionated heparin (UFH) use instead of low-molecular-weight heparin (LMWH) for venous thromboembolism (VTE) prophylaxis.
- Methods: This was single-center, retrospective, pre-post intervention analysis utilizing pharmacy, laboratory, and clinical data sources. Subjects were patients receiving VTE thromboprophyalxis with heparin at an acute care hospital. Usage rates for UFH and LMWH, acquisition costs for heparins, number of HIT assays, best practice advisories for HIT, and confirmed cases of HIT and HIT with thrombosis were assessed.
- Results: After order set intervention, UFH use increased from 43% of all prophylaxis orders to 86%. Net annual savings in acquisition costs for VTE prophylaxis was $131,000. After the intervention, HIT best practice advisories and number of monthly HIT assays fell 35% and 15%, respectively. In the 9-month pre-intervention period, HIT and HITT occurred in zero of 6717 patients receiving VTE prophylaxis. In the 25 months of post-intervention follow-up, HIT occurred in 3 of 44,240 patients (P = 0.86) receiving VTE prophylaxis, 2 of whom had HITT, all after receiving UFH. The median duration of UFH and LMWH use was 3.0 and 3.5 days, respectively.
- Conclusion: UFH use in hospitals can be safely maintained or increased among patient subpopulations that are not at high risk for HIT. A more nuanced approach to prophylaxis, taking into account individual patient risk and expected duration of therapy, may provide desired cost savings without provoking HIT.
Key words: heparin; heparin-induced thrombocytopenia; venous thromboembolism prophylaxis; cost-effectiveness.
Heparin-induced thrombocytopenia (HIT) and its more severe clinical complication, HIT with thrombosis (HITT), complicate the use of heparin products for venous thromboembolic (VTE) prophylaxis. The clinical characteristics and time course of thrombocytopenia in relation to heparin are well characterized (typically 30%–50% drop in platelet count 5–10 days after exposure), if not absolute. Risk calculation tools help to judge the clinical probability and guide ordering of appropriate confirmatory tests [1]. The incidence of HIT is higher with unfractionated heparin (UFH) than with low-molecular-weight heparin (LMWH). A meta-analysis of 5 randomized or prospective nonrandomized trials indicated a risk of 2.6% (95% CI, 1.5%–3.8%) for UFH and 0.2% (95% CI, 0.1%–0.4%) for LMWH [2], though the analyzed studies were heavily weighted by studies of orthopedic surgery patients, a high-risk group. However, not all patients are at equal risk for HIT, suggesting that LMWH may not be necessary for all patients [3]. Unfortunately, LMWH is considerably more expensive for hospitals to purchase than UFH, raising costs for a prophylactic treatment that is widely utilized. However, the higher incidence of HIT and HITT associated with UFH can erode any cost savings because of the additional cost of diagnosing HIT and need for temporary or long-term treatment with even more expensive alternative anticoagulants. Indeed, a recent retrospective study suggested that the excess costs of evaluating and treating HIT were approximately $267,000 per year in Canadian dollars [4].But contrary data has also been reported. A retrospective study of the consequences of increased prophylactic UFH use found no increase in ordered HIT assays or in the results of HIT testing or of inferred positive cases despite a growth of 71% in the number of patients receiving UFH prophylaxis [5].
In 2013, the pharmacy and therapeutics committee made a decision to encourage the use of UFH over LMWH for VTE prophylaxis by making changes to order sets to favor UFH over LMWH (enoxaparin). Given the uncertainty about excess risk of HIT, a monitoring work group was created to assess for any increase of either HIT or HITT that might follow, including any patient readmitted with thrombosis within 30 days of a discharge. In this paper, we report the impact of a hospital-wide conversion to UFH for VTE prophylaxis on the incidence of VTE, HIT, and HITT and acquisition costs of UFH and LMWH and use of alternative prophylactic anticoagulant medications.
Methods
Setting
Anne Arundel Medical Center is a 383-bed acute care hospital with about 30,000 adult admissions and 10,000 inpatient surgeries annually. The average length of stay is approximately 3.6 days, and the median patient age is 59 years. Caucasians comprise 75.3% of the admitted population and African Americans 21.4%. Most patients are on Medicare (59%), while 29.5% have private insurance, 6.6% are on Medicaid, and 4.7% self-pay. The 9 most common medical principal diagnoses are sepsis, heart failure, chronic obstructive pulmonary disease, pneumonia, myocardial infarction, ischemic stroke, urinary tract infection, cardiac arrhythmia, and other infection. The 6 most common procedures are newborn delivery (with and without caesarean section), joint replacement surgery, bariatric procedures, cardiac catheterizations, other abdominal surgeries, and thoracotomy. The predominant medical care model is internal medicine and physician assistant acute care hospitalists attending both medical and surgical patients. Obstetrical hospitalists care for admitted obstetric patients. Patients admitted to the intensive care units are attended only by critical care-trained physician specialists. No trainees cared for the patients described in this study.
P&T Committee
The P&T committee is a multidisciplinary group of health care professionals selected for appointment by the chairs of the committee (the chair of medicine and the director of pharmacy) and approved by the president of the medical staff. The committee has oversight responsibility for all medication policies and order sets involving medications, as well as for monitoring medication-related clinical outcomes.
Electronic Medical Record and Best Practice Advisory
Throughout the study period, both pre- and post-intervention, the EMR in use was Epic (Epic Systems, Verona, WI), used for all ordering and laboratory results. A best practice advisory (BPA) in the EMR alerted providers to all cases of thrombocytopenia < 100,000/mm3 when there was a concurrent order for any heparin. The BPA highlighted the thrombocytopenia, advised providers to consider HIT as a diagnosis and to order confirmatory tests if clinically appropriate, and provided a direct link to the HIT assay order screen. The BPA did not access information from prior admissions where heparin might have been used, nor did it determine the percentage drop from the baseline platelet count.
HIT Case Definition and Assays
The 2 laboratory tests for HIT on which this study is based are the heparin-induced platelet antibody test (also known as the anti-PF4 test) and the serotonin release assay (SRA). The heparin-induced platelet antibody test is an enzyme-linked immunosorbent assay (ELISA) that detects IgG, IgM, and IgA antibodies against the platelet factor 4/heparin (PF4/heparin) complex. The test was reported as positive if the optical density was 0.4 or higher, and a positive result generated an automatic request for an SRA, a functional assay that measures heparin-dependent platelet activation. The SRA was therefore a “reflex” test, ordered without any knowledge of the clinical characteristics of the case. The HIT assays were performed by a reference lab, Quest Diagnostics, in its Chantilly, VA facility. HIT was said to be present when a characteristic pattern of thrombocytopenia occurring after heparin use was seen [1] and the confirmatory SRA was positive at a level of > 20% release.
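To make the reflex-testing and case-definition logic concrete, the following is a minimal Python sketch. The thresholds (ELISA optical density ≥ 0.4; SRA release > 20%) come from the description above; the function and parameter names are illustrative and not part of any laboratory system.

```python
ELISA_POSITIVE_OD = 0.4    # optical density threshold for a positive anti-PF4 ELISA
SRA_POSITIVE_RELEASE = 20  # percent serotonin release defining a positive SRA


def reflex_sra_ordered(elisa_od: float) -> bool:
    """A positive ELISA automatically generates an SRA request,
    without knowledge of the clinical characteristics of the case."""
    return elisa_od >= ELISA_POSITIVE_OD


def meets_hit_case_definition(elisa_od: float,
                              sra_percent_release: float,
                              characteristic_thrombocytopenia: bool) -> bool:
    """HIT = a characteristic post-heparin thrombocytopenia pattern AND
    a confirmatory SRA positive at > 20% release."""
    return (reflex_sra_ordered(elisa_od)
            and sra_percent_release > SRA_POSITIVE_RELEASE
            and characteristic_thrombocytopenia)
```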
Order Set Modifications
After the P&T committee decision to emphasize UFH for VTE prophylaxis in October 2013, the relevant electronic order sets were altered to designate UFH as the first choice for VTE prophylaxis. The order sets still allowed LMWH (enoxaparin) or alternative anticoagulants at the prescriber’s discretion but indicated they were a second choice. Doses of UFH and LMWH in the order sets were standard, based upon weight and estimated creatinine clearance and, in the case of dosing frequency for UFH, upon the risk of VTE. Order sets for the therapeutic treatment of VTE were not changed.
Data Collection and Analysis
The clinical research committee, the local oversight board for research and performance improvement analyses, reviewed this project and determined that it qualified as a performance improvement analysis based upon the standards of the U.S. Office for Human Research Protections. Some data were extracted from patient medical records and stored in a customized, password-protected database. Access to the database was limited to members of the analysis team, and the database was stripped of all patient identifiers, per the HIPAA privacy rule de-identification standard (45 CFR 164.514(b)), immediately following collection of all data elements from the medical record.
An internal pharmacy database was used to determine the volume and actual acquisition cost of prophylactic anticoagulant doses administered during both the pre- and post-intervention periods. To determine whether clinical suspicion for HIT increased after the intervention, a definitive listing of all ordered HIT assays was obtained from laboratory billing records for the 9 months before the conversion (January 2013–September 2013) and for the 25 months after the intervention (beginning in November 2013 so as not to include the conversion month). To determine whether the HIT assays were associated with a higher risk score, we identified all cases in which the HIT assay was ordered and retrospectively calculated the 4T probability score [1]. Simultaneously, separate clinical work groups reviewed all cases of hospital-acquired thrombosis, whatever their cause, including patients readmitted with thrombosis up to 30 days after discharge, and all episodes of bleeding due to anticoagulant use. A chi-square analysis of the incidence of HIT pre- and post-intervention was performed.
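For illustration, the pre/post comparison can be computed in Python with scipy, using the counts reported in the Results below. Whether a continuity correction is applied changes the statistic for a 2 × 2 table this sparse, so this sketch is not guaranteed to reproduce the exact P value reported in the paper.

```python
from scipy.stats import chi2_contingency

# 2x2 table: rows are periods, columns are (HIT cases, patients without HIT).
observed = [
    [0, 6717],       # pre-intervention: 0 HIT cases among 6717 patients
    [3, 44240 - 3],  # post-intervention: 3 HIT cases among 44,240 patients
]

chi2, p, dof, expected = chi2_contingency(observed)  # Yates correction by default
print(f"chi-square = {chi2:.3f}, P = {p:.3f}")
```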
Results
Heparin Use and Acquisition Costs
After the order set intervention, UFH use increased immediately from 43% of all prophylaxis orders to 86% and was sustained at that level. The median duration of UFH and LMWH use was 3.0 and 3.5 days, respectively, and the net annual savings in acquisition costs for VTE prophylaxis was $131,000.
HIT Assays and Incidence of HIT and HITT
In the 9 months pre-intervention, HIT and HITT occurred in zero of 6717 patients receiving at least 1 dose of VTE prophylaxis. In the 25 months of post-intervention follow-up, 44,240 patients received prophylaxis with either heparin product. HIT (clinical suspicion with a positive antibody test and confirmatory SRA) occurred in 3 patients, 2 of whom had HITT, all after UFH. This difference in incidence was not statistically significant by chi-square analysis (P = 0.86).
Discussion
Because the efficacies of UFH and LMWH for VTE prophylaxis are equivalent [6], choosing between them involves many factors, including patient-level risk factors such as renal function and risk of bleeding, as well as other considerations such as nursing time, patient preference, risk of HIT, and acquisition cost. Indeed, the most recent version of the American College of Chest Physicians guidelines for prophylaxis against VTE notes that both drugs are recommended with an evidence grade of 1B [7]. Cost is among the considerations deemed appropriate in choosing among agents. A difference in acquisition costs of > $20 per patient per day can have a major financial impact on a hospital’s pharmacy budget and may be decisive. But a focus on acquisition cost alone is shortsighted, as the 2 medications have different complication rates with regard to HIT. Thus the need to track HIT incidence after protocol changes are made is paramount.
In our study, we did not measure thrombocytopenia as an endpoint because acquired thrombocytopenia is too common and multifactorial to be meaningful. Rather, we used clinical suspicion for HIT, as measured by both the number of times the BPA fired warnings of low platelets in the setting of recent heparin use and the number of times clinicians suspected HIT enough to order a HIT assay. We also used actual outcomes (clinically adjudicated cases of HIT and HITT). Our data show substantial compliance among clinicians with the voluntary conversion to UFH, with an immediate and sustained shift such that UFH was used in 86% of patients. Corresponding cost savings were achieved in heparin acquisition. Unlike some prior reports, there was a minimal burden of HIT, as measured by the number of BPAs and monthly HIT assays, which did not increase, and by the unchanged clinical risk (4T) scores among those patients in whom the test was ordered pre- and post-intervention. HIT rates were not statistically different after the order set conversion took effect.
Our results and study design are similar, but not identical, to those of Zhou et al, who found that a campaign to increase VTE prophylaxis resulted in a 71% increase in UFH use over 5 years but no increase in the number of HIT assays ordered or in the distribution of HIT assay results, both surrogate endpoints [5]. But not all analyses of heparin order interventions show similar results. A recent study of a heparin avoidance program in a Canadian tertiary care hospital showed reductions of 79% and 91% in adjudicated cases of HIT and HITT, respectively [4]. Moreover, hospital-related expenditures for HIT decreased by nearly $267,000 (Canadian dollars) per year, though the additional acquisition costs of LMWH were not stated. A small retrospective study of a heparin avoidance protocol among orthopedic surgery patients showed a reduction in HIT incidence from 5.2% with UFH to 0% after universal substitution of LMWH for UFH [8]. A recent systematic review identified only 3 prospective studies, involving 1398 postoperative surgical patients, that measured HIT and HITT as outcomes [9]. The review authors, in a pooled analysis, found a lower incidence of HIT and HITT with LMWH postoperatively but downgraded the evidence to “low quality” due to methodologic issues and concerns over bias. A nested case-control study of adult medical patients found that HIT was 6 times more common with UFH than with LMWH and that the cost of admissions associated with HIT was 3.5 times higher than for those without HIT, though this increase in costs is not necessarily due to the HIT diagnosis itself but may be a marker of more severe illness [10]. The duration of heparin therapy was not stated.
There are several potential reasons that our data differ from some of the previous reports described above. We used a strict definition of HIT, requiring the serotonin release assay to be positive in the appropriate clinical setting, and did not rely solely upon antibody tests to make the diagnosis, a less rigorous standard found in some studies. Furthermore, our results may differ from previous reports because of differences in patient risk and duration of therapy. Our institution does not perform cardiac surgery, and its very large orthopedic surgery programs do not generally use heparin. Another potentially important difference from prior studies is that many of the patients treated at this institution did not receive heparin long enough to be considered at risk; only a quarter were treated for longer than 5 days, generally considered a minimum [11]. This is less than half the duration of therapy in the studies included in the meta-analysis of HIT incidence [2].
We do not contend that UFH is as safe as LMWH with regard to HIT for all populations, but rather that the increased risk is not manifest in all patient populations and settings, and so the increased cost may not be justified in low-risk patients. Indeed, while variability in HIT risk among patients is well documented [3,12], the guidelines for prophylaxis do not generally take this into account when recommending particular VTE prophylaxis strategies. Clinical practice guidelines do, however, recommend different degrees of platelet count monitoring based on the risk of HIT.
Our study had limitations, chief of which is the retrospective nature of the analysis; however, our methodology was similar to that of previous publications [4,5,8]. We may have missed some cases of HIT if a clinician did not order the assay in all appropriate patients, but there is no reason to think that likelihood was any different pre- and post-intervention. In addition, though we reviewed every case of hospital-acquired thrombosis, it is possible that the clinical reviewers missed cases of HITT, especially if the thrombosis occurred before a substantial drop in the platelet count, which is rare but possible. Here too, the chance of missing actual cases did not change between the pre- and post-intervention periods. Our study examined prophylactic, not therapeutic, heparin use. Finally, while noting the acquisition cost reduction achieved with conversion to UFH, we were not able to calculate the excess expense attributable to the rare cases of HIT and HITT that occurred. We believe our results are generalizable to hospitals with similar patient profiles.
The idea that patients with different risk factors might do well with different prophylaxis strategies needs to be better appreciated. Such information could guide a more individualized prophylaxis strategy, aided by clinical decision support embedded within the EMR. In this way, the benefit of LMWH in avoiding HIT could be reserved for those patients at greatest risk of HIT while allowing hospitals not to overspend on prophylaxis in patients who will not benefit from LMWH. Such a strategy would need to be tested prospectively before widespread adoption.
As a result of our internal analysis, we have altered our EMR-based best practice advisory to conform to the 2013 American Society of Hematology guidelines [15], making it more informative than our original BPA. Specifically, the old BPA warned only if the platelet count was < 100,000/mm3 in association with heparin. The revised BPA notifies providers of a > 30% fall in platelet count regardless of the absolute count and informs prescribers of the 4T score to encourage more appropriate use of the HIT assay, discouraging its use for low risk scores and encouraging it for moderate to high risk scores. We are also strengthening the emphasis that patients with moderate to high 4T scores receive alternative anticoagulation until results of the HIT assay are available, as we found this not to be universal practice. We recommend similar self-inspection to other institutions.
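As a concrete illustration of the revised alert logic, here is a minimal Python sketch. The > 30% fall trigger and the 4T-based gating come from the description above; whether the original absolute-count trigger was retained is not stated and is included here as an assumption, and the 4T risk categories (low 0–3, intermediate 4–5, high 6–8) follow the published 4T score [1].

```python
def revised_bpa_fires(on_heparin: bool,
                      platelet_count: int,
                      baseline_platelet_count: int) -> bool:
    """Revised BPA: fire on a > 30% fall from baseline regardless of the
    absolute count. Retention of the original < 100,000/mm3 trigger is an
    assumption; the text does not say whether it was kept."""
    if not on_heparin or baseline_platelet_count <= 0:
        return False
    percent_fall = (baseline_platelet_count - platelet_count) / baseline_platelet_count
    return percent_fall > 0.30 or platelet_count < 100_000


def hit_assay_encouraged(four_t_score: int) -> bool:
    """Encourage the HIT assay for moderate-to-high 4T scores (>= 4) and
    discourage it for low scores (0-3)."""
    return four_t_score >= 4
```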
Corresponding author: Barry R. Meisenberg, MD, Anne Arundel Medical Center, 2001 Medical Parkway, Annapolis, MD 21401, Meisenberg@aahs.org.
Financial disclosures: None.
Author contributions: conception and design, JR, BRM; analysis and interpretation of data, KW, JR, BRM; drafting of article, JR, BRM; critical revision of the article, KW, JR, BRM; statistical expertise, KW, JR; administrative or technical support, JR; collection and assembly of data, KW, JR.
From the Anne Arundel Health System Research Institute, Annapolis, MD.
1. Lo GK, Juhl D, Warkentin TE, et al. Evaluation of pretest clinical score (4T’s) for the diagnosis of heparin-induced thrombocytopenia in two clinical settings. J Thromb Haemost 2006;4:759–65.
2. Martel N, Lee J, Wells PS. Risk for heparin-induced thrombocytopenia with unfractionated and low-molecular-weight heparin thromboprophylaxis: a meta-analysis. Blood 2005; 106:2710–5.
3. Warkentin TE, Sheppard JI, Horsewood P, et al. Impact of the patient population on the risk for heparin-induced thrombocytopenia. Blood 2000;96:1703–8.
4. McGowan KE, Makari J, Diamantouros A, et al. Reducing the hospital burden of heparin-induced thrombocytopenia: impact of an avoid heparin program. Blood 2016; 127:1954–9.
5. Zhou A, Winkler A, Emamifar A, et al. Is the incidence of heparin-induced thrombocytopenia affected by the increased use of heparin for VTE prophylaxis? Chest 2012; 142:1175–8.
6. Mismetti P, Laporte-Simitsidis S, Tardy B, et al. Prevention of venous thromboembolism in internal medicine with unfractionated or low-molecular-weight heparins: a meta-analysis of randomised clinical trials. Thromb Haemost 2000;83:14–19.
7. Guyatt GH, Akl EA, Crowther M, et al; for the American College of Chest Physicians Antithrombotic Therapy and Prevention of Thrombosis Panel. Antithrombotic therapy and prevention of thrombosis. 9th ed. American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest 2012;141(2 Suppl):7S–47S.
8. Greinacher A, Eichler P, Lietz T, Warkentin TE. Replacement of unfractionated heparin by low-molecular-weight heparin for postorthopedic surgery antithrombotic prophylaxis lowers the overall risk of symptomatic thrombosis because of a lower frequency of heparin-induced thrombocytopenia. Blood 2005;106:2921–2.
9. Junqueira DRG, Zorzela LM, Perini E. Unfractionated heparin versus low molecular weight heparin for avoiding heparin-induced thrombocytopenia in postoperative patients. Cochrane Database Syst Rev 2017;4:CD007557.
10. Creekmore FM, Oderda GM, Pendleton RC, Brixner DI. Incidence and economic implications of heparin-induced thrombocytopenia in medical patients receiving prophylaxis for venous thromboembolism. Pharmacotherapy 2006;26:1438–45.
11. Warkentin TE, Kelton JG. Temporal aspects of heparin-induced thrombocytopenia. N Engl J Med 2001;344:1286–92.
12. Warkentin TE, Sheppard JA, Sigouin CS, et al. Gender imbalance and risk factor interactions in heparin-induced thrombocytopenia. Blood 2006;108:2937–41.
13. Camden R, Ludwig S. Prophylaxis against venous thromboembolism in hospitalized medically ill patients: Update and practical approach. Am J Health Syst Pharm 2012;71:909–17.
14. Linkins LA, Dans AL, Moores LK, et al. Treatment and prevention of heparin-induced thrombocytopenia: Antithrombotic therapy and prevention of thrombosis, 9th ed. American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest 2012;141(2 Suppl):e495S–e530S.
15. Cuker A, Crowther MA. 2013 Clinical practice guideline on the evaluation and management of adults with suspected heparin-induced thrombocytopenia. Accessed 19 May 2017 at www.hematology.org/search.aspx?q=heparin+induced+thrombocytopenia.
Impact of a Safety Huddle–Based Intervention on Monitor Alarm Rates in Low-Acuity Pediatric Intensive Care Unit Patients
BACKGROUND
Physiologic monitors are intended to prevent cardiac and respiratory arrest by generating alarms that alert clinicians to signs of instability. To minimize the probability that monitors will miss signs of deterioration, alarm algorithms and default parameters are often set to maximize sensitivity while sacrificing specificity.1 As a result, monitors generate large numbers of nonactionable alarms—alarms that either are invalid and do not accurately represent the physiologic status of the patient or are valid but do not warrant clinical intervention.2 Prior research has demonstrated that the pediatric intensive care unit (PICU) is responsible for a higher proportion of alarms than pediatric wards3 and that a large proportion of these alarms (87%-97%) are nonactionable.4-8 In national surveys of healthcare staff, respondents report that high alarm rates interrupt patient care and can lead clinicians to disable alarms entirely.9 Recent research supports this, demonstrating that nurses who are exposed to higher numbers of alarms respond more slowly to alarms.4,10 In an attempt to mitigate safety risks, the Joint Commission in 2012 issued recommendations for hospitals to (a) establish guidelines for tailoring alarm settings and limits for individual patients and (b) identify situations in which alarms are not clinically necessary.11
To address these recommendations within our PICU, we sought to evaluate the impact of a focused physiologic monitor alarm reduction intervention integrated into safety huddles. Safety huddles are brief, structured discussions among physicians, nurses, and other staff that aim to identify safety concerns.12 Huddles offer an appropriate forum for reviewing alarm data and identifying patients whose high alarm rates may warrant safe tailoring of alarm limits. Pilot data demonstrating high alarm rates among low-acuity PICU patients led us to hypothesize that low-acuity, high-alarm PICU patients would be a safe and effective target for a huddle-based alarm intervention.
In this study, we aimed to measure the impact of a structured safety huddle review on low-acuity PICU patients with high rates of priority alarms who were randomized to the intervention, compared with concurrent and historical low-acuity, high-alarm control patients in the PICU.
METHODS
Study Definitions
Priority alarm activation rate. We conceptualized priority alarms as any alarm for a clinical condition that requires a timely response to determine whether intervention is necessary to save the patient’s life.4 We operationally defined these alarms on the General Electric Solar physiologic monitoring devices as any potentially life-threatening events, including lethal arrhythmias (asystole, ventricular tachycardia, and ventricular fibrillation) and alarms for vital signs (heart rate, respiratory rate, and oxygen saturation) outside of the set parameter limits. These alarms produced audible tones in the patient room and automatically sent text messages to the nurse’s phone, and thus had the potential to contribute to alarm fatigue regardless of the nurse’s location.
High-alarm patients. High-alarm patients were those who had more than 40 priority alarms in the preceding 4 hours, representing the top 20% of alarm rates in the PICU according to prior quality improvement projects completed in our unit (this definition and the priority-alarm definition above are sketched in code after the acuity definition below).
Low-acuity patients. Prior to and during this study, patient acuity was determined twice daily using the OptiLink Patient Classification System (OptiLink Healthcare Management Systems, Inc.; Tigard, OR; www.optilinkhealthcare.com; see Appendix 1). Low-acuity patients comprised, on average, 16% of PICU patients.
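A minimal Python sketch of the priority-alarm and high-alarm definitions above follows; the alarm-type strings and the shape of the alarm log are illustrative assumptions, not fields of the actual monitoring system.

```python
from datetime import datetime, timedelta

LETHAL_ARRHYTHMIAS = {"asystole", "ventricular tachycardia", "ventricular fibrillation"}
VITAL_SIGN_ALARMS = {"heart rate", "respiratory rate", "oxygen saturation"}


def is_priority_alarm(alarm_type: str) -> bool:
    """Priority alarms: lethal arrhythmias and vital-sign alarms for values
    outside the set parameter limits."""
    return alarm_type in LETHAL_ARRHYTHMIAS or alarm_type in VITAL_SIGN_ALARMS


def is_high_alarm_patient(priority_alarm_times, now: datetime) -> bool:
    """High-alarm: more than 40 priority alarms in the preceding 4 hours."""
    window_start = now - timedelta(hours=4)
    return sum(window_start <= t <= now for t in priority_alarm_times) > 40
```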
Setting and Subjects
This study was performed in the PICU at The Children’s Hospital of Philadelphia.
The PICU is made up of 3 separate wings: east, south, and west. Bed availability was the only factor determining patient placement on the east, south, or west wing; the physical bed location was not preferentially assigned based on diagnosis or disease severity. The east wing was the intervention unit where the huddles occurred.
The PICU is composed of 3 different geographical teams. Two of the teams are composed of 4 to 5 pediatric or emergency medicine residents, 1 fellow, and 1 attending covering the south and west wings. The third team, located on the east wing, is composed of 1 to 2 pediatric residents, 2 to 3 nurse practitioners, 1 fellow, and 1 attending. Bedside family-centered rounds are held at each patient room, with the bedside nurse participating by reading a nursing rounding script that includes vital signs, vascular access, continuous medications, and additional questions or concerns.
Control subjects were any monitored patients on any of the 3 wings of the PICU between April 1, 2015, and October 31, 2015. The control patients fell into 2 categories: historical controls (April 1, 2015, to May 31, 2015) and concurrent controls (June 1, 2015, to October 31, 2015), located anywhere in the PICU. On each nonholiday weekday beginning June 1, 2015, we randomly selected up to 2 high-alarm, low-acuity patients on the east wing to receive the intervention, that is, to be discussed in the daily morning huddle. If more than 2 high-alarm, low-acuity patients were eligible for intervention, they were randomly selected by using the RAND function in Microsoft Excel. The other low-acuity, high-alarm patients in the PICU were included as control patients. Patients were eligible for the study if they were present for the 4 hours prior to huddle and present past noon on the day of huddle. If patients met criteria as high-alarm, low-acuity patients on multiple days, they could be enrolled as intervention or control patients multiple times; no adjustment was made for patients enrolled more than once. Patients’ alarm rates were calculated by dividing the number of alarms by their length of stay, measured to the minute.
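The daily selection step, performed in the study with Excel’s RAND function, can be sketched in Python as follows; the identifiers are hypothetical.

```python
import random


def select_intervention_patients(eligible_ids, max_selected=2, seed=None):
    """Randomly pick up to 2 eligible high-alarm, low-acuity east-wing
    patients for huddle discussion; the rest serve as concurrent controls."""
    rng = random.Random(seed)
    if len(eligible_ids) <= max_selected:
        return list(eligible_ids)
    return rng.sample(eligible_ids, max_selected)


# Example: 3 eligible patients on a given weekday; 2 are chosen.
print(select_intervention_patients(["bed_12", "bed_15", "bed_21"], seed=0))
```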
Human Subjects Protection
The Institutional Review Board of The Children’s Hospital of Philadelphia approved this study with a waiver of informed consent.
Alarm Capture
We used BedMasterEx (Excel Medical Electronics; Jupiter, FL; http://excel-medical.com/products/bedmaster-ex) software connected to the General Electric monitor network to measure alarm rates. The software captured, in near real time, every alarm that occurred on every monitor in the PICU. Alarm rates over the preceding 4 hours for all PICU patients were exported and summarized by alarm type and by level as set by hospital policy (crisis, warning, advisory, and system warning). Crisis and warning alarms were included because they represented potentially life-threatening events meeting the definition of priority alarms. Physicians used an order within the PICU admission order set to order monitoring based on preset age parameters (see online Appendix 1 for default settings). Physician orders were required for nurses to change alarm parameters. Daily electrode changes to reduce false alarms were standard of care.
Primary Outcome
The primary outcome was the change in priority alarm activation rate (the number of priority alarms per day) from the prehuddle period (the 24 hours before morning huddle) to the posthuddle period (the 24 hours following morning huddle) for intervention cases as compared with controls.
Primary Intervention
The intervention consisted of integrating a short script to facilitate discussion of alarm data during existing safety huddle and rounding workflows. The discussion and subsequent workflow proceeded as follows: a member of the research team who was not involved in patient care brought an alarm data sheet for each randomly selected intervention patient on the east wing to each safety huddle. The huddles were attended by the outgoing night charge nurse, the day charge nurse, and all bedside nurses working on the east wing that day. The alarm data sheet provided to the charge nurse displayed data on the 1 to 2 alarm parameters (respiratory rate, heart rate, or pulse oximetry) that generated the highest number of alarms. The charge nurse listed the high-alarm patients by room number during huddle, and the alarm data sheet was given to the bedside nurse responsible for each patient to facilitate further scripted discussion during bedside rounds, using the patient-specific information to reduce individual patients’ alarm rates through adjustment of physiologic monitor parameters (see Appendix 2 for a sample data sheet and script).
Data Collection
Intervention patients were high-alarm, low-acuity patients on the east wing from June 1, 2015, through October 31, 2015. Two months of baseline data were gathered prior to intervention on all 3 wings; therefore, control patients were high-alarm, low-acuity patients throughout the PICU from April 1, 2015, to May 31, 2015, as historical controls and from June 1, 2015, to October 31, 2015, as concurrent controls. Alarm rates for the 24 hours prior to huddle and the 24 hours following huddle were collected and analyzed. See Figure 1 for schematic of study design.
We collected data on patient characteristics, including patient location, age, sex, and intervention date. Information regarding changes to monitor alarm parameters for both intervention and control patients during the posthuddle period (the period following morning huddle until noon on intervention day) was also collected. We monitored for code blue events and unexpected changes in acuity until discharge or transfer out of the PICU.
Data Analysis
We compared the priority alarm activation rates of individual patients in the 24 hours before and the 24 hours after the huddle intervention and contrasted the differences in rates between intervention and control patients, both concurrent and historical controls. We also divided the intervention and control groups into 2 additional groups each—those patients whose alarm parameters were changed, compared with those whose parameters did not change. We evaluated for possible contamination by comparing alarm rates of historical and concurrent controls, as well as evaluating alarm rates by location. We used mixed-effects regression models to evaluate the effect of the intervention and control type (historical or concurrent) on alarm rates, adjusted for patient age and sex. Analysis was performed using Stata version 10.3 (StataCorp, LLC, College Station, TX) and SAS version 9.4 (SAS Institute Inc., Cary, NC).
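The study fit its models in Stata and SAS; as an illustration, a comparable mixed-effects model could be fit in Python with statsmodels. The exact model specification is not given in the paper, so the formula below (fixed effects for control/intervention group, age, and sex, with a random intercept per patient to account for repeated enrollment) is one plausible reading, and the file and column names are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical analysis table: one row per event, with the pre-to-post
# change in priority alarms per day as the outcome.
df = pd.read_csv("alarm_events.csv")  # columns: alarm_change, group, age, sex, patient_id

# Random intercept per patient accounts for patients enrolled more than once.
model = smf.mixedlm("alarm_change ~ C(group) + age + C(sex)",
                    data=df, groups=df["patient_id"])
result = model.fit()
print(result.summary())
```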
RESULTS
Because patients could be enrolled more than once, we refer to the instances when they were included in the study as “events” (huddle discussions for intervention patients and huddle opportunities for controls) below. We identified 49 historical control events between April 1, 2015, and May 31, 2015. During the intervention period, we identified 88 intervention events and 163 concurrent control events between June 1, 2015, and October 31, 2015 (total n = 300; see Table 1 for event characteristics). A total of 6 patients were enrolled more than once as either intervention or control patients.
Unadjusted Analysis of Changes in Alarm Rates
The average priority alarm activation rate for intervention patients was 433 alarms (95% confidence interval [CI], 392-472) per day in the 24 hours leading up to the intervention and 223 alarms (95% CI, 182-265) per day in the 24 hours following the intervention, a 48.5% unadjusted decrease (95% CI, 38.1%-58.9%). In contrast, priority alarm activation rates for concurrent control patients averaged 412 alarms (95% CI, 383-442) per day in the 24 hours leading up to the morning huddle and 323 alarms (95% CI, 270-375) per day in the 24 hours following huddle, a 21.6% unadjusted decrease (95% CI, 15.3%-27.9%). For historical controls, priority alarm activation rates averaged 369 alarms (95% CI, 339-399) per day in the 24 hours leading up to the morning huddle and 242 alarms (95% CI, 164-320) per day in the 24 hours following huddle, a 34.4% unadjusted decrease (95% CI, 13.5%-55.0%). In the unadjusted analysis, concurrent controls had 37 more alarms per day than historical controls (95% CI, 59 fewer to 134 more; P = 0.45); this nonsignificant difference suggests no contamination of the control group.
Adjusted Analysis of Changes in Alarm Rates
The overall estimate of the effect of the intervention adjusted for age and sex compared with concurrent controls was a reduction of 116 priority alarms per day (95% CI, 37-194; P = 0.004, Table 2). The adjusted percent decrease was 29.0% (95% CI, 12.1%-46.0%). There were no unexpected changes in patient acuity or code blue events related to the intervention.
Fidelity Analysis
We tracked changes in alarm parameter settings for evidence of intervention fidelity, to determine whether the team carried out the recommendations made. We found that 42% of intervention patients and 24% of combined control patients had alarm parameters changed during the posthuddle period (P = 0.002).
For intervention patients who had parameters changed during the posthuddle period (n = 37), the mean effect was greater: a 54.9% decrease (95% CI, 38.8%-70.8%) in priority alarms, compared with a mean decrease of only 12.2% (95% CI, -18.1% to 42.3%) among control patients who had parameters adjusted during the posthuddle period (n = 50). There was a 43.2% decrease (95% CI, 29.3%-57.0%) for intervention patients who were discussed but did not have parameters adjusted during the observation window (n = 51), compared with a 28.1% decrease (95% CI, 16.8%-39.1%) among combined control patients who did not have parameters adjusted (n = 162); see Figure 2.
DISCUSSION
This study is the first to demonstrate a successful and safe intervention to reduce the alarm rates of PICU patients. In addition, we observed a greater reduction in priority alarm activation rates for intervention patients who had their alarm parameters changed during the monitored time period, leading us to hypothesize that providing patient-specific data regarding types of alarms was a key component of the intervention.
In control patients, we also observed a reduction in alarm rates over time. There are 2 potential explanations for this. First, it is possible that as patients stabilize in the PICU, their vital signs become less extreme and generate fewer alarms even if the alarm parameters are not changed. Second, parameters may have been changed within or outside of the time windows during which we evaluated for alarm parameter changes. Nevertheless, the decline over time observed in the intervention patients was greater than in both control groups. This difference was even more noticeable in the intervention patients who had their alarm parameters changed during the posthuddle period as compared with controls whose alarm parameters were changed during that same period. This may have been due to the data provided during the huddle intervention, pointing the team to the cause of the high alarm rate.
Prior successful research on reducing pediatric alarms has often relied on decreased use of physiologic monitors as 1 approach to reducing unnecessary alarms. The single prior pediatric alarm intervention study, conducted on a pediatric ward, instituted a cardiac monitor care process that included ordering of age-based parameters, daily replacement of electrodes, individualized assessment of parameters, and a reliable method to discontinue monitoring.13 Because most patients in the PICU are critically ill, reliance on monitor discontinuation as a main approach to decreasing alarms is not feasible in this setting. Instead, the use of targeted alarm parameter adjustments for low-acuity patients proved a safe and feasible approach to decreasing alarms in PICU patients. Daily electrode changes and age-based parameters were already in place at our institution.
There are a few limitations to this study. First, we focused only on low-acuity PICU patients. We believe that focusing on low-acuity patients allows for reduction in nonactionable alarms with limited potential for adverse events; however, this approach excludes many critically ill patients who might be at highest risk for harm from alarm fatigue if important alarms are ignored. Second, many of our patients were not present for the full 24 hours pre- and posthuddle due to their low acuity, limiting our ability to follow alarm rates over time. Third, changes in alarm parameters were monitored only for a set period of 5 hours following the huddle to determine the effect of the recommended rounding script on changes to alarms. It is possible that changes to alarm parameters outside of the observed posthuddle period affected the alarm rates of both intervention and control patients. Lastly, the balancing metrics of unexpected changes in OptiLink status and code blue events are rare, and therefore we may have been underpowered to detect them. The effects of the huddle intervention on safety huddle length and rounding length were not measured.
CONCLUSION
Integrating a data-driven monitor alarm discussion into safety huddles was a safe and effective approach to reduce alarms in low-acuity, high-alarm PICU patients. Innovative approaches to make data-driven alarm decisions using informatics tools integrated into monitoring systems and electronic health records have the potential to facilitate cost-effective spread of this intervention.
Disclosure
This work was supported by a pilot grant from the Center for Pediatric Clinical Effectiveness, The Children’s Hospital of Philadelphia. Dr. Bonafide is supported by a Mentored Patient-Oriented Research Career Development Award from the National Heart, Lung, and Blood Institute of the National Institutes of Health under Award Number K23HL116427. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding organizations or employers. The funding organizations had no role in the design, preparation, review, or approval of this paper, nor the decision to submit for publication.
1. Drew BJ, Califf RM, Funk M, et al. Practice standards for electrocardiographic monitoring in hospital settings: an American Heart Association scientific statement from the councils on cardiovascular nursing, clinical cardiology, and cardiovascular disease in the young. Circulation. 2004;110(17):2721-2746; DOI:10.1161/01.CIR.0000145144.56673.59.
2. Paine CW, Goel VV, Ely E, et al. Systematic review of physiologic monitor alarm characteristics and pragmatic interventions to reduce alarm frequency. J Hosp Med. 2016;11(2):136-144; DOI:10.1002/jhm.2520.
3. Schondelmeyer AC, Bonafide CP, Goel VV, et al. The frequency of physiologic monitor alarms in a children’s hospital. J Hosp Med. 2016;11(11):796-798; DOI:10.1002/jhm.2612.
4. Bonafide CP, Lin R, Zander M, et al. Association between exposure to nonactionable physiologic monitor alarms and response time in a children’s hospital. J Hosp Med. 2015;10(6):345-351; DOI:10.1002/jhm.2331.
5. Lawless ST. Crying wolf: false alarms in a pediatric intensive care unit. Crit Care Med. 1994;22(6):981-985.
6. Tsien CL, Fackler JC. Poor prognosis for existing monitors in the intensive care unit. Crit Care Med. 1997;25(4):614-619; DOI:10.1097/00003246-199704000-00010.
7. Talley LB, Hooper J, Jacobs B, et al. Cardiopulmonary monitors and clinically significant events in critically ill children. Biomed Instrum Technol. 2011;45(Spring):38-45; DOI:10.2345/0899-8205-45.s1.38.
8. Rosman EC, Blaufox AD, Menco A, Trope R, Seiden HS. What are we missing? Arrhythmia detection in the pediatric intensive care unit. J Pediatr. 2013;163(2):511-514; DOI:10.1016/j.jpeds.2013.01.053.
9. Korniewicz DM, Clark T, David Y. A national online survey on the effectiveness of clinical alarms. Am J Crit Care. 2008;17(1):36-41.
10. Voepel-Lewis T, Parker ML, Burke CN, et al. Pulse oximetry desaturation alarms on a general postoperative adult unit: a prospective observational study of nurse response time. Int J Nurs Stud. 2013;50(10):1351-1358; DOI:10.1016/j.ijnurstu.2013.02.006.
11. Joint Commission on Accreditation of Healthcare Organizations. Medical device alarm safety in hospitals. Sentinel Event Alert. 2012:1-3.
12. Goldenhar LM, Brady PW, Sutcliffe KM, Muething SE, Anderson JM. Huddling for high reliability and situation awareness. BMJ Qual Saf. 2013;22:899-906; DOI:10.1136/bmjqs-2012-001467.
13. Dandoy CE, Davies SM, Flesch L, et al. A team-based approach to reducing cardiac monitor alarms. Pediatrics. 2014;134(6):e1686-e1694; DOI:10.1542/peds.2014-1162.
A Contemporary Assessment of Mechanical Complication Rates and Trainee Perceptions of Central Venous Catheter Insertion
Central venous catheter (CVC) placement is commonly performed in emergency and critical care settings for parenteral access, central monitoring, and hemodialysis. Although potentially lifesaving, CVC insertion is associated with immediate risks, including injury to nerves, vessels, and lungs.1-3 These “insertion-related complications” are of particular interest for several reasons. First, the frequency of such complications varies widely, with published rates between 1.4% and 33.2%.2-7 Reasons for such variation include differences in study definitions of complications (eg, pneumothorax and tip position),2,5 setting of CVC placement (eg, intensive care unit [ICU] vs emergency room), timing of placement (eg, elective vs emergent), differences in technique, and type of operator (eg, experienced vs learner). Thus, the precise incidence of such events in modern-day training settings with use of ultrasound guidance remains uncertain. Second, mechanical complications might be preventable with adequate training and supervision. Indeed, studies using simulation-based mastery techniques have demonstrated a reduction in rates of complications following intensive training.8 Finally, understanding risk factors associated with insertion complications might inform preventive strategies and improve patient safety.9-11
Few studies to date have examined trainees’ perceptions of CVC training, experience, supervision, and ability to recognize and prevent mechanical complications. While research investigating the effects of simulation training has accumulated, most studies focus on successful completion of the procedure or individual procedural steps, with little emphasis on operator perceptions.12-14 In addition, while multiple studies have shown that unsuccessful line attempts are a risk factor for CVC complications,3,4,7,15 little is known about trainee behavior and perceptions regarding unsuccessful line placement. CVC simulation trainings often assume successful completion of the procedure and do not address the crucial postprocedure steps that should be undertaken if a procedure is unsuccessful. For these reasons, we developed a survey to specifically examine trainee experience with CVC placement, supervision, postprocedural behavior, and attitudes regarding unsuccessful line placement.
We therefore designed a study with 2 specific goals. The first was to perform a contemporary analysis of the CVC mechanical complication rate at an academic teaching institution and to identify potential risk factors associated with these complications. The second was to determine trainee perceptions regarding CVC complication experience, prevention, procedural supervision, and unsuccessful line placement.
METHODS
Design and Setting
We conducted a single-center retrospective review of nontunneled acute CVC procedures between June 1, 2014, and May 1, 2015, at the University of Michigan Health System (UMHS). UMHS is a tertiary care referral center with over 900 inpatient beds, including 99 ICU beds.
All residents in internal medicine, surgery, anesthesia, and emergency medicine receive mandatory education in CVC placement that includes an online training module and simulation-based training with competency assessment. Use of real-time ultrasound guidance is considered the standard of care for CVC placement.
Data Collection
Inpatient procedure notes were electronically searched for terms indicating CVC placement. This search was performed through our hospital’s Data Office for Clinical and Translational Research using the Electronic Medical Record Search Engine tool. Please see the supplemental materials for the full list of search terms. We electronically extracted data, including date of procedure, gender, and most recent body mass index (BMI) within 1 year prior to the note. Acute Physiology and Chronic Health Evaluation III (APACHE III) data are tracked for all patients on admission to the ICU; these were collected when available. Charts were then manually reviewed to collect additional data, including international normalized ratio (INR), platelet count, lactate level on the day of CVC placement, anticoagulant use (actively prescribed warfarin, therapeutic enoxaparin, therapeutic unfractionated heparin, or a direct oral anticoagulant), ventilator or noninvasive positive pressure ventilation (NIPPV) use at the time of CVC placement, and vasopressor requirement within 24 hours of CVC placement. The procedure note was reviewed to gather information about the site of CVC placement, size and type of catheter, number of attempts, procedural success, training level of the operator, and attending presence. Small bore CVCs were defined as 7 French (Fr) or lower. Large bore CVCs were defined as >7 Fr; this includes dialysis catheters, Cordis catheters (Cordis, Fremont, CA), and cooling catheters. The times of the procedure note and postprocedure chest x-ray (CXR) were recorded, including whether the CVC was placed on a weekend (Friday 7
Primary Outcome
The primary outcome was the rate of severe mechanical complications related to CVC placement. Similar to prior studies,2 we defined severe mechanical complications as arterial placement of a dilator or catheter, hemothorax, pneumothorax, cerebral ischemia, patient death (related to the procedure), significant hematoma, or vascular injury (defined as a complication requiring expert consultation or blood product transfusion). We did not require a lower limit on blood transfusion. We considered pneumothorax a complication regardless of whether chest tube intervention was performed, as pneumothorax subjects the patient to additional tests (eg, serial CXRs) and sometimes symptoms (shortness of breath, pain, anxiety) even when no chest tube is required. Complications were confirmed by direct review of procedure notes, progress notes, discharge summaries, and imaging studies.
Trainee Survey
A survey was electronically disseminated to all internal medicine and medicine-pediatric residents to inquire about CVC experiences, including time spent in the medical ICU, number of CVCs performed, postprocedure behavior for both failed and successful CVCs, and supervision experience and attitudes. Please see supplemental materials for full survey contents.
Statistical Methods
Descriptive statistics (percentage) were used to summarize data. Continuous and categorical variables were compared using Student t tests and chi-square tests, respectively. All analyses were performed using SAS 9.3 (SAS Institute, Cary, NC).
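To make the comparisons concrete: the study analyses were performed in SAS, but the same tests can be sketched in Python with scipy. The values below are placeholders, not the study dataset (the contingency counts merely approximate the anticoagulation percentages reported later).

```python
# Illustrative group comparisons; data values are placeholders, not the
# study dataset (the original analysis was performed in SAS 9.3).
from scipy import stats

# Continuous variable (eg, BMI) compared with a two-sample Student t test
bmi_with_complication = [24.1, 26.3, 25.0, 27.2, 25.9]
bmi_without_complication = [30.5, 31.8, 29.9, 32.4, 30.1, 31.2]
t_stat, p_ttest = stats.ttest_ind(bmi_with_complication, bmi_without_complication)

# Categorical variable (eg, anticoagulant use) compared with a chi-square test
# on a 2x2 table: rows = complication yes/no, columns = anticoagulated yes/no
table = [[4, 10],      # complications: 4 of 14 anticoagulated (~28.6%)
         [147, 569]]   # no complications: 147 of 716 anticoagulated (~20.5%)
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(f"t test P = {p_ttest:.3f}; chi-square P = {p_chi2:.3f}")
```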
Ethical and Regulatory Oversight
The study was deemed exempt by the University of Michigan Institutional Review Board (HUM00100549) as data collection was part of a quality improvement effort.
RESULTS
Demographics and Characteristics of Device Insertion
Between June 1, 2014, and May 1, 2015, 730 CVC procedure notes were reviewed (Table 1). The mean age of the study population was 58.9 years, and 41.6% (n = 304) were female. BMI data were available in 400 patients without complications and 5 patients with complications; the average BMI was 31.5 kg/m2. The APACHE III score was available for 442 patients without complications and 10 patients with complications; the average score was 86 (range, 19-200). Most of the CVCs placed (n = 504, 69%) were small bore (≤7 Fr). The majority of catheters were placed in the internal jugular (IJ) position (n = 525, 71.9%), followed by femoral (n = 144, 19.7%), subclavian (n = 57, 7.8%), and undocumented (n = 4, 0.6%). Ninety-six percent (n = 699) of CVCs were successfully placed. Seventy-six percent (n = 558) of procedure notes included documentation of the number of CVC attempts; of these, 85% documented 2 or fewer attempts. The majority of CVCs were placed by residents (n = 537, 73.9%), followed by fellows (n = 127, 17.5%) and attendings (n = 27, 3.7%). Attending supervision for all or key portions of CVC placement occurred 34.7% (n = 244) of the time overall and was lower for internal medicine trainees (n = 98/463, 21.2%) compared with surgical trainees (n = 73/127, 57.4%) or emergency medicine trainees (n = 62/96, 64.6%; P < 0.001). All successful IJ and subclavian CVCs except for 2 insertions (0.3%) had a postprocedure CXR. A minority of notes documented pressure transduction (4.5%) or blood gas analysis (0.2%) to confirm venous placement.
Mechanical Complications
The mechanical complications identified included pneumothorax (n = 5), bleeding requiring transfusion (n = 3), vascular injury requiring expert consultation or intervention (n = 3), stroke (n = 1), and death (n = 2). Vascular injuries included 1 neck hematoma with superinfection requiring antibiotics, 1 neck hematoma requiring otolaryngology and vascular surgery consultation, and 1 venous dissection of the IJ vein requiring vascular surgery consultation. None of these cases required operative intervention. The stroke was caused by inadvertent CVC placement into the carotid artery. One patient experienced tension pneumothorax and died of this complication; this death occurred after 3 failed left subclavian CVC attempts and an ultimately successful CVC placement into the left IJ vein. Another death occurred immediately following unsuccessful Cordis placement. As no autopsy was performed, it is impossible to know whether the line placement caused the death; however, it is prudent to consider it a CVC complication given the temporal relationship. Thus, the total number of patients who experienced severe mechanical complications was 14 of 730 (1.92%).
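As a check on the arithmetic, the overall rate follows directly from the component counts. The confidence interval below is illustrative only; the study does not report one for this rate.

```python
# Arithmetic behind the reported severe mechanical complication rate.
import math

complications = 5 + 3 + 3 + 1 + 2   # pneumothorax, bleeding, vascular injury, stroke, death
procedures = 730
rate = complications / procedures
print(f"{complications}/{procedures} = {100 * rate:.2f}%")   # 14/730 = 1.92%

# Illustrative 95% Wald interval for the rate (not reported in the study)
se = math.sqrt(rate * (1 - rate) / procedures)
lo, hi = rate - 1.96 * se, rate + 1.96 * se
print(f"Approximate 95% CI: {100 * lo:.2f}% to {100 * hi:.2f}%")
```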
Risk Factors for Mechanical Complications
Certain patient factors were more commonly associated with complications. For example, BMI was significantly lower in the group that experienced complications than in the group that did not (25.7 vs 31.0 kg/m2, P = 0.001). No other associations were noted for demographic factors, including age (61.4 vs 58.9 years, P = 0.57), sex (57.1% male vs 41.3% female, P = 0.24), or admission APACHE III score (96 vs 86, P = 0.397). The mean INR, platelet count, and lactate did not differ between the 2 groups, nor did vasopressor use. Ventilator use (including endotracheal tube or NIPPV) was significantly higher in the group that experienced mechanical complications (78.5% vs 65.9%, P = 0.001). Anticoagulation use was also associated with mechanical complications (28.6% vs 20.6%, P = 0.05); 3 patients on anticoagulation experienced significant hematomas. Mechanical complications were more common with the subclavian location (21.4% vs 7.8%, P = 0.001); in all 3 cases involving subclavian CVC placement, the complication experienced was pneumothorax. The number of attempts also differed significantly between the 2 groups, averaging 1.5 attempts in the group without complications and 2.2 attempts in the group that experienced complications (P = 0.02). Additionally, rates of successful placement were lower among patients who experienced complications (78.6% vs 95.7%, P = 0.001).
With respect to operator characteristics, no significant difference in level of training was noted between those who experienced complications and those who did not. Attending supervision was more frequent for the group that experienced complications (61.5% vs 34.2%, P = 0.04). There was no significant difference in complication rate between the first and second halves of the academic year (0.4% vs 0.3% per month, P = 0.30) or between daytime and nighttime CVC placement (1.9% vs 2.0%, P = 0.97). A trend toward more complications in CVCs placed on a weekend compared with a weekday was observed (2.80% vs 1.23%, P = 0.125).
Unsuccessful CVCs
There were 30 documented unsuccessful CVC procedures, representing 4.1% of all procedures. Of these, 3 procedures had complications; these included 2 pneumothoraxes (1 leading to death) and 1 unexplained death. Twenty-four of the unsuccessful CVC attempts were in either the subclavian or IJ location; of these, 5 (21%) did not have a postprocedure CXR obtained.
Survey Results
The survey was completed by 103 out of 166 internal medicine residents (62% response rate). Of these, 55% (n = 57) reported having performed 5 or more CVCs, and 14% (n = 14) had performed more than 15 CVCs.
All respondents who had performed at least 1 CVC (n = 80) were asked about their perceptions regarding attending supervision. Eighty-one percent (n = 65/80) responded that they have never been directly supervised by an attending during CVC placement, while 16% (n = 13/80) reported being supervised less than 25% of the time. Most (n = 53/75, 71%) did not feel that attending supervision affected their performance, while 21% (n = 16/75) felt it affected performance negatively, and only 8% (n = 6/75) stated it affected performance positively. Nineteen percent (n = 15/80) indicated that they prefer more supervision by attendings, while 35% (n = 28/80) did not wish for more attending supervision, and 46% (n = 37/80) were indifferent.
DISCUSSION
We performed a contemporary analysis of CVC placement at an academic tertiary care center and observed a rate of severe mechanical complications of 1.9%. This rate is within previously described acceptable thresholds.16 Our study adds to the literature by identifying several important risk factors for development of mechanical complications. We confirm many risk factors that have been noted historically, such as subclavian line location,2,3 attending supervision,3 low BMI,4 number of CVC attempts, and unsuccessful CVC placement.3,4,7,15 We identified several unique risk factors, including systemic anticoagulation as well as ventilator use. Lastly, we identified unexpected deficits in trainee knowledge surrounding management of failed CVCs and negative attitudes regarding attending supervision.
Most existing literature evaluated risk factors for CVC complication prior to routine ultrasound use;3-5,7,15 surprisingly, it appears that severe mechanical complications do not differ dramatically in the real-time ultrasound era. Eisen et al.3 prospectively studied CVC placement at an academic medical center and found a severe mechanical complication rate (as defined in our paper) of 1.9% due to pneumothorax (1.3%), hemothorax (0.3%), and death (0.3%).We would expect the number of complications to decrease in the postultrasound era, and indeed it appears that pneumothoraces have decreased likely due to ultrasound guidance and decrease in subclavian location. However, in contrast, rates of significant hematomas and bleeding are higher in our study. Although we are unable to state why this may be the case, increasing use of anticoagulation in the general population might explain this finding.17 For instance, of the 6 patients who experienced hematomas or vascular injuries in our study, 3 were on anticoagulation at the time of CVC placement.
Interestingly, time of academic year of CVC placement and level of training were not correlated with an increased risk of complications, nor was time of day of CVC placement. In contrast, Merrer et al.showed that CVC insertion during nighttime was significantly associated with increased mechanical complications (odds ratio 2.06, 95% confidence interval, 1.04-4.08;,P = 0.03).5 This difference may be attributable to the fact that most of our ICUs now have a night float system rather than a more traditional 24-hour call model; therefore, trainees are less likely to be sleep deprived during CVC placement at night.
Severity of illness did not appear to significantly affect mechanical complication rates based on similar APACHE scores between the 2 groups. In addition, other indicators of illness severity (vasopressor use or lactate level) did not suggest that sicker patients may be more likely to experience mechanical complications than others. One could conjecture that perhaps sicker patients were more likely to have lines placed by more experienced trainees, although the present study design does not allow us to answer this question. Interestingly, ventilator use was associated with higher rates of complications. We cannot say definitively why this was the case; however, 1 contributing factor may be the physical constraints of placing the CVC around ventilator tubing.
Several unexpected findings surrounding attending supervision were noted: first, attending supervision appears to be significantly associated with increased complication rate, and second, trainees have negative perceptions regarding attending supervision. Eisen et al.showed a similar association between attending supervision and complication rate.3 It is possible that the increased complication rate is because sicker patients are more likely to have procedural supervision by attendings, attending physicians may be called to supervise when a CVC placement is not going as planned, or attendings may supervise more inexperienced operators. Reasons behind negative trainee attitudes surrounding supervision are unclear and literature on this topic is limited. This is an area that warrants further exploration in future studies.
Another unexpected finding is trainee practices regarding unsuccessful CVC placement; most trainees do not document failed procedures or order follow-up CXRs after unsuccessful CVC attempts. Given the higher risk of complications after unsuccessful CVCs, it is paramount that all physicians are trained to order postprocedure CXR to rule out pneumothorax or hemothorax. Furthermore, documentation of failed procedures is important for medical accuracy, transparency, and also hospital billing. It is unknown if these practices surrounding unsuccessful CVCs are institution-specific or more widespread. As far as we know, this is the first time that trainee practices regarding failed CVC placement have been published. Interestingly, while many current guidelines call attention to prevention, recognition, and management of central line-associated mechanical complications, specific recommendations about postprocedure behavior after failed CVC placement are not published.9-11 We feel it is critical that institutions reflect on their own practices, especially given that unsuccessful CVCs are shown to be correlated with a significant increase in complication rate. At our own institution, we have initiated an educational component of central line training for medicine trainees specifically addressing failed central line attempts.
This study has several limitations, including a retrospective study design at a single institution. There was a low overall number of complications, which reduced our ability to detect risk factors for complications and did not allow us to perform multivariable adjustment. Other limitations are that only documented CVC attempts were recorded and only those that met our search criteria. Lastly, not all notes contain information such as the number of attempts or peer supervision. Furthermore, the definition of CVC “attempt” is left to the operator’s discretion.
In conclusion, we observed a modern CVC mechanical complication rate of 1.9%. While the complication rate is similar to previous studies, there appear to be lower rates of pneumothorax and higher rates of bleeding complications. We also identified a deficit in trainee education regarding unsuccessful CVC placement; this is a novel finding and requires further investigation at other centers.
Disclosure: The authors have no conflicts of interest to report.
1. McGee DC, Gould MK. Preventing complications of central venous catheterization. N Engl J Med. 2003;348(12):1123-1133. PubMed
2. Parienti JJ, Mongardon N, Mégarbane B, et al. Intravascular complications of central venous catheterization by insertion site. N Engl J Med. 2015;373(13):1220-1229. PubMed
3. Eisen LA, Narasimhan M, Berger JS, Mayo PH, Rosen MJ, Schneider RF. Mechanical complications of central venous catheters. J Intensive Care Med. 2006;21(1):40-46. PubMed
4. Mansfield PF, Hohn DC, Fornage BD, Gregurich MA, Ota DM. Complications and failures of subclavian-vein catheterization. N Engl J Med. 1994;331(26):1735-1738. PubMed
5. Merrer J, De Jonghe B, Golliot F, et al. Complications of femoral and subclavian venous catheterization in critically ill patients: A randomized controlled trial. JAMA. 2001;286(6):700-707. PubMed
6. Steele R, Irvin CB. Central line mechanical complication rate in emergency medicine patients. Acad Emerg Med. 2001;8(2):204-207. PubMed
7. Calvache JA, Rodriguez MV, Trochez A, Klimek M, Stolker RJ, Lesaffre E. Incidence of mechanical complications of central venous catheterization using landmark technique: Do not try more than 3 times. J Intensive Care Med. 2016;31(6):397-402. PubMed
8. Barsuk JH, McDaghie WC, Cohen ER, Balachandran JS, Wayne DB. Use of simulation-based mastery learning to improve the quality of central venous catheter placement in a medical intensive care unit. J Hosp Med. 2009;4(7):397-403. PubMed
9. American Society of Anesthesiologists Task Force on Central Venous Access, Rupp SM, Apfelbaum JL, et al. Practice guidelines for central venous access: A report by the American Society of Anesthesiologists Task Force on Central Venous Access. Anesthesiology. 2012;116(3):539-573. PubMed
10. Bodenham Chair A, Babu S, Bennett J, et al. Association of Anaesthetists of Great Britian and Irealand: Safe vascular access 2016. Anaesthesia. 2016;71:573-585. PubMed
11. Frykholm P, Pikwer A, Hammarskjöld F, et al. Clinical guidelines on central venous catheterisation. Swedish Society of Anaesthesiology and Intensic Care Medicine. Acta Anaesteshiol Scand. 2014;58(5):508-524. PubMed
12. Sekiguchi H, Tokita JE, Minami T, Eisen LA, Mayo PH, Narasimhan M. A prerotational, simulation-based workshop improves the safety of central venous catheter insertion: Results of a successful internal medicine house staff training program. Chest. 2011;140(3): 652-658. PubMed
13. Dong Y, Suri HS, Cook DA, et al. Simulation-based objective assessment discerns clinical proficiency in central line placement: A construct validation. Chest. 2010;137(5):1050-1056. PubMed
14. Evans LV, Dodge KL, Shah TD, et al. Simulation training in central venous catheter insertion: Improved performance in clinical practice. Acad Med. 2010;85(9):1462-1469. PubMed
15. Lefrant JY, Muller L, De La Coussaye JE et al. Risk factors of failure and immediate complication of subclavian vein catheterization in critically ill patients. Intensive Care Med. 2002;28(8):1036-1041. PubMed
16. Dariushnia SR, Wallace MJ, Siddigi NH, et al. Quality improvement guidelines for central venous access. J Vasc Interv Radiol. 2010;21(7):976-981. PubMed
17. Barnes GD, Lucas E, Alexander GC, Goldberger ZD. National trends in ambulatory oral anticoagulant use. Am J Med. 2015;128(12):1300-1305.e2. PubMed
Central venous catheter (CVC) placement is commonly performed in emergency and critical care settings for parenteral access, central monitoring, and hemodialysis. Although potentially lifesaving, CVC insertion is associated with immediate risks, including injury to nerves, vessels, and lungs.1-3 These “insertion-related complications” are of particular interest for several reasons. First, the frequency of such complications varies widely, with published rates between 1.4% and 33.2%.2-7 Reasons for this variation include differences in study definitions of complications (eg, pneumothorax and tip position),2,5 setting of CVC placement (eg, intensive care unit [ICU] vs emergency room), timing of placement (eg, elective vs emergent), differences in technique, and type of operator (eg, experienced vs learner). Thus, the precise incidence of such events in modern-day training settings with use of ultrasound guidance remains uncertain. Second, mechanical complications might be preventable with adequate training and supervision. Indeed, studies using simulation-based mastery techniques have demonstrated a reduction in rates of complications following intensive training.8 Finally, understanding risk factors associated with insertion complications might inform preventive strategies and improve patient safety.9-11
Few studies to date have examined trainees’ perceptions of CVC training, experience, supervision, and ability to recognize and prevent mechanical complications. While research investigating the effects of simulation training has accumulated, most studies focus on successful completion of the procedure or individual procedural steps, with little emphasis on operator perceptions.12-14 In addition, while multiple studies have shown that unsuccessful line attempts are a risk factor for CVC complications,3,4,7,15 very little is known about trainee behavior and perceptions regarding unsuccessful line placement. CVC simulation trainings often assume successful completion of the procedure and do not address the crucial postprocedure steps that should be undertaken if a procedure is unsuccessful. For these reasons, we developed a survey to specifically examine trainee experience with CVC placement, supervision, postprocedural behavior, and attitudes regarding unsuccessful line placement.
We therefore designed a study with 2 specific goals: first, to perform a contemporary analysis of the CVC mechanical complication rate at an academic teaching institution and to identify potential risk factors associated with these complications; and second, to determine trainee perceptions regarding CVC complication experience, prevention, procedural supervision, and unsuccessful line placement.
METHODS
Design and Setting
We conducted a single-center retrospective review of nontunneled acute CVC procedures between June 1, 2014, and May 1, 2015, at the University of Michigan Health System (UMHS). UMHS is a tertiary care referral center with over 900 inpatient beds, including 99 ICU beds.
All residents in internal medicine, surgery, anesthesia, and emergency medicine receive mandatory education in CVC placement that includes an online training module and simulation-based training with competency assessment. Use of real-time ultrasound guidance is considered the standard of care for CVC placement.
Data Collection
Inpatient procedure notes were electronically searched for terms indicating CVC placement. This search was performed by our hospital’s Data Office for Clinical and Translational Research using the Electronic Medical Record Search Engine tool; please see the supplemental materials for the full list of search terms. We electronically extracted data, including date of procedure, gender, and the most recent body mass index (BMI) within 1 year prior to the note. Acute Physiology and Chronic Health Evaluation III (APACHE III) data are tracked for all patients on admission to the ICU and were collected when available. Charts were then manually reviewed to collect additional data, including international normalized ratio (INR), platelet count, and lactate level on the day of CVC placement; anticoagulant use (actively prescribed warfarin, therapeutic enoxaparin, therapeutic unfractionated heparin, or a direct oral anticoagulant); ventilator or noninvasive positive pressure ventilation (NIPPV) at the time of CVC placement; and vasopressor requirement within 24 hours of CVC placement. The procedure note was reviewed to gather information about the site of CVC placement, size and type of catheter, number of attempts, procedural success, training level of the operator, and attending presence. Small-bore CVCs were defined as 7 French (Fr) or smaller. Large-bore CVCs were defined as >7 Fr; these include dialysis catheters, Cordis catheters (Cordis, Fremont, CA), and cooling catheters. The times of the procedure note and postprocedure chest x-ray (CXR) were recorded, including whether the CVC was placed on a weekend (Friday 7
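The full search-term list lives in the study’s supplemental materials; as a minimal sketch of this style of note screening, the Python snippet below matches a few hypothetical CVC-related terms against free-text procedure notes. The terms and the note structure are illustrative assumptions, not the study’s actual query.

```python
import re

# Hypothetical search terms; the study's full list is in its supplemental
# materials, so these are illustrative placeholders only.
CVC_TERMS = [
    r"central venous catheter",
    r"central line",
    r"\bCVC\b",
    r"internal jugular",
    r"subclavian",
    r"cordis",
]
CVC_PATTERN = re.compile("|".join(CVC_TERMS), flags=re.IGNORECASE)

def flag_cvc_notes(notes):
    """Return only the procedure notes whose text matches a CVC term."""
    return [note for note in notes if CVC_PATTERN.search(note["text"])]

# Toy usage with two fabricated notes
notes = [
    {"id": 1, "text": "Right internal jugular central line placed with ultrasound."},
    {"id": 2, "text": "Bedside paracentesis performed without complication."},
]
print([n["id"] for n in flag_cvc_notes(notes)])  # -> [1]
```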
Primary Outcome
The primary outcome was the rate of severe mechanical complications related to CVC placement. Similar to prior studies,2 we defined severe mechanical complications as arterial placement of a dilator or catheter, hemothorax, pneumothorax, cerebral ischemia, patient death (related to the procedure), significant hematoma, or vascular injury (defined as a complication requiring expert consultation or blood product transfusion). We did not require a minimum transfusion volume. We considered pneumothorax a complication regardless of whether chest tube intervention was performed, as pneumothorax subjects the patient to additional tests (eg, serial CXRs) and sometimes symptoms (shortness of breath, pain, anxiety) whether or not a chest tube is required. Complications were confirmed by direct review of procedure notes, progress notes, discharge summaries, and imaging studies.
Trainee Survey
A survey was electronically disseminated to all internal medicine and medicine-pediatrics residents to inquire about CVC experiences, including time spent in the medical ICU, number of CVCs performed, postprocedure behavior for both failed and successful CVCs, and supervision experience and attitudes. Please see the supplemental materials for the full survey contents.
Statistical Methods
Descriptive statistics (percentages) were used to summarize the data. Continuous and categorical variables were compared using Student t tests and chi-square tests, respectively. All analyses were performed using SAS 9.3 (SAS Institute, Cary, NC).
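The study’s analyses were run in SAS 9.3; purely as an illustration of the two test types described here, the following Python sketch applies a Student t test to a continuous variable and a chi-square test to a 2x2 table. The BMI samples are simulated placeholders, and the site-by-complication table is a rough reconstruction from counts reported in the Results, not the actual analysis dataset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Continuous variable (eg, BMI): Student t test between groups.
# Simulated placeholder samples centered on the reported group means.
bmi_complication = rng.normal(25.7, 5.0, size=14)
bmi_no_complication = rng.normal(31.0, 7.0, size=716)
t_stat, p_val = stats.ttest_ind(bmi_complication, bmi_no_complication)
print(f"BMI: t = {t_stat:.2f}, P = {p_val:.3f}")

# Categorical variable (eg, subclavian site): chi-square test on a 2x2 table.
# Rows: subclavian vs other site; columns: complication vs no complication.
table = np.array([
    [3, 54],    # subclavian placements
    [11, 662],  # all other placements
])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"Site: chi2 = {chi2:.2f}, P = {p:.3f}")
```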
Ethical and Regulatory Oversight
The study was deemed exempt by the University of Michigan Institutional Review Board (HUM00100549) as data collection was part of a quality improvement effort.
RESULTS
Demographics and Characteristics of Device Insertion
Between June 1, 2014, and May 1, 2015, 730 CVC procedure notes were reviewed (Table 1). The mean age of the study population was 58.9 years, and 41.6% (n = 304) were female. BMI data were available for 400 patients without complications and 5 patients with complications; the average BMI was 31.5 kg/m2. The APACHE III score was available for 442 patients without complications and 10 patients with complications; the average score was 86 (range 19-200). Most of the CVCs placed (n = 504, 69%) were small bore (≤7 Fr). The majority of catheters were placed in the internal jugular (IJ) position (n = 525, 71.9%), followed by femoral (n = 144, 19.7%), subclavian (n = 57, 7.8%), and undocumented (n = 4, 0.6%) locations. Ninety-six percent (n = 699) of CVCs were successfully placed. Seventy-six percent (n = 558) of procedure notes included documentation of the number of CVC attempts; of these, 85% documented 2 or fewer attempts. The majority of CVCs were placed by residents (n = 537, 73.9%), followed by fellows (n = 127, 17.5%) and attendings (n = 27, 3.7%). Attending supervision for all or key portions of CVC placement occurred 34.7% (n = 244) of the time overall and was lower for internal medicine trainees (n = 98/463, 21.2%) than for surgical trainees (n = 73/127, 57.4%) or emergency medicine trainees (n = 62/96, 64.6%; P < 0.001). All successful IJ and subclavian CVCs except for 2 insertions (0.3%) had a postprocedure CXR. A minority of notes documented pressure transduction (4.5%) or blood gas analysis (0.2%) to confirm venous placement.
Mechanical Complications
The mechanical complications identified included pneumothorax (n = 5), bleeding requiring transfusion (n = 3), vascular injury requiring expert consultation or intervention (n = 3), stroke (n = 1), and death (n = 2). Vascular injuries included 1 neck hematoma with superinfection requiring antibiotics, 1 neck hematoma requiring otolaryngology and vascular surgery consultation, and 1 venous dissection of the IJ vein requiring vascular surgery consultation. None of these cases required operative intervention. The stroke was caused by inadvertent CVC placement into the carotid artery. One patient experienced a tension pneumothorax and died of this complication; this death occurred after 3 failed left subclavian CVC attempts and an ultimately successful CVC placement into the left IJ vein. Another death occurred immediately following an unsuccessful Cordis placement. As no autopsy was performed, it is impossible to know whether the cause of death was the line placement; however, it is prudent to consider it a CVC complication given the temporal relationship to line placement. Thus, the total number of patients who experienced severe mechanical complications was 14 of 730 (1.92%).
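The 1.92% figure is reported as a point estimate only. As an illustration of the statistical uncertainty around 14 events in 730 procedures (our addition, not a result from the study), the snippet below computes an exact Clopper-Pearson 95% confidence interval.

```python
from statsmodels.stats.proportion import proportion_confint

events, n = 14, 730  # severe mechanical complications / CVC procedures
rate = events / n

# Exact (Clopper-Pearson) 95% confidence interval for a binomial proportion
low, high = proportion_confint(events, n, alpha=0.05, method="beta")
print(f"Complication rate: {rate:.2%} (95% CI {low:.2%} to {high:.2%})")
```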
Risk Factors for Mechanical Complications
Certain patient factors were more commonly associated with complications. For example, BMI was significantly lower in the group that experienced complications than in the group that did not (25.7 vs 31.0 kg/m2, P = 0.001). No other significant associations were noted for demographic factors, including age (61.4 vs 58.9 years, P = 0.57), sex (57.1% male vs 41.3% female, P = 0.24), and admission APACHE III score (96 vs 86, P = 0.397). The mean INR, platelet count, and lactate did not differ between the 2 groups, and there was no difference in vasopressor use. Ventilator use (including endotracheal tube or NIPPV) was significantly more common in the group that experienced mechanical complications (78.5% vs 65.9%, P = 0.001). Anticoagulation use was also associated with mechanical complications (28.6% vs 20.6%, P = 0.05); 3 patients on anticoagulation experienced significant hematomas. Mechanical complications were more common with the subclavian location (21.4% vs 7.8%, P = 0.001); in all 3 cases involving subclavian CVC placement, the complication was pneumothorax. The number of attempts also differed significantly between the 2 groups, averaging 1.5 attempts in the group without complications and 2.2 attempts in the group that experienced complications (P = 0.02). Additionally, rates of successful placement were lower among patients who experienced complications (78.6% vs 95.7%, P = 0.001).
With respect to operator characteristics, no significant difference in level of training was noted between those who experienced complications and those who did not. Attending supervision was more frequent in the group that experienced complications (61.5% vs 34.2%, P = 0.04). There was no significant difference in complication rate between the first and second halves of the academic year (0.4% vs 0.3% per month, P = 0.30) or between CVC placement during the day vs at night (1.9% vs 2.0%, P = 0.97). A trend toward more complications was observed for CVCs placed on weekends compared with weekdays (2.80% vs 1.23%, P = 0.125).
Unsuccessful CVCs
There were 30 documented unsuccessful CVC procedures, representing 4.1% of all procedures. Of these, 3 procedures had complications: 2 pneumothoraces (1 leading to death) and 1 unexplained death. Twenty-four of the unsuccessful CVC attempts were at either the subclavian or IJ location; of these, 5 (21%) did not have a postprocedure CXR obtained.
Survey Results
The survey was completed by 103 out of 166 internal medicine residents (62% response rate). Of these, 55% (n = 57) reported having performed 5 or more CVCs, and 14% (n = 14) had performed more than 15 CVCs.
All respondents who had performed at least 1 CVC (n = 80) were asked about their perceptions of attending supervision. Eighty-one percent (n = 65/80) responded that they had never been directly supervised by an attending during CVC placement, while 16% (n = 13/80) reported being supervised less than 25% of the time. Most (n = 53/75, 71%) did not feel that attending supervision affected their performance, while 21% (n = 16/75) felt it affected performance negatively, and only 8% (n = 6/75) stated that it affected performance positively. Nineteen percent (n = 15/80) indicated that they would prefer more attending supervision, while 35% (n = 28/80) did not wish for more, and 46% (n = 37/80) were indifferent.
DISCUSSION
We performed a contemporary analysis of CVC placement at an academic tertiary care center and observed a severe mechanical complication rate of 1.9%. This rate is within previously described acceptable thresholds.16 Our study adds to the literature by identifying several important risk factors for the development of mechanical complications. We confirm many risk factors that have been noted historically, such as subclavian line location,2,3 attending supervision,3 low BMI,4 number of CVC attempts, and unsuccessful CVC placement.3,4,7,15 We also identified several novel risk factors, including systemic anticoagulation and ventilator use. Lastly, we identified unexpected deficits in trainee knowledge surrounding management of failed CVCs and negative attitudes regarding attending supervision.
Most existing literature evaluated risk factors for CVC complications prior to routine ultrasound use;3-5,7,15 surprisingly, it appears that severe mechanical complication rates do not differ dramatically in the real-time ultrasound era. Eisen et al.3 prospectively studied CVC placement at an academic medical center and found a severe mechanical complication rate (as defined in our paper) of 1.9%, due to pneumothorax (1.3%), hemothorax (0.3%), and death (0.3%). We would expect the number of complications to decrease in the postultrasound era, and indeed pneumothoraces appear to have decreased, likely owing to ultrasound guidance and decreased use of the subclavian site. In contrast, rates of significant hematomas and bleeding were higher in our study. Although we are unable to state why this may be the case, increasing use of anticoagulation in the general population might explain this finding.17 For instance, of the 6 patients who experienced hematomas or vascular injuries in our study, 3 were on anticoagulation at the time of CVC placement.
Interestingly, neither the time of academic year of CVC placement, the level of training, nor the time of day of CVC placement was associated with an increased risk of complications. In contrast, Merrer et al.5 showed that CVC insertion during nighttime was significantly associated with increased mechanical complications (odds ratio 2.06; 95% confidence interval, 1.04-4.08; P = 0.03). This difference may be attributable to the fact that most of our ICUs now have a night float system rather than a more traditional 24-hour call model; therefore, trainees are less likely to be sleep deprived during CVC placement at night.
Severity of illness did not appear to significantly affect mechanical complication rates, based on similar APACHE scores between the 2 groups. In addition, other indicators of illness severity (vasopressor use and lactate level) did not suggest that sicker patients were more likely to experience mechanical complications. One could conjecture that sicker patients were more likely to have lines placed by more experienced trainees, although the present study design does not allow us to answer this question. Interestingly, ventilator use was associated with higher rates of complications. We cannot say definitively why this was the case; however, 1 contributing factor may be the physical constraints of placing the CVC around ventilator tubing.
Several unexpected findings surrounding attending supervision were noted: first, attending supervision appears to be significantly associated with an increased complication rate, and second, trainees hold negative perceptions of attending supervision. Eisen et al.3 showed a similar association between attending supervision and complication rate. The increased complication rate may reflect that sicker patients are more likely to have procedural supervision by attendings, that attending physicians may be called to supervise when a CVC placement is not going as planned, or that attendings may supervise more inexperienced operators. The reasons behind negative trainee attitudes toward supervision are unclear, and literature on this topic is limited. This is an area that warrants further exploration in future studies.
Another unexpected finding concerns trainee practices regarding unsuccessful CVC placement: most trainees do not document failed procedures or order follow-up CXRs after unsuccessful CVC attempts. Given the higher risk of complications after unsuccessful CVCs, it is paramount that all physicians be trained to order a postprocedure CXR to rule out pneumothorax or hemothorax. Furthermore, documentation of failed procedures is important for medical accuracy, transparency, and hospital billing. It is unknown whether these practices surrounding unsuccessful CVCs are institution-specific or more widespread. To our knowledge, this is the first published report of trainee practices regarding failed CVC placement. Interestingly, while many current guidelines call attention to the prevention, recognition, and management of central line-associated mechanical complications, specific recommendations about postprocedure behavior after failed CVC placement have not been published.9-11 We feel it is critical that institutions reflect on their own practices, especially given that unsuccessful CVCs are correlated with a significant increase in complication rate. At our own institution, we have added an educational component to central line training for medicine trainees that specifically addresses failed central line attempts.
This study has several limitations, including its retrospective design at a single institution. The low overall number of complications reduced our ability to detect risk factors and precluded multivariable adjustment. In addition, only documented CVC attempts that met our search criteria were captured, and not all notes contained information such as the number of attempts or supervision. Furthermore, the definition of a CVC “attempt” is left to the operator’s discretion.
In conclusion, we observed a modern CVC mechanical complication rate of 1.9%. While this rate is similar to that of previous studies, we observed lower rates of pneumothorax and higher rates of bleeding complications. We also identified a deficit in trainee education regarding unsuccessful CVC placement; this is a novel finding that warrants further investigation at other centers.
Disclosure: The authors have no conflicts of interest to report.
1. McGee DC, Gould MK. Preventing complications of central venous catheterization. N Engl J Med. 2003;348(12):1123-1133.
2. Parienti JJ, Mongardon N, Mégarbane B, et al. Intravascular complications of central venous catheterization by insertion site. N Engl J Med. 2015;373(13):1220-1229.
3. Eisen LA, Narasimhan M, Berger JS, Mayo PH, Rosen MJ, Schneider RF. Mechanical complications of central venous catheters. J Intensive Care Med. 2006;21(1):40-46.
4. Mansfield PF, Hohn DC, Fornage BD, Gregurich MA, Ota DM. Complications and failures of subclavian-vein catheterization. N Engl J Med. 1994;331(26):1735-1738.
5. Merrer J, De Jonghe B, Golliot F, et al. Complications of femoral and subclavian venous catheterization in critically ill patients: A randomized controlled trial. JAMA. 2001;286(6):700-707.
6. Steele R, Irvin CB. Central line mechanical complication rate in emergency medicine patients. Acad Emerg Med. 2001;8(2):204-207.
7. Calvache JA, Rodriguez MV, Trochez A, Klimek M, Stolker RJ, Lesaffre E. Incidence of mechanical complications of central venous catheterization using landmark technique: Do not try more than 3 times. J Intensive Care Med. 2016;31(6):397-402.
8. Barsuk JH, McGaghie WC, Cohen ER, Balachandran JS, Wayne DB. Use of simulation-based mastery learning to improve the quality of central venous catheter placement in a medical intensive care unit. J Hosp Med. 2009;4(7):397-403.
9. American Society of Anesthesiologists Task Force on Central Venous Access, Rupp SM, Apfelbaum JL, et al. Practice guidelines for central venous access: A report by the American Society of Anesthesiologists Task Force on Central Venous Access. Anesthesiology. 2012;116(3):539-573.
10. Bodenham A, Babu S, Bennett J, et al. Association of Anaesthetists of Great Britain and Ireland: Safe vascular access 2016. Anaesthesia. 2016;71:573-585.
11. Frykholm P, Pikwer A, Hammarskjöld F, et al. Clinical guidelines on central venous catheterisation. Swedish Society of Anaesthesiology and Intensive Care Medicine. Acta Anaesthesiol Scand. 2014;58(5):508-524.
12. Sekiguchi H, Tokita JE, Minami T, Eisen LA, Mayo PH, Narasimhan M. A prerotational, simulation-based workshop improves the safety of central venous catheter insertion: Results of a successful internal medicine house staff training program. Chest. 2011;140(3):652-658.
13. Dong Y, Suri HS, Cook DA, et al. Simulation-based objective assessment discerns clinical proficiency in central line placement: A construct validation. Chest. 2010;137(5):1050-1056.
14. Evans LV, Dodge KL, Shah TD, et al. Simulation training in central venous catheter insertion: Improved performance in clinical practice. Acad Med. 2010;85(9):1462-1469.
15. Lefrant JY, Muller L, De La Coussaye JE, et al. Risk factors of failure and immediate complication of subclavian vein catheterization in critically ill patients. Intensive Care Med. 2002;28(8):1036-1041.
16. Dariushnia SR, Wallace MJ, Siddiqi NH, et al. Quality improvement guidelines for central venous access. J Vasc Interv Radiol. 2010;21(7):976-981.
17. Barnes GD, Lucas E, Alexander GC, Goldberger ZD. National trends in ambulatory oral anticoagulant use. Am J Med. 2015;128(12):1300-1305.e2.