Lateral Ulnar Collateral Ligament Reconstruction: An Analysis of Ulnar Tunnel Locations
Posterolateral rotatory instability (PLRI) of the elbow is well recognized1 and is the most common type of chronic elbow instability. PLRI is often an end result of traumatic elbow dislocation.2 The “essential lesion” in patients with PLRI of the elbow is injury to the lateral ulnar collateral ligament (LUCL).1 However, more recent research has emphasized the importance of other ligaments in the lateral ligament complex (radial collateral and annular ligaments) in preventing PLRI.3-5 Nevertheless, when conservative treatment fails, the most commonly used surgical treatment involves LUCL reconstruction.1,6-11
Numerous techniques for LUCL reconstruction have been described.1,7-9,11-13 The chosen technique ideally restores normal anatomy. Therefore, the isometric point of origin at the lateral epicondyle and insertion at the supinator tubercle are important landmarks for creating tunnels that reproduce isometry, function, and normal anatomy. Most often, 2 tunnels are created in the ulna to secure the graft. It has been our experience that ulnar tunnel creation can affect the length of the bony bridge and the orientation of the graft.
We conducted a study to identify the precise proximal ulna tunnel location—anterior to posterior, with the distal tunnel at the supinator tubercle on the crest—that allows for the largest bony bridge and most geometrically favorable construct. We hypothesized that a most posteriorly placed proximal tunnel would increase bony bridge size and allow for a more isosceles graft configuration. An isosceles configuration with the humerus tunnel at the isometric location would allow for anterior and posterior bands of the same length with theoretically equal force distribution.
Methods
After obtaining institutional review board approval, we retrospectively reviewed the cases of 17 adults with elbow computed tomography (CT) scans for inclusion in this study. The scans were previously performed for diagnostic workup of several pathologies, including valgus instability, olecranon stress fracture, and valgus extension overload. The scan protocol involved 0.5-mm axial cuts with inclusion of the distal humerus through the proximal radius and ulna in the DICOM (Digital Imaging and Communications in Medicine) format. Exclusion criteria included poor CT quality, inadequate visualization of the entire supinator crest, and age under 18 years. Fifteen patients with adequate CT scans met the inclusion criteria. MIMICS (Materialise’s Interactive Medical Image Control System) software was used to convert scans into patient-specific 3-dimensional (3-D) computer models. (Use of this software to produce anatomically accurate models has been verified in shoulder14 and elbow15 models.) These models were uploaded into Magics rapid prototyping software (Materialise) and manipulated for simulated tunnel drilling by precise bone subtraction methods. This software was used to define an ulnar Cartesian coordinate system with anatomical landmarks as reference points in order to standardize the position of each model (Figure 1).16 The y-axis was defined by the longitudinal axis of the ulna, and the x-axis was the transepicondylar axis, defined as the perpendicular line connecting the y-axis with the supinator crest. The z-axis was then established as the line perpendicular to the x- and y-axes—yielding a 3-D coordinate system that allowed us to manipulate the models in standardized fashion, maintaining the exact positions of the ulna while making measurements.
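The coordinate-system construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure implemented in the Magics software: the function names (`ulnar_frame`, `to_local`), the choice of two points to define the longitudinal axis, the single crest landmark, and the right-handed orientation of the z-axis are all hypothetical.

```python
import math

def _sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def _scale(a, s):
    return tuple(ai * s for ai in a)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    norm = math.sqrt(_dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return _scale(a, 1.0 / norm)

def ulnar_frame(axis_start, axis_end, crest_point):
    """Build an orthonormal frame from three landmarks (all hypothetical inputs).

    y: unit vector along the longitudinal axis (axis_start -> axis_end).
    x: unit vector along the perpendicular dropped from the crest landmark
       onto the longitudinal axis.
    z: x cross y, so the frame is right-handed (an assumed convention).
    The origin is the foot of that perpendicular on the longitudinal axis.
    """
    y = _unit(_sub(axis_end, axis_start))
    to_crest = _sub(crest_point, axis_start)
    along = _dot(to_crest, y)  # signed distance of the foot along the axis
    origin = tuple(s + c for s, c in zip(axis_start, _scale(y, along)))
    x = _unit(_sub(crest_point, origin))
    z = _cross(x, y)
    return origin, x, y, z

def to_local(point, frame):
    """Express a point in the coordinates of the frame returned by ulnar_frame."""
    origin, x, y, z = frame
    d = _sub(point, origin)
    return (_dot(d, x), _dot(d, y), _dot(d, z))

# Made-up landmark coordinates in millimeters, for illustration only.
frame = ulnar_frame((0.0, 0.0, 0.0), (0.0, 0.0, 10.0), (3.0, 0.0, 4.0))
crest_local = to_local((3.0, 0.0, 4.0), frame)
```

Expressing every model in a frame built from the same landmarks is what lets tunnel positions and measurements be compared across patients.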
Surgical simulations were performed in the rapid prototyping software by creating a cylinder and placing it at the desired location of each tunnel. Cylinder diameter was 4 mm, matching the diameter of the drill we use to create each tunnel in our practice. The cylinder was inserted into the bone, perpendicular to the surface of the ulna at the point of insertion, so the cylinder’s deepest point entered the medullary canal of the ulna. Using a Boolean operation in the rapid prototyping software, we subtracted cylinder from bone to create a tunnel (Figure 2).15
In a previous study,17 we determined that the radial head junction is reproducibly about 15 mm proximal to the distinct supinator tubercle, which may be absent or not readily appreciated in up to 50% of cases. Therefore, proximal ulnar tunnels were placed 0, 5, and 10 mm posterior to the supinator crest at the radial head junction. Distal tunnels were placed 15 mm distal to the radial head junction on the supinator crest (Figure 2). The bony bridges created by these tunnels were measured, as was the distance between the distal tunnel and the supinator tubercle.
Ideal graft configuration was described as an isosceles triangle with ulna tunnels perpendicular to the humeral tunnel (Figure 3).11 Location of the humeral origin in the sagittal plane was determined by finding the isometric point of the lateral humerus using only bony landmarks. Similar techniques have been used to find the isometric point on the medial epicondyle for medial ulnar collateral ligament reconstruction.15,18 With a circle fit into the trochlear notch of the ulna, the isometric point can be determined by the center of the circle. This point was then superimposed on the humerus to identify the starting point (Figure 4). In our simulation, we measured the isosceles configuration by drawing a line between the proximal and distal tunnels, and then another line connecting the bisecting point of the first line with the isometric point on the humerus from which the graft would originate. The angle between the 2 lines was measured; if isosceles, the angle was 90° (Figure 5). Length of the more proximal limb of the graft and the more distal limb of the graft was determined by measuring the distance from the isometric point to the proximal and distal tunnels, respectively (Figure 6).
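The angle and limb-length measurements described above reduce to simple vector geometry. The following Python sketch is illustrative only: the function name `isosceles_metrics`, the representation of each tunnel and the isometric point as a single 3-D point, and the choice to measure the angle toward the distal tunnel are assumptions, not details taken from the measurement software.

```python
import math

def isosceles_metrics(proximal, distal, isometric):
    """Return (angle_deg, proximal_limb, distal_limb) for one graft construct.

    proximal, distal: points representing the two ulnar tunnels.
    isometric: point on the humerus from which the graft would originate.
    The angle is measured between the proximal-to-distal line and the line
    from the midpoint of that segment to the isometric point; it equals 90
    degrees exactly when the two limbs are the same length.
    """
    midpoint = tuple((p + d) / 2.0 for p, d in zip(proximal, distal))
    base = tuple(d - p for p, d in zip(proximal, distal))
    median = tuple(i - m for i, m in zip(isometric, midpoint))

    base_len = math.hypot(*base)
    median_len = math.hypot(*median)
    if base_len == 0.0 or median_len == 0.0:
        raise ValueError("degenerate configuration: coincident points")

    cos_angle = sum(b * m for b, m in zip(base, median)) / (base_len * median_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle_deg = math.degrees(math.acos(cos_angle))

    proximal_limb = math.dist(isometric, proximal)
    distal_limb = math.dist(isometric, distal)
    return angle_deg, proximal_limb, distal_limb

# Made-up coordinates in millimeters, for illustration only.
angle, prox_len, dist_len = isosceles_metrics(
    (0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (5.0, 12.0, 0.0)
)
```

Because the median of a triangle is perpendicular to the base only when the two remaining sides are equal, the deviation of this angle from 90° and the difference between the two limb lengths describe the same departure from an isosceles configuration.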
One-way analysis of variance was used to compare all the tunnels’ bony bridge sizes, graft lengths, and angles to the isometric point. For all comparisons, statistical significance was set at P < .05. As no other studies have compared bony bridges by varying tunnel creation parameters, and as the present study is observational and not comparative, no power analysis was performed.
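For readers unfamiliar with the test, the F statistic of a one-way analysis of variance can be computed as sketched below. This is a generic textbook calculation in plain Python, not the authors' statistical workflow; the function name `one_way_anova_f` and the sample values are hypothetical and are not study data. Converting F to a P value requires the F distribution, which a statistics package (for example, `scipy.stats.f_oneway`) would supply.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more observations than groups")

    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0.0:
        raise ValueError("no within-group variance; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up values standing in for measurements at three tunnel positions.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
```

A large F indicates that the variation between tunnel positions is large relative to the variation among models at the same position.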
Results
Bony bridges were significantly longer, and angles more perpendicular, with increasing distance from the proximal tunnel to the supinator crest (Table 1, Figure 5, Figure 7). The proximal tunnel placed 0 mm posterior to the supinator crest yielded a mean (SE) bony bridge length of 11.0 (0.2) mm. This proximal tunnel also yielded the least perpendicular mean (SE) angle to the isometric point, 131.2° (1.9°). The tunnel most posterior to the supinator crest yielded the longest mean (SE) bony bridge, 13.7 (0.2) mm, and the most nearly perpendicular mean (SE) angle, 95.8° (1.4°). The differences between all tunnels’ bony bridges and isometric angles were statistically significant (P < .00001). The difference between the more distal limb and the more proximal limb of the graft was smallest with the more posteriorly placed proximal tunnel (Table 2, Figure 8). In fact, there was no statistically significant difference between the proximal and distal limbs of the graft when the proximal tunnel was placed 10 mm posterior to the supinator crest: The mean (SE) difference in limb length was 9.4 (0.5) mm at 0 mm (P < .00001) and 1.1 (0.6) mm at 10 mm (P = .24).
Discussion
PLRI of the elbow is best initially managed nonoperatively. However, when nonoperative management fails, the LUCL is often surgically reconstructed. Reconstruction methods vary by fixation method, graft choice, and bone tunnels.1,7-9,11-13 In 1991, O’Driscoll and colleagues1 described a “yoke” technique for LUCL reconstruction. Since then, the docking technique7 and other techniques have been developed. All these techniques emphasize maximizing anatomical precision and isometry with careful placement of tunnels or fixation devices. The humeral fixation site, at the anterior inferior aspect of the lateral epicondyle at the point of isometry, can be accessed relatively reproducibly. By contrast, the ulnar points of fixation are more variable, because of increased bone stock and overlying soft-tissue and bony anatomy.
Among the challenges in determining the points of ulnar fixation is the bony anatomy that is often used for landmarks. In the literature, the supinator crest or the supinator tubercle is the landmark for placing the distal tunnel.1,7-9,11-13 This is a problem for 2 reasons. First, the supinator crest, a longitudinal structure on the lateral aspect of the ulna, originates from the radial head junction and extends tens of millimeters distally; further specification is needed to guide these ulnar tunnels. Second, although use of the supinator tubercle, a prominence on the supinator crest, adds specificity to the location of the ulnar tunnels, the tubercle may not be a reliable, independently prominent structure during surgery; instead, it may be indistinguishable from the supinator crest, on which it rests. One study determined that only about 50% of computer models of patient ulnas had a distinct prominence that could be classified as the supinator tubercle.17 The percentage presumably is lower during surgery, with limited exposure and overlying soft tissues.
In a study of patients with a prominent tubercle, mean (SE) distance from radial head junction to tubercle was 15 (2) mm.17 This finding led us to use the radial head junction as the primary bony landmark in determining the location of the proximal tunnel and placing the distal tunnel 15 mm distally—achieving the same fixation described in the literature but using more distinct landmarks. Our study thus provided a reliable, verified approach to locating the ulnar tunnels in the proximal-distal axis.
We also explored the anterior-posterior orientation of the proximal ulnar tunnel. The 2 primary considerations surrounding the varied proximal tunnel placements were the bony bridge formed between the proximal and distal tunnels and the perpendicularity of the triangle formed by the fixation points. Maximizing the bony bridge is ideal for securing fixation and preventing blowout. Achieving an isosceles reconstruction has been reported in the literature on the various fixation techniques for LUCL reconstruction.11 Although the biomechanical advantage of this fixation type is not fully clear, we assume the construct produces graft strands of equal length, tension, and stability. In addition, the larger footprint created by an isosceles reconstructed ligament increases the stability of the radial head.
Results of the present study showed that the more posterior the proximal ulnar tunnel, the longer the bony bridge and the more isosceles the reconstruction. The difference in bony bridge distance from the most anterior to the most posterior tunnel was about 2 mm, or 18%. For every 1 mm of posteriorization, the bony bridge was 0.2 mm longer. The line from the isometric point of humeral fixation bisecting the proximal and distal tunnels was also more perpendicular with the most posterior tunnel, by about 40°. The resulting proximal and distal limbs of the reconstruction were closer to equal in length, as demonstrated by the smaller difference between the limbs. We assume this isosceles reconstruction more likely applies uniform restraint on the radial head. Thus, an effort should be made to posteriorize the proximal ulnar tunnel during reconstruction.
The study was limited by the number of patient-specific elbow models used. However, given the statistical consistency of measurements, sample size was sufficient. Another limitation, inherent to the model, was that only bony anatomy was incorporated. However, the overlying muscles, tendons, and ligaments can significantly alter tunnel placement, and this study provided other means and cues using more reliable landmarks to adequately place the tunnels. As this was a simulation study, we cannot confirm whether these results would make a difference clinically. The strengths of this study include development and verification of reliable landmarks that can be used to guide ulnar tunnel locations during LUCL reconstruction; these landmarks have been used for medial ulnar collateral ligament reconstruction.15 Other strengths include precise and accurate placement of tunnels and measurement of resulting bony bridges—accomplished independently and without compromising specimen quality.
Conclusion
We recommend drilling the proximal ulnar tunnel posterior to the supinator crest at the level of the radial head junction. A reasonable goal is 10 mm posterior to the crest, though the overlying soft tissue must be considered, and care should be taken to aim the drill anteriorly, toward the ulna’s intramedullary canal, to avoid posterior cortical breach. The distal ulnar tunnel should be drilled just posterior to the supinator crest, 15 mm distal to the radial head junction.
1. O’Driscoll SW, Bell DF, Morrey BF. Posterolateral rotatory instability of the elbow. J Bone Joint Surg Am. 1991;73(3):440-446.
2. O’Driscoll SW. Classification and evaluation of recurrent instability of the elbow. Clin Orthop Relat Res. 2000;370:34-43.
3. Takigawa N, Ryu J, Kish VL, Kinoshita M, Abe M. Functional anatomy of the lateral collateral ligament complex of the elbow: morphology and strain. J Hand Surg Br. 2005;30(2):143-147.
4. McAdams TR, Masters GW, Srivastava S. The effect of arthroscopic sectioning of the lateral ligament complex of the elbow on posterolateral rotatory stability. J Shoulder Elbow Surg. 2005;14(3):298-301.
5. Dunning CE, Zarzour ZD, Patterson SD, Johnson JA, King GJ. Ligamentous stabilizers against posterolateral rotatory instability of the elbow. J Bone Joint Surg Am. 2001;83(12):1823-1828.
6. Eygendaal D. Ligamentous reconstruction around the elbow using triceps tendon. Acta Orthop Scand. 2004;75(5):516-523.
7. Jones KJ, Dodson CC, Osbahr DC, et al. The docking technique for lateral ulnar collateral ligament reconstruction: surgical technique and clinical outcomes. J Shoulder Elbow Surg. 2012;21(3):389-395.
8. Lee BP, Teo LH. Surgical reconstruction for posterolateral rotatory instability of the elbow. J Shoulder Elbow Surg. 2003;12(5):476-479.
9. Lin KY, Shen PH, Lee CH, Pan RY, Lin LC, Shen HC. Functional outcomes of surgical reconstruction for posterolateral rotatory instability of the elbow. Injury. 2012;43(10):1657-1661.
10. Olsen BS, Søjbjerg JO. The treatment of recurrent posterolateral instability of the elbow. J Bone Joint Surg Br. 2003;85(3):342-346.
11. Sanchez-Sotelo J, Morrey BF, O’Driscoll SW. Ligamentous repair and reconstruction for posterolateral rotatory instability of the elbow. J Bone Joint Surg Br. 2005;87(1):54-61.
12. Savoie FH 3rd, Field LD, Gurley DJ. Arthroscopic and open radial ulnohumeral ligament reconstruction for posterolateral rotatory instability of the elbow. Hand Clin. 2009;25(3):323-329.
13. Savoie FH 3rd, O’Brien MJ, Field LD, Gurley DJ. Arthroscopic and open radial ulnohumeral ligament reconstruction for posterolateral rotatory instability of the elbow. Clin Sports Med. 2010;29(4):611-618.
14. Bryce CD, Pennypacker JL, Kulkarni N, et al. Validation of three-dimensional models of in situ scapulae. J Shoulder Elbow Surg. 2008;17(5):825-832.
15. Byram IR, Khanna K, Gardner TR, Ahmad CS. Characterizing bone tunnel placement in medial ulnar collateral ligament reconstruction using patient-specific 3-dimensional computed tomography modeling. Am J Sports Med. 2013;41(4):894-902.
16. Shiba R, Sorbie C, Siu DW, Bryant JT, Cooke TD, Wevers HW. Geometry of the humeroulnar joint. J Orthop Res. 1988;6(6):897-906.
17. Anakwenze OA, Khanna K, Levine WN, Ahmad CS. Characterization of the supinator tubercle for lateral ulnar collateral ligament reconstruction. Orthop J Sports Med. 2014;2(4):2325967114530969. doi:10.1177/2325967114530969.
18. Sasashige Y, Ochi M, Ikuta Y. Optimal attachment site for reconstruction of the ulnar collateral ligament. A cadaver study. Arch Orthop Trauma Surg. 1994;113(5):265-270.
Posterolateral rotatory instability (PLRI) of the elbow is well recognized1 and is the most common type of chronic elbow instability. PLRI is often an end result of traumatic elbow dislocation.2 The “essential lesion” in patients with PLRI of the elbow is injury to the lateral ulnar collateral ligament (LUCL).1 However, more recent research has emphasized the importance of other ligaments in the lateral ligament complex (radial collateral and annular ligaments) in preventing PLRI.3-5 Nevertheless, when conservative treatment fails, the most commonly used surgical treatment involves LUCL reconstruction.1,6-11
Numerous techniques for LUCL reconstruction have been described.1,7-9,11-13 The chosen technique ideally restores normal anatomy. Therefore, the isometric point of origin at the lateral epicondyle and insertion at the supinator tubercle are important landmarks for creating tunnels that reproduce isometry, function, and normal anatomy. Most often, 2 tunnels are created in the ulna to secure the graft. It has been our experience that ulnar tunnel creation can affect the length of the bony bridge and the orientation of the graft.
We conducted a study to identify the precise proximal ulna tunnel location—anterior to posterior, with the distal tunnel at the supinator tubercle on the crest—that allows for the largest bony bridge and most geometrically favorable construct. We hypothesized that a most posteriorly placed proximal tunnel would increase bony bridge size and allow for a more isosceles graft configuration. An isosceles configuration with the humerus tunnel at the isometric location would allow for anterior and posterior bands of the same length with theoretically equal force distribution.
Methods
After obtaining institutional review board approval, we retrospectively reviewed the cases of 17 adults with elbow computed tomography (CT) scans for inclusion in this study. The scans were previously performed for diagnostic workup of several pathologies, including valgus instability, olecranon stress fracture, and valgus extension overload. The scan protocol involved 0.5-mm axial cuts with inclusion of the distal humerus through the proximal radius and ulna in the DICOM (Digital Imaging and Communications in Medicine) format. Exclusion criteria included poor CT quality, inadequate visualization of the entire supinator crest, and age under 18 years. Fifteen patients with adequate CT scans met the inclusion criteria. MIMICS (Materialise’s Interactive Medical Image Control System) software was used to convert scans into patient-specific 3-dimensional (3-D) computer models. (Use of this software to produce anatomically accurate models has been verified in shoulder14 and elbow15 models.) These models were uploaded into Magics rapid prototyping software (Materialise) and manipulated for simulated tunnel drilling by precise bone subtraction methods. This software was used to define an ulnar Cartesian coordinate system with anatomical landmarks as reference points in order to standardize the position of each model (Figure 1).16 The y-axis was defined by the longitudinal axis of the ulna, and the x-axis was the transepicondylar axis, defined as the perpendicular line connecting the y-axis with the supinator crest. The z-axis was then established as the line perpendicular to the x- and y-axes—yielding a 3-D coordinate system that allowed us to manipulate the models in standardized fashion, maintaining the exact positions of the ulna while making measurements.
Surgical simulations were performed in the rapid prototyping software by creating a cylinder and placing it at the desired location of each tunnel. Cylinder diameter was 4 mm, matching the diameter of the drill we use to create each tunnel in our practice. The cylinder was inserted into the bone, perpendicular to the surface of the ulna at the point of insertion, so the cylinder’s deepest point entered the medullary canal of the ulna. Using a Boolean operation in the rapid prototyping software, we subtracted cylinder from bone to create a tunnel (Figure 2).15
In a previous study,17 we determined that the radial head junction is reproducibly about 15 mm proximal to the distinct supinator tubercle, which may be absent or not readily appreciated in up to 50% of cases. Therefore, proximal ulnar tunnels were placed 0, 5, and 10 mm posterior to the supinator crest at the radial head junction. Distal tunnels were placed 15 mm anterior to the radial head junction on the supinator crest (Figure 2). The bony bridges created by these tunnels were measured, as was the distance between the distal tunnel and the supinator tubercle.
Ideal graft configuration was described as an isosceles triangle with ulna tunnels perpendicular to the humeral tunnel (Figure 3).11 Location of the humeral origin in the sagittal plane was determined by finding the isometric point of the lateral humerus using only bony landmarks. Similar techniques have been used to find the isometric point on the medial epicondyle for medial ulnar collateral ligament reconstruction.15,18 With a circle fit into the trochlear notch of the ulna, the isometric point can be determined by the center of the circle. This point was then superimposed on the humerus to identify the starting point (Figure 4). In our simulation, we measured the isosceles configuration by drawing a line between the proximal and distal tunnels, and then another line connecting the bisecting point of the first line with the isometric point on the humerus from which the graft would originate. The angle between the 2 lines was measured; if isosceles, the angle was 90° (Figure 5). Length of the more proximal limb of the graft and the more distal limb of the graft was determined by measuring the distance from the isometric point to the proximal and distal tunnels, respectively (Figure 6).
One-way analysis of variance was used to compare all the tunnels’ bony bridge sizes, graft lengths, and angles to the isometric point. For all comparisons, statistical significance was set at P < .05. As no other studies have compared bony bridges by varying tunnel creation parameters, and as the present study is observational and not comparative, no power analysis was performed.
Results
Bony bridges were significantly longer, and angles more perpendicular, with increasing distance from the proximal tunnel to the supinator crest (Table 1, Figure 5, Figure 7). The bony bridge 0 mm posterior to the supinator crest yielded a mean (SE) bony bridge length of 11.0 (0.2) mm. This proximal tunnel also yielded the smallest mean (SE) perpendicular angle to the isometric point, 131.2° (1.9°). The tunnel most posterior to the supinator crest yielded the longest mean (SE) bony bridge, 13.7 (0.2) mm, and the largest mean (SE) degree of perpendicularity, 95.8° (1.4°). The differences between all tunnels’ bony bridges and isometric angles were statistically significant (P < .00001). The difference between the more distal limb and the more proximal limb of the graft was smallest in the more posteriorly placed proximal tunnel (Table 2, Figure 8). In fact, there was no statistical difference between the proximal and distal limbs of the graft when the proximal tunnel was placed 10 mm posterior to the supinator crest: Mean (SE) was 9.4 (0.5) mm at 0 mm (P < .00001) and 1.1 (0.6) mm at 10 mm (P = .24).
Discussion
PLRI of the elbow is best initially managed nonoperatively. However, when nonoperative management fails, the LUCL is often surgically reconstructed. Reconstruction methods vary by fixation method, graft choice, and bone tunnels.1,7-9,11-13 In 1991, O’Driscoll and colleagues1 described a “yoke” technique for LUCL reconstruction. Since then, the docking technique7 and other techniques have been developed. All these techniques emphasize maximizing anatomical precision and isometry with careful placement of tunnels or fixation devices. The humeral fixation site, at the anterior inferior aspect of the lateral epicondyle at the point of isometry, can be accessed relatively reproducibly. By contrast, the ulnar points of fixation are more variable, because of increased bone stock and overlying soft-tissue and bony anatomy.
Among the challenges in determining the points of ulnar fixation is the bony anatomy that is often used for landmarks. In the literature, the supinator crest or the supintor tubercle is the landmark for placing the distal tunnel.1,7-9,11-13 This is a problem for 2 reasons. First, the supintor crest, a longitudinal structure on the lateral aspect of the ulna, originates from the radial head junction and extends tens of millimeters distally; further specification is needed to guide these ulnar tunnels. The second reason is that use of the supinator tubercle, a prominence on the supinator crest, adds specificity to the location of the ulnar tunnels. During surgery, however, the supinator tubercle may not be a reliable, independently prominent structure; instead, it may be indistinguishable from the supinator crest, on which it rests. One study determined that only about 50% of computer models of patient ulnas had a distinct prominence that could be classified as the supinator tubercle.17 The percentage presumably is lower during surgery, with limited exposure and overlying soft tissues.
In a study of patients with a prominent tubercle, mean (SE) distance from radial head junction to tubercle was 15 (2) mm.17 This finding led us to use the radial head junction as the primary bony landmark in determining the location of the proximal tunnel and placing the distal tunnel 15 mm distally—achieving the same fixation described in the literature but using more distinct landmarks. Our study thus provided a reliable, verified approach to locating the ulnar tunnels in the proximal-distal axis.
We also explored the anterior-posterior orientation of the proximal ulnar tunnel. The 2 primary considerations surrounding the varied proximal tunnel placements were the bony bridge formed between the proximal and distal tunnels and the perpendicularity of the triangle formed by the fixation points. Maximizing the bony bridge is obviously ideal in securing and preventing fixation blowout. Achieving an isoceles reconstruction has been reported in the literature on the various fixation techniques for LUCL reconstruction.11 Although the biomechanical advantage of this fixation type is not fully clear, we assume the construct produces graft stands of equal length, tension, and stability. In addition, the larger footprint created by an isoceles reconstructed ligament increases the stability of the radial head.
Results of the present study showed that the more posterior the proximal ulnar tunnel, the longer the bony bridge and the more isoceles the reconstruction. The difference in bony bridge distance from the most anterior to the most posterior tunnel was about 2 mm, or 18%. For every 1 mm of posteriorization, the bony bridge was 0.2 mm longer. The line from the isometric point of humeral fixation bisecting the proximal and distal tunnels was also more perpendicular with the most posterior tunnel, by about 40°. The resulting proximal and distal limbs of the reconstruction were equal in length, as demonstrated by the smaller difference between the limbs. We assume this isoceles reconstruction more likely applies uniform restraint on the radial head. Thus, an effort should be made to posteriorize the proximal ulnar tunnel during reconstruction.
The study was limited by the number of patient-specific elbow models used. However, given the statistical consistency of measurements, sample size was sufficient. Another limitation, inherent to the model, was that only bony anatomy was incorporated. However, the overlying muscles, tendons, and ligaments can significantly alter tunnel placement, and this study provided other means and cues using more reliable landmarks to adequately place the tunnels. As this was a simulation study, we cannot confirm whether these results would make a difference clinically. The strengths of this study include development and verification of reliable landmarks that can be used to guide ulnar tunnel locations during LUCL reconstruction; these landmarks have been used for medial ulnar collateral ligament reconstruction.15 Other strengths include precise and accurate placement of tunnels and measurement of resulting bony bridges—accomplished independently and without compromising specimen quality.
Conclusion
We recommend drilling the proximal ulnar tunnel posterior to the supinator crest at the level of the radial head junction. A reasonable goal is 10 mm posterior to the crest, though the overlying soft tissue must be considered, and care should be taken to aim the drill anteriorly, toward the ulna’s intramedullary canal, to avoid posterior cortical breach. The distal ulnar tunnel should be drilled just posterior to the supinator crest, 15 mm distal to the radial head junction.
Posterolateral rotatory instability (PLRI) of the elbow is well recognized1 and is the most common type of chronic elbow instability. PLRI is often an end result of traumatic elbow dislocation.2 The “essential lesion” in patients with PLRI of the elbow is injury to the lateral ulnar collateral ligament (LUCL).1 However, more recent research has emphasized the importance of other ligaments in the lateral ligament complex (radial collateral and annular ligaments) in preventing PLRI.3-5 Nevertheless, when conservative treatment fails, the most commonly used surgical treatment involves LUCL reconstruction.1,6-11
Numerous techniques for LUCL reconstruction have been described.1,7-9,11-13 The chosen technique ideally restores normal anatomy. Therefore, the isometric point of origin at the lateral epicondyle and insertion at the supinator tubercle are important landmarks for creating tunnels that reproduce isometry, function, and normal anatomy. Most often, 2 tunnels are created in the ulna to secure the graft. It has been our experience that ulnar tunnel creation can affect the length of the bony bridge and the orientation of the graft.
We conducted a study to identify the precise proximal ulna tunnel location—anterior to posterior, with the distal tunnel at the supinator tubercle on the crest—that allows for the largest bony bridge and most geometrically favorable construct. We hypothesized that a most posteriorly placed proximal tunnel would increase bony bridge size and allow for a more isosceles graft configuration. An isosceles configuration with the humerus tunnel at the isometric location would allow for anterior and posterior bands of the same length with theoretically equal force distribution.
Methods
After obtaining institutional review board approval, we retrospectively reviewed the cases of 17 adults with elbow computed tomography (CT) scans for inclusion in this study. The scans were previously performed for diagnostic workup of several pathologies, including valgus instability, olecranon stress fracture, and valgus extension overload. The scan protocol involved 0.5-mm axial cuts with inclusion of the distal humerus through the proximal radius and ulna in the DICOM (Digital Imaging and Communications in Medicine) format. Exclusion criteria included poor CT quality, inadequate visualization of the entire supinator crest, and age under 18 years. Fifteen patients with adequate CT scans met the inclusion criteria. MIMICS (Materialise’s Interactive Medical Image Control System) software was used to convert scans into patient-specific 3-dimensional (3-D) computer models. (Use of this software to produce anatomically accurate models has been verified in shoulder14 and elbow15 models.) These models were uploaded into Magics rapid prototyping software (Materialise) and manipulated for simulated tunnel drilling by precise bone subtraction methods. This software was used to define an ulnar Cartesian coordinate system with anatomical landmarks as reference points in order to standardize the position of each model (Figure 1).16 The y-axis was defined by the longitudinal axis of the ulna, and the x-axis was the transepicondylar axis, defined as the perpendicular line connecting the y-axis with the supinator crest. The z-axis was then established as the line perpendicular to the x- and y-axes—yielding a 3-D coordinate system that allowed us to manipulate the models in standardized fashion, maintaining the exact positions of the ulna while making measurements.
Surgical simulations were performed in the rapid prototyping software by creating a cylinder and placing it at the desired location of each tunnel. Cylinder diameter was 4 mm, matching the diameter of the drill we use to create each tunnel in our practice. The cylinder was inserted into the bone, perpendicular to the surface of the ulna at the point of insertion, so the cylinder’s deepest point entered the medullary canal of the ulna. Using a Boolean operation in the rapid prototyping software, we subtracted cylinder from bone to create a tunnel (Figure 2).15
In a previous study,17 we determined that the radial head junction is reproducibly about 15 mm proximal to the distinct supinator tubercle, which may be absent or not readily appreciated in up to 50% of cases. Therefore, proximal ulnar tunnels were placed 0, 5, and 10 mm posterior to the supinator crest at the radial head junction. Distal tunnels were placed 15 mm distal to the radial head junction on the supinator crest (Figure 2). The bony bridges created by these tunnels were measured, as was the distance between the distal tunnel and the supinator tubercle.
Ideal graft configuration was described as an isosceles triangle with ulna tunnels perpendicular to the humeral tunnel (Figure 3).11 Location of the humeral origin in the sagittal plane was determined by finding the isometric point of the lateral humerus using only bony landmarks. Similar techniques have been used to find the isometric point on the medial epicondyle for medial ulnar collateral ligament reconstruction.15,18 With a circle fit into the trochlear notch of the ulna, the isometric point can be determined by the center of the circle. This point was then superimposed on the humerus to identify the starting point (Figure 4). In our simulation, we measured the isosceles configuration by drawing a line between the proximal and distal tunnels, and then another line connecting the bisecting point of the first line with the isometric point on the humerus from which the graft would originate. The angle between the 2 lines was measured; if isosceles, the angle was 90° (Figure 5). Length of the more proximal limb of the graft and the more distal limb of the graft was determined by measuring the distance from the isometric point to the proximal and distal tunnels, respectively (Figure 6).
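The geometric test described above can be sketched in a few lines. The function name, the convention of measuring the angle on the distal side of the midpoint, and the coordinates are assumptions for illustration, not the measurement tools used in the study. The sketch relies on the identity |IP|² − |ID|² = 2(D − P)·(I − M), where P and D are the proximal and distal tunnels, M is their midpoint, and I is the isometric point: the angle is exactly 90° when the two limbs are equal in length.

```python
import math

def bridge_angle_and_limbs(proximal, distal, isometric):
    """Return (angle in degrees, proximal limb length, distal limb length).

    The angle is measured at the midpoint of the tunnel-to-tunnel line,
    between the direction toward the distal tunnel and the direction
    toward the isometric point (an assumed convention).
    """
    mid = tuple((p + d) / 2 for p, d in zip(proximal, distal))
    u = tuple(d - m for d, m in zip(distal, mid))      # midpoint -> distal tunnel
    v = tuple(i - m for i, m in zip(isometric, mid))   # midpoint -> isometric point
    cos_a = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
    return angle, math.dist(isometric, proximal), math.dist(isometric, distal)

# Hypothetical coordinates in millimetres
symmetric = bridge_angle_and_limbs((-6, 0, 0), (6, 0, 0), (0, 20, 0))
skewed = bridge_angle_and_limbs((-6, 0, 0), (6, 0, 0), (-5, 20, 0))
```

In the symmetric case the two limbs are equal and the angle is 90°. In the skewed case the distal limb is longer and the angle is obtuse, which is the direction of the deviation reported for the more anterior proximal tunnels.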
One-way analysis of variance was used to compare all the tunnels’ bony bridge sizes, graft lengths, and angles to the isometric point. For all comparisons, statistical significance was set at P < .05. As no other studies have compared bony bridges by varying tunnel creation parameters, and as the present study is observational and not comparative, no power analysis was performed.
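For readers unfamiliar with the test, the F statistic of a one-way analysis of variance can be computed as sketched below. The function name and the sample values are hypothetical and are not the study data; the P value would then be read from the F distribution with the returned degrees of freedom, typically with a statistical package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical bony bridge lengths (mm) for three proximal tunnel positions
f_stat, df_b, df_w = one_way_anova_f([[10, 11, 12], [12, 13, 14], [14, 15, 16]])
```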
Results
Bony bridges were significantly longer, and angles more perpendicular, with increasing distance from the proximal tunnel to the supinator crest (Table 1, Figure 5, Figure 7). The proximal tunnel 0 mm posterior to the supinator crest yielded a mean (SE) bony bridge length of 11.0 (0.2) mm. This proximal tunnel also yielded the least perpendicular mean (SE) angle to the isometric point, 131.2° (1.9°). The tunnel most posterior to the supinator crest yielded the longest mean (SE) bony bridge, 13.7 (0.2) mm, and the most nearly perpendicular mean (SE) angle, 95.8° (1.4°). The differences between all tunnels’ bony bridges and isometric angles were statistically significant (P < .00001). The difference between the more distal limb and the more proximal limb of the graft was smallest with the most posteriorly placed proximal tunnel (Table 2, Figure 8). In fact, there was no statistically significant difference between the proximal and distal limbs of the graft when the proximal tunnel was placed 10 mm posterior to the supinator crest: Mean (SE) limb-length difference was 9.4 (0.5) mm at 0 mm (P < .00001) and 1.1 (0.6) mm at 10 mm (P = .24).
Discussion
PLRI of the elbow is best initially managed nonoperatively. However, when nonoperative management fails, the LUCL is often surgically reconstructed. Reconstruction methods vary by fixation method, graft choice, and bone tunnels.1,7-9,11-13 In 1991, O’Driscoll and colleagues1 described a “yoke” technique for LUCL reconstruction. Since then, the docking technique7 and other techniques have been developed. All these techniques emphasize maximizing anatomical precision and isometry with careful placement of tunnels or fixation devices. The humeral fixation site, at the anterior inferior aspect of the lateral epicondyle at the point of isometry, can be accessed relatively reproducibly. By contrast, the ulnar points of fixation are more variable, because of increased bone stock and overlying soft-tissue and bony anatomy.
Among the challenges in determining the points of ulnar fixation is the bony anatomy that is often used for landmarks. In the literature, the supinator crest or the supinator tubercle is the landmark for placing the distal tunnel.1,7-9,11-13 This is a problem for 2 reasons. First, the supinator crest, a longitudinal structure on the lateral aspect of the ulna, originates from the radial head junction and extends tens of millimeters distally; further specification is needed to guide these ulnar tunnels. Second, although use of the supinator tubercle, a prominence on the supinator crest, adds specificity to the location of the ulnar tunnels, the tubercle may not be a reliable, independently prominent structure during surgery; instead, it may be indistinguishable from the supinator crest, on which it rests. One study determined that only about 50% of computer models of patient ulnas had a distinct prominence that could be classified as the supinator tubercle.17 The percentage presumably is lower during surgery, with limited exposure and overlying soft tissues.
In a study of patients with a prominent tubercle, mean (SE) distance from radial head junction to tubercle was 15 (2) mm.17 This finding led us to use the radial head junction as the primary bony landmark in determining the location of the proximal tunnel and placing the distal tunnel 15 mm distally—achieving the same fixation described in the literature but using more distinct landmarks. Our study thus provided a reliable, verified approach to locating the ulnar tunnels in the proximal-distal axis.
We also explored the anterior-posterior orientation of the proximal ulnar tunnel. The 2 primary considerations surrounding the varied proximal tunnel placements were the bony bridge formed between the proximal and distal tunnels and the perpendicularity of the triangle formed by the fixation points. Maximizing the bony bridge is obviously ideal in securing fixation and preventing blowout. Achieving an isosceles reconstruction has been reported in the literature on the various fixation techniques for LUCL reconstruction.11 Although the biomechanical advantage of this fixation type is not fully clear, we assume the construct produces graft strands of equal length, tension, and stability. In addition, the larger footprint created by an isosceles reconstructed ligament increases the stability of the radial head.
Results of the present study showed that the more posterior the proximal ulnar tunnel, the longer the bony bridge and the more isosceles the reconstruction. The difference in bony bridge distance from the most anterior to the most posterior tunnel was about 2 mm, or 18%. For every 1 mm of posteriorization, the bony bridge was 0.2 mm longer. The line from the isometric point of humeral fixation bisecting the proximal and distal tunnels was also more perpendicular with the most posterior tunnel, by about 40°. The resulting proximal and distal limbs of the reconstruction were nearly equal in length, as demonstrated by the smaller difference between the limbs. We assume this isosceles reconstruction more likely applies uniform restraint on the radial head. Thus, an effort should be made to posteriorize the proximal ulnar tunnel during reconstruction.
The study was limited by the number of patient-specific elbow models used. However, given the statistical consistency of measurements, sample size was sufficient. Another limitation, inherent to the model, was that only bony anatomy was incorporated. However, the overlying muscles, tendons, and ligaments can significantly alter tunnel placement, and this study provided other means and cues using more reliable landmarks to adequately place the tunnels. As this was a simulation study, we cannot confirm whether these results would make a difference clinically. The strengths of this study include development and verification of reliable landmarks that can be used to guide ulnar tunnel locations during LUCL reconstruction; these landmarks have been used for medial ulnar collateral ligament reconstruction.15 Other strengths include precise and accurate placement of tunnels and measurement of resulting bony bridges—accomplished independently and without compromising specimen quality.
Conclusion
We recommend drilling the proximal ulnar tunnel posterior to the supinator crest at the level of the radial head junction. A reasonable goal is 10 mm posterior to the crest, though the overlying soft tissue must be considered, and care should be taken to aim the drill anteriorly, toward the ulna’s intramedullary canal, to avoid posterior cortical breach. The distal ulnar tunnel should be drilled just posterior to the supinator crest, 15 mm distal to the radial head junction.
1. O’Driscoll SW, Bell DF, Morrey BF. Posterolateral rotatory instability of the elbow. J Bone Joint Surg Am. 1991;73(3):440-446.
2. O’Driscoll SW. Classification and evaluation of recurrent instability of the elbow. Clin Orthop Relat Res. 2000;370:34-43.
3. Takigawa N, Ryu J, Kish VL, Kinoshita M, Abe M. Functional anatomy of the lateral collateral ligament complex of the elbow: morphology and strain. J Hand Surg Br. 2005;30(2):143-147.
4. McAdams TR, Masters GW, Srivastava S. The effect of arthroscopic sectioning of the lateral ligament complex of the elbow on posterolateral rotatory stability. J Shoulder Elbow Surg. 2005;14(3):298-301.
5. Dunning CE, Zarzour ZD, Patterson SD, Johnson JA, King GJ. Ligamentous stabilizers against posterolateral rotatory instability of the elbow. J Bone Joint Surg Am. 2001;83(12):1823-1828.
6. Eygendaal D. Ligamentous reconstruction around the elbow using triceps tendon. Acta Orthop Scand. 2004;75(5):516-523.
7. Jones KJ, Dodson CC, Osbahr DC, et al. The docking technique for lateral ulnar collateral ligament reconstruction: surgical technique and clinical outcomes. J Shoulder Elbow Surg. 2012;21(3):389-395.
8. Lee BP, Teo LH. Surgical reconstruction for posterolateral rotatory instability of the elbow. J Shoulder Elbow Surg. 2003;12(5):476-479.
9. Lin KY, Shen PH, Lee CH, Pan RY, Lin LC, Shen HC. Functional outcomes of surgical reconstruction for posterolateral rotatory instability of the elbow. Injury. 2012;43(10):1657-1661.
10. Olsen BS, Søjbjerg JO. The treatment of recurrent posterolateral instability of the elbow. J Bone Joint Surg Br. 2003;85(3):342-346.
11. Sanchez-Sotelo J, Morrey BF, O’Driscoll SW. Ligamentous repair and reconstruction for posterolateral rotatory instability of the elbow. J Bone Joint Surg Br. 2005;87(1):54-61.
12. Savoie FH 3rd, Field LD, Gurley DJ. Arthroscopic and open radial ulnohumeral ligament reconstruction for posterolateral rotatory instability of the elbow. Hand Clin. 2009;25(3):323-329.
13. Savoie FH 3rd, O’Brien MJ, Field LD, Gurley DJ. Arthroscopic and open radial ulnohumeral ligament reconstruction for posterolateral rotatory instability of the elbow. Clin Sports Med. 2010;29(4):611-618.
14. Bryce CD, Pennypacker JL, Kulkarni N, et al. Validation of three-dimensional models of in situ scapulae. J Shoulder Elbow Surg. 2008;17(5):825-832.
15. Byram IR, Khanna K, Gardner TR, Ahmad CS. Characterizing bone tunnel placement in medial ulnar collateral ligament reconstruction using patient-specific 3-dimensional computed tomography modeling. Am J Sports Med. 2013;41(4):894-902.
16. Shiba R, Sorbie C, Siu DW, Bryant JT, Cooke TD, Wevers HW. Geometry of the humeroulnar joint. J Orthop Res. 1988;6(6):897-906.
17. Anakwenze OA, Khanna K, Levine WN, Ahmad CS. Characterization of the supinator tubercle for lateral ulnar collateral ligament reconstruction. Orthop J Sports Med. 2014;2(4):2325967114530969. doi:10.1177/2325967114530969.
18. Sasashige Y, Ochi M, Ikuta Y. Optimal attachment site for reconstruction of the ulnar collateral ligament. A cadaver study. Arch Orthop Trauma Surg. 1994;113(5):265-270.
1. O’Driscoll SW, Bell DF, Morrey BF. Posterolateral rotatory instability of the elbow. J Bone Joint Surg Am. 1991;73(3):440-446.
2. O’Driscoll SW. Classification and evaluation of recurrent instability of the elbow. Clin Orthop Relat Res. 2000;370:34-43.
3. Takigawa N, Ryu J, Kish VL, Kinoshita M, Abe M. Functional anatomy of the lateral collateral ligament complex of the elbow: morphology and strain. J Hand Surg Br. 2005;30(2):143-147.
4. McAdams TR, Masters GW, Srivastava S. The effect of arthroscopic sectioning of the lateral ligament complex of the elbow on posterolateral rotatory stability. J Shoulder Elbow Surg. 2005;14(3):298-301.
5. Dunning CE, Zarzour ZD, Patterson SD, Johnson JA, King GJ. Ligamentous stabilizers against posterolateral rotatory instability of the elbow. J Bone Joint Surg Am. 2001;83(12):1823-1828.
6. Eygendaal D. Ligamentous reconstruction around the elbow using triceps tendon. Acta Orthop Scand. 2004;75(5):516-523.
7. Jones KJ, Dodson CC, Osbahr DC, et al. The docking technique for lateral ulnar collateral ligament reconstruction: surgical technique and clinical outcomes. J Shoulder Elbow Surg. 2012;21(3):389-395.
8. Lee BP, Teo LH. Surgical reconstruction for posterolateral rotatory instability of the elbow. J Shoulder Elbow Surg. 2003;12(5):476-479.
9. Lin KY, Shen PH, Lee CH, Pan RY, Lin LC, Shen HC. Functional outcomes of surgical reconstruction for posterolateral rotatory instability of the elbow. Injury. 2012;43(10):1657-1661.
10. Olsen BS, Søjbjerg JO. The treatment of recurrent posterolateral instability of the elbow. J Bone Joint Surg Br. 2003;85(3):342-346.
11. Sanchez-Sotelo J, Morrey BF, O’Driscoll SW. Ligamentous repair and reconstruction for posterolateral rotatory instability of the elbow. J Bone Joint Surg Br. 2005;87(1):54-61.
12. Savoie FH 3rd, Field LD, Gurley DJ. Arthroscopic and open radial ulnohumeral ligament reconstruction for posterolateral rotatory instability of the elbow. Hand Clin. 2009;25(3):323-329.
13. Savoie FH 3rd, O’Brien MJ, Field LD, Gurley DJ. Arthroscopic and open radial ulnohumeral ligament reconstruction for posterolateral rotatory instability of the elbow. Clin Sports Med. 2010;29(4):611-618.
14. Bryce CD, Pennypacker JL, Kulkarni N, et al. Validation of three-dimensional models of in situ scapulae. J Shoulder Elbow Surg. 2008;17(5):825-832.
15. Byram IR, Khanna K, Gardner TR, Ahmad CS. Characterizing bone tunnel placement in medial ulnar collateral ligament reconstruction using patient-specific 3-dimensional computed tomography modeling. Am J Sports Med. 2013;41(4):894-902.
16. Shiba R, Sorbie C, Siu DW, Bryant JT, Cooke TD, Wevers HW. Geometry of the humeroulnar joint. J Orthop Res. 1988;6(6):897-906.
17. Anakwenze OA, Khanna K, Levine WN, Ahmad CS. Characterization of the supinator tubercle for lateral ulnar collateral ligament reconstruction. Orthop J Sports Med. 2014;2(4):2325967114530969. doi:10.1177/2325967114530969.
18. Sasashige Y, Ochi M, Ikuta Y. Optimal attachment site for reconstruction of the ulnar collateral ligament. A cadaver study. Arch Orthop Trauma Surg. 1994;113(5):265-270.
Real‐Time Patient Experience Surveys
In 2010, the Centers for Medicare and Medicaid Services implemented value‐based purchasing, a payment model that incentivizes hospitals for reaching certain quality and patient experience thresholds and penalizes those that do not, in part on the basis of patient satisfaction scores.[1] Although low patient satisfaction scores will adversely affect institutions financially, they also reflect patients' perceptions of their care. Some studies suggest that hospitals with higher patient satisfaction scores score higher overall on clinical care processes such as core measures compliance, readmission rates, lower mortality rates, and other quality‐of‐care metrics.[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey assesses patients' experience following their hospital stay.[1] The percentage of top box scores (ie, a response of always on a 4‐point scale, or a score of 9 or 10 on a 10‐point scale) is used to compare hospitals and determine the reimbursement or penalty a hospital will receive. Although these scores are available to the public on the Hospital Compare website,[12] physicians may not know how their hospital is ranked or how they are individually perceived by their patients. Additionally, these surveys are typically conducted 48 hours to 6 weeks after patients are discharged, and the results are distributed back to the hospitals well after the time that care was provided, thereby offering providers no chance of improving patient satisfaction during a given hospital stay.
Institutions across the country are trying to improve their HCAHPS scores, but there is limited research identifying specific measures providers can implement. Some studies have suggested that utilizing etiquette‐based communication and sitting at the bedside[13, 14] may help improve patient experience with their providers, and more recently, it has been suggested that providing real‐time deidentified patient experience survey results with education and a rewards/incentive system to residents may help as well.[15]
Surveys conducted during a patient's hospitalization can offer real‐time actionable feedback to providers. We performed a quality‐improvement project that was designed to determine if real‐time feedback to hospitalist physicians, followed by coaching, and revisits to the patients' bedside could improve the results recorded on provider‐specific patient surveys and/or patients' HCAHPS scores or percentile rankings.
METHODS
Design
This was a prospective, randomized quality‐improvement initiative that was approved by the Colorado Multiple Institutional Review Board and conducted at Denver Health, a 525‐bed university‐affiliated public safety net hospital. The initiative was conducted on both teaching and nonteaching general internal medicine services, which typically have a daily census of between 10 and 15 patients. No protocol changes occurred during the study.
Participants
Participants included all English‐ or Spanish‐speaking patients who were hospitalized on a general internal medicine service, had been admitted within the 2 days prior to enrollment, and had a hospitalist as their attending physician. Patients were excluded if they were enrolled in the study during a previous hospitalization, refused to participate, lacked capacity to participate, had hearing or speech impediments precluding regular conversation, were prisoners, if their clinical condition precluded participation, or their attending was an investigator in the project.
Intervention
Participants were prescreened by investigators by reviewing team sign‐outs to determine if patients had any exclusion criteria. Investigators attempted to survey each patient who met inclusion criteria on a daily basis between 9:00 am and 11:00 am. An investigator administered the survey to each patient verbally using scripted language. Patients were asked to rate how well their doctors were listening to them, explaining what they wanted to know, and whether the doctors were being friendly and helpful, all questions taken from a survey that was available on the US Department of Health and Human Services website (hereafter referred to as the daily survey).[16] We converted the original 5‐point Likert scale used in this survey to a 4‐point scale by removing the option of ok, leaving participants the options of poor, fair, good, or great. Patients were also asked to provide any personalized feedback they had, and these comments were recorded in writing by the investigator.
After being surveyed on day 1, patients were randomized to an intervention or control group using an automated randomization module in Research Electronic Data Capture (REDCap).[17] Patients in both groups who did not provide answers to all 3 questions that qualified as being top box (ie, great) were resurveyed on a daily basis until their responses were all top box or they were discharged, met exclusion criteria, or had been surveyed for a total of 4 consecutive days. In the pilot phase of this study, we found that if patients reported all top box scores on the initial survey their responses typically did not change over time, and the patients became frustrated if asked the same questions again when the patient felt there was not room for improvement. Accordingly, we elected to stop surveying patients when all top box responses were reported.
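The resurvey rule described above can be written as a small decision function. This is an illustrative sketch only; the function name, argument names, and data representation are assumptions and were not part of the study's REDCap workflow.

```python
TOP_BOX = "great"        # top box response on the 4-point daily survey
MAX_SURVEY_DAYS = 4      # patients were surveyed for at most 4 consecutive days

def should_resurvey(responses, days_surveyed, discharged=False, excluded=False):
    """Decide whether a patient is surveyed again the next day.

    responses: the latest answers to the 3 daily survey questions
               (None marks an unanswered question).
    """
    all_top_box = len(responses) == 3 and all(r == TOP_BOX for r in responses)
    if all_top_box or discharged or excluded:
        return False
    return days_surveyed < MAX_SURVEY_DAYS

# Hypothetical cases
case_mixed = should_resurvey(["great", "good", "great"], days_surveyed=1)
case_all_top = should_resurvey(["great", "great", "great"], days_surveyed=1)
case_day_four = should_resurvey(["fair", "good", "great"], days_surveyed=4)
case_discharged = should_resurvey(["fair", "good", None], days_surveyed=2, discharged=True)
```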
The attending hospitalist caring for each patient in the intervention group was given feedback about their patients' survey results (both their scores and any specific comments) on a daily basis. Feedback was provided in person by 1 of the investigators. The hospitalist also received an automatically generated electronic mail message with the survey results at 11:00 am on each study day. After informing the hospitalists of the patients' scores, the investigator provided a brief education session that included discussing Denver Health's most recent HCAHPS scores, value‐based purchasing, and the financial consequences of poor patient satisfaction scores. The investigator then coached the hospitalist on etiquette‐based communication,[18, 19] suggested that they sit down when communicating with their patients,[19, 20] and then asked the hospitalist to revisit each patient to discuss how the team could improve in any of the 3 areas where the patient did not give a top box score. These educational sessions were conducted in person and lasted a maximum of 5 minutes. An investigator followed up with each hospitalist the following day to determine whether the revisit occurred. Hospitalists caring for patients who were randomized to the control group were not given real‐time feedback or coaching and were not asked to revisit patients.
A random sample of patients surveyed for this initiative also received HCAHPS surveys 48 hours to 6 weeks following their hospital discharge, according to the standard methodology used to acquire HCAHPS data,[21] by an outside vendor contracted by Denver Health. Our vendor conducted these surveys via telephone in English or Spanish.
Outcomes
The primary outcome was the proportion of patients in each group who reported top box scores on the daily surveys. Secondary outcomes included the percent change for the scores recorded for 3 provider‐specific questions from the daily survey, the median top box HCAHPS scores for the 3 provider related questions and overall hospital rating, and the HCAHPS percentiles of top box scores for these questions.
Sample Size
The sample size for this intervention assumed that the proportion of patients whose treating physicians did not receive real‐time feedback who rated their providers as top box would be 75%, and that the effect of providing real‐time feedback would increase this proportion to 85% on the daily surveys. To have 80% power with a type 1 error of 0.05, we estimated a need to enroll 430 patients, 215 in each group.
Statistics
Data were collected and managed using a secure, Web‐based electronic data capture tool hosted at Denver Health (REDCap), which is designed to support data collection for research studies providing: (1) an intuitive interface for validated data entry, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for importing data from external sources.[17]
A χ² test was used to compare the proportion of patients in the 2 groups who reported great scores for each question on the study survey on the first and last day. With the intent of providing a framework for understanding the effect real‐time feedback could have on patient experience, a secondary analysis of HCAHPS results was conducted using several different methods.
First, the proportion of patients in the 2 groups who reported scores of 9 or 10 for the overall hospital rating question or reported always for each doctor communication question on the HCAHPS survey was compared using a χ² test. Second, to allow for detection of differences in a sample with a smaller N, the median overall hospital rating scores from the HCAHPS survey reported by patients in the 2 groups who completed a survey following discharge were compared using a Wilcoxon rank sum test. Lastly, to place changes in proportion into a larger context (ie, how these changes would relate to value‐based purchasing), HCAHPS scores were converted to percentiles of national performance using the 2014 percentile rankings obtained from the external vendor that conducts the HCAHPS surveys for our hospital and compared between the intervention and control groups using a Wilcoxon rank sum test.
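As an illustration of the comparison of top box proportions between 2 groups, the chi-square statistic for a 2 × 2 table can be computed as sketched below. The function name and the counts are hypothetical and are not study data. The sketch applies no continuity correction, which is an assumption because the text does not say whether one was used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    a, b: top box / not top box counts in group 1
    c, d: top box / not top box counts in group 2
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: 30 of 50 top box in one group, 40 of 50 in the other
chi2, p_value = chi_square_2x2(30, 20, 40, 10)
```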
All comments collected from patients during their daily surveys were reviewed, and key words were abstracted from each comment. These key words were sorted and reviewed to categorize recurring key words into themes. Exemplars were then selected for each theme derived from patient comments.
RESULTS
From April 14, 2014 to September 19, 2014, we enrolled 227 patients in the control group and 228 in the intervention group (Figure 1). Patient demographics are summarized in Table 1. Of the 132 patients in the intervention group who reported anything less than top box scores for any of the 3 questions (thus prompting a revisit by their provider), 106 (80%) were revisited by their provider at least once during their hospitalization.
Characteristic | All Patients: Control, N = 227 | All Patients: Intervention, N = 228 | HCAHPS Patients: Control, N = 35 | HCAHPS Patients: Intervention, N = 30 |
---|---|---|---|---|
Age, mean ± SD | 55 ± 14 | 55 ± 15 | 55 ± 15 | 57 ± 16 |
Gender | ||||
Male | 126 (60) | 121 (55) | 20 (57) | 12 (40) |
Female | 85 (40) | 98 (45) | 15(43) | 18 (60) |
Race/ethnicity | ||||
Hispanic | 84 (40) | 90 (41) | 17 (49) | 12 (40) |
Black | 38 (18) | 28 (13) | 6 (17) | 7 (23) |
White | 87 (41) | 97 (44) | 12 (34) | 10 (33) |
Other | 2 (1) | 4 (2) | 0 (0) | 1 (3) |
Payer | ||||
Medicare | 65 (29) | 82 (36) | 15 (43) | 12 (40) |
Medicaid | 122 (54) | 108 (47) | 17 (49) | 14 (47) |
Commercial | 12 (5) | 15 (7) | 1 (3) | 1 (3) |
Medically indigent | 4 (2) | 7 (3) | 0 (0) | 3 (10) |
Self‐pay | 5 (2) | 4 (2) | 1 (3) | 0 (0) |
Other/unknown | 19 (8) | 12 (5) | 0 (0) | 0 (0) |
Team | ||||
Teaching | 187 (82) | 196 (86) | 27 (77) | 24 (80) |
Nonteaching | 40 (18) | 32 (14) | 8 (23) | 6 (20) |
Top 5 primary discharge diagnoses* | ||||
Septicemia | 26 (11) | 34 (15) | 3 (9) | 5 (17) |
Heart failure | 14 (6) | 13 (6) | 2 (6) | |
Acute pancreatitis | 12 (5) | 9 (4) | 3 (9) | 2 (7) |
Diabetes mellitus | 11 (5) | 8 (4) | 2 (6) | |
Alcohol withdrawal | 9 (4) | |||
Cellulitis | 7 (3) | 2 (7) | ||
Pulmonary embolism | | | | 2 (7) |
Chest pain | | | | 2 (7) |
Atrial fibrillation | | | 2 (6) | |
Length of stay, median (IQR) | 3 (2, 5) | 3 (2, 5) | 3 (2, 5) | 3 (2, 4) |
Charlson Comorbidity Index, median (IQR) | 1 (0, 3) | 2 (0, 3) | 1 (0, 3) | 1.5 (1, 3) |

Daily Surveys
The proportion of patients in both study groups reporting top box scores tended to increase from the first day to the last day of the survey (Figure 2); however, we found no statistically significant differences between the proportion of patients who reported top box scores on first day or last day in the intervention group compared to the control group. The comments made by the patients are summarized in Supporting Table 1 in the online version of this article.

HCAHPS Scores
The proportion of top box scores from the HCAHPS surveys was higher, though not significantly so, for all 3 provider‐specific questions and for the overall hospital rating for patients whose hospitalists received real‐time feedback (Table 2). The median [interquartile range] score for the overall hospital rating was higher for patients in the intervention group compared with those in the control group (10 [9, 10] vs 9 [8, 10], P = 0.04). After converting the HCAHPS scores to percentiles, we found considerably higher rankings for all 3 provider‐related questions and for the overall hospital rating in the intervention group compared to the control group (P = 0.02 for overall differences in percentiles [Table 2]).
HCAHPS Questions | Proportion Top Box*: Control, N = 35 | Proportion Top Box*: Intervention, N = 30 | Percentile Rank: Control, N = 35 | Percentile Rank: Intervention, N = 30 |
---|---|---|---|---|
Overall hospital rating | 61% | 80% | 6 | 87 |
Courtesy/respect | 86% | 93% | 23 | 88 |
Clear communication | 77% | 80% | 39 | 60 |
Listening | 83% | 90% | 57 | 95 |
No adverse events occurred during the course of the study in either group.
DISCUSSION
The important findings of this study were that (1) daily patient satisfaction scores improved from first day to last day regardless of study group; (2) patients whose providers received real‐time feedback had a trend toward higher HCAHPS top box proportions for the 3 provider‐related questions as well as the overall rating of the hospital, although these differences were not statistically significant; and (3) the percentile rankings for these 3 questions as well as the overall rating of the hospital were significantly higher in the intervention group, as was the median score for the overall hospital rating.
Our original sample size calculation was based upon our own preliminary data, indicating that our baseline top box score for the daily survey was around 75%. The daily survey top box score on the first day was, however, much lower (Figure 2). Accordingly, although we did not find a significant difference in these daily scores, we were underpowered to find such a difference. Additionally, because only a small percentage of patients are selected for the HCAHPS survey, our ability to detect a difference in this secondary outcome was also limited. We felt that it was important to analyze the percentile comparisons in addition to the proportion of top box scores on the HCAHPS, because the metrics for value‐based purchasing are based upon, in part, how a hospital system compares to other systems. Finally, to improve our power to detect a difference given a small sample size, we converted the scoring system for overall hospital rating to a continuous variable, which again was noted to be significant.
To our knowledge, this is the first randomized investigation designed to assess the effect of real‐time, patient‐specific feedback to physicians. Real‐time feedback is increasingly being incorporated into medical practice, but there is only limited information available describing how this type of feedback affects outcomes.[22, 23, 24] Banka et al.[15] found that HCAHPS scores improved as a result of real‐time feedback given to residents, but the study was not randomized, utilized a pre‐post design that resulted in there being differences between the patients studied before and after the intervention, and did not provide patient‐specific data to the residents. Tabib et al.[25] found that operating costs decreased 17% after instituting real‐time feedback to providers about these costs. Reeves et al.[26] conducted a cluster randomized trial of a patient feedback survey that was designed to improve nursing care, but the results were reviewed by the nurses several months after patients had been discharged.
The differences in median top box scores and percentile rank that we observed could have resulted from the real‐time feedback, the educational coaching, the fact that the providers revisited the majority of the patients, or a combination of all of the above. Gross et al.[27] found that longer visits lead to higher satisfaction, though others have not found this to necessarily be the case.[28, 29] Lin et al.[30] found that patient satisfaction was affected by the perceived duration of the visit as well as whether expectations on visit length were met and/or exceeded. Brown et al.[31] found that training providers in communication skills improved the providers' perception of their communication skills, although patient experience scores did not improve. We feel that the results seen are more likely a combination thereof as opposed to any 1 component of the intervention.
The most commonly reported complaints or concerns in patients' undirected comments related to communication issues. Comments on subsequent surveys suggested that patient satisfaction improved over time in the intervention group, indicating that perhaps physicians did try to improve in areas that were highlighted by the real‐time feedback, and that patients perceived the physician efforts to do so (eg, "They're doing better than the last time you asked." "They sat down and talked to me and listened better." "They came back and explained to me about my care." "They listened better." "They should do this survey at the clinic." See Supporting Table 1 in the online version of this article).
Our study has several limitations. First, we did not randomize providers, and many of our providers (approximately 65%) participated in both the control and intervention groups and thus received real‐time feedback at some point during the study, which could have affected their overall practice and limited our ability to find a difference between the 2 groups. In an attempt to control for this possibility, the study was conducted on an intermittent basis during the study time frame. Furthermore, the proportion of patients who reported top box scores showed no clear trend of change between the beginning and the end of the study, suggesting that overall clinician practices with respect to patient satisfaction did not change during this short time period.
Second, only a small number of our patients were randomly selected for the HCAHPS survey, which limited our ability to detect significant differences in HCAHPS proportions. Third, the HCAHPS percentiles at our institution at that time were low. Accordingly, the improvements that we observed in patient satisfaction scores might not be reproducible at institutions with higher satisfaction scores. Fourth, time and resources were needed to obtain patient feedback and relay it to providers during this study. There are, however, other ways to obtain feedback that are less resource intensive (eg, electronic feedback, the utilization of volunteers, or partnering this with manager rounding). Finally, the study was conducted at a single, university‐affiliated public teaching hospital and was a quality‐improvement initiative, and thus our results are not generalizable to other institutions.
In conclusion, real‐time feedback of patient experience to their providers, coupled with provider education, coaching, and revisits, seems to improve satisfaction of patients hospitalized on general internal medicine units who were cared for by hospitalists.
Acknowledgements
The authors thank Kate Fagan, MPH, for her excellent technical assistance.
Disclosure: Nothing to report.
- HCAHPS Fact Sheet. 2015. Available at: http://www.hcahpsonline.org/Files/HCAHPS_Fact_Sheet_June_2015.pdf. Accessed August 25, 2015.
- The relationship between commercial website ratings and traditional hospital performance measures in the USA. BMJ Qual Saf. 2013;22:194–202.
- Patients' perception of hospital care in the United States. N Engl J Med. 2008;359:1921–1931.
- The relationship between patients' perception of care and measures of hospital quality and safety. Health Serv Res. 2010;45:1024–1040.
- Relationship between quality of diabetes care and patient satisfaction. J Natl Med Assoc. 2003;95:64–70.
- Relationship between patient satisfaction with inpatient care and hospital readmission within 30 days. Am J Manag Care. 2011;17:41–48.
- A systematic review of evidence on the links between patient experience and clinical safety and effectiveness. BMJ Open. 2013;3(1).
- The association between satisfaction with services provided in primary care and outcomes in type 2 diabetes mellitus. Diabet Med. 2003;20:486–490.
- Associations between Web‐based patient ratings and objective measures of hospital quality. Arch Intern Med. 2012;172:435–436.
- Patient satisfaction and its relationship with clinical quality and inpatient mortality in acute myocardial infarction. Circ Cardiovasc Qual Outcomes. 2010;3:188–195.
- Patients' perceptions of care are associated with quality of hospital care: a survey of 4605 hospitals. Am J Med Qual. 2015;30(4):382–388.
- Centers for Medicare 28:908–913.
- Effect of sitting vs. standing on perception of provider time at bedside: a pilot study. Patient Educ Couns. 2012;86:166–171.
- Improving patient satisfaction through physician education, feedback, and incentives. J Hosp Med. 2015;10:497–502.
- US Department of Health and Human Services. Patient satisfaction survey. Available at: http://bphc.hrsa.gov/policiesregulations/performancemeasures/patientsurvey/surveyform.html. Accessed November 15, 2013.
- Research electronic data capture (REDCap)—a metadata‐driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42:377–381.
- The HCAHPS Handbook. Gulf Breeze, FL: Fire Starter; 2010.
- Etiquette‐based medicine. N Engl J Med. 2008;358:1988–1989.
- 5 years after the Kahn's etiquette‐based medicine: a brief checklist proposal for a functional second meeting with the patient. Front Psychol. 2013;4:723.
- Frequently Asked Questions. Hospital Value‐Based Purchasing Program. Available at: http://www.cms.gov/Medicare/Quality‐Initiatives‐Patient‐Assessment‐Instruments/hospital‐value‐based‐purchasing/Downloads/FY‐2013‐Program‐Frequently‐Asked‐Questions‐about‐Hospital‐VBP‐3‐9‐12.pdf. Accessed February 8, 2014.
- Real‐time patient survey data during routine clinical activities for rapid‐cycle quality improvement. JMIR Med Inform. 2015;3:e13.
- Mount Sinai launches real‐time patient‐feedback survey tool. Healthcare Informatics website. Available at: http://www.healthcare‐informatics.com/news‐item/mount‐sinai‐launches‐real‐time‐patient‐feedback‐survey‐tool. Accessed August 25, 2015.
- Hospitals are finally starting to put real‐time data to use. Harvard Business Review website. Available at: https://hbr.org/2014/11/hospitals‐are‐finally‐starting‐to‐put‐real‐time‐data‐to‐use. Published November 12, 2014. Accessed August 25, 2015.
- Reducing operating room costs through real‐time cost information feedback: a pilot study. J Endourol. 2015;29:963–968.
- Facilitated patient experience feedback can improve nursing care: a pilot study for a phase III cluster randomised controlled trial. BMC Health Serv Res. 2013;13:259.
- Patient satisfaction with time spent with their physician. J Fam Pract. 1998;47:133–137.
- The relationship between time spent communicating and communication outcomes on a hospital medicine service. J Gen Intern Med. 2012;27:185–189.
- Cognitive interview techniques reveal specific behaviors and issues that could affect patient satisfaction relative to hospitalists. J Hosp Med. 2009;4:E1–E6.
- Is patients' perception of time spent with the physician a determinant of ambulatory patient satisfaction? Arch Intern Med. 2001;161:1437–1442.
- Effect of clinician communication skills training on patient satisfaction. A randomized, controlled trial. Ann Intern Med. 1999;131:822–829.
In 2010, the Centers for Medicare and Medicaid Services implemented value‐based purchasing, a payment model that incentivizes hospitals for reaching certain quality and patient experience thresholds and penalizes those that do not, in part on the basis of patient satisfaction scores.[1] Although low patient satisfaction scores will adversely affect institutions financially, they also reflect patients' perceptions of their care. Some studies suggest that hospitals with higher patient satisfaction scores perform better overall on clinical care measures such as core measures compliance, readmission rates, mortality rates, and other quality‐of‐care metrics.[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey assesses patients' experience following their hospital stay.[1] The percent of top box scores (ie, a response of "always" on a 4‐point scale, or a score of 9 or 10 on a 10‐point scale) is utilized to compare hospitals and determine the reimbursement or penalty a hospital will receive. Although these scores are available to the public on the Hospital Compare website,[12] physicians may not know how their hospital is ranked or how they are individually perceived by their patients. Additionally, these surveys are typically conducted 48 hours to 6 weeks after patients are discharged, and the results are distributed back to the hospitals well after the time that care was provided, thereby offering providers no chance of improving patient satisfaction during a given hospital stay.
Institutions across the country are trying to improve their HCAHPS scores, but there is limited research identifying specific measures providers can implement. Some studies have suggested that utilizing etiquette‐based communication and sitting at the bedside[13, 14] may help improve patient experience with their providers, and more recently, it has been suggested that providing real‐time deidentified patient experience survey results with education and a rewards/incentive system to residents may help as well.[15]
Surveys conducted during a patient's hospitalization can offer real‐time actionable feedback to providers. We performed a quality‐improvement project that was designed to determine if real‐time feedback to hospitalist physicians, followed by coaching, and revisits to the patients' bedside could improve the results recorded on provider‐specific patient surveys and/or patients' HCAHPS scores or percentile rankings.
METHODS
Design
This was a prospective, randomized quality‐improvement initiative that was approved by the Colorado Multiple Institutional Review Board and conducted at Denver Health, a 525‐bed university‐affiliated public safety net hospital. The initiative was conducted on both teaching and nonteaching general internal medicine services, which typically have a daily census of between 10 and 15 patients. No protocol changes occurred during the study.
Participants
Participants included all English‐ or Spanish‐speaking patients who were hospitalized on a general internal medicine service, had been admitted within the 2 days prior to enrollment, and had a hospitalist as their attending physician. Patients were excluded if they were enrolled in the study during a previous hospitalization, refused to participate, lacked capacity to participate, had hearing or speech impediments precluding regular conversation, were prisoners, had a clinical condition that precluded participation, or had an attending who was an investigator in the project.
Intervention
Participants were prescreened by investigators, who reviewed team sign‐outs to determine if patients met any exclusion criteria. Investigators attempted to survey each patient who met inclusion criteria on a daily basis between 9:00 am and 11:00 am. An investigator administered the survey to each patient verbally using scripted language. Patients were asked to rate how well their doctors were listening to them, explaining what they wanted to know, and whether the doctors were being friendly and helpful, all questions taken from a survey that was available on the US Department of Health and Human Services website (hereafter referred to as the daily survey).[16] We converted the original 5‐point Likert scale used in this survey to a 4‐point scale by removing the option of "ok," leaving participants the options of "poor," "fair," "good," or "great." Patients were also asked to provide any personalized feedback they had, and these comments were recorded in writing by the investigator.
After being surveyed on day 1, patients were randomized to an intervention or control group using an automated randomization module in Research Electronic Data Capture (REDCap).[17] Patients in both groups who did not provide answers to all 3 questions that qualified as being top box (ie, great) were resurveyed on a daily basis until their responses were all top box or they were discharged, met exclusion criteria, or had been surveyed for a total of 4 consecutive days. In the pilot phase of this study, we found that if patients reported all top box scores on the initial survey their responses typically did not change over time, and the patients became frustrated if asked the same questions again when the patient felt there was not room for improvement. Accordingly, we elected to stop surveying patients when all top box responses were reported.
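The resurvey rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's REDCap logic: the question keys, function names, and data layout are hypothetical, and only the 4‐point scale, the 3 questions, and the 4‐day limit come from the text.

```python
from typing import Dict, List

TOP_BOX = "great"  # top response on the 4-point scale: poor, fair, good, great
# Hypothetical keys for the 3 daily survey questions
QUESTIONS = ("listening", "explaining", "friendly_helpful")
MAX_SURVEY_DAYS = 4  # patients were surveyed for at most 4 consecutive days


def is_all_top_box(responses: Dict[str, str]) -> bool:
    """True only if all 3 questions were answered and rated 'great'."""
    return all(responses.get(q) == TOP_BOX for q in QUESTIONS)


def should_resurvey(history: List[Dict[str, str]],
                    discharged: bool, excluded: bool) -> bool:
    """Decide whether an already-surveyed patient is surveyed again the next day."""
    if not history or discharged or excluded:
        return False
    if len(history) >= MAX_SURVEY_DAYS:
        return False
    # Stop once the most recent survey was all top box
    return not is_all_top_box(history[-1])


# Hypothetical patient: one "good" answer on day 1 triggers a resurvey
day1 = {"listening": "great", "explaining": "good", "friendly_helpful": "great"}
print(should_resurvey([day1], discharged=False, excluded=False))
```

A missing answer is treated the same as a response below top box, which matches the rule that patients who did not answer all 3 questions with "great" were resurveyed.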
The attending hospitalist caring for each patient in the intervention group was given feedback about their patients' survey results (both their scores and any specific comments) on a daily basis. Feedback was provided in person by 1 of the investigators. The hospitalist also received an automatically generated electronic mail message with the survey results at 11:00 am on each study day. After informing the hospitalists of the patients' scores, the investigator provided a brief education session that included discussing Denver Health's most recent HCAHPS scores, value‐based purchasing, and the financial consequences of poor patient satisfaction scores. The investigator then coached the hospitalist on etiquette‐based communication,[18, 19] suggested that they sit down when communicating with their patients,[19, 20] and then asked the hospitalist to revisit each patient to discuss how the team could improve in any of the 3 areas where the patient did not give a top box score. These educational sessions were conducted in person and lasted a maximum of 5 minutes. An investigator followed up with each hospitalist the following day to determine whether the revisit occurred. Hospitalists caring for patients who were randomized to the control group were not given real‐time feedback or coaching and were not asked to revisit patients.
A random sample of patients surveyed for this initiative also received HCAHPS surveys 48 hours to 6 weeks following their hospital discharge, according to the standard methodology used to acquire HCAHPS data,[21] by an outside vendor contracted by Denver Health. Our vendor conducted these surveys via telephone in English or Spanish.
Outcomes
The primary outcome was the proportion of patients in each group who reported top box scores on the daily surveys. Secondary outcomes included the percent change for the scores recorded for the 3 provider‐specific questions from the daily survey, the median top box HCAHPS scores for the 3 provider‐related questions and overall hospital rating, and the HCAHPS percentiles of top box scores for these questions.
Sample Size
The sample size for this intervention assumed that the proportion of patients whose treating physicians did not receive real‐time feedback who rated their providers as top box would be 75%, and that the effect of providing real‐time feedback would increase this proportion to 85% on the daily surveys. To have 80% power with a type 1 error of 0.05, we estimated a need to enroll 430 patients, 215 in each group.
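For readers who want to explore how such an estimate depends on its inputs, the following is a minimal sketch of one common textbook formula for comparing 2 independent proportions (two‐sided test, unpooled normal approximation, no continuity correction). The function name and the choice of formula are assumptions; the source does not state which method or software produced the figure above. With the stated inputs this formula gives a somewhat larger number than the 215 per group reported, which may reflect a different method or software default.

```python
import math
from statistics import NormalDist


def n_per_group(p_control: float, p_intervention: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for comparing two independent proportions.

    Two-sided test, unpooled normal approximation, no continuity correction.
    This is an assumed formula, not necessarily the one used in the study.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type 1 error
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    effect = p_intervention - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Inputs from the text: 75% vs 85% top box, 80% power, type 1 error 0.05
print(n_per_group(0.75, 0.85))  # prints 248 under this particular formula
```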
Statistics
Data were collected and managed using a secure, Web‐based electronic data capture tool hosted at Denver Health (REDCap), which is designed to support data collection for research studies providing: (1) an intuitive interface for validated data entry, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for importing data from external sources.[17]
A χ² test was used to compare the proportion of patients in the 2 groups who reported "great" scores for each question on the study survey on the first and last day. With the intent of providing a framework for understanding the effect real‐time feedback could have on patient experience, a secondary analysis of HCAHPS results was conducted using several different methods.
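As an illustration of the comparison of proportions described above, the sketch below computes a Pearson chi‐square statistic for a 2 × 2 table using only the Python standard library. The function name and the example counts are hypothetical (they are not study data), and the sketch assumes no continuity correction and that neither column total is zero.

```python
import math
from typing import Tuple


def chi_square_2x2(top_a: int, n_a: int, top_b: int, n_b: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for two proportions.

    Returns (statistic, p-value) for 1 degree of freedom.
    """
    observed = [[top_a, n_a - top_a], [top_b, n_b - top_b]]
    row_totals = [n_a, n_b]
    col_totals = [top_a + top_b, (n_a - top_a) + (n_b - top_b)]
    total = n_a + n_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 150 of 200 vs 170 of 200 patients reporting top box
stat, p = chi_square_2x2(150, 200, 170, 200)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```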
First, the proportion of patients in the 2 groups who reported scores of 9 or 10 for the overall hospital rating question or reported "always" for each doctor communication question on the HCAHPS survey was compared using a χ² test. Second, to allow for detection of differences in a sample with a smaller N, the median overall hospital rating scores from the HCAHPS survey reported by patients in the 2 groups who completed a survey following discharge were compared using a Wilcoxon rank sum test. Lastly, to place changes in proportion into a larger context (ie, how these changes would relate to value‐based purchasing), HCAHPS scores were converted to percentiles of national performance using the 2014 percentile rankings obtained from the external vendor that conducts the HCAHPS surveys for our hospital and compared between the intervention and control groups using a Wilcoxon rank sum test.
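The Wilcoxon rank sum comparison mentioned above can be sketched as follows. This is a minimal standard‐library version that uses midranks for ties and the normal approximation with a tie correction; it is an assumption that this matches the variant used by the authors' statistical software, and the function names and example ratings are hypothetical rather than study data. The sketch assumes the combined sample is not entirely tied.

```python
import math
from typing import Dict, List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Wilcoxon rank sum test, normal approximation with tie correction.

    Returns (z, two-sided p-value); no continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    combined = list(a) + list(b)
    ranks = midranks(combined)
    w = sum(ranks[:n1])            # rank sum of the first group
    mean_w = n1 * (n + 1) / 2      # expected rank sum under the null hypothesis
    counts: Dict[float, int] = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical 0 to 10 overall hospital ratings, not study data
intervention = [10, 10, 9, 10, 8, 10]
control = [9, 8, 10, 7, 9, 8]
z, p = rank_sum_test(intervention, control)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With samples as small as the example, an exact test would usually be preferred over the normal approximation; the approximation is used here only to keep the sketch short.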
All comments collected from patients during their daily surveys were reviewed, and key words were abstracted from each comment. These key words were sorted and reviewed to categorize recurring key words into themes. Exemplars were then selected for each theme derived from patient comments.
RESULTS
From April 14, 2014 to September 19, 2014, we enrolled 227 patients in the control group and 228 in the intervention group (Figure 1). Patient demographics are summarized in Table 1. Of the 132 patients in the intervention group who reported anything less than top box scores for any of the 3 questions (thus prompting a revisit by their provider), 106 (80%) were revisited by their provider at least once during their hospitalization.
Characteristic | All Patients: Control, N = 227 | All Patients: Intervention, N = 228 | HCAHPS Patients: Control, N = 35 | HCAHPS Patients: Intervention, N = 30 |
---|---|---|---|---|
Age, mean ± SD | 55 ± 14 | 55 ± 15 | 55 ± 15 | 57 ± 16 |
Gender | ||||
Male | 126 (60) | 121 (55) | 20 (57) | 12 (40) |
Female | 85 (40) | 98 (45) | 15 (43) | 18 (60) |
Race/ethnicity | ||||
Hispanic | 84 (40) | 90 (41) | 17 (49) | 12 (40) |
Black | 38 (18) | 28 (13) | 6 (17) | 7 (23) |
White | 87 (41) | 97 (44) | 12 (34) | 10 (33) |
Other | 2 (1) | 4 (2) | 0 (0) | 1 (3) |
Payer | ||||
Medicare | 65 (29) | 82 (36) | 15 (43) | 12 (40) |
Medicaid | 122 (54) | 108 (47) | 17 (49) | 14 (47) |
Commercial | 12 (5) | 15 (7) | 1 (3) | 1 (3) |
Medically indigent | 4 (2) | 7 (3) | 0 (0) | 3 (10) |
Self‐pay | 5 (2) | 4 (2) | 1 (3) | 0 (0) |
Other/unknown | 19 (8) | 12 (5) | 0 (0) | 0 (0) |
Team | ||||
Teaching | 187 (82) | 196 (86) | 27 (77) | 24 (80) |
Nonteaching | 40 (18) | 32 (14) | 8 (23) | 6 (20) |
Top 5 primary discharge diagnoses* | ||||
Septicemia | 26 (11) | 34 (15) | 3 (9) | 5 (17) |
Heart failure | 14 (6) | 13 (6) | 2 (6) | |
Acute pancreatitis | 12 (5) | 9 (4) | 3 (9) | 2 (7) |
Diabetes mellitus | 11 (5) | 8 (4) | 2 (6) | |
Alcohol withdrawal | 9 (4) | | | |
Cellulitis | | 7 (3) | | 2 (7) |
Pulmonary embolism | | | | 2 (7) |
Chest pain | | | | 2 (7) |
Atrial fibrillation | | | 2 (6) | |
Length of stay, median (IQR) | 3 (2, 5) | 3 (2, 5) | 3 (2, 5) | 3 (2, 4) |
Charlson Comorbidity Index, median (IQR) | 1 (0, 3) | 2 (0, 3) | 1 (0, 3) | 1.5 (1, 3) |

Daily Surveys
The proportion of patients in both study groups reporting top box scores tended to increase from the first day to the last day of the survey (Figure 2); however, we found no statistically significant differences between the intervention and control groups in the proportion of patients who reported top box scores on either the first day or the last day. The comments made by the patients are summarized in Supporting Table 1 in the online version of this article.

HCAHPS Scores
The proportion of top box scores from the HCAHPS surveys was higher, though not significantly so, for all 3 provider‐specific questions and for the overall hospital rating for patients whose hospitalists received real‐time feedback (Table 2). The median [interquartile range] score for the overall hospital rating was higher for patients in the intervention group compared with those in the control group (10 [9, 10] vs 9 [8, 10], P = 0.04). After converting the HCAHPS scores to percentiles, we found considerably higher rankings for all 3 provider‐related questions and for the overall hospital rating in the intervention group compared to the control group (P = 0.02 for overall differences in percentiles [Table 2]).
HCAHPS Questions | Proportion Top Box*: Control, N = 35 | Proportion Top Box*: Intervention, N = 30 | Percentile Rank: Control, N = 35 | Percentile Rank: Intervention, N = 30 |
---|---|---|---|---|
Overall hospital rating | 61% | 80% | 6 | 87 |
Courtesy/respect | 86% | 93% | 23 | 88 |
Clear communication | 77% | 80% | 39 | 60 |
Listening | 83% | 90% | 57 | 95 |
No adverse events occurred during the course of the study in either group.
DISCUSSION
The important findings of this study were that (1) daily patient satisfaction scores improved from the first day to the last day regardless of study group; (2) patients whose providers received real‐time feedback showed a trend toward higher HCAHPS top box proportions for the 3 provider‐related questions and for the overall hospital rating, although these differences were not statistically significant; and (3) the percentile rankings for these 3 questions and for the overall hospital rating, as well as the median score for the overall hospital rating, were significantly higher in the intervention group.
In 2010, the Centers for Medicare and Medicaid Services implemented value‐based purchasing, a payment model that incentivizes hospitals for reaching certain quality and patient experience thresholds and penalizes those that do not, in part on the basis of patient satisfaction scores.[1] Although low patient satisfaction scores will adversely affect institutions financially, they also reflect patients' perceptions of their care. Some studies suggest that hospitals with higher patient satisfaction scores score higher overall on clinical care processes such as core measures compliance, readmission rates, lower mortality rates, and other quality‐of‐care metrics.[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey assesses patients' experience following their hospital stay.[1] The percent of top box scores (ie, response of always on a four point scale, or scores of 9 or 10 on a 10‐point scale) are utilized to compare hospitals and determine the reimbursement or penalty a hospital will receive. Although these scores are available to the public on the Hospital Compare website,[12] physicians may not know how their hospital is ranked or how they are individually perceived by their patients. Additionally, these surveys are typically conducted 48 hours to 6 weeks after patients are discharged, and the results are distributed back to the hospitals well after the time that care was provided, thereby offering providers no chance of improving patient satisfaction during a given hospital stay.
Institutions across the country are trying to improve their HCAHPS scores, but there is limited research identifying specific measures providers can implement. Some studies have suggested that utilizing etiquette‐based communication and sitting at the bedside[13, 14] may help improve patient experience with their providers, and more recently, it has been suggested that providing real‐time deidentified patient experience survey results with education and a rewards/emncentive system to residents may help as well.[15]
Surveys conducted during a patient's hospitalization can offer real‐time actionable feedback to providers. We performed a quality‐improvement project that was designed to determine if real‐time feedback to hospitalist physicians, followed by coaching, and revisits to the patients' bedside could improve the results recorded on provider‐specific patient surveys and/or patients' HCAHPS scores or percentile rankings.
METHODS
Design
This was a prospective, randomized quality‐improvement initiative that was approved by the Colorado Multiple Institutional Review Board and conducted at Denver Health, a 525‐bed university‐affiliated public safety net hospital. The initiative was conducted on both teaching and nonteaching general internal medicine services, which typically have a daily census of between 10 and 15 patients. No protocol changes occurred during the study.
Participants
Participants included all English‐ or Spanish‐speaking patients who were hospitalized on a general internal medicine service, had been admitted within the 2 days prior to enrollment, and had a hospitalist as their attending physician. Patients were excluded if they were enrolled in the study during a previous hospitalization, refused to participate, lacked capacity to participate, had hearing or speech impediments precluding regular conversation, were prisoners, if their clinical condition precluded participation, or their attending was an investigator in the project.
Intervention
Participants were prescreened by investigators by reviewing team sign‐outs to determine if patients had any exclusion criteria. Investigators attempted to survey each patient who met inclusion criteria on a daily basis between 9:00 am and 11:00 am. An investigator administered the survey to each patient verbally using scripted language. Patients were asked to rate how well their doctors were listening to them, explaining what they wanted to know, and whether the doctors were being friendly and helpful, all questions taken from a survey that was available on the US Department of Health and Human Services website (to be referred to as here forward daily survey).[16] We converted the original 5‐point Likert scale used in this survey to a 4‐point scale by removing the option of ok, leaving participants the options of poor, fair, good, or great. Patients were also asked to provide any personalized feedback they had, and these comments were recorded in writing by the investigator.
After being surveyed on day 1, patients were randomized to an intervention or control group using an automated randomization module in Research Electronic Data Capture (REDCap).[17] Patients in both groups who did not provide answers to all 3 questions that qualified as being top box (ie, great) were resurveyed on a daily basis until their responses were all top box or they were discharged, met exclusion criteria, or had been surveyed for a total of 4 consecutive days. In the pilot phase of this study, we found that if patients reported all top box scores on the initial survey their responses typically did not change over time, and the patients became frustrated if asked the same questions again when the patient felt there was not room for improvement. Accordingly, we elected to stop surveying patients when all top box responses were reported.
The attending hospitalist caring for each patient in the intervention group was given feedback about their patients' survey results (both their scores and any specific comments) on a daily basis. Feedback was provided in person by 1 of the investigators. The hospitalist also received an automatically generated electronic mail message with the survey results at 11:00 am on each study day. After informing the hospitalists of the patients' scores, the investigator provided a brief education session that included discussing Denver Health's most recent HCAHPS scores, value‐based purchasing, and the financial consequences of poor patient satisfaction scores. The investigator then coached the hospitalist on etiquette‐based communication,[18, 19] suggested that they sit down when communicating with their patients,[19, 20] and then asked the hospitalist to revisit each patient to discuss how the team could improve in any of the 3 areas where the patient did not give a top box score. These educational sessions were conducted in person and lasted a maximum of 5 minutes. An investigator followed up with each hospitalist the following day to determine whether the revisit occurred. Hospitalists caring for patients who were randomized to the control group were not given real‐time feedback or coaching and were not asked to revisit patients.
A random sample of patients surveyed for this initiative also received HCAHPS surveys 48 hours to 6 weeks following their hospital discharge, according to the standard methodology used to acquire HCAHPS data,[21] by an outside vendor contracted by Denver Health. Our vendor conducted these surveys via telephone in English or Spanish.
Outcomes
The primary outcome was the proportion of patients in each group who reported top box scores on the daily surveys. Secondary outcomes included the percent change in the scores recorded for the 3 provider‐specific questions from the daily survey, the median top box HCAHPS scores for the 3 provider‐related questions and overall hospital rating, and the HCAHPS percentiles of top box scores for these questions.
Sample Size
The sample size for this intervention assumed that the proportion of patients whose treating physicians did not receive real‐time feedback who rated their providers as top box would be 75%, and that the effect of providing real‐time feedback would increase this proportion to 85% on the daily surveys. To have 80% power with a type 1 error of 0.05, we estimated a need to enroll 430 patients, 215 in each group.
Statistics
Data were collected and managed using a secure, Web‐based electronic data capture tool hosted at Denver Health (REDCap), which is designed to support data collection for research studies by providing (1) an intuitive interface for validated data entry, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for importing data from external sources.[17]
A χ² test was used to compare the proportion of patients in the 2 groups who reported great scores for each question on the study survey on the first and last day. With the intent of providing a framework for understanding the effect real‐time feedback could have on patient experience, a secondary analysis of HCAHPS results was conducted using several different methods.
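As a rough illustration of this kind of comparison, the sketch below computes a Pearson chi-square test for a 2 × 2 table of top box versus non-top box counts in 2 groups. It is a minimal sketch under stated assumptions: no continuity correction is applied (the text does not say whether one was used), the function name `chi_square_2x2` is hypothetical, and the counts in the usage lines are made up for illustration and are not study data.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout (assumed for illustration):
        group 1: a = top box, b = not top box
        group 2: c = top box, d = not top box
    Returns (statistic, two-sided p value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, not taken from the study.
stat, p = chi_square_2x2(60, 40, 50, 50)
print(f"chi-square = {stat:.3f}, P = {p:.3f}")
```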
First, the proportion of patients in the 2 groups who reported scores of 9 or 10 for the overall hospital rating question or reported always for each doctor communication question on the HCAHPS survey was compared using a χ² test. Second, to allow for detection of differences in a sample with a smaller N, the median overall hospital rating scores from the HCAHPS survey reported by patients in the 2 groups who completed a survey following discharge were compared using a Wilcoxon rank sum test. Lastly, to place changes in proportion into a larger context (ie, how these changes would relate to value‐based purchasing), HCAHPS scores were converted to percentiles of national performance using the 2014 percentile rankings obtained from the external vendor that conducts the HCAHPS surveys for our hospital and compared between the intervention and control groups using a Wilcoxon rank sum test.
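The Wilcoxon rank sum comparison mentioned above can be sketched as follows. This is an illustrative sketch only: it uses the large-sample normal approximation with a tie correction and no continuity correction, which is an assumption (the text does not state whether an exact or approximate test was used). The function name `rank_sum_test` is hypothetical, and the scores in the usage lines are invented, not study data.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected).

    Returns (w, z, p) where w is the sum of ranks of sample x.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    w = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                          # size of this group of tied values
        avg_rank = (i + 1 + j) / 2.0       # mean of ranks i+1 .. j
        w += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        raise ValueError("all observations are identical")
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return w, z, p


# Hypothetical 0-10 overall hospital ratings, not taken from the study.
control = [8, 9, 9, 10, 7, 8]
intervention = [10, 9, 10, 10, 9, 10]
print(rank_sum_test(control, intervention))
```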
All comments collected from patients during their daily surveys were reviewed, and key words were abstracted from each comment. These key words were sorted and reviewed to categorize recurring key words into themes. Exemplars were then selected for each theme derived from patient comments.
RESULTS
From April 14, 2014 to September 19, 2014, we enrolled 227 patients in the control group and 228 in the intervention group (Figure 1). Patient demographics are summarized in Table 1. Of the 132 patients in the intervention group who reported anything less than top box scores for any of the 3 questions (thus prompting a revisit by their provider), 106 (80%) were revisited by their provider at least once during their hospitalization.
Characteristic | All Patients: Control, N = 227 | All Patients: Intervention, N = 228 | HCAHPS Patients: Control, N = 35 | HCAHPS Patients: Intervention, N = 30 |
---|---|---|---|---|
Age, mean ± SD | 55 ± 14 | 55 ± 15 | 55 ± 15 | 57 ± 16 |
Gender | | | | |
Male | 126 (60) | 121 (55) | 20 (57) | 12 (40) |
Female | 85 (40) | 98 (45) | 15 (43) | 18 (60) |
Race/ethnicity | | | | |
Hispanic | 84 (40) | 90 (41) | 17 (49) | 12 (40) |
Black | 38 (18) | 28 (13) | 6 (17) | 7 (23) |
White | 87 (41) | 97 (44) | 12 (34) | 10 (33) |
Other | 2 (1) | 4 (2) | 0 (0) | 1 (3) |
Payer | | | | |
Medicare | 65 (29) | 82 (36) | 15 (43) | 12 (40) |
Medicaid | 122 (54) | 108 (47) | 17 (49) | 14 (47) |
Commercial | 12 (5) | 15 (7) | 1 (3) | 1 (3) |
Medically indigent | 4 (2) | 7 (3) | 0 (0) | 3 (10) |
Self‐pay | 5 (2) | 4 (2) | 1 (3) | 0 (0) |
Other/unknown | 19 (8) | 12 (5) | 0 (0) | 0 (0) |
Team | | | | |
Teaching | 187 (82) | 196 (86) | 27 (77) | 24 (80) |
Nonteaching | 40 (18) | 32 (14) | 8 (23) | 6 (20) |
Top 5 primary discharge diagnoses* | | | | |
Septicemia | 26 (11) | 34 (15) | 3 (9) | 5 (17) |
Heart failure | 14 (6) | 13 (6) | 2 (6) | |
Acute pancreatitis | 12 (5) | 9 (4) | 3 (9) | 2 (7) |
Diabetes mellitus | 11 (5) | 8 (4) | 2 (6) | |
Alcohol withdrawal | 9 (4) | | | |
Cellulitis | | 7 (3) | | 2 (7) |
Pulmonary embolism | | | | 2 (7) |
Chest pain | | | | 2 (7) |
Atrial fibrillation | | | 2 (6) | |
Length of stay, median (IQR) | 3 (2, 5) | 3 (2, 5) | 3 (2, 5) | 3 (2, 4) |
Charlson Comorbidity Index, median (IQR) | 1 (0, 3) | 2 (0, 3) | 1 (0, 3) | 1.5 (1, 3) |

Daily Surveys
The proportion of patients in both study groups reporting top box scores tended to increase from the first day to the last day of the survey (Figure 2); however, we found no statistically significant differences between the intervention and control groups in the proportion of patients who reported top box scores on either the first day or the last day. The comments made by the patients are summarized in Supporting Table 1 in the online version of this article.

HCAHPS Scores
The proportion of top box scores from the HCAHPS surveys was higher, though not significantly so, for all 3 provider‐specific questions and for the overall hospital rating for patients whose hospitalists received real‐time feedback (Table 2). The median [interquartile range] score for the overall hospital rating was higher for patients in the intervention group than for those in the control group (10 [9, 10] vs 9 [8, 10], P = 0.04). After converting the HCAHPS scores to percentiles, we found considerably higher rankings for all 3 provider‐related questions and for the overall hospital rating in the intervention group compared to the control group (P = 0.02 for overall differences in percentiles [Table 2]).
HCAHPS Questions | Proportion Top Box*: Control, N = 35 | Proportion Top Box*: Intervention, N = 30 | Percentile Rank: Control, N = 35 | Percentile Rank: Intervention, N = 30 |
---|---|---|---|---|
Overall hospital rating | 61% | 80% | 6 | 87 |
Courtesy/respect | 86% | 93% | 23 | 88 |
Clear communication | 77% | 80% | 39 | 60 |
Listening | 83% | 90% | 57 | 95 |
No adverse events occurred during the course of the study in either group.
DISCUSSION
The important findings of this study were that (1) daily patient satisfaction scores improved from first day to last day regardless of study group; (2) patients whose providers received real‐time feedback had a trend toward higher HCAHPS proportions for the 3 provider‐related questions as well as the overall rating of the hospital, but these differences were not statistically significant; and (3) the percentile ranks for these 3 questions as well as the overall rating of the hospital were significantly higher in the intervention group, as was the median score for the overall hospital rating.
Our original sample size calculation was based upon our own preliminary data, indicating that our baseline top box scores for the daily survey were around 75%. The daily survey top box score on the first day was, however, much lower (Figure 2). Accordingly, although we did not find a significant difference in these daily scores, we were underpowered to find such a difference. Additionally, because only a small percentage of patients are selected for the HCAHPS survey, our ability to detect a difference in this secondary outcome was also limited. We felt that it was important to analyze the percentile comparisons in addition to the proportion of top box scores on the HCAHPS, because the metrics for value‐based purchasing are based, in part, upon how a hospital system compares to other systems. Finally, to improve our power to detect a difference given a small sample size, we converted the scoring system for the overall hospital rating to a continuous variable, and the difference was again significant.
To our knowledge, this is the first randomized investigation designed to assess the effect of real‐time, patient‐specific feedback to physicians. Real‐time feedback is increasingly being incorporated into medical practice, but there is only limited information available describing how this type of feedback affects outcomes.[22, 23, 24] Banka et al.[15] found that HCAHPS scores improved as a result of real‐time feedback given to residents, but the study was not randomized, utilized a pre‐post design that resulted in there being differences between the patients studied before and after the intervention, and did not provide patient‐specific data to the residents. Tabib et al.[25] found that operating costs decreased 17% after instituting real‐time feedback to providers about these costs. Reeves et al.[26] conducted a cluster randomized trial of a patient feedback survey that was designed to improve nursing care, but the results were reviewed by the nurses several months after patients had been discharged.
The differences in median top box scores and percentile rank that we observed could have resulted from the real‐time feedback, the educational coaching, the fact that the providers revisited the majority of the patients, or a combination of all of the above. Gross et al.[27] found that longer visits lead to higher satisfaction, though others have not found this to necessarily be the case.[28, 29] Lin et al.[30] found that patient satisfaction was affected by the perceived duration of the visit as well as whether expectations on visit length were met and/or exceeded. Brown et al.[31] found that training providers in communication skills improved the providers' perception of their communication skills, although patient experience scores did not improve. We feel that the results are more likely attributable to a combination of these components than to any 1 component of the intervention.
The most commonly reported complaints or concerns in patients' undirected comments often related to communication issues. Comments on subsequent surveys suggested that patient satisfaction improved over time in the intervention group, indicating that perhaps physicians did try to improve in areas that were highlighted by the real‐time feedback, and that patients perceived the physician efforts to do so (eg, "They're doing better than the last time you asked." "They sat down and talked to me and listened better." "They came back and explained to me about my care." "They listened better." "They should do this survey at the clinic." See Supporting Table 1 in the online version of this article).
Our study has several limitations. First, we did not randomize providers, and many of our providers (approximately 65%) participated in both the control and intervention groups and thus received real‐time feedback at some point during the study, which could have affected their overall practice and limited our ability to find a difference between the 2 groups. In an attempt to control for this possibility, the study was conducted on an intermittent basis during the study time frame. Furthermore, the proportion of patients who reported top box scores at the beginning of the study did not have a clear trend of change by the end of the study, suggesting that overall clinician practices with respect to patient satisfaction did not change during this short time period.
Second, only a small number of our patients were randomly selected for the HCAHPS survey, which limited our ability to detect significant differences in HCAHPS proportions. Third, the HCAHPS percentiles at our institution at that time were low. Accordingly, the improvements that we observed in patient satisfaction scores might not be reproducible at institutions with higher satisfaction scores. Fourth, time and resources were needed to obtain patient feedback to relay to providers during this study. There are, however, other ways to obtain feedback that are less resource intensive (eg, electronic feedback, the utilization of volunteers, or partnering this with manager rounding). Finally, the study was conducted at a single, university‐affiliated public teaching hospital and was a quality‐improvement initiative, and thus our results are not generalizable to other institutions.
In conclusion, real‐time feedback of patient experience to their providers, coupled with provider education, coaching, and revisits, seems to improve satisfaction of patients hospitalized on general internal medicine units who were cared for by hospitalists.
Acknowledgements
The authors thank Kate Fagan, MPH, for her excellent technical assistance.
Disclosure: Nothing to report.
- HCAHPS Fact Sheet. 2015. Available at: http://www.hcahpsonline.org/Files/HCAHPS_Fact_Sheet_June_2015.pdf. Accessed August 25, 2015.
- The relationship between commercial website ratings and traditional hospital performance measures in the USA. BMJ Qual Saf. 2013;22:194–202.
- Patients' perception of hospital care in the United States. N Engl J Med. 2008;359:1921–1931.
- The relationship between patients' perception of care and measures of hospital quality and safety. Health Serv Res. 2010;45:1024–1040.
- Relationship between quality of diabetes care and patient satisfaction. J Natl Med Assoc. 2003;95:64–70.
- Relationship between patient satisfaction with inpatient care and hospital readmission within 30 days. Am J Manag Care. 2011;17:41–48.
- A systematic review of evidence on the links between patient experience and clinical safety and effectiveness. BMJ Open. 2013;3(1).
- The association between satisfaction with services provided in primary care and outcomes in type 2 diabetes mellitus. Diabet Med. 2003;20:486–490.
- Associations between Web‐based patient ratings and objective measures of hospital quality. Arch Intern Med. 2012;172:435–436.
- Patient satisfaction and its relationship with clinical quality and inpatient mortality in acute myocardial infarction. Circ Cardiovasc Qual Outcomes. 2010;3:188–195.
- Patients' perceptions of care are associated with quality of hospital care: a survey of 4605 hospitals. Am J Med Qual. 2015;30(4):382–388.
- Centers for Medicare 28:908–913.
- Effect of sitting vs. standing on perception of provider time at bedside: a pilot study. Patient Educ Couns. 2012;86:166–171.
- Improving patient satisfaction through physician education, feedback, and incentives. J Hosp Med. 2015;10:497–502.
- US Department of Health and Human Services. Patient satisfaction survey. Available at: http://bphc.hrsa.gov/policiesregulations/performancemeasures/patientsurvey/surveyform.html. Accessed November 15, 2013.
- Research electronic data capture (REDCap)—a metadata‐driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42:377–381.
- The HCAHPS Handbook. Gulf Breeze, FL: Fire Starter; 2010.
- Etiquette‐based medicine. N Engl J Med. 2008;358:1988–1989.
- 5 years after the Kahn's etiquette‐based medicine: a brief checklist proposal for a functional second meeting with the patient. Front Psychol. 2013;4:723.
- Frequently Asked Questions. Hospital Value‐Based Purchasing Program. Available at: http://www.cms.gov/Medicare/Quality‐Initiatives‐Patient‐Assessment‐Instruments/hospital‐value‐based‐purchasing/Downloads/FY‐2013‐Program‐Frequently‐Asked‐Questions‐about‐Hospital‐VBP‐3‐9‐12.pdf. Accessed February 8, 2014.
- Real‐time patient survey data during routine clinical activities for rapid‐cycle quality improvement. JMIR Med Inform. 2015;3:e13.
- Mount Sinai launches real‐time patient‐feedback survey tool. Healthcare Informatics website. Available at: http://www.healthcare‐informatics.com/news‐item/mount‐sinai‐launches‐real‐time‐patient‐feedback‐survey‐tool. Accessed August 25, 2015.
- Hospitals are finally starting to put real‐time data to use. Harvard Business Review website. Available at: https://hbr.org/2014/11/hospitals‐are‐finally‐starting‐to‐put‐real‐time‐data‐to‐use. Published November 12, 2014. Accessed August 25, 2015.
- Reducing operating room costs through real‐time cost information feedback: a pilot study. J Endourol. 2015;29:963–968.
- Facilitated patient experience feedback can improve nursing care: a pilot study for a phase III cluster randomised controlled trial. BMC Health Serv Res. 2013;13:259.
- Patient satisfaction with time spent with their physician. J Fam Pract. 1998;47:133–137.
- The relationship between time spent communicating and communication outcomes on a hospital medicine service. J Gen Intern Med. 2012;27:185–189.
- Cognitive interview techniques reveal specific behaviors and issues that could affect patient satisfaction relative to hospitalists. J Hosp Med. 2009;4:E1–E6.
- Is patients' perception of time spent with the physician a determinant of ambulatory patient satisfaction? Arch Intern Med. 2001;161:1437–1442.
- Effect of clinician communication skills training on patient satisfaction. A randomized, controlled trial. Ann Intern Med. 1999;131:822–829.
© 2016 Society of Hospital Medicine
Impact of Pneumonia Guidelines
Overutilization of resources is a significant, yet underappreciated, problem in medicine. Many interventions target underutilization (eg, immunizations) or misuse (eg, antibiotic prescribing for viral pharyngitis), yet overutilization remains a significant contributor to healthcare waste.[1] In an effort to reduce waste, the Choosing Wisely campaign created a work group to highlight areas of overutilization, specifically noting both diagnostic tests and therapies for common pediatric conditions with no proven benefit and possible harm to the patient.[2] Respiratory illnesses have been a target of many quality‐improvement efforts, and pneumonia represents a common diagnosis in pediatrics.[3] The use of diagnostic testing for pneumonia is an area where care can be optimized and aligned with evidence.
Laboratory testing and diagnostic imaging are routinely used for the management of children with community‐acquired pneumonia (CAP). Several studies have documented substantial variability in the use of these resources for pneumonia management, with higher resource use associated with a higher chance of hospitalization after emergency department (ED) evaluation and a longer length of stay among those requiring hospitalization.[4, 5] This variation in diagnostic resource utilization has been attributed, at least in part, to a lack of consensus on the management of pneumonia. Given the wide variability in diagnostic testing and the potential consequences for patients presenting with pneumonia, efforts to standardize care offer an opportunity to improve healthcare value.
In August 2011, the first national, evidence‐based consensus guidelines for the management of childhood CAP were published jointly by the Pediatric Infectious Diseases Society (PIDS) and the Infectious Diseases Society of America (IDSA).[6] A primary focus of these guidelines was the recommendation for the use of narrow spectrum antibiotics for the management of uncomplicated pneumonia. Previous studies have assessed the impact of the publication of the PIDS/IDSA guidelines on empiric antibiotic selection for the management of pneumonia.[7, 8] In addition, the guidelines provided recommendations regarding diagnostic test utilization, in particular discouraging blood tests (eg, complete blood counts) and radiologic studies for nontoxic, fully immunized children treated as outpatients, as well as repeat testing for children hospitalized with CAP who are improving.
Although single centers have demonstrated changes in utilization patterns based on clinical practice guidelines,[9, 10, 11, 12] whether these guidelines have impacted diagnostic test utilization among US children with CAP on a larger scale remains unknown. Therefore, we sought to determine the impact of the PIDS/IDSA guidelines on the use of diagnostic testing among children with CAP using a national sample of US children's hospitals. Because the guidelines discourage repeat diagnostic testing in patients who are improving, we also evaluated the association between repeat diagnostic studies and severity of illness.
METHODS
This retrospective cohort study used data from the Pediatric Health Information System (PHIS) (Children's Hospital Association, Overland Park, KS). The PHIS database contains deidentified administrative data, detailing demographic, diagnostic, procedure, and billing data from 47 freestanding, tertiary care children's hospitals. This database accounts for approximately 20% of all annual pediatric hospitalizations in the United States. Data quality is ensured through a joint effort between the Children's Hospital Association and participating hospitals.
Patient Population
Data from 32 (of the 47) hospitals included in PHIS with complete inpatient and ED data were used to evaluate hospital‐level resource utilization for children 1 to 18 years of age discharged January 1, 2008 to June 30, 2014 with a diagnosis of pneumonia (International Classification of Diseases, 9th Revision [ICD‐9] codes 480.x‐486.x, 487.0).[13] Our goal was to identify previously healthy children with uncomplicated pneumonia, so we excluded patients with complex chronic conditions,[14] billing charges for intensive care management and/or pleural drainage procedure (ICD‐9 codes 510.0, 510.9, 511.0, 511.1, 511.8, 511.9, 513.x) on the day of admission or the next day, or a pneumonia admission in the prior 30 days. We studied 2 mutually exclusive populations: children with pneumonia treated in the ED (ie, patients who were evaluated in the ED and discharged to home), and children hospitalized with pneumonia, including those admitted through the ED.
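The code ranges listed above can be expressed as a simple filter. The sketch below is illustrative only and is not the PHIS query used in the study: the function names are hypothetical, and it assumes codes are stored as dotted strings such as "486" or "487.0", which may not match the database's actual format.

```python
def is_pneumonia_code(code: str) -> bool:
    """True for ICD-9 codes 480.x through 486.x, or exactly 487.0."""
    category, _, subcode = code.strip().partition(".")
    if not category.isdigit():
        return False
    if 480 <= int(category) <= 486:
        return True
    return int(category) == 487 and subcode == "0"


# Codes listed in the exclusion criteria (exact matches) plus the 513.x family.
EXCLUSION_EXACT = {"510.0", "510.9", "511.0", "511.1", "511.8", "511.9"}


def is_listed_exclusion_code(code: str) -> bool:
    """True for the exclusion codes named in the text (510.0 ... 511.9, 513.x)."""
    code = code.strip()
    return code in EXCLUSION_EXACT or code.partition(".")[0] == "513"


# Example: keep encounters with a pneumonia code and no listed exclusion code.
codes = ["486", "487.0", "487.1", "513.0", "466.11"]
print([c for c in codes if is_pneumonia_code(c)])
print([c for c in codes if is_listed_exclusion_code(c)])
```

In practice the exclusion also depends on timing (day of admission or the next day) and on the other criteria in the text, which this sketch does not model.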
Guideline Publication and Study Periods
For an exploratory before and after comparison, patients were grouped into 2 cohorts based on a guideline online publication date of August 1, 2011: preguideline (January 1, 2008 to July 31, 2011) and postguideline (August 1, 2011 to June 30, 2014).
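The cohort assignment can be sketched as a date comparison. This is a minimal sketch that assumes grouping is based on the discharge date (the text defines the study window by discharge date but does not state which date was used for grouping); `assign_cohort` is a hypothetical name.

```python
from datetime import date
from typing import Optional

STUDY_START = date(2008, 1, 1)
GUIDELINE_DATE = date(2011, 8, 1)    # online publication date of the guidelines
STUDY_END = date(2014, 6, 30)


def assign_cohort(discharge_date: date) -> Optional[str]:
    """Return "preguideline", "postguideline", or None if outside the study window."""
    if not (STUDY_START <= discharge_date <= STUDY_END):
        return None
    if discharge_date < GUIDELINE_DATE:
        return "preguideline"
    return "postguideline"


print(assign_cohort(date(2011, 7, 31)))   # last preguideline day
print(assign_cohort(date(2011, 8, 1)))    # first postguideline day
```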
Study Outcomes
The measured outcomes were the monthly proportion of pneumonia patients for whom specific diagnostic tests were performed, as determined from billing data. The diagnostic tests evaluated were complete blood count (CBC), blood culture, C‐reactive protein (CRP), and chest radiograph (CXR). Standardized costs were also calculated from PHIS charges as previously described to standardize the cost of the individual tests and remove interhospital cost variation.[3]
Relationship of Repeat Testing and Severity of Illness
Because higher illness severity and clinical deterioration may warrant repeat testing, we also explored the association of repeat diagnostic testing for inpatients with severity of illness by using the following variables as measures of severity: length of stay (LOS), transfer to intensive care unit (ICU), or pleural drainage procedure after admission (>2 calendar days after admission). Repeat diagnostic testing was stratified by number of tests.
Statistical Analysis
The categorical demographic characteristics of the pre‐ and postguideline populations were summarized using frequencies and percentages, and compared using χ² tests. Continuous demographics were summarized with medians and interquartile ranges (IQRs) and compared with the Wilcoxon rank sum test. Segmented regression, clustered by hospital, was used to assess trends in monthly resource utilization as well as associated standardized costs before and after guidelines publication. To estimate the impact of the guidelines overall, we compared the observed diagnostic resource use at the end of the study period with expected use projected from trends in the preguidelines period (ie, if there were no new guidelines). Individual interrupted time series were also built for each hospital. From these models, we assessed which hospitals had a significant difference between the rate observed at the end of the study and that estimated from their preguideline trajectory. To assess the relationship between the number of positive improvements at a hospital and hospital characteristics, we used Spearman's correlation and Kruskal‐Wallis tests. All analyses were performed with SAS version 9.3 (SAS Institute, Inc., Cary, NC), and P values <0.05 were considered statistically significant. In accordance with the policies of the Cincinnati Children's Hospital Medical Center Institutional Review Board, this research, using a deidentified dataset, was not considered human subjects research.
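The observed-versus-expected comparison described above can be illustrated with a basic segmented (interrupted time series) regression. The sketch below is not the SAS analysis used in the study: it fits ordinary least squares to a single monthly series, omits clustering by hospital and any significance testing, and uses hypothetical function names and synthetic data. The assumed model is rate = b0 + b1·month + b2·post + b3·post·(month − change_month), where post is 1 from the first postguideline month onward.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_segmented(rates, change_month):
    """Least squares fit of the 4-parameter segmented model.

    rates: monthly proportions, index 0 = first study month.
    change_month: index of the first postguideline month.
    Returns [b0, b1, b2, b3].
    """
    rows = []
    for t in range(len(rates)):
        post = 1.0 if t >= change_month else 0.0
        rows.append([1.0, float(t), post, post * (t - change_month)])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def observed_vs_expected(coefs, change_month, last_month):
    """Fitted rate at last_month with guidelines vs. projected preguideline trend."""
    b0, b1, b2, b3 = coefs
    expected = b0 + b1 * last_month                       # no-guideline projection
    observed = expected + b2 + b3 * (last_month - change_month)
    return observed, expected


# Synthetic series: 78 months (January 2008 to June 2014), change at month 43.
series = [0.10 + 0.001 * t + ((0.02 + 0.0005 * (t - 43)) if t >= 43 else 0.0)
          for t in range(78)]
coefs = fit_segmented(series, 43)
print(observed_vs_expected(coefs, 43, 77))
```

With 78 monthly points indexed from 0, month 43 corresponds to August 2011 and month 77 to June 2014; a hospital-level version would fit the same model separately to each hospital's series.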
RESULTS
There were 275,288 hospital admissions meeting study inclusion criteria of 1 to 18 years of age with a diagnosis of pneumonia from 2008 to 2014. Of these, 54,749 met exclusion criteria (1874 had pleural drainage procedure on day 0 or 1, 51,306 had complex chronic conditions, 1569 were hospitalized with pneumonia in the last 30 days). Characteristics of the remaining 220,539 patients in the final sample are shown in Table 1. The median age was 4 years (IQR, 2–7 years); a majority of the children were male (53%) and had public insurance (58%). There were 128,855 patients in the preguideline period (January 1, 2008 to July 31, 2011) and 91,684 in the postguideline period (August 1, 2011 to June 30, 2014).
Overall | Preguideline | Postguideline | P | |
---|---|---|---|---|
| ||||
No. of discharges | 220,539 | 128,855 | 91,684 | |
Type of encounter | ||||
ED only | 150,215 (68.1) | 88,790 (68.9) | 61,425 (67) | <0.001 |
Inpatient | 70,324 (31.9) | 40,065 (31.1) | 30,259 (33) | |
Age | ||||
1–4 years | 129,360 (58.7) | 77,802 (60.4) | 51,558 (56.2) | <0.001 |
5–9 years | 58,609 (26.6) | 32,708 (25.4) | 25,901 (28.3) | |
10–18 years | 32,570 (14.8) | 18,345 (14.2) | 14,225 (15.5) | |
Median [IQR] | 4 [2–7] | 3 [2–7] | 4 [2–7] | <0.001 |
Gender | ||||
Male | 116,718 (52.9) | 68,319 (53) | 48,399 (52.8) | 0.285 |
Female | 103,813 (47.1) | 60,532 (47) | 43,281 (47.2) | |
Race | ||||
Non‐Hispanic white | 84,423 (38.3) | 47,327 (36.7) | 37,096 (40.5) | <0.001 |
Non‐Hispanic black | 60,062 (27.2) | 35,870 (27.8) | 24,192 (26.4) | |
Hispanic | 51,184 (23.2) | 31,167 (24.2) | 20,017 (21.8) | |
Asian | 6,444 (2.9) | 3,691 (2.9) | 2,753 (3) | |
Other | 18,426 (8.4) | 10,800 (8.4) | 7,626 (8.3) | |
Payer | ||||
Government | 128,047 (58.1) | 70,742 (54.9) | 57,305 (62.5) | <0.001 |
Private | 73,338 (33.3) | 44,410 (34.5) | 28,928 (31.6) | |
Other | 19,154 (8.7) | 13,703 (10.6) | 5,451 (5.9) | |
Disposition | ||||
HHS | 684 (0.3) | 411 (0.3) | 273 (0.3) | <0.001 |
Home | 209,710 (95.1) | 123,236 (95.6) | 86,474 (94.3) | |
Other | 9,749 (4.4) | 4,962 (3.9) | 4,787 (5.2) | |
SNF | 396 (0.2) | 246 (0.2) | 150 (0.2) | |
Season | ||||
Spring | 60,171 (27.3) | 36,709 (28.5) | 23,462 (25.6) | <0.001 |
Summer | 29,891 (13.6) | 17,748 (13.8) | 12,143 (13.2) | |
Fall | 52,161 (23.7) | 28,332 (22) | 23,829 (26) | |
Winter | 78,316 (35.5) | 46,066 (35.8) | 32,250 (35.2) | |
LOS | ||||
1–3 days | 204,812 (92.9) | 119,497 (92.7) | 85,315 (93.1) | <0.001 |
4–6 days | 10,454 (4.7) | 6,148 (4.8) | 4,306 (4.7) | |
7+ days | 5,273 (2.4) | 3,210 (2.5) | 2,063 (2.3) | |
Median [IQR] | 1 [1–1] | 1 [1–1] | 1 [1–1] | 0.144 |
Admitted patients, median [IQR] | 2 [1–3] | 2 [1–3] | 2 [1–3] | <0.001 |
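
As an illustration of how the categorical comparisons in Table 1 work, the sketch below computes a Pearson χ² statistic for the type-of-encounter rows using the counts reported in the table. It is a hedged example in plain Python, not the study's SAS code; the function name `chi_square_2x2` is hypothetical, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Type-of-encounter counts from Table 1: rows are ED only and inpatient,
# columns are preguideline and postguideline.
stat, p_value = chi_square_2x2(88_790, 61_425, 40_065, 30_259)
```

With samples this large, even a shift of about 2 percentage points in the share of ED-only encounters yields a P value well below 0.001, which is consistent with the table.
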
Discharged From the ED
Throughout the study, utilization of CBC, blood cultures, and CRP was <20%, whereas CXR use was >75%. In segmented regression analysis, CRP utilization was relatively stable before the guidelines publication. However, by the end of the study period, the projected estimate of CRP utilization without guidelines (expected) was 2.9% compared with 4.8% with the guidelines (observed) (P < 0.05) (Figure 1). A similar pattern of higher rates of diagnostic utilization after the guidelines compared with projected estimates without the guidelines was also seen in the ED utilization of CBC, blood cultures, and CXR (Figure 1); however, these trends did not achieve statistical significance. Table 2 provides specific values. Using a standard cost of $19.52 for CRP testing, annual costs across all hospitals increased $11,783 for ED evaluation of CAP.
Test | Baseline (%) | Preguideline Trend | Level Change at Guideline | Change in Trend After Guideline | Estimate at End of Study Without Guideline (%) | Estimate at End of Study With Guideline (%) | P |
---|---|---|---|---|---|---|---|
ED‐only encounters | |||||||
Blood culture | 14.6 | 0.1 | 0.8 | 0.1 | 5.5 | 8.6 | NS |
CBC | 19.2 | 0.1 | 0.4 | 0.1 | 10.7 | 14.0 | NS |
CRP | 5.4 | 0.0 | 0.6 | 0.1 | 2.9 | 4.8 | <0.05 |
Chest x‐ray | 85.4 | 0.1 | 0.1 | 0.0 | 80.9 | 81.1 | NS |
Inpatient encounters | |||||||
Blood culture | 50.6 | 0.0 | 1.7 | 0.2 | 49.2 | 41.4 | <0.05 |
Repeat blood culture | 6.5 | 0.0 | 1.0 | 0.1 | 8.9 | 5.8 | NS |
CBC | 65.2 | 0.0 | 3.1 | 0.0 | 65.0 | 62.2 | NS |
Repeat CBC | 23.4 | 0.0 | 4.2 | 0.0 | 20.8 | 16.0 | NS |
CRP | 25.7 | 0.0 | 1.1 | 0.0 | 23.8 | 23.5 | NS |
Repeat CRP | 12.5 | 0.1 | 2.2 | 0.1 | 7.1 | 7.3 | NS |
Chest x‐ray | 89.4 | 0.1 | 0.7 | 0.0 | 85.4 | 83.9 | NS |
Repeat chest x‐ray | 25.5 | 0.0 | 2.0 | 0.1 | 24.1 | 17.7 | <0.05 |
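
The end-of-study estimates in Table 2 can be read as absolute differences in percentage points. The short sketch below does this arithmetic for the three comparisons that reached statistical significance; the dictionary layout and the function name `absolute_difference` are illustrative choices, and the values are taken directly from the table.

```python
# (expected without guideline %, observed with guideline %) from Table 2
end_of_study = {
    "ED CRP": (2.9, 4.8),
    "Inpatient blood culture": (49.2, 41.4),
    "Inpatient repeat chest x-ray": (24.1, 17.7),
}

def absolute_difference(expected, observed):
    """Observed minus expected utilization, in percentage points."""
    return round(observed - expected, 1)

differences = {
    test: absolute_difference(*pair) for test, pair in end_of_study.items()
}
```

A positive value means utilization ended higher than the preguideline trend projected (ED CRP, +1.9 points); negative values mean lower utilization than projected (inpatient blood culture, −7.8 points; inpatient repeat chest x‐ray, −6.4 points).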

Inpatient Encounters
In the segmented regression analysis of children hospitalized with CAP, guideline publication was associated with changes in the monthly use of some diagnostic tests. For example, by the end of the study period, the use of blood culture was 41.4% (observed), whereas the projected estimated use in the absence of the guidelines was 49.2% (expected) (P < 0.05) (Figure 2). Table 2 includes the data for the other tests, CBC, CRP, and CXR, in which similar patterns are noted with lower utilization rates after the guidelines, compared with expected utilization rates without the guidelines; however, these trends did not achieve statistical significance. Evaluating the utilization of repeat testing for inpatients, only repeat CXR achieved statistical significance (P < 0.05), with utilization rates of 17.7% with the guidelines (actual) compared with 24.1% without the guidelines (predicted).

To better understand the use of repeat testing, a comparison of severity outcomes (LOS, ICU transfer, and pleural drainage procedures) was performed between patients with no repeat testing (70%) and patients with 1 or more repeat tests (30%). Patients with repeat testing had longer LOS (no repeat testing LOS 1 [IQR, 1–2] vs 1 repeat test LOS 3 [IQR, 2–4] vs 2+ repeat tests LOS 5 [IQR, 3–8]), higher rate of ICU transfer (no repeat testing 4.6% vs 1 repeat test 14.6% vs 2+ repeat tests 35.6%), and higher rate of pleural drainage (no repeat testing 0% vs 1 repeat test 0.1% vs 2+ repeat tests 5.9%) (all P < 0.001).
Using standard costs of $37.57 for blood cultures and $73.28 for CXR, annual costs for children with CAP across all hospitals decreased by $91,512 due to decreased utilization of blood cultures, and by $146,840 due to decreased utilization of CXR.
Hospital‐Level Variation in the Impact of the National Guideline
Figure 3 is a visual representation (heat map) of the impact of the guidelines at the hospital level at the end of the study, derived from the individual interrupted time series. The heat map shows wide variability between hospitals in the impact of the guideline on each test in each setting (ED or inpatient). By diagnostic test, 7 hospitals significantly decreased utilization of blood cultures for inpatients, and 5 hospitals significantly decreased utilization of repeat blood cultures and repeat CXR. Correlations between the number of positive improvements at a hospital and region (P = 0.974), number of CAP cases (P = 0.731), or percentage of public insurance (P = 0.241) were all nonsignificant.

DISCUSSION
This study complements previous assessments by evaluating the impact of the 2011 IDSA/PIDS consensus guidelines on the management of children with CAP cared for at US children's hospitals. Prior studies have shown increased use of narrow‐spectrum antibiotics for children with CAP after the publication of these guidelines.[7] The current study focused on diagnostic testing for CAP before and after the publication of the 2011 guidelines. In the ED setting, use of some diagnostic tests (blood culture, CBC, CXR, CRP) was declining prior to guideline publication, but appeared to plateau and/or increase after 2011. Among children admitted with CAP, use of diagnostic testing was relatively stable prior to 2011, and use of these tests (blood culture, CBC, CXR, CRP) declined after guideline publication. Overall, changes in diagnostic resource utilization 3 years after publication were modest, with few changes achieving statistical significance. There was a large variability in the impact of guidelines on test use between hospitals.
For outpatients, including those managed in the ED, the PIDS/IDSA guidelines recommend limited laboratory testing in nontoxic, fully immunized patients. The guidelines discourage the use of diagnostic testing among outpatients because of their low yield (eg, blood culture), and because test results may not impact management (eg, CBC).[6] In the years prior to guideline publication, there was already a declining trend in testing rates, including blood cultures, CBC, and CRP, for patients in the ED. After guideline publication, the rate of blood cultures, CBC, and CRP increased, but only the increase in CRP utilization achieved statistical significance. We would not expect utilization for common diagnostic tests (eg, CBC for outpatients with CAP) to be at or close to 0% because of the complexity of clinical decision making regarding admission that factors in aspects of patient history, exam findings, and underlying risk.[15] ED utilization of blood cultures was <10%, CBC <15%, and CRP <5% after guideline publication, which may represent the lowest testing limit that could be achieved.
CXRs obtained in the ED did not decrease over the entire study period. The rates of CXR use (close to 80%) seen in our study are similar to prior ED studies.[5, 16] Management of children with CAP in the ED might be different than outpatient primary care management because (1) unlike primary care providers, ED providers do not have an established relationship with their patients and do not have the opportunity for follow‐up and serial exams, making them less likely to tolerate diagnostic uncertainty; and (2) ED providers may see sicker patients. However, use of CXR in the ED does represent an opportunity for further study to understand if decreased utilization is feasible without adversely impacting clinical outcomes.
The CAP guidelines provide a strong recommendation to obtain blood culture in moderate to severe pneumonia. Despite this, blood culture utilization declined after guideline publication. Less than 10% of children hospitalized with uncomplicated CAP have positive blood cultures, which calls into question the utility of blood cultures for all admitted patients.[17, 18, 19] The recent EPIC (Epidemiology of Pneumonia in the Community) study showed that a majority of children hospitalized with pneumonia do not have growth of bacteria in culture, but there may be a role for blood cultures in patients with a strong suspicion of complicated CAP or in the patient with moderate to severe disease.[20] In addition to blood cultures, the guidelines also recommend CBC and CXR in moderate to severely ill children. This observed decline in testing in CBC and CXR may be related to individual physician assessments of which patients are moderately to severely ill, as the guidelines do not recommend testing for children with less severe disease. Our exclusion of patients requiring intensive care management or pleural drainage on admission might have selected children with a milder course of illness, although still requiring admission.
The guidelines discourage repeat diagnostic testing among children hospitalized with CAP who are improving. In this study, repeat CXR and CBC occurred in approximately 20% of patients, but repeat blood culture and CRP were much less common. As with initial diagnostic testing for inpatients with CAP, the rates of some repeat testing decreased with the guidelines. However, those with repeat testing had longer LOS and were more likely to require ICU transfer or a pleural drainage procedure compared with children without repeat testing. This suggests that repeat testing is used more often in children with a severe presentation or a worsening clinical course, and is not done routinely on hospitalized patients.
The financial impact of decreased testing is modest, because the tests themselves are relatively inexpensive. However, the lack of substantial cost savings should not preclude efforts to continue to improve adherence to the guidelines. Not only is increased testing associated with higher hospitalization rates,[5] potentially yielding higher costs and family stress, increased testing may also lead to patient discomfort and possibly increased radiation exposure through chest radiography.
Many of the diagnostic testing recommendations in the CAP guidelines are based on weak evidence, which may contribute to the lack of substantial adoption. Nevertheless, adherence to guideline recommendations requires sustained effort on the part of individual physicians that should be encouraged through institutional support.[21] Continuous education and clinical decision support, as well as reminders in the electronic medical record, would make guideline recommendations more visible and may help overcome the inertia of previous practice.[15] The hospital‐level heat map (Figure 3) included in this study demonstrates that the impact of the guidelines was variable across sites. Although a few sites had decreased diagnostic testing in many areas with no increased testing in any category, there were several sites that had no improvement in any diagnostic testing category. In addition, hospital‐level factors like size, geography, and insurance status were not associated with number of improvements. To better understand drivers of change at individual hospitals, future studies should evaluate specific strategies utilized by the rapid guideline adopters.
This study is subject to several limitations. The use of ICD‐9 codes to identify patients with CAP may not capture all patients with this diagnosis; however, these codes have been previously validated.[13] Additionally, because patients were identified using ICD‐9 coding assigned at the time of discharge, testing performed in the ED setting may not reflect care for a child with known pneumonia, but rather may reflect testing for a child with fever or other signs of infection. PHIS collects data from freestanding children's hospitals, which care for a majority of children with CAP in the US, but our findings may not be generalizable to other hospitals. In addition, we did not examine drivers of trends within individual institutions. We did not have detailed information to examine whether the PHIS hospitals in our study had actively worked to adopt the CAP guidelines. We were also unable to assess physicians' familiarity with the guidelines or their level of disagreement with the recommendations. Furthermore, the PHIS database does not permit detailed correlation of diagnostic testing with clinical parameters. In contrast to the diagnostic testing evaluated in this study, which is primarily discouraged by the IDSA/PIDS guidelines, respiratory viral testing for children with CAP is recommended but could not be evaluated, as data on such testing are not readily available in PHIS.
CONCLUSION
Publication of the IDSA/PIDS evidence‐based guidelines for the management of CAP was associated with modest, variable changes in use of diagnostic testing. Further adoption of the CAP guidelines should reduce variation in care and decrease unnecessary resource utilization in the management of CAP. Our study demonstrates that efforts to promote decreased resource utilization should target specific situations (eg, repeat testing for inpatients who are improving). Adherence to guidelines may be improved by the adoption of local practices that integrate and improve daily workflow, like order sets and clinical decision support tools.
Disclosure: Nothing to report.
1. Eliminating waste in US health care. JAMA. 2012;307(14):1513–1516.
2. Choosing wisely in pediatric hospital medicine: five opportunities for improved healthcare value. J Hosp Med. 2013;8(9):479–485.
3. Pediatric Research in Inpatient Settings (PRIS) Network. Prioritization of comparative effectiveness research topics in hospital pediatrics. Arch Pediatr Adolesc Med. 2012;166(12):1155–1164.
4. Variability in processes of care and outcomes among children hospitalized with community‐acquired pneumonia. Pediatr Infect Dis J. 2012;31(10):1036–1041.
5. Variation in emergency department diagnostic testing and disposition outcomes in pneumonia. Pediatrics. 2013;132(2):237–244.
6. Pediatric Infectious Diseases Society and the Infectious Diseases Society of America. The management of community‐acquired pneumonia in infants and children older than 3 months of age: clinical practice guidelines by the Pediatric Infectious Diseases Society and the Infectious Diseases Society of America. Clin Infect Dis. 2011;53(7):e25–e76.
7. Impact of Infectious Diseases Society of America/Pediatric Infectious Diseases Society guidelines on treatment of community‐acquired pneumonia in hospitalized children. Clin Infect Dis. 2014;58(6):834–838.
8. Antibiotic choice for children hospitalized with pneumonia and adherence to national guidelines. Pediatrics. 2015;136(1):44–52.
9. Quality improvement methods increase appropriate antibiotic prescribing for childhood pneumonia. Pediatrics. 2013;131(5):e1623–e1631.
10. Improvement methodology increases guideline recommended blood cultures in children with pneumonia. Pediatrics. 2015;135(4):e1052–e1059.
11. Impact of a guideline on management of children hospitalized with community‐acquired pneumonia. Pediatrics. 2012;129(3):e597–e604.
12. Effectiveness of antimicrobial guidelines for community‐acquired pneumonia in children. Pediatrics. 2012;129(5):e1326–e1333.
13. Identifying pediatric community‐acquired pneumonia hospitalizations: accuracy of administrative billing codes. JAMA Pediatr. 2013;167(9):851–858.
14. Pediatric complex chronic conditions classification system version 2: updated for ICD‐10 and complex medical technology dependence and transplantation. BMC Pediatr. 2014;14:199.
15. Establishing superior benchmarks of care in clinical practice: a proposal to drive achievable health care value. JAMA Pediatr. 2015;169(4):301–302.
16. Emergency department management of childhood pneumonia in the United States prior to publication of national guidelines. Acad Emerg Med. 2013;20(3):240–246.
17. Prevalence of bacteremia in hospitalized pediatric patients with community‐acquired pneumonia. Pediatr Infect Dis J. 2013;32(7):736–740.
18. The prevalence of bacteremia in pediatric patients with community‐acquired pneumonia: guidelines to reduce the frequency of obtaining blood cultures. Hosp Pediatr. 2013;3(2):92–96.
19. Do all children hospitalized with community‐acquired pneumonia require blood cultures? Hosp Pediatr. 2013;3(2):177–179.
20. CDC EPIC Study Team. Community‐acquired pneumonia requiring hospitalization among U.S. children. N Engl J Med. 2015;372(9):835–845.
21. Influence of hospital guidelines on management of children hospitalized with pneumonia. Pediatrics. 2012;130(5):e823–e830.
Overutilization of resources is a significant, yet underappreciated, problem in medicine. Many interventions target underutilization (eg, immunizations) or misuse (eg, antibiotic prescribing for viral pharyngitis), yet overutilization remains a significant contributor to healthcare waste.[1] In an effort to reduce waste, the Choosing Wisely campaign created a work group to highlight areas of overutilization, specifically noting both diagnostic tests and therapies for common pediatric conditions with no proven benefit and possible harm to the patient.[2] Respiratory illnesses have been a target of many quality‐improvement efforts, and pneumonia represents a common diagnosis in pediatrics.[3] The use of diagnostic testing for pneumonia is an area where care can be optimized and aligned with evidence.
Laboratory testing and diagnostic imaging are routinely used for the management of children with community‐acquired pneumonia (CAP). Several studies have documented substantial variability in the use of these resources for pneumonia management, with higher resource use associated with a higher chance of hospitalization after emergency department (ED) evaluation and a longer length of stay among those requiring hospitalization.[4, 5] This variation in diagnostic resource utilization has been attributed, at least in part, to a lack of consensus on the management of pneumonia. Given this wide variability in diagnostic testing and its potential consequences for patients presenting with pneumonia, efforts to standardize care offer an opportunity to improve healthcare value.
In August 2011, the first national, evidence‐based consensus guidelines for the management of childhood CAP were published jointly by the Pediatric Infectious Diseases Society (PIDS) and the Infectious Diseases Society of America (IDSA).[6] A primary focus of these guidelines was the recommendation for the use of narrow spectrum antibiotics for the management of uncomplicated pneumonia. Previous studies have assessed the impact of the publication of the PIDS/IDSA guidelines on empiric antibiotic selection for the management of pneumonia.[7, 8] In addition, the guidelines provided recommendations regarding diagnostic test utilization, in particular discouraging blood tests (eg, complete blood counts) and radiologic studies for nontoxic, fully immunized children treated as outpatients, as well as repeat testing for children hospitalized with CAP who are improving.
Although single centers have demonstrated changes in utilization patterns based on clinical practice guidelines,[9, 10, 11, 12] whether these guidelines have impacted diagnostic test utilization among US children with CAP on a larger scale remains unknown. Therefore, we sought to determine the impact of the PIDS/IDSA guidelines on the use of diagnostic testing among children with CAP using a national sample of US children's hospitals. Because the guidelines discourage repeat diagnostic testing in patients who are improving, we also evaluated the association between repeat diagnostic studies and severity of illness.
METHODS
This retrospective cohort study used data from the Pediatric Health Information System (PHIS) (Children's Hospital Association, Overland Park, KS). The PHIS database contains deidentified administrative data, detailing demographic, diagnostic, procedure, and billing data from 47 freestanding, tertiary care children's hospitals. This database accounts for approximately 20% of all annual pediatric hospitalizations in the United States. Data quality is ensured through a joint effort between the Children's Hospital Association and participating hospitals.
Patient Population
Data from 32 (of the 47) hospitals included in PHIS with complete inpatient and ED data were used to evaluate hospital‐level resource utilization for children 1 to 18 years of age discharged January 1, 2008 to June 30, 2014 with a diagnosis of pneumonia (International Classification of Diseases, 9th Revision [ICD‐9] codes 480.x‐486.x, 487.0).[13] Our goal was to identify previously healthy children with uncomplicated pneumonia, so we excluded patients with complex chronic conditions,[14] billing charges for intensive care management and/or pleural drainage procedure (ICD‐9 codes 510.0, 510.9, 511.0, 511.1, 511.8, 511.9, 513.x) on day of admission or the next day, or prior pneumonia admission in the last 30 days. We studied 2 mutually exclusive populations: children with pneumonia treated in the ED (ie, patients who were evaluated in the ED and discharged to home), and children hospitalized with pneumonia, including those admitted through the ED.
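The diagnosis-code inclusion rule can be expressed as a small predicate. The sketch below is illustrative only: the function name `is_pneumonia_icd9` is hypothetical, and it assumes codes are stored as dotted strings such as "482.9", which may not match how PHIS stores them.

```python
def is_pneumonia_icd9(code: str) -> bool:
    """Return True for ICD-9 codes 480.x through 486.x, or exactly 487.0.

    Assumes a dotted string format (e.g., "482.9"); other storage formats
    would need to be normalized first.
    """
    category, _, sub = code.strip().partition(".")
    if not category.isdigit():
        return False  # V and E codes, or malformed input
    if 480 <= int(category) <= 486:
        return True
    return category == "487" and sub == "0"
```

Under this rule "486" and "482.9" qualify, as does "487.0" (influenza with pneumonia), whereas "487.1" does not.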
Guideline Publication and Study Periods
For an exploratory before and after comparison, patients were grouped into 2 cohorts based on a guideline online publication date of August 1, 2011: preguideline (January 1, 2008 to July 31, 2011) and postguideline (August 1, 2011 to June 30, 2014).
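The cohort assignment can be sketched as a date comparison. The names `study_period` and `months_inclusive` below are hypothetical; the dates are those stated in the text.

```python
from datetime import date
from typing import Optional

STUDY_START = date(2008, 1, 1)
GUIDELINE_DATE = date(2011, 8, 1)   # online publication date used as the cut point
STUDY_END = date(2014, 6, 30)

def study_period(discharge: date) -> Optional[str]:
    """Classify a discharge date as preguideline or postguideline.

    Returns None for dates outside the study window.
    """
    if discharge < STUDY_START or discharge > STUDY_END:
        return None
    return "preguideline" if discharge < GUIDELINE_DATE else "postguideline"

def months_inclusive(first: date, last: date) -> int:
    """Number of calendar months from first to last, counting both ends."""
    return (last.year - first.year) * 12 + (last.month - first.month) + 1

pre_months = months_inclusive(STUDY_START, date(2011, 7, 31))
post_months = months_inclusive(GUIDELINE_DATE, STUDY_END)
```

By this count the preguideline period spans 43 calendar months and the postguideline period 35, which is the number of monthly observations available on each side of the interruption.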
Study Outcomes
The measured outcomes were the monthly proportion of pneumonia patients for whom specific diagnostic tests were performed, as determined from billing data. The diagnostic tests evaluated were complete blood count (CBC), blood culture, C‐reactive protein (CRP), and chest radiograph (CXR). Standardized costs were also calculated from PHIS charges as previously described to standardize the cost of the individual tests and remove interhospital cost variation.[3]
Relationship of Repeat Testing and Severity of Illness
Because higher illness severity and clinical deterioration may warrant repeat testing, we also explored the association of repeat diagnostic testing for inpatients with severity of illness by using the following variables as measures of severity: length of stay (LOS), transfer to intensive care unit (ICU), or pleural drainage procedure after admission (>2 calendar days after admission). Repeat diagnostic testing was stratified by number of tests.
Statistical Analysis
The categorical demographic characteristics of the pre‐ and postguideline populations were summarized using frequencies and percentages, and compared using χ² tests. Continuous demographics were summarized with medians and interquartile ranges (IQRs) and compared with the Wilcoxon rank sum test. Segmented regression, clustered by hospital, was used to assess trends in monthly resource utilization as well as associated standardized costs before and after guidelines publication. To estimate the impact of the guidelines overall, we compared the observed diagnostic resource use at the end of the study period with expected use projected from trends in the preguidelines period (ie, if there were no new guidelines). Individual interrupted time series were also built for each hospital. From these models, we assessed which hospitals had a significant difference between the rate observed at the end of the study and that estimated from their preguideline trajectory. To assess the relationship between the number of positive improvements at a hospital and hospital characteristics, we used Spearman's correlation and Kruskal‐Wallis tests. All analyses were performed with SAS version 9.3 (SAS Institute, Inc., Cary, NC), and P values <0.05 were considered statistically significant. In accordance with the policies of the Cincinnati Children's Hospital Medical Center Institutional Review Board, this research, using a deidentified dataset, was not considered human subjects research.
CXRs obtained in the ED did not decrease over the entire study period. The rates of CXR use (close to 80%) seen in our study are similar to prior ED studies.[5, 16] Management of children with CAP in the ED might be different than outpatient primary care management because (1) unlike primary care providers, ED providers do not have an established relationship with their patients and do not have the opportunity for follow‐up and serial exams, making them less likely to tolerate diagnostic uncertainty; and (2) ED providers may see sicker patients. However, use of CXR in the ED does represent an opportunity for further study to understand if decreased utilization is feasible without adversely impacting clinical outcomes.
The CAP guidelines provide a strong recommendation to obtain blood culture in moderate to severe pneumonia. Despite this, blood culture utilization declined after guideline publication. Less than 10% of children hospitalized with uncomplicated CAP have positive blood cultures, which calls into question the utility of blood cultures for all admitted patients.[17, 18, 19] The recent EPIC (Epidemiology of Pneumonia in the Community) study showed that a majority of children hospitalized with pneumonia do not have growth of bacteria in culture, but there may be a role for blood cultures in patients with a strong suspicion of complicated CAP or in the patient with moderate to severe disease.[20] In addition to blood cultures, the guidelines also recommend CBC and CXR in moderate to severely ill children. This observed decline in testing in CBC and CXR may be related to individual physician assessments of which patients are moderately to severely ill, as the guidelines do not recommend testing for children with less severe disease. Our exclusion of patients requiring intensive care management or pleural drainage on admission might have selected children with a milder course of illness, although still requiring admission.
The guidelines discourage repeat diagnostic testing among children hospitalized with CAP who are improving. In this study, repeat CXR and CBC occurred in approximately 20% of patients, but repeat blood culture and CRP was much lower. As with initial diagnostic testing for inpatients with CAP, the rates of some repeat testing decreased with the guidelines. However, those with repeat testing had longer LOS and were more likely to require ICU transfer or a pleural drainage procedure compared to children without repeat testing. This suggests that repeat testing is used more often in children with a severe presentation or a worsening clinical course, and not done routinely on hospitalized patients.
The financial impact of decreased testing is modest, because the tests themselves are relatively inexpensive. However, the lack of substantial cost savings should not preclude efforts to continue to improve adherence to the guidelines. Not only is increased testing associated with higher hospitalization rates,[5] potentially yielding higher costs and family stress, increased testing may also lead to patient discomfort and possibly increased radiation exposure through chest radiography.
Many of the diagnostic testing recommendations in the CAP guidelines are based on weak evidence, which may contribute to the lack of substantial adoption. Nevertheless, adherence to guideline recommendations requires sustained effort on the part of individual physicians that should be encouraged through institutional support.[21] Continuous education and clinical decision support, as well as reminders in the electronic medical record, would make guideline recommendations more visible and may help overcome the inertia of previous practice.[15] The hospital‐level heat map (Figure 3) included in this study demonstrates that the impact of the guidelines was variable across sites. Although a few sites had decreased diagnostic testing in many areas with no increased testing in any category, there were several sites that had no improvement in any diagnostic testing category. In addition, hospital‐level factors like size, geography, and insurance status were not associated with number of improvements. To better understand drivers of change at individual hospitals, future studies should evaluate specific strategies utilized by the rapid guideline adopters.
This study is subject to several limitations. The use of ICD‐9 codes to identify patients with CAP may not capture all patients with this diagnosis; however, these codes have been previously validated.[13] Additionally, because patients were identified using ICD‐9 coding assigned at the time of discharge, testing performed in the ED setting may not reflect care for a child with known pneumonia, but rather may reflect testing for a child with fever or other signs of infection. PHIS collects data from freestanding children's hospitals, which care for a majority of children with CAP in the US, but our findings may not be generalizable to other hospitals. In addition, we did not examine drivers of trends within individual institutions. We did not have detailed information to examine whether the PHIS hospitals in our study had actively worked to adopt the CAP guidelines. We were also unable to assess physician's familiarity with guidelines or the level of disagreement with the recommendations. Furthermore, the PHIS database does not permit detailed correlation of diagnostic testing with clinical parameters. In contrast to the diagnostic testing evaluated in this study, which is primarily discouraged by the IDSA/PIDS guidelines, respiratory viral testing for children with CAP is recommended but could not be evaluated, as data on such testing are not readily available in PHIS.
CONCLUSION
Publication of the IDSA/PIDS evidence‐based guidelines for the management of CAP was associated with modest, variable changes in use of diagnostic testing. Further adoption of the CAP guidelines should reduce variation in care and decrease unnecessary resource utilization in the management of CAP. Our study demonstrates that efforts to promote decreased resource utilization should target specific situations (eg, repeat testing for inpatients who are improving). Adherence to guidelines may be improved by the adoption of local practices that integrate and improve daily workflow, like order sets and clinical decision support tools.
Disclosure: Nothing to report.
Overutilization of resources is a significant, yet underappreciated, problem in medicine. Many interventions target underutilization (eg, immunizations) or misuse (eg, antibiotic prescribing for viral pharyngitis), yet overutilization remains a significant contributor to healthcare waste.[1] In an effort to reduce waste, the Choosing Wisely campaign created a work group to highlight areas of overutilization, specifically noting both diagnostic tests and therapies for common pediatric conditions with no proven benefit and possible harm to the patient.[2] Respiratory illnesses have been a target of many quality‐improvement efforts, and pneumonia represents a common diagnosis in pediatrics.[3] The use of diagnostic testing for pneumonia is an area where care can be optimized and aligned with evidence.
Laboratory testing and diagnostic imaging are routinely used for the management of children with community‐acquired pneumonia (CAP). Several studies have documented substantial variability in the use of these resources for pneumonia management, with higher resource use associated with a higher chance of hospitalization after emergency department (ED) evaluation and a longer length of stay among those requiring hospitalization.[4, 5] This variation in diagnostic resource utilization has been attributed, at least in part, to a lack of consensus on the management of pneumonia. Given the wide variability in diagnostic testing and its potential consequences for patients presenting with pneumonia, efforts to standardize care offer an opportunity to improve healthcare value.
In August 2011, the first national, evidence‐based consensus guidelines for the management of childhood CAP were published jointly by the Pediatric Infectious Diseases Society (PIDS) and the Infectious Diseases Society of America (IDSA).[6] A primary focus of these guidelines was the recommendation for the use of narrow spectrum antibiotics for the management of uncomplicated pneumonia. Previous studies have assessed the impact of the publication of the PIDS/IDSA guidelines on empiric antibiotic selection for the management of pneumonia.[7, 8] In addition, the guidelines provided recommendations regarding diagnostic test utilization, in particular discouraging blood tests (eg, complete blood counts) and radiologic studies for nontoxic, fully immunized children treated as outpatients, as well as repeat testing for children hospitalized with CAP who are improving.
Although single centers have demonstrated changes in utilization patterns based on clinical practice guidelines,[9, 10, 11, 12] whether these guidelines have impacted diagnostic test utilization among US children with CAP on a larger scale remains unknown. Therefore, we sought to determine the impact of the PIDS/IDSA guidelines on the use of diagnostic testing among children with CAP using a national sample of US children's hospitals. Because the guidelines discourage repeat diagnostic testing in patients who are improving, we also evaluated the association between repeat diagnostic studies and severity of illness.
METHODS
This retrospective cohort study used data from the Pediatric Health Information System (PHIS) (Children's Hospital Association, Overland Park, KS). The PHIS database contains deidentified administrative data, detailing demographic, diagnostic, procedure, and billing data from 47 freestanding, tertiary care children's hospitals. This database accounts for approximately 20% of all annual pediatric hospitalizations in the United States. Data quality is ensured through a joint effort between the Children's Hospital Association and participating hospitals.
Patient Population
Data from 32 (of the 47) hospitals included in PHIS with complete inpatient and ED data were used to evaluate hospital‐level resource utilization for children 1 to 18 years of age discharged January 1, 2008 to June 30, 2014 with a diagnosis of pneumonia (International Classification of Diseases, 9th Revision [ICD‐9] codes 480.x‐486.x, 487.0).[13] Our goal was to identify previously healthy children with uncomplicated pneumonia, so we excluded patients with complex chronic conditions,[14] billing charges for intensive care management and/or pleural drainage procedure (ICD‐9 codes 510.0, 510.9, 511.0, 511.1, 511.8, 511.9, 513.x) on day of admission or the next day, or prior pneumonia admission in the last 30 days. We studied 2 mutually exclusive populations: children with pneumonia treated in the ED (ie, patients who were evaluated in the ED and discharged to home), and children hospitalized with pneumonia, including those admitted through the ED.
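The inclusion range quoted above (ICD‐9 codes 480.x‐486.x, 487.0) can be illustrated with a small matcher. This is only a sketch: the function name `is_pneumonia_code` is hypothetical, and it assumes codes are stored as dotted strings, which may not match how PHIS records them.

```python
import re

# Assumption: codes arrive as dotted ICD-9 strings such as "482.41" or "486".
# The pattern covers categories 480 through 486 (with or without a decimal
# subcode) plus the single code 487.0, as listed in the Patient Population text.
_PNEUMONIA_RE = re.compile(r"^(48[0-6](\.\d{1,2})?|487\.0)$")


def is_pneumonia_code(code: str) -> bool:
    """Return True if the code falls in the study's pneumonia inclusion range."""
    return bool(_PNEUMONIA_RE.match(code.strip()))


if __name__ == "__main__":
    for sample in ["481", "482.41", "486", "487.0", "487.1", "490"]:
        print(sample, is_pneumonia_code(sample))
```

Exclusion criteria (complex chronic conditions, early intensive care or pleural drainage, recent pneumonia admission) would need additional fields and are not modeled here.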
Guideline Publication and Study Periods
For an exploratory before and after comparison, patients were grouped into 2 cohorts based on a guideline online publication date of August 1, 2011: preguideline (January 1, 2008 to July 31, 2011) and postguideline (August 1, 2011 to June 30, 2014).
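The cohort split described above can be sketched as a date comparison. The function name `assign_cohort` and the use of the discharge date as the grouping variable are assumptions made for illustration; the dates themselves come from the text.

```python
from datetime import date
from typing import Optional

# Dates taken from the study description.
STUDY_START = date(2008, 1, 1)
GUIDELINE_DATE = date(2011, 8, 1)  # online publication of the guidelines
STUDY_END = date(2014, 6, 30)


def assign_cohort(discharge: date) -> Optional[str]:
    """Label a discharge as preguideline or postguideline.

    Returns None for dates outside the study window.
    """
    if discharge < STUDY_START or discharge > STUDY_END:
        return None
    return "preguideline" if discharge < GUIDELINE_DATE else "postguideline"


if __name__ == "__main__":
    print(assign_cohort(date(2011, 7, 31)))
    print(assign_cohort(date(2011, 8, 1)))
```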
Study Outcomes
The measured outcomes were the monthly proportion of pneumonia patients for whom specific diagnostic tests were performed, as determined from billing data. The diagnostic tests evaluated were complete blood count (CBC), blood culture, C‐reactive protein (CRP), and chest radiograph (CXR). Standardized costs were also calculated from PHIS charges as previously described to standardize the cost of the individual tests and remove interhospital cost variation.[3]
Relationship of Repeat Testing and Severity of Illness
Because higher illness severity and clinical deterioration may warrant repeat testing, we also explored the association of repeat diagnostic testing for inpatients with severity of illness by using the following variables as measures of severity: length of stay (LOS), transfer to intensive care unit (ICU), or pleural drainage procedure after admission (>2 calendar days after admission). Repeat diagnostic testing was stratified by number of tests.
Statistical Analysis
The categorical demographic characteristics of the pre‐ and postguideline populations were summarized using frequencies and percentages, and compared using χ² tests. Continuous demographics were summarized with medians and interquartile ranges (IQRs) and compared with the Wilcoxon rank sum test. Segmented regression, clustered by hospital, was used to assess trends in monthly resource utilization as well as associated standardized costs before and after guidelines publication. To estimate the impact of the guidelines overall, we compared the observed diagnostic resource use at the end of the study period with expected use projected from trends in the preguidelines period (ie, if there were no new guidelines). Individual interrupted time series were also built for each hospital. From these models, we assessed which hospitals had a significant difference between the rate observed at the end of the study and that estimated from their preguideline trajectory. To assess the relationship between the number of positive improvements at a hospital and hospital characteristics, we used Spearman's correlation and Kruskal‐Wallis tests. All analyses were performed with SAS version 9.3 (SAS Institute, Inc., Cary, NC), and P values <0.05 were considered statistically significant. In accordance with the policies of the Cincinnati Children's Hospital Medical Center Institutional Review Board, this research, using a deidentified dataset, was not considered human subjects research.
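The observed-versus-expected comparison described above can be sketched with a minimal interrupted time series fit. This is not the authors' SAS model: it ignores clustering by hospital, produces no P values, and runs on a synthetic, noise-free series whose coefficients are invented for illustration. It relies on the fact that, when both a level change and a trend change are allowed, the ordinary least squares point estimates equal those from fitting one line to each segment. The 43 preguideline and 35 postguideline months follow from the study dates.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def segmented_fit(monthly_pct, n_pre):
    """Fit pre and post segments and summarize them as segmented-regression terms."""
    months = list(range(len(monthly_pct)))
    a_pre, s_pre = fit_line(months[:n_pre], monthly_pct[:n_pre])
    a_post, s_post = fit_line(months[n_pre:], monthly_pct[n_pre:])
    last = months[-1]
    return {
        "baseline": a_pre,
        "pre_trend": s_pre,
        # Gap between the two fitted lines at the first postguideline month.
        "level_change": (a_post + s_post * n_pre) - (a_pre + s_pre * n_pre),
        "trend_change": s_post - s_pre,
        # Preguideline trend projected to the final month (no-guideline scenario).
        "expected_at_end": a_pre + s_pre * last,
        # Postguideline fit at the final month.
        "fitted_at_end": a_post + s_post * last,
    }


# Synthetic monthly utilization (%); the coefficients are made up, not study data.
N_PRE, N_POST = 43, 35  # Jan 2008 to Jul 2011, Aug 2011 to Jun 2014
series = [50.0 - 0.05 * t for t in range(N_PRE)]
series += [
    50.0 - 0.05 * t - 1.5 - 0.10 * (t - N_PRE)
    for t in range(N_PRE, N_PRE + N_POST)
]
result = segmented_fit(series, N_PRE)

if __name__ == "__main__":
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

The gap between `expected_at_end` and `fitted_at_end` plays the role of the "without guideline" versus "with guideline" estimates reported at the end of the study period.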
RESULTS
There were 275,288 hospital admissions meeting study inclusion criteria of 1 to 18 years of age with a diagnosis of pneumonia from 2008 to 2014. Of these, 54,749 met exclusion criteria (1874 had pleural drainage procedure on day 0 or 1, 51,306 had complex chronic conditions, 1569 were hospitalized with pneumonia in the last 30 days). Characteristics of the remaining 220,539 patients in the final sample are shown in Table 1. The median age was 4 years (IQR, 2–7 years); a majority of the children were male (53%) and had public insurance (58%). There were 128,855 patients in the preguideline period (January 1, 2008 to July 31, 2011) and 91,684 in the postguideline period (August 1, 2011 to June 30, 2014).
Characteristic | Overall | Preguideline | Postguideline | P |
---|---|---|---|---|
No. of discharges | 220,539 | 128,855 | 91,684 | |
Type of encounter | | | | |
ED only | 150,215 (68.1) | 88,790 (68.9) | 61,425 (67) | <0.001 |
Inpatient | 70,324 (31.9) | 40,065 (31.1) | 30,259 (33) | |
Age | | | | |
1–4 years | 129,360 (58.7) | 77,802 (60.4) | 51,558 (56.2) | <0.001 |
5–9 years | 58,609 (26.6) | 32,708 (25.4) | 25,901 (28.3) | |
10–18 years | 32,570 (14.8) | 18,345 (14.2) | 14,225 (15.5) | |
Median [IQR] | 4 [2–7] | 3 [2–7] | 4 [2–7] | <0.001 |
Gender | | | | |
Male | 116,718 (52.9) | 68,319 (53) | 48,399 (52.8) | 0.285 |
Female | 103,813 (47.1) | 60,532 (47) | 43,281 (47.2) | |
Race | | | | |
Non‐Hispanic white | 84,423 (38.3) | 47,327 (36.7) | 37,096 (40.5) | <0.001 |
Non‐Hispanic black | 60,062 (27.2) | 35,870 (27.8) | 24,192 (26.4) | |
Hispanic | 51,184 (23.2) | 31,167 (24.2) | 20,017 (21.8) | |
Asian | 6,444 (2.9) | 3,691 (2.9) | 2,753 (3) | |
Other | 18,426 (8.4) | 10,800 (8.4) | 7,626 (8.3) | |
Payer | | | | |
Government | 128,047 (58.1) | 70,742 (54.9) | 57,305 (62.5) | <0.001 |
Private | 73,338 (33.3) | 44,410 (34.5) | 28,928 (31.6) | |
Other | 19,154 (8.7) | 13,703 (10.6) | 5,451 (5.9) | |
Disposition | | | | |
HHS | 684 (0.3) | 411 (0.3) | 273 (0.3) | <0.001 |
Home | 209,710 (95.1) | 123,236 (95.6) | 86,474 (94.3) | |
Other | 9,749 (4.4) | 4,962 (3.9) | 4,787 (5.2) | |
SNF | 396 (0.2) | 246 (0.2) | 150 (0.2) | |
Season | | | | |
Spring | 60,171 (27.3) | 36,709 (28.5) | 23,462 (25.6) | <0.001 |
Summer | 29,891 (13.6) | 17,748 (13.8) | 12,143 (13.2) | |
Fall | 52,161 (23.7) | 28,332 (22) | 23,829 (26) | |
Winter | 78,316 (35.5) | 46,066 (35.8) | 32,250 (35.2) | |
LOS | | | | |
1–3 days | 204,812 (92.9) | 119,497 (92.7) | 85,315 (93.1) | <0.001 |
4–6 days | 10,454 (4.7) | 6,148 (4.8) | 4,306 (4.7) | |
7+ days | 5,273 (2.4) | 3,210 (2.5) | 2,063 (2.3) | |
Median [IQR] | 1 [1–1] | 1 [1–1] | 1 [1–1] | 0.144 |
Admitted patients, median [IQR] | 2 [1–3] | 2 [1–3] | 2 [1–3] | <0.001 |
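The cohort counts in the Results text and the encounter rows of Table 1 can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the article; the variable and function names are hypothetical, and it treats the three exclusion categories as non-overlapping, which is what the reported total of 54,749 implies.

```python
# Counts quoted in the Results text.
ADMISSIONS_MEETING_INCLUSION = 275_288
EXCLUDED = {
    "pleural drainage on day 0 or 1": 1_874,
    "complex chronic conditions": 51_306,
    "pneumonia admission in prior 30 days": 1_569,
}

# Table 1 encounter counts as (ED only, inpatient).
TABLE1_ENCOUNTERS = {
    "overall": (150_215, 70_324),
    "preguideline": (88_790, 40_065),
    "postguideline": (61_425, 30_259),
}


def final_sample(included, excluded):
    """Subtract the summed exclusions from the admissions meeting inclusion."""
    return included - sum(excluded.values())


def encounter_totals(encounters):
    """Add ED-only and inpatient counts for each column of Table 1."""
    return {period: ed + inpatient for period, (ed, inpatient) in encounters.items()}


if __name__ == "__main__":
    print(final_sample(ADMISSIONS_MEETING_INCLUSION, EXCLUDED))
    print(encounter_totals(TABLE1_ENCOUNTERS))
```

Both checks reproduce the 220,539 discharges reported as the final sample, and the preguideline and postguideline columns sum to the same figure.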
Discharged From the ED
Throughout the study, utilization of CBC, blood cultures, and CRP was <20%, whereas CXR use was >75%. In segmented regression analysis, CRP utilization was relatively stable before the guidelines publication. However, by the end of the study period, the projected estimate of CRP utilization without guidelines (expected) was 2.9% compared with 4.8% with the guidelines (observed) (P < 0.05) (Figure 1). A similar pattern of higher rates of diagnostic utilization after the guidelines compared with projected estimates without the guidelines was also seen in the ED utilization of CBC, blood cultures, and CXR (Figure 1); however, these trends did not achieve statistical significance. Table 2 provides specific values. Using a standard cost of $19.52 for CRP testing, annual costs across all hospitals increased by $11,783 for ED evaluation of CAP.
Test | Baseline (%) | Preguideline Trend | Level Change at Guideline | Change in Trend After Guideline | Estimate at End of Study Without Guideline (%)* | Estimate at End of Study With Guideline (%)* | P |
---|---|---|---|---|---|---|---|
ED‐only encounters | | | | | | | |
Blood culture | 14.6 | 0.1 | 0.8 | 0.1 | 5.5 | 8.6 | NS |
CBC | 19.2 | 0.1 | 0.4 | 0.1 | 10.7 | 14.0 | NS |
CRP | 5.4 | 0.0 | 0.6 | 0.1 | 2.9 | 4.8 | <0.05 |
Chest x‐ray | 85.4 | 0.1 | 0.1 | 0.0 | 80.9 | 81.1 | NS |
Inpatient encounters | | | | | | | |
Blood culture | 50.6 | 0.0 | 1.7 | 0.2 | 49.2 | 41.4 | <0.05 |
Repeat blood culture | 6.5 | 0.0 | 1.0 | 0.1 | 8.9 | 5.8 | NS |
CBC | 65.2 | 0.0 | 3.1 | 0.0 | 65.0 | 62.2 | NS |
Repeat CBC | 23.4 | 0.0 | 4.2 | 0.0 | 20.8 | 16.0 | NS |
CRP | 25.7 | 0.0 | 1.1 | 0.0 | 23.8 | 23.5 | NS |
Repeat CRP | 12.5 | 0.1 | 2.2 | 0.1 | 7.1 | 7.3 | NS |
Chest x‐ray | 89.4 | 0.1 | 0.7 | 0.0 | 85.4 | 83.9 | NS |
Repeat chest x‐ray | 25.5 | 0.0 | 2.0 | 0.1 | 24.1 | 17.7 | <0.05 |
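The end-of-study estimates in Table 2 can be turned into absolute and relative differences for the three tests that reached statistical significance. This is an illustrative calculation only: the function name `guideline_effect` is hypothetical, and the relative difference is a derived figure that the article does not report.

```python
# (without guideline, with guideline) end-of-study estimates in %, from Table 2.
ESTIMATES = {
    "ED CRP": (2.9, 4.8),
    "Inpatient blood culture": (49.2, 41.4),
    "Inpatient repeat chest x-ray": (24.1, 17.7),
}


def guideline_effect(expected, observed):
    """Return (absolute difference in percentage points, relative difference in %)."""
    absolute = round(observed - expected, 1)
    relative = round(100 * (observed - expected) / expected, 1)
    return absolute, relative


if __name__ == "__main__":
    for test_name, (expected, observed) in ESTIMATES.items():
        absolute, relative = guideline_effect(expected, observed)
        print(f"{test_name}: {absolute:+.1f} points ({relative:+.1f}% relative)")
```

Negative values indicate lower utilization with the guidelines than projected without them; the ED CRP row moves in the opposite direction.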

Inpatient Encounters
In the segmented regression analysis of children hospitalized with CAP, guideline publication was associated with changes in the monthly use of some diagnostic tests. For example, by the end of the study period, the use of blood culture was 41.4% (observed), whereas the projected estimated use in the absence of the guidelines was 49.2% (expected) (P < 0.05) (Figure 2). Table 2 includes the data for the other tests (CBC, CRP, and CXR), which showed a similar pattern of lower utilization rates after the guidelines compared with expected utilization rates without the guidelines; however, these trends did not achieve statistical significance. Among repeat tests for inpatients, only the change in repeat CXR achieved statistical significance (P < 0.05), with utilization rates of 17.7% with the guidelines (observed) compared with 24.1% without the guidelines (expected).

To better understand the use of repeat testing, a comparison of severity outcomes (LOS, ICU transfer, and pleural drainage procedures) was performed between patients with no repeat testing (70%) and patients with 1 or more repeat tests (30%). Patients with repeat testing had longer LOS (no repeat testing, LOS 1 [IQR, 1–2] vs 1 repeat test, LOS 3 [IQR, 2–4] vs 2+ repeat tests, LOS 5 [IQR, 3–8]), a higher rate of ICU transfer (no repeat testing, 4.6% vs 1 repeat test, 14.6% vs 2+ repeat tests, 35.6%), and a higher rate of pleural drainage (no repeat testing, 0% vs 1 repeat test, 0.1% vs 2+ repeat tests, 5.9%) (all P < 0.001).
Using standard costs of $37.57 for blood cultures and $73.28 for CXR, annual costs for children with CAP across all hospitals decreased by $91,512 due to decreased utilization of blood cultures, and by $146,840 due to decreased utilization of CXR.
Hospital‐Level Variation in the Impact of the National Guideline
Figure 3 is a visual representation (heat map) of the impact of the guidelines at the hospital level at the end of the study, derived from the individual interrupted time series. The heat map shows wide variability between hospitals in the impact of the guideline on each test in each setting (ED or inpatient). By diagnostic test, 7 hospitals significantly decreased utilization of blood cultures for inpatients, and 5 hospitals significantly decreased utilization of repeat blood cultures and repeat CXR. The number of positive improvements at a hospital was not significantly associated with region (P = 0.974), number of CAP cases (P = 0.731), or percentage of public insurance (P = 0.241).

DISCUSSION
This study complements previous assessments by evaluating the impact of the 2011 IDSA/PIDS consensus guidelines on the management of children with CAP cared for at US children's hospitals. Prior studies have shown increased use of narrow‐spectrum antibiotics for children with CAP after the publication of these guidelines.[7] The current study focused on diagnostic testing for CAP before and after the publication of the 2011 guidelines. In the ED setting, use of some diagnostic tests (blood culture, CBC, CXR, CRP) was declining prior to guideline publication, but appeared to plateau and/or increase after 2011. Among children admitted with CAP, use of diagnostic testing was relatively stable prior to 2011, and use of these tests (blood culture, CBC, CXR, CRP) declined after guideline publication. Overall, changes in diagnostic resource utilization 3 years after publication were modest, with few changes achieving statistical significance. The impact of the guidelines on test use varied widely between hospitals.
For outpatients, including those managed in the ED, the PIDS/IDSA guidelines recommend limited laboratory testing in nontoxic, fully immunized patients. The guidelines discourage the use of diagnostic testing among outpatients because of their low yield (eg, blood culture), and because test results may not impact management (eg, CBC).[6] In the years prior to guideline publication, there was already a declining trend in testing rates, including blood cultures, CBC, and CRP, for patients in the ED. After guideline publication, the rate of blood cultures, CBC, and CRP increased, but only the increase in CRP utilization achieved statistical significance. We would not expect utilization for common diagnostic tests (eg, CBC for outpatients with CAP) to be at or close to 0% because of the complexity of clinical decision making regarding admission that factors in aspects of patient history, exam findings, and underlying risk.[15] ED utilization of blood cultures was <10%, CBC <15%, and CRP <5% after guideline publication, which may represent the lowest testing limit that could be achieved.
CXRs obtained in the ED did not decrease over the entire study period. The rates of CXR use (close to 80%) seen in our study are similar to prior ED studies.[5, 16] Management of children with CAP in the ED might be different from outpatient primary care management because (1) unlike primary care providers, ED providers do not have an established relationship with their patients and do not have the opportunity for follow‐up and serial exams, making them less likely to tolerate diagnostic uncertainty; and (2) ED providers may see sicker patients. However, use of CXR in the ED does represent an opportunity for further study to understand if decreased utilization is feasible without adversely impacting clinical outcomes.
The CAP guidelines provide a strong recommendation to obtain blood culture in moderate to severe pneumonia. Despite this, blood culture utilization declined after guideline publication. Less than 10% of children hospitalized with uncomplicated CAP have positive blood cultures, which calls into question the utility of blood cultures for all admitted patients.[17, 18, 19] The recent EPIC (Epidemiology of Pneumonia in the Community) study showed that a majority of children hospitalized with pneumonia do not have growth of bacteria in culture, but there may be a role for blood cultures in patients with a strong suspicion of complicated CAP or in the patient with moderate to severe disease.[20] The guidelines also recommend CBC and CXR in moderately to severely ill children. The observed decline in CBC and CXR testing may be related to individual physician assessments of which patients are moderately to severely ill, as the guidelines do not recommend testing for children with less severe disease. Our exclusion of patients requiring intensive care management or pleural drainage on admission might have selected children with a milder course of illness, although still requiring admission.
The guidelines discourage repeat diagnostic testing among children hospitalized with CAP who are improving. In this study, repeat CXR and CBC occurred in approximately 20% of patients, but repeat blood culture and CRP testing were much less common. As with initial diagnostic testing for inpatients with CAP, the rates of some repeat testing decreased with the guidelines. However, those with repeat testing had longer LOS and were more likely to require ICU transfer or a pleural drainage procedure compared to children without repeat testing. This suggests that repeat testing is used more often in children with a severe presentation or a worsening clinical course, and is not done routinely on hospitalized patients.
The financial impact of decreased testing is modest, because the tests themselves are relatively inexpensive. However, the lack of substantial cost savings should not preclude efforts to continue to improve adherence to the guidelines. Not only is increased testing associated with higher hospitalization rates,[5] potentially yielding higher costs and family stress, increased testing may also lead to patient discomfort and possibly increased radiation exposure through chest radiography.
Many of the diagnostic testing recommendations in the CAP guidelines are based on weak evidence, which may contribute to the lack of substantial adoption. Nevertheless, adherence to guideline recommendations requires sustained effort on the part of individual physicians that should be encouraged through institutional support.[21] Continuous education and clinical decision support, as well as reminders in the electronic medical record, would make guideline recommendations more visible and may help overcome the inertia of previous practice.[15] The hospital‐level heat map (Figure 3) included in this study demonstrates that the impact of the guidelines was variable across sites. Although a few sites had decreased diagnostic testing in many areas with no increased testing in any category, there were several sites that had no improvement in any diagnostic testing category. In addition, hospital‐level factors like size, geography, and insurance status were not associated with number of improvements. To better understand drivers of change at individual hospitals, future studies should evaluate specific strategies utilized by the rapid guideline adopters.
This study is subject to several limitations. The use of ICD‐9 codes to identify patients with CAP may not capture all patients with this diagnosis; however, these codes have been previously validated.[13] Additionally, because patients were identified using ICD‐9 coding assigned at the time of discharge, testing performed in the ED setting may not reflect care for a child with known pneumonia, but rather may reflect testing for a child with fever or other signs of infection. PHIS collects data from freestanding children's hospitals, which care for a majority of children with CAP in the US, but our findings may not be generalizable to other hospitals. In addition, we did not examine drivers of trends within individual institutions. We did not have detailed information to examine whether the PHIS hospitals in our study had actively worked to adopt the CAP guidelines. We were also unable to assess physicians' familiarity with guidelines or the level of disagreement with the recommendations. Furthermore, the PHIS database does not permit detailed correlation of diagnostic testing with clinical parameters. In contrast to the diagnostic testing evaluated in this study, which is primarily discouraged by the IDSA/PIDS guidelines, respiratory viral testing for children with CAP is recommended but could not be evaluated, as data on such testing are not readily available in PHIS.
CONCLUSION
Publication of the IDSA/PIDS evidence‐based guidelines for the management of CAP was associated with modest, variable changes in use of diagnostic testing. Further adoption of the CAP guidelines should reduce variation in care and decrease unnecessary resource utilization in the management of CAP. Our study demonstrates that efforts to promote decreased resource utilization should target specific situations (eg, repeat testing for inpatients who are improving). Adherence to guidelines may be improved by the adoption of local practices that integrate and improve daily workflow, like order sets and clinical decision support tools.
Disclosure: Nothing to report.
- Eliminating waste in US health care. JAMA. 2012;307(14):1513–1516.
- Choosing wisely in pediatric hospital medicine: five opportunities for improved healthcare value. J Hosp Med. 2013;8(9):479–485.
- Pediatric Research in Inpatient Settings (PRIS) Network. Prioritization of comparative effectiveness research topics in hospital pediatrics. Arch Pediatr Adolesc Med. 2012;166(12):1155–1164.
- Variability in processes of care and outcomes among children hospitalized with community‐acquired pneumonia. Pediatr Infect Dis J. 2012;31(10):1036–1041.
- Variation in emergency department diagnostic testing and disposition outcomes in pneumonia. Pediatrics. 2013;132(2):237–244.
- Pediatric Infectious Diseases Society and the Infectious Diseases Society of America. The management of community‐acquired pneumonia in infants and children older than 3 months of age: clinical practice guidelines by the Pediatric Infectious Diseases Society and the Infectious Diseases Society of America. Clin Infect Dis. 2011;53(7):e25–e76.
- Impact of Infectious Diseases Society of America/Pediatric Infectious Diseases Society guidelines on treatment of community‐acquired pneumonia in hospitalized children. Clin Infect Dis. 2014;58(6):834–838.
- Antibiotic choice for children hospitalized with pneumonia and adherence to national guidelines. Pediatrics. 2015;136(1):44–52.
- Quality improvement methods increase appropriate antibiotic prescribing for childhood pneumonia. Pediatrics. 2013;131(5):e1623–e1631.
- Improvement methodology increases guideline recommended blood cultures in children with pneumonia. Pediatrics. 2015;135(4):e1052–e1059.
- Impact of a guideline on management of children hospitalized with community‐acquired pneumonia. Pediatrics. 2012;129(3):e597–e604.
- Effectiveness of antimicrobial guidelines for community‐acquired pneumonia in children. Pediatrics. 2012;129(5):e1326–e1333.
- Identifying pediatric community‐acquired pneumonia hospitalizations: accuracy of administrative billing codes. JAMA Pediatr. 2013;167(9):851–858.
- Pediatric complex chronic conditions classification system version 2: updated for ICD‐10 and complex medical technology dependence and transplantation. BMC Pediatr. 2014;14:199.
- Establishing superior benchmarks of care in clinical practice: a proposal to drive achievable health care value. JAMA Pediatr. 2015;169(4):301–302.
- Emergency department management of childhood pneumonia in the United States prior to publication of national guidelines. Acad Emerg Med. 2013;20(3):240–246.
- Prevalence of bacteremia in hospitalized pediatric patients with community‐acquired pneumonia. Pediatr Infect Dis J. 2013;32(7):736–740.
- The prevalence of bacteremia in pediatric patients with community‐acquired pneumonia: guidelines to reduce the frequency of obtaining blood cultures. Hosp Pediatr. 2013;3(2):92–96.
- Do all children hospitalized with community‐acquired pneumonia require blood cultures? Hosp Pediatr. 2013;3(2):177–179.
- CDC EPIC Study Team. Community‐acquired pneumonia requiring hospitalization among U.S. children. N Engl J Med. 2015;372(9):835–845.
- Influence of hospital guidelines on management of children hospitalized with pneumonia. Pediatrics. 2012;130(5):e823–e830.
© 2015 Society of Hospital Medicine
Yield of Blood Cultures
Blood cultures are the gold standard test for the diagnosis of bloodstream infections (BSI). Given the high mortality associated with BSI,[1, 2, 3] physicians have a low threshold to obtain blood cultures.[4, 5] Unfortunately, physicians are poor at predicting which hospitalized patients have BSI,[6, 7] and published guidelines do not provide clear indications for the use of blood cultures.[8] As a result, current practice follows a “culture if spikes” paradigm, whereby inpatient providers often obtain blood cultures in the setting of any fever. This is the most common anticipatory guidance communicated between providers, involving up to 75% of written sign‐out instructions.[9] The result is a low rate of true positive blood cultures (5%–10%)[10, 11, 12] with only a slightly lower rate of false positive blood cultures (contaminants).[12, 13, 14] False positive blood cultures often lead to repeat blood cultures, unnecessary antibiotic use, and increased hospital cost and length of stay.[13]
Over the last several years, there has been an increased emphasis on practicing high‐value care by avoiding unnecessary and duplicate testing. In 2012, the American Board of Internal Medicine introduced the Choosing Wisely campaign, with specific initiatives to reduce medical waste and overuse. Given the low yield of blood cultures, guidance on patients in whom blood cultures are most appropriate would be welcome. Studies assessing risk factors for bacteremia have led to the development of multiple stratification systems without overall consensus.[10, 15, 16, 17, 18, 19, 20] Furthermore, much of the current literature on blood culture utilization includes cultures drawn in the emergency department (ED) or intensive care unit (ICU) setting.[10, 18, 19, 20] Less is known regarding the rates of positivity and utility for blood cultures drawn on patients hospitalized on an acute care medical ward.
Our study had 3 main objectives: (1) determine the rates of true positive and false positive blood cultures among hospitalized medical patients, (2) determine the ability of physician‐selected indications and patient characteristics to predict BSI, and (3) identify populations in which blood cultures may not be necessary.
PATIENTS AND METHODS
Study Design
We conducted a prospective cohort study of all hospitalized medical patients for whom blood cultures were ordered and received by the microbiology laboratory. This investigation was approved by the Veterans Affairs (VA) Boston Healthcare System institutional review board.
Patients and Setting
During a 7‐month period (October 1, 2014–April 15, 2015), all blood culture orders were reviewed for indication and result each day (and on Monday for weekend blood cultures) at a large VA teaching hospital (approximately 6200 admissions each year). As part of the electronic medical order, providers selected from among a list of common indications. Options included various clinical signs and diagnoses, and providers could select more than 1 indication. Each blood culture order triggered a phlebotomist to draw 2 separate blood culture sets (each set consisted of 1 aerobic and 1 anaerobic blood culture bottle).
Inclusion criteria included admission to 1 of 5 general medical service teams or 1 of 2 cardiology teams. Given that the study hospital does not have dedicated subspecialty service teams (with the exception of cardiology), all patients with medical diagnoses are cared for on the general medical service.
Predictor and Outcome Variables
Patient characteristics were obtained via chart review. Fever was defined as a single temperature greater than 100.4°F within 24 hours prior to a blood culture order. Leukocytosis was defined as a white blood cell count greater than 10,000 within 24 hours of a blood culture order. Patients were considered to have received antibiotics if an order for an antibacterial or antifungal agent was active within 72 hours prior to the blood culture order. Each blood culture order was assigned a working diagnosis that prompted the order. These working diagnoses were identified by chart review as documented under the provider's assessment and plan and were not necessarily the primary diagnosis prompting hospitalization.
Classification of positive blood cultures into true and false positive was determined by consensus among the microbiology and the infectious disease departments after review of clinical and laboratory data, consistent with a previously established practice at the hospital. A true negative culture consisted of any culture that was not a true positive or a false positive. A blood culture order was defined as an electronic entry and included all sets of blood cultures drawn as a result of that order. Consistent with previous literature, a blood culture episode was defined as all blood cultures ordered within a 48‐hour period starting at the time of the first culture.[10] For patients with multiple admissions during the study period, each admission was considered a unique patient.
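The episode definition above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the study's actual procedure: the function name `group_into_episodes`, the sample timestamps, and the choice to count an order placed exactly 48 hours after the first order as part of the same episode are all hypothetical.

```python
from datetime import datetime, timedelta

EPISODE_WINDOW = timedelta(hours=48)


def group_into_episodes(order_times):
    """Group one patient's blood culture order times into episodes.

    An episode opens at its first order and absorbs every later order
    placed within 48 hours of that first order.
    """
    episodes = []
    for t in sorted(order_times):
        # Assumption: an order exactly 48 hours after the episode's first
        # order is still counted as part of that episode.
        if episodes and t - episodes[-1][0] <= EPISODE_WINDOW:
            episodes[-1].append(t)
        else:
            episodes.append([t])
    return episodes


# Hypothetical order times for a single admission (not study data).
orders = [
    datetime(2014, 10, 1, 8, 0),
    datetime(2014, 10, 2, 20, 0),  # 36 hours after the first order: same episode
    datetime(2014, 10, 4, 9, 0),   # 73 hours after the first order: new episode
]
episodes = group_into_episodes(orders)
print(len(orders), "orders ->", len(episodes), "episodes")
```

Because the window is anchored to the first order of each episode rather than to the most recent order, a long run of closely spaced orders still splits into a new episode once 48 hours have passed.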
Statistical Analysis
Rates of true and false positivity of blood cultures were calculated. In addition, positive likelihood ratios (LR+) for true positive blood cultures were calculated using JMP statistical software (SAS Institute, Inc., Cary, NC).
RESULTS
Overall
A total of 576 blood culture orders (467 blood culture episodes) were completed on 363 hospitalized medical patients during the study period. Five hundred forty orders were placed on patients on general medical services and 36 orders on patients on the cardiology services. Four hundred eighty‐seven (85%) orders resulted in 2 sets of cultures being drawn, 87 (15%) resulted in 1 set of cultures, and 2 (0.3%) resulted in 3 sets of cultures. The median time between admission and culture draw was 2 days (range, 0–72 days), with 57% of cultures drawn during hospital day 0 to 2, 24.5% drawn between hospital day 3 to 7, and 19.4% drawn after hospital day 7. The average age of the patients was 70.4 years, and 94% were men. Additional patient characteristics are shown in Table 1.
Clinical Characteristic | Total, n = 363 (%) | True Positive Blood Cultures, n = 14 (%) | P Value |
---|---|---|---|
Mean age, y | 70.4 | 73.9 | 0.4 |
Male sex | 350 (96%) | 14 (100%) | 1 |
White race | 308 (85%) | 11 (79%) | 0.7 |
Location prior to admission | |||
Community | 276 (76%) | 11 (79%) | 1 |
Hospital | 51 (14%) | 1 (7%) | 0.7 |
Long‐term care facility | 36 (10%) | 2 (14%) | 0.6 |
Comorbidities | |||
Diabetes | 136 (37%) | 5 (36%) | 1 |
Malignancy | 100 (28%) | 4 (31%) | 1 |
Alcohol abuse | 89 (25%) | 2 (14%) | 0.5 |
Cirrhosis | 31 (9%) | 1 (7%) | 1 |
End‐stage renal disease | 21 (6%) | 1 (7%) | 1 |
Active drug use* | 16 (4%) | 1 (7%) | 0.5 |
Catheter | 93 (26%) | 3 (21%) | 0.8 |
Recent hospitalization | 145 (40%) | 6 (43%) | 1 |
History of MRSA colonization | 72 (20%) | 5 (36%) | 0.16 |
Cultures drawn in emergency department | 69 (19%) | 6 (43%) | 0.03 |
The true positive and false positive rates per blood culture order were 3.6% (21/576) and 2.3% (13/576), respectively (Table 2). Similar values were seen per blood culture episode (3.4% and 2.7%, respectively). The true positive blood culture rates per order and episode were significantly lower than those drawn on emergency room patients during the study period (41/570, 7.2%, P < 0.05).
Category | Total, n (%) | True Positive, n (%) | False Positive, n (%) | True Negative, n (%) |
---|---|---|---|---|
Per patient | 363 | 14 (3.8) | 13 (3.6) | 336 (92.6) |
Per blood culture episode | 467 | 16 (3.4) | 13 (2.7) | 438 (93.8) |
Per blood culture order | 576 | 21 (3.6) | 13 (2.3) | 542 (94.1) |
Rates per blood culture order | ||||
Physician‐selected indication, n = 530 | ||||
Fever | 136 (25.6) | 3 (2.2) | 3 (2.2) | 130 (95.6) |
Fever and additional indication(s) | 118 (22.2) | 5 (4.2) | 3 (2.5) | 110 (93.2) |
Fever and leukocytosis | 50 (9.4) | 4 (8.0) | 3 (6.0) | 43 (86.0) |
Leukocytosis | 50 (9.4) | 2 (4.0) | 0 (0) | 48 (96.0) |
Follow‐up previous positive | 60 (11.3) | 7 (11.7) | 0 (0) | 53 (88.3) |
Working diagnosis, n = 576 | ||||
Pneumonia | 101 (17.5) | 0 (0) | 4 (3.9) | 97 (96.0) |
Bacteremia/endocarditis | 97 (16.8) | 12 (12.3) | 1 (1.0) | 84 (86.6) |
Urinary tract infection* | 95 (16.4) | 5 (5.3) | 2 (2.1) | 88 (92.6) |
Other infection | 46 (8.0) | 0 (0) | 0 (0) | 46 (100) |
Skin and soft‐tissue infection | 39 (6.8) | 1 (2.6) | 0 (0) | 38 (97.4) |
Neutropenic fever | 28 (4.9) | 0 (0) | 0 (0) | 28 (100) |
Sepsis | 27 (4.7) | 0 (0) | 0 (0) | 27 (100) |
Fever | 18 (3.1) | 1 (5.5) | 1 (5.5) | 16 (88.9) |
Bone and joint infection | 15 (2.6) | 1 (6.7) | 0 (0) | 14 (93.3) |
Postoperative fever | 9 (1.6) | 0 (0) | 0 (0) | 9 (100) |
Noninfectious diagnosis | 101 (17.5) | 1 (1.0) | 5 (5.0) | 95 (94.1) |
Antibiotic exposure | ||||
Yes | 354 (61.5) | 5 (1.4) | 5 (1.4) | 344 (97.1) |
No | 222 (38.6) | 16 (7.2) | 8 (3.6) | 198 (89.1) |
Previous documented positive culture via chart review | ||||
Yes | 155 (26.9) | 9 (5.8) | 2 (1.3) | 144 (92.9) |
No | 421 (73.1) | 12 (2.9) | 11 (2.6) | 398 (94.5) |
Predictor | LR+ (95% CI), True Positive Blood Culture | LR+ (95% CI), False Positive Blood Culture |
---|---|---|
Physician‐selected indication | ||
Fever | 0.6 (0.2–1.7) | 0.9 (0.3–2.5) |
Fever and additional indication(s) | 1.1 (0.5–2.4) | 1.0 (0.4–2.8) |
Fever and leukocytosis | 2.2 (0.9–5.6) | 2.5 (0.9–7.1) |
Leukocytosis | 1.1 (0.3–4.0) | 0.4 (0.0–5.6) |
Follow‐up previous positive | 3.4 (1.8–6.5) | 0.3 (0.0–4.7) |
Diagnosis | ||
Pneumonia | 0.1 (0.0–1.9) | 1.8 (0.8–4.1) |
Bacteremia/endocarditis | 3.7 (2.5–5.7) | 0.5 (0.1–3.0) |
Urinary tract infection | 1.5 (0.7–3.2) | 0.9 (0.3–3.4) |
Noninfectious diagnosis | 0.3 (0.0–1.8) | 2.3 (1.1–4.6) |
Recent antibiotic exposure | ||
Yes | 0.4 (0.2–0.8) | 0.6 (0.3–1.2) |
No | 2.1 (1.6–2.7) | 1.6 (1.0–2.5) |
No with fever | 2.4 (1.2–4.9) | 0.8 (0.2–3.6) |
No with fever and leukocytosis | 5.6 (1.8–18.2) | 0.4 (0.1–2.6) |
Prior positive cultures | ||
Yes | 1.6 (1.0–2.7) | 0.6 (0.2–2.0) |
For the true positive cultures, gram‐positive organisms were isolated most frequently (14/21, 67%) with Staphylococcus aureus identified in 2/21 (10%) positive cultures and Enterococcus faecalis identified in 7/21 (33%) positive cultures. Gram‐negative organisms were isolated in 6/21 (29%) cultures, and 1/21 (5%) culture grew 2 organisms (Enterococcus faecalis and Nocardia). The majority of false positive cultures isolated 1 or more species of coagulase‐negative Staphylococcus (11/13, 85%).
Predictors of True Bacteremia
The 4 most common working diagnoses prompting a blood culture order were pneumonia, bacteremia/endocarditis, urinary tract infection, and a noninfectious diagnosis (eg, syncope), with each prompting approximately 17% of the total orders (Table 2). Of these, only a primary diagnosis of bacteremia/endocarditis was predictive of a true positive culture, yielding a rate of 12.3% (LR+ 3.7, 95% confidence interval [CI]: 2.5‐5.7). No other diagnosis was predictive of true positivity. A diagnosis of pneumonia yielded no true positive and 4 false positive blood cultures (3.9%), whereas a noninfectious diagnosis yielded only 1 true positive (1.0%) and 5 false positives (5.0%). The positive likelihood ratios for these 2 diagnoses were 0.1 (95% CI: 0.00‐1.9) and 0.3 (95% CI: 0.04‐1.8), respectively.
Indications were selected for 530 of 576 (92%) blood culture orders (Table 2). The most common indication was fever alone (25.6%), followed by fever with an additional indication (22.2%), follow‐up positive blood cultures (11.3%), fever and leukocytosis (9.4%), and leukocytosis alone (9.4%). Only follow‐up positive blood cultures was predictive of a true positive, with a LR+ of 3.4 (95% CI: 1.8‐6.5).
A total of 14 patients (3.9%) had true positive blood cultures. For these patients, 10/14 (71%) had 1 true positive blood culture, 3/14 (21%) had 2 true positive blood cultures, and 1/14 (7%) had 5 true positive blood cultures. The average number of cultures drawn was 4.9. The clinical characteristic most predictive of a true positive blood culture was the absence of recent antibiotic administration. If the blood culture was ordered on a patient not receiving antibiotics (true positivity rate 7.2%, 16/222), the LR+ was 2.1 (95% CI: 1.6‐2.7). In a patient not receiving antibiotics who was also noted to have fever and leukocytosis (true positivity rate 17.6%, 3/17), the LR+ was 5.6 (95% CI: 1.8‐18.2). Conversely, patients receiving antibiotics were rarely found to have true positive blood cultures (true positivity rate 1.4%, 5/354) with a LR+ of 0.4 (95% CI: 0.2‐0.8).
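To make the likelihood ratios above concrete, the following sketch recomputes the LR+ for antibiotic exposure from the per-order counts in Table 2. It is not the authors' analysis, which was run in JMP. It assumes the standard definition of LR+ (the proportion of true positive orders with the characteristic divided by the proportion of all other orders with it) and a log-method 95% confidence interval; the source does not state which interval method was used, and the function name is hypothetical. With these counts, the results should come out close to the reported values of 2.1 and 0.4.

```python
import math


def positive_likelihood_ratio(tp_with, tp_total, other_with, other_total, z=1.96):
    """Return (LR+, lower, upper) for one characteristic.

    tp_with / tp_total:       true positive orders with the characteristic / all true positive orders
    other_with / other_total: remaining orders with the characteristic / all remaining orders
    The interval uses the log method, which is an assumption about the
    interval type rather than something stated in the article.
    """
    sensitivity = tp_with / tp_total
    false_positive_fraction = other_with / other_total
    lr = sensitivity / false_positive_fraction
    se_log = math.sqrt(1 / tp_with - 1 / tp_total + 1 / other_with - 1 / other_total)
    lower = math.exp(math.log(lr) - z * se_log)
    upper = math.exp(math.log(lr) + z * se_log)
    return lr, lower, upper


# Table 2, per blood culture order: 576 orders in total, 21 true positives.
TOTAL_ORDERS, TRUE_POSITIVES = 576, 21
OTHER_ORDERS = TOTAL_ORDERS - TRUE_POSITIVES

# No antibiotic exposure: 222 orders, 16 of them true positive.
lr_no_abx = positive_likelihood_ratio(16, TRUE_POSITIVES, 222 - 16, OTHER_ORDERS)
# Antibiotic exposure: 354 orders, 5 of them true positive.
lr_abx = positive_likelihood_ratio(5, TRUE_POSITIVES, 354 - 5, OTHER_ORDERS)

for label, (lr, lower, upper) in [("No antibiotics", lr_no_abx), ("Antibiotics", lr_abx)]:
    print(f"{label}: LR+ {lr:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

The log-method standard error is undefined when any count is zero, so rows such as pneumonia (0 true positives) would need a continuity correction or a different interval method.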
DISCUSSION
In this prospective study, we determined the diagnostic yield of blood cultures ordered on hospitalized medical patients to be low, with just 3.6% of orders identifying a true BSI. This was coupled with a similar false positive rate of 2.3%. Our study found rates of true positive blood cultures much lower in hospitalized medical patients than in rates previously described when ED and ICU patients were included.[11, 16]
Although ordering blood cultures is a routine clinical behavior when there is concern for an infection, a clinician's ability to subjectively predict who has a BSI only improves the likelihood 2‐fold.[6] Despite the availability of multiple scoring systems to aid clinicians,[10, 21, 22] our study found that over 50% of cultures were ordered in the setting of fever or leukocytosis, potentially demonstrating a triggered response to an event, rather than a decision based on probabilities. This common clinician instinct to “culture if spikes” is an ineffective practice if not coupled with additional clinical information. In fact, in 1 retrospective study, there was no association between fever spike and blood culture positivity.[23]
Our study suggests that objective and easily obtainable clinical characteristics may be effective in helping determine the probability of blood cultures revealing a BSI. Although more robust prediction models have value, they often require multiple inputs, limiting their utility to the bedside clinician. Stratifying patients by either antibiotic exposure or working diagnosis may provide the most benefit for the hospitalized medical patient. For those on antibiotics, the yield of true positive blood cultures is so low that they are unlikely to provide clinically useful information. In fact, although nearly two‐thirds of cultures were obtained after antibiotic exposure, only 1 (0.2%) of these patients had a culture that provided additional information regarding a BSI. Bacteremia had already been established for the other 4 patients. These results are similar to a prior study, which concluded that physicians should wait 72 hours from time of preantibiotic cultures before considering additional blood cultures given the lack of additional information provided.[24]
The working diagnosis also drives the probability of a positive blood culture. As has been shown with other studies, blood cultures are unlikely to diagnose a BSI for patients being treated for either cellulitis or pneumonia.[25, 26, 27] In our study, the working diagnosis prompting the most blood cultures was pneumonia, with the false positive rate exceeding the true positive rate, a finding consistent with previous literature. This situation may lead to the addition of unnecessary antibiotics while waiting for a positive culture to be confirmed as a false positive (eg, vancomycin for a preliminary culture showing gram‐positive cocci in clusters).
There are a number of limitations to our study. Physician‐chosen indication may not correlate with the actual clinical picture and/or may not represent the full set of variables involved in the clinical decision to order a blood culture. However, the subjective clinical indication and the objective clinical criteria found in the chart provided similar LRs. Our study did not evaluate the potential harm of not ordering a blood culture. We also did not assess the value of a true negative culture particularly in patients with endovascular infections where additional cultures are often required to document clearance of bacteremia. Lastly, our study applies to patients on a hospitalized medical service and was performed at a VA hospital with a specific population of elderly male patients, which may limit the generalizability of our results.
Despite these limitations, this study benefits from its prospective design, along with the fact that >90% of blood culture orders placed included a corresponding indication. This provides insight into physician clinical reasoning at the time the blood culture was ordered. In addition, our ability to calculate likelihood ratios provides bedside physicians with an easy and powerful way of modifying the probability of BSI prior to ordering blood cultures, aiding them in providing high‐value clinical care while potentially reducing testing overuse.
The acceptability of not obtaining blood cultures may vary by clinical experience and by specialty. Physicians must weigh the low true positive rate against the consequences of missing a BSI. Although not a substitute for clinical judgement, the LRs in this study can provide a framework to aid in clinical decision making. For example, assuming a pretest probability of 3.6% (the rate of true positive for our entire cohort), blood cultures may not be equally compelling in 2 similar patients with fever. The first is not on antibiotics and also has a leukocytosis. The second is being treated for pneumonia and is already on antibiotics. For the first patient, using a LR+ of 5.6 (for the fever and leukocytosis in the absence of antibiotics) modifies the patient's probability of a true positive blood culture to 17.3%. Blood cultures should be ordered. In contrast, for the second patient, using a LR+ of 0.4 (for the presence of antibiotics) decreases the patient's probability of a true positive blood culture to 1.5%. Armed with these data, the bedside clinician can now decide whether this rate of true positivity warrants blood cultures. For some, this rate will be comfortably low. For others, this rate will not assuage them; only the negative culture will. Our data are not meant to make this decision, but may aid in making it a probability‐based decision.
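The arithmetic behind this worked example is the standard odds form of Bayes' rule: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back to a probability. A minimal sketch follows; the function name is hypothetical, and the inputs are the 3.6% pretest probability and the two LR+ values quoted above.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Update a pretest probability with a likelihood ratio via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


pretest = 0.036  # true positive rate per order for the whole cohort

# Patient 1: fever and leukocytosis, not on antibiotics (LR+ 5.6).
patient_1 = posttest_probability(pretest, 5.6)
# Patient 2: treated for pneumonia and already on antibiotics (LR+ 0.4).
patient_2 = posttest_probability(pretest, 0.4)

print(f"Patient 1: {patient_1:.1%}")
print(f"Patient 2: {patient_2:.1%}")
```

Working in odds rather than multiplying the probability directly matters most when the pretest probability is not small; at 3.6% the two approaches differ only modestly, but the odds form is the one that stays valid across the full range.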
Disclosures
Presented in part at the Infectious Diseases Society of America Annual Meeting in San Diego, California in 2015. This material is the result of work supported in part with resources and the use of facilities at the VA Boston HCS, West Roxbury, MA. Katherine Linsenmeyer, MD, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The authors report no conflicts of interest.
- Population‐based epidemiology and microbiology of community‐onset bloodstream infections. Clin Microbiol Rev. 2014;27(4):647–664.
- The clinical significance of positive blood cultures in the 1990s: a prospective comprehensive evaluation of the microbiology, epidemiology, and outcome of bacteremia and fungemia in adults. Clin Infect Dis. 1997;24(4):584–602.
- The clinical significance of positive blood cultures: a comprehensive analysis of 500 episodes of bacteremia and fungemia in adults. II. Clinical observations, with special reference to factors influencing prognosis. Rev Infect Dis. 1983;5(1):54–70.
- Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med. 2006;34(6):1589–1596.
- Epidemiology of sepsis syndrome in 8 academic medical centers. JAMA. 1997;278(3):234–240.
- Predicting bacteremia in older patients. J Am Geriatr Soc. 1995;43(3):230–235.
- Febrile inpatients: house officers' use of blood cultures. J Gen Intern Med. 1987;2(5):293–297.
- Executive summary: a guide to utilization of the microbiology laboratory for diagnosis of infectious diseases: 2013 recommendations by the Infectious Diseases Society of America (IDSA) and the American Society for Microbiology (ASM)(a). Clin Infect Dis. 2013;57(4):485–488.
- What are covering doctors told about their patients? Analysis of sign‐out among internal medicine house staff. Qual Saf Health Care. 2009;18(4):248–255.
- Predicting bacteremia in hospitalized patients. A prospectively validated model. Ann Intern Med. 1990;113(7):495–500.
- Blood cultures. Ann Intern Med. 1987;106(2):246–253.
- Reducing blood culture contamination by a simple informational intervention. J Clin Microbiol. 2010;48(12):4552–4558.
- Contaminant blood cultures and resource utilization. The true consequences of false‐positive results. JAMA. 1991;265(3):365–369.
- Blood culture contaminants. J Hosp Infect. 2014;87(1):1–10.
- The natural history of the systemic inflammatory response syndrome (SIRS). A prospective study. JAMA. 1995;273(2):117–123.
- Predicting bacteremia in patients with sepsis syndrome. Academic Medical Center Consortium Sepsis Project Working Group. J Infect Dis. 1997;176(6):1538–1551.
- The systemic inflammatory response syndrome as a predictor of bacteraemia and outcome from sepsis. QJM. 1996;89(7):515–522.
- Who needs a blood culture? A prospectively derived and validated prediction rule. J Emerg Med. 2008;35(3):255–264.
- Factors associated with positive blood cultures in outpatients with suspected bacteremia. Eur J Clin Microbiol Infect Dis. 2011;30(12):1615–1619.
- Two rules for early prediction of bacteremia: testing in a university and a community hospital. J Gen Intern Med. 1996;11(2):98–103.
- Does this adult patient with suspected bacteremia require blood cultures? JAMA. 2012;308(5):502–511.
- Clinical prediction rules for bacteremia and in‐hospital death based on clinical data at the time of blood withdrawal for culture: an evaluation of their development and use. J Eval Clin Pract. 2006;12(6):692–703.
- Timing of specimen collection for blood cultures from febrile patients with bacteremia. J Clin Microbiol. 2008;46(4):1381–1385.
- Usefulness of blood culture for hospitalized patients who are receiving antibiotic therapy. Clin Infect Dis. 2001;32(11):1651–1655.
- Clinical utility of blood cultures in adult patients with community‐acquired pneumonia without defined underlying risks. Chest. 1995;108(4):932–936.
- Blood cultures in community‐acquired pneumonia: are we ready to quit? Chest. 2003;123(4):977–978.
- Blood cultures for community‐acquired pneumonia: piecing together a mosaic for doing less. Am J Respir Crit Care Med. 2004;169(3):327–328.
Blood cultures are the gold standard test for the diagnosis of bloodstream infections (BSI). Given the high mortality associated with BSI,[1, 2, 3] physicians have a low threshold to obtain blood cultures.[4, 5] Unfortunately, physicians are poor at predicting which hospitalized patients have BSI,[6, 7] and published guidelines do not provide clear indications for the use of blood cultures.[8] As a result, current practice follows a culture if spikes paradigm, whereby inpatient providers often obtain blood cultures in the setting of any fever. This is the most common anticipatory guidance communicated between providers, involving up to 75% of written sign‐out instructions.[9] The result is a low rate of true positive blood cultures (5%10%)[10, 11, 12] with only a slightly lower rate of false positive blood cultures (contaminants).[12, 13, 14] False positive blood cultures often lead to repeat blood cultures, unnecessary antibiotic use, and increased hospital cost and length of stay.[13]
Over the last several years, there has been an increased emphasis on practicing high‐value care by avoiding unnecessary and duplicate testing. In 2012, the American Board of Internal Medicine introduced the Choosing Wisely campaign, with specific initiatives to reduce medical waste and overuse. Given the low yield of blood cultures, guidance on patients in whom blood cultures are most appropriate would be welcome. Studies assessing risk factors for bacteremia have led to the development of multiple stratification systems without overall consensus.[10, 15, 16, 17, 18, 19, 20] Furthermore, much of the current literature on blood culture utilization includes cultures drawn in the emergency department (ED) or intensive care unit setting (ICU).[10, 18, 19, 20] Less is known regarding the rates of positivity and utility for blood cultures drawn on patients hospitalized on an acute care medical ward.
Our study had 3 main objectives: (1) determine the rates of true positive and false positive blood cultures among hospitalized medical patients, (2) determine the ability of physician‐selected indications and patient characteristics to predict BSI, and (3) identify populations in which blood cultures may not be necessary.
PATIENTS AND METHODS
Study Design
We conducted a prospective cohort study of all hospitalized medical patients for whom blood cultures were ordered and received by the microbiology laboratory. This investigation was approved by the Veterans Affairs (VA) Boston Healthcare System internal review board.
Patients and Setting
During a 7‐month period (October 1, 2014April 15, 2015), all blood culture orders were reviewed for indication and result each day (and on Monday for weekend blood cultures) at a large VA teaching hospital (approximately 6200 admissions each year). As part of the electronic medical order, providers selected from among a list of common indications. Options included various clinical signs and diagnoses, and providers could select more than 1 indication. Each blood culture order triggered a phlebotomist to draw 2 separate blood culture sets (each set consisted of 1 aerobic and 1 anaerobic blood culture bottle).
Inclusion criteria included admission to 1 of 5 general medical service teams or 1 of 2 cardiology teams. Given that the study hospital does not have dedicated subspecialty service teams (with the exception of cardiology), all patients with medical diagnoses are cared for on the general medical service.
Predictor and Outcome Variables
Patient characteristics were obtained via chart review. Fever was defined as a single temperature greater than 100.4°F within 24 hours prior to a blood culture order. Leukocytosis was defined as a white blood cell count greater than 10,000 cells/µL within 24 hours of a blood culture order. Patients were considered to have received antibiotics if an order for an antibacterial or antifungal agent was active within 72 hours prior to the blood culture order. Each blood culture order was assigned a working diagnosis that prompted the order. These working diagnoses were identified by chart review as documented under the provider's assessment and plan and were not necessarily the primary diagnosis prompting hospitalization.
Classification of positive blood cultures into true and false positive was determined by consensus among the microbiology and the infectious disease departments after review of clinical and laboratory data, consistent with a previously established practice at the hospital. A true negative culture consisted of any culture that was not a true positive or a false positive. A blood culture order was defined as an electronic entry and included all sets of blood cultures drawn as a result of that order. Consistent with previous literature, a blood culture episode was defined as all blood cultures ordered within a 48‐hour period starting at the time of the first culture.[10] For patients with multiple admissions during the study period, each admission was considered a unique patient.
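The episode definition above can be illustrated with a minimal sketch. It assumes the order timestamps for a single admission are available as `datetime` objects; the function name `group_into_episodes` is hypothetical, and treating an order placed exactly 48 hours after the first culture as part of the same episode is an assumption, since the text does not specify how the boundary was handled.

```python
from datetime import datetime, timedelta


def group_into_episodes(order_times, window_hours=48):
    """Group one admission's blood culture orders into episodes.

    An episode opens at the first order and absorbs every later order
    placed within `window_hours` of that first order. The next order
    outside the window opens a new episode.
    """
    window = timedelta(hours=window_hours)
    episodes = []
    for t in sorted(order_times):
        # episodes[-1][0] is the first order of the current episode.
        # Assumption: the 48-hour boundary itself is inclusive.
        if episodes and t - episodes[-1][0] <= window:
            episodes[-1].append(t)
        else:
            episodes.append([t])
    return episodes


# Illustrative timestamps only (not study data).
start = datetime(2014, 10, 1, 8, 0)
example_orders = [start + timedelta(hours=h) for h in (0, 10, 47, 50, 120)]
example_episodes = group_into_episodes(example_orders)
print([len(e) for e in example_episodes])
```

In this illustration, the order at hour 50 falls outside the window that opened at hour 0, so it starts a second episode with its own 48-hour window.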
Statistical Analysis
Rates of true and false positivity of blood cultures were calculated. In addition, positive likelihood ratios (LR+) for true positive blood cultures were calculated using JMP statistical software (SAS Institute, Inc., Cary, NC).
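For readers who want to check a likelihood ratio by hand, the following sketch computes an LR+ as the proportion of true positive orders with a given characteristic divided by the proportion of all other orders with that characteristic, with a log-method 95% confidence interval. This is an assumed formulation, not a description of the JMP procedure used in the study; the function name `positive_lr` is hypothetical. The counts in the example are the per-order counts reported in Table 2 for orders placed without recent antibiotic exposure (16 of 21 true positive orders; 206 of the remaining 555 orders).

```python
import math


def positive_lr(feature_tp, total_tp, feature_other, total_other, z=1.96):
    """Return (LR+, lower, upper) for a characteristic.

    feature_tp    -- true positive orders with the characteristic
    total_tp      -- all true positive orders
    feature_other -- non-true-positive orders with the characteristic
    total_other   -- all non-true-positive orders
    """
    lr = (feature_tp / total_tp) / (feature_other / total_other)
    # Standard error of ln(LR+), the usual log-method approximation.
    se = math.sqrt(
        1 / feature_tp - 1 / total_tp + 1 / feature_other - 1 / total_other
    )
    lower = math.exp(math.log(lr) - z * se)
    upper = math.exp(math.log(lr) + z * se)
    return lr, lower, upper


# No recent antibiotic exposure: 16 true positives among 222 such orders,
# so 222 - 16 = 206 of the 576 - 21 = 555 remaining orders.
lr, lower, upper = positive_lr(16, 21, 206, 555)
print(f"LR+ {lr:.1f} (95% CI: {lower:.1f}-{upper:.1f})")
```

Under these assumptions the result rounds to the same values reported for this characteristic in the study, which suggests, but does not confirm, that a comparable method was used.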
RESULTS
Overall
A total of 576 blood culture orders (467 blood culture episodes) were completed on 363 hospitalized medical patients during the study period. Five hundred forty orders were placed on patients on general medical services and 36 orders on patients on the cardiology services. Four hundred eighty‐seven (85%) orders resulted in 2 sets of cultures being drawn, 87 (15%) resulted in 1 set of cultures, and 2 (0.3%) resulted in 3 sets of cultures. The median time between admission and culture draw was 2 days (range, 0–72 days), with 57% of cultures drawn during hospital days 0 to 2, 24.5% drawn during hospital days 3 to 7, and 19.4% drawn after hospital day 7. The average age of the patients was 70.4 years, and 94% were men. Additional patient characteristics are shown in Table 1.
| Clinical Characteristic | Total, n = 363 (%) | True Positive Blood Cultures, n = 14 (%) | P Value |
|---|---|---|---|
| Mean age, y | 70.4 | 73.9 | 0.4 |
| Male sex | 350 (96%) | 14 (100%) | 1 |
| White race | 308 (85%) | 11 (79%) | 0.7 |
| Location prior to admission | | | |
| Community | 276 (76%) | 11 (79%) | 1 |
| Hospital | 51 (14%) | 1 (7%) | 0.7 |
| Long‐term care facility | 36 (10%) | 2 (14%) | 0.6 |
| Comorbidities | | | |
| Diabetes | 136 (37%) | 5 (36%) | 1 |
| Malignancy | 100 (28%) | 4 (31%) | 1 |
| Alcohol abuse | 89 (25%) | 2 (14%) | 0.5 |
| Cirrhosis | 31 (9%) | 1 (7%) | 1 |
| End‐stage renal disease | 21 (6%) | 1 (7%) | 1 |
| Active drug use* | 16 (4%) | 1 (7%) | 0.5 |
| Catheter | 93 (26%) | 3 (21%) | 0.8 |
| Recent hospitalization | 145 (40%) | 6 (43%) | 1 |
| History of MRSA colonization | 72 (20%) | 5 (36%) | 0.16 |
| Cultures drawn in emergency department | 69 (19%) | 6 (43%) | 0.03 |
The true positive and false positive rates per blood culture order were 3.6% (21/576) and 2.3% (13/576), respectively (Table 2). Similar values were seen per blood culture episode (3.4% and 2.7%, respectively). The true positive blood culture rates per order and episode were significantly lower than those drawn on emergency room patients during the study period (41/570, 7.2%, P < 0.05).
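The comparison between ward and emergency room true positive rates can be checked with a simple two-proportion test. The sketch below uses a pooled two-sided z-test; this choice is an assumption, because the text does not state which test produced the reported P value, and the function name `two_proportion_z_test` is hypothetical. The counts are the per-order figures quoted above (21/576 versus 41/570).

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Ward orders (21/576) versus emergency room cultures (41/570).
z_stat, p_val = two_proportion_z_test(21, 576, 41, 570)
print(f"z = {z_stat:.2f}, two-sided P = {p_val:.4f}")
```

A chi-square or Fisher exact test would be an equally reasonable choice for these counts and may give a slightly different P value.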
| | Total, n (%) | True Positive, n (%) | False Positive, n (%) | True Negative, n (%) |
|---|---|---|---|---|
| Per patient | 363 | 14 (3.8) | 13 (3.6) | 336 (92.6) |
| Per blood culture episode | 467 | 16 (3.4) | 13 (2.7) | 438 (93.8) |
| Per blood culture order | 576 | 21 (3.6) | 13 (2.3) | 542 (94.1) |
| Rates per blood culture order | | | | |
| Physician‐selected indication, n = 530 | | | | |
| Fever | 136 (25.6) | 3 (2.2) | 3 (2.2) | 130 (95.6) |
| Fever and additional indication(s) | 118 (22.2) | 5 (4.2) | 3 (2.5) | 110 (93.2) |
| Fever and leukocytosis | 50 (9.4) | 4 (8.0) | 3 (6.0) | 43 (86.0) |
| Leukocytosis | 50 (9.4) | 2 (4.0) | 0 (0) | 48 (96.0) |
| Follow‐up previous positive | 60 (11.3) | 7 (11.7) | 0 (0) | 53 (88.3) |
| Working diagnosis, n = 576 | | | | |
| Pneumonia | 101 (17.5) | 0 (0) | 4 (3.9) | 97 (96.0) |
| Bacteremia/endocarditis | 97 (16.8) | 12 (12.3) | 1 (1.0) | 84 (86.6) |
| Urinary tract infection* | 95 (16.4) | 5 (5.3) | 2 (2.1) | 88 (92.6) |
| Other infection | 46 (8.0) | 0 (0) | 0 (0) | 46 (100) |
| Skin and soft‐tissue infection | 39 (6.8) | 1 (2.6) | 0 (0) | 38 (97.4) |
| Neutropenic fever | 28 (4.9) | 0 (0) | 0 (0) | 28 (100) |
| Sepsis | 27 (4.7) | 0 (0) | 0 (0) | 27 (100) |
| Fever | 18 (3.1) | 1 (5.5) | 1 (5.5) | 16 (88.9) |
| Bone and joint infection | 15 (2.6) | 1 (6.7) | 0 (0) | 14 (93.3) |
| Postoperative fever | 9 (1.6) | 0 (0) | 0 (0) | 9 (100) |
| Noninfectious diagnosis | 101 (17.5) | 1 (1.0) | 5 (5.0) | 95 (94.1) |
| Antibiotic exposure | | | | |
| Yes | 354 (61.5) | 5 (1.4) | 5 (1.4) | 344 (97.1) |
| No | 222 (38.6) | 16 (7.2) | 8 (3.6) | 198 (89.1) |
| Previous documented positive culture via chart review | | | | |
| Yes | 155 (26.9) | 9 (5.8) | 2 (1.3) | 144 (92.9) |
| No | 421 (73.1) | 12 (2.9) | 11 (2.6) | 398 (94.5) |
| | LR+ (95% CI), True Positive Blood Culture | LR+ (95% CI), False Positive Blood Culture |
|---|---|---|
| Physician‐selected indication | | |
| Fever | 0.6 (0.2–1.7) | 0.9 (0.3–2.5) |
| Fever and additional indication(s) | 1.1 (0.5–2.4) | 1.0 (0.4–2.8) |
| Fever and leukocytosis | 2.2 (0.9–5.6) | 2.5 (0.9–7.1) |
| Leukocytosis | 1.1 (0.3–4.0) | 0.4 (0.0–5.6) |
| Follow‐up previous positive | 3.4 (1.8–6.5) | 0.3 (0.0–4.7) |
| Diagnosis | | |
| Pneumonia | 0.1 (0.0–1.9) | 1.8 (0.8–4.1) |
| Bacteremia/endocarditis | 3.7 (2.5–5.7) | 0.5 (0.1–3.0) |
| Urinary tract infection | 1.5 (0.7–3.2) | 0.9 (0.3–3.4) |
| Noninfectious diagnosis | 0.3 (0.0–1.8) | 2.3 (1.1–4.6) |
| Recent antibiotic exposure | | |
| Yes | 0.4 (0.2–0.8) | 0.6 (0.3–1.2) |
| No | 2.1 (1.6–2.7) | 1.6 (1.0–2.5) |
| No with fever | 2.4 (1.2–4.9) | 0.8 (0.2–3.6) |
| No with fever and leukocytosis | 5.6 (1.8–18.2) | 0.4 (0.1–2.6) |
| Prior positive cultures | | |
| Yes | 1.6 (1.0–2.7) | 0.6 (0.2–2.0) |
For the true positive cultures, gram‐positive organisms were isolated most frequently (14/21, 67%) with Staphylococcus aureus identified in 2/21 (10%) positive cultures and Enterococcus faecalis identified in 7/21 (33%) positive cultures. Gram‐negative organisms were isolated in 6/21 (29%) cultures, and 1/21 (5%) culture grew 2 organisms (Enterococcus faecalis and Nocardia). The majority of false positive cultures isolated 1 or more species of coagulase‐negative Staphylococcus (11/13, 85%).
Predictors of True Bacteremia
The 4 most common working diagnoses prompting a blood culture order were pneumonia, bacteremia/endocarditis, urinary tract infection, and a noninfectious diagnosis (eg, syncope), with each prompting approximately 17% of the total orders (Table 2). Of these, only a primary diagnosis of bacteremia/endocarditis was predictive of a true positive culture, yielding a rate of 12.3% (LR+ 3.7, 95% confidence interval [CI]: 2.5‐5.7). No other diagnosis was predictive of true positivity. A diagnosis of pneumonia yielded no true positive and 4 false positive blood cultures (3.9%), whereas a noninfectious diagnosis yielded only 1 true positive (1.0%) and 5 false positives (5.0%). The positive likelihood ratios for these 2 diagnoses were 0.1 (95% CI: 0.00‐1.9) and 0.3 (95% CI: 0.04‐1.8), respectively.
Indications were selected for 530 of 576 (92%) blood culture orders (Table 2). The most common indication was fever alone (25.6%), followed by fever with an additional indication (22.2%), follow‐up positive blood cultures (11.3%), fever and leukocytosis (9.4%), and leukocytosis alone (9.4%). Only follow‐up positive blood cultures was predictive of a true positive, with a LR+ of 3.4 (95% CI: 1.8‐6.5).
A total of 14 patients (3.9%) had true positive blood cultures. For these patients, 10/14 (71%) had 1 true positive blood culture, 3/14 (21%) had 2 true positive blood cultures, and 1/14 (7%) had 5 true positive blood cultures. The average number of cultures drawn was 4.9. The clinical characteristic most predictive of a true positive blood culture was the absence of recent antibiotic administration. If the blood culture was ordered on a patient not receiving antibiotics (true positivity rate 7.2%, 16/222), the LR+ was 2.1 (95% CI: 1.6‐2.7). In a patient not receiving antibiotics who was also noted to have fever and leukocytosis (true positivity rate 17.6%, 3/17), the LR+ was 5.6 (95% CI: 1.8‐18.2). Conversely, patients receiving antibiotics were rarely found to have true positive blood cultures (true positivity rate 1.4%, 5/354) with a LR+ of 0.4 (95% CI: 0.2‐0.8).
DISCUSSION
In this prospective study, we determined the diagnostic yield of blood cultures ordered on hospitalized medical patients to be low, with just 3.6% of orders identifying a true BSI. This was coupled with a similar false positive rate of 2.3%. Our study found rates of true positive blood cultures in hospitalized medical patients to be much lower than the rates previously described when ED and ICU patients were included.[11, 16]
Although ordering blood cultures is a routine clinical behavior when there is concern for an infection, a clinician's ability to subjectively predict who has a BSI only improves the likelihood 2‐fold.[6] Despite the availability of multiple scoring systems to aid clinicians,[10, 21, 22] our study found that over 50% of cultures were ordered in the setting of fever or leukocytosis, potentially demonstrating a triggered response to an event, rather than a decision based on probabilities. This common clinician instinct to "culture if spikes" is an ineffective practice if not coupled with additional clinical information. In fact, in 1 retrospective study, there was no association between fever spike and blood culture positivity.[23]
Our study suggests that objective and easily obtainable clinical characteristics may be effective in helping determine the probability of blood cultures revealing a BSI. Although more robust prediction models have value, they often require multiple inputs, limiting their utility to the bedside clinician. Stratifying patients by either antibiotic exposure or working diagnosis may provide the most benefit for the hospitalized medical patient. For those on antibiotics, the yield of true positive blood cultures is so low that they are unlikely to provide clinically useful information. In fact, although nearly two‐thirds of cultures were obtained after antibiotic exposure, only 1 (0.2%) of these patients had a culture that provided additional information regarding a BSI. Bacteremia had already been established for the other 4 patients. These results are similar to a prior study, which concluded that physicians should wait 72 hours from time of preantibiotic cultures before considering additional blood cultures given the lack of additional information provided.[24]
The working diagnosis also drives the probability of a positive blood culture. As has been shown with other studies, blood cultures are unlikely to diagnose a BSI for patients being treated for either cellulitis or pneumonia.[25, 26, 27] In our study, the working diagnosis prompting the most blood cultures was pneumonia, with the false positive rate exceeding the true positive rate, a finding consistent with previous literature. This situation may lead to the addition of unnecessary antibiotics while waiting for a positive culture to be confirmed as a false positive (eg, vancomycin for a preliminary culture showing gram‐positive cocci in clusters).
There are a number of limitations to our study. Physician‐chosen indication may not correlate with the actual clinical picture and/or may not represent the full set of variables involved in the clinical decision to order a blood culture. However, the subjective clinical indication and the objective clinical criteria found in the chart provided similar LRs. Our study did not evaluate the potential harm of not ordering a blood culture. We also did not assess the value of a true negative culture particularly in patients with endovascular infections where additional cultures are often required to document clearance of bacteremia. Lastly, our study applies to patients on a hospitalized medical service and was performed at a VA hospital with a specific population of elderly male patients, which may limit the generalizability of our results.
Despite these limitations, this study benefits from its prospective design, along with the fact that >90% of blood culture orders placed included a corresponding indication. This provides insight into physician clinical reasoning at the time the blood culture was ordered. In addition, our ability to calculate likelihood ratios provides bedside physicians with an easy and powerful way of modifying the probability of BSI prior to ordering blood cultures, aiding them in providing high‐value clinical care while potentially reducing testing overuse.
The acceptability of not obtaining blood cultures may vary by clinical experience and by specialty. Physicians must weigh the low true positive rate against the consequences of missing a BSI. Although not a substitute for clinical judgement, the LRs in this study can provide a framework to aid in clinical decision making. For example, assuming a pretest probability of 3.6% (the rate of true positive for our entire cohort), blood cultures may not be equally compelling in 2 similar patients with fever. The first is not on antibiotics and also has leukocytosis. The second is being treated for pneumonia and is already on antibiotics. For the first patient, using a LR+ of 5.6 (for the fever and leukocytosis in the absence of antibiotics) modifies the patient's probability of a true positive blood culture to 17.3%. Blood cultures should be ordered. In contrast, for the second patient, using a LR+ of 0.4 (for the presence of antibiotics) decreases the patient's probability of a true positive blood culture to 1.5%. Armed with these data, the bedside clinician can now decide whether this rate of true positivity warrants blood cultures. For some, this rate will be comfortably low. For others, this rate will not assuage them; only the negative culture will. Our data are not meant to make this decision, but may aid in making it a probability‐based decision.
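The arithmetic behind this example follows the standard odds form of Bayes' theorem: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back to a probability. The following is a minimal sketch of that calculation; the function name `posttest_probability` is hypothetical, and the inputs are the pretest probability and LR+ values quoted in the paragraph above.

```python
def posttest_probability(pretest_prob, likelihood_ratio):
    """Update a pretest probability with a likelihood ratio."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


pretest = 0.036  # cohort-wide true positive rate per order

# Patient 1: fever and leukocytosis, not on antibiotics (LR+ 5.6).
patient_1 = posttest_probability(pretest, 5.6)
# Patient 2: already on antibiotics (LR+ 0.4).
patient_2 = posttest_probability(pretest, 0.4)

print(f"Patient 1: {patient_1:.1%}")
print(f"Patient 2: {patient_2:.1%}")
```

The same function can be applied to any other LR+ reported in the study, provided the chosen pretest probability suits the patient population in question.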
Disclosures
Presented in part at the Infectious Diseases Society of America Annual Meeting in San Diego, California in 2015. This material is the result of work supported in part with resources and the use of facilities at the VA Boston HCS, West Roxbury, MA. Katherine Linsenmeyer, MD, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The authors report no conflicts of interest.
Over the last several years, there has been an increased emphasis on practicing high‐value care by avoiding unnecessary and duplicate testing. In 2012, the American Board of Internal Medicine introduced the Choosing Wisely campaign, with specific initiatives to reduce medical waste and overuse. Given the low yield of blood cultures, guidance on patients in whom blood cultures are most appropriate would be welcome. Studies assessing risk factors for bacteremia have led to the development of multiple stratification systems without overall consensus.[10, 15, 16, 17, 18, 19, 20] Furthermore, much of the current literature on blood culture utilization includes cultures drawn in the emergency department (ED) or intensive care unit setting (ICU).[10, 18, 19, 20] Less is known regarding the rates of positivity and utility for blood cultures drawn on patients hospitalized on an acute care medical ward.
Our study had 3 main objectives: (1) determine the rates of true positive and false positive blood cultures among hospitalized medical patients, (2) determine the ability of physician‐selected indications and patient characteristics to predict BSI, and (3) identify populations in which blood cultures may not be necessary.
PATIENTS AND METHODS
Study Design
We conducted a prospective cohort study of all hospitalized medical patients for whom blood cultures were ordered and received by the microbiology laboratory. This investigation was approved by the Veterans Affairs (VA) Boston Healthcare System internal review board.
Patients and Setting
During a 7‐month period (October 1, 2014April 15, 2015), all blood culture orders were reviewed for indication and result each day (and on Monday for weekend blood cultures) at a large VA teaching hospital (approximately 6200 admissions each year). As part of the electronic medical order, providers selected from among a list of common indications. Options included various clinical signs and diagnoses, and providers could select more than 1 indication. Each blood culture order triggered a phlebotomist to draw 2 separate blood culture sets (each set consisted of 1 aerobic and 1 anaerobic blood culture bottle).
Inclusion criteria included admission to 1 of 5 general medical service teams or 1 of 2 cardiology teams. Given that the study hospital does not have dedicated subspecialty service teams (with the exception of cardiology), all patients with medical diagnoses are cared for on the general medical service.
Predictor and Outcome Variables
Patient characteristics were obtained via chart review. Fever was defined as a single temperature greater than 100.4F within 24 hours prior to a blood culture order. Leukocytosis was defined as a white blood cell count greater than 10,000 within 24 hours of a blood culture order. Patients were considered to have received antibiotics if an order for an antibacterial or antifungal agent was active within 72 hours prior to the blood culture order. Each blood culture order was assigned a working diagnosis that prompted the order. These working diagnoses were identified by chart review as documented under the provider's assessment and plan and were not necessarily the primary diagnosis prompting hospitalization.
Classification of positive blood cultures into true and false positive was determined by consensus among the microbiology and the infectious disease departments after review of clinical and laboratory data, consistent with a previously established practice at the hospital. A true negative culture consisted of any culture that was not a true positive or a false positive. A blood culture order was defined as an electronic entry and included all sets of blood cultures drawn as a result of that order. Consistent with previous literature, a blood culture episode was defined as all blood cultures ordered within a 48‐hour period starting at the time of the first culture.[10] For patients with multiple admissions during the study period, each admission was considered a unique patient.
Statistical Analysis
Rates of true and false positivity of blood cultures were calculated. In addition, positive likelihood ratios (LR+) for true positive blood cultures were calculated using JMP statistical software (SAS Institute, Inc., Cary, NC).
RESULTS
Overall
A total of 576 blood culture orders (467 blood culture episodes) were completed on 363 hospitalized medical patients during the study period. Five hundred forty orders were placed on patients on general medical services and 36 orders on patients on the cardiology services. Four hundred eighty‐seven (85%) orders resulted in 2 sets of cultures being drawn, 87 (15%) resulted in 1 set of cultures, and 2 (0.3%) resulted in 3 sets of cultures. The median time between admission and culture draw was 2 days (range, 072 days), with 57% of cultures drawn during hospital day 0 to 2, 24.5% drawn between hospital day 3 to 7, and 19.4% drawn after hospital day 7. The average age of the patients was 70.4 years, and 94% were men. Additional patient characteristics are shown in Table 1.
Clinical Characteristic | Total, n = 363 (%) | True Positive Blood Cultures, n = 14 (%) | P Value |
---|---|---|---|
| |||
Mean age, y | 70.4 | 73.9 | 0.4 |
Male sex | 350 (96%) | 14 (100%) | 1 |
White race | 308 (85%) | 11 (79%) | 0.7 |
Location prior to admission | |||
Community | 276 (76%) | 11 (79%) | 1 |
Hospital | 51 (14%) | 1 (7%) | 0.7 |
Long‐term care facility | 36 (10%) | 2 (14%) | 0.6 |
Comorbidities | |||
Diabetes | 136 (37%) | 5 (36%) | 1 |
Malignancy | 100 (28%) | 4 (31%) | 1 |
Alcohol abuse | 89 (25%) | 2 (14%) | 0.5 |
Cirrhosis | 31 (9%) | 1 (7%) | 1 |
End‐stage renal disease | 21 (6%) | 1 (7%) | 1 |
Active drug use* | 16 (4%) | 1 (7%) | 0.5 |
Catheter | 93 (26%) | 3 (21%) | 0.8 |
Recent hospitalization | 145 (40%) | 6 (43%) | 1 |
History of MRSA colonization | 72 (20%) | 5 (36%) | 0.16 |
Cultures drawn in emergency department | 69 (19%) | 6 (43%) | 0.03 |
The true positive and false positive rates per blood culture order were 3.6% (21/576) and 2.3% (13/576), respectively (Table 2). Similar values were seen per blood cultures episode (3.4% and 2.7%, respectively). The true positive blood culture rates per order and episode were significantly lower than those drawn on emergency room patients during the study period (41/570, 7.2%, P < 0.05).
Total, n (%) | True Positive, n (%) | False Positive, n (%) | True Negative, n (%) | |
---|---|---|---|---|
| ||||
Per patient | 363 | 14 (3.8) | 13 (3.6) | 336 (92.6) |
Per blood culture episode | 467 | 16 (3.4) | 13 (2.7) | 438 (93.8) |
Per blood culture order | 576 | 21 (3.6) | 13 (2.3) | 542 (94.1) |
Rates per blood culture order | ||||
Physician‐selected indication, n = 530 | ||||
Fever | 136 (25.6) | 3 (2.2) | 3 (2.2) | 130 (95.6) |
Fever and additional indication(s) | 118 (22.2) | 5 (4.2) | 3 (2.5) | 110 (93.2) |
Fever and leukocytosis | 50 (9.4) | 4 (8.0) | 3 (6.0) | 43 (86.0) |
Leukocytosis | 50 (9.4) | 2 (4.0) | 0 (0) | 48 (96.0) |
Follow‐up previous positive | 60 (11.3) | 7 (11.7) | 0 (0) | 53 (88.3) |
Working diagnosis, n = 576 | ||||
Pneumonia | 101 (17.5) | 0 (0) | 4 (3.9) | 97 (96.0) |
Bacteremia/endocarditis | 97 (16.8) | 12 (12.3) | 1 (1.0) | 84 (86.6) |
Urinary tract infection* | 95 (16.4) | 5 (5.3) | 2 (2.1) | 88 (92.6) |
Other infection | 46 (8.0) | 0 (0) | 0 (0) | 46 (100) |
Skin and soft‐tissue infection | 39 (6.8) | 1 (2.6) | 0 (0) | 38 (97.4) |
Neutropenic fever | 28 (4.9) | 0 (0) | 0 (0) | 28 (100) |
Sepsis | 27 (4.7) | 0 (0) | 0 (0) | 27 (100) |
Fever | 18 (3.1) | 1 (5.5) | 1 (5.5) | 16 (88.9) |
Bone and join infection | 15 (2.6) | 1 (6.7) | 0 (0) | 14 (93.3) |
Postoperative fever | 9 (1.6) | 0 (0) | 0 (0) | 9 (100) |
Noninfectious diagnosis | 101 (17.5) | 1 (1.0) | 5 (5.0) | 95 (94.1) |
Antibiotic exposure | ||||
Yes | 354 (61.5) | 5 (1.4) | 5 (1.4) | 344 (97.1) |
No | 222 (38.6) | 16 (7.2) | 8 (3.6) | 198 (89.1) |
Previous documented positive culture via chart review | ||||
Yes | 155 (26.9) | 9 (5.8) | 2 (1.3) | 144 (92.9) |
No | 421 (73.1) | 12 (2.9) | 11 (2.6) | 398 (94.5) |
LR+ (95% CI), True Positive Blood Culture | LR+ (95% CI), False Positive Blood Culture | |
---|---|---|
| ||
Physician‐selected indication | ||
Fever | 0.6 (0.21.7) | 0.9 (0.32.5) |
Fever and additional indication(s) | 1.1 (0.52.4) | 1.0 (0.42.8) |
Fever and leukocytosis | 2.2 (0.95.6) | 2.5 (0.97.1) |
Leukocytosis | 1.1 (0.34.0) | 0.4 (0.05.6) |
Follow‐up previous positive | 3.4 (1.86.5) | 0.3 (0.04.7) |
Diagnosis | ||
Pneumonia | 0.1 (0.01.9) | 1.8 (0.84.1) |
Bacteremia/endocarditis | 3.7 (2.55.7) | 0.5 (0.13.0) |
Urinary tract infection | 1.5 (0.73.2) | 0.9 (0.33.4) |
Noninfectious diagnosis | 0.3 (0.01.8) | 2.3 (1.14.6) |
Recent antibiotic exposure | ||
Yes | 0.4 (0.20.8) | 0.6 (0.31.2) |
No | 2.1 (1.62.7) | 1.6 (1.02.5) |
No with fever | 2.4 (1.24.9) | 0.8 (0.23.6) |
No with fever and leukocytosis | 5.6 (1.818.2) | 0.4 (0.12.6) |
Prior positive cultures | ||
Yes | 1.6 (1.02.7) | 0.6 (0.22.0) |
For the true positive cultures, gram‐positive organisms were isolated most frequently (14/21, 67%) with Staphylococcus aureus identified in 2/21 (10%) positive cultures and Enterococcus faecalis identified in 7/21 (33%) positive cultures. Gram‐negative organisms were isolated in 6/21 (29%) cultures, and 1/21 (5%) culture grew 2 organisms (Enterococcus faecalis and Nocardia). The majority of false positive cultures isolated 1 or more species of coagulase‐negative Staphylococcus (11/13, 85%).
Predictors of True Bacteremia
The 4 most common working diagnoses prompting a blood culture order were pneumonia, bacteremia/endocarditis, urinary tract infection, and a noninfectious diagnosis (eg, syncope), with each prompting approximately 17% of the total orders (Table 2). Of these, only a primary diagnosis of bacteremia/endocarditis was predictive of a true positive culture, yielding a rate of 12.3% (LR+ 3.7, 95% confidence interval [CI]: 2.5‐5.7). No other diagnosis was predictive of true positivity. A diagnosis of pneumonia yielded no true positive and 4 false positive blood cultures (3.9%), whereas a noninfectious diagnosis yielded only 1 true positive (1.0%) and 5 false positives (5.0%). The positive likelihood ratios for these 2 diagnoses were 0.1 (95% CI: 0.00‐1.9) and 0.3 (95% CI: 0.04‐1.8), respectively.
Indications were selected for 530 of 576 (92%) blood culture orders (Table 2). The most common indication was fever alone (25.6%), followed by fever with an additional indication (22.2%), follow‐up positive blood cultures (11.3%), fever and leukocytosis (9.4%), and leukocytosis alone (9.4%). Only follow‐up positive blood cultures was predictive of a true positive, with a LR+ of 3.4 (95% CI: 1.8‐6.5).
A total of 14 patients (3.9%) had true positive blood cultures. For these patients, 10/14 (71%) had 1 true positive blood culture, 3/14 (21%) had 2 true positive blood cultures, and 1/14 (7%) had 5 true positive blood cultures. The average number of cultures drawn was 4.9. The clinical characteristic most predictive of a true positive blood culture was the absence of recent antibiotic administration. If the blood culture was ordered on a patient not receiving antibiotics (true positivity rate 7.2%, 16/222), the LR+ was 2.1 (95% CI: 1.6‐2.7). In a patient not receiving antibiotics who was also noted to have fever and leukocytosis (true positivity rate 17.6%, 3/17), the LR+ was 5.6 (95% CI: 1.8‐18.2). Conversely, patients receiving antibiotics were rarely found to have true positive blood cultures (true positivity rate 1.4%, 5/354) with a LR+ of 0.4 (95% CI: 0.2‐0.8).
DISCUSSION
In this prospective study, we determined the diagnostic yield of blood cultures ordered on hospitalized medical patients to be low, with just 3.6% of orders identifying a true BSI. This was coupled with a similar false positive rate of 2.3%. Our study found rates of true positive blood cultures much lower in hospitalized medical patients than in rates previously described when ED and ICU patients were included.[11, 16]
Although ordering blood cultures is a routine clinical behavior when there is concern for an infection, a clinician's ability to subjectively predict who has a BSI only improves the likelihood 2‐fold.[6] Despite the availability of multiple scoring systems to aid the clinicians,[10, 21, 22] our study found that over 50% of cultures were ordered in the setting of fever or leukocytosis, potentially demonstrating a triggered response to an event, rather than a decision based on probabilities. This common clinician instinct to culture if spikes is an ineffective practice if not coupled with additional clinical information. In fact, in 1 retrospective study, there was no association between fever spike and blood culture positivity.[23]
Our study suggests that objective and easily obtainable clinical characteristics may be effective in helping determine the probability of blood cultures revealing a BSI. Although more robust prediction models have value, they often require multiple inputs, limiting their utility to the bedside clinician. Stratifying patients by either antibiotic exposure or working diagnosis may provide the most benefit for the hospitalized medical patient. For those on antibiotics, the yield of true positive blood cultures is so low that they are unlikely to provide clinically useful information. In fact, although nearly two‐thirds of cultures were obtained after antibiotic exposure, only 1 (0.2%) of these patients had a culture that provided additional information regarding a BSI. Bacteremia had already been established for the other 4 patients. These results are similar to a prior study, which concluded that physicians should wait 72 hours from time of preantibiotic cultures before considering additional blood cultures given the lack of additional information provided.[24]
The working diagnosis also drives the probability of a positive blood culture. As has been shown with other studies, blood cultures are unlikely to diagnose a BSI for patients being treated for either cellulitis or pneumonia.[25, 26, 27] In our study, the working diagnosis prompting the most blood cultures was pneumonia, with the false positive rate exceeding the true positive rate, a finding consistent with previous literature. This situation may lead to the addition of unnecessary antibiotics while waiting for a positive culture to be confirmed as a false positive (eg, vancomycin for a preliminary culture showing gram‐positive cocci in clusters).
There are a number of limitations to our study. Physician‐chosen indication may not correlate with the actual clinical picture and/or may not represent the full set of variables involved in the clinical decision to order a blood culture. However, the subjective clinical indication and the objective clinical criteria found in the chart provided similar LRs. Our study did not evaluate the potential harm of not ordering a blood culture. We also did not assess the value of a true negative culture particularly in patients with endovascular infections where additional cultures are often required to document clearance of bacteremia. Lastly, our study applies to patients on a hospitalized medical service and was performed at a VA hospital with a specific population of elderly male patients, which may limit the generalizability of our results.
Despite these limitations, this study benefits from its prospective design, along with the fact that >90% of blood culture orders placed included a corresponding indication. This provides insight into physician clinical reasoning at the time the blood culture was ordered. In addition, our ability to calculate likelihood ratios provides bedside physicians with an easy and powerful way of modifying the probability of BSI prior to ordering blood cultures, aiding them in providing high‐value clinical care while potentially reducing testing overuse.
The acceptability of not obtaining blood cultures may vary by clinical experience and by specialty. Physicians must weigh the low true positive rate against the consequences of missing a BSI. Although not a substitute for clinical judgement, the LRs in this study can provide a framework to aid in clinical decision making. For example, assuming a pretest probability of 3.6% (the rate of true positives for our entire cohort), blood cultures may not be equally compelling in 2 similar patients with fever. The first is not on antibiotics and also has a leukocytosis. The second is being treated for pneumonia and is already on antibiotics. For the first patient, using an LR+ of 5.6 (for fever and leukocytosis in the absence of antibiotics) raises the patient's probability of a true positive blood culture to 17.3%. Blood cultures should be ordered. In contrast, for the second patient, using an LR+ of 0.4 (for the presence of antibiotics) decreases the patient's probability of a true positive blood culture to 1.5%. Armed with these data, the bedside clinician can now decide whether this rate of true positivity warrants blood cultures. For some, this rate will be comfortably low. For others, this rate will not assuage them; only the negative culture will. Our data are not meant to make this decision, but may aid in making it a probability‐based decision.
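The arithmetic behind the two patient examples can be reproduced with the standard odds form of Bayes' rule: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back to a probability. The following is a minimal sketch of that calculation, not the authors' analysis code; the function name `post_test_probability` and the variable names are illustrative assumptions, while the inputs (3.6%, 5.6, and 0.4) are the values quoted in the text.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Update a pretest probability with a likelihood ratio via odds."""
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    # Probability -> odds
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    # Apply the likelihood ratio
    post_test_odds = pretest_odds * likelihood_ratio
    # Odds -> probability
    return post_test_odds / (1.0 + post_test_odds)


# Values quoted in the text; the names are hypothetical.
PRETEST = 0.036  # true positive rate for the entire cohort

first_patient = post_test_probability(PRETEST, 5.6)   # fever + leukocytosis, no antibiotics
second_patient = post_test_probability(PRETEST, 0.4)  # already on antibiotics

print(f"First patient:  {first_patient:.1%}")   # 17.3%
print(f"Second patient: {second_patient:.1%}")  # 1.5%
```

Working in odds rather than probabilities is what allows a single multiplication by the LR; multiplying the probability itself by 5.6 would give 20.2%, which overstates the post-test probability.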
Disclosures
Presented in part at the Infectious Diseases Society of America Annual Meeting in San Diego, California in 2015. This material is the result of work supported in part with resources and the use of facilities at the VA Boston HCS, West Roxbury, MA. Katherine Linsenmeyer, MD, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The authors report no conflicts of interest.
- Population‐based epidemiology and microbiology of community‐onset bloodstream infections. Clin Microbiol Rev. 2014;27(4):647–664.
- The clinical significance of positive blood cultures in the 1990s: a prospective comprehensive evaluation of the microbiology, epidemiology, and outcome of bacteremia and fungemia in adults. Clin Infect Dis. 1997;24(4):584–602.
- The clinical significance of positive blood cultures: a comprehensive analysis of 500 episodes of bacteremia and fungemia in adults. II. Clinical observations, with special reference to factors influencing prognosis. Rev Infect Dis. 1983;5(1):54–70.
- Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med. 2006;34(6):1589–1596.
- Epidemiology of sepsis syndrome in 8 academic medical centers. JAMA. 1997;278(3):234–240.
- Predicting bacteremia in older patients. J Am Geriatr Soc. 1995;43(3):230–235.
- Febrile inpatients: house officers' use of blood cultures. J Gen Intern Med. 1987;2(5):293–297.
- Executive summary: a guide to utilization of the microbiology laboratory for diagnosis of infectious diseases: 2013 recommendations by the Infectious Diseases Society of America (IDSA) and the American Society for Microbiology (ASM)(a). Clin Infect Dis. 2013;57(4):485–488.
- What are covering doctors told about their patients? Analysis of sign‐out among internal medicine house staff. Qual Saf Health Care. 2009;18(4):248–255.
- Predicting bacteremia in hospitalized patients. A prospectively validated model. Ann Intern Med. 1990;113(7):495–500.
- Blood cultures. Ann Intern Med. 1987;106(2):246–253.
- Reducing blood culture contamination by a simple informational intervention. J Clin Microbiol. 2010;48(12):4552–4558.
- Contaminant blood cultures and resource utilization. The true consequences of false‐positive results. JAMA. 1991;265(3):365–369.
- Blood culture contaminants. J Hosp Infect. 2014;87(1):1–10.
- The natural history of the systemic inflammatory response syndrome (SIRS). A prospective study. JAMA. 1995;273(2):117–123.
- Predicting bacteremia in patients with sepsis syndrome. Academic Medical Center Consortium Sepsis Project Working Group. J Infect Dis. 1997;176(6):1538–1551.
- The systemic inflammatory response syndrome as a predictor of bacteraemia and outcome from sepsis. QJM. 1996;89(7):515–522.
- Who needs a blood culture? A prospectively derived and validated prediction rule. J Emerg Med. 2008;35(3):255–264.
- Factors associated with positive blood cultures in outpatients with suspected bacteremia. Eur J Clin Microbiol Infect Dis. 2011;30(12):1615–1619.
- Two rules for early prediction of bacteremia: testing in a university and a community hospital. J Gen Intern Med. 1996;11(2):98–103.
- Does this adult patient with suspected bacteremia require blood cultures? JAMA. 2012;308(5):502–511.
- Clinical prediction rules for bacteremia and in‐hospital death based on clinical data at the time of blood withdrawal for culture: an evaluation of their development and use. J Eval Clin Pract. 2006;12(6):692–703.
- Timing of specimen collection for blood cultures from febrile patients with bacteremia. J Clin Microbiol. 2008;46(4):1381–1385.
- Usefulness of blood culture for hospitalized patients who are receiving antibiotic therapy. Clin Infect Dis. 2001;32(11):1651–1655.
- Clinical utility of blood cultures in adult patients with community‐acquired pneumonia without defined underlying risks. Chest. 1995;108(4):932–936.
- Blood cultures in community‐acquired pneumonia: are we ready to quit? Chest. 2003;123(4):977–978.
- Blood cultures for community‐acquired pneumonia: piecing together a mosaic for doing less. Am J Respir Crit Care Med. 2004;169(3):327–328.
© 2016 Society of Hospital Medicine
Access to Inpatient Dermatology Care in Pennsylvania Hospitals
Access to care is a known issue in dermatology, and many patients may experience long waiting periods to see a physician.1 Previous research has evaluated access to outpatient dermatology services, but access to dermatology in inpatient medicine is also a growing problem.2 Reports depict a decrease in dermatologist involvement in inpatient care and an increase in nondermatologist physicians caring for inpatients with dermatologic needs.2,3 This lack of access could potentially lead to missed and/or incorrect diagnoses. One study showed that most cases in which dermatology was consulted required a change in treatment once correctly diagnosed by a dermatologist.4
Despite the known trend of decreasing involvement of dermatologists in inpatient care, there remains a paucity of data quantifying the current gap in access to care for inpatients with dermatologic needs. The purpose of this study was to evaluate differential access to inpatient dermatology services across licensed hospitals within the state of Pennsylvania.
Methods
In July 2014, an invitation to participate in an anonymous online survey was mailed to all 274 hospitals throughout Pennsylvania that were currently licensed by the US Department of Health. This study was declared exempt from review by the University of Pennsylvania (Philadelphia, Pennsylvania) institutional review board. Study data were collected and managed using electronic data capture tools hosted by the University of Pennsylvania. Hospital administrators were encouraged to report dermatology access and details regardless of current status of inpatient dermatology services in order to inform efforts to improve access to care. Invitation letters to participate in the online survey were addressed to “Administrator” according to the contact method used by the US Department of Health for accreditation of state hospitals. Addresses for accredited state hospitals were obtained from the US Department of Health Web site and were supplemented with additional addresses of Veterans Administration hospitals obtained from public listings. Three weeks after initial survey invitations were sent, reminder letters were sent to nonresponsive hospitals. Only data from hospitals currently offering inpatient services were included in the analysis; exclusion criteria included psychiatric hospitals, substance abuse treatment centers, physical rehabilitation facilities, and outpatient centers.
Results
Of the 204 (74%) hospitals that met the inclusion criteria, 32 responded (16% response rate). Of the 32 hospitals that responded, 31 (97%) were privately owned facilities, 3 of which were specialty surgical centers. One (3%) hospital was a Veterans Administration hospital. Of the responders, 16 (50%) reported having any form of access to inpatient dermatology consultations. Of the 16 with reported access, 9 (56%) received their consultations through a local or private dermatology group, while 4 (25%) had a dermatologist on staff. The remaining 3 hospitals (19%) provided dermatology consultations through nondermatologist physicians on staff (a surgeon, an emergency care physician, and an internist, respectively).
The survey also sought to gain information about the various degrees of access to inpatient dermatology care that hospitals provide. Of the 16 hospitals that reported access to inpatient dermatology services, 11 (69%) provided specific details related to access (eg, coverage, anticipated response times) of dermatology consultations (Figure). The type of access to inpatient dermatology in relation to the type of hospital ownership is shown in the Table.
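The percentages reported above follow directly from the counts given in the text. As a quick consistency check, the sketch below recomputes each one, rounding to the nearest whole percent; the helper name `percent` and the dictionary labels are illustrative assumptions, while every count comes from the Methods and Results.

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts taken from the text; labels are hypothetical.
reported = {
    "met inclusion criteria (204 of 274)": (204, 274),
    "responded (32 of 204)": (32, 204),
    "privately owned (31 of 32)": (31, 32),
    "Veterans Administration (1 of 32)": (1, 32),
    "any inpatient dermatology access (16 of 32)": (16, 32),
    "local or private group (9 of 16)": (9, 16),
    "dermatologist on staff (4 of 16)": (4, 16),
    "nondermatologist physicians (3 of 16)": (3, 16),
    "provided access details (11 of 16)": (11, 16),
}

for label, (part, whole) in reported.items():
    print(f"{label}: {percent(part, whole)}%")
```

Each recomputed value matches the figure stated in the text (74%, 16%, 97%, 3%, 50%, 56%, 25%, 19%, and 69%).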
Comment
The survey results indicated suboptimal access to inpatient dermatology services in Pennsylvania hospitals. Only 50% (16/32) of respondents reported providing access to dermatology consultation, the majority of which appeared to have extremely limited same-day, evening, and weekend coverage. Although our study was limited by a low response rate (16%) and represents a narrow geographic distribution, these results suggested that lack of access to inpatient dermatology consultation may be a widespread problem and may be independent of the type of hospital ownership. Furthermore, the results of this study may offer insight into the different types and availability of inpatient dermatology services offered in hospitals across the United States.
The decrease in inpatient dermatology access has been driven by many factors. First, advances in medical research and pharmacotherapy may have decreased the need for dermatologic inpatient care, as patients who formerly would have required inpatient treatments are now able to receive therapies in an outpatient setting (eg, treatment of psoriasis).5 This may create less demand for hospitals to have a dermatologist on staff. Additionally, hospitals may be less able to incentivize dermatologists to provide inpatient dermatology consultations due to low reimbursement rates, time and distance required to visit inpatient facilities (taking away from outpatient clinic time), and the perception that inpatient cases carry greater liability given their greater complexity.6-8 Together, these factors may have contributed to the current lack of inpatient dermatology services in Pennsylvania hospitals and likely in hospitals throughout the United States.
Conclusion
Although a relatively small number of academic hospitals are experiencing an emergence of dermatology hospitalists, poor access to inpatient dermatology care continues to be a problem.8 Innovation (eg, the use of teledermatology to improve access to care9) and further studies are needed to address this gap in access to inpatient dermatology care.
- Kimball AB, Resneck JS. The US dermatology workforce: a specialty remains in shortage. J Am Acad Dermatol. 2008;59:741-745.
- Helms AE, Helms SE, Brodell RT. Hospital consultations: time to address an unmet need? J Am Acad Dermatol. 2009;60:308-311.
- Kirsner RS, Yang DG, Kerdel FA. The changing status of inpatient dermatology at American academic dermatology programs. J Am Acad Dermatol. 1999;40:755-757.
- Nahass GT, Meyer AJ, Campbell SF, et al. Prevalence of cutaneous findings in hospitalized medical patients. J Am Acad Dermatol. 1995;33:207-211.
- Steinke S, Peitsch WK, Ludwig A, et al. Cost-of-illness in psoriasis: comparing inpatient and outpatient therapy. PLoS One. 2013;8:e78152.
- Swerlick RA. Declining interest in medical dermatology. Arch Dermatol. 1998;134:1160-1162.
- Kirsner RS, Yang DG, Kerdel FA. Inpatient dermatology: the difficulties, the reality, and the future. Dermatol Clin. 2000;18:383-390.
- Fox LP, Cotliar J, Hughey L, et al. Hospitalist dermatology. J Am Acad Dermatol. 2009;61:153-154.
- Sharma P, Kovarik CL, Lipoff JB. Teledermatology as a means to improve access to inpatient dermatology care [published online ahead of print September 16, 2015]. J Telemed Telecare. PII: 1357633X15603298.
Practice Points
- Changes in inpatient dermatology care over the past few decades have led to barriers in patient access to care.
- Many hospitals currently lack access to inpatient dermatology care, and those that do provide access often have no same-day, evening, or weekend coverage or may only provide access to dermatology care via nondermatologist physicians.
- Intervention by a dermatologist may be essential in making correct dermatologic diagnoses and treatment recommendations in inpatient settings.
Web Page Content and Quality Assessed for Shoulder Replacement
The Internet is becoming a primary source for obtaining medical information. This growing trend may have serious implications for the medical field. As patients increasingly regard the Internet as an essential tool for obtaining health-related information, questions have been raised regarding the quality of medical information available on the Internet.1 Studies have shown that health-related sites often present inaccurate, inconsistent, and outdated information that may have a negative impact on health care decisions made by patients.2
According to the US Census Bureau, 71.7% of American households report having access to the Internet.3 Of those who have access to the Internet, approximately 72% have sought health information online over the last year.4 Among people older than age 65 years living in the United States, there has been a growing trend toward using the Internet, from 14% in 2000 to almost 60% in 2013, according to the Pew Research Internet Project.5 Most medical websites are viewed for information on diseases and treatment options.6 Since most patients want to be informed about treatment options, as well as risks and benefits for each treatment, access to credible information is essential for proper decision-making.7
To assess the quality of information on the Internet, we used DISCERN, a standardized questionnaire to aid consumers in judging Internet content.8 The DISCERN instrument, available at www.discern.org.uk, was designed by an expert group in the United Kingdom. First, an expert panel developed and tested the instrument, and then health care providers and self-help group members tested it further.8,9 The questionnaire has been found to have good interrater reliability, regardless of use by health professionals or consumers.8-10
More than 53,000 shoulder arthroplasties are performed in the United States annually, and the number is growing, with the main goal of pain relief from glenohumeral degenerative joint disease.11,12 The Internet has become a quasi–second opinion for patients trying to participate in their care. Given the prevalence of shoulder-related surgeries, it is critical to analyze and become familiar with the quality of information that patients read online in order to direct them to nonbiased, all-inclusive websites. In this study, we provide a summary assessment and comparison of the quality of online information pertaining to shoulder replacement, using medical (total shoulder arthroplasty) and nontechnical (shoulder replacement) search terms.
Methods
Websites were identified using 3 search engines (Google, Yahoo, and Bing) and 2 search terms, shoulder replacement (SR) and total shoulder arthroplasty (TSA), on January 17, 2014. These 3 search engines were used because 77% of online searches for health care–related information begin through a search engine (Google, Bing, Yahoo); only 13% begin at a health care–specialized website.4 These search terms were used after consulting with orthopedic residents and attending physicians in a focus group regarding the terminology used with patients. The first 30 websites in each search engine were identified consecutively and evaluated for category and quality of information using the DISCERN instrument.
A total of 180 websites (90 per search term) were reviewed. Each website was evaluated independently by 3 medical students. In the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram, we recorded how websites were identified, screened, and included (Figure 1).13 Websites that were duplicated within each search term and those that were inaccessible were used to determine the total number of noncommercial versus commercial websites, but were excluded from the final analysis. The first part of the analysis involved determining the type of website (eg, commercial vs noncommercial) based upon the domain endings. All .com endings were classified as commercial websites; noncommercial included .gov, .org, .edu, and .net endings. Next, each website was categorized based on the target audience. Websites were grouped into health professional–oriented information, patient-oriented, advertisement, or “other.” These classifications were based on those described in previous works.14,15 The “other” category included images, YouTube videos, another search engine, and open forums, which were also excluded from the final analysis because they were not easily evaluable with the DISCERN instrument. Websites were considered health professional–oriented if they included journal articles, scholarly articles, and/or rehabilitation protocols. Patient-directed websites clearly stated the information was directed to patients or provided a general overview. Advertisement included sites that displayed ads or products for sale. Websites were evaluated for quality using the DISCERN instrument (Figure 2).
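The commercial versus noncommercial rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the raters followed: the function name `classify_website`, the "unclassified" fallback for endings the text does not mention, and the requirement that each URL include a scheme such as http:// are all hypothetical choices made for the example.

```python
from urllib.parse import urlparse

# Endings the text lists as noncommercial; .com is the only commercial ending named.
NONCOMMERCIAL_ENDINGS = {".gov", ".org", ".edu", ".net"}


def classify_website(url: str) -> str:
    """Classify a URL as commercial or noncommercial by its domain ending.

    Assumes the URL carries a scheme (http:// or https://); endings not
    named in the text fall into a hypothetical "unclassified" bucket.
    """
    host = urlparse(url).hostname or ""
    ending = "." + host.rsplit(".", 1)[-1] if "." in host else ""
    if ending == ".com":
        return "commercial"
    if ending in NONCOMMERCIAL_ENDINGS:
        return "noncommercial"
    return "unclassified"


if __name__ == "__main__":
    for example in (
        "http://orthoinfo.aaos.org/topic.cfm?topic=A00094",
        "https://www.webmd.com/",
    ):
        print(example, "->", classify_website(example))
```

A rule this simple treats every .com site as commercial regardless of content, which is why the audience-based categories are assigned separately by the raters.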
DISCERN has 3 subdivision scores: the reliable score (composed of the first 8 questions), the treatment options score (the next 7 questions), and 1 final question that addresses the overall quality of the website and is rated independently of the first 15 questions. DISCERN uses 2 scales. The first is a binary scale anchored at both extremes: the number 1 represents complete absence of the criterion being measured, and the number 5 represents completeness of the quality being assessed. Between 1 and 5 is a partial ordinal scale from 2 to 4, which indicates the information is present to some extent but not complete. The ordinal scale allows ranking of the criteria being assessed. Summarizing values from the 2 scales poses some concern: the scale is not a true binary scale because of the ordinal middle values (2-4) and, as such, cannot be treated as an interval scale for calculating arithmetic means. To summarize the values from the 2 scales, we calculated the harmonic mean, the arithmetic mean, the geometric mean, and the median. The means were empirically compared with the median, and we used the harmonic mean to summarize scale values because it was the best approximation of the medians.
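The comparison of candidate means against the median can be illustrated with Python's standard `statistics` module. The ratings below are invented for the example and are not study data; the helper name `closest_mean_to_median` is likewise hypothetical. The sketch only shows the kind of empirical comparison described, under the assumption that "best approximation" means the smallest absolute distance from the median.

```python
import statistics


def closest_mean_to_median(ratings):
    """Return the name of the mean (arithmetic, geometric, or harmonic)
    that lies closest to the median, plus all computed summaries."""
    summaries = {
        "arithmetic": statistics.mean(ratings),
        "geometric": statistics.geometric_mean(ratings),
        "harmonic": statistics.harmonic_mean(ratings),
    }
    median = statistics.median(ratings)
    # Pick the mean with the smallest absolute distance from the median.
    best = min(summaries, key=lambda name: abs(summaries[name] - median))
    return best, median, summaries


if __name__ == "__main__":
    # Hypothetical DISCERN ratings (1-5) for a single question; not study data.
    example_ratings = [1, 2, 2, 2, 5, 5, 5]
    best, median, summaries = closest_mean_to_median(example_ratings)
    print("median:", median)
    for name, value in summaries.items():
        print(f"{name} mean: {value:.3f}")
    print("closest to median:", best)
```

For positive values the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, so the harmonic mean tends to track the median more closely when a few high ratings pull the arithmetic mean upward.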
Results
A total of 90 websites were assessed with the search term total shoulder arthroplasty and another 90 with shoulder replacement. When 37 duplicate websites for TSA and 52 for SR were eliminated, 53 (59%) and 38 (42%) unique websites were evaluated for each search term, respectively (Figure 1). (These unique websites are included in the Appendix.) Between the 2 search terms, 20 websites were duplicated. Figure 3 shows the distribution of websites by category. Total shoulder arthroplasty provided the highest percentage of health professional–oriented information; SR had the greatest percentage of patient-oriented information. Both TSA and SR had nearly the same number of advertisements and websites labeled “other.” The percentage of noncommercial websites from each search engine is represented in Figure 4. For SR, Google had 40% (12/30) noncommercial websites compared with Yahoo at 53% (16/30) and Bing at 47% (14/30). Total shoulder arthroplasty had 43% (13/30) noncommercial websites on Google, 27% (8/30) on Yahoo, and 40% (12/30) on Bing. In total, SR had more noncommercial websites, 47% (42/90), compared with 37% (33/90) for TSA.
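The counts in this paragraph can be rechecked with simple arithmetic. The short Python sketch below uses only the numbers reported above; the variable names are illustrative, and rounding to the nearest whole percent is an assumption about how the percentages were reported.

```python
# Each search term was run on 3 search engines, keeping the first 30 results.
SITES_PER_TERM = 3 * 30

# Duplicates removed within each search term, as reported in the text.
duplicates = {"TSA": 37, "SR": 52}
unique = {term: SITES_PER_TERM - dup for term, dup in duplicates.items()}
unique_pct = {term: round(100 * n / SITES_PER_TERM) for term, n in unique.items()}

# Noncommercial websites per engine (Google, Yahoo, Bing), as reported.
noncommercial = {"SR": [12, 16, 14], "TSA": [13, 8, 12]}
noncommercial_total = {term: sum(counts) for term, counts in noncommercial.items()}
noncommercial_pct = {
    term: round(100 * total / SITES_PER_TERM)
    for term, total in noncommercial_total.items()
}

if __name__ == "__main__":
    print("unique websites:", unique, unique_pct)
    print("noncommercial websites:", noncommercial_total, noncommercial_pct)
```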
The mean of all 3 raters for reliability (DISCERN questions 1-8) and treatment options (DISCERN questions 9-15) is represented in the Table. For both search terms, we found that websites identified as health professional–oriented had the highest reliable mean scores, followed by patient-oriented, and advertisement at the lowest (SR: P = .054; TSA: P = .134). For SR, treatment mean scores demonstrated similar results with health professional–oriented websites receiving the highest, followed by patient-oriented and advertisement (P = .005). However, the treatment mean scores for TSA differed with patient-oriented websites receiving higher scores than health professional–oriented websites, but this was not statistically significant (P = .407). Regarding search terms, there were no significant differences between mean reliable and treatment scores across all categories.
The average overall DISCERN score for TSA websites was 2.5 (range, 1-5), compared with 2.3 (range, 1-5) for SR websites. The overall reliable score (DISCERN questions 1-8) was 2.6 for TSA websites and 2.5 for SR websites (P < .001). For TSA websites, 38% (20/53) were classified as good, having an overall DISCERN score ≥3, versus 26% (10/38) of SR websites. The overall DISCERN score was 2.7 for health professional–oriented websites, 2.6 for patient-oriented websites, and 2.4 for advertisements, the lowest of the 3 categories.
Discussion
Both patients and health professionals obtain information on health care subjects through the Internet, which has become the primary resource for patients.15,16 However, there are no strict regulations of the content being written. This creates a challenge for the typical user to find credible and evidence-based information, which is important because misleading information could cause undue anxiety, among other effects.17,18 The aims of this study were to determine the quality of Internet information for shoulder replacement surgeries using the medical terminology total shoulder arthroplasty (TSA) and the nontechnical term shoulder replacement (SR), and to compare the results.
After analyzing the types of websites returned for both total shoulder arthroplasty and shoulder replacement (Figure 4), it was interesting to find that using nonmedical terminology as the search term provided more noncommercial websites compared with total shoulder arthroplasty. Furthermore, Yahoo provided the highest yield of noncommercial websites at 16, with Bing at 14, when using SR as the search term. We believe SR returned more noncommercial websites than TSA because SR yielded more patient-oriented websites, which usually had domain endings of .edu and .org, as shown in Figure 3 (48% of SR websites offered patient-oriented information).
Although there were more noncommercial websites for SR, the majority of the DISCERN values between the 2 search terms did not differ significantly. This is a direct result of the number of sites (20) that were duplicated across both search terms. However, as seen in the Table, TSA had similar reliable mean scores for advertisements and patient-oriented websites but a slightly higher reliable score for health professional–oriented websites. We correlated this with the increased number of health professional–oriented websites returned when using TSA as the search term (Figure 3). The health professional–oriented websites explained their aims and cited their sources more consistently than did patient-oriented sites and advertisements, resulting in higher reliable scores. Although patient-oriented websites frequently lacked citations, they provided information about multiple treatment options, which were more relevant to consumers. This resulted in nearly equivalent reliable scores. Treatment means for advertisements in both SR and TSA were similar. However, treatment means for professional-oriented websites in TSA were lower than those for SR because health professional–oriented websites often were only moderately relevant to consumers, with their focus usually on 1 treatment option or on rehabilitation protocols. Although the DISCERN scores were similar between the search terms, total shoulder arthroplasty provided more websites (20) classified as good (overall DISCERN score ≥3) than SR did (10). Advertisement websites had similar overall DISCERN scores, which we anticipated because most of the advertisements were duplicated across the search terms.
Using the 2 search terms, academic websites and commercial websites, such as WebMD, consistently received higher reliable and overall DISCERN scores. Advertisement websites, which need to deliver a clear message, frequently scored high on explicitly stating their aims and relevance to consumers, but focused on their products without discussing the benefits of other treatment options. This is significant because Internet search engines, such as Google, offer sponsor links for which organizations pay to appear at the top of the search results. This creates the potential for consumers to receive biased information because most individuals only visit the top 10 websites generated by a search engine.19
We concluded that the quality of online information relating to SR and TSA was highly variable and frequently of moderate-to-poor quality, with most overall DISCERN scores <3. The quality of information found online for this study using the DISCERN instrument is consistent with those studies using DISCERN to evaluate other medical conditions (eg, bunions, chronic pain, general anesthesia, and anterior cruciate ligament reconstruction).2,9,15,19 These studies also concluded that online information varies tremendously in quality and completeness.
This study has several limitations. Websites were searched at a single time point and, because Internet resources are frequently updated, the results of this study could vary. Furthermore, although Google, Yahoo, and Bing are 3 of the most popular search engines, these are not the only resources patients use when searching the Internet for health-related information. Other search engines, such as Pubmed.gov and MSN.com, could provide additional websites for Internet users. Lastly, although DISCERN is validated to address the quality of information available online, it does not evaluate the accuracy of the information.8 Our use of DISCERN involves 2 scales, a binary yes/no (ratings, 1 and 5) and an ordinal scale (ratings, 2-4). As such, a single mean summary statistic cannot be calculated.
Conclusion
The information available on the Internet pertaining to TSA and SR is highly variable and provides mostly moderate-to-poor quality information based on the DISCERN instrument. Many websites failed to describe the benefits and the risks of different treatment options, including nonoperative management. Health care professionals should be aware that patients often refer to the Internet as a primary resource for obtaining medical information. It is important to direct patients to websites that provide accurate information, because patients who educate themselves about their conditions and actively participate in decision-making may have improved health outcomes.20-22 Overall, academic websites and commercial websites, such as WebMD and OrthoInfo, generally had higher DISCERN scores when using either search term. Of major concern is the potential for misleading advertisements or incorrect information that can negatively affect health outcomes. This study found that using nonmedical terminology (SR) provided more noncommercial and patient-oriented websites, especially through Yahoo. This study highlights the need for more comprehensive online information pertaining to shoulder replacement that can better serve as a resource for Internet users.
1. Eysenbach G, Powell J, Kuss O, Sa ER. Empirical studies assessing the quality of health information for consumers on the world wide web: a systematic review. JAMA. 2002;287(20):2691-2700.
2. Bruce-Brand RA, Baker JF, Byrne DP, Hogan NA, McCarthy T. Assessment of the quality and content of information on anterior cruciate ligament reconstruction on the internet. Arthroscopy. 2013;29(6):1095-1100.
3. Computer and internet use in the United States: population characteristics. US Census Bureau website. http://www.census.gov/hhes/computer/. Accessed December 11, 2015.
4. Fox S, Duggan M. Health online 2013. Pew Research Center website. http://pewinternet.org/Reports/2013/Health-online.aspx. Published January 15, 2013. Accessed November 24, 2015.
5. Smith A. Older adults and technology use. Pew Research Center website. http://www.pewinternet.org/2014/04/03/older-adults-and-technology-use. Published April 3, 2014. Accessed November 24, 2015.
6. Shuyler KS, Knight KM. What are patients seeking when they turn to the internet? Qualitative content analysis of questions asked by visitors to an orthopaedics web site. J Med Internet Res. 2003;5(4):e24.
7. Meredith P, Emberton M, Wood C, Smith J. Comparison of patients’ needs for information on prostate surgery with printed materials provided by surgeons. Qual Health Care. 1995;4(1):18-23.
8. Charnock D, Shepperd S, Needham G, Gann R. DISCERN: An instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health. 1999;53(2):105-111.
9. Kaicker J, Debono VB, Dang W, Buckley N, Thabane L. Assessment of the quality and variability of health information on chronic pain websites using the DISCERN instrument. BMC Med. 2010;8(1):59.
10. Griffiths KM, Christensen H. Website quality indicators for consumers. J Med Internet Res. 2005;7(5):e55.
11. Wiater JM. Shoulder joint replacement. American Academy of Orthopaedic Surgeons website. http://orthoinfo.aaos.org/topic.cfm?topic=A00094. Updated December 2011. Accessed November 24, 2015.
12. Kim SH, Wise BL, Zhang Y, Szabo RM. Increasing incidence of shoulder arthroplasty in the United States. J Bone Joint Surg Am. 2011;93(24):2249-2254.
13. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med. 2009;151(4):W65-W94.
14. Nason GJ, Baker JF, Byrne DP, Noel J, Moore D, Kiely PJ. Scoliosis-specific information on the internet: has the “information highway” led to better information provision? Spine. 2012;37(21):E1364-E1369.
15. Starman JS, Gettys FK, Capo JA, Fleischli JE, Norton HJ, Karunakar MA. Quality and content of internet-based information for ten common orthopaedic sports medicine diagnoses. J Bone Joint Surg Am. 2010;92(7):1612-1618.
16. Bernstein J, Ahn J, Veillette C. The future of orthopaedic information management. J Bone Joint Surg Am. 2012;94(13):e95.
17. Berland GK, Elliott MN, Morales LS, et al. Health information on the Internet: accessibility, quality, and readability in English and Spanish. JAMA. 2001;285(20):2612-2621.
18. Fallowfield LJ, Hall A, Maguire GP, Baum M. Psychological outcomes of different treatment policies in women with early breast cancer outside a clinical trial. BMJ. 1990;301(6752):575-580.
19. Chong YM, Fraval A, Chandrananth J, Plunkett V, Tran P. Assessment of the quality of web-based information on bunions. Foot Ankle Int. 2013;34(8):1134-1139.
20. Brody DS, Miller SM, Lerman CE, Smith DG, Caputo GC. Patient perception of involvement in medical care. J Gen Intern Med. 1989;4(6):506-511.
21. Greenfield S, Kaplan S, Ware JE Jr. Expanding patient involvement in care. Effects on patient outcomes. Ann Intern Med. 1985;102(4):520-528.
22. Kaplan SH, Greenfield S, Ware JE Jr. Assessing the effects of physician-patient interactions on the outcomes of chronic disease. Med Care. 1989;27(3 suppl):S110-S127.
The Internet is becoming a primary source for obtaining medical information. This growing trend may have serious implications for the medical field. As patients increasingly regard the Internet as an essential tool for obtaining health-related information, questions have been raised regarding the quality of medical information available on the Internet.1 Studies have shown that health-related sites often present inaccurate, inconsistent, and outdated information that may have a negative impact on health care decisions made by patients.2
According to the US Census Bureau, 71.7% of American households report having access to the Internet.3 Of those who have access to Internet, approximately 72% have sought health information online over the last year.4 Among people older than age 65 years living in the United States, there has been a growing trend toward using the Internet, from 14% in 2000 to almost 60% in 2013, according to the Pew Research Internet Project.5 Most medical websites are viewed for information on diseases and treatment options.6 Since most patients want to be informed about treatment options, as well as risks and benefits for each treatment, access to credible information is essential for proper decision-making.7
To assess the quality of information on the Internet, we used DISCERN, a standardized questionnaire to aid consumers in judging Internet content.8 The DISCERN instrument, available at www.discern.org.uk, was designed by an expert group in the United Kingdom. First, an expert panel developed and tested the instrument, and then health care providers and self-help group members tested it further.8,9 The questionnaire had been found to have good interrater reliability, regardless of use by health professionals or consumers.8-10
More than 53,000 shoulder arthroplasties are performed in the United States annually, and the number is growing, with the main goal of pain relief from glenohumeral degenerative joint disease.11,12 The Internet has become a quasi–second opinion for patients trying to participate in their care. Given the prevalence of shoulder-related surgeries, it is critical to analyze and become familiar with the quality of information that patients read online in order to direct them to nonbiased, all-inclusive websites. In this study, we provide a summary assessment and comparison of the quality of online information pertaining to shoulder replacement, using medical (total shoulder replacement) and nontechnical (shoulder replacement) search terms.
Methods
Websites were identified using 3 search engines (Google, Yahoo, and Bing) and 2 search terms, shoulder replacement (SR) and total shoulder arthroplasty (TSA), on January 17, 2014. These 3 search engines were used because 77% of health care–related information online searches begin through a search engine (Google, Bing, Yahoo); only 13% begin at a health care–specialized website.4 These search terms were used after consulting with orthopedic residents and attending physicians in a focus group regarding the terminology used with patients. The first 30 websites in each search engine were identified consecutively and evaluated for category and quality of information using the DISCERN instrument.
A total of 180 websites (90 per search term) were reviewed. Each website was evaluated independently by 3 medical students. In the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram, we recorded how websites were identified, screened, and included (Figure 1).13 Websites that were duplicated within each search term and those that were inaccessible were used to determine the total number of noncommercial versus commercial websites, but were excluded from the final analysis. The first part of the analysis involved determining the type of website (eg, commercial vs noncommercial) based upon the html endings. All .com endings were classified as commercial websites; noncommercial included .gov, .org, .edu, and .net endings. Next, each website was categorized based on the target audience. Websites were grouped into health professional–oriented information, patient-oriented, advertisement, or “other.” These classifications were based on those described in previous works.14,15 The “other” category included images, YouTube videos, another search engine, and open forums, which were also excluded from the final analysis because they were not easily evaluable with the DISCERN instrument. Websites were considered health professional–oriented if they included journal articles, scholarly articles, and/or rehabilitation protocols. Patient-directed websites clearly stated the information was directed to patients or provided a general overview. Advertisement included sites that displayed ads or products for sale. Websites were evaluated for quality using the DISCERN instrument (Figure 2).
DISCERN has 3 subdivision scores: the reliable score (composed of the first 8 questions), the treatment options (the next 7 questions), and 1 final question that addresses the overall quality of the website and is rated independently of the first 15 questions. DISCERN uses 2 scales, a binary scale anchored on both extremes with the number 1 equaling complete absence of the criteria being measured, and the number 5 at the upper extreme, representing completeness of the quality being assessed. In between 1 and 5 is a partial ordinal scale measuring from 2 to 4, which indicates the information is present to some extent but not complete. The ordinal scale allows ranking of the criteria being assessed. Summarizing values from each of the 2 scales poses some concern: the scale is not a true binary scale because of the ordinal scale of the middle numbers (2-4), and as such, is not amenable to being an interval scale to calculate arithmetic means. To summarize the values from the 2 scales, we calculated the harmonic mean, the arithmetic mean, the geometric mean, and the median. The means were empirically compared with the median, and we used the harmonic mean to summarize scale values because it was the best approximation of the medians.
Results
A total of 90 websites were assessed with the search term total shoulder arthroplasty and another 90 with shoulder replacement. When 37 duplicate websites for TSA and 52 for SR were eliminated, 53 (59%) and 38 (42%) unique websites were evaluated for each search term, respectively (Figure 1). (These unique websites are included in the Appendix.) Between the 2 search terms, 20 websites were duplicated. Figure 3 shows the distribution of websites by category. Total shoulder arthroplasty provided the highest percentage of health professional–oriented information; SR had the greatest percentage of patient-oriented information. Both TSA and SR had nearly the same number of advertisements and websites labeled “other.” The percentage of noncommercial websites from each search engine is represented in Figure 4. For SR, Google had 40% (12/30) noncommercial websites compared with Yahoo at 53% (16/30) and Bing at 46% (14/30). Total shoulder arthroplasty had 43% (13/30) noncommercial websites on Google, 27% (8/30) on Yahoo, and 40% (12/30) on Bing. In total, SR had more noncommercial websites, 47% (42/90), compared with 37% (33/90) for TSA.
The mean of all 3 raters for reliablity (DISCERN questions 1-8) and treatment options (DISCERN questions 9-15) is represented in the Table. For both search terms, we found that websites identified as health professional–oriented had the highest reliable mean scores, followed by patient-oriented, and advertisement at the lowest (SR: P = .054; TSA: P = .134). For SR, treatment mean scores demonstrated similar results with health professional–oriented websites receiving the highest, followed by patient-oriented and advertisement (P = .005). However, the treatment mean scores for TSA differed with patient-oriented websites receiving higher scores than health professional–oriented websites, but this was not statistically significant (P= .407). Regarding search terms, there were no significant differences between mean reliable and treatment scores across all categories.
The average overall DISCERN score for TSA websites was 2.5 (range, 1-5), compared with 2.3 (range, 1-5) for SR websites. The overall reliable score (DISCERN questions 1-8) for TSA websites was 2.6 and 2.5 for SR websites (P < .001). For TSA websites, 38% (20/53) were classified as good, having an overall DISCERN score ≥3, versus 26% (10/38) of SR websites. The overall DISCERN score for health professional–oriented websites was 2.7, patient-oriented websites received a score of 2.6, and advertisements had the lowest score at 2.4.
Discussion
Both patients and health professionals obtain information on health care subjects through the Internet, which has become the primary resource for patients.15,16 However, there are no strict regulations of the content being written. This creates a challenge for the typical user to find credible and evidence-based information, which is important because misleading information could cause undue anxiety, among other effects.17,18 The aims of this study were to determine the quality of Internet information for shoulder replacement surgeries using the medical terminology total shoulder arthroplasty (TSA) and the nontechnical term shoulder replacement (SR), and to compare the results.
After analyzing the types of websites returned for both total shoulder arthroplasty and shoulder replacement (Figure 4), it was interesting to find that using nonmedical terminology as the search term provided more noncommercial websites compared with total shoulder arthroplasty. Furthermore, Yahoo provided the highest yield of noncommercial websites at 16, with Bing at 14, when using SR as the search term. We believe the increase in noncommercial websites returned for SR was greater than for TSA because SR yielded more patient-oriented websites, which usually had html endings of .edu and .org, as shown in Figure 3 (48% of SR websites offered patient-oriented information).
Although there were more noncommercial websites for SR, the majority of the DISCERN values between the 2 search terms did not differ significantly. This is a direct result of the number of sites (20) that were duplicated across both search terms. However as seen in the Table, TSA had similar reliable mean scores for advertisements and patient-oriented websites but a slightly higher reliable score for health professional–oriented websites. We correlated this with the increased number of health professional–oriented websites returned when using TSA as the search term (Figure 3). The health professional–oriented websites explained their aims and cited their sources more consistently than did patient-oriented sites and advertisements, resulting in higher reliable scores. Although patient-oriented websites frequently lacked citations, they provided information about multiple treatment options, which were more relevant to consumers. This resulted in nearly equivalent reliable scores. Treatment means for advertisements in both SR and TSA were similar. However, treatment means for professional-oriented websites in TSA were lower than those for SR because health professional–oriented websites often were only moderately relevant to consumers, with their focus usually on 1 treatment option or on rehabilitation protocols. Although the DISCERN scores were similar between the search terms, total shoulder arthroplasty provided more websites (20) classified as good—overall DISCERN score, ≥3—than SR did (10). Advertisement websites had similar overall DISCERN scores, which we anticipated because most of the advertisements were duplicated across the search terms.
Using the 2 search terms, academic websites and commercial websites, such as WebMD, consistently received higher reliable and overall DISCERN scores. Advertisement websites, which need to deliver a clear message, frequently scored high on explicitly stating their aims and relevance to consumers, but focused on their products without discussing the benefits of other treatment options. This is significant because Internet search engines, such as Google, offer sponsor links for which organizations pay to appear at the top of the search results. This creates the potential for consumers to receive biased information because most individuals only visit the top 10 websites generated by a search engine.19
We concluded that the quality of online information relating to SR and TSA was highly variable and frequently of moderate-to-poor quality, with most overall DISCERN scores <3. The quality of information found online for this study using the DISCERN instrument is consistent with those studies using DISCERN to evaluate other medical conditions (eg, bunions, chronic pain, general anesthesia, and anterior cruciate ligament reconstruction).2,9,15,19 These studies also concluded that online information varies tremendously in quality and completeness.
This study has several limitations. Websites were searched at a single time point and, because Internet resources are frequently updated, the results of this study could vary. Furthermore, although Google, Yahoo, and Bing are 3 of the most popular search engines, these are not the only resources patients use when searching the Internet for health-related information. Other search engines, such as Pubmed.gov and MSN.com, could provide additional websites for Internet users. Lastly, although DISCERN is validated to address the quality of information available online, it does not evaluate the accuracy of the information.8 Our use of DISCERN involves 2 scales, a binary yes/no (ratings, 1 and 5) and an ordinal scale (ratings, 2-4). As such, a single mean summary statistic cannot be calculated.
Conclusion
The information available on the Internet pertaining to TSA and SR is highly variable and provides mostly moderate-to-poor quality information based on the DISCERN instrument. Many websites failed to describe the benefits and the risks of different treatment options, including nonoperative management. Health care professionals should be aware that patients often refer to the Internet as a primary resource for obtaining medical information. It is important to direct patients to websites that provide accurate information, because patients who educate themselves about their conditions and actively participate in decision-making may have improved health outcomes.20-22 Overall, academic websites and commercial websites, such as WebMD and OrthoInfo, generally had higher DISCERN scores when using either search term. Of major concern is the potential for misleading advertisements or incorrect information that can negatively affect health outcomes. This study found that using nonmedical terminology (SR) provided more noncommercial and patient-oriented websites, especially through Yahoo. This study highlights the need for more comprehensive online information pertaining to shoulder replacement that can better serve as a resource for Internet users.
The Internet is becoming a primary source for obtaining medical information. This growing trend may have serious implications for the medical field. As patients increasingly regard the Internet as an essential tool for obtaining health-related information, questions have been raised regarding the quality of medical information available on the Internet.1 Studies have shown that health-related sites often present inaccurate, inconsistent, and outdated information that may have a negative impact on health care decisions made by patients.2
According to the US Census Bureau, 71.7% of American households report having access to the Internet.3 Of those who have access to Internet, approximately 72% have sought health information online over the last year.4 Among people older than age 65 years living in the United States, there has been a growing trend toward using the Internet, from 14% in 2000 to almost 60% in 2013, according to the Pew Research Internet Project.5 Most medical websites are viewed for information on diseases and treatment options.6 Since most patients want to be informed about treatment options, as well as risks and benefits for each treatment, access to credible information is essential for proper decision-making.7
To assess the quality of information on the Internet, we used DISCERN, a standardized questionnaire to aid consumers in judging Internet content.8 The DISCERN instrument, available at www.discern.org.uk, was designed by an expert group in the United Kingdom. First, an expert panel developed and tested the instrument, and then health care providers and self-help group members tested it further.8,9 The questionnaire had been found to have good interrater reliability, regardless of use by health professionals or consumers.8-10
More than 53,000 shoulder arthroplasties are performed in the United States annually, and the number is growing, with the main goal of pain relief from glenohumeral degenerative joint disease.11,12 The Internet has become a quasi–second opinion for patients trying to participate in their care. Given the prevalence of shoulder-related surgeries, it is critical to analyze and become familiar with the quality of information that patients read online in order to direct them to nonbiased, all-inclusive websites. In this study, we provide a summary assessment and comparison of the quality of online information pertaining to shoulder replacement, using medical (total shoulder arthroplasty) and nontechnical (shoulder replacement) search terms.
Methods
Websites were identified using 3 search engines (Google, Yahoo, and Bing) and 2 search terms, shoulder replacement (SR) and total shoulder arthroplasty (TSA), on January 17, 2014. These 3 search engines were used because 77% of health care–related information online searches begin through a search engine (Google, Bing, Yahoo); only 13% begin at a health care–specialized website.4 These search terms were used after consulting with orthopedic residents and attending physicians in a focus group regarding the terminology used with patients. The first 30 websites in each search engine were identified consecutively and evaluated for category and quality of information using the DISCERN instrument.
A total of 180 websites (90 per search term) were reviewed. Each website was evaluated independently by 3 medical students. In the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram, we recorded how websites were identified, screened, and included (Figure 1).13 Websites that were duplicated within each search term and those that were inaccessible were used to determine the total number of noncommercial versus commercial websites, but were excluded from the final analysis. The first part of the analysis involved determining the type of website (eg, commercial vs noncommercial) based on the domain ending of its URL. All .com endings were classified as commercial websites; noncommercial included .gov, .org, .edu, and .net endings. Next, each website was categorized based on the target audience. Websites were grouped as health professional–oriented, patient-oriented, advertisement, or “other.” These classifications were based on those described in previous works.14,15 The “other” category included images, YouTube videos, another search engine, and open forums, which were also excluded from the final analysis because they were not easily evaluable with the DISCERN instrument. Websites were considered health professional–oriented if they included journal articles, scholarly articles, and/or rehabilitation protocols. Patient-oriented websites clearly stated the information was directed to patients or provided a general overview. Advertisements included sites that displayed ads or products for sale. Websites were evaluated for quality using the DISCERN instrument (Figure 2).
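The commercial versus noncommercial rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the procedure the reviewers actually followed: the function name classify_site, the "unclassified" fallback for endings the study does not mention, and the WebMD example URL are assumptions added here.

```python
from urllib.parse import urlparse

# Endings the study treated as noncommercial; .com was treated as commercial.
NONCOMMERCIAL_ENDINGS = {"gov", "org", "edu", "net"}


def classify_site(url: str) -> str:
    """Return 'commercial', 'noncommercial', or 'unclassified' from the host ending."""
    host = urlparse(url).hostname or ""
    ending = host.rsplit(".", 1)[-1].lower()
    if ending == "com":
        return "commercial"
    if ending in NONCOMMERCIAL_ENDINGS:
        return "noncommercial"
    return "unclassified"  # assumption: the study does not say how other endings were handled


# Illustrative URLs only; the first two appear in the reference list.
for url in (
    "http://orthoinfo.aaos.org/topic.cfm?topic=A00094",
    "http://www.census.gov/hhes/computer/",
    "https://www.webmd.com/",
):
    print(url, "->", classify_site(url))
```

Note that a rule based on the ending alone places OrthoInfo (.org) in the noncommercial group and WebMD (.com) in the commercial group, regardless of content.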
DISCERN has 3 subdivision scores: the reliable score (composed of the first 8 questions), the treatment options (the next 7 questions), and 1 final question that addresses the overall quality of the website and is rated independently of the first 15 questions. DISCERN uses 2 scales, a binary scale anchored on both extremes with the number 1 equaling complete absence of the criteria being measured, and the number 5 at the upper extreme, representing completeness of the quality being assessed. In between 1 and 5 is a partial ordinal scale measuring from 2 to 4, which indicates the information is present to some extent but not complete. The ordinal scale allows ranking of the criteria being assessed. Summarizing values from each of the 2 scales poses some concern: the scale is not a true binary scale because of the ordinal scale of the middle numbers (2-4), and as such, is not amenable to being an interval scale to calculate arithmetic means. To summarize the values from the 2 scales, we calculated the harmonic mean, the arithmetic mean, the geometric mean, and the median. The means were empirically compared with the median, and we used the harmonic mean to summarize scale values because it was the best approximation of the medians.
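The comparison of summary statistics described above can be illustrated with Python's standard statistics module. The ratings below are hypothetical values on the 1-to-5 DISCERN scale, not data from this study, and the helper name summarize is an assumption. The sketch computes the three means and reports which one lies closest to the median.

```python
import statistics


def summarize(ratings):
    """Return the median, the three candidate means, and the mean closest to the median."""
    median = statistics.median(ratings)
    means = {
        "arithmetic": statistics.mean(ratings),
        "geometric": statistics.geometric_mean(ratings),
        "harmonic": statistics.harmonic_mean(ratings),
    }
    closest = min(means, key=lambda name: abs(means[name] - median))
    return median, means, closest


# Hypothetical ratings for one website (assumption, for illustration only).
ratings = [1, 1, 2, 2, 2, 3, 5, 5]
median, means, closest = summarize(ratings)
print("median:", median)
for name, value in means.items():
    print(f"{name} mean: {value:.2f}")
print("closest to median:", closest)
```

For positive values the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the harmonic mean is the one least pulled upward by a few ratings of 5. That property is one plausible reason it tracked the medians best in skewed score distributions.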
Results
A total of 90 websites were assessed with the search term total shoulder arthroplasty and another 90 with shoulder replacement. When 37 duplicate websites for TSA and 52 for SR were eliminated, 53 (59%) and 38 (42%) unique websites were evaluated for each search term, respectively (Figure 1). (These unique websites are included in the Appendix.) Between the 2 search terms, 20 websites were duplicated. Figure 3 shows the distribution of websites by category. Total shoulder arthroplasty provided the highest percentage of health professional–oriented information; SR had the greatest percentage of patient-oriented information. Both TSA and SR had nearly the same number of advertisements and websites labeled “other.” The percentage of noncommercial websites from each search engine is represented in Figure 4. For SR, Google had 40% (12/30) noncommercial websites compared with Yahoo at 53% (16/30) and Bing at 46% (14/30). Total shoulder arthroplasty had 43% (13/30) noncommercial websites on Google, 27% (8/30) on Yahoo, and 40% (12/30) on Bing. In total, SR had more noncommercial websites, 47% (42/90), compared with 37% (33/90) for TSA.
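The pooled noncommercial percentages follow directly from the per-engine counts reported above. The short check below uses only those reported counts; the variable and function names are illustrative.

```python
# Noncommercial counts among the first 30 results per engine, as reported in the Results.
noncommercial = {
    "SR": {"Google": 12, "Yahoo": 16, "Bing": 14},
    "TSA": {"Google": 13, "Yahoo": 8, "Bing": 12},
}
RESULTS_PER_ENGINE = 30


def pooled_share(counts):
    """Return (noncommercial total, websites reviewed, share) across engines."""
    total = sum(counts.values())
    reviewed = RESULTS_PER_ENGINE * len(counts)
    return total, reviewed, total / reviewed


for term, counts in noncommercial.items():
    total, reviewed, share = pooled_share(counts)
    print(f"{term}: {total}/{reviewed} = {share:.0%}")
```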
The mean of all 3 raters for reliability (DISCERN questions 1-8) and treatment options (DISCERN questions 9-15) is represented in the Table. For both search terms, we found that websites identified as health professional–oriented had the highest reliable mean scores, followed by patient-oriented, and advertisement at the lowest (SR: P = .054; TSA: P = .134). For SR, treatment mean scores demonstrated similar results with health professional–oriented websites receiving the highest, followed by patient-oriented and advertisement (P = .005). However, the treatment mean scores for TSA differed with patient-oriented websites receiving higher scores than health professional–oriented websites, but this was not statistically significant (P = .407). Regarding search terms, there were no significant differences between mean reliable and treatment scores across all categories.
The average overall DISCERN score for TSA websites was 2.5 (range, 1-5), compared with 2.3 (range, 1-5) for SR websites. The overall reliable score (DISCERN questions 1-8) for TSA websites was 2.6 and 2.5 for SR websites (P < .001). For TSA websites, 38% (20/53) were classified as good, having an overall DISCERN score ≥3, versus 26% (10/38) of SR websites. The overall DISCERN score for health professional–oriented websites was 2.7, patient-oriented websites received a score of 2.6, and advertisements had the lowest score at 2.4.
Discussion
Both patients and health professionals obtain information on health care subjects through the Internet, which has become the primary resource for patients.15,16 However, there are no strict regulations of the content being written. This creates a challenge for the typical user to find credible and evidence-based information, which is important because misleading information could cause undue anxiety, among other effects.17,18 The aims of this study were to determine the quality of Internet information for shoulder replacement surgeries using the medical terminology total shoulder arthroplasty (TSA) and the nontechnical term shoulder replacement (SR), and to compare the results.
After analyzing the types of websites returned for both total shoulder arthroplasty and shoulder replacement (Figure 4), it was interesting to find that using nonmedical terminology as the search term provided more noncommercial websites compared with total shoulder arthroplasty. Furthermore, Yahoo provided the highest yield of noncommercial websites at 16, with Bing at 14, when using SR as the search term. We believe SR returned more noncommercial websites than TSA because SR yielded more patient-oriented websites, which usually had domain endings of .edu and .org, as shown in Figure 3 (48% of SR websites offered patient-oriented information).
Although there were more noncommercial websites for SR, the majority of the DISCERN values between the 2 search terms did not differ significantly. This is a direct result of the number of sites (20) that were duplicated across both search terms. However, as seen in the Table, TSA had similar reliable mean scores for advertisements and patient-oriented websites but a slightly higher reliable score for health professional–oriented websites. We correlated this with the increased number of health professional–oriented websites returned when using TSA as the search term (Figure 3). The health professional–oriented websites explained their aims and cited their sources more consistently than did patient-oriented sites and advertisements, resulting in higher reliable scores. Although patient-oriented websites frequently lacked citations, they provided information about multiple treatment options, which were more relevant to consumers. This resulted in nearly equivalent reliable scores. Treatment means for advertisements in both SR and TSA were similar. However, treatment means for professional-oriented websites in TSA were lower than those for SR because health professional–oriented websites often were only moderately relevant to consumers, with their focus usually on 1 treatment option or on rehabilitation protocols. Although the DISCERN scores were similar between the search terms, total shoulder arthroplasty provided more websites (20) classified as good (overall DISCERN score ≥3) than SR did (10). Advertisement websites had similar overall DISCERN scores, which we anticipated because most of the advertisements were duplicated across the search terms.
Using the 2 search terms, academic websites and commercial websites, such as WebMD, consistently received higher reliable and overall DISCERN scores. Advertisement websites, which need to deliver a clear message, frequently scored high on explicitly stating their aims and relevance to consumers, but focused on their products without discussing the benefits of other treatment options. This is significant because Internet search engines, such as Google, offer sponsor links for which organizations pay to appear at the top of the search results. This creates the potential for consumers to receive biased information because most individuals only visit the top 10 websites generated by a search engine.19
We concluded that the quality of online information relating to SR and TSA was highly variable and frequently of moderate-to-poor quality, with most overall DISCERN scores <3. The quality of information found online in this study using the DISCERN instrument is consistent with that reported in studies using DISCERN to evaluate other medical conditions (eg, bunions, chronic pain, general anesthesia, and anterior cruciate ligament reconstruction).2,9,15,19 These studies also concluded that online information varies tremendously in quality and completeness.
This study has several limitations. Websites were searched at a single time point and, because Internet resources are frequently updated, the results of this study could vary. Furthermore, although Google, Yahoo, and Bing are 3 of the most popular search engines, these are not the only resources patients use when searching the Internet for health-related information. Other search engines, such as PubMed.gov and MSN.com, could provide additional websites for Internet users. Lastly, although DISCERN is validated to address the quality of information available online, it does not evaluate the accuracy of the information.8 Our use of DISCERN involves 2 scales, a binary yes/no (ratings, 1 and 5) and an ordinal scale (ratings, 2-4). As such, a single mean summary statistic cannot be calculated.
Conclusion
The information available on the Internet pertaining to TSA and SR is highly variable and provides mostly moderate-to-poor quality information based on the DISCERN instrument. Many websites failed to describe the benefits and the risks of different treatment options, including nonoperative management. Health care professionals should be aware that patients often refer to the Internet as a primary resource for obtaining medical information. It is important to direct patients to websites that provide accurate information, because patients who educate themselves about their conditions and actively participate in decision-making may have improved health outcomes.20-22 Overall, academic websites and commercial websites, such as WebMD and OrthoInfo, generally had higher DISCERN scores when using either search term. Of major concern is the potential for misleading advertisements or incorrect information that can negatively affect health outcomes. This study found that using nonmedical terminology (SR) provided more noncommercial and patient-oriented websites, especially through Yahoo. This study highlights the need for more comprehensive online information pertaining to shoulder replacement that can better serve as a resource for Internet users.
1. Eysenbach G, Powell J, Kuss O, Sa ER. Empirical studies assessing the quality of health information for consumers on the world wide web: a systematic review. JAMA. 2002;287(20):2691-2700.
2. Bruce-Brand RA, Baker JF, Byrne DP, Hogan NA, McCarthy T. Assessment of the quality and content of information on anterior cruciate ligament reconstruction on the internet. Arthroscopy. 2013;29(6):1095-1100.
3. Computer and internet use in the United States: population characteristics. US Census Bureau website. http://www.census.gov/hhes/computer/. Accessed December 11, 2015.
4. Fox S, Duggan M. Health online 2013. Pew Research Center website. http://pewinternet.org/Reports/2013/Health-online.aspx. Published January 15, 2013. Accessed November 24, 2015.
5. Smith A. Older adults and technology use. Pew Research Center website. http://www.pewinternet.org/2014/04/03/older-adults-and-technology-use. Published April 3, 2014. Accessed November 24, 2015.
6. Shuyler KS, Knight KM. What are patients seeking when they turn to the internet? Qualitative content analysis of questions asked by visitors to an orthopaedics web site. J Med Internet Res. 2003;5(4):e24.
7. Meredith P, Emberton M, Wood C, Smith J. Comparison of patients’ needs for information on prostate surgery with printed materials provided by surgeons. Qual Health Care. 1995;4(1):18-23.
8. Charnock D, Shepperd S, Needham G, Gann R. DISCERN: An instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health. 1999;53(2):105-111.
9. Kaicker J, Debono VB, Dang W, Buckley N, Thabane L. Assessment of the quality and variability of health information on chronic pain websites using the DISCERN instrument. BMC Med. 2010;8(1):59.
10. Griffiths KM, Christensen H. Website quality indicators for consumers. J Med Internet Res. 2005;7(5):e55.
11. Wiater JM. Shoulder joint replacement. American Academy of Orthopaedic Surgeons website. http://orthoinfo.aaos.org/topic.cfm?topic=A00094. Updated December 2011. Accessed November 24, 2015.
12. Kim SH, Wise BL, Zhang Y, Szabo RM. Increasing incidence of shoulder arthroplasty in the United States. J Bone Joint Surg Am. 2011;93(24):2249-2254.
13. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med. 2009;151(4):W65-W94.
14. Nason GJ, Baker JF, Byrne DP, Noel J, Moore D, Kiely PJ. Scoliosis-specific information on the internet: has the “information highway” led to better information provision? Spine. 2012;37(21):E1364-E1369.
15. Starman JS, Gettys FK, Capo JA, Fleischli JE, Norton HJ, Karunakar MA. Quality and content of internet-based information for ten common orthopaedic sports medicine diagnoses. J Bone Joint Surg Am. 2010;92(7):1612-1618.
16. Bernstein J, Ahn J, Veillette C. The future of orthopaedic information management. J Bone Joint Surg Am. 2012;94(13):e95.
17. Berland GK, Elliott MN, Morales LS, et al. Health information on the Internet: accessibility, quality, and readability in English and Spanish. JAMA. 2001;285(20):2612-2621.
18. Fallowfield LJ, Hall A, Maguire GP, Baum M. Psychological outcomes of different treatment policies in women with early breast cancer outside a clinical trial. BMJ. 1990;301(6752):575-580.
19. Chong YM, Fraval A, Chandrananth J, Plunkett V, Tran P. Assessment of the quality of web-based information on bunions. Foot Ankle Int. 2013;34(8):1134-1139.
20. Brody DS, Miller SM, Lerman CE, Smith DG, Caputo GC. Patient perception of involvement in medical care. J Gen Intern Med. 1989;4(6):506-511.
21. Greenfield S, Kaplan S, Ware JE Jr. Expanding patient involvement in care. Effects on patient outcomes. Ann Intern Med. 1985;102(4):520-528.
22. Kaplan SH, Greenfield S, Ware JE Jr. Assessing the effects of physician-patient interactions on the outcomes of chronic disease. Med Care. 1989;27(3 suppl):S110-S127.
1. Eysenbach G, Powell J, Kuss O, Sa ER. Empirical studies assessing the quality of health information for consumers on the world wide web: a systematic review. JAMA. 2002;287(20):2691-2700.
2. Bruce-Brand RA, Baker JF, Byrne DP, Hogan NA, McCarthy T. Assessment of the quality and content of information on anterior cruciate ligament reconstruction on the internet. Arthroscopy. 2013;29(6):1095-1100.
3. Computer and internet use in the United States: population characteristics. US Census Bureau website. http://www.census.gov/hhes/computer/. Accessed December 11, 2015.
4. Fox S, Duggan M. Health online 2013. Pew Research Center website. http://pewinternet.org/Reports/2013/Health-online.aspx. Published January 15, 2013. Accessed November 24, 2015.
5. Smith A. Older adults and technology use. Pew Research Center website. http://www.pewinternet.org/2014/04/03/older-adults-and-technology-use. Published April 3, 2014. Accessed November 24, 2015.
6. Shuyler KS, Knight KM. What are patients seeking when they turn to the internet? Qualitative content analysis of questions asked by visitors to an orthopaedics web site. J Med Internet Res. 2003;5(4):e24.
7. Meredith P, Emberton M, Wood C, Smith J. Comparison of patients’ needs for information on prostate surgery with printed materials provided by surgeons. Qual Health Care. 1995;4(1):18-23.
8. Charnock D, Shepperd S, Needham G, Gann R. DISCERN: An instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health. 1999;53(2):105-111.
9. Kaicker J, Debono VB, Dang W, Buckley N, Thabane L. Assessment of the quality and variability of health information on chronic pain websites using the DISCERN instrument. BMC Med. 2010;8(1):59.
10. Griffiths KM, Christensen H. Website quality indicators for consumers. J Med Internet Res. 2005;7(5):e55.
11. Wiater JM. Shoulder joint replacement. American Academy of Orthopedic Surgeons website. http://orthoinfo.aaos.org/topic.cfm?topic=A00094. Updated December 2011. Accessed November 24, 2015.
12. Kim SH, Wise BL, Zhang Y, Szabo RM. Increasing incidence of shoulder arthroplasty in the united states. J Bone Joint Surg Am. 2011;93(24):2249-2254.
13. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med. 2009;151(4):W65-W94.
14. Nason GJ, Baker JF, Byrne DP, Noel J, Moore D, Kiely PJ. Scoliosis-specific information on the internet: has the “information highway” led to better information provision? Spine. 2012;37(21):E1364-E1369.
15. Starman JS, Gettys FK, Capo JA, Fleischli JE, Norton HJ, Karunakar MA. Quality and content of internet-based information for ten common orthopaedic sports medicine diagnoses. J Bone Joint Surg Am. 2010;92(7):1612-1618.
16. Bernstein J, Ahn J, Veillette C. The future of orthopaedic information management. J Bone Joint Surg Am. 2012;94(13):e95.
17. Berland GK, Elliott MN, Morales LS, et al. Health information on the Internet: accessibility, quality, and readability in English and Spanish. JAMA. 2001;285(20):2612-2621.
18. Fallowfield LJ, Hall A, Maguire GP, Baum M. Psychological outcomes of different treatment policies in women with early breast cancer outside a clinical trial. BMJ. 1990;301(6752):575-580.
19. Chong YM, Fraval A, Chandrananth J, Plunkett V, Tran P. Assessment of the quality of web-based information on bunions. Foot Ankle Int. 2013;34(8):1134-1139.
20. Brody DS, Miller SM, Lerman CE, Smith DG, Caputo GC. Patient perception of involvement in medical care. J Gen Intern Med. 1989;4(6):506-511.
21. Greenfield S, Kaplan S, Ware JE Jr. Expanding patient involvement in care. Effects on patient outcomes. Ann Intern Med. 1985;102(4):520-528.
22. Kaplan SH, Greenfield S, Ware JE Jr. Assessing the effects of physician-patient interactions on the outcomes of chronic disease. Med Care. 1989;27(3 suppl):S110-S127.
Incidence, Risk Factors, and Outcome Trends of Acute Kidney Injury in Elective Total Hip and Knee Arthroplasty
Degenerative arthritis is a widespread chronic condition with an incidence of almost 43 million and annual health care costs of $60 billion in the United States alone.1 Although many cases can be managed symptomatically with medical therapy and intra-articular injections,2 many patients experience disease progression resulting in decreased ambulatory ability and work productivity. For these patients, elective hip and knee arthroplasties can drastically improve quality of life and functionality.3,4 Over the past decade, there has been a marked increase in the number of primary and revision total hip and knee arthroplasties performed in the United States. By 2030, the demand for primary total hip arthroplasties will grow an estimated 174%, to 572,000 procedures. Likewise, the demand for primary total knee arthroplasties is projected to grow by 673%, to 3.48 million procedures.5 However, though better surgical techniques and technology have led to improved functional outcomes, there is still substantial risk for complications in the perioperative period, especially in the geriatric population, in which substantial comorbidities are common.6-9
Acute kidney injury (AKI) is a common public health problem in hospitalized patients and in patients undergoing procedures. More than one-third of all AKI cases occur in surgical settings.10,11 Over the past decade, both community-acquired and in-hospital AKIs rapidly increased in incidence in all major clinical settings.12-14 Patients with AKI have high rates of adverse outcomes during hospitalization and discharge.11,15 Sequelae of AKIs include worsening chronic kidney disease (CKD) and progression to end-stage renal disease, necessitating either long-term dialysis or transplantation.12 This in turn leads to exacerbated disability, diminished quality of life, and disproportionate burden on health care resources.
Much of our knowledge about postoperative AKI has been derived from cardiovascular, thoracic, and abdominal surgery settings. However, there is a paucity of data on epidemiology and trends for either AKI or associated outcomes in patients undergoing major orthopedic surgery. The few studies to date either were single-center or had inadequate sample sizes for appropriately powered analysis of the risk factors and outcomes related to AKI.16
In the study reported here, we analyzed a large cohort of patients from a nationwide multicenter database to determine the incidence of and risk factors for AKI. We also examined the mortality and adverse discharges associated with AKI after major joint surgery. Lastly, we assessed temporal trends in both incidence and outcomes of AKI, including the death risk attributable to AKI.
Methods
Database
We extracted our study cohort from the Nationwide Inpatient Sample (NIS) and the National Inpatient Sample of Healthcare Cost and Utilization Project (HCUP) compiled by the Agency for Healthcare Research and Quality.17 NIS, the largest inpatient care database in the United States, stores data from almost 8 million stays in about 1000 hospitals across the country each year. Its participating hospital pool consists of about 20% of US community hospitals, resulting in a sampling frame comprising about 90% of all hospital discharges in the United States. This allows for calculation of precise, weighted nationwide estimates. Data elements within NIS are drawn from hospital discharge abstracts that indicate all procedures performed. NIS also stores information on patient characteristics, length of stay (LOS), discharge disposition, postoperative morbidity, and observed in-hospital mortality. However, it stores no information on long-term follow-up or complications after discharge.
Data Analysis
For the period 2002–2012, we queried the NIS database for hip and knee arthroplasties with primary diagnosis codes for osteoarthritis and secondary codes for AKI. We excluded patients under age 18 years and patients with diagnosis codes for hip and knee fracture/necrosis, inflammatory/infectious arthritis, or bone neoplasms (Table 1). We then extracted baseline characteristics of the study population. Patient-level characteristics included age, sex, race, quartile classification of median household income according to postal (ZIP) code, and primary payer (Medicare/Medicaid, private insurance, self-pay, no charge). Hospital-level characteristics included hospital location (urban, rural), hospital bed size (small, medium, large), region (Northeast, Midwest/North Central, South, West), and teaching status. We defined illness severity and likelihood of death using Deyo’s modification of the Charlson Comorbidity Index (CCI), which draws on principal and secondary ICD-9-CM (International Classification of Diseases, Ninth Revision-Clinical Modification) diagnosis codes, procedure codes, and patient demographics to estimate a patient’s mortality risk. This method reliably predicts mortality and readmission in the orthopedic population.18,19 We assessed the effect of AKI on 4 outcomes: in-hospital mortality, discharge disposition, LOS, and cost of stay. Discharge disposition was grouped as either (a) home or short-term facility or (b) adverse discharge. Home or short-term facility covered routine, short-term hospital, against medical advice, home intravenous provider, another rehabilitation facility, another institution for outpatient services, institution for outpatient services, discharged alive, and destination unknown; adverse discharge covered skilled nursing facility, intermediate care, hospice home, hospice medical facility, long-term care hospital, and certified nursing facility. This dichotomization of discharge disposition is often used in studies of NIS data.20
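The grouping of discharge dispositions described above amounts to a lookup from disposition to one of 2 categories. The sketch below is a minimal illustration under stated assumptions: the free-text labels are copied from the description in this section, whereas the NIS itself stores coded values that are not reproduced here, and the function name dichotomize is hypothetical.

```python
# Labels copied from the Methods text; the NIS stores coded values, so these
# strings are illustrative stand-ins rather than actual database codes.
HOME_OR_SHORT_TERM = {
    "routine",
    "short-term hospital",
    "against medical advice",
    "home intravenous provider",
    "another rehabilitation facility",
    "another institution for outpatient services",
    "institution for outpatient services",
    "discharged alive",
    "destination unknown",
}
ADVERSE_DISCHARGE = {
    "skilled nursing facility",
    "intermediate care",
    "hospice home",
    "hospice medical facility",
    "long-term care hospital",
    "certified nursing facility",
}


def dichotomize(disposition: str) -> str:
    """Map a disposition label to one of the 2 groups used in the analysis."""
    label = disposition.strip().lower()
    if label in ADVERSE_DISCHARGE:
        return "adverse discharge"
    if label in HOME_OR_SHORT_TERM:
        return "home or short-term facility"
    raise ValueError(f"unrecognized disposition: {disposition!r}")


print(dichotomize("Skilled nursing facility"))  # adverse discharge
print(dichotomize("Routine"))                   # home or short-term facility
```

Raising an error for unrecognized labels, rather than defaulting to one group, keeps a miscoded record from silently shifting the adverse discharge rate.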
Statistical Analyses
We compared the baseline characteristics of hospitalized patients with and without AKI. To test for significance, we used the χ2 test for categorical variables, the Student t test for normally distributed continuous variables, the Wilcoxon rank sum test for non-normally distributed continuous variables, and the Cochran-Armitage test for trends in AKI incidence. We used survey logistic regression models to calculate adjusted odds ratios (ORs) with 95% confidence intervals (95% CIs) in order to estimate the predictors of AKI and the impact of AKI on hospital outcomes. We constructed final models after adjusting for confounders, testing for potential interactions, and ensuring no multicollinearity between covariates. Last, we computed the risk proportion of death attributable to AKI, indicating the proportion of deaths that could potentially be avoided if AKI and its complications were abrogated.21
We performed all statistical analyses with SAS Version 9.3 (SAS Institute) using designated weight values to produce weighted national estimates. The threshold for statistical significance was set at P < .01 (with ORs and 95% CIs that excluded 1).
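The attributable risk proportion mentioned above can be illustrated with the standard formula for the attributable proportion among the exposed, (Re − Ru) / Re, where Re and Ru are the risks of death with and without AKI. This is a sketch under an assumption: the article does not state the exact formula or adjustment it used, so the function below is not presented as the authors' method. The inputs are the crude in-hospital mortality figures reported in the Results (2.08% with AKI vs 0.06% without).

```python
def attributable_proportion(risk_exposed: float, risk_unexposed: float) -> float:
    """(Re - Ru) / Re: share of risk among the exposed attributable to the exposure."""
    if risk_exposed <= 0:
        raise ValueError("risk_exposed must be positive")
    return (risk_exposed - risk_unexposed) / risk_exposed


# Crude in-hospital mortality from the Results: 2.08% with AKI, 0.06% without.
ap = attributable_proportion(0.0208, 0.0006)
print(f"attributable proportion among patients with AKI: {ap:.1%}")
```

The crude figure is close to, but not identical with, the year-specific proportions reported in Table 4 (97.9% in 2002 and 97.3% in 2012), which is expected if those values were derived per year or from a different calculation.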
Results
AKI Incidence, Risk Factors, and Trends
We identified 7,235,251 patients who underwent elective hip or knee arthroplasty for osteoarthritis between 2002 and 2012, an estimate consistent with data from the Centers for Disease Control and Prevention.22 Of that total, 94,367 (1.3%) had AKI. The proportion of discharges diagnosed with AKI increased rapidly over the decade, from 0.5% in 2002 to 1.8% to 1.9% in the period 2010–2012. This upward trend was highly significant (Ptrend < .001) (Figure 1). Patients with AKI (vs patients without AKI) were more likely to be older (mean age, 70 vs 66 years; P < .001), male (50.8% vs 38.4%; P < .001), and black (10.07% vs 5.15%; P < .001). They were also found to have a significantly higher comorbidity score (mean CCI, 2.8 vs 1.5; P < .001) and higher proportions of comorbidities, including hypertension, CKD, atrial fibrillation, diabetes mellitus (DM), congestive heart failure, chronic liver disease, and hepatitis C virus infection. In addition, AKI was associated with perioperative myocardial infarction (MI), sepsis, cardiac catheterization, and blood transfusion. Regarding socioeconomic characteristics, patients with AKI were more likely to have Medicare/Medicaid insurance (72.26% vs 58.06%; P < .001) and to belong to the extremes of income categories (Table 2).
Using multivariable logistic regression, we found that increased age (1.11 increase in adjusted OR for every year older; 95% CI, 1.09-1.14; P < .001), male sex (adjusted OR, 1.65; 95% CI, 1.60-1.71; P < .001), and black race (adjusted OR, 1.57; 95% CI, 1.45-1.69; P < .001) were significantly associated with postoperative AKI. Regarding comorbidities, baseline CKD (adjusted OR, 8.64; 95% CI, 8.14-9.18; P < .001) and congestive heart failure (adjusted OR, 2.74; 95% CI, 2.57-2.92; P < .0001) were most significantly associated with AKI. Perioperative events, including sepsis (adjusted OR, 35.64; 95% CI, 30.28-41.96; P < .0001), MI (adjusted OR, 6.14; 95% CI, 5.17-7.28; P < .0001), and blood transfusion (adjusted OR, 2.28; 95% CI, 2.15-2.42; P < .0001), were also strongly associated with postoperative AKI. Last, compared with urban hospitals and small hospital bed size, rural hospitals (adjusted OR, 0.70; 95% CI, 0.60-0.81; P < .001) and large bed size (adjusted OR, 0.82; 95% CI, 0.70-0.93; P = .003) were associated with lower probability of developing AKI (Table 3).
Figure 2 shows the frequency of AKI by combination of key preoperative comorbid conditions and postoperative complications. The proportion of AKI cases associated with other postoperative complications is significantly higher in the CKD and concomitant DM/CKD patient populations. Patients hospitalized with CKD exhibited higher rates of AKI in cases involving blood transfusion (20.9% vs 1.8%; P < .001), acute MI (48.9% vs 13.8%; P < .001), and sepsis (74.7% vs 36.3%; P < .001) relative to patients without CKD. Similarly, patients with concomitant DM/CKD exhibited higher rates of AKI in cases involving blood transfusion (23% vs 1.9%; P < .001), acute MI (51.1% vs 12.1%; P < .001), and sepsis (75% vs 38.2%; P < .001) relative to patients without either condition. However, patients hospitalized with DM alone exhibited only marginally higher rates of AKI in cases involving blood transfusion (4.7% vs 2%; P < .01) and acute MI (19.2% vs 16.7%; P < .01) and a lower rate in cases involving sepsis (38.2% vs 41.7%; P < .01) relative to patients without DM. These data suggest that CKD is the most significant clinically relevant risk factor for AKI and that CKD may synergize with DM to raise the risk for AKI.
Outcomes
We then analyzed the impact of AKI on hospital outcomes, including in-hospital mortality, discharge disposition, LOS, and cost of care. Mortality was significantly higher in patients with AKI than in patients without it (2.08% vs 0.06%; P < .001). Even after adjusting for confounders (eg, demographics, comorbidity burden, perioperative sepsis, hospital-level characteristics), AKI was still associated with strikingly higher odds of in-hospital death (adjusted OR, 11.32; 95% CI, 9.34-13.74; P < .001). However, analysis of temporal trends indicated that the odds for adjusted mortality associated with AKI decreased from 18.09 to 9.45 (Ptrend = .01) over the period 2002–2012 (Figure 3). This decrease in odds of death was countered by an increase in incidence of AKI, resulting in a stable attributable risk proportion (97.9% in 2002 to 97.3% in 2012; Ptrend = .90) (Table 4). Regarding discharge disposition, patients with AKI were much less likely to be discharged home (41.35% vs 62.59%; P < .001) and more likely to be discharged to long-term care (56.37% vs 37.03%; P < .001). After adjustment for confounders, AKI was associated with significantly increased odds of adverse discharge (adjusted OR, 2.24; 95% CI, 2.12-2.36; P < .001). Analysis of temporal trends revealed no appreciable decrease in the adjusted odds of adverse discharge between 2002 (adjusted OR, 1.87; 95% CI, 1.37-2.55; P < .001) and 2012 (adjusted OR, 1.93; 95% CI, 1.76-2.11; P < .001) (Figure 4, Table 5). Last, both mean LOS (5 days vs 3 days; P < .001) and mean cost of hospitalization (US $22,269 vs $15,757; P < .001) were significantly higher in patients with AKI.
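The text does not spell out how the attributable risk proportion was computed. A minimal sketch, assuming the common definition of the attributable fraction among the exposed, (risk with AKI minus risk without AKI) divided by risk with AKI, and using the pooled crude mortality figures reported above as inputs (the yearly inputs behind the 97.9% and 97.3% values are not given here):

```python
def attributable_risk_proportion(risk_exposed, risk_unexposed):
    """Fraction of the risk in the exposed group attributable to the exposure.

    Assumed formula (not confirmed by the source): (Ie - Iu) / Ie.
    """
    if risk_exposed <= 0:
        raise ValueError("risk_exposed must be positive")
    return (risk_exposed - risk_unexposed) / risk_exposed


# Pooled crude in-hospital mortality reported in the text: 2.08% vs 0.06%.
arp = attributable_risk_proportion(0.0208, 0.0006)
print(f"Attributable risk proportion: {100 * arp:.1f}%")
```

Under this assumption the pooled figure comes out near 97%, in the same range as the yearly values reported in the text.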
Discussion
In this study, we found that the incidence of AKI among patients hospitalized for elective hip or knee arthroplasty increased nearly 4-fold between 2002 and 2012. Moreover, we identified numerous patient-specific, hospital-specific, and perioperative risk factors for AKI. Most important, we found that AKI was associated with a strikingly higher risk of in-hospital death, and surviving patients were more likely to experience adverse discharge. Although the adjusted odds of mortality associated with AKI decreased over that decade, the attributable risk proportion remained stable.
Few studies have addressed this significant public health concern. In one recent study in Australia, Kimmel and colleagues16 identified risk factors for AKI but lacked data on AKI outcomes. In a study of complications and mortality occurring after orthopedic surgery, Belmont and colleagues22 categorized complications as either local or systemic but did not examine renal complications. Only 2 other major studies have been conducted on renal outcomes associated with major joint surgery, and both were limited to patients with acute hip fractures. The first included acute fracture surgery patients and omitted elective joint surgery patients, and it evaluated admission renal function but not postoperative AKI.22 The second study had a sample size of only 170 patients.23 Thus, the literature leaves us with a crucial knowledge gap in renal outcomes and their postoperative impact in elective arthroplasties.
The present study filled this information gap by examining the incidence, risk factors, outcomes, and temporal trends of AKI after elective hip and knee arthroplasties. The increasing incidence of AKI in this surgical setting is similar to that of AKI in other surgical settings (cardiac and noncardiac).21 Although our analysis was limited by lack of perioperative management data, patients undergoing elective joint arthroplasty can experience kidney dysfunction for several reasons, including volume depletion, postoperative sepsis, and influence of medications, such as nonsteroidal anti-inflammatory drugs (NSAIDs), especially in older patients with more comorbidities and a higher burden of CKD. Each of these factors can cause renal dysfunction in patients having orthopedic procedures.24 Moreover, NSAID use among elective joint arthroplasty patients is likely higher because of an emphasis on multimodal analgesia, as recent randomized controlled trials have demonstrated the efficacy of NSAID use in controlling pain without increasing bleeding.25-27 Our results also demonstrated that the absolute incidence of AKI after orthopedic surgery is relatively low. One possible explanation for this phenomenon is that the definitions used were based on ICD-9-CM codes that underestimate the true incidence of AKI.
Consistent with other studies, we found that certain key preoperative comorbid conditions and postoperative events were associated with higher AKI risk. We stratified the rate of AKI associated with each postoperative event (sepsis, acute MI, cardiac catheterization, need for transfusion) by DM/CKD comorbidity. CKD was associated with significantly higher AKI risk across all postoperative complications. These findings may help clinicians determine at the bedside which patients are at higher or lower risk for AKI.
Our analysis of patient outcomes revealed that, though AKI was relatively uncommon, it was associated with more than 10-fold higher odds of death during hospitalization over the period 2002–2012. Although the adjusted OR of in-hospital mortality decreased over the decade studied, the concurrent increase in AKI incidence caused the attributable risk of death associated with AKI to remain essentially the same. This observation is consistent with recent reports from cardiac surgery settings.21 These data together suggest that reducing occurrences of AKI would decrease mortality and increase quality of care for patients undergoing elective joint surgeries.
We also examined the effect of AKI on resource use by studying LOS, costs, and risk for adverse discharge. Much as in other surgical settings, AKI increased both LOS and overall hospitalization costs. More important, AKI was associated with increased adverse discharge (discharge to long-term care or nursing homes). Although exact reasons are unclear, we can speculate that postoperative renal dysfunction precludes early rehabilitation, impeding desired functional outcome and disposition.28,29 Given the projected increases in primary and revision hip and knee arthroplasties,5 these data predict that the impact of AKI on health outcomes will increase alarmingly in coming years.
There are limitations to our study. First, it was based on administrative data and lacked patient-level and laboratory data. As reported, the sensitivity of AKI codes remains moderate,30 so the true burden may be higher than indicated here. As the definition of AKI was based on administrative coding, we also could not estimate severity, though previous studies have found that administrative codes typically capture a more severe form of disease.31 A second limitation is that, because the data were deidentified, we could not delineate the risk for recurrent AKI in repeated surgical procedures, though this cohort was unlikely to be large enough to qualitatively affect our results. A third limitation is that, though we used CCI to adjust for the comorbidity burden, we were unable to account for other unmeasured confounders associated with increased AKI incidence, such as specific medication use. In addition, given the lack of patient-level data, we could not analyze the specific factors responsible for AKI in the perioperative period. Nevertheless, the strengths of a nationally representative sample, such as large sample size and generalizability, outweigh these limitations.
Conclusion
AKI is potentially an important quality indicator of elective joint surgery, and reducing its incidence is therefore essential for quality improvement. Given that hip and knee arthroplasties are projected to increase exponentially, as is the burden of comorbid conditions in this population, postoperative AKI will continue to have an incremental impact on health and health care resources. Thus, a carefully planned approach of interdisciplinary perioperative care is warranted to reduce both the risk and the consequences of this devastating condition.
1. Reginster JY. The prevalence and burden of arthritis. Rheumatology. 2002;41(suppl 1):3-6.
2. Kullenberg B, Runesson R, Tuvhag R, Olsson C, Resch S. Intraarticular corticosteroid injection: pain relief in osteoarthritis of the hip? J Rheumatol. 2004;31(11):2265-2268.
3. Kawasaki M, Hasegawa Y, Sakano S, Torii Y, Warashina H. Quality of life after several treatments for osteoarthritis of the hip. J Orthop Sci. 2003;8(1):32-35.
4. Ethgen O, Bruyère O, Richy F, Dardennes C, Reginster JY. Health-related quality of life in total hip and total knee arthroplasty. A qualitative and systematic review of the literature. J Bone Joint Surg Am. 2004;86(5):963-974.
5. Kurtz S, Ong K, Lau E, Mowat F, Halpern M. Projections of primary and revision hip and knee arthroplasty in the United States from 2005 to 2030. J Bone Joint Surg Am. 2007;89(4):780-785.
6. Matlock D, Earnest M, Epstein A. Utilization of elective hip and knee arthroplasty by age and payer. Clin Orthop Relat Res. 2008;466(4):914-919.
7. Parvizi J, Holiday AD, Ereth MH, Lewallen DG. The Frank Stinchfield Award. Sudden death during primary hip arthroplasty. Clin Orthop Relat Res. 1999;(369):39-48.
8. Parvizi J, Mui A, Purtill JJ, Sharkey PF, Hozack WJ, Rothman RH. Total joint arthroplasty: when do fatal or near-fatal complications occur? J Bone Joint Surg Am. 2007;89(1):27-32.
9. Parvizi J, Sullivan TA, Trousdale RT, Lewallen DG. Thirty-day mortality after total knee arthroplasty. J Bone Joint Surg Am. 2001;83(8):1157-1161.
10. Uchino S, Kellum JA, Bellomo R, et al; Beginning and Ending Supportive Therapy for the Kidney (BEST Kidney) Investigators. Acute renal failure in critically ill patients: a multinational, multicenter study. JAMA. 2005;294(7):813-818.
11. Thakar CV. Perioperative acute kidney injury. Adv Chronic Kidney Dis. 2013;20(1):67-75.
12. Hsu CY, Chertow GM, McCulloch CE, Fan D, Ordoñez JD, Go AS. Nonrecovery of kidney function and death after acute on chronic renal failure. Clin J Am Soc Nephrol. 2009;4(5):891-898.
13. Rewa O, Bagshaw SM. Acute kidney injury—epidemiology, outcomes and economics. Nat Rev Nephrol. 2014;10(4):193-207.
14. Thakar CV, Worley S, Arrigain S, Yared JP, Paganini EP. Influence of renal dysfunction on mortality after cardiac surgery: modifying effect of preoperative renal function. Kidney Int. 2005;67(3):1112-1119.
15. Zeng X, McMahon GM, Brunelli SM, Bates DW, Waikar SS. Incidence, outcomes, and comparisons across definitions of AKI in hospitalized individuals. Clin J Am Soc Nephrol. 2014;9(1):12-20.
16. Kimmel LA, Wilson S, Janardan JD, Liew SM, Walker RG. Incidence of acute kidney injury following total joint arthroplasty: a retrospective review by RIFLE criteria. Clin Kidney J. 2014;7(6):546-551.
17. Agency for Healthcare Research and Quality. Healthcare Cost and Utilization Project (HCUP) databases, 2002–2012. Rockville, MD: Agency for Healthcare Research and Quality.
18. Bjorgul K, Novicoff WM, Saleh KJ. Evaluating comorbidities in total hip and knee arthroplasty: available instruments. J Orthop Traumatol. 2010;11(4):203-209.
19. Voskuijl T, Hageman M, Ring D. Higher Charlson Comorbidity Index Scores are associated with readmission after orthopaedic surgery. Clin Orthop Relat Res. 2014;472(5):1638-1644.
20. Chertow GM, Burdick E, Honour M, Bonventre JV, Bates DW. Acute kidney injury, mortality, length of stay, and costs in hospitalized patients. J Am Soc Nephrol. 2005;16(11):3365-3370.
21. Lenihan CR, Montez-Rath ME, Mora Mangano CT, Chertow GM, Winkelmayer WC. Trends in acute kidney injury, associated use of dialysis, and mortality after cardiac surgery, 1999 to 2008. Ann Thorac Surg. 2013;95(1):20-28.
22. Belmont PJ Jr, Goodman GP, Waterman BR, Bader JO, Schoenfeld AJ. Thirty-day postoperative complications and mortality following total knee arthroplasty: incidence and risk factors among a national sample of 15,321 patients. J Bone Joint Surg Am. 2014;96(1):20-26.
23. Bennet SJ, Berry OM, Goddard J, Keating JF. Acute renal dysfunction following hip fracture. Injury. 2010;41(4):335-338.
24. Kateros K, Doulgerakis C, Galanakos SP, Sakellariou VI, Papadakis SA, Macheras GA. Analysis of kidney dysfunction in orthopaedic patients. BMC Nephrol. 2012;13:101.
25. Huang YM, Wang CM, Wang CT, Lin WP, Horng LC, Jiang CC. Perioperative celecoxib administration for pain management after total knee arthroplasty—a randomized, controlled study. BMC Musculoskelet Disord. 2008;9:77.
26. Kelley TC, Adams MJ, Mulliken BD, Dalury DF. Efficacy of multimodal perioperative analgesia protocol with periarticular medication injection in total knee arthroplasty: a randomized, double-blinded study. J Arthroplasty. 2013;28(8):1274-1277.
27. Lamplot JD, Wagner ER, Manning DW. Multimodal pain management in total knee arthroplasty: a prospective randomized controlled trial. J Arthroplasty. 2014;29(2):329-334.
28. Munin MC, Rudy TE, Glynn NW, Crossett LS, Rubash HE. Early inpatient rehabilitation after elective hip and knee arthroplasty. JAMA. 1998;279(11):847-852.
29. Pua YH, Ong PH. Association of early ambulation with length of stay and costs in total knee arthroplasty: retrospective cohort study. Am J Phys Med Rehabil. 2014;93(11):962-970.
30. Waikar SS, Wald R, Chertow GM, et al. Validity of International Classification of Diseases, Ninth Revision, Clinical Modification codes for acute renal failure. J Am Soc Nephrol. 2006;17(6):1688-1694.
31. Grams ME, Waikar SS, MacMahon B, Whelton S, Ballew SH, Coresh J. Performance and limitations of administrative data in the identification of AKI. Clin J Am Soc Nephrol. 2014;9(4):682-689.
Degenerative arthritis is a widespread chronic condition with a prevalence of almost 43 million and annual health care costs of $60 billion in the United States alone.1 Although many cases can be managed symptomatically with medical therapy and intra-articular injections,2 many patients experience disease progression resulting in decreased ambulatory ability and work productivity. For these patients, elective hip and knee arthroplasties can drastically improve quality of life and functionality.3,4 Over the past decade, there has been a marked increase in the number of primary and revision total hip and knee arthroplasties performed in the United States. By 2030, the demand for primary total hip arthroplasties is projected to grow by an estimated 174%, to 572,000 procedures. Likewise, the demand for primary total knee arthroplasties is projected to grow by 673%, to 3.48 million procedures.5 However, though better surgical techniques and technology have led to improved functional outcomes, there is still substantial risk for complications in the perioperative period, especially in the geriatric population, in which substantial comorbidities are common.6-9
Acute kidney injury (AKI) is a common public health problem in hospitalized patients and in patients undergoing procedures. More than one-third of all AKI cases occur in surgical settings.10,11 Over the past decade, both community-acquired and in-hospital AKIs rapidly increased in incidence in all major clinical settings.12-14 Patients with AKI have high rates of adverse outcomes during hospitalization and discharge.11,15 Sequelae of AKIs include worsening chronic kidney disease (CKD) and progression to end-stage renal disease, necessitating either long-term dialysis or transplantation.12 This in turn leads to exacerbated disability, diminished quality of life, and disproportionate burden on health care resources.
Much of our knowledge about postoperative AKI has been derived from cardiovascular, thoracic, and abdominal surgery settings. However, there is a paucity of data on epidemiology and trends for either AKI or associated outcomes in patients undergoing major orthopedic surgery. The few studies to date either were single-center or had inadequate sample sizes for appropriately powered analysis of the risk factors and outcomes related to AKI.16
In the study reported here, we analyzed a large cohort of patients from a nationwide multicenter database to determine the incidence of and risk factors for AKI. We also examined the mortality and adverse discharges associated with AKI after major joint surgery. Lastly, we assessed temporal trends in both incidence and outcomes of AKI, including the death risk attributable to AKI.
Methods
Database
We extracted our study cohort from the Nationwide Inpatient Sample and the National Inpatient Sample (NIS) of the Healthcare Cost and Utilization Project (HCUP), compiled by the Agency for Healthcare Research and Quality.17 NIS, the largest inpatient care database in the United States, stores data from almost 8 million stays in about 1000 hospitals across the country each year. Its participating hospital pool consists of about 20% of US community hospitals, resulting in a sampling frame comprising about 90% of all hospital discharges in the United States. This allows for calculation of precise, weighted nationwide estimates. Data elements within NIS are drawn from hospital discharge abstracts that indicate all procedures performed. NIS also stores information on patient characteristics, length of stay (LOS), discharge disposition, postoperative morbidity, and observed in-hospital mortality. However, it stores no information on long-term follow-up or complications after discharge.
Data Analysis
For the period 2002–2012, we queried the NIS database for hip and knee arthroplasties with primary diagnosis codes for osteoarthritis and secondary codes for AKI. We excluded patients under age 18 years and patients with diagnosis codes for hip and knee fracture/necrosis, inflammatory/infectious arthritis, or bone neoplasms (Table 1). We then extracted baseline characteristics of the study population. Patient-level characteristics included age, sex, race, quartile classification of median household income according to postal (ZIP) code, and primary payer (Medicare/Medicaid, private insurance, self-pay, no charge). Hospital-level characteristics included hospital location (urban, rural), hospital bed size (small, medium, large), region (Northeast, Midwest/North Central, South, West), and teaching status. We defined illness severity and likelihood of death using Deyo’s modification of the Charlson Comorbidity Index (CCI), which draws on principal and secondary ICD-9-CM (International Classification of Diseases, Ninth Revision-Clinical Modification) diagnosis codes, procedure codes, and patient demographics to estimate a patient’s mortality risk. This method reliably predicts mortality and readmission in the orthopedic population.18,19 We assessed the effect of AKI on 4 outcomes: in-hospital mortality, discharge disposition, LOS, and cost of stay. Discharge disposition was grouped by either (a) home or short-term facility or (b) adverse discharge. Home or short-term facility covered routine, short-term hospital, against medical advice, home intravenous provider, another rehabilitation facility, another institution for outpatient services, institution for outpatient services, discharged alive, and destination unknown; adverse discharge covered skilled nursing facility, intermediate care, hospice home, hospice medical facility, long-term care hospital, and certified nursing facility.
This dichotomization of discharge disposition is often used in studies of NIS data.20
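The grouping of discharge dispositions described above amounts to a lookup from category to one of two groups. A minimal sketch follows; the category strings are taken from the text, whereas the function name, the group labels, and the use of lowercase text labels (rather than the numeric codes a discharge database would typically store) are assumptions for illustration.

```python
# Categories the text assigns to "home or short-term facility".
HOME_OR_SHORT_TERM = frozenset({
    "routine",
    "short-term hospital",
    "against medical advice",
    "home intravenous provider",
    "another rehabilitation facility",
    "another institution for outpatient services",
    "institution for outpatient services",
    "discharged alive",
    "destination unknown",
})

# Categories the text assigns to "adverse discharge".
ADVERSE = frozenset({
    "skilled nursing facility",
    "intermediate care",
    "hospice home",
    "hospice medical facility",
    "long-term care hospital",
    "certified nursing facility",
})


def classify_discharge(disposition: str) -> str:
    """Map a disposition label to 'adverse' or 'home_or_short_term'."""
    key = disposition.strip().lower()
    if key in ADVERSE:
        return "adverse"
    if key in HOME_OR_SHORT_TERM:
        return "home_or_short_term"
    raise ValueError(f"Unrecognized discharge disposition: {disposition!r}")


print(classify_discharge("Skilled nursing facility"))
print(classify_discharge("routine"))
```

Raising an error on unrecognized labels, rather than defaulting to one group, keeps miscoded records from silently inflating either outcome.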
Statistical Analyses
We compared the baseline characteristics of hospitalized patients with and without AKI. To test for significance, we used the χ² test for categorical variables, the Student t test for normally distributed continuous variables, the Wilcoxon rank sum test for non-normally distributed continuous variables, and the Cochran-Armitage test for trends in AKI incidence. We used survey logistic regression models to calculate adjusted odds ratios (ORs) with 95% confidence intervals (95% CIs) in order to estimate the predictors of AKI and the impact of AKI on hospital outcomes. We constructed final models after adjusting for confounders, testing for potential interactions, and ensuring no multicollinearity between covariates. Last, we computed the risk proportion of death attributable to AKI, indicating the proportion of deaths that could potentially be avoided if AKI and its complications were abrogated.21
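To illustrate how a Cochran-Armitage trend test works, the sketch below implements the standard unweighted form of the statistic in plain Python. The counts are hypothetical and chosen only for illustration; they are not study data, and the study's own analysis was run in SAS with survey weights, which this sketch does not reproduce.

```python
import math


def cochran_armitage(cases, totals, scores=None):
    """Return (z, two-sided p) for a linear trend in proportions.

    cases[i] / totals[i] is the proportion in ordered group i;
    scores default to 0, 1, 2, ... (equally spaced groups).
    """
    if scores is None:
        scores = list(range(len(cases)))
    n_all = sum(totals)
    p_bar = sum(cases) / n_all  # pooled proportion under the null
    # Score-weighted sum of observed minus expected cases.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_all)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p_value


# Hypothetical counts for three consecutive periods (not study data).
z, p = cochran_armitage(cases=[5, 12, 19], totals=[1000, 1000, 1000])
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z indicates that the proportion rises with the group score, which is the direction of the AKI trend reported in this study.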
We performed all statistical analyses with SAS Version 9.3 (SAS Institute) using designated weight values to produce weighted national estimates. The threshold for statistical significance was set at P < .01 (with ORs and 95% CIs that excluded 1).
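The idea behind weighted national estimates can be shown with a small sketch: each sampled discharge carries a weight representing how many discharges nationwide it stands for, and estimates are sums of those weights. The records, the field names `aki` and `weight`, and the weight values below are hypothetical; the sketch covers point estimates only and not the design-based variance estimation that survey procedures perform.

```python
# Hypothetical sampled discharge records (not study data).
records = [
    {"aki": True, "weight": 5.0},
    {"aki": False, "weight": 5.0},
    {"aki": False, "weight": 4.8},
    {"aki": False, "weight": 5.2},
]


def weighted_total(rows):
    """Estimated number of discharges nationwide represented by rows."""
    return sum(r["weight"] for r in rows)


def weighted_proportion(rows, flag):
    """Weighted share of discharges for which rows[i][flag] is true."""
    total = weighted_total(rows)
    if total == 0:
        raise ValueError("no weighted records")
    return sum(r["weight"] for r in rows if r[flag]) / total


national_discharges = weighted_total(records)
aki_share = weighted_proportion(records, "aki")
print(f"Estimated discharges: {national_discharges:.1f}")
print(f"Weighted AKI proportion: {100 * aki_share:.1f}%")
```

With a sample of roughly 20% of hospitals, weights near 5 are a plausible order of magnitude, which is why the illustrative values cluster around that number.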
Degenerative arthritis is a widespread chronic condition with a prevalence of almost 43 million and annual health care costs of $60 billion in the United States alone.1 Although many cases can be managed symptomatically with medical therapy and intra-articular injections,2 many patients experience disease progression resulting in decreased ambulatory ability and work productivity. For these patients, elective hip and knee arthroplasties can drastically improve quality of life and functionality.3,4 Over the past decade, there has been a marked increase in the number of primary and revision total hip and knee arthroplasties performed in the United States. By 2030, the demand for primary total hip arthroplasties will grow an estimated 174%, to 572,000 procedures. Likewise, the demand for primary total knee arthroplasties is projected to grow by 673%, to 3.48 million procedures.5 However, though better surgical techniques and technology have led to improved functional outcomes, there is still substantial risk for complications in the perioperative period, especially in the geriatric population, in which substantial comorbidities are common.6-9
Acute kidney injury (AKI) is a common public health problem in hospitalized patients and in patients undergoing procedures. More than one-third of all AKI cases occur in surgical settings.10,11 Over the past decade, both community-acquired and in-hospital AKIs rapidly increased in incidence in all major clinical settings.12-14 Patients with AKI have high rates of adverse outcomes during hospitalization and discharge.11,15 Sequelae of AKIs include worsening chronic kidney disease (CKD) and progression to end-stage renal disease, necessitating either long-term dialysis or transplantation.12 This in turn leads to exacerbated disability, diminished quality of life, and disproportionate burden on health care resources.
Much of our knowledge about postoperative AKI has been derived from cardiovascular, thoracic, and abdominal surgery settings. However, there is a paucity of data on epidemiology and trends for either AKI or associated outcomes in patients undergoing major orthopedic surgery. The few studies to date either were single-center or had inadequate sample sizes for appropriately powered analysis of the risk factors and outcomes related to AKI.16
In the study reported here, we analyzed a large cohort of patients from a nationwide multicenter database to determine the incidence of and risk factors for AKI. We also examined the mortality and adverse discharges associated with AKI after major joint surgery. Lastly, we assessed temporal trends in both incidence and outcomes of AKI, including the death risk attributable to AKI.
Methods
Database
We extracted our study cohort from the Nationwide Inpatient Sample (NIS) and the National Inpatient Sample of Healthcare Cost and Utilization Project (HCUP) compiled by the Agency for Healthcare Research and Quality.17 NIS, the largest inpatient care database in the United States, stores data from almost 8 million stays in about 1000 hospitals across the country each year. Its participating hospital pool consists of about 20% of US community hospitals, resulting in a sampling frame comprising about 90% of all hospital discharges in the United States. This allows for calculation of precise, weighted nationwide estimates. Data elements within NIS are drawn from hospital discharge abstracts that indicate all procedures performed. NIS also stores information on patient characteristics, length of stay (LOS), discharge disposition, postoperative morbidity, and observed in-hospital mortality. However, it stores no information on long-term follow-up or complications after discharge.
Data Analysis
For the period 2002–2012, we queried the NIS database for hip and knee arthroplasties with primary diagnosis codes for osteoarthritis and secondary codes for AKI. We excluded patients under age 18 years and patients with diagnosis codes for hip and knee fracture/necrosis, inflammatory/infectious arthritis, or bone neoplasms (Table 1). We then extracted baseline characteristics of the study population. Patient-level characteristics included age, sex, race, quartile classification of median household income according to postal (ZIP) code, and primary payer (Medicare/Medicaid, private insurance, self-pay, no charge). Hospital-level characteristics included hospital location (urban, rural), hospital bed size (small, medium, large), region (Northeast, Midwest/North Central, South, West), and teaching status. We defined illness severity and likelihood of death using Deyo’s modification of the Charlson Comorbidity Index (CCI), which draws on principal and secondary ICD-9-CM (International Classification of Diseases, Ninth Revision, Clinical Modification) diagnosis codes, procedure codes, and patient demographics to estimate a patient’s mortality risk. This method reliably predicts mortality and readmission in the orthopedic population.18,19 We assessed the effect of AKI on 4 outcomes: in-hospital mortality, discharge disposition, LOS, and cost of stay. Discharge disposition was grouped as either (a) home or short-term facility or (b) adverse discharge. Home or short-term facility covered routine, short-term hospital, against medical advice, home intravenous provider, another rehabilitation facility, another institution for outpatient services, institution for outpatient services, discharged alive, and destination unknown; adverse discharge covered skilled nursing facility, intermediate care, hospice home, hospice medical facility, long-term care hospital, and certified nursing facility. This dichotomization of discharge disposition is often used in studies of NIS data.20
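The grouping of discharge dispositions described above can be expressed as a simple lookup. The following Python sketch is illustrative only: the label strings are copied from the text, whereas the function name `classify_disposition` and the assumption that dispositions arrive as free-text labels (rather than numeric database codes) are ours.

```python
# Disposition labels are taken from the grouping described in the text.
# How the database actually encodes them is NOT specified there; plain
# lowercase strings are an assumption made for illustration.
HOME_OR_SHORT_TERM = {
    "routine",
    "short-term hospital",
    "against medical advice",
    "home intravenous provider",
    "another rehabilitation facility",
    "another institution for outpatient services",
    "institution for outpatient services",
    "discharged alive",
    "destination unknown",
}

ADVERSE = {
    "skilled nursing facility",
    "intermediate care",
    "hospice home",
    "hospice medical facility",
    "long-term care hospital",
    "certified nursing facility",
}


def classify_disposition(label: str) -> str:
    """Map a disposition label to one of the two study groups."""
    key = label.strip().lower()
    if key in ADVERSE:
        return "adverse discharge"
    if key in HOME_OR_SHORT_TERM:
        return "home or short-term facility"
    # Refuse to guess for labels the text does not list.
    raise ValueError(f"unrecognized disposition: {label!r}")
```

Raising an error for unlisted labels, rather than defaulting to one group, keeps any miscoded record visible instead of silently shifting it into either outcome.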
Statistical Analyses
We compared the baseline characteristics of hospitalized patients with and without AKI. To test for significance, we used the χ² test for categorical variables, the Student t test for normally distributed continuous variables, the Wilcoxon rank sum test for non-normally distributed continuous variables, and the Cochran-Armitage test for trends in AKI incidence. We used survey logistic regression models to calculate adjusted odds ratios (ORs) with 95% confidence intervals (95% CIs) in order to estimate the predictors of AKI and the impact of AKI on hospital outcomes. We constructed final models after adjusting for confounders, testing for potential interactions, and ensuring no multicollinearity between covariates. Last, we computed the risk proportion of death attributable to AKI, indicating the proportion of deaths that could potentially be avoided if AKI and its complications were abrogated.21
We performed all statistical analyses with SAS Version 9.3 (SAS Institute) using designated weight values to produce weighted national estimates. The threshold for statistical significance was set at P < .01 (with ORs and 95% CIs that excluded 1).
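To illustrate how discharge weights turn a sample into national estimates, the sketch below sums per-record weights in Python. It is not the SAS procedure used in the study; the `Discharge` record type, the weight values, and the 4-record sample are hypothetical, and the sketch covers point estimates only (survey-design variance estimation is omitted).

```python
from dataclasses import dataclass


@dataclass
class Discharge:
    weight: float   # national discharges represented by this sampled record (hypothetical values)
    has_aki: bool   # whether the record carries an AKI code


def weighted_estimates(records):
    """Return (weighted total, weighted AKI count, weighted AKI proportion)."""
    total = sum(r.weight for r in records)
    aki = sum(r.weight for r in records if r.has_aki)
    proportion = aki / total if total else float("nan")
    return total, aki, proportion


# Hypothetical sample: with roughly a 20% hospital sample, each record
# stands in for about 5 discharges nationally.
sample = [
    Discharge(5.0, False),
    Discharge(5.0, True),
    Discharge(4.8, False),
    Discharge(5.2, False),
]
total, aki, proportion = weighted_estimates(sample)
```

In this toy sample, 4 records represent about 20 national discharges, of which about 5 carry an AKI code.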
Results
AKI Incidence, Risk Factors, and Trends
We identified 7,235,251 patients who underwent elective hip or knee arthroplasty for osteoarthritis between 2002 and 2012, an estimate consistent with data from the Centers for Disease Control and Prevention.22 Of that total, 94,367 (1.3%) had AKI. The proportion of discharges diagnosed with AKI increased rapidly over the decade, from 0.5% in 2002 to 1.8% to 1.9% in the period 2010–2012. This upward trend was highly significant (Ptrend < .001) (Figure 1). Patients with AKI (vs patients without AKI) were more likely to be older (mean age, 70 vs 66 years; P < .001), male (50.8% vs 38.4%; P < .001), and black (10.07% vs 5.15%; P < .001). They were also found to have a significantly higher comorbidity score (mean CCI, 2.8 vs 1.5; P < .001) and higher proportions of comorbidities, including hypertension, CKD, atrial fibrillation, diabetes mellitus (DM), congestive heart failure, chronic liver disease, and hepatitis C virus infection. In addition, AKI was associated with perioperative myocardial infarction (MI), sepsis, cardiac catheterization, and blood transfusion. Regarding socioeconomic characteristics, patients with AKI were more likely to have Medicare/Medicaid insurance (72.26% vs 58.06%; P < .001) and to belong to the extremes of income categories (Table 2).
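The headline incidence figures can be checked with simple arithmetic. The Python sketch below uses only the counts and percentages reported in the preceding paragraph; the variable names are ours, and the yearly percentages are the rounded values given in the text, so the fold change is approximate.

```python
# Counts reported in the text.
total_patients = 7_235_251
aki_patients = 94_367

# Overall incidence of AKI: about 0.013, which rounds to the reported 1.3%.
overall_incidence = aki_patients / total_patients

# Rounded yearly proportions reported in the text.
incidence_2002 = 0.005
incidence_late_low, incidence_late_high = 0.018, 0.019  # period 2010-2012

# Fold change relative to 2002: roughly 3.6 to 3.8,
# which is on the order of a 4-fold increase.
fold_low = incidence_late_low / incidence_2002
fold_high = incidence_late_high / incidence_2002
```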
Using multivariable logistic regression, we found that increased age (1.11 increase in adjusted OR for every year older; 95% CI, 1.09-1.14; P < .001), male sex (adjusted OR, 1.65; 95% CI, 1.60-1.71; P < .001), and black race (adjusted OR, 1.57; 95% CI, 1.45-1.69; P < .001) were significantly associated with postoperative AKI. Regarding comorbidities, baseline CKD (adjusted OR, 8.64; 95% CI, 8.14-9.18; P < .001) and congestive heart failure (adjusted OR, 2.74; 95% CI, 2.57-2.92; P < .0001) were most significantly associated with AKI. Perioperative events, including sepsis (adjusted OR, 35.64; 95% CI, 30.28-41.96; P < .0001), MI (adjusted OR, 6.14; 95% CI, 5.17-7.28; P < .0001), and blood transfusion (adjusted OR, 2.28; 95% CI, 2.15-2.42; P < .0001), were also strongly associated with postoperative AKI. Last, compared with urban hospitals and small hospital bed size, rural hospitals (adjusted OR, 0.70; 95% CI, 0.60-0.81; P < .001) and large bed size (adjusted OR, 0.82; 95% CI, 0.70-0.93; P = .003) were associated with lower probability of developing AKI (Table 3).
Figure 2 elucidates the frequency of AKI based on a combination of key preoperative comorbid conditions and postoperative complications, demonstrating that the proportion of AKI cases associated with other postoperative complications is significantly higher in the CKD and concomitant DM/CKD patient populations. Patients hospitalized with CKD exhibited higher rates of AKI in cases involving blood transfusion (20.9% vs 1.8%; P < .001), acute MI (48.9% vs 13.8%; P < .001), and sepsis (74.7% vs 36.3%; P < .001) relative to patients without CKD. Similarly, patients with concomitant DM/CKD exhibited higher rates of AKI in cases involving blood transfusion (23% vs 1.9%; P < .001), acute MI (51.1% vs 12.1%; P < .001), and sepsis (75% vs 38.2%; P < .001) relative to patients without either condition. However, patients hospitalized with DM alone exhibited only marginally higher rates of AKI in cases involving blood transfusion (4.7% vs 2%; P < .01) and acute MI (19.2% vs 16.7%; P < .01) and a lower rate in cases involving sepsis (38.2% vs 41.7%; P < .01) relative to patients without DM. These data suggest that CKD is the most significant clinically relevant risk factor for AKI and that CKD may synergize with DM to raise the risk for AKI.
Outcomes
We then analyzed the impact of AKI on hospital outcomes, including in-hospital mortality, discharge disposition, LOS, and cost of care. Mortality was significantly higher in patients with AKI than in patients without it (2.08% vs 0.06%; P < .001). Even after adjusting for confounders (eg, demographics, comorbidity burden, perioperative sepsis, hospital-level characteristics), AKI was still associated with strikingly higher odds of in-hospital death (adjusted OR, 11.32; 95% CI, 9.34-13.74; P < .001). However, analysis of temporal trends indicated that the adjusted odds of mortality associated with AKI decreased from 18.09 to 9.45 (Ptrend = .01) over the period 2002–2012 (Figure 3). This decrease in odds of death was countered by an increase in incidence of AKI, resulting in a stable attributable risk proportion (97.9% in 2002 to 97.3% in 2012; Ptrend = .90) (Table 4). Regarding discharge disposition, patients with AKI were much less likely to be discharged home (41.35% vs 62.59%; P < .001) and more likely to be discharged to long-term care (56.37% vs 37.03%; P < .001). After adjustment for confounders, AKI was associated with significantly increased odds of adverse discharge (adjusted OR, 2.24; 95% CI, 2.12-2.36; P < .001). Analysis of temporal trends revealed no appreciable decrease in the adjusted odds of adverse discharge between 2002 (adjusted OR, 1.87; 95% CI, 1.37-2.55; P < .001) and 2012 (adjusted OR, 1.93; 95% CI, 1.76-2.11; P < .001) (Figure 4, Table 5). Last, both mean LOS (5 days vs 3 days; P < .001) and mean cost of hospitalization (US $22,269 vs $15,757; P < .001) were significantly higher in patients with AKI.
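As a rough illustration of the attributable risk proportion, the Python sketch below applies the standard formula for the attributable fraction among the exposed, (risk in exposed − risk in unexposed) / risk in exposed, to the pooled mortality figures reported above. The text does not spell out the exact formula used in the study, so this choice is an assumption; the crude odds ratio shown alongside is unadjusted and pooled over the decade, so it is not expected to reproduce the adjusted OR or the yearly values in Table 4.

```python
def attributable_risk_proportion(risk_exposed: float, risk_unexposed: float) -> float:
    """Attributable fraction among the exposed (standard formula; assumed here)."""
    return (risk_exposed - risk_unexposed) / risk_exposed


def crude_odds_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    """Unadjusted odds ratio computed from two risks (proportions)."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


# Pooled in-hospital mortality reported in the text.
mortality_with_aki = 0.0208     # 2.08%
mortality_without_aki = 0.0006  # 0.06%

# About 0.97, in line with the reported yearly proportions near 97%.
arp = attributable_risk_proportion(mortality_with_aki, mortality_without_aki)

# About 35 before any adjustment; the adjusted OR in the text (11.32) is
# lower because it accounts for confounders.
crude_or = crude_odds_ratio(mortality_with_aki, mortality_without_aki)
```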
Discussion
In this study, we found that the incidence of AKI among hospitalized patients increased 4-fold between 2002 and 2012. Moreover, we identified numerous patient-specific, hospital-specific, and perioperative risk factors for AKI. Most important, we found that AKI was associated with a strikingly higher risk of in-hospital death, and surviving patients were more likely to experience adverse discharge. Although the adjusted mortality rate associated with AKI decreased over that decade, the attributable risk proportion remained stable.
Few studies have addressed this significant public health concern. In one recent study in Australia, Kimmel and colleagues16 identified risk factors for AKI but lacked data on AKI outcomes. In a study of complications and mortality occurring after orthopedic surgery, Belmont and colleagues22 categorized complications as either local or systemic but did not examine renal complications. Only 2 other major studies have been conducted on renal outcomes associated with major joint surgery, and both were limited to patients with acute hip fractures. The first included acute fracture surgery patients and omitted elective joint surgery patients, and it evaluated admission renal function but not postoperative AKI.22 The second study had a sample size of only 170 patients.23 Thus, the literature leaves us with a crucial knowledge gap in renal outcomes and their postoperative impact in elective arthroplasties.
The present study filled this information gap by examining the incidence, risk factors, outcomes, and temporal trends of AKI after elective hip and knee arthroplasties. The increasing incidence of AKI in this surgical setting is similar to that of AKI in other surgical settings (cardiac and noncardiac).21 Although our analysis was limited by lack of perioperative management data, patients undergoing elective joint arthroplasty can experience kidney dysfunction for several reasons, including volume depletion, postoperative sepsis, and influence of medications, such as nonsteroidal anti-inflammatory drugs (NSAIDs), especially in older patients with more comorbidities and a higher burden of CKD. Each of these factors can cause renal dysfunction in patients having orthopedic procedures.24 Moreover, NSAID use among elective joint arthroplasty patients is likely higher because of an emphasis on multimodal analgesia, as recent randomized controlled trials have demonstrated the efficacy of NSAID use in controlling pain without increasing bleeding.25-27 Our results also demonstrated that the absolute incidence of AKI after orthopedic surgery is relatively low. One possible explanation for this phenomenon is that the definitions used were based on ICD-9-CM codes that underestimate the true incidence of AKI.
Consistent with other studies, we found that certain key preoperative comorbid conditions and postoperative events were associated with higher AKI risk. We stratified the rate of AKI associated with each postoperative event (sepsis, acute MI, cardiac catheterization, need for transfusion) by DM/CKD comorbidity. CKD was associated with significantly higher AKI risk across all postoperative complications. This information may provide clinicians with bedside information that can be used to determine which patients may be at higher or lower risk for AKI.
Our analysis of patient outcomes revealed that, though AKI was relatively uncommon, it increased the risk for death during hospitalization more than 10-fold between 2002 and 2012. Although the adjusted OR of in-hospital mortality decreased over the decade studied, the concurrent increase in AKI incidence caused the attributable risk of death associated with AKI to essentially remain the same. This observation is consistent with recent reports from cardiac surgery settings.21 These data together suggest that ameliorating occurrences of AKI would decrease mortality and increase quality of care for patients undergoing elective joint surgeries.
We also examined the effect of AKI on resource use by studying LOS, costs, and risk for adverse discharge. Much as in other surgical settings, AKI increased both LOS and overall hospitalization costs. More important, AKI was associated with increased adverse discharge (discharge to long-term care or nursing homes). Although exact reasons are unclear, we can speculate that postoperative renal dysfunction precludes early rehabilitation, impeding desired functional outcome and disposition.28,29 Given the projected increases in primary and revision hip and knee arthroplasties,5 these data predict that the impact of AKI on health outcomes will increase alarmingly in coming years.
There are limitations to our study. First, it was based on administrative data and lacked patient-level and laboratory data. As reported, the sensitivity of AKI codes remains moderate,30 so the true burden may be higher than indicated here. As the definition of AKI was based on administrative coding, we also could not estimate severity, though previous studies have found that administrative codes typically capture a more severe form of disease.31 Another limitation is that, because the data were deidentified, we could not delineate the risk for recurrent AKI in repeated surgical procedures, though this cohort was unlikely to be large enough to qualitatively affect our results. The third limitation is that, though we used CCI to adjust for the comorbidity burden, we were unable to account for other unmeasured confounders associated with increased AKI incidence, such as specific medication use. In addition, given the lack of patient-level data, we could not analyze the specific factors responsible for AKI in the perioperative period. Nevertheless, the strengths of a nationally representative sample, such as large sample size and generalizability, outweigh these limitations.
Conclusion
AKI is potentially an important quality indicator of elective joint surgery, and reducing its incidence is therefore essential for quality improvement. Given that hip and knee arthroplasties are projected to increase exponentially, as is the burden of comorbid conditions in this population, postoperative AKI will continue to have an incremental impact on health and health care resources. Thus, a carefully planned approach of interdisciplinary perioperative care is warranted to reduce both the risk and the consequences of this devastating condition.
1. Reginster JY. The prevalence and burden of arthritis. Rheumatology. 2002;41(suppl 1):3-6.
2. Kullenberg B, Runesson R, Tuvhag R, Olsson C, Resch S. Intraarticular corticosteroid injection: pain relief in osteoarthritis of the hip? J Rheumatol. 2004;31(11):2265-2268.
3. Kawasaki M, Hasegawa Y, Sakano S, Torii Y, Warashina H. Quality of life after several treatments for osteoarthritis of the hip. J Orthop Sci. 2003;8(1):32-35.
4. Ethgen O, Bruyère O, Richy F, Dardennes C, Reginster JY. Health-related quality of life in total hip and total knee arthroplasty. A qualitative and systematic review of the literature. J Bone Joint Surg Am. 2004;86(5):963-974.
5. Kurtz S, Ong K, Lau E, Mowat F, Halpern M. Projections of primary and revision hip and knee arthroplasty in the United States from 2005 to 2030. J Bone Joint Surg Am. 2007;89(4):780-785.
6. Matlock D, Earnest M, Epstein A. Utilization of elective hip and knee arthroplasty by age and payer. Clin Orthop Relat Res. 2008;466(4):914-919.
7. Parvizi J, Holiday AD, Ereth MH, Lewallen DG. The Frank Stinchfield Award. Sudden death during primary hip arthroplasty. Clin Orthop Relat Res. 1999;(369):39-48.
8. Parvizi J, Mui A, Purtill JJ, Sharkey PF, Hozack WJ, Rothman RH. Total joint arthroplasty: when do fatal or near-fatal complications occur? J Bone Joint Surg Am. 2007;89(1):27-32.
9. Parvizi J, Sullivan TA, Trousdale RT, Lewallen DG. Thirty-day mortality after total knee arthroplasty. J Bone Joint Surg Am. 2001;83(8):1157-1161.
10. Uchino S, Kellum JA, Bellomo R, et al; Beginning and Ending Supportive Therapy for the Kidney (BEST Kidney) Investigators. Acute renal failure in critically ill patients: a multinational, multicenter study. JAMA. 2005;294(7):813-818.
11. Thakar CV. Perioperative acute kidney injury. Adv Chronic Kidney Dis. 2013;20(1):67-75.
12. Hsu CY, Chertow GM, McCulloch CE, Fan D, Ordoñez JD, Go AS. Nonrecovery of kidney function and death after acute on chronic renal failure. Clin J Am Soc Nephrol. 2009;4(5):891-898.
13. Rewa O, Bagshaw SM. Acute kidney injury—epidemiology, outcomes and economics. Nat Rev Nephrol. 2014;10(4):193-207.
14. Thakar CV, Worley S, Arrigain S, Yared JP, Paganini EP. Influence of renal dysfunction on mortality after cardiac surgery: modifying effect of preoperative renal function. Kidney Int. 2005;67(3):1112-1119.
15. Zeng X, McMahon GM, Brunelli SM, Bates DW, Waikar SS. Incidence, outcomes, and comparisons across definitions of AKI in hospitalized individuals. Clin J Am Soc Nephrol. 2014;9(1):12-20.
16. Kimmel LA, Wilson S, Janardan JD, Liew SM, Walker RG. Incidence of acute kidney injury following total joint arthroplasty: a retrospective review by RIFLE criteria. Clin Kidney J. 2014;7(6):546-551.
17. Agency for Healthcare Research and Quality. Healthcare Cost and Utilization Project (HCUP) databases, 2002–2012. Rockville, MD: Agency for Healthcare Research and Quality.
18. Bjorgul K, Novicoff WM, Saleh KJ. Evaluating comorbidities in total hip and knee arthroplasty: available instruments. J Orthop Traumatol. 2010;11(4):203-209.
19. Voskuijl T, Hageman M, Ring D. Higher Charlson Comorbidity Index Scores are associated with readmission after orthopaedic surgery. Clin Orthop Relat Res. 2014;472(5):1638-1644.
20. Chertow GM, Burdick E, Honour M, Bonventre JV, Bates DW. Acute kidney injury, mortality, length of stay, and costs in hospitalized patients. J Am Soc Nephrol. 2005;16(11):3365-3370.
21. Lenihan CR, Montez-Rath ME, Mora Mangano CT, Chertow GM, Winkelmayer WC. Trends in acute kidney injury, associated use of dialysis, and mortality after cardiac surgery, 1999 to 2008. Ann Thorac Surg. 2013;95(1):20-28.
22. Belmont PJ Jr, Goodman GP, Waterman BR, Bader JO, Schoenfeld AJ. Thirty-day postoperative complications and mortality following total knee arthroplasty: incidence and risk factors among a national sample of 15,321 patients. J Bone Joint Surg Am. 2014;96(1):20-26.
23. Bennet SJ, Berry OM, Goddard J, Keating JF. Acute renal dysfunction following hip fracture. Injury. 2010;41(4):335-338.
24. Kateros K, Doulgerakis C, Galanakos SP, Sakellariou VI, Papadakis SA, Macheras GA. Analysis of kidney dysfunction in orthopaedic patients. BMC Nephrol. 2012;13:101.
25. Huang YM, Wang CM, Wang CT, Lin WP, Horng LC, Jiang CC. Perioperative celecoxib administration for pain management after total knee arthroplasty—a randomized, controlled study. BMC Musculoskelet Disord. 2008;9:77.
26. Kelley TC, Adams MJ, Mulliken BD, Dalury DF. Efficacy of multimodal perioperative analgesia protocol with periarticular medication injection in total knee arthroplasty: a randomized, double-blinded study. J Arthroplasty. 2013;28(8):1274-1277.
27. Lamplot JD, Wagner ER, Manning DW. Multimodal pain management in total knee arthroplasty: a prospective randomized controlled trial. J Arthroplasty. 2014;29(2):329-334.
28. Munin MC, Rudy TE, Glynn NW, Crossett LS, Rubash HE. Early inpatient rehabilitation after elective hip and knee arthroplasty. JAMA. 1998;279(11):847-852.
29. Pua YH, Ong PH. Association of early ambulation with length of stay and costs in total knee arthroplasty: retrospective cohort study. Am J Phys Med Rehabil. 2014;93(11):962-970.
30. Waikar SS, Wald R, Chertow GM, et al. Validity of International Classification of Diseases, Ninth Revision, Clinical Modification codes for acute renal failure. J Am Soc Nephrol. 2006;17(6):1688-1694.
31. Grams ME, Waikar SS, MacMahon B, Whelton S, Ballew SH, Coresh J. Performance and limitations of administrative data in the identification of AKI. Clin J Am Soc Nephrol. 2014;9(4):682-689.
1. Reginster JY. The prevalence and burden of arthritis. Rheumatology. 2002;41(supp 1):3-6.
2. Kullenberg B, Runesson R, Tuvhag R, Olsson C, Resch S. Intraarticular corticosteroid injection: pain relief in osteoarthritis of the hip? J Rheumatol. 2004;31(11):2265-2268.
3. Kawasaki M, Hasegawa Y, Sakano S, Torii Y, Warashina H. Quality of life after several treatments for osteoarthritis of the hip. J Orthop Sci. 2003;8(1):32-35.
4. Ethgen O, Bruyère O, Richy F, Dardennes C, Reginster JY. Health-related quality of life in total hip and total knee arthroplasty. A qualitative and systematic review of the literature. J Bone Joint Surg Am. 2004;86(5):963-974.
5. Kurtz S, Ong K, Lau E, Mowat F, Halpern M. Projections of primary and revision hip and knee arthroplasty in the United States from 2005 to 2030. J Bone Joint Surg Am. 2007;89(4):780-785.
6. Matlock D, Earnest M, Epstein A. Utilization of elective hip and knee arthroplasty by age and payer. Clin Orthop Relat Res. 2008;466(4):914-919.
7. Parvizi J, Holiday AD, Ereth MH, Lewallen DG. The Frank Stinchfield Award. Sudden death during primary hip arthroplasty. Clin Orthop Relat Res. 1999;(369):39-48.
8. Parvizi J, Mui A, Purtill JJ, Sharkey PF, Hozack WJ, Rothman RH. Total joint arthroplasty: when do fatal or near-fatal complications occur? J Bone Joint Surg Am. 2007;89(1):27-32.
9. Parvizi J, Sullivan TA, Trousdale RT, Lewallen DG. Thirty-day mortality after total knee arthroplasty. J Bone Joint Surg Am. 2001;83(8):1157-1161.
10. Uchino S, Kellum JA, Bellomo R, et al; Beginning and Ending Supportive Therapy for the Kidney (BEST Kidney) Investigators. Acute renal failure in critically ill patients: a multinational, multicenter study. JAMA. 2005;294(7):813-818.
11. Thakar CV. Perioperative acute kidney injury. Adv Chronic Kidney Dis. 2013;20(1):67-75.
12. Hsu CY, Chertow GM, McCulloch CE, Fan D, Ordoñez JD, Go AS. Nonrecovery of kidney function and death after acute on chronic renal failure. Clin J Am Soc Nephrol. 2009;4(5):891-898.
13. Rewa O, Bagshaw SM. Acute kidney injury—epidemiology, outcomes and economics. Nat Rev Nephrol. 2014;10(4):193-207.
14. Thakar CV, Worley S, Arrigain S, Yared JP, Paganini EP. Influence of renal dysfunction on mortality after cardiac surgery: modifying effect of preoperative renal function. Kidney Int. 2005;67(3):1112-1119.
15. Zeng X, McMahon GM, Brunelli SM, Bates DW, Waikar SS. Incidence, outcomes, and comparisons across definitions of AKI in hospitalized individuals. Clin J Am Soc Nephrol. 2014;9(1):12-20.
16. Kimmel LA, Wilson S, Janardan JD, Liew SM, Walker RG. Incidence of acute kidney injury following total joint arthroplasty: a retrospective review by RIFLE criteria. Clin Kidney J. 2014;7(6):546-551.
17. Agency for Healthcare Research and Quality. Healthcare Cost and Utilization Project (HCUP) databases, 2002–2012. Rockville, MD: Agency for Healthcare Research and Quality.
18. Bjorgul K, Novicoff WM, Saleh KJ. Evaluating comorbidities in total hip and knee arthroplasty: available instruments. J Orthop Traumatol. 2010;11(4):203-209.
19. Voskuijl T, Hageman M, Ring D. Higher Charlson Comorbidity Index Scores are associated with readmission after orthopaedic surgery. Clin Orthop Relat Res. 2014;472(5):1638-1644.
20. Chertow GM, Burdick E, Honour M, Bonventre JV, Bates DW. Acute kidney injury, mortality, length of stay, and costs in hospitalized patients. J Am Soc Nephrol. 2005;16(11):3365-3370.
21. Lenihan CR, Montez-Rath ME, Mora Mangano CT, Chertow GM, Winkelmayer WC. Trends in acute kidney injury, associated use of dialysis, and mortality after cardiac surgery, 1999 to 2008. Ann Thorac Surg. 2013;95(1):20-28.
22. Belmont PJ Jr, Goodman GP, Waterman BR, Bader JO, Schoenfeld AJ. Thirty-day postoperative complications and mortality following total knee arthroplasty: incidence and risk factors among a national sample of 15,321 patients. J Bone Joint Surg Am. 2014;96(1):20-26.
23. Bennet SJ, Berry OM, Goddard J, Keating JF. Acute renal dysfunction following hip fracture. Injury. 2010;41(4):335-338.
24. Kateros K, Doulgerakis C, Galanakos SP, Sakellariou VI, Papadakis SA, Macheras GA. Analysis of kidney dysfunction in orthopaedic patients. BMC Nephrol. 2012;13:101.
25. Huang YM, Wang CM, Wang CT, Lin WP, Horng LC, Jiang CC. Perioperative celecoxib administration for pain management after total knee arthroplasty—a randomized, controlled study. BMC Musculoskelet Disord. 2008;9:77.
26. Kelley TC, Adams MJ, Mulliken BD, Dalury DF. Efficacy of multimodal perioperative analgesia protocol with periarticular medication injection in total knee arthroplasty: a randomized, double-blinded study. J Arthroplasty. 2013;28(8):1274-1277.
27. Lamplot JD, Wagner ER, Manning DW. Multimodal pain management in total knee arthroplasty: a prospective randomized controlled trial. J Arthroplasty. 2014;29(2):329-334.
28. Munin MC, Rudy TE, Glynn NW, Crossett LS, Rubash HE. Early inpatient rehabilitation after elective hip and knee arthroplasty. JAMA. 1998;279(11):847-852.
29. Pua YH, Ong PH. Association of early ambulation with length of stay and costs in total knee arthroplasty: retrospective cohort study. Am J Phys Med Rehabil. 2014;93(11):962-970.
30. Waikar SS, Wald R, Chertow GM, et al. Validity of International Classification of Diseases, Ninth Revision, Clinical Modification codes for acute renal failure. J Am Soc Nephrol. 2006;17(6):1688-1694.
31. Grams ME, Waikar SS, MacMahon B, Whelton S, Ballew SH, Coresh J. Performance and limitations of administrative data in the identification of AKI. Clin J Am Soc Nephrol. 2014;9(4):682-689.
Analysis of Direct Costs of Outpatient Arthroscopic Rotator Cuff Repair
Musculoskeletal disorders, the leading cause of disability in the United States,1 account for more than half of all persons reporting missing a workday because of a medical condition.2 Shoulder disorders in particular play a significant role in the burden of musculoskeletal disorders and cost of care. In 2008, 18.9 million adults (8.2% of the US adult population) reported chronic shoulder pain.1 Among shoulder disorders, rotator cuff pathology is the most common cause of shoulder-related disability found by orthopedic surgeons.3 Rotator cuff surgery (RCS) is one of the most commonly performed orthopedic surgical procedures, and surgery volume is on the rise. One study found a 141% increase in rotator cuff repairs between the years 1996 (~41 per 100,000 population) and 2006 (~98 per 100,000 population).4
US health care costs are also increasing. In 2011, $2.7 trillion was spent on health care, representing 17.9% of the national gross domestic product (GDP). According to projections, costs will rise to $4.6 trillion by 2020.5 In particular, as patients continue to live longer and remain more active into their later years, the costs of treating and managing musculoskeletal disorders become more important from a public policy standpoint. In 2006, the cost of treating musculoskeletal disorders alone was $576 billion, representing 4.5% of that year’s GDP.2
Paramount in this era of rising costs is the idea of maximizing the value of health care dollars. Health care economists Porter and Teisberg6 defined value as patient health outcomes achieved per dollar of cost expended in a care cycle (diagnosis, treatment, ongoing management) for a particular disease or disorder. For proper management of value, outcomes and costs for an entire cycle of care must be determined. From a practical standpoint, this first requires determining the true cost of a care cycle—dollars spent on personnel, equipment, materials, and other resources required to deliver a particular service—rather than the amount charged or reimbursed for providing the service in question.7
Kaplan and Anderson8,9 described the TDABC (time-driven activity-based costing) algorithm for calculating the cost of delivering a service based on 2 parameters: unit cost of a particular resource, and time required to supply it. These parameters apply to material costs and labor costs. In the medical setting, the TDABC algorithm can be applied by defining a care delivery value chain for each aspect of patient care and then multiplying incremental cost per unit time by time required to deliver that resource (Figure 1). Tabulating the overall unit cost for each resource then yields the overall cost of the care cycle. Clinical outcomes data can then be determined and used to calculate overall value for the patient care cycle.
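As a minimal sketch of the calculation just described, and not the authors' actual worksheet or software, the TDABC total can be written as the sum of (cost per minute × minutes) over each labor resource, plus the purchase cost of consumables. The `Activity` class, the `tdabc_total` function, and every dollar figure below are hypothetical illustrations; only the 15-minute and 25-minute durations echo times reported in the Results.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One step in the care delivery value chain (hypothetical structure)."""
    name: str
    cost_per_minute: float  # dollars per minute for the resource (assumed value)
    minutes: float          # time the resource spends on this step


def tdabc_total(activities, material_costs=()):
    """Sum time-driven labor costs and add the direct cost of consumables."""
    labor = sum(a.cost_per_minute * a.minutes for a in activities)
    return labor + sum(material_costs)


# Hypothetical per-minute rates; durations taken from the observed means.
activities = [
    Activity("receptionist", cost_per_minute=0.50, minutes=15),
    Activity("preoperative nurse", cost_per_minute=1.00, minutes=25),
]
total = tdabc_total(activities, material_costs=[40.0])  # 7.50 + 25.00 + 40.00
print(f"${total:.2f}")
```

Keeping labor and materials as separate inputs mirrors the two kinds of cost the algorithm covers and makes it easy to see which one dominates a given phase.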
In the study reported here, we used the TDABC algorithm to calculate the direct financial costs of surgical treatment of rotator cuff tears confirmed by magnetic resonance imaging (MRI) in an academic medical center.
Methods
Per our institution’s Office for the Protection of Research Subjects, institutional review board (IRB) approval is required only for projects using “human subjects” as defined by federal policy. In the present study, no private information could be identified, and all data were obtained from hospital billing records without intervention or interaction with individual patients. Accordingly, IRB approval was deemed unnecessary for our economic cost analysis.
Billing records of a single academic fellowship-trained sports surgeon were reviewed to identify patients who underwent primary repair of an MRI-confirmed rotator cuff tear between April 1, 2009, and July 31, 2012. Patients who had undergone prior shoulder surgery of any type were excluded from the study. Operative reports were reviewed, and exact surgical procedures performed were noted. The operating surgeon selected the specific repair techniques, including single- or double-row repair, with emphasis on restoring footprint coverage and avoiding overtensioning.
All surgeries were performed in an outpatient surgical center owned and operated by the surgeon’s home university. Surgeries were performed by the attending physician assisted by a senior orthopedic resident. The RCS care cycle was divided into 3 phases (Figure 2):
1. Preoperative. Patient’s interaction with receptionist in surgery center, time with preoperative nurse and circulating nurse in preoperative area, resident check-in time, and time placing preoperative nerve block and consumable materials used during block placement.
2. Operative. Time in operating room with surgical team for RCS, consumable materials used during surgery (eg, anchors, shavers, drapes), anesthetic medications, shoulder abduction pillow placed on completion of surgery, and cost of instrument processing.
3. Postoperative. Time in postoperative recovery area with recovery room nursing staff.
Time in each portion of the care cycle was directly observed and tabulated by hospital volunteers in the surgery center. Institutional billing data were used to identify material resources consumed, and the actual cost paid by the hospital for these resources was obtained from internal records. Mean hourly salary data and standard benefit rates were obtained for surgery center staff. Attending physician salary was extrapolated from published mean market salary data for academic physicians and mean hours worked,10,11 and resident physician costs were tabulated from publicly available institutional payroll data and average resident work hours at our institution. These cost data and times were then used to tabulate total cost for the RCS care cycle using the TDABC algorithm.
Results
We identified 28 shoulders in 26 patients (mean age, 54.5 years) who met the inclusion criteria. Of these 28 shoulders, 18 (64.3%) had an isolated supraspinatus tear, 8 (28.6%) had combined supraspinatus and infraspinatus tears, 1 (3.6%) had combined supraspinatus and subscapularis tears, and 1 (3.6%) had an isolated infraspinatus tear. Demographic data are listed in Table 1.
All patients received an interscalene nerve block in the preoperative area before being brought into the operating room. In our analysis, we included nerve block supply costs and the anesthesiologist’s mean time placing the nerve block.
In all cases, primary rotator cuff repair was performed with suture anchors (Parcus Medical) with the patient in the lateral decubitus position. In 13 (46%) of the 28 shoulders, this repair was described as “complex,” requiring double-row technique. Subacromial decompression and bursectomy were performed in addition to the rotator cuff repair. Labral débridement was performed in 23 patients, synovectomy in 10, biceps tenodesis with anchor (Smith & Nephew) in 1, and biceps tenotomy in 1. Mean time in operating room was 148 minutes; mean time in postoperative recovery unit was 105 minutes.
Directly observing the care cycle, hospital volunteers found that patients spent a mean of 15 minutes with the receptionist when they arrived in the outpatient surgical center, 25 minutes with nurses for check-in in the preoperative holding area, and 10 minutes with the anesthesiology resident and 15 minutes with the orthopedic surgery resident for preoperative evaluation and paperwork. Mean nerve block time was 20 minutes. Mean electrocardiogram (ECG) time (12 patients) was 15 minutes. The surgical technician spent a mean time of 20 minutes setting up the operating room before the patient was brought in and 15 minutes cleaning up after the patient was transferred to the recovery room. Costs of postoperative care in the recovery room were based on a 2:1 patient-to-nurse ratio, as is the standard practice in our outpatient surgery center.
Using the times mentioned and our hospital’s salary data—including standard hospital benefits rates of 33.5% for nonphysicians and 17.65% for physicians—we determined, using the TDABC algorithm, a direct cost of $5904.21 for this process cycle, excluding hospital overhead and indirect costs. Table 2 provides the overall cost breakdown. Compared with the direct economic cost, the mean hospital charge to insurers for the procedure was $31,459.35. Mean reimbursement from insurers was $9679.08.
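One plausible way to turn salary data and a benefits rate into the per-minute figure that TDABC needs is sketched below. The formula (salary grossed up by the benefits rate, divided by minutes worked per year) is an assumption about how such a rate could be derived, not a description of the authors' exact method. The salary and annual hours are hypothetical; only the 33.5% nonphysician benefits rate and the 25-minute nursing check-in time come from this study.

```python
def cost_per_minute(annual_salary, benefit_rate, annual_hours):
    """Fully loaded labor cost per minute (assumed formula, for illustration)."""
    return annual_salary * (1 + benefit_rate) / (annual_hours * 60)


# Hypothetical nurse: $62,400 per year over 2,080 hours, i.e. $30 per hour base pay.
nurse_rate = cost_per_minute(annual_salary=62_400, benefit_rate=0.335, annual_hours=2_080)

# Cost of the 25-minute preoperative nursing check-in at that hypothetical rate.
checkin_cost = nurse_rate * 25
print(f"{nurse_rate:.4f} dollars/min, check-in cost ${checkin_cost:.2f}")
```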
Overall attending and resident physician costs were $1077.75, which consisted of $623.66 for the surgeon and $454.09 for the anesthesiologist (included placement of nerve block and administration of anesthesia during surgery). Preoperative bloodwork was obtained in 23 cases, adding a mean cost of $111.04 after adjusting for standard hospital markup. Preoperative ECG was performed in 12 cases, for an added mean cost of $7.30 based on the TDABC algorithm.
We also broke down costs by care cycle phase. The preoperative phase, excluding the preoperative laboratory studies and ECGs (not performed in all cases), cost $134.34 (2.3% of total costs); the operative phase cost $5718.01 (96.8% of total costs); and the postoperative phase cost $51.86 (0.9% of total costs). Within the operative phase, the cost of consumables (specifically, suture anchors) was the main cost driver. Mean anchor cost per case was $3432.67. “Complex” tears involving a double-row repair averaged $4570.25 in anchor cost per patient, as compared with $2522.60 in anchor costs for simple repairs.
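The phase percentages can be reproduced from the three reported phase costs. The short check below uses only figures stated in this section; the variable names are illustrative.

```python
# Phase costs in dollars, as reported above.
phase_costs = {
    "preoperative": 134.34,
    "operative": 5718.01,
    "postoperative": 51.86,
}

total = sum(phase_costs.values())

# Each phase's share of the total direct cost, rounded to one decimal place.
shares = {phase: round(100 * cost / total, 1) for phase, cost in phase_costs.items()}

print(f"total: ${total:.2f}")
for phase, pct in shares.items():
    print(f"{phase}: {pct}%")
```

The three phases sum to the $5904.21 total, and the operative phase accounts for nearly all of it, which is consistent with consumables being the main cost driver.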
Discussion
US health care costs continue to increase unsustainably, with rising pressure on hospitals and providers to deliver the highest value for each health care dollar. The present study is the first to calculate (using the TDABC algorithm) the direct economic cost ($5904.21) of the entire RCS care cycle at a university-based outpatient surgery center. Rent, utility costs, administrative costs, overhead, and other indirect costs at the surgery center were not included in this cost analysis, as they would be incurred irrespective of type of surgery performed. As such, our data isolate the procedure-specific costs of rotator cuff repair in order to provide a more meaningful comparison for other institutions, where indirect costs may be different.
In the literature, rigorous economic analysis of shoulder pathology is sparse. Kuye and colleagues12 systematically reviewed economic evaluations in shoulder surgery for the period 1980–2010 and noted that more than 50% of the papers were published between 2005 and 2010. They also noted the poor quality of these studies and concluded that more rigorous economic evaluations are needed to help justify the rising costs of shoulder-related treatments.
Several studies have directly evaluated costs associated with RCS. Cordasco and colleagues13 detailed the success of open rotator cuff repair as an outpatient procedure—noting its 43% cost savings ($4300 for outpatient vs $7500 for inpatient) and high patient satisfaction—using hospital charge data for operating room time, supplies, instruments, and postoperative slings. Churchill and Ghorai14 evaluated costs of mini-open and arthroscopic rotator cuff repairs in a statewide database and estimated the arthroscopic repair cost at $8985, compared with $7841 for the mini-open repair. They used reported hospital charge data, which were not itemized and did not include physician professional fees. Adla and colleagues,15 in a similar analysis of open versus arthroscopic cuff repair, estimated direct material costs of $1609.50 (arthroscopic) and $360.75 (open); these figures were converted from 2005 UK currency using the exchange rate cited in their paper. Salaries of surgeon, anesthesiologist, and other operating room personnel were said to be included in the operating room cost, but the authors’ paper did not include these data.
Two studies directly estimated the costs of arthroscopic rotator cuff repair. Hearnden and Tennent16 calculated the cost of RCS at their UK institution to be £2672, which included cost of operating room consumable materials, medication, and salaries of operating room personnel, including surgeon and anesthesiologist. Using online currency conversion from 2008 exchange rates and adjusting for inflation gave a corresponding US cost of $5449.63.17 Vitale and colleagues18 prospectively calculated costs of arthroscopic rotator cuff repair over a 1-year period using a cost-to-charge ratio from tabulated inpatient charges, procedure charges, and physician fees and payments abstracted from medical records, hospital billing, and administrative databases. Mean total cost for this cycle was $10,605.20, which included several costs (physical therapy, radiologist fees) not included in the present study. These studies, though more comprehensive than prior work, did not capture the entire cycle of surgical care.
Our study was designed to provide initial data on the direct costs of arthroscopic repair of the rotator cuff for the entire process cycle. Our overall cost estimate of $5904.21 differs significantly from prior work—not unexpected given the completely different cost methodology used.
Our study had several limitations. First, it was a single-surgeon evaluation, and a number of operating room variables (eg, use of adjunct instrumentation such as radiofrequency probes, differences in draping preferences) as well as surgeon volume in performing rotator cuff repairs might have substantially affected the reproducibility and generalizability of our data. Similarly, the large number of adjunctive procedures (eg, subacromial decompression, labral débridement) performed in conjunction with the rotator cuff repairs added operative time and therefore increased overall cost. Double-row repairs added operative time and increased the cost of consumable materials as well. Differences in surgeon preference for suture anchors may also be important, as anchors are a major cost driver and can vary significantly between vendors and institutions. Tear-related variables (eg, tear size, tear chronicity, degree of fatty cuff degeneration) were not controlled for and might have significantly affected operative time and associated cost. Resident involvement in the surgical procedure and anesthesia process in an academic setting prolongs surgical time and thus directly impacts costs.
In addition, we used the patient’s time in the operating room as a proxy for actual surgical time, as this was the only reliable and reproducible data point available in our electronic medical record. As such, an unquantifiable amount of surgeon time may have been overallocated to our cost estimate for time spent inducing anesthesia, positioning, helping take the patient off the operating table, and so on. However, as typical surgeon practice is to be involved in these tasks in the operating room, the possible overestimate of surgeon cost is likely minimal.
Our salary data for the TDABC algorithm were based on national averages for work hours and gross income for physicians and on hospital-based wage structure and may not be generalizable to other institutions. There may also be regional differences in work hours and salaries, which in turn would factor into a different per-minute cost for surgeon and anesthesiologist, depending on the exact geographic area where the surgery is performed. Costs may be higher at institutions that use certified nurse anesthetists rather than resident physicians because of the salary differences between these practitioners.
Moreover, the time that patients spend in the holding area—waiting to go into surgery and, after surgery, waiting for their ride home, for their prescriptions to be ready, and so forth—is an important variable to consider from a cost standpoint. However, as this time varied significantly and involved minimal contact with hospital personnel, we excluded its associated costs from our analysis. Similarly, and as already noted, hospital overhead and other indirect costs were excluded from analysis as well.
Conclusion
Using the TDABC algorithm, we found a direct economic cost of $5904.21 for RCS at our academic outpatient surgical center, with anchor cost the main cost driver. Judicious use of consumable resources is a key focus for cost containment in arthroscopic shoulder surgery, particularly with respect to implantable suture anchors. However, in the setting of more complex tears that require multiple anchors in a double-row repair construct, our pilot data may be useful to hospitals and surgery centers negotiating procedural reimbursement for the increased cost of complex repairs. Use of the TDABC algorithm for RCS and other procedures may also help in identifying opportunities to deliver more cost-effective health care.
1. American Academy of Orthopaedic Surgeons. The Burden of Musculoskeletal Diseases in the United States: Prevalence, Societal and Economic Cost. Rosemont, IL: American Academy of Orthopaedic Surgeons; 2011.
2. National health expenditure data. Centers for Medicare & Medicaid Services website. https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/NationalHealthExpendData/index.html. Updated May 5, 2014. Accessed December 1, 2015.
3. Tashjian RZ. Epidemiology, natural history, and indications for treatment of rotator cuff tears. Clin Sports Med. 2012;31(4):589-604.
4. Colvin AC, Egorova N, Harrison AK, Moskowitz A, Flatow EL. National trends in rotator cuff repair. J Bone Joint Surg Am. 2012;94(3):227-233.
5. Black EM, Higgins LD, Warner JJ. Value-based shoulder surgery: practicing outcomes-driven, cost-conscious care. J Shoulder Elbow Surg. 2013;22(7):1000-1009.
6. Porter ME, Teisberg EO. Redefining Health Care: Creating Value-Based Competition on Results. Boston, MA: Harvard Business School Press; 2006.
7. Kaplan RS, Porter ME. How to solve the cost crisis in health care. Harv Bus Rev. 2011;89(9):46-52, 54, 56-61 passim.
8. Kaplan RS, Anderson SR. Time-driven activity-based costing. Harv Bus Rev. 2004;82(11):131-138, 150.
9. Kaplan RS, Anderson SR. Time-Driven Activity-Based Costing: A Simpler and More Powerful Path to Higher Profits. Boston, MA: Harvard Business Review Press; 2007.
10. American Academy of Orthopaedic Surgeons. Orthopaedic Practice in the U.S. 2012. Rosemont, IL: American Academy of Orthopaedic Surgeons; 2012.
11. Medical Group Management Association. Physician Compensation and Production Survey: 2012 Report Based on 2011 Data. Englewood, CO: Medical Group Management Association; 2012.
12. Kuye IO, Jain NB, Warner L, Herndon JH, Warner JJ. Economic evaluations in shoulder pathologies: a systematic review of the literature. J Shoulder Elbow Surg. 2012;21(3):367-375.
13. Cordasco FA, McGinley BJ, Charlton T. Rotator cuff repair as an outpatient procedure. J Shoulder Elbow Surg. 2000;9(1):27-30.
14. Churchill RS, Ghorai JK. Total cost and operating room time comparison of rotator cuff repair techniques at low, intermediate, and high volume centers: mini-open versus all-arthroscopic. J Shoulder Elbow Surg. 2010;19(5):716-721.
15. Adla DN, Rowsell M, Pandey R. Cost-effectiveness of open versus arthroscopic rotator cuff repair. J Shoulder Elbow Surg. 2010;19(2):258-261.
16. Hearnden A, Tennent D. The cost of shoulder arthroscopy: a comparison with national tariff. Ann R Coll Surg Engl. 2008;90(7):587-591.
17. Xrates currency conversion. http://www.x-rates.com/historical/?from=GBP&amount=1&date=2015-12-03. Accessed December 13, 2015.
18. Vitale MA, Vitale MG, Zivin JG, Braman JP, Bigliani LU, Flatow EL. Rotator cuff repair: an analysis of utility scores and cost-effectiveness. J Shoulder Elbow Surg. 2007;16(2):181-187.
US health care costs are also increasing. In 2011, $2.7 trillion was spent on health care, representing 17.9% of the national gross domestic product (GDP). According to projections, costs will rise to $4.6 trillion by 2020.5 In particular, as patients continue to live longer and remain more active into their later years, the costs of treating and managing musculoskeletal disorders become more important from a public policy standpoint. In 2006, the cost of treating musculoskeletal disorders alone was $576 billion, representing 4.5% of that year’s GDP.2
Paramount in this era of rising costs is the idea of maximizing the value of health care dollars. Health care economists Porter and Teisberg6 defined value as patient health outcomes achieved per dollar of cost expended in a care cycle (diagnosis, treatment, ongoing management) for a particular disease or disorder. For proper management of value, outcomes and costs for an entire cycle of care must be determined. From a practical standpoint, this first requires determining the true cost of a care cycle—dollars spent on personnel, equipment, materials, and other resources required to deliver a particular service—rather than the amount charged or reimbursed for providing the service in question.7
Kaplan and Anderson8,9 described the TDABC (time-driven activity-based costing) algorithm for calculating the cost of delivering a service based on 2 parameters: unit cost of a particular resource, and time required to supply it. These parameters apply to material costs and labor costs. In the medical setting, the TDABC algorithm can be applied by defining a care delivery value chain for each aspect of patient care and then multiplying incremental cost per unit time by time required to deliver that resource (Figure 1). Tabulating the overall unit cost for each resource then yields the overall cost of the care cycle. Clinical outcomes data can then be determined and used to calculate overall value for the patient care cycle.
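The two-parameter calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class and function names (`LaborStep`, `per_minute_rate`, `tdabc_total`) and every dollar figure, time, and benefit rate in it are hypothetical values chosen to keep the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class LaborStep:
    role: str
    minutes: float           # time this role spends on the step
    cost_per_minute: float   # capacity cost rate, salary plus benefits


def per_minute_rate(annual_salary: float, benefit_rate: float, annual_hours: float) -> float:
    """Capacity cost rate: total compensation divided by minutes worked per year."""
    return annual_salary * (1.0 + benefit_rate) / (annual_hours * 60.0)


def tdabc_total(labor: list[LaborStep], materials: dict[str, float]) -> float:
    """Sum of (rate x time) over labor steps, plus direct material costs."""
    labor_cost = sum(step.minutes * step.cost_per_minute for step in labor)
    return labor_cost + sum(materials.values())


# Hypothetical inputs (not taken from the study).
nurse_rate = per_minute_rate(annual_salary=72_000, benefit_rate=0.25, annual_hours=2_000)
steps = [
    LaborStep("receptionist", minutes=10, cost_per_minute=0.40),
    LaborStep("preoperative nurse", minutes=20, cost_per_minute=nurse_rate),
]
supplies = {"drape": 15.00, "anchor": 300.00}
total = tdabc_total(steps, supplies)
print(f"nurse rate: ${nurse_rate:.2f}/min, care-cycle cost: ${total:.2f}")
```

Keeping labor and materials as separate inputs mirrors the text's point that the same two parameters apply to both: labor is rate times time, while consumables enter at their purchase cost.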
In the study reported here, we used the TDABC algorithm to calculate the direct financial costs of surgical treatment of rotator cuff tears confirmed by magnetic resonance imaging (MRI) in an academic medical center.
Methods
Per our institution’s Office for the Protection of Research Subjects, institutional review board (IRB) approval is required only for projects using “human subjects” as defined by federal policy. In the present study, no private information could be identified, and all data were obtained from hospital billing records without intervention or interaction with individual patients. Accordingly, IRB approval was deemed unnecessary for our economic cost analysis.
Billing records of a single academic fellowship-trained sports surgeon were reviewed to identify patients who underwent primary repair of an MRI-confirmed rotator cuff tear between April 1, 2009, and July 31, 2012. Patients who had undergone prior shoulder surgery of any type were excluded from the study. Operative reports were reviewed, and exact surgical procedures performed were noted. The operating surgeon selected the specific repair techniques, including single- or double-row repair, with emphasis on restoring footprint coverage and avoiding overtensioning.
All surgeries were performed in an outpatient surgical center owned and operated by the surgeon’s home university. Surgeries were performed by the attending physician assisted by a senior orthopedic resident. The RCS care cycle was divided into 3 phases (Figure 2):
1. Preoperative. Patient’s interaction with receptionist in surgery center, time with preoperative nurse and circulating nurse in preoperative area, resident check-in time, and time placing preoperative nerve block and consumable materials used during block placement.
2. Operative. Time in operating room with surgical team for RCS, consumable materials used during surgery (eg, anchors, shavers, drapes), anesthetic medications, shoulder abduction pillow placed on completion of surgery, and cost of instrument processing.
3. Postoperative. Time in postoperative recovery area with recovery room nursing staff.
Time in each portion of the care cycle was directly observed and tabulated by hospital volunteers in the surgery center. Institutional billing data were used to identify material resources consumed, and the actual cost paid by the hospital for these resources was obtained from internal records. Mean hourly salary data and standard benefit rates were obtained for surgery center staff. Attending physician salary was extrapolated from published mean market salary data for academic physicians and mean hours worked,10,11 and resident physician costs were tabulated from publicly available institutional payroll data and average resident work hours at our institution. These cost data and times were then used to tabulate total cost for the RCS care cycle using the TDABC algorithm.
Results
We identified 28 shoulders in 26 patients (mean age, 54.5 years) who met the inclusion criteria. Of these 28 shoulders, 18 (64.3%) had an isolated supraspinatus tear, 8 (28.6%) had combined supraspinatus and infraspinatus tears, 1 (3.6%) had combined supraspinatus and subscapularis tears, and 1 (3.6%) had an isolated infraspinatus tear. Demographic data are listed in Table 1.
All patients received an interscalene nerve block in the preoperative area before being brought into the operating room. In our analysis, we included nerve block supply costs and the anesthesiologist’s mean time placing the nerve block.
In all cases, primary rotator cuff repair was performed with suture anchors (Parcus Medical) with the patient in the lateral decubitus position. In 13 (46%) of the 28 shoulders, this repair was described as “complex,” requiring double-row technique. Subacromial decompression and bursectomy were performed in addition to the rotator cuff repair. Labral débridement was performed in 23 patients, synovectomy in 10, biceps tenodesis with anchor (Smith & Nephew) in 1, and biceps tenotomy in 1. Mean time in operating room was 148 minutes; mean time in postoperative recovery unit was 105 minutes.
Directly observing the care cycle, hospital volunteers found that patients spent a mean of 15 minutes with the receptionist when they arrived in the outpatient surgical center, 25 minutes with nurses for check-in in the preoperative holding area, and 10 minutes with the anesthesiology resident and 15 minutes with the orthopedic surgery resident for preoperative evaluation and paperwork. Mean nerve block time was 20 minutes. Mean electrocardiogram (ECG) time (12 patients) was 15 minutes. The surgical technician spent a mean time of 20 minutes setting up the operating room before the patient was brought in and 15 minutes cleaning up after the patient was transferred to the recovery room. Costs of postoperative care in the recovery room were based on a 2:1 patient-to-nurse ratio, as is the standard practice in our outpatient surgery center.
Using the times mentioned and our hospital’s salary data—including standard hospital benefits rates of 33.5% for nonphysicians and 17.65% for physicians—we determined, using the TDABC algorithm, a direct cost of $5904.21 for this process cycle, excluding hospital overhead and indirect costs. Table 2 provides the overall cost breakdown. Compared with the direct economic cost, the mean hospital charge to insurers for the procedure was $31,459.35. Mean reimbursement from insurers was $9679.08.
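As a quick illustration of how the three reported figures relate, the following sketch computes the charge-to-cost and reimbursement-to-cost ratios. The dollar amounts are the ones reported above; the variable names are ours, and the difference computed at the end is not a profit figure, because the direct cost excludes overhead and other indirect costs.

```python
direct_cost = 5904.21          # TDABC direct cost for the process cycle
mean_charge = 31459.35         # mean hospital charge to insurers
mean_reimbursement = 9679.08   # mean reimbursement from insurers

charge_to_cost = mean_charge / direct_cost
reimbursement_to_cost = mean_reimbursement / direct_cost
# Reimbursement left over after direct costs only (overhead not deducted).
excess_over_direct_cost = mean_reimbursement - direct_cost

print(f"charge / direct cost:        {charge_to_cost:.2f}")
print(f"reimbursement / direct cost: {reimbursement_to_cost:.2f}")
print(f"reimbursement minus direct cost: ${excess_over_direct_cost:.2f}")
```

On these figures the charge is roughly 5.3 times the direct cost, and reimbursement roughly 1.6 times.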
Overall attending and resident physician costs were $1077.75, which consisted of $623.66 for the surgeon and $454.09 for the anesthesiologist (included placement of nerve block and administration of anesthesia during surgery). Preoperative bloodwork was obtained in 23 cases, adding a mean cost of $111.04 after adjusting for standard hospital markup. Preoperative ECG was performed in 12 cases, for an added mean cost of $7.30 based on the TDABC algorithm.
We also broke down costs by care cycle phase. The preoperative phase, excluding the preoperative laboratory studies and ECGs (not performed in all cases), cost $134.34 (2.3% of total costs); the operative phase cost $5718.01 (96.8% of total costs); and the postoperative phase cost $51.86 (0.9% of total costs). Within the operative phase, the cost of consumables (specifically, suture anchors) was the main cost driver. Mean anchor cost per case was $3432.67. “Complex” tears involving a double-row repair averaged $4570.25 in anchor cost per patient, as compared with $2522.60 in anchor costs for simple repairs.
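The phase breakdown can be checked with a short calculation. The sketch below uses only the dollar amounts reported above; the variable names are ours, and the anchor share is expressed against the total direct cost as one possible way to summarize the "main cost driver" statement.

```python
phases = {
    "preoperative": 134.34,
    "operative": 5718.01,
    "postoperative": 51.86,
}
total = sum(phases.values())
shares = {name: 100.0 * cost / total for name, cost in phases.items()}

mean_anchor_cost = 3432.67
anchor_share_of_total = 100.0 * mean_anchor_cost / total

# Difference in mean anchor cost between double-row and simple repairs.
double_row_premium = 4570.25 - 2522.60

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of total")
print(f"anchors: {anchor_share_of_total:.1f}% of total")
print(f"double-row anchor premium: ${double_row_premium:.2f}")
```

The three phases sum to the reported $5904.21, and mean anchor cost works out to a little over 58% of that total.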
Discussion
US health care costs continue to increase unsustainably, with rising pressure on hospitals and providers to deliver the highest value for each health care dollar. The present study is the first to calculate (using the TDABC algorithm) the direct economic cost ($5904.21) of the entire RCS care cycle at a university-based outpatient surgery center. Rent, utility costs, administrative costs, overhead, and other indirect costs at the surgery center were not included in this cost analysis, as they would be incurred irrespective of type of surgery performed. As such, our data isolate the procedure-specific costs of rotator cuff repair in order to provide a more meaningful comparison for other institutions, where indirect costs may be different.
In the literature, rigorous economic analysis of shoulder pathology is sparse. Kuye and colleagues12 systematically reviewed economic evaluations in shoulder surgery for the period 1980–2010 and noted that more than 50% of the papers were published between 2005 and 2010. They also noted the poor quality of these studies and concluded that more rigorous economic evaluations are needed to help justify the rising costs of shoulder-related treatments.
Several studies have directly evaluated costs associated with RCS. Cordasco and colleagues13 detailed the success of open rotator cuff repair as an outpatient procedure—noting its 43% cost savings ($4300 for outpatient vs $7500 for inpatient) and high patient satisfaction—using hospital charge data for operating room time, supplies, instruments, and postoperative slings. Churchill and Ghorai14 evaluated costs of mini-open and arthroscopic rotator cuff repairs in a statewide database and estimated the arthroscopic repair cost at $8985, compared with $7841 for the mini-open repair. They used reported hospital charge data, which were not itemized and did not include physician professional fees. Adla and colleagues,15 in a similar analysis of open versus arthroscopic cuff repair, estimated direct material costs of $1609.50 (arthroscopic) and $360.75 (open); these figures were converted from 2005 UK currency using the exchange rate cited in their paper. Salaries of surgeon, anesthesiologist, and other operating room personnel were said to be included in the operating room cost, but the authors’ paper did not include these data.
Two studies directly estimated the costs of arthroscopic rotator cuff repair. Hearnden and Tennent16 calculated the cost of RCS at their UK institution to be £2672, which included cost of operating room consumable materials, medication, and salaries of operating room personnel, including surgeon and anesthesiologist. Using online currency conversion from 2008 exchange rates and adjusting for inflation gave a corresponding US cost of $5449.63.17 Vitale and colleagues18 prospectively calculated costs of arthroscopic rotator cuff repair over a 1-year period using a cost-to-charge ratio from tabulated inpatient charges, procedure charges, and physician fees and payments abstracted from medical records, hospital billing, and administrative databases. Mean total cost for this cycle was $10,605.20, which included several costs (physical therapy, radiologist fees) not included in the present study. These studies, though more comprehensive than prior work, did not capture the entire cycle of surgical care.
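The conversion of the UK estimate can be written as a product of an exchange rate and an inflation factor. The text does not report either value, so the sketch below only recovers their combined product from the two reported amounts; the function name and the exchange rate of 1.85 are hypothetical and serve only to show that any split of the combined factor reproduces the reported figure.

```python
def gbp_to_adjusted_usd(amount_gbp: float, usd_per_gbp: float, inflation_factor: float) -> float:
    """Convert pounds to dollars, then scale by a cumulative inflation factor."""
    return amount_gbp * usd_per_gbp * inflation_factor


reported_gbp = 2672.00
reported_usd = 5449.63

# Combined (exchange rate x inflation) factor implied by the reported amounts.
implied_combined_factor = reported_usd / reported_gbp

# Hypothetical split of that factor, for illustration only.
assumed_usd_per_gbp = 1.85
implied_inflation_factor = implied_combined_factor / assumed_usd_per_gbp
roundtrip_usd = gbp_to_adjusted_usd(reported_gbp, assumed_usd_per_gbp, implied_inflation_factor)

print(f"implied combined factor: {implied_combined_factor:.4f}")
print(f"round-trip amount: ${roundtrip_usd:.2f}")
```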
Our study was designed to provide initial data on the direct costs of arthroscopic repair of the rotator cuff for the entire process cycle. Our overall cost estimate of $5904.21 differs significantly from prior work—not unexpected given the completely different cost methodology used.
Our study had several limitations. First, it was a single-surgeon evaluation, and a number of operating room variables (eg, use of adjunct instrumentation such as radiofrequency probes, differences in draping preferences) as well as surgeon volume in performing rotator cuff repairs might have substantially affected the reproducibility and generalizability of our data. Similarly, the large number of adjunctive procedures (eg, subacromial decompression, labral débridement) performed in conjunction with the rotator cuff repairs added operative time and therefore increased overall cost. Double-row repairs added operative time and increased the cost of consumable materials as well. Differences in surgeon preference for suture anchors may also be important, as anchors are a major cost driver and can vary significantly between vendors and institutions. Tear-related variables (eg, tear size, tear chronicity, degree of fatty cuff degeneration) were not controlled for and might have significantly affected operative time and associated cost. Resident involvement in the surgical procedure and anesthesia process in an academic setting prolongs surgical time and thus directly impacts costs.
In addition, we used the patient’s time in the operating room as a proxy for actual surgical time, as this was the only reliable and reproducible data point available in our electronic medical record. As such, an unquantifiable amount of surgeon time may have been overallocated to our cost estimate for time spent inducing anesthesia, positioning, helping take the patient off the operating table, and so on. However, as typical surgeon practice is to be involved in these tasks in the operating room, the possible overestimate of surgeon cost is likely minimal.
Our salary data for the TDABC algorithm were based on national averages for work hours and gross income for physicians and on hospital-based wage structure and may not be generalizable to other institutions. There may also be regional differences in work hours and salaries, which in turn would factor into a different per-minute cost for surgeon and anesthesiologist, depending on the exact geographic area where the surgery is performed. Costs may be higher at institutions that use certified nurse anesthetists rather than resident physicians because of the salary differences between these practitioners.
Moreover, the time that patients spend in the holding area—waiting to go into surgery and, after surgery, waiting for their ride home, for their prescriptions to be ready, and so forth—is an important variable to consider from a cost standpoint. However, as this time varied significantly and involved minimal contact with hospital personnel, we excluded its associated costs from our analysis. Similarly, and as already noted, hospital overhead and other indirect costs were excluded from analysis as well.
Conclusion
Using the TDABC algorithm, we found a direct economic cost of $5904.21 for RCS at our academic outpatient surgical center, with anchor cost the main cost driver. Judicious use of consumable resources is a key focus for cost containment in arthroscopic shoulder surgery, particularly with respect to implantable suture anchors. However, in the setting of more complex tears that require multiple anchors in a double-row repair construct, our pilot data may be useful to hospitals and surgery centers negotiating procedural reimbursement for the increased cost of complex repairs. Use of the TDABC algorithm for RCS and other procedures may also help in identifying opportunities to deliver more cost-effective health care.
1. American Academy of Orthopaedic Surgeons. The Burden of Musculoskeletal Diseases in the United States: Prevalence, Societal and Economic Cost. Rosemont, IL: American Academy of Orthopaedic Surgeons; 2011.
2. National health expenditure data. Centers for Medicare & Medicaid Services website. https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/NationalHealthExpendData/index.html. Updated May 5, 2014. Accessed December 1, 2015.
3. Tashjian RZ. Epidemiology, natural history, and indications for treatment of rotator cuff tears. Clin Sports Med. 2012;31(4):589-604.
4. Colvin AC, Egorova N, Harrison AK, Moskowitz A, Flatow EL. National trends in rotator cuff repair. J Bone Joint Surg Am. 2012;94(3):227-233.
5. Black EM, Higgins LD, Warner JJ. Value-based shoulder surgery: practicing outcomes-driven, cost-conscious care. J Shoulder Elbow Surg. 2013;22(7):1000-1009.
6. Porter ME, Teisberg EO. Redefining Health Care: Creating Value-Based Competition on Results. Boston, MA: Harvard Business School Press; 2006.
7. Kaplan RS, Porter ME. How to solve the cost crisis in health care. Harv Bus Rev. 2011;89(9):46-52, 54, 56-61 passim.
8. Kaplan RS, Anderson SR. Time-driven activity-based costing. Harv Bus Rev. 2004;82(11):131-138, 150.
9. Kaplan RS, Anderson SR. Time-Driven Activity-Based Costing: A Simpler and More Powerful Path to Higher Profits. Boston, MA: Harvard Business Review Press; 2007.
10. American Academy of Orthopaedic Surgeons. Orthopaedic Practice in the U.S. 2012. Rosemont, IL: American Academy of Orthopaedic Surgeons; 2012.
11. Medical Group Management Association. Physician Compensation and Production Survey: 2012 Report Based on 2011 Data. Englewood, CO: Medical Group Management Association; 2012.
12. Kuye IO, Jain NB, Warner L, Herndon JH, Warner JJ. Economic evaluations in shoulder pathologies: a systematic review of the literature. J Shoulder Elbow Surg. 2012;21(3):367-375.
13. Cordasco FA, McGinley BJ, Charlton T. Rotator cuff repair as an outpatient procedure. J Shoulder Elbow Surg. 2000;9(1):27-30.
14. Churchill RS, Ghorai JK. Total cost and operating room time comparison of rotator cuff repair techniques at low, intermediate, and high volume centers: mini-open versus all-arthroscopic. J Shoulder Elbow Surg. 2010;19(5):716-721.
15. Adla DN, Rowsell M, Pandey R. Cost-effectiveness of open versus arthroscopic rotator cuff repair. J Shoulder Elbow Surg. 2010;19(2):258-261.
16. Hearnden A, Tennent D. The cost of shoulder arthroscopy: a comparison with national tariff. Ann R Coll Surg Engl. 2008;90(7):587-591.
17. Xrates currency conversion. http://www.x-rates.com/historical/?from=GBP&amount=1&date=2015-12-03. Accessed December 13, 2015.
18. Vitale MA, Vitale MG, Zivin JG, Braman JP, Bigliani LU, Flatow EL. Rotator cuff repair: an analysis of utility scores and cost-effectiveness. J Shoulder Elbow Surg. 2007;16(2):181-187.
1. American Academy of Orthopaedic Surgeons. The Burden of Musculoskeletal Diseases in the United States: Prevalence, Societal and Economic Cost. Rosemont, IL: American Academy of Orthopaedic Surgeons; 2011.
2. National health expenditure data. Centers for Medicare & Medicare Services website. https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/NationalHealthExpendData/index.html. Updated May 5, 2014. Accessed December 1, 2015.
3. Tashjian RZ. Epidemiology, natural history, and indications for treatment of rotator cuff tears. Clin Sports Med. 2012;31(4):589-604.
4. Colvin AC, Egorova N, Harrison AK, Moskowitz A, Flatow EL. National trends in rotator cuff repair. J Bone Joint Surg Am. 2012;94(3):227-233.
5. Black EM, Higgins LD, Warner JJ. Value-based shoulder surgery: practicing outcomes-driven, cost-conscious care. J Shoulder Elbow Surg. 2013;22(7):1000-1009.
6. Porter ME, Teisberg EO. Redefining Health Care: Creating Value-Based Competition on Results. Boston, MA: Harvard Business School Press; 2006.
7. Kaplan RS, Porter ME. How to solve the cost crisis in health care. Harv Bus Rev. 2011;89(9):46-52, 54, 56-61 passim.
8. Kaplan RS, Anderson SR. Time-driven activity-based costing. Harv Bus Rev. 2004;82(11):131-138, 150.
9. Kaplan RS, Anderson SR. Time-Driven Activity-Based Costing: A Simpler and More Powerful Path to Higher Profits. Boston, MA: Harvard Business Review Press; 2007.
10. American Academy of Orthopaedic Surgeons. Orthopaedic Practice in the U.S. 2012. Rosemont, IL: American Academy of Orthopaedic Surgeons; 2012.
11. Medical Group Management Association. Physician Compensation and Production Survey: 2012 Report Based on 2011 Data. Englewood, CO: Medical Group Management Association; 2012.
12. Kuye IO, Jain NB, Warner L, Herndon JH, Warner JJ. Economic evaluations in shoulder pathologies: a systematic review of the literature. J Shoulder Elbow Surg. 2012;21(3):367-375.
13. Cordasco FA, McGinley BJ, Charlton T. Rotator cuff repair as an outpatient procedure. J Shoulder Elbow Surg. 2000;9(1):27-30.
14. Churchill RS, Ghorai JK. Total cost and operating room time comparison of rotator cuff repair techniques at low, intermediate, and high volume centers: mini-open versus all-arthroscopic. J Shoulder Elbow Surg. 2010;19(5):716-721.
15. Adla DN, Rowsell M, Pandey R. Cost-effectiveness of open versus arthroscopic rotator cuff repair. J Shoulder Elbow Surg. 2010;19(2):258-261.
16. Hearnden A, Tennent D. The cost of shoulder arthroscopy: a comparison with national tariff. Ann R Coll Surg Engl. 2008;90(7):587-591.
17. Xrates currency conversion. http://www.x-rates.com/historical/?from=GBP&amount=1&date=2015-12-03. Accessed December 13, 2015.
18. Vitale MA, Vitale MG, Zivin JG, Braman JP, Bigliani LU, Flatow EL. Rotator cuff repair: an analysis of utility scores and cost-effectiveness. J Shoulder Elbow Surg. 2007;16(2):181-187.
Attending Workload, Teaching, and Safety
Teaching attending physicians must balance clinical workload and resident education simultaneously while supervising inpatient services. The workload of teaching attendings has been increasing due to many factors. As patient complexity has increased, length of stay has decreased, creating higher turnover and higher acuity of hospitalized patients.[1, 2, 3, 4, 5] The rising burden of clinical documentation has increased demands on inpatient attending physicians' time.[6] Additionally, resident duty hour restrictions have shifted the responsibility for patient care to the teaching attending.[7] These factors contribute to the perception of unsafe workloads among attending physicians[8] and could impact the ability to teach well.
Teaching effectiveness is an important facet of the graduate medical education (GME) learning environment.[9] Residents perceive that education suffers when their own workload increases,[10, 11, 12, 13, 14] and higher on‐call workload is associated with lower likelihood of participation in educational activities.[15] More contact between resident trainees and supervisory staff may improve the clinical value of inpatient rotations.[16] Program directors have expressed concern about the educational ramifications of work compression.[17, 18, 19, 20] Higher workload for attending physicians can negatively impact patient safety and quality of care,[21, 22] and perception of higher attending workload is associated with less time for teaching.[23] However, the impact of objective measures of attending physician workload on educational outcomes has not been explored. When attending physicians are responsible for increasingly complex clinical care in addition to resident education, teaching effectiveness may suffer. With growing emphasis on the educational environment's effect on healthcare quality and safety,[24] it is imperative to consider the influence of attending workload on patient care and resident education.
The combination of increasing clinical demands, fewer hours in‐house for residents, and less time for teaching has the potential to decrease attending physician teaching effectiveness. In this study, we aimed to evaluate relationships among objective measures of attending physician workload, resident perception of teaching effectiveness, and patient outcomes. We hypothesized that higher workload for attending physicians would be associated with lower ratings of teaching effectiveness and poorer outcomes for patients.
METHODS
We performed a retrospective study of attending physicians who supervised inpatient internal medicine teaching services at Mayo Clinic, Rochester, from July 2005 through June 2011 (6 full academic years). The team structure for each service was 1 attending physician, 1 senior resident, and 3 interns. Senior residents were on call every fourth night, and interns were on call every sixth night. Up to 2 admissions per service were received during the daytime short call, and up to 5 admissions per service were received during the overnight long call. Attending physicians included all supervising physicians in appointment categories of attending/consultant, senior associate consultant, and chief medical resident at the Mayo Clinic. Maximum continuous on‐call time for residents during the study period was restricted to 30 hours. The timeframe of this study was chosen to minimize variability in resident work schedules; effective July 1, 2011, duty hours for postgraduate year 1 residents were further restricted to a maximum of 16 hours in duration.[25]
Measures of Attending Physician Workload
To measure attending physician workload, we examined mean service census as reported at midnight, mean patient length of stay, mean number of daily admissions, and mean number of daily discharges. We also calculated mean daily outpatient relative value units (RVUs) generated as a measure of outpatient workload while the attending was supervising the inpatient service. Similar measures of workload have been used in previous research.[26] Attending physicians in this study functioned as hospitalists during their time supervising the teaching services; that is, they were not routinely assigned to any outpatient responsibilities. The only way for an outpatient RVU to be generated during their time supervising the hospital service was for the attending physician to specifically request to see an outpatient in the clinic. Attending physicians only supervised 1 teaching service at a time and had no concurrent nonteaching service obligations. Admissions were received on a rotating basis. Because patient illness severity may impact workload, we also examined mean expected mortality (per 1000 patients) for all patients on the attending physicians' hospital services.[27]
The above workload variables were measured in the specific timeframe that corresponded to the number of days an attending physician was supervising a particular team; for example, mean census was the mean number of patients on the attending physician's hospital service during his or her time supervising that resident team.
Teaching Effectiveness Outcome Measures
Teaching effectiveness was measured using residents' evaluations of their attending physicians with a 5‐point scale (1 = needs improvement, 3 = average, 5 = top 10% of attending physicians) that has been previously validated in similar contexts.[28, 29, 30, 31, 32] The evaluation questions are shown in Supporting Information, Appendix A, in the online version of this article.
Patient Outcome Measures
Patient outcomes included applicable patient safety indicators (PSIs) as defined by the Agency for Healthcare Research and Quality[33] (see Supporting Information, Appendix B, in the online version of this article), patient transfers to the intensive care unit (ICU), calls to the rapid response team/cardiopulmonary resuscitation team, and patient deaths. Each indicator and event was summarized as occurred or did not occur at the service‐team level. For example, for a particular attending-resident team, the occurrence of each of these events at any point during the time they worked together was recorded as occurred (1) or did not occur (0). Similar measures of patient outcomes have been used in previous research.[32]
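The occurred/did-not-occur coding can be illustrated with a small sketch. It assumes events arrive as (team, event type) pairs; the event labels, function name, and sample records are hypothetical and are not drawn from the study data.

```python
# Event labels are hypothetical names for the outcome types listed above.
EVENT_TYPES = ("psi", "icu_transfer", "rrt_or_cpr_call", "death")


def team_level_indicators(team_ids, events):
    """Collapse patient-level events into 0/1 indicators per team.

    team_ids: every attending-resident team in the study window.
    events:   iterable of (team_id, event_type) pairs, one per event.
    """
    indicators = {team: dict.fromkeys(EVENT_TYPES, 0) for team in team_ids}
    for team_id, event_type in events:
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type}")
        indicators[team_id][event_type] = 1  # repeat events still count once
    return indicators


# Hypothetical records: team "A" has two PSI events and one ICU transfer.
records = [("A", "psi"), ("A", "psi"), ("A", "icu_transfer")]
result = team_level_indicators(["A", "B"], records)
print(result["A"])
print(result["B"])
```

Passing the full list of teams up front ensures that teams with no events are still represented with all indicators set to 0, which a binary-outcome model needs.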
Statistical Analysis
Mixed linear models with variance components covariance structure (including random effects to account for repeated ratings by residents and of faculty) were fit using restricted maximum likelihood to examine associations of attending workload and demographics with teaching scores. Generalized linear regression models, estimated via generalized estimating equations, were used to examine associations of attending workload and demographics with patient outcomes. Due to the binary nature of the outcomes, the binomial distribution and logit link function were used, producing odds ratios (ORs) for covariates akin to those found in standard logistic regression. Multivariate models were used to adjust for physician demographics including age, gender, teaching appointment (consultant, senior associate consultant/temporary clinical appointment, or chief medical resident) and academic rank (professor, associate professor, assistant professor, instructor/none).
To account for multiple comparisons, a significance level of P < 0.01 was used. All analyses were performed using SAS statistical software (version 9.3; SAS Institute Inc., Cary, NC). This study was deemed minimal risk after review by the Mayo Clinic Institutional Review Board.
RESULTS
Over the 6‐year study period, 107 attending physicians supervised internal medicine teaching services. Twenty‐three percent of teaching attending physicians were female. Mean attending age was 42.6 years. Attendings supervised a given service for between 2 and 19 days (mean [standard deviation] = 10.1 [4.1] days). There were 542 internal medicine residents on these teaching services who completed at least 1 teaching evaluation. A total of 69,386 teaching evaluation items were submitted by these residents during the study period.
In a multivariate analysis adjusted for faculty demographics and workload measures, teaching evaluation scores were significantly higher for attending physicians who had an academic rank of professor when compared to attendings who were assistant professors (β = −0.12, P = 0.007) or instructors/no academic rank (β = −0.23, P < 0.0001). The number of days an attending physician spent with the team showed a positive association with teaching evaluations (β = +0.015, P < 0.0001).
Associations between measures of attending physician workload and teaching evaluation scores are shown in Table 1. Mean midnight census and mean number of daily discharges were associated with lower teaching evaluation scores (both β = −0.026, P < 0.0001). Mean number of daily admissions was associated with higher teaching scores (β = +0.021, P = 0.001). The mean expected mortality among hospitalized patients on the services supervised by teaching attendings and the outpatient RVUs generated by these attendings during the time they were supervising the hospital service showed no association with teaching scores. The average number of RVUs generated during an attending's entire time supervising hospital service was <1.
Attending Physician Workload Measure | Mean (SD) | Multivariate Analysis*: β | SE | 99% CI | P |
---|---|---|---|---|---|
Midnight census | 8.86 (1.8) | −0.026 | 0.002 | (−0.03, −0.02) | <0.0001 |
Length of stay, d | 6.91 (3.0) | +0.006 | 0.001 | (0.002, 0.009) | <0.0001 |
Expected mortality (per 1,000 patients) | 51.94 (27.4) | −0.0001 | 0.0001 | (−0.0004, 0.0001) | 0.19 |
Daily admissions | 2.23 (0.54) | +0.021 | 0.006 | (0.004, 0.037) | 0.001 |
Daily discharges | 2.13 (0.56) | −0.026 | 0.006 | (−0.041, −0.010) | <0.0001 |
Daily outpatient relative value units | 0.69 (1.2) | +0.004 | 0.003 | (−0.002, 0.011) | 0.10 |
Table 2 shows relationships between attending physician workload and patient outcomes for the patients on hospital services supervised by the 107 attending physicians during the study period. Patient outcome data showed positive associations between measures of higher workload and PSIs. Specifically, for each 1‐patient increase in the average number of daily admissions to the attending and resident services, the odds that the cohort of patients under the team's care included at least 1 patient with a PSI event were 1.8 times higher (OR = 1.81, 99% confidence interval [CI]: 1.21, 2.71, P = 0.0001). Likewise, for each 1‐day increase in average length of stay, the odds that the cohort included at least 1 patient with a PSI were 1.16 times higher (OR = 1.16, 99% CI: 1.07, 1.26, P < 0.0001). As anticipated, mean expected mortality was associated with actual mortality, cardiopulmonary resuscitation/rapid response team calls, and ICU transfers. There were no associations between patient outcomes and workload measures of midnight census and outpatient RVUs.
Patient Outcomes, Multivariate Analysis*
Outcome groups: Patient Safety Indicators (PSI), n = 513; Deaths, n = 352; CPR/RRT Calls, n = 409; ICU Transfers, n = 737
Workload measures | PSI OR | PSI SE | PSI P | PSI 99% CI | Deaths OR | Deaths SE | Deaths P | Deaths 99% CI | CPR/RRT OR | CPR/RRT SE | CPR/RRT P | CPR/RRT 99% CI | ICU OR | ICU SE | ICU P | ICU 99% CI |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Midnight census | 1.10 | 0.05 | 0.04 | (0.98, 1.24) | 0.91 | 0.04 | 0.03 | (0.81, 1.02) | 0.95 | 0.04 | 0.16 | (0.86, 1.05) | 1.06 | 0.04 | 0.16 | (0.96, 1.17) |
Length of stay | 1.16 | 0.04 | <0.0001 | (1.07, 1.26) | 1.03 | 0.03 | 0.39 | (0.95, 1.12) | 0.99 | 0.03 | 0.63 | (0.92, 1.05) | 1.10 | 0.03 | 0.0001 | (1.03, 1.18) |
Expected mortality (per 1,000 patients) | 1.00 | 0.003 | 0.24 | (0.99, 1.01) | 1.01 | 0.00 | 0.002 | (1.00, 1.02) | 1.02 | 0.00 | <0.0001 | (1.01, 1.02) | 1.01 | 0.00 | 0.003 | (1.00, 1.01) |
Daily admissions | 1.81 | 0.28 | 0.0001 | (1.21, 2.71) | 0.78 | 0.14 | 0.16 | (0.49, 1.24) | 1.11 | 0.20 | 0.57 | (0.69, 1.77) | 1.34 | 0.24 | 0.09 | (0.85, 2.11) |
Daily discharges | 1.06 | 0.13 | 0.61 | (0.78, 1.45) | 2.36 | 0.38 | <0.0001 | (1.56, 3.57) | 0.94 | 0.16 | 0.70 | (0.60, 1.46) | 1.09 | 0.16 | 0.53 | (0.75, 1.60) |
Daily outpatient relative value units | 0.81 | 0.07 | 0.01 | (0.65, 1.00) | 1.02 | 0.04 | 0.56 | (0.92, 1.13) | 1.05 | 0.04 | 0.23 | (0.95, 1.17) | 0.92 | 0.06 | 0.23 | (0.77, 1.09) |
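The relationship between a logit coefficient and the odds ratios with 99% intervals reported in Table 2 can be sketched as follows. This is an illustration, not the authors' SAS code. The function name is ours, and the log-scale standard error used in the example is not reported in the article; it is an assumption back-calculated from the published interval for daily admissions and PSIs.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(log_odds, se_log_odds, level=0.99):
    """Return (OR, lower, upper) from a logit coefficient and its standard error."""
    # Two-sided critical value; about 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(log_odds - z * se_log_odds)
    upper = math.exp(log_odds + z * se_log_odds)
    return math.exp(log_odds), lower, upper


# Daily admissions vs PSIs: OR = 1.81 is taken from Table 2.
# The log-scale SE of 0.1565 is NOT reported in the article; it is an
# assumption back-calculated from the published interval (1.21, 2.71).
or_, lo, hi = odds_ratio_ci(math.log(1.81), 0.1565)
print(f"OR = {or_:.2f}, 99% CI ({lo:.2f}, {hi:.2f})")
```

The interval is symmetric on the log-odds scale and therefore asymmetric around the odds ratio itself. Multiplying the assumed log-scale SE by the odds ratio gives roughly 0.28, which agrees with the SE column for this row and suggests, although the article does not say so, that the tabulated SEs are on the odds-ratio scale.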
DISCUSSION
This study of internal medicine attending physician workload and resident education demonstrates that higher workload among attending physicians is associated with slightly lower teaching evaluation scores from residents as well as increased risks to patient safety.
The prior literature examining relationships between workload and teaching effectiveness is largely survey‐based and reliant upon physicians' self‐reported perceptions of workload.[10, 13, 23] The present study strengthens this evidence by using multiple objective measures of workload, objective measures of patient safety, and a large sample of teaching evaluations.
An interesting finding in this study was that the number of patient dismissals per day was associated with a significant decrease in teaching scores, whereas the number of admissions per day was associated with increased teaching scores. These findings may seem contradictory, because the number of admissions and discharges both measure physician workload. However, a likely explanation for this apparent inconsistency is that on internal medicine inpatient teaching services, much of the teaching of residents occurs at the time of a patient admission as residents are presenting cases to the attending physician, exploring differential diagnoses, and discussing management plans. By contrast, a patient dismissal tends to consist mainly of patient interaction, paperwork, and phone calls by the resident with less input required from the attending physician. Our findings suggest that although patient admissions remain a rich opportunity for resident education, patient dismissals may increase workload without improving teaching evaluations. As the inpatient hospital environment evolves, exploring options for nonphysician providers to assist with or complete patient dismissals may have a beneficial effect on resident education.[34] In addition, exploring more efficient teaching strategies may be beneficial in the fast‐paced inpatient learning milieu.[35]
There was a statistically significant positive association between the number of days an attending physician spent with the team and teaching evaluations. Although prior work has examined advantages and disadvantages of various resident schedules,[36, 37, 38] our results suggest scheduling models that emphasize continuity of the teaching attending and residents may be preferred to enhance teaching effectiveness. Further study would help elucidate potential implications of this finding for the scheduling of supervisory attendings to optimize education.
In this analysis, patient outcome measures were largely independent of attending physician workload, with the exception of PSIs. PSIs have been associated with longer stays in the hospital,[39, 40] which is consistent with our findings. However, mean daily admissions were also associated with PSIs. It could be expected that the more patients on a hospital service, the more PSIs will result. However, there was not a significant association between midnight census and PSIs when other variables were accounted for. Because new patient admissions are time consuming and contribute to the workload of both residents and attending physicians, it is possible that safety of the service's hospitalized patients is compromised when the team is putting time and effort toward new patients. Previous research has shown variability in PSI trends with changes in the workload environment.[41] Further studies are needed to fully explore relationships between admission volume and PSIs on teaching services.
It is worthwhile to note that attending physicians have specific responsibilities of supervision and documentation for new admissions. Although it could be argued that new admissions raise the workload for the entire team, and the higher team workload may impact teaching evaluations, previous research has demonstrated that resident burnout and well‐being, which are influenced by workload, do not impact residents' assessments of teachers.[42] In addition, metrics that could arguably be more apt to measure the workload of the team as a whole (eg, team census) did not show a significant association with patient outcomes.
This study has important limitations. First, the cohort of attending physicians, residents, and patients was from a large single institution and may not be generalizable to all settings. Second, most attending physicians in this sample were experienced teachers, so consequences of increased workload may have been managed effectively without a major impact on resident education in some cases. Third, the magnitude of change in teaching effectiveness, although statistically significant, was small and might call into question the educational significance of these findings. Fourth, although resident satisfaction does not influence teaching scores, it is possible that residents' perception of their own workload may have impacted teaching evaluations. Finally, data collection was intentionally closed at the end of the 2011 academic year because accreditation standards for resident duty hours changed again at that time.[43] Thus, these data may not directly reflect the evolving hospital learning environment but serve as a useful benchmark for future studies of workload and teaching effectiveness in the inpatient setting. Once hospitals have had sufficient time and experience with the new duty hour standards, additional studies exploring relationships between workload, teaching effectiveness, and patient outcomes may be warranted.
Limitations notwithstanding, this study shows that attending physician workload may adversely impact teaching and patient safety on internal medicine hospital services. Ongoing efforts by residency programs to optimize the learning environment should include strategies to manage the workload of supervising attendings.
Disclosures
This publication was made possible in part by Clinical and Translational Science Award grant number UL1 TR000135 from the National Center for Advancing Translational Sciences, a component of the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NIH. The authors also acknowledge support from the Mayo Clinic Department of Medicine Write‐up and Publish grant. In addition, this study was supported in part by the Mayo Clinic Internal Medicine Residency Office of Education Innovations as part of the Accreditation Council for Graduate Medical Education Educational Innovations Project. The information contained in this article was based in part on the performance package data maintained by the University HealthSystem Consortium. Copyright 2015 UHC. All rights reserved.
1. The future of residents' education in internal medicine. Am J Med. 2004;116(9):648–650.
2. Redesigning residency education in internal medicine: a position paper from the Association of Program Directors in Internal Medicine. Ann Intern Med. 2006;144(12):920–926.
3. Residency training in the modern era: the pipe dream of less time to learn more, care better, and be more professional. Arch Intern Med. 2005;165(22):2561–2562.
4. Trends in Hospitalizations Among Medicare Survivors of Aortic Valve Replacement in the United States From 1999 to 2010. Ann Thorac Surg. 2015;99(2):509–517.
5. Restructuring an inpatient resident service to improve outcomes for residents, students, and patients. Acad Med. 2011;86(12):1500–1507.
6. Clinical documentation in the 21st century: executive summary of a policy position paper from the American College of Physicians. Ann Intern Med. 2015;162(4):301–303.
7. Effect of ACGME duty hours on attending physician teaching and satisfaction. Arch Intern Med. 2008;168(11):1226–1228.
8. Identifying potential predictors of a safe attending physician workload: a survey of hospitalists. J Hosp Med. 2013;8(11):644–646.
9. The clinical learning environment: the foundation of graduate medical education. JAMA. 2013;309(16):1687–1688.
10. Better rested, but more stressed? Evidence of the effects of resident work hour restrictions. Acad Pediatr. 2012;12(4):335–343.
11. Multifaceted longitudinal study of surgical resident education, quality of life, and patient care before and after July 2011. J Surg Educ. 2013;70(6):769–776.
12. Impact of the new 16‐hour duty period on pediatric interns' neonatal education. Clin Pediatr (Phila). 2014;53(1):51–59.
13. Relationship between resident workload and self‐perceived learning on inpatient medicine wards: a longitudinal study. BMC Med Educ. 2006;6:35.
14. Perceptions of educational experience and inpatient workload among pediatric residents. Hosp Pediatr. 2013;3(3):276–284.
15. Association of workload of on‐call medical interns with on‐call sleep duration, shift duration, and participation in educational activities. JAMA. 2008;300(10):1146–1153.
16. Effects of increased overnight supervision on resident education, decision‐making, and autonomy. J Hosp Med. 2012;7(8):606–610.
17. Approval and perceived impact of duty hour regulations: survey of pediatric program directors. Pediatrics. 2013;132(5):819–824.
18. Anticipated consequences of the 2011 duty hours standards: views of internal medicine and surgery program directors. Acad Med. 2012;87(7):895–903.
19. Training on the clock: family medicine residency directors' responses to resident duty hours reform. Acad Med. 2006;81(12):1032–1037.
20. Duty hour recommendations and implications for meeting the ACGME core competencies: views of residency directors. Mayo Clin Proc. 2011;86(3):185–191.
21. Does surgeon workload per day affect outcomes after pulmonary lobectomies? Ann Thorac Surg. 2012;94(3):966–973.
22. Impact of attending physician workload on patient care: a survey of hospitalists. JAMA Intern Med. 2013;173(5):375–377.
23. No time for teaching? Inpatient attending physicians' workload and teaching before and after the implementation of the 2003 duty hours regulations. Acad Med. 2013;88(9):1293–1298.
24. Accreditation Council for Graduate Medical Education. Clinical Learning Environment Review (CLER) Program. Available at: http://www.acgme.org/acgmeweb/tabid/436/ProgramandInstitutionalAccreditation/NextAccreditationSystem/ClinicalLearningEnvironmentReviewProgram.aspx. Accessed April 27, 2015.
25. Accreditation Council for Graduate Medical Education. Frequently Asked Questions: A ACGME common duty hour requirements. Available at: https://www.acgme.org/acgmeweb/Portals/0/PDFs/dh‐faqs 2011.pdf. Accessed April 27, 2015.
26. Effect of hospitalist workload on the quality and efficiency of care. JAMA Intern Med. 2014;174(5):786–793.
27. University HealthSystem Consortium. UHC clinical database/resource manager for Mayo Clinic. Available at: http://www.uhc.edu. Data accessed August 25, 2011.
28. The interpersonal, cognitive and efficiency domains of clinical teaching: construct validity of a multi‐dimensional scale. Med Educ. 2005;39(12):1221–1229.
29. Factor instability of clinical teaching assessment scores among general internists and cardiologists. Med Educ. 2006;40(12):1209–1216.
30. Determining reliability of clinical assessment scores in real time. Teach Learn Med. 2009;21(3):188–194.
31. Behaviors of highly professional resident physicians. JAMA. 2008;300(11):1326–1333.
32. Service census caps and unit‐based admissions: resident workload, conference attendance, duty hour compliance, and patient safety. Mayo Clin Proc. 2012;87(4):320–327.
33. Agency for Healthcare Research and Quality. Patient safety indicators technical specifications updates—Version 5.0, March 2015. Available at: http://www.qualityindicators.ahrq.gov/Modules/PSI_TechSpec.aspx. Accessed May 29, 2015.
34. The impact of nonphysician clinicians: do they improve the quality and cost‐effectiveness of health care services? Med Care Res Rev. 2009;66(6 suppl):36S–89S.
35. Maximizing teaching on the wards: review and application of the One‐Minute Preceptor and SNAPPS models. J Hosp Med. 2015;10(2):125–130.
36. Resident perceptions of the educational value of night float rotations. Teach Learn Med. 2010;22(3):196–201.
37. An evaluation of internal medicine residency continuity clinic redesign to a 50/50 outpatient‐inpatient model. J Gen Intern Med. 2013;28(8):1014–1019.
38. Revisiting the rotating call schedule in less than 80 hours per week. J Surg Educ. 2009;66(6):357–360.
39. Excess length of stay, charges, and mortality attributable to medical injuries during hospitalization. JAMA. 2003;290(14):1868–1874.
40. Agency for Healthcare Research and Quality patient safety indicators and mortality in surgical patients. Am Surg. 2014;80(8):801–804.
41. Patient safety in the era of the 80‐hour workweek. J Surg Educ. 2014;71(4):551–559.
42. Impact of resident well‐being and empathy on assessments of faculty physicians. J Gen Intern Med. 2010;25(1):52–56.
43. Stress management training for surgeons‐a randomized, controlled, intervention study. Ann Surg. 2011;253(3):488–494.
This study has important limitations. First, the cohort of attending physicians, residents, and patients was from a large single institution and may not be generalizable to all settings. Second, most attending physicians in this sample were experienced teachers, so consequences of increased workload may have been managed effectively without a major impact on resident education in some cases. Third, the magnitude of change in teaching effectiveness, although statistically significant, was small and might call into question the educational significance of these findings. Fourth, although resident satisfaction does not influence teaching scores, it is possible that residents' perception of their own workload may have impacted teaching evaluations. Finally, data collection was intentionally closed at the end of the 2011 academic year because accreditation standards for resident duty hours changed again at that time.[43] Thus, these data may not directly reflect the evolving hospital learning environment but serve as a useful benchmark for future studies of workload and teaching effectiveness in the inpatient setting. Once hospitals have had sufficient time and experience with the new duty hour standards, additional studies exploring relationships between workload, teaching effectiveness, and patient outcomes may be warranted.
Limitations notwithstanding, this study shows that attending physician workload may adversely impact teaching and patient safety on internal medicine hospital services. Ongoing efforts by residency programs to optimize the learning environment should include strategies to manage the workload of supervising attendings.
Disclosures
This publication was made possible in part by Clinical and Translational Science Award grant number UL1 TR000135 from the National Center for Advancing Translational Sciences, a component of the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NIH. Authors also acknowledge support for the Mayo Clinic Department of Medicine Write‐up and Publish grant. In addition, this study was supported in part by the Mayo Clinic Internal Medicine Residency Office of Education Innovations as part of the Accreditation Council for Graduate Medical Education Educational Innovations Project. The information contained in this article was based in part on the performance package data maintained by the University HealthSystem Consortium. Copyright 2015 UHC. All rights reserved.
Teaching attending physicians must balance clinical workload and resident education simultaneously while supervising inpatient services. The workload of teaching attendings has been increasing due to many factors. As patient complexity has increased, length of stay has decreased, creating higher turnover and higher acuity of hospitalized patients.[1, 2, 3, 4, 5] The rising burden of clinical documentation has increased demands on inpatient attending physicians' time.[6] Additionally, resident duty hour restrictions have shifted the responsibility for patient care to the teaching attending.[7] These factors contribute to the perception of unsafe workloads among attending physicians[8] and could impact the ability to teach well.
Teaching effectiveness is an important facet of the graduate medical education (GME) learning environment.[9] Residents perceive that education suffers when their own workload increases,[10, 11, 12, 13, 14] and higher on‐call workload is associated with lower likelihood of participation in educational activities.[15] More contact between resident trainees and supervisory staff may improve the clinical value of inpatient rotations.[16] Program directors have expressed concern about the educational ramifications of work compression.[17, 18, 19, 20] Higher workload for attending physicians can negatively impact patient safety and quality of care,[21, 22] and perception of higher attending workload is associated with less time for teaching.[23] However, the impact of objective measures of attending physician workload on educational outcomes has not been explored. When attending physicians are responsible for increasingly complex clinical care in addition to resident education, teaching effectiveness may suffer. With growing emphasis on the educational environment's effect on healthcare quality and safety,[24] it is imperative to consider the influence of attending workload on patient care and resident education.
The combination of increasing clinical demands, fewer hours in‐house for residents, and less time for teaching has the potential to decrease attending physician teaching effectiveness. In this study, we aimed to evaluate relationships among objective measures of attending physician workload, resident perception of teaching effectiveness, and patient outcomes. We hypothesized that higher workload for attending physicians would be associated with lower ratings of teaching effectiveness and poorer outcomes for patients.
METHODS
We performed a retrospective study of attending physicians who supervised inpatient internal medicine teaching services at Mayo Clinic Rochester from July 2005 through June 2011 (6 full academic years). The team structure for each service was 1 attending physician, 1 senior resident, and 3 interns. Senior residents were on call every fourth night, and interns were on call every sixth night. Up to 2 admissions per service were received during the daytime short call, and up to 5 admissions per service were received during the overnight long call. Attending physicians included all supervising physicians in appointment categories of attending/consultant, senior associate consultant, and chief medical resident at the Mayo Clinic. Maximum continuous on‐call time for residents during the study period was restricted to 30 hours. The timeframe of this study was chosen to minimize variability in resident work schedules; effective July 1, 2011, duty hours for postgraduate year 1 residents were further restricted to a maximum of 16 hours in duration.[25]
Measures of Attending Physician Workload
To measure attending physician workload, we examined mean service census as reported at midnight, mean patient length of stay, mean number of daily admissions, and mean number of daily discharges. We also calculated mean daily outpatient relative value units (RVUs) generated as a measure of outpatient workload while the attending was supervising the inpatient service. Similar measures of workload have been used in previous research.[26] Attending physicians in this study functioned as hospitalists during their time supervising the teaching services; that is, they were not routinely assigned to any outpatient responsibilities. The only way for an outpatient RVU to be generated during their time supervising the hospital service was for the attending physician to specifically request to see an outpatient in the clinic. Attending physicians only supervised 1 teaching service at a time and had no concurrent nonteaching service obligations. Admissions were received on a rotating basis. Because patient illness severity may impact workload, we also examined mean expected mortality (per 1000 patients) for all patients on the attending physicians' hospital services.[27]
The above workload variables were measured in the specific timeframe that corresponded to the number of days an attending physician was supervising a particular team; for example, mean census was the mean number of patients on the attending physician's hospital service during his or her time supervising that resident team.
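As a minimal sketch of how such per‐stint averages could be computed, the following assumes one record per calendar day that an attending supervised a team. The field names and values are hypothetical and are not drawn from the study data.

```python
from statistics import mean

# Hypothetical daily records for one attending's time with one resident team.
# Field names and counts are illustrative assumptions, not study data.
days = [
    {"midnight_census": 9, "admissions": 2, "discharges": 3},
    {"midnight_census": 8, "admissions": 3, "discharges": 2},
    {"midnight_census": 10, "admissions": 2, "discharges": 1},
]

def workload_summary(daily_records):
    """Average each daily workload count over the supervision window."""
    keys = ("midnight_census", "admissions", "discharges")
    return {f"mean_{k}": mean(d[k] for d in daily_records) for k in keys}

summary = workload_summary(days)
```

Restricting the input to the days a given attending and team worked together is what ties each workload value to the same window as the corresponding teaching evaluations.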
Teaching Effectiveness Outcome Measures
Teaching effectiveness was measured using residents' evaluations of their attending physicians with a 5‐point scale (1 = needs improvement, 3 = average, 5 = top 10% of attending physicians) that has been previously validated in similar contexts.[28, 29, 30, 31, 32] The evaluation questions are shown in Supporting Information, Appendix A, in the online version of this article.
Patient Outcome Measures
Patient outcomes included applicable patient safety indicators (PSIs) as defined by the Agency for Healthcare Research and Quality[33] (see Supporting Information, Appendix B, in the online version of this article), patient transfers to the intensive care unit (ICU), calls to the rapid response team/cardiopulmonary resuscitation team, and patient deaths. Each indicator and event was summarized as occurred or did not occur at the service‐team level. For example, for a particular attending‐resident team, the occurrence of each of these events at any point during the time they worked together was recorded as occurred (1) or did not occur (0). Similar measures of patient outcomes have been used in previous research.[32]
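The occurred/did not occur coding described above can be sketched as follows. The outcome labels and the event list are hypothetical placeholders; the study's actual data structures are not described in the text.

```python
# Outcome types tracked per attending-resident team (labels are illustrative).
OUTCOMES = ("psi", "icu_transfer", "cpr_rrt_call", "death")

def code_team_outcomes(events, outcome_types=OUTCOMES):
    """Return 1 if an outcome type occurred at least once for the team, else 0."""
    observed = set(events)
    return {outcome: int(outcome in observed) for outcome in outcome_types}

# Hypothetical events logged while one attending and team worked together.
team_events = ["icu_transfer", "cpr_rrt_call", "icu_transfer"]
coded = code_team_outcomes(team_events)
```

Note that repeated events of the same type collapse to a single 1, so this coding records whether an event happened during the stint, not how often.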
Statistical Analysis
Mixed linear models with variance components covariance structure (including random effects to account for repeated ratings by residents and of faculty) were fit using restricted maximum likelihood to examine associations of attending workload and demographics with teaching scores. Generalized linear regression models, estimated via generalized estimating equations, were used to examine associations of attending workload and demographics with patient outcomes. Due to the binary nature of the outcomes, the binomial distribution and logit link function were used, producing odds ratios (ORs) for covariates akin to those found in standard logistic regression. Multivariate models were used to adjust for physician demographics including age, gender, teaching appointment (consultant, senior associate consultant/temporary clinical appointment, or chief medical resident) and academic rank (professor, associate professor, assistant professor, instructor/none).
To account for multiple comparisons, a significance level of P < 0.01 was used. All analyses were performed using SAS statistical software (version 9.3; SAS Institute Inc., Cary, NC). This study was deemed minimal risk after review by the Mayo Clinic Institutional Review Board.
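For orientation, the two model families described above can be written schematically as follows. The notation is an assumption introduced here for illustration (the article does not give model equations): $y_{ijk}$ is the score on evaluation item $k$ given by resident $i$ to attending $j$, $\mathbf{x}$ collects workload measures and attending demographics, and $Y_t$ is the binary outcome indicator for team $t$.

```latex
% Mixed linear model for teaching scores (random intercepts for resident and attending)
y_{ijk} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + v_j + \varepsilon_{ijk},
\qquad u_i \sim N(0,\sigma_u^2),\quad v_j \sim N(0,\sigma_v^2),\quad
\varepsilon_{ijk} \sim N(0,\sigma^2)

% Logit-link model for a binary patient outcome, estimated via GEE
\operatorname{logit}\,\Pr(Y_t = 1) = \alpha_0 + \mathbf{x}_t^{\top}\boldsymbol{\alpha},
\qquad \mathrm{OR}_m = e^{\alpha_m}
```

Under this sketch, each $\beta$ is the expected change in teaching score per 1‐unit increase in a workload measure, and each odds ratio is the exponentiated logit coefficient for that measure.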
RESULTS
Over the 6‐year study period, 107 attending physicians supervised internal medicine teaching services. Twenty‐three percent of teaching attending physicians were female. Mean attending age was 42.6 years. Attendings supervised a given service for between 2 and 19 days (mean [standard deviation] = 10.1 [4.1] days). There were 542 internal medicine residents on these teaching services who completed at least 1 teaching evaluation. A total of 69,386 teaching evaluation items were submitted by these residents during the study period.
In a multivariate analysis adjusted for faculty demographics and workload measures, teaching evaluation scores were significantly higher for attending physicians who had an academic rank of professor when compared to attendings who were assistant professors (β = −0.12, P = 0.007) or instructors/no academic rank (β = −0.23, P < 0.0001). The number of days an attending physician spent with the team showed a positive association with teaching evaluations (β = +0.015, P < 0.0001).
Associations between measures of attending physician workload and teaching evaluation scores are shown in Table 1. Mean midnight census and mean number of daily discharges were associated with lower teaching evaluation scores (both β = −0.026, P < 0.0001). Mean number of daily admissions was associated with higher teaching scores (β = +0.021, P = 0.001). The mean expected mortality among hospitalized patients on the services supervised by teaching attendings and the outpatient RVUs generated by these attendings during the time they were supervising the hospital service showed no association with teaching scores. The average number of RVUs generated during an attending's entire time supervising hospital service was <1.
Attending Physician Workload Measure | Mean (SD) | β (Multivariate Analysis) | SE | 99% CI | P |
---|---|---|---|---|---|
Midnight census | 8.86 (1.8) | −0.026 | 0.002 | (−0.03, −0.02) | <0.0001 |
Length of stay, d | 6.91 (3.0) | +0.006 | 0.001 | (0.002, 0.009) | <0.0001 |
Expected mortality (per 1,000 patients) | 51.94 (27.4) | −0.0001 | 0.0001 | (−0.0004, 0.0001) | 0.19 |
Daily admissions | 2.23 (0.54) | +0.021 | 0.006 | (0.004, 0.037) | 0.001 |
Daily discharges | 2.13 (0.56) | −0.026 | 0.006 | (−0.041, −0.010) | <0.0001 |
Daily outpatient relative value units | 0.69 (1.2) | +0.004 | 0.003 | (−0.002, 0.011) | 0.10 |
Table 2 shows relationships between attending physician workload and patient outcomes for the patients on hospital services supervised by 107 attending physicians during the study period. Patient outcome data showed positive associations between measures of higher workload and PSIs. Specifically, for each 1‐patient increase in the average number of daily admissions to the attending and resident services, the cohort of patients under the team's care was 1.8 times more likely to include at least 1 patient with a PSI event (OR = 1.81, 99% confidence interval [CI]: 1.21, 2.71, P = 0.0001). Likewise, for each 1‐day increase in average length of stay, the cohort of patients under the team's care was 1.16 times more likely to have at least 1 patient with a PSI (OR = 1.16, 99% CI: 1.07, 1.26, P < 0.0001). As anticipated, mean expected mortality was associated with actual mortality, cardiopulmonary resuscitation/rapid response team calls, and ICU transfers. There were no associations between patient outcomes and workload measures of midnight census and outpatient RVUs.
Patient Outcomes, Multivariate Analysis. Patient safety indicators (PSI), n = 513; deaths, n = 352; CPR/RRT calls, n = 409; ICU transfers, n = 737.

Workload measures | PSI OR | PSI SE | PSI P | PSI 99% CI | Deaths OR | Deaths SE | Deaths P | Deaths 99% CI | CPR/RRT OR | CPR/RRT SE | CPR/RRT P | CPR/RRT 99% CI | ICU OR | ICU SE | ICU P | ICU 99% CI |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Midnight census | 1.10 | 0.05 | 0.04 | (0.98, 1.24) | 0.91 | 0.04 | 0.03 | (0.81, 1.02) | 0.95 | 0.04 | 0.16 | (0.86, 1.05) | 1.06 | 0.04 | 0.16 | (0.96, 1.17) |
Length of stay | 1.16 | 0.04 | <0.0001 | (1.07, 1.26) | 1.03 | 0.03 | 0.39 | (0.95, 1.12) | 0.99 | 0.03 | 0.63 | (0.92, 1.05) | 1.10 | 0.03 | 0.0001 | (1.03, 1.18) |
Expected mortality (per 1,000 patients) | 1.00 | 0.003 | 0.24 | (0.99, 1.01) | 1.01 | 0.00 | 0.002 | (1.00, 1.02) | 1.02 | 0.00 | <0.0001 | (1.01, 1.02) | 1.01 | 0.00 | 0.003 | (1.00, 1.01) |
Daily admissions | 1.81 | 0.28 | 0.0001 | (1.21, 2.71) | 0.78 | 0.14 | 0.16 | (0.49, 1.24) | 1.11 | 0.20 | 0.57 | (0.69, 1.77) | 1.34 | 0.24 | 0.09 | (0.85, 2.11) |
Daily discharges | 1.06 | 0.13 | 0.61 | (0.78, 1.45) | 2.36 | 0.38 | <0.0001 | (1.56, 3.57) | 0.94 | 0.16 | 0.70 | (0.60, 1.46) | 1.09 | 0.16 | 0.53 | (0.75, 1.60) |
Daily outpatient relative value units | 0.81 | 0.07 | 0.01 | (0.65, 1.00) | 1.02 | 0.04 | 0.56 | (0.92, 1.13) | 1.05 | 0.04 | 0.23 | (0.95, 1.17) | 0.92 | 0.06 | 0.23 | (0.77, 1.09) |
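To make the link between an odds ratio, its 99% CI, and its P value concrete, the sketch below back‐calculates a Wald test from the reported daily admissions result for PSIs (OR = 1.81, 99% CI: 1.21, 2.71). It assumes the interval is symmetric on the log‐odds scale and uses a normal approximation; the function name is hypothetical, and this is a consistency check, not the authors' SAS procedure.

```python
from math import log
from statistics import NormalDist

def wald_from_or_ci(odds_ratio, ci_low, ci_high, level=0.99):
    """Recover the log-scale SE, z statistic, and two-sided P from an OR and its CI."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)   # about 2.576 for a 99% CI
    log_or = log(odds_ratio)
    # CI half-width on the log scale equals z_crit * SE.
    se_log = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log_or / se_log
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return {"log_or": log_or, "se_log": se_log, "z": z, "p": p}

# Reported values for daily admissions and PSIs.
result = wald_from_or_ci(1.81, 1.21, 2.71)
```

Under these assumptions the recovered P value is on the order of 10^-4, in line with the reported P = 0.0001. A 99% CI that excludes 1 corresponds to P < 0.01, the significance threshold used in this study.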
DISCUSSION
This study of internal medicine attending physician workload and resident education demonstrates that higher workload among attending physicians is associated with slightly lower teaching evaluation scores from residents as well as increased risks to patient safety.
The prior literature examining relationships between workload and teaching effectiveness is largely survey‐based and reliant upon physicians' self‐reported perceptions of workload.[10, 13, 23] The present study strengthens this evidence by using multiple objective measures of workload, objective measures of patient safety, and a large sample of teaching evaluations.
An interesting finding in this study was that the number of patient dismissals per day was associated with a significant decrease in teaching scores, whereas the number of admissions per day was associated with increased teaching scores. These findings may seem contradictory, because the number of admissions and discharges both measure physician workload. However, a likely explanation for this apparent inconsistency is that on internal medicine inpatient teaching services, much of the teaching of residents occurs at the time of a patient admission as residents are presenting cases to the attending physician, exploring differential diagnoses, and discussing management plans. By contrast, a patient dismissal tends to consist mainly of patient interaction, paperwork, and phone calls by the resident with less input required from the attending physician. Our findings suggest that although patient admissions remain a rich opportunity for resident education, patient dismissals may increase workload without improving teaching evaluations. As the inpatient hospital environment evolves, exploring options for nonphysician providers to assist with or complete patient dismissals may have a beneficial effect on resident education.[34] In addition, exploring more efficient teaching strategies may be beneficial in the fast‐paced inpatient learning milieu.[35]
There was a statistically significant positive association between the number of days an attending physician spent with the team and teaching evaluations. Although prior work has examined advantages and disadvantages of various resident schedules,[36, 37, 38] our results suggest scheduling models that emphasize continuity of the teaching attending and residents may be preferred to enhance teaching effectiveness. Further study would help elucidate potential implications of this finding for the scheduling of supervisory attendings to optimize education.
In this analysis, patient outcome measures were largely independent of attending physician workload, with the exception of PSIs. PSIs have been associated with longer stays in the hospital,[39, 40] which is consistent with our findings. However, mean daily admissions were also associated with PSIs. It could be expected that the more patients on a hospital service, the more PSIs will result. However, there was not a significant association between midnight census and PSIs when other variables were accounted for. Because new patient admissions are time consuming and contribute to the workload of both residents and attending physicians, it is possible that safety of the service's hospitalized patients is compromised when the team is putting time and effort toward new patients. Previous research has shown variability in PSI trends with changes in the workload environment.[41] Further studies are needed to fully explore relationships between admission volume and PSIs on teaching services.
It is worthwhile to note that attending physicians have specific responsibilities of supervision and documentation for new admissions. Although it could be argued that new admissions raise the workload for the entire team, and the higher team workload may impact teaching evaluations, previous research has demonstrated that resident burnout and well‐being, which are influenced by workload, do not impact residents' assessments of teachers.[42] In addition, metrics that could arguably be more apt to measure the workload of the team as a whole (eg, team census) did not show a significant association with patient outcomes.
This study has important limitations. First, the cohort of attending physicians, residents, and patients was from a large single institution and may not be generalizable to all settings. Second, most attending physicians in this sample were experienced teachers, so consequences of increased workload may have been managed effectively without a major impact on resident education in some cases. Third, the magnitude of change in teaching effectiveness, although statistically significant, was small and might call into question the educational significance of these findings. Fourth, although resident satisfaction does not influence teaching scores, it is possible that residents' perception of their own workload may have impacted teaching evaluations. Finally, data collection was intentionally closed at the end of the 2011 academic year because accreditation standards for resident duty hours changed again at that time.[43] Thus, these data may not directly reflect the evolving hospital learning environment but serve as a useful benchmark for future studies of workload and teaching effectiveness in the inpatient setting. Once hospitals have had sufficient time and experience with the new duty hour standards, additional studies exploring relationships between workload, teaching effectiveness, and patient outcomes may be warranted.
Limitations notwithstanding, this study shows that attending physician workload may adversely impact teaching and patient safety on internal medicine hospital services. Ongoing efforts by residency programs to optimize the learning environment should include strategies to manage the workload of supervising attendings.
Disclosures
This publication was made possible in part by Clinical and Translational Science Award grant number UL1 TR000135 from the National Center for Advancing Translational Sciences, a component of the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NIH. Authors also acknowledge support for the Mayo Clinic Department of Medicine Write‐up and Publish grant. In addition, this study was supported in part by the Mayo Clinic Internal Medicine Residency Office of Education Innovations as part of the Accreditation Council for Graduate Medical Education Educational Innovations Project. The information contained in this article was based in part on the performance package data maintained by the University HealthSystem Consortium. Copyright 2015 UHC. All rights reserved.
1. The future of residents' education in internal medicine. Am J Med. 2004;116(9):648–650.
2. Redesigning residency education in internal medicine: a position paper from the Association of Program Directors in Internal Medicine. Ann Intern Med. 2006;144(12):920–926.
3. Residency training in the modern era: the pipe dream of less time to learn more, care better, and be more professional. Arch Intern Med. 2005;165(22):2561–2562.
4. Trends in Hospitalizations Among Medicare Survivors of Aortic Valve Replacement in the United States From 1999 to 2010. Ann Thorac Surg. 2015;99(2):509–517.
5. Restructuring an inpatient resident service to improve outcomes for residents, students, and patients. Acad Med. 2011;86(12):1500–1507.
6. Clinical documentation in the 21st century: executive summary of a policy position paper from the American College of Physicians. Ann Intern Med. 2015;162(4):301–303.
7. Effect of ACGME duty hours on attending physician teaching and satisfaction. Arch Intern Med. 2008;168(11):1226–1228.
8. Identifying potential predictors of a safe attending physician workload: a survey of hospitalists. J Hosp Med. 2013;8(11):644–646.
9. The clinical learning environment: the foundation of graduate medical education. JAMA. 2013;309(16):1687–1688.
10. Better rested, but more stressed? Evidence of the effects of resident work hour restrictions. Acad Pediatr. 2012;12(4):335–343.
11. Multifaceted longitudinal study of surgical resident education, quality of life, and patient care before and after July 2011. J Surg Educ. 2013;70(6):769–776.
12. Impact of the new 16‐hour duty period on pediatric interns' neonatal education. Clin Pediatr (Phila). 2014;53(1):51–59.
13. Relationship between resident workload and self‐perceived learning on inpatient medicine wards: a longitudinal study. BMC Med Educ. 2006;6:35.
14. Perceptions of educational experience and inpatient workload among pediatric residents. Hosp Pediatr. 2013;3(3):276–284.
15. Association of workload of on‐call medical interns with on‐call sleep duration, shift duration, and participation in educational activities. JAMA. 2008;300(10):1146–1153.
16. Effects of increased overnight supervision on resident education, decision‐making, and autonomy. J Hosp Med. 2012;7(8):606–610.
17. Approval and perceived impact of duty hour regulations: survey of pediatric program directors. Pediatrics. 2013;132(5):819–824.
18. Anticipated consequences of the 2011 duty hours standards: views of internal medicine and surgery program directors. Acad Med. 2012;87(7):895–903.
19. Training on the clock: family medicine residency directors' responses to resident duty hours reform. Acad Med. 2006;81(12):1032–1037.
20. Duty hour recommendations and implications for meeting the ACGME core competencies: views of residency directors. Mayo Clin Proc. 2011;86(3):185–191.
21. Does surgeon workload per day affect outcomes after pulmonary lobectomies? Ann Thorac Surg. 2012;94(3):966–973.
22. Impact of attending physician workload on patient care: a survey of hospitalists. JAMA Intern Med. 2013;173(5):375–377.
23. No time for teaching? Inpatient attending physicians' workload and teaching before and after the implementation of the 2003 duty hours regulations. Acad Med. 2013;88(9):1293–1298.
24. Accreditation Council for Graduate Medical Education. Clinical Learning Environment Review (CLER) Program. Available at: http://www.acgme.org/acgmeweb/tabid/436/ProgramandInstitutionalAccreditation/NextAccreditationSystem/ClinicalLearningEnvironmentReviewProgram.aspx. Accessed April 27, 2015.
25. Accreditation Council for Graduate Medical Education. Frequently Asked Questions: ACGME common duty hour requirements. Available at: https://www.acgme.org/acgmeweb/Portals/0/PDFs/dh‐faqs 2011.pdf. Accessed April 27, 2015.
26. Effect of hospitalist workload on the quality and efficiency of care. JAMA Intern Med. 2014;174(5):786–793.
27. University HealthSystem Consortium. UHC clinical database/resource manager for Mayo Clinic. Available at: http://www.uhc.edu. Data accessed August 25, 2011.
28. The interpersonal, cognitive and efficiency domains of clinical teaching: construct validity of a multi‐dimensional scale. Med Educ. 2005;39(12):1221–1229.
29. Factor instability of clinical teaching assessment scores among general internists and cardiologists. Med Educ. 2006;40(12):1209–1216.
30. Determining reliability of clinical assessment scores in real time. Teach Learn Med. 2009;21(3):188–194.
31. Behaviors of highly professional resident physicians. JAMA. 2008;300(11):1326–1333.
32. Service census caps and unit‐based admissions: resident workload, conference attendance, duty hour compliance, and patient safety. Mayo Clin Proc. 2012;87(4):320–327.
33. Agency for Healthcare Research and Quality. Patient safety indicators technical specifications updates, Version 5.0, March 2015. Available at: http://www.qualityindicators.ahrq.gov/Modules/PSI_TechSpec.aspx. Accessed May 29, 2015.
34. The impact of nonphysician clinicians: do they improve the quality and cost‐effectiveness of health care services? Med Care Res Rev. 2009;66(6 suppl):36S–89S.
35. Maximizing teaching on the wards: review and application of the One‐Minute Preceptor and SNAPPS models. J Hosp Med. 2015;10(2):125–130.
36. Resident perceptions of the educational value of night float rotations. Teach Learn Med. 2010;22(3):196–201.
37. An evaluation of internal medicine residency continuity clinic redesign to a 50/50 outpatient‐inpatient model. J Gen Intern Med. 2013;28(8):1014–1019.
38. Revisiting the rotating call schedule in less than 80 hours per week. J Surg Educ. 2009;66(6):357–360.
39. Excess length of stay, charges, and mortality attributable to medical injuries during hospitalization. JAMA. 2003;290(14):1868–1874.
40. Agency for Healthcare Research and Quality patient safety indicators and mortality in surgical patients. Am Surg. 2014;80(8):801–804.
41. Patient safety in the era of the 80‐hour workweek. J Surg Educ. 2014;71(4):551–559.
42. Impact of resident well‐being and empathy on assessments of faculty physicians. J Gen Intern Med. 2010;25(1):52–56.
43. Stress management training for surgeons: a randomized, controlled, intervention study. Ann Surg. 2011;253(3):488–494.
- Accreditation Council for Graduate Medical Education. Frequently Asked Questions: A ACGME common duty hour requirements. Available at: https://www.acgme.org/acgmeweb/Portals/0/PDFs/dh‐faqs 2011.pdf. Accessed April 27, 2015.
- Effect of hospitalist workload on the quality and efficiency of care. JAMA Intern Med. 2014;174(5):786–793. , , , , .
- University HealthSystem Consortium. UHC clinical database/resource manager for Mayo Clinic. Available at: http://www.uhc.edu. Data accessed August 25, 2011.
- The interpersonal, cognitive and efficiency domains of clinical teaching: construct validity of a multi‐dimensional scale. Med Educ. 2005;39(12):1221–1229. , .
- Factor instability of clinical teaching assessment scores among general internists and cardiologists. Med Educ. 2006;40(12):1209–1216. , , .
- Determining reliability of clinical assessment scores in real time. Teach Learn Med. 2009;21(3):188–194. , , , .
- Behaviors of highly professional resident physicians. JAMA. 2008;300(11):1326–1333. , , , , , .
- Service census caps and unit‐based admissions: resident workload, conference attendance, duty hour compliance, and patient safety. Mayo Clin Proc. 2012;87(4):320–327. , , , et al.
- Agency for Healthcare Research and Quality. Patient safety indicators technical specifications updates—Version 5.0, March 2015. Available at: http://www.qualityindicators.ahrq.gov/Modules/PSI_TechSpec.aspx. Accessed May 29, 2015.
- The impact of nonphysician clinicians: do they improve the quality and cost‐effectiveness of health care services? Med Care Res Rev. 2009;66(6 suppl):36S–89S. , , , , , .
- Maximizing teaching on the wards: review and application of the One‐Minute Preceptor and SNAPPS models. J Hosp Med. 2015;10(2):125–130. , , .
- Resident perceptions of the educational value of night float rotations. Teach Learn Med. 2010;22(3):196–201. , , , .
- An evaluation of internal medicine residency continuity clinic redesign to a 50/50 outpatient‐inpatient model. J Gen Intern Med. 2013;28(8):1014–1019. , , , , , .
- Revisiting the rotating call schedule in less than 80 hours per week. J Surg Educ. 2009;66(6):357–360. , , , et al.
- Excess length of stay, charges, and mortality attributable to medical injuries during hospitalization. JAMA. 2003;290(14):1868–1874. , .
- Agency for Healthcare Research and Quality patient safety indicators and mortality in surgical patients. Am Surg. 2014;80(8):801–804. , , , .
- Patient safety in the era of the 80‐hour workweek. J Surg Educ. 2014;71(4):551–559. , , , et al.
- Impact of resident well‐being and empathy on assessments of faculty physicians. J Gen Intern Med. 2010;25(1):52–56. , , , .
- Stress management training for surgeons‐a randomized, controlled, intervention study. Ann Surg. 2011;253(3):488–494. , , , et al.
© 2016 Society of Hospital Medicine
Association Between DCBN and LOS
Slow hospital throughput (the process whereby a patient is admitted, placed in a room, and eventually discharged) can worsen outcomes if admitted patients are boarded in emergency rooms or postanesthesia units.[1] One potential method to improve throughput is to discharge patients earlier in the day,[2] freeing up available beds and conceivably reducing hospital length of stay (LOS).
To quantify throughput, hospitals are beginning to measure the proportion of patients discharged before noon (DCBN). One study, looking at discharges on a single medical floor in an urban academic medical center, suggested that increasing the percentage of patients discharged by noon decreased observed‐to‐expected LOS in hospitalized medicine patients,[3] and a follow‐up study demonstrated that it was associated with admissions from the emergency department occurring earlier in the day.[4] However, these studies did not adjust for changes in case mix index (CMI) and other patient‐level characteristics that may also have affected these outcomes. Concerns persist that more efforts to discharge patients by noon could inadvertently increase LOS if staff chose to keep patients overnight for an early discharge the following day.
We undertook a retrospective analysis of data from patients discharged from a large academic medical center where an institution‐wide emphasis was placed on discharging more patients by noon. Using these data, we examined the association between discharges before noon and LOS in medical and surgical inpatients.
METHODS
Site and Subjects
Our study was based at the University of California, San Francisco (UCSF) Medical Center, a 400‐bed academic hospital located in San Francisco, California. We examined adult medical and surgical discharges from July 2012 through April 2015. Patients who stayed less than 24 hours or more than 20 days were excluded. Discharges from the hospital medicine service and the following surgical services were included in the analysis: cardiac surgery, colorectal surgery, cardiothoracic surgery, general surgery, gynecologic oncology, gynecology, neurosurgery, orthopedics, otolaryngology, head and neck surgery, plastic surgery, thoracic surgery, urology, and vascular surgery. No exclusions were made based on patient status (eg, observation vs inpatient). UCSF's institutional review board approved our study.
During the study period, discharge before noon became an institutional priority. To this end, rates of DCBN were tracked using retrospective data, and various units undertook efforts such as informal afternoon meetings to prompt planning for the next morning's discharges. These efforts did not differentially affect medical or surgical units or emergent or nonemergent admissions, and no financial incentives or other changes in workflow were in place to increase DCBN rates.
Data Sources
We used the cost accounting system at UCSF (Enterprise Performance System Inc. [EPSI], Chicago, IL) to collect demographic information about each patient, including age, sex, primary race, and primary ethnicity. This system was also used to collect characteristics of each hospitalization including LOS (calculated from admission date time and discharge date time), hospital service at discharge, the discharge attending, discharge disposition of the patient, and the CMI, a marker of the severity of illness of the patient during that hospitalization. EPSI was also used to collect data on the admission type of all patients, either emergent, urgent, or routine, and the insurance status of the patient during that hospitalization.
Data on time of discharge were entered by the discharging nurse or unit assistant to reflect the time the patient left the hospital. Using these data, we defined a before‐noon discharge as one taking place between 8:00 am and 12:00 pm.
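The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' EPSI extraction: the function names and the example timestamps are hypothetical, and the handling of the window boundaries (8:00 am inclusive, 12:00 pm exclusive) is an assumption, because the text does not say how a discharge at exactly noon was classified.

```python
from datetime import datetime, time

def length_of_stay_days(admit: datetime, discharge: datetime) -> float:
    """LOS in days, computed from admission and discharge date-times."""
    return (discharge - admit).total_seconds() / 86400.0

def is_eligible(los_days: float) -> bool:
    """Study exclusions: stays shorter than 24 hours or longer than 20 days."""
    return 1.0 <= los_days <= 20.0

def discharged_before_noon(discharge: datetime) -> bool:
    """Before-noon discharge: from 8:00 am up to (assumed: excluding) 12:00 pm."""
    return time(8, 0) <= discharge.time() < time(12, 0)

# Hypothetical hospitalization used only to exercise the helpers.
admit = datetime(2014, 3, 3, 14, 30)
discharge = datetime(2014, 3, 7, 10, 15)
los = length_of_stay_days(admit, discharge)
```

For this made-up stay, the LOS is just under 4 days, the stay is eligible, and the discharge counts as before noon.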
Statistical Analysis
Wilcoxon rank sum tests and χ² statistics were used to compare baseline characteristics of hospitalizations of patients discharged before and after noon.
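As an illustration of the categorical comparison, the sketch below applies a Pearson chi-square test to the weekend-discharge counts reported in Table 1. It assumes no continuity correction, which the text does not specify, and the function name is hypothetical. It uses only the standard library: with 1 degree of freedom, the chi-square tail probability equals a two-sided normal tail.

```python
import math

def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi-square > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: discharged on weekend yes / no; columns: before noon / after noon.
stat, p_value = chi_square_2x2(1543, 7411, 4941, 24470)
print(f"chi-square = {stat:.2f}, P = {p_value:.2f}")
```

Under these assumptions the P value rounds to 0.34, consistent with the value reported for weekend discharge in Table 1.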
We used generalized linear models with a gamma distribution to assess the association between discharge before noon and LOS. We accounted for clustering by discharge attending using generalized estimating equations with an exchangeable working correlation and robust standard errors. After the initial unadjusted analyses, covariates were included in the adjusted analysis if they were associated with LOS at P < 0.05 or for reasons of face validity. These variables are shown in Table 1. Because an effort to increase discharges before noon was started in the 2014 academic year, we added an interaction term between the date of discharge and whether a discharge occurred before noon. To construct the interaction term, we divided the study period into sequential 6‐month intervals and defined a categorical variable indicating the interval in which each discharge occurred.
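The 6‐month interval variable can be derived from the discharge date as sketched below. This is only an illustration of the coding step, not the authors' Stata code: the names are hypothetical, and the zero-based index is an arbitrary choice. The model itself (gamma regression with generalized estimating equations) is not reproduced here.

```python
from datetime import date

STUDY_START = date(2012, 7, 1)  # first month of the study period

def six_month_interval(discharge_date: date) -> int:
    """Index of the sequential 6-month interval (0 = July-December 2012)."""
    months_elapsed = (discharge_date.year - STUDY_START.year) * 12 + (
        discharge_date.month - STUDY_START.month
    )
    return months_elapsed // 6

def interaction_level(discharge_date: date, before_noon: bool) -> tuple:
    """One cell of the interval-by-DCBN interaction, as an (interval, flag) pair."""
    return (six_month_interval(discharge_date), int(before_noon))
```

With this coding, a before-noon discharge in February 2014 falls in cell `(3, 1)`, and the final partial interval (January through April 2015) has index 5.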
Characteristic | Discharged Before Noon | Discharged After Noon | P Value |
---|---|---|---|
Median LOS (IQR) | 3.4 (2.2–5.9) | 3.7 (2.3–6.3) | <0.0005 |
Median CMI (IQR) | 1.8 (1.1–2.4) | 1.7 (1.1–2.5) | 0.006 |
Service type, N (%) | | | |
Hospital medicine | 1,919 (29.6) | 11,290 (35.4) | |
Surgical services | 4,565 (70.4) | 20,591 (64.6) | <0.0005 |
Discharged before noon, N (%) | 6,484 (16.9) | 31,881 (83.1) | |
Discharged on weekend, N (%) | | | |
Yes | 1,543 (23.8) | 7,411 (23.3) | |
No | 4,941 (76.2) | 24,470 (76.8) | 0.34 |
Discharge disposition, N (%) | | | |
Home with home health | 748 (11.5) | 5,774 (18.1) | |
Home without home health | 3,997 (61.6) | 17,862 (56.0) | |
SNF | 837 (12.9) | 3,082 (9.7) | |
Other | 902 (13.9) | 5,163 (16.2) | <0.0005 |
6‐month interval, N (%) | | | |
July–December 2012 | 993 (15.3) | 5,596 (17.6) | |
January–June 2013 | 980 (15.1) | 5,721 (17.9) | |
July–December 2013 | 1,088 (16.8) | 5,690 (17.9) | |
January–June 2014 | 1,288 (19.9) | 5,441 (17.1) | |
July–December 2014 | 1,275 (19.7) | 5,656 (17.7) | |
January–April 2015 | 860 (13.3) | 3,777 (11.9) | <0.0005 |
Age category, N (%) | | | |
18–64 years | 4,177 (64.4) | 20,044 (62.9) | |
65+ years | 2,307 (35.6) | 11,837 (37.1) | 0.02 |
Male, N (%) | 3,274 (50.5) | 15,596 (48.9) | |
Female, N (%) | 3,210 (49.5) | 16,284 (51.1) | 0.06 |
Race, N (%) | | | |
White or Caucasian | 4,133 (63.7) | 18,798 (59.0) | |
African American | 518 (8.0) | 3,020 (9.5) | |
Asian | 703 (10.8) | 4,052 (12.7) | |
Other | 1,130 (17.4) | 6,011 (18.9) | <0.0005 |
Ethnicity, N (%) | | | |
Hispanic or Latino | 691 (10.7) | 3,713 (11.7) | |
Not Hispanic or Latino | 5,597 (86.3) | 27,209 (85.4) | |
Unknown/declined | 196 (3.0) | 959 (3.0) | 0.07 |
Admission type, N (%) | | | |
Elective | 3,494 (53.9) | 13,881 (43.5) | |
Emergency | 2,047 (31.6) | 12,145 (38.1) | |
Urgent | 889 (13.7) | 5,459 (17.1) | |
Other | 54 (0.8) | 396 (1.2) | <0.0005 |
Payor class, N (%) | | | |
Medicare | 2,648 (40.8) | 13,808 (43.3) | |
Medi‐Cal | 1,060 (16.4) | 5,913 (18.6) | |
Commercial | 2,633 (40.6) | 11,242 (35.3) | |
Other | 143 (2.2) | 918 (2.9) | <0.0005 |
We conducted a sensitivity analysis using propensity scores. The propensity score was based on demographic and clinical variables (as listed in Table 1) that exhibited P < 0.2 in bivariate analysis between the variable and being discharged before noon. We then used the propensity score as a covariate in a generalized linear model of the LOS with a gamma distribution and with generalized estimating equations as described above.
Finally, we performed prespecified secondary subset analyses of patients admitted emergently and nonemergently.
Statistical modeling and analysis were completed using Stata version 13 (StataCorp, College Station, TX).
RESULTS
Patient Demographics and Discharge Before Noon
Our study population comprised 27,983 patients for a total of 38,365 hospitalizations with a median LOS of 3.7 days. We observed 6,484 discharges before noon (16.9%) and 31,881 discharges after noon (83.1%). The characteristics of the hospitalizations are shown in Table 1.
Patients who were discharged before noon tended to be younger, white, and discharged with a disposition to home without home health. The median CMI was slightly higher in discharges before noon (1.81, P = 0.006), and elective admissions were more likely than emergent to be discharged before noon (53.9% vs 31.6%, P < 0.0005).
Multivariable Analysis
A discharge before noon was associated with a 4.3% increase in LOS (adjusted odds ratio [OR]: 1.043, 95% confidence interval [CI]: 1.003‐1.086), adjusting for CMI, the service type, discharge on the weekend, discharge disposition, age, sex, ethnicity, race, urgency of admission, payor class, and a full interaction with the date of discharge (in 6‐month intervals). In preplanned subset analyses, the association between longer LOS and DCBN was more pronounced in patients admitted emergently (adjusted OR: 1.14, 95% CI: 1.033‐1.249) and less pronounced for patients not admitted emergently (adjusted OR: 1.03, 95% CI: 0.988‐1.074), although the latter did not meet statistical significance. In patients admitted emergently, this corresponds to approximately a 12‐hour increase in LOS. The interaction term of discharge date and DCBN was significant in the model. In further subset analyses, the association between longer LOS and DCBN was more pronounced in medicine patients (adjusted OR: 1.116, 95% CI: 1.014‐1.228) than in surgical patients (adjusted OR: 1.030, 95% CI: 0.989‐1.074), although the relationship in surgical patients did not meet statistical significance.
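The 12‐hour figure for emergent admissions can be approximated with simple arithmetic. The sketch below assumes that the baseline is the overall median LOS of 3.7 days and that the adjusted estimate of 1.14 acts multiplicatively on LOS; the text does not state which baseline the authors used, so this is an approximation rather than their calculation. The names are hypothetical.

```python
MEDIAN_LOS_DAYS = 3.7   # overall median LOS reported in the Results (assumed baseline)
RATIO_EMERGENT = 1.14   # adjusted estimate for patients admitted emergently

def extra_hours(baseline_days: float, ratio: float) -> float:
    """Additional LOS in hours implied by a multiplicative effect on a baseline."""
    return baseline_days * (ratio - 1.0) * 24.0

hours = extra_hours(MEDIAN_LOS_DAYS, RATIO_EMERGENT)
print(f"{hours:.1f} additional hours")
```

Under these assumptions, a 14% increase on a 3.7‐day stay comes to a little over 12 hours, in line with the approximate figure given in the text.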
We also undertook sensitivity analyses using propensity scores as a covariate in our base multivariable models. Results from these analyses did not differ from the base models and are not presented here. Results also did not differ when comparing discharges before and after the initiation of an attending‐only service.
DISCUSSION AND CONCLUSION
In our retrospective study of patients discharged from an academic medical center, discharge before noon was associated with a longer LOS, with the effect more pronounced in patients admitted emergently. Our results suggest that efforts to discharge patients earlier in the day may have varying degrees of success depending on patient characteristics. Conceivably, elective admissions recover according to predictable plans, allowing for discharges earlier in the day. In contrast, patients discharged from emergent hospitalizations may have ongoing evolution of their care plan, making plans for discharging before noon more challenging.
Our results differ from a previous study,[3] which suggested that increasing the proportion of before‐noon discharges was associated with a fall in observed‐to‐expected LOS. However, observational studies of DCBN are challenging, because the association between early discharge and LOS is potentially bidirectional. One interpretation, for example, is that patients were kept longer in order to be discharged by noon the following day, which for the subgroup of patients admitted emergently corresponded to a roughly 12‐hour increase in LOS. However, it is also plausible that patients who stayed longer had more time to plan for an early discharge. In either scenario, the use of LOS as a key metric of throughput efforts may be flawed, and alternatives (eg, number of patients waiting for beds off unit) may be more reasonable measures of throughput.
Our results have several limitations. As in any observational study, our results are vulnerable to biases from unmeasured covariates that confound the analysis. We caution that a causal relationship between a discharge before noon and LOS cannot be determined from the nature of the study. Our results are also limited in that we were unable to adjust for day‐to‐day hospital capacity and other variables that affect LOS, including caregiver and transportation availability, bed capacity at receiving care facilities, and patient consent to discharge. Finally, as a single‐site study, our findings may not be applicable to nonacademic settings.
In conclusion, our observational study discerned an association between discharging patients before noon and longer LOS. We believe our findings suggest a rationale for alternate approaches to measuring an early discharge program's effectiveness: the evaluation of an early discharge initiative should consider multiple metrics, including the effect on emergency department wait times, intensive care unit or postanesthesia transitions, and patient‐reported experiences of care transitions.
Disclosures
Andrew Auerbach, MD, is supported by a K24 grant from the National Heart, Lung, and Blood Institute: K24HL098372. The authors report no conflicts of interest.
- The effect of emergency department crowding on clinically oriented outcomes. Acad Emerg Med. 2009;16(1):1–10.
- Centers for Medicare 2013.
- Discharge before noon: an achievable hospital goal. J Hosp Med. 2014;9(4):210–214.
- Discharge before noon: effect on throughput and sustainability. J Hosp Med. 2015;10(10):664–669.
Slow hospital throughputthe process whereby a patient is admitted, placed in a room, and eventually dischargedcan worsen outcomes if admitted patients are boarded in emergency rooms or postanesthesia units.[1] One potential method to improve throughput is to discharge patients earlier in the day,[2] freeing up available beds and conceivably reducing hospital length of stay (LOS).
To quantify throughput, hospitals are beginning to measure the proportion of patients discharged before noon (DCBN). One study, looking at discharges on a single medical floor in an urban academic medical center, suggested that increasing the percentage of patients discharged by noon decreased observed‐to‐expected LOS in hospitalized medicine patients,[3] and a follow‐up study demonstrated that it was associated with admissions from the emergency department occurring earlier in the day.[4] However, these studies did not adjust for changes in case mix index (CMI) and other patient‐level characteristics that may also have affected these outcomes. Concerns persist that more efforts to discharge patients by noon could inadvertently increase LOS if staff chose to keep patients overnight for an early discharge the following day.
We undertook a retrospective analysis of data from patients discharged from a large academic medical center where an institution‐wide emphasis was placed on discharging more patients by noon. Using these data, we examined the association between discharges before noon and LOS in medical and surgical inpatients.
METHODS
Site and Subjects
Our study was based at the University of California, San Francisco (UCSF) Medical Center, a 400‐bed academic hospital located in San Francisco, California. We examined adult medical and surgical discharges from July 2012 through April 2015. Patients who stayed less than 24 hours or more than 20 days were excluded. Discharges from the hospital medicine service and the following surgical services were included in the analysis: cardiac surgery, colorectal surgery, cardiothoracic surgery, general surgery, gynecologic oncology, gynecology, neurosurgery, orthopedics, otolaryngology, head and neck surgery, plastic surgery, thoracic surgery, urology, and vascular surgery. No exclusions were made based on patient status (eg, observation vs inpatient). UCSF's institutional review board approved our study.
During the time of our study, discharges before noon time became an institutional priority. To this end, rates of DCBN were tracked using retrospective data, and various units undertook efforts such as informal afternoon meetings to prompt planning for the next morning's discharges. These efforts did not differentially affect medical or surgical units or emergent or nonemergent admissions, and no financial incentives or other changes in workflow were in place to increase DCBN rates.
Data Sources
We used the cost accounting system at UCSF (Enterprise Performance System Inc. [EPSI], Chicago, IL) to collect demographic information about each patient, including age, sex, primary race, and primary ethnicity. This system was also used to collect characteristics of each hospitalization including LOS (calculated from admission date time and discharge date time), hospital service at discharge, the discharge attending, discharge disposition of the patient, and the CMI, a marker of the severity of illness of the patient during that hospitalization. EPSI was also used to collect data on the admission type of all patients, either emergent, urgent, or routine, and the insurance status of the patient during that hospitalization.
Data on time of discharge were entered by the discharging nurse or unit assistant to reflect the time the patient left the hospital. Using these data, we defined a before‐noon discharge as one taking place between 8:00 am and 12:00 pm.
Statistical Analysis
Wilcoxon rank sum test and 2 statistics were used to compare baseline characteristics of hospitalizations of patients discharged before and after noon.
We used generalized linear models to assess the association of a discharge before noon on the LOS with gamma models. We accounted for clustering of discharge attendings using generalized estimating equations with exchangeable working correlation and robust standard errors. After the initial unadjusted analyses, covariates were included in the adjusted analysis if they were associated with an LOS at P < 0.05 or reasons of face validity. These variables are shown in Table 1. Because an effort to increase the discharges before noon was started in the 2014 academic year, we added an interaction term between the date of discharge and whether a discharge occurred before noon. The interaction term was included by dividing the study period into time periods corresponding to sequential 6‐month intervals. A new variable was defined by a categorical variable that indicated in which of these time periods a discharge occurred.
Discharged Before Noon | Discharged After Noon | P Value | |
---|---|---|---|
| |||
Median LOS (IQR) | 3.4 (2.25.9) | 3.7 (2.36.3) | <0.0005 |
Median CMI (IQR) | 1.8 (1.12.4) | 1.7 (1.12.5) | 0.006 |
Service type, N (%) | |||
Hospital medicine | 1,919 (29.6) | 11,290 (35.4) | |
Surgical services | 4,565 (70.4) | 20,591 (64.6) | <0.0005 |
Discharged before noon, N (%) | 6,484 (16.9) | 31,881 (83.1) | |
Discharged on weekend, N (%) | |||
Yes | 1,543 (23.8) | 7,411 (23.3) | |
No | 4,941 (76.2) | 24,470 (76.8) | 0.34 |
Discharge disposition, N (%) | |||
Home with home health | 748 (11.5) | 5,774 (18.1) | |
Home without home health | 3,997 (61.6) | 17,862 (56.0) | |
SNF | 837 (12.9) | 3,082 (9.7) | |
Other | 902 (13.9) | 5,163 (16.2) | <0.0005 |
6‐month interval, N (%) | |||
JulyDecember 2012 | 993 (15.3) | 5,596 (17.6) | |
JanuaryJune 2013 | 980 (15.1) | 5,721 (17.9) | |
JulyDecember 2013 | 1,088 (16.8) | 5,690 (17.9) | |
JanuaryJune 2014 | 1,288 (19.9) | 5,441 (17.1) | |
JulyDecember 2014 | 1,275 (19.7) | 5,656 (17.7) | |
JanuaryApril 2015 | 860 (13.3) | 3,777 (11.9) | <0.0005 |
Age category, N (%) | |||
1864 years | 4,177 (64.4) | 20,044 (62.9) | |
65+ years | 2,307 (35.6) | 11,837 (37.1) | 0.02 |
Male, N (%) | 3,274 (50.5) | 15,596 (48.9) | |
Female, N (%) | 3,210 (49.5) | 16,284 (51.1) | 0.06 |
Race, N (%) | |||
White or Caucasian | 4,133 (63.7) | 18,798 (59.0) | |
African American | 518 (8.0) | 3,020 (9.5) | |
Asian | 703 (10.8) | 4,052 (12.7) | |
Other | 1,130 (17.4) | 6,011 (18.9) | <0.0005 |
Ethnicity, N (%) | |||
Hispanic or Latino | 691 (10.7) | 3,713 (11.7) | |
Not Hispanic or Latino | 5,597 (86.3) | 27,209 (85.4) | |
Unknown/declined | 196 (3.0) | 959 (3.0) | 0.07 |
Admission type, N (%) | |||
Elective | 3,494 (53.9) | 13,881 (43.5) | |
Emergency | 2,047 (31.6) | 12,145 (38.1) | |
Urgent | 889 (13.7) | 5,459 (17.1) | |
Other | 54 (0.8) | 396 (1.2) | <0.0005 |
Payor class, N (%) | |||
Medicare | 2,648 (40.8) | 13,808 (43.3) | |
Medi‐Cal | 1,060 (16.4) | 5,913 (18.6) | |
Commercial | 2,633 (40.6) | 11,242 (35.3) | |
Other | 143 (2.2) | 918 (2.9) | <0.0005 |
We conducted a sensitivity analysis using propensity scores. The propensity score was based on demographic and clinical variables (as listed in Table 1) that exhibited P < 0.2 in bivariate analysis between the variable and being discharged before noon. We then used the propensity score as a covariate in a generalized linear model of the LOS with a gamma distribution and with generalized estimating equations as described above.
Finally, we performed prespecified secondary subset analyses of patients admitted emergently and nonemergently.
Statistical modeling and analysis was completed using Stata version 13 (StataCorp, College Station, TX).
RESULTS
Patient Demographics and Discharge Before Noon
Our study population comprised 27,983 patients for a total of 38,365 hospitalizations with a median LOS of 3.7 days. We observed 6484 discharges before noon (16.9%) and 31,881 discharges after noon (83.1%). The characteristics of the hospitalizations are shown in Table 1.
Patients who were discharged before noon tended to be younger, white, and discharged with a disposition to home without home health. The median CMI was slightly higher in discharges before noon (1.81, P = 0.006), and elective admissions were more likely than emergent to be discharged before noon (53.9% vs 31.6%, P < 0.0005).
Multivariable Analysis
A discharge before noon was associated with a 4.3% increase in LOS (adjusted odds ratio [OR]: 1.043, 95% confidence interval [CI]: 1.003‐1.086), adjusting for CMI, the service type, discharge on the weekend, discharge disposition, age, sex, ethnicity, race, urgency of admission, payor class, and a full interaction with the date of discharge (in 6‐month intervals). In preplanned subset analyses, the association between longer LOS and DCBN was more pronounced in patients admitted emergently (adjusted OR: 1.14, 95% CI: 1.033‐1.249) and less pronounced for patients not admitted emergently (adjusted OR: 1.03, 95% CI: 0.988‐1.074), although the latter did not meet statistical significance. In patients admitted emergently, this corresponds to approximately a 12‐hour increase in LOS. The interaction term of discharge date and DCBN was significant in the model. In further subset analyses, the association between longer LOS and DCBN was more pronounced in medicine patients (adjusted OR: 1.116, 95% CI: 1.014‐1.228) than in surgical patients (adjusted OR: 1.030, 95% CI: 0.989‐1.074), although the relationship in surgical patients did not meet statistical significance.
We also undertook sensitivity analyses utilizing propensity scores as a covariate in our base multivariable models. Results from these analyses did not differ from the base models and are not presented here. Results also did not differ when comparing discharges before and after the initiation of an attending only service.
DISCUSSION AND CONCLUSION
In our retrospective study of patients discharged from an academic medical center, discharge before noon was associated with a longer LOS, with the effect more pronounced in patients admitted emergently in the hospital. Our results suggest that efforts to discharge patients earlier in the day may have varying degrees of success depending on patient characteristics. Conceivably, elective admissions recover according to predictable plans, allowing for discharges earlier in the day. In contrast, patients discharged from emergent hospitalizations may have ongoing evolution of their care plan, making plans for discharging before noon more challenging.
Our results differ from a previous study,[3] which suggested that increasing the proportion of before‐noon discharges was associated with a fall in observed‐to‐expected LOS. However, observational studies of DCBN are challenging, because the association between early discharge and LOS is potentially bidirectional. One interpretation, for example, is that patients were kept longer in order to be discharged by noon the following day, which for the subgroups of patients admitted emergently corresponded to a roughly 12‐hour increase in LOS. However, it is also plausible that patients who stayed longer also had more time to plan for an early discharge. In either scenario, the ability of managers to utilize LOS as a key metric of throughput efforts may be flawed, and suggests that alternatives (eg, number of patients waiting for beds off unit) may be a more reasonable measure of throughput. Our results have several limitations. As in any observational study, our results are vulnerable to biases from unmeasured covariates that confound the analysis. We caution that a causal relationship between a discharge before noon and LOS cannot be determined from the nature of the study. Our results are also limited in that we were unable to adjust for day‐to‐day hospital capacity and other variables that affect LOS including caregiver and transportation availability, bed capacity at receiving care facilities, and patient consent to discharge. Finally, as a single‐site study, our findings may not be applicable to nonacademic settings.
In conclusion, our observational study discerned an association between discharging patients before noon and longer LOS. We believe our findings suggest a rationale for alternate approaches to measuring an early discharge program's effectiveness, namely, that the evaluation of the success of an early discharge initiative should consider multiple evaluation metrics including the effect on emergency department wait times, intensive care unit or postanesthesia transitions, and on patient reported experiences of care transitions.
Disclosures
Andrew Auerbach, MD, is supported by a K24 grant from the National Heart, Lung, and Blood Institute: K24HL098372. The authors report no conflicts of interest.
Slow hospital throughputthe process whereby a patient is admitted, placed in a room, and eventually dischargedcan worsen outcomes if admitted patients are boarded in emergency rooms or postanesthesia units.[1] One potential method to improve throughput is to discharge patients earlier in the day,[2] freeing up available beds and conceivably reducing hospital length of stay (LOS).
To quantify throughput, hospitals are beginning to measure the proportion of patients discharged before noon (DCBN). One study, looking at discharges on a single medical floor in an urban academic medical center, suggested that increasing the percentage of patients discharged by noon decreased observed‐to‐expected LOS in hospitalized medicine patients,[3] and a follow‐up study demonstrated that it was associated with admissions from the emergency department occurring earlier in the day.[4] However, these studies did not adjust for changes in case mix index (CMI) and other patient‐level characteristics that may also have affected these outcomes. Concerns persist that more efforts to discharge patients by noon could inadvertently increase LOS if staff chose to keep patients overnight for an early discharge the following day.
We undertook a retrospective analysis of data from patients discharged from a large academic medical center where an institution‐wide emphasis was placed on discharging more patients by noon. Using these data, we examined the association between discharges before noon and LOS in medical and surgical inpatients.
METHODS
Site and Subjects
Our study was based at the University of California, San Francisco (UCSF) Medical Center, a 400‐bed academic hospital located in San Francisco, California. We examined adult medical and surgical discharges from July 2012 through April 2015. Patients who stayed less than 24 hours or more than 20 days were excluded. Discharges from the hospital medicine service and the following surgical services were included in the analysis: cardiac surgery, colorectal surgery, cardiothoracic surgery, general surgery, gynecologic oncology, gynecology, neurosurgery, orthopedics, otolaryngology, head and neck surgery, plastic surgery, thoracic surgery, urology, and vascular surgery. No exclusions were made based on patient status (eg, observation vs inpatient). UCSF's institutional review board approved our study.
During the study period, discharge before noon became an institutional priority. To this end, rates of DCBN were tracked using retrospective data, and various units undertook efforts such as informal afternoon meetings to prompt planning for the next morning's discharges. These efforts did not differentially affect medical or surgical units or emergent or nonemergent admissions, and no financial incentives or other changes in workflow were in place to increase DCBN rates.
Data Sources
We used the cost accounting system at UCSF (Enterprise Performance System Inc. [EPSI], Chicago, IL) to collect demographic information about each patient, including age, sex, primary race, and primary ethnicity. This system was also used to collect characteristics of each hospitalization including LOS (calculated from admission date time and discharge date time), hospital service at discharge, the discharge attending, discharge disposition of the patient, and the CMI, a marker of the severity of illness of the patient during that hospitalization. EPSI was also used to collect data on the admission type of all patients, either emergent, urgent, or routine, and the insurance status of the patient during that hospitalization.
Data on time of discharge were entered by the discharging nurse or unit assistant to reflect the time the patient left the hospital. Using these data, we defined a before‐noon discharge as one taking place between 8:00 am and 12:00 pm.
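The definitions above (LOS from admission and discharge date-times, the 24-hour and 20-day exclusion limits, and the before-noon window) can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's code: the function names are hypothetical, and the boundary handling (8:00 am included, 12:00 pm excluded, stays of exactly 24 hours or exactly 20 days retained) is an assumption, because the text does not specify it.

```python
from datetime import datetime, time, timedelta

# Assumed boundary handling: 8:00 am is included, 12:00 pm is excluded.
DCBN_START = time(8, 0)
DCBN_END = time(12, 0)


def length_of_stay_days(admitted: datetime, discharged: datetime) -> float:
    """LOS in days, computed from admission and discharge date-times."""
    return (discharged - admitted) / timedelta(days=1)


def is_eligible(admitted: datetime, discharged: datetime) -> bool:
    """Keep stays of at least 24 hours and at most 20 days (assumed inclusive)."""
    return 1.0 <= length_of_stay_days(admitted, discharged) <= 20.0


def is_before_noon(discharged: datetime) -> bool:
    """True when the recorded discharge time falls in the before-noon window."""
    return DCBN_START <= discharged.time() < DCBN_END


# Hypothetical example stay (not study data).
admit = datetime(2014, 3, 1, 22, 30)
discharge = datetime(2014, 3, 4, 10, 30)
print(length_of_stay_days(admit, discharge))  # 2.5
print(is_eligible(admit, discharge), is_before_noon(discharge))  # True True
```

Note that under this definition a discharge recorded before 8:00 am would be grouped with the after-noon discharges; whether the study handled such cases this way is not stated.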
Statistical Analysis
The Wilcoxon rank sum test and χ² statistics were used to compare baseline characteristics of hospitalizations of patients discharged before and after noon.
We used generalized linear models with a gamma distribution to assess the association between a discharge before noon and LOS. We accounted for clustering by discharge attending using generalized estimating equations with an exchangeable working correlation and robust standard errors. After the initial unadjusted analyses, covariates were included in the adjusted analysis if they were associated with LOS at P < 0.05 or for reasons of face validity. These variables are shown in Table 1. Because an effort to increase the discharges before noon was started in the 2014 academic year, we added an interaction term between the date of discharge and whether a discharge occurred before noon. To construct the interaction term, we divided the study period into sequential 6‐month intervals and defined a categorical variable indicating the interval in which each discharge occurred.
Characteristic | Discharged Before Noon | Discharged After Noon | P Value |
---|---|---|---|
Median LOS, days (IQR) | 3.4 (2.2–5.9) | 3.7 (2.3–6.3) | <0.0005 |
Median CMI (IQR) | 1.8 (1.1–2.4) | 1.7 (1.1–2.5) | 0.006 |
Service type, N (%) | |||
Hospital medicine | 1,919 (29.6) | 11,290 (35.4) | |
Surgical services | 4,565 (70.4) | 20,591 (64.6) | <0.0005 |
Discharged before noon, N (%) | 6,484 (16.9) | 31,881 (83.1) | |
Discharged on weekend, N (%) | |||
Yes | 1,543 (23.8) | 7,411 (23.3) | |
No | 4,941 (76.2) | 24,470 (76.8) | 0.34 |
Discharge disposition, N (%) | |||
Home with home health | 748 (11.5) | 5,774 (18.1) | |
Home without home health | 3,997 (61.6) | 17,862 (56.0) | |
SNF | 837 (12.9) | 3,082 (9.7) | |
Other | 902 (13.9) | 5,163 (16.2) | <0.0005 |
6‐month interval, N (%) | |||
July–December 2012 | 993 (15.3) | 5,596 (17.6) | |
January–June 2013 | 980 (15.1) | 5,721 (17.9) | |
July–December 2013 | 1,088 (16.8) | 5,690 (17.9) | |
January–June 2014 | 1,288 (19.9) | 5,441 (17.1) | |
July–December 2014 | 1,275 (19.7) | 5,656 (17.7) | |
January–April 2015 | 860 (13.3) | 3,777 (11.9) | <0.0005 |
Age category, N (%) | |||
18–64 years | 4,177 (64.4) | 20,044 (62.9) | |
65+ years | 2,307 (35.6) | 11,837 (37.1) | 0.02 |
Male, N (%) | 3,274 (50.5) | 15,596 (48.9) | |
Female, N (%) | 3,210 (49.5) | 16,284 (51.1) | 0.06 |
Race, N (%) | |||
White or Caucasian | 4,133 (63.7) | 18,798 (59.0) | |
African American | 518 (8.0) | 3,020 (9.5) | |
Asian | 703 (10.8) | 4,052 (12.7) | |
Other | 1,130 (17.4) | 6,011 (18.9) | <0.0005 |
Ethnicity, N (%) | |||
Hispanic or Latino | 691 (10.7) | 3,713 (11.7) | |
Not Hispanic or Latino | 5,597 (86.3) | 27,209 (85.4) | |
Unknown/declined | 196 (3.0) | 959 (3.0) | 0.07 |
Admission type, N (%) | |||
Elective | 3,494 (53.9) | 13,881 (43.5) | |
Emergency | 2,047 (31.6) | 12,145 (38.1) | |
Urgent | 889 (13.7) | 5,459 (17.1) | |
Other | 54 (0.8) | 396 (1.2) | <0.0005 |
Payor class, N (%) | |||
Medicare | 2,648 (40.8) | 13,808 (43.3) | |
Medi‐Cal | 1,060 (16.4) | 5,913 (18.6) | |
Commercial | 2,633 (40.6) | 11,242 (35.3) | |
Other | 143 (2.2) | 918 (2.9) | <0.0005 |
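
The 6‐month interval variable used for the interaction term, and tabulated in Table 1, could be derived from the discharge date roughly as follows. This is a minimal Python sketch, not the authors' Stata code: the function name and the exact first and last study days (July 1, 2012 and April 30, 2015) are assumptions based on the stated study period of July 2012 through April 2015.

```python
from datetime import date

# Assumed study boundaries; the text gives months only.
STUDY_START = date(2012, 7, 1)
STUDY_END = date(2015, 4, 30)

# Interval labels as listed in Table 1 (the last interval covers 4 months).
INTERVAL_LABELS = [
    "July–December 2012",
    "January–June 2013",
    "July–December 2013",
    "January–June 2014",
    "July–December 2014",
    "January–April 2015",
]


def six_month_interval(discharged: date) -> int:
    """Return the 0-based index of the 6-month interval containing the date."""
    if not STUDY_START <= discharged <= STUDY_END:
        raise ValueError("discharge date outside the study period")
    months_since_start = (
        (discharged.year - STUDY_START.year) * 12
        + (discharged.month - STUDY_START.month)
    )
    return months_since_start // 6


# Hypothetical discharge date (not study data).
print(INTERVAL_LABELS[six_month_interval(date(2014, 3, 4))])  # January–June 2014
```

In a regression model, this index would be entered as a categorical covariate and interacted with the before-noon indicator.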
We conducted a sensitivity analysis using propensity scores. The propensity score was based on demographic and clinical variables (as listed in Table 1) that exhibited P < 0.2 in bivariate analysis between the variable and being discharged before noon. We then used the propensity score as a covariate in a generalized linear model of the LOS with a gamma distribution and with generalized estimating equations as described above.
Finally, we performed prespecified secondary subset analyses of patients admitted emergently and nonemergently.
Statistical modeling and analysis was completed using Stata version 13 (StataCorp, College Station, TX).
RESULTS
Patient Demographics and Discharge Before Noon
Our study population comprised 27,983 patients for a total of 38,365 hospitalizations with a median LOS of 3.7 days. We observed 6,484 discharges before noon (16.9%) and 31,881 discharges after noon (83.1%). The characteristics of the hospitalizations are shown in Table 1.
Patients who were discharged before noon tended to be younger, white, and discharged with a disposition to home without home health. The median CMI was slightly higher in discharges before noon (1.81, P = 0.006), and elective admissions were more likely than emergent to be discharged before noon (53.9% vs 31.6%, P < 0.0005).
Multivariable Analysis
A discharge before noon was associated with a 4.3% increase in LOS (adjusted odds ratio [OR]: 1.043, 95% confidence interval [CI]: 1.003‐1.086), adjusting for CMI, the service type, discharge on the weekend, discharge disposition, age, sex, ethnicity, race, urgency of admission, payor class, and a full interaction with the date of discharge (in 6‐month intervals). In preplanned subset analyses, the association between longer LOS and DCBN was more pronounced in patients admitted emergently (adjusted OR: 1.14, 95% CI: 1.033‐1.249) and less pronounced for patients not admitted emergently (adjusted OR: 1.03, 95% CI: 0.988‐1.074), although the latter did not meet statistical significance. In patients admitted emergently, this corresponds to approximately a 12‐hour increase in LOS. The interaction term of discharge date and DCBN was significant in the model. In further subset analyses, the association between longer LOS and DCBN was more pronounced in medicine patients (adjusted OR: 1.116, 95% CI: 1.014‐1.228) than in surgical patients (adjusted OR: 1.030, 95% CI: 0.989‐1.074), although the relationship in surgical patients did not meet statistical significance.
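As a rough check on the scale of these estimates, a multiplicative ratio can be converted into added hours of stay. The Python sketch below assumes the overall median LOS of 3.7 days as the baseline. That baseline is an assumption made here for illustration: the text does not state which baseline was used to derive the 12‐hour figure, and the baseline for emergent admissions may differ. The function name is hypothetical.

```python
def added_hours(ratio: float, baseline_los_days: float) -> float:
    """Extra hours of stay implied by a multiplicative LOS ratio."""
    return (ratio - 1.0) * baseline_los_days * 24.0


BASELINE_DAYS = 3.7  # assumed baseline: overall median LOS reported above

# Ratio reported for emergent admissions.
print(round(added_hours(1.14, BASELINE_DAYS), 1))   # 12.4
# Ratio reported for all admissions.
print(round(added_hours(1.043, BASELINE_DAYS), 1))  # 3.8
```

Under this assumption, the ratio of 1.14 corresponds to about 12 hours, in line with the approximate figure quoted for emergent admissions, and the overall ratio of 1.043 corresponds to roughly 4 hours.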
We also undertook sensitivity analyses utilizing propensity scores as a covariate in our base multivariable models. Results from these analyses did not differ from the base models and are not presented here. Results also did not differ when comparing discharges before and after the initiation of an attending‐only service.
DISCUSSION AND CONCLUSION
In our retrospective study of patients discharged from an academic medical center, discharge before noon was associated with a longer LOS, with the effect more pronounced in patients admitted emergently. Our results suggest that efforts to discharge patients earlier in the day may have varying degrees of success depending on patient characteristics. Conceivably, patients admitted electively recover according to predictable plans, allowing for discharges earlier in the day. In contrast, patients discharged from emergent hospitalizations may have ongoing evolution of their care plan, making plans for discharging before noon more challenging.
Our results differ from a previous study,[3] which suggested that increasing the proportion of before‐noon discharges was associated with a fall in observed‐to‐expected LOS. However, observational studies of DCBN are challenging, because the association between early discharge and LOS is potentially bidirectional. One interpretation, for example, is that patients were kept longer in order to be discharged by noon the following day, which for the subgroup of patients admitted emergently corresponded to a roughly 12‐hour increase in LOS. However, it is also plausible that patients who stayed longer had more time to plan for an early discharge. In either scenario, the use of LOS as a key metric of throughput efforts may be flawed, which suggests that alternatives (eg, number of patients waiting for beds off unit) may be more reasonable measures of throughput.
Our results have several limitations. As in any observational study, our results are vulnerable to biases from unmeasured covariates that confound the analysis. We caution that a causal relationship between a discharge before noon and LOS cannot be determined from the nature of the study. Our results are also limited in that we were unable to adjust for day‐to‐day hospital capacity and other variables that affect LOS, including caregiver and transportation availability, bed capacity at receiving care facilities, and patient consent to discharge. Finally, as a single‐site study, our findings may not be applicable to nonacademic settings.
In conclusion, our observational study found an association between discharging patients before noon and longer LOS. We believe our findings provide a rationale for alternate approaches to measuring an early discharge program's effectiveness: the evaluation of an early discharge initiative should consider multiple metrics, including its effects on emergency department wait times, on intensive care unit or postanesthesia transitions, and on patient‐reported experiences of care transitions.
Disclosures
Andrew Auerbach, MD, is supported by a K24 grant from the National Heart, Lung, and Blood Institute: K24HL098372. The authors report no conflicts of interest.
- The effect of emergency department crowding on clinically oriented outcomes. Acad Emerg Med. 2009;16(1):1–10.
- Centers for Medicare 2013.
- Discharge before noon: an achievable hospital goal. J Hosp Med. 2014;9(4):210–214.
- Discharge before noon: effect on throughput and sustainability. J Hosp Med. 2015;10(10):664–669.
© 2015 Society of Hospital Medicine