Efficacy of Cryosurgery and 5-Fluorouracil Cream 0.5% Combination Therapy for the Treatment of Actinic Keratosis


Actinic keratosis (AK) is regarded as a lesion on a continuum of progression to squamous cell carcinoma (SCC).1 Studies have estimated that 44% to 97% of SCCs were associated with AK lesions either in contiguous skin or within the same histologic section and that AK lesions progress to SCCs at a rate of 0.6% at 1 year.2 In 1993-1994 there were 3.7 million reported office visits for AK lesions, while in 2002 alone there were 8.2 million office visits.3,4 As the burden of disease from AKs has increased, so have the associated costs from office-based visits, treatments, and subsequent surveillance.

There are a number of highly effective approaches to AK treatment; the choice among them is based on several factors such as the number and extent of the lesions, history of skin cancer, provider practice characteristics (eg, location, appointment availability), patient preferences, cost, and tolerability. Cryosurgery is the most commonly used lesion-directed modality in the treatment of individual AKs based on its effectiveness and relative ease of use. Cryosurgery alone has been shown to have a success rate of 67% on AK lesions.5 Patients often experience erythema, edema, pain, and crusting at treated sites; there also is potential for ulceration, scarring, hypopigmentation, hyperpigmentation, and secondary infection, but these effects are less common. Recurrence may be an indicator of treatment-resistant lesions or new lesions appearing in the field.

A field-directed approach with topical 5-fluorouracil (5-FU) may be preferred in patients with a history of substantial photodamage, AKs that are resistant to cryosurgery, or multiple AKs. Field-directed treatments address multiple AKs simultaneously and treat subclinical lesions. Fluorouracil is a common therapy for AKs that often is implemented by dermatologists due to its efficacy and well-understood mechanism of action. Fluorouracil inhibits thymidylate synthase during DNA synthesis, thereby halting cellular proliferation. 5-Fluorouracil cream 0.5% has been approved for 1-, 2-, and 4-week treatment periods. In one study, resolution of AK lesions was greatest in the 4-week treatment group; however, side effects also were greatest in this group.6 Patients commonly may experience a range of local reactions including erythema, pruritus, erosions, ulcerations, scabbing, crusting, and facial irritation. For patients with substantial photodamage and AKs, a robust response can lead to perceived adverse events (AEs) and considerable downtime, possibly affecting patient satisfaction and treatment compliance.7

Many alternative and combination approaches have been studied to decrease AEs and improve compliance and efficacy in the treatment of AKs. In this study, we examined the efficacy and perceived side effects of cryosurgery and 5-FU cream 0.5% combination therapy in the treatment of AKs.

Methods

Study Design and Participants

This single-blind, single-center, comparator cream–controlled pilot study was parallel designed with a balanced randomization (1:1 frequency). The study protocol and consent form were approved by the Wake Forest University Health Sciences institutional review board (Winston-Salem, North Carolina). Participants were 18 years or older with 8 clinically typical, visible, and discrete AK lesions on the face (forehead and temples) or balding scalp. Typical inclusion and exclusion criteria were observed. No other topical agents or therapies were permitted to be applied to the affected areas at least 4 weeks prior to treatment, depending on the treatment modality.

Assessment

During the screening (baseline) visit, eligible participants provided informed consent, baseline lesion counts and investigator global assessments (IGAs) were performed, and cryosurgery was administered to all visible AK lesions in the study areas. Participants returned at weeks 3, 4, 8, and 26. Three weeks following cryosurgery, participants were randomized according to standard randomization tables into 1 of 2 treatment groups to receive once-daily treatment with either 5-FU cream 0.5% or a moisturizing comparator cream. The cream was applied at bedtime to the affected sites for 1 week. Randomization was investigator blinded, but participants and the study administrators were not blinded. Participants were instructed to record their treatment compliance in daily diary entries, which were reviewed at week 4 using the medication tolerability assessment rating for burning, stinging, and ulceration. Investigator global assessment, IGA of improvement, lesion counts, and quality of life (QOL) survey responses were gathered at weeks 3, 8, and 26. The IGA measured the overall severity of AK disease involvement on a 6-point scale (clear; very severe). The IGA of improvement measured the overall improvement from baseline on a 6-point scale (clear; worse). Adverse events were measured at each visit.
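The balanced 1:1 allocation described above can be illustrated with a short sketch. It is not the study's actual procedure (the authors used standard randomization tables); the function name, arm labels, and seed are hypothetical and chosen only for illustration.

```python
import random


def balanced_allocation(n_participants, arms=("5-FU cream 0.5%", "comparator cream"), seed=None):
    """Return a shuffled list assigning an equal number of participants to each arm."""
    if n_participants % len(arms) != 0:
        raise ValueError("n_participants must divide evenly among the arms")
    per_arm = n_participants // len(arms)
    # Build a list with exactly per_arm slots for every arm, then shuffle it.
    allocation = [arm for arm in arms for _ in range(per_arm)]
    rng = random.Random(seed)  # hypothetical seed, used only for reproducibility
    rng.shuffle(allocation)
    return allocation


# Illustrative use: 60 participants split evenly between the two creams.
allocation = balanced_allocation(60, seed=2014)
print(allocation.count("5-FU cream 0.5%"), allocation.count("comparator cream"))
```

Shuffling a fixed list guarantees equal group sizes, which simple coin-flip assignment would not.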

Efficacy End Points

The primary end point was 100% clearance of all AK lesions at the end of the study (week 26) relative to the baseline AK lesion count. Secondary end points included comparisons between the groups for the number of participants with greater than 75% reduction of baseline lesion counts at the end of the study as well as differences at each visit in medication tolerability assessments, QOL measures, IGA improvement scores, and medication adherence based on diary entries at week 4.
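As a minimal sketch of how these end points could be derived from lesion counts, the following computes the percentage reduction from baseline and flags complete clearance and greater than 75% reduction. The function names and the example counts are hypothetical and are not study data.

```python
def percent_reduction(baseline_count, followup_count):
    """Percentage reduction in lesion count relative to baseline."""
    if baseline_count <= 0:
        raise ValueError("baseline_count must be positive")
    return 100.0 * (baseline_count - followup_count) / baseline_count


def classify_clearance(baseline_count, followup_count):
    """Flag the primary (100% clearance) and secondary (>75% reduction) end points."""
    reduction = percent_reduction(baseline_count, followup_count)
    return {
        "percent_reduction": reduction,
        "complete_clearance": followup_count == 0,  # primary end point
        "over_75_percent": reduction > 75.0,        # secondary end point (strictly greater)
    }


# Hypothetical participant: 12 lesions at baseline, 2 remaining at follow-up.
example = classify_clearance(12, 2)
print(example)
```

Because the secondary end point is defined as greater than 75%, a participant with exactly 75% reduction is not counted as meeting it in this sketch.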

Statistical Analysis

An intention-to-treat analysis was performed. The numbers of participants with 100% or greater than 75% clearance of AK lesions by specified time points were compared using relative risks and risk differences, estimated by Poisson regression with log and identity link functions, respectively, with robust error variance used to obtain 95% confidence intervals. Medication tolerability assessment, QOL, and IGA improvement scores were compared between the 2 groups using the Mann-Whitney U test. The significance level was set at α=.05. All analyses were performed using SAS data analysis software.
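To make the two effect measures concrete, the sketch below computes a relative risk and a risk difference from a 2×2 table. It substitutes simple Wald-type 95% confidence intervals for the robust-variance Poisson regression the authors ran in SAS, so its intervals would not match the published ones exactly. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def risk_comparison(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk and risk difference (group A vs group B) with Wald 95% CIs.

    Assumes at least one event in each group so the log relative risk is defined.
    """
    if events_a <= 0 or events_b <= 0:
        raise ValueError("both groups need at least one event")
    p_a = events_a / n_a
    p_b = events_b / n_b

    # Risk difference: absolute difference in proportions.
    rd = p_a - p_b
    se_rd = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

    # Relative risk: ratio of proportions, with the CI built on the log scale.
    rr = p_a / p_b
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)

    return {
        "rd": rd,
        "rd_ci": (rd - z * se_rd, rd + z * se_rd),
        "rr": rr,
        "rr_ci": (math.exp(math.log(rr) - z * se_log_rr),
                  math.exp(math.log(rr) + z * se_log_rr)),
    }


# Hypothetical counts: 15 of 30 participants meet an end point in one arm, 10 of 30 in the other.
result = risk_comparison(15, 30, 10, 30)
print(result)
```

A relative risk interval that includes 1, or a risk difference interval that includes 0, corresponds to no significant difference at the α=.05 level.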

 

 

Results

Sixty age-eligible participants were enrolled in the study with 30 participants in each treatment group. All of the participants completed the 26-week study period and were included in the intention-to-treat analysis. All of the participants were white with a median age of 67 years; the median number of baseline AK lesions was 12. Participant baseline demographics and clinical characteristics are provided in Table 1. Treatment compliance in both groups was good with only a few participants reporting missed doses.

In our evaluation of the rate of change in the number of AK lesions at week 8 compared to baseline, the 5-FU cream 0.5% group showed an 84% reduction in the number of AK lesions versus a 69% reduction in the control group. At week 26, the 5-FU cream 0.5% group showed a 72% reduction in the number of lesions versus 73% in the control group. There was no significant difference between 5-FU cream 0.5% and the comparator cream for either 100% or 75% clearance of AK lesions by the end of the study; however, comparing the AK lesion count from baseline to 8 weeks following the initiation of the study, participants in the 5-FU cream 0.5% group were more likely than the control group to achieve 75% or 100% clearance on the relative risk and risk difference scales (Table 2). 

There were no significant differences between the 2 groups for the IGA of improvement at any time point (Table 3). On average, participants in the 5-FU cream 0.5% group experienced more dryness, erosion, fissuring, and redness than the control group but not more ulcerations by the end of week 4 (Table 4). All other QOL measures were statistically comparable between the 2 treatment groups for all time points.

A total of 25 AEs were reported throughout the study but none were considered to be serious. One AE (redness, burning, and itching over the eyebrow) was considered to be related to the study drug. No participants withdrew from the study due to AEs. A total of 12 participants in the 5-FU cream 0.5% group and 10 in the control group reported AEs.

Comment

After a 1-week course of 5-FU cream 0.5% following cryosurgery, a greater reduction in the number of AK lesions for a period of 2 months was noted in the treatment group compared to the control group. These findings are consistent with a similar study from 2006 that used 5-FU cream 0.5% or a vehicle 1 week prior to cryosurgery and then counted the number of AK lesions that remained.8 In the 2006 study, remarkable improvement out to week 26 was noted,8 unlike our study; however, there was insufficient power in our study to demonstrate a continued effect out to week 26.

Both the 2006 study and our current study support the benefit of using a combination treatment to clear AK lesions versus either treatment alone. Of note, these studies also show that combination treatments are equally effective, regardless of the order of treatments, in lowering AK lesion counts compared to cryosurgery alone.

Although participants in the 5-FU cream 0.5% group reported slightly more AEs on average at week 4, the rate of side effects was lower than those reported in a study documenting the side effects of a 4-week course of 5-FU.6 This rate of side effects must be considered in light of the added benefits this combination treatment has demonstrated.

Results of this pilot study suggest that a larger sample size would yield a difference in the study arms for all time periods (weeks 8 and 26). In an effort to maintain exchangeability of the study arms, patients were randomized at baseline treatment, but behaviors of patients in the 6 months following treatment, such as variation in sun exposure or other habits that promote AK lesion development, may have attenuated the results.

Key strengths of this study include no loss to follow-up and high medication adherence rates. The key limitation was the small sample size; the study did not demonstrate a statistical advantage of the 5-FU cream 0.5% at 26 weeks, although it does show promise for larger future studies in illustrating this difference. A study by Krawtchenko et al9 noted that long-term efficacy of field therapy with 5-FU may ultimately be less than that of imiquimod cream 5%, suggesting that altering the study protocol to compare the efficacy of different forms of field therapy may ultimately achieve better outcomes.

Conclusion

Overall, individuals with AK may benefit from a combination of treatment with cryosurgery and topical 5-FU to resolve lesions for longer periods than with cryosurgery alone. Although prior studies have found statistically significant differences in short-term and long-term treatment efficacy when cryosurgery is combined with an active field therapy versus a placebo vehicle,8,9 the current study aimed to find the best combination of efficacy with the fewest side effects. The results of these prior studies reinforce the authors' view that a protocol with a slightly different treatment regimen within the treatment arm could be highly beneficial to patients. Further studies should be implemented to confirm the longer-term benefits of this combination therapy.

References

 

1. Lebwohl M. Actinic keratosis: epidemiology and progression to squamous cell carcinoma. Br J Dermatol. 2003;149(suppl 66):31-33.

2. Criscione VD, Weinstock MA, Naylor MF, et al. Actinic keratoses: natural history and risk of malignant transformation in the Veterans Affairs Topical Tretinoin Chemoprevention Trial. Cancer. 2009;115:2523-2530.

3. Smith ES, Feldman SR, Fleischer AB Jr, et al. Characteristics of office-based visits for skin cancer. Dermatologists have more experience than other physicians in managing malignant and premalignant skin conditions. Dermatol Surg. 1998;24:981-985.

4. Shoimer I, Rosen N, Muhn C. Current management of actinic keratoses. Skin Therapy Lett. 2010;15:5-7.

5. Thai KE, Fergin P, Freeman M, et al. A prospective study of the use of cryosurgery for the treatment of actinic keratoses. Int J Dermatol. 2004;43:687-692.

6. Weiss J, Menter A, Hevia O, et al. Effective treatment of actinic keratosis with 0.5% fluorouracil cream for 1, 2, or 4 weeks. Cutis. 2002;70(suppl 2):22-29.

7. Jorizzo JL, Carney PS, Ko WT, et al. Treatment options in the management of actinic keratosis. Cutis. 2004;74(suppl 6):9-17.

8. Jorizzo J, Weiss J, Vamvakias G. One-week treatment with 0.5% fluorouracil cream prior to cryosurgery in patients with actinic keratoses: a double-blind, vehicle-controlled, long-term study. J Drugs Dermatol. 2006;5:133-139.

9. Krawtchenko N, Roewert-Huber J, Ulrich M, et al. A randomised study of topical 5% imiquimod vs. topical 5-fluorouracil vs. cryosurgery in immunocompetent patients with actinic keratoses: a comparison of clinical and histological outcomes including 1-year follow-up. Br J Dermatol. 2007;157(suppl 2):34-40.

Author and Disclosure Information

 

William D. Hoover III, MS; Joseph L. Jorizzo, MD; Adele R. Clark, PA-C; Steven R. Feldman, MD, PhD; Judy Holbrook, LPN, CCRC; Karen E. Huang, MS

From the Department of Dermatology, Wake Forest University School of Medicine, Winston-Salem, North Carolina.

This research was conducted with a grant from sanofi-aventis. The authors report no conflict of interest.

Correspondence: Adele R. Clark, PA-C, Department of Dermatology, Wake Forest University School of Medicine, Medical Center Blvd, Winston-Salem, NC 27157 (adclark@wakehealth.edu).

Issue: Cutis - 94(5)
Page Number: 255-259
Keywords: 5-FU, 5-fluorouracil, actinic keratosis, cryotherapy, combination therapy, cryosurgery

Actinic keratosis (AK) is regarded as a lesion on a continuum of progression to squamous cell carcinoma (SCC).1 Studies have estimated that 44% to 97% of SCCs were associated with AK lesions either in contiguous skin or within the same histologic section and that AK lesions progress to SCCs at a rate of 0.6% at 1 year.2 In 1993-1994 there were 3.7 million reported office visits for AK lesions, while in 2002 alone there were 8.2 million office visits.3,4 As the burden of disease from AKs has increased, so has the associated costs from office-based visits, treatments, and subsequent surveillance.

There are a number of highly effective approaches to AK treatment that are based on several factors such as the number of and extent of the lesions, history of skin cancer, provider practice characteristics (eg, location, appointment availability), patient preferences, cost, and tolerability. Cryosurgery is the most commonly used lesion-directed modality in the treatment of individual AKs based on its effectiveness and relative ease of use. Cryosurgery alone has been shown to have a success rate of 67% on AK lesions.5 Patients often experience erythema, edema, pain, and crusting at treated sites; there also is potential for ulceration, scarring, hypopigmentation, hyperpigmentation, and secondary infection, but these effects are less common. Recurrence may be an indicator of treatment-resistant lesions or new lesions appearing in the field.

A field-directed approach with topical 5-fluorouracil (5-FU) may be preferred in patients with a history of substantial photodamage, AKs that are resistant to cryosurgery, or multiple AKs. Field-directed treatments address multiple AKs simultaneously and treat subclinical lesions. Fluorouracil is a common therapy for AKs that often is implemented by dermatologists due to its efficacy and well-understood mechanism of action. Fluorouracil inhibits thymidylate synthase during DNA synthesis, thereby halting cellular proliferation. 5-Fluorouracil cream 0.5% has been approved for 1-, 2-, and 4-week treatment periods. In one study, resolution of AK lesions was greatest in the 4-week treatment group; however, side effects also were greatest in this group.6 Patients commonly may experience a range of local reactions including erythema, pruritus, erosions, ulcerations, scabbing, crusting, and facial irritation. For patients with substantial photodamage and AKs, a robust response can lead to perceived adverse events (AEs) and considerable downtime, possibly affecting patient satisfaction and treatment compliance.7

Many alternative and combination approaches have been studied to decrease AEs and improve compliance and efficacy in the treatment of AKs. In this study, we examined the efficacy and perceived side effects of cryosurgery and 5-FU cream 0.5% combination therapy in the treatment of AKs.

Methods

Study Design and Participants

This single-blind, single-center, comparator cream–controlled pilot study was parallel designed with a balanced randomization (1:1 frequency). The study protocol and consent form were approved by the Wake Forest University Health Sciences institutional review board (Winston-Salem, North Carolina). Participants were 18 years or older with 8 clinically typical, visible, and discrete AK lesions on the face (forehead and temples) or balding scalp. Typical inclusion and exclusion criteria were observed. No other topical agents or therapies were permitted to be applied to the affected areas at least 4 weeks prior to treatment, depending on the treatment modality.

Assessment

During the screening (baseline) visit, eligible participants provided informed consent, baseline lesion counts and investigator global assessments (IGAs) were performed, and cryosurgery was administered to all visible AK lesions in the study areas. Participants returned at weeks 3, 4, 8, and 26. Three weeks following cryosurgery, participants were randomized according to standard randomization tables into 1 of 2 treatment groups to receive once-daily treatment with either 5-FU cream 0.5% or a moisturizing comparator cream. The cream was applied at bedtime to the affected sites for 1 week. Randomization was investigator blinded, but participants and the study administrators were not blinded. Participants were instructed to record their treatment compliance in daily diary entries, which were reviewed at week 4 using the medication tolerability assessment rating for burning, stinging, and ulceration. Investigator global assessment, IGA of improvement, lesion counts, and quality of life (QOL) survey responses were gathered at weeks 3, 8, and 26. The IGA measured the overall severity of AK disease involvement on a 6-point scale (clear; very severe). The IGA of improvement measured the overall improvement from baseline on a 6-point scale (clear; worse). Adverse events were measured at each visit.

Efficacy End Points

The primary end point was 100% clearance of all AK lesions at the end of the study (week 26) relative to the baseline AK lesion count. Secondary end points included comparisons between the groups for the number of participants with greater than 75% reduction of baseline lesion counts at the end of the study as well as differences at each visit in medication tolerability assessments, QOL measures, IGA improvement scores, and medication adherence based on diary entries at week 4.

Statistical Analysis

An intention-to-treat analysis was performed. The number of participants with 100% or greater than 75% clearance of AK lesions by specified time points were compared using relative risks and risk differences with Poisson regression analysis log and identity link functions, respectively, to obtain robust error variance 95% confidence intervals. Medication tolerability assessment, QOL, and IGA improvement scores were compared between the 2 groups using the Mann-Whitney U test. The significance level was set at α=.05. All analyses were performed using SAS data analysis software.

 

 

Results

Sixty age-eligible participants were enrolled in the study with 30 participants in each treatment group. All of the participants completed the 26-week study period and were included in the intention-to-treat analysis. All of the participants were white with a median age of 67 years; the median number of baseline AK lesions was 12. Participant baseline demographics and clinical characteristics are provided in Table 1. Treatment compliance in both groups was good with only a few participants reporting missed doses.

In our evaluation of the rate of change in the number of AK lesions at week 8 compared to baseline, the 5-FU cream 0.5% group showed an 84% reduction in the number of AK lesions versus a 69% reduction in the control group. At week 26, the 5-FU cream 0.5% group showed a 72% reduction in the number of lesions versus 73% in the control group. There was no significant difference between 5-FU cream 0.5% and the comparator cream for either 100% or 75% clearance of AK lesions by the end of the study; however, comparing the AK lesion count from baseline to 8 weeks following the initiation of the study, participants in the 5-FU cream 0.5% group were more likely than the control group to achieve 75% or 100% clearance on the relative risk and risk difference scales (Table 2). 

There were no significant differences between the 2 groups for the IGA of improvement at any time point (Table 3). On average, participants in the 5-FU cream 0.5% group experienced more dryness, erosion, fissuring, and redness than the control group but not more ulcerations by the end of week 4 (Table 4). All other QOL measures were statistically comparable between the 2 treatment groups for all time points.

A total of 25 AEs were reported throughout the study but none were considered to be serious. One AE (redness, burning, and itching over the eyebrow) was considered to be related to the study drug. No participants withdrew from the study due to AEs. A total of 12 participants in the 5-FU cream 0.5% group and 10 in the control group reported AEs.

Comment

After a 1-week course of 5-FU cream 0.5% following cryosurgery, a greater reduction in the number of AK lesions for a period of 2 months was noted in the treatment group compared to the control group. These findings are consistent with a similar study from 2006 that used 5-FU cream 0.5% or a vehicle 1 week prior to cryosurgery and then counted the number of AK lesions that remained.8 In the 2006 study, remarkable improvement out to week 26 was noted,8 unlike our study; however, there was insufficient power in our study to demonstrate a continued effect out to week 26.

Both the 2006 study and our current study support the benefit of using a combination treatment to clear AK lesions versus either treatment alone. Of note, these studies also show that combination treatments are equally effective, regardless of the order of treatments, in lowering AK lesion counts compared to cryosurgery alone.

Although participants in the 5-FU cream 0.5% group reported slightly more AEs on average at week 4, the rate of side effects was lower than those reported in a study documenting the side effects of a 4-week course of 5-FU.6 This rate of side effects must be considered in light of the added benefits this combination treatment has demonstrated.

Results of this pilot study suggest that a larger sample size would yield a difference in the study arms for all time periods (weeks 8 and 26). In an effort to maintain exchangeability of the study arms, patients were randomized at baseline treatment, but behaviors of patients in the 6 months following treatment, such as variation in sun exposure or other habits that promote AK lesion development, may have attenuated the results.

Key strengths of this study include no loss to follow-up and high medication adherence rates. The key limitation was the small sample size, which did not demonstrate a statistical advantage of the 5-FU cream 0.5% at 26 weeks; however, our study does show promise for larger future studies in illustrating this difference. A study by Krawtchenko et al9 noted that long-term efficacy of field therapy with 5-FU may ultimately be less than imiquimod cream 5%, suggesting that a possible alteration of the study protocol to compare the efficacy of different forms of field therapy may ultimately achieve better outcomes.

Conclusion

Overall, individuals with AK may benefit from a combination of treatment with cryosurgery and topical 5-FU to resolve lesions for longer periods than with cryosurgery alone. Although prior studies have found statistically significant differences in short-term and long-term treatment efficacy when cryosurgery is combined with an active field therapy versus a placebo vehicle,8,9 the current study aimed to find the best combination of efficacy with the fewest side effects. Therefore, the results of prior literature studies only further the feelings of the authors that with a protocol that looks at a slightly different treatment regimen within the treatment arm, the results can be extremely beneficial to patients. Further studies should be implemented to confirm the longer-term benefits of this combination therapy.

Actinic keratosis (AK) is regarded as a lesion on a continuum of progression to squamous cell carcinoma (SCC).1 Studies have estimated that 44% to 97% of SCCs were associated with AK lesions either in contiguous skin or within the same histologic section and that AK lesions progress to SCCs at a rate of 0.6% at 1 year.2 In 1993-1994 there were 3.7 million reported office visits for AK lesions, while in 2002 alone there were 8.2 million office visits.3,4 As the burden of disease from AKs has increased, so has the associated costs from office-based visits, treatments, and subsequent surveillance.

There are a number of highly effective approaches to AK treatment that are based on several factors such as the number of and extent of the lesions, history of skin cancer, provider practice characteristics (eg, location, appointment availability), patient preferences, cost, and tolerability. Cryosurgery is the most commonly used lesion-directed modality in the treatment of individual AKs based on its effectiveness and relative ease of use. Cryosurgery alone has been shown to have a success rate of 67% on AK lesions.5 Patients often experience erythema, edema, pain, and crusting at treated sites; there also is potential for ulceration, scarring, hypopigmentation, hyperpigmentation, and secondary infection, but these effects are less common. Recurrence may be an indicator of treatment-resistant lesions or new lesions appearing in the field.

A field-directed approach with topical 5-fluorouracil (5-FU) may be preferred in patients with a history of substantial photodamage, AKs that are resistant to cryosurgery, or multiple AKs. Field-directed treatments address multiple AKs simultaneously and treat subclinical lesions. Fluorouracil is a common therapy for AKs that often is implemented by dermatologists due to its efficacy and well-understood mechanism of action. Fluorouracil inhibits thymidylate synthase during DNA synthesis, thereby halting cellular proliferation. 5-Fluorouracil cream 0.5% has been approved for 1-, 2-, and 4-week treatment periods. In one study, resolution of AK lesions was greatest in the 4-week treatment group; however, side effects also were greatest in this group.6 Patients commonly may experience a range of local reactions including erythema, pruritus, erosions, ulcerations, scabbing, crusting, and facial irritation. For patients with substantial photodamage and AKs, a robust response can lead to perceived adverse events (AEs) and considerable downtime, possibly affecting patient satisfaction and treatment compliance.7

Many alternative and combination approaches have been studied to decrease AEs and improve compliance and efficacy in the treatment of AKs. In this study, we examined the efficacy and perceived side effects of cryosurgery and 5-FU cream 0.5% combination therapy in the treatment of AKs.

Methods

Study Design and Participants

This single-blind, single-center, comparator cream–controlled pilot study was parallel designed with a balanced randomization (1:1 frequency). The study protocol and consent form were approved by the Wake Forest University Health Sciences institutional review board (Winston-Salem, North Carolina). Participants were 18 years or older with 8 clinically typical, visible, and discrete AK lesions on the face (forehead and temples) or balding scalp. Typical inclusion and exclusion criteria were observed. No other topical agents or therapies were permitted to be applied to the affected areas at least 4 weeks prior to treatment, depending on the treatment modality.

Assessment

During the screening (baseline) visit, eligible participants provided informed consent, baseline lesion counts and investigator global assessments (IGAs) were performed, and cryosurgery was administered to all visible AK lesions in the study areas. Participants returned at weeks 3, 4, 8, and 26. Three weeks following cryosurgery, participants were randomized according to standard randomization tables into 1 of 2 treatment groups to receive once-daily treatment with either 5-FU cream 0.5% or a moisturizing comparator cream. The cream was applied at bedtime to the affected sites for 1 week. Randomization was investigator blinded, but participants and the study administrators were not blinded. Participants were instructed to record their treatment compliance in daily diary entries, which were reviewed at week 4 using the medication tolerability assessment rating for burning, stinging, and ulceration. Investigator global assessment, IGA of improvement, lesion counts, and quality of life (QOL) survey responses were gathered at weeks 3, 8, and 26. The IGA measured the overall severity of AK disease involvement on a 6-point scale (clear; very severe). The IGA of improvement measured the overall improvement from baseline on a 6-point scale (clear; worse). Adverse events were measured at each visit.

Efficacy End Points

The primary end point was 100% clearance of all AK lesions at the end of the study (week 26) relative to the baseline AK lesion count. Secondary end points included comparisons between the groups for the number of participants with greater than 75% reduction of baseline lesion counts at the end of the study as well as differences at each visit in medication tolerability assessments, QOL measures, IGA improvement scores, and medication adherence based on diary entries at week 4.

Statistical Analysis

An intention-to-treat analysis was performed. The number of participants with 100% or greater than 75% clearance of AK lesions by specified time points were compared using relative risks and risk differences with Poisson regression analysis log and identity link functions, respectively, to obtain robust error variance 95% confidence intervals. Medication tolerability assessment, QOL, and IGA improvement scores were compared between the 2 groups using the Mann-Whitney U test. The significance level was set at α=.05. All analyses were performed using SAS data analysis software.

Results

Sixty age-eligible participants were enrolled in the study with 30 participants in each treatment group. All of the participants completed the 26-week study period and were included in the intention-to-treat analysis. All of the participants were white with a median age of 67 years; the median number of baseline AK lesions was 12. Participant baseline demographics and clinical characteristics are provided in Table 1. Treatment compliance in both groups was good with only a few participants reporting missed doses.

In our evaluation of the rate of change in the number of AK lesions at week 8 compared to baseline, the 5-FU cream 0.5% group showed an 84% reduction in the number of AK lesions versus a 69% reduction in the control group. At week 26, the 5-FU cream 0.5% group showed a 72% reduction in the number of lesions versus 73% in the control group. There was no significant difference between 5-FU cream 0.5% and the comparator cream for either 100% or 75% clearance of AK lesions by the end of the study; however, comparing the AK lesion count from baseline to 8 weeks following the initiation of the study, participants in the 5-FU cream 0.5% group were more likely than the control group to achieve 75% or 100% clearance on the relative risk and risk difference scales (Table 2). 

There were no significant differences between the 2 groups for the IGA of improvement at any time point (Table 3). On average, participants in the 5-FU cream 0.5% group experienced more dryness, erosion, fissuring, and redness than the control group but not more ulcerations by the end of week 4 (Table 4). All other QOL measures were statistically comparable between the 2 treatment groups for all time points.

A total of 25 AEs were reported throughout the study but none were considered to be serious. One AE (redness, burning, and itching over the eyebrow) was considered to be related to the study drug. No participants withdrew from the study due to AEs. A total of 12 participants in the 5-FU cream 0.5% group and 10 in the control group reported AEs.

Comment

After a 1-week course of 5-FU cream 0.5% following cryosurgery, a greater reduction in the number of AK lesions for a period of 2 months was noted in the treatment group compared to the control group. These findings are consistent with a similar study from 2006 that used 5-FU cream 0.5% or a vehicle 1 week prior to cryosurgery and then counted the number of AK lesions that remained.8 In the 2006 study, remarkable improvement out to week 26 was noted,8 unlike our study; however, there was insufficient power in our study to demonstrate a continued effect out to week 26.

Both the 2006 study and our current study support the benefit of using a combination treatment to clear AK lesions versus either treatment alone. Of note, these studies also show that combination treatments are equally effective, regardless of the order of treatments, in lowering AK lesion counts compared to cryosurgery alone.

Although participants in the 5-FU cream 0.5% group reported slightly more AEs on average at week 4, the rate of side effects was lower than those reported in a study documenting the side effects of a 4-week course of 5-FU.6 This rate of side effects must be considered in light of the added benefits this combination treatment has demonstrated.

Results of this pilot study suggest that a larger sample size would yield a difference in the study arms for all time periods (weeks 8 and 26). In an effort to maintain exchangeability of the study arms, patients were randomized at baseline treatment, but behaviors of patients in the 6 months following treatment, such as variation in sun exposure or other habits that promote AK lesion development, may have attenuated the results.

Key strengths of this study include no loss to follow-up and high medication adherence rates. The key limitation was the small sample size, which was insufficient to demonstrate a statistically significant advantage of the 5-FU cream 0.5% at 26 weeks; however, our study does show promise for larger future studies in illustrating this difference. A study by Krawtchenko et al9 noted that the long-term efficacy of field therapy with 5-FU may ultimately be lower than that of imiquimod cream 5%, suggesting that altering the study protocol to compare different forms of field therapy may achieve better outcomes.

Conclusion

Overall, individuals with AK may benefit from a combination of treatment with cryosurgery and topical 5-FU to resolve lesions for longer periods than with cryosurgery alone. Although prior studies have found statistically significant differences in short-term and long-term treatment efficacy when cryosurgery is combined with an active field therapy versus a placebo vehicle,8,9 the current study aimed to find the best combination of efficacy with the fewest side effects. The results of these prior studies reinforce the authors' view that a protocol with a slightly different regimen within the treatment arm could be highly beneficial to patients. Further studies should be implemented to confirm the longer-term benefits of this combination therapy.

References

1. Lebwohl M. Actinic keratosis: epidemiology and progression to squamous cell carcinoma. Br J Dermatol. 2003;149(suppl 66):31-33.

2. Criscione VD, Weinstock MA, Naylor MF, et al. Actinic keratoses: natural history and risk of malignant transformation in the Veterans Affairs Topical Tretinoin Chemoprevention Trial. Cancer. 2009;115:2523-2530.

3. Smith ES, Feldman SR, Fleischer AB Jr, et al. Characteristics of office-based visits for skin cancer. Dermatologists have more experience than other physicians in managing malignant and premalignant skin conditions. Dermatol Surg. 1998;24:981-985.

4. Shoimer I, Rosen N, Muhn C. Current management of actinic keratoses. Skin Therapy Lett. 2010;15:5-7.

5. Thai KE, Fergin P, Freeman M, et al. A prospective study of the use of cryosurgery for the treatment of actinic keratoses. Int J Dermatol. 2004;43:687-692.

6. Weiss J, Menter A, Hevia O, et al. Effective treatment of actinic keratosis with 0.5% fluorouracil cream for 1, 2, or 4 weeks. Cutis. 2002;70(suppl 2):22-29.

7. Jorizzo JL, Carney PS, Ko WT, et al. Treatment options in the management of actinic keratosis. Cutis. 2004;74(suppl 6):9-17.

8. Jorizzo J, Weiss J, Vamvakias G. One-week treatment with 0.5% fluorouracil cream prior to cryosurgery in patients with actinic keratoses: a double-blind, vehicle-controlled, long-term study. J Drugs Dermatol. 2006;5:133-139.

9. Krawtchenko N, Roewert-Huber J, Ulrich M, et al. A randomised study of topical 5% imiquimod vs. topical 5-fluorouracil vs. cryosurgery in immunocompetent patients with actinic keratoses: a comparison of clinical and histological outcomes including 1-year follow-up. Br J Dermatol. 2007;157(suppl 2):34-40.

References

 

1. Lebwohl M. Actinic keratosis: epidemiology and progression to squamous cell carcinoma. Br J Dermatol. 2003;149(suppl 66):31-33.

2. Criscione VD, Weinstock MA, Naylor MF, et al. Actinic keratoses: natural history and risk of malignant transformation in the Veterans Affairs Topical Tretinoin Chemoprevention Trial. Cancer. 2009;115:2523-2530.

3. Smith ES, Feldman SR, Fleischer AB Jr, et al. Characteristics of office-based visits for skin cancer. dermatologists have more experience than other physicians in managing malignant and premalignant skin conditions. Dermatol Surg. 1998;24:981-985.

4. Shoimer I, Rosen N, Muhn C. Current management of actinic keratoses. Skin Therapy Lett. 2010;15:5-7.

5. Thai KE, Fergin P, Freeman M, et al. A prospective study of the use of cryosurgery for the treatment of actinic keratoses. Int J Dermatol. 2004;43:687-692.

6. Weiss J, Menter A, Hevia O, et al. Effective treatment of actinic keratosis with 0.5% fluorouracil cream for 1, 2, or 4 weeks. Cutis. 2002;70(suppl 2):22-29.

7. Jorizzo JL, Carney PS, Ko WT, et al. Treatment options in the management of actinic keratosis. Cutis. 2004;74 (suppl 6):9-17.

8. Jorizzo J, Weiss J, Vamvakias G. One-week treatment with 0.5% fluorouracil cream prior to cryosurgery in patients with actinic keratoses: a double-blind, vehicle-controlled, long-term study. J Drugs Dermatol. 2006;5:133-139.

9. Krawtchenko N, Roewert-Huber J, Ulrich M, et al. A randomised study of topical 5% imiquimod vs. topical 5-fluorouracil vs. cryosurgery in immunocompetent patients with actinic keratoses: a comparison of clinical and histological outcomes including 1-year follow-up. Br J Dermatol. 2007;157(suppl 2):34-40.

Issue
Cutis - 94(5)
Page Number
255-259
Legacy Keywords
5-FU, 5-fluorouracil, actinic keratosis, cryotherapy, combination therapy, cryosurgery

CAPO Aspiration Pneumonia

Article Type
Changed
Sun, 05/21/2017 - 13:28
Display Headline
Characteristics associated with clinician diagnosis of aspiration pneumonia: A descriptive study of afflicted patients and their outcomes

Pneumonia is a common clinical syndrome with well‐described epidemiology and microbiology. Aspiration pneumonia comprises 5% to 15% of patients with pneumonia acquired outside of the hospital,[1] but is less well characterized despite being a major syndrome of pneumonia in the elderly.[2, 3] Difficulties in studying aspiration pneumonia include the lack of a sensitive and specific marker for aspiration as well as the potential overlap between aspiration pneumonia and other forms of pneumonia.[4, 5, 6] Additionally, clinicians have difficulty distinguishing between aspiration pneumonia, which develops after the aspiration of oropharyngeal contents, and aspiration pneumonitis, wherein inhalation of gastric contents causes inflammation without the subsequent development of bacterial infection.[7, 8] Central to the study of aspiration pneumonia is whether it should exist as its own entity, or if aspiration is really a designation used for pneumonia in an older patient with greater comorbidities. The ability to clearly understand how a clinician diagnoses aspiration pneumonia, and whether that method has face validity with expert definitions may allow for improved future research, improved generalizability of current or past research, and possibly better clinical care.

Several validated mortality prediction models exist for community‐acquired pneumonia (CAP) using a variety of clinical predictors, but their performance in patients with aspiration pneumonia is less well characterized. Most studies validating pneumonia severity scoring systems excluded aspiration pneumonia from their study population.[9, 10, 11] Severity scoring systems for CAP may not accurately predict disease severity in patients with aspiration pneumonia. The CURB‐65[9] (confusion, uremia, respiratory rate, blood pressure, age ≥65 years) and the eCURB[12] scoring systems are poor predictors of mortality in patients with aspiration pneumonia, perhaps because they do not account for patient comorbidities.[13] The pneumonia severity index (PSI)[10] might predict mortality better than CURB‐65 in the aspiration population due to the inclusion of comorbidities.

Previous studies have demonstrated that patients with aspiration pneumonia are older and have greater disease severity and more comorbidities.[13, 14, 15] These single‐center studies also demonstrated greater mortality, more frequent admission to an intensive care unit (ICU), and longer hospital lengths of stay in patients with aspiration pneumonia. These studies identified aspiration pneumonia by the presence of a risk factor for aspiration[15] or by physician billing codes.[13] In practice, however, the bedside clinician diagnoses a patient as having aspiration pneumonia, but the logic is likely vague and inconsistent. Despite the potential for variability with individual judgment, an aggregate estimation from independent judgments may perform better than individual judgments.[16] Because there is no gold standard for defining aspiration pneumonia, all previous research has been limited to definitions created by investigators. This multicenter study seeks to determine what clinical characteristics lead physicians to diagnose a patient as having aspiration pneumonia, and whether or not the clinician‐derived diagnosis is distinct and clinically useful.

Our objectives were to: (1) identify covariates associated with bedside clinicians diagnosing a pneumonia patient as having aspiration pneumonia; (2) compare aspiration pneumonia and nonaspiration pneumonia in regard to disease severity, patient demographics, comorbidities, and clinical outcomes; and (3) measure the performance of the PSI in aspiration pneumonia versus nonaspiration pneumonia.

PATIENTS AND METHODS

Study Design and Setting

We performed a secondary analysis of the Community‐Acquired Pneumonia Organization (CAPO) database, which contains retrospectively collected data from 71 hospitals in 16 countries between June 2001 and December 2012. In each participating center, primary investigators selected nonconsecutive, adult hospitalized patients diagnosed with CAP. To decrease systematic selection biases, the selection of patients with CAP for enrollment in the trial was based on the date of hospital admission. Each investigator completed a case report form that was transferred via the internet to the CAPO study center at the University of Louisville (Louisville, KY). A sample of the data collection form is available at the study website (www.caposite.com). Validation of data quality was performed at the study center before the case was entered into the CAPO database. Local institutional review board approval was obtained for each study site.

Inclusion and Exclusion Criteria

Patients ≥18 years of age who satisfied criteria for CAP were included in this study. A diagnosis of CAP required a new pulmonary infiltrate at time of hospitalization and at least 1 of the following: new or increased cough; leukocytosis, leukopenia, or left shift pattern on white blood cell count; or temperature >37.8°C or <35.6°C. We excluded patients with pneumonia attributed to mycobacterial or fungal infection, and patients infected with human immunodeficiency virus, as we believed these types of pneumonia differ fundamentally from typical CAP.
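The inclusion rule can be read as a boolean predicate: a required infiltrate plus at least 1 supporting finding. The Python sketch below is one reading of that rule, not code from the CAPO study. The class and field names are hypothetical, the age threshold assumes adults are those aged 18 years or older, and the exclusion criteria are deliberately not modeled.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical record of findings at the time of hospitalization."""
    age_years: int
    new_infiltrate: bool
    new_or_increased_cough: bool
    abnormal_wbc: bool  # leukocytosis, leukopenia, or left shift
    temperature_c: float


def meets_cap_criteria(p: Presentation) -> bool:
    """Adult with a new infiltrate and at least 1 supporting finding."""
    abnormal_temperature = p.temperature_c > 37.8 or p.temperature_c < 35.6
    supporting_finding = (
        p.new_or_increased_cough or p.abnormal_wbc or abnormal_temperature
    )
    return p.age_years >= 18 and p.new_infiltrate and supporting_finding


# Hypothetical patient: 72 years old, new infiltrate, febrile, no cough.
patient = Presentation(72, True, False, False, 38.4)
print(meets_cap_criteria(patient))
```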

Patient Variables

Patient variables included presence of aspiration pneumonia, laboratory data, comorbidities, and measures of disease severity, including the PSI. The clinician made a clinical diagnosis of the presence or absence of aspiration for each patient by marking a box on the case report form. Outcomes included in‐hospital mortality, hospital length of stay up to 14 days, and time to clinical stability up to 8 days. All variables were obtained directly from the case report form. In accordance with previously published definitions, we defined clinical stability as the day the following criteria were all met: improved clinical signs (improved cough and shortness of breath), lack of fever for >8 hours, improving leukocytosis (decreased at least 10% from the previous day), and tolerating oral intake.[17, 18]

Statistical Analysis

Baseline characteristics of patients with aspiration and nonaspiration CAP were compared using χ² or Fisher exact tests for categorical variables and the Mann‐Whitney U test for continuous variables.
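For a binary characteristic, the χ² comparison amounts to a test on a 2×2 table. The following Python sketch shows the Pearson statistic without continuity correction; it is an illustration rather than the study's SAS or R code, and the function name and example counts are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows are the two patient groups; columns are characteristic present/absent.
    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square tail equals a two-sided normal tail.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: 30 of 100 patients in one group vs 10 of 100 in the other.
statistic, p_value = chi_square_2x2(30, 70, 10, 90)
print(statistic, p_value)
```

When any expected cell count is small, the Fisher exact test named above is the usual alternative to this approximation.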

To determine which patient variables were important in the physician diagnosis of aspiration pneumonia, we performed logistic regression with initial covariates comprising the demographic, comorbidity, and disease severity measurements listed in Table 1. We included interactions between cerebrovascular disease and age, nursing home status, and confusion to improve model fit. We centered all variables (including binary indicators) according to the method outlined by Kraemer and Blasey to improve interpretation of the main effects.[19]

Table 1. Patient Characteristics of the Community‐Acquired Pneumonia Organization Database Stratified by Aspiration Pneumonia

Characteristic | Aspiration Pneumonia (N=451) | Nonaspiration Pneumonia (N=4,734) | P Value
Demographics
Age, y | 79 (65–87) | 69 (53–80) | <0.001
% Male | 59% | 60% | 0.58
Nursing home residence | 25% | 5% | <0.001
Recent (30 days) antibiotic use | 21% | 16% | 0.017
Comorbidities
Cerebrovascular disease | 35% | 14% | <0.001
Chronic obstructive pulmonary disease | 25% | 27% | 0.62
Congestive heart failure | 23% | 19% | 0.027
Diabetes | 18% | 18% | 0.85
Cancer | 12% | 10% | 0.12
Renal disease | 10% | 11% | 0.53
Liver disease | 6% | 5% | 0.29
Disease severity
Pneumonia severity index | 123 (99–153) | 92 (68–117) | <0.001
Confusion | 49% | 12% | <0.001
PaO2 <60 mm Hg | 43% | 33% | <0.001
BUN >30 mg/dL | 42% | 23% | <0.001
Multilobar pneumonia | 34% | 28% | 0.003
Pleural effusion | 25% | 21% | 0.07
Respiratory rate >30 breaths/minute | 21% | 20% | 0.95
pH <7.35 | 13% | 5% | <0.001
Hematocrit <30% | 11% | 6% | 0.001
Temperature >37.8°C or <35.6°C | 9% | 7% | 0.30
Systolic blood pressure <90 mm Hg | 8% | 9% | 0.003
Sodium <130 mEq/L | 8% | 6% | 0.08
Heart rate >125 beats/minute | 8% | 5% | 0.71
Glucose >250 mg/dL | 6% | 7% | 0.06
Cavitary lesion | 0% | 0% | 0.67
Clinical outcomes
In‐hospital mortality | 23% | 9% | <0.001
Intensive care unit admission | 19% | 13% | 0.002
Hospital length of stay, d | 9 (5–15) | 7 (4–12) | <0.001
Time to clinical stability, d | 8 (4–8) | 4 (3–8) | <0.001

NOTE: All continuous data are median values (interquartile range), unless otherwise specified. Significance testing between groups was assessed using χ² or Mann‐Whitney U test, where appropriate. Abbreviations: BUN, blood urea nitrogen.

To determine if aspiration pneumonia had worse clinical outcomes compared to nonaspiration pneumonia, multiple methods were used. To compare the differences between the 2 groups with respect to time to clinical stability and length of hospital stay, we constructed Kaplan‐Meier survival curves and Cox proportional hazards regression models. The log‐rank test was used to determine statistical differences between the Kaplan‐Meier survival curves. To compare the impact of aspiration on mortality in patients with CAP, we conducted a propensity score–matched analysis. We chose propensity score matching over traditional logistic regression to balance variables among groups and to avoid the potential for overfit and multicollinearity. We considered a variable balanced after matching if its standardized difference was <10. All variables in the propensity score–matched analysis were balanced.
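The balance check can be illustrated for a binary covariate. The Python sketch below uses the common pooled-variance formula for the standardized difference and assumes the threshold of 10 is expressed in percent; neither detail is stated in the text, and the function name is hypothetical. The example uses the unmatched nursing home proportions reported in Table 1 (25% vs 5%).

```python
import math


def standardized_difference_binary(p_group1: float, p_group2: float) -> float:
    """Standardized difference, in percent, for a binary covariate.

    Assumption: pooled-variance formula
    100 * (p1 - p2) / sqrt((p1(1 - p1) + p2(1 - p2)) / 2).
    """
    pooled_variance = (
        p_group1 * (1 - p_group1) + p_group2 * (1 - p_group2)
    ) / 2.0
    if pooled_variance == 0:
        if p_group1 == p_group2:
            return 0.0
        raise ValueError("standardized difference is undefined for 0 vs 1")
    return 100.0 * (p_group1 - p_group2) / math.sqrt(pooled_variance)


# Nursing home residence before matching (Table 1): 25% vs 5%.
before_matching = standardized_difference_binary(0.25, 0.05)
print(round(before_matching, 1), abs(before_matching) < 10)
```

A value this far above 10 indicates marked imbalance in the unmatched cohort, which is the kind of difference matching is intended to remove.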

Although our dataset contained minimal missing data, we imputed any missing values to maintain the full study population in the creation of the propensity score. Missing data were imputed using the aregImpute function of the Hmisc package of R (The R Foundation for Statistical Computing, Vienna, Austria).[20, 21] We built the propensity score model using a variable selection algorithm described by Bursac et al.[22] Our model included variables for region (United States/Canada, Europe, Asia/Africa or Latin America) and the variables listed in Table 1, with the exception of the PSI and the 4 clinical outcomes. Given that previous analyses accounting for clustering by physician did not substantially affect our results,[23] our model did not include physician‐level variables and did not account for the clustering effects of physicians. Using the propensity scores generated from this model, we matched a case of aspiration CAP with a case of nonaspiration CAP.[24] We then constructed a general linear model using the matched dataset to obtain the magnitude of effect of aspiration on mortality.
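The text does not specify the matching algorithm beyond the cited reference, so the following Python sketch should be read as one plausible approach rather than the authors' method. It performs greedy 1:1 nearest-neighbor matching without replacement on precomputed propensity scores. The function name, the caliper of 0.05, and the example scores are all hypothetical.

```python
def greedy_match(case_scores, control_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    case_scores: propensity scores for aspiration CAP cases.
    control_scores: propensity scores for nonaspiration CAP cases.
    caliper: maximum allowed score distance (hypothetical default).
    Returns a list of (case_index, control_index) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for case_index, case_score in enumerate(case_scores):
        if not available:
            break
        # Closest unused control; ties are broken by the lower index.
        best = min(
            available,
            key=lambda j: (abs(control_scores[j] - case_score), j),
        )
        if abs(control_scores[best] - case_score) <= caliper:
            pairs.append((case_index, best))
            available.remove(best)
    return pairs


# Hypothetical scores for 2 aspiration cases and 3 candidate controls.
print(greedy_match([0.30, 0.70], [0.72, 0.31, 0.10]))
```

Greedy matching depends on the order in which cases are processed; optimal matching avoids this at greater computational cost.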

We used receiver operating characteristic curves to define the diagnostic accuracy of the pneumonia severity index for the prediction of mortality among patients with aspiration pneumonia and those with nonaspiration pneumonia. SAS version 9.3 (SAS Institute, Cary, NC) and R version 2.15.3 (The R Foundation for Statistical Computing) were used for all analyses. P values of ≤0.05 were considered statistically significant in all analyses.
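The area under a receiver operating characteristic curve has a direct interpretation: it is the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor. The Python sketch below computes it from that definition, with ties counted as one half. It is illustrative only; the function name and the 4 example scores and outcomes are hypothetical.

```python
def roc_auc(scores, died):
    """AUC via pairwise comparison of scores between deaths and survivors.

    scores: severity scores (for example, PSI values).
    died: matching sequence of truthy/falsy outcome flags.
    """
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for death_score in deaths:
        for survivor_score in survivors:
            if death_score > survivor_score:
                concordant += 1.0
            elif death_score == survivor_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(deaths) * len(survivors))


# Hypothetical scores and outcomes for 4 patients.
print(roc_auc([150, 120, 90, 60], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of patients; rank-based formulas give the same value more efficiently on large datasets.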

RESULTS

Our initial query, after exclusion criteria, yielded a study population of 5185 patients (Figure 1). We compared 451 patients diagnosed with aspiration pneumonia to 4734 with nonaspiration CAP (Figure 1). Patient characteristics are summarized in Table 1. Patients with aspiration pneumonia were older, more likely to live in a nursing home, had greater disease severity, and were more likely to be admitted to an ICU. Patients with aspiration pneumonia had longer adjusted hospital lengths of stay and took more days to achieve clinical stability than patients with nonaspiration pneumonia (Figure 2). After adjusting for all variables in Table 1, the Cox proportional hazards models demonstrated that aspiration pneumonia was associated with ongoing hospitalization (hazard ratio [HR] for discharge: 0.77, 95% confidence interval [CI]: 0.65‐0.91, P=0.002) and clinical instability (HR for attaining clinical stability: 0.72, 95% CI: 0.61‐0.84, P<0.001). Patients with aspiration pneumonia presented with greater disease severity than those with nonaspiration pneumonia. Although there was no difference between groups in regard to temperature, respiratory rate, hyponatremia, or presence of pleural effusions or cavitary lesions, all other measured indices of disease severity were worse in patients with aspiration pneumonia. Patients with aspiration pneumonia were more likely to have cerebrovascular disease than those with nonaspiration pneumonia. Aspiration pneumonia patients also had increased prevalence of congestive heart failure. There was no appreciable difference between groups among other measured comorbidities.

Figure 1
Patient selection from June 2001 to December 2012. Abbreviations: CAP, community‐acquired pneumonia; HIV, human immunodeficiency virus.
Figure 2
Kaplan‐Meier graph of hospital length of stay (A) and time to clinical stability (B).

The patient characteristics most associated with a physician diagnosis of aspiration pneumonia, identified using logistic regression, were confusion, residence in a nursing home, and presence of cerebrovascular disease (odds ratios [ORs] of 4.4, 2.9, and 2.3, respectively), whereas renal disease was associated with decreased physician diagnosis of aspiration pneumonia over nonaspiration pneumonia (OR: 0.58) (Table 2).

Table 2. Final Logistic Regression Model for Physician Diagnosis of Aspiration Pneumonia

Covariate | Odds Ratio | 95% Confidence Interval | P Value
Demographics
Age, y | 1.00 | 0.99–1.01 | 0.948
Male | 1.20 | 0.94–1.54 | 0.148
Nursing home residence | 2.93 | 2.13–4.00 | <0.001
Comorbidities
Cerebrovascular disease | 2.26 | 1.53–3.32 | <0.001
Renal disease | 0.58 | 0.39–0.85 | 0.006
Disease severity
Confusion | 4.41 | 3.40–5.72 | <0.001
Hematocrit <30% | 1.59 | 1.06–2.33 | 0.020
pH <7.35 | 1.67 | 1.10–2.47 | 0.013
Temperature >37.8°C or <35.6°C | 1.60 | 1.07–2.35 | 0.019
Multilobar pneumonia | 1.29 | 1.00–1.65 | 0.047
Interaction terms
Age * cerebrovascular disease | 0.98 | 0.96–0.99 | 0.011
Nursing home * cerebrovascular disease | 0.51 | 0.27–0.96 | 0.037
Confusion * cerebrovascular disease | 0.70 | 0.42–1.17 | 0.175

NOTE: The initial model included all demographic, comorbidity, and disease severity measurements from Table 1. Parameter estimates are for mean‐centered variables. Renal disease is defined as having a clinical diagnosis in the medical record. Although other interaction terms were used in the initial model, they were eliminated from the final model. We centered all variables (including binary indicators) according to the method described by Kraemer and Blasey.[19] The area under the curve of the final model is 0.79.

Observed in‐patient mortality of aspiration pneumonia was 23%. This mortality was considerably higher than a mean PSI score of 123 would predict (class IV risk group, with expected 30‐day mortality of 8%–9%[25]). The PSI score's ability to predict inpatient mortality in patients with aspiration pneumonia was moderate, with an area under the curve (AUC) of 0.71. This was similar to its performance in patients with nonaspiration pneumonia (AUC of 0.75) (Figure 3). These values are lower than the AUC of 0.81 for the PSI in predicting mortality derived from a meta‐analysis of 31 other studies.[26]

Figure 3
Receiver operating characteristic curve of pneumonia severity index score and inpatient mortality. Abbreviations: AUC, area under the curve.

Our regression model after propensity score matching demonstrated that aspiration pneumonia independently confers a 2.3‐fold increased odds for inpatient mortality (95% CI: 1.56‐3.45, P<0.001).

DISCUSSION

Pneumonia patients with confusion, nursing home residence, or cerebrovascular disease are more likely to be diagnosed with aspiration pneumonia by clinicians. Although this is unsurprising, it is notable that these patients are more than twice as likely to die in the inpatient setting, even after accounting for age, comorbidities, and disease severity. These findings are similar to three previously published studies comparing aspiration and nonaspiration pneumonia at single institutions, albeit using different aspiration pneumonia definitions.[13, 14, 15] This study is the first large, multicenter, multinational study to demonstrate these findings.

Central to the interpretation of our results is the method of diagnosing aspiration versus nonaspiration. A bottom‐up method that relies on a clinician to check a box for aspiration may appear poorly reproducible. Because there is no diagnostic gold standard, clinicians may use different criteria to diagnose aspiration, creating potential for idiosyncratic noise. The strength of the wisdom of the crowd method used in this study is that an aggregate estimation from independent judgments may reduce the noise from individual judgments.[16] Although clinicians may vary in why they diagnose a particular patient as having aspiration pneumonia, it appears that the overwhelming reason for diagnosing a patient as having aspiration pneumonia is the presence of confusion, followed by previous nursing home residence or cerebrovascular disease. This finding has some face validity when compared with studies using an investigator definition, as altered mental status, chronic debility, and cerebrovascular disease are either prominent features of the definition of aspiration pneumonia[8] or frequently observed in patients with aspiration pneumonia.[13, 15] The distribution of cerebrovascular disease among our study's aspiration and nonaspiration pneumonia patients was similar to studies that used formal criteria in their definitions.[13, 15] Although nursing home residence was more likely in aspiration pneumonia patients, the majority of aspiration pneumonia patients were residing in the community, suggesting that aspiration is not simply a surrogate for healthcare‐associated pneumonia. Although patients with aspiration pneumonia are typically older than their nonaspiration counterparts, it appears that age is not a key determinant in the diagnosis of aspiration. With aspiration pneumonia, confusion, nursing home residence, and the presence of cerebrovascular disease are the greatest contributors in the clinical diagnosis, more than age.

Our data demonstrate that aspiration pneumonia confers increased odds for mortality, even after adjustment for age, disease severity, and comorbidities. These data suggest that aspiration pneumonia is a distinct entity from nonaspiration pneumonia, and that this disease is worse than nonaspiration CAP. If aspiration pneumonia is distinct from nonaspiration pneumonia, some unrecognized host factor other than age, disease severity, or the captured comorbidities decreases survival in aspiration pneumonia patients. However, it is also possible that aspiration pneumonia is merely a clinical designation for one end of the pneumonia spectrum, and we and others have failed to completely account for all measures of disease severity or all measures of comorbidities. Examples of unmeasured comorbidities would include presence of oropharyngeal dysphagia, which is not assessed in the database but could have a significant effect on clinical diagnosis. Unmeasured covariates can include measures beyond that of disease severity or comorbidity, such as the presence of a do not resuscitate (DNR) order, which could have a significant confounding effect on the observed association. A previous, single‐center study demonstrated that increased 30‐day mortality in aspiration pneumonia was mostly attributable to greater disease severity and comorbidities, although aspiration pneumonia independently conferred greater risk for adverse long‐term outcomes.[15] We propose that aspiration pneumonia represents a clinically distinct entity from nonaspiration pneumonia. 
Patients with chronic aspiration are often chronically malnourished and may have different oral flora than patients without chronic aspiration.[27, 28] Chronic aspiration has been associated with granulomatous reaction, organizing pneumonia, diffuse alveolar damage, and chronic bronchiolitis.[29] Chronic aspiration may elicit changes in the host physiology, and may render the host more susceptible to the development of secondary bacterial infection with morbid consequences.

The ability of the PSI to predict inpatient mortality was moderate (AUC only 0.7), with no significant additional discrimination between the aspiration and nonaspiration pneumonia groups. Although the PSI had moderate ability to predict inpatient mortality, the observed mortality was considerably higher than predicted. It is possible that the PSI incompletely captures clinically relevant comorbidities (eg, malnutrition). Further study to improve mortality prediction of aspiration pneumonia patients could employ sensitivity analysis to determine optimal thresholds and weighting of the PSI components.

Patients with aspiration pneumonia had longer hospital lengths of stay and took longer to achieve clinical stability than their nonaspiration counterparts. Time to clinical stability has been associated with increased posthospitalization mortality and is associated with time to switch from intravenous to oral antibiotics.[17] Although some component of hospital length‐of‐stay is subject to local practice patterns, time to clinical stability has explicit criteria for clinical improvement and failure, and therefore is less likely to be affected by local practice patterns.

We noted a relatively high (16%–21%) incidence of prior antibiotic use among patients in this database. Analysis of antibiotic prescription patterns was limited, given the several different countries from which the database draws its cases. Although we used accepted criteria to define CAP cases, it is possible that this population may have a higher rate of resistant or uncommon pathogens than other studies of CAP that have populations with lower incidence of prior antibiotic use. Although not assessed, we suspect a significant component of the prior antibiotic use represented outpatient pneumonia treatment during the few days prior to visiting the hospital.

This study has several limitations, of which the most important may be that we used clinical determination for defining presence of aspiration pneumonia. This method is susceptible to the subjective perceptions of the treating clinician. We did not account for the effect of individual physicians in our model, although we did adjust for regional differences. The retrospective identification of patients allows for the possibility of selection bias, and therefore we have not attempted to make inferences regarding the relative incidence of pneumonia, nor did we adjust for temporal trends in diagnosis. The ratio of aspiration pneumonia patients to nonaspiration pneumonia patients may not necessarily reflect that observed in reality. Microbiologic and antibiotic data were unavailable for analysis. This study cannot inform on nonhospitalized patients with aspiration pneumonia, as only hospitalized patients were enrolled. The database identified cases of pneumonia, so it is possible for a patient to enter into the database more than once. Detection of mortality was limited to the inpatient setting rather than a set interval of 30 days. Inpatient mortality depends on length‐of‐stay patterns that may bias the mortality endpoint.[30] Also not assessed was the presence of a DNR order. It is possible that an older patient with greater comorbidities and disease severity may have care intentionally limited or withdrawn early by the family or clinicians.

Strengths of the study include its size and its multicenter, multinational population. The CAPO database is a large and well‐described population of patients with CAP.[17, 31] These attributes, as well as the clinician‐determined diagnosis, increase the generalizability of the study compared to a single‐center, single‐country study that employs investigator‐defined criteria.

CONCLUSION

Pneumonia patients who have confusion, reside in a nursing home, or have cerebrovascular disease are more likely to be diagnosed with aspiration pneumonia by clinicians. Our clinician‐diagnosed cohort appears similar to those derived using an investigator definition. Patients with aspiration pneumonia are older, and have greater disease severity and more comorbidities than patients with nonaspiration pneumonia. They have greater mortality than their PSI score class would predict. Even after accounting for age, disease severity, and comorbidities, the presence of aspiration pneumonia independently conferred a greater than 2‐fold increase in inpatient mortality. These findings together suggest that aspiration pneumonia should be considered a distinct entity from typical pneumonia, and that additional research should be done in this field.

ACKNOWLEDGMENTS

Disclosures: M.J.L. contributed to the study design, data analysis, statistical analysis, and writing of the manuscript. P.P. contributed to the study design and revision of the manuscript for important intellectual content. T.W. and E.W. contributed to the study design, statistical analysis, and revision of the manuscript for important intellectual content. J.A.R. and N.C.D. contributed to the study design and revision of the manuscript for important intellectual content. All authors read and approved the final manuscript. M.L. takes responsibility for the integrity of the work as a whole, from inception to published article. This investigation was partly supported with funding from the National Center for Research Resources and the National Center for Advancing Translational Sciences, National Institutes of Health (grant 8UL1TR000105 [formerly UL1RR025764]). The authors report no conflicts of interest.

References
  1. Torres A, Serra‐Batlles J, Ferrer A, et al. Severe community‐acquired pneumonia. Epidemiology and prognostic factors. Am Rev Respir Dis. 1991;144(2):312–318.
  2. Koivula I, Sten M, Makela PH. Risk factors for pneumonia in the elderly. Am J Med. 1994;96(4):313–320.
  3. Marik PE, Kaplan D. Aspiration pneumonia and dysphagia in the elderly. Chest. 2003;124(1):328–336.
  4. Mylotte JM, Goodnough S, Naughton BJ. Pneumonia versus aspiration pneumonitis in nursing home residents: diagnosis and management. J Am Geriatr Soc. 2003;51(1):17–23.
  5. Marik PE. Aspiration pneumonia: mixing apples with oranges and tangerines. Crit Care Med. 2004;32(5):1236; author reply 1236–1237.
  6. Kozlow JH, Berenholtz SM, Garrett E, Dorman T, Pronovost PJ. Epidemiology and impact of aspiration pneumonia in patients undergoing surgery in Maryland, 1999–2000. Crit Care Med. 2003;31(7):1930–1937.
  7. Marik PE. Aspiration syndromes: aspiration pneumonia and pneumonitis. Hosp Pract (Minneap). 2010;38(1):35–42.
  8. Marik PE. Aspiration pneumonitis and aspiration pneumonia. N Engl J Med. 2001;344(9):665–671.
  9. Lim WS, Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58(5):377–382.
  10. Fine MJ, Hanusa BH, Lave JR, et al. Comparison of a disease‐specific and a generic severity of illness measure for patients with community‐acquired pneumonia. J Gen Intern Med. 1995;10(7):359–368.
  11. Espana PP, Capelastegui A, Gorordo I, et al. Development and validation of a clinical prediction rule for severe community‐acquired pneumonia. Am J Respir Crit Care Med. 2006;174(11):1249–1256.
  12. Jones BE, Jones J, Bewick T, et al. CURB‐65 pneumonia severity assessment adapted for electronic decision support. Chest. 2011;140(1):156–163.
  13. Lanspa MJ, Jones BE, Brown SM, Dean NC. Mortality, morbidity, and disease severity of patients with aspiration pneumonia. J Hosp Med. 2013;8(2):83–90.
  14. Heppner HJ, Sehlhoff B, Niklaus D, Pientka L, Thiem U. Pneumonia Severity Index (PSI), CURB‐65, and mortality in hospitalized elderly patients with aspiration pneumonia [in German]. Z Gerontol Geriatr. 2011;44(4):229–234.
  15. Taylor JK, Fleming GB, Singanayagam A, Hill AT, Chalmers JD. Risk factors for aspiration in community‐acquired pneumonia: analysis of a hospitalized UK cohort. Am J Med. 2013;126(11):995–1001.
  16. Yi SK, Steyvers M, Lee MD, Dry MJ. The wisdom of the crowd in combinatorial problems. Cogn Sci. 2012;36(3):452–470.
  17. Aliberti S, Peyrani P, Filardo G, et al. Association between time to clinical stability and outcomes after discharge in hospitalized patients with community‐acquired pneumonia. Chest. 2011;140(2):482–488.
  18. Ramirez JA. Clinical stability and switch therapy in hospitalised patients with community‐acquired pneumonia: are we there yet? Eur Respir J. 2013;41(1):5–6.
  19. Kraemer HC, Blasey CM. Centring in regression analyses: a strategy to prevent errors in statistical inference. Int J Methods Psychiatr Res. 2004;13(3):141–151.
  20. Harrell FE. Hmisc: Harrell miscellaneous. Available at: http://CRAN.R‐project.org/package=Hmisc. Published Sept 12, 2014. Last accessed Oct 27, 2014.
  21. Heitjan DF, Little RJA. Multiple imputation for the fatal accident reporting system. J R Stat Soc Ser C Appl Stat. 1991;40(1):13–29.
  22. Bursac Z, Gauss CH, Williams DK, Hosmer DW. Purposeful selection of variables in logistic regression. Source Code Biol Med. 2008;3:17.
  23. Arnold FW, Wiemken TL, Peyrani P, Ramirez JA, Brock GN; CAPO authors. Mortality differences among hospitalized patients with community‐acquired pneumonia in three world regions: results from the Community‐Acquired Pneumonia Organization (CAPO) International Cohort Study. Respir Med. 2013;107(7):1101–1111.
  24. Parsons L. Reducing bias in a propensity score matched‐pair sample using greedy matching techniques. In: Proceedings of the 26th Annual SAS Users Group International Conference. Cary, NC: SAS Institute Inc.; 2001:214–226. Available at: http://www2.sas.com/proceedings/sugi26/p214–26.pdf. Last accessed Oct 27, 2014.
  25. Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336(4):243–250.
  26. Chalmers JD, Singanayagam A, Akram AR, et al. Severity assessment tools for predicting mortality in hospitalised patients with community‐acquired pneumonia. Systematic review and meta‐analysis. Thorax. 2010;65(10):878–883.
  27. Cabre M, Serra‐Prat M, Palomera E, Almirall J, Pallares R, Clave P. Prevalence and prognostic implications of dysphagia in elderly patients with pneumonia. Age Ageing. 2010;39(1):39–45.
  28. Pace CC, McCullough GH. The association between oral microorgansims and aspiration pneumonia in the institutionalized elderly: review and recommendations. Dysphagia. 2010;25(4):307–322.
  29. Yousem SA, Faber C. Histopathology of aspiration pneumonia not associated with food or other particulate matter: a clinicopathologic study of 10 cases diagnosed on biopsy. Am J Surg Pathol. 2011;35(3):426–431.
  30. Jencks SF, Daley J, Draper D, Thomas N, Lenhart G, Walker J. Interpreting hospital mortality data. The role of clinical risk adjustment. JAMA. 1988;260(24):3611–3616.
  31. Arnold FW, Ramirez JA, McDonald LC, Xia EL. Hospitalization for community‐acquired pneumonia: the pneumonia severity index vs clinical judgment. Chest. 2003;124(1):121–124.
Issue
Journal of Hospital Medicine - 10(2)
Page Number
90-96

Pneumonia is a common clinical syndrome with well‐described epidemiology and microbiology. Aspiration pneumonia comprises 5% to 15% of patients with pneumonia acquired outside of the hospital,[1] but is less well characterized despite being a major syndrome of pneumonia in the elderly.[2, 3] Difficulties in studying aspiration pneumonia include the lack of a sensitive and specific marker for aspiration as well as the potential overlap between aspiration pneumonia and other forms of pneumonia.[4, 5, 6] Additionally, clinicians have difficulty distinguishing between aspiration pneumonia, which develops after the aspiration of oropharyngeal contents, and aspiration pneumonitis, wherein inhalation of gastric contents causes inflammation without the subsequent development of bacterial infection.[7, 8] Central to the study of aspiration pneumonia is whether it should exist as its own entity, or if aspiration is really a designation used for pneumonia in an older patient with greater comorbidities. The ability to clearly understand how a clinician diagnoses aspiration pneumonia, and whether that method has face validity with expert definitions may allow for improved future research, improved generalizability of current or past research, and possibly better clinical care.

Several validated mortality prediction models exist for community‐acquired pneumonia (CAP) using a variety of clinical predictors, but their performance in patients with aspiration pneumonia is less well characterized. Most studies validating pneumonia severity scoring systems excluded aspiration pneumonia from their study population.[9, 10, 11] Severity scoring systems for CAP may not accurately predict disease severity in patients with aspiration pneumonia. The CURB‐65[9] (confusion, uremia, respiratory rate, blood pressure, age ≥65 years) and the eCURB[12] scoring systems are poor predictors of mortality in patients with aspiration pneumonia, perhaps because they do not account for patient comorbidities.[13] The pneumonia severity index (PSI)[10] might predict mortality better than CURB‐65 in the aspiration population due to the inclusion of comorbidities.

Previous studies have demonstrated that patients with aspiration pneumonia are older and have greater disease severity and more comorbidities.[13, 14, 15] These single‐center studies also demonstrated greater mortality, more frequent admission to an intensive care unit (ICU), and longer hospital lengths of stay in patients with aspiration pneumonia. These studies identified aspiration pneumonia by the presence of a risk factor for aspiration[15] or by physician billing codes.[13] In practice, however, the bedside clinician diagnoses a patient as having aspiration pneumonia, but the logic is likely vague and inconsistent. Despite the potential for variability with individual judgment, an aggregate estimation from independent judgments may perform better than individual judgments.[16] Because there is no gold standard for defining aspiration pneumonia, all previous research has been limited to definitions created by investigators. This multicenter study seeks to determine what clinical characteristics lead physicians to diagnose a patient as having aspiration pneumonia, and whether or not the clinician‐derived diagnosis is distinct and clinically useful.

Our objectives were to: (1) identify covariates associated with bedside clinicians diagnosing a pneumonia patient as having aspiration pneumonia; (2) compare aspiration pneumonia and nonaspiration pneumonia in regard to disease severity, patient demographics, comorbidities, and clinical outcomes; and (3) measure the performance of the PSI in aspiration pneumonia versus nonaspiration pneumonia.

PATIENTS AND METHODS

Study Design and Setting

We performed a secondary analysis of the Community‐Acquired Pneumonia Organization (CAPO) database, which contains retrospectively collected data from 71 hospitals in 16 countries between June 2001 and December 2012. In each participating center, primary investigators selected nonconsecutive, adult hospitalized patients diagnosed with CAP. To decrease systematic selection biases, the selection of patients with CAP for enrollment in the trial was based on the date of hospital admission. Each investigator completed a case report form that was transferred via the internet to the CAPO study center at the University of Louisville (Louisville, KY). A sample of the data collection form is available at the study website (www.caposite.com). Validation of data quality was performed at the study center before the case was entered into the CAPO database. Local institutional review board approval was obtained for each study site.

Inclusion and Exclusion Criteria

Patients ≥18 years of age who satisfied criteria for CAP were included in this study. A diagnosis of CAP required a new pulmonary infiltrate at time of hospitalization, and at least 1 of the following: new or increased cough; leukocytosis, leukopenia, or left shift pattern on white blood cell count; and temperature >37.8°C or <35.6°C. We excluded patients with pneumonia attributed to mycobacterial or fungal infection, and patients infected with human immunodeficiency virus, as we believed these types of pneumonia differ fundamentally from typical CAP.
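The inclusion and exclusion rules above can be read as a simple predicate. The following is only an illustrative sketch: the `Presentation` record and all of its field names are hypothetical and are not taken from the CAPO case report form, and the age cutoff of 18 years or older is an assumption based on the description of the cohort as adult patients.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical admission summary; field names are assumptions."""
    age_years: int
    new_infiltrate: bool
    new_or_increased_cough: bool
    abnormal_wbc: bool  # leukocytosis, leukopenia, or left shift
    temperature_c: float
    mycobacterial_or_fungal: bool = False
    hiv_positive: bool = False


def meets_cap_criteria(p: Presentation) -> bool:
    """New infiltrate plus at least one supporting finding."""
    abnormal_temp = p.temperature_c > 37.8 or p.temperature_c < 35.6
    supporting = p.new_or_increased_cough or p.abnormal_wbc or abnormal_temp
    return p.new_infiltrate and supporting


def is_eligible(p: Presentation) -> bool:
    """Adult with CAP and none of the stated exclusions."""
    excluded = p.mycobacterial_or_fungal or p.hiv_positive
    # Assumption: "adult" means 18 years of age or older.
    return p.age_years >= 18 and meets_cap_criteria(p) and not excluded
```

Under this reading, a 72-year-old with a new infiltrate and a temperature of 38.4°C qualifies even without cough or an abnormal white cell count, whereas the same findings without an infiltrate do not.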

Patient Variables

Patient variables included presence of aspiration pneumonia, laboratory data, comorbidities, and measures of disease severity, including the PSI. The clinician made a clinical diagnosis of the presence or absence of aspiration for each patient by marking a box on the case report form. Outcomes included in‐hospital mortality, hospital length of stay up to 14 days, and time to clinical stability up to 8 days. All variables were obtained directly from the case report form. In accordance with previously published definitions, we defined clinical stability as the day the following criteria were all met: improved clinical signs (improved cough and shortness of breath), lack of fever for >8 hours, improving leukocytosis (decreased at least 10% from the previous day), and tolerating oral intake.[17, 18]

Statistical Analysis

Baseline characteristics of patients with aspiration and nonaspiration CAP were compared using χ² or Fisher exact tests for categorical variables and the Mann‐Whitney U test for continuous variables.

To determine which patient variables were important in the physician diagnosis of aspiration pneumonia, we performed logistic regression with initial covariates comprising the demographic, comorbidity, and disease severity measurements listed in Table 1. We included interactions between cerebrovascular disease and age, nursing home status, and confusion to improve model fit. We centered all variables (including binary indicators) according to the method outlined by Kraemer and Blasey to improve interpretation of the main effects.[19]

Table 1. Patient Characteristics of the Community‐Acquired Pneumonia Organization Database Stratified by Aspiration Pneumonia

| Characteristic | Aspiration Pneumonia, N=451 | Nonaspiration Pneumonia, N=4,734 | P Value |
| --- | --- | --- | --- |
| Demographics | | | |
| Age, y | 79 (65–87) | 69 (53–80) | <0.001 |
| % Male | 59% | 60% | 0.58 |
| Nursing home residence | 25% | 5% | <0.001 |
| Recent (30 days) antibiotic use | 21% | 16% | 0.017 |
| Comorbidities | | | |
| Cerebrovascular disease | 35% | 14% | <0.001 |
| Chronic obstructive pulmonary disease | 25% | 27% | 0.62 |
| Congestive heart failure | 23% | 19% | 0.027 |
| Diabetes | 18% | 18% | 0.85 |
| Cancer | 12% | 10% | 0.12 |
| Renal disease | 10% | 11% | 0.53 |
| Liver disease | 6% | 5% | 0.29 |
| Disease severity | | | |
| Pneumonia severity index | 123 (99–153) | 92 (68–117) | <0.001 |
| Confusion | 49% | 12% | <0.001 |
| PaO2 <60 mm Hg | 43% | 33% | <0.001 |
| BUN >30 mg/dL | 42% | 23% | <0.001 |
| Multilobar pneumonia | 34% | 28% | 0.003 |
| Pleural effusion | 25% | 21% | 0.07 |
| Respiratory rate >30 breaths/minute | 21% | 20% | 0.95 |
| pH <7.35 | 13% | 5% | <0.001 |
| Hematocrit <30% | 11% | 6% | 0.001 |
| Temperature >37.8°C or <35.6°C | 9% | 7% | 0.30 |
| Systolic blood pressure <90 mm Hg | 8% | 9% | 0.003 |
| Sodium <130 mEq/L | 8% | 6% | 0.08 |
| Heart rate >125 beats/minute | 8% | 5% | 0.71 |
| Glucose >250 mg/dL | 6% | 7% | 0.06 |
| Cavitary lesion | 0% | 0% | 0.67 |
| Clinical outcomes | | | |
| In‐hospital mortality | 23% | 9% | <0.001 |
| Intensive care unit admission | 19% | 13% | 0.002 |
| Hospital length of stay, d | 9 (5–15) | 7 (4–12) | <0.001 |
| Time to clinical stability, d | 8 (4–8) | 4 (3–8) | <0.001 |

NOTE: All continuous data are median values (interquartile range), unless otherwise specified. Significance testing between groups was assessed using χ² or Mann‐Whitney U test, where appropriate. Abbreviations: BUN, blood urea nitrogen.

To determine if aspiration pneumonia had worse clinical outcomes compared to nonaspiration pneumonia, multiple methods were used. To compare the differences between the 2 groups with respect to time to clinical stability and length of hospital stay, we constructed Kaplan‐Meier survival curves and Cox proportional hazards regression models. The log‐rank test was used to determine statistical differences between the Kaplan‐Meier survival curves. To compare the impact of aspiration on mortality in patients with CAP, we conducted a propensity score–matched analysis. We chose propensity score matching over traditional logistic regression to balance variables among groups and to avoid the potential for overfit and multicollinearity. We considered a variable balanced after matching if its standardized difference was <10. All variables in the propensity score–matched analysis were balanced.
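The balance check described above can be illustrated with a short sketch. The article does not state the formula it used, so this assumes the commonly used standardized difference for a binary covariate, expressed in percent, with the threshold of 10 read in those units. The function names are illustrative.

```python
import math


def standardized_difference_binary(p_case: float, p_control: float) -> float:
    """Standardized difference, in percent, between two proportions."""
    pooled_var = (p_case * (1 - p_case) + p_control * (1 - p_control)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1: either identical or maximally different.
        if p_case == p_control:
            return 0.0
        return math.copysign(math.inf, p_case - p_control)
    return 100 * (p_case - p_control) / math.sqrt(pooled_var)


def is_balanced(p_case: float, p_control: float, threshold: float = 10.0) -> bool:
    """Assumed rule: balanced when the absolute standardized difference is below 10."""
    return abs(standardized_difference_binary(p_case, p_control)) < threshold
```

Applied to the unmatched proportions in Table 1, nursing home residence (25% vs 5%) gives a standardized difference of roughly 58, far above the threshold, whereas diabetes (18% vs 18%) gives 0. The matching step is meant to bring every covariate below the threshold.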

Although our dataset contained minimal missing data, we imputed any missing values to maintain the full study population in the creation of the propensity score. Missing data were imputed using the aregImpute function of the Hmisc package of R (The R Foundation for Statistical Computing, Vienna, Austria).[20, 21] We built the propensity score model using a variable selection algorithm described by Bursac et al.[22] Our model included variables for region (United States/Canada, Europe, Asia/Africa or Latin America) and the variables listed in Table 1, with the exception of the PSI and the 4 clinical outcomes. Given that previous analyses accounting for clustering by physician did not substantially affect our results,[23] our model did not include physician‐level variables and did not account for the clustering effects of physicians. Using the propensity scores generated from this model, we matched a case of aspiration CAP with a case of nonaspiration CAP.[24] We then constructed a general linear model using the matched dataset to obtain the magnitude of effect of aspiration on mortality.
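To make the 1:1 matching step concrete, here is a minimal sketch of greedy nearest‐neighbor matching on a propensity score. It is not the authors' procedure: the cited greedy algorithm matches on successively fewer digits of the score, whereas this simplified version uses a fixed caliper. The caliper value, the processing order, and the identifiers are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def greedy_match(
    case_scores: Dict[str, float],
    control_scores: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Pair each case with the closest unused control within the caliper."""
    available = dict(control_scores)  # controls not yet matched
    pairs: List[Tuple[str, str]] = []
    # Assumption: handle cases from highest to lowest propensity score.
    ordered = sorted(case_scores.items(), key=lambda kv: kv[1], reverse=True)
    for case_id, score in ordered:
        if not available:
            break
        best_id = min(available, key=lambda cid: abs(available[cid] - score))
        if abs(available[best_id] - score) <= caliper:
            pairs.append((case_id, best_id))
            del available[best_id]  # each control is used at most once
    return pairs
```

Cases with no control inside the caliper are left unmatched, so the matched dataset can be smaller than the original aspiration group.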

We used receiver operating characteristic curves to define the diagnostic accuracy of the pneumonia severity index for the prediction of mortality among patients with aspiration pneumonia and those with nonaspiration pneumonia. SAS version 9.3 (SAS Institute, Cary, NC) and R version 2.15.3 (The R Foundation for Statistical Computing) were used for all analyses. P values of ≤0.05 were considered statistically significant in all analyses.
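The area under a receiver operating characteristic curve has a direct interpretation: it is the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it that way from two lists of scores. The function name and example scores are hypothetical and are not study data.

```python
from typing import Sequence


def auc(scores_died: Sequence[float], scores_survived: Sequence[float]) -> float:
    """AUC as the fraction of (died, survived) pairs ranked correctly."""
    if not scores_died or not scores_survived:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for d in scores_died:
        for s in scores_survived:
            if d > s:
                wins += 1.0  # non-survivor scored higher: correct ranking
            elif d == s:
                wins += 0.5  # tie counts as half
    return wins / (len(scores_died) * len(scores_survived))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but slower than rank‐based implementations.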

RESULTS

Our initial query, after exclusion criteria, yielded a study population of 5185 patients (Figure 1). We compared 451 patients diagnosed with aspiration pneumonia to 4734 with nonaspiration CAP. Patient characteristics are summarized in Table 1. Patients with aspiration pneumonia were older, more likely to live in a nursing home, had greater disease severity, and were more likely to be admitted to an ICU. Patients with aspiration pneumonia had longer adjusted hospital lengths of stay and took more days to achieve clinical stability than patients with nonaspiration pneumonia (Figure 2). After adjusting for all variables in Table 1, the Cox proportional hazards models demonstrated that aspiration pneumonia was associated with ongoing hospitalization (hazard ratio [HR] for discharge: 0.77, 95% confidence interval [CI]: 0.65‐0.91, P=0.002) and clinical instability (HR for attaining clinical stability: 0.72, 95% CI: 0.61‐0.84, P<0.001). Patients with aspiration pneumonia presented with greater disease severity than those with nonaspiration pneumonia. Although there was no difference between groups in regard to temperature, respiratory rate, hyponatremia, or presence of pleural effusions or cavitary lesions, all other measured indices of disease severity were worse in patients with aspiration pneumonia. Patients with aspiration pneumonia were more likely to have cerebrovascular disease than those with nonaspiration pneumonia. Aspiration pneumonia patients also had increased prevalence of congestive heart failure. There was no appreciable difference between groups among other measured comorbidities.

Figure 1
Patient selection from June 2001 to December 2012. Abbreviations: CAP, community‐acquired pneumonia; HIV, human immunodeficiency virus.
Figure 2
Kaplan‐Meier graph of hospital length of stay (A) and time to clinical stability (B).

The patient characteristics most associated with a physician diagnosis of aspiration pneumonia, identified using logistic regression, were confusion, residence in a nursing home, and presence of cerebrovascular disease (odds ratios [ORs] of 4.4, 2.9, and 2.3, respectively), whereas renal disease was associated with decreased physician diagnosis of aspiration pneumonia over nonaspiration pneumonia (OR: 0.58) (Table 2).

Table 2. Final Logistic Regression Model for Physician Diagnosis of Aspiration Pneumonia

| Covariate | Odds Ratio | 95% Confidence Interval | P Value |
| --- | --- | --- | --- |
| Demographics | | | |
| Age, y | 1.00 | 0.99–1.01 | 0.948 |
| Male | 1.20 | 0.94–1.54 | 0.148 |
| Nursing home residence | 2.93 | 2.13–4.00 | <0.001 |
| Comorbidities | | | |
| Cerebrovascular disease | 2.26 | 1.53–3.32 | <0.001 |
| Renal disease | 0.58 | 0.39–0.85 | 0.006 |
| Disease severity | | | |
| Confusion | 4.41 | 3.40–5.72 | <0.001 |
| Hematocrit <30% | 1.59 | 1.06–2.33 | 0.020 |
| pH <7.35 | 1.67 | 1.10–2.47 | 0.013 |
| Temperature >37.8°C or <35.6°C | 1.60 | 1.07–2.35 | 0.019 |
| Multilobar pneumonia | 1.29 | 1.00–1.65 | 0.047 |
| Interaction terms | | | |
| Age * cerebrovascular disease | 0.98 | 0.96–0.99 | 0.011 |
| Nursing home * cerebrovascular disease | 0.51 | 0.27–0.96 | 0.037 |
| Confusion * cerebrovascular disease | 0.70 | 0.42–1.17 | 0.175 |

NOTE: The initial model included all demographic, comorbidity, and disease severity measurements from Table 1. Parameter estimates are for mean‐centered variables. Renal disease is defined as having a clinical diagnosis in the medical record. Although other interaction terms were used in the initial model, they were eliminated from the final model. We centered all variables (including binary indicators) according to the method described by Kraemer and Blasey.[19] The area under the curve of the final model is 0.79.
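For readers less familiar with logistic regression output, the sketch below shows how an odds ratio and its 95% confidence interval relate to the underlying coefficient and standard error. It assumes a Wald‐type interval (estimate ± 1.96 standard errors on the log‐odds scale), which the article does not state explicitly. The function names are illustrative.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (odds ratio, lower limit, upper limit) from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_ci(lower: float, upper: float, z: float = 1.96) -> Tuple[float, float]:
    """Recover the coefficient and standard error implied by a Wald interval."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se
```

As a consistency check, the reported interval for confusion (3.40 to 5.72) implies a point estimate equal to the geometric mean of its limits, which is about 4.41 and matches the reported odds ratio.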

Observed in‐patient mortality of aspiration pneumonia was 23%. This mortality was considerably higher than a mean PSI score of 123 would predict (class IV risk group, with expected 30‐day mortality of 8%–9%[25]). The PSI score's ability to predict inpatient mortality in patients with aspiration pneumonia was moderate, with an area under the curve (AUC) of 0.71. This was similar to its performance in patients with nonaspiration pneumonia (AUC of 0.75) (Figure 3). These values are lower than the AUC of 0.81 for the PSI in predicting mortality derived from a meta‐analysis of 31 other studies.[26]
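The mapping from a PSI point total to a risk class can be written as a small lookup. The sketch below uses the point thresholds from the original PSI derivation (class II up to 70 points, class III 71 to 90, class IV 91 to 130, class V above 130). Class I is assigned by an initial screening step rather than by points, so it is deliberately not handled here. The function name is illustrative.

```python
def psi_risk_class(points: int) -> str:
    """Map a PSI point total to risk class II through V."""
    if points <= 70:
        return "II"
    if points <= 90:
        return "III"
    if points <= 130:
        return "IV"
    return "V"
```

With these thresholds, the score of 123 reported for the aspiration group falls in class IV, as does the score of 92 reported for the nonaspiration group.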

Figure 3
Receiver operating characteristic curve of pneumonia severity index score and inpatient mortality. Abbreviations: AUC, area under the curve.

Our regression model after propensity score matching demonstrated that aspiration pneumonia independently confers a 2.3‐fold increased odds for inpatient mortality (95% CI: 1.56‐3.45, P<0.001).

DISCUSSION

Pneumonia patients with confusion, nursing home residence, or cerebrovascular disease are more likely to be diagnosed with aspiration pneumonia by clinicians. Although this is unsurprising, it is notable that these patients are more than twice as likely to die in the inpatient setting, even after accounting for age, comorbidities, and disease severity. These findings are similar to three previously published studies comparing aspiration and nonaspiration pneumonia at single institutions, albeit using different aspiration pneumonia definitions.[13, 14, 15] This study is the first large, multicenter, multinational study to demonstrate these findings.

Central to the interpretation of our results is the method of diagnosing aspiration versus nonaspiration. A bottom‐up method that relies on a clinician to check a box for aspiration may appear poorly reproducible. Because there is no diagnostic gold standard, clinicians may use different criteria to diagnose aspiration, creating potential for idiosyncratic noise. The strength of the wisdom of the crowd method used in this study is that an aggregate estimation from independent judgments may reduce the noise from individual judgments.[16] Although clinicians may vary in why they diagnose a particular patient as having aspiration pneumonia, it appears that the overwhelming reason for diagnosing a patient as having aspiration pneumonia is the presence of confusion, followed by previous nursing home residence or cerebrovascular disease. This finding has some face validity when compared with studies using an investigator definition, as altered mental status, chronic debility, and cerebrovascular disease are either prominent features of the definition of aspiration pneumonia[8] or frequently observed in patients with aspiration pneumonia.[13, 15] The distribution of cerebrovascular disease among our study's aspiration and nonaspiration pneumonia patients was similar to studies that used formal criteria in their definitions.[13, 15] Although nursing home residence was more likely in aspiration pneumonia patients, the majority of aspiration pneumonia patients were residing in the community, suggesting that aspiration is not simply a surrogate for healthcare‐associated pneumonia. Although patients with aspiration pneumonia are typically older than their nonaspiration counterparts, it appears that age is not a key determinant in the diagnosis of aspiration. With aspiration pneumonia, confusion, nursing home residence, and the presence of cerebrovascular disease are the greatest contributors in the clinical diagnosis, more than age.

Our data demonstrate that aspiration pneumonia confers increased odds for mortality, even after adjustment for age, disease severity, and comorbidities. These data suggest that aspiration pneumonia is a distinct entity from nonaspiration pneumonia, and that this disease is worse than nonaspiration CAP. If aspiration pneumonia is distinct from nonaspiration pneumonia, some unrecognized host factor other than age, disease severity, or the captured comorbidities decreases survival in aspiration pneumonia patients. However, it is also possible that aspiration pneumonia is merely a clinical designation for one end of the pneumonia spectrum, and we and others have failed to completely account for all measures of disease severity or all measures of comorbidities. Examples of unmeasured comorbidities would include presence of oropharyngeal dysphagia, which is not assessed in the database but could have a significant effect on clinical diagnosis. Unmeasured covariates can include measures beyond that of disease severity or comorbidity, such as the presence of a do not resuscitate (DNR) order, which could have a significant confounding effect on the observed association. A previous, single‐center study demonstrated that increased 30‐day mortality in aspiration pneumonia was mostly attributable to greater disease severity and comorbidities, although aspiration pneumonia independently conferred greater risk for adverse long‐term outcomes.[15] We propose that aspiration pneumonia represents a clinically distinct entity from nonaspiration pneumonia. 
Patients with chronic aspiration are often chronically malnourished and may have different oral flora than patients without chronic aspiration.[27, 28] Chronic aspiration has been associated with granulomatous reaction, organizing pneumonia, diffuse alveolar damage, and chronic bronchiolitis.[29] Chronic aspiration may elicit changes in the host physiology, and may render the host more susceptible to the development of secondary bacterial infection with morbid consequences.

The ability of the PSI to predict inpatient mortality was moderate (AUC only 0.7), with no significant additional discrimination between the aspiration and nonaspiration pneumonia groups. Although the PSI had moderate ability to predict inpatient mortality, the observed mortality was considerably higher than predicted. It is possible that the PSI incompletely captures clinically relevant comorbidities (eg, malnutrition). Further study to improve mortality prediction of aspiration pneumonia patients could employ sensitivity analysis to determine optimal thresholds and weighting of the PSI components.

Patients with aspiration pneumonia had longer hospital lengths of stay and took longer to achieve clinical stability than their nonaspiration counterparts. A longer time to clinical stability has been associated with increased posthospitalization mortality and is associated with time to switch from intravenous to oral antibiotics.[17] Although some component of hospital length‐of‐stay is subject to local practice patterns, time to clinical stability has explicit criteria for clinical improvement and failure, and therefore is less likely to be affected by local practice patterns.


Pneumonia is a common clinical syndrome with well‐described epidemiology and microbiology. Aspiration pneumonia comprises 5% to 15% of patients with pneumonia acquired outside of the hospital,[1] but is less well characterized despite being a major syndrome of pneumonia in the elderly.[2, 3] Difficulties in studying aspiration pneumonia include the lack of a sensitive and specific marker for aspiration as well as the potential overlap between aspiration pneumonia and other forms of pneumonia.[4, 5, 6] Additionally, clinicians have difficulty distinguishing between aspiration pneumonia, which develops after the aspiration of oropharyngeal contents, and aspiration pneumonitis, wherein inhalation of gastric contents causes inflammation without the subsequent development of bacterial infection.[7, 8] Central to the study of aspiration pneumonia is whether it should exist as its own entity, or if aspiration is really a designation used for pneumonia in an older patient with greater comorbidities. The ability to clearly understand how a clinician diagnoses aspiration pneumonia, and whether that method has face validity with expert definitions may allow for improved future research, improved generalizability of current or past research, and possibly better clinical care.

Several validated mortality prediction models exist for community‐acquired pneumonia (CAP) using a variety of clinical predictors, but their performance in patients with aspiration pneumonia is less well characterized. Most studies validating pneumonia severity scoring systems excluded aspiration pneumonia from their study population.[9, 10, 11] Severity scoring systems for CAP may not accurately predict disease severity in patients with aspiration pneumonia. The CURB‐65[9] (confusion, uremia, respiratory rate, blood pressure, age ≥65 years) and the eCURB[12] scoring systems are poor predictors of mortality in patients with aspiration pneumonia, perhaps because they do not account for patient comorbidities.[13] The pneumonia severity index (PSI)[10] might predict mortality better than CURB‐65 in the aspiration population due to the inclusion of comorbidities.

Previous studies have demonstrated that patients with aspiration pneumonia are older and have greater disease severity and more comorbidities.[13, 14, 15] These single‐center studies also demonstrated greater mortality, more frequent admission to an intensive care unit (ICU), and longer hospital lengths of stay in patients with aspiration pneumonia. These studies identified aspiration pneumonia by the presence of a risk factor for aspiration[15] or by physician billing codes.[13] In practice, however, the bedside clinician diagnoses a patient as having aspiration pneumonia, but the logic is likely vague and inconsistent. Despite the potential for variability with individual judgment, an aggregate estimation from independent judgments may perform better than individual judgments.[16] Because there is no gold standard for defining aspiration pneumonia, all previous research has been limited to definitions created by investigators. This multicenter study seeks to determine what clinical characteristics lead physicians to diagnose a patient as having aspiration pneumonia, and whether or not the clinician‐derived diagnosis is distinct and clinically useful.

Our objectives were to: (1) identify covariates associated with bedside clinicians diagnosing a pneumonia patient as having aspiration pneumonia; (2) compare aspiration pneumonia and nonaspiration pneumonia in regard to disease severity, patient demographics, comorbidities, and clinical outcomes; and (3) measure the performance of the PSI in aspiration pneumonia versus nonaspiration pneumonia.

PATIENTS AND METHODS

Study Design and Setting

We performed a secondary analysis of the Community‐Acquired Pneumonia Organization (CAPO) database, which contains retrospectively collected data from 71 hospitals in 16 countries between June 2001 and December 2012. In each participating center, primary investigators selected nonconsecutive, adult hospitalized patients diagnosed with CAP. To decrease systematic selection biases, the selection of patients with CAP for enrollment in the trial was based on the date of hospital admission. Each investigator completed a case report form that was transferred via the internet to the CAPO study center at the University of Louisville (Louisville, KY). A sample of the data collection form is available at the study website (www.caposite.com). Validation of data quality was performed at the study center before the case was entered into the CAPO database. Local institutional review board approval was obtained for each study site.

Inclusion and Exclusion Criteria

Patients ≥18 years of age who satisfied criteria for CAP were included in this study. A diagnosis of CAP required a new pulmonary infiltrate at time of hospitalization and at least 1 of the following: new or increased cough; leukocytosis, leukopenia, or left shift pattern on white blood cell count; or temperature >37.8°C or <35.6°C. We excluded patients with pneumonia attributed to mycobacterial or fungal infection, and patients infected with human immunodeficiency virus, as we believed these types of pneumonia differ fundamentally from typical CAP.

Patient Variables

Patient variables included presence of aspiration pneumonia, laboratory data, comorbidities, and measures of disease severity, including the PSI. The clinician made a clinical diagnosis of the presence or absence of aspiration for each patient by marking a box on the case report form. Outcomes included in‐hospital mortality, hospital length of stay up to 14 days, and time to clinical stability up to 8 days. All variables were obtained directly from the case report form. In accordance with previously published definitions, we defined clinical stability as the day the following criteria were all met: improved clinical signs (improved cough and shortness of breath), lack of fever for >8 hours, improving leukocytosis (decreased at least 10% from the previous day), and tolerating oral intake.[17, 18]

Statistical Analysis

Baseline characteristics of patients with aspiration and nonaspiration CAP were compared using χ² or Fisher exact tests for categorical variables and the Mann‐Whitney U test for continuous variables.

To determine which patient variables were important in the physician diagnosis of aspiration pneumonia, we performed logistic regression with initial covariates comprising the demographic, comorbidity, and disease severity measurements listed in Table 1. We included interactions between cerebrovascular disease and age, nursing home status, and confusion to improve model fit. We centered all variables (including binary indicators) according to the method outlined by Kraemer and Blasey to improve interpretation of the main effects.[19]
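The centering step can be illustrated with a minimal sketch. It assumes plain mean centering, as the note to the final regression table describes ("mean‐centered variables"); the toy ages and disease flags are hypothetical and are not study data, and the analyses themselves were run in SAS and R rather than Python.

```python
from statistics import mean


def center(values):
    """Subtract the sample mean so the centered variable averages zero."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical toy data: age in years and a 0/1 cerebrovascular disease flag.
age = [60, 70, 80, 90]
cvd = [0, 0, 1, 1]

age_c = center(age)  # deviations from the mean age of 75
cvd_c = center(cvd)  # binary indicator becomes -0.5 / +0.5 here

# Interaction term built from the centered variables, so each main effect
# is interpreted at the average value of the other variable.
age_x_cvd = [a * c for a, c in zip(age_c, cvd_c)]
```

With centered inputs, the coefficient for age describes its association at the average level of cerebrovascular disease rather than in patients coded 0 only, which is the interpretive benefit the centering is meant to provide.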

Table 1. Patient Characteristics of the Community‐Acquired Pneumonia Organization Database Stratified by Aspiration Pneumonia

Characteristic | Aspiration Pneumonia, N=451 | Nonaspiration Pneumonia, N=4,734 | P Value
Demographics
Age, y | 79 (65–87) | 69 (53–80) | <0.001
% Male | 59% | 60% | 0.58
Nursing home residence | 25% | 5% | <0.001
Recent (30 days) antibiotic use | 21% | 16% | 0.017
Comorbidities
Cerebrovascular disease | 35% | 14% | <0.001
Chronic obstructive pulmonary disease | 25% | 27% | 0.62
Congestive heart failure | 23% | 19% | 0.027
Diabetes | 18% | 18% | 0.85
Cancer | 12% | 10% | 0.12
Renal disease | 10% | 11% | 0.53
Liver disease | 6% | 5% | 0.29
Disease severity
Pneumonia severity index | 123 (99–153) | 92 (68–117) | <0.001
Confusion | 49% | 12% | <0.001
PaO2 <60 mm Hg | 43% | 33% | <0.001
BUN >30 mg/dL | 42% | 23% | <0.001
Multilobar pneumonia | 34% | 28% | 0.003
Pleural effusion | 25% | 21% | 0.07
Respiratory rate >30 breaths/minute | 21% | 20% | 0.95
pH <7.35 | 13% | 5% | <0.001
Hematocrit <30% | 11% | 6% | 0.001
Temperature >37.8°C or <35.6°C | 9% | 7% | 0.30
Systolic blood pressure <90 mm Hg | 8% | 9% | 0.003
Sodium <130 mEq/L | 8% | 6% | 0.08
Heart rate >125 beats/minute | 8% | 5% | 0.71
Glucose >250 mg/dL | 6% | 7% | 0.06
Cavitary lesion | 0% | 0% | 0.67
Clinical outcomes
In‐hospital mortality | 23% | 9% | <0.001
Intensive care unit admission | 19% | 13% | 0.002
Hospital length of stay, d | 9 (5–15) | 7 (4–12) | <0.001
Time to clinical stability, d | 8 (4–8) | 4 (3–8) | <0.001

NOTE: All continuous data are median values (interquartile range), unless otherwise specified. Significance testing between groups was assessed using the χ² or Mann‐Whitney U test, where appropriate. Abbreviations: BUN, blood urea nitrogen.

To determine if aspiration pneumonia had worse clinical outcomes compared to nonaspiration pneumonia, multiple methods were used. To compare the differences between the 2 groups with respect to time to clinical stability and length of hospital stay, we constructed Kaplan‐Meier survival curves and Cox proportional hazards regression models. The log‐rank test was used to determine statistical differences between the Kaplan‐Meier survival curves. To compare the impact of aspiration on mortality in patients with CAP, we conducted a propensity score–matched analysis. We chose propensity score matching over traditional logistic regression to balance variables among groups and to avoid the potential for overfit and multicollinearity. We considered a variable balanced after matching if its standardized difference was <10. All variables in the propensity score–matched analysis were balanced.

Although our dataset contained minimal missing data, we imputed any missing values to maintain the full study population in the creation of the propensity score. Missing data were imputed using the aregImpute function of the Hmisc package of R (The R Foundation for Statistical Computing, Vienna, Austria).[20, 21] We built the propensity score model using a variable selection algorithm described by Bursac et al.[22] Our model included variables for region (United States/Canada, Europe, Asia/Africa or Latin America) and the variables listed in Table 1, with the exception of the PSI and the 4 clinical outcomes. Given that previous analyses accounting for clustering by physician did not substantially affect our results,[23] our model did not include physician‐level variables and did not account for the clustering effects of physicians. Using the propensity scores generated from this model, we matched a case of aspiration CAP with a case of nonaspiration CAP.[24] We then constructed a general linear model using the matched dataset to obtain the magnitude of effect of aspiration on mortality.
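As a rough illustration of the balance check and the 1:1 matching, the sketch below computes a standardized difference for a binary covariate and pairs cases with controls by nearest propensity score. Several things here are assumptions rather than the study's method: the "<10" balance threshold is taken to be on a percentage scale, the matcher is a simplified nearest‐neighbor greedy pass rather than the digit‐based greedy technique of the cited reference, and the caliper value, patient identifiers, and propensity scores are hypothetical.

```python
from math import sqrt


def std_diff_binary(p_case, p_control):
    """Standardized difference, in percent, for a binary covariate."""
    pooled_sd = sqrt((p_case * (1 - p_case) + p_control * (1 - p_control)) / 2)
    return 100 * (p_case - p_control) / pooled_sd


def greedy_match(cases, controls, caliper=0.05):
    """Pair each case with the nearest unused control by propensity score.

    cases, controls: dicts mapping an id to a propensity score.
    caliper: hypothetical maximum score gap allowed for a match.
    """
    unused = dict(controls)
    pairs = []
    for case_id, score in cases.items():
        if not unused:
            break
        best = min(unused, key=lambda cid: abs(unused[cid] - score))
        if abs(unused[best] - score) <= caliper:
            pairs.append((case_id, best))
            del unused[best]  # each control is used at most once
    return pairs


# Nursing home residence before matching, from Table 1: 25% vs 5%.
before = std_diff_binary(0.25, 0.05)

# Hypothetical propensity scores, for illustration only.
cases = {"a1": 0.62, "a2": 0.30}
controls = {"n1": 0.29, "n2": 0.60, "n3": 0.10}
pairs = greedy_match(cases, controls)
```

Under these assumptions, nursing home residence is far from balanced before matching (a standardized difference well above 10), which is the kind of imbalance the matched analysis is designed to remove.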

We used receiver operating characteristic curves to define the diagnostic accuracy of the pneumonia severity index for the prediction of mortality among patients with aspiration pneumonia and those with nonaspiration pneumonia. SAS version 9.3 (SAS Institute, Cary, NC) and R version 2.15.3 (The R Foundation for Statistical Computing) were used for all analyses. P values of ≤0.05 were considered statistically significant in all analyses.
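The area under a receiver operating characteristic curve has a direct interpretation: it is the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. A minimal sketch of that calculation follows; the PSI scores and outcomes are hypothetical and serve only to show the arithmetic.

```python
def auc(scores, died):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (death, survivor) pairs in which the death has the
    higher score; tied scores contribute one half.
    """
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical PSI scores and inpatient deaths (1 = died).
psi = [70, 85, 95, 110, 123, 140, 150, 160]
died = [0, 0, 0, 1, 0, 0, 1, 1]
print(auc(psi, died))
```

An AUC of 0.5 corresponds to a score that ranks deaths and survivors no better than chance, and 1.0 to perfect separation.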

RESULTS

Our initial query, after exclusion criteria, yielded a study population of 5185 patients (Figure 1). We compared 451 patients diagnosed with aspiration pneumonia to 4734 patients with nonaspiration CAP. Patient characteristics are summarized in Table 1. Patients with aspiration pneumonia were older, more likely to live in a nursing home, had greater disease severity, and were more likely to be admitted to an ICU. Patients with aspiration pneumonia had longer adjusted hospital lengths of stay and took more days to achieve clinical stability than patients with nonaspiration pneumonia (Figure 2). After adjusting for all variables in Table 1, the Cox proportional hazards models demonstrated that aspiration pneumonia was associated with ongoing hospitalization (hazard ratio [HR] for discharge: 0.77, 95% confidence interval [CI]: 0.65‐0.91, P=0.002) and clinical instability (HR for attaining clinical stability: 0.72, 95% CI: 0.61‐0.84, P<0.001). Patients with aspiration pneumonia presented with greater disease severity than those with nonaspiration pneumonia. Although there was no difference between groups in regard to temperature, respiratory rate, hyponatremia, or presence of pleural effusions or cavitary lesions, all other measured indices of disease severity were worse in patients with aspiration pneumonia. Patients with aspiration pneumonia were more likely to have cerebrovascular disease than those with nonaspiration pneumonia. Aspiration pneumonia patients also had increased prevalence of congestive heart failure. There was no appreciable difference between groups among other measured comorbidities.

Figure 1
Patient selection from June 2001 to December 2012. Abbreviations: CAP, community‐acquired pneumonia; HIV, human immunodeficiency virus.
Figure 2
Kaplan‐Meier graph of hospital length of stay (A) and time to clinical stability (B).

The patient characteristics most associated with a physician diagnosis of aspiration pneumonia, identified using logistic regression, were confusion, residence in a nursing home, and presence of cerebrovascular disease (odds ratios [ORs] of 4.4, 2.9, and 2.3, respectively), whereas renal disease was associated with decreased physician diagnosis of aspiration pneumonia over nonaspiration pneumonia (OR: 0.58) (Table 2).

Table 2. Final Logistic Regression Model for Physician Diagnosis of Aspiration Pneumonia

Covariate | Odds Ratio | 95% Confidence Interval | P Value
Demographics
Age, y | 1.00 | 0.99–1.01 | 0.948
Male | 1.20 | 0.94–1.54 | 0.148
Nursing home residence | 2.93 | 2.13–4.00 | <0.001
Comorbidities
Cerebrovascular disease | 2.26 | 1.53–3.32 | <0.001
Renal disease | 0.58 | 0.39–0.85 | 0.006
Disease severity
Confusion | 4.41 | 3.40–5.72 | <0.001
Hematocrit <30% | 1.59 | 1.06–2.33 | 0.020
pH <7.35 | 1.67 | 1.10–2.47 | 0.013
Temperature >37.8°C or <35.6°C | 1.60 | 1.07–2.35 | 0.019
Multilobar pneumonia | 1.29 | 1.00–1.65 | 0.047
Interaction terms
Age * cerebrovascular disease | 0.98 | 0.96–0.99 | 0.011
Nursing home * cerebrovascular disease | 0.51 | 0.27–0.96 | 0.037
Confusion * cerebrovascular disease | 0.70 | 0.42–1.17 | 0.175

NOTE: The initial model included all demographic, comorbidity, and disease severity measurements from Table 1. Parameter estimates are for mean‐centered variables. Renal disease is defined as having a clinical diagnosis in the medical record. Although other interaction terms were used in the initial model, they were eliminated from the final model. We centered all variables (including binary indicators) according to the method described by Kraemer and Blasey.[19] The area under the curve of the final model is 0.79.

Observed inpatient mortality of aspiration pneumonia was 23%. This mortality was considerably higher than a mean PSI score of 123 would predict (class IV risk group, with expected 30‐day mortality of 8%–9%[25]). The PSI score's ability to predict inpatient mortality in patients with aspiration pneumonia was moderate, with an area under the curve (AUC) of 0.71. This was similar to its performance in patients with nonaspiration pneumonia (AUC of 0.75) (Figure 3). These values are lower than the AUC of 0.81 for the PSI in predicting mortality derived from a meta‐analysis of 31 other studies.[26]

Figure 3
Receiver operating characteristic curve of pneumonia severity index score and inpatient mortality. Abbreviations: AUC, area under the curve.

Our regression model after propensity score matching demonstrated that aspiration pneumonia independently confers a 2.3‐fold increased odds for inpatient mortality (95% CI: 1.56‐3.45, P<0.001).
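For context, the unadjusted odds ratio implied by the in‐hospital mortality percentages in Table 1 (23% vs 9%) can be worked out directly. The short sketch below does that arithmetic; because the table percentages are rounded, the result is approximate, and it is an illustrative calculation rather than a figure reported by the study.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the odds in the exposed group to the odds in the unexposed group."""
    return odds(p_exposed) / odds(p_unexposed)


# In-hospital mortality from Table 1: aspiration 23%, nonaspiration 9%.
crude_or = odds_ratio(0.23, 0.09)
print(round(crude_or, 2))
```

The crude value of roughly 3 is larger than the 2.3‐fold estimate obtained after matching, which is consistent with part of the unadjusted difference being explained by age, comorbidities, and disease severity.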

DISCUSSION

Pneumonia patients with confusion, nursing home residence, or cerebrovascular disease are more likely to be diagnosed with aspiration pneumonia by clinicians. Although this is unsurprising, it is notable that these patients are more than twice as likely to die in the inpatient setting, even after accounting for age, comorbidities, and disease severity. These findings are similar to three previously published studies comparing aspiration and nonaspiration pneumonia at single institutions, albeit using different aspiration pneumonia definitions.[13, 14, 15] This study is the first large, multicenter, multinational study to demonstrate these findings.

Central to the interpretation of our results is the method of diagnosing aspiration versus nonaspiration. A bottom‐up method that relies on a clinician to check a box for aspiration may appear poorly reproducible. Because there is no diagnostic gold standard, clinicians may use different criteria to diagnose aspiration, creating potential for idiosyncratic noise. The strength of the wisdom of the crowd method used in this study is that an aggregate estimation from independent judgments may reduce the noise from individual judgments.[16] Although clinicians may vary in why they diagnose a particular patient as having aspiration pneumonia, it appears that the overwhelming reason for diagnosing a patient as having aspiration pneumonia is the presence of confusion, followed by previous nursing home residence or cerebrovascular disease. This finding has some face validity when compared with studies using an investigator definition, as altered mental status, chronic debility, and cerebrovascular disease are either prominent features of the definition of aspiration pneumonia[8] or frequently observed in patients with aspiration pneumonia.[13, 15] The distribution of cerebrovascular disease among our study's aspiration and nonaspiration pneumonia patients was similar to studies that used formal criteria in their definitions.[13, 15] Although nursing home residence was more likely in aspiration pneumonia patients, the majority of aspiration pneumonia patients were residing in the community, suggesting that aspiration is not simply a surrogate for healthcare‐associated pneumonia. Although patients with aspiration pneumonia are typically older than their nonaspiration counterparts, it appears that age is not a key determinant in the diagnosis of aspiration. With aspiration pneumonia, confusion, nursing home residence, and the presence of cerebrovascular disease are the greatest contributors in the clinical diagnosis, more than age.

Our data demonstrate that aspiration pneumonia confers increased odds for mortality, even after adjustment for age, disease severity, and comorbidities. These data suggest that aspiration pneumonia is a distinct entity from nonaspiration pneumonia, and that this disease is worse than nonaspiration CAP. If aspiration pneumonia is distinct from nonaspiration pneumonia, some unrecognized host factor other than age, disease severity, or the captured comorbidities decreases survival in aspiration pneumonia patients. However, it is also possible that aspiration pneumonia is merely a clinical designation for one end of the pneumonia spectrum, and we and others have failed to completely account for all measures of disease severity or all measures of comorbidities. Examples of unmeasured comorbidities would include presence of oropharyngeal dysphagia, which is not assessed in the database but could have a significant effect on clinical diagnosis. Unmeasured covariates can include measures beyond that of disease severity or comorbidity, such as the presence of a do not resuscitate (DNR) order, which could have a significant confounding effect on the observed association. A previous, single‐center study demonstrated that increased 30‐day mortality in aspiration pneumonia was mostly attributable to greater disease severity and comorbidities, although aspiration pneumonia independently conferred greater risk for adverse long‐term outcomes.[15] We propose that aspiration pneumonia represents a clinically distinct entity from nonaspiration pneumonia. 
Patients with chronic aspiration are often chronically malnourished and may have different oral flora than patients without chronic aspiration.[27, 28] Chronic aspiration has been associated with granulomatous reaction, organizing pneumonia, diffuse alveolar damage, and chronic bronchiolitis.[29] Chronic aspiration may elicit changes in the host physiology, and may render the host more susceptible to the development of secondary bacterial infection with morbid consequences.

The ability of the PSI to predict inpatient mortality was moderate (AUC only 0.7), with no significant additional discrimination between the aspiration and nonaspiration pneumonia groups. Although the PSI had moderate ability to predict inpatient mortality, the observed mortality was considerably higher than predicted. It is possible that the PSI incompletely captures clinically relevant comorbidities (eg, malnutrition). Further study to improve mortality prediction of aspiration pneumonia patients could employ sensitivity analysis to determine optimal thresholds and weighting of the PSI components.

Patients with aspiration pneumonia had longer hospital lengths of stay and took longer to achieve clinical stability than their nonaspiration counterparts. Time to clinical stability has been associated with increased posthospitalization mortality and is associated with time to switch from intravenous to oral antibiotics.[17] Although some component of hospital length‐of‐stay is subject to local practice patterns, time to clinical stability has explicit criteria for clinical improvement and failure, and therefore is less likely to be affected by local practice patterns.

We noted a relatively high (16%–21%) incidence of prior antibiotic use among patients in this database. Analysis of antibiotic prescription patterns was limited, given the several different countries from which the database draws its cases. Although we used accepted criteria to define CAP cases, it is possible that this population may have a higher rate of resistant or uncommon pathogens than other studies of CAP that have populations with lower incidence of prior antibiotic use. Although not assessed, we suspect a significant component of the prior antibiotic use represented outpatient pneumonia treatment during the few days prior to visiting the hospital.

References
  1. Torres A, Serra‐Batlles J, Ferrer A, et al. Severe community‐acquired pneumonia. Epidemiology and prognostic factors. Am Rev Respir Dis. 1991;144(2):312–318.
  2. Koivula I, Sten M, Makela PH. Risk factors for pneumonia in the elderly. Am J Med. 1994;96(4):313–320.
  3. Marik PE, Kaplan D. Aspiration pneumonia and dysphagia in the elderly. Chest. 2003;124(1):328–336.
  4. Mylotte JM, Goodnough S, Naughton BJ. Pneumonia versus aspiration pneumonitis in nursing home residents: diagnosis and management. J Am Geriatr Soc. 2003;51(1):17–23.
  5. Marik PE. Aspiration pneumonia: mixing apples with oranges and tangerines. Crit Care Med. 2004;32(5):1236; author reply 1236–1237.
  6. Kozlow JH, Berenholtz SM, Garrett E, Dorman T, Pronovost PJ. Epidemiology and impact of aspiration pneumonia in patients undergoing surgery in Maryland, 1999–2000. Crit Care Med. 2003;31(7):1930–1937.
  7. Marik PE. Aspiration syndromes: aspiration pneumonia and pneumonitis. Hosp Pract (Minneap). 2010;38(1):35–42.
  8. Marik PE. Aspiration pneumonitis and aspiration pneumonia. N Engl J Med. 2001;344(9):665–671.
  9. Lim WS, Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58(5):377–382.
  10. Fine MJ, Hanusa BH, Lave JR, et al. Comparison of a disease‐specific and a generic severity of illness measure for patients with community‐acquired pneumonia. J Gen Intern Med. 1995;10(7):359–368.
  11. Espana PP, Capelastegui A, Gorordo I, et al. Development and validation of a clinical prediction rule for severe community‐acquired pneumonia. Am J Respir Crit Care Med. 2006;174(11):1249–1256.
  12. Jones BE, Jones J, Bewick T, et al. CURB‐65 pneumonia severity assessment adapted for electronic decision support. Chest. 2011;140(1):156–163.
  13. Lanspa MJ, Jones BE, Brown SM, Dean NC. Mortality, morbidity, and disease severity of patients with aspiration pneumonia. J Hosp Med. 2013;8(2):83–90.
  14. Heppner HJ, Sehlhoff B, Niklaus D, Pientka L, Thiem U. Pneumonia Severity Index (PSI), CURB‐65, and mortality in hospitalized elderly patients with aspiration pneumonia [in German]. Z Gerontol Geriatr. 2011;44(4):229–234.
  15. Taylor JK, Fleming GB, Singanayagam A, Hill AT, Chalmers JD. Risk factors for aspiration in community‐acquired pneumonia: analysis of a hospitalized UK cohort. Am J Med. 2013;126(11):995–1001.
  16. Yi SK, Steyvers M, Lee MD, Dry MJ. The wisdom of the crowd in combinatorial problems. Cogn Sci. 2012;36(3):452–470.
  17. Aliberti S, Peyrani P, Filardo G, et al. Association between time to clinical stability and outcomes after discharge in hospitalized patients with community‐acquired pneumonia. Chest. 2011;140(2):482–488.
  18. Ramirez JA. Clinical stability and switch therapy in hospitalised patients with community‐acquired pneumonia: are we there yet? Eur Respir J. 2013;41(1):5–6.
  19. Kraemer HC, Blasey CM. Centring in regression analyses: a strategy to prevent errors in statistical inference. Int J Methods Psychiatr Res. 2004;13(3):141–151.
  20. Harrell FE. Hmisc: Harrell miscellaneous. Available at: http://CRAN.R‐project.org/package=Hmisc. Published Sept 12, 2014. Last accessed Oct 27, 2014.
  21. Heitjan DF, Little RJA. Multiple imputation for the fatal accident reporting system. J R Stat Soc Ser C Appl Stat. 1991;40(1):13–29.
  22. Bursac Z, Gauss CH, Williams DK, Hosmer DW. Purposeful selection of variables in logistic regression. Source Code Biol Med. 2008;3:17.
  23. Arnold FW, Wiemken TL, Peyrani P, Ramirez JA, Brock GN; CAPO authors. Mortality differences among hospitalized patients with community‐acquired pneumonia in three world regions: results from the Community‐Acquired Pneumonia Organization (CAPO) International Cohort Study. Respir Med. 2013;107(7):1101–1111.
  24. Parsons L. Reducing bias in a propensity score matched‐pair sample using greedy matching techniques. In: Proceedings of the 26th Annual SAS Users Group International Conference. Cary, NC: SAS Institute Inc.; 2001:214–226. Available at: http://www2.sas.com/proceedings/sugi26/p214–26.pdf. Last accessed Oct 27, 2014.
  25. Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336(4):243–250.
  26. Chalmers JD, Singanayagam A, Akram AR, et al. Severity assessment tools for predicting mortality in hospitalised patients with community‐acquired pneumonia. Systematic review and meta‐analysis. Thorax. 2010;65(10):878–883.
  27. Cabre M, Serra‐Prat M, Palomera E, Almirall J, Pallares R, Clave P. Prevalence and prognostic implications of dysphagia in elderly patients with pneumonia. Age Ageing. 2010;39(1):39–45.
  28. Pace CC, McCullough GH. The association between oral microorgansims and aspiration pneumonia in the institutionalized elderly: review and recommendations. Dysphagia. 2010;25(4):307–322.
  29. Yousem SA, Faber C. Histopathology of aspiration pneumonia not associated with food or other particulate matter: a clinicopathologic study of 10 cases diagnosed on biopsy. Am J Surg Pathol. 2011;35(3):426–431.
  30. Jencks SF, Daley J, Draper D, Thomas N, Lenhart G, Walker J. Interpreting hospital mortality data. The role of clinical risk adjustment. JAMA. 1988;260(24):3611–3616.
  31. Arnold FW, Ramirez JA, McDonald LC, Xia EL. Hospitalization for community‐acquired pneumonia: the pneumonia severity index vs clinical judgment. Chest. 2003;124(1):121–124.
References
  1. Torres A, Serra‐Batlles J, Ferrer A, et al. Severe community‐acquired pneumonia. Epidemiology and prognostic factors. Am Rev Respir Dis. 1991;144(2):312318.
  2. Koivula I, Sten M, Makela PH. Risk factors for pneumonia in the elderly. Am J Med. 1994;96(4):313320.
  3. Marik PE, Kaplan D. Aspiration pneumonia and dysphagia in the elderly. Chest. 2003;124(1):328336.
  4. Mylotte JM, Goodnough S, Naughton BJ. Pneumonia versus aspiration pneumonitis in nursing home residents: diagnosis and management. J Am Geriatr Soc. 2003;51(1):1723.
  5. Marik PE. Aspiration pneumonia: mixing apples with oranges and tangerines. Crit Care Med. 2004;32(5):1236; author reply 1236–1237.
  6. Kozlow JH, Berenholtz SM, Garrett E, Dorman T, Pronovost PJ. Epidemiology and impact of aspiration pneumonia in patients undergoing surgery in Maryland, 1999–2000. Crit Care Med. 2003;31(7):19301937.
  7. Marik PE. Aspiration syndromes: aspiration pneumonia and pneumonitis. Hosp Pract (Minneap). 2010;38(1):3542.
  8. Marik PE. Aspiration pneumonitis and aspiration pneumonia. N Engl J Med. 2001;344(9):665671.
  9. Lim WS, Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58(5):377382.
  10. Fine MJ, Hanusa BH, Lave JR, et al. Comparison of a disease‐specific and a generic severity of illness measure for patients with community‐acquired pneumonia. J Gen Intern Med. 1995;10(7):359368.
  11. Espana PP, Capelastegui A, Gorordo I, et al. Development and validation of a clinical prediction rule for severe community‐acquired pneumonia. Am J Respir Crit Care Med. 2006;174(11):12491256.
  12. Jones BE, Jones J, Bewick T, et al. CURB‐65 pneumonia severity assessment adapted for electronic decision support. Chest. 2011;140(1):156163.
  13. Lanspa MJ, Jones BE, Brown SM, Dean NC. Mortality, morbidity, and disease severity of patients with aspiration pneumonia. J Hosp Med. 2013;8(2):8390.
  14. Heppner HJ, Sehlhoff B, Niklaus D, Pientka L, Thiem U. Pneumonia Severity Index (PSI), CURB‐65, and mortality in hospitalized elderly patients with aspiration pneumonia [in German]. Z Gerontol Geriatr. 2011;44(4):229234.
  15. Taylor JK, Fleming GB, Singanayagam A, Hill AT, Chalmers JD. Risk factors for aspiration in community‐acquired pneumonia: analysis of a hospitalized UK cohort. Am J Med. 2013;126(11):9951001.
  16. Yi SK, Steyvers M, Lee MD, Dry MJ. The wisdom of the crowd in combinatorial problems. Cogn Sci. 2012;36(3):452470.
  17. Aliberti S, Peyrani P, Filardo G, et al. Association between time to clinical stability and outcomes after discharge in hospitalized patients with community‐acquired pneumonia. Chest. 2011;140(2):482488.
  18. Ramirez JA. Clinical stability and switch therapy in hospitalised patients with community‐acquired pneumonia: are we there yet? Eur Respir J. 2013;41(1):56.
  19. Kraemer HC, Blasey CM. Centring in regression analyses: a strategy to prevent errors in statistical inference. Int J Methods Psychiatr Res. 2004;13(3):141151.
  20. Harrell FE. Hmisc: Harrell miscellaneous. Available at: http://CRAN.R‐project.org/package=Hmisc. Published Sept 12, 2014. Last accessed Oct 27, 2014.
  21. Heitjan DF, Little RJA. Multiple imputation for the fatal accident reporting system. J R Stat Soc Ser C Appl Stat. 1991;40(1):1329.
  22. Bursac Z, Gauss CH, Williams DK, Hosmer DW. Purposeful selection of variables in logistic regression. Source Code Biol Med. 2008;3:17.
  23. Arnold FW, Wiemken TL, Peyrani P, Ramirez JA, Brock GN; CAPO authors. Mortality differences among hospitalized patients with community‐acquired pneumonia in three world regions: results from the Community‐Acquired Pneumonia Organization (CAPO) International Cohort Study. Respir Med. 2013;107(7):11011111.
  24. Parsons L. Reducing bias in a propensity score matched‐pair sample using greedy matching techniques. In: Proceedings of the 26th Annual SAS Users Group International Conference. Cary, NC: SAS Institute Inc.; 2001:214226. Available at: http://www2.sas.com/proceedings/sugi26/p214–26.pdf. Last accessed Oct 27, 2014.
  25. Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336(4):243250.
  26. Chalmers JD, Singanayagam A, Akram AR, et al. Severity assessment tools for predicting mortality in hospitalised patients with community‐acquired pneumonia. Systematic review and meta‐analysis. Thorax. 2010;65(10):878883.
  27. Cabre M, Serra‐Prat M, Palomera E, Almirall J, Pallares R, Clave P. Prevalence and prognostic implications of dysphagia in elderly patients with pneumonia. Age Ageing. 2010;39(1):3945.
  28. Pace CC, McCullough GH. The association between oral microorgansims and aspiration pneumonia in the institutionalized elderly: review and recommendations. Dysphagia. 2010;25(4):307322.
  29. Yousem SA, Faber C. Histopathology of aspiration pneumonia not associated with food or other particulate matter: a clinicopathologic study of 10 cases diagnosed on biopsy. Am J Surg Pathol. 2011;35(3):426431.
  30. Jencks SF, Daley J, Draper D, Thomas N, Lenhart G, Walker J. Interpreting hospital mortality data. The role of clinical risk adjustment. JAMA. 1988;260(24):36113616.
  31. Arnold FW, Ramirez JA, McDonald LC, Xia EL. Hospitalization for community‐acquired pneumonia: the pneumonia severity index vs clinical judgment. Chest. 2003;124(1):121124.
Issue
Journal of Hospital Medicine - 10(2)
Page Number
90-96
Display Headline
Characteristics associated with clinician diagnosis of aspiration pneumonia: A descriptive study of afflicted patients and their outcomes
Article Source

© 2014 Society of Hospital Medicine

Correspondence Location
Address for correspondence and reprint requests: Michael J. Lanspa, MD, Shock‐Trauma Intensive Care Unit, Intermountain Medical Center, 5121 S. Cottonwood Street, Murray, UT 84107; Telephone: 801‐507‐6556; Fax: 801‐507‐5578; E‐mail: michael.lanspa@imail.org

Housestaff Teams and Patient Outcomes

Article Type
Changed
Sun, 05/21/2017 - 13:37
Display Headline
Relationships within inpatient physician housestaff teams and their association with hospitalized patient outcomes

Since the Institute of Medicine Report To Err is Human, increased attention has been paid to improving the care of hospitalized patients.[1] Strategies include utilization of guidelines and pathways, and the application of quality improvement techniques to improve or standardize processes. Despite improvements in focused areas such as prevention of hospital‐acquired infections, evidence suggests that outcomes for hospitalized patients remain suboptimal.[2] Rates of errors and hospital‐related complications such as falls, decubitus ulcers, and infections remain high,[3, 4, 5] and not all patients receive what is known to be appropriate care.[6]

Many attempts to improve inpatient care have used process‐improvement approaches, focusing on impacting individuals' behaviors, or on breaking down processes into component parts. Examples include central line bundles or checklists.[7, 8] These approaches attempt to ensure that providers do things in a standardized way, but are implicitly based on the reductionist assumption that we can break processes down into predictable parts to improve the system. An alternative way to understand clinical systems is based on interdependencies between individuals in the system, or the ways in which parts of the system interact with each other, which may be unpredictable over time.[1, 9] Whereas these interdependencies include care processes, they also encompass the providers who care for patients. Providers working together vary in terms of the kinds of relationships they have with each other. Those relationships are crucial to system function because they are the foundation for the interactions that lead to effective patient care.

The application of several frameworks or approaches for considering healthcare systems in terms of relationships highlights the importance of this way of understanding system function. These include complexity science,[1, 7] relational coordination (which is grounded in complexity science),[10] high reliability,[11] and the Big Five for teamwork.[12]

Research indicates that interactions among healthcare providers can have important influences on outcomes.[13, 14, 15, 16, 17] Additionally, the initial implementation of checklists to prevent central‐line associated infections appeared to change provider relationships in a way that significantly influenced their success.[18] For example, positive primary care clinic member relationships as assessed by the Lanham framework have been associated with better chronic care model implementation, learning, and patient experience of care.[19, 20] This framework, which we apply here, identifies 7 relationship characteristics: (1) trust; (2) diversity; (3) respect; (4) mindfulness, or being open to new ideas from others; (5) heedfulness, or an understanding of how one's roles influence those of others; (6) use of rich in‐person or verbal communication, particularly for potentially ambiguous information open to multiple interpretations; and (7) having a mixture of social and task relatedness among teams, or relatedness outside of only work‐related tasks.[19] Relationships within surgical teams that are characterized by psychological safety and diversity are associated with successful uptake of new techniques and decreased mortality.[13, 14] Relationships are important because the ability of patients and providers to learn and make sense of their patients' illnesses is grounded in relationships.

We sought to better understand and characterize inpatient physician teams' relationships, and assess the association between team relationships as evaluated by Lanham's framework and outcomes for hospitalized patients. Data on relationships among inpatient medical teams are few, despite the fact that these teams provide a great proportion of inpatient care. Additionally, the care of hospitalized medical patients is complex and uncertain, often involving multiple providers, making provider relationships potentially even more important to outcomes than in other settings.

METHODS

Overview

We conducted an observational, convergent mixed‐methods study of inpatient medicine teams.[21, 22, 23] We focused on inpatient physician teams, defining them as the functional work group responsible for medical decision making in academic medical centers. Physician teams in this context have been studied in terms of social hierarchy, authority, and delegation.[24, 25, 26] Focusing on the relationships within these groups could provide insights into strategies to mitigate potential negative effects of hierarchy. We recognize that other providers are closely involved in the care of hospitalized patients, and although we did not have standard interactions between physicians, nurses, case managers, and other providers that we could consistently observe, we did include interactions with these other providers in our observations and assessments of team relationships. Because this work is among the first in inpatient medical teams, we chose to study a small number of teams in great depth, allowing us to make rich assessments of team relationships.

We chose patient outcomes of length of stay (LOS), unnecessary LOS (ULOS), and complication rates, adjusted for patient characteristics and team workload. LOS is an important metric of inpatient care delivery. We feel ULOS is an aspect of LOS that is dependent on the physician team, as it reflects their preparation of the patient for discharge. Finally, we chose complication rates because hospital‐acquired conditions and complications are important contributors to inpatient morbidity, and because recent surgical literature has identified complication rates as a contributor to mortality that could be related to providers' collective ability to recognize complications and act quickly.

This study was approved by the institutional review board at the University of Texas Health Science Center at San Antonio (UTHSCSA), the Research and Development Committee for the South Texas Veterans Health Care System (STVHCS), and the Research Committee at University Health System (UHS). All physicians consented to participate in the study. We obtained a waiver of consent for inclusion of patient data.

Setting and Study Participants

This study was conducted at the 2 UTHSCSA primary teaching affiliates. The Audie L. Murphy Veterans Affairs Hospital is the 220‐bed acute‐care hospital of the STVHCS. University Hospital is the 614‐bed, level‐I trauma, acute‐care facility for UHS, the county system for Bexar County, which includes the San Antonio, Texas major metropolitan area.

The inpatient internal medicine physician team was our unit of study. Inpatient medicine teams consisted of 1 faculty attending physician, 1 postgraduate year (PGY)‐2 or PGY‐3 resident, and 2 PGY‐1 members. In addition, typically 2 to 3 third‐year medical students were part of the team, and a subintern was sometimes present. Doctor of Pharmacy faculty and students were also occasionally part of the team. Social workers and case managers often joined team rounds for portions of the time, and nurses sometimes joined bedside rounds on specific patients. These teams admit all medicine patients with the exception of those with acute coronary syndromes, new onset congestive heart failure, or arrhythmias. Patients are randomly assigned to teams based on time of admission and call schedules.

Between these 2 hospitals, there are 10 inpatient medicine teams caring for patients, with a pool of over 40 potential faculty attendings. Our goal was to observe teams that would be most likely to vary in terms of their relationship characteristics and patient outcomes through observing teams with a range of individual members. We used a purposeful sampling approach to obtain a diverse sample, sampling based on physician attributes and time of year.[16, 17] Three characteristics were most important: attending physician years of experience, attending involvement in educational and administrative leadership, and the presence of struggling resident members, as defined by being on probation or having been discussed in the residency Clinical Competency Committee. We did not set explicit thresholds in terms of attending experience, but instead sought to ensure a range. The attendings we observed were more likely to be involved in education and administrative leadership activities, but were otherwise similar to those we did not observe in terms of years of experience. We included struggling residents to observe individuals with a range of skill sets, and not just high‐performing individuals. We obtained attending information based on our knowledge of the attending faculty pool, and from the internal medicine residency program. We sampled across the year to ensure a diversity of trainee experience, but did not observe teams in either July or August, as these months were early in the academic year. Interns spend approximately 5 months per year on inpatient services, whereas residents spend 2 to 3 months per year. Thus, interns but not residents observed later in the year might have spent significantly more time on an inpatient service. However, in all instances, none of the team members observed had worked together previously.

Data Collection

Data were collected over nine 1‐month periods from September 2008 through June 2011. Teams were observed daily for 2‐ to 4‐week periods during morning rounds, the time when the team discusses each patient and makes clinical decisions. Data collection started on the first day of the month, the first day that all team members worked together, and continued for approximately 27 days, the last day before the resident rotated to a different service. By comprehensively and systematically observing these teams' daily rounds, we obtained rich, in‐depth data with multiple data points, enabling us to assess specific team behaviors and interactions.

During the third and fourth months, we collected data on teams in which the attending changed partway through. We did this to understand the impact of individual attending change on team relationships. Because the team relationships differed with each attending, we analyzed them separately. Thus, we observed 7 teams for approximately 4‐week periods and 4 teams for approximately 2‐week periods.

Observers arrived in the team room prior to rounds to begin observations, staying until after rounds were completed. Detailed free‐text field notes were taken regarding team activities and behaviors, including how the teams made patient care decisions. Field notes included: length of rounds, which team members spoke during each patient discussion, who contributed to management discussions, how information from consultants was incorporated, how communication with others outside of the team occurred, how team members spoke with each other including the types of words used, and team member willingness to perform tasks outside of their usually defined role, among others. Field notes were collected in an open‐ended format to allow for inductive observations. Observers also recorded clinical data daily regarding each patient, including admission and discharge dates, and presenting complaint.

The observation team consisted of the principal investigator (PI) (hospitalist) and 2 research assistants (a graduate‐level medical anthropologist and social psychologist), all of whom were trained by a qualitative research expert to systematically collect data related to topics of interest. Observers were instructed to record what the teams were doing and talking about at all times, noting any behaviors that they felt reflected how team members related to each other and came to decisions about their patients, or that were characteristic of the team. To ensure consistency, the PI and 1 research assistant conducted observations jointly at the start of data collection for each team, checking concordance of observations daily using a percent agreement until general agreement on field note content and patient information reached 90%. Two individuals observed 24 days of data collection, representing 252 patient discussions (13% of observed discussions).
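The daily concordance check can be illustrated with a minimal sketch. It assumes each observer's notes for a patient discussion have been reduced to a single categorical code; that coding scheme, the function name, and the sample data are hypothetical, as the study does not describe how agreement was tallied.

```python
def percent_agreement(observer_a, observer_b):
    """Percentage of paired items on which two observers recorded the same code."""
    if len(observer_a) != len(observer_b):
        raise ValueError("both observers must code the same items")
    if not observer_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * matches / len(observer_a)


# Hypothetical codes for ten patient discussions on one day of rounds.
pi_codes = ["plan", "plan", "consult", "social", "plan",
            "consult", "plan", "plan", "social", "plan"]
ra_codes = ["plan", "plan", "consult", "social", "plan",
            "plan", "plan", "plan", "social", "plan"]

agreement = percent_agreement(pi_codes, ra_codes)
print(f"{agreement:.0f}% agreement")  # joint observation would stop once this reaches 90%
```

Simple percent agreement does not correct for agreement expected by chance, which is one reason it suits a training check better than a formal reliability estimate.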

An age‐adjusted Charlson‐Deyo comorbidity score was calculated for each patient admitted to each team, using data from rounds and from each hospital's electronic health records (EHR).[27] We collected data regarding mental health conditions for each patient (substance use, mood disorder, cognitive disorder, or a combination) because these comorbidities could impact LOS or ULOS. Discharge diagnoses were based on the discharge summary in the EHR. We also collected data daily regarding team census and numbers of admissions to and discharges from each team to assess workload.

Three patient outcomes were measured: LOS, ULOS, and complications. LOS was defined as the total number of days the patient was in the hospital. ULOS was defined as the number of days a patient remained in the hospital after the day the team determined the patient was medically ready for discharge (assessed by either discussion on rounds or EHR documentation). ULOS may occur when postdischarge needs have not been adequately assessed, or because of delays in care, which may be related to provider communication during the hospitalization. Complications were defined on a per‐patient, per‐day basis in 2 ways: the development of a new problem in the hospital, for example acute kidney injury, a hospital‐acquired infection, or delirium, or by the team noting a clinical deterioration after at least 24 hours of clinical stability, such as the patient requiring transfer to a higher level of care. Complications were determined based on discussions during rounds, with EHR verification if needed.
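As a rough illustration of the two length-of-stay measures, the sketch below derives both from three dates. The function names and the example dates are hypothetical, and counting days as a simple difference between calendar dates is an assumption; the study does not state its counting convention.

```python
from datetime import date


def length_of_stay(admitted, discharged):
    """LOS: days in hospital, taken here as the calendar-date difference (assumed)."""
    return (discharged - admitted).days


def unnecessary_length_of_stay(medically_ready, discharged):
    """ULOS: days spent in hospital after the day judged medically ready for discharge."""
    return max((discharged - medically_ready).days, 0)


# Hypothetical patient: admitted March 2, medically ready March 5, discharged March 7.
admitted = date(2009, 3, 2)
ready = date(2009, 3, 5)
discharged = date(2009, 3, 7)

print(length_of_stay(admitted, discharged))           # 5
print(unnecessary_length_of_stay(ready, discharged))  # 2
```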

Analysis Phase I: Assessment of Relationship Characteristics

After the completion of data collection, field notes were reviewed by a research team member not involved in the original study design or primary data collection (senior medical student). We took this approach to guard against biasing the reviewer's view of team behaviors, both in terms of not having conducted observations of the teams and being blinded to patient outcomes.

The reviewer completed a series of 3 readings of all field notes. The first reading provided a summary of the content of the data and the individual teams. Behavioral patterns of each team were used to create an initial team profile. The field notes and profiles were reviewed by the PI and a coauthor not involved in data collection to ensure that the profiles adequately reflected the field notes. No significant changes to the profiles were made based on this review. The profiles were discussed at a meeting with members of the larger research team, including the PI, research assistants, and coinvestigators (with backgrounds in medicine, anthropology, and information and organization management). Behavior characteristics that could be used to distinguish teams were identified in the profiles using a grounded theory approach.

The second review of field notes was conducted to test the applicability of the characteristics identified in the first review. To systematically record the appearance of the behaviors, we created a matrix with a row for each behavior and columns for each team to note whether they exhibited each behavior. If the behavior was exhibited, specific examples were cataloged in the matrix. This matrix was reviewed and refined by the research team. During the final field note review meeting, the research team compared the summary matrix for each team, with the specific behaviors noted during the first reading of the field notes to ensure that all behaviors were recorded.

After cataloging behaviors, the research team assigned each behavior to 1 of the 7 Lanham relationship characteristics. We wanted to assess our observations against a relationship framework to ensure that we were able to systematically assess all aspects of relationships. The Lanham framework was initially developed based on a systematic review of the organizational and educational literatures, making it relevant to the complex environment of an academic medical inpatient team and allowing us to assess relationships at a fine‐grained, richly detailed level. This assignment was done by the author team as a group. Any questions were discussed and different interpretations resolved through consensus. The Lanham framework has 7 characteristics.[19] Based on the presence of behaviors associated with each relationship characteristic, we assigned a point to each team for each relationship characteristic observed. We considered a behavior type to be present if we observed it on at least 3 occasions on separate days. Though we used a threshold of at least 3 occurrences, most teams that did not receive a point for a particular characteristic did not have any instances in which we observed the characteristic. This was particularly true for trust and mindfulness, and least so for social/task relatedness. By summing these points, we calculated a total relationship score for each team, with potential scores ranging from 0 (for teams exhibiting no behaviors reflecting a particular relationship characteristic) to 7.
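The scoring rule described above can be sketched as follows. This is an illustration only: the behavior labels are shortened, hypothetical stand-ins for a few of the 13 behavior types, the field-note tallies are invented, and the rule that any one qualifying behavior earns the point for its characteristic is an assumption about how the mapping was applied.

```python
from collections import defaultdict

CHARACTERISTICS = (
    "trust", "diversity", "respect", "mindfulness",
    "heedfulness", "rich communication", "social/task relatedness",
)

# Hypothetical short labels for some behavior types, mapped to characteristics.
BEHAVIOR_TO_CHARACTERISTIC = {
    "attending says we": "trust",
    "attending admits not knowing": "trust",
    "outside perspectives included": "diversity",
    "positive reinforcement": "respect",
    "tasks outside role": "heedfulness",
    "whole team engaged": "mindfulness",
    "social conversation": "social/task relatedness",
    "verbal contact with consultants": "rich communication",
}

MIN_SEPARATE_DAYS = 3  # a behavior type counts only if seen on at least 3 separate days


def relationship_score(observations):
    """Score one team from (behavior, day) pairs; returns (points per characteristic, total)."""
    days_seen = defaultdict(set)
    for behavior, day in observations:
        days_seen[behavior].add(day)
    points = dict.fromkeys(CHARACTERISTICS, 0)
    for behavior, days in days_seen.items():
        if len(days) >= MIN_SEPARATE_DAYS:
            points[BEHAVIOR_TO_CHARACTERISTIC[behavior]] = 1
    return points, sum(points.values())


# Hypothetical tallies: repeat sightings on the same day do not count twice.
notes = [
    ("attending says we", 1), ("attending says we", 2), ("attending says we", 5),
    ("social conversation", 1), ("social conversation", 1), ("social conversation", 3),
    ("whole team engaged", 2), ("whole team engaged", 4), ("whole team engaged", 9),
]
points, total = relationship_score(notes)
print(total)  # 2: trust and mindfulness qualify, social conversation was seen on only 2 days
```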

Analysis Phase II: Factor Analysis

To formally determine which relationship characteristics were most highly related, data were submitted to a principal components factor analysis using oblique rotation. Item separation was determined by visual inspection of the scree plot and eigenvalues over 1.

Analysis Phase III: Assessing the Association between Physician Team Relationship Characteristics and Patient Outcomes

We examined the association between team relationships and patient outcomes using team relationship scores. For the LOS/ULOS analysis, we only included patients whose entire hospitalization occurred under the care of the team we observed. Patients who were on the team at the start of the month, were transferred from another service, or who remained hospitalized after the end of the team's time together were excluded. The longest possible LOS for patients whose entire hospitalization occurred on teams that were observed for half a month was 12 days. To facilitate accurate comparison between teams, we only included patients whose LOS was ≤12 days.

Complication rates were defined on a per‐patient per‐day basis to normalize for different team volumes and days of observation. For this analysis, we included patients who remained on the team after data collection completion, patients transferred to another team, or patients transferred from another team. However, we only counted complications that occurred at least 24 hours following transfer to minimize the likelihood that the complication was related to the care of other physicians.

Preliminary analysis involved inspection and assessment of the distribution of all variables, followed by a general linear modeling approach to assess the association between patient and workload covariates and outcomes.[28, 29] Because we anticipated that outcome variables would be markedly skewed, we also planned to assess the association between relationship characteristics with outcomes using the Kruskal‐Wallis rank sum test to compare groups with Dunn's test[30] for pairwise comparisons if overall significance occurred.[31] There are no known acceptable methods for covariate adjustments using the Kruskal‐Wallis method. All models were run using SAS software (SAS Institute Inc., Cary, NC).[32]
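For readers unfamiliar with the Kruskal‐Wallis rank sum test, the following is a minimal pure-Python sketch of the tie-corrected H statistic. It is not the SAS procedure the authors ran; the function names and the complication counts are hypothetical, and the P value helper covers only the three-group case (2 degrees of freedom), where the chi-square upper tail reduces to exp(-H/2). Dunn's pairwise test is not shown.

```python
import math
from collections import Counter


def kruskal_wallis(*groups):
    """Return (tie-corrected H statistic, degrees of freedom) for two or more groups."""
    n = sum(len(group) for group in groups)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(value for group in groups for value in group)

    # Midranks: tied values share the average of the ranks they occupy.
    midrank = {}
    ranks_used = 0
    for value in sorted(counts):
        tied = counts[value]
        midrank[value] = ranks_used + (tied + 1) / 2
        ranks_used += tied

    tie_correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    if tie_correction == 0:
        raise ValueError("all observations are identical")

    rank_sum_term = sum(
        sum(midrank[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / tie_correction, len(groups) - 1


def p_value_three_groups(h):
    """Chi-square upper-tail probability for 2 degrees of freedom."""
    return math.exp(-h / 2)


# Hypothetical complication counts per patient for three groups of teams.
low_score = [0, 1, 2, 0, 3, 1]
mid_score = [0, 1, 0, 2, 1, 0]
high_score = [0, 0, 0, 1, 0, 0]

h, df = kruskal_wallis(low_score, mid_score, high_score)
print(f"H = {h:.2f}, df = {df}, P = {p_value_three_groups(h):.3f}")
```

Because the test works on ranks rather than raw values, it tolerates the heavy skew and many zeros typical of LOS and complication data, at the cost of not allowing covariate adjustment.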

RESULTS

The research team observed 1941 discussions of 576 individual patients. Observations were conducted over 352 hours and 54 minutes, resulting in 741 pages of notes (see Supporting Table 1 in the online version of this article for data regarding individual team members). Teams observed over half‐months are referred to with a and b designations.

Table 1. Relationship Characteristics and Observed Behaviors

| Relationship Characteristic | Definition | Thirteen Types of Behaviors Observed in Field Notes | Observed Examples |
| --- | --- | --- | --- |
| Trust | Willingness to be vulnerable to others | Use of "we" instead of "you" or "I" by the attending | "Where are we going with this guy?" |
| | | Attending admitting "I don't know" | "Let's go talk to him, I can't figure this out" |
| | | Asking questions to help team members to think through problems | "Will the echo change our management? How will it help us?" |
| Diversity | Including different perspectives and different thinking | Team member participation in conversations about patients that are not theirs | One intern is presenting, another intern asks a question, and the resident joins the discussion |
| | | Inclusion of perspectives of those outside the team (nursing and family members) | Taking a break to call the nurse, having a family meeting |
| Respect | Valuing the opinions of others, honest and tactful interactions | Use of positive reinforcement by the attending | Being encouraging of the medical student's differential, saying "excellent" |
| | | How the team talks with patients | Asking if the patient has any concerns, what they can do to make them comfortable |
| Heedfulness | Awareness of how each person's roles impact the rest of the team | Team members performing tasks not expected of their role | One intern helping another with changing orders to transfer a patient |
| | | Summarizing plans and strategizing | Attending recaps the plan for the day, asks what they can do |
| Mindfulness | Openness to new ideas/free discussion about what is and is not working | Entire team engaged in discussion | Attending asks the medical student, intern, and resident what they think is going on |
| Social relatedness | Having socially related interactions | Social conversation among team members | Intern talks about their day off |
| | | Jokes by the attending | "Showers and a bowel movement is the key to making people happy" |
| Appropriate use of rich communication | Use of in‐person communication for sensitive or difficult issues | Using verbal communication with consultants or family | Intern is on the phone with the pharm D because there is a problem with the medication |

Creation of team profiles yielded 13 common behavior characteristics that were inductively identified and that could potentially distinguish teams, including consideration of perspectives outside of the team and team members performing tasks normally outside of their roles. Table 1 summarizes the observed behaviors with examples from the field notes and maps these behavior characteristics onto the Lanham relationship characteristics. The distribution of relationship characteristics and scores for each team are shown in Table 2.

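Table 2. Team Relationship Profiles

| Relationship Characteristic | Team 1 | Team 2 | Team 3a | Team 3b | Team 4a | Team 4b | Team 5 | Team 6 | Team 7 | Team 8 | Team 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Trust | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 |
| Diversity | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Respect | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
| Heedfulness | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
| Mindfulness | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
| Social/task relatedness | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| Rich/lean communication | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
| Relationship score (no. of characteristics observed) | 0 | 5 | 7 | 2 | 2 | 3 | 5 | 0 | 7 | 7 | 6 |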

Correlation between relationship characteristics ranged from 0.32 to 0.95 (see Supporting Table 2 in the online version of this article). Mindfulness and trust are more highly correlated with each other than with other variables, as are diversity and respect. We performed a principal components factor analysis. Based on scree plot inspection and eigenvalues >1, we kept 3 factors that explained 85% of the total variance (see Supporting Table 3 in the online version of this article).

Table 3. Association Between the Teams' Number of Relationship Characteristics and Patient Outcomes

| Outcome | 0–2 Relationship Characteristics | 3–5 Relationship Characteristics | 6–7 Relationship Characteristics |
| --- | --- | --- | --- |
| LOS, d, n=293 | | | |
| Median | 4 | 5 | 3 |
| IQR | 5 | 4 | 3 |
| Mean | 4.7 (2.72) | 4.7 (2.52) | 4.1 (2.51), P=0.12 (a) |
| ULOS, d, n=293 | | | |
| Median | 0 | 0 | 0 |
| IQR | 0 | 0 | 0 |
| Mean | 0.37 (0.99) | 0.33 (0.96) | 0.13 (0.56), P=0.09 (a) |
| Complications (per patient per day), n=398 | | | |
| Median | 0 | 0 | 0 |
| IQR | 1 | 1 | 0 |
| Mean | 0.58 (1.06) | 0.45 (0.77) | 0.18 (0.59), P=0.001 compared to teams with 0–2 or 3–5 characteristics |

NOTE: Abbreviations: IQR, interquartile range; LOS, length of stay; ULOS, unnecessary length of stay.
(a) Not significant.

Our analyses of LOS and ULOS included 298 of the 576 patients. Two hundred sixty‐seven patients were excluded because their entire LOS did not occur while under the care of the observed teams. Eleven patients were removed from the analysis because their LOS was >12 days. The analysis of complications included 398 patients. In our preliminary general linear modeling approach, only patient workload was significantly associated with outcomes using a cutoff of P=0.05. Charlson‐Deyo score and mental health comorbidities were not associated with outcomes.

The results of the Kruskal‐Wallis test show the patient average ranking on each of the outcome variables by 3 groups (Table 3). Overall, teams with higher relationship scores had lower rank scores on all outcome measures. However, the only statistically significant comparisons were for complications. Teams having 6 to 7 characteristics had a significantly lower complication rate ranking than teams with 0 to 2 and 3 to 5 (P=0.001). We did not find consistent differences between individual teams or groups of teams with relationship scores from 0 to 2, 3 to 5, and 6 to 7 with regard to Charlson score, mental health issues, or workload. The only significant differences were between Charlson‐Deyo scores for patients admitted to teams with low relationship scores of 0 to 2 versus high relationship scores of 6 to 7 (6.7 vs 5.1); scores for teams with relationship scores of 3 to 5 were not significantly different from the low or high groups.

Table 4 shows the Kruskal‐Wallis rank test results for each group of relationship characteristics identified in the factor analysis based on whether teams displayed all or none of the characteristics in the factor. There were no differences in these groupings for LOS. Teams that exhibited both mindfulness and trust had lower ranks on ULOS than teams that did not have either. Similarly, teams with heedfulness, social‐task relatedness, and richer communication demonstrated lower ULOS rankings than teams that did not have all 3 characteristics.

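Association Between Inpatient Physician Team Relationship Characteristics and Outcomes

| Patient Outcome | Mind/Trust: None | Mind/Trust: Both | Diversity/Respect: None | Diversity/Respect: Both | Heed/Relate/Communicate: None | Heed/Relate/Communicate: All 3 |
| --- | --- | --- | --- | --- | --- | --- |
| LOS, d, n=293 | | | | | | |
| Median | 4 | 4 | 4 | 4 | 4 | 4 |
| IQR | 5 | 3 | 4.5 | 3 | 4 | 4 |
| Mean | 4.7 (2.6) | 4.2 (2.5) | 4.7 (2.6) | 4.3 (2.5) | 4.4 (2.6) | 4.4 (2.6) |
| P value | 0.06a | | 0.23a | | 0.85a | |
| ULOS, d, n=293 | | | | | | |
| Median | 0 | 0 | 0 | 0 | 0 | 0 |
| IQR | 0 | 0 | 0 | 0 | 0 | 0 |
| Mean | 0.39 (1.01) | 0.15 (0.62) | 0.33 (0.92) | 0.18 (0.71) | 0.32 (0.93) | 0.18 (0.69) |
| P value | 0.009 | | 0.06 | | 0.03 | |
| Complications (per patient), n=389 | | | | | | |
| Median | 0 | 0 | 0 | 0 | 0 | 0 |
| IQR | 1 | 0 | 1 | 0 | 1 | 0 |
| Mean | 0.58 (1.01) | 0.19 (0.58) | 0.47 (0.81) | 0.29 (0.82) | 0.26 (0.92) | 0.28 (0.70) |
| P value | <0.0001 | | 0.001 | | 0.02 | |

NOTE: Abbreviations: IQR, interquartile range; LOS, length of stay; ULOS, unnecessary length of stay.
a Not significant.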

DISCUSSION

Relationships are critical to team function because they are the basis for the social interactions that are central to patient care. These interactions include how providers recognize and make sense of what is happening with patients, and how they learn to care for patients more effectively. Additionally, the high task interdependencies among inpatient providers require effective relationships for optimal care. In our study, inpatient medicine physician teams' relationships varied, and these differences were associated with ULOS and complications. Relationship characteristics are not mutually exclusive, and as our factor analysis demonstrates, are intercorrelated. Trust and mindfulness appear to be particularly important. Trust may foster psychological safety that in turn promotes the willingness of individuals to contribute their thoughts and ideas.[13] In low‐trust teams, providers may fear a negative impact for bringing forward a concern based on limited data. Mindful teams may be more likely to notice nuanced changes, or are more likely to talk when things just do not appear to be going in the right direction with the patient. In the case of acutely ill medical patients, trust and mindfulness may lead to an increased likelihood that clinical changes are recognized and discussed quickly. For example, on a team characterized by trust and mindfulness, the entire team was typically involved in care discussions, and the interns and students frequently asked a lot of questions, even regarding the care of patients they were not directly following. We observed that these questions and discussions often led the team to realize that they needed to make a change in management decisions (eg, discontinuing Bactrim, lowering insulin doses, adjusting antihypertensives, premedicating for intravenous contrast) that they had not caught in the assessment and plan portion of the patient care discussion. 
In another example, a medical student asked a tentative question after a patient needed to go quickly to the bathroom while they were examining her, leading the team to ask more questions that led to a more rapid evaluation of a potential urinary tract infection. This finding is consistent with the description of failure to rescue among surgical patients, in which mortality has been associated with the failure to recognize complications rapidly and act effectively.[33]

Our findings are limited in several ways. First, these data are from a single academic institution. Although we sought diversity among our teams and collected data across 2 hospitals, there may be local contextual factors that influenced our results. Second, our data demonstrate an association, but not causality. Our findings should be tested in studies that assess causality and potential mechanisms through which relationships influence outcomes. Third, the individuals observing the teams had some knowledge of patient outcomes through hearing patient discussions. However, by involving individuals who did not participate in observations and were blinded to outcomes in assessing team relationships, we addressed this potential bias. Fourth, our observations were largely focused on physician teams, not directly including other providers. Our difficulty in observing regular interactions between physicians and other providers underscores the need to increase contact among those caring for hospitalized patients, such as occurs through multidisciplinary rounds. We did include team communication with other disciplines in our assessment of the relationship characteristics of diversity and rich communication. Finally, our analysis was limited by our sample size. We observed a relatively small number of teams. Although we benefitted from seeing the change in team relationships that occurred with attending changes halfway through some of our data collection months, this did limit the number of patients we could include in our analyses. Though we did not observe obvious differences in relationships between the teams observed across the 2 hospitals, the small number of teams and hospitals precluded our ability to perform multilevel modeling analyses, which would have allowed us to assess or account for the influence of team or organizational factors. However, this small sample size did allow for a richer assessment of team behaviors.

Although preliminary, our findings are an important step in understanding the function of inpatient medical teams not only in terms of processes of care, but also in terms of relationships. Patient care is a social activity, requiring effective communication to develop working diagnoses, recognize changes in patients' clinical courses, and formulate effective treatment plans during and after hospitalization. Future work could follow several directions. One would be to assess the causal mechanisms through which relationships influence patient outcomes. These may include sensemaking, learning, and improved coordination. Positive relationships may facilitate interaction of tacit and explicit information, facilitating the creation of understandings that foster more effective patient care.[34] The dynamic nature of relationships and how patient outcomes in turn feed back into relationships could be an area of exploration. This line of research could build on the idea of teaming.[35] Understanding relationships across multidisciplinary teams or with patients and families would be another direction. Finally, our results could point to potential interventions to improve patient outcomes through improving relationships. Better understanding of the nature of effective relationships among providers should enable us to develop more effective strategies to improve the care of hospitalized patients. In the larger context of payment reforms that require greater coordination and communication among and across providers, a greater understanding of how relationships influence patient outcomes will be important.

Acknowledgements

The authors thank the physicians involved in this study and Ms. Shannon Provost for her involvement in discussions of this work.

Disclosures: The research reported herein was supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service (CDA 07‐022). Investigator salary support was provided through this funding, and through the South Texas Veterans Health Care System. The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs. Dr. McDaniel receives support from the IC² Institute of the University of Texas at Austin. Dr. Luci Leykum had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The authors report no conflicts of interest.

References
  1. Plsek P. Redesigning health care with insights from the science of complex adaptive systems. In: Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy of Sciences; 2000:309–322.
  2. Landrigan CP, Parry GJ, Bones CB, Hackbarth AD, Goldmann DA, Sharek PJ. Temporal trends in rates of patient harm resulting from medical care. N Engl J Med. 2010;323(22):2124–2135.
  3. Krauss MJ, Nguyen SL, Dunagan WC, et al. Circumstances of patient falls and injuries in 9 hospitals in a mid‐western healthcare system. Infect Control Hosp Epidemiol. 2007;28(5):544–550.
  4. Hurd T, Posnett J. Point prevalence of wounds in a sample of acute hospitals in Canada. Int Wound J. 2009;6(4):287–293.
  5. Garcin F, Leone M, Antonini F, Charvet A, Albanese J, Martin C. Non‐adherence to guidelines: an avoidable cause of failure of empirical antimicrobial therapy in the presence of difficult‐to‐treat bacteria. Intensive Care Med. 2010;36(1):75–82.
  6. Williams SC, Schmaltz SP, Morton DJ, Koss RG, Loeb JM. Quality of care in U.S. hospitals as reflected by standardized measures, 2002–2004. N Engl J Med. 2005;353(3):255–264.
  7. Centers for Disease Control and Prevention. National Center for Emerging and Zoonotic Infectious Diseases. Division of Healthcare Quality Promotion. Checklist for prevention of central line associated blood stream infections. Available at: http://www.cdc.gov/HAI/pdfs/bsi/checklist‐for‐CLABSI.pdf. Accessed August 3, 2014.
  8. Safer Healthcare Partners, LLC. Checklists: a critical patient safety tool. Available at: http://www.saferhealthcare.com/high‐reliability‐topics/checklists. Accessed July 31, 2014.
  9. Bar‐Yam Y. Making Things Work: Solving Complex Problems in a Complex World. Boston, MA: Knowledge Press; 2004:117–160.
  10. Gittell JH. High Performance Healthcare: Using The Power of Relationships to Achieve Quality, Efficiency, and Resilience. 1st ed. New York, NY: McGraw‐Hill; 2009.
  11. Carroll JS, Rudolph JW. Design of high reliability organizations in health care. Qual Saf Health Care. 2006;15(suppl 1):i4–i9.
  12. Salas E, DiazGranados D, Weaver SJ, King H. Does team training work? Principles for health care. Acad Emerg Med. 2008;15(11):1002–1009.
  13. Edmondson A. Speaking up in the operating room: how team leaders promote learning in interdisciplinary action teams. J Manag Stud. 2003;40(6):1419–1452.
  14. Neily J, Mills PD, Young‐Xu Y, et al. Association between implementation of a medical team training program and surgical mortality. JAMA. 2010;304(15):1693–1700.
  15. Lewis K, Belliveau M, Herndon B, Keller J. Group cognition, membership change, and performance: Investigating the benefits and detriments of collective knowledge. Organ Behav Hum Decis Process. 2007;103(2):159–178.
  16. Leykum LK, Palmer RF, Lanham HJ, McDaniel RR, Noel PH, Parchman ML. Reciprocal learning and chronic care model implementation in primary care: results from a new scale of learning in primary care settings. BMC Health Serv Res. 2011;11:44.
  17. Noel PH, Lanham HJ, Palmer RF, Leykum LK, Parchman ML. The importance of relational coordination and reciprocal learning for chronic illness care within primary care teams. Health Care Manage Rev. 2012;38(1):20–28.
  18. Dixon‐Woods M, Bosk CL, Aveling EL, Goeschel CA, Pronovost PJ. Explaining Michigan: developing an ex post theory of a quality improvement program. Milbank Q. 2011;89(2):167–205.
  19. Lanham HJ, McDaniel RR, Crabtree BF, et al. How improving practice relationships among clinicians and nonclinicians can improve quality in primary care. Jt Comm J Qual Patient Saf. 2009;35(9):457–466.
  20. Finley EP, Pugh JA, Lanham HJ, et al. Relationship quality and patient‐assessed quality of care in VA primary care clinics: development and validation of the work relationships scale. Ann Fam Med. 2013;11(6):543–549.
  21. Creswell JW, Plano Clark VL. Designing and Conducting Mixed Methods Research. 2nd ed. Thousand Oaks, CA: Sage; 2011.
  22. Patton MQ. Qualitative Evaluation Methods. Thousand Oaks, CA: Sage; 2002.
  23. Pope C, Royen P, Baker R. Qualitative methods in research on health care quality. Qual Saf Health Care. 2002;11:148–152.
  24. Hoff T. Managing the negatives of experience in physician teams. Health Care Manage Rev. 2010;35(1):65–76.
  25. Tamuz M, Giardina TD, Thomas EJ, Menon S, Singh H. Rethinking resident supervision to improve safety: from hierarchical to interprofessional models. J Hosp Med. 2011;6(8):445–452.
  26. Klein KJ, Ziegert JC, Knight AP, Xiao Y. Dynamic delegation: shared, hierarchical, and deindividualized leadership in extreme action teams. Adm Sci Q. 2006;51(4):590–621.
  27. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD‐9‐CM administrative databases. J Clin Epidemiol. 1992;45(6):613–619.
  28. Tukey JW. Exploratory Data Analysis. Reading, MA: Addison‐Wesley; 1977.
  29. Zar JH. Biostatistical Analysis. 4th ed. Upper Saddle River, NJ: Pearson Prentice‐Hall; 2010.
  30. Dunn OJ. Multiple contrasts using rank sums. Technometrics. 1964;6:241–252.
  31. Elliott AC, Hynan LS. A SAS macro implementation of a multiple comparison post hoc test for a Kruskal–Wallis analysis. Comput Methods Programs Biomed. 2011;102:75–80.
  32. SAS/STAT Software [computer program]. Version 9.1. Cary, NC: SAS Institute Inc.; 2003.
  33. Ghaferi AA, Birkmeyer JD, Dimick JB. Complications, failure to rescue, and mortality with major inpatient surgery in Medicare patients. Ann Surg. 2009;250(6):1029–1034.
  34. Nonaka I. A dynamic theory of organizational knowledge creation. Org Sci. 1994;5(1):14–37.
  35. Edmondson AC. Teaming: How Organizations Learn, Innovate, and Compete in the Knowledge Economy. 1st ed. Boston, MA: Harvard Business School; 2012.
Issue
Journal of Hospital Medicine - 9(12)
Page Number
764-771

Since the Institute of Medicine Report To Err is Human, increased attention has been paid to improving the care of hospitalized patients.[1] Strategies include utilization of guidelines and pathways, and the application of quality improvement techniques to improve or standardize processes. Despite improvements in focused areas such as prevention of hospital‐acquired infections, evidence suggests that outcomes for hospitalized patients remain suboptimal.[2] Rates of errors and hospital‐related complications such as falls, decubitus ulcers, and infections remain high,[3, 4, 5] and not all patients receive what is known to be appropriate care.[6]

Many attempts to improve inpatient care have used process‐improvement approaches, focusing on impacting individuals' behaviors, or on breaking down processes into component parts. Examples include central line bundles or checklists.[7, 8] These approaches attempt to ensure that providers do things in a standardized way, but are implicitly based on the reductionist assumption that we can break processes down into predictable parts to improve the system. An alternative way to understand clinical systems is based on interdependencies between individuals in the system, or the ways in which parts of the system interact with each other, which may be unpredictable over time.[1, 9] Whereas these interdependencies include care processes, they also encompass the providers who care for patients. Providers working together vary in terms of the kinds of relationships they have with each other. Those relationships are crucial to system function because they are the foundation for the interactions that lead to effective patient care.

The application of several frameworks or approaches for considering healthcare systems in terms of relationships highlights the importance of this way of understanding system function. These include complexity science,[1, 7] relational coordination (which is grounded in complexity science),[10] high reliability,[11] and the Big Five for teamwork.[12]

Research indicates that interactions among healthcare providers can have important influences on outcomes.[13, 14, 15, 16, 17] Additionally, the initial implementation of checklists to prevent central‐line associated infections appeared to change provider relationships in a way that significantly influenced their success.[18] For example, positive primary care clinic member relationships as assessed by the Lanham framework have been associated with better chronic care model implementation, learning, and patient experience of care.[19, 20] This framework, which we apply here, identifies 7 relationship characteristics: (1) trust; (2) diversity; (3) respect; (4) mindfulness, or being open to new ideas from others; (5) heedfulness, or an understanding of how one's roles influence those of others; (6) use of rich in‐person or verbal communication, particularly for potentially ambiguous information open to multiple interpretations; and (7) having a mixture of social and task relatedness among teams, or relatedness outside of only work‐related tasks.[19] Relationships within surgical teams that are characterized by psychological safety and diversity are associated with successful uptake of new techniques and decreased mortality.[13, 14] Relationships are important because the ability of patients and providers to learn and make sense of their patients' illnesses is grounded in relationships.

We sought to better understand and characterize inpatient physician teams' relationships, and assess the association between team relationships as evaluated by Lanham's framework and outcomes for hospitalized patients. Data on relationships among inpatient medical teams are few, despite the fact that these teams provide a great proportion of inpatient care. Additionally, the care of hospitalized medical patients is complex and uncertain, often involving multiple providers, making provider relationships potentially even more important to outcomes than in other settings.

METHODS

Overview

We conducted an observational, convergent mixed‐methods study of inpatient medicine teams.[21, 22, 23] We focused on inpatient physician teams, defining them as the functional work group responsible for medical decision making in academic medical centers. Physician teams in this context have been studied in terms of social hierarchy, authority, and delegation.[24, 25, 26] Focusing on the relationships within these groups could provide insights into strategies to mitigate potential negative effects of hierarchy. We recognize that other providers are closely involved in the care of hospitalized patients, and although we did not have standard interactions between physicians, nurses, case managers, and other providers that we could consistently observe, we did include interactions with these other providers in our observations and assessments of team relationships. Because this work is among the first in inpatient medical teams, we chose to study a small number of teams in great depth, allowing us to make rich assessments of team relationships.

We chose patient outcomes of length of stay (LOS), unnecessary LOS (ULOS), and complication rates, adjusted for patient characteristics and team workload. LOS is an important metric of inpatient care delivery. We feel ULOS is an aspect of LOS that is dependent on the physician team, as it reflects their preparation of the patient for discharge. Finally, we chose complication rates because hospital‐acquired conditions and complications are important contributors to inpatient morbidity, and because recent surgical literature has identified complication rates as a contributor to mortality that could be related to providers' collective ability to recognize complications and act quickly.

This study was approved by the institutional review board at the University of Texas Health Science Center at San Antonio (UTHSCSA), the Research and Development Committee for the South Texas Veterans Health Care System (STVHCS), and the Research Committee at University Health System (UHS). All physicians consented to participate in the study. We obtained a waiver of consent for inclusion of patient data.

Setting and Study Participants

This study was conducted at the 2 UTHSCSA primary teaching affiliates. The Audie L. Murphy Veterans Affairs Hospital is the 220‐bed acute‐care hospital of the STVHCS. University Hospital is the 614‐bed, level‐I trauma, acute‐care facility for UHS, the county system for Bexar County, which includes the San Antonio, Texas major metropolitan area.

The inpatient internal medicine physician team was our unit of study. Inpatient medicine teams consisted of 1 faculty attending physician, 1 postgraduate year (PGY)‐2 or PGY‐3 resident, and 2 PGY‐1 members. In addition, typically 2 to 3 third‐year medical students were part of the team, and a subintern was sometimes present. Doctor of Pharmacy faculty and students were also occasionally part of the team. Social workers and case managers often joined team rounds for portions of the time, and nurses sometimes joined bedside rounds on specific patients. These teams admit all medicine patients with the exception of those with acute coronary syndromes, new onset congestive heart failure, or arrhythmias. Patients are randomly assigned to teams based on time of admission and call schedules.

Between these 2 hospitals, there are 10 inpatient medicine teams caring for patients, with a pool of over 40 potential faculty attendings. Our goal was to observe teams that would be most likely to vary in terms of their relationship characteristics and patient outcomes through observing teams with a range of individual members. We used a purposeful sampling approach to obtain a diverse sample, sampling based on physician attributes and time of year.[16, 17] Three characteristics were most important: attending physician years of experience, attending involvement in educational and administrative leadership, and the presence of struggling resident members, as defined by being on probation or having been discussed in the residency Clinical Competency Committee. We did not set explicit thresholds in terms of attending experience, but instead sought to ensure a range. The attendings we observed were more likely to be involved in education and administrative leadership activities, but were otherwise similar to those we did not observe in terms of years of experience. We included struggling residents to observe individuals with a range of skill sets, and not just high‐performing individuals. We obtained attending information based on our knowledge of the attending faculty pool, and from the internal medicine residency program. We sampled across the year to ensure a diversity of trainee experience, but did not observe teams in either July or August, as these months were early in the academic year. Interns spend approximately 5 months per year on inpatient services, whereas residents spend 2 to 3 months per year. Thus, interns but not residents observed later in the year might have spent significantly more time on an inpatient service. However, in all instances, none of the team members observed had worked together previously.

Data Collection

Data were collected over nine 1‐month periods from September 2008 through June 2011. Teams were observed daily for 2‐ to 4‐week periods during morning rounds, the time when the team discusses each patient and makes clinical decisions. Data collection started on the first day of the month, the first day that all team members worked together, and continued for approximately 27 days, the last day before the resident rotated to a different service. By comprehensively and systematically observing these teams' daily rounds, we obtained rich, in‐depth data with multiple data points, enabling us to assess specific team behaviors and interactions.

During the third and fourth months, we collected data on teams in which the attending changed partway through. We did this to understand the impact of individual attending change on team relationships. Because the team relationships differed with each attending, we analyzed them separately. Thus, we observed 7 teams for approximately 4‐week periods and 4 teams for approximately 2‐week periods.

Observers arrived in the team room prior to rounds to begin observations, staying until after rounds were completed. Detailed free‐text field notes were taken regarding team activities and behaviors, including how the teams made patient care decisions. Field notes included: length of rounds, which team members spoke during each patient discussion, who contributed to management discussions, how information from consultants was incorporated, how communication with others outside of the team occurred, how team members spoke with each other including the types of words used, and team member willingness to perform tasks outside of their usually defined role, among others. Field notes were collected in an open‐ended format to allow for inductive observations. Observers also recorded clinical data daily regarding each patient, including admission and discharge dates, and presenting complaint.

The observation team consisted of the principal investigator (PI) (hospitalist) and 2 research assistants (a graduate‐level medical anthropologist and social psychologist), all of whom were trained by a qualitative research expert to systematically collect data related to topics of interest. Observers were instructed to record what the teams were doing and talking about at all times, noting any behaviors that they felt reflected how team members related to each other and came to decisions about their patients, or that were characteristic of the team. To ensure consistency, the PI and 1 research assistant conducted observations jointly at the start of data collection for each team, checking concordance of observations daily using a percent agreement until general agreement on field note content and patient information reached 90%. Two individuals observed 24 days of data collection, representing 252 patient discussions (13% of observed discussions).
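
The daily concordance check can be illustrated with a simple percent-agreement calculation. The following is a minimal sketch rather than the procedure the observers used: the function name `percent_agreement` and the coded item labels are hypothetical, and it assumes each observer's notes have been reduced to a set of coded items for the day.

```python
def percent_agreement(observer_a, observer_b):
    """Share of coded items on which two observers agree, as a percentage.

    Each argument maps an item (a hypothetical coding of one field-note
    element) to the value that observer recorded. Items recorded by only
    one observer count as disagreements.
    """
    items = set(observer_a) | set(observer_b)
    if not items:
        raise ValueError("no items to compare")
    agreed = sum(
        1
        for item in items
        if item in observer_a
        and item in observer_b
        and observer_a[item] == observer_b[item]
    )
    return 100.0 * agreed / len(items)


# Hypothetical day of joint observation: 10 coded items, 9 of them matching.
pi_notes = {f"item_{i}": "present" for i in range(10)}
ra_notes = dict(pi_notes, item_3="absent")

agreement = percent_agreement(pi_notes, ra_notes)
meets_threshold = agreement >= 90.0  # 90% is the threshold reported in the text
```

Percent agreement does not correct for chance agreement; it is used here only because it is the measure named in the text.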

An age‐adjusted Charlson‐Deyo comorbidity score was calculated for each patient admitted to each team, using data from rounds and from each hospital's electronic health records (EHR).[27] We collected data regarding mental health conditions for each patient (substance use, mood disorder, cognitive disorder, or a combination) because these comorbidities could impact LOS or ULOS. Discharge diagnoses were based on the discharge summary in the EHR. We also collected data daily regarding team census and numbers of admissions to and discharges from each team to assess workload.

Three patient outcomes were measured: LOS, ULOS, and complications. LOS was defined as the total number of days the patient was in the hospital. ULOS was defined as the number of days a patient remained in the hospital after the day the team determined the patient was medically ready for discharge (assessed by either discussion on rounds or EHR documentation). ULOS may occur when postdischarge needs have not been adequately assessed, or because of delays in care, which may be related to provider communication during the hospitalization. Complications were defined on a per‐patient, per‐day basis in 2 ways: the development of a new problem in the hospital, for example acute kidney injury, a hospital‐acquired infection, or delirium, or by the team noting a clinical deterioration after at least 24 hours of clinical stability, such as the patient requiring transfer to a higher level of care. Complications were determined based on discussions during rounds, with EHR verification if needed.
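
The LOS and ULOS definitions can be expressed as simple date arithmetic. This is an illustrative sketch only: the function names and the example dates are hypothetical, and it assumes LOS is counted as the calendar-day difference between admission and discharge, which the text does not specify.

```python
from datetime import date


def length_of_stay(admitted, discharged):
    """Days in hospital, assuming a calendar-day difference."""
    return (discharged - admitted).days


def unnecessary_los(ready_for_discharge, discharged):
    """Days the patient remained after the day the team judged them ready."""
    return max((discharged - ready_for_discharge).days, 0)


# Hypothetical patient: admitted March 1, medically ready March 4,
# discharged March 6.
admitted = date(2010, 3, 1)
ready = date(2010, 3, 4)
discharged = date(2010, 3, 6)

los = length_of_stay(admitted, discharged)
ulos = unnecessary_los(ready, discharged)
```

Under these assumptions, a patient discharged on the day of medical readiness has a ULOS of 0.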

Analysis Phase I: Assessment of Relationship Characteristics

After the completion of data collection, field notes were reviewed by a research team member not involved in the original study design or primary data collection (senior medical student). We took this approach to guard against biasing the reviewer's view of team behaviors, both in terms of not having conducted observations of the teams and being blinded to patient outcomes.

The reviewer completed a series of 3 readings of all field notes. The first reading provided a summary of the content of the data and the individual teams. Behavioral patterns of each team were used to create an initial team profile. The field notes and profiles were reviewed by the PI and a coauthor not involved in data collection to ensure that the profiles adequately reflected the field notes. No significant changes to the profiles were made based on this review. The profiles were discussed at a meeting with members of the larger research team, including the PI, research assistants, and coinvestigators (with backgrounds in medicine, anthropology, and information and organization management). Behavior characteristics that could be used to distinguish teams were identified in the profiles using a grounded theory approach.

The second review of field notes was conducted to test the applicability of the characteristics identified in the first review. To systematically record the appearance of the behaviors, we created a matrix with a row for each behavior and columns for each team to note whether they exhibited each behavior. If the behavior was exhibited, specific examples were cataloged in the matrix. This matrix was reviewed and refined by the research team. During the final field note review meeting, the research team compared the summary matrix for each team, with the specific behaviors noted during the first reading of the field notes to ensure that all behaviors were recorded.

After cataloging behaviors, the research team assigned each behavior to 1 of the 7 Lanham relationship characteristics. We wanted to assess our observations against a relationship framework to ensure that we were able to systematically assess all aspects of relationships. The Lanham framework was initially developed based on a systematic review of the organizational and educational literatures, making it relevant to the complex environment of an academic medical inpatient team and allowing us to assess relationships at a fine‐grained, richly detailed level. This assignment was done by the author team as a group. Any questions were discussed and different interpretations resolved through consensus. The Lanham framework has 7 characteristics.[19] Based on the presence of behaviors associated with each relationship characteristic, we assigned a point to each team for each relationship characteristic observed. We considered a behavior type to be present if we observed it on at least 3 occasions on separate days. Though we used a threshold of at least 3 occurrences, most teams that did not receive a point for a particular characteristic did not have any instances in which we observed the characteristic. This was particularly true for trust and mindfulness, and least so for social/task relatedness. By summing these points, we calculated a total relationship score for each team, with potential scores ranging from 0 (for teams exhibiting no behaviors reflecting a particular relationship characteristic) to 7.
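
The scoring rule described above can be sketched in a few lines. This is a simplified illustration, not the research team's coding procedure: the input layout (one record per observed behavior, already assigned to a characteristic) and the function names are assumptions, and the sketch applies the 3-separate-days threshold at the level of the characteristic rather than the individual behavior type.

```python
CHARACTERISTICS = (
    "trust", "diversity", "respect", "mindfulness",
    "heedfulness", "rich communication", "social/task relatedness",
)
MIN_DAYS = 3  # counted as present once seen on at least 3 separate days


def relationship_score(observations):
    """Return the 0-7 relationship score for one team.

    `observations` is an iterable of (characteristic, day) pairs, one per
    observed behavior. This layout is an assumption of the sketch.
    """
    days_seen = {name: set() for name in CHARACTERISTICS}
    for characteristic, day in observations:
        days_seen[characteristic].add(day)
    return sum(1 for days in days_seen.values() if len(days) >= MIN_DAYS)


def score_group(score):
    """Map a score to the 3 groups used in the outcome comparison."""
    if score <= 2:
        return "0-2"
    if score <= 5:
        return "3-5"
    return "6-7"


# Hypothetical team: trust seen on 3 days, respect on 4 days, and
# mindfulness on only 2 distinct days (a repeated day counts once).
notes = (
    [("trust", d) for d in (1, 5, 9)]
    + [("respect", d) for d in (2, 3, 8, 12)]
    + [("mindfulness", d) for d in (4, 4, 6)]
)
team_score = relationship_score(notes)
team_group = score_group(team_score)
```

Storing days in a set keeps several occurrences on the same day from counting more than once, which matches the requirement that occasions fall on separate days.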

Analysis Phase II: Factor Analysis

To formally determine which relationship characteristics were most highly related, data were submitted to a principal components factor analysis using oblique rotation. Item separation was determined by visual inspection of the scree plot and eigenvalues over 1.

Analysis Phase III: Assessing the Association between Physician Team Relationship Characteristics and Patient Outcomes

We examined the association between team relationships and patient outcomes using team relationship scores. For the LOS/ULOS analysis, we only included patients whose entire hospitalization occurred under the care of the team we observed. Patients who were on the team at the start of the month, were transferred from another service, or who remained hospitalized after the end of the team's time together were excluded. The longest possible LOS for patients whose entire hospitalization occurred on teams that were observed for half a month was 12 days. To facilitate accurate comparison between teams, we only included patients whose LOS was ≤12 days.
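
The inclusion rule for the LOS/ULOS analysis can be sketched as a filter. The function name, argument names, and dates below are hypothetical, and the sketch assumes the 12-day cap is inclusive (the Results report that patients with LOS >12 days were removed). Transfers from another service are not modeled here.

```python
from datetime import date

MAX_LOS_DAYS = 12  # longest LOS possible on a team observed for half a month


def eligible_for_los_analysis(admitted, discharged, team_start, team_end):
    """True when the whole stay fell within the team's time together
    and the stay lasted at most MAX_LOS_DAYS days."""
    whole_stay_observed = team_start <= admitted and discharged <= team_end
    los = (discharged - admitted).days
    return whole_stay_observed and los <= MAX_LOS_DAYS


# Hypothetical team observed March 1 through March 27.
team_start, team_end = date(2010, 3, 1), date(2010, 3, 27)

included = eligible_for_los_analysis(
    date(2010, 3, 2), date(2010, 3, 6), team_start, team_end)
already_on_team = eligible_for_los_analysis(
    date(2010, 2, 27), date(2010, 3, 3), team_start, team_end)
stay_too_long = eligible_for_los_analysis(
    date(2010, 3, 3), date(2010, 3, 20), team_start, team_end)
still_admitted = eligible_for_los_analysis(
    date(2010, 3, 25), date(2010, 3, 30), team_start, team_end)
```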

Complication rates were defined on a per‐patient per‐day basis to normalize for different team volumes and days of observation. For this analysis, we included patients who remained on the team after data collection completion, patients transferred to another team, or patients transferred from another team. However, we only counted complications that occurred at least 24 hours following transfer to minimize the likelihood that the complication was related to the care of other physicians.
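
As a rough illustration of the counting rule above, the sketch below computes a per-patient, per-day complication rate. The function name and inputs are hypothetical. The formula (countable complications divided by the patient's days on the team) is an assumption, because the text states the normalization but not the exact calculation.

```python
from datetime import datetime, timedelta

TRANSFER_WASHOUT = timedelta(hours=24)  # window stated in the text


def complication_rate(complication_times, days_on_team, transferred_in_at=None):
    """Complications per day for one patient (assumed formula).

    Complications within 24 hours of a transfer onto the team are ignored,
    because they may reflect care delivered by other physicians.
    """
    if days_on_team <= 0:
        raise ValueError("days_on_team must be positive")
    if transferred_in_at is None:
        counted = list(complication_times)
    else:
        cutoff = transferred_in_at + TRANSFER_WASHOUT
        counted = [t for t in complication_times if t >= cutoff]
    return len(counted) / days_on_team


# Invented patient: transferred in, one early and one later complication.
rate = complication_rate(
    [datetime(2010, 1, 10, 20, 0), datetime(2010, 1, 12, 9, 0)],
    days_on_team=4,
    transferred_in_at=datetime(2010, 1, 10, 8, 0),
)
print(rate)  # 0.25
```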

Preliminary analysis involved inspection and assessment of the distribution of all variables, followed by a general linear modeling approach to assess the association between patient and workload covariates and outcomes.[28, 29] Because we anticipated that outcome variables would be markedly skewed, we also planned to assess the association between relationship characteristics and outcomes using the Kruskal‐Wallis rank sum test to compare groups, with Dunn's test[30] for pairwise comparisons if overall significance occurred.[31] There are no known acceptable methods for covariate adjustments using the Kruskal‐Wallis method. All models were run using SAS software (SAS Institute Inc., Cary, NC).[32]
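
For readers unfamiliar with the Kruskal‐Wallis rank sum test, the following Python sketch shows the calculation for exactly 3 groups, which matches the 3 score groups compared in this study. It is not the SAS procedure the authors ran. The function names and data are hypothetical, it uses the standard chi-square approximation (whose upper tail for 2 degrees of freedom is exp(-H/2)), and Dunn's pairwise test is not implemented.

```python
import math
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Rank values 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_3(groups):
    """Tie-corrected H statistic and approximate P value for 3 groups."""
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("expected exactly 3 non-empty groups")
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(c ** 3 - c for c in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p_value = math.exp(-h / 2)  # chi-square upper tail, 2 degrees of freedom
    return h, p_value


# Invented outcome values for 3 groups, not data from the study.
h, p = kruskal_wallis_3([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3), round(p, 4))  # 7.2 0.0273
```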

RESULTS

The research team observed 1941 discussions of 576 individual patients. Observations were conducted over 352 hours and 54 minutes, resulting in 741 pages of notes (see Supporting Table 1 in the online version of this article for data regarding individual team members). Teams observed over half‐months are referred to with "a" and "b" designations.

Table 1. Relationship Characteristics and Observed Behaviors

| Relationship Characteristic | Definition | Thirteen Types of Behaviors Observed in Field Notes | Observed Examples |
| --- | --- | --- | --- |
| Trust | Willingness to be vulnerable to others | Use of "we" instead of "you" or "I" by the attending | "Where are we going with this guy?" |
| | | Attending admitting "I don't know" | "Let's go talk to him, I can't figure this out" |
| | | Asking questions to help team members to think through problems | "Will the echo change our management? How will it help us?" |
| Diversity | Including different perspectives and different thinking | Team member participation in conversations about patients that are not theirs | One intern is presenting, another intern asks a question, and the resident joins the discussion |
| | | Inclusion of perspectives of those outside the team (nursing and family members) | Taking a break to call the nurse, having a family meeting |
| Respect | Valuing the opinions of others, honest and tactful interactions | Use of positive reinforcement by the attending | Being encouraging of the medical student's differential, saying "excellent" |
| | | How the team talks with patients | Asking if the patient has any concerns, what they can do to make them comfortable |
| Heedfulness | Awareness of how each person's roles impact the rest of the team | Team members performing tasks not expected of their role | One intern helping another with changing orders to transfer a patient |
| | | Summarizing plans and strategizing | Attending recaps the plan for the day, asks what they can do |
| Mindfulness | Openness to new ideas/free discussion about what is and is not working | Entire team engaged in discussion | Attending asks the medical student, intern, and resident what they think is going on |
| Social relatedness | Having socially related interactions | Social conversation among team members | Intern talks about their day off |
| | | Jokes by the attending | "Showers and a bowel movement is the key to making people happy" |
| Appropriate use of rich communication | Use of in‐person communication for sensitive or difficult issues | Using verbal communication with consultants or family | Intern is on the phone with the pharm D because there is a problem with the medication |

Creation of team profiles yielded 13 common behavior characteristics that were inductively identified and that could potentially distinguish teams, including consideration of perspectives outside of the team and team members performing tasks normally outside of their roles. Table 1 summarizes the observed behaviors with examples from the field notes and maps these behavior characteristics onto the Lanham relationship characteristics. The distribution of relationship characteristics and scores for each team are shown in Table 2.

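Table 2. Team Relationship Profiles

| Relationship Characteristic | Team 1 | Team 2 | Team 3a | Team 3b | Team 4a | Team 4b | Team 5 | Team 6 | Team 7 | Team 8 | Team 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Trust | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 |
| Diversity | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Respect | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
| Heedfulness | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
| Mindfulness | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
| Social/task relatedness | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| Rich/lean communication | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
| Relationship score (no. of characteristics observed) | 0 | 5 | 7 | 2 | 2 | 3 | 5 | 0 | 7 | 7 | 6 |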
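
The arithmetic behind Table 2 can be checked with a short Python sketch. It reads the table as 11 team columns (1, 2, 3a, 3b, 4a, 4b, 5, 6, 7, 8, 9), sums each column, and bins teams into the 0 to 2, 3 to 5, and 6 to 7 score groups used in the outcome analysis. The variable and function names are illustrative only.

```python
TEAMS = ["1", "2", "3a", "3b", "4a", "4b", "5", "6", "7", "8", "9"]

# One row per characteristic, one entry per team, as read from Table 2.
PROFILE = {
    "Trust":                   [0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1],
    "Diversity":               [0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
    "Respect":                 [0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1],
    "Heedfulness":             [0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1],
    "Mindfulness":             [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1],
    "Social/task relatedness": [0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "Rich/lean communication": [0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0],
}
REPORTED_SCORES = [0, 5, 7, 2, 2, 3, 5, 0, 7, 7, 6]

# Column sums reproduce the relationship score row of the table.
scores = [sum(column) for column in zip(*PROFILE.values())]


def score_group(score):
    """Bin a relationship score into the groups used in the outcome analysis."""
    if score <= 2:
        return "0-2"
    if score <= 5:
        return "3-5"
    return "6-7"


groups = {}
for team, score in zip(TEAMS, scores):
    groups.setdefault(score_group(score), []).append(team)

print(scores == REPORTED_SCORES)  # True
print(groups)
```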

Correlation between relationship characteristics ranged from 0.32 to 0.95 (see Supporting Table 2 in the online version of this article). Mindfulness and trust are more highly correlated with each other than with other variables, as are diversity and respect. We performed a principal components factor analysis. Based on scree plot inspection and eigenvalues >1, we kept 3 factors that explained 85% of the total variance (see Supporting Table 3 in the online version of this article).

Table 3. Association Between the Teams' Number of Relationship Characteristics and Patient Outcomes

| Outcome | Statistic | 0–2 Characteristics | 3–5 Characteristics | 6–7 Characteristics | P value |
| --- | --- | --- | --- | --- | --- |
| LOS, d, n=293 | Median | 4 | 5 | 3 | |
| | IQR | 5 | 4 | 3 | |
| | Mean | 4.7 (2.72) | 4.7 (2.52) | 4.1 (2.51) | 0.12 (a) |
| ULOS, d, n=293 | Median | 0 | 0 | 0 | |
| | IQR | 0 | 0 | 0 | |
| | Mean | 0.37 (0.99) | 0.33 (0.96) | 0.13 (0.56) | 0.09 (a) |
| Complications (per patient per day), n=398 | Median | 0 | 0 | 0 | |
| | IQR | 1 | 1 | 0 | |
| | Mean | 0.58 (1.06) | 0.45 (0.77) | 0.18 (0.59) | 0.001 compared to teams with 0–2 or 3–5 characteristics |

NOTE: Abbreviations: IQR, interquartile range; LOS, length of stay; ULOS, unnecessary length of stay.
(a) Not significant.

Our analyses of LOS and ULOS included 298 of the 576 patients. Two hundred sixty‐seven patients were excluded because their entire LOS did not occur while under the care of the observed teams. Eleven patients were removed from the analysis because their LOS was >12 days. The analysis of complications included 398 patients. In our preliminary general linear modeling approach, only patient workload was significantly associated with outcomes using a cutoff of P=0.05. Charlson‐Deyo score and mental health comorbidities were not associated with outcomes.

The results of the Kruskal‐Wallis test show the patient average ranking on each of the outcome variables by 3 groups (Table 3). Overall, teams with higher relationship scores had lower rank scores on all outcome measures. However, the only statistically significant comparisons were for complications. Teams having 6 to 7 characteristics had a significantly lower complication rate ranking than teams with 0 to 2 and 3 to 5 (P=0.001). We did not find consistent differences between individual teams or groups of teams with relationship scores from 0 to 2, 3 to 5, and 6 to 7 with regard to Charlson‐Deyo score, mental health issues, or workload. The only significant differences were between Charlson‐Deyo scores for patients admitted to teams with low relationship scores of 0 to 2 versus high relationship scores of 6 to 7 (6.7 vs 5.1); scores for teams with relationship scores of 3 to 5 were not significantly different from the low or high groups.

Table 4 shows the Kruskal‐Wallis rank test results for each group of relationship characteristics identified in the factor analysis based on whether teams displayed all or none of the characteristics in the factor. There were no differences in these groupings for LOS. Teams that exhibited both mindfulness and trust had lower ranks on ULOS than teams that did not have either. Similarly, teams with heedfulness, social‐task relatedness, and more rich communication demonstrated lower ULOS rankings than teams who did not have all 3 characteristics.

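Table 4. Association Between Inpatient Physician Team Relationship Characteristics and Outcomes

| Patient Outcome | Statistic | Mind/Trust: None | Mind/Trust: Both | Diversity/Respect: None | Diversity/Respect: Both | Heed/Relate/Communicate: None | Heed/Relate/Communicate: All 3 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LOS, d, n=293 | Median | 4 | 4 | 4 | 4 | 4 | 4 |
| | IQR | 5 | 3 | 4.5 | 3 | 4 | 4 |
| | Mean | 4.7 (2.6) | 4.2 (2.5) | 4.7 (2.6) | 4.3 (2.5) | 4.4 (2.6) | 4.4 (2.6) |
| | P value | 0.06 (a) | | 0.23 (a) | | 0.85 (a) | |
| ULOS, d, n=293 | Median | 0 | 0 | 0 | 0 | 0 | 0 |
| | IQR | 0 | 0 | 0 | 0 | 0 | 0 |
| | Mean | 0.39 (1.01) | 0.15 (0.62) | 0.33 (0.92) | 0.18 (0.71) | 0.32 (0.93) | 0.18 (0.69) |
| | P value | 0.009 | | 0.06 | | 0.03 | |
| Complications (per patient), n=389 | Median | 0 | 0 | 0 | 0 | 0 | 0 |
| | IQR | 1 | 0 | 1 | 0 | 1 | 0 |
| | Mean | 0.58 (1.01) | 0.19 (0.58) | 0.47 (0.81) | 0.29 (0.82) | 0.26 (0.92) | 0.28 (0.70) |
| | P value | <0.0001 | | 0.001 | | 0.02 | |

NOTE: Abbreviations: IQR, interquartile range; LOS, length of stay; ULOS, unnecessary length of stay. Each P value applies to the pair of columns for its factor.
(a) Not significant.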

DISCUSSION

Relationships are critical to team function because they are the basis for the social interactions that are central to patient care. These interactions include how providers recognize and make sense of what is happening with patients, and how they learn to care for patients more effectively. Additionally, the high task interdependencies among inpatient providers require effective relationships for optimal care. In our study, inpatient medicine physician teams' relationships varied, and these differences were associated with ULOS and complications. Relationship characteristics are not mutually exclusive, and as our factor analysis demonstrates, are intercorrelated. Trust and mindfulness appear to be particularly important. Trust may foster psychological safety that in turn promotes the willingness of individuals to contribute their thoughts and ideas.[13] In low‐trust teams, providers may fear a negative impact for bringing forward a concern based on limited data. Mindful teams may be more likely to notice nuanced changes, or to speak up when things just do not appear to be going in the right direction with the patient. In the case of acutely ill medical patients, trust and mindfulness may lead to an increased likelihood that clinical changes are recognized and discussed quickly. For example, on a team characterized by trust and mindfulness, the entire team was typically involved in care discussions, and the interns and students frequently asked questions, even regarding the care of patients they were not directly following. We observed that these questions and discussions often led the team to realize that they needed to make a change in management decisions (eg, discontinuing Bactrim, lowering insulin doses, adjusting antihypertensives, premedicating for intravenous contrast) that they had not caught in the assessment and plan portion of the patient care discussion.
In another example, a medical student asked a tentative question after a patient needed to go quickly to the bathroom while they were examining her, leading the team to ask more questions that led to a more rapid evaluation of a potential urinary tract infection. This finding is consistent with the description of failure to rescue among surgical patients, in which mortality has been associated with the failure to recognize complications rapidly and act effectively.[33]

Our findings are limited in several ways. First, these data are from a single academic institution. Although we sought diversity among our teams and collected data across 2 hospitals, there may be local contextual factors that influenced our results. Second, our data demonstrate an association, but not causality. Our findings should be tested in studies that assess causality and potential mechanisms through which relationships influence outcomes. Third, the individuals observing the teams had some knowledge of patient outcomes through hearing patient discussions. However, by involving individuals who did not participate in observations and were blinded to outcomes in assessing team relationships, we addressed this potential bias. Fourth, our observations were largely focused on physician teams, not directly including other providers. Our difficulty in observing regular interactions between physicians and other providers underscores the need to increase contact among those caring for hospitalized patients, such as occurs through multidisciplinary rounds. We did include team communication with other disciplines in our assessment of the relationship characteristics of diversity and rich communication. Finally, our analysis was limited by our sample size. We observed a relatively small number of teams. Although we benefitted from seeing the change in team relationships that occurred with attending changes halfway through some of our data collection months, this did limit the number of patients we could include in our analyses. Though we did not observe obvious differences in relationships between the teams observed across the 2 hospitals, the small number of teams and hospitals precluded our ability to perform multilevel modeling analyses, which would have allowed us to assess or account for the influence of team or organizational factors. However, this small sample size did allow for a richer assessment of team behaviors.

Although preliminary, our findings are an important step in understanding the function of inpatient medical teams not only in terms of processes of care, but also in terms of relationships. Patient care is a social activity, requiring effective communication to develop working diagnoses, recognize changes in patients' clinical courses, and formulate effective treatment plans during and after hospitalization. Future work could follow several directions. One would be to assess the causal mechanisms through which relationships influence patient outcomes. These may include sensemaking, learning, and improved coordination. Positive relationships may facilitate interaction of tacit and explicit information, facilitating the creation of understandings that foster more effective patient care.[34] The dynamic nature of relationships and how patient outcomes in turn feed back into relationships could be an area of exploration. This line of research could build on the idea of teaming.[35] Understanding relationships across multidisciplinary teams or with patients and families would be another direction. Finally, our results could point to potential interventions to improve patient outcomes through improving relationships. Better understanding of the nature of effective relationships among providers should enable us to develop more effective strategies to improve the care of hospitalized patients. In the larger context of payment reforms that require greater coordination and communication among and across providers, a greater understanding of how relationships influence patient outcomes will be important.

Acknowledgements

The authors thank the physicians involved in this study and Ms. Shannon Provost for her involvement in discussions of this work.

Disclosures: The research reported herein was supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service (CDA 07‐022). Investigator salary support was provided through this funding, and through the South Texas Veterans Health Care System. The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs. Dr. McDaniel receives support from the IC² Institute of the University of Texas at Austin. Dr. Luci Leykum had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The authors report no conflicts of interest.

Since the Institute of Medicine Report To Err is Human, increased attention has been paid to improving the care of hospitalized patients.[1] Strategies include utilization of guidelines and pathways, and the application of quality improvement techniques to improve or standardize processes. Despite improvements in focused areas such as prevention of hospital‐acquired infections, evidence suggests that outcomes for hospitalized patients remain suboptimal.[2] Rates of errors and hospital‐related complications such as falls, decubitus ulcers, and infections remain high,[3, 4, 5] and not all patients receive what is known to be appropriate care.[6]

Many attempts to improve inpatient care have used process‐improvement approaches, focusing on impacting individuals' behaviors, or on breaking down processes into component parts. Examples include central line bundles or checklists.[7, 8] These approaches attempt to ensure that providers do things in a standardized way, but are implicitly based on the reductionist assumption that we can break processes down into predictable parts to improve the system. An alternative way to understand clinical systems is based on interdependencies between individuals in the system, or the ways in which parts of the system interact with each other, which may be unpredictable over time.[1, 9] Whereas these interdependencies include care processes, they also encompass the providers who care for patients. Providers working together vary in terms of the kinds of relationships they have with each other. Those relationships are crucial to system function because they are the foundation for the interactions that lead to effective patient care.

The application of several frameworks or approaches for considering healthcare systems in terms of relationships highlights the importance of this way of understanding system function. These include complexity science,[1, 7] relational coordination (which is grounded in complexity science),[10] high reliability,[11] and the Big Five for teamwork.[12]

Research indicates that interactions among healthcare providers can have important influences on outcomes.[13, 14, 15, 16, 17] Additionally, the initial implementation of checklists to prevent central‐line associated infections appeared to change provider relationships in a way that significantly influenced their success.[18] For example, positive primary care clinic member relationships as assessed by the Lanham framework have been associated with better chronic care model implementation, learning, and patient experience of care.[19, 20] This framework, which we apply here, identifies 7 relationship characteristics: (1) trust; (2) diversity; (3) respect; (4) mindfulness, or being open to new ideas from others; (5) heedfulness, or an understanding of how one's roles influence those of others; (6) use of rich in‐person or verbal communication, particularly for potentially ambiguous information open to multiple interpretations; and (7) having a mixture of social and task relatedness among teams, or relatedness outside of only work‐related tasks.[19] Relationships within surgical teams that are characterized by psychological safety and diversity are associated with successful uptake of new techniques and decreased mortality.[13, 14] Relationships are important because the ability of patients and providers to learn and make sense of their patients' illnesses is grounded in relationships.

We sought to better understand and characterize inpatient physician teams' relationships, and assess the association between team relationships as evaluated by Lanham's framework and outcomes for hospitalized patients. Data on relationships among inpatient medical teams are few, despite the fact that these teams provide a great proportion of inpatient care. Additionally, the care of hospitalized medical patients is complex and uncertain, often involving multiple providers, making provider relationships potentially even more important to outcomes than in other settings.

METHODS

Overview

We conducted an observational, convergent mixed‐methods study of inpatient medicine teams.[21, 22, 23] We focused on inpatient physician teams, defining them as the functional work group responsible for medical decision making in academic medical centers. Physician teams in this context have been studied in terms of social hierarchy, authority, and delegation.[24, 25, 26] Focusing on the relationships within these groups could provide insights into strategies to mitigate potential negative effects of hierarchy. We recognize that other providers are closely involved in the care of hospitalized patients, and although we did not have standard interactions between physicians, nurses, case managers, and other providers that we could consistently observe, we did include interactions with these other providers in our observations and assessments of team relationships. Because this work is among the first in inpatient medical teams, we chose to study a small number of teams in great depth, allowing us to make rich assessments of team relationships.

We chose patient outcomes of length of stay (LOS), unnecessary LOS (ULOS), and complication rates, adjusted for patient characteristics and team workload. LOS is an important metric of inpatient care delivery. We feel ULOS is an aspect of LOS that is dependent on the physician team, as it reflects their preparation of the patient for discharge. Finally, we chose complication rates because hospital‐acquired conditions and complications are important contributors to inpatient morbidity, and because recent surgical literature has identified complication rates as a contributor to mortality that could be related to providers' collective ability to recognize complications and act quickly.

This study was approved by the institutional review board at the University of Texas Health Science Center at San Antonio (UTHSCSA), the Research and Development Committee for the South Texas Veterans Health Care System (STVHCS), and the Research Committee at University Health System (UHS). All physicians consented to participate in the study. We obtained a waiver of consent for inclusion of patient data.

Setting and Study Participants

This study was conducted at the 2 UTHSCSA primary teaching affiliates. The Audie L. Murphy Veterans Affairs Hospital is the 220‐bed acute‐care hospital of the STVHCS. University Hospital is the 614‐bed, level‐I trauma, acute‐care facility for UHS, the county system for Bexar County, which includes the San Antonio, Texas major metropolitan area.

The inpatient internal medicine physician team was our unit of study. Inpatient medicine teams consisted of 1 faculty attending physician, 1 postgraduate year (PGY)‐2 or PGY‐3 resident, and 2 PGY‐1 members. In addition, typically 2 to 3 third‐year medical students were part of the team, and a subintern was sometimes present. Doctor of Pharmacy faculty and students were also occasionally part of the team. Social workers and case managers often joined team rounds for portions of the time, and nurses sometimes joined bedside rounds on specific patients. These teams admit all medicine patients with the exception of those with acute coronary syndromes, new onset congestive heart failure, or arrhythmias. Patients are randomly assigned to teams based on time of admission and call schedules.

Between these 2 hospitals, there are 10 inpatient medicine teams caring for patients, with a pool of over 40 potential faculty attendings. Our goal was to observe teams that would be most likely to vary in terms of their relationship characteristics and patient outcomes through observing teams with a range of individual members. We used a purposeful sampling approach to obtain a diverse sample, sampling based on physician attributes and time of year.[16, 17] Three characteristics were most important: attending physician years of experience, attending involvement in educational and administrative leadership, and the presence of struggling resident members, as defined by being on probation or having been discussed in the residency Clinical Competency Committee. We did not set explicit thresholds in terms of attending experience, but instead sought to ensure a range. The attendings we observed were more likely to be involved in education and administrative leadership activities, but were otherwise similar to those we did not observe in terms of years of experience. We included struggling residents to observe individuals with a range of skill sets, and not just high‐performing individuals. We obtained attending information based on our knowledge of the attending faculty pool, and from the internal medicine residency program. We sampled across the year to ensure a diversity of trainee experience, but did not observe teams in either July or August, as these months were early in the academic year. Interns spend approximately 5 months per year on inpatient services, whereas residents spend 2 to 3 months per year. Thus, interns but not residents observed later in the year might have spent significantly more time on an inpatient service. However, in all instances, none of the team members observed had worked together previously.

Data Collection

Data were collected over nine 1‐month periods from September 2008 through June 2011. Teams were observed daily for 2‐ to 4‐week periods during morning rounds, the time when the team discusses each patient and makes clinical decisions. Data collection started on the first day of the month, the first day that all team members worked together, and continued for approximately 27 days, the last day before the resident rotated to a different service. By comprehensively and systematically observing these teams' daily rounds, we obtained rich, in‐depth data with multiple data points, enabling us to assess specific team behaviors and interactions.

During the third and fourth months, we collected data on teams in which the attending changed partway through. We did this to understand the impact of individual attending change on team relationships. Because the team relationships differed with each attending, we analyzed them separately. Thus, we observed 7 teams for approximately 4‐week periods and 4 teams for approximately 2‐week periods.

Observers arrived in the team room prior to rounds to begin observations, staying until after rounds were completed. Detailed free‐text field notes were taken regarding team activities and behaviors, including how the teams made patient care decisions. Field notes included: length of rounds, which team members spoke during each patient discussion, who contributed to management discussions, how information from consultants was incorporated, how communication with others outside of the team occurred, how team members spoke with each other including the types of words used, and team member willingness to perform tasks outside of their usually defined role, among others. Field notes were collected in an open‐ended format to allow for inductive observations. Observers also recorded clinical data daily regarding each patient, including admission and discharge dates, and presenting complaint.

The observation team consisted of the principal investigator (PI) (hospitalist) and 2 research assistants (a graduate‐level medical anthropologist and social psychologist), all of whom were trained by a qualitative research expert to systematically collect data related to topics of interest. Observers were instructed to record what the teams were doing and talking about at all times, noting any behaviors that they felt reflected how team members related to each other and came to decisions about their patients, or that were characteristic of the team. To ensure consistency, the PI and 1 research assistant conducted observations jointly at the start of data collection for each team, checking concordance of observations daily using a percent agreement until general agreement on field note content and patient information reached 90%. Two individuals observed 24 days of data collection, representing 252 patient discussions (13% of observed discussions).
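
A percent agreement check of the kind described above might look like the following Python sketch. How the observers' notes were broken into comparable items is not described in the text, so the item lists, the function name, and the example codes are hypothetical. The final line only re-derives the stated 13% share of jointly observed discussions from the reported totals (252 of 1941).

```python
AGREEMENT_TARGET = 90.0  # percent, as stated in the text


def percent_agreement(observer_a, observer_b):
    """Share of paired items (in percent) that two observers recorded identically."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * matches / len(observer_a)


# Invented codes for 10 paired field-note items.
pi_codes = ["q", "q", "plan", "joke", "q", "plan", "plan", "q", "joke", "q"]
ra_codes = ["q", "q", "plan", "joke", "q", "plan", "q", "q", "joke", "q"]
agreement = percent_agreement(pi_codes, ra_codes)
print(agreement, agreement >= AGREEMENT_TARGET)  # 90.0 True

# Jointly observed discussions as a share of all observed discussions.
joint_share = round(100 * 252 / 1941)
print(joint_share)  # 13
```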

An age‐adjusted Charlson‐Deyo comorbidity score was calculated for each patient admitted to each team, using data from rounds and from each hospital's electronic health records (EHR).[27] We collected data regarding mental health conditions for each patient (substance use, mood disorder, cognitive disorder, or a combination) because these comorbidities could impact LOS or ULOS. Discharge diagnoses were based on the discharge summary in the EHR. We also collected data daily regarding team census and numbers of admissions to and discharges from each team to assess workload.

Three patient outcomes were measured: LOS, ULOS, and complications. LOS was defined as the total number of days the patient was in the hospital. ULOS was defined as the number of days a patient remained in the hospital after the day the team determined the patient was medically ready for discharge (assessed by either discussion on rounds or EHR documentation). ULOS may occur when postdischarge needs have not been adequately assessed, or because of delays in care, which may be related to provider communication during the hospitalization. Complications were defined on a per‐patient, per‐day basis in 2 ways: the development of a new problem in the hospital, for example acute kidney injury, a hospital‐acquired infection, or delirium, or by the team noting a clinical deterioration after at least 24 hours of clinical stability, such as the patient requiring transfer to a higher level of care. Complications were determined based on discussions during rounds, with EHR verification if needed.
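
The LOS and ULOS definitions above can be illustrated with a small Python sketch. The function names and dates are hypothetical, and counting days as the difference between calendar dates is an assumption, since the text does not say how partial days were handled.

```python
from datetime import date


def length_of_stay(admitted, discharged):
    """Total days in hospital, counted as calendar-date difference (assumption)."""
    return (discharged - admitted).days


def unnecessary_los(medically_ready, discharged):
    """Days in hospital after the day the team judged the patient ready.

    Returns 0 when the patient left on the day of readiness, or when no
    readiness date was recorded before discharge.
    """
    if medically_ready is None:
        return 0
    return max((discharged - medically_ready).days, 0)


# Invented patient: admitted Feb 1, judged ready Feb 4, discharged Feb 6.
los = length_of_stay(date(2010, 2, 1), date(2010, 2, 6))
ulos = unnecessary_los(date(2010, 2, 4), date(2010, 2, 6))
print(los, ulos)  # 5 2
```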

Analysis Phase I: Assessment of Relationship Characteristics

After the completion of data collection, field notes were reviewed by a research team member not involved in the original study design or primary data collection (senior medical student). We took this approach to guard against biasing the reviewer's view of team behaviors, both in terms of not having conducted observations of the teams and being blinded to patient outcomes.

The reviewer completed a series of 3 readings of all field notes. The first reading provided a summary of the content of the data and the individual teams. Behavioral patterns of each team were used to create an initial team profile. The field notes and profiles were reviewed by the PI and a coauthor not involved in data collection to ensure that the profiles adequately reflected the field notes. No significant changes to the profiles were made based on this review. The profiles were discussed at a meeting with members of the larger research team, including the PI, research assistants, and coinvestigators (with backgrounds in medicine, anthropology, and information and organization management). Behavior characteristics that could be used to distinguish teams were identified in the profiles using a grounded theory approach.

The second review of field notes was conducted to test the applicability of the characteristics identified in the first review. To systematically record the appearance of the behaviors, we created a matrix with a row for each behavior and columns for each team to note whether they exhibited each behavior. If the behavior was exhibited, specific examples were cataloged in the matrix. This matrix was reviewed and refined by the research team. During the final field note review meeting, the research team compared the summary matrix for each team, with the specific behaviors noted during the first reading of the field notes to ensure that all behaviors were recorded.

After cataloging behaviors, the research team assigned each behavior to 1 of the 7 Lanham relationship characteristics. We wanted to assess our observations against a relationship framework to ensure that we were able to systematically assess all aspects of relationships. The Lanham framework was initially developed based on a systematic review of the organizational and educational literatures, making it relevant to the complex environment of an academic medical inpatient team and allowing us to assess relationships at a fine‐grained, richly detailed level. This assignment was done by the author team as a group. Any questions were discussed and different interpretations resolved through consensus. The Lanham framework has 7 characteristics.[19] Based on the presence of behaviors associated with each relationship characteristic, we assigned a point to each team for each relationship characteristic observed. We considered a behavior type to be present if we observed it on at least 3 occasions on separate days. Though we used a threshold of at least 3 occurrences, most teams that did not receive a point for a particular characteristic did not have any instances in which we observed the characteristic. This was particularly true for trust and mindfulness, and least so for social/task relatedness. By summing these points, we calculated a total relationship score for each team, with potential scores ranging from 0 (for teams exhibiting no behaviors reflecting a particular relationship characteristic) to 7.

Analysis Phase II: Factor Analysis

To formally determine which relationship characteristics were most highly related, data were submitted to a principal components factor analysis using oblique rotation. Item separation was determined by visual inspection of the scree plot and eigenvalues over 1.
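The eigenvalue criterion mentioned above can be illustrated with a minimal sketch. It covers only the retention step (eigenvalues of a correlation matrix greater than 1); it does not perform the oblique rotation or the scree plot inspection, and it is not the SAS procedure used in the study. The Jacobi routine and all function names are assumptions introduced for illustration.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_components(corr):
    """Apply the eigenvalue > 1 rule to a correlation matrix."""
    eigenvalues = jacobi_eigenvalues(corr)
    kept = [v for v in eigenvalues if v > 1.0]
    # The trace of a correlation matrix equals the number of variables.
    explained = sum(kept) / len(corr)
    return eigenvalues, len(kept), explained


# Two variables correlated at 0.5: eigenvalues 1.5 and 0.5, so 1 is kept.
eigenvalues, n_kept, explained = retained_components([[1.0, 0.5], [0.5, 1.0]])
```

In practice a statistical package would handle extraction and rotation together; the sketch only makes the retention rule concrete.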

Analysis Phase III: Assessing the Association between Physician Team Relationship Characteristics and Patient Outcomes

We examined the association between team relationships and patient outcomes using team relationship scores. For the LOS/ULOS analysis, we only included patients whose entire hospitalization occurred under the care of the team we observed. Patients who were on the team at the start of the month, were transferred from another service, or remained hospitalized after the end of the team's time together were excluded. The longest possible LOS for patients whose entire hospitalization occurred on teams that were observed for half a month was 12 days. To facilitate accurate comparison between teams, we only included patients whose LOS was ≤12 days.

Complication rates were defined on a per‐patient per‐day basis to normalize for different team volumes and days of observation. For this analysis, we included patients who remained on the team after data collection completion, patients transferred to another team, or patients transferred from another team. However, we only counted complications that occurred at least 24 hours following transfer to minimize the likelihood that the complication was related to the care of other physicians.
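One way to read this definition is sketched below. The record layout, the field names, and the choice of days under the team's care as the denominator are assumptions; the article does not spell out the exact computation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

WASHOUT = timedelta(hours=24)  # window after transfer stated in the text


@dataclass
class PatientStay:
    # Hypothetical record layout, introduced for illustration.
    start: datetime                # when the observed team assumed care
    end: datetime                  # end of care or of observation
    transferred_in: bool           # True if the patient came from another team
    complications: List[datetime]  # time of each recorded complication


def eligible_complications(stay):
    """Drop complications within 24 hours of a transfer onto the team."""
    cutoff = stay.start + WASHOUT if stay.transferred_in else stay.start
    return [t for t in stay.complications if cutoff <= t <= stay.end]


def complications_per_patient_day(stay):
    """Eligible complications divided by days under the team's care."""
    days = (stay.end - stay.start).total_seconds() / 86400
    if days <= 0:
        raise ValueError("stay must have positive duration")
    return len(eligible_complications(stay)) / days


stay = PatientStay(
    start=datetime(2014, 1, 1),
    end=datetime(2014, 1, 5),
    transferred_in=True,
    complications=[datetime(2014, 1, 1, 12), datetime(2014, 1, 3)],
)
rate = complications_per_patient_day(stay)  # 1 eligible event over 4 days
```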

Preliminary analysis involved inspection and assessment of the distribution of all variables, followed by a general linear modeling approach to assess the association between patient and workload covariates and outcomes.[28, 29] Because we anticipated that outcome variables would be markedly skewed, we also planned to assess the association between relationship characteristics and outcomes using the Kruskal‐Wallis rank sum test to compare groups, with Dunn's test[30] for pairwise comparisons if overall significance occurred.[31] There are no known acceptable methods for covariate adjustments using the Kruskal‐Wallis method. All models were run using SAS software (SAS Institute Inc., Cary, NC).[32]
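For readers unfamiliar with the rank sum test named above, the following sketch computes the Kruskal‐Wallis H statistic with the standard tie correction. It is not the SAS implementation used in the study, the function names are assumptions, and it stops at the statistic: the P value (from a chi‐square distribution with k−1 degrees of freedom) and Dunn's pairwise comparisons are not included.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


h_no_ties = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_with_ties = kruskal_wallis_h([[1, 1], [2, 2]])
```

Because the test works on ranks, it suits skewed outcomes such as LOS and complication counts, where most patients have a value of 0 or a few days.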

RESULTS

The research team observed 1941 discussions of 576 individual patients. Observations were conducted over 352 hours and 54 minutes, resulting in 741 pages of notes (see Supporting Table 1 in the online version of this article for data regarding individual team members). Teams observed over half‐months are referred to with a and b designations.

Table 1. Relationship Characteristics and Observed Behaviors

| Relationship Characteristic | Definition | Thirteen Types of Behaviors Observed in Field Notes | Observed Examples |
| --- | --- | --- | --- |
| Trust | Willingness to be vulnerable to others | Use of "we" instead of "you" or "I" by the attending | Where are we going with this guy? |
| | | Attending admitting "I don't know" | Let's go talk to him, I can't figure this out |
| | | Asking questions to help team members to think through problems | Will the echo change our management? How will it help us? |
| Diversity | Including different perspectives and different thinking | Team member participation in conversations about patients that are not theirs | One intern is presenting, another intern asks a question, and the resident joins the discussion |
| | | Inclusion of perspectives of those outside the team (nursing and family members) | Taking a break to call the nurse, having a family meeting |
| Respect | Valuing the opinions of others, honest and tactful interactions | Use of positive reinforcement by the attending | Being encouraging of the medical student's differential, saying "excellent" |
| | | How the team talks with patients | Asking if the patient has any concerns, what they can do to make them comfortable |
| Heedfulness | Awareness of how each person's roles impact the rest of the team | Team members performing tasks not expected of their role | One intern helping another with changing orders to transfer a patient |
| | | Summarizing plans and strategizing | Attending recaps the plan for the day, asks what they can do |
| Mindfulness | Openness to new ideas/free discussion about what is and is not working | Entire team engaged in discussion | Attending asks the medical student, intern, and resident what they think is going on |
| Social relatedness | Having socially related interactions | Social conversation among team members | Intern talks about their day off |
| | | Jokes by the attending | Showers and a bowel movement is the key to making people happy |
| Appropriate use of rich communication | Use of in‐person communication for sensitive or difficult issues | Using verbal communication with consultants or family | Intern is on the phone with the pharm D because there is a problem with the medication |

Creation of team profiles yielded 13 common behavior characteristics that were inductively identified and that could potentially distinguish teams, including consideration of perspectives outside of the team and team members performing tasks normally outside of their roles. Table 1 provides examples of and summarizes observed behaviors using examples from the field notes, mapping these behavior characteristics onto the Lanham relationship characteristics. The distribution of relationship characteristics and scores for each team are shown in Table 2.

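Table 2. Team Relationship Profiles

| Relationship Characteristic | Team 1 | Team 2 | Team 3a | Team 3b | Team 4a | Team 4b | Team 5 | Team 6 | Team 7 | Team 8 | Team 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Trust | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 |
| Diversity | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Respect | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
| Heedfulness | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
| Mindfulness | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
| Social/task relatedness | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| Rich/lean communication | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
| Relationship score (no. of characteristics observed) | 0 | 5 | 7 | 2 | 2 | 3 | 5 | 0 | 7 | 7 | 6 |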

Correlation between relationship characteristics ranged from 0.32 to 0.95 (see Supporting Table 2 in the online version of this article). Mindfulness and trust are more highly correlated with each other than with other variables, as are diversity and respect. We performed a principal components factor analysis. Based on scree plot inspection and eigenvalues >1, we kept 3 factors that explained 85% of the total variance (see Supporting Table 3 in the online version of this article).

Table 3. Association Between the Teams' Number of Relationship Characteristics and Patient Outcomes

| Outcome | 0–2 Characteristics | 3–5 Characteristics | 6–7 Characteristics | P Value |
| --- | --- | --- | --- | --- |
| LOS, d, n=293 | | | | |
| Median | 4 | 5 | 3 | |
| IQR | 5 | 4 | 3 | |
| Mean | 4.7 (2.72) | 4.7 (2.52) | 4.1 (2.51) | 0.12a |
| ULOS, d, n=293 | | | | |
| Median | 0 | 0 | 0 | |
| IQR | 0 | 0 | 0 | |
| Mean | 0.37 (0.99) | 0.33 (0.96) | 0.13 (0.56) | 0.09a |
| Complications (per patient per day), n=398 | | | | |
| Median | 0 | 0 | 0 | |
| IQR | 1 | 1 | 0 | |
| Mean | 0.58 (1.06) | 0.45 (0.77) | 0.18 (0.59) | 0.001 (compared with teams with 0–2 or 3–5 characteristics) |

NOTE: Abbreviations: IQR, interquartile range; LOS, length of stay; ULOS, unnecessary length of stay.
a: Not significant.

Our analyses of LOS and ULOS included 298 of the 576 patients. Two hundred sixty‐seven patients were excluded because their entire LOS did not occur while under the care of the observed teams. Eleven patients were removed from the analysis because their LOS was >12 days. The analysis of complications included 398 patients. In our preliminary general linear modeling approach, only patient workload was significantly associated with outcomes using a cutoff of P=0.05. Charlson‐Deyo score and mental health comorbidities were not associated with outcomes.
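The inclusion rule and the patient counts reported above can be checked with a short sketch. The record layout and function names are assumptions, and the 12‐day cap is read as "12 days or fewer", consistent with the removal of patients whose LOS was >12 days.

```python
from dataclasses import dataclass

MAX_LOS_DAYS = 12  # longest stay retained in the LOS/ULOS analysis


@dataclass
class Patient:
    # Hypothetical record layout, introduced for illustration.
    entire_stay_on_team: bool
    los_days: int


def included_in_los_analysis(patient):
    """Keep patients cared for only by the observed team, with LOS <= 12."""
    return patient.entire_stay_on_team and patient.los_days <= MAX_LOS_DAYS


def remaining_after_exclusions(total, partial_stay, long_stay):
    """Patients left once both exclusion groups are removed."""
    return total - partial_stay - long_stay


# Counts as reported in the text: 576 observed, 267 and 11 excluded.
ANALYZED = remaining_after_exclusions(576, 267, 11)
```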

The results of the Kruskal‐Wallis test show the patient average ranking on each of the outcome variables by 3 groups (Table 3). Overall, teams with higher relationship scores had lower rank scores on all outcome measures. However, the only statistically significant comparisons were for complications. Teams having 6 to 7 characteristics had a significantly lower complication rate ranking than teams with 0 to 2 and 3 to 5 (P=0.001). We did not find consistent differences between individual teams or groups of teams with relationship scores from 0 to 2, 3 to 5, and 6 to 7 with regard to Charlson‐Deyo score, mental health issues, or workload. The only significant differences were between Charlson‐Deyo scores for patients admitted to teams with low relationship scores of 0 to 2 versus high relationship scores of 6 to 7 (6.7 vs 5.1); scores for teams with relationship scores of 3 to 5 were not significantly different from the low or high groups.

Table 4 shows the Kruskal‐Wallis rank test results for each group of relationship characteristics identified in the factor analysis based on whether teams displayed all or none of the characteristics in the factor. There were no differences in these groupings for LOS. Teams that exhibited both mindfulness and trust had lower ranks on ULOS than teams that did not have either. Similarly, teams with heedfulness, social‐task relatedness, and more rich communication demonstrated lower ULOS rankings than teams that did not have all 3 characteristics.
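The all‐or‐none grouping can be illustrated with the team profiles reported in Table 2. This is a sketch under stated assumptions: the 0/1 strings transcribe the Table 2 rows in team order, the function name is hypothetical, and the label "mixed" for teams showing only some characteristics of a factor is our own; the article does not say how such teams were handled.

```python
TEAMS = ["1", "2", "3a", "3b", "4a", "4b", "5", "6", "7", "8", "9"]

# One string per characteristic, one digit per team, transcribed from Table 2.
PROFILE = {
    "trust": "00100010111",
    "diversity": "01110000111",
    "respect": "01110100111",
    "heedfulness": "01101010111",
    "mindfulness": "00100110111",
    "social/task relatedness": "01101110111",
    "rich/lean communication": "01100010110",
}

# Factor groupings as named in the text.
FACTORS = {
    "mind/trust": ["mindfulness", "trust"],
    "diversity/respect": ["diversity", "respect"],
    "heed/relate/communicate": [
        "heedfulness", "social/task relatedness", "rich/lean communication",
    ],
}


def classify(factor):
    """Label each team 'all', 'none', or 'mixed' for one factor."""
    labels = {}
    for idx, team in enumerate(TEAMS):
        present = [PROFILE[c][idx] == "1" for c in FACTORS[factor]]
        if all(present):
            labels[team] = "all"
        elif not any(present):
            labels[team] = "none"
        else:
            labels[team] = "mixed"  # hypothetical label, not from the article
    return labels


# Relationship score per team: the column sums of the profile.
SCORES = [sum(int(PROFILE[c][i]) for c in PROFILE) for i in range(len(TEAMS))]
mind_trust = classify("mind/trust")
```

Under this reading, only team 4b falls outside the "both" and "none" groups for the mindfulness and trust factor.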

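Table 4. Association Between Inpatient Physician Team Relationship Characteristics and Outcomes

| Patient Outcome | Mind/Trust: None | Mind/Trust: Both | Diversity/Respect: None | Diversity/Respect: Both | Heed/Relate/Communicate: None | Heed/Relate/Communicate: All 3 |
| --- | --- | --- | --- | --- | --- | --- |
| LOS, d, n=293 | | | | | | |
| Median | 4 | 4 | 4 | 4 | 4 | 4 |
| IQR | 5 | 3 | 4.5 | 3 | 4 | 4 |
| Mean | 4.7 (2.6) | 4.2 (2.5) | 4.7 (2.6) | 4.3 (2.5) | 4.4 (2.6) | 4.4 (2.6) |
| P value | 0.06a | | 0.23a | | 0.85a | |
| ULOS, d, n=293 | | | | | | |
| Median | 0 | 0 | 0 | 0 | 0 | 0 |
| IQR | 0 | 0 | 0 | 0 | 0 | 0 |
| Mean | 0.39 (1.01) | 0.15 (0.62) | 0.33 (0.92) | 0.18 (0.71) | 0.32 (0.93) | 0.18 (0.69) |
| P value | 0.009 | | 0.06 | | 0.03 | |
| Complications (per patient), n=389 | | | | | | |
| Median | 0 | 0 | 0 | 0 | 0 | 0 |
| IQR | 1 | 0 | 1 | 0 | 1 | 0 |
| Mean | 0.58 (1.01) | 0.19 (0.58) | 0.47 (0.81) | 0.29 (0.82) | 0.26 (0.92) | 0.28 (0.70) |
| P value | <0.0001 | | 0.001 | | 0.02 | |

NOTE: Abbreviations: IQR, interquartile range; LOS, length of stay; ULOS, unnecessary length of stay. Each P value compares the 2 groups within a factor.
a: Not significant.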

DISCUSSION

Relationships are critical to team function because they are the basis for the social interactions that are central to patient care. These interactions include how providers recognize and make sense of what is happening with patients, and how they learn to care for patients more effectively. Additionally, the high task interdependencies among inpatient providers require effective relationships for optimal care. In our study, inpatient medicine physician teams' relationships varied, and these differences were associated with ULOS and complications. Relationship characteristics are not mutually exclusive, and as our factor analysis demonstrates, are intercorrelated. Trust and mindfulness appear to be particularly important. Trust may foster psychological safety that in turn promotes the willingness of individuals to contribute their thoughts and ideas.[13] In low‐trust teams, providers may fear a negative impact for bringing forward a concern based on limited data. Mindful teams may be more likely to notice nuanced changes, or are more likely to talk when things just do not appear to be going in the right direction with the patient. In the case of acutely ill medical patients, trust and mindfulness may lead to an increased likelihood that clinical changes are recognized and discussed quickly. For example, on a team characterized by trust and mindfulness, the entire team was typically involved in care discussions, and the interns and students frequently asked a lot of questions, even regarding the care of patients they were not directly following. We observed that these questions and discussions often led the team to realize that they needed to make a change in management decisions (eg, discontinuing Bactrim, lowering insulin doses, adjusting antihypertensives, premedicating for intravenous contrast) that they had not caught in the assessment and plan portion of the patient care discussion. 
In another example, a medical student asked a tentative question after a patient needed to go quickly to the bathroom while they were examining her, leading the team to ask more questions that led to a more rapid evaluation of a potential urinary tract infection. This finding is consistent with the description of failure to rescue among surgical patients, in which mortality has been associated with the failure to recognize complications rapidly and act effectively.[33]

Our findings are limited in several ways. First, these data are from a single academic institution. Although we sought diversity among our teams and collected data across 2 hospitals, there may be local contextual factors that influenced our results. Second, our data demonstrate an association, but not causality. Our findings should be tested in studies that assess causality and potential mechanisms through which relationships influence outcomes. Third, the individuals observing the teams had some knowledge of patient outcomes through hearing patient discussions. However, by involving individuals who did not participate in observations and were blinded to outcomes in assessing team relationships, we addressed this potential bias. Fourth, our observations were largely focused on physician teams, not directly including other providers. Our difficulty in observing regular interactions between physicians and other providers underscores the need to increase contact among those caring for hospitalized patients, such as occurs through multidisciplinary rounds. We did include team communication with other disciplines in our assessment of the relationship characteristics of diversity and rich communication. Finally, our analysis was limited by our sample size. We observed a relatively small number of teams. Although we benefitted from seeing the change in team relationships that occurred with attending changes halfway through some of our data collection months, this did limit the number of patients we could include in our analyses. Though we did not observe obvious differences in relationships between the teams observed across the 2 hospitals, the small number of teams and hospitals precluded our ability to perform multilevel modeling analyses, which would have allowed us to assess or account for the influence of team or organizational factors. However, this small sample size did allow for a richer assessment of team behaviors.

Although preliminary, our findings are an important step in understanding the function of inpatient medical teams not only in terms of processes of care, but also in terms of relationships. Patient care is a social activity, requiring effective communication to develop working diagnoses, recognize changes in patients' clinical courses, and formulate effective treatment plans during and after hospitalization. Future work could follow several directions. One would be to assess the causal mechanisms through which relationships influence patient outcomes. These may include sensemaking, learning, and improved coordination. Positive relationships may facilitate interaction of tacit and explicit information, facilitating the creation of understandings that foster more effective patient care.[34] The dynamic nature of relationships and how patient outcomes in turn feed back into relationships could be an area of exploration. This line of research could build on the idea of teaming.[35] Understanding relationships across multidisciplinary teams or with patients and families would be another direction. Finally, our results could point to potential interventions to improve patient outcomes through improving relationships. Better understanding of the nature of effective relationships among providers should enable us to develop more effective strategies to improve the care of hospitalized patients. In the larger context of payment reforms that require greater coordination and communication among and across providers, a greater understanding of how relationships influence patient outcomes will be important.

Acknowledgements

The authors thank the physicians involved in this study and Ms. Shannon Provost for her involvement in discussions of this work.

Disclosures: The research reported herein was supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service (CDA 07‐022). Investigator salary support was provided through this funding, and through the South Texas Veterans Health Care System. The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs. Dr. McDaniel receives support from the IC² Institute of the University of Texas at Austin. Dr. Luci Leykum had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The authors report no conflicts of interest.

References
  1. Plsek P. Redesigning health care with insights from the science of complex adaptive systems. In: Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy of Sciences; 2000:309–322.
  2. Landrigan CP, Parry GJ, Bones CB, Hackbarth AD, Goldmann DA, Sharek PJ. Temporal trends in rates of patient harm resulting from medical care. N Engl J Med. 2010;363(22):2124–2135.
  3. Krauss MJ, Nguyen SL, Dunagan WC, et al. Circumstances of patient falls and injuries in 9 hospitals in a mid‐western healthcare system. Infect Control Hosp Epidemiol. 2007;28(5):544–550.
  4. Hurd T, Posnett J. Point prevalence of wounds in a sample of acute hospitals in Canada. Int Wound J. 2009;6(4):287–293.
  5. Garcin F, Leone M, Antonini F, Charvet A, Albanese J, Martin C. Non‐adherence to guidelines: an avoidable cause of failure of empirical antimicrobial therapy in the presence of difficult‐to‐treat bacteria. Intensive Care Med. 2010;36(1):75–82.
  6. Williams SC, Schmaltz SP, Morton DJ, Koss RG, Loeb JM. Quality of care in U.S. hospitals as reflected by standardized measures, 2002–2004. N Engl J Med. 2005;353(3):255–264.
  7. Centers for Disease Control and Prevention. National Center for Emerging and Zoonotic Infectious Diseases. Division of Healthcare Quality Promotion. Checklist for prevention of central line associated blood stream infections. Available at: http://www.cdc.gov/HAI/pdfs/bsi/checklist‐for‐CLABSI.pdf. Accessed August 3, 2014.
  8. Safer Healthcare Partners, LLC. Checklists: a critical patient safety tool. Available at: http://www.saferhealthcare.com/high‐reliability‐topics/checklists. Accessed July 31, 2014.
  9. Yam Y. Making Things Work: Solving Complex Problems in a Complex World. Boston, MA: Knowledge Press; 2004:117–160.
  10. Gittell JH. High Performance Healthcare: Using The Power of Relationships to Achieve Quality, Efficiency, and Resilience. 1st ed. New York, NY: McGraw‐Hill; 2009.
  11. Carroll JS, Rudolph JW. Design of high reliability organizations in health care. Qual Saf Health Care. 2006;15(suppl 1):i4–i9.
  12. Salas E, DiazGranados D, Weaver SJ, King H. Does team training work? Principles for health care. Acad Emerg Med. 2008;15(11):1002–1009.
  13. Edmondson A. Speaking up in the operating room: how team leaders promote learning in interdisciplinary action teams. J Manag Stud. 2003;40(6):1419–1452.
  14. Neily J, Mills PD, Young‐Xu Y, et al. Association between implementation of a medical team training program and surgical mortality. JAMA. 2010;304(15):1693–1700.
  15. Lewis K, Belliveau M, Herndon B, Keller J. Group cognition, membership change, and performance: Investigating the benefits and detriments of collective knowledge. Organ Behav Hum Decis Process. 2007;103(2):159–178.
  16. Leykum LK, Palmer RF, Lanham HJ, McDaniel RR, Noel PH, Parchman ML. Reciprocal learning and chronic care model implementation in primary care: results from a new scale of learning in primary care settings. BMC Health Serv Res. 2011;11:44.
  17. Noel PH, Lanham HJ, Palmer RF, Leykum LK, Parchman ML. The importance of relational coordination and reciprocal learning for chronic illness care within primary care teams. Health Care Manage Rev. 2012;38(1):20–28.
  18. Dixon‐Woods M, Bosk CL, Aveling EL, Goeschel CA, Pronovost PJ. Explaining Michigan: developing an ex post theory of a quality improvement program. Milbank Q. 2011;89(2):167–205.
  19. Lanham HJ, McDaniel RR, Crabtree BF, et al. How improving practice relationships among clinicians and nonclinicians can improve quality in primary care. Jt Comm J Qual Patient Saf. 2009;35(9):457–466.
  20. Finley EP, Pugh JA, Lanham HJ, et al. Relationship quality and patient‐assessed quality of care in VA primary care clinics: development and validation of the work relationships scale. Ann Fam Med. 2013;11(6):543–549.
  21. Creswell JW, Plano Clark VL. Designing and Conducting Mixed Methods Research. 2nd ed. Thousand Oaks, CA: Sage; 2011.
  22. Patton MQ. Qualitative Evaluation Methods. Thousand Oaks, CA: Sage; 2002.
  23. Pope C, Royen P, Baker R. Qualitative methods in research on health care quality. Qual Saf Health Care. 2002;11:148–152.
  24. Hoff T. Managing the negatives of experience in physician teams. Health Care Manage Rev. 2010;35(1):65–76.
  25. Tamuz M, Giardina TD, Thomas EJ, Menon S, Singh H. Rethinking resident supervision to improve safety: from hierarchical to interprofessional models. J Hosp Med. 2011;6(8):445–452.
  26. Klein KJ, Ziegert JC, Knight AP, Xiao Y. Dynamic delegation: shared, hierarchical, and deindividualized leadership in extreme action teams. Adm Sci Q. 2006;51(4):590–621.
  27. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD‐9‐CM administrative databases. J Clin Epidemiol. 1992;45(6):613–619.
  28. Tukey JW. Exploratory Data Analysis. Reading, MA: Addison‐Wesley; 1977.
  29. Zar JH. Biostatistical Analysis. 4th ed. Upper Saddle River, NJ: Pearson Prentice‐Hall; 2010.
  30. Dunn OJ. Multiple contrasts using rank sums. Technometrics. 1964;6:241–252.
  31. Elliott AC, Hynan LS. A SAS macro implementation of a multiple comparison post hoc test for a Kruskal–Wallis analysis. Comput Methods Programs Biomed. 2011;102:75–80.
  32. SAS/STAT Software [computer program]. Version 9.1. Cary, NC: SAS Institute Inc.; 2003.
  33. Ghaferi AA, Birkmeyer JD, Dimick JB. Complications, failure to rescue, and mortality with major inpatient surgery in Medicare patients. Ann Surg. 2009;250(6):1029–1034.
  34. Nonaka I. A dynamic theory of organizational knowledge creation. Org Sci. 1994;5(1):14–37.
  35. Edmondson AC. Teaming: How Organizations Learn, Innovate, and Compete in the Knowledge Economy. 1st ed. Boston, MA: Harvard Business School; 2012.
References
  1. Plsek P. Redesigning health care with insights from the science of complex adaptive systems. In: Crossing the Quality Chasm: A New Heath System for the 21st Century. Washington, DC: National Academy of Sciences; 2000:309322.
  2. Landrigan CP, Parry GJ, Bones CB, Hackbarth AD, Goldmann DA, Sharek PJ. Temporal trends in rates of patient harm resulting from medical care. N Engl J Med. 2010;323(22):21242135.
  3. Krauss MJ, Nguyen SL, Dunagan WC, et al. Circumstances of patient falls and injuries in 9 hospitals in a mid‐western healthcare system. Infect Control Hosp Epidemiol. 2007;28(5):544550.
  4. Hurd T, Posnett J. Point prevalence of wounds in a sample of acute hospitals in Canada. Int Wound J. 2009;6(4):287293.
  5. Garcin F, Leone M, Antonini F, Charvet A, Albanese J, Martin C. Non‐adherence to guidelines: an avoidable cause of failure of empirical antimicrobial therapy in the presence of difficult‐to‐treat bacteria. Intensive Care Med. 2010;36(1):7582.
  6. Williams SC, Schmaltz SP, Morton DJ, Koss RG, Loeb JM. Quality of care in U.S. hospitals as reflected by standardized measures, 2002–2004. N Engl J Med. 2005;353(3):255264.
  7. Centers for Disease Control and Prevention. National Center for Emerging and Zoonotic Infectious Diseases. Division of Healthcare Quality Promotion. Checklist for prevention of central line associated blood stream infections. Available at: http://www.cdc.gov/HAI/pdfs/bsi/checklist‐for‐CLABSI.pdf. Accessed August 3, 2014.
  8. Safer Healthcare Partners, LLC. Checklists: a critical patient safety tool. Available at: http://www.saferhealthcare.com/high‐reliability‐topics/checklists. Accessed July 31, 2014.
  9. Yam Y. Making Things Work: Solving Complex Problems in a Complex World. Boston, MA: Knowledge Press; 2004:117160.
  10. Gittell JH. High Performance Healthcare: Using The Power of Relationships to Achieve Quality, Efficiency, and Resilience. 1st ed. New York, NY: McGraw‐Hill; 2009.
  11. Carroll JS, Rudolph JW. Design of high reliability organizations in health care. Qual Saf Health Care. 2006;15(suppl 1):i4i9.
  12. Salas E, DiazGranados D, Weaver SJ, King H. Does team training work? Principles for health care. Acad Emerg Med. 2008;15(11):10021009.
  13. Edmondson A. Speaking up in the operating room: how team leaders promote learning in interdisciplinary action teams. J Manag Stud. 2003;40(6):14191452.
  14. Neily J, Mills PD, Young‐Xu Y, et al. Association between implementation of a medical team training program and surgical mortality. JAMA. 2010;304(15):16931700.
  15. Lewis K, Belliveau M, Herndon B, Keller J. Group cognition, membership change, and performance: Investigating the benefits and detriments of collective knowledge. Organ Behav Hum Decis Process. 2007;103(2):159178.
  16. Leykum LK, Palmer RF, Lanham HJ, McDaniel RR, Noel PH, Parchman ML. Reciprocal learning and chronic care model implementation in primary care: results from a new scale of learning in primary care settings. BMC Health Serv Res. 2011;11:44.
  17. Noel PH, Lanham HJ, Palmer RF, Leykum LK, Parchman ML. The importance of relational coordination and reciprocal learning for chronic illness care within primary care teams. Health Care Manage Rev. 2012;38(1):2028.
  18. Dixon‐Woods M, Bosk CL, Aveling EL, Goeschel CA, Pronovost PJ. Explaining Michigan: developing an ex post theory of a quality improvement program. Milbank Q. 2011;89(2):167205.
  19. Lanham HJ, McDaniel RR, Crabtree BF, et al. How improving practice relationships among clinicians and nonclinicians can improve quality in primary care. Jt Comm J Qual Patient Saf. 2009;35(9):457466.
  20. Finely EP, Pugh JA, Lanham HJ, et al. Relationship quality and patient‐assessed quality of care in VA primary care clinics: development and validation of the work relationships scale. Ann Fam Med. 2013;11(6):543549.
  21. Creswell JW, Plano Clark VL. Designing and Conducting Mixed Methods Research. 2nd ed. Thousand Oaks, CA: Sage; 2011.
  22. Patton MQ. Qualitative Evaluation Methods. Thousand Oaks, CA: Sage; 2002.
  23. Pope C, Royen P, Baker R. Qualitative methods in research on health care quality. Qual Saf Health Care. 2002;11:148152.
  24. Hoff T. Managing the negatives of experience in physician teams. Health Care Manage Rev. 2010;35(1):6576.
  25. Tamuz M, Giardina TD, Thomas EJ, Menon S, Singh H. Rethinking resident supervision to improve safety: from hierarchical to interprofessional models. J Hosp Med. 2011;6(8):445 b452.
  26. Klein KJ, Ziegart JC, Knight AP, Xiao Y. Dynamic delegation: shared, hierarchical, and deindividualized leadership in extreme action teams. Adm Sci Q. 2006;51(4):590621.
  27. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD‐9‐CM administrative databases. J Clin Epidemiol. 1992;45(6):613619.
  28. Tukey JW. Exploratory Data Analysis. Reading, MA: Addison‐Wesley; 1977.
  29. Zar JH. Biostatistical Analysis. 4th ed. Upper Saddle River, NJ: Pearson Prentice‐Hall; 2010.
  30. Dunn OJ. Multiple contrasts using rank sums. Technometrics. 1964;6:241252.
  31. Elliott AC, Hynan LS. A SAS macro implementation of a multiple comparison post hoc test for a Kruskal–Wallis analysis. Comput Methods Programs Biomed. 2011;102:7580.
  32. SAS/STAT Software [computer program]. Version 9.1. Cary, NC: SAS Institute Inc.; 2003.
  33. Ghaferi AA, Birkmeyer JD, Dimick JB. Complications, failure to rescue, and mortality with major inpatient surgery in Medicare patients. Ann Surg. 2009;250(6):10291034.
  34. Nonaka I. A dynamic theory of organizational knowledge creation. Org Sci. 1994;5(1):1437.
  35. Edmundson AC. Teaming: How Organizations Learn, Innovate, and Compete in the Knowledge Economy. 1st ed. Boston, MA: Harvard Business School; 2012.
Issue
Journal of Hospital Medicine - 9(12)
Page Number
764-771
Display Headline
Relationships within inpatient physician housestaff teams and their association with hospitalized patient outcomes
Article Source

© 2014 Society of Hospital Medicine

Correspondence Location
Address for correspondence and reprint requests: Luci Leykum, MD, 7400 Merton Minter, San Antonio, TX 78229; Telephone: 210‐567‐4462; Fax: 210‐567‐0218; E‐mail: leykum@uthscsa.edu

Smartphone‐Enabled Communication System

Article Type
Changed
Sun, 05/21/2017 - 13:29
Display Headline
A smartphone‐enabled communication system to improve hospital communication: Usage and perceptions of medical trainees and nurses on general internal medicine wards

Previous studies have advocated the importance of effective communication between clinicians as a critical component in the provision of high‐quality patient care.[1, 2, 3, 4] There is increasing interest in the use of information and communication technologies to improve how clinicians communicate in hospital settings. A number of hospitals have implemented different solutions to improve communication. These solutions include alphanumeric pagers,[5] smartphones,[6] e‐mail,[7] secure text messaging,[8] and a Web‐based interdisciplinary communication tool.[9]

These systems have different limitations that render them inefficient and likely inhibit collaborative care. Current systems, such as pagers, rely on the sender to ensure the message was received and are successful in delivering messages approximately 67% of the time.[5, 9, 10] Although alphanumeric pagers and secure text messaging can increase the likelihood of delivery, these messages are often isolated and not easily viewable by the whole care team.[11] Improved systems should also reduce unnecessary interruptions by providing support for both urgent and delayed messages. Finally, messages should be stored and retrievable to enable increased accountability and allow for review for quality improvement initiatives.

It is also important to consider the unintended consequences of technology implementations.[12] Moving communication to text messages and smartphones has the potential to reduce interprofessional relations and can increase confusion if used for complex issues.[10, 13] In this article, we present a system designed to improve interprofessional communication on general internal medicine wards by incorporating these desired features and describe the usage and attitudes toward the system, specifically assessing for effects on multiple domains including efficiency, interprofessional collaboration, and relationships.

METHODS

Research Question

Will nurses and physicians use a system designed to improve interprofessional communication and will they perceive it to be effective and improve workflow?

Setting

The study took place on the general internal medicine wards at Toronto General Hospital and Toronto Western Hospital, 2 large academic teaching hospitals. There are several general internal medicine wards at each site with approximately 80 beds at each site. At each site there are 4 clinical teaching units and 1 hospitalist team. The study was approved by the research ethics board at the University Health Network.

Intervention

To address issues with communication, we developed a system, Clinical Message (CM), that included 2 main components: a physician handover tool and a secure messaging module. The focus of CM was to improve communication and information flow among different healthcare providers (physicians, nurses, pharmacists, social workers, and therapists) through a secure, shared platform.

Physician Handover

The physician handover tool was designed to facilitate the physician handover process at shift change and is used as a patient rounding tool for day‐to‐day management of patients. It is also accessed by nurses and other clinicians to view the physicians' notes and to stay informed on the overall care plan. The tool contains standard elements including a list of patients with the following information on each patient: demographics, diagnosis, code status, medical history, active issues, and discharge plans (Figure 1).

Figure 1
Physician handover tool: patient list showing patient information and physician notes for a selected patient. (Note: not real patients.)

Secure Messaging

Secure messaging was designed around our dominant communication pattern: nurses sending messages to physicians, who would then respond. Nurses and other health professionals sent messages to the medical teams by accessing CM, selecting the appropriate patient, and filling out a message template. The system automatically populated the "To" field with the team assigned to the selected patient. Messaging for each team was centralized around a single team smartphone that was carried 24 hours a day, 7 days a week by a physician on that team. This removed the guesswork of trying to identify the individual physician responsible for that patient. For each message, a subject or issue and content were entered (Figure 2). Logic was also incorporated to reduce the number of unnecessary interruptions. Senders chose to send the message either immediately as an "interrupt" message (urgent) for urgent or time‐sensitive issues or as an "allow time to respond" message (delayed). For the latter, the message was posted to the system, where physicians could check and answer it. Interrupt messages were sent to the team smartphone using the Short Message Service (SMS) protocol. To help ensure that the communication loop on any issue was closed, when a message requested a response and did not receive one, the system sent another message. For urgent messages, a repeat message was initiated after 15 minutes. For delayed messages, the sender defined when they needed a response, typically within 2 to 8 hours. Senders were also able to select the mode of response that would best meet their needs from a workflow perspective: call back, text reply, or no reply required. Senders were also able to verify whether the messages were received by the physician's smartphone. Physicians could view the messages within CM and reply. For messages that went to their team smartphone, physicians could respond from the smartphone through a secure Web link.
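The repeat‐message rule described above can be illustrated with a minimal sketch. This is not the CM implementation; the class, field, and function names are hypothetical, and only the 15‐minute repeat for urgent messages and the sender‐defined deadline for delayed messages are taken from the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Stated in the text: urgent messages trigger a repeat after 15 minutes.
URGENT_REPEAT_AFTER = timedelta(minutes=15)


@dataclass
class Message:
    """Hypothetical record for one message; field names are assumptions."""
    sent_at: datetime
    urgent: bool                             # True = interrupt, False = allow time to respond
    response_requested: bool                 # False when no reply is required
    respond_by: Optional[datetime] = None    # sender-defined deadline (delayed messages)
    replied_at: Optional[datetime] = None


def repeat_due_at(msg: Message) -> Optional[datetime]:
    """Return when a repeat message becomes due, or None if none applies."""
    if not msg.response_requested or msg.replied_at is not None:
        return None
    if msg.urgent:
        return msg.sent_at + URGENT_REPEAT_AFTER
    return msg.respond_by


def needs_repeat(msg: Message, now: datetime) -> bool:
    """True once an unanswered message has passed its repeat time."""
    due = repeat_due_at(msg)
    return due is not None and now >= due
```

In this sketch, a message marked as not requiring a reply never triggers a repeat, which mirrors the option senders had to specify that no reply was required.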

Figure 2
Patient list with a selected patient: sending a message on the Clinical Message system. (Note: not real patients.)

Because the messages were linked to the patients, they were visible to the entire care team, not just the message sender and recipient. If the care of the patient was transferred from 1 clinician to the next, the new clinician could easily review prior messages to understand recent patient events. The system was accessible through a browser on the intranet. The system regularly pulled patient demographic details such as name, age, medical record number, and location from our electronic medical record through a 1‐way interface. Information from this communication system was not considered part of the medical record but was retrievable.

The system was introduced as the new standard method of communication for nurses to reach physicians for all of the general internal medicine wards and for all medical teams at site 1 on May 2, 2011 and site 2 on June 6, 2011. The system replaced a text‐based Web‐paging system and supplemented the numeric pager carried by residents. Initial training of a half hour was provided to all nurses and residents.

Message Analysis for Usage Statistics

We analyzed messages created and sent via the CM system from May 2011 until August 2012. The extracted message information included date and time sent, issue, level of urgency, response type requested, roles of clinicians involved from the associated team, hospital site (senders and receivers), and message details. The following inclusion criteria were used for the analyses: (1) the senders and receivers of the messages could not be CM support staff, and (2) the messages sent were intended for the team smartphones used by the respective medical teams, not individual clinicians. Descriptive statistics and frequency analysis were performed using Microsoft Excel (Microsoft Corp., Redmond, WA) and IBM SPSS (IBM, Armonk, NY).
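As an illustration only, the 2 inclusion criteria could be applied to extracted message records along the following lines. The analysis itself was performed in Excel and SPSS; the record layout and the field names and values used here (`sender_role`, `receiver_role`, `recipient_type`, `cm_support`, `team_smartphone`) are assumptions made for the sketch.

```python
from typing import Dict, List


def include_message(msg: Dict[str, str]) -> bool:
    """Apply the two stated inclusion criteria to one message record."""
    # Criterion 1: neither sender nor receiver is CM support staff.
    not_support = (
        msg["sender_role"] != "cm_support"
        and msg["receiver_role"] != "cm_support"
    )
    # Criterion 2: the message was intended for a team smartphone,
    # not an individual clinician.
    to_team_phone = msg["recipient_type"] == "team_smartphone"
    return not_support and to_team_phone


def filter_messages(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the messages that satisfy both criteria."""
    return [m for m in messages if include_message(m)]
```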

Survey

Development of the Survey

We used standard methods to develop a survey to assess staff perceptions on the impact of the new communication system. Relevant questionnaire items were compiled from a systematic review of the literature for communication surveys and communication issues that included the following domains: efficiency, accountability, accuracy, collaboration, timeliness, richness of the communication medium, and impact on interprofessional relationships and verbal communication.[10, 14, 15] We carried out pilot testing with 5 nurses and physicians, and modified the questionnaires based on their feedback.

Sampling and Data Collection of the Survey

Survey participants consisted of 2 groups of clinicians: (1) medical trainees that included medical residents, medical interns, and clinical fellows, and (2) nursing staff that included part‐time and full‐time nurses. To qualify for inclusion, participants had to have used the CM system for at least a month prior to administration of the questionnaire.

Data Analysis

Responses were recorded into an Excel spreadsheet that was imported into SPSS for analysis. Categorical variables were described using proportions. Survey comments were grouped into common themes, and themes mentioned by more than 1 respondent were reported.

RESULTS

Usage Analysis

A total of 60,969 messages were sent using CM between May 2, 2011 and August 19, 2012. On average, a team would receive 14.8 messages per day. Of all messages, 76.5% requested a text reply, 7.7% requested a call‐back, and 15.7% did not request a response. More than two‐thirds of messages at both hospitals were sent as immediate. Of the nonurgent messages, 86% were not replied to within the desired time, requiring a repeat message to be sent. Examples of different types of messages are shown in Table 1.

Table 1
Examples of Types of Messages Sent Through the System and the Replies

| Sender | Issue | Details | Priority | Desired Response Type | Time Created | Time Sent | Reply | Time Replied |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Nurse | Vital sign | Pt's BP is 182/95, HR is 108 now. Previous at 0800 was 165/78; HR was 99. PT is not on antihypertensive meds. | Allow time to respond (23:00) | Text reply | 21:43 | 23:02 | OK. Will assess. | 23:03 |
| Nurse | NG tube | NG tube is in place. Can you please enter portable chest x‐ray to check placement ASAP? | Immediate | Text reply | 16:58 | 16:58 | Will do. | 17:00 |
| Nurse | Bloodwork | Pt creat=216. Pt has NS @ 75 cc/hr. Pt has noted crackles throughout lung fields and has productive cough; eating and drinking well. Would you like it continued as well? Pt O2Sat 93% RA; would you like 4 L of O2 continued? Pls call for telephone order. | Immediate | Call back | 12:53 | 13:04 | Dealt with it on phone. | 13:05 |
| Nurse | Pain control | Hello! Pt has been getting 1 mg hydromorphone IV q 1 hr and pain is still not controlled. Pt remains awake and alert. Thanks! | Immediate | Info only | 15:41 | 15:41 | Thank you. | 15:42 |

NOTE: Abbreviations: BP, blood pressure; HR, heart rate; PT, patient; NG, nasogastric; creat, creatinine; NS, normal saline; RA, room air; IV, intravenous.

For messages requesting a text reply, 8.6% did not receive a reply. The median response time was 2.3 minutes (interquartile range of 5.8 minutes), but some messages did not receive a response even after a week, which skewed the distribution of response times. For those messages that did receive a reply, 68.9% of them were responded to within 5 minutes, and 84.5% were responded to within 15 minutes. Messages were predominantly received between 9 am and midnight (see Supporting Figure 1 in the online version of this article). Because the sending of some messages was delayed, there appeared to be fewer messages received during protected educational times (8–9 am and 12–1 pm) as well as between midnight and 7 am compared to other times.
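For readers who wish to compute comparable summaries from their own message logs, a minimal sketch follows. It is not the analysis used in this study (which was performed in Excel and SPSS); the function name and the sample values are hypothetical, and the quartile method is the default of Python's `statistics.quantiles`, which may differ slightly from the method used by SPSS.

```python
import statistics
from typing import Dict, List


def summarize_response_times(minutes: List[float]) -> Dict[str, float]:
    """Summarize response times (in minutes) for messages that got a reply."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(minutes, n=4)
    total = len(minutes)
    return {
        "median": median,
        "iqr": q3 - q1,
        "pct_within_5": 100 * sum(t <= 5 for t in minutes) / total,
        "pct_within_15": 100 * sum(t <= 15 for t in minutes) / total,
    }


# Hypothetical response times, for illustration only (not study data).
example = summarize_response_times([1, 2, 3, 4, 6, 8, 12, 30])
```

The median and interquartile range are used rather than the mean because a small number of very late replies would otherwise dominate the summary.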

Survey Results

Between April 2013 and June 2013, 82 of 86 medical trainees (95.3%) and 83 of 116 nurses (71.6%) completed the survey, for an overall response rate of 81.7%. Clinicians perceived CM to have a positive impact on efficiency. In particular, 82.8% of physicians and 78.3% of nurses agreed or strongly agreed that CM helped speed up daily work tasks (Table 2). The majority of physicians and nurses agreed that the system increased accountability, increased timeliness of communication, and improved interprofessional relationships. It was not seen to be effective for communicating complex patient issues.
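The reported response rates follow directly from the counts given above; the short check below reproduces the arithmetic. The function name is ours and is used only for illustration.

```python
def response_rate(completed: int, eligible: int) -> float:
    """Percentage of eligible participants who completed the survey."""
    return round(100 * completed / eligible, 1)


trainee_rate = response_rate(82, 86)             # medical trainees
nurse_rate = response_rate(83, 116)              # nurses
overall_rate = response_rate(82 + 83, 86 + 116)  # 165 of 202 overall

print(trainee_rate, nurse_rate, overall_rate)    # 95.3 71.6 81.7
```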

Table 2
Summary of Survey Responses

| Item | No. of Subitems in Survey | Physician (% Agree, Strongly Agree), n=82 | Nurse (% Agree, Strongly Agree), n=83 |
| --- | --- | --- | --- |
| Positive impact on efficiency. | 7 | 58.9% | 66.6% |
| The CM system helps speed up my daily work tasks. | | 82.8% | 78.3% |
| Positive impact on physician‐nurse collaboration. | 6 | 55.3% | 58.5% |
| The CM system increases the amount of communication between nurses and physicians. | | 50.6% | 67.1% |
| Improved timeliness of communication. | 5 | 54.2% | 50.5% |
| Communication through the CM system helps me resolve patient issues within the appropriate time frames. | | 66.7% | 55.6% |
| Increased accountability. | 2 | 67.1% | 73.2% |
| Improved accuracy of communications. | 3 | 41.6% | 50.7% |
| Improved interprofessional relationships. | 2 | 62.2% | 53.6% |
| Increased verbal communications. | 2 | 35.1% | 25.3% |
| Richness of the communication medium. | 6 | 40.7% | 48.3% |
| I find the CM system useful for communicating complex patient issues. | | 35.8% | 26.3% |
| I would prefer CM over standard hospital communication methods such as numeric paging. | 1 | 68.3% | 76.5% |
| I enjoy using the CM system for clinical communication on the wards. | 1 | 63.0% | 79.0% |
| Communication through the CM system helps to reduce interruptions for physicians. | 1 | 45.7% | |

NOTE: Major groupings are listed. For those with multiple (>3) items in the survey, important items are listed. Abbreviations: CM, Clinical Message.

Survey comments revealed that nurses perceived a lack of desired response, whereas physicians noted being interrupted with low‐value information through the system (Table 3). Both commented that further functionality, such as an active message stream, would be of benefit. Difficulty in communicating complex issues was also noted.

Table 3
Issues Mentioned in Survey Comments by Occurrences

| Issue | Occurrences (MD) | Occurrences (RN) | Occurrences (Total) | Example |
| --- | --- | --- | --- | --- |
| Lack of response | 1 | 10 | 11 | It depends if they respond quickly or not. A few times I send the 2nd message to remind them of the issue. I also spend more time to check if they answer it or not. I even call their Blackberries at last to get a response. |
| Message stream | 3 | 4 | 7 | I wish that I could see follow‐up messages after my initial reply (ie, it would be nice to have an open message stream). |
| Difficult to communicate complex issues | 1 | 5 | 6 | Difficult to communicate complex issues. Takes a lot of time to respond, and it becomes inefficient when responding to nonurgent CM because it interrupts workflow. |
| Many messages are low‐value interrupts | 3 | 0 | 3 | CM is useful for handover between clinicians, but often it slows down the clinician when they are used for information‐related low‐value/noncritical messages between nurses and clinicians |
| Lack of detailed response | 0 | 3 | 3 | Specific messages regarding response to care is required most times. For example, acknowledged is not a favorable response. |
| Technical issues | 2 | 0 | 2 | I find CM very useful. We have had multiple issues with our Blackberry this month, and CM was not working. When it is up and running, however, it is a wonderful tool. |
| Discrepancy in perceived urgency | 2 | 0 | 2 | Discrepancy between what nurses find urgent and what we find urgent. |

NOTE: Abbreviations: CM, Clinical Message; MD, medical doctor; RN, registered nurse.

DISCUSSION

We describe an implementation of a system to improve clinical communication in hospitals. The system was highly used and was perceived to improve communication by both nurses and physicians. Specifically, users found that the system increased efficiency, accountability, timeliness, and collaboration, but that there were issues with message clarity for complex medical issues.

Other systems and approaches have been implemented to improve communication. These included the use of alphanumeric pagers, e‐mail, secure texting, and smartphones. There is evidence that more advanced systems can improve efficiency for senders.[16] A recent randomized trial of secure text messaging found that it was perceived to be more efficient than paging, but overall usage was low and inconsistent.[8] There is also evidence that smartphones may increase interruptions, worsen interprofessional relationships, and cause issues with professional behavior.[10] Unfortunately, there are a limited number of interventions that improve communication, with some improving efficiency but none demonstrating improved patient‐oriented outcomes.[16, 17] This study evaluated a novel system, with functionality to link communication to patients, and created a system that aligned with the workflow of the clinicians. Messages were linked to the patient, not the sender or receiver, so other clinicians in the patient's circle of care could easily view the communication. Moreover, the system was designed to improve message response rates and allow for nonurgent messages.

Our communication system uses standard, commercially available components (smartphones, SMS) and relatively basic functionality (handover, secure messaging). An important finding is that the current system of paging can be transformed into a more efficient system that users will readily adopt. We found positive effects with components of the system. It appeared to improve efficiency and increase accountability. Accountability is crucial: the system moves communication from undocumented conversations to fully documented details of interactions. This record can be used for both incident review and quality improvement.

Using the system, physicians perceived that they were bothered by low‐value information, whereas nurses perceived a lack of response, and both found that the system was not ideal for complex messages. The mismatch between what physicians and nurses perceive as important has been attributed to their different timeframes and context.[18] Nurses with an upcoming change of shift wanted resolution of issues before handover. A physician on a different ward may not appreciate the context of a nurse having to directly interact with an irate family member. These different perceptions likely contributed to the lack of response to 8.6% of text messages. This is still better than other systems, such as paging, for which the nonresponse rate can be as high as 33%.[10] For nonurgent items, clinicians would ideally check and clear items regularly from the system using a desktop computer, responding within the allotted timeframe. Unfortunately, this never became part of routine physician workflow, likely due to their busy workload, so many physicians would only respond when items became overdue. However, having a method to deal with nonurgent messages may have prevented some interruptions during protected educational times of trainees. The system was also not ideal for urgent or complex items. Complex items can be difficult to convey using the rarefied communication medium of text messages.[19, 20] Urgent or complex issues are likely best resolved with a face‐to‐face or telephone conversation.

There are several limitations in our study that should be considered when interpreting the results. It is a study of usage and perceptions after implementation. Although more rigorous study is required to evaluate the effects, we see this as a first step in process improvement. Future research should measure the impact of this system on patient care and on patient outcomes such as adverse events. The study and intervention were limited to general internal medicine wards in 2 academic hospital settings where there are frequent rotations of medical personnel. The findings may not be generalizable to other hospital settings.

Future directions should be to further improve the communication system and to educate and train staff on how to communicate effectively. Survey results showed that although users perceived increased efficiency, there was still significant opportunity to improve. One way to improve would be to have a mobile application in which physicians can easily review nonurgent items. Improvements could also be realized by educating clinicians on the use of the system and providing immediate feedback. Providing feedback to physicians on how well they respond could address nurses' issues around lack of timely response. Creating consensus between nurses and physicians on what is of high and low value to communicate could increase satisfaction for all users.

In summary, we present the usage and perceptions of a system designed to improve hospital communication. We found that there was high uptake, and that users perceived it to improve efficiency, collaboration, and accountability, but it may not be useful for communicating complex issues.

ACKNOWLEDGEMENTS

The authors acknowledge the nurses, physicians, residents, and other health professionals on the general internal medicine wards for their patience and support as we continue to try to innovate. The authors also acknowledge the members of the information systems department (Shared Information Management Systems, University Health Network) who helped to support the Communication System, and the software developer, QRS, that helped to codevelop the software system.

Disclosures: The hospital was in a codevelopment agreement that has since terminated. No researcher or hospital received any funds from private industry for any purpose including personal or research. The authors report no conflicts of interest.

References
  1. Coiera E. When conversation is better than computation. J Am Med Inform Assoc. 2000;7(3):277–286.
  2. Brennan TA, Leape LL, Laird NM, et al. Incidence of adverse events and negligence in hospitalized patients. Results of the Harvard Medical Practice Study I. N Engl J Med. 1991;324(6):370–376.
  3. Woods DM, Holl JL, Angst DB, et al. Gaps in pediatric clinician communication and opportunities for improvement. J Healthc Qual. 2008;30(5):43–54.
  4. Wilson RM, Runciman WB, Gibberd RW, Harrison BT, Hamilton JD. Quality in Australian health care study. Med J Aust. 1996;164(12):754.
  5. Wong BM, Quan S, Shadowitz S, Etchells E. Implementation and evaluation of an alphanumeric paging system on a resident inpatient teaching service. J Hosp Med. 2009;4(8):E34–E40.
  6. Quan S, Wu R, Morra D, et al. Demonstrating the BlackBerry as a clinical communication tool: a pilot study conducted through the Centre for Innovation in Complex Care. Healthc Q. 2008;11(4):94–98.
  7. O'Connor C, Friedrich JO, Scales DC, Adhikari NK. The use of wireless email to improve healthcare team communication. J Am Med Inform Assoc. 2009;16(5):705–713.
  8. Przybylo JA, Wang A, Loftus P, Evans KH, Chu I, Shieh L. Smarter hospital communication: secure smartphone text messaging improves provider satisfaction and perception of efficacy, workflow. J Hosp Med. 2014;9(9):573–578.
  9. Locke KA, Duffey‐Rosenstein B, De Lio G, Morra D, Hariton N. Beyond paging: building a web‐based communication tool for nurses and physicians. J Gen Intern Med. 2009;24(1):105–110.
  10. Wu RC, Lo V, Morra D, et al. The intended and unintended consequences of communication systems on general internal medicine inpatient care delivery: a prospective observational case study of five teaching hospitals. J Am Med Inform Assoc. 2013;20(4):766–777.
  11. Wu RC, Lo V, Rossos P, et al. Improving hospital care and collaborative communications for the 21st century: key recommendations for general internal medicine. Interact J Med Res. 2012;1(2):e9.
  12. Bloomrosen M, Starren J, Lorenzi NM, Ash JS, Patel VL, Shortliffe EH. Anticipating and addressing the unintended consequences of health IT and policy: a report from the AMIA 2009 Health Policy Meeting. J Am Med Inform Assoc. 2011;18(1):82–90.
  13. Wu R, Rossos P, Quan S, et al. An evaluation of the use of smartphones to communicate between clinicians: a mixed‐methods study. J Med Internet Res. 2011;13(3):e59.
  14. Shortell SM, Rousseau DM, Gillies RR, Devers KJ, Simons TL. Organizational assessment in intensive care units (ICUs): construct development, reliability, and validity of the ICU nurse‐physician questionnaire. Med Care. 1991;29(8):709–726.
  15. Suh KS. Impact of communication medium on task performance and satisfaction: an examination of media‐richness theory. Inform Manag. 1999;35:295–312.
  16. Wu RC, Tran K, Lo V, et al. Effects of clinical communication interventions in hospitals: a systematic review of information and communication technology adoptions for improved communication between clinicians. Int J Med Inform. 2012;81(11):723–732.
  17. Walsh C, Siegler EL, Cheston E, et al. Provider‐to‐provider electronic communication in the era of meaningful use: a review of the evidence. J Hosp Med. 2013;8(10):589–597.
  18. Quan SD, Morra D, Lau FY, et al. Perceptions of urgency: defining the gap between what physicians and nurses perceive to be an urgent issue. Int J Med Inform. 2013;82(5):378–386.
  19. Wu R, Appel L, Morra D, Lo V, Kitto S, Quan S. Short message service or disService: issues with text messaging in a complex medical environment. Int J Med Inform. 2014;83(4):278–284.
  20. Iversen TB, Melby L, Toussaint P. Instant messaging at the hospital: supporting articulation work? Int J Med Inform. 2013;82(9):753–761.
Issue
Journal of Hospital Medicine - 10(2)
Page Number
83-89

Increased accountability.267.1%73.2%
Improved accuracy of communications.341.6%50.7%
Improved interprofessional relationships.262.2%53.6%
Increased verbal communications.235.1%25.3%
Richness of the communication medium.640.7%48.3%
I find the CM system useful for communicating complex patient issues. 35.8%26.3%
I would prefer CM over standard hospital communication methods such as numeric paging.168.3%76.5%
I enjoy using the CM system for clinical communication on the wards.163.0%79.0%
Communication through the CM system helps to reduce interruptions for physicians.145.7% 

Survey comments revealed that nurses perceived a lack of desired response, whereas physicians noted being interrupted with low‐value information through the system (Table 3). Both commented that further functionality, such as an active message stream, would be of benefit. Difficulty in communicating complex issues was also noted.

Issues Mentioned in Survey Comments by Occurrences
IssueOccurrencesExample
MDRNTotal
  • NOTE: Abbreviations: CM, Clinical Message; MD, medical doctor; RN, registered nurse.

Lack of response11011It depends if they respond quickly or not. A few times I send the 2nd message to remind them of the issue. I also spend more time to check if they answer it or not. I even call their Blackberries at last to get a response.
Message stream347I wish that I could see follow‐up messages after my initial reply (ie, it would be nice to have an open message stream).
Difficult to communicate complex issues156Difficult to communicate complex issues. Takes a lot of time to respond, and it becomes inefficient when responding to nonurgent CM because it interrupts workflow.
Many messages are low‐value interrupts303CM is useful for handover between clinicians, but often it slows down the clinician when they are used for information‐related low‐value/noncritical messages between nurses and clinicians
Lack of detailed response033Specific messages regarding response to care is required most times. For example, acknowledged is not a favorable response.
Technical issues202I find CM very useful. We have had multiple issues with our Blackberry this month, and CM was not working. When it is up and running, however, it is a wonderful tool.
Discrepancy in perceived urgency202Discrepancy between what nurses find urgent and what we find urgent.

DISCUSSION

We describe an implementation of a system to improve clinical communication in hospitals. The system was highly used and was perceived to improve communication by both nurses and physicians. Specifically, users found that the system increased efficiency, accountability, timeliness, and collaboration, but that there were issues with message clarity for complex medical issues.

Other systems and approaches have been implemented to improve communication. These included the use of alphanumeric pagers, e‐mail, secure texting, and smartphones. There is evidence that more advanced systems can improve efficiency for senders.[16] A recent randomized trial of secure text messaging found that it was perceived to be more efficient than paging, but overall usage was low and inconsistent.[8] There is also evidence that smartphones may increase interruptions, worsen interprofessional relationships, and cause issues with professional behavior.[10] Unfortunately, there are a limited number of interventions that improve communication, with some improving efficiency but none demonstrating improved patient‐oriented outcomes.[16, 17] This study evaluated a novel system, with functionality to link communication to patients, and created a system that aligned with the workflow of the clinicians. Messages were linked to the patient, not the sender or receiver, so other clinicians in the patient's circle of care could easily view the communication. Moreover, the system was designed to improve message response rates and allow for nonurgent messages.

Our communication system uses standard, commercially available components (smartphones, SMS), and relatively basic functionality (handover, secure messaging). Important findings are that the current system of paging can be transformed to a more efficient system that users will readily adopt. We found positive effects with components of the system. It appeared to improve efficiency and increase accountability. Accountability is crucial and moves from undocumented conversation to fully documented details of interactions. This can be used for both incident review and to review for quality improvement.

Using the system, physicians perceived that they were bothered by low‐value information, whereas nurses perceived a lack of response, and both found that the system was not ideal for complex messages. The mismatch between what physicians and nurses perceive as important has been attributed to their different timeframes and context.[18] For nurses with an upcoming change of shift, they wanted resolution of issues before handover. A physician on a different ward may not appreciate the context of a nurse having to directly interact with an irate family member. These different perceptions likely contributed to the lack of response to 8.6% of text messages. This is still better than other systems, such as paging, which can be as high as 33%.[10] For nonurgent items, clinicians would ideally check and clear items regularly from the system using a desktop computer, responding within the allotted timeframe. Unfortunately, this never became part of routine physician workflow, likely due to their busy workload, so many physicians would only respond when items became overdue. However, having a method to deal with nonurgent messages may have prevented some interruptions during protected educational times of trainees. The system was also not ideal for urgent or complex items. Complex items can be difficult to convey using the rarified communication medium of text messages.[19, 20] Urgent or complex issues are likely best resolved with a face‐to‐face or telephone conversation.

There are several limitations in our study that should be considered when interpreting the results. It is a study of usage and perceptions after implementation. Although more rigorous study is required to evaluate the effects, we see this as a first step in process improvement. Future research should measure the impact on improving patient care of this system and on patient outcomes such as adverse events. The study and intervention was limited to general internal medicine wards in 2 academic hospital settings where there are frequent rotations of medical personnel. The findings may not be generalizable to other hospital settings.

Future directions should be to further improve on the communication system and to educate and train staff on how to effectively communicate. Survey results showed that although users perceived increased efficiency, there was still significant opportunity to improve. One way to improve would be to have a mobile application in which physicians can easily review nonurgent items. Improvements could also be realized by educating clinicians on the use of the system and providing immediate feedback. Providing feedback to physicians on how well they respond could address nurses' issues around lack of timely response. By creating consensus between nurses and physicians on what is of high and low value to communicate could increase satisfaction for all users.

In summary, we present the usage and perceptions of a system designed to improve hospital communication. We found that there was high uptake, and that users perceived it to improve efficiency, collaboration, and accountability, but it may not be useful for communicating complex issues.

ACKNOWLEDGEMENTS

The authors acknowledge the nurses, physicians, residents, and other health professions on the general internal medicine ward for their patience and support as we continue to try to innovate. The authors also acknowledge the members of the information systems department (Shared Information Management Systems, University Health Network) who helped to support the Communication System, and the software developer, QRS, that helped to codevelop the software system.

Disclosures: The hospital was in a codevelopment agreement that has since terminated. No researcher or hospital received any funds from private industry for any purpose including personal or research. The authors report no conflicts of interest.

Previous studies have advocated the importance of effective communication between clinicians as a critical component in the provision of high‐quality patient care.[1, 2, 3, 4] There is increasing interest in the use of information and communication technologies to improve how clinicians communicate in hospital settings. A number of hospitals have implemented different solutions to improve communication. These solutions include alphanumeric pagers,[5] smartphones,[6] e‐mail,[7] secure text messaging,[8] and a Web‐based interdisciplinary communication tool.[9]

These systems have different limitations that render them inefficient and likely inhibit collaborative care. Current systems, such as pagers, rely on the sender to ensure the message was received and are successful in delivering messages approximately 67% of the time.[5, 9, 10] Although alphanumeric pagers and secure text messaging can increase the likelihood of delivery, these messages are often isolated and not easily viewable by the whole care team.[11] Improved systems should also reduce unnecessary interruptions by providing support for both urgent and delayed messages. Finally, messages should be stored and retrievable to enable increased accountability and allow for review for quality improvement initiatives.

It is also important to consider the unintended consequences of technology implementations.[12] Moving communication to text messages and smartphones has the potential to weaken interprofessional relationships and can increase confusion if used for complex issues.[10, 13] In this article, we present a system designed to improve interprofessional communication on general internal medicine wards by incorporating these desired features, and we describe the usage of and attitudes toward the system, specifically assessing for effects on multiple domains including efficiency, interprofessional collaboration, and relationships.

METHODS

Research Question

Will nurses and physicians use a system designed to improve interprofessional communication and will they perceive it to be effective and improve workflow?

Setting

The study took place on the general internal medicine wards at Toronto General Hospital and Toronto Western Hospital, 2 large academic teaching hospitals. There are several general internal medicine wards at each site with approximately 80 beds at each site. At each site there are 4 clinical teaching units and 1 hospitalist team. The study was approved by the research ethics board at the University Health Network.

Intervention

To address issues with communication, we developed a system, Clinical Message (CM), that included 2 main components: a physician handover tool and a secure messaging module. The focus of CM was to improve communication and information flow among different healthcare providers (physicians, nurses, pharmacists, social workers, and therapists) through a secure, shared platform.

Physician Handover

The physician handover tool was designed to facilitate the physician handover process at shift change and is used as a patient rounding tool for day‐to‐day management of patients. It is also accessed by nurses and other clinicians to view the physicians' notes and to stay informed on the overall care plan. The tool contains standard elements including a list of patients with the following information on each patient: demographics, diagnosis, code status, medical history, active issues, and discharge plans (Figure 1).

Figure 1
Physician handover tool: patient list showing patient information and physician notes for a selected patient. (Note: not real patients.)

Secure Messaging

Secure messaging was designed around our dominant communication pattern: nurses sending messages to physicians, who would then respond. Nurses and other health professionals sent messages to the medical teams by accessing CM, selecting the appropriate patient, and filling out a message template. The system automatically populated the "To" field with the team assigned to the selected patient. Messaging for each team was centralized around a single team smartphone that was carried 24 hours a day, 7 days a week by a physician on that team. This removed the guesswork of trying to identify the individual physician responsible for that patient. For each message, a subject or issue and content were entered (Figure 2). Logic was also incorporated to reduce the number of unnecessary interruptions. Senders would choose to send the message immediately as an "interrupt" message (urgent) for urgent or time-sensitive issues, or as an "allow time to respond" message (delayed). For the latter, the message was posted to the system, where physicians could check and answer it. Interrupt messages were sent to the team smartphone using the Short Message Service (SMS) protocol. To try to ensure that the communication loop on any issue was closed, when a message requested a response and did not receive it, the system sent another message. For urgent messages, a repeat message was initiated after 15 minutes. For delayed messages, the sender defined when they needed a response, typically within 2 to 8 hours. Senders were also able to select the mode of response that would best meet their needs from a workflow perspective: call back, text reply, or no reply required. Senders were also able to verify whether the messages were received by the physician's smartphone. Physicians could view the messages within CM and reply. For messages that went to their team smartphone, physicians could respond from the smartphone through a secure Web link.
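The repeat-message logic described above can be illustrated with a minimal sketch. It is not the CM implementation: the class, field, and function names are hypothetical, and the only values taken from the text are the 15-minute repeat interval for urgent messages and the sender-defined deadline for delayed messages.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Stated in the text: urgent messages are repeated after 15 minutes without a reply.
URGENT_REPEAT_AFTER = timedelta(minutes=15)


@dataclass
class Message:
    """Hypothetical message record; field names are illustrative only."""
    priority: str                           # "immediate" or "delayed"
    response_type: str                      # "text_reply", "call_back", or "none"
    created: datetime
    respond_by: Optional[datetime] = None   # sender-defined deadline (delayed only)
    replied: Optional[datetime] = None      # time of the reply, if any


def repeat_due_at(msg: Message) -> Optional[datetime]:
    """Time at which a repeat message becomes due, or None if no reply was requested."""
    if msg.response_type == "none":
        return None
    if msg.priority == "immediate":
        return msg.created + URGENT_REPEAT_AFTER
    return msg.respond_by


def repeat_triggered(msg: Message, now: datetime) -> bool:
    """True if, by `now`, the deadline passed without a reply arriving in time."""
    due = repeat_due_at(msg)
    if due is None:
        return False
    answered_in_time = msg.replied is not None and msg.replied <= due
    return now >= due and not answered_in_time
```

Under these assumptions, an urgent text-reply message created at 16:58 and answered at 17:00 never triggers a repeat, whereas a delayed message with a 23:00 deadline and no reply triggers one once 23:00 has passed.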

Figure 2
Patient list with a selected patient: sending a message on the Clinical Message system. (Note: not real patients.)

Because the messages were linked to the patients, they were visible to the entire care team, not just the message sender and recipient. If the care of the patient was transferred from 1 clinician to the next, the new clinician could easily review prior messages to understand recent patient events. The system was accessible through a browser on the intranet. The system regularly pulled patient demographic details such as name, age, medical record number, and location from our electronic medical record through a 1‐way interface. Information from this communication system was not considered part of the medical record but was retrievable.

The system was introduced as the new standard method of communication for nurses to reach physicians for all of the general internal medicine wards and for all medical teams at site 1 on May 2, 2011 and site 2 on June 6, 2011. The system replaced a text‐based Web‐paging system and supplemented the numeric pager carried by residents. Initial training of a half hour was provided to all nurses and residents.

Message Analysis for Usage Statistics

We analyzed messages created and sent via the CM system from May 2011 until August 2012. The extracted message information included date and time sent, issue, level of urgency, response type requested, roles of clinicians involved from the associated team, hospital site (senders and receivers), and message details. The following inclusion criteria were used for the analyses: (1) the senders and receivers of the messages could not be CM support staff, and (2) the messages sent were intended for the team smartphones used by the respective medical teams, not individual clinicians. Descriptive statistics and frequency analysis were performed using Microsoft Excel (Microsoft Corp., Redmond, WA) and IBM SPSS (IBM, Armonk, NY).

Survey

Development of the Survey

We used standard methods to develop a survey to assess staff perceptions on the impact of the new communication system. Relevant questionnaire items were compiled from a systematic review of the literature for communication surveys and communication issues that included the following domains: efficiency, accountability, accuracy, collaboration, timeliness, richness of the communication medium, and impact on interprofessional relationships and verbal communication.[10, 14, 15] We carried out pilot testing with 5 nurses and physicians, and modified the questionnaires based on their feedback.

Sampling and Data Collection of the Survey

Survey participants consisted of 2 groups of clinicians: (1) medical trainees that included medical residents, medical interns, and clinical fellows, and (2) nursing staff that included part‐time and full‐time nurses. To qualify for inclusion, participants had to have used the CM system for at least a month prior to administration of the questionnaire.

Data Analysis

Responses were recorded into an Excel spreadsheet that was imported into SPSS for analysis. Categorical variables were described using proportions. Survey comments were grouped into common themes, and themes mentioned by more than 1 respondent were reported.

RESULTS

Usage Analysis

A total of 60,969 messages were sent using CM between May 2, 2011 and August 19, 2012. On average, a team would receive 14.8 messages per day. Of all messages, 76.5% requested a text reply, 7.7% requested a call‐back, and 15.7% did not request a response. More than two‐thirds of messages at both hospitals were sent as immediate. Of the nonurgent messages, 86% were not replied to within the desired time, requiring a repeat message to be sent. Examples of different types of messages are shown in Table 1.

Table 1
Examples of Types of Messages Sent Through the System and the Replies

| Sender | Issue | Details | Priority | Desired Response Type | Time Created | Time Sent | Reply | Time Replied |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Nurse | Vital sign | Pt's BP is 182/95, HR is 108 now. Previous at 0800 was 165/78; HR was 99. PT is not on antihypertensive meds. | Allow time to respond (23:00) | Text reply | 21:43 | 23:02 | OK. Will assess. | 23:03 |
| Nurse | NG tube | NG tube is in place. Can you please enter portable chest x‐ray to check placement ASAP? | Immediate | Text reply | 16:58 | 16:58 | Will do. | 17:00 |
| Nurse | Bloodwork | Pt creat=216. Pt has NS @ 75 cc/hr. Pt has noted crackles throughout lung fields and has productive cough; eating and drinking well. Would you like it continued as well? Pt O2Sat 93% RA; would you like 4 L of O2 continued? Pls call for telephone order. | Immediate | Call back | 12:53 | 13:04 | Dealt with it on phone. | 13:05 |
| Nurse | Pain control | Hello! Pt has been getting 1 mg hydromorphone IV q 1 hr and pain is still not controlled. Pt remains awake and alert. Thanks! | Immediate | Info only | 15:41 | 15:41 | Thank you. | 15:42 |

NOTE: Abbreviations: BP, blood pressure; HR, heart rate; PT, patient; NG, nasogastric; creat, creatinine; NS, normal saline; RA, room air; IV, intravenous.

For messages requesting a text reply, 8.6% did not receive a reply. The median response time was 2.3 minutes (interquartile range of 5.8 minutes), but some messages did not receive a response even after a week, which skewed the distribution of response times. Of the messages that did receive a reply, 68.9% were answered within 5 minutes and 84.5% within 15 minutes. Messages were predominantly received between 9 am and midnight (see Supporting Figure 1 in the online version of this article). Because the sending of some messages was delayed, there appeared to be fewer messages received during protected educational times (8 to 9 am and 12 to 1 pm) as well as between midnight and 7 am compared to other times.
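As an illustration of how summary figures of this kind can be derived from a message log, the following sketch computes the nonresponse percentage, the median and interquartile range of response times, and the share of replies within given thresholds. It is not the authors' analysis (which used Excel and SPSS): the function name and the input format are hypothetical, and it assumes the median and interquartile range are taken over replied messages only, which the text does not state.

```python
from statistics import median, quantiles


def summarize_response_times(response_minutes, thresholds=(5, 15)):
    """Summarize response times in minutes; None marks a message with no reply.

    Assumption: median and IQR are computed over replied messages only.
    """
    replied = [t for t in response_minutes if t is not None]
    q1, _, q3 = quantiles(replied, n=4)  # quartile cut points
    return {
        "no_reply_pct": 100 * (len(response_minutes) - len(replied)) / len(response_minutes),
        "median": median(replied),
        "iqr": q3 - q1,
        "within_pct": {
            m: 100 * sum(t <= m for t in replied) / len(replied) for m in thresholds
        },
    }
```

Because a few very late replies stretch the upper tail, the median and interquartile range are more informative here than a mean, which is consistent with the skew noted above.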

Survey Results

Between April 2013 and June 2013, 82 of 86 medical trainees (95.3%) and 83 of 116 nurses (71.6%) completed the survey, for an overall response rate of 81.7%. Clinicians perceived that CM appeared to have a positive impact on efficiency. In particular, 82.8% of physicians and 78.3% of nurses agreed or strongly agreed that CM helped speed up daily work tasks (Table 2). The majority of physicians and nurses agreed that the system increased accountability, increased timeliness of communication, and improved interprofessional relationships. It was not seen to be effective for communicating complex patient issues.

Table 2
Summary of Survey Responses

| Item | No. of Subitems in Survey | Physician (% Agree, Strongly Agree), n=82 | Nurse (% Agree, Strongly Agree), n=83 |
| --- | --- | --- | --- |
| Positive impact on efficiency. | 7 | 58.9% | 66.6% |
| The CM system helps speed up my daily work tasks. | | 82.8% | 78.3% |
| Positive impact on physician‐nurse collaboration. | 6 | 55.3% | 58.5% |
| The CM system increases the amount of communication between nurses and physicians. | | 50.6% | 67.1% |
| Improved timeliness of communication. | 5 | 54.2% | 50.5% |
| Communication through the CM system helps me resolve patient issues within the appropriate time frames. | | 66.7% | 55.6% |
| Increased accountability. | 2 | 67.1% | 73.2% |
| Improved accuracy of communications. | 3 | 41.6% | 50.7% |
| Improved interprofessional relationships. | 2 | 62.2% | 53.6% |
| Increased verbal communications. | 2 | 35.1% | 25.3% |
| Richness of the communication medium. | 6 | 40.7% | 48.3% |
| I find the CM system useful for communicating complex patient issues. | | 35.8% | 26.3% |
| I would prefer CM over standard hospital communication methods such as numeric paging. | 1 | 68.3% | 76.5% |
| I enjoy using the CM system for clinical communication on the wards. | 1 | 63.0% | 79.0% |
| Communication through the CM system helps to reduce interruptions for physicians. | 1 | 45.7% | |

NOTE: Major groupings are listed. For those with multiple (>3) items in the survey, important items are listed. Abbreviations: CM, Clinical Message.

Survey comments revealed that nurses perceived a lack of desired response, whereas physicians noted being interrupted with low‐value information through the system (Table 3). Both commented that further functionality, such as an active message stream, would be of benefit. Difficulty in communicating complex issues was also noted.

Table 3
Issues Mentioned in Survey Comments by Occurrences

| Issue | Occurrences: MD | Occurrences: RN | Occurrences: Total | Example |
| --- | --- | --- | --- | --- |
| Lack of response | 1 | 10 | 11 | It depends if they respond quickly or not. A few times I send the 2nd message to remind them of the issue. I also spend more time to check if they answer it or not. I even call their Blackberries at last to get a response. |
| Message stream | 3 | 4 | 7 | I wish that I could see follow‐up messages after my initial reply (ie, it would be nice to have an open message stream). |
| Difficult to communicate complex issues | 1 | 5 | 6 | Difficult to communicate complex issues. Takes a lot of time to respond, and it becomes inefficient when responding to nonurgent CM because it interrupts workflow. |
| Many messages are low‐value interrupts | 3 | 0 | 3 | CM is useful for handover between clinicians, but often it slows down the clinician when they are used for information‐related low‐value/noncritical messages between nurses and clinicians |
| Lack of detailed response | 0 | 3 | 3 | Specific messages regarding response to care is required most times. For example, acknowledged is not a favorable response. |
| Technical issues | 2 | 0 | 2 | I find CM very useful. We have had multiple issues with our Blackberry this month, and CM was not working. When it is up and running, however, it is a wonderful tool. |
| Discrepancy in perceived urgency | 2 | 0 | 2 | Discrepancy between what nurses find urgent and what we find urgent. |

NOTE: Abbreviations: CM, Clinical Message; MD, medical doctor; RN, registered nurse.

DISCUSSION

We describe an implementation of a system to improve clinical communication in hospitals. The system was highly used and was perceived to improve communication by both nurses and physicians. Specifically, users found that the system increased efficiency, accountability, timeliness, and collaboration, but that there were issues with message clarity for complex medical issues.

Other systems and approaches have been implemented to improve communication. These included the use of alphanumeric pagers, e‐mail, secure texting, and smartphones. There is evidence that more advanced systems can improve efficiency for senders.[16] A recent randomized trial of secure text messaging found that it was perceived to be more efficient than paging, but overall usage was low and inconsistent.[8] There is also evidence that smartphones may increase interruptions, worsen interprofessional relationships, and cause issues with professional behavior.[10] Unfortunately, there are a limited number of interventions that improve communication, with some improving efficiency but none demonstrating improved patient‐oriented outcomes.[16, 17] This study evaluated a novel system, with functionality to link communication to patients, and created a system that aligned with the workflow of the clinicians. Messages were linked to the patient, not the sender or receiver, so other clinicians in the patient's circle of care could easily view the communication. Moreover, the system was designed to improve message response rates and allow for nonurgent messages.

Our communication system uses standard, commercially available components (smartphones, SMS) and relatively basic functionality (handover, secure messaging). An important finding is that the current system of paging can be transformed into a more efficient system that users will readily adopt. We found positive effects with components of the system: it appeared to improve efficiency and increase accountability. Accountability is crucial, and the system moves communication from undocumented conversations to fully documented interactions. This record can be used both for incident review and for quality improvement.

Using the system, physicians perceived that they were bothered by low‐value information, whereas nurses perceived a lack of response, and both found that the system was not ideal for complex messages. The mismatch between what physicians and nurses perceive as important has been attributed to their different timeframes and context.[18] Nurses with an upcoming change of shift wanted resolution of issues before handover. A physician on a different ward may not appreciate the context of a nurse having to directly interact with an irate family member. These different perceptions likely contributed to the lack of response to 8.6% of text messages. This is still better than other systems, such as paging, for which the nonresponse rate can be as high as 33%.[10] For nonurgent items, clinicians would ideally check and clear items regularly from the system using a desktop computer, responding within the allotted timeframe. Unfortunately, this never became part of routine physician workflow, likely because of their busy workload, so many physicians would respond only when items became overdue. However, having a method to deal with nonurgent messages may have prevented some interruptions during the protected educational times of trainees. The system was also not ideal for urgent or complex items. Complex items can be difficult to convey using the rarefied communication medium of text messages.[19, 20] Urgent or complex issues are likely best resolved with a face‐to‐face or telephone conversation.

There are several limitations in our study that should be considered when interpreting the results. It is a study of usage and perceptions after implementation. Although more rigorous study is required to evaluate the effects, we see this as a first step in process improvement. Future research should measure the impact of this system on patient care and on patient outcomes such as adverse events. The study and intervention were limited to general internal medicine wards in 2 academic hospital settings where there are frequent rotations of medical personnel. The findings may not be generalizable to other hospital settings.

Future directions should be to further improve the communication system and to educate and train staff on how to communicate effectively. Survey results showed that although users perceived increased efficiency, there was still significant opportunity to improve. One way to improve would be to have a mobile application in which physicians can easily review nonurgent items. Improvements could also be realized by educating clinicians on the use of the system and providing immediate feedback. Providing feedback to physicians on how well they respond could address nurses' issues around lack of timely response. Creating consensus between nurses and physicians on what is of high and low value to communicate could increase satisfaction for all users.

In summary, we present the usage and perceptions of a system designed to improve hospital communication. We found that there was high uptake, and that users perceived it to improve efficiency, collaboration, and accountability, but it may not be useful for communicating complex issues.

ACKNOWLEDGEMENTS

The authors acknowledge the nurses, physicians, residents, and other health professionals on the general internal medicine ward for their patience and support as we continue to try to innovate. The authors also acknowledge the members of the information systems department (Shared Information Management Systems, University Health Network) who helped to support the Communication System, and the software developer, QRS, that helped to codevelop the software system.

Disclosures: The hospital was in a codevelopment agreement that has since terminated. No researcher or hospital received any funds from private industry for any purpose including personal or research. The authors report no conflicts of interest.

References
  1. Coiera E. When conversation is better than computation. J Am Med Inform Assoc. 2000;7(3):277-286.
  2. Brennan TA, Leape LL, Laird NM, et al. Incidence of adverse events and negligence in hospitalized patients. Results of the Harvard Medical Practice Study I. N Engl J Med. 1991;324(6):370-376.
  3. Woods DM, Holl JL, Angst DB, et al. Gaps in pediatric clinician communication and opportunities for improvement. J Healthc Qual. 2008;30(5):43-54.
  4. Wilson RM, Runciman WB, Gibberd RW, Harrison BT, Hamilton JD. Quality in Australian health care study. Med J Aust. 1996;164(12):754.
  5. Wong BM, Quan S, Shadowitz S, Etchells E. Implementation and evaluation of an alphanumeric paging system on a resident inpatient teaching service. J Hosp Med. 2009;4(8):E34-E40.
  6. Quan S, Wu R, Morra D, et al. Demonstrating the BlackBerry as a clinical communication tool: a pilot study conducted through the Centre for Innovation in Complex Care. Healthc Q. 2008;11(4):94-98.
  7. O'Connor C, Friedrich JO, Scales DC, Adhikari NK. The use of wireless email to improve healthcare team communication. J Am Med Inform Assoc. 2009;16(5):705-713.
  8. Przybylo JA, Wang A, Loftus P, Evans KH, Chu I, Shieh L. Smarter hospital communication: secure smartphone text messaging improves provider satisfaction and perception of efficacy, workflow. J Hosp Med. 2014;9(9):573-578.
  9. Locke KA, Duffey‐Rosenstein B, De Lio G, Morra D, Hariton N. Beyond paging: building a web‐based communication tool for nurses and physicians. J Gen Intern Med. 2009;24(1):105-110.
  10. Wu RC, Lo V, Morra D, et al. The intended and unintended consequences of communication systems on general internal medicine inpatient care delivery: a prospective observational case study of five teaching hospitals. J Am Med Inform Assoc. 2013;20(4):766-777.
  11. Wu RC, Lo V, Rossos P, et al. Improving hospital care and collaborative communications for the 21st century: key recommendations for general internal medicine. Interact J Med Res. 2012;1(2):e9.
  12. Bloomrosen M, Starren J, Lorenzi NM, Ash JS, Patel VL, Shortliffe EH. Anticipating and addressing the unintended consequences of health IT and policy: a report from the AMIA 2009 Health Policy Meeting. J Am Med Inform Assoc. 2011;18(1):82-90.
  13. Wu R, Rossos P, Quan S, et al. An evaluation of the use of smartphones to communicate between clinicians: a mixed‐methods study. J Med Internet Res. 2011;13(3):e59.
  14. Shortell SM, Rousseau DM, Gillies RR, Devers KJ, Simons TL. Organizational assessment in intensive care units (ICUs): construct development, reliability, and validity of the ICU nurse‐physician questionnaire. Med Care. 1991;29(8):709-726.
  15. Suh KS. Impact of communication medium on task performance and satisfaction: an examination of media‐richness theory. Inform Manag. 1999;35:295-312.
  16. Wu RC, Tran K, Lo V, et al. Effects of clinical communication interventions in hospitals: a systematic review of information and communication technology adoptions for improved communication between clinicians. Int J Med Inform. 2012;81(11):723-732.
  17. Walsh C, Siegler EL, Cheston E, et al. Provider‐to‐provider electronic communication in the era of meaningful use: a review of the evidence. J Hosp Med. 2013;8(10):589-597.
  18. Quan SD, Morra D, Lau FY, et al. Perceptions of urgency: defining the gap between what physicians and nurses perceive to be an urgent issue. Int J Med Inform. 2013;82(5):378-386.
  19. Wu R, Appel L, Morra D, Lo V, Kitto S, Quan S. Short message service or disService: issues with text messaging in a complex medical environment. Int J Med Inform. 2014;83(4):278-284.
  20. Iversen TB, Melby L, Toussaint P. Instant messaging at the hospital: supporting articulation work? Int J Med Inform. 2013;82(9):753-761.
References
  1. Coiera E. When conversation is better than computation. J Am Med Inform Assoc. 2000;7(3):277286.
  2. Brennan TA, Leape LL, Laird NM, et al. Incidence of adverse events and negligence in hospitalized patients. Results of the Harvard Medical Practice Study I. N Engl J Med. 1991;324(6):370376.
  3. Woods DM, Holl JL, Angst DB, et al. Gaps in pediatric clinician communication and opportunities for improvement. J Healthc Qual. 2008;30(5):4354.
  4. Wilson RM, Runciman WB, Gibberd RW, Harrison BT, Hamilton JD. Quality in Australian health care study. Med J Aust. 1996;164(12):754.
  5. Wong BM, Quan S, Shadowitz S, Etchells E. Implementation and evaluation of an alphanumeric paging system on a resident inpatient teaching service. J Hosp Med. 2009;4(8):E34E40.
  6. Quan S, Wu R, Morra D, et al. Demonstrating the BlackBerry as a clinical communication tool: a pilot study conducted through the Centre for Innovation in Complex Care. Healthc Q. 2008;11(4):9498.
  7. O'Connor C, Friedrich JO, Scales DC, Adhikari NK. The use of wireless email to improve healthcare team communication. J Am Med Inform Assoc. 2009;16(5):705713.
  8. Przybylo JA, Wang A, Loftus P, Evans KH, Chu I, Shieh L. Smarter hospital communication: secure smartphone text messaging improves provider satisfaction and perception of efficacy, workflow. J Hosp Med. 2014;9(9):573578.
  9. Locke KA, Duffey‐Rosenstein B, De Lio G, Morra D, Hariton N. Beyond paging: building a web‐based communication tool for nurses and physicians. J Gen Intern Med. 2009;24(1):105110.
  10. Wu RC, Lo V, Morra D, et al. The intended and unintended consequences of communication systems on general internal medicine inpatient care delivery: a prospective observational case study of five teaching hospitals. J Am Med Inform Assoc. 2013;20(4):766777.
  11. Wu RC, Lo V, Rossos P, et al. Improving hospital care and collaborative communications for the 21st century: key recommendations for general internal medicine. Interact J Med Res. 2012;1(2):e9.
  12. Bloomrosen M, Starren J, Lorenzi NM, Ash JS, Patel VL, Shortliffe EH. Anticipating and addressing the unintended consequences of health IT and policy: a report from the AMIA 2009 Health Policy Meeting. J Am Med Inform Assoc. 2011;18(1):8290.
  13. Wu R, Rossos P, Quan S, et al. An evaluation of the use of smartphones to communicate between clinicians: a mixed‐methods study. J Med Internet Res. 2011;13(3):e59.
  14. Shortell SM, Rousseau DM, Gillies RR, Devers KJ, Simons TL. Organizational assessment in intensive care units (ICUs): construct development, reliability, and validity of the ICU nurse‐physician questionnaire. Med Care. 1991;29(8):709726.
  15. Suh KS. Impact of communication medium on task performance and satisfaction: an examination of media‐richness theory. Inform Manag. 1999;35:295312.
  16. Wu RC, Tran K, Lo V, et al. Effects of clinical communication interventions in hospitals: a systematic review of information and communication technology adoptions for improved communication between clinicians. Int J Med Inform. 2012;81(11):723732.
  17. Walsh C, Siegler EL, Cheston E, et al. Provider‐to‐provider electronic communication in the era of meaningful use: a review of the evidence. J Hosp Med. 2013;8(10):589597.
  18. Quan SD, Morra D, Lau FY, et al. Perceptions of urgency: defining the gap between what physicians and nurses perceive to be an urgent issue. Int J Med Inform. 2013;82(5):378386.
  19. Wu R, Appel L, Morra D, Lo V, Kitto S, Quan S. Short message service or disService: issues with text messaging in a complex medical environment. Int J Med Inform. 2014;83(4):278284.
  20. Iversen TB, Melby L, Toussaint P. Instant messaging at the hospital: supporting articulation work? Int J Med Inform. 2013;82(9):753761.
Issue
Journal of Hospital Medicine - 10(2)
Page Number
83-89
Display Headline
A smartphone‐enabled communication system to improve hospital communication: Usage and perceptions of medical trainees and nurses on general internal medicine wards
Article Source

© 2014 Society of Hospital Medicine

Correspondence Location
Address for correspondence and reprint requests: Robert Wu, MD, 200 Elizabeth St., 14EN‐222, Toronto, ON, M5G 2C4 Canada; Telephone: 416‐340‐4567; Fax: 416‐595‐5826; E‐mail: robert.wu@uhn.ca

Impaired Arousal and Mortality

Article Type
Changed
Sun, 05/21/2017 - 13:35
Display Headline
Impaired arousal at initial presentation predicts 6‐month mortality: An analysis of 1084 acutely ill older patients

Arousal is defined as the patient's overall level of responsiveness to the environment. Its assessment is the standard of care in most intensive care units (ICUs) to monitor depth of sedation and underlying brain dysfunction. There has been recent interest in expanding the role of arousal assessment beyond the ICU. Specifically, the Veterans Affairs Delirium Working Group proposed that simple arousal assessment be a vital sign to quantify underlying brain dysfunction.[1] The rationale is that impaired arousal is closely linked with delirium,[2] and is an integral component of multiple delirium assessments.[3, 4, 5] Chester et al. observed that the presence of impaired arousal was 64% sensitive and 93% specific for delirium diagnosed by a psychiatrist.[2] Delirium is an under‐recognized public health problem that affects up to 25% of older hospitalized patients,[6, 7] is associated with a multitude of adverse outcomes such as death and accelerated cognitive decline,[8] and costs the US healthcare system an excess of $152 billion.[9]

Most delirium assessments require the patient to undergo additional cognitive testing. The assessment of arousal, however, requires the rater to merely observe the patient during routine clinical care and can be easily integrated into the clinical workflow.[10] Because of its simplicity and brevity, assessing arousal alone using validated scales such as the Richmond Agitation‐Sedation Scale (RASS) may be a more appealing alternative to traditional, more complex delirium screening in the acute care setting. Its clinical utility would be further strengthened if impaired arousal was also associated with mortality, and conferred risk even in the absence of delirium. As a result, we sought to determine if impaired arousal at initial presentation in older acutely ill patients predicted 6‐month mortality and whether this relationship was present in the absence of delirium.

METHODS

Design Overview

We performed a planned secondary analysis of 2 prospective cohorts that enrolled patients from May 2007 to August 2008 between 8 am and 10 pm during the weekdays, and July 2009 to February 2012 between 8 am and 4 pm during the weekdays. The first cohort was designed to evaluate the relationship between delirium and patient outcomes.[11, 12] The second cohort was used to validate brief delirium assessments using a psychiatrist's assessment as the reference standard.[5, 13] The local institutional review board approved these studies.

Setting and Participants

These studies were conducted in an urban emergency department located within an academic, tertiary care hospital with over 57,000 visits annually. Patients were included if they were 65 years or older and in the emergency department for <12 hours at the time of enrollment. The 12‐hour cutoff was used to include patients who presented to the emergency department in the evening and early morning hours. Patients were excluded if they were previously enrolled, non‐English speaking, comatose, or were nonverbal and unable to follow simple commands prior to the acute illness. Because the July 2009 to February 2012 cohort was designed to validate delirium assessments with auditory and visual components, patients were also excluded if they were deaf or blind.

Measurement of Arousal

The RASS is an arousal scale commonly used in ICUs to assess depth of sedation and ranges from −5 (unarousable) to +4 (combative); 0 represents normal arousal.[10, 14] The RASS simply requires the rater to observe the patient during routine interactions and does not require any additional cognitive testing. The RASS term "sedation" was modified to "drowsy" (Table 1), because we wanted to capture impaired arousal regardless of sedation administration. We did not use the modified RASS (mRASS) proposed by the Veterans Affairs Delirium Working Group, because it was published after data collection began.[1] The mRASS is very similar to the RASS, except that it also incorporates a very informal inattention assessment. The RASS was ascertained by research assistants who were college students and graduates, emergency medical technician basics, and paramedics. The principal investigator gave them a 5‐minute didactic lecture about the RASS and observed them perform the RASS in at least 5 patients prior to the start of the study. Inter‐rater reliability between trained research assistants and a physician was assessed for 456 (42.0%) patients of the study sample. The weighted kappa of the RASS was 0.61, indicating good inter‐rater reliability. Because 81.7% of patients with impaired arousal had a RASS of −1, the RASS was dichotomized as normal (RASS=0) or impaired (RASS other than 0).

Table 1. Richmond Agitation‐Sedation Scale

Score | Term | Description
+4 | Combative | Overtly combative, violent, immediate danger to staff
+3 | Very agitated | Pulls or removes tube(s) or catheter(s), aggressive
+2 | Agitated | Frequent nonpurposeful movement
+1 | Restless | Anxious but movements not aggressive or vigorous
0 | Alert and calm |
−1 | Slightly drowsy | Not fully alert, but has sustained awakening (eye opening/eye contact) to voice (>10 seconds)
−2 | Moderately drowsy | Briefly awakens with eye contact to voice (<10 seconds)
−3 | Very drowsy | Movement or eye opening to voice (but no eye contact)
−4 | Awakens to pain only | No response to voice, but movement or eye opening to physical stimulation
−5 | Unarousable | No response to voice or physical stimulation

NOTE: The Richmond Agitation‐Sedation Scale (RASS) is a brief (<10 seconds) arousal scale that was developed by Sessler et al.[10] The RASS is traditionally used in the intensive care unit to monitor depth of sedation. The terms were modified to better reflect the patient's level of arousal rather than sedation. A RASS of 0 indicates normal level of arousal (awake and alert), whereas a RASS <0 indicates decreased arousal, and a RASS >0 indicates increased arousal.
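
The dichotomization described above can be illustrated with a minimal Python sketch. The names `RASS_TERMS` and `classify_arousal` are hypothetical and chosen only for illustration; the sketch assumes negative scores denote decreased arousal, as the table note states.

```python
from typing import Dict

# Modified RASS terms ("drowsy" in place of "sedation").
# Negative scores denote decreased arousal, positive scores increased arousal.
RASS_TERMS: Dict[int, str] = {
    4: "Combative",
    3: "Very agitated",
    2: "Agitated",
    1: "Restless",
    0: "Alert and calm",
    -1: "Slightly drowsy",
    -2: "Moderately drowsy",
    -3: "Very drowsy",
    -4: "Awakens to pain only",
    -5: "Unarousable",
}


def classify_arousal(rass: int) -> str:
    """Dichotomize a RASS score: 0 is normal, any other valid score is impaired."""
    if rass not in RASS_TERMS:
        raise ValueError(f"RASS must be an integer from -5 to +4, got {rass!r}")
    return "normal" if rass == 0 else "impaired"
```

Under this rule a patient scored −1 (slightly drowsy) and a patient scored +2 (agitated) both fall in the impaired group.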

Death Ascertainment

Death within 6 months was ascertained using the following algorithm: (1) The electronic medical record was searched to determine the patient's death status. (2) Patients who had a documented emergency department visit, outpatient clinic visit, or hospitalization after 6 months were considered to be alive at 6 months. (3) For the remaining patients, date of death was searched in the Social Security Death Index (SSDI). (4) Patients without a death recorded in the SSDI 1 year after the index visit were considered to be alive at 6 months. Nine hundred thirty‐one (85.9%) out of 1084 patients had a recorded death in the medical record or SSDI, or had an emergency department or hospital visit documented in their record 6 months after the index visit.
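
The four-step algorithm can be sketched in Python as follows. This is only an illustration: the `PatientRecord` fields, the function name, and the 183-day approximation of 6 months are assumptions, not details reported by the study.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "6 months" is approximated as 183 days.
SIX_MONTHS = timedelta(days=183)


@dataclass
class PatientRecord:
    """Hypothetical record layout used only for this sketch."""
    index_visit: date
    emr_death_date: Optional[date] = None      # death documented in the medical record
    last_emr_encounter: Optional[date] = None  # latest ED, clinic, or hospital visit
    ssdi_death_date: Optional[date] = None     # death found in the SSDI


def died_within_6_months(p: PatientRecord) -> bool:
    cutoff = p.index_visit + SIX_MONTHS
    # Step 1: death status from the electronic medical record.
    if p.emr_death_date is not None:
        return p.emr_death_date <= cutoff
    # Step 2: a documented encounter after 6 months means the patient was alive.
    if p.last_emr_encounter is not None and p.last_emr_encounter > cutoff:
        return False
    # Step 3: otherwise look for a date of death in the SSDI.
    if p.ssdi_death_date is not None:
        return p.ssdi_death_date <= cutoff
    # Step 4: no SSDI death record (assumes the lookup was done at least
    # 1 year after the index visit), so the patient is considered alive.
    return False
```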

Additional Variables Collected

Patients were considered to have dementia if they had: (1) documented dementia in the medical record, (2) a short form Informant Questionnaire on Cognitive Decline in the Elderly score (IQCODE) greater than 3.38,[15] or (3) prescribed cholinesterase inhibitors prior to admission. The short form IQCODE is an informant questionnaire with 16 items; a cutoff of 3.38 out of 5.00 is 79% sensitive and 82% specific for dementia.[16] Premorbid functional status was determined by the Katz Activities of Daily Living (Katz ADL) and ranges from 0 (completely dependent) to 6 (completely independent).[17] Patients with a score <5 were considered to be functionally dependent. Both the IQCODE and Katz ADL were prospectively collected in the emergency department at the time of enrollment.
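
As a minimal sketch, the dementia and functional dependence definitions above reduce to two threshold rules. The function and parameter names below are hypothetical; only the cutoffs (IQCODE greater than 3.38, Katz ADL below 5) come from the text.

```python
from typing import Optional

IQCODE_CUTOFF = 3.38          # short form IQCODE score above this suggests dementia
KATZ_DEPENDENT_BELOW = 5      # Katz ADL score below this counts as dependent


def has_dementia(documented: bool,
                 iqcode: Optional[float],
                 on_cholinesterase_inhibitor: bool) -> bool:
    """Any one of the three criteria is sufficient."""
    iqcode_positive = iqcode is not None and iqcode > IQCODE_CUTOFF
    return documented or on_cholinesterase_inhibitor or iqcode_positive


def is_functionally_dependent(katz_adl: int) -> bool:
    """Katz ADL ranges from 0 (completely dependent) to 6 (completely independent)."""
    if not 0 <= katz_adl <= 6:
        raise ValueError("Katz ADL must be between 0 and 6")
    return katz_adl < KATZ_DEPENDENT_BELOW
```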

The Charlson Comorbidity Index was used to measure comorbid burden.[18] The Acute Physiology Score (APS) of the Acute Physiology and Chronic Health Evaluation II score was used to quantify severity of illness.[19] The Glasgow Coma Scale was not included in the APS because it was not collected. Intravenous, intramuscular, and oral benzodiazepine and opioids given in the prehospital and emergency department were also recorded. The Charlson Comorbidity Index, APS, and benzodiazepine and opioid administration were collected after patient enrollment using the electronic medical record.

Within 3 hours of the RASS, a subset of 406 patients was evaluated by a consultation‐liaison psychiatrist who determined the patient's delirium status using Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM‐IV‐TR) criteria.[20] Details of their comprehensive assessments have been described in a previous report.[5]

Statistical Analysis

Measures of central tendency and dispersion for continuous variables were reported as medians and interquartile ranges. Categorical variables were reported as proportions. For simple comparisons, Wilcoxon rank sum tests were performed for continuous data, and chi‐square analyses or Fisher exact tests were performed for categorical data. To evaluate the predictive validity of impaired arousal on 6‐month mortality, the cumulative probability of survival was estimated within 6 months from the study enrollment date using the Kaplan‐Meier method. Cox proportional hazards regression was performed to assess if impaired arousal was independently associated with 6‐month mortality after adjusting for age, gender, nonwhite race, comorbidity burden (Charlson Comorbidity Index), severity of illness (APS), dementia, functional dependence (Katz ADL <5), nursing home residence, admission status, and benzodiazepine or opioid medication administration. Patients were censored at the end of 6 months. The selection of covariates was based upon expert opinion and literature review. The number of covariates used for the model was limited by the number of events to minimize overfitting; 1 df was allowed for every 10 to 15 events.[21] Because severity of illness, psychoactive medication administration, and admission status might modify the relationship between 6‐month mortality and impaired arousal, 2‐way interaction terms were incorporated. To maintain parsimony and minimize overfitting and collinearity, nonsignificant interaction terms (P>0.20) were removed in the final model.[22] Hazard ratios (HR) with their 95% confidence intervals (95% CI) were reported.
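
To make the Kaplan‐Meier step concrete, the following is a minimal pure-Python sketch of the product-limit estimator. It is illustrative only (the analyses themselves were run in SAS and R), and the function name and input layout are assumptions.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct death time.

    times  -- follow-up time for each patient (for example, days since enrollment)
    events -- True if the patient died at that time, False if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve
```

Patients still alive at the end of follow-up enter as censored observations (event `False`) at their last follow-up time, which mirrors the censoring at 6 months described above.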

To determine if arousal was associated with 6‐month mortality in the absence of delirium, we performed another Cox proportional hazard regression in a subset of 406 patients who received a psychiatrist assessment. Six‐month mortality was the dependent variable, and the independent variable was a 3‐level categorical variable of different arousal/delirium combinations: (1) impaired arousal/delirium positive, (2) impaired arousal/delirium negative, and (3) normal arousal (with or without delirium). Because there were only 8 patients who had normal arousal with delirium, this group was collapsed into the normal arousal without delirium group. Because there were 55 deaths, the number of covariates that could be entered into the Cox proportional hazard regression model was limited. We used the inverse weighted propensity score method to help minimize residual confounding.[23] Traditional propensity score adjustment could not be performed because there were 3 arousal/delirium categories. Similar to propensity score adjustment, inverse weighted propensity score method was used to help balance the distribution of patient characteristics among the exposure groups and also allow adjustment for multiple confounders while minimizing the degrees of freedom expended. A propensity score was the probability of having a particular arousal/delirium category based upon baseline patient characteristics. Multinomial logistic regression was performed to calculate the propensity score, and the baseline covariates used were age, gender, nonwhite race, comorbidity burden, severity of illness, dementia, functional dependence, and nursing home residence. For the Cox proportional hazard regression model, each observation was weighted by the inverse of the propensity score for their given arousal/delirium category; propensity scores exceeding the 95th percentile were trimmed to avoid overly influential weighting. Benzodiazepine and opioid medications given in the emergency department and admission status were adjusted as covariates in the weighted Cox proportional hazard regression model.
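
The weighting step can be sketched as follows, assuming the propensity scores from the multinomial model are already available. The function names and category labels are hypothetical. The sketch caps the weights at their 95th percentile, which is one reading of the trimming step and should be treated as an assumption rather than the study's exact procedure.

```python
from typing import Dict, List


def percentile(values: List[float], q: float) -> float:
    """Percentile by linear interpolation between order statistics (q in 0..100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def inverse_propensity_weights(propensities: List[Dict[str, float]],
                               observed: List[str],
                               cap_percentile: float = 95.0) -> List[float]:
    """Weight each patient by 1 / P(observed category | baseline covariates).

    propensities -- per patient, predicted probability of each arousal/delirium category
    observed     -- per patient, the category actually observed
    """
    raw = [1.0 / scores[category] for scores, category in zip(propensities, observed)]
    # Assumption: extreme weights are capped at the chosen percentile.
    cap = percentile(raw, cap_percentile)
    return [min(weight, cap) for weight in raw]
```

A patient who was unlikely to fall in the category actually observed receives a large weight, so capping limits the influence of a few such patients on the weighted Cox model.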

Nineteen patients (1.8%) had missing Katz ADL; these missing values were imputed using multiple imputation. The reliability of the final regression models was internally validated using the bootstrap method.[21] Two thousand sets of bootstrap samples were generated by resampling the original data, and the optimism was estimated to determine the degree of overfitting.[21] An optimism value >0.85 indicated no evidence of substantial overfitting.[21] Variance inflation factors were used to check multicollinearity. Schoenfeld residuals were also analyzed to determine goodness‐of‐fit and assess for outliers. P values <0.05 were considered statistically significant. All statistical analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC) and open source R statistical software version 3.0.1 (http://www.r‐project.org/).

RESULTS

A total of 1903 patients were screened, and 1084 patients met enrollment criteria (Figure 1). Of these, 1051 (97.0%) were non‐ICU patients. Patient characteristics of this cohort can be seen in Table 2. Enrolled patients and potentially eligible patients who presented to the emergency department during the enrollment window were similar in age, gender, and severity of illness, but enrolled patients were slightly more likely to have a chief complaint of chest pain and syncope (unpublished data).

Figure 1
Enrollment flow diagram. RASS, Richmond Agitation‐Sedation Scale. Patients who were non‐verbal or unable to follow simple commands prior to their acute illness were considered to have end‐stage dementia.
Table 2. Patient Characteristics

Variables | Normal Arousal, n=835 | Impaired Arousal, n=249 | P Value
Median age, y (IQR) | 74 (69-80) | 75 (70-83) | 0.005
Female gender | 459 (55.0%) | 132 (53.0%) | 0.586
Nonwhite race | 122 (14.6%) | 51 (20.5%) | 0.027
Residence | | | <0.001
  Home | 752 (90.1%) | 204 (81.9%) |
  Assisted living | 29 (3.5%) | 13 (5.2%) |
  Rehabilitation | 8 (1.0%) | 5 (2.0%) |
  Nursing home | 42 (5.0%) | 27 (10.8%) |
Dementia* | 175 (21.0%) | 119 (47.8%) | <0.001
Dependent | 120 (14.4%) | 99 (39.8%) | <0.001
Median Charlson (IQR) | 2 (1, 4) | 3 (2, 5) | <0.001
Median APS (IQR) | 2 (1, 4) | 2 (1, 5) | <0.001
Primary complaint | | | <0.001
  Abdominal pain | 45 (5.4%) | 13 (5.2%) |
  Altered mental status | 12 (1.4%) | 36 (14.5%) |
  Chest pain | 128 (15.3%) | 31 (12.5%) |
  Disturbances of sensation | 17 (2.0%) | 2 (0.8%) |
  Dizziness | 16 (1.9%) | 2 (0.8%) |
  Fever | 11 (1.3%) | 7 (2.8%) |
  General illness, malaise | 26 (3.1%) | 5 (2.0%) |
  General weakness | 68 (8.1%) | 29 (11.7%) |
  Nausea/vomiting | 29 (3.5%) | 4 (1.6%) |
  Shortness of breath | 85 (10.2%) | 21 (8.4%) |
  Syncope | 46 (5.5%) | 10 (4.0%) |
  Trauma, multiple organs | 19 (2.3%) | 8 (3.2%) |
  Other | 333 (39.9%) | 81 (32.5%) |
Benzodiazepines or opioid medications administration | 188 (22.5%) | 67 (26.9%) | 0.152
Admitted to the hospital | 478 (57.3%) | 191 (76.7%) | 0.002
  Internal medicine | 411 (86.0%) | 153 (80.1%) |
  Surgery | 38 (8.0%) | 21 (11.0%) |
  Neurology | 19 (4.0%) | 13 (6.8%) |
  Psychiatry | 1 (0.2%) | 2 (1.1%) |
  Unknown/missing | 9 (1.9%) | 2 (1.1%) |
Death within 6 months | 81 (9.7%) | 59 (23.7%) | <0.001

NOTE: Patient characteristics and demographics of enrolled patients. Continuous and ordinal variables are expressed as medians and interquartile ranges (IQR). Categorical variables are expressed as absolute numbers and percentages. *Patient was considered to have dementia if it was documented in the medical record, the patient was on home cholinesterase inhibitors, or had a short‐form Informant Questionnaire on Cognitive Decline in the Elderly >3.38. Patients with a Katz Activities of Daily Living of <5 were considered to be functionally dependent. There were 19 patients with missing Katz Activities of Daily Living scores. Charlson index is a weighted scale used to measure comorbidity burden. Higher scores indicate higher comorbidity burden. The Acute Physiology Score (APS) of the Acute Physiology and Chronic Health Evaluation II score was used to quantify severity of illness. Glasgow Coma Scale was not incorporated in this score. Higher scores indicate higher severity of illness.

Of those enrolled, 249 (23.0%) had an abnormal RASS at initial presentation, and their distribution can be seen in Figure 2. Within 6 months, patients with an abnormal RASS were more likely to die compared with patients with a RASS of 0 (23.7% vs 9.7%, P<0.001). The Kaplan‐Meier survival curves for all enrolled patients with impaired and normal RASS can be seen in Figure 3; the survival curve declined more slowly in patients with a normal RASS compared with those with an abnormal RASS.

Figure 2
Richmond Agitation‐Sedation Scale (RASS) distribution among enrolled patients. Distribution of RASS at initial presentation among 1084 acutely ill older patients, and of these, 1051 patients (97.0%) were non–intensive care unit patients. The RASS is a widely used arousal scale that can be performed during routine clinical care and takes <10 seconds to perform. A RASS of 0 indicates normal level of arousal (awake and alert), whereas a RASS of <0 indicates decreased arousal and a RASS of >0 indicates increased arousal.
Figure 3
Kaplan‐Meier survival curves in acutely ill older patients with normal and impaired arousal at initial presentation over a 6‐month period. Arousal was assessed using the Richmond Agitation‐Sedation Scale (RASS). Patients with impaired arousal were more likely to die compared to patients with normal arousal (23.7% vs 9.7%) within 6 months. Using Cox proportional hazard regression, patients with an abnormal RASS were 73% more likely to die within 6 months after adjusting for age, dementia, functional dependence, comorbidity burden, severity of illness, hearing impairment, nursing home residence, admission status, and administration of benzodiazepine/opioid medications. Severity of illness (P = 0.52), benzodiazepine/opioid medication administration (P = 0.38), and admission status (P = 0.57) did not modify the relationship between impaired arousal and 6‐month mortality. Abbreviations: CI, confidence interval.

Using Cox proportional hazards regression, the relationship between an abnormal RASS at initial presentation and 6‐month mortality persisted (HR: 1.73, 95% CI: 1.21‐2.49) after adjusting for age, sex, nonwhite race, comorbidity burden, severity of illness, dementia, functional dependence, nursing home residence, psychoactive medications given, and admission status. The interaction between an abnormal RASS and APS (severity of illness) had a P value of 0.52. The interaction between an abnormal RASS and benzodiazepine or opioid medication administration had a P value of 0.38. The interaction between an abnormal RASS and admission status had a P value of 0.57. This indicated that severity of illness, psychoactive medication administration, and admission status did not modify the relationship between an abnormal RASS and 6‐month mortality.
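
As a small worked example, the hazard ratio above can be converted to the percentage increase in hazard quoted elsewhere in the article, and an approximate standard error of the log hazard ratio can be recovered from the confidence interval. The second helper assumes the interval is a symmetric Wald interval on the log scale, which the article does not state; both function names are hypothetical.

```python
import math


def percent_change_in_hazard(hr: float) -> float:
    """An HR of 1.73 corresponds to a 73% higher hazard than the reference group."""
    return (hr - 1.0) * 100.0


def log_hr_standard_error(ci_lower: float, ci_upper: float, z: float = 1.96) -> float:
    """Approximate SE of log(HR), assuming a Wald interval symmetric on the log scale."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)


increase = percent_change_in_hazard(1.73)   # about 73
se = log_hr_standard_error(1.21, 2.49)      # about 0.18
```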

We analyzed a subset of 406 patients who received a psychiatrist's assessment to determine if an abnormal RASS was associated with 6‐month mortality regardless of delirium status using Cox proportional hazard regression weighted by the inverse of the propensity score. Patients with an abnormal RASS and no delirium had significantly higher mortality than those with a normal RASS (HR: 2.20, 95% CI: 1.10‐4.41). Patients with an abnormal RASS with delirium also had an increased risk for 6‐month mortality (HR: 2.86, 95% CI: 1.29‐6.34).

All regression models were internally validated. There was no evidence of substantial overfitting or collinearity. The Schoenfeld residuals for each model were examined graphically and there was good model fit overall, and no significant outliers were observed.

DISCUSSION

Vital sign measurements are a fundamental component of patient care, and abnormalities can serve as an early warning signal of the patient's clinical deterioration. However, traditional vital signs do not include an assessment of the patient's brain function. Our chief finding is that impaired arousal at initial presentation, as determined by the nonphysician research staff, increased the risk of 6‐month mortality by 73% after adjusting for confounders in a diverse group of acutely ill older patients. This relationship existed regardless of severity of illness, administration of psychoactive medications, and admission status. Though impaired arousal is closely linked with delirium,[2, 24] which is another well‐known predictor of mortality,[11, 25, 26] the prognostic significance of impaired arousal appeared to extend beyond delirium. We observed that the relationship between 6‐month mortality and impaired arousal in the absence of delirium was remarkably similar to that observed with impaired arousal with delirium. Arousal can be assessed by simply observing the patient during routine clinical care, and the assessment can be performed by nonphysician and physician healthcare providers. Its assessment should be performed and communicated in conjunction with traditional vital sign measurements in the emergency department and inpatient settings.[1]

Most of the data linking impaired arousal to death have been collected in the ICU. Coma, which represents the most severe form of depressed arousal, has been shown to increase the likelihood of death regardless of underlying etiology.[27, 28, 29, 30, 31] This includes patients who have impaired arousal because they received sedative medications during mechanical ventilation.[32] Few studies have investigated the effect of impaired arousal in a non‐ICU patient population. Zuliani et al. observed that impaired arousal was associated with 30‐day mortality, but their study was conducted in 469 older stroke patients, limiting the study's external validity to a more general patient population.[33] Our data advance the current state of knowledge; we observed a similar relationship between impaired arousal and 6‐month mortality in a much broader clinical population who were predominantly not critically ill regardless of delirium status. Additionally, most of our impaired arousal cohort had a RASS of −1, indicating that even subtle abnormalities portended adverse outcomes.

In addition to long‐term prognosis, the presence of impaired arousal has immediate clinical implications. Using arousal scales like the RASS can serve as a way for healthcare providers to succinctly communicate the patient's mental status in a standardized manner during transitions of care (eg, emergency physician to inpatient team). Regardless of which clinical setting they are in, older acutely ill patients with an impaired arousal may also require close monitoring, especially if the impairment is acute. Because of its close relationship with delirium, these patients likely have an underlying acute medical illness that precipitated their impaired arousal.

Understanding the true clinical significance of impaired arousal in the absence of delirium requires further study. Because of the fluctuating nature of delirium, it is possible that these patients may have initially been delirious and then became nondelirious during the psychiatrist's evaluation. Conversely, it is also possible that these patients may have eventually transitioned into delirium at a later point in time; the presence of impaired arousal alone may be a precursor to delirium. Last, these patients may have had subsyndromal delirium, which is defined as having 1 or more delirium symptoms without ever meeting full DSM‐IV‐TR criteria for delirium.[34] Patients with subsyndromal delirium have poorer outcomes, such as prolonged hospitalizations, and higher mortality than patients without delirium symptoms.[34]

Additional studies are also needed to further clarify the impact of impaired arousal on nonmortality outcomes such as functional and cognitive decline. The prognostic significance of serial arousal measurements also requires further study. It is possible that patients whose impaired arousal rapidly resolves after an intervention may have better prognoses than those who have persistent impairment. The measurement of arousal may have additional clinical applications in disease prognosis models. The presence of altered mental status is incorporated in various disease‐specific risk scores such as the CURB‐65 or Pneumonia Severity Index for pneumonia,[35, 36] and the Pulmonary Embolism Severity Index for pulmonary embolism.[37] However, the definition of altered mental status is highly variable; it ranges from subjective impressions that can be unreliable to formal cognitive testing, which can be time consuming. Arousal scales such as the RASS may allow for more feasible, reliable, and standardized assessment of mental status. Future studies should investigate if incorporating the RASS would improve the discrimination of these disease‐severity indices.

This study has several notable limitations. We excluded patients with a RASS of −4 and −5, which represented comatose patients. This exclusion, however, likely biased our findings toward the null. We enrolled a convenience sample that may have introduced selection bias. However, our enrolled cohort was similar to all potentially eligible patients who presented to the emergency department during the study period. We also attempted to mitigate this selection bias by using multivariable regression and adjusting for factors that may have confounded the relationship between RASS and 6‐month mortality. This study was performed at a single, urban, academic hospital and enrolled patients who were aged 65 years and older. Our findings may not be generalizable to other settings and to those who are under 65 years of age. Because only 406 patients received a psychiatric evaluation, the number of covariates that could be incorporated into the multivariable model evaluating whether impaired arousal in the absence of delirium is associated with 6‐month mortality was limited. To minimize residual confounding, we used the inverse weighted propensity score, but we acknowledge that this bias may still exist. Larger studies are needed to clarify the relationships between arousal, delirium, and mortality.

CONCLUSION

In conclusion, impaired arousal at initial presentation is an independent predictor of 6‐month mortality in a diverse group of acutely ill older patients, and this risk appears to be present even in the absence of delirium. Because of its ease of use and prognostic significance, arousal assessment may be a useful vital sign for underlying brain dysfunction. Standardized assessment and communication of arousal during routine clinical care may be warranted.

Disclosures: Research reported in this publication was supported by the Vanderbilt Physician Scientist Development Award, Emergency Medicine Foundation, and National Institute on Aging of the National Institutes of Health under award number K23AG032355. This study was also supported by the National Center for Research Resources, grant UL1 RR024975‐01, and is now at the National Center for Advancing Translational Sciences, grant 2 UL1 TR000445‐06. Dr. Vasilevskis was supported in part by the National Institute on Aging of the National Institutes of Health under award number K23AG040157. Dr. Powers was supported by Health Resources and Services Administration Geriatric Education Centers, grant 1D31HP08823‐01‐00. Dr. Storrow was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number K12HL1090 and the National Center for Advancing Translational Sciences under award number UL1TR000445. Dr. Ely was supported in part by the National Institute on Aging of the National Institutes of Health under award numbers R01AG027472 and R01AG035117, and a Veteran Affairs MERIT award. Drs. Vasilevskis, Schnelle, Dittus, Powers, and Ely were supported by the Veteran Affairs Geriatric Research, Education, and Clinical Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of Vanderbilt University, Emergency Medicine Foundation, National Institutes of Health, and Veterans Affairs. The funding agencies did not have any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

J.H.H., E.W.E., J.F.S., A.B.S., and R.D.S. conceived the trial. J.H.H., E.W.E., A.B.S., J.F.S., R.D.S., A.S., and A.W. participated in the study design. J.H.H. and A.W. recruited patients and collected the data. J.H.H., A.J.G., and A.S. analyzed the data. All authors participated in the interpretation of results. J.H.H. drafted the manuscript, and all authors contributed to the critical review and revision of the manuscript.

The authors report no conflicts of interest.

References
  1. Flaherty JH, Shay K, Weir C, et al. The development of a mental status vital sign for use across the spectrum of care. J Am Med Dir Assoc. 2009;10:379-380.
  2. Chester JG, Harrington MB, Rudolph JL; VA Delirium Working Group. Serial administration of a modified Richmond Agitation and Sedation Scale for delirium screening. J Hosp Med. 2012;7:450-453.
  3. Inouye SK, van Dyck CH, Alessi CA, Balkin S, Siegal AP, Horwitz RI. Clarifying confusion: the confusion assessment method. A new method for detection of delirium. Ann Intern Med. 1990;113:941-948.
  4. Ely EW, Inouye SK, Bernard GR, et al. Delirium in mechanically ventilated patients: validity and reliability of the confusion assessment method for the intensive care unit (CAM‐ICU). JAMA. 2001;286:2703-2710.
  5. Han JH, Wilson A, Vasilevskis EE, et al. Diagnosing delirium in older emergency department patients: validity and reliability of the Delirium Triage Screen and the Brief Confusion Assessment Method. Ann Emerg Med. 2013;62:457-465.
  6. Inouye SK, Rushing JT, Foreman MD, Palmer RM, Pompei P. Does delirium contribute to poor hospital outcomes? A three‐site epidemiologic study. J Gen Intern Med. 1998;13:234-242.
  7. Pitkala KH, Laurila JV, Strandberg TE, Tilvis RS. Prognostic significance of delirium in frail older people. Dement Geriatr Cogn Disord. 2005;19:158-163.
  8. Witlox J, Eurelings LS, de Jonghe JF, Kalisvaart KJ, Eikelenboom P, van Gool WA. Delirium in elderly patients and the risk of postdischarge mortality, institutionalization, and dementia: a meta‐analysis. JAMA. 2010;304:443-451.
  9. Leslie DL, Marcantonio ER, Zhang Y, Leo‐Summers L, Inouye SK. One‐year health care costs associated with delirium in the elderly population. Arch Intern Med. 2008;168:27-32.
  10. Sessler CN, Gosnell MS, Grap MJ, et al. The Richmond Agitation‐Sedation Scale: validity and reliability in adult intensive care unit patients. Am J Respir Crit Care Med. 2002;166:1338-1344.
  11. Han JH, Shintani A, Eden S, et al. Delirium in the emergency department: an independent predictor of death within 6 months. Ann Emerg Med. 2010;56:244-252.
  12. Han JH, Eden S, Shintani A, et al. Delirium in older emergency department patients is an independent predictor of hospital length of stay. Acad Emerg Med. 2011;18:451-457.
  13. Han JH, Wilson A, Graves AJ, et al. Validation of the Confusion Assessment Method for the Intensive Care Unit in older emergency department patients. Acad Emerg Med. 2014;21:180-187.
  14. Ely EW, Truman B, Shintani A, et al. Monitoring sedation status over time in ICU patients: reliability and validity of the Richmond Agitation‐Sedation Scale (RASS). JAMA. 2003;289:2983-2991.
  15. Holsinger T, Deveau J, Boustani M, Williams JW. Does this patient have dementia? JAMA. 2007;297:2391-2404.
  16. Jorm AF. A short form of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): development and cross‐validation. Psychol Med. 1994;24:145-153.
  17. Katz S. Assessing self‐maintenance: activities of daily living, mobility, and instrumental activities of daily living. J Am Geriatr Soc. 1983;31:721-727.
  18. Murray SB, Bates DW, Ngo L, Ufberg JW, Shapiro NI. Charlson Index is associated with one‐year mortality in emergency department patients with suspected infection. Acad Emerg Med. 2006;13:530-536.
  19. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818-829.
  20. American Psychiatric Association. Task Force on DSM‐IV. Diagnostic and Statistical Manual of Mental Disorders: DSM‐IV‐TR. 4th ed. Washington, DC: American Psychiatric Association; 2000.
  21. Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York, NY: Springer; 2001.
  22. Marshall SW. Power for tests of interaction: effect of raising the Type I error rate. Epidemiol Perspect Innov. 2007;4:4.
  23. Austin PC. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behav Res. 2011;46:399-424.
  24. Meagher DJ, Maclullich AM, Laurila JV. Defining delirium for the International Classification of Diseases, 11th Revision. J Psychosom Res. 2008;65:207-214.
  25. McCusker J, Cole M, Abrahamowicz M, Primeau F, Belzile E. Delirium predicts 12‐month mortality. Arch Intern Med. 2002;162:457-463.
  26. Ely EW, Shintani A, Truman B, et al. Delirium as a predictor of mortality in mechanically ventilated patients in the intensive care unit. JAMA. 2004;291:1753-1762.
  27. Teres D, Brown RB, Lemeshow S. Predicting mortality of intensive care unit patients. The importance of coma. Crit Care Med. 1982;10:86-95.
  28. Jennett B, Bond M. Assessment of outcome after severe brain damage. Lancet. 1975;1:480-484.
  29. Levy DE, Caronna JJ, Singer BH, Lapinski RH, Frydman H, Plum F. Predicting outcome from hypoxic‐ischemic coma. JAMA. 1985;253:1420-1426.
  30. Tuhrim S, Dambrosia JM, Price TR, et al. Prediction of intracerebral hemorrhage survival. Ann Neurol. 1988;24:258-263.
  31. Booth CM, Boone RH, Tomlinson G, Detsky AS. Is this patient dead, vegetative, or severely neurologically impaired? Assessing outcome for comatose survivors of cardiac arrest. JAMA. 2004;291:870-879.
  32. Shehabi Y, Bellomo R, Reade MC, et al. Early intensive care sedation predicts long‐term mortality in ventilated critically ill patients. Am J Respir Crit Care Med. 2012;186:724-731.
  33. Zuliani G, Cherubini A, Ranzini M, Ruggiero C, Atti AR, Fellin R. Risk factors for short‐term mortality in older subjects with acute ischemic stroke. Gerontology. 2006;52:231-236.
  34. Cole M, McCusker J, Dendukuri N, Han L. The prognostic significance of subsyndromal delirium in elderly medical inpatients. J Am Geriatr Soc. 2003;51:754-760.
  35. Lim WS, van der Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58:377-382.
  36. Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336:243-250.
  37. Aujesky D, Obrosky DS, Stone RA, et al. Derivation and validation of a prognostic model for pulmonary embolism. Am J Respir Crit Care Med. 2005;172:1041-1046.
Issue
Journal of Hospital Medicine - 9(12)
Page Number
772-778

Arousal is defined as the patient's overall level of responsiveness to the environment. Its assessment is standard of care in most intensive care units (ICUs) to monitor depth of sedation and underlying brain dysfunction. There has been recent interest in expanding the role of arousal assessment beyond the ICU. Specifically, the Veterans Affairs Delirium Working Group proposed that simple arousal assessment be a vital sign to quantify underlying brain dysfunction.[1] The rationale is that impaired arousal is closely linked with delirium,[2] and is an integral component of multiple delirium assessments.[3, 4, 5] Chester et al. observed that the presence of impaired arousal was 64% sensitive and 93% specific for delirium diagnosed by a psychiatrist.[2] Delirium is an under‐recognized public health problem that affects up to 25% of older hospitalized patients,[6, 7] is associated with a multitude of adverse outcomes such as death and accelerated cognitive decline,[8] and costs the US healthcare system an excess of $152 billion.[9]

Most delirium assessments require the patient to undergo additional cognitive testing. The assessment of arousal, however, requires the rater to merely observe the patient during routine clinical care and can be easily integrated into the clinical workflow.[10] Because of its simplicity and brevity, assessing arousal alone using validated scales such as the Richmond Agitation‐Sedation Scale (RASS) may be a more appealing alternative to traditional, more complex delirium screening in the acute care setting. Its clinical utility would be further strengthened if impaired arousal was also associated with mortality, and conferred risk even in the absence of delirium. As a result, we sought to determine if impaired arousal at initial presentation in older acutely ill patients predicted 6‐month mortality and whether this relationship was present in the absence of delirium.

METHODS

Design Overview

We performed a planned secondary analysis of 2 prospective cohorts that enrolled patients from May 2007 to August 2008 between 8 am and 10 pm during the weekdays, and July 2009 to February 2012 between 8 am and 4 pm during the weekdays. The first cohort was designed to evaluate the relationship between delirium and patient outcomes.[11, 12] The second cohort was used to validate brief delirium assessments using a psychiatrist's assessment as the reference standard.[5, 13] The local institutional review board approved these studies.

Setting and Participants

These studies were conducted in an urban emergency department located within an academic, tertiary care hospital with over 57,000 visits annually. Patients were included if they were 65 years or older and in the emergency department for <12 hours at the time of enrollment. The 12‐hour cutoff was used to include patients who presented to the emergency department in the evening and early morning hours. Patients were excluded if they were previously enrolled, non‐English speaking, comatose, or were nonverbal and unable to follow simple commands prior to the acute illness. Because the July 2009 to February 2012 cohort was designed to validate delirium assessments with auditory and visual components, patients were also excluded if they were deaf or blind.

Measurement of Arousal

The RASS is an arousal scale commonly used in ICUs to assess depth of sedation and ranges from −5 (unarousable) to +4 (combative); 0 represents normal arousal.[10, 14] The RASS simply requires the rater to observe the patient during their routine interactions and does not require any additional cognitive testing. The RASS term "sedation" was modified to "drowsy" (Table 1), because we wanted to capture impaired arousal regardless of sedation administration. We did not use the modified RASS (mRASS) proposed by the Veterans Affairs Delirium Working Group, because it was published after data collection began.[1] The mRASS is very similar to the RASS, except it also incorporates a very informal inattention assessment. The RASS was ascertained by research assistants who were college students and graduates, and emergency medical technician basics and paramedics. The principal investigator gave them a 5‐minute didactic lecture about the RASS and observed them perform the RASS in at least 5 patients prior to the start of the study. Inter‐rater reliability between trained research assistants and a physician was assessed for 456 (42.0%) patients of the study sample. The weighted kappa of the RASS was 0.61, indicating very good inter‐rater reliability. Because 81.7% of patients with impaired arousal had a RASS of −1, the RASS was dichotomized as normal (RASS = 0) or impaired (RASS other than 0).

Table 1. Richmond Agitation‐Sedation Scale
Score | Term | Description
+4 | Combative | Overtly combative, violent, immediate danger to staff
+3 | Very agitated | Pulls or removes tube(s) or catheter(s), aggressive
+2 | Agitated | Frequent nonpurposeful movement
+1 | Restless | Anxious but movements not aggressive or vigorous
0 | Alert and calm |
−1 | Slightly drowsy | Not fully alert, but has sustained awakening (eye opening/eye contact) to voice (>10 seconds)
−2 | Moderately drowsy | Briefly awakens with eye contact to voice (<10 seconds)
−3 | Very drowsy | Movement or eye opening to voice (but no eye contact)
−4 | Awakens to pain only | No response to voice, but movement or eye opening to physical stimulation
−5 | Unarousable | No response to voice or physical stimulation

NOTE: The Richmond Agitation‐Sedation Scale (RASS) is a brief (<10 seconds) arousal scale that was developed by Sessler et al.[10] The RASS is traditionally used in the intensive care unit to monitor depth of sedation. The terms were modified to better reflect the patient's level of arousal rather than sedation. A RASS of 0 indicates normal level of arousal (awake and alert), whereas a RASS <0 indicates decreased arousal, and a RASS >0 indicates increased arousal.
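
As a minimal sketch of the dichotomization described above, the following Python snippet maps a RASS score (valid range −5 to +4) to "normal" or "impaired". The function name and the example scores are hypothetical and are not study data; only the rule itself (0 is normal, any other score is impaired) comes from the text.

```python
def dichotomize_rass(score: int) -> str:
    """Return 'normal' for a RASS of 0 and 'impaired' for any other valid score."""
    if not -5 <= score <= 4:
        raise ValueError(f"RASS must be between -5 and +4, got {score}")
    return "normal" if score == 0 else "impaired"


# Hypothetical scores for illustration only (not study data).
example_scores = [0, -1, 0, 1, -2, 0, -1]
labels = [dichotomize_rass(s) for s in example_scores]
n_impaired = labels.count("impaired")
print(labels, n_impaired)
```

Collapsing the scale this way discards the direction of the impairment (decreased versus increased arousal), which is a reasonable trade-off when most impaired patients fall in a single category.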

Death Ascertainment

Death within 6 months was ascertained using the following algorithm: (1) The electronic medical record was searched to determine the patient's death status. (2) Patients who had a documented emergency department visit, outpatient clinic visit, or hospitalization after 6 months were considered to be alive at 6 months. (3) For the remaining patients, date of death was searched in the Social Security Death Index (SSDI). (4) Patients without a death recorded in the SSDI 1 year after the index visit were considered to be alive at 6 months. Nine hundred thirty‐one (85.9%) of 1084 patients had a recorded death in the medical record or SSDI, or had an emergency department or hospital visit documented in their record 6 months after the index visit.
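
The four-step ascertainment algorithm can be sketched as a small decision function. This is an illustration only: the `PatientRecord` fields, the 183-day definition of 6 months, and the function name are assumptions introduced here and do not appear in the study.

```python
from dataclasses import dataclass
from typing import Optional

SIX_MONTHS_DAYS = 183  # assumption: 6 months approximated as 183 days


@dataclass
class PatientRecord:
    """Hypothetical record; all values are days since the index visit."""
    emr_death_day: Optional[int] = None     # death documented in the medical record
    last_contact_day: Optional[int] = None  # latest ED, clinic, or hospital encounter
    ssdi_death_day: Optional[int] = None    # death found in the SSDI


def died_within_6_months(rec: PatientRecord) -> bool:
    # Step 1: death status from the electronic medical record.
    if rec.emr_death_day is not None:
        return rec.emr_death_day <= SIX_MONTHS_DAYS
    # Step 2: any documented encounter after 6 months means alive at 6 months.
    if rec.last_contact_day is not None and rec.last_contact_day > SIX_MONTHS_DAYS:
        return False
    # Step 3: otherwise look for a date of death in the SSDI.
    if rec.ssdi_death_day is not None:
        return rec.ssdi_death_day <= SIX_MONTHS_DAYS
    # Step 4: no SSDI death on record, so the patient is considered alive.
    return False
```

The ordering matters: the SSDI is consulted only when the medical record cannot settle the question, which mirrors the "remaining patients" wording in step 3.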

Additional Variables Collected

Patients were considered to have dementia if they had (1) documented dementia in the medical record, (2) a short form Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) score greater than 3.38,[15] or (3) been prescribed cholinesterase inhibitors prior to admission. The short form IQCODE is an informant questionnaire with 16 items; a cutoff of 3.38 out of 5.00 is 79% sensitive and 82% specific for dementia.[16] Premorbid functional status was determined by the Katz Activities of Daily Living (Katz ADL) and ranges from 0 (completely dependent) to 6 (completely independent).[17] Patients with a score <5 were considered to be functionally dependent. Both the IQCODE and Katz ADL were prospectively collected in the emergency department at the time of enrollment.
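
The two covariate definitions above reduce to simple threshold rules, sketched below. The function and parameter names are hypothetical; the cutoffs (IQCODE greater than 3.38, Katz ADL below 5) are the ones stated in the text.

```python
from typing import Optional

IQCODE_CUTOFF = 3.38       # short form IQCODE; scores above this suggest dementia
KATZ_DEPENDENT_BELOW = 5   # Katz ADL ranges from 0 to 6; below 5 is dependent


def has_dementia(documented: bool,
                 iqcode: Optional[float],
                 on_cholinesterase_inhibitor: bool) -> bool:
    """Any one of the three criteria is sufficient."""
    iqcode_positive = iqcode is not None and iqcode > IQCODE_CUTOFF
    return documented or iqcode_positive or on_cholinesterase_inhibitor


def is_functionally_dependent(katz_adl: int) -> bool:
    if not 0 <= katz_adl <= 6:
        raise ValueError(f"Katz ADL must be between 0 and 6, got {katz_adl}")
    return katz_adl < KATZ_DEPENDENT_BELOW
```

Note that a missing IQCODE is treated here as "criterion not met" rather than as missing data, which is an assumption of this sketch.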

The Charlson Comorbidity Index was used to measure comorbid burden.[18] The Acute Physiology Score (APS) of the Acute Physiology and Chronic Health Evaluation II score was used to quantify severity of illness.[19] The Glasgow Coma Scale was not included in the APS because it was not collected. Intravenous, intramuscular, and oral benzodiazepine and opioids given in the prehospital and emergency department were also recorded. The Charlson Comorbidity Index, APS, and benzodiazepine and opioid administration were collected after patient enrollment using the electronic medical record.

Within 3 hours of the RASS, a subset of 406 patients was evaluated by a consultation‐liaison psychiatrist who determined the patient's delirium status using Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM‐IV‐TR) criteria.[20] Details of their comprehensive assessments have been described in a previous report.[5]

Statistical Analysis

Measures of central tendency and dispersion for continuous variables were reported as medians and interquartile ranges. Categorical variables were reported as proportions. For simple comparisons, Wilcoxon rank sum tests were performed for continuous data, and χ² analyses or Fisher exact tests were performed for categorical data. To evaluate the predictive validity of impaired arousal on 6‐month mortality, the cumulative probability of survival was estimated within 6 months from the study enrollment date using the Kaplan‐Meier method. Cox proportional hazards regression was performed to assess if impaired arousal was independently associated with 6‐month mortality after adjusting for age, gender, nonwhite race, comorbidity burden (Charlson Comorbidity Index), severity of illness (APS), dementia, functional dependence (Katz ADL <5), nursing home residence, admission status, and benzodiazepine or opioid medication administration. Patients were censored at the end of 6 months. The selection of covariates was based upon expert opinion and literature review. The number of covariates used for the model was limited by the number of events to minimize overfitting; 1 df was allowed for every 10 to 15 events.[21] Because severity of illness, psychoactive medication administration, and admission status might modify the relationship between 6‐month mortality and impaired arousal, 2‐way interaction terms were incorporated. To maintain parsimony and minimize overfitting and collinearity, nonsignificant interaction terms (P>0.20) were removed in the final model.[22] Hazard ratios (HR) with their 95% confidence interval (95% CI) were reported.
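
To illustrate the Kaplan‐Meier method mentioned above, here is a minimal pure-Python product-limit estimator. It is a sketch for intuition, not the SAS or R code used in the study; the function name and the toy follow-up data are hypothetical, and 183 days is an assumed stand-in for the 6-month censoring point.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  -- follow-up in days from enrollment
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # each patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy data (not study data): two deaths, three patients censored.
toy_times = [30, 60, 60, 183, 183]
toy_events = [1, 1, 0, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
```

Censored patients still count in the denominator up to the time they leave, which is what distinguishes this estimate from a simple proportion of deaths.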

To determine if arousal was associated with 6‐month mortality in the absence of delirium, we performed another Cox proportional hazard regression in a subset of 406 patients who received a psychiatrist assessment. Six‐month mortality was the dependent variable, and the independent variable was a 3‐level categorical variable of different arousal/delirium combinations: (1) impaired arousal/delirium positive, (2) impaired arousal/delirium negative, and (3) normal arousal (with or without delirium). Because there were only 8 patients who had normal arousal with delirium, this group was collapsed into the normal arousal without delirium group. Because there were 55 deaths, the number of covariates that could be entered into the Cox proportional hazard regression model was limited. We used the inverse weighted propensity score method to help minimize residual confounding.[23] Traditional propensity score adjustment could not be performed because there were 3 arousal/delirium categories. Similar to propensity score adjustment, the inverse weighted propensity score method was used to help balance the distribution of patient characteristics among the exposure groups and also allow adjustment for multiple confounders while minimizing the degrees of freedom expended. A propensity score was the probability of having a particular arousal/delirium category based upon baseline patient characteristics. Multinomial logistic regression was performed to calculate the propensity score, and the baseline covariates used were age, gender, nonwhite race, comorbidity burden, severity of illness, dementia, functional dependence, and nursing home residence. For the Cox proportional hazard regression model, each observation was weighted by the inverse of the propensity score for their given arousal/delirium category; propensity scores exceeding the 95th percentile were trimmed to avoid overly influential weighting. Benzodiazepine and opioid medications given in the emergency department and admission status were adjusted as covariates in the weighted Cox proportional hazard regression model.
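
The weighting step can be sketched as follows. This is one possible reading, stated as an assumption: each weight is the inverse of the estimated probability of the patient's observed category, and weights above the 95th percentile of all weights are capped at that percentile. The text does not specify the exact trimming rule or percentile definition; the nearest-rank percentile, the function names, and the toy probabilities below are all hypothetical.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def inverse_probability_weights(propensities, observed, cap_pct=95):
    """Weight each patient by 1 / P(observed category), capped at a percentile.

    propensities -- list of dicts mapping category -> estimated probability
    observed     -- list of the category each patient actually had
    """
    raw = [1.0 / probs[cat] for probs, cat in zip(propensities, observed)]
    cap = nearest_rank_percentile(raw, cap_pct)
    return [min(w, cap) for w in raw]


# Toy probabilities (not study data) for two hypothetical categories.
toy_propensities = [
    {"impaired": 0.5, "normal": 0.5},
    {"impaired": 0.8, "normal": 0.2},
    {"impaired": 0.1, "normal": 0.9},
    {"impaired": 0.25, "normal": 0.75},
]
toy_observed = ["impaired", "normal", "impaired", "impaired"]
toy_weights = inverse_probability_weights(toy_propensities, toy_observed, cap_pct=75)
print(toy_weights)
```

Patients who were unlikely to be in their observed category receive large weights, so capping limits how much any single unusual patient can drive the weighted model.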

Nineteen patients (1.8%) had missing Katz ADL; these missing values were imputed using multiple imputation. The reliability of the final regression models was internally validated using the bootstrap method.[21] Two thousand sets of bootstrap samples were generated by resampling the original data, and the optimism was estimated to determine the degree of overfitting.[21] An optimism value >0.85 indicated no evidence of substantial overfitting.[21] Variance inflation factors were used to check multicollinearity. Schoenfeld residuals were also analyzed to determine goodness‐of‐fit and assess for outliers. P values <0.05 were considered statistically significant. All statistical analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC) and open source R statistical software version 3.0.1 (http://www.r-project.org/).

RESULTS

A total of 1903 patients were screened, and 1084 patients met enrollment criteria (Figure 1). Of these, 1051 (97.0%) were non‐ICU patients. Patient characteristics of this cohort can be seen in Table 2. Enrolled patients and potentially eligible patients who presented to the emergency department during the enrollment window were similar in age, gender, and severity of illness, but enrolled patients were slightly more likely to have a chief complaint of chest pain and syncope (unpublished data).

Figure 1
Enrollment flow diagram. RASS, Richmond Agitation‐Sedation Scale. Patients who were non‐verbal or unable to follow simple commands prior to their acute illness were considered to have end‐stage dementia.
Table 2. Patient Characteristics
Variables | Normal Arousal, n=835 | Impaired Arousal, n=249 | P Value
Median age, y (IQR) | 74 (69-80) | 75 (70-83) | 0.005
Female gender | 459 (55.0%) | 132 (53.0%) | 0.586
Nonwhite race | 122 (14.6%) | 51 (20.5%) | 0.027
Residence | | | <0.001
  Home | 752 (90.1%) | 204 (81.9%) |
  Assisted living | 29 (3.5%) | 13 (5.2%) |
  Rehabilitation | 8 (1.0%) | 5 (2.0%) |
  Nursing home | 42 (5.0%) | 27 (10.8%) |
Dementia* | 175 (21.0%) | 119 (47.8%) | <0.001
Dependent | 120 (14.4%) | 99 (39.8%) | <0.001
Median Charlson (IQR) | 2 (1, 4) | 3 (2, 5) | <0.001
Median APS (IQR) | 2 (1, 4) | 2 (1, 5) | <0.001
Primary complaint | | | <0.001
  Abdominal pain | 45 (5.4%) | 13 (5.2%) |
  Altered mental status | 12 (1.4%) | 36 (14.5%) |
  Chest pain | 128 (15.3%) | 31 (12.5%) |
  Disturbances of sensation | 17 (2.0%) | 2 (0.8%) |
  Dizziness | 16 (1.9%) | 2 (0.8%) |
  Fever | 11 (1.3%) | 7 (2.8%) |
  General illness, malaise | 26 (3.1%) | 5 (2.0%) |
  General weakness | 68 (8.1%) | 29 (11.7%) |
  Nausea/vomiting | 29 (3.5%) | 4 (1.6%) |
  Shortness of breath | 85 (10.2%) | 21 (8.4%) |
  Syncope | 46 (5.5%) | 10 (4.0%) |
  Trauma, multiple organs | 19 (2.3%) | 8 (3.2%) |
  Other | 333 (39.9%) | 81 (32.5%) |
Benzodiazepines or opioid medications administration | 188 (22.5%) | 67 (26.9%) | 0.152
Admitted to the hospital | 478 (57.3%) | 191 (76.7%) | 0.002
  Internal medicine | 411 (86.0%) | 153 (80.1%) |
  Surgery | 38 (8.0%) | 21 (11.0%) |
  Neurology | 19 (4.0%) | 13 (6.8%) |
  Psychiatry | 1 (0.2%) | 2 (1.1%) |
  Unknown/missing | 9 (1.9%) | 2 (1.1%) |
Death within 6 months | 81 (9.7%) | 59 (23.7%) | <0.001

NOTE: Patient characteristics and demographics of enrolled patients. Continuous and ordinal variables are expressed in medians and interquartile ranges (IQR). Categorical variables are expressed in absolute numbers and percentages. *Patient was considered to have dementia if it was documented in the medical record, the patient was on home cholinesterase inhibitors, or had a short‐form Informant Questionnaire on Cognitive Decline in the Elderly >3.38. Patients with a Katz Activities of Daily Living of <5 were considered to be functionally dependent. There were 19 patients with missing Katz Activities of Daily Living scores. Charlson index is a weighted scale used to measure comorbidity burden. Higher scores indicate higher comorbidity burden. The Acute Physiology Score (APS) of the Acute Physiology and Chronic Health Evaluation II score was used to quantify severity of illness. Glasgow Coma Scale was not incorporated in this score. Higher scores indicate higher severity of illness.

Of those enrolled, 249 (23.0%) had an abnormal RASS at initial presentation, and their distribution can be seen in Figure 2. Within 6 months, patients with an abnormal RASS were more likely to die compared with patients with a RASS of 0 (23.7% vs 9.7%, P<0.001). The Kaplan‐Meier survival curves for all enrolled patients with impaired and normal RASS can be seen in Figure 3; the survival curve declined more slowly in patients with a normal RASS compared with those with an abnormal RASS.
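
The crude proportions above can be checked directly from the counts reported in Table 2. The snippet below is a worked arithmetic sketch; the unadjusted risk ratio it derives is not reported in the study and should not be confused with the adjusted hazard ratio from the Cox model.

```python
# Counts taken from Table 2.
deaths_impaired, n_impaired = 59, 249
deaths_normal, n_normal = 81, 835

risk_impaired = deaths_impaired / n_impaired  # 6-month mortality, impaired arousal
risk_normal = deaths_normal / n_normal        # 6-month mortality, normal arousal
prevalence_impaired = n_impaired / (n_impaired + n_normal)

# Derived here for illustration only; not a figure reported by the authors.
crude_risk_ratio = risk_impaired / risk_normal

print(f"{prevalence_impaired:.1%}")  # 23.0%
print(f"{risk_impaired:.1%}")        # 23.7%
print(f"{risk_normal:.1%}")          # 9.7%
print(f"{crude_risk_ratio:.2f}")
```

The crude ratio is larger than the adjusted estimate would be expected to be, because patients with impaired arousal were also older, more often demented, and more functionally dependent.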

Figure 2
Richmond Agitation‐Sedation Scale (RASS) distribution among enrolled patients. Distribution of RASS at initial presentation among 1084 acutely ill older patients, and of these, 1051 patients (97.0%) were non–intensive care unit patients. The RASS is a widely used arousal scale that can be performed during routine clinical care and takes <10 seconds to perform. A RASS of 0 indicates normal level of arousal (awake and alert), whereas a RASS of <0 indicates decreased arousal and a RASS of >0 indicates increased arousal.
Figure 3
Kaplan‐Meier survival curves in acutely ill older patients with normal and impaired arousal at initial presentation over a 6‐month period. Arousal was assessed using the Richmond Agitation‐Sedation Scale (RASS). Patients with impaired arousal were more likely to die compared to patients with normal arousal (23.7% vs 9.7%) within 6 months. Using Cox proportional hazard regression, patients with an abnormal RASS were 73% more likely to die within 6 months after adjusting for age, dementia, functional dependence, comorbidity burden, severity of illness, hearing impairment, nursing home residence, admission status, and administration of benzodiazepine/opioid medications. Severity of illness (P = 0.52), benzodiazepine/opioid medication administration (P = 0.38), and admission status (P = 0.57) did not modify the relationship between impaired arousal and 6‐month mortality. Abbreviations: CI, confidence interval.

Using Cox proportional hazards regression, the relationship between an abnormal RASS at initial presentation and 6‐month mortality persisted (HR: 1.73, 95% CI: 1.21‐2.49) after adjusting for age, sex, nonwhite race, comorbidity burden, severity of illness, dementia, functional dependence, nursing home residence, psychoactive medications given, and admission status. The interaction between an abnormal RASS and APS (severity of illness) had a P value of 0.52. The interaction between an abnormal RASS and benzodiazepine or opioid medication administration had a P value of 0.38. The interaction between an abnormal RASS and admission status had a P value of 0.57. This indicated that severity of illness, psychoactive medication administration, and admission status did not modify the relationship between an abnormal RASS and 6‐month mortality.

We analyzed a subset of 406 patients who received a psychiatrist's assessment to determine if an abnormal RASS was associated with 6‐month mortality regardless of delirium status using Cox proportional hazard regression weighted by the inverse of the propensity score. Patients with an abnormal RASS and no delirium had significantly higher mortality than those with a normal RASS (HR: 2.20, 95% CI: 1.10‐4.41). Patients with an abnormal RASS with delirium also had an increased risk for 6‐month mortality (HR: 2.86, 95% CI: 1.29‐6.34).

All regression models were internally validated. There was no evidence of substantial overfitting or collinearity. The Schoenfeld residuals for each model were examined graphically and there was good model fit overall, and no significant outliers were observed.

DISCUSSION

Vital sign measurements are a fundamental component of patient care, and abnormalities can serve as an early warning signal of the patient's clinical deterioration. However, traditional vital signs do not include an assessment of the patient's brain function. Our chief finding is that impaired arousal at initial presentation, as determined by the nonphysician research staff, was associated with a 73% higher hazard of 6‐month mortality after adjusting for confounders in a diverse group of acutely ill older patients. This relationship existed regardless of severity of illness, administration of psychoactive medications, and admission status. Though impaired arousal is closely linked with delirium,[2, 24] which is another well‐known predictor of mortality,[11, 25, 26] the prognostic significance of impaired arousal appeared to extend beyond delirium. We observed that the relationship between 6‐month mortality and impaired arousal in the absence of delirium was remarkably similar to that observed with impaired arousal with delirium. Arousal can be assessed by simply observing the patient during routine clinical care and can be performed by nonphysician and physician healthcare providers. Its assessment should be performed and communicated in conjunction with traditional vital sign measurements in the emergency department and inpatient settings.[1]

Most of the data linking impaired arousal to death have been collected in the ICU. Coma, which represents the most severe form of depressed arousal, has been shown to increase the likelihood of death regardless of underlying etiology.[27, 28, 29, 30, 31] This includes patients who have impaired arousal because they received sedative medications during mechanical ventilation.[32] Few studies have investigated the effect of impaired arousal in a non‐ICU patient population. Zuliani et al. observed that impaired arousal was associated with 30‐day mortality, but their study was conducted in 469 older stroke patients, limiting the study's external validity to a more general patient population.[33] Our data advance the current state of knowledge; we observed a similar relationship between impaired arousal and 6‐month mortality in a much broader clinical population who were predominantly not critically ill regardless of delirium status. Additionally, most of our impaired arousal cohort had a RASS of −1, indicating that even subtle abnormalities portended adverse outcomes.

In addition to long‐term prognosis, the presence of impaired arousal has immediate clinical implications. Arousal scales like the RASS give healthcare providers a way to succinctly communicate the patient's mental status in a standardized manner during transitions of care (eg, emergency physician to inpatient team). Regardless of clinical setting, older acutely ill patients with impaired arousal may also require close monitoring, especially if the impairment is acute. Because impaired arousal is closely related to delirium, these patients likely have an underlying acute medical illness that precipitated their impaired arousal.

Understanding the true clinical significance of impaired arousal in the absence of delirium requires further study. Because of the fluctuating nature of delirium, it is possible that these patients may have initially been delirious and then became nondelirious during the psychiatrist's evaluation. Conversely, it is also possible that these patients may have eventually transitioned into delirium at a later point in time; the presence of impaired arousal alone may be a precursor to delirium. Last, these patients may have had subsyndromal delirium, which is defined as having 1 or more delirium symptoms without ever meeting full DSM‐IV‐TR criteria for delirium.[34] Patients with subsyndromal delirium have poorer outcomes, such as prolonged hospitalizations, and higher mortality than patients without delirium symptoms.[34]

Additional studies are also needed to further clarify the impact of impaired arousal on nonmortality outcomes such as functional and cognitive decline. The prognostic significance of serial arousal measurements also requires further study. It is possible that patients whose impaired arousal rapidly resolves after an intervention may have better prognoses than those who have persistent impairment. The measurement of arousal may have additional clinical applications in disease prognosis models. The presence of altered mental status is incorporated in various disease‐specific risk scores such as the CURB‐65 or Pneumonia Severity Index for pneumonia,[35, 36] and the Pulmonary Embolism Severity Index for pulmonary embolism.[37] However, the definition of altered mental status is highly variable; it ranges from subjective impressions that can be unreliable to formal cognitive testing, which can be time consuming. Arousal scales such as the RASS may allow for more feasible, reliable, and standardized assessment of mental status. Future studies should investigate if incorporating the RASS would improve the discrimination of these disease‐severity indices.

This study has several notable limitations. We excluded patients with a RASS of −4 and −5, which represented comatose patients. This exclusion, however, likely biased our findings toward the null. We enrolled a convenience sample that may have introduced selection bias. However, our enrolled cohort was similar to all potentially eligible patients who presented to the emergency department during the study period. We also attempted to mitigate this selection bias by using multivariable regression and adjusting for factors that may have confounded the relationship between RASS and 6‐month mortality. This study was performed at a single, urban, academic hospital and enrolled patients who were aged 65 years and older. Our findings may not be generalizable to other settings and to those who are under 65 years of age. Because only 406 patients received a psychiatric evaluation, the number of covariates that could be incorporated into the multivariable model evaluating whether impaired arousal in the absence of delirium is associated with 6‐month mortality was limited. To minimize residual confounding, we used the inverse weighted propensity score, but we acknowledge that this bias may still exist. Larger studies are needed to clarify the relationships between arousal, delirium, and mortality.

CONCLUSION

In conclusion, impaired arousal at initial presentation is an independent predictor for 6‐month mortality in a diverse group of acutely ill older patients, and this risk appears to be present even in the absence of delirium. Because of its ease of use and prognostic significance, it may be a useful vital sign for underlying brain dysfunction. Routine standardized assessment and communication of arousal during routine clinical care may be warranted.

Disclosures: Research reported in this publication was supported by the Vanderbilt Physician Scientist Development Award, Emergency Medicine Foundation, and National Institute on Aging of the National Institutes of Health under award number K23AG032355. This study was also supported by the National Center for Research Resources, grant UL1 RR024975‐01, and is now at the National Center for Advancing Translational Sciences, grant 2 UL1 TR000445‐06. Dr. Vasilevskis was supported in part by the National Institute on Aging of the National Institutes of Health under award number K23AG040157. Dr. Powers was supported by Health Resources and Services Administration Geriatric Education Centers, grant 1D31HP08823‐01‐00. Dr. Storrow was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number K12HL1090 and the National Center for Advancing Translational Sciences under award number UL1TR000445. Dr. Ely was supported in part by the National Institute on Aging of the National Institutes of Health under award numbers R01AG027472 and R01AG035117, and a Veteran Affairs MERIT award. Drs. Vasilevskis, Schnelle, Dittus, Powers, and Ely were supported by the Veteran Affairs Geriatric Research, Education, and Clinical Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of Vanderbilt University, Emergency Medicine Foundation, National Institutes of Health, and Veterans Affairs. The funding agencies did not have any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

J.H.H., E.W.E., J.F.S., A.B.S., and R.D.S. conceived the trial. J.H.H., E.W.E., A.B.S., J.F.S., R.D.S., A.S., and A.W. participated in the study design. J.H.H. and A.W. recruited patients and collected the data. J.H.H., A.J.G., and A.S. analyzed the data. All authors participated in the interpretation of results. J.H.H. drafted the manuscript, and all authors contributed to the critical review and revision of the manuscript.

The authors report no conflicts of interest.

Arousal is defined as the patient's overall level of responsiveness to the environment. Its assessment is standard of care in most intensive care units (ICUs) to monitor depth of sedation and underlying brain dysfunction. There has been recent interest in expanding the role of arousal assessment beyond the ICU. Specifically, the Veterans Affairs Delirium Working Group proposed that simple arousal assessment be a vital sign to quantify underlying brain dysfunction.[1] The rationale is that impaired arousal is closely linked with delirium,[2] and is an integral component of multiple delirium assessments.[3, 4, 5] Chester et al. observed that the presence of impaired arousal was 64% sensitive and 93% specific for delirium diagnosed by a psychiatrist.[2] Delirium is an under‐recognized public health problem that affects up to 25% of older hospitalized patients,[6, 7] is associated with a multitude of adverse outcomes such as death and accelerated cognitive decline,[8] and costs the US healthcare system in excess of $152 billion.[9]

Most delirium assessments require the patient to undergo additional cognitive testing. The assessment of arousal, however, requires the rater to merely observe the patient during routine clinical care and can be easily integrated into the clinical workflow.[10] Because of its simplicity and brevity, assessing arousal alone using validated scales such as the Richmond Agitation‐Sedation Scale (RASS) may be a more appealing alternative to traditional, more complex delirium screening in the acute care setting. Its clinical utility would be further strengthened if impaired arousal was also associated with mortality, and conferred risk even in the absence of delirium. As a result, we sought to determine if impaired arousal at initial presentation in older acutely ill patients predicted 6‐month mortality and whether this relationship was present in the absence of delirium.

METHODS

Design Overview

We performed a planned secondary analysis of 2 prospective cohorts that enrolled patients from May 2007 to August 2008 between 8 am and 10 pm during the weekdays, and July 2009 to February 2012 between 8 am and 4 pm during the weekdays. The first cohort was designed to evaluate the relationship between delirium and patient outcomes.[11, 12] The second cohort was used to validate brief delirium assessments using a psychiatrist's assessment as the reference standard.[5, 13] The local institutional review board approved these studies.

Setting and Participants

These studies were conducted in an urban emergency department located within an academic, tertiary care hospital with over 57,000 visits annually. Patients were included if they were 65 years or older and in the emergency department for <12 hours at the time of enrollment. The 12‐hour cutoff was used to include patients who presented to the emergency department in the evening and early morning hours. Patients were excluded if they were previously enrolled, non‐English speaking, comatose, or were nonverbal and unable to follow simple commands prior to the acute illness. Because the July 2009 to February 2012 cohort was designed to validate delirium assessments with auditory and visual components, patients were also excluded if they were deaf or blind.

Measurement of Arousal

The RASS is an arousal scale commonly used in ICUs to assess depth of sedation and ranges from −5 (unarousable) to +4 (combative); 0 represents normal arousal.[10, 14] The RASS simply requires the rater to observe the patient during routine interactions and does not require any additional cognitive testing. The RASS term "sedation" was modified to "drowsy" (Table 1), because we wanted to capture impaired arousal regardless of sedation administration. We did not use the modified RASS (mRASS) proposed by the Veterans Affairs Delirium Working Group, because it was published after data collection began.[1] The mRASS is very similar to the RASS, except that it also incorporates a very informal inattention assessment. The RASS was ascertained by research assistants who were college students and graduates, emergency medical technician basics, and paramedics. The principal investigator gave them a 5‐minute didactic lecture about the RASS and observed them perform the RASS in at least 5 patients prior to the start of the study. Inter‐rater reliability between trained research assistants and a physician was assessed for 456 (42.0%) patients of the study sample. The weighted kappa of the RASS was 0.61, indicating good inter‐rater reliability. Because 81.7% of patients with impaired arousal had a RASS of −1, the RASS was dichotomized as normal (RASS=0) or impaired (RASS other than 0).

Table 1
Richmond Agitation‐Sedation Scale

| Score | Term | Description |
|---|---|---|
| +4 | Combative | Overtly combative, violent, immediate danger to staff |
| +3 | Very agitated | Pulls or removes tube(s) or catheter(s), aggressive |
| +2 | Agitated | Frequent nonpurposeful movement |
| +1 | Restless | Anxious but movements not aggressive or vigorous |
| 0 | Alert and calm | |
| −1 | Slightly drowsy | Not fully alert, but has sustained awakening (eye opening/eye contact) to voice (>10 seconds) |
| −2 | Moderately drowsy | Briefly awakens with eye contact to voice (<10 seconds) |
| −3 | Very drowsy | Movement or eye opening to voice (but no eye contact) |
| −4 | Awakens to pain only | No response to voice, but movement or eye opening to physical stimulation |
| −5 | Unarousable | No response to voice or physical stimulation |

NOTE: The Richmond Agitation‐Sedation Scale (RASS) is a brief (<10 seconds) arousal scale that was developed by Sessler et al.[10] The RASS is traditionally used in the intensive care unit to monitor depth of sedation. The terms were modified to better reflect the patient's level of arousal rather than sedation. A RASS of 0 indicates normal level of arousal (awake and alert), whereas a RASS <0 indicates decreased arousal, and a RASS >0 indicates increased arousal.
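The dichotomization described above can be sketched in a few lines. This is a minimal illustration, not the study's actual code; the function name `classify_arousal` and the example scores are hypothetical.

```python
def classify_arousal(rass: int) -> str:
    """Dichotomize a RASS score: 0 is normal, any other value is impaired."""
    # The full scale runs from -5 (unarousable) to +4 (combative).
    if not -5 <= rass <= 4:
        raise ValueError(f"RASS must be between -5 and +4, got {rass}")
    return "normal" if rass == 0 else "impaired"


# Hypothetical scores for five patients (not study data).
scores = [0, -1, 2, 0, -3]
labels = [classify_arousal(s) for s in scores]
```

Both decreased arousal (RASS <0) and increased arousal (RASS >0) fall into the impaired group, which matches the "RASS other than 0" definition.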

Death Ascertainment

Death within 6 months was ascertained using the following algorithm: (1) The electronic medical record was searched to determine the patient's death status. (2) Patients who had a documented emergency department visit, outpatient clinic visit, or hospitalization after 6 months were considered to be alive at 6 months. (3) For the remaining patients, date of death was searched in the Social Security Death Index (SSDI). (4) Patients without a death recorded in the SSDI 1 year after the index visit were considered to be alive at 6 months. Nine hundred thirty‐one (85.9%) out of 1084 patients had a recorded death in the medical record or SSDI, or had an emergency department or hospital visit documented in their record 6 months after the index visit.
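The four‐step ascertainment algorithm can be expressed as a short decision function. The sketch below is illustrative only: the `FollowUp` record, its field names, and the 183‐day definition of 6 months are assumptions that do not come from the study.

```python
from dataclasses import dataclass
from typing import Optional

SIX_MONTHS_DAYS = 183  # assumed cutoff; the exact day count is not stated in the text


@dataclass
class FollowUp:
    """Hypothetical follow-up record; all values are days after the index visit."""
    emr_death_day: Optional[int] = None         # death documented in the medical record
    last_emr_contact_day: Optional[int] = None  # latest ED visit, clinic visit, or hospitalization
    ssdi_death_day: Optional[int] = None        # death found in the Social Security Death Index


def died_within_six_months(p: FollowUp) -> bool:
    # Step 1: a death documented in the electronic medical record.
    if p.emr_death_day is not None:
        return p.emr_death_day <= SIX_MONTHS_DAYS
    # Step 2: a documented encounter after 6 months means the patient was alive at 6 months.
    if p.last_emr_contact_day is not None and p.last_emr_contact_day > SIX_MONTHS_DAYS:
        return False
    # Step 3: otherwise, look for a death date in the SSDI.
    if p.ssdi_death_day is not None:
        return p.ssdi_death_day <= SIX_MONTHS_DAYS
    # Step 4: no SSDI death recorded, so the patient is considered alive at 6 months.
    return False
```

The order of the checks matters: the medical record is consulted first, and the SSDI is used only for patients whose status the record cannot settle.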

Additional Variables Collected

Patients were considered to have dementia if they had: (1) documented dementia in the medical record, (2) a short form Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) score greater than 3.38,[15] or (3) a prescription for cholinesterase inhibitors prior to admission. The short form IQCODE is an informant questionnaire with 16 items; a cutoff of 3.38 out of 5.00 is 79% sensitive and 82% specific for dementia.[16] Premorbid functional status was determined by the Katz Activities of Daily Living (Katz ADL), which ranges from 0 (completely dependent) to 6 (completely independent).[17] Patients with a score <5 were considered to be functionally dependent. Both the IQCODE and Katz ADL were prospectively collected in the emergency department at the time of enrollment.
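These two definitions reduce to simple threshold rules. The following sketch assumes hypothetical function and parameter names; only the cutoffs (IQCODE >3.38, Katz ADL <5) come from the text.

```python
from typing import Optional

IQCODE_CUTOFF = 3.38      # short form IQCODE threshold from the text
KATZ_DEPENDENT_BELOW = 5  # Katz ADL scores below 5 indicate functional dependence


def has_dementia(documented: bool,
                 iqcode: Optional[float],
                 on_cholinesterase_inhibitor: bool) -> bool:
    """Any one of the three criteria is sufficient."""
    iqcode_positive = iqcode is not None and iqcode > IQCODE_CUTOFF
    return documented or iqcode_positive or on_cholinesterase_inhibitor


def is_functionally_dependent(katz_adl: int) -> bool:
    """Katz ADL ranges from 0 (completely dependent) to 6 (completely independent)."""
    if not 0 <= katz_adl <= 6:
        raise ValueError(f"Katz ADL must be between 0 and 6, got {katz_adl}")
    return katz_adl < KATZ_DEPENDENT_BELOW
```

A missing IQCODE is treated here as not meeting criterion 2, which is an assumption of the sketch rather than a rule stated in the study.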

The Charlson Comorbidity Index was used to measure comorbid burden.[18] The Acute Physiology Score (APS) of the Acute Physiology and Chronic Health Evaluation II score was used to quantify severity of illness.[19] The Glasgow Coma Scale was not included in the APS because it was not collected. Intravenous, intramuscular, and oral benzodiazepine and opioids given in the prehospital and emergency department were also recorded. The Charlson Comorbidity Index, APS, and benzodiazepine and opioid administration were collected after patient enrollment using the electronic medical record.

Within 3 hours of the RASS, a subset of 406 patients was evaluated by a consultation‐liaison psychiatrist who determined the patient's delirium status using Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM‐IV‐TR) criteria.[20] Details of their comprehensive assessments have been described in a previous report.[5]

Statistical Analysis

Measures of central tendency and dispersion for continuous variables were reported as medians and interquartile ranges. Categorical variables were reported as proportions. For simple comparisons, Wilcoxon rank sum tests were performed for continuous data, and χ² analyses or Fisher exact tests were performed for categorical data. To evaluate the predictive validity of impaired arousal on 6‐month mortality, the cumulative probability of survival was estimated within 6 months from the study enrollment date using the Kaplan‐Meier method. Cox proportional hazards regression was performed to assess if impaired arousal was independently associated with 6‐month mortality after adjusting for age, gender, nonwhite race, comorbidity burden (Charlson Comorbidity Index), severity of illness (APS), dementia, functional dependence (Katz ADL <5), nursing home residence, admission status, and benzodiazepine or opioid medication administration. Patients were censored at the end of 6 months. The selection of covariates was based upon expert opinion and literature review. The number of covariates used for the model was limited by the number of events to minimize overfitting; 1 df was allowed for every 10 to 15 events.[21] Because severity of illness, psychoactive medication administration, and admission status might modify the relationship between 6‐month mortality and impaired arousal, 2‐way interaction terms were incorporated. To maintain parsimony and minimize overfitting and collinearity, nonsignificant interaction terms (P>0.20) were removed from the final model.[22] Hazard ratios (HR) with their 95% confidence intervals (95% CI) were reported.
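For readers unfamiliar with the Kaplan‐Meier method, the product‐limit calculation can be sketched in plain Python. This is an illustration only; the study used SAS and R, and the function name and toy follow‐up times below are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time for each patient (eg, days since enrollment)
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Toy data (not study data): six patients, censored at 183 days if still alive.
curve = kaplan_meier([30, 60, 60, 90, 183, 183], [1, 1, 0, 1, 0, 0])
```

Censored patients contribute to the risk set until they leave it, which is why administrative censoring at 6 months does not bias the estimate downward.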

To determine if arousal was associated with 6‐month mortality in the absence of delirium, we performed another Cox proportional hazard regression in a subset of 406 patients who received a psychiatrist assessment. Six‐month mortality was the dependent variable, and the independent variable was a 3‐level categorical variable of different arousal/delirium combinations: (1) impaired arousal/delirium positive, (2) impaired arousal/delirium negative, and (3) normal arousal (with or without delirium). Because there were only 8 patients who had normal arousal with delirium, this group was collapsed into the normal arousal without delirium group. Because there were 55 deaths, the number of covariates that could be entered into the Cox proportional hazard regression model was limited. We used the inverse weighted propensity score method to help minimize residual confounding.[23] Traditional propensity score adjustment could not be performed because there were 3 arousal/delirium categories. Similar to propensity score adjustment, inverse weighted propensity score method was used to help balance the distribution of patient characteristics among the exposure groups and also allow adjustment for multiple confounders while minimizing the degrees of freedom expended. A propensity score was the probability of having a particular arousal/delirium category based upon baseline patient characteristics. Multinomial logistic regression was performed to calculate the propensity score, and the baseline covariates used were age, gender, nonwhite race, comorbidity burden, severity of illness, dementia, functional dependence, and nursing home residence. For the Cox proportional hazard regression model, each observation was weighted by the inverse of the propensity score for their given arousal/delirium category; propensity scores exceeding the 95th percentile were trimmed to avoid overly influential weighting. 
Benzodiazepine and opioid medications given in the emergency department and admission status were adjusted as covariates in the weighted Cox proportional hazard regression model.
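The weighting step can be illustrated as follows. The text does not specify exactly how trimming was carried out, so this sketch makes a loud assumption: it caps the inverse‐propensity weights at their 95th percentile rather than dropping observations. The function name and example scores are hypothetical.

```python
import statistics


def inverse_propensity_weights(scores, cap_percentile=95):
    """Weight each observation by 1 / (propensity of its own exposure category).

    scores -- probability, from the multinomial model, that each patient falls
              in the arousal/delirium category they were actually observed in.
    ASSUMPTION: extreme weights are capped at the given percentile; the
    study's exact trimming rule may differ.
    """
    weights = [1.0 / p for p in scores]
    # quantiles(n=100) returns the 99 percentile cut points.
    cut_points = statistics.quantiles(weights, n=100, method="inclusive")
    cap = cut_points[cap_percentile - 1]
    return [min(w, cap) for w in weights]


# Hypothetical propensity scores for five patients (not study data).
weights = inverse_propensity_weights([0.5, 0.5, 0.4, 0.25, 0.02])
```

Patients who were unlikely to be in their observed category receive large weights, so capping keeps a handful of such patients from dominating the weighted Cox model.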

Nineteen patients (1.8%) had missing Katz ADL; these missing values were imputed using multiple imputation. The reliability of the final regression models was internally validated using the bootstrap method.[21] Two thousand sets of bootstrap samples were generated by resampling the original data, and the optimism was estimated to determine the degree of overfitting.[21] An optimism value >0.85 indicated no evidence of substantial overfitting.[21] Variance inflation factors were used to check multicollinearity. Schoenfeld residuals were also analyzed to determine goodness‐of‐fit and assess for outliers. P values <0.05 were considered statistically significant. All statistical analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC) and open source R statistical software version 3.0.1 (http://www.r-project.org/).

RESULTS

A total of 1903 patients were screened, and 1084 patients met enrollment criteria (Figure 1). Of these, 1051 (97.0%) were non‐ICU patients. Patient characteristics of this cohort can be seen in Table 2. Enrolled patients and potentially eligible patients who presented to the emergency department during the enrollment window were similar in age, gender, and severity of illness, but enrolled patients were slightly more likely to have a chief complaint of chest pain and syncope (unpublished data).

Figure 1
Enrollment flow diagram. RASS, Richmond Agitation‐Sedation Scale. Patients who were non‐verbal or unable to follow simple commands prior to their acute illness were considered to have end‐stage dementia.
Table 2
Patient Characteristics

| Variables | Normal Arousal, n=835 | Impaired Arousal, n=249 | P Value |
|---|---|---|---|
| Median age, y (IQR) | 74 (69–80) | 75 (70–83) | 0.005 |
| Female gender | 459 (55.0%) | 132 (53.0%) | 0.586 |
| Nonwhite race | 122 (14.6%) | 51 (20.5%) | 0.027 |
| Residence | | | <0.001 |
| Home | 752 (90.1%) | 204 (81.9%) | |
| Assisted living | 29 (3.5%) | 13 (5.2%) | |
| Rehabilitation | 8 (1.0%) | 5 (2.0%) | |
| Nursing home | 42 (5.0%) | 27 (10.8%) | |
| Dementia* | 175 (21.0%) | 119 (47.8%) | <0.001 |
| Dependent | 120 (14.4%) | 99 (39.8%) | <0.001 |
| Median Charlson (IQR) | 2 (1, 4) | 3 (2, 5) | <0.001 |
| Median APS (IQR) | 2 (1, 4) | 2 (1, 5) | <0.001 |
| Primary complaint | | | <0.001 |
| Abdominal pain | 45 (5.4%) | 13 (5.2%) | |
| Altered mental status | 12 (1.4%) | 36 (14.5%) | |
| Chest pain | 128 (15.3%) | 31 (12.5%) | |
| Disturbances of sensation | 17 (2.0%) | 2 (0.8%) | |
| Dizziness | 16 (1.9%) | 2 (0.8%) | |
| Fever | 11 (1.3%) | 7 (2.8%) | |
| General illness, malaise | 26 (3.1%) | 5 (2.0%) | |
| General weakness | 68 (8.1%) | 29 (11.7%) | |
| Nausea/vomiting | 29 (3.5%) | 4 (1.6%) | |
| Shortness of breath | 85 (10.2%) | 21 (8.4%) | |
| Syncope | 46 (5.5%) | 10 (4.0%) | |
| Trauma, multiple organs | 19 (2.3%) | 8 (3.2%) | |
| Other | 333 (39.9%) | 81 (32.5%) | |
| Benzodiazepine or opioid medication administration | 188 (22.5%) | 67 (26.9%) | 0.152 |
| Admitted to the hospital | 478 (57.3%) | 191 (76.7%) | 0.002 |
| Internal medicine | 411 (86.0%) | 153 (80.1%) | |
| Surgery | 38 (8.0%) | 21 (11.0%) | |
| Neurology | 19 (4.0%) | 13 (6.8%) | |
| Psychiatry | 1 (0.2%) | 2 (1.1%) | |
| Unknown/missing | 9 (1.9%) | 2 (1.1%) | |
| Death within 6 months | 81 (9.7%) | 59 (23.7%) | <0.001 |

NOTE: Patient characteristics and demographics of enrolled patients. Continuous and ordinal variables are expressed as medians and interquartile ranges (IQR). Categorical variables are expressed as absolute numbers and percentages. *Patient was considered to have dementia if it was documented in the medical record, the patient was on home cholinesterase inhibitors, or had a short‐form Informant Questionnaire on Cognitive Decline in the Elderly >3.38. Patients with a Katz Activities of Daily Living of <5 were considered to be functionally dependent. There were 19 patients with missing Katz Activities of Daily Living scores. Charlson index is a weighted scale used to measure comorbidity burden. Higher scores indicate higher comorbidity burden. The Acute Physiology Score (APS) of the Acute Physiology and Chronic Health Evaluation II score was used to quantify severity of illness. Glasgow Coma Scale was not incorporated in this score. Higher scores indicate higher severity of illness.

Of those enrolled, 249 (23.0%) had an abnormal RASS at initial presentation, and their distribution can be seen in Figure 2. Within 6 months, patients with an abnormal RASS were more likely to die compared with patients with a RASS of 0 (23.7% vs 9.7%, P<0.001). The Kaplan‐Meier survival curves for all enrolled patients with impaired and normal RASS can be seen in Figure 3; the survival curve declined more slowly in patients with a normal RASS compared with those with an abnormal RASS.
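The unadjusted mortality figures can be reproduced directly from the counts reported in Table 2. The short calculation below is a reader's check, not part of the original analysis; the crude risk ratio it computes is not reported by the authors and is shown only for orientation.

```python
# Counts taken from Table 2.
deaths_impaired, n_impaired = 59, 249
deaths_normal, n_normal = 81, 835

risk_impaired = deaths_impaired / n_impaired
risk_normal = deaths_normal / n_normal

# Crude (unadjusted) risk ratio; unlike the hazard ratio from the Cox model,
# this ignores confounders and time to death.
crude_risk_ratio = risk_impaired / risk_normal

print(f"Impaired arousal: {100 * risk_impaired:.1f}% died within 6 months")
print(f"Normal arousal:   {100 * risk_normal:.1f}% died within 6 months")
print(f"Crude risk ratio: {crude_risk_ratio:.2f}")
```

The crude ratio is larger than the adjusted hazard ratio, which is what one would expect if covariates such as dementia and functional dependence account for part of the excess mortality.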

Figure 2
Richmond Agitation‐Sedation Scale (RASS) distribution among enrolled patients. Distribution of RASS at initial presentation among 1084 acutely ill older patients, and of these, 1051 patients (97.0%) were non–intensive care unit patients. The RASS is a widely used arousal scale that can be performed during routine clinical care and takes <10 seconds to perform. A RASS of 0 indicates normal level of arousal (awake and alert), whereas a RASS of <0 indicates decreased arousal and a RASS of >0 indicates increased arousal.
Figure 3
Kaplan‐Meier survival curves in acutely ill older patients with normal and impaired arousal at initial presentation over a 6‐month period. Arousal was assessed using the Richmond Agitation‐Sedation Scale (RASS). Patients with impaired arousal were more likely to die compared to patients with normal arousal (23.7% vs 9.7%) within 6 months. Using Cox proportional hazard regression, patients with an abnormal RASS were 73% more likely to die within 6 months after adjusting for age, dementia, functional dependence, comorbidity burden, severity of illness, hearing impairment, nursing home residence, admission status, and administration of benzodiazepine/opioid medications. Severity of illness (P = 0.52), benzodiazepine/opioid medication administration (P = 0.38), and admission status (P = 0.57) did not modify the relationship between impaired arousal and 6‐month mortality. Abbreviations: CI, confidence interval.

Using Cox proportional hazards regression, the relationship between an abnormal RASS at initial presentation and 6‐month mortality persisted (HR: 1.73, 95% CI: 1.21‐2.49) after adjusting for age, sex, nonwhite race, comorbidity burden, severity of illness, dementia, functional dependence, nursing home residence, psychoactive medications given, and admission status. The interaction between an abnormal RASS and APS (severity of illness) had a P value of 0.52. The interaction between an abnormal RASS and benzodiazepine or opioid medication administration had a P value of 0.38. The interaction between an abnormal RASS and admission status had a P value of 0.57. This indicated that severity of illness, psychoactive medication administration, and admission status did not modify the relationship between an abnormal RASS and 6‐month mortality.
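Because a Cox model estimates the log hazard ratio, the reported confidence interval should be roughly symmetric on the log scale, and an approximate standard error and Wald statistic can be recovered from it. The sketch below is a reader's consistency check using the rounded published values; it is not the authors' computation, and the helper name `se_from_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def se_from_ci(lower: float, upper: float, level: float = 0.95) -> float:
    """Approximate standard error of log(HR) from a Wald confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return (math.log(upper) - math.log(lower)) / (2 * z)


hr, lower, upper = 1.73, 1.21, 2.49  # values reported in the text

se = se_from_ci(lower, upper)
z_stat = math.log(hr) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

# The geometric mean of the limits should land close to the point estimate.
midpoint_hr = math.sqrt(lower * upper)
```

With the rounded inputs, the geometric mean of the interval limits is close to the reported hazard ratio of 1.73, which is consistent with a standard Wald interval.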

We analyzed a subset of 406 patients who received a psychiatrist's assessment to determine if an abnormal RASS was associated with 6‐month mortality regardless of delirium status using Cox proportional hazard regression weighted by the inverse of the propensity score. Compared with a normal RASS, an abnormal RASS without delirium was significantly associated with higher 6‐month mortality (HR: 2.20, 95% CI: 1.10‐4.41). An abnormal RASS with delirium was also associated with an increased risk of 6‐month mortality (HR: 2.86, 95% CI: 1.29‐6.34).

All regression models were internally validated. There was no evidence of substantial overfitting or collinearity. The Schoenfeld residuals for each model were examined graphically and there was good model fit overall, and no significant outliers were observed.

DISCUSSION

Vital sign measurements are a fundamental component of patient care, and abnormalities can serve as an early warning signal of the patient's clinical deterioration. However, traditional vital signs do not include an assessment of the patient's brain function. Our chief finding is that impaired arousal at initial presentation, as determined by the nonphysician research staff, increased the risk of 6‐month mortality by 73% after adjusting for confounders in a diverse group of acutely ill older patients. This relationship existed regardless of severity of illness, administration of psychoactive medications, and admission status. Though impaired arousal is closely linked with delirium,[2, 24] which is another well‐known predictor of mortality,[11, 25, 26] the prognostic significance of impaired arousal appeared to extend beyond delirium. We observed that the relationship between 6‐month mortality and impaired arousal in the absence of delirium was remarkably similar to that observed with impaired arousal with delirium. Arousal can be assessed for by simply observing the patient during routine clinical care and can be performed by nonphysician and physician healthcare providers. Its assessment should be performed and communicated in conjunction with traditional vital sign measurements in the emergency department and inpatient settings.[1]

Most of the data linking impaired arousal to death have been collected in the ICU. Coma, which represents the most severe form of depressed arousal, has been shown to increase the likelihood of death regardless of underlying etiology.[27, 28, 29, 30, 31] This includes patients who have impaired arousal because they received sedative medications during mechanical ventilation.[32] Few studies have investigated the effect of impaired arousal in a non‐ICU patient population. Zuliani et al. observed that impaired arousal was associated with 30‐day mortality, but their study was conducted in 469 older stroke patients, limiting the study's external validity to a more general patient population.[33] Our data advance the current stage of knowledge; we observed a similar relationship between impaired arousal and 6‐month mortality in a much broader clinical population who were predominantly not critically ill regardless of delirium status. Additionally, most of our impaired arousal cohort had a RASS of 1, indicating that even subtle abnormalities portended adverse outcomes.

In addition to long‐term prognosis, the presence of impaired arousal has immediate clinical implications. Using arousal scales like the RASS can serve as a way for healthcare providers to succinctly communicate the patient's mental status in a standardized manner during transitions of care (eg, emergency physician to inpatient team). Regardless of which clinical setting they are in, older acutely ill patients with an impaired arousal may also require close monitoring, especially if the impairment is acute. Because of its close relationship with delirium, these patients likely have an underlying acute medical illness that precipitated their impaired arousal.

Understanding the true clinical significance of impaired arousal in the absence of delirium requires further study. Because of the fluctuating nature of delirium, it is possible that these patients may have initially been delirious and then became nondelirious during the psychiatrist's evaluation. Conversely, it is also possible that these patients may have eventually transitioned into delirium at later point in time; the presence of impaired arousal alone may be a precursor to delirium. Last, these patients may have had subsyndromal delirium, which is defined as having 1 or more delirium symptoms without ever meeting full DSM‐IV‐TR criteria for delirium.[34] Patients with subsyndromal delirium have poorer outcomes, such as prolonged hospitalizations, and higher mortality than patients without delirium symptoms.[34]

Additional studies are also needed to further clarify the impact of impaired arousal on nonmortality outcomes such as functional and cognitive decline. The prognostic significance of serial arousal measurements also requires further study. It is possible that patients whose impaired arousal rapidly resolves after an intervention may have better prognoses than those who have persistent impairment. The measurement of arousal may have additional clinical applications in disease prognosis models. The presence of altered mental status is incorporated in various disease‐specific risk scores such as the CURB‐65 or Pneumonia Severity Index for pneumonia,[35, 36] and the Pulmonary Embolism Severity Index for pulmonary embolism.[37] However, the definition of altered mental status is highly variable; it ranges from subjective impressions that can be unreliable to formal cognitive testing, which can be time consuming. Arousal scales such as the RASS may allow for more feasible, reliable, and standardized assessment of mental status. Future studies should investigate if incorporating the RASS would improve the discrimination of these disease‐severity indices.

This study has several notable limitations. We excluded patients with a RASS of 4 and 5, which represented comatose patients. This exclusion, however, likely biased our findings toward the null. We enrolled a convenience sample that may have introduced selection bias. However, our enrolled cohort was similar to all potentially eligible patients who presented to the emergency department during the study period. We also attempted to mitigate this selection bias by using multivariable regression and adjusting for factors that may have confounded the relationship between RASS and 6‐month mortality. This study was performed at a single, urban, academic hospital and enrolled patients who were aged 65 years and older. Our findings may not be generalizable to other settings and to those who are under 65 years of age. Because 406 patients received a psychiatric evaluation, this limited the number of covariates that could be incorporated into the multivariable model to evaluate if impaired arousal in the absence of delirium is associated with 6‐month mortality. To minimize residual confounding, we used the inverse weighted propensity score, but we acknowledge that this bias may still exist. Larger studies are needed to clarify the relationships between arousal, delirium, and mortality.

CONCLUSION

In conclusion, impaired arousal at initial presentation is an independent predictor of 6‐month mortality in a diverse group of acutely ill older patients, and this risk appears to be present even in the absence of delirium. Because of its ease of measurement and prognostic significance, arousal may be a useful vital sign for underlying brain dysfunction. Standardized assessment and communication of arousal during routine clinical care may be warranted.

Disclosures: Research reported in this publication was supported by the Vanderbilt Physician Scientist Development Award, Emergency Medicine Foundation, and National Institute on Aging of the National Institutes of Health under award number K23AG032355. This study was also supported by the National Center for Research Resources, grant UL1 RR024975‐01, and is now at the National Center for Advancing Translational Sciences, grant 2 UL1 TR000445‐06. Dr. Vasilevskis was supported in part by the National Institute on Aging of the National Institutes of Health under award number K23AG040157. Dr. Powers was supported by Health Resources and Services Administration Geriatric Education Centers, grant 1D31HP08823‐01‐00. Dr. Storrow was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number K12HL1090 and the National Center for Advancing Translational Sciences under award number UL1TR000445. Dr. Ely was supported in part by the National Institute on Aging of the National Institutes of Health under award numbers R01AG027472 and R01AG035117, and a Veteran Affairs MERIT award. Drs. Vasilevskis, Schnelle, Dittus, Powers, and Ely were supported by the Veteran Affairs Geriatric Research, Education, and Clinical Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of Vanderbilt University, Emergency Medicine Foundation, National Institutes of Health, and Veterans Affairs. The funding agencies did not have any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

J.H.H., E.W.E., J.F.S., A.B.S., and R.D.S. conceived the trial. J.H.H., E.W.E., A.B.S., J.F.S., R.D.S., A.S., and A.W. participated in the study design. J.H.H. and A.W. recruited patients and collected the data. J.H.H., A.J.G., and A.S. analyzed the data. All authors participated in the interpretation of results. J.H.H. drafted the manuscript, and all authors contributed to the critical review and revision of the manuscript.

The authors report no conflicts of interest.

References
  1. Flaherty JH, Shay K, Weir C, et al. The development of a mental status vital sign for use across the spectrum of care. J Am Med Dir Assoc. 2009;10:379-380.
  2. Chester JG, Beth Harrington M, Rudolph JL, Group VADW. Serial administration of a modified Richmond Agitation and Sedation Scale for delirium screening. J Hosp Med. 2012;7:450-453.
  3. Inouye SK, van Dyck CH, Alessi CA, Balkin S, Siegal AP, Horwitz RI. Clarifying confusion: the confusion assessment method. A new method for detection of delirium. Ann Intern Med. 1990;113:941-948.
  4. Ely EW, Inouye SK, Bernard GR, et al. Delirium in mechanically ventilated patients: validity and reliability of the confusion assessment method for the intensive care unit (CAM‐ICU). JAMA. 2001;286:2703-2710.
  5. Han JH, Wilson A, Vasilevskis EE, et al. Diagnosing delirium in older emergency department patients: validity and reliability of the Delirium Triage Screen and the Brief Confusion Assessment Method. Ann Emerg Med. 2013;62:457-465.
  6. Inouye SK, Rushing JT, Foreman MD, Palmer RM, Pompei P. Does delirium contribute to poor hospital outcomes? A three‐site epidemiologic study. J Gen Intern Med. 1998;13:234-242.
  7. Pitkala KH, Laurila JV, Strandberg TE, Tilvis RS. Prognostic significance of delirium in frail older people. Dement Geriatr Cogn Disord. 2005;19:158-163.
  8. Witlox J, Eurelings LS, de Jonghe JF, Kalisvaart KJ, Eikelenboom P, van Gool WA. Delirium in elderly patients and the risk of postdischarge mortality, institutionalization, and dementia: a meta‐analysis. JAMA. 2010;304:443-451.
  9. Leslie DL, Marcantonio ER, Zhang Y, Leo‐Summers L, Inouye SK. One‐year health care costs associated with delirium in the elderly population. Arch Intern Med. 2008;168:27-32.
  10. Sessler CN, Gosnell MS, Grap MJ, et al. The Richmond Agitation‐Sedation Scale: validity and reliability in adult intensive care unit patients. Am J Respir Crit Care Med. 2002;166:1338-1344.
  11. Han JH, Shintani A, Eden S, et al. Delirium in the emergency department: an independent predictor of death within 6 months. Ann Emerg Med. 2010;56:244-252.
  12. Han JH, Eden S, Shintani A, et al. Delirium in older emergency department patients is an independent predictor of hospital length of stay. Acad Emerg Med. 2011;18:451-457.
  13. Han JH, Wilson A, Graves AJ, et al. Validation of the Confusion Assessment Method for the Intensive Care Unit in older emergency department patients. Acad Emerg Med. 2014;21:180-187.
  14. Ely EW, Truman B, Shintani A, et al. Monitoring sedation status over time in ICU patients: reliability and validity of the Richmond Agitation‐Sedation Scale (RASS). JAMA. 2003;289:2983-2991.
  15. Holsinger T, Deveau J, Boustani M, Williams JW. Does this patient have dementia? JAMA. 2007;297:2391-2404.
  16. Jorm AF. A short form of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): development and cross‐validation. Psychol Med. 1994;24:145-153.
  17. Katz S. Assessing self‐maintenance: activities of daily living, mobility, and instrumental activities of daily living. J Am Geriatr Soc. 1983;31:721-727.
  18. Murray SB, Bates DW, Ngo L, Ufberg JW, Shapiro NI. Charlson Index is associated with one‐year mortality in emergency department patients with suspected infection. Acad Emerg Med. 2006;13:530-536.
  19. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818-829.
  20. American Psychiatric Association. Task Force on DSM‐IV. Diagnostic and Statistical Manual of Mental Disorders: DSM‐IV‐TR. 4th ed. Washington, DC: American Psychiatric Association; 2000.
  21. Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York, NY: Springer; 2001.
  22. Marshall SW. Power for tests of interaction: effect of raising the Type I error rate. Epidemiol Perspect Innov. 2007;4:4.
  23. Austin PC. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behav Res. 2011;46:399-424.
  24. Meagher DJ, Maclullich AM, Laurila JV. Defining delirium for the International Classification of Diseases, 11th Revision. J Psychosom Res. 2008;65:207-214.
  25. McCusker J, Cole M, Abrahamowicz M, Primeau F, Belzile E. Delirium predicts 12‐month mortality. Arch Intern Med. 2002;162:457-463.
  26. Ely EW, Shintani A, Truman B, et al. Delirium as a predictor of mortality in mechanically ventilated patients in the intensive care unit. JAMA. 2004;291:1753-1762.
  27. Teres D, Brown RB, Lemeshow S. Predicting mortality of intensive care unit patients. The importance of coma. Crit Care Med. 1982;10:86-95.
  28. Jennett B, Bond M. Assessment of outcome after severe brain damage. Lancet. 1975;1:480-484.
  29. Levy DE, Caronna JJ, Singer BH, Lapinski RH, Frydman H, Plum F. Predicting outcome from hypoxic‐ischemic coma. JAMA. 1985;253:1420-1426.
  30. Tuhrim S, Dambrosia JM, Price TR, et al. Prediction of intracerebral hemorrhage survival. Ann Neurol. 1988;24:258-263.
  31. Booth CM, Boone RH, Tomlinson G, Detsky AS. Is this patient dead, vegetative, or severely neurologically impaired? Assessing outcome for comatose survivors of cardiac arrest. JAMA. 2004;291:870-879.
  32. Shehabi Y, Bellomo R, Reade MC, et al. Early intensive care sedation predicts long‐term mortality in ventilated critically ill patients. Am J Respir Crit Care Med. 2012;186:724-731.
  33. Zuliani G, Cherubini A, Ranzini M, Ruggiero C, Atti AR, Fellin R. Risk factors for short‐term mortality in older subjects with acute ischemic stroke. Gerontology. 2006;52:231-236.
  34. Cole M, McCusker J, Dendukuri N, Han L. The prognostic significance of subsyndromal delirium in elderly medical inpatients. J Am Geriatr Soc. 2003;51:754-760.
  35. Lim WS, van der Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58:377-382.
  36. Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336:243-250.
  37. Aujesky D, Obrosky DS, Stone RA, et al. Derivation and validation of a prognostic model for pulmonary embolism. Am J Respir Crit Care Med. 2005;172:1041-1046.
References
  1. Flaherty JH, Shay K, Weir C, et al. The development of a mental status vital sign for use across the spectrum of care. J Am Med Dir Assoc. 2009;10:379380.
  2. Chester JG, Beth Harrington M, Rudolph JL, Group VADW. Serial administration of a modified Richmond Agitation and Sedation Scale for delirium screening. J Hosp Med. 2012;7:450453.
  3. Inouye SK, Dyck CH, Alessi CA, Balkin S, Siegal AP, Horwitz RI. Clarifying confusion: the confusion assessment method. A new method for detection of delirium. Ann Intern Med. 1990;113:941948.
  4. Ely EW, Inouye SK, Bernard GR, et al. Delirium in mechanically ventilated patients: validity and reliability of the confusion assessment method for the intensive care unit (CAM‐ICU). JAMA. 2001;286:27032710.
  5. Han JH, Wilson A, Vasilevskis EE, et al. Diagnosing delirium in older emergency department patients: validity and reliability of the Delirium Triage Screen And The Brief Confusion Assessment Method. Ann Emerg Med. 2013;62:457465.
  6. Inouye SK, Rushing JT, Foreman MD, Palmer RM, Pompei P. Does delirium contribute to poor hospital outcomes? A three‐site epidemiologic study. J Gen Intern Med. 1998;13:234242.
  7. Pitkala KH, Laurila JV, Strandberg TE, Tilvis RS. Prognostic significance of delirium in frail older people. Dement Geriatr Cogn Disord. 2005;19:158163.
  8. Witlox J, Eurelings LS, Jonghe JF, Kalisvaart KJ, Eikelenboom P, Gool WA. Delirium in elderly patients and the risk of postdischarge mortality, institutionalization, and dementia: a meta‐analysis. JAMA. 2010;304:443451.
  9. Leslie DL, Marcantonio ER, Zhang Y, Leo‐Summers L, Inouye SK. One‐year health care costs associated with delirium in the elderly population. Arch Intern Med. 2008;168:2732.
  10. Sessler CN, Gosnell MS, Grap MJ, et al. The Richmond Agitation‐Sedation Scale: validity and reliability in adult intensive care unit patients. Am J Respir Crit Care Med. 2002;166:13381344.
  11. Han JH, Shintani A, Eden S, et al. Delirium in the emergency department: an independent predictor of death within 6 months. Ann Emerg Med. 2010;56:244252.
  12. Han JH, Eden S, Shintani A, et al. Delirium in older emergency department patients is an independent predictor of hospital length of stay. Acad Emerg Med. 2011;18:451457.
  13. Han JH, Wilson A, Graves AJ, et al. Validation of the Confusion Assessment Method For The Intensive Care Unit in older emergency department patients. Acad Emerg Med. 2014;21:180187.
  14. Ely EW, Truman B, Shintani A, et al. Monitoring sedation status over time in ICU patients: reliability and validity of the Richmond Agitation‐Sedation Scale (RASS). JAMA. 2003;289:29832991.
  15. Holsinger T, Deveau J, Boustani M, Williams JW. Does this patient have dementia? JAMA. 2007;297:23912404.
  16. Jorm AF. A short form of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): development and cross‐validation. Psychol Med. 1994;24:145153.
  17. Katz S. Assessing self‐maintenance: activities of daily living, mobility, and instrumental activities of daily living. J Am Geriatr Soc. 1983;31:721727.
  18. Murray SB, Bates DW, Ngo L, Ufberg JW, Shapiro NI. Charlson Index is associated with one‐year mortality in emergency department patients with suspected infection. Acad Emerg Med. 2006;13:530536.
  19. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818829.
  20. American Psychiatric Association. Task Force on DSM‐IV. Diagnostic and Statistical Manual of Mental Disorders: DSM‐IV‐TR. 4th ed. Washington, DC: American Psychiatric Association; 2000.
  21. Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York, NY: Springer; 2001.
  22. Marshall SW. Power for tests of interaction: effect of raising the Type I error rate. Epidemiol Perspect Innov. 2007;4:4.
  23. Austin PC. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behav Res. 2011;46:399424.
  24. Meagher DJ, Maclullich AM, Laurila JV. Defining delirium for the International Classification of Diseases, 11th Revision. J Psychosom Res. 2008;65:207214.
  25. McCusker J, Cole M, Abrahamowicz M, Primeau F, Belzile E. Delirium predicts 12‐month mortality. Arch Intern Med. 2002;162:457463.
  26. Ely EW, Shintani A, Truman B, et al. Delirium as a predictor of mortality in mechanically ventilated patients in the intensive care unit. JAMA. 2004;291:17531762.
  27. Teres D, Brown RB, Lemeshow S. Predicting mortality of intensive care unit patients. The importance of coma. Crit Care Med. 1982;10:8695.
  28. Jennett B, Bond M. Assessment of outcome after severe brain damage. Lancet. 1975;1:480484.
  29. Levy DE, Caronna JJ, Singer BH, Lapinski RH, Frydman H, Plum F. Predicting outcome from hypoxic‐ischemic coma. JAMA. 1985;253:14201426.
  30. Tuhrim S, Dambrosia JM, Price TR, et al. Prediction of intracerebral hemorrhage survival. Ann Neurol. 1988;24:258263.
  31. Booth CM, Boone RH, Tomlinson G, Detsky AS. Is this patient dead, vegetative, or severely neurologically impaired? Assessing outcome for comatose survivors of cardiac arrest. JAMA. 2004;291:870879.
  32. Shehabi Y, Bellomo R, Reade MC, et al. Early intensive care sedation predicts long‐term mortality in ventilated critically ill patients. Am J Respir Crit Care Med. 2012;186:724731.
  33. Zuliani G, Cherubini A, Ranzini M, Ruggiero C, Atti AR, Fellin R. Risk factors for short‐term mortality in older subjects with acute ischemic stroke. Gerontology. 2006;52:231236.
  34. Cole M, McCusker J, Dendukuri N, Han L. The prognostic significance of subsyndromal delirium in elderly medical inpatients. J Am Geriatr Soc. 2003;51:754760.
  35. Lim WS, der Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58:377382.
  36. Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336:243250.
  37. Aujesky D, Obrosky DS, Stone RA, et al. Derivation and validation of a prognostic model for pulmonary embolism. Am J Respir Crit Care Med. 2005;172:10411046.
Issue
Journal of Hospital Medicine - 9(12)
Page Number
772-778
Display Headline
Impaired arousal at initial presentation predicts 6‐month mortality: An analysis of 1084 acutely ill older patients
Article Source

© 2014 Society of Hospital Medicine

Correspondence Location
Address for correspondence and reprint requests: Jin H. Han, MD, Department of Emergency Medicine, Vanderbilt University Medical Center, 703 Oxford House, Nashville, TN 37232‐4700; Telephone: 615‐936‐0087; Fax: 615‐936‐1316; E‐mail: jin.h.han@vanderbilt.edu

Improving Visual Estimates of Cervical Spine Range of Motion

Article Type
Changed
Thu, 09/19/2019 - 13:39

Assessment of cervical spine range of motion (ROM) is an integral aspect of the physical examination for cervical conditions,1-3 surgical outcomes,4 and functional impairment.1 In fact, the emphasis being placed on such functional measures before and after treatments is increasing.4,5

Cervical spine range of motion is routinely used as an outcome measure in clinical studies.6-8 Underscoring the importance of defining cervical spine ROM, studies have found it to be a preoperative predictor of outcomes of anterior cervical surgery,9 and other studies have suggested it is a determinant of athletes’ return to play.10

Spinal ROM measurements can be used to determine the degree of disability experienced by a patient with a spinal condition as defined in the Guides to the Evaluation of Permanent Impairment by the American Medical Association (AMA).1 In the medicolegal realm, ROM measurements made by clinicians can influence the dollar amounts of awards in legal claims, and, according to the AMA guides, the difference in cervical spine ROM between normality and disability or impairment can be as little as 5°.

Although cervical spine ROM is routinely assessed and documented in clinical practice, no universal protocol exists for its evaluation.11,12 In fact, considerable inter-examiner variation in visual estimates of ROM has been found,13-16 and significant inaccuracies have been reported.17,18

Goniometers have been shown to be reliable and highly accurate, with low inter-examiner and intra-examiner variability.5,19-21 Nevertheless, logistics22 and costs21 generally limit their acceptance in routine clinical practice. Among the many methods available for assessing ROM, visual estimation is the least reliable and least accurate,23 but it is the quickest and least expensive and is recommended in textbooks that describe the spinal-specific physical examination.24 As a result, most clinicians assessing cervical spine ROM prefer visual estimates over goniometers.

We conducted a study to determine whether training could improve the accuracy of visual estimates. We compared the accuracy of visual estimates of cervical spine ROM with that of a radiographically validated electrogoniometer and then investigated whether accuracy and reliability of visual estimates could be improved with a session of instruction and demonstration. Assessments of accuracy were made immediately after and 1 month after this training session.

Materials and Methods

Assessments Made Before Training

This study was approved by our institution’s human investigation committee and was conducted in accordance with the ethical standards of that committee.

Cervical spine ROM was assessed by 8 examiners (2 attending spine surgeons, 4 orthopedic residents, 2 medical students). They were informed they would be participating in a study evaluating visual estimates of motion but were given no other information prior to the study.

Four healthy volunteer subjects (examiners who rotated through the role) were assessed. No subject reported any ongoing neck or spine discomfort or had had any previous spinal surgery. One at a time, subjects were fitted with a cervical harness electrogoniometer capable of measuring angulation of the cervical spine to the nearest degree (modified electrogoniometer, torsiometer, and display from Biometrics, Gwent, UK; Figures 1A, 1B). This electrogoniometer has been shown to have a mean (SD) error of 2.3° (2.6°) relative to radiographic assessments.8

With the electrogoniometer fitted, each subject was instructed to sit upright in a chair with his back to the backrest and his head neutrally positioned. The electrogoniometer was then zeroed, and the subject proceeded with 5 series of flexion-extension, left and right lateral bending, and left and right rotation movements. The subject was instructed to make 1 movement in full motion in each direction and the other 4 movements in less than full motion to yield a variety of excursions for assessment. Each subject was instructed to pause at the apex of each motion. During these pauses, the examiners recorded their visual estimates of movement in each direction while the investigator recorded degrees of motion (displayed by the electrogoniometer) in flexion-extension, lateral bending, and rotation (Figures 2A–2D). The electrogoniometer display was not visible to subjects or examiners.

A total of 840 independent visual estimates of 120 distinct movements were recorded.

Training, and Assessments Made Immediately Thereafter

After the first round of visual estimates, the 8 examiners were verbally instructed in cervical spine ROM assessment and were asked to observe 1 subject, fitted with the electrogoniometer, demonstrating partial and full cervical motions while the investigator announced the electrogoniometric measurements. The motions demonstrated included 15°, 30°, and the extremes of cervical spine ROM in each of 6 directions from neutral.

After this training session, each of the 4 subjects from the first round of assessments was again fitted with the harness electrogoniometer and instructed to repeat the movements in turn while examiners visually estimated cervical spine ROM and independently recorded their estimates. Meanwhile, the investigator recorded the degree of motion during each movement (as measured by the electrogoniometer). Again, a total of 840 independent visual estimates of 120 distinct movements were recorded.

Assessments Made 1 Month After Training

One month after the training session, the examiners and the investigator reconvened to assess the same 4 subjects using a procedure for simultaneous visual estimation and electrogoniometric measurement identical to that used 1 month earlier. No additional training was given. Again, 840 independent visual estimates of 120 distinct movements were recorded.

Data Analysis

The reliability of visual estimates was analyzed by calculating intraclass correlation coefficients (ICCs) using random-effect 1-way analyses of variance. By convention, ICCs of < 0.2, 0.2 to 0.39, 0.4 to 0.59, 0.6 to 0.8, and > 0.8 correspond to poor, fair, moderate, substantial, and perfect reliability, respectively.25
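
As a rough illustration of this reliability calculation, the sketch below computes a one-way random-effects ICC from the between-movement and within-movement mean squares. The single-rater ICC(1,1) form, the function names, and the sample ratings are assumptions made for illustration; the study itself reports only that ICCs were obtained from random-effect 1-way analyses of variance in SPSS.

```python
from statistics import mean

def icc_oneway(ratings):
    """One-way random-effects ICC, single-rater form (assumed here).

    ratings: one row per observed movement, each row holding the
    estimates of k different examiners for that movement.
    """
    n = len(ratings)          # number of movements (targets)
    k = len(ratings[0])       # number of examiners per movement
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Mean squares from a one-way analysis of variance
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def reliability_label(icc):
    """Map an ICC to the reliability bands quoted in the text."""
    if icc < 0.2:
        return "poor"
    if icc < 0.4:
        return "fair"
    if icc < 0.6:
        return "moderate"
    if icc <= 0.8:
        return "substantial"
    return "perfect"

# Hypothetical estimates (degrees): 3 movements, 2 examiners each
example = [[1, 2], [3, 4], [5, 6]]
example_icc = icc_oneway(example)
```

With these made-up ratings the examiners differ by only 1° on movements that span several degrees, so the coefficient falls in the highest band.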

We compared the visual estimates and electrogoniometric measurements made for 3 planes of motion (flexion-extension, lateral bending, axial rotation) before, immediately after, and 1 month after training and drew trend lines generated by linear regression relative to a line of perfect correlation.
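
The trend-line comparison can be sketched as an ordinary least-squares fit of visual estimates against electrogoniometric measurements, read against the line of perfect correlation (slope 1, intercept 0). The function name and the data below are hypothetical and serve only to show the mechanics.

```python
def fit_trend_line(measured, estimated):
    """Least-squares slope and intercept of estimated vs measured angles."""
    n = len(measured)
    mx = sum(measured) / n
    my = sum(estimated) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, estimated))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data (degrees): electrogoniometer readings and one
# examiner's visual estimates of the same movements
measured = [10, 20, 30, 40, 50]
estimated = [12, 26, 38, 52, 64]
slope, intercept = fit_trend_line(measured, estimated)
# A slope above 1 indicates overestimation that grows with larger excursions;
# perfect agreement would give slope 1 and intercept 0.
```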

Mean errors in examiners’ visual estimates (relative to electrogoniometric measurements) made before, immediately after, and 1 month after training were calculated. Paired Student t tests were then used to compare the mean errors before training with the mean errors immediately after and 1 month after training.
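
A minimal sketch of these two steps follows. It assumes that "error" means the absolute difference between a visual estimate and the electrogoniometric reading, and that the paired comparison is made across examiners; neither detail is stated in the text, and the numbers are hypothetical. Only the t statistic is computed here; a P value would come from a t table or a statistics package (for example, SciPy's `ttest_rel`).

```python
from math import sqrt
from statistics import mean, stdev

def mean_abs_error(estimates, measured):
    """Mean absolute difference between estimates and measurements."""
    return mean(abs(e - m) for e, m in zip(estimates, measured))

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for before vs after."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical per-examiner mean errors (degrees)
errors_before = [24.0, 22.0, 26.0, 25.0]
errors_after = [12.0, 13.0, 11.0, 14.0]
t_stat, df = paired_t(errors_before, errors_after)
```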

All analyses were performed with SPSS for Windows 16.0 (SPSS, Chicago, Illinois).

Results

Inter-examiner reliability of the visual estimates in all planes of motion ranged from 0.51 to 0.79 (suggestive of moderate to substantial reliability). For reference, standard goniometers measuring knee ROM have inter-examiner ICCs of 0.89 to 0.98 (suggestive of perfect reliability).26 The ICCs before, immediately after, and 1 month after training were not significantly different.

As expected, there were significant errors in visual estimates of cervical spine ROM in all planes. Initial errors in visual estimates (relative to electrogoniometric measurements) were 23.9° (flexion-extension), 15.5° (lateral bending), and 19.3° (axial rotation) (Table, Figure 3).

Immediately after training, mean errors in visual estimates decreased to 12.0° (flexion-extension), 11.7° (lateral bending), and 16.4° (axial rotation) (Table, Figure 3). In all 3 planes of cervical motion, these improvements were statistically significant.

One month after training, mean errors in visual estimates were 14.4° (flexion-extension), 13.9° (lateral bending), and 16.2° (axial rotation) (Table, Figure 3). Only the improvement in the estimate of flexion-extension (the direction of the largest error initially) remained statistically significant—a 39.7% decrease in error.

We also considered how errors varied with degree of motion observed. In flexion-extension, the tendency to overestimate at larger degrees of motion was not apparent after training, and 1 month after training we found a tendency to underestimate at smaller degrees of motion (Figure 4A). The tendency to overestimate lateral bending before training did not persist immediately after or 1 month after training (Figure 4B). Estimates of axial rotation correlated well with goniometer measurements before training and were also well correlated immediately after and 1 month after training (Figure 4C).

Discussion

Visual estimation of spinal motion is unreliable and inaccurate, but its widespread use in clinical practice continues. Goniometers are far more accurate and reliable but are seldom used. We investigated whether a training session featuring verbal instruction and demonstration with an electrogoniometer could improve visual estimates and whether potential improvement in visual estimates would remain 1 month after training.

Widely variable ICCs (0.42-0.90) have been reported for visual estimates of cervical spine ROM.17,18,22 Our findings on the reliability of these estimates are consistent with the literature.

We recorded the greatest initial error in estimates of motion in flexion-extension. Previous studies have also found the greatest error and least reliability in visual estimates in this plane.14,15,18 Visual estimation may be more difficult in flexion-extension because the shoulders cannot be used as landmarks, whereas they serve as approximate 90° reference points during estimation of lateral bending and axial rotation. Demonstration of 15°, 30° and the extremes of ROM during the training session may have provided alternative reference points during visual estimation after training—decreasing the error to within the range found in other planes of motion.

Initial errors in visual estimates were 23.9° (flexion-extension), 15.5° (lateral bending), and 19.3° (axial rotation). Based on normative cervical spine ROM in a healthy population— 126° ± 12° for flexion-extension, 86° ± 5° for lateral bending, 151° ± 23° for axial rotation22—the errors we identified are 18.9% of the normal range of flexion-extension, 18.0% of lateral bending, and 12.8% of axial rotation.
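
The percentages above follow from dividing each initial error by the normative range for that plane. The short check below uses only the rounded values quoted in the text; the variable and function names are illustrative. With these rounded inputs, flexion-extension comes out at about 19.0% rather than the reported 18.9%, a small difference that may reflect unrounded source data.

```python
# Values quoted in the text (degrees)
initial_error = {
    "flexion-extension": 23.9,
    "lateral bending": 15.5,
    "axial rotation": 19.3,
}
normal_rom = {
    "flexion-extension": 126,
    "lateral bending": 86,
    "axial rotation": 151,
}

def error_as_percent_of_rom(error_deg, rom_deg):
    """Express an error in degrees as a percentage of the normal range."""
    return 100.0 * error_deg / rom_deg

percent = {
    plane: round(error_as_percent_of_rom(initial_error[plane], normal_rom[plane]), 1)
    for plane in initial_error
}
```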

Training clearly improved the accuracy of visual estimates of cervical spine ROM. Estimates improved significantly in all planes immediately after training and remained significantly improved for flexion-extension (the plane of largest error initially) 1 month after training. Before training, mean errors varied across planes. Training normalized mean errors to about 15°, and this effect lasted in flexion-extension, lateral bending, and axial rotation (Figures 4A–4C). Of note, before training, percentage errors increased with increasing motion from neutral in the flexion-extension and lateral bending planes, so that percentage errors in estimates were greater at full ROM. After training, percentage errors did not increase appreciably with increasing motion.

Readers will naturally reflect on the clinical significance of the motion assessment improvements demonstrated after the training session described in this study. We must be aware that functional assessments are increasingly being emphasized in the clinical arena—with respect to clinical conditions, surgical outcomes, and functional impairments. We highlight a point made earlier: A difference of only 5° can affect impairment ratings in the medicolegal realm.1 In estimating flexion-extension motion, lasting improvements of almost 10° were demonstrated and maintained 1 month after the training session described in this study.

Nevertheless, mean errors in visual estimation remained at about 15° in all planes of motion, despite our modest improvements. This finding raises the question of whether visually estimated ROM should be pertinent to assessments of impairment and disability. Although visual estimates of ROM may have more utility as a screening test for impairment and disability, fine differences in ROM simply cannot be reliably assessed by visual estimation.

This study has limitations. First, it was conducted at a single institution where the evaluators received most of their training. Their skill in visually estimating cervical spine ROM may not be generalizable to a larger population of spine specialists who are practicing at other institutions and may have different training backgrounds.

Second, only healthy subjects were assessed. Some studies of cervical spine ROM have shown better reliability in symptomatic subjects relative to asymptomatic subjects.13,14 To attempt to overcome this limitation, we assessed many different excursions of motion that were often not to the extremes of motion.

Third, the “gold standard” we used for motion assessment was an electrogoniometer, which has some inherent error (previously validated mean [SD] error of 2.3° [2.6°] relative to radiographs8). Although obtaining radiographs of each movement would have more closely resembled the gold standard, the radiation dose associated with such a study is prohibitive.

Last, the assessors included medical students. The medical students’ estimates, however, tended to be more accurate than the residents’ or attending surgeons’ (though the difference was not statistically significant). This tendency may reflect the medical students’ closer attention to detail. Clearly, including medical students in the study did not negatively affect the accuracy of the estimates or the validity of our findings.

Conclusion

Despite its limitations, visual estimation of cervical spine motion remains the standard approach in clinical practice and is routinely recorded and reported. Mean errors before training ranged from 15.5° to 23.9°, depending on the plane of motion being assessed, but these improved after a training session.

Visual estimates of motion in flexion-extension were most improved by training, as the initial errors in this plane were the largest. Statistically significant improvement of about 10° remained for flexion-extension motion estimates 1 month after training.

At a time when functional outcomes are increasingly emphasized, such a degree of improvement could be of clinical significance. Our study results support a call for more formalized training in ROM assessment. Clinicians should also be aware of the limitations of visual estimates of cervical spine ROM, and our results support scrutiny of visual assessment of ROM as a criterion for diagnosing permanent impairment or disability.

References

1. Rondinelli RD, Genovese E, Brigham CR; American Medical Association. Guides to the Evaluation of Permanent Impairment. 6th ed. Chicago, IL: American Medical Association; 2008.

2. Hall TM, Briffa K, Hopper D, Robinson K. Comparative analysis and diagnostic accuracy of the cervical flexion-rotation test. J Headache Pain. 2010;11(5):391-397.

3. De Hertogh WJ, Vaes PH, Vijverman V, De Cordt A, Duquet W. The clinical examination of neck pain patients: the validity of a group of tests. Man Ther. 2007;12(1):50-55.

4. Koller H, Resch H, Acosta F, et al. Assessment of two measurement techniques of cervical spine and C1–C2 rotation in the outcome research of axis fractures: a morphometrical analysis using dynamic computed tomography scanning. Spine. 2010;35(3):286-290.

5. Garrett TR, Youdas JW, Madson TJ. Reliability of measuring forward head posture in a clinical setting. J Orthop Sports Phys Ther. 1993;17(3):155-160.

6. Pearcy MJ, Tibrewal SB. Axial rotation and lateral bending in the normal lumbar spine measured by three-dimensional radiography. Spine. 1984;9(6):582-587.

7. Hayes MA, Howard TC, Gruel CR, Kopta JA. Roentgenographic evaluation of lumbar spine flexion-extension in asymptomatic individuals. Spine. 1989;14(3):327-331.

8. Bible JE, Biswas D, Miller CP, Whang PG, Grauer JN. Normal functional range of motion of the cervical spine during 15 activities of daily living. J Spinal Disord Tech. 2010;23(1):15-21.

9. Penning L. Normal movements of the cervical spine. AJR Am J Roentgenol. 1978;130(2):317-326.

10. Mayer TG, Tencer AF, Kristoferson S, Mooney V. Use of noninvasive techniques for quantification of spinal range-of-motion in normal subjects and chronic low-back dysfunction patients. Spine. 1984;9(6):588-595.

11. Williams MA, McCarthy CJ, Chorti A, Cooke MW, Gates S. A systematic review of reliability and validity studies of methods for measuring active and passive cervical range of motion. J Manipulative Physiol Ther. 2010;33(2):138-155.

12. Schaufele MK, Boden SD. Physical function measurements in neck pain. Phys Med Rehabil Clin North Am. 2003;14(3):569-588.

13. Fjellner A, Bexander C, Faleij R, Strender LE. Interexaminer reliability in physical examination of the cervical spine. J Manipulative Physiol Ther. 1999;22(8):511-516.

14. Nilsson N, Christensen HW, Hartvigsen J. The interexaminer reliability of measuring passive cervical range of motion, revisited. J Manipulative Physiol Ther. 1996;19(5):302-305.

15. Pool JJ, Hoving JL, de Vet HC, van Mameren H, Bouter LM. The interexaminer reproducibility of physical examination of the cervical spine. J Manipulative Physiol Ther. 2004;27(2):84-90.

16. Strender LE, Lundin M, Nell K. Interexaminer reliability in physical examination of the neck. J Manipulative Physiol Ther. 1997;20(8):516-520.

17. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98-104.

18. Whitcroft KL, Massouh L, Amirfeyz R, Bannister G. Comparison of methods of measuring active cervical range of motion. Spine. 2010;35(19):E976-E980.

19. de Koning CH, van den Heuvel SP, Staal JB, Smits-Engelsman BC, Hendriks EJ. Clinimetric evaluation of active range of motion measures in patients with non-specific neck pain: a systematic review. Eur Spine J. 2008;17(7):905-921.

20. Christensen HW, Nilsson N. The reliability of measuring active and passive cervical range of motion: an observer-blinded and randomized repeated-measures design. J Manipulative Physiol Ther. 1998;21(5):341-347.

21. Florêncio LL, Pereira PA, Silva ER, Pegoretti KS, Gonçalves MC, Bevilaqua-Grossi D. Agreement and reliability of two non-invasive methods for assessing cervical range of motion among young adults. Rev Bras Fisioter. 2010;14(2):175-181.

22. Lea RD, Gerhardt JJ. Range-of-motion measurements. J Bone Joint Surg Am. 1995;77(5):784-798.

23. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98-104.

24. Greene WB, Netter FH. Netter’s Orthopaedics. Philadelphia, PA: Saunders Elsevier; 2006.

25. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420-428.

26. Brosseau L, Balmer S, Tousignant M, et al. Intra- and intertester reliability and criterion validity of the parallelogram and universal goniometers for measuring maximum active knee flexion and extension of patients with knee restrictions. Arch Phys Med Rehabil. 2001;82(3):396-402.

Author and Disclosure Information

Brandon P. Hirsch, MD, Matthew L. Webb, AB, Daniel D. Bohl, MHS, Michael Fu, MD, Rafael A. Buerba, MD, Jordan A. Gruskay, BA, and Jonathan N. Grauer, MD

Authors’ Disclosure Statement: The authors report no actual or potential conflict of interest in relation to this article.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
E261-E265

Assessment of cervical spine range of motion (ROM) is an integral aspect of the physical examination for cervical conditions,1-3 surgical outcomes,4 and functional impairment.1 In fact, the emphasis being placed on such functional measures before and after treatments is increasing.4,5

Cervical spine range of motion is routinely used as an outcome measure in clinical studies.6-8 Underscoring the importance of defining cervical spine ROM, studies have found it to be a preoperative predictor of outcomes of anterior cervical surgery,9 and other studies have suggested it is a determinant of athletes’ return to play.10

Spinal ROM measurements can be used to determine the degree of disability experienced by a patient with a spinal condition as defined in the Guides to the Evaluation of Permanent Impairment by the American Medical Association (AMA).1 In the medicolegal realm, ROM measurements made by clinicians can influence the dollar amounts of awards in legal claims, and, according to the AMA guides, the difference in cervical spine ROM between normality and disability or impairment can be as little as 5°.

Although cervical spine ROM is routinely assessed and documented in clinical practice, no universal protocol exists for its evaluation.11,12 In fact, considerable inter-examiner variation in visual estimates of ROM has been found,13-16 and significant inaccuracies have been reported.17,18

Goniometers have been shown to be reliable and highly accurate, with low inter-examiner and intra-examiner variability.5,19-21 Nevertheless, logistics22 and costs21 generally limit their acceptance in routine clinical practice. Among the many methods available for assessing ROM, visual estimation is the least reliable and least accurate,23 but it is the quickest and least expensive and is recommended in textbooks that describe the spine-specific physical examination.24 As a result, most clinicians assessing cervical spine ROM prefer visual estimates over goniometers.

We conducted a study to determine whether training could improve the accuracy of visual estimates. We compared the accuracy of visual estimates of cervical spine ROM with that of a radiographically validated electrogoniometer and then investigated whether accuracy and reliability of visual estimates could be improved with a session of instruction and demonstration. Assessments of accuracy were made immediately after and 1 month after this training session.

Materials and Methods

Assessments Made Before Training

This study was approved by our institution’s human investigation committee and was conducted in accordance with the ethical standards of that committee.

Cervical spine ROM was assessed by 8 examiners (2 attending spine surgeons, 4 orthopedic residents, 2 medical students). They were informed they would be participating in a study evaluating visual estimates of motion but were given no other information prior to the study.

Four healthy volunteer subjects (examiners who rotated through the role) were assessed. No subject reported any ongoing neck or spine discomfort or had had any previous spinal surgery. One at a time, subjects were fitted with a cervical harness electrogoniometer capable of measuring angulation of the cervical spine to the nearest degree (modified electrogoniometer, torsiometer, and display from Biometrics, Gwent, UK; Figures 1A, 1B). This electrogoniometer has been shown to have a mean (SD) error of 2.3° (2.6°) relative to radiographic assessments.8

With the electrogoniometer fitted, each subject was instructed to sit upright in a chair with his back to the backrest and his head neutrally positioned. The electrogoniometer was then zeroed, and the subject proceeded with 5 series of flexion-extension, left and right lateral bending, and left and right rotation movements. The subject was instructed to make 1 movement in full motion in each direction and the other 4 movements in less than full motion to yield a variety of excursions for assessment. Each subject was instructed to pause at the apex of each motion. During these pauses, the examiners recorded their visual estimates of movement in each direction while the investigator recorded degrees of motion (displayed by the electrogoniometer) in flexion-extension, lateral bending, and rotation (Figures 2A–2D). The electrogoniometer display was not visible to subjects or examiners.

A total of 840 independent visual estimates of 120 distinct movements were recorded.
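The totals reported here can be reproduced from the protocol. The following is a minimal sketch of that arithmetic in Python. It assumes (the text does not say so explicitly) that the examiner acting as subject did not estimate his own motion, so that 7 of the 8 examiners rated each movement. The variable names are illustrative only.

```python
# Protocol counts taken from the Methods text.
SUBJECTS = 4             # healthy volunteers drawn from the examiner group
SERIES_PER_SUBJECT = 5   # 1 full-motion series plus 4 partial-motion series
DIRECTIONS = 6           # flexion, extension, left/right bending, left/right rotation
EXAMINERS = 8

# Assumption (not stated in the article): the examiner serving as the
# subject does not rate his own movements.
RATERS_PER_MOVEMENT = EXAMINERS - 1

movements = SUBJECTS * SERIES_PER_SUBJECT * DIRECTIONS
estimates = movements * RATERS_PER_MOVEMENT
print(movements, estimates)
```

Under that assumption the counts agree with the 120 movements and 840 estimates reported for each round.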

Training, and Assessments Made Immediately Thereafter

After the first round of visual estimates, the 8 examiners were verbally instructed in cervical spine ROM assessment and were asked to observe 1 subject, fitted with the electrogoniometer, demonstrating partial and full cervical motions while the investigator announced the electrogoniometric measurements. The motions demonstrated included 15°, 30°, and the extremes of cervical spine ROM in each of 6 directions from neutral.

After this training session, each of the 4 subjects from the first round of assessments was again fitted with the harness electrogoniometer and instructed to repeat the movements in turn while examiners visually estimated cervical spine ROM and independently recorded their estimates. Meanwhile, the investigator recorded the degree of motion during each movement (as measured by the electrogoniometer). Again, a total of 840 independent visual estimates of 120 distinct movements were recorded.

Assessments Made 1 Month After Training

One month after the training session, the examiners and the investigator reconvened to assess the same 4 subjects using a procedure for simultaneous visual estimation and electrogoniometric measurement identical to that used 1 month earlier. No additional training was given. Again, 840 independent visual estimates of 120 distinct movements were recorded.

Data Analysis

The reliabilities of visual estimates were analyzed by calculating intraclass correlation coefficients (ICCs) using 1-way random-effects analyses of variance. By convention, ICCs of < 0.2, 0.2 to 0.39, 0.4 to 0.59, 0.6 to 0.8, and > 0.8 correspond to poor, fair, moderate, substantial, and perfect reliability, respectively.25
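For readers unfamiliar with the calculation, the sketch below shows one common form of the 1-way random-effects ICC, the single-rater ICC(1,1) of Shrout and Fleiss, together with the reliability bands quoted above. It is an illustration only: the article does not state which ICC form SPSS was asked to report, the function names are hypothetical, and the ratings are invented.

```python
from statistics import mean

def icc_one_way(ratings):
    """Single-rater ICC(1,1) from a 1-way random-effects ANOVA.

    ratings: one row per movement, each row holding the k examiners'
    estimates of that movement (all rows the same length).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Between-movement and within-movement mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2
              for row, m in zip(ratings, row_means)
              for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def reliability_label(icc):
    """Map an ICC to the bands quoted in the text (0.39-0.4 treated as fair)."""
    if icc < 0.2:
        return "poor"
    if icc < 0.4:
        return "fair"
    if icc < 0.6:
        return "moderate"
    if icc <= 0.8:
        return "substantial"
    return "perfect"

# Invented example: 4 movements, each estimated by 3 examiners (degrees).
example = [[30, 35, 40], [60, 55, 65], [90, 80, 85], [45, 50, 40]]
icc = icc_one_way(example)
print(round(icc, 3), reliability_label(icc))
```

Applied to the range reported in the Results, `reliability_label(0.51)` falls in the moderate band and `reliability_label(0.79)` in the substantial band.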

We compared the visual estimates and electrogoniometric measurements made for 3 planes of motion (flexion-extension, lateral bending, axial rotation) before, immediately after, and 1 month after training and drew trend lines generated by linear regression relative to a line of perfect correlation.
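As an illustration of the trend-line comparison, the sketch below fits an ordinary least-squares line to paired measurements and estimates. A slope of 1 with an intercept of 0 corresponds to the line of perfect correlation. The data and the function name are hypothetical and are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: electrogoniometer readings and visual estimates (degrees).
measured = [10, 20, 30, 40, 50]
estimated = [12, 26, 38, 52, 64]

slope, intercept = fit_line(measured, estimated)
# A slope above 1 means overestimation grows with larger motions.
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In this invented example the slope exceeds 1, the pattern that would indicate overestimation at larger degrees of motion.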

Mean errors in examiners’ visual estimates (relative to electrogoniometric measurements) made before, immediately after, and 1 month after training were calculated. Paired Student t tests were then used to compare the mean errors before training with the mean errors immediately after and 1 month after training.
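A minimal sketch of these two steps follows. It assumes that "error" means the absolute difference between an estimate and the electrogoniometer reading, and that the pairing unit is the examiner. Neither detail is specified in the text. The numbers are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def mean_abs_error(estimates, measurements):
    """Mean absolute difference between estimates and reference readings."""
    return mean(abs(e - m) for e, m in zip(estimates, measurements))

def paired_t(before, after):
    """Paired Student t statistic for before-minus-after differences."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Invented per-examiner mean errors (degrees) before and after training.
errors_before = [25.0, 22.0, 24.0, 26.0]
errors_after = [13.0, 12.0, 11.0, 14.0]

t_stat = paired_t(errors_before, errors_after)
print(f"t = {t_stat:.2f} with {len(errors_before) - 1} degrees of freedom")
```

Converting the statistic to a P value requires the t distribution with n - 1 degrees of freedom. In Python, `scipy.stats.ttest_rel` performs the whole test in one call.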

All analyses were performed with SPSS for Windows 16.0 (SPSS, Chicago, Illinois).

Results

Inter-examiner reliability (ICC) of the visual estimates in all planes of motion ranged from 0.51 to 0.79 (suggestive of moderate to substantial reliability). For reference, standard goniometers measuring knee ROM have inter-examiner ICCs of 0.89 to 0.98 (suggestive of perfect reliability).26 The ICCs before, immediately after, and 1 month after training were not significantly different.

As expected, there were significant errors in visual estimates of cervical spine ROM in all planes. Initial errors in visual estimates (relative to electrogoniometric measurements) were 23.9° (flexion-extension), 15.5° (lateral bending), and 19.3° (axial rotation) (Table, Figure 3).

Immediately after training, mean errors in visual estimates decreased to 12.0° (flexion-extension), 11.7° (lateral bending), and 16.4° (axial rotation) (Table, Figure 3). In all 3 planes of cervical motion, these improvements were statistically significant.

One month after training, mean errors in visual estimates were 14.4° (flexion-extension), 13.9° (lateral bending), and 16.2° (axial rotation) (Table, Figure 3). Only the improvement in the estimate of flexion-extension (the direction of the largest error initially) remained statistically significant—a 39.7% decrease in error.
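The 39.7% figure can be checked directly from the reported means. The short sketch below recomputes the absolute and relative reductions in mean error between baseline and 1 month for each plane. The dictionary and function names are illustrative.

```python
# Mean errors in degrees, as reported in the Results.
BEFORE = {"flexion-extension": 23.9, "lateral bending": 15.5, "axial rotation": 19.3}
ONE_MONTH = {"flexion-extension": 14.4, "lateral bending": 13.9, "axial rotation": 16.2}

def error_reduction(before, after):
    """Return (absolute reduction in degrees, relative reduction in percent)."""
    absolute = before - after
    return absolute, 100.0 * absolute / before

for plane in BEFORE:
    absolute, relative = error_reduction(BEFORE[plane], ONE_MONTH[plane])
    print(f"{plane}: {absolute:.1f} degrees ({relative:.1f}%)")
```

For flexion-extension this gives a reduction of 9.5° from a baseline of 23.9°, or 39.7%.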

We also considered how errors varied with degree of motion observed. In flexion-extension, the tendency to overestimate at larger degrees of motion was not apparent after training, and 1 month after training we found a tendency to underestimate at smaller degrees of motion (Figure 4A). The tendency to overestimate lateral bending before training did not persist immediately after or 1 month after training (Figure 4B). Estimates of axial rotation correlated well with goniometer measurements before training and were also well correlated immediately after and 1 month after training (Figure 4C).

Discussion

Visual estimation of spinal motion is unreliable and inaccurate, but its widespread use in clinical practice continues. Goniometers are far more accurate and reliable but are seldom used. We investigated whether a training session featuring verbal instruction and demonstration with an electrogoniometer could improve visual estimates and whether potential improvement in visual estimates would remain 1 month after training.

Widely variable ICCs (0.42-0.90) have been reported for visual estimates of cervical spine ROM.17,18,22 Our findings on the reliability of these estimates are consistent with the literature.

We recorded the greatest initial error in estimates of motion in flexion-extension. Previous studies have also found the greatest error and least reliability in visual estimates in this plane.14,15,18 Visual estimation may be more difficult in flexion-extension because the shoulders cannot be used as landmarks, whereas they serve as approximate 90° reference points during estimation of lateral bending and axial rotation. Demonstration of 15°, 30°, and the extremes of ROM during the training session may have provided alternative reference points for visual estimation after training, decreasing the error to within the range found in the other planes of motion.

Initial errors in visual estimates were 23.9° (flexion-extension), 15.5° (lateral bending), and 19.3° (axial rotation). Based on normative cervical spine ROM in a healthy population— 126° ± 12° for flexion-extension, 86° ± 5° for lateral bending, 151° ± 23° for axial rotation22—the errors we identified are 18.9% of the normal range of flexion-extension, 18.0% of lateral bending, and 12.8% of axial rotation.
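The percentages above are each plane's mean error divided by its normative range. The sketch below recomputes them from the rounded values printed in the text. Recomputing from those rounded values gives about 19.0% for flexion-extension rather than the reported 18.9%. That small difference may reflect rounding of the published means, which is an assumption here. The names are illustrative.

```python
# Initial mean errors and normative ranges (degrees), as quoted in the text.
MEAN_ERROR = {"flexion-extension": 23.9, "lateral bending": 15.5, "axial rotation": 19.3}
NORMAL_ROM = {"flexion-extension": 126.0, "lateral bending": 86.0, "axial rotation": 151.0}

def percent_of_range(error, normal_range):
    """Express an error as a percentage of the normal range of motion."""
    return 100.0 * error / normal_range

shares = {
    plane: round(percent_of_range(MEAN_ERROR[plane], NORMAL_ROM[plane]), 1)
    for plane in MEAN_ERROR
}
print(shares)
```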

Training clearly improved the accuracy of visual estimates of cervical spine ROM. Estimates were significantly improved for all planes immediately after training and remained significantly improved for flexion-extension (the plane of largest error initially) 1 month after training. Before training, mean errors varied across planes. Training normalized mean errors to about 15°, and this effect lasted in flexion-extension, lateral bending, and axial rotation (Figures 4A–4C). Of note, before training, percentage errors increased with increasing motion from neutral in the flexion-extension and lateral bending planes, so percentage errors were greatest at full ROM. After training, percentage errors did not increase appreciably with increasing motion.

Readers will naturally reflect on the clinical significance of the motion assessment improvements demonstrated after the training session described in this study. We must be aware that functional assessments are increasingly being emphasized in the clinical arena—with respect to clinical conditions, surgical outcomes, and functional impairments. We highlight a point made earlier: A difference of only 5° can affect impairment ratings in the medicolegal realm.1 In estimating flexion-extension motion, lasting improvements of almost 10° were demonstrated and maintained 1 month after the training session described in this study.

Nevertheless, mean errors in visual estimation remained at about 15° in all planes of motion, despite our modest improvements. This finding raises the question of whether visually estimated ROM should be pertinent to assessments of impairment and disability. Although visual estimates of ROM may have more utility as a screening test for impairment and disability, fine differences in ROM simply cannot be reliably assessed by visual estimation.

This study has limitations. First, it was conducted at a single institution where the evaluators received most of their training. Their skill in visually estimating cervical spine ROM may not be generalizable to a larger population of spine specialists who are practicing at other institutions and may have different training backgrounds.

Second, only healthy subjects were assessed. Some studies of cervical spine ROM have shown better reliability in symptomatic subjects relative to asymptomatic subjects.13,14 To attempt to overcome this limitation, we assessed many different excursions of motion that were often not to the extremes of motion.

Third, the “gold standard” we used for motion assessment was an electrogoniometer, which has some inherent error (previously validated mean [SD] error of 2.3° [2.6°] relative to radiographs8). Although obtaining radiographs of each movement would have more closely resembled the gold standard, the radiation dose associated with such a study is prohibitive.

Last, the assessors included medical students. The medical students’ estimates, however, tended to be more accurate than the residents’ or attending surgeons’ (though the difference was not statistically significant). This tendency may reflect the medical students’ closer attention to detail. Including medical students in the study therefore did not appear to negatively affect the accuracy of the estimates or the validity of our findings.

Conclusion

Despite its limitations, visual assessment of cervical spine motion remains the gold standard in clinical practice and is routinely recorded and reported. Mean errors ranged from 15.5° to 23.9°, depending on plane of motion being assessed, but these improved after a training session.

Visual estimates of motion in flexion-extension were most improved by training, as the initial errors in this plane were the largest. Statistically significant improvement of about 10° remained for flexion-extension motion estimates 1 month after training.

During a time when functional outcomes are increasingly emphasized, such a degree of improvement could be of clinical significance. Our study results support a call for more formalized training in ROM assessment. Clinicians should nevertheless be aware of the limitations of visual estimates of cervical spine ROM, and our results support scrutiny of visual assessment of ROM as a criterion for diagnosing permanent impairment or disability.

Assessment of cervical spine range of motion (ROM) is an integral aspect of the physical examination for cervical conditions,1-3 surgical outcomes,4 and functional impairment.1 In fact, the emphasis being placed on such functional measures before and after treatments is increasing.4,5

Cervical spine range of motion is routinely used as an outcome measure in clinical studies.6-8 Underscoring the importance of defining cervical spine ROM, studies have found it to be a preoperative predictor of outcomes of anterior cervical surgery,9 and other studies have suggested it is a determinant of athletes’ return to play.10

Spinal ROM measurements can be used to determine the degree of disability experienced by a patient with a spinal condition as defined in the Guides to the Evaluation of Permanent Impairment by the American Medical Association (AMA).1 In the medicolegal realm, ROM measurements made by clinicians can influence the dollar amounts of awards in legal claims, and, according to the AMA guides, the difference in cervical spine ROM between normality and disability or impairment can be as little as 5°.

Although cervical spine ROM is routinely assessed and documented in clinical practice, no universal protocol exists for its evaluation.11,12 In fact, considerable inter-examiner variation in visual estimates of ROM has been found,13-16 and significant inaccuracies have been reported.17,18

Goniometers have been shown to be reliable and highly accurate, with low inter-examiner and intra-examiner variability.5,19-21 Nevertheless, logistics22 and costs21 generally limit their being accepted in routine clinical practice. Among many methods available for assessing ROM, visual estimation is the least reliable or accurate,23 but it is the quickest and least expensive and is recommended in textbooks that describe the spinal-specific physical examination.24 Despite the superiority of goniometers in measuring ROM, these significant barriers have limited their use in clinical practice. When assessing cervical spine ROM, most clinicians prefer visual estimates over goniometers.

We conducted a study to determine whether training could improve the accuracy of visual estimates. We compared the accuracy of visual estimates of cervical spine ROM with that of a radiographically validated electrogoniometer and then investigated whether accuracy and reliability of visual estimates could be improved with a session of instruction and demonstration. Assessments of accuracy were made immediately after and 1 month after this training session.

Materials and Methods

Assessments Made Before Training

This study was approved by our institution’s human investigation committee and was conducted in accordance with the ethical standards of that committee.

Cervical spine ROM was assessed by 8 examiners (2 attending spine surgeons, 4 orthopedic residents, 2 medical students). They were informed they would be participating in a study evaluating visual estimates of motion but were given no other information prior to the study.

Four healthy volunteer subjects (examiners who rotated through the role) were assessed. No subject reported any ongoing neck or spine discomfort or had had any previous spinal surgery. One at a time, subjects were fitted with a cervical harness electrogoniometer capable of measuring angulation of the cervical spine to the nearest degree (modified electrogoniometer, torsiometer, and display from Biometrics, Gwent, UK; Figures 1A, 1B). This electrogoniometer has been shown to have a mean (SD) error of 2.3° (2.6°) relative to radiographic assessments.8

With the electrogoniometer fitted, each subject was instructed to sit upright in a chair with his back to the backrest and his head neutrally positioned. The electrogoniometer was then zeroed, and the subject proceeded with 5 series of flexion-extension, left and right lateral bending, and left and right rotation movements. The subject was instructed to make 1 movement in full motion in each direction and the other 4 movements in less than full motion to yield a variety of excursions for assessment. Each subject was instructed to pause at the apex of each motion. During these pauses, the examiners recorded their visual estimates of movement in each direction while the investigator recorded degrees of motion (displayed by the electrogoniometer) in flexion-extension, lateral bending, and rotation (Figures 2A–2D). The electrogoniometer display was not visible to subjects or examiners.

A total of 840 independent visual estimates of 120 distinct movements were recorded.

Training, and Assessments Made Immediately Thereafter

After the first round of visual estimates, the 8 examiners were verbally instructed in cervical spine ROM assessment and were asked to observe 1 subject, fitted with the electrogoniometer, demonstrating partial and full cervical motions while the investigator announced the electrogoniometric measurements. The motions demonstrated included 15°, 30°, and the extremes of cervical spine ROM in each of 6 directions from neutral.

 

 

After this training session, each of the 4 subjects from the first round of assessments was again fitted with the harness electrogoniometer and instructed to repeat the movements in turn while examiners visually estimated cervical spine ROM and independently recorded their estimates. Meanwhile, the investigator recorded the degree of motion during each movement (as measured by the electrogoniometer). Again, a total of 840 independent visual estimates of 120 distinct movements were recorded.

Assessments Made 1 Month After Training

One month after the training session, the examiners and the investigator reconvened to assess the same 4 subjects using a procedure for simultaneous visual estimation and electrogoniometric measurement identical to that used 1 month earlier. No additional training was given. Again, 840 independent visual estimates of 120 distinct movements were recorded.

Data Analysis

The reliabilities of visual estimates were analyzed by calculating the intraclass coefficients (ICCs) using random-effect 1-way analyses of variance. By convention, ICCs of < 0.2, 0.2 to 0.39, 0.4 to 0.59, 0.6 to 0.8, and > 0.8 correspond to poor, fair, moderate, substantial, and perfect reliability, respectively.25

We compared the visual estimates and electrogoniometric measurements made for 3 planes of motion (flexion-extension, lateral bending, axial rotation) before, immediately after, and 1 month after training and drew trend lines generated by linear regression relative to a line of perfect correlation.

Mean errors in examiners’ visual estimates (relative to elec­trogoniometric measurements) made before, immediately after, and 1 month after training were calculated. Paired Student t tests were then used to compare the mean errors before training with the mean errors immediately after and 1 month after training.

All analyses were performed with SPSS for Windows 16.0 (SPSS, Chicago, Illinois).

Results

Inter-examiner reliability of the visual estimates in all planes of motion ranged from 0.51 to 0.79 (suggestive of moderate to substantial reliability). For reference, standard goniometers measuring knee ROM have inter-examiner ICCs of 0.89 to 0.9826 (suggestive of perfect reliability). The ICCs before, immediately after, and 1 month after training were not significantly different. 

As expected, there were significant errors in visual estimates of cervical spine ROM in all planes. Initial errors in visual estimates (relative to electrogoniometric measurements) were 23.9° (flexion-extension), 15.5° (lateral bending), and 19.3° (axial rotation) (Table, Figure 3).

Immediately after training, mean errors in visual estimates decreased to 12.0° (flexion-extension), 11.7° (lateral bending), and 16.4° (axial rotation) (Table, Figure 3). In all 3 planes of cervical motion, these improvements were statistically significant.

One month after training, mean errors in visual estimates were 14.4° (flexion-extension), 13.9° (lateral bending), and 16.2° (axial rotation) (Table, Figure 3). Only the improvement in the estimate of flexion-extension (the direction of the largest error initially) remained statistically significant—a 39.7% decrease in error.

We also considered how errors varied with degree of motion observed. In flexion-extension, the tendency to overestimate at larger degrees of motion was not apparent after training, and 1 month after training we found a tendency to underestimate at smaller degrees of motion (Figure 4A). The tendency to overestimate lateral bending before training did not persist immediately after or 1 month after training (Figure 4B). Estimates of axial rotation correlated well with goniometer measurements before training and were also well correlated immediately after and 1 month after training (Figure 4C).

Discussion

Visual estimation of spinal motion is unreliable and inaccurate, but its widespread use in clinical practice continues. Goniometers are far more accurate and reliable but are seldom used. We investigated whether a training session featuring verbal instruction and demonstration with an electrogoniometer could improve visual estimates and whether potential improvement in visual estimates would remain 1 month after training.

Widely variable ICCs (0.42-0.90) have been reported for visual estimates of cervical spine ROM.17,18,22 Our findings on the reliability of these estimates are consistent with the literature.

We recorded the greatest initial error in estimates of motion in flexion-extension. Previous studies have also found the greatest error and least reliability in visual estimates in this plane.14,15,18 Visual estimation may be more difficult in flexion-extension because the shoulders cannot be used as landmarks, whereas they serve as approximate 90° reference points during estimation of lateral bending and axial rotation. Demonstration of 15°, 30° and the extremes of ROM during the training session may have provided alternative reference points during visual estimation after training—decreasing the error to within the range found in other planes of motion.

Initial errors in visual estimates were 23.9° (flexion-extension), 15.5° (lateral bending), and 19.3° (axial rotation). Based on normative cervical spine ROM in a healthy population— 126° ± 12° for flexion-extension, 86° ± 5° for lateral bending, 151° ± 23° for axial rotation22—the errors we identified are 18.9% of the normal range of flexion-extension, 18.0% of lateral bending, and 12.8% of axial rotation.

 

 

Training clearly improved the accuracy of visual estimates of cervical spine ROM. Estimates were statistically improved for all planes immediately after training and remained significantly improved for flexion-extension (the plane of largest error initially) 1 month after training. Before training, mean errors varied across planes. Training normalized mean errors to about 15°, and this effect lasted in flexion-extension, lateral bending, and axial rotation (Figures 4A–4C). Of note, before training these percentage errors increased with increased motion from neutral in the flexion-extension and lateral bending planes. At full ROM, percentage errors in estimates were greater. After training, percentage errors did not increase appreciably with increasing motion.

Readers will naturally reflect on the clinical significance of the motion assessment improvements demonstrated after the training session described in this study. We must be aware that functional assessments are increasingly being emphasized in the clinical arena—with respect to clinical conditions, surgical outcomes, and functional impairments. We highlight a point made earlier: A difference of only 5° can affect impairment ratings in the medicolegal realm.1 In estimating flexion-extension motion, lasting improvements of almost 10° were demonstrated and maintained 1 month after the training session described in this study.

Nevertheless, mean errors in visual estimation remained at about 15° in all planes of motion, despite our modest improvements. This finding raises the question of whether visually estimated ROM should be pertinent to assessments of impairment and disability. Although visual estimates of ROM may have more utility as a screening test for impairment and disability, fine differences in ROM simply cannot be reliably assessed by visual estimation.

This study has limitations. First, it was conducted at a single institution where the evaluators received most of their training. Their skill in visually estimating cervical spine ROM may not be generalizable to a larger population of spine specialists who are practicing at other institutions and may have different training backgrounds.

Second, only healthy subjects were assessed. Some studies of cervical spine ROM have shown better reliability in symptomatic subjects relative to asymptomatic subjects.13,14 To attempt to overcome this limitation, we assessed many different excursions of motion that were often not to the extremes of motion.

Third, the “gold standard” we used for motion assessment was an electrogoniometer, which has some inherent error (previously validated mean [SD] error of 2.3° [2.6°] relative to radiographs8). Although obtaining radiographs of each movement would have more closely resembled the gold standard, the radiation dose associated with such a study is prohibitive.

Last, the assessors included medical students. The medical students’ estimates, however, tended to be more accurate than the residents’ or attending surgeons’ (though the difference was not statistically significant). This tendency may reflect the medical students’ closer attention to detail. Including medical students in the study therefore did not appear to negatively affect the accuracy of the estimates or the validity of our findings.

Conclusion

Despite its limitations, visual assessment of cervical spine motion remains the gold standard in clinical practice and is routinely recorded and reported. Before training, mean errors ranged from 15.5° to 23.9°, depending on the plane of motion being assessed, but these errors improved after a training session.

Visual estimates of motion in flexion-extension were most improved by training, as the initial errors in this plane were the largest. Statistically significant improvement of about 10° remained for flexion-extension motion estimates 1 month after training.

During a time when we are increasingly emphasizing functional outcomes, such a degree of improvement could be of clinical significance. Our study results support a call for more formalized training in ROM assessment. Clinicians should also be aware of the limitations of visual estimates of cervical spine ROM, and our results support scrutiny of visual assessment of ROM as a criterion for diagnosing permanent impairment or disability.

References

1. Rondinelli RD, Genovese E, Brigham CR; American Medical Association. Guides to the Evaluation of Permanent Impairment. 6th ed. Chicago, IL: American Medical Association; 2008.

2. Hall TM, Briffa K, Hopper D, Robinson K. Comparative analysis and diagnostic accuracy of the cervical flexion-rotation test. J Headache Pain. 2010;11(5):391-397.

3. De Hertogh WJ, Vaes PH, Vijverman V, De Cordt A, Duquet W. The clinical examination of neck pain patients: the validity of a group of tests. Man Ther. 2007;12(1):50-55.

4. Koller H, Resch H, Acosta F, et al. Assessment of two measurement techniques of cervical spine and C1–C2 rotation in the outcome research of axis fractures: a morphometrical analysis using dynamic computed tomography scanning. Spine. 2010;35(3):286-290.

5. Garrett TR, Youdas JW, Madson TJ. Reliability of measuring forward head posture in a clinical setting. J Orthop Sports Phys Ther. 1993;17(3):155-160.

6. Pearcy MJ, Tibrewal SB. Axial rotation and lateral bending in the normal lumbar spine measured by three-dimensional radiography. Spine. 1984;9(6):582-587.

7. Hayes MA, Howard TC, Gruel CR, Kopta JA. Roentgenographic evaluation of lumbar spine flexion-extension in asymptomatic individuals. Spine. 1989;14(3):327-331.

8. Bible JE, Biswas D, Miller CP, Whang PG, Grauer JN. Normal functional range of motion of the cervical spine during 15 activities of daily living. J Spinal Disord Tech. 2010;23(1):15-21.

9. Penning L. Normal movements of the cervical spine. AJR Am J Roentgenol. 1978;130(2):317-326.

10. Mayer TG, Tencer AF, Kristoferson S, Mooney V. Use of noninvasive techniques for quantification of spinal range-of-motion in normal subjects and chronic low-back dysfunction patients. Spine. 1984;9(6):588-595.

11. Williams MA, McCarthy CJ, Chorti A, Cooke MW, Gates S. A systematic review of reliability and validity studies of methods for measuring active and passive cervical range of motion. J Manipulative Physiol Ther. 2010;33(2):138-155.

12. Schaufele MK, Boden SD. Physical function measurements in neck pain. Phys Med Rehabil Clin North Am. 2003;14(3):569-588.

13. Fjellner A, Bexander C, Faleij R, Strender LE. Interexaminer reliability in physical examination of the cervical spine. J Manipulative Physiol Ther. 1999;22(8):511-516.

14. Nilsson N, Christensen HW, Hartvigsen J. The interexaminer reliability of measuring passive cervical range of motion, revisited. J Manipulative Physiol Ther. 1996;19(5):302-305.

15. Pool JJ, Hoving JL, de Vet HC, van Mameren H, Bouter LM. The interexaminer reproducibility of physical examination of the cervical spine. J Manipulative Physiol Ther. 2004;27(2):84-90.

16. Strender LE, Lundin M, Nell K. Interexaminer reliability in physical examination of the neck. J Manipulative Physiol Ther. 1997;20(8):516-520.

17. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98-104.

18. Whitcroft KL, Massouh L, Amirfeyz R, Bannister G. Comparison of methods of measuring active cervical range of motion. Spine. 2010;35(19):E976-E980.

19. de Koning CH, van den Heuvel SP, Staal JB, Smits-Engelsman BC, Hendriks EJ. Clinimetric evaluation of active range of motion measures in patients with non-specific neck pain: a systematic review. Eur Spine J. 2008;17(7):905-921.

20. Christensen HW, Nilsson N. The reliability of measuring active and passive cervical range of motion: an observer-blinded and randomized repeated-measures design. J Manipulative Physiol Ther. 1998;21(5):341-347.

21. Florêncio LL, Pereira PA, Silva ER, Pegoretti KS, Gonçalves MC, Bevilaqua-Grossi D. Agreement and reliability of two non-invasive methods for assessing cervical range of motion among young adults. Rev Bras Fisioter. 2010;14(2):175-181.

22. Lea RD, Gerhardt JJ. Range-of-motion measurements. J Bone Joint Surg Am. 1995;77(5):784-798.

23. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98-104.

24. Greene WB, Netter FH. Netter’s Orthopaedics. Philadelphia, PA: Saunders Elsevier; 2006.

25. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420-428.

26. Brosseau L, Balmer S, Tousignant M, et al. Intra- and intertester reliability and criterion validity of the parallelogram and universal goniometers for measuring maximum active knee flexion and extension of patients with knee restrictions. Arch Phys Med Rehabil. 2001;82(3):396-402.

References

1. Rondinelli RD, Genovese E, Brigham CR; American Medical Association. Guides to the Evaluation of Permanent Impairment. 6th ed. Chicago, IL: American Medical Association; 2008.

2. Hall TM, Briffa K, Hopper D, Robinson K. Comparative analysis and diagnostic accuracy of the cervical flexion-rotation test. J Headache Pain. 2010;11(5):391-397.

3. De Hertogh WJ, Vaes PH, Vijverman V, De Cordt A, Duquet W. The clinical examination of neck pain patients: the validity of a group of tests. Man Ther. 2007;12(1):50-55.

4. Koller H, Resch H, Acosta F, et al. Assessment of two measurement techniques of cervical spine and C1–C2 rotation in the outcome research of axis fractures: a morphometrical analysis using dynamic computed tomography scanning. Spine. 2010;35(3):286-290.

5. Garrett TR, Youdas JW, Madson TJ. Reliability of measuring forward head posture in a clinical setting. J Orthop Sports Phys Ther. 1993;17(3):155-160.

6. Pearcy MJ, Tibrewal SB. Axial rotation and lateral bending in the normal lumbar spine measured by three-dimensional radiography. Spine. 1984;9(6):582-587.

7. Hayes MA, Howard TC, Gruel CR, Kopta JA. Roentgenographic evaluation of lumbar spine flexion-extension in asymptomatic individuals. Spine. 1989;14(3):327-331.

8. Bible JE, Biswas D, Miller CP, Whang PG, Grauer JN. Normal functional range of motion of the cervical spine during 15 activities of daily living. J Spinal Disord Tech. 2010;23(1):15-21.

9. Penning L. Normal movements of the cervical spine. AJR Am J Roentgenol. 1978;130(2):317-326.

10. Mayer TG, Tencer AF, Kristoferson S, Mooney V. Use of noninvasive techniques for quantification of spinal range-of-motion in normal subjects and chronic low-back dysfunction patients. Spine. 1984;9(6):588-595.

11. Williams MA, McCarthy CJ, Chorti A, Cooke MW, Gates S. A systematic review of reliability and validity studies of methods for measuring active and passive cervical range of motion. J Manipulative Physiol Ther. 2010;33(2):138-155.

12. Schaufele MK, Boden SD. Physical function measurements in neck pain. Phys Med Rehabil Clin North Am. 2003;14(3):569-588.

13. Fjellner A, Bexander C, Faleij R, Strender LE. Interexaminer reliability in physical examination of the cervical spine. J Manipulative Physiol Ther. 1999;22(8):511-516.

14. Nilsson N, Christensen HW, Hartvigsen J. The interexaminer reliability of measuring passive cervical range of motion, revisited. J Manipulative Physiol Ther. 1996;19(5):302-305.

15. Pool JJ, Hoving JL, de Vet HC, van Mameren H, Bouter LM. The interexaminer reproducibility of physical examination of the cervical spine. J Manipulative Physiol Ther. 2004;27(2):84-90.

16. Strender LE, Lundin M, Nell K. Interexaminer reliability in physical examination of the neck. J Manipulative Physiol Ther. 1997;20(8):516-520.

17. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98-104.

18. Whitcroft KL, Massouh L, Amirfeyz R, Bannister G. Comparison of methods of measuring active cervical range of motion. Spine. 2010;35(19):E976-E980.

19. de Koning CH, van den Heuvel SP, Staal JB, Smits-Engelsman BC, Hendriks EJ. Clinimetric evaluation of active range of motion measures in patients with non-specific neck pain: a systematic review. Eur Spine J. 2008;17(7):905-921.

20. Christensen HW, Nilsson N. The reliability of measuring active and passive cervical range of motion: an observer-blinded and randomized repeated-measures design. J Manipulative Physiol Ther. 1998;21(5):341-347.

21. Florêncio LL, Pereira PA, Silva ER, Pegoretti KS, Gonçalves MC, Bevilaqua-Grossi D. Agreement and reliability of two non-invasive methods for assessing cervical range of motion among young adults. Rev Bras Fisioter. 2010;14(2):175-181.

22. Lea RD, Gerhardt JJ. Range-of-motion measurements. J Bone Joint Surg Am. 1995;77(5):784-798.

23. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98-104.

24. Greene WB, Netter FH. Netter’s Orthopaedics. Philadelphia, PA: Saunders Elsevier; 2006.

25. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420-428.

26. Brosseau L, Balmer S, Tousignant M, et al. Intra- and intertester reliability and criterion validity of the parallelogram and universal goniometers for measuring maximum active knee flexion and extension of patients with knee restrictions. Arch Phys Med Rehabil. 2001;82(3):396-402.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
E261-E265
Display Headline
Improving Visual Estimates of Cervical Spine Range of Motion
Legacy Keywords
american journal of orthopedics, AJO, original study, online exclusive, visual estimates, cervical spine, spine, range of motion, ROM, cervical, clinical, surgical outcomes, hirsch, webb, bohl, fu, buerba, gruskay, grauer

Does a Prior Hip Arthroscopy Affect Clinical Outcomes in Metal-on-Metal Hip Resurfacing Arthroplasty?

Article Type
Changed
Thu, 09/19/2019 - 13:39
Display Headline
Does a Prior Hip Arthroscopy Affect Clinical Outcomes in Metal-on-Metal Hip Resurfacing Arthroplasty?

Metal-on-metal hip resurfacing arthroplasty (HRA) remains an alternative to total hip arthroplasty (THA) in appropriately selected, younger, active adults with degenerative hip disease.1-4 While concerns remain regarding the potential for adverse local tissue reactions from wear of the metal-on-metal bearing surface,5-8 10-year data from the Australian Orthopaedic Association National Joint Replacement Registry Annual Report9 showed a revision rate of only 6.3% when the Birmingham Hip Resurfacing (BHR) System was used (Smith & Nephew Inc, Memphis, Tennessee). In addition, in an independent review of 230 consecutive BHRs at a mean follow-up of 10.4 years, Coulter and colleagues10 showed encouraging clinical results, with a mean Oxford Hip Score of 45.0 and a mean University of California at Los Angeles (UCLA) activity score of 7.4.

Similar to the prior increase in popularity of HRA, hip arthroscopy has also become much more commonplace, and its indications continue to evolve.11 Hip arthroscopy has been used in the native hip joint to manage femoroacetabular impingement, labral tears, and iliopsoas tendinopathy, among other conditions.12 In addition, the use of hip arthroscopy has not been limited to the native hip but also has increased as a diagnostic and therapeutic procedure after hip arthroplasties. Bajwa and Villar12 found hip arthroscopy to be diagnostic in 23 of 24 patients who underwent the procedure after a hip arthroplasty, concluding that arthroscopy is a useful adjunct in the diagnosis of symptomatic arthroplasties.

Therefore, hip arthroscopy has been shown to be an effective modality to treat pathology in both the native hip and after hip arthroplasties. However, the effect of a prior hip arthroscopy on the outcome of a subsequent metal-on-metal HRA has not been determined. Piedade and colleagues13 showed a prior knee arthroscopy to increase the risk of postoperative complications and subsequent revision after total knee arthroplasty. Complications included reflex sympathetic dystrophy, undiagnosed pain, infection, stiffness, and component loosening. A prior osteochondroplasty at the femoral head-neck junction could increase the risk of femoral neck fracture after a subsequent HRA. Thus, the purpose of this study was to evaluate the clinical outcomes of a series of patients who received an HRA after a prior hip arthroscopy and to compare these results with a cohort of patients who received an HRA with no prior hip surgeries. Our hypothesis is that a prior hip arthroscopy will lead to inferior outcomes in patients undergoing HRA. 

Materials and Methods

This study is a retrospective, case-control study using a 1:2 matching analysis. Dr. Su performed all HRAs, which were enrolled in an institutional review board–approved arthroplasty registry. All HRAs were performed using the BHR System. 

The surgical technique for hip resurfacing arthroplasty has been described.1 All procedures were performed via a posterior approach with the patient in the lateral decubitus position. All patients received a hybrid metal-on-metal hip resurfacing, with an uncemented acetabular component and cemented femoral component. Intraoperative anesthesia for all patients was performed via a combined spinal-epidural anesthetic, and an epidural patient-controlled analgesic was used for the first day postoperatively, followed by a transition to oral analgesics. The sizes of the acetabular and femoral components were recorded for each hip resurfacing. Postoperatively, intermittent pneumatic compression devices were placed upon arrival in the recovery room, and active ankle flexion and extension exercises were initiated immediately after the patient’s neurologic function returned.14 Aspirin was used for chemical deep venous thrombosis prophylaxis in all patients postoperatively for a period of 6 weeks. Full weight-bearing, with the use of crutches for assistance with balance, was permitted immediately. Crutches were used for a period of 3 weeks prior to being discontinued. 

From a database of 1357 HRAs (all BHR implants) performed between June 2006 and June 2012, 51 patients were identified who received an HRA after a prior hip arthroscopy. Eight patients were excluded because they did not possess adequate clinical documentation or were lost to follow-up. In the remaining 43 patients, there were 32 men and 11 women (21 right hips, 22 left hips), which formed the arthroscopy cohort. Two patients had a history of multiple hip arthroscopies (1 patient with 2 prior procedures, 1 patient with 3 prior procedures). The mean (SD) time from the most recent hip arthroscopy to the HRA was 2.5 (2.5) years. Table 1 presents a summary of the hip arthroscopy procedures (including only the most recent hip arthroscopy procedure in those with multiple arthroscopies).

Patient demographic variables (age, body mass index [BMI]) were recorded preoperatively, along with the Harris Hip Score (HHS),15 UCLA activity score,16 Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) score,17 and preoperative hip range of motion (flexion, extension, abduction, adduction, internal rotation, and external rotation). The same clinical indices were assessed postoperatively along with the Short Form-12 (SF-12) Health Survey Score,18 at the 6-week, 3-month, 6-month, 1-year, and most recent follow-up visits.

Radiographic assessment consisted of a low anteroposterior (AP) pelvic radiograph (with the radiographic beam centered on the pubic symphysis) and a cross-table lateral radiograph obtained at the most recent follow-up visit. Both the acetabular component abduction relative to the inter-teardrop line, and the angle between the femoral stem and the anatomic axis of the femoral shaft (stem-shaft angle) were measured on AP radiographs.19,20 Acetabular component anteversion was measured on the cross-table lateral radiographs as the angle between the projected long axis of the acetabular opening and a line drawn perpendicular to the long axis plane of the body (Figures A, B).21

The same registry database was used to identify patients who received an HRA without a prior history of arthroscopy or hip surgery. A 1:2 matching analysis of patients with a prior hip arthroscopy to those without a prior hip arthroscopy was performed to formulate a control group (control cohort) of 86 patients. Each patient in the arthroscopy cohort was matched with 2 patients in the control cohort based on the following parameters: age (± 6 years), sex (same), BMI (± 4 kg/m²), femoral head size (± 4 mm), and preoperative HHS and WOMAC scores (± 7 points). In the event an arthroscopy patient matched to more than 2 control patients, the 2 patients who minimized the squared error across the matching variables were selected.
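
The matching rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the registry's actual procedure: the `Patient` fields, the helper names, the sample data, the use of an unscaled sum of squared differences as the "squared error," and the choice to use each control only once are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Patient:
    pid: str
    age: float      # years
    sex: str
    bmi: float      # kg/m^2
    head_mm: float  # femoral head size, mm
    hhs: float      # preoperative Harris Hip Score
    womac: float    # preoperative WOMAC score

# Tolerances taken from the text: (attribute, maximum absolute difference).
CALIPERS = [("age", 6), ("bmi", 4), ("head_mm", 4), ("hhs", 7), ("womac", 7)]

def is_eligible(case, ctrl):
    """A control qualifies if sex is identical and every variable is within tolerance."""
    if case.sex != ctrl.sex:
        return False
    return all(abs(getattr(case, k) - getattr(ctrl, k)) <= tol for k, tol in CALIPERS)

def squared_error(case, ctrl):
    """Assumed error measure: unscaled sum of squared differences."""
    return sum((getattr(case, k) - getattr(ctrl, k)) ** 2 for k, _ in CALIPERS)

def match_controls(cases, controls, per_case=2):
    """Greedy 1:per_case matching; each control is used at most once (assumption)."""
    used, matches = set(), {}
    for case in cases:
        pool = [c for c in controls if c.pid not in used and is_eligible(case, c)]
        pool.sort(key=lambda c: squared_error(case, c))
        chosen = pool[:per_case]
        used.update(c.pid for c in chosen)
        matches[case.pid] = [c.pid for c in chosen]
    return matches

# Made-up patients for illustration only.
case = Patient("A1", 45, "M", 27.0, 50, 60, 55)
controls = [
    Patient("C1", 47, "M", 28.0, 50, 62, 57),  # eligible, closest
    Patient("C2", 44, "M", 26.5, 52, 58, 50),  # eligible
    Patient("C3", 50, "M", 30.0, 48, 66, 61),  # eligible, but farthest
    Patient("C4", 45, "F", 27.0, 50, 60, 55),  # excluded: sex differs
    Patient("C5", 55, "M", 27.0, 50, 60, 55),  # excluded: age outside ± 6 years
]
matches = match_controls([case], controls)
print(matches)
```

Here three controls fall within every tolerance, so the two with the smallest squared error are kept. A greedy pass like this depends on the order in which cases are processed, which is one reason the sketch should not be read as the authors' exact method.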

Statistical Analysis

All data were collected and analyzed using Microsoft Excel software (Microsoft Corporation, Redmond, Washington). Statistical comparisons between the 2 cohorts regarding demographic variables, clinical outcomes, and radiographic alignment were performed using an unpaired, Student 2-tailed t test, with statistical significance set at P ≤ .05. 

Results

A comparison of the results of the 1:2 matching analysis between the arthroscopy and control cohorts is presented in Table 2. There was no significant difference in the preoperative age, BMI, femoral head size, HHS, or WOMAC score between the 2 cohorts. However, the control cohort did show a more severe preoperative flexion contracture (as expressed by a decreased amount of extension) and a decreased amount of preoperative abduction (Table 3). The preoperative UCLA activity score was also lower in the control cohort, but this difference was not statistically significant.

The mean (SD) follow-up was 2.0 (1.0) years in the arthroscopy cohort and 2.1 (1.1) years in the control cohort. There was no significant difference in radiographic alignment between the 2 cohorts. The stem-shaft angle was 139.3° (SD, 5.4°) in the arthroscopy cohort (vs 138.3° [SD, 5.5°] in the control cohort; P = .3), the acetabular abduction was 43.9° (SD, 5.8°) in the arthroscopy cohort (vs 42.9° [SD, 6.1°] in the control cohort; P = .4), and the acetabular anteversion was 21.1° (SD, 7.5°) in the arthroscopy cohort (vs 20.8° [SD, 7.1°] in the control cohort; P = .8). 
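
For readers who want to see how an unpaired Student t test works on summary data, the Python sketch below recomputes the radiographic comparisons from the reported means and SDs. It assumes equal variances (the pooled, classic Student form) and that all 43 arthroscopy patients and 86 controls contributed radiographs; neither assumption is confirmed by the text. The function names are hypothetical, and the P value is obtained by numerically integrating the t density rather than by a statistics package.

```python
import math

def student_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired Student t statistic and degrees of freedom (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

# Assumed group sizes (not stated for the radiographic analysis).
N_ARTHRO, N_CONTROL = 43, 86

# (arthroscopy mean, SD, control mean, SD) in degrees, as reported in the text.
radiographic = {
    "stem-shaft angle": (139.3, 5.4, 138.3, 5.5),
    "acetabular abduction": (43.9, 5.8, 42.9, 6.1),
    "acetabular anteversion": (21.1, 7.5, 20.8, 7.1),
}

p_values = {}
for name, (m1, s1, m2, s2) in radiographic.items():
    t, df = student_t_from_summary(m1, s1, N_ARTHRO, m2, s2, N_CONTROL)
    p_values[name] = two_tailed_p(t, df)
    print(f"{name}: t = {t:.2f}, df = {df}, P = {p_values[name]:.2f}")
```

Under these assumptions the computed P values should land close to the reported .3, .4, and .8. Any remaining difference would be expected from rounding of the published means and SDs or from patients without radiographs.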

At 6-week follow-up, the arthroscopy cohort showed a significantly decreased WOMAC score compared with the control cohort (72.9 [SD, 15.5] vs 80.5 [SD, 11.8], respectively; P = .05). In addition, there was a trend towards a decreased SF-12 mental component score in the arthroscopy cohort (52.2 [SD, 9.3] vs 56.5 [SD, 7.8] in the control cohort; P = .06). However, none of the remaining clinical indices showed a significant difference between the 2 cohorts, and there was no difference in range of motion between the 2 cohorts at the 6-week follow-up visit (Table 4).

In addition, at 3-month follow-up, no statistically significant differences were seen between the 2 cohorts for any of the clinical indices or range of motion values. Both groups continued to improve rapidly, with HHS of 96.9 (SD, 3.5) in the arthroscopy cohort and 95.5 (SD, 6.6) in the control cohort, and WOMAC scores of 88.7 (SD, 10.2) and 89.5 (SD, 9.8), respectively (Table 5). Similarly, at the 6-month and 1-year follow-up intervals, the 2 cohorts showed continued improvement in their clinical measures, with no statistically significant differences between the 2 cohorts (Tables 6, 7). 

At the most recent follow-up visit, more than 1 year after surgery, the HHS was 99.5 (SD, 1.3) in the arthroscopy cohort and 99.2 (SD, 9.7) in the control cohort (P = .9), and the WOMAC score was 93.5 (SD, 11.3) and 92.4 (SD, 12.2), respectively (P = .8). No significant perioperative complications were seen in the arthroscopy cohort. In the arthroscopy cohort, 1 patient was diagnosed with a deep venous thrombosis 2 weeks after the procedure and was placed on low-molecular-weight heparin and coumadin for treatment. A second patient in the arthroscopy cohort had continued serosanguinous drainage for 4 days postoperatively, which resolved with continued compressive dressings. To date, no patients in the arthroscopy or control cohorts have required a second operation or revision of their components.

Discussion

Given the increasing prevalence of hip arthroscopies to treat multiple disorders of the native joint, it is important to assess the potential consequences of these procedures on future arthroplasties. Piedade and colleagues,13 in a retrospective review of 1474 primary total knee arthroplasties, showed a prior bony procedure (high tibial osteotomy, tibial plateau fracture, patellar realignment) to be predictive of decreased range of motion postoperatively. In addition, a prior knee arthroscopy was associated with a higher rate of postoperative complications, with 30% of the complications requiring a reoperation, and 8.3% of the complications requiring a revision total knee arthroplasty. Kaplan-Meier survival curves showed a survival rate of only 86.8% in those patients with a prior knee arthroscopy (vs 98.1% in those without a prior knee surgery).22 Therefore, the purpose of this study was to evaluate the clinical outcomes of a series of patients who received an HRA after a prior hip arthroscopy. After the initial 6-week follow-up visit, no significant difference was seen in the functional outcomes between those patients with or without a history of prior hip arthroscopy who received an HRA.

After analysis of patient outcomes using multiple clinical measurement tools at the 6-week, 3-month, 6-month, 1-year, and most recent follow-up intervals, the only significant difference between the 2 cohorts was the WOMAC score at 6-week follow-up. Interestingly, there was no significant difference seen in the other clinical assessments, including the SF-12 score, HHS, range of motion, or UCLA activity score (although this did trend towards significance). This may be explained by differences in both the mode of administration and the metrics assessed by these instruments. In contrast to the HHS, the WOMAC is completed by the patient (rather than the clinician) and provides a more detailed assessment of symptoms, pain, stiffness, and activities of daily living.17 Therefore, this study suggests that patients with a prior hip arthroscopy may require more time to return to their activities of daily living after an HRA. However, whether the statistically significant difference between the 2 scores translates into a clinically significant difference can be questioned.

The clinical outcomes of this series of patients were excellent at short-term follow-up, and both groups achieved clinical results comparable to previously reported results of HRA.1,10,23,24 Nevertheless, this study has several limitations. First, longer-term follow-up is required to determine whether any significant differences (such as aseptic loosening, infection, and prosthesis survival) are associated with a prior hip arthroscopy. In addition, this study included a relatively small cohort of patients who had a prior hip arthroscopy, even though a relatively large, single-surgeon database of 1357 HRAs was reviewed; only 51 such cases (3.7%) were identified. With the increasing popularity of hip arthroscopy, the number of patients presenting for HRA after a prior arthroscopy will likely continue to increase. Despite these limitations, this study shows that a prior hip arthroscopy does not appear to affect the short-term clinical outcomes of a metal-on-metal HRA.

References

1. Amstutz HC, Beaulé PE, Dorey FJ, Le Duff MJ, Campbell PA, Gruen TA. Metal-on-metal hybrid surface arthroplasty. Surgical Technique. J Bone Joint Surg Am. 2006;88(suppl 1 Pt 2):234-249.

2. Daniel J, Pynsent PB, McMinn DJ. Metal-on-metal resurfacing of the hip in patients under the age of 55 years with osteoarthritis. J Bone Joint Surg Br. 2004;86(2):177-184.

3. Pollard TC, Baker RP, Eastaugh-Waring SJ, Bannister GC. Treatment of the young active patient with osteoarthritis of the hip. A five- to seven-year comparison of hybrid total hip arthroplasty and metal-on-metal resurfacing. J Bone Joint Surg Br. 2006;88(5):592-600.

4. Treacy RB, McBryde CW, Pynsent PB. Birmingham hip resurfacing arthroplasty. A minimum follow-up of five years. J Bone Joint Surg Br. 2005;87(2):167-170.

5. Amstutz HC, Le Duff MJ, Campbell PA, Gruen TA, Wisk LE. Clinical and radiographic results of metal-on-metal hip resurfacing with a minimum ten-year follow-up. J Bone Joint Surg Am. 2010;92(16):2663-2671.

6. Daniel J, Ziaee H, Pradhan C, Pynsent PB, McMinn DJ. Blood and urine metal ion levels in young and active patients after Birmingham hip resurfacing arthroplasty: four-year results of a prospective longitudinal study. J Bone Joint Surg Br. 2007;89(2):169-173.

7. deSouza RM, Parsons NR, Oni T, Dalton P, Costa M, Krikler S. Metal ion levels following resurfacing arthroplasty of the hip: serial results over a ten-year period. J Bone Joint Surg Br. 2010;92(12):1642-1647.

8. Kwon YM, Thomas P, Summer B, et al. Lymphocyte proliferation responses in patients with pseudotumors following metal-on-metal hip resurfacing arthroplasty. J Orthop Res. 2010;28(4):444-450.

9. Australian Orthopaedic Association National Joint Replacement Registry. Annual Report 2011. Adelaide: Australian Orthopaedic Association; 2011. https://aoanjrr.dmac.adelaide.edu.au/annual-reports-2011. Accessed September 16, 2014.

10. Coulter G, Young DA, Dalziel RE, Shimmin AJ. Birmingham hip resurfacing at a mean of ten years: results from an independent centre. J Bone Joint Surg Br. 2012;94(3):315-321.

11. McCarthy JC, Jarrett BT, Ojeifo O, Lee JA, Bragdon CR. What factors influence long-term survivorship after hip arthroscopy? Clin Orthop. 2011;469(2):362-371.

12. Bajwa AS, Villar RN. Arthroscopy of the hip in patients following joint replacement. J Bone Joint Surg Br. 2011;93(7):890-896.

13. Piedade SR, Pinaroli A, Servien E, Neyret P. Is previous knee arthroscopy related to worse results in primary total knee arthroplasty? Knee Surg Sports Traumatol Arthrosc. 2009;17(4):328-333.

14. Gonzalez Della Valle A, Serota A, Go G, et al. Venous thromboembolism is rare with a multimodal prophylaxis protocol after total hip arthroplasty. Clin Orthop. 2006;(444):146-153.

15. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am. 1969;51(4):737-755.

16. Kershaw CJ, Atkins RM, Dodd CA, Bulstrode CJ. Revision total hip arthroplasty for aseptic failure. A review of 276 cases. J Bone Joint Surg Br. 1991;73(4):564-568.

17. Bellamy N. WOMAC: a 20-year experiential review of a patient-centered self-reported health status questionnaire. J Rheumatol. 2002;29(12):2473-2476.

18. Ware J Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34(3):220-233.

19. Clark JM, Freeman MA, Witham D. The relationship of neck orientation to the shape of the proximal femur. J Arthroplasty. 1987;2(2):99-109.

20. Lewinnek GE, Lewis JL, Tarr R, Compere CL, Zimmerman JR. Dislocations after total hip-replacement arthroplasties. J Bone Joint Surg Am. 1978;60(2):217-220.

21. Yao L, Yao J, Gold RH. Measurement of acetabular version on the axiolateral radiograph. Clin Orthop. 1995;(316):106-111.

22. Piedade SR, Pinaroli A, Servien E, Neyret P. TKA outcomes after prior bone and soft tissue knee surgery. Knee Surg Sports Traumatol Arthrosc. 2013;21(12):2737-2743.

23. Amstutz HC, Beaulé PE, Dorey FJ, Le Duff MJ, Campbell PA, Gruen TA. Metal-on-metal hybrid surface arthroplasty: two to six-year follow-up study. J Bone Joint Surg Am. 2004;86(1):28-39.

24. Steffen RT, Pandit HP, Palan J, et al. The five-year results of the Birmingham Hip Resurfacing arthroplasty: an independent series. J Bone Joint Surg Br. 2008;90(4):436-441.

Author and Disclosure Information

Denis Nam, MD, Patrick Maher, BA, Trishna Nath, BA, and Edwin P. Su, MD

Authors’ Disclosure Statement: Dr. Su wishes to report that he is a paid consultant to Smith & Nephew Inc. The other authors report no actual or potential conflict of interest in relation to this article.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
E255-E260
Legacy Keywords
american journal of orthopedics, AJO, original study, study, online exclusive, hip arthroscopy, hip, arthroscopy, metal-on-metal, arthroplasty, hip resurfacing arthroplasty, hip arthroplasty, nam, maher, nath, su

Metal-on-metal hip resurfacing arthroplasty (HRA) remains an alternative to total hip arthroplasty (THA) in appropriately selected, younger, active adults with degenerative hip disease.1-4 Although concerns remain regarding the potential for adverse local tissue reactions from wear of the metal-on-metal bearing surface,5-8 10-year data from the Australian Orthopaedic Association National Joint Replacement Registry Annual Report9 showed a revision rate of only 6.3% when the Birmingham Hip Resurfacing (BHR) System (Smith & Nephew Inc, Memphis, Tennessee) was used. In addition, in an independent review of 230 consecutive BHRs at a mean follow-up of 10.4 years, Coulter and colleagues10 showed encouraging clinical results, with a mean Oxford Hip Score of 45.0 and a mean University of California at Los Angeles (UCLA) activity score of 7.4.

Similar to the prior increase in popularity of HRA, hip arthroscopy has also become much more commonplace, and its indications continue to evolve.11 Hip arthroscopy has been used in the native hip joint to manage femoroacetabular impingement, labral tears, and iliopsoas tendinopathy, among other conditions.12 In addition, the use of hip arthroscopy has not been limited to the native hip but also has increased as a diagnostic and therapeutic procedure after hip arthroplasties. Bajwa and Villar12 found hip arthroscopy to be diagnostic in 23 of 24 patients who underwent the procedure after a hip arthroplasty, concluding that arthroscopy is a useful adjunct in the diagnosis of symptomatic arthroplasties.

Therefore, hip arthroscopy has been shown to be an effective modality to treat pathology both in the native hip and after hip arthroplasties. However, the effect of a prior hip arthroscopy on the outcome of a subsequent metal-on-metal HRA has not been determined. Piedade and colleagues13 showed a prior knee arthroscopy to increase the risk of postoperative complications and subsequent revision after total knee arthroplasty. Complications included reflex sympathetic dystrophy, undiagnosed pain, infection, stiffness, and component loosening. A prior osteochondroplasty at the femoral head-neck junction could increase the risk of femoral neck fracture after a subsequent HRA. Thus, the purpose of this study was to evaluate the clinical outcomes of a series of patients who received an HRA after a prior hip arthroscopy and to compare these results with a cohort of patients who received an HRA with no prior hip surgeries. Our hypothesis was that a prior hip arthroscopy would lead to inferior outcomes in patients undergoing HRA.

Materials and Methods

This study is a retrospective, case-control study using a 1:2 matching analysis. Dr. Su performed all HRAs, which were enrolled in an institutional review board–approved arthroplasty registry. All HRAs were performed using the BHR System. 

The surgical technique for hip resurfacing arthroplasty has been described.1 All procedures were performed via a posterior approach with the patient in the lateral decubitus position. All patients received a hybrid metal-on-metal hip resurfacing, with an uncemented acetabular component and cemented femoral component. Intraoperative anesthesia for all patients was performed via a combined spinal-epidural anesthetic, and an epidural patient-controlled analgesic was used for the first day postoperatively, followed by a transition to oral analgesics. The sizes of the acetabular and femoral components were recorded for each hip resurfacing. Postoperatively, intermittent pneumatic compression devices were placed upon arrival in the recovery room, and active ankle flexion and extension exercises were initiated immediately after the patient’s neurologic function returned.14 Aspirin was used for chemical deep venous thrombosis prophylaxis in all patients postoperatively for a period of 6 weeks. Full weight-bearing, with the use of crutches for assistance with balance, was permitted immediately. Crutches were used for a period of 3 weeks prior to being discontinued. 

From a database of 1357 HRAs (all BHR implants) performed between June 2006 and June 2012, 51 patients were identified who received an HRA after a prior hip arthroscopy. Eight patients were excluded because they did not possess adequate clinical documentation or were lost to follow-up. In the remaining 43 patients, there were 32 men and 11 women (21 right hips, 22 left hips), which formed the arthroscopy cohort. Two patients had a history of multiple hip arthroscopies (1 patient with 2 prior procedures, 1 patient with 3 prior procedures). The mean (SD) time from the most recent hip arthroscopy to the HRA was 2.5 (2.5) years. Table 1 presents a summary of the hip arthroscopy procedures (including only the most recent hip arthroscopy procedure in those with multiple arthroscopies).

Patient demographic variables (age, body mass index [BMI]) were recorded preoperatively, along with the Harris Hip Score (HHS),15 UCLA activity score,16 Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) score,17 and preoperative hip range of motion (flexion, extension, abduction, adduction, internal rotation, and external rotation). The same clinical indices were assessed postoperatively along with the Short Form-12 (SF-12) Health Survey Score,18 at the 6-week, 3-month, 6-month, 1-year, and most recent follow-up visits.

 

 

Radiographic assessment consisted of a low anteroposterior (AP) pelvic radiograph (with the radiographic beam centered on the pubic symphysis) and a cross-table lateral radiograph obtained at the most recent follow-up visit. Both the acetabular component abduction relative to the inter-teardrop line, and the angle between the femoral stem and the anatomic axis of the femoral shaft (stem-shaft angle) were measured on AP radiographs.19,20 Acetabular component anteversion was measured on the cross-table lateral radiographs as the angle between the projected long axis of the acetabular opening and a line drawn perpendicular to the long axis plane of the body (Figures A, B).21

The same registry database was used to identify patients who received an HRA without a prior history of arthroscopy or hip surgery. A 1:2 matching analysis of patients with a prior hip arthroscopy to those without a prior hip arthroscopy was performed to formulate a control group (control cohort) of 86 patients. Each patient in the arthroscopy cohort was matched with 2 patients in the control cohort based on the following parameters: age (± 6 years), sex (same), BMI (± 4 kg/m²), femoral head size (± 4 mm), and preoperative HHS and WOMAC scores (± 7 points). In the event an arthroscopy patient matched to more than 2 control patients, the 2 patients who minimized the squared error among the matching variables were selected.
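The matching rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' procedure: the `Patient` fields, the function names, and the sample patients are all hypothetical, and scaling each difference by its tolerance before squaring is an assumption (the text does not say how variables with different units were combined). The sketch also handles a single arthroscopy patient at a time and does not address whether a control could be reused across patients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    pid: str
    age: float      # years
    sex: str
    bmi: float      # kg/m^2
    head_mm: float  # femoral head size, mm
    hhs: float      # preoperative Harris Hip Score
    womac: float    # preoperative WOMAC score


# Tolerances taken from the matching criteria stated in the text.
TOLERANCES = {"age": 6.0, "bmi": 4.0, "head_mm": 4.0, "hhs": 7.0, "womac": 7.0}


def is_eligible(case, control):
    """A control is eligible if sex is the same and every variable is within tolerance."""
    if case.sex != control.sex:
        return False
    return all(
        abs(getattr(case, key) - getattr(control, key)) <= tol
        for key, tol in TOLERANCES.items()
    )


def squared_error(case, control):
    """Sum of squared differences, each scaled by its tolerance (an assumption)."""
    return sum(
        ((getattr(case, key) - getattr(control, key)) / tol) ** 2
        for key, tol in TOLERANCES.items()
    )


def match_controls(case, pool, n=2):
    """Return the n eligible controls with the smallest squared error."""
    eligible = [c for c in pool if is_eligible(case, c)]
    return sorted(eligible, key=lambda c: squared_error(case, c))[:n]


# Hypothetical patients, for illustration only (not study data).
case = Patient("case", 45, "M", 26.0, 50, 60, 55)
pool = [
    Patient("c1", 46, "M", 26.5, 50, 61, 56),  # eligible, very close
    Patient("c2", 50, "M", 29.0, 52, 65, 60),  # eligible, farther away
    Patient("c3", 44, "M", 25.0, 50, 59, 54),  # eligible, close
    Patient("c4", 45, "F", 26.0, 50, 60, 55),  # excluded: different sex
    Patient("c5", 55, "M", 26.0, 50, 60, 55),  # excluded: age outside ± 6 years
]
matched = match_controls(case, pool)
print([c.pid for c in matched])
```

In this hypothetical pool, 3 controls satisfy every tolerance, so the squared-error ranking decides which 2 are kept.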

Statistical Analysis

All data were collected and analyzed using Microsoft Excel software (Microsoft Corporation, Redmond, Washington). Statistical comparisons between the 2 cohorts regarding demographic variables, clinical outcomes, and radiographic alignment were performed using an unpaired, 2-tailed Student t test, with statistical significance set at P ≤ .05.
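For readers who wish to see the arithmetic behind an unpaired Student t test, the following Python sketch computes the test statistic and degrees of freedom from summary statistics (mean, SD, and group size). It assumes the equal-variance (pooled) form of the test; the function name is hypothetical, the input values are invented for illustration and are not study data, and this is not the Excel workflow used in the study. The 2-tailed P value would then be obtained from the t distribution with the returned degrees of freedom.

```python
import math


def student_t_unpaired(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance unpaired Student t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Invented illustrative values (not study data).
t_stat, df = student_t_unpaired(10.0, 2.0, 5, 8.0, 2.0, 5)
print(round(t_stat, 3), df)
```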

Results

A comparison of the results of the 1:2 matching analysis between the arthroscopy and control cohorts is presented in Table 2. There was no significant difference in preoperative age, BMI, femoral head size, HHS, or WOMAC score between the 2 cohorts. However, the control cohort did show a more severe preoperative flexion contracture (as expressed by a decreased amount of extension) and a decreased amount of preoperative abduction (Table 3). The preoperative UCLA activity score was also lower in the control cohort, but this difference was not statistically significant.

The mean (SD) follow-up was 2.0 (1.0) years in the arthroscopy cohort and 2.1 (1.1) years in the control cohort. There was no significant difference in radiographic alignment between the 2 cohorts. The stem-shaft angle was 139.3° (SD, 5.4°) in the arthroscopy cohort (vs 138.3° [SD, 5.5°] in the control cohort; P = .3), the acetabular abduction was 43.9° (SD, 5.8°) in the arthroscopy cohort (vs 42.9° [SD, 6.1°] in the control cohort; P = .4), and the acetabular anteversion was 21.1° (SD, 7.5°) in the arthroscopy cohort (vs 20.8° [SD, 7.1°] in the control cohort; P = .8). 

At 6-week follow-up, the arthroscopy cohort showed a significantly decreased WOMAC score compared with the control cohort (72.9 [SD, 15.5] vs 80.5 [SD, 11.8], respectively; P = .05). In addition, there was a trend towards a decreased SF-12 mental component score in the arthroscopy cohort (52.2 [SD, 9.3] vs 56.5 [SD, 7.8] in the control cohort; P = .06). However, none of the remaining clinical indices showed a significant difference between the 2 cohorts, and there was no difference in range of motion between the 2 cohorts at the 6-week follow-up visit (Table 4).

In addition, at 3-month follow-up, no statistically significant differences were seen between the 2 cohorts for any of the clinical indices or range of motion values. Both groups continued to improve rapidly, with HHS of 96.9 (SD, 3.5) in the arthroscopy cohort and 95.5 (SD, 6.6) in the control cohort, and WOMAC scores of 88.7 (SD, 10.2) and 89.5 (SD, 9.8), respectively (Table 5). Similarly, at the 6-month and 1-year follow-up intervals, the 2 cohorts showed continued improvement in their clinical measures, with no statistically significant differences between the 2 cohorts (Tables 6, 7). 

At the most recent follow-up visit, more than 1 year after surgery, the HHS was 99.5 (SD, 1.3) in the arthroscopy cohort and 99.2 (SD, 9.7) in the control cohort (P = .9), and the WOMAC score was 93.5 (SD, 11.3) and 92.4 (SD, 12.2), respectively (P = .8). No significant perioperative complications were seen in the arthroscopy cohort. In the arthroscopy cohort, 1 patient was diagnosed with a deep venous thrombosis 2 weeks after the procedure and was placed on low-molecular-weight heparin and Coumadin for treatment. A second patient in the arthroscopy cohort had continued serosanguineous drainage for 4 days postoperatively, which resolved with continued compressive dressings. To date, no patients in the arthroscopy or control cohorts have required a second operation or revision of their components.

 

 

Discussion

Given the increasing prevalence of hip arthroscopies to treat multiple disorders of the native joint, it is important to assess the potential consequences of these procedures on future arthroplasties. Piedade and colleagues,13 in a retrospective review of 1474 primary total knee arthroplasties, showed a prior bony procedure (high tibial osteotomy, tibial plateau fracture, patellar realignment) to be predictive of decreased range of motion postoperatively. In addition, a prior knee arthroscopy was associated with a higher rate of postoperative complications, with 30% of the complications requiring a reoperation, and 8.3% of the complications requiring a revision total knee arthroplasty. Kaplan-Meier survival curves showed a survival rate of only 86.8% in those patients with a prior knee arthroscopy (vs 98.1% in those without a prior knee surgery).22 Therefore, the purpose of this study was to evaluate the clinical outcomes of a series of patients who received an HRA after a prior hip arthroscopy. After the initial 6-week follow-up visit, no significant difference was seen in the functional outcomes between those patients with or without a history of prior hip arthroscopy who received an HRA.

After analysis of patient outcomes using multiple clinical measurement tools at the 6-week, 3-month, 6-month, 1-year, and most recent follow-up intervals, the only significant difference between the 2 cohorts was the WOMAC score at 6-week follow-up. Interestingly, no significant difference was seen in the other clinical assessments, including the SF-12 score, HHS, range of motion, or UCLA activity score (although this did trend towards significance). This may be explained by differences in both the mode of administration and the metrics assessed by these instruments. Unlike the HHS, the WOMAC is completed by the patient (rather than the clinician) and provides a more detailed assessment of symptoms, pain, stiffness, and activities of daily living.17 Therefore, this study suggests that patients with a prior hip arthroscopy may require more time to return to their activities of daily living after an HRA. However, whether the statistically significant difference between the 2 scores translates into a clinically significant difference can be questioned.

The clinical outcomes of this series of patients were excellent at short-term follow-up, and both groups achieved clinical results comparable to previously reported results of HRA.1,10,23,24 Despite these results, this study has several limitations. First, longer-term follow-up is required to determine if any significant differences (such as aseptic loosening, infection, and prosthesis survival) are associated with a prior hip arthroscopy. In addition, this study included a relatively small cohort of patients who had a prior hip arthroscopy, even though a relatively large, single-surgeon database of 1357 HRAs was reviewed, with only 51 cases being identified (3.7%). With the increasing popularity of hip arthroscopy, the number of patients presenting for HRA after a prior arthroscopy will likely continue to increase. Despite these limitations, this study shows that a prior hip arthroscopy does not appear to affect the short-term clinical outcomes of a metal-on-metal HRA.

References

1. Amstutz HC, Beaulé PE, Dorey FJ, Le Duff MJ, Campbell PA, Gruen TA. Metal-on-metal hybrid surface arthroplasty. Surgical Technique. J Bone Joint Surg Am. 2006;88(suppl 1 Pt 2):234-249.

2. Daniel J, Pynsent PB, McMinn DJ. Metal-on-metal resurfacing of the hip in patients under the age of 55 years with osteoarthritis. J Bone Joint Surg Br. 2004;86(2):177-184.

3. Pollard TC, Baker RP, Eastaugh-Waring SJ, Bannister GC. Treatment of the young active patient with osteoarthritis of the hip. A five- to seven-year comparison of hybrid total hip arthroplasty and metal-on-metal resurfacing. J Bone Joint Surg Br. 2006;88(5):592-600.

4. Treacy RB, McBryde CW, Pynsent PB. Birmingham hip resurfacing arthroplasty. A minimum follow-up of five years. J Bone Joint Surg Br. 2005;87(2):167-170.

5. Amstutz HC, Le Duff MJ, Campbell PA, Gruen TA, Wisk LE. Clinical and radiographic results of metal-on-metal hip resurfacing with a minimum ten-year follow-up. J Bone Joint Surg Am. 2010;92(16):2663-2671.

6. Daniel J, Ziaee H, Pradhan C, Pynsent PB, McMinn DJ. Blood and urine metal ion levels in young and active patients after Birmingham hip resurfacing arthroplasty: four-year results of a prospective longitudinal study. J Bone Joint Surg Br. 2007;89(2):169-173.

7. deSouza RM, Parsons NR, Oni T, Dalton P, Costa M, Krikler S. Metal ion levels following resurfacing arthroplasty of the hip: serial results over a ten-year period. J Bone Joint Surg Br. 2010;92(12):1642-1647.

8. Kwon YM, Thomas P, Summer B, et al. Lymphocyte proliferation responses in patients with pseudotumors following metal-on-metal hip resurfacing arthroplasty. J Orthop Res. 2010;28(4):444-450.

9. Australian Orthopaedic Association National Joint Replacement Registry. Annual Report 2011. Adelaide: Australian Orthopaedic Association; 2011. https://aoanjrr.dmac.adelaide.edu.au/annual-reports-2011. Accessed September 16, 2014.

10. Coulter G, Young DA, Dalziel RE, Shimmin AJ. Birmingham hip resurfacing at a mean of ten years: results from an independent centre. J Bone Joint Surg Br. 2012;94(3):315-321.

11. McCarthy JC, Jarrett BT, Ojeifo O, Lee JA, Bragdon CR. What factors influence long-term survivorship after hip arthroscopy? Clin Orthop. 2011;469(2):362-371.

12. Bajwa AS, Villar RN. Arthroscopy of the hip in patients following joint replacement. J Bone Joint Surg Br. 2011;93(7):890-896.

13. Piedade SR, Pinaroli A, Servien E, Neyret P. Is previous knee arthroscopy related to worse results in primary total knee arthroplasty? Knee Surg Sports Traumatol Arthrosc. 2009;17(4):328-333.

14. Gonzalez Della Valle A, Serota A, Go G, et al. Venous thromboembolism is rare with a multimodal prophylaxis protocol after total hip arthroplasty. Clin Orthop. 2006;(444):146-153.

15. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am. 1969;51(4):737-755.

16. Kershaw CJ, Atkins RM, Dodd CA, Bulstrode CJ. Revision total hip arthroplasty for aseptic failure. A review of 276 cases. J Bone Joint Surg Br. 1991;73(4):564-568.

17. Bellamy N. WOMAC: a 20-year experiential review of a patient-centered self-reported health status questionnaire. J Rheumatol. 2002;29(12):2473-2476.

18. Ware J Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34(3):220-233.

19. Clark JM, Freeman MA, Witham D. The relationship of neck orientation to the shape of the proximal femur. J Arthroplasty. 1987;2(2):99-109.

20. Lewinnek GE, Lewis JL, Tarr R, Compere CL, Zimmerman JR. Dislocations after total hip-replacement arthroplasties. J Bone Joint Surg Am. 1978;60(2):217-220.

21. Yao L, Yao J, Gold RH. Measurement of acetabular version on the axiolateral radiograph. Clin Orthop. 1995;(316):106-111.

22. Piedade SR, Pinaroli A, Servien E, Neyret P. TKA outcomes after prior bone and soft tissue knee surgery. Knee Surg Sports Traumatol Arthrosc. 2013;21(12):2737-2743.

23. Amstutz HC, Beaulé PE, Dorey FJ, Le Duff MJ, Campbell PA, Gruen TA. Metal-on-metal hybrid surface arthroplasty: two to six-year follow-up study. J Bone Joint Surg Am. 2004;86(1):28-39.

24. Steffen RT, Pandit HP, Palan J, et al. The five-year results of the Birmingham Hip Resurfacing arthroplasty: an independent series. J Bone Joint Surg Br. 2008;90(4):436-441.

References

1. Amstutz HC, Beaulé PE, Dorey FJ, Le Duff MJ, Campbell PA, Gruen TA. Metal-on-metal hybrid surface arthroplasty. Surgical Technique. J Bone Joint Surg Am. 2006;88(suppl 1 Pt 2):234-249.

2. Daniel J, Pynsent PB, McMinn DJ. Metal-on-metal resurfacing of the hip in patients under the age of 55 years with osteoarthritis. J Bone Joint Surg Br. 2004;86(2):177-184.

3. Pollard TC, Baker RP, Eastaugh-Waring SJ, Bannister GC. Treatment of the young active patient with osteoarthritis of the hip. A five- to seven-year comparison of hybrid total hip arthroplasty and metal-on-metal resurfacing. J Bone Joint Surg Br. 2006;88(5):592-600.

4. Treacy RB, McBryde CW, Pynsent PB. Birmingham hip resurfacing arthroplasty. A minimum follow-up of five years. J Bone Joint Surg Br. 2005;87(2):167-170.

5. Amstutz HC, Le Duff MJ, Campbell PA, Gruen TA, Wisk LE. Clinical and radiographic results of metal-on-metal hip resurfacing with a minimum ten-year follow-up. J Bone Joint Surg Am. 2010;92(16):2663-2671.

6. Daniel J, Ziaee H, Pradhan C, Pynsent PB, McMinn DJ. Blood and urine metal ion levels in young and active patients after Birmingham hip resurfacing arthroplasty: four-year results of a prospective longitudinal study.
J Bone Joint Surg Br. 2007;89(2):169-173.

7. deSouza RM, Parsons NR, Oni T, Dalton P, Costa M, Krikler S. Metal ion levels following resurfacing arthroplasty of the hip: serial results over a ten-year period. J Bone Joint Surg Br. 2010;92(12):1642-1647.

8. Kwon YM, Thomas P, Summer B, et al. Lymphocyte proliferation responses in patients with pseudotumors following metal-on-metal hip resurfacing arthroplasty. J Orthop Res. 2010;28(4):444-450.

9. Australian Orthopaedic Association National Joint Replacement Registry. Annual Report 2011. Adelaide: Australian Orthopaedic Association; 2011. https://aoanjrr.dmac.adelaide.edu.au/annual-reports-2011. Accessed September 16, 2014.

10. Coulter G, Young DA, Dalziel RE, Shimmin AJ. Birmingham hip resurfacing at a mean of ten years: results from an independent centre. J Bone Joint Surg Br. 2012;94(3):315-321.

11. McCarthy JC, Jarrett BT, Ojeifo O, Lee JA, Bragdon CR. What factors influence long-term survivorship after hip arthroscopy? Clin Orthop. 2011;469(2):362-371.

12. Bajwa AS, Villar RN. Arthroscopy of the hip in patients following joint replacement. J Bone Joint Surg Br. 2011;93(7):890-896.

13. Piedade SR, Pinaroli A, Servien E, Neyret P. Is previous knee arthroscopy related to worse results in primary total knee arthroplasty? Knee Surg Sports Traumatol Arthrosc. 2009;17(4):328-333.

14. Gonzalez Della Valle A, Serota A, Go G, et al. Venous thromboembolism is rare with a multimodal prophylaxis protocol after total hip arthroplasty. Clin Orthop. 2006;(444):146-153.

15. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am. 1969;51(4):737-755.

16. Kershaw CJ, Atkins RM, Dodd CA, Bulstrode CJ. Revision total hip arthroplasty for aseptic failure. A review of 276 cases. J Bone Joint Surg Br. 1991;73(4):564-568.

17. Bellamy N. WOMAC: a 20-year experiential review of a patient-centered self-reported health status questionnaire. J Rheumatol. 2002;29(12):2473-2476.

18. Ware J Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34(3):220-233.

19. Clark JM, Freeman MA, Witham D. The relationship of neck orientation to the shape of the proximal femur. J Arthroplasty. 1987;2(2):99-109.

20. Lewinnek GE, Lewis JL, Tarr R, Compere CL, Zimmerman JR. Dislocations after total hip-replacement arthroplasties. J Bone Joint Surg Am. 1978;60(2):217-220.

21. Yao L, Yao J, Gold RH. Measurement of acetabular version on the axiolateral radiograph. Clin Orthop. 1995;(316):106-111.

22. Piedade SR, Pinaroli A, Servien E, Neyret P. TKA outcomes after prior bone and soft tissue knee surgery. Knee Surg Sports Traumatol Arthrosc. 2013;21(12):2737-2743.

23. Amstutz HC, Beaulé PE, Dorey FJ, Le Duff MJ, Campbell PA, Gruen TA. Metal-on-metal hybrid surface arthroplasty: two to six-year follow-up study. J Bone Joint Surg Am. 2004;86(1):28-39.

24. Steffen RT, Pandit HP, Palan J, et al. The five-year results of the Birmingham Hip Resurfacing arthroplasty: an independent series. J Bone Joint Surg Br. 2008;90(4):436-441.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
E255-E260
Display Headline
Does a Prior Hip Arthroscopy Affect Clinical Outcomes in Metal-on-Metal Hip Resurfacing Arthroplasty?
Legacy Keywords
american journal of orthopedics, AJO, original study, study, online exclusive, hip arthroscopy, hip, arthroscopy, metal-on-metal, arthroplasty, hip resurfacing arthroplasty, hip arthroplasty, nam, maher, nath, su

Pilot Study for an Orthopedic Surgical Training Laboratory for Basic Motor Skills

Article Type
Changed
Thu, 09/19/2019 - 13:39

For the resident, the surgical residency is physically, emotionally, and intellectually demanding, requiring longitudinally concentrated effort. Although education of orthopedic surgeons necessarily occurs within the context of the health care delivery system, vital lessons also are taught in laboratories, skill stations, and surgical simulators. Before practice-based learning can take place, residents must gain experience and demonstrate growth in surgical skills, including decision-making and technical skills. These skill sets are difficult to systematically teach and objectively analyze.

The most effective way to teach and assess a resident’s knowledge of musculoskeletal medicine remains unclear at this point. Much of the current literature addresses the issue at the medical student level.1-7 Some studies have shown the effectiveness of surgical training programs, both cadaveric and computer-based simulators, in teaching various surgical skill sets.8-14 The orthopedic literature has seen a boom in surgical simulators aimed at the upper-level resident. Many of the topics involve use of arthroscopic simulators.15-19 Evidence suggests that simulators can discriminate between novice and expert users, but discrimination between novice and intermediate trainees in surgical education should be paramount.20

The American Board of Orthopaedic Surgery (ABOS) and the orthopedic Residency Review Committee (RRC) recommended new requirements for structured motor skills training in basic orthopedic surgery education,21 which were approved by the Accreditation Council for Graduate Medical Education (ACGME) board of directors and went into effect on July 1, 2013. In response to the new ACGME guidelines, our institution created a skills laboratory devoted to surgical simulation. Our focus in implementing this surgical skills simulation was junior-level, specifically postgraduate year 1 to 3 (PGY-1 to PGY-3), orthopedic residents. Our first goal was to set up a series of surgical training stations to educate junior-level residents in 4 core areas: handling and comfort with basic power equipment, casting/splinting, suturing, and surgical instrument identification. A secondary goal was to objectively evaluate the residents through written examinations (presession–postsession) and a novel ankle fracture model (pre–post).

Materials and Methods

Institutional review board approval was obtained before beginning the investigation.

Written Examination

We created a multiple-choice 25-question written examination (Appendix) and administered it to 11 junior residents before and after they participated in the training. This examination assessed their knowledge base of basic orthopedic tenets, including basic bone healing, basic fracture repair (Arbeitsgemeinschaft für Osteosynthesefragen [AO] principles22), suturing, surgical instrument identification, casting/splinting, and elementary implant-design rationale.

Evaluator Scorecard

We created an evaluation scorecard (Figure 1) and had 2 faculty members and 2 senior-level residents complete it independently. Junior residents were evaluated on a sawbones lateral malleolar ankle fracture model at 2 time points. As with the written examinations, the junior residents completed the fracture model both before and immediately after the multiple skill sessions. Each of the 15 data points was scored from 1 to 4, for a total of 60 points.
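The scorecard arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name scorecard_percent and the example sheet are hypothetical, and the conversion to a percentage (total divided by the 60-point maximum) is an assumption about how the percentile scores were derived, not a description of the authors' procedure.

```python
ITEMS = 15                   # data points on the scorecard (from the text)
MIN_ITEM, MAX_ITEM = 1, 4    # each data point is scored from 1 to 4


def scorecard_percent(item_scores):
    """Return the scorecard total as a percentage of the 60-point maximum.

    Assumption: percentage = total / (15 items * 4 points) * 100.
    """
    if len(item_scores) != ITEMS:
        raise ValueError(f"expected {ITEMS} item scores, got {len(item_scores)}")
    if any(not MIN_ITEM <= s <= MAX_ITEM for s in item_scores):
        raise ValueError("each item must be scored from 1 to 4")
    return 100.0 * sum(item_scores) / (ITEMS * MAX_ITEM)


# Hypothetical evaluator sheet: ten items scored 3 and five items scored 2.
example_sheet = [3] * 10 + [2] * 5
print(round(scorecard_percent(example_sheet), 1))
```

Because every item receives at least 1 point, the lowest possible total under this assumption is 15 of 60 points, or 25%.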

Facility for Surgical Training Session

Our Clinical Skills Education and Assessment Center houses small-group interactive laboratories for administration, debriefing, and assessment of simulations with the latest in audiovisual equipment. Five stations were created: hands-on introduction to surgical power equipment using sawbones, wood, and polyvinylchloride (PVC) pipe; hands-on introduction to casting and splinting; hands-on introduction to suturing; hands-on interaction with surgical scrub technician assisting with instrument identification; and didactic PowerPoint (Microsoft, Redmond, Washington) presentation focusing on core trauma competencies, basic orthopedic design rationale, and basic bone biology.

Development of Surgical Skills Training Session

Multiple faculty members and senior-level residents collaborated to create the skill stations (Figure 2), which were designed based on ACGME recommendations and on weaknesses our program had seen in junior-level residents. We devoted an afternoon to this session, excusing our program’s junior residents from clinical responsibilities. Four PGY-1, 5 PGY-2, and 2 PGY-3 residents participated. (Four of our 15 junior residents were unable to attend because of clinical responsibilities.) The afternoon started by dividing the 11 junior residents into 2 groups. Before the session, while one group performed the ankle fracture model and was being evaluated, the other took the written examination. This closely timed portion was allotted only 20 minutes. Then residents were divided into 5 groups of 2 or 3 and were rotated through all 5 stations. Forty minutes were allotted for each station. Residents were not evaluated during this portion. The stations were intended solely for education, and each station was staffed by a faculty member and/or senior-level resident.

Cordless reciprocating saws and drills were purchased to introduce and refine junior residents’ motor skills. Sawbones, 2×4-in sections of wood, and PVC pipe were used in the training. Emphasis was placed on tactile feel and feedback with both sawing and drilling. For the casting and splinting session, we used 4-in fiberglass, 4-in plaster rolls, and cotton soft roll to demonstrate a multitude of common casts and splints (Figure 3). Casts included short- and long-arm casts and short-leg casts. Splinting included coaptation, sugar tong, and ulnar gutter splints for the upper extremity and a short-leg posterior splint for the lower extremity.

The didactic PowerPoint presentation drew largely from content in chapters of the book AO Principles of Fracture Management.22 Content included condensed, to-the-point high-yield summaries of AO tenets, basic bone healing and biology, and orthopedic implant-design rationale focused on these elementary principles:

◾ Basic screw design, including cortical, cancellous, and locking screw designs.

◾ Evolution of plate osteosynthesis to currently used locking compression plate.

◾ Locking plate principles.

◾ Lag technique.

◾ Plate use: compression mode, neutralization, bridging, buttress, anti-glide.

The suturing portion was performed with thawed ham hocks (Figure 4). This model replicates live tissue layers and allows a layered closure technique as a training tool. Both 0 and 2-0 absorbable suture were available for a layered, deep fascial closure; also available was 2-0 nonabsorbable nylon for the skin. Staple guns were available, as were basic surgical instruments, including quality needle drivers, Adson forceps, and suture scissors. The knots demonstrated included simple, horizontal mattress, vertical mattress, and tension-relieving. One- and 2-hand tying and instrument tying were reinforced.

The final session consisted of surgical instrument identification. A certified orthopedic scrub technician participated. On site were multiple trays, including a basic bone set, a hand-and-foot set, small and large fragment sets, and a hip set. A detailed review of each set was led by the surgical technician. This review was followed by a question-and-answer session with the junior residents. After the session, we ended with the written examination and the ankle fracture model.

Statistical Methods

We report presession and postsession means, medians, and modes as measures of central tendency. With our small sample size, the assumption of a Gaussian distribution is tenuous and the mean is susceptible to outliers. Therefore, in addition to means, we report medians and modes, which are less sensitive to outliers. We also report κ statistics, a measure of interrater agreement for 2 or more raters, to determine the interrater reliability of the 4 independent observers.
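As a minimal sketch of these summary statistics, the following Python uses only the standard library. The helper name summarize and the score list are hypothetical and are not study data.

```python
import statistics


def summarize(scores):
    """Return mean, sample SD, median, and mode(s) for a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),            # sample standard deviation
        "median": statistics.median(scores),
        "modes": statistics.multimode(scores),     # all most-common values
    }


# Hypothetical percentage scores, not taken from the study.
hypothetical_scores = [60, 72, 72, 80, 88, 96]
summary = summarize(hypothetical_scores)
print(summary["mean"], summary["median"], summary["modes"])
```

In this hypothetical list the single low score of 60 pulls the mean further from the bulk of the scores than it does the median, which is the reason for reporting both in a small sample.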

Results

Written Examination

Eleven residents (PGY-1 to PGY-3) completed the examination (Table 1). For the entire group, mean (SD) presession percentile was 87.3 (10.4), median was 88, and mode was 96; mean (SD) was 80 (12.6) for PGY-1, 89.6 (6.7) for PGY-2, and 96 (5.7) for PGY-3. For the entire group, mean (SD) postsession percentile was 92 (8.4), median was 96, and mode was 96; mean (SD) was 85 (10.5) for PGY-1, 96 (4) for PGY-2, and 96 (0) for PGY-3 (Table 2).

There was a significant presession–postsession difference in scores among all test takers, regardless of training level (P = .019). The PGY-1 level did not reach statistical significance in improvement from presession to postsession (P = .080); the PGY-2 level also did not reach statistical significance in improvement (P = .099); the PGY-3 level did not have enough participants to calculate a P value based on a paired Student t test.
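For readers who want to see the mechanics of a paired Student t test, the statistic can be sketched as follows. The function name paired_t and the score pairs are hypothetical, and the sketch stops at the t statistic and degrees of freedom; converting these to a P value requires the t distribution (for example, via scipy.stats.ttest_rel, which is not used here).

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least 2 pairs")
    sd = statistics.stdev(diffs)          # sample SD of the paired differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical presession and postsession scores for 4 residents.
pre_scores = [60, 70, 80, 90]
post_scores = [70, 75, 90, 95]
t_stat, dof = paired_t(pre_scores, post_scores)
print(round(t_stat, 3), dof)
```

With very few pairs the degrees of freedom are small and the test has little power, which is the practical difficulty with a subgroup of only 2 participants.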

Ankle Fracture Model

Actual percentile scores are listed in Table 3. For the entire group, mean (SD) overall presession percentile was 68.6 (13.9), median was 67, and mode was 67; mean (SD) was 58.8 (9.8) for PGY-1, 76.1 (13.6) for PGY-2, and 69.5 (9.8) for PGY-3. For the entire group, mean (SD) postsession percentile was 95.2 (5.2), median was 97, and mode was 97; mean (SD) was 91.8 (6.3) for PGY-1, 97.1 (3.5) for PGY-2, and 97.3 (2.4) for PGY-3.

There was a large and significant presession–postsession difference in scores among all test takers, regardless of training level (P = .03). Each group reached statistical significance in improvement from presession to postsession: PGY-1 (P = .04), PGY-2 (P = .01), and PGY-3 (P = .03).

For κ calculations, we adjusted all scores to ordinal data and thus used a standard grading system:

Score      Grade
90–100     A
80–89      B
70–79      C
60–69      D
0–59       F
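The grading scale can be expressed as a small lookup, sketched here in Python. The function name letter_grade is hypothetical, and treating each band's lower bound as a threshold (so that a fractional score such as 89.5 is graded B) is an assumption, because the scale lists whole-number bands only.

```python
def letter_grade(percent):
    """Map a percentage score to the ordinal letter grade used for kappa."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Assumption: lower bounds act as thresholds for fractional scores.
    if percent >= 90:
        return "A"
    if percent >= 80:
        return "B"
    if percent >= 70:
        return "C"
    if percent >= 60:
        return "D"
    return "F"


# The group mean percentages reported for the ankle fracture model.
print(letter_grade(68.6), letter_grade(95.2))
```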

For the presession fracture model, the κ among the 4 independent observational scorers was 0.1148 (Table 4), which is poor based on κ scoring criteria and which we attribute to the particularly harsh grading by 1 observational scorer (faculty 1) relative to the other scorers. Faculty 1 and faculty 2 agreed only 9.09% of the time; in contrast, the resident scorers agreed 54.55% of the time. Removing faculty 1 as an outlier raised the κ dramatically, to 0.3125 (fair interobserver agreement).

For the postsession fracture model, the κ among the 4 independent observational scorers improved only marginally, to 0.1156 (still poor), again attributed to a difference in severity of grading: faculty 1 (harsh) versus faculty 2 (relatively kind). Faculty 1 and faculty 2 agreed 72.73% of the time; the resident scorers agreed 81.82% of the time.
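A multirater κ and a pairwise percent agreement of this kind can be sketched as follows. The text does not state which κ variant was used; Fleiss' κ is assumed here because it handles more than 2 raters. The function names and the letter grades in the example are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of one grade per rater."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    category_totals = Counter()
    agreement_sum = 0.0
    for subject in ratings:
        counts = Counter(subject)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this subject.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    observed = agreement_sum / n_subjects
    expected = sum(
        (total / (n_subjects * n_raters)) ** 2 for total in category_totals.values()
    )
    if expected == 1:
        raise ValueError("all ratings fall in one category; kappa is undefined")
    return (observed - expected) / (1 - expected)


def percent_agreement(rater_a, rater_b):
    """Percentage of subjects on which two raters gave the same grade."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical grades from 4 raters for 3 residents.
example = [["A", "A", "B", "A"], ["C", "D", "C", "C"], ["B", "B", "B", "F"]]
print(round(fleiss_kappa(example), 3))
```

As a consistency check on the arithmetic only, agreement on 1 of 11 residents is 9.09%, on 6 of 11 is 54.55%, on 8 of 11 is 72.73%, and on 9 of 11 is 81.82%.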

Discussion

The importance of surgical skill development in resident education is emphasized in the ACGME Core Competencies.23 The ACGME instructed all programs to require residents to gain competency in 6 areas: patient care, interpersonal and communication skills, medical knowledge, professionalism, practice-based learning, and systems-based practice. Although many surgeon educators and residents are focused on these 6 Core Competencies, current standards do not require surgical skills laboratory training and simply require residents to log cases into the ACGME website. Minimal case number recommendations are in place for graduating senior residents, but these numbers are based on averages with no strong scientific basis.

Although sweeping changes in orthopedic residency training went into effect July 1, 2013, this system remains untested and may offer room for improvement. One change is the restructuring of the PGY-1 internship. A basic surgical skills curriculum must include goals, objectives, and assessment metrics; skills used in the initial management of injured patients, including splinting, casting, application of traction devices, and other types of immobilization; and basic operative skills, including soft-tissue management, suturing, bone management, arthroscopy, fluoroscopy, and use of basic orthopedic equipment.21

Orthopedic program directors and residents were recently surveyed regarding the current state of orthopedic motor skills training.24 Three key findings deserve emphasis: There is a lack of objective criteria for evaluating resident performance in the skills laboratory; most program directors who have a laboratory do not understand the associated costs; and the most significant issue for program directors is the financial challenge of operating a motor skills laboratory. The survey findings strongly suggest that proposed changes in skills training should be accompanied by careful cost analysis before widespread implementation.

Although various online demonstrations of entire surgeries are available, as are textbooks describing a generalized approach to musculoskeletal surgery, we assume that, as laid out in the Core Competencies, residents are fine-tuning their surgical skills by actively participating in operating rooms under direct observation of attending physicians. To our knowledge, however, there are no data regarding how often this happens in the operative setting, where volume and efficiency are becoming increasingly scrutinized. There has been much concern over how hour restrictions will affect residents’ total operative experience.25,26 Finally, we have no means to objectively evaluate residents’ surgical skills on graduation.

Other programs have implemented surgical skill simulators, but an orthopedics-specific surgical skills laboratory, to our knowledge, has been discussed in only 1 study.21 Results from randomized controlled trials reported in the general surgery literature have shown that simulation-based training leads to detectable benefits for learners in clinical settings.27-29 Over the past decade, some alternative surgical skills training methods have been adopted in orthopedic surgery as well. These methods include hands-on training in specifically designed surgical skills laboratories using cadaver models or synthetic bones; software tools; and computerized simulators. In recent years, numerous studies reported in the orthopedic literature have examined arthroscopic simulators in residency training.18-20,30-34 However, these studies are arguably more specific to sports subspecialties and thus more pertinent to upper-level trainees.

Our study results suggest that surgical skills laboratory training should be a required aspect of our residents’ training. Although the improvement in the written examination component of the laboratory was less dramatic, the overall knowledge base improved (Table 3). This was especially evident at the PGY-1 level, where written examination scores increased from a presession mean of 80% to a postsession mean of 85%. A larger degree of improvement was found with the ankle fracture model, and there was statistically significant improvement at all training levels, from PGY-1 to PGY-3. Previous work has shown that intensive laboratory-based training can be effective, particularly for first-year residents. Sonnadara and colleagues35 demonstrated that a 30-day intensive surgical skills course effectively helped first-year orthopedic residents develop targeted basic surgical skills. In a follow-up study, Sonnadara and colleagues36 demonstrated that a surgical skills course completed at the beginning of a residency was effective in teaching targeted technical skills, and that skills taught in this manner can have excellent retention rates.

There are limitations inherent in our skills course. The κ agreement in the ankle fracture model was low before and after administration, which we attribute to 1 observer outlier. This could be amended by removing outliers and further objectifying and simplifying the scoring system (A–F). At present, we do not have enough data to determine whether the scores actually improve significantly through the training years or whether they will correlate with operating room experience. Our study had no control group. For future investigations, we are considering having general orthopedic surgeons from the community perform the same scenarios and be graded with the same checklists as a control. Implementation, however, may be a challenge. Neither our written examination nor our ankle fracture model checklist has been validated; validation is one of our next steps. The point system used to score the ankle fracture model was subjectively developed and would benefit from further standardization before conclusions are drawn about true validity.

Conclusion

Orthopedic residency programs, like programs in other surgical specialties, are increasingly focused on teaching and documenting the learning of core competencies, even as work-hour restrictions and demands for clinical efficiency limit the amount of time residents spend in the operating room. We have demonstrated the potential value of an intensive laboratory in improving junior-level residents’ basic surgical skills and knowledge. We will continue to refine our methods, with a goal being to create reproducible models that could be adapted by other orthopedic residency programs and by other surgical educators.

References

1. Schmale GA. More evidence of educational inadequacies in musculoskeletal medicine. Clin Orthop. 2005;(437):251-259.

2. Day CS, Yeh AC, Franko O, Ramirez M, Krupat E. Musculoskeletal medicine: an assessment of the attitudes and knowledge of medical students at Harvard Medical School. Acad Med. 2007;82(5):452-457.

3. Bilderback K, Eggerstedt J, Sadasivan KK, et al. Design and implementation of a system-based course in musculoskeletal medicine for medical students. J Bone Joint Surg Am. 2008;90(10):2292-2300.

4. Freedman KB, Bernstein J. Educational deficiencies in musculoskeletal medicine. J Bone Joint Surg Am. 2002;84(4):604-608.

5. Corbett EC Jr, Elnicki DM, Conaway MR. When should students learn essential physical examination skills? Views of internal medicine clerkship directors in North America. Acad Med. 2008;83(1):96-99.

6. Coady DA, Walker DJ, Kay LJ. Teaching medical students musculoskeletal examination skills: identifying barriers to learning and ways of overcoming them. Scand J Rheumatol. 2004;33(1):47-51.

7. Saleh K, Messner R, Axtell S, Harris I, Mahowald ML. Development and evaluation of an integrated musculoskeletal disease course for medical students. J Bone Joint Surg Am. 2004;86(8):1653-1658.

8. van Empel PJ, Verdam MG, Huirne JA, Bonjer HJ, Meijerink WJ, Scheele F. Open knot-tying skills: resident skills assessed. J Obstet Gynaecol Res. 2013;39(5):1030-1036.

9. Barrier BF, Thompson AB, McCullough MW, Occhino JA. A novel and inexpensive vaginal hysterectomy simulator. Simul Healthc. 2012;7(6):374-379.

10. Liss MA, McDougall EM. Robotic surgical simulation. Cancer J. 2013;19(2):124-129.

11. Stegemann AP, Ahmed K, Syed JR, et al. Fundamental skills of robotic surgery: a multi-institutional randomized controlled trial for validation of a simulation-based curriculum. Urology. 2013;81(4):767-774.

12. Duran C, Bismuth J, Mitchell E. A nationwide survey of vascular surgery trainees reveals trends in operative experience, confidence, and attitudes about simulation. J Vasc Surg. 2013;58(2):524-528.

13. Kuhls DA, Risucci DA, Bowyer MW, Luchette FA. Advanced surgical skills for exposure in trauma: a new surgical skills cadaver course for surgery residents and fellows. J Trauma Acute Care Surg. 2013;74(2):664-670.

14. Sanfey HA, Dunnington GL. Basic surgical skills testing for junior residents: current views of general surgery program directors. J Am Coll Surg. 2011;212(3):406-412.

15. Alvand A, Khan T, Al-Ali S, Jackson WF, Price AJ, Rees JL. Simple visual parameters for objective assessment of arthroscopic skill. J Bone Joint Surg Am. 2012;94(13):e97.

16. Jackson WF, Khan T, Alvand A, et al. Learning and retaining simulated arthroscopic meniscal repair skills. J Bone Joint Surg Am. 2012;94(17):e132.

17. Pernar LI, Smink DS, Hicks G, Peyre SE. Residents can successfully teach basic surgical skills in the simulation center. J Surg Educ. 2012;69(5):617-622.

18. Tuijthof GJ, Visser P, Sierevelt IN, Van Dijk CN, Kerkhoffs GM. Does perception of usefulness of arthroscopic simulators differ with levels of experience? Clin Orthop. 2011;469(6):1701-1708.

19. Martin KD, Cameron K, Belmont PJ, Schoenfeld A, Owens BD. Shoulder arthroscopy simulator performance correlates with resident and shoulder arthroscopy experience. J Bone Joint Surg Am. 2012;94(21):e160.

20. Slade Shantz JA, Leiter JR, Gottschalk T, MacDonald PB. The internal validity of arthroscopic simulators and their effectiveness in arthroscopic education. Knee Surg Sports Traumatol Arthrosc. 2014;22(1):33-40.

21. Roberts S, Menage J, Eisenstein SM. The cartilage end-plate and intervertebral disc in scoliosis: calcification and other sequelae. J Orthop Res. 1993;11(5):747-757.

22. Ruedi TP, Buckley RE, Moran CG. AO Principles of Fracture Management. Stuttgart, Germany: Thieme; 2007.

23. Chen CL, Chen WC, Chiang JH, Ho CF. Interscapular hibernoma: case report and literature review. Kaohsiung J Med Sci. 2011;27(8):348-352.

24. Karam MD, Pedowitz RA, Natividad H, Murray J, Marsh JL. Current and future use of surgical skills training laboratories in orthopaedic resident education: a national survey. J Bone Joint Surg Am. 2013;95(1):e4.

25. Baskies MA, Ruchelsman DE, Capeci CM, Zuckerman JD, Egol KA. Operative experience in an orthopaedic surgery residency program: the effect of work-hour restrictions. J Bone Joint Surg Am. 2008;90(4):924-927.

26. Pappas AJ, Teague DC. The impact of the Accreditation Council for Graduate Medical Education work-hour regulations on the surgical experience of orthopaedic surgery residents. J Bone Joint Surg Am. 2007;89(4):904-909.

27. Palter VN, Grantcharov T, Harvey A, Macrae HM. Ex vivo technical skills training transfers to the operating room and enhances cognitive learning: a randomized controlled trial. Ann Surg. 2011;253(5):886-889.

28. Franzeck FM, Rosenthal R, Muller MK, et al. Prospective randomized controlled trial of simulator-based versus traditional in-surgery laparoscopic camera navigation training. Surg Endosc. 2012;26(1):235-241.

29. Zendejas B, Cook DA, Bingener J, et al. Simulation-based mastery learning improves patient outcomes in laparoscopic inguinal hernia repair: a randomized controlled trial. Ann Surg. 2011;254(3):502-509.

30. Hui Y, Safir O, Dubrowski A, Carnahan H. What skills should simulation training in arthroscopy teach residents? A focus on resident input. Int J Comput Assist Radiol Surg. 2013;8(6):945-953.

31. Butler A, Olson T, Koehler R, Nicandri G. Do the skills acquired by novice surgeons using anatomic dry models transfer effectively to the task of diagnostic knee arthroscopy performed on cadaveric specimens? J Bone Joint Surg Am. 2013;95(3):e15(1-8).

32. Martin KD, Belmont PJ, Schoenfeld AJ, Todd M, Cameron KL, Owens BD. Arthroscopic basic task performance in shoulder simulator model correlates with similar task performance in cadavers. J Bone Joint Surg Am. 2011;93(21):e1271-e1275.

33. Elliott MJ, Caprise PA, Henning AE, Kurtz CA, Sekiya JK. Diagnostic knee arthroscopy: a pilot study to evaluate surgical skills. Arthroscopy. 2012;28(2):218-224.

34. Andersen C, Winding TN, Vesterby MS. Development of simulated arthroscopic skills. Acta Orthop. 2011;82(1):90-95.

35. Sonnadara RR, Van Vliet A, Safir O, et al. Orthopedic boot camp: examining the effectiveness of an intensive surgical skills course. Surgery. 2011;149(6):745-749.

36. Sonnadara RR, Garbedian S, Safir O, et al. Orthopaedic boot camp II: examining the retention rates of an intensive surgical skills course. Surgery. 2012;151(6):803-807.

Author and Disclosure Information

Jonathan M. Christy, MD, Gregory P. Kolovich, MD, Matthew D. Beal, MD, and Joel L. Mayerson, MD

Authors’ Disclosure Statement: The authors report no actual or potential conflict of interest in relation to this article.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
E246-E254
Legacy Keywords
american journal of orthopedics, AJO, original study, online exclusive, pilot study, study, orthopedic, surgical training, training, laboratory, motor skills, surgical skills, education, examination, christy, kolovich, beal, mayerson


Cordless reciprocating saws and drills were purchased to introduce and refine junior residents’ motor skills. Sawbones, 2×4-in sections of wood, and PVC pipe were used in the training. Emphasis was placed on tactile feel and feedback with both sawing and drilling. For the casting and splinting session, we used 4-in fiberglass, 4-in plaster rolls, and cotton soft roll to demonstrate a multitude of common casts and splints (Figure 3). Casts included short- and long-arm casts and short-leg casts. Splinting included coaptation, sugar tong, and ulnar gutter splints for the upper extremity and a short-leg posterior splint for the lower extremity.

 

 

The didactic PowerPoint presentation drew largely from content in chapters of the book AO Principles of Fracture Management.22 Content included condensed, to-the-point high-yield summaries of AO tenets, basic bone healing and biology, and orthopedic implant-design rationale focused on these elementary principles:

◾ Basic screw design, including cortical, cancellous, and locking screw designs.

◾ Evolution of plate osteosynthesis to currently used locking compression plate.

◾ Locking plate principles.

◾ Lag technique.

◾ Plate use: compression mode, neutralization, bridging, buttress, anti-glide.

The suturing portion was performed with thawed ham hocks (Figure 4). This model replicates live tissue layers and allows a layered closure technique as a training tool. Both 0 and 2-0 absorbable suture were available for a layered, deep fascial closure; also available was 2-0 nonabsorbable nylon for the skin. Staple guns were available, as were basic surgical instruments, including quality needle drivers, Adson forceps, and suture scissors. The knots demonstrated included simple, horizontal mattress, vertical mattress, and tension-relieving. One- and 2-hand tying and instrument tying were reinforced.

The final session consisted of surgical instrument identification. A certified orthopedic scrub technician participated. On site were multiple trays, including a basic bone set, a hand-and-foot set, small and large fragment sets, and a hip set. A detailed review of each set was led by the surgical technician. This review was followed by a question-and-answer session with the junior residents. After the session, we ended with the written examination and the ankle fracture model.

Statistical Methods

We report presession and postsession means, modes, and medians as measures of central tendency. With our small sample size, the assumption of a Gaussian distribution is tenuous, and means are susceptible to outliers. Therefore, in addition to reporting means, we include medians and modes, which are less sensitive to outliers. The κ statistic is a measure of interrater agreement for 2 or more raters. We report κ statistics to assess the interrater reliability of the 4 independent observers.
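As an illustration of why medians and modes are reported alongside means, the following minimal sketch summarizes a small set of scores with Python's standard library. The scores are hypothetical and are not study data; the function name `summarize` is an assumption made for this example.

```python
from statistics import mean, median, multimode, stdev


def summarize(scores):
    """Return mean, sample SD, median, and mode(s) for a list of scores."""
    return {
        "mean": mean(scores),
        "sd": stdev(scores),          # sample standard deviation (n - 1)
        "median": median(scores),
        "modes": multimode(scores),   # list, because ties are possible
    }


# Hypothetical percentage scores (NOT study data); 60 acts as an outlier.
example_scores = [60, 88, 92, 96, 96]
summary = summarize(example_scores)

# The single low score pulls the mean (86.4) below the median (92),
# while the median and mode (96) are unaffected by how low it is.
print(summary)
```

With a sample this small, one low score shifts the mean noticeably, which is the situation the paragraph above describes.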

Results

Written Examination

Eleven residents (PGY-1 to PGY-3) completed the examination (Table 1). For the entire group, mean (SD) presession percentile was 87.3 (10.4), median was 88, and mode was 96; mean (SD) was 80 (12.6) for PGY-1, 89.6 (6.7) for PGY-2, and 96 (5.7) for PGY-3. For the entire group, mean (SD) postsession percentile was 92 (8.4), median was 96, and mode was 96; mean (SD) was 85 (10.5) for PGY-1, 96 (4) for PGY-2, and 96 (0) for PGY-3 (Table 2).

 

There was a significant presession–postsession difference in scores among all test takers, regardless of training level (P = .019). The PGY-1 level did not reach statistical significance in improvement from presession to postsession (P = .080); the PGY-2 level also did not reach statistical significance in improvement (P = .099); the PGY-3 level did not have enough participants to calculate a P value based on a paired Student t test.
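The paired comparison named above can be sketched as follows. This is a minimal illustration of the paired Student t statistic using hypothetical scores, not the study data; the function name `paired_t` is an assumption. Converting t to a P value requires the t-distribution with n − 1 degrees of freedom (available in statistical libraries) and is omitted here.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least 2 pairs")
    sd = stdev(diffs)  # sample SD of the within-resident differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical presession and postsession percentage scores (NOT study data).
pre = [72, 80, 88, 84]
post = [80, 84, 96, 96]
t_stat, df = paired_t(pre, post)
print(round(t_stat, 3), df)
```

Note that a group of only 2 residents leaves a single degree of freedom, so very little information is available for such a test.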

Ankle Fracture Model

Actual percentile scores are listed in Table 3. For the entire group, mean (SD) overall presession percentile was 68.6 (13.9), median was 67, and mode was 67; mean (SD) was 58.8 (9.8) for PGY-1, 76.1 (13.6) for PGY-2, and 69.5 (9.8) for PGY-3. For the entire group, mean (SD) postsession percentile was 95.2 (5.2), median was 97, and mode was 97; mean (SD) was 91.8 (6.3) for PGY-1, 97.1 (3.5) for PGY-2, and 97.3 (2.4) for PGY-3.

There was a large and significant presession–postsession difference in scores among all test takers, regardless of training level (P = .03). Each group reached statistical significance in improvement from presession to postsession: PGY-1 (P = .04), PGY-2 (P = .01), and PGY-3 (P = .03).

For κ calculations, we adjusted all scores to ordinal data and thus used a standard grading system:

Score      Grade
90–100     A
80–89      B
70–79      C
60–69      D
0–59       F
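The conversion from a scorecard to a letter grade can be sketched as below. This is an illustration only: the function names are hypothetical, the item scores are invented, and treating each band's lower bound as a threshold (so that a noninteger score such as 89.5 falls in B) is an assumption, because the table does not state how fractional percentages were handled.

```python
def to_percent(item_scores, max_per_item=4):
    """Convert scorecard items (each scored 1 to 4) to a percentage of the maximum."""
    return 100 * sum(item_scores) / (max_per_item * len(item_scores))


def to_grade(percent):
    """Map a percentage score to the ordinal A-F grade used for the kappa analysis."""
    # Assumption: band lower bounds act as thresholds.
    if percent >= 90:
        return "A"
    if percent >= 80:
        return "B"
    if percent >= 70:
        return "C"
    if percent >= 60:
        return "D"
    return "F"


# Hypothetical 15-item scorecard (NOT study data): ten items at 3, five at 4.
scorecard = [3] * 10 + [4] * 5
percent = to_percent(scorecard)   # 50 of 60 points
grade = to_grade(percent)
print(round(percent, 1), grade)
```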

For the presession fracture model, the κ among the 4 independent observational scorers was 0.1148 (Table 4), which is poor based on κ scoring criteria and which we attribute to the particularly harsh grading by 1 observational scorer (faculty 1) relative to the other scorers. Faculty 1 and faculty 2 assigned the same grade only 9.09% of the time, whereas the resident scorers agreed 54.55% of the time. Removing faculty 1 as an outlier raised the κ dramatically, to 0.3125 (fair interobserver agreement).

For the postsession fracture model, the κ among the 4 independent observational scorers improved only marginally, to 0.1156 (still poor), again attributed to a difference in severity of grading: faculty 1 (harsh) versus faculty 2 (relatively kind). Faculty 1 and faculty 2 assigned the same grade 72.73% of the time; the resident scorers agreed 81.82% of the time.
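The two kinds of agreement figures reported here can be sketched as follows. The text does not specify which multirater κ was computed, so the use of Fleiss' κ below is an assumption, and the grades are hypothetical rather than study data. As a point of arithmetic, the reported percentages are consistent with 1, 6, 8, and 9 matching grades out of 11 residents (1/11 ≈ 9.09%, 6/11 ≈ 54.55%, 8/11 ≈ 72.73%, 9/11 ≈ 81.82%).

```python
from collections import Counter

GRADES = ["A", "B", "C", "D", "F"]


def percent_agreement(rater_a, rater_b):
    """Percentage of subjects on which two raters assigned the same grade."""
    if len(rater_a) != len(rater_b):
        raise ValueError("raters must grade the same subjects")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


def fleiss_kappa(ratings, categories=GRADES):
    """Fleiss' kappa; ratings is a list of per-subject lists, one grade per rater."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]
    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    p_i = [
        (sum(c[g] ** 2 for g in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall share of each grade.
    p_j = [sum(c[g] for c in counts) / (n_subjects * n_raters) for g in categories]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)  # undefined if every rating is identical


# Hypothetical grades (NOT study data): 4 subjects, each graded by 4 raters.
ratings = [
    ["A", "A", "A", "A"],
    ["B", "B", "B", "B"],
    ["A", "A", "B", "B"],
    ["C", "C", "C", "C"],
]
kappa = fleiss_kappa(ratings)
rater_1 = [row[0] for row in ratings]
rater_3 = [row[2] for row in ratings]
agreement = percent_agreement(rater_1, rater_3)
print(round(kappa, 4), agreement)
```

Because κ corrects for agreement expected by chance, raw percent agreement and κ can diverge, especially when most grades fall into a single category.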

 

 

Discussion

The importance of surgical skill development in resident education is emphasized in the ACGME Core Competencies.23 The ACGME instructed all programs to require residents to gain competency in 6 areas: patient care, interpersonal and communication skills, medical knowledge, professionalism, practice-based learning, and systems-based practice. Although many surgeon educators and residents are focused on these 6 Core Competencies, current standards do not require surgical skills laboratory training and simply require residents to log cases into the ACGME website. Minimum case number recommendations are in place for graduating senior residents, but these numbers are based on averages with no strong scientific basis.

Although sweeping changes in orthopedic residency training went into effect July 1, 2013, this system remains untested and may offer room for improvement. One change is the restructuring of the PGY-1 internship. A basic surgical skills curriculum must include goals, objectives, and assessment metrics; skills used in the initial management of injured patients, including splinting, casting, application of traction devices, and other types of immobilization; and basic operative skills, including soft-tissue management, suturing, bone management, arthroscopy, fluoroscopy, and use of basic orthopedic equipment.21

Orthopedic program directors and residents were recently surveyed regarding the current state of orthopedic motor skills training.24 Three key findings deserve emphasis: There is a lack of objective criteria for evaluating resident performance in the skills laboratory; most program directors who have a laboratory do not understand the associated costs; and the most significant issue for program directors is the financial challenge of operating a motor skills laboratory. The survey findings strongly suggest that proposed changes in skills training should be accompanied by careful cost analysis before widespread implementation.

Although various online demonstrations of entire surgeries are available, as are textbooks describing a generalized approach to musculoskeletal surgery, we assume that, as laid out in the Core Competencies, residents are fine-tuning their surgical skills by actively participating in operating rooms under direct observation of attending physicians. To our knowledge, however, there are no data regarding how often this happens in the operative setting, where volume and efficiency are becoming increasingly scrutinized. There has been much concern over how work-hour restrictions will affect residents’ total operative experience.25,26 Finally, we have no means to objectively evaluate residents’ surgical skills on graduation.

Other programs have implemented surgical skill simulators, but an orthopedics-specific surgical skills laboratory, to our knowledge, has been discussed in only 1 study.21 Results from randomized controlled trials reported in the general surgery literature have shown that simulation-based training leads to detectable benefits for learners in clinical settings.27-29 Over the past decade, some alternative surgical skills training methods have been adopted in orthopedic surgery as well. These methods include hands-on training in specifically designed surgical skills laboratories using cadaver models or synthetic bones; software tools; and computerized simulators. In recent years, numerous studies reported in the orthopedic literature have examined arthroscopic simulators in residency training.18-20,30-34 However, these studies are arguably more specific to sports subspecialties and thus more pertinent to upper-level trainees.

Our study results suggest that surgical skills laboratory training should be a required aspect of our residents’ training. Although the improvement in the written examination component of the laboratory was less dramatic, the overall knowledge base improved (Table 3). This was especially evident at the PGY-1 level, where written examination scores increased from a presession mean of 80% to a postsession mean of 85%. A larger degree of improvement was found with the ankle fracture model, and there was statistically significant improvement at all training levels, from PGY-1 to PGY-3. Previous work has shown that intensive laboratory-based training can be effective, particularly for first-year residents. Sonnadara and colleagues35 demonstrated that a 30-day intensive surgical skills course effectively helped first-year orthopedic residents develop targeted basic surgical skills. In a follow-up study, Sonnadara and colleagues36 demonstrated that a surgical skills course completed at the beginning of a residency was effective in teaching targeted technical skills, and that skills taught in this manner can have excellent retention rates.

There are limitations inherent in our skills course. The κ agreement in the ankle fracture model was low before and after administration, which we attribute to 1 observer outlier. This could be addressed by removing outliers and by further objectifying and simplifying the scoring system (A–F). At present, we do not have enough data to determine whether the scores actually improve significantly through the training years or whether they will correlate with operating room experience. Our study had no control group. For future investigations, we are considering having general orthopedic surgeons from the community perform the same scenarios and be graded with the same checklists as a control. Implementation, however, may be a challenge. Neither our written examination nor our ankle fracture model checklist has been validated; validation is one of our next steps. The point system used to score the ankle fracture model was subjectively developed and would benefit from further standardization before conclusions about true validity are drawn.

 

 

Conclusion

Orthopedic residency programs, like programs in other surgical specialties, are increasingly focused on teaching and documenting the learning of core competencies, even as work-hour restrictions and demands for clinical efficiency limit the amount of time residents spend in the operating room. We have demonstrated the potential value of an intensive laboratory in improving junior-level residents’ basic surgical skills and knowledge. We will continue to refine our methods, with the goal of creating reproducible models that could be adapted by other orthopedic residency programs and by other surgical educators.

References

1. Schmale GA. More evidence of educational inadequacies in musculoskeletal medicine. Clin Orthop. 2005;(437):251-259.

2. Day CS, Yeh AC, Franko O, Ramirez M, Krupat E. Musculoskeletal medicine: an assessment of the attitudes and knowledge of medical students at Harvard Medical School. Acad Med. 2007;82(5):452-457.

3. Bilderback K, Eggerstedt J, Sadasivan KK, et al. Design and implementation of a system-based course in musculoskeletal medicine for medical students. J Bone Joint Surg Am. 2008;90(10):2292-2300.

4. Freedman KB, Bernstein J. Educational deficiencies in musculoskeletal medicine. J Bone Joint Surg Am. 2002;84(4):604-608.

5. Corbett EC Jr, Elnicki DM, Conaway MR. When should students learn essential physical examination skills? Views of internal medicine clerkship directors in North America. Acad Med. 2008;83(1):96-99.

6. Coady DA, Walker DJ, Kay LJ. Teaching medical students musculoskeletal examination skills: identifying barriers to learning and ways of overcoming them. Scand J Rheumatol. 2004;33(1):47-51.

7. Saleh K, Messner R, Axtell S, Harris I, Mahowald ML. Development and evaluation of an integrated musculoskeletal disease course for medical students. J Bone Joint Surg Am. 2004;86(8):1653-1658.

8. van Empel PJ, Verdam MG, Huirne JA, Bonjer HJ, Meijerink WJ, Scheele F. Open knot-tying skills: resident skills assessed. J Obstet Gynaecol Res. 2013;39(5):1030-1036.

9. Barrier BF, Thompson AB, McCullough MW, Occhino JA. A novel and inexpensive vaginal hysterectomy simulator. Simul Healthc. 2012;7(6):374-379.

10. Liss MA, McDougall EM. Robotic surgical simulation. Cancer J. 2013;19(2):124-129.

11. Stegemann AP, Ahmed K, Syed JR, et al. Fundamental skills of robotic surgery: a multi-institutional randomized controlled trial for validation of a simulation-based curriculum. Urology. 2013;81(4):767-774.

12. Duran C, Bismuth J, Mitchell E. A nationwide survey of vascular surgery trainees reveals trends in operative experience, confidence, and attitudes about simulation. J Vasc Surg. 2013;58(2):524-528.

13. Kuhls DA, Risucci DA, Bowyer MW, Luchette FA. Advanced surgical skills for exposure in trauma: a new surgical skills cadaver course for surgery residents and fellows. J Trauma Acute Care Surg. 2013;74(2):664-670.

14. Sanfey HA, Dunnington GL. Basic surgical skills testing for junior residents: current views of general surgery program directors. J Am Coll Surg. 2011;212(3):406-412.

15. Alvand A, Khan T, Al-Ali S, Jackson WF, Price AJ, Rees JL. Simple visual parameters for objective assessment of arthroscopic skill. J Bone Joint Surg Am. 2012;94(13):e97.

16. Jackson WF, Khan T, Alvand A, et al. Learning and retaining simulated arthroscopic meniscal repair skills. J Bone Joint Surg Am. 2012;94(17):e132.

17. Pernar LI, Smink DS, Hicks G, Peyre SE. Residents can successfully teach basic surgical skills in the simulation center. J Surg Educ. 2012;69(5):617-622.

18. Tuijthof GJ, Visser P, Sierevelt IN, Van Dijk CN, Kerkhoffs GM. Does perception of usefulness of arthroscopic simulators differ with levels of experience? Clin Orthop. 2011;469(6):1701-1708.

19. Martin KD, Cameron K, Belmont PJ, Schoenfeld A, Owens BD. Shoulder arthroscopy simulator performance correlates with resident and shoulder arthroscopy experience. J Bone Joint Surg Am. 2012;94(21):e160.

20. Slade Shantz JA, Leiter JR, Gottschalk T, MacDonald PB. The internal validity of arthroscopic simulators and their effectiveness in arthroscopic education. Knee Surg Sports Traumatol Arthrosc. 2014;22(1):33-40.

21. Roberts S, Menage J, Eisenstein SM. The cartilage end-plate and intervertebral disc in scoliosis: calcification and other sequelae. J Orthop Res. 1993;11(5):747-757.

22. Ruedi TP, Buckley RE, Moran CG. AO Principles of Fracture Management. Stuttgart, Germany: Thieme; 2007.

23. Chen CL, Chen WC, Chiang JH, Ho CF. Interscapular hibernoma: case report and literature review. Kaohsiung J Med Sci. 2011;27(8):348-352.

24. Karam MD, Pedowitz RA, Natividad H, Murray J, Marsh JL. Current and future use of surgical skills training laboratories in orthopaedic resident education: a national survey. J Bone Joint Surg Am. 2013;95(1):e4.

25. Baskies MA, Ruchelsman DE, Capeci CM, Zuckerman JD, Egol KA. Operative experience in an orthopaedic surgery residency program: the effect of work-hour restrictions. J Bone Joint Surg Am. 2008;90(4):924-927.

26. Pappas AJ, Teague DC. The impact of the Accreditation Council for Graduate Medical Education work-hour regulations on the surgical experience of orthopaedic surgery residents. J Bone Joint Surg Am. 2007;89(4):904-909.

27. Palter VN, Grantcharov T, Harvey A, Macrae HM. Ex vivo technical skills training transfers to the operating room and enhances cognitive learning: a randomized controlled trial. Ann Surg. 2011;253(5):886-889.

28. Franzeck FM, Rosenthal R, Muller MK, et al. Prospective randomized controlled trial of simulator-based versus traditional in-surgery laparoscopic camera navigation training. Surg Endosc. 2012;26(1):235-241.

29. Zendejas B, Cook DA, Bingener J, et al. Simulation-based mastery learning improves patient outcomes in laparoscopic inguinal hernia repair: a randomized controlled trial. Ann Surg. 2011;254(3):502-509.

30. Hui Y, Safir O, Dubrowski A, Carnahan H. What skills should simulation training in arthroscopy teach residents? A focus on resident input. Int J Comput Assist Radiol Surg. 2013;8(6):945-953.

31. Butler A, Olson T, Koehler R, Nicandri G. Do the skills acquired by novice surgeons using anatomic dry models transfer effectively to the task of diagnostic knee arthroscopy performed on cadaveric specimens? J Bone Joint Surg Am. 2013;95(3):e15(1-8).

32. Martin KD, Belmont PJ, Schoenfeld AJ, Todd M, Cameron KL, Owens BD. Arthroscopic basic task performance in shoulder simulator model correlates with similar task performance in cadavers. J Bone Joint Surg Am. 2011;93(21):e1271-e1275.

33. Elliott MJ, Caprise PA, Henning AE, Kurtz CA, Sekiya JK. Diagnostic knee arthroscopy: a pilot study to evaluate surgical skills. Arthroscopy. 2012;28(2):218-224.

34. Andersen C, Winding TN, Vesterby MS. Development of simulated arthroscopic skills. Acta Orthop. 2011;82(1):90-95.

35. Sonnadara RR, Van Vliet A, Safir O, et al. Orthopedic boot camp: examining the effectiveness of an intensive surgical skills course. Surgery. 2011;149(6):745-749.

36. Sonnadara RR, Garbedian S, Safir O, et al. Orthopaedic boot camp II: examining the retention rates of an intensive surgical skills course. Surgery. 2012;151(6):803-807.

References

1. Schmale GA. More evidence of educational inadequacies in musculoskeletal medicine. Clin Orthop. 2005;(437):251-259.

2. Day CS, Yeh AC, Franko O, Ramirez M, Krupat E. Musculoskeletal medicine: an assessment of the attitudes and knowledge of medical students at Harvard Medical School. Acad Med. 2007;82(5):452-457.

3. Bilderback K, Eggerstedt J, Sadasivan KK, et al. Design and implementation of a system-based course in musculoskeletal medicine for medical students. J Bone Joint Surg Am. 2008;90(10):2292-2300.

4. Freedman KB, Bernstein J. Educational deficiencies in musculoskeletal medicine. J Bone Joint Surg Am. 2002;84(4):604-608.

5. Corbett EC Jr, Elnicki DM, Conaway MR. When should students learn essential physical examination skills? Views of internal medicine clerkship directors in North America. Acad Med. 2008;83(1):96-99.

6. Coady DA, Walker DJ, Kay LJ. Teaching medical students musculoskeletal examination skills: identifying barriers to learning and ways of overcoming them. Scand J Rheumatol. 2004;33(1):47-51.

7. Saleh K, Messner R, Axtell S, Harris I, Mahowald ML. Development and evaluation of an integrated musculoskeletal disease course for medical students. J Bone Joint Surg Am. 2004;86(8):1653-1658.

8. van Empel PJ, Verdam MG, Huirne JA, Bonjer HJ, Meijerink WJ, Scheele F. Open knot-tying skills: resident skills assessed. J Obstet Gynaecol Res. 2013;39(5):1030-1036.

9. Barrier BF, Thompson AB, McCullough MW, Occhino JA. A novel and inexpensive vaginal hysterectomy simulator. Simul Healthc. 2012;7(6):374-379.

10. Liss MA, McDougall EM. Robotic surgical simulation. Cancer J. 2013;19(2):124-129.

11. Stegemann AP, Ahmed K, Syed JR, et al. Fundamental skills of robotic surgery: a multi-institutional randomized controlled trial for validation of a simulation-based curriculum. Urology. 2013;81(4):767-774.

12. Duran C, Bismuth J, Mitchell E. A nationwide survey of vascular surgery trainees reveals trends in operative experience, confidence, and attitudes about simulation. J Vasc Surg. 2013;58(2):524-528.

13. Kuhls DA, Risucci DA, Bowyer MW, Luchette FA. Advanced surgical skills for exposure in trauma: a new surgical skills cadaver course for surgery residents and fellows. J Trauma Acute Care Surg. 2013;74(2):664-670.

14. Sanfey HA, Dunnington GL. Basic surgical skills testing for junior residents: current views of general surgery program directors. J Am Coll Surg. 2011;212(3):406-412.

15. Alvand A, Khan T, Al-Ali S, Jackson WF, Price AJ, Rees JL. Simple visual parameters for objective assessment of arthroscopic skill. J Bone Joint Surg Am. 2012;94(13):e97.

16. Jackson WF, Khan T, Alvand A, et al. Learning and retaining simulated arthroscopic meniscal repair skills. J Bone Joint Surg Am. 2012;94(17):e132.

17. Pernar LI, Smink DS, Hicks G, Peyre SE. Residents can successfully teach basic surgical skills in the simulation center. J Surg Educ. 2012;69(5):617-622.

18. Tuijthof GJ, Visser P, Sierevelt IN, Van Dijk CN, Kerkhoffs GM. Does perception of usefulness of arthroscopic simulators differ with levels of experience? Clin Orthop. 2011;469(6):1701-1708.

19. Martin KD, Cameron K, Belmont PJ, Schoenfeld A, Owens BD. Shoulder arthroscopy simulator performance correlates with resident and shoulder arthroscopy experience. J Bone Joint Surg Am. 2012;94(21):e160.

20. Slade Shantz JA, Leiter JR, Gottschalk T, MacDonald PB. The internal validity of arthroscopic simulators and their effectiveness in arthroscopic education. Knee Surg Sports Traumatol Arthrosc. 2014;22(1):33-40.

21. Roberts S, Menage J, Eisenstein SM. The cartilage end-plate and intervertebral disc in scoliosis: calcification and other sequelae. J Orthop Res. 1993;11(5):747-757.

22. Ruedi TP, Buckley RE, Moran CG. AO Principles of Fracture Management. Stuttgart, Germany: Thieme; 2007.

23. Chen CL, Chen WC, Chiang JH, Ho CF. Interscapular hibernoma: case report and literature review. Kaohsiung J Med Sci. 2011;27(8):348-352.

24. Karam MD, Pedowitz RA, Natividad H, Murray J, Marsh JL. Current and future use of surgical skills training laboratories in orthopaedic resident education: a national survey. J Bone Joint Surg Am. 2013;95(1):e4.

25. Baskies MA, Ruchelsman DE, Capeci CM, Zuckerman JD, Egol KA. Operative experience in an orthopaedic surgery residency program: the effect of work-hour restrictions. J Bone Joint Surg Am. 2008;90(4):924-927.

26. Pappas AJ, Teague DC. The impact of the Accreditation Council for Graduate Medical Education work-hour regulations on the surgical experience of orthopaedic surgery residents. J Bone Joint Surg Am. 2007;89(4):904-909.

27. Palter VN, Grantcharov T, Harvey A, Macrae HM. Ex vivo technical skills training transfers to the operating room and enhances cognitive learning: a randomized controlled trial. Ann Surg. 2011;253(5):886-889.

28. Franzeck FM, Rosenthal R, Muller MK, et al. Prospective randomized controlled trial of simulator-based versus traditional in-surgery laparoscopic camera navigation training. Surg Endosc. 2012;26(1):235-241.

29. Zendejas B, Cook DA, Bingener J, et al. Simulation-based mastery learning improves patient outcomes in laparoscopic inguinal hernia repair: a randomized controlled trial. Ann Surg. 2011;254(3):502-509.

30. Hui Y, Safir O, Dubrowski A, Carnahan H. What skills should simulation training in arthroscopy teach residents? A focus on resident input. Int J Comput Assist Radiol Surg. 2013;8(6):945-953.

31. Butler A, Olson T, Koehler R, Nicandri G. Do the skills acquired by novice surgeons using anatomic dry models transfer effectively to the task of diagnostic knee arthroscopy performed on cadaveric specimens? J Bone Joint Surg Am. 2013;95(3):e15(1-8).

32. Martin KD, Belmont PJ, Schoenfeld AJ, Todd M, Cameron KL, Owens BD. Arthroscopic basic task performance in shoulder simulator model correlates with similar task performance in cadavers. J Bone Joint Surg Am. 2011;93(21):e1271-e1275.

33. Elliott MJ, Caprise PA, Henning AE, Kurtz CA, Sekiya JK. Diagnostic knee arthroscopy: a pilot study to evaluate surgical skills. Arthroscopy. 2012;28(2):218-224.

34. Andersen C, Winding TN, Vesterby MS. Development of simulated arthroscopic skills. Acta Orthop. 2011;82(1):90-95.

35. Sonnadara RR, Van Vliet A, Safir O, et al. Orthopedic boot camp: examining the effectiveness of an intensive surgical skills course. Surgery. 2011;149(6):745-749.

36. Sonnadara RR, Garbedian S, Safir O, et al. Orthopaedic boot camp II: examining the retention rates of an intensive surgical skills course. Surgery. 2012;151(6):803-807.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
E246-E254
Display Headline
Pilot Study for an Orthopedic Surgical Training Laboratory for Basic Motor Skills
Legacy Keywords
american journal of orthopedics, AJO, original study, online exclusive, pilot study, study, orthopedic, surgical training, training, laboratory, motor skills, surgical skills, education, examination, christy, kolovich, beal, mayerson

The Role of Computed Tomography for Postoperative Evaluation of Percutaneous Sacroiliac Screw Fixation and Description of a “Safe Zone”

Article Type
Changed
Thu, 09/19/2019 - 13:39
Display Headline
The Role of Computed Tomography for Postoperative Evaluation of Percutaneous Sacroiliac Screw Fixation and Description of a “Safe Zone”

Pelvic injuries account for 3% of all skeletal fractures.1 Injury to the sacroiliac (SI) joint is frequently associated with unstable pelvic ring fractures, which are potentially life-threatening injuries. Surgical fixation of these injuries is preferred to nonoperative treatment given the potential for improved reduction and early mobilization and weight-bearing, thereby decreasing perioperative morbidity and improving functional outcome.2

The classic method of surgical fixation of the SI joint consisted of open reduction and internal fixation. This method carried a substantial risk for large dissection, iatrogenic nerve injury, and increased blood loss to the already traumatized patient.3 Percutaneous fixation allows for a shorter operating time, decreased soft-tissue stripping, and decreased blood loss compared with a traditional open procedure.4 However, posterior pelvic anatomy is complex and variable, and reports have found screw misplacements as high as 24%5 and neurologic complication rates up to 18%.6-9 

Various imaging modalities, including fluoroscopy,5 computed tomography (CT),6-7 fluoroscopic CT, and computer-assisted techniques5,9 have been used to achieve proper screw placement. Conventional fluoroscopy is the standard for intraoperative screw placement. However, acceptable reduction of the SI joint and proper implantation of the screws without perforation of the neural foramina is challenging, especially when coupled with difficulties of fluoroscopic imaging and variations in pelvic anatomy. 

Sacral dysplasia has been reported to occur in up to 20% to 40% of the population and has significant implications in patients indicated for iliosacral screw placement.10 Incorrect placement of iliosacral screws may result in iatrogenic neurovascular complications.11-13 Malpositioned screws using fluoroscopic guidance have been reported in 2% to 15% of patients with an incidence of neurologic compromise between 0.5% and 7.7%. As little as 4° of misdirection can result in damage to neurovascular structures.14

At our institution, we routinely obtained postoperative CT to evaluate the placement of SI screws. The objective of this retrospective study is to evaluate the rate of revision surgery of percutaneous SI screw fixation, to determine whether CT is an accurate tool for evaluation of the reduction and the need for revision surgery, and to decide if any violation of the neural foramina is safe.

Materials and Methods

After institutional review board approval, we retrospectively reviewed and evaluated medical records and radiographs of all patients who sustained unstable pelvic ring fractures between July 1, 2005, and June 30, 2010. We identified all patients who were treated with closed reductions and percutaneous iliosacral screw fixation, according to the method described by Routt in 1995.4 We excluded all pelvic fractures in patients who underwent open reduction for the posterior injury or did not have percutaneous SI screws placed, those with spinal injury, and those without follow-up. The 46 patients who met the inclusion criteria comprised 26 men and 20 women with a mean age of 42 years (range, 16 to 73 years). Motor vehicle accidents accounted for 13 cases; 19 were crush injuries and 14 were falls from height. Seventeen patients (37%) met the radiographic criteria for sacral dysmorphism. Forty-two of the 46 patients were polytrauma patients with associated musculoskeletal injuries and/or abdominal, chest, or head injuries.

Six patients presented with some neurologic deficit at the time of injury; all fractures were closed. The initial imaging study included plain anteroposterior (AP), inlet, and outlet radiographs of the pelvis and a pelvic CT scan. Using the classification of Young and Burgess,15 there were 3 vertical shear injuries, 13 lateral compression–type injuries, 17 anterior-posterior–type injuries, 7 sacral fractures, and 6 combination- or unclassifiable-type pelvic injuries. Of the sacral fractures, there were 3 Denis zone 1, 3 Denis zone 2, and 1 Denis zone 3. 

The pelvic CT scan included the entire pelvis from the ilium to the ischial tuberosities. Each scan consisted of either 5.0-mm or 2.5-mm sequential axial images. A picture archiving and communication system (PACS) workstation using Centricity version 2.1 (GE Medical Systems, Waukesha, Wisconsin) was used to analyze each scan with a bone algorithm. On PACS, each initial displacement was characterized by the amount of SI joint widening at the level of S1 and was measured using digital calipers.

Surgery

Mean time to surgery was 4 days (range, 2 to 15 days) after the injury. A total of 51 SI screws were implanted in 46 patients. We achieved closed reduction of the posterior pelvic ring by various techniques, including compression with percutaneous partially threaded screw fixation. In the cases in which the posterior ring lesion was associated with a pure pubic symphysis disruption, the anterior pelvis was initially reduced and stabilized with small-fragment plate fixation (Synthes, Inc, Paoli, Pennsylvania). The posterior complex was stabilized with 1 screw in 41 patients; 2 cases required a transiliac screw, and 2 screws (S1 and S2) were placed in each of the remaining 3 cases. Definitive stabilization of the posterior pelvis was achieved with percutaneous, partially threaded 7.3- or 7.5-mm–diameter cannulated screws (Synthes, Inc, and Zimmer Inc, Warsaw, Indiana, respectively) in 42 fractures and 6.5-mm screws (Synthes, Inc) in 4 fractures. In 11 cases where the fracture was through the sacrum, fully threaded cannulated screws were used to avoid compression. Screw insertion was performed under fluoroscopic guidance with inlet, outlet, and lateral sacral views. One of 2 fellowship-trained trauma surgeons performed the surgeries. Rehabilitation plans were customized to each patient based on concomitant injuries.

Postoperative Assessment

AP, lateral sacral, and inlet and outlet postoperative radiographs were taken in all cases within 24 hours after surgery. Pelvic CT was also obtained within 24 hours of surgery to review reduction and screw placement.

Using the measurement tool on the PACS system, we measured the penetration of the screw into the foramen. Screws were graded as intraosseous (completely contained within the sacral bone), skived (less than 2 mm of partial penetration into the S1 foramen), or extruded (the screw not contained by the bone). Screw penetration of the S1 was evaluated on the radiographic images as well as the axial images of the CT scans.
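The grading rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' software: the function name and the constant are hypothetical, and treating any penetration of 2 mm or more as "extruded" is an assumption, since the text defines that grade only as a screw not contained by the bone.

```python
# Hypothetical helper illustrating the three screw grades described above.
SKIVE_LIMIT_MM = 2.0  # "skived" is defined in the text as less than 2 mm of penetration


def grade_screw(penetration_mm: float) -> str:
    """Return the grade for a measured S1 foramen penetration in millimeters."""
    if penetration_mm < 0:
        raise ValueError("penetration cannot be negative")
    if penetration_mm == 0:
        return "intraosseous"  # completely contained within the sacral bone
    if penetration_mm < SKIVE_LIMIT_MM:
        return "skived"  # partial penetration under the 2-mm limit
    # Assumption: anything at or beyond the limit is treated as extruded.
    return "extruded"
```

For example, a measured penetration of 1.6 mm would be graded as skived under these assumptions, and 3.3 mm as extruded.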

After surgery, the senior orthopedic resident and attending surgeon performed and documented detailed neurologic evaluations. They reviewed the medical record for neurologic deficit following surgical fixation.   

Results

The mean follow-up time was 12 months (range, 8 months to 2 years). Two patients expired secondary to associated injuries. There were no early deaths related to the pelvic surgery. Stable fixation, including bone or ligamentous healing, as well as full weight-bearing status, was noted in every case. No case exhibited loss of reduction or implant failure or infection.

According to Matta’s criteria of anatomic reduction within 1 cm, all patients were found to have satisfactory reductions.7 Six of 46 patients had documented preoperative neurologic deficits. After percutaneous screw fixation, 10 of 46 patients had postoperative neurologic deficit, 2 of which were unchanged from preoperative evaluation. Of the 8 patients with new/altered postoperative neurologic deficit, CT showed neural foramen penetration greater than 2.1 mm in only 2 patients. Both patients underwent screw revision, resulting in improved neurologic deficit. The remaining 4 patients did not have foramen penetration and improved their neurologic function over the course of 2 weeks with return to presurgical status by 6 weeks without necessitating screw removal.

Twenty-three of the 51 screws (45%) had some violation of the S1 foramen on the CT. There were 17 patients with dysmorphic sacrums in whom 21 S1 screws were placed. Eleven of 21 (52%) screws showed some penetration of the S1 foramen on CT. There were 29 patients with normal sacral morphology in whom 30 S1 screws were placed. Twelve of 30 (40%) screws penetrated the S1 foramen. All violations were in the superior one-third position of the foramen. Two of 46 patients (4%; 1 with dysmorphism, 1 without) had a new neurologic deficit associated with the surgery (Table). CT showed sacral foramen penetration, and both screws were revised with a better neurologic examination.
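As a quick arithmetic check, the reported penetration rates can be recomputed from the screw counts given in this paragraph. The snippet below is only a sketch; the variable and function names are hypothetical, and percentages are rounded to the nearest whole number.

```python
def pct(count: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Screw counts as reported in the Results.
dysmorphic = {"screws": 21, "penetrating": 11}
normal = {"screws": 30, "penetrating": 12}

# The subgroup counts should add up to the overall totals (51 screws, 23 penetrating).
total_screws = dysmorphic["screws"] + normal["screws"]
total_penetrating = dysmorphic["penetrating"] + normal["penetrating"]

rates = {
    "all": pct(total_penetrating, total_screws),
    "dysmorphic": pct(dysmorphic["penetrating"], dysmorphic["screws"]),
    "normal": pct(normal["penetrating"], normal["screws"]),
}
```

The subgroup counts sum to the overall figures, and the rounded rates match the percentages quoted in the paragraph.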

High-resolution CTs were obtained in 32 patients, while 14 patients underwent the standard 5.0-mm–cut CTs. Of the 32 patients in whom a 2.5-mm high-resolution CT was obtained, 20 (62.5%) had evidence of screw penetration (Figures 1, 2). All violations of the S1 neural foramen were in the superior portion of the foramen.

When compared with patients who had a 5.0-mm CT, the patients who underwent a high-resolution CT more often showed neural foramen penetration, although the difference was not statistically significant (P = .3). The average screw penetration into the S1 neural foramen measured 3.3 mm (range, 1.6-5.7 mm) in dysmorphic sacrum and 2.7 mm (range, 1.4-7 mm) in normal sacrum. However, in our study, any foramen penetration of less than 2.1 mm on CT did not result in neurologic deficit.

Discussion

Pelvic fractures are fairly common and represent approximately 5% of all trauma admissions and 3% of all skeletal fractures nationwide.1 The current treatment for SI disruption is either nonoperative or operative. Surgical fixation is technically demanding, and surgeons often face a long learning curve because of the limitations of radiographic visualization of the relevant landmarks.16

Letournel17 developed the technique for iliosacral screw fixation for the treatment of posterior pelvic ring injuries, where 1 or 2 large screws (6.5-7.3 mm in diameter) are inserted under fluoroscopic guidance through the ilium, across the SI articulation, and into the superior sacral vertebral bodies using percutaneous techniques. Currently, the standard procedure to accomplish the percutaneous placement of iliosacral screws derives mainly from the technique described by Matta with the C-arm fluoroscopy visualizing the pelvis in 3 views: strict AP, inlet, and outlet views.7

Routt and colleagues4 recommend a strict lateral view of the sacrum, particularly when crossing the narrow zone of the sacral ala. They reported high union rates and accurate placement of the screws.4 Biplanar fluoroscopy has limitations because the intraoperative images are not orthogonal; the average arc between the ideal inlet and outlet views is 67º. Because of the variability in sacral anatomy, CT guidance was recommended by others.2,6,8,18 Operating in a CT suite, however, had other complications: misinterpretation of CT led to “in-out-in” screws, which resulted in neurapraxia.

In our study, we used the technique described by Matta and colleagues for placement of the screws and performed a postoperative CT to evaluate screw placement and to assess pelvic reduction.7 We had a high penetration rate using CT, which increased with better resolution, even though none of the radiographs showed any obvious evidence of misplacement of the screws. Ebraheim and colleagues6 described the relationship of the S1 nerve root in its neural foramen and found it to be approximately 8.7 mm inferior and 7.8 mm medial to the starting point for a pedicle screw. Given these numbers, it is possible that a large amount of skiving can be tolerated contingent on an adequate reduction of the SI joint. 

Because of our high rates of skiving and low rates of neurologic deficit, a new “safe zone” for screw insertion can be expanded to include skiving of the S1 neural foramen up to 3 mm without fear of nerve root injury. However, drilling and screw insertion at higher speeds can also cause neurologic injury secondary to thermal injury or soft tissue being caught up in a rotating drill/screw. 

Postoperative evaluation of percutaneous SI screw placement in our study showed neural foramen penetration in 43% of SI screws, a higher rate than in other studies.14,19,20 Our study showed that screw penetration up to 2 mm does not correlate with neurologic deficit. Iatrogenic neurologic deficit secondary to perforation of the foramina occurred in only 1 patient. Penetration of the foramina in all cases was in the superior portion of the foramen. We propose that there is a safe zone within the S1 neural foramen, and small amounts of penetration in the superior one-third of the foramen on axial CT images do not correlate with neurologic deficit. This potential safe zone is predicated on adequate reduction of the SI joint.

Neural foramen penetration shown on postoperative CT does not necessarily correlate with neurologic deficit. A postoperative CT is not indicated unless there are findings of a postoperative nerve injury. Our ideal screw placement skives the superior S1 foramen allowing for a larger screw diameter in a safe zone.

CT-guided placement has been proposed; however, concerns about radiation exposure, cost, and feasibility, together with outcomes similar to those of fluoroscopic-guided screw placement, have resulted in its falling out of favor.

Iatrogenic nerve injuries are reported to occur in 0% to 6% of all percutaneous SI screw placements.14,21 Risk factors for iatrogenic nerve injury while using fluoroscopic guidance include sacral morphologic abnormalities and the presence of intestinal gas or contrast.22 Although these risks may be minimized with proper use of fluoroscopy, anatomic reduction, and a thorough understanding of the pelvic morphology, the surgeon must be prepared to obtain further studies, such as a CT scan, if there is postoperative neurologic deficit.

Based on our findings, we do not routinely obtain a postoperative CT for SI screw placement, unless there is concern for malreduction or there is neurologic deficit. We also believe that up to 2 mm of foramen penetration is safe and does not result in neurologic deficit.

References

1. Failinger MS, McGanity PL. Unstable fractures of the pelvic ring. J Bone Joint Surg Am. 1992;74(5):781-791.

2. Smith HE, Yuan PS, Sasso R, Papadopolous S, Vaccaro AR. An evaluation of image-guided technologies in the placement of percutaneous iliosacral screws. Spine (Phila Pa 1976). 2006;31(2):234-238.

3. Judet R, Judet J, Letournel E. Fractures of the acetabulum: classification and surgical approaches for open reduction. Preliminary report. J Bone Joint Surg Am. 1964;46(16):1615-1646.

4. Routt ML Jr, Kregor PJ, Simonian PT, Mayo KA. Early results of percutaneous iliosacral screws placed with the patient in the supine position. J Orthop Trauma. 1995;9(3):207-214.

5. Tonetti J, Carrat L, Blendea S, et al. Clinical results of percutaneous pelvic surgery. Computer assisted surgery using ultrasound compared to standard fluoroscopy. Comput Aided Surg. 2001;6(4):204-211.

6. Ebraheim NA, Coombs R, Jackson WT, Rusin JJ. Percutaneous computed tomography-guided stabilization of posterior pelvic fractures. Clin Orthop. 1994;(307):222-228.

7. Keating JF, Werier J, Blachut P, et al. Early fixation of the vertically unstable pelvis: the role of iliosacral screw fixation of the posterior lesion. J Orthop Trauma. 1999;13(2):107-113.

8. Webb LX, de Araujo W, Donofrio P, et al. Electromyography monitoring for percutaneous placement of iliosacral screws. J Orthop Trauma. 2000;14(4):245-254.

9. Barrick EF, O’Mara JW, Lane HE 3rd. Iliosacral screw insertion using computer-assisted CT image guidance: a laboratory study. Comput Aided Surg. 1998;3(6):289-296.

10. Routt ML Jr, Simonian PT, Agnew SG, Mann FA. Radiographic recognition of the sacral alar slope for optimal placement of iliosacral screws: a cadaveric and clinical study. J Orthop Trauma. 1996;10(3):171-177.

11. Altman DT, Jones CB, Routt ML Jr. Superior gluteal artery injury during iliosacral screw placement. J Orthop Trauma. 1999;13(3):220-227.

12. Stephen DJ. Pseudoaneurysm of the superior gluteal arterial system: an unusual cause of pain after a pelvic fracture. J Trauma. 1997;43(1):146-149.

13. Stöckle U, König B, Hofstetter R, Nolte LP, Haas NP. [Navigation assisted by image conversion. An experimental study on pelvic screw fixation] [in German]. Unfallchirurg. 2001;104(3):215-220.

14. Templeman D, Schmidt A, Freese J, Weisman I, et al. Proximity of iliosacral screws to neurovascular structures after internal fixation. Clin Orthop. 1996;(329):194-198.

15. Young JW, Burgess AR, Brumback RJ, Poka A. Pelvic fractures: value of plain radiography in early assessment and management. Radiology. 1986;160(2):445-451.

16. Graves ML, Routt ML Jr. Iliosacral screw placement: are uniplanar changes realistic based on standard fluoroscopic imaging? J Trauma. 2011;7(1):204-208.

17. Letournel E. Pelvic fractures. Injury. 1978;10(2):145-148.

18. Blake-Toker AM, Hawkins L, Nadalo L, et al. CT guided percutaneous fixation of sacroiliac fractures in trauma patients. J Trauma. 2001;51(6):1117-1121.

19. Hinsche AF, Giannoudis PV, Smith RM. Fluoroscopy-based multiplanar image guidance for insertion of sacroiliac screws. Clin Orthop. 2002;(395):135-144.

20. van den Bosch EW, van Zwienen CM, van Vugt AB. Fluoroscopic positioning of sacroiliac screws in 88 patients. J Trauma. 2002;53(1):44-48.

21. Cole JD, Blum DA, Ansel LJ. Outcome after fixation of unstable posterior pelvic ring injuries. Clin Orthop. 1996;(329):160-179.

22. Routt ML Jr, Simonian PT. Closed reduction and percutaneous skeletal fixation of sacral fractures. Clin Orthop. 1996;(329):121-128.

Author and Disclosure Information

Nirmal C. Tejwani, MD, Dima Raskolnikov, BS, Toni McLaurin, MD, and Richelle Takemoto, MD

Authors’ Disclosure Statement: The authors report no actual or potential conflict of interest in relation to this article. 

Issue
The American Journal of Orthopedics - 43(11)
Page Number
513-516

Pelvic injuries account for 3% of all skeletal fractures.1 Injury to the sacroiliac (SI) joint is frequently associated with unstable pelvic ring fractures, which are potentially life-threatening injuries. Surgical fixation of these injuries is preferred to nonoperative treatment given the potential for improved reduction and early mobilization and weight-bearing, thereby decreasing perioperative morbidity and improving functional outcome.2

The classic method of surgical fixation of the SI joint consisted of open reduction and internal fixation. This method carried a substantial risk for large dissection, iatrogenic nerve injury, and increased blood loss to the already traumatized patient.3 Percutaneous fixation allows for a shorter operating time, decreased soft-tissue stripping, and decreased blood loss compared with a traditional open procedure.4 However, posterior pelvic anatomy is complex and variable, and reports have found screw misplacements as high as 24%5 and neurologic complication rates up to 18%.6-9 

Various imaging modalities, including fluoroscopy,5 computed tomography (CT),6,7 fluoroscopic CT, and computer-assisted techniques,5,9 have been used to achieve proper screw placement. Conventional fluoroscopy is the standard for intraoperative screw placement. However, achieving acceptable reduction of the SI joint and implanting the screws without perforating the neural foramina are challenging, especially given the difficulties of fluoroscopic imaging and variations in pelvic anatomy.

Sacral dysplasia has been reported to occur in 20% to 40% of the population and has significant implications in patients indicated for iliosacral screw placement.10 Incorrect placement of iliosacral screws may result in iatrogenic neurovascular complications.11-13 Malpositioned screws using fluoroscopic guidance have been reported in 2% to 15% of patients, with an incidence of neurologic compromise between 0.5% and 7.7%. As little as 4° of misdirection can result in damage to neurovascular structures.14

At our institution, we routinely obtained postoperative CT to evaluate the placement of SI screws. The objective of this retrospective study is to evaluate the rate of revision surgery of percutaneous SI screw fixation, to determine whether CT is an accurate tool for evaluation of the reduction and the need for revision surgery, and to decide if any violation of the neural foramina is safe.

Materials and Methods

After institutional review board approval, we retrospectively reviewed and evaluated medical records and radiographs of all patients who sustained unstable pelvic ring fractures between July 1, 2005, and June 30, 2010. We identified all patients who were treated with closed reduction and percutaneous iliosacral screw fixation, according to the method described by Routt in 1995.4 We excluded patients who underwent open reduction for the posterior injury or did not have percutaneous SI screws placed, those with spinal injury, and those without follow-up. The 46 patients who met the inclusion criteria comprised 26 men and 20 women with a mean age of 42 years (range, 16 to 73 years). Motor vehicle accidents accounted for 13 cases; 19 were crush injuries and 14 were falls from height. Seventeen patients (37%) met the radiographic criteria for sacral dysmorphism. Forty-two of the 46 patients were polytrauma patients with associated musculoskeletal injuries and/or abdominal, chest, or head injuries.

Six patients presented with some neurologic deficit at the time of injury; all fractures were closed. The initial imaging study included plain anteroposterior (AP), inlet, and outlet radiographs of the pelvis and a pelvic CT scan. Using the classification of Young and Burgess,15 there were 3 vertical shear injuries, 13 lateral compression–type injuries, 17 anterior-posterior–type injuries, 7 sacral fractures, and 6 combination- or unclassifiable-type pelvic injuries. Of the sacral fractures, there were 3 Denis zone 1, 3 Denis zone 2, and 1 Denis zone 3. 

The pelvic CT scan included the entire pelvis from the ilium to the ischial tuberosities. Each scan consisted of sequential axial images at either 5.0-mm or 2.5-mm cuts. A picture archiving and communication system (PACS) workstation using Centricity version 2.1 (GE Medical Systems, Waukesha, Wisconsin) was used to analyze each scan with a bone algorithm. On PACS, each initial displacement was characterized by the amount of SI joint widening at the level of S1, measured using digital calipers.

Surgery

Mean time to surgery was 4 days (range, 2 to 15 days) after the injury. A total of 51 SI screws were implanted in 46 patients. We achieved closed reduction of the posterior pelvic ring by various techniques, including compression with percutaneous partially threaded screw fixation. In the cases in which the posterior ring lesion was associated with a pure pubic symphysis disruption, the anterior pelvis was initially reduced and stabilized with small-fragment plate fixation (Synthes, Inc, Paoli, Pennsylvania). The posterior complex was stabilized with 1 screw in 41 patients; 2 cases required a transiliac screw, and 2 screws (S1 and S2) were placed in each of the remaining 3 cases. Definitive stabilization of the posterior pelvis was achieved with percutaneous, partially threaded 7.3- or 7.5-mm–diameter cannulated screws (Synthes, Inc, and Zimmer Inc, Warsaw, Indiana, respectively) in 42 fractures and 6.5-mm screws (Synthes, Inc) in 4 fractures. In 11 cases where the fracture was through the sacrum, fully threaded cannulated screws were used to avoid compression. Screw insertion was performed under fluoroscopic guidance with inlet, outlet, and lateral sacral views. One of 2 fellowship-trained trauma surgeons performed the surgeries. Rehabilitation plans were customized to each patient based on concomitant injuries.

Postoperative Assessment

AP, lateral sacral, and inlet and outlet postoperative radiographs were taken in all cases within 24 hours after surgery. Pelvic CT was also obtained within 24 hours of surgery to review reduction and screw placement.

Using the measurement tool on the PACS workstation, we measured the penetration of the screw into the foramen. Screws were graded as intraosseous (completely contained within the sacral bone), skived (less than 2 mm of partial penetration into the S1 foramen), or extruded (the screw not contained by the bone). Screw penetration of the S1 foramen was evaluated on the radiographic images as well as the axial images of the CT scans.
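The three-level grading described above can be written as a simple rule. The following is only an illustrative sketch, not the authors' software: the function name `grade_screw` and the parameter `skive_limit_mm` are hypothetical, and treating any penetration of 2 mm or more as "extruded" is an assumption, since the text defines extrusion only as the screw not being contained by the bone.

```python
def grade_screw(penetration_mm: float, skive_limit_mm: float = 2.0) -> str:
    """Grade a screw from its measured penetration into the S1 foramen.

    ASSUMPTION: penetration at or above skive_limit_mm is graded "extruded";
    the article gives no numeric cutoff for that grade.
    """
    if penetration_mm < 0:
        raise ValueError("penetration cannot be negative")
    if penetration_mm == 0:
        # Completely contained within the sacral bone.
        return "intraosseous"
    if penetration_mm < skive_limit_mm:
        # Partial penetration of less than 2 mm.
        return "skived"
    return "extruded"


# Hypothetical measurements in millimeters, for illustration only.
for depth in (0.0, 1.6, 3.3):
    print(depth, grade_screw(depth))
```

The cutoff is a parameter so that a different threshold can be tried without changing the rule itself.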

After surgery, the senior orthopedic resident and attending surgeon performed and documented detailed neurologic evaluations. We reviewed the medical records for neurologic deficit following surgical fixation.

Results

The mean follow-up time was 12 months (range, 8 months to 2 years). Two patients died of associated injuries. There were no early deaths related to the pelvic surgery. Stable fixation, including bone or ligamentous healing, as well as full weight-bearing status, was noted in every case. No case exhibited loss of reduction, implant failure, or infection.

According to Matta’s criteria of anatomic reduction within 1 cm, all patients were found to have satisfactory reductions.7 Six of 46 patients had documented preoperative neurologic deficits. After percutaneous screw fixation, 10 of 46 patients had a postoperative neurologic deficit; in 2 of these patients, the deficit was unchanged from the preoperative evaluation. Of the 8 patients with a new or altered postoperative neurologic deficit, CT showed neural foramen penetration greater than 2.1 mm in only 2 patients. Both patients underwent screw revision, after which their neurologic deficits improved. The remaining 4 patients did not have foramen penetration; their neurologic function improved over the course of 2 weeks and returned to presurgical status by 6 weeks without screw removal.

Twenty-three of the 51 screws (45%) had some violation of the S1 foramen on CT. There were 17 patients with dysmorphic sacrums, in whom 21 S1 screws were placed; 11 of these 21 screws (52%) showed some penetration of the S1 foramen on CT. There were 29 patients with normal sacral morphology, in whom 30 S1 screws were placed; 12 of these 30 screws (40%) penetrated the S1 foramen. All violations were in the superior one-third of the foramen. Two of 46 patients (4%; 1 with dysmorphism, 1 without) had a new neurologic deficit associated with the surgery (Table). In both, CT showed sacral foramen penetration, and both screws were revised, with subsequent improvement on neurologic examination.
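The proportions in this paragraph can be rechecked from the reported counts. The short sketch below is only an arithmetic check; the helper name `pct` and the dictionary labels are illustrative and do not come from the study. It confirms that the subgroup counts add up to the totals and reproduces the rounded percentages.

```python
def pct(numerator: int, denominator: int) -> int:
    """Return a proportion as a percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Counts as reported in the Results: (events, total).
counts = {
    "all screws with S1 foramen violation": (23, 51),
    "screws in dysmorphic sacrums": (11, 21),
    "screws in normal sacrums": (12, 30),
    "patients with new neurologic deficit": (2, 46),
}

for label, (events, total) in counts.items():
    print(f"{label}: {events}/{total} = {pct(events, total)}%")

# The subgroups should account for every screw and every patient.
assert 11 + 12 == 23 and 21 + 30 == 51
assert 17 + 29 == 46
```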

High-resolution CTs were obtained in 32 patients, while 14 patients underwent the standard 5.0-mm–cut CTs. Of the 32 patients in whom a 2.5-mm high-resolution CT was obtained, 20 (62.5%) had evidence of screw penetration (Figures 1, 2). All violations of the S1 neural foramen were in the superior portion of the foramen.

When compared with patients who had a 5.0-mm CT, the patients who underwent a high-resolution CT were more likely to show neural foramen penetration (P = .3). The average screw penetration into the S1 neural foramen measured 3.3 mm (range, 1.6-5.7 mm) in dysmorphic sacrum and 2.7 mm  (range, 1.4-7 mm) in normal sacrum. However, in our study, any foramen penetration of less than 2.1 mm on CT did not result in neurologic deficit.  

Discussion

Pelvic fractures are fairly common and represent approximately 5% of all trauma admissions and 3% of all skeletal fractures nationwide.1 The current treatment for SI disruption is either nonoperative or operative. Surgical fixation is technically demanding and has a long learning curve because of the limitations of radiographic visualization of the relevant landmarks.16

Letournel17 developed the technique of iliosacral screw fixation for the treatment of posterior pelvic ring injuries, in which 1 or 2 large screws (6.5-7.3 mm in diameter) are inserted percutaneously under fluoroscopic guidance through the ilium, across the SI articulation, and into the superior sacral vertebral bodies. Currently, the standard procedure for percutaneous placement of iliosacral screws derives mainly from the technique described by Matta, in which C-arm fluoroscopy visualizes the pelvis in 3 views: strict AP, inlet, and outlet.7

Routt and colleagues4 recommend a strict lateral view of the sacrum, particularly when crossing the narrow zone of the sacral ala. They reported high union rates and accurate placement of the screws.4 Biplanar fluoroscopy has limitations because the intraoperative images are not orthogonal; the average arc between the ideal inlet and outlet views is 67º. Because of the variability in sacral anatomy, others have recommended CT guidance.2,6,8,18 However, operating in a CT suite had its own complications: misinterpretation of CT led to “in-out-in” screws, which resulted in neurapraxia.

In our study, we used the technique described by Matta and colleagues for placement of the screws and performed a postoperative CT to evaluate screw placement and to assess pelvic reduction.7 We had a high penetration rate using CT, which increased with better resolution, even though none of the radiographs showed any obvious evidence of misplacement of the screws. Ebraheim and colleagues6 described the relationship of the S1 nerve root in its neural foramen and found it to be approximately 8.7 mm inferior and 7.8 mm medial to the starting point for a pedicle screw. Given these numbers, it is possible that a large amount of skiving can be tolerated contingent on an adequate reduction of the SI joint. 

Because of our high rates of skiving and low rates of neurologic deficit, the “safe zone” for screw insertion can be expanded to include skiving of the S1 neural foramen up to 3 mm without fear of nerve root injury. However, drilling and screw insertion at higher speeds can also cause neurologic injury, either through thermal injury or through soft tissue being caught in a rotating drill or screw.

Postoperative evaluation of percutaneous SI screw placement in our study showed neural foramen penetration in 43% of SI screws, which is higher than reported in other studies.14,19,20 Our study showed that screw penetration up to 2 mm does not correlate with neurologic deficit. Iatrogenic neurologic deficit secondary to perforation of the foramina occurred in only 1 patient. Penetration of the foramina in all cases was in the superior portion of the foramen. We propose that there is a safe zone within the S1 neural foramen, and small amounts of penetration in the superior one-third of the foramen on axial CT images do not correlate with neurologic deficit. This potential safe zone is predicated on adequate reduction of the SI joint.

Neural foramen penetration shown on postoperative CT does not necessarily correlate with neurologic deficit. A postoperative CT is not indicated unless there are findings of a postoperative nerve injury. Our ideal screw placement skives the superior S1 foramen allowing for a larger screw diameter in a safe zone.

CT-guided placement has been proposed; however, concerns about radiation exposure, cost, and feasibility, together with outcomes similar to those of fluoroscopic-guided screw placement, have resulted in its falling out of favor.

Iatrogenic nerve injuries are reported to occur in 0% to 6% of all percutaneous SI screw placements.14,21 Risk factors for iatrogenic nerve injury while using fluoroscopic guidance include sacral morphologic abnormalities and the presence of intestinal gas or contrast.22 Although these risks may be minimized with proper use of fluoroscopy, anatomic reduction, and a thorough understanding of pelvic morphology, the surgeon must be prepared to obtain further studies, such as a CT scan, if there is a postoperative neurologic deficit.

Based on our findings, we do not routinely obtain a postoperative CT for SI screw placement, unless there is concern for malreduction or there is neurologic deficit. We also believe that up to 2 mm of foramen penetration is safe and does not result in neurologic deficit.

References

1. Failinger MS, McGanity PL. Unstable fractures of the pelvic ring. J Bone Joint Surg Am. 1992;74(5):781-791.

2. Smith HE, Yuan PS, Sasso R, Papadopolous S, Vaccaro AR. An evaluation of image-guided technologies in the placement of percutaneous iliosacral screws. Spine (Phila Pa 1976). 2006;31(2):234-238.

3. Judet R, Judet J, Letournel E. Fractures of the acetabulum: classification and surgical approaches for open reduction. Preliminary report. J Bone Joint Surg Am. 1964;46(16):1615-1646.

4. Routt ML Jr, Kregor PJ, Simonian PT, Mayo KA. Early results of percutaneous iliosacral screws placed with the patient in the supine position. J Orthop Trauma. 1995;9(3):207-214.

5. Tonetti J, Carrat L, Blendea S, et al. Clinical results of percutaneous pelvic surgery. Computer assisted surgery using ultrasound compared to standard fluoroscopy. Comput Aided Surg. 2001;6(4):204-211.

6. Ebraheim NA, Coombs R, Jackson WT, Rusin JJ. Percutaneous computed tomography-guided stabilization of posterior pelvic fractures. Clin Orthop. 1994;(307):222-228.

7. Keating JF, Werier J, Blachut P, et al. Early fixation of the vertically unstable pelvis: the role of iliosacral screw fixation of the posterior lesion. J Orthop Trauma. 1999;13(2):107-113.

8. Webb LX, de Araujo W, Donofrio P, et al. Electromyography monitoring for percutaneous placement of iliosacral screws. J Orthop Trauma. 2000;14(4):245-254.

9. Barrick EF, O’Mara JW, Lane HE 3rd. Iliosacral screw insertion using computer-assisted CT image guidance: a laboratory study. Comput Aided Surg. 1998;3(6):289-296.

10. Routt ML Jr, Simonian PT, Agnew SG, Mann FA. Radiographic recognition of the sacral alar slope for optimal placement of iliosacral screws: a cadaveric and clinical study. J Orthop Trauma. 1996;10(3):171-177.

11. Altman DT, Jones CB, Routt ML Jr. Superior gluteal artery injury during iliosacral screw placement. J Orthop Trauma. 1999;13(3):220-227.

12. Stephen DJ. Pseudoaneurysm of the superior gluteal arterial system: an unusual cause of pain after a pelvic fracture. J Trauma. 1997;43(1):146-149.

13. Stöckle U, König B, Hofstetter R, Nolte LP, Haas NP. [Navigation assisted by image conversion. An experimental study on pelvic screw fixation] [in German]. Unfallchirurg. 2001;104(3):215-220.

14. Templeman D, Schmidt A, Freese J, Weisman I, et al. Proximity of iliosacral screws to neurovascular structures after internal fixation. Clin Orthop. 1996;(329):194-198.

15. Young JW, Burgess AR, Brumback RJ, Poka A. Pelvic fractures: value of plain radiography in early assessment and management. Radiology. 1986;160(2):445-451.

16. Graves ML, Routt ML Jr. Iliosacral screw placement: are uniplanar changes realistic based on standard fluoroscopic imaging? J Trauma. 2011;7(1):204-208.

17. Letournel E. Pelvic fractures. Injury. 1978;10(2):145-148.

18. Blake-Toker AM, Hawkins L, Nadalo L, et al. CT guided percutaneous fixation of sacroiliac fractures in trauma patients. J Trauma. 2001;51(6):1117-1121.

19. Hinsche AF, Giannoudis PV, Smith RM. Fluoroscopy-based multiplanar image guidance for insertion of sacroiliac screws. Clin Orthop. 2002;(395):135-144.

20. van den Bosch EW, van Zwienen CM, van Vugt AB. Fluoroscopic positioning of sacroiliac screws in 88 patients. J Trauma. 2002;53(1):44-48.

21. Cole JD, Blum DA, Ansel LJ. Outcome after fixation of unstable posterior pelvic ring injuries. Clin Orthop. 1996;(329):160-179.

22. Routt ML Jr, Simonian PT. Closed reduction and percutaneous skeletal fixation of sacral fractures. Clin Orthop. 1996;(329):121-128.

References

1. Failinger MS, McGanity PL. Unstable fractures of the pelvic ring. J Bone and Joint Surg Am. 1992;74(5):781-791.

2. Smith HE, Yuan PS, Sasso R, Papadopolous S, Vaccaro AR. An evaluation of image-guided technologies in the placement of percutaneous iliosacral screws. Spine (Phila Pa 1976). 2006;31(2):234-238.

3. Judet R, Judet J, Letournel E. Fractures of the acetabulum: classification and surgical approaches for open reduction. Preliminary report. J Bone Joint Surg Am. 1964;46(16):1615-1646.

4. Routt ML Jr, Kregor PJ, Simonian PT, Mayo KA. Early results of percutaneous iliosacral screws placed with the patient in the supine position. J Orthop Trauma. 1995;9(3):207-214.

5. Tonetti J, Carrat L, Blendea S, et al. Clinical results of percutaneous pelvic surgery. Computer assisted surgery using ultrasound compared to standard fluoroscopy. Comput Aided Surg. 2001;6(4):204-211.

6. Ebraheim NA, Coombs R, Jackson WT, Rusin JJ. Percutaneous computed tomography-guided stabilization of posterior pelvic fractures. Clin Orthop. 1994;(307):222-228.

7. Keating JF, Werier J, Blachut P, et al. Early fixation of the vertically unstable pelvis: the role of iliosacral screw fixation of the posterior lesion. J Orthop Trauma. 1999;13(2):107-113.

8. Webb LX, de Araujo W, Donofrio P, et al. Electromyography monitoring for percutaneous placement of iliosacral screws. J Orthop Trauma. 2000;14(4):245-254.

9. Barrick EF, O’Mara JW, Lane HE 3rd. Iliosacral screw insertion using computer-assisted CT image guidance: a laboratory study. Comput Aided Surg. 1998;3(6):289-296.

10. Routt ML Jr, Simonian PT, Agnew SG, Mann FA. Radiographic recognition of the sacral alar slope for optimal placement of iliosacral screws: a cadaveric and clinical study. J Orthop Trauma. 1996;10(3):171-177.

11. Altman DT, Jones CB, Routt ML Jr. Superior gluteal artery injury during iliosacral screw placement. J Orthop Trauma. 1999;13(3):220-227.

12. Stephen DJ. Pseudoaneurysm of the superior gluteal arterial system: an unusual cause of pain after a pelvic fracture. J Trauma. 1997;43(1):146-149.

13. Stöckle U, König B, Hofstetter R, Nolte LP, Haas NP. [Navigation assisted by image conversion. An experimental study on pelvic screw fixation]
[in German]. Unfallchirurg. 2001;104(3):215-220.

14. Templeman D, Schmidt A, Freese J, Weisman I, et al. Proximity of iliosacral screws to neurovascular structures after internal fixation. Clin Orthop. 1996;(329):194-198.

15. Young JW, Burgess AR, Brumback RJ, Poka A. Pelvic fractures: value of plain radiography in early assessment and management. Radiology. 1986;160(2):445-451.

16. Graves ML, Routt ML Jr. Iliosacral screw placement: are uniplanar changes realistic based on standard fluoroscopic imaging? J Trauma. 2011;7(1):204-208.

17. Letournel E. Pelvic fractures. Injury. 1978;10(2):145-148.

18. Blake-Toker AM, Hawkins L, Nadalo L, et al. CT guided percutaneous fixation of sacroiliac fractures in trauma patients. J Trauma. 2001;51(6):1117-1121.

19. Hinsche AF, Giannoudis PV, Smith RM. Fluoroscopy-based multiplanar image guidance for insertion of sacroiliac screws. Clin Orthop. 2002;(395):135-144.

20. van den Bosch EW, van Zwienen CM, van Vugt AB. Fluoroscopic positioning of sacroiliac screws in 88 patients. J Trauma. 2002;53(1):44-48.

21. Cole JD, Blum DA, Ansel LJ. Outcome after fixation of unstable posterior pelvic ring injuries. Clin Orthop. 1996;(329):160-179.

22. Routt ML Jr, Simonian PT. Closed reduction and percutaneous skeletal fixation of sacral fractures. Clin Orthop. 1996;(329):121-128.

Issue
The American Journal of Orthopedics - 43(11)
Page Number
513-516
Display Headline
The Role of Computed Tomography for Postoperative Evaluation of Percutaneous Sacroiliac Screw Fixation and Description of a “Safe Zone”
Legacy Keywords
american journal of orthopedics, AJO, original study, study, computed tomography, CT, sacroiliac screw fixation, SI, sacroiliac, safe zone, joint, injuries, injury, surgery, CT scans, soft-tissue, SI joint, tejwani, raskolnikov, mclaurin, takemoto

Return Visits to Pediatric EDs

Article Type
Changed
Sun, 05/21/2017 - 13:39
Display Headline
Prevalence and predictors of return visits to pediatric emergency departments

Returns to the hospital following recent encounters, such as an admission to the inpatient unit or evaluation in an emergency department (ED), may reflect the natural progression of a disease, the quality of care received during the initial admission or visit, or the quality of the underlying healthcare system.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] Although national attention has focused on hospital readmissions,[3, 4, 5, 6, 7, 11, 12] ED revisits are a source of concern to emergency physicians.[8, 9] Some ED revisits are medically necessary, but revisits that may be managed in the primary care setting contribute to ED crowding, can be stressful to patients and providers, and increase healthcare costs.[10, 11, 12] Approximately 27 million annual ED visits are made by children, accounting for over one‐quarter of all ED visits in the United States, with a reported ED revisit rate of 2.5% to 5.2%.[2, 13, 14, 15, 16, 17, 18, 19, 20] Improved understanding of the patient‐level or visit‐level factors associated with ED revisits may provide an opportunity to enhance disposition decision making at the index visit and optimize site of and communication around follow‐up care.

Previous studies on ED revisits have largely been conducted in single centers and have used variable visit intervals ranging between 48 hours and 30 days.[2, 13, 16, 18, 21, 22, 23, 24, 25] Two national studies used the National Hospital Ambulatory Medical Care Survey, which includes data from both general and pediatric EDs.[13, 14] Factors associated with increased odds of returning were young age, higher acuity, chronic conditions, and public insurance. One national study identified some diagnoses associated with a higher likelihood of returning,[13] whereas the other focused primarily on infectious disease–related diagnoses.[14]

The purpose of this study was to describe the prevalence of return visits specifically to pediatric EDs and to investigate patient‐level, visit‐level, and healthcare system–related factors that may be associated with return visits and hospitalization at return.

METHODS

Study Design and Data Source

This retrospective cohort study used data from the Pediatric Health Information System (PHIS), an administrative database with data from 44 tertiary care pediatric hospitals in 27 US states and the District of Columbia. This database contains patient demographics, diagnoses, and procedures as well as medications, diagnostic imaging, laboratory, and supply charges for each patient. Data are deidentified prior to inclusion; encrypted medical record numbers allow for the identification of individual patients across all ED visits and hospitalizations to the same hospital. The Children's Hospital Association (Overland Park, KS) and participating hospitals jointly assure the quality and integrity of the data. This study was approved by the institutional review board at Boston Children's Hospital with a waiver for informed consent granted.

Study Population and Protocol

To standardize comparisons across the hospitals, we included data from 23 of the 44 hospitals in PHIS; 7 were excluded for not including ED‐specific data. For institutions that collect information from multiple hospitals within their healthcare system, we included only records from the main campus or children's hospital when possible, leading to the exclusion of 9 hospitals where the data were not able to be segregated. As an additional level of data validation, we compared the hospital‐level ED volume and admission rates as reported in the PHIS to those reported to a separate database (the Pediatric Analysis and Comparison Tool). We further excluded 5 hospitals whose volume differed by >10% between these 2 data sources.

Patients <18 years of age who were discharged from these EDs following their index visit in 2012 formed the eligible cohort.

Key Outcome Measures

The primary outcomes were return visits within 72 hours of discharge from the ED, and return visits resulting in hospitalization, including observation status. We defined an ED revisit as a return within 72 hours of ED discharge regardless of whether the patient was subsequently discharged from the ED on the return visit or hospitalized. We assessed revisits within 72 hours of an index ED discharge, because return visits within this time frame are likely to be related to the index visit.[2, 13, 16, 21, 22, 24, 25, 26]
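The outcome definition above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the record layout (`patient_id`, `arrival`, `discharge`, `disposition`) and the function name `find_revisits` are hypothetical, the sample records are invented for demonstration, and each consecutive pair of visits is treated as a candidate index/return pair, which simplifies how a study cohort would actually be assembled.

```python
from datetime import datetime, timedelta

# 72-hour window taken from the outcome definition in the text.
REVISIT_WINDOW = timedelta(hours=72)


def find_revisits(visits):
    """Return (index_visit, return_visit) pairs for one hospital's ED visits.

    `visits` is a list of dicts with hypothetical keys: patient_id, arrival,
    discharge (time of leaving the ED), and disposition ("discharged" or
    "admitted"). A return counts when the patient was discharged at the
    earlier visit and arrives again within 72 hours of that discharge.
    """
    by_patient = {}
    for visit in visits:
        by_patient.setdefault(visit["patient_id"], []).append(visit)

    pairs = []
    for patient_visits in by_patient.values():
        patient_visits.sort(key=lambda v: v["arrival"])
        for earlier, later in zip(patient_visits, patient_visits[1:]):
            if earlier["disposition"] != "discharged":
                continue  # only ED discharges can be index visits
            gap = later["arrival"] - earlier["discharge"]
            if timedelta(0) <= gap <= REVISIT_WINDOW:
                pairs.append((earlier, later))
    return pairs


# Invented sample records, for illustration only.
sample_visits = [
    {"patient_id": "A", "arrival": datetime(2012, 3, 1, 8, 0),
     "discharge": datetime(2012, 3, 1, 11, 0), "disposition": "discharged"},
    {"patient_id": "A", "arrival": datetime(2012, 3, 3, 10, 0),
     "discharge": datetime(2012, 3, 3, 15, 0), "disposition": "admitted"},
    {"patient_id": "B", "arrival": datetime(2012, 3, 1, 9, 0),
     "discharge": datetime(2012, 3, 1, 12, 0), "disposition": "discharged"},
    {"patient_id": "B", "arrival": datetime(2012, 3, 6, 9, 0),
     "discharge": datetime(2012, 3, 6, 10, 0), "disposition": "discharged"},
]

revisit_pairs = find_revisits(sample_visits)
admitted_on_return = [later for _, later in revisit_pairs
                      if later["disposition"] == "admitted"]
```

In this sketch, patient A returns 47 hours after discharge and is counted as a revisit resulting in hospitalization, whereas patient B returns after more than 72 hours and is not counted.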

Factors Associated With ED Revisits

A priori, we chose to adjust for the following patient‐level factors: age (<30 days, 30 days to <1 year, 1–4 years, 5–11 years, 12–17 years), gender, and socioeconomic status (SES) measured as the zip code–based median household income, obtained from the 2010 US Census, with respect to the federal poverty level (FPL) (<1.5 FPL, 1.5–2 FPL, 2–3 FPL, and >3 FPL).[27] We also adjusted for insurance type (commercial, government, or other), proximity of patient's home zip code to hospital (modeled as the natural log of the geographical distance to patient's home address from the hospital), ED diagnosis‐based severity classification system score (1=low severity, 5=high severity),[28] presence of a complex chronic condition at the index or prior visits using a validated classification scheme,[15, 29, 30, 31] and primary care physician (PCP) density per 100,000 in the patient's residential area (modeled as quartiles: very low, <57.2; low, 57.2–67.9; medium, 68.0–78.7; high, >78.8). PCP density, defined by the Dartmouth Atlas of Health Care,[32, 33, 34] is the number of primary care physicians per 100,000 residents (PCP count) in federal health service areas (HSA). Patients were assigned to a corresponding HSA based on their home zip code.

Visit‐level factors included arrival time of index visit (8:01 am–4:00 pm, 4:01 pm–12:00 am, 12:01 am–8:00 am representing day, evening, and overnight arrival, respectively), day of the week, season, length of stay (LOS) in the ED during the index visit, and ED crowding (calculated as the average daily LOS/yearly average LOS for the individual ED).[35] We categorized the ED primary diagnosis for each visit using the major diagnosis groupings of a previously described pediatric ED‐specific classification scheme.[36] Using International Classification of Diseases, Ninth Revision (ICD‐9) codes, we identified the conditions with the highest ED revisit rates.
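As an illustration of two of the visit‐level variables described above, the sketch below bins an arrival time into the day, evening, or overnight shift and computes the crowding score as a simple ratio. The function names are hypothetical and the handling of seconds is an assumption; the text specifies the shift boundaries only to the minute.

```python
from datetime import time


def arrival_shift(arrival):
    """Classify an arrival time using the shift boundaries given in the text.

    Day: 8:01 am-4:00 pm; evening: 4:01 pm-12:00 am; overnight: 12:01 am-8:00 am.
    Seconds are dropped first (an assumption), so 4:00:30 pm counts as 4:00 pm.
    """
    t = arrival.replace(second=0, microsecond=0)
    if time(8, 1) <= t <= time(16, 0):
        return "day"
    if t >= time(16, 1) or t == time(0, 0):  # midnight closes the evening shift
        return "evening"
    return "overnight"


def crowding_score(daily_mean_los_hours, yearly_mean_los_hours):
    """Average daily LOS divided by the yearly average LOS for the same ED."""
    return daily_mean_los_hours / yearly_mean_los_hours


shift_example = arrival_shift(time(23, 15))
score_example = crowding_score(4.0, 2.0)
```

Under this definition a score of 1 means the day's average LOS equals the yearly average for that ED, and a score of 2 means the day's average LOS was twice the yearly average.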

Statistical Analyses

Categorical variables describing the study cohort were summarized using frequencies and percentages. Continuous variables were summarized using mean, median, and interquartile range values, where appropriate. We used 2 different hierarchical logistic regression models to assess revisit rates by patient‐ and visit‐level characteristics. The initial model included all patients discharged from the ED following the index visit and assessed for the outcome of a revisit within 72 hours. The second model considered only patients who returned within 72 hours of an index visit and assessed for hospitalization on that return visit. We used generalized linear mixed effects models, with hospital as a random effect to account for the presence of correlated data (within hospitals), nonconstant variability (across hospitals), and binary responses. Adjusted odds ratios with 95% confidence intervals were used as summary measures of the effect of the individual adjusters. Adjusters were missing in fewer than 5% of patients across participating hospitals. Statistical analyses were performed using SAS version 9.3 (SAS Institute Inc., Cary, NC); 2‐sided P values <0.004 were considered statistically significant to account for multiple comparisons (Bonferroni‐adjusted level of significance=0.0038).
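The Bonferroni‐adjusted level quoted above can be reproduced with a short calculation. This is a sketch under stated assumptions: the text does not give the family‐wise alpha or the number of comparisons, so the values 0.05 and 13 (the number of covariate groups listed in Table 2) are assumed here because together they yield the reported 0.0038.

```python
def bonferroni_threshold(family_alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return family_alpha / n_comparisons


# Assumed inputs: alpha = 0.05 and 13 comparisons (not stated in the text).
threshold = bonferroni_threshold(0.05, 13)


def is_significant(p_value, level=threshold):
    """True when a 2-sided P value falls below the adjusted level."""
    return p_value < level


# Two P values reported in Table 2, checked against the adjusted level.
p_age_1_to_5_years = 0.003   # revisit model, age group 1 year to 5 years
p_ed_los = 0.0052            # revisit model, ED LOS at index visit
age_effect_significant = is_significant(p_age_1_to_5_years)
los_effect_significant = is_significant(p_ed_los)
```

With these assumed inputs, a P value of 0.003 falls below the adjusted level, whereas 0.0052 does not, even though it would be significant at the conventional 0.05 level.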

RESULTS

Patients

A total of 1,610,201 patients <18 years of age evaluated across the 23 PHIS EDs in 2012 were included in the study. Twenty‐one of the 23 EDs have academic affiliations; 10 are located in the South, 6 in the Midwest, 5 in the West, and 2 in the Northeast region of the United States. The annual ED volume for these EDs ranged from 25,090 to 136,160 (median, 65,075; interquartile range, 45,280–85,206). Of the total patients, 1,415,721 (87.9%) were discharged following the index visit and comprised our study cohort. Of these patients, 47,294 (revisit rate: 3.3%) had an ED revisit within 72 hours. There were 4015 patients (0.3%) who returned more than once within 72 hours, and the largest proportion of these returned with infection‐related conditions. Of those returning, 37,999 (80.3%) were discharged again, whereas 9295 (19.7%) were admitted to the hospital (Figure 1). The demographic and clinical characteristics of study participants are displayed in Table 1.
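The percentages in the paragraph above follow directly from the reported counts. The sketch below recomputes them as a consistency check; the variable names are illustrative, and the counts are copied from the text.

```python
# Counts reported in the text.
total_patients = 1_610_201
discharged_at_index = 1_415_721
revisits_72h = 47_294
multiple_returns = 4_015
discharged_on_return = 37_999
admitted_on_return = 9_295


def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage to 1 decimal."""
    return round(100 * part / whole, 1)


discharge_rate = percent(discharged_at_index, total_patients)
revisit_rate = percent(revisits_72h, discharged_at_index)
multiple_return_rate = percent(multiple_returns, discharged_at_index)
return_discharge_rate = percent(discharged_on_return, revisits_72h)
return_admission_rate = percent(admitted_on_return, revisits_72h)
```

The two return dispositions sum to the total number of revisits (37,999 + 9295 = 47,294), so the discharged‐again and admitted percentages are taken over the same denominator.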

Figure 1
Patient disposition from the emergency departments of study hospitals (n = 23) in 2012.
Table 1. Characteristics of Patients Who Returned Within 72 Hours of ED Discharge to the Study EDs

The two right‐hand columns describe return visits within 72 hours of discharge (n=47,294, 3.3%).

| Characteristic | Index Visit, n=1,415,721, n (%) | Return to Discharge, n (%) | Return to Admission, n (%) |
| --- | --- | --- | --- |
| Gender, female | 659,417 (46.6) | 17,665 (46.5) | 4,304 (46.3) |
| Payor | | | |
| Commercial | 379,403 (26.8) | 8,388 (22.1) | 3,214 (34.6) |
| Government | 925,147 (65.4) | 26,880 (70.7) | 5,786 (62.3) |
| Other | 111,171 (7.9) | 2,731 (7.2) | 295 (3.2) |
| Age | | | |
| <30 days | 19,217 (1.4) | 488 (1.3) | 253 (2.7) |
| 30 days to <1 year | 216,967 (15.3) | 8,280 (21.8) | 2,372 (25.5) |
| 1 year to 4 years | 547,083 (38.6) | 15,542 (40.9) | 3,187 (34.3) |
| 5 years to 11 years | 409,463 (28.9) | 8,906 (23.4) | 1,964 (21.1) |
| 12 years to 17 years | 222,991 (15.8) | 4,783 (12.6) | 1,519 (16.3) |
| Socioeconomic status (a) | | | |
| <1.5 times FPL | 493,770 (34.9) | 13,851 (36.5) | 2,879 (31.0) |
| 1.5 to 2 times FPL | 455,490 (32.2) | 12,364 (32.5) | 2,904 (31.2) |
| 2 to 3 times FPL | 367,557 (26.0) | 9,560 (25.2) | 2,714 (29.2) |
| >3 times FPL | 98,904 (7.0) | 2,224 (5.9) | 798 (8.6) |
| Primary care physician density per 100,000 patients | | | |
| Very low | 351,798 (24.9) | 8,727 (23.0) | 2,628 (28.3) |
| Low | 357,099 (25.2) | 9,810 (25.8) | 2,067 (22.2) |
| Medium | 347,995 (24.6) | 10,186 (26.8) | 2,035 (21.9) |
| High | 358,829 (25.4) | 9,276 (24.4) | 2,565 (27.6) |
| CCC present, yes | 125,774 (8.9) | 4,446 (11.7) | 2,825 (30.4) |
| Severity score | | | |
| Low severity (0,1,2) | 721,061 (50.9) | 17,310 (45.6) | 2,955 (31.8) |
| High severity (3,4,5) | 694,660 (49.1) | 20,689 (54.5) | 6,340 (68.2) |
| Time of arrival | | | |
| Day | 533,328 (37.7) | 13,449 (35.4) | 3,396 (36.5) |
| Evening | 684,873 (48.4) | 18,417 (48.5) | 4,378 (47.1) |
| Overnight | 197,520 (14.0) | 6,133 (16.1) | 1,521 (16.4) |
| Season | | | |
| Winter | 384,957 (27.2) | 10,603 (27.9) | 2,844 (30.6) |
| Spring | 367,434 (26.0) | 9,923 (26.1) | 2,311 (24.9) |
| Summer | 303,872 (21.5) | 8,308 (21.9) | 1,875 (20.2) |
| Fall | 359,458 (25.4) | 9,165 (24.1) | 2,265 (24.4) |
| Weekday/weekend | | | |
| Monday | 217,774 (15.4) | 5,646 (14.9) | 1,394 (15) |
| Tuesday | 198,220 (14.0) | 5,054 (13.3) | 1,316 (14.2) |
| Wednesday | 194,295 (13.7) | 4,985 (13.1) | 1,333 (14.3) |
| Thursday | 191,950 (13.6) | 5,123 (13.5) | 1,234 (13.3) |
| Friday | 190,022 (13.4) | 5,449 (14.3) | 1,228 (13.2) |
| Saturday | 202,247 (14.3) | 5,766 (15.2) | 1,364 (14.7) |
| Sunday | 221,213 (15.6) | 5,976 (15.7) | 1,426 (15.3) |
| Distance from hospital in miles, median (IQR) | 8.3 (4.6–14.9) | 9.2 (4.9–17.4) | 8.3 (4.6–14.9) |
| ED crowding score at index visit, median (IQR) | 1.0 (0.9–1.1) | 1.0 (0.9–1.1) | 1.0 (0.9–1.1) |
| ED LOS in hours at index visit, median (IQR) | 2.0 (1.0–3.0) | 3.0 (2.0–5.0) | 2.0 (1.0–3.0) |

NOTE: Abbreviations: CCC, complex chronic condition; ED, emergency department; FPL, federal poverty level; IQR, interquartile range; LOS, length of stay.

(a) Socioeconomic status is relative to the federal poverty level for a family of 4.

ED Revisit Rates and Revisits Resulting in Admission

In multivariate analyses, patients had higher odds of returning to the ED within 72 hours of discharge if they had a chronic condition, were <1 year old, had a higher severity score, or had public insurance. Visit‐level factors associated with higher odds of revisits included arrival for the index visit during the evening or overnight shift or on a Friday or Saturday, index visit during times of lower ED crowding, and living closer to the hospital. On return, patients were more likely to be hospitalized if they had a higher severity score, a chronic condition, private insurance, or were <30 days old. Visit‐level factors associated with higher odds of hospitalization at revisit included an index visit during the evening or overnight shift and living farther from the hospital. Although the median SES and PCP density of a patient's area of residence were not associated with greater likelihood of returning, when they returned, patients residing in an area with a lower SES and higher PCP densities (>78.8 PCPs/100,000) had lower odds of being admitted to the hospital. Patients whose index visit was on a Sunday also had lower odds of being hospitalized upon return (Table 2).

Table 2. Multivariate Analyses of Factors Associated With ED Revisits and Admission at Return

| Characteristic | Adjusted OR of 72‐Hour Revisit (95% CI), n=1,380,723 | P Value | Adjusted OR of 72‐Hour Revisit Admissions (95% CI), n=46,364 | P Value |
| --- | --- | --- | --- | --- |
| Gender | | | | |
| Male | 0.99 (0.97–1.01) | 0.2809 | 1.02 (0.97–1.07) | 0.5179 |
| Female | Reference | | Reference | |
| Payor | | | | |
| Government | 1.14 (1.11–1.17) | <0.0001 | 0.68 (0.64–0.72) | <0.0001 |
| Other | 0.97 (0.92–1.01) | 0.1148 | 0.33 (0.28–0.39) | <0.0001 |
| Private | Reference | | Reference | |
| Age group | | | | |
| 30 days to <1 year | 1.32 (1.22–1.42) | <0.0001 | 0.58 (0.49–0.69) | <0.0001 |
| 1 year to 5 years | 0.89 (0.83–0.96) | 0.003 | 0.41 (0.34–0.48) | <0.0001 |
| 5 years to 11 years | 0.69 (0.64–0.74) | <0.0001 | 0.40 (0.33–0.48) | <0.0001 |
| 12 years to 17 years | 0.72 (0.66–0.77) | <0.0001 | 0.50 (0.42–0.60) | <0.0001 |
| <30 days | Reference | | Reference | |
| Socioeconomic status (a) | | | | |
| % <1.5 times FPL | 0.96 (0.92–1.01) | 0.0992 | 0.82 (0.74–0.92) | 0.0005 |
| % 1.5 to 2 times FPL | 0.98 (0.94–1.02) | 0.2992 | 0.83 (0.75–0.92) | 0.0005 |
| % 2 to 3 times FPL | 1.02 (0.98–1.07) | 0.292 | 0.88 (0.79–0.97) | 0.01 |
| % >3 times FPL | Reference | | Reference | |
| Severity score | | | | |
| High severity, 4, 5, 6 | 1.43 (1.40–1.45) | <0.0001 | 3.42 (3.23–3.62) | <0.0001 |
| Low severity, 1, 2, 3 | Reference | | Reference | |
| Presence of any CCC | | | | |
| Yes | 1.90 (1.86–1.96) | <0.0001 | 2.92 (2.75–3.10) | <0.0001 |
| No | Reference | | Reference | |
| Time of arrival | | | | |
| Evening | 1.05 (1.03–1.08) | <0.0001 | 1.37 (1.29–1.44) | <0.0001 |
| Overnight | 1.19 (1.15–1.22) | <0.0001 | 1.84 (1.71–1.97) | <0.0001 |
| Day | Reference | | Reference | |
| Season | | | | |
| Winter | 1.09 (1.06–1.11) | <0.0001 | 1.06 (0.99–1.14) | 0.0722 |
| Spring | 1.07 (1.04–1.10) | <0.0001 | 0.98 (0.91–1.046) | 0.4763 |
| Summer | 1.05 (1.02–1.08) | 0.0011 | 0.93 (0.87–1.01) | 0.0729 |
| Fall | Reference | | Reference | |
| Weekday/weekend | | | | |
| Thursday | 1.02 (0.982–1.055) | 0.3297 | 0.983 (0.897–1.078) | 0.7185 |
| Friday | 1.08 (1.04–1.11) | <0.0001 | 1.03 (0.94–1.13) | 0.5832 |
| Saturday | 1.08 (1.04–1.12) | <0.0001 | 0.89 (0.81–0.97) | 0.0112 |
| Sunday | 1.02 (0.99–1.06) | 0.2054 | 0.81 (0.74–0.89) | <0.0001 |
| Monday | 1.00 (0.96–1.03) | 0.8928 | 0.98 (0.90–1.07) | 0.6647 |
| Tuesday | 0.99 (0.95–1.03) | 0.5342 | 0.93 (0.85–1.02) | 0.1417 |
| Wednesday | Reference | | Reference | |
| PCP ratio per 100,000 patients | | | | |
| 57.2–67.9 | 1.00 (0.96–1.04) | 0.8844 | 0.93 (0.84–1.03) | 0.1669 |
| 68.0–78.7 | 1.00 (0.95–1.04) | 0.8156 | 0.86 (0.77–0.96) | 0.0066 |
| >78.8 | 1.00 (0.95–1.04) | 0.6883 | 0.82 (0.73–0.92) | 0.001 |
| <57.2 | Reference | | Reference | |
| ED crowding score at index visit (b) | | | | |
| 2 | 0.92 (0.90–0.95) | <0.0001 | 0.96 (0.88–1.05) | 0.3435 |
| 1 | Reference | | Reference | |
| Distance from hospital (c) | | | | |
| 3.168, 23.6 miles | 0.95 (0.94–0.96) | <0.0001 | 1.16 (1.12–1.19) | <0.0001 |
| 2.168, 8.7 miles | Reference | | Reference | |
| ED LOS at index visit (b) | | | | |
| 3.7 hours | 1.003 (1.001–1.005) | 0.0052 | NA | |
| 2.7 hours | Reference | | | |

NOTE: Effects of continuous variables are assessed as 1‐unit offsets from the mean. Abbreviations: CCC, complex chronic condition; CI, confidence interval; ED, emergency department; FPL, federal poverty level; LOS, length of stay; OR, odds ratio; NA, not applicable.

(a) Socioeconomic status is relative to the FPL for a family of 4.

(b) ED crowding score and LOS are based on index visit. ED crowding score is calculated as the daily LOS (in hours)/overall LOS (in hours). Overall average across hospitals=1; a 1‐unit increase translates into twice the duration for the daily LOS over the yearly average ED LOS.

(c) Modeled as the natural log of the patient geographic distance from the hospital based on zip codes. Number in parentheses represents the exponential of the modeled variable.
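Because distance was modeled on the natural log scale, the distance rows of Table 2 can be read by exponentiating the modeled values. The sketch below is illustrative only and uses the two log values printed in the table; the variable names are hypothetical.

```python
import math

# Log-distance values printed in Table 2 (reference and reference + 1 unit).
reference_log_distance = 2.168
offset_log_distance = 3.168

reference_miles = math.exp(reference_log_distance)
offset_miles = math.exp(offset_log_distance)

# A 1-unit increase on the natural log scale multiplies distance by e.
distance_multiplier = offset_miles / reference_miles
```

Exponentiating 2.168 gives about 8.7 miles, matching the table. Exponentiating 3.168 gives roughly 23.8 miles, slightly above the 23.6 miles printed; the difference may reflect rounding of the reported log values, though the source does not say. Either way, the reported odds ratios for distance compare patients living about 2.7 times farther from the hospital with those at the reference distance.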

Diagnoses Associated With Return Visits

Patients with index visit diagnoses of sickle cell disease and leukemia had the highest proportion of return visits (10.7% and 7.3%, respectively). Other conditions with high revisit rates included infectious diseases such as cellulitis, bronchiolitis, and gastroenteritis. Patients with other chronic diseases such as diabetes and with devices, such as gastrostomy tubes, also had high rates of return visits. At return, the rate of hospitalization for these conditions ranged from a 1‐in‐6 chance of hospitalization for the diagnoses of a fever to a 1‐in‐2 chance of hospitalization for patients with sickle cell anemia (Table 3).
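The "1‐in‐N" phrasing above is a restatement of the admission percentages in Table 3. A minimal sketch of the conversion, with a hypothetical helper name, is:

```python
def one_in_n(percent):
    """Express a percentage as an approximate '1 in N' chance."""
    return round(100 / percent)


# Admission-at-return percentages taken from Table 3.
fever_chance = one_in_n(16.3)          # fever
sickle_cell_chance = one_in_n(49.6)    # sickle cell anemia
```

Fever (16.3% admitted on return) corresponds to roughly 1 in 6, and sickle cell anemia (49.6%) to roughly 1 in 2.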

Table 3. Major Diagnostic Subgroups With the Highest ED Revisit and Admission at Return Rates

| Major Diagnostic Subgroup | No. of Index ED Visit Discharges (a) | 72‐Hour Revisit, % (95% CI) | Admitted on Return, % (95% CI) |
| --- | --- | --- | --- |
| Sickle cell anemia | 2,531 | 10.7 (9.5–11.9) | 49.6 (43.7–55.6) |
| Neoplastic diseases, cancer | 536 | 7.3 (5.1–9.5) | 36 (21–51) |
| Infectious gastrointestinal diseases | 802 | 7.2 (5.4–9.0) | 21 (10–31) |
| Devices and complications of the circulatory system (b) | 1,033 | 6.9 (5.3–8.4) | 45 (34–57) |
| Other hematologic diseases (b) | 1,538 | 6.1 (4.9–7.3) | 33 (24–43) |
| Fever | 80,626 | 5.9 (5.7–6.0) | 16.3 (15.2–17.3) |
| Dehydration | 7,362 | 5.4 (5.2–5.5) | 34.6 (30.1–39) |
| Infectious respiratory diseases | 72,652 | 5.4 (5.2–5.5) | 28.6 (27.2–30) |
| Seizures | 17,637 | 5.3 (4.9–5.6) | 33.3 (30.3–36.4) |
| Other devices and complications (b) | 1,896 | 5.3 (4.3–6.3) | 39.0 (29.4–48.6) |
| Infectious skin, dermatologic and soft tissue diseases | 40,272 | 4.7 (4.5–5) | 20.0 (18.2–21.8) |
| Devices and complications of the gastrointestinal system (b) | 4,692 | 4.6 (4.0–5.2) | 24.7 (18.9–30.4) |
| Vomiting | 44,730 | 4.4 (4.2–4.6) | 23.7 (21.8–25.6) |
| Infectious urinary tract diseases | 17,020 | 4.4 (4.1–4.7) | 25.9 (22.7–29) |
| Headache | 19,016 | 4.3 (4.1–4.6) | 28.2 (25.1–31.3) |
| Diabetes mellitus | 1,531 | 4.5 (3.3–5.3) | 29 (18–40) |
| Abdominal pain | 39,594 | 4.2 (4–4.4) | 24.8 (22.7–26.8) |
| Other infectious diseases (b) | 647 | 4.2 (2.6–5.7) | 33 (16–51) |
| Gastroenteritis | 55,613 | 4.0 (3.8–4.1) | 20.6 (18.9–22.3) |

NOTE: Abbreviations: CI, confidence interval; ED, emergency department; NOS, not otherwise specified.

(a) Diagnoses with <500 index visits (ie, <2 visits per month across the 23 hospitals) or <30 revisits within entire study cohort excluded from analyses.

(b) Most prevalent diagnoses as identified by International Classification of Diseases, Ninth Revision codes within specified major diagnostic subgroups: devices and complications of the circulatory system, complication of other vascular device, implant, and graft; other hematologic diseases, anemia NOS, neutropenia NOS, or thrombocytopenia NOS; other devices and complications, hemorrhage complicating a procedure; devices and complications of the gastrointestinal system, gastrostomy; other infectious diseases, perinatal infections.

DISCUSSION

In this nationally representative sample of free‐standing children's hospitals, 3.3% of patients discharged from the ED returned to the same ED within 72 hours. This rate is similar to rates previously published in studies of general EDs.[11, 15] Of the returning children, over 80% were discharged again, and 19.7% were hospitalized, which is two‐thirds more than the admission rate at index visit (12%). In accordance with previous studies,[14, 16, 25] we found higher disease severity, presence of a chronic condition, and younger age were strongly associated with both the odds of patients returning to the ED and of being hospitalized at return. Patients who were hospitalized lived further away from the hospital and were of a higher SES. In this study, we show that visit‐level and access‐related factors are also associated with increased risk of return, although to a lesser degree. Patients seen on a weekend (Friday or Saturday) were found to have higher odds of returning, whereas those seen initially on a Sunday had lower odds of hospitalization at return. In this study, we also found that patients seen on the evening or night shifts at the index presentation had a significant association with return visits and hospitalization at return. Additionally, we found that although PCP density was not associated with the odds of returning to the ED, patients from areas with a higher PCP density were less likely to be admitted at return. In addition, by evaluating the diagnoses of patients who returned, we found that many infectious conditions commonly seen in the ED also had high return rates.

As previously shown,[23] we found that patients with complex and chronic diseases were at risk for ED revisits, especially patients with sickle cell anemia and cancer (mainly acute leukemia). In addition, patients with a chronic condition were 3 times more likely to be hospitalized when they returned. These findings may indicate an opportunity for improved discharge planning and coordination of care with subspecialty care providers for particularly at‐risk populations, or stronger consideration of admission at the index visit. However, admission for these patients at revisit may be unavoidable.

Excluding patients with chronic and complex conditions, the majority of conditions with high revisit rates were acute infectious conditions. One national study showed that >70% of ED revisits by patients with infectious conditions had planned ED follow‐up.[13] Although this study was unable to assess the reasons for return or admission at return, children with infectious diseases often worsen over time (eg, those with bronchiolitis). The relatively low admission rates at return for these conditions, despite evidence that providers may have a lower threshold for admission when a patient returns to the ED shortly after discharge,[24] may reflect the potential for improving follow‐up at the PCP office. However, although some revisits may be prevented,[37, 38] we recognize that an ED visit could be appropriate and necessary for some of these children, especially those without primary care.

Access to primary care and insurance status influence ED utilization.[14, 39, 40, 41] A fragmented healthcare system with poor access to primary care is strongly associated with utilization of the ED for nonurgent care. A high ED revisit rate might be indicative of poor coordination between ED and outpatient services.[9, 39, 42, 43, 44, 45, 46] Our study's finding of increased risk of return visit if the index visit occurred on a Friday or Saturday, and a decreased likelihood of subsequent admission when a patient returns on a Sunday, may suggest limited or perceived limited access to the PCP over a weekend. Although insured patients tend to use the ED less often for nonemergent cases, even when patients have PCPs, they might still choose to return to the ED out of convenience.[47, 48] This may be reflected in our finding that, when adjusted for insurance status and PCP density, patients who lived closer to the hospital were more likely to return, but less likely to be admitted, thereby suggesting proximity as a factor in the decision to return. It is also possible that patients residing further away returned to another institution. Although PCP density did not seem to be associated with revisits, patients who lived in areas with higher PCP density were less likely to be admitted when they returned. In this study, there was a stepwise gradient in the effect of PCP density on the odds of being hospitalized on return with those patients in areas with fewer PCPs being admitted at higher rates on return. Guttmann et al.,[40] in a recent study conducted in Canada where there is universal health insurance, showed that children residing in areas with higher PCP densities had higher rates of PCP visits but lower rates of ED visits compared to children residing in areas with lower PCP densities. It is possible that emergency physicians have more confidence that patients will have dedicated follow‐up when a PCP can be identified. 
These findings suggest that the development of PCP networks with expanded access, such as alignment of office hours with parent need and patient/parent education about PCP availability, may reduce ED revisits. Alternatively, creation of centralized hospital‐based urgent care centers for evening, night, and weekend visits may benefit both the patient and the PCP and avoid ED revisits and associated costs.

Targeting and eliminating disparities in care might also play a role in reducing ED revisits. Prior studies have shown that publicly insured individuals, in particular, frequently use the ED as their usual source of care and are more likely to return to the ED within 72 hours of an initial visit.[23, 39, 44, 49, 50] Likewise, we found that patients with public insurance were more likely to return but less likely to be admitted on revisit. After controlling for disease severity and other demographic variables, patients with public insurance and of lower socioeconomic status still had lower odds of being hospitalized following a revisit. This might also signify an increase of avoidable hospitalizations among patients of higher SES or with private insurance. Further investigation is needed to explore the reasons for these differences and to identify effective interventions to eliminate disparities.

Our findings have implications for emergency care, ambulatory care, and the larger healthcare system. First, ED revisits are costly and contribute to already overburdened EDs.[10, 11] The average ED visit incurs charges that are 2 to 5 times more than an outpatient office visit.[49, 50] Careful coordination of ambulatory and ED services could not only ensure optimal care for patients, but could save the US healthcare system billions of dollars in potentially avoidable healthcare expenditures.[49, 50] Second, prior studies have demonstrated a consistent relationship between poor access to primary care and increased use of the ED for nonurgent conditions.[42] Publicly insured patients have been shown to have disproportionately increased difficulty acquiring and accessing primary care.[41, 42, 47, 51] Furthermore, conditions with high ED revisit rates are similar to conditions reported by Berry et al.[4] as having the highest hospital readmission rates such as cancer, sickle cell anemia, seizure, pneumonia, asthma, and gastroenteritis. This might suggest a close relationship between 72‐hour ED revisits and 30‐day hospital readmissions. In light of the recent expansion of health insurance coverage to an additional 30 million individuals, the need for better coordination of services throughout the entire continuum of care, including primary care, ED, and inpatient services, has never been more important.[52] Future improvements could explore condition‐specific revisit or readmission rates to identify the most effective interventions to reduce the possibly preventable returns.

This study has several limitations. First, as an administrative database, PHIS has limited clinical data, and reasons for return visits could not be assessed. Variations between hospitals in diagnostic coding might also lead to misclassification bias. Second, we were unable to assess return visits to a different ED. Thus, we may have underestimated revisit frequency. However, because children are generally more likely to seek repeat care in the same hospital,[3] we believe our estimate of return visit rate approximates the actual return visit rate; our findings are also similar to previously reported rates. Third, for the PCP density factor, we were unable to account for the types of insurance each physician accepted or their influence on return rates. Fourth, return visits in our sample could have been for conditions unrelated to the diagnosis at index visit, though the short timeframe considered for revisits makes this less likely. In addition, the crowding index does not include the proportion of occupied beds at the precise moment of the index visit. Finally, this cohort includes only children seen in the EDs of pediatric hospitals, and our findings may not be generalizable to all EDs that provide care for ill and injured children.

We have shown that, in addition to previously identified patient level factors, there are visit‐level and access‐related factors associated with pediatric ED return visits. Eighty percent are discharged again, and almost one‐fifth of returning patients are admitted to the hospital. Admitted patients tend to be younger, sicker, chronically ill, and live farther from the hospital. By being aware of patients' comorbidities, PCP access, as well as certain diagnoses associated with high rates of return, physicians may better target interventions to optimize care. This may include having a lower threshold for hospitalization at the initial visit for children at high risk of return, and communication with the PCP at the time of discharge to ensure close follow‐up. Our study helps to provide benchmarks around ED revisit rates, and may serve as a starting point to better understand variation in care. Future efforts should aim to find creative solutions at individual institutions, with the goal of disseminating and replicating successes more broadly. For example, investigators in Boston have shown that the use of a comprehensive home‐based asthma management program has been successful in decreasing emergency department visits and hospitalization rates.[53] It is possible that this approach could be spread to other institutions to decrease revisits for patients with asthma. As a next step, the authors have undertaken an investigation to identify hospital‐level characteristics that may be associated with rates of return visits.

Acknowledgements

The authors thank the following members of the PHIS ED Return Visits Research Group for their contributions to the data analysis plan and interpretation of results of this study: Rustin Morse, MD, Children's Medical Center of Dallas; Catherine Perron, MD, Boston Children's Hospital; John Cheng, MD, Children's Healthcare of Atlanta; Shabnam Jain, MD, MPH, Children's Healthcare of Atlanta; and Amanda Montalbano, MD, MPH, Children's Mercy Hospitals and Clinics. These contributors did not receive compensation for their help with this work.

Disclosures

A.T.A. and A.M.S. conceived the study and developed the initial study design. All authors were involved in the development of the final study design and data analysis plan. C.W.T. collected and analyzed the data. A.T.A. and C.W.T. had full access to all of the data and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors were involved in the interpretation of the data. A.T.A. drafted the article, and all authors made critical revisions to the initial draft and subsequent versions. A.T.A. and A.M.S. take full responsibility for the article as a whole. The authors report no conflicts of interest.

References
  1. Joint policy statement: guidelines for care of children in the emergency department. Pediatrics. 2009;124:1233–1243.
  2. Alessandrini EA, Lavelle JM, Grenfell SM, Jacobstein CR, Shaw KN. Return visits to a pediatric emergency department. Pediatr Emerg Care. 2004;20:166–171.
  3. Axon RN, Williams MV. Hospital readmission as an accountability measure. JAMA. 2011;305:504–505.
  4. Berry JG, Hall DE, Kuo DZ, et al. Hospital utilization and characteristics of patients experiencing recurrent readmissions within children's hospitals. JAMA. 2011;305:682–690.
  5. Berry JG, Toomey SL, Zaslavsky AM, et al. Pediatric readmission prevalence and variability across hospitals. JAMA. 2013;309:372–380.
  6. Carrns A. Farewell, and don't come back. Health reform gives hospitals a big incentive to send patients home for good. US News World Rep. 2010;147:20, 22–23.
  7. Coye MJ. CMS' stealth health reform. Plan to reduce readmissions and boost the continuum of care. Hosp Health Netw. 2008;82:24.
  8. Lerman B, Kobernick MS. Return visits to the emergency department. J Emerg Med. 1987;5:359–362.
  9. Rising KL, White LF, Fernandez WG, Boutwell AE. Emergency department visits after hospital discharge: a missing part of the equation. Ann Emerg Med. 2013;62:145–150.
  10. Stang AS, Straus SE, Crotts J, Johnson DW, Guttmann A. Quality indicators for high acuity pediatric conditions. Pediatrics. 2013;132:752–762.
  11. Fontanarosa PB, McNutt RA. Revisiting hospital readmissions. JAMA. 2013;309:398–400.
  12. Vaduganathan M, Bonow RO, Gheorghiade M. Thirty‐day readmissions: the clock is ticking. JAMA. 2013;309:345–346.
  13. Adekoya N. Patients seen in emergency departments who had a prior visit within the previous 72 h‐National Hospital Ambulatory Medical Care Survey, 2002. Public Health. 2005;119:914–918.
  14. Cho CS, Shapiro DJ, Cabana MD, Maselli JH, Hersh AL. A national depiction of children with return visits to the emergency department within 72 hours, 2001–2007. Pediatr Emerg Care. 2012;28:606–610.
  15. Feudtner C, Levin JE, Srivastava R, et al. How well can hospital readmission be predicted in a cohort of hospitalized children? A retrospective, multicenter study. Pediatrics. 2009;123:286–293.
  16. Goldman RD, Ong M, Macpherson A. Unscheduled return visits to the pediatric emergency department‐one‐year experience. Pediatr Emerg Care. 2006;22:545–549.
  17. Klein‐Kremer A, Goldman RD. Return visits to the emergency department among febrile children 3 to 36 months of age. Pediatr Emerg Care. 2011;27:1126–1129.
  18. LeDuc K, Rosebrook H, Rannie M, Gao D. Pediatric emergency department recidivism: demographic characteristics and diagnostic predictors. J Emerg Nurs. 2006;32:131–138.
  19. Healthcare Cost and Utilization Project. Pediatric emergency department visits in community hospitals from selected states, 2005. Statistical brief #52. Available at: http://www.ncbi.nlm.nih.gov/books/NBK56039. Accessed October 3, 2013.
  20. Sharma V, Simon SD, Bakewell JM, Ellerbeck EF, Fox MH, Wallace DD. Factors influencing infant visits to emergency departments. Pediatrics. 2000;106:1031–1039.
  21. Ali AB, Place R, Howell J, Malubay SM. Early pediatric emergency department return visits: a prospective patient‐centric assessment. Clin Pediatr (Phila). 2012;51:651–658.
  22. Hu KW, Lu YH, Lin HJ, Guo HR, Foo NP. Unscheduled return visits with and without admission post emergency department discharge. J Emerg Med. 2012;43:1110–1118.
  23. Jacobstein CR, Alessandrini EA, Lavelle JM, Shaw KN. Unscheduled revisits to a pediatric emergency department: risk factors for children with fever or infection‐related complaints. Pediatr Emerg Care. 2005;21:816–821.
  24. Sauvin G, Freund Y, Saidi K, Riou B, Hausfater P. Unscheduled return visits to the emergency department: consequences for triage. Acad Emerg Med. 2013;20:33–39.
  25. Zimmerman DR, McCarten‐Gibbs KA, DeNoble DH, et al. Repeat pediatric visits to a general emergency department. Ann Emerg Med. 1996;28:467–473.
  26. Keith KD, Bocka JJ, Kobernick MS, Krome RL, Ross MA. Emergency department revisits. Ann Emerg Med. 1989;18:964–968.
  27. US Department of Health 19:70–78.
  28. Feudtner C, Christakis DA, Connell FA. Pediatric deaths attributable to complex chronic conditions: a population‐based study of Washington State, 1980–1997. Pediatrics. 2000;106:205–209.
  29. Feudtner C, Hays RM, Haynes G, Geyer JR, Neff JM, Koepsell TD. Deaths attributed to pediatric complex chronic conditions: national trends and implications for supportive care services. Pediatrics. 2001;107:E99.
  30. Feudtner C, Silveira MJ, Christakis DA. Where do children with complex chronic conditions die? Patterns in Washington State, 1980–1998. Pediatrics. 2002;109:656–660.
  31. Dartmouth Atlas of Health Care. Hospital and physician capacity, 2006. Available at: http://www.dartmouthatlas.org/data/topic/topic.aspx?cat=24. Accessed October 7, 2013.
  32. Dartmouth Atlas of Health Care. Research methods. What is an HSA/HRR? Available at: http://www.dartmouthatlas.org/tools/faq/researchmethods.aspx. Accessed October 7, 2013.
  33. Dartmouth Atlas of Health Care. Appendix on the geography of health care in the United States. Available at: http://www.dartmouthatlas.org/downloads/methods/geogappdx.pdf. Accessed October 7, 2013.
  34. Beniuk K, Boyle AA, Clarkson PJ. Emergency department crowding: prioritising quantified crowding measures using a Delphi study. Emerg Med J. 2012;29:868–871.
  35. Alessandrini EA, Alpern ER, Chamberlain JM, Shea JA, Gorelick MH. A new diagnosis grouping system for child emergency department visits. Acad Emerg Med. 2010;17:204–213.
  36. Guttmann A, Zagorski B, Austin PC, et al. Effectiveness of emergency department asthma management strategies on return visits in children: a population‐based study. Pediatrics. 2007;120:e1402–e1410.
  37. Horwitz DA, Schwarz ES, Scott MG, Lewis LM. Emergency department patients with diabetes have better glycemic control when they have identifiable primary care providers. Acad Emerg Med. 2012;19:650–655.
  38. Billings J, Zeitel L, Lukomnik J, Carey TS, Blank AE, Newman L. Impact of socioeconomic status on hospital use in New York City. Health Aff (Millwood). 1993;12:162–173.
  39. Guttmann A, Shipman SA, Lam K, Goodman DC, Stukel TA. Primary care physician supply and children's health care use, access, and outcomes: findings from Canada. Pediatrics. 2010;125:1119–1126.
  40. Asplin BR, Rhodes KV, Levy H, et al. Insurance status and access to urgent ambulatory care follow‐up appointments. JAMA. 2005;294:1248–1254.
  41. Kellermann AL, Weinick RM. Emergency departments, Medicaid costs, and access to primary care: understanding the link. N Engl J Med. 2012;366:2141–2143.
  42. Committee on the Future of Emergency Care in the United States Health System. Emergency Care for Children: Growing Pains. Washington, DC: The National Academies Press; 2007.
  43. Committee on the Future of Emergency Care in the United States Health System. Hospital‐Based Emergency Care: At the Breaking Point. Washington, DC: The National Academies Press; 2007.
  44. Radley DC, Schoen C. Geographic variation in access to care: the relationship with quality. N Engl J Med. 2012;367:3–6.
  45. Tang N, Stein J, Hsia RY, Maselli JH, Gonzales R. Trends and characteristics of US emergency department visits, 1997–2007. JAMA. 2010;304:664–670.
  46. Young GP, Wagner MB, Kellermann AL, Ellis J, Bouley D. Ambulatory visits to hospital emergency departments. Patterns and reasons for use. 24 Hours in the ED Study Group. JAMA. 1996;276:460–465.
  47. Tranquada KE, Denninghoff KR, King ME, Davis SM, Rosen P. Emergency department workload increase: dependence on primary care? J Emerg Med. 2010;38:279–285.
  48. Network for Excellence in Health Innovation. Leading healthcare research organizations to examine emergency department overuse. New England Research Institute, 2008. Available at: http://www.nehi.net/news/310‐leading‐health‐care‐research‐organizations‐to‐examine‐emergency‐department‐overuse/view. Accessed October 4, 2013.
  49. Robert Wood Johnson Foundation. Quality field notes: reducing inappropriate emergency department use. Available at: http://www.rwjf.org/en/research‐publications/find‐rwjf‐research/2013/09/quality‐field‐notes–reducing‐inappropriate‐emergency‐department.html.
  50. Access of Medicaid recipients to outpatient care. N Engl J Med. 1994;330:1426–1430.
  51. Medicaid policy statement. Pediatrics. 2013;131:e1697–e1706.
  52. Woods ER, Bhaumik U, Sommer SJ, et al. Community asthma initiative: evaluation of a quality improvement program for comprehensive asthma care. Pediatrics. 2012;129:465–472.
Issue
Journal of Hospital Medicine - 9(12)
Page Number
779-787

Returns to the hospital following recent encounters, such as an admission to the inpatient unit or evaluation in an emergency department (ED), may reflect the natural progression of a disease, the quality of care received during the initial admission or visit, or the quality of the underlying healthcare system.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] Although national attention has focused on hospital readmissions,[3, 4, 5, 6, 7, 11, 12] ED revisits are a source of concern to emergency physicians.[8, 9] Some ED revisits are medically necessary, but revisits that may be managed in the primary care setting contribute to ED crowding, can be stressful to patients and providers, and increase healthcare costs.[10, 11, 12] Approximately 27 million annual ED visits are made by children, accounting for over one‐quarter of all ED visits in the United States, with a reported ED revisit rate of 2.5% to 5.2%.[2, 13, 14, 15, 16, 17, 18, 19, 20] Improved understanding of the patient‐level or visit‐level factors associated with ED revisits may provide an opportunity to enhance disposition decision making at the index visit and optimize site of and communication around follow‐up care.

Previous studies on ED revisits have largely been conducted in single centers and have used variable visit intervals ranging between 48 hours and 30 days.[2, 13, 16, 18, 21, 22, 23, 24, 25] Two national studies used the National Hospital Ambulatory Medical Care Survey, which includes data from both general and pediatric EDs.[13, 14] Factors associated with increased odds of returning were young age, higher acuity, chronic conditions, and public insurance. One national study identified some diagnoses associated with higher likelihood of returning,[13] whereas the other focused primarily on infectious disease‐related diagnoses.[14]

The purpose of this study was to describe the prevalence of return visits specifically to pediatric EDs and to investigate patient‐level, visit‐level, and healthcare system‐related factors that may be associated with return visits and hospitalization at return.

METHODS

Study Design and Data Source

This retrospective cohort study used data from the Pediatric Health Information System (PHIS), an administrative database with data from 44 tertiary care pediatric hospitals in 27 US states and the District of Columbia. This database contains patient demographics, diagnoses, and procedures as well as medications, diagnostic imaging, laboratory, and supply charges for each patient. Data are deidentified prior to inclusion; encrypted medical record numbers allow for the identification of individual patients across all ED visits and hospitalizations to the same hospital. The Children's Hospital Association (Overland Park, KS) and participating hospitals jointly assure the quality and integrity of the data. This study was approved by the institutional review board at Boston Children's Hospital with a waiver for informed consent granted.

Study Population and Protocol

To standardize comparisons across the hospitals, we included data from 23 of the 44 hospitals in PHIS; 7 were excluded for not including ED‐specific data. For institutions that collect information from multiple hospitals within their healthcare system, we included only records from the main campus or children's hospital when possible, leading to the exclusion of 9 hospitals whose data could not be segregated. As an additional level of data validation, we compared the hospital‐level ED volume and admission rates as reported in the PHIS to those reported to a separate database (the Pediatric Analysis and Comparison Tool). We further excluded 5 hospitals whose volume differed by >10% between these 2 data sources.

Patients <18 years of age who were discharged from these EDs following their index visit in 2012 formed the eligible cohort.

Key Outcome Measures

The primary outcomes were return visits within 72 hours of discharge from the ED, and return visits resulting in hospitalization, including observation status. We defined an ED revisit as a return within 72 hours of ED discharge regardless of whether the patient was subsequently discharged from the ED on the return visit or hospitalized. We assessed revisits within 72 hours of an index ED discharge, because return visits within this time frame are likely to be related to the index visit.[2, 13, 16, 21, 22, 24, 25, 26]
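The revisit definition above can be sketched in code. This is an illustration only, not the study's actual data processing: the record layout and field names (`patient_id`, `arrival`, `discharge`, `disposition`) are hypothetical, and pairing each discharge with only the next visit by the same patient is a simplifying assumption.

```python
from datetime import datetime, timedelta

REVISIT_WINDOW = timedelta(hours=72)  # window used in the revisit definition


def find_revisits(visits):
    """Pair each ED discharge with the same patient's next ED visit
    if that visit begins within 72 hours of the discharge.

    `visits` is a list of dicts with hypothetical keys: patient_id,
    arrival, discharge (datetimes), and disposition
    ("discharged" or "admitted").
    """
    by_patient = {}
    for visit in visits:
        by_patient.setdefault(visit["patient_id"], []).append(visit)

    pairs = []
    for patient_visits in by_patient.values():
        ordered = sorted(patient_visits, key=lambda v: v["arrival"])
        for index_visit, next_visit in zip(ordered, ordered[1:]):
            if index_visit["disposition"] != "discharged":
                continue  # only an ED discharge can open a revisit window
            gap = next_visit["arrival"] - index_visit["discharge"]
            if timedelta(0) <= gap <= REVISIT_WINDOW:
                pairs.append((index_visit, next_visit))
    return pairs


def admitted_on_return(pairs):
    """Keep only the revisits that ended in hospitalization."""
    return [pair for pair in pairs if pair[1]["disposition"] == "admitted"]
```

The first function corresponds to the revisit outcome (any return within 72 hours) and the second to the outcome of hospitalization at return.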

Factors Associated With ED Revisits

A priori, we chose to adjust for the following patient‐level factors: age (<30 days, 30 days to <1 year, 1–4 years, 5–11 years, 12–17 years), gender, and socioeconomic status (SES) measured as the zip code‐based median household income, obtained from the 2010 US Census, with respect to the federal poverty level (FPL) (<1.5 FPL, 1.5–2 FPL, 2–3 FPL, and >3 FPL).[27] We also adjusted for insurance type (commercial, government, or other), proximity of patient's home zip code to hospital (modeled as the natural log of the geographical distance to patient's home address from the hospital), ED diagnosis‐based severity classification system score (1=low severity, 5=high severity),[28] presence of a complex chronic condition at the index or prior visits using a validated classification scheme,[15, 29, 30, 31] and primary care physician (PCP) density per 100,000 in the patient's residential area (modeled as quartiles: very low, <57.2; low, 57.2–67.9; medium, 68.0–78.7; high, >78.8). PCP density, defined by the Dartmouth Atlas of Health Care,[32, 33, 34] is the number of primary care physicians per 100,000 residents (PCP count) in federal health service areas (HSA). Patients were assigned to a corresponding HSA based on their home zip code.
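Two of these covariate transformations can be illustrated with a short sketch. The function names are hypothetical, and the handling of values that fall exactly on a quartile boundary is an assumption, because the published cut points leave small gaps (for example, between 67.9 and 68.0).

```python
import math


def pcp_density_category(pcps_per_100k):
    """Map PCP density to the quartile labels given in the text.

    Boundary handling is an assumption: each cut point is treated
    as the lower limit of the next category.
    """
    if pcps_per_100k < 57.2:
        return "very low"
    if pcps_per_100k < 68.0:
        return "low"
    if pcps_per_100k < 78.8:
        return "medium"
    return "high"


def log_distance(miles):
    """Natural log of the home-to-hospital distance, as used in the models."""
    return math.log(miles)
```

Because distance enters the model on the natural log scale, a 1‐unit increase in the modeled variable corresponds to multiplying the distance by e (about 2.72); for example, 8.7 miles becomes roughly 23.6 miles.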

Visit‐level factors included arrival time of index visit (8:01 am–4:00 pm, 4:01 pm–12:00 am, 12:01 am–8 am representing day, evening, and overnight arrival, respectively), day of the week, season, length of stay (LOS) in the ED during the index visit, and ED crowding (calculated as the average daily LOS/yearly average LOS for the individual ED).[35] We categorized the ED primary diagnosis for each visit using the major diagnosis groupings of a previously described pediatric ED‐specific classification scheme.[36] Using International Classification of Diseases, Ninth Revision (ICD‐9) codes, we identified the conditions with the highest ED revisit rates.
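As a minimal sketch of two of these visit‐level variables, the arrival shift and the crowding score could be derived as follows. The function names are hypothetical and the code is illustrative rather than the study's implementation; seconds are ignored when assigning shifts.

```python
from datetime import time


def arrival_shift(arrival):
    """Classify an arrival time into the three shifts described in the text:
    day (8:01 am to 4:00 pm), evening (4:01 pm to 12:00 am),
    overnight (12:01 am to 8:00 am).
    """
    minutes = arrival.hour * 60 + arrival.minute
    if minutes == 0:
        return "evening"  # 12:00 am closes the evening shift
    if minutes <= 8 * 60:
        return "overnight"
    if minutes <= 16 * 60:
        return "day"
    return "evening"


def crowding_score(daily_mean_los_hours, yearly_mean_los_hours):
    """ED crowding: average LOS on the day of the visit divided by the
    same ED's yearly average LOS. A value of 1 is an average day."""
    return daily_mean_los_hours / yearly_mean_los_hours
```

Under this definition, a score of 2 means that the average LOS on the day of the index visit was twice the ED's yearly average.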

Statistical Analyses

Categorical variables describing the study cohort were summarized using frequencies and percentages. Continuous variables were summarized using mean, median, and interquartile range values, where appropriate. We used 2 different hierarchical logistic regression models to assess revisit rates by patient‐ and visit‐level characteristics. The initial model included all patients discharged from the ED following the index visit and assessed for the outcome of a revisit within 72 hours. The second model considered only patients who returned within 72 hours of an index visit and assessed for hospitalization on that return visit. We used generalized linear mixed effects models, with hospital as a random effect to account for the presence of correlated data (within hospitals), nonconstant variability (across hospitals), and binary responses. Adjusted odds ratios with 95% confidence intervals were used as summary measures of the effect of the individual adjusters. Adjusters were missing in fewer than 5% of patients across participating hospitals. Statistical analyses were performed using SAS version 9.3 (SAS Institute Inc., Cary, NC); 2‐sided P values <0.004 were considered statistically significant to account for multiple comparisons (Bonferroni‐adjusted level of significance=0.0038).
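The significance threshold and the summary measures can be illustrated with a short calculation. The number of comparisons is not stated in the text; the value of 13 used below is an assumption chosen because 0.05/13 reproduces the reported Bonferroni‐adjusted level of 0.0038. The odds ratio function shows the standard conversion of a logistic regression coefficient and its standard error into an odds ratio with a 95% confidence interval; it is a generic sketch, not the SAS procedure used by the authors.

```python
import math


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return family_alpha / n_comparisons


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and 95% CI from a log-odds coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Assumption: 13 comparisons, which matches the reported threshold.
alpha = bonferroni_alpha(0.05, 13)
```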

RESULTS

Patients

A total of 1,610,201 patients <18 years of age evaluated across the 23 PHIS EDs in 2012 were included in the study. Twenty‐one of the 23 EDs have academic affiliations; 10 are located in the South, 6 in the Midwest, 5 in the West, and 2 in the Northeast region of the United States. The annual ED volume for these EDs ranged from 25,090 to 136,160 (median, 65,075; interquartile range, 45,280–85,206). Of the total patients, 1,415,721 (87.9%) were discharged following the index visit and comprised our study cohort. Of these patients, 47,294 (revisit rate: 3.3%) had an ED revisit within 72 hours. There were 4015 patients (0.3%) who returned more than once within 72 hours, and the largest proportion of these returned with infection‐related conditions. Of those returning, 37,999 (80.3%) were discharged again, whereas 9295 (19.7%) were admitted to the hospital (Figure 1). The demographic and clinical characteristics of study participants are displayed in Table 1.
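The percentages in this paragraph follow directly from the reported counts, as the following check shows. The variable names are illustrative; the counts are taken from the text.

```python
# Counts reported in the text
total_patients = 1_610_201      # all patients evaluated in the 23 EDs
discharged_at_index = 1_415_721  # study cohort
revisits_72h = 47_294
returned_more_than_once = 4_015
return_discharged = 37_999
return_admitted = 9_295


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


cohort_share = pct(discharged_at_index, total_patients)
revisit_rate = pct(revisits_72h, discharged_at_index)
multiple_return_rate = pct(returned_more_than_once, discharged_at_index)
discharged_again = pct(return_discharged, revisits_72h)
admitted_at_return = pct(return_admitted, revisits_72h)
```

The two return dispositions sum to the total number of revisits (37,999 + 9295 = 47,294).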

Figure 1
Patient disposition from the emergency departments of study hospitals (n = 23) in 2012.
Table 1. Characteristics of Patients Who Returned Within 72 Hours of ED Discharge to the Study EDs

Return visits within 72 hours of discharge: n=47,294 (3.3%).

| Characteristic | Index Visit, n=1,415,721, n (%) | Return to Discharge, n (%) | Return to Admission, n (%) |
| --- | --- | --- | --- |
| Gender, female | 659,417 (46.6) | 17,665 (46.5) | 4,304 (46.3) |
| Payor | | | |
| Commercial | 379,403 (26.8) | 8,388 (22.1) | 3,214 (34.6) |
| Government | 925,147 (65.4) | 26,880 (70.7) | 5,786 (62.3) |
| Other | 111,171 (7.9) | 2,731 (7.2) | 295 (3.2) |
| Age | | | |
| <30 days | 19,217 (1.4) | 488 (1.3) | 253 (2.7) |
| 30 days to <1 year | 216,967 (15.3) | 8,280 (21.8) | 2,372 (25.5) |
| 1 year to 4 years | 547,083 (38.6) | 15,542 (40.9) | 3,187 (34.3) |
| 5 years to 11 years | 409,463 (28.9) | 8,906 (23.4) | 1,964 (21.1) |
| 12 years to 17 years | 222,991 (15.8) | 4,783 (12.6) | 1,519 (16.3) |
| Socioeconomic status (a) | | | |
| <1.5 times FPL | 493,770 (34.9) | 13,851 (36.5) | 2,879 (31.0) |
| 1.5 to 2 times FPL | 455,490 (32.2) | 12,364 (32.5) | 2,904 (31.2) |
| 2 to 3 times FPL | 367,557 (26.0) | 9,560 (25.2) | 2,714 (29.2) |
| >3 times FPL | 98,904 (7.0) | 2,224 (5.9) | 798 (8.6) |
| Primary care physician density per 100,000 patients | | | |
| Very low | 351,798 (24.9) | 8,727 (23.0) | 2,628 (28.3) |
| Low | 357,099 (25.2) | 9,810 (25.8) | 2,067 (22.2) |
| Medium | 347,995 (24.6) | 10,186 (26.8) | 2,035 (21.9) |
| High | 358,829 (25.4) | 9,276 (24.4) | 2,565 (27.6) |
| CCC present, yes | 125,774 (8.9) | 4,446 (11.7) | 2,825 (30.4) |
| Severity score | | | |
| Low severity (0,1,2) | 721,061 (50.9) | 17,310 (45.6) | 2,955 (31.8) |
| High severity (3,4,5) | 694,660 (49.1) | 20,689 (54.5) | 6,340 (68.2) |
| Time of arrival | | | |
| Day | 533,328 (37.7) | 13,449 (35.4) | 3,396 (36.5) |
| Evening | 684,873 (48.4) | 18,417 (48.5) | 4,378 (47.1) |
| Overnight | 197,520 (14.0) | 6,133 (16.1) | 1,521 (16.4) |
| Season | | | |
| Winter | 384,957 (27.2) | 10,603 (27.9) | 2,844 (30.6) |
| Spring | 367,434 (26.0) | 9,923 (26.1) | 2,311 (24.9) |
| Summer | 303,872 (21.5) | 8,308 (21.9) | 1,875 (20.2) |
| Fall | 359,458 (25.4) | 9,165 (24.1) | 2,265 (24.4) |
| Weekday/weekend | | | |
| Monday | 217,774 (15.4) | 5,646 (14.9) | 1,394 (15) |
| Tuesday | 198,220 (14.0) | 5,054 (13.3) | 1,316 (14.2) |
| Wednesday | 194,295 (13.7) | 4,985 (13.1) | 1,333 (14.3) |
| Thursday | 191,950 (13.6) | 5,123 (13.5) | 1,234 (13.3) |
| Friday | 190,022 (13.4) | 5,449 (14.3) | 1,228 (13.2) |
| Saturday | 202,247 (14.3) | 5,766 (15.2) | 1,364 (14.7) |
| Sunday | 221,213 (15.6) | 5,976 (15.7) | 1,426 (15.3) |
| Distance from hospital in miles, median (IQR) | 8.3 (4.6–14.9) | 9.2 (4.9–17.4) | 8.3 (4.6–14.9) |
| ED crowding score at index visit, median (IQR) | 1.0 (0.9–1.1) | 1.0 (0.9–1.1) | 1.0 (0.9–1.1) |
| ED LOS in hours at index visit, median (IQR) | 2.0 (1.0–3.0) | 3.0 (2.0–5.0) | 2.0 (1.0–3.0) |

  • NOTE: Abbreviations: CCC, complex chronic condition; ED, emergency department; FPL, federal poverty level; IQR, interquartile range; LOS, length of stay.

  • (a) Socioeconomic status is relative to the federal poverty level for a family of 4.

ED Revisit Rates and Revisits Resulting in Admission

In multivariate analyses, patients had higher odds of returning to the ED within 72 hours of discharge if they had a chronic condition, a higher severity score, or public insurance, or if they were <1 year old. Visit‐level factors associated with higher odds of revisits included arrival for the index visit during the evening or overnight shift or on a Friday or Saturday, index visit during times of lower ED crowding, and living closer to the hospital. On return, patients were more likely to be hospitalized if they had a higher severity score, a chronic condition, private insurance, or were <30 days old. Visit‐level factors associated with higher odds of hospitalization at revisit included an index visit during the evening and overnight shift and living further from the hospital. Although the median SES and PCP density of a patient's area of residence were not associated with greater likelihood of returning, when they returned, patients residing in an area with a lower SES and higher PCP densities (>78.8 PCPs/100,000) had lower odds of being admitted to the hospital. Patients whose index visit was on a Sunday also had lower odds of being hospitalized upon return (Table 2).

Table 2. Multivariate Analyses of Factors Associated With ED Revisits and Admission at Return

| Characteristic | Adjusted OR of 72‐Hour Revisit (95% CI), n=1,380,723 | P Value | Adjusted OR of 72‐Hour Revisit Admissions (95% CI), n=46,364 | P Value |
| --- | --- | --- | --- | --- |
| Gender | | | | |
| Male | 0.99 (0.97–1.01) | 0.2809 | 1.02 (0.97–1.07) | 0.5179 |
| Female | Reference | | Reference | |
| Payor | | | | |
| Government | 1.14 (1.11–1.17) | <0.0001 | 0.68 (0.64–0.72) | <0.0001 |
| Other | 0.97 (0.92–1.01) | 0.1148 | 0.33 (0.28–0.39) | <0.0001 |
| Private | Reference | | Reference | |
| Age group | | | | |
| 30 days to <1 year | 1.32 (1.22–1.42) | <0.0001 | 0.58 (0.49–0.69) | <0.0001 |
| 1 year to 5 years | 0.89 (0.83–0.96) | 0.003 | 0.41 (0.34–0.48) | <0.0001 |
| 5 years to 11 years | 0.69 (0.64–0.74) | <0.0001 | 0.40 (0.33–0.48) | <0.0001 |
| 12 years to 17 years | 0.72 (0.66–0.77) | <0.0001 | 0.50 (0.42–0.60) | <0.0001 |
| <30 days | Reference | | Reference | |
| Socioeconomic status (a) | | | | |
| % <1.5 times FPL | 0.96 (0.92–1.01) | 0.0992 | 0.82 (0.74–0.92) | 0.0005 |
| % 1.5 to 2 times FPL | 0.98 (0.94–1.02) | 0.2992 | 0.83 (0.75–0.92) | 0.0005 |
| % 2 to 3 times FPL | 1.02 (0.98–1.07) | 0.292 | 0.88 (0.79–0.97) | 0.01 |
| % >3 times FPL | Reference | | Reference | |
| Severity score | | | | |
| High severity, 4, 5, 6 | 1.43 (1.40–1.45) | <0.0001 | 3.42 (3.23–3.62) | <0.0001 |
| Low severity, 1, 2, 3 | Reference | | Reference | |
| Presence of any CCC | | | | |
| Yes | 1.90 (1.86–1.96) | <0.0001 | 2.92 (2.75–3.10) | <0.0001 |
| No | Reference | | Reference | |
| Time of arrival | | | | |
| Evening | 1.05 (1.03–1.08) | <0.0001 | 1.37 (1.29–1.44) | <0.0001 |
| Overnight | 1.19 (1.15–1.22) | <0.0001 | 1.84 (1.71–1.97) | <0.0001 |
| Day | Reference | | Reference | |
| Season | | | | |
| Winter | 1.09 (1.06–1.11) | <0.0001 | 1.06 (0.99–1.14) | 0.0722 |
| Spring | 1.07 (1.04–1.10) | <0.0001 | 0.98 (0.91–1.046) | 0.4763 |
| Summer | 1.05 (1.02–1.08) | 0.0011 | 0.93 (0.87–1.01) | 0.0729 |
| Fall | Reference | | Reference | |
| Weekday/weekend | | | | |
| Thursday | 1.02 (0.982–1.055) | 0.3297 | 0.983 (0.897–1.078) | 0.7185 |
| Friday | 1.08 (1.04–1.11) | <0.0001 | 1.03 (0.94–1.13) | 0.5832 |
| Saturday | 1.08 (1.04–1.12) | <0.0001 | 0.89 (0.81–0.97) | 0.0112 |
| Sunday | 1.02 (0.99–1.06) | 0.2054 | 0.81 (0.74–0.89) | <0.0001 |
| Monday | 1.00 (0.96–1.03) | 0.8928 | 0.98 (0.90–1.07) | 0.6647 |
| Tuesday | 0.99 (0.95–1.03) | 0.5342 | 0.93 (0.85–1.02) | 0.1417 |
| Wednesday | Reference | | Reference | |
| PCP ratio per 100,000 patients | | | | |
| 57.2–67.9 | 1.00 (0.96–1.04) | 0.8844 | 0.93 (0.84–1.03) | 0.1669 |
| 68.0–78.7 | 1.00 (0.95–1.04) | 0.8156 | 0.86 (0.77–0.96) | 0.0066 |
| >78.8 | 1.00 (0.95–1.04) | 0.6883 | 0.82 (0.73–0.92) | 0.001 |
| <57.2 | Reference | | Reference | |
| ED crowding score at index visit (b) | | | | |
| 2 | 0.92 (0.90–0.95) | <0.0001 | 0.96 (0.88–1.05) | 0.3435 |
| 1 | Reference | | Reference | |
| Distance from hospital (c) | | | | |
| 3.168, 23.6 miles | 0.95 (0.94–0.96) | <0.0001 | 1.16 (1.12–1.19) | <0.0001 |
| 2.168, 8.7 miles | Reference | | Reference | |
| ED LOS at index visit (b) | | | | |
| 3.7 hours | 1.003 (1.001–1.005) | 0.0052 | NA | |
| 2.7 hours | Reference | | | |

  • NOTE: Effects of continuous variables are assessed as 1‐unit offsets from the mean. Abbreviations: CCC, complex chronic condition; CI, confidence interval; ED, emergency department; FPL, federal poverty level; LOS, length of stay; NA, not applicable; OR, odds ratio.

  • (a) Socioeconomic status is relative to the FPL for a family of 4.

  • (b) ED crowding score and LOS are based on index visit. ED crowding score is calculated as the daily LOS (in hours)/overall LOS (in hours). Overall average across hospitals=1; a 1‐unit increase translates into twice the duration for the daily LOS over the yearly average ED LOS.

  • (c) Modeled as the natural log of the patient geographic distance from the hospital based on zip codes. Number in parentheses represents the exponential of the modeled variable.

Diagnoses Associated With Return Visits

Patients with index visit diagnoses of sickle cell disease and leukemia had the highest proportion of return visits (10.7% and 7.3%, respectively). Other conditions with high revisit rates included infectious diseases such as cellulitis, bronchiolitis, and gastroenteritis. Patients with other chronic diseases such as diabetes and with devices, such as gastrostomy tubes, also had high rates of return visits. At return, the rate of hospitalization for these conditions ranged from a 1‐in‐6 chance of hospitalization for the diagnoses of a fever to a 1‐in‐2 chance of hospitalization for patients with sickle cell anemia (Table 3).
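The text does not state how the 95% confidence intervals for these proportions were computed. As a hedged illustration, a normal‐approximation (Wald) interval, which is an assumption here rather than the authors' documented method, reproduces the interval reported for sickle cell anemia (10.7% of 2,531 index discharges).

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    half_width = z * math.sqrt(proportion * (1 - proportion) / n)
    return proportion - half_width, proportion + half_width


# Sickle cell anemia: 10.7% revisit rate among 2,531 index ED discharges
lower, upper = wald_ci(0.107, 2531)
lower_pct = round(100 * lower, 1)
upper_pct = round(100 * upper, 1)
```

With these inputs the interval is approximately 9.5% to 11.9%. Other intervals may differ from this approximation, particularly for diagnoses with few index visits.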

Table 3. Major Diagnostic Subgroups With the Highest ED Revisit and Admission at Return Rates

| Major Diagnostic Subgroup | No. of Index ED Visit Discharges (a) | 72‐Hour Revisit, % (95% CI) | Admitted on Return, % (95% CI) |
| --- | --- | --- | --- |
| Sickle cell anemia | 2,531 | 10.7 (9.5–11.9) | 49.6 (43.7–55.6) |
| Neoplastic diseases, cancer | 536 | 7.3 (5.1–9.5) | 36 (21–51) |
| Infectious gastrointestinal diseases | 802 | 7.2 (5.4–9.0) | 21 (10–31) |
| Devices and complications of the circulatory system (b) | 1,033 | 6.9 (5.3–8.4) | 45 (34–57) |
| Other hematologic diseases (b) | 1,538 | 6.1 (4.9–7.3) | 33 (24–43) |
| Fever | 80,626 | 5.9 (5.7–6.0) | 16.3 (15.2–17.3) |
| Dehydration | 7,362 | 5.4 (5.2–5.5) | 34.6 (30.1–39) |
| Infectious respiratory diseases | 72,652 | 5.4 (5.2–5.5) | 28.6 (27.2–30) |
| Seizures | 17,637 | 5.3 (4.9–5.6) | 33.3 (30.3–36.4) |
| Other devices and complications (b) | 1,896 | 5.3 (4.3–6.3) | 39.0 (29.4–48.6) |
| Infectious skin, dermatologic and soft tissue diseases | 40,272 | 4.7 (4.5–5) | 20.0 (18.2–21.8) |
| Devices and complications of the gastrointestinal system (b) | 4,692 | 4.6 (4.0–5.2) | 24.7 (18.9–30.4) |
| Vomiting | 44,730 | 4.4 (4.2–4.6) | 23.7 (21.8–25.6) |
| Infectious urinary tract diseases | 17,020 | 4.4 (4.1–4.7) | 25.9 (22.7–29) |
| Headache | 19,016 | 4.3 (4.1–4.6) | 28.2 (25.1–31.3) |
| Diabetes mellitus | 1,531 | 4.5 (3.3–5.3) | 29 (18–40) |
| Abdominal pain | 39,594 | 4.2 (4–4.4) | 24.8 (22.7–26.8) |
| Other infectious diseases (b) | 647 | 4.2 (2.6–5.7) | 33 (16–51) |
| Gastroenteritis | 55,613 | 4.0 (3.8–4.1) | 20.6 (18.9–22.3) |

  • NOTE: Abbreviations: CI, confidence interval; ED, emergency department; NOS, not otherwise specified.

  • (a) Diagnoses with <500 index visits (ie, <2 visits per month across the 23 hospitals) or <30 revisits within entire study cohort excluded from analyses.

  • (b) Most prevalent diagnoses as identified by International Classification of Diseases, Ninth Revision codes within specified major diagnostic subgroups: devices and complications of the circulatory system, complication of other vascular device, implant, and graft; other hematologic diseases, anemia NOS, neutropenia NOS, or thrombocytopenia NOS; other devices and complications, hemorrhage complicating a procedure; devices and complications of the gastrointestinal system, gastrostomy; other infectious diseases, perinatal infections.

DISCUSSION

In this nationally representative sample of free‐standing children's hospitals, 3.3% of patients discharged from the ED returned to the same ED within 72 hours. This rate is similar to rates previously published in studies of general EDs.[11, 15] Of the returning children, over 80% were discharged again, and 19.7% were hospitalized, which is two‐thirds more than the admission rate at index visit (12%). In accordance with previous studies,[14, 16, 25] we found that higher disease severity, presence of a chronic condition, and younger age were strongly associated with both the odds of patients returning to the ED and of being hospitalized at return. Patients who were hospitalized lived further away from the hospital and were of a higher SES. In this study, we show that visit‐level and access‐related factors are also associated with increased risk of return, although to a lesser degree. Patients seen on a Friday or Saturday had higher odds of returning, whereas those seen initially on a Sunday had lower odds of hospitalization at return. We also found that being seen on the evening or overnight shift at the index presentation was significantly associated with return visits and hospitalization at return. Additionally, we found that although PCP density was not associated with the odds of returning to the ED, patients from areas with a higher PCP density were less likely to be admitted at return. In addition, by evaluating the diagnoses of patients who returned, we found that many infectious conditions commonly seen in the ED also had high return rates.

As previously shown,[23] we found that patients with complex and chronic diseases were at risk for ED revisits, especially patients with sickle cell anemia and cancer (mainly acute leukemia). In addition, patients with a chronic condition were 3 times more likely to be hospitalized when they returned. These findings may indicate an opportunity for improved discharge planning and coordination of care with subspecialty care providers for particularly at‐risk populations, or stronger consideration of admission at the index visit. However, admission for these patients at revisit may be unavoidable.

Excluding patients with chronic and complex conditions, the majority of conditions with high revisit rates were acute infectious conditions. One national study showed that >70% of ED revisits by patients with infectious conditions had planned ED follow‐up.[13] Although this study was unable to assess the reasons for return or admission at return, children with infectious diseases often worsen over time (eg, those with bronchiolitis). The relatively low admission rates at return for these conditions, despite evidence that providers may have a lower threshold for admission when a patient returns to the ED shortly after discharge,[24] may reflect the potential for improving follow‐up at the PCP office. However, although some revisits may be prevented,[37, 38] we recognize that an ED visit could be appropriate and necessary for some of these children, especially those without primary care.

Access to primary care and insurance status influence ED utilization.[14, 39, 40, 41] A fragmented healthcare system with poor access to primary care is strongly associated with utilization of the ED for nonurgent care. A high ED revisit rate might be indicative of poor coordination between ED and outpatient services.[9, 39, 42, 43, 44, 45, 46] Our study's finding of increased risk of return visit if the index visit occurred on a Friday or Saturday, and a decreased likelihood of subsequent admission when the index visit occurred on a Sunday, may suggest limited or perceived limited access to the PCP over a weekend. Although insured patients tend to use the ED less often for nonemergent cases, even when patients have PCPs, they might still choose to return to the ED out of convenience.[47, 48] This may be reflected in our finding that, when adjusted for insurance status and PCP density, patients who lived closer to the hospital were more likely to return, but less likely to be admitted, thereby suggesting proximity as a factor in the decision to return. It is also possible that patients residing further away returned to another institution. Although PCP density did not seem to be associated with revisits, patients who lived in areas with higher PCP density were less likely to be admitted when they returned. In this study, there was a stepwise gradient in the effect of PCP density on the odds of being hospitalized on return, with those patients in areas with fewer PCPs being admitted at higher rates on return. Guttmann et al.,[40] in a recent study conducted in Canada where there is universal health insurance, showed that children residing in areas with higher PCP densities had higher rates of PCP visits but lower rates of ED visits compared to children residing in areas with lower PCP densities. It is possible that emergency physicians have more confidence that patients will have dedicated follow‐up when a PCP can be identified.
These findings suggest that the development of PCP networks with expanded access, such as alignment of office hours with parent need and patient/parent education about PCP availability, may reduce ED revisits. Alternatively, creation of centralized hospital‐based urgent care centers for evening, night, and weekend visits may benefit both the patient and the PCP and avoid ED revisits and associated costs.

Targeting and eliminating disparities in care might also play a role in reducing ED revisits. Prior studies have shown that publicly insured individuals, in particular, frequently use the ED as their usual source of care and are more likely to return to the ED within 72 hours of an initial visit.[23, 39, 44, 49, 50] Likewise, we found that patients with public insurance were more likely to return but less likely to be admitted on revisit. After controlling for disease severity and other demographic variables, patients with public insurance and of lower socioeconomic status still had lower odds of being hospitalized following a revisit. This might also signify an increase of avoidable hospitalizations among patients of higher SES or with private insurance. Further investigation is needed to explore the reasons for these differences and to identify effective interventions to eliminate disparities.

Our findings have implications for emergency care, ambulatory care, and the larger healthcare system. First, ED revisits are costly and contribute to already overburdened EDs.[10, 11] The average ED visit incurs charges that are 2 to 5 times more than an outpatient office visit.[49, 50] Careful coordination of ambulatory and ED services could not only ensure optimal care for patients but also save the US healthcare system billions of dollars in potentially avoidable healthcare expenditures.[49, 50] Second, prior studies have demonstrated a consistent relationship between poor access to primary care and increased use of the ED for nonurgent conditions.[42] Publicly insured patients have been shown to have disproportionately increased difficulty acquiring and accessing primary care.[41, 42, 47, 51] Furthermore, conditions with high ED revisit rates are similar to conditions reported by Berry et al.[4] as having the highest hospital readmission rates, such as cancer, sickle cell anemia, seizure, pneumonia, asthma, and gastroenteritis. This might suggest a close relationship between 72‐hour ED revisits and 30‐day hospital readmissions. In light of the recent expansion of health insurance coverage to an additional 30 million individuals, the need for better coordination of services throughout the entire continuum of care, including primary care, ED, and inpatient services, has never been more important.[52] Future work could explore condition‐specific revisit or readmission rates to identify the most effective interventions to reduce potentially preventable returns.

This study has several limitations. First, as an administrative database, PHIS has limited clinical data, and reasons for return visits could not be assessed. Variations between hospitals in diagnostic coding might also lead to misclassification bias. Second, we were unable to assess return visits to a different ED. Thus, we may have underestimated revisit frequency. However, because children are generally more likely to seek repeat care in the same hospital,[3] we believe our estimate of return visit rate approximates the actual return visit rate; our findings are also similar to previously reported rates. Third, for the PCP density factor, we were unable to account for the types of insurance each physician accepted or their influence on return rates. Fourth, return visits in our sample could have been for conditions unrelated to the diagnosis at index visit, though the short timeframe considered for revisits makes this less likely. In addition, the crowding index does not include the proportion of occupied beds at the precise moment of the index visit. Finally, this cohort includes only children seen in the EDs of pediatric hospitals, and our findings may not be generalizable to all EDs that provide care for ill and injured children.

We have shown that, in addition to previously identified patient‐level factors, there are visit‐level and access‐related factors associated with pediatric ED return visits. Eighty percent of returning patients are discharged again, and almost one‐fifth are admitted to the hospital. Admitted patients tend to be younger, sicker, chronically ill, and live farther from the hospital. By being aware of patients' comorbidities, PCP access, as well as certain diagnoses associated with high rates of return, physicians may better target interventions to optimize care. This may include having a lower threshold for hospitalization at the initial visit for children at high risk of return, and communication with the PCP at the time of discharge to ensure close follow‐up. Our study helps to provide benchmarks around ED revisit rates, and may serve as a starting point to better understand variation in care. Future efforts should aim to find creative solutions at individual institutions, with the goal of disseminating and replicating successes more broadly. For example, investigators in Boston have shown that the use of a comprehensive home‐based asthma management program has been successful in decreasing emergency department visits and hospitalization rates.[53] It is possible that this approach could be spread to other institutions to decrease revisits for patients with asthma. As a next step, the authors have undertaken an investigation to identify hospital‐level characteristics that may be associated with rates of return visits.

Acknowledgements

The authors thank the following members of the PHIS ED Return Visits Research Group for their contributions to the data analysis plan and interpretation of results of this study: Rustin Morse, MD, Children's Medical Center of Dallas; Catherine Perron, MD, Boston Children's Hospital; John Cheng, MD, Children's Healthcare of Atlanta; Shabnam Jain, MD, MPH, Children's Healthcare of Atlanta; and Amanda Montalbano, MD, MPH, Children's Mercy Hospitals and Clinics. These contributors did not receive compensation for their help with this work.

Disclosures

A.T.A. and A.M.S. conceived the study and developed the initial study design. All authors were involved in the development of the final study design and data analysis plan. C.W.T. collected and analyzed the data. A.T.A. and C.W.T. had full access to all of the data and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors were involved in the interpretation of the data. A.T.A. drafted the article, and all authors made critical revisions to the initial draft and subsequent versions. A.T.A. and A.M.S. take full responsibility for the article as a whole. The authors report no conflicts of interest.

Returns to the hospital following recent encounters, such as an admission to the inpatient unit or evaluation in an emergency department (ED), may reflect the natural progression of a disease, the quality of care received during the initial admission or visit, or the quality of the underlying healthcare system.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] Although national attention has focused on hospital readmissions,[3, 4, 5, 6, 7, 11, 12] ED revisits are a source of concern to emergency physicians.[8, 9] Some ED revisits are medically necessary, but revisits that may be managed in the primary care setting contribute to ED crowding, can be stressful to patients and providers, and increase healthcare costs.[10, 11, 12] Approximately 27 million annual ED visits are made by children, accounting for over one‐quarter of all ED visits in the United States, with a reported ED revisit rate of 2.5% to 5.2%.[2, 13, 14, 15, 16, 17, 18, 19, 20] Improved understanding of the patient‐level or visit‐level factors associated with ED revisits may provide an opportunity to enhance disposition decision making at the index visit and optimize site of and communication around follow‐up care.

Previous studies on ED revisits have largely been conducted in single centers and have used variable visit intervals ranging between 48 hours and 30 days.[2, 13, 16, 18, 21, 22, 23, 24, 25] Two national studies used the National Hospital Ambulatory Medical Care Survey, which includes data from both general and pediatric EDs.[13, 14] Factors identified to be associated with increased odds of returning were young age, higher acuity, chronic conditions, and public insurance. One national study identified some diagnoses associated with higher likelihood of returning,[13] whereas the other focused primarily on infectious disease–related diagnoses.[14]

The purpose of this study was to describe the prevalence of return visits specifically to pediatric EDs and to investigate patient‐level, visit‐level, and healthcare system–related factors that may be associated with return visits and hospitalization at return.

METHODS

Study Design and Data Source

This retrospective cohort study used data from the Pediatric Health Information System (PHIS), an administrative database with data from 44 tertiary care pediatric hospitals in 27 US states and the District of Columbia. This database contains patient demographics, diagnoses, and procedures as well as medications, diagnostic imaging, laboratory, and supply charges for each patient. Data are deidentified prior to inclusion; encrypted medical record numbers allow for the identification of individual patients across all ED visits and hospitalizations to the same hospital. The Children's Hospital Association (Overland Park, KS) and participating hospitals jointly assure the quality and integrity of the data. This study was approved by the institutional review board at Boston Children's Hospital with a waiver for informed consent granted.

Study Population and Protocol

To standardize comparisons across the hospitals, we included data from 23 of the 44 hospitals in PHIS; 7 were excluded for not including ED‐specific data. For institutions that collect information from multiple hospitals within their healthcare system, we included only records from the main campus or children's hospital when possible, leading to the exclusion of 9 hospitals where the data were not able to be segregated. As an additional level of data validation, we compared the hospital‐level ED volume and admission rates as reported in the PHIS to those reported to a separate database (the Pediatric Analysis and Comparison Tool). We further excluded 5 hospitals whose volume differed by >10% between these 2 data sources.

Patients <18 years of age who were discharged from these EDs following their index visit in 2012 formed the eligible cohort.

Key Outcome Measures

The primary outcomes were return visits within 72 hours of discharge from the ED, and return visits resulting in hospitalization, including observation status. We defined an ED revisit as a return within 72 hours of ED discharge regardless of whether the patient was subsequently discharged from the ED on the return visit or hospitalized. We assessed revisits within 72 hours of an index ED discharge, because return visits within this time frame are likely to be related to the index visit.[2, 13, 16, 21, 22, 24, 25, 26]
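The outcome definition above can be sketched in code. This is a minimal illustration, not the authors' implementation: the record fields (`patient_id`, `arrival`, `discharge`, `admitted`) are hypothetical names, and the sketch assumes each visit carries arrival and discharge timestamps.

```python
from collections import defaultdict
from datetime import datetime, timedelta

REVISIT_WINDOW = timedelta(hours=72)  # window stated in the text

def find_revisits(visits):
    """Return (index_visit, return_visit) pairs for returns within 72 hours
    of an ED discharge. Field names are hypothetical, for illustration only."""
    by_patient = defaultdict(list)
    for visit in visits:
        by_patient[visit["patient_id"]].append(visit)

    pairs = []
    for patient_visits in by_patient.values():
        patient_visits.sort(key=lambda v: v["arrival"])
        for earlier, later in zip(patient_visits, patient_visits[1:]):
            if earlier["admitted"]:
                continue  # only visits ending in ED discharge can be followed by a revisit
            gap = later["arrival"] - earlier["discharge"]
            if timedelta(0) <= gap <= REVISIT_WINDOW:
                pairs.append((earlier, later))
    return pairs

visits = [
    {"patient_id": "A", "arrival": datetime(2012, 3, 1, 9), "discharge": datetime(2012, 3, 1, 12), "admitted": False},
    {"patient_id": "A", "arrival": datetime(2012, 3, 3, 20), "discharge": datetime(2012, 3, 4, 1), "admitted": True},
    {"patient_id": "B", "arrival": datetime(2012, 3, 1, 9), "discharge": datetime(2012, 3, 1, 11), "admitted": False},
    {"patient_id": "B", "arrival": datetime(2012, 3, 9, 9), "discharge": datetime(2012, 3, 9, 10), "admitted": False},
]
revisits = find_revisits(visits)
```

In this toy data, patient A returns 56 hours after discharge and is admitted, so the pair counts toward both primary outcomes; patient B returns after 8 days and is not counted.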

Factors Associated With ED Revisits

A priori, we chose to adjust for the following patient‐level factors: age (<30 days, 30 days to <1 year, 1–4 years, 5–11 years, 12–17 years), gender, and socioeconomic status (SES) measured as the zip code–based median household income, obtained from the 2010 US Census, with respect to the federal poverty level (FPL) (<1.5 FPL, 1.5–2 FPL, 2–3 FPL, and >3 FPL).[27] We also adjusted for insurance type (commercial, government, or other), proximity of patient's home zip code to hospital (modeled as the natural log of the geographical distance to patient's home address from the hospital), ED diagnosis‐based severity classification system score (1=low severity, 5=high severity),[28] presence of a complex chronic condition at the index or prior visits using a validated classification scheme,[15, 29, 30, 31] and primary care physician (PCP) density per 100,000 in the patient's residential area (modeled as quartiles: very low, <57.2; low, 57.2–67.9; medium, 68.0–78.7; high, >78.8). PCP density, defined by the Dartmouth Atlas of Health Care,[32, 33, 34] is the number of primary care physicians per 100,000 residents (PCP count) in federal health service areas (HSA). Patients were assigned to a corresponding HSA based on their home zip code.
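As a rough illustration of how these patient‐level adjusters might be encoded, the sketch below bins age and PCP density using the cut points given in the text and log‐transforms distance. The function names are hypothetical, a year is approximated as 365 days, and the handling of values that fall exactly between the published quartile bounds (for example, 78.8) is an assumption.

```python
import math

def age_group(age_days):
    """Map age in days to the five age bands (1 year approximated as 365 days)."""
    age_years = age_days / 365
    if age_days < 30:
        return "<30 days"
    if age_years < 1:
        return "30 days to <1 year"
    if age_years < 5:
        return "1-4 years"
    if age_years < 12:
        return "5-11 years"
    return "12-17 years"

def pcp_density_category(pcps_per_100k):
    """Quartile labels from the text; boundary handling is an assumption."""
    if pcps_per_100k < 57.2:
        return "very low"
    if pcps_per_100k < 68.0:
        return "low"
    if pcps_per_100k <= 78.8:
        return "medium"
    return "high"

def log_distance(miles):
    """Natural log of the home-to-hospital distance, as the distance term is modeled."""
    return math.log(miles)
```

For example, `age_group(200)` falls in the 30 days to <1 year band, and `pcp_density_category(80.0)` falls in the high quartile.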

Visit‐level factors included arrival time of index visit (8:01 am–4:00 pm, 4:01 pm–12:00 am, 12:01 am–8:00 am, representing day, evening, and overnight arrival, respectively), day of the week, season, length of stay (LOS) in the ED during the index visit, and ED crowding (calculated as the average daily LOS/yearly average LOS for the individual ED).[35] We categorized the ED primary diagnosis for each visit using the major diagnosis groupings of a previously described pediatric ED‐specific classification scheme.[36] Using International Classification of Diseases, Ninth Revision (ICD‐9) codes, we identified the conditions with the highest ED revisit rates.
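The shift definitions and the crowding ratio described above can be written out as a short sketch. The function names are hypothetical, seconds are ignored, and the crowding calculation assumes the daily average is a simple mean of that day's LOS values.

```python
from datetime import time

def arrival_shift(arrival):
    """Classify an arrival time into day, evening, or overnight per the text."""
    minutes = arrival.hour * 60 + arrival.minute
    if 8 * 60 < minutes <= 16 * 60:        # 8:01 am to 4:00 pm
        return "day"
    if minutes > 16 * 60 or minutes == 0:  # 4:01 pm to 12:00 am
        return "evening"
    return "overnight"                     # 12:01 am to 8:00 am

def crowding_index(daily_los_hours, yearly_mean_los_hours):
    """Average daily ED LOS divided by the yearly average LOS for the same ED."""
    daily_mean = sum(daily_los_hours) / len(daily_los_hours)
    return daily_mean / yearly_mean_los_hours
```

A value of 1 means the day's average LOS matched the yearly average for that ED; a value of 2 means stays were twice as long as usual.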

Statistical Analyses

Categorical variables describing the study cohort were summarized using frequencies and percentages. Continuous variables were summarized using mean, median, and interquartile range values, where appropriate. We used 2 different hierarchical logistic regression models to assess revisit rates by patient‐ and visit‐level characteristics. The initial model included all patients discharged from the ED following the index visit and assessed for the outcome of a revisit within 72 hours. The second model considered only patients who returned within 72 hours of an index visit and assessed for hospitalization on that return visit. We used generalized linear mixed effects models, with hospital as a random effect to account for the presence of correlated data (within hospitals), nonconstant variability (across hospitals), and binary responses. Adjusted odds ratios with 95% confidence intervals were used as summary measures of the effect of the individual adjusters. Adjusters were missing in fewer than 5% of patients across participating hospitals. Statistical analyses were performed using SAS version 9.3 (SAS Institute Inc., Cary, NC); 2‐sided P values <0.004 were considered statistically significant to account for multiple comparisons (Bonferroni‐adjusted level of significance=0.0038).
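The Bonferroni adjustment divides the family‐wise significance level by the number of comparisons. The text does not state either input; the sketch below assumes an overall level of 0.05 and 13 comparisons only because those values reproduce the reported threshold of 0.0038.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons

# Assumed inputs (not stated in the text): alpha = 0.05, 13 comparisons.
threshold = bonferroni_threshold(0.05, 13)

def is_significant(p_value, cutoff=threshold):
    """True when a 2-sided P value falls below the adjusted cutoff."""
    return p_value < cutoff
```

Under this cutoff a P value of 0.001 would count as significant, whereas 0.0052 would not.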

RESULTS

Patients

A total of 1,610,201 patients <18 years of age evaluated across the 23 PHIS EDs in 2012 were included in the study. Twenty‐one of the 23 EDs have academic affiliations; 10 are located in the South, 6 in the Midwest, 5 in the West, and 2 in the Northeast region of the United States. The annual ED volume for these EDs ranged from 25,090 to 136,160 (median, 65,075; interquartile range, 45,280–85,206). Of the total patients, 1,415,721 (87.9%) were discharged following the index visit and comprised our study cohort. Of these patients, 47,294 (revisit rate: 3.3%) had an ED revisit within 72 hours. There were 4,015 patients (0.3%) who returned more than once within 72 hours, and the largest proportion of these returned with infection‐related conditions. Of those returning, 37,999 (80.3%) were discharged again, whereas 9,295 (19.7%) were admitted to the hospital (Figure 1). The demographic and clinical characteristics of study participants are displayed in Table 1.
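The reported percentages follow directly from the counts in this paragraph. The short check below recomputes them; the variable names are illustrative only.

```python
total_evaluated = 1_610_201
discharged_at_index = 1_415_721
revisits_72h = 47_294
discharged_again = 37_999
admitted_at_return = 9_295

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

discharge_share = pct(discharged_at_index, total_evaluated)   # share discharged at index visit
revisit_rate = pct(revisits_72h, discharged_at_index)         # 72-hour revisit rate
discharged_share = pct(discharged_again, revisits_72h)        # discharged again at return
admitted_share = pct(admitted_at_return, revisits_72h)        # admitted at return
```

The two return dispositions also sum to the total number of revisits, which confirms the counts are internally consistent.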

Figure 1
Patient disposition from the emergency departments of study hospitals (n = 23) in 2012.
Characteristics of Patients Who Returned Within 72 Hours of ED Discharge to the Study EDs
Characteristic | Index Visit, n=1,415,721, n (%) | Return Visits Within 72 Hours of Discharge (n=47,294, 3.3%): Return to Discharge, n (%) | Return Visits Within 72 Hours of Discharge (n=47,294, 3.3%): Return to Admission, n (%)
Gender, female | 659,417 (46.6) | 17,665 (46.5) | 4,304 (46.3)
Payor
Commercial | 379,403 (26.8) | 8,388 (22.1) | 3,214 (34.6)
Government | 925,147 (65.4) | 26,880 (70.7) | 5,786 (62.3)
Other | 111,171 (7.9) | 2,731 (7.2) | 295 (3.2)
Age
<30 days | 19,217 (1.4) | 488 (1.3) | 253 (2.7)
30 days to <1 year | 216,967 (15.3) | 8,280 (21.8) | 2,372 (25.5)
1 year to 4 years | 547,083 (38.6) | 15,542 (40.9) | 3,187 (34.3)
5 years to 11 years | 409,463 (28.9) | 8,906 (23.4) | 1,964 (21.1)
12 years to 17 years | 222,991 (15.8) | 4,783 (12.6) | 1,519 (16.3)
Socioeconomic status (a)
<1.5 times FPL | 493,770 (34.9) | 13,851 (36.5) | 2,879 (31.0)
1.5 to 2 times FPL | 455,490 (32.2) | 12,364 (32.5) | 2,904 (31.2)
2 to 3 times FPL | 367,557 (26.0) | 9,560 (25.2) | 2,714 (29.2)
>3 times FPL | 98,904 (7.0) | 2,224 (5.9) | 798 (8.6)
Primary care physician density per 100,000 patients
Very low | 351,798 (24.9) | 8,727 (23.0) | 2,628 (28.3)
Low | 357,099 (25.2) | 9,810 (25.8) | 2,067 (22.2)
Medium | 347,995 (24.6) | 10,186 (26.8) | 2,035 (21.9)
High | 358,829 (25.4) | 9,276 (24.4) | 2,565 (27.6)
CCC present, yes | 125,774 (8.9) | 4,446 (11.7) | 2,825 (30.4)
Severity score
Low severity (0,1,2) | 721,061 (50.9) | 17,310 (45.6) | 2,955 (31.8)
High severity (3,4,5) | 694,660 (49.1) | 20,689 (54.5) | 6,340 (68.2)
Time of arrival
Day | 533,328 (37.7) | 13,449 (35.4) | 3,396 (36.5)
Evening | 684,873 (48.4) | 18,417 (48.5) | 4,378 (47.1)
Overnight | 197,520 (14.0) | 6,133 (16.1) | 1,521 (16.4)
Season
Winter | 384,957 (27.2) | 10,603 (27.9) | 2,844 (30.6)
Spring | 367,434 (26.0) | 9,923 (26.1) | 2,311 (24.9)
Summer | 303,872 (21.5) | 8,308 (21.9) | 1,875 (20.2)
Fall | 359,458 (25.4) | 9,165 (24.1) | 2,265 (24.4)
Weekday/weekend
Monday | 217,774 (15.4) | 5,646 (14.9) | 1,394 (15)
Tuesday | 198,220 (14.0) | 5,054 (13.3) | 1,316 (14.2)
Wednesday | 194,295 (13.7) | 4,985 (13.1) | 1,333 (14.3)
Thursday | 191,950 (13.6) | 5,123 (13.5) | 1,234 (13.3)
Friday | 190,022 (13.4) | 5,449 (14.3) | 1,228 (13.2)
Saturday | 202,247 (14.3) | 5,766 (15.2) | 1,364 (14.7)
Sunday | 221,213 (15.6) | 5,976 (15.7) | 1,426 (15.3)
Distance from hospital in miles, median (IQR) | 8.3 (4.6–14.9) | 9.2 (4.9–17.4) | 8.3 (4.6–14.9)
ED crowding score at index visit, median (IQR) | 1.0 (0.9–1.1) | 1.0 (0.9–1.1) | 1.0 (0.9–1.1)
ED LOS in hours at index visit, median (IQR) | 2.0 (1.0–3.0) | 3.0 (2.0–5.0) | 2.0 (1.0–3.0)

  • NOTE: Abbreviations: CCC, complex chronic condition; ED, emergency department; FPL, federal poverty level; IQR, interquartile range; LOS, length of stay.

  • (a) Socioeconomic status is relative to the federal poverty level for a family of 4.

ED Revisit Rates and Revisits Resulting in Admission

In multivariate analyses, patients had higher odds of returning to the ED within 72 hours of discharge if they had a chronic condition, were <1 year old, had a higher severity score, or had public insurance. Visit‐level factors associated with higher odds of revisits included arrival for the index visit during the evening or overnight shift or on a Friday or Saturday, index visit during times of lower ED crowding, and living closer to the hospital. On return, patients were more likely to be hospitalized if they had a higher severity score, a chronic condition, private insurance, or were <30 days old. Visit‐level factors associated with higher odds of hospitalization at revisit included an index visit during the evening or overnight shift and living further from the hospital. Although the median SES and PCP density of a patient's area of residence were not associated with greater likelihood of returning, when they returned, patients residing in an area with a lower SES and higher PCP densities (>78.8 PCPs/100,000) had lower odds of being admitted to the hospital. Patients whose index visit was on a Sunday also had lower odds of being hospitalized upon return (Table 2).

Multivariate Analyses of Factors Associated With ED Revisits and Admission at Return
Characteristic | Adjusted OR of 72‐Hour Revisit (95% CI), n=1,380,723 | P Value | Adjusted OR of 72‐Hour Revisit Admissions (95% CI), n=46,364 | P Value
Gender
Male | 0.99 (0.97–1.01) | 0.2809 | 1.02 (0.97–1.07) | 0.5179
Female | Reference | | Reference |
Payor
Government | 1.14 (1.11–1.17) | <0.0001 | 0.68 (0.64–0.72) | <0.0001
Other | 0.97 (0.92–1.01) | 0.1148 | 0.33 (0.28–0.39) | <0.0001
Private | Reference | | Reference |
Age group
30 days to <1 year | 1.32 (1.22–1.42) | <0.0001 | 0.58 (0.49–0.69) | <0.0001
1 year to 5 years | 0.89 (0.83–0.96) | 0.003 | 0.41 (0.34–0.48) | <0.0001
5 years to 11 years | 0.69 (0.64–0.74) | <0.0001 | 0.40 (0.33–0.48) | <0.0001
12 years to 17 years | 0.72 (0.66–0.77) | <0.0001 | 0.50 (0.42–0.60) | <0.0001
<30 days | Reference | | Reference |
Socioeconomic status (a)
% <1.5 times FPL | 0.96 (0.92–1.01) | 0.0992 | 0.82 (0.74–0.92) | 0.0005
% 1.5 to 2 times FPL | 0.98 (0.94–1.02) | 0.2992 | 0.83 (0.75–0.92) | 0.0005
% 2 to 3 times FPL | 1.02 (0.98–1.07) | 0.292 | 0.88 (0.79–0.97) | 0.01
% >3 times FPL | Reference | | Reference |
Severity score
High severity, 4, 5, 6 | 1.43 (1.40–1.45) | <0.0001 | 3.42 (3.23–3.62) | <0.0001
Low severity, 1, 2, 3 | Reference | | Reference |
Presence of any CCC
Yes | 1.90 (1.86–1.96) | <0.0001 | 2.92 (2.75–3.10) | <0.0001
No | Reference | | Reference |
Time of arrival
Evening | 1.05 (1.03–1.08) | <0.0001 | 1.37 (1.29–1.44) | <0.0001
Overnight | 1.19 (1.15–1.22) | <0.0001 | 1.84 (1.71–1.97) | <0.0001
Day | Reference | | Reference |
Season
Winter | 1.09 (1.06–1.11) | <0.0001 | 1.06 (0.99–1.14) | 0.0722
Spring | 1.07 (1.04–1.10) | <0.0001 | 0.98 (0.91–1.046) | 0.4763
Summer | 1.05 (1.02–1.08) | 0.0011 | 0.93 (0.87–1.01) | 0.0729
Fall | Reference | | Reference |
Weekday/weekend
Thursday | 1.02 (0.982–1.055) | 0.3297 | 0.983 (0.897–1.078) | 0.7185
Friday | 1.08 (1.04–1.11) | <0.0001 | 1.03 (0.94–1.13) | 0.5832
Saturday | 1.08 (1.04–1.12) | <0.0001 | 0.89 (0.81–0.97) | 0.0112
Sunday | 1.02 (0.99–1.06) | 0.2054 | 0.81 (0.74–0.89) | <0.0001
Monday | 1.00 (0.96–1.03) | 0.8928 | 0.98 (0.90–1.07) | 0.6647
Tuesday | 0.99 (0.95–1.03) | 0.5342 | 0.93 (0.85–1.02) | 0.1417
Wednesday | Reference | | Reference |
PCP ratio per 100,000 patients
57.2–67.9 | 1.00 (0.96–1.04) | 0.8844 | 0.93 (0.84–1.03) | 0.1669
68.0–78.7 | 1.00 (0.95–1.04) | 0.8156 | 0.86 (0.77–0.96) | 0.0066
>78.8 | 1.00 (0.95–1.04) | 0.6883 | 0.82 (0.73–0.92) | 0.001
<57.2 | Reference | | Reference |
ED crowding score at index visit (b)
2 | 0.92 (0.90–0.95) | <0.0001 | 0.96 (0.88–1.05) | 0.3435
1 | Reference | | Reference |
Distance from hospital (c)
3.168, 23.6 miles | 0.95 (0.94–0.96) | <0.0001 | 1.16 (1.12–1.19) | <0.0001
2.168, 8.7 miles | Reference | | Reference |
ED LOS at index visit (b)
3.7 hours | 1.003 (1.001–1.005) | 0.0052 | NA |
2.7 hours | Reference | | |

  • NOTE: Effects of continuous variables are assessed as 1‐unit offsets from the mean. Abbreviations: CCC, complex chronic condition; CI, confidence interval; ED, emergency department; FPL, federal poverty level; LOS, length of stay; OR, odds ratio; NA, not applicable.

  • (a) Socioeconomic status is relative to the FPL for a family of 4.

  • (b) ED crowding score and LOS are based on index visit. ED crowding score is calculated as the daily LOS (in hours)/overall LOS (in hours). Overall average across hospitals=1; a 1‐unit increase translates into twice the duration for the daily LOS over the yearly average ED LOS.

  • (c) Modeled as the natural log of the patient geographic distance from the hospital based on zip codes. Number in parentheses represents the exponential of the modeled variable.
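Because distance enters the model on the natural log scale, the tabulated odds ratios apply to a 1‐unit increase in ln(distance), that is, to distance multiplied by e (about 2.7). The sketch below shows one way to restate those odds ratios per doubling of distance; the rescaling is an illustration that assumes the effect is linear in ln(distance), and it is not a result reported by the study.

```python
import math

def or_per_doubling(or_per_log_unit):
    """Rescale an OR per 1-unit increase in ln(distance) to an OR per doubling."""
    return or_per_log_unit ** math.log(2)

reference_miles = math.exp(2.168)          # back-transform of the reference value
revisit_doubling = or_per_doubling(0.95)   # adjusted OR for 72-hour revisit
admit_doubling = or_per_doubling(1.16)     # adjusted OR for admission at return
```

Under that assumption, each doubling of distance corresponds to roughly 3% to 4% lower odds of a revisit and roughly 11% higher odds of admission at return.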

Diagnoses Associated With Return Visits

Patients with index visit diagnoses of sickle cell disease and leukemia had the highest proportion of return visits (10.7% and 7.3%, respectively). Other conditions with high revisit rates included infectious diseases such as cellulitis, bronchiolitis, and gastroenteritis. Patients with other chronic diseases such as diabetes and with devices, such as gastrostomy tubes, also had high rates of return visits. At return, the rate of hospitalization for these conditions ranged from a 1‐in‐6 chance of hospitalization for a diagnosis of fever to a 1‐in‐2 chance of hospitalization for patients with sickle cell anemia (Table 3).

Major Diagnostic Subgroups With the Highest ED Revisit and Admission at Return Rates
Major Diagnostic Subgroup | No. of Index ED Visit Discharges (a) | 72‐Hour Revisit, % (95% CI) | Admitted on Return, % (95% CI)
Sickle cell anemia | 2,531 | 10.7 (9.5–11.9) | 49.6 (43.7–55.6)
Neoplastic diseases, cancer | 536 | 7.3 (5.1–9.5) | 36 (21–51)
Infectious gastrointestinal diseases | 802 | 7.2 (5.4–9.0) | 21 (10–31)
Devices and complications of the circulatory system (b) | 1,033 | 6.9 (5.3–8.4) | 45 (34–57)
Other hematologic diseases (b) | 1,538 | 6.1 (4.9–7.3) | 33 (24–43)
Fever | 80,626 | 5.9 (5.7–6.0) | 16.3 (15.2–17.3)
Dehydration | 7,362 | 5.4 (5.2–5.5) | 34.6 (30.1–39)
Infectious respiratory diseases | 72,652 | 5.4 (5.2–5.5) | 28.6 (27.2–30)
Seizures | 17,637 | 5.3 (4.9–5.6) | 33.3 (30.3–36.4)
Other devices and complications (b) | 1,896 | 5.3 (4.3–6.3) | 39.0 (29.4–48.6)
Infectious skin, dermatologic and soft tissue diseases | 40,272 | 4.7 (4.5–5) | 20.0 (18.2–21.8)
Devices and complications of the gastrointestinal system (b) | 4,692 | 4.6 (4.0–5.2) | 24.7 (18.9–30.4)
Vomiting | 44,730 | 4.4 (4.2–4.6) | 23.7 (21.8–25.6)
Infectious urinary tract diseases | 17,020 | 4.4 (4.1–4.7) | 25.9 (22.7–29)
Headache | 19,016 | 4.3 (4.1–4.6) | 28.2 (25.1–31.3)
Diabetes mellitus | 1,531 | 4.5 (3.3–5.3) | 29 (18–40)
Abdominal pain | 39,594 | 4.2 (4–4.4) | 24.8 (22.7–26.8)
Other infectious diseases (b) | 647 | 4.2 (2.6–5.7) | 33 (16–51)
Gastroenteritis | 55,613 | 4.0 (3.8–4.1) | 20.6 (18.9–22.3)

  • NOTE: Abbreviations: CI, confidence interval; ED, emergency department; NOS, not otherwise specified.

  • (a) Diagnoses with <500 index visits (ie, <2 visits per month across the 23 hospitals) or <30 revisits within entire study cohort excluded from analyses.

  • (b) Most prevalent diagnoses as identified by International Classification of Diseases, Ninth Revision codes within specified major diagnostic subgroups: devices and complications of the circulatory system, complication of other vascular device, implant, and graft; other hematologic diseases, anemia NOS, neutropenia NOS, or thrombocytopenia NOS; other devices and complications, hemorrhage complicating a procedure; devices and complications of the gastrointestinal system, gastrostomy; other infectious diseases, perinatal infections.

DISCUSSION

In this nationally representative sample of free‐standing children's hospitals, 3.3% of patients discharged from the ED returned to the same ED within 72 hours. This rate is similar to rates previously published in studies of general EDs.[11, 15] Of the returning children, over 80% were discharged again, and 19.7% were hospitalized, which is two‐thirds more than the admission rate at index visit (12%). In accordance with previous studies,[14, 16, 25] we found higher disease severity, presence of a chronic condition, and younger age were strongly associated with both the odds of patients returning to the ED and of being hospitalized at return. Patients who were hospitalized lived further away from the hospital and were of a higher SES. In this study, we show that visit‐level and access‐related factors are also associated with increased risk of return, although to a lesser degree. Patients seen on a weekend (Friday or Saturday) were found to have higher odds of returning, whereas those seen initially on a Sunday had lower odds of hospitalization at return. In this study, we also found that patients seen on the evening or night shifts at the index presentation had a significant association with return visits and hospitalization at return. Additionally, we found that although PCP density was not associated with the odds of returning to the ED, patients from areas with a higher PCP density were less likely to be admitted at return. In addition, by evaluating the diagnoses of patients who returned, we found that many infectious conditions commonly seen in the ED also had high return rates.

As previously shown,[23] we found that patients with complex and chronic diseases were at risk for ED revisits, especially patients with sickle cell anemia and cancer (mainly acute leukemia). In addition, patients with a chronic condition were 3 times more likely to be hospitalized when they returned. These findings may indicate an opportunity for improved discharge planning and coordination of care with subspecialty care providers for particularly at‐risk populations, or stronger consideration of admission at the index visit. However, admission for these patients at revisit may be unavoidable.

Excluding patients with chronic and complex conditions, the majority of conditions with high revisit rates were acute infectious conditions. One national study showed that >70% of ED revisits by patients with infectious conditions had planned ED follow‐up.[13] Although this study was unable to assess the reasons for return or admission at return, children with infectious diseases often worsen over time (eg, those with bronchiolitis). The relatively low admission rates at return for these conditions, despite evidence that providers may have a lower threshold for admission when a patient returns to the ED shortly after discharge,[24] may reflect the potential for improving follow‐up at the PCP office. However, although some revisits may be prevented,[37, 38] we recognize that an ED visit could be appropriate and necessary for some of these children, especially those without primary care.

Access to primary care and insurance status influence ED utilization.[14, 39, 40, 41] A fragmented healthcare system with poor access to primary care is strongly associated with utilization of the ED for nonurgent care. A high ED revisit rate might be indicative of poor coordination between ED and outpatient services.[9, 39, 42, 43, 44, 45, 46] Our study's finding of increased risk of return visit if the index visit occurred on a Friday or Saturday, and a decreased likelihood of subsequent admission when a patient returns on a Sunday, may suggest limited, or perceived limited, access to the PCP over a weekend. Although insured patients tend to use the ED less often for nonemergent cases, even patients who have PCPs might still choose to return to the ED out of convenience.[47, 48] This may be reflected in our finding that, when adjusted for insurance status and PCP density, patients who lived closer to the hospital were more likely to return but less likely to be admitted, thereby suggesting proximity as a factor in the decision to return. It is also possible that patients residing farther away returned to another institution. Although PCP density did not seem to be associated with revisits, patients who lived in areas with higher PCP density were less likely to be admitted when they returned. In this study, there was a stepwise gradient in the effect of PCP density on the odds of being hospitalized on return, with patients in areas with fewer PCPs being admitted at higher rates. Guttmann et al.,[40] in a recent study conducted in Canada, where there is universal health insurance, showed that children residing in areas with higher PCP densities had higher rates of PCP visits but lower rates of ED visits compared to children residing in areas with lower PCP densities. It is possible that emergency physicians have more confidence that patients will have dedicated follow‐up when a PCP can be identified.
These findings suggest that the development of PCP networks with expanded access, such as alignment of office hours with parent need and patient/parent education about PCP availability, may reduce ED revisits. Alternatively, creation of centralized hospital‐based urgent care centers for evening, night, and weekend visits may benefit both the patient and the PCP and avoid ED revisits and associated costs.

Targeting and eliminating disparities in care might also play a role in reducing ED revisits. Prior studies have shown that publicly insured individuals, in particular, frequently use the ED as their usual source of care and are more likely to return to the ED within 72 hours of an initial visit.[23, 39, 44, 49, 50] Likewise, we found that patients with public insurance were more likely to return but less likely to be admitted on revisit. After controlling for disease severity and other demographic variables, patients with public insurance and of lower socioeconomic status still had lower odds of being hospitalized following a revisit. This might also signify an increase of avoidable hospitalizations among patients of higher SES or with private insurance. Further investigation is needed to explore the reasons for these differences and to identify effective interventions to eliminate disparities.

Our findings have implications for emergency care, ambulatory care, and the larger healthcare system. First, ED revisits are costly and contribute to already overburdened EDs.[10, 11] The average ED visit incurs charges that are 2 to 5 times higher than those of an outpatient office visit.[49, 50] Careful coordination of ambulatory and ED services could not only ensure optimal care for patients but also save the US healthcare system billions of dollars in potentially avoidable healthcare expenditures.[49, 50] Second, prior studies have demonstrated a consistent relationship between poor access to primary care and increased use of the ED for nonurgent conditions.[42] Publicly insured patients have been shown to have disproportionately increased difficulty acquiring and accessing primary care.[41, 42, 47, 51] Furthermore, conditions with high ED revisit rates are similar to the conditions reported by Berry et al.[4] as having the highest hospital readmission rates, such as cancer, sickle cell anemia, seizure, pneumonia, asthma, and gastroenteritis. This might suggest a close relationship between 72‐hour ED revisits and 30‐day hospital readmissions. In light of the recent expansion of health insurance coverage to an additional 30 million individuals, the need for better coordination of services throughout the entire continuum of care, including primary care, ED, and inpatient services, has never been more important.[52] Future work could explore condition‐specific revisit or readmission rates to identify the most effective interventions to reduce potentially preventable returns.

This study has several limitations. First, as an administrative database, PHIS has limited clinical data, and reasons for return visits could not be assessed. Variations between hospitals in diagnostic coding might also lead to misclassification bias. Second, we were unable to assess return visits to a different ED. Thus, we may have underestimated revisit frequency. However, because children are generally more likely to seek repeat care in the same hospital,[3] we believe our estimate of return visit rate approximates the actual return visit rate; our findings are also similar to previously reported rates. Third, for the PCP density factor, we were unable to account for the types of insurance each physician accepted or their influence on return rates. Fourth, return visits in our sample could have been for conditions unrelated to the diagnosis at index visit, though the short timeframe considered for revisits makes this less likely. In addition, the crowding index does not include the proportion of occupied beds at the precise moment of the index visit. Finally, this cohort includes only children seen in the EDs of pediatric hospitals, and our findings may not be generalizable to all EDs that provide care for ill and injured children.

We have shown that, in addition to previously identified patient‐level factors, there are visit‐level and access‐related factors associated with pediatric ED return visits. Of returning patients, about 80% are discharged again, and almost one‐fifth are admitted to the hospital. Admitted patients tend to be younger, sicker, and chronically ill, and to live farther from the hospital. By being aware of patients' comorbidities and PCP access, as well as certain diagnoses associated with high rates of return, physicians may better target interventions to optimize care. This may include having a lower threshold for hospitalization at the initial visit for children at high risk of return, and communication with the PCP at the time of discharge to ensure close follow‐up. Our study helps to provide benchmarks around ED revisit rates, and may serve as a starting point to better understand variation in care. Future efforts should aim to find creative solutions at individual institutions, with the goal of disseminating and replicating successes more broadly. For example, investigators in Boston have shown that a comprehensive home‐based asthma management program has been successful in decreasing emergency department visits and hospitalization rates.[53] It is possible that this approach could be spread to other institutions to decrease revisits for patients with asthma. As a next step, the authors have undertaken an investigation to identify hospital‐level characteristics that may be associated with rates of return visits.

Acknowledgements

The authors thank the following members of the PHIS ED Return Visits Research Group for their contributions to the data analysis plan and interpretation of results of this study: Rustin Morse, MD, Children's Medical Center of Dallas; Catherine Perron, MD, Boston Children's Hospital; John Cheng, MD, Children's Healthcare of Atlanta; Shabnam Jain, MD, MPH, Children's Healthcare of Atlanta; and Amanda Montalbano, MD, MPH, Children's Mercy Hospitals and Clinics. These contributors did not receive compensation for their help with this work.

Disclosures

A.T.A. and A.M.S. conceived the study and developed the initial study design. All authors were involved in the development of the final study design and data analysis plan. C.W.T. collected and analyzed the data. A.T.A. and C.W.T. had full access to all of the data and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors were involved in the interpretation of the data. A.T.A. drafted the article, and all authors made critical revisions to the initial draft and subsequent versions. A.T.A. and A.M.S. take full responsibility for the article as a whole. The authors report no conflicts of interest.

References
  1. Joint policy statement—guidelines for care of children in the emergency department. Pediatrics. 2009;124:1233–1243.
  2. Alessandrini EA, Lavelle JM, Grenfell SM, Jacobstein CR, Shaw KN. Return visits to a pediatric emergency department. Pediatr Emerg Care. 2004;20:166–171.
  3. Axon RN, Williams MV. Hospital readmission as an accountability measure. JAMA. 2011;305:504–505.
  4. Berry JG, Hall DE, Kuo DZ, et al. Hospital utilization and characteristics of patients experiencing recurrent readmissions within children's hospitals. JAMA. 2011;305:682–690.
  5. Berry JG, Toomey SL, Zaslavsky AM, et al. Pediatric readmission prevalence and variability across hospitals. JAMA. 2013;309:372–380.
  6. Carrns A. Farewell, and don't come back. Health reform gives hospitals a big incentive to send patients home for good. US News World Rep. 2010;147:20, 22–23.
  7. Coye MJ. CMS' stealth health reform. Plan to reduce readmissions and boost the continuum of care. Hosp Health Netw. 2008;82:24.
  8. Lerman B, Kobernick MS. Return visits to the emergency department. J Emerg Med. 1987;5:359–362.
  9. Rising KL, White LF, Fernandez WG, Boutwell AE. Emergency department visits after hospital discharge: a missing part of the equation. Ann Emerg Med. 2013;62:145–150.
  10. Stang AS, Straus SE, Crotts J, Johnson DW, Guttmann A. Quality indicators for high acuity pediatric conditions. Pediatrics. 2013;132:752–762.
  11. Fontanarosa PB, McNutt RA. Revisiting hospital readmissions. JAMA. 2013;309:398–400.
  12. Vaduganathan M, Bonow RO, Gheorghiade M. Thirty‐day readmissions: the clock is ticking. JAMA. 2013;309:345–346.
  13. Adekoya N. Patients seen in emergency departments who had a prior visit within the previous 72 h‐National Hospital Ambulatory Medical Care Survey, 2002. Public Health. 2005;119:914–918.
  14. Cho CS, Shapiro DJ, Cabana MD, Maselli JH, Hersh AL. A national depiction of children with return visits to the emergency department within 72 hours, 2001–2007. Pediatr Emerg Care. 2012;28:606–610.
  15. Feudtner C, Levin JE, Srivastava R, et al. How well can hospital readmission be predicted in a cohort of hospitalized children? A retrospective, multicenter study. Pediatrics. 2009;123:286–293.
  16. Goldman RD, Ong M, Macpherson A. Unscheduled return visits to the pediatric emergency department‐one‐year experience. Pediatr Emerg Care. 2006;22:545–549.
  17. Klein‐Kremer A, Goldman RD. Return visits to the emergency department among febrile children 3 to 36 months of age. Pediatr Emerg Care. 2011;27:1126–1129.
  18. LeDuc K, Rosebrook H, Rannie M, Gao D. Pediatric emergency department recidivism: demographic characteristics and diagnostic predictors. J Emerg Nurs. 2006;32:131–138.
  19. Healthcare Cost and Utilization Project. Pediatric emergency department visits in community hospitals from selected states, 2005. Statistical brief #52. Available at: http://www.ncbi.nlm.nih.gov/books/NBK56039. Accessed October 3, 2013.
  20. Sharma V, Simon SD, Bakewell JM, Ellerbeck EF, Fox MH, Wallace DD. Factors influencing infant visits to emergency departments. Pediatrics. 2000;106:1031–1039.
  21. Ali AB, Place R, Howell J, Malubay SM. Early pediatric emergency department return visits: a prospective patient‐centric assessment. Clin Pediatr (Phila). 2012;51:651–658.
  22. Hu KW, Lu YH, Lin HJ, Guo HR, Foo NP. Unscheduled return visits with and without admission post emergency department discharge. J Emerg Med. 2012;43:1110–1118.
  23. Jacobstein CR, Alessandrini EA, Lavelle JM, Shaw KN. Unscheduled revisits to a pediatric emergency department: risk factors for children with fever or infection‐related complaints. Pediatr Emerg Care. 2005;21:816–821.
  24. Sauvin G, Freund Y, Saidi K, Riou B, Hausfater P. Unscheduled return visits to the emergency department: consequences for triage. Acad Emerg Med. 2013;20:33–39.
  25. Zimmerman DR, McCarten‐Gibbs KA, DeNoble DH, et al. Repeat pediatric visits to a general emergency department. Ann Emerg Med. 1996;28:467–473.
  26. Keith KD, Bocka JJ, Kobernick MS, Krome RL, Ross MA. Emergency department revisits. Ann Emerg Med. 1989;18:964–968.
  27. US Department of Health 19:7078.
  28. Feudtner C, Christakis DA, Connell FA. Pediatric deaths attributable to complex chronic conditions: a population‐based study of Washington State, 1980–1997. Pediatrics. 2000;106:205–209.
  29. Feudtner C, Hays RM, Haynes G, Geyer JR, Neff JM, Koepsell TD. Deaths attributed to pediatric complex chronic conditions: national trends and implications for supportive care services. Pediatrics. 2001;107:E99.
  30. Feudtner C, Silveira MJ, Christakis DA. Where do children with complex chronic conditions die? Patterns in Washington State, 1980–1998. Pediatrics. 2002;109:656–660.
  31. Dartmouth Atlas of Health Care. Hospital and physician capacity, 2006. Available at: http://www.dartmouthatlas.org/data/topic/topic.aspx?cat=24. Accessed October 7, 2013.
  32. Dartmouth Atlas of Health Care. Research methods. What is an HSA/HRR? Available at: http://www.dartmouthatlas.org/tools/faq/researchmethods.aspx. Accessed October 7, 2013.
  33. Dartmouth Atlas of Health Care. Appendix on the geography of health care in the United States. Available at: http://www.dartmouthatlas.org/downloads/methods/geogappdx.pdf. Accessed October 7, 2013.
  34. Beniuk K, Boyle AA, Clarkson PJ. Emergency department crowding: prioritising quantified crowding measures using a Delphi study. Emerg Med J. 2012;29:868–871.
  35. Alessandrini EA, Alpern ER, Chamberlain JM, Shea JA, Gorelick MH. A new diagnosis grouping system for child emergency department visits. Acad Emerg Med. 2010;17:204–213.
  36. Guttmann A, Zagorski B, Austin PC, et al. Effectiveness of emergency department asthma management strategies on return visits in children: a population‐based study. Pediatrics. 2007;120:e1402–e1410.
  37. Horwitz DA, Schwarz ES, Scott MG, Lewis LM. Emergency department patients with diabetes have better glycemic control when they have identifiable primary care providers. Acad Emerg Med. 2012;19:650–655.
  38. Billings J, Zeitel L, Lukomnik J, Carey TS, Blank AE, Newman L. Impact of socioeconomic status on hospital use in New York City. Health Aff (Millwood). 1993;12:162–173.
  39. Guttmann A, Shipman SA, Lam K, Goodman DC, Stukel TA. Primary care physician supply and children's health care use, access, and outcomes: findings from Canada. Pediatrics. 2010;125:1119–1126.
  40. Asplin BR, Rhodes KV, Levy H, et al. Insurance status and access to urgent ambulatory care follow‐up appointments. JAMA. 2005;294:1248–1254.
  41. Kellermann AL, Weinick RM. Emergency departments, Medicaid costs, and access to primary care—understanding the link. N Engl J Med. 2012;366:2141–2143.
  42. Committee on the Future of Emergency Care in the United States Health System. Emergency Care for Children: Growing Pains. Washington, DC: The National Academies Press; 2007.
  43. Committee on the Future of Emergency Care in the United States Health System. Hospital‐Based Emergency Care: At the Breaking Point. Washington, DC: The National Academies Press; 2007.
  44. Radley DC, Schoen C. Geographic variation in access to care—the relationship with quality. N Engl J Med. 2012;367:3–6.
  45. Tang N, Stein J, Hsia RY, Maselli JH, Gonzales R. Trends and characteristics of US emergency department visits, 1997–2007. JAMA. 2010;304:664–670.
  46. Young GP, Wagner MB, Kellermann AL, Ellis J, Bouley D. Ambulatory visits to hospital emergency departments. Patterns and reasons for use. 24 Hours in the ED Study Group. JAMA. 1996;276:460–465.
  47. Tranquada KE, Denninghoff KR, King ME, Davis SM, Rosen P. Emergency department workload increase: dependence on primary care? J Emerg Med. 2010;38:279–285.
  48. Network for Excellence in Health Innovation. Leading healthcare research organizations to examine emergency department overuse. New England Research Institute, 2008. Available at: http://www.nehi.net/news/310‐leading‐health‐care‐research‐organizations‐to‐examine‐emergency‐department‐overuse/view. Accessed October 4, 2013.
  49. Robert Wood Johnson Foundation. Quality field notes: reducing inappropriate emergency department use. Available at: http://www.rwjf.org/en/research‐publications/find‐rwjf‐research/2013/09/quality‐field‐notes–reducing‐inappropriate‐emergency‐department.html.
  50. Access of Medicaid recipients to outpatient care. N Engl J Med. 1994;330:1426–1430.
  51. Medicaid policy statement. Pediatrics. 2013;131:e1697–e1706.
  52. Woods ER, Bhaumik U, Sommer SJ, et al. Community asthma initiative: evaluation of a quality improvement program for comprehensive asthma care. Pediatrics. 2012;129:465–472.
References
  1. Joint policy statement—guidelines for care of children in the emergency department. Pediatrics. 2009;124:12331243.
  2. Alessandrini EA, Lavelle JM, Grenfell SM, Jacobstein CR, Shaw KN. Return visits to a pediatric emergency department. Pediatr Emerg Care. 2004;20:166171.
  3. Axon RN, Williams MV. Hospital readmission as an accountability measure. JAMA. 2011;305:504505.
  4. Berry JG, Hall DE, Kuo DZ, et al. Hospital utilization and characteristics of patients experiencing recurrent readmissions within children's hospitals. JAMA. 2011;305:682690.
  5. Berry JG, Toomey SL, Zaslavsky AM, et al. Pediatric readmission prevalence and variability across hospitals. JAMA. 2013;309:372380.
  6. Carrns A. Farewell, and don't come back. Health reform gives hospitals a big incentive to send patients home for good. US News World Rep. 2010;147:20, 2223.
  7. Coye MJ. CMS' stealth health reform. Plan to reduce readmissions and boost the continuum of care. Hosp Health Netw. 2008;82:24.
  8. Lerman B, Kobernick MS. Return visits to the emergency department. J Emerg Med. 1987;5:359362.
  9. Rising KL, White LF, Fernandez WG, Boutwell AE. Emergency department visits after hospital discharge: a missing part of the equation. Ann Emerg Med. 2013;62:145150.
  10. Stang AS, Straus SE, Crotts J, Johnson DW, Guttmann A. Quality indicators for high acuity pediatric conditions. Pediatrics. 2013;132:752762.
  11. Fontanarosa PB, McNutt RA. Revisiting hospital readmissions. JAMA. 2013;309:398400.
  12. Vaduganathan M, Bonow RO, Gheorghiade M. Thirty‐day readmissions: the clock is ticking. JAMA. 2013;309:345346.
  13. Adekoya N. Patients seen in emergency departments who had a prior visit within the previous 72 h‐National Hospital Ambulatory Medical Care Survey, 2002. Public Health. 2005;119:914918.
  14. Cho CS, Shapiro DJ, Cabana MD, Maselli JH, Hersh AL. A national depiction of children with return visits to the emergency department within 72 hours, 2001–2007. Pediatr Emerg Care. 2012;28:606610.
  15. Feudtner C, Levin JE, Srivastava R, et al. How well can hospital readmission be predicted in a cohort of hospitalized children? A retrospective, multicenter study. Pediatrics. 2009;123:286293.
  16. Goldman RD, Ong M, Macpherson A. Unscheduled return visits to the pediatric emergency department‐one‐year experience. Pediatr Emerg Care. 2006;22:545549.
  17. Klein‐Kremer A, Goldman RD. Return visits to the emergency department among febrile children 3 to 36 months of age. Pediatr Emerg Care. 2011;27:11261129.
  18. LeDuc K, Rosebrook H, Rannie M, Gao D. Pediatric emergency department recidivism: demographic characteristics and diagnostic predictors. J Emerg Nurs. 2006;32:131138.
  19. Healthcare Cost and Utilization Project. Pediatric emergency department visits in community hospitals from selected states, 2005. Statistical brief #52. Available at: http://www.ncbi.nlm.nih.gov/books/NBK56039. Accessed October 3, 2013.
  20. Sharma V, Simon SD, Bakewell JM, Ellerbeck EF, Fox MH, Wallace DD. Factors influencing infant visits to emergency departments. Pediatrics. 2000;106:10311039.
  21. Ali AB, Place R, Howell J, Malubay SM. Early pediatric emergency department return visits: a prospective patient‐centric assessment. Clin Pediatr (Phila). 2012;51:651658.
  22. Hu KW, Lu YH, Lin HJ, Guo HR, Foo NP. Unscheduled return visits with and without admission post emergency department discharge. J Emerg Med. 2012;43:11101118.
  23. Jacobstein CR, Alessandrini EA, Lavelle JM, Shaw KN. Unscheduled revisits to a pediatric emergency department: risk factors for children with fever or infection‐related complaints. Pediatr Emerg Care. 2005;21:816821.
  24. Sauvin G, Freund Y, Saidi K, Riou B, Hausfater P. Unscheduled return visits to the emergency department: consequences for triage. Acad Emerg Med. 2013;20:3339.
  25. Zimmerman DR, McCarten‐Gibbs KA, DeNoble DH, et al. Repeat pediatric visits to a general emergency department. Ann Emerg Med. 1996;28:467473.
  26. Keith KD, Bocka JJ, Kobernick MS, Krome RL, Ross MA. Emergency department revisits. Ann Emerg Med. 1989;18:964968.
  27. US Department of Health 19:7078.
  28. Feudtner C, Christakis DA, Connell FA. Pediatric deaths attributable to complex chronic conditions: a population‐based study of Washington State, 1980–1997. Pediatrics. 2000;106:205209.
  29. Feudtner C, Hays RM, Haynes G, Geyer JR, Neff JM, Koepsell TD. Deaths attributed to pediatric complex chronic conditions: national trends and implications for supportive care services. Pediatrics. 2001;107:E99.
  30. Feudtner C, Silveira MJ, Christakis DA. Where do children with complex chronic conditions die? Patterns in Washington State, 1980–1998. Pediatrics. 2002;109:656660.
  31. Dartmouth Atlas of Health Care. Hospital and physician capacity, 2006. Available at: http://www.dartmouthatlas.org/data/topic/topic.aspx?cat=24. Accessed October 7, 2013.
  32. Dartmouth Atlas of Health Care. Research methods. What is an HSA/HRR? Available at: http://www.dartmouthatlas.org/tools/faq/researchmethods.aspx. Accessed October 7, 2013,.
  33. Dartmouth Atlas of Health Care. Appendix on the geography of health care in the United States. Available at: http://www.dartmouthatlas.org/downloads/methods/geogappdx.pdf. Accessed October 7, 2013.
  34. Beniuk K, Boyle AA, Clarkson PJ. Emergency department crowding: prioritising quantified crowding measures using a Delphi study. Emerg Med J. 2012;29:868871.
  35. Alessandrini EA, Alpern ER, Chamberlain JM, Shea JA, Gorelick MH. A new diagnosis grouping system for child emergency department visits. Acad Emerg Med. 2010;17:204213.
  36. Guttmann A, Zagorski B, Austin PC, et al. Effectiveness of emergency department asthma management strategies on return visits in children: a population‐based study. Pediatrics. 2007;120:e1402e1410.
  37. Horwitz DA, Schwarz ES, Scott MG, Lewis LM. Emergency department patients with diabetes have better glycemic control when they have identifiable primary care providers. Acad Emerg Med. 2012;19:650655.
  38. Billings J, Zeitel L, Lukomnik J, Carey TS, Blank AE, Newman L. Impact of socioeconomic status on hospital use in New York City. Health Aff (Millwood). 1993;12:162173.
  39. Guttmann A, Shipman SA, Lam K, Goodman DC, Stukel TA. Primary care physician supply and children's health care use, access, and outcomes: findings from Canada. Pediatrics. 2010;125:11191126.
  40. Asplin BR, Rhodes KV, Levy H, et al. Insurance status and access to urgent ambulatory care follow‐up appointments. JAMA. 2005;294:12481254.
  41. Kellermann AL, Weinick RM. Emergency departments, Medicaid costs, and access to primary care—understanding the link. N Engl J Med. 2012;366:21412143.
  42. Committee on the Future of Emergency Care in the United States Health System. Emergency Care for Children: Growing Pains. Washington, DC: The National Academies Press; 2007.
  43. Committee on the Future of Emergency Care in the United States Health System. Hospital‐Based Emergency Care: At the Breaking Point. Washington, DC: The National Academies Press; 2007.
  44. Radley DC, Schoen C. Geographic variation in access to care—the relationship with quality. N Engl J Med. 2012;367:36.
  45. Tang N, Stein J, Hsia RY, Maselli JH, Gonzales R. Trends and characteristics of US emergency department visits, 1997–2007. JAMA. 2010;304:664670.
  46. Young GP, Wagner MB, Kellermann AL, Ellis J, Bouley D. Ambulatory visits to hospital emergency departments. Patterns and reasons for use. 24 Hours in the ED Study Group. JAMA. 1996;276:460465.
  47. Tranquada KE, Denninghoff KR, King ME, Davis SM, Rosen P. Emergency department workload increase: dependence on primary care? J Emerg Med. 2010;38:279285.
  48. Network for Excellence in Health Innovation. Leading healthcare research organizations to examine emergency department overuse. New England Research Institute, 2008. Available at: http://www.nehi.net/news/310‐leading‐health‐care‐research‐organizations‐to‐examine‐emergency‐department‐overuse/view. Accessed October 4, 2013.
  49. Robert Wood Johnson Foundation. Quality field notes: reducing inappropriate emergency department use. Available at: http://www.rwjf.org/en/research‐publications/find‐rwjf‐research/2013/09/quality‐field‐notes–reducing‐inappropriate‐emergency‐department.html.
  50. Access of Medicaid recipients to outpatient care. N Engl J Med. 1994;330:14261430.
  51. Medicaid policy statement. Pediatrics. 2013;131:e1697e1706.
  52. Woods ER, Bhaumik U, Sommer SJ, et al. Community asthma initiative: evaluation of a quality improvement program for comprehensive asthma care. Pediatrics. 2012;129:465472.
Issue
Journal of Hospital Medicine - 9(12)
Page Number
779-787
Display Headline
Prevalence and predictors of return visits to pediatric emergency departments

© 2014 Society of Hospital Medicine

Correspondence Location
Address for correspondence and reprint requests: Anne Stack, MD, Division of Emergency Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115; Telephone: 617‐355‐6624; Fax: 617‐730‐4824; E‐mail: anne.stack@childrens.harvard.edu