Emergency department (ED) triage is used to identify patients' level of urgency and treat them based on their triage level. The global advancement of triage scales in the past two decades has generated considerable research on the validity and reliability of these scales. This systematic review aims to investigate the scientific evidence for published ED triage scales. The following questions are addressed:
1. Does assessment of individual vital signs or chief complaints affect mortality during the hospital stay or within 30 days after arrival at the ED?
2. What is the level of agreement between clinicians' triage decisions compared to each other or to a gold standard for each scale (reliability)?
3. How valid is each triage scale in predicting hospitalization and hospital mortality?
A systematic search of the international literature published from 1966 through March 31, 2009 explored the British Nursing Index, Business Source Premier, CINAHL, Cochrane Library, EMBASE, and PubMed. Inclusion was limited to controlled studies of adult patients (≥15 years) visiting EDs for somatic reasons. Outcome variables were death in ED or hospital and need for hospitalization (validity). Methodological quality and clinical relevance of each study were rated as high, medium, or low. The results from the studies that met the inclusion criteria and quality standards were synthesized applying the internationally developed GRADE system. Each conclusion was then assessed as having strong, moderately strong, limited, or insufficient scientific evidence. If studies were not available, this was also noted.
We found ED triage scales to be supported, at best, by limited and often insufficient evidence.
The ability of the individual vital signs included in the different scales to predict outcome is seldom, if at all, studied in the ED setting. The scientific evidence to assess interrater agreement (reliability) was limited for one triage scale and insufficient or lacking for all other scales. Two of the scales yielded limited scientific evidence, and one scale yielded insufficient evidence, on which to assess the risk of early death or hospitalization in patients assigned to the two lowest triage levels on a 5-level scale (validity).
Triage is a central task in an emergency department (ED). In this context, triage is viewed as the rating of patients' clinical urgency. Rating is necessary to identify the order in which patients should be given care in an ED when demand is high. Triage is not needed if there is no queue for care. Triage scales aim to optimize patients' waiting times according to the severity of their medical condition, so that the most intense symptoms are treated as quickly as necessary and the negative effect of a prolonged delay before treatment on the prognosis is reduced. ED triage is a relatively modern phenomenon, introduced in the 1950s in the United States. Triage is a complex decision-making process, and several triage scales have been designed as decision-support systems to guide the triage nurse to a correct decision. Triage decisions may be based on both the patients' vital signs (respiratory rate, oxygen saturation in blood, heart rate, blood pressure, level of consciousness, and body temperature) and their chief complaints. Internationally, no consensus has been reached on the functions that should be measured. Apart from emergency care, triage may be used in other clinical activities, e.g. deciding on a certain investigation or treatment.
Since the early 1990s, several countries have developed and introduced ED triage [6-10]. Development of triage scales in some countries has been influenced largely by the seminal work of FitzGerald, resulting in most of the triage scales developed in the 1990s and 2000s being designed as 5-level scales. Of these, the Australian Triage Scale (ATS), Canadian Emergency Department Triage and Acuity Scale (CTAS), Manchester Triage Scale (MTS), and Emergency Severity Index (ESI) have had the greatest influence on modern ED triage [12-15]. Other scales have not disseminated as widely around the globe, e.g. the Soterion Rapid Triage Scale (SRTS) from the United States and the 4-level Taiwan Triage System (TTS) [6,7,9,16,17]. Some countries, e.g. Australia, have a national mandatory triage scale while many European countries lack such standards [7,9].
Patients may have a life-threatening condition, but show normal vital signs. Hence, in triaging the patient it is important to consider information given by patients or accompanying persons regarding the patient's chief complaints or medical history, which can provide essential information about serious diseases. The chief complaints describe the incident or symptoms that caused the patient to seek care.
In 2005, a joint task force of the American College of Emergency Physicians and the Emergency Nurses Association published a review of the literature on ED triage scales. Based on expert consensus and available evidence, the task force supported adoption of a reliable 5-level triage scale, stating that either the CTAS or the ESI is a good choice for ED triage. In 2002, a national survey conducted in Sweden identified the use of 37 different triage scales across the country. Further, some 30 EDs did not use any type of triage scale.
This systematic review aims to investigate the scientific evidence underlying published ED triage scales.
The following questions are addressed:
1. In triage of adults at EDs, does assessment of individual vital signs or chief complaints affect mortality during the hospital stay or within 30 days after arrival at the ED?
2. In adult ED patients, what is the level of agreement between clinicians' triage decisions compared to each other or to a gold standard for each scale (i.e. the reliability of triage scales)?
3. In adult ED patients, how valid is each triage scale in predicting hospitalization and hospital mortality?
A systematic search of the international literature published from 1966 through March 31, 2009 explored the British Nursing Index, Business Source Premier, CINAHL, Cochrane Library, EMBASE, and PubMed. Inclusion was limited to studies of adult patients (≥15 years) visiting EDs for somatic reasons. Another criterion for inclusion was that the study design must contain a control, i.e. randomized controlled trials (RCT), observational studies with a control group based on previously collected data, and before-after studies. Descriptive studies without a control group and retrospective studies were excluded.
Inclusion criteria for vital signs and chief complaints used in triage scales
• Studies analyzing individual vital signs or chief complaints
• Outcome variable defined as death within 30 days after ED arrival or during the hospital stay
Inclusion criteria for reliability and validity of triage scales
• Studies based on real patients triaged at EDs (validity)
• Studies based on real patients triaged at EDs or fictitious patient scenarios (reliability)
• Studies reporting reliability at separate triage levels (reliability)
• Studies reporting mortality and hospitalization per triage level (validity)
• Outcome variables defined as death in the ED or hospital, and need for hospitalization (validity)
Exclusion criteria for studies on reliability of triage scales
• Studies on interrater reproducibility are excluded in cases where any rater in the study had access to retrospective data only.
Six experts from different professions and clinical specialties reviewed the studies, independently in groups of 2 or 3, for quality by using methods validated for internal validity, precision, and applicability (external validity). The methodological quality and clinical relevance of each study were graded as high, medium, or low. Results from the studies that met the inclusion criteria and quality standards were synthesized by applying the internationally developed GRADE system.
In accordance with GRADE, the following factors were considered in appraising the overall strength of the evidence: study quality, concordance/consistency, transferability/relevance, precision of data, risk of publication bias, effect size, and dose-response. In synthesizing the data, studies having low quality and relevance were included when studies of medium quality and relevance were not available. Based on the overall quality and relevance of the studies reviewed, each conclusion was rated as having strong, moderately strong, limited, or insufficient scientific evidence. If studies were not available, this was noted.
Figure 1. Results of literature search and selection process.
Figure 2. Results of literature search and selection process regarding reliability (10 articles) and validity (10 articles) of triage scales. One article studied both reliability and validity and was rated differently depending on the endpoint studied: low quality regarding reliability and medium quality regarding validity.
Vital signs and chief complaints
Most of the studies that investigated associations between different vital signs or chief complaints and mortality after ED arrival were observational cohort studies based on selected, diagnosis-specific patient groups. All of the studies were found to have medium quality and relevance. Only a few studies included all patients (albeit limited to "medical" patients) who arrived at the ED, regardless of diagnosis. Hence, studies of patients in surgical disciplines were generally lacking. Several studies described compiled scales or indexes for appraising the severity level of the patient's conditions, but provided no information on the importance of specific vital signs or chief complaints. Hence, little or no evidence can be found on the association between specific vital signs or reasons for the ED visit and mortality in the group of general patients presenting in EDs.
Only a single study, which described the predictive importance of respiratory rate, fulfilled the inclusion criteria. The study aimed to assess whether the Rapid Acute Physiology Score (RAPS) could be used to predict mortality in nonsurgical patients on ED arrival. It also aimed to study whether an advanced version of RAPS, i.e. the Rapid Emergency Medicine Score (REMS), could yield better predictive information.
RAPS was developed for prehospital care and involves assessing respiratory rate, pulse, blood pressure, and the Glasgow Coma Scale (GCS). REMS is based on RAPS, but also assesses oxygen saturation, body temperature, and age. In total, 11 751 patients were studied prospectively after arrival at the ED of a university hospital in Sweden. Respiratory rate was found to be a significant predictor of mortality during the hospital stay. A decrease of one step on the RAPS scale was found to nearly double the risk of mortality within 30 days (Table 1).
Table 1. Does assessment of certain vital signs and chief complaints in emergency department triage of adults have an impact on 30-day or in-hospital mortality?
Oxygen saturation in blood
Two studies used RAPS and REMS to predict acute mortality after ED arrival and specifically studied the predictive importance of saturation [22,23]. Oxygen saturation was found to be one of the three variables, along with age and level of consciousness, that best predicted mortality during hospitalization.
One study investigated the importance of assessing pulse in the ED as a means to predict mortality during the hospital stay.
The study, which was conducted in Sweden, showed a significant association between the pulse on arrival to the ED and mortality during the hospital stay in a group of 11 751 patients receiving care for nonsurgical disorders. With a decrease of one step on the RAPS scale, 67% of the patients showed an increased risk of mortality within 30 days.
Level of consciousness
The Swedish study (described above) also investigated the association between acute mortality and the level of consciousness on arrival at the ED. Another study used the same methods mentioned above, i.e. RAPS and REMS, to analyze 5583 patients that had called the emergency phone number and were classified as urgent. The study showed that level of consciousness was one of three variables (age and saturation being the other two) that best predicted mortality during the hospital stay. Another study analyzed 986 stroke patients on ED arrival. Impaired level of consciousness appeared to be the best predictor of mortality during the hospital stay.
Blood pressure and body temperature
The importance of blood pressure or body temperature in assessing the risk of acute mortality after ED arrival could not be supported by the included studies due to the lack of scientific evidence.
Studies describing the association between different chief complaints and acute mortality were found to be lacking.
Three of the studies described above showed that the higher the patient's age, the greater the risk of death within 30 days of hospital care following ED arrival [22-24]. The results showed an increase in mortality of 5% per year. Furthermore, one study showed that older patients (above 75 years of age) with symptoms of coronary heart disease had a greater risk of death within 30 days after arrival at the ED compared to younger patients with the same symptoms (Table 1).
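To give a feel for what an increase in mortality of 5% per year of age could mean across a wider age gap, the arithmetic can be sketched as below. This is an illustration only: it assumes the per-year increase acts multiplicatively (as it would for an odds or hazard ratio of 1.05 per year), which the text above does not state explicitly. The function name `relative_mortality` and its parameters are hypothetical and do not come from the reviewed studies.

```python
def relative_mortality(age_difference_years, increase_per_year=0.05):
    """Relative mortality between two patients whose ages differ by the given
    number of years, ASSUMING the per-year increase compounds multiplicatively.

    The default of 0.05 mirrors the 5% per year quoted in the text; the
    multiplicative model itself is an assumption made for this sketch.
    """
    if age_difference_years < 0:
        raise ValueError("age_difference_years must be non-negative")
    return (1.0 + increase_per_year) ** age_difference_years


# Example: compare patients 10, 20 and 30 years apart in age.
for gap in (10, 20, 30):
    print(f"{gap} years older: about {relative_mortality(gap):.2f} times the mortality")
```

Under this assumption, a 20-year age difference corresponds to roughly 2.7 times the mortality, which illustrates why unadjusted comparisons between triage levels with different age profiles are hard to interpret.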
Based on the studies described above, Table 2 summarizes assessments and comments regarding the level of scientific evidence.
Table 2. Appraisal of scientific evidence according to GRADE - Association between vital signs/chief complaints and acute mortality after arrival at the emergency department.
Interrater agreement of triage scales (reliability)
All 11 articles that were found to answer the question concerning reliability of triage scales and met the defined inclusion criteria were observational studies. They addressed reliability of the ATS, CTAS (including eTriage) [19,27-30], MTS, SRTS, and two locally produced scales without names [8,32] (Table 3). Based on the quality review, 9 articles [6,8,19,26-31] were found to be of low and 1 of medium quality. One article was excluded due to deficient quality resulting from high internal dropout. Deficient external validity was the major reason for the low- and medium-quality ratings of the studies. Selection of patients and triage nurses were both found to be irrelevant or insufficiently described. Hence, 10 articles remained as a basis for the conclusions.
Table 3. Reliability of triage scales
The scientific evidence was found to be insufficient to assess the reliability of ATS, CTAS, MTS, SRTS and the Swiss scale (Table 4). However, limited scientific evidence was found in assessing the reproducibility of the Brillman scale (North America) as having moderate interrater agreement.
Table 4. Appraisal of scientific evidence (according to GRADE) - Reliability of triage scales.
Validity of triage scales regarding acute mortality and hospital admission rates
None of the studies reported on hospital admission rates adjusted for age and gender or mortality (Table 5). Since previous studies have shown that age is one of the major predictors of hospital mortality [33,34], the scientific evidence was found to be insufficient to assess the validity of the triage scales ATS, CTAS, and Medical Emergency Triage and Treatment System (METTS) (Table 6). However, safety as measured by hospital mortality in patients graded as low risk (triage levels 4-5/green-blue) by the triage systems may be regarded as one aspect of validity. When assessing the above-mentioned triage scales' level of validity as regards mortality at the lowest triage levels only (levels 4-5/green-blue), the quality and relevance of the studies were found to be moderate. Hence, scientific evidence is limited.
Table 5. Studies on how the assessment of the urgency of need to see a physician according to different triage systems could predict hospital mortality.
Table 6. Appraisal of scientific evidence (according to GRADE) - Validity of 5-level triage scales measured by acute mortality.
Hospital admission rates in patients triaged as non-acute
Nine studies reported on admission rates for the ESI, ATS, and SRTS triage scales (Table 7). The studies showed a range between 0.0% and 17.0% at level 5, the lowest triage level [6,16,35-41]. Mean ages also varied between studies (30 to 47 years), as did hospital admission rates at triage level 4 (3%-33%): 18% to 33% for ATS, 6% to 10% for ESI, and 3% for SRTS.
Table 7. Studies on how the assessment of the urgency of need to see a physician according to different triage systems could predict hospitalization.
Seven of these studies were found to be of moderate and two of low quality and relevance, and the scientific evidence for validity of admission rates for patients in the lowest triage levels (levels 4-5/green-blue) was found to be limited (Table 8).
Table 8. Appraisal of scientific evidence (according to GRADE) - Safety of 5-level triage scales as measured by hospitalization rates in patients at triage level 5.
Our systematic review shows that when adjudicated by standard criteria for study quality and scientific evidence, the triage scales used in EDs are supported, at best, by limited evidence. Often, the evidence is weaker, not above insufficient by the GRADE criteria. The ability of the individual vital signs included in the different scales to predict outcome has seldom, or never, been studied in the ED setting. The scientific evidence for assessing interrater agreement (reproducibility) was limited for one triage scale (Brillman) whereas it was insufficient or lacking for all other scales. Two of the scales (CTAS and ATS) offered limited scientific evidence, and the scientific evidence for one scale (METTS) was insufficient to assess the risk of early death or hospitalization in patients assigned to the two lowest triage levels in 5-level scales; the studies showed the risk of death to be low, but a need for inpatient care was not excluded (about 5% hospital admission rate on average). Studies on validity of the triage scales across all levels, i.e. their ability to distinguish the urgency in patients assigned the five different levels, were generally of low quality. Consequently, evidence was insufficient to assess the validity of the scales.
As none of the studies reported on mortality rates adjusted for differences in age and gender between the triage levels, we could not evaluate the validity of the triage scales across all triage levels as regards the risk of early death. To estimate the safety of the scales, we studied early death among patients assigned to the lowest triage levels (green and blue/4-5). Two triage scales (ATS and CTAS) offered limited scientific evidence for assessing safety. In both scales, the patients assigned to the two lowest triage levels had a very low risk of dying within 24 hours after triage. Hence, in this respect, the scales are safe to use. Scientific evidence for METTS, the newly developed Swedish triage scale, was found to be insufficient to assess safety. Since the study recorded the risk of dying during the in-hospital stay, mortality was higher than in the studies on ATS and CTAS.
In using the need of hospitalization as a measure of safety, the situation was found to be more complex. Again, none of the studies reported on hospital admission rates adjusted for age and gender, so we could not evaluate the validity of the triage scales across all triage levels. However, on average, about 5% (in some studies up to 17%) of patients in the lowest (4-5/green-blue) triage levels in ATS, ESI, and SRTS were reported to be admitted as inpatients. The variations were wide not only between different triage scales, but also between studies using the same scales. This indicates differences between the studies in (a) patient populations in the ED, (b) access to hospital beds, (c) hospital admission policies and traditions, and/or (d) inaccurate triage decisions (i.e. patients were rated as less urgent than their actual urgency).
No definitive conclusions could be drawn regarding which of the scales was the safest as measured by the need of hospitalization. Hence, we suggest that none of the scales be used in referral of patients in the lowest triage levels (4-5/green-blue), e.g. to primary care, without further medical examination in the ED.
New diagnostic tests typically need to meet rigid criteria before they can be accepted for widespread use. These criteria include documentation on precision. For non-laboratory tests, interrater agreement (reliability) is a key precision issue. Our review shows that most triage scales present insufficient scientific evidence for assessing interrater agreement. The study designs used to estimate interrater agreement have often been suboptimal. Most of the studies are based on fictitious cases rather than on authentic patients in real-life settings. The value of the studies as regards interrater agreement is also compromised by the fact that the mean age of patients assessed has either been low (as low as 30 years) or unreported. The generalizability to real-life ED patients must therefore be questioned.
All 5-level triage scales present insufficient evidence on interrater variability. The few studies that have been published (most of low quality) have reported widely divergent interrater agreement, with kappa values ranging from 0.2 (slight agreement) to 0.9 (almost perfect). Only a single study presented limited scientific evidence. That study evaluated a 4-level scale and reported a kappa value of 0.45, a value usually considered to be in the moderate agreement range. It is evident that inter-observer agreement in triage scales must be documented in greater detail, and, if low, actions must be taken to reduce variability.
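For readers less familiar with the kappa statistic, a minimal sketch of how an unweighted Cohen's kappa for two raters could be computed is given below. It is an illustration under stated assumptions, not the method used in any of the reviewed studies: the function names `cohens_kappa` and `agreement_band`, and the ten triage ratings, are hypothetical, and the interpretation bands follow the commonly cited Landis and Koch convention, which is assumed here to be the one behind the labels "slight", "moderate", and "almost perfect". Studies of ordinal triage levels often use a weighted kappa instead, which this sketch does not implement.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same patients."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of patients")
    n = len(rater_a)
    # Observed agreement: share of patients given the same level by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: expected overlap given each rater's own level frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[level] * count_b[level] for level in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical level for every patient
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Verbal label for a kappa value (Landis and Koch bands, assumed here)."""
    if kappa < 0.0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical triage levels (1 = most urgent, 5 = least urgent) for ten patients.
nurse_a = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
nurse_b = [1, 2, 3, 3, 3, 4, 5, 5, 5, 4]
kappa = cohens_kappa(nurse_a, nurse_b)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

Because kappa subtracts the agreement expected by chance, two raters who agree on 7 of 10 patients can still obtain a value well below 0.7, which is one reason raw percentage agreement and kappa should not be compared directly across studies.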
The literature shows variations in the vital signs and chief complaints applied in triage scales. It is unclear whether the selected vital signs are the best at distinguishing different risk groups. Further, evidence supporting the selected thresholds for continuous variables is deficient. The inclusion criteria for this systematic literature review place considerable emphasis on relevance. Triage scales are intended to be used in EDs irrespective of specific symptoms or disease. Hence, only studies of unselected patient populations in ED settings were included, greatly limiting the number of studies on the ability of individual vital signs to predict outcome. Our literature search revealed that many more studies had been performed in intensive care units, or soon after hospital admission.
Regarding specific vital signs, limited scientific evidence supports the use of oxygen saturation and consciousness level as predictors of mortality early after triage. However, scientific evidence was found to be insufficient as regards respiratory rate, pulse, blood pressure, and body temperature. Hence, it remains unclear whether the selected vital signs are the best ones to use in distinguishing different risk groups. Moderate scientific evidence indicated age as a predictor of mortality early after triage, yet most triage scales do not take age into account.
MTS and eCTAS include the chief complaint leading to the ED visit, but we did not find any studies that analyzed which of the chief complaints are important predictors of mortality early after triage. It appears likely that in the construction of triage scales, much of the information was deduced from studies performed in settings other than EDs.
Strengths and limitations
The strength of this review of the scientific literature on triage in the ED lies in its systematic approach. Our search for relevant literature has been meticulous; the quality of the included studies has been evaluated in a uniform manner; and the level of evidence has been summarized using the GRADE methodology developed under the auspices of the World Health Organization.
Our review is limited to ED triage in adult patients in somatic care. However, EDs are only part of a continuum of services for acutely ill and injured patients. Studies are also needed on other parts of the continuum of care, e.g. prehospital, psychiatric, and pediatric triage. Other limitations are ascribed to the volume and quality of the scientific literature available. Since all studies were observational, none of the evidence came from randomized controlled trials, the "gold standard" for evaluating new methods. As none of the studies met the standards for high quality, we included studies of low and moderate quality in our review, in accordance with the principle in evidence-based medicine of using the best available scientific evidence. Low study quality affected the GRADE rating and was a reason why scientific evidence was rated as insufficient or limited for so many aspects of so many scales.
This systematic literature review reveals shortcomings in the scientific evidence on which presently available triage scales are based. Stronger scientific evidence is needed to determine which of the vital signs and chief complaints have the greatest prognostic value in triage. Interrater agreement (reliability), validity, and safety of triage scales need to be investigated further, and head-to-head comparisons are needed to determine whether any of the scales have advantages over others.
This review was confined to ED triage scales for adult ED patients with non-psychiatric illnesses or injuries. In the absence of an internationally agreed outcome measure for ED triage scale validity, the proxy variables hospital admission and mortality were used in the current study. These proxy variables have limitations with regard to ED triage scale validity, as they may be affected by events occurring after the triage assessment. Further, comparisons between ED triage scales need to be made with caution, as contextual differences may influence the results.
The authors declare that they have no competing interests.
All authors contributed to study concept and design, and to acquisition, analysis, and interpretation of the data. All authors read and approved the submitted manuscript.
J Emerg Med
Australasian College for Emergency Medicine: Guidelines on the implementation of the Australasian triage scale in emergency departments. [http://www.acem.org.au/media/policies_and_guidelines/G24_Implementation__ATS.pdf]
J Emerg Nurs 2005, 31:39-50; quiz 118.
Cerebrovasc Dis 1996, 6:161-5.
van der Wulp I, van Baar ME, Schrijvers AJ: Reliability and validity of the Manchester Triage System in a general emergency department patient population in the Netherlands: results of a simulation study.
Emerg Med 1999, 11:68-71.
Emerg Med (Fremantle) 2003, 15:334-40.