ABG Utility

PICO Question:

In patients presenting to the ED with severe dyspnea, what is the utility of arterial blood gas (ABG) sampling?                          



Clinical bottom line:

There is essentially no evidence to support or refute the utility of ABGs in patients presenting to the ED with acute dyspnea. Very limited evidence suggests that ABG parameters are not useful diagnostically.


The first thing to point out is that we wanted to look at a specific subgroup of patients. Most of us concede that an ABG has utility in intubated patients, in those with complex acid-base issues, and when there is some reason not to believe the O2 sat monitor, as well as maybe in an altered patient with unclear cause (though a VBG is almost certainly good enough for that). The next thing to note about this JC was the utter lack of evidence, either for or against routine use of ABGs. The study by Burri et al (Study #2 below) is the only study we could find about using ABGs diagnostically, save for some old papers looking specifically at PE. As far as guiding management, we found no evidence at all.

We were lucky to have a handful of representatives from our ICU services, the most vocal of whom was Dr. Buckley from Wishard Pulm/CC. Our Methodist ICU colleagues, at least according to Dr. Ellender (EM/CC), do not see much use in routine ABG testing for sick, but not intubated, patients with dyspnea. The Wishard CC faculty, however, do. We all seemed to agree that the choice of when to intubate (particularly the patient on NIPPV) should be a clinical one, and that blood gas values add little or nothing to that clinical decision making. If you wanted to trend blood gases, trending the pH on a VBG should be sufficient, and pH is prognostically the most important part of a blood gas.

Dr. Buckley discussed using the ABG somewhat diagnostically, in order to avoid anchoring on a wrong diagnosis. Historically, the A-a gradient has been espoused as the most useful diagnostic piece of information available from an ABG, and it cannot be calculated from a VBG. The discussion about the diagnostic utility of an ABG could have gone on for some time, but a few quick points that were hit on are as follows. An elevated A-a gradient is expected in pneumonia, CHF, and PE, but not in a pure asthma/COPD flare. Dr. Buckley argued that a normal A-a gradient can help assure us that a COPD flare is just that, while an abnormal gradient may clue us in to the fact that something else is going on. Unfortunately, an elevated A-a gradient has no specificity for any disease process, since so many things cause one. It is also not terribly sensitive for pulmonary embolism, the one diagnosis that we might want to be clued into and would not otherwise identify easily with CXR and exam.

I would not say that we came to a consensus, as our Wishard CC colleagues remained staunchly in support of the ABG, while most of us EPs find it useless. While I remain convinced that an ABG is completely useless in this scenario, we were unable to find any convincing evidence to support that. However, there also appears to be NO evidence to support the use of ABGs in these patients. One last point from Dr. Buckley was that an ABG isn’t that terrible a procedure and “we do MUCH worse things to people every day,” a point agreed on by all.


Study #1: Anne-Marie Kelly. Review Article: Can venous blood gas analysis replace arterial in emergency medical care. Emergency Medicine Australasia. 2010;22:493-98


The objectives of the present review are to describe the agreement between variables on arterial and venous blood gas analysis and to identify unanswered questions.

Study Design:

MEDLINE was searched for studies comparing arterial and peripheral venous blood gas values. Data were collected and analyzed as weighted pooled data.
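For readers unfamiliar with what "weighted pooled data" usually means, here is a minimal sketch. It assumes each study's mean arterial-minus-venous difference is weighted by that study's sample size; the review does not spell out its weighting scheme, so this is an assumption, and the function name and numbers below are hypothetical, not taken from the paper.

```python
from typing import Sequence


def pooled_mean_difference(diffs: Sequence[float], sizes: Sequence[int]) -> float:
    """Sample-size-weighted mean of per-study mean differences.

    Assumption: weights are the study sample sizes. The review itself
    does not state which weighting scheme was used.
    """
    if len(diffs) != len(sizes) or len(diffs) == 0:
        raise ValueError("need one sample size per study difference")
    total_n = sum(sizes)
    return sum(d * n for d, n in zip(diffs, sizes)) / total_n


# Hypothetical per-study mean pH differences (arterial minus venous)
# and sample sizes; these are NOT data from the review.
study_diffs = [0.02, 0.04, 0.05]
study_sizes = [100, 200, 100]

pooled = pooled_mean_difference(study_diffs, study_sizes)
print(f"pooled mean pH difference: {pooled:.4f}")
```

The larger middle study pulls the pooled estimate toward its own value, which is the whole point of weighting: small studies count for less than large ones.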

Are the results valid?

Quality Assessment: Low quality

  • Did the overview address a focused clinical question? Kind of. The objective of determining the agreement between VBG and ABG values is a focused question; “identifying unanswered questions” is not.
  • Were the criteria used to select articles for inclusion appropriate? The methods section of this paper actually doesn’t give any real inclusion or exclusion criteria, so it is unclear what exactly the criteria were, other than papers that compared ABG and VBG values. There was no mention that they had to be drawn simultaneously, report singular data, etc. The lack of reported methods makes this impossible to answer.
  • Is it unlikely that important, relevant studies were missed? No. The search was performed in only one database. The author used only 4 search terms and limited the results in MEDLINE to adults. From what we are told about the search, it seems to be low quality for a systematic review.
  • Was the validity of the included studies appraised? No. There appears to be no attempt to discern the validity of included studies. Coupled with the fact that there were essentially no inclusion or exclusion criteria, this suggests the potential for inclusion of very low quality studies.
  • Were the results similar from study to study? The author did not report a formal evaluation of heterogeneity, such as an I2 or p value. The “eyeball test” looking at results from the included studies suggests that the point estimates for pH and HCO3 were very consistent across studies, and that results for pCO2 were relatively consistent as well.

What are the results?

The pooled mean difference between ABG and VBG for measurements was as follows, with range of point estimates from different studies:

pH 0.035, range 0.015-0.06

pCO2 5.7 mmHg, range 3.3-8 (95% CI in individual studies ranged up to 20)

HCO3 -1.34 mmol/L, range -1.28 to -1.75

The author points out that there is insufficient evidence to say whether or not this level of agreement persists in shock states.

Are the results clinically applicable?

Perhaps. The results among different studies were very consistent, especially for pH. One question is whether or not ANY blood gas is useful in these patients. If an ABG has no value, then substituting a VBG, while less painful to the patient, still yields a test with no value. Ultimately, the low quality of how this review was performed limits the applicability of the results as well. It does appear fairly certain that pH and HCO3 values are essentially interchangeable between ABG and VBG values, but pCO2 may be too unpredictable to substitute reliably. Obviously, pO2 values will not be helpful from a VBG.

Author’s conclusion: 

For patients who are not in shock, venous pH, bicarbonate, and base excess have sufficient agreement to be clinically interchangeable for arterial values. Agreement between arterial and venous pCO2 is too poor and unpredictable to be clinically useful as a one-off test, but venous pCO2 might be useful to screen for arterial hypercarbia or to monitor trends in pCO2 for selected patients.


Study #2: Emanuel Burri et al. Value of arterial blood gas analysis in patients with acute dyspnea: an observational study. Critical Care. 2011;15:R145


The aim of this study was to prospectively investigate the value of ABG parameters as biological markers for diagnosis and prognosis in patients presenting to the ED with acute dyspnea.

Study Design:

Retrospective review of patients, many of whom were enrolled in at least one other study, who got an ABG drawn in the ED. The authors looked to see if ABG values corresponded to, or predicted, any specific diagnoses, and also looked to determine if there was prognostic value from any ABG values.

Are the results valid?

Risk of Bias: Moderate to high

  1. Was an appropriate GS reference chosen? The gold standard for diagnosis in this study was 2 physicians who reviewed each chart and decided on the diagnosis.
  2. Did every patient in the study receive the same GS test? And, if some patients did not receive the GS, was a reasonable alternative GS test done? Yes, the 2 physicians determined the diagnosis in every patient, based on their chart review.
  3. Did the results of the study being evaluated influence whether the GS test was performed or which GS was performed? No
  4. Were those interpreting the test being evaluated blind to the results of the GS?  No, but it doesn’t really matter, because the authors looked at numeric values of ABG, not something subjective 
  5. Were those interpreting the GS blind to the results of the test being evaluated?  This is unclear.  The authors state that the 2 physicians determining the diagnosis reviewed charts “in a blinded fashion.” There is no description of what “blinded fashion” meant, so it is unclear whether they were blinded to ABG results, each other, etc.
  6. Did the study include an appropriate spectrum of patients to whom the test will be applied in practice?  Patients were enrolled with acute dyspnea in the ED. Half of the patients were enrolled for a study on BNP; it is unclear where the other half came from. In each case, ABGs were not mandatorily performed, so the patients included were those in whom the treating emergency physician (EP) decided to order an ABG, which represented under half of those “enrolled.”
  7. Were withdrawals or missed patients explained? Were they similar to the patients who completed the study?  “Missed” patients were essentially those in whom no ABG was drawn. This was explained by the fact that the treating EP didn’t order one.
  8. Were the methods for performing the test and the GS described in sufficient detail to permit replication?  Yes.
  9. Were definitions of positive and negative predefined? N/A. The authors reported results as an AUC using ROC curves, as well as average values among different diagnoses. There was no “positive” or “negative” cutoff for any value for any specific diagnosis.

What are the results?

A high pH and low pCO2 were moderately predictive of “hyperventilation syndrome,” while all ABG values were essentially useless for any other specific diagnosis. Outside of hyperventilation, the interquartile ranges of pH, paO2, and paCO2 had nearly 100% overlap. The best-performing AUC values for specific diagnoses (other than hyperventilation) ranged from 0.558 to 0.678, which represents a test that is only slightly better than flipping a coin (a completely useless test has an AUC of 0.50). Prognostically, low pH was associated with increased mortality at multiple time points (discharge, 30 days, 12 months).
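To make the "coin flip" comparison concrete: the AUC can be read as the probability that a randomly chosen patient with the diagnosis has a higher test value than a randomly chosen patient without it, with ties counted as half. The sketch below computes it by brute-force pairwise comparison. The function name is ours and the pH values are made up for illustration; they are not data from Burri et al.

```python
from typing import Sequence


def auc(with_dx: Sequence[float], without_dx: Sequence[float]) -> float:
    """AUC as the fraction of (with, without) pairs in which the patient
    with the diagnosis has the higher value; ties count as half a win."""
    if not with_dx or not without_dx:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for a in with_dx:
        for b in without_dx:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(with_dx) * len(without_dx))


# Hypothetical pH values (illustration only, not study data).
# Heavily overlapping groups: the test cannot separate them.
overlap_with = [7.38, 7.42, 7.44]
overlap_without = [7.36, 7.40, 7.43, 7.45]
print(auc(overlap_with, overlap_without))

# Fully separated groups: the test discriminates perfectly.
separated_with = [7.48, 7.50, 7.52]
separated_without = [7.38, 7.40, 7.42]
print(auc(separated_with, separated_without))
```

With the overlapping groups, exactly half of the pairs favor the patient with the diagnosis, which is what an uninformative test looks like. The reported range of 0.558 to 0.678 sits much closer to that end than to perfect separation.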

Are the results clinically applicable?

Barely. Reporting ROC curves and AUC doesn’t really translate into a clinically applicable use of a test, since in practice we choose a “cutoff” between normal and abnormal, whereas the AUC represents how the test performs over the whole range of possible values. Additionally, those who would suggest that there is any diagnostic utility to an ABG would point to the A-a gradient, which these authors didn’t report on specifically. The A-a gradient takes into account only the paO2 and paCO2 values from the ABG, along with some other relatively static variables, so if neither paO2 nor paCO2 is significantly different between diagnoses, it would suggest that an A-a gradient would also be useless diagnostically. Finally, this study included a very poorly defined group of patients, and it is unclear whether or not the physicians determining the gold standard diagnosis were blinded.
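As a reminder of why the A-a gradient depends only on paO2 and paCO2 once the "static" variables are fixed, here is a minimal sketch using the standard simplified alveolar gas equation, PAO2 = FiO2 x (Patm - PH2O) - paCO2 / RQ. The defaults (room air, sea level, water vapor pressure of 47 mmHg, respiratory quotient of 0.8) are conventional textbook assumptions, not values from the study, and the function names are ours.

```python
def alveolar_po2(paco2: float, fio2: float = 0.21, p_atm: float = 760.0,
                 p_h2o: float = 47.0, rq: float = 0.8) -> float:
    """Alveolar oxygen tension (mmHg) from the simplified alveolar gas equation.

    Defaults are assumptions: room air at sea level, PH2O 47 mmHg, RQ 0.8.
    """
    return fio2 * (p_atm - p_h2o) - paco2 / rq


def aa_gradient(pao2: float, paco2: float, **static_vars: float) -> float:
    """A-a gradient (mmHg): alveolar PO2 minus measured arterial PO2."""
    return alveolar_po2(paco2, **static_vars) - pao2


# Hypothetical ABG on room air (illustration only): paO2 60, paCO2 40 mmHg.
gradient = aa_gradient(pao2=60.0, paco2=40.0)
print(f"A-a gradient: {gradient:.1f} mmHg")

# Two patients with identical paO2 and paCO2 get identical gradients,
# whatever their underlying diagnosis.
same = aa_gradient(60.0, 40.0) == aa_gradient(60.0, 40.0)
```

Because everything else in the equation is held fixed for patients breathing the same gas at the same altitude, two groups whose paO2 and paCO2 distributions overlap almost completely will have A-a gradients that overlap almost completely as well.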

Author’s conclusion: 

ABG parameters were useful neither to distinguish between patients with pulmonary disorders and other causes of acute dyspnea nor to identify specific disorders responsible for acute dyspnea.
