ESCAVO has been following the story of vitamin C in sepsis since the publication of Dr. Paul Marik’s now famous and controversial vitamin C study in CHEST in 2017. In October of that year we caught up with Dr. Marik to learn first-hand about how he arrived at his vitamin C cocktail, and discuss his experience with it. Our interview, which you can read here, provided some fascinating insights into the science of sepsis and his treatment.
Although Dr. Marik’s name is now synonymous with use of vitamin C in sepsis, he was not the first person to investigate this treatment. Another physician by the name of Alpha “Barry” Fowler had been plugging away on vitamin C for some time, methodically investigating its efficacy in sepsis in animals and humans, with some surprisingly positive outcomes. It was in fact Dr. Fowler’s work that inspired Dr. Marik to try vitamin C in a dying sepsis patient in a last ditch effort to save her life, leading to a surprising recovery that later prompted his study.
Recently, Dr. Fowler also received some well-deserved recognition after publication of his team’s CITRIS-ALI study, the first in a batch of randomized clinical trials in the pipeline to further examine the effects of vitamin C in sepsis. The trial examined a total of 167 patients, 84 randomized to receive vitamin C and 83 placebo. Although the study did not reach the conclusion its authors had hoped for, its results surprised both supporters and skeptics, and are considered by many to be a promising win for vitamin C. So what happened in CITRIS-ALI and what does it mean for the future of this treatment?
CITRIS-ALI Design
First, it would be helpful to understand a little about the study’s design. Like its pilot study predecessor, CITRIS-ALI used surrogate primary outcomes, namely, a modified version of the Sequential Organ Failure Assessment (mSOFA), and two sepsis biomarkers, C-reactive protein (CRP) and thrombomodulin. The SOFA score was used instead of mortality because a mortality benefit is typically harder to demonstrate, and studies often avoid making this a primary outcome. CRP was used because it is a common (albeit non-specific) sepsis biomarker, whereas serum thrombomodulin is a marker of endothelial injury that is elevated in sepsis and other conditions like acute respiratory distress syndrome (ARDS) which was relevant to this study (ARDS was an inclusion criterion). The clinical endpoint of all-cause mortality was left as a secondary outcome, along with a host of other measures including ICU-free days, ventilator-free days, hospital-free days and other biomarkers of sepsis such as procalcitonin at various time intervals, for a total of 46 pre-specified secondary outcomes.
Unlike the pilot phase I study, CITRIS-ALI required not only the presence of sepsis, but also the presence of ARDS, a serious and fairly late complication of sepsis. This requirement led to a delayed administration of vitamin C, which diverges from the earlier administration used in the pilot study and other studies with vitamin C currently under way. This probably affected the study’s outcomes as we shall see. The reason for the ARDS requirement is unclear, but may have had to do with NIH funding requirements as Josh Farkas speculates in this PulmCrit blog.
CITRIS-ALI Results
The authors had hoped that the improvement in SOFA score and biomarkers seen in the pilot study would be replicated in CITRIS-ALI, but instead something very different and surprising happened. The study found no difference between the treatment and placebo groups in any of its three primary outcomes – mSOFA, CRP or thrombomodulin (see Table 1), but, to everyone’s surprise, the secondary outcome of all-cause mortality at 28 days was significantly lower in the treatment group. In addition, the secondary outcomes of ICU-free days at 28 days and hospital-free days at 60 days were significantly higher in the treatment group (see Table 2).
Table 1. Primary outcome results

| Outcome | Placebo | Vitamin C | Difference; 95% CI | p-value |
|---|---|---|---|---|
| mSOFA at 96 h (Δ vs 0 h) | 6.8 (-3.5) | 6.8 (-3.0) | -0.1; -1.23 to 1.03 | 0.86 |
| CRP at 168 h | 46.1 | 54.1 | 7.94; -8.23 to 24.1 | 0.33 |
| Thrombomodulin at 168 h | 13.8 | 14.5 | 0.69; -2.8 to 4.2 | 0.70 |

Table 2. Statistically significant secondary outcome results

| Outcome | Placebo | Vitamin C | Difference; 95% CI | χ² | p-value |
|---|---|---|---|---|---|
| Mortality at 28 d | 46.3% | 29.8% | 16.6%; 2% to 31.1% | 4.84 | 0.03 |
| ICU-free days at 28 d | 7.7 | 10.7 | 3.2; 0.3 to 5.9 | – | 0.03 |
| Hospital-free days at 60 d | 15.5 | 22.6 | 6.7; 0.3 to 13.8 | – | 0.04 |

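
For readers who want to see where a chi-square value like the one reported for 28-day mortality comes from, here is a minimal sketch in Python. The patient counts are an assumption: they are back-calculated from the reported percentages (38 of 82 placebo patients and 25 of 84 vitamin C patients) and are not stated in this article. The function name is ours, and the test shown is a plain Pearson chi-square without continuity correction, which may not be exactly what the investigators ran.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts, back-calculated from the reported percentages (not given in the article).
placebo_deaths, placebo_alive = 38, 44   # 38/82 = 46.3%
vitc_deaths, vitc_alive = 25, 59         # 25/84 = 29.8%

chi2, p = chi_square_2x2(placebo_deaths, placebo_alive, vitc_deaths, vitc_alive)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under these assumed counts the statistic comes out close to the 4.84 shown in Table 2, with a p-value that rounds to 0.03.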
The Kaplan-Meier survival curves for the two groups are shown in Figure 1 below. Using the Gehan-Breslow-Wilcoxon test to assess differences between the groups throughout the first 28 days, investigators found that the survival curves for the two groups were significantly different (χ² = 6.5, p = 0.01) with a mortality hazard ratio of 0.55 (95% CI of 0.33-0.90) favoring the treatment group.
Figure 1. All-cause mortality from day 0 to day 28
Discussion
So what does this all mean?
Statistically speaking, this means that CITRIS-ALI essentially failed to show a benefit for vitamin C use because it failed its primary outcomes. Positive results in secondary outcomes can generally only be considered exploratory or hypothesis-generating, particularly if the primary outcomes are negative. To be properly validated, such results usually need to be retested in further studies appropriately designed and powered for them as primary outcomes.
The main reasons for statistical rejection have to do with two key elements of study design: sample size and something called the multiple comparison effect.
The sample size problem
Investigators select primary outcomes very carefully, running statistical models to ensure that the sample size is large enough (ie, the study is sufficiently powered) to detect statistically significant differences between the placebo and treatment groups. The sample size is partially chosen based on the expected differences in the outcomes – if we expect large differences we may only need a small sample to detect them, and vice versa. Investigators may, however, want to study many outcomes, but it may not be feasible to sufficiently power the study for all of them (due to cost, duration, etc) in which case these outcomes may be relegated to secondary or exploratory status. If a study is not appropriately powered for these (and studies are often not), secondary outcome results must be interpreted with caution. For example, a negative secondary outcome in an underpowered study may simply be due to the sample being too small to detect a low frequency event, thus leading to a false negative (type II error). In CITRIS-ALI, investigators only ran the necessary models and powered the study for the primary outcomes, so secondary outcomes must be interpreted with caution.
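To make the link between expected effect size and sample size concrete, here is a rough sketch using the standard normal-approximation formula for comparing two proportions. The event rates are hypothetical and chosen only for illustration, the function name is ours, and real trials use more careful power calculations than this.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect a difference between two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Hypothetical event rates, for illustration only.
small_effect = n_per_group(0.40, 0.30)  # expecting a 10-point absolute difference
large_effect = n_per_group(0.40, 0.20)  # expecting a 20-point absolute difference
print(small_effect, large_effect)
```

In this sketch, halving the expected difference raises the required sample size roughly fourfold, which is why an outcome with a small or uncertain expected effect is often left as secondary.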
The multiple comparison effect
The multiple comparison effect basically means that the more outcomes we test in a study, the higher the probability that some of them will differ between the groups just by chance. This can lead to false positives (type I errors). For example, if we test 100 outcomes at a significance threshold of p < 0.05, each test has a 5% chance of a false positive when there is no real effect, meaning that on average about 5 outcomes would appear different between the groups just by chance. In CITRIS-ALI, investigators tested 46 secondary outcomes, meaning that about 2.3 of them could be expected to be falsely positive. It can be mathematically shown that the chance of getting at least 1 false positive in X independent tests, each at a significance level of α, is equal to $1 - (1 - \alpha)^X$, which in this study is $1 - 0.95^{46}$, or about 91%, if we count all secondary outcomes in the formula (almost guaranteed to have one false positive!). In statistics, this is called the family-wise error rate (FWER) and is an important measure of the impact the multiple comparison effect can have on outcome validity.
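The arithmetic above is easy to check. The following sketch simply evaluates the family-wise error rate formula and the expected number of false positives for 46 tests at α = 0.05. The function names are ours, and the formula assumes the tests are independent, which is unlikely to hold exactly for correlated clinical outcomes.

```python
def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


def expected_false_positives(alpha, n_tests):
    """Average number of false positives if no tested outcome has a real effect."""
    return alpha * n_tests


fwer = family_wise_error_rate(0.05, 46)
expected = expected_false_positives(0.05, 46)
print(f"FWER = {fwer:.1%}, expected false positives = {expected:.1f}")
```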
Multiple comparison effect corrections
There are statistical methods to correct for the multiple comparison effect. One commonly used method is called the Bonferroni correction, which simply states that the p-value threshold applied to each of multiple outcomes should equal the overall significance level we want, typically 0.05, divided by the number of outcomes. Applying the Bonferroni correction to all secondary outcomes in CITRIS-ALI would require a p-value below 0.05/46, or about 0.001. The mortality p-value of 0.03 found in this study is nearly 28 times larger than this threshold, so by this method, this outcome does not meet statistical significance. There are many other methods of correction, some of which may be better suited to correcting a large set of outcomes, such as the Benjamini-Hochberg false discovery rate correction method. You can read about these here, or better yet, in this FDA Guidance to multiple endpoints in clinical trial design (a long, but definitely worthwhile read if you’re in the process of designing a trial with many outcomes).
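As a rough illustration of how these two corrections behave, the sketch below applies Bonferroni and a simple Benjamini-Hochberg procedure to the three p-values reported as significant in Table 2. The remaining 43 p-values are not given in this article, so they are filled with a placeholder of 1.0; that is purely an assumption, and the function names are ours.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold under the Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return the p-values declared significant at false discovery rate q."""
    m = len(p_values)
    ranked = sorted(p_values)
    cutoff_rank = 0
    for rank, p in enumerate(ranked, start=1):
        # Keep the largest rank whose p-value sits under its stepped threshold.
        if p <= rank / m * q:
            cutoff_rank = rank
    return ranked[:cutoff_rank]


reported = [0.03, 0.03, 0.04]   # the significant secondary outcomes in Table 2
assumed_rest = [1.0] * 43       # placeholder: the other p-values are not reported here
threshold = bonferroni_threshold(0.05, 46)

passes_bonferroni = [p for p in reported if p < threshold]
passes_bh = benjamini_hochberg(reported + assumed_rest)
print(f"Bonferroni threshold = {threshold:.4f}")
print(passes_bonferroni, passes_bh)
```

Under these assumptions neither method leaves any of the three outcomes significant, since even the most lenient Benjamini-Hochberg threshold for the smallest p-value is the same as the Bonferroni one.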
What about those great mortality results?
All this statistical mumbo jumbo aside, there is, however, something very compelling in those Kaplan-Meier survival curves, especially when looking at the mortality difference between groups at the end of the treatment period (96 hours), which is roughly 23% in the placebo group and 4% in the vitamin C group. These are pretty strong numbers that are hard to entirely ignore. It has been suggested, again very convincingly by Josh Farkas in his CITRIS-ALI blog, that the high rate of early mortality in the placebo group may be the very reason the study failed to meet its SOFA primary endpoint: dead patients do not have SOFA scores.
The idea is that sicker patients with presumably higher SOFA scores succumbed at a higher rate in the placebo group, leaving less sick patients with lower SOFA scores and lower biomarker levels behind for the analysis. This served to artificially lower the primary outcome values in the placebo group thus bringing them closer to the treatment group and eliminating significant difference between the groups. Essentially, the higher rate of mortality in the placebo group introduced what’s known as a survivorship bias.
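A toy example may help show how this bias works. The numbers below are entirely made up and are not CITRIS-ALI data: each hypothetical patient has a severity score and a flag for whether they died before the score was measured. The sketch compares the average score of everyone with the average among survivors only.

```python
from statistics import mean

# Hypothetical patients: (severity score, died before measurement). Not trial data.
placebo = [(6, False), (8, False), (10, False), (14, True), (16, True)]
vitamin_c = [(4, False), (6, False), (8, False), (10, False), (12, False)]


def mean_all(group):
    """Average severity if every patient could be scored."""
    return mean(score for score, _ in group)


def mean_survivors(group):
    """Average severity among patients alive at measurement (what a trial observes)."""
    return mean(score for score, died in group if not died)


true_gap = mean_all(placebo) - mean_all(vitamin_c)
observed_gap = mean_survivors(placebo) - mean_survivors(vitamin_c)
print(f"gap with all patients = {true_gap:.1f}, gap among survivors = {observed_gap:.1f}")
```

In this made-up case the placebo group is clearly sicker overall, yet the survivors-only comparison shows no difference at all, because the two sickest placebo patients dropped out of the average.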
Vitamin C effects
It can also be the case that vitamin C is acting on mechanisms that are not entirely captured by the SOFA score and biomarkers measured in the primary outcomes. Remember that this study was conducted on severe septic patients that were already manifesting signs of organ failure in the form of ARDS. In recent years, studies have suggested that organ failure in sepsis is caused not only by the typical deleterious effects, such as hemodynamic and coagulation disruption, but also by severe metabolic disruption at a cellular and even mitochondrial level caused by the hyper-metabolic state seen in sepsis (see this paper for a good summary of this theory). Sepsis patients have depleted vitamin C levels because the body runs out of it as it’s trying to combat severe oxidative stress. High dose IV Vitamin C is theorized to work by quickly replenishing the body’s vitamin C stores and serving as a metabolic resuscitator to alleviate the severe oxidative stress that occurs during this hyper-metabolic state. For a good explanation of vitamin C’s actions, see this presentation by Dr. Paul Marik. An excellent review of the biologic basis of the Marik cocktail is also provided in this article.
If this is the case, then it is possible that in CITRIS-ALI the vitamin C was administered just in time to save a number of patients whose organs were metabolically impaired at a cellular level and who would have otherwise gone on to organ death irrespective of other factors such as inflammation, perfusion levels, coagulation abnormalities, blood pressure, or other parameters measured in the SOFA score. In effect, vitamin C may have quickly metabolically resuscitated these patients, allowing them to make it on their own from there. The parallel survival curves after the treatment was stopped suggest that once the patients make it over this critical hump and survive their metabolic crisis, whether with the help of vitamin C or without, they recover on their own at about the same rates.
Future studies – VICTAS and ACTS
This will all hopefully be teased out in the next two randomized clinical trials with vitamin C in sepsis, the VICTAS (NCT03509350) and ACTS (NCT03389555) trials. However, unlike CITRIS-ALI, these trials are testing Dr. Marik’s cocktail, which uses a considerably smaller dose of vitamin C (1.5 g every 6 hours vs the 50 mg/kg (approx. 3.5 g) every 6 hours used in CITRIS-ALI) and also includes thiamine and hydrocortisone. ACTS will also treat for 4 days like CITRIS-ALI, while VICTAS will treat for 4 days or until ICU discharge, whichever comes first. Also, importantly, these studies will be administering vitamin C at the first sign of sepsis and not in its advanced stages when ARDS occurs, as in CITRIS-ALI, presumably to arrest metabolic disruption early on before it causes significant organ damage.
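For a sense of scale, the dose comparison above can be worked through in a few lines. The 70 kg body weight is an assumption on our part; it is the weight at which 50 mg/kg gives the roughly 3.5 g per dose quoted above.

```python
def daily_dose_g(dose_g, interval_h=6):
    """Total grams per day when a dose is repeated every interval_h hours."""
    return dose_g * (24 / interval_h)


assumed_weight_kg = 70                           # assumption: typical adult weight
citris_dose_g = 50 * assumed_weight_kg / 1000    # 50 mg/kg converted to grams
marik_dose_g = 1.5                               # fixed dose in the cocktail

citris_daily = daily_dose_g(citris_dose_g)
marik_daily = daily_dose_g(marik_dose_g)
print(f"CITRIS-ALI: {citris_daily:.0f} g/day, cocktail: {marik_daily:.0f} g/day")
```

For a patient of that assumed weight, this works out to about 14 g per day in CITRIS-ALI against 6 g per day in the cocktail, a little over twice the dose.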
Unfortunately, neither ACTS nor VICTAS will measure mortality as a primary outcome (their PIs might be kicking themselves now). Like CITRIS-ALI, ACTS will measure changes in SOFA score at 72 hours as its primary outcome, and 30-day mortality and rate of renal failure at ICU discharge as secondary outcomes. The study also includes a number of additional outcomes including ventilator-free days, vasopressor-free days, ICU length of stay and several others. VICTAS will measure vasopressor and ventilator-free days (VVFD) to 30 days as its primary outcome, and 30-day mortality and delirium-free and coma-free days to 14 days as its secondary outcomes.
The differences between these studies and CITRIS-ALI in terms of patient selection, vitamin C administration timing and dose, not to mention the addition of thiamine and hydrocortisone, make it somewhat difficult to directly compare them to CITRIS-ALI. However, the results of CITRIS-ALI suggest that these studies may have difficulty meeting their primary outcomes, and we may find ourselves in the same situation – failed primary outcomes, but positive secondary mortality findings. On the other hand, since the ARDS requirement will not be there in these studies, the vitamin C will be administered earlier, and it is also possible that patients in both groups will have lower SOFA scores at time 0 since they will be earlier in their sepsis course. This may cause a faster and earlier correction of the SOFA score in the treatment group and continued deterioration of the score in the placebo group, leading to a larger difference between the groups over the measured timeline that may reach statistical significance.
But most of all, people are waiting to see if these studies will replicate the mortality benefit seen in Marik’s initial before-after study – even half or a quarter of that benefit would be a significant win. CITRIS-ALI has provided us with a tantalizing hint that these trials might find a mortality benefit, but whether this turns vitamin C into sepsis standard of care will depend on whether their primary outcomes also hold up. Given the far fewer secondary outcomes, the studies will have a far smaller multiple comparison problem, and they are hopefully sufficiently powered to statistically detect differences in all outcomes – primary and secondary (VICTAS has enrolled 500 patients and ACTS is scheduled to enroll 200). However, a mixed outcome would certainly complicate matters and perhaps require further studies. Either way, we eagerly await their results.
We should also add that many practitioners are not waiting for the results of these studies and are already putting vitamin C into clinical practice in sepsis treatment, mostly as part of Dr. Marik’s cocktail. We are currently running a survey in our Sepsis Clinical Guide app through the month of October to poll app users on their use of vitamin C in sepsis and will be publishing results in the coming month. If you are a clinician taking care of septic patients we encourage you to use the app and participate in our survey.
Daniel Nichita, MD