Milk, Fruit and Vegetable, and Total Antioxidant Intakes in Relation to Mortality Rates

Cohort Studies in Women and Men

Karl Michaëlsson; Alicja Wolk; Håkan Melhus; Liisa Byberg


Am J Epidemiol. 2017;185(5):345-361. 



We used data from 2 previously described[1] population-based cohort studies, the Swedish Mammography Cohort (SMC) and the Cohort of Swedish Men. The SMC started in 1987–1990 when 74% of all 90,303 women aged 39–74 years residing in 2 Swedish counties completed a questionnaire covering diet (food frequency questionnaire (FFQ)) and lifestyle that had been enclosed with a mailed invitation to undergo routine mammography screening. In 1997, a subsequent expanded questionnaire was sent to the 56,030 women still living in the study area (response rate 70%). We excluded women with implausible values for total energy intake (≥3 standard deviations below or above the log-transformed mean energy intake; cutoffs were 574 kcal/day and 4,707 kcal/day)[27] and those with missing data on all items regarding fruit and vegetable consumption. Exclusion of outliers for energy intake, in addition to adjustment for total energy intake in the statistical analyses, compensates for overall under- or overreporting of dietary intake.[28] In the present study, a first analysis included 61,240 women without a prevalent cancer diagnosis in the SMC with information from 1987–1990 and 38,331 women with updated information from 1997. In a second analysis with baseline set at the second examination, we included 36,714 women who were alive on January 1, 1998, and free of any previous cancers.
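The energy-intake exclusion rule can be illustrated with a minimal sketch. It assumes the cutoffs sit at the mean plus or minus 3 standard deviations of log-transformed energy intake, back-transformed to kcal/day. The function names and the use of the sample standard deviation are assumptions made for illustration, not the authors' code.

```python
import math
import statistics


def energy_cutoffs(energy_kcal, n_sd=3.0):
    """Return (lower, upper) kcal/day cutoffs at mean +/- n_sd SD on the log scale."""
    logs = [math.log(e) for e in energy_kcal]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)  # sample SD; the choice of estimator is an assumption
    return math.exp(mean_log - n_sd * sd_log), math.exp(mean_log + n_sd * sd_log)


def exclude_implausible(energy_kcal, n_sd=3.0):
    """Keep only intakes strictly inside the cutoffs (values >= n_sd SD away are dropped)."""
    lower, upper = energy_cutoffs(energy_kcal, n_sd)
    return [e for e in energy_kcal if lower < e < upper]
```

Because the cutoffs are computed on the log scale, they are asymmetric in kcal/day, which matches the pattern of the reported limits (for example, 574 and 4,707 kcal/day in women).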

The Cohort of Swedish Men was established in 1997. All men aged 45–79 years residing in 2 counties in central Sweden were invited to participate in the study (n = 100,303). The FFQ and lifestyle questionnaire was completed by 48,850 men. Despite a response rate of only 49%, the Cohort of Swedish Men is representative of Swedish men in this age range in relation to age distribution, educational level, and prevalence of overweight.[29] We excluded men with implausible values for total energy intake (cutoffs were 861 kcal/day and 7,311 kcal/day). For the present analysis, 45,280 men who were alive on January 1, 1998, and free of previous cancers were available.

The single FFQ administered in 1997 was used to simplify the comparison between men and women, but in women we also used time-updated information by means of the complete SMC data set. The studies have been approved by the Regional Ethical Review Board in Stockholm, Sweden.


The participants reported, by means of a valid and reproducible FFQ, their average frequency of consumption of up to 96 foods and beverages during the past year, including milk (either low-fat (≤0.5%), medium-fat (1.5%), or high-fat (3%)), sour milk, yogurt, cheese, 5 fruits (apples, bananas, berries, oranges/citrus fruit, and other fruit), orange juice, and 13 vegetables (carrots, beet root, broccoli, cabbage, cauliflower, lettuce, onion, garlic, peas, pea soup, peppers, spinach, tomatoes, and other vegetables).[29–31] There were 8 possible frequency categories in increasing order from zero times per month to more than 3 times per day. In the 1987–1990 FFQ, the numbers of fruit (n = 4) and vegetable (n = 5) categories were fewer, but they comprised more fruit or vegetable items in each category. The fruit and vegetable categories represented the typical consumption pattern in Sweden at the time of each investigation, with a higher number of items over time. In accordance with national dietary guidelines,[32] only 1 glass of juice (fresh or from concentrate) was included in the calculation of daily intake, independent of the amount ingested. Instructions in the FFQ stated that 1 serving of milk corresponds to 1 glass of 200 mL. Milk intake was specified according to fat content, and intakes were summed into a single measure representing total milk intake on a continuous scale. Missing values for individual dairy products were interpreted as no intake of that particular food.[33] The small fraction of missing data for single items, which were regarded as zero consumption, is unlikely to represent a bias for the observed findings.[33] In fact, 92% of those who did not report milk consumption on the FFQ part of the questionnaire reported that they did not consume milk when posed a specific question, and 99.8% had consumption of less than 1 glass/day according to complementary open questions regarding dairy consumption.
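The intake rules described above (summing milk across fat levels, counting missing single items as zero, and capping juice at 1 glass) can be sketched as follows. The function names and the dictionary layout are hypothetical, and intakes are assumed to have already been converted to glasses or servings per day.

```python
def total_milk_glasses(low_fat, medium_fat, high_fat):
    """Total milk intake (glasses/day); a missing value (None) counts as no intake."""
    return sum(x or 0.0 for x in (low_fat, medium_fat, high_fat))


def fruit_veg_servings(item_servings, juice_glasses):
    """Fruit and vegetable servings/day, with juice contributing at most 1 glass."""
    items = sum(v or 0.0 for v in item_servings.values())  # missing single items -> 0
    juice = min(juice_glasses or 0.0, 1.0)  # cap is independent of the amount ingested
    return items + juice
```

For example, `fruit_veg_servings({"apples": 1.0, "carrots": 0.5}, 3.0)` counts only 1 of the 3 glasses of juice.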

Nutrient intakes were estimated by multiplying the consumption frequency of each food item by the nutrient content of age-specific portion sizes and reference data obtained from the Swedish National Food Agency database[34] and were adjusted for total energy intake using the residual method.[28] According to validation studies of the self-reported milk intakes, the Spearman coefficient for correlation between the FFQ and four 7-day food records every third month (a gold standard reference) was approximately 0.7.[35] Spearman coefficients for correlation between the FFQ and the averages of these four 7-day dietary records ranged between 0.4 and 0.7 for individual fruit and vegetable items.
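As an illustration of the residual method of energy adjustment, the sketch below regresses nutrient intake on total energy intake and adds each residual to the intake predicted at the mean energy intake. It assumes a simple linear regression on untransformed values; whether the authors worked on a transformed scale is not stated here, and the function name is hypothetical.

```python
import statistics


def energy_adjust(nutrient, energy):
    """Residual method: residual from nutrient ~ energy plus predicted intake at mean energy."""
    mean_e = statistics.mean(energy)
    mean_n = statistics.mean(nutrient)
    sxx = sum((e - mean_e) ** 2 for e in energy)
    sxy = sum((e - mean_e) * (n - mean_n) for e, n in zip(energy, nutrient))
    slope = sxy / sxx
    intercept = mean_n - slope * mean_e
    predicted_at_mean = intercept + slope * mean_e  # constant added back to every residual
    return [
        n - (intercept + slope * e) + predicted_at_mean
        for n, e in zip(nutrient, energy)
    ]
```

The adjusted values are uncorrelated with total energy intake by construction and keep the same mean as the unadjusted intakes.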

We calculated estimates of total antioxidant capacity from diet analyzed with an oxygen radical absorbance capacity (ORAC) assay, as described in detail previously.[18] The FFQ contained 31 items with available ORAC values. The total antioxidant capacity of the diet (μmol/day) was calculated as the sum of the antioxidant content of the 31 food items, calculated by multiplying the average daily consumption of each food by its ORAC concentration (μmol Trolox equivalents (TE)/100 g) (Trolox: F. Hoffmann-La Roche AG, Basel, Switzerland). Because antioxidants in coffee and tea have been shown to be poorly absorbed, we took into account absorption (6% for coffee and 4% for tea) when calculating the ORAC.[36] The correlation between total antioxidant capacity estimated from the diet (dietary ORAC) and that measured in plasma (plasma ORAC) was 0.31.
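The calculation of total dietary antioxidant capacity can be sketched as a weighted sum. The absorption fractions for coffee and tea come from the text; the food names, the gram amounts, and any ORAC concentrations passed in are hypothetical inputs for illustration and are not values from the study.

```python
# Absorption fractions stated in the text; all other foods are counted in full.
ABSORPTION = {"coffee": 0.06, "tea": 0.04}


def total_orac(grams_per_day, orac_per_100g):
    """Total antioxidant capacity (umol TE/day) summed over FFQ items with ORAC values."""
    total = 0.0
    for food, grams in grams_per_day.items():
        contribution = grams / 100.0 * orac_per_100g[food]  # umol TE/day for this food
        total += contribution * ABSORPTION.get(food, 1.0)
    return total
```

With illustrative inputs of 100 g/day of a food at 3,000 μmol TE/100 g and 500 g/day of coffee at 1,000 μmol TE/100 g, the coffee term is scaled down to 6% of its nominal contribution before being added.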


We considered as the primary outcome all-cause mortality registered between baseline and September 30, 2015, in the Swedish Cause of Death Registry. We used the underlying cause of death from the Swedish Cause of Death Registry to define secondary outcomes: mortality from cardiovascular diseases (International Classification of Diseases, Tenth Revision, codes I00–I99) and cancer (International Classification of Diseases, Tenth Revision, C-codes) through December 31, 2014. For 1987–1996, we used corresponding codes from the International Classification of Diseases, Ninth Revision.

Statistical Analysis

We calculated time at risk for each participant from study entry until the date of each outcome, the date of emigration, or the end of the study period, whichever came first. We first evaluated trends in mortality rates according to milk intake, fruit and vegetable intake, and ORAC using restricted cubic-spline Cox regression with 3 knots placed at percentiles 10, 50, and 90 of the exposures.[37] We calculated age-adjusted death rates and age- and multivariable-adjusted hazard ratios and 95% confidence intervals for categories of milk intake (<1, 1–<2, 2–<3, or ≥3 glasses/day) and categories of fruit and vegetable intake or quartiles of ORAC. We categorized fruit and vegetable intake as <1, 1–<2, 2–<3, 3–<4, 4–<5, or ≥5 servings/day, with the last category reflecting dietary recommendations.[32] The proportional hazards assumptions were confirmed graphically by log-log plots.
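The follow-up and categorization rules in this paragraph can be sketched as below. Time at risk ends at the earliest of the outcome date, the emigration date, and the end of the study period. The function names, the category labels, and the conversion to years by dividing by 365.25 are assumptions made for illustration.

```python
from datetime import date


def time_at_risk_years(entry, end_of_study, outcome=None, emigration=None):
    """Years from study entry to the earliest of outcome, emigration, or end of study."""
    candidates = [d for d in (outcome, emigration, end_of_study) if d is not None]
    exit_date = min(candidates)
    return (exit_date - entry).days / 365.25


def milk_category(glasses_per_day):
    """Assign the milk intake category used in the analysis (<1, 1-<2, 2-<3, >=3 glasses/day)."""
    if glasses_per_day < 1:
        return "<1"
    if glasses_per_day < 2:
        return "1-<2"
    if glasses_per_day < 3:
        return "2-<3"
    return ">=3"
```

A participant who entered on January 1, 1998, and died on January 1, 2005, would contribute about 7 years of follow-up regardless of the study end date.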

To select suitable covariates for the multivariable model, we used present knowledge and directed acyclic graphs.[38] The model for the total effect included age, total energy intake, body mass index (weight (kg)/height (m)²), height, intakes of yogurt, cheese, red and processed meat, and alcohol (all continuous), educational level (≤9 years, 10–12 years, >12 years, or other), living alone (yes/no), ever use of antioxidant supplements (yes/no), physical activity (metabolic equivalent-hours/day; continuous), smoking status (never, former with <20 pack-years, former with ≥20 pack-years, current with <20 pack-years, or current with ≥20 pack-years) and Charlson's comorbidity index (possible range of scores, 0–33; continuous).[39,40] To avoid loss of efficiency and limit the introduction of bias by restricting the analysis to persons with complete data alone, missing data on covariates were imputed using multiple imputation.[41] We also imputed covariates not assessed at the baseline of the SMC in 1987–1990 (e.g., smoking status and physical activity).[1] Additional sensitivity analyses included exclusion of the first 2 years of follow-up, persons with a body mass index greater than 35, and current smokers. To our second model, we also added as covariates use of calcium-containing supplements and, for baseline 1997, use of aspirin and prevalent hypercholesterolemia. In a third model, we additionally adjusted our estimates for energy-adjusted dietary intakes of protein, total and saturated fat, calcium, vitamin D, retinol, and phosphorus; ever use of cortisone; and, among women, hormone replacement therapy.

Measures of interaction were calculated on the basis of adjusted hazard ratios (HRs), using persons consuming less than 1 glass of milk and 5 or more servings of fruit and vegetables per day as the reference category for the following groups (annotations in parentheses): milk intake ≥3 glasses/day, fruit and vegetable intake ≥5 servings/day (HR₁₀); milk intake <1 glass/day, fruit and vegetable intake <1 serving/day (HR₀₁); and milk intake ≥3 glasses/day, fruit and vegetable intake <1 serving/day (HR₁₁). The relative excess risk of interaction (interaction on the additive scale) was calculated as HR₁₁ − HR₁₀ − HR₀₁ + 1,[42] and 95% confidence intervals were obtained by means of the bootstrap percentile method with 1,000 bootstrap samples. The statistical analyses were performed with Stata 13.1 (StataCorp LP, College Station, Texas).
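The relative excess risk of interaction and a percentile-type interval can be sketched as follows. The sketch assumes that the three hazard ratios have already been re-estimated in each bootstrap sample, which in practice means refitting the Cox model on each resample. The function names are hypothetical, and the interpolation rule used for the percentiles is one common convention, not necessarily the one applied in Stata.

```python
import statistics


def reri(hr11, hr10, hr01):
    """Relative excess risk of interaction on the additive scale."""
    return hr11 - hr10 - hr01 + 1.0


def percentile_interval_95(bootstrap_estimates):
    """95% bootstrap percentile interval: the 2.5th and 97.5th percentiles."""
    # n=40 yields cut points at every 2.5%; the first and last bound the central 95%.
    cuts = statistics.quantiles(bootstrap_estimates, n=40, method="inclusive")
    return cuts[0], cuts[-1]
```

A value above 0 indicates that the joint effect of high milk intake and low fruit and vegetable intake exceeds the sum of their separate effects. An interval from `percentile_interval_95` that excludes 0 would support additive interaction.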