Respiratory Viral Infections and the Risk of Rheumatoid Arthritis

Young Bin Joo; Youn-Hee Lim; Ki-Jo Kim; Kyung-Su Park; Yune-Jung Park


Arthritis Res Ther. 2019;21(199) 

Study Design and Data Source

This was an ecological study based on records from the KNHI claims database from 2011 to 2015. The database provided patients' diagnoses recorded according to the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10), along with procedures, prescriptions, type of institution or department, and individual beneficiary information.[22] The protocol used in the present study was approved by the Institutional Ethics Review Board of St. Vincent's Hospital, Catholic University of Korea.

Incident RA

The algorithm for identifying RA using claims data has been previously validated in Korea[23] and was recently updated by Won et al.[24] In accordance with Won et al.,[24] we selected individuals aged ≥ 19 years with claims data pertaining to RA (ICD-10 codes M05 or M06). RA was considered confirmed when a prescription for disease-modifying antirheumatic drugs was issued within 1 year of the RA code being assigned. Incident RA, meaning a new RA case, additionally required a 1-year window period (no codes or prescriptions for RA) and three consecutive years of treatment. The weekly number of incident RA cases was calculated from the first week of January 2012 to the last week of December 2013.
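The case definition above can be sketched in code. The following is a minimal illustration only, not the validated algorithm of Won et al.: it assumes the 1-year window period precedes the first RA code, treats each 1-year span as 365 days, and does not model the requirement of three consecutive years of treatment. The function name and its arguments are hypothetical.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=365)  # assumption: each "1 year" is taken as 365 days


def incident_ra_index_date(ra_claims, dmard_rx, observation_start):
    """Return the first RA claim date if the case looks incident, else None.

    ra_claims         -- dates of claims carrying ICD-10 code M05 or M06
    dmard_rx          -- dates of DMARD prescriptions
    observation_start -- first date covered by the claims data (hypothetical input)
    """
    if not ra_claims:
        return None
    index = min(ra_claims)
    # Window period: a full year of observed history before the first RA code...
    if index - observation_start < WINDOW:
        return None
    # ...with no DMARD prescription during that year.
    if any(index - WINDOW <= d < index for d in dmard_rx):
        return None
    # Confirmation: a DMARD prescription within 1 year of the RA code.
    if not any(index <= d <= index + WINDOW for d in dmard_rx):
        return None
    return index


start = date(2011, 1, 1)
new_case = incident_ra_index_date(
    [date(2012, 3, 5), date(2012, 4, 2)], [date(2012, 3, 19)], start
)
too_early = incident_ra_index_date([date(2011, 6, 1)], [date(2011, 6, 15)], start)
```

Here `new_case` is the date of the first RA claim, whereas `too_early` is `None` because less than a year of history is available before the first RA code.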

Respiratory Virus Data

KCDC posts the incidences of respiratory virus infections each week on its website.[20] Nasopharyngeal specimens from patients with acute respiratory infections are collected from 36 sentinel hospitals located nationwide and subjected to respiratory viral genetic testing via multiplex PCR. Target viruses include influenza, parainfluenza, adenovirus, respiratory syncytial virus (RSV), rhinovirus, coronavirus, metapneumovirus, and bocavirus.

The detection rate of each respiratory virus was calculated as the proportion of patients with PCR-confirmed viral infection among those who visited sentinel hospitals with symptoms of acute respiratory viral infection. Because we hypothesized that respiratory viral infections would exhibit a delayed association with incident RA rather than an immediate effect, the detection rates of the eight respiratory viruses were collected from the first week of November 2011, which is 8 weeks prior to the start date of the collection of incident RA data, to the last week of December 2013.

Environmental Factors as Potential Confounders

Data pertaining to potential confounders of the association between viral detection rates and RA diagnoses were obtained from public websites. We obtained hourly air pollution data, including particulate matter < 10 μm in aerodynamic diameter (PM10) and ozone (O3), from the website operated by the Korean Ministry of Environment.[25] Meteorological data reflecting hourly measurements of temperature, humidity, and solar radiation were obtained from the website maintained by the Korea Meteorological Administration.[26] The hourly mean of each variable was calculated from the raw data obtained at each station and converted to daily means. Next, the daily data were converted into weekly means, and these means were analyzed in conjunction with the respiratory viral infection data. As with the respiratory viral detection rate data, the environmental data were collected from the first week of November 2011 to the last week of December 2013.
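The aggregation from hourly station readings to weekly means can be sketched as follows. This is an illustrative sketch only: the record layout `(station_id, timestamp, value)` is hypothetical, and weeks are assumed to follow the ISO 8601 calendar, which may differ from the week definition used in the surveillance reports.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def weekly_means(records):
    """Aggregate (station_id, timestamp, value) records to weekly means.

    Returns {(iso_year, iso_week): mean of the daily means in that week}.
    """
    # Step 1: average across stations within each hour.
    by_hour = defaultdict(list)
    for _station, ts, value in records:
        by_hour[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    hourly = {hour: mean(vals) for hour, vals in by_hour.items()}

    # Step 2: average the hourly means within each calendar day.
    by_day = defaultdict(list)
    for hour, value in hourly.items():
        by_day[hour.date()].append(value)
    daily = {day: mean(vals) for day, vals in by_day.items()}

    # Step 3: average the daily means within each ISO week (assumed definition).
    by_week = defaultdict(list)
    for day, value in daily.items():
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])].append(value)
    return {week: mean(vals) for week, vals in sorted(by_week.items())}


sample = [
    ("A", datetime(2012, 1, 2, 0), 10.0),
    ("B", datetime(2012, 1, 2, 0), 30.0),
    ("A", datetime(2012, 1, 2, 1), 20.0),
    ("B", datetime(2012, 1, 2, 1), 40.0),
    ("A", datetime(2012, 1, 3, 0), 35.0),
    ("A", datetime(2012, 1, 9, 0), 50.0),
]
weekly = weekly_means(sample)
```

Averaging in stages, rather than pooling all raw readings, keeps a day or station with more readings from dominating the weekly value.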

Subgroup Analysis

To identify the groups that were susceptible to the effects of ambient respiratory viral infections on the number of incident RA cases, a subgroup analysis was conducted based on age, sex, and the presence or absence of respiratory disease prior to RA development. Age groups were categorized as < 40 years, 40–59 years, and ≥ 60 years based on previously reported definitions of young-onset and elderly-onset RA.[27] The presence of respiratory disease was defined as having a respiratory disease code during the 12 months prior to RA diagnosis. Respiratory disease codes were taken from the ICD-10 codes (I27.8, I27.9, J40.x-J47.x, J60.x-J67.x, J68.4, J70.1, J70.3) used for Charlson comorbidity index analysis.[28]
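The subgroup definitions can be expressed as a short sketch. The function names are hypothetical, ICD-10 codes are assumed to be written with a dot (for example `J45.0`), and the 12-month look-back is approximated as 365 days.

```python
import re
from datetime import date, timedelta

# Codes listed individually in the text; J40-J47 and J60-J67 are matched as ranges.
EXACT_CODES = {"I27.8", "I27.9", "J68.4", "J70.1", "J70.3"}


def age_group(age):
    """Map age in years to the three subgroup labels used in the analysis."""
    if age < 40:
        return "<40"
    if age < 60:
        return "40-59"
    return ">=60"


def is_respiratory_code(code):
    """True if an ICD-10 code falls in the respiratory disease list."""
    code = code.strip().upper()
    if code in EXACT_CODES:
        return True
    match = re.fullmatch(r"J(\d{2})(\.\d+)?", code)
    if match is None:
        return False
    category = int(match.group(1))
    return 40 <= category <= 47 or 60 <= category <= 67


def has_prior_respiratory_disease(claims, ra_date, lookback_days=365):
    """claims: iterable of (claim_date, icd10_code); checks the year before ra_date."""
    window_start = ra_date - timedelta(days=lookback_days)
    return any(
        window_start <= claim_date < ra_date and is_respiratory_code(code)
        for claim_date, code in claims
    )


claims = [(date(2012, 2, 1), "J45.0"), (date(2010, 5, 1), "J44.1")]
flag = has_prior_respiratory_disease(claims, date(2012, 6, 1))
```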

Statistical Analysis

Because respiratory virus data are provided by the source as nationwide totals, all other data were analyzed as nationwide totals. First, generalized additive modeling (GAM) with integrated smoothness estimation was used to investigate the relationships between detection rates of eight respiratory viruses and the numbers of incident RA cases. Generalized linear modeling (GLM) was then used to estimate the effects of eight respiratory viruses on the numbers of incident RA cases after adjusting for potential confounders.
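For orientation, the adjusted model for a single virus can be written schematically. The text does not state the link function or error distribution, so the log link below (typical for weekly count outcomes) is an assumption made for illustration only, and all symbols are introduced here rather than taken from the source.

```latex
\log \mathbb{E}[Y_t] \;=\; \beta_0 \;+\; \beta_v\, V_{v,t} \;+\; \sum_{j} f_j\!\left(Z_{j,t}\right) \;+\; f_{\mathrm{time}}(t)
```

Here $Y_t$ is the number of incident RA cases in week $t$, $V_{v,t}$ is the detection rate of virus $v$, $Z_{j,t}$ are the environmental confounders, each $f_j$ is a smooth term with a fixed number of degrees of freedom, $f_{\mathrm{time}}$ captures the time trend, and $\beta_v$ is the effect estimate of interest.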

Degrees of freedom (df) for each confounding factor were determined based on the unbiased risk estimation derived from the GAM. Potential confounders used in the model were PM10 with 9 df, O3 with 9 df, mean temperature with 8 df, mean humidity with 9 df, solar radiation with 9 df, and the natural cubic splines (ns) of time trend with 4 df per year (4 df × 2 years = 8 df). To consider delayed and cumulative effects of respiratory viral infections on incident RA, we used moving average lags of up to 8 weeks (lag1–8). For example, "lag1–8" refers to a moving average lag model for respiratory viral infections over the previous 8 weeks. Confounder lag weeks were matched with those of each virus in the GLM. To determine the greatest respiratory viral effect on incident RA, we selected the lag associated with the highest beta for each virus and then analyzed the statistical significance of the effect size at the selected lag week for each virus.
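The moving average lag and the choice of the lag with the highest beta can be sketched as below. This is a minimal sketch under stated assumptions: a lag of 1 to k is taken as the plain mean of the k preceding weeks (excluding the current week), and the beta values in the usage example are made-up placeholders, not study results.

```python
def moving_average_lag(series, max_lag):
    """Return the lag1-to-max_lag moving average for each week.

    Element t is the mean of series[t - max_lag] .. series[t - 1];
    weeks without enough history are returned as None.
    """
    averaged = []
    for t in range(len(series)):
        if t < max_lag:
            averaged.append(None)  # not enough preceding weeks
        else:
            window = series[t - max_lag:t]
            averaged.append(sum(window) / max_lag)
    return averaged


def lag_with_highest_beta(betas):
    """betas: {lag_weeks: estimated beta}; returns the lag with the largest beta."""
    return max(betas, key=betas.get)


rates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # placeholder weekly detection rates
lag8 = moving_average_lag(rates, 8)
best = lag_with_highest_beta({1: 0.1, 4: 0.5, 8: 0.3})  # placeholder betas
```

The `None` entries show why exposure data must begin before the outcome data: a lag1–8 average needs 8 preceding weeks, which is consistent with collecting virus and environmental data from November 2011 while counting incident RA from January 2012.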

SAS statistical software (version 9.4, SAS, Cary, NC, USA) was used for data collation. All statistical analyses were performed using R software (version 3.5.1, The R Project for Statistical Computing). A P value < .05 was considered statistically significant.