Dengue fever, a tropical vector-borne disease, is a leading cause of hospitalization and death in many parts of the world, especially in Asia and Latin America. In places where timely and accurate dengue activity surveillance is available, decision-makers possess valuable information that may allow them to better design and implement public health measures, and improve the allocation of limited public health resources. In addition, robust and reliable near-term forecasts of likely epidemic outcomes may further help anticipate increased demand on healthcare infrastructure and may promote a culture of preparedness. Here, we propose ensemble modeling approaches that combine forecasts produced with a variety of independent mechanistic, statistical, and machine learning component models to forecast reported dengue case counts 1, 2, and 3 months ahead of current time at the province level in multiple countries. We assess the monthly predictive ability of the ensemble and of each component model in a fully out-of-sample and retrospective fashion, in over 180 locations around the world - all provinces of Brazil, Colombia, Malaysia, Mexico, and Thailand, as well as Iquitos, Peru, and San Juan, Puerto Rico - over at least 2-3 years. Additionally, we evaluate ensemble approaches in a multi-model, real-time, and prospective dengue forecasting platform - where issues of data availability and data completeness introduce important limitations - during an 11-month time period in the years 2022 and 2023. We show that our ensemble modeling approaches lead to reliable and robust prediction estimates when compared to baseline estimates produced with available information at the time of prediction. This can be contrasted with the high variability in the forecasting ability of each individual component model, across locations and time.
Furthermore, we find that no individual model leads to optimal and robust predictions across time horizons and locations, and while the ensemble models do not always achieve the best prediction performance in any given location, they consistently provide reliable disease estimates - they rank in the top 3 performing models across locations and time periods - both retrospectively and prospectively.
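The idea of combining component forecasts into an ensemble can be illustrated with a minimal sketch. The abstract does not state how the forecasts are combined, so the per-horizon median used here is an assumption chosen for simplicity, and the model names and case counts are hypothetical.

```python
from statistics import median


def median_ensemble(component_forecasts):
    """Combine component forecasts by taking the median at each horizon.

    component_forecasts maps a (hypothetical) model name to a list of
    predicted case counts, one per forecast horizon (e.g. 1, 2, 3 months).
    The median is an assumed combination rule, not the study's method.
    """
    horizon_counts = {len(f) for f in component_forecasts.values()}
    if len(horizon_counts) != 1:
        raise ValueError("all component models must cover the same horizons")
    # zip(*...) regroups the values so each tuple holds one horizon's forecasts.
    per_horizon = zip(*component_forecasts.values())
    return [median(values) for values in per_horizon]


# Illustrative numbers only; not taken from the study.
forecasts = {
    "mechanistic": [120.0, 150.0, 200.0],
    "statistical": [100.0, 140.0, 160.0],
    "machine_learning": [110.0, 300.0, 170.0],
}
ensemble = median_ensemble(forecasts)
```

In this toy example the outlying 2-month forecast of 300 does not pull the combined estimate away from the other two models, which is one reason a median is a common choice when component models vary widely in skill.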
Tracking COVID-19 Infections Using Survey Data on Rapid At-Home Tests Mauricio Santillana, Ata A. Uslu, Tamanna Urmi, Alexi Quintana-Mathe, James N. Druckman, Katherine Ognyanova, Matthew Baum, Roy H. Perlis, David Lazer. JAMA Network Open. 2024;7(9):e2435442. doi:10.1001/jamanetworkopen.2024.35442
Abstract
Importance: Identifying and tracking new infections during an emerging pandemic is crucial to design and deploy interventions to protect populations and mitigate the pandemic’s effects, yet it remains a challenging task. Objective: To characterize the ability of nonprobability online surveys to longitudinally estimate the number of COVID-19 infections in the population both in the presence and absence of institutionalized testing. Design, Setting, and Participants: Internet-based online nonprobability surveys were conducted among residents aged 18 years or older across 50 US states and the District of Columbia, using the PureSpectrum survey vendor, approximately every 6 weeks between June 1, 2020, and January 31, 2023, for a multiuniversity consortium, the COVID States Project. Surveys collected information on COVID-19 infections with representative state-level quotas applied to balance age, sex, race and ethnicity, and geographic distribution. Main Outcomes and Measures: The main outcomes were (1) survey-weighted estimates of new monthly confirmed COVID-19 cases in the US from January 2020 to January 2023 and (2) estimates of uncounted test-confirmed cases from February 1, 2022, to January 1, 2023. These estimates were compared with institutionally reported COVID-19 infections collected by Johns Hopkins University and wastewater viral concentrations for SARS-CoV-2 from Biobot Analytics. Results: The survey spanned 17 waves deployed from June 1, 2020, to January 31, 2023, with a total of 408 515 responses from 306 799 respondents (mean [SD] age, 42.8 [13.0] years; 202 416 women [66.0%]). Overall, 64 946 respondents (15.9%) self-reported a test-confirmed COVID-19 infection. National survey-weighted test-confirmed COVID-19 estimates were strongly correlated with institutionally reported COVID-19 infections (Pearson correlation, r = 0.96; P < .001) from April 2020 to January 2022 (50-state correlation mean [SD] value, r = 0.88 [0.07]).
This was before the government-led mass distribution of at-home rapid tests. After January 2022, correlation was diminished and no longer statistically significant (r = 0.55; P = .08; 50-state correlation mean [SD] value, r = 0.48 [0.23]). In contrast, survey COVID-19 estimates correlated highly with SARS-CoV-2 viral concentrations in wastewater both before (r = 0.92; P < .001) and after (r = 0.89; P < .001) January 2022. Institutionally reported COVID-19 cases correlated (r = 0.79; P < .001) with wastewater viral concentrations before January 2022, but poorly (r = 0.31; P = .35) after, suggesting that both survey and wastewater estimates may have better captured test-confirmed COVID-19 infections after January 2022. Consistent correlation patterns were observed at the state level. Based on national-level survey estimates, approximately 54 million COVID-19 cases were likely unaccounted for in official records between January 2022 and January 2023. Conclusions and Relevance: This study suggests that nonprobability survey data can be used to estimate the temporal evolution of test-confirmed infections during an emerging disease outbreak. Self-reporting tools may enable government and health care officials to implement accessible and affordable at-home testing for efficient infection monitoring in the future.
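The comparisons above rest on the Pearson correlation between two monthly time series. As a minimal sketch of that statistic, the function below computes r from its textbook definition; the function name and the two short series are hypothetical, and survey weighting and P values are deliberately left out.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations and the two sums of squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Illustrative monthly values only (millions of cases); not study data.
survey_estimates = [1.0, 2.0, 4.0, 3.0]
reported_cases = [1.1, 2.1, 3.9, 3.2]
r = pearson_r(survey_estimates, reported_cases)
```

Splitting each series at a cutoff month and calling the function on each half would mirror the before and after January 2022 comparison described in the abstract.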
Prevalence and correlates of irritability among U.S. adults Roy H Perlis, Ata Uslu, Jonathan Schulman, Aliayah Himelfarb, Faith M Gunning, Nili Solomonov, Mauricio Santillana, Matthew A Baum, James N Druckman, Katherine Ognyanova, David Lazer. Neuropsychopharmacol. (2024). https://doi.org/10.1038/s41386-024-01959-3
Abstract
This study aimed to characterize the prevalence of irritability among U.S. adults, and the extent to which it co-occurs with major depressive and anxious symptoms. A non-probability internet survey of individuals 18 and older in 50 U.S. states and the District of Columbia was conducted between November 2, 2023, and January 8, 2024. Regression models with survey weighting were used to examine associations between the Brief Irritability Test (BITe5) and sociodemographic and clinical features. The survey cohort included 42,739 individuals, mean age 46.0 (SD 17.0) years; 25,001 (58.5%) identified as women, 17,281 (40.4%) as men, and 457 (1.1%) as nonbinary. A total of 1218 (2.8%) identified as Asian American, 5971 (14.0%) as Black, 5348 (12.5%) as Hispanic, 1775 (4.2%) as another race, and 28,427 (66.5%) as white. Mean irritability score was 13.6 (SD 5.6) on a scale from 5 to 30. In linear regression models, irritability was greater among respondents who were female, younger, had lower levels of education, and lower household income. Greater irritability was associated with likelihood of thoughts of suicide in logistic regression models adjusted for sociodemographic features (OR 1.23, 95% CI 1.22–1.24). Among 1979 individuals without thoughts of suicide on the initial survey assessed for such thoughts on a subsequent survey, greater irritability was also associated with greater likelihood of thoughts of suicide being present (adjusted OR 1.17, 95% CI 1.12–1.23). The prevalence of irritability and its association with thoughts of suicide suggests the need to better understand its implications among adults outside of acute mood episodes.
Importance: Trust in physicians and hospitals has been associated with achieving public health goals, but the increasing politicization of public health policies during the COVID-19 pandemic may have adversely affected such trust. Objective: To characterize changes in US adults’ trust in physicians and hospitals over the course of the COVID-19 pandemic and the association between this trust and health-related behaviors. Design, Setting, and Participants: This survey study uses data from 24 waves of a nonprobability internet survey conducted between April 1, 2020, and January 31, 2024, among 443 455 unique respondents aged 18 years or older residing in the US, with state-level representative quotas for race and ethnicity, age, and gender. Main Outcome and Measure: Self-report of trust in physicians and hospitals; self-report of SARS-CoV-2 and influenza vaccination and booster status. Survey-weighted regression models were applied to examine associations between sociodemographic features and trust and between trust and health behaviors. Results: The combined data included 582 634 responses across 24 survey waves, reflecting 443 455 unique respondents. The unweighted mean (SD) age was 43.3 (16.6) years; 288 186 respondents (65.0%) reported female gender; 21 957 (5.0%) identified as Asian American, 49 428 (11.1%) as Black, 38 423 (8.7%) as Hispanic, 3138 (0.7%) as Native American, 5598 (1.3%) as Pacific Islander, 315 278 (71.1%) as White, and 9633 (2.2%) as other race and ethnicity (those who selected “Other” from a checklist). Overall, the proportion of adults reporting a lot of trust for physicians and hospitals decreased from 71.5% (95% CI, 70.7%-72.2%) in April 2020 to 40.1% (95% CI, 39.4%-40.7%) in January 2024. In regression models, features associated with lower trust as of spring and summer 2023 included being 25 to 64 years of age, female gender, lower educational level, lower income, Black race, and living in a rural setting.
These associations persisted even after controlling for partisanship. In turn, greater trust was associated with greater likelihood of vaccination for SARS-CoV-2 (adjusted odds ratio [OR], 4.94; 95% CI, 4.21-5.80) or influenza (adjusted OR, 5.09; 95% CI, 3.93-6.59) and receiving a SARS-CoV-2 booster (adjusted OR, 3.62; 95% CI, 2.99-4.38). Conclusions and Relevance: This survey study of US adults suggests that trust in physicians and hospitals decreased during the COVID-19 pandemic. As lower levels of trust were associated with lesser likelihood of pursuing vaccination, restoring trust may represent a public health imperative.
Accurate forecasts can enable more effective public health responses during seasonal influenza epidemics. For the 2021–22 and 2022–23 influenza seasons, 26 forecasting teams provided national and jurisdiction-specific probabilistic predictions of weekly confirmed influenza hospital admissions for one-to-four weeks ahead. Forecast skill is evaluated using the Weighted Interval Score (WIS), relative WIS, and coverage. Six out of 23 models outperform the baseline model across forecast weeks and locations in 2021–22 and 12 out of 18 models in 2022–23. Averaging across all forecast targets, the FluSight ensemble is the 2nd most accurate model measured by WIS in 2021–22 and the 5th most accurate in the 2022–23 season. Forecast skill and 95% coverage for the FluSight ensemble and most component models degrade over longer forecast horizons. In this work we demonstrate that while the FluSight ensemble was a robust predictor, even ensembles face challenges during periods of rapid change.
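For readers unfamiliar with the Weighted Interval Score, the sketch below implements its commonly used interval-based definition for a single forecast: a weighted sum of the absolute error of the median and the interval scores of K central prediction intervals, divided by K + 1/2. The function names and example numbers are hypothetical, and the abstract does not state which interval levels were scored, so the levels here are assumptions.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval.

    The score is the interval width plus a penalty, scaled by 2 / alpha,
    for how far the observation y falls outside the interval.
    """
    width = upper - lower
    penalty_below = (2.0 / alpha) * max(lower - y, 0.0)
    penalty_above = (2.0 / alpha) * max(y - upper, 0.0)
    return width + penalty_below + penalty_above


def weighted_interval_score(median, intervals, y):
    """WIS for one forecast; lower values indicate a better forecast.

    intervals maps alpha to (lower, upper) for the central (1 - alpha)
    interval. The median gets weight 1/2 and each interval weight alpha / 2.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Illustrative forecast: median 10 admissions with a 50% interval of [8, 12].
wis_hit = weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, y=10.0)
wis_miss = weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, y=14.0)
```

A relative WIS, as mentioned in the abstract, compares a model's score with that of a baseline model; one simple illustration is the ratio of the two models' mean WIS over the same targets, although the exact construction used for FluSight is not given here.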
Accurate, real-time forecasts of influenza hospitalizations would facilitate prospective resource allocation and public health preparedness. State-of-the-art machine learning methods are a promising approach to produce such forecasts, but they require extensive historical data to be properly trained. Unfortunately, historically observed data of influenza hospitalizations, for the 50 states in the United States, are only available since the beginning of 2020, as their collection was motivated and enabled by the COVID-19 pandemic. In addition, the data are far from perfect as they were under-reported for several months before health systems began consistently and reliably submitting their data. To address these issues, we propose a transfer learning approach to perform data augmentation. We extend the currently available two-season dataset for state-level influenza hospitalizations in the US by an additional ten seasons. Our method leverages influenza-like illness (ILI) surveillance data to infer historical estimates of influenza hospitalizations. This cross-domain data augmentation enables the implementation of advanced machine learning techniques, multi-horizon training, and an ensemble of models for forecasting using the ILI training data set, improving hospitalization forecasts. We evaluated the performance of our machine learning approaches by prospectively producing forecasts for future weeks and submitting them in real time to the Centers for Disease Control and Prevention FluSight challenges during two seasons: 2022-2023 and 2023-2024. Our methodology demonstrated good accuracy and reliability, achieving a fourth place finish (among 20 participating teams) in the 2022-23 and a second place finish (among 20 participating teams) in the 2023-24 CDC FluSight challenges. 
Our findings highlight the utility of data augmentation and knowledge transfer in the application of machine learning models to public health surveillance where only limited historical data is available.
The COVID-19 pandemic has not only presented a major global public health and socio-economic crisis, but has also significantly impacted human behavior towards adherence (or lack thereof) to public health intervention and mitigation measures implemented in communities worldwide. This study is based on the use of mathematical modeling approaches to assess the extent to which SARS-CoV-2 transmission dynamics are impacted by population-level changes of human behavior due to factors such as (a) the severity of transmission (such as disease-induced mortality and level of symptomatic transmission), (b) fatigue due to the implementation of mitigation intervention measures (e.g., lockdowns) over an extended period of time, and (c) social peer pressure, among others. A novel behavior-epidemiology model, which takes the form of a deterministic system of nonlinear differential equations, is developed and fitted using observed cumulative SARS-CoV-2 mortality data during the first wave in the United States. The model fits the observed data, and makes a more accurate prediction of the observed daily SARS-CoV-2 mortality during the first wave (March 2020–June 2020) than the equivalent model that does not explicitly account for changes in human behavior. This study suggests that, as more newly-infected individuals become asymptomatically-infectious, the overall level of positive behavior change can be expected to significantly decrease (while new cases may rise, particularly if asymptomatic individuals have a higher contact rate than symptomatic individuals).
Importance: Identifying and tracking new infections during an emerging pandemic is crucial to design and deploy interventions to protect populations and mitigate its effects, yet it remains a challenging task. Objective: To characterize the ability of non-probability online surveys to longitudinally estimate the number of COVID-19 infections in the population both in the presence and absence of institutionalized testing. Design: Internet-based non-probability surveys were conducted, using the PureSpectrum survey vendor, approximately every 6 weeks between April 2020 and January 2023. They collected information on COVID-19 infections with representative state-level quotas applied to balance age, gender, race and ethnicity, and geographic distribution. Data from this survey were compared to institutional case counts collected by Johns Hopkins University and wastewater surveillance data for SARS-CoV-2 from Biobot Analytics. Setting: Population-based online non-probability survey conducted for a multi-university consortium, the Covid States Project. Participants: Residents aged 18 years or older across 50 US states and the District of Columbia. Main Outcomes and Measures: The main outcomes are: (a) survey-weighted estimates of new monthly confirmed COVID-19 cases in the US from January 2020 to January 2023, and (b) estimates of uncounted test-confirmed cases, from February 1, 2022, to January 1, 2023. These are compared to institutionally reported COVID-19 infections and wastewater viral concentrations. Results: The survey spanned 17 waves deployed from June 2020 to January 2023, with a total of 408,515 responses from 306,799 respondents with mean age 42.8 (SD 13) years; 202,416 (66%) identified as women, and 104,383 (34%) as men. A total of 16,715 (5.4%) identified as Asian, 33,234 (10.8%) as Black, 24,938 (8.1%) as Hispanic, 219,448 (71.5%) as White, and 12,464 (4.1%) as another race.
Overall, 64,946 respondents (15.9%) self-reported a test-confirmed COVID-19 infection. National survey-weighted test-confirmed COVID-19 estimates were strongly correlated with institutionally reported COVID-19 infections (Pearson correlation of r=0.96; p=1.8e-12) from April 2020 to January 2022 (50-state correlation average of r=0.88, SD = 0.073). This was before the government-led mass distribution of at-home rapid tests. Following January 2022, correlation was diminished and no longer statistically significant (r=0.55, p=0.08; 50-state correlation average of r=0.48, SD = 0.227). In contrast, survey COVID-19 estimates correlated highly with SARS-CoV-2 viral concentrations in wastewater both before (r=0.92; p=2.2e-09) and after (r=0.89; p=2.3e-04) January 2022. Institutionally reported COVID-19 cases correlated (r = 0.79, p=1.10e-05) with wastewater viral concentrations before January 2022, but poorly (r = 0.31, p=0.35) after, suggesting both survey and wastewater estimates may have better captured test-confirmed COVID-19 infections after January 2022. Consistent correlation patterns were observed at the state level. Based on national-level survey estimates, approximately 54 million COVID-19 cases were unaccounted for in official records between January 2022 and January 2023. Conclusions and Relevance: Non-probability survey data can be used to estimate the temporal evolution of test-confirmed infections during an emerging disease outbreak. Self-reporting tools may enable government and healthcare officials to implement accessible and affordable at-home testing for efficient infection monitoring in the future.
Long-term COVID-19 complications are a globally pervasive threat, but their plausible social drivers are often not prioritized. Here, we use data from a multinational consortium to quantify the relative contributions of social and clinical factors to differences in quality of life among participants experiencing long COVID and measure the extent to which social variables’ impacts can be attributed to clinical intermediates, across diverse contexts. In addition to age, neuropsychological and rheumatological comorbidities, educational attainment, employment status, and female sex were identified as important predictors of long COVID-associated quality of life days (long COVID QALDs). Furthermore, a great majority of their impacts on long COVID QALDs could not be tied to key long COVID-predicting comorbidities, such as asthma, diabetes, hypertension, psychological disorder, and obesity. In Norway, 90% (95% CI: 77%, 100%) of the effect of belonging to the highest versus lowest educational attainment quintile was not attributed to intermediate comorbidity impacts. The same was true for 86% (73%, 100%) of the protective effects of full-time employment versus all other employment status categories (excluding retirement) in the UK and 74% (46%,100%) of the protective effects of full-time employment versus all other employment status categories in a cohort of four middle-income countries (MIC). Of the effects of female sex on long COVID QALDs in Norway, UK, and the MIC cohort, 77% (46%,100%), 73% (52%, 94%), and 84% (62%, 100%) were unexplained by the clinical mediators, respectively. Our findings highlight that socio-economic proxies and sex may be as predictive of long COVID QALDs as commonly emphasized comorbidities and that broader structural determinants likely drive their impacts. Importantly, we outline a multi-method, adaptable causal machine learning approach for evaluating the isolated contributions of social disparities to long COVID quality of life experiences.
Importance: The frequent occurrence of cognitive symptoms in post–COVID-19 condition has been described, but the nature of these symptoms and their demographic and functional factors are not well characterized in generalizable populations. Objective: To investigate the prevalence of self-reported cognitive symptoms in post–COVID-19 condition, in comparison with individuals with prior acute SARS-CoV-2 infection who did not develop post–COVID-19 condition, and their association with other individual features, including depressive symptoms and functional status. Design, Setting, and Participants: Two waves of a 50-state nonprobability population-based internet survey conducted between December 22, 2022, and May 5, 2023. Participants included survey respondents aged 18 years and older. Exposure: Post–COVID-19 condition, defined as self-report of symptoms attributed to COVID-19 beyond 2 months after the initial month of illness. Main Outcomes and Measures: Seven items from the Neuro-QoL cognition battery assessing the frequency of cognitive symptoms in the past week and Patient Health Questionnaire-9. Results: The 14 767 individuals reporting test-confirmed COVID-19 illness at least 2 months before the survey had a mean (SD) age of 44.6 (16.3) years; 568 (3.8%) were Asian, 1484 (10.0%) were Black, 1408 (9.5%) were Hispanic, and 10 811 (73.2%) were White. A total of 10 037 respondents (68.0%) were women and 4730 (32.0%) were men. Of the 1683 individuals reporting post–COVID-19 condition, 955 (56.7%) reported at least 1 cognitive symptom experienced daily, compared with 3552 of 13 084 (27.1%) of those who did not report post–COVID-19 condition.
More daily cognitive symptoms were associated with a greater likelihood of reporting at least moderate interference with functioning (unadjusted odds ratio [OR], 1.31 [95% CI, 1.25-1.36]; adjusted OR [AOR], 1.30 [95% CI, 1.25-1.36]), lesser likelihood of full-time employment (unadjusted OR, 0.95 [95% CI, 0.91-0.99]; AOR, 0.92 [95% CI, 0.88-0.96]) and greater severity of depressive symptoms (unadjusted coefficient, 1.40 [95% CI, 1.29-1.51]; adjusted coefficient, 1.27 [95% CI, 1.17-1.38]). After including depressive symptoms in regression models, associations were also found between cognitive symptoms and at least moderate interference with everyday functioning (AOR, 1.27 [95% CI, 1.21-1.33]) and between cognitive symptoms and lower odds of full-time employment (AOR, 0.92 [95% CI, 0.88-0.97]). Conclusions and Relevance: The findings of this survey study of US adults suggest that cognitive symptoms are common among individuals with post–COVID-19 condition and associated with greater self-reported functional impairment, lesser likelihood of full-time employment, and greater depressive symptom severity. Screening for and addressing cognitive symptoms is an important component of the public health response to post–COVID-19 condition.
Background: The correlates responsible for the temporal changes of intrahousehold SARS-CoV-2 transmission in the United States have been understudied mainly due to a lack of available surveillance data. Specifically, early analyses of SARS-CoV-2 household secondary attack rates (SARs) were small in sample size and conducted cross-sectionally at single time points. From these limited data, it has been difficult to assess the role that different risk factors have had on intrahousehold disease transmission in different stages of the ongoing COVID-19 pandemic, particularly in children and youth. Objective: This study aimed to estimate the transmission dynamic and infectivity of SARS-CoV-2 among pediatric and young adult index cases (age 0 to 25 years) in the United States through the initial waves of the pandemic. Methods: Using administrative claims, we analyzed 19 million SARS-CoV-2 test records between January 2020 and February 2021. We identified 36,241 households with pediatric index cases and calculated household SARs utilizing complete case information. Using a retrospective cohort design, we estimated the household SARS-CoV-2 transmission between 4 index age groups (0 to 4 years, 5 to 11 years, 12 to 17 years, and 18 to 25 years) while adjusting for sex, family size, quarter of first SARS-CoV-2 positive record, and residential regions of the index cases. Results: After filtering all household records for greater than one member in a household and missing information, only 36,241 (0.85%) of 4,270,130 households with a pediatric case remained in the analysis. Index cases aged between 0 and 17 years were a minority of the total index cases (n=11,484, 11%). The overall SAR of SARS-CoV-2 was 23.04% (95% CI 21.88-24.19). As a comparison, the SAR for all ages (0 to 65+ years) was 32.4% (95% CI 32.1-32.8), higher than the SAR for the population between 0 and 25 years of age. 
The highest SAR of 38.3% was observed in April 2020 (95% CI 31.6-45), while the lowest SAR of 15.6% was observed in September 2020 (95% CI 13.9-17.3). The SAR consistently decreased from 32% to 21.1% as the age of index groups increased. In a multiple logistic regression analysis, we found that the youngest pediatric age group (0 to 4 years) had 1.69 times (95% CI 1.42-2.00) the odds of SARS-CoV-2 transmission to any family members when compared with the oldest group (18 to 25 years). Family size was significantly associated with household viral transmission (odds ratio 2.66, 95% CI 2.58-2.74). Conclusions: Using retrospective claims data, the pediatric index transmission of SARS-CoV-2 during the initial waves of the COVID-19 pandemic in the United States was associated with location and family characteristics. Pediatric SAR (0 to 25 years) was less than the SAR for all other age groups. Less than 1% (n=36,241) of all household data were retained in the retrospective study for complete case analysis, perhaps biasing our findings. We have provided measures of baseline household pediatric transmission for tracking and comparing the infectivity of later SARS-CoV-2 variants.
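A household secondary attack rate is the share of household contacts of an index case who become secondary cases. The sketch below shows that arithmetic with a normal-approximation (Wald) 95% confidence interval. The abstract does not say how its intervals were computed, so the Wald method is an assumption, and the function name and counts are hypothetical.

```python
from math import sqrt


def secondary_attack_rate(secondary_cases, household_contacts, z=1.96):
    """Return (SAR, lower, upper) as proportions between 0 and 1.

    SAR = secondary cases / household contacts of index cases.
    The interval is a Wald (normal-approximation) interval, an assumed
    method that is reasonable only for fairly large contact counts.
    """
    if household_contacts <= 0:
        raise ValueError("household_contacts must be positive")
    if not 0 <= secondary_cases <= household_contacts:
        raise ValueError("secondary_cases must lie in [0, household_contacts]")
    sar = secondary_cases / household_contacts
    half_width = z * sqrt(sar * (1.0 - sar) / household_contacts)
    # Clip to the valid range of a proportion.
    return sar, max(0.0, sar - half_width), min(1.0, sar + half_width)


# Illustrative counts only; not taken from the study.
sar, lower, upper = secondary_attack_rate(25, 100)
```

Computing the rate separately for each index age group or calendar month, as the study reports, would amount to calling the function once per subgroup with that subgroup's counts.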
The coronavirus (COVID-19) pandemic has profoundly impacted various aspects of daily life, society, healthcare systems, and global health policies. This pandemic has resulted in more than one hundred million people being infected and, unfortunately, the loss of life for many individuals. Although treatment for the coronavirus is now available, effective forecasting of COVID-19 infection is of the utmost importance to aid public health officials in making critical decisions. However, forecasting COVID-19 trends through time-series analysis poses significant challenges due to the data’s inherently dynamic, transient, and noise-prone nature. In this study, we have developed the Fine-Grained Infection Forecast Network (FIGI-Net) model, which provides accurate forecasts of COVID-19 trends up to two weeks in advance. FIGI-Net addresses the current limitations in COVID-19 forecasting by leveraging fine-grained county-level data and a stacked bidirectional LSTM structure. We employ a pre-trained model to capture essential global infection patterns. Subsequently, these pre-trained parameters were transferred to train localized sub-models for county clusters exhibiting comparable infection dynamics. This model adeptly handles sudden changes and rapid fluctuations in data, frequently observed across various times and locations of county-level data, ultimately improving the accuracy of COVID-19 infection forecasting at the county, state, and national levels. The FIGI-Net model demonstrated significant improvement over other deep learning-based models and state-of-the-art COVID-19 forecasting models, evident in various standard evaluation metrics. Notably, the FIGI-Net model excels at forecasting the direction of infection trends, especially during the initial phases of different COVID-19 outbreak waves.
Our study underscores the effectiveness and superiority of our time-series deep learning-based methods in addressing dynamic and sudden changes in infection numbers over short-term time periods. These capabilities facilitate efficient public health management and the early implementation of COVID-19 transmission prevention measures.