We appreciate the questions raised in the letters to the editor [1, 2, 3, 4] about the brief report on the association between youth smoking, electronic cigarette use, and COVID-19.
We thank the Journal of Adolescent Health for the opportunity to provide further information to clarify our findings and research implications. The brief report was intended to provide preliminary findings, generate discussion, and encourage additional follow-up research related to the COVID-19 pandemic, a rapidly emerging public health crisis. Brief reports have word limitations, and we are happy to respond in detail to the submitted letters [1, 2, 3, 4]. Below, we summarize the comments raised in the letters, followed by our response.
1) The letters submitted by Farsalinos and Niaura and by Camacho and Murphy questioned the use of anonymous online surveys and the representativeness of our sample; in particular, they noted that our sample is not a random probability sample and that weighting a convenience sample does not eliminate bias.
We will present our thoughts on the two aspects of the correspondents’ concerns:
Using self-reported data from online surveys: National, self-administered, anonymous surveys are commonly used in epidemiological research, particularly when collecting data related to risky or sensitive behaviors to protect participant privacy. Studies show that adolescents’ self-reported tobacco use behavior (including self-reported e-cigarette use) is correlated with measured urinary biomarkers, an objective measure of tobacco use [7, 8, 9]. Data from online surveys are also reliable and valid.
Such nonprobability surveys, designed to provide rapid-response data, are best suited to examining associations, as we did in our research, and should not be used to estimate prevalence in the population. Furthermore, comparisons of online and paper-based surveys show no major differences in data quality between the two methodologies [10,11], including when asking youth about tobacco-related behaviors [12, 13, 14]. In the publication, Qualtrics was used to field the survey among its existing panel members. Qualtrics panels are increasingly used in epidemiological research, including during the COVID-19 pandemic [15, 16, 17, 18, 19, 20, 21].
Weighting a convenience sample in the study: Weighting strategies are commonly applied to online panels to make them more representative of the total population.
Although random sampling is certainly valuable for getting population-based survey estimates, even with highly nonrandom (nonprobability) samples, it has been shown that statistical adjustment (multilevel regression and poststratification) followed by weighted and aggregated statistics can enable such samples to be more representative of the population [22,23]. When cross-tabulated population data corresponding to all relevant variables are not available, raking is a common statistical approach to correct sampling bias, where weights are assigned to individuals in the sample such that the marginal weighted distributions of the sample variables are close to those in the target population.
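To make the raking step concrete, the following is a minimal sketch in Python of iterative proportional fitting on a toy sample. It is not the code used in the published study: the function name `rake`, the six hypothetical respondents, and the target proportions are all illustrative assumptions and do not correspond to Census, National Youth Tobacco Survey, or study data. The sketch also assumes that every category named in the targets appears at least once in the sample.

```python
import numpy as np

def rake(categories, targets, max_iter=500, tol=1e-8):
    """Rake unit weights so weighted marginals match target proportions.

    categories: dict mapping variable name -> array of labels, one per respondent
    targets:    dict mapping variable name -> {label: target population share}
    Assumes every target label occurs in the sample (otherwise division by zero).
    """
    n = len(next(iter(categories.values())))
    weights = np.ones(n)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            labels = np.asarray(categories[var])
            total = weights.sum()
            for level, share in shares.items():
                mask = labels == level
                current = weights[mask].sum() / total  # current weighted share
                factor = share / current               # scale toward the target
                weights[mask] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all adjustment factors are essentially 1
            break
    return weights / weights.mean()  # normalize to a mean weight of 1

# Hypothetical toy sample (illustration only, not study data).
sample = {
    "age_group": np.array(["13-17", "13-17", "18-20", "21-24", "21-24", "21-24"]),
    "ever_vaped": np.array(["yes", "no", "yes", "yes", "no", "yes"]),
}
# Hypothetical target marginals (illustration only, not Census or survey figures).
targets = {
    "age_group": {"13-17": 0.40, "18-20": 0.25, "21-24": 0.35},
    "ever_vaped": {"yes": 0.30, "no": 0.70},
}

weights = rake(sample, targets)
```

Only the marginal distribution of each variable is needed, not the joint cross-tabulation, which is why raking is convenient when population data come from several different sources.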
In the published study, statistical raking and weighting techniques were used to improve the accuracy of the survey data estimates of associations. Qualtrics used quota sampling to oversample specific subpopulation groups to achieve a more representative sample; we then weighted our sample using marginal distributions of variables (age; gender; ever- and past 30-day e-cigarette use; lesbian, gay, bisexual, transgender and queer or questioning [LGBTQ]; race/ethnicity; and region) in the target population so that the sample has similar marginal distributions of those variables. We also accounted for correlations among observations because of multilevel clustering by region and state (specific counties and cities of participants were not available in the dataset).
In particular, all percentages provided in column 4 of Supplementary Table 3 of the original article were used to create a composite weight for each individual in the sample. These include (1) data from the U.S. Census 2018 to account for the purposive 1:1:1 sampling for adolescents (aged 13–17 years), emerging adults (aged 18–20 years), and young adults (aged 21–24 years); (2) data from the National Youth Tobacco Survey and the National Health Interview Survey to account for the purposive oversampling of e-cigarette ever users (sample quotas were imposed for 50:50 e-cigarette ever/never users) and among past 30-day users; (3) data from the U.S. Census 2018 to make the sample representative of the U.S. population with respect to sex (although data for “Other” was noted as not available), region, and race/ethnicity; and (4) data from the 2017 Youth Risk Behavior Survey to balance the proportion of persons identifying as LGBTQ (because Census data do not include population percentages for people identifying as LGBTQ; although Youth Risk Behavior Survey data relate to 13- to 18-year-olds, we used these data for 18- to 24-year-olds as well). Given the reasonably large sample of 780 LGBTQ youth in the unweighted data, it was also important to create weights for this variable.
2) The letters by Rodu and Plurphanswat and by Camacho and Murphy inquired about our numbers in Table 1 and also asked for unweighted data on e-cigarette, cigarette, and dual use by COVID-19-related outcomes.
J Adolesc Health – January 1, 2021.