The PATH Study: A Longitudinal Cohort Study of Tobacco Use, Determinants, Impacts

[Kaye] Good morning everyone. Welcome to
our presentation on The Population Assessment of Tobacco and
Health Study: A Longitudinal Cohort Study of Tobacco Use, Its Determinants, and Its
Impacts. As Linda stated, my name is Kaye Marz. I’m the Senior Data Project Manager
for the National Addiction and HIV Data Archive Program at ICPSR. NAHDAP staff
curate the PATH study data from ICPSR. This is the first in a series of
sessions throughout today that are presenting on various aspects of data
transparency. We hope that you are able to attend the other sessions today in
the ICPSR Data Fair. First, let’s define what data transparency means for this session which is focused on the world of data sharing. The first
component of our definition is reliability and transparency of content.
Data users are confident that the data they are using are accurate and were
provided by permission from the cited source. Secondly, ease of access. Data
users can easily access and work with the data no matter where they are
located or the method or technology used to create the data. Let’s look
specifically at our case study which is again the Population Assessment of
Tobacco and Health study. Although the PATH study is a complex data collection
effort, features of this study and how it is sharing the data can be used as a
model to prepare and share data from other research. In 2009 the Tobacco
Control Act gave the Food and Drug Administration, the FDA, the authority to
regulate the manufacture, distribution, and marketing of tobacco products to
protect public health. The PATH study was launched in 2011 to monitor and assess
tobacco use, its determinants and its impacts in order to inform FDA
regulatory activities. The PATH study is the nationally representative
longitudinal study of tobacco use that is designed to support the FDA’s Center
for Tobacco Products regulatory considerations. The study is funded by
the FDA and administered by NIDA (National Institute on Drug Abuse). Westat
holds the contract to implement the PATH study. They have nine
scientific partners to support this mission as shown on the slide. The PATH study again uses a longitudinal
cohort design with a nationally representative sample of the US population, drawn using a stratified probability
design that started at wave one with nearly 46,000 persons age 12 years and
older, including over 13,000 youth. Importantly, the sample includes never,
current, and former tobacco users. Adult tobacco users and young adults were
heavily oversampled, and there is a modest oversample for African-American
adults. ACASI (audio computer-assisted self-interviewing) was used for the interviews, and adults also provided
biospecimens. Perhaps the most distinguishing feature of the PATH study
is its longitudinal nature which complements the cross-sectional tobacco
surveillance systems that were already in place. This design allows researchers
to examine changes in tobacco use, behavior, and health outcomes over time.
Shown on this slide is a sample of some of the outcomes that can be
examined with the PATH study data. Another distinguishing feature of the
PATH study is the very detailed questions about use of specific tobacco
products: first-time use, current use, amount used and purchased, new product use, switching between products, and dual and poly use. Generic images of the
product were displayed on the interview screen when selected questions were
asked about that type of tobacco product, except for cigarettes. Some tobacco
products come in a variety of subtypes. For example, cigars include traditional
cigars, cigarillos, and filtered cigars. Product descriptions emphasize the
physical features that differentiated the cigar types. Available from the
PATH study are two main types of data. Questionnaire data come in two versions:
public-use files, which are downloadable from the NAHDAP website after the data user
agrees to the online Terms of Use, and restricted-use files, which are accessed
only via the ICPSR data enclave and require a signed data use agreement and
approved application. Biomarker data are also available from the urine and blood
collections. These are available as restricted-use files only; typically they
are identified with the BRUF acronym, and access similarly requires the signed
data use agreement and approved application. How does the PATH study
promote data transparency? Remember the two components of data transparency in
our earlier definition. We have reliability and transparency
of content and ease of access. First we will cover how the PATH study
exemplifies the principle of reliability and transparency of content. Reliability
and transparency of content are present in three main ways, each of which will be
expanded upon in the following slides. First, the data source: familiarity with
the source of data is always important for assessing its reliability. The PATH study
design was based on a validated model and the questionnaire items were adapted
from other well-established surveys. The PATH study data are collected with the
involvement of two government agencies that have knowledge of research and
substance use issues. These agencies are openly sharing the data so research can
be done to inform the FDA regulatory activities. For documentation, the PATH
study data are accompanied by extensive study-level and dataset-level
documentation. For other resources, the PATH study provides resources for
workshops to train data users to appropriately analyze the data, a
data user forum, and support staff to answer users’ questions. Elaborating
on the first facet of reliability and transparency, the data source: the PATH
study is a model in the development of its study design and questionnaires, with an
emphasis on consistency and standards to foster collaborative research. The PATH
study design is based on the host-agent-vector-environment conceptual model that
has been validated in assessing the impact of several tobacco control policy
interventions. Many of the PATH study questionnaire items were adapted from
well-established existing national surveys and included items from
international, state and privately funded tobacco surveillance surveys. Images of tobacco products were used to help the respondent identify which
products they used, and there was extensive use of tailored question
wording based on responses to previous items. As mentioned earlier, the PATH
study is funded by the FDA and administered by NIDA. Their involvement
provides consistency and standards for quality and accuracy, especially knowing
that the sharing of the data encourages collaborative research and extensive use
that increases scrutiny of the data. Finally, studies like the PATH study
reinforce positive relationships between government funders and the research
community, increasing the potential for other
similar collaborations in the future. The second facet of reliability and
transparency demonstrated by the PATH study is the nature of documentation
available. The PATH study team has documented all aspects of the data that
they believe data users need to know to use the data correctly. The previous
slide listed several types of study-level documentation; here I’m
highlighting the user guides. There’s one for each version of the data, and
their content can be used as a model for the development of other user guides. The user guides contain information about the study design (background, sample design, data collection procedures, response rates, and more), information
about weights and imputation, the structure of the data, variable naming and codes,
information on linking of files, and example code for popular statistical packages. For
the dataset level documentation, I’m highlighting their
annotated questionnaires. Their asked and skipped statements identify who
answered each question, and the explanatory notes, such as display notes,
program notes, and algorithms, show how information was presented to
the participant and give additional information on how to interpret the
variable. The PATH study funding also supports resources for data users. In each of the
past three years, we have had a summer workshop, free of charge
to researchers who wanted to learn more about using the data. Webinars and
workshops on specific data sets or groups of datasets can be an option, so
contact NAHDAP if you are interested. A number of online resources are available
about the PATH study including the PATH study data user forum. This is a platform
where researchers can discuss questions with other users as well as receive
answers directly from NAHDAP staff. Support is also provided to data users
by NAHDAP staff who are available to answer questions about the data and to
assist with applying for access to the restricted-use files. The other principle of data transparency that the PATH study
embodies is that of ease of access. I will discuss two main components. First
is availability of information. It’s best to be clear how data can be accessed and
by whom. As mentioned before, the PATH study data are available as public-use
and restricted-use files. No access fees are charged to use the PATH study data.
Most federally funded data can be downloaded without a fee as cost can
decrease the transparency of data by limiting who can use the data. The PATH
study team also provides support for access to the ICPSR
virtual data enclave so the RUF and BRUF users, those using the restricted-
use data, don’t pay the annual VDE license fee. One of the PATH
study’s greatest advantages is its support services, which include an online
request system to apply for access to the restricted-use data, other online
resources, and support staff to answer user questions. It should always be easy
to find information about the data itself as well as information on how it
can be accessed. The PATH study data are available as public-use files, meaning
that they can be downloaded directly from NAHDAP’s website without first going
through an application process; agreeing to the online Terms of Use is still
required. The public-use files can meet the needs of most researchers. The PATH
study restricted-use files are available to researchers who demonstrate a
research need to access details that cannot be included in a public-use file. For
restricted-use data, the online system walks users through all steps of the
application process and additional information is available in the online
VDE guide. Finally, ease of access is promoted
through readily available support services. The PATH study uses the
ICPSR online system for users to apply for access to the PATH study restricted-
use data. Note that some application information is also sent via an email
correspondence tracking system. The NAHDAP PATH study has its own series page with
information specifically for researchers. The RUF, PUF, and BRUF home
pages and documentation are available online even if the data are restricted-use to provide extensive
information about the study. Citations to publications that utilize the PATH study
data are regularly added to the ICPSR Bibliography so users can know about
research already conducted with the PATH study data. The extensive variable detail
and the PATH study questionnaires allow each variable’s information to be
searched and displayed on screen, and allow comparisons across other PATH
study variables and other studies in ICPSR. As always NAHDAP staff are
available to answer questions about accessing the data including the
application process as well as any other questions about the PATH study data. We
have covered how the PATH study with the help of ICPSR exemplifies the important
principles of data transparency. It does set a high bar especially with resources
like workshops and support staff that may be beyond the means of a smaller
study but this does not mean that your data cannot be reliable and easy to
access. You can also base your study design
on known models and adapt questions from well-established surveys or other
sources. You can provide study-level, dataset-level, and variable-level
documentation sufficient for users to understand the data. You can also
consider a public-use version of the data, if possible, even if a restricted-
use version that has all the details is also needed; the archive can help you
with this. You can let ICPSR know about citations to publications based on use
of your data and contact NAHDAP if you’re interested in a webinar or workshop to
highlight your data, including alongside other data in NAHDAP on the same subject. Finally, it’s organizations like ICPSR
that provide the structure necessary for researchers to share their data in a
transparent process as well as ease of access to those data whether public use
or restricted-use. In addition ICPSR’s resources promote data transparency
through its reliable data storage, online resources, search tools and
training options such as webinars and workshops and our staff support to
assist researchers with using our data. This type of organization allows the
kind of access to studies that may not otherwise be possible, encouraging the
transparent flow of data throughout the research community. If you want
more information or if you have questions we ask that you send questions about
access to the PATH study restricted-use files in ICPSR’s virtual data enclave to
our email address specifically for VDE support: [email protected] We also encourage you to check out our VDE guide which provides a lot of
information about the application process and using the VDE. For other
questions please email us at our generic email address [email protected] We encourage you to attend our five other data transparency sessions
today and just want to remind you that data sets in ICPSR can be used as
a model to make your data transparent as you share it with others, that ICPSR is
your official source for data obtained from the data’s official source,
and that ICPSR provides easy access to data in a variety of
standardized formats. We can now turn to the questions that we have received.
If you haven’t done so yet and you have questions please go ahead and submit any
questions that you have and we will work to answer each one. “Have you had any surprising findings?” Can you elaborate on your question a
little bit? And while you’re doing that, I think, I’ll go ahead and answer another
question. “My study is not as complex as the PATH study. What
information would I want to provide and share for a study that’s less complex?” We typically would like researchers to give us a good description of their
study. There are three main things that we ask for. The first is a good description of your study:
why it was done, the design, so who was interviewed, information about the sample,
the dates of collection. That information can be provided to us through the
deposit form which has fields where you can enter all of the information that we
typically display in a study description on our website. If you have a
journal article or other document that also has that information, you can simply
just upload that document with your deposit form. Regarding the data and
documentation, we would like to have the data deposited that has full labeling,
complete variable labels and value labels, and also missing values declared
or at least a legend on missing values so that we will know to prepare and
inform other data users which are the missing values. Also we
would want the original data collection instrument, whether that’s a survey form
or maybe a form that was used to abstract from records information that
shows how each of the variables was conceptualized. With those three items
we can go a long way and if we’ve received those and we still have
questions then we will contact you for more information. I’m sorry, I will
repeat the question before answering, thank you for the reminder. Another question is “How do I know if a public-use version of my data can be
made available?” Sometimes data contains such confidential information and
detailed information it would not be possible but there are times where just
limiting some variables can lead to coarsening the data enough that a public-
use version can be made. One step is to limit geography or, if there are
continuous variables like age, to make those into categories or to top-code
variables. If you have a data set that you feel would be best restricted-use
and you also are interested in a public-use version, please go ahead and contact
us. You can send your documentation to us in advance even in advance of depositing
the data so it could be your instrument or a data collection, data dictionary and
we’ll take a look at that and give some recommendations on how we think we could make a public-use file. “What are the practical ways and means one could use to effectively
engage African policymakers and National Institute of Statistics in the move to
data transparency and sharing? This is a pressing issue because it’s difficult
for African researchers to access data.” I think one that comes to mind is to
look for models that can be given to these policymakers to show how data can
be shared transparently and then by giving them a way to know how it can be
done. They might be more willing to adopt a model and make the data more
available. Because we agree, we would like the data to be used, that’s why the
archive is here. If there are other ways that it’s difficult for
African researchers to access data, you might go ahead and send
us an email and we can see if we can think more about ways to address those
specific barriers. “Has the study uncovered anything you were surprised by
or what has been your favorite part of working with PATH?” I think one of the
things with working with the PATH study is that it is really so extensive and we
know that it’s obtaining very important information about an area that affects a
lot of people’s lives. There are a lot of tobacco users. The field is changing
in tobacco product use. Even people who are not smokers are often affected by
the use of tobacco products by others. So it just is really great to be involved in making this type of data available. I
think, in looking at the publications, there are really quite interesting things
that are being found with the data. I encourage people to go to the
Bibliography that’s available from the PATH study series page and take a
look at those publications because then you can find out what’s being found with the data. We have one more question here. “I like the
idea of using existing validated scales for my study, where can I find them?” I
think one option, for your field, would be to read publications and other
research that’s been done. Oftentimes they will identify instruments that they
used for collecting their data and then you can Google those scales and find out
how you can also obtain them. I’d like to take this opportunity to mention that
the PATH study questionnaires are not copyrighted, by agreement in the way that
the study was designed. So the questions can be used for other research,
they can be adapted for other research without permission and without paying
any fee. They do request that, if you have adapted
from the questionnaire or used its questions or items, you cite
that your questions came from the PATH study. Another couple of
resources: the National Institutes of Health has a Common Data Elements
project out there. The National Library of Medicine is hosting the CDE
repository and these are where they’re encouraging researchers to use the same
questions for their studies that are being used with other studies so there’s
comparability across studies. If you Google NLM space CDE, for the National
Library of Medicine Common Data Elements, that will connect you with this CDE repository. NIH also has the PhenX Toolkit,
which is capital P, lower h-e-n, capital X: PhenX Toolkit. And that
also similarly is requesting and encouraging researchers to use the same
questions across … in their research that are being used in other research so
we can build a body of comparable data that can be compared across studies on
the same topic. We have time for a few more questions. Well I don’t see any more questions
but again you can always email us if you come up with questions afterwards. As
Linda mentioned, the slides and the video will be available on our YouTube channel.
So you can then get our email addresses, and feel free to email us at any time.
I think at this point we will conclude the webinar.