This article offers a step-by-step tutorial on the information you will need to include in your IRB application to use secondary data that contain identifying information in your research project. Identifying information means that the data include either direct identifiers, such as names, addresses, dates of birth, social security numbers, medical record numbers, or other identifying information, or indirect identifiers, such as a Study ID or code that may be linked back to an individual person. Many national survey data sources are de-identified, but some other sources of secondary data, such as electronic health records or medical claims data, may include identifying information and/or protected health information. Still, with the right protections in place, most of these studies are considered “Exempt” from IRB oversight.
An additional guide to navigating the IRB when using secondary data that only includes de-identified information is linked below.
Please note that this tutorial is only applicable to the University of Michigan IRB (eResearch) system and may not apply to other institutions. The language included below is only a suggestion and may need to be modified depending upon your project needs. There are additional questions that apply only if you are using Protected Health Information (PHI). I’ve included them below, but if you are not using PHI in your study, you will not need to answer those questions. If you have specific questions or would like further consultation, please email our team at efdc-datadesign@umich.edu.
-
Questions 1.1-1.3 are standard study questions that ask for your study title, name of Principal Investigator and study team members.
-
Question 1.8 asks for a Project Summary. This should include a summary of your research question, hypothesis and relevant background information. Here it is important to include that your project is secondary data analysis, information on the data source, what level of identifying information is available to you as a researcher and what you plan on using in your research, why it is necessary for you to access this level of identifying information, and a brief description of your plan for data security. There will be additional questions about this later on, so you can be brief.
-
Question 1.9 asks which IRB this application should go to. This depends on whether your affiliation is with Michigan Medicine or the Medical School (IRB-MED) or with campus (IRB-HSBS).
-
Question 1.11 asks for study duration. I think it is best to be generous with this estimate in case of delays.
Next page
Next page
-
Question 1: This asks whether your study will involve data or biospecimens (or both). Select whichever is relevant to your study.
-
Question 2: This question asks if your study requires the full approval of the IRB committee, as opposed to an exempt determination. In most cases, if your study is minimal risk, the answer to this will be no. However, if your study carries additional risks to privacy or accesses sensitive, restricted data, you may need to answer yes.
-
Question 3: This question asks if anyone on the study team had a role in the data collection and still maintains the ability to link the study back to the subjects. Typically, the answer to this would be no, unless your team was directly involved in data collection.
-
Question 4: This question asks if subjects’ identities can be readily ascertained from the data you will access, either through direct or indirect (coded) information. Since you are accessing identifying information, the answer to this will most likely be yes.
-
Question 4.1: This question asks if your team will record any identifying information about subjects, including indirect or coded information. Since you are accessing identifying information, the answer to this will most likely be yes.
-
Questions 5-7 will depend on the focus of your study.
-
Question 8 asks if any of the identifying information accessed is considered “Protected Health Information”. This depends on the study, but if you are accessing electronic health records or medical claims data, the answer would be yes.
-
Question 8.1 is an attestation regarding the use of HIPAA-protected data.
-
Question 12 asks for a brief summary of your research plan. Here, I recommend restating that your project is secondary data analysis, information on the data source, what level of identifying information is available to you as a researcher and what you plan on using in your research, why it is necessary for you to access this level of identifying information, and a brief description of your plan for data security. There will be additional questions about this later on, so you can be brief.
Next page
-
Questions 1-2.1-2.9 are standard study questions related to who initiated the study, if there are students working on the project, if the study is related to cancer or cancer risk, if the merits of the study have been reviewed and if it is a clinical trial. The answers to these questions depend upon the focus of your specific project.
Next page
-
Questions 2.1-2.4 ask about the financials of the project, including the Proposal Approval Form (PAF). For secondary data projects, if you do not have external funding, you will not need a PAF.
Next page
Next page
-
Question 24: This question will ask you about the data sources you will be using for secondary data analysis. Once you click “add”, it will open a new window with additional questions.
-
Question 24.2 asks for the name and location (where it is stored) of the dataset.
-
Question 24.3 asks about the information included in the dataset, including any subject identifiers. Be as thorough here as possible, including any data you think you might need for your study.
-
Question 24.4 will ask you to confirm which level of identifiers you will receive for the study (direct or coded).
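Before answering Questions 24.3 and 24.4, it can help to inventory the variables in your dataset by identifier level. The snippet below is only a minimal sketch of that idea, not part of eResearch or any IRB tool. The column names and the two identifier lists are hypothetical examples based on the identifier types described at the top of this article, so you would need to replace them with the actual variables in your data source.

```python
# Hypothetical identifier lists (assumptions for illustration only).
# Replace them with the variables that actually appear in your dataset.
DIRECT_IDENTIFIERS = {
    "name",
    "address",
    "date_of_birth",
    "ssn",
    "medical_record_number",
}
INDIRECT_IDENTIFIERS = {"study_id"}


def classify_columns(columns):
    """Group column names into direct, indirect (coded), and other."""
    groups = {"direct": [], "indirect": [], "other": []}
    for column in columns:
        key = column.strip().lower()
        if key in DIRECT_IDENTIFIERS:
            groups["direct"].append(column)
        elif key in INDIRECT_IDENTIFIERS:
            groups["indirect"].append(column)
        else:
            groups["other"].append(column)
    return groups


def identifier_level(groups):
    """Return the most identifying level present: direct, coded, or none."""
    if groups["direct"]:
        return "direct"
    if groups["indirect"]:
        return "coded"
    return "none"


# Hypothetical column names for a claims-style extract.
example_columns = ["study_id", "date_of_birth", "diagnosis_code", "visit_count"]
example_groups = classify_columns(example_columns)
print(example_groups)
print(identifier_level(example_groups))
```

A list like this can double as the starting point for your written answer to Question 24.3, and the reported level suggests which option to confirm in Question 24.4. Treat the output as a drafting aid only; the IRB's own definitions of direct and coded identifiers take precedence.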
-
Question 24.5 asks you to upload any relevant Data Sharing Agreements or Data Dictionaries. This is optional, but a good idea to include if you have it available.
Next page
-
Question 25.1 asks you to select any source of HIPAA-related data that you are using. This is only relevant if you are accessing HIPAA-related information. If you are using Michigan Medicine’s DataDirect, Biorepository, or another Michigan Medicine data source, you can select “Michigan Medicine hybrid covered entity”.
Next page
-
Question 25-1.1 asks you to select all sources of Protected Health Information (PHI) that you will access for the study. Select all that apply.
-
Question 25-1.2 asks why this PHI listed in the previous question is needed for the study. Your answer will depend on the focus of your study, but should make it clear that your research question cannot be answered without access to the PHI.
-
Question 25-1.3 asks if you will seek HIPAA authorization for access to the data for your study. This will vary by study, but if patients have already consented to share their data and the study team has no interaction with subjects, you can generally put no.
-
Question 25-1.3.2 asks what alternative will be used if you are not seeking HIPAA authorization. There are different options here, but you will likely want to select “Request for full or partial waiver of HIPAA authorization to be approved by U-M IRB or Privacy Board”.
Next page
-
Question 25-2.1: You will want to select that you are seeking a waiver of authorization for the “Entire Project”.
-
Question 25-2.2 asks you to describe the plan to protect patient-subject identifiers from improper use or disclosure to ensure your research use of the PHI involves no greater than minimal risk to privacy. This will depend on your study protocol, but generally it is best practice to describe data protections such as HIPAA-compliant computing options (for example, Dropbox Teams or Turbo storage), two-factor authentication, and restricting data access to the listed study team members.
-
Question 25-2.3 asks about your plan to destroy patient-subject identifiers at the earliest opportunity consistent with the research. Generally, you will want to maintain identifiers for at least one year post-publication and then have a plan for permanently deleting them, though this may vary depending upon your specific needs.
-
Question 25-2.4: Here you will want to affirm that the PHI will not be reused or disclosed to an
-
Question 25-2.5: This question asks why the research could not practicably be conducted without the waiver of HIPAA authorization. In the case of large existing datasets, participants should have already consented to share their data, and it is generally not feasible to re-consent all participants for each individual study.
-
Question 25-2.6 asks why the research could not be done without accessing PHI, and here you will want to restate the importance of PHI for your research question.
-
Question 25-2.7 asks if the data containing PHI will be shared outside of the University of Michigan.
Next page
After that, the system will prompt you to do an error check and you should be all set to submit your application!
If you run into additional issues, or need help, please contact our team at efdc-datadesign@umich.edu.