If you need preliminary data to help determine whether the data can support your research question, let us help you obtain counts for your population of interest. This information can be used for hypothesis generation and as preliminary data for grant documentation. Counts will be provided in aggregate format at the population level. We can check the feasibility based upon your inclusion/exclusion criteria or for the type of data elements you are interested in.

Informatics for Integrating Biology & the Bedside (i2b2) is designed to create a comprehensive software and methodological framework enabling clinical researchers to accelerate the translation of genomic and “traditional” clinical findings into novel diagnostics, prognostics, and therapeutics. The i2b2 Query Tool is provided by Regenstrief Institute in partnership with the Indiana CTSI and is a web-based tool for de-identified clinical data queries. Locally, i2b2 queries a clinical data warehouse populated with variables such as lab results, diagnoses, and medications from two large health systems. With i2b2, an investigator can:

• view clinical data elements and the data dictionary.
• quickly determine the feasibility of conducting a research study or clinical trial.
• iteratively improve hypothesis generation.
• identify patient cohorts for potential recruitment.


Regenstrief can create data sets for analysis. We will work with you to define the variables you need from the electronic medical record system and create a data set that can be analyzed by a biostatistician.

The Data Core has direct access to the appointment systems within Eskenazi and IU Health. Patients can be identified based on diagnoses, labs, and medications. Regenstrief works closely with the recruiters to adapt the recruitment lists to their needs.

The Data Services team is happy to assist with preparation of the grant documentation related to the data sources, methods for data extraction, data security, and data sharing. We can also help clarify items related to protocol development and suggest proper wording for IRB submissions. The amount of technical detail required varies by protocol, and we’re happy to work with you to produce customized documentation. In addition, we can provide a letter of support from Data Services that summarizes the data available and is customized to your project.

The RDS team creates and maintains commonly requested datasets, pulling specific data elements and matching with various sources. Currently available data sets include registries on Diabetes, Traumatic Brain Injuries and Sexually Transmitted Infections. Custom registries can also be developed.

Regenstrief can identify patients who can be approached for research based on their conditions.

Regenstrief Institute has worked closely with the Polis Center to geocode all the addresses within the INPC system. These data can be linked to social determinants of health.

Receive guidance in developing a standard set of electronic medical record data elements for identifying patients for a cohort.

RDS uses HIPAA Safe Harbor standards to de-identify all datasets, removing individually identifiable health information.
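
As a rough illustration of what Safe Harbor-style de-identification involves, the sketch below drops direct identifiers and generalizes dates, advanced ages, and ZIP codes. It is only a minimal example under stated assumptions: the field names and the `deidentify` helper are hypothetical, it covers just a few of the 18 Safe Harbor identifier categories, and it is not a description of how RDS actually processes data.

```python
from datetime import date

# Hypothetical field names chosen for illustration; a real extract
# would follow the source system's own schema.
DIRECT_IDENTIFIERS = {"name", "mrn", "ssn", "phone", "email", "street_address"}


def deidentify(record: dict) -> dict:
    """Return a copy of `record` with a few Safe Harbor-style rules applied."""
    # Drop fields that directly identify a person.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Reduce dates to the year only.
    if isinstance(out.get("birth_date"), date):
        out["birth_year"] = out.pop("birth_date").year

    # Collapse ages over 89 into a single category, and drop the
    # birth year for those patients because it would reveal the age.
    if isinstance(out.get("age"), int) and out["age"] > 89:
        out["age"] = "90+"
        out.pop("birth_year", None)

    # Keep only the first three ZIP digits. (Safe Harbor also requires
    # masking three-digit areas with 20,000 or fewer residents; that
    # population check is omitted from this sketch.)
    if "zip" in out:
        out["zip3"] = str(out.pop("zip"))[:3]

    return out
```

In this sketch, clinical fields such as a diagnosis code pass through unchanged, while identifying fields are removed or coarsened before the data set leaves the warehouse.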

Data analysts can provide the necessary data to train, test and validate a machine learning algorithm. The technical services team can assist in operationalizing machine learning models.

The technical services team will integrate research-driven applications with EMR platforms, customizing the programming to the desired system or systems.

The technical services team will facilitate the further development of minimum viable product applications, either internally or through established partnerships with third-party vendors that have software engineering capabilities, while continuing to provide technical support and project management services.

Developers will create minimum viable product software, built to your specifications, to answer research questions. The developer will work with you to customize the proof-of-concept technology.

Data analysts can mine unstructured textual data (such as physician notes, operative reports and pathology reports) alongside structured discrete data to create more precise patient cohorts. This capability significantly improves the accuracy and effectiveness of patient identification for research, quality assessment and improvement initiatives.

Data analysts will match various nonclinical data sources with electronic medical records to create a research dataset. Examples include survey data and social determinants data.

Data analysts gather requested data from disparate sources and match fields to standards so the dataset can be analyzed as a whole.


“RDS has always been responsive to the needs of my projects. My overall experience has been excellent. Admittedly most of my work with the data core has been through funded studies and working with a specific team that was assigned to me, meaning there was a development of group memory regarding aspects of the diseases and data sources specifically applicable to musculoskeletal work. This has actually been critical to the efficiency and success of our work.”

— Erik Imel, Associate Professor of Medicine and Pediatrics, Endocrinology
Indiana Center for Musculoskeletal Health