By Lana M. Pasek, EdM, MSN, RN, ANP-BC, UB SON PhD student and Shageenth Sandrakumar, BS Engineering, UB graduate data science student
In the fall 2018 semester, Shageenth and I had the unique opportunity to take part in an interdisciplinary graduate course called Data Science Applications in Health Care, which introduces students to data science and provides hands-on learning using existing electronic health data. The class comprised UB graduate students from the Schools of Nursing and Computer Science and Engineering, along with NEXUS nursing graduate students from the West Coast.
One of the expected outcomes of the course is for students to learn how to better work in collaborative, interprofessional teams while using large databases to improve population health.
The class was divided into three work groups; each group included a student who had experience with computer programs/engineering and two other students who were nurses with substantive health care knowledge/experience.
Our interdisciplinary group consisted of Shageenth, a UB-trained engineer and computer science graduate student; a NEXUS PhD nursing student and acute care nurse practitioner; and myself, an adult nurse practitioner with a wide range of clinical health care experience, including insurance claims.
Data science describes the process of asking questions and of analyzing and manipulating large data sets in search of patterns and knowledge, using diverse mathematical models and methods (Sullivan, 2018).
With the great advances of technology and computer science, scientists now have quick access to large amounts of data from a variety of digitized sources, such as banking information and electronic health records. There is also an increased capacity to store and speedily retrieve huge amounts of data electronically. This speed, coupled with organization and labeling of the data within the software programs, makes it easy to use the data for different analyses.
Health care research focusing on improved management and delivery of care is quickly evolving with the accessibility of large databases. This requires interdisciplinary teamwork of computer software engineers, data scientists, mathematicians, statisticians and those with substantive area expertise, like nurses (Grus, 2015).
Nurses are experts in clinical knowledge and wisdom, which emerges from the understanding and intimate knowledge of clinical practice (Benner, Hooper Kyriakidis, & Stannard, 2011). Nurses are, therefore, ideal contributors to data science health research.
Nurse scientists use data science to uncover patterns, associations or factors related to patient outcomes in large datasets, usually electronic health record data. Nurse scientists develop predictive models to identify risks for adverse health outcomes such as infection or mortality. Data science is also used to develop, assess and evaluate patient outcomes by way of clinical decision support tools, health portals and care coordination activities (Westra et al., 2017, as cited in Park et al., 2017).
Data science research starts with a research question. A review of the literature then establishes the incidence and prevalence in health care of the problem the question addresses.
The next step is to define the data to be analyzed. These definitions include the data source and dates. Defined data elements are located within a data registry, which can be derived from clinical codes.
The data registry also includes data that may not be necessary for the analysis. The strategy is to request broadly at first, asking for as much data as you think you might need, and then to use data cleaning to eliminate duplicates and data not related to the research question.
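The cleaning step described above can be sketched in a few lines of Python. This is only an illustration, not the method used in the course: the records are made up, and the field names (`patient_id`, `visit_date`, `code`), the diagnosis codes and the date window are all assumptions chosen for the example.

```python
from datetime import date

# Made-up extract for illustration only; field names and codes are hypothetical.
records = [
    {"patient_id": 1, "visit_date": date(2018, 3, 1), "code": "G35"},
    {"patient_id": 1, "visit_date": date(2018, 3, 1), "code": "G35"},   # exact duplicate
    {"patient_id": 2, "visit_date": date(2018, 5, 9), "code": "J45"},   # unrelated code
    {"patient_id": 3, "visit_date": date(2016, 1, 4), "code": "G35"},   # outside the date window
    {"patient_id": 4, "visit_date": date(2018, 9, 20), "code": "G35"},
]


def clean(rows, codes_of_interest, start, end):
    """Drop exact duplicates, then keep only rows relevant to the question."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["patient_id"], row["visit_date"], row["code"])
        if key in seen:
            continue  # duplicate of a row already processed
        seen.add(key)
        in_window = start <= row["visit_date"] <= end
        if row["code"] in codes_of_interest and in_window:
            kept.append(row)
    return kept


cleaned = clean(records, {"G35"}, date(2018, 1, 1), date(2018, 12, 31))
print([row["patient_id"] for row in cleaned])
```

Of the five rows requested, only two survive: the duplicate, the unrelated code and the out-of-window visit are removed.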
Next in the process is to identify independent and dependent variables and the applicable data science approaches. There are numerous data science approaches; the most important for health care data analytics is clinical prediction. Clinical prediction approaches try to uncover the underlying relationship between attributes and a dependent variable.
Data science approaches help data scientists identify patterns in massive data sets using predictive analytics to answer questions like, “What is odd about my data?” or “What will happen next?” (Sullivan, 2018).
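To make the idea of relating attributes to a dependent variable concrete, here is a minimal sketch of one common clinical prediction technique, logistic regression, fitted with plain gradient descent. The choice of logistic regression is ours for illustration; the course material summarized here does not name a specific model. The data are fabricated toy values, and the attributes (age in decades, number of prior admissions) and the outcome (readmission) are assumptions.

```python
import math

# Fabricated toy data for illustration only.
# Attributes: (age in decades, prior admissions). Outcome: 1 = readmitted, 0 = not.
X = [(4.0, 0), (5.0, 0), (5.5, 1), (6.0, 1), (7.0, 2), (7.5, 3), (8.0, 2), (4.5, 0)]
y = [0, 0, 0, 1, 1, 1, 1, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_risk(weights, bias, attributes):
    """Predicted probability of the outcome for one patient."""
    z = bias + sum(w * a for w, a in zip(weights, attributes))
    return sigmoid(z)


def fit(X, y, learning_rate=0.05, epochs=5000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for attributes, outcome in zip(X, y):
            error = predict_risk(weights, bias, attributes) - outcome
            for j, a in enumerate(attributes):
                grad_w[j] += error * a
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


weights, bias = fit(X, y)
low = predict_risk(weights, bias, (4.0, 0))   # younger, no prior admissions
high = predict_risk(weights, bias, (8.0, 3))  # older, several prior admissions
print(f"low-risk profile: {low:.2f}, high-risk profile: {high:.2f}")
```

The independent variables are the attributes in `X`, the dependent variable is `y`, and the fitted weights are the model's estimate of how each attribute relates to the outcome. A real study would use far more data, hold out a test set and involve clinicians in judging whether the learned relationships make sense.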
This class has helped to shape my PhD trajectory and activities. It has contributed in further developing my research interests of patient-reported outcomes and the resulting data that can directly be utilized by providers to manage conditions such as multiple sclerosis.
What I learned inspired me to attend the Nurses Knowledge 2019 Big Data Science conference at the University of Minnesota, where I joined national networks of nurses working on issues related to coordination of care and mental health utilizing big data science methods.
Most importantly, I have genuinely enjoyed working with Shageenth, with whom I otherwise never would have thought of collaborating and from whom I have learned a great deal.
The experience of working with a nurse practitioner was exciting. My usual job roles place me behind a desk doing heavy computational work. It was nice working with a person who has experience on the floor and can directly relate to patients.
Working with actual data can be messy – the storage of information and the way the data is coded can create ambiguity in its information. Also, while looking through the data and the outliers, it’s good to have a person who has been working in the field to give insights on why various algorithms fail in certain situations.
I personally enjoyed doing this project because it gave me experience in a more collaborative setting. My favorite part of the course was feature engineering: designing the features that should be fed into the system.
This course also sparked my interest to change my graduate study program to the Data Science Master’s Program here at UB. I hope to work on more projects in the medical field that can help me explore this exciting new field of data science.
Benner, P., Hooper Kyriakidis, P., & Stannard, D. (2011). Clinical wisdom and interventions in acute and critical care: A thinking-in-action approach (2nd Ed.). New York, NY: Springer Publishing.
Delaney, C. W., & Weaver, C. (2018). 2018 nursing knowledge big data science initiative. CIN: Computers, Informatics, Nursing, 36(10), 473-474.
Grus, J. (2015). Data science from scratch. Sebastopol, CA: O'Reilly Media Inc.
Hewner, S. (2018). NGC 602 Data Science Applications in Health Care. [Syllabus]. Buffalo, NY: School of Nursing, University at Buffalo.
Reddy, C. K. & Li, Y. (2015). A review of clinical prediction models. In C. K. Reddy & C. C. Aggarwal (Eds.), Healthcare data analytics (pp. 344-371). New York, NY: CRC Press.
Sullivan, S. (2018, December 19). Strength in numbers: Using data science to advance nursing knowledge [Web log post]. Retrieved from http://nursing.buffalo.edu/news-events/nurses-report.host.html/content/shared/nursing/articles/nurses-report/posts/big-data-science-analytics.detail.html
Westra, B. L., Sylvia, M., Weinfurter, E. F., Pruinelli, L., Park, J. I., Dodd, D. ... Delaney, C. W. (2017). Big data science: A literature review of nursing research exemplars. Nursing Outlook, 65(5), 549-561.