kcbion.blogg.se - Relational database

#Relational database series

One of the potential benefits of AI is that it can pore through large amounts of data to discover patterns not evident to clinicians. However, this vast volume of data cannot be accommodated within a single spreadsheet (which is how most clinical researchers work when conducting standard multivariable analyses). In fact, many researchers erroneously describe spreadsheets as databases. This article will demonstrate both what a relational database system is and how it is superior to a spreadsheet. It will also discuss considerations when implementing a relational database system (RDBS) for your own research purposes, using an actual lung cancer radiation therapy database as an example. I have also provided some excellent Wikipedia references that contain abundant additional information, beyond what can be encapsulated in a single article. (These, in turn, reference computer science literature for the very intrepid reader, but such references might extend beyond the level of understanding of all but the most technically inclined.) One might question why a database system is necessary for AI research. Performing AI analysis requires efficient storage of hundreds or thousands of data points on a single patient, or even on a single course of radiation therapy. This article will demonstrate that a database enables creation of a multidimensional structure to cleanly and accurately contain these data.

This issue of Advances in Radiation Oncology presents a series of articles on applications of artificial intelligence (AI) in our field. This article provides initial guidance on creating a relational database system.

Databases store data in very efficient ways, minimizing space and memory requirements on the host system. Likewise, databases can be queried or manipulated using a highly complex language called SQL. Consequently, it becomes trivial to cull large amounts of data from a vast number of data fields on very precise subsets of patients. Databases can be quite large (terabytes or more in size), yet are still highly efficient to query. With the explosion of data available in electronic health records and other data sources, databases therefore become increasingly important to contain and order these data. Ultimately, this will enable the clinical researcher to perform artificial intelligence analyses across vast amounts of clinical data in a way heretofore impossible.
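To make the idea of querying a precise subset of patients concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and values (`patient`, `treatment_course`, `stage`, `dose_gy`, and so on) are hypothetical and invented for illustration; they are not taken from the lung cancer database discussed in this article.

```python
import sqlite3

# In-memory database; the schema and rows below are illustrative assumptions only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        age        INTEGER,
        stage      TEXT
    );
    CREATE TABLE treatment_course (
        course_id  INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        dose_gy    REAL,
        fractions  INTEGER
    );
""")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [(1, 64, "IIIA"), (2, 71, "IB"), (3, 58, "IIIA")],
)
conn.executemany(
    "INSERT INTO treatment_course VALUES (?, ?, ?, ?)",
    [(10, 1, 60.0, 30), (11, 1, 20.0, 5), (12, 2, 54.0, 3), (13, 3, 45.0, 15)],
)

# Select only stage IIIA patients, and only their courses of at least 50 Gy.
query = """
    SELECT p.patient_id, p.age, c.dose_gy, c.fractions
    FROM patient AS p
    JOIN treatment_course AS c ON c.patient_id = p.patient_id
    WHERE p.stage = ? AND c.dose_gy >= ?
    ORDER BY p.patient_id
"""
rows = conn.execute(query, ("IIIA", 50.0)).fetchall()
print(rows)
```

A single `SELECT` with a `JOIN` and a `WHERE` clause pulls fields from two tables for exactly the cohort of interest. Doing the same in a spreadsheet would typically require manual filtering and lookups across sheets.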

Although many researchers talk about a “patient database,” they typically are not referring to a database at all, but instead to a spreadsheet of curated facts about a cohort of patients. This article describes relational database systems and how they differ from spreadsheets. At their core, spreadsheets are only capable of describing one-to-one (1:1) relationships. However, this article demonstrates that clinical medical data encapsulate numerous one-to-many relationships. Consequently, spreadsheets are very inefficient relative to relational database systems, which gracefully manage such data. Databases provide other advantages, in that the data fields are “typed” (that is, they contain specific kinds of data). This prevents users from entering spurious data during data import. In addition, because each record contains a “key,” it becomes impossible to add duplicate information (ie, add the same patient twice).
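The three properties described above (one-to-many relationships, typed fields, and keys that block duplicates) can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the schema of any real clinical database: the table and column names (`patient`, `treatment_course`, `mrn`, `dose_gy`) are hypothetical. SQLite is loosely typed by default, so the sketch uses `CHECK` constraints to approximate the typing that many other database engines enforce directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
    -- One row per patient; the key (mrn) must be unique.
    CREATE TABLE patient (
        mrn        TEXT PRIMARY KEY,
        birth_year INTEGER NOT NULL CHECK (typeof(birth_year) = 'integer')
    );
    -- Many courses may point at one patient (a one-to-many relationship).
    CREATE TABLE treatment_course (
        course_id INTEGER PRIMARY KEY,
        mrn       TEXT NOT NULL REFERENCES patient(mrn),
        dose_gy   REAL NOT NULL CHECK (typeof(dose_gy) = 'real')
    );
""")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


add_patient = "INSERT INTO patient (mrn, birth_year) VALUES (?, ?)"
add_course = "INSERT INTO treatment_course (mrn, dose_gy) VALUES (?, ?)"

first_patient = try_insert(add_patient, ("A001", 1955))      # accepted
duplicate_patient = try_insert(add_patient, ("A001", 1955))  # rejected: same key
course_1 = try_insert(add_course, ("A001", 60.0))            # accepted
course_2 = try_insert(add_course, ("A001", 20.0))            # accepted: second course
bad_dose = try_insert(add_course, ("A001", "sixty"))         # rejected: not a number

n_courses = conn.execute(
    "SELECT COUNT(*) FROM treatment_course WHERE mrn = ?", ("A001",)
).fetchone()[0]
print(first_patient, duplicate_patient, bad_dose, n_courses)
```

In a spreadsheet, the second treatment course would force either a duplicated patient row or an extra set of columns. Here it is simply one more row in `treatment_course`, while the patient is stored exactly once.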