• Dtsch Arztebl Int · Feb 2015

    Review

    Big data in medical science--a biostatistical view.

    • Harald Binder and Maria Blettner.
    • Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI) at the University Medical Center of the Johannes Gutenberg University Mainz.
    • Dtsch Arztebl Int. 2015 Feb 27; 112 (9): 137-42.

    BackgroundInexpensive techniques for measurement and data storage now enable medical researchers to acquire far more data than can conveniently be analyzed by traditional methods. The expression "big data" refers to quantities on the order of magnitude of a terabyte (1012 bytes); special techniques must be used to evaluate such huge quantities of data in a scientifically meaningful way. Whether data sets of this size are useful and important is an open question that currently confronts medical science.MethodsIn this article, we give illustrative examples of the use of analytical techniques for big data and discuss them in the light of a selective literature review. We point out some critical aspects that should be considered to avoid errors when large amounts of data are analyzed.ResultsMachine learning techniques enable the recognition of potentially relevant patterns. When such techniques are used, certain additional steps should be taken that are unnecessary in more traditional analyses; for example, patient characteristics should be differentially weighted. If this is not done as a preliminary step before similarity detection, which is a component of many data analysis operations, characteristics such as age or sex will be weighted no higher than any one out of 10 000 gene expression values. Experience from the analysis of conventional observational data sets can be called upon to draw conclusions about potential causal effects from big data sets.ConclusionBig data techniques can be used, for example, to evaluate observational data derived from the routine care of entire populations, with clustering methods used to analyze therapeutically relevant patient subgroups. Such analyses can provide complementary information to clinical trials of the classic type. As big data analyses become more popular, various statistical techniques for causality analysis in observational data are becoming more widely available. This is likely to be of benefit to medical science, but specific adaptations will have to be made according to the requirements of the applications.

      Pubmed     Free full text   Copy Citation     Plaintext  

      Add institutional full text...

    Notes

     
    Knowledge, pearl, summary or comment to share?
    300 characters remaining
    help        
    You can also include formatting, links, images and footnotes in your notes
    • Simple formatting can be added to notes, such as *italics*, _underline_ or **bold**.
    • Superscript can be denoted by <sup>text</sup> and subscript <sub>text</sub>.
    • Numbered or bulleted lists can be created using either numbered lines 1. 2. 3., hyphens - or asterisks *.
    • Links can be included with: [my link to pubmed](http://pubmed.com)
    • Images can be included with: ![alt text](https://bestmedicaljournal.com/study_graph.jpg "Image Title Text")
    • For footnotes use [^1](This is a footnote.) inline.
    • Or use an inline reference [^1] to refer to a longer footnote elseweher in the document [^1]: This is a long footnote..

    hide…

What will the 'Medical Journal of You' look like?

Start your free 21 day trial now.

We guarantee your privacy. Your email address will not be shared.