• Pediatr Crit Care Me · Sep 2020

    Multicenter Study

    Joining Datasets Without Identifiers: Probabilistic Linkage of Virtual Pediatric Systems and PEDSnet.

    • Adam C Dziorny, Robert B Lindell, Tellen D Bennett, and L Charles Bailey.
    • Department of Biomedical Informatics, Children's Hospital of Philadelphia, Philadelphia, PA.
    • Pediatr Crit Care Me. 2020 Sep 1; 21 (9): e628-e634.

    ObjectivesTo 1) probabilistically link two important pediatric data sources, Virtual Pediatric Systems and PEDSnet, 2) evaluate linkage accuracy overall and in patients with severe sepsis or septic shock, and 3) identify variables important to linkage accuracy.DesignRetrospective linkage of prospectively collected datasets from Virtual Pediatrics Systems, Inc (Los Angeles, CA) and the PEDSnet consortium.SettingSingle-center academic PICU.PatientsAll PICU encounters between January 1, 2012, and December 31, 2017, that were deterministically matched between the two datasets.InterventionsNone.Measurements And Main ResultsWe abstracted records from Virtual Pediatric Systems and PEDSnet corresponding to PICU encounters and probabilistically linked using 44 features shared by the two datasets. We generated a gold standard deterministic linkage using protected health information elements, which were then removed from datasets. We then calculated candidate pair log-likelihood ratios for all pairs of subjects and selected optimal pairs in a two-stage algorithm. A total of 22,051 gold standard PICU encounter pairs were identified over the study period. The optimal linkage model demonstrated excellent discrimination (area under the receiver operating characteristic curve > 0.99); 19,801 cases (89.9%) were matched with 13 false positives. The addition of two protected health information dates (admission month, birth day-of-year) increased to 20,189 (91.6%) the cases matched, with three false positives. Restricting to patients with Virtual Pediatric Systems diagnosis of severe sepsis or septic shock (n = 1,340 [6.1%]) matched 1,250 cases (93.2%) with zero false positives. Increased number of laboratory values present in the first 12 hours of admission significantly increased log-likelihood ratios, suggesting stronger candidate pair matching.ConclusionsWe demonstrated the use of probabilistic linkage to accurately join two complementary pediatric critical care datasets at a single academic PICU in the absence of protected health information. Combining datasets with curated diagnoses and granular measurements can validate patient acuity metrics and facilitate multicenter machine learning algorithms. We anticipate these methods will generalize to other common PICU diagnoses.

      Pubmed     Full text   Copy Citation     Plaintext  

      Add institutional full text...

    Notes

     
    Knowledge, pearl, summary or comment to share?
    300 characters remaining
    help        
    You can also include formatting, links, images and footnotes in your notes
    • Simple formatting can be added to notes, such as *italics*, _underline_ or **bold**.
    • Superscript can be denoted by <sup>text</sup> and subscript <sub>text</sub>.
    • Numbered or bulleted lists can be created using either numbered lines 1. 2. 3., hyphens - or asterisks *.
    • Links can be included with: [my link to pubmed](http://pubmed.com)
    • Images can be included with: ![alt text](https://bestmedicaljournal.com/study_graph.jpg "Image Title Text")
    • For footnotes use [^1](This is a footnote.) inline.
    • Or use an inline reference [^1] to refer to a longer footnote elseweher in the document [^1]: This is a long footnote..

    hide…