Friday, November 26, 2010

Diseases and Disasters (Dis2): Riff’s Performance

A couple of years ago, I introduced Riff, a hybrid (event-based and indicator-based) disease surveillance platform, at the International Society for Disease Surveillance (ISDS) [the original concept is fully described here; more up-to-date information can be found on InSTEDD's website here]. Riff was designed to streamline collaboration among human experts dealing with diverse streams of information, and to equip them with machine learning algorithms that learn quickly and accurately classify that information for the detection and prediction of, and response to, health-related events (such as disease outbreaks or pandemics). On January 17, 2010, the Thomson Reuters Foundation used Riff [after adopting it in their EIS, an Emergency Information Service for survivors of natural disasters (early work can be found in Nico’s Blog here)] to launch a first-of-its-kind, free disaster-information service for the people of Port-au-Prince, Haiti. The service allowed survivors of Haiti's earthquake to receive critical information by text message directly on their phones, free of charge.

Earlier this year, in April 2010, I set up a Riff space, Diseases and Disasters (or Dis2), to provide timely situational awareness from credible and reliable [a gold standard of sorts] online reports on diseases and disasters around the world, provided primarily by BioCaster and HealthMap. This already tagged and verified information gave Riff’s classifier a great training opportunity, which I had to monitor closely and correct at the beginning, both for features (e.g., condition, type of disease transmission, severity) and for the geo-location of where each event actually occurred, as shown here:

[Figure: Percentage of conditions tracked in Dis2, proportional to their coverage, since April 19, 2010]
[Figure: Location (heatmap) of conditions tracked in Dis2, proportional to their coverage, since April 19, 2010]


In collaboration with the Humanitarian FOSS Project, which is supported by a group of computing faculty and open source proponents at Trinity College, Wesleyan University, and Connecticut College (Open Source ALPACA Light Parsing And Classifying Application (ALPACA) and Open Source e-dop for Disease Ontology Prediction for Riff), we developed a Support Vector Machine (SVM) classifier for automatic feature extraction, data classification and tagging. Riff’s classifier performed very well in the Dis2 collaborative space with very little training at the outset: an 83% (95% CI: 81-85%) true positive rate, as shown here:

[Figure: Riff's performance in the Dis2 collaborative space (true positive and false negative ratios)]
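For readers who want to see how a true positive rate and its 95% confidence interval can be derived from raw counts, here is a minimal Python sketch. It uses the simple normal-approximation (Wald) interval, which may not be the method behind the figures quoted above. The function name and the counts in the example (830 correctly tagged items, 170 missed ones) are hypothetical and chosen only for illustration; they are not the actual Dis2 data.

```python
import math


def true_positive_rate_ci(true_pos, false_neg, z=1.96):
    """Return (rate, low, high) for the true positive rate.

    Uses the normal-approximation (Wald) interval; z=1.96 gives
    roughly 95% coverage. This is an illustrative assumption, not
    necessarily how Riff's reported interval was computed.
    """
    n = true_pos + false_neg
    if n == 0:
        raise ValueError("need at least one positive item")
    rate = true_pos / n
    half_width = z * math.sqrt(rate * (1 - rate) / n)
    # Clamp to [0, 1] so the interval stays a valid proportion.
    return rate, max(0.0, rate - half_width), min(1.0, rate + half_width)


# Hypothetical counts, NOT the real Dis2 numbers.
rate, low, high = true_positive_rate_ci(true_pos=830, false_neg=170)
print(f"True positive rate: {rate:.1%} (95% CI: {low:.1%} to {high:.1%})")
```

With small samples or rates close to 0% or 100%, a Wilson interval is usually a safer choice than the Wald interval used here.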


Riff is an open source project; you can download its source code here and help further enhance its performance.

Acknowledgment

