Domain Adaptation Approaches for Classifying Crisis Related Data on Social Media


Sponsoring Agency
National Science Foundation


The project investigates big data analysis techniques for classifying crisis-related social media data into situational awareness categories, such as caution, advice, fatality, injury, and support, with the goal of helping emergency response teams identify useful information. A major challenge is the scale of the data: millions of short messages are posted continuously during a disaster and must be analyzed. Current technologies based on automated machine learning are of limited use because labeled data is lacking for an emergent target disaster and because every event is unique in terms of geography, culture, infrastructure, technology, and the people involved. To tackle these challenges, the project designs domain adaptation techniques that make use of existing labeled data from prior disasters and unlabeled data from the current disaster. The resulting models are continuously updated and improved based on feedback from crowdsourcing volunteers. The research will provide real, usable solutions to emergency response organizations and will enable these organizations to improve the speed, quality, and efficiency of their response.

The research provides novel solutions based on domain adaptation and deep neural networks to tackle the unique challenges of applying machine learning to crisis-related data analysis, specifically the volume and velocity challenges of big crisis data. Domain adaptation approaches enable the transfer of information from prior source disasters to an emergent target disaster. Deep learning approaches make it possible to employ large amounts of labeled source data and unlabeled target data, and to incrementally update the models as more labeled target data becomes available. Large-scale analysis across combinations of source and target crises will help identify patterns of transferable situational awareness knowledge. The resulting technical and social solutions will be integrated for use in data management and emergency response.
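
The idea of combining labeled source data with unlabeled target data can be illustrated with a minimal sketch. It is not the project's method: it assumes a simple self-training loop around a multinomial Naive Bayes classifier in place of the deep neural networks described above, and the messages, labels, function names, and the 0.8 confidence threshold are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized documents."""
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": Counter(labels), "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict_proba(model, tokens):
    """Return {label: posterior probability} for one tokenized document."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + model["alpha"] * vocab_size
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok in model["vocab"]:  # skip words never seen in training
                score += math.log((counts[tok] + model["alpha"]) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp.values())
    return {k: v / norm for k, v in exp.items()}


def classify(model, tokens):
    """Return the most probable label for one tokenized document."""
    probs = predict_proba(model, tokens)
    return max(probs, key=probs.get)


def self_train(source_docs, source_labels, target_docs, threshold=0.8, rounds=3):
    """Adapt a source-trained model by pseudo-labeling confident target documents."""
    docs, labels = list(source_docs), list(source_labels)
    remaining = list(target_docs)
    model = train_nb(docs, labels)
    for _ in range(rounds):
        still_unlabeled, added = [], 0
        for tokens in remaining:
            probs = predict_proba(model, tokens)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:  # keep only confident pseudo-labels
                docs.append(tokens)
                labels.append(label)
                added += 1
            else:
                still_unlabeled.append(tokens)
        remaining = still_unlabeled
        if added == 0:
            break
        model = train_nb(docs, labels)  # retrain on source + pseudo-labeled target
    return model


# Hypothetical labeled messages from a prior (source) disaster.
source = [("stay away from flooded roads", "caution"),
          ("evacuate the coast now", "caution"),
          ("donate water and food", "support"),
          ("volunteers needed at shelter", "support")]
# Hypothetical unlabeled messages from the current (target) disaster.
target = ["stay away from damaged buildings",
          "donate blankets at shelter",
          "aftershock warning stay away from damaged bridges"]

source_docs = [text.split() for text, _ in source]
source_labels = [label for _, label in source]
target_docs = [text.split() for text in target]

source_only = train_nb(source_docs, source_labels)
adapted = self_train(source_docs, source_labels, target_docs)
print(classify(adapted, "damaged bridges".split()))
```

In this toy setup the source-only model has never seen target-specific words such as "damaged", so it cannot use them as evidence. After self-training, those words enter the vocabulary through the pseudo-labeled target messages. A realistic system would also need to guard against confidently wrong pseudo-labels, for example by drawing on the volunteer feedback mentioned above.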