Cross-Domain Urban Data Mining
According to the 2010 U.S. Census, about 80.7% of the U.S. population lives in urban areas. Urbanization has modernized people's lives but has also generated many urban issues related to traffic congestion, air pollution, health, education, and quality of life. Meanwhile, with rapid progress in sensing technologies and the widespread use of digital documentation, an increasing amount of urban data is being accumulated in digital form, including human traces, traffic, air quality, local events, vehicle collisions, noise reports, and much more. Many cities in the U.S. (e.g., New York City, Chicago, and Los Angeles) have joined the open data initiative and created websites to release city data to the public. Such big data holds rich knowledge about a city and could empower us to address many critical urban challenges.
This project develops novel data mining techniques to help people uncover the complex correlations in big urban data. While each type of urban data has previously been analyzed within its own domain, we lack a principled approach for integrating and analyzing data collected from different domains in order to better understand urban issues from multiple perspectives. The project investigates systematic solutions to integrate and model urban data, discover hidden patterns, and present and visualize the results in an interpretable way. The key innovation lies in how to effectively harness heterogeneous urban data and learn mutually reinforcing knowledge from it. The project explores motivating real-world problems from other research fields such as social science, transportation, ecology, and urban planning, and promises interdisciplinary impact in these fields. Ultimately, this project strives to advance the techniques of urban computing, a nascent interdisciplinary research field that addresses the challenges and opportunities of fast-evolving urban environments.