Research Project

Spatiotemporal Data Management and Smart Application

Abstract

As part of the CSE department at HKUST, the STC (spatial-temporal crowdsourcing) group conducts research on spatiotemporal data management and spatiotemporal applications, from algorithm design to application development. Prof. Lei Chen’s research is motivated by new technologies and applications: it mines and learns spatiotemporal knowledge and exploits the wisdom of crowds to facilitate people’s daily lives. The group is supported by several funding sources, including the RGC Theme-based Research Scheme, the Guangdong Natural Science Foundation, and the Ministry of Science and Technology of the People’s Republic of China. The recent outcomes of Prof. Lei Chen’s STC group fall into three categories: human-powered data collection, learned-index-based query processing, and a unified approach to route planning in smart transportation.

Human-Powered Data Collection

For human-powered data collection, Prof. Chen’s STC group collaborates with WeChat (at Tencent) to exploit the wisdom of humans and boost the performance of machine learning models. Although many accurate models have recently been developed to recognize images, translate languages, and compose songs (some are even more accurate than humans at image recognition), humans are still better than machines at discovering errors, learning new items, and perceiving emotion. The group has studied methods that combine the strengths of humans and machines, using human feedback to iteratively improve current models and thereby build better machine learning models that benefit daily life.

Learned Index based Query Processing

As data volume grows exponentially, data indexing is becoming a key component of database management systems, because indexing can significantly reduce query processing cost. To support more efficient queries over the collected data, Prof. Lei Chen’s STC group also explores how to design lightweight learned indexes and scalable query processing algorithms. For example, HAP is an efficient learned index on Hamming space that supports both Hamming range queries and kNN queries, and the Learned Multidimensional Histogram (LHist) is a learned data synopsis that supports approximate range aggregation queries.
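To make the two query types concrete, the following minimal sketch defines Hamming range and kNN queries with a plain linear scan over integer-encoded binary codes. It is not HAP and contains no learned component; it only illustrates the query semantics that an index on Hamming space would need to answer faster. The function names and the sample codes are hypothetical.

```python
from typing import List


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def hamming_range_query(codes: List[int], query: int, radius: int) -> List[int]:
    """Return indices of all codes within Hamming distance `radius` of `query`."""
    return [i for i, c in enumerate(codes) if hamming(c, query) <= radius]


def hamming_knn_query(codes: List[int], query: int, k: int) -> List[int]:
    """Return indices of the k codes closest to `query` (ties broken by index)."""
    ranked = sorted(range(len(codes)), key=lambda i: (hamming(codes[i], query), i))
    return ranked[:k]


# Hypothetical 4-bit codes, for illustration only.
codes = [0b0000, 0b0001, 0b0111, 0b1111]
query = 0b0011

in_range = hamming_range_query(codes, query, radius=1)
nearest = hamming_knn_query(codes, query, k=3)
```

Both functions cost time linear in the number of codes per query, which is the baseline an index is meant to beat.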

Unified Approach for Route Planning in Smart Transportation

Building on the collected data and the designed indexes, the research team has also studied real-world problems in spatial applications. By collaborating with DiDi Chuxing (the largest online taxi-sharing company in Mainland China), Prof. Chen’s STC group can use the huge amount of data generated by millions of drivers and customers to help the company improve the efficiency of its services and the experience of both drivers and customers. One such service, ridesharing, allows drivers to share their empty seats with different groups of customers as long as the detours are acceptable, and it has great potential to alleviate the shortage of vehicles and increase the throughput of the platform. With ridesharing, customers can enjoy relatively cheaper transportation services with guarantees, such as the suitability of fellow travelers and the delivery deadline. However, operating a good ridesharing service is challenging. Prof. Chen’s STC group has helped DiDi design smart vehicle dispatching strategies, dynamic pricing strategies, and privacy protection mechanisms that improve the efficiency of the ridesharing service and increase the overall profit of the platform.
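The "acceptable detour" condition can be illustrated with a small sketch. It assumes straight-line (Euclidean) travel distances and a single detour budget per vehicle, and it tries every position at which a new pickup and drop-off could be inserted into a driver's planned route. This is a generic illustration, not the group's or DiDi's dispatching method; deadlines, capacity, and pricing are left out, and all names and coordinates are hypothetical.

```python
from math import dist
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def route_length(route: List[Point]) -> float:
    """Total straight-line length of a route visited in order."""
    return sum(dist(a, b) for a, b in zip(route, route[1:]))


def best_insertion(
    route: List[Point], pickup: Point, dropoff: Point, max_detour: float
) -> Optional[Tuple[float, List[Point]]]:
    """Find the cheapest way to add a request without exceeding the detour budget.

    route[0] is the vehicle's current location, so nothing is inserted before it.
    Returns (detour, new_route), or None if no insertion is acceptable.
    """
    base = route_length(route)
    best: Optional[Tuple[float, List[Point]]] = None
    n = len(route)
    for i in range(1, n + 1):        # position of the pickup
        for j in range(i, n + 1):    # drop-off always comes after the pickup
            candidate = route[:i] + [pickup] + route[i:j] + [dropoff] + route[j:]
            detour = route_length(candidate) - base
            if detour <= max_detour and (best is None or detour < best[0]):
                best = (detour, candidate)
    return best


# Hypothetical example: a request lying on the driver's way adds no detour.
planned = [(0.0, 0.0), (10.0, 0.0)]
on_the_way = best_insertion(planned, (2.0, 0.0), (8.0, 0.0), max_detour=1.0)
# A request far off the route is rejected under the same budget.
off_route = best_insertion(planned, (5.0, 5.0), (5.0, 6.0), max_detour=1.0)
```

The exhaustive search over position pairs is quadratic in the route length, which is fine for a sketch but is exactly the kind of cost a production dispatcher would need to reduce.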

Project Members

Lei Chen (陈雷)

Research Group Leader

Chair Professor

Data Science and Analytics Thrust

Research Area

Sector-Specific Data Analytics