DSA Thrust Seminar

In Pursuit of Metric Data and Privacy in Modern Data Analysis

Modern data analysis presents many challenges that classical settings rarely consider. First, many classical data analysis algorithms either assume that the distances between data points satisfy a metric or are considerably more efficient when they do. However, because real datasets are noisy, they often lack this fundamental property. Second, data leaks and commercial data transactions threaten to reveal sensitive information about individuals in a dataset. It is therefore necessary to develop algorithms that handle non-metric data and that protect private data.

In this talk, I will present algorithms that address these challenges. I will first formalize the metric repair problem, which seeks to minimally modify the data so that it becomes metric, thereby finding the nearest metric dataset, and propose solutions to it. Then, I will present fast and private algorithms for several classical problems, including k-means and correlation clustering.

Chenglin FAN

Postdoctoral Researcher

Johns Hopkins University

Dr. Chenglin Fan is currently a postdoctoral researcher at Johns Hopkins University. He received his Ph.D. in Computer Science from UT Dallas. After graduating, he was a postdoc at Sorbonne University and subsequently a researcher at Baidu Research. His research interests lie broadly in algorithm theory and differential privacy. Dr. Fan's research has been published at top theory and machine learning conferences such as FOCS, ICML and NeurIPS.

Date

13 November 2023

Time

14:00:00 - 15:00:00

Venue

Online

Join Link

Zoom Meeting ID:
820 9302 9823


Passcode: dsat

Organizer

Data Science and Analytics Thrust

Contact Email

dsat@hkust-gz.edu.cn