In Pursuit of Metric Data and Privacy in Modern Data Analysis

Abstract
Modern data analysis poses challenges that classical settings rarely consider. First, many classical data analysis algorithms either assume that the distances between data points satisfy a metric or run considerably more efficiently when they do. However, because real data sets are noisy, they often lack this fundamental property. Second, data leaks and commercial data transactions threaten to reveal sensitive information about individuals in the data set. It is therefore necessary to develop algorithms that handle non-metric data and protect private data.
In this talk, I will present algorithms that address these challenges. I will first formalize the metric repair problem, which seeks to modify the data minimally so that it becomes metric, thereby finding the nearest metric data set, and propose solutions to it. I will then present fast and private algorithms for several classic problems, including k-means and correlation clustering.
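To make the notion of non-metric data concrete, the following minimal sketch lists triangle-inequality violations in a small distance matrix and applies one simple repair, a decrease-only shortest-path closure. This is an illustrative assumption, not necessarily the formulation or the algorithms presented in the talk; the function names and the sample matrix are hypothetical.

```python
from itertools import permutations

def triangle_violations(d, tol=1e-12):
    """Return triples (i, j, k) with d[i][j] > d[i][k] + d[k][j]."""
    n = len(d)
    return [(i, j, k) for i, j, k in permutations(range(n), 3)
            if d[i][j] > d[i][k] + d[k][j] + tol]

def decrease_only_repair(d):
    """Lower entries via shortest-path closure (Floyd-Warshall) until
    the triangle inequality holds. Entries are never increased, so this
    is only one variant of repair, not a general nearest-metric solver."""
    n = len(d)
    r = [row[:] for row in d]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if r[i][k] + r[k][j] < r[i][j]:
                    r[i][j] = r[i][k] + r[k][j]
    return r

# Hypothetical noisy distances: d(0,2) = 5 exceeds d(0,1) + d(1,2) = 3.
dist = [[0, 1, 5],
        [1, 0, 2],
        [5, 2, 0]]
repaired = decrease_only_repair(dist)
print(triangle_violations(dist))      # violating triples before repair
print(triangle_violations(repaired))  # empty list after repair
```

The closure changes only the entries that take part in a violation, but it does not count or minimize the number of modified entries, which is the harder question the metric repair problem asks.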
Speaker Bio
Chenglin FAN
Postdoctoral Researcher
Johns Hopkins University
Dr. Chenglin Fan is currently a postdoctoral researcher at Johns Hopkins University. He received his Ph.D. in Computer Science from UT Dallas. After graduating, he was a postdoctoral researcher at Sorbonne University and subsequently a researcher at Baidu Research. His research interests lie broadly in algorithm theory and differential privacy. Dr. Fan's research has been published at top theory and machine learning conferences such as FOCS, ICML and NeurIPS.
Date
13 November 2023
Time
14:00:00 - 15:00:00
Venue
Online
Join Link
Zoom Meeting ID: 820 9302 9823
Passcode: dsat
Organizer
Data Science and Analytics Thrust
Contact Email
dsat@hkust-gz.edu.cn