Research Project

Symphony: Retrieval-augmented language models using multi-modal data lakes

Abstract

Multi-modal data lakes, which contain datasets in different formats such as text, tables, and knowledge graphs, have become increasingly common in many organizations. Large language models, being generative models, cannot guarantee the correctness of the data they generate. Given a natural language query, Symphony first retrieves one or more relevant datasets from the data lake and then reasons over them to answer the query.
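
The retrieve-then-reason flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not Symphony's actual implementation: the `Dataset` class, the token-overlap scoring in `retrieve`, the `build_prompt` helper, and the toy data lake are all hypothetical. The reasoning step is only represented by assembling the retrieved datasets into a prompt that a language model could then answer from.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Dataset:
    """A hypothetical entry in a multi-modal data lake."""
    name: str
    modality: str  # e.g. "text", "table", "knowledge_graph"
    content: str   # flattened textual view of the dataset (an assumption)


def tokenize(text: str) -> set[str]:
    # Lowercase and keep alphanumeric runs as tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, lake: list[Dataset], k: int = 2) -> list[Dataset]:
    # Toy retriever: rank datasets by token overlap with the query.
    # A real system would likely use a learned cross-modal retriever.
    q = tokenize(query)
    scored = [(len(q & tokenize(d.content)), d) for d in lake]
    scored = [(s, d) for s, d in scored if s > 0]  # drop unrelated datasets
    scored.sort(key=lambda sd: sd[0], reverse=True)  # stable sort keeps lake order on ties
    return [d for _, d in scored[:k]]


def build_prompt(query: str, datasets: list[Dataset]) -> str:
    # Ground the answer in retrieved data instead of the model's memory.
    parts = [f"[{d.modality}] {d.name}: {d.content}" for d in datasets]
    return "Context:\n" + "\n".join(parts) + f"\n\nQuestion: {query}"


# Toy data lake with invented content, one dataset per modality.
lake = [
    Dataset("city_population", "table", "city population Avalon 1200 Camelot 3400"),
    Dataset("avalon_article", "text", "Avalon is the capital of the island"),
    Dataset("court_graph", "knowledge_graph", "Merlin advises Arthur"),
]

query = "What is the population of Avalon?"
hits = retrieve(query, lake)
prompt = build_prompt(query, hits)
```

Here `retrieve` returns both the article and the table, since a single query may need more than one dataset, and leaves out the unrelated knowledge graph. The resulting `prompt` would then be passed to whichever language model performs the reasoning step.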

Project Members

Nan Tang (汤南)

Associate Professor

Publications

Symphony: Towards natural language query answering over multi-modal data lakes. Zui Chen, Zihui Gu, Lei Cao, Ju Fan, Samuel Madden, and Nan Tang.

Project Period

2023-Present

Research Area

Data-driven AI

Keywords

LLM, multi-modal data lake, RAG