Georgia Institute of Technology
Yao's research focuses on AI systems that generate highly readable text and on evaluation methods that measure text readability and simplicity. His work aims to make textual information more accessible and usable for older adults. Specializing in paraphrase generation and text simplification, Yao has created large-scale paraphrase datasets such as MultiPIT, which contains over 500K diverse paraphrase pairs collected from Twitter and is useful for training more engaging conversational AI systems. Yao has also developed state-of-the-art text simplification evaluation metrics such as LENS, which significantly outperforms existing metrics in measuring the fluency, meaning preservation, and simplicity of generated text. Together, these efforts enable the training of AI systems that produce more engaging, digestible, and understandable content for aging adults, particularly those diagnosed with mild cognitive impairment (MCI).
Recognizing the importance of accuracy and quality in text simplification, Yao is currently designing a comprehensive human evaluation framework called SALSA, which encompasses over 21 edit types, to identify the errors and quality edits that occur in AI-generated simplifications. With this framework, Yao has gathered over 13K annotations on simplifications from leading AI systems such as ChatGPT, providing insights into their strengths and weaknesses and paving the way for further improvements in both general and personalized AI communication. Yao is also working on using large language models to develop AI systems that can automatically detect errors in text simplification, provide fine-grained feedback, and refine the simplification based on that feedback. This innovative approach not only improves the quality of text generation in general but also ensures that AI-driven communication remains efficient, relevant, precise, and easy to comprehend for aging adults.
Through these research efforts, Yao contributes to the development of collaborative AI systems that empower older adults to access information more easily and effectively, fostering greater independence and higher-quality communication.