Featured
Improving fairness of Large Language Models June 2023 – Current
supervised by Prof. Mengye Ren and Prof. James Zou
- Studying the effects of supervised fine-tuning on LLM debiasing through the lens of continual learning.
Large Language Models for clinical applications Feb 2023 – Current
supervised by Prof. Hong Yu
- Working on automatic Chain-of-Thought generation for clinical QA to address the shortage of annotated data.
- Working on improving zero-shot QA performance by adding medical jargon terms to prompts.
Few-shot parsing for biomedical texts Sep 2022 – Current
supervised by Prof. Andrew McCallum
- Focused on few-shot constituency parsing; worked on knowledge distillation from LLMs to a smaller, deployable language model (e.g., T5).
- Analyzed the performance gap between student and teacher models in knowledge distillation, and developed more effective KD methods that handle noise in teacher labels.
Patella segmentation for the early detection of osteoarthritis June 2021 – Feb 2022
supervised by Prof. Hao Chen
- Worked on increasing the robustness and data efficiency of Convolutional Neural Networks (CNNs) by leveraging a Statistical Shape Model (SSM) for patella segmentation.
- Implemented automated sampling and image registration to generate the SSM. Proposed a metric that predicts whether the CNN or the SSM performs better without access to ground truth, and a fuzzy nearest-neighbor framework that integrates prior anatomical knowledge into CNN-based segmentation, improving robustness and interpretability when data are limited or corrupted.
- The proposed method increases the Dice coefficient of several mainstream CNN frameworks by around 1.7% on blind tests, and the tested models achieve much better performance when given only half of the training data.
Event detection without triggers Sep 2021 – Jan 2022
supervised by Prof. Haiqin Yang
- Worked on low-resource event detection that does not require trigger-word annotation. Proposed converting the classification task into a reading comprehension (RC) task.
- Proposed an event derangement module to mitigate the imbalanced learning process, strengthening the detection of minority events. Two patents are under registration.
- Trained the proposed model on real large-scale data collected by Datastory and deployed it in the company's product for efficient event detection on social media data, greatly reducing annotation costs.