Wuao Liu (刘武傲)

I am a Computer Science Ph.D. student working in the Computer Vision Lab at UMass Amherst, advised by Prof. Grant Van Horn. Previously, I received my M.S. and B.Eng. degrees in Robotics from the University of Michigan, Ann Arbor and Zhejiang University, respectively.

My research leverages computer vision and machine learning for applications in AI4Science. I currently focus on visual and auditory recognition of fine-grained categories, such as wildlife species, to support biodiversity monitoring and conservation efforts.

I’m actively looking for research collaborations on VLMs / LLMs. Please feel free to send me an email!


News

  • [03/2026]: I will be working at Microsoft LinkedIn as a GenAI Research Intern.
  • [11/2025]: I served on the program committee for NECV 2025 (UMass Amherst).
  • [05/2025]: I joined Honda Research Institute as a Research Scientist Intern.
  • [09/2024]: I started at UMass Amherst as a CS PhD student.

Publications

Masked Autoencoders with Limited Data: Does It Work? A Fine-Grained Bioacoustics Case Study

CVPR Workshop, 2026

A systematic study of MAE pretraining for species classification on iNatSounds, analyzing the impacts of pretraining data scale, domain specificity, data curation, and transfer strategies.

Audio Geolocation: A Natural Sounds Benchmark

Under Review, 2026

Using a vision-inspired approach, we tackle the novel challenge of predicting the geographic location of audio recordings by leveraging species sounds and multimodal retrieval techniques.

RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs

CVPR, 2026

We introduce RealBirdID, a benchmark for fine-grained bird identification where models must either predict a species or abstain with an evidence-based rationale (for example, that identification requires a vocalization, that the image is low quality, or that the bird is occluded).

Can Large Language Models Reason About Goal-Oriented Tasks?

ACL Workshop, 2024

We study how well LLMs can complete a sequence of steps to achieve a certain goal, such as making a sandwich or repairing a bicycle tire.


Experience


Academic Services

Program Committee: NECV 2025 (UMass Amherst)