Wuao Liu (刘武傲)

I am a Computer Science Ph.D. student working in the Computer Vision Lab at UMass Amherst, advised by Prof. Grant Van Horn. Previously, I received my M.S. and B.Eng. degrees in Robotics from the University of Michigan, Ann Arbor, and Zhejiang University, respectively.

My research leverages computer vision and multimodal machine learning for applications in biodiversity monitoring and conservation efforts. I currently focus on visual and auditory recognition of fine-grained categories, as well as wildlife species range estimation.

I’m actively looking for research collaborations on VLMs / LLMs. Please feel free to send me an email!


News

  • [03/2026]: I will be working at Microsoft LinkedIn as a GenAI Research Intern.
  • [11/2025]: I served on the program committee for NECV 2025 (UMass Amherst).
  • [05/2025]: I joined Honda Research Institute as a Research Scientist Intern.
  • [09/2024]: I started at UMass Amherst as a CS PhD student.

Publications

Masked Autoencoders with Limited Data: Does It Work? A Fine-Grained Bioacoustics Case Study

CVPR Workshop, 2026

A systematic study of MAE pretraining for species classification on iNatSounds, analyzing the impacts of pretraining data scale, domain specificity, data curation, and transfer strategies.

Audio Geolocation: A Natural Sounds Benchmark

Under Review, 2026

Using a vision-inspired approach, we tackle the novel challenge of predicting the geographic location of audio recordings by leveraging species sounds and multimodal retrieval techniques.

RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs

CVPR, 2026

We introduce RealBirdID, a benchmark for fine-grained bird identification where models must either predict a species or abstain with an evidence-based rationale (for example, that identification requires a vocalization, or that the image is low quality or occluded).

Can Large Language Models Reason About Goal-Oriented Tasks?

ACL Workshop, 2024

We study how well LLMs can complete a sequence of steps to achieve a certain goal, such as making a sandwich or repairing a bicycle tire.


Experience


Academic Services

Program Committee: NECV 2025 (UMass Amherst)