Research

My work focuses on machine learning in low-label settings, with contributions over the past five years spanning three key areas: (a) unsupervised machine learning for speech and language, (b) bioacoustics for ethology and biodiversity monitoring, and (c) machine learning in marine robotics.

Speech and language

I aim to model audio understanding without relying on explicit labels, a pursuit that predates the rise of self-supervised learning (SSL). My early work during my PhD focused on computational models of human music perception, which I later expanded to speech modeling during my postdoc and lecturer positions. Speech, with its complex structure and the availability of diverse data, serves as a robust test-bed for SSL methods while contributing to broader scientific questions on language evolution and auditory cognition. My research initially centered on developing self-supervised speech representations and their applications in speech processing. More recently, I have explored the interface between speech and text-based large language models (LLMs), investigating their cross-modal potential for advancing conversational AI and studying agentic behavior in LLMs.

Bioacoustics

Bioacoustics bridges my interests in language origins and environmental conservation. The field is challenging because annotations are scarce while raw data is abundant (e.g., from passive acoustic monitoring). Leveraging SSL techniques, I have advanced detection, classification, and analysis across species, including whales, primates, and birds, supporting both fundamental ethology and applied biodiversity tracking.

Marine robotics

I initiated this line of work upon my arrival at UTLN. It aligns with the institution’s strategic focus on marine science and technology. Given the challenges of data collection in marine environments, I apply simulation, SSL, and expert-informed methods to enhance autonomous robotics and remote sensing for oceanic exploration and monitoring under real-world constraints.

Interpretability in agentic AI

Recently, I have become interested in understanding how agents are implemented in role-playing large language models (LLMs) and in long-form interactions. More to come, stay tuned.