Yu-Hua Chen

AI Researcher & Graduate Student

About Me

I'm a graduate student in the Master of Science in Applied Computing (MScAC) program at the University of Toronto, specializing in speech technology and multi-modal AI. With professional experience in engineering and deploying scalable audio pipelines (ASR/TTS) to cloud environments, I bring both research depth and practical implementation skills. My work spans speech processing, computer vision, and computational imaging, with a focus on building robust AI solutions that bridge the gap between cutting-edge research and real-world applications.

Experience

AI Research Engineer

June 2024 – August 2025
AiTeHub, Taipei, Taiwan (Remote)
  • Engineered an automated, end-to-end audiobook generation pipeline on Google Cloud Platform to address the scalability bottleneck of processing thousands of e-books
  • Integrated Azure TTS and BreezyVoice for high-fidelity synthesis and designed a validation layer that automatically detects ill-formed EPUB structures to reduce manual review
  • Developed a Taigi (Hokkien) TTS prototype by fine-tuning the MMS model; the solution was adopted by MediaTek for further research and optimization

Undergraduate Researcher

September 2023 – June 2025
National Taiwan University, Taipei, Taiwan
  • Co-authored and engineered the official implementation of "TAU," a benchmark submitted to ICASSP 2026 for evaluating cultural sound understanding in Large Audio Language Models
  • Addressed the limitation of semantic-only models by creating a dataset of crowdsourced Taiwanese audio and developing a transcript-adversarial filter to isolate audio-modality reasoning from textual cues
  • Used Gemini to automate question generation and established a scalable framework for cross-cultural audio benchmarking

Research Intern

July 2023 – August 2023
Academia Sinica, Taipei, Taiwan
  • Improved Automatic Speech Recognition (ASR) performance by optimizing transformer block layer selection in HuBERT and WavLM models
  • Reduced English ASR word error rate by 1% compared to the original model

Featured Projects

Interspeech 2025 ML-SUPERB Challenge

January 2025 – February 2025

Tackled simultaneous language identification and transcription for zero-shot languages by engineering a dynamic embedding method. Synthesized tokens for unseen languages by weighting Whisper's pretrained embeddings according to LID confidence scores. Outperformed baselines with a 27% relative reduction in overall CER (51.9% → 37.7%) and a 42% relative reduction in dialect CER (65.9% → 38.5%), indicating that the approach generalizes to unseen dialects.
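The weighting idea can be illustrated with a minimal sketch. It assumes the synthesized token is a confidence-weighted average of existing language-token embeddings; this is an illustration, not the challenge submission itself. The function name, the `top_k` option, and the toy arrays are hypothetical. In practice the rows would come from Whisper's pretrained language-token embeddings and the scores from an LID model.

```python
import numpy as np


def synthesize_language_embedding(lang_embeddings, lid_scores, top_k=None):
    """Blend known language-token embeddings into one vector for an unseen language.

    lang_embeddings: (num_languages, dim) array of pretrained embeddings.
    lid_scores: (num_languages,) non-negative LID confidence scores.
    top_k: if set, blend only the k most confident languages (hypothetical option).
    """
    lang_embeddings = np.asarray(lang_embeddings, dtype=np.float64)
    weights = np.asarray(lid_scores, dtype=np.float64).copy()

    if weights.shape != (lang_embeddings.shape[0],):
        raise ValueError("need exactly one confidence score per language embedding")
    if np.any(weights < 0):
        raise ValueError("confidence scores must be non-negative")
    if top_k is not None:
        if top_k < 1:
            raise ValueError("top_k must be at least 1")
        if top_k < weights.size:
            # Zero out every language except the top_k most confident ones.
            drop = np.argsort(weights)[:-top_k]
            weights[drop] = 0.0

    total = weights.sum()
    if total <= 0:
        raise ValueError("confidence scores must not all be zero")

    # Normalize so the weights sum to 1, then take the weighted average of the rows.
    weights /= total
    return weights @ lang_embeddings


# Toy example: three "seen" languages with 2-dimensional embeddings.
toy_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_scores = [0.6, 0.3, 0.1]

blended_all = synthesize_language_embedding(toy_embeddings, toy_scores)
blended_top2 = synthesize_language_embedding(toy_embeddings, toy_scores, top_k=2)
```

Restricting the blend to the most confident languages is one possible design choice: it keeps low-confidence languages from diluting the synthesized vector, at the cost of discarding some signal.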

KIBO Robot Programming Challenge

2022

Won the championship in JAXA's international competition by designing safe navigation paths for the Astrobee robot on the International Space Station.


2023 Taipei Fall CodeFest - City Dashboard

2023

Received an Honorable Mention among 55 teams for developing a pregnancy and childcare information dashboard. The project was integrated into the Taipei City Government's official repository.

Publications

Yi-Cheng Lin, Yu-Hua Chen, et al.
"TAU: A Benchmark for Cultural Sound Understanding Beyond Semantics"
Submitted to ICASSP 2026

Awards & Recognition

Vector Scholarship in AI
Vector Institute
2025 – 2026

Honorable Mention, Higgs Audio V2 Hackathon
Boson AI
2025

Championship, 3rd KIBO-RPC Robot Contest
JAXA (Japan Aerospace Exploration Agency)
2022

Education

Master of Science in Applied Computing (MScAC)

University of Toronto, Department of Computer Science | September 2025 – December 2026 (expected)

Courses: Computational Imaging, Introduction to Causality

Bachelor of Science in Electrical Engineering

National Taiwan University | September 2021 – June 2025

Ranked 4th in graduating class | Dean's List Award (4 semesters)

Undergraduate Researcher at the Speech Processing & ML Lab

Get In Touch