Yu-Hua Chen - AI Researcher & Graduate Student

About Me

I'm a graduate student in the Master of Science in Applied Computing (MScAC) program at the University of Toronto, specializing in speech technology and multimodal AI. With professional experience engineering and deploying scalable audio pipelines (ASR/TTS) in cloud environments, I bring both research depth and practical implementation skills. My work spans speech processing, computer vision, and computational imaging, with a focus on turning current research into robust AI systems for real-world use.

Experience

AI Research Engineer

June 2024 – August 2025

AiTeHub, Taipei, Taiwan (Remote)

Engineered an automated, end-to-end audiobook generation pipeline on Google Cloud Platform to address the scalability bottleneck of processing thousands of e-books
Integrated Azure TTS and BreezyVoice for high-fidelity synthesis and designed a validation layer that automatically detects malformed EPUB structures to reduce manual review
Developed a Taigi (Hokkien) TTS prototype by fine-tuning the MMS model; the solution was adopted by MediaTek for further research and optimization

Undergraduate Researcher

September 2023 – June 2025

National Taiwan University, Taipei, Taiwan

Co-authored "TAU," a benchmark submitted to ICASSP 2026 for evaluating cultural sound understanding in Large Audio Language Models, and engineered its official implementation
Addressed the limitation of semantic-only models by creating a dataset of crowdsourced Taiwanese audio and developing a transcript-adversarial filter to isolate audio-modality reasoning from textual cues
Used Gemini to automate question generation and established a scalable framework for cross-cultural audio benchmarking

Research Intern

July 2023 – August 2023

Academia Sinica, Taipei, Taiwan

Improved automatic speech recognition (ASR) performance by optimizing transformer block layer selection in HuBERT and WavLM models
Reduced English ASR word error rate by 1% compared to the original model

Featured Projects

Interspeech 2025 ML-SUPERB Challenge

January 2025 – February 2025

Tackled simultaneous language identification and transcription for zero-shot languages by engineering a dynamic embedding method that synthesizes tokens for unseen languages by weighting Whisper's pretrained embeddings according to LID confidence scores. Outperformed baselines with a 27% relative reduction in overall CER (51.9% → 37.7%) and a 42% relative reduction in dialect CER (65.9% → 38.5%), indicating that the model generalizes to unseen dialects.

KIBO Robot Programming Challenge

2022

Won the championship in JAXA's international competition by designing safe navigation paths for the Astrobee robot on the International Space Station.

Watch Competition Video →

2023 Taipei Fall CodeFest - City Dashboard

2023

Received an Honorable Mention among 55 teams for developing a pregnancy and childcare information dashboard. The project was integrated into the Taipei City Government's official repository.

Publications

Yi-Cheng Lin, Yu-Hua Chen, et al.

"TAU: A Benchmark for Cultural Sound Understanding Beyond Semantics"

Submitted to ICASSP 2026

Awards & Recognition

Vector Scholarship in AI

Vector Institute

2025 – 2026

Honorable Mention, Higgs Audio V2 Hackathon

Boson AI

2025

Championship, 3rd KIBO-RPC Robot Contest

JAXA (Japan Aerospace Exploration Agency)

2022

Education

Master of Science in Applied Computing (MScAC)

University of Toronto, Department of Computer Science | September 2025 – December 2026 (expected)

Courses: Computational Imaging, Introduction to Causality

Bachelor of Science in Electrical Engineering

National Taiwan University | September 2021 – June 2025

Ranked 4th in graduating class | Dean's List Award (4 semesters)

Undergraduate Researcher at the Speech Processing & ML Lab

Get In Touch

chenjoachim@cs.toronto.edu

(416) 518-2640

GitHub