I am a PhD student at the Audio Information Research Lab, University of Rochester, working with Prof. Zhiyao Duan. Take a look at my CV.

My research interests lie in speech and audio processing for virtual and augmented reality. My recent work has focused on audio deepfake detection, personalized spatial audio, and audio-visual rendering and analysis. In my spare time, I am fond of movies, paddle boarding, and traveling.

If you are interested in my research or would like to collaborate with me, you are welcome to email me.


[2023/11] I sat down with Berkeley Brean at News10NBC to talk about audio deepfake detection. [Link] [Tweet1] [Tweet2]

[2023/11] I attended WASPAA, SANE, the NRT Annual Meeting, and BASH to present our work on personalized spatial audio. A busy but exciting two weeks!

[2023/10] Received the National Institute of Justice (NIJ) Graduate Research Fellowship Award. [NIJ Description] [UR News Center] [Hajim Highlights]

[2023/07] One paper accepted by WASPAA 2023. Congrats, Yutong!

[2023/06] Our paper “HRTF Field” was recognized as one of the top 3% of all papers accepted at ICASSP 2023. [Hajim Highlights]

[2023/05] One paper accepted by Interspeech 2023. Congrats, Yongyi!

[2023/05] Recognized as one of the ICASSP Rising Stars in Signal Processing. [poster]

[2023/02] Two papers accepted by ICASSP 2023. (HRTF Field and SAMO)

[2023/02] Delivered a talk at the ISCA SIG-SPSC webinar, titled “Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks”. [slides]

Selected Publications

(For full list, see Publications)

[1] You Zhang, Fei Jiang, and Zhiyao Duan, One-Class Learning Towards Synthetic Voice Spoofing Detection, IEEE Signal Processing Letters, vol. 28, pp. 937-941, 2021. [DOI] [arXiv] [code] [video] [poster] [slides] [project]

[2] You Zhang, Yuxiang Wang, and Zhiyao Duan, HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023. [DOI] [arXiv] [code] [video] (Recognized as one of the top 3% of all papers accepted at ICASSP 2023)

[3] Sefik Emre Eskimez, You Zhang, and Zhiyao Duan, Speech Driven Talking Face Generation From a Single Image and an Emotion Condition, IEEE Transactions on Multimedia, vol. 24, pp. 3480-3490, 2022. [DOI] [arXiv] [code] [project]

