
You (Neil) Zhang

I am a PhD candidate at the Audio Information Research Lab, University of Rochester, NY, USA. I am fortunate to work with Prof. Zhiyao Duan. Take a look at my CV.

My research focuses on applied machine learning, particularly in speech and audio processing. This includes topics such as audio deepfake detection, personalized spatial audio, and audio-visual rendering and analysis. My research has been presented at venues such as ICASSP, SPL, Interspeech, WASPAA, and TMM. I was recognized through the Rising Star Program in Signal Processing at ICASSP 2023 and received a Graduate Research Fellowship from the National Institute of Justice. In my spare time, I enjoy paddleboarding, traveling, and movies.

If you are interested in my research or would like to collaborate, feel free to email me.


[2024/04] I attended NEMISIG 2024 and NYC Computer Vision Day 2024, and will be attending ICASSP 2024.

[2024/04] Milestone achieved! I passed my PhD proposal/qualifying exam on “Generalizing Audio Deepfake Detection”. Looking forward to the road ahead!

[2024/04] Our proposal for the inaugural Singing Voice Deepfake Detection (SVDD) 2024 Challenge has been accepted by the IEEE Spoken Language Technology Workshop (SLT) 2024! Check out the challenge website here! Registration deadline: June 8th.

[2024/03] I gave a talk at GenAI Spring School and AI Bootcamp on “Audio Deepfake Detection”.

[2024/02] Our tutorial on “Multimedia Deepfake Detection” has been accepted at ICME 2024.

[2023/12] I gave a talk at Nanjing University on “Improving Generalization Ability for Audio Deepfake Detection”. [Slides]

[2023/12] Two papers accepted by ICASSP 2024. (SingFake and Speech Emotion AV Learning) Congrats, Yongyi and Enting!

[2023/12] Our tutorial on “Personalizing Spatial Audio: Machine Learning for Personalized Head-Related Transfer Functions (HRTFs) Modeling in Gaming” has been accepted at 2024 AES International Conference on Audio for Games.

[2023/11] I sat down with Berkeley Brean at News10NBC to talk about audio deepfake detection. [WHEC-TV Link] [Tweet1] [Tweet2] [Hajim Highlights]

[2023/11] I attended WASPAA, SANE, NRT Annual Meeting, and BASH to present our work on personalized spatial audio. Busy but exciting two weeks!

[2023/10] Received the National Institute of Justice (NIJ) Graduate Research Fellowship Award. [NIJ Description] [UR News Center] [Hajim Highlights]

[2023/07] One paper accepted by WASPAA 2023. Congrats, Yutong!

[2023/06] Our paper “HRTF Field” was recognized as one of the top 3% of all papers accepted at ICASSP 2023. [Hajim Highlights]

[2023/05] Recognized as one of the ICASSP Rising Stars in Signal Processing. [poster]

[2023/05] One paper accepted by Interspeech 2023. Congrats, Yongyi!

[2023/02] Two papers accepted by ICASSP 2023. (HRTF Field and SAMO) Congrats, Siwen!

[2023/02] Delivered a talk at ISCA SIG-SPSC webinar, titled “Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks”. [slides]

Selected Publications

(For full list, see Publications)

[1] You Zhang, Fei Jiang, and Zhiyao Duan, One-Class Learning Towards Synthetic Voice Spoofing Detection, IEEE Signal Processing Letters, vol. 28, pp. 937-941, 2021. [DOI] [arXiv] [code] [video] [poster] [slides] [project]

[2] You Zhang, Yuxiang Wang, and Zhiyao Duan, HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023. [DOI] [arXiv] [code] [video] (Recognized as one of the top 3% of all papers accepted at ICASSP 2023)

[3] Sefik Emre Eskimez, You Zhang, and Zhiyao Duan, Speech Driven Talking Face Generation From a Single Image and an Emotion Condition, IEEE Transactions on Multimedia, vol. 24, pp. 3480-3490, 2022. [DOI] [arXiv] [code] [project]
