About me

I am a PhD student at Université de Montréal and MILA (Quebec AI Institute), under the supervision of Aaron Courville.

I am currently working on generative models, with a particular focus on text-to-image systems. My research explores generation fidelity and diversity, aiming to develop responsible AI algorithms.

Keywords: Generative models, Multi-modality, Responsible AI, Fairness, Diversity

News

  • Recipient of Scholarship!

    March 2025

    I am grateful to Mila for supporting my research through the EDI in research scholarship.

  • Meetup at NeurIPS!

    Dec. 2024

    I will attend NeurIPS 2024 in Vancouver, presenting my work on bias analysis in unconditional image generative models!

  • Joined Meta!

    Sep. 2024

    I joined FAIR, Meta, as a Visiting Researcher!

  • EMNLP 2023

    Nov. 2023

    One paper (Pre-trained Model Selection) has been accepted to the Findings of EMNLP 2023!

  • ICLR 2023

    Jan. 2023

    One paper (Knowledge Update and Model Editing for LLMs) has been accepted to ICLR 2023!

  • EMNLP 2022

    Sep. 2022

    Two papers (MoE Attention architecture, Prompt Learning and Meta Learning) have been accepted to EMNLP 2022!

Contact

MILA (Quebec AI Institute),

6666 Rue Saint-Urbain,

Montréal, QC H2S 3H1, Canada

xiaofng [DOT] zhang [AT] gmail [DOT] com



Potential Chat

I'm happy to connect with you. If you would like to chat, please check my available time slots in the Google Calendar below, then send me a Google Calendar invite via my Gmail address stating the topic you would like to discuss.


Resume

Education

Experience

  1. Natural Language Processing Engineer    Pattern Recognition Center, WeChat AI, Tencent

    Dec. 2021 - Jun. 2022    supervised by Dr. Yikang Shen

    Worked on two research topics. First, we proposed a new attention mechanism, Mixture of Attention Heads (MoA), which combines Mixture of Experts with multi-head attention. MoA makes it easy to scale up model capacity while keeping the computational cost roughly constant, and it outperformed strong baselines on machine translation and language modeling. Second, we worked on model editing, i.e., efficiently correcting false predictions of a pre-trained language model. We introduced a sequential editing task and a new way to edit the knowledge stored in the model by adding a few new neurons to the last feed-forward layer of the Transformer; experiments show that this method outperforms strong baselines. (A toy sketch of the MoA idea is shown below.)
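
For context, here is a toy PyTorch sketch of the general MoA idea: a router scores the attention heads for each token, only the top-k heads are kept, and their outputs are mixed using the router weights. The class name, dimensions, and routing details are my own simplifications for illustration (e.g., keys and values shared across heads), not the exact architecture from the paper.

```python
# Toy sketch of a mixture-of-attention-heads layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfAttentionHeads(nn.Module):
    def __init__(self, d_model=64, n_heads=8, k=2, d_head=32):
        super().__init__()
        self.n_heads, self.k, self.d_head = n_heads, k, d_head
        self.q = nn.Linear(d_model, n_heads * d_head)   # one query projection per "expert" head
        self.kv = nn.Linear(d_model, 2 * d_head)        # keys/values shared across heads (simplification)
        self.out = nn.Linear(d_head, d_model)
        self.router = nn.Linear(d_model, n_heads)       # scores each head for each token

    def forward(self, x):                               # x: (batch, seq, d_model)
        b, t, _ = x.shape
        # Route: keep only the top-k heads per token and renormalize their weights.
        gate = F.softmax(self.router(x), dim=-1)        # (b, t, n_heads)
        topv, topi = gate.topk(self.k, dim=-1)          # (b, t, k)
        topv = topv / topv.sum(dim=-1, keepdim=True)

        q = self.q(x).view(b, t, self.n_heads, self.d_head)
        k, v = self.kv(x).chunk(2, dim=-1)              # (b, t, d_head) each
        # Standard scaled dot-product attention, one score map per head.
        att = torch.einsum("bthd,bsd->bhts", q, k) / self.d_head ** 0.5
        att = att.softmax(dim=-1)
        head_out = torch.einsum("bhts,bsd->bthd", att, v)   # (b, t, n_heads, d_head)

        # Gather the selected heads and mix them with the router weights.
        idx = topi.unsqueeze(-1).expand(-1, -1, -1, self.d_head)
        picked = head_out.gather(2, idx)                # (b, t, k, d_head)
        mixed = (picked * topv.unsqueeze(-1)).sum(dim=2)
        return self.out(mixed)


x = torch.randn(2, 5, 64)
print(MixtureOfAttentionHeads()(x).shape)  # torch.Size([2, 5, 64])
```

Because only k of the n_heads heads are evaluated per token in the full method, capacity (more heads) can grow without a proportional increase in per-token compute; the snippet above computes all heads for brevity and only sparsifies the mixing step.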

Publication

Blog