About me

I am a PhD student at Université de Montréal and MILA (Quebec AI Institute), under the supervision of Aaron Courville.

I am currently working on generative models, with a particular focus on text-to-image systems. My research explores generation fidelity and diversity, aiming to develop responsible AI algorithms.

Keywords: Generative models, Multi-modality, Responsible AI, Fairness, Diversity

News

  • Recipient of Scholarship!

    March 2025

    I am grateful to Mila for supporting my research through the EDI in research scholarship.

  • Meetup at NeurIPS!

    Dec. 2024

    I will attend NeurIPS 2024 in Vancouver, presenting my work on bias analysis in unconditional image generative models!

  • Joined Meta!

    Sep. 2024

    I joined FAIR, Meta, as a Visiting Researcher!

  • EMNLP 2023

    Nov. 2023

    One paper (Pre-trained Model Selection) has been accepted to the Findings of EMNLP 2023!

  • ICLR 2023

    Jan. 2023

    One paper (Knowledge Update and Model Editing for LLMs) has been accepted to ICLR 2023!

  • EMNLP 2022

    Sep. 2022

    Two papers (MoE Attention architecture, Prompt Learning and Meta Learning) have been accepted to EMNLP 2022!

Contact

MILA (Quebec AI Institute),

6666 Rue Saint-Urbain,

Montréal, QC H2S 3H1, Canada

xiaofng [DOT] zhang [AT] gmail [DOT] com



Potential Chat

I'm happy to connect with you. If you would like to chat, please check my available time slots in the Google Calendar below, then send me a Google Calendar invite via my Gmail address stating the topic you would like to discuss.


Resume

Education

Experience

  1. Natural Language Processing Engineer    Pattern Recognition Center, WeChat AI, Tencent

    Dec. 2021 - Jun. 2022    supervised by Dr. Yikang Shen

    Worked on two research topics. First, we proposed a new attention mechanism, Mixture of Attention Heads (MoA), which combines Mixture of Experts with multi-head attention. MoA makes it easy to scale up model capacity while keeping the computational cost roughly constant, and it outperformed strong baselines on machine translation and language modeling. Second, we worked on model editing, i.e., efficiently correcting false predictions of a pre-trained language model. We introduced a sequential editing task and a new way to edit the knowledge stored in the model by adding a few new neurons to the last feed-forward layer of the Transformer; experiments show that this method outperforms strong baselines. (A toy sketch of the MoA idea is shown below.)
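
For context, here is a toy PyTorch sketch of the general MoA idea: a router scores the attention heads for each token, only the top-k heads are kept, and their outputs are mixed using the router weights. The class name, dimensions, and routing details are my own simplifications for illustration (e.g., keys and values shared across heads), not the exact architecture from the paper.

```python
# Toy sketch of a mixture-of-attention-heads layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfAttentionHeads(nn.Module):
    def __init__(self, d_model=64, n_heads=8, k=2, d_head=32):
        super().__init__()
        self.n_heads, self.k, self.d_head = n_heads, k, d_head
        self.q = nn.Linear(d_model, n_heads * d_head)   # one query projection per "expert" head
        self.kv = nn.Linear(d_model, 2 * d_head)        # keys/values shared across heads (simplification)
        self.out = nn.Linear(d_head, d_model)
        self.router = nn.Linear(d_model, n_heads)       # scores each head for each token

    def forward(self, x):                               # x: (batch, seq, d_model)
        b, t, _ = x.shape
        # Route: keep only the top-k heads per token and renormalize their weights.
        gate = F.softmax(self.router(x), dim=-1)        # (b, t, n_heads)
        topv, topi = gate.topk(self.k, dim=-1)          # (b, t, k)
        topv = topv / topv.sum(dim=-1, keepdim=True)

        q = self.q(x).view(b, t, self.n_heads, self.d_head)
        k, v = self.kv(x).chunk(2, dim=-1)              # (b, t, d_head) each
        # Standard scaled dot-product attention, one score map per head.
        att = torch.einsum("bthd,bsd->bhts", q, k) / self.d_head ** 0.5
        att = att.softmax(dim=-1)
        head_out = torch.einsum("bhts,bsd->bthd", att, v)   # (b, t, n_heads, d_head)

        # Gather the selected heads and mix them with the router weights.
        idx = topi.unsqueeze(-1).expand(-1, -1, -1, self.d_head)
        picked = head_out.gather(2, idx)                # (b, t, k, d_head)
        mixed = (picked * topv.unsqueeze(-1)).sum(dim=2)
        return self.out(mixed)


x = torch.randn(2, 5, 64)
print(MixtureOfAttentionHeads()(x).shape)  # torch.Size([2, 5, 64])
```

Because only k of the n_heads heads are evaluated per token in the full method, capacity (more heads) can grow without a proportional increase in per-token compute; the snippet above computes all heads for brevity and only sparsifies the mixing step.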

Publication

Blog