Haque Ishfaq

I am a recent PhD graduate from Mila and McGill University, where I was advised by Prof. Doina Precup. My current research focuses on exploration in reinforcement learning, large language model alignment and reasoning. I obtained my BS (Mathematical and Computational Science) in 2016 and MS (Statistics) in 2018 from Stanford University. I have also done research internships at Meta AI, Microsoft, IBM Research and Nvidia. Feel free to reach out if you have any questions or want to chat about my work!

In my high school days, I was heavily involved with mathematical olympiads and represented Bangladesh three times at the International Mathematical Olympiad (IMO).

Email / CV / Google Scholar / GitHub / Twitter / LinkedIn

Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
Haque Ishfaq*, Guangyuan Wang*, Sami Nur Islam and Doina Precup
International Conference on Learning Representations (ICLR), 2025
[Paper], [Code], [Poster]

Offline Multitask Representation Learning for Reinforcement Learning
Haque Ishfaq, Thanh Nguyen-Tang, Songtao Feng, Raman Arora, Mengdi Wang, Ming Yin and Doina Precup
Conference on Neural Information Processing Systems (NeurIPS), 2024
[Paper], [Poster]

More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling
Haque Ishfaq*, Yixin Tan*, Yu Yang, Qingfeng Lan, Jianfeng Lu, A. Rupam Mahmood, Doina Precup and Pan Xu
Reinforcement Learning Conference (RLC), 2024
[Paper], [Code], [Poster]

Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
Haque Ishfaq*, Qingfeng Lan*, Pan Xu, A. Rupam Mahmood, Doina Precup, Anima Anandkumar and Kamyar Azizzadenesheli
International Conference on Learning Representations (ICLR), 2024
[Paper], [Code], [Poster]

Randomized Exploration for Reinforcement Learning with General Value Function Approximation
Haque Ishfaq*, Qiwen Cui*, Viet Nguyen, Alex Ayoub, Zhuoran Yang, Zhaoran Wang, Doina Precup, and Lin F. Yang
International Conference on Machine Learning (ICML), 2021
[Paper], [Slide]

Preprints/Workshop Publications

Randomized Least Squares Policy Optimization
Haque Ishfaq, Zhuoran Yang, Andrei Lupu, Viet Nguyen, Lewis Liu, Riashat Islam, Zhaoran Wang and Doina Precup
ICML Workshop on Reinforcement Learning Theory, 2021
[Paper]

Path-Based Contextualization of Knowledge Graphs for Textual Entailment
Kshitij Fadnis, Kartik Talamadupula, Pavan Kapanipathi, Haque Ishfaq, Salim Roukos and Achille Fokoue
Preprint, 2019
[Paper]

TVAE: Triplet-Based Variational Autoencoder using Metric Learning
Haque Ishfaq, Assaf Hoogi and Daniel Rubin
Preprint, 2018
[Paper]