Mike Gimelfarb
- Email
- GitHub
- Google Scholar
- ResearchGate
Hi!
I'm currently a postdoc in the D3M Lab
led by Prof. Scott Sanner at the University of Toronto. My work focuses on leveraging structure for
planning and reinforcement learning (RL), using tools such as probabilistic programming languages (e.g., RDDL).
I am also actively looking for continuing postdoc and industry research positions in
reinforcement learning and dynamic decision making.
Prior to this, I completed my PhD under the supervision of Prof. Scott Sanner
and Prof. Chi-Guhn Lee in the same department in December 2022. My thesis focused on making
modern RL more efficient through the transfer of prior knowledge
(e.g., skills, demonstrations) from multiple sources. I developed novel applications of Bayesian
inference to robustly distinguish good knowledge sources from bad, and tackled the problem of transfer
learning for risk-sensitive agents. My research has been published in top AI/ML conferences such as
NeurIPS, UAI, AAAI, and ICLR. I also completed an internship
at DeepMind in 2022,
and I was previously a post-graduate affiliate of the
Vector Institute from 2020 to 2022.
Before that, I completed my master's degree (MASc) at U of T under the supervision of Prof.
Michael J. Kim. My thesis focused on the theoretical
development of Thompson sampling applied to queueing and
admission control problems with demand uncertainty. I
received my Bachelor's degree in Business Administration
(BBA) from the Schulich School of Business in 2014,
graduating with distinction.
I enjoy reading books on cognitive science and playing classical piano in my spare time.
Selected Publications
pyRDDLGym: From RDDL to Gym Environments
Ayal Taitler, Michael Gimelfarb,
Jihwan Jeong, Sriram Gopalakrishnan, Martin Mladenov, Xiaotian Liu, Scott Sanner
arXiv, 2023
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
Jihwan Jeong, Xiaoyu Wang, Michael Gimelfarb, Hyunwoo Kim, Baher Abdulhai, Scott Sanner
ICLR (Forthcoming), 2023
A Distributional Framework for Risk-Sensitive End-to-End Planning in Continuous MDPs
Noah Patton, Jihwan Jeong, Michael Gimelfarb, Scott Sanner
AAAI, 2022
Risk-Aware Transfer in Reinforcement Learning using Successor Features
Michael Gimelfarb, Andre Barreto, Scott Sanner, Chi-Guhn Lee
NeurIPS, 2021
End-to-End Risk-Aware Planning by Gradient Descent
Noah Patton, Jihwan Jeong, Michael Gimelfarb, Scott Sanner
ICAPS Workshop on Bridging the Gap Between AI Planning
and Reinforcement Learning, 2021
Bayesian Experience Reuse for Learning from Multiple Demonstrators
Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee
IJCAI, 2021
Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts
Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee
UAI, 2021
ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning
Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee
UAI, 2019
Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach
Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee
NeurIPS, 2018
Thompson Sampling for the Control of a Queue with Demand Uncertainty
Michael Gimelfarb, Michael J. Kim
Master's Thesis, 2017