Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation

    In cooperative multi-agent reinforcement learning (MARL), policy gradient (PG) methods are typically believed, due to their on-policy nature, to be less sample efficient than...

    The Berkeley Crossword Solver

    We recently built the Berkeley Crossword Solver (BCS), the first computer program to beat every human competitor in the world’s top crossword tournament. The...

    Designing Societally Beneficial Reinforcement Learning Systems

    Deep reinforcement learning (DRL) is transitioning from a research field focused on game playing to a technology with real-world applications. Notable examples include DeepMind’s...

