HP/Model Reinforcement Learning

Everything you need to know about model-free and model-based reinforcement learning

Reinforcement learning is one of the most exciting branches of artificial intelligence. It plays an important role in game-playing AI systems, modern robots, chip-design systems, and other applications.

Unite.AI

The End of Tabula Rasa: How Pre-Trained World Models are Redefining Reinforcement Learning

For a long time, the core idea in reinforcement learning (RL) was that AI agents should learn every new task from scratch, like a blank slate. This "tabula rasa" approach led to amazing achievements, ...

Tech Xplore on MSN

DeepMind introduces AI agent that learns to complete various tasks in a scalable world model

Over the past decade, deep learning has transformed how artificial intelligence (AI) agents perceive and act in digital ...

International Monetary Fund

AI and Macroeconomic Modeling: Deep Reinforcement Learning in an RBC model

This study seeks to construct a basic reinforcement learning-based AI-macroeconomic simulator. We use a deep RL (DRL) ...

Semiconductor Engineering

DeepSeek: Improving Language Model Reasoning Capabilities Using Pure Reinforcement Learning

“We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT ...

28d

The reinforcement gap — or why some AI skills improve faster than others

AI tasks that work well with reinforcement learning are getting better fast — and threatening to leave the rest of the industry behind.

Forbes

Ten Questions With OpenAI On Reinforcement Learning With Human Feedback

Recently, we interviewed Long Ouyang and Ryan Lowe, research scientists at OpenAI. As the creators of InstructGPT – one of the first major applications of reinforcement learning with human feedback ...

Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale

Ant Group, an affiliate of Alibaba, released Ring-1T which it says is the first trillion parameter open-source model.
