<aside> <img src="/icons/burst_gray.svg" alt="" width="40px" />

Domains: Deep Reinforcement Learning, Robotics, Control Systems, Simulation, Computer Vision

</aside>

https://github.com/Archaive16/RubikNet

Overview

This project focuses on building a Deep Reinforcement Learning (DRL)-based agent capable of solving a Rubik’s Cube using a robotic apparatus. The system combines simulation-based training with real-world execution by controlling a stepper-motor-driven cube manipulator, bridging the gap between virtual learning and physical actuation.

Key Concepts

Reinforcement Learning: A machine learning paradigm in which an agent learns to take actions in an environment to maximize cumulative reward.

Deep Reinforcement Learning: The use of deep neural networks as function approximators to handle large and complex state-action spaces.

Autodidactic Iteration (ADI): A self-supervised training scheme in which the network generates its own training targets: states are produced by scrambling the solved cube, and each state's target value comes from a one-step lookahead using the network's current estimates. This lets the agent learn effective value and policy functions even though the state space is far too large to enumerate or label.
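The shape of an ADI update can be sketched on a toy puzzle. This is a minimal illustration, not the project's implementation: the cube is replaced by a cyclic puzzle with ten positions, and the deep value network is replaced by a plain dictionary, but the scramble-outward, one-step-lookahead target structure is the same.

```python
import random

# Toy stand-in for the cube: a cyclic puzzle with N positions where
# state 0 is "solved". Actions shift the state by +1 or -1 (mod N).
# The real cube has a vastly larger state space; the ADI loop below
# has the same shape either way. All names here are illustrative.
N = 10
ACTIONS = (+1, -1)

def apply_action(state, action):
    return (state + action) % N

def reward(state):
    return 1.0 if state == 0 else -1.0

# Tabular "value network" stand-in; a real implementation would use
# a deep network trained by gradient descent toward the targets.
V = {s: 0.0 for s in range(N)}

def adi_iteration(num_scrambles=50, max_depth=5):
    for _ in range(num_scrambles):
        # 1) Generate training states by scrambling outward from solved.
        state = 0
        for _ in range(random.randint(1, max_depth)):
            state = apply_action(state, random.choice(ACTIONS))
            # 2) One-step lookahead: the target is the best child's
            #    reward plus that child's current value estimate.
            target = max(reward(apply_action(state, a)) + V[apply_action(state, a)]
                         for a in ACTIONS)
            # 3) "Train" toward the target (here: direct assignment).
            V[state] = target

for _ in range(20):
    adi_iteration()
```

After a few iterations, states near the solved state end up with higher value estimates than distant ones, which is exactly the signal a search procedure can later exploit.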

Supervised Learning: A machine learning approach in which models are trained on labeled data, used here to bootstrap the agent with known optimal cube-solving sequences.

Simulation-to-Real Transfer: Training models in a simulated environment and deploying them in the physical world with minimal performance loss.

Stepper Motor Control: Precise manipulation of motors to perform controlled rotations of the Rubik’s Cube faces.

Computer Vision: Using cameras or sensors to perceive the cube’s state and feed this information to the model.
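The core of such a vision pipeline is classifying each sticker's sampled color into one of the six face labels. A minimal sketch, assuming nearest-reference-color matching in RGB (the reference values below are illustrative; a real pipeline would calibrate them from the actual cube under the actual lighting, and would typically work in HSV via OpenCV to reduce lighting sensitivity):

```python
# Map a sampled RGB value to one of the six face colors by finding
# the nearest reference color. Reference values are illustrative.
REFERENCE_COLORS = {
    "white":  (255, 255, 255),
    "yellow": (255, 213,   0),
    "red":    (196,  30,  58),
    "orange": (255,  88,   0),
    "blue":   (  0,  81, 186),
    "green":  (  0, 158,  96),
}

def classify_sticker(rgb):
    """Return the face-color label whose reference is nearest in RGB space."""
    def dist2(ref):
        return sum((a - b) ** 2 for a, b in zip(rgb, ref))
    return min(REFERENCE_COLORS, key=lambda name: dist2(REFERENCE_COLORS[name]))

# A scanned face then becomes a machine-readable grid of labels:
face_pixels = [(250, 248, 240), (10, 150, 90), (200, 40, 60)]
labels = [classify_sticker(p) for p in face_pixels]
```

Red and orange are the usual failure mode of this kind of classifier, which is one reason HSV-based thresholds and per-setup calibration are common in practice.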

Approach and Workflow

  1. Environment Setup: Simulate the Rubik’s Cube state space and define the action space for face rotations.
  2. Model Training: Begin with supervised learning on a dataset of cube states to give the agent a strong initialization. Then train the model with Autodidactic Iteration (ADI) and combine the learned value and policy estimates with search techniques to refine the policy and maximize solving efficiency.
  3. Hardware Integration: Interface with stepper motors and drivers to replicate the learned policy on the real cube apparatus.
  4. State Detection: Implement a vision pipeline to recognize cube colors and translate them into a machine-readable state representation.
  5. Testing and Refinement: Evaluate the real-world performance, fine-tune control signals, and iteratively improve simulation fidelity for better simulation-to-real transfer.
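The hardware-integration step above hinges on translating the solver's output (moves in standard cube notation) into motor commands. A minimal sketch, assuming one motor per face, direct drive, and 200 full steps per revolution (a common NEMA 17 figure); function names are illustrative, and microstepping, acceleration profiles, and the actual driver I/O are omitted:

```python
# Translate a solution in standard cube notation into per-face
# stepper commands. Assumes 200 full steps/rev and direct drive,
# so a 90-degree quarter turn is 50 steps.
STEPS_PER_REV = 200
QUARTER_TURN = STEPS_PER_REV // 4  # 90 degrees = 50 steps

def parse_move(move):
    """Map a move like "R", "U'", or "F2" to (face, signed step count):
    positive = clockwise, negative = counterclockwise."""
    face = move[0]
    if move.endswith("2"):
        steps = 2 * QUARTER_TURN   # half turn; direction is irrelevant
    elif move.endswith("'"):
        steps = -QUARTER_TURN      # counterclockwise quarter turn
    else:
        steps = QUARTER_TURN       # clockwise quarter turn
    return face, steps

def solution_to_commands(solution):
    """Convert a space-separated solution string into motor commands."""
    return [parse_move(m) for m in solution.split()]

commands = solution_to_commands("R U' F2 L")
```

Each `(face, steps)` pair would then be dispatched to the corresponding stepper driver; keeping this translation layer separate from the solver makes it easy to recalibrate step counts against the physical apparatus during testing.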