Student Projects
Currently, the following student projects are available. Please contact the responsible supervisor and apply with your CV and transcripts.
If you have project ideas related to any of these projects, take the opportunity to propose your own project!
We also offer Master Theses and exchange semesters at renowned universities around the world. Please contact us if you are interested!
Studies on Mechatronics
We also offer students the opportunity to conduct their Studies on Mechatronics at our lab. In general, we recommend doing the Studies on Mechatronics in combination with the Bachelor Thesis, either as preparatory work in the semester before or as an extended study in parallel. If you want to do it independently, you can find proposed projects in the list below. Please apply directly with the corresponding supervisor.
Continual Learning and Domain Adaptation Techniques for a Waste Monitoring System on an Ocean Cleanup Vessel
This thesis develops an automated onboard waste quantification system for a maritime waste collection vessel, leveraging computer vision with continual learning and domain adaptation to replace manual counting of floating waste. Evaluated under real-world maritime conditions, the system aims to improve waste management in the South East Asian Sea.
Keywords
Computer Vision, Continual Learning, Field Testing
Labels
Master Thesis
Description
The Autonomous River Cleanup (ARC) is a student-led initiative supported by the Robotic Systems Lab, focused on tackling riverine waste pollution. In partnership with The SeaCleaners, a Swiss NGO, this thesis aims to develop a self-improving onboard waste quantification system for the “Mobula 10” vessel collecting floating waste in the South East Asian Sea. Currently, waste quantification relies on manually counting collected items. The goal of this thesis is to automate the process using computer vision and hardware solutions tailored to the vessel’s infrastructure and the environmental conditions on the sea. Key to this effort will be the integration of continual learning [1] and domain adaptation [2] techniques for computer vision algorithms to adapt models to diverse and changing waste items, ensuring consistent performance without full retraining. Lastly, the system will be evaluated in real-world conditions to propose further improvements.
Work Packages
- Develop and integrate a camera-based monitoring system on an oceanic waste collection vessel
- Explore and implement continual learning techniques for waste detection and classification models
- Conduct field testing in Zürich and deploy the system potentially abroad in South East Asia
- Create GPS-based density maps to assess the distribution of oceanic waste
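As a toy illustration of the continual-learning work package above, a rehearsal (replay-buffer) scheme can be sketched in a few lines. The nearest-centroid classifier and the class names are purely illustrative stand-ins for the actual detection models:

```python
import numpy as np

class RehearsalClassifier:
    """Toy nearest-centroid waste classifier with a replay buffer.

    Illustrates rehearsal-based continual learning: when new waste
    categories (or a shifted domain) arrive, we retrain on the new
    samples *plus* a small buffer of stored old samples, so earlier
    classes are not forgotten.
    """

    def __init__(self, buffer_per_class=20):
        self.buffer = {}             # class label -> stored exemplars
        self.centroids = {}          # class label -> mean feature vector
        self.buffer_per_class = buffer_per_class

    def update(self, features, labels):
        # Mix incoming data into the replay buffer (rehearsal).
        for x, y in zip(features, labels):
            self.buffer.setdefault(y, []).append(np.asarray(x, float))
            # Keep memory bounded: drop the oldest exemplars first.
            self.buffer[y] = self.buffer[y][-self.buffer_per_class:]
        # Recompute centroids from buffer contents (old + new classes).
        self.centroids = {y: np.mean(xs, axis=0) for y, xs in self.buffer.items()}

    def predict(self, x):
        x = np.asarray(x, float)
        return min(self.centroids, key=lambda y: np.linalg.norm(self.centroids[y] - x))

clf = RehearsalClassifier()
clf.update([[0.0, 0.0], [0.1, 0.0]], ["bottle", "bottle"])
clf.update([[5.0, 5.0]], ["net"])    # new class arrives later
print(clf.predict([0.05, 0.0]))      # still recognises "bottle"
```

A real system would replace the centroid model with a detection network and the feature vectors with learned embeddings, but the buffering logic is the same.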
Requirements
- Solid knowledge of computer vision and machine learning
- Hands-on experience in hardware integration and real-world testing
- Interest in sustainability and environmental protection
- Familiarity with continual learning or domain adaptation is a plus
Contact Details
More information
Open this project...
Published since: 2025-03-26 , Earliest start: 2025-05-01 , Latest end: 2025-12-31
Organization Robotic Systems Lab
Hosts Stolle Jonas , Elbir Emre
Topics Engineering and Technology
Extending Functional Scene Graphs to Include Articulated Object States
While traditional [1] and functional [2] scene graphs are capable of capturing the spatial relationships and functional interactions between objects and spaces, they encode each object as static, with fixed geometry. In this project, we aim to enable the estimation of the state of articulated objects and include it in the functional scene graph.
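As a sketch of the idea (the actual schema used by the group may well differ), a scene-graph node can carry a mutable articulation state alongside its static attributes; all field names here are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    """A functional scene-graph node extended with an articulation state.

    `state` holds a named joint configuration (e.g. a door opening angle),
    which traditional scene graphs with fixed geometry cannot express.
    """
    name: str
    category: str
    state: dict = field(default_factory=dict)   # e.g. {"door_angle": 0.0}

@dataclass
class SceneGraph:
    objects: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)   # (subject, relation, object)

    def add(self, obj):
        self.objects[obj.name] = obj

    def relate(self, a, rel, b):
        self.edges.append((a, rel, b))

    def set_state(self, name, **updates):
        # Update the estimated articulated state of an object in place.
        self.objects[name].state.update(updates)

g = SceneGraph()
g.add(SceneObject("cabinet_1", "cabinet", {"door_angle": 0.0}))
g.add(SceneObject("kitchen", "room"))
g.relate("cabinet_1", "inside", "kitchen")
g.set_state("cabinet_1", door_angle=1.2)     # newly estimated state
print(g.objects["cabinet_1"].state)          # {'door_angle': 1.2}
```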
Keywords
scene understanding, scene graph, exploration
Labels
Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
This project is set to limited visibility by its publisher. To see the project description you need to log in at SiROP. Please follow these instructions:
- Click link "Open this project..." below.
- Log in to SiROP using your university login or create an account to see the details.
If your affiliation is not created automatically, please follow these instructions: http://bit.ly/sirop-affiliate
More information
Open this project...
Published since: 2025-03-25 , Earliest start: 2025-03-25
Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Computer Vision and Geometry Group
Hosts Bauer Zuria, Dr. , Trisovic Jelena , Zurbrügg René
Topics Information, Computing and Communication Sciences , Engineering and Technology
Event-based feature detection for highly dynamic tracking
Event cameras are an exciting new technology enabling sensing of highly dynamic content over a broad range of illumination conditions. The present thesis explores novel, sparse, event-driven paradigms for detecting structure and motion patterns in raw event streams.
Keywords
Event camera, neuromorphic sensing, feature detection, computer vision
Labels
Master Thesis
Description
Event cameras are a relatively new, vision-based exteroceptive sensor relying on standard CMOS technology. Unlike normal cameras, event cameras do not measure absolute brightness in a frame-by-frame manner, but relative changes of the pixel-level brightness. Essentially, every pixel of an event camera independently observes the local brightness pattern, and when the latter changes by at least a minimum relative amount with respect to a previously stored reference value, a measurement is triggered in the form of a time-stamped event indicating the image location as well as the polarity of the change (brighter or darker) [2]. The pixels act asynchronously and can potentially fire events at a very high rate. Owing to their design, event cameras do not suffer from the same artifacts as regular cameras, and continue to perform well under high dynamics or challenging illumination conditions.
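The event-generation model described above can be sketched as a toy simulator; the contrast threshold and image sizes are illustrative:

```python
import numpy as np

def generate_events(frames, times, C=0.2):
    """Toy per-pixel event generator following the standard model:
    a pixel fires an event when its log-brightness has changed by at
    least the contrast threshold C since the last event at that pixel.
    Returns (t, x, y, polarity) tuples."""
    logI = np.log(np.asarray(frames, float) + 1e-6)
    ref = logI[0].copy()                      # per-pixel reference level
    events = []
    for t, img in zip(times[1:], logI[1:]):
        diff = img - ref
        ys, xs = np.nonzero(np.abs(diff) >= C)
        for y, x in zip(ys, xs):
            events.append((t, int(x), int(y), 1 if diff[y, x] > 0 else -1))
            ref[y, x] = img[y, x]             # reset reference at fired pixel
    return events

# A single brightening pixel produces one positive event:
frames = [np.ones((2, 2)), np.array([[1.0, 1.0], [1.0, 2.0]])]
evts = generate_events(frames, times=[0.0, 0.1])
print(evts)   # [(0.1, 1, 1, 1)]
```

Real sensors fire asynchronously between frames; this frame-sampled version only illustrates the thresholded log-intensity logic.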
Event cameras currently enjoy growing popularity and represent an interesting new alternative for exteroceptive sensing in robotics when facing scenarios with high dynamics and/or challenging conditions. The focus of the present thesis lies on 3D motion estimation with event cameras, and in particular on event-driven, computationally efficient methods that can trigger motion hypotheses from sparse raw events. Initial theoretical advances in this direction have been presented in recent literature [3,4,5], though these methods are still limited in terms of the assumptions they make. The present thesis will push the boundaries by proposing novel geometry-based and learning-based representations.
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] Event-Based, 6-DOF Camera Tracking from Photometric Depth Maps, TPAMI 40(10):2402-2412, 2017
[2] The Silicon Retina, 264(5): 76-83, 1991
[3] A 5-Point Minimal Solver for Event Camera Relative Motion Estimation. In Proceedings of the International Conference on Computer Vision (ICCV), 2023
[4] An n-point linear solver for line and motion estimation with event cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[5] Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers, Arxiv: https://arxiv.org/html/2503.03307v1
Work Packages
● Literature research
● Extend the mathematical foundation for sparse event-based motion estimation
● Propose novel detectors that extend operability from lines and constant velocity motion to full 6 DoF motion estimation from either points or lines, or other specific object trajectories such as ballistic curves
● Investigate learning-based, sparse event-based motion detectors to handle more general cases
● Apply the technology to real-world data to track fast ego-motion or ballistic object motion in the environment
Requirements
● Excellent knowledge of C++
● Computer vision experience
● Knowledge of geometric computer vision
● Plus: Experience with event cameras
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Fast, change-aware map-based camera tracking
Experiment with Gaussian Splatting based map representations for highly efficient camera tracking and simultaneous change detection and map updating. Apply to different exteroceptive sensing modalities.
Keywords
Localization, Camera Tracking, Gaussian Splatting, Change detection
Labels
Master Thesis
Description
Novel techniques for 3D environment representation such as NeRF [2] and Gaussian Splatting [1,3] provide the ability to efficiently render realistic-looking images of environments and, if formulated as a differentiable function of camera pose, can be embedded into a photometric loss in order to enable camera tracking across different modalities such as RGB cameras and event cameras. However, such representations are by default not able to accommodate changes in the scene, which may occur in many practically relevant scenarios (e.g. domestic environments). Furthermore, in some relevant scenarios, the changes that occur over time may indeed be expected and according to plan (e.g. construction environments).
The present thesis looks into recent 3D reconstruction methods such as Gaussian Splatting and considers their use for real-time vision-based sensor tracking. However, rather than relying on a static map of the environment, the core of the method consists of incorporating a robust change detection and map updating mechanism that relies on a combination of measurement residuals and available priors. The final goal will be to enable long-term vision-based localization in gradually changing environments while simultaneously making use of new sensing data to update the map. The ultimate goal will be to extend the method to sensors that excel at highly dynamic motion tracking but are not necessarily a first choice as a mapping device (i.e. event cameras).
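As a toy 1-D illustration of photometric map-based tracking (not the actual Gaussian Splatting pipeline), the camera pose can be recovered by descending a photometric loss defined over a differentiable rendering function; the Gaussian-bump "scene" and all parameters are illustrative:

```python
import numpy as np

def render(pose, xs):
    """Stand-in differentiable 'map renderer': the intensity profile of
    a 1-D scene as a function of camera pose (a single translation).
    A real system would splat 3-D Gaussians; this is only a sketch."""
    return np.exp(-0.5 * (xs - pose) ** 2)

def track(observed, xs, pose0, lr=2.0, iters=200, eps=1e-4):
    """Photometric tracking: descend L(p) = mean((render(p) - observed)^2)
    using a finite-difference gradient (autodiff would supply it exactly)."""
    loss = lambda q: np.mean((render(q, xs) - observed) ** 2)
    p = pose0
    for _ in range(iters):
        g = (loss(p + eps) - loss(p - eps)) / (2 * eps)
        p -= lr * g
    return p

xs = np.linspace(-5, 5, 201)
observed = render(2.0, xs)              # image captured at true pose 2.0
est = track(observed, xs, pose0=1.0)
print(round(est, 3))                    # ≈ 2.0
```

Change detection would then flag pixels whose residuals stay large after convergence, marking map regions that need updating.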
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] GS-EVT: Cross-Modal Event Camera Tracking based on Gaussian Splatting. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025
[2] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, 2020, Arxiv: https://arxiv.org/abs/2003.08934
[3] 3D Gaussian Splatting for Real-Time Radiance Field Rendering, 2023, Arxiv: https://arxiv.org/abs/2308.04079
Work Packages
● Literature research
● Creating a Gaussian Splatting representation of an environment and then using it for tracking
● Simultaneously ensuring continuous change detection in the environment. The change detection could rely on semantic detection priors in order to identify coherent image segments that have become inconsistent
● Propose an efficient map update strategy that relies on the introduced change detection and is derived from the original Gaussian Splatting algorithm
Requirements
● Excellent knowledge of Python
● Computer vision experience
● Knowledge of recent image rendering techniques
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Igor Bogoslavskyi (ibogoslavskyi@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Soft object reconstruction
This project consists of reconstructing soft objects together with their appearance, geometry, and physical properties from image data, for inclusion in reinforcement learning frameworks for manipulation tasks.
Keywords
Computer Vision, Structure from Motion, Image-based Reconstruction, Physics-based Reconstruction
Labels
Master Thesis
Description
As 3D reconstruction [2,3], real-time data-driven rendering [4,5], and learning-based control technologies [6,7] are becoming more mature, recent efforts in reinforcement learning are moving towards end-to-end policies that directly consume images in order to generate control commands [8]. However, many of the simulated environments are limited to a composition of rigid objects. In recent years, the inclusion of differentiable particle-based simulation borrowed from computer graphics has enabled the inclusion of non-rigid or even fluid elements. Ideally, we can generate such representations from real world data in order to extend data-driven world simulators to arbitrary new objects with complex physical behavior.
The present thesis focuses on this problem and aims at reconstructing soft objects in terms of their geometry, appearance, and physical behavior. The goal is to make use of the Material Point Method (MPM) in combination with vision-based cues and physical priors in order to reconstruct accurate 3D models of soft objects. The developed models will finally be included in an RL environment such as Isaac Gym in order to train novel manipulation policies for soft objects.
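A minimal sketch of the underlying system-identification idea, with a 1-DoF damped spring standing in for a full MPM soft-body model; all parameters and the grid-search fitting strategy are illustrative:

```python
import numpy as np

def simulate(k, steps=200, dt=0.01, m=1.0, c=0.5, x0=1.0):
    """Forward-simulate a 1-DoF 'soft object' (damped spring), a toy
    stand-in for an MPM soft body with unknown material stiffness k."""
    x, v = x0, 0.0
    traj = []
    for _ in range(steps):
        a = (-k * x - c * v) / m      # Hooke's law + linear damping
        v += a * dt                   # explicit Euler integration
        x += v * dt
        traj.append(x)
    return np.array(traj)

def fit_stiffness(observed, candidates):
    """Pick the stiffness whose simulated trajectory best matches the
    observed one. In the real pipeline, vision-based cues would supply
    `observed`; here it is synthetic."""
    errors = [np.sum((simulate(k) - observed) ** 2) for k in candidates]
    return candidates[int(np.argmin(errors))]

observed = simulate(25.0)                       # "measured" soft object
k_hat = fit_stiffness(observed, candidates=np.arange(5.0, 50.0, 5.0))
print(k_hat)    # 25.0
```

An MPM-based version would replace the scalar stiffness with a constitutive model and the grid search with gradient-based optimization through a differentiable simulator.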
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] Modeling of Deformable Objects for Robotic Manipulation: A Tutorial and Review, Front. Robot. AI, 7, 2020
[2] Global Structure-from-Motion Revisited, ECCV 2024
[3] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors, CVPR 2025
[4] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, CVPR 2020
[5] 3D Gaussian Splatting for Real-Time Radiance Field Rendering, SIGGRAPH 2023
[6] Learning to walk in minutes using massively parallel deep reinforcement learning, CoRL 2022
[7] Champion-Level Drone Racing Using Deep Reinforcement Learning, Nature, 2023
[8] π0: A Vision-Language-Action Flow Model for General Robot Control, Arxiv: https://arxiv.org/abs/2410.24164
Work Packages
● Literature research
● Design of suitable reconstruction method based on visual data and physical priors
● Dataset collection and testing
● Cross-validation against contact-based reconstruction methods
● Embedding into Isaac-Gym for training novel manipulation policies
Requirements
● Excellent knowledge of Python or C++
● Computer vision experience
● Interest in optimization with physics representations
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Sina Mirrazavi (smirrazavi@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Reconstruction from online videos taken in the wild
Push the limits of arbitrary online video reconstruction by combining the most recent, prior-supported real-time Simultaneous Localization And Mapping (SLAM) methods with automatic supervision techniques.
Keywords
Computer Vision, 3D Reconstruction, SLAM
Labels
Master Thesis
Description
In recent years, the advent of learning-based methods has led to substantial advancements in the performance of video-based 3D reconstruction methods. It is now possible to take an uncalibrated monocular video sequence and automatically process it to obtain a reasonably good estimate of the 3D geometry of the scene as well as the camera motion [1]. However, challenges remain when processing videos taken in the wild from open online repositories (e.g. YouTube):
● The videos are often not captured in a single take, but have changing camera perspectives. This often breaks continuous incremental reconstruction paradigms, and leads to the requirement of additional supervision.
● The videos sometimes have highly challenging passages with strong dynamics, missing texture, and/or dynamic objects in the image, thereby again demanding additional supervision.
● Adding calibration information to the estimation is known to potentially improve estimation performance.
The goal of the present project is to explore the use of both classical and learning-based solutions to automatically provide such supervision, and subsequently modify existing modern Simultaneous Localization And Mapping (SLAM) frameworks to include such priors and thereby produce more robust performance on challenging videos taken in the wild.
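One example of such automatic supervision is detecting shot boundaries so that each continuous take can be reconstructed separately. A classical histogram-based sketch; the bin count and threshold are illustrative and would need tuning on real footage:

```python
import numpy as np

def shot_boundaries(frames, bins=16, threshold=0.5):
    """Detect cuts in a video via the intensity-histogram distance
    between consecutive frames — a simple supervision signal to split
    an in-the-wild video into continuous takes before running SLAM."""
    hists = [np.histogram(f, bins=bins, range=(0.0, 1.0))[0] / f.size
             for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        dist = 0.5 * np.abs(hists[i] - hists[i - 1]).sum()   # total variation
        if dist > threshold:
            cuts.append(i)
    return cuts

rng = np.random.default_rng(0)
take1 = [np.full((8, 8), 0.2) + 0.01 * rng.random((8, 8)) for _ in range(3)]
take2 = [np.full((8, 8), 0.8) + 0.01 * rng.random((8, 8)) for _ in range(3)]
print(shot_boundaries(take1 + take2))   # [3]  (cut between the two takes)
```

A learning-based alternative would replace the histogram distance with a feature-embedding distance, but the segmentation logic is the same.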
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors, CVPR 2025
Work Packages
● Literature research
● Addition of traditional geometric methods for automatic camera calibration
● Automatic video segmentation and scene categorization.
● Automatic processing of video captions and audio for the extraction of expected semantics, and subsequent application of an open vocabulary model for automatic masking
● Testing and Validation
Requirements
● Excellent knowledge of Python and C++
● Knowledge in Computer vision
● Experience in SLAM/reconstruction
● Experience in applying learning-based representations
● Interest in recent LLM/VLM architectures
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Alexander Liniger (aliniger@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Computationally Efficient Neural Networks
The computing, time, and energy requirements of recent neural networks have increased dramatically over time, limiting their applicability in real-world contexts. The present thesis explores novel neural network implementations that substantially reduce computational complexity and thus energy footprint.
Keywords
AI, CNNs, transformers, network implementation
Labels
Master Thesis
Description
Over the past decade, advances in deep learning and computer vision have led to substantial improvements in robotic perception abilities. It is nowadays possible to use neural networks for reliable object detection, object pose and shape estimation, open-vocabulary semantic interpretation, and the solution of low-level problems such as feature tracking and depth estimation, to name just a few. However, a limiting factor of growing concern is the computational complexity and thus the power consumption/computing hardware vs. latency trade-off of such models. We are therefore also experiencing an increasing demand for cloud-based computation, often with non-negligible and unpredictable latencies.
As demonstrated through a number of past efforts [2,3,4], the computational complexity of a neural network can be reduced fairly substantially by changes in the selected low-level computing paradigm. Rather than relying on standard matrix-vector multiplications that make use of hardware multipliers, we can choose architectures that will rely more on additive operations [2], thereby reducing computational complexity and thus energy consumption by a substantial amount. The present work aims at the development and testing of novel and efficient network implementations that can be applied to any off-the-shelf network when deployed in custom-programmable hardware.
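A minimal sketch of the additive-computation idea in the spirit of [2]: weights are quantized to {-1, 0, +1} with a single per-tensor scale, so inner products reduce to signed additions plus one final multiply. This is an illustration of the principle, not the thesis' target implementation:

```python
import numpy as np

def ternarize(W):
    """1.58-bit-style quantization sketch: map weights to {-1, 0, +1}
    times a per-tensor scale. Matrix products over the quantized
    weights then need only additions/subtractions, plus one multiply
    by the scale at the end."""
    scale = np.mean(np.abs(W)) + 1e-8
    Wq = np.clip(np.round(W / scale), -1, 1)
    return Wq.astype(np.int8), scale

def ternary_matmul(Wq, scale, x):
    # All inner products use +/- accumulation; multiply once at the end.
    return scale * (Wq.astype(float) @ x)

W = np.array([[0.9, -0.05, -1.1],
              [0.02, 0.7, -0.6]])
Wq, s = ternarize(W)
x = np.array([1.0, 2.0, 3.0])
print(Wq)                        # entries in {-1, 0, 1}
print(ternary_matmul(Wq, s, x))  # approximates W @ x
```

On custom-programmable hardware, the `@` over ternary weights would be realized as a multiplier-free accumulation pipeline, which is where the energy saving comes from.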
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] On global electricity usage of communication technology: Trends to 2030, Challenges 6(1), 117-157, 2015
[2] The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits, https://arxiv.org/pdf/2402.17764
[3] XOR-Net: An Efficient Computation Pipeline for Binary Neural Network Inference on Edge Devices, https://cmu-odml.github.io/papers/XOR-NetAnEfficientComputationPipelineforBinaryNeuralNetworkInferenceonEdgeDevices.pdf
[4] DeepSeek-V3 Technical Report, https://arxiv.org/abs/2412.19437
Work Packages
● Literature research
● Development of loss-less post-training adaptations for reducing the computational complexity of neural networks
● Optimization of computational efficiency of standard architectures such as CNNs, MLPs, and Transformers
● Testing in simulation
● Optional: Testing on custom programmable hardware
Requirements
● Excellent knowledge in either C++ or Python
● Knowledge in deep learning
● Experience in computer vision
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Alex Liniger (aliniger@theaiinstitute.com)
Please include your CV and up-to-date transcript when applying
More information
Open this project...
Published since: 2025-03-12 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Generalist Excavator Transformer
We want to develop a generalist digging agent that is able to perform multiple tasks, such as digging and moving loose soil, and/or to control multiple excavators. We plan to use decision transformers, trained on offline data, to accomplish these tasks.
Keywords
Offline reinforcement learning, transformers, autonomous excavation
Labels
Semester Project , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-03-11 , Earliest start: 2025-03-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Werner Lennart , Egli Pascal Arturo , Terenzi Lorenzo , Nan Fang , Zhang Weixuan
Topics Information, Computing and Communication Sciences
Differential Particle Simulation for Robotics
This project focuses on applying differential particle-based simulation to address challenges in simulating real-world robotic tasks involving interactions with fluids, granular materials, and soft objects. Leveraging the differentiability of simulations, the project aims to enhance simulation accuracy with limited real-world data and explore learning robotic control using first-order gradient information.
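A toy sketch of the core idea: one particle under linear drag stands in for a differentiable particle-based simulator, and the drag coefficient plays the role of a material parameter identified from limited data. Finite differences stand in for the exact gradients an autodiff framework would provide; all values are illustrative:

```python
import numpy as np

def simulate(drag, v0=10.0, dt=0.01, steps=100):
    """Toy particle simulation: a single particle decelerating under
    linear drag — a stand-in for a full particle-based simulator with
    an unknown material parameter."""
    v, x, xs = v0, 0.0, []
    for _ in range(steps):
        v += -drag * v * dt
        x += v * dt
        xs.append(x)
    return np.array(xs)

def fit_drag(observed, d0=0.1, lr=0.05, iters=300, eps=1e-5):
    """First-order parameter fit: descend the simulation loss w.r.t.
    the drag coefficient using a finite-difference gradient."""
    loss = lambda q: np.mean((simulate(q) - observed) ** 2)
    d = d0
    for _ in range(iters):
        g = (loss(d + eps) - loss(d - eps)) / (2 * eps)
        d -= lr * g
    return d

observed = simulate(0.8)              # "measured" trajectory, true drag 0.8
print(round(fit_drag(observed), 2))   # recovers the true drag (≈ 0.8)
```

In the actual project, the simulator would expose analytic gradients, allowing the same first-order loop to identify material parameters of fluids, granular media, or soft bodies.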
Labels
Semester Project , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-03-10 , Earliest start: 2025-01-01 , Latest end: 2025-12-31
Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Robotic Systems Lab
Hosts Nan Fang , Ma Hao
Topics Engineering and Technology
Novel Winch Control for Robotic Climbing
While legged robots have demonstrated impressive locomotion performance in structured environments, challenges persist in navigating steep natural terrain and loose, granular soil. These challenges extend to extraterrestrial environments and are relevant to future lunar, Martian, and asteroidal missions. In order to explore the most extreme terrains, a novel winch system has been developed for the ANYmal robot platform. The winch could potentially be used as a fail-safe device to prevent falls during unassisted traverses of steep terrain, as well as an added driven degree of freedom for assisted ascending and descending of terrain too steep for unassisted traversal. The goal of this project is to develop control policies that utilize this new hardware and enable further climbing robot research.
Keywords
Robot, Space, Climbing, Winch, Control
Labels
Semester Project , Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-03-05 , Earliest start: 2024-10-07
Organization Robotic Systems Lab
Hosts Vogel Dylan
Topics Information, Computing and Communication Sciences , Engineering and Technology
Beyond Value Functions: Stable Robot Learning with Monte-Carlo GRPO
Robotics is dominated by on-policy reinforcement learning: the paradigm of training a robot controller by iteratively interacting with the environment and maximizing some objective. A crucial idea to make this work is the advantage function. On each policy update, algorithms typically sum up the gradient log-probabilities of all actions taken in the robot simulation. The advantage function increases or decreases the probabilities of these actions by comparing their "goodness" against a baseline. Current advantage estimation methods use a value function to aggregate robot experience and thereby decrease variance. This improves sample efficiency at the cost of introducing some bias.

Stably training large language models via reinforcement learning is well known to be a challenging task. A line of recent work [1, 2] has used Group-Relative Policy Optimization (GRPO) to achieve this feat. In GRPO, a series of answers is generated for each query. The advantage is calculated based on how much better a given answer is than the average answer to that query. In this formulation, no value function is required.

Can we adapt GRPO to robot learning? Value functions are known to cause issues in training stability [3] and result in biased advantage estimates [4]. We are in the age of GPU-accelerated RL [5], training policies by simulating thousands of robot instances simultaneously. This makes a new Monte-Carlo (MC) approach to RL timely, feasible, and appealing. In this project, the student will investigate the limitations of value-function-based advantage estimation. Using GRPO as a starting point, the student will then develop MC-based algorithms that exploit the GPU's parallel simulation capabilities for unbiased variance reduction and stable RL training while maintaining competitive wall-clock time.
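The group-relative advantage at the heart of GRPO can be sketched in a few lines; in the robot-learning setting, the "group" would be a set of parallel rollouts from the same initial state (the reward values below are illustrative):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantage estimation (GRPO-style): no value
    function. Each sample's advantage is its reward standardized
    against the other rollouts in the same group — e.g. the parallel
    simulated robot instances sharing one initial state."""
    r = np.asarray(rewards, float)
    return (r - r.mean()) / (r.std() + eps)

# 4 parallel rollouts from the same initial state:
adv = grpo_advantages([1.0, 2.0, 3.0, 6.0])
print(adv.round(2))   # zero-mean advantages; best rollout gets the largest
```

Because the baseline is the group mean rather than a learned value function, the estimate is unbiased; the variance is controlled by the group size, which GPU-parallel simulation makes cheap to increase.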
Keywords
Robot Learning, Reinforcement Learning, Monte Carlo RL, GRPO, Advantage Estimation
Labels
Semester Project , Bachelor Thesis , Master Thesis
Description
Co-supervised by Jing Yuan Luo (MuJoCo)
Work Packages
- Literature research
- Investigate the bias and variance properties of the PPO value function
- Design and implement a novel algorithm that achieves variance reduction through monte carlo sampling via massive environment parallelism
- Re-implement existing SOTA algorithms as benchmarks
- Bonus: provide theoretical insights to justify your proposed monte carlo method
Requirements
- Background in Learning
- Excellent knowledge of Python
Contact Details
More information
Open this project...
Published since: 2025-03-05
Organization Robotic Systems Lab
Hosts Klemm Victor
Topics Information, Computing and Communication Sciences , Engineering and Technology , Behavioural and Cognitive Sciences
Volumetric Bucket-Fill Estimation
Gravis Robotics is an ETH spinoff from the Robotic Systems Lab (RSL) working on the automation of heavy machinery (https://gravisrobotics.com/). In this project, you will be working with the Gravis team to develop a perceptive bucket-fill estimation system. You will conduct your project at Gravis under joint supervision from RSL.
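As a sketch of one plausible approach (the actual Gravis system is not public), a volumetric fill estimate can be obtained by integrating heightmap differences over the bucket interior, e.g. from a depth camera; the units and grid resolution are illustrative:

```python
import numpy as np

def bucket_fill_volume(height_before, height_after, cell_area):
    """Volumetric fill estimate from two heightmaps of the bucket
    interior: integrate the per-cell height change over the grid."""
    dh = np.asarray(height_after, float) - np.asarray(height_before, float)
    dh = np.clip(dh, 0.0, None)    # ignore cells that got lower (noise/spill)
    return float(dh.sum() * cell_area)

empty = np.zeros((4, 4))           # scan of the empty bucket [m]
full = np.full((4, 4), 0.25)       # material surface 25 cm higher
vol = bucket_fill_volume(empty, full, cell_area=0.01)   # 10 cm x 10 cm cells
print(vol)    # 0.04 m^3
```

A deployed system would additionally need to segment the bucket interior, handle occlusion by the bucket lip, and compensate for the bucket's own motion between scans.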
Keywords
Autonomous Excavation
Labels
Semester Project , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-28 , Earliest start: 2025-01-01 , Latest end: 2026-01-01
Organization Robotic Systems Lab
Hosts Egli Pascal Arturo
Topics Engineering and Technology
Leveraging Human Motion Data from Videos for Humanoid Robot Motion Learning
The advancement of humanoid robotics has reached a stage where mimicking complex human motions with high accuracy is crucial for tasks ranging from entertainment to human-robot interaction in dynamic environments. Traditional approaches to motion learning for humanoid robots rely heavily on motion capture (MoCap) data. However, acquiring large amounts of high-quality MoCap data is both expensive and logistically challenging. In contrast, video footage of human activities, such as sports events or dance performances, is widely available and offers an abundant source of motion data.

Building on recent advancements in extracting and utilizing human motion from videos, such as WHAM and the approach of "Learning Physically Simulated Tennis Skills from Broadcast Videos", this project aims to develop a system that extracts human motion from videos and applies it to teach a humanoid robot how to perform similar actions. The primary focus will be on extracting dynamic and expressive motions from videos, such as soccer player celebrations, and using these extracted motions as reference data for reinforcement learning (RL) and imitation learning on a humanoid robot.
Labels
Master Thesis
Description
Work packages
Literature research
Global motion reconstruction from videos.
Learning from reconstructed motion demonstrations with reinforcement learning on a humanoid robot.
Requirements
Strong programming skills in Python
Experience in computer vision and reinforcement learning
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / computer vision / robotics conferences.
Related literature
Yuan, Y., Iqbal, U., Molchanov, P., Kitani, K. and Kautz, J., 2022. Glamr: Global occlusion-aware human mesh recovery with dynamic cameras. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 11038-11049).
Yuan, Y. and Makoviychuk, V., 2023. Learning physically simulated tennis skills from broadcast videos.
Shin, S., Kim, J., Halilaj, E. and Black, M.J., 2024. Wham: Reconstructing world-grounded humans with accurate 3d motion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2070-2080).
Peng, X.B., Abbeel, P., Levine, S. and Van de Panne, M., 2018. Deepmimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions On Graphics (TOG), 37(4), pp.1-14.
Goal
The objective of this project is to develop a robust system for extracting human motions from video footage and transferring these motions to a humanoid robot using learning from demonstration techniques. The system will be designed to handle the noisy data typically associated with video-based motion extraction and ensure that the humanoid robot can replicate the extracted motions with high fidelity while respecting physical rules.
Proposed Methodology
Video Data Collection and Motion Extraction:
Collect video footage of soccer player celebrations and other dynamic human activities.
Start from existing monocular human pose/motion estimation algorithms to extract 3D motion data from the videos.
Incorporate physics-based corrections similar to those employed in WHAM to address issues like jitter, foot sliding, and ground penetration in the extracted motion data.
Motion Learning:
- Apply existing learning-from-demonstration algorithms in a simulated environment to replicate the kinematic motions reconstructed from the videos while respecting physical constraints, using reinforcement learning.
Implementation on Humanoid Robot:
- Hardware deployment is encouraged; our humanoid robot is ready and waiting for you.
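The imitation-learning step above hinges on a reward that scores how closely the simulated robot tracks the motion reconstructed from video. A minimal sketch of a DeepMimic-style pose-tracking term (the function name, weights, and joint dimensions are illustrative placeholders, not the project's design):

```python
import numpy as np

def pose_reward(q_robot, q_ref, w_pose=0.65, scale=2.0):
    """DeepMimic-style tracking term: exponentiated negative squared
    joint-angle error between the simulated robot pose and the reference
    pose extracted from video. Weights are illustrative placeholders."""
    err = np.sum((np.asarray(q_robot) - np.asarray(q_ref)) ** 2)
    return w_pose * np.exp(-scale * err)
```

Perfect tracking yields the maximum reward `w_pose`; in DeepMimic this pose term is combined with further terms (velocity, end-effector position) in a weighted sum.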
Contact Details
Please include your CV and transcript in the submission.
Manuel Kaufmann
https://ait.ethz.ch/people/kamanuel
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Applications limited to ETH Zurich, EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao, Kaufmann Manuel
Topics Engineering and Technology
Learning Agile Dodgeball Behaviors for Humanoid Robots
Agility and rapid decision-making are vital for humanoid robots to safely and effectively operate in dynamic, unstructured environments. In human contexts—whether in crowded spaces, industrial settings, or collaborative environments—robots must be capable of reacting to fast, unpredictable changes in their surroundings. This includes not only planned navigation around static obstacles but also rapid responses to dynamic threats such as falling objects, sudden human movements, or unexpected collisions. Developing such reactive capabilities in legged robots remains a significant challenge due to the complexity of real-time perception, decision-making under uncertainty, and balance control.

Humanoid robots, with their human-like morphology, are uniquely positioned to navigate and interact with human-centered environments. However, achieving fast, dynamic responses—especially while maintaining postural stability—requires advanced control strategies that integrate perception, motion planning, and balance control within tight time constraints.

The task of dodging fast-moving objects, such as balls, provides an ideal testbed for studying these capabilities. It encapsulates several core challenges: rapid object detection and trajectory prediction, real-time motion planning, dynamic stability maintenance, and reactive behavior under uncertainty. Moreover, it presents a simplified yet rich framework to investigate more general collision avoidance strategies that could later be extended to complex real-world interactions.

In robotics, reactive motion planning for dynamic environments has been widely studied, but primarily in the context of wheeled robots or static obstacle fields. Classical approaches focus on precomputed motion plans or simple reactive strategies, often unsuitable for highly dynamic scenarios where split-second decisions are critical. In the domain of legged robotics, maintaining balance while executing rapid, evasive maneuvers remains a challenging problem.
Previous work on dynamic locomotion has addressed agile behaviors like running, jumping, or turning (e.g., Hutter et al., 2016; Kim et al., 2019), but these movements are often planned in advance rather than triggered reactively. More recent efforts have leveraged reinforcement learning (RL) to enable robots to adapt to dynamic environments, demonstrating success in tasks such as obstacle avoidance, perturbation recovery, and agile locomotion (Peng et al., 2017; Hwangbo et al., 2019). However, many of these approaches still struggle with real-time constraints and robustness in high-speed, unpredictable scenarios.

Perception-driven control in humanoids, particularly for tasks requiring fast reactions, has seen advances through sensor fusion, visual servoing, and predictive modeling. For example, integrating vision-based object tracking with dynamic motion planning has enabled robots to perform tasks like ball catching or blocking (Ishiguro et al., 2002; Behnke, 2004). Yet, dodging requires a fundamentally different approach: instead of converging toward an object (as in catching), the robot must predict and strategically avoid the object's trajectory while maintaining balance—often in the presence of limited maneuvering time.

Dodgeball-inspired robotics research has been explored in limited contexts, primarily using wheeled robots or simplified agents in simulations. Few studies have addressed the challenges of high-speed evasion combined with the complexities of humanoid balance and multi-joint coordination. This project aims to bridge that gap by developing learning-based methods that enable humanoid robots to reactively avoid fast-approaching objects in real time, while preserving stability and agility.
Labels
Master Thesis
Description
Work packages
Literature research
Utilize simulation platforms (e.g., Isaac Lab) for initial policy development and training.
Explore model-free RL approaches, potentially incorporating curriculum learning to gradually increase task complexity.
Investigate perception models for object detection and trajectory forecasting, possibly leveraging lightweight deep learning architectures for real-time processing.
Implement and test learned behaviors on a physical humanoid robot, addressing the challenges of sim-to-real transfer through domain randomization or fine-tuning.
Requirements
Solid foundation in robotics, control theory, and machine learning.
Experience with reinforcement learning frameworks (e.g., PyTorch, TensorFlow, or RLlib).
Familiarity with robot simulation environments (e.g., MuJoCo, Gazebo) and real-world robot control.
Strong programming skills (Python, C++) and experience with sensor data processing.
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / robotics conferences.
Goal
Perception & Prediction
- Develop a real-time perception pipeline capable of detecting and tracking incoming projectiles. Utilize camera data or external motion capture systems to predict ball trajectories accurately under varying speeds and angles.
Reactive Motion Planning
- Design algorithms that plan evasive maneuvers (e.g., side-steps, ducks, or rotational movements) within milliseconds of detecting an incoming threat, ensuring the robot’s center of mass remains stable throughout.
Learning-Based Control
- Apply reinforcement learning or imitation learning to optimize dodge behaviors, balancing between minimal energy expenditure and maximum evasive success. Investigate policy architectures that enable rapid reactions while handling noisy observations and sensor delays.
Robustness & Evaluation
- Test the system under diverse scenarios, including multi-ball environments and varying throw speeds. Evaluate the robot’s success rate, energy efficiency, and post-dodge recovery capabilities.
Implementation on Humanoid Robot:
- Hardware deployment is encouraged; our humanoid robot is ready and waiting for you.
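As a concrete illustration of the perception-and-prediction goal, a short window of tracked ball positions can be fitted with a constant-acceleration model and extrapolated to the moment of interest. The function name and the least-squares formulation below are assumptions for this sketch, not a prescribed pipeline:

```python
import numpy as np

def predict_ball_position(ts, ps, t_query):
    """Fit p(t) = p0 + v*t + 0.5*a*t^2 to tracked positions by least
    squares, then extrapolate to t_query.
    ts: (N,) timestamps, ps: (N, 3) positions from a (hypothetical)
    vision or motion-capture tracking front end."""
    ts = np.asarray(ts, dtype=float)
    A = np.stack([np.ones_like(ts), ts, 0.5 * ts**2], axis=1)  # design matrix
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(ps, dtype=float), rcond=None)
    p0, v, a = coeffs  # initial position, velocity, acceleration estimates
    return p0 + v * t_query + 0.5 * a * t_query**2
```

Fitting over a sliding window also smooths detection noise; the predicted interception point then feeds the evasive-motion planner.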
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Applications limited to ETH Zurich, EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Engineering and Technology
Learning Real-time Human Motion Tracking on a Humanoid Robot
Humanoid robots, designed to mimic the structure and behavior of humans, have seen significant advancements in kinematics, dynamics, and control systems. Teleoperation of humanoid robots involves complex control strategies to manage bipedal locomotion, balance, and interaction with environments. Research in this area has focused on developing robots that can perform tasks in environments designed for humans, from simple object manipulation to navigating complex terrains.

Reinforcement learning (RL) has emerged as a powerful method for enabling robots to learn from interactions with their environment, improving their performance over time without explicit programming for every possible scenario. In the context of humanoid robotics and teleoperation, RL can be used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. Key challenges include the high dimensionality of the action space, the need for safe exploration, and the transfer of learned skills across different tasks and environments.

Integrating human motion tracking with reinforcement learning on humanoid robots represents a cutting-edge area of research. This approach uses human motion data to train RL models, enabling the robot to learn natural, human-like movements. The goal is to develop systems that can not only replicate human actions in real time but also adapt and improve their responses over time through learning. Challenges in this area include ensuring real-time performance, dealing with the variability of human motion, and maintaining the stability and safety of the humanoid robot.
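To make the real-time tracking requirement concrete, here is a minimal sketch of the streaming side of such a system: captured human joint angles are mapped onto the robot while respecting joint limits and smoothing out capture jitter. The joint limits, the exponential-smoothing factor, and the class name are illustrative assumptions:

```python
import numpy as np

class MotionRetargeter:
    """Minimal real-time retargeting sketch: clip streamed human joint
    angles to the robot's joint limits and exponentially smooth them to
    suppress capture jitter. Limits and alpha are placeholder values."""

    def __init__(self, lower, upper, alpha=0.5):
        self.lower, self.upper = np.asarray(lower), np.asarray(upper)
        self.alpha = alpha  # smoothing factor: higher = more responsive
        self.state = None

    def step(self, q_human):
        q = np.clip(q_human, self.lower, self.upper)
        if self.state is None:
            self.state = q
        else:
            self.state = (1.0 - self.alpha) * self.state + self.alpha * q
        return self.state
```

In a full system the smoothed targets would feed an RL tracking policy rather than being commanded directly, so the robot can deviate to keep its balance.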
Keywords
real-time, humanoid, reinforcement learning, representation learning
Labels
Master Thesis
Description
Work packages
Literature research
Human motion capture and retargeting
Skill space development
Hardware validation encouraged upon availability
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Starke, Sebastian, et al. "Deepphase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Serifi, A., Grandia, R., Knoop, E., Gross, M. and Bächer, M., 2024, December. Vmp: Versatile motion priors for robustly tracking motion on physical characters. In Computer Graphics Forum (Vol. 43, No. 8, p. e15175).
Fu, Z., Zhao, Q., Wu, Q., Wetzstein, G. and Finn, C., 2024. Humanplus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454.
He, T., Luo, Z., Xiao, W., Zhang, C., Kitani, K., Liu, C. and Shi, G., 2024, October. Learning human-to-humanoid real-time whole-body teleoperation. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8944-8951). IEEE.
He, T., Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C. and Shi, G., 2024. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Information, Computing and Communication Sciences
Loosely Guided Reinforcement Learning for Humanoid Parkour
Humanoid robots hold the promise of navigating complex, human-centric environments with agility and adaptability. However, training these robots to perform dynamic behaviors such as parkour—jumping, climbing, and traversing obstacles—remains a significant challenge due to the high-dimensional state and action spaces involved. Traditional Reinforcement Learning (RL) struggles in such settings, primarily due to sparse rewards and the extensive exploration needed for complex tasks.

This project proposes a novel approach to address these challenges by incorporating loosely guided references into the RL process. Instead of relying solely on task-specific rewards or complex reward shaping, we introduce a simplified reference trajectory that serves as a guide during training. This trajectory, often limited to the robot's base movement, reduces the exploration burden without constraining the policy to strict tracking, allowing the emergence of diverse and adaptable behaviors.

Reinforcement Learning has demonstrated remarkable success in training agents for tasks ranging from game playing to robotic manipulation. However, its application to high-dimensional, dynamic tasks like humanoid parkour is hindered by two primary challenges:

- Exploration complexity: the vast state-action space of humanoids leads to slow convergence, often requiring millions of training steps.
- Reward design: sparse rewards make it difficult for the agent to discover meaningful behaviors, while dense rewards demand intricate and often brittle design efforts.

By introducing a loosely guided reference—a simple trajectory representing the desired flow of the task—we aim to reduce the exploration space while maintaining the flexibility of RL. This approach bridges the gap between pure RL and demonstration-based methods, enabling the learning of complex maneuvers like climbing, jumping, and dynamic obstacle traversal without heavy reliance on reward engineering or exact demonstrations.
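The guidance idea can be made concrete as a reward-shaping term: a soft attraction of the robot's base toward the coarse reference trajectory, added to the task reward without enforcing exact tracking. The weight and kernel width below are illustrative placeholders, not tuned values:

```python
import numpy as np

def loosely_guided_reward(base_pos, ref_pos, task_reward, w_guide=0.1, sigma=0.5):
    """Task reward plus a soft Gaussian attraction of the robot base
    toward the coarse reference. The small w_guide keeps the reference
    a guide rather than a strict tracking target (values are placeholders)."""
    d2 = np.sum((np.asarray(base_pos) - np.asarray(ref_pos)) ** 2)
    return task_reward + w_guide * np.exp(-d2 / (2.0 * sigma**2))
```

Because the guidance term saturates and is weakly weighted, the policy is free to deviate from the reference whenever the task reward favors a different maneuver.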
Keywords
humanoid, reinforcement learning, loosely guided
Labels
Master Thesis
Description
Work packages
Design a Loosely Guided RL Framework that integrates simple reference trajectories into the training loop.
Evaluate Exploration Efficiency by comparing baseline RL methods with the guided approach.
Demonstrate Complex Parkour Behaviors such as climbing, jumping, and dynamic traversal using the guided RL policy.
Hardware validation encouraged
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Serifi, A., Grandia, R., Knoop, E., Gross, M. and Bächer, M., 2024, December. Vmp: Versatile motion priors for robustly tracking motion on physical characters. In Computer Graphics Forum (Vol. 43, No. 8, p. e15175).
Fu, Z., Zhao, Q., Wu, Q., Wetzstein, G. and Finn, C., 2024. Humanplus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454.
He, T., Luo, Z., Xiao, W., Zhang, C., Kitani, K., Liu, C. and Shi, G., 2024, October. Learning human-to-humanoid real-time whole-body teleoperation. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8944-8951). IEEE.
He, T., Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C. and Shi, G., 2024. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Information, Computing and Communication Sciences
Learning World Models for Legged Locomotion
Model-based reinforcement learning learns a world model from which an optimal control policy can be extracted. Understanding and predicting the forward dynamics of legged systems is crucial for effective control and planning: forward dynamics means predicting the robot's next state given its current state and the applied actions. While traditional physics-based models can provide a baseline, they often struggle with the complexities and non-linearities inherent in real-world scenarios, particularly the varying contact patterns of the robot's feet with the ground.

The project aims to develop and evaluate neural-network-based models for predicting the dynamics of legged environments, focusing on accounting for varying contact patterns and non-linearities. This involves collecting and preprocessing data from various simulation experiments, designing neural network architectures that incorporate the necessary structure, and exploring hybrid models that combine physics-based predictions with neural network corrections. The models will be trained and evaluated on autoregressive prediction accuracy, with an emphasis on robustness and generalization across different noise perturbations. By the end of the project, the goal is an accurate, robust, and generalizable predictive model of the forward dynamics of legged systems.
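A toy version of the evaluation protocol described above: fit a one-step dynamics model to transition data, then measure error on autoregressive rollouts where the model's own predictions are fed back as inputs. A linear least-squares model stands in for the neural network here; all names are illustrative:

```python
import numpy as np

def fit_one_step_model(states, actions, next_states):
    """Least-squares one-step model s' ~ [s, a] @ W; a linear stand-in
    for the neural forward-dynamics model discussed above."""
    X = np.hstack([states, actions])
    W, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    return W

def autoregressive_rollout(W, s0, action_seq):
    """Feed predictions back in as inputs: one-step errors compound,
    which is exactly what autoregressive evaluation is meant to expose."""
    s, traj = np.asarray(s0, dtype=float), []
    for a in action_seq:
        s = np.hstack([s, a]) @ W
        traj.append(s)
    return np.array(traj)
```

For legged systems the interesting failure cases are contact switches, where a single global linear (or smooth neural) model is systematically wrong; that motivates the structured and hybrid architectures mentioned in the description.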
Keywords
forward dynamics, non-smooth dynamics, neural networks, model-based reinforcement learning
Labels
Master Thesis
Description
Work packages
Literature research
Understand the training pipeline of the paper Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics.
Explore the possibility of using a first-order gradient in optimizing the policy.
Requirements
Strong programming skills in Python
Experience in machine learning frameworks, especially model-based reinforcement learning.
Publication
This project will mostly focus on simulated environments. Promising results will be submitted to machine learning conferences, where the method will be thoroughly evaluated and tested on different systems (e.g., simple MuJoCo environments to complex systems such as quadrupeds and bipeds).
Related literature
Hafner, D., Lillicrap, T., Ba, J. and Norouzi, M., 2019. Dream to control: Learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603.
Hafner, D., Lillicrap, T., Norouzi, M. and Ba, J., 2020. Mastering atari with discrete world models. arXiv preprint arXiv:2010.02193.
Hafner, D., Pasukonis, J., Ba, J. and Lillicrap, T., 2023. Mastering diverse domains through world models. arXiv preprint arXiv:2301.04104.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Song, Y., Kim, S. and Scaramuzza, D., 2024. Learning Quadruped Locomotion Using Differentiable Simulation. arXiv preprint arXiv:2403.14864.
Li, C., Krause, A. and Hutter, M., 2025. Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics. arXiv preprint arXiv:2501.10100.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Organization Robotic Systems Lab
Hosts Li Chenhao
Topics Engineering and Technology
Supervised learning for loco-manipulation
For Spot arm operations, we propose a multi-phase approach combining supervised learning and reinforcement learning (RL). First, we will employ supervised learning to develop a model for solving inverse kinematics (IK), enabling precise joint-angle calculations from a desired end-effector pose. Next, we will use another supervised learning technique to build a collision avoidance model, trained to predict and avoid self-collisions based on arm configurations and environmental data. With these pre-trained networks, we will then integrate RL to generate dynamic and safe arm-motion plans. The RL agent will leverage the IK and collision avoidance models to optimize arm trajectories, ensuring efficient and collision-free movements. Because both pre-trained models are differentiable, gradients can be backpropagated through the entire pipeline, promising to enhance the accuracy, safety, and flexibility of robotic arm operations in complex environments.
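The supervised IK idea relies on a standard trick: sample joint configurations, run forward kinematics to obtain labeled end-effector poses, and train a regressor on the inverse mapping, so no analytic IK solver is needed. A sketch for a planar two-link arm (the link lengths, and the restricted sampling range that sidesteps multi-solution ambiguity, are illustrative assumptions, not Spot's kinematics):

```python
import numpy as np

def fk(q, l1=0.3, l2=0.25):
    """Planar two-link forward kinematics; link lengths are placeholders."""
    q = np.asarray(q, dtype=float)
    x = l1 * np.cos(q[..., 0]) + l2 * np.cos(q[..., 0] + q[..., 1])
    y = l1 * np.sin(q[..., 0]) + l2 * np.sin(q[..., 0] + q[..., 1])
    return np.stack([x, y], axis=-1)

# Supervised IK dataset: inputs X are end-effector poses, targets Y are
# the joint angles that produced them. A neural network fit on (X, Y)
# then approximates the inverse map pose -> joints.
rng = np.random.default_rng(0)
Y = rng.uniform([0.0, 0.1], [np.pi / 2, np.pi / 2], size=(5000, 2))
X = fk(Y)
```

The same label-by-simulation recipe extends to the collision avoidance model: sample arm configurations, check self-collision in simulation, and train a classifier on the results.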
Keywords
Spot, Supervised learning, loco-manipulation
Labels
Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
This project is set to limited visibility by its publisher. To see the project description you need to log in at SiROP. Please follow these instructions:
- Click link "Open this project..." below.
- Log in to SiROP using your university login or create an account to see the details.
If your affiliation is not created automatically, please follow these instructions: http://bit.ly/sirop-affiliate
More information
Open this project...
Published since: 2025-02-10 , Earliest start: 2025-02-10 , Latest end: 2026-03-01
Organization Robotic Systems Lab
Hosts Mirrazavi Sina
Topics Information, Computing and Communication Sciences
Model-Based Reinforcement Learning for Loco-manipulation
This project aims to develop a model-based reinforcement learning (RL) framework that enables quadruped robots to perform dynamic locomotion and manipulation simultaneously, leveraging advanced model-based RL algorithms such as DreamerV3, TD-MPC2, and SAM-RL. We will develop control policies that can predict future states and rewards, enabling the robot to adapt its behavior on the fly. The primary focus will be on achieving stable and adaptive walking patterns while reaching and grasping objects. The outcome will provide insights into the integration of complex behaviors in robotic systems, with potential applications in service robotics and automated object handling.
Labels
Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-10 , Earliest start: 2025-02-10 , Latest end: 2026-02-10
Organization Robotic Systems Lab
Hosts Mirrazavi Sina
Topics Information, Computing and Communication Sciences
Integrating OpenVLA for Vision-Language-Driven Loco-Manipulation robotics scenarios
This thesis proposes to integrate and adapt the OpenVLA (Open-Source Vision-Language-Action) model to control the Spot robotic arm for performing complex grasping and placing tasks. The study will focus on enabling the robot to recognize, grasp, and organize various toy-sized kitchen items based on human instructions. By leveraging OpenVLA's robust multimodal capabilities, this project aims to bridge the gap between human intent and robotic actions, enabling seamless task execution in unstructured environments. The research will explore the feasibility of fine-tuning OpenVLA for task-specific operations and evaluate its performance in real-world scenarios, providing valuable insights for advancing multimodal robotics.
Labels
Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-10 , Earliest start: 2025-02-10 , Latest end: 2026-02-10
Organization Robotic Systems Lab
Hosts Mirrazavi Sina
Topics Information, Computing and Communication Sciences
Differentiable Simulation for Precise End-Effector Tracking
Unlock the potential of differentiable simulation on ALMA, a quadrupedal robot equipped with a robotic arm. Differentiable simulation enables precise gradient-based optimization, promising greater tracking accuracy and efficiency compared to standard reinforcement learning approaches. This project dives into advanced simulation and control techniques, paving the way for improvements in robotic trajectory tracking.
Keywords
Differentiable Simulation, Learning, ALMA
Labels
Semester Project , Bachelor Thesis , Master Thesis
Description
Differentiable simulation [1] has demonstrated significant improvements in sample efficiency compared to traditional reinforcement learning approaches across various applications, including legged locomotion [2]. This project seeks to explore another key advantage of differentiable simulation: its capability for more precise optimization. The study will focus on a tracking task involving ALMA, a quadrupedal robot equipped with a robotic arm. The primary objectives are to develop a differentiable simulation environment for the robot and evaluate its advantages over traditional reinforcement learning methods. By utilizing the gradients provided by the simulation, control policies will be optimized to improve tracking performance. The work involves creating a tailored differentiable simulation, systematically comparing its performance with reinforcement learning techniques, and analyzing its impact on accuracy and real-world applicability. This project provides an opportunity to contribute to advanced research in robotics by combining theoretical insights with practical implementation.
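The core idea can be illustrated on a toy problem: with a differentiable rollout, the gradient of a tracking loss with respect to the actions is available directly, and plain gradient descent recovers an action sequence that reaches the target. A hand-differentiated 1-D point mass stands in for a full differentiable simulator of ALMA; all constants are illustrative:

```python
import numpy as np

def simulate(actions, dt=0.05, x0=0.0, v0=0.0):
    """Semi-implicit Euler rollout of a 1-D point mass; returns final position."""
    x, v = x0, v0
    for a in actions:
        v += a * dt
        x += v * dt
    return x

def grad_actions(actions, target, dt=0.05):
    """Closed-form gradient of the loss (x_N - target)^2 w.r.t. each
    action: dx_N/da_j = dt^2 * (N - j). This stands in for autodiff
    through a differentiable simulator."""
    N = len(actions)
    err = simulate(actions, dt) - target
    return 2.0 * err * dt**2 * (N - np.arange(N))

actions = np.zeros(20)
for _ in range(200):  # gradient descent on the tracking loss
    actions -= 5.0 * grad_actions(actions, target=1.0)
```

After optimization, `simulate(actions)` lands on the target using exact gradients from a single rollout per step, whereas a model-free RL baseline would need many sampled rollouts to estimate the same descent direction.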
References
- [1] H. J. Suh, M. Simchowitz, K. Zhang, and R. Tedrake, "Do differentiable simulators give better policy gradients?" in International Conference on Machine Learning. PMLR, 2022, pp. 20668–20696.
- [2] Schwarke, C., Klemm, V., Tordesillas, J., Sleiman, J. P., & Hutter, M. (2024). Learning Quadrupedal Locomotion via Differentiable Simulation. arXiv preprint arXiv:2404.02887
Work Packages
- Literature research
- Implementation of a differentiable simulation environment for ALMA
- Training and evaluation of tracking policies
Requirements
- Excellent knowledge of Python
- Background in Simulation or Learning
Contact Details
Please send your CV, transcript and a short motivation (4-5 sentences max.) to:
More information
Open this project...
Published since: 2025-02-07 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Mittal Mayank , Schwarke Clemens , Klemm Victor
Topics Information, Computing and Communication Sciences
Modeling and Simulation for Earthwork in Digital Twin
In this work, we aim to build a digital twin of our autonomous hydraulic excavator, leveraging Mathworks technology for high-fidelity modeling. This will be used in the future to test and benchmark our learning-based controllers.
Keywords
Modeling, Hydraulics, Excavation, Industry
Labels
Semester Project , Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-06 , Earliest start: 2025-03-03
Organization Robotic Systems Lab
Hosts Spinelli Filippo , Nan Fang
Topics Information, Computing and Communication Sciences , Engineering and Technology
Reinforcement Learning for Excavation Planning In Terra
We aim to develop a reinforcement learning-based global excavation planner that can plan for the long term and execute a wide range of excavation geometries. The system will be deployed on our legged excavator.
Keywords
Reinforcement learning, task planning
Labels
Semester Project , Master Thesis
Description
Reinforcement learning has demonstrated significant success in decision-making and behavior planning with discrete states and action spaces. In this project, we plan to develop and extend a global excavation planner responsible for selecting the next digging area and the actions required to move soil around the site. This requires long-term planning of the excavation sequence and an understanding of which areas are accessible and where the excavator could potentially become trapped. Using JAX, we developed a simulation environment, Terra [3], where agents can be trained in millions of parallel environments on multiple GPUs. The first part of the project will focus on modifying the simulation environment to have a continuous state space and to include 3D soil, simulating a real construction site. The second part will focus on deploying the system and integrating it with the current stack to dig geometries that have not been achieved so far.
References
- [1] PPO for LUX AI
- [2] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model, DeepMind
- [3] Terra: https://github.com/leggedrobotics/terra
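The massively parallel training setup can be pictured as a single batched environment step in which every operation is vectorized over the batch dimension; Terra does this with JAX on GPU, while the numpy sketch below, with a toy soil grid and four move actions, is illustrative only:

```python
import numpy as np

MOVES = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])  # action id -> (dx, dy)

def step_batch(agent_xy, soil, actions):
    """One vectorized step for a batch of toy grid-excavation environments.
    agent_xy: (B, 2) int positions, soil: (B, H, W) remaining soil,
    actions: (B,) int action ids. All per-environment updates happen
    with array indexing, never a Python loop over the batch."""
    B, H, W = soil.shape
    nxt = np.clip(agent_xy + MOVES[actions], [0, 0], [W - 1, H - 1])
    b = np.arange(B)
    reward = soil[b, nxt[:, 1], nxt[:, 0]].copy()  # soil removed at the new cell
    soil[b, nxt[:, 1], nxt[:, 0]] = 0.0
    return nxt, soil, reward
```

In Terra the same pattern is expressed with `jax.vmap` and JIT compilation, which is what makes millions of parallel environments practical on GPUs.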
Requirements
- Experience in PyTorch and training neural networks
- Experience with GPU-accelerated environments (preferred)
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Model Based Reinforcement Learning
We want to train an excavator agent to dig in a variety of soils using a fast, GPU-accelerated soil particle simulator in Isaac Sim.
Keywords
particle simulation, Omniverse, Warp, reinforcement learning, model-based reinforcement learning
Labels
Semester Project , Master Thesis
Description
Model-free reinforcement learning approaches (such as PPO) for training excavation agents in simulated environments struggle with computational demands, especially under realistic soil dynamics, so a simplified soil model has been needed to train successfully [1]. We have developed a particle-simulation soil model in NVIDIA Isaac Sim [2], which enhances realism but also increases the computational load, making it unsuitable for model-free RL algorithms like PPO, whose high sample requirements lead to slow training. This project aims to explore and implement model-based RL algorithms, such as the Dreamer algorithm [3], to train excavation agents efficiently in our particle simulator. Dreamer's predictive world models promise improved sample efficiency, potentially overcoming the computational challenges of our GPU-accelerated simulator.
[1] Egli, Pascal, et al. "Soil-Adaptive Excavation Using Reinforcement Learning." IEEE Robotics and Automation Letters 7.4 (2022): 9778-9785.
[2] NVIDIA Omniverse
[3] Mastering Diverse Domains through World Models, Danijar Hafner et al., arXiv 2023
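To make the core idea of Dreamer-style model-based RL concrete, here is a deliberately tiny illustration: a learned transition model is rolled forward to score candidate actions without ever querying the expensive simulator. The 1-D dynamics, reward, and candidate set are hypothetical stand-ins, not the project's actual models:

```python
# Toy illustration of planning inside a learned world model, the core idea
# behind Dreamer-style model-based RL. All quantities are hypothetical.

def world_model(state, action):
    # stands in for a learned transition model; here simply s' = s + a
    return state + action

def imagined_return(state, actions, goal):
    # roll the model forward "in imagination", never touching the simulator
    total = 0.0
    for a in actions:
        state = world_model(state, a)
        total -= abs(goal - state)   # dense reward: negative distance to goal
    return total

def plan(state, goal, horizon=3):
    # score a small set of constant-action candidates and pick the best
    candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
    return max(candidates,
               key=lambda a: imagined_return(state, [a] * horizon, goal))
```

The sample-efficiency argument is visible even here: each planning step costs only cheap model evaluations, while a model-free agent would need fresh simulator rollouts.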
Work Packages
Adapting Dreamer/model-based RL algorithm to interact effectively with our GPU-accelerated particle simulation.
Training and evaluating excavation agents and their world model within this framework.
Optimizing the algorithm and simulation settings to successfully train the excavator to scoop in a variety of soils.
Requirements
• Experience in training neural networks and RL
• Experience with ROS and good knowledge of Python
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-02-28 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Egli Pascal Arturo , Terenzi Lorenzo
Topics Information, Computing and Communication Sciences , Engineering and Technology
Reinforcement Learning for Particle-Based Excavation in Isaac Sim
We want to train RL agents in our new particle simulator, accelerated on the GPU via Warp in Isaac Sim.
Keywords
particle simulation, Omniverse, Warp, reinforcement learning
Labels
Semester Project , Master Thesis
Description
Training reinforcement-learning digging agents in simulation has so far only been possible with simplified soil models that compute the forces experienced at the shovel edge [1]. Strong domain randomization is necessary for sim-to-real transfer, but it has not yet been possible to simulate soil inhomogeneities and soil displacement fast enough. The latter, in particular, is essential for training agents that can displace loose soil. We want to use NVIDIA Isaac Sim, which features GPU-based high-performance simulation and is part of NVIDIA Omniverse [2]. The objective of this thesis is to further develop a particle-simulation soil model and use it to train an excavation agent.
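As a loose intuition for the bucket-soil interaction to be simulated, the sketch below treats soil as individual particles and "scoops" the ones inside the bucket's sweep. This is a hypothetical toy, far simpler than the Warp-based GPU simulation the project targets:

```python
# Very simplified particle "scooping" sketch (hypothetical; the project
# uses a Warp-based GPU particle simulator in Isaac Sim).

def scoop(particles, bucket_x, bucket_width, bucket_depth):
    """Remove particles inside the bucket's sweep; return the remaining
    particles and the number scooped."""
    remaining, scooped = [], 0
    for x, y in particles:
        if bucket_x <= x <= bucket_x + bucket_width and y <= bucket_depth:
            scooped += 1            # particle ends up in the bucket
        else:
            remaining.append((x, y))
    return remaining, scooped
```

A real soil model additionally has to resolve particle-particle contacts and displacement, which is exactly the part that is expensive to simulate fast enough.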
Work Packages
- Improve particle-based soil model in simulation using Warp and simulate bucket-soil interactions
- Train reinforcement learning agents to learn digging in the designed environment
Requirements
- Experience in training neural networks and RL
- Experience with ROS and good knowledge of Python
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-09-30
Organization Robotic Systems Lab
Hosts Egli Pascal Arturo , Mittal Mayank , Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Perceptive Reinforcement Learning for Excavation
In this project, our goal is to leverage precomputed embeddings (from a VAE in Isaac Sim) of 3D earthworks scene reconstructions to train reinforcement learning agents. These embeddings, derived from incomplete point-cloud data and reconstructed using an encoder-decoder neural network, will serve as latent representations. The main emphasis is on utilizing these representations to develop and train reinforcement learning policies for digging tasks.
Keywords
LIDAR, 3D reconstruction, Isaac Gym, deep learning, perception, reinforcement learning
Labels
Semester Project , Master Thesis
Description
Our excavator, M545, is equipped with a LIDAR that allows it to precisely perceive the 3D construction scene. However, occlusion caused by mounds of soil and irregularities in the terrain often prevents us from obtaining complete information about the environment, which makes arm planning and collision avoidance against the terrain more challenging. In this project, we aim to exploit the regularities found in typical earthworks sites to perform a 3D neural reconstruction of the scene from point-cloud data. To do this, we plan to collect data in simulation by procedurally generating a set of plausible terrains in Isaac Gym and using them to train a network that can reconstruct the 3D scene [1]. The latent representation of the network will then be used to develop reinforcement learning policies.
References: [1] Neural scene representation for locomotion on structured terrain [2] Self-Supervised Point Cloud Understanding via Mask Transformer and Contrastive Learning
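The procedural terrain generation mentioned above can be illustrated with classic midpoint displacement. The function below is a hypothetical 1-D toy; the actual terrains generated in Isaac Gym would be 2-D heightmaps or meshes:

```python
import random

# Sketch of procedural terrain generation for simulated LIDAR training data
# (hypothetical toy; the project generates full 3D terrains in Isaac Gym).

def midpoint_displacement(n_levels, roughness=0.5, seed=0):
    """1-D heightmap of length 2**n_levels + 1 via midpoint displacement:
    repeatedly insert randomly perturbed midpoints, halving the noise
    amplitude each level so large-scale structure dominates."""
    rng = random.Random(seed)
    heights = [0.0, 0.0]          # flat endpoints
    scale = 1.0
    for _ in range(n_levels):
        nxt = []
        for a, b in zip(heights, heights[1:]):
            mid = (a + b) / 2 + rng.uniform(-scale, scale)
            nxt.extend([a, mid])
        nxt.append(heights[-1])
        heights = nxt
        scale *= roughness
    return heights
```

Seeding the generator keeps the terrains reproducible, which matters when the same set of terrains is reused to compare reconstruction networks.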
Work Packages
- Sensor simulation and procedural environment generation
- Training of 3D sparse network
- Field deployment in a digging task
Requirements
- Experience in deep learning
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Höller David , Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Reinforcement Learning of Pretrained Transformer Models
We want to train RL agents in our new particle simulator, accelerated on the GPU via Warp in Isaac Sim.
Keywords
particle simulation, Omniverse, Warp, reinforcement learning
Labels
Semester Project , Master Thesis
Description
This project tackles the computational challenges of model-free on-policy reinforcement learning algorithms, such as Proximal Policy Optimization (PPO), in training excavation agents within simulated environments featuring realistic soil dynamics. To achieve this, we developed a particle-based soil simulation model using NVIDIA Isaac Sim, which enhances realism but significantly increases computational demands. This makes the model impractical for high-sample-demand algorithms like PPO on small MLPs, resulting in prolonged training times. To address this, the project explores RL fine-tuning of large pretrained decoder transformers, which are substantially more sample-efficient than small MLPs. The pretrained GPT model is trained on multiple earthworks tasks, including digging with a simplified soil model.
Work Packages
Training and evaluating excavation agents.
Optimizing the algorithm and simulation settings to successfully train the excavator to scoop in a variety of soils.
Requirements
• Experience in training neural networks and RL
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Multiagent Reinforcement Learning in Terra
We want to train multiple agents in the Terra environment, a fully end-to-end GPU-accelerated environment for RL training.
Keywords
multiagent reinforcement learning, jax, deep learning, planning
Labels
Semester Project , Master Thesis
Description
Construction sites require a variety of machinery to operate efficiently, including cranes, skid steers, backhoes, excavators, and trucks. Training a single Reinforcement Learning (RL) agent, such as an excavator, does not sufficiently simulate the complexities of a real-world construction workflow. In this project, we aim to enhance the Terra simulator [1] by integrating multiple types of agents to more realistically represent a construction site environment. The subsequent goal is to train these agents to collaborate effectively and complete specific earthwork projects successfully.
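A minimal flavour of the multi-agent setting: several agents share one grid and their moves must be resolved jointly. The collision rule below is a hypothetical placeholder, not Terra's actual mechanics:

```python
# Minimal sketch of jointly resolving moves for several agents on a shared
# grid (hypothetical; Terra's real multi-agent interface may differ).

def step_agents(positions, moves, width, height):
    """Apply one move per agent. A move is cancelled if it would leave
    the grid, or if two agents would occupy the same cell."""
    proposed = []
    for (r, c), (dr, dc) in zip(positions, moves):
        nr, nc = r + dr, c + dc
        if 0 <= nr < height and 0 <= nc < width:
            proposed.append((nr, nc))
        else:
            proposed.append((r, c))      # out of bounds: stay put
    # cancel colliding moves: each colliding agent falls back to its old cell
    resolved = []
    for i, p in enumerate(proposed):
        if proposed.count(p) > 1:
            resolved.append(positions[i])
        else:
            resolved.append(p)
    return resolved
```

Even this toy rule shows why cooperation has to be learned: greedy individual moves can deadlock, so agents benefit from policies that anticipate each other.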
Work Packages
- expand the Terra simulator with different agents
- train multiple agents to cooperate using RL
Requirements
- experience training neural networks and RL
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-09-16
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Propose Your Own Robotics Challenge
This project invites you to step into the role of an innovator, encouraging you to identify challenges you are passionate about within the field of robotics. Rather than working on predefined problems, you will have the freedom to propose your own project ideas, address real-world issues, or explore cutting-edge topics. This project allows you to define your own research journey.
Keywords
Robotics, Research
Labels
Semester Project , Bachelor Thesis , Master Thesis
Description
Robotics is a rapidly evolving field with countless opportunities for innovation. This project gives you the chance to identify a robotics challenge that excites you, propose your own ideas, and pursue a project tailored to your interests. Whether it’s improving locomotion, enhancing human-robot interaction, or designing novel robotic applications, you are encouraged to think critically and creatively about the problems you want to solve. The emphasis is on self-motivation and ownership, with support provided to turn ideas into actionable research projects. This approach not only builds technical skills but also cultivates a passion for tackling meaningful challenges.
Work Packages
- Literature research
- Implementation
- Scientific evaluation
Requirements
- Excellent knowledge in the required tools (programming language or software)
- Strong engineering foundation
- Project experience
Contact Details
Please send your CV, transcripts and a short proposal (4-5 sentences max.) to:
More information
Open this project...
Published since: 2025-01-28 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Schwarke Clemens , Bjelonic Filip , Klemm Victor
Topics Information, Computing and Communication Sciences
Data Driven Simulation for End-to-End Navigation
Investigate how neural rendering can become the backbone of comprehensive, next-generation, data-driven simulation.
Keywords
Neural rendering, Simulation
Labels
Internship , Master Thesis
Description
Simulation-based training of locomotion and environment-interaction policies has recently shown tremendous success in pushing the abilities of real-world robots. Using massive parallelization, simulation-based learning enables robots to quickly learn new skills without the time and hardware investments attached to trying things out in the real world. However, one remaining challenge is that such simulators currently focus on physics, while the simulation of perception readings is often limited to simple geometry. In order to support, for example, end-to-end vision-based models, we'd like to add realistic image rendering of complex, realistic environments to such simulators.
In this project, we'd like to explore a data-driven approach to add such capabilities to a simulator. Specifically, neural rendering methods have made large progress in recent years and their use in simulators for training and validation is now actively being investigated. Challenges that need to be addressed are given by 1) runtime considerations for efficient use inside a simulator, 2) artifact-free rendering of novel views, and 3) the imposition of physical constraints such as watertight meshes or the structural stability of static environment reconstructions.
The project is conducted at The AI Institute, a recently established top robotics research institute created by the founders of Boston Dynamics.
References
[1] Neuralangelo: High-Fidelity Neural Surface Reconstruction, CVPR 2023 [2] ViPlanner: Visual Semantic Imperative Learning for Local Navigation, ICRA 2024 [3] OmniRe: Omni Urban Scene Reconstruction, arxiv 2024 [4] 3D Gaussian Splatting for Real-Time Radiance Field Rendering, Siggraph 2023
Work Packages
- Literature research
- Adding existing rendering functionality to simulator (similar to [2])
- Incorporate gaussian splatting based rendering into simulator
- Improve gaussian splatting for the use in simulators (render quality, mesh extraction, …)
- Setup validation pipeline for simulator (validate end-to-end policy, or VIO, …)
Requirements
- Excellent knowledge of Python
- Computer vision experience
- Knowledge of neural rendering methods
Contact Details
Alexander Liniger (aliniger@theaiinstitute.com) Igor Bogoslavskyi (ibogoslavskyi@theaiinstitute.com)
Please include an up-to-date CV and transcript.
More information
Open this project...
Published since: 2025-01-24 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Information, Computing and Communication Sciences , Engineering and Technology
Evolving Minds: Neuroevolution for Legged Locomotion
This project explores the use of neuroevolution for optimizing control policies in legged robots, moving away from classical gradient-based methods like PPO. Neuroevolution directly optimizes network parameters and structures, potentially offering advantages in environments with sparse rewards, while requiring fewer hyperparameters to tune. By leveraging genetic algorithms and evolutionary strategies, the project aims to develop efficient controllers for complex locomotion tasks. With computational capabilities doubling approximately every two years as predicted by Moore's Law, neuroevolution offers a promising approach for scaling intelligent control systems.
Keywords
Evolutionary Algorithms, Reinforcement Learning, Quadrupeds, Legged Locomotion
Labels
Master Thesis
Description
Reinforcement learning for legged robots typically relies on policy gradient methods, which can struggle in environments with sparse rewards and require extensive hyperparameter tuning. This project investigates neuroevolution [1], an alternative approach where the neural network controllers are optimized through evolutionary processes. Neuroevolution allows simultaneous optimization of policy parameters and network architecture, potentially improving performance and simplifying tuning. With computational power continuously advancing, neuroevolution can exploit this trend to scale efficiently for complex robotics applications [2].
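The core loop of an evolution strategy (in the spirit of [2]) fits in a few lines: perturb the parameters, evaluate fitness, and step along the fitness-weighted noise. The quadratic fitness below is a hypothetical stand-in for a locomotion reward:

```python
import random

# Minimal evolution-strategies sketch: estimate a search gradient from
# random parameter perturbations and their fitness, then update. The
# quadratic "fitness" is a hypothetical stand-in for a locomotion reward.

def evolution_strategy(fitness, theta, iters=200, pop=20, sigma=0.1,
                       lr=0.05, seed=0):
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iters):
        noises, scores = [], []
        for _ in range(pop):
            eps = [rng.gauss(0, 1) for _ in theta]
            cand = [t + sigma * e for t, e in zip(theta, eps)]
            noises.append(eps)
            scores.append(fitness(cand))
        mean = sum(scores) / pop
        # search-gradient estimate: noise weighted by centred fitness
        for i in range(len(theta)):
            g = sum((s - mean) * n[i] for s, n in zip(scores, noises))
            theta[i] += lr * g / (pop * sigma)
    return theta

# maximise -(x - 3)^2; the optimum is at x = 3
best = evolution_strategy(lambda p: -(p[0] - 3.0) ** 2, [0.0])
```

Note the loop needs only fitness evaluations, no backpropagation, which is why it parallelizes so naturally and copes with sparse or non-differentiable rewards.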
References
- [1] Galván, Edgar, and Peter Mooney. "Neuroevolution in deep neural networks: Current trends and future challenges." IEEE Transactions on Artificial Intelligence
- [2] Salimans, Tim, et al. "Evolution strategies as a scalable alternative to reinforcement learning.", OpenAI
Work Packages
- Study previous applications in robotics and control
- Implement neuroevolution algorithms within the Isaac Gym/Isaac Lab frameworks
- Compare performance, sample efficiency, and robustness against PPO
Requirements
- Excellent knowledge of Python (C++)
- Background in RL and Learning Methods
Contact Details
Please send your CV, transcripts and a short motivation (4-5 sentences max.) to:
More information
Open this project...
Published since: 2025-01-22 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Bjelonic Filip , Schwarke Clemens
Topics Information, Computing and Communication Sciences
Design of a Compliant Mechanism for Human-Robot Collaborative Transportation with Non-Holonomic Robots
Human-robot collaboration is an attractive option in many industries for transporting long and heavy items with a single operator. In this project, we aim to enable human-robot collaborative transportation with a non-holonomic robotic base platform by designing a compliant manipulation mechanism, inspired by systems like the Omnid Mocobots.
Keywords
Human-robot collaboration, Collaborative transportation, Non-holonomic robot, Mobile manipulation
Labels
Master Thesis , ETH Zurich (ETHZ)
Description
To prevent long-term injuries, worker safety regulations limit the weight a human is allowed to carry. For instance, in many countries within the construction sector, workers are permitted to lift at most 25 kg at a time. Consequently, transporting objects exceeding this weight typically requires a second worker or the use of heavy construction equipment. This not only increases labor costs but also introduces logistical challenges.
Human-robot collaboration offers a promising solution by enabling a single operator to handle heavy and bulky items safely and efficiently. The Omnid Mocobots [1] have demonstrated success in this area by employing compliant manipulators with accurate force control on omnidirectional mobile bases. This setup allows operators to maneuver heavy objects effortlessly and serves as an inspiration for our project.
Building upon this concept, the goal of this project is to design and integrate a compliant manipulation mechanism onto "Smally", a skid-steer wheeled-legged mobile robot developed by Hilti. The manipulator will enable gravity compensation and smooth motion while handling heavy payloads. The project includes evaluating system requirements, mechanical design, fabrication, and validation through control algorithm development. The successful implementation aims to enhance worker safety, reduce labor costs, and increase efficiency in environments where full robot autonomy is challenging.
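For intuition on the gravity compensation the mechanism must provide, consider the simplest possible case of a single revolute joint holding a payload. The model and numbers are purely illustrative; the actual design is a parallel mechanism with force control:

```python
import math

# Back-of-the-envelope gravity-compensation sketch for one revolute joint
# carrying a point-mass payload (illustrative only; the real mechanism is
# a force-controlled parallel manipulator).

def gravity_compensation_torque(mass_kg, arm_length_m, joint_angle_rad,
                                g=9.81):
    """Holding torque so the payload feels weightless to the operator.
    joint_angle_rad = 0 means the arm is horizontal (the worst case)."""
    return mass_kg * g * arm_length_m * math.cos(joint_angle_rad)
```

For a 25 kg payload (the regulatory limit mentioned above) on a 1 m horizontal arm this is roughly 245 Nm, which gives a feel for the actuation loads the design must budget for.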
Work Packages
- Literature research
- Parallel manipulator mechanical design
- System integration on Smally
Requirements
- Mechatronics and system integration experience
- Understanding of parallel mechanisms
- Knowledge of C/C++
Contact Details
Please send your application with your CV and transcript to the following emails:
- Francesca Bray: frbray@ethz.ch
- Julien Kindle: jkindle@ethz.ch
More information
Open this project...
Published since: 2025-01-16 , Earliest start: 2024-07-08
Applications limited to ETH Zurich
Organization Robotic Systems Lab
Hosts Kindle Julien , Bray Francesca
Topics Information, Computing and Communication Sciences , Engineering and Technology
How to Touch: Exploring Tactile Representations for Reinforcement Learning
Developing and benchmarking tactile representations for dexterous manipulation tasks using reinforcement learning.
Keywords
Reinforcement Learning, Dexterous Manipulation, Tactile Sensing
Labels
Semester Project , Bachelor Thesis , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
This project is set to limited visibility by its publisher. To see the project description you need to log in at SiROP. Please follow these instructions:
- Click link "Open this project..." below.
- Log in to SiROP using your university login or create an account to see the details.
If your affiliation is not created automatically, please follow these instructions: http://bit.ly/sirop-affiliate
More information
Open this project...
Published since: 2025-01-08 , Earliest start: 2024-12-15 , Latest end: 2025-06-01
Applications limited to ETH Zurich
Organization Robotic Systems Lab
Hosts Bhardwaj Arjun , Zurbrügg René
Topics Information, Computing and Communication Sciences
BEV meets Semantic traversability
Enable Birds-Eye-View perception on autonomous mobile robots for human-like navigation.
Keywords
Semantic Traversability, Birds-Eye-View, Localization, SLAM, Object Detection
Labels
Master Thesis , ETH Zurich (ETHZ)
Description
Autonomous driving has made tremendous progress in recent years through innovations in learning-based methods [1]. An emerging enabler are Birds-Eye-View (BEV) methods, which allow vehicles to understand and reason about their surroundings in real time. In this project, we aim to transfer this research to autonomous mobile robots in real-world, human-inhabited environments. While the rules of navigation and traversability are well defined for autonomous driving, one exciting aspect of this project is finding analogous representations for everyday human environments. We would like to explore these methods on a range of robots including Spot, ANYmal, the Ultra Mobility Vehicle, and humanoids.
References [1] Liao, B., Chen, S., Zhang, Y., Jiang, B., Zhang, Q., Liu, W., ... & Wang, X. (2024). Maptrv2: An end-to-end framework for online vectorized hd map construction. International Journal of Computer Vision, 1-23. [2] Kim, Y., Lee, J. H., Lee, C., Mun, J., Youm, D., Park, J., & Hwangbo, J. (2024). Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy. arXiv preprint arXiv:2406.02989. [3] https://rpl-cs-ucl.github.io/STEPP/
This project is hosted at The AI institute in collaboration with RSL.
Work Packages
- Research latest BEV methods
- Develop BEV-inspired methods for mobile robotic semantic traversability.
- Deployment on real robots
Requirements
- Excellent knowledge of C++, Python
- Familiarity with learning framework, e.g. pytorch
- Experience with ROS2 is a plus
Contact Details
Email Abel (agawel@theaiinstitute.com) and Laurent (lkneip@theaiinstitute.com). Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2024-12-18 , Earliest start: 2025-01-15 , Latest end: 2025-10-31
Organization Robotic Systems Lab
Hosts Gawel Abel
Topics Information, Computing and Communication Sciences , Engineering and Technology
Scene graphs for robot navigation and reasoning
Elevate semantic scene graphs to a new level and perform semantically-guided navigation and interaction with real robots at The AI Institute.
Keywords
Scene graphs, SLAM, Navigation, Spacial Reasoning, 3D reconstruction, Semantics
Labels
Master Thesis , ETH Zurich (ETHZ)
Description
Human environments often adhere to implicit and explicit semantic structures that are easily understood by humans. For autonomous mobile robots to act in these environments, we aim to investigate how to represent this understanding of the environment. One technology that has gained popularity in recent years is the scene graph, which allows robots to spatially deconstruct the world into a graph with multiple levels of abstraction, where nodes represent places, rooms, objects, etc., and edges represent the relationships between them. In this project, we aim to elevate semantic scene graphs to a level that supports semantically-guided navigation and interaction. We would like to explore these methods on a range of robots including Spot, ANYmal, the Ultra Mobility Vehicle, and humanoids.
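The data structure itself is simple; the hard parts are building it from perception and querying it for navigation. A hypothetical minimal scene graph with a containment query might look like this (real frameworks such as Hydra [1] are far richer):

```python
# Minimal scene-graph sketch: typed nodes at several abstraction levels
# (room, object, ...) with labelled edges, plus one simple query.
# Hypothetical structure for illustration only.

class SceneGraph:
    def __init__(self):
        self.nodes = {}            # id -> {"type": ..., "label": ...}
        self.edges = []            # (parent_id, relation, child_id)

    def add_node(self, node_id, node_type, label):
        self.nodes[node_id] = {"type": node_type, "label": label}

    def add_edge(self, parent, relation, child):
        self.edges.append((parent, relation, child))

    def objects_in(self, room_id):
        """All object nodes linked to a room by a 'contains' edge."""
        return [c for p, rel, c in self.edges
                if p == room_id and rel == "contains"
                and self.nodes[c]["type"] == "object"]

g = SceneGraph()
g.add_node("kitchen", "room", "kitchen")
g.add_node("cup", "object", "coffee cup")
g.add_edge("kitchen", "contains", "cup")
```

Queries like `objects_in` are what "semantically-guided navigation" builds on: a goal such as "fetch the cup" reduces to graph traversal before any geometric planning happens.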
References
[1] Hughes, N., Chang, Y., & Carlone, L. (2022). Hydra: A real-time spatial perception system for 3D scene graph construction and optimization. arXiv preprint arXiv:2201.13360. [2] Honerkamp, D., Büchner, M., Despinoy, F., Welschehold, T., & Valada, A. (2024). Language-grounded dynamic scene graphs for interactive object search with mobile manipulation. IEEE Robotics and Automation Letters. [3] Gu, Q., Kuwajerwala, A., Morin, S., Jatavallabhula, K. M., Sen, B., Agarwal, A., ... & Paull, L. (2024, May). Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning. In 2024 IEEE International Conference on Robotics and Automation (ICRA) (pp. 5021-5028). IEEE.
This thesis will be hosted at The AI Institute in collaboration with RSL.
Work Packages
- Familiarization with latest scene graph frameworks
- Build new scene graph representations that seamlessly integrate with robot navigation
- Deployment on real robots
Requirements
- Excellent knowledge of C++, Python
- Familiarity with learning framework, e.g. pytorch
- Experience with ROS2 is a plus
Contact Details
Email Abel (agawel@theaiinstitute.com) and Alex (aliniger@theaiinstitute.com). Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2024-12-18 , Earliest start: 2025-01-15 , Latest end: 2025-10-31
Organization Robotic Systems Lab
Hosts Gawel Abel
Topics Information, Computing and Communication Sciences , Engineering and Technology
Digital Twin for Spot's Home
Motivation: Creating a digital twin of the robot's environment is crucial for several reasons:
1. Simulate different robots: test various robots in a virtual environment, saving time and resources.
2. Accurate evaluation: precisely assess robot interactions and performance.
3. Enhanced flexibility: easily modify scenarios to develop robust systems.
4. Cost efficiency: reduce costs by identifying issues in virtual simulations.
5. Scalability: replicate multiple environments for comprehensive testing.
Proposal: We propose to create a digital twin of our semantic environment, designed in your preferred graphics platform, so that reinforcement learning agents can be simulated in the digital environment, creating a unified evaluation platform for robotic tasks.
Keywords
Digital Twin, Robotics
Labels
Semester Project , Master Thesis
Contact Details
Requirements: experience with a Python deep learning framework, understanding of 3D scene and camera geometry.
Please send us a CV and transcript.
Dr. Hermann Blum (blumh@ethz.ch) Dr. Zuria Bauer (zbauer@ethz.ch) Tifanny Portela (tportela@ethz.ch) Jelena Trisovic (tjelena@ethz.ch)
More information
Open this project...
Published since: 2024-12-17 , Earliest start: 2025-01-05
Applications limited to University of Zurich , ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Computer Vision and Geometry Group
Hosts Blum Hermann , Portela Tifanny , Bauer Zuria, Dr. , Trisovic Jelena
Topics Information, Computing and Communication Sciences
KALLAX Benchmark: Evaluating Household Tasks
Motivation: There are three ways to evaluate robots for pick-and-place tasks at home:
1. Simulation setups: high reproducibility, but hard to simulate real-world complexities and perception noise.
2. Competitions: good for comparing overall systems, but require significant effort and can't be held frequently.
3. Custom lab setups: common, but lead to overfitting and lack comparability between labs.
Proposal: We propose using IKEA furniture to create standardized, randomized setups that researchers can easily replicate, e.g., a 4x4 KALLAX unit with varying door knobs and drawer positions, generating tasks like "move the cup from the upper right shelf into the black drawer." This prevents overfitting and allows for consistent evaluation across different labs.
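The randomized-setup idea can be illustrated with a seeded task generator. All vocabulary below is hypothetical; the benchmark itself would fix the exact naming scheme:

```python
import random

# Sketch of randomized pick-and-place task generation for a 4x4 KALLAX
# unit. The object/container vocabulary is hypothetical.

ROWS = ["upper", "upper middle", "lower middle", "lower"]
COLS = ["left", "middle left", "middle right", "right"]
OBJECTS = ["cup", "book", "box"]
CONTAINERS = ["black drawer", "white drawer", "shelf with the round knob"]

def sample_task(seed):
    rng = random.Random(seed)     # seeded, so different labs can replicate
    src = f"{rng.choice(ROWS)} {rng.choice(COLS)} shelf"
    obj = rng.choice(OBJECTS)
    dst = rng.choice(CONTAINERS)
    return f"move the {obj} from the {src} into the {dst}"
```

Publishing only the seeds (rather than fixed task lists) is what prevents labs from overfitting to one particular arrangement while keeping evaluations comparable.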
Keywords
Benchmark, Robotics, pick-and-place
Labels
Semester Project , Master Thesis
Contact Details
Requirements: experience with a Python deep learning framework, understanding of 3D scene and camera geometry.
Please send us a CV and transcript. Dr. Hermann Blum (blumh@ethz.ch) René Zurbrügg (zrene@ethz.ch) Dr. Zuria Bauer (zbauer@ethz.ch)
More information
Open this project...
Published since: 2024-12-17 , Earliest start: 2025-01-06
Applications limited to University of Zurich , ETH Zurich , Swiss National Science Foundation , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Computer Vision and Geometry Group
Hosts Blum Hermann , Bauer Zuria, Dr. , Zurbrügg René
Topics Information, Computing and Communication Sciences
Visual Language Models for Long-Term Planning
This project uses Visual Language Models (VLMs) for high-level planning and supervision in construction tasks, enabling task prioritization, dynamic adaptation, and multi-robot collaboration for excavation and site management.
Keywords
Visual Language Models, Long-term planning, Robotics
Labels
Semester Project , Master Thesis
Description
VLMs excel in reasoning and dynamic code generation, making them ideal for tasks like excavation sequencing, obstacle management, and multi-robot coordination. Applications include dynamic trenching, rock field clearing, and safety monitoring. The goal is to deploy VLM-based systems on autonomous excavators to enhance efficiency and adaptability.
Work Packages
Develop simulated and real scenarios for VLM-driven planning.
Integrate VLMs into excavation control systems for triggering tasks and code generation.
Benchmark performance in complex planning scenarios.
Contact Details
More information
Open this project...
Published since: 2024-12-06 , Earliest start: 2025-03-31 , Latest end: 2025-10-29
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Diffusion-based Shared Autonomy System for Telemanipulation
Robots may not be able to complete tasks fully autonomously in unstructured or unseen environments, while direct teleoperation is hampered by limited situational awareness and degraded communication. In this project, we aim to develop a diffusion-based shared autonomy framework for teleoperation of manipulator arms that assists non-expert users and maintains performance under degraded communication.
Keywords
Imitation learning, Robotics, Manipulation, Teleoperation
Labels
Semester Project , ETH Zurich (ETHZ)
Description
Robots may not be able to complete tasks fully autonomously in unstructured or unseen environments, however direct teleoperation from human operators may also be challenging due to the difficulty of providing full situational awareness to the operator as well as degradation in communication leading to the loss of control authority. This motivates the use of shared autonomy for assisting the operator thereby enhancing the performance during the task.
In this project, we aim to develop a shared autonomy framework for teleoperation of manipulator arms, to assist non-expert users or in the presence of degraded communication. Imitation learning, such as diffusion models, have emerged as a popular and scalable approach for learning manipulation tasks [1, 2]. Additionally, recent works have combined this with partial diffusion to enable shared autonomy [3]. However, the tasks were restricted to simple 2D domains. In this project, we wish to extend previous work in the lab using diffusion-based imitation learning, to enable shared autonomy for non-expert users to complete unseen tasks or in degraded communication environments.
References
- [1] Zhao, T.Z., Kumar, V., Levine, S. and Finn, C., 2023. Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705
- [2] Chi C, Xu Z, Feng S, et al. Diffusion policy: Visuomotor policy learning via action diffusion. The International Journal of Robotics Research. 2024;0(0). doi:10.1177/02783649241273668
- [3] Yoneda, T., Sun, L., Stadie, B. and Walter, M., 2023. To the noise and back: Diffusion for shared autonomy. arXiv preprint arXiv:2302.12244.
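Conceptually, shared autonomy arbitrates between the operator's command and the policy's suggestion; partial diffusion [3] does this by only partially noising and then denoising the human action. The convex blend below is a much simpler, hypothetical stand-in for that arbitration:

```python
# Hypothetical shared-autonomy arbitration: a per-dimension convex blend
# of the human command and the learned policy's suggestion. Far simpler
# than the partial-diffusion formulation in [3].

def blend_action(human_action, policy_action, confidence):
    """confidence in [0, 1] is the weight given to the autonomous policy;
    0 = pure teleoperation, 1 = fully autonomous."""
    assert 0.0 <= confidence <= 1.0
    return [(1 - confidence) * h + confidence * p
            for h, p in zip(human_action, policy_action)]
```

In a degraded-communication setting, the confidence weight could be raised as operator commands become stale, letting the policy carry more of the task; that scheduling, however, is exactly what the project would need to design and evaluate.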
Work Packages
- Define the tasks that will be performed using the manipulator
- Set up data collection and training pipeline
- Generate dataset of task specific demonstrations and train a diffusion policy
- Transfer this policy to the robot
- Set up and run experiments to evaluate shared-autonomy performance
Requirements
- Previous experience with machine learning
- Strong software development skills and experience with Python and PyTorch
- Background in Robotics
- (Optional) Experience with ROS and C++
Contact Details
Please send a mail to earavind@ethz.ch with the Subject: "Application - Your Name - Diffusion-based Shared Autonomy System for Telemanipulation" with:
- BS/MS Transcript of Records
- CV
- Short motivation for your interest in the project
More information
Open this project...
Published since: 2024-12-02 , Earliest start: 2024-11-01 , Latest end: 2025-11-01
Applications limited to ETH Zurich , University of Zurich
Organization Robotic Systems Lab
Hosts Elanjimattathil Aravind
Topics Information, Computing and Communication Sciences , Engineering and Technology
Lifelike Agility on ANYmal by Learning from Animals
The remarkable agility of animals, characterized by their rapid, fluid movements and precise interaction with their environment, serves as an inspiration for advancements in legged robotics. Recent progress in the field has underscored the potential of learning-based methods for robot control. These methods streamline the development process by optimizing control mechanisms directly from sensory inputs to actuator outputs, often employing deep reinforcement learning (RL) algorithms. By training in simulated environments, these algorithms can develop locomotion skills that are subsequently transferred to physical robots. Although this approach has led to significant achievements in achieving robust locomotion, mimicking the wide range of agile capabilities observed in animals remains a significant challenge. Traditionally, manually crafted controllers have succeeded in replicating complex behaviors, but their development is labor-intensive and demands a high level of expertise in each specific skill. Reinforcement learning offers a promising alternative by potentially reducing the manual labor involved in controller development. However, crafting learning objectives that lead to the desired behaviors in robots also requires considerable expertise, specific to each skill.
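The learning-from-animals approach above typically scores the policy by how closely it tracks reference motions. A minimal DeepMimic-style pose-tracking reward, with most terms omitted, might look like:

```python
import numpy as np

def imitation_reward(q, q_ref, sigma=0.5):
    """DeepMimic-style pose-tracking term (Peng et al., 2018), simplified:
    an exponentiated negative distance between the robot's joint positions
    and the reference animal pose at the same motion phase. The full
    objective also weights velocity and end-effector terms, omitted here;
    sigma is an illustrative scale, not a tuned value."""
    q, q_ref = np.asarray(q, dtype=float), np.asarray(q_ref, dtype=float)
    return float(np.exp(-np.sum((q - q_ref) ** 2) / (2.0 * sigma ** 2)))
```

The reward is 1 when the pose matches the reference exactly and decays smoothly with tracking error, which is what makes it usable as a dense RL objective.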
Keywords
learning from demonstrations, imitation learning, reinforcement learning
Labels
Master Thesis
Description
Work packages
Literature research
Skill development from an animal dataset (available)
Hardware deployment
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature This project and the following literature will make you a master in imitation/demonstration/expert learning.
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J. and Levine, S., 2020. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784.
Peng, X.B., Ma, Z., Abbeel, P., Levine, S. and Kanazawa, A., 2021. Amp: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (ToG), 40(4), pp.1-20.
Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K. and Abbeel, P., 2022, October. Adversarial motion priors make good substitutes for complex reward functions. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 25-32). IEEE.
Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.
Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G. and Peng, X.B., 2023, July. Calm: Conditional adversarial latent models for directable virtual characters. In ACM SIGGRAPH 2023 Conference Proceedings (pp. 1-9).
Starke, Sebastian, et al. "Deepphase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Han, L., Zhu, Q., Sheng, J., Zhang, C., Li, T., Zhang, Y., Zhang, H., Liu, Y., Zhou, C., Zhao, R. and Li, J., 2023. Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Victor Klemm
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao , Klemm Victor
Topics Information, Computing and Communication Sciences
Pushing the Limit of Quadruped Running Speed with Autonomous Curriculum Learning
The project aims to explore curriculum learning techniques to push the limits of quadruped running speed using reinforcement learning. By systematically designing and implementing curricula that guide the learning process, the project seeks to develop a quadruped controller capable of achieving the fastest possible forward locomotion. This involves not only optimizing the learning process but also ensuring the robustness and adaptability of the learned policies across various running conditions.
Keywords
curriculum learning, fast locomotion
Labels
Master Thesis
Description
Quadruped robots have shown remarkable versatility in navigating diverse terrains, demonstrating capabilities ranging from basic locomotion to complex maneuvers. However, achieving high-speed forward locomotion remains a challenging task due to the intricate dynamics and control requirements involved. Traditional reinforcement learning (RL) approaches have made significant strides in this area, but they often face issues related to sample efficiency, convergence speed, and stability when applied to tasks with high degrees of freedom like quadruped locomotion.
Curriculum learning (CL), a concept inspired by the way humans and animals learn progressively from simpler to more complex tasks, offers a promising solution to these challenges. In the context of reinforcement learning, curriculum learning involves structuring the learning process by starting with simpler tasks and gradually increasing the complexity as the agent's proficiency improves. This approach can lead to faster convergence and better generalization by enabling the agent to build foundational skills before tackling more difficult scenarios.
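As a toy illustration of such a curriculum for running speed, one could raise the commanded velocity whenever the policy reliably tracks the current target. The thresholds and step sizes below are arbitrary placeholders, not a proposed design:

```python
class VelocityCurriculum:
    """Toy automatic curriculum: raise the commanded forward velocity
    whenever the recent tracking success rate clears a threshold.
    All numbers are illustrative placeholders."""

    def __init__(self, v_init=0.5, v_max=6.0, step=0.25, threshold=0.8, window=100):
        self.target_velocity = v_init
        self.v_max, self.step, self.threshold, self.window = v_max, step, threshold, window
        self.successes = []

    def record(self, tracked_ok):
        """Log one episode outcome; advance the target once a full window clears the bar."""
        self.successes.append(bool(tracked_ok))
        if len(self.successes) >= self.window:
            rate = sum(self.successes) / len(self.successes)
            if rate >= self.threshold:
                self.target_velocity = min(self.target_velocity + self.step, self.v_max)
            self.successes.clear()

curriculum = VelocityCurriculum()
for _ in range(100):            # the policy tracks 0.5 m/s reliably ...
    curriculum.record(True)     # ... so the target advances by one step
```

An autonomous curriculum in the sense of this project would replace the hand-picked threshold and step with quantities estimated from the agent's own learning signal.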
Work packages
Literature research
Development of autonomous curriculum
Comparison with baselines (no curriculum, hand-crafted curriculum)
Requirements
Strong programming skills in Python
Experience in reinforcement learning
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature This project and the following literature will make you a master in curriculum/active/open-ended learning.
Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE transactions on evolutionary computation, 11(2), pp.265-286.
Baranes, A. and Oudeyer, P.Y., 2009. R-iac: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.
Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired open-ended trailblazer (poet): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.
Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.
Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep rl in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.
Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Marco Bagatella
More information
Open this project...
Published since: 2024-11-26
Organization Robotic Systems Lab
Hosts Li Chenhao , Bagatella Marco
Topics Engineering and Technology
Humanoid Locomotion Learning and Finetuning from Human Feedback
In the burgeoning field of deep reinforcement learning (RL), agents autonomously develop complex behaviors through a process of trial and error. Yet, the application of RL across various domains faces notable hurdles, particularly in devising appropriate reward functions. Traditional approaches often resort to sparse rewards for simplicity, though these prove inadequate for training efficient agents. Consequently, real-world applications may necessitate elaborate setups, such as employing accelerometers for door interaction detection, thermal imaging for action recognition, or motion capture systems for precise object tracking. Despite these advanced solutions, crafting an ideal reward function remains challenging due to the propensity of RL algorithms to exploit the reward system in unforeseen ways. Agents might fulfill objectives in unexpected manners, highlighting the complexity of encoding desired behaviors, like adherence to social norms, into a reward function. An alternative strategy, imitation learning, circumvents the intricacies of reward engineering by having the agent learn through the emulation of expert behavior. However, acquiring a sufficient number of high-quality demonstrations for this purpose is often impractically costly. Humans, in contrast, learn with remarkable autonomy, benefiting from intermittent guidance from educators who provide tailored feedback based on the learner's progress. This interactive learning model holds promise for artificial agents, offering a customized learning trajectory that mitigates reward exploitation without extensive reward function engineering. The challenge lies in ensuring the feedback process is both manageable for humans and rich enough to be effective. Despite its potential, the implementation of human-in-the-loop (HiL) RL remains limited in practice. 
Our research endeavors to significantly lessen the human labor involved in HiL learning, leveraging both unsupervised pre-training and preference-based learning to enhance agent development with minimal human intervention.
Keywords
reinforcement learning from human feedback, preference learning
Labels
Master Thesis
Description
Work packages
Literature research
Reinforcement learning from human feedback
Preference learning
Requirements
Strong programming skills in Python
Experience in reinforcement learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Christiano, Paul F., et al. "Deep reinforcement learning from human preferences." Advances in neural information processing systems 30 (2017).
Lee, Kimin, Laura Smith, and Pieter Abbeel. "Pebble: Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training." arXiv preprint arXiv:2106.05091 (2021).
Wang, Xiaofei, et al. "Skill preferences: Learning to extract and execute robotic skills from human feedback." Conference on Robot Learning. PMLR, 2022.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning." arXiv preprint arXiv:2402.13820 (2024).
Goal
The goal of the project is to learn and finetune humanoid locomotion policies using reinforcement learning from human feedback. The challenge lies in learning effective reward models from an efficient representation of motion clips, as opposed to single-state frames. The tentative pipeline works as follows:
A self-supervised motion representation pretraining phase that learns efficient trajectory representations, potentially using Fourier Latent Dynamics, with data generated by some initial policies.
Reward learning from human feedback, conditioned on the trajectory representation learned in the first step. Human preference from visualizing the motions is thus embedded in this latent trajectory representation.
Policy training with the learned reward. The trajectories induced by the learned policy are then used to augment the training set for the first two steps, and the process repeats.
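The reward-learning step above is commonly formulated as a Bradley-Terry preference model, as in Christiano et al. A minimal sketch with linear rewards on (assumed) trajectory embeddings, standing in for the latent representations of step 1:

```python
import numpy as np

def preference_loss(w, emb_a, emb_b, prefs):
    """Bradley-Terry preference loss: a trajectory's reward is a linear score
    of its latent embedding, and P(a preferred over b) = sigmoid(r_a - r_b)."""
    p = 1.0 / (1.0 + np.exp(-((emb_a - emb_b) @ w)))
    return -np.mean(prefs * np.log(p + 1e-9) + (1 - prefs) * np.log(1 - p + 1e-9))

def fit_reward(emb_a, emb_b, prefs, lr=0.5, iters=200):
    """Plain gradient descent on the linear reward weights (illustrative only)."""
    d = emb_a - emb_b
    w = np.zeros(emb_a.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(d @ w)))
        w -= lr * d.T @ (p - prefs) / len(prefs)
    return w

# Synthetic sanity check: preferences generated by a hidden reward direction.
rng = np.random.default_rng(0)
emb_a, emb_b = rng.standard_normal((64, 4)), rng.standard_normal((64, 4))
true_w = np.array([1.0, -1.0, 0.5, 0.0])
prefs = ((emb_a - emb_b) @ true_w > 0).astype(float)
w_fit = fit_reward(emb_a, emb_b, prefs)
```

In the actual pipeline the linear score would be replaced by a network over the motion-clip embedding, but the preference likelihood and its gradient have the same shape.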
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Xin Chen
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao , Chen Xin
Topics Information, Computing and Communication Sciences , Engineering and Technology
Online Safe Locomotion Learning in the Wild
Reinforcement learning (RL) can potentially solve complex problems in a purely data-driven manner. Still, the state of the art in applying RL to robotics relies heavily on high-fidelity simulators. While learning in simulation circumvents the sample-complexity challenges that are common in model-free RL, even a slight distribution shift ("sim-to-real gap") between simulation and the real system can cause these algorithms to fail. Recent advances in model-based reinforcement learning have led to superior sample efficiency, enabling online learning without a simulator. Nonetheless, online learning must not cause any damage and should adhere to safety requirements (for obvious reasons). The proposed project aims to demonstrate how existing safe model-based RL methods can be used to address these challenges.
Keywords
safe model-based RL, online learning, legged robotics
Labels
Master Thesis
Description
The project aims to answer the following research questions:
How to model safe locomotion tasks for a real robotic system as a constrained RL problem? Can we use existing methods such as the one proposed by @as2022constrained to safely learn effective locomotion policies?
Answering the above questions will encompass hands-on experience with a real robotic system (such as ANYmal) together with learning to implement and test cutting-edge RL methods. As RL on real hardware is not yet fully explored, we expect to unearth various challenges concerning the effectiveness of our methods in the online learning setting. Accordingly, an equally important goal of the project is to accurately identify these challenges and propose methodological improvements that can help address them.
A starting point would be to create a model of a typical locomotion task in Isaac Orbit as a proof-of-concept. Following that, the second part of the project will be dedicated to extending the proof-of-concept to a real system.
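One standard way to cast safe locomotion as a constrained RL problem is a Lagrangian relaxation: the policy maximizes reward minus a cost penalty, while a dual variable grows whenever the measured constraint cost exceeds its budget. The following is an illustrative simplification of that dual update, not necessarily the method of the cited work:

```python
def lagrangian_step(lmbda, avg_cost, cost_limit, lr=0.05):
    """Dual update of a Lagrangian-relaxed constrained RL problem
    (illustrative): the policy elsewhere maximizes reward - lmbda * cost,
    while lmbda rises when the measured episode cost exceeds its budget
    and is projected back to stay non-negative."""
    return max(lmbda + lr * (avg_cost - cost_limit), 0.0)

lmbda = 0.0
for _ in range(10):  # constraint violated for 10 updates -> penalty grows
    lmbda = lagrangian_step(lmbda, avg_cost=1.5, cost_limit=1.0)
```

When the constraint is satisfied, the update drives the multiplier back toward zero, so the penalty only persists while the policy is actually unsafe.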
Contact Details
If you are a Master's student with
- basic knowledge of reinforcement learning, for instance from the Probabilistic Artificial Intelligence or Foundations of Reinforcement Learning courses, and
- a strong background in robotics and programming (C++, ROS),
please reach out to Yarden As (yarden.as@inf.ethz.ch) or Chenhao Li (chenhao.li@inf.ethz.ch). Feel free to share any previous materials, such as public code you wrote, that could help demonstrate the above requirements.
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Engineering and Technology
Autonomous Curriculum Learning for Increasingly Challenging Tasks
While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process. Consider the realm of legged locomotion: training a robot via reinforcement learning to track a velocity command illustrates this concept. Initially, tracking a low velocity is simpler due to algorithm initialization and environmental setup. By manually crafting a curriculum, we can start with low-velocity targets and incrementally increase them as the robot demonstrates competence. This method works well when the difficulty correlates clearly with the target, as with higher velocities or more challenging terrains. However, challenges arise when the relationship between task difficulty and control parameters is unclear. For instance, if a parameter dictates various human dance styles for the robot to mimic, it is not obvious whether jazz is easier than hip-hop. In such scenarios, the difficulty distribution does not align with the control parameter. How, then, can we devise an effective curriculum? In the conventional RSL training setting for locomotion over challenging terrains, a handcrafted learning schedule likewise dictates increasingly hard terrain levels, unified across multiple different terrain types. With a smart autonomous curriculum learning algorithm, can we progress through separate terrain types asynchronously and thus achieve better overall performance or higher data efficiency?
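One family of teacher algorithms from the literature below sidesteps the unclear difficulty ordering by sampling tasks in proportion to absolute learning progress rather than assumed difficulty. A toy sketch, where window handling and the exploration bonus are placeholder choices:

```python
import numpy as np

def sample_task(recent_returns, rng=None, eps=0.1):
    """Learning-progress teacher sketch (in the spirit of Oudeyer et al.):
    sample terrain types in proportion to the absolute change in their
    recent returns, so both mastered and currently hopeless terrains are
    sampled less. `recent_returns` maps task id -> returns, oldest first."""
    rng = rng or np.random.default_rng(0)
    tasks = list(recent_returns)
    progress = []
    for t in tasks:
        r = recent_returns[t]
        half = len(r) // 2
        # Absolute learning progress: change between the window's two halves.
        progress.append(abs(np.mean(r[half:]) - np.mean(r[:half])) if half else 0.0)
    p = np.array(progress) + eps   # eps keeps some probability on every task
    p = p / p.sum()
    return tasks[rng.choice(len(tasks), p=p)], dict(zip(tasks, p))

task, probs = sample_task({
    "stairs": [0.1, 0.2, 0.4, 0.6],   # still improving -> sampled most
    "flat":   [0.9, 0.9, 0.9, 0.9],   # mastered -> low progress
    "gaps":   [0.0, 0.0, 0.0, 0.0],   # no signal yet -> low progress
})
```

Because the criterion depends only on the agent's own return history, it applies equally to terrains with an obvious difficulty axis and to parameters, like dance styles, without one.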
Keywords
curriculum learning, open-ended learning, self-evolution, progressive task solving
Labels
Master Thesis
Description
Work packages
Literature research
Development of autonomous curriculum
Comparison with baselines (no curriculum, hand-crafted curriculum)
Requirements
Strong programming skills in Python
Experience in reinforcement learning
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature This project and the following literature will make you a master in curriculum/active/open-ended learning.
Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE transactions on evolutionary computation, 11(2), pp.265-286.
Baranes, A. and Oudeyer, P.Y., 2009. R-iac: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.
Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired open-ended trailblazer (poet): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.
Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.
Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep rl in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.
Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Marco Bagatella
More information
Open this project...
Published since: 2024-11-26
Organization Robotic Systems Lab
Hosts Li Chenhao , Bagatella Marco
Topics Engineering and Technology
Humanoid Locomotion Learning with Human Motion Priors
Humanoid robots, designed to replicate human structure and behavior, have made significant strides in kinematics, dynamics, and control systems. Research aims to develop robots capable of performing tasks in human-centric settings, from simple object manipulation to navigating complex terrains. Reinforcement learning (RL) has proven to be a powerful method for enabling robots to learn from their environment, enhancing their performance over time without explicit programming for every possible scenario. In the realm of humanoid robotics, RL is used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. However, one of the primary challenges is the high dimensionality of the action space, where handcrafted reward functions fall short of generating natural, lifelike motions. Incorporating motion priors into the learning process of humanoid robots addresses these challenges effectively. Motion priors can significantly reduce the exploration space in RL, leading to faster convergence and reduced training time. They ensure that learned policies prioritize stability and safety, reducing the risk of unpredictable or hazardous actions. Additionally, motion priors guide the learning process towards more natural, human-like movements, improving the robot's ability to perform tasks intuitively and seamlessly in human environments. Therefore, motion priors are crucial for efficient, stable, and realistic humanoid locomotion learning, enabling robots to better navigate and interact with the world around them.
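Adversarial motion priors (AMP, Peng et al. 2021) are one common way to inject such priors: a discriminator trained on reference-motion transitions supplies a style reward that is blended with the task reward. A sketch of the reward shaping, assuming the discriminator is trained elsewhere and the blending weight is a placeholder:

```python
def style_reward(d_logit):
    """AMP-style style reward (Peng et al., 2021): the discriminator's output
    on a policy transition is squashed into [0, 1] via
    max(0, 1 - 0.25 * (d - 1)^2); the discriminator itself is assumed to be
    trained elsewhere on reference human-motion data."""
    return max(0.0, 1.0 - 0.25 * (d_logit - 1.0) ** 2)

def total_reward(task_r, d_logit, w_style=0.5):
    # Blend the task objective with the motion prior; w_style trades
    # task performance against naturalness (placeholder value).
    return task_r + w_style * style_reward(d_logit)
```

Transitions the discriminator scores as reference-like (logit near 1) earn the full style bonus, while implausible motions earn none, which biases exploration toward human-like movement without hand-crafted gait terms.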
Keywords
motion priors, humanoid, reinforcement learning, representation learning
Labels
Master Thesis
Description
Work packages
Literature research
Human motion capture and retargeting
Skill space development
Hardware validation encouraged upon availability
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J. and Levine, S., 2020. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784.
Peng, X.B., Ma, Z., Abbeel, P., Levine, S. and Kanazawa, A., 2021. Amp: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (ToG), 40(4), pp.1-20.
Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K. and Abbeel, P., 2022, October. Adversarial motion priors make good substitutes for complex reward functions. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 25-32). IEEE.
Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.
Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G. and Peng, X.B., 2023, July. Calm: Conditional adversarial latent models for directable virtual characters. In ACM SIGGRAPH 2023 Conference Proceedings (pp. 1-9).
Starke, Sebastian, et al. "Deepphase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Han, L., Zhu, Q., Sheng, J., Zhang, C., Li, T., Zhang, Y., Zhang, H., Liu, Y., Zhou, C., Zhao, R. and Li, J., 2023. Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Information, Computing and Communication Sciences
AI Agents for Excavation Planning
Recent advancements in AI, particularly with models like Claude 3.7 Sonnet, have showcased enhanced reasoning capabilities. This project aims to harness such models for excavation planning tasks, drawing parallels from complex automation scenarios in games like Factorio. We will explore the potential of these AI agents to plan and optimize excavation processes, transitioning from simulated environments to real-world applications with our excavator robot.
Keywords
GPT, Large Language Models, Robotics, Deep Learning, Reinforcement Learning
Labels
Semester Project , Master Thesis
Description
The evolution of large language models (LLMs) has opened new avenues in automation and planning. Notably, Claude 3.7 Sonnet introduces hybrid reasoning, enabling both rapid responses and detailed, step-by-step problem-solving [Anthropic, 2025]. Such capabilities position these models as potential candidates for tasks requiring intricate planning, such as excavation, with a possible final deployment in the real world on our excavator robot. Excavation planning is a challenging problem requiring spatial reasoning, decision-making under constraints, and long-horizon planning. Recent advances in AI have led to agents that can master complex games like Go and navigate automation-heavy environments. This project aims to determine whether these AI systems can efficiently plan excavation tasks and how they compare to reinforcement learning-based approaches.
Terra is a flexible, JAX-accelerated grid-world environment designed for training AI agents in earthworks planning. It allows for high-level motion and excavation planning, formulated as a reinforcement learning (RL) problem. Terra's multi-GPU capabilities enable rapid training, achieving intelligent excavation planning in minutes on high-end hardware.
First, we will test the zero-shot capabilities of state-of-the-art LLMs and agents in excavation planning. By providing structured prompts and game renderings, we will evaluate whether these models can reason effectively about excavation tasks.
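A zero-shot evaluation of this kind mainly needs a textual rendering of the grid and robust parsing of the model's free-form reply. The grid symbols and action vocabulary below are hypothetical, not Terra's actual interface:

```python
def render_grid_prompt(grid):
    """Hypothetical prompt builder for the zero-shot evaluation: serialize a
    Terra-style grid world as text so an LLM can propose the next high-level
    action. The cell encoding and action set are illustrative only."""
    legend = {0: ".", 1: "#", 2: "E"}   # free cell, must-dig cell, excavator
    rows = "\n".join("".join(legend[c] for c in row) for row in grid)
    return (
        "You control excavator E on the grid below. '#' cells must be dug.\n"
        f"{rows}\n"
        "Reply with exactly one action from: MOVE_UP, MOVE_DOWN, MOVE_LEFT, "
        "MOVE_RIGHT, DIG."
    )

def parse_action(reply, vocab=("MOVE_UP", "MOVE_DOWN", "MOVE_LEFT", "MOVE_RIGHT", "DIG")):
    # Extract the first valid action token from a free-form model reply.
    for token in reply.replace(",", " ").split():
        if token.strip(".") in vocab:
            return token.strip(".")
    return None

prompt = render_grid_prompt([[0, 1], [2, 0]])
```

Keeping the environment-to-text and text-to-action boundary this thin makes it easy to swap in different LLMs, or to replace the LLM with an RL policy for the comparison the project proposes.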
Work Packages
- Design a pipeline that enables modern AI agents and large language models (LLMs) to play the excavation planning game in Terra.
- Evaluate whether models like Claude 3.7 Sonnet or GPT-4 can solve excavation tasks zero-shot or require fine-tuning.
- Train AI models with reinforcement learning using Terra's multi-GPU acceleration.
- Deploy the trained models onto a real-world robotic excavator for autonomous excavation.
Requirements
- General programming experience with Python
- Experience training neural networks
- Bonus: experience with large language models
Contact Details
More information
Open this project...
Published since: 2024-11-21 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Engineering and Technology
Student Theses in Industry
We have a large number of industry partners who search for excellent students to conduct their student theses at the company or at ETH but in close collaboration with them (joint supervision by industry and ETH).
Ammann Group (Switzerland)
The Ammann Group is a worldwide leader in the manufacture of mixing plants, machinery, and services in the construction industry, with core competence in road construction and landscaping as well as in the transport infrastructure.
We are collaborating with Ammann to automate construction equipment
Maxon (Switzerland)
Maxon develops and builds electric drive systems that are among the best in the world. Their drive systems can be found wherever extreme precision and the highest quality standards are indispensable – on Earth, and on Mars.
Schunk (Germany)
Legged Wheel Chair

This project aims to extend a dynamic simulation and locomotion controllers for a robotized wheelchair able to handle difficult terrain, including stairs. It will prepare for the upcoming prototype phase.
Note on plagiarism
We advise every student, irrespective of the type of project (Bachelor, Semester, Master, ...), to become familiar with the ETH rules regarding plagiarism.