Student Projects
Currently, the following student projects are available. Please contact the responsible supervisor and apply with your CV and transcripts.
If you have project ideas related to any of these projects, take the opportunity to propose your own project!
We also offer Master Theses and exchange semesters at renowned universities around the world. Please contact us if you are interested!
Studies on Mechatronics
We also offer students the opportunity to conduct their Studies on Mechatronics at our lab. In general, we recommend doing the Studies on Mechatronics in combination with the Bachelor Thesis, either as preparatory work in the semester before or as an extended study in parallel. If you want to do it independently, you can find proposed projects in the list below. Please apply directly with the corresponding supervisor.
Continual Learning and Domain Adaptation Techniques for a Waste Monitoring System on an Ocean Cleanup Vessel
This thesis develops an automated onboard waste quantification system for a maritime waste collection vessel, leveraging computer vision with continual learning and domain adaptation to replace manual counting of floating waste. Evaluated under real-world maritime conditions, the system aims to improve waste management in the South East Asian Sea.
Keywords
Computer Vision, Continual Learning, Field Testing
Labels
Master Thesis
Description
The Autonomous River Cleanup (ARC) is a student-led initiative supported by the Robotic Systems Lab, focused on tackling riverine waste pollution. In partnership with The SeaCleaners, a Swiss NGO, this thesis aims to develop a self-improving onboard waste quantification system for the “Mobula 10” vessel collecting floating waste in the South East Asian Sea. Currently, waste quantification relies on manually counting collected items. The goal of this thesis is to automate the process using computer vision and hardware solutions tailored to the vessel’s infrastructure and the environmental conditions on the sea. Key to this effort will be the integration of continual learning [1] and domain adaptation [2] techniques for computer vision algorithms to adapt models to diverse and changing waste items, ensuring consistent performance without full retraining. Lastly, the system will be evaluated in real-world conditions to propose further improvements.
Work Packages
- Develop and integrate a camera-based monitoring system on an oceanic waste collection vessel
- Explore and implement continual learning techniques for waste detection and classification models
- Conduct field testing in Zürich and deploy the system potentially abroad in South East Asia
- Create GPS-based density maps to assess the distribution of oceanic waste
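As a toy illustration of the continual-learning work package above, a rehearsal (replay-buffer) scheme can be sketched in a few lines. The nearest-centroid classifier and the class names are purely illustrative stand-ins for the actual detection models:

```python
import numpy as np

class RehearsalClassifier:
    """Toy nearest-centroid waste classifier with a replay buffer.

    Illustrates rehearsal-based continual learning: when new waste
    categories (or a shifted domain) arrive, we retrain on the new
    samples *plus* a small buffer of stored old samples, so earlier
    classes are not forgotten.
    """

    def __init__(self, buffer_per_class=20):
        self.buffer = {}             # class label -> stored exemplars
        self.centroids = {}          # class label -> mean feature vector
        self.buffer_per_class = buffer_per_class

    def update(self, features, labels):
        # Mix incoming data into the replay buffer (rehearsal).
        for x, y in zip(features, labels):
            self.buffer.setdefault(y, []).append(np.asarray(x, float))
            # Keep memory bounded: drop the oldest exemplars first.
            self.buffer[y] = self.buffer[y][-self.buffer_per_class:]
        # Recompute centroids from buffer contents (old + new classes).
        self.centroids = {y: np.mean(xs, axis=0) for y, xs in self.buffer.items()}

    def predict(self, x):
        x = np.asarray(x, float)
        return min(self.centroids, key=lambda y: np.linalg.norm(self.centroids[y] - x))

clf = RehearsalClassifier()
clf.update([[0.0, 0.0], [0.1, 0.0]], ["bottle", "bottle"])
clf.update([[5.0, 5.0]], ["net"])    # new class arrives later
print(clf.predict([0.05, 0.0]))      # still recognises "bottle"
```

A real system would replace the centroid model with a detection network and the feature vectors with learned embeddings, but the buffering logic is the same.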
Requirements
- Solid knowledge of computer vision and machine learning
- Hands-on experience in hardware integration and real-world testing
- Interest in sustainability and environmental protection
- Familiarity with continual learning or domain adaptation is a plus
Contact Details
More information
Open this project...
Published since: 2025-03-26 , Earliest start: 2025-05-01 , Latest end: 2025-12-31
Organization Robotic Systems Lab
Hosts Stolle Jonas , Elbir Emre
Topics Engineering and Technology
Extending Functional Scene Graphs to Include Articulated Object States
While traditional [1] and functional [2] scene graphs are capable of capturing the spatial relationships and functional interactions between objects and spaces, they encode each object as static, with fixed geometry. In this project, we aim to enable the estimation of the state of articulated objects and include it in the functional scene graph.
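As a sketch of the idea (the actual schema used by the group may well differ), a scene-graph node can carry a mutable articulation state alongside its static attributes; all field names here are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    """A functional scene-graph node extended with an articulation state.

    `state` holds a named joint configuration (e.g. a door opening angle),
    which traditional scene graphs with fixed geometry cannot express.
    """
    name: str
    category: str
    state: dict = field(default_factory=dict)   # e.g. {"door_angle": 0.0}

@dataclass
class SceneGraph:
    objects: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)   # (subject, relation, object)

    def add(self, obj):
        self.objects[obj.name] = obj

    def relate(self, a, rel, b):
        self.edges.append((a, rel, b))

    def set_state(self, name, **updates):
        # Update the estimated articulated state of an object in place.
        self.objects[name].state.update(updates)

g = SceneGraph()
g.add(SceneObject("cabinet_1", "cabinet", {"door_angle": 0.0}))
g.add(SceneObject("kitchen", "room"))
g.relate("cabinet_1", "inside", "kitchen")
g.set_state("cabinet_1", door_angle=1.2)     # newly estimated state
print(g.objects["cabinet_1"].state)          # {'door_angle': 1.2}
```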
Keywords
scene understanding, scene graph, exploration
Labels
Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
This project is set to limited visibility by its publisher. To see the project description you need to log in at SiROP. Please follow these instructions:
- Click link "Open this project..." below.
- Log in to SiROP using your university login or create an account to see the details.
If your affiliation is not created automatically, please follow these instructions: http://bit.ly/sirop-affiliate
More information
Open this project...
Published since: 2025-03-25 , Earliest start: 2025-03-25
Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Computer Vision and Geometry Group
Hosts Bauer Zuria, Dr. , Trisovic Jelena , Zurbrügg René
Topics Information, Computing and Communication Sciences , Engineering and Technology
Event-based feature detection for highly dynamic tracking
Event cameras are an exciting new technology enabling sensing of highly dynamic content over a broad range of illumination conditions. The present thesis explores novel, sparse, event-driven paradigms for detecting structure and motion patterns in raw event streams.
Keywords
Event camera, neuromorphic sensing, feature detection, computer vision
Labels
Master Thesis
Description
Event cameras are a relatively new, vision-based exteroceptive sensor relying on standard CMOS technology. Unlike normal cameras, event cameras do not measure absolute brightness in a frame-by-frame manner, but relative changes of the pixel-level brightness. Essentially, every pixel of an event camera independently observes the local brightness pattern, and when the latter changes by at least a minimum relative amount with respect to a previously stored reference value, a measurement is triggered in the form of a time-stamped event indicating the image location as well as the polarity of the change (brighter or darker) [2]. The pixels act asynchronously and can potentially fire events at a very high rate. Owing to their design, event cameras do not suffer from the same artifacts as regular cameras, and continue to perform well under high dynamics or challenging illumination conditions.
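The event-generation model described above can be sketched as a toy simulator; the contrast threshold and image sizes are illustrative:

```python
import numpy as np

def generate_events(frames, times, C=0.2):
    """Toy per-pixel event generator following the standard model:
    a pixel fires an event when its log-brightness has changed by at
    least the contrast threshold C since the last event at that pixel.
    Returns (t, x, y, polarity) tuples."""
    logI = np.log(np.asarray(frames, float) + 1e-6)
    ref = logI[0].copy()                      # per-pixel reference level
    events = []
    for t, img in zip(times[1:], logI[1:]):
        diff = img - ref
        ys, xs = np.nonzero(np.abs(diff) >= C)
        for y, x in zip(ys, xs):
            events.append((t, int(x), int(y), 1 if diff[y, x] > 0 else -1))
            ref[y, x] = img[y, x]             # reset reference at fired pixel
    return events

# A single brightening pixel produces one positive event:
frames = [np.ones((2, 2)), np.array([[1.0, 1.0], [1.0, 2.0]])]
evts = generate_events(frames, times=[0.0, 0.1])
print(evts)   # [(0.1, 1, 1, 1)]
```

Real sensors fire asynchronously between frames; this frame-sampled version only illustrates the thresholded log-intensity logic.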
Event cameras currently enjoy growing popularity and represent an interesting new alternative for exteroceptive sensing in robotics when facing scenarios with high dynamics and/or challenging conditions. The focus of the present thesis lies on 3D motion estimation with event cameras, and in particular on event-driven, computationally efficient methods that can trigger motion hypotheses from sparse raw events. Initial theoretical advances in this direction have been presented in recent literature [3,4,5], though these methods are still limited in terms of the assumptions they make. The present thesis will push the boundaries by proposing novel geometry-based and learning-based representations.
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] Event-Based, 6-DOF Camera Tracking from Photometric Depth Maps, TPAMI 40(10):2402-2412, 2017
[2] The Silicon Retina, 264(5): 76-83, 1991
[3] A 5-Point Minimal Solver for Event Camera Relative Motion Estimation. In Proceedings of the International Conference on Computer Vision (ICCV), 2023
[4] An n-point linear solver for line and motion estimation with event cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[5] Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers, Arxiv: https://arxiv.org/html/2503.03307v1
Work Packages
● Literature research
● Extend the mathematical foundation for sparse event-based motion estimation
● Propose novel detectors that extend operability from lines and constant velocity motion to full 6 DoF motion estimation from either points or lines, or other specific object trajectories such as ballistic curves
● Investigate learning-based, sparse event-based motion detectors to handle more general cases
● Apply the technology to real-world data to track fast ego-motion or ballistic object motion in the environment
Requirements
● Excellent knowledge of C++
● Computer vision experience
● Knowledge of geometric computer vision
● Plus: Experience with event cameras
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Fast, change-aware map-based camera tracking
Experiment with Gaussian Splatting based map representations for highly efficient camera tracking and simultaneous change detection and map updating. Apply to different exteroceptive sensing modalities.
Keywords
Localization, Camera Tracking, Gaussian Splatting, Change detection
Labels
Master Thesis
Description
Novel techniques for 3D environment representation such as NeRF [2] and Gaussian Splatting [1,3] provide the ability to efficiently render realistic-looking images of environments and, if formulated as a differentiable function of camera pose, can be embedded into a photometric loss in order to enable camera tracking across different modalities such as RGB cameras and event cameras. However, such representations are by default not able to accommodate changes in the scene, which may occur in many practically relevant scenarios (e.g. domestic environments). Furthermore, in some relevant scenarios, the changes that occur over time may indeed be expected and according to plan (e.g. construction environments).
The present thesis looks into recent 3D reconstruction methods such as Gaussian Splatting and considers their use for real-time vision-based sensor tracking. However, rather than relying on a static map of the environment, the core of the method consists of incorporating a robust change detection and map updating mechanism that relies on a combination of measurement residuals and available priors. The final goal will be to enable long-term vision-based localization in gradually changing environments while simultaneously making use of new sensing data to update the map. The ultimate goal will be to extend the method to sensors that excel at highly dynamic motion tracking but are not necessarily a first choice as a mapping device (i.e. event cameras).
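As a toy 1-D illustration of photometric map-based tracking (not the actual Gaussian Splatting pipeline), the camera pose can be recovered by descending a photometric loss defined over a differentiable rendering function; the Gaussian-bump "scene" and all parameters are illustrative:

```python
import numpy as np

def render(pose, xs):
    """Stand-in differentiable 'map renderer': the intensity profile of
    a 1-D scene as a function of camera pose (a single translation).
    A real system would splat 3-D Gaussians; this is only a sketch."""
    return np.exp(-0.5 * (xs - pose) ** 2)

def track(observed, xs, pose0, lr=2.0, iters=200, eps=1e-4):
    """Photometric tracking: descend L(p) = mean((render(p) - observed)^2)
    using a finite-difference gradient (autodiff would supply it exactly)."""
    loss = lambda q: np.mean((render(q, xs) - observed) ** 2)
    p = pose0
    for _ in range(iters):
        g = (loss(p + eps) - loss(p - eps)) / (2 * eps)
        p -= lr * g
    return p

xs = np.linspace(-5, 5, 201)
observed = render(2.0, xs)              # image captured at true pose 2.0
est = track(observed, xs, pose0=1.0)
print(round(est, 3))                    # ≈ 2.0
```

Change detection would then flag pixels whose residuals stay large after convergence, marking map regions that need updating.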
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] GS-EVT: Cross-Modal Event Camera Tracking based on Gaussian Splatting. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025
[2] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, 2020, Arxiv: https://arxiv.org/abs/2003.08934
[3] 3D Gaussian Splatting for Real-Time Radiance Field Rendering, 2023, Arxiv: https://arxiv.org/abs/2308.04079
Work Packages
● Literature research
● Creating a Gaussian Splatting representation of an environment and then using it for tracking
● Simultaneously ensuring continuous change detection in the environment. The change detection could rely on semantic detection priors in order to identify coherent image segments that have become inconsistent
● Propose an efficient map update strategy that relies on the introduced change detection and is derived from the original Gaussian Splatting algorithm
Requirements
● Excellent knowledge of Python
● Computer vision experience
● Knowledge of recent image rendering techniques
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Igor Bogoslavskyi (ibogoslavskyi@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Soft object reconstruction
This project consists of reconstructing soft objects together with their appearance, geometry, and physical properties from image data, for inclusion in reinforcement learning frameworks for manipulation tasks.
Keywords
Computer Vision, Structure from Motion, Image-based Reconstruction, Physics-based Reconstruction
Labels
Master Thesis
Description
As 3D reconstruction [2,3], real-time data-driven rendering [4,5], and learning-based control technologies [6,7] are becoming more mature, recent efforts in reinforcement learning are moving towards end-to-end policies that directly consume images in order to generate control commands [8]. However, many of the simulated environments are limited to a composition of rigid objects. In recent years, the inclusion of differentiable particle-based simulation borrowed from computer graphics has enabled the inclusion of non-rigid or even fluid elements. Ideally, we can generate such representations from real world data in order to extend data-driven world simulators to arbitrary new objects with complex physical behavior.
The present thesis focuses on this problem and aims at reconstructing soft objects in terms of their geometry, appearance, and physical behavior. The goal is to make use of the Material Point Method (MPM) in combination with vision-based cues and physical priors in order to reconstruct accurate 3D models of soft objects. The developed models will finally be included in an RL environment such as Isaac Gym in order to train novel manipulation policies for soft objects.
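A minimal sketch of the underlying system-identification idea, with a 1-DoF damped spring standing in for a full MPM soft-body model; all parameters and the grid-search fitting strategy are illustrative:

```python
import numpy as np

def simulate(k, steps=200, dt=0.01, m=1.0, c=0.5, x0=1.0):
    """Forward-simulate a 1-DoF 'soft object' (damped spring), a toy
    stand-in for an MPM soft body with unknown material stiffness k."""
    x, v = x0, 0.0
    traj = []
    for _ in range(steps):
        a = (-k * x - c * v) / m      # Hooke's law + linear damping
        v += a * dt                   # explicit Euler integration
        x += v * dt
        traj.append(x)
    return np.array(traj)

def fit_stiffness(observed, candidates):
    """Pick the stiffness whose simulated trajectory best matches the
    observed one. In the real pipeline, vision-based cues would supply
    `observed`; here it is synthetic."""
    errors = [np.sum((simulate(k) - observed) ** 2) for k in candidates]
    return candidates[int(np.argmin(errors))]

observed = simulate(25.0)                       # "measured" soft object
k_hat = fit_stiffness(observed, candidates=np.arange(5.0, 50.0, 5.0))
print(k_hat)    # 25.0
```

An MPM-based version would replace the scalar stiffness with a constitutive model and the grid search with gradient-based optimization through a differentiable simulator.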
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] Modeling of Deformable Objects for Robotic Manipulation: A Tutorial and Review, Front. Robot. AI, 7, 2020
[2] Global Structure-from-Motion Revisited, ECCV 2024
[3] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors, CVPR 2025
[4] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, CVPR 2020
[5] 3D Gaussian Splatting for Real-Time Radiance Field Rendering, SIGGRAPH 2023
[6] Learning to walk in minutes using massively parallel deep reinforcement learning, CoRL 2022
[7] Champion-Level Drone Racing Using Deep Reinforcement Learning, Nature, 2023
[8] π0: A Vision-Language-Action Flow Model for General Robot Control, Arxiv: https://arxiv.org/abs/2410.24164
Work Packages
● Literature research
● Design of suitable reconstruction method based on visual data and physical priors
● Dataset collection and testing
● Cross-validation against contact-based reconstruction methods
● Embedding into Isaac-Gym for training novel manipulation policies
Requirements
● Excellent knowledge of Python or C++
● Computer vision experience
● Interest in optimization with physics representations
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Sina Mirrazavi (smirrazavi@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Reconstruction from online videos taken in the wild
Push the limits of arbitrary online video reconstruction by combining the most recent, prior-supported real-time Simultaneous Localization And Mapping (SLAM) methods with automatic supervision techniques.
Keywords
Computer Vision, 3D Reconstruction, SLAM
Labels
Master Thesis
Description
In recent years, the advent of learning-based methods has led to substantial advancements in the performance of video-based 3D reconstruction methods. It is now possible to take an uncalibrated monocular video sequence and automatically process it to obtain a reasonably good estimate of the 3D geometry of the scene as well as the camera motion [1]. However, challenges remain when processing videos taken in the wild from open online repositories (e.g. YouTube):
● The videos are often not captured in a single take, but have changing camera perspectives. This often breaks continuous incremental reconstruction paradigms, and leads to the requirement of additional supervision.
● The videos sometimes have highly challenging passages with strong dynamics, missing texture, and/or dynamic objects in the image, thereby again demanding additional supervision.
● Adding calibration information to the estimation is known to potentially improve estimation performance.
The goal of the present project is to explore the use of both classical and learning-based solutions to automatically provide such supervision, and subsequently modify existing modern Simultaneous Localization And Mapping (SLAM) frameworks to include such priors and thereby produce more robust performance on challenging videos taken in the wild.
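One example of such automatic supervision is detecting shot boundaries so that each continuous take can be reconstructed separately. A classical histogram-based sketch; the bin count and threshold are illustrative and would need tuning on real footage:

```python
import numpy as np

def shot_boundaries(frames, bins=16, threshold=0.5):
    """Detect cuts in a video via the intensity-histogram distance
    between consecutive frames — a simple supervision signal to split
    an in-the-wild video into continuous takes before running SLAM."""
    hists = [np.histogram(f, bins=bins, range=(0.0, 1.0))[0] / f.size
             for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        dist = 0.5 * np.abs(hists[i] - hists[i - 1]).sum()   # total variation
        if dist > threshold:
            cuts.append(i)
    return cuts

rng = np.random.default_rng(0)
take1 = [np.full((8, 8), 0.2) + 0.01 * rng.random((8, 8)) for _ in range(3)]
take2 = [np.full((8, 8), 0.8) + 0.01 * rng.random((8, 8)) for _ in range(3)]
print(shot_boundaries(take1 + take2))   # [3]  (cut between the two takes)
```

A learning-based alternative would replace the histogram distance with a feature-embedding distance, but the segmentation logic is the same.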
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors, CVPR 2025
Work Packages
● Literature research
● Addition of traditional geometric methods for automatic camera calibration
● Automatic video segmentation and scene categorization.
● Automatic processing of video captions and audio for the extraction of expected semantics, and subsequent application of an open vocabulary model for automatic masking
● Testing and Validation
Requirements
● Excellent knowledge of Python and C++
● Knowledge in Computer vision
● Experience in SLAM/reconstruction
● Experience in applying learning-based representations
● Interest in recent LLM/VLM architectures
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Alexander Liniger (aliniger@theaiinstitute.com)
Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2025-03-13 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Computationally Efficient Neural Networks
The computing, time, and energy requirements of recent neural networks have increased dramatically over time, limiting their applicability in real-world contexts. The present thesis explores novel neural network implementations that substantially reduce computational complexity and thus energy footprint.
Keywords
AI, CNNs, transformers, network implementation
Labels
Master Thesis
Description
Over the past decade, advances in deep learning and computer vision have led to substantial improvements in robotic perception abilities. It is nowadays possible to use neural networks for reliable object detection, object pose and shape estimation, open-vocabulary semantic interpretation, and the solution of low-level problems such as feature tracking and depth estimation, to name just a few. However, a limiting factor of growing concern is the computational complexity and thus the power consumption/computing hardware vs. latency trade-off of such models. We are therefore also experiencing an increasing demand for cloud-based computation, often with non-negligible and unpredictable latencies.
As demonstrated through a number of past efforts [2,3,4], the computational complexity of a neural network can be reduced fairly substantially by changes in the selected low-level computing paradigm. Rather than relying on standard matrix-vector multiplications that make use of hardware multipliers, we can choose architectures that will rely more on additive operations [2], thereby reducing computational complexity and thus energy consumption by a substantial amount. The present work aims at the development and testing of novel and efficient network implementations that can be applied to any off-the-shelf network when deployed in custom-programmable hardware.
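A minimal sketch of the additive-computation idea in the spirit of [2]: weights are quantized to {-1, 0, +1} with a single per-tensor scale, so inner products reduce to signed additions plus one final multiply. This is an illustration of the principle, not the thesis' target implementation:

```python
import numpy as np

def ternarize(W):
    """1.58-bit-style quantization sketch: map weights to {-1, 0, +1}
    times a per-tensor scale. Matrix products over the quantized
    weights then need only additions/subtractions, plus one multiply
    by the scale at the end."""
    scale = np.mean(np.abs(W)) + 1e-8
    Wq = np.clip(np.round(W / scale), -1, 1)
    return Wq.astype(np.int8), scale

def ternary_matmul(Wq, scale, x):
    # All inner products use +/- accumulation; multiply once at the end.
    return scale * (Wq.astype(float) @ x)

W = np.array([[0.9, -0.05, -1.1],
              [0.02, 0.7, -0.6]])
Wq, s = ternarize(W)
x = np.array([1.0, 2.0, 3.0])
print(Wq)                        # entries in {-1, 0, 1}
print(ternary_matmul(Wq, s, x))  # approximates W @ x
```

On custom-programmable hardware, the `@` over ternary weights would be realized as a multiplier-free accumulation pipeline, which is where the energy saving comes from.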
The proposed thesis will be conducted at the Robotics and AI Institute, a new top-notch partner institute of Boston Dynamics pushing the boundaries of control and perception in robotics. Selection is highly competitive. Potential candidates are invited to submit their CV and grade sheet, after which students will be invited to an on-site interview.
[1] On global electricity usage of communication technology: Trends to 2030, Challenges 6(1), 117-157, 2015
[2] The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits, https://arxiv.org/pdf/2402.17764
[3] XOR-Net: An Efficient Computation Pipeline for Binary Neural Network Inference on Edge Devices, https://cmu-odml.github.io/papers/XOR-NetAnEfficientComputationPipelineforBinaryNeuralNetworkInferenceonEdgeDevices.pdf
[4] DeepSeek-V3 Technical Report, https://arxiv.org/abs/2412.19437
Work Packages
● Literature research
● Development of loss-less post-training adaptations for reducing the computational complexity of neural networks
● Optimization of computational efficiency of standard architectures such as CNNs, MLPs, and Transformers
● Testing in simulation
● Optional: Testing on custom programmable hardware
Requirements
● Excellent knowledge in either C++ or Python
● Knowledge in deep learning
● Experience in computer vision
Contact Details
Laurent Kneip (lkneip@theaiinstitute.com)
Alex Liniger (aliniger@theaiinstitute.com)
Please include your CV and up-to-date transcript when applying
More information
Open this project...
Published since: 2025-03-12 , Earliest start: 2025-03-17
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Engineering and Technology
Generalist Excavator Transformer
We want to develop a generalist digging agent that is able to perform multiple tasks, such as digging and moving loose soil, and/or to control multiple excavators. We plan to use decision transformers, trained on offline data, to accomplish these tasks.
Keywords
Offline reinforcement learning, transformers, autonomous excavation
Labels
Semester Project , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-03-11 , Earliest start: 2025-03-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Werner Lennart , Egli Pascal Arturo , Terenzi Lorenzo , Nan Fang , Zhang Weixuan
Topics Information, Computing and Communication Sciences
Differential Particle Simulation for Robotics
This project focuses on applying differential particle-based simulation to address challenges in simulating real-world robotic tasks involving interactions with fluids, granular materials, and soft objects. Leveraging the differentiability of simulations, the project aims to enhance simulation accuracy with limited real-world data and explore learning robotic control using first-order gradient information.
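A toy sketch of the core idea: one particle under linear drag stands in for a differentiable particle-based simulator, and the drag coefficient plays the role of a material parameter identified from limited data. Finite differences stand in for the exact gradients an autodiff framework would provide; all values are illustrative:

```python
import numpy as np

def simulate(drag, v0=10.0, dt=0.01, steps=100):
    """Toy particle simulation: a single particle decelerating under
    linear drag — a stand-in for a full particle-based simulator with
    an unknown material parameter."""
    v, x, xs = v0, 0.0, []
    for _ in range(steps):
        v += -drag * v * dt
        x += v * dt
        xs.append(x)
    return np.array(xs)

def fit_drag(observed, d0=0.1, lr=0.05, iters=300, eps=1e-5):
    """First-order parameter fit: descend the simulation loss w.r.t.
    the drag coefficient using a finite-difference gradient."""
    loss = lambda q: np.mean((simulate(q) - observed) ** 2)
    d = d0
    for _ in range(iters):
        g = (loss(d + eps) - loss(d - eps)) / (2 * eps)
        d -= lr * g
    return d

observed = simulate(0.8)              # "measured" trajectory, true drag 0.8
print(round(fit_drag(observed), 2))   # recovers the true drag (≈ 0.8)
```

In the actual project, the simulator would expose analytic gradients, allowing the same first-order loop to identify material parameters of fluids, granular media, or soft bodies.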
Labels
Semester Project , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-03-10 , Earliest start: 2025-01-01 , Latest end: 2025-12-31
Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Robotic Systems Lab
Hosts Nan Fang , Ma Hao
Topics Engineering and Technology
Novel Winch Control for Robotic Climbing
While legged robots have demonstrated impressive locomotion performance in structured environments, challenges persist in navigating steep natural terrain and loose, granular soil. These challenges extend to extraterrestrial environments and are relevant to future lunar, Martian, and asteroidal missions. In order to explore the most extreme terrains, a novel winch system has been developed for the ANYmal robot platform. The winch could potentially be used as a fail-safe device to prevent falls during unassisted traverses of steep terrain, as well as an added driven degree of freedom for assisted ascending and descending of terrain too steep for unassisted traversal. The goal of this project is to develop control policies that utilize this new hardware and enable further climbing robot research.
Keywords
Robot, Space, Climbing, Winch, Control
Labels
Semester Project , Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-03-05 , Earliest start: 2024-10-07
Organization Robotic Systems Lab
Hosts Vogel Dylan
Topics Information, Computing and Communication Sciences , Engineering and Technology
Beyond Value Functions: Stable Robot Learning with Monte-Carlo GRPO
Robotics is dominated by on-policy reinforcement learning: the paradigm of training a robot controller by iteratively interacting with the environment and maximizing some objective. A crucial idea to make this work is the advantage function. On each policy update, algorithms typically sum up the gradient log-probabilities of all actions taken in the robot simulation. The advantage function increases or decreases the probabilities of these actions by comparing their "goodness" against a baseline. Current advantage estimation methods use a value function to aggregate robot experience and thereby decrease variance. This improves sample efficiency at the cost of introducing some bias.

Stably training large language models via reinforcement learning is well known to be a challenging task. A line of recent work [1, 2] has used Group-Relative Policy Optimization (GRPO) to achieve this feat. In GRPO, a series of answers is generated for each query. The advantage is calculated based on how much better a given answer is than the average answer to that query. In this formulation, no value function is required.

Can we adapt GRPO to robot learning? Value functions are known to cause issues in training stability [3] and result in biased advantage estimates [4]. We are in the age of GPU-accelerated RL [5], training policies by simulating thousands of robot instances simultaneously. This makes a new Monte-Carlo (MC) approach to RL timely, feasible, and appealing. In this project, the student will investigate the limitations of value-function-based advantage estimation. Using GRPO as a starting point, the student will then develop MC-based algorithms that exploit the GPU's parallel simulation capabilities for unbiased variance reduction and stable RL training while maintaining competitive wall-clock time.
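The group-relative advantage at the heart of GRPO can be sketched in a few lines; in the robot-learning setting, the "group" would be a set of parallel rollouts from the same initial state (the reward values below are illustrative):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantage estimation (GRPO-style): no value
    function. Each sample's advantage is its reward standardized
    against the other rollouts in the same group — e.g. the parallel
    simulated robot instances sharing one initial state."""
    r = np.asarray(rewards, float)
    return (r - r.mean()) / (r.std() + eps)

# 4 parallel rollouts from the same initial state:
adv = grpo_advantages([1.0, 2.0, 3.0, 6.0])
print(adv.round(2))   # zero-mean advantages; best rollout gets the largest
```

Because the baseline is the group mean rather than a learned value function, the estimate is unbiased; the variance is controlled by the group size, which GPU-parallel simulation makes cheap to increase.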
Keywords
Robot Learning, Reinforcement Learning, Monte Carlo RL, GRPO, Advantage Estimation
Labels
Semester Project , Bachelor Thesis , Master Thesis
Description
Co-supervised by Jing Yuan Luo (MuJoCo)
Work Packages
- Literature research
- Investigate the bias and variance properties of the PPO value function
- Design and implement a novel algorithm that achieves variance reduction through monte carlo sampling via massive environment parallelism
- Re-implement existing SOTA algorithms as benchmarks
- Bonus: provide theoretical insights to justify your proposed monte carlo method
Requirements
- Background in Learning
- Excellent knowledge of Python
Contact Details
More information
Open this project...
Published since: 2025-03-05
Organization Robotic Systems Lab
Hosts Klemm Victor
Topics Information, Computing and Communication Sciences , Engineering and Technology , Behavioural and Cognitive Sciences
Volumetric Bucket-Fill Estimation
Gravis Robotics is an ETH spinoff from the Robotic Systems Lab (RSL) working on the automation of heavy machinery (https://gravisrobotics.com/). In this project, you will be working with the Gravis team to develop a perceptive bucket-fill estimation system. You will conduct your project at Gravis under joint supervision from RSL.
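As a sketch of one plausible approach (the actual Gravis system is not public), a volumetric fill estimate can be obtained by integrating heightmap differences over the bucket interior, e.g. from a depth camera; the units and grid resolution are illustrative:

```python
import numpy as np

def bucket_fill_volume(height_before, height_after, cell_area):
    """Volumetric fill estimate from two heightmaps of the bucket
    interior: integrate the per-cell height change over the grid."""
    dh = np.asarray(height_after, float) - np.asarray(height_before, float)
    dh = np.clip(dh, 0.0, None)    # ignore cells that got lower (noise/spill)
    return float(dh.sum() * cell_area)

empty = np.zeros((4, 4))           # scan of the empty bucket [m]
full = np.full((4, 4), 0.25)       # material surface 25 cm higher
vol = bucket_fill_volume(empty, full, cell_area=0.01)   # 10 cm x 10 cm cells
print(vol)    # 0.04 m^3
```

A deployed system would additionally need to segment the bucket interior, handle occlusion by the bucket lip, and compensate for the bucket's own motion between scans.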
Keywords
Autonomous Excavation
Labels
Semester Project , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-28 , Earliest start: 2025-01-01 , Latest end: 2026-01-01
Organization Robotic Systems Lab
Hosts Egli Pascal Arturo
Topics Engineering and Technology
Leveraging Human Motion Data from Videos for Humanoid Robot Motion Learning
The advancement of humanoid robotics has reached a stage where mimicking complex human motions with high accuracy is crucial for tasks ranging from entertainment to human-robot interaction in dynamic environments. Traditional approaches to motion learning for humanoid robots rely heavily on motion capture (MoCap) data. However, acquiring large amounts of high-quality MoCap data is both expensive and logistically challenging. In contrast, video footage of human activities, such as sports events or dance performances, is widely available and offers an abundant source of motion data.

Building on recent advancements in extracting and utilizing human motion from videos, such as WHAM and the approach of "Learning Physically Simulated Tennis Skills from Broadcast Videos", this project aims to develop a system that extracts human motion from videos and applies it to teach a humanoid robot how to perform similar actions. The primary focus will be on extracting dynamic and expressive motions from videos, such as soccer player celebrations, and using these extracted motions as reference data for reinforcement learning (RL) and imitation learning on a humanoid robot.
Labels
Master Thesis
Description
Work packages
Literature research
Global motion reconstruction from videos.
Learning from reconstructed motion demonstrations with reinforcement learning on a humanoid robot.
Requirements
Strong programming skills in Python
Experience in computer vision and reinforcement learning
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / computer vision / robotics conferences.
Related literature
Yuan, Y., Iqbal, U., Molchanov, P., Kitani, K. and Kautz, J., 2022. Glamr: Global occlusion-aware human mesh recovery with dynamic cameras. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 11038-11049).
Yuan, Y. and Makoviychuk, V., 2023. Learning physically simulated tennis skills from broadcast videos.
Shin, S., Kim, J., Halilaj, E. and Black, M.J., 2024. Wham: Reconstructing world-grounded humans with accurate 3d motion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2070-2080).
Peng, X.B., Abbeel, P., Levine, S. and Van de Panne, M., 2018. Deepmimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions On Graphics (TOG), 37(4), pp.1-14.
Goal
The objective of this project is to develop a robust system for extracting human motions from video footage and transferring these motions to a humanoid robot using learning from demonstration techniques. The system will be designed to handle the noisy data typically associated with video-based motion extraction and ensure that the humanoid robot can replicate the extracted motions with high fidelity while respecting physical rules.
Proposed Methodology
Video Data Collection and Motion Extraction:
Collect video footage of soccer player celebrations and other dynamic human activities.
Start from existing monocular human pose/motion estimation algorithms to extract 3D motion data from the videos.
Incorporate physics-based corrections similar to those employed in WHAM to address issues like jitter, foot sliding, and ground penetration in the extracted motion data.
Motion Learning:
- Apply existing learning-from-demonstration algorithms in a simulated environment to replicate the kinematic motions reconstructed from the videos while respecting physical constraints, using reinforcement learning.
Implementation on Humanoid Robot:
- Hardware deployment is encouraged; our humanoid robot is ready and waiting for you.
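The imitation-learning step above hinges on a reward that scores how closely the simulated robot tracks the motion reconstructed from video. A minimal sketch of a DeepMimic-style pose-tracking term (the function name, weights, and joint dimensions are illustrative placeholders, not the project's design):

```python
import numpy as np

def pose_reward(q_robot, q_ref, w_pose=0.65, scale=2.0):
    """DeepMimic-style tracking term: exponentiated negative squared
    joint-angle error between the simulated robot pose and the reference
    pose extracted from video. Weights are illustrative placeholders."""
    err = np.sum((np.asarray(q_robot) - np.asarray(q_ref)) ** 2)
    return w_pose * np.exp(-scale * err)
```

Perfect tracking yields the maximum reward `w_pose`; in DeepMimic this pose term is combined with further terms (velocity, end-effector position) in a weighted sum.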
Contact Details
Please include your CV and transcript in the submission.
Manuel Kaufmann
https://ait.ethz.ch/people/kamanuel
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Applications limited to ETH Zurich, EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao, Kaufmann Manuel
Topics Engineering and Technology
Learning Agile Dodgeball Behaviors for Humanoid Robots
Agility and rapid decision-making are vital for humanoid robots to safely and effectively operate in dynamic, unstructured environments. In human contexts—whether in crowded spaces, industrial settings, or collaborative environments—robots must be capable of reacting to fast, unpredictable changes in their surroundings. This includes not only planned navigation around static obstacles but also rapid responses to dynamic threats such as falling objects, sudden human movements, or unexpected collisions. Developing such reactive capabilities in legged robots remains a significant challenge due to the complexity of real-time perception, decision-making under uncertainty, and balance control.

Humanoid robots, with their human-like morphology, are uniquely positioned to navigate and interact with human-centered environments. However, achieving fast, dynamic responses—especially while maintaining postural stability—requires advanced control strategies that integrate perception, motion planning, and balance control within tight time constraints.

The task of dodging fast-moving objects, such as balls, provides an ideal testbed for studying these capabilities. It encapsulates several core challenges: rapid object detection and trajectory prediction, real-time motion planning, dynamic stability maintenance, and reactive behavior under uncertainty. Moreover, it presents a simplified yet rich framework to investigate more general collision avoidance strategies that could later be extended to complex real-world interactions.

In robotics, reactive motion planning for dynamic environments has been widely studied, but primarily in the context of wheeled robots or static obstacle fields. Classical approaches focus on precomputed motion plans or simple reactive strategies, often unsuitable for highly dynamic scenarios where split-second decisions are critical. In the domain of legged robotics, maintaining balance while executing rapid, evasive maneuvers remains a challenging problem.
Previous work on dynamic locomotion has addressed agile behaviors like running, jumping, or turning (e.g., Hutter et al., 2016; Kim et al., 2019), but these movements are often planned in advance rather than triggered reactively. More recent efforts have leveraged reinforcement learning (RL) to enable robots to adapt to dynamic environments, demonstrating success in tasks such as obstacle avoidance, perturbation recovery, and agile locomotion (Peng et al., 2017; Hwangbo et al., 2019). However, many of these approaches still struggle with real-time constraints and robustness in high-speed, unpredictable scenarios.

Perception-driven control in humanoids, particularly for tasks requiring fast reactions, has seen advances through sensor fusion, visual servoing, and predictive modeling. For example, integrating vision-based object tracking with dynamic motion planning has enabled robots to perform tasks like ball catching or blocking (Ishiguro et al., 2002; Behnke, 2004). Yet, dodging requires a fundamentally different approach: instead of converging toward an object (as in catching), the robot must predict and strategically avoid the object's trajectory while maintaining balance—often in the presence of limited maneuvering time.

Dodgeball-inspired robotics research has been explored in limited contexts, primarily using wheeled robots or simplified agents in simulations. Few studies have addressed the challenges of high-speed evasion combined with the complexities of humanoid balance and multi-joint coordination. This project aims to bridge that gap by developing learning-based methods that enable humanoid robots to reactively avoid fast-approaching objects in real time, while preserving stability and agility.
Labels
Master Thesis
Description
Work packages
Literature research
Utilize simulation platforms (e.g., Isaac Lab) for initial policy development and training.
Explore model-free RL approaches, potentially incorporating curriculum learning to gradually increase task complexity.
Investigate perception models for object detection and trajectory forecasting, possibly leveraging lightweight deep learning architectures for real-time processing.
Implement and test learned behaviors on a physical humanoid robot, addressing the challenges of sim-to-real transfer through domain randomization or fine-tuning.
Requirements
Solid foundation in robotics, control theory, and machine learning.
Experience with reinforcement learning frameworks (e.g., PyTorch, TensorFlow, or RLlib).
Familiarity with robot simulation environments (e.g., MuJoCo, Gazebo) and real-world robot control.
Strong programming skills (Python, C++) and experience with sensor data processing.
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / robotics conferences.
Goal
Perception & Prediction
- Develop a real-time perception pipeline capable of detecting and tracking incoming projectiles. Utilize camera data or external motion capture systems to predict ball trajectories accurately under varying speeds and angles.
Reactive Motion Planning
- Design algorithms that plan evasive maneuvers (e.g., side-steps, ducks, or rotational movements) within milliseconds of detecting an incoming threat, ensuring the robot’s center of mass remains stable throughout.
Learning-Based Control
- Apply reinforcement learning or imitation learning to optimize dodge behaviors, balancing between minimal energy expenditure and maximum evasive success. Investigate policy architectures that enable rapid reactions while handling noisy observations and sensor delays.
Robustness & Evaluation
- Test the system under diverse scenarios, including multi-ball environments and varying throw speeds. Evaluate the robot’s success rate, energy efficiency, and post-dodge recovery capabilities.
Implementation on Humanoid Robot:
- Hardware deployment is encouraged; our humanoid robot is ready and waiting for you.
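As a concrete illustration of the perception-and-prediction goal, a short window of tracked ball positions can be fitted with a constant-acceleration model and extrapolated to the moment of interest. The function name and the least-squares formulation below are assumptions for this sketch, not a prescribed pipeline:

```python
import numpy as np

def predict_ball_position(ts, ps, t_query):
    """Fit p(t) = p0 + v*t + 0.5*a*t^2 to tracked positions by least
    squares, then extrapolate to t_query.
    ts: (N,) timestamps, ps: (N, 3) positions from a (hypothetical)
    vision or motion-capture tracking front end."""
    ts = np.asarray(ts, dtype=float)
    A = np.stack([np.ones_like(ts), ts, 0.5 * ts**2], axis=1)  # design matrix
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(ps, dtype=float), rcond=None)
    p0, v, a = coeffs  # initial position, velocity, acceleration estimates
    return p0 + v * t_query + 0.5 * a * t_query**2
```

Fitting over a sliding window also smooths detection noise; the predicted interception point then feeds the evasive-motion planner.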
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Applications limited to ETH Zurich, EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Engineering and Technology
Learning Real-time Human Motion Tracking on a Humanoid Robot
Humanoid robots, designed to mimic the structure and behavior of humans, have seen significant advancements in kinematics, dynamics, and control systems. Teleoperation of humanoid robots involves complex control strategies to manage bipedal locomotion, balance, and interaction with environments. Research in this area has focused on developing robots that can perform tasks in environments designed for humans, from simple object manipulation to navigating complex terrains.

Reinforcement learning (RL) has emerged as a powerful method for enabling robots to learn from interactions with their environment, improving their performance over time without explicit programming for every possible scenario. In the context of humanoid robotics and teleoperation, RL can be used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. Key challenges include the high dimensionality of the action space, the need for safe exploration, and the transfer of learned skills across different tasks and environments.

Integrating human motion tracking with reinforcement learning on humanoid robots represents a cutting-edge area of research. This approach uses human motion data to train RL models, enabling the robot to learn natural, human-like movements. The goal is to develop systems that can not only replicate human actions in real time but also adapt and improve their responses over time through learning. Challenges in this area include ensuring real-time performance, dealing with the variability of human motion, and maintaining the stability and safety of the humanoid robot.
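To make the real-time tracking requirement concrete, here is a minimal sketch of the streaming side of such a system: captured human joint angles are mapped onto the robot while respecting joint limits and smoothing out capture jitter. The joint limits, the exponential-smoothing factor, and the class name are illustrative assumptions:

```python
import numpy as np

class MotionRetargeter:
    """Minimal real-time retargeting sketch: clip streamed human joint
    angles to the robot's joint limits and exponentially smooth them to
    suppress capture jitter. Limits and alpha are placeholder values."""

    def __init__(self, lower, upper, alpha=0.5):
        self.lower, self.upper = np.asarray(lower), np.asarray(upper)
        self.alpha = alpha  # smoothing factor: higher = more responsive
        self.state = None

    def step(self, q_human):
        q = np.clip(q_human, self.lower, self.upper)
        if self.state is None:
            self.state = q
        else:
            self.state = (1.0 - self.alpha) * self.state + self.alpha * q
        return self.state
```

In a full system the smoothed targets would feed an RL tracking policy rather than being commanded directly, so the robot can deviate to keep its balance.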
Keywords
real-time, humanoid, reinforcement learning, representation learning
Labels
Master Thesis
Description
Work packages
Literature research
Human motion capture and retargeting
Skill space development
Hardware validation encouraged upon availability
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Starke, Sebastian, et al. "Deepphase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Serifi, A., Grandia, R., Knoop, E., Gross, M. and Bächer, M., 2024, December. Vmp: Versatile motion priors for robustly tracking motion on physical characters. In Computer Graphics Forum (Vol. 43, No. 8, p. e15175).
Fu, Z., Zhao, Q., Wu, Q., Wetzstein, G. and Finn, C., 2024. Humanplus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454.
He, T., Luo, Z., Xiao, W., Zhang, C., Kitani, K., Liu, C. and Shi, G., 2024, October. Learning human-to-humanoid real-time whole-body teleoperation. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8944-8951). IEEE.
He, T., Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C. and Shi, G., 2024. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Information, Computing and Communication Sciences
Loosely Guided Reinforcement Learning for Humanoid Parkour
Humanoid robots hold the promise of navigating complex, human-centric environments with agility and adaptability. However, training these robots to perform dynamic behaviors such as parkour—jumping, climbing, and traversing obstacles—remains a significant challenge due to the high-dimensional state and action spaces involved. Traditional Reinforcement Learning (RL) struggles in such settings, primarily due to sparse rewards and the extensive exploration needed for complex tasks.

This project proposes a novel approach to address these challenges by incorporating loosely guided references into the RL process. Instead of relying solely on task-specific rewards or complex reward shaping, we introduce a simplified reference trajectory that serves as a guide during training. This trajectory, often limited to the robot's base movement, reduces the exploration burden without constraining the policy to strict tracking, allowing the emergence of diverse and adaptable behaviors.

Reinforcement Learning has demonstrated remarkable success in training agents for tasks ranging from game playing to robotic manipulation. However, its application to high-dimensional, dynamic tasks like humanoid parkour is hindered by two primary challenges:

- Exploration complexity: the vast state-action space of humanoids leads to slow convergence, often requiring millions of training steps.
- Reward design: sparse rewards make it difficult for the agent to discover meaningful behaviors, while dense rewards demand intricate and often brittle design efforts.

By introducing a loosely guided reference—a simple trajectory representing the desired flow of the task—we aim to reduce the exploration space while maintaining the flexibility of RL. This approach bridges the gap between pure RL and demonstration-based methods, enabling the learning of complex maneuvers like climbing, jumping, and dynamic obstacle traversal without heavy reliance on reward engineering or exact demonstrations.
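The guidance idea can be made concrete as a reward-shaping term: a soft attraction of the robot's base toward the coarse reference trajectory, added to the task reward without enforcing exact tracking. The weight and kernel width below are illustrative placeholders, not tuned values:

```python
import numpy as np

def loosely_guided_reward(base_pos, ref_pos, task_reward, w_guide=0.1, sigma=0.5):
    """Task reward plus a soft Gaussian attraction of the robot base
    toward the coarse reference. The small w_guide keeps the reference
    a guide rather than a strict tracking target (values are placeholders)."""
    d2 = np.sum((np.asarray(base_pos) - np.asarray(ref_pos)) ** 2)
    return task_reward + w_guide * np.exp(-d2 / (2.0 * sigma**2))
```

Because the guidance term saturates and is weakly weighted, the policy is free to deviate from the reference whenever the task reward favors a different maneuver.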
Keywords
humanoid, reinforcement learning, loosely guided
Labels
Master Thesis
Description
Work packages
Design a Loosely Guided RL Framework that integrates simple reference trajectories into the training loop.
Evaluate Exploration Efficiency by comparing baseline RL methods with the guided approach.
Demonstrate Complex Parkour Behaviors such as climbing, jumping, and dynamic traversal using the guided RL policy.
Hardware validation encouraged
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Serifi, A., Grandia, R., Knoop, E., Gross, M. and Bächer, M., 2024, December. Vmp: Versatile motion priors for robustly tracking motion on physical characters. In Computer Graphics Forum (Vol. 43, No. 8, p. e15175).
Fu, Z., Zhao, Q., Wu, Q., Wetzstein, G. and Finn, C., 2024. Humanplus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454.
He, T., Luo, Z., Xiao, W., Zhang, C., Kitani, K., Liu, C. and Shi, G., 2024, October. Learning human-to-humanoid real-time whole-body teleoperation. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8944-8951). IEEE.
He, T., Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C. and Shi, G., 2024. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Information, Computing and Communication Sciences
Learning World Models for Legged Locomotion
Model-based reinforcement learning learns a world model from which an optimal control policy can be extracted. Understanding and predicting the forward dynamics of legged systems is crucial for effective control and planning: forward dynamics means predicting the robot's next state given its current state and the applied actions. While traditional physics-based models can provide a baseline, they often struggle with the complexities and non-linearities inherent in real-world scenarios, particularly the varying contact patterns of the robot's feet with the ground.

The project aims to develop and evaluate neural-network-based models for predicting the dynamics of legged environments, focusing on accounting for varying contact patterns and non-linearities. This involves collecting and preprocessing data from various simulation experiments, designing neural network architectures that incorporate the necessary structure, and exploring hybrid models that combine physics-based predictions with neural network corrections. The models will be trained and evaluated on autoregressive prediction accuracy, with an emphasis on robustness and generalization across different noise perturbations. By the end of the project, the goal is an accurate, robust, and generalizable predictive model of the forward dynamics of legged systems.
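A toy version of the evaluation protocol described above: fit a one-step dynamics model to transition data, then measure error on autoregressive rollouts where the model's own predictions are fed back as inputs. A linear least-squares model stands in for the neural network here; all names are illustrative:

```python
import numpy as np

def fit_one_step_model(states, actions, next_states):
    """Least-squares one-step model s' ~ [s, a] @ W; a linear stand-in
    for the neural forward-dynamics model discussed above."""
    X = np.hstack([states, actions])
    W, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    return W

def autoregressive_rollout(W, s0, action_seq):
    """Feed predictions back in as inputs: one-step errors compound,
    which is exactly what autoregressive evaluation is meant to expose."""
    s, traj = np.asarray(s0, dtype=float), []
    for a in action_seq:
        s = np.hstack([s, a]) @ W
        traj.append(s)
    return np.array(traj)
```

For legged systems the interesting failure cases are contact switches, where a single global linear (or smooth neural) model is systematically wrong; that motivates the structured and hybrid architectures mentioned in the description.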
Keywords
forward dynamics, non-smooth dynamics, neural networks, model-based reinforcement learning
Labels
Master Thesis
Description
Work packages
Literature research
Understand the training pipeline of the paper Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics.
Explore the possibility of using a first-order gradient in optimizing the policy.
Requirements
Strong programming skills in Python
Experience in machine learning frameworks, especially model-based reinforcement learning.
Publication
This project will mostly focus on simulated environments. Promising results will be submitted to machine learning conferences, where the method will be thoroughly evaluated and tested on different systems (e.g., simple MuJoCo environments to complex systems such as quadrupeds and bipeds).
Related literature
Hafner, D., Lillicrap, T., Ba, J. and Norouzi, M., 2019. Dream to control: Learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603.
Hafner, D., Lillicrap, T., Norouzi, M. and Ba, J., 2020. Mastering atari with discrete world models. arXiv preprint arXiv:2010.02193.
Hafner, D., Pasukonis, J., Ba, J. and Lillicrap, T., 2023. Mastering diverse domains through world models. arXiv preprint arXiv:2301.04104.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Song, Y., Kim, S. and Scaramuzza, D., 2024. Learning Quadruped Locomotion Using Differentiable Simulation. arXiv preprint arXiv:2403.14864.
Li, C., Krause, A. and Hutter, M., 2025. Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics. arXiv preprint arXiv:2501.10100.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2025-02-25
Organization Robotic Systems Lab
Hosts Li Chenhao
Topics Engineering and Technology
Supervised learning for loco-manipulation
For Spot arm operations, we propose a multi-phase approach combining supervised learning and reinforcement learning (RL). First, we will employ supervised learning to develop a model for solving inverse kinematics (IK), enabling precise joint-angle calculations from a desired end-effector pose. Next, we will use another supervised learning technique to build a collision avoidance model, trained to predict and avoid self-collisions based on arm configurations and environmental data. With these pre-trained networks, we will then integrate RL to generate dynamic and safe arm-motion plans. The RL agent will leverage the IK and collision avoidance models to optimize arm trajectories, ensuring efficient and collision-free movements. Because both pre-trained models are differentiable, gradients can be backpropagated through the entire pipeline, promising to enhance the accuracy, safety, and flexibility of robotic arm operations in complex environments.
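The supervised IK idea relies on a standard trick: sample joint configurations, run forward kinematics to obtain labeled end-effector poses, and train a regressor on the inverse mapping, so no analytic IK solver is needed. A sketch for a planar two-link arm (the link lengths, and the restricted sampling range that sidesteps multi-solution ambiguity, are illustrative assumptions, not Spot's kinematics):

```python
import numpy as np

def fk(q, l1=0.3, l2=0.25):
    """Planar two-link forward kinematics; link lengths are placeholders."""
    q = np.asarray(q, dtype=float)
    x = l1 * np.cos(q[..., 0]) + l2 * np.cos(q[..., 0] + q[..., 1])
    y = l1 * np.sin(q[..., 0]) + l2 * np.sin(q[..., 0] + q[..., 1])
    return np.stack([x, y], axis=-1)

# Supervised IK dataset: inputs X are end-effector poses, targets Y are
# the joint angles that produced them. A neural network fit on (X, Y)
# then approximates the inverse map pose -> joints.
rng = np.random.default_rng(0)
Y = rng.uniform([0.0, 0.1], [np.pi / 2, np.pi / 2], size=(5000, 2))
X = fk(Y)
```

The same label-by-simulation recipe extends to the collision avoidance model: sample arm configurations, check self-collision in simulation, and train a classifier on the results.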
Keywords
Spot, Supervised learning, loco-manipulation
Labels
Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
This project is set to limited visibility by its publisher. To see the project description you need to log in at SiROP. Please follow these instructions:
- Click link "Open this project..." below.
- Log in to SiROP using your university login or create an account to see the details.
If your affiliation is not created automatically, please follow these instructions: http://bit.ly/sirop-affiliate
More information
Open this project...
Published since: 2025-02-10 , Earliest start: 2025-02-10 , Latest end: 2026-03-01
Organization Robotic Systems Lab
Hosts Mirrazavi Sina
Topics Information, Computing and Communication Sciences
Model-Based Reinforcement Learning for Loco-manipulation
This project aims to develop a model-based reinforcement learning (RL) framework that enables quadruped robots to perform dynamic locomotion and manipulation simultaneously, leveraging advanced model-based RL algorithms such as DreamerV3, TD-MPC2, and SAM-RL. We will develop control policies that can predict future states and rewards, enabling the robot to adapt its behavior on the fly. The primary focus will be on achieving stable and adaptive walking patterns while reaching and grasping objects. The outcome will provide insights into the integration of complex behaviors in robotic systems, with potential applications in service robotics and automated object handling.
Labels
Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-10 , Earliest start: 2025-02-10 , Latest end: 2026-02-10
Organization Robotic Systems Lab
Hosts Mirrazavi Sina
Topics Information, Computing and Communication Sciences
Integrating OpenVLA for Vision-Language-Driven Loco-Manipulation robotics scenarios
This thesis proposes to integrate and adapt the OpenVLA (Open-Source Vision-Language-Action) model to control the Spot robotic arm for performing complex grasping and placing tasks. The study will focus on enabling the robot to recognize, grasp, and organize various toy-sized kitchen items based on human instructions. By leveraging OpenVLA's robust multimodal capabilities, this project aims to bridge the gap between human intent and robotic actions, enabling seamless task execution in unstructured environments. The research will explore the feasibility of fine-tuning OpenVLA for task-specific operations and evaluate its performance in real-world scenarios, providing valuable insights for advancing multimodal robotics.
Labels
Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-10 , Earliest start: 2025-02-10 , Latest end: 2026-02-10
Organization Robotic Systems Lab
Hosts Mirrazavi Sina
Topics Information, Computing and Communication Sciences
Differentiable Simulation for Precise End-Effector Tracking
Unlock the potential of differentiable simulation on ALMA, a quadrupedal robot equipped with a robotic arm. Differentiable simulation enables precise gradient-based optimization, promising greater tracking accuracy and efficiency compared to standard reinforcement learning approaches. This project dives into advanced simulation and control techniques, paving the way for improvements in robotic trajectory tracking.
Keywords
Differentiable Simulation, Learning, ALMA
Labels
Semester Project , Bachelor Thesis , Master Thesis
Description
Differentiable simulation [1] has demonstrated significant improvements in sample efficiency compared to traditional reinforcement learning approaches across various applications, including legged locomotion [2]. This project seeks to explore another key advantage of differentiable simulation: its capability for more precise optimization. The study will focus on a tracking task involving ALMA, a quadrupedal robot equipped with a robotic arm. The primary objectives are to develop a differentiable simulation environment for the robot and evaluate its advantages over traditional reinforcement learning methods. By utilizing the gradients provided by the simulation, control policies will be optimized to improve tracking performance. The work involves creating a tailored differentiable simulation, systematically comparing its performance with reinforcement learning techniques, and analyzing its impact on accuracy and real-world applicability. This project provides an opportunity to contribute to advanced research in robotics by combining theoretical insights with practical implementation.
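The core idea can be illustrated on a toy problem: with a differentiable rollout, the gradient of a tracking loss with respect to the actions is available directly, and plain gradient descent recovers an action sequence that reaches the target. A hand-differentiated 1-D point mass stands in for a full differentiable simulator of ALMA; all constants are illustrative:

```python
import numpy as np

def simulate(actions, dt=0.05, x0=0.0, v0=0.0):
    """Semi-implicit Euler rollout of a 1-D point mass; returns final position."""
    x, v = x0, v0
    for a in actions:
        v += a * dt
        x += v * dt
    return x

def grad_actions(actions, target, dt=0.05):
    """Closed-form gradient of the loss (x_N - target)^2 w.r.t. each
    action: dx_N/da_j = dt^2 * (N - j). This stands in for autodiff
    through a differentiable simulator."""
    N = len(actions)
    err = simulate(actions, dt) - target
    return 2.0 * err * dt**2 * (N - np.arange(N))

actions = np.zeros(20)
for _ in range(200):  # gradient descent on the tracking loss
    actions -= 5.0 * grad_actions(actions, target=1.0)
```

After optimization, `simulate(actions)` lands on the target using exact gradients from a single rollout per step, whereas a model-free RL baseline would need many sampled rollouts to estimate the same descent direction.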
References
- [1] H. J. Suh, M. Simchowitz, K. Zhang, and R. Tedrake, "Do differentiable simulators give better policy gradients?" in International Conference on Machine Learning. PMLR, 2022, pp. 20668–20696.
- [2] Schwarke, C., Klemm, V., Tordesillas, J., Sleiman, J. P., & Hutter, M. (2024). Learning Quadrupedal Locomotion via Differentiable Simulation. arXiv preprint arXiv:2404.02887
Work Packages
- Literature research
- Implementation of a differentiable simulation environment for ALMA
- Training and evaluation of tracking policies
Requirements
- Excellent knowledge of Python
- Background in Simulation or Learning
Contact Details
Please send your CV, transcript and a short motivation (4-5 sentences max.) to:
More information
Open this project...
Published since: 2025-02-07 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Mittal Mayank , Schwarke Clemens , Klemm Victor
Topics Information, Computing and Communication Sciences
Modeling and Simulation for Earthwork in Digital Twin
In this work, we aim to build a digital twin of our autonomous hydraulic excavator, leveraging Mathworks technology for high-fidelity modeling. This will be used in the future to test and benchmark our learning-based controllers.
Keywords
Modeling, Hydraulics, Excavation, Industry
Labels
Semester Project , Master Thesis , ETH Zurich (ETHZ)
PLEASE LOG IN TO SEE DESCRIPTION
More information
Open this project...
Published since: 2025-02-06 , Earliest start: 2025-03-03
Organization Robotic Systems Lab
Hosts Spinelli Filippo , Nan Fang
Topics Information, Computing and Communication Sciences , Engineering and Technology
Reinforcement Learning for Excavation Planning In Terra
We aim to develop a reinforcement learning-based global excavation planner that can plan for the long term and execute a wide range of excavation geometries. The system will be deployed on our legged excavator.
Keywords
Reinforcement learning, task planning
Labels
Semester Project , Master Thesis
Description
Reinforcement learning has demonstrated significant success in decision-making and behavior planning with discrete states and action spaces. In this project, we plan to develop and extend a global excavation planner responsible for selecting the next digging area and the actions required to move soil around the site. This requires long-term planning of the excavation sequence and an understanding of which areas are accessible and where the excavator could potentially become trapped. Using JAX, we developed a simulation environment, Terra [3], where agents can be trained in millions of parallel environments on multiple GPUs. The first part of the project will focus on modifying the simulation environment to have a continuous state space and to include 3D soil, simulating a real construction site. The second part will focus on deploying the system and integrating it with the current stack to dig geometries that have not been achieved so far.
References
- [1] PPO for LUX AI
- [2] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model, DeepMind
- [3] Terra: https://github.com/leggedrobotics/terra
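The massively parallel training setup can be pictured as a single batched environment step in which every operation is vectorized over the batch dimension; Terra does this with JAX on GPU, while the numpy sketch below, with a toy soil grid and four move actions, is illustrative only:

```python
import numpy as np

MOVES = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])  # action id -> (dx, dy)

def step_batch(agent_xy, soil, actions):
    """One vectorized step for a batch of toy grid-excavation environments.
    agent_xy: (B, 2) int positions, soil: (B, H, W) remaining soil,
    actions: (B,) int action ids. All per-environment updates happen
    with array indexing, never a Python loop over the batch."""
    B, H, W = soil.shape
    nxt = np.clip(agent_xy + MOVES[actions], [0, 0], [W - 1, H - 1])
    b = np.arange(B)
    reward = soil[b, nxt[:, 1], nxt[:, 0]].copy()  # soil removed at the new cell
    soil[b, nxt[:, 1], nxt[:, 0]] = 0.0
    return nxt, soil, reward
```

In Terra the same pattern is expressed with `jax.vmap` and JIT compilation, which is what makes millions of parallel environments practical on GPUs.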
Requirements
- Experience in PyTorch and training neural networks
- Experience with GPU-accelerated environments (preferred)
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Model Based Reinforcement Learning
We want to train an excavator agent to dig in a variety of soils using a fast, GPU-accelerated soil particle simulator in Isaac Sim.
Keywords
particle simulation, Omniverse, Warp, reinforcement learning, model-based reinforcement learning
Labels
Semester Project , Master Thesis
Description
Model-free reinforcement learning approaches (such as PPO) for training excavation agents in simulated environments struggle with computational demands, especially under realistic soil dynamics, so a simplified soil model has been needed to train successfully [1]. We have developed a particle-simulation soil model in NVIDIA Isaac Sim [2], which enhances realism but also increases the computational load, making it unsuitable for model-free RL algorithms like PPO, whose high sample requirements lead to slow training. This project aims to explore and implement model-based RL algorithms, such as the Dreamer algorithm [3], to train excavation agents efficiently in our particle simulator. Dreamer's predictive world models promise improved sample efficiency, potentially overcoming the computational challenges of our GPU-accelerated simulator.
[1] Egli, Pascal, et al. "Soil-Adaptive Excavation Using Reinforcement Learning." IEEE Robotics and Automation Letters 7.4 (2022): 9778-9785.
[2] NVIDIA Omniverse
[3] Mastering Diverse Domains through World Models, Danijar Hafner et al., arXiv 2023
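To make the core idea of Dreamer-style model-based RL concrete, here is a deliberately tiny illustration: a learned transition model is rolled forward to score candidate actions without ever querying the expensive simulator. The 1-D dynamics, reward, and candidate set are hypothetical stand-ins, not the project's actual models:

```python
# Toy illustration of planning inside a learned world model, the core idea
# behind Dreamer-style model-based RL. All quantities are hypothetical.

def world_model(state, action):
    # stands in for a learned transition model; here simply s' = s + a
    return state + action

def imagined_return(state, actions, goal):
    # roll the model forward "in imagination", never touching the simulator
    total = 0.0
    for a in actions:
        state = world_model(state, a)
        total -= abs(goal - state)   # dense reward: negative distance to goal
    return total

def plan(state, goal, horizon=3):
    # score a small set of constant-action candidates and pick the best
    candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
    return max(candidates,
               key=lambda a: imagined_return(state, [a] * horizon, goal))
```

The sample-efficiency argument is visible even here: each planning step costs only cheap model evaluations, while a model-free agent would need fresh simulator rollouts.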
Work Packages
Adapting Dreamer/model-based RL algorithm to interact effectively with our GPU-accelerated particle simulation.
Training and evaluating excavation agents and their world model within this framework.
Optimizing the algorithm and simulation settings to successfully train the excavator to scoop in a variety of soils.
Requirements
• Experience in training neural networks and RL
• Experience with ROS and good knowledge of Python
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-02-28 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Egli Pascal Arturo , Terenzi Lorenzo
Topics Information, Computing and Communication Sciences , Engineering and Technology
Reinforcement Learning for Particle-Based Excavation in Isaac Sim
We want to train RL agents in our new particle simulator, accelerated on the GPU via Warp in Isaac Sim.
Keywords
particle simulation, Omniverse, Warp, reinforcement learning
Labels
Semester Project , Master Thesis
Description
Training reinforcement-learning digging agents in simulation has so far only been possible with simplified soil models that compute the forces experienced at the shovel edge [1]. Strong domain randomization is necessary for sim-to-real transfer, but it has not yet been possible to simulate soil inhomogeneities and soil displacement fast enough. The latter, in particular, is essential for training agents that can displace loose soil. We want to use NVIDIA Isaac Sim, which features GPU-based high-performance simulation and is part of NVIDIA Omniverse [2]. The objective of this thesis is to further develop a particle-simulation soil model and use it to train an excavation agent.
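As a loose intuition for the bucket-soil interaction to be simulated, the sketch below treats soil as individual particles and "scoops" the ones inside the bucket's sweep. This is a hypothetical toy, far simpler than the Warp-based GPU simulation the project targets:

```python
# Very simplified particle "scooping" sketch (hypothetical; the project
# uses a Warp-based GPU particle simulator in Isaac Sim).

def scoop(particles, bucket_x, bucket_width, bucket_depth):
    """Remove particles inside the bucket's sweep; return the remaining
    particles and the number scooped."""
    remaining, scooped = [], 0
    for x, y in particles:
        if bucket_x <= x <= bucket_x + bucket_width and y <= bucket_depth:
            scooped += 1            # particle ends up in the bucket
        else:
            remaining.append((x, y))
    return remaining, scooped
```

A real soil model additionally has to resolve particle-particle contacts and displacement, which is exactly the part that is expensive to simulate fast enough.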
Work Packages
- Improve particle-based soil model in simulation using Warp and simulate bucket-soil interactions
- Train reinforcement learning agents to learn digging in the designed environment
Requirements
- Experience in training neural networks and RL
- Experience with ROS and good knowledge of Python
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-09-30
Organization Robotic Systems Lab
Hosts Egli Pascal Arturo , Mittal Mayank , Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Perceptive Reinforcement Learning for Excavation
In this project, our goal is to leverage precomputed embeddings (from a VAE in Isaac Sim) of 3D earthworks scene reconstructions to train reinforcement learning agents. These embeddings, derived from incomplete point-cloud data and reconstructed using an encoder-decoder neural network, will serve as latent representations. The main emphasis is on utilizing these representations to develop and train reinforcement learning policies for digging tasks.
Keywords
LIDAR, 3D reconstruction, Isaac Gym, deep learning, perception, reinforcement learning
Labels
Semester Project , Master Thesis
Description
Our excavator, M545, is equipped with a LIDAR that allows it to precisely perceive the 3D construction scene. However, occlusion caused by mounds of soil and irregularities in the terrain often prevents us from obtaining complete information about the environment, which makes arm planning and collision avoidance against the terrain more challenging. In this project, we aim to exploit the regularities found in typical earthworks sites to perform a 3D neural reconstruction of the scene from point-cloud data. To do this, we plan to collect data in simulation by procedurally generating a set of plausible terrains in Isaac Gym and using them to train a network that can reconstruct the 3D scene [1]. The latent representation of the network will then be used to develop reinforcement learning policies.
References: [1] Neural scene representation for locomotion on structured terrain [2] Self-Supervised Point Cloud Understanding via Mask Transformer and Contrastive Learning
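The procedural terrain generation mentioned above can be illustrated with classic midpoint displacement. The function below is a hypothetical 1-D toy; the actual terrains generated in Isaac Gym would be 2-D heightmaps or meshes:

```python
import random

# Sketch of procedural terrain generation for simulated LIDAR training data
# (hypothetical toy; the project generates full 3D terrains in Isaac Gym).

def midpoint_displacement(n_levels, roughness=0.5, seed=0):
    """1-D heightmap of length 2**n_levels + 1 via midpoint displacement:
    repeatedly insert randomly perturbed midpoints, halving the noise
    amplitude each level so large-scale structure dominates."""
    rng = random.Random(seed)
    heights = [0.0, 0.0]          # flat endpoints
    scale = 1.0
    for _ in range(n_levels):
        nxt = []
        for a, b in zip(heights, heights[1:]):
            mid = (a + b) / 2 + rng.uniform(-scale, scale)
            nxt.extend([a, mid])
        nxt.append(heights[-1])
        heights = nxt
        scale *= roughness
    return heights
```

Seeding the generator keeps the terrains reproducible, which matters when the same set of terrains is reused to compare reconstruction networks.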
Work Packages
- Sensor simulation and procedural environment generation
- Training of 3D sparse network
- Field deployment in a digging task
Requirements
- Experience in deep learning
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Höller David , Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Reinforcement Learning of Pretrained Transformer Models
We want to train RL agents in our new particle simulator, accelerated on the GPU via Warp in Isaac Sim.
Keywords
particle simulation, Omniverse, Warp, reinforcement learning
Labels
Semester Project , Master Thesis
Description
This project tackles the computational challenges of model-free on-policy reinforcement learning algorithms, such as Proximal Policy Optimization (PPO), in training excavation agents within simulated environments featuring realistic soil dynamics. To achieve this, we developed a particle-based soil simulation model using NVIDIA Isaac Sim, which enhances realism but significantly increases computational demands. This makes the model impractical for high-sample-demand algorithms like PPO on small MLPs, resulting in prolonged training times. To address this, the project explores RL fine-tuning of large pretrained decoder transformers, which are substantially more sample-efficient than small MLPs. The pretrained GPT model is trained on multiple earthworks tasks, including digging with a simplified soil model.
Work Packages
Training and evaluating excavation agents.
Optimizing the algorithm and simulation settings to successfully train the excavator to scoop in a variety of soils.
Requirements
• Experience in training neural networks and RL
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Multiagent Reinforcement Learning in Terra
We want to train multiple agents in the Terra environment, a fully end-to-end GPU-accelerated environment for RL training.
Keywords
multiagent reinforcement learning, jax, deep learning, planning
Labels
Semester Project , Master Thesis
Description
Construction sites require a variety of machinery to operate efficiently, including cranes, skid steers, backhoes, excavators, and trucks. Training a single Reinforcement Learning (RL) agent, such as an excavator, does not sufficiently simulate the complexities of a real-world construction workflow. In this project, we aim to enhance the Terra simulator [1] by integrating multiple types of agents to more realistically represent a construction site environment. The subsequent goal is to train these agents to collaborate effectively and complete specific earthwork projects successfully.
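A minimal flavour of the multi-agent setting: several agents share one grid and their moves must be resolved jointly. The collision rule below is a hypothetical placeholder, not Terra's actual mechanics:

```python
# Minimal sketch of jointly resolving moves for several agents on a shared
# grid (hypothetical; Terra's real multi-agent interface may differ).

def step_agents(positions, moves, width, height):
    """Apply one move per agent. A move is cancelled if it would leave
    the grid, or if two agents would occupy the same cell."""
    proposed = []
    for (r, c), (dr, dc) in zip(positions, moves):
        nr, nc = r + dr, c + dc
        if 0 <= nr < height and 0 <= nc < width:
            proposed.append((nr, nc))
        else:
            proposed.append((r, c))      # out of bounds: stay put
    # cancel colliding moves: each colliding agent falls back to its old cell
    resolved = []
    for i, p in enumerate(proposed):
        if proposed.count(p) > 1:
            resolved.append(positions[i])
        else:
            resolved.append(p)
    return resolved
```

Even this toy rule shows why cooperation has to be learned: greedy individual moves can deadlock, so agents benefit from policies that anticipate each other.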
Work Packages
- expand the Terra simulator with different agents
- train multiple agents to cooperate using RL
Requirements
- experience training neural networks and RL
Contact Details
More information
Open this project...
Published since: 2025-02-03 , Earliest start: 2025-04-01 , Latest end: 2025-09-16
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Propose Your Own Robotics Challenge
This project invites you to step into the role of an innovator, encouraging you to identify challenges you are passionate about within the field of robotics. Rather than working on predefined problems, you will have the freedom to propose your own project ideas, address real-world issues, or explore cutting-edge topics. This project allows you to define your own research journey.
Keywords
Robotics, Research
Labels
Semester Project , Bachelor Thesis , Master Thesis
Description
Robotics is a rapidly evolving field with countless opportunities for innovation. This project gives you the chance to identify a robotics challenge that excites you, propose your own ideas, and pursue a project tailored to your interests. Whether it’s improving locomotion, enhancing human-robot interaction, or designing novel robotic applications, you are encouraged to think critically and creatively about the problems you want to solve. The emphasis is on self-motivation and ownership, with support provided to turn ideas into actionable research projects. This approach not only builds technical skills but also cultivates a passion for tackling meaningful challenges.
Work Packages
- Literature research
- Implementation
- Scientific evaluation
Requirements
- Excellent knowledge in the required tools (programming language or software)
- Strong engineering foundation
- Project experience
Contact Details
Please send your CV, transcripts and a short proposal (4-5 sentences max.) to:
More information
Open this project...
Published since: 2025-01-28 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Schwarke Clemens , Bjelonic Filip , Klemm Victor
Topics Information, Computing and Communication Sciences
Data Driven Simulation for End-to-End Navigation
Investigate how neural rendering can become the backbone of comprehensive, next-generation, data-driven simulation.
Keywords
Neural rendering, Simulation
Labels
Internship , Master Thesis
Description
Simulation-based training of locomotion and environment-interaction policies has recently shown tremendous success in pushing the abilities of real-world robots. Using massive parallelization, simulation-based learning enables robots to quickly learn new skills without the time and hardware investments attached to trying things out in the real world. However, one remaining challenge is that such simulators currently focus on physics, while the simulation of perception readings is often limited to simple geometry. In order to support, for example, end-to-end vision-based models, we'd like to add realistic image rendering of complex, realistic environments to such simulators.
In this project, we'd like to explore a data-driven approach to add such capabilities to a simulator. Specifically, neural rendering methods have made large progress in recent years and their use in simulators for training and validation is now actively being investigated. Challenges that need to be addressed are given by 1) runtime considerations for efficient use inside a simulator, 2) artifact-free rendering of novel views, and 3) the imposition of physical constraints such as watertight meshes or the structural stability of static environment reconstructions.
The project is conducted at The AI Institute, a recently established top robotics research institute created by the founders of Boston Dynamics.
References
[1] Neuralangelo: High-Fidelity Neural Surface Reconstruction, CVPR 2023 [2] ViPlanner: Visual Semantic Imperative Learning for Local Navigation, ICRA 2024 [3] OmniRe: Omni Urban Scene Reconstruction, arxiv 2024 [4] 3D Gaussian Splatting for Real-Time Radiance Field Rendering, Siggraph 2023
Work Packages
- Literature research
- Adding existing rendering functionality to simulator (similar to [2])
- Incorporate gaussian splatting based rendering into simulator
- Improve gaussian splatting for the use in simulators (render quality, mesh extraction, …)
- Setup validation pipeline for simulator (validate end-to-end policy, or VIO, …)
Requirements
- Excellent knowledge of Python
- Computer vision experience
- Knowledge of neural rendering methods
Contact Details
Alexander Liniger (aliniger@theaiinstitute.com) Igor Bogoslavskyi (ibogoslavskyi@theaiinstitute.com)
Please include an up-to-date CV and transcript.
More information
Open this project...
Published since: 2025-01-24 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Kneip Laurent
Topics Information, Computing and Communication Sciences , Engineering and Technology
Evolving Minds: Neuroevolution for Legged Locomotion
This project explores the use of neuroevolution for optimizing control policies in legged robots, moving away from classical gradient-based methods like PPO. Neuroevolution directly optimizes network parameters and structures, potentially offering advantages in environments with sparse rewards, while requiring fewer hyperparameters to tune. By leveraging genetic algorithms and evolutionary strategies, the project aims to develop efficient controllers for complex locomotion tasks. With computational capabilities doubling approximately every two years as predicted by Moore's Law, neuroevolution offers a promising approach for scaling intelligent control systems.
Keywords
Evolutionary Algorithms, Reinforcement Learning, Quadrupeds, Legged Locomotion
Labels
Master Thesis
Description
Reinforcement learning for legged robots typically relies on policy gradient methods, which can struggle in environments with sparse rewards and require extensive hyperparameter tuning. This project investigates neuroevolution [1], an alternative approach where the neural network controllers are optimized through evolutionary processes. Neuroevolution allows simultaneous optimization of policy parameters and network architecture, potentially improving performance and simplifying tuning. With computational power continuously advancing, neuroevolution can exploit this trend to scale efficiently for complex robotics applications [2].
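The core loop of an evolution strategy (in the spirit of [2]) fits in a few lines: perturb the parameters, evaluate fitness, and step along the fitness-weighted noise. The quadratic fitness below is a hypothetical stand-in for a locomotion reward:

```python
import random

# Minimal evolution-strategies sketch: estimate a search gradient from
# random parameter perturbations and their fitness, then update. The
# quadratic "fitness" is a hypothetical stand-in for a locomotion reward.

def evolution_strategy(fitness, theta, iters=200, pop=20, sigma=0.1,
                       lr=0.05, seed=0):
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iters):
        noises, scores = [], []
        for _ in range(pop):
            eps = [rng.gauss(0, 1) for _ in theta]
            cand = [t + sigma * e for t, e in zip(theta, eps)]
            noises.append(eps)
            scores.append(fitness(cand))
        mean = sum(scores) / pop
        # search-gradient estimate: noise weighted by centred fitness
        for i in range(len(theta)):
            g = sum((s - mean) * n[i] for s, n in zip(scores, noises))
            theta[i] += lr * g / (pop * sigma)
    return theta

# maximise -(x - 3)^2; the optimum is at x = 3
best = evolution_strategy(lambda p: -(p[0] - 3.0) ** 2, [0.0])
```

Note the loop needs only fitness evaluations, no backpropagation, which is why it parallelizes so naturally and copes with sparse or non-differentiable rewards.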
References
- [1] Galván, Edgar, and Peter Mooney. "Neuroevolution in deep neural networks: Current trends and future challenges." IEEE Transactions on Artificial Intelligence
- [2] Salimans, Tim, et al. "Evolution strategies as a scalable alternative to reinforcement learning.", OpenAI
Work Packages
- Study previous applications in robotics and control
- Implement neuroevolution algorithms within the Isaac Gym/Isaac Lab frameworks
- Compare performance, sample efficiency, and robustness against PPO
Requirements
- Excellent knowledge of Python (C++)
- Background in RL and Learning Methods
Contact Details
Please send your CV, transcripts and a short motivation (4-5 sentences max.) to:
More information
Open this project...
Published since: 2025-01-22 , Earliest start: 2025-01-27
Organization Robotic Systems Lab
Hosts Bjelonic Filip , Schwarke Clemens
Topics Information, Computing and Communication Sciences
Design of a Compliant Mechanism for Human-Robot Collaborative Transportation with Non-Holonomic Robots
Human-robot collaboration is an attractive option in many industries for transporting long and heavy items with a single operator. In this project, we aim to enable human-robot collaborative transportation with a non-holonomic robotic base platform by designing a compliant manipulation mechanism, inspired by systems like the Omnid Mocobots.
Keywords
Human-robot collaboration, Collaborative transportation, Non-holonomic robot, Mobile manipulation
Labels
Master Thesis , ETH Zurich (ETHZ)
Description
To prevent long-term injuries, worker safety regulations limit the weight a human is allowed to carry. For instance, in many countries within the construction sector, workers are permitted to lift at most 25 kg at a time. Consequently, transporting objects exceeding this weight typically requires a second worker or the use of heavy construction equipment. This not only increases labor costs but also introduces logistical challenges.
Human-robot collaboration offers a promising solution by enabling a single operator to handle heavy and bulky items safely and efficiently. The Omnid Mocobots [1] have demonstrated success in this area by employing compliant manipulators with accurate force control on omnidirectional mobile bases. This setup allows operators to maneuver heavy objects effortlessly and serves as an inspiration for our project.
Building upon this concept, the goal of this project is to design and integrate a compliant manipulation mechanism onto "Smally", a skid-steer wheeled-legged mobile robot developed by Hilti. The manipulator will enable gravity compensation and smooth motion while handling heavy payloads. The project includes evaluating system requirements, mechanical design, fabrication, and validation through control algorithm development. The successful implementation aims to enhance worker safety, reduce labor costs, and increase efficiency in environments where full robot autonomy is challenging.
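For intuition on the gravity compensation the mechanism must provide, consider the simplest possible case of a single revolute joint holding a payload. The model and numbers are purely illustrative; the actual design is a parallel mechanism with force control:

```python
import math

# Back-of-the-envelope gravity-compensation sketch for one revolute joint
# carrying a point-mass payload (illustrative only; the real mechanism is
# a force-controlled parallel manipulator).

def gravity_compensation_torque(mass_kg, arm_length_m, joint_angle_rad,
                                g=9.81):
    """Holding torque so the payload feels weightless to the operator.
    joint_angle_rad = 0 means the arm is horizontal (the worst case)."""
    return mass_kg * g * arm_length_m * math.cos(joint_angle_rad)
```

For a 25 kg payload (the regulatory limit mentioned above) on a 1 m horizontal arm this is roughly 245 Nm, which gives a feel for the actuation loads the design must budget for.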
Work Packages
- Literature research
- Parallel manipulator mechanical design
- System integration on Smally
Requirements
- Mechatronics and system integration experience
- Understanding of parallel mechanisms
- Knowledge of C/C++
Contact Details
Please send your application with your CV and transcript to the following emails:
- Francesca Bray: frbray@ethz.ch
- Julien Kindle: jkindle@ethz.ch
More information
Open this project...
Published since: 2025-01-16 , Earliest start: 2024-07-08
Applications limited to ETH Zurich
Organization Robotic Systems Lab
Hosts Kindle Julien , Bray Francesca
Topics Information, Computing and Communication Sciences , Engineering and Technology
How to Touch: Exploring Tactile Representations for Reinforcement Learning
Developing and benchmarking tactile representations for dexterous manipulation tasks using reinforcement learning.
Keywords
Reinforcement Learning, Dexterous Manipulation, Tactile Sensing
Labels
Semester Project , Bachelor Thesis , Master Thesis
PLEASE LOG IN TO SEE DESCRIPTION
This project is set to limited visibility by its publisher. To see the project description you need to log in at SiROP. Please follow these instructions:
- Click link "Open this project..." below.
- Log in to SiROP using your university login or create an account to see the details.
If your affiliation is not created automatically, please follow these instructions: http://bit.ly/sirop-affiliate
More information
Open this project...
Published since: 2025-01-08 , Earliest start: 2024-12-15 , Latest end: 2025-06-01
Applications limited to ETH Zurich
Organization Robotic Systems Lab
Hosts Bhardwaj Arjun , Zurbrügg René
Topics Information, Computing and Communication Sciences
BEV meets Semantic traversability
Enable Birds-Eye-View perception on autonomous mobile robots for human-like navigation.
Keywords
Semantic Traversability, Birds-Eye-View, Localization, SLAM, Object Detection
Labels
Master Thesis , ETH Zurich (ETHZ)
Description
Autonomous driving has made tremendous progress in recent years through innovations in learning-based methods [1]. An emerging enabler are Birds-Eye-View (BEV) methods, which allow vehicles to understand and reason about their surroundings in real time. In this project, we aim to transfer this research to autonomous mobile robots in real-world, human-inhabited environments. While the rules of navigation and traversability are well defined for autonomous driving, one exciting aspect of this project is finding analogous representations for everyday human environments. We would like to explore these methods on a range of robots including Spot, ANYmal, the Ultra Mobility Vehicle, and humanoids.
References [1] Liao, B., Chen, S., Zhang, Y., Jiang, B., Zhang, Q., Liu, W., ... & Wang, X. (2024). Maptrv2: An end-to-end framework for online vectorized hd map construction. International Journal of Computer Vision, 1-23. [2] Kim, Y., Lee, J. H., Lee, C., Mun, J., Youm, D., Park, J., & Hwangbo, J. (2024). Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy. arXiv preprint arXiv:2406.02989. [3] https://rpl-cs-ucl.github.io/STEPP/
This project is hosted at The AI institute in collaboration with RSL.
Work Packages
- Research latest BEV methods
- Develop BEV-inspired methods for mobile robotic semantic traversability.
- Deployment on real robots
Requirements
- Excellent knowledge of C++, Python
- Familiarity with learning framework, e.g. pytorch
- Experience with ROS2 is a plus
Contact Details
Email Abel (agawel@theaiinstitute.com) and Laurent (lkneip@theaiinstitute.com). Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2024-12-18 , Earliest start: 2025-01-15 , Latest end: 2025-10-31
Organization Robotic Systems Lab
Hosts Gawel Abel
Topics Information, Computing and Communication Sciences , Engineering and Technology
Scene graphs for robot navigation and reasoning
Elevate semantic scene graphs to a new level and perform semantically-guided navigation and interaction with real robots at The AI Institute.
Keywords
Scene graphs, SLAM, Navigation, Spacial Reasoning, 3D reconstruction, Semantics
Labels
Master Thesis , ETH Zurich (ETHZ)
Description
Human environments often adhere to implicit and explicit semantic structures that are easily understood by humans. For autonomous mobile robots to act in these environments, we aim to investigate how to represent this understanding of the environment. One technology that has gained popularity in recent years is the scene graph, which allows robots to spatially deconstruct the world into a graph with multiple levels of abstraction, where nodes represent places, rooms, objects, etc., and edges represent the relationships between them. In this project, we aim to elevate semantic scene graphs to a level that supports semantically-guided navigation and interaction. We would like to explore these methods on a range of robots including Spot, ANYmal, the Ultra Mobility Vehicle, and humanoids.
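The data structure itself is simple; the hard parts are building it from perception and querying it for navigation. A hypothetical minimal scene graph with a containment query might look like this (real frameworks such as Hydra [1] are far richer):

```python
# Minimal scene-graph sketch: typed nodes at several abstraction levels
# (room, object, ...) with labelled edges, plus one simple query.
# Hypothetical structure for illustration only.

class SceneGraph:
    def __init__(self):
        self.nodes = {}            # id -> {"type": ..., "label": ...}
        self.edges = []            # (parent_id, relation, child_id)

    def add_node(self, node_id, node_type, label):
        self.nodes[node_id] = {"type": node_type, "label": label}

    def add_edge(self, parent, relation, child):
        self.edges.append((parent, relation, child))

    def objects_in(self, room_id):
        """All object nodes linked to a room by a 'contains' edge."""
        return [c for p, rel, c in self.edges
                if p == room_id and rel == "contains"
                and self.nodes[c]["type"] == "object"]

g = SceneGraph()
g.add_node("kitchen", "room", "kitchen")
g.add_node("cup", "object", "coffee cup")
g.add_edge("kitchen", "contains", "cup")
```

Queries like `objects_in` are what "semantically-guided navigation" builds on: a goal such as "fetch the cup" reduces to graph traversal before any geometric planning happens.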
References
[1] Hughes, N., Chang, Y., & Carlone, L. (2022). Hydra: A real-time spatial perception system for 3D scene graph construction and optimization. arXiv preprint arXiv:2201.13360. [2] Honerkamp, D., Büchner, M., Despinoy, F., Welschehold, T., & Valada, A. (2024). Language-grounded dynamic scene graphs for interactive object search with mobile manipulation. IEEE Robotics and Automation Letters. [3] Gu, Q., Kuwajerwala, A., Morin, S., Jatavallabhula, K. M., Sen, B., Agarwal, A., ... & Paull, L. (2024, May). Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning. In 2024 IEEE International Conference on Robotics and Automation (ICRA) (pp. 5021-5028). IEEE.
This thesis will be hosted at The AI Institute in collaboration with RSL.
Work Packages
- Familiarization with latest scene graph frameworks
- Build new scene graph representations that seamlessly integrate with robot navigation
- Deployment on real robots
Requirements
- Excellent knowledge of C++, Python
- Familiarity with learning framework, e.g. pytorch
- Experience with ROS2 is a plus
Contact Details
Email Abel (agawel@theaiinstitute.com) and Alex (aliniger@theaiinstitute.com). Please include your CV and up-to-date transcript.
More information
Open this project...
Published since: 2024-12-18 , Earliest start: 2025-01-15 , Latest end: 2025-10-31
Organization Robotic Systems Lab
Hosts Gawel Abel
Topics Information, Computing and Communication Sciences , Engineering and Technology
Digital Twin for Spot's Home
Motivation: Creating a digital twin of the robot's environment is crucial for several reasons:
1. Simulate different robots: test various robots in a virtual environment, saving time and resources.
2. Accurate evaluation: precisely assess robot interactions and performance.
3. Enhanced flexibility: easily modify scenarios to develop robust systems.
4. Cost efficiency: reduce costs by identifying issues in virtual simulations.
5. Scalability: replicate multiple environments for comprehensive testing.
Proposal: We propose to create a digital twin of our semantic environment, designed in your preferred graphics platform, so that reinforcement learning agents can be simulated in the digital environment, creating a unified evaluation platform for robotic tasks.
Keywords
Digital Twin, Robotics
Labels
Semester Project , Master Thesis
Contact Details
Requirements: experience with a Python deep learning framework, understanding of 3D scene and camera geometry.
Please send us a CV and transcript.
Dr. Hermann Blum (blumh@ethz.ch) Dr. Zuria Bauer (zbauer@ethz.ch) Tifanny Portela (tportela@ethz.ch) Jelena Trisovic (tjelena@ethz.ch)
More information
Open this project...
Published since: 2024-12-17 , Earliest start: 2025-01-05
Applications limited to University of Zurich , ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Computer Vision and Geometry Group
Hosts Blum Hermann , Portela Tifanny , Bauer Zuria, Dr. , Trisovic Jelena
Topics Information, Computing and Communication Sciences
KALLAX Benchmark: Evaluating Household Tasks
Motivation: There are three ways to evaluate robots for pick-and-place tasks at home:
1. Simulation setups: high reproducibility, but hard to simulate real-world complexities and perception noise.
2. Competitions: good for comparing overall systems, but require significant effort and can't be held frequently.
3. Custom lab setups: common, but lead to overfitting and lack comparability between labs.
Proposal: We propose using IKEA furniture to create standardized, randomized setups that researchers can easily replicate, e.g., a 4x4 KALLAX unit with varying door knobs and drawer positions, generating tasks like "move the cup from the upper right shelf into the black drawer." This prevents overfitting and allows for consistent evaluation across different labs.
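The randomized-setup idea can be illustrated with a seeded task generator. All vocabulary below is hypothetical; the benchmark itself would fix the exact naming scheme:

```python
import random

# Sketch of randomized pick-and-place task generation for a 4x4 KALLAX
# unit. The object/container vocabulary is hypothetical.

ROWS = ["upper", "upper middle", "lower middle", "lower"]
COLS = ["left", "middle left", "middle right", "right"]
OBJECTS = ["cup", "book", "box"]
CONTAINERS = ["black drawer", "white drawer", "shelf with the round knob"]

def sample_task(seed):
    rng = random.Random(seed)     # seeded, so different labs can replicate
    src = f"{rng.choice(ROWS)} {rng.choice(COLS)} shelf"
    obj = rng.choice(OBJECTS)
    dst = rng.choice(CONTAINERS)
    return f"move the {obj} from the {src} into the {dst}"
```

Publishing only the seeds (rather than fixed task lists) is what prevents labs from overfitting to one particular arrangement while keeping evaluations comparable.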
Keywords
Benchmark, Robotics, pick-and-place
Labels
Semester Project , Master Thesis
Contact Details
Requirements: experience with a Python deep learning framework, understanding of 3D scene and camera geometry.
Please send us a CV and transcript. Dr. Hermann Blum (blumh@ethz.ch) René Zurbrügg (zrene@ethz.ch) Dr. Zuria Bauer (zbauer@ethz.ch)
More information
Open this project...
Published since: 2024-12-17 , Earliest start: 2025-01-06
Applications limited to University of Zurich , ETH Zurich , Swiss National Science Foundation , EPFL - Ecole Polytechnique Fédérale de Lausanne
Organization Computer Vision and Geometry Group
Hosts Blum Hermann , Bauer Zuria, Dr. , Zurbrügg René
Topics Information, Computing and Communication Sciences
Visual Language Models for Long-Term Planning
This project uses Visual Language Models (VLMs) for high-level planning and supervision in construction tasks, enabling task prioritization, dynamic adaptation, and multi-robot collaboration for excavation and site management.
Keywords
Visual Language Models, Long-term planning, Robotics
Labels
Semester Project , Master Thesis
Description
VLMs excel in reasoning and dynamic code generation, making them ideal for tasks like excavation sequencing, obstacle management, and multi-robot coordination. Applications include dynamic trenching, rock field clearing, and safety monitoring. The goal is to deploy VLM-based systems on autonomous excavators to enhance efficiency and adaptability.
Work Packages
Develop simulated and real scenarios for VLM-driven planning.
Integrate VLMs into excavation control systems for triggering tasks and code generation.
Benchmark performance in complex planning scenarios.
Contact Details
More information
Open this project...
Published since: 2024-12-06 , Earliest start: 2025-03-31 , Latest end: 2025-10-29
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Information, Computing and Communication Sciences
Diffusion-based Shared Autonomy System for Telemanipulation
Robots may not be able to complete tasks fully autonomously in unstructured or unseen environments, while direct teleoperation is hampered by limited situational awareness and degraded communication. In this project, we aim to develop a diffusion-based shared autonomy framework for teleoperation of manipulator arms that assists non-expert users and maintains performance under degraded communication.
Keywords
Imitation learning, Robotics, Manipulation, Teleoperation
Labels
Semester Project , ETH Zurich (ETHZ)
Description
Robots may not be able to complete tasks fully autonomously in unstructured or unseen environments, however direct teleoperation from human operators may also be challenging due to the difficulty of providing full situational awareness to the operator as well as degradation in communication leading to the loss of control authority. This motivates the use of shared autonomy for assisting the operator thereby enhancing the performance during the task.
In this project, we aim to develop a shared autonomy framework for teleoperation of manipulator arms, to assist non-expert users or in the presence of degraded communication. Imitation learning, such as diffusion models, have emerged as a popular and scalable approach for learning manipulation tasks [1, 2]. Additionally, recent works have combined this with partial diffusion to enable shared autonomy [3]. However, the tasks were restricted to simple 2D domains. In this project, we wish to extend previous work in the lab using diffusion-based imitation learning, to enable shared autonomy for non-expert users to complete unseen tasks or in degraded communication environments.
References
- [1] Zhao, T.Z., Kumar, V., Levine, S. and Finn, C., 2023. Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705
- [2] Chi C, Xu Z, Feng S, et al. Diffusion policy: Visuomotor policy learning via action diffusion. The International Journal of Robotics Research. 2024;0(0). doi:10.1177/02783649241273668
- [3] Yoneda, T., Sun, L., Stadie, B. and Walter, M., 2023. To the noise and back: Diffusion for shared autonomy. arXiv preprint arXiv:2302.12244.
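Conceptually, shared autonomy arbitrates between the operator's command and the policy's suggestion; partial diffusion [3] does this by only partially noising and then denoising the human action. The convex blend below is a much simpler, hypothetical stand-in for that arbitration:

```python
# Hypothetical shared-autonomy arbitration: a per-dimension convex blend
# of the human command and the learned policy's suggestion. Far simpler
# than the partial-diffusion formulation in [3].

def blend_action(human_action, policy_action, confidence):
    """confidence in [0, 1] is the weight given to the autonomous policy;
    0 = pure teleoperation, 1 = fully autonomous."""
    assert 0.0 <= confidence <= 1.0
    return [(1 - confidence) * h + confidence * p
            for h, p in zip(human_action, policy_action)]
```

In a degraded-communication setting, the confidence weight could be raised as operator commands become stale, letting the policy carry more of the task; that scheduling, however, is exactly what the project would need to design and evaluate.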
Work Packages
- Define the tasks that will be performed using the manipulator
- Set up data collection and training pipeline
- Generate dataset of task specific demonstrations and train a diffusion policy
- Transfer this policy to the robot
- Set up and run experiments to evaluate shared-autonomy performance
Requirements
- Previous experience with machine learning
- Strong software development skills and experience with Python and PyTorch
- Background in Robotics
- (Optional) Experience with ROS and C++
Contact Details
Please send a mail to earavind@ethz.ch with the Subject: "Application - Your Name - Diffusion-based Shared Autonomy System for Telemanipulation" with:
- BS/MS Transcript of Records
- CV
- Short motivation for your interest in the project
More information
Open this project...
Published since: 2024-12-02 , Earliest start: 2024-11-01 , Latest end: 2025-11-01
Applications limited to ETH Zurich , University of Zurich
Organization Robotic Systems Lab
Hosts Elanjimattathil Aravind
Topics Information, Computing and Communication Sciences , Engineering and Technology
Lifelike Agility on ANYmal by Learning from Animals
The remarkable agility of animals, characterized by their rapid, fluid movements and precise interaction with their environment, serves as an inspiration for advancements in legged robotics. Recent progress in the field has underscored the potential of learning-based methods for robot control. These methods streamline the development process by optimizing control mechanisms directly from sensory inputs to actuator outputs, often employing deep reinforcement learning (RL) algorithms. By training in simulated environments, these algorithms can develop locomotion skills that are subsequently transferred to physical robots. Although this approach has led to significant achievements in achieving robust locomotion, mimicking the wide range of agile capabilities observed in animals remains a significant challenge. Traditionally, manually crafted controllers have succeeded in replicating complex behaviors, but their development is labor-intensive and demands a high level of expertise in each specific skill. Reinforcement learning offers a promising alternative by potentially reducing the manual labor involved in controller development. However, crafting learning objectives that lead to the desired behaviors in robots also requires considerable expertise, specific to each skill.
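The learning-from-animals approach above typically scores the policy by how closely it tracks reference motions. A minimal DeepMimic-style pose-tracking reward, with most terms omitted, might look like:

```python
import numpy as np

def imitation_reward(q, q_ref, sigma=0.5):
    """DeepMimic-style pose-tracking term (Peng et al., 2018), simplified:
    an exponentiated negative distance between the robot's joint positions
    and the reference animal pose at the same motion phase. The full
    objective also weights velocity and end-effector terms, omitted here;
    sigma is an illustrative scale, not a tuned value."""
    q, q_ref = np.asarray(q, dtype=float), np.asarray(q_ref, dtype=float)
    return float(np.exp(-np.sum((q - q_ref) ** 2) / (2.0 * sigma ** 2)))
```

The reward is 1 when the pose matches the reference exactly and decays smoothly with tracking error, which is what makes it usable as a dense RL objective.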
Keywords
learning from demonstrations, imitation learning, reinforcement learning
Labels
Master Thesis
Description
Work packages
Literature research
Skill development from an animal dataset (available)
Hardware deployment
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature This project and the following literature will make you a master in imitation/demonstration/expert learning.
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J. and Levine, S., 2020. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784.
Peng, X.B., Ma, Z., Abbeel, P., Levine, S. and Kanazawa, A., 2021. Amp: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (ToG), 40(4), pp.1-20.
Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K. and Abbeel, P., 2022, October. Adversarial motion priors make good substitutes for complex reward functions. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 25-32). IEEE.
Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.
Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G. and Peng, X.B., 2023, July. Calm: Conditional adversarial latent models for directable virtual characters. In ACM SIGGRAPH 2023 Conference Proceedings (pp. 1-9).
Starke, Sebastian, et al. "Deepphase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Han, L., Zhu, Q., Sheng, J., Zhang, C., Li, T., Zhang, Y., Zhang, H., Liu, Y., Zhou, C., Zhao, R. and Li, J., 2023. Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Victor Klemm
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao , Klemm Victor
Topics Information, Computing and Communication Sciences
Pushing the Limit of Quadruped Running Speed with Autonomous Curriculum Learning
The project aims to explore curriculum learning techniques to push the limits of quadruped running speed using reinforcement learning. By systematically designing and implementing curricula that guide the learning process, the project seeks to develop a quadruped controller capable of achieving the fastest possible forward locomotion. This involves not only optimizing the learning process but also ensuring the robustness and adaptability of the learned policies across various running conditions.
Keywords
curriculum learning, fast locomotion
Labels
Master Thesis
Description
Quadruped robots have shown remarkable versatility in navigating diverse terrains, demonstrating capabilities ranging from basic locomotion to complex maneuvers. However, achieving high-speed forward locomotion remains a challenging task due to the intricate dynamics and control requirements involved. Traditional reinforcement learning (RL) approaches have made significant strides in this area, but they often face issues related to sample efficiency, convergence speed, and stability when applied to tasks with high degrees of freedom like quadruped locomotion.
Curriculum learning (CL), a concept inspired by the way humans and animals learn progressively from simpler to more complex tasks, offers a promising solution to these challenges. In the context of reinforcement learning, curriculum learning involves structuring the learning process by starting with simpler tasks and gradually increasing the complexity as the agent's proficiency improves. This approach can lead to faster convergence and better generalization by enabling the agent to build foundational skills before tackling more difficult scenarios.
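As a toy illustration of such a curriculum for running speed, one could raise the commanded velocity whenever the policy reliably tracks the current target. The thresholds and step sizes below are arbitrary placeholders, not a proposed design:

```python
class VelocityCurriculum:
    """Toy automatic curriculum: raise the commanded forward velocity
    whenever the recent tracking success rate clears a threshold.
    All numbers are illustrative placeholders."""

    def __init__(self, v_init=0.5, v_max=6.0, step=0.25, threshold=0.8, window=100):
        self.target_velocity = v_init
        self.v_max, self.step, self.threshold, self.window = v_max, step, threshold, window
        self.successes = []

    def record(self, tracked_ok):
        """Log one episode outcome; advance the target once a full window clears the bar."""
        self.successes.append(bool(tracked_ok))
        if len(self.successes) >= self.window:
            rate = sum(self.successes) / len(self.successes)
            if rate >= self.threshold:
                self.target_velocity = min(self.target_velocity + self.step, self.v_max)
            self.successes.clear()

curriculum = VelocityCurriculum()
for _ in range(100):            # the policy tracks 0.5 m/s reliably ...
    curriculum.record(True)     # ... so the target advances by one step
```

An autonomous curriculum in the sense of this project would replace the hand-picked threshold and step with quantities estimated from the agent's own learning signal.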
Work packages
Literature research
Development of autonomous curriculum
Comparison with baselines (no curriculum, hand-crafted curriculum)
Requirements
Strong programming skills in Python
Experience in reinforcement learning
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature This project and the following literature will make you a master in curriculum/active/open-ended learning.
Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE transactions on evolutionary computation, 11(2), pp.265-286.
Baranes, A. and Oudeyer, P.Y., 2009. R-iac: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.
Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired open-ended trailblazer (poet): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.
Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.
Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep rl in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.
Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Marco Bagatella
More information
Open this project...
Published since: 2024-11-26
Organization Robotic Systems Lab
Hosts Li Chenhao , Bagatella Marco
Topics Engineering and Technology
Humanoid Locomotion Learning and Finetuning from Human Feedback
In the burgeoning field of deep reinforcement learning (RL), agents autonomously develop complex behaviors through a process of trial and error. Yet, the application of RL across various domains faces notable hurdles, particularly in devising appropriate reward functions. Traditional approaches often resort to sparse rewards for simplicity, though these prove inadequate for training efficient agents. Consequently, real-world applications may necessitate elaborate setups, such as employing accelerometers for door interaction detection, thermal imaging for action recognition, or motion capture systems for precise object tracking. Despite these advanced solutions, crafting an ideal reward function remains challenging due to the propensity of RL algorithms to exploit the reward system in unforeseen ways. Agents might fulfill objectives in unexpected manners, highlighting the complexity of encoding desired behaviors, like adherence to social norms, into a reward function. An alternative strategy, imitation learning, circumvents the intricacies of reward engineering by having the agent learn through the emulation of expert behavior. However, acquiring a sufficient number of high-quality demonstrations for this purpose is often impractically costly. Humans, in contrast, learn with remarkable autonomy, benefiting from intermittent guidance from educators who provide tailored feedback based on the learner's progress. This interactive learning model holds promise for artificial agents, offering a customized learning trajectory that mitigates reward exploitation without extensive reward function engineering. The challenge lies in ensuring the feedback process is both manageable for humans and rich enough to be effective. Despite its potential, the implementation of human-in-the-loop (HiL) RL remains limited in practice. 
Our research endeavors to significantly lessen the human labor involved in HiL learning, leveraging both unsupervised pre-training and preference-based learning to enhance agent development with minimal human intervention.
Keywords
reinforcement learning from human feedback, preference learning
Labels
Master Thesis
Description
Work packages
Literature research
Reinforcement learning from human feedback
Preference learning
Requirements
Strong programming skills in Python
Experience in reinforcement learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Christiano, Paul F., et al. "Deep reinforcement learning from human preferences." Advances in neural information processing systems 30 (2017).
Lee, Kimin, Laura Smith, and Pieter Abbeel. "Pebble: Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training." arXiv preprint arXiv:2106.05091 (2021).
Wang, Xiaofei, et al. "Skill preferences: Learning to extract and execute robotic skills from human feedback." Conference on Robot Learning. PMLR, 2022.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning." arXiv preprint arXiv:2402.13820 (2024).
Goal
The goal of the project is to learn and finetune humanoid locomotion policies using reinforcement learning from human feedback. The challenge lies in learning effective reward models from an efficient representation of motion clips, as opposed to single-state frames. The tentative pipeline works as follows:
A self-supervised motion representation pretraining phase that learns efficient trajectory representations, potentially using Fourier Latent Dynamics, with data generated by some initial policies.
Reward learning from human feedback, conditioned on the trajectory representation learned in the first step. Human preference from visualizing the motions is thus embedded in this latent trajectory representation.
Policy training with the learned reward. The trajectories induced by the learned policy are then used to augment the training set for the first two steps, and the process repeats.
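The reward-learning step above is commonly formulated as a Bradley-Terry preference model, as in Christiano et al. A minimal sketch with linear rewards on (assumed) trajectory embeddings, standing in for the latent representations of step 1:

```python
import numpy as np

def preference_loss(w, emb_a, emb_b, prefs):
    """Bradley-Terry preference loss: a trajectory's reward is a linear score
    of its latent embedding, and P(a preferred over b) = sigmoid(r_a - r_b)."""
    p = 1.0 / (1.0 + np.exp(-((emb_a - emb_b) @ w)))
    return -np.mean(prefs * np.log(p + 1e-9) + (1 - prefs) * np.log(1 - p + 1e-9))

def fit_reward(emb_a, emb_b, prefs, lr=0.5, iters=200):
    """Plain gradient descent on the linear reward weights (illustrative only)."""
    d = emb_a - emb_b
    w = np.zeros(emb_a.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(d @ w)))
        w -= lr * d.T @ (p - prefs) / len(prefs)
    return w

# Synthetic sanity check: preferences generated by a hidden reward direction.
rng = np.random.default_rng(0)
emb_a, emb_b = rng.standard_normal((64, 4)), rng.standard_normal((64, 4))
true_w = np.array([1.0, -1.0, 0.5, 0.0])
prefs = ((emb_a - emb_b) @ true_w > 0).astype(float)
w_fit = fit_reward(emb_a, emb_b, prefs)
```

In the actual pipeline the linear score would be replaced by a network over the motion-clip embedding, but the preference likelihood and its gradient have the same shape.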
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Xin Chen
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao , Chen Xin
Topics Information, Computing and Communication Sciences , Engineering and Technology
Online Safe Locomotion Learning in the Wild
Reinforcement learning (RL) can potentially solve complex problems in a purely data-driven manner. Still, the state of the art in applying RL to robotics relies heavily on high-fidelity simulators. While learning in simulation circumvents the sample-complexity challenges that are common in model-free RL, even a slight distribution shift ("sim-to-real gap") between simulation and the real system can cause these algorithms to fail. Recent advances in model-based reinforcement learning have led to superior sample efficiency, enabling online learning without a simulator. Nonetheless, online learning must not cause any damage and should adhere to safety requirements (for obvious reasons). The proposed project aims to demonstrate how existing safe model-based RL methods can be used to address these challenges.
Keywords
safe model-based RL, online learning, legged robotics
Labels
Master Thesis
Description
The project aims to answer the following research questions:
How to model safe locomotion tasks for a real robotic system as a constrained RL problem? Can we use existing methods such as the one proposed by @as2022constrained to safely learn effective locomotion policies?
Answering the above questions will encompass hands-on experience with a real robotic system (such as ANYmal) together with learning to implement and test cutting-edge RL methods. As RL on real hardware is not yet fully explored, we expect to unearth various challenges concerning the effectiveness of our methods in the online learning setting. Accordingly, an equally important goal of the project is to accurately identify these challenges and propose methodological improvements that can help address them.
A starting point would be to create a model of a typical locomotion task in Isaac Orbit as a proof-of-concept. Following that, the second part of the project will be dedicated to extending the proof-of-concept to a real system.
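One standard way to cast safe locomotion as a constrained RL problem is a Lagrangian relaxation: the policy maximizes reward minus a cost penalty, while a dual variable grows whenever the measured constraint cost exceeds its budget. The following is an illustrative simplification of that dual update, not necessarily the method of the cited work:

```python
def lagrangian_step(lmbda, avg_cost, cost_limit, lr=0.05):
    """Dual update of a Lagrangian-relaxed constrained RL problem
    (illustrative): the policy elsewhere maximizes reward - lmbda * cost,
    while lmbda rises when the measured episode cost exceeds its budget
    and is projected back to stay non-negative."""
    return max(lmbda + lr * (avg_cost - cost_limit), 0.0)

lmbda = 0.0
for _ in range(10):  # constraint violated for 10 updates -> penalty grows
    lmbda = lagrangian_step(lmbda, avg_cost=1.5, cost_limit=1.0)
```

When the constraint is satisfied, the update drives the multiplier back toward zero, so the penalty only persists while the policy is actually unsafe.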
Contact Details
If you are a Master's student with
- basic knowledge of reinforcement learning, for instance from the Probabilistic Artificial Intelligence or Foundations of Reinforcement Learning courses, and
- a strong background in robotics and programming (C++, ROS),
please reach out to Yarden As (yarden.as@inf.ethz.ch) or Chenhao Li (chenhao.li@inf.ethz.ch). Feel free to share any previous materials, such as public code you wrote, that could help demonstrate the above requirements.
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Engineering and Technology
Autonomous Curriculum Learning for Increasingly Challenging Tasks
While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process. Consider the realm of legged locomotion: training a robot via reinforcement learning to track a velocity command illustrates this concept. Initially, tracking a low velocity is simpler due to algorithm initialization and environmental setup. By manually crafting a curriculum, we can start with low-velocity targets and incrementally increase them as the robot demonstrates competence. This method works well when the difficulty correlates clearly with the target, as with higher velocities or more challenging terrains. However, challenges arise when the relationship between task difficulty and control parameters is unclear. For instance, if a parameter dictates various human dance styles for the robot to mimic, it is not obvious whether jazz is easier than hip-hop. In such scenarios, the difficulty distribution does not align with the control parameter. How, then, can we devise an effective curriculum? In the conventional RSL training setting for locomotion over challenging terrains, a handcrafted learning schedule likewise dictates increasingly hard terrain levels, unified across multiple different terrain types. With a smart autonomous curriculum learning algorithm, can we progress through separate terrain types asynchronously and thus achieve better overall performance or higher data efficiency?
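One family of teacher algorithms from the literature below sidesteps the unclear difficulty ordering by sampling tasks in proportion to absolute learning progress rather than assumed difficulty. A toy sketch, where window handling and the exploration bonus are placeholder choices:

```python
import numpy as np

def sample_task(recent_returns, rng=None, eps=0.1):
    """Learning-progress teacher sketch (in the spirit of Oudeyer et al.):
    sample terrain types in proportion to the absolute change in their
    recent returns, so both mastered and currently hopeless terrains are
    sampled less. `recent_returns` maps task id -> returns, oldest first."""
    rng = rng or np.random.default_rng(0)
    tasks = list(recent_returns)
    progress = []
    for t in tasks:
        r = recent_returns[t]
        half = len(r) // 2
        # Absolute learning progress: change between the window's two halves.
        progress.append(abs(np.mean(r[half:]) - np.mean(r[:half])) if half else 0.0)
    p = np.array(progress) + eps   # eps keeps some probability on every task
    p = p / p.sum()
    return tasks[rng.choice(len(tasks), p=p)], dict(zip(tasks, p))

task, probs = sample_task({
    "stairs": [0.1, 0.2, 0.4, 0.6],   # still improving -> sampled most
    "flat":   [0.9, 0.9, 0.9, 0.9],   # mastered -> low progress
    "gaps":   [0.0, 0.0, 0.0, 0.0],   # no signal yet -> low progress
})
```

Because the criterion depends only on the agent's own return history, it applies equally to terrains with an obvious difficulty axis and to parameters, like dance styles, without one.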
Keywords
curriculum learning, open-ended learning, self-evolution, progressive task solving
Labels
Master Thesis
Description
Work packages
Literature research
Development of autonomous curriculum
Comparison with baselines (no curriculum, hand-crafted curriculum)
Requirements
Strong programming skills in Python
Experience in reinforcement learning
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature This project and the following literature will make you a master in curriculum/active/open-ended learning.
Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE transactions on evolutionary computation, 11(2), pp.265-286.
Baranes, A. and Oudeyer, P.Y., 2009. R-iac: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.
Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired open-ended trailblazer (poet): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.
Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.
Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep rl in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.
Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
Marco Bagatella
More information
Open this project...
Published since: 2024-11-26
Organization Robotic Systems Lab
Hosts Li Chenhao , Bagatella Marco
Topics Engineering and Technology
Humanoid Locomotion Learning with Human Motion Priors
Humanoid robots, designed to replicate human structure and behavior, have made significant strides in kinematics, dynamics, and control systems. Research aims to develop robots capable of performing tasks in human-centric settings, from simple object manipulation to navigating complex terrains. Reinforcement learning (RL) has proven to be a powerful method for enabling robots to learn from their environment, enhancing their performance over time without explicit programming for every possible scenario. In the realm of humanoid robotics, RL is used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. However, one of the primary challenges is the high dimensionality of the action space, where handcrafted reward functions fall short of generating natural, lifelike motions. Incorporating motion priors into the learning process of humanoid robots addresses these challenges effectively. Motion priors can significantly reduce the exploration space in RL, leading to faster convergence and reduced training time. They ensure that learned policies prioritize stability and safety, reducing the risk of unpredictable or hazardous actions. Additionally, motion priors guide the learning process towards more natural, human-like movements, improving the robot's ability to perform tasks intuitively and seamlessly in human environments. Therefore, motion priors are crucial for efficient, stable, and realistic humanoid locomotion learning, enabling robots to better navigate and interact with the world around them.
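Adversarial motion priors (AMP, Peng et al. 2021) are one common way to inject such priors: a discriminator trained on reference-motion transitions supplies a style reward that is blended with the task reward. A sketch of the reward shaping, assuming the discriminator is trained elsewhere and the blending weight is a placeholder:

```python
def style_reward(d_logit):
    """AMP-style style reward (Peng et al., 2021): the discriminator's output
    on a policy transition is squashed into [0, 1] via
    max(0, 1 - 0.25 * (d - 1)^2); the discriminator itself is assumed to be
    trained elsewhere on reference human-motion data."""
    return max(0.0, 1.0 - 0.25 * (d_logit - 1.0) ** 2)

def total_reward(task_r, d_logit, w_style=0.5):
    # Blend the task objective with the motion prior; w_style trades
    # task performance against naturalness (placeholder value).
    return task_r + w_style * style_reward(d_logit)
```

Transitions the discriminator scores as reference-like (logit near 1) earn the full style bonus, while implausible motions earn none, which biases exploration toward human-like movement without hand-crafted gait terms.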
Keywords
motion priors, humanoid, reinforcement learning, representation learning
Labels
Master Thesis
Description
Work packages
Literature research
Human motion capture and retargeting
Skill space development
Hardware validation encouraged upon availability
Requirements
Strong programming skills in Python
Experience in reinforcement learning and imitation learning frameworks
Publication
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
Related literature
Peng, Xue Bin, et al. "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions On Graphics (TOG) 37.4 (2018): 1-14.
Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J. and Levine, S., 2020. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784.
Peng, X.B., Ma, Z., Abbeel, P., Levine, S. and Kanazawa, A., 2021. Amp: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (ToG), 40(4), pp.1-20.
Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K. and Abbeel, P., 2022, October. Adversarial motion priors make good substitutes for complex reward functions. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 25-32). IEEE.
Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.
Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G. and Peng, X.B., 2023, July. Calm: Conditional adversarial latent models for directable virtual characters. In ACM SIGGRAPH 2023 Conference Proceedings (pp. 1-9).
Starke, Sebastian, et al. "Deepphase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.
Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."
Han, L., Zhu, Q., Sheng, J., Zhang, C., Li, T., Zhang, Y., Zhang, H., Liu, Y., Zhou, C., Zhao, R. and Li, J., 2023. Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.
Contact Details
Please include your CV and transcript in the submission.
Chenhao Li
More information
Open this project...
Published since: 2024-11-26
Organization ETH Competence Center - ETH AI Center
Hosts Li Chenhao
Topics Information, Computing and Communication Sciences
AI Agents for Excavation Planning
Recent advancements in AI, particularly with models like Claude 3.7 Sonnet, have showcased enhanced reasoning capabilities. This project aims to harness such models for excavation planning tasks, drawing parallels from complex automation scenarios in games like Factorio. We will explore the potential of these AI agents to plan and optimize excavation processes, transitioning from simulated environments to real-world applications with our excavator robot.
Keywords
GPT, Large Language Models, Robotics, Deep Learning, Reinforcement Learning
Labels
Semester Project , Master Thesis
Description
The evolution of large language models (LLMs) has opened new avenues in automation and planning. Notably, Claude 3.7 Sonnet introduces hybrid reasoning, enabling both rapid responses and detailed, step-by-step problem-solving [Anthropic, 2025]. Such capabilities position these models as potential candidates for tasks requiring intricate planning, such as excavation, with a possible final deployment in the real world on our excavator robot. Excavation planning is a challenging problem requiring spatial reasoning, decision-making under constraints, and long-horizon planning. Recent advances in AI have led to agents that can master complex games like Go and navigate automation-heavy environments. This project aims to determine whether these AI systems can efficiently plan excavation tasks and how they compare to reinforcement learning-based approaches.
Terra is a flexible, JAX-accelerated grid-world environment designed for training AI agents in earthworks planning. It allows for high-level motion and excavation planning, formulated as a reinforcement learning (RL) problem. Terra's multi-GPU capabilities enable rapid training, achieving intelligent excavation planning in minutes on high-end hardware.
First, we will test the zero-shot capabilities of state-of-the-art LLMs and agents in excavation planning. By providing structured prompts and game renderings, we will evaluate whether these models can reason effectively about excavation tasks.
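A zero-shot evaluation of this kind mainly needs a textual rendering of the grid and robust parsing of the model's free-form reply. The grid symbols and action vocabulary below are hypothetical, not Terra's actual interface:

```python
def render_grid_prompt(grid):
    """Hypothetical prompt builder for the zero-shot evaluation: serialize a
    Terra-style grid world as text so an LLM can propose the next high-level
    action. The cell encoding and action set are illustrative only."""
    legend = {0: ".", 1: "#", 2: "E"}   # free cell, must-dig cell, excavator
    rows = "\n".join("".join(legend[c] for c in row) for row in grid)
    return (
        "You control excavator E on the grid below. '#' cells must be dug.\n"
        f"{rows}\n"
        "Reply with exactly one action from: MOVE_UP, MOVE_DOWN, MOVE_LEFT, "
        "MOVE_RIGHT, DIG."
    )

def parse_action(reply, vocab=("MOVE_UP", "MOVE_DOWN", "MOVE_LEFT", "MOVE_RIGHT", "DIG")):
    # Extract the first valid action token from a free-form model reply.
    for token in reply.replace(",", " ").split():
        if token.strip(".") in vocab:
            return token.strip(".")
    return None

prompt = render_grid_prompt([[0, 1], [2, 0]])
```

Keeping the environment-to-text and text-to-action boundary this thin makes it easy to swap in different LLMs, or to replace the LLM with an RL policy for the comparison the project proposes.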
Work Packages
- Design a pipeline that enables modern AI agents and large language models (LLMs) to play the excavation planning game in Terra.
- Evaluate whether models like Claude 3.7 Sonnet or GPT-4 can solve excavation tasks zero-shot or require fine-tuning.
- Train AI models with reinforcement learning using Terra's multi-GPU acceleration.
- Deploy the trained models onto a real-world robotic excavator for autonomous excavation.
Requirements
- General programming experience with Python
- Experience training neural networks
- Bonus: experience with large language models
Contact Details
More information
Open this project...
Published since: 2024-11-21 , Earliest start: 2025-04-01 , Latest end: 2025-08-31
Organization Robotic Systems Lab
Hosts Terenzi Lorenzo
Topics Engineering and Technology
Student Theses in Industry
We have a large number of industry partners who search for excellent students to conduct their student theses at the company or at ETH but in close collaboration with them (joint supervision by industry and ETH).
Ammann Group (Switzerland)
The Ammann Group is a worldwide leader in the manufacture of mixing plants, machinery, and services in the construction industry, with core competence in road construction and landscaping as well as in the transport infrastructure.
We are collaborating with Ammann to automate construction equipment
Maxon (Switzerland)
Maxon develops and builds electric drive systems that are among the best in the world. Their drive systems can be found wherever extreme precision and the highest quality standards are indispensable – on Earth, and on Mars.
Schunk (Germany)
Legged Wheel Chair

This project aims to extend a dynamic simulation and locomotion controllers for a robotized wheelchair able to handle difficult terrain, including stairs. It will prepare for the upcoming prototype phase.
Note on plagiarism
We advise every student, irrespective of the type of project (Bachelor, Semester, Master, ...), to become familiar with the ETH rules regarding plagiarism.