SynDRA
Synthetic Dataset for Railway Applications
About

Railways have always played a crucial role in global transportation. Major players have started investing in Automatic Train Operation (ATO), which requires many autonomous capabilities, such as accurate localization, obstacle detection, and track discrimination. Recent advances in deep learning are promising for this field, but data acquisition and systematic testing on railways are expensive and dangerous, and they cannot capture the rarest cases. Furthermore, only a few datasets are publicly available for railway applications.
To fill these gaps, we used our custom simulation framework to generate SynDRA, a synthetic dataset for semantic segmentation of railway environments.
SynDRA
SynDRA is a synthetic dataset for railway applications. It currently includes 80 sequences captured in 4 different environments under changing weather and light conditions. The sensors are currently limited to a stereo RGB camera setup, and the ground truth labels only include semantic segmentation. However, we will soon extend the dataset to additional environments, sensors (LiDAR and Radar), and ground truth labels (object detection, depth masks). Our custom simulation framework is flexible enough to accommodate specific needs, so feel free to reach out!
SynDRA was accepted for publication at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025).
MEGA download links for SynDRA v0, as presented in the WACV 2025 paper, are listed below. You can download the entire dataset, which includes 80 stereo camera sequences with the corresponding semantic annotations, or you can download individual scenarios, each containing 10 different sequences:
- SynDRA: The whole dataset, ~1.8TB of data.
- Scenario 0
- Reversed Scenario 0
- Scenario 1
- Reversed Scenario 1
- Scenario 2
- Reversed Scenario 2
- Scenario 3
- Reversed Scenario 3
- Utils: Python scripts for converting .bin files to PNG.
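For readers who want a feel for what such a conversion involves, here is a minimal sketch using only the Python standard library. It is not the script shipped in Utils. It assumes, purely for illustration, that a .bin file holds tightly packed 8-bit RGB pixels in row-major order with no header, and that the image width and height are known. The actual SynDRA file layout is not described on this page, so the official scripts remain the reference. The names `raw_rgb_to_png` and `convert_bin_file` are hypothetical.

```python
import struct
import zlib
from pathlib import Path


def _png_chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, tag, data, CRC32 over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def raw_rgb_to_png(raw: bytes, width: int, height: int) -> bytes:
    """Encode raw pixels as PNG bytes.

    ASSUMPTION: `raw` is headerless, row-major, 8-bit RGB (3 bytes per pixel).
    """
    stride = width * 3
    if len(raw) != stride * height:
        raise ValueError(f"expected {stride * height} bytes, got {len(raw)}")
    # Each PNG scanline starts with a filter-type byte; 0 means "no filter".
    scanlines = b"".join(
        b"\x00" + raw[y * stride:(y + 1) * stride] for y in range(height)
    )
    # IHDR: width, height, bit depth 8, color type 2 (RGB),
    # default compression and filter methods, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", zlib.compress(scanlines))
        + _png_chunk(b"IEND", b"")
    )


def convert_bin_file(bin_path: Path, png_path: Path, width: int, height: int) -> None:
    """Read a raw .bin frame (assumed layout above) and write it as a PNG file."""
    png_path.write_bytes(raw_rgb_to_png(bin_path.read_bytes(), width, height))
```

The size check fails loudly when the assumed resolution or pixel format does not match the file, which is the first thing to verify against the official scripts.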
A new version of the dataset with bug fixes (such as corrected lighting and semantic labels) will be available soon. Please report any errors or bugs you encounter. You can contact us at gianluca.damico@santannapisa.it. Additionally, a benchmark table showcasing the results of semantic segmentation models on our dataset will be coming soon.
Semantic Segmentation Example
Different adverse conditions and seasonal changes





Media
Research
If you are interested in synthetic data generation for railway environments, we are always open to new collaborations. Check our previous work:
- D’Amico et al. “SynDRA: Synthetic Dataset for Railway Applications.”
  Accepted at WACV 2025: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), February 2025, Tucson, Arizona
- D’Amico et al. “A Comparative Analysis of Visual Odometry in Virtual and Real-World Railways Environments.”
  Accepted at RAILWAYS 2024: The Sixth International Conference on Railways Technology, September 2024, Prague
- D’Amico et al. “TrainSim: A railway simulation framework for LiDAR and camera dataset generation.”
  IEEE Transactions on Intelligent Transportation Systems (2023)
- D’Amico et al. “Graphic Simulation Framework of Railway Scenarios for LiDAR Dataset Generation.”
  RAILWAYS 2022: The Fifth International Conference on Railways Technology, August 2022, Montpellier