SynDRA

Synthetic Dataset for Railway Applications

About

Railways have always played a crucial role in global transportation. Major players have started investing in Automatic Train Operation, which requires many autonomous capabilities, such as accurate localization, obstacle detection, and track discrimination. Recent advances in deep learning are promising for this field, but data acquisition and systematic testing on railways are expensive and dangerous, and will never capture the rarest cases. Furthermore, only a few datasets are publicly available for railway applications.
To fill these gaps, we used our custom simulation framework to generate SynDRA, a synthetic dataset for semantic segmentation of railway environments.

SynDRA

SynDRA is a synthetic dataset for railway applications. It currently includes 80 sequences captured in 4 different environments under varying weather and lighting conditions. The sensors are currently limited to a stereo RGB camera setup, and the ground truth labels include only semantic segmentation. However, we will soon extend the dataset to additional environments, sensors (LiDAR and radar), and ground truth labels (object detection, depth masks). Our custom simulation framework is flexible enough to accommodate a wide range of needs, so feel free to reach out!

You can download the entire dataset, which includes 80 sequences from stereo camera systems with related semantic annotations, or you can download individual scenarios, each containing 10 different sequences.

MEGA Download Links for SynDRA v0, as presented in the WACV 2025 paper:

SynDRA: The whole dataset, ~1.8TB of data.
Utils: sensor specs and Python scripts to convert .bin files to PNG.

SynDRA was accepted for publication at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025).
If you use SynDRA or SynDRA-BBox in your research, please cite the following paper:

@inproceedings{d2025syndra,
title={SynDRA: Synthetic Dataset for Railway Applications},
author={D’Amico, Gianluca and Nesti, Federico and Rossolini, Giulio and Marinoni, Mauro and Sabina, Salvatore and Buttazzo, Giorgio},
booktitle={2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
pages={3437--3446},
year={2025},
organization={IEEE}}

A new version of the dataset with bug fixes (such as corrected lighting and semantic labels) will be available soon. Please report any errors or bugs you encounter. You can contact us at gianluca.damico@santannapisa.it. Additionally, a benchmark table showcasing the results of semantic segmentation models on our dataset will be coming soon.

The Dataset

Semantic Segmentation Example

Different adverse conditions and seasonal changes

Sensor Specs - SynDRA

Stereo RGB Cameras

Resolution: 1920×1080
FOV: 90°
Baseline: 0.6 m
Height from rails: 3.5 m
Frame Rate: 10 FPS
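
As a rough illustration of what these specs imply for stereo depth estimation, the sketch below derives the focal length in pixels from the resolution and FOV, then converts a disparity into metric depth. It assumes that the 90° FOV is the horizontal one and that the two cameras form an ideal rectified pinhole pair. The helper names focal_length_px and depth_from_disparity are hypothetical and are not part of the SynDRA utilities.

```python
import math

WIDTH_PX = 1920      # horizontal resolution from the specs above
FOV_DEG = 90.0       # assumed to be the horizontal field of view
BASELINE_M = 0.6     # stereo baseline from the specs above


def focal_length_px(width_px: int, fov_deg: float) -> float:
    """Pinhole focal length in pixels for a given horizontal FOV."""
    return (width_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth of a point in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


focal = focal_length_px(WIDTH_PX, FOV_DEG)
print(f"focal length: {focal:.1f} px")
print(f"depth at 10 px disparity: {depth_from_disparity(10.0, focal, BASELINE_M):.1f} m")
```

Under these assumptions the focal length is about 960 px, so a disparity of 10 px corresponds to a depth of roughly 57.6 m.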

Annotations

Pixel-level semantic segmentation
Per-frame camera poses

Dataset Structure - SynDRA

The dataset is organized as follows:

SynDRA/
├── Scenario_0/
│   ├── HV/
│   │   ├── Sunny/
│   │   │   ├── Afternoon/
│   │   │   │   ├── RGBCamera_0/
│   │   │   │   │   ├── Bin_folder/
│   │   │   │   │   │   ├── Sce0_sun_aft_000001.bin
│   │   │   │   │   │   ├── Sce0_sun_aft_000002.bin
│   │   │   │   │   │   └── ...
│   │   │   │   │   ├── Poses/
│   │   │   │   │   └── Times/
│   │   │   │   ├── RGBCamera_1/
│   │   │   │   ├── SSCamera_0/
│   │   │   │   └── SSCamera_1/
│   │   │   ├── Evening/
│   │   │   └── Morning/
│   ├── LV/
│   │   ├── Foggy/
│   │   │   ├── Afternoon/
│   │   │   │   ├── RGBCamera_0/
│   │   │   │   │   ├── Bin_folder/
│   │   │   │   │   │   ├── Sce0_fog_000001.bin
│   │   │   │   │   │   ├── Sce0_fog_000002.bin
│   │   │   │   │   │   └── ...
│   │   │   │   │   ├── Poses/
│   │   │   │   │   └── Times/
│   │   │   │   ├── RGBCamera_1/
│   │   │   │   ├── SSCamera_0/
│   │   │   │   └── SSCamera_1/
│   │   ├── Rainy/
│   │   │   ├── Afternoon/
│   │   │   │   ├── RGBCamera_0/
│   │   │   │   │   ├── Bin_folder/
│   │   │   │   │   │   ├── Sce0_rai_000001.bin
│   │   │   │   │   │   ├── Sce0_rai_000002.bin
│   │   │   │   │   │   └── ...
│   │   │   │   │   ├── Poses/
│   │   │   │   │   └── Times/
│   │   │   │   ├── RGBCamera_1/
│   │   │   │   ├── SSCamera_0/
│   │   │   │   └── SSCamera_1/
├── Utils/
│   ├── RGB_bin_to_png.py
│   ├── SS_bin_to_png.py
│   └── Sensor_Specs.txt
└── README.md
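
Given the layout above, one way to pair each RGB frame with its semantic segmentation frame is to match file names across the two camera folders. This is only a sketch: it assumes that SSCamera_* folders mirror the RGBCamera_* layout (a Bin_folder containing files with the same names), which the tree does not show explicitly. The function name pair_rgb_and_labels and the local path are hypothetical.

```python
from pathlib import Path


def pair_rgb_and_labels(sequence_dir: Path, camera_index: int = 0) -> list[tuple[Path, Path]]:
    """Pair RGB frames with semantic segmentation frames sharing a file name."""
    rgb_dir = sequence_dir / f"RGBCamera_{camera_index}" / "Bin_folder"
    # Assumption: SSCamera_* mirrors the RGBCamera_* layout (Bin_folder, same names).
    ss_dir = sequence_dir / f"SSCamera_{camera_index}" / "Bin_folder"
    pairs = []
    for rgb_path in sorted(rgb_dir.glob("*.bin")):
        ss_path = ss_dir / rgb_path.name
        if ss_path.exists():
            pairs.append((rgb_path, ss_path))
    return pairs


if __name__ == "__main__":
    # Hypothetical local path; adjust to where the dataset was extracted.
    sequence = Path("SynDRA/Scenario_0/HV/Sunny/Afternoon")
    for rgb_path, ss_path in pair_rgb_and_labels(sequence):
        print(rgb_path.name, "<->", ss_path.name)
```

Frames without a matching label file are skipped rather than raising an error, which makes partial downloads easier to work with.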

SynDRA-BBox

SynDRA-BBox is an extension of the original SynDRA dataset, specifically designed for 2D/3D object detection in railway environments. While the original SynDRA dataset focuses on image-level data captured from simulated railway scenarios, SynDRA-BBox introduces precise 2D and 3D bounding box annotations for a diverse set of railway objects, including trains, pedestrians, and natural obstacles.

Key features:

Fully annotated 3D point clouds using simulated LiDAR data, aligned with RGB images.
2D and 3D bounding boxes for major object categories relevant to railway operation and safety.
Maintains the visual diversity and domain fidelity of the original SynDRA environments, including urban, rural, and station scenes.

The dataset is intended for benchmarking tasks such as:

2D/3D object detection
Sensor fusion (RGB + LiDAR)
Semi-supervised and unsupervised domain adaptation
Synthetic-to-real transfer learning

You can download the full dataset, which includes 72 sequences plus 2 bonus sequences from multi-sensor systems with corresponding annotations, or download individual scenarios. Each scenario consists of 8 distinct sequences, each focused on a specific object of interest: car, truck, bus, crossing pedestrian, parallel pedestrian, tree with leaves, tree without leaves, and rocks. These objects either cross the tracks, run parallel to the railway, or lie on the rails in the path of the train.

To ensure visual diversity, every sequence within a scenario features a different combination of textures for the ballast, sleepers, terrain, and crossing platform, randomly selected from a common pool of 10. The pool is the same across all scenarios, but textures are sampled independently for each sequence. We use 5 different car models and various pedestrian models sourced from Epic Games' City Sample. The bus model is the same across all sequences, whereas trees and rocks differ in every scenario, with no repetitions. Each sequence also includes dynamic elements, such as moving vehicles and pedestrians on nearby roads, placed in different positions per sequence. The placement of natural objects along the railway, such as forest elements, is also randomized for each sequence to further enrich the environmental variability.

MEGA Download Links for SynDRA-BBox:

SynDRA-BBox: The whole dataset, ~1.8TB of data.
- Scenario_0
- Scenario_1
- Scenario_2
- Scenario_3
- Scenario_4
- Scenario_5
- Scenario_6
- Scenario_7
- Scenario_8
Utils: sensor specs and Python scripts to convert .bin files to PNG.

The paper related to SynDRA-BBox is currently under revision.

The Dataset

2D/3D Semantic Segmentation and Bounding Boxes Example

Scenario Examples

Sensor Specs - SynDRA-BBox

RGB Camera 0

Resolution: 2464×1600
FOV: 30°
Height from rails: 3.5 m
Frame Rate: 10 FPS

Stereo RGB Cameras 1/2

Resolution: 2464×1600
FOV: 90°
Baseline: 0.6 m
Height from rails: 3.5 m
Frame Rate: 10 FPS

NotRepLidar_0 (Tele-15)

Non-repeating pattern
Range: 500 m
FOV: 15°
Height: 2.5 m
Points/scan: 4800
Rate: 10 Hz

BeamStackLidar_0 (HDL-64E)

64-beam scanning pattern
Range: 120 m
Horizontal FOV: 180°
Vertical FOV: 26.8°
Height: 2.5 m
Rate: 10 Hz
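
To get a feel for how sparse the 64-beam scan becomes at long range, the following sketch estimates the angular spacing between adjacent beams and the resulting vertical gap at a given distance. It assumes that the 64 beams are evenly spaced across the 26.8° vertical FOV, which is a simplification and may not match the simulated sensor exactly. The helper name vertical_gap_m is hypothetical.

```python
import math

NUM_BEAMS = 64
VERTICAL_FOV_DEG = 26.8
MAX_RANGE_M = 120.0

# Assumption: beams are evenly spaced across the vertical FOV.
beam_spacing_deg = VERTICAL_FOV_DEG / (NUM_BEAMS - 1)


def vertical_gap_m(range_m: float, spacing_deg: float = beam_spacing_deg) -> float:
    """Approximate vertical distance between adjacent beams at a given range."""
    return range_m * math.tan(math.radians(spacing_deg))


for range_m in (30.0, 60.0, MAX_RANGE_M):
    print(f"{range_m:5.0f} m -> {vertical_gap_m(range_m):.2f} m between beams")
```

Under this assumption the spacing is about 0.43° per beam, which gives a gap of roughly 0.9 m between adjacent beams at the 120 m maximum range.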

Annotations

2D/3D bounding boxes
Semantic segmentation (pixel and point-level)
Depth (up to 650 m)
Calibration and poses
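
The dataset structure lists a rail_tumlike_poses.txt file in each sequence folder. Its name suggests the TUM trajectory convention, with one pose per line as "timestamp tx ty tz qx qy qz qw", but the exact layout is not documented here. The sketch below is therefore only a reader for that assumed format. Pose, parse_tum_line, and load_tum_poses are hypothetical names and are not part of the provided utilities.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Pose:
    timestamp: float
    position: tuple[float, float, float]            # tx, ty, tz
    quaternion: tuple[float, float, float, float]   # qx, qy, qz, qw


def parse_tum_line(line: str) -> Pose:
    """Parse one line in the assumed 'timestamp tx ty tz qx qy qz qw' layout."""
    values = [float(v) for v in line.split()]
    if len(values) != 8:
        raise ValueError(f"expected 8 values, got {len(values)}")
    t, tx, ty, tz, qx, qy, qz, qw = values
    return Pose(t, (tx, ty, tz), (qx, qy, qz, qw))


def load_tum_poses(path: Path) -> list[Pose]:
    """Read every pose in a file, skipping blank lines and '#' comments."""
    poses = []
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        poses.append(parse_tum_line(line))
    return poses
```

If the actual files use a different column order or delimiter, only parse_tum_line needs to change.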

Dataset Structure - SynDRA-BBox

The dataset is organized as follows:

SynDRA-BBox/
├── Scenario_0/
│   ├── Bus_0/
│   │   ├── Sunny/
│   │   │   ├── DepthCamera_0/
│   │   │   │   ├── Bin_folder/
│   │   │   │   │   ├── SceBus_0_sun_mor_000000.bin
│   │   │   │   │   ├── SceBus_0_sun_mor_000001.bin
│   │   │   │   │   └── ...
│   │   │   │   ├── Poses/
│   │   │   │   └── Times/
│   │   │   ├── DepthCamera_1/
│   │   │   ├── DepthCamera_2/
│   │   │   ├── RGBCamera_0/
│   │   │   ├── RGBCamera_1/
│   │   │   ├── RGBCamera_2/
│   │   │   ├── SSCamera_0/
│   │   │   ├── SSCamera_1/
│   │   │   ├── SSCamera_2/
│   │   │   ├── NotRepLidar_0/
│   │   │   └── BeamStackLidar_0/
│   │   ├── CameraBBoxes_0.txt
│   │   ├── CameraBBoxes_1.txt
│   │   ├── CameraBBoxes_2.txt
│   │   ├── LidarBBoxesBeamsStackLidar_0.txt
│   │   ├── LidarBBoxesNotRepLidar_0.txt
│   │   └── rail_tumlike_poses.txt
│   ├── Car_0/
│   ├── Crossing_Pedestrian_0/
│   ├── Parallel_Pedestrian_0/
│   ├── Rock_0/
│   ├── Tree_noLeaf_0/
│   ├── Tree_Leaf_0/
│   └── Truck_0/
├── Scenario_1/
├── Scenario_2/
├── Scenario_3/
├── Scenario_4/
├── Scenario_5/
├── Scenario_6/
├── Scenario_7/
├── Scenario_8/
│   ├── Crossing_Pedestrian_8/
│   └── Parallel_Pedestrian_8/
├── Utils/
│   ├── bin_to_png.py
│   ├── bin_to_pcd.py
│   ├── README.md
│   ├── requirements.txt
│   └── Sensor_Specs.txt
└── README.md
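
Since the layout is regular, the 72 main sequence folders can be enumerated programmatically, for example to check a download for completeness. This sketch assumes that every sequence folder is suffixed with its scenario index, as in the Scenario_0 and Scenario_8 entries above. It does not cover the 2 bonus sequences, whose location is not specified here. The function name expected_sequences is hypothetical.

```python
from pathlib import PurePosixPath

# Folder-name prefixes as they appear in the tree above.
OBJECT_CLASSES = [
    "Bus", "Car", "Crossing_Pedestrian", "Parallel_Pedestrian",
    "Rock", "Tree_noLeaf", "Tree_Leaf", "Truck",
]
NUM_SCENARIOS = 9  # Scenario_0 .. Scenario_8


def expected_sequences(root: str = "SynDRA-BBox") -> list[PurePosixPath]:
    """List the expected relative path of each main sequence folder."""
    # Assumption: each sequence folder is suffixed with its scenario index.
    return [
        PurePosixPath(root) / f"Scenario_{i}" / f"{name}_{i}"
        for i in range(NUM_SCENARIOS)
        for name in OBJECT_CLASSES
    ]


sequences = expected_sequences()
print(len(sequences))
print(sequences[0])
```

Comparing this list against the folders actually present on disk gives a quick way to spot missing sequences.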

Media

Research

If you are interested in synthetic data generation for railway environments, we are always open to new collaborations. Check out our previous work:

D’Amico et al. “SynDRA: Synthetic Dataset for Railway Applications.”
Accepted at WACV 2025: IEEE/CVF Winter Conference on Applications of Computer Vision, February 2025, Tucson, Arizona
D’Amico et al. “A Comparative Analysis of Visual Odometry in Virtual and Real-World Railways Environments.”
Accepted at RAILWAYS 2024: The Sixth International Conference on Railways Technology, September 2024, Prague
D’Amico et al. “TrainSim: A railway simulation framework for LiDAR and camera dataset generation.” IEEE Transactions on Intelligent Transportation Systems (2023)
D’Amico et al. “Graphic Simulation Framework of Railway Scenarios for LiDAR Dataset Generation.”
RAILWAYS 2022: The Fifth International Conference on Railways Technology, August 2022, Montpellier

Our Team

We are a group of PhD students at the ReTiS Lab of Scuola Superiore Sant'Anna.

Gianluca D'Amico
Post-doctoral researcher

Federico Nesti
Researcher

Giulio Rossolini
Assistant Professor