SynDRA
Synthetic Dataset for Railway Applications
About

Railways have always been the backbone of global transportation. As the industry moves toward Automatic Train Operation (ATO), the demand for advanced autonomous capabilities, such as precise localization, obstacle detection, track and switch identification, and semantic scene understanding, is growing rapidly.
While deep learning has revolutionized perception systems in many domains, the railway sector still suffers from a lack of data. Real-world data collection is not only expensive and logistically complex, but also limited in its ability to capture rare or dangerous scenarios critical for safety validation. On top of that, publicly available datasets tailored to the railway environment are extremely scarce.
To address these challenges, we developed SynDRA, a family of high-quality synthetic datasets generated through our custom simulation framework built in Unreal Engine 5. The original SynDRA (Synthetic Dataset for Railway Applications) dataset focuses on image semantic segmentation, providing pixel-level annotations for complex railway scenes under diverse weather and lighting conditions.
Building on this foundation, we introduced SynDRA-BBox, a new dataset offering 2D and 3D bounding box annotations and LiDAR point clouds, enabling research on object detection, tracking, and 3D perception for railway environments. SynDRA-BBox includes aligned RGB images, point clouds, semantic labels, and tightly coupled 2D/3D bounding boxes, making it one of the most comprehensive synthetic datasets available for multi-modal learning in the railway domain.
Whether you’re developing vision-based ATO systems, training AI models for obstacle detection, or exploring domain adaptation from simulation to the real world, SynDRA is designed to accelerate your research by providing reliable, scalable, and fully annotated data.
Start exploring the datasets now, and bring the future of railway automation one step closer.