Data acquisition pipeline
Definition of recorded scenarios
As mentioned in Section "Dataset", we based our choice of scenarios on Euro NCAP. Figure automingo examples (extended) shows an extended version of the sketch covering the other scenarios.
- Leading Braking: Designated under the official Euro NCAP nomenclature as AEB Car-to-Car (CCRb), this scenario evaluates Autonomous Emergency Braking triggered by the sudden deceleration of a lead vehicle to measure Time-to-Collision (TTC) metrics. This test is categorised as 5-Star Critical because it is mandatory for "Safety Assist" points; failure in this assessment precludes a 5-star rating.
- Cut-in: Analysed within the official Euro NCAP framework for Assisted Driving (Cut-in), this scenario assesses the Adaptive Cruise Control (ACC) response when a target vehicle merges abruptly into the ego-lane. It serves as a primary indicator for the AD Grade (Very Good), specifically measuring "Safety Competence" within Highway Assist systems.
- Traffic Light: The vehicle's autonomous response to signalised junctions (red lights) and its navigation through intersections are analysed according to the official AEB junction protocol (Signals). This is currently regarded as an Emerging Protocol that provides "Bonus Points" as it transitions toward mandatory status in future ratings.
- Vulnerable Crossing / Parallel: Following the official Euro NCAP AEB VRU (Pedestrian/Cyclist) requirements, this test utilises CPFA and CBNA protocols to detect humans crossing or moving in parallel to the vehicle. It is considered 5-Star Critical due to its significant weighting within the "VRU Protection" category.
- Normal Crossing Objects: This evaluation refers to the official Euro NCAP test AEB Junction Assist (CCFscp), which utilises a Car-to-Car Crossing Straight Path to evaluate the sensor's Field of View (FoV) at intersections. It has a High Impact rating and is evaluated on a scale of 0-4 points within the "Safety Assist" score.
- Lateral Parked Car: Identified within the official Euro NCAP Obstructed VRU Scenarios, specifically the CPNC (Child Nearside) protocol, this test examines the reaction to a child running from behind parked vehicles. Defined as a Critical Safety Test, it is notably difficult to pass and offers great value by reducing the need for extensive manual track testing.
- Merging / Acceleration Lanes: The official Euro NCAP Assisted Driving (Merge) protocol is used to examine Lane Support (LSS) and steering stability during lane convergences. This analysis directly impacts the AD Grade (Comfort) by influencing both "System Competence" and driver comfort ratings.
- Roundabout: Performance at roundabout entries and exits is measured against the official Euro NCAP Junction & Lane Support assessment. The focus remains on system engagement and the "hand-over" transition, serving as a metric for AD Grade (Competence) that differentiates "Premium" systems from "Basic" implementations.
- Speed Limit Adaptation: Validated under the official Euro NCAP ISA (Intelligent Speed Assist) protocol, this scenario tests Speed Limit Information (SLI) and Speed Control Functions (SCF) through accurate sign recognition. It is a 5-Star Mandatory requirement, as high ISA performance has been a prerequisite for a 5-star rating since 2023.
- Construction Site Speed Adaptation: This scenario operates as a specialised extension of the ISA (Intelligent Speed Assist) protocol, specifically addressing temporary speed zones. It evaluates the system's capability to detect non-standard signage and infrastructure changes typical of roadworks. Beyond the standard SLI (Speed Limit Information) requirements, this test analyses the vehicle's ability to decelerate in response to variable message signs and temporary lane narrowings.
Distractor user prompt
You are helping create a multiple-choice dataset about autonomous driving scenarios.
Question: {question}
Correct Answer: {correct_answer}
Reasoning for correct answer: {reasoning}
Generate exactly 3 WRONG answers.
Requirements:
- Clearly incorrect but plausible
- Not absurd
- Concise (1 sentence max)
- NOT just Yes/No
- Different misconceptions in each
Respond ONLY with a JSON array of 3 strings.
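The following is a minimal sketch of how this prompt might be filled in and how the model's reply could be checked before it is stored. The function names `build_prompt` and `parse_distractors` are hypothetical, and the validation rules (exactly three distinct, non-empty strings that differ from the correct answer) are assumptions derived from the prompt's requirements rather than the application's documented behaviour. The call to the model itself is deliberately omitted.

```python
import json

# Prompt text copied from the listing above; placeholders are filled per question.
PROMPT_TEMPLATE = (
    "You are helping create a multiple-choice dataset about autonomous driving scenarios.\n"
    "Question: {question}\n"
    "Correct Answer: {correct_answer}\n"
    "Reasoning for correct answer: {reasoning}\n"
    "Generate exactly 3 WRONG answers.\n"
    "Requirements:\n"
    "- Clearly incorrect but plausible\n"
    "- Not absurd\n"
    "- Concise (1 sentence max)\n"
    "- NOT just Yes/No\n"
    "- Different misconceptions in each\n"
    "Respond ONLY with a JSON array of 3 strings."
)


def build_prompt(question: str, correct_answer: str, reasoning: str) -> str:
    """Fill the template with one annotated question (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        question=question, correct_answer=correct_answer, reasoning=reasoning
    )


def parse_distractors(raw_reply: str, correct_answer: str) -> list[str]:
    """Parse the model reply and reject anything that is not 3 usable distractors.

    Raises ValueError on malformed JSON (json.JSONDecodeError is a subclass)
    or when the assumed validation rules are violated.
    """
    data = json.loads(raw_reply)
    if not isinstance(data, list) or len(data) != 3:
        raise ValueError("expected a JSON array of exactly 3 items")
    if not all(isinstance(item, str) and item.strip() for item in data):
        raise ValueError("every distractor must be a non-empty string")
    cleaned = [item.strip() for item in data]
    lowered = [item.lower() for item in cleaned]
    if len(set(lowered)) != 3:
        raise ValueError("distractors must be distinct")
    if correct_answer.strip().lower() in lowered:
        raise ValueError("a distractor duplicates the correct answer")
    return cleaned
```

Rejecting a malformed reply outright, rather than repairing it, keeps the stored distractors traceable to a single model response; a caller could simply retry the request on `ValueError`.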
Collaborative Labelling Application
To coordinate annotation across a distributed team of domain experts while maintaining strict data integrity, we developed a custom browser-based labelling tool. The application was designed to streamline and accelerate the annotation process, allowing annotators to label cases quickly and efficiently. It was built around three main requirements: ensuring exclusive case assignment to avoid duplicate annotations, providing a simple interface with minimal friction for non-technical users, and enabling the direct export of structured outputs fully compatible with the training pipeline.
Architecture
The application is implemented as a self-contained single-page application, distributed as a single HTML file that requires no installation or build step. It communicates with a lightweight Python HTTP server that manages case availability and records completion status. This client-server separation allowed any number of annotators to work concurrently from their own machines by pointing a browser to the shared server URL, with no risk of two annotators receiving the same case.
Queue-based case assignment
The full dataset was partitioned into nine buckets, each loaded onto the server by the dataset curator. Upon connecting, each annotator selected a batch size (5, 10, 20, or 50 cases, or a custom value) and requested a new batch from the server. The server assigned cases exclusively (once a case was allocated to an annotator, it was locked and unavailable to others) and maintained a persistent record of completion. Annotators could request additional batches at will, export their results at any point, and resume work across sessions. This design enabled fully parallel annotation without coordination overhead, since the server acted as the sole arbiter of case availability.
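The exclusive assignment described above can be sketched as a small lock-protected queue. This is an illustration under stated assumptions, not the server's actual code: the class `CaseQueue` and its method names are hypothetical, state is kept in memory only (the persistent completion record and the HTTP layer are omitted), and a single `threading.Lock` is assumed to be sufficient to serialise concurrent requests.

```python
import threading


class CaseQueue:
    """In-memory sketch of exclusive, batch-wise case assignment (hypothetical)."""

    def __init__(self, case_ids):
        self._lock = threading.Lock()
        self._pending = list(case_ids)   # cases not yet handed out
        self._assigned = {}              # case_id -> annotator holding the lock
        self._completed = set()          # case_ids reported as finished

    def request_batch(self, annotator: str, size: int) -> list:
        """Hand out up to `size` unassigned cases; each case is given out once."""
        if size <= 0:
            raise ValueError("batch size must be positive")
        with self._lock:
            batch = self._pending[:size]
            del self._pending[:size]
            for case_id in batch:
                self._assigned[case_id] = annotator
            return batch

    def mark_completed(self, annotator: str, case_id) -> None:
        """Record completion; only the annotator holding the case may do so."""
        with self._lock:
            if self._assigned.get(case_id) != annotator:
                raise PermissionError(f"{case_id!r} is not assigned to {annotator!r}")
            self._completed.add(case_id)

    def remaining(self) -> int:
        """Number of cases that have not been assigned to anyone yet."""
        with self._lock:
            return len(self._pending)
```

Because removal from the pending list and the ownership record are updated inside the same critical section, two concurrent requests cannot receive the same case, which is the property the text relies on.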
Annotation interface
Each case was presented as a sequence of five temporally ordered, anonymised frames displayed simultaneously in a responsive grid (Figure app labelling). Individual frames could be enlarged to full screen via click, with keyboard arrow navigation between frames to support careful inspection of temporal dynamics. The annotator's identity was recorded per case to enable traceability of each ground-truth entry.
Situation confirmation and empty handling
Before answering any questions, each annotator was required to confirm whether the displayed sequence matched the situation category assigned during real-time in-vehicle labelling. This step served as an explicit quality gate: if the annotator confirmed the situation, the corresponding scenario-specific question set (three to six questions, see Table situations questions) was presented; if the annotator rejected the pre-label, the case was reclassified as an empty event. In the empty path, the application automatically assigned four questions drawn from a cross-situation question bank via a deterministic round-robin mechanism, designed to ensure uniform coverage of all question types across the empty subset. Context-specific questions whose interpretation depends on a particular scenario (e.g., traffic light colour, numeric speed value) were excluded from the empty bank.
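A deterministic round-robin draw of this kind could look like the sketch below. The class name `RoundRobinBank`, the placeholder question labels, and the choice to advance a single cursor by four positions per empty case are all assumptions; the source states only that four questions are assigned per empty case and that coverage should be uniform.

```python
class RoundRobinBank:
    """Deterministic round-robin selection from a question bank (hypothetical)."""

    def __init__(self, questions, per_case: int = 4):
        if len(questions) < per_case:
            raise ValueError("bank must hold at least `per_case` questions")
        self._questions = list(questions)
        self._per_case = per_case
        self._cursor = 0  # index of the next question to hand out

    def next_questions(self) -> list:
        """Return the next `per_case` questions, wrapping around the bank."""
        n = len(self._questions)
        picked = [self._questions[(self._cursor + i) % n] for i in range(self._per_case)]
        self._cursor = (self._cursor + self._per_case) % n
        return picked


# Usage with placeholder labels standing in for the real cross-situation questions.
bank = RoundRobinBank(["q0", "q1", "q2", "q3", "q4", "q5"])
first = bank.next_questions()   # ['q0', 'q1', 'q2', 'q3']
second = bank.next_questions()  # ['q4', 'q5', 'q0', 'q1']
third = bank.next_questions()   # ['q2', 'q3', 'q4', 'q5']
```

With a six-question bank, three consecutive empty cases use every question exactly twice, which illustrates how a fixed cursor yields uniform coverage without any randomness.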
Inline distractor generation
Upon submission of each annotated case, the application called the ChatGPT API to generate three plausible but incorrect alternative answers for each question, using the annotator's correct answer and free-text reasoning as context. Distractors were generated and stored atomically immediately after annotation, ensuring that the validation fold was produced as a natural by-product of the labelling process rather than as a separate offline step.
Export
On completing a batch, the annotator downloaded a structured Excel file containing one row per question, with columns for scene identifier, situation, question text, correct answer, free-text reasoning, three distractor answers, and annotator name. These per-annotator files were subsequently merged by the dataset curator into the global dataset using a dedicated aggregation pipeline.
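As an illustration of the one-row-per-question layout and the later merge, the sketch below flattens an annotated case into rows and combines several annotators' exports. Everything named here is hypothetical: the column identifiers, the input dictionary shape, and the duplicate check in `merge_exports` are assumptions, and CSV is used as a stand-in for the Excel format so that the example needs only the standard library. The actual aggregation pipeline is not described in the text.

```python
import csv
import io

# Hypothetical column names mirroring the fields listed in the text.
COLUMNS = [
    "scene_id", "situation", "question", "correct_answer", "reasoning",
    "distractor_1", "distractor_2", "distractor_3", "annotator",
]


def case_to_rows(case: dict) -> list[dict]:
    """Flatten one annotated case into one row per question."""
    rows = []
    for qa in case["answers"]:
        d1, d2, d3 = qa["distractors"]  # exactly three distractors expected
        rows.append({
            "scene_id": case["scene_id"],
            "situation": case["situation"],
            "question": qa["question"],
            "correct_answer": qa["correct_answer"],
            "reasoning": qa["reasoning"],
            "distractor_1": d1,
            "distractor_2": d2,
            "distractor_3": d3,
            "annotator": case["annotator"],
        })
    return rows


def merge_exports(exports: list[list[dict]]) -> list[dict]:
    """Concatenate per-annotator exports, rejecting duplicated (scene, question) pairs."""
    seen = set()
    merged = []
    for rows in exports:
        for row in rows:
            key = (row["scene_id"], row["question"])
            if key in seen:
                raise ValueError(f"duplicate annotation for {key}")
            seen.add(key)
            merged.append(row)
    return merged


def to_csv(rows: list[dict]) -> str:
    """Serialise rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Since the server assigns each case to a single annotator, a duplicated (scene, question) pair at merge time would indicate a problem upstream, which is why this sketch raises instead of silently keeping one copy.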
NCAP Question and Answer
For each scenario, distinct and specific questions were formulated to provide independent data points. They were designed to verify, through the answers given, whether the vehicle is correctly positioned within one of the targeted scenarios and to identify which phase of that scenario the recorded time span captures. The questions are listed in Table situations questions.
| Situation | Questions |
|---|---|
| Traffic light | Do you see any red traffic lights that could affect the lane the car is in? Which color has the traffic light that affects our car? Is our car in a right or left lane from which we can turn after the traffic light? Do you see in the situation more than 1 traffic light in our direction with different colours? |
| Leading braking | Do we have a car with braking lights on directly in front of us in our lane? Are we approaching the vehicle in front of us in our lane? Are we stopped, or is the car in front of us stopped? Are we stopping in a traffic jam? |
| Cut in | Are there any moving cars in the adjacent lanes to our right or left that are changing into our lane? Is there any car merging into our lane directly in front of us, with nothing between the new car and us? Are we in a roundabout? Are we in a intersection? Do you see any car on top of the lines that delimit our lane, either on the left or the right? |
| Construction site | Are there any construction signs, such as cones, yellow signs, lights, or closed lanes? Are the road lines yellow? Are there any workers working on or around the road? |
| Crossing object | Is there an intersection or a non-continuous lane ahead of our car? Is there any vehicle or pedestrian in front of our car moving in a different direction than us? Do we have any vehicle ahead of us in our lane, traveling in the same direction as us? |
| Lateral parked car | Are there cars to the right or left of the lane we are driving in? Are there any cars parked parallel or perpendicular on the left or right of our car? Are we in a city where cars can park along the sides, or on a highway? Are we in a middle lane with more traffic lanes on either side? |
| Pedestrian | Are we on a narrow street where the sidewalks and/or bike lane are right next to the roadway and visible? Are there any pedestrians or bicycles stopped or traveling alongside our car on the left or right? Could the pedestrian's path intersect with our trajectory? |
| Merging lane | Is the vehicle in an acceleration lane on a highway? Is there an acceleration lane for a highway on the right or left, adjacent to our lane? Is there any lane whose outer lines end at the lines of our lane? |
| Intersection road | Is there an intersection ahead of us? Does that intersection maintain the lane lines? Is there a visible gap or missing lane lines in our lane? Do the lines of two different lanes converge into the same line? |
| Roundabout | Is the car inside a roundabout? In our lane, can we see a roundabout ahead that we are approaching? Is the car stopped while seeing a vehicle crossing almost perpendicularly? |
| Speed limit adaptation | In the sequence, can different speed limit signs with different numbers be seen? Is there any speed limit sign that affects the lane we are driving in? What is the speed indicated in the signal? Are we on highway? Are we entering or exiting a highway? Are there speed bumps, crosswalks, or construction zones in our lane? |