What do passengers mean when they say, “I’m in a hurry”? For an autonomous car, the answer is not so simple unless, engineers claim, it uses ChatGPT.
Self-driving vehicles continue to gain traction, with autonomous trucks being tested in Texas and driverless taxis picking up passengers in San Francisco, Phoenix, Los Angeles, and Las Vegas.
However, despite the enthusiasm, autonomous vehicles still face many challenges. While safety remains a concern, communication between the vehicle and its passengers is another major issue.
The internal systems of AVs have difficulty understanding and adapting to vague instructions from humans. If a passenger expresses a feeling, saying, “I feel a bit motion sick right now” or “I’m in a hurry,” it is not easy for the AV to translate that comment into action. Essentially, the system lacks emotional and contextual awareness.
A taxi driver, for example, would know what “in a hurry” means without the passenger having to specify a route that avoids traffic. Or, if the passenger is motion sick, a human driver knows to slow down.
However, that’s not the case for AVs. Most current autonomous systems are trained on limited data, which may not cover all driving scenarios or include enough real-world knowledge for solid decision-making. As a result, they may struggle in unfamiliar or unusual situations, increasing the risk of safety issues.
ChatGPT translates instructions to the car
To address the challenge, Purdue University engineers have integrated a large language model into a Level 4 autonomous vehicle to make the system more ‘understanding.’
They created a framework, named Talk2Drive, that translates verbal commands from humans into textual instructions, which are processed by a large language model (LLM) in the cloud.
LLMs like ChatGPT can accurately understand human intentions and generate specific driving instructions for autonomous vehicles. This allows the vehicle to adjust its driving behavior and settings to match the passenger's preferences.
“The conventional systems in our vehicles have a user interface design where you have to press buttons to convey what you want, or an audio recognition system that requires you to be very explicit when you speak so that your vehicle can understand you,” said Ziran Wang, one of the research authors.
“But the power of large language models is that they can more naturally understand all kinds of things you say. I don’t think any other existing system can do that.”
Before conducting their experiments, the researchers trained ChatGPT using a range of prompts, from direct commands like “Please drive faster” to more indirect ones such as “I feel a bit motion sick right now.”
As ChatGPT learned to handle these commands, the researchers set specific parameters for the model, requiring it to consider traffic laws, road conditions, weather, and data from the vehicle’s sensors, including cameras and LiDAR.
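To illustrate the idea, here is a minimal sketch of how such a context-aware prompt might be assembled before being sent to the LLM. The function and field names are illustrative assumptions, not the researchers' actual code.

```python
# Hypothetical sketch: combining a passenger's words with the constraints the
# researchers describe (traffic laws, weather, road state, sensor data).
# All names and the prompt format are assumptions for illustration.

def build_prompt(passenger_command, context):
    """Assemble an LLM prompt from the spoken command plus driving context."""
    return (
        "You are the decision module of an autonomous vehicle.\n"
        "Always respect traffic laws and the constraints below.\n"
        f"Speed limit: {context['speed_limit_mph']} mph\n"
        f"Weather: {context['weather']}\n"
        f"Road condition: {context['road_condition']}\n"
        f"Sensor summary (camera/LiDAR): {context['sensor_summary']}\n"
        f"Passenger said: \"{passenger_command}\"\n"
        "Reply with a driving instruction, such as a target speed."
    )

prompt = build_prompt(
    "I feel a bit motion sick right now",
    {
        "speed_limit_mph": 45,
        "weather": "clear",
        "road_condition": "dry",
        "sensor_summary": "no obstacles detected",
    },
)
print(prompt)
```

Keeping the constraints in the prompt, rather than trusting the model's general knowledge, is what lets the system bound the LLM's answers by local traffic rules and live sensor readings.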
During the experiments, when the vehicle’s speech recognition system picked up a passenger command, the large language models processed it within the predefined parameters. These models then generated driving instructions for the vehicle's drive-by-wire system, which controls the throttle, brakes, gears, and steering, to execute the appropriate action.
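The last step of that pipeline, turning a high-level instruction into actuator commands, might look roughly like the following sketch. The instruction format and the simple proportional mapping are assumptions for illustration; the real drive-by-wire controller is far more involved.

```python
# Illustrative sketch: mapping a structured instruction (as an LLM might
# produce) onto throttle/brake values for a drive-by-wire controller.
# The format and gain are assumptions, not the paper's implementation.

def to_drive_by_wire(instruction, current_speed_mps):
    """Translate e.g. {"action": "set_speed", "target_mps": 10.0}
    into throttle/brake settings for the low-level controller."""
    target = instruction["target_mps"]
    error = target - current_speed_mps
    if error > 0:  # need to speed up: apply proportional throttle
        return {"throttle": min(1.0, 0.1 * error), "brake": 0.0}
    # need to slow down: apply proportional brake
    return {"throttle": 0.0, "brake": min(1.0, -0.1 * error)}

# Passenger feels motion sick; the LLM asked for a lower target speed.
cmd = to_drive_by_wire({"action": "set_speed", "target_mps": 10.0},
                       current_speed_mps=15.0)
```

The point of the intermediate structured instruction is that the LLM never touches the pedals directly; it only proposes targets that deterministic control code executes.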
The researchers carried out most of their experiments at a proving ground in Columbus, Indiana – a former airport runway. This setup provided a safe environment to test the vehicle’s reactions to passenger commands, including driving at highway speeds and navigating two-way intersections.
Remarkable results for now, but let’s not forget hallucinations
During the experiments, the average response time was 1.6 seconds, which is acceptable in non-time-critical scenarios but needs improvement when an autonomous vehicle must respond more quickly.
Field experiments were conducted in a range of scenarios, including highways, intersections, and parking areas, showing that the Talk2Drive framework significantly improves personalization. It reduced the driver takeover rate by 75.9%, while still maintaining safety and comfort within acceptable limits.
Additionally, a memory module that stored past interactions further enhanced personalization by allowing the vehicle to adapt to individual preferences over time. This led to a further reduction in the takeover rate by up to 65.2% compared to systems without the memory feature.
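A memory module of the kind described could be sketched as follows: past interactions are stored per passenger and summarized into later prompts so the vehicle can adapt over time. The class and method names here are illustrative assumptions.

```python
# Hedged sketch of a per-passenger interaction memory, assuming the module
# simply replays recent command/outcome pairs into future prompts.

class InteractionMemory:
    def __init__(self):
        self._history = {}  # passenger_id -> list of (command, outcome)

    def record(self, passenger_id, command, outcome):
        """Store what the passenger asked for and what the vehicle did."""
        self._history.setdefault(passenger_id, []).append((command, outcome))

    def preferences(self, passenger_id, last_n=5):
        """Summarize recent interactions so the LLM can personalize."""
        recent = self._history.get(passenger_id, [])[-last_n:]
        return "; ".join(f"'{c}' -> {o}" for c, o in recent)

memory = InteractionMemory()
memory.record("alice", "I'm in a hurry", "raised cruise speed to the limit")
memory.record("alice", "too fast for me", "reduced speed by 10%")
print(memory.preferences("alice"))
```

Feeding this summary into the prompt is one plausible way a second “I’m in a hurry” from the same passenger could be resolved without a takeover.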
Another issue that remains is that LLMs like ChatGPT can "hallucinate," meaning they might misinterpret information and respond incorrectly. Wang’s study included a fail-safe mechanism to ensure participants' safety if the models misunderstood commands.
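A fail-safe of this kind can be thought of as a validation layer between the LLM and the actuators: every instruction is checked against hard safety bounds, and anything unparsable or out of range falls back to a safe default. The bounds and names below are assumptions for illustration.

```python
# Hedged sketch of a fail-safe layer guarding against hallucinated LLM
# output. Limits and fallback behavior are illustrative assumptions.

SPEED_LIMIT_MPS = 29.0  # e.g. ~65 mph; would be road-dependent in practice
MIN_SPEED_MPS = 0.0

def validate_instruction(instruction):
    """Reject hallucinated or unsafe LLM output before actuation."""
    target = instruction.get("target_mps")
    if not isinstance(target, (int, float)):
        return {"action": "hold_current_speed"}  # unparsable -> safe fallback
    if target < MIN_SPEED_MPS or target > SPEED_LIMIT_MPS:
        return {"action": "hold_current_speed"}  # out of bounds -> safe fallback
    return instruction

# A hallucinated target speed gets replaced by the safe fallback.
safe = validate_instruction({"action": "set_speed", "target_mps": 120.0})
```

Such a deterministic gate does not fix hallucination itself, but it confines the damage a misinterpreted command can do.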
While the models' understanding improved during a ride, hallucination is still a concern that must be resolved before integrating these models into autonomous vehicles.
Vehicle manufacturers will need to conduct extensive additional testing beyond university studies to ensure the reliability of large language models in autonomous vehicles.