Challenges in training algorithms for autonomous cars
In my earlier article, I discussed the use of machine learning algorithms in autonomous cars. Training those algorithms and putting them into practice comes with a large set of challenges. This article covers the main challenges in training data for autonomous vehicles:
Attaining superior accuracy of detection and prediction:
Safety-critical systems, such as those used in self-driving cars, require much higher detection accuracy than is typical in the internet industry. These systems are expected to operate flawlessly irrespective of weather conditions, visibility, or road surface quality.
Challenge of scale:
Deep neural networks, such as those used in self-driving vehicles, require an enormous amount of computational power. They also require very large datasets. The networks need to be trained on representative datasets that include examples of all possible driving, weather, and situational conditions. In practice, this translates into petabytes of training data.
As a rough calculation, a fleet of 100 cars, each instrumented with 5 cameras, will generate in excess of one million hours of video recording in a year. This data needs to be captured, transported from the car to the data center, stored, processed, and used for training autonomous vehicles. Importantly, since supervised learning algorithms are used, the data also needs to be annotated by humans. Marking every pedestrian, car, lane, and other detail can become a significant bottleneck if the data annotation team is not adequately sized. Imagine the scale of the data processing involved.
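The rough calculation above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the 6 hours of recording per day and the 8 Mbit/s video bitrate are assumptions chosen for the example, not figures from this article, and the function names are hypothetical.

```python
def fleet_video_hours(cars, cameras_per_car, hours_per_day, days_per_year=365):
    """Total hours of video recorded by the whole fleet in one year."""
    return cars * cameras_per_car * hours_per_day * days_per_year


def storage_petabytes(video_hours, megabits_per_second):
    """Approximate raw storage for the video, in petabytes (1 PB = 1e15 bytes)."""
    total_bytes = video_hours * 3600 * megabits_per_second * 1e6 / 8
    return total_bytes / 1e15


# Assumed values: 6 hours of driving per day, 8 Mbit/s per camera stream.
hours = fleet_video_hours(cars=100, cameras_per_car=5, hours_per_day=6)
print(hours)                        # 1095000 hours of video per year
print(storage_petabytes(hours, 8))  # roughly 3.9 PB
```

Under these assumptions, each car only needs to record about six hours a day for the fleet to pass one million hours of video, and the storage requirement lands in the petabyte range mentioned earlier.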
One of the main challenges with the safety of deep neural networks is that they are unstable under so-called adversarial perturbations. Minimal modifications to camera images, such as resizing, cropping, or a change in lighting conditions, might cause the system to misclassify the image. The prevailing automotive safety standard, ISO 26262, does not have a way to define safety for self-learning algorithms such as deep learning. Hence, due to the fast pace of current technology, there is still no standardized way to assess the safety aspect.
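The effect of a small input change can be illustrated with a toy example. The sketch below is not a deep neural network: it is a hypothetical four-pixel linear classifier with made-up weights, used only to show how an input that sits close to the decision boundary can change class after a slight, uniform brightening.

```python
def score(weights, bias, pixels):
    """Linear decision score: positive means 'pedestrian'."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def classify(weights, bias, pixels):
    return "pedestrian" if score(weights, bias, pixels) > 0 else "background"


def brighten(pixels, delta):
    """Simulate a lighting change by shifting every pixel, clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + delta)) for p in pixels]


# Hypothetical weights and image, chosen so the image is near the boundary.
weights = [0.9, -1.0, 0.8, -1.0]
bias = 0.0
image = [0.52, 0.40, 0.50, 0.45]

print(classify(weights, bias, image))                 # pedestrian
print(classify(weights, bias, brighten(image, 0.1)))  # background
```

Real networks are far more complex, but the underlying issue is similar: when an input lies near a decision boundary, a change that looks insignificant to a human can be enough to alter the output.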
Debugging and deriving problems if the model fails:
A machine learning system learns from a huge set of data and stores the resulting model as a complex combination of weighted features. This combination of weighted features is unintuitive and difficult to interpret, which makes it hard to trace the cause when the model fails.
In machine-learning-based autonomous systems, training data serves as an important input to the required specifications, so the key focus is on the data. Several important questions must be answered about the training data. How do you know that the training data is complete? Does it cover all safety-critical aspects, including rare incidents that have a very low probability of occurrence? Is the data annotated and modeled properly, without errors?
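One simple way to start answering the completeness question is to count how many annotated samples exist for each combination of conditions and flag the combinations that are missing or rare. The sketch below assumes each sample carries hypothetical "weather" and "lighting" tags; the tag names, the categories, and the threshold are illustrative assumptions, not part of any standard.

```python
from collections import Counter
from itertools import product


def coverage_gaps(samples, weather_types, lighting_types, min_count):
    """Return condition combinations with fewer than min_count samples."""
    counts = Counter((s["weather"], s["lighting"]) for s in samples)
    return {
        combo: counts[combo]
        for combo in product(weather_types, lighting_types)
        if counts[combo] < min_count
    }


# Tiny made-up dataset for illustration.
samples = [
    {"weather": "clear", "lighting": "day"},
    {"weather": "clear", "lighting": "day"},
    {"weather": "clear", "lighting": "night"},
    {"weather": "rain", "lighting": "day"},
]

gaps = coverage_gaps(samples, ["clear", "rain"], ["day", "night"], min_count=2)
print(gaps)  # combinations that are missing or under-represented
```

A check like this only shows which tagged conditions are under-represented. It cannot prove that the data is complete, because conditions nobody thought to tag will not appear in the report.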
This article was written by Anil Gupta, Co-founder of Magnos Technologies LLP. He has about 23 years of experience in Connected Cars, Connected Devices, Embedded software, Automotive Infotainment, Telematics, GIS, Energy, and Telecom domain. Recently Anil wrote a book entitled Artificial Intelligence, Machine Learning, Internet of Things and more – Frequently Asked Questions.