Convolutional Neural Networks for Self-Driving Cars

This blog post contains my notes from project 3 in term 1 of the Udacity Self-Driving Car Nanodegree. The project is about developing and training a convolutional neural network on camera input (3 different camera angles) from a simulated car.

Best regards,

Amund Tveit

1. Modelling a Convolutional Neural Network for a Self-Driving Car

I used the NVIDIA Autopilot Deep Learning model for self-driving as inspiration (ref: the paper “End to End Learning for Self-Driving Cars” and an implementation of it), but made some changes to it:

  1. Added normalization in the model itself (ref `Lambda(lambda x: x/255.0 - 0.5, input_shape=img_input_shape)`), since it is likely to be faster than doing it in pure Python.
  2. Added max pooling after the first convolution layer, making the model more “traditional” wrt being capable of detecting low-level features such as edges (similar to classic networks such as LeNet).
  3. Added batch normalization in the early layers to be more robust wrt different learning rates.
  4. Used he_normal initialization (truncated normal distribution), since this type of initialization has mattered a lot in my earlier TensorFlow work.
  5. Used an L2 regularizer as a “rule of thumb”.
  6. Made the model (much) smaller by reducing the fully connected layers (I had problems running the larger model on a 1070 card, but in retrospect it was not the model size but my misunderstandings of Keras 2 that caused the trouble).
  7. Used selu (ref: the paper “Self-Normalizing Neural Networks”) instead of relu as the rectifier function in the later (fully connected) layers, since previous experience (with traffic sign classification and TensorFlow) showed that selu gave faster convergence (though not a better final result).
  8. Used dropout in the later layers to avoid overfitting.
  9. Used L1 regularization on the final layer, since I’ve seen that it is good for regression problems (better than L2).
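The changes above can be sketched as a model roughly like the following (written here against tf.keras rather than the Keras 2 used in the project; the layer sizes, regularization strengths and cropping region are illustrative assumptions, not the exact trained model):

```python
# Sketch of the modified NVIDIA-style model described above.
# Layer sizes, regularization strengths and the cropping region are
# illustrative assumptions, not the exact values of the trained model.
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Lambda, Cropping2D, Conv2D,
                                     MaxPooling2D, BatchNormalization,
                                     Flatten, Dense, Dropout)
from tensorflow.keras.regularizers import L1, L2

img_input_shape = (160, 320, 3)  # simulator camera resolution

model = Sequential()
# 1. normalization inside the model itself
model.add(Lambda(lambda x: x / 255.0 - 0.5, input_shape=img_input_shape))
model.add(Cropping2D(cropping=((70, 25), (0, 0))))  # crop sky and hood
# 2.-5. convolutions with max pooling, batch norm, he_normal init, L2
model.add(Conv2D(24, (5, 5), activation='relu',
                 kernel_initializer='he_normal', kernel_regularizer=L2(1e-3)))
model.add(MaxPooling2D())
model.add(BatchNormalization())
model.add(Conv2D(36, (5, 5), activation='relu',
                 kernel_initializer='he_normal', kernel_regularizer=L2(1e-3)))
model.add(MaxPooling2D())
# 6.-8. (much) smaller fully connected layers with selu and dropout
model.add(Flatten())
model.add(Dense(50, activation='selu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='selu'))
model.add(Dropout(0.5))
# 9. L1-regularized final layer predicting the steering angle
model.add(Dense(1, kernel_regularizer=L1(1e-3)))
model.compile(optimizer='adam', loss='mse')
```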

Model overview (image)

Detailed model (image)

#### 2. Attempts to reduce overfitting in the model

The model contains dropout layers in order to reduce overfitting (ref dropout_1 and dropout_2 in figure above and train_car_to_drive.ipynb).

Partially related: I also used balancing of the data sets in the generator (see sample_weight in the generator function).
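A hypothetical sketch of such balancing via sample_weight (the bin edges and inverse-frequency weighting scheme are my illustrative assumptions, not the project's exact code):

```python
# Hypothetical sketch of balancing inside a batch generator via Keras
# sample_weight: samples with rare steering angles get larger weights.
# Bin count and weighting scheme are illustrative assumptions.
import numpy as np

def balanced_weights(angles, n_bins=21):
    """Weight each sample inversely to the frequency of its angle bin."""
    bins = np.linspace(-1.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(angles, bins) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins).astype(float)
    counts[counts == 0] = 1.0            # avoid division by zero
    weights = 1.0 / counts[idx]
    return weights * len(angles) / weights.sum()  # normalize to mean 1.0

# inside the generator, yield (X_batch, y_batch, sample_weight) triples
# so Keras scales each sample's loss contribution accordingly
```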

The model was tested by running it through the simulator and ensuring that the vehicle could stay on the track. See the modelthatworked.mp4 file in this GitHub repository.

#### 3. Model parameter tuning

The model used an Adam optimizer, so the learning rate was not tuned manually.

#### 4. Appropriate training data

I used the training data that was provided as part of the project, and in addition added two runs of data to avoid problems (e.g. a curve without a lane line on the right side until the bridge started, and also a separate training set driving on the bridge). Data is available on

### Model Architecture and Training Strategy

#### 1. Solution Design Approach

The overall strategy for deriving a model architecture: I first tried the one I had previously used for traffic sign detection (based on LeNet), but it didn’t work (probably because the input images were too big), and then started from the Nvidia model (see above for details about the changes to it).

In order to gauge how well the model was working, I split my image and steering angle data into a training and validation set. The primary finding was that the numerical performance of the models I tried was not a good predictor of how well they would perform on actual driving in the simulator. Perhaps overfitting could be good for this task (i.e. memorizing the track), but I attempted to get a correctly trained model without overfitting (ref. dropout/selu and batch normalization). There were many failed runs before the car could actually drive around the first track.


#### 2. Creation of the Training Set & Training Process

I redrove and captured training data for the sections that were problematic (as mentioned: the curve without lane lines on the right, and the bridge and the part just before it). Regarding center-driving, I didn’t have much success adding data for that; perhaps my rebalancing (ref. generator output above) was actually counter-productive?

For each example line in the training data I generated 6 variants (for data augmentation), i.e. the 3 different cameras (left, center and right, with adjustments to the steering angle) and a flipped version (along the center vertical axis) of each.

After the collection process, I had 10485 lines in driving_log.csv, i.e. the number of data points = 62910 (6*10485). Preprocessing was used to flip images, convert images to numpy arrays and also (as part of the Keras model) to scale values. Cropping of the image was also done as part of the model. I finally randomly shuffled the data set and put 20% of the data into a validation set, see the generator for details. Examples of images (before cropping inside the model) are shown below:
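The 6-variant scheme can be sketched as follows (the steering correction of 0.2 for the side cameras and the helper name are illustrative assumptions):

```python
# Sketch of the 6-variant augmentation per driving_log.csv line:
# 3 cameras (with a steering correction for left/right) plus a
# horizontal flip of each. CORRECTION = 0.2 is an assumed value.
import numpy as np

CORRECTION = 0.2  # assumed steering offset for the side cameras

def six_variants(center_img, left_img, right_img, steering):
    """Return 6 (image, angle) pairs from one log line."""
    variants = [
        (center_img, steering),
        (left_img, steering + CORRECTION),   # left camera: steer right
        (right_img, steering - CORRECTION),  # right camera: steer left
    ]
    # flip each image along the center vertical axis, negate the angle
    variants += [(np.fliplr(img), -angle) for img, angle in list(variants)]
    return variants
```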

Example of center camera image (image)

Example of flipped center camera image (image)

Example of left camera image (image)

Example of right camera image (image)


I used this training data for training the model. The validation set helped determine whether the model was over- or underfitting. The ideal number of epochs was 5, as evidenced by the quick flattening of training and validation loss (to around 0.03); in earlier runs validation loss increased above training loss with more epochs. I used an Adam optimizer so that manually tuning the learning rate wasn’t necessary.

3. Challenges

I found this to be a very hard task, since model loss and validation loss weren’t good predictors of actual driving performance. There were also cases where adding more nice driving data (at the center and far from the edges) actually gave worse results and made the car drive off the road. Other challenges were Keras 2 related: the differing semantics of parameters between Keras 1 and Keras 2 fooled me a bit (ref. steps_per_epoch). I also had issues with the progress bar not working in Keras 2 in Jupyter notebook, so I had to use a 3rd party library.

Continue Reading

Lane Finding (on Roads) for Self Driving Cars with OpenCV

This blog post presents a (basic) approach to potentially using OpenCV for lane finding for self-driving cars (i.e. finding the yellow and white stripes along the road) – I did this as one of the projects of term 1 of Udacity’s self-driving car nanodegree (highly recommended online education!).

Disclaimer: the approach presented in this blog post is way too simple to use for an actual self-driving car, but it was a good way (for me) to learn more about (non-deep-learning-based) computer vision and the lane finding problem.

See for more details about the approach (python code)

Best regards,

Amund Tveit

Lane Finding (On Roads) for Self Driving Cars with OpenCV

1. First I selected the region of interest (with hand-made vertices)

2. Converted the image to grayscale

3. Extracted likely white lane information from the grayscale image.

Used 220 as limit (255 is 100% white, but 220 is close enough)

4. Extracted likely yellow lane information from the (colorized) region of interest image.

RGB for Yellow is [255,255,0] but found [220,220,30] to be close enough

5. Converted the yellow lane information image to grayscale

6. Combined the likely yellow and white lane grayscale images into a new grayscale image (using max value)

7. Did a gaussian blur (with kernel size 3) followed by canny edge detection

Gaussian blur smooths out the image using convolution; this reduces false signals to the (Canny) edge detector

8. Did a Hough (transform) image creation; I also modified the draw_lines function (see the GitHub link above) by calculating the average slope and intercept (i.e. fitting y = ax + b for each of the Hough lines to find a and b, and then averaging over them).

For more information about Hough Transform, check out this hough transformation tutorial.

(side note: I believe it could perhaps have been smarter to use Hough line center points instead of Hough lines, since their directions sometimes seem a bit unstable, and then use the average of slopes between center points instead)

9. Used weighted image blending to overlay the Hough image (with the lane detections) on top of the original image

Continue Reading

Traffic Sign Detection with Convolutional Neural Networks

Making self-driving cars work requires several technologies and methods pulling in the same direction (e.g. Radar/Lidar, cameras, control theory and deep learning). The online Self-Driving Car Nanodegree from Udacity (divided into 3 terms) is probably the best way to learn more about the topic (see [Term 1], [Term 2] and [Term 3] for more details about each term); the coolest part is that you can actually run your code on an actual self-driving car towards the end of term 3 (I am currently in the middle of term 1 – highly recommended course!).

Note: before taking this course I recommend taking Udacity’s Deep Learning Nanodegree Foundations, since most (term 1) projects require some hands-on experience with deep learning.

Traffic Sign Detection with Convolutional Neural Networks

This blog post is a writeup of my (non-perfect) approach to German traffic sign detection (a project in the course) with convolutional neural networks (in TensorFlow) – a variant of LeNet with dropout and (the new) SELU (Self-Normalizing Neural Networks). The effect of SELU was primarily that it quickly gained classification accuracy (even in the first epoch), but it didn’t lead to higher accuracy than batch normalization + RELU in the end. Data augmentation in particular, and perhaps a deeper network, could have improved the performance, I believe.

For other approaches (e.g. R-CNN and cascaded deep networks) see the blog post: Deep Learning for Vehicle Detection and Recognition.

UPDATE – 2017-July-15:

If you thought Traffic Sign Detection from modern cars was an entirely solved problem, think again:



Best regards,

Amund Tveit

1. Basic summary of the German Traffic Sign Data set.

I used numpy shape to calculate summary statistics of the traffic signs data set:

  • The size of the training set is 34799
  • The size of the validation set is 4410
  • The size of the test set is 12630
  • The shape of a traffic sign image is 32x32x3 (3 color channels, RGB)
  • The number of unique classes/labels in the data set is 43
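These summary statistics come straight from numpy shapes; a sketch using dummy arrays in place of the pickled data set (shapes as reported above):

```python
# Computing the summary statistics above with numpy shapes.
# Dummy zero arrays stand in for the real pickled data set here.
import numpy as np

X_train = np.zeros((34799, 32, 32, 3), dtype=np.uint8)  # training images
y_train = np.zeros(34799, dtype=np.int64)               # training labels

n_train = X_train.shape[0]           # 34799
image_shape = X_train.shape[1:]      # (32, 32, 3)
n_classes = len(np.unique(y_train))  # 43 on the real label array
```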

2. Visualization of the train, validation and test dataset.

Here is an exploratory visualization of the data set. It is a bar chart showing the normalized distribution of data over the 43 traffic sign classes. The key takeaway is that the relative number of data points varies quite a bit between classes, e.g. from around 6.5% (e.g. class 1) down to 0.05% (e.g. class 37), i.e. a factor of over 100 (6.5% / 0.05% = 130); this can potentially impact classification performance.

(Bar chart: normalized class distribution over the 43 sign classes)

3 Design of Architecture

3.1 Preprocessing of images

Did no grayscale conversion or other conversion of the train/test/validation images (they were already preprocessed). The images from the Internet were read using PIL, converted to RGB (from RGBA), resized to 32×32 and converted to numpy arrays before normalization.

All images had the pixels in each color channel (RGB, 3 channels with values between 0 and 255) normalized to be between -0.5 and 0.5 via (value - 128)/255. Did no data augmentation.
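As a numpy one-liner, this normalization maps each uint8 channel value from 0..255 to roughly -0.5..0.5 (the helper name is my own):

```python
# Per-channel normalization: (value - 128) / 255 maps uint8 pixel
# values 0..255 to approximately the range -0.5..0.5.
import numpy as np

def normalize(images):
    return (images.astype(np.float32) - 128.0) / 255.0
```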

Here are sample images from the training set

(Sample images from the training set)

3.2 Model Architecture

Given the relatively low resolution of the images, I started with the LeNet example provided in the lectures, but to improve training I added dropout (in the early layers) with RELU rectifier functions. I had recently read about the self-normalizing rectifier function SELU, so I decided to try that instead of RELU. It gave no better end result after many epochs, but trained much faster (got > 90% in one epoch), so I kept SELU in the final model. For more information about SELU check out the paper Self-Normalizing Neural Networks from Johannes Kepler University in Linz, Austria.

My final model consisted of the following layers:

| Layer | Description |
|---|---|
| Input | 32x32x3 RGB image |
| Convolution 5×5 | 1×1 stride, valid padding, outputs 28x28x6 |
| Dropout | keep_prob = 0.9 |
| Max pooling | 2×2 stride, outputs 14x14x6 |
| Convolution 5×5 | 1×1 stride, valid padding, outputs 10x10x16 |
| Dropout | keep_prob = 0.9 |
| Max pooling | 2×2 stride, outputs 5x5x16 |
| Flatten | output dimension 400 |
| Fully connected | output dimension 120 |
| Fully connected | output dimension 84 |
| Fully connected | output dimension 84 |
| Fully connected | output dimension 43 |
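The layer table can be sketched in tf.keras roughly as follows (the original implementation used plain TensorFlow; keep_prob = 0.9 corresponds to a dropout rate of 0.1, and the Adam learning rate of 0.002 is from the training section below):

```python
# tf.keras sketch of the LeNet-variant layer table above.
# The original used plain TensorFlow; keep_prob = 0.9 becomes
# a dropout *rate* of 0.1 in Keras terminology.
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, Dropout, MaxPooling2D,
                                     Flatten, Dense)

model = Sequential([
    Conv2D(6, (5, 5), padding='valid', activation='selu',
           input_shape=(32, 32, 3)),                     # 28x28x6
    Dropout(0.1),                                        # keep_prob = 0.9
    MaxPooling2D((2, 2)),                                # 14x14x6
    Conv2D(16, (5, 5), padding='valid', activation='selu'),  # 10x10x16
    Dropout(0.1),                                        # keep_prob = 0.9
    MaxPooling2D((2, 2)),                                # 5x5x16
    Flatten(),                                           # 400
    Dense(120, activation='selu'),
    Dense(84, activation='selu'),
    Dense(84, activation='selu'),
    Dense(43, activation='softmax'),                     # 43 classes
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.002),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
```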

3.3 Training of Model

To train the model, I used an Adam optimizer with a learning rate of 0.002, 20 epochs (it converged fast with SELU) and a batch size of 256 (ran on a GTX 1070 with 8GB GPU RAM).

3.4 Approach to find solution and getting accuracy > 0.93

Adding dropout to LeNet improved test accuracy, and SELU improved training speed. The originally partitioned data sets were quite unbalanced (when plotted), so reading all the data, shuffling, and creating new training/validation/test sets also helped. I thought about using Keras and fine-tuning a pretrained model (e.g. Inception v3), but a big model on such small images could lead to overfitting (not entirely sure about that though), and reducing the input size might lead to long training time (it looks like fine-tuning works best when you keep the same input size but change the output classes).

My final model results were:

  • validation set accuracy of 0.976 (between 0.975-0.982)
  • test set accuracy of 0.975

If an iterative approach was chosen:

  • What was the first architecture that was tried and why was it chosen?

Started with LeNet and incrementally added dropout and then several SELU layers. Also added one more fully connected layer.

  • What were some problems with the initial architecture?

No major problems, but the results were not great before adding dropout (to avoid overfitting).

  • Which parameters were tuned? How were they adjusted and why?

Tried several combinations of learning rates. Could reduce the number of epochs after adding SELU. Used the same dropout keep rate throughout.

Since the difference between validation accuracy and test accuracy is very low, the model seems to be working well. The loss is also quite low (0.02), so there is most likely little to gain, at least without changing the model a lot.

4 Test a Model on New Images

4.1. Choose five German traffic signs found on the web

Here are five German traffic signs that I found on the web:

(Images: the five German traffic signs)

In the first pick of images I didn’t check that the signs actually were among the 43 classes the model was built for, and that turned out not to be the case, making correct classification impossible. But I got interesting results (regarding finding similar signs) for the wrongly classified ones, so I replaced only 2 of them with sign images that actually were covered by the model, i.e. still making it impossible to classify 3 of them correctly.

Here are the results of the prediction:

| Image | Prediction |
|---|---|
| Priority road | Priority road |
| Side road | Speed limit (50km/h) |
| Adult and child on road | Turn left ahead |
| Two way traffic ahead | Beware of ice/snow |
| Speed limit (60km/h) | Speed limit (60km/h) |

The model was able to correctly guess 2 of the 5 traffic signs, which gives an accuracy of 40%. The other ones it can’t classify correctly, but the 2nd prediction for sign 3, “adult and child on road”, is interesting since it suggests “Go straight or right”, which is quite visually similar (if you blur the innermost part of each sign you get almost the same image).

Continue Reading

Deep Learning for Traffic Sign Detection and Recognition

Traffic Sign Detection and Recognition is key functionality for self-driving cars. This posting lists recent papers in this area. Also check out the related posting: Deep Learning for Vehicle Detection and Classification

Best regards,
Amund Tveit

Year  Title Author
2016   Road surface traffic sign detection with hybrid region proposal and fast R-CNN  R Qian, Q Liu, Y Yue, F Coenen, B Zhang
2016   Traffic sign classification with deep convolutional neural networks  J CREDI
2016   Real-time Traffic Sign Recognition system with deep convolutional neural network  S Jung, U Lee, J Jung, DH Shim
2016   Traffic Sign Detection and Recognition using Fully Convolutional Network Guided Proposals  Y Zhu, C Zhang, D Zhou, X Wang, X Bai, W Liu
2016   A traffic sign recognition method based on deep visual feature  F Lin, Y Lai, L Lin, Y Yuan
2016   The research on traffic sign recognition based on deep learning  C Li, C Yang
2015   Fast Traffic Sign Recognition with a Rotation Invariant Binary Pattern Based Feature  S Yin, P Ouyang, L Liu, Y Guo, S Wei
2015   Malaysia traffic sign recognition with convolutional neural network  MM Lau, KH Lim, AA Gopalai
2015   Negative-Supervised Cascaded Deep Learning for Traffic Sign Classification  K Xie, S Ge, R Yang, X Lu, L Sun
Continue Reading

Deep Learning for Vehicle Detection and Classification

Update: 2017-Feb-03 – launched new service – (navigation and search in papers). Try e.g. out its Vehicle, Car and Driving pages.

This posting has recent papers about vehicle (e.g. car) detection and classification, e.g. for self-driving/autonomous cars. Related: also check out Nvidia‘s End-to-End Deep Learning for Self-driving Cars and Udacity‘s Self-Driving Car Engineer (Nanodegree).

Best regards,

Amund Tveit (@atveit)

Year  Title Author
2016   Vehicle Classification using Transferable Deep Neural Network Features  Y Zhou, NM Cheung
2016   A Hybrid Fuzzy Morphology And Connected Components Labeling Methods For Vehicle Detection And Counting System  C Fatichah, JL Buliali, A Saikhu, S Tena
2016   Evaluation of vehicle interior sound quality using a continuous restricted Boltzmann machine-based DBN  HB Huang, RX Li, ML Yang, TC Lim, WP Ding
2016   An Automated Traffic Surveillance System with Aerial Camera Arrays: Data Collection with Vehicle Tracking  X Zhao, D Dawson, WA Sarasua, ST Birchfield
2016   Vehicle type classification via adaptive feature clustering for traffic surveillance video  S Wang, F Liu, Z Gan, Z Cui
2016   Vehicle Detection in Satellite Images by Incorporating Objectness and Convolutional Neural Network  S Qu, Y Wang, G Meng, C Pan
2016   DAVE: A Unified Framework for Fast Vehicle Detection and Annotation  Y Zhou, L Liu, L Shao, M Mellor
2016   3D Fully Convolutional Network for Vehicle Detection in Point Cloud  B Li
2016   A Deep Learning-Based Approach to Progressive Vehicle Re-identification for Urban Surveillance  X Liu, W Liu, T Mei, H Ma
2016   TraCount: a deep convolutional neural network for highly overlapping vehicle counting  S Surya, RV Babu
2016   Pedestrian, bike, motorcycle, and vehicle classification via deep learning: Deep belief network and small training set  YY Wu, CM Tsai
2016   Fast Vehicle Detection in Satellite Images Using Fully Convolutional Network  J Hu, T Xu, J Zhang, Y Yang
2016   Local Tiled Deep Networks for Recognition of Vehicle Make and Model  Y Gao, HJ Lee
2016   Vehicle detection based on visual saliency and deep sparse convolution hierarchical model  Y Cai, H Wang, X Chen, L Gao, L Chen
2016   Sound quality prediction of vehicle interior noise using deep belief networks  HB Huang, XR Huang, RX Li, TC Lim, WP Ding
2016   Accurate On-Road Vehicle Detection with Deep Fully Convolutional Networks  Z Jie, WF Lu, EHF Tay
2016   Fault Detection and Identification of Vehicle Starters and Alternators Using Machine Learning Techniques  E Seddik
2016   Fault diagnosis network design for vehicle on-board equipments of high-speed railway: A deep learning approach  J Yin, W Zhao
2016   Real-time state-of-health estimation for electric vehicle batteries: A data-driven approach  G You, S Park, D Oh
2016   The Precise Vehicle Retrieval in Traffic Surveillance with Deep Convolutional Neural Networks  B Su, J Shao, J Zhou, X Zhang, L Mei, C Hu
2016   Online vehicle detection using deep neural networks and lidar based preselected image patches  S Lange, F Ulbrich, D Goehring
2016   A closer look at Faster R-CNN for vehicle detection  Q Fan, L Brown, J Smith
2016   Appearance-based Brake-Lights recognition using deep learning and vehicle detection  JG Wang, L Zhou, Y Pan, S Lee, Z Song, BS Han
2016   Night time vehicle detection algorithm based on visual saliency and deep learning  Y Cai, HW Xiaoqiang Sun, LCH Jiang
2016   Vehicle classification in WAMI imagery using deep network  M Yi, F Yang, E Blasch, C Sheaff, K Liu, G Chen, H Ling
2015   VeTrack: Real Time Vehicle Tracking in Uninstrumented Indoor Environments  M Zhao, T Ye, R Gao, F Ye, Y Wang, G Luo
2015   Vehicle Color Recognition in The Surveillance with Deep Convolutional Neural Networks  B Su, J Shao, J Zhou, X Zhang, L Mei
2015   Vehicle Speed Prediction using Deep Learning  J Lemieux, Y Ma
2015   Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning  D Liu, Y Wang
2015   Night Time Vehicle Sensing in Far Infrared Image with Deep Learning  H Wang, Y Cai, X Chen, L Chen
2015   A Vehicle Type Recognition Method based on Sparse Auto Encoder  HL Rong, YX Xia
2015   Occluded vehicle detection with local connected deep model  H Wang, Y Cai, X Chen, L Chen
2015   Performance Evaluation of the Neural Network based Vehicle Detection Models  K Goyal, D Kaur
2015   A Smartphone-based Connected Vehicle Solution for Winter Road Surface Condition Monitoring  MA Linton
2015   Vehicle Logo Recognition System Based on Convolutional Neural Networks With a Pretraining Strategy  Y Huang, R Wu, Y Sun, W Wang, X Ding
2015   SiftKeyPre: A Vehicle Recognition Method Based on SIFT Key-Points Preference in Car-Face Image  CY Zhang, XY Wang, J Feng, Y Cheng
2015   Vehicle Detection in Aerial Imagery: A small target detection benchmark  S Razakarivony, F Jurie
2015   Vehicle license plate recognition using visual attention model and deep learning  D Zang, Z Chai, J Zhang, D Zhang, J Cheng
2015   Domain adaption of vehicle detector based on convolutional neural networks  X Li, M Ye, M Fu, P Xu, T Li
2015   Trainable Convolutional Network Apparatus And Methods For Operating A Robotic Vehicle  P O’connor, E Izhikevich
2015   Vehicle detection and classification based on convolutional neural network  D He, C Lang, S Feng, X Du, C Zhang
2015   The AdaBoost algorithm for vehicle detection based on CNN features  X Song, T Rui, Z Zha, X Wang, H Fang
2015   Deep neural networks-based vehicle detection in satellite images  Q Jiang, L Cao, M Cheng, C Wang, J Li
2015   Vehicle License Plate Recognition Based on Extremal Regions and Restricted Boltzmann Machines  C Gou, K Wang, Y Yao, Z Li
2014   Multi-modal Sensor Registration for Vehicle Perception via Deep Neural Networks  M Giering, K Reddy, V Venugopalan
2014   Mooting within the curriculum as a vehicle for learning: student perceptions  L Jones, S Field
2014   Vehicle Type Classification Using Semi-Supervised Convolutional Neural Network  Z Dong, Y Wu, M Pei, Y Jia
2014   Vehicle License Plate Recognition With Random Convolutional Networks  D Menotti, G Chiachia, AX Falcao, VJO Neto
2014   Vehicle Type Classification Using Unsupervised Convolutional Neural Network  Z Dong, M Pei, Y He, T Liu, Y Dong, Y Jia
Continue Reading