Traffic Sign Detection with Convolutional Neural Networks

Making self-driving cars work requires several technologies and methods pulling in the same direction (e.g. radar/lidar, cameras, control theory and deep learning). Udacity's online Self-Driving Car Nanodegree (divided into 3 terms) is probably the best way to learn more about the topic (see [Term 1], [Term 2] and [Term 3] for details about each term). The coolest part is that towards the end of term 3 you actually get to run your code on a real self-driving car (I am currently in the middle of term 1 – highly recommended course!).

Note: before taking this course I recommend taking Udacity's Deep Learning Nanodegree Foundations, since most term 1 projects require some hands-on experience with deep learning.

This blog post is a writeup of my (non-perfect) approach to German traffic sign detection (a project in the course) with convolutional neural networks (in TensorFlow) – a variant of LeNet with dropout and the new SELU activation (from Self-Normalizing Neural Networks). The main effect of SELU was that the network gained classification accuracy quickly (even in the first epoch), but it did not end up more accurate than batch normalization + RELU. Data augmentation in particular, and perhaps a deeper network, could have improved the performance I believe.

For other approaches (e.g. R-CNN and cascaded deep networks) see the blog post: Deep Learning for Vehicle Detection and Recognition.

UPDATE – 2017-July-15:

If you thought traffic sign detection from modern cars was an entirely solved problem, think again:



Best regards,

Amund Tveit

1. Basic summary of the German Traffic Sign Data set.

I used numpy's shape to calculate summary statistics of the traffic signs data set:

  • The size of the training set is 34799
  • The size of the validation set is 4410
  • The size of the test set is 12630
  • The shape of a traffic sign image is 32x32x3 (3 color channels, RGB)
  • The number of unique classes/labels in the data set is 43
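The statistics above can be read directly off the array shapes; here is a minimal sketch (the array names `X_train` etc. are assumptions standing in for the pickled GTSRB data splits):

```python
import numpy as np

# Hypothetical placeholder arrays with the same shapes as the real data splits.
X_train = np.zeros((34799, 32, 32, 3), dtype=np.uint8)
X_valid = np.zeros((4410, 32, 32, 3), dtype=np.uint8)
X_test = np.zeros((12630, 32, 32, 3), dtype=np.uint8)

n_train = X_train.shape[0]       # size of the training set
n_valid = X_valid.shape[0]       # size of the validation set
n_test = X_test.shape[0]         # size of the test set
image_shape = X_train.shape[1:]  # (32, 32, 3)
n_classes = 43                   # len(np.unique(y_train)) on the real labels
```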

2. Visualization of the train, validation and test dataset.

Here is an exploratory visualization of the data set: a bar chart showing the normalized distribution of data over the 43 traffic sign classes. The key takeaway is that the relative number of data points varies quite a bit between classes, e.g. from around 6.5% (e.g. class 1) down to 0.05% (e.g. class 37) – a factor of more than 100 difference (6.5% / 0.05% = 130). This imbalance can potentially impact classification performance.

[Figure: bar chart of the normalized class distribution over the 43 traffic sign classes]

3 Design of Architecture

3.1 Preprocessing of images

I did no grayscale conversion or other conversion of the train/test/validation images (they came preprocessed). The images from the Internet were read using PIL, converted from RGBA to RGB, resized to 32×32 and converted to numpy arrays before normalization.

All images were normalized per color channel (RGB – 3 channels with values between 0 and 255) to the range -0.5 to 0.5 using (value - 128) / 255. I did no data augmentation.
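A minimal numpy sketch of this normalization:

```python
import numpy as np

def normalize(images):
    """Map uint8 RGB pixels from [0, 255] to roughly [-0.5, 0.5] per channel."""
    return (images.astype(np.float32) - 128.0) / 255.0
```

For example, pixel value 0 maps to about -0.502, 128 maps to 0.0, and 255 maps to about 0.498.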

Here are sample images from the training set

[Figure: sample images from the training set]

3.2 Model Architecture

Given the relatively low resolution of the images I started with the LeNet example provided in the lectures, but to improve training I added dropout (in the early layers) combined with RELU rectifier functions. I had recently read about the self-normalizing rectifier function – SELU – so I decided to try it instead of RELU. It gave no better end result after many epochs, but trained much faster (> 90% accuracy in one epoch), so I kept SELU. For more information about SELU check out the paper Self-Normalizing Neural Networks from Johannes Kepler University in Linz, Austria.

My final model consisted of the following layers:

Layer           | Description
Input           | 32x32x3 RGB image
Convolution 5×5 | 1×1 stride, valid padding, outputs 28x28x6
Dropout         | keep_prob = 0.9
Max pooling     | 2×2 stride, outputs 14x14x6
Convolution 5×5 | 1×1 stride, valid padding, outputs 10x10x16
Dropout         | keep_prob = 0.9
Max pooling     | 2×2 stride, outputs 5x5x16
Flatten         | output dimension 400
Fully connected | output dimension 120
Fully connected | output dimension 84
Fully connected | output dimension 84
Fully connected | output dimension 43
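The spatial dimensions in the table follow from standard valid-padding convolution and pooling arithmetic; a quick sketch to sanity-check them:

```python
def conv_valid(size, kernel, stride=1):
    """Output spatial size of a VALID-padded convolution."""
    return (size - kernel) // stride + 1

def pool(size, stride=2):
    """Output spatial size of 2x2 max pooling with stride 2."""
    return size // stride

s = 32
s = conv_valid(s, 5)  # 28 -> 28x28x6 after the first conv
s = pool(s)           # 14 -> 14x14x6 after pooling
s = conv_valid(s, 5)  # 10 -> 10x10x16 after the second conv
s = pool(s)           # 5  -> 5x5x16 after pooling
flat = s * s * 16     # 400 units feeding the fully connected stack
```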

3.3 Training of Model

To train the model, I used the Adam optimizer with a learning rate of 0.002, 20 epochs (it converged fast with SELU) and a batch size of 256 (run on a GTX 1070 with 8GB of GPU RAM).
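For context, Adam keeps running estimates of the gradient's first and second moments and applies a bias-corrected update. A single-parameter numpy sketch of one update step (the 0.002 learning rate matches the setting above; the rest are Adam's usual defaults):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.002, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: moment estimates with bias correction (t starts at 1)."""
    m = b1 * m + (1 - b1) * grad          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```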

3.4 Approach to finding a solution and getting accuracy > 0.93

Adding dropout to LeNet improved test accuracy, and SELU improved training speed. The originally partitioned data sets were quite unbalanced (visible when plotting), so reading all data, shuffling and recreating the training/validation/test sets also helped. I considered using Keras and fine-tuning a pretrained model (e.g. Inception v3), but such a big model on such small images might lead to overfitting (not entirely sure about that though), and adapting the input size might lead to long training time (fine-tuning seems to work best when you keep the same input size and only change the output classes).

My final model results were:

  • validation set accuracy of 0.976 (between 0.975-0.982)
  • test set accuracy of 0.975

If an iterative approach was chosen:

  • What was the first architecture that was tried and why was it chosen?

Started with LeNet and incrementally added dropout, then several SELU layers. Also added one more fully connected layer.

  • What were some problems with the initial architecture?

No major problems, but results were not great before adding dropout (to avoid overfitting).

  • Which parameters were tuned? How were they adjusted and why?

Tried several combinations of learning rates. Could reduce the number of epochs after adding SELU. Used the same dropout keep rate throughout.

Since the difference between validation accuracy and test accuracy is very low, the model seems to generalize well. The loss is also quite low (0.02), so there is most likely little left to gain – at least without changing the model a lot.

4 Test a Model on New Images

4.1. Choose five German traffic signs found on the web

Here are five German traffic signs that I found on the web:

[Figure: five German traffic sign images found on the web]

For my first pick of images I didn't check that the signs actually were among the 43 classes the model was built for – and that was in fact not the case, making them impossible to classify correctly. But I got interesting results (the model found visually similar signs) for the wrongly classified ones, so I replaced only 2 of them with sign images that actually were covered by the model, leaving 3 of them still impossible to classify.

Here are the results of the prediction:

Image                   | Prediction
Priority road           | Priority road
Side road               | Speed limit (50km/h)
Adult and child on road | Turn left ahead
Two way traffic ahead   | Beware of ice/snow
Speed limit (60km/h)    | Speed limit (60km/h)

The model was able to correctly guess 2 of the 5 traffic signs, which gives an accuracy of 40%. The other ones it can't classify correctly, but the 2nd prediction for sign 3 – "adult and child on road" – is interesting since it suggests "Go straight or right", which is quite visually similar (if you blur the innermost part of each sign you get almost the same image).

Continue Reading

Deep Learning for Image Super-Resolution (Scale Up)

Scaling down images is a craft; scaling up images is an art.

When scaling down to a lower resolution you typically need to remove pixels, but when scaling up you need to invent new pixels. Some deep learning models with convolutional neural networks (frequently with deconvolutional layers) have proven successful at scaling up images – this is called image super-resolution. These models are typically trained by taking high-resolution images, reducing them to lower resolution, and then training in the opposite direction. Partially related: I also recommend checking out Odena et al.'s publication Deconvolution and Checkerboard Artifacts, which goes into more detail about one of the core operators used in image super-resolution.
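The training-pair construction described above (downscale high-resolution images to obtain inputs) can be sketched as follows; block-average downsampling is used here as one simple choice, though real pipelines often use bicubic resampling instead:

```python
import numpy as np

def make_training_pair(hr, factor=2):
    """Build a (low-res, high-res) pair by block-averaging the HR image."""
    h, w, c = hr.shape
    h, w = h - h % factor, w - w % factor  # crop so dimensions divide evenly
    hr = hr[:h, :w, :]
    # Group pixels into factor x factor blocks and average each block.
    lr = hr.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))
    return lr, hr
```

The model is then trained to map each low-resolution input back to its high-resolution original.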

Blog post Illustration Source: Eric Esteve’s 2013 article: Super Resolution bring high end camera image quality to smartphone.

Best regards,

Amund Tveit

Year  Title Author
2017   GUN: Gradual Upsampling Network for single image super-resolution  Y Zhao, R Wang, W Dong, W Jia, J Yang, X Liu, W Gao
2017   Dual Recovery Network with Online Compensation for Image Super-Resolution  S Xia, W Yang, T Zhao, J Liu
2017   A New Single Image Super-resolution Method Based on the Infinite Mixture Model  P Cheng, Y Qiu, X Wang, K Zhao
2017   Underwater Image Super-resolution by Descattering and Fusion  H Lu, Y Li, S Nakashima, H Kim, S Serikawa
2017   Single Image Super-Resolution with a Parameter Economic Residual-Like Convolutional Neural Network  Z Yang, K Zhang, Y Liang, J Wang
2017   Single Image Super-Resolution via Adaptive Transform-Based Nonlocal Self-Similarity Modeling and Learning-Based Gradient Regularization  H Chen, X He, L Qing, Q Teng
2017   Ensemble Based Deep Networks for Image Super-Resolution  Z Huang, L Wang, Y Gong, C Pan
2017   Single Image Super-Resolution Using Multi-Scale Convolutional Neural Network  X Jia, X Xu, B Cai, K Guo
2017   Hyperspectral image super-resolution using deep convolutional neural network  Y Li, J Hu, X Zhao, W Xie, JJ Li
2016   Research on the Natural Image Super-Resolution Reconstruction Algorithm based on Compressive Perception Theory and Deep Learning Model  G Duan, W Hu, J Wang
2016   Image super-resolution with multi-channel convolutional neural networks  Y Kato, S Ohtani, N Kuroki, T Hirose, M Numa
2016   Image super-resolution reconstruction via RBM-based joint dictionary learning and sparse representation  Z Zhang, A Liu, Q Lei
2016   End-to-End Image Super-Resolution via Deep and Shallow Convolutional Networks  Y Wang, L Wang, H Wang, P Li
2016   Single image super-resolution using regularization of non-local steering kernel regression  K Zhang, X Gao, J Li, H Xia
2016   Single image super-resolution via blind blurring estimation and anchored space mapping  X Zhao, Y Wu, J Tian, H Zhang
2016   A Versatile Sparse Representation Based Post-Processing Method for Improving Image Super-Resolution  J Yang, J Guo, H Chao
2016   Robust Single Image Super-Resolution via Deep Networks with Sparse Prior.  D Liu, Z Wang, B Wen, J Yang, W Han, T Huang
2016   EnhanceNet: Single Image Super-Resolution through Automated Texture Synthesis  MSM Sajjadi, B Schölkopf, M Hirsch
2016   Is Image Super-resolution Helpful for Other Vision Tasks?  D Dai, Y Wang, Y Chen, L Van Gool
2016   Cluster-Based Image Super-resolution via Jointly Low-rank and Sparse Representation  N Han, Z Song, Y Li
2016   Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network  C Ledig, L Theis, F Huszar, J Caballero, A Aitken
2016   Image super-resolution using non-local Gaussian process regression  H Wang, X Gao, K Zhang, J Li
2016   A hybrid wavelet convolution network with sparse-coding for image super-resolution  X Gao, H Xiong
2016   Amortised MAP Inference for Image Super-resolution  CK Sønderby, J Caballero, L Theis, W Shi, F Huszár
2016   X-Ray fluorescence image super-resolution using dictionary learning  Q Dai, E Pouyet, O Cossairt, M Walton, F Casadio
2016   Image super-resolution based on convolution neural networks using multi-channel input  GY Youm, SH Bae, M Kim
2016   Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution  W Yang, J Feng, J Yang, F Zhao, J Liu, Z Guo, S Yan
2016   Image Super-Resolution by PSOSEN of Local Receptive Fields Based Extreme Learning Machine  Y Song, B He, Y Shen, R Nian, T Yan
2016   Incorporating Image Priors with Deep Convolutional Neural Networks for Image Super-Resolution  Y Liang, J Wang, S Zhou, Y Gong, N Zheng
2015   Single Image Super-Resolution Via Bm3D Sparse Coding  K Egiazarian, V Katkovnik
2015   Learning a Deep Convolutional Network for Light-Field Image Super-Resolution  Y Yoon, HG Jeon, D Yoo, JY Lee, I Kweon
2015   Single Image Super-Resolution via Image Smoothing  Z Liu, Q Huang, J Li, Q Wang
2015   Deeply Improved Sparse Coding for Image Super-Resolution  Z Wang, D Liu, J Yang, W Han, T Huang
2015   Conditioned Regression Models for Non-Blind Single Image Super-Resolution  G Riegler, S Schulter, M Rüther, H Bischof
2015   How Useful Is Image Super-resolution to Other Vision Tasks?  D Dai, Y Wang, Y Chen, L Van Gool
2015   Learning Hierarchical Decision Trees for Single Image Super-Resolution  JJ Huang, WC Siu
2015   Single image super-resolution by approximated Heaviside functions  LJ Deng, W Guo, TZ Huang
2015   Jointly Optimized Regressors for Image Super-resolution  D Dai, R Timofte, L Van Gool
2015   Single Image Super-Resolution via Internal Gradient Similarity  Y Xian, Y Tian
2015   Image Super-Resolution Using Deep Convolutional Networks  C Dong, CC Loy, K He, X Tang
2015   Coupled Deep Autoencoder for Single Image Super-Resolution  K Zeng, J Yu, R Wang, C Li, D Tao
2015   Single Image Super-Resolution Using Maximizing Self-Similarity Prior  J Li, Y Wu, X Luo
2015   Accurate Image Super-Resolution Using Very Deep Convolutional Networks  J Kim, JK Lee, KM Lee
2015   Deeply-Recursive Convolutional Network for Image Super-Resolution  J Kim, JK Lee, KM Lee
2015   Single Face Image Super-Resolution via Solo Dictionary Learning  F Juefei
2014   Single image super-resolution via L0 image smoothing  Z Liu, Q Huang, J Li, Q Wang
Continue Reading

Deep Learning for Traffic Sign Detection and Recognition

Traffic Sign Detection and Recognition is key functionality for self-driving cars. This posting has recent papers in this area. Check also out related posting: Deep Learning for Vehicle Detection and Classification

Best regards,
Amund Tveit

Year  Title Author
2016   Road surface traffic sign detection with hybrid region proposal and fast R-CNN  R Qian, Q Liu, Y Yue, F Coenen, B Zhang
2016   Traffic sign classification with deep convolutional neural networks  J CREDI
2016   Real-time Traffic Sign Recognition system with deep convolutional neural network  S Jung, U Lee, J Jung, DH Shim
2016   Traffic Sign Detection and Recognition using Fully Convolutional Network Guided Proposals  Y Zhu, C Zhang, D Zhou, X Wang, X Bai, W Liu
2016   A traffic sign recognition method based on deep visual feature  F Lin, Y Lai, L Lin, Y Yuan
2016   The research on traffic sign recognition based on deep learning  C Li, C Yang
2015   Fast Traffic Sign Recognition with a Rotation Invariant Binary Pattern Based Feature  S Yin, P Ouyang, L Liu, Y Guo, S Wei
2015   Malaysia traffic sign recognition with convolutional neural network  MM Lau, KH Lim, AA Gopalai
2015   Negative-Supervised Cascaded Deep Learning for Traffic Sign Classification  K Xie, S Ge, R Yang, X Lu, L Sun
Continue Reading

Deep Learning for Vehicle Detection and Classification

Update: 2017-Feb-03 – launched new service – (navigation and search in papers). Try out e.g. its Vehicle, Car and Driving pages.

This posting has recent papers about vehicle (e.g. car) detection and classification, e.g. for self-driving/autonomous cars. Related: check out Nvidia's End-to-End Deep Learning for Self-driving Cars and Udacity's Self-Driving Car Engineer (Nanodegree).

Best regards,

Amund Tveit (@atveit)

Year  Title Author
2016   Vehicle Classification using Transferable Deep Neural Network Features  Y Zhou, NM Cheung
2016   A Hybrid Fuzzy Morphology And Connected Components Labeling Methods For Vehicle Detection And Counting System  C Fatichah, JL Buliali, A Saikhu, S Tena
2016   Evaluation of vehicle interior sound quality using a continuous restricted Boltzmann machine-based DBN  HB Huang, RX Li, ML Yang, TC Lim, WP Ding
2016   An Automated Traffic Surveillance System with Aerial Camera Arrays: Data Collection with Vehicle Tracking  X Zhao, D Dawson, WA Sarasua, ST Birchfield
2016   Vehicle type classification via adaptive feature clustering for traffic surveillance video  S Wang, F Liu, Z Gan, Z Cui
2016   Vehicle Detection in Satellite Images by Incorporating Objectness and Convolutional Neural Network  S Qu, Y Wang, G Meng, C Pan
2016   DAVE: A Unified Framework for Fast Vehicle Detection and Annotation  Y Zhou, L Liu, L Shao, M Mellor
2016   3D Fully Convolutional Network for Vehicle Detection in Point Cloud  B Li
2016   A Deep Learning-Based Approach to Progressive Vehicle Re-identification for Urban Surveillance  X Liu, W Liu, T Mei, H Ma
2016   TraCount: a deep convolutional neural network for highly overlapping vehicle counting  S Surya, RV Babu
2016   Pedestrian, bike, motorcycle, and vehicle classification via deep learning: Deep belief network and small training set  YY Wu, CM Tsai
2016   Fast Vehicle Detection in Satellite Images Using Fully Convolutional Network  J Hu, T Xu, J Zhang, Y Yang
2016   Local Tiled Deep Networks for Recognition of Vehicle Make and Model  Y Gao, HJ Lee
2016   Vehicle detection based on visual saliency and deep sparse convolution hierarchical model  Y Cai, H Wang, X Chen, L Gao, L Chen
2016   Sound quality prediction of vehicle interior noise using deep belief networks  HB Huang, XR Huang, RX Li, TC Lim, WP Ding
2016   Accurate On-Road Vehicle Detection with Deep Fully Convolutional Networks  Z Jie, WF Lu, EHF Tay
2016   Fault Detection and Identification of Vehicle Starters and Alternators Using Machine Learning Techniques  E Seddik
2016   Fault diagnosis network design for vehicle on-board equipments of high-speed railway: A deep learning approach  J Yin, W Zhao
2016   Real-time state-of-health estimation for electric vehicle batteries: A data-driven approach  G You, S Park, D Oh
2016   The Precise Vehicle Retrieval in Traffic Surveillance with Deep Convolutional Neural Networks  B Su, J Shao, J Zhou, X Zhang, L Mei, C Hu
2016   Online vehicle detection using deep neural networks and lidar based preselected image patches  S Lange, F Ulbrich, D Goehring
2016   A closer look at Faster R-CNN for vehicle detection  Q Fan, L Brown, J Smith
2016   Appearance-based Brake-Lights recognition using deep learning and vehicle detection  JG Wang, L Zhou, Y Pan, S Lee, Z Song, BS Han
2016   Night time vehicle detection algorithm based on visual saliency and deep learning  Y Cai, HW Xiaoqiang Sun, LCH Jiang
2016   Vehicle classification in WAMI imagery using deep network  M Yi, F Yang, E Blasch, C Sheaff, K Liu, G Chen, H Ling
2015   VeTrack: Real Time Vehicle Tracking in Uninstrumented Indoor Environments  M Zhao, T Ye, R Gao, F Ye, Y Wang, G Luo
2015   Vehicle Color Recognition in The Surveillance with Deep Convolutional Neural Networks  B Su, J Shao, J Zhou, X Zhang, L Mei
2015   Vehicle Speed Prediction using Deep Learning  J Lemieux, Y Ma
2015   Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning  D Liu, Y Wang
2015   Night Time Vehicle Sensing in Far Infrared Image with Deep Learning  H Wang, Y Cai, X Chen, L Chen
2015   A Vehicle Type Recognition Method based on Sparse Auto Encoder  HL Rong, YX Xia
2015   Occluded vehicle detection with local connected deep model  H Wang, Y Cai, X Chen, L Chen
2015   Performance Evaluation of the Neural Network based Vehicle Detection Models  K Goyal, D Kaur
2015   A Smartphone-based Connected Vehicle Solution for Winter Road Surface Condition Monitoring  MA Linton
2015   Vehicle Logo Recognition System Based on Convolutional Neural Networks With a Pretraining Strategy  Y Huang, R Wu, Y Sun, W Wang, X Ding
2015   SiftKeyPre: A Vehicle Recognition Method Based on SIFT Key-Points Preference in Car-Face Image  CY Zhang, XY Wang, J Feng, Y Cheng
2015   Vehicle Detection in Aerial Imagery: A small target detection benchmark  S Razakarivony, F Jurie
2015   Vehicle license plate recognition using visual attention model and deep learning  D Zang, Z Chai, J Zhang, D Zhang, J Cheng
2015   Domain adaption of vehicle detector based on convolutional neural networks  X Li, M Ye, M Fu, P Xu, T Li
2015   Trainable Convolutional Network Apparatus And Methods For Operating A Robotic Vehicle  P O’connor, E Izhikevich
2015   Vehicle detection and classification based on convolutional neural network  D He, C Lang, S Feng, X Du, C Zhang
2015   The AdaBoost algorithm for vehicle detection based on CNN features  X Song, T Rui, Z Zha, X Wang, H Fang
2015   Deep neural networks-based vehicle detection in satellite images  Q Jiang, L Cao, M Cheng, C Wang, J Li
2015   Vehicle License Plate Recognition Based on Extremal Regions and Restricted Boltzmann Machines  C Gou, K Wang, Y Yao, Z Li
2014   Multi-modal Sensor Registration for Vehicle Perception via Deep Neural Networks  M Giering, K Reddy, V Venugopalan
2014   Mooting within the curriculum as a vehicle for learning: student perceptions  L Jones, S Field
2014   Vehicle Type Classification Using Semi-Supervised Convolutional Neural Network  Z Dong, Y Wu, M Pei, Y Jia
2014   Vehicle License Plate Recognition With Random Convolutional Networks  D Menotti, G Chiachia, AX Falcao, VJO Neto
2014   Vehicle Type Classification Using Unsupervised Convolutional Neural Network  Z Dong, M Pei, Y He, T Liu, Y Dong, Y Jia
Continue Reading

Deep Learning for Ultrasound Analysis

Ultrasound (also called sonography) is sound waves with a higher frequency than humans can hear. It is frequently used in medical settings, e.g. for checking that a pregnancy is going well with fetal ultrasound. For more about ultrasound data formats, check out the Ultrasound Research Interface. This blog post has recent publications about applying deep learning to analyzing ultrasound data.

Best regards,
Amund Tveit

Year  Title Authors
2016   Early-stage atherosclerosis detection using deep learning over carotid ultrasound images  RM Menchón
2016   Automatic Detection of Standard Sagittal Plane in the First Trimester of Pregnancy Using 3-D Ultrasound Data  S Nie, J Yu, P Chen, Y Wang, JQ Zhang
2016   Detection of prostate cancer using temporal sequences of ultrasound data: a large clinical feasibility study  S Azizi, F Imani, S Ghavidel, A Tahmasebi, JT Kwak
2016   Hough-CNN: Deep Learning for Segmentation of Deep Brain Regions in MRI and Ultrasound  F Milletari, SA Ahmadi, C Kroll, A Plate, V Rozanski
2016   Hybrid approach for automatic segmentation of fetal abdomen from ultrasound images using deep learning  H Ravishankar, SM Prabhu, V Vaidya, N Singhal
2016   Iterative Multi-domain Regularized Deep Learning for Anatomical Structure Detection and Segmentation from Ultrasound Images  H Chen, Y Zheng, JH Park, PA Heng, SK Zhou
2016   4D Cardiac Ultrasound Standard Plane Location by Spatial-Temporal Correlation  Y Gu, GZ Yang, J Yang, K Sun
2016   Computer-Aided Diagnosis for Breast Ultrasound Using Computerized BI-RADS Features and Machine Learning Methods  J Shan, SK Alam, B Garra, Y Zhang, T Ahmed
2016   Stacked Deep Polynomial Network Based Representation Learning for Tumor Classification with Small Ultrasound Image Dataset  J Shi, S Zhou, X Liu, Q Zhang, M Lu, T Wang
2016   Coupling Convolutional Neural Networks and Hough Voting for Robust Segmentation of Ultrasound Volumes  C Kroll, F Milletari, N Navab, SA Ahmadi
2016   Classifying Cancer Grades Using Temporal Ultrasound for Transrectal Prostate Biopsy  S Azizi, F Imani, JT Kwak, A Tahmasebi, S Xu, P Yan
2015   Tumor Classification by Deep Polynomial Network and Multiple Kernel Learning on Small Ultrasound Image Dataset  X Liu, J Shi, Q Zhang
2015   Automatic Recognition of Fetal Facial Standard Plane in Ultrasound Image via Fisher Vector  B Lei, EL Tan, S Chen, L Zhuo, S Li, D Ni, T Wang
2015   Estimation of the Arterial Diameter in Ultrasound Images of the Common Carotid Artery  RM Menchón
2015   Cell recognition based on topological sparse coding for microscopy imaging of focused ultrasound treatment  Z Wang, J Zhu, Y Xue, C Song, N Bi
2014   Mapping between ultrasound and vowel speech using DNN framework  X Zheng, J Wei, W Lu, Q Fang, J Dang
2014   High-definition 3D Image Processing Technology for Ultrasound Diagnostic Scanners  M Ogino, T Shibahara, Y Noguchi, T Tsujita
2014   Fully automatic segmentation of ultrasound common carotid artery images based on machine learning  RM Menchón
Continue Reading

Deep Learning with Generative and Generative Adversarial Networks – ICLR 2017 Discoveries

The 5th International Conference on Learning Representations (ICLR 2017) is coming to Toulon, France (April 24-26 2017).

This blog post gives an overview of deep learning papers related to generative and adversarial networks submitted to ICLR 2017; see below for the list of papers. Want to learn about these topics? See OpenAI's article about Generative Models and Ian Goodfellow's paper about Generative Adversarial Networks.

Best regards,

Amund Tveit

ICLR 2017 – Generative and Generative Adversarial Papers

  1. Unsupervised Learning Using Generative Adversarial Training And Clustering – Authors: Vittal Premachandran, Alan L. Yuille
  2. Improving Generative Adversarial Networks with Denoising Feature Matching – Authors: David Warde-Farley, Yoshua Bengio
  3. Generative Adversarial Parallelization – Authors: Daniel Jiwoong Im, He Ma, Chris Dongjoo Kim, Graham Taylor
  4. b-GAN: Unified Framework of Generative Adversarial Networks – Authors: Masatosi Uehara, Issei Sato, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo
  5. Generative Adversarial Networks as Variational Training of Energy Based Models – Authors: Shuangfei Zhai, Yu Cheng, Rogerio Feris, Zhongfei Zhang
  6. Boosted Generative Models – Authors: Aditya Grover, Stefano Ermon
  7. Adversarial examples for generative models – Authors: Jernej Kos, Dawn Song
  8. Mode Regularized Generative Adversarial Networks – Authors: Tong Che, Yanran Li, Athul Jacob, Yoshua Bengio, Wenjie Li
  9. Variational Recurrent Adversarial Deep Domain Adaptation – Authors: Sanjay Purushotham, Wilka Carvalho, Tanachat Nilanon, Yan Liu
  10. Structured Interpretation of Deep Generative Models – Authors: N. Siddharth, Brooks Paige, Alban Desmaison, Jan-Willem van de Meent, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H.S. Torr
  11. Inference and Introspection in Deep Generative Models of Sparse Data – Authors: Rahul G. Krishnan, Matthew Hoffman
  12. Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy – Authors: Dougal J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, Arthur Gretton
  13. Unsupervised sentence representation learning with adversarial auto-encoder – Authors: Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang
  14. Unsupervised Program Induction with Hierarchical Generative Convolutional Neural Networks – Authors: Qucheng Gong, Yuandong Tian, C. Lawrence Zitnick
  15. A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Noise – Authors: Beilun Wang, Ji Gao, Yanjun Qi
  16. On the Quantitative Analysis of Decoder-Based Generative Models – Authors: Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, Roger Grosse
  17. Evaluation of Defensive Methods for DNNs against Multiple Adversarial Evasion Models – Authors: Xinyun Chen, Bo Li, Yevgeniy Vorobeychik
  18. Calibrating Energy-based Generative Adversarial Networks – Authors: Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville
  19. Inverse Problems in Computer Vision using Adversarial Imagination Priors – Authors: Hsiao-Yu Fish Tung, Katerina Fragkiadaki
  20. Towards Principled Methods for Training Generative Adversarial Networks – Authors: Martin Arjovsky, Leon Bottou
  21. Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning – Authors: Dilin Wang, Qiang Liu
  22. Multi-view Generative Adversarial Networks – Authors: Mickaël Chen, Ludovic Denoyer
  23. LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation – Authors: Jianwei Yang, Anitha Kannan, Dhruv Batra, Devi Parikh
  24. Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks – Authors: Emily Denton, Sam Gross, Rob Fergus
  25. Generative Adversarial Networks for Image Steganography – Authors: Denis Volkhonskiy, Boris Borisenko, Evgeny Burnaev
  26. Unrolled Generative Adversarial Networks – Authors: Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein
  27. Generative Multi-Adversarial Networks – Authors: Ishan Durugkar, Ian Gemp, Sridhar Mahadevan
  28. Joint Multimodal Learning with Deep Generative Models – Authors: Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo
  29. Fast Adaptation in Generative Models with Generative Matching Networks – Authors: Sergey Bartunov, Dmitry P. Vetrov
  30. Adversarially Learned Inference – Authors: Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Alex Lamb, Martin Arjovsky, Olivier Mastropietro, Aaron Courville
  31. Perception Updating Networks: On architectural constraints for interpretable video generative models – Authors: Eder Santana, Jose C Principe
  32. Energy-based Generative Adversarial Networks – Authors: Junbo Zhao, Michael Mathieu, Yann LeCun
  33. Simple Black-Box Adversarial Perturbations for Deep Networks – Authors: Nina Narodytska, Shiva Kasiviswanathan
  34. Learning in Implicit Generative Models – Authors: Shakir Mohamed, Balaji Lakshminarayanan
  35. On Detecting Adversarial Perturbations – Authors: Jan Hendrik Metzen, Tim Genewein, Volker Fischer, Bastian Bischoff
  36. Delving into Transferable Adversarial Examples and Black-box Attacks – Authors: Yanpei Liu, Xinyun Chen, Chang Liu, Dawn Song
  37. Adversarial Feature Learning – Authors: Jeff Donahue, Philipp Krähenbühl, Trevor Darrell
  38. Generative Paragraph Vector – Authors: Ruqing Zhang, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng
  39. Adversarial Machine Learning at Scale – Authors: Alexey Kurakin, Ian J. Goodfellow, Samy Bengio
  40. Adversarial Training Methods for Semi-Supervised Text Classification – Authors: Takeru Miyato, Andrew M. Dai, Ian Goodfellow
  41. Sampling Generative Networks: Notes on a Few Effective Techniques – Authors: Tom White
  42. Adversarial examples in the physical world – Authors: Alexey Kurakin, Ian J. Goodfellow, Samy Bengio
  43. Improving Sampling from Generative Autoencoders with Markov Chains – Authors: Kai Arulkumaran, Antonia Creswell, Anil Anthony Bharath
  44. Neural Photo Editing with Introspective Adversarial Networks – Authors: Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston
  45. Learning to Protect Communications with Adversarial Neural Cryptography – Authors: Martín Abadi, David G. Andersen

Continue Reading

Deep Learning for Natural Language Processing – ICLR 2017 Discoveries

Update: 2017-Feb-03 – launched new service – (navigation and search in papers). Try out e.g. its Natural Language Processing page.

The 5th International Conference on Learning Representations (ICLR 2017) is coming to Toulon, France (April 24-26 2017), and a large number of deep learning papers have been submitted to the conference – it looks like it will be a great event (see the word cloud below for the most frequent words in submitted paper titles).


This blog post gives an overview of natural language processing related papers submitted to ICLR 2017; see below for the list of papers. If you want to learn about deep learning with NLP, check out Stanford's CS224d: Deep Learning for Natural Language Processing.

Best regards,

Amund Tveit


Character/Word/Sentence Representation

  1. Character-aware Attention Residual Network for Sentence Representation – Authors: Xin Zheng, Zhenzhou Wu
  2. Program Synthesis for Character Level Language Modeling – Authors: Pavol Bielik, Veselin Raychev, Martin Vechev
  3. Words or Characters? Fine-grained Gating for Reading Comprehension – Authors: Zhilin Yang, Bhuwan Dhingra, Ye Yuan, Junjie Hu, William W. Cohen, Ruslan Salakhutdinov
  4. Deep Character-Level Neural Machine Translation By Learning Morphology – Authors: Shenjian Zhao, Zhihua Zhang
  5. Opening the vocabulary of neural language models with character-level word representations – Authors: Matthieu Labeau, Alexandre Allauzen
  6. Unsupervised sentence representation learning with adversarial auto-encoder – Authors: Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang
  7. Offline Bilingual Word Vectors Without a Dictionary – Authors: Samuel L. Smith, David H. P. Turban, Nils Y. Hammerla, Steven Hamblin
  8. Learning Word-Like Units from Joint Audio-Visual Analysis – Authors: David Harwath, James R. Glass
  9. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling – Authors: Hakan Inan, Khashayar Khosravi, Richard Socher
  10. Sentence Ordering using Recurrent Neural Networks – Authors: Lajanugen Logeswaran, Honglak Lee, Dragomir Radev

Search/Question-Answer/Recommender Systems

  1. Learning to Query, Reason, and Answer Questions On Ambiguous Texts – Authors: Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerry Tesauro, Satinder Singh
  2. Group Sparse CNNs for Question Sentence Classification with Answer Sets – Authors: Mingbo Ma, Liang Huang, Bing Xiang, Bowen Zhou
  3. CONTENT2VEC: Specializing Joint Representations of Product Images and Text for the task of Product Recommendation – Authors: Thomas Nedelec, Elena Smirnova, Flavian Vasile
  4. Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce – Authors: Tom Zahavy, Alessandro Magnani, Abhinandan Krishnan, Shie Mannor

Word/Sentence Embedding

  1. A Simple but Tough-to-Beat Baseline for Sentence Embeddings – Authors: Sanjeev Arora, Yingyu Liang, Tengyu Ma
  2. Investigating Different Context Types and Representations for Learning Word Embeddings – Authors: Bofang Li, Tao Liu, Zhe Zhao, Xiaoyong Du
  3. Multi-view Recurrent Neural Acoustic Word Embeddings – Authors: Wanjia He, Weiran Wang, Karen Livescu
  4. A Self-Attentive Sentence Embedding – Authors: Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
  5. Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks – Authors: Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg

Machine Translation/Multilingual/++

  1. Neural Machine Translation with Latent Semantic of Image and Text – Authors: Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo
  2. Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context – Authors: Shyam Upadhyay, Kai-Wei Chang, James Zhou, Matt Taddy, Adam Kalai
  3. Learning to Understand: Incorporating Local Contexts with Global Attention for Sentiment Classification – Authors: Zhigang Yuan, Yuting Hu, Yongfeng Huang
  4. Adaptive Feature Abstraction for Translating Video to Language – Authors: Yunchen Pu, Martin Renqiang Min, Zhe Gan, Lawrence Carin
  5. A Convolutional Encoder Model for Neural Machine Translation – Authors: Jonas Gehring, Michael Auli, David Grangier, Yann N. Dauphin
  6. Fuzzy paraphrases in learning word representations with a corpus and a lexicon – Authors: Yuanzhi Ke, Masafumi Hagiwara
  7. Iterative Refinement for Machine Translation – Authors: Roman Novak, Michael Auli, David Grangier
  8. Vocabulary Selection Strategies for Neural Machine Translation – Authors: Gurvan L’Hostis, David Grangier, Michael Auli

Language Models/Text Comprehension/Matching/Compression/Classification/++

  1. A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks – Authors: Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher
  2. Gated-Attention Readers for Text Comprehension – Authors: Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov
  3. A Compare-Aggregate Model for Matching Text Sequences – Authors: Shuohang Wang, Jing Jiang
  4. A Context-aware Attention Network for Interactive Question Answering – Authors: Huayu Li, Martin Renqiang Min, Yong Ge, Asim Kadav
  5. Compressing text classification models – Authors: Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Herve Jegou, Tomas Mikolov
  6. Multi-Agent Cooperation and the Emergence of (Natural) Language – Authors: Angeliki Lazaridou, Alexander Peysakhovich, Marco Baroni
  7. Learning a Natural Language Interface with Neural Programmer – Authors: Arvind Neelakantan, Quoc V. Le, Martin Abadi, Andrew McCallum, Dario Amodei
  8. Learning similarity preserving representations with neural similarity and context encoders – Authors: Franziska Horn, Klaus-Robert Müller
  9. Adversarial Training Methods for Semi-Supervised Text Classification – Authors: Takeru Miyato, Andrew M. Dai, Ian Goodfellow
  10. Multi-Label Learning using Tensor Decomposition for Large Text Corpora – Authors: Sayantan Dasgupta

