Deep Learning for Acoustic Modelling


This blog post gives an overview of papers related to acoustic modelling, primarily for speech recognition but also for speech generation (synthesis). See also for a broader set of (at the time of writing 73) recent Deep Learning papers related to acoustics for speech recognition and other applications of acoustics.

Acoustic Modelling is described in Wikipedia as: “An acoustic model is used in Automatic Speech Recognition to represent the relationship between an audio signal and the phonemes or other linguistic units that make up speech. The model is learned from a set of audio recordings and their corresponding transcripts”. 

Blog Post Illustration Photo Source: Professor Mark Gales’ (University of Cambridge) 2009 presentation Acoustic Modelling for Speech Recognition: Hidden Markov Models and Beyond?

Year  Title Author
2017   Investigation on acoustic modeling with different phoneme set for continuous Lhasa Tibetan recognition based on DNN method  H Wang, K Khyuru, J Li, G Li, J Dang, L Huang
2017   Personalized Acoustic Modeling By Weakly Supervised Multi-Task Deep Learning Using Acoustic Tokens  CK Wei, CT Chung, HY Lee, LS Lee
2017   I-vector estimation as auxiliary task for multi-task learning based acoustic modeling for automatic speech recognition  G Pironkov, S Dupont, T Dutoit
2016   Graph-based Semi-Supervised Learning in Acoustic Modeling for Automatic Speech Recognition  Y Liu
2016   A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition  A Zeyer, P Doetsch, P Voigtlaender, R Schlüter, H Ney
2016   Improvements in IITG Assamese Spoken Query System: Background Noise Suppression and Alternate Acoustic Modeling  S Shahnawazuddin, D Thotappa, A Dey, S Imani
2016   DNN-Based Acoustic Modeling for Russian Speech Recognition Using Kaldi  I Kipyatkova, A Karpov
2015   Doubly Hierarchical Dirichlet Process HMM for Acoustic Modeling  AHHN Torbati, J Picone
2015   Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends  ZH Ling, SY Kang, H Zen, A Senior, M Schuster
2015   Acoustic Modeling in Statistical Parametric Speech Synthesis–From HMM to LSTM-RNN  H Zen
2015   Acoustic Modeling of Bangla Words using Deep Belief Network  M Ahmed, PC Shill, K Islam, MAH Akhand
2015   Unified Acoustic Modeling using Deep Conditional Random Fields  Y Hifny
2015   Exploiting Low-Dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition  P Dighe, G Luyet, A Asaei, H Bourlard
2015   Ensemble Acoustic Modeling for CD-DNN-HMM Using Random Forests of Phonetic Decision Trees  T Zhao, Y Zhao, X Chen
2015   Deep Neural Networks for Acoustic Modeling  G Hinton, L Deng, D Yu, G Dahl
2015   Integrating Articulatory Data in Deep Neural Network-based Acoustic Modeling  L Badino, C Canevari, L Fadiga, G Metta
2015   Deep learning in acoustic modeling for Automatic Speech Recognition and Understanding-an overview  I Gavat, D Militaru

Deep Learning with Long Short-Term Memory (LSTM)

This blog post lists some recent papers about Deep Learning with Long Short-Term Memory (LSTM). To get started, I recommend checking out Christopher Olah’s Understanding LSTM Networks and Andrej Karpathy’s The Unreasonable Effectiveness of Recurrent Neural Networks. This blog post is complemented by Deep Learning with Recurrent/Recursive Neural Networks (RNN) – ICLR 2017 Discoveries.

Year  Title Author
2016   Look, Listen and Learn-A Multimodal LSTM for Speaker Identification  J Ren, Y Hu, YW Tai, C Wang, L Xu, W Sun, Q Yan
2016   Leveraging Sentence-level Information with Encoder LSTM for Natural Language Understanding  G Kurata, B Xiang, B Zhou, M Yu
2016   Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition  FJ Ordóñez, D Roggen
2016   Exploiting LSTM structure in deep neural networks for speech recognition  T He, J Droppo
2016   A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition  A Zeyer, P Doetsch, P Voigtlaender, R Schlüter, H Ney
2016   Geometric Scene Parsing with Hierarchical LSTM  Z Peng, R Zhang, X Liang, L Lin
2016   LSTM Networks for Mobile Human Activity Recognition  Y Chen, K Zhong, J Zhang, Q Sun, X Zhao
2016   Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention  Y Liu, C Sun, L Lin, X Wang
2016   Facing Realism in Spontaneous Emotion Recognition from Speech: Feature Enhancement by Autoencoder with LSTM Neural Networks  Z Zhang, F Ringeval, J Han, J Deng, E Marchi
2016   Contextual LSTM (CLSTM) models for Large scale NLP tasks  S Ghosh, O Vinyals, B Strope, S Roy, T Dean, L Heck
2016   Bidirectional LSTM Networks Employing Stacked Bottleneck Features for Expressive Speech-Driven Head Motion Synthesis  K Haag, H Shimodaira
2016   Beyond Frame-level CNN: Saliency-aware 3D CNN with LSTM for Video Action Recognition  J Song, H Shen
2015   Learning Statistical Scripts with LSTM Recurrent Neural Networks  K Pichotta, RJ Mooney
2015   A deep bidirectional LSTM approach for video-realistic talking head  B Fan, L Xie, S Yang, L Wang, FK Soong
2015   Maxout neurons for deep convolutional and LSTM neural networks in speech recognition  M Cai, J Liu
2015   Scene Analysis by Mid-level Attribute Learning using 2D LSTM networks and an Application to Web-image Tagging  W Byeon, M Liwicki, TM Breuel
2015   Learning to Diagnose with LSTM Recurrent Neural Networks  ZC Lipton, DC Kale, C Elkan, R Wetzell
2015   Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting  X Shi, Z Chen, H Wang, DY Yeung, W Wong

Deep Learning in Finance

This post lists recent publications about Deep Learning in Finance (e.g. stock market prediction).

Year  Title Author
2016   Genetic deep neural networks using different activation functions for financial data mining  LM Zhang
2016   Computational Intelligence and Financial Markets: A Survey and Future Directions  RC Cavalcante, RC Brasileiro, VLF Souza, JP Nobrega
2016   Classification-based Financial Markets Prediction using Deep Neural Networks  M Dixon, D Klabjan, JH Bang
2016   Exploiting Twitter Moods to Boost Financial Trend Prediction Based on Deep Network Models  Y Huang, K Huang, Y Wang, H Zhang, J Guan, S Zhou
2016   Deep Learning in Finance  JB Heaton, NG Polson, JH Witte
2016   Deep Direct Reinforcement Learning for Financial Signal Representation and Trading  Y Deng, F Bao, Y Kong, Z Ren, Q Dai
2015   Improving Decision Analytics with Deep Learning: The Case of Financial Disclosures  R Fehrer, S Feuerriegel
2015   An application of deep learning for trade signal prediction in financial markets  AC Turkmen, AT Cemgil
2015   Deep Learning for Multivariate Financial Time Series  G Batres
2015   Deep Modeling Complex Couplings within Financial Markets  W Cao, L Hu, L Cao
2015   Leverage Financial News to Predict Stock Price Movements Using Word Embeddings and Deep Neural Networks  Y Peng, H Jiang
2014   GPU Implementation of a Deep Learning Network for Financial Prediction  R Kumar, AK Cheema
2016   Non-Conformity Detection in High-Dimensional Time Series of Stock Market Data  A Kasuga, Y Ohsawa, T Yoshino, S Ashida
2016   Artificial neural networks approach to the forecast of stock market price movements  L Di Persio, O Honchar
2016   Forecasting Trade Direction and Size of Future Contracts Using Deep Belief Network  A Lai, MK Li, FW Pong
