2015 UsingRecurrentNeuralNetworksfor


Subject Headings:

Notes

Cited By

2016

  • (Liu & Lane, 2016) ⇒ Bing Liu, and Ian Lane. (2016). “Attention-based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling.” In: Proceedings of Interspeech-2016.

Quotes

Abstract

Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU). In this paper, we propose to use recurrent neural networks (RNNs) for this task, and present several novel architectures designed to efficiently model past and future temporal dependencies. Specifically, we implemented and compared several important RNN architectures, including Elman, Jordan, and hybrid variants. To facilitate reproducibility, we implemented these networks with the publicly available Theano neural network toolkit and completed experiments on the well-known airline travel information system (ATIS) benchmark. In addition, we compared the approaches on two custom SLU data sets from the entertainment and movies domains. Our results show that the RNN-based models outperform the conditional random field (CRF) baseline by 2% in absolute error reduction on the ATIS benchmark. We improve the state-of-the-art by 0.5% in the Entertainment domain, and 6.7% for the movies domain.
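The Elman and Jordan architectures named in the abstract differ in what is fed back into the hidden layer: an Elman network feeds back the previous hidden state, while a Jordan network feeds back the previous output distribution. The following is a minimal sketch of that difference for per-word slot tagging, not the authors' implementation. It uses plain Python rather than Theano, omits bias terms, and runs on untrained random weights; the function names, layer sizes, and weight ranges are all hypothetical choices made for illustration.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def rand_matrix(rows, cols, rng):
    """Small random weights (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def tag_sequence(xs, U, W, V, n_hidden, n_labels, kind="elman"):
    """Return one slot-label distribution per input word vector.

    kind="elman":  h_t = sigmoid(U x_t + W h_{t-1})
    kind="jordan": h_t = sigmoid(U x_t + W y_{t-1})
    In both cases  y_t = softmax(V h_t).
    """
    h = [0.0] * n_hidden  # previous hidden state
    y = [0.0] * n_labels  # previous output distribution
    outputs = []
    for x in xs:
        feedback = h if kind == "elman" else y
        h = sigmoid(add(matvec(U, x), matvec(W, feedback)))
        y = softmax(matvec(V, h))
        outputs.append(y)
    return outputs


def demo():
    """Run both variants on the same random 6-word input (hypothetical sizes)."""
    rng = random.Random(0)
    n_in, n_hidden, n_labels = 4, 5, 3
    xs = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(6)]
    U = rand_matrix(n_hidden, n_in, rng)
    V = rand_matrix(n_labels, n_hidden, rng)
    W_elman = rand_matrix(n_hidden, n_hidden, rng)   # hidden -> hidden
    W_jordan = rand_matrix(n_hidden, n_labels, rng)  # output -> hidden
    elman = tag_sequence(xs, U, W_elman, V, n_hidden, n_labels, "elman")
    jordan = tag_sequence(xs, U, W_jordan, V, n_hidden, n_labels, "jordan")
    return elman, jordan
```

Calling `demo()` returns two lists of six label distributions, one per architecture. The only structural difference between the two runs is the shape of the recurrent matrix `W` and the vector it multiplies.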

References

  • 1. G. Tur and R. De Mori, Spoken Language Understanding: Systems for Extracting Semantic Information from Speech. New York, NY, USA: Wiley, 2011.
  • 2. Robert E. Schapire, Yoram Singer, BoosTexter: A Boosting-based System for Text Categorization, Machine Learning, v.39 n.2-3, p.135-168, May-June 2000
  • 3. P. Haffner, G. Tur, and J. Wright, "Optimizing SVMs for Complex Call Classification," In: Proc. ICASSP, 2003, Pp. 632-635.
  • 4. S. Yaman, Li Deng, Dong Yu, Ye-Yi Wang, A. Acero, An Integrative and Discriminative Technique for Spoken Utterance Classification, IEEE Transactions on Audio, Speech, and Language Processing, v.16 n.6, p.1207-1214, August 2008
  • 5. Y. Wang, L. Deng, and A. Acero, "Spoken Language Understanding --An Introduction to the Statistical Framework," IEEE Signal Process. Mag., Vol. 22, No. 5, Pp. 16-31, Sep. 2005.
  • 6. John D. Lafferty, Andrew McCallum, Fernando C. N. Pereira, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, Proceedings of the Eighteenth International Conference on Machine Learning, p.282-289, June 28-July 01, 2001
  • 7. Y. Wang, L. Deng, and A. Acero, "Semantic Frame based Spoken Language Understanding," in Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, G. Tur and R. De Mori, Eds. New York, NY, USA: Wiley, 2011, Ch. 3, Pp. 35-80.
  • 8. G. E. Dahl, Dong Yu, Li Deng, A. Acero, Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, IEEE Transactions on Audio, Speech, and Language Processing, v.20 n.1, p.30-42, January 2012
  • 9. G. Mesnil, Y. Dauphin, X. Glorot, S. Rifai, Y. Bengio, I. Goodfellow, E. Lavoie, X. Muller, G. Desjardins, D. Warde-Farley, P. Vincent, A. Courville, and J. Bergstra, "Unsupervised and Transfer Learning Challenge: A Deep Learning Approach," In: Proc. JMLR W&CP: Proc. Unsupervised Transfer Learn., 2011, Vol. 7.
  • 10. A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Adv. Neural Inf. Process. Syst., Vol. 25, 2012.
  • 11. Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, Larry Heck, Learning Deep Structured Semantic Models for Web Search Using Clickthrough Data, Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, October 27-November 01, 2013, San Francisco, California, USA
  • 12. Geoffrey E. Hinton, Simon Osindero, Yee-Whye Teh, A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, v.18 n.7, p.1527-1554, July 2006
  • 13. L. Deng, G. Tur, X. He, and D. Hakkani-Tur, "Use of Kernel Deep Convex Networks and End-to-end Learning for Spoken Language Understanding," In: Proc. IEEE SLT, 2012, Pp. 210-215.
  • 14. James Bergstra, Yoshua Bengio, Random Search for Hyper-parameter Optimization, The Journal of Machine Learning Research, 13, p.281-305, 3/1/2012
  • 15. G. Mesnil, X. He, L. Deng, and Y. Bengio, "Investigation of Recurrent-Neural-Network Architectures and Learning Methods for Spoken Language Understanding," In: Proc. Interspeech, 2013.
  • 16. J. Elman, "Finding Structure in Time," Cognitive Sci., Vol. 14, No. 2, 1990.
  • 17. M. Jordan, Serial Order: A Parallel Distributed Processing Approach, Univ. of California, Inst. of Comput. Sci., San Diego, CA, USA, Tech. Rep. No. 8604, 1997.
  • 18. M. Schuster, K.K. Paliwal, Bidirectional Recurrent Neural Networks, IEEE Transactions on Signal Processing, v.45 n.11, p.2673-2681, November 1997
  • 19. A. Graves, A. Mohamed, and G. Hinton, "Speech Recognition with Deep Recurrent Neural Networks," In: Proc. ICASSP, 2013, Pp. 6645-6649.
  • 20. Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, Pavel Kuksa, Natural Language Processing (Almost) from Scratch, The Journal of Machine Learning Research, 12, p.2493-2537, 2/1/2011
  • 21. Holger Schwenk, Jean-Luc Gauvain, Training Neural Network Language Models on Very Large Corpora, Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, p.201-208, October 06-08, 2005, Vancouver, British Columbia, Canada
  • 22. T. Mikolov, S. Kombrink, L. Burget, J. Cernocky, and S. Khudanpur, "Extensions of Recurrent Neural Network based Language Model," In: Proc. ICASSP, 2011, Pp. 5528-5531.
  • 23. T. Mikolov, W. Yih, and G. Zweig, "Linguistic Regularities in Continuous Space Word Representations," In: Proc. NAACL-HLT, 2013.
  • 24. K. Yao, G. Zweig, M.-Y. Hwang, Y. Shi, and D. Yu, "Recurrent Neural Networks for Language Understanding," In: Proc. Interspeech, 2013.
  • 25. J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio, "Theano: A CPU and GPU Math Expression Compiler," In: Proc. Python for Sci. Comput. Conf. (SciPy), 2010.
  • 26. Y. Bengio, R. Ducharme, and P. Vincent, "A Neural Probabilistic Language Model," In: Proc. NIPS, 2000.
  • 27. A. Deoras and R. Sarikaya, "Deep Belief Network based Semantic Tagger for Spoken Language Understanding," In: Proc. Interspeech, 2013.
  • 28. Andrew McCallum, Dayne Freitag, Fernando C. N. Pereira, Maximum Entropy Markov Models for Information Extraction and Segmentation, Proceedings of the Seventeenth International Conference on Machine Learning, p.591-598, June 29-July 02, 2000
  • 29. G. Tur, D. Hakkani-Tur, L. Heck, and S. Parthasarathy, "Sentence Simplification for Spoken Language Understanding," In: Proc. ICASSP, 2011, Pp. 5628-5631.
  • 30. R. Sarikaya, G. E. Hinton, and B. Ramabhadran, "Deep Belief Nets for Natural Language Call-routing," In: Proc. ICASSP, 2011, Pp. 5680-5683.
  • 31. Roberto Pieraccini, Evelyne Tzoukermann, Zakhar Gorelov, Jean-Luc Gauvain, Esther Levin, Chin-Hui Lee, Jay G. Wilpon, A Speech Understanding System based on Statistical Representation of Semantics, Proceedings of the 1992 IEEE International Conference on Acoustics, Speech and Signal Processing, March 23-26, 1992, San Francisco, California
  • 32. Y.-Y. Wang and A. Acero, "Discriminative Models for Spoken Language Understanding," In: Proc. ICSLP, 2006.
  • 33. Y. He and S. Young, "A Data-driven Spoken Language Understanding System," In: Proc. IEEE ASRU, 2003, Pp. 583-588.
  • 34. C. Raymond and G. Riccardi, "Generative and Discriminative Algorithms for Spoken Language Understanding," In: Proc. Interspeech, 2007.
  • 35. Scott Miller, Robert Bobrow, Robert Ingria, Richard Schwartz, Hidden Understanding Models of Natural Language, Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics, p.25-32, June 27-30, 1994, Las Cruces, New Mexico
  • 36. M. Henderson, M. Gasic, B. Thomson, P. Tsiakoulis, K. Yu, and S. Young, "Discriminative Spoken Language Understanding Using Word Confusion Networks," In: Proc. IEEE SLT, 2012.
  • 37. Roland Kuhn, Renato De Mori, The Application of Semantic Classification Trees to Natural Language Understanding, IEEE Transactions on Pattern Analysis and Machine Intelligence, v.17 n.5, p.449-460, May 1995
  • 38. G. Tur, D. Hakkani-Tür, and L. Heck, "What is Left to Be Understood in ATIS," In: Proc. IEEE SLT, 2010.
  • 39. G. Tur, L. Deng, D. Hakkani-Tür, and X. He, "Towards Deeper Understanding: Deep Convex Networks for Semantic Utterance Classification," In: Proc. ICASSP, 2012, Pp. 5045-5048.
  • 40. A. Viterbi, Error Bounds for Convolutional Codes and An Asymptotically Optimum Decoding Algorithm, IEEE Transactions on Information Theory, v.13 n.2, p.260-269, April 1967
  • 41. K. Yao, B. Peng, G. Zweig, D. Yu, X. Li, and F. Gao, "Recurrent Conditional Random Field for Language Understanding," In: Proc. ICASSP, 2014, Pp. 4105-4009.
  • 42. K. Yao, B. Peng, G. Zweig, D. Yu, X. Li, and F. Gao, "Recurrent Conditional Random Fields," In: Proc. NIPS Deep Learn. Workshop, 2013.
  • 43. J. Peng, L. Bo, and J. Xu, "Conditional Neural Fields," In: Proc. NIPS, 2009.
  • 44. D. Yu, S. Wang, and L. Deng, "Sequential Labeling Using Deep-structured Conditional Random Fields," J. Sel. Topics Signal Process., Vol. 4, No. 6, Pp. 965-973, Dec. 2010.
  • 45. P. Xu and R. Sarikaya, "Convolutional Neural Networks based Triangular CRF for Joint Intent Detection and Slot Filling," In: Proc. ASRU, 2013.
  • 46. K. Vesely, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative Training of Deep Neural Networks," In: Proc. Interspeech, 2013.
  • 47. B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization," In: Proc. Interspeech, 2012.
  • 48. H. Su, G. Li, D. Yu, and F. Seide, "Error Back Propagation for Sequence Training of Context-dependent Deep Neural Networks for Conversational Speech Transcription," In: Proc. ICASSP, 2013, Pp. 6664-6668.
  • 49. T. Mikolov and G. Zweig, "Context Dependent Recurrent Neural Network Language Model," In: Proc. IEEE SLT, 2012, Pp. 234-239.
  • 50. Y. Dauphin, G. Tur, D. Hakkani-Tur, and L. Heck, "Zero-shot Learning and Clustering for Semantic Utterance Classification," In: Proc. Int. Conf. Learn. Represent. (ICLR), 2013.
  • 51. J. Liu, S. Cyphers, P. Pasupat, I. McGraw, and J. Glass, "A Conversational Movie Search System based on Conditional Random Fields," In: Proc. Interspeech, 2012.
  • 52. Taku Kudo, Yuji Matsumoto, Chunking with Support Vector Machines, Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies, p.1-8, June 01-07, 2001, Pittsburgh, Pennsylvania
  • 53. M. Macherey, F. Och, and H. Ney, "Natural Language Understanding Using Statistical Machine Translation," In: Proc. Eur. Conf. Speech Commun. Technol., 2001, Pp. 2205-2208.
  • 54. M. Jeong and G. Lee, "Structures for Spoken Language Understanding: A Two-step Approach," In: Proc. ICASSP, 2007, Pp. 141-144.
  • 55. V. Zue and J. Glass, "Conversational Interface: Advances and Challenges," In: Proc. IEEE, Vol. 88, No. 8, Pp. 1166-1180, Aug. 2000.
  • 56. K. Yao, B. Peng, Y. Zhang, D. Yu, G. Zweig, and Y. Shi, "Spoken Language Understanding Using Long Short-term Memory Neural Networks," In: Proc. IEEE SLT, 2014.
  • 57. Yoshua Bengio, Learning Deep Architectures for AI, Now Publishers Inc., Hanover, MA, 2009
  • 58. Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, Grégoire Mesnil, A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval, Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, November 03-07, 2014, Shanghai, China
  • 59. Richard Socher, Brody Huval, Christopher D. Manning, Andrew Y. Ng, Semantic Compositionality through Recursive Matrix-vector Spaces, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, July 12-14, 2012, Jeju Island, Korea
  • 60. W. Yih, X. He, and C. Meek, "Semantic Parsing for Single-relation Question Answering," In: Proc. ACL, 2014.
  • 61. M. Yu, T. Zhao, D. Dong, H. Tian, and D. Yu, "Compound Embedding Features for Semi-supervised Learning," In: Proc. NAACL-HLT, 2013, Pp. 563-568.
  • 62. D. Hakkani-Tur, L. Heck, and G. Tur, "Exploiting Query Click Logs for Utterance Domain Detection in Spoken Language Understanding," In: Proc. ICASSP, 2011, Pp. 5636-5639.
  • 63. L. Heck and D. Hakkani-Tur, "Exploiting the Semantic Web for Unsupervised Spoken Language Understanding," In: Proc. IEEE-SLT, 2012, Pp. 228-233.
  • 64. L. Heck and H. Huang, "Deep Learning of Knowledge Graph Embeddings for Semantic Parsing of Twitter Dialogs," In: Proc. IEEE Global Conf. Signal Inf. Process., 2014.
  • 65. Li Deng, Dong Yu, Deep Learning: Methods and Applications, Now Publishers Inc., Hanover, MA, 2014
  • 66. J. Gao, P. Pantel, M. Gamon, H. He, and L. Deng, "Modeling Interestingness with Deep Neural Networks," In: Proc. EMNLP, 2014, Pp. 2-13.
  • 67. X. He and L. Deng, "Speech-centric Information Processing: An Optimization-oriented Approach," In: Proc. IEEE, Vol. 101, No. 5, Pp. 1116-1135, May 2013.
  • 68. X. He and L. Deng, "Speech Recognition, Machine Translation, and Speech Translation--A Unified Discriminative Learning Paradigm," IEEE Signal Process. Mag., Vol. 28, No. 5, Pp. 126-133, Sep. 2011.
  • 69. G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups," IEEE Signal Process. Mag., Vol. 29, No. 6, Pp. 82-97, Nov. 2012.
  • 70. Dong Yu, Li Deng, Automatic Speech Recognition: A Deep Learning Approach, Springer Publishing Company, Incorporated, 2014



2015 UsingRecurrentNeuralNetworksfor

Author: Yoshua Bengio, Dong Yu, Li Deng, Geoffrey Zweig, Xiaodong He, Larry Heck, Yann N. Dauphin, Grégoire Mesnil, Kaisheng Yao, Dilek Hakkani-Tur, Gokhan Tur
Title: Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding
DOI: 10.1109/TASLP.2014.2383614
Year: 2015