Journal Articles

  1. Cooke, M., García Lecumberri, M. L., Barker, J., & Marxer, R. (2019). Lexical frequency effects in English and Spanish word misperceptions. The Journal of the Acoustical Society of America, 145(2), EL136–EL141. http://doi.org/10.1121/1.5090196 Link
  2. Alghamdi, N., Maddock, S., Marxer, R., Barker, J., & Brown, G. (2018). A corpus of audio-visual Lombard speech with frontal and profile views. The Journal of the Acoustical Society of America, 143(6), EL523–EL529. http://doi.org/10.1121/1.5042758 Link
  3. Marxer, R., Barker, J., Alghamdi, N., & Maddock, S. (2018). The impact of the Lombard effect on audio and visual speech recognition systems. Speech Communication, 100, 58–68. http://doi.org/10.1016/j.specom.2018.04.006 Link
  4. Moore, R. K., Thill, S., & Marxer, R. (2017). Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR) (Dagstuhl Seminar 16442). Dagstuhl Reports, 6(10), 154–194. http://doi.org/10.4230/DagRep.6.10.154 Link
  5. Malavasi, M., Turri, E., Atria, J. J., Christensen, H., Marxer, R., Desideri, L., … Green, P. (2017). An Innovative Speech-Based User Interface for Smarthomes and IoT Solutions to Help People with Speech and Motor Disabilities. Studies in Health Technology and Informatics, 242(Harnessing the Power of Technology to Improve Lives), 306–313. http://doi.org/10.3233/978-1-61499-798-6-306 Link
  6. Vincent, E., Watanabe, S., Nugraha, A. A., Barker, J., & Marxer, R. (2017). An analysis of environment, microphone and data simulation mismatches in robust speech recognition. Computer Speech & Language, 46, 535–557. http://doi.org/10.1016/j.csl.2016.11.005 Link
  7. Barker, J., Marxer, R., Vincent, E., & Watanabe, S. (2017). The third ‘CHiME’ speech separation and recognition challenge: Analysis and outcomes. Computer Speech & Language, 46, 605–626. http://doi.org/10.1016/j.csl.2016.10.005 Link
  8. Marxer, R., Barker, J., Cooke, M., & García Lecumberri, M. L. (2016). A corpus of noise-induced word misperceptions for English. The Journal of the Acoustical Society of America, 140(5), EL458–EL463. http://doi.org/10.1121/1.4967185 Link
  9. Moore, R. K., Marxer, R., & Thill, S. (2016). Vocal Interactivity in-and-between Humans, Animals, and Robots. Frontiers in Robotics and AI, 3. http://doi.org/10.3389/frobt.2016.00061 Link
  10. Bosch, J. J., Marxer, R., & Gómez, E. (2016). Evaluation and combination of pitch estimation methods for melody extraction in symphonic classical music. Journal of New Music Research, 45(2), 101–117. http://doi.org/10.1080/09298215.2016.1182191 Link
  11. Marxer, R., & Purwins, H. (2016). Unsupervised Incremental Online Learning and Prediction of Musical Audio Signals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(5), 863–874. http://doi.org/10.1109/TASLP.2016.2530409 Link
  12. Marxer, R. (2012). El arte generativo y la belleza de los procesos [Generative art and the beauty of processes]. Novática, (216), 51–56. Retrieved from http://www2.ati.es/novatica/2012/216/nv216sum.html
  13. Hazan, A., Marxer, R., Brossier, P., Purwins, H., Herrera, P., & Serra, X. (2009). What/when causal expectation modelling applied to audio signals. Connection Science, 21(2-3), 119–143. http://doi.org/10.1080/09540090902733764 Link
  14. Purwins, H., Grachten, M., Herrera, P., Hazan, A., Marxer, R., & Serra, X. (2008). Computational models of music perception and cognition II: Domain-specific music processing. Physics of Life Reviews, 5(3), 169–182. http://doi.org/10.1016/j.plrev.2008.03.004 Link
  15. Purwins, H., Herrera, P., Grachten, M., Hazan, A., Marxer, R., & Serra, X. (2008). Computational models of music perception and cognition I: The perceptual and cognitive processing chain. Physics of Life Reviews, 5(3), 151–168. http://doi.org/10.1016/j.plrev.2008.03.004 Link

Book Chapters

  1. Barker, J., Marxer, R., Vincent, E., & Watanabe, S. (2016). The CHiME challenges: Robust speech recognition in everyday environments. In New Era for Robust Speech Recognition: Exploiting Deep Learning. Springer. Retrieved from https://hal.inria.fr/hal-01383263

Patents

  1. Bonada, J., Janer, J., Marxer, R., Umeyama, Y., Kondo, K., & Garcia, F. (2015, December 29). Technique for estimating particular audio component (Version 1). Retrieved from https://www.google.com/patents/US9224406
  2. Bonada, J., Janer, J., Marxer, R., Umeyama, Y., & Kondo, K. (2015, June 30). Technique for suppressing particular audio component (Version 1). Retrieved from https://www.google.com/patents/US9070370
  3. Umeyama, Y., Kondo, K., Takahashi, Y., Bonada, J., Janer, J., & Marxer, R. (2015, April 7). Technique for estimating particular audio component (Version 1). Retrieved from https://www.google.com/patents/US9002035

Conference Articles

  1. Gogate, M., Adeel, A., Marxer, R., Barker, J., & Hussain, A. (2018). DNN Driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation. In Interspeech 2018 (pp. 2723–2727). Hyderabad, India: ISCA. http://doi.org/10.21437/Interspeech.2018-2516 Link
  2. Marxer, R., & Barker, J. (2017). Binary Mask Estimation Strategies for Constrained Imputation-Based Speech Enhancement. In Proc. Interspeech 2017 (pp. 1988–1992). http://doi.org/10.21437/Interspeech.2017-1257 Link
  3. Abel, A., Marxer, R., Barker, J., Watt, R., Whitmer, B., Derleth, P., & Hussain, A. (2016). A Data Driven Approach to Audiovisual Speech Mapping. In C.-L. Liu, A. Hussain, B. Luo, K. C. Tan, Y. Zeng, & Z. Zhang (Eds.), Advances in Brain Inspired Cognitive Systems - 8th International Conference, BICS 2016, Beijing, China, November 28-30, 2016, Proceedings (Vol. 10023, pp. 331–342). http://doi.org/10.1007/978-3-319-49685-6_30 Link
  4. Moore, R. K., & Marxer, R. (2016). Progress and prospects for spoken language technology: Results from four sexennial surveys. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 08-12-September-2016, pp. 3012–3016). http://doi.org/10.21437/Interspeech.2016-948 Link
  5. García Lecumberri, M. L., Barker, J., Marxer, R., & Cooke, M. (2016). Language effects in noise-induced word misperceptions. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 08-12-September-2016, pp. 640–644). http://doi.org/10.21437/Interspeech.2016-330 Link
  6. Green, P., Marxer, R., Cunningham, S., Christensen, H., Rudzicz, F., Yancheva, M., … Tamburini, F. (2016). CloudCAST - Remote speech technology for speech professionals. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 08-12-September-2016, pp. 1608–1612). http://doi.org/10.21437/Interspeech.2016-148 Link
  7. Barker, J., Marxer, R., Vincent, E., & Watanabe, S. (2015). The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015 (pp. 504–511). http://doi.org/10.1109/ASRU.2015.7404837 Link
  8. Ma, N., Marxer, R., Barker, J., & Brown, G. J. (2015). Exploiting synchrony spectra and deep neural networks for noise-robust automatic speech recognition. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015 (pp. 490–495). http://doi.org/10.1109/ASRU.2015.7404835 Link
  9. Casanueva, I., Hain, T., Christensen, H., Marxer, R., & Green, P. (2015). Knowledge transfer between speakers for personalised dialogue management. In Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp. 12–21). Prague, Czech Republic: Association for Computational Linguistics.
  10. Marxer, R., Cooke, M., & Barker, J. (2015). A framework for the evaluation of microscopic intelligibility models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2015-January, pp. 2558–2562).
  11. Janer, J., & Marxer, R. (2013). Separation of unvoiced fricatives in singing voice mixtures with semi-supervised NMF. In Proceedings of the 16th International Conference on Digital Audio Effects (DAFx-13), Maynooth, Ireland.
  12. Marxer, R., & Janer, J. (2013). Modelling and Separation of Singing Voice Breathiness in Polyphonic Mixtures. In Proceedings of the 16th International Conference on Digital Audio Effects (DAFx-13), Maynooth, Ireland.
  13. Marxer, R., & Janer, J. (2013). Low-latency bass separation using harmonic-percussion decomposition. In Proceedings of the 16th International Conference on Digital Audio Effects (DAFx-13), Maynooth, Ireland.
  14. Bosch, J. J., Kondo, K., Marxer, R., & Janer, J. (2012). Score-informed and timbre independent lead instrument separation in real-world scenarios. In 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO) (pp. 2417–2421). IEEE.
  15. Marxer, R., Janer, J., & Bonada, J. (2012). Low-latency instrument separation in polyphonic audio using timbre models. In F. J. Theis, A. Cichocki, A. Yeredor, & M. Zibulevsky (Eds.), Latent Variable Analysis and Signal Separation: Proceedings of the 10th International Conference, LVA/ICA 2012, Tel Aviv, Israel (Vol. 7191, pp. 314–321). Berlin, Heidelberg: Springer. http://doi.org/10.1007/978-3-642-28551-6_39 Link
  16. Janer, J., Marxer, R., & Arimoto, K. (2012). Combining a harmonic-based NMF decomposition with transient analysis for instantaneous percussion separation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 281–284). IEEE. http://doi.org/10.1109/ICASSP.2012.6287872 Link
  17. Marxer, R., & Janer, J. (2012). A Tikhonov regularization method for spectrum decomposition in low latency audio source separation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 277–280). IEEE. http://doi.org/10.1109/ICASSP.2012.6287871 Link
  18. Poupard, M., Ferrari, M., Schlüter, J., Marxer, R., Giraudet, P., Barchasz, V., … Glotin, H. (accepted). Real-time passive acoustic 3D tracking of deep diving cetacean by small non-uniform mobile surface antenna. In IEEE International Conference on Acoustics, Speech and Signal Processing 2019.

Thesis

  1. Marxer, R. (2013, September). Audio Source Separation in Low-latency and High-latency Scenarios (PhD thesis). Universitat Pompeu Fabra, Barcelona, Spain. Retrieved from http://ricardmarxer.com/phd/thesis_revised.pdf

Other

  1. Hazan, A., Brossier, P., Marxer, R., & Purwins, H. (2008). What/when causal expectation modelling applied to percussive audio. The Journal of the Acoustical Society of America, 123(5), 3800. Acoustical Society of America. Link
  2. Marxer, R., Holonowicz, P., Hazan, A., & Purwins, H. (2008). Dynamical hierarchical self-organization of harmonic and motivic musical categories. The Journal of the Acoustical Society of America, 123(5), 3800. Acoustical Society of America. Link