14 June 2018

In this paper the authors use a sequence-to-sequence Transformer-based architecture with BPE sub-word units for the multilingual ASR task.
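
As a rough idea of how such BPE sub-word targets could be produced, here is a minimal sketch using the sentencepiece toolkit; the corpus file name, vocabulary size, and model prefix are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of building BPE sub-word units with sentencepiece.
# File name, vocab size, and model prefix are illustrative assumptions.
import sentencepiece as spm

# Train a BPE model on the pooled multilingual transcripts
# (one transcript per line in the assumed input file).
spm.SentencePieceTrainer.train(
    input="multilingual_transcripts.txt",
    model_prefix="bpe_multilingual",
    vocab_size=8000,
    model_type="bpe",
)

sp = spm.SentencePieceProcessor(model_file="bpe_multilingual.model")

# Encode a transcript into the BPE sub-word units used as decoder targets.
print(sp.encode("guten morgen", out_type=str))
```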

The authors experiment with the following configurations (a sketch of how the LANG token could be attached to the targets follows the list):

  1. No language information
  2. Language information during training
    1. Adding a LANG token at the start of each sub-word
    2. Adding a LANG token at the end of each sub-word
      1. Gives slightly better results than adding the token at the start.
  3. Language information during both training and testing
    1. Gives the best results, because it alleviates the confusion between languages at test time.
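
A minimal sketch of these target variants, assuming the LANG token is prepended or appended to the BPE sub-word sequence of each utterance; the `<de>` token, the example sub-words, and the `make_targets` helper are hypothetical illustrations, not the paper's exact implementation.

```python
# Sketch of attaching a LANG token to the BPE target sequence of one utterance.
# Token names, example sub-words, and the helper are illustrative assumptions.
from typing import List


def make_targets(subwords: List[str], lang: str, mode: str) -> List[str]:
    """Build the decoder target sequence for one utterance.

    mode: "none"  -> no language information
          "start" -> LANG token prepended to the sub-word sequence
          "end"   -> LANG token appended to the sub-word sequence
    """
    lang_token = f"<{lang}>"
    if mode == "none":
        return list(subwords)
    if mode == "start":
        return [lang_token] + list(subwords)
    if mode == "end":
        return list(subwords) + [lang_token]
    raise ValueError(f"unknown mode: {mode}")


subwords = ["▁gu", "ten", "▁mor", "gen"]
print(make_targets(subwords, "de", "none"))   # ['▁gu', 'ten', '▁mor', 'gen']
print(make_targets(subwords, "de", "start"))  # ['<de>', '▁gu', 'ten', '▁mor', 'gen']
print(make_targets(subwords, "de", "end"))    # ['▁gu', 'ten', '▁mor', 'gen', '<de>']

# In the "training + testing" setting, the known LANG token can additionally be
# forced as the first decoder symbol at inference time instead of letting the
# model predict it, which removes the confusion between languages during decoding.
```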