Anthill Inside 2019

A conference on AI and Deep Learning

Nickil Maveli


Efficient Machine Translation for low resource languages using Transformers

Submitted Nov 5, 2019

The Transformer is the first transduction model to rely entirely on self-attention to compute representations of its input and output, without using sequence-aligned RNNs or convolutions. OpenAI has used Transformers in its language models, and DeepMind used them in AlphaStar, the program that defeated a top professional StarCraft player.
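As a rough illustration of what "relying entirely on self-attention" means, here is a minimal single-head sketch of scaled dot-product self-attention in plain Python. It is not the talk's code: the helper names (`softmax`, `matmul`, `self_attention`), the toy input, and the identity projection matrices are all assumptions chosen to keep the example small.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one sequence (one head, no masking)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    # Score every position against every other position, scaled by sqrt(d_k).
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Each row of weights is a probability distribution over input positions.
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Toy input: 3 tokens with model width 2. Identity projections are an
# illustrative assumption; a trained model would learn these matrices.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each output row is a weighted average of all value vectors, so every token's representation can draw on the whole sequence in one step, with no recurrence.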

Key Takeaways

Build a translation system for language pairs with scarce parallel sentence corpora that still achieves relatively high BLEU scores.


Section 1.

  1. Transformer Model Architecture
    a. Encoder [Theory + Code]
    b. Decoder [Theory + Code]
  2. Self-Attention [Theory + Code]
  3. Multi-head Attention [Theory + Code]
  4. Positional Encoding [Theory + Code]
  5. Note on BLEU Score
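For the positional encoding item, one possible sketch is shown here. It assumes the fixed sinusoidal scheme from the original Transformer paper; the talk may cover a different variant (for example, learned position embeddings). The function name `positional_encoding` and the sizes used are hypothetical.

```python
import math

def positional_encoding(num_positions, d_model):
    """Sinusoidal encodings: sine on even dimensions, cosine on odd dimensions."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the same frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

# Illustrative sizes only: 4 positions, model width 6.
pe = positional_encoding(num_positions=4, d_model=6)
```

Because self-attention is order-agnostic on its own, these vectors are typically added to the token embeddings so the model can distinguish positions.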

Section 2.

  1. Solving a Real-World Translation Problem with Low-Resource Data
  2. Attention Visualization
  3. Translation Results


Requirements: basic familiarity with neural networks and linear algebra.

Speaker bio

I have over 3 years of industry experience in data science. In my current role as a data scientist (NLP), I have built models for Parse Classification, Unsupervised Synonym Detection, Identifying Code Mixing in text, etc. I have also participated in numerous data science competitions on Kaggle, AnalyticsVidhya, Topcoder, Crowdanalytix, etc., and finished in the top 10 in at least a dozen of them.

Specialties: data science, machine learning, predictive modelling, natural language processing, deep learning, big data, artificial intelligence.



