Submit a talk on data

Submit talks on data engineering, data science, machine learning, big data and analytics throughout the year 2019


Preparing AWS instances for performance- and resource-intensive applications

Submitted by Vamsi A (@vamsia) on Tuesday, 17 April 2018

Technical level: Beginner


The ASR module of Samsung’s Bixby performs the speech-to-text conversion. The accuracy of its output depends heavily on the quality of the models involved.

The ASR module processes very large models, several gigabytes in size. This heavy lifting demands substantial GPU compute and memory resources on the servers.

The processing capability required for this workload was not available on AWS until the P2 instances were launched.

The off-the-shelf P2 instances provided by AWS did not meet our needs as-is; we had to update our application as well as identify the right parameters on the P2 instances to get our solution working on AWS.
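As a starting point, a P2 instance can be brought up with the AWS CLI. This is a minimal sketch; the AMI ID, key pair name, and security group below are placeholders, not values from the original deployment.

```shell
# Launch a single p2.xlarge GPU instance (1x NVIDIA Tesla K80).
# ami-xxxxxxxx, my-key-pair, and sg-xxxxxxxx are placeholder values.
aws ec2 run-instances \
    --image-id ami-xxxxxxxx \
    --instance-type p2.xlarge \
    --key-name my-key-pair \
    --security-group-ids sg-xxxxxxxx \
    --count 1
```

Larger P2 sizes (p2.8xlarge, p2.16xlarge) follow the same pattern with more GPUs per instance.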


Kernel version

  • Compared the available kernels and chose the one best suited: CentOS kernel version 3.10.0.

      Num channels   Latency (ms), kernel 2.6.32   Latency (ms), kernel 3.10.0
      48             0.945099                      0.681114
      42             0.673124                      0.470596
      36             0.516722                      0.356459
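Once a kernel is chosen, it helps to verify at provisioning time that each instance actually runs it. A minimal sketch, assuming the 3.10.0 series is the target; the helper function name and messages are illustrative, not from the original setup.

```shell
# Hypothetical helper: check that a kernel release string belongs to the
# chosen 3.10.0 series.
check_kernel() {
    case "$1" in
        3.10.0*) echo "kernel OK" ;;
        *)       echo "kernel mismatch: $1" ;;
    esac
}

# In practice the argument would come from the running system:
check_kernel "$(uname -r)"
```

The same check can gate configuration-management runs so that latency-sensitive services never start on an unvalidated kernel.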

CUDA version

  • Chose the right version of the CUDA libraries for the NVIDIA Tesla K80 graphics card: CUDA 6.5 over CUDA 7.5.

      Num channels   Latency (ms), CUDA 7.5   Latency (ms), CUDA 6.5
      12             2.170836                 1.804922
      8              0.613931                 0.543185
      6              0.338785                 0.355863

Speaker bio

Vamsi Agasthyaraju is interested in architecting and setting up cloud infrastructure for AI projects involving STT and NLP. He is a solution architect who loves to interconnect multiple cloud components, and to evaluate the right choice of technologies.


  • Zainab Bawa (@zainabbawa) Reviewer a year ago

    Vamsi, you need to submit draft slides and preview video for evaluating this proposal.
