Rootconf Pune edition

On security, network engineering and distributed systems



Manas Malik


Framework For Lossless Data Compression Using Python

Submitted Jul 26, 2019

A lot has been done in the field of data compression, yet we still lack a convenient way to compress the files we use every day. There are specific online tools that let users compress and save files, but the content we stream, whether a Netflix video or gameplay footage, consumes more data than a user can keep track of. Back-end developers are well aware of this, and as developers we have acknowledged the problem but have not yet made the solution easy to use. Since users will never be concerned with compression themselves, developers can take the initiative and build compression into the application beforehand. We have decided to create a framework that provides all the functionality a developer needs to add this feature.

We are building the framework in Python. I’m a big fan of Python, mostly because it has a vibrant developer community that has helped produce an unparalleled collection of libraries that enable one to add features to applications quickly. The Python zlib module provides a Python interface to the zlib C library, which is a higher-level abstraction for the DEFLATE lossless compression algorithm. Beyond that, there is a lot to handle, including the audio, video and subtitles of a file. We also make use of the fabulous ffmpy library, which provides access to the ffmpeg command-line utility. ffmpeg can perform several different kinds of transformations on video files, including video compression, which is its most commonly requested feature. Frame rate and audio synchronization are a few other parameters to watch closely.

This is an ongoing project and a few implementation aspects remain open; data compression remains a concern at the design stage. Together with the Python community, we intend to solve this issue.
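As a small illustration of the zlib interface mentioned above, the following sketch compresses a byte string and checks that decompression restores it exactly. The helper names `compress_bytes` and `decompress_bytes` and the sample subtitle-like text are assumptions made for this example; they are not part of the proposed framework.

```python
import zlib


def compress_bytes(data: bytes, level: int = 9) -> bytes:
    """Compress data with DEFLATE via zlib (hypothetical helper name)."""
    return zlib.compress(data, level)


def decompress_bytes(blob: bytes) -> bytes:
    """Reverse compress_bytes and return the original bytes."""
    return zlib.decompress(blob)


# Made-up, repetitive sample data standing in for a subtitle track.
sample = b"00:00:01 --> 00:00:02 hello world\n" * 200

packed = compress_bytes(sample)
restored = decompress_bytes(packed)

# Lossless means the round trip returns exactly the input bytes.
print("round trip identical:", restored == sample)
print(f"{len(sample)} bytes -> {len(packed)} bytes")
```

Repetitive text such as subtitles compresses well with DEFLATE. Video that has already been encoded typically gains far less from a general-purpose compressor, which is one reason a dedicated tool like ffmpeg is used for that part.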


This talk covers a study of the different compression techniques currently in use. Drawing on existing libraries and theoretical concepts, we intend to create a new framework for developers; it would be the first framework in Python to provide all the necessary features for compression. The most important target is video files, which account for much of the data consumed on a daily basis. Handling each frame of a video can be an efficient and lossless process, but so far no application does it quickly. The field also demands collaboration with tools beyond the existing compression tools. As Python becomes more accessible to developers, it is convenient for them to grasp algorithms and develop in the language. The theoretical concepts are based on the studies published so far; the practical libraries in use are very specific and not very efficient. The framework includes all the code and algorithms needed for lossless compression. Existing Python libraries that work with frames and frame rates in the background can be made more efficient.

The compression process splits each image into multiple patches. For each patch, the (recurrent) location of each pixel is found. The estimated locations of every pixel, i.e. for the entire image, are placed before the pixel values. After each frame has been compressed, we use the LSB algorithm to perform the embedding. MSE, PSNR and SSIM are a few of the metrics that will determine the superiority and relevance of the proposed framework.
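To make the LSB embedding and the quality metrics more concrete, here is a minimal sketch in pure Python. It assumes a frame is a flat list of 8-bit grayscale pixel values and the payload is a list of bits. The proposal does not specify how the framework applies LSB or computes its metrics, so the function names and the toy data are illustrative only. SSIM is omitted because it needs windowed statistics that do not fit in a short example.

```python
import math


def embed_lsb(pixels, bits):
    """Write each payload bit into the least significant bit of one pixel."""
    if len(bits) > len(pixels):
        raise ValueError("payload is longer than the cover")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the lowest bit, then set it
    return out


def extract_lsb(pixels, count):
    """Read back the first `count` embedded bits."""
    return [p & 1 for p in pixels[:count]]


def mse(a, b):
    """Mean squared error between two equally sized pixel lists."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


# Toy 8-pixel "frame" and 8-bit payload (made-up values).
cover = [52, 55, 61, 66, 70, 61, 64, 73]
payload = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed_lsb(cover, payload)
print("payload recovered:", extract_lsb(stego, len(payload)) == payload)
print("MSE:", mse(cover, stego))
print("PSNR (dB):", round(psnr(cover, stego), 2))
```

LSB embedding changes each touched pixel by at most 1, so the payload is recovered exactly while the carrier frame differs slightly from the original. MSE and PSNR quantify that difference.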


They just have to come. :)

Speaker bio

We presented our idea for data compression at PyCon Cleveland 2019 and received great appreciation from members of the community. We have been working on a Python project for a while, and it is now nearing completion. The project helps refugees, and organizations that help minorities and immigrants, to settle and make a living. I have tried my best to keep the interface of my refugee data camp project minimal, and it has the potential to become a generic product. My team and I are currently working on a framework for lossless data compression using Python. I have published an international research paper on the theoretical aspects of data compression, and we are now looking at the implementation. I have always been a Python fanatic and love playing with the language. I am also looking forward to new projects and have been looking into open source projects such as warehouse and zulip. I have reached the point where I want to explore more and more. I have worked on GUI apps using tkinter, and with many machine learning libraries, including numpy, pandas and scikit-learn, along with other tools and projects. Contributing to the language and its future are what fascinate me now. I have further plans with Python and would love to share the ideas I am carrying. Interacting with Python users and developers will help me a lot with my research and projects; beyond me, collaborating with people can help society and the products that may emerge from these interactions. This talk will be a great catalyst between my current state and my goals, and it will help me collaborate with other users and developers. I have been following talks on YouTube for a long time and have been impressed by the projects demonstrated by community members. I too have many ideas that I would like to share with, and contribute to, the community. A lot of experience and interaction is shared collectively at the conference, which can help me and my team with our research.



