Building reproducible Python applications for secured environments
Submitted by Kushal Das (@kushaldas) on Thursday, 14 March 2019
Technical level: Intermediate
We all have to package Python-based applications for various environments, from command line tools to web applications. Depending on the users, an application may be installed on thousands of computers or on a select few systems. https://pypi.org is our go-to place for finding dependencies, and most of the time we install binary wheels directly from there, which saves a lot of time.
But Python is also used in many environments where security is of the utmost importance, and validating the dependencies of a project is as critical as validating the project's own source code. Many of us noticed the recent incident where attackers were able to steal bitcoins using a popular library. This talk will take the SecureDrop client application for journalists as an example project and show how we tried to tackle a similar problem. SecureDrop is an open source whistleblower system which is deployed at over 75 news organizations all over the world. Our threat model includes nation state actors as possible threats, so the security and privacy of the users of the system are a very important part of the whole project. The tools in this case are built and packaged into reproducible Debian deb packages and are installed on Qubes OS on the final end user systems.
There are two basic ways we handle Python project dependencies. For most development work, we use a virtualenv and install the dependencies directly using wheels from pypi.org. When we package the application for end users, we often use an operating system based package format (say RPM or Debian's deb package) and ask the users to install it with the OS package manager. In the second case, all the dependencies come as separate packages (most of the time from the OS itself), and the OS package manager handles dependency resolution. In that case we cannot update the dependencies fast enough when required, because that depends on the community packagers who maintain those packages in the distribution.
We use the dh-virtualenv project to ship our own wheels plus a virtualenv for the project inside the Debian .deb package. This talk will go through the process of building wheels from known source tarballs (identified by their sha256sum), and then maintaining a GPG-signed list of the built wheels and a private index for them. It will also show how we verify the wheels' sha256sums (and the signature of that list) during the build process. The final output is reproducible Debian packages.
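The hash check described above can be sketched roughly as follows. This is only an illustration, not the SecureDrop build code: the function names are hypothetical, the list is assumed to be in the usual `sha256sum` output format, and its GPG signature is assumed to have been checked beforehand (for example with `gpg --verify`).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex sha256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sha256sums(text: str) -> dict[str, str]:
    """Parse 'CHECKSUM  FILENAME' lines (assumed sha256sum format)."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        checksum, name = line.split(maxsplit=1)
        # sha256sum marks binary mode with a leading '*' on the name.
        expected[name.lstrip("*")] = checksum.lower()
    return expected


def verify_wheels(wheel_dir: Path, sums_text: str) -> list[str]:
    """Return a list of problems; an empty list means every wheel matched.

    sums_text is assumed to come from a list whose signature was
    already verified; this function does not check any signature.
    """
    expected = parse_sha256sums(sums_text)
    problems = []
    for wheel in sorted(wheel_dir.glob("*.whl")):
        want = expected.get(wheel.name)
        if want is None:
            problems.append(f"{wheel.name}: not in signed list")
        elif sha256_of(wheel) != want:
            problems.append(f"{wheel.name}: sha256 mismatch")
    return problems
```

A build script could abort as soon as `verify_wheels` returns a non-empty list, so that an unknown or altered wheel never reaches the package.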
Each part of the talk will tell what steps are done (in a sentence or two) and then explain why those are necessary.
Keeping the final artifacts that get installed on systems in a secured environment trustworthy is always a challenge. There are various attack vectors, from malware in the source to tampering in the build process. People also want to be able to verify any binary package by building the exact same artifact from the same source themselves.
This talk is for developers, administrators (or devops people), and architects of such secured environments. The talk also tries to identify ways in which similar approaches can be used by smaller projects or teams, or by enterprise projects running on old operating systems.
- Introduction - 1 minute
- Why all of these painful steps? - 2 minutes
- SecureDrop client desktop tools and their dependency on other upstream projects (or think about an application structure and standard deployment strategy) - 3 minutes
- Updating dependencies or do we read all updates? - 2 minutes
- Development environment and using pipenv + tools to create requirements.txt with hashes only for source - 3 minutes
- Structure of a static HTML based private package index - 4 minutes
- GPG signed list of already built wheels + syncing them locally - 2 minutes
- Running python3 setup.py sdist to create the release tarball + a step before to have a requirements.txt with only binary hashes from our list of wheels - 5 minutes
- Final Debian packaging script (for automation) which does double verification of the wheel hashes - 3 minutes
- Reproducible Debian package as end product - 2 minutes
- Possibility in the RPM land - 1 minute
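
As a rough idea of what the "static HTML based private package index" item covers, here is a minimal sketch that lays out a directory of wheels in the "simple" repository layout pip understands (one directory per project, each with an index.html whose links carry a `#sha256=` fragment). The function names and directory arguments are hypothetical, and this is not the tooling used by SecureDrop.

```python
import hashlib
import html
import re
from collections import defaultdict
from pathlib import Path


def normalize(name: str) -> str:
    """Normalize a project name the way the simple repository API expects."""
    return re.sub(r"[-_.]+", "-", name).lower()


def build_index(wheel_dir: Path, out_dir: Path) -> list[str]:
    """Copy wheels into out_dir/<project>/ and write static index pages.

    Returns the sorted list of normalized project names.
    """
    projects = defaultdict(list)
    for wheel in sorted(wheel_dir.glob("*.whl")):
        # The distribution name is the first '-' separated field
        # of a wheel filename.
        projects[normalize(wheel.name.split("-")[0])].append(wheel)

    out_dir.mkdir(parents=True, exist_ok=True)
    for project, wheels in projects.items():
        project_dir = out_dir / project
        project_dir.mkdir(exist_ok=True)
        links = []
        for wheel in wheels:
            data = wheel.read_bytes()
            (project_dir / wheel.name).write_bytes(data)
            digest = hashlib.sha256(data).hexdigest()
            name = html.escape(wheel.name)
            # The fragment lets the installer check the download's hash.
            links.append(f'<a href="{name}#sha256={digest}">{name}</a><br>')
        page = "<!DOCTYPE html>\n<html><body>\n" + "\n".join(links) + "\n</body></html>\n"
        (project_dir / "index.html").write_text(page)

    # Root page: one link per project directory.
    root_links = [f'<a href="{p}/">{p}</a><br>' for p in sorted(projects)]
    root = "<!DOCTYPE html>\n<html><body>\n" + "\n".join(root_links) + "\n</body></html>\n"
    (out_dir / "index.html").write_text(root)
    return sorted(projects)
```

Once served from a web server, such a directory could be passed to pip through its `--index-url` option, together with `--require-hashes` and a requirements file that pins the expected hashes.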
Kushal Das is a public interest technologist who is a maintainer of the SecureDrop project and part of the Tor Project core team. He is a CPython core developer and a director of the Python Software Foundation. He has given talks at various conferences, including previous PyCons (and once at a previous edition of rootconf); a list of such talks is at https://fedoraproject.org/wiki/User:Kushal#Talks_.26_Workshops
Kushal is currently working as a staff member of the Freedom of the Press Foundation.