Embedding PyPy in uWSGI - interview with Roberto De Ioris
10-03-2014
To inaugurate this blog, we would like to present an interview with Roberto De Ioris, who is the lead developer of the uWSGI project as well as a co-founder of Unbit. We worked together with Roberto on providing an embedding interface for PyPy. Unbit is using PyPy in production for its customers and has seen performance improvements ranging from 8% to 120% and more.
Maciej Fijalkowski: What does your company do?
Roberto De Ioris: Unbit was born in 2005 as a hosting provider for developers in Italy. We were the first (in Italy) to support "bleeding edge" technologies like Ruby on Rails and Django. In 2008 we started releasing a lot of the source code we wrote in the first 3 years of the company. In 2009 we started working on the uWSGI project to have a single codebase for hosting various customers' applications. Since then the project has evolved a lot and now all of our infrastructure is based on it. Currently our main business has moved from hosting to consulting for companies that want to enter the hosting/PaaS market or for web-based agencies with scalability and availability problems.
Maciej Fijalkowski: How did you hear about PyPy?
Roberto De Ioris: The first time was in 2008 at the Italian PyCon, but to my ears it sounded pretty "freaky", as it was sold as running Python over Python (there was no JIT at that time). The first time I really became interested in it was in 2011 at EuroPython. I saw an Armin Rigo talk and started investigating. Starting from 2012 we got blasted by requests to support PyPy on uWSGI. The demos Armin Rigo (and Antonio Cuni) showed were astonishing, so our customers wanted to try it. Unfortunately we were bound to the CPython C API, so supporting PyPy at that time seemed impossible. I started studying PyPy internals; then I tried working on cpyext, PyPy's CPython compatibility layer, and I released a first "almost-working" plugin. It was really a hack and afaik no one took it seriously, especially because performance was worse than CPython :) Then you contacted me about the improvements in the cffi area and proposed an interesting approach: let's write the uWSGI plugin using cffi instead of cpyext. The first attempt was already promising; after a couple of weeks we released the first working implementation, and in summer 2013 we had the first customer using it in production. Soon after, we started investigating whether we could start using it for some of our own work. Now we have 3 apps (and a fourth is being tested) in production using PyPy, 2 based on Django and 1 (a REST API server) in pure WSGI. Currently we have improvements in raw performance (read: response times) that span from 8% to a pretty interesting 40%, but we have a peak of an astonishing 100-120% and even more. Take into account that most of our apps are simple "blocking-on-db" ones, so a 2x increase is literally money. The best thing for now is that we started adding more threads, as we experienced (even with the first plugin incarnation) improved thread management.
Maciej Fijalkowski: We're working on improved GIL handling, by the way.
Roberto De Ioris: Well, personally I find cffi a silver bullet because it allows me to write a lot of things I would have written in C directly in Python, but it is a personal need so I do not know how useful it is. :)
Maciej Fijalkowski: So tell me a few things about uWSGI.
Roberto De Ioris: I have to say that it is a very particular project; the hosting market is pretty "frustrating" from various points of view. We decided to invest in an application server after 4 years of infinite problems reported by our customers with the various app servers that were available at that time. Most of those problems were related to bad practices, but I can assure you that telling a developer his code sucks is not easy, especially when he pays you :) So the initial spirit of uWSGI was "bypassing" bad practices, and constantly monitoring the app to spot problems and so on. It may look like a winning approach, but unfortunately a bad app is a bad app and there is nothing you can do to avoid it, so we decided to change direction and tried to implement a common solution for various hosting needs. In a couple of years we were able to support different languages and technologies on top of the same code. In addition to this, our main (almost secret) objective was reducing the resource usage of the application servers, which directly translates to being able to host more customers on the same server. This resulted in a product with really good performance and really low resource usage combined with an incredible number of features (most of them targeted at sysadmins), which allowed us to move our business into other areas. Companies like booking.com (Perl) now use it, as well as other companies in the hosting market (like PythonAnywhere).
Maciej Fijalkowski: I think it is the most popular way to deploy Python if you have performance needs.
Roberto De Ioris: I suppose it is, but most of our customers choose it for features like the Emperor or Zerg mode, which simplify management.
Maciej Fijalkowski: Cool, thanks! Anything you would like to add?
Roberto De Ioris: Well, personally I am one of those people who like understanding the internals of computing, so projects like PyPy are an "amusement park". Yesterday I spotted a chat on IRC where people discussed reducing assembler instructions from 9 to 8. A lot of people consider this kind of thing annoying/boring/useless, but I really love it. It reminds me of the first year of my "computing" experience.
Maciej Fijalkowski: Thank you for your time!
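For readers unfamiliar with the "pure WSGI" style Roberto mentions for the REST API server, here is a minimal sketch of what such an application can look like. It is not Unbit's code: the `application` name, the JSON payload, and the use of `PATH_INFO` are assumptions chosen only for illustration. It relies solely on the WSGI calling convention and the standard library, so the same callable can in principle be served by any WSGI server, on CPython or PyPy.

```python
import json


def application(environ, start_response):
    """Hypothetical pure-WSGI endpoint: echoes the request path as JSON."""
    # Build the response body; WSGI requires bytes, not str.
    payload = {"path": environ.get("PATH_INFO", "/")}
    body = json.dumps(payload).encode("utf-8")

    # Report status and headers through the server-provided callable.
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])

    # A WSGI application returns an iterable of byte strings.
    return [body]
```

Because the application is just a callable taking `environ` and `start_response`, it has no dependency on the CPython C API, which is the kind of code that moves between interpreters most easily.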