
ātmādhyaya

thoughts on tech, programming, parallelism etc.,

The Eagle Has Landed

Over the summer I’ve been working on adding support for memory consistency to QEMU. The work started as part of Google Summer of Code (GSoC), and I plan to continue it in my spare time.

GSoC is a learning experience which introduces students to open source communities. Students are supposed to learn how to contribute and get involved in the development community. The ideal outcome of the GSoC program is to see your work included in the upstream codebase. More often than not, the code by the student does not make it upstream. It lives somewhere in an experimental branch.

Given this likely scenario, I am really excited that my code contribution has made it upstream! The MTTCG project in QEMU seems to be in its final stages, with almost all patches ready to be included apart from some minor nits. I can’t wait for it all to land in the master branch.

QEMU 2.7 Statistics

I had a great time hacking on QEMU over the summer as part of Google Summer of Code. If you are interested in the details, my mentor, Alex Bennée, has written an excellent summary of the project on lwn.net. While I eagerly wait for my patches to be merged upstream, I decided to analyze some statistics in the most recent release of QEMU, version 2.7. This analysis is done using the gitdm tool written by Jonathan Corbet for analyzing the Linux Kernel releases. The current analysis is neither as thorough nor as accurate as the analysis done for Linux. It is just an amateur attempt at understanding the developer world of QEMU.

QEMU has a lot in common with the Linux kernel, which I guess is due to the overlap of developers between the two projects. Both projects are written mainly in C and use the dreaded checkpatch script in their patch submission workflow. Technical discussion (posting patches, reviews, etc.) happens on the mailing list, although quite a few QEMU developers also hang out on IRC to discuss things and to help users facing problems with QEMU.

With that background information, let us look at the version 2.7 release of QEMU. The releases in QEMU are tagged by Peter Maydell who also merges in the changesets from the pull requests posted by various subsystem maintainers. In the 2.7 release there were a total of 2292 non-merge commits in the tree, as compared to 2427 commits in the 2.6 release. Individual developers with the most commits are listed below.

QEMU Developers in 2.7 release

Name                  Number of Commits   Percentage
Eric Blake            169                 7.4%
Peter Maydell         156                 6.8%
Paolo Bonzini         155                 6.8%
Kevin Wolf            118                 5.1%
Daniel P. Berrange    102                 4.5%
Igor Mammedov         77                  3.4%
Marc-André Lureau     71                  3.1%
Xiaoqiang Zhao        55                  2.4%
Eduardo Habkost       54                  2.4%
Fam Zheng             53                  2.3%
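
For readers who want to reproduce a table like the one above without gitdm, here is a minimal Python sketch. It is not how gitdm works internally; it simply counts author names from `git log`. The function names `git_authors` and `author_shares` are made up for this example, and the revision range in the usage note assumes the releases are tagged `v2.6.0` and `v2.7.0`.

```python
import subprocess
from collections import Counter


def git_authors(repo, rev_range):
    """Return one author name per non-merge commit in rev_range."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--no-merges", "--format=%aN", rev_range],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return out.splitlines()


def author_shares(author_lines, top=10):
    """Return (name, commits, percent) tuples for the most frequent authors."""
    names = [line.strip() for line in author_lines if line.strip()]
    counts = Counter(names)
    total = sum(counts.values())
    return [
        (name, n, round(100.0 * n / total, 1))
        for name, n in counts.most_common(top)
    ]
```

Something like `author_shares(git_authors("qemu", "v2.6.0..v2.7.0"))` should give numbers in the same ballpark as the table, although gitdm applies extra alias and email handling, so the results will not match exactly.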

From what I could gather, Eric Blake works mostly in the block layer with quite a few commits in the nbd, replay and qapi subsystems. Peter Maydell has contributions all over the tree with his main contributions being to user mode emulation and the ARM subsystems. Paolo Bonzini too has contributions all over the tree with most of them related to all TCG targets (arm and i386 in particular), nbd and general QEMU execution infrastructure.

Top changeset contributors by employer

Company      Number of Commits   Percentage
Red Hat      1225                53.4%
(Unknown)    503                 21.9%
Linaro       214                 9.3%
IBM          111                 4.8%
Fujitsu      65                  2.8%
Intel        35                  1.5%
Parallels    30                  1.3%
Xilinx       23                  1.0%
Bull         20                  0.9%
Novell       18                  0.8%

I counted about 21 companies (I am sure there are more) which sponsor QEMU development. The major ones among these are Red Hat, Linaro and IBM. I think the overwhelming showing by Red Hat is mainly due to their involvement in KVM, which is their virtualization tool of choice. The unknown row is supposed to represent contributions from individuals with no affiliation, but I think the case here is one of missing information. Many developers use a personal, non-affiliated email address in their commits, which makes it difficult to identify the company employing them.
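
To illustrate why personal email addresses end up in the unknown row, here is a rough Python sketch of attributing a commit to an employer from the author's email domain. The `DOMAIN_MAP` table and the `employer_for` function are hypothetical and only stand in for the kind of mapping kept in the gitdm configuration; they are not taken from it.

```python
# Hypothetical domain-to-employer map, for illustration only.
DOMAIN_MAP = {
    "redhat.com": "Red Hat",
    "linaro.org": "Linaro",
    "ibm.com": "IBM",
}


def employer_for(email, domain_map=DOMAIN_MAP):
    """Guess the employer from an email address, or return "(Unknown)"."""
    domain = email.rsplit("@", 1)[-1].lower()
    parts = domain.split(".")
    # Try the full domain first, then each parent domain, e.g.
    # "linux.vnet.ibm.com" -> "vnet.ibm.com" -> "ibm.com".
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in domain_map:
            return domain_map[candidate]
    return "(Unknown)"
```

With this sketch, `employer_for("dev@redhat.com")` yields "Red Hat", while a developer committing from a personal address such as `hacker@gmail.com` falls into "(Unknown)" no matter who employs them.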

Having said that, let us look at who is employing the hackers working on QEMU.

Employers with the most hackers (total 189)

Company      Number of Employees   Percentage
(Unknown)    83                    43.9%
Red Hat      39                    20.6%
IBM          19                    10.1%
Fujitsu      9                     4.8%
Intel        7                     3.7%
Linaro       6                     3.2%
Novell       6                     3.2%
Xilinx       3                     1.6%
Huawei       3                     1.6%
Citrix       3                     1.6%

Most of the developers working on QEMU are not properly attributed to a company, so the unknown tag shows up at the top. Red Hat is the largest known employer, with 39 of the 189 developers, or about twenty percent of all QEMU developers.

Employers with the most signoffs (total 1851)

Company            Number of Signoffs   Percentage
Red Hat            1036                 56.0%
Linaro             362                  19.6%
(Unknown)          278                  15.0%
IBM                65                   3.5%
Telecom-Service    58                   3.1%
Fujitsu            16                   0.9%
Parallels          14                   0.8%
Xilinx             11                   0.6%
Huawei             7                    0.4%
Intel              2                    0.1%

So the roughly 20% of developers funded by Red Hat contributed more than half of the commits, and more than half of the signoffs, that went into the QEMU 2.7 release.

As I mentioned earlier, the information here is not accurate. If you want to correct or contribute to this analysis, please comment, send an email or open a pull request against the gitdm-qemu repository. This repo has the configuration scripts I used to generate the above information. Hopefully the analysis will be more accurate for the 2.8 release.

All the stats generated by gitdm can be found in this file.

Goodies Are Here!

The goodies from the GSoC welcome packet are finally here. I am not a big fan of having stickers on my laptop, but I am mighty tempted to try this one.

Getting Started With Memory Consistency

When I first started learning about memory consistency, I found it pretty involved and full of intricacies. There are quite a few resources which helped me get up to speed. These are the ones I refer to whenever I am confused about anything related to consistency. I am listing them here and I hope you find them equally useful.

The first document I would recommend is “Memory Barriers: a Hardware View for Software Hackers” by Paul McKenney. It is a lucid introduction to the hardware structures that give rise to consistency issues. The cause mainly boils down to optimizations implemented in hardware to reduce the cost of memory accesses.
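
One such hardware optimization is the store buffer. The Python sketch below is a deliberately simplified toy model, not real hardware behaviour: it enumerates every interleaving of the classic two-thread test (thread 0 stores to x then loads y, thread 1 stores to y then loads x), with and without a store buffer. The event names "S", "F", "L" and the `outcomes` function are made up for this example.

```python
from itertools import permutations

# Per-thread events: "S" store enters the store buffer, "F" the buffer is
# flushed to memory, "L" the load executes.
THREADS = {
    0: {"store": "x", "load": "y"},
    1: {"store": "y", "load": "x"},
}


def outcomes(store_buffer):
    """Return the set of (r0, r1) values the two loads can observe."""
    events = [(t, e) for t in THREADS for e in "SFL"]
    results = set()
    for order in permutations(events):
        pos = {ev: i for i, ev in enumerate(order)}
        valid = True
        for t in THREADS:
            # Program order: the store is issued before its flush and the load.
            if pos[(t, "S")] > pos[(t, "F")] or pos[(t, "S")] > pos[(t, "L")]:
                valid = False
            # Without a store buffer the store reaches memory before the load.
            if not store_buffer and pos[(t, "F")] > pos[(t, "L")]:
                valid = False
        if not valid:
            continue
        memory = {"x": 0, "y": 0}
        regs = {}
        for t, e in order:
            if e == "F":
                memory[THREADS[t]["store"]] = 1
            elif e == "L":
                regs[t] = memory[THREADS[t]["load"]]
        results.add((regs[0], regs[1]))
    return results
```

In this model `outcomes(False)` never contains `(0, 0)`, while `outcomes(True)` does: both loads can run while both stores are still sitting in their buffers. A memory barrier between the store and the load corresponds to forcing the flush before the load, which is the `store_buffer=False` case.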

Next, read Jeff Preshing’s blog, which has quite a few articles for those interested in parallel programming. I would recommend these three articles in particular: 1, 2, 3.

Once you have a basic understanding of consistency and fences, the next thing to read is the Linux kernel’s memory-barriers.txt. This document goes into considerable depth explaining the various ordering dependencies which arise in current processors and how to use memory barriers to ensure consistency. You should also look at the barrier.h header file for the different architectures.

You should also read “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”. As the title says, it covers the ARM and POWER memory models. Another tutorial worth reading is “Shared Memory Consistency Models: A Tutorial”.

I will keep updating this post with any new articles I find interesting. Hope you enjoy them!

QEMU GSoC 2016

My proposal to work on memory consistency issues in the multi-threaded TCG project got accepted as part of Google Summer of Code 2016! For those of you interested in the details, at a very high level, I am going to work on enabling a guest architecture to run on a host architecture whose memory model does not match its own. The easiest way to achieve this is to generate a memory fence instruction on every memory access, but this severely hurts performance since fence instructions can have significant latency. The plan is to generate fence instructions only where needed to ensure the consistency model expected by the guest, while keeping the performance overhead to a minimum.
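
To make the idea of generating fences only where needed a little more concrete, here is a small Python sketch. It is only an illustration of the reasoning, not the code that goes into QEMU. It assumes a coarse view in which a memory model is described by which of the four access orderings (load-load, load-store, store-load, store-store) the hardware preserves on its own. The `GUARANTEES` table and the `fences_needed` function are hypothetical, and the entries are simplified: x86 is treated as TSO, where only a store followed by a load may be reordered, and ARM is treated as preserving none of the four.

```python
# An ordering is (earlier access, later access) in program order.
ALL_ORDERINGS = {
    ("load", "load"),
    ("load", "store"),
    ("store", "load"),
    ("store", "store"),
}

# Orderings each architecture preserves without explicit fences.
# Simplified assumptions for illustration only.
GUARANTEES = {
    "x86": ALL_ORDERINGS - {("store", "load")},  # TSO-like
    "arm": set(),                                # weakly ordered
}


def fences_needed(guest, host):
    """Orderings the guest relies on that the host does not provide."""
    return GUARANTEES[guest] - GUARANTEES[host]
```

Under these assumptions `fences_needed("arm", "x86")` is empty, so a weakly ordered guest on a strongly ordered host needs no extra fences, while `fences_needed("x86", "arm")` lists three orderings that have to be enforced with fences on the host.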

I am looking forward to working with my mentor Alex Bennée and getting the patches accepted upstream by the end of summer. I will use this blog to keep track of my progress and as a sounding board for the work I will be doing.