Final report on "Benchmarking Parallel Performance of Numerical MPI Packages"
Google Summer of Code 2024 for Debian
1. Acknowledgements
I would like to thank my mentors, Francesco Ballarin and Drew Parsons, for their help and guidance, and the Grid'5000 team for allowing me to use their testbed for the experiments. Furthermore, I'd like to thank Nilesh Patra of Debian for his mediation as an org admin for GSoC. I'd also like to thank Google for the opportunity offered to me through the Summer of Code program.
Experiments presented in this page were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
2. Description
The goal of this project, at a minimum, was to collect performance measurements of the FEniCS miniapp, part of the FEniCS project, as packaged by Debian. The purpose was to help detect regressions in the Debian packages involved through visual inspection of plotted data points. Notably, as I eventually understood, this does not test upstream; rather, it tests for regressions in how Debian has put together its distribution with regard to FEniCS. Additional goals included creating a general framework in which more tests can be written for other packages such as NWChem or MEEP.
These measurements were to be run on Grid'5000, a large-scale and flexible testbed. Several parameters were to be varied, such as the underlying BLAS and MPI implementations, the number of cores, and the number of degrees of freedom [1].
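As a rough illustration of such a parameter sweep, the sketch below enumerates every combination of a few settings. The parameter names and values (the BLAS and MPI variants, the core counts, the degrees-of-freedom values) are assumptions chosen for illustration, not the ones used in the project.

```python
from itertools import product

# Hypothetical parameter values, for illustration only.
PARAMETERS = {
    "blas": ["openblas", "blis"],
    "mpi": ["openmpi", "mpich"],
    "cores": [1, 2, 3],
    "dofs": [10_000, 100_000],
}


def experiment_grid(parameters):
    """Yield one dict per combination of parameter values."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


grid = list(experiment_grid(PARAMETERS))
print(len(grid), "experiments; first:", grid[0])
```

Each dictionary describes one experiment to launch, so adding a value to any list grows the sweep without touching the rest of the code.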
3. Accomplishments
3.1. Testbuddy-g5k
We created testbuddy-g5k as a general tool that allows experiments to be configured declaratively and launched on Grid'5000 from another, unrelated server. Put simply, it functions as a script uploader and results downloader that requests resources from Grid'5000 and launches scripts on the obtained resources.
Equipped with this tool, we were able to write scripts that launch experiments for the FEniCS miniapp and store the results in an SQLite3 database; we also wrote another script that produces HTML plots with Plotly [2].
With this approach, at the cost of some repetition (e.g. having to write a script or two per project), we gained significant flexibility in the way results are stored, and in particular we did not have to think of a future-proof common format for all the potential Debian packages that may need to be tested.
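A minimal sketch of what such a per-project storage script could look like follows, using only Python's standard `sqlite3` module. The table name, the columns, and the sample values are assumptions made up for illustration; they are not the schema or data of the actual scripts.

```python
import sqlite3

# Hypothetical schema: one row per measured run.
SCHEMA = """
CREATE TABLE IF NOT EXISTS timings (
    run_date TEXT NOT NULL,        -- ISO 8601 date of the run
    package_version TEXT NOT NULL,
    cores INTEGER NOT NULL,
    wall_seconds REAL NOT NULL
)
"""


def store_timing(conn, run_date, package_version, cores, wall_seconds):
    """Insert a single measurement using bound parameters."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO timings VALUES (?, ?, ?, ?)",
            (run_date, package_version, cores, wall_seconds),
        )


def timeline(conn, cores):
    """Return (run_date, wall_seconds) pairs ordered by date."""
    rows = conn.execute(
        "SELECT run_date, wall_seconds FROM timings "
        "WHERE cores = ? ORDER BY run_date",
        (cores,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")  # a file path would be used in practice
conn.execute(SCHEMA)
# Made-up sample measurements, inserted out of order on purpose.
store_timing(conn, "2024-08-12", "example-version", 3, 12.9)
store_timing(conn, "2024-08-05", "example-version", 3, 12.4)
print(timeline(conn, 3))
```

The ordered pairs returned by `timeline` are the kind of series a plotting script could then turn into a timeline plot.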
Figure 1: A typical use of the testbuddy-g5k tool
After this sequence of events has taken place, the user will find various plots similar to this one:
Figure 2: Artificial weekly data for 4 years, the light region focusing on a regression
3.2. Patches to other projects
I added documentation of the timings and improved some C++ code in the FEniCS miniapp:
I made various documentation improvements to ReFrame:
The following are more minor in nature:
- GNU sleep https://debbugs.gnu.org/cgi/bugreport.cgi?bug=70946
- Grid'5000 wiki edits https://www.grid5000.fr/w/Special:Contributions/Nchatzik
- kameleon https://github.com/oar-team/kameleon/pull/130
4. Difficulties
4.1. Segfaults and unsupported Debian 13
I hit a large hurdle when I realized that the miniapp segfaulted on Debian 12 and that there were general issues with Debian 13 on Grid'5000. In practice, this meant that I would have to wait for the Grid'5000 team to fix its Debian 13 issue, with a projected date at least 4 months after the end of the GSoC program. I continued developing my program with the following compromise: instead of running the experiments on hundreds or thousands of cores, I would have to restrict myself to a maximum of 3 cores!
4.2. Decision to use a proper database
There didn't seem to be a convenient way to serialize Pandas DataFrame objects [3]. For that reason, and because it freed us from depending on Python tooling, we opted to use a proper database, SQLite3. This created some additional friction in the implementation process, especially since I am not well versed in SQL concepts.
4.3. Testing
Due perhaps to my naivete in architectural design, my unfamiliarity with Python testing, and the shortage of time, I was not able to write unit tests for this project. It seems that proper isolation of certain components of the programs I wrote for the purpose of testing would increase the complexity of the software and would also slow me down. In a crunch, I debugged as I used the software; Drew Parsons helped me during the review phase a great deal too.
5. Work ahead
5.1. Writing tests for other packages
No tests have been written yet for NWChem or any other Debian package. I hope to be able to write some in the future.
5.2. Improving the SQL schema
The SQL schema provided was rushed and does not use indexing optimizations [4]. This may be fine for now, since there is only one package, FEniCS, and one purpose, a complete timeline plot, so all records must be traversed anyway. When more packages and more constrained plots are required (say, restricted to a range of dates), it may be fruitful to include SQL indexing in the schema.
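As a sketch of what such indexing could look like, the example below creates an index on a hypothetical table and asks SQLite whether a date-constrained query would use it. The table, column, and index names are assumptions for illustration; the real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the real schema may differ.
conn.execute(
    "CREATE TABLE timings (package TEXT, run_date TEXT, wall_seconds REAL)"
)
# Index on the columns a per-package, date-constrained plot would filter on.
conn.execute(
    "CREATE INDEX idx_timings_package_date ON timings (package, run_date)"
)

query = (
    "SELECT run_date, wall_seconds FROM timings "
    "WHERE package = ? AND run_date >= ? AND run_date < ?"
)
# EXPLAIN QUERY PLAN reports how SQLite intends to execute the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN " + query, ("fenics", "2024-01-01", "2025-01-01")
).fetchall()
for row in plan:
    print(row[-1])  # the last column holds the human-readable plan step
```

Putting the equality column (`package`) before the range column (`run_date`) lets a single index serve both constraints. A plot that reads the complete timeline would still traverse every record, so the index only pays off for constrained queries.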
6. Download
Testbuddy-g5k has a git repository, a documentation page, and a PyPI package.
Footnotes:
[1] A parameter of the miniapp, and in particular a parameter of the Finite Element Method, the method used by FEniCS; the miniapp uses it to solve partial differential equations, namely the Poisson equation and the so-called elastostatics problem, in 3D.
[2] We deliberately avoided Dash in order to use only Debian packages and to avoid having to configure the Dash webserver.
[3] "Tables" in common parlance, from a popular Python library.
[4] The SQLite documentation on how to pick good indices may be useful.