Last week updates
Last week, I presented some git-grep time measurements with userdiff driver preloading disabled. Although we could see some speedups in some test cases, the tests were only performed in a machine with SSD storage. So we couldn’t be sure how it would affect the execution on an HDD. Because of that, my mentors and I decided to leave preloading as it is. We can maybe repeat this test in an HDD machine in the future, though.
About my dir-iterator patchset,
I’m happy to say that it got queued
at this topic branch! :)
But it still has at least one issue to be solved: Junio pointed out that inode
comparison is not sufficient to check if two files are the same as it may not be
unique on cross mountpoints. Also, Johannes told me that it wouldn’t work on
Windows either, as the Git-for-Windows’ port of stat()
does not fill the
st_ino
field of struct stat
. So I should think of another way to implement
this file comparison, which is used to check for recursive symlinks.
Multithreading zlib inflation
By the analyzes we saw in the last two posts, zlib inflation seemed like a good spot to be made parallel. It has all the main ingredients for a well succeeded threaded work:
-
Time expensive tasks
git-grep spends substantial time on inflation. In fact, for the git-grep parameters we’ve been using at performance tests1, I measured the time Git spends ongit_inflate()
and it accounted for over 48% of the total execution time! -
Thread-safety
zlib’s functions are already thread-safe. -
Independency between tasks
The work performed on different zlib streams seems completely independent.
With all the above motivations, my mentors and I decided to seek for a parallel object decompression at git-grep. For a first version, the idea was not to implement it the best way possible, but to make a hacky/testing version, so that we could measure how well it would go. I started this version on top of Duy’s hacky patch on a parallel cached git-grep.
Initially, it was coming to a deadlock for some of my more complex1 test
cases. After some refactoring and re-enabling the locks for the attributes
machinery, though, I got to a version with no deadlocks. But unfortunatelly this
was SegFault’s turn to show up. Throught GDB
and valgrind
it was possible to
identify that the error was coming from a read after free operation on the delta
base cache. But even with this information, it was getting hard to debug and to
guess in which case this could be happening…
I then decided to try another pathway, starting over from master
. In fact, I
tried several ideas… In one of them I eliminated git-grep’s lock arround pack
reading operations and inserted a lock surrounding
read_object_file_extended()
’s code. Then, at unpack_compressed_entry()
I
added a call to release the lock at the beginning and retrieve it at the end,
which should, in theory, safely allow parallel threads at decompression. But
this failed as the latter function is also used before the threads are created,
so we would be unlocking a not yet locked mutex. I tried to solve that working
out a mechanism to only unlock and re-lock after threads are running and the
function was called by a thread holding the lock. Even then, I was getting
errors…
And even after many ideas and work arrounds, I couldn’t get rid of either a deadlock or execution errors due to race conditions.
Difficulties
Pack access call chains are big and, unfortunatelly, that hinders thread-safety
analyzes :( For example, even with the assurance that inflation is already
thread-safe, it’s hard to visualize all possible paths from cmd_grep()
until
git_inflate()
and guarantee that a lock is being held or avoid double looking.
Next steps
Even with the setbacks, we found a nice and promising road to walk, so my plan is to keep going and looking for other ways to allow parallel threads at inflation. One of the things I should do (or should have already done) is to compile Git with no optimizations to see if it helps debuging the errors at GDB.
Footnotes
Til next time,
Matheus