This week I discussed about my previous patch on the mailing list and started working on the first tasks for my GSoC project!
Last week’s follow-ups
Christian gave me some good advices on
the patch I was
working last week. What I implemented with a rebase option, he wiselly pointed
out to me that could already be achieved throught the commit.verbose
option.
So the only advantage my patch brings could be for those who would which to see
diffs when rebasing but not when commiting… As also suggested by Christian,
I replied
to Ævar’s email
which originally mentioned the feature, attaching my patch and asking for
comments.
We are coming to the conclusion that it is worth making a new patch to document that “rebase obeys all commit.* options”, together with some tests to assert that. As Ævar said, these tests could be the ones I developed in the first patch, so that we can reuse the work :D I’m planning to make and send this new version next week.
First GSoC task: cached_object
Following my initial agenda, this week I started working to protect
sha1-file.c
’s global states. As my first GSoC task, I took some time to
investigate the cached_objects
array and how I could protect it, if needed.
This is a set of mocked in-memory objects that don’t existent on disk but
read_object_file()
is able to return.
The array is global and non thread-safe, but it’s only being used by git-blame until this day and wouldn’t lead to race conditions if blame was made parallel today. Nevertheless, I asked for comments in the mailing list on whether I should try to protect this array now. The conversation is still going, but it seems that it’s better to leave this to a future time.
While I was waiting for replies, I started working on other tasks.
Second task: static variables inside functions
For my next task, I set off hunting for static
variables inside functions in
the pack access call chain. These variables are shared between threads and their
lifetime last throughout the whole execution. So, in order to achieve
thread-safety, we must make them local or, when not possible, protect them.
Tip: cflow
is a really good tool to inspect
call chains. With the following command1 I was able to check for the presence
of specific functions in assign_blame()
’s call chain:
cflow --main assign_blame **/*.h *.h **/*.c *.c
Some of this static
to local
conversions are very trivial. Let’s see an
example:
static int write_loose_object(const struct object_id *oid, char *hdr,
int hdrlen, const void *buf, unsigned long len,
time_t mtime)
{
int fd, ret;
[...]
static struct strbuf tmp_file = STRBUF_INIT;
static struct strbuf filename = STRBUF_INIT;
loose_object_path(the_repository, &filename, oid);
fd = create_tmpfile(&tmp_file, filename.buf);
[...]
if (mtime) {
[...]
if (utime(tmp_file.buf, &utb) < 0)
warning_errno(_("failed utime() on %s"), tmp_file.buf);
}
return finalize_object_file(tmp_file.buf, filename.buf);
}
The call to create_tmpfile()
always resets the tmp_file
variable, overriding
any content it previously held. Also, utime()
and finalize_object_file()
,
the other functions which tmp_file
is passed to, don’t hold references to it.
So tmp_file
doesn’t really need to be a static
variable. Besides making this
function one step closer to being thread-safe, removing the static
qualifier
also saves RAM as the variable will only be loaded to memory when the function
is called.
The same logic applies to filename
, which can also be made local.
Note: Most of the time static
variables are declared within functions, the
objective is to free the caller from the necessity of freeing resouces. But as
we are making them local, we need to release any dinamic allocated memory they
hold or pass this responsability to the caller. Otherwise, we could be creating
memory leeks.
Difficulties
My main dificulty this week was to get started. The pack access code can be quite big and complex, so I got a little overwhelmed/lost in the beginning, not knowing where to start.
Also, in recent conversations on the mailing list, I got a little uncertain if I should make the thread-safety conversion bottom-up, as I started doing this week, or begin in a higher level (with a couple wide mutexes), refining it down. I may have missunderstood the suggested idea, though. So let’s wait for more comments on it :)
Next steps
Continue the work on sha1-file.c
’s global states, and re-send last week’s
patch, refactored.
Footnotes
-
The command complained about redefinitions. I guess the problem is with
#ifdef
macros that didn’t get processed. I tried using--preprocess="gcc -E"
but it got even worse :( Anyway, the current output, even if slightly wrong, is sufficient for me, now. ↩
Til next time,
Matheus