Bumping this thread a bit. I committed yet another update of the experimental gensets to the testing branch (r1582):
5.4M headers: 5 mins 6 secs
50M headers: 3 hours 5 mins.
Also, I managed to get rid of the slow initial count query while still keeping the status bar / ETA.

It's still not completely linear, but it FINALLY seems faster than 1.0.4 even on small groups; being slower there has been an annoyance with the new approach so far. If you have the option, try comparing numbers with the current normal SVN and perhaps also 1.0.4 stable.
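To put the two timings above on a common scale, here is a quick sketch that converts them to headers per second. The helper name headers_per_second is just something made up for this post, it is not part of the code base; only the header counts and run times come from my test runs.

```python
# Rough throughput comparison for the two test runs quoted above.
# headers_per_second is a hypothetical helper, not part of the project.

def headers_per_second(headers, hours=0, minutes=0, seconds=0):
    """Return average headers processed per second for one run."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return headers / elapsed

small = headers_per_second(5_400_000, minutes=5, seconds=6)
large = headers_per_second(50_000_000, hours=3, minutes=5)

print(f"5.4M headers: {small:,.0f} headers/sec")
print(f"50M headers:  {large:,.0f} headers/sec")
print(f"slowdown factor: {small / large:.1f}x")
```

That works out to roughly 17,600 headers/sec for the small run versus roughly 4,500 headers/sec for the big one, so about 3.9x slower per header. With perfectly linear scaling the 50M run would have taken about 47 minutes instead of 3 hours 5 mins.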
Something happens after 5-10M headers have been processed: chunk processing speed drops from about 3-5 secs/chunk to 15-20 secs/chunk. I am now convinced this has to do with MySQL itself, or perhaps the underlying FS (ext4 in my case). It may be as simple as the indexes being too big to be completely cached on my system.
Still, it's now actually a viable option to DL ALL headers in a group at once, with full retention, without having to wait many weeks or even months for indexing to finish (though it still takes quite a few hours). Also, when the drop in speed happens, it's quite sudden, and after that it doesn't really get any worse. I would have to test with 500M headers or so to be completely sure, but this seems to be the case as far as I can tell from the little testing I have done.
Things are moving forward.

Lots of testing is still needed, but it's looking good so far. Note that this is not in the normal SVN but in the testing branch.
Users looking for "danger", or who want to help out in general, are encouraged to check out the testing branch, test with various amounts of headers, check sets integrity and so on, and post some numbers. That would be a big help to me. But please don't use your production DB. This is alpha code at best, with all that that brings.