The past week, I optimized the current stage II, and tried out a new version of stage II ECM. The main way by which we can optimize stage II is by reducing the number of points we compute. My mentor pointed out a very nice way to do so. Unfortunately, it didn't have a significant speedup. I believe this is because the semi primes which we were factoring (at max 128 bits long), did not need a lot of points to be computed in the first place. The newer version definitely decreased the number of points being computed (by more than a half), but the number points were not large enough in the first place to get noticeable speedup.
The most time consuming step in stage II is when we compute and accumulate the product of difference of the x coordinates (in projective form, in which I am working, it is x1*z2 - x2*z1). This step involves a lot of modular reductions, and hence a lot if inversions and divisions. The efforts put into the new implementation of stage II would definitely be helpful when I write stage II using mpn's and precomputed inverses.
I had to write a new outer function too since the newer version had different pre computations than the older one.
Some pieces of code written this week
This week apart from the new ECM stage II, I also spent time on cleaning up the way variables were being passed. I wrapped up most of the important variables in an ecm structure, passing a reference to it as an argument to the functions instead of a lot of variables. This struct also has four fmpz_t's for temporary calculations, which eliminate the need of initing and clearing new fmpz_t's in every
ecm_mul(). This struct also allows me to precompute the
GCD_table (looks up whether
i and j are co-prime) and
prime_table (looks up whether
i * k + j is prime) for all curves instead of computing them each time for a curve.
You can have a look at my recent commits here.
Currently I am trying to understand the FFT techniques used for ECM stage II. I found quite a lot of citations to the Ph.D. dissertation thesis of Peter Montgomery, which was on this topic. Initially I couldn't find it, but some helpful people on the flint-devel mailing list pointed gave me the link. It is very interesting due to the the math involved, but at the same time quite tough to follow.
This week I wasn't able to give in as much time I would have liked. I usually work on all 7 days on the project, however this weekend I wasn't available due to a hackathon I was attending. I will pick up the speed this week. My goals for this week is to implement the FFT stage II and start to code the whole ECM with the mpn data types and functions.