I spend the larger half of last week studying elliptic curves and understanding possible stage II extensions, rather than coding. I got stuck at a lot of places and had to read a lot to understand. Implementing stage II was certainly harder than stage I. More on that a bit later, first I'll summarize the work I have done this week for those who are in a hurry and cannot read the whole post.
In case someone is wondering where all the code I had blogged about in the last post went, it got shifted to the fmpzfactor module. My mentor siggested that it is better to move them to the fmpzfactor module, as these functions are highly specialised, and are not likely to be used in any way other than in ECM. This week I implemented not so optimized versions of:
Apart from this, I also edited the group operation functions (double, add and multiply). I realised I was not taking care of corner cases (which involve the point O). This was causing a lot of bugs. You can have a look at my recent commits here.
It's quite weird Github shows the author to be unrecognised. I guess I forgot to set the global email in got config while setting up git again (I messed up my computer some time back, had to reinstall OSX).
Once I was done with the stage I implementation, I started thinking about how to implement the stage II. My first step was to implement a naive stage II, which computes a scalar multiple of the point at the end of stage I for each prime between B1 and B2 (The two bounds). You can have a look at this algorithm in this paper (Algorithm 1). Now this was quite simple, as I already had all the functions required at my disposal (done in week 1). However this turned out to be quite inefficient.
To optimize this, the first algorithm that I tried out was in Prime Numbers, A computational Perspective (page 353). I decided to implement this primarily because it used Montgomery coordinates, and seemed straightforward enough. However, after spending a day and a half on it, I couldn't implement it correctly. There seemed nothing wrong with my implementation, but it just didn't factor a number, no matter how large the B2 was. So I decided to start looking for papers which follow a different technique.
I came across another algorithm in (this paper)[https://www.iacr.org/archive/ches2006/10/10.pdf] (Algorithm 4). I went forward an implemented it, only to get bad news again. The algorithm never finished running. After spending qutie some time on debugging, I realised there were mistakes in my group operation functions. Maybe this was the reason for my first implementation of stage II not working too.
When I sat down to recheck the group operations I had coded earlier, I came across a very weird thing. When I computed the point 4Q (using the Montgomery ladder) and the same point using addition (3Q + Q), my answers did not match. I found this quite weird, considering that my implementations were correct. I ended up asking a question on math exchange, only to realize that the points were in fact the same! What I had forgotten was that I am dealing with projective coordinates. So the absolute value of the coordinates need not match, however their ratio will always be the same. I was stuck at a non issue for more than a day :/
My goals for next week are to optimize the stage II further. I have been looking at some techniques such as the FFT continuation for ECM, which completely avoids calculating a table of prime numbers. I haven't really looked into it, I will be doing so in this week. Once I'm sure about what code I will be using for ECM, I'll code an mpn layer version for it.