E-521-related changes. Not quite ready yet...
This is largely a save-your-work checkin. Created p521/arch_ref64 code to check that E-521 basically works, and fixed some of the testing code around E-521; it doesn't pass every test yet. Also created p521/arch_x86_64 code with an optimized multiply. In this checkin the multiply is fast and works, but all the other code in that directory is still the completely unoptimized ref64 build, which reduces after every add and sub, so the whole thing isn't fast yet.
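
For illustration only, here is a rough sketch of what "reduces after every add" means compared with a lazy add. It is not the code in this checkin. The limb layout (9 limbs in radix 2^58, top limb 57 bits) and all names (fe521, fe521_weak_reduce, fe521_add_reduced, fe521_add_lazy) are assumptions made for the example. It relies only on p = 2^521 - 1, so a carry out of bit 521 wraps around to bit 0.

```c
#include <stdint.h>

/* Hypothetical layout: 8 limbs of 58 bits plus a 57-bit top limb = 521 bits. */
#define NLIMBS 9
#define LIMB_MASK (((uint64_t)1 << 58) - 1)
#define TOP_MASK  (((uint64_t)1 << 57) - 1)

typedef struct { uint64_t limb[NLIMBS]; } fe521;

/* Weak reduction: one carry pass. Since 2^521 = 1 (mod p), the bits above
 * bit 57 of the top limb are folded back into limb 0. Afterwards limbs 0..7
 * are below 2^58; the top limb may still exceed 57 bits by a small carry,
 * so the result is not necessarily canonical. */
static void fe521_weak_reduce(fe521 *a) {
    uint64_t carry = a->limb[NLIMBS - 1] >> 57;
    a->limb[NLIMBS - 1] &= TOP_MASK;
    for (int i = 0; i < NLIMBS - 1; i++) {
        a->limb[i] += carry;
        carry = a->limb[i] >> 58;
        a->limb[i] &= LIMB_MASK;
    }
    a->limb[NLIMBS - 1] += carry;
}

/* ref64-style add as described above: reduce after every addition. */
static void fe521_add_reduced(fe521 *out, const fe521 *a, const fe521 *b) {
    for (int i = 0; i < NLIMBS; i++) {
        out->limb[i] = a->limb[i] + b->limb[i];
    }
    fe521_weak_reduce(out);
}

/* Lazy add: limb-wise sum with no carry pass. With 58-bit limbs in 64-bit
 * words there are 6 spare bits, so a few of these can be chained before a
 * reduction is required. The caller must track that headroom. */
static void fe521_add_lazy(fe521 *out, const fe521 *a, const fe521 *b) {
    for (int i = 0; i < NLIMBS; i++) {
        out->limb[i] = a->limb[i] + b->limb[i];
    }
}
```

The lazy form skips the serial carry chain on each add and sub, which is presumably where the remaining arch_x86_64 code could gain speed; the cost is that every caller has to account for how much headroom is left before the next multiply or reduce.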
Showing with 1455 additions and 26 deletions