Commit 2712648b authored by Michael Hamburg

Second commit. Still very preliminary.

Changed the formats of private keys and shared secrets.

Added SHA512 support.  It's slow and probably has endian bugs.

Signatures are now supported.

Renamed a bunch of internal functions to be more readable and
consistent.

Began documenting functions with Doxygen.

See HISTORY.txt for more details.
parent 25697caf
March 5, 2014:
First revision.

Private keys are now longer. They now store a copy of the public key, and
a secret symmetric key for signing purposes.

Signatures are now supported, though like everything else in this library,
their format is not stable. They use a deterministic Schnorr mode,
similar to EdDSA. Precomputed low-latency signing is not supported (yet?).
The hash function is SHA-512.

The deterministic hashing mode needs to be changed to HMAC (TODO!). It's
currently envelope-MAC.
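
To make the envelope-MAC versus HMAC note concrete, here is a minimal sketch of the two constructions side by side. It assumes the textbook envelope layout H(key || msg || key), which may not match the library's exact byte layout, and it uses a toy 64-bit FNV-1a hash as a stand-in for SHA-512 so that it stays self-contained. The names `toy_update`, `envelope_mac`, `hmac_toy` and the 16-byte block size are hypothetical and not part of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK 16                      /* assumed block size of the toy hash */
#define TOY_INIT  0xcbf29ce484222325ULL   /* FNV-1a 64-bit offset basis */

/* Toy stand-in for SHA-512: incremental 64-bit FNV-1a. NOT cryptographic. */
static uint64_t toy_update(uint64_t h, const unsigned char *p, size_t n) {
    size_t i;
    for (i = 0; i < n; i++) {
        h = (h ^ p[i]) * 0x100000001b3ULL;
    }
    return h;
}

/* Envelope MAC: hash the key, then the message, then the key again. */
static uint64_t envelope_mac(const unsigned char *key, size_t klen,
                             const unsigned char *msg, size_t mlen) {
    uint64_t h = toy_update(TOY_INIT, key, klen);
    h = toy_update(h, msg, mlen);
    return toy_update(h, key, klen);
}

/* HMAC shape: H((K ^ opad) || H((K ^ ipad) || msg)), key zero-padded to a block. */
static uint64_t hmac_toy(const unsigned char *key, size_t klen,
                         const unsigned char *msg, size_t mlen) {
    unsigned char pad[TOY_BLOCK], digest[8];
    uint64_t inner, outer;
    size_t i;

    assert(klen <= TOY_BLOCK);  /* real HMAC hashes longer keys first */

    for (i = 0; i < TOY_BLOCK; i++) {
        pad[i] = (unsigned char)((i < klen ? key[i] : 0) ^ 0x36);
    }
    inner = toy_update(toy_update(TOY_INIT, pad, TOY_BLOCK), msg, mlen);

    /* Serialize the inner digest big-endian before the outer pass. */
    for (i = 0; i < 8; i++) {
        digest[i] = (unsigned char)(inner >> (56 - 8 * i));
    }

    for (i = 0; i < TOY_BLOCK; i++) {
        pad[i] = (unsigned char)((i < klen ? key[i] : 0) ^ 0x5c);
    }
    outer = toy_update(TOY_INIT, pad, TOY_BLOCK);
    return toy_update(outer, digest, 8);
}
```

In a deterministic Schnorr scheme the per-signature nonce is derived from a keyed hash of the message, so either function above would be called with the secret symmetric key and the message; only the way the key is mixed in differs.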

Probably in the future there will be a distinction between ECDH keys and
signing keys (and possibly also MQV keys etc).

Began renaming internal functions. Removing p448_ prefixes from EC point
operations. Trying to put the verb first. For example,
"p448_isogeny_un_to_tw" is now called "twist_and_double".

Began documenting with Doxygen. Use "make doc" to make a very incomplete
documentation directory.

There have been many other internal changes.

Feb 21, 2014:
Initial import and benchmarking scripts.

Keygen and ECDH are implemented, but there's no hash function.
......@@ -3,19 +3,20 @@
CC = clang
CFLAGS = -O3 -std=c99 -pedantic -Wall -Wextra -Werror \
-mavx2 -DMUST_HAVE_SSSE3 -mbmi2 \
-ffunction-sections -fdata-sections -fomit-frame-pointer -fPIC
-mssse3 -maes -mavx2 -DMUST_HAVE_AVX -mbmi2 \
-ffunction-sections -fdata-sections -fomit-frame-pointer -fPIC \
-DEXPERIMENT_ECDH_OBLITERATE_CT=1 -DEXPERIMENT_ECDH_STIR_IN_PUBKEYS=1
.PHONY: clean all runbench
.PHONY: clean all runbench todo doc
.PRECIOUS: build/%.s
HEADERS= Makefile $(shell find . -name "*.h") build/timestamp
LIBCOMPONENTS= build/goldilocks.o build/barrett_field.o build/crandom.o \
build/p448.o build/ec_point.o build/scalarmul.o
build/p448.o build/ec_point.o build/scalarmul.o build/sha512.o
all: bench
bench: *.h *.c
$(CC) $(CFLAGS) -o $@ *.c
......@@ -34,7 +35,26 @@ build/goldilocks.so: $(LIBCOMPONENTS)
libtool -macosx_version_min 10.6 -dynamic -dead_strip -lc -x -o $@ \
-exported_symbols_list exported.sym \
$(LIBCOMPONENTS)
doc/timestamp:
mkdir -p doc
touch $@
doc: Doxyfile doc/timestamp *.c *.h
doxygen
todo::
@egrep --color=auto -w -i 'hack|todo|fixme|bug|xxx|perf|future|remove' *.h *.c
@echo '============================='
@(for i in FIXME BUG XXX TODO HACK PERF FUTURE REMOVE; do \
egrep -w -i $$i *.h *.c > /dev/null || continue; \
/bin/echo -n $$i' ' | head -c 10; \
egrep -w -i $$i *.h *.c | wc -l; \
done)
@echo '============================='
@echo -n 'Total '
@egrep -w -i 'hack|todo|fixme|bug|xxx|perf|future|remove' *.h *.c | wc -l
runbench: bench
./$<
......
Important work items for Ed448-Goldilocks:
* Import SHA-512 or SHA-3.
* Decide which.
* Get a public-domain version which is 64-bit and 32-bit clean.
* Update LICENSE and README to reflect that SHA is not my code.
* Incorporate hashing into goldilocks_shared_secret.
* It's a pretty terrible shared secret right now.
* Decide on output size
* Documentation: write high-level API docs, and internal docs to help
other implementors.
* Partial progress on Doxygenating the code.
* Documentation: write a spec or add to Watson's
......@@ -37,12 +30,13 @@ Important work items for Ed448-Goldilocks:
* Testing:
* Corner-case testing
* more bulk random testing
* More bulk random testing
* Negative testing.
* SAGE-(auto?)-generated test vectors
* Test the Barrett fields
* Safety: add static analysis attributes for compilers that support them
* EG, warn on ignored return types
* Most functions now have warn on ignored return.
* Safety:
* Check for init() if it's still required once we've done the above
......@@ -65,17 +59,19 @@ Important work items for Ed448-Goldilocks:
* Scalarmul with other cofactor modes.
* High-level API:
* Signatures.
* Decide on strictness level.
* SPAKE2 Elligator Edition? Maybe write a paper first.
* Elligator.
* Need to write Elligator inverse. Might not be Elligator-2S.
* FHMQV? Is this patented?
* What low-level APIs to expose?
* Edwards points with add, sub, scalarmul, =, ==, ser/deser?
* Portability: test and make clean with other compilers
* Using a fair amount of __attribute__ code.
* Portability: try to make the vector code as portable as possible
* Currently using clang ext_vector_length.
* I can't get a simple for-loop to autovectorize :-/
......@@ -89,8 +85,7 @@ Important work items for Ed448-Goldilocks:
* Run through the SAGE tool to generate new bias & bound.
* Portability: make the outer layers of the code 32-bit clean.
* I don't think that there are endian bugs, but who knows?
* There are endian bugs in the signing algorithm.
* NEON and vectorless constant-time comparison.
* Performance: write and incorporate some extra routines
......@@ -99,6 +94,11 @@ Important work items for Ed448-Goldilocks:
* Performance: fixed parameters?
* Perhaps useful for comb precomputation.
* Performance: Improve SHA512.
* Improve portability.
* Improve speed.
* Decide what things to stir into hashes for various functions.
* Performance: improve the Barrett field code.
* Support other primes?
......
......@@ -109,6 +109,42 @@ widemac(
return carry;
}
void
barrett_negate (
word_t *a,
int nwords_a,
const word_t *p_lo,
int nwords_p,
int nwords_lo,
int p_shift
) {
int i;
dsword_t carry = 0;
barrett_reduce(a,nwords_a,0,p_lo,nwords_p,nwords_lo,p_shift);
/* Have p = 2^big - p_lo. Want p - a = 2^big - p_lo - a */
for (i=0; i<nwords_lo; i++) {
a[i] = carry = carry - p_lo[i] - a[i];
carry >>= WORD_BITS;
}
for (; i<nwords_p; i++) {
a[i] = carry = carry - a[i];
if (i<nwords_p-1) {
carry >>= WORD_BITS;
}
}
a[nwords_p-1] = carry = carry + (((word_t)1) << p_shift);
for (; i<nwords_a; i++) {
assert(!a[i]);
}
assert(!(carry>>64));
}
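
The borrow-propagating subtraction in barrett_negate can be easier to follow on a small case. The following is a minimal sketch, not the library's code: it assumes 32-bit words, a single-word p_lo, and an input already reduced below p, and it relies (as the code above appears to) on right-shifting a negative signed value being an arithmetic shift. `toy_negate` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 32
typedef uint32_t word_t;
typedef int64_t  dsword_t;   /* signed double-width accumulator */

/* a := p - a, where p = 2^(WORD_BITS*(nwords-1) + p_shift) - p_lo
 * and 0 <= a <= p on entry. */
static void toy_negate(word_t *a, int nwords, word_t p_lo, int p_shift) {
    dsword_t carry = 0;
    int i;

    assert(nwords > 0 && p_shift >= 0 && p_shift < WORD_BITS);

    for (i = 0; i < nwords; i++) {
        carry -= a[i];
        if (i == 0) {
            carry -= p_lo;                        /* the "- p_lo" term */
        }
        if (i == nwords - 1) {
            carry += ((dsword_t)1) << p_shift;    /* the "2^big" term */
        }
        a[i] = (word_t)carry;
        carry >>= WORD_BITS;                      /* propagate the borrow */
    }
    assert(carry == 0);   /* holds whenever the input was reduced */
}
```

For example, with p = 2^61 - 1 (two words, p_shift = 29, p_lo = 1) and a = 5, the result is 2^61 - 6, that is, the words 0xFFFFFFFA and 0x1FFFFFFF.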
void
barrett_reduce(
word_t *a,
......@@ -195,14 +231,6 @@ barrett_mul_or_mac(
tmp[i] = 0;
}
if (doMac) {
for (i=0; i<nwords_accum; i++) {
tmp[i] = accum[i];
}
barrett_reduce(tmp, nwords_tmp, 0, p_lo, nwords_p, nwords_lo, p_shift);
}
for (bpos=nwords_b-1; bpos >= 0; bpos--) {
/* Invariant at the beginning of the loop: the high word is unused. */
assert(tmp[nwords_tmp-1] == 0);
......@@ -211,6 +239,7 @@ barrett_mul_or_mac(
for (i=nwords_tmp-2; i>=0; i--) {
tmp[i+1] = tmp[i];
}
tmp[0] = 0;
/* mac and reduce */
word_t carry = widemac(tmp, nwords_tmp, a, nwords_a, b[bpos], 0);
......@@ -223,6 +252,11 @@ barrett_mul_or_mac(
* so the high word is again clear */
}
if (doMac) {
word_t cout = add_nr_packed(tmp, accum, nwords_accum);
barrett_reduce(tmp, nwords_tmp, cout, p_lo, nwords_p, nwords_lo, p_shift);
}
for (i=0; i<nwords_tmp && i<nwords_accum; i++) {
accum[i] = tmp[i];
}
......
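
The hunks above move the accumulator addition to after the multiply loop and hand the carry-out to barrett_reduce. As a rough illustration of that carry-out, here is a minimal sketch of a packed multi-word addition. The real add_nr_packed is not shown in this diff, so `toy_add_packed`, its 32-bit word size and its exact signature are assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word_t;
typedef uint64_t dword_t;

/* Hypothetical stand-in for add_nr_packed: acc += b over n words.
 * Returns the carry out of the top word (0 or 1). */
static word_t toy_add_packed(word_t *acc, const word_t *b, int n) {
    dword_t carry = 0;
    int i;

    assert(n >= 0);

    for (i = 0; i < n; i++) {
        carry += (dword_t)acc[i] + b[i];
        acc[i] = (word_t)carry;   /* keep the low word */
        carry >>= 32;             /* the high part becomes the next carry */
    }
    return (word_t)carry;
}
```

A caller would then pass the returned carry as the extra high word to the reduction step, in the same way the diff passes cout to barrett_reduce.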
......@@ -44,6 +44,16 @@ sub_nr_ext_packed(
int nwords_c,
word_t mask
);
void
barrett_negate (
word_t *a,
int nwords_a,
const word_t *p_lo,
int nwords_p,
int nwords_lo,
int p_shift
);
/*
* If doMac, accum = accum + a*b mod p.
......
This diff is collapsed.
......@@ -7,6 +7,7 @@
#include "intrinsics.h"
#include "crandom.h"
#include <stdio.h>
volatile unsigned int crandom_features = 0;
......@@ -26,11 +27,60 @@ unsigned int crandom_detect_features() {
a=0x80000001; __asm__("cpuid" : "+a"(a), "=b"(b), "=c"(c), "=d"(d));
if (c & 1<<11) out |= XOP;
if (c & 1<<30) out |= RDRAND;
# endif
return out;
}
INTRINSIC u_int64_t rdrand(int abort_on_fail) {
uint64_t out = 0;
int tries = 1000;
if (HAVE(RDRAND)) {
# if defined(__x86_64__)
/* Write into the function-level out; redeclaring it here would shadow it
 * and make the function return 0. */
u_int64_t a=0;
for (; tries && !a; tries--) {
__asm__ __volatile__ (
"rdrand %0\n\tsetc %%al"
: "=r"(out), "+a"(a) :: "cc"
);
}
# elif (defined(__i386__))
/* Two 32-bit draws make up one 64-bit result. */
u_int32_t reg=0, a=0;
for (; tries && !a; tries--) {
__asm__ __volatile__ (
"rdrand %0\n\tsetc %%al"
: "=r"(reg), "+a"(a) :: "cc"
);
}
out = reg; a = 0;
for (; tries && !a; tries--) {
__asm__ __volatile__ (
"rdrand %0\n\tsetc %%al"
: "=r"(reg), "+a"(a) :: "cc"
);
}
out = out << 32 | reg;
/* Fall through to the common failure check and return below. */
# else
abort(); // whut
# endif
} else {
tries = 0;
}
if (abort_on_fail && !tries) {
abort();
}
return out;
}
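
The retry-then-fall-back logic around rdrand can be sketched without inline assembly. The version below is only an illustration under stated assumptions: the hardware source is abstracted behind a hypothetical `hw_step_fn` callback, and `hw_random_retry`, `pick_iv` and the two mock sources are made-up names used to exercise the control flow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hardware step: returns 1 and fills *out on success, 0 on failure. */
typedef int (*hw_step_fn)(uint64_t *out);

/* Try the source up to max_tries times. Returns 1 on success, 0 otherwise. */
static int hw_random_retry(hw_step_fn step, int max_tries, uint64_t *out) {
    int tries;
    assert(step != 0 && out != 0);
    for (tries = 0; tries < max_tries; tries++) {
        if (step(out)) {
            return 1;
        }
    }
    return 0;
}

/* Prefer the hardware value; use the fallback (for example a timestamp
 * counter reading) if there is no source, it keeps failing, or it yields 0. */
static uint64_t pick_iv(hw_step_fn step, uint64_t fallback) {
    uint64_t v = 0;
    if (step && hw_random_retry(step, 1000, &v) && v != 0) {
        return v;
    }
    return fallback;
}

/* Demo mocks, only here so the sketch can be exercised. */
static int mock_calls = 0;

static int mock_flaky(uint64_t *out) {   /* fails twice, then succeeds */
    if (++mock_calls < 3) {
        return 0;
    }
    *out = 0x1234;
    return 1;
}

static int mock_dead(uint64_t *out) {    /* never succeeds */
    (void)out;
    return 0;
}
```

Bounding the number of attempts keeps a broken or exhausted hardware generator from stalling the caller, and the fallback keeps the IV varying even when no hardware value is available.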
/* ------------------------------- Vectorized code ------------------------------- */
#define shuffle(x,i) _mm_shuffle_epi32(x, \
i + ((i+1)&3)*4 + ((i+2)&3)*16 + ((i+3)&3)*64)
......@@ -278,7 +328,7 @@ crandom_init_from_file(
return err ? err : -1;
}
bzero(state->buffer, 96);
memset(state->buffer, 0, 96);
state->magic = CRANDOM_MAGIC;
state->reseeds_mandatory = reseeds_mandatory;
......@@ -292,7 +342,7 @@ crandom_init_from_buffer(
const char initial_seed[32]
) {
memcpy(state->seed, initial_seed, 32);
bzero(state->buffer, 96);
memset(state->buffer, 0, 96);
state->reseed_countdown = state->reseed_interval = state->fill = state->ctr = state->reseeds_mandatory = 0;
state->randomfd = -1;
state->magic = CRANDOM_MAGIC;
......@@ -305,7 +355,9 @@ crandom_generate(
unsigned long long length
) {
/* the generator isn't seeded; maybe they ignored the return value of init_from_file */
if (unlikely(state->magic != CRANDOM_MAGIC)) abort();
if (unlikely(state->magic != CRANDOM_MAGIC)) {
abort();
}
int ret = 0;
......@@ -313,8 +365,13 @@ crandom_generate(
if (unlikely(state->fill <= 0)) {
uint64_t iv = 0;
if (state->reseed_interval) {
/* it's nondeterministic, stir in some rdtsc() */
iv = rdtsc();
/* it's nondeterministic, stir in some rdrand() or rdtsc() */
if (HAVE(RDRAND)) {
iv = rdrand(0);
if (!iv) iv = rdtsc();
} else {
iv = rdtsc();
}
state->reseed_countdown--;
if (unlikely(state->reseed_countdown <= 0)) {
......@@ -335,11 +392,13 @@ crandom_generate(
* is basically over-engineering for caution. Also, the user might ignore
* the return code, so we still need to fill the request.
*
* Set reseed_countdown = 1 so we'll try again later. If the user's perf
* sucks as a result of ignoring the error code while calling us in a loop,
* well, he gets what he deserves.
* Set reseed_countdown = 1 so we'll try again later. If the user's
* performance sucks as a result of ignoring the error code while calling
* us in a loop, well, that's life.
*/
if (state->reseeds_mandatory) abort();
if (state->reseeds_mandatory) {
abort();
}
ret = errno;
if (ret == 0) ret = -1;
......@@ -361,7 +420,7 @@ crandom_generate(
unsigned long long copy = (length > state->fill) ? state->fill : length;
state->fill -= copy;
memcpy(output, state->buffer + state->fill, copy);
bzero(state->buffer + state->fill, copy);
memset(state->buffer + state->fill, 0, copy);
output += copy; length -= copy;
}
......@@ -371,11 +430,13 @@ crandom_generate(
void
crandom_destroy(
struct crandom_state_t *state
) {
if (state->randomfd) close(state->randomfd);
/* Ignore the return value, because what would it mean?
* "Your random device, which you were reading over NFS, lost some data"?
*/
) {
if (state->magic == CRANDOM_MAGIC && state->randomfd) {
(void) close(state->randomfd);
/* Ignore the return value from close(), because what would it mean?
* "Your random device, which you were reading over NFS, lost some data"?
*/
}
bzero(state, sizeof(*state));
memset(state, 0, sizeof(*state));
}
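
One caveat about wiping state with memset: a compiler is allowed to drop stores to memory that is never read again, so a final memset before an object goes out of use may be optimized away. A commonly used mitigation is to write through a volatile pointer. The sketch below shows that idea only; `secure_zero` is a hypothetical helper and is not what this commit does.

```c
#include <assert.h>
#include <stddef.h>

/* Zero n bytes at p through a volatile pointer, so that each store counts
 * as an observable access and is not removed as a dead write. */
static void secure_zero(void *p, size_t n) {
    volatile unsigned char *v = (volatile unsigned char *)p;
    assert(p != 0 || n == 0);
    while (n--) {
        *v++ = 0;
    }
}
```

Where the platform provides a dedicated routine for this purpose, that is usually preferable to a hand-written loop.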
......@@ -3,7 +3,11 @@
* Released under the MIT License. See LICENSE.txt for license information.
*/
/* A miniature version of the (as of yet incomplete) crandom project. */
/**
* @file crandom.h
* @author Mike Hamburg
* @brief A miniature version of the (as of yet incomplete) crandom project.
*/
#ifndef __GOLDI_CRANDOM_H__
#define __GOLDI_CRANDOM_H__ 1
......@@ -16,7 +20,14 @@
#include <strings.h> /* for bzero */
#include <unistd.h> /* for read */
/**
* @brief The state of a crandom generator.
*
* This object is opaque. It is not protected by a lock, and so must
* not be accessed by multiple threads at the same time.
*/
struct crandom_state_t {
/** @privatesection */
unsigned char seed[32];
unsigned char buffer[96];
uint64_t ctr;
......@@ -32,30 +43,93 @@ struct crandom_state_t {
extern "C" {
#endif
/**
* Initialize a crandom state from the chosen file.
*
* This function initializes a state from a given state file, or
* from a random device (e.g. /dev/random or /dev/urandom).
*
* You must check the return value of this function.
*
* @param [out] state The crandom state variable to initialize.
* @param [in] filename The name of the seed file or random device.
* @param [in] reseed_interval The number of 96-byte blocks which can be
* generated without reseeding. Suggest 10000.
* @param [in] reseeds_mandatory If nonzero, call abort() if a reseed fails.
* Suggest 1.
*
* @retval 0 Success.
* @retval Nonzero An error to be interpreted by strerror().
*/
int
crandom_init_from_file(
crandom_init_from_file (
struct crandom_state_t *state,
const char *filename,
int reseed_interval,
int reseeds_mandatory
) __attribute__((warn_unused_result));
/**
* Initialize a crandom state from a buffer, for deterministic operation.
*
* This function is used to initialize a crandom state deterministically,
* mainly for testing purposes. It can also be used to expand a secret
* random value deterministically.
*
* @warning The crandom implementation is not guaranteed to be stable.
* That is, a later release might produce a different random stream from
* the same seed.
*
* @param [out] state The crandom state variable to initialize.
* @param [in] initial_seed The seed value.
*/
void
crandom_init_from_buffer(
crandom_init_from_buffer (
struct crandom_state_t *state,
const char initial_seed[32]
);
/* TODO : attribute warn for not checking return type? */
/**
* Fill the output buffer with random data.
*
* This function uses the given crandom state to produce pseudorandom data
* in the output buffer.
*
* This function may perform reads from the state's random device if it needs
* to reseed. This could block if that file is a blocking source, such as
* a pipe or /dev/random on Linux. If reseeding fails and the state has
* reseeds_mandatory set, this function will call abort(). Otherwise, it will
* return an error code, but it will still randomize the buffer.
*
* If called on a corrupted, uninitialized or destroyed state, this function
* will abort().
*
* @warning This function is not thread-safe with respect to the state. Don't
* call it from multiple threads with the same state at the same time.
*
* @param [inout] state The crandom state to use for generation.
* @param [out] output The buffer to fill with random data.
* @param [in] length The length of the buffer.
*
* @retval 0 Success.
* @retval Nonzero A non-mandatory reseed operation failed.
*/
int
crandom_generate(
crandom_generate (
struct crandom_state_t *state,
unsigned char *output,
unsigned long long length
);
/**
* Destroy the random state. Further calls to crandom_generate() on that state
* will abort().
*
* @param [inout] state The state to be destroyed.
*/
void
crandom_destroy(
crandom_destroy (
struct crandom_state_t *state
);
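
The documented lifecycle (initialize, generate, destroy, with an abort on uninitialized or destroyed state) rests on a magic-value guard. The following is a minimal sketch of that pattern on a toy state, assuming nothing about the real struct layout. `toy_state`, `TOY_MAGIC` and the `toy_*` functions are hypothetical, and the check returns a flag instead of calling abort() so that it can be exercised safely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAGIC 0x72657375u   /* arbitrary illustrative constant */

struct toy_state {
    uint32_t magic;
    unsigned char seed[32];
};

/* Copy in the seed and mark the state as live. */
static void toy_init(struct toy_state *s, const unsigned char seed[32]) {
    assert(s != 0 && seed != 0);
    memcpy(s->seed, seed, 32);
    s->magic = TOY_MAGIC;
}

/* Returns 1 if the state looks initialized, 0 otherwise. */
static int toy_is_live(const struct toy_state *s) {
    return s->magic == TOY_MAGIC;
}

/* Wipe everything, including the magic, so later use is detectable. */
static void toy_destroy(struct toy_state *s) {
    memset(s, 0, sizeof(*s));
}
```

Because destroy clears the magic along with the seed, a later generate call on the same object fails the guard instead of silently producing output from zeroed state.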
......
/* Copyright (c) 2014 Cryptography Research, Inc.
* Released under the MIT License. See LICENSE.txt for license information.
/**
* @file ec_point.h
* @copyright
* Copyright (c) 2014 Cryptography Research, Inc. \n
* Released under the MIT License. See LICENSE.txt for license information.
* @author Mike Hamburg
* @warning This file was automatically generated.
*/
/* This file was generated with the assistance of a tool written in SAGE. */
#ifndef __CC_INCLUDED_P448_EDWARDS_H__
#define __CC_INCLUDED_P448_EDWARDS_H__
#ifndef __CC_INCLUDED_EC_POINT_H__
#define __CC_INCLUDED_EC_POINT_H__
#include "p448.h"
......@@ -12,28 +16,28 @@
extern "C" {
#endif
/*
/**
* Affine point on an Edwards curve.
*/
struct affine_t {
struct p448_t x, y;
};
/*
/**
* Affine point on a twisted Edwards curve.
*/
struct tw_affine_t {
struct p448_t x, y;
};
/*
/**
* Montgomery buffer.
*/
struct montgomery_t {
struct p448_t z0, xd, zd, xa, za;
};
/*
/**
* Extensible coordinates for Edwards curves, suitable for
* accumulators.
*
......@@ -55,7 +59,7 @@ struct extensible_t {
struct p448_t x, y, z, t, u;
};
/*
/**
* Extensible coordinates for twisted Edwards curves,
* suitable for accumulators.
*/
......@@ -63,16 +67,18 @@ struct tw_extensible_t {
struct p448_t x, y, z, t, u;
};
/*
* Niels coordinates for twisted Edwards curves. Good for
* mixed readdition; suitable for fixed tables.
/**
* Niels coordinates for twisted Edwards curves.
*
* Good for mixed readdition; suitable for fixed tables.
*/
struct tw_niels_t {
struct p448_t a, b, c;
};
/*
/**
* Projective niels coordinates for twisted Edwards curves.
*
* Good for readdition; suitable for temporary tables.
*/
struct tw_pniels_t {
......@@ -81,7 +87,7 @@ struct tw_pniels_t {
};
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -90,7 +96,7 @@ copy_affine (
const struct affine_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -99,7 +105,7 @@ copy_tw_affine (
const struct tw_affine_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -108,7 +114,7 @@ copy_montgomery (
const struct montgomery_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -117,7 +123,7 @@ copy_extensible (
const struct extensible_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -126,7 +132,7 @@ copy_tw_extensible (
const struct tw_extensible_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -135,7 +141,7 @@ copy_tw_niels (
const struct tw_niels_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Auto-generated copy method.
*/
static __inline__ void
......@@ -144,7 +150,7 @@ copy_tw_pniels (
const struct tw_pniels_t* ds
) __attribute__((unused,always_inline));
/*
/**
* Returns 1/sqrt(+- x).
*
* The Legendre symbol of the result is the same as that of the
......@@ -158,7 +164,7 @@ p448_isr (
const struct p448_t* x
);
/*
/**
* Returns 1/x.
*
* If x=0, returns 0.
......@@ -169,56 +175,80 @@ p448_inverse (
const struct p448_t* x
);
/*
/**
* Add two points on a twisted Edwards curve, one in Extensible form
* and the other in half-Niels form.
*/
void
add_tw_niels_to_tw_extensible (
struct tw_extensible_t* d,
const struct tw_niels_t* e
);
/**
* Add two points on a twisted Edwards curve, one in Extensible form
* and the other in half-Niels form.
*/
void
p448_tw_extensible_add_niels (
sub_tw_niels_from_tw_extensible (
struct tw_extensible_t* d,
const struct tw_niels_t* e
);
/*
/**
* Add two points on a twisted Edwards curve, one in Extensible form
* and the other in projective Niels form.
*/
void
p448_tw_extensible_add_pniels (
add_tw_pniels_to_tw_extensible (
struct tw_extensible_t* e,
const struct tw_pniels_t* a
);
/*
/**
* Subtract a point in projective Niels form from a point in Extensible
* form, both on a twisted Edwards curve.
*/
void
sub_tw_pniels_from_tw_extensible (
struct tw_extensible_t* e,
const struct tw_pniels_t* a
);
/**
* Double a point on a twisted Edwards curve, in "extensible" coordinates.
*/
void
p448_tw_extensible_double (
double_tw_extensible (
struct tw_extensible_t* a
);
/*
/**
* Double a point on an Edwards curve, in "extensible" coordinates.
*/
void
p448_extensible_double (
double_extensible (
struct extensible_t* a
);
/*
* 4-isogeny from untwisted to twisted.
/**
* Double a point, and transfer it to the twisted curve.
*
* That is, apply the 4-isogeny.
*/
void