> Besides the core algorithms I have added a few of my own to
> see how they measure up. We have csum which is your normal IP header
> checksum.
The IP header checksum is only 16-bit, so it is only suitable for a small
amount of data. It is definitely not suitable for 1MB or 4MB RPC sizes.
The size is just one factor making it vulnerable to collisions. The data
position in the stream is not taken into consideration. There are many
papers on its weaknesses.
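
To make the position weakness concrete, here is a minimal sketch of the
16-bit ones' complement checksum in the style of RFC 1071. It is written
for illustration only and is not the kernel's csum code; the function name
and the sample buffers are my own. Because the sum is commutative, moving
16-bit words around in the buffer leaves the checksum unchanged.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071 (illustrative)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Two buffers holding the same 16-bit words in a different order.
original = bytes.fromhex("0001f203f4f5f6f7")
reordered = bytes.fromhex("f4f50001f6f7f203")

collides = internet_checksum(original) == internet_checksum(reordered)
print(hex(internet_checksum(original)), hex(internet_checksum(reordered)), collides)
```

With only 65536 possible values, a 1MB or 4MB RPC also has a 1 in 65536
chance of a random corruption going unnoticed, on top of the reordering
case shown here.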
> For the non-cryptographic hashes it's the IP checksum and
> that does the best. This version of murmur3 only generates 32-bit
> checksums, but there exists a 128-bit version that is supposed to be faster.
> It could be worthwhile to explore. The IP checksum from the Linux
> kernel is assembly optimized but my additional algorithms are generic
You should test with the kernel cryptoapi code, since AFAIK there are
assembly versions of the common algorithms already. Check out how the
libcfs code is already handling the crc32 code - it benchmarks each
algorithm at startup and dumps the results in the Lustre debug log.
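
The benchmark-at-startup idea can be sketched in user space as below. This
is not the libcfs code; the function name, buffer size, round count, and
the choice of algorithms (whatever the Python standard library offers) are
assumptions made for illustration. It times each algorithm over a 1MB
buffer and reports throughput, which is the same kind of table one would
look for in the debug log.

```python
import hashlib
import time
import zlib


def benchmark(algorithms, size=1 << 20, rounds=5):
    """Time each checksum over a buffer of `size` bytes and return MB/s per name."""
    buf = bytes(range(256)) * (size // 256)
    results = {}
    for name, func in algorithms.items():
        start = time.perf_counter()
        for _ in range(rounds):
            func(buf)
        elapsed = max(time.perf_counter() - start, 1e-9)  # avoid divide by zero
        results[name] = len(buf) * rounds / elapsed / 1e6
    return results


# Hypothetical algorithm table; a real test would register the kernel versions.
ALGORITHMS = {
    "crc32": zlib.crc32,
    "adler32": zlib.adler32,
    "md5": lambda b: hashlib.md5(b).digest(),
    "sha1": lambda b: hashlib.sha1(b).digest(),
}

results = benchmark(ALGORITHMS)
fastest = max(results, key=results.get)
for name, speed in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {speed:10.1f} MB/s")
print("fastest:", fastest)
```

The numbers only say something about the machine they were measured on,
which is the argument for measuring at startup rather than hard-coding a
preference.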
Currently the most useful optimized algorithms are in the 3.10-rcX
kernels. I have only run Lustre up to a Linux 3.9.4 kernel so far. My
first runs at collecting data have been with the Lustre debug logs as
well as tcrypt.
> The final question: is the Lustre community interested in the
> algorithms? If so I can push forward that work.
I'm not against it if there are significant improvements to be had.
Out of the test algorithms I implemented I would say siphash might be
interesting to the UI guys.
It surprises me that newer CPUs do not have hardware-accelerated checksums
of some sort. Is it just that the assembly versions have not been implemented
in the kernels that Lustre is running on? Could they be implemented in libcfs
as was done with crc32 and then submitted to the upstream kernel (so everyone
benefits and we don't have to maintain them forever)?
We have some production systems that only support up to SSE3 so we can't
take advantage of any hardware acceleration for crc32 or crc32c. Down
the line I will most likely need to implement this.
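
For anyone who wants to check their own nodes, a rough sketch follows. On
x86 Linux the crc32c instruction arrives with SSE4.2 (the `sse4_2` flag in
/proc/cpuinfo), the accelerated crc32 code relies on PCLMULQDQ (the
`pclmulqdq` flag), and SSE3 shows up as `pni`. The helper names and the
sample flags line are made up for illustration.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def hw_checksum_support(flags: set) -> dict:
    """Map each checksum to whether the CPU flag it depends on is present."""
    return {
        "crc32c": "sse4_2" in flags,    # SSE4.2 crc32 instruction
        "crc32": "pclmulqdq" in flags,  # carry-less multiply used for folding
    }


# Made-up sample resembling an SSE3-only node ("pni" is how Linux reports SSE3).
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 pni\n"

support = hw_checksum_support(cpu_flags(SAMPLE))
print(support)
```

On a real node the text would come from reading /proc/cpuinfo instead of
the sample string.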