strstr

We currently compare the following variants:
glibc - the implementation from glibc 2.13
sse2, ssse3, sse4_1 - implementations optimized using the corresponding instruction set extension
arit - my optimized implementation using only arithmetic
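
To give a feel for what "using only arithmetic" can mean, here is a minimal sketch of the classic word-at-a-time byte test, which uses plain integer operations and no SIMD extension. This is an illustration only: the helper names are hypothetical, and nothing here is claimed to match what the arit variant actually does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero iff at least one byte of v is 0x00 (well-known SWAR test). */
static inline uint64_t has_zero_byte(uint64_t v)
{
    return (v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL;
}

/* Hypothetical helper: nonzero iff at least one byte of v equals c.
   XOR with c repeated in every byte turns matching bytes into zero. */
static inline uint64_t has_byte(uint64_t v, unsigned char c)
{
    return has_zero_byte(v ^ (0x0101010101010101ULL * c));
}

/* Hypothetical helper: returns 1 if byte c occurs in the 8 bytes at p.
   A search loop could use this to skip blocks that cannot contain
   the first byte of the needle. */
static int block_contains(const char *p, unsigned char c)
{
    uint64_t w;
    memcpy(&w, p, sizeof w);   /* avoids unaligned access problems */
    return has_byte(w, c) != 0;
}
```

The test answers only whether a matching byte exists in the word. Finding its exact position needs an extra step, which is omitted here.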

We benchmark with the following inputs:
random - random needle and random haystack
random_nocache - same as random, except that the needle and haystack are not in cache
aaaa - needle aaaa and haystack baaabaaabaaab
aaab - needles aa...ab and haystacks aaaa...ab of corresponding sizes
planted - random needle; needle prefixes occur in the haystack with a distribution given by one of the files below
dist1
 100 0
 100 1
 1 8
dist2
 98 0
 2 8
dist3
 90 0
 10 8
dist4
 33 0
 33 5
 33 8
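
The planted input can be sketched as follows. This is a guess at the generator, not the benchmark's actual code. It assumes each line of a distribution file is a pair "weight prefix_length", and that a prefix length of 0 means one random filler byte. The names dist_entry, pick_prefix_len and fill_planted are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One line of a distribution file, read as "weight prefix_length"
   (assumed interpretation). */
struct dist_entry {
    unsigned weight;
    size_t prefix_len;
};

/* Pick a prefix length with probability weight / (sum of weights).
   The sum of weights must be nonzero. */
static size_t pick_prefix_len(const struct dist_entry *d, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += d[i].weight;

    unsigned r = (unsigned)rand() % total;
    for (size_t i = 0; i < n; i++) {
        if (r < d[i].weight)
            return d[i].prefix_len;
        r -= d[i].weight;
    }
    return d[n - 1].prefix_len; /* not reached when total > 0 */
}

/* Fill hay[0..hay_len) with planted needle prefixes and random
   lowercase filler, then NUL-terminate. hay must have room for
   hay_len + 1 bytes. */
static void fill_planted(char *hay, size_t hay_len,
                         const char *needle, size_t needle_len,
                         const struct dist_entry *d, size_t n)
{
    size_t pos = 0;
    while (pos < hay_len) {
        size_t k = pick_prefix_len(d, n);
        if (k > needle_len)
            k = needle_len;
        if (k > hay_len - pos)
            k = hay_len - pos;
        if (k == 0) {
            /* prefix length 0: emit one random filler byte */
            hay[pos++] = (char)('a' + rand() % 26);
        } else {
            /* plant the first k bytes of the needle */
            memcpy(hay + pos, needle, k);
            pos += k;
        }
    }
    hay[hay_len] = '\0';
}
```

Under this reading, dist1 would be passed as the three entries {100, 0}, {100, 1}, {1, 8}: mostly filler and one-byte prefixes, with an occasional eight-byte prefix.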

strstr

instructions instructions/byte range
i7 i7 i7
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII

strcasestr

instructions instructions/byte range
i7 i7 i7
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII

memmem

instructions instructions/byte range
i7 i7 i7
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII

strrchr

instructions instructions/byte range
i7 i7 i7
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII

memrchr

instructions instructions/byte range
i7 i7 i7
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII

strlen

instructions instructions/byte range
i7_nehalem i7_nehalem i7_nehalem
i7_ivy_bridge i7_ivy_bridge i7_ivy_bridge
core2 core2 core2
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII
fx10 fx10 fx10
atom atom atom

strchr

instructions instructions/byte range
i7_nehalem i7_nehalem i7_nehalem
i7_sandy_bridge i7_sandy_bridge i7_sandy_bridge
core2 core2 core2
opteron opteron opteron
xeon xeon xeon
phenomII phenomII phenomII
fx10 fx10 fx10

The benchmarks are available on GitHub.