It was verified in practice that this test is able to stress much more
the implementation by introducing errors that were only trivially to
detect with different offsets but impossible to detect starting always
at zero and counting bits the full length of the string.
So instead to reply with a generic error like:
-ERR ... wrong kind of value ...
now it replies with:
-WRONGTYPE ... wrong kind of value ...
This makes this particular error easy to check without resorting to
(fragile) pattern matching of the error string (however the error string
used to be consistent already).
Client libraries should return a specific exeption type for this error.
Most of the commit is about fixing unit tests.
Bug #582 was not present in 32 bit builds of Redis as
getObjectFromLong() will return an error for overflow.
This commit makes sure that the test does not fail because of the error
returned when running against 32 bit builds.
remove unsafe and unnecessary cast.
until now, this cast may lead segmentation fault when end > UINT_MAX
setbit foo 0 1
bitcount 0 4294967295
=> ok
bitcount 0 4294967296
=> cause segmentation fault.
Note by @antirez: the commit was modified a bit to also change the
string length type to long, since it's guaranteed to be at max 512 MB in
size, so we can work with the same type across all the code path.
A regression test was also added.
In the issue #529 an user reported a bug that can be triggered with the
following code:
flushdb
set a
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
bitop or x a b
The bug was introduced with the speed optimization in commit 8bbc076
that specializes every BITOP operation loop up to the minimum length of
the input strings.
However the computation of the minimum length contained an error when a
non existing key was present in the input, after a key that was non zero
length.
This commit fixes the bug and adds a regression test for it.
This commit adds a fast-path to the BITOP that can be used for all the
bytes from 0 to the minimal length of the string, and if there are
at max 16 input keys.
Often the intersected bitmaps are roughly the same size, so this
optimization can provide a 10x speed boost to most real world usages
of the command.
Bytes are processed four full words at a time, in loops specialized
for the specific BITOP sub-command, without the need to check for
length issues with the inputs (since we run this algorithm only as far
as there is data from all the keys at the same time).
The remaining part of the string is intersected in the usual way using
the slow but generic algorith.
It is possible to do better than this with inputs that are not roughly
the same size, sorting the input keys by length, by initializing the
result string in a smarter way, and noticing that the final part of the
output string composed of only data from the longest string does not
need any proecessing since AND, OR and XOR against an empty string does
not alter the output (zero in the first case, and the original string in
the other two cases).
More implementations will be implemented later likely, but this should
be enough to release Redis 2.6-RC4 with bitops merged in.
Note: this commit also adds better testing for BITOP NOT command, that
is currently the faster and hard to optimize further since it just
flips the bits of a single input string.
A bug in the implementation caused BITOP to crash the server if at least
one one of the source objects was integer encoded.
The new implementation takes an additional array of Redis objects
pointers and calls getDecodedObject() to get a reference to a string
encoded object, and then uses decrRefCount() to release the object.
Tests modified to cover the regression and improve coverage.
Fuzzing tests of BITCOUNT / BITOP are iterated multiple times.
The new BITCOUNT fuzzing test uses random strings in a wider interval of
lengths including zero-len strings.
The Redis implementation is tested against Tcl implementations of the
same operation. Both fuzzing and testing of specific aspects of the
commands behavior are performed.