This was introduced by the recent change in #11692 which prevented a
down-sizing rehashing while there is a fork.
## Solution
1. Fix the rehashing code, so that the same as it allows rehashing for up-sizing
during fork when the ratio is extreme, it will allow it for down-sizing as well.
Co-authored-by: Oran Agra <oran@redislabs.com>
This is a partial cherry pick of:
(cherry picked from commit b00a235186)
(cherry picked from commit d4c37320382edb342292a3e30250d46896a12016)
* Fix integer overflows due to using wrong integer size.
* Add assertions / panic when overflow still happens.
* Deletion of dead code to avoid need to maintain it
* Some changes are not because of bugs, but rather paranoia.
* Improve cmsgpack and cjson test coverage.
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
New test fails on valgrind because strtold("+inf") with valgrind returns a non-inf result
same thing is done in incr.tcl.
(cherry picked from commit c3b7bde914)
This bug seems to be there forever, CLIENT REPLY OFF|SKIP will
mark the client with CLIENT_REPLY_OFF or CLIENT_REPLY_SKIP flags.
With these flags, prepareClientToWrite called by addReply* will
return C_ERR directly. So the client can't receive the Pub/Sub
messages and any other push notifications, e.g client side tracking.
In this PR, we adding a CLIENT_PUSHING flag, disables the reply
silencing flags. When adding push replies, set the flag, after the reply,
clear the flag. Then add the flag check in prepareClientToWrite.
Fixes#11874
Note, the SUBSCRIBE command response is a bit awkward,
see https://github.com/redis/redis-doc/pull/2327
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 416842e6c0)
(cherry picked from commit f8ae7a414c)
(cherry picked from commit 96814a32da61e5ed523864e00609a4aa6be065b3)
Check the validity of the value before performing the create operation,
prevents new data from being generated even if the request fails to execute.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: chendianqiang <chendianqiang@meituan.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
(cherry picked from commit bc7fe41e58)
(cherry picked from commit 606a385935363ea46c0df4f40f8a949d85f7a20a)
(cherry picked from commit 7df23a5f51488ce002411c9d24b38520ad67b764)
Issue happens when passing a negative long value that greater than
the max positive value that the long can store.
(cherry picked from commit 41430af6a821c551abb862666ef896f2c196dea6)
(cherry picked from commit f335f9c55e76c76531780c5bbf8805410b7b3ba4)
Authenticated users can use string matching commands with a
specially crafted pattern to trigger a denial-of-service attack on Redis,
causing it to hang and consume 100% CPU time.
(cherry picked from commit e75f92047c22e659d49bba3a083cd0c9935f21e6)
(cherry picked from commit e8a9d3f63aebf6065d69bd0125d4b9c367f88def)
Turns out that a fork child calling getExpire while persisting keys (and
possibly also a result of some module fork tasks) could cause dictFind
to do incremental rehashing in the child process, which is both a waste
of time, and also causes COW harm.
(cherry picked from commit 2bec254d89)
(cherry picked from commit 3e82bdf738)
(cherry picked from commit 4803334cf6cb1eccdd33674a72a215ed6cd10069)
2. Make redis-cli flush stdout when printing a reply
This was needed in order to fix a hung in redis-cli test that uses
--replica.
Note that printf does flush when there's a newline, but fwrite does not.
3. fix the redis-cli --replica test which used to pass previously
because it didn't really care what it read, and because redis-cli
used printf to print these other things to stdout.
4. improve redis-cli --replica test to run with both diskless and disk-based.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>
(cherry picked from commit 1eb4baa5b8)
(cherry picked from commit 8884971223)
Before this commit, TLS tests on Ubuntu 22.04 would fail as dropped
connections result with an ECONNABORTED error thrown instead of an empty
read.
(cherry picked from commit 69d5576832)
**Signed integer overflow.** Although, signed overflow issue can be problematic time to time
and change how compiler generates code, current findings mostly about signed shift or simple
addition overflow. For most platforms Redis can be compiled for, this wouldn't cause any issue
as far as I can tell (checked generated code on godbolt.org).
UB means nothing guaranteed and risky to reason about program behavior but I don't think any
of the fixes here worth backporting. As sanitizers are now part of the CI, preventing new issues
will be the real benefit.
partial cherry pick from commit b91d8b289b
The bug in BITFIELD seems to affect 12.2.1 used on Alpine
(cherry picked from commit 4418cf166e025e7d0d2c965e75ad57c05ecff43f)
Authenticated users issuing specially crafted SETRANGE and SORT(_RO)
commands can trigger an integer overflow, resulting with Redis attempting
to allocate impossible amounts of memory and abort with an OOM panic.
Related to the hang reported in #11671
Currently, redis can disconnect a client due to reaching output buffer limit,
it'll also avoid feeding that output buffer with more data, but it will keep
running the loop in the command (despite the client already being marked for
disconnection)
This PR is an attempt to mitigate the problem, specifically for commands that
are easy to abuse, specifically: SRANDMEMBER.
The RAND family of commands can take a negative COUNT argument (which is not
bound to the number of elements in the key), so it's enough to create a key
with one field, and then these commands can be used to hang redis.
NOTICE:
For in Redis 7.0 this fix handles KEYS as well, but in this branch
it doesn't, details in #11676
Trying to avoid people opening crash report issues about module crashes and ARM QEMU bugs.
(cherry picked from commit 475563e2e9)
(cherry picked from commit 66472a5ee3)
On v6.2.7 a new mechanism was added to Lua scripts that allows
filtering the globals of the Lua interpreter. This mechanism was
added on the following commit: 11b602fbf8
One of the globals that was filtered out was `__redis__compare_helper`. This global
was missed and was not added to the allow list or to the deny list. This is
why we get the following warning when Redis starts:
`A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.`
After investigating the git blame log, the conclusion is that `__redis__compare_helper`
is no longer needed, the PR deletes this function, and fixes the warning.
Detailed Explanation:
`__redis__compare_helper` was added on this commit: https://github.com/redis/redis/commit/2c861050c1
Its purpose is to sort the replies of `SORT` command when script replication is enable and keep the replies
deterministic and avoid primary and replica synchronization issues. On `SORT` command, there was a need for
special compare function that are able to compare boolean values.
The need to sort the `SORT` command reply was removed on this commit: 36741b2c81
The sorting was moved to be part of the `SORT` command and there was not longer a need
to sort it on the Lua interpreter. The commit made `__redis__compare_helper` a dead code but did
not deleted it.
(cherry picked from commit 64c657a8af)
The theory is that a replica gets disconnected from within REPLCONF ACK,
so when we go up the stack, we'll crash when attempting to access
c->cmd->flags
(cherry picked from commit aa9beaca77)
Avoid printing "Killed by PID" when si_code != SI_USER.
Apparently SI_USER isn't always set to 0. e.g. on Mac it's 0x10001 and the check that did <= was wrong.
(cherry picked from commit 6761d10cc3)
This commit 0f8b634cd (CVE-2021-32626 released in 6.2.6, 6.0.16, 5.0.14)
fixes an invalid memory write issue by using `lua_checkstack` API to make
sure the Lua stack is not overflow. This fix was added on 3 places:
1. `luaReplyToRedisReply`
2. `ldbRedis`
3. `redisProtocolToLuaType`
On the first 2 functions, `lua_checkstack` is handled gracefully while the
last is handled with an assert and a statement that this situation can
not happened (only with misbehave module):
> the Redis reply might be deep enough to explode the LUA stack (notice
that currently there is no such command in Redis that returns such a nested
reply, but modules might do it)
The issue that was discovered is that user arguments is also considered part
of the stack, and so the following script (for example) make the assertion reachable:
```
local a = {}
for i=1,7999 do
a[i] = 1
end
return redis.call("lpush", "l", unpack(a))
```
This is a regression because such a script would have worked before and now
its crashing Redis. The solution is to clear the function arguments from the Lua
stack which makes the original assumption true and the assertion unreachable.
(cherry picked from commit 6b0b04f1b2)
There was no check min-slave-* config when evaluating Lua script.
Add check enough good slaves for write command when evaluating scripts.
Co-authored-by: Phuc. Vo Trong <phucvt@vng.com.vn>
(cherry picked from commit 34505d26f7)
Normally we execute the read event first and then the write event.
When the barrier is set, we will do it reverse.
However, under `kqueue`, if an `fd` has both read and write events,
reading the event using `kevent` will generate two events, which will
result in uncontrolled read and write timing.
This also means that the guarantees of AOF `appendfsync` = `always` are
not met on MacOS without this fix.
The main change to this pr is to cache the events already obtained when reading
them, so that if the same `fd` occurs again, only the mask in the cache is updated,
rather than a new event is generated.
This was exposed by the following test failure on MacOS:
```
*** [err]: AOF fsync always barrier issue in tests/integration/aof.tcl
Expected 544 != 544 (context: type eval line 26 cmd {assert {$size1 != $size2}} proc ::test)
```
(cherry picked from commit 306a5ccd2d)
If we want to check `defined(SYNC_FILE_RANGE_WAIT_BEFORE)`, we should include fcntl.h.
otherwise, SYNC_FILE_RANGE_WAIT_BEFORE is not defined, and there is alway not `sync_file_range` system call.
Introduced by #8532
(cherry picked from commit 8edc3cd62c)
this code is in use only if the master is disk-based, and the replica is
diskless. In this case we use a buffered reader, but we must avoid reading
past the rdb file, into the command stream. which Luckly rdb.c doesn't
really attempt to do (it knows how much it should read).
When rioConnRead detects that the extra buffering attempt reaches beyond
the read limit it should read less, but if the caller actually requested
more, then it should return with an error rather than a short read. the
bug would have resulted in short read.
in order to fix it, the code must consider the real requested size, and
not the extra buffering size.
(cherry picked from commit 40d7fca368)
- fix possible heap corruption in ziplist and listpack resulting by trying to
allocate more than the maximum size of 4GB.
- prevent ziplist (hash and zset) from reaching size of above 1GB, will be
converted to HT encoding, that's not a useful size.
- prevent listpack (stream) from reaching size of above 1GB.
- XADD will start a new listpack if the new record may cause the previous
listpack to grow over 1GB.
- XADD will respond with an error if a single stream record is over 1GB
- List type (ziplist in quicklist) was truncating strings that were over 4GB,
now it'll respond with an error.
When LUA call our C code, by default, the LUA stack has room for 20
elements. In most cases, this is more than enough but sometimes it's not
and the caller must verify the LUA stack size before he pushes elements.
On 3 places in the code, there was no verification of the LUA stack size.
On specific inputs this missing verification could have lead to invalid
memory write:
1. On 'luaReplyToRedisReply', one might return a nested reply that will
explode the LUA stack.
2. On 'redisProtocolToLuaType', the Redis reply might be deep enough
to explode the LUA stack (notice that currently there is no such
command in Redis that returns such a nested reply, but modules might
do it)
3. On 'ldbRedis', one might give a command with enough arguments to
explode the LUA stack (all the arguments will be pushed to the LUA
stack)
This commit is solving all those 3 issues by calling 'lua_checkstack' and
verify that there is enough room in the LUA stack to push elements. In
case 'lua_checkstack' returns an error (there is not enough room in the
LUA stack and it's not possible to increase the stack), we will do the
following:
1. On 'luaReplyToRedisReply', we will return an error to the user.
2. On 'redisProtocolToLuaType' we will exit with panic (we assume this
scenario is rare because it can only happen with a module).
3. On 'ldbRedis', we return an error.
The protocol parsing on 'ldbReplParseCommand' (LUA debugging)
Assumed protocol correctness. This means that if the following
is given:
*1
$100
test
The parser will try to read additional 94 unallocated bytes after
the client buffer.
This commit fixes this issue by validating that there are actually enough
bytes to read. It also limits the amount of data that can be sent by
the debugger client to 1M so the client will not be able to explode
the memory.
This change sets a low limit for multibulk and bulk length in the
protocol for unauthenticated connections, so that they can't easily
cause redis to allocate massive amounts of memory by sending just a few
characters on the network.
The new limits are 10 arguments of 16kb each (instead of 1m of 512mb)
The redis-cli command line tool and redis-sentinel service may be vulnerable
to integer overflow when parsing specially crafted large multi-bulk network
replies. This is a result of a vulnerability in the underlying hiredis
library which does not perform an overflow check before calling the calloc()
heap allocation function.
This issue only impacts systems with heap allocators that do not perform their
own overflow checks. Most modern systems do and are therefore not likely to
be affected. Furthermore, by default redis-sentinel uses the jemalloc allocator
which is also not vulnerable.
The vulnerability involves changing the default set-max-intset-entries
configuration parameter to a very large value and constructing specially
crafted commands to manipulate sets
GETBIT, SETBIT may access wrong address because of wrap.
BITCOUNT and BITPOS may return wrapped results.
BITFIELD may access the wrong address but also allocate insufficient memory and segfault (see CVE-2021-32761).
This commit uses `uint64_t` or `long long` instead of `size_t`.
related https://github.com/redis/redis/pull/8096
At 32bit platform:
> setbit bit 4294967295 1
(integer) 0
> config set proto-max-bulk-len 536870913
OK
> append bit "\xFF"
(integer) 536870913
> getbit bit 4294967296
(integer) 0
When the bit index is larger than 4294967295, size_t can't hold bit index. In the past, `proto-max-bulk-len` is limit to 536870912, so there is no problem.
After this commit, bit position is stored in `uint64_t` or `long long`. So when `proto-max-bulk-len > 536870912`, 32bit platforms can still be correct.
For 64bit platform, this problem still exists. The major reason is bit pos 8 times of byte pos. When proto-max-bulk-len is very larger, bit pos may overflow.
But at 64bit platform, we don't have so long string. So this bug may never happen.
Additionally this commit add a test cost `512MB` memory which is tag as `large-memory`. Make freebsd ci and valgrind ci ignore this test.
* This test is disabled in this version since bitops doesn't rely on
proto-max-bulk-len. some of the overflows can still occur so we do want
the fixes.
(cherry picked from commit 71d452876e)
Fix module info genModulesInfoStringRenderModulesList lack separator when there's more than one module in the list.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 1895e134a7)
in case dest key already contains the member, the dest key isn't modified, so the command shouldn't invalidate watch.
(cherry picked from commit 11dc4e59b3)
Make it possible for redis-cli cluster import to work with source and
target that require AUTH.
Adding two different flags --cluster-from-user, --cluster-from-pass
and --cluster-askpass for source node authentication.
Also for target authentication, using existing --user and --pass flag.
Example:
./redis-cli --cluster import 127.0.0.1:7000 --cluster-from 127.0.0.1:6379 --pass 1234 --user default --cluster-from-user default --cluster-from-pass 123456
./redis-cli --cluster import 127.0.0.1:7000 --cluster-from 127.0.0.1:6379 --askpass --cluster-from-user default --cluster-from-askpass
(cherry picked from commit 639b73cd2a)
- promote the code in DEBUG PROTOCOL to addReplyBigNum
- DEBUG PROTOCOL ATTRIB skips the attribute when client is RESP2
- networking.c addReply for push and attributes generate assertion when
called on a RESP2 client, anything else would produce a broken
protocol that clients can't handle.
(cherry picked from commit 6a5bac309e)
(cherry picked from commit 7f38aa8bc719f709acdcefc35a45a7aa6faa76fa)
This makes it possible to distinguish between null response and an empty
array (currently the tests infra translates both to an empty string/list)
(cherry picked from commit 7103367ad4)
(cherry picked from commit e04bce2f01a369f57893be2bd0109e21f14f037e)
In clusterManagerCommandImport strcat was used to concat COPY and
REPLACE, the space maybe not enough.
If we use --cluster-replace but not --cluster-copy, the MIGRATE
command contained COPY instead of REPLACE.
(cherry picked from commit a049f6295a)
(cherry picked from commit d4771a995e99e61e2e8feb92ede06431195b0300)
SINTERSTORE would have deleted the dest key right away,
even when later on it is bound to fail on an (WRONGTYPE) error.
With this change it first picks up all the input keys, and only later
delete the dest key if one is empty.
Also add more tests for some commands.
Mainly focus on
- `wrong type error`:
expand test case (base on sinter bug) in non-store variant
add tests for store variant (although it exists in non-store variant, i think it would be better to have same tests)
- the dstkey result when we meet `non-exist key (empty set)` in *store
sdiff:
- improve test case about wrong type error (the one we found in sinter, although it is safe in sdiff)
- add test about using non-exist key (treat it like an empty set)
sdiffstore:
- according to sdiff test case, also add some tests about `wrong type error` and `non-exist key`
- the different is that in sdiffstore, we will consider the `dstkey` result
sunion/sunionstore add more tests (same as above)
sinter/sinterstore also same as above ...
(cherry picked from commit b8a5da80c4)
(cherry picked from commit f4702b8b7a7da6cc661ddb6744cb322bc92e3267)
When using RESP3, ZPOPMAX/ZPOPMIN should return nested arrays for consistency
with other commands (e.g. ZRANGE).
We do that only when COUNT argument is present (similarly to how LPOP behaves).
for reasoning see https://github.com/redis/redis/issues/8824#issuecomment-855427955
This is a breaking change only when RESP3 is used, and COUNT argument is present!
(cherry picked from commit 7f342020dc)
(cherry picked from commit caaad2d686b2af0d13fbeda414e2b70e57635b5c)