Now, when clients run the unsubscribe, sunsubscribe and punsubscribe
commands in the non-subscribed mode, it returns 0.
Indeed this is a bug, we should not allow client run these kind of
commands here.
Thus, this PR fixes this bug, but it is a break change for existing
clients
---------
Signed-off-by: hwware <wen.hui.ware@gmail.com>
In #764, we add a --io-threads mode in test, but forgot
to handle runtest-cluster, they are different framework.
Currently runtest-cluster does not support tags, and we
don't have plan to support it. And currently cluster tests
does not have any io-threads tests, so this PR just align
--io-threads option with #764.
Signed-off-by: Binbin <binloveplay1314@qq.com>
If n is already myself primary, there is no need to re-establish the
replication connection.
In the past we allow a replica node to reconnect with its primary via
this CLUSTER REPLICATE command, it will use psync. But since #885, we
will assume that a full sync is needed in this case, so if we don't do
this, the replica will always use full sync.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@google.com>
Fix feedback loop in key eviction with tracking clients when using I/O
threads.
Current issue:
Evicting keys while tracking clients or key space-notification exist
creates a feedback loop when using I/O threads:
While evicting keys we send tracking async writes to I/O threads,
preventing immediate release of tracking clients' COB memory
consumption.
Before the I/O thread finishes its write, we recheck used_memory, which
now includes the tracking clients' COB and thus continue to evict more
keys.
**Fix:**
We will skip the test for now while IO threads are active. We may
consider avoiding sending writes in `processPendingWrites` to I/O
threads for tracking clients when we are out of memory.
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Update README.md since the project is no longer under construction, and
can reference the main website.
---------
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
The repl-backlog-size 1mb is too small in most cases, now network
transmission and bandwidth performance have improved rapidly in more
than ten years.
The bigger the replication backlog, the longer the replica can endure
the disconnect and later be able to perform a partial resynchronization.
Part of #653.
---------
Signed-off-by: Binbin <binloveplay1314@qq.com>
Just add one more test for command "sentinel IS-PRIMARY-DOWN-BY-ADDR" to
make the reply-schemas-validator
run successfully.
Note: test result here
https://github.com/hwware/valkey/actions/runs/10457516111
Signed-off-by: hwware <wen.hui.ware@gmail.com>
If the client side crashes by any issue or exits normally, the kernel
will try to disconnect RDMA QPs. Then the kernel of server side receives
CM packets, valkey-server handles CM disconnected event and close
connection.
However, there is a lack of keepalive mechanism from RDMA transport
layer. Once the kernel of client side crashes, the server side will not
be notified. To avoid this issue, valkey server sents Keepaliv command
periodically to detect any dead QPs.
An example of mlx-cx5:
```
# RDMA: CQ handle error status: transport retry counter exceeded[0xc], opcode : 0x0
# RDMA: CQ handle error status: transport retry counter exceeded[0xc], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
```
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Our current replica can initiate a failover without restriction when
it detects that the primary node is offline. This is generally not a
problem. However, consider the following scenarios:
1. In slot migration, a primary loses its last slot and then becomes
a replica. When it is fully synchronized with the new primary, the new
primary downs.
2. In CLUSTER REPLICATE command, a replica becomes a replica of another
primary. When it is fully synchronized with the new primary, the new
primary downs.
In the above scenario, case 1 may cause the empty primary to be elected
as the new primary, resulting in primary data loss. Case 2 may cause the
non-empty replica to be elected as the new primary, resulting in data
loss and confusion.
The reason is that we have cached primary logic, which is used for psync.
In the above scenario, when clusterSetPrimary is called, myself will cache
server.primary in server.cached_primary for psync. In replicationGetReplicaOffset,
we get server.cached_primary->reploff for offset, gossip it and rank it,
which causes the replica to use the old historical offset to initiate
failover, and it get a good rank, initiates election first, and then is
elected as the new primary.
The main problem here is that when the replica has not completed full
sync, it may get the historical offset in replicationGetReplicaOffset.
The fix is to clear cached_primary in these places where full sync is
obviously needed, and let the replica use offset == 0 to participate
in the election. In this way, this unhealthy replica has a worse rank
and is not easy to be elected.
Of course, it is possible that it will be elected with offset == 0.
In the future, we may need to prohibit the replica with offset == 0
from having the right to initiate elections.
Another point worth mentioning, in above cases:
1. In the ROLE command, the replica status will be handshake, and the
offset will be -1.
2. Before this PR, in the CLUSTER SHARD command, the replica status will
be online, and the offset will be the old cached value (which is wrong).
3. After this PR, in the CLUSTER SHARD, the replica status will be loading,
and the offset will be 0.
Signed-off-by: Binbin <binloveplay1314@qq.com>
In these places we should use RDB_EOF_MARK_SIZE, but we mixed
it with CONFIG_RUN_ID_SIZE. This is not an issue since they are
all 40, just a cleanup.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Saves some memory (one pointer) in the client struct.
Since a client cannot be blocked multiple times, we can assume
it will be held in only one extra utility list, so it is ok to maintain
a union of these listNode references.
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Currently, if the client enters a blocked state, it will be
added to the server.clients_waiting_acks list. When the client
is unblocked, that is, when unblockClient is called, we will
need to linearly traverse server.clients_waiting_acks to delete
the client, and this search is O(N).
When WAIT (or WAITAOF) is used extensively in some cases, this
O(N) search may be time-consuming. We can remember the list node
and store it in the blockingState struct and it can avoid the
linear search in unblockClientWaitingReplicas.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Add 4 new commands for Sentinel (reference
https://github.com/valkey-io/valkey/issues/36)
Sentinel GET-PRIMARY-ADDR-BY-NAME
Sentinel PRIMARY
Sentinel PRIMARIES
Sentinel IS-PRIMARY-DOWN-BY-ADDR
and deprecate 4 old commands:
Sentinel GET-MASTER-ADDR-BY-NAME
Sentinel MASTER
Sentinel MASTERS
Sentinel IS-MASTER-DOWN-BY-ADDR
and all sentinel tests pass here
https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583
Note:
1. runtest-sentinel pass all test cases
2. I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3
sentinel
there are 4 steps in this test scenario:
step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and
then new primary can be voted successfully.
step 2: replace sentinel 1 with new sentinel bin file, and then shutdown
primary, and then another new primary can be voted successfully
step 3: replace sentinel 2 with new sentinel bin file, and then shutdown
primary, and then another new primary can be voted successfully
step 4: replace sentinel 3 with new sentinel bin file, and then shutdown
primary, and then another new primary can be voted successfully
We can see, even mixed version sentinel running, whole system still
works.
---------
Signed-off-by: hwware <wen.hui.ware@gmail.com>
For `debug object` command, we use `val->lru` but ignore the `lfu` mode.
So in `lfu` mode, `debug object` would return meaningless `lru` descriptions.
Added two new fields lfu_freq and lfu_access_time_minutes.
Signed-off-by: jiangyujie.jyj <yjjiang1996@163.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Adding FAST option to DEBUG OBJECT command.
The light version only shows the light weight infomation,
which mostly O(1). The pre-existing version that show more
stats such as serializedlength sometimes is time consuming.
This should allow looking into debug stats (the key expired
but not deleted), even on huge object, on which we're afraid
to run the command for fear of causing a server freeze.
Somehow like 3ca451c46f.
Signed-off-by: Binbin <binloveplay1314@qq.com>
In #786, we did skip it in the daily, but not for the others.
When running ./runtest on MacOS, we will get the failure.
```
couldn't open socket: host is unreachable (nodename nor servname provided, or not known)
```
The reason is that TCL 8.5 doesn't support ipv6, so we skip tests
tagged with ipv6. This also revert #786.
Signed-off-by: Binbin <binloveplay1314@qq.com>
I've tried to test a dual channel replication but forgot to add +sync
for my replication user. As a result replica entered silent cycle like
this:
```
* Connecting to PRIMARY 127.0.0.1:6379
* PRIMARY <-> REPLICA sync started
* Non blocking connect for SYNC fired the event.
* Primary replied to PING, replication can continue...
* Trying a partial resynchronization (request ...)
* PSYNC is not possible, initialize RDB channel.
* Aborting dual channel sync
```
And primary got endless cycle like this:
```
* Replica 127.0.0.1:6380 asks for synchronization
* Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '...', my replication IDs are '...' and '...')
* Replica 127.0.0.1:6380 is capable of dual channel synchronization, and partial sync isn't possible. Full sync will continue with dedicated RDB channel.
```
There was no way to understand that replication user is missing +sync
acl on notice log level. With this one-line change we get a warning
message in our replica log.
---------
Signed-off-by: secwall <secwall@yandex-team.ru>
Update references of copyright being assigned to Salvatore when it was
transferred to Redis Ltd. as per
https://github.com/valkey-io/valkey/issues/544.
---------
Signed-off-by: Pieter Cailliau <pieter@redis.com>
A pointer to dtype is stored in the dict forever.
dtype is stack-allocated while the dict created is global.
The dict (and the pointer to dtype in it) will live past the lifetime of
dtype.
clusterManagerLinkDictType is a global object that has the same values
as dtype.
Signed-off-by: Salvatore Mesoraca <salvatore.mesoraca@aiven.io>
Fix#821
During the `FAILOVER` process, when conditions are met (such as when the
force time is reached or the primary and replica offsets are
consistent), the primary actively becomes the replica and transitions to
the `FAILOVER_IN_PROGRESS` state. After the primary becomes the replica,
and after handshaking and other operations, it will eventually send the
`PSYNC FAILOVER` command to the replica, after which the replica will
become the primary. This means that the upgrade of the replica to the
primary is an asynchronous operation, which implies that during the
`FAILOVER_IN_PROGRESS` state, there may be a period of time where both
nodes are replicas. In this scenario, if a `-REDIRECT` is returned, the
request will be redirected to the replica and then redirected back,
causing back and forth redirection. To avoid this situation, during the
`FAILOVER_IN_PROGRESS state`, we temporarily suspend the clients that
need to be redirected until the replica truly becomes the primary, and
then resume the execution.
---------
Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
We are updating this variable in the main thread, and the
child threads can printing the logs at the same time. This
generating a warning in SANITIZER=thread:
```
WARNING: ThreadSanitizer: data race (pid=74208)
Read of size 4 at 0x000102875c10 by thread T3:
#0 serverLogRaw <null>:52173615 (valkey-server:x86_64+0x10003c556)
#1 _serverLog <null>:52173615 (valkey-server:x86_64+0x10003ca89)
#2 bioProcessBackgroundJobs <null>:52173615 (valkey-server:x86_64+0x1001402c9)
Previous write of size 4 at 0x000102875c10 by main thread (mutexes: write M0):
#0 afterSleep <null>:52173615 (valkey-server:x86_64+0x10004989b)
#1 aeProcessEvents <null>:52173615 (valkey-server:x86_64+0x100031e52)
#2 main <null>:52173615 (valkey-server:x86_64+0x100064a3c)
#3 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
#4 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
```
The refresh of daylight_active is not real time, we update
it in aftersleep, so we don't need a strong synchronization,
so using memory_order_relaxed. But also noted we are doing
load/store operations only for daylight_active, which is an
aligned 32-bit integer, so using memory_order_relaxed will
not provide more consistency than what we have today.
So this is just a cleanup that to clear the warning.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Test dual-channel-replication primary gets cob overrun during replica
rdb load` fails during the Valgrind run. This is due to the load
handlers disconnecting before the tests complete, resulting in a low
primary COB. Increasing the handlers' timeout should resolve this issue.
Failure:
https://github.com/valkey-io/valkey/actions/runs/10361286333/job/28681321393
Server logs reveals that the load handler clients were disconnected
before the test started
Also the two previus test took about 20 seconds which is the handler
timeout.
---------
Signed-off-by: naglera <anagler123@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
The clusterManagerCompareKeysValues is introduced in
143bfa1e6e, which calls
DEBUG DIGEST-VALUE to check whether the value of the
source node and the target node are consistent.
However, the DEBUG DIGEST-VALUE command is not supported
in older version, such as version 4, and the node will return
unknown subcommand. Or the DEBUG command can be disabled
in version 7, and the node will return DEBUG not allowed.
In these cases, we need to output friendly message to
allow users to proceed to the next step, instead of just
outputing `Value check failed!`.
Unknown subcommand example:
```
*** Target key exists
*** Checking key values on both nodes...
Node 127.0.0.1:30001 replied with error:
ERR unknown subcommand or wrong number of arguments for 'DIGEST-VALUE'. Try DEBUG HELP.
Node 127.0.0.1:30003 replied with error:
ERR unknown subcommand or wrong number of arguments for 'DIGEST-VALUE'. Try DEBUG HELP.
*** Value check failed!
DEBUG DIGEST-VALUE command is not supported.
You can relaunch the command with --cluster-replace option to force key overriding.
```
DEBUG not allowed example:
```
*** Target key exists
*** Checking key values on both nodes...
Node 127.0.0.1:30001 replied with error:
ERR DEBUG command not allowed. If the enable-debug-command option is ...
Node 127.0.0.1:30003 replied with error:
ERR DEBUG command not allowed. If the enable-debug-command option is ...
*** Value check failed!
DEBUG command is not allowed.
You can turn on the enable-debug-command option.
Or you can relaunch the command with --cluster-replace option to force key overriding.
```
Signed-off-by: Binbin <binloveplay1314@qq.com>
In the past implementation of `ZUNION` and `ZUNIONSTORE` commands, we
first create a temporary dict called `accumulator`. After adding all
member-score mappings to `accumulator`, we still need to convert
`accumulator` back to the final dict `dstzset->dict`. However, we can
directly use `dstzset->dict` to avoid the additional copy operation.
This PR removes the `accumulator` dict and directly uses` dstzset->dict
`to store the member-score mappings.
- **Test**
First, I added 300 unique elements to two sorted sets called
'zunion_test1' and 'zunion_test2'. Then, I tested `zunion` and
`zunionstore` on these two sorted sets. The test results shown below
indicate that the performance of both zunion and zunionstore improved
about 31%.
### ZUNION
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2
Summary:
throughput summary: 2713.41 requests per second
latency summary (msec):
avg min p50 p95 p99 max
146.252 3.464 153.343 182.015 184.959 192.895
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2
Summary:
throughput summary: 3543.84 requests per second
latency summary (msec):
avg min p50 p95 p99 max
108.259 2.984 114.239 141.695 145.151 160.255
```
### ZUNIONSTORE
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2
Summary:
throughput summary: 3168.07 requests per second
latency summary (msec):
avg min p50 p95 p99 max
157.511 3.368 183.167 189.311 193.535 231.679
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2
Summary:
throughput summary: 4144.73 requests per second
latency summary (msec):
avg min p50 p95 p99 max
120.374 2.648 141.823 149.119 153.855 183.167
```
---------
Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Signed-off-by: zisong.cw <zisong.cw@alibaba-inc.com>
A configuration option with zero impact on server operation but is
printed out on server crash and can be accessed by gdb for debugging. It
can be used by the user/operator to store any free-form string. This
string will persist as long as the server is running and will be
accessible in the following ways:
And printed in crash reports:
```
------ CONFIG DEBUG OUTPUT ------
lazyfree-lazy-eviction no
...
io-threads-do-reads yes
debug-context "test2"
proto-max-bulk-len 512mb
```
---------
Signed-off-by: Eran Liberty <eranl@amazon.com>
Co-authored-by: Eran Liberty <eranl@amazon.com>
- Fix TLS bug where connection were shutdown by primary's main process
while the child process was still writing- causing main process to be
blocked.
- TLS connection fix -file descriptors are set to blocking mode in the
main thread, followed by a blocking write. This sets the file
descriptors to non-blocking if TLS is used (see `connTLSSyncWrite()`)
(@xbasel).
- Improve the reliability of dual-channel tests. Modify the pause
mechanism to verify process status directly, rather than relying on log.
- Ensure that `server.repl_offset` and `server.replid` are updated
correctly when dual channel synchronization completes successfully.
Thist led to failures in replication tests that validate replication IDs
or compare replication offsets.
---------
Signed-off-by: naglera <anagler123@gmail.com>
Signed-off-by: naglera <58042354+naglera@users.noreply.github.com>
Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: ranshid <88133677+ranshid@users.noreply.github.com>
Co-authored-by: xbasel <103044017+xbasel@users.noreply.github.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
When debug assert mode enabled, verify that we don't insert the same
client twice into server.clients_to_close.
Signed-off-by: naglera <anagler123@gmail.com>
Currently lastbgsave_status is used in bgsave or disk-replication,
and the target is the disk. In #60, we update it when transfer error,
i think it is mainly used in tests, so we can use log to replace it.
It changes lastbgsave_status to err in this case, but it is strange
that it does not set ok or err in the above if and the following else.
Also noted this will affect stop-writes-on-bgsave-error.
Signed-off-by: Binbin <binloveplay1314@qq.com>
I think we should first check if the server is currently enabled in
cluster mode or if it has modules loaded prior to the throttled cron run
(`run_with_period`) condition.
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Added an assertion to avoid incorrect usage of the network bytes out for
replication code flow in slot stats computation.
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
This PR adjusts the logging conditions of clusterLogCantFailover
in this two ways.
1. For the same cant_failover_reason, we will print the log once
in CLUSTER_CANT_FAILOVER_RELOG_PERIOD, but its value is 10s, which
is a bit long, shorten it to 1s, so we can better track its state.
We get to see the system making progress by watching the message.
Using 1s also covers pretty much all cases as i don't see a reason
for using a <1s node timeout, test or prod.
2. We will not print logs before the nolog_fail_time, its value
is cluster-node-timeout+5000. This may casue us to lose some logs,
for example, if cluster-node-timeout is small, auth_timeout will
be 2000, and auth_retry_time will be 4000. In this case, we will
lose all the reasons during the election if the failover is timedout.
So remove the nolog_fail_time logic, since we still do have the
CLUSTER_CANT_FAILOVER_RELOG_PERIOD logic, we won't print too many
logs.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Few CI improvements witch will reduce occupation CI queue and eliminate
stale runs.
1. Kill CI jobs on PRs once PR branch gets a new push. This will prevent
situation happened today - a huge job triggered twice in less than an
hour and occupied all **org** (for all repositories) runners queue for
the rest of the day (see pic). This completely blocked valkey-glide
team.
2. Distribute nightly croned jobs on time to prevent them running
together. Keep in mind, cron's TZ is UTC, so midnight tasks incur
developers located in other timezones.
This must be backported to all release branches (`valkey-x.y` and `x.y`)
![image](https://github.com/user-attachments/assets/923d8237-3cb7-42f5-80c8-5322b3f5187d)
---------
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
This example of a minimal user account in your Valkey server
for Sentinel is incorrect. If you add this ACL as-is to your
valkey users.acl, valkey will add resetchannels -@all before
the +client which prevents sentinel from publishing messages
to the __sentinel__:hello pubsub for sentinel discovery.
Fix#744.
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
We have a number of test failures in the empty shard migration which
seem to be related to race conditions in the failover, but could be more
pervasive. For now disable the tests to prevent so many false negative
test failures.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Release candidates have a version that is lower than 8.0.0 to allow for
8.0.0 to have 0x080000 as a release number. However, we did an explicit
check to make sure a version was 8.0 or greater to validate a replica
supports a feature. Now we are using the highest patch version of latest
minor to do the comparison to accommodate future versions.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Fix bug in writeToClient
In https://github.com/valkey-io/valkey/pull/758, a major refactor was
done to `networking.c`.
As part of this refactor, a new bug was introduced: we don't advance the
`c->buf` pointer in repeated writes.
This bug should be very unlikely to manifest, as it requires the
client's TCP buffer to be filled in the first try and then released
immediately after in the second try.
Despite all my efforts to reproduce this scenario, I was unable to do
so.
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
I think it is a good idea to mention this.
The Cluster config file is written relative this directory, if the
'cluster-config-file' configuration directive is a relative path.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Fixes test failure
(https://github.com/valkey-io/valkey/actions/runs/10146979329/job/28056316421?pr=837)
on 32 bit system for slot stats metric underflow on the following
condition:
```
server.cluster->slot_stats[c->slot].network_bytes_out += (len * listLength(server.replicas));
```
* Here listLength accesses `len` which is of `unsigned long` type and
multiplied with `len` (which could be negative). This is a risky
operation and behaves differently based on the architecture.
```
clusterSlotStatsAddNetworkBytesOutForReplication(-sdslen(selectcmd->ptr));
```
* `sdslen` method returns `size_t`. applying `-` operation to decrement
network bytes out is also incorrect.
This change adds assertion on `len` being negative and handles the
wrapping of overall value.
---------
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Update the dual channel-replication tests to wait for the pause to begin
before attempting to unpause.
---------
Signed-off-by: naglera <anagler123@gmail.com>
If we do `config set appendonly yes` and `config set appendonly no`
in a multi, there are some unexpected behavior.
When doing appendonly yes, we will schedule a AOFRW, and when we
are doding appendonly no, we will call stopAppendOnly to stop it.
In stopAppendOnly, the aof_fd is -1 since the aof is not start yet
and the fsync and close will take the -1 and call it, so they will
all fail with EBADF. And stopAppendOnly will emit a server log, the
close(-1) should be no problem but it is still an undefined behavior.
This PR also adds a log `Background append only file rewriting
scheduled.` to bgrewriteaofCommand when it was scheduled.
And adds a log in stopAppendOnly when a scheduled AOF is canceled,
it will print `AOF was disabled but there is a scheduled AOF background, cancel it.`
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Adds two new metrics for per-slot statistics, network-bytes-in and
network-bytes-out. The network bytes are inclusive of replication bytes
but exclude other types of network traffic such as clusterbus traffic.
#### network-bytes-in
The metric tracks network ingress bytes under per-slot context, by
reverse calculation of `c->argv_len_sum` and `c->argc`, stored under a
newly introduced field `c->net_input_bytes_curr_cmd`.
#### network-bytes-out
The metric tracks network egress bytes under per-slot context, by
hooking onto COB buffer mutations.
#### sample response
Both metrics are reported under the `CLUSTER SLOT-STATS` command.
```
127.0.0.1:6379> cluster slot-stats slotsrange 0 0
1) 1) (integer) 0
2) 1) "key-count"
2) (integer) 0
3) "cpu-usec"
4) (integer) 0
5) "network-bytes-in"
6) (integer) 0
7) "network-bytes-out"
8) (integer) 0
```
---------
Signed-off-by: Kyle Kim <kimkyle@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Adds light-weight cluster bus header for pubsub message. Closes#557.
This also supports sending to and receiving non-light messages from
older versions of the engine.
The light-weight cluster bus message supports multiple pubsub messages
(payloads) for one pubsub channel. Receiving messages with multiple
payloads is supported but we're not yet sending such multi-payload
messages to other nodes.
---------
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Introduce several improvements to improve the stability of dual-channel
replication and fix compatibility issues.
1. Make dual-channel-replication tests more reliable: use pause instead
of forced sleep.
2. Fix race conditions when freeing RDB client.
3. Check if sync was stopped during local buffer streaming.
4. Fix $ENDOFFSET reply format to work on 32-bit machines too.
---------
Signed-off-by: naglera <anagler123@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
During the ZADD operation, a conversion from listpack to skiplist might
be necessary for the sorted set. Currently, the function
zsetTypeMaybeConvert only examines the number of elements but does not
check the Max size of the elements. It is advisable to include a check
for value_len_hint for a more robust conversion check mechanism.
---------
Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>