valkey

mirror of http://github.com/valkey-io/valkey synced 2024-11-22 00:52:38 +00:00

Author	SHA1	Message	Date
zhaozhao.zz	f504cf233b	add assertion for kvstore's dictType (#1004 ) Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>	2024-09-09 12:13:18 -07:00
xu0o0	20d583f774	Migrate dict.c unit tests to new framework (#946 ) This PR migrates the tests related to dict into new test framework as part of #428. Signed-off-by: haoqixu <hq.xu0o0@gmail.com> Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2024-09-09 13:03:15 +08:00
xu0o0	14016d2df7	Migrate listpack.c unit tests to new framework (#949 ) This PR migrates the tests related to listpack into new test framework as part of #428. Signed-off-by: haoqixu <hq.xu0o0@gmail.com> Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2024-09-09 13:01:25 +08:00
Binbin	c642cf0134	Add client info to SHUTDOWN / CLUSTER FAILOVER logs (#875 ) Print the full client info by using catClientInfoString, the info is useful when we want to identify the source of request. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-08 16:26:56 +08:00
Binbin	6478526597	Fix aof base suffix when modifying aof-use-rdb-preamble during rewrite (#886 ) If we modify aof-use-rdb-preamble in the middle of rewrite, we may get a wrong aof base suffix. This is because the suffix is concatenated by the main process afterwards, and it may be different from the beginning. We cache this value when we start the rewrite. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-07 23:27:59 +08:00
Binbin	9b51949abe	Fix missing replication link re-connection when primary's IP/port is updated in `clusterProcessGossipSection` (#965 ) `clusterProcessGossipSection` currently doesn't trigger a check and call `replicationSetPrimary` when `myself`'s primary node’s IP/port is updated. This fix ensures that after every node address update, `replicationSetPrimary` is called if the updated node is `myself`'s primary. This prevents missed updates and ensures that replicas reconnect properly to maintain their replication link with the primary.	2024-09-05 22:19:50 -07:00
Binbin	9033734b6b	Add newline to argv in crash report when doing redact (#993 ) Minor cleanup, introduced in #877. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-05 11:13:29 +08:00
Kyle Kim (kimkyle@)	2d1eca577e	Add SLOT-STATS under CLUSTER HELP string. (#988 ) Add help wording for cluster SLOT-STATS. Signed-off-by: Kyle Kim <kimkyle@amazon.com>	2024-09-03 12:59:06 -07:00
Viktor Söderqvist	ea58fbf40d	Rewrite lazyfree docs in valkey.conf to reflect that lazy is now default (#983 ) Before this doc update, the comments in valkey.conf said that DEL is a blocking command, and even referred to other synchronous freeing as "in a blocking way, like if DEL was called". This has now become confusing and incorrect, since DEL is now non-blocking by default. The comments also mentioned too much about the "old default" and only later explained that the "new default" is non-blocking. This doc update focuses on the current default and expresses it like "Starting from Valkey 8.0, lazy freeing is enabled by default", rather than using words like old and new. This is a follow-up to #913. --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-09-03 10:47:23 +02:00
NAM UK KIM	f143ffd2a5	Fix typo in valkey-cli.c (#979 ) Change from replicsa to replicas in valkey-cli.c Signed-off-by: NAM UK KIM <namuk2004@naver.com>	2024-09-03 14:58:09 +08:00
Ping Xie	981f977abf	Improve type safety and refactor dict entry handling (#749 ) This pull request introduces several changes to improve the type safety of Valkey's dictionary implementation: - Getter/Setter Macros: Implemented macros `DICT_SET_VALUE` and `DICT_GET_VALUE` to centralize type casting within these macros. This change emulates the behavior of C++ templates in C, limiting type casting to specific low-level operations and preventing it from being spread across the codebase. - Reduced Assert Overhead: Removed unnecessary asserts from critical hot paths in the dictionary implementation. - Consistent Naming: Standardized the naming of dictionary entry types. For example, all dictionary entry types start their names with `dictEntry`. Fix #737 --------- Signed-off-by: Ping Xie <pingxie@google.com> Signed-off-by: Ping Xie <pingxie@outlook.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-09-02 18:28:15 -07:00
Madelyn Olson	3e14516d86	Initialize all the fields for the test kvstore (#982 ) Follow up to https://github.com/valkey-io/valkey/pull/966, which didn't update the kvstore tests. I'm not actually entirely clear why it fixes it, but the consistency prevents the crash very reliably so will merge it now and maybe see if Zhao has a better explanation. --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-09-02 11:01:59 -07:00
Amit Nagler	5fdb47c2e2	Add configuration hide-user-data-from-log to hide user data from server logs (#877 ) Implement data masking for user data in server logs and diagnostic output. This change prevents potential exposure of confidential information, such as PII, and enhances privacy protection. It masks all command arguments, client names, and client usernames. Added a new hide-user-data-from-log configuration item, default yes. --------- Signed-off-by: Amit Nagler <anagler123@gmail.com>	2024-09-02 09:50:36 -07:00
Binbin	5693fe4664	Fix set expire test due to the new lazyfree configs changes (#980 ) Test failed because these two PRs #865 and #913. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-02 22:43:09 +08:00
zhaozhao.zz	32116d09bb	Use metadata to handle the reference relationship between kvstore and dict (#966 ) Feature `one-dict-per-slot` refactors the database, and part of it involved splitting the rehashing list from the global level back to the database level, or more specifically, the kvstore level. This change is fine, and it also simplifies the process of swapping databases, which is good. And it should not have a major impact on the efficiency of incremental rehashing. To implement the kvstore-level rehashing list, each `dict` under the `kvstore` needs to know which `kvstore` it belongs to. However, kvstore did not insert the reference relationship into the `dict` itself; instead, it placed it in the `dictType`. In my view, this is a somewhat odd approach. Theoretically, `dictType` is just a collection of function handles, a kind of virtual type that can be referenced globally, not an entity. But now the `dictType` is instantiated, with each `kvstore` owning an actual `dictType`, which in turn holds a reverse reference to the `kvstore`'s resource pointer. This design is somewhat uncomfortable for me. I think the `dictType` should not be instantiated. The references between actual resources (`kvstore` and `dict`) should occur between specific objects, rather than forcibly materializing the `dictType`, which is supposed to be virtual. --------- Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>	2024-09-02 22:35:24 +08:00
Binbin	70624ea63d	Change all the lazyfree configurations to yes by default (#913 ) ## Set replica-lazy-flush and lazyfree-lazy-user-flush to yes by default. There are many problems with running flush synchronously. Even in single CPU environments, the thread managers should balance between the freeing and serving incoming requests. ## Set lazy eviction, expire, server-del, user-del to yes by default We now have a del and a lazyfree del, and we also have these configuration items to control them: lazyfree-lazy-eviction, lazyfree-lazy-expire, lazyfree-lazy-server-del, lazyfree-lazy-user-del. In most cases lazyfree is better since it reduces the risk of blocking the main thread, and because we have lazyfreeGetFreeEffort, only those with high effort (currently 64) will use lazyfree. Part of #653. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-02 07:07:17 -07:00
Madelyn Olson	089048d364	Fix zipmap test null pointer (#975 ) The previous test does a strncmp on a NULL, which is not valid. It should be using an empty length string instead. Addresses https://github.com/valkey-io/valkey/actions/runs/10649272046/job/29519233939. Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-09-01 12:05:37 +02:00
Binbin	e3af1a30e4	Fast path in SET if the expiration time is expired (#865 ) If the expiration time passed in SET is already expired, for example because of the machine time (DTS) or a wrong expiration argument, we don't need to set the key and wait for the active expire scan before deleting the key. Compared with the previous behavior: 1. If the key does not exist, previously we would set the key and wait for the active expire to delete it, so it is a set + del from the perspective of propagation. Now we will not set the key and return, so it is a NOP. 2. If the key exists, previously we would set the key and wait for the active expire to delete it, so it is a set + del from the perspective of propagation. Now we will delete it and return, so it is a del. Adding a new deleteExpiredKeyFromOverwriteAndPropagate function to reduce the duplicate code. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-31 22:39:07 +08:00
Viktor Söderqvist	5d458c6292	Delete unused parts of zipmap (#973 ) Deletes zipmapSet, zipmapGet, etc. Only keep iterator and validate integrity, what we use when loading an old RDB file. Adjust unit tests to not use zipmapSet, etc. Solves a build failure when compiling with fortify source. --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-08-31 15:42:44 +02:00
Binbin	fea49bce2c	Fix timing issue in replica migration test (#968 ) The reason is that server 3 still has server 7 as its replica because the wait is too short; we should wait for the server to lose its replica. ``` *** [err]: valkey-cli make source node ignores NOREPLICAS error when doing the last CLUSTER SETSLOT Expected '{127.0.0.1 21497 267}' to be equal to '' (context: type eval line 34 cmd {assert_equal [lindex [R 3 role] 2] {}} proc ::test) ``` Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-30 19:58:46 +08:00
zhaozhao.zz	743f5ac2ae	standalone -REDIRECT handles special case of MULTI context (#895 ) In standalone mode, when a `-REDIRECT` error occurs, special handling is required if the client is in the `MULTI` context. We have adopted the same handling method as the cluster mode: 1. If a command in the transaction encounters a `REDIRECT` at the time of queuing, the execution of `EXEC` will return an `EXECABORT` error (we expect the client to redirect and discard the transaction upon receiving a `REDIRECT`). That is: ``` MULTI ==> +OK SET x y ==> -REDIRECT EXEC ==> -EXECABORT ``` 2. If all commands are successfully queued (i.e., `QUEUED` results are received) but a redirect is detected during `EXEC` execution (such as a primary-replica switch), a `REDIRECT` is returned to instruct the client to perform a redirect. That is: ``` MULTI ==> +OK SET x y ==> +QUEUED failover EXEC ==> -REDIRECT ``` --------- Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>	2024-08-30 10:17:53 +08:00
Shivshankar	2b76c8fbe2	Migrate zipmap unit test to new framework (#474 ) Migrate zipmap unit test to new unit test framework, parent ticket #428 . --------- Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com> Signed-off-by: hwware <wen.hui.ware@gmail.com> Co-authored-by: hwware <wen.hui.ware@gmail.com>	2024-08-29 11:17:53 -04:00
Binbin	ecbfb6a7ec	Fix reconfiguring sub-replica causing data loss when myself change shard_id (#944 ) When reconfiguring sub-replica, there may be a case where the sub-replica uses the old offset, wins the election, and causes data loss if the old primary went down. In this case, sender is myself's primary; when executing updateShardId, not only the sender's shard_id is updated, but also the shard_id of myself is updated, causing the subsequent areInSameShard check, that is, the full_sync_required check, to fail. As part of the recent fix of #885, the sub-replica needs to decide whether a full sync is required or not when switching shards. This shard membership check is supposed to be done against sub-replica's current shard_id, which however was lost in this code path. This then leads to sub-replica joining the other shard with a completely different and incorrect replication history. This is the only place where replicaof state can be updated on this path so the most natural fix would be to pull the chain replication reduction logic into this code block and before the updateShardId call. This one follows #885 and closes #942. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ping Xie <pingxie@outlook.com>	2024-08-29 22:39:53 +08:00
zhaozhao.zz	4a9b4f667c	free client's multi state when it becomes dirty (#961 ) Release the client's MULTI state when the transaction becomes dirty to save memory. --------- Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>	2024-08-29 19:20:53 +08:00
Ping Xie	ad0ede302c	Exclude '.' and ':' from `isValidAuxChar`'s banned charset (#963 ) Fix a bug in isValidAuxChar where valid characters '.' and ':' were incorrectly included in the banned charset. This issue affected the validation of auxiliary fields in the nodes.conf file used by Valkey in cluster mode, particularly when handling IPv4 and IPv6 addresses. The code now correctly allows '.' and ':' as valid characters, ensuring proper handling of these fields. Comments were added to clarify the use of the banned charset. Related to #736 --------- Signed-off-by: Ping Xie <pingxie@google.com>	2024-08-28 23:35:31 -07:00
Binbin	75b824052d	Revert make KEYS to be an exact match if there is no pattern (#964 ) In #792, the time complexity became ambiguous, fluctuating between O(1) and O(n), which is a significant difference. And we agree uncertainty can potentially bring disaster to the business, the right thing to do is to persuade users to use EXISTS instead of KEYS in this case, to do the right thing the right way, rather than accommodating this incorrect usage. This reverts commit `d66a06e818`. This reverts #792. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-29 10:58:19 +08:00
Viktor Söderqvist	25dd943087	Delete TLS.md and update README.md about tests (#960 ) Most of the content of TLS.md has already been copied to README.md in #927. The description of how to run tests with TLS is moved to tests/README.md. Descriptions of the additional scripts runtest-cluster, runtest-sentinel and runtest-module are added in tests/README.md. Links to tests/README.md and src/unit/README.md are added in the top-level README.md along with a brief overview of the `make test-*` commands. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-08-28 21:17:04 +02:00
Viktor Söderqvist	927c2a8cd1	Delete files MANIFESTO, BUGS and INSTALL (#958 ) The MANIFESTO is not Valkey's manifesto and it doesn't even mention open source. Let's write another one later, or some other document about our project principles. The other two files are one-line files with no relevant info. They're polluting the file listing at root level. It's the first thing you see when you start exploring the repo for the first time. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-08-28 20:04:23 +02:00
I-Hsin Cheng	6172907094	Migrate the contents of TLS.md into README.md (#927 ) Migrate the contents in TLS.md into TLS sections including building, running and detail supports. TODO list in the TLS.md is almost done except the implementation of benchmark support is still not the best approach which should migrate to hiredis async mode. Closes #888 --------- Signed-off-by: I Hsin Cheng <richard120310@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-08-28 12:43:29 +02:00
Ping Xie	2b71a78241	Add comment explaining log file reopening for rotation support (#956 )	2024-08-27 21:00:17 -07:00
mwish	744b13e302	Using intrinsics to optimize counting HyperLogLog trailing bits (#846 ) Godbolt link: https://godbolt.org/z/3YPvxsr5s __builtin_ctz would generate shorter code than hand-written loop. --------- Signed-off-by: mwish <maplewish117@gmail.com> Signed-off-by: Binbin <binloveplay1314@qq.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-27 20:44:32 -07:00
Binbin	4fe8320711	Add pause path coverage to replica migration tests (#937 ) In #885, we only added a shutdown path; another path is that the server might hang due to slowlog. This PR adds pause path coverage for it. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-28 11:08:27 +08:00
Lipeng Zhu	076bf6605f	Move prepareClientToWrite out of loop for lrange command to reduce the redundant call. (#860 ) ## Description When exploring the cycle distribution for the `lrange` test ( `valkey-benchmark -p 9001 -t lrange -d 100 -r 1000000 -n 1000000 -c 50 --threads 4`), I found that `prepareClientToWrite` and `clientHasPendingReplies` could be reduced to a single call outside the loop instead of being called in the loop; ideally we can gain 3% performance. The corresponding `LRANGE_100`, `LRANGE_300`, `LRANGE_500`, `LRANGE_600` have a ~2% - 3% performance boost, and the benchmark test proves it helps. This patch moves `prepareClientToWrite` and its child `clientHasPendingReplies` out of the loop to reduce the function overhead. --------- Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>	2024-08-27 19:11:09 -07:00
Binbin	6a84e06b05	Wait for the role change and fix the timing issue in the new test (#947 ) The test might be fast enough and then there is no change in the role causing the test to fail. Adding a wait to avoid the timing issue: ``` *** [err]: valkey-cli make source node ignores NOREPLICAS error when doing the last CLUSTER SETSLOT Expected '{127.0.0.1 23154 267}' to be equal to '' (context: type eval line 24 cmd {assert_equal [lindex [R 3 role] 2] {}} proc ::test) ``` Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-28 09:51:10 +08:00
Vadym Khoptynets	4f29ad4583	Use sdsAllocSize instead of sdsZmallocSize (#923 ) sdsAllocSize returns the correct size without consulting the allocator, which is much faster than consulting the allocator. The only exception is SDS_TYPE_5, for which it has to consult the allocator. This PR also sets the alloc field correctly for embedded string objects. It assumes that no allocator would allocate a buffer larger than `259 + sizeof(robj)` for an embedded string. We use embedded strings for strings up to 44 bytes. If this assumption is wrong, the whole function would require a rewrite. In the general case, sds type adjustment might be needed. Such logic should go to sds.c. --------- Signed-off-by: Vadym Khoptynets <vadymkh@amazon.com>	2024-08-27 14:43:01 -07:00
Amit Nagler	1ff2a3b6ae	Remove `dual-channel-replication` Feature Flag's Protection (#908 ) Currently, the `dual-channel-replication` feature flag is immutable if `enable-protected-configs` is enabled, which is the default behavior. This PR proposes to make the `dual-channel-replication` flag mutable, allowing it to be changed dynamically without restarting the cluster. Motivation: The ability to change the `dual-channel-replication` flag dynamically is essential for testing and validating the feature on real clusters running in production environments. By making the flag mutable, we can enable or disable the feature without disrupting the cluster's operations, facilitating easier testing and experimentation. Additionally, this change would provide more flexibility for users to enable or disable the feature based on their specific requirements or operational needs without requiring a cluster restart. --------- Signed-off-by: naglera <anagler123@gmail.com>	2024-08-27 10:18:48 -07:00
Viktor Söderqvist	54c0f743dd	Connection minor fixes (#953 ) 1. Remove redundant connIncrRefs/connDecrRefs In socket.c, the reference counter is incremented before calling callHandler, but the same reference counter is also incremented inside callHandler before calling the actual callback. static inline int callHandler(connection *conn, ConnectionCallbackFunc handler) { connIncrRefs(conn); if (handler) handler(conn); connDecrRefs(conn); ... } This commit removes the redundant incr/decr calls in socket.c 2. Correct return value of connRead for TLS when peer closed According to comments in connection.h, connRead returns 0 when the peer has closed the connection. This patch corrects the return value for TLS connections. (Without this patch, it returns -1 which means error.) There is an observable difference in what is logged in the verbose level: "Client closed connection" vs "Reading from client: (null)". --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-08-27 16:11:33 +02:00
uriyage	04d76d8b02	Improve multithreaded performance with memory prefetching (#861 ) This PR utilizes the IO threads to execute commands in batches, allowing us to prefetch the dictionary data in advance. After making the IO threads asynchronous and offloading more work to them in the first 2 PRs, the `lookupKey` function becomes a main bottleneck and it takes about 50% of the main-thread time (Tested with SET command). This is because the Valkey dictionary is a straightforward but inefficient chained hash implementation. While traversing the hash linked lists, every access to either a dictEntry structure, pointer to key, or a value object requires, with high probability, an expensive external memory access. ### Memory Access Amortization Memory Access Amortization (MAA) is a technique designed to optimize the performance of dynamic data structures by reducing the impact of memory access latency. It is applicable when multiple operations need to be executed concurrently. The principle behind it is that for certain dynamic data structures, executing operations in a batch is more efficient than executing each one separately. Rather than executing operations sequentially, this approach interleaves the execution of all operations. This is done in such a way that whenever a memory access is required during an operation, the program prefetches the necessary memory and transitions to another operation. This ensures that when one operation is blocked awaiting memory access, other memory accesses are executed in parallel, thereby reducing the average access latency. We applied this method in the development of `dictPrefetch`, which takes as parameters a vector of keys and dictionaries. It ensures that all memory addresses required to execute dictionary operations for these keys are loaded into the L1-L3 caches when executing commands. Essentially, `dictPrefetch` is an interleaved execution of dictFind for all the keys. Implementation details When the main thread iterates over the `clients-pending-io-read`, for clients with ready-to-execute commands (i.e., clients for which the IO thread has parsed the commands), a batch of up to 16 commands is created. Initially, the commands' argv, which were allocated by the IO thread, are prefetched to the main thread's L1 cache. Subsequently, all the dict entries and values required for the commands are prefetched from the dictionary before the command execution. Only then will the commands be executed. --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com>	2024-08-26 21:10:44 -07:00
Binbin	694246cfab	Drop the outdated script replication example comments (#951 ) This example was for script replication which we have completely removed in 7.0, so this example is outdated now. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-27 12:04:47 +08:00
Binbin	d66a06e818	Make KEYS to be an exact match if there is no pattern (#792 ) Although KEYS is a dangerous command and we recommend people to avoid using it, some people who are not familiar with it still use it, and even use KEYS with no pattern at all. Once KEYS is used with no pattern, we can convert it to an exact match to avoid iterating over all data. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-27 12:04:27 +08:00
xu0o0	73698fa028	Fix invalid escape sequence in utils, minor cleanup in python script (#948 ) According to the Python document[1], any invalid escape sequences in string literals now generate a DeprecationWarning (SyntaxWarning as of 3.12) and in the future this will become a SyntaxError. This change uses Python’s raw string notation for regular expression patterns to avoid it. [1]: https://docs.python.org/3.10/library/re.html Signed-off-by: haoqixu <hq.xu0o0@gmail.com>	2024-08-26 22:53:35 +08:00
Binbin	9f4b1adbea	Add explicit assert to ensure thread_shared_qb won't expand (#938 ) Although this won't happen now, adding this statement explicitly. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-25 12:03:34 +08:00
Binbin	c7d1daea05	Add epoch information to failover auth denied logs (#816 ) When a node refuses to vote in a failover, the arrival time of the FAILOVER_AUTH_REQUEST packet can be very uncertain due to the network or some blocking operations. Since there is no epoch information in these logs, it is hard to associate the log with other logs. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-24 18:03:24 +08:00
NAM UK KIM	0053429a02	Update "Total" message and used_memory_human log information in serverCron() function (#594 ) At the VERBOSE/DEBUG log level, which is output once every 5 seconds, added to show the "Total" message of all clients and to show memory usage (used_memory) with used_memory_human. Also, it seems clearer to show "total" number of keys and the number of volatile in entire keys. --------- Signed-off-by: NAM UK KIM <namuk2004@naver.com>	2024-08-23 18:02:18 -07:00
Ayush Sharma	b48596a914	Add support for setting the group on a unix domain socket (#901 ) Add new optional, immutable string config called `unixsocketgroup`. Change the group of the unix socket to `unixsocketgroup` after `bind()` if specified. Adds tests to validate the behavior. Fixes #873. Signed-off-by: Ayush Sharma <mrayushs933@gmail.com>	2024-08-23 11:52:08 -07:00
Madelyn Olson	829aa7fe3c	Remove accurate from extra test tag (#935 ) Today if we attached the "run-extra-tests" tag it adds at least 20 minutes because the dump-fuzzer test runs with full accuracy. This fuzzer is useful, but probably only really needed for the daily, so removing it from the PRs. We still run the fuzzers, just not for as long. Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-23 11:05:41 -07:00
Binbin	8045994972	valkey-cli make source node ignores NOREPLICAS when doing the last CLUSTER SETSLOT (#928 ) This fixes #899. In that issue, the primary is cluster-allow-replica-migration no and its replica is cluster-allow-replica-migration yes. And during the slot migration: 1. Primary calling blockClientForReplicaAck, waiting for its replica. 2. Its replica reconfiguring itself as a replica of other shards due to replica migration and disconnecting from the old primary. 3. The old primary never got the chance to receive the ack, so it got a timeout and got a NOREPLICAS error. In this case, the replicas might automatically migrate to another primary, resulting in the client being unblocked with the NOREPLICAS error. In this case, since the configuration will eventually propagate itself, we can safely ignore this error on the source node. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-23 16:22:30 +08:00
Binbin	5d97f5133c	Fix CLUSTER SETSLOT block and unblock error when all replicas are down (#879 ) In CLUSTER SETSLOT propagation logic, if the replicas are down, the client will get blocked during command processing and then unblocked with `NOREPLICAS Not enough good replicas to write`. The reason is that all replicas are down (or some are down), but myself->num_replicas includes all replicas, so the client will get blocked and always time out. We should only wait for those online replicas, otherwise the waiting propagation will always time out since there are not enough replicas. The admin can easily check if there are replicas that are down for an extended period of time. If they decide to move forward anyway, we should not block it. If a replica failed right before the replication and was not included in the replication, it would also be unlikely to win the election. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ping Xie <pingxie@google.com>	2024-08-23 16:21:53 +08:00
Yunxiao Du	0a11c4a140	Delete redundant declaration clusterNodeCoversSlot and countKeysInSlot (#930 ) Delete redundant declarations; clusterNodeCoversSlot and countKeysInSlot are already declared in cluster.h Signed-off-by: Yunxiao Du <me@jackdu.cn>	2024-08-23 12:17:27 +08:00
Madelyn Olson	b12668af7a	Revert repl backlog size back to 1mb for dual channel tests (#934 ) There is a test that assumes that the backlog will get overrun, but because of the recent changes to the default it no longer fails. It seems like it is a bit flaky now though, so resetting the value in the test back to 1mb. (This relates to the CoB of 1100k. So it should consistently work with a 1mb limit). Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-22 15:35:28 -07:00
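
The two `-REDIRECT` cases listed in #895 (743f5ac2ae) above can be modeled with a small state machine. This is only a sketch in Python, not the server's C code: the `StandaloneNode` and `Transaction` classes, the `is_primary` flag and the reply strings returned by `exec_` on success are hypothetical names chosen for illustration. Only the `+OK` / `+QUEUED` / `-REDIRECT` / `-EXECABORT` reply sequences come from the commit message.

```python
class StandaloneNode:
    """Toy stand-in for a standalone server (hypothetical, for illustration)."""

    def __init__(self, is_primary=True):
        # Assumption: a node that is not the primary redirects write commands.
        self.is_primary = is_primary


class Transaction:
    """Models one client's MULTI context against a StandaloneNode."""

    def __init__(self, node):
        self.node = node
        self.queued = []
        self.dirty = False  # set when a command failed at queuing time

    def multi(self):
        return "+OK"

    def queue(self, command):
        if not self.node.is_primary:
            # Case 1: the redirect is detected while queuing, so the
            # transaction is flagged and a later EXEC will abort.
            self.dirty = True
            return "-REDIRECT"
        self.queued.append(command)
        return "+QUEUED"

    def exec_(self):
        if self.dirty:
            return "-EXECABORT"
        if not self.node.is_primary:
            # Case 2: everything was queued, but the role changed before EXEC.
            return "-REDIRECT"
        return ["+OK" for _ in self.queued]


# Case 1: the node already redirects when the command is queued.
replica = StandaloneNode(is_primary=False)
tx1 = Transaction(replica)
case1 = [tx1.multi(), tx1.queue("SET x y"), tx1.exec_()]

# Case 2: a failover happens between queuing and EXEC.
node = StandaloneNode(is_primary=True)
tx2 = Transaction(node)
case2 = [tx2.multi(), tx2.queue("SET x y")]
node.is_primary = False  # simulated primary-replica switch
case2.append(tx2.exec_())
```

In both cases the client is expected to follow the redirect and retry the whole transaction on the new primary.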

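The propagation change described in #865 (e3af1a30e4) above can be summarized as a small decision function. The sketch below is a Python model, not the actual implementation: `set_with_expire`, the dict-based `db` and the returned list of propagated operations are hypothetical, and treating "expired" as `expire_at_ms <= now_ms` is an assumption.

```python
def set_with_expire(db, key, value, expire_at_ms, now_ms):
    """Apply SET with an absolute expiry and return the propagated operations.

    `db` is a plain dict standing in for the keyspace (hypothetical model).
    """
    if expire_at_ms <= now_ms:  # assumption: "expired" means not in the future
        if key in db:
            # Existing key: delete it right away, propagated as a single DEL.
            del db[key]
            return ["DEL"]
        # Missing key: nothing to store and nothing to propagate (NOP).
        return []
    # Normal path: store the value together with its expiry.
    db[key] = (value, expire_at_ms)
    return ["SET"]


db = {"old": ("v", 10_000)}
nop = set_with_expire(db, "new", "v", expire_at_ms=500, now_ms=1_000)
deleted = set_with_expire(db, "old", "v2", expire_at_ms=500, now_ms=1_000)
stored = set_with_expire(db, "fresh", "v", expire_at_ms=2_000, now_ms=1_000)
```

Before the change, the first two calls would each have corresponded to a set followed by a later del from the active expire scan.
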
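
The idea behind #846 (744b13e302) above, replacing a hand-written bit loop with a count-trailing-zeros primitive, can be illustrated outside C as well. Python has no `__builtin_ctz`, so the sketch below uses the `x & -x` lowest-set-bit trick as a stand-in; both function names are hypothetical and this is not the HyperLogLog code itself.

```python
def trailing_zeros_loop(x):
    """Count trailing zero bits one bit at a time (the hand-written style)."""
    if x == 0:
        raise ValueError("undefined for 0, as with __builtin_ctz")
    count = 0
    while x & 1 == 0:
        x >>= 1
        count += 1
    return count


def trailing_zeros_fast(x):
    """Count trailing zero bits without a loop.

    x & -x isolates the lowest set bit; its bit_length minus one is its index.
    """
    if x == 0:
        raise ValueError("undefined for 0, as with __builtin_ctz")
    return (x & -x).bit_length() - 1


# Both variants agree on every non-zero input checked here.
agree = all(trailing_zeros_loop(v) == trailing_zeros_fast(v) for v in range(1, 4096))
```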

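The raw-string fix described in #948 (73698fa028) above follows a general Python pattern. The sketch below is illustrative only: `VERSION_RE` and `parse_version` are hypothetical and are not taken from the scripts in `utils`.

```python
import re

# A raw string keeps the backslashes as-is, so \d and \. reach the regex
# engine unchanged and Python never treats them as string escape sequences.
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Return the first x.y.z version found in text as a tuple of ints."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# The raw literal is the same string as the fully escaped regular literal.
same_pattern = r"\d+\.\d+" == "\\d+\\.\\d+"
version = parse_version("valkey 8.0.1")
```

Writing the same pattern in a regular string literal with single backslashes is what triggers the warning mentioned in the commit message.
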
12724 Commits