valkey

mirror of http://github.com/valkey-io/valkey synced 2024-11-22 09:17:20 +00:00

Author	SHA1	Message	Date
antirez	8d92885bac	Cluster: add test for the nofailover flag.	2018-03-14 16:31:46 +01:00
antirez	70597a3011	Cluster: ability to prevent slaves from failing over their masters. This commit, in some parts derived from PR #3041 which is no longer possible to merge (because the user deleted the original branch), implements the ability of slaves to have a special configuration preventing that they try to start a failover when the master is failing. There are multiple reasons for wanting this, and the feature was requested in issue #3021 some time ago. The differences between this patch and the original PR are the following: 1. The flag is saved/loaded on the nodes configuration. 2. The 'myself' node is now flag-aware, the flag is updated as needed when the configuration is changed via CONFIG SET. 3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent with existing NOADDR. 4. The redis.conf documentation was rewritten. Thanks to @deep011 for the original patch.	2018-03-14 16:31:46 +01:00
antirez	16cad10a0c	redis-cli: fix missed unit in array. Change define name.	2018-03-02 12:37:22 +01:00
charsyam	640fa434f5	fix-out-of-index-range-for-redis-cli-findbigkey	2018-03-02 12:37:11 +01:00
antirez	83390f55e5	expireIfNeeded() needed a top comment documenting the behavior.	2018-02-28 18:09:43 +01:00
antirez	888039ca82	expireIfNeeded() comment: claim -> pretend.	2018-02-28 18:09:40 +01:00
antirez	e09c8c102a	Actually use ae_flags to add AE_BARRIER if needed. Many thanks to @Plasma that spotted this problem reviewing the code.	2018-02-28 18:05:51 +01:00
charsyam	fb7560bcbb	refactoring-make-condition-clear-for-rdb	2018-02-27 19:17:25 +01:00
antirez	1e2f0d6940	ae.c: instead of not firing, on AE_BARRIER invert the sequence. AE_BARRIER was implemented like: - Fire the readable event. - Do not fire the writable event if the readable fired. However this may lead to the writable event never being called if the readable event is always fired. There is an alternative, we can just invert the sequence of the calls in case AE_BARRIER is set. This commit does that.	2018-02-27 16:19:38 +01:00
antirez	b2e4aad9e2	AOF: fix a bug that may prevent proper fsyncing when fsync=always. In case the write handler is already installed, it could happen that we serve the reply of a query in the same event loop cycle we received it, preventing beforeSleep() from guaranteeing that we do the AOF fsync before sending the reply to the client. The AE_BARRIER mechanism, introduced in a previous commit, prevents this problem. This commit makes actual use of this new feature to fix the bug.	2018-02-27 16:19:33 +01:00
antirez	93bad8ae88	Cluster: improve crash-recovery safety after failover auth vote. Add AE_BARRIER to the writable event loop so that slaves requesting votes can't be served before we re-enter the event loop in the next iteration, so clusterBeforeSleep() will fsync to disk in time. Also add the call to explicitly fsync, given that we modified the last vote epoch variable.	2018-02-27 16:19:26 +01:00
antirez	e32752e8d0	ae.c: introduce the concept of read->write barrier. AOF fsync=always, and certain Redis Cluster bus operations, require to fsync data on disk before replying with an acknowledge. In such case, in order to implement Group Commits, we want to be sure that queries that are read in a given cycle of the event loop, are never served to clients in the same event loop iteration. This way, by using the event loop "before sleep" callback, we can fsync the information just one time before returning into the event loop for the next cycle. This is much more efficient compared to calling fsync() multiple times. Unfortunately because of a bug, this was not always guaranteed: the actual way the events are installed was the sole thing that could control it. Normally this problem is hard to trigger when AOF is enabled with fsync=always, because we try to flush the output buffers to the socket directly in the beforeSleep() function of Redis. However if the output buffers are full, we actually install a write event, and in such a case, this bug could happen. This change to ae.c modifies the event loop implementation to make this concept explicit. Write events that are registered with: AE_WRITABLE|AE_BARRIER Are guaranteed to never fire after the readable event was fired for the same file descriptor. In this way we are sure that data is persisted to disk before the client performing the operation receives an acknowledge. However note that this semantics does not provide all the guarantees that one may believe are automatically provided. Take the example of the blocking list operations in Redis. With AOF and fsync=always we could have: Client A doing: BLPOP myqueue 0 Client B doing: RPUSH myqueue a b c In this scenario, Client A will get the "a" element immediately after the Client B RPUSH will be executed, even before the operation is persisted. However when Client B will get the acknowledge, it can be sure that "b,c" are already safe on disk inside the list. What to note here is that it cannot be assumed that Client A receiving the element is a guarantee that the operation succeeded from the point of view of Client B. This is due to the fact that the barrier exists within the same socket, and not between different sockets. However in the case above, the element "a" was not going to be persisted regardless, so it is a pretty synthetic argument.	2018-02-27 16:19:20 +01:00
antirez	262f403944	Fix ziplist prevlen encoding description. See #4705 .	2018-02-27 16:19:17 +01:00
antirez	83923afa8c	Track number of logically expired keys still in memory. This commit adds two new fields in the INFO output, stats section: expired_stale_perc:0.34 expired_time_cap_reached_count:58 The first field is an estimate of the number of keys that are yet in memory but are already logically expired. The reason why those keys are yet not reclaimed is because the active expire cycle can't spend more time on the process of reclaiming the keys, and at the same time nobody is accessing such keys. However as the active expire cycle runs, while it will eventually have to return to the caller, because of time limit or because there are less than 25% of keys logically expired in each given database, it collects the stats in order to populate this INFO field. Note that expired_stale_perc is a running average, where the current sample accounts for 5% and the history for 95%, so you'll see it changing smoothly over time. The other field, expired_time_cap_reached_count, counts the number of times the expire cycle had to stop, even if still it was finding a sizeable number of keys yet to expire, because of the time limit. This allows people handling operations to understand if the Redis server, during mass-expiration events, is able to collect keys fast enough usually. It is normal for this field to increment during mass expires, but normally it should very rarely increment. When instead it constantly increments, it means that the current workload is using a very important percentage of CPU time to expire keys. This feature was created thanks to the hints of Rashmi Ramesh and Bart Robinson from Twitter. In private email exchanges, they noted how it was important to improve the observability of this parameter in the Redis server. Actually in big deployments, the amount of keys that are yet to expire in each server, even if they are logically expired, may account for a very big amount of wasted memory.	2018-02-19 11:22:34 +01:00
antirez	256ddbf6dc	Remove non semantical spaces from module.c.	2018-02-15 21:47:50 +01:00
antirez	280c3e3987	Fix typo in notifyKeyspaceEvent() comment.	2018-02-15 21:47:42 +01:00
Dvir Volk	7c4623b0d3	Add doc comment about notification flags	2018-02-15 21:47:38 +01:00
Dvir Volk	f4e7502e4f	Fix indentation and comment style in testmodule	2018-02-15 21:46:44 +01:00
Dvir Volk	3c8456c641	Use one static client for all keyspace notification callbacks	2018-02-15 21:46:38 +01:00
Dvir Volk	aaaff8bd1c	Remove the NOTIFY_MODULE flag and simplify the module notification flow if there aren't subscribers	2018-02-15 21:46:31 +01:00
Dvir Volk	0be51b8f54	Document flags for notifications	2018-02-15 21:45:41 +01:00
Dvir Volk	3b95c89cdb	removed some trailing whitespaces	2018-02-15 21:45:37 +01:00
Dvir Volk	84c6f1e3ca	removed hellonotify.c	2018-02-15 21:45:32 +01:00
Dvir Volk	53b85e53e3	fixed test	2018-02-15 21:45:27 +01:00
Dvir Volk	b43f66c9d4	finished implementation of notifications. Tests unfinished	2018-02-15 21:45:22 +01:00
antirez	eddf5deb38	More verbose logging when slave sends errors to master. See #3832.	2018-02-15 21:43:23 +01:00
oranagra	c09cc0a9b7	when a slave experiences an error on commands that come from master, print to the log since slave isn't replying to its master, these errors go unnoticed. since we don't expect the master to send garbage to the slave, this should be safe. (as long as we don't log OOM errors there)	2018-02-15 21:43:17 +01:00
charsyam	5c374f94ef	getting rid of duplicated code	2018-02-13 16:21:01 +01:00
Guy Benoish	a64f36e556	enlarged buffer given to ld2string	2018-02-13 15:51:36 +01:00
antirez	f170580195	Make it explicit with a comment why we kill the old AOF rewrite. See #3858.	2018-02-13 15:46:53 +01:00
Guy Benoish	0c030dea73	rewriteAppendOnlyFileBackground() failure fix It is possible to do BGREWRITEAOF even if appendonly=no. This is by design. stopAppendonly() didn't turn off aof_rewrite_scheduled (it can be turned on again by BGREWRITEAOF even while appendonly is off anyway). After configuring `appendonly yes` it will see that the state is AOF_OFF, there's no RDB fork, so it will do rewriteAppendOnlyFileBackground() which will fail since the aof_child_pid is set (was scheduled and started by cron). Solution: stopAppendonly() will turn off the schedule flag (regardless of who asked for it). startAppendonly() will terminate any existing fork and start a new one (so it is the most recent).	2018-02-13 15:46:50 +01:00
Oran Agra	5807397460	fix to latency monitor reporting wrong max latency in some cases LATENCY HISTORY reported latency that was higher than the max latency reported by LATENCY LATEST / DOCTOR	2018-02-13 15:31:43 +01:00
antirez	f17d82961d	Redis 4.0.8.	2018-02-02 17:39:14 +01:00
antirez	f603940f7c	Rax updated to latest antirez/rax commit.	2018-02-02 11:10:30 +01:00
antirez	2c1fc582c7	Redis 4.0.7.	2018-01-24 11:16:18 +01:00
jianqingdu	2b99d77a57	fix not call va_end when syncWrite() failed fix not call va_end when syncWrite() failed in sendSynchronousCommand()	2018-01-24 10:58:57 +01:00
Yusaku Kaneta	5f9b9e1194	Fix the firstkey, lastkey, and keystep of moduleCommand	2018-01-24 10:58:39 +01:00
Mark Nunberg	ba2d3e8e6e	redismodule.h: Check ModuleNameBusy before calling it Older versions might not have this function.	2018-01-24 10:48:42 +01:00
antirez	05c1f18d6a	Fix integration test NOREPLICAS error time dependent false positive.	2018-01-24 10:24:22 +01:00
antirez	4acd6973bf	Fix migrateCommand() access of not initialized byte.	2018-01-18 12:41:23 +01:00
Guy Benoish	548e4fe088	Replication buffer fills up on high rate traffic. When feeding the master with a high rate traffic the slave's feed is much slower. This causes the replication buffer to grow (indefinitely) which leads to slave disconnection. The problem is that writeToClient() decides to stop writing after NET_MAX_WRITES_PER_EVENT writes (In order to be fair to clients). We should ignore this when the client is a slave. It's better if clients wait longer, the alternative is that the slave has no chance to stay in sync in this situation.	2018-01-18 12:16:50 +01:00
antirez	efa7063c52	Cluster: improve anti-affinity algo in redis-trib.rb. See #3462 and related PRs. We use a simple algorithm to calculate the level of affinity violation, and then an optimizer that performs random swaps until things improve.	2018-01-18 12:16:46 +01:00
antirez	48568ab6d7	Remove useless comment from serverCron(). The behavior is well specified by the code itself.	2018-01-18 12:16:42 +01:00
heqin	0201dea577	fixbug for #4545 dead loop aof rewrite	2018-01-18 12:16:37 +01:00
antirez	926beaa3c4	Hopefully more clear comment to explain the change in #4607 .	2018-01-18 12:16:31 +01:00
qinchao	019ad3e2e3	fix assert problem in ZIP_DECODE_PREVLENSIZE , see issue: https://github.com/antirez/redis/issues/4587	2018-01-18 12:16:23 +01:00
Oran Agra	8d9dff84ce	PSYNC2 fix - promoted slave should hold on to its backlog after a slave is promoted (assuming it has no slaves and it booted over an hour ago), it will lose its replication backlog at the next replication cron, rather than waiting for slaves to connect to it. so on a simple master/slave failover, if the new slave doesn't connect immediately, it may be too late and PSYNC2 will fail.	2018-01-18 12:16:05 +01:00
zhaozhao.zz	fba2e169f9	aof: format code and comment	2018-01-18 12:15:57 +01:00
antirez	7777be7b0f	Put more details in the comment introduced by #4601 .	2018-01-18 12:15:53 +01:00
zhaozhao.zz	91c1568b1a	lazyfree: fix memory leak for lazyfree-lazy-server-del	2018-01-18 12:15:47 +01:00
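
The ordering rule described in commits e32752e8d0 and 1e2f0d6940 above can be sketched roughly as follows. This is a simplified Python model and not the ae.c code: the function name `process_fd_events`, the callback parameters, and the numeric flag values are assumptions made for illustration. It only shows the idea that, when a write event is registered with AE_BARRIER, the writable handler runs before the readable handler for the same file descriptor in a given iteration.

```python
# Flag values are illustrative assumptions, not taken from the listing above.
AE_READABLE = 1
AE_WRITABLE = 2
AE_BARRIER = 4


def process_fd_events(registered_mask, fired_mask, on_readable, on_writable):
    """Dispatch the handlers for one file descriptor in one loop iteration.

    Returns the handler names in the order they were called, so the
    effect of AE_BARRIER on ordering is easy to inspect.
    """
    order = []
    ready = registered_mask & fired_mask
    invert = bool(registered_mask & AE_BARRIER)

    # Normal case: readable handler first.
    if not invert and ready & AE_READABLE:
        on_readable()
        order.append("read")

    # Writable handler runs in the middle in both cases.
    if ready & AE_WRITABLE:
        on_writable()
        order.append("write")

    # Barrier case: readable handler is deferred until after the write,
    # so a reply is never written after a read in the same iteration.
    if invert and ready & AE_READABLE:
        on_readable()
        order.append("read")

    return order


if __name__ == "__main__":
    both = AE_READABLE | AE_WRITABLE
    noop = lambda: None
    print(process_fd_events(both, both, noop, noop))
    print(process_fd_events(both | AE_BARRIER, both, noop, noop))
```

In this model the writable handler is never skipped, which matches the motivation given in 1e2f0d6940: inverting the sequence avoids starving the write event when the readable event fires on every iteration.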

6389 Commits
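
Commit 83923afa8c above describes expired_stale_perc as a running average in which the current sample accounts for 5% and the history for 95%. A minimal sketch of that arithmetic, assuming a hypothetical helper name `update_stale_perc` (the actual server code is not shown in this listing):

```python
def update_stale_perc(previous, sample):
    """Blend a new sample into the running average.

    The new sample contributes 5% and the accumulated history 95%,
    as stated in the commit message.
    """
    return sample * 0.05 + previous * 0.95


if __name__ == "__main__":
    # Hypothetical samples: the stale fraction jumps to 0.5 and stays there.
    avg = 0.0
    for _ in range(10):
        avg = update_stale_perc(avg, 0.5)
        print(round(avg, 4))
```

With these weights a sudden change in the sampled value moves the reported figure only gradually, which is why the commit message says the field changes smoothly over time.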