valkey/tests/integration/aof-race.tcl
chenyang8094 87789fae0b
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788)
Implement Multi-Part AOF mechanism to avoid overheads during AOFRW.
Introducing a folder with multiple AOF files tracked by a manifest file.

The main issues with the the original AOFRW mechanism are:
* buffering of commands that are processed during rewrite (consuming a lot of RAM)
* freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it.
* double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files)

The main modifications of this PR:
1. Remove the AOF rewrite buffer and related code.
2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type,
  it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only
  one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the
  incremental commands since the last AOFRW.
3. Use a AOF manifest file to record and manage these AOF files mentioned above.
4. The original configuration of `appendfilename` will be the base part of the new file name, for example:
  `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof`
5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename`
6. Remove the `aof_rewrite_buffer_length` field in info.
7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs.
  It also gives users the opportunity to preserve the history AOFs. just for testing use now.
8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now),
  we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be
  delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit
  period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately.
9. Support upgrade (load) data from old version redis.
10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and
  manifest file will be placed in this directory.
11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if
  `aof-load-truncated` is enabled.

Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-03 19:14:13 +02:00

36 lines
1.3 KiB
Tcl

set defaults { appendonly {yes} appendfilename {appendonly.aof} appenddirname {appendonlydir} aof-use-rdb-preamble {no} }
set server_path [tmpdir server.aof]
proc start_server_aof {overrides code} {
upvar defaults defaults srv srv server_path server_path
set config [concat $defaults $overrides]
start_server [list overrides $config] $code
}
tags {"aof"} {
# Specific test for a regression where internal buffers were not properly
# cleaned after a child responsible for an AOF rewrite exited. This buffer
# was subsequently appended to the new AOF, resulting in duplicate commands.
start_server_aof [list dir $server_path] {
set client [redis [srv host] [srv port] 0 $::tls]
set bench [open "|src/redis-benchmark -q -s [srv unixsocket] -c 20 -n 20000 incr foo" "r+"]
after 100
# Benchmark should be running by now: start background rewrite
$client bgrewriteaof
# Read until benchmark pipe reaches EOF
while {[string length [read $bench]] > 0} {}
# Check contents of foo
assert_equal 20000 [$client get foo]
}
# Restart server to replay AOF
start_server_aof [list dir $server_path] {
set client [redis [srv host] [srv port] 0 $::tls]
assert_equal 20000 [$client get foo]
}
}