The problem: we used file writes in non-direct mode when writing snapshots in epoll mode.
As a result, lots of data was cached in OS memory. Then, during the rename of
"xxx.dfs.tmp" into "xxx.dfs", the OS flushed the file caches and the thread
got stuck in the rename system call for a long time.
The fix: use DIRECT mode and avoid caching the data in the OS page cache at all.
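For intuition only, a shell-level sketch of buffered vs direct writes using dd; nothing here is Dragonfly code, and the paths and sizes are arbitrary:
```bash
# Run on a disk-backed filesystem (tmpfs does not support O_DIRECT).

# Buffered write: data first lands in the OS page cache.
dd if=/dev/zero of=./buffered.bin bs=1M count=512
grep '^Cached:' /proc/meminfo    # grows by roughly 512 MB

# Direct write (O_DIRECT): bypasses the page cache entirely,
# so there is nothing left for a later flush to block on.
dd if=/dev/zero of=./direct.bin bs=1M count=512 oflag=direct
grep '^Cached:' /proc/meminfo    # stays roughly flat
```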
Fixes #3895
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Fixes #2917
The problem is described in this "working as intended" issue: https://github.com/moby/moby/issues/3124
So the advised approach of using the "USER dfly" directive does not really work, because it requires
that the host also define a 'dfly' user with the same id, which is an unrealistic expectation.
Therefore, we revert the fix done in #1775 and follow the valkey approach:
https://github.com/valkey-io/valkey-container/blob/mainline/docker-entrypoint.sh#L12
1. We run the entrypoint in the container as root; it later spawns the dragonfly process.
2. If we run as root:
a. we chown the files under /data to dfly.
b. we use setpriv to exec ourselves as dfly.
3. If we do not run as root, we execute the docker command directly.
So even though the process starts as root, the server runs as dfly; only the bootstrap
part runs with elevated permissions, and those are used solely to fix the volume access.
While we are at it, we also switched to setpriv, following the change in https://github.com/valkey-io/valkey-container/pull/24/files
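A minimal sketch of the resulting entrypoint, modeled on the valkey script linked above; the 'dfly' user and the 'dragonfly' argv[0] check are assumptions about our image layout:
```bash
#!/bin/sh
set -e

# If started as root and asked to run dragonfly: fix volume ownership,
# then re-exec this script as the unprivileged dfly user.
if [ "$1" = 'dragonfly' ] && [ "$(id -u)" = '0' ]; then
    find /data \! -user dfly -exec chown dfly '{}' +
    exec setpriv --reuid dfly --regid dfly --clear-groups -- "$0" "$@"
fi

# Not root (or a different command): just run it as-is.
exec "$@"
```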
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: reduce pipelining latency by reusing existing shard fibers
To prove the benefits, run `./dfly_bench --pipeline=50 -n 20000 --ratio 0:1 --qps=0 --key_maximum=1`
Before: the average pipelining latency was 10ms.
After: the average pipelining latency is 5ms.
Avg latency = pipelined_latency_usec / total_pipelined_squashed_commands
Also, improved the counting of squashed commands, so that only commands that are actually squashed are counted.
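As a sanity check of the formula, a worked example with made-up counter values (matching the 5ms "after" measurement):
```bash
# 500,000,000 usec of accumulated pipelining latency over 100,000 squashed
# commands gives an average of 5000 usec, i.e. 5 ms per command.
pipelined_latency_usec=500000000
total_pipelined_squashed_commands=100000
echo $(( pipelined_latency_usec / total_pipelined_squashed_commands ))  # => 5000
```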
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Also refactor ReceiveFB into multiple functions.
Finally, fix the memcached command in the local monitoring stack.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: export pipeline related metrics
Export in /metrics
1. Total pipeline queue length
2. Total pipeline commands
3. Total pipelined duration
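A quick way to eyeball the new counters, assuming the usual Prometheus text exposition (the port and the exact metric names are assumptions; adjust to your deployment):
```bash
curl -s http://localhost:6379/metrics | grep -i pipeline
```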
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Restrict the build context in our dev/weekly builder to ease development iterations.
2. Switch the weekly build to debian 12-slim because it's smaller than ubuntu 24.04.
3. Update our prod releases to use ubuntu 22.04.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The number of keys in an _incoming_ migration indicates how many keys
have been received so far, while for an _outgoing_ migration it shows the total
number of keys. Combining the two provides the control plane with a completion percentage.
This slightly modifies the format of the response.
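For illustration only (the numbers are made up): if the incoming side has received 4096 keys and the outgoing side reports 16384 keys in total, the control plane can derive the progress as:
```bash
echo "scale=1; 100 * 4096 / 16384" | bc   # => 25.0 (%)
```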
Fixes #2756
* feat(server): Use mimalloc in SSL calls
Until now, OpenSSL used `malloc()` directly. This PR overrides it to use
mimalloc.
Fixes #2709
* Add generate-tls-files.sh
* WIP: `cluster_mgr.py` to work with remote targets
* Documentation
* No admin port
* Support different hostname move/migrate
* Fix migrate bug
* Fix typo in --help
* fix test
* self.update_id()
Example usage:
```bash
# Create a 2-node cluster
./cluster_mgr.py --action=create --replicas_per_master=1 --num_master=2
# Move (no migration) all slots to first node
./cluster_mgr.py --action=move --target_port=7001 --slot_start=8192 --slot_end=16383
# Fill data - like run memtier
# Migrate all slots to 2nd node. One could measure how long this step takes.
./cluster_mgr.py --action=migrate --target_port=7002 --slot_start=0 --slot_end=16383
```
We had a place in tools/packaging/generate_debian_package.sh that relied on the existence of build-opt;
moreover, if it did not exist, the script deadlocked.
1. Added more logging.
2. Removed the loop.
3. Removed the script's unnecessary dependency on the build-dir name.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: fix our release pipeline
Also remove the alpine prod.wip file, which has not been used and is unlikely to ever be used for prod.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Should allow tracking cases where Dragonfly is not responsive to I/O
due to big CPU tasks. Also, update the local grafana dashboard.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The dashboard used the `dragonfly_up` metric to bootstrap itself,
but this metric does not exist anymore; I replaced it with `dragonfly_version`.
In addition, the exported format changed slightly because I used a
recent grafana version to export it.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* Fixes #1936
Eviction Implementation
This patch provides a very simple eviction implementation for the interface mentioned above. In my opinion, the eviction algorithm approximates an LRU policy, given that normal buckets always store the most recently accessed data while stash buckets hold less active data.
The algorithm first selects a small set of segments as eviction targets. Starting from the last slot of the last stash bucket in each of these segments, we walk backward and evict the key-value pairs stored in each visited slot. Eviction stops either when a target memory release goal is reached or when the max number of evicted key-value pairs is reached. Therefore, we can upper-bound the eviction time through these two parameters, which can be set when DF starts. Note that both parameters can be retrieved and changed by the user through the CONFIG GET and CONFIG SET commands.
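For example (the flag names below are hypothetical placeholders for the two knobs, not the real ones):
```bash
# Inspect the current eviction bounds (hypothetical flag names):
redis-cli CONFIG GET max_eviction_memory_budget
redis-cli CONFIG GET max_evicted_keys_per_run

# Tighten the per-run eviction work at runtime:
redis-cli CONFIG SET max_evicted_keys_per_run 128
```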
---------
Signed-off-by: Yue Li <61070669+theyueli@users.noreply.github.com>