95 Commits

Author SHA1 Message Date
Roman Gershman
8240c7f19e
chore(monitoring): add more dashboards + memcached (#3268) 2024-07-05 07:12:13 +00:00
Shahar Mike
5b731f163c
feat(cluster_mgr): Fix migration action (#3124) 2024-06-04 13:27:42 +03:00
Shahar Mike
bcbcc5a2c6
feat(cluster_mgr): Take over command (#3120) 2024-06-04 11:39:08 +03:00
Shahar Mike
6e6c91aeaf
feat(cluster_mgr): Improvements to cluster_mgr.py (#3118)
Make sure attached node is in right mode
Enable detaching nodes
2024-06-03 19:05:17 +00:00
Roman Gershman
0394387a5f
chore: export pipeline related metrics (#3104)
* chore: export pipeline related metrics

Export in /metrics
1. Total pipeline queue length
2. Total pipeline commands
3. Total pipelined duration

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-05-30 19:10:35 +03:00
Shahar Mike
d1e3c82eaa
feat(cluster_mgr): Allow attaching replicas (#3105) 2024-05-30 15:29:58 +03:00
Vladislav
fd5ece09fb
chore: small replayer fixes (#3081) 2024-05-25 22:48:29 +03:00
Roman Gershman
8a0007d761
chore: add replication memory stats to the dashboard (#3065) 2024-05-22 08:11:54 +03:00
Jirapong Pansak
3babe99cf6
<chore>!: Update grafana panel (#3064)
update panel
2024-05-19 15:56:44 +00:00
Roman Gershman
fd74fd5b4b
chore: Export replication memory stats (#3062) 2024-05-18 22:40:14 +03:00
Borys
3dd6c4959c
feat: add defragment command (#3003)
* feat: add defragment command and improve auto defragmentation algorithm
2024-05-08 14:26:42 +03:00
adiholden
186ff31e29
Fix benchmark (#3017)
* fix(benchmark): fix lag check

Signed-off-by: adi_holden <adi@dragonflydb.io>

---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-05-06 18:38:13 +03:00
Zacharya
5a37c47aaf
feat(benchmark-tests): run in K8s (#2965)
Signed-off-by: adi_holden <adi@dragonflydb.io>

* feat(benchmark-tests): run in K8s

---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
Co-authored-by: adi_holden <adi@dragonflydb.io>
2024-05-03 15:12:15 +00:00
Roman Gershman
c37fe87bf1
chore: update our container distributions versions (#2983)
1. Restrict build context in our dev/weekly builder to ease development iterations.
2. Switch weekly build to debian 12-slim because it's smaller than 24.04
3. Update our prod releases to use ubuntu 22.04

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-05-01 11:34:23 +03:00
Vladislav
df598e4825
chore: Log db_index in traffic logger (#2951)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-04-24 15:13:53 +03:00
Roman Gershman
c42b3dc02f
chore: bring more clarity when replayer fails (#2933) 2024-04-19 10:49:32 +00:00
Vladislav
3e270fee53
chore(replayer): Roll back to go1.18 (#2881) 2024-04-10 16:58:51 +03:00
Shahar Mike
b8693b4805
feat(cluster): Send number of keys for incoming and outgoing migrations. (#2858)
The number of keys in an _incoming_ migration indicates how many keys
were received so far, while for an _outgoing_ migration it shows the total
number of keys. Combining the two lets the control plane compute a completion percentage.
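
As a rough illustration of how a control plane might combine the two counters, here is a minimal Python sketch. The function name `migration_progress` and its parameters are hypothetical and do not come from Dragonfly or `cluster_mgr.py`; treating an empty migration as complete is also an assumption.

```python
def migration_progress(received_keys: int, total_keys: int) -> float:
    """Return the migration progress as a percentage in the range 0..100.

    received_keys: key count reported for the incoming migration (hypothetical input).
    total_keys:    key count reported for the outgoing migration (hypothetical input).
    """
    if total_keys <= 0:
        # Assumption: a migration with no keys to move counts as complete.
        return 100.0
    # Clamp at 100 in case the two counters were sampled at different moments.
    return min(100.0, 100.0 * received_keys / total_keys)


if __name__ == "__main__":
    print(f"{migration_progress(250, 1000):.1f}%")
```

The clamp guards against the receiver's counter momentarily exceeding the sender's total when the two values are read at different times.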

This slightly modified the format of the response.

Fixes #2756
2024-04-08 21:17:03 +03:00
Roman Gershman
934a8c64c9
fix: healthcheck for docker containers (#2853)
* fix: healthcheck for docker containers

Fixes #2841

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-04-07 10:49:00 +03:00
adiholden
6e32139ada
Benchmark runner (#2780)
* feat(github runner): add benchmark workflow

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-03-27 07:31:19 +00:00
Shahar Mike
9ba532a826
feat(server): Use mimalloc in SSL calls (#2710)
* feat(server): Use mimalloc in SSL calls

Until now, OpenSSL used `malloc()` directly. This PR overrides it to use
mimalloc.

Fixes #2709

* Add generate-tls-files.sh
2024-03-11 08:25:59 +02:00
manojks1999
0081f4de71
Chore: Fixed Docker Health Check (#2659)
* docker_healthcheck_fix

* grep_fix_for_alpine

* added environment variable for healthcheck and changed the port extraction accordingly
2024-03-04 12:47:18 +02:00
Shahar Mike
54cb7d5cd0
feat(cluster_mgr): Add support for remote Dragonfly servers (#2671)
* WIP: `cluster_mgr.py` to work with remote targets

* Documentation

* No admin port

* Support different hostname move/migrate

* Fix migrate bug

* Fix typo in --help

* fix test

* self.update_id()
2024-02-29 11:59:54 +02:00
Shahar Mike
ebca523166
fix(cluster_mgr): Disable CPU affinity (#2632) 2024-02-20 13:43:17 +00:00
Shahar Mike
c7750b9d58
feat(cluster_mgr): Add support for migrate action (#2626)
Example usage:

```bash
# Create a 2-node cluster
./cluster_mgr.py --action=create --replicas_per_master=1 --num_master=2

# Move (no migration) all slots to first node
./cluster_mgr.py --action=move --target_port=7001 --slot_start=8192 --slot_end=16383

# Fill data - like run memtier

# Migrate all slots to 2nd node. One could measure how long this step takes.
./cluster_mgr.py --action=migrate --target_port=7002 --slot_start=0 --slot_end=16383
```
2024-02-20 13:58:13 +02:00
Roman Gershman
af23778655
fix: release pipeline (#2439)
We had a place in tools/packaging/generate_debian_package.sh that relied on the existence of build-opt;
moreover, if it did not exist, the script deadlocked.

1. Added more logging
2. Removed the loop
3. Removed unnecessary dependency in the script on the build-dir name.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-01-18 16:52:19 +02:00
Roman Gershman
8eda8226b2
fix: release.sh (#2432) 2024-01-17 12:51:31 +00:00
Roman Gershman
b3e0722d01
chore: fix our release pipeline (#2408)
* chore: fix our release pipeline

Also remove the alpine prod.wip file, which has not been used and is unlikely to be used for prod.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-01-14 17:31:59 +02:00
Vladislav
f4ea42f2f6
chore: simple traffic logger (#2378)
* feat: simple traffic logger

Controls: 
```
DEBUG TRAFFIC <base_path> | [STOP]
```
---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2024-01-10 12:56:56 +00:00
Roman Gershman
c7db025a48
feat: expose fiber responsiveness metrics (#2125)
Should allow tracking cases where Dragonfly is not responsive to I/O
due to big CPU tasks. Also, update the local grafana dashboard.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-11-05 16:56:33 +02:00
Roman Gershman
5c6aad20c1
fix: local grafana dashboard (#2124)
The dashboard used the `dragonfly_up` metric to bootstrap itself,
but this metric does not exist anymore. I replaced it with `dragonfly_version`.
In addition, the exported format changed slightly because I used a
recent grafana version to export.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-11-04 18:34:19 +00:00
Yue Li
00f1e3d578
feat(server): perform eviction upon memory pressure in cache mode (#2084)
* fixes #1936

Eviction Implementation
This patch provides a very simple eviction implementation for the interface mentioned above. In my opinion, the eviction algorithm approximates an LRU policy, given that normal buckets always store the most recently accessed data while stash buckets hold less active data.

The algorithm first selects a small set of segments as eviction targets. Starting from the last slot of the last stash bucket in each of the segments, we walk backward and evict the key-value pairs stored in each visited slot. Eviction stops when either the target memory release goal or the maximum number of evicted key-value pairs is reached. Therefore, we can upper-bound the eviction time through the following two parameters that can be set when DF starts. Note that these two parameters can be retrieved and changed by the user through the CONFIG GET and CONFIG SET commands.
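
The backward walk described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not Dragonfly's actual code: the data layout (a segment modeled as a list of stash buckets, each a list of slots), the function name `evict`, and the parameter names `bytes_goal` and `max_evicted` are all hypothetical, and segment selection is assumed to have happened already.

```python
from typing import List, Optional, Tuple

Slot = Optional[Tuple[str, int]]   # (key, size in bytes), or None for an empty slot
StashBuckets = List[List[Slot]]    # the stash buckets of one segment (simplified model)


def evict(segments: List[StashBuckets], bytes_goal: int,
          max_evicted: int) -> Tuple[List[str], int]:
    """Evict entries from the given segments; return (evicted keys, bytes freed).

    Walks each segment from the last slot of its last stash bucket backward and
    stops once either bound (bytes_goal or max_evicted) is reached.
    """
    evicted: List[str] = []
    freed = 0
    for stash in segments:
        for bucket in reversed(stash):
            for i in range(len(bucket) - 1, -1, -1):
                # Both limits bound the amount of work done in one eviction pass.
                if freed >= bytes_goal or len(evicted) >= max_evicted:
                    return evicted, freed
                slot = bucket[i]
                if slot is None:
                    continue
                key, size = slot
                bucket[i] = None      # release the slot
                evicted.append(key)
                freed += size
    return evicted, freed


if __name__ == "__main__":
    demo = [[[("a", 10), ("b", 20)], [("c", 30), None]], [[("d", 40)]]]
    print(evict(demo, bytes_goal=45, max_evicted=10))
```

Because the loop checks both limits before touching each slot, the running time of a single pass is bounded by whichever limit is hit first.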

---------

Signed-off-by: Yue Li <61070669+theyueli@users.noreply.github.com>
2023-11-01 11:11:27 -07:00
Tarun Pothulapati
b1ba29f9c7
fix(ubuntu-prod): Set suexec hash correctly (#2029) 2023-10-16 13:31:20 +03:00
Roman Gershman
c6f8f3882a
chore: add balls and bins simulator (#2001)
* chore: add balls and bins simulator

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* Update balls_bins.py

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-10-11 01:18:29 +03:00
Roman Gershman
6e76f8e6cc
fix: logrotate for dragonfly logs (#1972)
The new logrotate settings assume that dragonfly closes a log file
once it grows too large. Logrotate never rotates a file that is currently open for writing.

Specifically, logrotate:

1. rotates only log files
2. skips those that are currently open by a process
3. compresses using zstd, which is more CPU-efficient than gzip
4. does not truncate/create old files as 0-sized blobs - just renames them

Fixes #1935

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-10-09 20:08:37 +03:00
Roman Gershman
0be2d98f27
fix: weekly build (#1871)
* fix: weekly build

* Update tools/packaging/Dockerfile.ubuntu-prod

Co-authored-by: Roy Jacobson <roy@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>

---------

Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Roy Jacobson <roy@dragonflydb.io>
2023-09-18 14:41:40 +03:00
Roman Gershman
02fff36e3e
Add build_rpm script and rpm spec (#1831)
Also, link libstdc++ and libgcc statically.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-09-12 10:42:06 +03:00
Vladislav
e0af5fe836
Remove ICU library (#1812)
* chore(search): Replace icu with unialgo

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-09-06 15:06:38 +03:00
Roman Gershman
4e393cf742
fix: alpine weekly pipeline (#1811)
1. Move docker build files to separate dir from docker script files
   so that they won't be part of build context. Update dockerignore as well
2. Fix lib dependencies for alpine

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-09-06 08:27:40 +00:00
Vladislav
05f4895d48
Update Dockerfile.alpine-dev (#1803)
Add missing icu dependency

Signed-off-by: Vladislav <vlad@dragonflydb.io>
2023-09-05 08:44:11 +00:00
Søren Hansen
7492550f12
fix: run container as dfly user (#1775) 2023-08-31 16:54:50 +00:00
Vladislav
8b6de914fc
tools: Hash defrag script (#1723)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-22 09:11:34 +03:00
Vladislav
5198622a15
feat: Support unicode strings in search (#1698)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Signed-off-by: Vladislav <vlad@dragonflydb.io>
2023-08-18 15:40:37 +03:00
Roman Gershman
3d6d9d99c7
fix: weekly alpine build (#1555)
Specifying an exact boost version is not robust.
Also, we do not depend on fibers anymore and boost-context is enough.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-07-17 09:03:02 +03:00
Roman Gershman
8c80bd7c5c
chore: tune snyk coverage to ignore test files (#1509)
Also, upgrade the alpine docker image according to Snyk suggestions.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-07-03 11:16:42 +03:00
Tarun Pothulapati
2da32c1066
feat(debian): retain debug symbols in deb package (#1464) 2023-06-22 19:12:10 +03:00
Chaka
6ec9513c60
test(cluster): Extend cluster_mgr.py (#1426)
Now this management script can:
* Create a cluster (before this PR)
* Print an existing cluster configuration
* Shut down an existing cluster
* Move slots between cluster nodes

To support connecting to a cluster (for all new functions), I had to
change the way admin ports are defined. Instead of having the user
(optionally) specify the first port, they are hard-coded to be the
regular port + 10,000. This is done because we can't detect the admin
port of an existing cluster (for example via `CLUSTER SHARDS`).
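
The fixed mapping from a regular port to its admin port could look like the sketch below. The names `ADMIN_PORT_OFFSET` and `admin_port` are hypothetical and not taken from `cluster_mgr.py`; the range check is an added assumption to keep the result a valid TCP port.

```python
ADMIN_PORT_OFFSET = 10_000  # offset stated in the commit message above


def admin_port(port: int) -> int:
    """Derive the admin port from a node's regular port (hypothetical helper)."""
    result = port + ADMIN_PORT_OFFSET
    if not (0 < port and result <= 65535):
        # Assumption: reject ports whose admin port would fall outside 1..65535.
        raise ValueError(f"port {port} has no valid admin port")
    return result


if __name__ == "__main__":
    for p in (7001, 7002, 7003):
        print(p, "->", admin_port(p))
```

With this scheme a tool that only knows a node's regular port can always reach its admin port without querying the cluster.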
2023-06-20 10:13:57 +03:00
Chaka
5659ff6a2d
test(cluster): Add a Cluster Management script (#1390)
This script allows easily setting up a local cluster.

Example invocation:

```
killall dragonfly; ./cluster_mgr.py --num_masters=3 --with_replicas
Setting up a Dragonfly cluster:
- Master nodes: 3
- Ports: 7001...7003
- Admin ports: 8001...8003
- Replicas? True

Starting nodes...
- Log file for node 7001: /tmp/dfly.cluster.node.7001.log
- Log file for node 7002: /tmp/dfly.cluster.node.7002.log
- Log file for node 7003: /tmp/dfly.cluster.node.7003.log
- Log file for node 7004: /tmp/dfly.cluster.node.7004.log
- Log file for node 7005: /tmp/dfly.cluster.node.7005.log
- Log file for node 7006: /tmp/dfly.cluster.node.7006.log

Configuring replication...
- Response for 7004: OK
- Response for 7005: OK
- Response for 7006: OK

Getting IDs...
- ID for 7001: acefdc2da5d397cfcb99239b3c29cbe6ff10d75a
- ID for 7002: a8cc67dfa42e91a94bd7c0903df35d60a39508bd
- ID for 7003: 1ad91af7bd96c89a8da877164b2ebb4cf458cab8
- ID for 7004: d209c3603343e25a18c78bd68304b6d883973bd3
- ID for 7005: bd2b25e95aaf7fdd2b955e50a00093a8272954bf
- ID for 7006: beb5cb07b75c33e3ff938d07725f2688d9bc91e0

Pushing config...
- Push into 7001: OK
- Push into 7002: OK
- Push into 7003: OK
- Push into 7004: OK
- Push into 7005: OK
- Push into 7006: OK
```
2023-06-15 13:02:58 +03:00
Tarun Pothulapati
e62b5a6c13
fix(alpine): Add libxml2 into the build pipeline (#1363)
Fix lack of libxml2
2023-06-07 09:39:57 +03:00
Chaka
6e21686406
Add bison to build process, and libxml2 to runtime (#1308) 2023-05-29 13:19:23 +03:00