* chore: enable experimental_new_io by default.
It has been running for weeks with the flag on, so enabled it also for community.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Vladislav Oleshko <vlad@dragonflydb.io>
* fix truncating the timeout red dots on CI failures
* fix deprecated use of with timeout warnings
* remove @pytest.mark.dbg_only as it doesn't exist
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
* chore: change how we track memory_budget during evictions
We compared memory_budget vs 0 before inserting a new item in DbSlice,
and retired cool pages if we are low on memory.
The problem - when we decide whether we allow growing a table, we estimate the possible object size increase due to the future table growth.
And the memory check described before was not consistent with the actual logic that rejected the insertion.
Moreover, the memory_budget tracking interaction with EvictionPolicy was over-complicated: we passed the memory_budget counter to the evp object
and then read it back, even though evp did not track object deletions memory impact during objects evictions.
Now, we remove the responsibility from evp to update memory_budget_ so it's solely updated by DbSlice.
We also update memory_budget_ during deletions, and when we pass it to evp, we add cool memory size as potential memory resource to avoid
rejections in case we have lots of cool memory.
Fixes#3456
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The env variables exported when regression tests timeout are not working properly and the if statement on the action step Print last log on timeout would fail to read and upload the files set in /tmp/last_log_file.txt. Furthermore, another problem is the job.timeout argument that kills the whole job/matrix before the upload log step has a chance to run. For that, we need manual timeouts on the workflow similar to what we do in regression tests action.
* remove print last log on timeout action step
* copy the logs on timeout directly within the timeout step
* replace global timeout on CI workflow with timeout command per step
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
The recent changes of the serialization_max_chunk_size set to 1 for extreme testing increased the running time of the tests causing them sometimes to timeout.
* increase timeout on reg tests from 40 to 50
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
* fix: reenable macos builds
Also, add debug function that prints local state if deadlocks occure.
* fix: free cold memory for non-cache mode as well
* chore: disable FreeMemWithEvictionStep again
Because it heavily affects the performance when performing evictions.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: introduce back-pressure to tiered storage
Also, so clean-up with mac-os daily build.
Enabled forgotten test.
Improve CI insights
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: Introduce pipeline back-pressure
Also, improve synchronization primitives and replace them with
thread-local variations.
Before the change, on my local machine with the dragonfly running with 8 threads,
`memtier_benchmark -c 10 --threads 8 --command="PING" --key-maximum 100000000 --hide-histogram --distinct-client-seed --pipeline=20 --test-time=10`
reached 10M qps with 0.327ms p99.9.
After the change, the same command showed 13.8M qps with 0.2ms p99.9
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: export pipeline related metrics
Export in /metrics
1. Total pipeline queue length
2. Total pipeline commands
3. Total pipelined duration
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix(helm): add issuer group to create the certificate without wait for the previous created issuer
Signed-off-by: Fabiano Arruda Ferreira das Graças <fafg@fafg-mbm1.fritz.box>
* fix(helm): remove condition that can prevent the helm chart be rendered on machines where monitoring.coreos.com is not installed or is not the end target of the helm template command
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* fix(helm): lint - remove blank line
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* add(helm): missing service monitor test files
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* add(helm): add missing cert-manager test files
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* fix(helm): lint - add missing blank lines
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* fix(helm): rebase
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* Revert "fix(helm): rebase"
This reverts commit c4ce16b76e.
* fix(helm): fix service monitor namespace rendering
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* fix(helm): add missing up to date golden file
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* fix(helm): merge upstream
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
* update golden files
* also install prom operator dependencies
* also install cert-manager
* skip cert-manager chart
* skip cert-manager value
* remove CI TLS files
* fix formatting
* fix formatting
* fix actions
---------
Signed-off-by: Fabiano Arruda Ferreira das Graças <fafg@fafg-mbm1.fritz.box>
Signed-off-by: fafg <fabiano.arruda@hotmail.com>
Co-authored-by: Tarun Pothulapati <tarun@dragonflydb.io>
Co-authored-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
1. Use clib malloc for allocating fiber stacks but reduce the fiber stack size.
clib malloc uses default 4K OS pages when reserving memory from the OS.
The reason for not using mi_malloc, because we use 2MB large OS pages with mimalloc.
However, allocating stacks is one of the cases, when using smaller 4KB memory pages is actually more
RSS efficient because memory pages become hot at better granularity.
2. Add "memory_fiberstack_vms_bytes" metric exposing fiber stack vm usage.
3. Fix macos dependencies & update ci versions.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>