* chore: change how we track memory_budget during evictions
We compared memory_budget vs 0 before inserting a new item in DbSlice,
and retired cool pages if we are low on memory.
The problem - when we decide whether we allow growing a table, we estimate the possible object size increase due to the future table growth.
And the memory check described before was not consistent with the actual logic that rejected the insertion.
Moreover, the memory_budget tracking interaction with EvictionPolicy was over-complicated: we passed the memory_budget counter to the evp object
and then read it back, even though evp did not track object deletions memory impact during objects evictions.
Now, we remove the responsibility from evp to update memory_budget_ so it's solely updated by DbSlice.
We also update memory_budget_ during deletions, and when we pass it to evp, we add cool memory size as potential memory resource to avoid
rejections in case we have lots of cool memory.
Fixes#3456
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The env variables exported when regression tests timeout are not working properly and the if statement on the action step Print last log on timeout would fail to read and upload the files set in /tmp/last_log_file.txt. Furthermore, another problem is the job.timeout argument that kills the whole job/matrix before the upload log step has a chance to run. For that, we need manual timeouts on the workflow similar to what we do in regression tests action.
* remove print last log on timeout action step
* copy the logs on timeout directly within the timeout step
* replace global timeout on CI workflow with timeout command per step
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
* fix: reenable macos builds
Also, add debug function that prints local state if deadlocks occure.
* fix: free cold memory for non-cache mode as well
* chore: disable FreeMemWithEvictionStep again
Because it heavily affects the performance when performing evictions.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: introduce back-pressure to tiered storage
Also, so clean-up with mac-os daily build.
Enabled forgotten test.
Improve CI insights
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: Introduce pipeline back-pressure
Also, improve synchronization primitives and replace them with
thread-local variations.
Before the change, on my local machine with the dragonfly running with 8 threads,
`memtier_benchmark -c 10 --threads 8 --command="PING" --key-maximum 100000000 --hide-histogram --distinct-client-seed --pipeline=20 --test-time=10`
reached 10M qps with 0.327ms p99.9.
After the change, the same command showed 13.8M qps with 0.2ms p99.9
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: export pipeline related metrics
Export in /metrics
1. Total pipeline queue length
2. Total pipeline commands
3. Total pipelined duration
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Use clib malloc for allocating fiber stacks but reduce the fiber stack size.
clib malloc uses default 4K OS pages when reserving memory from the OS.
The reason for not using mi_malloc, because we use 2MB large OS pages with mimalloc.
However, allocating stacks is one of the cases, when using smaller 4KB memory pages is actually more
RSS efficient because memory pages become hot at better granularity.
2. Add "memory_fiberstack_vms_bytes" metric exposing fiber stack vm usage.
3. Fix macos dependencies & update ci versions.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Also update actions versions to Node 20.
This change allows dragonfly to be built on MacOs.
However, we still have multiple failing tests on MacOS.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* upload only failed test logs
* remove printing log names for passed tests
* print slow tests with --duration
* separate regression and unit logs for CI workflow
* test(memory): Test memory accounting for all types
* slightly faster
* WIP
* working
* Document
* Update test to use DEBUG POPULATE
* Nothing much
* Working
* fix
* yaml
* explicit capture
* fix ci?
* stub tx
chore: UpdateHelio dependency
Add support for asan/ubsan checkers in our dev environment.
Remove more clang warnings.
Once we fix all the problems we will enable them in our CI as well.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>