# Which Problems Are Solved
Optimize the query that checks for terminated sessions in the access
token verifier. The verifier is used in auth middleware, userinfo and
introspection.
# How the Problems Are Solved
The previous implementation built a query for certain events and then
appended a single `PositionAfter` clause. This caused the postgreSQL
planner to use indexes only for the instance ID, aggregate IDs,
aggregate types and event types. Followed by an expensive sequential
scan for the position. This resulting in internal over-fetching of rows
before the final filter was applied.
![Screenshot_20241007_105803](https://github.com/user-attachments/assets/f2d91976-be87-428b-b604-a211399b821c)
Furthermore, the query was searching for events which are not always
applicable. For example, there was always a session ID search and if
there was a user ID, we would also search for a browser fingerprint in
event payload (expensive). Even if those argument string would be empty.
This PR changes:
1. Nest the position query, so that a full `instance_id, aggregate_id,
aggregate_type, event_type, "position"` index can be matched.
2. Redefine the `es_wm` index to include the `position` column.
3. Only search for events for the IDs that actually have a value. Do not
search (noop) if none of session ID, user ID or fingerpint ID are set.
New query plan:
![Screenshot_20241007_110648](https://github.com/user-attachments/assets/c3234c33-1b76-4b33-a4a9-796f69f3d775)
# Additional Changes
- cleanup how we load multi-statement migrations and make that a bit
more reusable.
# Additional Context
- Related to https://github.com/zitadel/zitadel/issues/7639
# Which Problems Are Solved
Cache implementation using a PGX connection pool.
# How the Problems Are Solved
Defines a new schema `cache` in the zitadel database.
A table for string keys and a table for objects is defined.
For postgreSQL, tables are unlogged and partitioned by cache name for
performance.
Cockroach does not have unlogged tables and partitioning is an
enterprise feature that uses alternative syntax combined with sharding.
Regular tables are used here.
# Additional Changes
- `postgres.Config` can return a pxg pool. See following discussion
# Additional Context
- Part of https://github.com/zitadel/zitadel/issues/8648
- Closes https://github.com/zitadel/zitadel/issues/8647
---------
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
Twilio supports a robust, multi-channel verification service that
notably supports multi-region SMS sender numbers required for our use
case. Currently, Zitadel does much of the work of the Twilio Verify (eg.
localization, code generation, messaging) but doesn't support the pool
of sender numbers that Twilio Verify does.
# How the Problems Are Solved
To support this API, we need to be able to store the Twilio Service ID
and send that in a verification request where appropriate: phone number
verification and SMS 2FA code paths.
This PR does the following:
- Adds the ability to use Twilio Verify of standard messaging through
Twilio
- Adds support for international numbers and more reliable verification
messages sent from multiple numbers
- Adds a new Twilio configuration option to support Twilio Verify in the
admin console
- Sends verification SMS messages through Twilio Verify
- Implements Twilio Verification Checks for codes generated through the
same
# Additional Changes
# Additional Context
- base was implemented by @zhirschtritt in
https://github.com/zitadel/zitadel/pull/8268❤️
- closes https://github.com/zitadel/zitadel/issues/8581
---------
Co-authored-by: Zachary Hirschtritt <zachary.hirschtritt@klaviyo.com>
Co-authored-by: Joey Biscoglia <joey.biscoglia@klaviyo.com>
# Which Problems Are Solved
We identified the need of caching.
Currently we have a number of places where we use different ways of
caching, like go maps or LRU.
We might also want shared chaches in the future, like Redis-based or in
special SQL tables.
# How the Problems Are Solved
Define a generic Cache interface which allows different implementations.
- A noop implementation is provided and enabled as.
- An implementation using go maps is provided
- disabled in defaults.yaml
- enabled in integration tests
- Authz middleware instance objects are cached using the interface.
# Additional Changes
- Enabled integration test command raceflag
- Fix a race condition in the limits integration test client
- Fix a number of flaky integration tests. (Because zitadel is super
fast now!) 🎸🚀
# Additional Context
Related to https://github.com/zitadel/zitadel/issues/8648
# Which Problems Are Solved
id_tokens issued for auth requests created through the login UI
currently do not provide a sid claim.
This is due to the fact that (SSO) sessions for the login UI do not have
one and are only computed by the userAgent(ID), the user(ID) and the
authentication checks of the latter.
This prevents client to track sessions and terminate specific session on
the end_session_endpoint.
# How the Problems Are Solved
- An `id` column is added to the `auth.user_sessions` table.
- The `id` (prefixed with `V1_`) is set whenever a session is added or
updated to active (from terminated)
- The id is passed to the `oidc session` (as v2 sessionIDs), to expose
it as `sid` claim
# Additional Changes
- refactored `getUpdateCols` to handle different column value types and
add arguments for query
# Additional Context
- closes#8499
- relates to #8501
# Which Problems Are Solved
Implement a new API service that allows management of OIDC signing web
keys.
This allows users to manage rotation of the instance level keys. which
are currently managed based on expiry.
The API accepts the generation of the following key types and
parameters:
- RSA keys with 2048, 3072 or 4096 bit in size and:
- Signing with SHA-256 (RS256)
- Signing with SHA-384 (RS384)
- Signing with SHA-512 (RS512)
- ECDSA keys with
- P256 curve
- P384 curve
- P512 curve
- ED25519 keys
# How the Problems Are Solved
Keys are serialized for storage using the JSON web key format from the
`jose` library. This is the format that will be used by OIDC for
signing, verification and publication.
Each instance can have a number of key pairs. All existing public keys
are meant to be used for token verification and publication the keys
endpoint. Keys can be activated and the active private key is meant to
sign new tokens. There is always exactly 1 active signing key:
1. When the first key for an instance is generated, it is automatically
activated.
2. Activation of the next key automatically deactivates the previously
active key.
3. Keys cannot be manually deactivated from the API
4. Active keys cannot be deleted
# Additional Changes
- Query methods that later will be used by the OIDC package are already
implemented. Preparation for #8031
- Fix indentation in french translation for instance event
- Move user_schema translations to consistent positions in all
translation files
# Additional Context
- Closes#8030
- Part of #7809
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
During performance testing of the `eventstore.fields` table we found
some long running queries which searched for the aggregate id.
# How the Problems Are Solved
A new index was added to the `eventstore.fields`-table called
`f_aggregate_object_type_idx`.
# Additional Changes
None
# Additional Context
- Table was added in https://github.com/zitadel/zitadel/pull/8191
- Part of https://github.com/zitadel/zitadel/issues/7639
# Which Problems Are Solved
Improve the performance of human imports by optimizing the query that
finds domains claimed by other organizations.
# How the Problems Are Solved
Use the fields search table introduced in
https://github.com/zitadel/zitadel/pull/8191 by storing each
organization domain as Object ID and the verified status as field value.
# Additional Changes
- Feature flag for this optimization
# Additional Context
- Performance improvements for import are evaluated and acted upon
internally at the moment
---------
Co-authored-by: adlerhurst <silvan.reusser@gmail.com>
# Which Problems Are Solved
To improve performance a new table and method is implemented on
eventstore. The goal of this table is to index searchable fields on
command side to use it on command and query side.
The table allows to store one primitive value (numeric, text) per row.
The eventstore framework is extended by the `Search`-method which allows
to search for objects.
The `Command`-interface is extended by the `SearchOperations()`-method
which does manipulate the the `search`-table.
# How the Problems Are Solved
This PR adds the capability of improving performance for command and
query side by using the `Search`-method of the eventstore instead of
using one of the `Filter`-methods.
# Open Tasks
- [x] Add feature flag
- [x] Unit tests
- [ ] ~~Benchmarks if needed~~
- [x] Ensure no behavior change
- [x] Add setup step to fill table with current data
- [x] Add projection which ensures data added between setup and start of
the new version are also added to the table
# Additional Changes
The `Search`-method is currently used by `ProjectGrant`-command side.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/8094
# Which Problems Are Solved
Some organizations / customers have the requirement, that there users
regularly need to change their password.
ZITADEL already had the possibility to manage a `password age policy` (
thought the API) with the maximum amount of days a password should be
valid, resp. days after with the user should be warned of the upcoming
expiration.
The policy could not be managed though the Console UI and was not
checked in the Login UI.
# How the Problems Are Solved
- The policy can be managed in the Console UI's settings sections on an
instance and organization level.
- During an authentication in the Login UI, if a policy is set with an
expiry (>0) and the user's last password change exceeds the amount of
days set, the user will be prompted to change their password.
- The prompt message of the Login UI can be customized in the Custom
Login Texts though the Console and API on the instance and each
organization.
- The information when the user last changed their password is returned
in the Auth, Management and User V2 API.
- The policy can be retrieved in the settings service as `password
expiry settings`.
# Additional Changes
None.
# Additional Context
- closes#8081
---------
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Which Problems Are Solved
Adds the possibility to mirror an existing database to a new one.
For that a new command was added `zitadel mirror`. Including it's
subcommands for a more fine grained mirror of the data.
Sub commands:
* `zitadel mirror eventstore`: copies only events and their unique
constraints
* `zitadel mirror system`: mirrors the data of the `system`-schema
* `zitadel mirror projections`: runs all projections
* `zitadel mirror auth`: copies auth requests
* `zitadel mirror verify`: counts the amount of rows in the source and
destination database and prints the diff.
The command requires one of the following flags:
* `--system`: copies all instances of the system
* `--instance <instance-id>`, `--instance <comma separated list of
instance ids>`: copies only the defined instances
The command is save to execute multiple times by adding the
`--replace`-flag. This replaces currently existing data except of the
`events`-table
# Additional Changes
A `--for-mirror`-flag was added to `zitadel setup` to prepare the new
database. The flag skips the creation of the first instances and initial
run of projections.
It is now possible to skip the creation of the first instance during
setup by setting `FirstInstance.Skip` to true in the steps
configuration.
# Additional info
It is currently not possible to merge multiple databases. See
https://github.com/zitadel/zitadel/issues/7964 for more details.
It is currently not possible to use files. See
https://github.com/zitadel/zitadel/issues/7966 for more information.
closes https://github.com/zitadel/zitadel/issues/7586
closes https://github.com/zitadel/zitadel/issues/7486
### Definition of Ready
- [x] I am happy with the code
- [x] Short description of the feature/issue is added in the pr
description
- [x] PR is linked to the corresponding user story
- [x] Acceptance criteria are met
- [x] All open todos and follow ups are defined in a new ticket and
justified
- [x] Deviations from the acceptance criteria and design are agreed with
the PO and documented.
- [x] No debug or dead code
- [x] My code has no repetitions
- [x] Critical parts are tested automatically
- [ ] Where possible E2E tests are implemented
- [x] Documentation/examples are up-to-date
- [x] All non-functional requirements are met
- [x] Functionality of the acceptance criteria is checked manually on
the dev system.
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
ZITADEL currently always uses
`urn:oasis:names:tc:SAML:2.0:nameid-format:persistent` in SAML requests,
relying on the IdP to respect that flag and always return a peristent
nameid in order to be able to map the external user with an existing
user (idp link) in ZITADEL.
In case the IdP however returns a
`urn:oasis:names:tc:SAML:2.0:nameid-format:transient` (transient)
nameid, the attribute will differ between each request and it will not
be possible to match existing users.
# How the Problems Are Solved
This PR adds the following two options on SAML IdP:
- **nameIDFormat**: allows to set the nameid-format used in the SAML
Request
- **transientMappingAttributeName**: allows to set an attribute name,
which will be used instead of the nameid itself in case the returned
nameid-format is transient
# Additional Changes
To reduce impact on current installations, the `idp_templates6_saml`
table is altered with the two added columns by a setup job. New
installations will automatically get the table with the two columns
directly.
All idp unit tests are updated to use `expectEventstore` instead of the
deprecated `eventstoreExpect`.
# Additional Context
Closes#7483Closes#7743
---------
Co-authored-by: peintnermax <max@caos.ch>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
During the implementation of #7486 it was noticed, that projections in
the `auth` database schema could be blocked.
Investigations suggested, that this is due to the use of
[GORM](https://gorm.io/index.html) and it's inability to use an existing
(sql) transaction.
With the improved / simplified handling (see below) there should also be
a minimal improvement in performance, resp. reduced database update
statements.
# How the Problems Are Solved
The handlers in `auth` are exchanged to proper (sql) statements and gorm
usage is removed for any writing part.
To further improve / simplify the handling of the users, a new
`auth.users3` table is created, where only attributes are handled, which
are not yet available from the `projections.users`,
`projections.login_name` and `projections.user_auth_methods` do not
provide. This reduces the events handled in that specific handler by a
lot.
# Additional Changes
None
# Additional Context
relates to #7486
* fix: add resource owner as query for user v2 ListUsers and clean up deprecated attribute
* fix: add resource owner as query for user v2 ListUsers and clean up deprecated attribute
* fix: add resource owner as query for user v2 ListUsers and clean up deprecated attribute
* fix: review changes
* fix: review changes
* fix: review changes
* fix: review changes
* fix: add password change required to user v2 get and list
* fix: update unit tests for query side with new column and projection
* fix: change projection in setup steps
* fix: change projection in setup steps
* fix: remove setup step 25
* fix: add password_change_required into ListUsers response
* fix: correct SetUserPassword parameters
* fix: rollback to change setup instead of projection directly
* fix: rollback to change setup instead of projection directly
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
* chore: use pgx v5
* chore: update go version
* remove direct pq dependency
* remove unnecessary type
* scan test
* map scanner
* converter
* uint8 number array
* duration
* most unit tests work
* unit tests work
* chore: coverage
* go 1.21
* linting
* int64 gopfertammi
* retry go 1.22
* retry go 1.22
* revert to go v1.21.5
* update go toolchain to 1.21.8
* go 1.21.8
* remove test flag
* go 1.21.5
* linting
* update toolchain
* use correct array
* use correct array
* add byte array
* correct value
* correct error message
* go 1.21 compatible
A customer noted that the login by email was case-sensitive, which differs to the handling of the loginname.
This PR changes the email check to be case-insensitive (which it was already in same parts) and improve the search for this as well.
* add token exchange feature flag
* allow setting reason and actor to access tokens
* impersonation
* set token types and scopes in response
* upgrade oidc to working draft state
* fix tests
* audience and scope validation
* id toke and jwt as input
* return id tokens
* add grant type token exchange to app config
* add integration tests
* check and deny actors in api calls
* fix instance setting tests by triggering projection on write and cleanup
* insert sleep statements again
* solve linting issues
* add translations
* pin oidc v3.15.0
* resolve comments, add event translation
* fix refreshtoken test
* use ValidateAuthReqScopes from oidc
* apparently the linter can't make up its mind
* persist actor thru refresh tokens and check in tests
* remove unneeded triggers
* docs: describe DefaultInstance vs FirstInstance
* link to docs
* add better searchable tip to the docs
* add better searchable tip to the docs
* add link
* feat(api): feature API proto definitions
* update proto based on discussion with @livio-a
* cleanup old feature flag stuff
* authz instance queries
* align defaults
* projection definitions
* define commands and event reducers
* implement system and instance setter APIs
* api getter implementation
* unit test repository package
* command unit tests
* unit test Get queries
* grpc converter unit tests
* migrate the V1 features
* migrate oidc to dynamic features
* projection unit test
* fix instance by host
* fix instance by id data type in sql
* fix linting errors
* add system projection test
* fix behavior inversion
* resolve proto file comments
* rename SystemDefaultLoginInstanceEventType to SystemLoginDefaultOrgEventType so it's consistent with the instance level event
* use write models and conditional set events
* system features integration tests
* instance features integration tests
* error on empty request
* documentation entry
* typo in feature.proto
* fix start unit tests
* solve linting error on key case switch
* remove system defaults after discussion with @eliobischof
* fix system feature projection
* resolve comments in defaults.yaml
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
Even though this is a feature it's released as fix so that we can back port to earlier revisions.
As reported by multiple users startup of ZITADEL after leaded to downtime and worst case rollbacks to the previously deployed version.
The problem starts rising when there are too many events to process after the start of ZITADEL. The root cause are changes on projections (database tables) which must be recomputed. This PR solves this problem by adding a new step to the setup phase which prefills the projections. The step can be enabled by adding the `--init-projections`-flag to `setup`, `start-from-init` and `start-from-setup`. Setting this flag results in potentially longer duration of the setup phase but reduces the risk of the problems mentioned in the paragraph above.
* fix(setup): unmarshal of failed step
* fix(cleanup): cleanup all stuck states
* use lastRun for repeatable steps
* typo
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
* fix(db): add additional connection pool for projection spooling
* use correct connection pool for projections
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
* start user by id
* ignore debug bin
* use new user by id
* new sql
* fix(sql): replace STRING with text for psql compatabilit
* some changes
* fix: correct user queries
* fix tests
* unify sql statements
* use specific get user methods
* search login name case insensitive
* refactor: optimise user statements
* add index
* fix queries
* fix: correct domain segregation
* return all login names
* fix queries
* improve readability
* query should be correct now
* cleanup statements
* fix username / loginname handling
* fix: psql doesn't support create view if not exists
* fix: create pre-release
* ignore release comments
* add lower fields
* fix: always to lower
* update to latest projection
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
* feat: return 404 or 409 if org reg disallowed
* fix: system limit permissions
* feat: add iam limits api
* feat: disallow public org registrations on default instance
* add integration test
* test: integration
* fix test
* docs: describe public org registrations
* avoid updating docs deps
* fix system limits integration test
* silence integration tests
* fix linting
* ignore strange linter complaints
* review
* improve reset properties naming
* redefine the api
* use restrictions aggregate
* test query
* simplify and test projection
* test commands
* fix unit tests
* move integration test
* support restrictions on default instance
* also test GetRestrictions
* self review
* lint
* abstract away resource owner
* fix tests
* configure supported languages
* fix allowed languages
* fix tests
* default lang must not be restricted
* preferred language must be allowed
* change preferred languages
* check languages everywhere
* lint
* test command side
* lint
* add integration test
* add integration test
* restrict supported ui locales
* lint
* lint
* cleanup
* lint
* allow undefined preferred language
* fix integration tests
* update main
* fix env var
* ignore linter
* ignore linter
* improve integration test config
* reduce cognitive complexity
* compile
* check for duplicates
* remove useless restriction checks
* review
* revert restriction renaming
* fix language restrictions
* lint
* generate
* allow custom texts for supported langs for now
* fix tests
* cleanup
* cleanup
* cleanup
* lint
* unsupported preferred lang is allowed
* fix integration test
* finish reverting to old property name
* finish reverting to old property name
* load languages
* refactor(i18n): centralize translators and fs
* lint
* amplify no validations on preferred languages
* fix integration test
* lint
* fix resetting allowed languages
* test unchanged restrictions
* define roles and permissions
* support system user memberships
* don't limit system users
* cleanup permissions
* restrict memberships to aggregates
* default to SYSTEM_OWNER
* update unit tests
* test: system user token test (#6778)
* update unit tests
* refactor: make authz testable
* move session constants
* cleanup
* comment
* comment
* decode member type string to enum (#6780)
* decode member type string to enum
* handle all membership types
* decode enums where necessary
* decode member type in steps config
* update system api docs
* add technical advisory
* tweak docs a bit
* comment in comment
* lint
* extract token from Bearer header prefix
* review changes
* fix tests
* fix: add fix for activityhandler
* add isSystemUser
* remove IsSystemUser from activity info
* fix: add fix for activityhandler
---------
Co-authored-by: Stefan Benz <stefan@caos.ch>
This implementation increases parallel write capabilities of the eventstore.
Please have a look at the technical advisories: [05](https://zitadel.com/docs/support/advisory/a10005) and [06](https://zitadel.com/docs/support/advisory/a10006).
The implementation of eventstore.push is rewritten and stored events are migrated to a new table `eventstore.events2`.
If you are using cockroach: make sure that the database user of ZITADEL has `VIEWACTIVITY` grant. This is used to query events.
* start feature flags
* base feature events on domain const
* setup default features
* allow setting feature in system api
* allow setting feature in admin api
* set settings in login based on feature
* fix rebasing
* unit tests
* i18n
* update policy after domain discovery
* some changes from review
* check feature and value type
* check feature and value type