Single-node cluster: A single-node cluster has only one node, called the primary node. This node accepts customer connections and performs read/write operations. It is a single source of truth as well as a single point of failure.
Multi-node cluster: In addition to the primary node, this cluster contains standby nodes that can be promoted to primary if the current primary fails. The nodes are spread across availability zones. Currently, we use warm standby nodes, which means they don't serve read requests. Hot standby functionality (when the nodes can serve read requests) might be added in the future.
Existing clusters can be scaled in two ways: horizontally and vertically.
Horizontal scaling means changing the number of instances that run in parallel; the number of nodes in a cluster can be increased or decreased.
Increasing the number of instances causes no disruption. However, decreasing it may cause a switchover if the current primary node is removed.
Note: This method of scaling is used to provide high availability. It will not increase performance.
Vertical scaling refers to configuring the size of the individual instances. This is used if you want to process more data and queries. You can change the number of cores and the size of memory to have the configuration that you need. Each instance is maintained on a dedicated node. In the event of scaling up or down, a new node will be created for each instance.
Once the new node becomes available, the server will switch from the old node to the new node. The old node is then removed. This process is executed sequentially if you have multiple nodes. We will always replace the standby first and then the primary. This means that there is only one switchover.
During the switch, if you are connected to the DB with an application, the connection will be terminated. All ongoing queries will be aborted. Inevitably, there will be some disruption. It is therefore recommended that the scaling is performed outside of peak times.
You can also increase the size of the storage. However, it is not possible to reduce the size of the storage, nor can you change the type of storage. Increasing the size is done on the fly and causes no disruption.
The `synchronization_mode` determines how transactions are replicated between multiple nodes before a transaction is confirmed to the client. IONOS DBaaS supports three modes of replication: Asynchronous (default), Synchronous, and Strict Synchronous. In all modes, the transaction is first committed on the leader and then replicated to the standby node(s).
Asynchronous replication does not wait for the standby before confirming a transaction back to the user. Transactions are confirmed to the client after being written to disk on the primary node. Replication takes place in the background. In asynchronous mode the cluster is allowed to lose some committed (not yet replicated) transactions during a failover to ensure availability.
The benefit of asynchronous replication is lower latency. The downside is that recent transactions might be lost if a standby is promoted to leader. The lag between the leader and standby tends to be a few milliseconds.
Caution: Data loss might happen if the server crashes and the data has not been replicated yet.
Synchronous replication ensures that a transaction is committed to at least one standby before confirming the transaction back to the client. This standby is known as the synchronous standby. If the primary node fails, only a synchronous standby can take over as primary, which ensures that committed transactions are not lost during a failover. If the synchronous standby fails and another standby is available, the synchronous standby role moves to the latter. If no standby is available, the primary can continue in standalone mode. In standalone mode, the primary role cannot change until at least one standby has caught up (regained the role of synchronous standby). Latency is generally higher than with asynchronous replication, but no data is lost during a failover.
At any time there will be at most one synchronous standby. If the synchronous standby fails then another healthy standby is automatically selected as the synchronous standby.
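To see which standby currently holds the synchronous role, you can query the standard `pg_stat_replication` view — a minimal sketch; note that some or all columns may be hidden without elevated privileges:

```sql
-- sync_state is 'sync' for the synchronous standby, 'async' otherwise
SELECT application_name, state, sync_state FROM pg_stat_replication;
```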
Caution: Turning on non-strict synchronous replication does not guarantee multi-node durability of commits under all circumstances. When no suitable standby is available, the primary node will still accept writes but does not guarantee their replication.
Strict synchronous replication is the same as synchronous replication with the exception that standalone mode is not permitted. This mode will prevent PostgreSQL from switching off the synchronous replication on the primary when no synchronous standby candidates are available. If no standby is available, no writes will be accepted anymore, so this mode sacrifices availability for replicated durability.
If replication mode is set to synchronous (either strict or non-strict) then data loss cannot occur during failovers (e.g. node failures). The benefit of strict replication is that data is not lost in case of a storage failure of the primary node and a simultaneous failure of all standby nodes.
Please note that synchronization modes can impact DBaaS in several ways:
The performance penalty of synchronous over asynchronous replication depends on the workload. The primary handles transactions the same way in all replication modes, with the exception of COMMIT statements (incl. implicit transactions). When synchronous replication is enabled, a commit can only be confirmed to the client once it is replicated. Thus, there is a constant latency overhead for each transaction, independent of the transaction's size and duration.
Caution: You cannot enforce a synchronous commit when the cluster is configured to use asynchronous replication. Without a synchronous standby, any setting higher than `local` is equivalent to `local`, which doesn't wait for replication to complete. Instead, you can configure your cluster to use synchronous replication and choose `synchronous_commit=local` whenever data loss is acceptable.
There are several PostgreSQL extensions preinstalled that you can enable for your cluster. You can enable an extension by logging into your cluster and executing:
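```sql
-- Example: enable the pg_trgm extension (any extension from the table
-- below that is not already enabled can be enabled the same way)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```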
The following table shows which extensions are enabled by default and which can be enabled (PostgreSQL version 12):
Extension | Enabled | Version | Description |
---|---|---|---|
plpython3u | X | 1.0 | PL/Python3U untrusted procedural language |
pg_stat_statements | X | 1.7 | track execution statistics of all SQL statements executed |
intarray | | 1.2 | functions, operators, and index support for 1-D arrays of integers |
pg_trgm | | 1.4 | text similarity measurement and index searching based on trigrams |
pg_cron | | 1.3 | Job scheduler for PostgreSQL |
set_user | | 3.0 | similar to SET ROLE but with added logging |
timescaledb | | 2.4.2 | Enables scalable inserts and complex queries for time-series data |
tablefunc | | 1.0 | functions that manipulate whole tables, including crosstab |
pg_auth_mon | X | 1.1 | monitor connection attempts per user |
plpgsql | X | 1.0 | PL/pgSQL procedural language |
pg_partman | | 4.5.1 | Extension to manage partitioned tables by time or ID |
hypopg | | 1.1.4 | Hypothetical indexes for PostgreSQL |
postgres_fdw | X | 1.0 | foreign-data wrapper for remote PostgreSQL servers |
btree_gin | | 1.3 | support for indexing common datatypes in GIN |
pg_stat_kcache | X | 2.2.0 | Kernel statistics gathering |
citext | | 1.6 | data type for case-insensitive character strings |
pgcrypto | | 1.3 | cryptographic functions |
earthdistance | | 1.1 | calculate great-circle distances on the surface of the Earth |
postgis | | 3.2.1 | PostGIS geometry and geography spatial types and functions |
cube | | 1.4 | data type for multidimensional cubes |
Note: With `select * from pg_available_extensions;` you will see more available extensions, but many of them can't be enabled or used without superuser rights and thus aren't listed here.
DBaaS for PostgreSQL is fully integrated into the DCD and has a dedicated API. You may also launch it via automation tools.
Compatibility: DBaaS gives you access to the capabilities of the PostgreSQL database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with DBaaS. IONOS Cloud currently supports PostgreSQL versions 12, 13, 14, and 15.
Deprecation Notice: Version 11 is currently still supported but will reach end of life on 9 Nov 2023. It will soon be removed from IONOS Cloud.
Locations: As of December 2022, DBaaS is offered in all IONOS Cloud Locations.
Scalable: Fully managed clusters that can be scaled on demand.
High availability: Multi-node clusters with automatic node failure handling.
Security: Communication between clients and the cluster is encrypted using TLS certificates from Let's Encrypt.
Upgrades: Customer-defined maintenance windows, with minimal disruption due to planned failover (approx. few seconds for multi-node clusters).
Backup: Base backups are carried out daily, with Point-in-Time recovery for one week.
Cloning: Customers also have the option to clone clusters via backups.
Restore: Databases can be restored in place or to a different target cluster.
Resources: Offered on Enterprise VM, with a dedicated CPU, storage, and RAM. Storage options are SSD or HDD, with SSD now including encryption-at-rest.
Network: DBaaS supports private LANs.
Extensions: DBaaS supports several PostgreSQL extensions.
Note: IONOS Cloud doesn’t allow superuser access for PostgreSQL services. However, most DBA-type actions are still available through other methods.
DBaaS services offered by IONOS Cloud:
Our platform is responsible for all back-end operations required to maintain your database in optimal operational health.
Database installation via the DCD or the DBaaS API.
Pre-set database configuration and configuration management options.
Automation of backups for a period of 7 days.
Regular patches and upgrades during maintenance.
Disaster recovery via automated backup.
Service monitoring: both for the database and the underlying infrastructure.
Customer database administration duties:
Tasks related to the optimal health of the database remain the responsibility of the customer. These include:
Optimisation.
Data organisation.
Creation of indexes.
Updating statistics.
Consultation of access plans to optimize queries.
To conserve disk space, log files are rotated according to size. Logs should not consume more than 175 MB of disk storage. The files are continuously monitored and log messages are shipped to a central storage location with a retention policy of 30 days.
Password encryption: Client libraries must support SCRAM-SHA-256 authentication. Make sure to use an up-to-date client library.
Certificates are issued for the DNS name of the cluster, which is assigned automatically during creation and will look similar to `pg-abc123.postgresql.de-txl.ionos.com`. It is available via the IONOS API as the `dnsName` property of the `cluster` resource.
Here is how to verify the certificate using the `psql` command line tool:
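A minimal sketch, assuming the sample DNS name from above, a database user `dbadmin` (placeholder), and the Let's Encrypt root certificate saved locally as `ca.crt`:

```bash
# verify-full checks the certificate chain and that the host name
# matches the certificate; it fails if the root certificate is missing
psql "host=pg-abc123.postgresql.de-txl.ionos.com dbname=postgres user=dbadmin sslmode=verify-full sslrootcert=ca.crt"
```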
Resource quotas: Each customer contract is allotted a resource quota. The available number of CPUs, RAM, storage, and database clusters is added to the default limitations for a VDC contract:
16 CPU Cores
32 GB RAM
1500 GB Disk Space
10 database clusters
5 nodes within a cluster
Additionally, a single instance of your database cluster cannot exceed 16 cores and 32 GB RAM.
Calculating RAM Requirements: The RAM size must be chosen carefully. 1 GB of RAM is reserved for OS system daemons. Additionally, internal services and tools use up to 500 MB of RAM. To choose a suitable RAM size, use the following formula:
ram_size = base_consumption + X * work_mem + shared_buffers
The `base_consumption`, including the reservation for internal services, is approximately 1500 MB. X is the number of parallel connections; `work_mem` is set to 8 MB by default. `shared_buffers` is set to about 15% of the total RAM.
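For example (illustrative figures only): for a 4 GB (4096 MB) instance, `shared_buffers` is roughly 0.15 × 4096 MB ≈ 614 MB. Together with the base consumption of about 1500 MB, this leaves 4096 − 1500 − 614 ≈ 1982 MB for connections, which at the default `work_mem` of 8 MB accommodates roughly 247 parallel connections.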
Calculating Disk Requirements:
The requested disk space is used to store all the data that Postgres is working with, incl. database logs and WAL segments. Each Postgres instance has its storage (of the configured size). The operating system and applications are kept separately (outside of the configured storage) and are managed by IONOS.
If the disk runs full, Postgres will reject write requests. Make sure that you order enough margin to keep the Postgres cluster operational. You can monitor the storage utilization in the DCD.
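For a rough check from within the database, you can also query the size of the current database — a sketch; note this reports table data only, not WAL segments or log files:

```sql
SELECT pg_size_pretty(pg_database_size(current_database()));
```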
WAL segments: In normal operation mode, older WAL files will be deleted once they have been replicated to the other instances and backed up to archive. If either of the two shipments is slow or failing then WAL files will be kept until the replicas and archive catch up again. Account for enough margin, especially for databases with high write load.
Log files: Database log files (175 MB) and auxiliary service log files (~100 MB) are stored on the same disk as the database.
CPU: The total upper limit for CPU cores depends on your quota. A single instance cannot exceed 16 cores.
RAM: The total upper limit for RAM depends on your quota. A single instance cannot exceed 32 GB.
Storage: The upper limit for storage size is 2 TB.
Backups: Storing cluster backups in an IONOS S3 Object Storage is limited to the last 7 days.
IP Ranges: The following IP ranges cannot be used with our PostgreSQL services:
10.208.0.0/12
10.233.0.0/18
192.168.230.0/24
10.233.64.0/18
Database instances are placed in the same location as your specified LAN, so network performance should be comparable to other machines in your LAN.
Estimates: A test with pgbench (scaling factor 1000, 20 connections, duration 300 seconds, not showing detailed logs) and a single small instance (2 cores, 3 GB RAM, 20 GB HDD) resulted in around 830 transactions per second (read and write mixed) and 1100 transactions per second (read-only). For a larger instance (4 cores, 8 GB RAM, 600 GB Premium SSD) the results were around 3400 (read and write) and 19000 (read-only) transactions per second. The database was initialized using `pgbench -i -s 1000 -h <ip> -U <username> <dbname>`. For benchmarking, the command line used was `pgbench -c 20 -T 300 -h <ip> -U <username> <dbname>` for the read/write tests, and `pgbench -c 20 -T 300 -S -h <ip> -U <username> <dbname>` for the read-only tests.
Note: To cite the pgbench documentation: "It is very easy to use pgbench to produce completely meaningless numbers". The numbers shown here are only ballpark figures and there are no performance guarantees. The real performance will vary depending on your workload, the IONOS location, and several other factors.
Aspect | Asynchronous | Synchronous |
---|---|---|
Primary failure | A healthy standby will be promoted if the primary node becomes unavailable. | Only standby nodes that contain all confirmed transactions can be promoted. |
Standby failure | No effect on the primary. The standby catches up once it is back online. | In strict mode, at least one standby must be available to accept write requests. In non-strict mode, the primary continues as standalone. There is a short delay in transaction processing if the synchronous standby changes. |
Consistency model | Strongly consistent (except for lost data). | Strongly consistent (except for lost data). |
Data loss during failover | Non-replicated data is lost. | Not possible. |
Data loss during primary storage failure | Non-replicated data is lost. | Non-replicated data is lost in standalone mode. |
Latency | Limited by the performance of the primary. | Limited by the performance of the primary, the synchronous standby, and the latency between them (usually below 1 ms). |
By default, the replication mode of the database cluster determines the guarantees of a committed transaction. However, some workloads have very diverse requirements regarding accepted data loss versus performance. To address this need, commit guarantees can be changed per transaction via the `synchronous_commit` setting.
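For example, a transaction whose data is acceptable to lose can lower its own commit guarantee (the table and column names here are illustrative):

```sql
BEGIN;
SET LOCAL synchronous_commit = 'off';  -- applies to this transaction only
INSERT INTO sensor_data (reading) VALUES (42);
COMMIT;
```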
Logs: The logs that are generated by a database are stored on the same disk as the database. We provide logs for connections, disconnections, waiting for locks, DDL statements, any statement that ran for at least 500 ms, and any statement that caused an error (see the PostgreSQL documentation). Currently, we do not provide an option to change this configuration.
Write-Ahead Logs: PostgreSQL uses write-ahead logging (WAL) for continuous archiving and point-in-time recovery. These logs are created in addition to the regular logs.
Every change to the database is recorded in the WAL records. WALs are generated along with daily base backups and offer a consistent snapshot of the database as it was at that time. WALs and backups are automatically deleted after 7 days, which is the earliest point in time you can recover from. See the Backups section below for more information.
Connection encryption: All client connections are encrypted using TLS; the default mode is `prefer` and clients cannot disable it. Server certificates are issued by Let's Encrypt, and the root certificate needs to be made available to the client for `verify-ca` and `verify-full` to function.
Connection Limits: The value for `max_connections` is calculated based on RAM size.
RAM size | max_connections |
---|---|
2GB | 128 |
3GB | 256 |
4GB | 384 |
5GB | 512 |
6GB | 640 |
7GB | 768 |
8GB | 896 |
> 8GB | 1000 |
The superuser needs to maintain the state and integrity of the database, which is why the platform reserves 11 connections for internal use: connections for superusers (see `superuser_reserved_connections`) and connections for replication.
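You can check the effective limit on a running cluster with a standard query:

```sql
SHOW max_connections;
```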
PostgreSQL Backups: A cluster can have multiple backups. They are created:
When a cluster is created
When the PostgreSQL version is changed to a higher major version
When a Point-in-Time Recovery operation is conducted
At any time, Postgres ships WAL records to only one backup. We use base backups combined with continuous WAL archiving. A base backup is taken regularly via `pg_basebackup`, and WAL records are then continuously added to the backup. Thus, a backup doesn't represent a single point in time but a time range. We keep backups for the last 7 days, so recovery is possible for up to one week in the past.
Data is added to the backup in chunks of 16MB or after 30 minutes, whichever comes first. Failures and delays in archiving do not prevent writes to the cluster. If you restore from a backup then only the data that is present in the backup will be restored. This means that you may lose up to the last 30 minutes or 16MB of data if all replicas lose their data at the same time.
You can restore from any backup of any PostgreSQL cluster as long as the backup was created with the same or an older PostgreSQL major version.
Backups are stored encrypted in an IONOS S3 Object Storage bucket in the same region your database is in. Databases in regions without IONOS S3 Object Storage will be backed up to `eu-central-2`.
Warning: When a database is stopped, all transactions since the last WAL segment are written to a (partial) WAL file and shipped to the IONOS S3 Object Storage. This also happens when you delete a database. We provide an additional security timeout of 5 minutes to stop and delete the database gracefully. However, under rare circumstances this last WAL segment might not be written to the IONOS S3 Object Storage (e.g. due to errors in the communication with the IONOS S3 Object Storage), and these transactions would be lost.
As an additional security mechanism, you can check which data has been backed up before deleting the database. To verify which WAL segment was archived last and at what time it was written, connect to the database and query the `pg_stat_archiver` view:
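```sql
-- Report the most recently archived WAL segment and when it was archived
SELECT last_archived_wal, last_archived_time FROM pg_stat_archiver;
```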
The `last_archived_time` might be older than 30 minutes (WAL files are created with a specific timeout, see above), which is normal if no new data is being added.
We provide Point-in-Time Recovery (PITR). When recovering from a backup, the user chooses a specific backup and optionally provides a time, so that the new cluster will have all the data from the old cluster up until that time (exclusive). If no time is provided, the current time is used.
It is possible to set the `recoveryTargetTime` to a time in the future. If the end of the backup is reached before the recovery target time is met, then the recovery will complete with the latest data available.
Note: WAL records shipping is a continuous process and the backup is continuously catching up with the workload. Should you require that all the data from the old cluster is completely available in the new cluster, stop the workload before recovery.
Planned failover: During a failure or planned failover, the client must reconnect to the database. A planned failover is signaled to the client by the closing of the TCP connection on the server. The client must also close the connection and reconnect.
In the event of a failure, the connection might not be closed correctly. The new leader will send a gratuitous ARP packet to update the MAC address in the client's ARP table. Open TCP connections will be reset once the client sends a TCP packet. We recommend re-establishing a connection to the database by using an exponential back-off retry with an initial immediate retry.
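A minimal reconnect sketch in shell, with an immediate first retry and an exponential back-off capped at 60 seconds (the connection parameters are placeholders):

```bash
delay=0
until psql "host=pg-abc123.postgresql.de-txl.ionos.com dbname=postgres user=dbadmin" \
      -c 'SELECT 1' >/dev/null 2>&1; do
  sleep "$delay"                                          # first retry is immediate
  if [ "$delay" -eq 0 ]; then delay=1; else delay=$((delay * 2)); fi
  [ "$delay" -gt 60 ] && delay=60                         # cap the back-off
done
```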
Uncontrolled disconnection: Since we do not allow read connections to standby nodes, only primary disconnections are possible. However, uncontrolled disconnections can happen during maintenance windows, a cluster change, and during unexpected situations such as loss of storage disk space. Such disconnections abort ongoing transactions, and clients should reconnect.
If a node is disconnected from the cluster, then a new node will be created and provisioned. Losing a primary node leads to the same situation when a client should reconnect. Losing a replica is not noticeable to the customer.
IONOS Cloud updates and patches your database cluster to achieve high standards of functionality and security. This includes minor patches for PostgreSQL, as well as patches for the underlying OS. We try to make these updates unnoticeable to your operation. However, occasionally, we might have to restart your PostgreSQL instance to allow the changes to take effect. These interruptions will only occur during the maintenance window for your database, which is a weekly four-hour window.
When your cluster contains only one replica, you might experience a short downtime during this maintenance window while your database instance is being updated. In a replicated cluster, we only update standbys, but we might perform a switchover in order to change the leader node.
Considerations: Updates to a new minor version are always backward compatible. Such updates are done during the maintenance window with no additional actions from the user side.
Major Version Upgrades
Caution: Major changes of the PostgreSQL version are irreversible and can fail. You should read the official migration guide and test major version upgrades with an appropriate development cluster first.
Prerequisites:
Read the migration guide from Postgres (e.g. to version 13) and make sure your database cluster can be upgraded
Test the upgrade on a development cluster with similar or the same data (you can create a new database cluster as a clone of your existing cluster)
Prepare for downtime during the major version upgrade
Ensure the database cluster has enough available storage. While the upgrade is space-efficient (i.e. it does not copy the data directory), some temporary data is written to disk.
Before upgrading PostgreSQL major versions, customers should be aware that IONOS Cloud is not responsible for customer data or any utilized PostgreSQL functionality. Hence, it is the responsibility of the customer to ensure that the migration to a new PostgreSQL major version does not impact their operations.
As per PostgreSQL official documentation: "New major versions also typically introduce some user-visible incompatibilities, so application programming changes might be required."
Starting with version 10, PostgreSQL moved to a yearly release schedule, where each major version is supported for 5 years after initial release. You can find more details at https://www.postgresql.org/support/versioning/. We strive to support new versions as soon as possible.
When a major version approaches its end of life (EOL), we will announce the deprecation and removal of the version at least 3 months in advance. About 1 month before the EOL, no new database can be created with the deprecated version (the exact date will be part of the first announcement). When the EOL is reached, not yet upgraded databases will be upgraded in their next maintenance window.