🏗️Improve database performance with connection pooling and load balancing techniques

🏗 Improve database performance with connection pooling and load balancing 🎢 A complete review of load balancing solutions in popular RDBMS We tend to rely on caching solutions to improve database performance. Caching frequently- accessed queries in memory or via a database can optimize write/read performance and reduce network latency, especially for heavy-workload applications, such as gaming services and Q&A portals. But you can further improve performance by pooling users' connections to a database. Client users need to create a connection to a web service before they can perform CRUD operations. Most web services are backed by relational database servers such as Postgres or MySQL. With PostgreSQL, each new connection can take up to 1.3MB in memory. In a production environment where we expect to receive thousands or millions of concurrent connections to the backend service, this can quickly exceed your memory resources (or if you have a scalable cloud, it can get very expensive very quickly). Because each time a client attempts to access a backend service, it requires OS resources to create, maintain, and close connections to the datastore. This creates a large amount of overhead causing database performance to deteriorate. Consumers of your service expect fast response times. If that performance deteriorates, it can lead to poor user experiences, revenue losses, and even unscheduled downtime. If you expose your backend service as an API, repeated slowdowns and failures could cause cascading problems and lose you customers. Instead of opening and closing connections for every request, connection pooling uses a cache of database connections that can be reused when future requests to the database are required. It lets your database scale e ff ectively as the data stored there and the number of clients accessing it grow. Tra ffi c is never constant, so pooling can better manage tra ffi c peaks without causing outages. Your production database shouldn’t be your bottleneck. Oracle load balancing & Pooling solutions: Oracle Real application clustering (RAC) When using RAC you can con fi gure SCAN-IP to achieve load balancing. But cause of some applications behavior and existing limitation, you can not always using Scan ip, so you have to create services to minimize tra ffi c on interconnection, in this case you lost load balancing feature of Rac and use scalability and high availability of Rac. 1-Oracle Tra ffi c Director Manager(TDM) Oracle Tra ffi c Director is a fast, reliable, and scalable layer-7 software load balancer. You can set up Oracle Tra ffi c Director to serve as the reliable entry point for all HTTP, HTTPS and TCP tra ffi c to application servers and web servers in the back end. Depending on the needs of your IT environment, you can con fi gure Oracle Tra ffi c Director to apply multiple, complex rules when distributing requests to the back-end servers and when forwarding responses to clients. Oracle Tra ffi c Director distributes the requests that it receives from clients to servers in the back end based on the speci fi ed load-balancing method, routes the requests based on speci fi ed rules, caches frequently accessed data, prioritizes tra ffi c, and controls the quality of service.

On engineered systems platforms, you can set up pairs of Oracle Tra ffi c Director instances and leverage its built-in High Availability capability to setup either Active-Passive or Active-Active failover. As the volume of tra ffi c to your network grows, you can easily scale the environment by recon fi guring Oracle Tra ffi c Director with additional back-end servers to which it can route requests. Oracle Tra ffi c Director provides the following features: • Advanced methods for load distribution Con fi gure Oracle Tra ffi c Director to distribute client requests to servers in the back-end using one of these methods: ◦ Round robin ◦ Least connection count ◦ Least response time ◦ Weighted round robin ◦ Weighted least connection count • Flexible routing and load control on back-end servers ◦ Request-based routing ◦ Content-based routing ◦ Request rate acceleration ◦ Connection limiting • Controlling the request load and quality of service ◦ Request rate limiting ◦ Quality of service tuning • Support for WebSocket connections • Integration with Oracle Fusion Middleware • Easy-to-use administration interfaces • Security Oracle Tra ffi c Director enables and enhances security for your IT infrastructure in the following ways: ◦ Reverse proxy ◦ Support for TLS 1.0, 1.1, and 1.2 ◦ Web Application Firewall ◦ HTTP Forward Proxy Support in Origin Server Pools • High availability Oracle Tra ffi c Director provides high availability for your enterprise applications and services through the following mechanisms: ◦ Health checks for the back end ◦ Backup servers in the back end ◦ Failover for load balancing ◦ Dynamic recon fi guration • Monitoring statistics Administrators can monitor a wide range of statistics pertaining to the performance of Oracle Tra ffi c Director instances through several methods: the administration console, the command- line interface, and a report in XML format. • High performance ◦ SSL/TLS o ffl oading ◦ Content caching ✦ HTTP compression Oracle Tra ffi c Director Terminology An Oracle Tra ffi c Director con fi guration is a collection of elements that de fi ne the run-time behavior of an Oracle Tra ffi c Director instance. An Oracle Tra ffi c Director con fi guration contains information about various elements of an Oracle Tra ffi c Director instance such as listeners, origin servers, failover groups, and logs.

2-Orcale Connection Manager (CMAN) CMAN is one of the useful software that help you achieve connectivity management easily. Oracle Connection Manager (CMAN) is a multi-purpose database connection proxy server used in Oracle deployments. CMAN is available as part of the Oracle Database Enterprise Edition (EE) client package. CMAN in Tra ffi c Director Mode (TDM), or CMAN-TDM, brings in intelligence to manage incoming client connections through session multiplexing and dynamic load balancing that can optimize the usage of database resources. Session multiplexing can be handled e ffi ciently through the Proxy Resident Connection Pooling (PRCP) feature in CMAN-TDM. When PRCP is enabled on CMAN-TDM, connection pool(s) are created on the CMAN-TDM server that enable mid-tier clients to share connections to the database. Oracle Database 21c and beyond supports the multi-tenant architecture model with multiple Pluggable Databases (PDBs) and a single Container Database (CDB). Until recently, PRCP pools can be created for each database service. So if a PDB database instance has multiple services, then a PRCP pool is created for each of the services. Starting from Oracle EE 23c version, CMAN-TDM provides a per-PDB PRCP option as well. With this feature, a PRCP pool will be created at the PDB level. This per-PDB PRCP pool will be shared across multiple services associated with the same PDB. Sample per-PDB PRCP architecture To enable per-PDB PRCP, the user has to set the PRCP connection factor (TDM_PERPDB_PRCP_CONNFACTOR) parameter to a non-zero positive value in cman.ora (The con fi guration fi le for CMAN). The per-PDB PRCP feature dynamically con fi gures the PRCP pool, by adjusting the maximum size based on the Oracle Compute Unit (OCPU) — an internal parameter — and the PRCP connection factor parameter at the PDB level. CMAN-TDM periodically checks the OCPU value of

each PDB and recon fi gures the per-PDB PRCP pool size, if required. The user can further tune the pool sizing by tweaking the PRCP connection factor to a higher or lower value. You can also view the per-PDB PRCP performance statistics by setting another cman.ora parameter TDM_STATS_FREQUENCYto a non-zero positive value. This will add the statistics in the newly created V$TDM_STATS dynamic view. In essence, consolidation of PRCP pools per PDB minimizes the fragmentation of session pools across multiple services and automatically applies the sizing for session pools for the whole PDB in an e ff ective manner, leading to better resource management on CMAN-TDM. What’s more! you can monitor the connection performance with the per-PDB PRCP statistics available in dynamic views as well! Pretty cool! For advanced connection pooling and security enhancements you can use F5 applience that provide a hardware load balancing with many security options. For web applications using weblogic that can o ff er HA and scalability features and you can use Tra ffi c director manager TDM also. Mariadb/MySQL Connection Pooling & load balancing solutions: 1-MariaDB MaxScale MariaDB MaxScale is a database proxy that extends the high availability, scalability, and security of MariaDB Server while at the same time simplifying application development by decoupling it from underlying database infrastructure. MariaDB MaxScale is engineered with an extensible architecture to support plugins, extending its functionality beyond transparent load balancing to become, for example, a database fi rewall. With built-in plugins for multiple routers, fi lters and protocols, MariaDB MaxScale can be con fi gured to forward database requests and modify database responses based on business and technical requirements. 2-ProxySQL Before ProxySQL, a database administrator only had a limited arsenal of tricks to manage “bad queries” in a live database environment, or when needing to perform a failover with minimal impact to an application in production. Now, thanks to ProxySQL, the admin has much more control over database tra ffi c and queries that are issued against the databases. ProxySQL became a production-ready in late 2015. It was created by René Cannaò, himself a DBA, to solve problems that a typical DBA (or ops person) would face while working with a large, sharded MySQL environment. ProxySQL provides a number of bene fi ts: • It provides ‘intelligent’ load balancing of application requests onto multiple databases • It understands the MySQL tra ffi c that passes through it, and can split reads from writes. This is especially useful in a Master-Slave replication setup, when writes should only go to the master. • It understands the underlying database topology, whether the instances are up or down, and therefore can route requests to healthy databases • It shields applications from the complexity of the underlying database topology, as well as any changes to it • It provides query workload analytics, which is useful when analyzing and improving performance • It provides administrators with powerful control mechanisms to cache queries at the proxy layer for faster query response, re-route queries, or even re-write a badly written query • It empowers the administrator to maintain e ffi cient operations of the overall infrastructure

3-Galera cluster plus NGINX or Haproxy Galera Cluster: Galera Cluster for MySQL/MariaDB is a true Multi-Master Cluster based on synchronous replication. It’s an easy-to-use, high-availability solution, which provides high system up-time, no data loss and scalability for future growth. MariaDB with Galera and HAProxy is a great option if you need to scale your MariaDB or even MySQL backend. With a minimum of the 3 nodes we set up above, a single node can go down and your application and users won’t ever notice it. Galera o ff ers great protection against data loss, inconsistent databases and future scalability issues. Haproxy: High Availability Proxy, or HAProxy is a single-threaded event-driven non-blocking engine that combines a fast I/O layer with a priority-based scheduler. You can use it to balance TCP connections between application servers and Galera Cluster. Haproxy can integrated with KeepAlive service to achieve HA solutions. Also you can use mysql max scale solutions. Nginx: Nginx is well-known for its ability to act as a reverse-proxy with small memory footprint. It usually sits in the front-end web tier to redirect connections to available backend services, provided these passed some health checks. Using a reverse-proxy is common when you are running a critical application or service that requires high availability. It also distributes the load equally among the backend services. Over the last few years, a few products have tried to tackle the read/write split challenge. The MySQL_proxy was the fi rst attempt I am aware of at solving this problem but it ended up with many limitations. ScaleARC does a much better job and is very usable but it stills has some limitations. The latest contender is MaxScale from MariaDB. 4-MySQL Router MySQL Router is lightweight middleware that provides transparent routing between your application and any backend MySQL Servers. It can be used for a wide variety of use cases, such as providing high availability and scalability by e ff ectively routing database tra ffi c to appropriate backend MySQL Servers. The pluggable architecture also enables developers to extend MySQL Router for custom use cases. MySQL Router Features: Failover Typically, a highly available MySQL setup consists of a single primary and multiple replicas and it is up to the application to handle failover, in case the MySQL primary becomes unavailable. Using MySQL Router, application connections will be transparently routed based on load balancing policy, without implementing custom application code.

Load Balancing MySQL Router provides additional scalability and performance by distributing database connections across a pool of servers. For example, if you have a replicated set of MySQL Servers, MySQL Router can distribute application connections to them in a round-robin fashion. Pluggable Architecture MySQL Router's pluggable architecture allows MySQL developers to easily extend the product with additional features, as well as providing MySQL users with the ability to create their own custom plugins providing endless possibilities • Connection Routing plugin which does connection-based routing, meaning that it forwards the MySQL packets to the backend server without inspecting or modifying them, thus providing maximum throughput. • The Metadata Cache plugin, which provides transparent client load balancing, routing, and failover into Group Replication and InnoDB Clusters. 5-MySQL HeatWave Read Replicas with Load Balancer The MySQL HeatWave Database Service added new capabilities that enable applications to scale database read operations to millions of queries per second. More than 5 million queries per second in OCI MySQL HeatWave using 16 replicas with the shape AMD E4 16 OCPU 256 GB RAM. Users can add and remove read replicas and load balancers to DB Systems with just a few clicks, and the service takes care of deployment, con fi guration, monitoring, and maintenance. Applications can connect directly to read replicas or to a read-only endpoint that automatically balances the queries across all the replicas.

Postgres solution for connection pooling & load balancing Two ways to integrate a connection pooler There are two ways of implementing connection pooling for PostgreSQL application: 1-As an external service or middleware such as pgbouncer or pgpool Connection poolers such as pgbouncer and pgpool-II can be used to pool connections from clients to a PostgreSQL database. The connection pooler sits in between the application and the database server. Pgbouncer or pgpool-II can be con fi gured in a way to relay requests from the application to the database server. 2-Client-side libraries such as c3p0 There exist libraries such as c3p0 which extend database driver functionality to include connection pooling support. However, the best way to implement connection pooling for applications is to make use of an external service or middleware since it is easier to set up and manage. In addition external middleware like pgpool2 provides other features such as load balancing apart from pooling connections. Scaling database performance with pgbouncer Pgbouncer comes with three types of pooling: 1. Session pooling: One of the connections in the pool is assigned to a client until the timeout is reached. 2. Transaction pooling: Similar to session polling, it gets a connection from the pool. It keeps it until the transaction is done. If the same client wants to run another transaction, it has to wait until it gets another transaction assigned to it. 3. Statement pooling: Connection is returned to the pool as soon as the fi rst query is completed. Pgpool-II Pgpool-II is a proxy software that sits between PostgreSQL servers and a PostgreSQL database client. It provides the following features:

Connection Pooling Pgpool-II maintains established connections to the PostgreSQL servers, and reuses them whenever a new connection with the same properties (i.e. user name, database, protocol version, and other connection parameters if any) comes in. It reduces the connection overhead, and improves system's overall throughput. Load Balancing If a database is replicated (because running in either replication mode or native replication mode), performing a SELECT query on any server will return the same result. Pgpool-II takes advantage of the replication feature in order to reduce the load on each PostgreSQL server. It does that by distributing SELECT queries among available servers, improving the system's overall throughput. In an ideal scenario, read performance could improve proportionally to the number of PostgreSQL servers. Load balancing works best in a scenario where there are a lot of users executing many read-only queries at the same time. Automated fail over If one of the database servers goes down or becomes unreachable, Pgpool-II will detach it and will continue operations by using the rest of database servers. There are some sophisticated features that help the automated failover including timeouts and retries. Online Recovery Pgpool-II can perform online recovery of database node by executing one command. When the online recovery is used with the automated fail over, a detached node by fail over is possible to attach as standby node automatically. It is possible to synchronize and attach new PostgreSQL server too. Replication Pgpool-II can manage multiple PostgreSQL servers. Activating the replication feature makes it possible to create a real time backup on 2 or more PostgreSQL clusters, so that the service can continue without interruption if one of those clusters fails. Pgpool-II has built-in replication (native replication). However user can use external replication features including streaming replication of PostgreSQL. Limiting Exceeding Connections There is a limit on the maximum number of concurrent connections with PostgreSQL, and new connections are rejected when this number is reached. Raising this maximum number of connections, however, increases resource consumption and has a negative impact on overall system performance. Pgpool-II also has a limit on the maximum number of connections, but extra connections will be queued instead of returning an error immediately. However, you can con fi gure to return an error when the connection limit is exceeded (4.1 or later). Watchdog Watchdog can coordinate multiple Pgpool-II, create a robust cluster system and avoid the single point of failure or split brain. To avoid the split brain, you need at least 3 Pgpool-II nodes. Watchdog can perform lifecheck against other pgpool-II nodes, to detect a fault of Pgpool-II. If active Pgpool-II goes down, standby Pgpool-II can be promoted to active, and take over Virtual IP. In Memory Query Cache In memory query cache allows to save a pair of SELECT statement and its result. If an identical SELECTs comes in, Pgpool-II returns the value from cache. Since no SQL parsing nor access to PostgreSQL are involved, using in memory cache is extremely fast. On the other hand, it might be slower than the normal path in some cases, because it adds some overhead of storing cache data.

Choosing a connection pooler: pgpool-II or pgbouncer? There are several factors to consider when choosing a connection pooler to use. Although pgbouncer and pgpool-II are great solutions for connection pooling, each tool has its strengths and weaknesses. Memory/resource consumption If you are interested in a lightweight connection pooler for your backend service, then pgbouncer is the right tool for you. Unlike pgpool-II, which by default allows 32 child processes to be forked, pgbouncer uses only one process. Thus pgbouncer consumes less memory than pgpool2. Streaming Replication Apart from pooling connections, you can also manage your Postgres cluster with streaming replication using pgpool-II. Streaming replication copies data from a primary node to a secondary node. Pgpool-II supports Postgres streaming replication, while pgbouncer does not. It is the best way to achieve high availability and prevent data loss. Centralized password management In a production environment where you expect many clients/applications to connect to the database through a connection pooler concurrently, it is necessary to use a centralized password management system to manage clients' credentials. You can make use of auth_query in pgbouncer to load clients’ credentials from the database instead of storing clients’ credentials in a userlist.txt fi le and comparing credentials from the connection string against the userlist.txt fi le. Load balancing and high availability Finally, if you want to add load balancing and high availability to your pooled connections, then pgpool2 is the right tool to use. pgpool2 supports Postgres high availability through the in-built watchdog processes. This pgpool2 sub-process monitors the health of pgpool2 nodes participating in the watchdog cluster as well as coordinating between multiple pgpool2 nodes. Conclusion Database performance can be improved beyond connection pooling. Replication, load balancing, and in-memory caching can contribute to e ffi cient database performance. If a web service is designed to make a lot of read and write queries to a database, then you have multiple instances of a Postgres database in place to take care of write queries from clients through a load balancer such as pgpool-II while in-memory caching can be used to optimize read queries. Despite the pgpool-II ability to function as a loader balancer and connection pooler, pgbouncer is the preferred middleware solution for connection pooling because it is easy to set up, not too di ffi cult to manage, and primarily serves as a connection pooler without any other functions. When use Oracle you have this solution: Setup oracle RAC with con fi g Scan-ip and using new features such as TAF, FAN, and specially TAC(Transparent Application Continuity). Another integrated solution is using CMAN. If thses options can not achieve your load balancing requirements then use F5 appliences that give you hardware load balancing and security options. For Orcale environment when you are challenging on web applications, use Weblogic or tra ffi c director software to achieve connection pooling, load balancing.

Best Regards, Alireza Kamrani Senior RDBMS Consultant.

🏗️Improve database performance with connection pooling and load balancing techniques

More Related Content

Similar to 🏗️Improve database performance with connection pooling and load balancing techniques

More from Alireza Kamrani

Recently uploaded

🏗️Improve database performance with connection pooling and load balancing techniques