
I have developed a web application using Laravel and PostgreSQL as the database.

The main function of my application is to process large volumes of data from the database, with tables ranging from a few thousand records to 13 million in some cases. Essentially, it is a data analytics application where I have several workers simultaneously processing the same task. The number of workers varies between 1 (minimum) and 4 (maximum).

Observing the processes both on my local development machine with Docker and in production, the most demanding ones are the PostgreSQL backend processes: my application's workers stay at around 5% CPU usage, while the PostgreSQL processes reach approximately 80%.

It is clear that the main bottleneck in my service is PostgreSQL, since it executes many queries, some of them quite complex. Additionally, it is important to note that, in principle, both my application's code and the SQL queries are correctly optimized.
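To double-check that claim, one comparison I can run in both environments is the same query under `EXPLAIN (ANALYZE, BUFFERS)` and diff the plans and buffer counts (the database and table names below are placeholders, not my real schema):

```shell
# Sketch: compare the same query plan locally and in production.
# "shared hit" means the page came from PostgreSQL's buffer cache;
# "read" means it had to come from the OS / disk.
psql -d my_database -c "EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM my_big_table;"
```

If production shows far more `read` than `hit` for the same query, the difference would point at caching/disk rather than the query itself.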

The problem began when I deployed the service on a VPS with 4 vCores of CPU, 4 GB of RAM, and 4 GB of swap. It is a VPS with KVM virtualization running Ubuntu 24.04.

Locally, I use this project with Docker in two Ubuntu 22.04 containers for Laravel and PostgreSQL. When I run the workers of my service’s main process with the default configuration in the Docker containers, the processing speed is acceptable. However, when I run the same processes in production, the speed is 2 to 4 times slower than in the local environment.

For example, if each worker in my application can process 1,000 elements every 5 seconds locally, in production, it processes only 200 to 300 elements in the same timeframe.

My local machine is a MacBook Air M1 with 16 GB of RAM. The issue is that I don’t see RAM usage as a real limiting factor in production since the memory is not overloaded on the production machine. I believe the real bottleneck is CPU performance because, even using Docker virtualization on my local machine, it is still faster than my production server.

In production I ran a CPU benchmark with `sysbench --test=cpu run`, and its result was:

```
CPU speed:
    events per second:  995.85

General statistics:
    total time:              10.0007s
    total number of events:  9961

Latency (ms):
    min:              0.94
    avg:              1.00
    max:              2.02
    95th percentile:  1.25
    sum:              9990.51

Threads fairness:
    events (avg/stddev):          9961.0000/0.00
    execution time (avg/stddev):  9.9905/0.00
```

My main question is: What could be causing the production performance to be significantly worse than on my development machine using Docker?

Could it be that the VPS hosting does not meet the CPU and disk performance expectations I had, despite being a production machine that, in principle, should have enough capacity to run my service?
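To test the disk side of that question, I could also benchmark I/O on the VPS with sysbench's `fileio` test, something along these lines (the 2 GB file size is arbitrary; it should exceed RAM to avoid measuring only the page cache):

```shell
# Sketch: random read/write disk benchmark in a scratch directory.
sysbench fileio --file-total-size=2G prepare
sysbench fileio --file-total-size=2G --file-test-mode=rndrw run
sysbench fileio --file-total-size=2G cleanup
```

Running the same test locally would show whether the VPS storage is meaningfully slower than my machine's NVMe.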

Could it be due to a misconfiguration of PostgreSQL in production? I have already tested multiple configurations, increasing parallelization and allowing PostgreSQL to use more resources, but I have not achieved significant improvements.
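For context, the kind of values I have been experimenting with in `postgresql.conf` looks like this (illustrative starting points for a 4 GB machine, not a definitive configuration):

```
# Illustrative values for a 4 GB RAM VPS; adjust to the actual workload.
shared_buffers = 1GB           # ~25% of RAM
effective_cache_size = 3GB     # planner hint: RAM available for caching
work_mem = 16MB                # per sort/hash operation, per backend
maintenance_work_mem = 256MB
random_page_cost = 1.1         # assumes SSD/NVMe-backed storage
```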

I am open to any questions if further information is needed to evaluate the case.

Thank you.

Comments:

  • You get what you pay for. Factors such as caching mean there is a very good chance the data are in RAM on your local computer, but on an (overcommitted) VM host, they are probably on disk, which is not any faster than the NVMe in your computer. A low-end VM is not a good match for data processing with large data sets. Commented Feb 18 at 14:10
  • This is most likely a memory speed issue. Memory-heavy applications that do a lot of processing on the data really need fast memory and a large, fast cache. These are typically the things VPSs struggle with. Commented Feb 18 at 14:48

