
We run a small MongoDB replica set on three bare-metal servers (no virtualization, no Docker/Kubernetes) with Debian 11 and MongoDB 5.0.6:

machineA: 128GB RAM, 1TB disk, PRIMARY
machineB: 128GB RAM, 1TB disk, SECONDARY
machineC: 8GB RAM, 20GB disk, ARBITER

All of a sudden we experienced outages, with errors in our application log such as "NotWritablePrimary"/"MongoNotPrimaryException". We had assumed that our connection string would make sure that no outage occurs:

mongodb://machineA:27017,machineB:27017/?replicaSet=MyRepl&waitQueueMultiple=10&readPreference=primaryPreferred 

It turned out that the PRIMARY mongod instance was killed by the Linux kernel because it was consuming too much RAM. The replica set had been running for 3 months without any problem, but all of a sudden I see RAM consumption like this:

[image: RAM consumption]

Suddenly, RAM usage by mongod became massive:

[image: RAM usage by mongod]

Right after the kernel killed the mongod process, it was restarted by systemd, as it runs as a service. But right after the restart it again consumed the maximum amount of RAM until it was killed again.
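One way to confirm a kill/restart loop like this is to look for OOM-killer entries in the kernel log. The snippet below is only a minimal sketch: the two sample lines are made up for illustration (they are not from our servers) and merely follow the "Killed process <pid> (<name>)" wording that recent Linux kernels print, and `find_oom_kills` is a hypothetical helper name.

```python
import re

# Matches the OOM-killer summary line, e.g.
# "Out of memory: Killed process 4242 (mongod) total-vm:...kB, anon-rss:...kB, ..."
OOM_KILL = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\).*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(lines, process_name="mongod"):
    """Return (pid, anon_rss_kb) for every OOM kill of process_name."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match and match.group("name") == process_name:
            kills.append((int(match.group("pid")), int(match.group("anon_rss_kb"))))
    return kills


# Hypothetical sample lines, for illustration only.
sample_log = [
    "Out of memory: Killed process 4242 (mongod) total-vm:140000000kB, "
    "anon-rss:130000000kB, file-rss:0kB, shmem-rss:0kB, UID:110 "
    "pgtables:260000kB oom_score_adj:0",
    "Out of memory: Killed process 777 (java) total-vm:9000000kB, "
    "anon-rss:8000000kB, file-rss:0kB, shmem-rss:0kB, UID:1000 "
    "pgtables:20000kB oom_score_adj:0",
]

for pid, rss_kb in find_oom_kills(sample_log):
    print(f"mongod pid {pid} killed at ~{rss_kb / 1024 ** 2:.1f} GiB anonymous RSS")
```

On a real machine the input lines would come from the kernel log, for example the output of `dmesg` or `journalctl -k`.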

All of a sudden this behaviour stopped this morning. We did not change anything in our application, so the question now is: what eats so much RAM in the mongod process?

As far as I know, the WiredTiger cache uses roughly 50% of the available RAM (by default, 50% of (RAM - 1 GB)), but that wouldn't explain the process using up the total RAM of the machine. I also have some metrics from Percona mongodb_exporter, which show that the RAM is used by mongod and by no other process on the system:

[image: memory metrics from mongodb_exporter]
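To put a number on the WiredTiger expectation: the documented default for the internal cache is the larger of 50% of (RAM - 1 GiB) and 256 MiB. The sketch below applies that rule to our two data-bearing machines, assuming the 128GB figure means 128 GiB and that no explicit cache size is configured; the function name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Default internal cache size: max(50% of (RAM - 1 GiB), 256 MiB)."""
    return max((total_ram_bytes - GIB) // 2, 256 * MIB)


# Assumption: "128GB" in the hardware list is read as 128 GiB.
for name, ram_gib in [("machineA", 128), ("machineB", 128)]:
    cache = default_wiredtiger_cache_bytes(ram_gib * GIB)
    print(f"{name}: default cache ~{cache / GIB:.1f} GiB of {ram_gib} GiB")
# machineA: default cache ~63.5 GiB of 128 GiB
# machineB: default cache ~63.5 GiB of 128 GiB
```

So anything much above roughly 64 GiB of resident memory has to come from outside the WiredTiger cache.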

Interestingly, the memory usage of the SECONDARY did not move at all during that time:

[image: memory usage of the SECONDARY]

Does anyone have an idea or a hint about what is going on here?

1 Answer

We found out that one of our app services was running wild under certain circumstances, which was a bit hard for us to spot.

When MongoDB is hammered constantly, its memory usage seems to climb higher and higher, rather than CPU usage going up as I would have expected. At a certain point the mongod process was killed by the Linux kernel.

After we fixed the issue in our application, the problem went away.
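For anyone trying to spot this kind of load from the database side, a rough sketch is to compare mongod's resident memory with what the WiredTiger cache accounts for, together with the connection count. The field names below are the ones `serverStatus` reports; the sample document and its numbers are made up for illustration, and `summarize_memory` is a hypothetical helper.

```python
from typing import Any, Mapping

MIB = 1024 ** 2


def summarize_memory(status: Mapping[str, Any]) -> dict:
    """Split resident memory into WiredTiger cache and everything else."""
    cache = status["wiredTiger"]["cache"]
    resident_mib = status["mem"]["resident"]  # serverStatus reports this in MiB
    cache_used_mib = cache["bytes currently in the cache"] / MIB
    return {
        "connections": status["connections"]["current"],
        "resident_mib": resident_mib,
        "cache_used_mib": round(cache_used_mib),
        "cache_max_mib": round(cache["maximum bytes configured"] / MIB),
        "outside_cache_mib": round(resident_mib - cache_used_mib),
    }


# Made-up serverStatus excerpt, for illustration only.
sample_status = {
    "connections": {"current": 4800},
    "mem": {"resident": 120000},
    "wiredTiger": {
        "cache": {
            "bytes currently in the cache": 61440 * MIB,
            "maximum bytes configured": 65024 * MIB,
        }
    },
}

print(summarize_memory(sample_status))
```

With pymongo, the real document would come from `MongoClient(...).admin.command("serverStatus")`. A large and growing `outside_cache_mib` alongside a rising connection count would point at client load rather than at the storage engine cache.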

  • How did you find the problem in the application? Commented May 9, 2023 at 16:45
  • @undefine We use JProfiler for this. Commented May 12, 2023 at 7:24
