I'm running a service on a GCP VM that intermittently shows very low CPU performance. Commands like grep and tail on text files, which normally return instantly, often take up to a minute to start producing output.
This has happened 5 or 6 times over the last month and causes severe instability on our end that impacts customers. I cannot overstate the gravity of this matter.
Some details that may be relevant:
- Machine is a C2D HighCPU 16 instance type (AMD EPYC Milan)
- Machine is located in europe-west1-b
- Image is an Ubuntu 18 32-bit image. It has no services, but I do need a 32-bit image
- Sometimes (especially during these issues), a single core is at 100% usage, but no process shows up as consuming it. I have attached screenshots
Usage spike: the core showing the spike changes every few seconds. No process has more than 25% CPU usage in htop, but one core is at 100%.
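In case it helps anyone reproduce or compare, here is a minimal sketch of how the per-core time could be broken down by category using /proc/stat. My understanding is that htop's per-process figures cover user and system time, so time spent in steal, irq, softirq, or iowait might show up as a busy core with no process behind it. The function names (`parse_proc_stat`, `breakdown`) and the one-second sampling interval are my own choices for illustration, and this assumes a Linux kernel that reports at least the first eight /proc/stat fields.

```python
import time

# Column order of the per-CPU lines in /proc/stat (first eight fields).
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_proc_stat(text):
    """Return {cpu_name: {field: ticks}} for each per-core line (cpu0, cpu1, ...)."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines are "cpu" followed by digits; skip the aggregate "cpu" line.
        if parts and parts[0].startswith("cpu") and parts[0][3:].isdigit():
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            cores[parts[0]] = dict(zip(FIELDS, values))
    return cores


def breakdown(before, after):
    """Percentage of elapsed ticks spent in each category, per core."""
    result = {}
    for cpu, end in after.items():
        start = before.get(cpu)
        if start is None:
            continue
        deltas = {f: end.get(f, 0) - start.get(f, 0) for f in FIELDS}
        total = sum(deltas.values())
        if total > 0:
            result[cpu] = {f: 100.0 * d / total for f, d in deltas.items()}
    return result


def main(interval=1.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and print the per-core split."""
    try:
        with open(path) as fh:
            first = parse_proc_stat(fh.read())
        time.sleep(interval)
        with open(path) as fh:
            second = parse_proc_stat(fh.read())
    except FileNotFoundError:
        print(path, "is not available on this system")
        return
    pcts = breakdown(first, second)
    for cpu in sorted(pcts, key=lambda name: int(name[3:])):
        print(cpu, " ".join(f"{k}={v:5.1f}%" for k, v in pcts[cpu].items()))


if __name__ == "__main__":
    main()
```

Running this during one of the slow periods should show whether the saturated core is spending its time in steal (pointing at the host), in irq/softirq (interrupt handling), or in iowait (disk), as opposed to ordinary user/system time.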
Has anyone ever had this issue? Does anyone have any suggestions on how I could go about solving this?
Any help is very much appreciated.