Problem Statement
How do you troubleshoot high CPU usage or memory issues? Describe the steps and commands you would use to identify and resolve the problem.
Explanation
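Start with top or htop to identify the processes consuming the most CPU or memory. In top, press P (Shift+p) to sort by CPU and M (Shift+m) to sort by memory, then note the PID and the %CPU and %MEM values. For CPU issues, look for processes whose usage stays high across several refreshes rather than spiking briefly. For memory, compare resident memory (RES in top, RSS in ps aux) with virtual memory (VIRT in top, VSZ in ps aux): resident memory is what actually occupies RAM, while virtual size includes address space that is mapped but may never be touched.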
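For scripts or remote sessions where an interactive top is awkward, the same first look can be sketched non-interactively with ps. This is a minimal sketch that assumes the procps-ng version of ps (the --sort option is not available in BusyBox ps); the function name top_procs is hypothetical. Keep in mind that the %CPU reported by ps is CPU time divided by the process's lifetime, not an instantaneous reading like top's.

```shell
#!/bin/sh
# top_procs KEY [COUNT]: hypothetical helper, not a standard command.
# KEY is a ps sort key such as pcpu (CPU) or pmem (memory).
# Prints the header line plus the COUNT highest-ranked processes.
top_procs() {
    key="$1"
    count="${2:-5}"
    # --sort with a leading '-' sorts in descending order (procps-ng ps).
    ps -eo pid,ppid,user,pcpu,pmem,rss,vsz,comm --sort="-$key" \
        | head -n "$((count + 1))"
}
```

Calling top_procs pcpu 5 lists the five heaviest CPU consumers, and top_procs pmem 5 does the same for memory; RSS and VSZ are reported in KiB.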
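Investigate the specific process. ps -fp PID shows its details without the false matches that ps aux | grep PID produces (grep matches the number anywhere on a line, including the grep command itself). cat /proc/PID/status shows a comprehensive status including a memory breakdown (VmRSS, VmSize, VmSwap), and tr '\0' ' ' < /proc/PID/cmdline prints the full command line, which the kernel stores NUL-separated. Check open files with lsof -p PID to see whether the process holds an unusually large number of file descriptors, and compare the count against the "Max open files" line in /proc/PID/limits. Use pmap -x PID to view the detailed memory map and identify the largest mappings.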
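The per-process checks can be gathered into one small report. The sketch below reads only from /proc, so it assumes Linux; the function name inspect_pid is hypothetical, and counting descriptors of another user's process usually requires root.

```shell
#!/bin/sh
# inspect_pid PID: hypothetical helper that summarises one process from /proc.
inspect_pid() {
    pid="$1"
    [ -d "/proc/$pid" ] || { echo "no such process: $pid" >&2; return 1; }

    # cmdline is NUL-separated, so translate NULs to spaces for display.
    printf 'cmdline: '
    tr '\0' ' ' < "/proc/$pid/cmdline"
    echo

    # Key fields from the status file: state, thread count, memory sizes.
    grep -E '^(Name|State|Threads|VmSize|VmRSS|VmSwap):' "/proc/$pid/status"

    # Each entry in /proc/PID/fd is one open file descriptor.
    printf 'open fds: '
    ls "/proc/$pid/fd" 2>/dev/null | wc -l

    # Fourth field of the "Max open files" line is the soft limit.
    printf 'fd limit: '
    awk '/^Max open files/ {print $4}' "/proc/$pid/limits"
}
```

Running inspect_pid $$ reports on the current shell. An open-fd count close to the fd limit is a hint that the process is leaking descriptors.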
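For CPU issues: strace -p PID attaches to the process and shows its system calls, which reveals what it is doing; strace -c -p PID prints a per-call summary when you detach with Ctrl-C. If the process is looping, look for the same calls repeating; if strace shows nothing while CPU stays high, the process is probably spinning in user space. pstack PID (or gstack, where installed) prints the call stack, showing where the process is stuck. Both tools need permission to trace the process (same user or root) and slow it down while attached. Decide whether the CPU spike is legitimate (real processing) or a bug (such as an infinite loop), and review the application logs for errors or warnings.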
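Before attaching a tracer, it can help to confirm that the process is burning CPU right now rather than having done so in the past. A rough sketch, assuming Linux and a getconf that knows CLK_TCK: sample the utime and stime counters in /proc/PID/stat twice and convert the difference to a percentage. The function names cpu_ticks and cpu_percent are hypothetical, and the interval must be a whole number of seconds.

```shell
#!/bin/sh
# cpu_ticks PID: user + system CPU time of the process, in clock ticks.
cpu_ticks() {
    stat=$(cat "/proc/$1/stat") || return 1
    # The command name (field 2) may contain spaces, so drop everything
    # up to the last ") " and count fields from the state field onward.
    rest=${stat##*) }
    set -- $rest
    # utime and stime are fields 14 and 15 of the full line, i.e. 12 and 13 here.
    echo $(( ${12} + ${13} ))
}

# cpu_percent PID [SECONDS]: CPU use over the sampling interval, as a
# percentage of one core (a multi-threaded process can exceed 100).
cpu_percent() {
    pid="$1"
    interval="${2:-1}"
    hz=$(getconf CLK_TCK)
    t1=$(cpu_ticks "$pid") || return 1
    sleep "$interval"
    t2=$(cpu_ticks "$pid") || return 1
    echo $(( (t2 - t1) * 100 / (hz * interval) ))
}
```

If cpu_percent PID 5 stays near 100 across repeated runs, the load is sustained and worth tracing; top -H -p PID then shows which thread is responsible.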
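For memory issues: check for a memory leak by monitoring memory growth over time, for example watch -n 10 'ps -o pid,rss,vsz,cmd -p PID'. RSS that climbs steadily under a steady workload suggests a leak. Review the application logs for OutOfMemoryError (Java) or similar, and check dmesg or journalctl -k for kernel OOM-killer messages. Check swap usage with free -h and vmstat 1: some swap in use is normal, but sustained nonzero si/so (swap-in/swap-out) values in vmstat indicate that the system is short of RAM. If a memory leak is suspected, restart the application and monitor it, ideally after enabling memory profiling.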
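Watching the numbers by eye makes slow growth easy to miss, so it can be useful to log them instead. The following is a minimal sketch, assuming Linux /proc; rss_kb and log_rss are hypothetical helper names, and the sample count and interval defaults are arbitrary.

```shell
#!/bin/sh
# rss_kb PID: resident set size in KiB, taken from /proc/PID/status.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# log_rss PID [SAMPLES] [SECONDS]: print "epoch_seconds rss_kib" lines,
# one per sample, stopping early if the process exits.
log_rss() {
    pid="$1"
    samples="${2:-6}"
    interval="${3:-10}"
    i=0
    while [ "$i" -lt "$samples" ] && [ -d "/proc/$pid" ]; do
        printf '%s %s\n' "$(date +%s)" "$(rss_kb "$pid")"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, log_rss PID 360 10 > rss.log records an hour of samples at ten-second intervals; a second column that only ever rises is the pattern to look for.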
Resolution steps: restart the misbehaving process if it is safe to do so; investigate the application logs for the root cause; apply limits with ulimit or systemd (for example MemoryMax= and CPUQuota= in the unit file) if the process consumes excessive resources; optimize the application if the load is legitimate; or add resources (CPU/RAM) if the system is underpowered. Document findings and solutions for future reference.
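As an illustration of the ulimit option, the sketch below starts a command with a cap on its virtual memory. It assumes a shell whose ulimit supports -v (bash and dash do; POSIX only requires -f), and run_limited is a hypothetical wrapper name. The limit is set inside a subshell so that the calling shell's own limits are left unchanged.

```shell
#!/bin/sh
# run_limited LIMIT_KIB COMMAND [ARGS...]: hypothetical wrapper that runs
# COMMAND with its virtual memory capped at LIMIT_KIB kibibytes.
run_limited() {
    limit_kb="$1"
    shift
    # The parentheses create a subshell; the limit applies only to it and
    # to the command that replaces it via exec.
    ( ulimit -v "$limit_kb" && exec "$@" )
}
```

For example, run_limited 1048576 ./app (where ./app stands in for your own program) caps the address space at 1 GiB, so allocations beyond that fail instead of pushing the machine into swap. A limit on virtual size is cruder than a cgroup-based limit on resident memory, which is what the systemd route provides.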