Eventually, all hardware fails.
It’s your job to ensure it fails gracefully, without taking your business offline. Redundant systems are a good first step. They ensure that even if a critical component fails, your server remains operational, ‘failing over’ to a backup.
But redundancy should not be your only safeguard. Redundant systems protect you against the unexpected. It’s even more important to monitor your server, so that you detect potential failures before they cause a problem.
There are several warning signs that your server may be on the verge of going down. Learn to recognize each, and failures will rarely (if ever) catch you off guard again.
It’s Running Hot. Too Hot
Monitor the temperature of each server component individually. If you notice a component’s temperature climbing steadily, it could indicate that the component is in danger of failing. And even if the component itself is in good shape, allowing it to run hot can severely shorten its lifespan.
High heat is not always a sign that a server is about to die. It can be a symptom of several other issues, including a blocked intake or exhaust port, a damaged or dirty heat sink, a faulty fan, or a severe software bug. It may even indicate that your server room is not properly climate-controlled.
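As a rough illustration of what "climbing steadily" might mean in practice, the sketch below fits a straight line to a series of temperature readings for one component and flags it if the slope or the latest reading passes a limit. The function names and both thresholds (0.5 degrees per sample, 80 degrees) are made-up values for the example, not vendor recommendations. How you collect the readings depends on your own monitoring tools.

```python
from typing import Sequence


def slope_per_sample(readings: Sequence[float]) -> float:
    """Least-squares slope of the readings against their sample index."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_trending_hot(
    readings: Sequence[float],
    max_slope: float = 0.5,   # hypothetical limit, degrees per sample
    max_temp: float = 80.0,   # hypothetical limit, degrees Celsius
) -> bool:
    """Flag a component whose temperature is rising fast or is already high."""
    return slope_per_sample(readings) > max_slope or readings[-1] >= max_temp
```

A flag from a check like this is only a prompt to investigate. As noted above, the cause may be a blocked vent or a failing fan rather than the component itself.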
You’re Seeing Constant, Random Crashes
Sometimes, servers crash – it can even happen to a healthy system put under excessive load. An occasional, isolated failure is little cause for concern. It’s when your server starts repeatedly crashing that you should investigate.
First, check your server and network performance logs. Are there periods of unusually high stress preceding a crash? This may indicate that you need more processing power.
Next, check your server’s event logs. Run a full malware and error scan, and physically inspect the server’s internals. You might also consider re-seating your memory sticks and running a full memory test.
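The first check above, looking for unusually high load just before a crash, could be sketched roughly as follows. This assumes you have already pulled crash timestamps and timestamped load samples (as a fraction from 0.0 to 1.0) out of your own logs. The function name, the 10-minute window and the 0.9 threshold are illustrative choices only.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def crashes_preceded_by_load(
    crash_times: Sequence[datetime],
    load_samples: Sequence[Tuple[datetime, float]],
    window: timedelta = timedelta(minutes=10),  # hypothetical look-back window
    threshold: float = 0.9,                     # hypothetical "high load" level
) -> List[datetime]:
    """Return the crashes that had at least one high-load sample in the window before them."""
    flagged = []
    for crash in crash_times:
        start = crash - window
        if any(start <= t <= crash and load >= threshold for t, load in load_samples):
            flagged.append(crash)
    return flagged
```

If most crashes come back flagged, a lack of capacity is a plausible culprit. If few or none do, the event logs and hardware checks are the more likely place to find the cause.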
It’s Suddenly Slower (Or Louder) Than Usual
Lastly, if you notice your server’s performance going steadily downhill, this could be a sign of a dying hard drive or memory issues. However, it could also indicate faulty software or malware. It is critical to keep your server up to date, and even more so to run regular malware scans.
If you notice a slowdown along with an increase in noise, the problem is likely hardware-related. Otherwise, check your software before you start examining physical systems.
Stay Ahead Of Failure
Even the best systems have a limited life cycle. Understand this, and learn to recognize the signs that yours are nearing the end of theirs. More importantly, be ready to replace server components when they become old or outdated.
Even if your systems show no signs of failure, it may still be worth upgrading – a few years is nearly an eternity in the computing world, after all.