Even if you're liberal about fixing known bugs and conservative about installing unneeded software, problems will occur. Your users will detect and report many of these problems before you know anything is wrong. Tracking down those reported problems is as much an art as a science.
The art of troubleshooting is your intuition about the state of your server and the network, and your insight into the accuracy of the user's problem report. I don't say this to demean the intelligence of the user reporting the problem; I'm as guilty of providing inaccurate trouble reports as the next guy. When under stress, I have completely misunderstood very clear instructions and have reported a problem when the only real problem was my failure to take the time to read the instructions carefully. Thus, you should not assume too much from the trouble report, and you need to be methodical in applying your own knowledge to the problem. Here are some suggestions:
• Duplicate the problem yourself, and then have the user duplicate the problem while you walk them through it. This often eliminates problems that spring from user confusion.
• Avoid oversimplification. The problem is not always a confused user. In a networked server, the problem can occur in any part of the network hardware or software, from your system to the remote system.
• Divide a complex problem into pieces and test the individual pieces to isolate the problem.
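The last suggestion can be sketched in a few lines of Python. This is only an illustration of the idea, not a complete diagnostic tool: the function name first_failure and the four placeholder checks are hypothetical, and each lambda stands in for a real test that you would write for your own network.

```python
from typing import Callable, List, Optional, Tuple

# A check is a label plus a function that returns True when that piece works.
Check = Tuple[str, Callable[[], bool]]


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run the checks in order and return the name of the first one that fails.

    Returns None when every check passes.
    """
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False  # a check that raises is treated as a failure
        if not ok:
            return name
    return None


# Hypothetical pieces, ordered from the local system outward.
# Each lambda is a placeholder for a real test of that piece.
checks = [
    ("local interface", lambda: True),
    ("default gateway", lambda: True),
    ("remote host", lambda: False),
    ("remote service", lambda: True),
]

print(first_failure(checks))  # prints "remote host"
```

Ordering the checks from your own system outward means that the first failure marks the boundary between the pieces that work and the piece that needs closer attention.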
The science of troubleshooting is your knowledge of how your server and the network operate, and the tools that are available to conduct empirical tests of the server and the network. Your knowledge helps you focus on the proper area for testing, and helps you select the proper tools for testing that area. Here are some guidelines to help you make those decisions:
• If the problem occurs on outbound connections, it is probably unrelated to any of the network daemons running on your server. Use ifconfig to check the configuration of the network interface, ping to test the basic connectivity, and traceroute to test the route to the remote server. Talk with the administrator of the remote server to ensure that they are offering the service requested by the user and that it is properly configured.
• If the problem occurs on inbound connections, make sure that your system is running the required daemon. Connect to the daemon from the local host to make sure that it is running. If the daemon is running, ensure that it is properly configured. Check to make sure that the remote system is allowed access to the server by the enterprise firewall and by host-based access controls.
• If the problem occurs on only one client, concentrate on testing that client. If the problem happens for many clients, concentrate on the network and the server.
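As one way to carry out the inbound-connection guideline, the following Python sketch tries to open a TCP connection to a daemon from the local host. It uses only the standard socket module. The function name port_is_open and the service names and port numbers in the example are assumptions chosen for illustration; substitute the daemons your server is actually expected to run.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers a refused connection, a timeout, and an unreachable host.
        return False


if __name__ == "__main__":
    # Hypothetical daemons to check; adjust the names and ports for your server.
    services = {"ssh": 22, "smtp": 25, "http": 80}
    for name, port in services.items():
        if port_is_open("127.0.0.1", port):
            state = "accepting connections"
        else:
            state = "not reachable"
        print(f"{name} (port {port}): {state}")
```

A successful connection shows only that something is listening on that port. It does not show that the daemon is properly configured, or that the enterprise firewall and host-based access controls allow the remote system in, so those still need to be checked separately.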