It's time to close out the coverage of
designing safe systems. This is the fourth of the series. The goal is to inspire the
creation of safer networks with a corresponding lessening of critical errors and
subsequent downtime.
A safety factor that has become more prominent over the years is
power infrastructure. That is, we should consider not just the quality of the juice coming
from the power company (or in this area frequently NOT coming from Com Ed), but also the
quality of the power bouncing around within our buildings.
As faster and more sophisticated CPUs and support chips are created, they dump more and
more garbage (electrical irregularities) onto the electrical wiring and in turn into other
devices including computers and other network devices. Non-computer electric devices such
as fluorescent lighting, copiers, refrigerators, etc. also pollute the electrical system
within a building.
Standards violations within the wiring also contribute to electrical problems. A lot of
the infractions are so obvious that even I can find them, but when you talk to power
experts, their horror stories are positively mind-bending. Considering the number of
(electrical) safety-related violations that they encounter on a regular basis, I'm
quite surprised that we haven't started recalling whole buildings the same way we
recall cars with faulty wiring. Perhaps that will happen after enough people get zapped or
barbecued.
I have found that, increasingly, the underlying electrical wiring is so bad that even a
high-quality UPS (Uninterruptible Power Supply) isn't enough to keep a file server or
other network device running. A plague of mysterious glitches and gremlins is usually the
first sign. The lockups and ABENDS (Abnormal End, an old term passed on from mainframes)
are truly random from a computer processing point of view, though if you expand your
thinking to include the possibility of power problems, you might notice some causal
relationship with a power system-related event. These events could include motors cycling,
air-conditioning systems starting, or my favorite, everybody switching on their PCs almost
all at once the first thing in the morning.
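One way to look for that causal relationship is to line up the times of the lockups against the times of known power events. The following is only a minimal sketch in Python; the function name, the two-minute window, and the sample timestamps are all hypothetical, and in practice the times would come from your own server logs and building notes.

```python
from datetime import datetime, timedelta

def crashes_near_events(crashes, events, window=timedelta(minutes=2)):
    """Return the crashes that happened within `window` after any power event."""
    matched = []
    for crash in crashes:
        # A crash counts if it follows some event by no more than `window`.
        if any(timedelta(0) <= crash - event <= window for event in events):
            matched.append(crash)
    return matched

# Hypothetical sample data: two power events and three server lockups.
events = [
    datetime(1999, 11, 1, 8, 0),    # everybody switches on their PCs
    datetime(1999, 11, 1, 13, 30),  # air conditioning starts
]
crashes = [
    datetime(1999, 11, 1, 8, 1),
    datetime(1999, 11, 1, 10, 45),
    datetime(1999, 11, 1, 13, 31),
]

matched = crashes_near_events(crashes, events)
print(f"{len(matched)} of {len(crashes)} lockups followed a power event")
```

If most of the lockups land inside the window, power deserves a closer look. A few matches prove nothing by themselves, but they tell you where to start poking.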
Initially you usually test and/or replace various components within the server, trying
to pin down the source of these non-repeatable errors. You might even swap out the UPS
itself. Eventually you move on to manipulating electrical devices yourself. If you're
lucky, you will find that you can affect a computer's reliability by turning on or
adjusting things like fans, lights, or photocopiers. You might even have neighboring offices
or even buildings cycle their major machinery, looking for a clue.
Rewiring is the most obvious answer, but that is often not financially feasible, or not
even available as an option because the client has leased space and doesn't have the
authority to alter the electrical system.
I finally had to start adding a power conditioner to the UPS, or, if there wasn't
a UPS there in the first place, adding a combined conditioner/UPS. This gives near-perfect
power to the server in question, but it does add noise to the common electrical
system, so it might aggravate problems that other network devices are experiencing. In
this case you might have to put power conditioners on all network devices.
If the power infrastructure is in really bad shape, or the network spreads out across
multiple electrical areas within a building or even to other buildings, grounding may be
an issue. Computers can only tell the ones from the zeros on the network cable if they
have approximately the same idea as to what constitutes a one or a zero. The baseline is
the electrical ground. If the ground is "bad" or simply different,
communications can randomly or even completely fail.
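To see why a shifted ground matters, here is a deliberately simplified sketch in Python. The voltage levels, the threshold, and the function names are made up for illustration, and real network signaling is more involved than this. The receiver measures each voltage against its own ground, so an offset between the two grounds moves every reading.

```python
THRESHOLD = 1.5  # volts; hypothetical dividing line between a zero and a one

def transmit(bits, low=0.0, high=3.0):
    """Turn bits into voltages measured against the sender's ground."""
    return [high if bit else low for bit in bits]

def receive(voltages, ground_offset=0.0, threshold=THRESHOLD):
    """Decode voltages as seen from a receiver whose ground sits
    `ground_offset` volts above the sender's ground."""
    return [1 if (v - ground_offset) > threshold else 0 for v in voltages]

bits = [1, 0, 1, 1, 0]
signal = transmit(bits)

print(receive(signal))                     # matching grounds
print(receive(signal, ground_offset=2.0))  # receiver ground 2 V higher
```

With matching grounds the bits come back intact. With the receiver's ground two volts higher, every one reads as a zero in this toy model, which is the "completely fail" case. A smaller or fluctuating offset would only flip bits now and then, which is the random case.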
Again, assuming that rewiring is not an option, you can use a device called a Ground
Guard (PowerVar). It is available as an add-on device for an existing
power conditioner or as an integrated unit built into a power conditioner. It will provide
a rock-solid ground for the devices plugged into it. Its single disadvantage is that this
new ground will almost certainly not match any other ground, thereby eliminating any
chance of communications between protected (by a GroundGuard) and unprotected equipment.
The only sure cure is to put a GroundGuard and power conditioner (combined or separate) on
every device connected to the network. It isn't cheap, but it is often a lot less
expensive than either gutting the electrical wiring system and starting over, or putting
up with an unstable network.
Naturally, if you are protecting the computer from electrical problems coming over the
power lines, you should also protect it from power coming over the network cable. Remember
that protection from both dangerous overvoltages and potentially disruptive noise operates
on the "weakest link" theory. Miss one piece of the chain and you get nailed.
The final terminus of LAN safety (returning to the use of "safety" in
relation to network reliability) is the workstation itself. Having recently covered some
of the details of making a Windows 9x workstation more reliable, I will summarize my past
suggestions.
First, keep the system patched and the software up to date.
Second, use your favorite utility software to solve and even prevent problems.
Third, think seriously about locking either some or even the whole user interface with
either Policy Editor (included with Windows 9x) or Novell's ZenWorks Starter Kit
(included with NetWare). Specifically, I have found that Windows 98 and especially Windows
98SE (Second Edition) are particularly sensitive to crisscrossing changes made to the
Display and to the Desktop Themes. It seems like these two programs within the Control
Panel were written by different programming teams at different times, with conflicting
goals. Changing your Windows environment using both programs makes Windows very unhappy.
It demonstrates this by adding a new level of instability that will drive the user (and
the system integrator) nuts.
I have been requesting that clients try to encourage their users to avoid using Desktop
Themes until such time as Microsoft issues a fix. Display is an older program that is less
likely to screw things up if it is used alone.
Again, the best way is to simply standardize on an interface and lock it in, but I have
been all but threatened with severe bodily harm when I suggest this option to users, as
they seem to feel that their civil rights are being violated if they can't fiddle
with the interface. Until somebody figures out a way to charge my fees for repairing a
workstation (after too much "fiddling") back to a user's salary, I
don't think the option of protecting the interface will catch on. After all, even
Windows 3.x had a limited version of this problem (troubles caused by inappropriate
customizing of the interface), and hardly anybody other than schools ever locked that
interface.
Now, I'm waiting to see the effect that the upcoming Y2K-related made-for-TV movie
will have on the public's perception of computing. I have heard that it is incredibly
inane with regard to ignoring facts in favor of overdramatizing. Already at least two
major motion pictures on the same subject have been cancelled, presumably due to the fact
that movie studios are owned by major international conglomerates who might be adversely
financially affected by the panic potentially generated by these movies.
Oh well, I still haven't been able to overcome the myths about computing that were
initially spread by the 1983 movie War Games! I'm particularly offended by the
stereotype of the overweight, bespectacled, face-fur-equipped guy spending all of his time
on a computer, to the exclusion of any social life. Where do they get such silly ideas?
Now, let me clean up these candy wrappers and set aside my glasses so I can trim my beard.