|
Title
|
|
System hangs during Backup/Verify operations
|
|
Keywords
|
|
edge BackupEDGE verify lockup hang crash
|
|
Product Release(s)
|
|
01.00.0x, 01.01.0x, 01.02.0x. BackupEDGE for SCO UNIX ver 3.2.4.x and above with SCSI subsystems.
|
|
Problem Description
|
|
During Backup or Verify, and especially during Bit-Level
Verify, the system hangs. More specifically, the SCSI bus hangs.
Console multiscreens still function, but the system does not
respond to commands, and all other terminals are frozen.
|
|
Cause
|
|
Backup and Verify operations place a heavy load on the SCSI bus.
The heaviest load comes from a bit-level verification, which
must read data from two devices (the tape drive and the hard
drive) at the same time. Although BackupEDGE is running when
the lockup occurs, it is not the cause; the system is failing
to transfer data through the SCSI subsystem correctly.
The SCO device drivers have critical timing requirements that
in some instances demand near-perfect SCSI bus impedance and
termination, as well as specific firmware releases in host
adapters, hard drives, and tape drives, to achieve reliable
operation.
SCO OpenServer 5 in particular is much more timing-critical
than previous releases. This is why some people report that
"this hardware worked fine until I upgraded to OpenServer 5,
and now it locks up".
|
|
Solution
|
|
It is possible to increase the reliability of the SCSI bus,
with minimal performance degradation, by disabling a few of
the advanced features of the SCSI device driver.
Make an archive copy of the file /etc/conf/pack.d/Sdsk/space.c.
Edit /etc/conf/pack.d/Sdsk/space.c.
Look for the following lines:
int Sdsk_no_sg = 0;
int Sdsk_no_tag = 0;
int Sdsk_set_RCD = 0; (OpenServer 5 Only)
Change the 0 in each line to a 1.
Save the file, relink the kernel, and reboot.
These three lines control scatter/gather, tagged queuing,
and the target read cache, respectively. Setting a value
to 1 disables that feature.
Disabling them will slow the system slightly while
increasing stability.
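
The edit described above can also be scripted. The following is only a sketch, assuming the three lines appear in space.c exactly as shown above (a single "= 0;" assignment per line, spaces rather than tabs around the "="). The function name disable_sdsk_features and the ".orig" suffix for the archive copy are hypothetical choices made for this example and are not part of any SCO or BackupEDGE tool.

```shell
# Sketch only: archive a space.c file, then change the three Sdsk flags
# from 0 to 1. The helper name and the ".orig" suffix are assumptions.
disable_sdsk_features() {
    file=$1

    # Make the archive copy once; never overwrite an existing archive,
    # so it always holds the unmodified file.
    if [ ! -f "$file.orig" ]; then
        cp "$file" "$file.orig" || return 1
    fi

    # Read the archive copy and write the edited result to the original
    # path. Only lines naming one of the three flags are changed. On
    # releases without Sdsk_set_RCD, that expression matches nothing.
    sed -e '/Sdsk_no_sg/s/=[ ]*0[ ]*;/= 1;/' \
        -e '/Sdsk_no_tag/s/=[ ]*0[ ]*;/= 1;/' \
        -e '/Sdsk_set_RCD/s/=[ ]*0[ ]*;/= 1;/' \
        "$file.orig" > "$file"
}
```

It would be called as disable_sdsk_features /etc/conf/pack.d/Sdsk/space.c. The kernel must still be relinked and the system rebooted afterwards. On SCO systems the relink is normally done with /etc/conf/cf.d/link_unix, but confirm the procedure for your release.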
If your system remains stable after a week or so of activity,
you may try re-enabling these features one at a time (set the
value back to 0, relink the kernel, and reboot) until the hang
returns. You will then know which advanced feature causes
instability on your system.
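
Re-enabling a single feature means changing one value back from 1 to 0. A possible sketch follows, under the same assumptions about the layout of space.c as in the Solution text above. The helper name enable_sdsk_feature and the ".tmp" scratch file are hypothetical and exist only for this example.

```shell
# Sketch only: set one named Sdsk flag back from 1 to 0 so that a single
# feature can be re-tested. The helper name is an assumption.
enable_sdsk_feature() {
    file=$1
    flag=$2

    # Accept only the three flags named in this document.
    case $flag in
        Sdsk_no_sg|Sdsk_no_tag|Sdsk_set_RCD) ;;
        *) echo "unknown flag: $flag" >&2; return 1 ;;
    esac

    # Edit into a scratch file, then copy it back over the original so
    # the original file keeps its ownership and permissions.
    sed -e "/$flag/s/=[ ]*1[ ]*;/= 0;/" "$file" > "$file.tmp" &&
        cp "$file.tmp" "$file" &&
        rm -f "$file.tmp"
}
```

For example, enable_sdsk_feature /etc/conf/pack.d/Sdsk/space.c Sdsk_no_tag would turn tagged queuing back on, leaving the other two features disabled. Relink the kernel and reboot after each change, and allow a period of normal activity before re-enabling the next feature.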
|