SNMP Service Restarted

This message indicates that the SNMP service is running correctly on your server. The message appears for one of the following reasons:

  1. Your server rebooted because it was turned off and on, or restarted.
  2. The server's SNMP service has been manually reloaded, and the management agents are again communicating correctly with your server.
  3. Automatic Server Restart (ASR) software running on the server detected a system hang and automatically restarted the server.

If you receive this message frequently, it may indicate a problem with your server's power source, cabling, software or hardware. Verify that the power cable to your HP NetServer is properly connected. If the message continues, run system diagnostics. See your User Guide for more details on technical service and support.

 

 

 

 

 

 


SNMP Service Down

This message indicates that the SNMP service on your server is not running. The SNMP service is required for the NetServer agent software running on the server to send messages about server component status to your management console. Go to the server and restart the SNMP service.

 

 

 

 

 

 


Capacity Warning Level - Warning

Your hard drive is approaching its storage capacity. By default, this alarm is set to appear when your hard drive has reached 70% of its total storage capacity (this may be changed by the user through TopTools). To ensure the safety of your data, perform regularly scheduled back-ups of your hard drive. To free up disk space, delete unused or unnecessary files. In addition, you may wish to add a new hard drive. Contact your HP dealer for information on new hard drives.

This informational message will not be repeated unless the server is rebooted.

 

 

 

 

 

 


Capacity Warning Level - Caution

Caution! Your hard drive is approaching its storage capacity. By default, this alarm is set to appear when your hard drive has reached 80% of its total storage capacity (this may be changed by the user through TopTools). To ensure the safety of your data, perform regularly scheduled back-ups of your hard drive. To free up disk space, delete unused or unnecessary files. In addition, you may wish to add a new hard drive. Contact your HP dealer for information on new hard drives.

This alarm will not be repeated unless the server is rebooted. See your User Guide for more details on technical service and support.

 

 

 

 

 

 


Capacity Warning Level - Urgent

Urgent! Your hard drive is approaching its storage capacity. By default, this alarm is set to appear when your hard drive has reached 90% of its total storage capacity (this may be changed by the user through TopTools). To ensure the safety of your data, perform regularly scheduled back-ups of your hard drive. To free up disk space, delete unused or unnecessary files. In addition, you may wish to add a new hard drive. Contact your HP dealer for information on new hard drives.

This alarm will not be repeated unless the server is rebooted. See your User Guide for more details on technical service and support.
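The three capacity alarms differ only in the percentage at which they fire. As a rough illustration, the default levels described above (70%, 80%, 90%) could be checked as in the sketch below. The function and variable names are hypothetical and are not part of TopTools or the NetServer agents.

```python
# Default thresholds taken from the alarm descriptions above.
# On a real system these may have been changed through TopTools.
DEFAULT_THRESHOLDS = [(70, "Warning"), (80, "Caution"), (90, "Urgent")]


def capacity_alarm_level(used_bytes, total_bytes, thresholds=DEFAULT_THRESHOLDS):
    """Return the highest alarm level reached, or None if below every threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent_used = 100.0 * used_bytes / total_bytes
    # Check the highest threshold first so the most severe level wins.
    for limit, level in sorted(thresholds, reverse=True):
        if percent_used >= limit:
            return level
    return None
```

For example, a drive with 85 of 100 units used falls in the Caution band, because it has passed 80% but not yet 90%.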

 

 

 

 

 

 


Free Space Threshold Alarm

This entry appears in the alarm log when a threshold level for a server's available storage capacity has been crossed. For most systems, the user can change the threshold level through TopTools.

 

 

 

 

 

 


POST Message

This entry appears in the alarm log when a server with BIOS error logging capabilities experiences either a successful power-on self-test (POST) or an error during boot up. For more information about a specific POST error number, refer to the system documentation that came with the server.

 

 

 

 

 

 


Temperature Monitor Error

This entry appears in the alarm log when the agent on the server is reporting invalid data from the server's built-in temperature sensor. If you receive this error, there is probably a problem with your server's temperature sensing hardware. Check to see if there was a power-on self-test (POST) error message on the server indicating problems with the server's built-in temperature sensor. Note that the server's front panel temperature status indicator should be green when the server is on and operating temperature is normal. For more information about your server's temperature sensor or a specific POST error number, refer to the system documentation that came with the server.

 

 

 

 

 

 


Temperature Warning

This entry appears in the alarm log when the temperature inside the server has gone outside the factory specified range for normal operation. You should quit applications and power down the server to protect its hardware from damage. Note that during a temperature warning the server's front panel temperature status indicator will also show green and red lights.

Note that on most HP NetServer L series systems, temperature thresholds and the ability to automatically shut off the server when they are exceeded can be configured via the EISA Configuration Utility.

 

 

 

 

 

 


Temperature Emergency

This entry appears in the alarm log when the temperature inside the server has gone far outside the factory specified range for normal operation. To avoid permanent damage to your server hardware, go and turn off the server immediately. Note that on some HP NetServer models the server's front panel power status indicator light and/or hot swap device activity lights will also show red during a temperature emergency.

This event may indicate one of the following conditions:

Note that for most HP NetServer L series systems, temperature thresholds and the ability to automatically shut off the server when they are exceeded can be configured via the EISA Configuration Utility.

 

 

 

 

 

 


Bus Timeout Error

This entry appears in the alarm log when a bus timeout error is received from the server. A bus timeout error indicates that a bus master accessory board installed in one of the server's EISA bus slots has caused a bus timeout. When a bus timeout error is detected, the system generates a Non-Maskable Interrupt that halts the system to prevent errors from propagating to other subsystems. Data being written or transmitted at the time may have been lost.

If you receive this error, do the following:

If no EISA slot number was specified in the error, you could try removing bus master boards one at a time and replacing them with known good bus master boards of the same type until you no longer receive the error. If you stop receiving errors after replacing a particular bus master board, that board was probably the problem.

 

 

 

 

 

 


I/O Channel Check Error

This entry appears in the alarm log when an I/O channel check error is received from the server. An I/O channel check error indicates that an I/O device, such as an accessory board installed in an EISA slot, has failed. When an I/O channel check error is detected, the system generates a Non-Maskable Interrupt that halts the system to prevent errors from propagating to other subsystems. Data being written or transmitted at the time may have been lost.

I/O channel errors cannot be traced back to a particular EISA slot. If you receive this error, do the following:

 

 

 

 

 

 


Software NMI Generated

This entry appears in the alarm log when a software Non-Maskable Interrupt was generated by a program running on the server. There may be a program on the server that has become unstable and causes the system to generate a non-maskable interrupt which halts the system. Data being written or transmitted at the time may have been lost.

 

 

 

 

 

 


PCI Bus Parity Error

This entry appears in the alarm log when a parity error was recorded during a data transfer to or from one of the devices on the server's PCI bus. When a PCI bus parity error is detected, the system generates a Non-Maskable Interrupt that halts the system to prevent errors from propagating to other subsystems. Data being written or transmitted at the time may have been lost.

If you continue to receive this error, find and replace the defective PCI board by replacing boards one at a time with known good boards of the same type until you no longer receive this error.

 

 

 

 

 

 


PCI System Error

This entry appears in the alarm log when a device on the specified PCI bus of the server has generated an error. When a PCI system error is detected, the system generates a Non-Maskable Interrupt that halts the system to prevent errors from propagating to other subsystems. Data being written or transmitted at the time may have been lost.

If you continue to receive this error, find and replace the defective PCI board. You may need to use system diagnostic utilities such as DiagTools located on the HP NetServer Navigator CD to determine the problem. If there are no associated errors found using diagnostics, try replacing boards one at a time with known good boards of the same type until you no longer receive this error.

 

 

 

 

 

 


CPU Failed

This entry appears in the alarm log when one or more of the CPUs installed in the server failed during the power-on self-test (POST). For more information about the specific POST error number, refer to the system documentation that came with the server.

If you receive this error, contact your authorized service representative.

 

 

 

 

 

 


Failsafe Timer Timeout

This entry appears in the alarm log when the server's EISA failsafe timer has expired or timed out. If your server Network Operating System makes use of this timer, and an application running under it hangs or crashes, you would receive this error. An EISA accessory board that hangs the server could also cause this error.

If you receive this error, try the following:

 

 

 

 

 

 


Configuration Utility Has Altered The System Configuration

This entry appears in the alarm log if someone has run one of the system configuration utilities (such as the EISA Configuration Utility) on the server and changed the configurations stored in the server's non-volatile RAM.

 

 

 

 

 


Automatic Server Restart Detected

This entry appears in the alarm log when a server that is running the HP Automatic Server Restart (ASR) software has been restarted. ASR eases the burden of dealing with a system crash or hang by automatically restarting the system if such a failure occurs. A combination of software and hardware is used to do this. When ASR is enabled, the ASR software periodically notifies the ASR hardware that the system is running correctly. When the system crashes or hangs, the ASR hardware is no longer notified and will automatically restart the system after a pre-configured amount of time. Note that the alarm will not show up in the Event Log until the server comes back on line and the ASR SNMP agent is loaded.
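The notify-or-restart behavior described above is a watchdog timer pattern. The sketch below is a toy model of that pattern only. The class and method names are hypothetical, time is passed in explicitly rather than read from a clock, and nothing here reflects the actual ASR hardware or software interfaces.

```python
class AsrWatchdog:
    """Toy model of the ASR hardware timer (hypothetical names)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds  # the pre-configured delay
        self.last_notified = 0.0
        self.restart_count = 0

    def notify(self, now):
        """Called periodically by the ASR software while the system is healthy."""
        self.last_notified = now

    def tick(self, now):
        """Called by the 'hardware'; returns True if it restarts the system."""
        if now - self.last_notified >= self.timeout_seconds:
            self.restart_count += 1
            self.last_notified = now  # the timer starts over after the restart
            return True
        return False
```

As long as `notify` keeps being called within the timeout, `tick` never fires. Once notifications stop (a hang or crash), the next `tick` past the timeout triggers the restart.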

 

 

 

 

 

 


Voltage Threshold Warning

The Voltage Threshold Warning alarm indicates that a measured voltage in the server has gone outside the factory specified range for normal operation. Receipt of this alarm suggests a faulty or failing power supply. Contact your HP service representative or authorized dealer.

For some older HP NetServer L series systems, voltage thresholds and the ability to automatically shut off the server when they are exceeded can be configured via the EISA Configuration Utility.

 

 

 

 

 

 


Voltage Threshold Emergency

The Voltage Threshold Emergency alarm indicates that a measured voltage in the server has gone far outside the factory specified range for normal operation and could damage system components. Receipt of this alarm suggests a faulty or failing power supply. Shut down the system and contact your HP service representative or authorized dealer.

If your HP NetServer system has an HP Remote Assistant EISA card, Internal Remote Assistant, or TopTools Remote Control PCI card, you can set the system to shut down on a critical voltage event.

For some older HP NetServer L series systems, voltage thresholds and the ability to automatically shut off the server when they are exceeded can be configured via the EISA Configuration Utility.

 

 

 

 

 

 


System Sensor Alarm

The system sensor alarm is generated when one of the following events trips one or more of the server's built-in sensors:

In order for the system to cool properly, all fans must operate and the server's cover or chassis door must be closed. The error listed in the Event Log should tell you which of the above situations has occurred. Depending upon the error received, check to see if the server's cover or chassis door has been left open, or if any of the server's cooling fans have been accidentally disconnected. If one or both of the server's cooling fans has failed, contact your HP service representative or authorized dealer.

 

 

 

 

 

 


Parity Error

This entry appears in the alarm log when a parity error is detected in your server's memory system. If your server uses ECC memory, this message was brought on by a double-bit error. When a parity error is detected, the system generates a non-maskable interrupt that halts the system to prevent errors from propagating to other subsystems. Data being written or transmitted at the time may have been lost.

If you receive this error, do the following:

See "What to Do About Overflow ECC Errors."

 

 

 

 

 

 


Memory Resizing Error

This entry appears in the alarm log if the server's BIOS detected a problem with system memory during POST (Power-On Self-Test). The server's BIOS has automatically remapped system memory to exclude the bad memory. The system will run, but with less memory. Make a note of the failed SIMM slot number and replace the failed SIMM to bring the system back to its original memory size.

 

 

 

 

 

 


Single-bit Error

This entry appears in the alarm log to indicate that there has been a single-bit error in one of the server's ECC memory modules. For some NetServer systems, the module in question is identified in the message by its slot and board/bank number on the system memory board. For other NetServer systems, the module is identified in the message by its slot number on the system memory board. If single-bit errors keep occurring at the same module number over a period of time, replace the memory module to avoid multi-bit errors. The system can correct single-bit errors; multi-bit errors, however, are not correctable and will halt the server. Replacing the memory module early can prevent unnecessary server downtime.

For systems that support predictive failure of memory modules (such as the LH 4/r, LXr 8000, and LXr 8500), you should wait for a message indicating that the memory module is operating outside of acceptable margins before changing any memory modules.
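To see why a single-bit error can be corrected while a double-bit error can only be detected, consider the generic sketch below. It uses an extended Hamming(8,4) code purely as an illustration. The source does not say which ECC scheme NetServer memory uses, and real ECC memory works on much wider words.

```python
def encode(data_bits):
    """Encode 4 data bits into an 8-bit codeword [p0, c1, ..., c7]."""
    d3, d5, d6, d7 = data_bits
    c = [0] * 8
    c[3], c[5], c[6], c[7] = d3, d5, d6, d7
    c[1] = c[3] ^ c[5] ^ c[7]  # parity over positions 1, 3, 5, 7
    c[2] = c[3] ^ c[6] ^ c[7]  # parity over positions 2, 3, 6, 7
    c[4] = c[5] ^ c[6] ^ c[7]  # parity over positions 4, 5, 6, 7
    c[0] = c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]  # overall parity
    return c


def decode(codeword):
    """Return (status, data_bits); data_bits is None when uncorrectable."""
    c = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos  # for a single flip, this is the flipped position
    overall = 0
    for bit in c:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and correct it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        c[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    return status, [c[3], c[5], c[6], c[7]]
```

Flipping any one bit of a codeword yields `"corrected"` with the original data; flipping any two yields `"uncorrectable"`, which corresponds to the case where the server halts.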

 

 

 

 

 

 


Double-bit Error

This entry appears in the alarm log when there has been an ECC double-bit error in one of the server's ECC memory modules. When an ECC double-bit memory error is detected, the system generates a Non-Maskable Interrupt that halts the system to prevent errors from propagating to other subsystems. Data being written or transmitted at the time may have been lost. For HP NetServer LX systems, a parity error will always precede a double-bit error.

Make a note of the failed memory bank/board number and slot number and replace the failed module.

Note that you may not always see this error logged in the Event Log for every server that supports ECC memory. Logging of the alarm depends on whether or not there is intervention by the NOS. If the NOS intervenes, the ECC agent will be unable to detect or log the error. In the event of NOS intervention, the server may display an error message, reboot, or hang. Any of these events may require administrative action at the server console.

 

 

 

 

 

 


Single-bit Error Overflow: Logging Disabled

This entry appears in the alarm log when a large number of single-bit errors occur within a short period of time (also called an overflow). Because the server's error handler is unable to handle the volume of errors, it disables the error logging and reporting features for that specific error. When this occurs, an error logging disabled alarm is sent to the Event Log.

Note: When logging and reporting of single-bit errors is disabled, ECC memory continues to detect single-bit and double-bit errors and to correct single-bit errors. However, no ECC-related alarms are generated until the overflow problem is corrected.

See "What to Do About Overflow ECC Errors."

 

 

 

 

 

 


Multi-bit Error Overflow: Logging Disabled

If a large number of multiple-bit errors occur within a short period of time, an overflow occurs. Because the ECC error handler is unable to handle the volume of errors, it disables the error logging and reporting features. When this occurs, an error logging disabled alarm is sent to the Event Log.

Note: When logging and reporting of errors is disabled, ECC memory continues to detect single-bit and double-bit errors and to correct single-bit errors. However, no ECC-related alarms are generated until the overflow problem is corrected.

See "What to Do About Overflow ECC Errors."

 

 

 

 

 

 


Too Many Errors Of This Type -- Logging Disabled

This entry appears in the alarm log when error logging for a specific error in the server has been disabled due to too many errors of that kind being detected in a short time. The system will continue to log other types of errors. To restart error logging for that specific type of error you need to gracefully shut down the NOS and restart the system.

If the server continues to log too many errors for a particular device, consider replacing the device. You can typically identify an error prone device by matching the number reported in the Event Log with the number of the I/O slot (EISA or PCI) or ECC memory SIMM slot in the server.

See "What to Do About Overflow ECC Errors."

 

 

 

 

 

 


What to Do About Overflow ECC Errors

Overflow ECC errors develop when single-bit errors are detected and corrected in large quantities within a short period of time. If a large number of single-bit errors keep occurring in short bursts, the system would otherwise focus only on servicing single-bit error detection. To avoid this, the software generates an overflow error and disables the reporting of single-bit errors, while the hardware continues to detect and correct them.
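The overflow behavior can be pictured as a rate limit on error reporting. The sketch below assumes a simple sliding-window count. The limit of 10 errors per 60 seconds and all names are hypothetical, since the source does not state the actual thresholds or how the error handler is implemented.

```python
from collections import deque


class SingleBitErrorLogger:
    """Toy model: stop reporting when too many errors arrive in a short time."""

    def __init__(self, max_errors=10, window_seconds=60.0):
        self.max_errors = max_errors          # assumed limit, not from the source
        self.window_seconds = window_seconds  # assumed window, not from the source
        self.recent = deque()                 # timestamps of recent errors
        self.logging_enabled = True
        self.log = []

    def record_error(self, timestamp, slot):
        if not self.logging_enabled:
            return  # the hardware still corrects the error; nothing is reported
        # Drop timestamps that have fallen out of the window.
        while self.recent and timestamp - self.recent[0] > self.window_seconds:
            self.recent.popleft()
        self.recent.append(timestamp)
        if len(self.recent) > self.max_errors:
            self.logging_enabled = False
            self.log.append((timestamp, "Single-bit Error Overflow: Logging Disabled"))
        else:
            self.log.append((timestamp, f"Single-bit Error (slot {slot})"))
```

In this model, errors spread out over time are each logged, while a burst that exceeds the limit produces one overflow entry and then silence until logging is re-enabled.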

To re-enable logging of single-bit and multi-bit errors in your server including the ability to use the Error Log Report Utility found on the HP NetServer Navigator CD:

If the server continues to log overflow errors, consider replacing the SIMM. This reduces the chance of multi-bit errors developing. You can identify the SIMM to replace by matching the number reported in the Event Log or Error Log Report Utility (included on the HP NetServer Navigator CD) with the number printed next to the ECC memory SIMM slots.

 

 

 

 

 

 


Event Log Nearing Capacity

It is time to clear the NetServer's hardware event log, which is now at 75% of capacity. To do this, use the Clear Event Log button in the Event Log section of the TopTools Status page.

 

 

 

 

 

 


ASR Power Down

An automatic power down after a NOS hang has occurred. The Remote Assistant's Automatic Server Restart (ASR) can be configured to perform different actions after a NOS hang. In this case, the ASR is configured to perform an automatic power down.

 

 

 

 

 

 


ASR Power Cycle

An automatic power cycle after a NOS hang has occurred. The Remote Assistant's Automatic Server Restart (ASR) can be configured to perform different actions after a NOS hang. In this case, the ASR is configured to perform an automatic power cycle.

 

 

 

 

 

 


Event Log Cleared

The SEL (System Event Log) was cleared automatically because the system corrupted it. This event was not user-initiated. If the problem persists, contact your HP service representative.

 

 

 

 

 

 


SDR Area Cleared

The SDR (Sensor Data Record) area was cleared automatically because the system corrupted it. This event was not user-initiated. If the problem persists, contact your HP service representative.

 

 

 

 

 

 


An unrecognized HW log event was detected

This message appears when an entry was made into the server's hardware event log that does not correspond with any known log entries. Its meaning cannot be determined.

This event message typically indicates that an update to the system BIOS or management software has been done on the server. The updated component is newer than the hardware log interpreter installed on this server.

Although the significance of the event received cannot be determined, the occurrence of this message indicates that the hardware log interpreter (part of the NetServer agents) on the system is out of date and should be updated. Obtain the latest NetServer agent software either via the World Wide Web (www.hp.com/go/netserver_mgmt) or from the version of the NetServer Navigator CD that was used to update the server BIOS or management software.

 

 

 

 

 

 


CPU or Terminator was not detected

This event message indicates that a system CPU slot (identified in the event message) on the CPU baseboard is missing either a CPU (processor) or a terminator. The system will not operate properly with an empty CPU slot. Use the following to determine which processor slot is affected:

29 = CPU # 1
30 = CPU # 2
31 = CPU # 3
32 = CPU # 4
34 = CPU R1 (LXr 8500 only)
35 = CPU R2 (LXr 8500 only)
36 = CPU R3 (LXr 8500 only)
37 = CPU R4 (LXr 8500 only)
38 = CPU L1 (LXr 8500 only)
39 = CPU L2 (LXr 8500 only)
40 = CPU L3 (LXr 8500 only)
41 = CPU L4 (LXr 8500 only)

Check to make sure that the CPU slots on the CPU baseboard have either a CPU or terminator installed.
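If these event messages are processed by a script, the number-to-slot table above can be kept as a simple lookup. The sketch below only restates the table. The names are hypothetical, and numbers that the table does not list (such as 33) are treated as unknown.

```python
# Slot numbers and names copied from the table above.
CPU_SLOT_NAMES = {
    29: "CPU # 1", 30: "CPU # 2", 31: "CPU # 3", 32: "CPU # 4",
    34: "CPU R1", 35: "CPU R2", 36: "CPU R3", 37: "CPU R4",
    38: "CPU L1", 39: "CPU L2", 40: "CPU L3", 41: "CPU L4",
}

# Entries 34 through 41 are marked "LXr 8500 only" in the table.
LXR_8500_ONLY = frozenset(range(34, 42))


def cpu_slot_name(number):
    """Return the slot name for an event number, or None if it is not listed."""
    return CPU_SLOT_NAMES.get(number)


def is_lxr8500_only(number):
    """Return True if the slot number applies only to the LXr 8500."""
    return number in LXR_8500_ONLY
```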

 

 

 

 

 

 


CPU Boot Failure

This event message indicates that a system CPU (processor) failed during one of the phases of the system Fault Resilient Booting process. You should reboot the system and enter the system BIOS setup program by pressing F2 when prompted during the boot process. In Setup, change the processor restart option to "yes". Restart the system. If the failure still occurs, check the following:

If your system has an HP TopTools Remote Control installed, you may additionally receive one of the following errors:

For error FRB2: CPU may need replacement.

For error FRB3: CPU interface to the Baseboard management controller is not working. CPU or Baseboard may need replacement.

 

 

 

 

 

 


Processor Thermal Trip Failure

This event message indicates that the system CPU (processor) temperature has exceeded preset limits. Check for fan failures in the system by checking the fan LEDs. Shut down the system and let it cool off. Restart the system. If the thermal trip occurs again, you may need to replace the processor module identified in the event message.

 

 

 

 

 

 


Processor Internal Error

This event message indicates that a system CPU (processor) internal error has occurred. Attempt to reboot the system. If the error persists, replace the processor module identified in the event message.

 

 

 

 

 

 


Multiple Fan Failure

This event message indicates that multiple fans have failed in the system.

If the system generating this message is a NetServer LXr 8000, check the front panel status LEDs. Check that the fans are seated properly. If the problem persists, replace the fans identified as failed: remove the front panel bezel, open the front panel, and look for failed fans (those with a solid yellow LED pointing to them).

 

 

 

 

 

 


Front Panel Button Violation

This event message indicates that one of the system's front panel buttons (power, reset, or scroll buttons) was pressed when the system was in Secure mode. To use the front panel buttons when the system is in secure mode be sure to first type in the system password.

 

 

 

 

 

 


Processor (#) Disabled

This event message indicates that one of the system's processors (identified by the # in the event message) was disabled during the system power-on self-test (POST). Use the following to determine which processor:

29 = CPU # 1
30 = CPU # 2
31 = CPU # 3
32 = CPU # 4
34 = CPU R1 (LXr 8500 only)
35 = CPU R2 (LXr 8500 only)
36 = CPU R3 (LXr 8500 only)
37 = CPU R4 (LXr 8500 only)
38 = CPU L1 (LXr 8500 only)
39 = CPU L2 (LXr 8500 only)
40 = CPU L3 (LXr 8500 only)
41 = CPU L4 (LXr 8500 only)

Re-enable the processor by entering the system BIOS setup program and going to the menu for enabling processors. If you receive this error again, you may need to replace the processor in question.

 

 

 

 

 

 


Hot Swap Fan Failure

This alert indicates that one of the system's hot swap fans (identified in the event message) may have failed and needs to be replaced. Check to see if the fan spins up when the system is powered on.

If your system has redundant fans, a single fan failure does not mean that the system must be shut down. It does mean, however, that you will need to arrange a time to replace the fan in order to maintain redundancy.

If the system that generated this message is a NetServer LXr 8000, remove the front bezel and open the front cover (above the fans). Look for a solid yellow LED indicating a failed fan. Replace the failed fan.

For more information, contact your dealer or HP customer service representative. See your User Guide for more details on technical service and support.

 

 

 

 

 

 


ECC Memory Module is Operating Outside of Acceptable Margins

The ECC memory module is currently functioning properly, but it is no longer operating within acceptable margins. The module in question is identified in the message by its slot and board/bank number on the system memory board.

For a NetServer LH 4/r, slot numbers 1 to 8 map to slots on Memory Card A; slot numbers 9 to 16 map to slots on Memory Card B.

For NetServer LXr 8000 and LXr 8500 systems, the memory board number indicates one of the two memory board slots. For the LXr 8500: 0 for the left memory board slot and 1 for the right memory board slot, as seen from the front of the system. For the LXr 8000: 0 for the right memory board slot and 1 for the left memory board slot, as seen from the front of the system.
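Because the board numbering is mirrored between the LXr 8000 and the LXr 8500, it is easy to pull the wrong board. The sketch below restates the mappings from the two paragraphs above as lookups. The function names are hypothetical, and the model strings are simply the names used in this text.

```python
def lh4r_memory_card(slot_number):
    """Map a NetServer LH 4/r slot number (1 to 16) to its memory card."""
    if 1 <= slot_number <= 8:
        return "Memory Card A"
    if 9 <= slot_number <= 16:
        return "Memory Card B"
    raise ValueError(f"slot number out of range: {slot_number}")


# Board positions as seen from the front of the system.
_LXR_BOARD_SIDES = {
    "LXr 8500": {0: "left", 1: "right"},
    "LXr 8000": {0: "right", 1: "left"},
}


def lxr_memory_board_side(model, board_number):
    """Return 'left' or 'right' for an LXr 8000 / LXr 8500 memory board number."""
    try:
        return _LXR_BOARD_SIDES[model][board_number]
    except KeyError:
        raise ValueError(f"unknown model or board number: {model!r}, {board_number!r}")
```

For instance, board number 0 is the left slot on an LXr 8500 but the right slot on an LXr 8000.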

ECC (Error Checking and Correcting) memory is designed to detect and correct simple soft errors (called single-bit errors) that occasionally occur in computer systems. If an ECC memory module is correcting a lot of these soft errors, you will receive this message. It may mean that the module is about to fail, or environmental conditions in the server are causing more errors than usual.

If you receive this message, contact your support provider to determine if a predictive repair should be made. When you call, please refer to "Product Support Plan 9377."

 

 

 

 

 

 


CPU Processor Module is Operating Outside of Acceptable Margins

The CPU processor module, identified in the message by its module #, is currently functioning properly, but it is no longer operating within acceptable margins.

CPU processor modules include on-chip cache memory. The cache memory is designed to detect and correct simple soft errors (called single-bit errors) that occasionally occur in computer systems. If a CPU processor module is correcting a lot of these soft errors, you will receive this message. It may mean that the module is about to fail, or environmental conditions in the server are causing more errors than usual.

If you receive this message, contact your support provider to determine if a predictive repair should be made. When you call, please refer to "Product Support Plan 9377."

 

 

 

 

 

 


PCI Hot Plug Fault Deasserted

A PCI Hot Plug device is no longer in a fault condition. This message is informational; no action is needed.

 

 

 

 

 

 


PCI Hot Plug Fault Asserted

A PCI Hot Plug device has failed. Please check the original message string to determine which server is having the problem. At the server, check the LEDs at the rear of the server (or run the hot plug user interface) to determine which PCI slot is having problems.

If you have a spare board, install it in the PCI Hot Plug slot after ensuring that the power to the slot is off (by using the PCI Hot Plug software interface). Complete instructions for removing PCI Hot Plug boards may be found in your system User Guide (and also in the hot plug software interface "help" file).

 

 

 

 

 

 


PCI Hot Plug Powered On

A PCI Hot Plug device has been powered up. This message is informational; no action is needed.

 

 

 

 

 

 


PCI Hot Plug Powered Off

A PCI Hot Plug device has been powered off. This may indicate a problem (if the amber LED is also on) unless it was done during a hot plug operation.

If the device has failed, please check the original message string to determine which server is having the problem. At the server, check the LEDs at the rear of the server (or run the hot plug user interface) to determine which PCI slot is having problems.

 

 

 

 

 

 


PCI Hot Plug Powered On or Off

A PCI Hot Plug device has been either powered on or off. If the Hot Plug device was powered on, this message is informational; no action is needed. If the Hot Plug device was powered off, this may indicate a problem (if the amber LED is also on) unless it was done during a hot plug operation.

If the device has failed, please check the original message string to determine which server is having the problem. At the server, check the LEDs at the rear of the server (or run the hot plug user interface) to determine which PCI slot is having problems.

 

 

 

 

 

 


PCI Hot Plug Fault Asserted or Deasserted

A PCI Hot Plug device is no longer in a fault condition (deasserted) or a PCI Hot Plug device has failed.

If the device has failed, please check the original message string to determine which server is having the problem. At the server, check the LEDs at the rear of the server (or run the hot plug user interface) to determine which PCI slot is having problems.

If you have a spare board, install it in the PCI Hot Plug slot after ensuring that the power to the slot is off (by using the PCI Hot Plug software interface). Complete instructions for removing PCI Hot Plug boards may be found in your system User Guide (and also in the hot plug software interface "help" file).

 

 

 

 

 

 


OS Watchdog Shutdown

A critical voltage or temperature threshold in the server has been exceeded. To prevent damage to your server hardware, the server's NOS has been gracefully shut down and the server powered off. There should be no loss of server data. Please check the server for the source of the problem.

 

 

 

 

 

 


OS Watchdog Poweroff

A critical voltage or temperature threshold in the server has been exceeded. To prevent damage to your server hardware the server has been automatically powered off. Please check the server for the source of the problem.

 

 

 

 

 

 


IPMB Protocol Error

This message is generated when the system BIOS detects an error on the IPMB (Intelligent Platform Management Bus). To clear this error, you may need to reboot the server. If the problem persists, you may need to flash update your system BIOS. The latest system BIOS may be obtained from the Internet (www.hp.com/go/netserver), or you may reboot your server using the NetServer Navigator CD that came with your system and go to "NetServer Utilities" from the main menu.

 

 

 

 

 

 


Uncorrectable Bus Error

This message is generated when the system halts due to an uncorrectable ECC or parity error on a system bus. For LXr 8000 systems, this bus includes the following components: CPUs (processors), terminators, CPU baseboard and VRMs (Voltage Regulator Modules). For LXr 8500 systems, it could also include the Processor Baseboard, Processor Carrier Board or I/O Carrier Board or a PCI card. Any of these components may be at fault. Check for additional messages to see if a specific component is identified. You may need to use diagnostic utilities such as DiagTools located on the HP NetServer Navigator CD to determine and fix the problem.

 

 

 

 

 

 


Power System failure A/C Lost

This message is generated when the server loses AC power completely. If you receive this message, try power cycling the server indicated in the message. Also, check whether the server's power cord has come unplugged.

If the server indicated is an LXr 8000 or LXr 8500, this message could mean you are trying to run the server on 110V AC instead of 220V AC. If you are running the server with a 110V AC connection, convert it to a 220V AC connection, or check to make sure your system can function properly when configured for a 110V AC connection.

 

 

 

 

 

 


Automatic Server Restart timeout

This entry appears in the alarm log when a server that is running the HP Automatic Server Restart (ASR) software detects a system hang or crash. You will see this message only if the ASR software has been previously configured by the user to send this message instead of automatically restarting the server during a system hang or crash. This means that the server is currently experiencing a system hang or crash which requires intervention by the user.

ASR may be used to ease the burden of dealing with a system crash or hang by automatically restarting the system if such a failure occurs. A combination of software and hardware is used to do this. When ASR is enabled, the ASR software periodically notifies the ASR hardware that the system is running correctly. When the system crashes or hangs, the ASR hardware is no longer notified and will automatically restart the system after a pre-configured amount of time. Note that the alarm will not show up in the Event Log until the server comes back on line and the ASR SNMP agent is loaded.
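
The notify-or-restart behavior described above can be sketched as a small simulation. This is only an illustration, not the HP ASR implementation: the class name WatchdogTimer, the function simulate, and all timing values are hypothetical.

```python
class WatchdogTimer:
    """Toy model of the ASR hardware timer (hypothetical, for illustration)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # pre-configured amount of time
        self.last_kick = 0.0         # time of the last "system is running" notification

    def kick(self, now):
        # The ASR software calls this periodically while the system is healthy.
        self.last_kick = now

    def expired(self, now):
        # True once no notification has arrived for the configured timeout.
        return now - self.last_kick >= self.timeout_s


def simulate(timeout_s, kick_times, end, step=1.0):
    """Return the time at which the hardware would restart the system, or None."""
    wd = WatchdogTimer(timeout_s)
    pending = sorted(kick_times)
    t = 0.0
    while t <= end:
        # Deliver every notification that is due at or before the current time.
        while pending and pending[0] <= t:
            wd.kick(pending.pop(0))
        if wd.expired(t):
            return t                 # hardware restarts the system here
        t += step
    return None                      # software kept notifying; no restart needed
```

For example, if the software notifies every 5 seconds and then hangs after the notification at 15 seconds, a 10-second timeout makes the simulated hardware restart the system at 25 seconds.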

 

 

 

 

 

 


POST Completed

This entry appears in the alarm log when a server with BIOS error logging capabilities has completed its power-on self-test (POST). This message is informational; no action is required.

 

 

 

 

 

 


Cache Protocol or Parity error

This event is generated when the system is reset due to a cache protocol or parity error. The system is not guaranteed to work after this failure. This indicates that one or more of the system Cache Coherency Filters has failed. For the HP NetServer LXr 8500, there are currently no separate diagnostic tests available to check each filter. In the message, coherency filters are identified as follows:

0 = left coherency filter
1 = right coherency filter
255 = one or both coherency filters

Replace the faulty filter(s) located on the Processor Baseboard on the HP NetServer LXr 8500.
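
For a script that post-processes alarm logs, the filter codes listed above could be translated with a simple lookup. This is a minimal sketch: the names COHERENCY_FILTERS and describe_filter are hypothetical, and only the three codes documented above are assumed to exist.

```python
# Codes as documented for the Cache Protocol or Parity error message.
COHERENCY_FILTERS = {
    0: "left coherency filter",
    1: "right coherency filter",
    255: "one or both coherency filters",
}


def describe_filter(code):
    """Return the documented meaning of a filter code, or flag it as unknown."""
    return COHERENCY_FILTERS.get(code, "unknown filter code %d" % code)
```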

 

 

 

 

 

 


Correctable Data Error

This event is generated when a correctable data error has occurred on the Memory Access Controller (MAC) located on the Processor Baseboard. This correction is done automatically and will not cause a system halt or reset.

This is informational only. However, if the error persists, call your service representative.

 

 

 

 

 

 


Server Down

This message indicates that your server is down. Check your server: it may have been turned off, its network connection may have been lost, or the TCP/IP or IPX network software on the server may have been stopped.

If you receive this message frequently (and the TCP/IP or IPX software is running), it may indicate a problem with your server's power source, cabling, software or hardware. Verify that the power and networking cables are properly connected to your server. If the message continues, run system diagnostics. See your User Guide for more details on technical service and support.

 

 

 

 

 

 


Server Restarted

This message indicates that the server is connected to the network and the TCP/IP or IPX network software is running. The message appears for one of the following reasons:

If you receive this message frequently, it may indicate a problem with the server's power source, cabling, software or hardware. Verify that the power cable to your HP NetServer is properly connected. If the message continues, run system diagnostics. See your User Guide for more details on technical service and support.

 

 

 

 

 

 


NOS Memory Dump Initiated

A NOS memory dump was attempted. A memory dump can be a useful tool for troubleshooting intermittent operating system crashes. The dump may or may not have been successful. Consult your server and operating system documentation for information on configuring the system for a crash dump.

One of the following sources initiated the memory dump:

After the memory dump is completed, the system should be rebooted. The dump file is available in the file system and can be sent to your support provider for analysis.

 

 

 

 

 

 


Single-Bit Repeated Error

This message indicates that the system corrected a repeated single-bit error in one of the server's ECC memory modules. The Single-bit Memory Analyzer in the BIOS determined that this memory area had a repeated single-bit error, which was corrected. For some NetServer systems, the DIMM in question is identified by its slot and board/bank number on the system board. The system corrects single-bit errors; however, multi-bit errors are not correctable and will result in the system halting.

For systems that support predictive failure of memory modules (such as the LC 2000, LH 4/r, LH 3000, LXr 8000 and LXr 8500), you should wait for a message indicating that the memory module is operating outside of acceptable margins (and therefore has a greater potential for a multi-bit error) before changing any memory modules.
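
To illustrate why a single flipped bit can be corrected transparently, the sketch below uses a textbook Hamming(7,4) code. This is an assumption chosen for illustration only: the source does not describe the ECC scheme used by NetServer memory, and real ECC memory uses wider codes that can also detect (but not correct) multi-bit errors. The function names encode and correct are hypothetical.

```python
def _syndrome(code):
    # XOR together the 1-based positions of all set bits.
    # A valid codeword yields 0; one flipped bit yields that bit's position.
    s = 0
    for pos in range(1, 8):
        if code[pos]:
            s ^= pos
    return s


def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (returned as a list)."""
    code = [0] * 8                      # index 0 is unused; positions are 1..7
    for pos, bit in zip((3, 5, 6, 7), data):
        code[pos] = bit                 # data bits go in the non-power-of-two positions
    s = _syndrome(code)
    for parity_pos in (1, 2, 4):
        if s & parity_pos:
            code[parity_pos] = 1        # set parity bits so the syndrome becomes 0
    return code[1:]


def correct(word):
    """Return (corrected codeword, error position); position 0 means no error."""
    code = [0] + list(word)
    s = _syndrome(code)
    if s:
        code[s] ^= 1                    # flip the single faulty bit back
    return code[1:], s
```

Flipping any one bit of an encoded word and passing it to correct returns the original codeword together with the position of the repaired bit. With two flipped bits, this simple code would repair the wrong position, which is why multi-bit errors have to be treated as uncorrectable.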

 

 

 

 

 

 


Battery Cycles

This message warns that the Battery Backup Unit (BBU) has undergone at least 475 charging cycles during its lifetime. The BBU should allow up to 500 deep discharge cycles. After 500 cycles, the battery may not charge properly. The BBU saves the RAID configuration information for the integrated Array adapter in case of server power failure. To prevent possible configuration data loss in the event of a server power failure, replace the BBU before it fails.
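
The two cycle counts above can be expressed as a simple check. This is a minimal sketch using only the 475 and 500 figures from the text; the function name bbu_cycle_status and the status labels are hypothetical and are not part of any HP tool.

```python
WARNING_CYCLES = 475   # this message is generated at or above this count
MAX_CYCLES = 500       # beyond this, the battery may not charge properly


def bbu_cycle_status(cycles):
    """Classify a BBU by its lifetime charging-cycle count."""
    if cycles >= MAX_CYCLES:
        return "replace now"
    if cycles >= WARNING_CYCLES:
        return "warning: plan replacement"
    return "ok"
```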

 

 

 

 

 

 


Battery Status

This message indicates a change in the state of the Battery Backup Unit (BBU). Possible causes, as indicated in the event message, may include:

Typically, no action need be taken unless this status message is followed up by an error.

 

 

 

 

 

 


Configuration Error for Processor Slot #n

This message indicates that a processor (CPU) in the server (indicated by the slot #n in the event message) has a configuration error. A configuration error may mean that either the processor slot is empty or there is a mismatch (for example, speed or processor type) between the processor that generated the error and other processors in the system.

Correct the problem at the server.
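
The two conditions described above (an empty slot, or a processor that differs from the others) can be sketched as a check over a slot inventory. This is an illustration under stated assumptions: the data layout (slot number mapped to a (type, speed) tuple, or None for an empty slot), the function name find_config_errors, and the choice of the most common configuration as the reference are all hypothetical and do not describe how the server firmware performs the check.

```python
from collections import Counter


def find_config_errors(slots):
    """Map each problem slot to "empty" or "mismatch".

    slots: dict of slot number -> (cpu_type, speed_mhz), or None if empty.
    """
    populated = [cpu for cpu in slots.values() if cpu is not None]
    # Assumption: the most common configuration is treated as the reference.
    reference = Counter(populated).most_common(1)[0][0] if populated else None
    errors = {}
    for slot, cpu in slots.items():
        if cpu is None:
            errors[slot] = "empty"
        elif cpu != reference:
            errors[slot] = "mismatch"
    return errors
```

For instance, with three identical 550 MHz processors in slots 1 to 3 and a 500 MHz processor in slot 4, only slot 4 is reported as a mismatch.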

 

 

 

 

 

 


CPU Management Controller Firmware Updated

This message is generated when the CPU Management Controller (CMC) firmware has been updated. This message is informational. No action need be taken.

 

 

 

 

 

 


FRU Internal Use Area Cleared

This message is generated when the Field Replaceable Unit (FRU) internal use area has been cleared due to invalid or corrupted data. This area contains pointers to the System Event Log (SEL), the Sensor Data Records (SDR), and miscellaneous data such as system power state. If you receive this message, it means that all previously collected event messages (recorded errors, system state changes, etc.) have been lost.

The system can continue to operate in this state and you will receive new event messages. However, the thresholds for system sensors (voltage, temperature, etc.) may no longer reflect the factory defaults. This could lead to spurious system event messages.

To prevent this, you must update the system BIOS to restore the SDR. This may be done by running the system BIOS update utility on the HP NetServer Navigator CD that came with your system, or obtaining the latest BIOS update utility from the HP NetServer website (www.hp.com/netserver).