MAP4970 SRCs that require next level of support or special repair actions

This MAP is called for SRCs that require special repair actions to be completed by the CE or next level of support. These conditions are not corrected by replacing FRUs, updating code, power-cycling the CEC, or power-cycling the storage facility.

MAP4970 Section-1

About this task

Most SRCs that send you to this MAP are not listed in Table 1, and you are required to contact your next level of support. SRCs not currently listed might be added to Table 1 if new service representative actions become available.

Procedure

Is your SRC listed in Table 1?
  • Yes, use the Action column to continue the repair. If the repair is not successful, contact your next level of support.
  • No, contact your next level of support now.
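The decision in this procedure can be sketched as a simple lookup. This is only an illustration: the function name and the dictionary are hypothetical, and the dictionary holds a small excerpt of Table 1 rather than the full table.

```python
# Hypothetical excerpt of Table 1; the real table has many more entries.
TABLE_1_EXCERPT = {
    "BE1E1405": "Retry Easy Tier instructions.",
    "BE1E2116": "Go to MAP4990 LIC feature license failure.",
    "BE340012": "Go to MAP4920 Handling a missing device adapter resource.",
}

def map4970_section_1(src, table=TABLE_1_EXCERPT):
    """Return the next step for an SRC, following the Yes/No question above."""
    action = table.get(src.upper())
    if action is None:
        # SRC is not listed in Table 1.
        return "Contact your next level of support now."
    # SRC is listed: use the Action column, with the documented fallback.
    return action + " If the repair is not successful, contact your next level of support."

print(map4970_section_1("BE1E1405"))
print(map4970_section_1("BE000000"))  # made-up SRC that is not in the excerpt
```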
Table 1. SRCs that require special repairs
SRC Description/Action
BE179000 Description: Management console was restarted to recover from a kernel panic.

Action: Contact next level of support.

Next level of support might request additional data for analysis.

BE198D22 Description: Error recovery microcode reporting of power-related errors is manually disabled for an unexpectedly long time.

Action: Contact next level of support.

Next level of support needs to remotely connect to the storage facility to correct the problem.

BE198D91 Description: Both management enclosure power supplies report an I2C communication failure with both RPC cards.

Action: Contact next level of support.

Next level of support needs to remotely connect to the storage facility to correct the problem.

BE19DEAD Description: Loss of AC or a Force Power Off caused Storage Facility to power off.

Additional details: If the serviceable event FRU list for BE19DEAD is empty, the storage facility lost external power and needs to be powered on. If the serviceable event FRU list for BE19DEAD contains MAP4970, the serviceable event text includes further details that are needed by next level of support to analyze why the power-off occurred.

Action: For BE19DEAD with an empty FRU list, power on the storage facility.
Refer to MAP1211 Controlling the storage facility power from the management console.
Customers can refer to "Powering a storage unit on and off (real-time only)" in the IBM® DS8000® Information Center.
For BE19DEAD with a FRU list that contains MAP4970, contact your next level of support.
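
The BE19DEAD decision above can be summarized in a short sketch. The function name and the representation of the FRU list as a list of strings are assumptions made for illustration; they are not part of the management console interface.

```python
def be19dead_action(fru_list):
    """Return the documented action for a BE19DEAD serviceable event.

    fru_list is a hypothetical list of the names shown in the event's FRU list.
    """
    if not fru_list:
        # Empty FRU list: the storage facility lost external power.
        return "Power on the storage facility (refer to MAP1211)."
    if "MAP4970" in fru_list:
        # The event text holds details that next level of support needs.
        return "Contact your next level of support."
    # Any other FRU list content is not described in this table entry.
    return "Case not described in Table 1."

print(be19dead_action([]))
print(be19dead_action(["MAP4970"]))
```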

BE1E1401
BE1E1402
BE1E1403
BE1E1404
Description: An error occurred in the operation of the Easy Tier® application. Each SRC contains a description of the specific condition detected. For BE1E1401, the application stopped.

Action: Collect PE Package by Area for Easy Tier (Easy Tier Debug Traces)
Storage Facility Management > Storage Facility > SF image > Data Collection Tasks > Off load Data by Area > Easy Tier (earlier versions)
Storage Facility Management > Storage Facility > SF image > Data Collection Tasks > Off load Data by Area
select Easy Tier Debug Traces (later versions)
Offload the data, then contact your next level of support, requesting that they contact development.

After the next level of support is contacted and data is offloaded to IBM, the serviceable event that contains SRC BE1E140x can be closed.

BE1E1405 Description: Customer-initiated Easy Tier instructions for directive data placement operations are lost because of persistent data recovery or data-saving errors.

Action: Retry Easy Tier instructions.

BE1E1406 Description: Insufficient free pool space to allow Easy Tier management.

Action: Reallocate free pool space to allow Easy Tier operations to complete.

BE1E2000 Description: A failover from one LPAR to another occurred.
Action:
  • If this serviceable event was created while a CEC repair was in progress:
    1. Click Close to exit this MAP.
    2. To the question "What was the result of using the service procedure?" select Problem fixed and then click Next.
    3. To the question "Did you exchange any parts?" select No then click Next.
    4. This serviceable event should now be in the closed state.
  • Is there a related open serviceable event containing CEC enclosure FRUs?
    • Yes. Repair the serviceable event.
    • No. Stop and call the next level of support.
BE1E2010
BE1E218C
Description: User requested fencing or delay-fencing action.

Additional details: This serviceable event can occur when the storage enclosure RAID controller card is not successfully activated during a repair of related storage enclosure or I/O enclosure FRUs, or an MES to install high-performance flash enclosures.

Action:

If the serviceable event that contains SRC BE1E2010 or BE1E218C occurred during an MES to install high-performance flash enclosures, go to "Action for MES." If the serviceable event that contains SRC BE1E2010 or BE1E218C occurred during a repair, go to "Action for repair."

Action for MES:

Look for other serviceable events that contain SRC BE14CFF5, BE327xxx, or BE34xxxx. If any exist, repair those serviceable events and then close the serviceable event that contains SRC BE1E2010 or BE1E218C.

If no serviceable events that contain SRC BE14CFF5, BE327xxx, or BE34xxxx exist, go to MAP4962 Storage enclosure PCIe cable analysis. After you complete the actions in MAP4962, close the serviceable event that contains BE1E2010 or BE1E218C.

Action for repair:

Retry the service action if the serviceable event that contains SRC BE1E2010 or BE1E218C was opened during a repair of one of the following components:
  • Storage enclosure midplane
  • Storage enclosure RAID controller card
  • I/O enclosure PCIe storage interface card (with PCIe cable)
  • I/O enclosure PCIe and SPCN card
  • I/O enclosure backplane assembly

If the repeated service action is successful and the storage enclosure RAID controller card is no longer fenced, you can close the BE1E2010/BE1E218C serviceable event that sent you here.

If the repeated service action fails, or if you were not repairing one of the listed components when the BE1E2010/BE1E218C occurred, contact your next level of support.
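
The "Action for MES" check uses SRC patterns such as BE327xxx and BE34xxxx. A minimal sketch of that matching follows, assuming that each "x" stands for a single hexadecimal digit and that the open SRCs are available as a list of strings. Both are assumptions for illustration, and the function names are hypothetical.

```python
import re

# SRC patterns named in "Action for MES"; "x" is treated as a wildcard.
MES_RELATED_PATTERNS = ["BE14CFF5", "BE327xxx", "BE34xxxx"]

def src_matches(src, pattern):
    """True if the SRC matches the pattern, with 'x' as one hex digit (assumed)."""
    regex = "".join("[0-9A-F]" if ch == "x" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, src.upper()) is not None

def mes_next_step(open_srcs):
    """Return ("repair", related SRCs) or ("MAP4962", []) per the MES text."""
    related = [s for s in open_srcs
               if any(src_matches(s, p) for p in MES_RELATED_PATTERNS)]
    if related:
        # Repair these events, then close the BE1E2010/BE1E218C event.
        return ("repair", related)
    # No related events: go to MAP4962, then close the event.
    return ("MAP4962", [])

print(mes_next_step(["BE1E2010", "BE340016"]))
print(mes_next_step(["BE1E2010"]))
```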

BE1E201A Description: Under certain circumstances, an LPAR might fail to IML if the other LPAR has the RPC lock and is down.

Action: Contact your next level of support to recover the machine properly.

BE1E2043
BE1E2044
BE1E204A
BE1E204B
BE1E2050
BE1E2051
BE1E2054
BE1E2060
Description: The LPAR code detected and is reporting a permanent error condition that is most likely caused by a hardware problem.

Action: Display and repair any serviceable events with hardware FRUs for the CEC enclosures, then close the serviceable event that sent you here.

BE1E2116 Description: XC Miscompare

Action: Go to MAP4990 LIC feature license failure.

BE1E2162 Description: A recoverable AIX® crash occurred because microcode detected a Configuration State Machine timeout.

Action: The AIX crash recovers the condition that caused the timeout. PFE should collect AIX crash memory dump for development analysis and close the serviceable event.

BE1E2163 Description: An unrecoverable Configuration State Machine timeout condition occurred.

Action: Recovery requires a Disruptive Service Action window and a Complete Power Down. Consult your next level of support for this storage model. The storage facility images need to be shut down and then brought back to an operational state. PFE should collect statesave data and PE package for development analysis.

BE1E2169 Description: A low cache memory condition was detected by an LPAR cache during IML. This is an informational serviceable event that reports that cache operations might be affected because some LPAR memory might not be available.

Action: Display and repair open serviceable events that are related to CEC enclosure memory or processor errors. If none is found, contact your next level of support.

BE1E2171 Description: The storage facility image was warm-started because a Global Mirror session did not complete.

Action: Contact your next level of support. Offload LPAR statesaves that are generated at the time of the BE1E2171 serviceable event for the next level of support analysis.

BE1E2172
BE1E2173
Description: An error occurred in the operation of the I/O Priority Manager application. The application stopped. Each SRC contains a description of the specific condition detected.

Action: Collect PE Package by Area for I/O Priority Manager
Storage Facility Management > Storage Facility > SF image > Data Collection Tasks > Off load Data by Area > I/O Priority Manager
Offload the data, then contact your next level of support, requesting that they contact development.

After next level of support is contacted and data is offloaded to IBM, the serviceable event that contains SRC BE1E217x can be closed.

BE1E218C See the actions that are listed in the descriptions of SRCs BE1E2010 and BE1E218C, together.
BE1E250B Description: CEC enclosure PHYP forced slot into freeze during upstream fabric error recovery.

Action: Display and repair any open serviceable events that are related to this I/O enclosure including its host adapter cards. If none is found, contact your next level of support.

BE1E2523 Description: Fire-hose dump performed because of EPOW3 event.

Additional details: Possible causes include loss of external AC or detected over-temperature.

Action: Display open serviceable events. Select the condition that applies:
Note: An open serviceable event with SRC BE191F10 can be closed after the customer is aware the storage facility was powered on. SRC BE191F10 is for customer notification only. Do not consider BE191F10 as open when you select from the following conditions; treat it as "closed."
  • If the only open serviceable event has SRC BE1E2523, close it and exit this map.
  • If the only open serviceable events have SRCs BE1E2523 and BE19DEAD (BE19DEAD with an empty FRU list), close them and power on the storage facility.
    Further details are available under SRC BE19DEAD in this table.
  • If the only open serviceable events have SRCs BE1E2523 and BE19DEAD (BE19DEAD with a FRU list that contains MAP4970), contact your next level of support.
    Further details are available under SRC BE19DEAD in this table.
  • If there are open serviceable events for SRCs other than BE1E2523 or BE19DEAD, contact your next level of support.
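
The BE1E2523 conditions can be read as a small decision function. The sketch below assumes that open serviceable events are given as (SRC, FRU list) pairs. That data shape, the function name, and the constant names are hypothetical and only serve the illustration.

```python
CLOSE_AND_EXIT = "Close the BE1E2523 event and exit this MAP."
CLOSE_AND_POWER_ON = "Close the events and power on the storage facility."
CONTACT_SUPPORT = "Contact your next level of support."
NOT_DESCRIBED = "Combination not described in this table entry."

def be1e2523_action(open_events):
    """open_events: list of (src, fru_list) pairs for open serviceable events."""
    # BE191F10 is customer notification only; treat it as closed.
    events = [(src, frus) for src, frus in open_events if src != "BE191F10"]
    srcs = {src for src, _ in events}
    if srcs - {"BE1E2523", "BE19DEAD"}:
        return CONTACT_SUPPORT
    dead_fru_lists = [frus for src, frus in events if src == "BE19DEAD"]
    if not dead_fru_lists:
        return CLOSE_AND_EXIT
    if any("MAP4970" in frus for frus in dead_fru_lists):
        return CONTACT_SUPPORT
    if all(len(frus) == 0 for frus in dead_fru_lists):
        return CLOSE_AND_POWER_ON
    return NOT_DESCRIBED

print(be1e2523_action([("BE1E2523", []), ("BE191F10", [])]))
print(be1e2523_action([("BE1E2523", []), ("BE19DEAD", [])]))
```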
BE1E2524 Description: A service processor reset caused a system power off. A system power control network (SPCN) might be failing. The SPCN includes the CEC enclosure service processor card, the I/O enclosure PCIe/SPCN card, and the SPCN cables.

Action: Find and repair any open serviceable events for SRC 1xxx00AD. If none is found, see TWRCARD for servicing SRC 1xxx00AD.

BE1E2543 Description: This SRC indicates missing slots in the device tree. This CEC has fewer reserved I/O enclosure slots than the other CEC. This might be a result of a failure in an I/O enclosure.

Action: If you have LIC bundle VRMF 64.2.xx.xx, or later, and you have open serviceable events with I/O enclosure FRUs, repair those serviceable events. Otherwise, contact your next level of support.

BE1E2551 Description: A CEC failed and requires a power cycle, but microcode cannot do it automatically, so it requires a service action.

Action: To recover from this condition, complete a pseudo-repair of a CEC enclosure fan in the affected CEC. Disconnect the black input power cables when directed, but it is not necessary to remove any parts. Leave the input power cables disconnected for 30 seconds, then reconnect them and continue with the pseudo-repair.

BE1E255A Description: An adapter discovery failed because an incorrect or unexpected adapter card was plugged into the slot.
Additional details: Any of the following might have occurred:
  • A host adapter card might be plugged into a device adapter slot, or the reverse.
  • The card that is inserted might be the incorrect speed, connection type, or number of ports, for the slot used.

Action: For DS8000 I/O enclosure slots, the slot power is managed automatically when the card is plugged or unplugged.

If the serviceable event is generated during an MES process to install the adapter, refer to the adapter installation instructions to identify the type of service action to resolve this serviceable event. If the adapter installation fails again, contact your next level of support.

If the serviceable event is logged during a repair process, verify that the adapter card type matches the FRU description and CCIN listed in the serviceable event and is inserted in the correct slot. If the repair action fails again, contact your next level of support.

Note: In situations where a different type of adapter is to be installed in the slot, two separate operations are required:
  1. The original adapter must first be removed (generally requires an RPQ).
  2. The new adapter is installed with an MES process.
BE1E255B Description: I/O adapter could not be discovered. Verify that the adapter is inserted into the correct slot.

Action: For DS8000 I/O enclosure slots, the slot power is managed automatically when the card is plugged or unplugged.

  • If the serviceable event is generated during an MES process to install the adapter, refer to the adapter installation instructions to resolve this serviceable event.
  • If the serviceable event is logged during a repair process, retry the service action with a new FRU. If the repair action fails again, contact your next level of support.
  • Otherwise, continue to repair this serviceable event by advancing to the next FRU in the FRU list.

The following recovery actions are recommended if a serviceable event is generated during an installation MES service action or repair process:

  • If the card was plugged in the wrong slot, when directed to insert/replace the FRU, unplug the card from the wrong slot and insert the card into the correct slot.
  • If the card was plugged in the correct slot, the card might be faulty. Order a new card and retry the service action. If the service action continues to fail, the card slot might be failing. Replace the I/O enclosure backplane assembly FRU. Refer to MAP1230 Replace a FRU without using a serviceable event.
BE1E257D Description: Page fault detected by microcode.

Action: Contact your next level of support. Offload LPAR statesaves that are generated at the time of the BE1E257D serviceable event for the next level of support analysis.

BE1E3006 Description: One Shot Panic.

Action: Call your next level of support. Next level of support should collect cpss dump information for analysis and close the serviceable event.

BE1E3009 Description: One Shot Panic ODD.

Action: Call your next level of support. Next level of support should collect ODD dump for development analysis and close the serviceable event.

BE1E8554 Description: Incorrect number of BSM (battery service modules) reported.

Action: Contact your next level of support. They need to remotely access the storage facility.

BE1EFC0A
BE1EFC0B
Description:
BE1EFC0A: A volume that is host addressable has become fenced (FC-08 state, subtype 04).
BE1EFC0B: A volume that is not host addressable (safeguarded backup capacity) has become fenced (FC-08 state, subtype 05).

Additional details: Volume(s) of customer data have been fenced (FC-08 state) and are not accessible. Manual intervention is required to recover access to these volume(s).

Action: Contact your next level of support for specific guidance. Failure to do so may cause unnecessary customer data loss.

BE30040F Description: This SRC indicates that one half of a mirrored array pair is not accessible and is in a degraded state. The most likely cause of this failure is an FC-AL loop issue.

A mirrored array has an array on one FC-AL loop and an array that is a mirror copy on a separate FC-AL loop. They are redundant to each other. Attempt to repair any FC-AL loop-related issues first. The FRUs for the FC-AL loop are the I/O enclosure device adapter card, the storage enclosure FC-IC card, and the FC-AL cable. After the FC-AL loop is repaired, access to the failing copy of the array is restored, and an automatic background process restores the array to a normal state.

Action: The serviceable event that sent you here is only to provide the above information to help you prioritize the repair. If this serviceable event is not automatically closed when the other repairs are complete, manually close the serviceable event.

BE304088 Description: An array is in the Offline state because there are insufficient drives in the array to maintain data availability.

Additional details: One possible cause is a single storage enclosure that is completely powered off. In this state, the drive LEDs and power LED at the front of the enclosure will be out. Input power LEDs at the rear of the affected enclosure might or might not be lit.

Action:

Visually inspect the system, looking for a storage enclosure with the front power LED and all drive LEDs not lit. If any enclosures are found in this state, note the location of the affected enclosures and notify your next level of support.

In all cases, contact your next level of support to recover the offline array. The next level of support might request you to upload a PE package and adapter dumps for the cards that report the offline array. The next level of support might request that the DDMs listed in other SRCs are repaired first.
Note: Even though related serviceable events might need to be repaired, repairing them does not bring the array back online.
BE304090 Description: The array is offline because the device adapter has full APU.

Action: Contact the next level of support.

BE3040C6
BE3040C8
BE3040CC
Description: An array is in the Exposed, Degraded, or Critical state because a DDM is not available to the array. Multiple DDMs might need to be replaced in this FCAL loop.
Action: Display open serviceable events. Are there any open serviceable events with SRC BE304100?
  • Yes. First, repair the DDMs in the FRU list for BE304100 serviceable events, then repair any remaining failed DDMs.
  • No. First, repair the DDM listed in the FRU list for the serviceable events that referred to this MAP. Then repair any other failed DDMs on the FCAL loop that is listed in other open serviceable events.
BE32023C Description: Maximum storage enclosures exceeded on a loop of storage enclosures.

Action: Correct the configuration by removing the excess storage enclosures.

BE32023D Description: Inconsistent storage enclosure topology detected.

Action: Contact your next level of support.

BE32023E Description: Two storage enclosures were found with the same WWNN identifier.

Action: Contact your next level of support. Correct the configuration by providing a unique WWNN identifier in each storage enclosure.

BE32023F Description: Invalid storage enclosure topology configuration detected.

Action: Contact your next level of support.

BE32025D Description: No Acknowledge received from the storage enclosure Programmable Interface Controller.

Action: Contact your next level of support.

BE32025E Description: Timeout on command with the storage enclosure Programmable Interface Controller.

Action: Contact your next level of support.

BE32029B Description: Storage enclosure program encountered an error (assert).

Action: Contact your next level of support.

BE32029C Description: Storage enclosure program encountered an error (exception ISR).

Action: Contact your next level of support.

BE3202B4 Description: Firmware level mismatch detected between two Enclosure Controller Modules within the same storage enclosure.

Action: Contact your next level of support. Upgrade the firmware in the appropriate Enclosure Controller Module.

BE3202BE
BE3202BF
BE3202C0
BE3202C1
BE3202C2
BE3202C3
Description: Configuration setting mismatch detected between two Enclosure Controller Modules within the same storage enclosure. The two Enclosure Controller Modules in a storage enclosure have settings that do not match. Example values that might be mismatched are:
  • BE3202BE: LIP Enable
  • BE3202BF: Port Speed
  • BE3202C0: Port Type
  • BE3202C1: Port Enable
  • BE3202C2: Zone Enable
  • BE3202C3: Blocking ARB

Action: Contact your next level of support.

BE3202C4 Description: Storage enclosure Fibre Channel interface card exceeded the threshold for recoverable data transfer Context Full errors. The errors continue to be recovered and the card continues to function. Statesaves were forced when this serviceable event was opened.

Action: Contact your next level of support. The statesaves need to be analyzed. The second FRU in the FRU list is the location code of the affected Fibre Channel interface card. Do not replace the card unless directed by the next level of support.

BE3202C5 Description: Storage enclosure Fibre Channel interface card exceeded the threshold for recoverable Context Engine Lock Up errors. The errors continue to be recovered and the card continues to function. Statesaves were forced when this serviceable event was opened.

Action: Contact your next level of support. The statesaves need to be analyzed. The second FRU in the FRU list is the location code of the affected Fibre Channel interface card. Do not replace the card unless directed by the next level of support.

BE322137 Description: A DDM is not seated properly.

Action: Perform a repair of the DDM that is included in the FRU List. However, instead of replacing the DDM, reseat this DDM.

BE322166 Description: A DDM that was expected to be configured as a SAS port was set to Fibre Channel mode.

Action: Contact your next level of support. Configure the DDM in the FRU Callout list to be in SAS mode.

BE33CE7B Description: One or more logical volumes, space-efficient logical volumes, or space-efficient repositories were detected to be in a failed state.

Action: Contact your next level of support to run the applicable DA utility to view the specific volumes or repositories that are in the failed state.

After it is known which volumes or repositories are failed (fenced), either a DA utility can be used to unfence the volumes or repositories, or the customer can delete the failed volumes or repositories and re-create them.

BE33CE9B Description: Device adapter datapath error: Loss of access occurred. Corrective action attempted against last available device adapter of a device adapter pair.

Additional details: A loss of access occurred through both members of a device adapter pair. A reset was completed and did not recover either path. The physical location of the last device adapter to go offline appears in the FRU location codes in the serviceable event that called MAP4970. Assistance is needed from your next level of support.

Action: Contact your next level of support.

BE33CE9C Description: Device adapter datapath error: Temporary loss of access occurred. Corrective action taken against last available device adapter of a device adapter pair.

Additional details: A loss of access occurred through both members of a device adapter pair. A reset recovered at least one path, and further analysis is needed to determine whether further recovery is needed. The symbolic location of the last device adapter to go offline appears in the FRU location codes in the serviceable event that called MAP4970.

Action:
  1. Offload PE packages and statesaves from the time of the failure for further analysis.
  2. Run "View Storage Facility State" ("end of call"); refer to MAP1100 View storage facility state (end of call).
    Look for fenced device adapters and for open serviceable events that list device adapters as FRUs.
  3. Display open serviceable events. Select the condition that applies:
    • If the only open serviceable event has SRC BE33CE9C (the serviceable event that called MAP4970), close it and then continue to the next step.
    • If there are open serviceable events for SRCs other than BE33CE9C (the serviceable event that called MAP4970), repair them and then continue to the next step. Repairing these other serviceable events should restore both paths through the device adapter pair, if the reset did not already recover both paths.
  4. Notify your next level of support that data from this event is offloaded for further analysis. A second PE package, taken after recovery is complete, is suggested.
BE33CED6 Description: Easy Tier volume migration error: migration timeout. The symbolic FRU location code in this serviceable event displays the volume ID and pool ID. The volume ID is the volume that encountered the migration timeout problem. The pool ID is the pool that the volume is migrating to.

Action: Collect PE Package by Area for Easy Tier (Easy Tier Debug Traces)
Storage Facility Management > Storage Facility > SF image > Data Collection Tasks > Off load Data by Area > Easy Tier (earlier versions)
Storage Facility Management > Storage Facility > SF image > Data Collection Tasks > Off load Data by Area
select Easy Tier Debug Traces (later versions)
Offload the data, then contact your next level of support, requesting that they contact development.

After the next level of support is contacted and data is offloaded to IBM, the serviceable event that contains SRC BE33CED6 can be closed.

BE33CEE3 Description: Customer-only notification that extended disk recovery occurred. System performance might be impacted. No service action required.

Action: No action is required. The serviceable event that contains SRC BE33CEE3 is a customer notification only. The serviceable event is not called home, and automatically closes when customer notification completes.

BE34000D Description: The Storage Enclosure harvest process detected a change in at least one critical attribute. This might be due to improper FC-AL cabling between the DA card and the first storage enclosure.

Action: Contact your next level of support. They look up the procedure for this SRC and remotely connect to the storage facility to correct the problem.

BE340012 Description: The resource manager harvest process found one or more missing resources.

Action: Go to MAP4920 Handling a missing device adapter resource.

BE340016 Description: DDM object mismatch detected.

Additional details: This serviceable event can occur when one or more drives are placed in the wrong location during a storage enclosure repair.

Action:
  1. If the BE340016 occurred during a storage enclosure midplane repair, continue to the next step. Otherwise, contact your next level of support.
  2. Display open serviceable events. Are there any other open serviceable events with SRC BE340016 that refer to drives in the storage enclosure that was just repaired?
    • Yes. Note the locations of all drives referred to in the BE340016 serviceable events. Retry the repair of the storage enclosure midplane as a pseudo-repair. When directed to replace the midplane, instead remove affected drives and place them in their correct locations.
    • No. Contact your next level of support.
BE340019 Description: Spare process failed to get device information from resource manager.

Action: If the BE340019 serviceable event surfaced from a failed DDM or SSD repair attempt, and a second repair attempt was successful, you can close the serviceable event that sent you here. Otherwise, contact your next level of support.

BE340021 Description: Device object mismatch detected in RM object(s).

Action: Wait 12 hours to allow automatic recovery, then close the BE340021 serviceable event. If the BE340021 reoccurs after closing, contact your next level of support.

BE340026 Description: Device Health check attempted to fail a rejected, bypassed, or unknown DDM, but was unable to determine the location code. The DDM might not be fully installed or the DDM object might be missing.

Action: Go to MAP4920 Handling a missing device adapter resource.

BE340027 Description: Service action detected a bypassed or empty DDM slot and was unable to return it to normal use.
Action: If you are repairing a DDM:
  1. Display and repair any related open serviceable events that were created during the DDM repair.
  2. If the original serviceable event that sent you here is still open, use it to retry the repair.
If you are installing a DDM MES:
  1. Reinstall the DDM in the DDM slot and then retry the DDM MES installation.
  2. If the DDM MES installation still fails, try installing a different DDM in the DDM slot. Retry the DDM MES installation.
  3. If the DDM MES installation still fails, try to complete the installation anyway. After the MES installation is complete, there should be an open serviceable event for the failing DDM. Repair it then. If there is not an open serviceable event, refer to MAP1100 View storage facility state (end of call) to display and repair any related error conditions. If you are not sure, call your next level of support.
BE340035 Description: The CDA drive firmware update process was aborted because of a system error. The drive might still be good.

Action: Look for related SRCs to verify the drive has not been failed by another process.

If there are no related SRCs:
  1. The drive is in good condition.
  2. Exit this MAP.
  3. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  4. Close the SRC that brought you to this MAP.
  5. Fix all other outstanding issues before restarting the CDA drive firmware update process.
If there are related SRCs that require the drive to be replaced:
  1. Exit this MAP.
  2. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  3. Close the SRC that brought you to this MAP.
  4. Repair the drive through the related SRC.
  5. Restart the CDA drive firmware update process.
BE340040 Description: Health check found fenced adapter.

Action: Display and repair other serviceable events with the same adapter as listed in this FRU list, then close the serviceable event that sent you here. If there are no other serviceable events found, contact your next level of support.

BE340075 Description: Disk could not be deconfigured because data loss would result.

Action: Display and repair other serviceable events with failed drives in the FRU list, then close the serviceable event that sent you here. If there are no other serviceable events found, contact your next level of support.

BE340076 Description: Disk could not be deconfigured because I/O enclosure device adapter was being reset.

Action: Wait a maximum of 5 minutes, then retry the repair action on the disk from the serviceable event that sent you here. If the retry fails, contact your next level of support.

BE34007A Description: The CDA drive firmware update process was aborted because of a fatal drive error.

Action: Look for related SRCs.

If there is a related SRC:
  1. Exit this MAP.
  2. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  3. Close the SRC that brought you to this MAP.
  4. Repair the drive using the related SRC.
  5. Restart the CDA drive firmware update process.
If there is no related SRC:
  1. Exit this MAP.
  2. Follow the usual repair procedures using the SRC that brought you here.
BE3400B5 Description: DDMs are being installed as part of a capacity upgrade MES. The DDM installation failed because one or more DDM slots that are not being installed had either slot errors or unexpected DDMs present.

The DDMs are installed sequentially starting with DDM slot 1. If you were installing the first four DDMs in slots 1-4, and the installation code detected a slot error or DDM present in slots 5-16 that was unexpected, the BE3400B5 serviceable event would be created. The same failure would occur if there were DDMs in slots 1-4 and you were installing DDMs in slots 5-8. Any slot errors or DDMs in slots 9-16 would be unexpected and the BE3400B5 serviceable event would be created.

Action: Verify the HMC screen selections that you made and compare them to the slots that contain the new SSD DDMs. Correct the condition and retry the service action. If the installation fails again, contact your next level of support.
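
The BE3400B5 examples above can be expressed as a slot check. The sketch below assumes 16 slots per enclosure (the count used in the examples) and only checks slots beyond the last slot being installed, because that is the case the examples describe. The function name and the sets of slot numbers are hypothetical.

```python
TOTAL_SLOTS = 16  # slot count taken from the examples in the text

def unexpected_slots(installing, present, errored):
    """Return slots beyond the install range that hold a DDM or report an error.

    installing: slot numbers selected for this MES (hypothetical input)
    present:    slot numbers where a DDM is detected
    errored:    slot numbers that report a slot error
    """
    last = max(installing)
    beyond = range(last + 1, TOTAL_SLOTS + 1)
    return [slot for slot in beyond if slot in present or slot in errored]

# Installing slots 1-4 with a stray DDM in slot 7 and a slot error in slot 12.
print(unexpected_slots({1, 2, 3, 4}, present={1, 2, 3, 4, 7}, errored={12}))
```

A non-empty result corresponds to the condition that creates the BE3400B5 serviceable event in the examples.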

BE3400C3 Description: The HPFE installation failed due to a faulty Enclosure Services Module (ESM).
Action:
  1. Exit this MAP.
  2. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  3. Close the SRC that brought you to this MAP.
  4. Hotswap the ESM and reconnect the cables.
    Note: The provided location code of the FRU is the physical location in the form Rx-Fyy-P1-Cn; where x = rack number, yy = enclosure slot number, and n = ESM position (1=left, 2=right).
  5. Retry the HPFE installation process.
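
The location code format in the note above (Rx-Fyy-P1-Cn) can be decoded as in the following sketch. The function name and the returned field names are hypothetical. The pattern assumes that the rack number is one or more digits and that the enclosure slot number is exactly two digits; the note does not state digit counts, so adjust the pattern if real codes differ.

```python
import re

# Assumed shape: R<rack>-F<two-digit enclosure slot>-P1-C<1 or 2>
_ESM_LOCATION = re.compile(r"^R(\d+)-F(\d{2})-P1-C([12])$")


def parse_esm_location(code):
    """Split an ESM location code into rack, enclosure slot, and ESM side."""
    match = _ESM_LOCATION.match(code)
    if match is None:
        raise ValueError(f"not an ESM location code: {code!r}")
    rack, slot, position = match.groups()
    return {
        "rack": int(rack),
        "enclosure_slot": int(slot),
        # Per the note: 1 = left ESM, 2 = right ESM
        "esm_side": "left" if position == "1" else "right",
    }


print(parse_esm_location("R1-F05-P1-C2"))
```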
BE3400D4 Description: Storage enclosure midplane repair encountered a missing drive. This could be caused by an unplugged drive, a failed drive, or a faulty midplane.
Action: It might not be necessary to replace hardware to correct this condition. Take the following action:
  1. Do a pseudo-repair of the drive. If the repair action fails, go to the next step.
  2. Replace the drive and repair. If the repair fails with the new replacement drive, contact your next level of support.
BE3400F7 Description: Device adapter last path fenced: Loss of access to data.
Both device adapters in a pair were found to be fenced.

Action: Contact your next level of support.

BE3400FF Description: One or more flash drives are in an unexpected state after a rank remove process.

Action: A pseudo-repair of the affected drives might recover the problem. Otherwise, contact your next level of support.

BE340114 Description: HPFE midplane repair encountered a miscabling error.
Action:
  1. Exit this MAP.
  2. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  3. Is there a new open serviceable event with one of the following SRCs: BE3400EB, BE340104, BE340105, BE340112?
    • Yes. Close the serviceable event that sent you to this MAP. Use the new serviceable event to continue this repair.
    • No. Go to step 4.
  4. Use the SAS cable location code labels to ensure they are properly connected.
    Was an error found?
    • Yes. Repeat the midplane repair procedure, but do not replace the midplane; instead correct the cabling error.
    • No. Stop and call the next level of support.
BE340115 Description: HPFE midplane repair encountered a RAID controller topology error.
Action:
  1. Exit this MAP.
  2. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  3. Is there a new open serviceable event with SRC BE3400EE?
    • Yes. Close the serviceable event that sent you to this MAP. Use the new serviceable event to continue this repair.
    • No. Go to step 4.
  4. Use the SAS cable location code labels to ensure they are properly connected.
    Was an error found?
    • Yes. Repeat the midplane repair procedure, but do not replace the midplane; instead correct the cabling error.
    • No. Stop and call the next level of support.
BE340120 Description: An enclosure services manager (ESM) repair action failed to enable at least one device bus.
Action:
  1. Exit this MAP.
  2. To the question "What was the result of using the service procedure?" select Delay the repair and click Next.
  3. Close the serviceable event that sent you to this MAP.
  4. Repeat the ESM repair procedure, but do not replace the ESM.
  5. Did the repair action in step 4 succeed?
    • Yes. No further action is necessary.
    • No. Stop and call the next level of support. It might be necessary for NLS to manually reset the parent RAID controller cards.
BE344440 (Model 951) Description: The storage enclosure is detected as being single-port active. The second FRU listed in the serviceable event identifies the FCIC that was detected as single-port active.
Action: It might not be necessary to replace hardware to correct this condition. Take the following action:
  1. Perform a pseudo-repair of the FCIC identified in the FRU List. No disconnecting of hardware (cards or cables) is necessary. If this pseudo-repair succeeds, then no further action is necessary.
  2. If the pseudo-repair of the FCIC fails, then complete a real repair of the FCIC identified in the FRU List.
BE34CA11 Description: Do not attempt to replace any FRUs or take any other repair actions.

Action: Contact your next level of support.

BE360010
BE360012
Description: The host adapter harvest process found missing devices or devices with mismatched host adapter attribute values.

Action: Go to MAP4910 Handling a missing or failing host adapter resource.

BE360014 Description: The host adapter harvest process detected that a host adapter part number or CCIN value is invalid.
Action:
  1. Call next level of support to re-harvest host adapter information.
  2. Did the harvest process log a BE360014 serviceable event again?
    • Yes. Go to the next step.
    • No. Exit this MAP and close the serviceable event that sent you to this MAP.
  3. Repair the host adapter card that is listed in the FRU list. If the FRU is repaired successfully, close all open serviceable events. Otherwise, call next level of support.
BE360017 Description: Host adapter card is not a valid candidate for upgrade to a higher port speed host adapter card of the same card type (LW/SW).

Action: Retry the remove host adapter operation but select a different 4-port 8 Gb/s host adapter card for upgrade.

BE360018 Description: Host adapter card type mismatch detected during upgrade to a higher port speed host adapter card of the same type (LW/SW).
Action: Remove the higher speed host adapter with incorrect card type and install the higher speed host adapter card with correct card type (LW/SW).
Note: When you retry the install host adapter MES process, a host adapter upgrade confirmation screen is displayed. When prompted, click Next to continue with the upgrade operation.
BE360019 Description: Host adapter port topology configuration failed during upgrade to a higher port speed host adapter card of the same type (LW/SW).

Action: Check for other related open serviceable events and repair them. If no other related open serviceable events are present, contact your next level of support for recovery assistance. After all open serviceable events are resolved, contact the customer to configure port topology on the affected I/O ports.

BE360020 Description: Hourly health check detected that a higher port speed host adapter card was not installed during the upgrade to a higher speed host adapter card of the same type (LW/SW).

Action: Retry the host adapter upgrade operation by using the correct higher speed host adapter card with the expected card type (LW/SW).

BE360021 Description: A higher port speed host adapter was not installed during the upgrade to a higher port speed host adapter card of the same type (LW/SW).

Action: Retry the host adapter upgrade operation by using the correct higher speed host adapter card with the expected card type (LW/SW).

BE360022 Description: A lower port speed host adapter with incorrect number of ports was installed during the upgrade to a higher port speed host adapter card of the same type (LW/SW).
Action: Remove the lower port speed host adapter with incorrect number of ports and retry installation with higher port speed host adapter card with expected card type (LW/SW).
Note: When you retry the installation of the host adapter MES, a host adapter upgrade confirmation screen is displayed. When prompted, click Next to continue with the upgrade operation.
BE360023 Description: Hourly health check process detected that an incorrect host adapter type or host adapter card type was used during the upgrade to a higher port speed host adapter card of the same type (LW/SW).
Action: Remove the incorrect host adapter and retry the installation with higher port speed host adapter card with expected card type (LW/SW).
Note: When you retry the installation of the host adapter MES, a host adapter upgrade confirmation screen is displayed. When prompted, click Next to continue with the upgrade operation.
BE362604 Description: The host adapter card could not be discovered or configured even though the owning PCI slot is powered on.

Action: Verify that the card is seated correctly. If the card is seated correctly, retry the service action. If the service action continues to fail, the host adapter card might be defective and needs to be replaced.

BE36260E Description: Host adapter install MES operation failed because installation of host adapter exceeds the maximum number of I/O ports allowed in an I/O bay.
Additional details:
  • Maximum number of I/O ports that can be configured per I/O bay is 16.
  • This serviceable event is generated when the SSR attempts to install a host adapter into an empty I/O bay slot, but the installation fails because 16 I/O ports (the maximum allowed) are already configured within the I/O bay boundary.

Action: Verify that the card is seated correctly. If the card is seated correctly, retry the service action. If the service action continues to fail, the host adapter card might be defective and needs to be replaced.
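
The port limit behind BE36260E amounts to simple arithmetic, sketched below. This is an illustration only: the function name and inputs are hypothetical, and how the microcode actually counts configured ports is not described here. The only value taken from the text is the limit of 16 I/O ports per I/O bay.

```python
MAX_PORTS_PER_BAY = 16  # stated maximum of configured I/O ports per I/O bay


def install_fits(configured_ports, new_adapter_ports):
    """Return True if adding the new adapter keeps the bay within the port limit."""
    if configured_ports < 0 or new_adapter_ports <= 0:
        raise ValueError("port counts must be positive")
    return configured_ports + new_adapter_ports <= MAX_PORTS_PER_BAY


# A bay with 12 configured ports can take one more 4-port adapter.
print(install_fits(12, 4))
# A bay that already has 16 configured ports cannot take another adapter.
print(install_fits(16, 4))
```

Counting the ports that are already configured in the target I/O bay before starting the MES shows whether the installation can fit.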

BE370000 Description: The PCI slot could not be powered on.

Action: Verify that the card is seated properly. If this serviceable event is generated during an Adapter Install MES service action, verify that the card is installed in the correct slot. If the card is seated in a wrong slot, contact next level of support for procedures to remove and reinstall the card in the correct slot.
If the card is seated in the correct slot, retry the service action. If the service action continues to fail, the card might be defective and needs to be replaced. If the service action fails even after you insert a new card, the I/O system board might be defective and needs to be replaced.

BE370001 Description: The PCI slot could not be powered off.

Action: Verify that the card is seated properly. If the card is seated properly, retry the service action. If the service action continues to fail, the I/O system board might be defective and needs to be replaced.

BE370003 Description: I/O adapter could not be discovered or configured.

Action: Verify that the adapter is seated properly. If the card is seated properly, retry the service action. If the service action continues to fail, replace the adapter card and retry the service action.

BE382580 Description: Host adapter discovery/configuration failure: Fibre Channel Host Card 8 Gb plugged into wrong slot.
Additional details: Any of the following might have occurred:
  • A Fibre Channel Host Card was replaced with a card that does not match the speed, connection type, or number of ports.
  • A Fibre Channel Host Card was inserted into a slot that does not support the speed, connection type, or number of ports.
  • A Fibre Channel Host Card was inserted into a storage facility that does not support it.

Action: Refer to the actions for SRC BE1E255A.

BE382581 Description: Host adapter discovery/configuration failure: The rack does not support the I/O enclosure Fibre Channel 4 Gb host card.
Additional details: Any of the following might have occurred:
  • A Fibre Channel Host Card was replaced with a card that does not match the speed, connection type, or number of ports.
  • A Fibre Channel Host Card was inserted into a slot that does not support the speed, connection type, or number of ports.
  • A Fibre Channel Host Card was inserted into a storage facility that does not support it.

Action: Refer to the actions for SRC BE1E255A.

BE382582 Description: Device adapter discovery/configuration failure: The rack does not support the I/O enclosure copper device adapter card.
Additional details: Any of the following might have occurred:
  • A copper device adapter card was replaced with a card that does not match the speed, connection type, or number of ports.
  • A copper device adapter card was inserted into a slot that does not support the speed, connection type, or number of ports.
  • A copper device adapter card was inserted into a storage facility that does not support it.

Action: Refer to the actions for SRC BE1E255A.

BE382583 Description: Device adapter discovery/configuration failure: The rack does not support the I/O enclosure optical device adapter card.
Additional details: Any of the following might have occurred:
  • An optical device adapter card was replaced with a card that does not match the speed, connection type, or number of ports.
  • An optical device adapter card was inserted into a slot that does not support the speed, connection type, or number of ports.
  • An optical device adapter card was inserted into a storage facility that does not support it.

Action: Refer to the actions for SRC BE1E255A.

BE382584 Description: A Fibre Channel Host card was discovered in a slot that does not support it, is not the type that is expected, or it does not match the card that it is replacing.
Additional details: Any of the following might have occurred:
  • A Fibre Channel Host Card was replaced with a card that does not match the speed, connection type, or number of ports.
  • A Fibre Channel Host Card was inserted into a slot that does not support the speed, connection type, or number of ports.
  • A Fibre Channel Host Card was inserted into a storage facility that does not support it.

Action: Refer to the actions for SRC BE1E255A.

BE392840 Description: Host adapter panic occurred.

Action: Contact your next level of support. Offload LPAR statesaves that are generated at the time of the BE392840 serviceable event for the next level of support analysis.

BE3A2910
BE3A2911
BE3A2912
Description:
BE3A2910: zHyperLink initialization sequence error: timeout with host.
BE3A2911: zHyperLink initialization sequence error: incorrect information exchanged with host.
BE3A2912: zHyperLink initialization sequence error: not enough processing threads available to complete initialization.

Additional details: These SRCs are reported when the DS8000 cannot complete an initialization sequence (also known as a "handshake") with the host that is attached to the zHyperLink port. The cause of the errors might be within the zHyperLink host, the DS8000, or the connections between them.

Action: Contact your next level of support. The diagnosis of these errors might require assistance from both host and DS8000 support.

BE820115
BE820136
BE82013A
BE820152
BE820156
Description: An I2C protocol error has been detected on an I2C bus that crosses multiple FRUs within a storage enclosure.

Additional details: I2C bus errors that cross multiple FRUs are challenging to isolate to a single root cause.
There might be other serviceable events reported that have a higher confidence in FRU priority.

Action: Display and repair any open serviceable events that are related to this storage enclosure and involving the following components:
  • Storage enclosure RAID controller card (either one)
  • Storage enclosure midplane
  • Storage enclosure power supply unit
    Note: BE820115 does not involve a storage enclosure power supply unit.
If none is found, then proceed with repairing this serviceable event.
BE840055 Description: Drive installation certification failed because of microcode logic error.
Action: Display open serviceable events. Are there any other open serviceable events related to the drive in the FRU list of this serviceable event (SRC BE840055)?
  • Yes. First, repair the drive by using an open serviceable event other than BE840055. Then, close this serviceable event, and retry the certification:

    From the Service interface:

    Storage Facility Management > storage facility > Service Utilities > View/Certify Drives

    From the DS8000 storage management GUI:

    Service > Install Hardware > View/Certify Drive.

  • No. Contact your next level of support.
BEB10101 Description: Network Surveillance: The partner HMC is in the Offline state in the HMC peer domain.
Action: Is the partner HMC expected to be powered off?
  • Yes. Close the serviceable event that contains SRC BEB10101. No further action is required.
  • No. Power on the partner HMC.

If the problem persists and the partner HMC is expected to be running, contact your next level of support.

BEB20012 Description: On-Demand Data dump initiated by DS8000 microcode; no warm start used.

Action: Contact your next level of support. Offload the On-Demand Data dump for the next level of support analysis.

BEB30005
BEB30006
BEB30007
BEB30021
Description:
BEB30005: HMC health check: Verification of HMC Peer Domain failed.
BEB30006: HMC health check: Check to verify whether Large file transfer is enabled failed.
BEB30007: HMC health check: HMC management domain verification failed.
BEB30021: HMC health check: Verification of HMC Peer Node failed.

Additional details: These SRCs are usually the result of storage facility private network problems.

Action: Go to MAP7000 Entry point for storage facility private network problems. If you cannot isolate and resolve the problem by using MAP7000 and the MAP70xx series, contact your next level of support.

BEB30028 Description: HMC health check: Verification of power supplies failed.
Action: Display open serviceable events. Are there any other open serviceable events related to CEC or I/O enclosure power supplies?
  • Yes. Display and repair other serviceable events that are related to CEC or I/O enclosure power supplies. Then, close the serviceable event that sent you here.
  • No. Contact your next level of support.
All other SRCs Description: The MAP and SRC are working as designed and there are no additional actions you can take.

Action: Contact your next level of support so they can recover the error condition that is being reported.