IBM System x iDataPlex dx360 M4 7912 User manual

System x iDataPlex dx360 M4 Types 7912 and 7913
Problem Determination and Service Guide


Note: Before using this information and the product it supports, read the information in Appendix B, “Notices,” on page 367, the IBM
Safety Information and Environmental Notices and User Guide documents on the IBM Documentation CD, and the Warranty
Information document.
The most recent version of this document is available at http://www.ibm.com/supportportal/.
Fifth Edition (September 2013)
© Copyright IBM Corporation 2013.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.

Contents
Safety ............................vii
Guidelines for trained technicians ..................viii
Inspecting for unsafe conditions .................viii
Guidelines for servicing electrical equipment .............ix
Safety statements ........................x
Chapter 1. Start here.......................1
Diagnosing a problem .......................1
Undocumented problems .....................3
Chapter 2. Introduction ......................5
Related documentation ......................5
Notices and statements in this document................6
Features and specifications.....................7
Server controls, LEDs, and power ..................8
Front view ..........................8
Rear view ..........................10
Server power features .....................10
Internal LEDs, connectors, and jumpers................12
System-board internal connectors .................12
System-board switches and jumpers ................13
System-board LEDs ......................14
Chapter 3. Diagnostics .....................15
Diagnostic tools ........................15
Event logs ..........................16
Viewing event logs from the Setup utility ..............16
Viewing event logs without restarting the server ............17
Clearing the error logs .....................18
POST ............................18
POST/UEFI diagnostic codes ..................19
System event log ........................30
Integrated management module II (IMM2) error messages ........30
Checkout procedure ......................166
About the checkout procedure ..................166
Performing the checkout procedure ................166
Troubleshooting tables .....................168
General problems ......................168
Hard disk drive problems....................168
Intermittent problems .....................170
Keyboard, mouse, or pointing-device problems............171
Memory problems ......................173
Microprocessor problems....................175
Monitor and video problems...................176
Network connection problems ..................178
Optional-device problems ...................178
Power problems .......................180
Serial-device problems ....................185
ServerGuide problems.....................186
Software problems ......................187
Universal Serial Bus (USB) port problems .............187
Video problems .......................187
Light path diagnostics ......................187
Power-supply LEDs.......................188
System pulse LEDs.......................189
Diagnostic programs and messages ................190
Running the diagnostic programs.................190
Diagnostic text messages ...................191
Viewing the test log......................191
Diagnostic messages .....................191
Recovering the server firmware ..................253
Automated boot recovery (ABR) ..................255
Nx boot failure ........................255
Solving power problems .....................256
Solving Ethernet controller problems ................257
Solving undetermined problems ..................257
Problem determination tips ....................258
Chapter 4. Parts listing, System x iDataPlex dx360 M4 Types 7912 and 7913 ...261
Customer replaceable units for dx360 M4 Type 7912 system-board tray . . . 261
Type 7913 2U chassis components .................267
GPGPU enclosure components ..................269
Structural parts ........................270
Power cords .........................271
Chapter 5. Removing and replacing server components ........275
Installation guidelines ......................275
System reliability guidelines...................276
Working inside the server with the power on ............277
Handling static-sensitive devices .................277
Returning a device or component ................278
Removing and replacing consumable and structural parts.........279
Removing the system-board tray from a 2U chassis ..........279
Installing the system-board tray in a 2U chassis ...........279
Removing the system-board tray cover ..............280
Installing the system-board tray cover ...............280
Removing a GPGPU enclosure .................281
Installing a GPGPU enclosure ..................282
Removing the 2U chassis fan-assembly top cover ..........283
Installing the 2U chassis fan-assembly top cover ...........283
Removing the 2U chassis from a rack ...............284
Installing the 2U chassis in a rack ................285
Removing and replacing Tier 1 CRUs ................286
Removing the air baffle ....................286
Installing the air baffle .....................287
Removing the 2U chassis fan assembly ..............289
Installing the 2U chassis fan assembly...............290
Removing the system battery ..................290
Installing the system battery ..................292
Removing a simple-swap hard disk drive ..............293
Installing a simple-swap hard disk drive ..............294
Removing the power cord from the rail with power cord mounting bracket 296
Installing the power cord to the rail with power cord mounting bracket . . . 297
Removing the power cord from the rail without power cord mounting bracket 298
Installing the power cord to the rail without power cord mounting bracket 299
Removing a simple-swap SAS/SATA drive cage ...........299
Installing a simple-swap SAS/SATA drive cage ............300
Removing a power-supply paddle card from the tray .........301
Installing a power-supply paddle card in the tray ...........302
Removing a PCIe riser-card assembly from the system-board tray ....302
Installing a PCI riser-card assembly on the system-board tray ......303
Removing a PCIe adapter from a PCI riser-card assembly .......304
Installing an adapter .....................305
Removing a memory module ..................309
Installing a memory module...................310
Removing a power supply from a 2U chassis ............315
Installing a power supply in a 2U chassis..............317
Removing a power supply cage from a 2U chassis ..........320
Installing a power supply cage in a 2U chassis............321
Removing the optional dual-port network adapter ...........321
Installing the optional dual-port network adapter ...........322
Removing a GPGPU enclosure .................323
Installing a GPGPU enclosure ..................324
Removing and replacing Tier 2 CRUs ................324
Removing a microprocessor and heat sink .............325
Installing a microprocessor and heat sink..............328
Removing the system-board tray .................335
Installing the system-board tray .................338
Chapter 6. Configuration information and instructions ........341
Updating the firmware ......................341
Configuring the server ......................342
Using the ServerGuide Setup and Installation CD...........343
Using the Setup utility .....................345
Using the Boot Manager program ................350
Starting the backup server firmware................351
Using the integrated management module II ............351
Using the remote presence capability and blue-screen capture ......353
Using the embedded hypervisor .................354
Setting the PXE boot protocol using the Setup utility .........355
Configuring the Gigabit Ethernet controller .............355
Using the LSI Configuration Utility program .............356
IBM Advanced Settings Utility program................358
Updating IBM Systems Director ..................358
Updating the Universal Unique Identifier (UUID) ............359
Updating the DMI/SMBIOS data ..................361
Appendix A. Getting help and technical assistance ..........365
Before you call ........................365
Using the documentation.....................365
Getting help and information from the World Wide Web .........365
Software service and support ...................366
Hardware service and support ...................366
IBM Taiwan product service....................366
Appendix B. Notices ......................367
Trademarks..........................367
Important notes ........................368
Particulate contamination.....................369
Documentation format ......................369
Telecommunication regulatory statement ...............370
Electronic emission notices ....................370
Federal Communications Commission (FCC) statement ........370
Industry Canada Class A emission compliance statement ........370
Avis de conformité à la réglementation d'Industrie Canada .......370
Australia and New Zealand Class A statement ............370
European Union EMC Directive conformance statement ........371
Germany Class A statement ..................371
VCCI Class A statement ....................372
Japan Electronics and Information Technology Industries Association (JEITA) statement ........372
Korea Communications Commission (KCC) statement .........372
Russia Electromagnetic Interference (EMI) Class A statement ......373
People's Republic of China Class A electronic emission statement ....373
Taiwan Class A compliance statement ...............373
Index ............................375

Safety
Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d'installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.
Bu ürünü kurmadan önce güvenlik bilgilerini okuyun.
Guidelines for trained technicians
This section contains information for trained technicians.
Inspecting for unsafe conditions
Use the information in this section to help you identify potential unsafe conditions in
an IBM product that you are working on. Each IBM product, as it was designed and
manufactured, has required safety items to protect users and service technicians
from injury. The information in this section addresses only those items. Use good
judgment to identify potential unsafe conditions that might be caused by non-IBM
alterations or attachment of non-IBM features or options that are not addressed in
this section. If you identify an unsafe condition, you must determine how serious the
hazard is and whether you must correct the problem before you work on the
product.
Consider the following conditions and the safety hazards that they present:
• Electrical hazards, especially primary power. Primary voltage on the frame can
  cause serious or fatal electrical shock.
• Explosive hazards, such as a damaged CRT face or a bulging capacitor.
• Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cord is disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and
observe any sharp edges.
3. Check the power cord:
   • Make sure that the third-wire ground connector is in good condition. Use a
     meter to measure third-wire ground continuity for 0.1 ohm or less between
     the external ground pin and the frame ground.
   • Make sure that the power cord is the correct type, as specified in “Power
     cords” on page 271.
   • Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety
of any non-IBM alterations.
6. Check inside the server for any obvious unsafe conditions, such as metal filings,
contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not
been removed or tampered with.
Guidelines for servicing electrical equipment
Observe the following guidelines when servicing electrical equipment:
• Check the area for electrical hazards such as moist floors, nongrounded power
  extension cords, power surges, and missing safety grounds.
• Use only approved tools and test equipment. Some hand tools have handles that
  are covered with a soft material that does not provide insulation from live
  electrical currents.
• Regularly inspect and maintain your electrical hand tools for safe operational
  condition. Do not use worn or broken tools or testers.
• Do not touch the reflective surface of a dental mirror to a live electrical circuit.
  The surface is conductive and can cause personal injury or equipment damage if
  it touches a live electrical circuit.
• Some rubber floor mats contain small conductive fibers to decrease electrostatic
  discharge. Do not use this type of mat to protect yourself from electrical shock.
• Do not work alone under hazardous conditions or near equipment that has
  hazardous voltages.
• Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical
  outlet so that you can turn off the power quickly in the event of an electrical
  accident.
• Disconnect all power before you perform a mechanical inspection, work near
  power supplies, or remove or install main units.
• Before you work on the equipment, disconnect the power cord. If you cannot
  disconnect the power cord, have the customer power off the wall box that
  supplies power to the equipment and lock the wall box in the off position.
• Never assume that power has been disconnected from a circuit. Check it to
  make sure that it has been disconnected.
• If you have to work on equipment that has exposed electrical circuits, observe
  the following precautions:
– Make sure that another person who is familiar with the power-off controls is
near you and is available to turn off the power if necessary.
– When you are working with powered-on electrical equipment, use only one
hand. Keep the other hand in your pocket or behind your back to avoid
creating a complete circuit that could cause an electrical shock.
– When you use a tester, set the controls correctly and use the approved probe
leads and accessories for that tester.
– Stand on a suitable rubber mat to insulate you from grounds such as metal
floor strips and equipment frames.
• Use extreme care when you measure high voltages.
• To ensure proper grounding of components such as power supplies, pumps,
  blowers, fans, and motor generators, do not service these components outside of
  their normal operating locations.
• If an electrical accident occurs, use caution, turn off the power, and send another
  person to get medical aid.

Safety statements
Important:
Each caution and danger statement in this document is labeled with a number. This
number is used to cross reference an English-language caution or danger
statement with translated versions of the caution or danger statement in the Safety
Information document.
For example, if a caution statement is labeled "Statement 1," translations for that
caution statement are in the Safety Information document under "Statement 1."
Be sure to read all caution and danger statements in this document before you
perform the procedures. Read any additional safety information that comes with the
server or optional device before you install the device.
Attention: Use No. 26 AWG or larger UL-listed or CSA certified
telecommunication line cord.

Statement 1:
DANGER
Electrical current from power, telephone, and communication cables is
hazardous.
To avoid a shock hazard:
• Do not connect or disconnect any cables or perform installation,
  maintenance, or reconfiguration of this product during an electrical
  storm.
• Connect all power cords to a properly wired and grounded electrical
  outlet.
• Connect to properly wired outlets any equipment that will be attached to
  this product.
• When possible, use one hand only to connect or disconnect signal
  cables.
• Never turn on any equipment when there is evidence of fire, water, or
  structural damage.
• Disconnect the attached power cords, telecommunications systems,
  networks, and modems before you open the device covers, unless
  instructed otherwise in the installation and configuration procedures.
• Connect and disconnect cables as described in the following table when
  installing, moving, or opening covers on this product or attached
  devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.

Statement 2:
CAUTION:
When replacing the lithium battery, use only IBM Part Number 33F8354 or an
equivalent type battery recommended by the manufacturer. If your system has
a module containing a lithium battery, replace it only with the same module
type made by the same manufacturer. The battery contains lithium and can
explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.

Statement 3:
CAUTION:
When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or
transmitters) are installed, note the following:
• Do not remove the covers. Removing the covers of the laser product could
  result in exposure to hazardous laser radiation. There are no serviceable
  parts inside the device.
• Use of controls or adjustments or performance of procedures other than
  those specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser
diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly
with optical instruments, and avoid direct exposure to the beam.
Class 1 Laser Product
Laser Klasse 1
Laser Klass 1
Luokan 1 Laserlaite
Appareil A Laser de Classe 1

Statement 4:
≥18 kg (39.7 lb) ≥32 kg (70.5 lb) ≥55 kg (121.2 lb)
CAUTION:
Use safe practices when lifting.
Statement 5:
CAUTION:
The power control button on the device and the power switch on the power
supply do not turn off the electrical current supplied to the device. The device
also might have more than one power cord. To remove all electrical current
from the device, ensure that all power cords are disconnected from the power
source.
Statement 6:
CAUTION:
Do not place any objects on top of a rack-mounted device unless that
rack-mounted device is intended for use as a shelf.
Statement 8:
CAUTION:
Never remove the cover on a power supply or any part that has the following
label attached.
Hazardous voltage, current, and energy levels are present inside any
component that has this label attached. There are no serviceable parts inside
these components. If you suspect a problem with one of these parts, contact
a service technician.
Statement 12:
CAUTION:
The following label indicates a hot surface nearby.
Statement 26:
CAUTION:
Do not place any object on top of rack-mounted devices.
Attention: This server is suitable for use on an IT power distribution system
whose maximum phase-to-phase voltage is 240 V under any distribution fault
condition.
Statement 27:
CAUTION:
Hazardous moving parts are nearby.

Chapter 1. Start here
You can solve many problems without outside assistance by following the
troubleshooting procedures in this Problem Determination and Service Guide and
on the World Wide Web. This document describes the diagnostic tests that you can
perform, troubleshooting procedures, and explanations of error messages and error
codes. The documentation that comes with your operating system and software
also contains troubleshooting information.
Diagnosing a problem
Before you contact IBM or an approved warranty service provider, follow these
procedures in the order in which they are presented to diagnose a problem with
your server:
1. Return the server to the condition it was in before the problem occurred.
If any hardware, software, or firmware was changed before the problem
occurred, if possible, reverse those changes. This might include any of the
following items:
• Hardware components
• Device drivers and firmware
• System software
• UEFI firmware
• System input power or network connections
2. View the light path diagnostics LEDs and event logs.
The server is designed for ease of diagnosis of hardware and software
problems.
• Light path diagnostics LEDs: See “Light path diagnostics” on page 187 for
  information about using light path diagnostics LEDs.
• Event logs: See “System event log” on page 30 for information about
  notification events and diagnosis.
• Software or operating-system error codes: See the documentation for the
  software or operating system for information about a specific error code. See
  the manufacturer's website for documentation.
3. Run IBM Dynamic System Analysis (DSA) and collect system data.
Run Dynamic System Analysis (DSA) to collect information about the hardware,
firmware, software, and operating system. Have this information available when
you contact IBM or an approved warranty service provider. For instructions for
running DSA, see the Dynamic System Analysis Installation and User's Guide.
To download the latest version of DSA code and the Dynamic System Analysis
Installation and User's Guide, go to
http://www.ibm.com/support/entry/portal/docdisplay?brand=5000008&lndocid=SERV-DSA.
4. Check for and apply code updates.
Fixes or workarounds for many problems might be available in updated UEFI
firmware, device firmware, or device drivers.
Important: Some cluster solutions require specific code levels or coordinated
code updates. If the device is part of a cluster solution, verify that the latest
level of code is supported for the cluster solution before you update the code.
a. Install UpdateXpress system updates.
You can install code updates that are packaged as an UpdateXpress
System Pack or UpdateXpress CD image. An UpdateXpress System Pack
contains an integration-tested bundle of online firmware and device-driver
updates for your server. In addition, you can use IBM ToolsCenter Bootable
Media Creator to create bootable media that is suitable for applying firmware
updates and running preboot diagnostics. For more information about
UpdateXpress System Packs, see
http://www.ibm.com/support/entry/portal/docdisplay?brand=5000008&lndocid=SERV-XPRESS
and “Updating the firmware” on page 341. For more information about the
Bootable Media Creator, see
http://www.ibm.com/support/entry/portal/docdisplay?brand=5000008&lndocid=TOOL-BOMC.
Be sure to separately install any listed critical updates that have release
dates that are later than the release date of the UpdateXpress System Pack
or UpdateXpress image (see step 4b).
b. Install manual system updates.
1) Determine the existing code levels.
In DSA, click Firmware/VPD to view system firmware levels, or click
Software to view operating-system levels.
2) Download and install updates of code that is not at the latest level.
To display a list of available updates for the server, go to
http://www.ibm.com/support/fixcentral/.
When you click an update, an information page is displayed, including a list
of the problems that the update fixes. Review this list for your specific
problem; however, even if your problem is not listed, installing the update
might solve the problem.
5. Check for and correct an incorrect configuration.
If the server is incorrectly configured, a system function can fail to work when
you enable it; if you make an incorrect change to the server configuration, a
system function that has been enabled can stop working.
a. Make sure that all installed hardware and software are supported.
See http://www.ibm.com/systems/info/x86servers/serverproven/compat/us/ to
verify that the server supports the installed operating system, optional
devices, and software levels. If any hardware or software component is not
supported, uninstall it to determine whether it is causing the problem. You
must remove nonsupported hardware before you contact IBM or an
approved warranty service provider for support.
b. Make sure that the server, operating system, and software are installed
and configured correctly.
Many configuration problems are caused by loose power or signal cables or
incorrectly seated adapters. You might be able to solve the problem by
turning off the server, reconnecting cables, reseating adapters, and turning
the server back on. For information about performing the checkout
procedure, see “Checkout procedure” on page 166. For information about
configuring the server, see “Configuring the server” on page 342.
6. See controller and management software documentation.
If the problem is associated with a specific function (for example, if a RAID hard
disk drive is marked offline in the RAID array), see the documentation for the
associated controller and management or controlling software to verify that the
controller is correctly configured.
Problem determination information is available for many devices such as RAID
and network adapters.
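The update checks in step 4 can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the Update record, the function
names, the component names, and the code levels ("level-A" and so on) are all
hypothetical and do not come from DSA, UpdateXpress, or any IBM tool. The sketch
assumes that you have already written down the installed code levels (step 4b,
substep 1) and that the list of available updates holds one entry per
component, giving the latest level offered for it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Update:
    """One entry from a list of available updates (hypothetical layout)."""
    component: str   # for example "UEFI" or "IMM2"
    level: str       # latest code level offered for the component
    released: date   # release date of the update
    critical: bool   # True if the update is listed as critical


def critical_updates_after_pack(updates, pack_released):
    """Step 4a: critical updates released later than the System Pack."""
    return [u for u in updates if u.critical and u.released > pack_released]


def components_not_at_latest(installed, updates):
    """Step 4b: updates whose level differs from the installed level.

    `installed` maps a component name to its installed code level.
    Levels are compared for equality only; no ordering of level strings
    is assumed. A component missing from `installed` is also reported.
    """
    return [u for u in updates if installed.get(u.component) != u.level]


# Hypothetical example data; none of these values describe a real system.
pack_released = date(2013, 6, 1)
available = [
    Update("UEFI", "level-B", date(2013, 8, 15), True),
    Update("IMM2", "level-A", date(2013, 5, 1), True),
    Update("Ethernet driver", "level-C", date(2013, 9, 1), False),
]
installed = {"UEFI": "level-A", "IMM2": "level-A", "Ethernet driver": "level-B"}

late = critical_updates_after_pack(available, pack_released)
print([u.component for u in late])   # ['UEFI']

stale = components_not_at_latest(installed, available)
print([u.component for u in stale])  # ['UEFI', 'Ethernet driver']
```

The two checks are kept separate on purpose. The first mirrors the instruction
to install critical updates that are newer than the UpdateXpress System Pack;
the second mirrors the manual comparison of existing code levels against the
latest ones. The cluster-solution caution in step 4 still applies before any
update is installed.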