
Contents
Safety notices ................................. v
Beginning troubleshooting and problem analysis .................. 1
Determining the problem analysis procedure to perform ..................... 1
Resolving a BMC access problem ............................ 2
Resolving a power problem .............................. 4
Resolving a system firmware boot failure .......................... 5
Resolving a VGA monitor problem ............................ 6
Resolving an operating system boot failure ......................... 7
Resolving a sensor indicator problem ........................... 9
Resolving a hardware problem ............................. 10
Resolving a GPU, PCIe adapter, or device problem ...................... 11
Resolving a RAID adapter problem .......................... 12
Resolving a network adapter problem ......................... 13
Resolving a graphics processing unit problem ....................... 15
Resolving an NVMe Flash adapter problem ....................... 15
Resolving a storage device problem .......................... 16
Identifying the location of the PCIe adapter by using the slot number ............... 18
Identifying the location of the GPU by using the slot number ................. 19
Identifying the location of the NVMe Flash adapter ..................... 20
Identifying the location of the storage device ....................... 20
User guides for GPUs and PCIe adapters ........................ 21
Identifying a service action .............................. 21
Identifying a service action by using system event logs .................... 21
Identifying service action keywords in system event logs ................... 26
Identifying a service action by using sensor and event information ................ 27
Identifying a service action by using sensor and event information for the 8001-12C and 8005-12N .... 27
Identifying a service action by using sensor and event information for the 8001-22C and 8005-22N .... 43
Isolation procedures ................................ 60
EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure ................. 61
EPUB_PRC_SP_CODE isolation procedure ........................ 61
EPUB_PRC_PHYP_CODE isolation procedure ....................... 61
EPUB_PRC_ALL_PROCS isolation procedure ....................... 62
EPUB_PRC_ALL_MEMCRDS isolation procedure ...................... 62
EPUB_PRC_LVL_SUPPORT isolation procedure ...................... 63
EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure ................ 63
EPUB_PRC_FSI_PATH isolation procedure ........................ 63
EPUB_PRC_PROC_AB_BUS isolation procedure ...................... 64
EPUB_PRC_PROC_XYZ_BUS isolation procedure ...................... 64
EPUB_PRC_EIBUS_ERROR isolation procedure ...................... 65
EPUB_PRC_POWER_ERROR isolation procedure ...................... 66
EPUB_PRC_MEMORY_UE isolation procedure ...................... 66
EPUB_PRC_HB_CODE isolation procedure ........................ 66
EPUB_PRC_TOD_CLOCK_ERR isolation procedure ..................... 67
EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure ................... 68
Verifying a repair ................................. 69
Collecting diagnostic data .............................. 71
Contacting IBM service and support ........................... 71
Finding parts and locations ........................... 73
8001-12C or 8005-12N locations ............................. 73
8001-12C or 8005-12N parts .............................. 78
Finding parts and locations ........................... 87
8001-22C or 8005-22N locations ............................. 87
© Copyright IBM Corp. 2016, 2019