[Disk I/O error, Bus error]
Posted by Matt
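I bought a used server a while ago. I believe it is an IBM xSeries 330.
I used it without much trouble for about three months, and a few days ago I upgraded the kernel to 2.4.31.
It is an SMP machine. It ran fine after the upgrade as well, but since yesterday evening it has been showing abnormal symptoms.
The Linux distribution is RHEL 3.0 AS.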
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:05: sense key Aborted Command
kernel: Additional sense indicates Scsi parity error
kernel: I/O error: dev 08:05, sector 22656
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:05: sense key Aborted Command
kernel: Additional sense indicates Scsi parity error
kernel: I/O error: dev 08:05, sector 22664
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:05: sense key Aborted Command
kernel: Additional sense indicates Scsi parity error
kernel: I/O error: dev 08:05, sector 22672
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:05: sense key Aborted Command
kernel: Additional sense indicates Scsi parity error
kernel: I/O error: dev 08:05, sector 22680
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:05: sense key Aborted Command
kernel: Additional sense indicates Scsi parity error
kernel: I/O error: dev 08:05, sector 13655472
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
...
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 14 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 15 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 16 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 17 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 18 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 19 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 20 SCB_CONTROL[0x0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x21]
kernel: 21 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 22 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 23 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 24 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 25 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 26 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 27 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 28 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 29 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 30 SCB_CONTROL[0xe0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
...
kernel: DevQ(0:0:0): 0 waiting
kernel: DevQ(0:1:0): 0 waiting
kernel: DevQ(0:8:0): 0 waiting
kernel:
kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
kernel: scsi0:0:0:0: Device is active, asserting ATN
kernel: Recovery code sleeping
kernel: Recovery code awake
kernel: Timer Expired
kernel: aic7xxx_abort returns 0x2003
kernel: scsi0:0:0:0: Attempting to queue a TARGET RESET message
kernel: CDB: 0x2a 0x0 0x2 0x2c 0x35 0xca 0x0 0x0 0x8 0x0
kernel: aic7xxx_dev_reset returns 0x2003
kernel: Recovery SCB completes
kernel: scsi0:0:0:0: Attempting to queue an ABORT message
kernel: CDB: 0x2a 0x0 0x2 0x2c 0x35 0xca 0x0 0x0 0x8 0x0
kernel: scsi0: At time of recovery, card was not paused
kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
kernel: scsi0: Dumping Card State in Data-in phase, at SEQADDR 0x9e
kernel: Card was paused
kernel: ACCUM = 0x40, SINDEX = 0xaa, DINDEX = 0xe4, ARG_2 = 0x7
kernel: HCNT = 0x0 SCBPTR = 0x14
kernel: SCSIPHASE[0x0] SCSISIGI[0x44] ERROR[0x0] SCSIBUSL[0x0]
kernel: LASTPHASE[0x40] SCSISEQ[0x12] SBLKCTL[0xa] SCSIRATE[0xc2]
kernel: SEQCTL[0x10] SEQ_FLAGS[0x20] SSTAT0[0x5] SSTAT1[0x0]
kernel: SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8] SIMODE1[0xac]
kernel: SXFRCTL0[0x88] DFCNTRL[0x0] DFSTATUS[0x89]
kernel: STACK: 0x0 0x163 0x178 0x82
kernel: SCB count = 42
kernel: Kernel NEXTQSCB = 33
kernel: Card NEXTQSCB = 33
kernel: QINFIFO entries:
kernel: Waiting Queue entries:
kernel: Disconnected Queue entries:
...
kernel: I/O error: dev 08:05, sector 13655512
kernel: I/O error: dev 08:05, sector 27584
kernel: I/O error: dev 08:05, sector 27600
kernel: I/O error: dev 08:05, sector 13655512
kernel: I/O error: dev 08:05, sector 27608
kernel: I/O error: dev 08:05, sector 27624
kernel: I/O error: dev 08:05, sector 13655512
kernel: I/O error: dev 08:05, sector 27632
kernel: I/O error: dev 08:05, sector 27648
kernel: I/O error: dev 08:05, sector 13655512
kernel: I/O error: dev 08:05, sector 27656
kernel: I/O error: dev 08:05, sector 27672
kernel: I/O error: dev 08:05, sector 13655512
kernel: I/O error: dev 08:05, sector 27680
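As the log above shows, the errors appear to come from the disk side. I think this is a hardware problem. One reason is that if the kernel upgrade had caused it, the problem should have shown up well before now. Another is that shortly after buying the used machines I ran a bad-block check, it reported bad blocks, and I replaced the SCSI HDD with a new one.
I bought four of these machines, and problems like this keep coming up, which leaves me rather baffled.
If anyone knows a likely cause, I would be very grateful for a reply.
This is a development server, so I am in a bit of a bind. --..--;;
Thanks in advance.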
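
In case it helps anyone reading the log, here is a minimal sketch of how its main fields can be decoded. It rests on a few assumptions that are not stated in the post: that the return code is a hexadecimal word laid out as driver/host/message/status bytes (the 2.4-era SCSI layer convention), that `dev 08:05` is a hexadecimal major:minor pair, and that the logged CDB is a standard 10-byte READ/WRITE command. All function names are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Values from the Linux SCSI layer; only the two seen in this log are named.
DRIVER_SENSE = 0x08      # driver byte: sense data is available
CHECK_CONDITION = 0x02   # status byte returned by the target


def decode_return_code(hex_text):
    """Split a SCSI result word (printed in hex) into its four bytes."""
    value = int(hex_text, 16)
    return {
        "driver": (value >> 24) & 0xFF,
        "host": (value >> 16) & 0xFF,
        "msg": (value >> 8) & 0xFF,
        "status": value & 0xFF,
    }


def decode_dev(dev_text):
    """Map 'MM:mm' (hex major:minor) to a device name; handles major 8 only."""
    major, minor = (int(part, 16) for part in dev_text.split(":"))
    if major != 8:
        return None  # not one of the first 16 SCSI disks
    disk = chr(ord("a") + minor // 16)   # 16 minors per disk
    part = minor % 16                    # 0 means the whole disk
    return f"/dev/sd{disk}{part if part else ''}"


def decode_rw10_cdb(cdb_text):
    """Decode a 10-byte READ(10)/WRITE(10) CDB into (name, LBA, block count)."""
    cdb = [int(token, 16) for token in cdb_text.split()]
    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if len(cdb) != 10 or cdb[0] not in opcodes:
        return None
    lba = int.from_bytes(bytes(cdb[2:6]), "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")   # bytes 7-8: transfer length
    return opcodes[cdb[0]], lba, blocks


def tally_io_errors(lines):
    """Count 'I/O error' lines per (device, sector) to spot repeated sectors."""
    pattern = re.compile(r"I/O error: dev ([0-9a-f]{2}:[0-9a-f]{2}), sector (\d+)")
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    print(decode_return_code("8000002"))
    print(decode_dev("08:05"))
    print(decode_rw10_cdb("0x2a 0x0 0x2 0x2c 0x35 0xca 0x0 0x0 0x8 0x0"))
```

Under those assumptions, the return code 8000002 reads as a driver byte of 0x08 (sense data present) with a status byte of 0x02 (CHECK CONDITION), `dev 08:05` corresponds to /dev/sda5, and the command being aborted is a WRITE(10) of 8 blocks. This only decodes what the kernel logged and does not by itself identify the faulty component.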