Discussion:
iscsi connection errors
squadra
2012-10-05 19:39:37 UTC
Permalink
Hi,

From time to time I see connection errors like this on our EqualLogic
6100xv / 4100e stack.

Oct 5 21:22:20 xxx kernel: connection4:0: detected conn error (1020)
Oct 5 21:22:21 xxx iscsid: Kernel reported iSCSI connection 4:0 error
(1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Oct 5 21:22:23 xxx kernel: connection4:0: detected conn error (1020)
Oct 5 21:22:24 xxx iscsid: connection4:0 is operational after recovery (1
attempts)

Any ideas what this error code means?

cheers,

Juergen
--
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To view this discussion on the web visit https://groups.google.com/d/msg/open-iscsi/-/vggeBH8Nc_MJ.
To post to this group, send email to open-iscsi-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to open-iscsi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.
P***@public.gmane.org
2012-10-05 21:29:48 UTC
Permalink
I wonder if that is a connection close due to an async logout request from the array, which is what it does if it wants to move a connection to another port.

If yes, then that's a bad message from the iscsi kernel code: an async logout is not an error and logging it with "error" in the text is incorrect.

paul
squadra
2012-10-08 06:18:42 UTC
Permalink
Hello Paul,

We suspected the same thing. That's why we disabled connection load balancing
on the EQL array, without success so far.

-- juergen
Donald Williams
2012-10-08 14:01:33 UTC
Permalink
Hello,

When you see these errors, look for an INFO: event from the EQL array of
"Load Balancing request" or "Volume membership has changed". If you find one,
then, as Paul mentioned, these events should not be considered errors.
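
The check described above could be scripted roughly as follows. This is only a sketch: the two phrases come from this message, while the function name and the sample event lines are hypothetical, and the real array log format may differ.

```python
# Phrases named in this thread as marking an expected, array-initiated logout.
EXPECTED_LOGOUT_PHRASES = (
    "Load Balancing request",
    "Volume membership has changed",
)


def is_expected_logout(event_text):
    """Return True if an array event line contains one of the phrases above."""
    lowered = event_text.lower()
    return any(phrase.lower() in lowered for phrase in EXPECTED_LOGOUT_PHRASES)


if __name__ == "__main__":
    # Hypothetical event lines, made up for illustration only.
    for line in ("INFO: Load Balancing request", "ERROR: something else"):
        print(line, "->", is_expected_logout(line))
```

A connection drop on the initiator with no matching event on the array around the same time would then be worth a closer look.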

Re: connection load balancing (CLB). This should NOT normally be disabled.
Disabling it can reduce performance: very busy sessions on the same
physical port have to share that single port, even while other ports are
available that could better balance the load.

If you have more than three members in a pool, logout requests will still
occur as blocks are balanced between members, and those cannot be
disabled.

Regards,

Don
Michael Christie
2012-10-06 03:15:03 UTC
Permalink
Hi,
from time to time i see connection errors like this to our equallogic 6100xv / 4100e stack.
Oct 5 21:22:20 xxx kernel: connection4:0: detected conn error (1020)
Do you see anything before this? Maybe something about a nop/ping timing out or, as Paul mentioned, something about the target wanting to log out or dropping the connections?

If not, do you see anything in the target log about the target closing the connection?
Oct 5 21:22:21 xxx iscsid: Kernel reported iSCSI connection 4:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
It means the target closed the connection.
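
For anyone scanning their own logs, here is a minimal sketch that pulls the numeric code out of a "detected conn error" line and attaches a name. The name for 1020 is taken from the iscsid line quoted above. The name for 1011 is my reading of the kernel's iscsi_if.h and should be checked against your kernel source. The function and table names are hypothetical.

```python
import re

# Only the two codes that show up in this thread.
# 1020: named in the iscsid log line quoted above.
# 1011: assumption based on the kernel's iscsi_if.h enum; verify locally.
ISCSI_ERR_NAMES = {
    1011: "ISCSI_ERR_CONN_FAILED",
    1020: "ISCSI_ERR_TCP_CONN_CLOSE",
}

CONN_ERROR_RE = re.compile(
    r"connection(?P<first>\d+):(?P<second>\d+): "
    r"detected conn error \((?P<code>\d+)\)"
)


def decode_conn_error(line):
    """Return (first id, second id, code, name), or None if the line does not match."""
    match = CONN_ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    return (
        int(match.group("first")),
        int(match.group("second")),
        code,
        ISCSI_ERR_NAMES.get(code, "unknown"),
    )


if __name__ == "__main__":
    sample = "Oct 5 21:22:20 xxx kernel: connection4:0: detected conn error (1020)"
    print(decode_conn_error(sample))
```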
squadra
2012-10-08 06:21:48 UTC
Permalink
Hello Mike,
Hi,
from time to time i see connection errors like this to our equallogic 6100xv / 4100e stack.
Oct 5 21:22:20 xxx kernel: connection4:0: detected conn error (1020)
Do you see something before this? Maybe something about a nop/ping timing
out, or as Paul mentioned something about the target wanting to logout or
dropping the connections?
No, those messages are the first thing that pops up on the clients. The
only thing I can see is that we were running a full backup at that time.
If not, on the target log, do you see something about the target closing the connection?
The EqualLogic tells us this:

iSCSI session to target '192.168.xx.xx:3260, iqn.2001-05.com.equallogic:4-52aed6-...' from initiator '192.168.xxxx:48758, iqn.1994-05.com.redhat:xxx' was closed.
iSCSI intra-group connection failure.
Local reset initiated due to network errors.


On the switch side (a Cisco 3750 stack) we don't see any drops or errors at all.

r***@public.gmane.org
2012-10-06 07:08:13 UTC
Permalink
Hello,

I have a similar error: when I try to discover another LUN, I lose the
connection to the current LUN, which uses multipath. The log says this:

Oct 6 07:56:21 robin kernel: connection1:0: pdu (op 0x3000003d itt 0x1)
rejected. Reason code 0x4
Oct 6 07:56:51 robin kernel: connection1:0: pdu (op 0x3000004d itt 0x1)
rejected. Reason code 0x4
Oct 6 07:57:21 robin kernel: connection1:0: pdu (op 0x30000077 itt 0x1)
rejected. Reason code 0x4
Oct 6 07:57:51 robin kernel: connection1:0: pdu (op 0x30000025 itt 0x1)
rejected. Reason code 0x4
Oct 6 07:58:21 robin kernel: connection1:0: pdu (op 0x30000070 itt 0x1)
rejected. Reason code 0x4
Oct 6 07:58:35 robin kernel: connection2:0: pdu (op 0x1000007e itt 0x40)
rejected. Reason code 0x7
Oct 6 07:58:51 robin kernel: connection1:0: pdu (op 0x3000005a itt 0x1)
rejected. Reason code 0x4
Oct 6 07:59:00 robin kernel: connection2:0: ping timeout of 15 secs
expired, recv timeout 10, last rx 5477313, last ping 5477313, now 5479813
Oct 6 07:59:00 robin kernel: connection2:0: detected conn error (1011)
Oct 6 07:59:21 robin kernel: connection1:0: pdu (op 0x30000079 itt 0x1)
rejected. Reason code 0x4
Oct 6 07:59:51 robin kernel: connection1:0: pdu (op 0x30000049 itt 0x1)
rejected. Reason code 0x4
Oct 6 08:00:21 robin kernel: connection1:0: pdu (op 0x30000069 itt 0x1)
rejected. Reason code 0x4
Oct 6 08:00:51 robin kernel: connection1:0: pdu (op 0x3000006d itt 0x1)
rejected. Reason code 0x4
Oct 6 08:01:21 robin kernel: connection1:0: pdu (op 0x3000002f itt 0x1)
rejected. Reason code 0x4
Oct 6 08:01:31 robin kernel: connection1:0: pdu (op 0x30000015 itt 0x40)
rejected. Reason code 0x7
Oct 6 08:01:51 robin kernel: connection1:0: pdu (op 0x30000065 itt 0x1)
rejected. Reason code 0x4
Oct 6 08:02:16 robin kernel: connection1:0: ping timeout of 15 secs
expired, recv timeout 10, last rx 5496921, last ping 5494921, now 5499421
Oct 6 08:02:16 robin kernel: connection1:0: detected conn error (1011)
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: Device offlined - not ready
after error recovery
Oct 6 08:02:16 robin last message repeated 8 times
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Unhandled error code
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Result: hostbyte=DID_ABORT
driverbyte=DRIVER_OK
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] CDB: Read(10): 28 00 0f c7
c6 48 00 00 40 00
Oct 6 08:02:16 robin kernel: end_request: I/O error, dev sdb, sector
264750664
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Unhandled error code
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Result: hostbyte=DID_ERROR
driverbyte=DRIVER_OK
Oct 6 08:02:16 robin kernel: device-mapper: multipath: Failing path 8:16.
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] CDB: Write(10): 2a 00 24
27 24 10 00 00 08 00
Oct 6 08:02:16 robin kernel: end_request: I/O error, dev sdb, sector
606544912

After that I can't access the LUN, so I need to reboot the server. Once
it is rebooted, the server works and the multipath connections are 2 of 2.

Any ideas?

Best Regards.
Michael Christie
2012-10-07 23:26:43 UTC
Permalink
Post by r***@public.gmane.org
Hello,
Your error is nothing like what was being discussed in this thread. Changing subject for you :)

What target are you using?

What version of open-iscsi? What kernel and what userspace tools?

When you see this, what is in the target's logs?

How are you discovering another LUN? What command are you running?
Post by r***@public.gmane.org
Oct 6 07:56:21 robin kernel: connection1:0: pdu (op 0x3000003d itt 0x1) rejected. Reason code 0x4
Oct 6 07:56:51 robin kernel: connection1:0: pdu (op 0x3000004d itt 0x1) rejected. Reason code 0x4
Oct 6 07:57:21 robin kernel: connection1:0: pdu (op 0x30000077 itt 0x1) rejected. Reason code 0x4
Oct 6 07:57:51 robin kernel: connection1:0: pdu (op 0x30000025 itt 0x1) rejected. Reason code 0x4
Oct 6 07:58:21 robin kernel: connection1:0: pdu (op 0x30000070 itt 0x1) rejected. Reason code 0x4
Oct 6 07:58:35 robin kernel: connection2:0: pdu (op 0x1000007e itt 0x40) rejected. Reason code 0x7
Oct 6 07:58:51 robin kernel: connection1:0: pdu (op 0x3000005a itt 0x1) rejected. Reason code 0x4
The target did not like something we did. It is reporting a protocol error.

We have never seen this error before. The initiator just logs the error and does nothing.
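
To make the reason codes in those lines easier to read, here is a rough decoder. The table is my reading of the Reject PDU reason codes in RFC 3720 and should be checked against the RFC before relying on it. The function name is hypothetical, and the pattern tolerates the line wrapping seen in mailed logs.

```python
import re

# Reject reason codes as I read them from RFC 3720 (assumption: verify
# against the RFC; vendor targets may also use them loosely).
REJECT_REASONS = {
    0x02: "Data (payload) digest error",
    0x03: "SNACK reject",
    0x04: "Protocol error",
    0x05: "Command not supported",
    0x06: "Immediate command reject",
    0x07: "Task in progress",
    0x08: "Invalid data ACK",
    0x09: "Invalid PDU field",
    0x0A: "Long operation reject",
    0x0B: "Negotiation reset",
    0x0C: "Waiting for logout",
}

REJECT_RE = re.compile(
    r"pdu \(op (?P<op>0x[0-9a-fA-F]+) itt (?P<itt>0x[0-9a-fA-F]+)\)"
    r"\s+rejected\.\s+Reason code\s+(?P<reason>0x[0-9a-fA-F]+)"
)


def decode_reject(line):
    """Return the op, itt and reason of a 'pdu ... rejected' line, or None."""
    match = REJECT_RE.search(line)
    if match is None:
        return None
    reason = int(match.group("reason"), 16)
    return {
        "op": int(match.group("op"), 16),
        "itt": int(match.group("itt"), 16),
        "reason": reason,
        "meaning": REJECT_REASONS.get(reason, "unknown"),
    }


if __name__ == "__main__":
    sample = ("Oct 6 07:56:21 robin kernel: connection1:0: "
              "pdu (op 0x3000003d itt 0x1) rejected. Reason code 0x4")
    print(decode_reject(sample))
```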
Post by r***@public.gmane.org
Oct 6 07:59:00 robin kernel: connection2:0: ping timeout of 15 secs expired, recv timeout 10, last rx 5477313, last ping 5477313, now 5479813
Target stopped responding to us.
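
The numbers in that line can be sanity-checked with a little arithmetic. In both ping-timeout lines in this thread, "now" minus "last rx" is 2500, and the recv timeout plus the ping timeout is 25 seconds. That would fit counters kept in kernel ticks at 100 ticks per second, but the unit and the tick rate are assumptions inferred from the numbers, not something stated in the thread. The function name is hypothetical.

```python
import re

PING_TIMEOUT_RE = re.compile(
    r"ping timeout of\s+(?P<ping>\d+)\s+secs\s+expired,"
    r"\s+recv timeout\s+(?P<recv>\d+),"
    r"\s+last rx\s+(?P<last_rx>\d+),"
    r"\s+last ping\s+(?P<last_ping>\d+),"
    r"\s+now\s+(?P<now>\d+)"
)


def summarize_ping_timeout(line):
    """Return the timeout window and the silence measured in raw counter units."""
    match = PING_TIMEOUT_RE.search(line)
    if match is None:
        return None
    values = {name: int(text) for name, text in match.groupdict().items()}
    window_secs = values["ping"] + values["recv"]
    ticks_silent = values["now"] - values["last_rx"]
    # Assumption: the counters are ticks, so ticks / seconds gives the tick rate.
    implied_rate = ticks_silent / window_secs if window_secs else None
    return {
        "window_secs": window_secs,
        "ticks_since_last_rx": ticks_silent,
        "implied_ticks_per_sec": implied_rate,
    }


if __name__ == "__main__":
    sample = ("Oct 6 07:59:00 robin kernel: connection2:0: ping timeout of 15 secs "
              "expired, recv timeout 10, last rx 5477313, last ping 5477313, "
              "now 5479813")
    print(summarize_ping_timeout(sample))
```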
Post by r***@public.gmane.org
Oct 6 07:59:00 robin kernel: connection2:0: detected conn error (1011)
We dropped the connection. The SCSI error handler (eh) eventually runs and we cannot recover the device.
Jose Joaquin Anton Herrerias
2012-10-08 07:42:11 UTC
Permalink
Hello,

We are using XenServer 5.6 with open-iscsi-2.0.871-0.20.3.xs647, and we use the XenServer tool to discover and attach the LUN. The kernel version is 2.6.32.12-0.7.1.xs5.6.100.307.170586xen. When I see this error in the log, I also see the connection to sdb lost.

Oct 7 05:04:34 robin multipathd: sdb: tur checker reports path is down

I see that other servers using multipath create sdb, sdc and dm-0 disks, but in this case only sdb and sdc are created. Is that normal?

Thank you for your help.
Best Regards.

José J. Antón Herrerías
Technical Support Manager
janton-***@public.gmane.org



Access Basic Server S.L. Elche Parque Industrial. C/Galileo Galilei, 12. 03203 Elche (Alicante) Telf. +34 96 568 29 04 Fax. +34 96 568 35 30

Michael Christie
2012-10-08 16:48:27 UTC
Permalink
What target are you using? What is the vendor and model?
Post by r***@public.gmane.org
Hello,
We are using XenServer 5.6 with open-iscsi-2.0.871-0.20.3.xs647, and we use the XenServer tool to discover and attach the LUN. The kernel version is 2.6.32.12-0.7.1.xs5.6.100.307.170586xen. When I see this error in the log, I also see the connection to sdb lost.
Oct 7 05:04:34 robin multipathd: sdb: tur checker reports path is down
I see that other servers using multipath create sdb, sdc and dm-0 disks, but in this case only sdb and sdc are created. Is that normal?
I am not sure what you are asking. Are you just saying dm-0 is not getting setup on one of the servers? Then no, that is not normal if you have multipathd setup and running properly.
Jose Joaquin Anton Herrerias
2012-10-11 14:43:49 UTC
Permalink
Hello,

We are using two IBM X3850 x5 servers with a Storwize V7000. I attach network.txt, which is the dump from the command "tcpdump -i xenbr1 -w network.txt"; the message file is the system's log. You can see the PDU errors and the errors on dev dm-1. If you need more information or want to connect to the server, you can reach me by Skype or similar.

This hardware is in the lab and isn’t in production.

Best Regards.

José J. Antón Herrerías
Technical Support Manager
janton-***@public.gmane.org



Mike Christie
2012-10-11 15:43:25 UTC
Permalink
Post by r***@public.gmane.org
Hello,
We are using two IBM X3850 x5 servers with a Storwize V7000. I attach network.txt, which is the dump from the command "tcpdump -i xenbr1 -w network.txt"; the message file is the system's log. You can see the PDU errors and the errors on dev dm-1. If you need more information or want to connect to the server, you can reach me by Skype or similar.
The attachment did not come through.

You did not send the target logs.
Post by r***@public.gmane.org
This hardware is in the lab and isn’t in production.
What do you mean? Is it a prototype or just not in production because it
is older?
Jose Joaquin Anton Herrerias
2012-10-08 08:38:06 UTC
Permalink
Hello Michael,

I'm working on this error today and I reproduced it. This is all I see in the messages log:

Oct 8 10:16:43 robin multipathd: Path event for 360050768028107669000000000000015, calling mpathcount
Oct 8 10:16:43 robin multipathd: sde: add path (operator)
Oct 8 10:16:43 robin multipathd: sde: spurious uevent, path already in pathvec
Oct 8 10:16:43 robin multipathd: sdd: add path (operator)
Oct 8 10:16:43 robin multipathd: sdd: spurious uevent, path already in pathvec
Oct 8 10:16:44 robin fe: 24553 (/opt/xensource/sm/LVMoISCSISR <methodCall><methodName>sr_attach</methodName><...) exitted with code 0
Oct 8 10:16:54 robin fe: 24945 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:17:24 robin fe: 24956 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:17:54 robin fe: 24974 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:18:24 robin fe: 24982 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:18:55 robin fe: 24994 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:19:25 robin fe: 25002 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:19:55 robin fe: 25012 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:20:25 robin fe: 25034 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:20:33 robin fe: 24549 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:20:33 robin fe: 24875 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:20:33 robin fe: 25043 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:20:55 robin fe: 25051 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:21:25 robin fe: 25061 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:21:55 robin fe: 25071 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:22:25 robin fe: 25081 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:22:55 robin fe: 25091 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:23:25 robin fe: 25108 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:23:55 robin fe: 25118 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:24:11 robin xenguest: Determined the following parameters from xenstore:
Oct 8 10:24:11 robin xenguest: vcpu/number:1 vcpu/affinity:0 vcpu/weight:0 vcpu/cap:0 nx: 1 viridian: 1 apic: 1 acpi: 1 pae: 1 acpi_s4: 0 acpi_s3: 0
Oct 8 10:24:11 robin fe: 25159 (/opt/xensource/libexec/xenguest -controloutfd 6 -controlinfd 7 -debuglog /tmp...) exitted with code 2
Oct 8 10:24:11 robin kernel: connection4:0: pdu (op 0x37 itt 0x1) rejected. Reason code 0x7
Oct 8 10:24:14 robin kernel: connection4:0: pdu (op 0x38 itt 0x1) rejected. Reason code 0x4
Oct 8 10:24:25 robin fe: 25206 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:24:55 robin fe: 25213 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:25:25 robin fe: 25229 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:25:33 robin fe: 25163 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:25:45 robin kernel: connection4:0: pdu (op 0x42 itt 0x1) rejected. Reason code 0x4
Oct 8 10:25:55 robin fe: 25245 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:26:25 robin fe: 25313 (/usr/sbin/stunnel -fd 6) exitted with code 0
Oct 8 10:26:31 robin kernel: INFO: task multipathd:7713 blocked for more than 120 seconds.
Oct 8 10:26:31 robin kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.


As you can see, there is an error before the PDU is rejected: Oct 8 10:20:33 robin fe: 25043 (/usr/sbin/stunnel -fd 6) exitted with code 0

Best Regards.

José J. Antón Herrerías
Technical Support Manager
janton-***@public.gmane.org





-----Original Message-----
From: open-iscsi-/***@public.gmane.org [mailto:open-iscsi-/***@public.gmane.org] On Behalf Of Michael Christie
Sent: Monday, October 8, 2012 1:27
To: open-iscsi-/***@public.gmane.org
Subject: rejected pdu when doing discovery
Post by r***@public.gmane.org
Hello,
Your error is nothing like what was being discussed in this thread. Changing subject for you :)

What target are you using?

What version of open-iscsi? What kernel? What userspace tools?

When you see this, what is in the target's logs?

How are you discovering another LUN? What command are you running?
Post by r***@public.gmane.org
Oct 6 07:56:21 robin kernel: connection1:0: pdu (op 0x3000003d itt
0x1) rejected. Reason code 0x4
Oct 6 07:56:51 robin kernel: connection1:0: pdu (op 0x3000004d itt
connection1:0: pdu (op 0x30000077 itt 0x1) rejected. Reason code 0x4
Oct 6 07:57:51 robin kernel: connection1:0: pdu (op 0x30000025 itt
connection1:0: pdu (op 0x30000070 itt 0x1) rejected. Reason code 0x4
Oct 6 07:58:35 robin kernel: connection2:0: pdu (op 0x1000007e itt
connection1:0: pdu (op 0x3000005a itt 0x1) rejected. Reason code 0x4
The target did not like something we did. It is reporting a protocol error.

We have never seen this error before. The initiator just logs the error and does nothing.
Post by r***@public.gmane.org
Oct 6 07:59:00 robin kernel: connection2:0: ping timeout of 15 secs
expired, recv timeout 10, last rx 5477313, last ping 5477313, now 5479813
Target stopped responding to us.
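As a rough cross-check of the timeout line above, here is a minimal sketch that computes how long the connection had been silent when the timeout fired. It assumes the "last rx" and "now" values are kernel jiffies and that the kernel runs at HZ=100; neither is stated in the log, and `silence_seconds` is a hypothetical helper, not part of open-iscsi.

```python
import re

# Assumption: the counters are jiffies and HZ is 100. The log does not say so.
ASSUMED_HZ = 100

PING_RE = re.compile(
    r"ping timeout of (?P<ping>\d+) secs expired, recv timeout (?P<recv>\d+), "
    r"last rx (?P<last_rx>\d+), last ping (?P<last_ping>\d+), now (?P<now>\d+)"
)


def silence_seconds(line, hz=ASSUMED_HZ):
    """Return (seconds since last rx, recv timeout + ping timeout)."""
    m = PING_RE.search(line)
    if m is None:
        raise ValueError("not a ping timeout line")
    f = {k: int(v) for k, v in m.groupdict().items()}
    elapsed = (f["now"] - f["last_rx"]) / hz
    budget = f["recv"] + f["ping"]
    return elapsed, budget


# The wrapped log entry from the message above, joined into one line.
line = ("connection2:0: ping timeout of 15 secs expired, recv timeout 10, "
        "last rx 5477313, last ping 5477313, now 5479813")
elapsed, budget = silence_seconds(line)
print(f"silent for {elapsed:.0f}s, allowed {budget}s")
```

Under those assumptions the silence equals the recv timeout plus the ping timeout, which fits a target that stopped answering entirely rather than one that was merely slow.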
Post by r***@public.gmane.org
Oct 6 07:59:00 robin kernel: connection2:0: detected conn error (1011)
We dropped the connection. The scsi eh eventually runs and we cannot recover the device.
Post by r***@public.gmane.org
Oct 6 07:59:21 robin kernel: connection1:0: pdu (op 0x30000079 itt
connection1:0: pdu (op 0x30000049 itt 0x1) rejected. Reason code 0x4
Oct 6 08:00:21 robin kernel: connection1:0: pdu (op 0x30000069 itt
connection1:0: pdu (op 0x3000006d itt 0x1) rejected. Reason code 0x4
Oct 6 08:01:21 robin kernel: connection1:0: pdu (op 0x3000002f itt
connection1:0: pdu (op 0x30000015 itt 0x40) rejected. Reason code 0x7
Oct 6 08:01:51 robin kernel: connection1:0: pdu (op 0x30000065 itt
connection1:0: ping timeout of 15 secs expired, recv timeout 10, last rx 5496921, last ping 5494921, now 5499421
Oct 6 08:02:16 robin kernel: connection1:0: detected conn error (1011)
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: Device offlined - not ready after error recovery
Oct 6 08:02:16 robin last message repeated 8 times
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Unhandled error code
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] CDB: Read(10): 28 00 0f c7 c6 48 00 00 40 00
Oct 6 08:02:16 robin kernel: end_request: I/O error, dev sdb, sector 264750664
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Unhandled error code
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Oct 6 08:02:16 robin kernel: device-mapper: multipath: Failing path 8:16.
Oct 6 08:02:16 robin kernel: sd 13:0:0:0: [sdb] CDB: Write(10): 2a 00 24 27 24 10 00 00 08 00
Oct 6 08:02:16 robin kernel: end_request: I/O error, dev sdb, sector 606544912
After that I can't access the LUN, so I need to reboot the server. Once it is rebooted, the server works and the multipath connections are 2 of 2.
Any ideas?
Best Regards.
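
For anyone reading the failed commands in the log above, here is a minimal sketch that decodes the Read(10)/Write(10) CDB bytes into a starting block and a length. The 10-byte layout (opcode, big-endian 32-bit LBA in bytes 2-5, big-endian 16-bit transfer length in bytes 7-8) is the standard SCSI one; `decode_rw10` is a hypothetical helper name, and matching the LBA to the "sector" in the end_request line assumes 512-byte logical blocks.

```python
def decode_rw10(cdb_hex):
    """Decode a READ(10) or WRITE(10) CDB given as hex text.

    Returns (operation name, starting LBA, number of blocks).
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] not in (0x28, 0x2A):
        raise ValueError("expected a 10-byte READ(10)/WRITE(10) CDB")
    op = "READ(10)" if cdb[0] == 0x28 else "WRITE(10)"
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return op, lba, blocks


# The two CDBs from the log in this message.
for text in ("28 00 0f c7 c6 48 00 00 40 00",
             "2a 00 24 27 24 10 00 00 08 00"):
    print(decode_rw10(text))
```

The decoded LBAs come out as 264750664 and 606544912, the same numbers the kernel prints in the two "end_request: I/O error, dev sdb, sector ..." lines, so those errors are the read and the write that were in flight when the connection dropped.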
Michael Christie
2012-10-08 16:47:48 UTC
Permalink
Post by Jose Joaquin Anton Herrerias
As you can see, there is an error before the PDU is rejected: Oct 8 10:20:33 robin fe: 25043 (/usr/sbin/stunnel -fd 6) exitted with code 0
It's not helpful. We want to know why the target is returning this error. What is in the target logs? Could you get a tcpdump trace at this time, so we can see what the target is seeing?
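
While waiting for the target logs, the reason codes in the "pdu ... rejected" lines can at least be put into words. This is a minimal sketch using the Reject reason table from the iSCSI specification (RFC 3720); `explain_reject` and the regular expression are hypothetical helpers written for the log format shown in this thread, not something open-iscsi provides.

```python
import re

# Reject PDU reason codes as listed in RFC 3720.
REJECT_REASONS = {
    0x02: "Data (payload) digest error",
    0x03: "SNACK reject",
    0x04: "Protocol error",
    0x05: "Command not supported",
    0x06: "Immediate command reject - too many immediate commands",
    0x07: "Task in progress",
    0x08: "Invalid data ACK",
    0x09: "Invalid PDU field",
    0x0A: "Long operation reject - out of resources",
    0x0B: "Negotiation reset",
    0x0C: "Waiting for logout",
}

REJECT_RE = re.compile(
    r"pdu \(op (0x[0-9a-fA-F]+) itt (0x[0-9a-fA-F]+)\) rejected\. "
    r"Reason code (0x[0-9a-fA-F]+)"
)


def explain_reject(line):
    """Return (op, itt, reason code, reason text) for a kernel reject line."""
    m = REJECT_RE.search(line)
    if m is None:
        raise ValueError("not a 'pdu rejected' line")
    op, itt, reason = (int(g, 16) for g in m.groups())
    return op, itt, reason, REJECT_REASONS.get(reason, "reserved/unknown")


print(explain_reject(
    "Oct 8 10:24:14 robin kernel: connection4:0: pdu (op 0x38 itt 0x1) "
    "rejected. Reason code 0x4"))
```

This only names the reason the target gave (0x4 is the protocol error mentioned earlier, 0x7 is "task in progress"). It does not say which PDU field the target objected to, which is why the target logs or a tcpdump trace are still needed.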