[Lustre-discuss] EAGAIN / ECONNRESET messages
Heiko Schröter
schroete at iup.physik.uni-bremen.de
Tue Nov 24 00:35:26 PST 2009
Hello,
On three of eight OSTs I see sporadic messages like these:
sadosrd21
Nov 24 09:11:52 sadosrd21 LustreError: 5518:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.133
Nov 24 09:12:01 sadosrd21 LustreError: 5516:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.19
sadosrd24
Nov 21 01:42:13 sadosrd24 LustreError: 9097:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.111
Nov 21 01:42:13 sadosrd24 LustreError: 9098:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.114
Nov 22 04:01:59 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.116
Nov 23 01:42:16 sadosrd24 LustreError: 9099:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.34
Nov 23 01:42:27 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.34
Nov 23 01:42:59 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.116
sadosrd25
Nov 22 04:02:06 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.19
Nov 23 04:00:53 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.114
Nov 23 04:01:01 sadosrd25 LustreError: 5049:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.115
Nov 23 04:01:02 sadosrd25 LustreError: 5048:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.109
Nov 23 09:12:57 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.111
Nov 24 01:41:40 sadosrd25 LustreError: 5048:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.110
Nov 24 01:42:57 sadosrd25 LustreError: 5051:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.111
Nov 24 01:43:03 sadosrd25 LustreError: 5049:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.110
Nov 24 01:43:08 sadosrd25 LustreError: 5051:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.100
Nov 24 01:43:11 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.122
Error numbers:
/usr/include/asm-generic/errno-base.h:#define EAGAIN 11 /* Try again */
/usr/include/asm-generic/errno.h:#define ECONNRESET 104 /* Connection reset by peer */
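To see how the two error codes are distributed across the servers, the log lines above can be tallied with a short script. This is only a sketch: the regular expression is an assumption fitted to the syslog format shown in this post, the function name `tally_hello_errors` is made up for illustration, and the errno names are limited to the two header entries quoted above.

```python
import re
from collections import Counter

# Error names as quoted from the errno headers in this post.
ERRNO_NAMES = {11: "EAGAIN", 104: "ECONNRESET"}

# Fitted to the ksocknal_recv_hello() syslog lines shown above (an assumption).
HELLO_ERROR = re.compile(
    r"^\w+\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+LustreError:.*"
    r"Error (?P<code>-\d+) reading HELLO from (?P<peer>[\d.]+)"
)

def tally_hello_errors(lines):
    """Count HELLO read errors per (server, errno name)."""
    counts = Counter()
    for line in lines:
        match = HELLO_ERROR.search(line)
        if match is None:
            continue  # not a HELLO error line
        number = -int(match.group("code"))
        name = ERRNO_NAMES.get(number, f"errno {number}")
        counts[(match.group("host"), name)] += 1
    return counts

# Two of the sadosrd24 lines from above, used as sample input.
sample = [
    "Nov 23 01:42:16 sadosrd24 LustreError: 9099:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.34",
    "Nov 23 01:42:27 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.34",
]
for (host, name), count in sorted(tally_hello_errors(sample).items()):
    print(host, name, count)
```

Feeding the full syslog of each server into `tally_hello_errors` gives one count per server and error name, which makes it easier to compare the three affected OSTs.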
They seem to be related to heavy network traffic to and from these OSTs.
Network driver: e1000
Lustre: 1.6.6
Kernel: vanilla 2.6.22.19
What triggers such messages?
Is it anything to worry about?
Thanks and Regards
Heiko
Network adapter statistics for the above RAIDs:
sadosrd21 ~ # ethtool -S eth0
NIC statistics:
rx_packets: 3476732178
tx_packets: 8161698729
rx_bytes: 1261677735249
tx_bytes: 11684960617899
rx_broadcast: 96324977
tx_broadcast: 31080
rx_multicast: 885
tx_multicast: 12
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 885
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 0
rx_missed_errors: 112425
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 485691240
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 202994789
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 2220028952
tx_tcp_seg_failed: 0
rx_flow_control_xon: 926991076
rx_flow_control_xoff: 2476536244
tx_flow_control_xon: 3754
tx_flow_control_xoff: 6876
rx_long_byte_count: 1261677735249
rx_csum_offload_good: 3415421552
rx_csum_offload_errors: 1134
rx_header_split: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 53162812
dropped_smbus: 0
sadosrd24 ~ # ethtool -S eth0
NIC statistics:
rx_packets: 4090343679
tx_packets: 2636690225
rx_bytes: 5479498759229
tx_bytes: 2039673228907
rx_broadcast: 32078587
tx_broadcast: 28901
rx_multicast: 316
tx_multicast: 6
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 316
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 11278
rx_missed_errors: 78171
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 194098104
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 68502186
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 410577015
tx_tcp_seg_failed: 0
rx_flow_control_xon: 234761468
rx_flow_control_xoff: 1632413652
tx_flow_control_xon: 1516
tx_flow_control_xoff: 2889
rx_long_byte_count: 5479498759229
rx_csum_offload_good: 4067175471
rx_csum_offload_errors: 0
rx_header_split: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 20807887
dropped_smbus: 0
sadosrd25 ~ # ethtool -S eth0
NIC statistics:
rx_packets: 4305347487
tx_packets: 3031165604
rx_bytes: 5797498509449
tx_bytes: 2043989105691
rx_broadcast: 37618726
tx_broadcast: 28310
rx_multicast: 386
tx_multicast: 6
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 386
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 4738
rx_missed_errors: 223116
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 156915562
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 50086469
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 396787000
tx_tcp_seg_failed: 0
rx_flow_control_xon: 184756690
rx_flow_control_xoff: 1346260879
tx_flow_control_xon: 7451
tx_flow_control_xoff: 13175
rx_long_byte_count: 5797498509449
rx_csum_offload_good: 4277898711
rx_csum_offload_errors: 0
rx_header_split: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 24585106
dropped_smbus: 0
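To put the nonzero counters above into proportion, the `ethtool -S` output can be parsed and the missed-frame counter compared with the number of received packets. This is a rough sketch, not a diagnosis: the helper names `parse_ethtool_stats` and `missed_ratio` are made up for illustration, and whether a given ratio is harmful is not something the script can decide.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a dict of counter name -> int."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():  # skips the "NIC statistics:" header
            stats[name.strip()] = int(value)
    return stats

def missed_ratio(stats):
    """Return rx_missed_errors relative to rx_packets (0.0 if no packets)."""
    packets = stats.get("rx_packets", 0)
    return stats.get("rx_missed_errors", 0) / packets if packets else 0.0

# Excerpt of the sadosrd21 output quoted above.
sample = """NIC statistics:
rx_packets: 3476732178
rx_no_buffer_count: 0
rx_missed_errors: 112425
rx_flow_control_xoff: 2476536244
"""
stats = parse_ethtool_stats(sample)
print(f"missed/received: {missed_ratio(stats):.2e}")
```

Running the same two functions on the output of all three servers allows a side-by-side comparison; sampling the counters twice and taking the difference would show whether they still grow while the HELLO errors are being logged.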