System-tcp-issues


Verrazzano (zimbra mailserver) connection count

2011-12: both before and after the system upgrade (latest Zimbra on latest Ubuntu), the TCP connection count climbs to as much as a thousand over a couple of weeks.

  • diagnostic: ~dlaplante/bin/netstat-tcp-summ.sh, which runs netstat -t --numeric-hosts | sed | awk | sort
  • tweak: Chris applied the TCP settings suggested by Zimbra on 2012-02-13 around 11am.
  • monitoring suggestion: tcpdump -i eth0 -w .... 'tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0'
  • monitoring suggestion: firewall logs on cabot; look at connection start and end
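
The contents of netstat-tcp-summ.sh are not reproduced on this page, so the following is only a minimal sketch of what such a summary might look like. The function name tcp_summ is hypothetical, the sketch uses awk and sort without the sed stage, and it assumes that only ESTABLISHED connections are counted, grouped by foreign host and local port.

```shell
#!/bin/sh
# Hypothetical stand-in for netstat-tcp-summ.sh (the real script is not shown).
# Reads `netstat -t --numeric-hosts` output on stdin and prints
# "count<TAB>foreign-host<TAB>:local-port", busiest clients first.
tcp_summ() {
    awk '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" {
            n = split($4, l, ":"); port = l[n]     # local port: text after the last ":"
            host = $5; sub(/:[^:]*$/, "", host)    # foreign address with its port removed
            count[host "\t:" port]++
        }
        END { for (k in count) printf "%d\t%s\n", count[k], k }
    ' | sort -k1,1nr -k2,2
}

# Usage: netstat -t --numeric-hosts | tcp_summ
```

Running this at intervals and keeping the output gives per-client counts that can be compared over time.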

2012-02-12 early theories; netstat

verrazzano$ netstat -o --tcp --numeric-hosts
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       Timer
tcp        0      0 10.80.20.10:46913       10.80.20.10:ldap        ESTABLISHED off (0.00/0/0)
tcp        0      0 127.0.0.1:7306          127.0.0.1:58528         ESTABLISHED keepalive (2064.72/0/0)
tcp        0      0 10.80.20.10:ldap        10.80.20.10:46577       ESTABLISHED keepalive (396.59/0/0)
tcp        0      0 10.80.20.10:ssh         10.80.10.240:35719      ESTABLISHED keepalive (631.83/0/0)
tcp        0      0 10.80.20.11:smtp        24.42.25.220:60428      TIME_WAIT   timewait (21.97/0/0)
tcp        0      0 10.190.0.142:33024      10.190.0.200:3260       ESTABLISHED off (0.00/0/0)
tcp6       0      0 10.80.20.11:imaps       74.82.85.193:49712      ESTABLISHED off (0.00/0/0)
tcp6       0      0 10.80.20.10:https       142.103.199.82:49323    ESTABLISHED off (0.00/0/0)
tcp6       0      0 10.80.20.11:www         10.40.10.188:64898      ESTABLISHED off (0.00/0/0)
[...]
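
In the listing above, the Timer column reads "off" for the imaps, https and www connections, meaning no keep-alive timer is running on them. As a rough sketch, the ESTABLISHED connections without any timer can be counted per local port as follows. The function name no_keepalive_by_port is hypothetical, and the sketch assumes the column layout shown above.

```shell
#!/bin/sh
# Reads `netstat -o --tcp --numeric-hosts` output on stdin and prints
# "count<TAB>local-port" for ESTABLISHED connections whose timer is "off".
no_keepalive_by_port() {
    awk '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && $7 == "off" {
            n = split($4, a, ":")    # local port: text after the last ":"
            off[a[n]]++
        }
        END { for (p in off) printf "%d\t%s\n", off[p], p }
    ' | sort -k1,1nr -k2,2
}

# Usage: netstat -o --tcp --numeric-hosts | no_keepalive_by_port
```

Client-side connections made by the server itself will show up under their ephemeral local port, so the service ports (www, https, imaps) are the rows of interest.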

Suspicion by Denis: in 2011 he traced a high-connection user, Thinh. netstat on the user's Windows computer did not show the connections reported on Verrazzano, and those connections persisted for days, apparently without any packets. Denis thought this must be a kernel bug, because the kernel is responsible for "keep-alives" (probes sent on an idle connection every hour or so, two hours by default on Linux, to check whether the other end is still responding). In other words, for a small subset of connections, the client closing the connection does not lead the server to discard it. However, it appears that the Zimbra HTTP server (used by the Zimbra Connector for Outlook, as opposed to IMAP) disables keep-alives, which would leave the kernel with no way to notice that the peer has gone. A few workstations account for most of the zombie connections; the top 3 are listed below. For the top connector, 10.80.10.165 (Chinh's Win7 PC), about 56 of the connections were found to persist for over 18 hours.
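
One way to check which connections persist, sketched here under the assumption that netstat output was saved to a file at two different times, is to intersect the ESTABLISHED endpoint pairs of the two snapshots. The function names established and persisting are hypothetical, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Print the "local foreign" endpoint pairs of ESTABLISHED connections
# found in one saved netstat snapshot (file name in $1), sorted.
established() {
    awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED" { print $4, $5 }' "$1" | sort -u
}

# Print the connections that appear in both snapshots ($1 older, $2 newer).
persisting() {
    old=$(mktemp) && new=$(mktemp) || return 1
    established "$1" > "$old"
    established "$2" > "$new"
    comm -12 "$old" "$new"    # keep only lines common to both sorted lists
    rm -f "$old" "$new"
}

# Usage: persisting netstat-monday.txt netstat-tuesday.txt | grep -c 10.80.10.165
```

A pair that appears in both snapshots could in principle be a new connection reusing the same client port, so over long intervals the result is an upper bound rather than an exact count.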


2012-02-14 after weekend restart of Zimbra process

2012-02-14 09:00 DL: since 15:09 yesterday the top 4 clients have acquired a few more connections

	count	client	port	|	count	client	port	
	58	10.80.10.165	:www	|	58	10.80.10.165	:www	
	41	142.103.199.69	:https	|	55	142.103.199.69	:https	
	28	142.103.199.82	:https	|	47	142.103.199.82	:https	

2012-02-15 after reboot without IPv6

Still climbing, so suspect that Zimbra is suppressing keep-alives.

Packet capture of all traffic to and from Chinh's Win7 PC:

dlaplante@verrazzano:~$ sudo tcpdump -i eth0 -w verrazzano-chinh-120215.0940.pcap host 10.80.10.165

Chris installed the ip_conntrack kernel module, which provides snapshots in /proc/net/ip_conntrack:

tcp      6 38 TIME_WAIT src=10.80.10.165 dst=10.80.20.11 sport=55311 dport=80 packets=7 bytes=1388 src=10.80.20.11 dst=10.80.10.165 sport=80 dport=55311 packets=6 bytes=631 [ASSURED] mark=0 secmark=0 use=2
tcp      6 431528 ESTABLISHED src=70.28.245.91 dst=10.80.20.11 sport=48355 dport=993 packets=33 bytes=2729 src=10.80.20.11 dst=70.28.245.91 sport=993 dport=48355 packets=18 bytes=2244 [ASSURED] mark=0 secmark=0 use=2
udp      17 73 src=10.80.20.10 dst=10.80.20.80 sport=51991 dport=53 packets=33 bytes=2469 src=10.80.20.80 dst=10.80.20.10 sport=53 dport=51991 packets=33 bytes=5414 [ASSURED] mark=0 secmark=0 use=2
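
The third field of each entry is the number of seconds before the tracking entry expires, and it is reset whenever a packet is seen. As a rough sketch, the idle time of each ESTABLISHED connection can therefore be estimated by subtracting that value from the established-state timeout. The function name idle_established is hypothetical, and the default of 432000 seconds (5 days) is an assumption that this server uses the kernel's stock timeout; pass a different value as the first argument otherwise.

```shell
#!/bin/sh
# Reads /proc/net/ip_conntrack lines on stdin and prints
# "idle-seconds<TAB>client<TAB>server-port" for ESTABLISHED TCP entries,
# longest idle first. $1 is the established-state timeout in seconds
# (assumed here to be the kernel default of 432000).
idle_established() {
    awk -v max="${1:-432000}" '
        $1 == "tcp" && $4 == "ESTABLISHED" {
            src = dport = ""
            for (i = 5; i <= NF; i++) {
                # keep only the first src= and dport=, i.e. the original direction
                if (src == "" && $i ~ /^src=/)     src = substr($i, 5)
                if (dport == "" && $i ~ /^dport=/) dport = substr($i, 7)
            }
            printf "%d\t%s\t%s\n", max - $3, src, dport
        }
    ' | sort -k1,1nr
}

# Usage: idle_established < /proc/net/ip_conntrack
```

Entries idle for many hours on ports 80 or 443 would be candidates for the zombie connections described above.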