Verrazzano (Zimbra mail server) connection count
2011-12: both before and after the system upgrade (latest Zimbra on latest Ubuntu), TCP connection counts climb to about a thousand over a couple of weeks.
- diagnostic: http://syspulse.popdata.bc.ca -> verrazzano -> GraphIt
- diagnostic: ~dlaplante/bin/netstat-tcp-summ.sh runs netstat -t --numeric-hosts | sed | awk | sort
- tweak: Chris applied the TCP settings suggested by Zimbra on 2012-02-13 around 11am.
- monitoring suggestion: tcpdump -i eth0 -w .... 'tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0'
- monitoring suggestion: firewall logs on cabot; look at connection start and end
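The sed and awk stages of netstat-tcp-summ.sh are not recorded on this page, so the following is only a hypothetical stand-in, not the actual script. It is a minimal sketch that assumes the goal is a "count, client, local port" summary like the per-client figures quoted further down; the function name tcp_summ is made up for illustration.

```shell
# Hypothetical stand-in for netstat-tcp-summ.sh (the real sed/awk stages are not shown here).
# Reads `netstat -t --numeric-hosts` output on stdin and prints
# "count foreign-host :local-port" for ESTABLISHED connections, busiest first.
tcp_summ() {
    awk '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" {
            local_port = $4
            sub(/.*:/, "", local_port)          # keep only the text after the last colon
            foreign_host = $5
            sub(/:[^:]*$/, "", foreign_host)    # drop the trailing :port
            count[foreign_host " :" local_port]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k1,1nr -k2
}

# Usage (on the mail server):
#   netstat -t --numeric-hosts | tcp_summ | head
```

Keeping the local port in the key separates the HTTP(S) clients from IMAP and SMTP ones, which matters for the keep-alive theory below.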
2012-02-12 early theories; netstat
verrazzano$ netstat -o --tcp --numeric-hosts
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address        Foreign Address        State        Timer
tcp        0      0 10.80.20.10:46913    10.80.20.10:ldap       ESTABLISHED  off (0.00/0/0)
tcp        0      0 127.0.0.1:7306       127.0.0.1:58528        ESTABLISHED  keepalive (2064.72/0/0)
tcp        0      0 10.80.20.10:ldap     10.80.20.10:46577      ESTABLISHED  keepalive (396.59/0/0)
tcp        0      0 10.80.20.10:ssh      10.80.10.240:35719     ESTABLISHED  keepalive (631.83/0/0)
tcp        0      0 10.80.20.11:smtp     184.108.40.206:60428   TIME_WAIT    timewait (21.97/0/0)
tcp        0      0 10.190.0.142:33024   10.190.0.200:3260      ESTABLISHED  off (0.00/0/0)
tcp6       0      0 10.80.20.11:imaps    220.127.116.11:49712   ESTABLISHED  off (0.00/0/0)
tcp6       0      0 10.80.20.10:https    18.104.22.168:49323    ESTABLISHED  off (0.00/0/0)
tcp6       0      0 10.80.20.11:www      10.40.10.188:64898     ESTABLISHED  off (0.00/0/0)
[...]
Suspicion by Denis: in 2011 he traced the high-connection user Thinh. netstat on his Windows computer did not show the connections reported on Verrazzano, and those connections persisted for days, apparently without any packets. In other words, for a small subset of connections, the client closing its end does not lead the server to discard the connection.
Denis thought this must be a kernel bug, because the kernel is responsible for "keep-alives" (packets sent every hour or so to check whether the other end of the connection is still responding). However, it appears that the Zimbra HTTP server (used by the Zimbra Connector for Outlook, as opposed to IMAP) disables keep-alives.
A few workstations account for most of the zombie connections; the top 3 are listed below. For the top connector, 10.80.10.165 (Chinh Win7), about 56 of the connections were found to persist for over 18 hours!
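One way to check the "persist over 18 hours" observation is to compare two saved netstat snapshots and list the connections that appear in both. This is a minimal sketch, assuming each snapshot is a file of `netstat -t --numeric-hosts` output; the function name persisted and the file names in the usage line are hypothetical. It matches on local and foreign address:port only, so a port pair reused by a new connection would be miscounted as persistent.

```shell
# persisted OLD NEW: print "local foreign" for every ESTABLISHED connection
# that appears in both netstat snapshot files.
persisted() {
    awk '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" {
            key = $4 " " $5
            if (FNR == NR) seen[key] = 1        # first file: remember the connection
            else if (key in seen) print key     # second file: report if seen before
        }
    ' "$1" "$2" | sort -u
}

# Usage (file names are examples only):
#   netstat -t --numeric-hosts > snap-a.txt     # now
#   netstat -t --numeric-hosts > snap-b.txt     # some hours later
#   persisted snap-a.txt snap-b.txt | grep -c '10\.80\.10\.165:'
```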
2012-02-14 after weekend restart of Zimbra process
2012-02-14 09:00 DL: since 15:09 yesterday the top 4 clients have acquired a few more connections
58 10.80.10.165 :www     | 58 10.80.10.165 :www
41 22.214.171.124 :https | 55 126.96.36.199 :https
28 188.8.131.52 :https   | 47 184.108.40.206 :https
2012-02-15 after reboot without IPv6
Still climbing, so suspect that Zimbra is suppressing keep-alives.
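The keep-alive suspicion can be checked against the Timer column of `netstat -o`. The sketch below tallies, per local port, how many ESTABLISHED connections show a keepalive timer and how many show "off". It assumes that "off" on an ESTABLISHED connection means the socket has no keep-alive armed; the function name keepalive_by_port is hypothetical. Ephemeral local ports from outbound connections also appear in the output and can be ignored.

```shell
# Reads `netstat -o --tcp --numeric-hosts` output on stdin and prints
# "local-port keepalive=N off=M" for ESTABLISHED connections.
keepalive_by_port() {
    awk '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && ($7 == "keepalive" || $7 == "off") {
            port = $4
            sub(/.*:/, "", port)                # keep only the local port or service name
            seen[port] = 1
            if ($7 == "keepalive") ka[port]++; else noka[port]++
        }
        END { for (p in seen) printf "%s keepalive=%d off=%d\n", p, ka[p], noka[p] }
    ' | LC_ALL=C sort
}

# Usage:
#   netstat -o --tcp --numeric-hosts | keepalive_by_port
```

If the www and https rows show only "off" entries while ldap and ssh show keepalive timers, that points at the application rather than the kernel.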
Packet capture: all from Chinh's Win7 PC.
dlaplante@verrazzano:~$ sudo tcpdump -i eth0 -w verrazzano-chinh-120215.0940.pcap host 10.80.10.165
Chris installed the ip_conntrack kernel module to provide snapshots in /proc/net/ip_conntrack:
tcp 6 38 TIME_WAIT src=10.80.10.165 dst=10.80.20.11 sport=55311 dport=80 packets=7 bytes=1388 src=10.80.20.11 dst=10.80.10.165 sport=80 dport=55311 packets=6 bytes=631 [ASSURED] mark=0 secmark=0 use=2
tcp 6 431528 ESTABLISHED src=220.127.116.11 dst=10.80.20.11 sport=48355 dport=993 packets=33 bytes=2729 src=10.80.20.11 dst=18.104.22.168 sport=993 dport=48355 packets=18 bytes=2244 [ASSURED] mark=0 secmark=0 use=2
udp 17 73 src=10.80.20.10 dst=10.80.20.80 sport=51991 dport=53 packets=33 bytes=2469 src=10.80.20.80 dst=10.80.20.10 sport=53 dport=51991 packets=33 bytes=5414 [ASSURED] mark=0 secmark=0 use=2
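The third field of each conntrack entry is the number of seconds left before the entry expires, and it is reset whenever a packet is seen. For ESTABLISHED TCP entries that makes it possible to estimate how long a connection has been silent. The sketch below assumes the established timeout is the usual default of 432000 seconds (5 days); if the sysctl was changed on verrazzano, pass the actual value as the first argument. The function name idle_established is hypothetical.

```shell
# idle_established [TIMEOUT]: reads /proc/net/ip_conntrack-style lines on stdin and prints
# "idle-seconds src:sport -> dst:dport" for ESTABLISHED TCP entries, longest idle first.
# TIMEOUT defaults to 432000 s, an assumed (default) established timeout.
idle_established() {
    awk -v timeout="${1:-432000}" '
        $1 == "tcp" && $4 == "ESTABLISHED" {
            # fields 5-8 describe the original direction: src= dst= sport= dport=
            split($5, s, "="); split($6, d, "=")
            split($7, sp, "="); split($8, dp, "=")
            printf "%d %s:%s -> %s:%s\n", timeout - $3, s[2], sp[2], d[2], dp[2]
        }
    ' | sort -k1,1nr
}

# Usage:
#   idle_established < /proc/net/ip_conntrack | head
```

Under that assumption, the ESTABLISHED entry above with 431528 seconds remaining last saw a packet 472 seconds earlier; a zombie connection would show an idle time of many hours.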