If you want to build a heavily load-balanced, well-cached infrastructure and your company is past 600 searches per second, installing Varnish and nginx on physical hardware doesn't scale anymore. But your virtual machines don't behave the way you expect either: they hang, and when they are overstressed even the host's Ethernet card goes to hell. And yeah, you've lost connectivity to every VM running on that host. That sucks hard, of course, and you'll have to write a nice report for your manager, which sucks even harder.
So, let's get dirty. This guide applies to VMs running as guests on the Xen hypervisor, but the same symptoms may help you get this working on another hypervisor.
- Symptoms:
Assuming your application server and JVM are tuned to run the apps balanced or cached by nginx or Varnish in good shape, and your configuration is enough to handle all your traffic, here are some of the symptoms:
-- Even after tuning your conntrack settings, once you fire 1700 to 6k connections at the echo server behind the virtualized nginx or Varnish, the VMs running them stop responding properly to requests and, after a while, hang and lose connectivity.
-- The Ethernet card of the physical host hangs, and you lose the entire box.
-- You see a LOT of "delayed acks sent" in the TCP counters.
Well, people, here we have a clue. What does a growing delayed ACK counter mean? It means there are constant window renegotiations.
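To watch that counter from inside a guest, here is a minimal Python sketch. It assumes a Linux VM that exposes /proc/net/netstat and that the `DelayedACKs` field of the `TcpExt` group is the counter in question; both are assumptions about your kernel, not something specific to this setup. The helper names are made up for the example.

```python
from pathlib import Path


def parse_tcpext(text):
    """Return the TcpExt counters from /proc/net/netstat-style text as a dict.

    The file stores each group as two lines: one with the field names and
    one with the values, both prefixed with the group name.
    """
    rows = [line.split() for line in text.splitlines() if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def delayed_acks(text):
    """Return the DelayedACKs counter, or 0 if the field is not present."""
    return parse_tcpext(text).get("DelayedACKs", 0)


if __name__ == "__main__":
    path = Path("/proc/net/netstat")  # assumed location on a Linux guest
    if path.exists():
        print("DelayedACKs:", delayed_acks(path.read_text()))
```

Run it a few times under load: what matters is how fast the number grows, not its absolute value.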
OK, so what's the deal? The deal is that packets leaving the bridge on their way out of your physical Ethernet cards exceed the MTU set for the interface, causing a lot of fragmentation and packet loss.
Now, the magic solution is to lower the MTU a little on the VM's Ethernet interface. 1436 is a good value (the same one you would assign to an interface that has to deal with encryption and decryption, IPsec headers, etc.).
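To make the fragmentation argument concrete, here is a rough Python sketch of the arithmetic. It assumes plain IPv4 with a 20-byte header and no options, and it treats 1436 as the usable size on the way out purely for illustration. The interface name `eth0` and both function names are hypothetical; the last helper only builds the `ip link` command, it does not run it.

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options


def fragments_needed(packet_size, mtu):
    """Number of IPv4 fragments needed to carry packet_size bytes at this MTU."""
    if packet_size <= mtu:
        return 1
    payload = packet_size - IPV4_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return -(-payload // per_fragment)  # ceiling division


def mtu_command(iface, mtu=1436):
    """Build (not run) the iproute2 command that sets the MTU on an interface."""
    return ["ip", "link", "set", "dev", iface, "mtu", str(mtu)]


if __name__ == "__main__":
    # A full 1500-byte packet no longer fits in 1436 bytes and gets split.
    print(fragments_needed(1500, 1436))
    # A guest already sending 1436-byte packets goes through in one piece.
    print(fragments_needed(1436, 1436))
    print(" ".join(mtu_command("eth0")))
```

The point of lowering the MTU inside the VM is to land in the second case: the guest never produces a packet that has to be split further down the path.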
Now you can virtualize your favorite load balancer or caching software without performance issues on an infrastructure with a high connection demand.
Regards!
Hybrid World
Saturday, June 4, 2011
Wednesday, April 27, 2011
Tuning dom0 and ip_conntrack for VMs under high network stress
I'm opening this blog to help the open source virtualization community by answering some questions we asked ourselves while implementing a highly scalable hybrid cloud, and to keep our IT tech team updated with more than the 140 characters Twitter allows ;).
Let's start with an issue we've faced:
- If you have a virtualized infrastructure and need to support high network stress on VMs running, for example, nginx, Varnish, or other caching and balancing software, you NEED to tune some kernel network parameters on the physical server where the hypervisor runs for this to work like a charm.
-- First, you need to raise the memory available to Dom0. Why?
---- Networking tables, metadata operations and other stuff are allocated in Dom0 memory, so we need to increase it to match our environment. This can be done by editing the Dom0 minimum memory value (dom0-min-mem) in /etc/xen/xend-config.sxp, and also through the dom0_mem option on the kernel line in grub.conf.
For example, in an environment where the VMs will serve 1300 requests per second, we increased the Dom0 minimum memory to 2048 MB.
NOTE: You need to disable Dom0 memory ballooning!
-- Second, you need to tune some ip_conntrack values so the physical server doesn't hang waiting for FINs and orphaned responses. In an environment where we fixed the Dom0 minimum memory at 2 GB, we can raise the maximum number of tracked connections to about 120000 and set the ip_conntrack values like this:
net.ipv4.ip_conntrack_max = 120000
net.ipv4.netfilter.ip_conntrack_generic_timeout = 10
net.ipv4.netfilter.ip_conntrack_udp_timeout = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait = 10
net.ipv4.netfilter.ip_conntrack_max = 120000
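If you want to check whether a host already carries these values, the list above can be turned into a small Python sketch. It assumes the usual Linux convention that a sysctl key maps to a file under /proc/sys with the dots replaced by slashes, and that the host kernel still uses these `ip_conntrack` key names. The function names are made up for the example.

```python
from pathlib import Path

# The values listed above, keyed by sysctl name.
CONNTRACK_SETTINGS = {
    "net.ipv4.ip_conntrack_max": 120000,
    "net.ipv4.netfilter.ip_conntrack_generic_timeout": 10,
    "net.ipv4.netfilter.ip_conntrack_udp_timeout": 10,
    "net.ipv4.netfilter.ip_conntrack_tcp_timeout_close": 10,
    "net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait": 10,
    "net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait": 10,
    "net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait": 10,
    "net.ipv4.netfilter.ip_conntrack_max": 120000,
}


def proc_path(key, root="/proc/sys"):
    """Map a sysctl key to its file under /proc/sys (dots become slashes)."""
    return Path(root) / key.replace(".", "/")


def render_sysctl(settings):
    """Render the settings as lines suitable for a sysctl.conf file."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"


def mismatches(settings, root="/proc/sys"):
    """Return {key: current value, or None if the key is missing} for every
    setting whose live value differs from the wanted one."""
    result = {}
    for key, wanted in settings.items():
        path = proc_path(key, root)
        current = path.read_text().strip() if path.exists() else None
        if current != str(wanted):
            result[key] = current
    return result


if __name__ == "__main__":
    for key, current in mismatches(CONNTRACK_SETTINGS).items():
        print(f"{key}: currently {current}, wanted {CONNTRACK_SETTINGS[key]}")
```

A key reported as `None` means the file does not exist on that host, which usually just means the conntrack module is not loaded or the kernel uses different key names.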
And that's all, folks! Enjoy it.