
19 points for clearly understanding load balancing under Linux

First, current website architecture is generally divided into a load balancing layer, a web layer and a database layer. In practice I usually add one more layer, the file server layer, because as a website's PV (page views) grows, the pressure on the file servers grows with it; with the maturity of MooseFS and DRBD+Heartbeat, however, this is no longer a big problem. The load balancing layer at the very front of the website is called the Director, and its job is to distribute requests. The most common distribution method is polling (round robin).
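The polling method mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not how LVS or Nginx implement it; the class name `RoundRobinDirector` and the backend addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinDirector:
    """Hands out backends in a fixed rotating order (simple polling)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list endlessly: a, b, c, a, b, c, ...
        self._rotation = cycle(list(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical web-layer addresses, used only for illustration.
director = RoundRobinDirector(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
picks = [director.pick() for _ in range(5)]
```

After the third request the rotation wraps around, so the fourth request goes back to the first backend. Real directors usually add weights and skip backends that fail health checks.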

Second, F5 achieves load balancing in hardware. It is mostly used in CDN systems, for load balancing Squid reverse-proxy acceleration clusters. It is a professional hardware load balancing device, especially suitable where the number of new connections per second and the number of concurrent connections are both high. LVS and Nginx are software implementations, but they are also very stable and perform quite well under high concurrency.

Third, Nginx depends less on the network. In theory, as long as a backend can be pinged and its web pages are reachable, Nginx can connect to it. Nginx can also distinguish between internal and external networks: a node that sits on both effectively has a backup line on a single machine. LVS depends more heavily on the network environment. At present, the effect can only be guaranteed when the servers are in the same network segment and LVS distributes traffic in direct routing mode.

Fourth, the mature high-availability load balancing solutions today are LVS+Keepalived and Nginx+Keepalived. In the past Nginx had no mature dual-machine backup solution, but one can be built with shell-script monitoring; those who are interested can try it, and my project implementation plan on 51cto has the details. In addition, high availability for Nginx load balancing can also be achieved with DNS polling; those who are interested can refer to Zhang Yan's related articles.

Fifth, "cluster" originally meant the web cluster or Tomcat cluster behind the load balancer, but today the word usually refers to the whole system architecture, including the load balancers, the backend application server cluster and so on. People like to call any Linux cluster "LVS", but I think the two should be distinguished in the strict sense.

Sixth, high availability in load balancing means HA for the load balancer itself: if one load balancer fails, the other takes over in under one second. The most commonly used software is Keepalived and Heartbeat. The load balancer solutions proven in production are LVS+Keepalived and Nginx+Keepalived.
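The takeover described above can be modelled as a simple timeout rule: the backup claims the service address once the master has been silent for too long. The sketch below is a toy model under that assumption, not Keepalived's or Heartbeat's actual implementation; the class `VipFailover`, the node names and the one-second `dead_after` value are all hypothetical.

```python
class VipFailover:
    """Toy model of two load balancers sharing one virtual IP (VIP)."""

    def __init__(self, master, backup, dead_after=1.0):
        self.master = master
        self.backup = backup
        self.dead_after = dead_after   # seconds of silence before failover
        self.last_heartbeat = 0.0      # time of the master's last heartbeat

    def heartbeat(self, now):
        """Record that the master announced itself at time `now`."""
        self.last_heartbeat = now

    def vip_holder(self, now):
        """Return the node that should own the VIP at time `now`."""
        if now - self.last_heartbeat < self.dead_after:
            return self.master
        return self.backup


pair = VipFailover("lb-master", "lb-backup", dead_after=1.0)
pair.heartbeat(10.0)
holder_while_healthy = pair.vip_holder(10.5)   # master was heard 0.5 s ago
holder_after_silence = pair.vip_holder(11.2)   # master silent for 1.2 s
```

In this model the master takes the VIP back as soon as it sends a heartbeat again; whether that is desirable in production is a design choice, since every switch briefly interrupts connections.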

Seventh, LVS has many advantages: 1) strong load capacity; 2) stable operation (because a mature HA solution exists); 3) it generates no traffic of its own; 4) it can support basically any application. Because of these advantages LVS has a lot of fans. Nothing is absolute, though: LVS depends too heavily on the network, and in application scenarios with a relatively complex network environment I have had to give it up and choose Nginx instead.

Eighth, Nginx depends little on the network, its regular-expression support is powerful and flexible, and these strong features attract many people. Its configuration is also convenient and simple. It is basically my first consideration for small and medium-sized projects. Of course, if the budget allows, F5 is the best choice.

Ninth, a large website architecture can use F5, LVS and Nginx together, choosing two of them or all three. If F5 is ruled out for budget reasons, the very front of the website should be LVS; that is, DNS should point to the LVS load balancer. The advantages of LVS make it well suited to this task. Important IP addresses are also best managed by LVS, such as the IP of the database and the IP of the web service servers. These addresses come to be used more and more widely over time, and if one of them is changed, faults follow. It is therefore safest to hand these important IPs over to LVS.

Tenth, the VIP address is an IP virtualized by Keepalived. It is an external public IP and is the IP that DNS points to, so when designing the website architecture you must apply to your IDC for an additional external IP.

Eleventh, in actual project implementations, both LVS and Nginx support HTTPS very well; LVS in particular is relatively easy to handle.

Twelfth, troubleshooting is convenient with both LVS+Keepalived and Nginx+Keepalived. If a system failure or server-related failure occurs, DNS can be pointed temporarily at one of the real web servers behind them, which contains the failure within a short time. After all, PV is money for advertising and e-commerce websites, which is exactly why high-availability load balancing has to be designed in. For large advertising websites I recommend going straight to a CDN system.

Thirteenth, Linux clusters are now talked about as if they were mythical, but they are not that complicated. The key is your application scenario and which tool fits it. Nginx, LVS and F5 are not myths; use whichever one is convenient.

Fourteenth, there is the issue of session sharing, an old and recurring problem. Nginx can use the ip_hash mechanism to solve the session problem, and F5 and LVS both have session persistence mechanisms for it. Alternatively, sessions can be written to the database, which is also a good way to share them, although it increases the load on the database. The choice is up to the system architect.
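The idea behind IP-hash session persistence is that the same client address always maps to the same backend, so its session stays on one server. A minimal sketch follows. Nginx's documentation describes its ip_hash key as the first three octets of an IPv4 address (or the whole IPv6 address); the hash function used here (SHA-256), the function name `pick_backend` and the backend names are my own assumptions and do not reproduce Nginx's internal algorithm.

```python
import hashlib
import ipaddress


def pick_backend(client_ip, backends):
    """Map a client IP to one backend, always the same one for that IP."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        key = addr.packed[:3]      # first three octets, e.g. 203.0.113.x
    else:
        key = addr.packed          # full 16-byte IPv6 address
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


# Hypothetical backend names, for illustration only.
web_servers = ["web1", "web2", "web3"]
first_visit = pick_backend("203.0.113.7", web_servers)
second_visit = pick_backend("203.0.113.7", web_servers)   # same backend again
```

One consequence of hashing on the address is that the mapping changes for many clients when a backend is added or removed, and clients behind one large NAT all land on the same server. Storing sessions in the database avoids both at the cost of extra database load.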

Fifteenth, the e-commerce website I currently maintain sees a concurrency of about 1000; the securities information website I worked on before was about 100, and a large online advertising site about 3000. I feel that concurrency at the web layer is less and less of a problem: with today's powerful servers and Nginx's high concurrency capacity as a web server, the web layer copes well. On the contrary, the pressure on the file server layer and the database layer keeps growing. A single NFS server is no longer up to the job; the good solutions now are MooseFS and DRBD+Heartbeat+NFS. For MySQL, which I prefer, the mature approach is still master-slave; if the pressure becomes too great, I have to choose an Oracle RAC dual-machine solution.

Sixteenth, under Zhang Yan's influence everyone is now switching to Nginx (especially as a web server). In fact, with good server hardware and enough memory, Apache's concurrency capacity is not weak, and the bottleneck of the whole website is still likely to be the database. I suggest getting to know both Apache and Nginx: using Nginx at the front end for load balancing and Apache at the back end as the web server works quite well.

Seventeenth, Heartbeat's split-brain problem is not as serious as people imagine, and Heartbeat can be considered for production environments. DRBD+Heartbeat is a mature combination and is worth mastering. I have used it to replace EMC shared storage on quite a few occasions; after all, a price of 300,000 is not acceptable to every customer.

Eighteenth, no matter how mature the design is, I recommend configuring Nagios to monitor the servers in real time. Mail and SMS alerts can both be turned on; after all, a mobile phone is always with you. You can also purchase a dedicated commercial site-scanning service that checks your website every minute. If it finds the site is not alive, it sends a warning to your email or contacts you directly.
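The "is it alive" check that such monitoring performs can be approximated with a plain TCP connection attempt. The sketch below is not a Nagios plugin and does not send mail or SMS; the function names `is_alive` and `find_dead`, the host names and the stub probe are hypothetical and only show the shape of the check.

```python
import socket


def is_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_dead(servers, probe=is_alive):
    """Return the (host, port) pairs that fail the probe."""
    return [(host, port) for host, port in servers if not probe(host, port)]


# A stub probe stands in for the network here so the example is repeatable:
# it pretends that only "web1" answers.
def stub_probe(host, port):
    return host == "web1"


dead = find_dead([("web1", 80), ("web2", 80)], probe=stub_probe)
```

In real use the default probe would be called once a minute from a scheduler, and any non-empty result would trigger the alert. A successful TCP connect only proves the port is open, so an HTTP request against a known page is a stronger check.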

Nineteenth and last, on site security: I recommend using hardware firewalls, specifically a Huawei layer-3 firewall plus a Tiantai web firewall, and DDoS protection must be in place. With those deployed, iptables and SELinux on the Linux servers themselves can both be turned off. Of course, the fewer ports that are open, the better.