7.Keepalived实现Haproxy和Nginx其它服务的高可用.md

基于VRRP Script 实现其他应用的HA

nginx和haproxy的服务HA

上面讲了virtual_server自带集成的语句块实现了LVS的HA,下面用脚本曲线救国了

image-20241223151715056

image-20241223181016176

利用kill 0 检查进程是否OK

image-20241223152122691

image-20241223152109718

image-20241223153012907

只是检查进程死活,并不代表好坏,curl才是业务的检查

haproxy+keepalived

yum -y install haproxy

vim /etc/haproxy/haproxf.cfg
    ...
    # 结尾加上
    listen www.ming.org_80
    bind 192.168.126.100:80
    server server1 192.168.126.138:80 check
    server server2 192.168.126.139:80 check

image-20241223161351491

但是haproxy的bind地址要存在的,VIP没有落在的从节点是没有这个IP的,所以haproxy服务起不来

image-20241223162454355

通过journalct -u可见👇

image-20241223162441999

修改内核参数使得haproxy监听在VIP(没落过来-从节点)上

image-20241223163202181

echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf
sysctl -p

image-20241223163401650

此时就可以了

image-20241223163439020

同样主节点也要配置否则VIP飘走了,haproxy到时候就起不来了

image-20241223163513774

image-20241223173721279

删除之前实验的lvs配置就是vs的语句块,主从都删,这些配置会导致lvs的转发规则👇,清干净就行,和haproxy冲突了

image-20241223174253677

测试OK

image-20241223175226212

实现haproxy的HA

要实现haproxy挂了,keepavlied的VIP也要切走

把脚本写到keepalived的主配置文件里,推荐这么做,以便复用

[root@ka01 ~]# cat /apps/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id 192.168.126.136
   vrrp_mcast_group4 230.0.0.0
}

vrrp_script check_down {
    # 翻译下👇:如果文件不存在就为True,真就是不执行下面的操作OPTIONS(interval\weight\fall\rise\timeout这些);如果文件存在(文件被创建出来了)就为fasle假-那么就会执行下面的OPTIONS特别时weight权重-30。
    script "[ ! -f /etc/keepalived/down ]"
    interval 1
    # 这个权重就是vrrp的优先级,具体时在子配置文件vrrp语句块里定义的
    weight -30
    fall 3
    rise 2
    timeout 2
}

include /apps/keepalived/etc/keepalived/conf.d/*.conf
[root@ka01 ~]#


# 调用上面定义的脚本
[root@ka01 ~]# vim /apps/keepalived/etc/keepalived/conf.d/www.ming.org.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 3
    #nopreempt
    preempt_delay 10
    authentication {
        auth_type PASS
        auth_pass cisco
    }
    virtual_ipaddress {
        192.168.126.100/24 dev eth0 label eth0:1
    }
    unicast_src_ip 10.1.1.1
    unicast_peer{
        10.1.1.2
    }

    #notify_master "/apps/keepalived/bin/notify.sh master"
    #notify_backup "/apps/keepalived/bin/notify.sh backup"
    #notify_fault "/apps/keepalived/bin/notify.sh fault"

    # Allow packets addressed to the VIPs above to be received
    accept

    track_script {
        check_down
    }
}

主从keepalived节点都配置下

然后由于此时脚本判断原则是:script "[ ! -f /etc/keepalived/down ]",所以此时该down文件不存在=>取非为真,所以脚本执行为真,就不会启用语句块下面的配置项。也就不会切主,VIP不会飘。

测试,切换,只需要创建该文件就行了

image-20241224112314916

image-20241224112333046

创建文件触发脚本👇,主从切换,VIP漂移了

image-20241224112553634

image-20241224112639100

继续在从节点测试,创建down文件,可以看到VIP就让出去了👇

image-20241224113325775

两边都有脚本执行失败的情况,所以还是看各自原始优先级都减去weight后的大小了。

image-20241224114430472

注释掉nopreempt 观察可见

image-20241224114458397

恢复了

image-20241224114508699

image-20241224115835274

利用killall实现haproxy的检查-不是很严谨

错误写法

image-20241224144457000

脚本后面 不要总是带[]方括号,这种test判断是否为比较结果为真的语法,👆上图一定会丢失VIP的,主变从的,因为脚本存在错误👇

image-20241224145100368

然后reload后看status果然是2👇

image-20241224144934967

去掉方括号就行

[root@ka01 ~]# cat /apps/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id 192.168.126.136
   vrrp_mcast_group4 230.0.0.0
}

vrrp_script check_down {
        script "[ ! -f /etc/keepalived/down ]"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}

vrrp_script check_haproxy {
        script "/usr/bin/killall -0 haproxy"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}


include /apps/keepalived/etc/keepalived/conf.d/*.conf
[root@ka01 ~]#

[root@ka01 ~]# cat /apps/keepalived/etc/keepalived/conf.d/www.ming.org.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 3
    #nopreempt
    preempt_delay 10
    authentication {
        auth_type PASS
        auth_pass cisco
    }
    virtual_ipaddress {
        192.168.126.100/24 dev eth0 label eth0:1
    }
    unicast_src_ip 10.1.1.1
    unicast_peer{
        10.1.1.2
    }

    #notify_master "/apps/keepalived/bin/notify.sh master"
    #notify_backup "/apps/keepalived/bin/notify.sh backup"
    #notify_fault "/apps/keepalived/bin/notify.sh fault"

    # Allow packets addressed to the VIPs above to be received
    accept

    track_script {
        check_down
        check_haproxy
    }
}
[root@ka01 ~]#

reload即可,不放心restart都没有问题

VIP也在,没问题

image-20241224145621588

停掉haproxy观察是否切换主从

image-20241224145930208

还可以写脚本去调用,这里依旧使用不太严谨的killall

[root@ka01 ~]# cat /etc/keepalived/check_haproxy.sh
#!/bin/bash

/usr/bin/killall -0 haproxy
[root@ka01 ~]#
[root@ka01 ~]# chmod +x /etc/keepalived/check_haproxy.sh
[root@ka01 ~]# ll /etc/keepalived/check_haproxy.sh
-rwxr-xr-x 1 root root 41 Dec 24 15:01 /etc/keepalived/check_haproxy.sh
[root@ka01 ~]#

image-20241224150237288

image-20241224150437100

image-20241224150458239

由于是restart,会将VIP让出去的,但是由于开启了preempt_delay 10所以10s后会抢回来的

image-20241224150642062

然后停掉haproxy,发现可以实现VIP的切换的👇

image-20241224150804297

image-20241224151508621

ka001从100 减到了70,然后交出VIP变成从了👆

👇这是ka01上的haproxy又起来了,prority就恢复为100了,然后开启了抢占,就回来了👇

image-20241224151932936

但是killall -0 也有局限性

他只能查看进程在不在,但是进程STOP在那,还是正常运行在那,killall -0是看不出来的👇

killall -19 STOP,killall -19 恢复CONT

image-20241224153358580

此时VIP应该切走,但是由于判断逻辑导致未切,所以业务挂了

image-20241224154930813

就应该用curl去检查-然后这也是一个错误的思路

[root@ka01 ~]# cat /apps/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id 192.168.126.136
   vrrp_mcast_group4 230.0.0.0
}

vrrp_script check_down {
        script "[ ! -f /etc/keepalived/down ]"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}

vrrp_script check_haproxy {
        # 这个检查貌似OK,其实会抖动的业务,因为当curl的结果不是0的时候就是false于是执行下面的优先级权重降低,让出主,VIP飘走,于是curl 就变成了true,于是VIP就抢回来了,所以这是一个错误的案例!!
        script "/usr/bin/curl -Im1 192.168.126.100"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}


include /apps/keepalived/etc/keepalived/conf.d/*.conf
[root@ka01 ~]#

image-20241224164852718

最终针对haproxy还是改成

image-20241224164647020

找到进程,且不是STOP的就行啊

[root@ka01 ~]# cat /apps/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id 192.168.126.136
   vrrp_mcast_group4 230.0.0.0
}

vrrp_script check_down {
        script "[ ! -f /etc/keepalived/down ]"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}

vrrp_script check_haproxy {
        script "ps auxf |grep -v color |grep haproxy  |grep -Ev 'T(s|l)'"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}


include /apps/keepalived/etc/keepalived/conf.d/*.conf


[root@ka01 ~]# cat /apps/keepalived/etc/keepalived/conf.d/www.ming.org.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 3
    #nopreempt
    preempt_delay 10
    authentication {
        auth_type PASS
        auth_pass cisco
    }
    virtual_ipaddress {
        192.168.126.100/24 dev eth0 label eth0:1
    }
    unicast_src_ip 10.1.1.1
    unicast_peer{
        10.1.1.2
    }

    #notify_master "/apps/keepalived/bin/notify.sh master"
    #notify_backup "/apps/keepalived/bin/notify.sh backup"
    #notify_fault "/apps/keepalived/bin/notify.sh fault"

    # Allow packets addressed to the VIPs above to be received
    accept

    track_script {
        check_down
        check_haproxy
    }
}
[root@ka01 ~]#

此时用户访问都OK了

image-20241224164930164

image-20241224165441153

最终改成这样了👇

image-20241224171708413

image-20241224171717052

如果不用脚本,会这样👇报错

image-20241224171807634

上图👆不写成脚本,执行有问题,明明单敲就是echo $?是0来着,走vrrp里就是1👇,所以还是写成脚本。

image-20241224171846116

nginx的ha也用keepalived来做

观察nginx的配置文件,当然不观察一般也是如此

image-20241224173032817

站点反代也好,页面实实在在的内容也罢,都推荐一个站点写到一个子配置文件(注意以.conf结尾)里,然后,优先抢在下面的server 80上。

[root@ka01 conf.d]# cat /apps/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id 192.168.126.136
   vrrp_mcast_group4 230.0.0.0
}


vrrp_script check_haproxy {
        script "/etc/keepalived/check_nginx.sh"
        interval 1
        weight -30
        fall 3
        rise 2
        timeout 2
}


include /apps/keepalived/etc/keepalived/conf.d/*.conf
[root@ka01 conf.d]#


[root@ka01 conf.d]# cat /apps/keepalived/etc/keepalived/conf.d/www.ming.org.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 3
    #nopreempt
    preempt_delay 10
    authentication {
        auth_type PASS
        auth_pass cisco
    }
    virtual_ipaddress {
        192.168.126.100/24 dev eth0 label eth0:1
    }
    unicast_src_ip 10.1.1.1
    unicast_peer{
        10.1.1.2
    }

    #notify_master "/apps/keepalived/bin/notify.sh master"
    #notify_backup "/apps/keepalived/bin/notify.sh backup"
    #notify_fault "/apps/keepalived/bin/notify.sh fault"

    # Allow packets addressed to the VIPs above to be received
    accept

    track_script {
        check_down
        check_haproxy
    }
}


[root@ka01 conf.d]# cat /etc/keepalived/check_nginx.sh
#!/bin/bash
/usr/bin/ps aux |grep -v color |grep nginx  |grep -Ev 'T(s|l)'
[root@ka01 conf.d]#

这个nginx检测脚本只需要配置在master上就行了,因为主上发现nginx挂了,然后自降优先级,让出VIP,slave升级成主,告警发出,然后人工或者自动介入修复故障,delay xxx 抢回主。从没必要配置nginx检测脚本,从的nginx检测如果有意义就是从节点此时升级为主节点了,发现nginx挂了,自降prority让出VIP给别的节点,这个别的节点可能是第三个,或者是之前的主,而一般我们就2个节点,所以还是让给之前的主,加入之前的主恢复了自然有抢占和prority优先级恢复结合后会抢回来,不用从去让的动作的。所以从无需配置nginx检测脚本。

mysql的HA-应该不是和keepalived来干吧

有状态的HA可能还是要找应用自身的HA解决方案,而不是用keepalived来弄,keepalived比较适合无状态的服务的HA来着。

image-20241225095549942

image-20241225095622490

image-20241225095839784

image-20241225095901920

zabbix-server的ha

和msyql一样,zabbix 6.0开始本身自带HA了,还是不要用keepalive来实现zabbix的HA了。

image-20241225100020736

keepalived的脚本执行行为

有一个默认账号,一般是不存在反而比较好的,编译安装好像就是没有该默认用户的,yum搞不好有,就要注意权限可能不够的问题。 不存在最好,就以root运行了,权限没问题~

image-20241225100344574

image-20241225100410045

总结

image-20241225105146337

Copyright 🌹 © oneyearice@126.com 2022 all right reserved,powered by Gitbook文档更新时间: 2024-12-25 10:53:52

results matching ""

    No results matching ""