Saturday, April 17, 2010

High Availability Load Balancer for Weblogic Cluster

Oracle WebLogic Application Server has a lot of features to make your web applications or web services scalable and highly available. To achieve scalability you can add managed servers to a WLS cluster, and high availability can be achieved with server and service migration. The only thing WebLogic does not provide is a single shared IP address for the outside HTTP world. You don't want a user or application to connect to one specific managed server of the cluster. For this Oracle made a plug-in module for Apache, which can listen on one address and forward each HTTP request to one of the managed servers in the cluster. This is fine, but Apache can then become your new single point of failure. So you need some software that monitors, for example, the Apache process; when it fails, this software has to move the IP address to the other server so that the Apache server there can do the work. Of course this is also possible in hardware, but in this blog I will show you how it can be done with open source Linux software. On Windows Server, Microsoft NLB can do this for you.

Required Software
I got this working with Oracle Enterprise Linux 5.5 (get it from edelivery), installed with all the OEL options (Development, Cluster, Web, etc.). When you do this you will have all the libraries and tools needed to compile the required software. It also provides the Apache web server.
HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing.
Keepalived is a Linux program that can monitor HAProxy and switch the shared IP address to the second server, so that the requests are handled there.
This is a picture of my Weblogic configuration.
Before you begin you need to install the WebLogic software (on both servers) and configure a WebLogic domain with an Admin server and a cluster with two managed servers. And of course the WebLogic node managers.

The first step is to configure Apache on both machines.
You should be root to do this.
Copy the WebLogic Apache module to the Apache modules folder.
cp /WLSHOME/wlserver_10.3/server/plugin/linux/i686/*.so /etc/httpd/modules

Change the httpd.conf
vi /etc/httpd/conf/httpd.conf
Add the module in the module section.
LoadModule weblogic_module modules/mod_wl_22.so

Provide the IP addresses and port numbers of the managed servers in the cluster.
<IfModule mod_weblogic.c>
  WebLogicCluster 10.10.10.150:7001,10.10.10.151:7001
</IfModule>


This forwards every request to WebLogic. To forward only part of the URL space, use a path such as /weblogic instead of /.

<Location / >
  SetHandler weblogic-handler
</Location>

Start apache on both servers.
cd /usr/sbin
./apachectl start


Test it on both servers: check that you can invoke a web service or open a web application on the Apache port number.
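
The check above can be scripted. This is only a minimal sketch: it assumes Apache listens on port 81 on 10.10.10.50 and 10.10.10.51 (the addresses used in the HAProxy config later in this post) and that requesting / is a meaningful test for your application. The function name check_apache is made up for this example.

```shell
#!/bin/bash
# Sketch: report the HTTP status each Apache front end returns.
# Hosts, port 81 and the path "/" are assumptions; adjust to your setup.

check_apache() {
    local failed=0 target code
    for target in "$@"; do
        # -s silent, -o discard the body, -w print only the status code
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://${target}/")
        case "$code" in
            2??|3??) echo "${target} OK (${code})" ;;
            *)       echo "${target} FAIL (${code})"; failed=1 ;;
        esac
    done
    # Non-zero when at least one server did not answer with 2xx/3xx
    return "$failed"
}

# Usage:
#   check_apache 10.10.10.50:81 10.10.10.51:81
```

If a server answers with an error page from Apache itself, the managed servers behind it may still be down, so it is worth pointing the check at a real application URL.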

The next step is to install and configure HAProxy
Download the latest HAProxy source from http://haproxy.1wt.eu/#down
You should be root to do this.

Unpack the source
gunzip  haproxy-1.4.4.tar.gz
tar xvf  haproxy-1.4.4.tar
cd haproxy-1.4.4


Compile HAProxy
make TARGET=linux26 CPU=generic


Copy the haproxy executable to both OEL servers ( /usr/sbin )
cp haproxy /usr/sbin/

Check haproxy by retrieving the version
./haproxy -v

Create the haproxy user and group with the standard OEL tools (on both servers).
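
One possible way to do this from the command line is sketched below. The choice of a non-login account without a home directory is my assumption, not something the setup requires, and the function name create_haproxy_account is made up for this example.

```shell
#!/bin/bash
# Sketch: create a dedicated, non-login haproxy group and user if missing.
# Run as root on both servers. /sbin/nologin and "no home directory" are
# assumptions; any account named haproxy satisfies the config below.

create_haproxy_account() {
    # getent exits non-zero when the group does not exist yet
    getent group haproxy >/dev/null || groupadd haproxy
    # id exits non-zero when the user does not exist yet
    id haproxy >/dev/null 2>&1 || useradd -g haproxy -s /sbin/nologin -M haproxy
}

# Usage (as root):
#   create_haproxy_account
```

The function can be run more than once; it skips whatever already exists.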

Make the haproxy config file. ( on both servers )
vi /etc/haproxy.cnf
##### begin #####
global
        log     127.0.0.1   local0
        log     127.0.0.1   local1 notice
        maxconn 4096
        user      haproxy
        group   haproxy

defaults
    log              global
    mode          http
    option         httplog
    option         dontlognull
    option         redispatch
    retries         3
    maxconn      2000
    contimeout   5000
    clitimeout     50000
    srvtimeout    50000

listen wlsproxy 10.10.10.40:80
       mode http
       balance roundrobin
       stats enable
       stats auth weblogic:weblogic1
       cookie  JSESSIONID prefix
       option  httpclose
       option  forwardfor
       server  wls1 10.10.10.50:81 cookie wls1 check
       server  wls2 10.10.10.51:81 cookie wls2 check
##### end #####

10.10.10.40:80 is the shared virtual IP (VIP) address, and 10.10.10.50:81 and 10.10.10.51:81 are the two Apache servers.
stats enable and stats auth weblogic:weblogic1 enable the HAProxy status application, with weblogic as username and weblogic1 as password.


Install and configure Keepalived
Download the latest source from http://www.keepalived.org/download.html
You should be root to do this.

Unzip the source
gunzip keepalived-1.1.19.tar.gz
tar xvf keepalived-1.1.19.tar
cd keepalived-1.1.19



Compile and install Keepalived
./configure
make
make install


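Because HAProxy on the backup server has to bind to the shared IP address while that address is not assigned to it, you need to add a kernel parameter that allows binding to non-local addresses ( on both servers )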
vi /etc/sysctl.conf
Add this line to sysctl.conf
net.ipv4.ip_nonlocal_bind=1


Reload the kernel parameters
sysctl -p

Copy these files to both OEL servers
cp /usr/local/sbin/keepalived /usr/sbin
cp /usr/local/etc/rc.d/init.d/keepalived /etc/init.d/
cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/


Add the keepalived configuration files and do this on both servers
mkdir /etc/keepalived
vi /etc/keepalived/keepalived.conf

##### begin ######
vrrp_script chk_haproxy {           # Requires keepalived-1.1.13
    script "killall -0 haproxy"     # cheaper than pidof
    interval 2                      # check every 2 seconds
    weight 2                        # add 2 points of prio if OK
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101                    # 101 on master, 100 on backup
    virtual_ipaddress {
        10.10.10.40
    }
    track_script {
        chk_haproxy
    }
}
##### end #####




You have to decide which server is your primary HTTP server; this one needs priority 101 and the other priority 100.
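
Why 101 and 100 work together with weight 2 can be shown with a little arithmetic. The sketch below only mimics the rule from the config comment ("add 2 points of prio if OK"); it is not keepalived code, and the function name effective_priority is made up for this example.

```shell
#!/bin/bash
# Sketch of the priority a node advertises when a tracked script has a
# positive weight: the base priority, plus the weight while the check succeeds.

effective_priority() {
    local base=$1 weight=$2 check_ok=$3   # check_ok: 1 if "killall -0 haproxy" succeeds
    if [ "$check_ok" -eq 1 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

# Usage:
#   effective_priority 101 2 1    # primary, haproxy running -> 103
#   effective_priority 100 2 1    # backup, haproxy running  -> 102
#   effective_priority 101 2 0    # primary, haproxy killed  -> 101
```

With haproxy running on both servers the primary wins with 103 against 102. When haproxy dies on the primary it drops to 101, which is below the backup's 102, so the shared IP address moves. The gap between the two base priorities has to stay smaller than the weight for this to work.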





Starting and testing load balancing and failover

Start haproxy on both servers
cd /usr/sbin
./haproxy -f /etc/haproxy.cnf
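
Started this way HAProxy stays in the foreground, because the config above has no daemon setting. A possible variation, sketched below, first checks the config file and then starts HAProxy in the background. The pid file location is an assumption and the function name start_haproxy is made up for this example; -c, -D, -f and -p are regular haproxy command line options.

```shell
#!/bin/bash
# Sketch: validate the configuration, then start HAProxy in the background.
# The default pid file path is an assumption; change it to suit your system.

start_haproxy() {
    local cfg=${1:-/etc/haproxy.cnf}
    local pidfile=${2:-/var/run/haproxy.pid}
    # -c only parses the file and exits non-zero on errors
    if ! haproxy -c -f "$cfg"; then
        echo "configuration check failed: $cfg" >&2
        return 1
    fi
    # -D detaches from the terminal, -p writes the process ids to a file
    haproxy -D -f "$cfg" -p "$pidfile"
}

# Usage (as root, on both servers):
#   start_haproxy /etc/haproxy.cnf
```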

Start keepalived on both servers
cd /etc/init.d
./keepalived start


Now you can check on both servers if the shared ip address is only mapped on the primary server.
ip addr sh eth0

On the primary server
[root@wls1 init.d]# ip addr sh eth0
2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:3f:68:d6 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.50/24 brd 10.10.10.255 scope global eth0
    inet 10.10.10.40/32 scope global eth0
    inet6 fe80::a00:27ff:fe3f:68d6/64 scope link
       valid_lft forever preferred_lft forever


On the slave
[root@wls2 init.d]# ip addr sh eth0
2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:d1:f2:c0 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.51/24 brd 10.10.10.255 scope global eth0
    inet6 fe80::a00:27ff:fed1:f2c0/64 scope link
       valid_lft forever preferred_lft forever

On the primary server you can kill the haproxy process, and this will fail over the shared IP address to the slave. When you start haproxy on the master again, the IP address moves back to the master.
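
To follow the failover without reading the whole ip addr listing, a small helper can be used. This is a minimal sketch; the function name has_vip is made up for this example, and it simply looks for the shared address in whatever ip addr output it gets on stdin.

```shell
#!/bin/bash
# Sketch: tell whether this node currently holds the shared IP address.
# Reads "ip addr" output on stdin, so it also works on saved output.

has_vip() {
    local vip=$1
    # Look for "inet <vip>/<prefix>"; -F makes grep treat the dots literally
    grep -qF "inet ${vip}/"
}

# Usage on each server:
#   ip addr sh eth0 | has_vip 10.10.10.40 && echo "VIP is here" || echo "VIP is elsewhere"
```

Running the usage line on both servers before and after killing haproxy on the primary should show the address changing sides.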

The last thing you can check is the haproxy status application. Go to http://10.10.10.40/haproxy?stats and log in as weblogic with password weblogic1.
Here is a picture of the status application.

34 comments:

  1. An interesting post Edwin. As you say, there are quite a few ways to do HA on the front-end. The traditional "all Oracle way" would be to have an active-passive OHS web-tier (aka cold failover cluster) and use Oracle Clusterware to manage the VIP and node apps (i.e. opmn) - in 10g iAS you had to roll your own scripts but in 11g it's much simpler.

  2. Hey Edwin,

    Thanks for this great article. I find this type of HA setup very useful not only for Weblogic, but for a lot of software running on Linux. Thanks.

  3. Hi Simon,

    It seems like a heavy solution but I like to see how clusterware & webcache works together.

    Please make a blog post, it is always useful to see what is possible.

    thanks

  4. Hi Edwin,

    I have configured Oracle HTTP Server. I'm not using a load balancer for now. Everything is fine, but I'm not able to create a connection from JDeveloper to deploy applications. I tried both port 7777 (HTTP server port) and 7101 (admin server port). Neither of them works. Any idea?

  5. Hi,

    You can't use the HTTP server for deploying; you need to connect to the admin server with t3://.

    Can you try the 7001 port number

    thanks

  6. HI Edwin,

    I'm creating an application server connection in JDeveloper. I guess it would use t3 internally to connect. My admin server is running on port 7101 only. I have configured virtual host names for the admin server and SOA servers according to the enterprise deployment guide. Now when I try to create an application server connection in JDeveloper, it fails. I even tried to use the hostname instead of the virtual host name, but I get the same error.

    before installing http server, i was able to create the connection.:(

  7. Hi,

    Can you connect to the admin server console on the vip address and not using the apache address.

    if so then this address you should use.

    else can you remove the vip on the admin server.

    thanks

  8. Hey Edwin,

    Issue was that when you have a http server, you have to put that proxy in jdeveloper.
    http://fusionstack.blogspot.com/2010/04/ofm-in-cluster-with-http-server.html

    Thanks for your help anyway.

  9. This post is very helpful as we are trying to configure WLS cluster to work with HAPROXY as loadbalancer.
    what is the use of the Apache server?

  10. Hi,

    You are right haproxy is enough, but in Apache you can load the mod wls plugins and with these plugins you can configure more specific wls cluster options.

    thanks Edwin

  11. Hi,
    Can you please let me know, how to load balance SOA servers and a SOA composite which has been deployed on 2 managed servers soa_server1 and soa_server2. I have clustered both of them and the SOA composite is working on both the servers now.

  12. Hi,

    For Web Services you need to have a hardware or software load balancer.

    like this
    http://biemond.blogspot.com/2010/04/high-availability-load-balancer-for.html

    and for WebLogic JMS ( when your composite listens on a queue ) then your client needs to use a t3 url with both servers.

    file adapters etc needs to be a singleton.

    hope this helps

    thanks

  13. Hi All,
    I have a requirement to implement a Software Load Balancer with Web logic Cluster Servlet ,
    requirement is to talk from one web logic Application server to many web logic servers
    (which is in cluster) not through standard web server request this request will be directly
    from app server to app server .


    pls help me what is the approach !!
    thx

  14. Hi,

    If I was you I would use a OSB with a proxy and a Business Service on the servlets.

    that would work

  15. When you set up a cluster with 2 nodes A and B,
    which address should the cluster have, that of node A or node B?

    Replies
    1. Hi,

      A cluster is not a physical device; it consists of node A and B. So in Java you can use the connect string t3://nodeA,nodeB:7001, which load balances over node A and B. In the end you get one connection to one server.

      for http you need to have a loadbalancer.

      thanks

  16. Hi Edwin,

    Using this approach, will the load balancer detect the weblogic server in ADMIN mode? If so, how to config this detection in a hardware lb?

    Thanks.

    Replies
    1. Hi,

      I think only the mod_wls plugin in apache can detect this. maybe you can do some http checks and program/configure the lb every few minutes.

      thanks

  17. Do you by any chance have any pointers about implementing fail over/back using weblogic standard edition? The customer has hardware high availabity and therefore I believe it's not necessary to install weblogic's enterprise edition. I just need the documentation to prove my statement.

    Regards

    Gerardo Brenes Trejos, CEO, GBSYS,S.A., San José, Costa Rica

    Replies
    1. Hi,

      you have the same situation with vmware and failover. This works and no need for a cluster but is it acceptable when you are a down for a few minutes.

      and when everything is stateless you can have a load balancer ( or with a sticky session) with 2 weblogic server which are not in a cluster.

      Thanks

  18. Hi Edwin
    We have a highly robust clustered environment using WLS 10.3.6 and RHCS. We normally deploy three managed servers for our applications, and in case of failures we have automatic whole server migration for recovery and failover. We constantly update our WLS version and apply the patches Oracle releases as bug fixes. One scenario occurred during our testing, and it would be great if you could give your input. We successfully failed over a managed server from its primary to its secondary node. We then failed it back to its primary. We then attempted to fail it over to its secondary again. This did not succeed, and the server was shut down by the primary node manager instead. This presents a new operational risk to the application in production: if we fail over and then back, we cannot fail over again. I just wanted to know whether this is expected behaviour or whether we should log an SR with Oracle.

    Replies
    1. Hi,

      was this a minor patch? else it can be a new bug and it should work. When you go from major patchset like from PS5 to PS6 then I can imagine that you should bring down all managed servers and then patch the middleware software on all servers.

      thanks

  19. Hi Edwin
    Thanks. Actually we haven't moved to production yet; we were testing in our test environment for the migration from 10.3.5 to 10.3.6. This is not exactly an issue with 10.3.6; it might be a generic problem with the failover mechanism for managed servers. Ideally we should be able to fail over to the secondary multiple times within a few minutes. I have logged an SR with Oracle; let's see if they say this is expected and one cannot fail over multiple times within a few minutes from the primary to the secondary node.

  20. Hi Edwin,

    I am trying to setup a WL 12.1.1 standard edition (3 servers) with a load balancer in front of it instead of generic WL cluster.

    Looking for a FOSS load balancer capable of tunneling t3 over HTTP , for stateless EJB 3.1 HA usage.

    can you recommend?

    Thanks,
    Ron.

    Replies
    1. Hi,

      Why don't you want to use native t3 loadbalancing, in the t3 url you can define all the cluster nodes, after this it is controlled by the replica-aware stubs

      also when I read the doc about WebLogic 11g -> do not route client requests, following the initial context request. Because WebLogic Server load balancing for EJBs and RMI objects is controlled via replica-aware stubs.
      So t3 over HTTP is only nice for the initial context and don't bring you extra HA.


      When tunneling T3 over HTTP (or HTTPS), the WebLogic Server runtime creates an HttpSession for each RMI session and passes the session ID back and forth between the client and the server using the normal HTTP mechanisms. This allows the web server plug-in or hardware load balancer to route all RMI requests from a particular client back to the same server in the cluster for the duration of that session.

      Note:

      External load balancers distribute initial context requests that come from Java clients over T3 and the default channel. However, do not route client requests, following the initial context request, through the load balancers. When using the T3 protocol with external load balancer, you must ensure that only the initial context request is routed through the load balancer and that subsequent requests are routed and controlled using WebLogic Server load balancing.


      Thanks

    2. The purpose of my drill is to use FOSS not commercial HW/Software.

      I will not be using Weblogic server load balancing or cluster (available only in the enterprise edition), just standalone managed servers unaware of each other.

      The load balancer should serve as proxy , and tunnel the EJB calls to the servers.

      I read about Apache HTTP server, and HAProxy. do you know if they are capable of such functionality?

      Thanks,
      Ron.

    3. Hi,

      I think it can work just like a normal cluster but just use sticky sessions ( http sessions sharing only works on a cluster )

      plus you can't use apache with the weblogic mod plugin. maybe you can just use HAproxy.

      Thanks

    4. Hi,

      why can't apache work with the WL mod plugin?

    5. Hi,

      I don't think WebLogicHost param will support with 2 ip's

      <IfModule mod_weblogic.c>
        WebLogicCluster 10.10.10.150:7001,10.10.10.151:7001
      </IfModule>

      or

      <IfModule mod_weblogic.c>
        WebLogicHost myweblogic.server.com
        WebLogicPort 7001
      </IfModule>

      Thanks

  21. Hi Biemond,
    Thanks for this nice post.It worked for me.Can you tell me how I can do the same with https?As of now http://vip:port is ok.But can you tell me how to do with https://vip:port?

    Thanks,
    Riyadh

    Replies
    1. Hi,

      Can't you just add a new port number on the same Virtual IP?

      Thanks

    2. Hi Edwin,

      Actually I was trying to say haproxy with ssl.Your configuration works for http not https.Can you give me any idea for this?

      Thanks,
      Riyadh

  22. Hi Edwin,
    We have Oracle SOA suite 11g cluster. and using Hardware load balancer. Do we still need HTTP server in this setup ? is there any best practice that HTTP server should always be used ? or is there any advantage or disadvantage ?

  23. Hi,

    Please guide how to configure HA for forms&report 11g.

    Thanks
