Wednesday, January 14, 2015

Deploying F5 BIG IP HA Active/Passive (Active/Standby) on AWS EC2 / VPC

BIG-IP is a big name in the world of Application Delivery Platforms. It is used primarily as a load balancer/interface for hosting a number of applications. It is modular in nature and offers a variety of modules such as optimized content delivery and application firewall. The full set of features is listed here.

A few years back F5 was available only as a hardware box, which one had to buy and wire to switches/machines. They have now come up with a cloud offering of the same, called BIG-IP VE (VE stands for Virtual Edition). One can now choose to either run their hardware or run the VE in the cloud.

We had to set up F5 VE for one of our customers on AWS. Coming from a non-networking/non-physical-server background, it was difficult for us to understand the F5 networking terminology and map it to AWS, where, as we all know, the network is completely abstracted.

F5 provides one document on how to host F5 on EC2, and it's pretty good. It's available here. The sad part is that it assumes one understands F5 completely, so it is best for people who have hands-on experience running F5 hardware boxes. I followed it and was able to set up the F5, but with some gotchas which I would like to share with you in this article. I am also going to brief you on the basics of F5 and how it works.

Some terms that one should know:

VLAN (Virtual LAN):

We all understand what a LAN is. A Virtual LAN is used to divide a LAN into further subsections. For example, in the case of a switch, all the ports on it constitute a single broadcast domain. So if one machine sends out a broadcast message, it is placed on all the ports of the switch. This leads to a lot of unnecessary traffic.

Since a switch is a layer 2 device and is not aware of the network layer, all the ports are part of the same network. Suppose we have a very big network with 1000 machines connected via a switch. What if I want to segregate this network further, for example into three groups: SALES, MARKETING and DEVELOPMENT? I want to avoid cross-group traffic, which is unavoidable on a plain switch because it is not aware of the logical subnets (I could create one subnet per group, which is possible but not recommended). So if a machine in SALES is looking for another machine within that group, it sends out an ARP request which is received by all the machines on the switch and not just those in the SALES subnet. This causes a lot of unnecessary traffic.

To avoid this, some switches come with a facility to create virtual LANs. It allows us to group ports (physical switch ports) together into a virtual network. So now we can say that ports 1, 2, 3 belong to VLAN A and ports 4, 5, 6 belong to VLAN B. There is no longer a single broadcast domain: if an ARP request is sent by a machine in VLAN A, it stays within that VLAN (within its ports, to be precise). Now we can have a different subnet for each VLAN, and these subnets can only talk to each other through a router. This is usually achieved with tags: frames are marked with a VLAN ID (IEEE 802.1Q), and each port is configured as tagged or untagged for a VLAN.

This way we can reduce a lot of unnecessary traffic by limiting our broadcast domain to a smaller section.

AWS does not support VLANs. So for us a VPC subnet is as good as a VLAN and can be used as one, although nothing stops us from creating a pseudo VLAN which is smaller than a subnet.
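To make the broadcast domain idea concrete, here is a small Python sketch. It is only a toy model of the six-port example above (the port-to-VLAN mapping and the function names are made up for illustration), not how a real switch is implemented.

```python
# Hypothetical port-to-VLAN assignment on one switch, mirroring the example above.
PORT_VLAN = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}


def broadcast_targets(source_port, port_vlan):
    """Ports that receive a broadcast (e.g. an ARP request) sent from source_port."""
    vlan = port_vlan[source_port]
    return sorted(p for p, v in port_vlan.items() if v == vlan and p != source_port)


def flat_switch_targets(source_port, port_vlan):
    """Without VLANs every other port is in the same broadcast domain."""
    return sorted(p for p in port_vlan if p != source_port)


print(broadcast_targets(1, PORT_VLAN))    # [2, 3]
print(flat_switch_targets(1, PORT_VLAN))  # [2, 3, 4, 5, 6]
```

With VLANs the ARP request from port 1 reaches only two ports instead of five.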

Virtual Server:

A Virtual Server in F5 is equivalent to an ELB. With an ELB we get a domain name and not an IP, but with F5 we get an IP. A single F5 box can run multiple such load-balanced endpoints, so one box can be used for all the reverse proxy requirements in a VPC. As the name implies, it is a logical server and not an actual one, identified by an IP (EIP or private IP). Every Virtual Server has a pool of servers which it load balances. This is similar to the instances behind an ELB. Since multiple private IPs can be attached to a single ENI, the number of Virtual Servers (VS) that we can run on an F5 is limited by the number of ENIs an instance can have and the number of private IPs each ENI supports, both of which depend on the instance type.

Self IP:

An F5 box can be part of multiple VLANs. Think of the Self IP as the IP the F5 box uses to identify itself on a VLAN. This is needed because a single ENI can have multiple private IPs attached to it, some of which may be used by Virtual Servers or other objects. This IP is static in nature and does not migrate in case of failover.

Floating IP:

For an HA setup the box's presence on each VLAN has to migrate from one box to the other as well. This is achieved by assigning a floating IP to each VLAN. This IP migrates from one F5 box to the other in case of failover. The movement happens by reassigning this private IP from box A to box B through AWS API calls.
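For a feel of what "reassigning a private IP through AWS API calls" looks like, here is a rough Python sketch. F5 does not say here which API call BIG-IP makes internally, so treat this as an assumption: it uses the EC2 AssignPrivateIpAddresses call with reassignment allowed, which is the call one would use to do the same thing by hand. The IP and ENI ID are hypothetical, and RecordingClient is only a stand-in so the sketch runs without AWS access.

```python
def move_floating_ip(ec2_client, floating_ip, target_eni_id):
    """Assign floating_ip as a secondary private IP on target_eni_id,
    taking it away from whichever ENI currently holds it."""
    return ec2_client.assign_private_ip_addresses(
        NetworkInterfaceId=target_eni_id,
        PrivateIpAddresses=[floating_ip],
        AllowReassignment=True,  # allow the IP to be taken from its current ENI
    )


class RecordingClient:
    """Stand-in for boto3.client("ec2"); it only records the request."""

    def __init__(self):
        self.calls = []

    def assign_private_ip_addresses(self, **kwargs):
        self.calls.append(kwargs)
        return {}


client = RecordingClient()
# Hypothetical floating IP and the ENI ID of box B's external interface.
move_floating_ip(client, "10.0.1.10", "eni-0bbbbbbbbbbbbbbbb")
print(client.calls[0]["NetworkInterfaceId"])
```

Against a real account you would pass boto3.client("ec2") instead of the stand-in. Both ENIs have to be in the same subnet, because a private IP belongs to exactly one subnet.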

Traffic Group: 

In an HA setup, the entity that moves from one box to the other is the Traffic Group. All the floating IPs and VS IPs are part of it. We can also force the movement of the traffic group manually through the console.
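One way to picture a traffic group is as a bag of addresses with a single owner. The Python sketch below is a toy model under that assumption; the class, the device names and the IPs are made up for illustration and have nothing to do with F5's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficGroup:
    """Toy model: a named set of floating addresses owned by exactly one device."""
    name: str
    owner: str
    addresses: set = field(default_factory=set)

    def fail_over(self, new_owner):
        """Move the whole group; every address in it follows the new owner."""
        self.owner = new_owner


def active_addresses(device, groups):
    """Addresses that a device currently answers for."""
    result = set()
    for group in groups:
        if group.owner == device:
            result |= group.addresses
    return result


# Hypothetical floating self IPs and one virtual server IP.
tg = TrafficGroup("traffic-group-1", owner="box-a",
                  addresses={"10.0.1.10", "10.0.2.10", "10.0.1.100"})
tg.fail_over("box-b")
print(sorted(active_addresses("box-b", [tg])))
```

The point of the model is that failover is all or nothing: the addresses never move individually, only as part of their group.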

Now let's get to the actual setup of an HA cluster:

1: Prerequisites:

  1. An AWS account with a VPC with at least three subnets. For this setup let's create a VPC with CIDR 10.0.0.0/16 and three subnets: 10.0.0.0/24 (management), 10.0.1.0/24 (external) and 10.0.2.0/24 (internal).
  2. Two Security Groups as mentioned here
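Before creating anything in AWS, the subnet layout from the prerequisites can be sanity checked with Python's standard ipaddress module. This is just a sketch; check_plan is a made-up helper name.

```python
import ipaddress

VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "management": ipaddress.ip_network("10.0.0.0/24"),
    "external": ipaddress.ip_network("10.0.1.0/24"),
    "internal": ipaddress.ip_network("10.0.2.0/24"),
}


def check_plan(vpc, subnets):
    """Return a list of problems: subnets outside the VPC or overlapping each other."""
    problems = []
    items = list(subnets.items())
    for name, net in items:
        if not net.subnet_of(vpc):
            problems.append(f"{name} {net} is outside {vpc}")
    for i, (name_a, net_a) in enumerate(items):
        for name_b, net_b in items[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{name_a} {net_a} overlaps {name_b} {net_b}")
    return problems


print(check_plan(VPC, SUBNETS))  # [] means the plan is consistent
```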
2: Launch Box A:
  1. Go here. Select the one which suits you.
  2. For subnet, select the management subnet and assign a private IP (example 10.0.0.2). Add two more network interfaces, one each from the external and internal subnets, and assign one private IP to each (example 10.0.1.2 and 10.0.2.2).
  3. For security group select allow-all-traffic.
  4. Once the machine is launched, assign an EIP to the management ENI. This is done so that the management port is accessible over the internet for configuration.
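If you prefer scripting the launch over clicking through the console, the three interfaces map to the NetworkInterfaces parameter of the EC2 RunInstances call. The Python sketch below only builds that list; the subnet and security group IDs are hypothetical placeholders, and the device order (management first, then external, then internal) is an assumption based on the interface numbering used in the VLAN setup step.

```python
def build_network_interfaces(subnet_ids, private_ips, security_group_id):
    """Build the NetworkInterfaces list for an EC2 RunInstances request.

    Device index 0 is the management interface; indexes 1 and 2 are assumed
    to show up on the BIG-IP as interfaces 1.1 (external) and 1.2 (internal).
    """
    order = ["management", "external", "internal"]
    return [
        {
            "DeviceIndex": index,
            "SubnetId": subnet_ids[name],
            "PrivateIpAddress": private_ips[name],
            "Groups": [security_group_id],
        }
        for index, name in enumerate(order)
    ]


# Hypothetical IDs; the private IPs are the example values for Box A.
spec = build_network_interfaces(
    subnet_ids={"management": "subnet-aaaa", "external": "subnet-bbbb", "internal": "subnet-cccc"},
    private_ips={"management": "10.0.0.2", "external": "10.0.1.2", "internal": "10.0.2.2"},
    security_group_id="sg-allowall",
)
for interface in spec:
    print(interface["DeviceIndex"], interface["PrivateIpAddress"])
```

With boto3 this list would be passed as NetworkInterfaces= to run_instances, together with the AMI ID, instance type and key pair name, none of which are shown here.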
3: Setting up the admin password:
  1. Log in to the instance that you just launched, using the key pair (.pem file) and the Elastic IP address of your EC2 instance: $ ssh -i <username>-aws-keypair.pem root@<elastic IP address of EC2 instance>
  2. At the command prompt, type tmsh modify auth password admin.
  3. To ensure that the system retains the password change, type tmsh save sys config, and then press Enter.
4: VLAN setup:
  1. Log in at https://<EIP>. Enter the admin username/password that we created in the last step.
  2. A setup wizard will come up. Complete the first 2-3 steps (license activation), then quit the wizard. Don't finish the rest of the steps, as we will be doing those manually.
  3. Go to Network > VLAN > VLAN List. Click Create.
  4. Enter name internal.
  5. Select 1.2 for interface, Tagging Untagged. Click the Add button.
  6. Click Finished.
  7. Repeat the same steps as above to create another VLAN named external. For interface select 1.1.
5: Self IP setup:
  1. Go to Network > Self IPs. Click Create.
  2. Set Name to self_ip_external, IP Address to 10.0.1.2, Netmask to 255.255.255.0, VLAN to external and Port Lockdown to Allow All. Select the default traffic group.
  3. Do the same for the internal VLAN (IP address 10.0.2.2).
  4. Click Finished.
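A mistyped self IP or netmask is easy to miss in the GUI. As a quick check, Python's ipaddress module can combine the IP address and netmask from the step above and confirm that the result sits in the expected subnet. The helper names are made up for illustration.

```python
import ipaddress


def self_ip_interface(address, netmask):
    """Combine an IP address and a dotted netmask into an interface object."""
    return ipaddress.ip_interface(f"{address}/{netmask}")


def in_subnet(interface, subnet_cidr):
    """True if the interface's network is exactly the given subnet."""
    return interface.network == ipaddress.ip_network(subnet_cidr)


external = self_ip_interface("10.0.1.2", "255.255.255.0")
print(external.with_prefixlen)               # 10.0.1.2/24
print(in_subnet(external, "10.0.1.0/24"))    # True
print(in_subnet(external, "10.0.2.0/24"))    # False
```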
6: Set up AWS Credentials: Enter AWS credentials under System > Configuration > AWS.

7: Getting ready for HA setup:
  1. Go to Device Management > Devices > Device Connectivity > Config Sync. Select the external VLAN IP.
  2. Go to Device Management > Devices > Device Connectivity > Failover Network. Click Add under Failover Unicast Configuration. Use the management IP (10.0.0.2) here.
8: Set up Box B: Follow all the above steps to set up the other box. Needless to say, the IPs will be different for this box :)

9: HA cluster setup:
  1. On Box A go to Device Management > Device Trust > Peer List. Click Add. Use the management IP of Box B and the admin username/password. Follow the rest of the steps.
  2. Now both the boxes are paired.
  3. Go to Device Management > Device Groups. Click Create.
  4. Enter any name to identify the device group which will participate in the failover cluster.
  5. Group Type is Sync-Failover.
  6. Drag both IPs from right to left.
  7. Select Full Sync and Network Failover.
  8. You may have to sync the config once to Box B. Go to Device Management > Overview and sync Box A to the group once.
  9. Your HA cluster setup is done. One box will show ACTIVE and the other one STANDBY.
10: Creating Floating IPs:
  1. This has to be done ONLY on Box A.
  2. Through the AWS console, add one more secondary private IP to each of the 10.0.1.0/24 and 10.0.2.0/24 subnet ENIs of one of the boxes.
  3. Go to Network > Self IPs. Click Create
  4. Enter the name as self_ip_floating_internal for internal VLAN. Select the same values as before (with new IP that we created above). Select traffic-group-1 (floating) for Traffic Group.
  5. Similarly do the same for external VLAN.
Now we have the HA setup ready. To test the movement of the VLAN floating IPs, force a failover and watch the AWS console. The floating private IPs move from one box to the other.
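Instead of refreshing the AWS console, the same check can be scripted. The Python sketch below asks EC2 which ENI currently holds a given private IP, using the DescribeNetworkInterfaces call with the addresses.private-ip-address filter. The floating IP, ENI IDs and instance IDs are hypothetical, and StubClient is a stand-in so the sketch runs without AWS access.

```python
def find_ip_owner(ec2_client, private_ip):
    """Return (eni_id, instance_id) for the ENI holding private_ip, or None."""
    response = ec2_client.describe_network_interfaces(
        Filters=[{"Name": "addresses.private-ip-address", "Values": [private_ip]}]
    )
    for eni in response["NetworkInterfaces"]:
        attachment = eni.get("Attachment", {})
        return eni["NetworkInterfaceId"], attachment.get("InstanceId")
    return None


class StubClient:
    """Stand-in for boto3.client("ec2") that serves canned ENI data."""

    def __init__(self, enis):
        self.enis = enis

    def describe_network_interfaces(self, Filters):
        wanted = set(Filters[0]["Values"])
        matches = [
            eni for eni in self.enis
            if any(a["PrivateIpAddress"] in wanted for a in eni["PrivateIpAddresses"])
        ]
        return {"NetworkInterfaces": matches}


# Hypothetical state after a failover: the floating IP 10.0.1.10 sits on box B.
stub = StubClient([
    {"NetworkInterfaceId": "eni-box-a-ext", "Attachment": {"InstanceId": "i-box-a"},
     "PrivateIpAddresses": [{"PrivateIpAddress": "10.0.1.2"}]},
    {"NetworkInterfaceId": "eni-box-b-ext", "Attachment": {"InstanceId": "i-box-b"},
     "PrivateIpAddresses": [{"PrivateIpAddress": "10.0.1.3"}, {"PrivateIpAddress": "10.0.1.10"}]},
])
print(find_ip_owner(stub, "10.0.1.10"))
```

Run it before and after the forced failover with boto3.client("ec2") in place of the stub; the instance ID should change. If the account has several VPCs with the same address range, add a vpc-id filter as well.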

Any Virtual Server that we create will have its IP as part of this default floating traffic group. This group and its failover objects (like Virtual Servers and IPs) can be seen under Device Management > Traffic Groups > Failover Objects.


To learn more about creating a Virtual Server go here.
To learn how to integrate AutoScaling with F5 go here  






14 comments:

  1. Hi Akash.
    You tested the above implementation inside a single availability zone, correct?
    Do you have any idea how to do it if we have two availability zones and would like a dual-availability setup?
    Many thanks,
    Gianluigi Crippa

  2. Hi Gianluigi,

    We cannot do active-passive over two AZs AFAIK, for the very simple reason that a VPC subnet cannot span two AZs. Each VLAN's (internal and external) floating self IP, which is a secondary private IP, moves from one box to the other on failover. As one IP cannot exist in two subnets, there is no way we can float the IPs across two AZs.

    Please let me know if you have found a way to achieve this :)

  3. Hi Akash.
    Thanks for confirming that you implemented HA inside a single AZ.

    Let's say that you have a VPC with AZs 1a and 1b, and an F5 VE in AZ 1a.
    What do you think of having some VSs in 1a whose pool has pool members in a VLAN in 1b, routed through the VPC gateway?

    Regards.
    Gianluigi.

  4. Hi Gianluigi,

    You can have pools across two AZs; that is not a problem. The problem is with the floating self IPs of the F5 box VLANs. These floating IPs need to move from one F5 box to the other, else failover would not work. Even the Virtual Server floating IPs would not move across AZs. The backend (pools) can be multi-AZ without any problem, but not the F5 boxes.

    Please let me know if I misunderstood you.

  5. Hi Akash.
    Thanks for your answer. We will work with two standalone F5 boxes, one in each AZ, with an external GTM that will run an http/s monitor against the two FQDNs associated with the EIP of VsFQDN_box1a in AZ 1a and the EIP of VsFQDN_box1b in AZ 1b.
    Regards,
    Gianluigi.

  6. Hi Gianluigi,

    You can always use GTM with health checks to simulate the same. I hope you are running these two servers in ACTIVE-ACTIVE mode so that the config is always in sync. https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/tmos-implementations-11-2-0/3.html

    Also, you can run GTM (the module) on these boxes themselves and do not need to set up a separate box for it. The GTM modules of both boxes communicate with each other and share their load/usage status. You just need to replace the name servers of your domain with IPs of both the GTM servers (their GTM modules). GTM works as your DNS nameserver in this case.

  7. Hi Akash
    I need a clarification. You say:
    "You just need to replace the name servers of your domain with IPs of both the GTM servers (their GTM modules)"
    But inside AWS, the IPs of both GTMs are EIPs ... and what happens if AWS moves a GTM for any reason? The name servers would refer to IPs that no longer exist.
    AWS gives me objects that always stay the same in terms of id000x for an instance and FQDN ... but an EIP is an association and can change ... causing problems with name server resolution.
    For the above reason we are thinking of external GTM boxes, not inside AWS.
    What do you think?
    I'm not an expert with AWS ... so I would appreciate your help with this clarification.
    Regards,
    Gianluigi.

  8. Hi Crippa,

    EIPs are flexible, but AWS never changes/removes/withdraws EIPs without your consent. Once you have allocated an EIP, it stays in your account until you release it (which you would not). The same goes for the FQDN associated with an EIP: whichever machine currently has the EIP, the FQDN points to it. You control which machine in your cloud this EIP is associated with.

    An EIP is different from the public DNS name (and IP) which AWS assigns when you launch a machine with that setting selected. That is something which may/will change when you STOP/START the server, and you may lose the IP/DNS name. So for all critical services always make use of an EIP and NOT the AWS-assigned public DNS name.

    I hope I answered your question.

  9. Hi Guys,

    Great article and great discussion here ... I am just going through process of using F5s in AWS.

    Basically, inside AWS having 2 F5s in a single AZ is pointless, as an AZ is just a datacenter and if you lose it you are stuck ... How about adding a transparent ELB in front of two F5s and running them in ACTIVE-ACTIVE mode?

    1. The biggest problem with ELB is that it does not have a fixed public IP (EIP), although I read somewhere that AWS is coming up with it.

      Most people use a single instance of F5 for traffic routing as well as load balancing. The same instance may be used by the Couchbase cluster as well as the web tier.

      Anything which requires an IP for communication creates a problem with ELB. Also, you cannot make use of F5's GTM if you put a proxy in between. One of our customers uses GTM for region-level failover.

      In short, if ELB works for you with F5 then go ahead. In my case the load balancing and proxy requirements didn't allow me to use ELB anywhere in my setup :(

  10. How can we set up the F5 management console in AWS, given that the management console shouldn't have an internet connection while my F5 needs access to the Internet?

  11. I run a Windows box in my VPC which is open to the public, and from that I access the private IP of my F5 management console. Or, if you want, you can extend your VPC to integrate with your office network using AWS VPN and access that private IP from within your office network.

  12. Thanks, that helps. I was thinking along the same lines, just wanted to confirm whether these are the valid options or whether I am missing something.

    Appreciate your reply.
