Thursday, February 18, 2016

VSX deployment on High-End chassis. Part 3. Changing chassis ID

This is the third and final part of the series started in the previous two posts.

When deploying 61/41K chassis as a cluster, you get a pair of devices marked as Chassis 1 and Chassis 2. The numbers are not just arbitrary labels. They are used to generate and sustain a quite complex system of internal addressing and communications: provisioning, sync, clustering, Security Group management, and so on.

There are multiple SKs describing different architectural aspects of the system and its internal addressing.

By default, chassis 1 is active, and chassis 2 is standby. Although you can change that, it is customary to keep chassis #1 in the main Data Center and put #2 in the secondary one. But what if the logistics people made a mistake, and the chassis were swapped, leaving you with chassis 1 in the secondary Data Center?

The solution is standard: you just need to change the chassis ID numbering. The appropriate procedure is described in the Admin Guide (page 181 in the R76.30SP version of the document). If you catch the error soon enough, and if there is no security setup on the chassis yet, the procedure works like a charm.

However, if you already have VSX deployed and provisioned, the story is not so simple anymore. The Admin Guide omits something quite important: the unique role of the SMO in a Security Group.

I described this role in my first post on the matter. Combined with internal addressing being a function of the chassis IDs, the process becomes a bit peculiar.

A chassis ID change requires "hacking" into the CMM config files and changing the ID parameter on both ends. The chassis ID is used to form the internal addressing: each element of the chassis has a unique IP address based on its logical position and the chassis ID. So you have to disconnect the chassis from each other to stop intercommunication for a while, change the ID numbers, reboot each and every blade in the system, and reconnect the environment.

The catch is in the Security Group addressing. As mentioned, the very first SGM added to the group becomes the SMO. Other SGMs use the SMO's internal IP address to pull the config from it.

The main issue with changing the chassis ID while having a Security Group with non-default settings is that the SMO swaps places. Indeed, when addressed by IP, the SMO moves logically from one physical chassis to the other after the ID change. In the particular scenario where you start the new chassis 1 first, its first SGM tries to pull the SMO config after the first boot and fails. The failure occurs because that SGM tries to reach the SMO at an IP address that, after the change, belongs to itself.

As a result, all other SGMs also fail to pull the Security Group config, leading to a total collapse of the chassis clustering. If you already have some traffic running through your system, that is unacceptable.
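To make the address swap easier to picture, here is a toy model in Python. It is only a sketch: the `internal_ip` mapping, the `10.0.x.y` addresses and the blade names are invented for illustration and are not the real 61/41K addressing scheme. The only property taken from the text above is that an internal address is a function of the chassis ID and the logical position.

```python
from dataclasses import dataclass


def internal_ip(chassis_id: int, slot: int) -> str:
    """Hypothetical mapping: the address depends only on chassis ID and slot."""
    return f"10.0.{chassis_id}.{slot}"


@dataclass
class Sgm:
    name: str          # physical blade label (invented for this sketch)
    chassis_id: int
    slot: int

    @property
    def ip(self) -> str:
        return internal_ip(self.chassis_id, self.slot)


def swap_chassis_ids(sgms):
    """Flip chassis ID 1 <-> 2 on every blade, as the ID change does."""
    for sgm in sgms:
        sgm.chassis_id = 2 if sgm.chassis_id == 1 else 1


def owner_of(ip, sgms):
    """Name of the blade that currently answers on the given address."""
    return next(s.name for s in sgms if s.ip == ip)


sgms = [
    Sgm("A-slot1", 1, 1), Sgm("A-slot2", 1, 2),
    Sgm("B-slot1", 2, 1), Sgm("B-slot2", 2, 2),
]

# The first SGM added to the group is the SMO; peers remember its address.
smo_ip = sgms[0].ip

before = owner_of(smo_ip, sgms)   # the original SMO blade answers
swap_chassis_ids(sgms)
after = owner_of(smo_ip, sgms)    # a blade on the other physical chassis answers

print(before, "->", after)
```

In this model the stored SMO address ends up on the first blade of the other physical chassis, so when that blade tries to pull the config from "the SMO" it is in effect talking to itself.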

To avoid this situation, there are a couple of things to do:


  1. Mind the chassis IDs from the beginning of the deployment. It is quite easy to change IDs if nothing is configured yet.
  2. If you have a Security Group and/or VSX setup on the system, plan for some downtime. The solution is to dissolve the Security Group and reconfigure it after the ID change.

Good luck with your 61/41K system. The solution is actually quite nice, although complex.


-------
To support this blog and Check Point Video Nuggets project send your donations to https://www.paypal.me/cpvideonuggets

Saturday, February 13, 2016

VSX deployment on High-End chassis. Part 2. Control connections and VSX provisioning

In my previous post I explained why one needs just a single SGM in the Security Group while defining the VSX object.

The second pitfall is about control connections during provisioning. When converting a GW to VSX, the management server pushes an automatically compiled policy to the GW before and after the conversion. Users have some limited options to add to that policy, mostly about HTTPS, SSH and SNMP connectivity to the gateway. Control connections are not explicitly mentioned.

It is assumed that control connections are allowed by the Global Policy settings in the implied rules. In the field this assumption does not really hold. If a customer has disabled control connections, the auto-generated VSX policy will cut off provisioning after the first push.

It would be very unwise to try unloading the policy on the gateway. In that case the gateway will be converted to VSX and rebooted, and then the same auto-compiled policy will be pushed again, cutting the VSX GW off from the MGMT server a second time, now for good.

In either case you will be stuck in the middle of provisioning, with the VSX object created on the MGMT side and the SGM-side provisioning either not started or only partially completed.

If this were an R77 VSX environment, you would be able to run the reset_gw command, described in SK101690. Unfortunately, 61/41K VSX deployments use R76.x0 SP versions, where this command is not available.

In this particular situation you will have to re-image your SMO SGM. If so, do not forget to reapply the Jumbo hotfix package after installing the main version.

So, the bottom line here is: before starting VSX provisioning in general, and especially when dealing with 61/41K chassis, make sure you re-enable control connections, even if only temporarily, before creating the VSX object.
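The failure sequence described in this post can be sketched as a tiny simulation. This is only an illustration of the logic, not Check Point code: the step names and the `run_provisioning` function are made up for this sketch, and the model assumes the only thing that matters is whether the implied rules allow control connections when the auto-generated policy lands.

```python
# Provisioning steps as described in the post; the labels are illustrative.
STEPS = [
    "push pre-conversion policy",
    "convert to VSX and reboot",
    "push post-conversion policy",
    "finish provisioning",
]


def run_provisioning(control_connections_allowed: bool) -> list:
    """Return the steps that complete before management loses the gateway."""
    done = []
    for step in STEPS:
        done.append(step)
        # The auto-generated policy does not mention control connections.
        # If the implied rules do not allow them either, the push itself
        # succeeds but management cannot reach the gateway afterwards.
        if step.startswith("push") and not control_connections_allowed:
            break
    return done


print(run_provisioning(True))    # all four steps complete
print(run_provisioning(False))   # stops right after the first push
```

With control connections disabled, the run stops after the very first policy push, which is exactly the half-provisioned state that leaves you re-imaging the SMO.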

I can only guess why Check Point assumes that control connections are always enabled by default, especially in the case of complex security systems, where that is mostly not so. I hope that in future releases Check Point will take this issue into consideration and will at least add a warning to the VSX wizard or, better, allow administrators to modify the default VSX object policy to some extent.

Some additional info about 41/61K deployment to follow.



Friday, February 12, 2016

VSX deployment on High-End chassis. Part 1. SMO and VSX provisioning

After being exposed to a couple of VSX deployments with 41000 chassis, I have to share with you some important points.

Deployment of 61000 or 41000 based firewalls is quite different from regular Gaia appliances. The CPU blades, called SGMs (Security Gateway Modules), act as a single gateway. They load-share the traffic and have a single GW configuration, including topology, IP addresses and even SIC. To achieve that, you need to define a so-called Security Group and populate it with SGM blades. The first SGM added to the group becomes the SMO - Single Management Object. It performs SIC communications with Management and later maintains control connections on behalf of the Security Group. If it fails, another SGM takes over the function of SMO, keeping the logical GW functionality intact. It will take me just a moment to explain why mentioning the SMO is so important here.

If you are deploying 61/41K as a regular GW without virtualization, there are virtually no pitfalls. That is not exactly the case with VSX.

The main VSX object provisioning can only be done properly if you have just a single SGM in the Security Group. Although this requirement is mentioned in the Administration Guide, you can easily miss it. It is also not clear at first sight why this is so important.

If you have ever deployed VSX on a regular appliance or an open server with R75 and up, you know the process is quite complex. The MGMT server pushes provisioning scripts to the GW just after establishing SIC, forcing the GW machine to reboot and come up as a VSX GW.

The situation with 61/41K is no different, except that on those chassis the gateway is a group of SGMs. Each SGM is in fact a Gaia machine.

So imagine we have a couple of SGMs in the Security Group before starting VSX provisioning. Only the SMO talks to your management server and then reboots after establishing SIC. The second SGM blade will not do so; instead it will assume the role of SMO, considering the first blade failed. The last known configuration it pulled from the original SMO does not have any mention of VSX. At this point the provisioning will fail.
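Here is the same scenario as a toy Python model. It is a sketch under simplifying assumptions: the `Sgm` class, the `provision_vsx` function and the config keys are invented for illustration, and failover is reduced to "the first blade that is still up becomes the SMO".

```python
from dataclasses import dataclass, field


@dataclass
class Sgm:
    name: str
    config: dict = field(default_factory=dict)
    up: bool = True


def provision_vsx(group):
    """Walk through VSX provisioning and report who ends up acting as SMO.

    Returns (name of acting SMO, whether its config knows about VSX).
    """
    smo = group[0]  # the first SGM added to the group is the SMO

    # Peers hold the last config they pulled from the SMO (no VSX yet).
    for peer in group[1:]:
        peer.config = dict(smo.config)

    # Management talks only to the SMO, which gets provisioned and reboots.
    smo.config["vsx"] = True
    smo.up = False

    # Any blade still up treats the SMO as failed and takes over its role.
    acting = next((s for s in group if s.up), smo)
    return acting.name, acting.config.get("vsx", False)


single = provision_vsx([Sgm("sgm1", {"sic": True})])
pair = provision_vsx([Sgm("sgm1", {"sic": True}), Sgm("sgm2")])

print(single)  # the only blade keeps the SMO role, with VSX in its config
print(pair)    # the second blade took over with a config that lacks VSX
```

With a single SGM there is nobody to take over during the reboot, so the VSX configuration survives. With two, the acting SMO is the blade that never heard about VSX.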

There are also some other potential issues with VSX provisioning. I will address them in a separate post.



Monday, February 1, 2016

Next phase of Check Point Video Nuggets series needs your support

Hi all!

You may have already seen the first videos in the Check Point Video Nuggets series. To date these short videos have been viewed roughly 2200 times.

I received lots and lots of your emails with praise, criticism, suggestions and questions. Thank you all very much for your support.

In some of your emails you have asked about the promised Troubleshooting series. I am still planning to do those, but it is clear I cannot produce them at the same pace as before. They require much more work and preparation.

For the previous nuggets it took me about three days to put together a 3-minute video. Troubleshooting will take even longer, as the material is more advanced and requires a lot of preparation.

I am also not exactly happy with the quality of the video materials I am able to produce today. I am learning along the way, but it is not only a lack of skills. It is also about tools.

I need better mikes and a sound processor, a decent 4K camera, good video editing software and lots of disk space for storing the materials. Some of that I have purchased already, but that is not enough.

My budget estimate for the tools is around $5000. That covers materials only; my efforts are still free of charge. I was trying to find an interested partner who would support the project financially, but it did not work out, at least not yet. I have even considered starting a crowdfunding project on one of the well-known sites that would allow funds to be released even if the goal was not reached.

Any kind of support, however small, will help the cause. If you like the series and want to see it continue, please consider donating via Paypal.me:

https://www.paypal.me/cpvideonuggets

UPDATE: some people tell me that paypal.me is not available in some countries. If this is the case for you and you still want to make a donation, please use regular paypal money transfer with the following email: cpvideonuggets(at)gmail.com

Thanks a lot,
yours truly...