Sunday, May 25, 2014

FW module is lost after reboot, analysis

Some of my colleagues have recently experienced a strange failure on Gaia-based Check Point appliances.

At a certain point, after a reboot, the FW module is no longer accessible and can only be controlled via a physical console connection. It comes up with some weird initial policy that does not allow HTTPS, SSL and/or SIC anymore. You can unload that policy with "fw unloadlocal", but only from the console. If you run the "fw stat" command, it reports a failure to connect to the FW.

I have analysed the issue and found that it happens because the /etc/hosts file is missing the host entry for the FW.

The scenario is now clear to me. It only happens when you remove or disable the interface that was used to define the MGMT IP address in the First Time Configuration Wizard. Gaia generates /etc/hosts automatically, and if the management interface is removed or changed, the hosts entry associated with the first NIC is removed as well. After the reboot the OS can no longer communicate with the FW, and connectivity to the module shuts down completely.

To fix this, after re-defining the management interface, go to the hosts configuration in the WebUI and make sure the new management IP address is properly defined there with the module hostname. The same can be done from CLISH. Do not try to edit the file from bash with vi; this will not work.
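If you want to verify the result before the next reboot, a check like the one below may help. It is only a minimal sketch in Python, not a Check Point tool: the function name has_host_entry, the hostname fw-module-1 and the address 192.0.2.10 are made up for illustration, and it assumes nothing beyond the standard hosts file layout (IP address first, then one or more names, with "#" starting a comment).

```python
def has_host_entry(hosts_text, hostname, ip):
    """Return True if hosts_text contains a line mapping ip to hostname.

    hosts_text is the content of a hosts file, for example the output
    of "cat /etc/hosts" copied from the module.
    """
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed line: an address without any name
        address, names = fields[0], fields[1:]
        # Host names are compared case-insensitively.
        if address == ip and hostname.lower() in (n.lower() for n in names):
            return True
    return False


# Hypothetical file contents: the first one lacks the module entry.
BROKEN_HOSTS = "127.0.0.1 localhost\n"
FIXED_HOSTS = "127.0.0.1 localhost\n192.0.2.10 fw-module-1  # mgmt\n"

if __name__ == "__main__":
    for label, text in (("broken", BROKEN_HOSTS), ("fixed", FIXED_HOSTS)):
        found = has_host_entry(text, "fw-module-1", "192.0.2.10")
        print(label, "->", "entry present" if found else "entry MISSING")
```

To check a real file, pass its content instead of the sample strings, for example has_host_entry(open("/etc/hosts").read(), "your-hostname", "your-mgmt-ip"). The check only reads the file; the entry itself should still be created through the WebUI or CLISH as described above.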

I did not manage to find any SecureKnowledge entry for this scenario.

Thursday, May 15, 2014

Notes about sync redundancy

During the last Advanced Check Point Troubleshooting course I was asked about best practices for building sync redundancy with Gaia.

The question is not as simple as it sounds. The classic textbooks for ClusterXL recommend using two or more independent synchronisation interfaces marked as First and Second Sync. Although that was true for older versions, R7x changed the game.

sk92804 "Sync Redundancy in ClusterXL" clearly states that using multiple sync interfaces is obsolete. The new best practice is to build a bond interface and define it as sync.

Simple now, you say? Not really. Using bond interfaces with Check Point is tricky. There are at least three SecureKnowledge articles that you should keep in mind, mostly for CP appliances:

  • State of Sync interface configured on Bond interface is 'DOWN' for each Virtual System
    Solution ID: sk100450
  • SecurePlatform / Gaia OS crashes on 12000 / 21000 appliance during configuration of Bond interface
    Solution ID: sk69442
  • Incorrect count of Bond slaves in use after physical link down
    Solution ID: sk98160

Each one of them requires a fix. Only after applying all three support fixes should your sync be fine.