Sunday, May 25, 2014

FW module is lost after reboot, analysis

Some of my colleagues have experienced a strange failure on Gaia-based Check Point appliances lately.

At some point, after a reboot, the FW module becomes inaccessible and can only be controlled via a physical console connection. It comes up with some weird initial policy that no longer allows HTTPS, SSH and/or SIC. You can unload it with "fw unloadlocal", from the console only. If one runs the "fw stat" command, the message reports a failure to connect to the FW.

I have analysed the issue and found that it is caused by the /etc/hosts file missing the host entry for the FW.

The scenario is now clear to me. This only happens when you remove or disable the interface that was used to define the MGMT IP address during the First Time Configuration Wizard. Gaia generates /etc/hosts automatically, and if the management interface is removed or changed, the hosts entry associated with the first NIC is also removed. After reboot the OS cannot communicate with the FW anymore, and the module connectivity shuts down completely.

To fix this, after re-defining the management interface, go to the hosts configuration in the WebUI and make sure the new management IP address is properly defined there with the module hostname. The same can be done from CLISH. Do not try to edit the file from bash with vi; this will not work.
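Before rebooting, it may be worth confirming that the hostname really has an entry in the hosts file. Below is a minimal sketch of such a read-only check in Python. The function names are my own, and matching the name returned by the OS against /etc/hosts is only an assumption about what the FW startup code looks for, not a documented Check Point procedure.

```python
import socket


def addresses_for_host(hosts_text, hostname):
    """Return the IP addresses that hosts-format text maps to hostname."""
    wanted = hostname.lower()
    found = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        # Host names are case-insensitive, so compare in lower case.
        if wanted in (name.lower() for name in names):
            found.append(address)
    return found


def check_hosts_file(path="/etc/hosts", hostname=None):
    """Read the hosts file (read-only) and look up the given hostname.

    If no hostname is passed, the name reported by the OS is used
    (assumption: this is the name the FW expects to find in the file).
    """
    hostname = hostname or socket.gethostname()
    with open(path) as handle:
        return hostname, addresses_for_host(handle.read(), hostname)
```

Calling check_hosts_file() returns the hostname together with a list of addresses. An empty list means the entry is missing and should be re-created through the WebUI or CLISH before the next reboot. The script only reads the file, so it does not conflict with the warning above about editing it by hand.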

I did not manage to find any SecureKnowledge entry for this scenario.


5 comments:

  1. Nice catch! I was already wondering how important the management interface really is ;)

  2. Hey Valery
    This is an implied connection - implied rule uses the IP in /etc/hosts file, and this is the IP of the management port.
    If you explicitly allow ssh connection to this you should be able to connect to the machine w/o the need for a console connection

    Replies
    1. Uri, in the described scenario FW fails to load pre-existing policy after reboot. All explicitly defined connections are no longer working because of that.

    2. This behavior can be easily replicated

      Replication:
      1) delete an IP address from the "Mgmt" interface
      2) set the state of the interface to "off"
      3) the entry that corresponds to the machine's HostName will be removed from the /etc/hosts file
      4) reboot the machine

      Result:
      Firewall machine loads "defaultfilter" policy, which by design prevents any connections to and through the machine.

      Root cause:
      CPSTART code checks that the machine's HostName has an IP address in the /etc/hosts file.
      If no such entry is found, CPSTART terminates.

      Next step:
      Issue will be forwarded to the relevant developers.

    3. Thanks, Sergei! Hope CP fixes it any time soon.
