To troubleshoot Domino cluster failover for mail servers, you first need to understand that BES is a client that calls the Notes API to poll the secondary mail server during a mail cluster failover. Note that BES only fails over; it does not load balance.
You also need to understand that the Domino cluster itself must be working properly before BES can work. The whole process of cluster failover is transparent to the users. BES merely calls the Notes API to scan the users' mail files on the secondary mail server during the failover.
With all of the above in mind, we need to know how to determine whether the Domino cluster is actually working.
Below is a good KB article that covers all the areas you need to check. If any of them is not in order, BES will not work.
Article Title: How to troubleshoot an incomplete failover for the BlackBerry Enterprise Server in a clustered environment
Article Number: KB02176
In brief, I have extracted the tasks you need to work through when troubleshooting. Please refer to the article for details and steps.
Task 1 - Verify the cluster is listed in the cluster.ncf file on the BlackBerry Enterprise Server
Task 2 - Verify the BlackBerry Enterprise Server can successfully complete a trace to all the messaging servers in the cluster
Task 3 - Determine the results of a show cluster command
Task 4 - Verify the mail files are correctly listed in the cluster directory
How the BlackBerry Enterprise Server Responds When the IBM Lotus Domino Servers are Out of Service
The IBM Lotus Domino messaging servers may be unavailable for various reasons, and the BlackBerry Enterprise Server responds differently depending on the reason. The following is a list of reasons why the IBM Lotus Domino messaging servers may be out of service and the BlackBerry Enterprise Server's response to each:
Reason 1 - The IBM Lotus Domino server service is stopped
The IBM Lotus Domino server service is not available. When the BlackBerry Enterprise Server attempts to connect to the messaging server, it receives an error message that the remote system is not responding, and it fails over to the cluster member.
Reason 2 - The messaging server is restricted
At the IBM Lotus Domino console, you can type set config server_restricted=1. The possible values are 0, 1, and 2. A value of 0 means the messaging server is available. A value of 1 or 2 means the messaging server is not available. If the value is set to 1, the messaging server is available again after a restart. If the value is set to 2, the messaging server remains in a restricted state after a restart.
If the primary messaging server is in restricted mode (server_restricted=1 or 2), the BlackBerry Enterprise Server fails over to a secondary cluster member. You can verify that a messaging server is restricted by running a show server or show cluster command. The availability index line will reflect this.
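To make the three server_restricted values easier to keep apart, here is a minimal Python sketch of how I read them. It is only a model of the behaviour described above, not anything that talks to Domino; the names ServerRestricted, is_restricted and value_after_restart are my own and purely illustrative.

```python
from enum import IntEnum


class ServerRestricted(IntEnum):
    """Illustrative names for the server_restricted values (not Domino identifiers)."""
    AVAILABLE = 0       # messaging server accepts connections
    UNTIL_RESTART = 1   # restricted now, cleared by a restart
    PERSISTENT = 2      # restricted now and still restricted after a restart


def is_restricted(value: int) -> bool:
    """True when the server is in restricted mode (value 1 or 2)."""
    return ServerRestricted(value) != ServerRestricted.AVAILABLE


def value_after_restart(value: int) -> int:
    """Model of the effective setting once the server has been restarted."""
    setting = ServerRestricted(value)
    if setting == ServerRestricted.UNTIL_RESTART:
        return int(ServerRestricted.AVAILABLE)
    return int(setting)
```

In this model, a server set to 1 comes back as 0 after a restart, while a server set to 2 stays at 2, so BES would keep failing over until an administrator sets the value back to 0.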
Reason 3 - The server availability threshold is set to 100
By modifying the server availability threshold, you can set the IBM Lotus Domino server's state to Busy. In this state, the BlackBerry Enterprise Server continues to scan the IBM Lotus Domino server. When an IBM Lotus Notes client accesses this messaging server, the IBM Lotus Notes client is redirected to a cluster member.
If you set the IBM Lotus Domino server to Busy, the BlackBerry Enterprise Server will continue to hit it. To keep the BlackBerry Enterprise Server off the messaging server, use the server_restricted parameter instead. This causes the BlackBerry Enterprise Server to fail over to the next cluster member. When server_restricted=1 or 2, only administrators are allowed to connect to the messaging server.
Reason 4 - The BlackBerry Enterprise Server is listed in the No Access field of the messaging server's document
When the BlackBerry Enterprise Server is listed in the No Access field and it attempts to connect to the IBM Lotus Domino server, it receives an error message that it is not allowed access, and it fails over to the next messaging server.
Reason 5 - The database is marked out of service
If you mark the database out of service, the BlackBerry Enterprise Server will not be able to open it and will fail over to the secondary messaging server.
Limitations of the BlackBerry Enterprise Server Regarding Load Balancing When Working in Failover
The BlackBerry Enterprise Server only connects to a failover database when the primary messaging server is not available. It does not redirect database opens to another cluster member for load balancing. If the BlackBerry Enterprise Server is able to open the database on the primary messaging server, it uses that copy.
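The five reasons and the load balancing limitation can be summarised as one decision, sketched below in Python. This is my own simplified model under stated assumptions, not how BES is implemented: the PrimaryState fields and the failover_reason function are hypothetical names, and the order of the checks is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrimaryState:
    """Hypothetical snapshot of the primary messaging server as BES sees it."""
    service_running: bool = True           # Reason 1
    server_restricted: int = 0             # Reason 2: 0, 1 or 2
    busy: bool = False                     # Reason 3: availability threshold at 100
    bes_in_no_access: bool = False         # Reason 4
    database_out_of_service: bool = False  # Reason 5


def failover_reason(state: PrimaryState) -> Optional[str]:
    """Return why BES would fail over, or None if it stays on the primary."""
    if not state.service_running:
        return "server service stopped"
    if state.server_restricted in (1, 2):
        return "server restricted"
    if state.bes_in_no_access:
        return "BES listed in No Access"
    if state.database_out_of_service:
        return "database out of service"
    # Busy alone does not trigger failover: BES keeps scanning the primary,
    # because it never redirects opens for load balancing.
    return None
```

For example, failover_reason(PrimaryState(busy=True)) returns None, which matches the point that a Busy server redirects Notes clients but not BES.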