The Foglight for MongoDB cartridge includes the predefined rules summarized below. You can change the default threshold values, or scope them to specific settings, typically through registry variables. Rules can also be copied, modified, disabled, or otherwise customized. For more information, refer to the Foglight documentation or contact Quest Software PSO.
This section covers the following rules:
Raises an alert if any warning or user asserts occur. Assert errors are typically uncommon; if any assert counter is non-zero, check the log file for more information. In many cases these errors are trivial, but they are worth investigating.
Raises an alert if the monitored mongod or mongos server is unreachable two or more times in a row.
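The "two or more times in a row" condition can be read as a check on the trailing run of failed availability probes. The following is a minimal sketch of that reading, not the cartridge's implementation; the function name, the list-of-booleans input, and the `required_failures` parameter are assumptions made for illustration.

```python
def is_unreachable_alert(check_results, required_failures=2):
    """Return True if the most recent checks end in a run of failures.

    check_results: booleans, oldest first, True meaning the server responded.
    required_failures: hypothetical threshold for consecutive failures.
    """
    streak = 0
    # Walk backwards from the newest result and count failures until a success.
    for reachable in reversed(check_results):
        if reachable:
            break
        streak += 1
    return streak >= required_failures
```

A single failed probe followed by a success resets the count, so isolated timeouts would not fire under this reading.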
Raises an alert when the cluster is missing an active mongos.
Raises an alert if a collection has grown faster than usual. The collection size is compared to a historical average to determine whether its growth or shrinkage is out of the ordinary.
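One way to picture a comparison against a historical average is to compare the latest change in size with the average magnitude of earlier changes. The sketch below assumes equally spaced size samples and a hypothetical `tolerance` multiplier; the actual baseline calculation used by the rule is not described here and may differ.

```python
from statistics import mean


def growth_is_unusual(sizes, tolerance=2.0):
    """Flag the latest size change if it exceeds the historical average change.

    sizes: collection sizes in bytes, oldest first, sampled at equal intervals.
    tolerance: hypothetical multiplier applied to the historical average.
    """
    if len(sizes) < 3:
        return False  # not enough history to form a baseline
    deltas = [later - earlier for earlier, later in zip(sizes, sizes[1:])]
    history, latest = deltas[:-1], deltas[-1]
    baseline = mean(abs(delta) for delta in history)
    if baseline == 0:
        # A previously static collection: any change is out of the ordinary.
        return latest != 0
    # abs() treats unusual shrinkage the same way as unusual growth.
    return abs(latest) > tolerance * baseline
```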
Raises an alert if the monitored instance is approaching its limit of available simultaneous connections.
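As an illustration, connection headroom can be estimated from the `connections` section of MongoDB's `serverStatus` output, which reports `current` and `available` counts. The sketch below assumes the limit is approximately their sum; the function name and the 0.8 threshold in the usage note are hypothetical, not the rule's defaults.

```python
def connection_utilization(connections):
    """Return the fraction of the connection limit currently in use.

    connections: the `connections` sub-document of serverStatus, for example
    {"current": 90, "available": 10}.
    """
    current = connections["current"]
    available = connections["available"]
    limit = current + available  # assumption: limit is current plus available
    return current / limit if limit else 0.0
```

For example, `connection_utilization({"current": 90, "available": 10}) > 0.8` would indicate that the instance is close to its limit under this assumed threshold.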
Raises an alert if there are open “no timeout” cursors.
Raises an alert if database command execution times are higher than usual.
Raises an alert if database read lock times are higher than usual.
Raises an alert if database write lock times are higher than usual.
Raises an alert if the average amount of time the server has spent writing data to disk is high. Background flush information only appears for instances that use the MMAPv1 storage engine.
Raises an alert if any deadlocks are encountered during lock acquisition.
Raises an alert if the combined global reader lock queue and global writer lock queue is getting long.
Raises an alert if any log entries match the patterns configured in the agent properties.
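Pattern-based log matching can be sketched as a regular-expression filter over log lines. The example below is an assumption about the general technique only: the pattern syntax accepted by the agent properties is not specified here, and the function name and sample patterns are hypothetical.

```python
import re


def matching_log_entries(lines, patterns):
    """Return the log lines that match at least one regular expression."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        line
        for line in lines
        if any(regex.search(line) for regex in compiled)
    ]
```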
Raises an alert if the total data size and index size on a server does not fit in physical memory. Requires the Infrastructure Cartridge to be enabled.
Raises an alert if the total index size on a server does not fit in physical memory. Requires the Infrastructure Cartridge to be enabled.
Raises an alert if the virtual memory used by a mongod process is too large relative to its mapped memory. With journaling enabled, virtual memory is normally at least double the mapped memory; a value three or more times larger may indicate a memory leak.
Raises an alert if the ratio of page faults to total database operations is too high.
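Because `serverStatus` counters are cumulative, a page-fault ratio is usually computed from the difference between two samples. The sketch below assumes the `extra_info.page_faults` and `opcounters` fields of `serverStatus` as inputs; the function name and the choice of operation counters are illustrative assumptions, and the rule may count operations differently.

```python
# Operation counters summed to approximate total database operations.
OP_KEYS = ("insert", "query", "update", "delete", "getmore", "command")


def page_faults_per_operation(previous, current):
    """Return page faults per operation between two serverStatus samples."""
    faults = (
        current["extra_info"]["page_faults"]
        - previous["extra_info"]["page_faults"]
    )
    ops = sum(current["opcounters"][key] - previous["opcounters"][key] for key in OP_KEYS)
    # With no operations in the interval, report 0.0 rather than divide by zero.
    return faults / ops if ops > 0 else 0.0
```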
Raises an alert if the average execution time for profiled operations is too high. Applicable when profiling is enabled for a given database.
Raises an alert if the pingMs value of a replica set member is large.
Raises an alert if one or more members of a replica set are not running.
Raises an alert if a replica set member is unreachable two or more times in a row.
Raises an alert if the replication buffer is filling up. MongoDB buffers oplog operations from the replication sync source before applying them in a batch.
Raises an alert if a replica set has no primary.
Raises an alert if replication on a secondary is falling behind and may not have time to replicate the oldest oplog entries before they are recycled.
Raises an alert if the replication oplog lag on a secondary server is too long.
Raises an alert if the replication oplog window is too small.
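The three oplog conditions above are related: lag is how far a secondary trails the primary, the window is the time span covered by the oplog, and a secondary is at risk once its lag consumes a large share of the window. The following sketch shows that relationship under stated assumptions; the function names and the `safety_fraction` parameter are hypothetical, and the timestamps are assumed to come from replica set status and oplog entries.

```python
from datetime import datetime, timedelta


def replication_lag(primary_optime, secondary_optime):
    """Time by which the secondary's last applied operation trails the primary's."""
    return max(primary_optime - secondary_optime, timedelta(0))


def oplog_window(first_entry_time, last_entry_time):
    """Time span between the oldest and newest entries in the oplog."""
    return last_entry_time - first_entry_time


def at_risk_of_falling_off(lag, window, safety_fraction=0.5):
    """True if lag has consumed at least `safety_fraction` of the oplog window.

    safety_fraction is a hypothetical threshold chosen for illustration.
    """
    return lag >= window * safety_fraction
```

A secondary whose lag reaches the full window can no longer catch up from the oplog, which is why a margin below 1.0 is used in this sketch.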
Raises an alert if a member of a replica set changes state.
Raises an alert if the time taken by a slow operation exceeds the threshold.
Raises an alert if the SSL/TLS certificate used by a MongoDB server is approaching its expiration date.
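A certificate expiry check reduces to counting the days between now and the certificate's "not after" date. The sketch below assumes the expiry date has already been read from the certificate as a timezone-aware `datetime`; the function name and the 30-day threshold mentioned afterwards are hypothetical.

```python
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Return the whole days remaining before the certificate expires.

    not_after: timezone-aware expiry timestamp taken from the certificate.
    now: optional timezone-aware reference time; defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (not_after - now).days
```

An alert condition could then be expressed as `days_until_expiry(not_after) <= 30`, with the threshold adjusted to match the renewal process in use.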
Raises an alert if the amount of tracked dirty bytes in the WiredTiger cache is high.
Raises an alert if the ratio of unmodified pages evicted to the total pages currently held in the WiredTiger cache is high.
Raises an alert if the number of available WiredTiger concurrent transaction read tickets approaches zero.
Raises an alert if the number of available WiredTiger concurrent transaction write tickets approaches zero.
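Ticket exhaustion can be expressed as the fraction of tickets still available. The sketch below assumes the `wiredTiger.concurrentTransactions` section of `serverStatus`, whose `read` and `write` sub-documents report `available` and `totalTickets` in the MongoDB versions that expose it; the function name and the 0.1 threshold in the usage note are hypothetical.

```python
def available_ticket_fraction(tickets):
    """Return the fraction of concurrent transaction tickets still available.

    tickets: the `read` or `write` sub-document of
    wiredTiger.concurrentTransactions, for example
    {"out": 120, "available": 8, "totalTickets": 128}.
    """
    total = tickets["totalTickets"]
    return tickets["available"] / total if total else 0.0
```

For example, a result below 0.1 for either the read or the write tickets would suggest that operations are about to queue under this assumed threshold.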
Raises an alert if the number of open WiredTiger cursors is higher than average.
Raises an alert if the number of open WiredTiger sessions is higher than average.
Raises an alert if any WiredTiger transactions fail due to cache overflow.