Infrastructure Integration
Configuration

Create a read-only user administrator for Epoch. Administrator privileges are required to collect complete server statistics. In the mongo shell, run:
use admin
db.auth("admin", "adminpassword")
db.addUser("epoch", "<Generate Password>", true)
Then, from a system shell, verify that the new user can authenticate with the same password:
echo "db.auth('epoch', '<Generate Password>')" | mongo admin | grep -E "(Authentication failed)|(auth fails)" && echo -e "\033[0;31mepoch user - Missing\033[0m" || echo -e "\033[0;32mepoch user - OK\033[0m"
Refer to the MongoDB documentation if you need to create and manage MongoDB users.
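The `<Generate Password>` placeholder above has to be filled with a strong random value. As a minimal sketch, one way to produce it is with Python's standard `secrets` module. The function name `generate_password` and the default length are illustrative assumptions, not part of Epoch or MongoDB. The alphabet is limited to letters and digits so the result can be pasted inside the quoted shell and mongo strings above without escaping.

```python
import secrets
import string


def generate_password(length=32):
    """Return a random password made of ASCII letters and digits only.

    Hypothetical helper: restricting the alphabet avoids quoting problems
    when the value is embedded in db.addUser(...) or db.auth(...).
    """
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # Print one candidate password to copy into the commands above.
    print(generate_password())
```

Use the same generated value in both the `db.addUser` call and the verification command.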

Configure the agent by editing
/etc/nutanix/epochddagent/conf.d/tokumx.yaml
in the collectors. Example:
init_config:

instances:
  # Specify the MongoDB URI, with database to use for reporting (defaults to "admin")
  - server: mongodb://localhost:27017
    # tags:
    #   - optional_tag1
    #   - optional_tag2

    # Optional SSL parameters, see https://github.com/mongodb/mongo-python-driver/blob/2.6.3/pymongo/mongo_client.py#L193-L212
    # for more details
    #
    # ssl: False # Optional (default to False)
    # ssl_keyfile: # Path to the private keyfile used to identify the local
    # ssl_certfile: # Path to the certificate file used to identify the local connection against mongod.
    # ssl_cert_reqs: # Specifies whether a certificate is required from the other side of the connection, and whether it will be validated if provided.
    # ssl_ca_certs: # Path to the ca_certs file
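To illustrate how the `server` value is interpreted, the following sketch splits a single-host MongoDB URI into its parts and applies the "admin" default when no database is given. It uses only the Python standard library. The function name `parse_server_uri` and the returned dictionary layout are assumptions made for this example; the agent's own parsing may differ, and multi-host (replica set) URIs are not handled here.

```python
from urllib.parse import urlsplit


def parse_server_uri(uri, default_db="admin", default_port=27017):
    """Split a single-host mongodb:// URI into host, port, database, username.

    Hypothetical helper for sanity-checking the `server` setting before
    restarting the collectors. The database falls back to "admin", matching
    the default described in the example configuration.
    """
    parts = urlsplit(uri)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// URI, got %r" % uri)
    if not parts.hostname:
        raise ValueError("no host found in %r" % uri)
    return {
        "host": parts.hostname,
        "port": parts.port or default_port,
        "database": parts.path.lstrip("/") or default_db,
        "username": parts.username,
    }


if __name__ == "__main__":
    print(parse_server_uri("mongodb://localhost:27017"))
```

A URI such as `mongodb://epoch:secret@db1.example.com/metrics` would report the `metrics` database and the default port 27017.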

Verify that all YAML files are valid with the following command:
/etc/init.d/epochcollectors configcheck

Restart the Agent using the following command:
/etc/init.d/epochcollectors restart

Execute the info command to verify that the integration check has passed:
/etc/init.d/epochcollectors info
The output of the command should contain a section similar to the following:
Checks
======
[...]
tokumx
------
  - instance #0 [OK]
  - Collected 8 metrics & 0 events
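When this verification is scripted, the status line can be extracted from the info output. The sketch below is one possible approach, assuming the output keeps the layout shown above (a line with the check name, followed by `instance #N [STATUS]` lines). The function name `check_status` is hypothetical, and the text passed in would come from capturing the info command's output yourself.

```python
import re


def check_status(info_text, check_name="tokumx"):
    """Return {instance_number: status} for one check in the info output.

    Assumes the layout shown in the documentation: the check name on its
    own line, then "instance #N [STATUS]" lines, optionally prefixed by
    "- " and separated by an underline of dashes.
    """
    statuses = {}
    in_section = False
    for raw in info_text.splitlines():
        # Drop indentation, list dashes, and underline rows of dashes.
        line = raw.strip().lstrip("- ").strip()
        if line == check_name:
            in_section = True
            continue
        if not in_section:
            continue
        match = re.match(r"instance #(\d+) \[(\w+)\]", line)
        if match:
            statuses[int(match.group(1))] = match.group(2)
        elif line and not line.startswith("Collected"):
            break  # reached the next check's section
    return statuses


SAMPLE = """Checks
======
[...]
tokumx
------
  - instance #0 [OK]
  - Collected 8 metrics & 0 events
"""

if __name__ == "__main__":
    print(check_status(SAMPLE))
```

An empty result means the `tokumx` section was not found, which usually points at a missing or invalid `tokumx.yaml`.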
Infrastructure Datasources
Datasource  Available Aggregations  Unit  Description 

tokumx.asserts.msgps  avg max min sum 
assertion/second  The number of message assertions raised per second. 
tokumx.asserts.regularps  avg max min sum 
assertion/second  The number of regular assertions raised per second. 
tokumx.asserts.rolloversps  avg max min sum 
assertion/second  The number of times that the rollover counters roll over per second. The counters roll over to zero every 2^30 assertions.
tokumx.asserts.userps  avg max min sum 
assertion/second  The number of user assertions raised per second. 
tokumx.asserts.warningps  avg max min sum 
assertion/second  The number of warnings raised per second. 
tokumx.connections.available  avg max min sum 
connection  The number of unused available incoming connections the database can provide. 
tokumx.connections.current  avg max min sum 
connection  The number of connections to the database server from clients. 
tokumx.cursors.timedOut  avg max min sum 
cursor  The total number of cursors that have timed out since the server process started. 
tokumx.cursors.totalOpen  avg max min sum 
cursor  The number of cursors that TokuMX is maintaining for clients.
tokumx.ft.alerts.checkpointFailures  avg max min sum 
event  The number of checkpoints that have failed for any reason. 
tokumx.ft.alerts.locktreeRequestsPending  avg max min sum 
request  The number of requests for document-level locks in the locktree that are waiting for other requests to release their locks.
tokumx.ft.alerts.longWaitEvents.cachePressure.countps  avg max min sum 
event/second  Rate at which a thread had to wait more than 1 second for evictions to create space in the cachetable for it to page in data it needed. 
tokumx.ft.alerts.longWaitEvents.cachePressure.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) that a thread had to wait more than 1 second for evictions to create space in the cachetable for it to page in data it needed. 
tokumx.ft.alerts.longWaitEvents.checkpointBegin.countps  avg max min sum 
event/second  Rate at which the begin checkpoint phase of checkpoint has run (these should be fairly quick). 
tokumx.ft.alerts.longWaitEvents.checkpointBegin.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) that a begin checkpoint phase has spent blocking other threads. 
tokumx.ft.alerts.longWaitEvents.fsync.countps  avg max min sum 
event/second  Rate at which fsync operations took more than 1 second. 
tokumx.ft.alerts.longWaitEvents.fsync.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) spent performing fsync operations that took longer than 1 second. 
tokumx.ft.alerts.longWaitEvents.locktreeWait.countps  avg max min sum 
event/second  Rate at which a thread had to wait more than 1 second to acquire a document-level lock in the locktree.
tokumx.ft.alerts.longWaitEvents.locktreeWait.timeps  avg max min sum
fraction  Fraction of time (microseconds/second) spent by threads waiting more than 1 second to acquire a document-level lock in the locktree.
tokumx.ft.alerts.longWaitEvents.locktreeWaitEscalation.countps  avg max min sum
event/second  Rate at which a thread had to wait more than 1 second to acquire a document-level lock because the locktree was at the memory limit and needed to run escalation.
tokumx.ft.alerts.longWaitEvents.locktreeWaitEscalation.timeps  avg max min sum
fraction  Fraction of time (microseconds/second) spent by threads waiting more than 1 second to acquire a document-level lock because the locktree was at the memory limit and needed to run escalation.
tokumx.ft.alerts.longWaitEvents.logBufferWaitps  avg max min sum 
event/second  Rate at which a writing client had to wait more than 100ms for access to the log buffer. 
tokumx.ft.cachetable.evictions.full.leaf.clean.bytesps  avg max min sum 
byte/second  Rate of full evictions of leaf nodes. 
tokumx.ft.cachetable.evictions.full.leaf.clean.countps  avg max min sum 
event/second  Rate of full evictions of leaf nodes. 
tokumx.ft.cachetable.evictions.full.leaf.dirty.bytesps  avg max min sum 
byte/second  Rate of full evictions of leaf nodes that need to be written back to disk. 
tokumx.ft.cachetable.evictions.full.leaf.dirty.countps  avg max min sum 
event/second  Rate of full evictions of leaf nodes that need to be written back to disk. 
tokumx.ft.cachetable.evictions.full.leaf.dirty.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) spent performing full evictions of leaf nodes, including the time spent serializing, compressing, and writing those nodes to disk.
tokumx.ft.cachetable.evictions.full.nonleaf.clean.bytesps  avg max min sum 
byte/second  Rate of full evictions of nonleaf nodes. 
tokumx.ft.cachetable.evictions.full.nonleaf.clean.countps  avg max min sum 
event/second  Rate of full evictions of nonleaf nodes. 
tokumx.ft.cachetable.evictions.full.nonleaf.dirty.bytesps  avg max min sum 
byte/second  Rate of full evictions of nonleaf nodes that need to be written back to disk. 
tokumx.ft.cachetable.evictions.full.nonleaf.dirty.countps  avg max min sum 
event/second  Rate of full evictions of nonleaf nodes that need to be written back to disk. 
tokumx.ft.cachetable.evictions.full.nonleaf.dirty.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) spent performing full evictions of nonleaf nodes, including the time spent serializing, compressing, and writing those nodes to disk.
tokumx.ft.cachetable.evictions.partial.leaf.clean.bytesps  avg max min sum 
byte/second  Rate of partial evictions of leaf nodes. 
tokumx.ft.cachetable.evictions.partial.leaf.clean.countps  avg max min sum 
event/second  Rate of partial evictions of leaf nodes. 
tokumx.ft.cachetable.evictions.partial.nonleaf.clean.bytesps  avg max min sum 
byte/second  Rate of partial evictions of nonleaf nodes. 
tokumx.ft.cachetable.evictions.partial.nonleaf.clean.countps  avg max min sum 
event/second  Rate of partial evictions of nonleaf nodes. 
tokumx.ft.cachetable.miss.countps  avg max min sum 
miss/second  Rate of internal cache misses. This metric is similar to MongoDB’s btree misses and page faults. 
tokumx.ft.cachetable.miss.full.countps  avg max min sum 
miss/second  Rate of full internal cache misses. 
tokumx.ft.cachetable.miss.full.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) the database has had to wait for a disk read to complete for a full cache miss. 
tokumx.ft.cachetable.miss.partial.countps  avg max min sum 
miss/second  Rate of partial internal cache misses. 
tokumx.ft.cachetable.miss.partial.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) the database has had to wait for a disk read to complete for a partial cache miss. 
tokumx.ft.cachetable.miss.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) the database has had to wait for a disk read to complete for cache misses. 
tokumx.ft.cachetable.size.current  avg max min sum 
byte  Total amount of uncompressed data currently in the database's internal cache. 
tokumx.ft.cachetable.size.limit  avg max min sum 
byte  Total amount of uncompressed data that will fit in TokuMX’s internal cache. 
tokumx.ft.cachetable.size.writing  avg max min sum 
byte  Total size of nodes that are currently queued up to be written to disk for eviction. 
tokumx.ft.checkpoint.begin.timeps  avg max min sum 
fraction  Fraction of time (microseconds/second) that a begin checkpoint phase has spent blocking other threads. 
tokumx.ft.checkpoint.countps  avg max min sum 
event/second  Rate at which checkpoints are completed. 
tokumx.ft.checkpoint.lastComplete.time  avg max min sum 
second  The time spent, in seconds, by the most recently completed checkpoint. 
tokumx.ft.checkpoint.timeps  avg max min sum 
fraction  Fraction of time (seconds/second) spent doing checkpoints. 
tokumx.ft.checkpoint.write.leaf.bytes.compressedps  avg max min sum 
byte/second  The rate at which leaf nodes are written to disk during checkpoints, after compression. 
tokumx.ft.checkpoint.write.leaf.bytes.uncompressedps  avg max min sum 
byte/second  The rate at which leaf nodes are written to disk during checkpoints, before compression. 
tokumx.ft.checkpoint.write.leaf.countps  avg max min sum 
write/second  The rate at which leaf nodes are written to disk during checkpoints. 
tokumx.ft.checkpoint.write.leaf.timeps  avg max min sum 
fraction  The fraction of time spent writing leaf nodes to disk during checkpoints. 
tokumx.ft.checkpoint.write.nonleaf.bytes.compressedps  avg max min sum 
byte/second  The rate at which nonleaf nodes are written to disk during checkpoints, after compression. 
tokumx.ft.checkpoint.write.nonleaf.bytes.uncompressedps  avg max min sum 
byte/second  The rate at which nonleaf nodes are written to disk during checkpoints, before compression. 
tokumx.ft.checkpoint.write.nonleaf.countps  avg max min sum 
write/second  The rate at which nonleaf nodes are written to disk during checkpoints. 
tokumx.ft.checkpoint.write.nonleaf.timeps  avg max min sum 
fraction  The fraction of time spent writing nonleaf nodes to disk during checkpoints. 
tokumx.ft.compressionRatio.leaf  avg max min sum 
fraction  The size ratio of leaf nodes before and after compression. 
tokumx.ft.compressionRatio.nonleaf  avg max min sum 
fraction  The size ratio of nonleaf nodes before and after compression. 
tokumx.ft.compressionRatio.overall  avg max min sum 
fraction  The size ratio of nodes before and after compression. 
tokumx.ft.fsync.countps  avg max min sum 
operation/second  The rate at which the database flushed the operating system’s file buffers to disk. 
tokumx.ft.fsync.timeps  avg max min sum 
fraction  The fraction of time (microseconds/second) used to fsync to disk. 
tokumx.ft.locktree.size.current  avg max min sum 
byte  Total memory the locktree is currently using. 
tokumx.ft.locktree.size.limit  avg max min sum 
byte  Maximum number of bytes that the locktree is allowed to use. 
tokumx.ft.log.bytesps  avg max min sum 
byte/second  The rate at which the logger writes to disk. 
tokumx.ft.log.countps  avg max min sum 
write/second  The rate of individual log writes.
tokumx.ft.log.timeps  avg max min sum 
fraction  The fraction of time spent performing log writes. 
tokumx.ft.serializeTime.leaf.compressps  avg max min sum 
fraction  Fraction of time spent compressing leaf nodes before writing them to disk (for checkpoint or when evicted while dirty). 
tokumx.ft.serializeTime.leaf.decompressps  avg max min sum 
fraction  Fraction of time spent decompressing leaf nodes after reading them off disk.
tokumx.ft.serializeTime.leaf.deserializeps  avg max min sum 
fraction  Fraction of time spent deserializing leaf nodes and their partitions after reading them off disk. 
tokumx.ft.serializeTime.leaf.serializeps  avg max min sum 
fraction  Fraction of time spent serializing leaf nodes before writing them to disk (for checkpoint or when evicted while dirty).
tokumx.ft.serializeTime.nonleaf.compressps  avg max min sum 
fraction  Fraction of time spent compressing nonleaf nodes before writing them to disk (for checkpoint or when evicted while dirty). 
tokumx.ft.serializeTime.nonleaf.decompressps  avg max min sum 
fraction  Fraction of time spent decompressing nonleaf nodes after reading them off disk.
tokumx.ft.serializeTime.nonleaf.deserializeps  avg max min sum 
fraction  Fraction of time spent deserializing nonleaf nodes and their partitions after reading them off disk. 
tokumx.ft.serializeTime.nonleaf.serializeps  avg max min sum 
fraction  Fraction of time spent serializing nonleaf nodes before writing them to disk (for checkpoint or when evicted while dirty).
tokumx.mem.resident  avg max min sum 
mebibyte  The amount of memory currently used by the database process. 
tokumx.mem.virtual  avg max min sum 
mebibyte  The amount of virtual memory used by the database process. 
tokumx.metrics.document.deletedps  avg max min sum 
document/second  The number of documents deleted per second. 
tokumx.metrics.document.insertedps  avg max min sum 
document/second  The number of documents inserted per second. 
tokumx.metrics.document.returnedps  avg max min sum 
document/second  The number of documents returned by queries per second. 
tokumx.metrics.document.updatedps  avg max min sum 
document/second  The number of documents updated per second. 
tokumx.metrics.getLastError.wtime.numps  avg max min sum 
operation/second  The number of getLastError operations per second with a specified write concern (i.e. w) that wait for one or more members of a replica set to acknowledge the write operation. 
tokumx.metrics.getLastError.wtime.totalMillisps  avg max min sum 
fraction  The fraction of time (ms/s) spent performing getLastError operations with write concern (i.e. w) that wait for one or more members of a replica set to acknowledge the write operation.
tokumx.metrics.getLastError.wtimeoutsps  avg max min sum
event/second  The number of times per second that write concern operations have timed out as a result of the wtimeout threshold to getLastError.
tokumx.metrics.operation.idhackps  avg max min sum 
query/second  The rate of queries that contain the _id field. 
tokumx.metrics.operation.scanAndOrderps  avg max min sum 
query/second  The rate of queries that return sorted numbers that cannot perform the sort operation using an index. 
tokumx.metrics.queryExecutor.scannedps  avg max min sum 
operation/second  The rate of index items scanned during queries and query plan evaluation.
tokumx.metrics.repl.apply.batches.numps  avg max min sum 
operation/second  The number of batches applied across all databases per second. 
tokumx.metrics.repl.apply.batches.totalMillisps  avg max min sum 
fraction  The fraction of time (ms/s) spent applying operations from the oplog. 
tokumx.metrics.repl.apply.opsps  avg max min sum 
operation/second  The rate of oplog operations. 
tokumx.metrics.repl.buffer.count  avg max min sum 
operation  The number of operations in the oplog buffer. 
tokumx.metrics.repl.buffer.sizeBytes  avg max min sum 
byte  The current size of the contents of the oplog buffer. 
tokumx.metrics.repl.network.bytesps  avg max min sum 
byte/second  The rate at which data is read from the replication sync source. 
tokumx.metrics.repl.network.getmores.numps  avg max min sum 
operation/second  The rate of getmore operations. 
tokumx.metrics.repl.network.getmores.totalMillisps  avg max min sum 
fraction  The fraction of time (ms/s) spent collecting data from getmore operations. 
tokumx.metrics.repl.network.opsps  avg max min sum 
operation/second  The rate of operations read from the replication source. 
tokumx.metrics.repl.network.readersCreatedps  avg max min sum 
process/second  The rate at which oplog query processes are created. 
tokumx.metrics.repl.oplog.insert.numps  avg max min sum 
operation/second  The rate at which operations are inserted into the oplog. 
tokumx.metrics.repl.oplog.insert.totalMillisps  avg max min sum 
fraction  The fraction of time (ms/s) spent inserting operations into the oplog. 
tokumx.metrics.repl.oplog.insertBytesps  avg max min sum 
byte/second  The rate (in bytes) at which data is inserted into the oplog. 
tokumx.metrics.ttl.deletedDocumentsps  avg max min sum 
document/second  The rate at which documents are deleted from collections with a TTL index.
tokumx.metrics.ttl.passesps  avg max min sum
event/second  The number of times per second the background process removes documents from collections with a TTL index.
tokumx.opcounters.commandps  avg max min sum 
command/second  The total number of commands per second issued to the database. 
tokumx.opcounters.deleteps  avg max min sum 
operation/second  The number of delete operations per second. 
tokumx.opcounters.getmoreps  avg max min sum 
operation/second  The number of getmore operations per second. 
tokumx.opcounters.insertps  avg max min sum 
operation/second  The number of insert operations per second. 
tokumx.opcounters.queryps  avg max min sum 
query/second  The total number of queries per second. 
tokumx.opcounters.updateps  avg max min sum 
operation/second  The number of update operations per second. 
tokumx.opcountersRepl.commandps  avg max min sum 
command/second  The total number of replicated commands issued to the database per second. 
tokumx.opcountersRepl.deleteps  avg max min sum 
operation/second  The number of replicated delete operations per second. 
tokumx.opcountersRepl.getmoreps  avg max min sum 
operation/second  The number of replicated getmore operations per second. 
tokumx.opcountersRepl.insertps  avg max min sum 
operation/second  The number of replicated insert operations per second. 
tokumx.opcountersRepl.queryps  avg max min sum 
query/second  The total number of replicated queries per second. 
tokumx.opcountersRepl.updateps  avg max min sum 
operation/second  The number of replicated update operations per second. 
tokumx.stats.coll.count  avg max min sum 
document  The number of objects or documents in this collection. 
tokumx.stats.coll.nindexes  avg max min sum 
index  The number of indexes on this collection. 
tokumx.stats.coll.nindexesbeingbuilt  avg max min sum 
index  The number of indexes currently being built. 
tokumx.stats.coll.size  avg max min sum 
byte  The total size in memory of all records in a collection. Does not include the record header, but does include the record’s padding. Does not include the size of any indexes associated with the collection. 
tokumx.stats.coll.storageSize  avg max min sum 
byte  The total amount of storage allocated to this collection for document storage. 
tokumx.stats.coll.totalIndexSize  avg max min sum 
byte  The total size of all indexes on this collection. 
tokumx.stats.coll.totalIndexStorageSize  avg max min sum 
byte  The total size on disk of all indexes on this collection (after compression). 
tokumx.stats.dataSize  avg max min sum 
byte  The total size of the data held in this database including the padding factor. 
tokumx.stats.db.avgObjSize  avg max min sum 
byte  The average size of each document. 
tokumx.stats.db.collections  avg max min sum 
The number of collections in the database.  
tokumx.stats.db.dataSize  avg max min sum 
byte  The total size of the data held in this database including the padding factor. 
tokumx.stats.db.indexes  avg max min sum 
index  The total number of indexes across all collections in the database. 
tokumx.stats.db.indexSize  avg max min sum 
byte  The total size of all indexes created on this database. 
tokumx.stats.db.indexStorageSize  avg max min sum 
byte  The total size on disk of all indexes created on this database (after compression). 
tokumx.stats.db.objects  avg max min sum 
document  The number of documents in the database across all collections. 
tokumx.stats.db.storageSize  avg max min sum 
byte  The total amount of space allocated to collections in this database for document storage. 
tokumx.stats.idx.avgObjSize  avg max min sum 
byte  The average size of each index entry. 
tokumx.stats.idx.count  avg max min sum 
index  The number of documents in this index. 
tokumx.stats.idx.deletes  avg max min sum 
operation  The number of delete operations performed on this index. 
tokumx.stats.idx.inserts  avg max min sum 
operation  The number of insert operations performed on this index. 
tokumx.stats.idx.nscanned  avg max min sum 
index  The number of index entries scanned for queries using this index. 
tokumx.stats.idx.nscannedObjects  avg max min sum 
object  The number of collection objects examined after scanning an index entry for a query using this index. 
tokumx.stats.idx.queries  avg max min sum 
query  The number of query operations performed using this index. 
tokumx.stats.idx.size  avg max min sum 
byte  The total size of this index. 
tokumx.stats.idx.storageSize  avg max min sum 
byte  The total size on disk of this index (after compression). 
tokumx.stats.indexes  avg max min sum 
index  The total number of indexes across all collections in the database. 
tokumx.stats.indexSize  avg max min sum 
byte  The total size of all indexes created on this database. 
tokumx.stats.objects  avg max min sum 
document  The number of documents in the database across all collections. 
tokumx.stats.storageSize  avg max min sum 
byte  The total amount of space allocated to collections in this database for document storage. 
tokumx.uptime  avg max min sum 
second  The time that the TokuMX process has been active.
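Several datasources above are reported as a "fraction" expressed in microseconds/second or ms/s, and the cachetable size is reported next to its limit. The sketch below shows the arithmetic for turning such readings into plain ratios. Both function names are hypothetical, and the sample values are made up for illustration. If a reading sums time across several threads, the resulting ratio can exceed 1.0; this is an assumption about aggregation, not something the table states.

```python
def time_fraction(value, per_second_unit="microseconds"):
    """Convert a time-per-second reading into a ratio of wall-clock time.

    value is the datasource reading, e.g. tokumx.ft.fsync.timeps in
    microseconds/second, or a *.totalMillisps reading in ms/s.
    """
    divisors = {"microseconds": 1000000, "milliseconds": 1000, "seconds": 1}
    return value / divisors[per_second_unit]


def cache_utilization(size_current, size_limit):
    """Ratio of tokumx.ft.cachetable.size.current to size.limit (both bytes)."""
    if size_limit <= 0:
        raise ValueError("size_limit must be positive")
    return size_current / size_limit


if __name__ == "__main__":
    # 250,000 microseconds of fsync per second is a quarter of wall-clock time.
    print(time_fraction(250000))
    # 3 GiB cached out of a 4 GiB limit.
    print(cache_utilization(3 * 2**30, 4 * 2**30))
```

Ratios like these are easier to compare across hosts than the raw readings, for example when deciding whether cache pressure or fsync time deserves an alert threshold.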