1 cassandra_exporter
1.1 Configure Cassandra
1.1.1 Download cassandra_exporter
mkdir -p /app/module/cassandra_exporter
cd /app/module/cassandra_exporter
wget https://github.com/criteo/cassandra_exporter/releases/download/2.3.8/cassandra_exporter-2.3.8.jar
1.1.2 Add the configuration file
vim cassandra_exporter.yml
host: localhost:7199
ssl: False
user:
password:
listenAddress: 0.0.0.0
listenPort: 8080
# Regular expression to match environment variable names that will be added
# as labels to all data points. The name of the label will be either
# $1 from the regex below, or the entire environment variable name if no match groups are defined
#
# Example:
# additionalLabelsFromEnvvars: "^ADDL\_(.*)$"
additionalLabelsFromEnvvars:
blacklist:
  # To profile the duration of jmx call you can start the program with the following options
  # > java -Dorg.slf4j.simpleLogger.defaultLogLevel=trace -jar cassandra_exporter.jar config.yml --oneshot
  #
  # To get intuition of what is done by cassandra when something is called you can look in cassandra
  # https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/metrics
  # Please avoid to scrape frequently those calls that are iterating over all sstables
  # Unaccessible metrics (not enough privilege)
  - java:lang:memorypool:.*usagethreshold.*
  # Leaf attributes not interesting for us but that are presents in many path
  - .*:999thpercentile
  - .*:95thpercentile
  - .*:fifteenminuterate
  - .*:fiveminuterate
  - .*:durationunit
  - .*:rateunit
  - .*:stddev
  - .*:meanrate
  - .*:mean
  - .*:min
  # Path present in many metrics but uninterresting
  - .*:viewlockacquiretime:.*
  - .*:viewreadtime:.*
  - .*:cas[a-z]+latency:.*
  - .*:colupdatetimedeltahistogram:.*
  # Mostly for RPC, do not scrap them
  - org:apache:cassandra:db:.*
  # columnfamily is an alias for Table metrics
  # https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/metrics/TableMetrics.java#L162
  - org:apache:cassandra:metrics:columnfamily:.*
  # Should we export metrics for system keyspaces/tables ?
  - org:apache:cassandra:metrics:[^:]+:system[^:]*:.*
  # Logback doesn't have any useful metrics
  - ch:qos:logback:.*
  # Don't scrap us
  - com:criteo:nosql:cassandra:exporter:.*
maxScrapFrequencyInSec:
  50:
    - .*
  # Refresh those metrics only every hour as it is costly for cassandra to retrieve them
  3600:
    - .*:snapshotssize:.*
    - .*:estimated.*
    - .*:totaldiskspaceused:.*
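The blacklist entries are regular expressions over colon-separated, lowercased metric paths. The following is a minimal sketch of how such a filter could behave, assuming each pattern must match the whole path (an assumption inferred from entries such as `.*:min`, not confirmed against the exporter's source). The function name `is_blacklisted` and the sample paths are hypothetical.

```python
import re

# A few patterns copied from the blacklist in cassandra_exporter.yml.
BLACKLIST = [
    r".*:999thpercentile",
    r".*:min",
    r"org:apache:cassandra:db:.*",
    r"org:apache:cassandra:metrics:[^:]+:system[^:]*:.*",
]


def is_blacklisted(metric_path, patterns=BLACKLIST):
    """Return True if the lowercased path fully matches any pattern.

    Full-match semantics are an assumption made for this sketch.
    """
    path = metric_path.lower()
    return any(re.fullmatch(p, path) is not None for p in patterns)


# Hypothetical metric paths, for illustration only.
print(is_blacklisted("org:apache:cassandra:metrics:table:system_auth:roles:readlatency:count"))  # True
print(is_blacklisted("org:apache:cassandra:metrics:table:ks1:users:readlatency:count"))  # False
```

Trying a pattern this way before adding it to the file helps avoid excluding more metrics than intended.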
1.1.3 Start the exporter
java -jar cassandra_exporter-2.3.8.jar cassandra_exporter.yml
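Once the exporter is running, its `/metrics` endpoint should return text in the Prometheus exposition format. As a quick sanity check, the sketch below counts samples per metric name in such a response. The helper `count_samples` and the sample text are hypothetical; in practice the text would come from fetching `http://<exporter-host>:8080/metrics`.

```python
from collections import Counter


def count_samples(exposition_text):
    """Count samples per metric name in Prometheus text exposition output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name] += 1
    return counts


# Hypothetical response body, for illustration only.
SAMPLE = """\
# HELP cassandra_stats Cassandra metrics
# TYPE cassandra_stats gauge
cassandra_stats{name="org:apache:cassandra:metrics:client:connectednativeclients"} 3.0
cassandra_stats{name="org:apache:cassandra:metrics:storage:load:count"} 1024.0
jvm_threads_current 42
"""

print(dict(count_samples(SAMPLE)))
```

An empty result usually means the exporter could not reach the JMX port configured under `host`.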
1.2 Configure Prometheus
1. Edit the Prometheus configuration file and add the Cassandra exporter as a scrape target
  - job_name: "cassandra_exporter"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["192.168.137.110:8080"]
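2. Reload the Prometheus configuration (the reload endpoint is available only when Prometheus is started with --web.enable-lifecycle)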
curl -X POST http://192.168.137.131:9090/-/reload
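The same reload can be scripted. The sketch below builds the POST request with Python's standard library; the function names `build_reload_request` and `reload_prometheus` are hypothetical helpers, and the base URL is the one used in the curl command.

```python
import urllib.request


def build_reload_request(base_url):
    """Build the POST request for Prometheus' /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url, timeout=5):
    """Send the reload request and return the HTTP status code."""
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request does not touch the network.
req = build_reload_request("http://192.168.137.131:9090")
print(req.get_method(), req.full_url)
# prints: POST http://192.168.137.131:9090/-/reload
```

Calling `reload_prometheus("http://192.168.137.131:9090")` sends the request; an HTTP error is raised if the reload endpoint is disabled or the new configuration is rejected.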
1.3 Import the Cassandra dashboard
Import a Cassandra dashboard template into Grafana. The dashboard ID is 6400.
2 jmx_exporter
2.1 Add the agent to the Cassandra configuration
vim /app/module/cassandra/conf/cassandra-env.sh
# Append the following at the end of the file
JVM_OPTS="$JVM_OPTS -Xms512m -Xmx1024m -javaagent:/app/module/jmx_exporter/jmx_prometheus_javaagent-0.20.0.jar=12346:/app/module/jmx_exporter/cassandra-jmx.yml"
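The agent argument has the form `-javaagent:<jar>=<port>:<config>`, and the port must be the one Prometheus later scrapes. The sketch below splits such an option and compares its port with a scrape target. It assumes the simple form without a host prefix before the port; `parse_javaagent` and `ports_match` are hypothetical helper names.

```python
def parse_javaagent(option):
    """Split '-javaagent:<jar>=<port>:<config>' into (jar, port, config)."""
    prefix = "-javaagent:"
    if not option.startswith(prefix):
        raise ValueError("not a -javaagent option")
    jar, _, args = option[len(prefix):].partition("=")
    # Assumes '<port>:<config>' with no host in front of the port.
    port, _, config = args.partition(":")
    return jar, int(port), config


def ports_match(option, target):
    """Check that the agent port equals the port of a 'host:port' target."""
    _, agent_port, _ = parse_javaagent(option)
    return agent_port == int(target.rsplit(":", 1)[1])


OPTION = (
    "-javaagent:/app/module/jmx_exporter/jmx_prometheus_javaagent-0.20.0.jar"
    "=12346:/app/module/jmx_exporter/cassandra-jmx.yml"
)

print(parse_javaagent(OPTION)[1])                     # 12346
print(ports_match(OPTION, "192.168.137.110:12346"))  # True
print(ports_match(OPTION, "192.168.137.110:12347"))  # False
```

A mismatch between these two ports shows up in Prometheus as a target that is permanently down.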
2.2 Add cassandra-jmx.yml
vim /app/module/jmx_exporter/cassandra-jmx.yml
lowercaseOutputLabelNames: true
lowercaseOutputName: true
whitelistObjectNames: ["org.apache.cassandra.metrics:*"]
# ColumnFamily is an alias for Table metrics
blacklistObjectNames: ["org.apache.cassandra.metrics:type=ColumnFamily,*"]
rules:
  # Generic gauges with 0-2 labels
  - pattern: org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?(?:, scope=(\S*))?, name=(\S*)><>Value
    name: cassandra_$1_$5
    type: GAUGE
    labels:
      "$1": "$4"
      "$2": "$3"
  #
  # Emulate Prometheus 'Summary' metrics for the exported 'Histogram's.
  # TotalLatency is the sum of all latencies since server start
  #
  - pattern: org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?(?:, scope=(\S*))?, name=(.+)?(?:Total)(Latency)><>Count
    name: cassandra_$1_$5$6_seconds_sum
    type: UNTYPED
    labels:
      "$1": "$4"
      "$2": "$3"
    # Convert microseconds to seconds
    valueFactor: 0.000001
  - pattern: org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?(?:, scope=(\S*))?, name=((?:.+)?(?:Latency))><>Count
    name: cassandra_$1_$5_seconds_count
    type: UNTYPED
    labels:
      "$1": "$4"
      "$2": "$3"
  - pattern: org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?(?:, scope=(\S*))?, name=(.+)><>Count
    name: cassandra_$1_$5_count
    type: UNTYPED
    labels:
      "$1": "$4"
      "$2": "$3"
  - pattern: org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?(?:, scope=(\S*))?, name=((?:.+)?(?:Latency))><>(\d+)thPercentile
    name: cassandra_$1_$5_seconds
    type: GAUGE
    labels:
      "$1": "$4"
      "$2": "$3"
      quantile: "0.$6"
    # Convert microseconds to seconds
    valueFactor: 0.000001
  - pattern: org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?(?:, scope=(\S*))?, name=(.+)><>(\d+)thPercentile
    name: cassandra_$1_$5
    type: GAUGE
    labels:
      "$1": "$4"
      "$2": "$3"
      quantile: "0.$6"
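To see how a rule turns an MBean into a metric, the sketch below applies the first pattern ("Generic gauges") to a bean string with Python's `re` module. It is only an approximation of what jmx_exporter does: the bean string layout, the unanchored search, and the handling of unmatched groups are assumptions, and `apply_gauge_rule` and the sample beans are hypothetical.

```python
import re

# The "Generic gauges with 0-2 labels" pattern from cassandra-jmx.yml.
GAUGE_RULE = re.compile(
    r"org.apache.cassandra.metrics<type=(\S*)(?:, ((?!scope)\S*)=(\S*))?"
    r"(?:, scope=(\S*))?, name=(\S*)><>Value"
)


def apply_gauge_rule(bean):
    """Return (metric_name, labels) for a matching bean string, else None."""
    m = GAUGE_RULE.search(bean)
    if m is None:
        return None
    # name: cassandra_$1_$5, lowercased as with lowercaseOutputName: true
    name = f"cassandra_{m.group(1)}_{m.group(5)}".lower()
    labels = {}
    if m.group(4) is not None:   # "$1": "$4"
        labels[m.group(1).lower()] = m.group(4)
    if m.group(2) is not None:   # "$2": "$3"
        labels[m.group(2).lower()] = m.group(3)
    return name, labels


# Hypothetical bean strings, for illustration only.
print(apply_gauge_rule(
    "org.apache.cassandra.metrics<type=Table, keyspace=ks1, scope=users, name=LiveSSTableCount><>Value"
))
# prints: ('cassandra_table_livesstablecount', {'table': 'users', 'keyspace': 'ks1'})
print(apply_gauge_rule(
    "org.apache.cassandra.metrics<type=Client, name=connectedNativeClients><>Value"
))
# prints: ('cassandra_client_connectednativeclients', {})
```

The negative lookahead `(?!scope)` keeps the optional second group from consuming the `scope` property, so that `scope` always lands in group 4.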
2.3 Start Cassandra
/app/module/cassandra/bin/cassandra
2.4 Configure Prometheus
  - job_name: "cassandra_jmx_exporter"
    scrape_interval: 60s
    scrape_timeout: 60s
    metrics_path: "/metrics"
    static_configs:
      - targets: ["192.168.137.110:12346"]
2.5 Reload Prometheus
curl -X POST http://192.168.137.131:9090/-/reload
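After the reload, the `/api/v1/targets` endpoint of Prometheus reports whether each target is being scraped successfully. The sketch below extracts the health of one job from such a response. The helper `job_health` and the sample JSON are hypothetical; a real response would come from `http://192.168.137.131:9090/api/v1/targets`.

```python
import json


def job_health(targets_json, job_name):
    """Return {scrapeUrl: health} for the active targets of one job."""
    data = json.loads(targets_json)["data"]
    return {
        target["scrapeUrl"]: target["health"]
        for target in data["activeTargets"]
        if target["labels"].get("job") == job_name
    }


# Hypothetical, trimmed-down API response, for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "cassandra_jmx_exporter"},
                "scrapeUrl": "http://192.168.137.110:12346/metrics",
                "health": "up",
            },
            {
                "labels": {"job": "cassandra_exporter"},
                "scrapeUrl": "http://192.168.137.110:8080/metrics",
                "health": "down",
            },
        ],
        "droppedTargets": [],
    },
})

print(job_health(SAMPLE_RESPONSE, "cassandra_jmx_exporter"))
# prints: {'http://192.168.137.110:12346/metrics': 'up'}
```

A target that stays `down` usually points to a wrong port or to Cassandra not having been restarted with the agent.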
2.6 Import the JVM dashboard
Import a JVM dashboard template into Grafana. The dashboard ID is 5408.
Some of its panels no longer match the exported metrics.