ELK Logstash Performance Troubleshooting

2022-08-17


Performance Troubleshooting

You can use these troubleshooting tips to quickly diagnose and resolve Logstash performance problems. Advanced knowledge of pipeline internals is not required to understand this guide. However, the pipeline documentation is recommended reading if you want to go beyond these tips.

You may be tempted to jump ahead and change settings like pipeline.workers (-w) as a first attempt to improve performance. In our experience, changing this setting makes it more difficult to troubleshoot performance problems because you increase the number of variables in play. Instead, make one change at a time and measure the results. Starting at the end of this list is a sure-fire way to create a confusing situation.

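One way to "measure the results" of a single change is to compare event counters sampled before and after it. The sketch below assumes you have already fetched two snapshots of the events object from the Logstash monitoring API (GET /_node/stats/events, port 9600 by default) and parsed them into dictionaries. The function name and the sample values are hypothetical and only illustrate the arithmetic.

```python
def events_per_second(first, second, interval_seconds):
    """Average output rate between two snapshots of the `events` counters."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    emitted = second["out"] - first["out"]
    if emitted < 0:
        # Counters reset when Logstash restarts, so the delta is meaningless.
        raise ValueError("counter went backwards; was Logstash restarted?")
    return emitted / interval_seconds


# Hypothetical snapshots taken 10 seconds apart (values invented for illustration).
before = {"in": 120_000, "filtered": 119_500, "out": 119_500}
after = {"in": 170_000, "filtered": 169_500, "out": 169_500}

rate = events_per_second(before, after, 10.0)
print(f"{rate:.0f} events/s")
```

Recording this rate once before a change and once after it, under a comparable load, gives a simple baseline for deciding whether the change helped.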

 

Performance Checklist

  1. Check the performance of input sources and output destinations:
    • Logstash is only as fast as the services it connects to. It can only consume and produce data as fast as its input sources and output destinations allow.
  2. Check system statistics:
    • CPU
      • Note whether the CPU is being heavily used. On Linux/Unix, you can run top -H to see process statistics broken out by thread, as well as total CPU statistics.
      • If CPU usage is high, skip forward to the section about checking the JVM heap and then read the section about tuning Logstash worker settings.
    • Memory
      • Be aware that Logstash runs on the Java VM. This means that Logstash will always use the maximum amount of memory you allocate to it.
      • Look for other applications that use large amounts of memory and may be causing Logstash to swap to disk. This can happen if the total memory used by applications exceeds physical memory.
    • I/O Utilization
      • Monitor disk I/O to check for disk saturation.
        • Disk saturation can happen if you’re using Logstash plugins (such as the file output) that may saturate your storage.
        • Disk saturation can also happen if you’re encountering a lot of errors that force Logstash to generate large error logs.
        • On Linux, you can use iostat, dstat, or something similar to monitor disk I/O.
      • Monitor network I/O for network saturation.
        • Network saturation can happen if you’re using inputs/outputs that perform a lot of network operations.
        • On Linux, you can use a tool like dstat or iftop to monitor your network.
  3. Check the JVM heap:
    • The recommended heap size for typical ingestion scenarios should be no less than 4GB and no more than 8GB.
    • CPU utilization can increase unnecessarily if the heap size is too low, resulting in the JVM constantly garbage collecting. You can check for this issue by doubling the heap size to see if performance improves.
    • Do not increase the heap size past the amount of physical memory. Some memory must be left to run the OS and other processes. As a general guideline for most installations, don’t exceed 50-75% of physical memory. The more memory you have, the higher percentage you can use.
    • Set the minimum (Xms) and maximum (Xmx) heap allocation size to the same value to prevent the heap from resizing at runtime, which is a very costly process.
    • You can make more accurate measurements of the JVM heap by using either the jmap command line utility distributed with Java or VisualVM. For more info, see Profiling the Heap.
  4. Tune Logstash worker settings:
    • Begin by scaling up the number of pipeline workers by using the -w flag. This will increase the number of threads available for filters and outputs. It is safe to scale this up to a multiple of CPU cores, if need be, as the threads can become idle on I/O.
    • You may also tune the output batch size. For many outputs, this setting corresponds to the size of I/O operations. In the case of the Elasticsearch output, it corresponds to the batch size.
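The heap guidelines in the checklist can be turned into a quick sanity check. The following is a minimal sketch, not an official tool: the function names are hypothetical, the thresholds are simply the ones quoted above (4GB to 8GB, and at most 75% of physical memory), and the generated flags are the standard JVM -Xms/-Xmx options that Logstash typically reads from config/jvm.options.

```python
def check_heap(heap_gb, physical_gb):
    """Return warnings for a proposed heap size, per the checklist's guidelines."""
    warnings = []
    if heap_gb < 4:
        warnings.append("heap is below the 4GB recommended minimum")
    if heap_gb > 8:
        warnings.append("heap is above the 8GB recommended maximum")
    if heap_gb > 0.75 * physical_gb:
        warnings.append("heap exceeds 75% of physical memory")
    return warnings


def jvm_heap_flags(heap_gb):
    """Build matching -Xms/-Xmx flags so the heap never resizes at runtime."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


# Hypothetical host with 16GB of RAM and a proposed 6GB heap.
for warning in check_heap(6, 16):
    print("warning:", warning)
print("\n".join(jvm_heap_flags(6)))
```

For the doubling experiment described above, you could run the same check on the doubled value first to confirm that the larger heap still leaves room for the OS and other processes.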

 
