Logstash - grok filter
grok filter 能讓我們使用 Grok 語法簡易的切割 Logstash field。
其可使用的設定如下:
Setting | Type | Required | Default | Description |
---|---|---|---|---|
add_field | hash | No | {} | If this filter is successful, add any arbitrary fields to this event. |
add_tag | array | No | [] | If this filter is successful, add arbitrary tags to the event. |
break_on_match | boolean | No | true | Break on first match. The first successful match by grok will result in the filter being finished. If you want grok to try all patterns (maybe you are parsing different things), then set this to false. |
enable_metric | boolean | No | true | Disable or enable metric logging for this specific plugin instance by default we record all the metrics we can, but you can disable metrics collection for a specific plugin. |
id | string | No | Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. | |
keep_empty_captures | boolean | No | false | If true, keep empty captures as event fields. |
match | hash | No | {} | A hash of matches of field ⇒ value |
named_captures_only | boolean | No | true | If true, only store named captures from grok. |
overwrite | array | No | [] | allows you to overwrite a value in a field that already exists. |
patterns_dir | array | No | [] | |
patterns_files_glob | string | No | “*” | |
periodic_flush | boolean | No | false | Call the filter flush method at regular interval. |
remove_field | array | No | [] | If this filter is successful, remove arbitrary fields from this event. |
remove_tag | array | No | [] | If this filter is successful, remove arbitrary tags from the event. |
tag_on_failure | array | No | ["_grokparsefailure"] | Append values to the tags field when there has been no successful match |
tag_on_timeout | string | No | “_groktimeout” | Tag to apply if a grok regexp times out. |
timeout_millis | number | No | 30000 | Attempt to terminate regexps after this amount of time. |
最常用的就是 match 設定,可用 Grok 語法設定要如何將 message 切出 Logstash field。像是下面這邊設定訊息是由 IP、WORD、URIPATHPARAM、NUMBER、NUMBER 所組成,如果符合這樣組成的訊息,則依序將資料拆分為 client、 method、 request、 bytes、 duration 這幾個 Logstash field。
/opt/logstash/bin/logstash -e 'input { stdin{} } filter { grok { match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" } } } output { stdout { codec => rubydebug } }'
match 設定可以一次設定多組,預設會依序照設定處理,如果訊息滿足設定條件,則會終止向下處理。但有的時候我們會希望讓 Logstash 跑完所有的設定,這時可以將 break_on_match 設為 false。
/opt/logstash/bin/logstash -e 'input { stdin{} } filter { grok { break_on_match => false match => { "message" => "%{GREEDYDATA:messagebody}" } match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" } } } output { stdout { codec => rubydebug } }'
如果要在 Grok 設定滿足時順帶設定額外的 Logstash field,可使用 add_field 設定。
/opt/logstash/bin/logstash -e 'input { stdin{} } filter { grok { match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" } add_field => { "msg" => "Hello World" } } } output { stdout { codec => rubydebug } }'
如果要增設額外的 Logstash tag,則使用 add_tag 設定。
/opt/logstash/bin/logstash -e 'input { stdin{} } filter { grok { match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" } add_tag => { "msg" => "Hello World" } } } output { stdout { codec => rubydebug } }'
如果要將現有的 Logstash field 覆蓋,可使用 overwrite 設定。
/opt/logstash/bin/logstash -e 'input { stdin{} } filter { grok { match => { "message" => "%{IP:message}" } overwrite => ["message"] } } output { stdout { codec => rubydebug } }'