I want to get some ideas on search-time field extraction.
I already know the precedence among host, source, and source type stanzas.
I also know that search-time precedence follows the order below:
But I would like to understand the questions below:
[host::test]
EXTRACT-a = <regex that extracts field a="h1">
EVAL-b = "h2"
[source::test]
EXTRACT-a = <regex that extracts field a="s1">
EVAL-b = "s2"
EVAL-c = "s3"
EXTRACT-e = <regex that extracts field common="s4">
[test]
EXTRACT-a = <regex that extracts field a="st1">
EVAL-b = "st2"
EXTRACT-c = <regex that extracts field c="st3">
EXTRACT-d = <regex that extracts field common="st4">
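Below is a props.conf that describes all the scenarios, along with the behavior of precedence and execution.
There are two stages: first precedence (which parameters survive when the stanzas are merged), then execution (the order in which the surviving parameters are applied).
This is the sample props.conf I used to test all of the above concepts.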
# Sample Event Generator:
# | makeresults | eval _raw="a:1, b:2, c:3, d:4, e:5, f:6, g:7, h:8, i:9, j:10, k:11, l:12" | collect index=main source="my_source" sourcetype="my_sourcetype" host="my_host"
[source::my_source]
# Test-1: The first test is simple: if the same parameter is defined in both the source and sourcetype stanzas, the one from the source stanza is applied.
# Result - Only EVAL-first1 from the source stanza is applied; in the final merged props, the parameter from the sourcetype stanza is not present.
# As proof, check the EXTRACT statement: first2 is not extracted even though the regex for first2 in the sourcetype stanza is correct, because that parameter is not in the final list of parameters.
EVAL-first1 = "first from source"
EXTRACT-first2 = a::(?<first2>\d+)
# Test-2: There are two separate parts: first precedence, then execution.
# In this example, both stanzas extract the fields second1 and second2, with different values.
# Result - EXTRACT-a is applied first, then b, then c, then d.
# And in this case it will override the value extracted by the previous EXTRACT parameter.
EXTRACT-a = a:(?<second1>\d+)
EXTRACT-d = c:(?<second2>\d+)
[my_sourcetype]
# Test-1
EVAL-first1 = "first from sourcetype"
EXTRACT-first2 = b:(?<first2>\d+)
# Test-2
EXTRACT-b = b:(?<second1>\d+)
EXTRACT-c = d:(?<second2>\d+)
# Test-3: Is a value extracted by EXTRACT overwritten by EVAL?
# Result - Yes. EVAL overwrites the value extracted by EXTRACT.
EXTRACT-e = e:(?<third>\d+)
EVAL-third = "third from eval"
# Test-4: Is a value extracted by the first EXTRACT overwritten by a second EXTRACT?
# Result - No. EXTRACT-forth1 and EXTRACT-forth2 both extract the field forth1, but the value is assigned by EXTRACT-forth1.
# In the second case, EXTRACT-forth3 has a wrong regex that extracts nothing, so the value assigned by EXTRACT-forth4 is kept for the field forth2.
EXTRACT-forth1 = a:(?<forth1>\d+)
EXTRACT-forth2 = b:(?<forth1>\d+)
EXTRACT-forth3 = c::(?<forth2>\d+)
EXTRACT-forth4 = d:(?<forth2>\d+)
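The results reported for Test-1, Test-3 and Test-4 can be reproduced with a small Python model. This is only a sketch of the behavior described in the comments above, not Splunk's implementation: the dictionaries, `merge_stanzas` and `apply_props` are hypothetical names, EVAL expressions are reduced to their literal string values, and Test-2 is left out.

```python
import re

EVENT = "a:1, b:2, c:3, d:4, e:5, f:6, g:7, h:8, i:9, j:10, k:11, l:12"

# Hypothetical in-memory stand-ins for the two stanzas (Tests 1, 3 and 4 only).
# Python spells named groups (?P<name>...) where props.conf uses (?<name>...).
SOURCE_STANZA = {
    "EVAL-first1": "first from source",
    "EXTRACT-first2": r"a::(?P<first2>\d+)",
}
SOURCETYPE_STANZA = {
    "EVAL-first1": "first from sourcetype",
    "EXTRACT-first2": r"b:(?P<first2>\d+)",
    "EXTRACT-e": r"e:(?P<third>\d+)",
    "EVAL-third": "third from eval",
    "EXTRACT-forth1": r"a:(?P<forth1>\d+)",
    "EXTRACT-forth2": r"b:(?P<forth1>\d+)",
    "EXTRACT-forth3": r"c::(?P<forth2>\d+)",
    "EXTRACT-forth4": r"d:(?P<forth2>\d+)",
}


def merge_stanzas(sourcetype, source):
    """Stage 1 (precedence): a source parameter replaces the same-named sourcetype one."""
    merged = dict(sourcetype)
    merged.update(source)
    return merged


def apply_props(event, props):
    """Stage 2 (execution): EXTRACT in class-name order, first value wins, then EVAL."""
    fields = {}
    for key in sorted(k for k in props if k.startswith("EXTRACT-")):
        match = re.search(props[key], event)
        if match is None:
            continue  # a regex that does not match contributes nothing
        for name, value in match.groupdict().items():
            fields.setdefault(name, value)  # keep the earlier value if one exists
    for key in sorted(k for k in props if k.startswith("EVAL-")):
        fields[key[len("EVAL-"):]] = props[key]  # EVAL overwrites extracted values
    return fields


fields = apply_props(EVENT, merge_stanzas(SOURCETYPE_STANZA, SOURCE_STANZA))
print(fields)
```

In this model `first1` comes from the source stanza, `first2` is absent, `third` holds the EVAL value, `forth1` is `1` and `forth2` is `4`, which matches the results stated in the comments.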
Edit: Adding some test results with REPORT and multi-valued fields. Behavior with REPORT is pretty much the same as with EXTRACT.
props.conf
# Tests with REPORT
# Test-5: Behaviour within the same REPORT
# Result - As with EXTRACT, the second transform in the REPORT does not overwrite the value from the first one
REPORT-first_report = first_report1, first_report2
# Test-6: Behaviour across different classes
# Result - In this case also, the second REPORT does not overwrite the value from the first REPORT
REPORT-second_report1 = second_report1
REPORT-second_report2 = second_report2
# Test-7: Behaviour with MV_ADD within the same class
# Result - The new value is added to the field
REPORT-third_report = third_report1, third_report2
# Test-8: Behaviour with MV_ADD when using different classes
# Result - The new value is added to the field in this case as well
REPORT-fourth_report1 = fourth_report1
REPORT-fourth_report2 = fourth_report2
# Test-9: Behaviour with MV_ADD when MV_ADD is omitted from one of the transforms.conf stanzas
# Result - A transforms.conf stanza that does not have MV_ADD cannot overwrite the value.
# In this case, the field fifth_report is generated with two values, 1 and 2
REPORT-fifth_report = fifth_report1, fifth_report2, fifth_report3
transforms.conf
[first_report1]
REGEX = a:(?<first_report>\d+)
[first_report2]
REGEX = b:(?<first_report>\d+)
[second_report1]
REGEX = a:(?<second_report>\d+)
[second_report2]
REGEX = b:(?<second_report>\d+)
[third_report1]
REGEX = a:(?<third_report>\d+)
[third_report2]
REGEX = b:(?<third_report>\d+)
MV_ADD = true
[fourth_report1]
REGEX = a:(?<fourth_report>\d+)
[fourth_report2]
REGEX = b:(?<fourth_report>\d+)
MV_ADD = true
[fifth_report1]
REGEX = a:(?<fifth_report>\d+)
[fifth_report2]
REGEX = b:(?<fifth_report>\d+)
MV_ADD = true
[fifth_report3]
REGEX = c:(?<fifth_report>\d+)
# Skipping MV_ADD here
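The REPORT results for Test-5 through Test-9 can be modeled the same way. The sketch below assumes only what the comments above report: transforms run in the order listed, a stanza without MV_ADD cannot change a field that already has a value, and a stanza with MV_ADD appends to it. `TRANSFORMS`, `REPORTS` and `apply_reports` are hypothetical names, not part of Splunk.

```python
import re

EVENT = "a:1, b:2, c:3, d:4, e:5, f:6, g:7, h:8, i:9, j:10, k:11, l:12"

# Hypothetical stand-in for transforms.conf: stanza name -> (regex, MV_ADD flag).
# Python spells named groups (?P<name>...) where the conf files use (?<name>...).
TRANSFORMS = {
    "first_report1": (r"a:(?P<first_report>\d+)", False),
    "first_report2": (r"b:(?P<first_report>\d+)", False),
    "second_report1": (r"a:(?P<second_report>\d+)", False),
    "second_report2": (r"b:(?P<second_report>\d+)", False),
    "third_report1": (r"a:(?P<third_report>\d+)", False),
    "third_report2": (r"b:(?P<third_report>\d+)", True),
    "fourth_report1": (r"a:(?P<fourth_report>\d+)", False),
    "fourth_report2": (r"b:(?P<fourth_report>\d+)", True),
    "fifth_report1": (r"a:(?P<fifth_report>\d+)", False),
    "fifth_report2": (r"b:(?P<fifth_report>\d+)", True),
    "fifth_report3": (r"c:(?P<fifth_report>\d+)", False),  # MV_ADD skipped
}

# Hypothetical stand-in for the REPORT lines in props.conf.
REPORTS = {
    "REPORT-first_report": ["first_report1", "first_report2"],
    "REPORT-second_report1": ["second_report1"],
    "REPORT-second_report2": ["second_report2"],
    "REPORT-third_report": ["third_report1", "third_report2"],
    "REPORT-fourth_report1": ["fourth_report1"],
    "REPORT-fourth_report2": ["fourth_report2"],
    "REPORT-fifth_report": ["fifth_report1", "fifth_report2", "fifth_report3"],
}


def apply_reports(event, reports, transforms):
    """Apply REPORT classes in name order and their transforms in listed order."""
    fields = {}
    for report in sorted(reports):
        for stanza in reports[report]:
            regex, mv_add = transforms[stanza]
            match = re.search(regex, event)
            if match is None:
                continue
            for name, value in match.groupdict().items():
                if name not in fields:
                    fields[name] = [value]      # first value for this field
                elif mv_add:
                    fields[name].append(value)  # MV_ADD appends a value
                # without MV_ADD the existing value is left untouched
    return fields


report_fields = apply_reports(EVENT, REPORTS, TRANSFORMS)
print(report_fields)
```

Under these assumptions `first_report` and `second_report` keep the single value `1`, while `third_report`, `fourth_report` and `fifth_report` each end up with the two values `1` and `2`.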