Hello Splunkers
I have the following raw events
2023-01-20 18:45:59.000, mod_time="1674240490", job_id="79" , time_submit="2023-01-20 10:04:55", time_eligible="2023-01-20 10:04:56", time_start="2023-01-20 10:45:59", time_end="2023-01-20 10:48:10", state="COMPLETED", exit_code="0", nodes_alloc="2", nodelist="abc[0002,0006]", submit_to_start_time="00:41:04", eligible_to_start_time="00:41:03", start_to_end_time="00:02:11"
2023-01-20 18:45:59.000, mod_time="1674240490", job_id="79" , time_submit="2023-01-20 10:04:55", time_eligible="2023-01-20 10:04:56", time_start="2023-01-20 10:45:59", time_end="2023-01-20 10:48:10", state="COMPLETED", exit_code="0", nodelist="ABC[0002-0004,0006-0008,0073,0081,0085-0086,0089-0090,0094-0095,0097-0098]" submit_to_start_time="00:41:04", eligible_to_start_time="00:41:03", start_to_end_time="00:02:11"
How do I extract or parse nodelist="ABC[0002-0004,0006-0008,0073,0081,0085-0086,0089-0090,0094-0095,0097-0098]" into a new field called host? For the first event the values would be host=abc0002 and host=abc0006. Similarly, for the second event they should be host=abc0002 host=abc0003 host=abc0004 host=abc0006 host=abc0007 host=abc0008 host=abc0073 host=abc0081 host=abc0095 host=abc0097 host=abc0098.
Thanks in Advance
| rex "nodelist=\"(?<nodelist>[^\"]+)"
| rex field=nodelist "(?<prefix>\w+)\[(?<suffix>[^\]]+)\]"
| eval suffix=split(suffix,",")
| mvexpand suffix
| eval start=mvindex(split(suffix,"-"),0)
| eval end=mvindex(split(suffix,"-"),1)
| eval end=coalesce(end,start)
| eval range=mvrange(start,end+1)
| mvexpand range
| eval host=prefix.printf("%04d",range)
Create a calculated field based on the value of the nodelist field.
Having said that, don't overwrite the default host field. Pick another name (hostname, reporting_host, whatever); just don't use the default field name host.
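As a minimal sketch of that advice, only the final eval of the search above needs to change. The name hostname is just an illustrative choice, not something the original data defines:

```spl
| eval hostname=prefix.printf("%04d",range)
```

Anything downstream that referred to host in the expanded results would then need to refer to hostname instead.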
@ITWhisperer Thank you for your reply. This also worked, but I was looking for a multivalue field. Thanks for this search.
I think you missed some numbers in your second example:
nodelist="ABC[0002-0004,0006-0008,0073,0081,0085-0086,0089-0090,0094-0095,0097-0098]"
There should also be 85, 86, 89, 90 and 94, giving 16 hosts in total.
Anyway, this search extracts the nodelist, expands all the range values, and creates a new multivalue field called host:
| rex "nodelist=\"(?<prefix>[^\[]*)\[(?<ids>[^\]]*)"
| eval ids=split(ids, ",")
| eval ranges=mvfilter(match(ids,"-"))
| eval ids=mvfilter(!match(ids,"-"))
| eval idlist=mvsort(mvappend(ids, mvmap(ranges, mvrange(tonumber(replace(ranges, "-\d+", "")), tonumber(replace(ranges, "\d+-", "")) + 1, 1))))
| eval host=mvmap(idlist, printf("%s%04d", prefix, idlist))
| fields - idlist ids prefix ranges
This is a bit convoluted, because it has to work out which entries are individual values and which are ranges. It first splits the ids and separates the single numbers and the ranges into the ids and ranges fields.
It then builds the idlist field, which holds every id needed, and the final eval host= turns that list into the set of host names.
@bowesmana Thank you for your reply. Your search worked and this is what I am looking for, but it does not extract host when nodelist contains just a single node. For example, the event below does not show the host field:
2023-01-24 15:33:13.000, mod_time="1674575003", job_id="80771",, time_submit="2023-01-24 07:01:21", time_eligible="2023-01-24 07:01:21", time_start="2023-01-24 07:33:13", time_end="2023-01-24 07:43:23", state="COMPLETED", exit_code="0", nodes_alloc="1", nodelist="preos0098", submit_to_start_time="00:31:52", eligible_to_start_time="00:31:52", start_to_end_time="00:10:10"
In the event below, you can see that it is extracted into a new field called host.
You can change the rex statement in my example to
| rex "nodelist=\"(?<prefix>[^\[0-9]*)\[?(?<ids>[^\]\"]*)"
Instead of looking for a prefix followed by a []-bounded list of ids, it looks for a prefix followed either by a bracketed list or directly by a number, ending at the closing quote.
Your prefix must then not contain any digits, though.
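A quick way to check the revised rex against both nodelist shapes is a run-anywhere search. This is only a sketch: the two _raw values are cut-down stand-ins for the real events, and n is a helper field introduced here to tell the two test rows apart.

```spl
| makeresults count=2
| streamstats count as n
| eval _raw=if(n=1, "nodelist=\"preos0098\"", "nodelist=\"abc[0002,0006]\"")
| rex "nodelist=\"(?<prefix>[^\[0-9]*)\[?(?<ids>[^\]\"]*)"
| table _raw prefix ids
```

If the rex behaves as described, the single-node row should show prefix and ids split at the first digit, and the bracketed row should show the comma-separated list in ids.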
I used coalesce and it worked
| eval list=coalesce(host,nodelist)
FYI: both of these solutions work. The key difference is that one creates a new event for each host and the other puts all the host values into the same event. Which to use depends on what you want to do with the data afterwards.
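If you later need the other shape, it should be possible to convert between the two. As a sketch, assuming the multivalue field is called host, appending this to the multivalue search gives one event per host:

```spl
| mvexpand host
```

Going the other way, assuming job_id is extracted and identifies each job, this collapses per-host events back into one multivalue field per job (values() also sorts and de-duplicates):

```spl
| stats values(host) as host by job_id
```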