Splunk Search

How do I extract field values that are given as a set?

power12
Communicator

Hello Splunkers,
I have the following raw events:

2023-01-20 18:45:59.000, mod_time="1674240490", job_id="79" , time_submit="2023-01-20 10:04:55", time_eligible="2023-01-20 10:04:56", time_start="2023-01-20 10:45:59", time_end="2023-01-20 10:48:10", state="COMPLETED", exit_code="0", nodes_alloc="2", nodelist="abc[0002,0006]", submit_to_start_time="00:41:04", eligible_to_start_time="00:41:03", start_to_end_time="00:02:11"

2023-01-20 18:45:59.000, mod_time="1674240490", job_id="79" , time_submit="2023-01-20 10:04:55", time_eligible="2023-01-20 10:04:56", time_start="2023-01-20 10:45:59", time_end="2023-01-20 10:48:10", state="COMPLETED", exit_code="0", nodelist="ABC[0002-0004,0006-0008,0073,0081,0085-0086,0089-0090,0094-0095,0097-0098]" submit_to_start_time="00:41:04", eligible_to_start_time="00:41:03", start_to_end_time="00:02:11"

How do I extract or parse the nodelist value, e.g. nodelist="ABC[0002-0004,0006-0008,0073,0081,0085-0086,0089-0090,0094-0095,0097-0098]", into a new field called host? For the first event the host values would be host=abc0002 and host=abc0006. Similarly, for the second event they should be host=abc0002 host=abc0003 host=abc0004 host=abc0006 host=abc0007 host=abc0008 host=abc0073 host=abc0081 host=abc0095 host=abc0097 host=abc0098


Thanks in Advance

1 Solution

ITWhisperer
SplunkTrust
| rex "nodelist=\"(?<nodelist>[^\"]+)"
| rex field=nodelist "(?<prefix>\w+)\[(?<suffix>[^\]]+)\]"
| eval suffix=split(suffix,",")
| mvexpand suffix
| eval start=mvindex(split(suffix,"-"),0)
| eval end=mvindex(split(suffix,"-"),1)
| eval end=coalesce(end,start)
| eval range=mvrange(start,end+1)
| mvexpand range
| eval host=prefix.printf("%04d",range)


PickleRick
SplunkTrust

Create a calculated field based on the value of the nodelist field.

Having said that, don't overwrite the default host field. Pick another name (hostname, reporting_host, whatever); just don't use the default field name host.



power12
Communicator

@ITWhisperer Thank you for your reply. This also worked, but I was looking more for a multivalue field. Thanks for this search.


bowesmana
SplunkTrust

I think you missed some numbers in your second example

nodelist="ABC[0002-0004,0006-0008,0073,0081,0085-0086,0089-0090,0094-0095,0097-0098]"

as there should also be 85, 86, 89, 90 and 94, giving 16 in total.

Anyway, this search will extract the nodelist, expand all the range values, and then make a new multivalue field called host:

| rex "nodelist=\"(?<prefix>[^\[]*)\[(?<ids>[^\]]*)"
| eval ids=split(ids, ",")
| eval ranges=mvfilter(match(ids,"-"))
| eval ids=mvfilter(!match(ids,"-"))
| eval idlist=mvsort(mvappend(ids, mvmap(ranges, mvrange(tonumber(replace(ranges, "-\d+", "")), tonumber(replace(ranges, "\d+-", "")) + 1, 1))))
| eval host=mvmap(idlist, printf("%s%04d", prefix, idlist))
| fields - idlist ids prefix ranges

This is a bit convoluted, as it has to determine which values are individual ids and which are ranges. It first splits the ids on commas, then separates them into two fields: ranges (the values containing a -) and ids (the single values).

It then creates the idlist field, which holds every id needed (the single values plus each expanded range), and the final eval builds host by prepending the prefix to each zero-padded id.
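To see the intermediate fields described above, the same search can be run against a single synthetic event. This is only a sketch: the makeresults event and the shortened nodelist are made-up test data, and the final fields command is swapped for table so that prefix, ids, ranges and idlist stay visible next to host.

```spl
| makeresults
| eval _raw="job_id=\"79\", nodelist=\"ABC[0002-0004,0006-0008,0073]\""
| rex "nodelist=\"(?<prefix>[^\[]*)\[(?<ids>[^\]]*)"
| eval ids=split(ids, ",")
| eval ranges=mvfilter(match(ids,"-"))
| eval ids=mvfilter(!match(ids,"-"))
| eval idlist=mvsort(mvappend(ids, mvmap(ranges, mvrange(tonumber(replace(ranges, "-\d+", "")), tonumber(replace(ranges, "\d+-", "")) + 1, 1))))
| eval host=mvmap(idlist, printf("%s%04d", prefix, idlist))
| table prefix ids ranges idlist host
```

With this input, ranges should hold 0002-0004 and 0006-0008, ids should hold 0073, and host should contain seven values (ABC0002 to ABC0004, ABC0006 to ABC0008 and ABC0073). The order of the values within host may not be numeric, since mvsort is sorting a mix of zero-padded strings and expanded numbers.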

 

power12
Communicator

@bowesmana Thank you for your reply. Your search did work and this is what I am looking for, but it does not extract host when nodelist contains just a single node. For example, the event below does not show the host field:

2023-01-24 15:33:13.000, mod_time="1674575003", job_id="80771",, time_submit="2023-01-24 07:01:21", time_eligible="2023-01-24 07:01:21", time_start="2023-01-24 07:33:13", time_end="2023-01-24 07:43:23", state="COMPLETED", exit_code="0", nodes_alloc="1", nodelist="preos0098", submit_to_start_time="00:31:52", eligible_to_start_time="00:31:52", start_to_end_time="00:10:10"

 

For the event below, by contrast, the values are extracted into the new host field:

2023-01-24 16:02:44.000, mod_time="1674576788", job_id="80779", time_submit="2023-01-24 07:27:51", time_eligible="2023-01-24 07:27:51", time_start="2023-01-24 08:02:44", time_end="2023-01-24 08:13:08", state="TIMEOUT", exit_code="0", nodes_alloc="8", nodelist="abc[0022,0093,0098,0167,0177,0232,0268,0285]", submit_to_start_time="00:34:53", eligible_to_start_time="00:34:53", start_to_end_time="00:10:24"

 

 

 


bowesmana
SplunkTrust

You can change the rex statement in my example to

| rex "nodelist=\"(?<prefix>[^\[0-9]*)\[?(?<ids>[^\]\"]*)"

Instead of requiring a prefix followed by a []-bounded list of ids, this looks for a prefix followed by an optional [ and then the ids, which end at either ] or the closing quote.

Your prefix must then not contain numbers, though.
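A quick way to check the revised rex against both nodelist shapes is to feed it two synthetic samples. This is a sketch under assumptions: the sample field and its two values are made-up test data modelled on the events in this thread, and rex is pointed at that field rather than at _raw.

```spl
| makeresults
| eval sample=split("nodelist=\"preos0098\";nodelist=\"abc[0022,0093,0098]\"", ";")
| mvexpand sample
| rex field=sample "nodelist=\"(?<prefix>[^\[0-9]*)\[?(?<ids>[^\]\"]*)"
| table sample prefix ids
```

The first row should show prefix=preos and ids=0098, and the second prefix=abc and ids=0022,0093,0098, so the rest of the search can treat both forms the same way.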


power12
Communicator

I used coalesce and it worked:
| eval list=coalesce(host,nodelist)


bowesmana
SplunkTrust

FYI: both of these solutions will work. The key difference is that one (the mvexpand approach) creates a new event for each host, while the other puts all the host values into a multivalue field on the same event. Which one to use depends on what you want to do with the data afterwards.
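If you end up with one shape and need the other, converting between them is usually short. The sketch below uses made-up test data and assumes job_id identifies the original event; the field is named hostname here, following the earlier advice not to reuse host. mvexpand turns the multivalue field into one event per value, and stats values() collapses those events back into a multivalue field.

```spl
| makeresults
| eval job_id="79", hostname=split("abc0002,abc0006", ",")
| mvexpand hostname
| stats values(hostname) as hostname by job_id
```

Note that stats values() de-duplicates and sorts the values, and that only the fields named in the stats command survive it.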
