Splunk Search

Field Extraction - Separate on Colon?

holtb
Explorer

I'm trying to extract -all- the fields from a rather complex Oracle Grid Engine log file with a format like this:

all.q:50s01fb:clusterusers:zarakhov:transc4desc.fasta_$SGE_TASK_ID.qsub:5161781:sge:0:1344452577:1344452604:1344452617:0:0:13:8.781664:0.452931:0.000000:0:0:0:0:332711:716:0:0.000000:0:0:0:0:12147:13606:normal:MPRICompGenomics:NONE:1:987:9.234595:1.061786:0.000000:-u zarakhov -q all.q -l h_rt=21600,h_vmem=2G,mem_free=2G,mem_reserve=2G,virtual_free=2G:0.000000:NONE:383160320.000000:0:0

Obviously, I'd like to separate on the colons and dump each of the variables into a sequentially labeled variable (fieldname1-fieldname27 or whatnot). I've found fairly easy ways to extract the fields one at a time, is there a mechanism for doing in a more efficient way? Like the split command in perl?

Tags (1)
1 Solution

dmaislin_splunk
Splunk Employee
Splunk Employee

I think I get what you are trying to do and you need to break it into phases. Phase A gives you all 45 fields delimited with the colon, Phase B was a bit weird so field f40 is broken into two with the -l flag as the delimiter. Phase C breaks f46 apart using a REGEX by flag type. Finally, Phase D uses nested delims to break the f47 field into the final name/value pair. If you try this, just run a search:

sourcetype="yoursourcetype" | table f* h_* mem* vir*

props.conf

[yoursourcetype]
REPORT-extractions = a,b,c,d

transforms.conf

[a]
DELIMS = ":"
FIELDS= f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22, f23, f24, f25, f26, f27, f28, f29, f30, f31, f32, f33, f34, f35, f36, f37, f38, f39, f40, f41, f42, f43, f44, f45

[b]
SOURCE_KEY = f40
REGEX = (.*)-l(.*)
FORMAT = f46::$1 f47::$2

[c]
SOURCE_KEY = f46
REGEX = -u (.+?) -q (.+?)$
FORMAT = f48::$1 f49::$2

[d]
SOURCE_KEY=f47
DELIMS = ",","="

View solution in original post

dmaislin_splunk
Splunk Employee
Splunk Employee

I think I get what you are trying to do and you need to break it into phases. Phase A gives you all 45 fields delimited with the colon, Phase B was a bit weird so field f40 is broken into two with the -l flag as the delimiter. Phase C breaks f46 apart using a REGEX by flag type. Finally, Phase D uses nested delims to break the f47 field into the final name/value pair. If you try this, just run a search:

sourcetype="yoursourcetype" | table f* h_* mem* vir*

props.conf

[yoursourcetype]
REPORT-extractions = a,b,c,d

transforms.conf

[a]
DELIMS = ":"
FIELDS= f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22, f23, f24, f25, f26, f27, f28, f29, f30, f31, f32, f33, f34, f35, f36, f37, f38, f39, f40, f41, f42, f43, f44, f45

[b]
SOURCE_KEY = f40
REGEX = (.*)-l(.*)
FORMAT = f46::$1 f47::$2

[c]
SOURCE_KEY = f46
REGEX = -u (.+?) -q (.+?)$
FORMAT = f48::$1 f49::$2

[d]
SOURCE_KEY=f47
DELIMS = ",","="

dmaislin_splunk
Splunk Employee
Splunk Employee

Does this answer your question? If so, please accept.

0 Karma

yannK
Splunk Employee
Splunk Employee

take a look at the extract command for the search command.
... | extract pairdelim=",", kvdelim="="
http://docs.splunk.com/Documentation/Splunk/4.3.3/SearchReference/Extract

The equivalent exists for the configuration file /manager for automation.

0 Karma

holtb
Explorer

That's pretty neat, but I don't care about the key=value pairs in the middle, I'd rather keep that whole section intact. I just want to separate the many fields by : so I can search against them separately.

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

Splunk is officially part of Cisco

Revolutionizing how our customers build resilience across their entire digital footprint.   Splunk ...

Splunk APM & RUM | Planned Maintenance March 26 - March 28, 2024

There will be planned maintenance for Splunk APM and RUM between March 26, 2024 and March 28, 2024 as ...