Getting Data In

How do I extract fields from a single file containing multiple sourcetypes, each with multiline events?

mzorzi
Splunk Employee

My source file is like:

============================

App01trace

3 0 393222 0 19 148 8838300

4 0 458759 0 29 15 0

4 0 458759 0 31 12 0

5 0 524296 0 61 170 8869500

App02trace

4 0 327685 2032 0 0 0 0 NULL

6 0 393222 2032 0 0 0 0 NULL

5 0 458760 2032 0 0 0 0 NULL

App03trace

21 1 2959165 3 8 1 1 P

22 9 859165 3 12 6 1 R

============================

I would like to assign a different sourcetype to each App0Xtrace block (App01trace, App02trace, App03trace), and then map each value to a field specific to that sourcetype.

I've tried the following steps:

  1. In inputs.conf I assign my source file to a fixed sourcetype [testbb]
  2. I then use [testbb] to define a stanza in props.conf, like this:

    [testbb]
    SHOULD_LINEMERGE = true
    BREAK_ONLY_BEFORE_DATE = false
    LINE_BREAKER = ([\r\n]+)(App01trace|App02trace|App03trace)
    TRANSFORMS-App01trace=tr-App01trace
    TRANSFORMS-App02trace=tr-App02trace
    TRANSFORMS-App03trace=tr-App03trace

  3. In transforms.conf I extract the three different sourcetypes, like:

    [tr-App01trace]
    REGEX = App01trace
    DEST_KEY = MetaData:Sourcetype
    FORMAT = sourcetype::App01trace

  4. Finally I create another stanza in props.conf for sourcetype [App01trace], where I perform the search-time extractions (using EXTRACT)

My problem is that it only extracts the fields from the first line of each sub-block, not from every line. So, supposing the third field for App03trace is called app03field3, I get app03field3=2959165, but the value 859165 doesn't get extracted.

That doesn't surprise me, as the LINE_BREAKER has already been applied to that stream, so each sub-block is a single multiline event.
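To make the symptom concrete, here is a minimal Python sketch (not Splunk configuration) that imitates the behaviour described above. It assumes the documented LINE_BREAKER rule that the first capture group is discarded as the delimiter and the next event starts right after it. The names `break_events` and `THIRD` are hypothetical helpers for illustration, and the stream is a shortened copy of the sample file.

```python
import re

# Shortened copy of the sample file (two of the three blocks).
STREAM = (
    "App01trace\n"
    "3 0 393222 0 19 148 8838300\n"
    "4 0 458759 0 29 15 0\n"
    "App03trace\n"
    "21 1 2959165 3 8 1 1 P\n"
    "22 9 859165 3 12 6 1 R\n"
)

# Same pattern as the LINE_BREAKER in the question.
BREAKER = re.compile(r"([\r\n]+)(App01trace|App02trace|App03trace)")


def break_events(stream):
    """Cut the stream at each BREAKER match, dropping group 1 (the newlines)."""
    events, start = [], 0
    for m in BREAKER.finditer(stream):
        events.append(stream[start:m.start(1)])
        start = m.end(1)  # the next event begins with the header text
    events.append(stream[start:])
    return [e.strip("\r\n") for e in events if e.strip()]


events = break_events(STREAM)

# Third column of a data row; the header line has no spaces, so it never matches.
THIRD = re.compile(r"^\S+[ \t]+\S+[ \t]+(\S+)", re.MULTILINE)

first_only = THIRD.search(events[1]).group(1)  # one match per event
all_rows = THIRD.findall(events[1])            # every row in the event

print(len(events))   # number of events, one per header block
print(first_only)
print(all_rows)
```

Each header block ends up as one event holding several rows, so an extraction that is applied once per event only ever sees the first row, while a repeated match sees them all.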

What should I change to achieve my goal?

Thanks for your help.


mzorzi
Splunk Employee

Hi David,

Thanks for your answer. I tried it and it works great!

What I forgot to mention is that I need one event per line, so that a search for sourcetype=App01trace field3=393222 field5=19 returns only this line:

3 0 393222 0 19 148 8838300
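The requirement above can be sketched in Python to show the intended result. This is only an illustration of the target data model (one event per row, tagged with the most recent header), not a Splunk configuration. The function `per_line_events` and the generic `fieldN` names are assumptions made for the example.

```python
SOURCE = """\
App01trace
3 0 393222 0 19 148 8838300
4 0 458759 0 29 15 0
4 0 458759 0 31 12 0
5 0 524296 0 61 170 8869500
App02trace
4 0 327685 2032 0 0 0 0 NULL
App03trace
21 1 2959165 3 8 1 1 P
22 9 859165 3 12 6 1 R
"""

HEADERS = {"App01trace", "App02trace", "App03trace"}


def per_line_events(text):
    """Yield one event per data row, tagged with the most recent header line."""
    sourcetype = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between rows
        if line in HEADERS:
            sourcetype = line  # remember the block we are in
            continue
        event = {"sourcetype": sourcetype, "_raw": line}
        # Generic positional names; real field names would differ per sourcetype.
        for i, value in enumerate(line.split(), start=1):
            event[f"field{i}"] = value
        yield event


events = list(per_line_events(SOURCE))

# Equivalent of: sourcetype=App01trace field3=393222 field5=19
hits = [
    e["_raw"]
    for e in events
    if e["sourcetype"] == "App01trace"
    and e["field3"] == "393222"
    and e["field5"] == "19"
]
print(hits)
```

Because every row is its own event, the two field conditions are checked against the same row, which is what a single multiline event cannot guarantee.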


hazekamp
Builder

mzorzi,

You may want to switch from EXTRACT to REPORT. That way you can set MV_ADD = True in transforms.conf, which creates multivalue fields from your search-time extraction.

Given a sample of:

App01trace
3 0 393222 0 19 148 8838300
4 0 458759 0 29 15 0
4 0 458759 0 31 12 0
5 0 524296 0 61 170 8869500 

The following configuration:

## props.conf
[App01trace]
REPORT-kv_for_app01trace = kv_for_app01trace

## transforms.conf
[kv_for_app01trace]
REGEX = ([^\s]+)\s([^\s]+)\s([^\s]+)\s([^\s]+)\s([^\s]+)\s([^\s]+)\s([^\s]+)
FORMAT = field1::$1 field2::$2 field3::$3 field4::$4 field5::$5 field6::$6 field7::$7
MV_ADD = True

Will give you the following fields and values:

field1 = [3, 4, 4, 5]
field2 = [0, 0, 0, 0]
field3 = [393222, 458759, 458759, 524296]
field4 = [0, 0, 0, 0]
field5 = [19, 29, 31, 61]
field6 = [148, 15, 12, 170]
field7 = [8838300, 0, 0, 8869500]

There are additional ways of performing extractions using search commands, but this is the best way to do it via props.conf and transforms.conf.

-David
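As a rough illustration of what repeated matching with accumulation (the MV_ADD idea) produces, here is a Python sketch run against the sample event from the answer. It is not Splunk code, and `multivalue_fields` is a hypothetical helper. One deliberate difference: in Python's `re`, `\s` also matches newlines, so the sketch anchors each match to a single line and separates tokens with spaces or tabs only, which keeps the header line out of the results. Whether the regex in the answer needs the same anchoring inside Splunk has not been verified here.

```python
import re
from collections import defaultdict

EVENT = (
    "App01trace\n"
    "3 0 393222 0 19 148 8838300\n"
    "4 0 458759 0 29 15 0\n"
    "4 0 458759 0 31 12 0\n"
    "5 0 524296 0 61 170 8869500 \n"
)

# Seven tokens on one line, separated by spaces or tabs (never newlines).
ROW = re.compile(
    r"^" + r"[ \t]+".join([r"(\S+)"] * 7) + r"[ \t]*$",
    re.MULTILINE,
)


def multivalue_fields(event):
    """Apply ROW repeatedly and append every capture to its field's list."""
    fields = defaultdict(list)
    for match in ROW.finditer(event):
        for i, value in enumerate(match.groups(), start=1):
            fields[f"field{i}"].append(value)
    return dict(fields)


fields = multivalue_fields(EVENT)
for name in sorted(fields):
    print(name, "=", fields[name])
```

Note that the values of one row are only linked by their position in each list. A condition such as field3=393222 combined with field5=29 would still be satisfied by this single event, which is why one event per line is needed when rows must be searched individually.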
