Splunk Search

How to handle this kind of log using REGEX?

hchen11
Explorer

Hi,

I have a log line like the following:

rid=iqwenoasd service=CP scopes=add-w,oot-s fields=birthdate,emails,identifier issuer=AWS CA TE empty=

My expected key-value pairs are:

rid=iqwenoasd
service=CP
scopes=add-w,oot-s
fields=birthdate,emails,identifier
issuer=AWS CA TE
empty=null

There are some special cases in it (a value can contain space characters, and a value can be empty). How can I use REGEX on the search servers to achieve that? Using ([^=]+)=(.+) doesn't seem to work.

In transforms.conf:

[cp_logs_report]
REGEX = ([^=]+)=(.+)
FORMAT = $1::$2

In props.conf:

 

[my_sourcetype]
REPORT-cp-logs-ext = cp_logs_report
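
One way to see why ([^=]+)=(.+) does not give the expected pairs is to run it against the sample line outside Splunk. The sketch below is only an approximation: it uses Python's re module rather than Splunk's PCRE engine, and the variable names are illustrative. Since (.+) is greedy, the first match runs to the end of the line, so only one pair comes back.

```python
import re

# Sample event from the question.
line = ("rid=iqwenoasd service=CP scopes=add-w,oot-s "
        "fields=birthdate,emails,identifier issuer=AWS CA TE empty=")

# Pattern from the transforms.conf stanza above.
# The greedy (.+) consumes everything after the first "=".
original = re.compile(r"([^=]+)=(.+)")

pairs = original.findall(line)
print(len(pairs))    # number of key-value pairs found
print(pairs[0][0])   # key of the only match
print(pairs[0][1])   # value: the whole remainder of the line
```

The value of the single match contains every later key=value pair, which is why the extraction appears not to work.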

 

 

1 Solution

hchen11
Explorer

Finally, I used REGEX = (\S+)=([^=]*)(?:\h+|$)
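
To check how this pattern splits the sample event, here is a minimal sketch in Python. It is an assumption-laden stand-in for Splunk: Python's re module does not support the PCRE \h class, so [ \t] is used in its place, and the helper name extract_pairs is hypothetical.

```python
import re

# Splunk uses PCRE; Python's re has no \h, so [ \t] approximates
# "horizontal whitespace" here.
PAIR = re.compile(r"(\S+)=([^=]*)(?:[ \t]+|$)")

def extract_pairs(line):
    """Return {key: value} for every key=value pair found in line."""
    return {key: value for key, value in PAIR.findall(line)}

# Sample event from the question.
line = ("rid=iqwenoasd service=CP scopes=add-w,oot-s "
        "fields=birthdate,emails,identifier issuer=AWS CA TE empty=")

result = extract_pairs(line)
for key, value in result.items():
    print(f"{key}={value!r}")
```

In this sketch the value of issuer keeps its spaces and empty comes back as an empty string. Because the value group is [^=]*, a value that itself contains an = character would not be captured correctly.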


gcusello
SplunkTrust

Hi @hchen11,

did you try a regex like this?

 

| rex "rid\=(?<rid>[^ ]+)\s+service\=(?<service>[^ ]+)\s+scopes\=(?<scopes>[^ ]+)\s+fields\=(?<fields>[^ ]+)\s+issuer\=(?<issuer>.+)\s+empty\=(?<empty>.*)"

 

You can use it in SPL or in a field extraction.

You can test the regex at https://regex101.com/r/dNUKjy/1
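
The rex pattern above can also be tried locally. The following is a rough sketch in Python, not Splunk itself: Python's re module writes named groups as (?P<name>...) where the rex command accepts (?<name>...), and the variable names are illustrative.

```python
import re

# Same structure as the rex pattern, with Python's (?P<name>...) syntax
# for named groups and the unnecessary "\=" escapes dropped.
EVENT = re.compile(
    r"rid=(?P<rid>[^ ]+)\s+service=(?P<service>[^ ]+)\s+scopes=(?P<scopes>[^ ]+)"
    r"\s+fields=(?P<fields>[^ ]+)\s+issuer=(?P<issuer>.+)\s+empty=(?P<empty>.*)"
)

# Sample event from the question.
line = ("rid=iqwenoasd service=CP scopes=add-w,oot-s "
        "fields=birthdate,emails,identifier issuer=AWS CA TE empty=")

match = EVENT.search(line)
fields = match.groupdict() if match else {}
print(fields)
```

This pattern depends on every key being present and in this order; an event that is missing one of the keys would not match at all.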

Then, to have NULL in empty, you can use fillnull:

 

| fillnull empty value="NULL"

 

Ciao.

Giuseppe


hchen11
Explorer

Hi @gcusello ,

I would like to use REGEX in transforms.conf rather than in the Splunk Web UI.

Another question: should I use index-search on the index server (search head)?

gcusello
SplunkTrust

Hi @hchen11,

What do you mean by index-search?

Do you want to extract fields at index time?

If you configure the extraction above as a field extraction, the definitions live in props.conf and transforms.conf, so you don't need to set it up through the GUI or write it inline in SPL. Note, however, that these definitions are applied at search time, not at index time.

Ciao.

Giuseppe


hchen11
Explorer

Hi @gcusello ,

Yes, what I mean is extracting fields at index time.

What should I do if I want to use it at index time? Should I just change REPORT-* to TRANSFORMS-* and deploy these configs on the indexers?

gcusello
SplunkTrust

Hi @hchen11,

See https://docs.splunk.com/Documentation/Splunk/8.0.5/Data/Configureindex-timefieldextraction. But anyway, why do you want to extract fields at index time?

This is a good idea if you don't index many logs and you have many users and searches. Is this your situation?

Ciao.

Giuseppe

hchen11
Explorer

Hi @gcusello,

We don't want to impact performance when searching, and that's the main reason we want to do the extraction at index time. We have around 50 fields in the logs. Is that too many?

gcusello
SplunkTrust

Hi @hchen11,

The number of fields is one of the parameters, but not the most important one. The main ones are the available resources, the users, and the searches.

Anyway, extracting fields at index time requires more resources and time at index time, so if you have many events to index, it isn't a good idea.

Ciao.

Giuseppe

hchen11
Explorer

Hi @gcusello ,

I had not carefully considered before whether we need to extract at index time. Could you provide more details about how I should make this decision?

About resources, we have around 20 index servers and 3 search servers. About users and searches, around 3 users will use search and run about 100 searches daily on average. Furthermore, not all fields exist in every event.

Our discussion has been very helpful for me, and I hope I can get more suggestions from you.

gcusello
SplunkTrust

Hi @hchen11,

Designing the architecture of an infrastructure is a job for an Architect, who analyzes the requirements (users, scheduled searches, logs to index, apps, etc.) to define the correct number of servers, their resources, and their configurations.

As I said, extracting fields at index time is good practice if you don't have a high indexing load, because you put an additional load on the indexers at index time and gain an advantage at search time. But if you have many logs (and with around 20 Indexers I assume this is your situation) and only three users, it's probably better to extract fields at search time.

As I said, this should be analyzed in depth by an Architect.

Ciao.

Giuseppe
