Splunk Search

Log Field extraction When should i do it?

linu1988
Champion

Hello Guyz,
I have to extract around 30/40 fields from logs and monitor them. They are well formatted and can be extracted easily through regex. I am just concerned where should i do it?

While indexing the logs or while searching? I mean keeping an eye on performance.

Sample Data

[Date][PreciseTime][Time][Pid][Tid][SrcFile][Function][TransactionID][AgentName][Resource][User][Group][Realm][Domain][Directory][Policy][AgentType][Rule][ErrorValue][ReturnValue][ErrorString][IPAddr][IPPort][Result][Returns][CallDetail][Data][Message] 
[====][===========][====][===][===][=======][========][=============][=========][========][====][=====][=====][======][=========][][][][==========][===========][===========][======][======][======][=======][==========][====][=======]

where "===" -> the data. It may have or may bot have value.

| rex field=_raw "\[(?<DDD>[\d\/]+)\]\[(?<DDD1>[\d:\.]+)\]\[(?<DDD2>[\d\:]+)\]\[(?<DDD3>[\d]+)\]\[(?<DDD4>[\d]+)\]\[(?<DDD5>[A-Za-z_\.\:\d]+)\]"|table DDD,DDD1,DDD2,DDD3,DDD4,DDD5

Just Planing the regex as well for them. Is that okay to set while indexing. And how do i mention something in the [] than [A-Za-z_.:\d] where i may miss some character?

Any kind of suggestion is welcome.

Thank you

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

In almost every case you'll want search time extractions, simple ones as EXTRACT-foo and more complex ones as REPORT-bar with a corresponding transforms.conf stanza [bar]. Only use indexed fields if you have a good reason to, such as values that commonly exist outside a field killing searchtime filtering performance.

As for your character classes, consider using [^]]* for your data fields to match until before the closing square bracket.

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

In almost every case you'll want search time extractions, simple ones as EXTRACT-foo and more complex ones as REPORT-bar with a corresponding transforms.conf stanza [bar]. Only use indexed fields if you have a good reason to, such as values that commonly exist outside a field killing searchtime filtering performance.

As for your character classes, consider using [^]]* for your data fields to match until before the closing square bracket.

linu1988
Champion

I need to put stats from the extracted the fields from the logs. As you suggested i will go with search time extraction seems flexible and i will see if there is frequent use i will schedule the search. Thank you for your help.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Indextime field extractions will put some load on your indexer, yeah - but the bigger disadvantage I see is that you lose the flexibility of Splunk's schema-on-the-fly searchtime extractions.

As for dashboards, those launch regular searches so it doesn't matter much if a search is on a dashboard or not. If you have a high number of users frequently loading the same dashboard with identical searches you're often better off just scheduling the searches behind the dashboard.

What's best for your case depends on your case though.

0 Karma

linu1988
Champion

Thanks Martin. So how if i do it in index time, will the load on the index will be more? And when the extraction happens at search with every use is it a good approach for dashboards? I have no intention of summarizing them as they would be just reference for 1-3 days

0 Karma
Get Updates on the Splunk Community!

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

 Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...

Splunk Observability for AI

Don’t miss out on an exciting Tech Talk on Splunk Observability for AI!Discover how Splunk’s agentic AI ...

🔐 Trust at Every Hop: How mTLS in Splunk Enterprise 10.0 Makes Security Simpler

From Idea to Implementation: Why Splunk Built mTLS into Splunk Enterprise 10.0  mTLS wasn’t just a checkbox ...