Extracting path and file name separately using regex

telcosi
Explorer

Hi - we have some data that contains a folder hierarchy and a computer name that we want to extract.

The raw data looks like this:

Time=2014.06.12 11:04:03.772 EST, Agent=/root/SiteHome/FirstFolder/605c6f6a-145d051f, Id=68.2, Watchlist=JW1

From the part that starts with "Agent=" and ends with ", Id=", we would like to extract the folders and the computer ID into two separate fields.

There is always at least one folder (never zero, with no upper limit), and the computer ID can contain any characters except the special ones (/ \ * , . " @).

The folders are everything up to the last slash, and the computer ID is everything from the last slash up to the comma.

We would like two regexes for use in the Field Extractor. Can someone please help?

Thanks

John

1 Solution

s2_splunk
Splunk Employee

Your log format should already give you a field called 'Agent', so this should break that field into the two components (if I understood your goal):

rex field=Agent ".*\/(?<ComputerID>.*?)$" | rex field=Agent "^(?<Folder>.*)/"

or both fields in a single rex command, like so:

rex field=Agent "^(?<Folder>.*\/)(?<ComputerID>.*?)$" 
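
To see how the single-rex pattern splits the sample value, here is a minimal Python sketch. It is only an illustration outside Splunk: Python's `re` module spells named groups `(?P<name>...)` where rex accepts `(?<name>...)`, and the variable names are assumptions made for this example.

```python
import re

# Sample Agent value taken from the raw event in the question.
agent = "/root/SiteHome/FirstFolder/605c6f6a-145d051f"

# Python writes named groups as (?P<name>...); Splunk's rex uses (?<name>...).
# The greedy .* inside Folder runs up to the last slash, so ComputerID
# receives whatever follows that slash.
pattern = re.compile(r"^(?P<Folder>.*/)(?P<ComputerID>.*?)$")

match = pattern.search(agent)
fields = match.groupdict() if match else {}
print(fields)
```

Note that in this single-pattern form the slash sits inside the Folder group, so Folder keeps its trailing slash, whereas the two-command version leaves it out.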

s2_splunk
Splunk Employee

No problem. Strange that you didn't see the computerID, I had tested it with your raw data sample to make sure it works. Oh well, glad you got it working!

telcosi
Explorer

Thanks - get_agent and get_folder worked well. get_computer returned no values, so I blundered my way around and found that this regex worked okay: ([^/]*)$

Closing this with my thanks !
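
For anyone wondering why the shorter pattern works, here is a small Python sketch (an illustration only, not Splunk configuration) that compares it with the longer form from the rex answer on the sample value. The variable names are assumptions made for this example.

```python
import re

agent = "/root/SiteHome/FirstFolder/605c6f6a-145d051f"

# Anchored at the end of the string, [^/]* can only match the trailing run
# of non-slash characters, i.e. everything after the last slash.
computer_id = re.search(r"([^/]*)$", agent).group(1)

# The longer form: greedy .* consumes everything up to the last slash,
# then the capture group takes the remainder.
computer_id_alt = re.search(r".*/(.*?)$", agent).group(1)

print(computer_id, computer_id_alt)
```

Both patterns should yield the same value here. The shorter one also matches a value that contains no slash at all, which may or may not be what you want.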

s2_splunk
Splunk Employee

NOTE: For some reason, the forum changes the 'A' of the second 'Agent' in the first REGEX to lowercase when I post a comment. Make sure the case matches in all references to fields.

s2_splunk
Splunk Employee

Please read up on props/transforms; you are very likely not dumb, this stuff takes a little practice. 😉
This works for me:

1) props.conf
[mySourcetype]
REPORT-AgentFields = get_agent, get_folder, get_computer

2) transforms.conf

[get_agent]
REGEX = Agent=(?P<Agent>[^,]+)
FORMAT = Agent::$1

[get_folder]
REGEX = ^(.*)/
SOURCE_KEY = Agent
FORMAT = folder::$1

[get_computer]
REGEX = .*/(.*?)$
SOURCE_KEY = Agent
FORMAT = computerID::$1

It probably can be simplified a bit more, but I hope you get the idea.
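
As a rough check of the chain above, the following Python sketch simulates the three transforms on the sample event from the question. It is not how Splunk evaluates transforms internally, and `extract` is a hypothetical helper written for this example. The get_agent pattern is simplified to a plain capture group, and the get_computer pattern is written as `.*/(.*?)$`, matching the rex version earlier in the thread.

```python
import re

raw = ("Time=2014.06.12 11:04:03.772 EST, "
       "Agent=/root/SiteHome/FirstFolder/605c6f6a-145d051f, "
       "Id=68.2, Watchlist=JW1")


def extract(pattern, source):
    """Hypothetical helper: capture group 1 of the first match, or None."""
    m = re.search(pattern, source)
    return m.group(1) if m else None


# get_agent runs against the raw event.
agent = extract(r"Agent=([^,]+)", raw)

# get_folder and get_computer read the Agent value (SOURCE_KEY = Agent).
folder = extract(r"^(.*)/", agent)
computer_id = extract(r".*/(.*?)$", agent)

print(agent, folder, computer_id, sep="\n")
```

Because the second and third patterns read the Agent value rather than the raw event, a mismatch in the field name's case would leave them with nothing to match against.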

telcosi
Explorer

Hi again - I must be dumb, but I cannot figure out how to do this extraction in the UI or props.conf. How do I convert the (working) rex commands into something recognised by Field Extractions in the UI or in props.conf?

s2_splunk
Splunk Employee

You can attach the extraction to your sourcetype/source/host, either via the UI (Settings->Fields->Field Extractions) or by directly editing props.conf.

Details: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Addfieldsatsearchtime

[Please accept answer if it solved your problem]

telcosi
Explorer

That is great - how do we then persist that field definition in (or in the same way as) the field extractor? I have tested it in the search bar but want to keep it available for all searches. Thanks!
