Splunk Search

Regular expression to split a string into multiple strings based on a delimiter.

rajim
Path Finder

In my search, I have a field that contains a string like the one below. I want to split this string into multiple strings on the delimiter "#@#@". Please help me write a correct regular expression for this.

12/23/2017 12:37:06 PM#@#@Copying to removable media#@#@DEFAULT#@#@RUR90M4417#@#@File Copy#@#@20_xiamen wingtas_wk2017381_ci.pdf#@#@2.7314186096191406#@#@pdf#@#@c:\users\ichemiakin001\desktop\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\#@#@c:\users\ichemiakin001\desktop\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\20_xiamen wingtas_wk2017381_ci.pdf#@#@g:\assurance\clients\mm\sportmaster group\2017\sportmaster ifrs audit\office file\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\#@#@g:\assurance\clients\mm\sportmaster group\2017\sportmaster ifrs audit\office file\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\#@#@False#@#@False#@#@explorer.exe#@#@Operation monitored, File not saved

I have tried the regex below, but it is not working properly.

| rex field=allRequiredFields "^(?<Agent_UTC_Time>.*)#@#@(?<etype>.*)#@#@(?<CountryCode>.*)#@#@(?<ComputerName>.*)#@#@(?<Operation>.*)#@#@(?<Source_File>.*)#@#@(?<Detail_File_Size_MB>.*)#@#@(?<Source_File_Extension>.*)#@#@(?<Source_Directory>.*)#@#@(?<Destination_Directory>.*)#@#@(?<destination>.*)#@#@(?<Was_Blocked>.*)#@#@(?<Was_File_Captured>.*)#@#@(?<Application>.*)#@#@(?<action>.*)"
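
A likely reason the pattern above misbehaves is that `.*` is greedy: it can run across the "#@#@" delimiter, so the first capture group swallows extra values whenever the string holds more values than the pattern has groups (the sample has 16 values, the pattern has 15 groups). The sketch below reproduces the effect on a made-up three-value string. It uses Python's `re` module only to illustrate the matching behaviour; Python spells named groups `(?P<name>...)` where rex uses `(?<name>...)`, and the names `first` and `second` are hypothetical.

```python
import re

record = "a#@#@b#@#@c"  # made-up sample: three values, two delimiters

# Greedy: ".*" grabs as much as it can, including a delimiter.
greedy = re.match(r"^(?P<first>.*)#@#@(?P<second>.*)", record)

# Delimiter-aware: each group stops at the first '#'.
# Assumption: values never contain a '#' character.
bounded = re.match(r"^(?P<first>[^#]*)#@#@(?P<second>[^#]*)", record)

print(greedy.groupdict())   # {'first': 'a#@#@b', 'second': 'c'}
print(bounded.groupdict())  # {'first': 'a', 'second': 'b'}
```

With the greedy pattern, the first group ends up holding two values plus a delimiter, which shifts every later field.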

niketn
Legend

@rajim, were you able to try out any of the following answers? Is your issue resolved?

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

mayurr98
Super Champion

Hey,

You can also do this from the UI. Go to:

Settings > Fields > Field extractions > select sourcetype > Next > Delimiters > Other, and enter the custom delimiter "#@#@".

This will update props.conf.

You can also change this in props.conf. The documentation says:

FIELD_DELIMITER = <character>
Tells Splunk which character delimits or separates fields in the
specified file or source.
This attribute supports the use of the special characters described
above.

Let me know if this helps!


niketn
Legend

@mayurr98, the delimiter can only be a single character, so only the first hash (#) character will be used as the delimiter.


mayurr98
Super Champion

Yes, but you will still be able to extract all the fields you want. The only drawback is that three unnecessary fields with empty values are created after every real field, if you are fine with that. I tried this in a test environment, and you will get everything you need.

field1 12/23/2017 12:37:06 PM
field2
field3
field4
field5 Copying to removable media

and so on

In this case, you can rename the fields you want. The empty fields will still be extracted, but you need not use them for further analysis.


niketn
Legend

Let's see what @rajim wants to try. However, 45 unwanted fields would be extracted during search-time field discovery, which is pure overhead.
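
The figure of 45 follows from simple arithmetic if each of the four delimiter characters acts as its own separator: 15 delimiters times 3 empty slots each. The Python sketch below checks this on made-up values. How Splunk actually handles a multi-character FIELD_DELIMITER is not confirmed here; the per-character behaviour is an assumption taken from the test result reported earlier in this thread.

```python
import re

# Made-up record: 16 values (v0..v15) joined by 15 delimiters, like the sample event.
record = "#@#@".join(f"v{i}" for i in range(16))

# Assumption: every character of "#@#@" is treated as a single-character delimiter.
parts = re.split(r"[#@]", record)
empty = [p for p in parts if p == ""]

print(len(parts), len(empty))  # 61 45
```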


niketn
Legend

@rajim, since your data has each field at a fixed position after each delimiter, you can try the following run-anywhere search. Replace the first two commands (makeresults and eval _raw) with your current base search. PS: There is one additional directory between Source_File_Extension and Was_Blocked that you have not extracted, so I have placed it in a someOtherDirectory field, not knowing which name in the directory sequence is incorrect.
Also, I have not written a regular expression to extract Agent_UTC_Time, since it should be extracted as _time in your props.conf.

| makeresults
| eval _raw="12/23/2017 12:37:06 PM#@#@Copying to removable media#@#@DEFAULT#@#@RUR90M4417#@#@File Copy#@#@20_xiamen wingtas_wk2017381_ci.pdf#@#@2.7314186096191406#@#@pdf#@#@c:\users\ichemiakin001\desktop\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\#@#@c:\users\ichemiakin001\desktop\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\20_xiamen wingtas_wk2017381_ci.pdf#@#@g:\assurance\clients\mm\sportmaster group\2017\sportmaster ifrs audit\office file\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\#@#@g:\assurance\clients\mm\sportmaster group\2017\sportmaster ifrs audit\office file\???????????? ???????? ?? ????????????? ?????? ???????? ??????????? test cost of sales transactions - trade entity\??????? ???????\??\#@#@False#@#@False#@#@explorer.exe#@#@Operation monitored, File not saved"
| rex "#@#@(?<value>[^#]+)" max_match=15
| eval etype=mvindex(value,0),CountryCode=mvindex(value,1),ComputerName=mvindex(value,2),Operation=mvindex(value,3),Source_File=mvindex(value,4),Detail_File_Size_MB=mvindex(value,5),Source_File_Extension=mvindex(value,6),Source_Directory=mvindex(value,7),Destination_Directory=mvindex(value,8),destination=mvindex(value,9),someOtherDirectory=mvindex(value,10),Was_Blocked=mvindex(value,11),Was_File_Captured=mvindex(value,12),Application=mvindex(value,13),action=mvindex(value,14)

Please try out and confirm.
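
The positional idea behind the search above (collect every value, then pick each one by index) can be sketched outside Splunk as follows. This is an illustration in Python, not a replacement for the SPL: it uses only the first eight values of the sample event, and it splits on the whole four-character delimiter rather than matching with `[^#]+`.

```python
import re

# First eight values of the sample event, shortened for readability.
record = ("12/23/2017 12:37:06 PM#@#@Copying to removable media#@#@DEFAULT"
          "#@#@RUR90M4417#@#@File Copy#@#@20_xiamen wingtas_wk2017381_ci.pdf"
          "#@#@2.7314186096191406#@#@pdf")

names = ["Agent_UTC_Time", "etype", "CountryCode", "ComputerName", "Operation",
         "Source_File", "Detail_File_Size_MB", "Source_File_Extension"]

# Split on the whole delimiter, then pair each name with its position.
values = record.split("#@#@")
fields = dict(zip(names, values))

# Same idea as rex "#@#@(?<value>[^#]+)": capture every value that follows a delimiter.
# The first value (the timestamp) has no leading delimiter, so it is not captured.
after_delim = re.findall(r"#@#@([^#]+)", record)

print(fields["etype"])            # Copying to removable media
print(after_delim == values[1:])  # True
```

Splitting on the full delimiter also keeps empty values and values that contain a single "#", which a `[^#]+` capture would skip or cut short.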


DavidHourani
Super Champion

Hi Rajim,

Try this instead:

| rex field=allRequiredFields "^(?<Agent_UTC_Time>[^#@]+)[#@]+(?<etype>[^#@]+)[#@]+(?<CountryCode>[^#@]+)[#@]+(?<ComputerName>[^#@]+)[#@]+(?<Operation>[^#@]+)[#@]+(?<Source_File>[^#@]+)[#@]+(?<Detail_File_Size_MB>[^#@]+)[#@]+(?<Source_File_Extension>[^#@]+)[#@]+(?<Source_Directory>[^#@]+)[#@]+(?<Destination_Directory>[^#@]+)[#@]+(?<destination>[^#@]+)[#@]+(?<Was_Blocked>[^#@]+)[#@]+(?<Was_File_Captured>[^#@]+)[#@]+(?<Application>[^#@]+)[#@]+(?<action>[^#@]+)"

I tried it on https://regex101.com/ and it works, but I think you're missing a field somewhere; you'll just have to add it in.
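
To see why a missing field can go unnoticed, here is a small Python sketch on made-up data. The names `first`, `second`, and `third` are hypothetical, and Python writes named groups as `(?P<name>...)` where rex uses `(?<name>...)`. Because the pattern has no end anchor, it still matches when the record holds one more value than there are groups, so the remaining value is silently left out.

```python
import re

names = ["first", "second", "third"]  # hypothetical field names
# Build the same shape of pattern: value, delimiter run, value, ...
pattern = "^" + "[#@]+".join(f"(?P<{n}>[^#@]+)" for n in names)

record = "a#@#@b#@#@c#@#@d"  # made-up record: four values, one more than names

match = re.match(pattern, record)
print(match.groupdict())            # {'first': 'a', 'second': 'b', 'third': 'c'}
print(len(record.split("#@#@")))    # 4 values present, only 3 captured
```

Comparing the number of delimited values with the number of capture groups is a quick way to spot the gap. Note also that `[^#@]+` assumes no value contains "#" or "@".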

Regards,
David
