Getting Data In

How to drop some fields/values before indexing?

zubairsp
Explorer

Hello everyone, I need your help!
We have a data source that sends very large logs, so we want to drop useless field values before indexing. We have installed an add-on that extracts a field from the raw logs (for example, ProductName), assigns the value of that field as the source of the event, and uses the same props and transforms to extract the rest of the fields/values. Up to this point everything works smoothly. The issue is that one of the extracted fields is something I want to drop before indexing. I understand that there is a way to send a particular field/value to the nullQueue using props and transforms. Below is a sample configuration I tried, but it didn't work.

Your support in this is highly appreciated.

 

Props:

[test:syslog]
SHOULD_LINEMERGE = false
EVENT_BREAKER_ENABLE = true

TRANSFORMS-test_source = test_source, test_format_source
TRANSFORMS-nullQ=nullFilter

REPORT-regex_field_extraction = test_regex_field_extraction, test_file_name_file_path
REPORT-dvc = test_dvc

Transforms:

[test_source]
REGEX = ProductName="([^"]+)"
DEST_KEY = MetaData:Source
FORMAT = source::$1

[test_format_source]
INGEST_EVAL = source=replace(lower(source), "\s", "_")

[test_dvc]
REGEX = ^<\d+>\d\s[^\s]+\s([^\s]+)
FORMAT = dvc::"$1"

[nullFilter]
REGEX = (?mi)XYZData\>(.*)?=\<*?\/XYZData\>
FORMAT = remove_field::$1
DEST_KEY = queue
FORMAT = nullQueue

[test_regex_field_extraction]
REGEX = <([\w-]+)>([^<]+?)<\/\1>
FORMAT = $1::$2
CLEAN_KEYS = false

[test_file_name_file_path]
REGEX = ^(.+)[\\/]([^\\/]+)$
FORMAT = source_process_name::$2 source_process_path::$1
SOURCE_KEY = SourceProcessName

[test_severity_lookup]
filename = test_severity.csv

[test_action_lookup]
filename = test_action_v110.csv
case_sensitive_match = false

[drop_useless_fields]
INGEST_EVAL = random:=null()

 

Below is a sample raw log; I want to drop the field <XYZData>.

Sample raw data:

<29>1 2023-11-09T18:34:02.0Y something testEvents - EventFwd [agentInfo@ioud tenantId="1" bpsId="1" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="x\y"] <?xml version="x.0"?> <test_hjgd><MachineInfo><AgentGUID>{iolkiu-5d9b-89iu-3e19-jjkiuhygtf}</AgentGUID><MachineName>something</MachineName><RawMACAddress>xyxyxyxyxz</RawMACAddress><IPAddress>xx.xx.xx.xx</IPAddress><AgentVersion>x.x.x.xx</AgentVersion><OSName>GGG</OSName><TimeZoneBias>-333</TimeZoneBias><UserName>xxx</UserName></MachineInfo><EventList ProductName="something" ProductVersion="xx.x" ProductFamily="Secure"><Event><Time>2023-11-09T18:34:38</Time><Severity/><EventID>jkiuj</EventID><XYZData>T1BHoQUAAAAAAAADQg0AAO0dXVNbu1E/JZP3xNyElJsZ13cA24kbwB7bhPaJMdiAG8DUNjehnf737of26ONIOrID0zycyWCfo13trlar1WolOU31h/qh7tSteqX+VDO1VCs1Vwt1r/6qXqvf1Fu1A9+vAHKvLqF8CtB7dU3QUzVWXfUGsP5COCu1VhOATuHzlmjMCO8enhH+h2qppjoA6AogbfhG/AHwXMDTDOivNe+hegQKM9Uhme6h/BX8PakHeOsBfZbtI/ztwhNSNZhjjfeKZF3B8y1weVIn8HmnJeqor+qcPjtQPobnsfoHSNKBpwFwtyH78L2vjgA+At4jKOnCew9KOtAG5P4bydJUjYAUrmyfSC8XpJ8e1G6p/6grtad+h3/voHwXIKjPD0DvPTztgvax/A2Uf4RSLHsPpXtQ6wpoTqHeDpTvge7+60lQ5tWEti3gbQ7YT6RH5L/j/XuT+eT/Y/5lDi7XIUj3J/UL93QLavr1yjguDenJVqmmDdkHK2VNfHXsmvvrLfztwOcuyY504vjcg3NoDY+CGfT9Ar4nhb2iDc+gR+YwllpFP2xSA8fSIfzNtfw4pj7qPn8Hnx+gp5Guj9eEN6R3X4wf5ItjZUSja10xEg5VH95PQF6k2oO292gsnEP9MehkTGOiDzbfhTb0CPN10WvVvKUnrqm917oU6z2SrrGlov04TqgH5uRvECbPHfIZdwmP8Tvo0fYYNsXN/AaOKPQQh6Qf4z2wdGz5B8Qewr8+/DF0BHVe6x4OWYrvOWxYC/r/HuS5oJZiCxfw+UD2egXPS5D0FfA6BG7ojy/JH4e4IHWbU8N7C+s0ZtumL3DsPpAkaz1TsH+fQl9eUq/EtYt+d0CaGpOlfdLeuA96OyX79L0z6nGXZKriy3b4QHznpBWU/DNgoOT3hd84AV5H1JfHuhc/qTMYgWyj1fVDXHpAD/t7D+i8hT/2Pm9p9voQocx1mmAtOAqWVMYt+kGedaD7Ig1vQ+lKa8T1PT6dXMwOze+o6TlZhKEQg5gZaUg8cGaPz+fv9OiUCGAFlNnL3VLcgF75iry9kVJwOiDpJdmBD0HOEyu+KHN/T/PBa92DBjPHK6DNYpzAHhT96Tn89T1fGaKK3E4o+im3SrQto31aUPJLUzSee35PSdsmS7TLdrRt+eVN0NMKRhha+yNFKIJbLq/i+jfAXAWhEmE9rwaq+GHbcAyLh1xv1JMj8uk4D3aBwi3ZyI4e6WWI62+RL3uvQ239C4dCHi73QrgFl1D3yRpfXwD+9AI6TvFKSeJGhnl4bdDLsRe3voTdxPjgPDyB+
epluNq0zXxwAt9r9Z1s4RtY1g1gLbU+TimqaeloU54PCS72MKexuXoBeWN87DkqJX2oz/N9OEd0Q4rmjC830V0XYAPLp6e42bPYy/StSx37iD3TAlr1mWa8qV7JNAr4HdWZ6Gh9VtAY0ep+aXmLXGzxLPb8yp5spjMKvCao1v5IR3bDYg2CUWAXYK8tL+hS5bXhkviasl2IIz7C50eIsffo+YNeK/qYqBWJuW61H7ZnavSEGFn7ccHzxgPVcrCGV5W8DylfgauSo2IF4kqC8fM+6JHf+uAR7fVcrgytDHxbd8YPjWj0NrRGH8leU9GdX+eYYJcw6pGfWOQZ9egNtOce3p8AC+vO9Zwp7duurhnHomn0OQvwQLzSwHamIpTGT9YPeZmuHm8La3z7s7bd2pXuhQWtzG89GHrU74DxRq/Gq/CatA6dq39Bu2YlPjt67ohjxOpjjLPUWmklqPh4aJnskaY6fjS9bFqVxgrFPWW5D2n+Rs9oVhe2zbfJUicUocm8lYYPaX29IK82tWYzod6IrKCkrrynxpCxOzOOcuvG6thR+jb1c8a+Xbcd9HkSGcRg///obkDtuSmylebNHdfY+xI7NAqfuaSs5Jpm3BuylEnS/2Puaag+URbqnHJNn3Uu4zxg3ZhxM94/xs/NaoRW3mGoyCVYZzRip+T5xuRfzYojDpOoLy1BGsuXRLBTElXjdEg7OJLZs8yB81zrdhWcrXJqGBxembG3wJbuw/Oa/P5NkZEz/nbzejavEYzGC/VPxZncvGyL1D2gPNvThrVCUm3HN9cTlSm4UfJmdUckM9cNZbP21HslGaU2xeTcH2Jdbon9fkr7Y8sApg3Zzh82Ii2Ity2nto1jMDCKCNm5GY0+zbwatqUfbsghr4ZtY5vzyK1jamy6eym56dgeJu4z4C7lKeCMnfVqnKNk5WwfdQy97O7P2tmrXGyf7gg+r9V9gpaL4dcvz5thKiE8n9YA5spBNsU4NsfWBtMetX6sKavxy43wD+B9sxp27OzODo+65xckBVMux9t5dewatkz5fHJrpTRcrdMcLTYCfOIztcstjufzjGO61mbLNoae5VXBDcWY0yLncpOw180ocMuPI9gmSkzhsFaqaKRxRGNVVKqwwhmvatk2qRPm0SPvxRmyvLb8LBW77+z5tLoX09h2f+bQzcV2+ziHcj5+qt/z2rBd7c1sYVtJtqNn535k/2jlnLEwdpHCsunIOoN9GM7Z/4bnLyBVmVoKt5lYoTQSMCOhT6+RgNk75rxHLZllv9ysE208t4xPJi0tDZp3zoEvdb7axMipMwbhGk0rL+TG4+FyzrNg33NWyc1BpKADnVdbkwQSG8r85EL9c1gxGPK7olKzqimXcZaMV3GnAD9ysmd+Oa7VsL9MziSdFT9SfajdLk6UjIsTT/4ezwGU/N3JifucXP68cr7SI820sArDzhWlcEX37q5XTqtPKAt0RidqvmS0vEc7B32wSj59Y3QQ5u1mL2UfO0eyIdl+H9Yu+6DvI9XJkK5cx8gXlyIsI66Pl8o+cfo80rbhaeiV9enM3gF9n+gTozGpy3KZ7BpS7Og9/Ee1tnIT6Xxgr9DegCTZ93alxnrnbezsTaU5Sqx6SlHzI2Gz10N/ckjzz0KPfxOX5mAzXfHep9SmBeVobEphOHrJbxRXnwHsInnC2Hj9/BrxfFk5N+JD21D6UMjt+9Uh1PsehUo5nm/iGVTOptrnmzhGQI2ylwxnIF1YF57yzz126fRzdZbBpiqzxoLmVcyt2buCPsT0cgjCufSljgFSddNYHKctCxzRrzufVeP0aca7Vnxm7jZCJwera/l7d66OQb6qmc5Ky3zjl5TPPzYCZeHstpzolHUujk60bcQ8ohpGSzHYKfCS81vm+bOaUFzKo8k9Wyo5Kv/EaY1X470EXsOxxtA8Z86cNSrg7shyx489Ml1I2c8Z3yjrjNp71t6z9p413q+G9yt4z2ofKf6Cs0joSy5ICvukU
hondnYtfg48dsdtkxooOXrPWaLWGKA/it7YDD+O1bdOmdnzwiY1/Hsrtk+OQUxf2tih0vK6Wc7GT+j0SltrUzgs9VzBmalV5O7Ju2K3/vnukmIe5SvdHu05K/9t7ok+73ml+p6oua8Vv/f5q90KlZOV9R1Q+96hyYG85M3DWLbjuW4emrzbtjcMwxTqm4T1TcL6JmF9k7C+SVjfJPwZrvVNwvomYX2TsL5JWN8kRHnrm4T1TcL6JmF9k7C+SbjpTULTS5+g3+Yvkl0M8bA5pywgB2uk4wa02BMawxfKnPxLQQ8ossqPI9uKf22Mf2fsALxY3hkcl4+/j8IWxZjl8qaTQbdxy+VNkujA0prx2n55kyKHvpNV2VQDPSrrW1H1KKGFMj/WhHhezCFOqHckPxCH2ZB4D++UqKSswdanrMVHoIUVjeK7re2kq3r6nOKIfrdyTCM3x25yJZIZ7A7e2MteOCuIRhLao75YWzlKqRWDhCIXzilKTlhGXy4m5t6vyUdNiavcrfRP3bneIVS66TkMW8e8Er0kCzHytkp9EccbkKVdR/YKYlCOmx6SdVMYEgX9zD4c96ucmTzWWuK6vSKSqsLgs4+8nuAdg6mSnZVvRe/K6gjjhhuiG9mnfoY5BmwW9WGYyDhdN5cto9pwkDz3TcSbrG237iHoCVxa41kBKuEY38Z1ND7lfq7kew0u6FXapvZPcSy37sE1r5fCRe52on0fFtxa24VxFx925ex4tclzW1bbo91k5aktHmfnYbX2z1twLqaach824HTqnhfrImW2kDv6mqb/mj9GzJYvpD1fKFxriRhnuiTg/+ojFJOzLjQ3IGjBUblPoV8QkeA/goJhd3Vm3T7EIZjq/UJbmDD7DM2u/WNHltDuGE74ZImfXF7RCvc2IHWL3RPq0M3yUFTekZShLOSYveU9y8T7C5nKO4XufIkDZu91c0rIcHJ8MvP3+aumwzrnOOo6sGyGxyNTnwDHmLUnDdoy/TS35Rhci8dAE5FkVmVV/PygMx9UL7vhhyguikjio7lTKj0xrFHt/wl1m9lDi5X/w5cGCK3OXjt/zKShnj4nH0PHYO5EP90i7sOq8Z0NdZWchZjp6QzA3Pr5N2s26e12Cllg91zUCPKDw8pc5AzMkO8Q9nIseIzwddqpvd4zKiqxizbl7FsvgUks4RAYr/uIr8kbvCFvl9S/fsvjQjvll7Nb/N/JrTU/wA=</XYZData><UserInfo>T1BHoQUAAAAAAAADJwEAAIWSTW+CYBCE56eQ3lv10vRANQ1Yw6EfSfXSm1HbmgAlgE35933ehYooxgvszszusLv4muhXiWJ5+tFGuQpt9a1U97rSSDca8vZgUq3A17CpPo1daK5HXaO6NU2hUkvYNc/YemxMlxI7fqKxfE3xWaB1XhHcBhjhytguUMLKK6dr7DeUTkamZURAotHnSykO8pqIzxrPTMM6FLrTvPPTCLm2aHf3lUdZ6r+5XNBnaWfcGte30v6d4OJmrjGVtx3hk1A0Pm1JfM8N+9m0/ptwSJG7abB/RK6OWuUjWKUyykl8tz+iZ26XI/ST/zZNVLpnNXrLWnWLDfYt5x72KBzdPybfZOnDV4G79w/XozETtyriuyoakOEV+veGztv93aFPHRnS7xfX/qWH8=</UserInfo><DestinationUserInfo>T1BHoQUAAAAAAAADJwEAAIWSTW+CYBCE56eQ3lv10vRANQ1Yw6EfSfXSm1HbmgAlgE35933ehYooxgvszszusLv4muhXiWJ5+tFGuQpt9a1U97rSSDca8vZgUq3A17CpPo1daK5HXaO6NU2hUkvYNc/YemxMlxI7fqKxfE3xWaB1XhHcB6xHhhjhuyfgfr7DeUTkamZURAotHnSykO8pqIzxrPTMM6FLrTvPPTCLm2aHf3lUdZ6r+hjhhyujte30v6d4OJmrjGVtx3hk1A0Pm1JfM8N+9m0/ptwSJG7abB/RK6OW
uUjWKUyykl8tz+iZ26XI/ST/zZNVLpnNXrLWnWLDfYt5x72KBzdPybfZOnDV4G79w/XozETtyriuyoakOEV+veGztv93aFPHRnS7xfX/qWH8=</DestinationUserInfo><ThreatName/><PolicyName/><TimeSZone>+00</TimeSZone></Event></EventList></XYYPREV_1100>

1 Solution

richgalloway
SplunkTrust

The null queue is for whole events, not individual fields.  One can remove fields using SEDCMD in props.conf.

SEDCMD-rm_XYZData = s/XYZData\>.*\<\/XYZData\>//
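
For reference, here is a minimal sketch of where that setting would sit, using the [test:syslog] stanza from the question. In PCRE, `\>` and `\<` are plain literal `>` and `<`, so the pattern above appears to start matching at `XYZData>` and would leave the opening `<` of the tag behind. The variant below includes that `<` and uses `[^<]*` instead of `.*`. It assumes the payload never contains a `<`, which holds for the base64 text in the sample but has not been checked against other data.

```ini
# props.conf, on the first instance that parses the data
# (heavy forwarder or indexer); stanza name taken from the question
[test:syslog]
# Delete the whole <XYZData>...</XYZData> element from _raw before indexing.
# [^<]* stops at the next "<", so the match cannot run past the closing tag.
SEDCMD-rm_XYZData = s/<XYZData>[^<]*<\/XYZData>//g
```

SEDCMD only applies to events that match the stanza it lives in at parse time, so the stanza name has to match the sourcetype (or source/host) the events actually carry at that point.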

 

---
If this reply helps you, Karma would be appreciated.

PickleRick
SplunkTrust

"i understand that there is way to send particular field/value to a nullqueue"

You understand wrong, I'm afraid.

As @richgalloway pointed out, you can send _whole events_ to nullQueue if they match a certain regex (or other criteria, if you use INGEST_EVAL).
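
To illustrate what the null queue does, here is a minimal sketch of whole-event filtering. The stanza name drop_xyz_events and the class name TRANSFORMS-drop_events are hypothetical. It discards every event that contains an XYZData element, which is not what the question asks for; it only shows that the unit being dropped is the event.

```ini
# props.conf
[test:syslog]
TRANSFORMS-drop_events = drop_xyz_events

# transforms.conf
[drop_xyz_events]
# Any event whose raw text matches REGEX is routed to the null queue
# and never indexed; nothing inside the event is edited.
REGEX = <XYZData>
DEST_KEY = queue
FORMAT = nullQueue
```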

You can use transforms to cut specific parts of the events with regexes.

But in the ingest pipeline Splunk has no knowledge of search-time fields (the ones created with REPORT or EXTRACT entries, as well as calculated fields and field aliases). It only knows the index-time fields (the default metadata fields and any custom index-time extractions that are defined). So if you want to trim your events, you have to manipulate them with regexes.
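
One way to do that kind of regex manipulation with a transform is to rewrite _raw from capture groups. This is an untested sketch: the stanza name strip_xyzdata_raw, the class name, and the LOOKAHEAD value are assumptions, and the pattern assumes at most one XYZData element per event.

```ini
# props.conf
[test:syslog]
TRANSFORMS-strip_xyz = strip_xyzdata_raw

# transforms.conf
[strip_xyzdata_raw]
# $1 = everything before the element, $2 = everything after it.
# (?s) lets "." match newlines in case the event spans several lines.
REGEX = (?s)^(.*)<XYZData>[^<]*<\/XYZData>(.*)$
FORMAT = $1$2
DEST_KEY = _raw
# REGEX only looks at the first LOOKAHEAD characters (default 4096);
# the sample event looks longer than that, so the limit is raised here
# (the exact value is an assumption).
LOOKAHEAD = 65536
```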

But since your events are structured, it would probably be better to process them before ingesting them into Splunk, using something that can interpret XML and filter it selectively based on the XML structure rather than plain regexes.

zubairsp
Explorer

Hi there, thank you for your response! Can you help by sharing a configuration that uses INGEST_EVAL to trim this specific part of the event?

PickleRick
SplunkTrust

Fiddling with the _raw event using INGEST_EVAL could be tricky. You can use the normal eval text functions, but I suppose you'd have to set $pd:_raw$ as the destination key (maybe plain _raw would work as well; I don't know, as I have never tried that and have only used INGEST_EVAL to create new fields).
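
A minimal, untested sketch of that idea, assigning to plain _raw with the eval replace() function. The stanza and class names are hypothetical, and whether assignment to _raw behaves as expected on a given Splunk version should be confirmed in a test environment first.

```ini
# props.conf
[test:syslog]
TRANSFORMS-strip_xyz_eval = strip_xyzdata_eval

# transforms.conf
[strip_xyzdata_eval]
# replace(str, regex, replacement): remove the element and its payload.
# ":=" overwrites the existing value instead of adding another one.
INGEST_EVAL = _raw:=replace(_raw, "<XYZData>[^<]*</XYZData>", "")
```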

isoutamo
SplunkTrust
When I need INGEST_EVAL, I (almost) always use the Splunk GUI to test it first. Replace the normal rex command with the eval replace() function (which is actually regex-based, like rex :-). Then just put all of those eval expressions on one line.

zubairsp
Explorer

All the other solutions are tricky and need more time. For now I have settled on SEDCMD, which only works with a custom sourcetype. I am still exploring, and if I find anything that works I will update this post.

zubairsp
Explorer

Thank you for your response. I tried SEDCMD as you suggested in our test environment, but with a g flag at the end (SEDCMD-rm_XYZData = s/XYZData\>.*\<\/XYZData\>//g). It only works if I don't use the current add-on. Is there anything I am missing?

isoutamo
SplunkTrust

Hi

A couple of additional comments:

  • REPORT- settings are not executed while data is being indexed; like EXTRACT, they work only at search time.
  • When you have several TRANSFORMS- settings on their own lines, they are applied in the ASCII order of their names! If you want to apply them in a specific order:
    1. put them all in one setting: TRANSFORMS-xyz = a, e, c, b, d
    2. or ensure that the names sort in the correct order (e.g. 000x, 001y, 002a, etc.)
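
Applied to the stanza from the question, the two options could look like the sketch below. The class names and the transform strip_xyzdata are hypothetical; only one of the two options would be used at a time, so the second is commented out.

```ini
# props.conf
[test:syslog]
# Option 1: a single class; the transforms run in the order listed.
TRANSFORMS-all = strip_xyzdata, test_source, test_format_source

# Option 2: separate classes whose names sort in the intended order.
# TRANSFORMS-001_strip = strip_xyzdata
# TRANSFORMS-002_source = test_source, test_format_source
```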

One good set of instructions for the index phase is https://www.aplura.com/assets/pdf/props_conf_order.pdf. Aplura has some other cheat sheets that help as well.

r. Ismo

 

zubairsp
Explorer

Thank you for your response! So, in my scenario, will the configuration below work?

Props:

[test:syslog]
SHOULD_LINEMERGE = false
EVENT_BREAKER_ENABLE = true

TRANSFORMS-test_source = nullFilter, test_source, test_format_source

REPORT-regex_field_extraction = test_regex_field_extraction, test_file_name_file_path
REPORT-dvc = test_dvc

Transforms:

[test_source]
REGEX = ProductName="([^"]+)"
DEST_KEY = MetaData:Source
FORMAT = source::$1

[test_format_source]
INGEST_EVAL = source=replace(lower(source), "\s", "_")

[test_dvc]
REGEX = ^<\d+>\d\s[^\s]+\s([^\s]+)
FORMAT = dvc::"$1"

[nullFilter]
REGEX = (?mi)XYZData\>(.*)?=\<*?\/XYZData\>
DEST_KEY = queue
FORMAT = nullQueue

[test_regex_field_extraction]
REGEX = <([\w-]+)>([^<]+?)<\/\1>
FORMAT = $1::$2
CLEAN_KEYS = false

[test_file_name_file_path]
REGEX = ^(.+)[\\/]([^\\/]+)$
FORMAT = source_process_name::$2 source_process_path::$1
SOURCE_KEY = SourceProcessName

[test_severity_lookup]
filename = test_severity.csv

[test_action_lookup]
filename = test_action_v110.csv
case_sensitive_match = false
