
remove/change long field before inserting event

dorHerbesman
Explorer

I have events containing a specific field that is sometimes very long, which causes the rest of the event to be truncated. I want to remove this field or replace its value with "long field detected".

The problematic field is called "file", and I need to match its last appearance. I also want to keep the data after it, so the removal should stop at the first "," (comma) after the field. Note that the event contains nested fields.

I tried props.conf plus transforms.conf like this:

[APIGW]
TRANSFORMS-replace_long_field = replace_long_field

[replace_long_field]
REGEX = \"file\\\":\s*\"(.{5000,}?),"
FORMAT = file": \"$1long field detected"
DEST_KEY = _raw


But it doesn't work.

Here is an example of one event:
{"nativeHttpMethod":"POST","apiName":"OT-API","sourceGatewayNode":"10.0.23.27","customFields":{},"origin":null,"sourceGatewayDetails":null,"planName":null,"httpMethod":"post","responseCode":"500","cachedResponse":"Not-Cached","apiVersion":"1.0","messageType":null,"queryParameters":{},"nativeRequestHeaders":{"content-length":"6917","Accept":"*/*","User-Agent":"PostmanRuntime/7.1.5","Postman-Token":"87d68438-4111-4ae6-8171-0fe0d6748e1a","cache-control":"no-cache","accept-encoding":"gzip, deflate","Content-Type":"application/json"},"nativeResponseHeaders":{"Transfer-Encoding":"chunked","Server":"Microsoft-IIS/10.0","Date":"Thu, 01 May 2025 06:16:04 GMT","Content-Type":"application/json; charset=utf-8","X-Powered-By":"ASP.NET"},"errorOrigin":"NATIVE","totalDataSize":7051,"planId":null,"nativeRequestPayload":"{\r\n \"mapProfile\": \"ConfirmAbs\",\r\n \"keyMap\": [\r\n {\r\n \"AbsName\": \"מילואים - תעש\",\r\n \"EmployeeNumber\": \"1324332\",\r\n \"Period\": \"1202410\",\r\n \"StartDate\": \"28/10/2024\",\r\n \"EndDate\": \"31/10/2024\",\r\n \"WorkLocation\": \"1רמה\\\"ש חמח\",\r\n \"Secretary\": \"1אדרי טל\",\r\n \"AccumCode\": \"17\"\r\n }\r\n ],\r\n \"fileName\": \"123_4311.pdf\",\r\n \"file\": 
\"CldoZXJlIGRvZXMgaXQgY29tZSBmcm9tPwpDb250cmFyeSB0byBwb3B1bGFyIGJlbGllZiwgTG9yZW0gSXBzdW0gaXMgbm90IHNpbXBseSByYW5kb20gdGV4dC4gSXQgaGFzIHJvb3RzIGluIGEgcGllY2Ugb2YgY2xhc3NpY2FsIExhdGluIGxpdGVyYXR1cmUgZnJvbSA0NSBCQywgbWFraW5nIGl0IG92ZXIgMjAwMCB5ZWFycyBvbGQuIFJpY2hhcmQgTWNDbGludG9jaywgYSBMYXRpbiBwcm9mZXNzb3IgYXQgSGFtcGRlbi1TeWRuZXkgQ29sbGVnZSBpbiBWaXJnaW5pYSwgbG9va2VkIHVwIG9uZSBvZiB0aGUgbW9yZSBvYnNjdXJlIExhdGluIHdvcmRzLCBjb25zZWN0ZXR1ciwgZnJvbSBhIExvcmVtIElwc3VtIHBhc3NhZ2UsIGFuZCBnb2luZyB0aHJvdWdoIHRoZSBjaXRlcyBvZiB0aGUgd29yZCBpbiBjbGFzc2ljYWwgbGl0ZXJhdHVyZSwgZGlzY292ZXJlZCB0aGUgdW5kb3VidGFibGUgc291cmNlLiBMb3JlbSBJcHN1bSBjb21lcyBmcm9tIHNlY3Rpb25zIDEuMTAuMzIgYW5kIDEuMTAuMzMgb2YgImRlIEZpbmlidXMgQm9ub3J1bSBldCBNYWxvcnVtIiAoVGhlIEV4dHJlbWVzIG9mIEdvb2QgYW5kIEV2aWwpIGJ5IENpY2Vybywgd3JpdHRlbiBpbiA0NSBCQy4gVGhpcyBib29rIGlzIGEgdHJlYXRpc2Ugb24gdGhlIHRoZW9yeSBvZiBldGhpY3MsIHZlcnkgcG9wdWxhciBkdXJpbmcgdGhlIFJlbmFpc3NhbmNlLiBUaGUgZmlyc3QgbGluZSBvZiBMb3JlbSBJcHN1bSwgIkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0Li4iLCBjb21lcyBmcm9tIGEgbGluZSBpbiBzZWN0aW9uIDEuMTAuMzIuCgpUaGUgc3RhbmRhcmQgY2h1bmsgb2YgTG9yZW0gSXBzdW0gdXNlZCBzaW5jZSB0aGUgMTUwMHMgaXMgcmVwcm9kdWNlZCBiZWxvdyBmb3IgdGhvc2UgaW50ZXJlc3RlZC4gU2VjdGlvbnMgMS4xMC4zMiBhbmQgMS4xMC4zMyBmcm9tICJkZSBGaW5pYnVzIEJvbm9ydW0gZXQgTWFsb3J1bSIgYnkgQ2ljZXJvIGFyZSBhbHNvIHJlcHJvZHVjZWQgaW4gdGhlaXIgZXhhY3Qgb3JpZ2luYWwgZm9ybSwgYWNjb21wYW5pZWQgYnkgRW5nbGlzaCB2ZXJzaW9ucyBmcm9tIHRoZSAxOTE0IHRyYW5zbGF0aW9uIGJ5IEguIFJhY2toYW0uCgpXaGVyZSBjYW4gSSBnZXQgc29tZT8KVGhlcmUgYXJlIG1hbnkgdmFyaWF0aW9ucyBvZiBwYXNzYWdlcyBvZiBMb3JlbSBJcHN1bSBhdmFpbGFibGUsIGJ1dCB0aGUgbWFqb3JpdHkgaGF2ZSBzdWZmZXJlZCBhbHRlcmF0aW9uIGluIHNvbWUgZm9ybSwgYnkgaW5qZWN0ZWQgaHVtb3VyLCBvciByYW5kb21pc2VkIHdvcmRzIHdoaWNoIGRvbid0IGxvb2sgZXZlbiBzbGlnaHRseSBiZWxpZXZhYmxlLiBJZiB5b3UgYXJlIGdvaW5nIHRvIHVzZSBhIHBhc3NhZ2Ugb2YgTG9yZW0gSXBzdW0sIHlvdSBuZWVkIHRvIGJlIHN1cmUgdGhlcmUgaXNuJ3QgYW55dGhpbmcgZW1iYXJyYXNzaW5nIGhpZGRlbiBpbiB0aGUgbWlkZGxlIG9mIHRleHQuIEFsbCB0aGUgTG9yZW0gSXBzdW0gZ2VuZXJhdG9ycyBvbiB0aGUgSW50ZXJuZXQgdGVuZC
B0byByZXBlYXQgcHJlZGVmaW5lZCBjaHVua3MgYXMgbmVjZXNzYXJ5LCBtYWtpbmcgdGhpcyB0aGUgZmlyc3QgdHJ1ZSBnZW5lcmF0b3Igb24gdGhlIEludGVybmV0LiBJdCB1c2VzIGEgZGljdGlvbmFyeSBvZiBvdmVyIDIwMCBMYXRpbiB3b3JkcywgY29tYmluZWQgd2l0aCBhIGhhbmRmdWwgb2YgbW9kZWwgc2VudGVuY2Ugc3RydWN0dXJlcywgdG8gZ2VuZXJhdGUgTG9yZW0gSXBzdW0gd2hpY2ggbG9va3MgcmVhc29uYWJsZS4gVGhlIGdlbmVyYXRlZCBMb3JlbSBJcHN1bSBpcyB0aGVyZWZvcmUgYWx3YXlzIGZyZWUgZnJvbSByZXBldGl0aW9uLCBpbmplY3RlZCBodW1vdXIsIG9yIG5vbi1jaGFyYWN0ZXJpc3RpYyB3b3JkcyBldAoKV2hlcmUgZG9lcyBpdCBjb21lIGZyb20/CkNvbnRyYXJ5IHRvIHBvcHVsYXIgYmVsaWVmLCBMb3JlbSBJcHN1bSBpcyBub3Qgc2ltcGx5IHJhbmRvbSB0ZXh0LiBJdCBoYXMgcm9vdHMgaW4gYSBwaWVjZSBvZiBjbGFzc2ljYWwgTGF0aW4gbGl0ZXJhdHVyZSBmcm9tIDQ1IEJDLCBtYWtpbmcgaXQgb3ZlciAyMDAwIHllYXJzIG9sZC4gUmljaGFyZCBNY0NsaW50b2NrLCBhIExhdGluIHByb2Zlc3NvciBhdCBIYW1wZGVuLVN5ZG5leSBDb2xsZWdlIGluIFZpcmdpbmlhLCBsb29rZWQgdXAgb25lIG9mIHRoZSBtb3JlIG9ic2N1cmUgTGF0aW4gd29yZHMsIGNvbnNlY3RldHVyLCBmcm9tIGEgTG9yZW0gSXBzdW0gcGFzc2FnZSwgYW5kIGdvaW5nIHRocm91Z2ggdGhlIGNpdGVzIG9mIHRoZSB3b3JkIGluIGNsYXNzaWNhbCBsaXRlcmF0dXJlLCBkaXNjb3ZlcmVkIHRoZSB1bmRvdWJ0YWJsZSBzb3VyY2UuIExvcmVtIElwc3VtIGNvbWVzIGZyb20gc2VjdGlvbnMgMS4xMC4zMiBhbmQgMS4xMC4zMyBvZiAiZGUgRmluaWJ1cyBCb25vcnVtIGV0IE1hbG9ydW0iIChUaGUgRXh0cmVtZXMgb2YgR29vZCBhbmQgRXZpbCkgYnkgQ2ljZXJvLCB3cml0dGVuIGluIDQ1IEJDLiBUaGlzIGJvb2sgaXMgYSB0cmVhdGlzZSBvbiB0aGUgdGhlb3J5IG9mIGV0aGljcywgdmVyeSBwb3B1bGFyIGR1cmluZyB0aGUgUmVuYWlzc2FuY2UuIFRoZSBmaXJzdCBsaW5lIG9mIExvcmVtIElwc3VtLCAiTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQuLiIsIGNvbWVzIGZyb20gYSBsaW5lIGluIHNlY3Rpb24gMS4xMC4zMi4KClRoZSBzdGFuZGFyZCBjaHVuayBvZiBMb3JlbSBJcHN1bSB1c2VkIHNpbmNlIHRoZSAxNTAwcyBpcyByZXByb2R1Y2VkIGJlbG93IGZvciB0aG9zZSBpbnRlcmVzdGVkLiBTZWN0aW9ucyAxLjEwLjMyIGFuZCAxLjEwLjMzIGZyb20gImRlIEZpbmlidXMgQm9ub3J1bSBldCBNYWxvcnVtIiBieSBDaWNlcm8gYXJlIGFsc28gcmVwcm9kdWNlZCBpbiB0aGVpciBleGFjdCBvcmlnaW5hbCBmb3JtLCBhY2NvbXBhbmllZCBieSBFbmdsaXNoIHZlcnNpb25zIGZyb20gdGhlIDE5MTQgdHJhbnNsYXRpb24gYnkgSC4gUmFja2hhbS4KCldoZXJlIGNhbiBJIGdldCBzb21lPwpUaGVyZSBhcmUgbWFueSB2YXJpYXRpb25zIG9mIH
Bhc3NhZ2VzIG9mIExvcmVtIElwc3VtIGF2YWlsYWJsZSwgYnV0IHRoZSBtYWpvcml0eSBoYXZlIHN1ZmZlcmVkIGFsdGVyYXRpb24gaW4gc29tZSBmb3JtLCBieSBpbmplY3RlZCBodW1vdXIsIG9yIHJhbmRvbWlzZWQgd29yZHMgd2hpY2ggZG9uJ3QgbG9vayBldmVuIHNsaWdodGx5IGJlbGlldmFibGUuIElmIHlvdSBhcmUgZ29pbmcgdG8gdXNlIGEgcGFzc2FnZSBvZiBMb3JlbSBJcHN1bSwgeW91IG5lZWQgdG8gYmUgc3VyZSB0aGVyZSBpc24ndCBhbnl0aGluZyBlbWJhcnJhc3NpbmcgaGlkZGVuIGluIHRoZSBtaWRkbGUgb2YgdGV4dC4gQWxsIHRoZSBMb3JlbSBJcHN1bSBnZW5lcmF0b3JzIG9uIHRoZSBJbnRlcm5ldCB0ZW5kIHRvIHJlcGVhdCBwcmVkZWZpbmVkIGNodW5rcyBhcyBuZWNlc3NhcnksIG1ha2luZyB0aGlzIHRoZSBmaXJzdCB0cnVlIGdlbmVyYXRvciBvbiB0aGUgSW50ZXJuZXQuIEl0IHVzZXMgYSBkaWN0aW9uYXJ5IG9mIG92ZXIgMjAwIExhdGluIHdvcmRzLCBjb21iaW5lZCB3aXRoIGEgaGFuZGZ1bCBvZiBtb2RlbCBzZW50ZW5jZSBzdHJ1Y3R1cmVzLCB0byBnZW5lcmF0ZSBMb3JlbSBJcHN1bSB3aGljaCBsb29rcyByZWFzb25hYmxlLiBUaGUgZ2VuZXJhdGVkIExvcmVtIElwc3VtIGlzIHRoZXJlZm9yZSBhbHdheXMgZnJlZSBmcm9tIHJlcGV0aXRpb24sIGluamVjdGVkIGh1bW91ciwgb3Igbm9uLWNoYXJhY3RlcmlzdGljIHdvcmRzIGV0CgoKV2hlcmUgZG9lcyBpdCBjb21lIGZyb20/CkNvbnRyYXJ5IHRvIHBvcHVsYXIgYmVsaWVmLCBMb3JlbSBJcHN1bSBpcyBub3Qgc2ltcGx5IHJhbmRvbSB0ZXh0LiBJdCBoYXMgcm9vdHMgaW4gYSBwaWVjZSBvZiBjbGFzc2ljYWwgTGF0aW4gbGl0ZXJhdHVyZSBmcm9tIDQ1IEJDLCBtYWtpbmcgaXQgb3ZlciAyMDAwIHllYXJzIG9sZC4gUmljaGFyZCBNY0NsaW50b2NrLCBhIExhdGluIHByb2Zlc3NvciBhdCBIYW1wZGVuLVN5ZG5leSBDb2xsZWdlIGluIFZpcmdpbmlhLCBsb29rZWQgdXAgb25lIG9mIHRoZSBtb3JlIG9ic2N1cmUgTGF0aW4gd29yZHMsIGNvbnNlY3RldHVyLCBmcm9tIGEgTG9yZW0gSXBzdW0gcGFzc2FnZSwgYW5kIGdvaW5nIHRocm91Z2ggdGhlIGNpdGVzIG9mIHRoZSB3b3JkIGluIGNsYXNzaWNhbCBsaXRlcmF0dXJlLCBkaXNjb3ZlcmVkIHRoZSB1bmRvdWJ0YWJsZSBzb3VyY2UuIExvcmVtIElwc3VtIGNvbWVzIGZyb20gc2VjdGlvbnMgMS4xMC4zMiBhbmQgMS4xMC4zMyBvZiAiZGUgRmluaWJ1cyBCb25vcnVtIGV0IE1hbG9ydW0iIChUaGUgRXh0cmVtZXMgb2YgR29vZCBhbmQgRXZpbCkgYnkgQ2ljZXJvLCB3cml0dGVuIGluIDQ1IEJDLiBUaGlzIGJvb2sgaXMgYSB0cmVhdGlzZSBvbiB0aGUgdGhlb3J5IG9mIGV0aGljcywgdmVyeSBwb3B1bGFyIGR1cmluZyB0aGUgUmVuYWlzc2FuY2UuIFRoZSBmaXJzdCBsaW5lIG9mIExvcmVtIElwc3VtLCAiTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQuLiIsIGNvbWVzIGZyb20gYSBsaW5lIGluIHNlY3Rpb2
4gMS4xMC4zMi4KClRoZSBzdGFuZGFyZCBjaHVuayBvZiBMb3JlbSBJcHN1bSB1c2VkIHNpbmNlIHRoZSAxNTAwcyBpcyByZXByb2R1Y2VkIGJlbG93IGZvciB0aG9zZSBpbnRlcmVzdGVkLiBTZWN0aW9ucyAxLjEwLjMyIGFuZCAxLjEwLjMzIGZyb20gImRlIEZpbmlidXMgQm9ub3J1bSBldCBNYWxvcnVtIiBieSBDaWNlcm8gYXJlIGFsc28gcmVwcm9kdWNlZCBpbiB0aGVpciBleGFjdCBvcmlnaW5hbCBmb3JtLCBhY2NvbXBhbmllZCBieSBFbmdsaXNoIHZlcnNpb25zIGZyb20gdGhlIDE5MTQgdHJhbnNsYXRpb24gYnkgSC4gUmFja2hhbS4KCldoZXJlIGNhbiBJIGdldCBzb21lPw==\",\r\n \"mapCompany\": \"152\",\r\n \"fileAuthor\": \"ecmuser\",\r\n \"fileType\": \"Document\"\r\n}","nativeResponsePayload":"{\"message\":\"Failed to make REST call. Type : POST, URL : /v2/nodes, Error : {\\r\\n \\\"error\\\": \\\"An item with the name '123_4311.pdf' already exists.\\\"\\r\\n}\"}","packageName":null,"providerTime":2962,"isCallbackRequest":false,"apiId":"78924a81-0e30-47fb-a512-3f3372333a18","applicationName":"SAG","applicationIp":"10.0.112.240","resPayload":"{\"message\":\"Failed to make REST call. Type : POST, URL : /v2/nodes, Error : {\\r\\n \\\"error\\\": \\\"An item with the name '123_4311.pdf' already exists.\\\"\\r\\n}\"}","totalTime":2964,"packageId":null,"operationName":"/api/OTFiles/UploadFile","eventType":"Transactional","creationDate":1732704809440,"requestHeaders":{"content-length":"6917","x-Gateway-APIKey":"**************","Accept":"*/*","User-Agent":"PostmanRuntime/7.1.5","Connection":"keep-alive","Postman-Token":"87d68438-4111-4ae6-8171-0fe0d6748e1a","Host":"TINTAPIGW02.elbitsystems.com:5443","cache-control":"no-cache","accept-encoding":"gzip, deflate","Content-Type":"application/json"},"responseHeaders":{"Transfer-Encoding":"chunked","Server":"Microsoft-IIS/10.0","Date":"Thu, 01 May 2025 06:16:04 GMT","Content-Type":"application/json; 
charset=utf-8","X-Powered-By":"ASP.NET"},"sourceGateway":"APIGateway","externalCalls":[{"externalCallType":"NATIVE_SERVICE_CALL","externalURL":"https://twebwses.elbitsystems.com/OT-API/api/OTFiles/UploadFile","callStartTime":1732704809441,"callEndTime":1732704812403,"callDuration":2962,"responseCode":"500"}],"correlationID":"APIGW:8de36863-18bc-4d00-812e-bed164d66a3b:26912","applicationId":"791f2759-d36d-4fd9-9695-e8797a19b990","reqPayload":"{\r\n \"mapProfile\": \"ConfirmAbs\",\r\n \"keyMap\": [\r\n {\r\n \"AbsName\": \"מילואים - תעש\",\r\n \"EmployeeNumber\": \"1324332\",\r\n \"Period\": \"1202410\",\r\n \"StartDate\": \"28/10/2024\",\r\n \"EndDate\": \"31/10/2024\",\r\n \"WorkLocation\": \"1רמה\\\"ש חמח\",\r\n \"Secretary\": \"1אדרי טל\",\r\n \"AccumCode\": \"17\"\r\n }\r\n ],\r\n \"fileName\": \"123_4311.pdf\",\r\n \"file\": \"CldoZXJlIGRvZXMgaXQgY29tZSBmcm9tPwpDb250cmFyeSB0byBwb3B1bGFy


PickleRick
SplunkTrust

With a transform like this you can only extract an indexed field (if the regex matches), not modify the original event (except perhaps by overwriting _raw).

You're looking for the SEDCMD functionality. I'd also slightly modify your regex, since you're matching base64-encoded content, which can contain neither a backslash nor a quote.

SEDCMD-trim-file = s/(\\"file\\":\s*\\")([^\\"]{5000,}?)/\1long_file/g

See it here:

https://regex101.com/r/8nX7FY/1

(The regex101 substitution uses a slightly different format from SEDCMD: $1 instead of \1.)

dorHerbesman
Explorer

Is this a props.conf/transforms.conf setting or a Splunk search command? The goal is to remove or alter the field before it is indexed in Splunk.


PickleRick
SplunkTrust

https://docs.splunk.com/Documentation/Splunk/Latest/Admin/Propsconf#Field_extraction_configuration 

SEDCMD-<class> = <sed script>
* Only used at index time.
* Commonly used to anonymize incoming data at index time, such as credit
  card or social security numbers. For more information, search the online
  documentation for "anonymize data."
* Used to specify a sed script which Splunk software applies to the _raw
  field.
* A sed script is a space-separated list of sed commands. Currently the
  following subset of sed commands is supported:
    * replace (s) and character substitution (y).
* Syntax:
    * replace - s/regex/replacement/flags
      * regex is a perl regular expression (optionally containing capturing
        groups).
      * replacement is a string to replace the regex match. Use \n for back
        references, where "n" is a single digit.
      * flags can be either: g to replace all matches, or a number to
        replace a specified match.
    * substitute - y/string1/string2/
      * substitutes the string1[i] with string2[i]
* No default.
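
For readers who want to experiment with these two commands outside Splunk, here is a rough Python analogy. It is only a sketch: Python's re module stands in for Splunk's regex engine, the sample text and the replace_nth helper are made up for illustration, and the numeric flag is assumed to mean "replace only the n-th match".

```python
import re

# Hypothetical sample text, not taken from the thread.
text = "card=1234-5678-9012-3456 user=bob"

# Analogue of s/regex/replacement/g: replace every match; \1 is a back reference.
masked = re.sub(r'(card=)\d{4}-\d{4}-\d{4}', r'\1xxxx-xxxx-xxxx', text)

# Analogue of s/regex/replacement/2: replace only the n-th match
# (assumed meaning of the numeric flag; emulated here with a counter).
def replace_nth(pattern, repl, s, n):
    count = 0

    def _sub(m):
        nonlocal count
        count += 1
        return m.expand(repl) if count == n else m.group(0)

    return re.sub(pattern, _sub, s)

# Analogue of y/string1/string2/: map string1[i] to string2[i].
swapped = "a-b-c".translate(str.maketrans("abc", "xyz"))

print(masked)
print(replace_nth(r'\d+', 'N', 'a1 b2 c3', 2))
print(swapped)
```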

 

 

dorHerbesman
Explorer

That's a good direction! Unfortunately it's still not working 100%. I used your code in my props.conf:

[APIGW]
SEDCMD-trim-file = s/(\\"file\\":\s*\\")([^\\"]{5000,}?)/\1long_file/g


And here are the results:

[Screenshot: dorHerbesman_0-1733129786341.png]

It looks like it only replaces the first 5000 characters instead of the entire field, but this is a big step in the right direction. Thank you for your help!
I will try to take it from here, but I would greatly appreciate it if you have a solution in mind and can share it.

EDIT:
From a few tests I ran, the replacement stops exactly after 5000 characters instead of continuing to the first comma / end of the field.

EDIT2:
The regex that was needed was:

SEDCMD-trim-file = s/(\\"file\\":\s*\\")([^\\"]{5000,})(\\")/\1long_file/g


But thank you for all the help!
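
The difference between the two patterns can be reproduced outside Splunk. The sketch below uses Python's re module as a stand-in for SEDCMD's regex engine and a made-up 6000-character value, so treat it as an illustration only. It shows that the lazy `{5000,}?` with nothing after it stops at exactly 5000 characters, while the greedy version anchored on the closing \" consumes the whole value. It also shows that the greedy replacement as posted drops the closing \" (it is captured in group 3 but not written back); adding \3 to the replacement is a suggestion, not something confirmed in the thread.

```python
import re

# Hypothetical sample: an escaped "file" field with a 6000-character value.
payload = "A" * 6000
raw = (r'\"fileName\": \"a.pdf\", \"file\": \"' + payload +
       r'\",\r\n \"mapCompany\": \"152\"')

# Lazy quantifier with nothing after it: matches the minimum, 5000 characters.
lazy = re.compile(r'(\\"file\\":\s*\\")([^\\"]{5000,}?)')
# Greedy quantifier anchored on the closing \": matches the whole value.
greedy = re.compile(r'(\\"file\\":\s*\\")([^\\"]{5000,})(\\")')

lazy_result = lazy.sub(r'\1long_file', raw)
greedy_result = greedy.sub(r'\1long_file', raw)        # closing \" is dropped
greedy_keep_quote = greedy.sub(r'\1long_file\3', raw)  # closing \" is kept

print(len(lazy_result), len(greedy_result), len(greedy_keep_quote))
```

If the closing quote matters for later JSON parsing of the payload, the equivalent SEDCMD replacement would presumably be \1long_file\3.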

EDIT3:
Well, apparently this solution alone is not enough. I also had to increase the TRUNCATE value. SEDCMD runs after truncation: Splunk first receives only the default 10,000 bytes of the event and only then performs the replacement, so the final result is still a truncated event. I needed to increase TRUNCATE so that Splunk receives the entire event and then does the replacement.
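
The ordering problem can be simulated in a few lines. This is a sketch under assumptions, not Splunk's actual pipeline: the index_time function is hypothetical, the event is ASCII-only so characters and bytes coincide, and the pattern is the greedy one from EDIT2 with \3 added to keep the closing quote. With a 10,000 limit the closing \" of the field is cut off, so the pattern never matches and the event stays truncated; with a limit above the event size the field is replaced and the trailing fields survive.

```python
import re

# Greedy pattern from EDIT2; \3 in the replacement keeps the closing \" (an assumption).
SED = re.compile(r'(\\"file\\":\s*\\")([^\\"]{5000,})(\\")')

def index_time(raw: str, truncate: int) -> str:
    """Cut the event to `truncate` characters (0 = no limit), then substitute."""
    line = raw if truncate == 0 else raw[:truncate]
    return SED.sub(r'\1long_file\3', line)

# Hypothetical ASCII-only event: 2,000 characters of leading fields,
# a 12,000-character "file" value, then a trailing field.
raw = ("x" * 2000 + r'\"file\": \"' + "A" * 12000 +
       r'\",\r\n \"mapCompany\": \"152\"')

default_result = index_time(raw, 10000)  # limit hit inside the base64 value
raised_result = index_time(raw, 50000)   # limit above the event size

print(len(default_result), len(raised_result))
```

In props.conf terms this corresponds to setting TRUNCATE on the same sourcetype stanza to a value larger than the biggest expected event (or 0 to disable truncation), alongside the SEDCMD line.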


richgalloway
SplunkTrust

The REGEX does not match the sample data because backslashes must be escaped. Try:

REGEX = \\"file\\":\s*\\"(.{5000,}?),"
---
If this reply helps you, Karma would be appreciated.
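
A quick way to check the escaping point is to run both patterns against a fragment shaped like the sample event. The sketch below uses Python's re module in place of Splunk's engine and a shortened, made-up fragment with a 6000-character value, so it is an approximation only. The original pattern fails because it expects a bare quote before the value, while _raw contains a backslash followed by a quote; the escaped pattern matches.

```python
import re

# Hypothetical raw fragment: quotes inside the payload appear as \" in _raw.
raw = (r'{"nativeRequestPayload":"{\r\n \"file\": \"' + "A" * 6000 +
       r'\",\r\n \"mapCompany\": \"152\"\r\n}","apiName":"OT-API"}')

# Pattern from the original transforms.conf attempt.
original = re.compile(r'\"file\\\":\s*\"(.{5000,}?),"')
# Pattern with every literal backslash escaped.
escaped = re.compile(r'\\"file\\":\s*\\"(.{5000,}?),"')

original_match = original.search(raw)
escaped_match = escaped.search(raw)

print(original_match is None)
print(len(escaped_match.group(1)))
```

Note that in this fragment the capture runs past the end of the "file" value: the value is followed by a comma and \r, not a comma and a quote, so the lazy `.` keeps going until the next `,"`. Restricting the class to characters other than backslash and quote, as in the SEDCMD pattern, avoids that.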