Trying to strip the header info out of the event below, leaving only the JSON. I've tried "|extract reload=true" but neither that nor restarting Splunk seems to be working. Must be something with my syntax. This example is trying to remove the first 2 lines (for sake of simplicity in getting it to work)
props.conf:
[akamai_post_json]
SEDCMD-httpheader = s/(?mg)^POST.*$\n|^User-Agent.*$\n|//g
The event:
POST / HTTP/1.1
User-Agent: curl/7.26.0
Host: localhost
Accept: */*
Content-Length: 2552
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------0b1c32056fc5
------------------------------0b1c32056fc5
Content-Disposition: form-data; name="fileupload"; filename="default_connector_schema_1.0.json"
Content-Type: application/octet-stream
{
"apiType" : "String",
"apiFormat" : "String",
"apiVersion" : 0,
"ID" : "String",
"startTime" : "String",
"eventType" : "String",
"cpCode" : 0,
"message" : {
"protocol" : "0",
"protoVersion" : 0,
"clientIP" : "String",
"reqPort" : 0,
"reqHost" : "String",
"reqMethod" : "String",
"reqPath" : "String",
"reqQuery" : "String",
"reqContType" : "String",
"reqContLen" : 0,
"sslProtocol" : "String",
"sslVersion" : 0,
"respStatus" : 0,
"respRedirURL" : "String",
"respContType" : "String",
"respContLen" : 0,
"respBytesServed" : 0,
"userAgent" : "String",
"originHostname" : "String"
},
"httpHeaders" : {
"reqHeader" : {
"accEnc" : "String",
"accLang" : "String",
"auth" : "String",
"cacheCtl" : "String",
"connection" : "String",
"contMD5" : "String",
"cookie" : "String",
"DNT" : "String",
"ifMatch" : "String",
"ifMod" : "String",
"ifNoMatch" : "String",
"pragma" : "String",
"range" : "String",
"referer" : "String",
"TE" : "String",
"upgrade" : "String",
"via" : "String",
"xFrwdFor" : "String",
"xReqWith" : "String"
},
"respHeader" : {
"cacheCtl" : "String",
"connection" : "String",
"contEnc" : "String",
"contLang" : "String",
"contLen" : "String",
"contMD5" : "String",
"contDisp" : "String",
"contRange" : "String",
"date" : "String",
"eTag" : "String",
"expires" : "String",
"lastMod" : "String",
"p3p" : "String",
"pragma" : "String",
"server" : "String",
"setCookie" : "String",
"trailer" : "String",
"transEnc" : "String",
"vary" : "String",
"warning" : "String",
"wwwAuth" : "String"
}
},
"performance" : {
"reqHeadSize" : 0,
"reqBodySize" : 0,
"respHeadSize" : 0,
"respBodySize" : "String",
"downloadTime" : "String",
"originName" : "String",
"originIP" : "String",
"originInitIP" : "String",
"originRetry" : 0,
"lastMileRTT" : 0,
"lastMileBW" : 0,
"netOriginRTT" : 0,
"cacheStatus" : "String",
"lastByte" : true,
"cliCountry" : "String",
"edgeIP" : "String",
"reqID" : "String"
}
}
------------------------------0b1c32056fc5--
To strip the whole HTTP header, the following regex should work:
SEDCMD-stripheader = s/^(?ms)POST.+?(\r?\n){2}//g
And you have to restart splunkd, since that setting affects indexing behavior.
This worked. You rule.
Thanks
{"log":"{\"serviceName\":\"xxxxx\",\"ipAddress\":\"\",\"timestamp\":\"2019-02-08T16:06:02.766+0000\",\"traceId\":\"\",\"level\":\"INFO\",\"logger\":\"yyyyyyyApplication\",\"message\":\"Started yyyyyApplication in 23.332 seconds (JVM running for 24.707)\",\"stack\":\"\",\"timeTaken\":\"\"}\n","stream":"stdout","time":"2019-02-08T16:06:02.767236274Z"}
What could be the sedcmd used for this? The problem with this one is the nested log isn't being recognized as a json. I believe the reason is because of \n in the log.
Please correct me if I am wrong and help me on this.
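One way to test the "\n" theory outside Splunk is a quick Python check. This is only a sketch: the event is abridged from the one above, the variable names are made up for illustration, and it says nothing definitive about how Splunk's own JSON extraction behaves. It does show that, for a standard JSON parser, the outer object parses fine, the value of "log" is a plain string rather than a nested object, and the trailing newline inside that string does not stop a second parse.

```python
import json

# Abridged copy of the event from the post (fields trimmed for brevity).
raw_event = r'{"log":"{\"serviceName\":\"xxxxx\",\"level\":\"INFO\",\"stack\":\"\"}\n","stream":"stdout","time":"2019-02-08T16:06:02.767236274Z"}'

outer = json.loads(raw_event)

# The nested document arrives as a string, not as a JSON object.
log_value = outer["log"]
print(type(log_value).__name__)

# The escaped \n becomes a real trailing newline, which JSON treats as
# insignificant whitespace, so a second parse still succeeds.
inner = json.loads(log_value)
print(inner["level"])
```

If that holds for your data too, the thing to address is the string-encoded inner document (a second extraction pass over the "log" field), not the newline itself.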
So the rest of the event you're seeing is the actual (multipart encoded) HTTP body.
I'd suggest using another substitution in the SEDCMD to eliminate the multipart boundaries, e.g.:
SEDCMD-stripheader = s/^(?ms)POST.+?(\r?\n){2}//g s/-{30}\V+//g
The regex matches "POST" at the beginning of the event up until two consecutive CRLFs (newlines) are found; \r\n\r\n is the termination of the header in the HTTP protocol.
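To see that breakdown in action, here is a minimal Python sketch of the same pattern. Assumptions: Python's re module stands in for Splunk's regex engine, the inline flags are moved to the front of the pattern because Python requires that, and the sample event is an abridged copy of the one in the question with the blank lines written out explicitly.

```python
import re

# Abridged event; the empty line that ends the HTTP header is an assumption
# about the raw data, written out here as "\r\n\r\n".
event = (
    "POST / HTTP/1.1\r\n"
    "User-Agent: curl/7.26.0\r\n"
    "Host: localhost\r\n"
    "\r\n"
    + "-" * 30 + "0b1c32056fc5\r\n"
    'Content-Disposition: form-data; name="fileupload"\r\n'
    "\r\n"
    '{ "apiType" : "String" }\r\n'
)

# (?s) lets "." cross line breaks, ".+?" is lazy, and (\r?\n){2} stops at the
# first pair of consecutive newlines, i.e. the end of the HTTP header.
header = re.compile(r"(?ms)^POST.+?(\r?\n){2}")

stripped = header.sub("", event)
print(stripped)
```

The match ends at the first blank line, so the multipart boundary and the Content-Disposition line are left behind. That is consistent with what was reported in this thread after the header was stripped.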
Maybe you could break down what this is doing and I can figure it out from there:
SEDCMD-stripheader = s/^(?ms)POST.+?(\r?\n){2}//g
and the line at the end....
------------------------------0a7d7d9180f4
At least it's progress lol
Wow. Close!
Now I just have to get rid of:
------------------------------0a7d7d9180f4
Content-Disposition: form-data; name="fileupload"; filename="default_connector_schema_1.0.json"
Content-Type: application/octet-stream
My last guess is to additionally adjust the line-merging settings (also props.conf):
SHOULD_LINEMERGE=false
LINE_BREAKER=------------------------------\w+--([\r\n]+)
SEDCMD-stripheader = s/(?ms)^POST.+?(\r?\n){2}//g
Nope. I'm commenting out lines that don't work with #
only one line is active at once... freaking weird.
you didn't use both of them at the same time, did you?
Thanks. No luck. Doesn't seem to affect the event at all.
As noted above, I'm editing the conf, restarting Splunk from the web interface, then logging in and reloading my real-time search.
I know the props file is in the right spot because I can change POST to TEST with
SEDCMD-httpheader = s/(?gism)(POST)/TEST/g
Just guessing, but could it be those "\n"s? You have put sed into line-by-line (multiline) mode, so "$" now matches at end-of-line, and I doubt you need that extra newline in your sed command.
SEDCMD-httpheader = s/(?mg)^POST.*$|^User-Agent.*$//g
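For what it's worth, the effect of "$" in multiline mode can be checked in isolation with a small Python sketch. Assumptions: Python's re module is used as a stand-in for Splunk's engine, and the "g" inside "(?mg)" is dropped because Python rejects it as an inline flag (whether Splunk accepts it is not something this sketch checks).

```python
import re

lines = "POST / HTTP/1.1\nUser-Agent: curl/7.26.0\nHost: localhost\n"

# (?m) makes ^ and $ match at every line start and end, so this removes the
# text of each matching line but leaves the newline that follows it.
without_newline = re.sub(r"(?m)^POST.*$|^User-Agent.*$", "", lines)

# Consuming \n in each alternative removes the whole line instead.
with_newline = re.sub(r"(?m)^POST.*\n|^User-Agent.*\n", "", lines)

print(repr(without_newline))
print(repr(with_newline))
```

In this sketch the version without "\n" leaves two empty lines behind, while the version with "\n" removes the lines completely, so the extra newline is not redundant if the goal is to drop the lines entirely.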
Tried it and did nothing noticeable. I'm running real-time searches so make sure I'm getting the latest data.
I edit props.conf, restart splunk, and reload my real-time search page.
I was able to get POST to change to TEST by simply doing this:
SEDCMD-httpheader = s/(?gism)(POST)/TEST/g
Also, SEDCMD props entries only fire at index time, so they won't affect any previously indexed data.