Hello
Could someone please help with the questions below?
I have JSON logs with "image" and "sound" fields that hold very large values (nearly 100 KB). I want to drop only those fields and index the rest of each event.
1. Is there any way to do this?
2. Will masking work on large data? If so, is it possible to reduce the indexed data size by masking the large values with dummy values?
3. Basically, I need to remove those fields from each event before indexing. Please share any ideas.
I tried hiding the fields using props.conf and transforms.conf, but it did not work.
Sample log:
"captcha" : {
"image" : "xxxxxxxxxxx\r\nxxxxxxxxx\r\n(50kb values)",
"sound" : "xxxxxxxxxxxxxxx(50kb values)",
"answer" : "xxxxxx",
},
Thanks,
Mala
You should be able to achieve this with the same data anonymization techniques you would use for masking. Instead of substituting a mask, you can simply remove that portion of the event. You can anonymize data with either a regex transform or a sed script. With sed, you substitute that section with nothing.
https://docs.splunk.com/Documentation/Splunk/6.5.3/Data/Anonymizedata
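As a rough offline illustration of "substituting with nothing", the Python sketch below removes the two fields from a sample event with a regex. It is only a stand-in for what a sed-style substitution would do at parse time. The sample string, the pattern, and the drop_fields helper are assumptions made for illustration, and Python's re module is not Splunk's regex engine, so treat this as a way to reason about the pattern rather than as Splunk configuration.

```python
import re

# Hypothetical sample event modelled on the log in the question.
SAMPLE = (
    '"captcha" : {\n'
    '"image" : "xxxxxxxxxxx\\r\\nxxxxxxxxx\\r\\n",\n'
    '"sound" : "xxxxxxxxxxxxxxx",\n'
    '"answer" : "xxxxxx",\n'
    '},'
)

# Matches a whole `"image" : "...",` or `"sound" : "...",` line,
# including the trailing newline. Assumes the value contains no
# double quote, which holds for base64 data.
FIELD_LINE = re.compile(r'"(?:image|sound)" : "[^"]*",\n')


def drop_fields(event: str) -> str:
    """Return the event with the image and sound lines removed."""
    return FIELD_LINE.sub("", event)


if __name__ == "__main__":
    print(drop_fields(SAMPLE))
```

Removing the whole line, rather than masking the value, also drops the key, so the field will not appear at search time at all.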
Hi,
I tried the configuration below.
props.conf
[sourcetype::image]
TRANSFORMS-set = setnull
transforms.conf
[setnull]
REGEX = (?m)^(.)"image" : "(.)",(.*)$
FORMAT = $1"image" :"xxx",$3
DST_KEY = _raw
Thanks
Mala
Your REGEX pattern doesn't match your sample data, so Splunk never applies the replacement and the data is not reduced.
This regex string works on regex101.com: ("image" : ")(.*?)(",\n)("sound" : ")(.*?)(",)
It's not necessary for the regex string to match the entire event; only the part you wish to change needs to match.
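If you want to double-check a pattern like this outside regex101, a few lines of Python can do it. This is a minimal sketch, assuming an event shaped like the sample in the question (no leading whitespace, one space on each side of the colon, "sound" directly after "image"). The EVENT string is made up, and Python's re module stands in for Splunk's regex engine.

```python
import re

# The pattern suggested above, copied as-is.
PATTERN = re.compile(r'("image" : ")(.*?)(",\n)("sound" : ")(.*?)(",)')

# Hypothetical event shaped like the sample log in the question.
EVENT = (
    '"captcha" : {\n'
    '"image" : "aaaa\\r\\nbbbb\\r\\n",\n'
    '"sound" : "cccc",\n'
    '"answer" : "xxxxxx",\n'
    '},'
)

match = PATTERN.search(EVENT)
if match is None:
    print("pattern does not match")
else:
    # The match starts after the "captcha" line and stops before "answer",
    # so only part of the event is covered.
    print("matched span:", match.span(), "of", len(EVENT), "characters")
    print("image value:", match.group(2))
    print("sound value:", match.group(5))
```

If your real events are indented or the fields appear in a different order, the pattern would need adjusting before it matches.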
I just tried the transforms.conf below to hide the "image" value, but it is not working for me.
[setnull]
REGEX = ("image" : ")(.*?)(",\n)
FORMAT = $1xxx$2
DST_KEY = _raw
I tried the same with SEDCMD in props.conf, and that is not working either. Is there anything wrong with the line below?
SEDCMD-set = s/("image" : ")(.*?)(",\n)/xxx\2/g
Group 2 is the value you want to discard, so the replacement should keep groups 1 and 3 instead.
Try FORMAT = $1xxx$3
or SEDCMD-set = s/("image" : ")(.*?)(",\n)/\1xxx\3/g
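To see why the group numbers matter, here is a small Python sketch that applies both replacements to a shortened, made-up event. The EVENT string is an assumption, and Python's re.sub is used only to mimic the sed-style substitution, so the exact behaviour inside Splunk may differ.

```python
import re

# Same three groups as in the thread: key prefix, value, closing quote/comma/newline.
PATTERN = re.compile(r'("image" : ")(.*?)(",\n)')

# Hypothetical, shortened event; the real value would be about 50 KB.
EVENT = (
    '"captcha" : {\n'
    '"image" : "AAAA\\r\\nBBBB",\n'
    '"sound" : "CCCC",\n'
    '"answer" : "xxxxxx",\n'
    '},'
)

# Replacement from the failing attempt: keeps group 2, the value itself.
broken = PATTERN.sub(r'xxx\2', EVENT)

# Suggested replacement: keeps groups 1 and 3, discards the value.
fixed = PATTERN.sub(r'\1xxx\3', EVENT)

print(broken)
print(fixed)
```

The first replacement keeps the large value and drops the key, which is the opposite of the goal. The second keeps the key and the punctuation and swaps only the value.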
Below is the sample data:
"captcha" : {
"image" : "iVBORw0KGgoAAAANSUhEUgAAAJYAAAAoCAIAAACTo5SwAAAeIUlEQVR42u2cd1hTV/z/60Ac1Var\r\/\r\noZAEBJ9v///Xze5/M55+Srr/7v+v/gAtm/gOzxIHsC4P8G+L/W4vschX\r\ndIo7k+bDjgvic5GacsiPL2st7tUCU3sUBbIPHRBq1NU0hRC6II7QBBQuluUvbss2FXNMBHGL+KgR\r\nN9yQE2aE20JO2CJOOLTFnHBTbiSEt0KQsFacDuMn5LcLlO3BEe7XPMorK7dqzjlcFL8nnbqdg2xP\r\np+4qen2wOe+ErEKJ8BJECKqvyiqvtJVeFudeFHAgSGsudpRDs+LQjnFoxznYCWjc6NP8uPOC9Ovi\r\ngvttFQ4fq582F9sXpdxKZ1xRbRmFdk21ZfQmJ+YWN/YOP/GhIPO5uMS9rZb4URjYLCAUcZ0hRQ7z\r\nESf2MSf2CSf2KSfWkcN6xmE957CcuPGu/BQPQW6gWBDZKqRLmliyN/GSpvj3QmZ1ESUnLYibSOS8\r\n9uW8hq9+nAR/TkIgbsFcdiifSxEUssQ1nLbmnC7qaj0Rwn80EXYVRQ3aoyh0QSVCUGgqL1gqzV/W\r\nmr20ibtElLJEmLJUmLJcmLJCmLJSmGomTF0tTF0rSjNv4m1ozd0sLdouL4X8cBf8hPCg2lFeueDY\r\nhxJI0VKUcViYcVjEO9qca/2h7Iy88oIK4RWIEFRfk1ddl1Zeby291pR/RcS/JMy+LMy+Isy+JuRf\r\nF/JviHJuNhXebRXYS2ueyOsc5cJnH6ocm4sdRHn2ql2jT/Bdo46fdo0WOIsKXJpK3VurfaT1/nJx\r\nkFwc/EEY1CzwExX5CAt9hUUEYRFRWOSvsOJAYXGQsDhEVEpqqoxoFdKkTUz52zjw7rXiNPb7JNnb\r\nxLaG103VDJEgWlgOLUZYzhQKYoWCOKEgXih4LapMbKpLbW3Mkrbky9uKvmCBqT2KAvlHTYSd6mr/\r\nRtFPCBVRFOcHbQkoXAqKlssLV8oKVkjzzSQFqyQFayQF5gorXC8p3CAp3CQt2iIr3iov2Q5Kd4Cy\r\ndhdsF0KLjkd5QcVxucBaVm4tLTshKTspLT8tE5yVV55XCmFHhPhp+pvy6luyqtvSqtuSyjuSyruS\r\nyvuSKjtJ1QNptb2s1kFe9wQInylM5CQXOsnqXKR1rpJaaC8Vu0br3JW7RiV1XhKht1REkNX7yRv8\r\nQWMQaAwGjaHyxlCZmCxtIEvqyZKGcElDhKQhSiKmSMSIREyViCG5aFkzQ/6GCRT84vHT9AqE4H2y\r\n/F2y7G2K9G2q5A20NMkbruRthuQtT/I2U/IuW/qeL2vJU2yW6eXGbU0hVEPYlRAadBbCTy6II+zw\r\nhQiK09hrPp3GVn0hAijZDEq3gtJtoHQ7KNvZGaG6ECoRKr8QAVSeApWnQaUil8DtkxCqEF5XfiEC\r\nqLkNamxB7V3llkOA71cDdY9AnQMQQn5PcX7PIUIgcgEiV1D/EtS/AvVuQLHf0FO55RA0+AIxEYj9\r\ngDhAYQqE2r8QATSj4A0VvMHAGzp4EwPeMsBbyI8F2l2w4xci9OIob68XmNqF8F+E/3f9P339D+Dp\r\nIgw5+CH1AAAAAElFTkSuQmCC\r\n",
"sound" : "UklGRhQ2AgBXQVZFZm10IBAAAAABAAEAgD4AAAB9AAACABAAZGF0YfA1AgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAD+/wAAAgAAAAIAAQAAAAAAAgAABAAEACQAAAAAABAAAAAIAAgABAAAAAwAIAAUACAAIAAEABQD+/wUABADy/wAA+//4//P/8v/o/+n/7P/X/+b/4P/h/+P/3f/m/+v/5P/j/+j/3f/i/+T/3P/g/+j/4f/q/+T/4f/o/+b/7v/k/+j/8P/l/+n/6f/w/+//8f/x/+//9f/z//X/9f/2//f/9/8AAAQA/f/6//7/9f8AAAAA+v/+//3//P/9/////P///wAA/P////b/AAAAAPz////9//7//v8AAP3///8AAP7/AAD3/wAAAAD9/wAA/f////f/+/8AAP3////9//3//f////7/9v8BAP7//P8AAPz//v///wAA/f8AAP///v8AAP7///8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+P8CAAAA/v8AAPb/+//5//n/+P////3/+v////r//f/9//3//v/+//j/+P/6//b/AAD8//v/BAABAAIAAwD+/wcAEAAMABYADwARABUAFQAXABcAGwARAA8AGAALAA4AFwARABEAEQAQAA4AEQAPABcAFgAMABIAEAAQAAkACwATAAYACwAQAAwABAAHAAcA/P8CAAYAAQACAAIA8f/9/wMA+////wUA/P/9/wEAAwD9/wAAAQD///r/AQAAAP3/AQD//wcABAD+/wEAAAABAAEAAgD5/wMAAgD+//r/AAD5//j/BAD7//7///8FAPz//f8BAPz/AgD3/wMAAAD+/wEA/f8AAP3/+v8AAAAA/////wAA/v8AAAAA//8AAAAAAAAAAAcA/v8AAAIA//8BAAgABwAFAAkAAAACAP3//f8HAAAAAgAAAAEAAAAAAAAA//8BAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD4/wIAAAD+/wAA9//7//n/+f/4/////f/6////+v/9//3//f/+//7/9//4//r/9v8AAPz//P8EAAEAAgADAP7/BwAQAAwAFgAPABEAFQAVABcAFwAbABEADwAYAAsADgAXABEAEQARABAADgARAA8AFwAWAAwAEgAQABAACQALABMABgALABAADAAEAAcABwD8/wIABgABAAIAAgDx//3/AwD7////BQD8//3/AQADAP3/AAABAP//+v8BAAAA/f8BAP//BwAEAP7/AQAAAAEAAQACAPn/AwACAP7/+v8AAPn/+P8EAPv//v///wUA/P/9/wEA/P8CAPf/AwAAAP7/AQD9/wAA/f/6/wAAAAD/////AAD+/wAAAAD//wAAAAAAAAAABwD+/wAAAgD//wEACAAHAAUACQAAAAIA/f/9/wcAAAACAAAAAQAAAAAAAAD//wEAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAA+f/6//z/+P/5//f/5//h/+X/3//k/9//2//b/+H/4P/d/+L/3P/k/9r/5P/v/+f/6//j/+r/6P/o//P/7v/v/+//8//0//X/7v/3/+//7P/0//b/8P/0//v/+/////3/AQABAPv//v8HAP3//P8BAP7//////wEAAAAAAAAA//8BAP//AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgA/v8AAAMAAAABAAAAAgAAAAAAAAD5/wMAAAD9/wAAAAAAAAYA/////wMAAAAAAPn/BAAIAAIA//8AAP7/AAADAP//AQD///j/AQACAP7/BwAFAAsABQAHAA0AAAAGAP3/BgAEAAYACwAFAAwABwAVABIACgAPABYADQAUACMAGgAXABkAHwAUABgAGgANABEADgAPABYAEwARABMAFAASABQADAAMAA0ACgAKAAoACgAHAAgABwAH
AAcABQAGAAUABQAEAAQABAADAAMAAwADAAIA/P/9//7/+v/6//n/6f/i/+b/4P/l/+D/3f/c/+L/4f/e/+P/3f/l/9v/5f/v/+f/7P/j/+v/6f/p//P/7v/w/+//8//0//X/7v/4/+//7P/0//b/8P/0//v/+/////3/AQABAPv//v8HAP3//P8BAP7/AAD//wEAAAAAAAAA//8BAP//AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgA/v8AAAMAAAABAAAAAgAAAAAAAAD5/wMAAAD9/wAAAAAAAAYA/////wMAAAAAAPn/BAAIAAIA//8AAP7/AAADAP//AQD///j/AQACAPAAA",
"answer" : "xxxxxx",
"clear_screen" : false
},
Thanks for the link.
But sed substitution will not reduce the data size, right?
Is it possible to replace n characters of data with a short "xxx" string?
For example, "iVBORw0KGgoAAAANSUhEUgAAAJYAAAAoCAIAAACTo5SwAAAeIUlEQVR42u2cd1hTV/z/60Ac1Var\r\/\r\noZAEBJ9v" to "xxxx".
Please share any ideas for reducing the data size before indexing.
Yes, masking data with a sed substitution will reduce the data size, because the replacement is applied before the event is indexed. It lets you replace 50 KB of data with a few characters (like "xxxx").
There are examples in the link provided by kmorris.
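To get a feel for the size difference, the Python sketch below builds a made-up event with two 50,000-character payloads and compares its size before and after masking. The payload, the pattern, and the variable names are all assumptions for illustration, and the numbers describe this toy event only, not what Splunk reports for real data.

```python
import re

# Hypothetical event with two ~50 KB payloads, standing in for the real
# base64 image and sound values from the question.
payload = "A" * 50_000
event = (
    '"captcha" : {\n'
    f'"image" : "{payload}",\n'
    f'"sound" : "{payload}",\n'
    '"answer" : "xxxxxx",\n'
    '},'
)

# Keep the key and the closing quote, replace only the value with "xxxx".
mask = re.compile(r'("(?:image|sound)" : ")[^"]*(")')
masked = mask.sub(r'\1xxxx\2', event)

original_size = len(event.encode("utf-8"))
masked_size = len(masked.encode("utf-8"))
print(f"original: {original_size} bytes, masked: {masked_size} bytes")
```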
Can you post your props.conf and transforms.conf, please? This is the way it is done in Splunk, so either something is wrong in the configs or they were not on the parsing instance. See the docs http://docs.splunk.com/Documentation/Splunk/latest/Admin/Configurationparametersandthedatapipeline#H... for more details.
cheers, MuS
Data masking using props.conf and transforms.conf seems like the best way to go. What have you tried already? Perhaps we can help you get that to work.