<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Json without duplicates in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383126#M69013</link>
    <description>&lt;P&gt;But there is no method that, at the time of indexing, looks at two fields of the JSON and makes a hash or something so that these duplicates do not exist&lt;/P&gt;</description>
    <pubDate>Fri, 15 Feb 2019 07:04:29 GMT</pubDate>
    <dc:creator>Dherom</dc:creator>
    <dc:date>2019-02-15T07:04:29Z</dc:date>
    <item>
      <title>Json without duplicates</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383123#M69010</link>
      <description>&lt;P&gt;Good afternoon guys,&lt;/P&gt;

&lt;P&gt;We need help.&lt;/P&gt;

&lt;P&gt;We have a JSON file in which duplicate events are written.&lt;/P&gt;

&lt;P&gt;We want to know how to have a primary key so that it does not index those duplicates and is not in the Splunk index.&lt;/P&gt;

&lt;P&gt;{&lt;BR /&gt;
    "security": {&lt;BR /&gt;
        "notices": [&lt;BR /&gt;
            {&lt;BR /&gt;
                "rss_published": "2019-02-12T13:33:31.000Z",&lt;BR /&gt;
                "rss_message": "Email provider VFEmail has suffered what the company is calling \"catastrophic destruction\" at the hands of an as-yet unknown intruder who trashed all of the company's primary and backup data in the United States. The firm's founder says he ....",&lt;BR /&gt;
                "rss_fuente": "rss_krebsonsecurity",&lt;BR /&gt;
                "rss_title": "Email Provider VFEmail Suffers \u2018Catastrophic\u2019 Hack",&lt;BR /&gt;
                "rss_link": "&lt;A href="https://krebsonsecurity.com/2019/02/email-provider-vfemail-suffers-catastrophic-hack/" target="_blank"&gt;https://krebsonsecurity.com/2019/02/email-provider-vfemail-suffers-catastrophic-hack/&lt;/A&gt;"&lt;BR /&gt;
            }&lt;BR /&gt;
        ]&lt;BR /&gt;
    }&lt;BR /&gt;
}&lt;BR /&gt;
{&lt;BR /&gt;
    "security": {&lt;BR /&gt;
        "notices": [&lt;BR /&gt;
            {&lt;BR /&gt;
                "rss_published": "2019-02-12T13:33:31.000Z",&lt;BR /&gt;
                "rss_message": "Email provider VFEmail has suffered what the company is calling \"catastrophic destruction\" at the hands of an as-yet unknown intruder who trashed all of the company's primary and backup data in the United States. The firm's founder says he ....",&lt;BR /&gt;
                "rss_fuente": "rss_krebsonsecurity",&lt;BR /&gt;
                "rss_title": "Email Provider VFEmail Suffers \u2018Catastrophic\u2019 Hack",&lt;BR /&gt;
                "rss_link": "&lt;A href="https://krebsonsecurity.com/2019/02/email-provider-vfemail-suffers-catastrophic-hack/" target="_blank"&gt;https://krebsonsecurity.com/2019/02/email-provider-vfemail-suffers-catastrophic-hack/&lt;/A&gt;"&lt;BR /&gt;
            }&lt;BR /&gt;
        ]&lt;BR /&gt;
    }&lt;BR /&gt;
}&lt;BR /&gt;
{&lt;BR /&gt;
    "security": {&lt;BR /&gt;
        "notices": [&lt;BR /&gt;
            {&lt;BR /&gt;
                "rss_published": "2019-02-12T11:33:54.000Z",&lt;BR /&gt;
                "rss_message": "El fallo afecta a otros productos derivados de Docker que usan runc y al propio LXC, permitiendo acceder a la m\u00e1quina host con permisos de superusuario. Los investigadores Adam Iwaniuk y Borys Pop\u0142awski han descubierto una vulnerabilidad en....",&lt;BR /&gt;
                "rss_fuente": "rss_hispasec",&lt;BR /&gt;
                "rss_title": "Vulnerabilidad en runc permite escapar de contenedor Docker con permisos root",&lt;BR /&gt;
                "rss_link": "&lt;A href="https://unaaldia.hispasec.com/2019/02/vulnerabilidad-en-runc-permite-escapar-de-contenedor-docker-con-permisos-root.html" target="_blank"&gt;https://unaaldia.hispasec.com/2019/02/vulnerabilidad-en-runc-permite-escapar-de-contenedor-docker-con-permisos-root.html&lt;/A&gt;"&lt;BR /&gt;
            }&lt;BR /&gt;
        ]&lt;BR /&gt;
    }&lt;BR /&gt;
}&lt;BR /&gt;
thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:14:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383123#M69010</guid>
      <dc:creator>Dherom</dc:creator>
      <dc:date>2020-09-29T23:14:23Z</dc:date>
    </item>
    <item>
      <title>Re: Json without duplicates</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383124#M69011</link>
      <description>&lt;P&gt;I think you'd be better off doing this at the source rather than in Splunk. Is it possible to write a script to cleanse the data before it's written to a file that Splunk monitors?&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 22:09:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383124#M69011</guid>
      <dc:creator>jluo_splunk</dc:creator>
      <dc:date>2019-02-14T22:09:08Z</dc:date>
    </item>
    <item>
      <title>Re: Json without duplicates</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383125#M69012</link>
      <description>&lt;P&gt;You can use Cribl to preprocess this... @clintsharp @dritan &lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 23:11:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383125#M69012</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-02-14T23:11:58Z</dc:date>
    </item>
    <item>
      <title>Re: Json without duplicates</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383126#M69013</link>
      <description>&lt;P&gt;But there is no method that, at the time of indexing, looks at two fields of the JSON and makes a hash or something so that these duplicates do not exist&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 07:04:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383126#M69013</guid>
      <dc:creator>Dherom</dc:creator>
      <dc:date>2019-02-15T07:04:29Z</dc:date>
    </item>
    <item>
      <title>Re: Json without duplicates</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383127#M69014</link>
      <description>&lt;P&gt;There may be something possible using the DSP beta, but at this point in time, it would be much less efficient to do it inside of Splunk - you would potentially cause some amount of ingestion latency.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 17:03:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-without-duplicates/m-p/383127#M69014</guid>
      <dc:creator>jluo_splunk</dc:creator>
      <dc:date>2019-02-15T17:03:53Z</dc:date>
    </item>
  </channel>
</rss>

