Getting Data In

DATA COMING IN SPLUNK ENTERPRISE

rahulkumar
Explorer

Hi All

My issue is that I have logstash data coming into Splunk; the sourcetype is httpevent and the logs arrive in JSON format. I need to know how I can use this data to find something meaningful. Also, with Windows forwarders we get event codes, so I block the unwanted event codes that give repeated information; can I do something similar with the logstash data? How do I extract information that we can actually use in Splunk?

0 Karma

rahulkumar
Explorer

Hi

I am confused; what I am getting in the logs is the below:

timestamp:

environment:

event: { JSON format; under that, original: java.lang.Throwable and a list of errors }

host:
loglevel:
log file:
message: java.lang.Throwable

I am getting the above type of data in the logs when I search the index; this is logstash data coming into Splunk in JSON format.


Now I am confused about what to do with this data: is the data fine as it comes in, or can I filter it further and get some other output out of it that is meaningful for me?
If there is a way to do this, please share.

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

as I said, you have to extract the metadata from the json using INGEST_EVAL and then put the original log field back into _raw.

First you have to analyze your json logstash log and identify the metadata to use, then you have to create INGEST_EVAL transformations to assign the original metadata to the Splunk metadata, e.g. something like this (adapt it to your log format):

in props.conf:

[source::http:logstash]
TRANSFORMS-00 = securelog_set_default_metadata
TRANSFORMS-01 = securelog_set_sourcetype_by_regex
TRANSFORMS-02 = securelog_override_raw

the first one assigns the metadata,

the second one defines the correct sourcetype,

the third one overrides _raw.

In transforms.conf

[securelog_set_default_metadata]
INGEST_EVAL = host := coalesce( json_extract(_raw, "hostname"), json_extract(_raw, "host.name"), json_extract(_raw, "host.hostname"))

[securelog_set_sourcetype_by_regex]
INGEST_EVAL = sourcetype := case( match(_raw, "\"path\":\"/var/log/audit/audit.log\""), "linux_audit", match(_raw, "\"path\":\"/var/log/secure\""), "linux_secure")

[securelog_override_raw]
INGEST_EVAL = _raw := if( sourcetype LIKE "linux%", json_extract(_raw, "application_log"), _raw )

The first one extracts host from the json.

the second one assigns the sourcetype based on information in the metadata (linux sourcetypes, in the example).

the third one takes one field of the json as _raw.
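
If it helps, before committing these to transforms.conf you can test the same json paths at search time. This is only a quick sketch: the index name is a placeholder and the sourcetype should be adapted to your HEC input.

index=your_index sourcetype=httpevent
| spath path=host.name output=extracted_host
| spath path=message output=original_message
| table _time extracted_host original_message

If extracted_host and original_message come back populated, the same paths should also work with json_extract in INGEST_EVAL.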

It wasn't easy and it was a very long job, so I suggest engaging Splunk PS or a Core Consultant who has already done it.

Ciao.

Giuseppe

0 Karma

rahulkumar
Explorer

Ok, one last thing I want to know about all these solutions: when I perform them, will I get any new or different data compared to what I am getting now? Because as of now I am getting timestamp, hostname, kubernetes.container name, etc.

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

as I said, you have to extract the Splunk metadata (host, timestamp, etc.) from the json fields, if present.

Then you have to identify the sourcetype from the content of one of the json fields.

Lastly, you have to remove everything but the raw event, which is usually in a field called message or msg, or something else.

In this way, you'll have all the metadata to associate with your events and the original raw events to parse using the standard add-ons.

When you choose the sourcetype, remember to use the one defined in the related add-on.

Ciao.

Giuseppe

0 Karma

rahulkumar
Explorer

Which standard add-on are you talking about, or related add-on? Where do I get them from? Can you give a name or an example?

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

logstash is a log concentrator, so you are probably receiving logs of different sourcetypes from logstash (e.g. linux, firewall, routers, switches, etc.).

After extracting the metadata, you have to recover the raw event and assign to each kind of log the sourcetype used by the related add-on, e.g. linux logs must be assigned to the sourcetypes linux_secure, linux_audit, and so on.

These sourcetypes are the ones from the Splunk Add-on for Unix and Linux, which you can download from Splunkbase.

Ciao.

Giuseppe

rahulkumar
Explorer

Hi, I think I was not able to explain properly; if you can have a look now and tell me, it will be helpful, since I am already getting the data in JSON format.

I am getting cloud logstash data and its sourcetype is httpevent. Below is the output I am already getting in JSON format in the Splunk search logs.

@timestamp: 2025T19:31:30.615Z
environment: dev
event: { [+]
}
host: { [+]
}
input: { [+]
}
kubernetes: { [+]
}
message: +0000 FM [com.abc.cmp.event.message.base.Abs] DEBUG Receiver not ready, try other receivers or try later audit is disabled
}

So in message I sometimes get the above data, and some logs have some JSON data as well.

The JSON data for the above fields is also coming in.

Now I want to know: I have to use this data in my Splunk to get structured data out of it; how do I do that, so that I can use the data for my purposes?

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

as I said, you have to use the INGEST_EVAL property in transforms.conf and the json_extract function to extract the metadata from your json, and finally take only the message as _raw.

Please share your full expanded json logs using the "Insert/Edit Code Sample" button so I can help you.

Ciao.

Giuseppe

rahulkumar
Explorer
{"input":{"type":"container"},"message":"2025-01-27 18:45:51.546GMT+0000 gm [com.bootserver.runtim] DEBUG Stability run result : com.bootserver.runtime.internal.api.RStability@6373","kubernetes":{"namespace":"mg-prd","labels":{"service_app_my_com/gm-mongodb":"client","app_kubernetes_io/name":"gm-core","app_kubernetes_io/instance":"mg-prd-release-gm-core","service_app_my_com/gm-external-https":"server","service_app_my_com/gm-internal-https":"both","pod-template-hash":"5889b666","app_my_com/chart":"gm-core-0.1.1","app_kubernetes_io/part-of":"my-JKT","app_my_com/service":"mela","app_kubernetes_io/managed-by":"mela","app_kubernetes_io/component":"my-race","app_kubernetes_io/version":"2.0.002333","app_my_com/release":"mg-prod-release","service_app_my_com/fm-internal-bridge":"client","app_my_com/name":"gm-core"},"container":{"name":"gm-core"}},"host":{"name":"mg-prd"},"@timestamp":"2025-01-:25:31","environment":"pr","event":{"original":"2025-01-27 18:25:31.426GMT+0000 FM [com.my.bootserver.runtim] DEBUG Stability run result : com.my.bootserver.runtime.internal.api.RStability@6373"}}
0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

adapt my hint to your requirements:

in props.conf:

[source::http:logstash]
TRANSFORMS-00 = securelog_set_default_metadata
TRANSFORMS-01 = securelog_override_raw

in transforms.conf:

[securelog_set_default_metadata]
INGEST_EVAL = host := json_extract(_raw, "host.name")

[securelog_override_raw]
INGEST_EVAL = _raw := json_extract(_raw, "message")

 Ciao.

Giuseppe

rahulkumar
Explorer

So the example you sent has to be set only in transforms.conf, and I have to change host.name, right?

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

in this way, you extract the host field from the host.name json field.

Then you can extract other fields that are relevant for you.

Lastly, you remove everything except message and put this field in _raw; in this way you have the original event as it was before logstash ingestion.

Remember that the order of execution matters; for this reason, in props.conf, there is a progressive number in the transformation names.
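
For example, a sketch of an extra extraction placed between the host assignment and the _raw override (the stanza name securelog_set_environment is just an example, and environment is taken from your sample json; adapt both):

in props.conf:

[source::http:logstash]
TRANSFORMS-00 = securelog_set_default_metadata
TRANSFORMS-01 = securelog_set_environment
TRANSFORMS-02 = securelog_override_raw

in transforms.conf:

[securelog_set_environment]
INGEST_EVAL = environment := json_extract(_raw, "environment")

It has to run before securelog_override_raw, because json_extract reads the json from _raw, which the last transformation replaces with the plain message. Fields created this way become indexed fields, so environment stays searchable even after _raw no longer contains the json.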

Ciao.

Giuseppe

rahulkumar
Explorer

I was asking: do I have to use the code below exactly as it is? Is it predefined and constant, or do I have to edit it and add my own values?
For example:

INGEST_EVAL = host := json_extract(_raw, "host.name")
(do I replace "host.name" with my host name, like host.ABC?)

INGEST_EVAL = _raw := json_extract(_raw, "message")
(is _raw predefined, or do we need to add something here? And the same for message: do we have to add the whole json?)

in props.conf:

[source::http:logstash]
TRANSFORMS-00 = securelog_set_default_metadata
TRANSFORMS-01 = securelog_override_raw

in transforms.conf:

[securelog_set_default_metadata]
INGEST_EVAL = host := json_extract(_raw, "host.name")

[securelog_override_raw]
INGEST_EVAL = _raw := json_extract(_raw, "message")

 

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

host is one of the mandatory metadata fields in Splunk and must have this name; if you like, you can also have some aliases, but that isn't a best practice.

_raw is the name of the full raw event and you must use it.

It's the same for the timestamp: it must be called _time (probably it's another field to extract from the json!).

Then you can extract other fields (if relevant for you) from the json before the last transformation that removes everything but message. In other words, you have to extract all the fields you need and finally restore the message field as the raw event (putting the message field in the _raw field).
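
For example, a minimal sketch for the timestamp, assuming your @timestamp field is ISO 8601 with milliseconds (like 2025-01-27T19:31:30.615Z); the stanza name is just an example and the strptime format must be adjusted to your real values:

[securelog_set_time]
INGEST_EVAL = _time := strptime(json_extract(_raw, "@timestamp"), "%Y-%m-%dT%H:%M:%S.%3QZ")

Add it to the TRANSFORMS list in props.conf before the transformation that overrides _raw, so the json is still available when it runs.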

Ciao.

Giuseppe


rahulkumar
Explorer

Ok, got it. After doing all these configurations, when I search my index with its source, I will get my results in the logs themselves, right?

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

all these operations happen before indexing, so you'll index the events as they were before ingestion into logstash. Having all the metadata you need, you can apply all the parsing rules from the standard add-ons (the ones from Splunkbase) and run all the searches.

Indeed, these operations are only needed to apply the standard parsing rules: you can also search the logs in the original logstash format, just without the parsing, tagging and normalization rules.

let me know if I can help you more, or, please, accept one answer for the other people of the Community.

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated 😉

rahulkumar
Explorer

Hi,
I have a problem: I do not have props.conf and transforms.conf, because I do not have an agent or forwarder; we just get data through HEC. So we do not have props and transforms files, and HEC is bound directly to the host that is the indexer.
What do I do in this case, or do we have to create props.conf?

0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

let me understand, you have HEC inputs on the Indexer?

in this case, you have to create props.conf and transforms.conf on the Indexer.

As I said, these conf files must be located on the first full Splunk instance the data passes through; in your case, the indexer.
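
For example (standard Splunk paths; the app name logstash_parsing is only an example), you could put them in a small app on the indexer:

$SPLUNK_HOME/etc/apps/logstash_parsing/local/props.conf
$SPLUNK_HOME/etc/apps/logstash_parsing/local/transforms.conf

Then restart the indexer so the new index-time rules are loaded; they will only apply to events indexed after that.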

let me know if I can help you more, or, please, accept one answer for the other people of the Community.

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated 😉

rahulkumar
Explorer

Hi,
I wrote this to check if it's working or not:

props.conf
[source::http:my LogStash]
sourcetype = httpevent
TRANSFORMS-00 = securelog_set_default_metadata
TRANSFORMS-01 = securelog_override_raw

transforms.conf

[securelog_set_default_metadata]
INGEST_EVAL = host := json_extract(_raw, "host.name")

[securelog_override_raw]
INGEST_EVAL = message := json_extract(_raw, "message")

Now, which query do I need to run in Search & Reporting? I wrote:
index="" sourcetype=""  | table host,message,_raw
(with this I am only seeing the same data I was able to get with spath in the search query)

But I did not get any extracted data; I am getting the same values. What can be the reason, and how can I check if my props and transforms are working correctly? My results look the same, not extracted. Can you provide some query to check, or some information I can refer to?


0 Karma

gcusello
SplunkTrust

Hi @rahulkumar ,

check if the fields you used in json_extract are correct (they should be): you can do this in Splunk Search.
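
For example, a check like this in Splunk Search (index and sourcetype are placeholders) uses the same json_extract function that INGEST_EVAL uses:

index=your_index sourcetype=httpevent
| eval check_host=json_extract(_raw, "host.name"), check_message=json_extract(_raw, "message")
| table check_host check_message _raw

If check_host and check_message are empty here, the paths are wrong. If they are populated, remember that INGEST_EVAL only affects events indexed after the configuration is loaded (a restart of the indexer is needed), so events already in the index will still look unchanged.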

Ciao.

Giuseppe

0 Karma