Splunk Enterprise

How to handle search query when JSON data has host field?

New Member

I'm working on a corporate Splunk instance where, due to security and compliance requirements, we cannot rename fields at index time or make any similar modifications.
I'm trying to create a timechart of the number of events per hour by host. My issue is that the JSON data has a host field in addition to Splunk's built-in host field.
For example, a sample event looks like:
{"time":"2019-04-05T21:50:09.925Z","severity":"INFO","duration":25.02,"db":10.23,"view":14.79,"status":200,"method":"GET","path":"/api/v4/project/1","params":[],"host":"my_server_1","ip":"1.2.3.4, 4.5.6.7","ua":null,"route":"/api/:version/projects/:id","user_id":12,"username":"smithers","queue_duration":4.35,"magic_calls":0}

My search looks like:
index="my_index" host="prd-srv-00*" source="/var/log/my_program/http_json*" | timechart span=1h count by host

When I do this, the chart combines the hosts the logs came from (the built-in host field) with the hosts listed in the data (the host field in the JSON).

If I try to filter out the hosts from the data, it also removes the events for the built-in host field. For example:
index="my_index" host="prd-srv-00*" AND host !="0.0.0.0" source="/var/log/my_program/http_json*" | timechart span=1h count by host
I have also tried to use ...| where host !="0.0.0.0" | ... but this has the same result.

Any advice on a solution or workaround to handle this at search time? For example, can I rename the column when searching?

Thanks in advance for any help.


Re: How to handle search query when json data has host field?

SplunkTrust

Hi,

You can use a regex to extract the host from the raw event into a new field. The query below extracts the hostname from the raw data into a new field called ext_host.

Based on the sample event you provided, it will extract my_server_1 into the ext_host field.

<yourBaseSearch>
| rex field=_raw "\"host\"\:\"(?<ext_host>[^\"]*)"

So your query will be like this

index="my_index" host="prd-srv-00*"  source="/var/log/my_program/http_json*"
| rex field=_raw "\"host\"\:\"(?<ext_host>[^\"]*)"
| search ext_host!="my_server_1"
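
To see what the rex pattern captures, here is a minimal Python sketch run against a trimmed copy of the sample event from the question. It is only an illustration outside Splunk: Python's re module spells named groups as (?P<name>...) where rex accepts (?<name>...), and the variable names are made up for the example.

```python
import json
import re

# Trimmed copy of the sample event from the question.
raw = '{"time":"2019-04-05T21:50:09.925Z","severity":"INFO","status":200,"host":"my_server_1","ip":"1.2.3.4, 4.5.6.7"}'

# Same pattern as the rex above; Python writes named groups as (?P<name>...).
pattern = re.compile(r'"host":"(?P<ext_host>[^"]*)"')

match = pattern.search(raw)
ext_host = match.group("ext_host") if match else None
print(ext_host)  # my_server_1

# Parsing the JSON gives the same value, which is a useful cross-check.
assert json.loads(raw)["host"] == ext_host
```

The captured value is the host from the JSON payload, not the server the log was indexed from, so ext_host can be filtered or charted separately from the built-in host field.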

Re: How to handle search query when json data has host field?

New Member

Thank you very much for the reply! I put more details in the other answer (I had to pick one), but neither seems to work for me.

When I tried your approach:
index="my_index" host=prd-srv-00* AND ext_host!="my_server_1" | rex field=_raw "\"host\"\:\"(?<ext_host>[^\"]*)" | timechart span=1m count by host | fillnull value=0
It didn't return any values. This seems to be due to the rex field update not happening until after the search.
Any other thoughts?

Thanks again for any help!


Re: How to handle search query when json data has host field?

SplunkTrust

Ah yes totally forgot that, you need to search after rex.

 index="my_index" host="prd-srv-00*" source="/var/log/my_program/http_json*"
| rex field=_raw "\"host\"\:\"(?<ext_host>[^\"]*)"
| search ext_host!="my_server_1" 

Re: How to handle search query when json data has host field?

New Member

That still returns no results. My guess is because ext_host is still set for each event, therefore when the search happens it is excluding all events.


Re: How to handle search query when json data has host field?

SplunkTrust

So how many events do you have that do not contain my_server_1 in the raw data? If you are testing with only the one sample event you provided, it will not return any results, because you are searching for ext_host not equal to my_server_1.


Re: How to handle search query when json data has host field?

New Member

I'm looking at a small number of events for testing, about 400. Some have my_server_1 and some have my_server_2, so it would return values.

The problem I have is that all the events come from the Splunk-side host prd-srv-008, but also have the other field set.
In my graph, for the first minute it is showing:
prd-srv-008 --COUNT: 432
my_server_1 --COUNT: 320
my_server_2 --COUNT: 112

Notice that my_server_1 and my_server_2 always add up to the total from prd-srv-008.
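
These counts are consistent with each event being charted twice, once under the indexed host and once under the JSON host. Here is a small Python sketch of that behaviour, under the assumption (not confirmed in this thread) that host holds both values for each event at search time. The event list is synthetic and only mirrors the numbers above.

```python
from collections import Counter

# Synthetic events mirroring the counts above: every event was indexed from
# prd-srv-008 and also carries a JSON "host" value.
events = (
    [{"indexed_host": "prd-srv-008", "json_host": "my_server_1"}] * 320
    + [{"indexed_host": "prd-srv-008", "json_host": "my_server_2"}] * 112
)

# Assumption: "count by host" sees both values for each event.
by_host = Counter()
for event in events:
    for value in (event["indexed_host"], event["json_host"]):
        by_host[value] += 1

# Counting only on the indexed host gives one series per source server.
by_indexed_host = Counter(event["indexed_host"] for event in events)

print(dict(by_host))
print(dict(by_indexed_host))
```

If that assumption holds, the fix is to make the chart split on a field that carries only one of the two values.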


Re: How to handle search query when json data has host field?

Legend

@evbtbw92 could you explain the corporate requirement for not changing the field name when there can be two different values for the host field, both of them valid? I think this requirement is incorrect. Do you need just one of the host values, or both?

If you need only the host value from the JSON data, you should correct the host metadata at index time, so that searches run faster.

If you need both values, keep one of the fields as host and rename the other to something else, maybe Host, since Splunk field names are case-sensitive (using transforms.conf).




| eval message="Happy Splunking!!!"



Re: How to handle search query when json data has host field?

Esteemed Legend

Here are some options.

1: To ensure searching by the indexed host, you can use :: syntax like index=foo host::bar.
2: To ensure that the automatic KV_MODE extraction does not happen, polluting your host value, run your search in Fast mode.
3: To keep the KV_MODE extractions, but not the host one, add this to your search: ... | rex mode=sed "s/\"host\"/\"json_host\"/g"

I think #3 is your ticket.
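
As a rough illustration of what the sed expression in option 3 does to the raw text, here is a Python sketch applied to a trimmed copy of the sample event. It runs outside Splunk, so it shows only the string substitution, not how Splunk's automatic extraction reacts to it.

```python
import json
import re

# Trimmed copy of the sample event from the question.
raw = '{"severity":"INFO","status":200,"host":"my_server_1","username":"smithers"}'

# Equivalent of s/"host"/"json_host"/g: rename the JSON key in the raw text.
renamed = re.sub(r'"host"', '"json_host"', raw)

# After the rename, the parsed event no longer has a key called "host".
fields = json.loads(renamed)
print(sorted(fields))
```

Note that the pattern matches every quoted "host" in the event, so a string value that is exactly "host" would be rewritten as well.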
