Splunk Search

Need help with a regex?

Hemnaath
Motivator

Hi all, I need some help with a regex. I am able to match the host name, but I am unable to overwrite the host field (under Selected Fields in Splunk) using the regex below. Could you please guide me in correcting it?

Regex:
index=firewall sourcetype="network:log" |rex field=_raw (?)(?<=Client_VPN,)\b[(\w)]+\b | table host host_name

Event Details:
Feb 16 23:54:02 test01.xxxx.com 1,2018/02/16 23:54:02,012501001035,6477528014920876411,0x8000000000000000,USERID,logout,473,2018/02/16 23:53:46,36,0,0,0,Client_VPN,node01fw,3,vsys3,10.X.X.X,ddesa0002,,0,1,0,0,0,vpn-client,globalprotect,0,0,,2018/02/16 23:53:47,1

Actual Requirement:

I need to overwrite the host field value with the host_name value shown under Interesting Fields.
host=test01.xxxx.com
host_name=node01fw

Kindly guide me on the regex to overwrite the host value with the host_name value.

1 Solution

FrankVl
Ultra Champion

Is it always preceded by "Client_VPN"?

If so:

| rex "Client_VPN,(?<host>[^,]+),"

If it is not guaranteed that the Client_VPN value is present, look at the position of the host_name field in the event string and use a regex similar to the one I suggested in your previous, similar question, counting the number of fields that precede the host_name field:

 "(?:[^,]*,){14}(?<host>\w+)"

Edit: thanks 493669 for doing the counting, updated my example with nr 14 instead of 52.
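
For anyone who wants to check the two patterns outside Splunk, here is a minimal Python sketch run against the sample event from the question. Python's `re` module writes named groups as `(?P<name>...)` rather than `(?<name>...)`, so the patterns are adapted accordingly. This only illustrates the regex logic and makes no claim about how `rex` is implemented.

```python
import re

# Sample event copied from the question.
event = (
    "Feb 16 23:54:02 test01.xxxx.com 1,2018/02/16 23:54:02,012501001035,"
    "6477528014920876411,0x8000000000000000,USERID,logout,473,"
    "2018/02/16 23:53:46,36,0,0,0,Client_VPN,node01fw,3,vsys3,10.X.X.X,"
    "ddesa0002,,0,1,0,0,0,vpn-client,globalprotect,0,0,,2018/02/16 23:53:47,1"
)

# Option 1: rely on the literal "Client_VPN," that precedes the value.
by_marker = re.search(r"Client_VPN,(?P<host>[^,]+),", event)

# Option 2: skip the first 14 comma-separated fields, then capture the next one.
by_position = re.search(r"(?:[^,]*,){14}(?P<host>\w+)", event)

print(by_marker.group("host"))    # node01fw
print(by_position.group("host"))  # node01fw
```

Both approaches pick out node01fw for this event. The first depends on the Client_VPN value being present, the second on the field always sitting in the same position.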


493669
Super Champion

Hi @Hemnaath,
try this regex:

index=firewall sourcetype="network:log"|rex "^([^,]*,){14}(?<host>[^,]*)"

Hemnaath
Motivator

Hi Frank, the above regex worked, and we can now see the host_name value in the host field under Selected Fields.

index=firewall sourcetype="network:log" | rex "(?:[^,]*,){14}(?<host>\w+)"

Thanks for the much-needed help.


Hemnaath
Motivator

Hi Frank, in some cases the host_name is not present in the event, and then the above regex returns the actual host name "test01.xxx.com" in the host field. When we ran the query below over a 24-hour period, we could see events in which no host name is available. What can be done in this case to remove the actual host from the host field?

index=firewall sourcetype="network:log" | rex "(?:[^,]*,){14}(?<host>\w+)"

Example :

Feb 19 13:25:33 test01.xxx.com 1,2018/02/19 13:25:33,012501001041,TRAFFIC,end,1,2018/02/19 13:25:24,10.x.x.x,10.x.x.x,0.0.0.0,0.0.0.0,xxxx_not_here,,,ssl,vsys2,Data-Center-Admin,Data-Center-Core,ae5.2005,ae5.250,pan_log_forward,2018/02/19 13:25:24,

FrankVl
Ultra Champion

Use \w* instead of \w+, so it will also capture an empty string between the commas.
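
The difference matters because the pattern is not anchored to the start of the event. Here is a rough Python sketch of that behaviour, using a made-up event (hypothetical, not taken from the logs in this thread) whose 15th field is empty. Python's `(?P<host>...)` stands in for `(?<host>...)`.

```python
import re

# Hypothetical event: fields f0..f13, then an empty 15th field, then "next".
event = ",".join(f"f{i}" for i in range(14)) + ",,next,last"

plus = re.search(r"(?:[^,]*,){14}(?P<host>\w+)", event)
star = re.search(r"(?:[^,]*,){14}(?P<host>\w*)", event)
anchored_plus = re.search(r"^(?:[^,]*,){14}(?P<host>\w+)", event)

# \w+ cannot match the empty field, so the unanchored search retries from
# later start positions and ends up capturing a later, unrelated field.
print(repr(plus.group("host")))   # 'next'

# \w* accepts the empty field, so the capture stays at the intended position.
print(repr(star.group("host")))   # ''

# Anchored with ^, the \w+ variant cannot slide and simply does not match.
print(anchored_plus)              # None
```

So with `\w+` and no `^`, an empty field does not give an empty host but whatever the next non-empty field happens to contain.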


Hemnaath
Motivator

Hi Frank, I have copied and pasted the complete event details below. When I run the query over a one-month period, I can see some events with host values of 0, 21, test01.xxx.com, and Azure under Selected Fields.

In some events the host value is located at a different position. In that case, is there any way to stop these values from being displayed in the host field?

Query Detail:

index=firewall sourcetype="paloalto:network:log" | rex "(?:[^,]*,){14}(?<host>\w*)" | search host=0

Events Detail: When searched with the host value=0

    2/19/18
    6:48:32.000 AM  
    0.0-10.x.x.x,0,2,0,aged-out,50,0,0,0,Ogden-FW,node01,from-policy,,,0,,0,,N/A
    eventtype = nix-all-logs    eventtype = pan     network host =  0 source =  /opt/syslogs/paloalto/test01.xxx.com/paloalto.log sourcetype =  paloalto:network:log tag =  network

    2/7/18
    7:31:22.000 PM  
    Feb  8 00:31:22 test02.xxx.com 1,2018/02/08 00:31:00,012501001041,CORRELATION,,,2018/02/08 00:31:00,168.133.221.8,,,compromised-host,medium,17,0,0,0,,test02,2003071744,Beacon Detection,6005,"Host has made use of Internet Relay Chat (IRC), a protocol popular with command-and-control activity."
    eventtype = nix-all-logs    eventtype = pan     network host =  0 source =  /opt/syslogs/paloalto/test02.xxx.com/paloalto.log sourcetype =  paloalto:network:log tag =  network


Query Detail:

    index=firewall sourcetype="paloalto:network:log" | rex "(?:[^,]*,){14}(?<host>\w*)" | search host=test01.xxx.com

    Event Details: When searched with the host value=test01.xxx.com

    Feb  4 05:31:09 test01.xxx.com 1,2018/02/04 05:31:09,007257000034869,T
    eventtype = nix-all-logs    eventtype = pan     network host =  test01.xxx.com source = /opt/syslogs/paloalto/test01.xxx.com/paloalto.log sourcetype =  paloalto:network:log tag =  network

    1/25/18
    7:48:05.000 AM  
    om-policy,,,0,,0,,N/A
    eventtype = nix-all-logs    eventtype = pan     network host =  test01.xxx.com source = /opt/syslogs/paloalto/test01.xxx.com/paloalto.log sourcetype =  paloalto:network:log tag =  network

    Query Detail:
    index=firewall sourcetype="paloalto:network:log" | rex "(?:[^,]*,){14}(?<host>\w*)" | search host=21 

    Event details: When searched with the host value=21

    42643087,1,60564,9080,0,0,0x100053,tcp,allow,4442,1860,2582,23,2018/02/19 13:25:01,21,not-resolved,0,6477528057962830262,0x8000000000000000,10.x.x.x-10.x.x.x,10.x.x.x-10.x.x.x,0,12,11,tcp-rst-from-client,17,0,0,0,Data_Center,test01fw,from-policy,,,0,,0,,N/A

eventtype = nix-all-logs    eventtype = pan     network host =  21 source = /opt/syslogs/paloalto/test01.xxxx.com/paloalto.log sourcetype = paloalto:network:log tag =  network

 Query Detail: 
 index=firewall sourcetype="paloalto:network:log" | rex "(?:[^,]*,){14}(?<host>\w*)" | search host=Azure

Event Details: When searched with the host value=Azure

RAFFIC,end,1,2018/02/04 05:30:58,10.134.64.7,168.133.4.232,0.0.0.0,0.0.0.0,trust-xxxx,,,dns,vsys1,Azure-Private,Azure-Data-Center,ethernet1/2,ethernet1/1,pan_log_forward,2018/02/04 05:30:58,126039,1,46728,53,0,0,0x4064,udp,allow,x7,x2,x,x,2018/02/04 05:30:28,0,any,0,131004235,0x8000000000000000,10.x.x.x-10.x.x.x,United States,0,1,1,aged-out,115,0,0,0,,node02fw,from-policy,,,0,,0,,N/A

I also noticed that we are using version 3.8.0 of the Palo Alto add-on, but I am not sure why a different sourcetype was configured in our organisation. I can post a separate question on Splunk Answers for the Palo Alto sourcetype issue.


FrankVl
Ultra Champion
  1. It looks like you have broken (incomplete) events?
  2. You'd need host extraction transforms for each different format (so basically for each proper sourcetype as per the original TA, I would say).
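
One way to spot the incomplete events from point 1 is to check whether an event still begins with the syslog header seen on the complete samples (for example `Feb 16 23:54:02 test01.xxxx.com 1,`). Below is a small Python sketch of such a check. The header pattern is an assumption inferred from the samples in this thread, not an official Palo Alto format definition, and `looks_complete` is a hypothetical helper name.

```python
import re

# Assumed header shape, inferred from the complete samples in this thread:
# "<Mon> <day> <HH:MM:SS> <hostname> <number>,"
SYSLOG_HEADER = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+ \d+,")

def looks_complete(event: str) -> bool:
    """Return True if the event starts with the expected syslog header."""
    return SYSLOG_HEADER.match(event) is not None

# Leading parts of events posted in this thread.
samples = [
    "Feb 16 23:54:02 test01.xxxx.com 1,2018/02/16 23:54:02,012501001035",
    "Feb  8 00:31:22 test02.xxx.com 1,2018/02/08 00:31:00,012501001041,CORRELATION",
    "0.0-10.x.x.x,0,2,0,aged-out,50,0,0,0,Ogden-FW,node01,from-policy",
    "om-policy,,,0,,0,,N/A",
]

flags = [looks_complete(s) for s in samples]
for flag, s in zip(flags, samples):
    print(flag, s[:40])
```

The first two samples pass and the last two are flagged as fragments. This only identifies fragments at search time; it does not explain why the events were split in the first place.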

Hemnaath
Motivator

Hi Frank, I used \w* instead of \w+, but I still get host values of 0, 21, test01.xxx.com, and Azure in the selected host field when I run the query over a one-month period.

Kindly advise how we can correct this; in the selected host field we see 0, 21, Azure, test01.xxx.com, and a blank value.


FrankVl
Ultra Champion

Is it some issue with copy pasting your samples here? Looks like broken up events?

Also: is there a specific reason you use different sourcetypes from the original palo alto TA? I would be expecting sourcetypes like pan:traffic, pan:system etc.

It seems you are mapping different sourcetypes all to paloalto:network:log, while in fact the events have different formats and as such, the location of the hostname is different.
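
To illustrate why a single positional pattern cannot work across these events, the following Python sketch applies the same "skip 14 fields" pattern to the leading part of four samples posted in this thread (tails omitted for brevity). It uses Python's `(?P<host>...)` in place of `(?<host>...)`; the dictionary keys are just labels chosen for this example.

```python
import re

PATTERN = re.compile(r"(?:[^,]*,){14}(?P<host>\w*)")

# Leading fields of samples posted earlier in this thread (tails omitted).
samples = {
    "USERID": (
        "Feb 16 23:54:02 test01.xxxx.com 1,2018/02/16 23:54:02,012501001035,"
        "6477528014920876411,0x8000000000000000,USERID,logout,473,"
        "2018/02/16 23:53:46,36,0,0,0,Client_VPN,node01fw,3,vsys3"
    ),
    "CORRELATION": (
        "Feb  8 00:31:22 test02.xxx.com 1,2018/02/08 00:31:00,012501001041,"
        "CORRELATION,,,2018/02/08 00:31:00,168.133.221.8,,,compromised-host,"
        "medium,17,0,0,0,,test02"
    ),
    "traffic fragment 1": (
        "42643087,1,60564,9080,0,0,0x100053,tcp,allow,4442,1860,2582,23,"
        "2018/02/19 13:25:01,21,not-resolved,0"
    ),
    "traffic fragment 2": (
        "RAFFIC,end,1,2018/02/04 05:30:58,10.134.64.7,168.133.4.232,0.0.0.0,"
        "0.0.0.0,trust-xxxx,,,dns,vsys1,Azure-Private,Azure-Data-Center,"
        "ethernet1/2"
    ),
}

results = {name: PATTERN.search(text).group("host") for name, text in samples.items()}
for name, host in results.items():
    print(f"{name}: host={host!r}")
# USERID: host='node01fw'
# CORRELATION: host='0'
# traffic fragment 1: host='21'
# traffic fragment 2: host='Azure'
```

Only the USERID event has the device name in the 15th field. For the other formats and for the truncated events, the same position holds a counter, a timestamp-adjacent value, or a zone name, which accounts for the 0, 21, and Azure values reported above ("Azure" because `\w*` stops at the hyphen in `Azure-Data-Center`).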

I would once more like to suggest that you take a close look at the palo alto TA (https://splunkbase.splunk.com/app/2757/) to understand the different palo alto event types and their structure.


niketn
Legend

@Hemnaath, seems like this is a duplicate of question https://answers.splunk.com/answers/618695/how-can-i-get-regex-to-over-ride-the-host-value-wi.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

Hemnaath
Motivator

Hi Niketnilay, yes, it is related to the previous question, but this is a different scenario: here the host name appears in the middle of the event, and the host field needs to be overwritten with that host_name value.
I tried a regex but it did not work, so could you please guide me on the regex?


493669
Super Champion

Are you expecting the host field value to be node01fw instead of test01.xxxx.com?


Hemnaath
Motivator

Yes, I need to overwrite the host value, changing it from test01.xxx.com to node01fw, under Interesting Fields.
