Splunk Search

How to define a calculated field based on chained rex statements in Splunk Web?

floppymoose
Engager

I'm using Splunk Enterprise. I have a search that looks like:

index=foo sourcetype=yapache_access host=bar  | fields url,duration  | rex field=url mode=sed "s/[a-zA-Z0-9._]{20,}/_HASH/g" | rex field=url mode=sed "s/ysp_user_agent=[^&]+//g"   | rex field=url mode=sed "s/oauth[a-z_]+=[a-zA-Z0-9_]+//g"   | rex field=url mode=sed "s/(\d\d\d\d-\d\d-\d\d)/YYYY-MM-DD/g"   | rex field=url mode=sed "s/([.\/=;,])(\d+)/\1_ID/g" | stats count, avg(duration) as servertime by url | where count>100 | sort 100 -servertime

This search groups URLs by replacing embedded IDs, dates, and similar values with constants. That lets me look at requests that occur at least 100 times and sort them by mean servertime to find slow requests.

I would like to share this flattening of the URL with other users on the team in a convenient way. So, two questions:

1) Is defining a new calculated field via the UI ("Fields » Calculated fields » Add new") the way to go?
2) If so, how do I do it? I haven't found an example that shows how to fill out that form when a chain of rex commands is what defines my new field.

Apologies if this is detailed somewhere handy. I tried searching the docs and the forums before asking this.

1 Solution

Richfez
SplunkTrust

The docs team has a great document, Create and maintain search-time field extractions through configuration files. But I don't think that is what you need: you are "erasing" parts of a line, and unless you want to erase that content from the event more or less permanently, this might be difficult.

Still, here's what I'd do: create a macro!

You can probably take that entire pile of

 rex ... | rex ... | rex ...

And create a macro from that. If you name it 'CleanUpURL' then you can call it in your actual search (or someone else can) like so:

index=foo sourcetype=yapache_access host=bar  | fields url,duration  | `CleanUpURL` | stats count, avg(duration) as servertime by url | where count>100 | sort 100 -servertime

One tip: watch your leading and trailing pipes (|). You can include them in the macro or leave them out, but stay consistent. The pipes in the MIDDLE of the macro obviously have to stay; it's only the ones at the ends that are optional. I usually do it the way I described above, but you could also do it this way:

Macro:

| rex ... | rex ... | rex ... |

New search using macro:

index=foo sourcetype=yapache_access host=bar  | fields url,duration  `CleanUpURL` stats count, avg(duration) as servertime by url | where count>100 | sort 100 -servertime

I personally think the FIRST way is way cleaner and easier to follow.
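
For reference, a minimal sketch of how the first variant (no leading or trailing pipes) might look if the macro is defined in macros.conf rather than through Settings » Advanced search » Search macros. The rex chain is copied from the question; the file location is an assumption and depends on which app should own the macro.

```ini
# macros.conf (sketch). Assumed location: the local directory of the app
# that should own the macro.
# The definition has no leading or trailing pipe, so callers write
#   ... | `CleanUpURL` | ...
[CleanUpURL]
definition = rex field=url mode=sed "s/[a-zA-Z0-9._]{20,}/_HASH/g" | rex field=url mode=sed "s/ysp_user_agent=[^&]+//g" | rex field=url mode=sed "s/oauth[a-z_]+=[a-zA-Z0-9_]+//g" | rex field=url mode=sed "s/(\d\d\d\d-\d\d-\d\d)/YYYY-MM-DD/g" | rex field=url mode=sed "s/([.\/=;,])(\d+)/\1_ID/g"
```

A macro created in Splunk Web is typically private to its creator at first, so its permissions would likely need to be changed to app-level or global sharing before teammates can call it.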

martin_mueller
SplunkTrust

Do share the direction you went, so others can benefit too.

Side note: you should be able to substitute replace(field, regex, "replacement") for your rex mode=sed calls. That should enable your original idea of adding a calculated field. I'd personally prefer that over having to teach people how to use the macro and having them use it everywhere.
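
As a rough sketch of that idea, the five sed substitutions from the question can be nested into a single eval expression and attached to the sourcetype as a calculated field. The field name url_flat is hypothetical, chosen so the original url field stays untouched. The escaped slash from the sed expression is written as a plain / here, since / is no longer a delimiter. Backslash handling in the expression is an assumption worth checking with a quick eval in the search bar first.

```ini
# props.conf (sketch). url_flat is a hypothetical field name.
# Each replace() wraps the previous one, so the substitutions run
# innermost first, in the same order as the original rex chain.
[yapache_access]
EVAL-url_flat = replace(replace(replace(replace(replace(url, "[a-zA-Z0-9._]{20,}", "_HASH"), "ysp_user_agent=[^&]+", ""), "oauth[a-z_]+=[a-zA-Z0-9_]+", ""), "\d\d\d\d-\d\d-\d\d", "YYYY-MM-DD"), "([./=;,])\d+", "\1_ID")
```

In the "Fields » Calculated fields » Add new" form, the equivalent would presumably be to apply it to sourcetype yapache_access, set the name to url_flat, and paste everything after the = sign as the eval expression. Searches could then end with stats count, avg(duration) as servertime by url_flat, with no macro call needed.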

floppymoose
Engager

Thanks, this steered me in a useful direction!
I tried to give you karma points but I get an alert saying the maximum I can award is 0. 😞

somesoni2
SplunkTrust

If @rich7177's answer has resolved your issue, you can accept it by clicking the little tick mark (the Accept link) below the answer, and reward him by voting it up.
