Getting Data In

Splunk doesn't parse this URL fully....

sunitachan
New Member

Hello my dear Splunkers!

I am trying to get data into Splunk via syslog. The events contain a request=<url> field like the one below:

request=http://www.terracotta.org/kit/reflector?kitID=ehcache.default&pageID=update.properties&id=2130706433...

But Splunk parses it like this:
request=http://www.terracotta.org/kit/reflector?kitID=ehcache.default

Can someone help me with this please?
How can I get the full URL parsed correctly?
Also, where in Splunk can I adjust this field, given that my data has already been parsed?

Appreciate the help!!
Thanks
Sunita


esix_splunk
Splunk Employee

As cpetterborg mentions, it depends on how the event looks. Is it delimited by spaces or by line feeds? I would use something like:

request=(?<url>[^\s]+)

That captures everything after request= up to the first whitespace character, whether that is a space or a Unix-style line feed (it might need to be adjusted based on the sourcetype). One potential issue with using whitespace as the delimiter is that a URL containing a literal, unencoded space would be cut off at that space.
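
To make the whitespace caveat concrete, here is a minimal Python sketch of the same idea. It is only an illustration: the two events and the example.com URLs are made up, and Python's re module needs the (?P<url>...) spelling for the named group, whereas Splunk accepts both spellings.

```python
import re

# Same idea as the regex above: capture everything after "request="
# up to the first whitespace character.
URL_RE = re.compile(r"request=(?P<url>\S+)")


def extract_url(event):
    """Return the request URL from an event, or None if there is no request= key."""
    match = URL_RE.search(event)
    return match.group("url") if match else None


# Hypothetical events, loosely modelled on the samples in this thread.
encoded = "dpt=80 request=http://example.com/a%20b?x=1&y=2 act=passed"
literal = "dpt=80 request=http://example.com/a b?x=1&y=2 act=passed"

print(extract_url(encoded))  # whole URL: %20 is not a whitespace character
print(extract_url(literal))  # stops at the literal space after "/a"
```

An encoded space such as %20 passes through untouched, so only URLs logged with a raw space would be truncated.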


cpetterborg
SplunkTrust

I'm making some assumptions here...

It looks like you are relying on key/value pair parsing for automatic field extraction. You probably want to use a rex command or define a field extraction for your data. Since there are no spaces in your URL, you should be able to use the following regex to parse the request URL:

request=(?P<url>[^ ]+)

I'm assuming from the samples that the fields in each event are really supposed to be separated by spaces.
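
As a rough illustration of the difference, the Python sketch below compares a toy key/value splitter with the explicit regex. The naive_kv function is hypothetical: it treats both spaces and & as pair delimiters purely to reproduce the symptom from the question, and it is not a description of how Splunk's automatic extraction actually works. The event is a shortened, space-delimited version of the long sample.

```python
import re


def naive_kv(event):
    """Toy key/value parser (an assumption, not Splunk's parser).

    Splits on spaces and '&', then on the first '=' of each token.
    The first occurrence of a key wins.
    """
    fields = {}
    for token in re.split(r"[ &]", event):
        key, sep, value = token.partition("=")
        if sep and key not in fields:
            fields[key] = value
    return fields


# The explicit extraction suggested above: everything up to the next space.
REQUEST_RE = re.compile(r"request=(?P<url>[^ ]+)")

event = (
    "dpt=80 deviceDirection=1 "
    "request=http://www.terracotta.org/kit/reflector?kitID=ehcache.default"
    "&pageID=update.properties&id=2130706433... act=passed"
)

print(naive_kv(event)["request"])              # truncated at the first '&'
print(REQUEST_RE.search(event).group("url"))   # full URL up to the space
```

The explicit regex does not care about & or = inside the value, which is why it keeps the whole query string.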


somesoni2
Revered Legend

Could you provide some full sample events and also the definition of your URL2 field extraction?


sunitachan
New Member

Hi there,
Here are a few samples:

Feb
20
09:25:27 |1.0.3|0|passed|0|src=x.x.x.x
spt=40960
dst=34.23.12.3
dpt=80
deviceDirection=1
request=http://www.unikin.cd/
act=passed
cn1Label=Risk_Score
cn1=0
cs5=-
cs5Label=Malware_Type
cs1=-
cs1Label=Category
cs2=-
cs2Label=Protocol

Feb
20 09:25:27|1.0.3|0|passed|0|src=x.x.x.x
spt=60657
dst=291.98.1.1
dpt=80
deviceDirection=1
request=http://mobile.orange.fr/
act=passed
cn1Label=Risk_Score
cn1=0
cs5=- cs5Label=Malware_Type
cs1=-
cs1Label=Category
cs2=- cs2Label=Protocol

Feb
16 08:46:11|1.0.3|0|passed|0|src=x.x.x.x
spt=55845
dst=199.11.1.1
dpt=80
deviceDirection=1
request=http://www.terracotta.org/kit/reflector?kitID=ehcache.default&pageID=update.properties&id=2130706433...
act=passed
cn1Label=Risk_Score
cn1=0
cs5=- cs5Label=Malware_Type
cs1=-
cs1Label=Category
cs2=- cs2Label=Protocol

The fields are:
URL = request
URL2 = request with a long URL, as in the third sample above

Can I have just one field that covers both types of URL?
The URL2 regex is ^(?:[^=\n]*=){6}(?P<URL2>[^ ]+)
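
For what it is worth, here is a small Python sketch comparing the positional URL2 regex with one anchored on the request= key. It rests on assumptions: the events are single-line, space-delimited reconstructions of the samples above, the capture group name URL2 is assumed, and the proto=tcp pair in the third event is invented just to show what happens when the field order shifts.

```python
import re

# Positional: skip six "key=" prefixes, then capture up to the next space.
POSITIONAL = re.compile(r"^(?:[^=\n]*=){6}(?P<URL2>[^ ]+)")
# Anchored: find the request= key wherever it is, then capture up to the next space.
ANCHORED = re.compile(r"request=(?P<url>[^ ]+)")

PREFIX = ("Feb 20 09:25:27|1.0.3|0|passed|0|src=x.x.x.x spt=40960 "
          "dst=34.23.12.3 dpt=80 deviceDirection=1 ")

short_event = PREFIX + "request=http://www.unikin.cd/ act=passed cn1=0"
long_event = (PREFIX + "request=http://www.terracotta.org/kit/reflector"
              "?kitID=ehcache.default&pageID=update.properties"
              "&id=2130706433... act=passed cn1=0")
# Hypothetical event with one extra key=value pair before request=.
shifted_event = (PREFIX.replace("dpt=80 ", "dpt=80 proto=tcp ")
                 + "request=http://www.unikin.cd/ act=passed")

for event in (short_event, long_event, shifted_event):
    positional = POSITIONAL.search(event).group("URL2")
    anchored = ANCHORED.search(event).group("url")
    print(positional, "|", anchored)
```

In this sketch both patterns return the full URL for the short and the long event, so a single field can hold both kinds of URL. Only the positional one breaks on the shifted event, where it captures the deviceDirection value instead, so anchoring on request= looks like the less brittle choice.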

Thanks


sunitachan
New Member

Hello all,
I actually used the built-in field extraction tool to parse this particular field, but the issue I now see is that the extraction is also applied to all the other URLs, which are not this long. So I have two fields:
URL
URL2

I want to apply this field extraction only to URL2.

Any suggestions, please?
Thanks
