Splunk Search

Obtaining XML from URL

helius
Path Finder

Search Head: V6.2
Goal: Obtain XML data from a URL that is built dynamically from IDs returned by the search string.

Search string: index=content_eng source="dbmon-tail://kemgr-a1p/Jobs General" | eval JobID=id

I then want to build the URL and, from what I've read, use urldecode and xmlkv:

index=content_eng source="dbmon-tail://kemgr-a1p/Jobs General" | eval JobID=id | eval _raw=urldecode("http://kemgr-ch2-a1p.domain.com/jobs/"+JobID+".xml") | xmlkv

I should then be able to rex this and get some sort of result into a file_size field:

index=content_eng source="dbmon-tail://kemgr-a1p/Jobs General" | eval JobID=id | eval _raw=urldecode("http://kemgr-ch2-a1p.domain.com/jobs/"+JobID+".xml") | xmlkv | rex field=_raw "(?<file_size>\d+)"

This produces the field file_size, but it's "2" for everything, regardless of the actual number/file_size in the xml doc.

I don't think I'm doing this right... Can someone point me toward obtaining the value from the URL's XML?

Here is an example of the XML:

<general>
<duration>30mn 4s</duration>
<format>MPEG-TS</format>
<id>2 (0x2)</id>
<overall_bit_rate>3 750 Kbps</overall_bit_rate>
<file_size>807 MiB</file_size>
<overall_bit_rate_mode>Constant</overall_bit_rate_mode>
</general>

Thanks in advance!

0 Karma
1 Solution

helius
Path Finder

It turns out you cannot do what I'm trying to do in Splunk. Sure would be nice, but I'll just go ahead and index these remote XML files and use xpath to get what I want.
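As a rough illustration of the XPath step, outside Splunk: assuming an indexed document looks like the sample posted in the question, a relative path such as `./file_size` selects that element's text. The sketch below uses Python's standard `xml.etree.ElementTree`; the names `SAMPLE_XML` and `extract_file_size` are made up for this example, and the XML is the sample from the question, not real job data.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (not real job data).
SAMPLE_XML = """<general>
<duration>30mn 4s</duration>
<format>MPEG-TS</format>
<id>2 (0x2)</id>
<overall_bit_rate>3 750 Kbps</overall_bit_rate>
<file_size>807 MiB</file_size>
<overall_bit_rate_mode>Constant</overall_bit_rate_mode>
</general>"""


def extract_file_size(xml_text):
    """Return the text of the <file_size> element, or None if it is absent."""
    root = ET.fromstring(xml_text)
    # "./file_size" is a path relative to the root <general> element.
    return root.findtext("./file_size")


file_size = extract_file_size(SAMPLE_XML)
print(file_size)  # 807 MiB
```

The value comes back as the string "807 MiB", so the number and the unit would still need to be split if a numeric field is wanted.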

0 Karma

richgalloway
SplunkTrust

Your regex string is taking the first number it finds and putting it into the file_size field, which is probably not the desired result. Try this instead:

... | rex "(?<file_size><file_size>(\d+).+</file_size>)"

If you want more than just the numbers, use this:

... | rex "(?<file_size><file_size>(.+)</file_size>)"
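One detail worth checking in patterns like these is where the named group sits relative to the literal tags. As posted, the group wraps the `<file_size>` tags themselves, so the captured value would include the markup. The sketch below shows the difference using Python's `re` module against the `<file_size>` line from the sample XML. Note that Python spells named groups `(?P<name>...)` rather than the `(?<name>...)` form used by rex; the variable names are illustrative only.

```python
import re

# The <file_size> line from the sample XML in the question.
line = "<file_size>807 MiB</file_size>"

# Named group wraps the tags: the capture includes the markup.
outer = re.search(r"(?P<file_size><file_size>(\d+).+</file_size>)", line)

# Named group sits inside the tags and covers only the leading digits.
digits = re.search(r"<file_size>(?P<file_size>\d+).+</file_size>", line)

# Named group inside the tags, covering the whole element text.
full = re.search(r"<file_size>(?P<file_size>.+)</file_size>", line)

print(outer.group("file_size"))   # <file_size>807 MiB</file_size>
print(digits.group("file_size"))  # 807
print(full.group("file_size"))    # 807 MiB
```

Either inner form only helps once the text being searched actually contains the XML.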
---
If this reply helps you, Karma would be appreciated.
0 Karma

helius
Path Finder

Thanks Rich, I tried that and just receive the http://kemgr-ch2-a1p.domain.com/jobs/81283.xml URL back, no file_size number as a field.

Here was my search string:

index=content_eng source="dbmon-tail://kemgr-a1p/Jobs General" | eval JobID=id | eval _raw=urldecode("http://kemgr-ch2-a1p.domain.com/jobs/"+JobID+".xml") | xmlkv | rex "(?<file_size><file_size>(\d+).+</file_size>)"

0 Karma

richgalloway
SplunkTrust

The urldecode() function does not send an HTTP request; it only decodes percent-encoded characters in a URL string. That means the xmlkv command is parsing your URL, not an XML document. That, in turn, means rex will not find 'file_size'. In fact, your original \d+ regex returns the '2' from 'ch2' in your URL's hostname.
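The same behaviour can be reproduced outside Splunk. This is a minimal sketch that uses Python's `urllib.parse.unquote` as a stand-in for urldecode() (an analogy, not Splunk's implementation) and the URL quoted earlier in the thread. It shows that decoding is a pure string operation and that the first run of digits in the URL is the "2" in the hostname.

```python
import re
from urllib.parse import unquote

# URL from the thread; nothing in this script contacts it.
url = "http://kemgr-ch2-a1p.domain.com/jobs/81283.xml"

# Decoding only rewrites percent-escapes; no request is sent.
decoded = unquote(url)

# The first run of digits is found in "ch2", long before the job ID.
first_number = re.search(r"\d+", decoded).group()

print(decoded == url)               # True: there were no escapes to decode
print(first_number)                 # 2
print(unquote("jobs%2F81283.xml"))  # jobs/81283.xml
```

Because every generated URL shares the same hostname, the first number is "2" for every event, which matches the result reported in the question.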

---
If this reply helps you, Karma would be appreciated.

helius
Path Finder

Aaahhh, that makes much more sense now. Is something like what I'm trying to do possible without indexing these XML files?

0 Karma

richgalloway
SplunkTrust

I think you'll have to index the files.
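For what it's worth, getting the files indexed means something outside the search has to fetch them. The following is only a sketch of such a fetcher, assuming the job IDs are already known and the XML is served as UTF-8 at the URL pattern from the question. The function names are hypothetical, and how the output reaches Splunk (for example, a scripted input that reads the script's standard output, or files written to a monitored directory) is left open.

```python
import urllib.request

# URL prefix taken from the question.
BASE_URL = "http://kemgr-ch2-a1p.domain.com/jobs/"


def job_url(job_id):
    """Build the per-job XML URL used in the question."""
    return f"{BASE_URL}{job_id}.xml"


def http_get(url, timeout=10):
    """Fetch a URL and return its body as text (assumes UTF-8 content)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def fetch_jobs(job_ids, fetch=http_get):
    """Yield (job_id, xml_text) pairs; `fetch` can be swapped out for testing."""
    for job_id in job_ids:
        yield job_id, fetch(job_url(job_id))
```

A caller could loop over `fetch_jobs(["81283"])` and print each document so that it can be picked up for indexing. Error handling for missing or malformed documents is omitted here.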

---
If this reply helps you, Karma would be appreciated.
0 Karma