
Extract fields from XML tags

mvagionakis
Path Finder

Hello Splunkers,

I searched answers.splunk.com but couldn't find a solution.
I'm sorry if my research wasn't good enough and the answer already exists.

I want to send OfficeScan logs to Splunk, and the only way is through the Windows event logs.
The events arrive as XML, so many important fields are not extracted.

Below is an example of my log:

<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Serveur Trend Micro OfficeScan'/><EventID Qualifiers='49157'>500</EventID><Level>3</Level><Task>5</Task><Keywords>0x80000000000000</Keywords><TimeCreated SystemTime='2019-05-09T07:36:32.451876000Z'/><EventRecordID>134998</EventRecordID><Channel>Application</Channel><Computer>myserver.dom</Computer><Security UserID='S-1-5-18'/></System><EventData><Data>Virus/programme malveillant : TrojanSpy.Win32.NEGASTEAL.THDCOAI Ordinateur : infected_server Domaine : xxx\yyy_toto_20h\ Fichier : C:\xxxxxx\yyyyyyy-zzzzzzzz\AppData\Local\Temp\yyyyyyyy_892E\blabla blabla(~174 KB).rar (blablalbla(~174 KB).exe) Date/heure : 09/05/2019 09:36:06 Résultat : Quarantaine </Data></EventData></Event>

What I need is between the <Data> tags, that is to say the values like "Fichier", "Résultat", etc.

When I test my regex on https://regex101.com/ it works well, but once in Splunk it doesn't work anymore.
For example, I tried this regex to extract the "Domaine" value:

Domaine\s\:\s(?P<Domaine>([^\s]+))
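
As a quick sanity check outside Splunk, here is a minimal Python sketch that runs the same pattern against part of the <Data> text from the sample event. The `data` string is abbreviated from the log above and the variable names are only illustrative. It confirms that the pattern itself matches; it says nothing about how Splunk applies the pattern to the XML-rendered event.

```python
import re

# Part of the <Data> text, abbreviated from the sample event above.
data = r"Ordinateur : infected_server Domaine : xxx\yyy_toto_20h\ Fichier : C:\xxxxxx"

# Same pattern as in the post; (?P<name>...) is the named-group syntax
# that Python shares with PCRE.
pattern = re.compile(r"Domaine\s\:\s(?P<Domaine>([^\s]+))")

match = pattern.search(data)
domaine = match.group("Domaine") if match else None
print(domaine)
```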

In Splunk I tried the "| xmlkv" command, but it only extracts the content of "Data" as a whole, not the fields contained in it; I think that is normal behaviour.
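
The two levels involved can be sketched in Python, outside Splunk: a first step pulls the text out of the <Data> element (roughly what an XML key-value extraction gives), and a second step splits that text on its "label : value" pairs. This is only an illustration of the idea, not a Splunk configuration. The `EVENT` string is an abbreviated copy of the sample with a shortened file path, and the `LABELS` list and `inner_fields` helper are hypothetical names chosen for this sketch.

```python
import re
import xml.etree.ElementTree as ET

# Namespace declared on the <Event> element of the sample log.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Labels seen inside <Data> in the sample event (assumed to be the full set).
LABELS = ["Virus/programme malveillant", "Ordinateur", "Domaine",
          "Fichier", "Date/heure", "Résultat"]

# Abbreviated copy of the sample event; the file path is shortened.
EVENT = (
    "<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>"
    "<System><Channel>Application</Channel></System>"
    "<EventData><Data>"
    r"Virus/programme malveillant : TrojanSpy.Win32.NEGASTEAL.THDCOAI "
    r"Ordinateur : infected_server Domaine : xxx\yyy_toto_20h\ "
    r"Fichier : C:\xxxxxx\blabla.rar Date/heure : 09/05/2019 09:36:06 "
    "Résultat : Quarantaine </Data></EventData></Event>"
)

def inner_fields(event_xml):
    """Return the 'label : value' pairs found inside the <Data> element."""
    # Step 1: XML level, get the whole <Data> text as one string.
    data = ET.fromstring(event_xml).find("e:EventData/e:Data", NS).text
    # Step 2: split that string on the known labels; with one capturing
    # group, re.split returns [prefix, label1, value1, label2, value2, ...].
    splitter = re.compile("(" + "|".join(re.escape(l) for l in LABELS) + ") : ")
    parts = splitter.split(data)
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}

fields = inner_fields(EVENT)
for name, value in fields.items():
    print(f"{name} = {value}")
```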

So I suppose that I need a specific command to do this extraction, or I am missing something in my regex.

Do you have any suggestions, please?

Thank you in advance.

Have a great day.
Michail


mvagionakis
Path Finder

Hello again,

I finally found another solution, with which my regex works great.
I made a dedicated configuration for my antivirus servers only, and in my "inputs.conf" I set the "renderXml" option to "false" so that I receive my logs in plain text.

Now I can extract everything I need.

Thank you again.
Michail


MuS
SplunkTrust

Hi mvagionakis,

My French is not very good, but I'll try my best 😉

You can create a dynamic extraction using props.conf and transforms.conf like this. First, add a props.conf stanza that matches your sourcetype:

[le type de source d'analyse de bureau]
REPORT-001_mon-entrée-uniq-ici = extraireDesPairesDeValeursDeClésDynamiques

Next, configure a transforms.conf stanza:

[extraireDesPairesDeValeursDeClésDynamiques]
REGEX = \<(\w+)\>([^\<]+)\<
FORMAT = $1::$2

This creates a dynamic search-time extraction that uses the XML tag names as field names and the tag contents as their values.
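
To see which pairs this REGEX would capture from an event like the one in the question, here is a small Python sketch that applies the same pattern outside Splunk. The `EVENT` string is an abbreviated copy of the sample (the <Data> text is shortened), and Python's `re` module is only a stand-in for Splunk's regex engine, so treat the result as an approximation of the search-time behaviour.

```python
import re

# Abbreviated copy of the sample event from the question.
EVENT = (
    "<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>"
    "<System><EventID Qualifiers='49157'>500</EventID><Level>3</Level>"
    "<Channel>Application</Channel><Computer>myserver.dom</Computer></System>"
    "<EventData><Data>Ordinateur : infected_server "
    r"Domaine : xxx\yyy_toto_20h\ Résultat : Quarantaine </Data>"
    "</EventData></Event>"
)

# Same pattern as the REGEX line in the transforms.conf stanza above:
# group 1 is the tag name, group 2 is the text up to the next "<".
TAG_VALUE = re.compile(r"\<(\w+)\>([^\<]+)\<")

# Build tag -> value pairs, mirroring FORMAT = $1::$2.
extracted = dict(TAG_VALUE.findall(EVENT))
for tag, value in extracted.items():
    print(f"{tag} = {value}")
```

In this sketch, tags that carry attributes (such as EventID) are skipped because the pattern expects ">" right after the tag name, and the whole <Data> text comes back as a single "Data" value. The labels inside it, such as "Domaine", are not split out by this pattern.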

You must excuse me for trying to add some French to the answer; it might not have worked out in the end, but check this answer https://answers.splunk.com/answers/319646/how-to-write-the-regex-to-extract-data-inside-squa.html to get an idea of how it is done 🙂

Hope this helps anyway ...

Cheers, MuS


MuS
SplunkTrust
SplunkTrust

Or maybe I'll call @yannK to help 😉


yannK
Splunk Employee

Mordiou! I would be surprised if apostrophes and accents were allowed in stanza names :)


mvagionakis
Path Finder

Hello MuS,

Thank you very much for your feedback.

I've already tried these solutions, but they don't work.

Thank you for your prompt reply.
Michail
