I am trying to index hierarchical XML log files into Splunk. The file contains several groups of data linked by ID fields. I need to flatten out the data before indexing so queries can search the data as a flat table.
Here is an example of an XML log:
<?xml version="1.0" encoding="utf-8"?><FleetData dateCreated="2021-01-08T15:20:07.1046931Z" xmlns:a="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.cleverdevices.com/FleetManager"><Locations><Location><CustomerLocationId/><Deleted>false<... (T23)</Model><ObjectVersion>AAAAD5sp2kE=</ObjectVersion><OrganizationId>0</OrganizationId><VehicleTypeId>33</VehicleTypeId></VehicleType></VehicleTypes><Vehicles><Vehicle><Deleted>false</Deleted><EffectiveOrganizationId>0</EffectiveOrganizationId><InService>true</InService><LicensePlate/><LocationId>14</LocationId><ModelYear>2019</ModelYear><ObjectVersion>AAAAD5u2qqg=</ObjectVersion><OrganizationId>-1</OrganizationId><VehicleId>3201</VehicleId><VehicleNumber>1</VehicleNumber><VehicleTypeId>-1</VehicleTypeId><Vin/></Vehicle><Vehicle><Deleted>false</Deleted><EffectiveOrganizationId>-1</EffectiveOrganizationId><InService>true</InService><LicensePlate/><LocationId>32</LocationId><ModelYear>2017</ModelYear><ObjectVersion>AAAAD5u3+Os=</ObjectVersion><OrganizationId>-1</OrganizationId><VehicleId>2702</VehicleId><VehicleNumber>10</VehicleNumber><VehicleTypeId>33</VehicleTypeId><Vin/></Vehicle></Vehicles></FleetData>
The resulting index should look like this:
dateCreated | VehicleId | BusToolsBusType | LocationName | VehicleInService | VehicleDeleted |
2021-01-08T15:20:07.1046931 | 3201 | -1 | Orange | True | False |
2021-01-08T15:20:07.1046931 | 2702 | 28 | CMF | True | False |
I have searched the Splunk documentation and the Internet but cannot find any information on how to index these types of files. Any assistance with this would be greatly appreciated.
index=_internal | head 1 | fields _raw
| eval _raw="<?xml version=\"1.0\" encoding=\"utf-8\"?><FleetData dateCreated=\"2021-01-08T15:20:07.1046931Z\" xmlns:a=\"http://schemas.microsoft.com/2003/10/Serialization/Arrays\" xmlns:i=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns=\"http://www.cleverdevices.com/FleetManager\"><Locations><Location><CustomerLocationId/><Deleted>false<... (T23)</Model><ObjectVersion>AAAAD5sp2kE=</ObjectVersion><OrganizationId>0</OrganizationId><VehicleTypeId>33</VehicleTypeId></VehicleType></VehicleTypes><Vehicles><Vehicle><Deleted>false</Deleted><EffectiveOrganizationId>0</EffectiveOrganizationId><InService>true</InService><LicensePlate/><LocationId>14</LocationId><ModelYear>2019</ModelYear><ObjectVersion>AAAAD5u2qqg=</ObjectVersion><OrganizationId>-1</OrganizationId><VehicleId>3201</VehicleId><VehicleNumber>1</VehicleNumber><VehicleTypeId>-1</VehicleTypeId><Vin/></Vehicle><Vehicle><Deleted>false</Deleted><EffectiveOrganizationId>-1</EffectiveOrganizationId><InService>true</InService><LicensePlate/><LocationId>32</LocationId><ModelYear>2017</ModelYear><ObjectVersion>AAAAD5u3+Os=</ObjectVersion><OrganizationId>-1</OrganizationId><VehicleId>2702</VehicleId><VehicleNumber>10</VehicleNumber><VehicleTypeId>33</VehicleTypeId><Vin/></Vehicle></Vehicles></FleetData>"
| spath
| spath FleetData.Locations.Location.Deleted.Vehicles.Vehicle output=Vehicles
| stats values("FleetData{@*}") as * by Vehicles
| spath input=Vehicles
| fields - Vehicles
| table dateCreated VehicleId VehicleTypeId LocationId InService Deleted
index=_internal | head 1 | fields _raw
| eval _raw="<?xml version=\"1.0\" encoding=\"utf-8\"?><FleetData dateCreated=\"2021-01-08T15:20:07.1046931Z\" xmlns:a=\"http://schemas.microsoft.com/2003/10/Serialization/Arrays\" xmlns:i=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns=\"http://www.cleverdevices.com/FleetManager\"><Locations><Location><CustomerLocationId/><Deleted>false<... (T23)</Model><ObjectVersion>AAAAD5sp2kE=</ObjectVersion><OrganizationId>0</OrganizationId><VehicleTypeId>33</VehicleTypeId></VehicleType></VehicleTypes><Vehicles><Vehicle><Deleted>false</Deleted><EffectiveOrganizationId>0</EffectiveOrganizationId><InService>true</InService><LicensePlate/><LocationId>14</LocationId><ModelYear>2019</ModelYear><ObjectVersion>AAAAD5u2qqg=</ObjectVersion><OrganizationId>-1</OrganizationId><VehicleId>3201</VehicleId><VehicleNumber>1</VehicleNumber><VehicleTypeId>-1</VehicleTypeId><Vin/></Vehicle><Vehicle><Deleted>false</Deleted><EffectiveOrganizationId>-1</EffectiveOrganizationId><InService>true</InService><LicensePlate/><LocationId>32</LocationId><ModelYear>2017</ModelYear><ObjectVersion>AAAAD5u3+Os=</ObjectVersion><OrganizationId>-1</OrganizationId><VehicleId>2702</VehicleId><VehicleNumber>10</VehicleNumber><VehicleTypeId>33</VehicleTypeId><Vin/></Vehicle></Vehicles></FleetData>"
| spath
| spath FleetData.Locations.Location.Deleted.Vehicles.Vehicle output=Vehicles
| stats values("FleetData{@*}") as * by Vehicles
| spath input=Vehicles
| fields - Vehicles
| table dateCreated VehicleId VehicleTypeId LocationId InService Deleted