Need help parsing file
Each file represents a unique complete test. Here is a snippet of what we have.
Sections are separated by the word BREAKERWORD.
A file can contain multiple types of events.
The first part, with SYSTEM_PARM, is just a set of key-value pairs that we need to parse.
The second and third parts (in between the BREAKERWORD markers) are tests. NAME is the name of the test. The DATA entries represent parts of the test. The BEGIN and END sections then map to those parts of the test in the order in which they are presented. In other words, refFirstItem maps to the first BEGIN and END section, which holds these two values: -7.107569270486e+01 and -8.107569270486e+01.
If you have an actual solution to this problem, a massive thanks. But even if you only have suggestions on how to attack this, or on whether it is reasonably possible to do with Splunk, please pipe in and give a hand. Thank you in advance.
BREAKERWORD NAME PARAMETERS #SYSTEM_PARM SourceQueue ABCD123 #SYSTEM_PARM SourceNode EFGHI456 #SYSTEM_PARM DestinationQueue JKL789 #SYSTEM_PARM DestinationNode MNO012 BREAKERWORD NAME ABCD DATA refFirstItem DATA refSecondItem DATA refThirdItem BEGIN -7.107569270486e+01 -8.107569270486e+01 END BEGIN -2.767100000000e+01 END BEGIN -1.345589265277e+01 END BREAKERWORD NAME EFGH DATA ArefFirstItem DATA ArefSecondItem BEGIN -7.107569270486e+01 -8.107569270486e+01 END BEGIN -2.767100000000e+01 END
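To make the mapping described above concrete, here is a minimal Python sketch that reads the snippet token by token. It is only one possible reading of the format: it assumes tokens are separated by whitespace, that names and values contain no spaces, and that the NAME PARAMETERS section holds only #SYSTEM_PARM entries. The function name parse_test_file and the dictionary layout are made up for illustration.

```python
SAMPLE = (
    "BREAKERWORD NAME PARAMETERS #SYSTEM_PARM SourceQueue ABCD123 "
    "#SYSTEM_PARM SourceNode EFGHI456 #SYSTEM_PARM DestinationQueue JKL789 "
    "#SYSTEM_PARM DestinationNode MNO012 "
    "BREAKERWORD NAME ABCD DATA refFirstItem DATA refSecondItem DATA refThirdItem "
    "BEGIN -7.107569270486e+01 -8.107569270486e+01 END "
    "BEGIN -2.767100000000e+01 END BEGIN -1.345589265277e+01 END "
    "BREAKERWORD NAME EFGH DATA ArefFirstItem DATA ArefSecondItem "
    "BEGIN -7.107569270486e+01 -8.107569270486e+01 END "
    "BEGIN -2.767100000000e+01 END"
)


def parse_test_file(text):
    """Return (params, tests) from one file (hypothetical helper)."""
    params = {}    # SYSTEM_PARM key -> value
    tests = []     # one dict per named test section
    test = None    # test section currently being filled
    values = None  # currently open BEGIN ... END group
    tokens = iter(text.split())
    for tok in tokens:
        if tok == "BREAKERWORD":
            test = None
        elif tok == "NAME":
            name = next(tokens)
            if name != "PARAMETERS":  # assumption: PARAMETERS is not a test
                test = {"name": name, "items": [], "groups": []}
                tests.append(test)
        elif tok == "#SYSTEM_PARM":
            key = next(tokens)
            params[key] = next(tokens)
        elif tok == "DATA" and test is not None:
            test["items"].append(next(tokens))
        elif tok == "BEGIN" and test is not None:
            values = []
            test["groups"].append(values)
        elif tok == "END":
            values = None
        elif values is not None:
            values.append(float(tok))
    # Pair the n-th DATA item with the n-th BEGIN/END group.
    for t in tests:
        t["data"] = dict(zip(t["items"], t["groups"]))
    return params, tests


params, tests = parse_test_file(SAMPLE)
for t in tests:
    print(t["name"], t["data"])
```

Pairing by position with zip mirrors the rule that the first DATA entry belongs to the first BEGIN and END section. If a file ever has more DATA entries than groups, zip silently drops the extras, so a real script would probably want to check the counts.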
The file you are showing has quite an unusual structure that is hard to interpret using Splunk's built-in means alone. I suggest you look at modular inputs: they let you ingest any kind of data, flatten the structure, and pass it to Splunk. Start by reading about modular inputs in the "Getting Data In" manual for your version of Splunk Enterprise.
As @cusello wisely pointed out, you'll have to decide how to assign timestamps. With a modular input, you'll also need to determine how you ingest the file. Personally, I only have experience with receiving the data on a TCP port, and I made that port a parameter of my modular input (you specify the parameter when you create an actual input from your module). My modular input is backed by a Python script, but you can use other scripting languages or even executables. Either way, the script in your modular input holds all the knowledge about the file structure and transforms it into a simple "field1=val1, field2=val2" format.
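As a rough illustration of that last step, the sketch below turns already-parsed data into "field1=val1, field2=val2" lines and prints them to stdout. It is not my actual modular input. The function name to_events and the field names test, item, value1, value2 are invented for this example, and the sample dictionaries are hand-copied from the snippet in the question.

```python
def to_events(params, test_name, data):
    """Yield one 'key=value, key=value' line per DATA item (hypothetical helper)."""
    # Repeat the SYSTEM_PARM pairs on every line so each event stands alone.
    common = [f"{key}={value}" for key, value in params.items()]
    for item, values in data.items():
        fields = common + [f"test={test_name}", f"item={item}"]
        # Number the values of the BEGIN/END group: value1, value2, ...
        fields += [f"value{n}={v}" for n, v in enumerate(values, start=1)]
        yield ", ".join(fields)


# Hand-written sample input; values are kept as strings exactly as in the file.
params = {"SourceQueue": "ABCD123", "DestinationQueue": "JKL789"}
data = {
    "refFirstItem": ["-7.107569270486e+01", "-8.107569270486e+01"],
    "refSecondItem": ["-2.767100000000e+01"],
}

events = list(to_events(params, "ABCD", data))
for line in events:
    print(line)
```

Emitting one line per DATA item keeps each event flat, so Splunk's automatic key=value extraction should be able to pick up the fields without extra configuration. Whether to repeat the SYSTEM_PARM pairs on every event or send them once as a separate event is a design choice that depends on how you plan to search.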
Thank you very much for this answer. I haven't played with modular inputs, but it might be the right form of attack, since my colleague and I were actually talking about just writing a script to put the data into a format better suited for Splunk. I will need to review. As far as the timestamp goes, I really don't care. I have a MySQL database, which I have already connected to, that lists the 'sourcefile' name for each of these tests, and I was planning on using that to match them up.
Hi rvoninski [Splunk],
What is the goal of your search?
You could ingest these events with line breaking at BREAKERWORD, assigning each one either a timestamp from the file or the index time.
Assuming every test is ingested separately, you could correlate events by _time.
After that, you could use multivalue commands to separate and list all the values.
That way you could display the results, if that is your goal.
I'm not really worried about time since I can match the tests by the sourcefile name. One complete test is in every file. Can you give me an example of how to use multivalue commands? Thanks. Rich