So I've installed Splunk on my Mac (OS X) and downloaded DB Connect and the MySQL Java connector.
I'm still struggling to get started. The documentation seems copious but not hand-holding; it's still fairly geeky. It would be GREAT to have use cases or examples of things that one can index (precise steps) and then query into a dashboard (precise steps, and precise outputs: bar graphs, pie charts, maybe even a table that reorganizes data so that it's easy to import into Excel, etc).
I have XML files in a directory. Some nodes in the XML files refer to columns in the MySQL table.
All I need is to index the XML files:
Then, to index the data in the MySQL database:
Any guides or step-by-step instructions to get me started? How do I take complex XML files and convert them into meaningful indexes for Splunk so that I can report them in an easy table? Pie charts etc can come later.
Splunk sounds like a really powerful tool and I'm patient enough to want to learn it, but the documentation presumes that one is an IT admin. If the marketing promise of Splunk to make a foray into business analytics is real and sincere, we'd love to see some more hand-holding, step-by-step, use-case-style documentation.
Thanks for any pointers!
I have no experience using DB Connect, but I can help you out with indexing and extracting XML.
To index the XML files (if they always dump to the same directory),
edit your local/inputs.conf file and add:
[monitor:///directory/*.xml]
sourcetype = theSourcetypeYouWant
index = theindexyouwantitin
crcSalt = <source>
alwaysOpenFile = 1
disabled = false
You should be able to search for sourcetype=theSourcetypeYouWant and find the indexed data. To extract fields out of the XML you can do one of two things. Either click the down arrow next to an event, select Extract Fields, and give example values; Splunk will auto-create the regex for you and you can save it for future use. Or create the extractions manually: open the etc/system/local/props.conf file (create it if you don't have it already) and add the following:
[theSourcetypeYouWant]
EXTRACT-YourXMLExtractionName1 = (?i)<XMLtagInRawEvent1>(?P<YourXMLExtractionName1>[^<]+)
EXTRACT-YourXMLExtractionName2 = (?i)<XMLtagInRawEvent2>(?P<YourXMLExtractionName2>[^<]+)
Replace the placeholders with whatever you want to call the sourcetype and the extractions above. XMLtagInRawEvent is the actual tag as it appears in the XML file. If you do this correctly, you will see the fields on the left side when you run a search. Remember to restart Splunk after you edit the props.conf file.
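If it helps to see what those EXTRACT lines actually capture before touching props.conf, here is a minimal sketch in Python. It is only an approximation: it uses Python's `re` module rather than Splunk's own regex engine, and the sample event, the tag names (`orderId`, `customer`) and the field names are all made up for illustration.

```python
import re

# Hypothetical raw event; <orderId> and <customer> are invented tag names.
raw_event = "<order><orderId>1042</orderId><customer>Acme Corp</customer></order>"


def build_extraction(tag, field):
    """Build a pattern with the same shape as the EXTRACT- lines above:
    case-insensitive, capturing everything up to the next '<'."""
    return re.compile(r"(?i)<%s>(?P<%s>[^<]+)" % (re.escape(tag), field))


def extract_fields(event, mapping):
    """Apply one extraction per (tag -> field name) pair and collect the hits."""
    fields = {}
    for tag, field in mapping.items():
        match = build_extraction(tag, field).search(event)
        if match:
            fields[field] = match.group(field)
    return fields


fields = extract_fields(
    raw_event, {"orderId": "order_id", "customer": "customer_name"}
)
print(fields)
```

Each pattern only grabs the text of the first matching tag in an event, so nested or repeated tags would need their own extractions.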
I was in the same boat as you 6 months ago. Splunk is powerful, but you have to really immerse yourself to get what you want out of the tool past simple searches.
Thanks. I guess I asked for it 🙂
Even for a simple task (getting a bunch of XML files indexed and then queried the way I want), Splunk so far has not impressed. The documentation does not seem to have anything clear about structuring the indexing of XML in any sensible way; only regexps, which are clunky and hard to maintain.
Same with DB Connect. How do I index the data the first time? Can I specify the keys that I want to use later for querying? Will the tables be "snorted" once, and then any new rows in the tables be automatically added to the index? What if I want to change, in the future, the manner in which I query the index? And so on. The answers to these questions don't seem easy to discover.