Is it better to convert all log sources to syslog before searching them in Splunk? Would that be more standardised and make searches faster? Or do we not need to care, because Splunk handles this efficiently? I ask because there are many different types of log sources, namely CSV, plain text, Windows event logs, etc. How do I make sense of them if they are not uniform?
I would say DO NOT do any conversion. Splunk's core feature is that it can take data of varying formats and styles and, thanks to schema-on-the-fly and field extraction, let you compare and manipulate fields that represent the same data, regardless of the format.
Transforming the data beforehand only takes you back to the ETL world and creates dependencies.
Take advantage of what Splunk does best. Over time you'll see that the format of the individual events is not important, because you are manipulating fields and looking for trends in the values of those fields. At that point you rarely look at the raw events, because you use stats-type commands.
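To make the schema-on-the-fly idea concrete, here is a small Python sketch. It is only a conceptual model, not how Splunk is implemented: the sourcetype names, sample events, and extractor functions are all made up for illustration. The raw events are kept exactly as they arrived, fields are pulled out only when the data is read, and a stats-style count then works on the shared `src` field regardless of the original format.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical raw events, stored unmodified, each tagged with a made-up sourcetype.
RAW_EVENTS = [
    ("syslog_like", "Jan 12 10:01:02 fw01 kernel: DROP src=10.0.0.5 dst=10.0.0.9"),
    ("syslog_like", "Jan 12 10:01:07 fw01 kernel: DROP src=10.0.0.7 dst=10.0.0.9"),
    ("csv_like", "2024-01-12T10:02:00,10.0.0.5,login_failed"),
    ("text_like", "10:03:11 connection from 10.0.0.5 refused"),
]


def extract_syslog_like(raw):
    # Pull out every key=value pair found in the line.
    return dict(re.findall(r"(\w+)=(\S+)", raw))


def extract_csv_like(raw):
    # Assumed column order for this illustrative CSV source.
    row = next(csv.reader(io.StringIO(raw)))
    return dict(zip(["time", "src", "action"], row))


def extract_text_like(raw):
    # Free text: capture the IPv4-looking token after "connection from".
    match = re.search(r"connection from (?P<src>\d+(?:\.\d+){3})", raw)
    return match.groupdict() if match else {}


EXTRACTORS = {
    "syslog_like": extract_syslog_like,
    "csv_like": extract_csv_like,
    "text_like": extract_text_like,
}


def search(raw_events):
    # Apply the extraction for each event's sourcetype at read time.
    for sourcetype, raw in raw_events:
        fields = EXTRACTORS[sourcetype](raw)
        fields["sourcetype"] = sourcetype
        yield fields


def count_by(events, field):
    # Rough analogue of a "stats count by <field>" step.
    return Counter(e[field] for e in events if field in e)


counts = count_by(search(RAW_EVENTS), "src")
print(counts)
```

Nothing here rewrites the stored events. If an extraction turns out to be wrong, you fix the extractor and search again.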
Conversion requires a central point that collects the original data, converts it, and then forwards it. This is index-time work that is complicated, adds latency, and is unforgiving, because you cannot correct errors once they are made. With Splunk you can deal with most things at search time. For example, if a data field is named "source" and you prefer "src", you can alias source to src so that both names work at search time with no performance penalty.
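As a rough illustration of what a search-time alias means, here is a Python sketch. It is a conceptual model only: `apply_aliases` and the sample event are hypothetical, and this is not Splunk's actual mechanism. The alias is applied to a copy of the fields when the event is read, so the stored event stays as it was and both names can be used.

```python
def apply_aliases(fields, aliases):
    """Return a copy of fields with each alias added next to the original name."""
    result = dict(fields)
    for original, alias in aliases.items():
        # Only add the alias when the original exists and the alias is not already set.
        if original in fields and alias not in result:
            result[alias] = fields[original]
    return result


# Hypothetical alias table and event, for illustration only.
ALIASES = {"source": "src"}
event = {"source": "10.0.0.5", "action": "blocked"}

aliased = apply_aliases(event, ALIASES)
print(aliased)
```

Because the alias lives in the lookup step rather than in the data, changing or removing it later does not require touching anything already stored.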
Converting all those sources would take time... and Splunk can handle all of those other sources (some very easily), so my recommendation would be to keep the format. We use a ton of syslog, but also have lots of other types, and Splunk can handle them all.
Though it's up to you how you push data to Splunk, you really don't need to convert all your sources. What matters is the sourcetype.
The source is the name of the file, stream, or other input from which a particular event originates.
The sourcetype specifies the format for the event. Splunk uses this field to determine how to format the incoming data stream into individual events. Events with the same source type can come from different sources.
Since Splunk Enterprise uses the source type to decide how to format your data, it is important that you assign the correct source type to your data. That way, the indexed version of the data (the event data) looks the way you want, with appropriate timestamps and event breaks. This facilitates easier searching of the data later.
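To show why the source type matters for event breaks and timestamps, here is a simplified Python sketch. The rule table, the `app_multiline` name, and the sample stream are assumptions made up for this example, and real Splunk parsing is far more involved. The idea is just that the rules attached to a source type decide where one event ends and the next begins, and how the timestamp is read.

```python
import re
from datetime import datetime

# Hypothetical per-sourcetype rules: how an event starts and how its timestamp looks.
SOURCETYPE_RULES = {
    "app_multiline": {
        "event_start": re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
        "time_format": "%Y-%m-%d %H:%M:%S",
        "time_length": 19,
    },
}


def parse_time(raw, rule):
    # Read the timestamp from the start of the event; None if it does not parse.
    try:
        return datetime.strptime(raw[: rule["time_length"]], rule["time_format"])
    except ValueError:
        return None


def break_events(stream, sourcetype):
    rule = SOURCETYPE_RULES[sourcetype]
    events = []
    for line in stream.splitlines():
        if rule["event_start"].match(line) or not events:
            # A line that begins with a timestamp starts a new event.
            events.append(line)
        else:
            # Anything else (e.g. a stack trace line) belongs to the previous event.
            events[-1] += "\n" + line
    return [{"_time": parse_time(e, rule), "_raw": e} for e in events]


STREAM = (
    "2024-01-12 10:00:00 ERROR request failed\n"
    "  Traceback (most recent call last):\n"
    "  ValueError: bad input\n"
    "2024-01-12 10:00:05 INFO request ok\n"
)

events = break_events(STREAM, "app_multiline")
for e in events:
    print(e["_time"], repr(e["_raw"]))
```

With the wrong rules for this data (say, one event per line), the traceback lines would become separate events without usable timestamps, which is the kind of problem a correct source type avoids.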
Once you have indexed data with correct sourcetypes, you can use SPL to search and correlate your data.