I have about six seconds' worth of data in a CSV file. Each CSV record has, among other fields, "process", "operation", and "OpValue". I would like to see a report that resembles a strip chart of "operation" by "process", with a different color for each "process". I used Splunk lookups to create the numeric field "OpValue" to represent the ASCII "operation" field. The "process" field is an ASCII string. When I attempt "....search.... | timechart OpValue by process" I get an error that a function is required in the position of "OpValue". I am new to Splunk and I am sure I am missing something 🙂 The search is the default search that provides a view of all records in the file.
Using Splunk to plot exact (x,y) coordinates based on individual events is difficult. Normally, Splunk uses aggregation functions so that it can be flexible with respect to the time period each chart entry is representing. You can fake it, but it's a pain.
The field Splunk uses internally for event time is _time. If your TimeOfDay field isn't getting mapped to this, then you will not get a timechart.
Try something like this:
source="LogFileA.CSV" host="mycomputer" sourcetype="csv"
| eval _time=TimeOfDay
| timechart span=10ms values(OpValue) by process
Thanks for your time and input. I appreciate your help and advice. I understand what you are saying about what Splunk is optimized for. I have been using Excel until now, and the effort is very manual. I am capturing "Machine Data"; it just so happens that I get over 500K events in 6 seconds. I am always capturing about 6 to 10 seconds of data and then analyzing what happened during that period. I may need to preprocess the CSV files, or extract/create a field that scales the timestamp to better align with Splunk's time resolution, so that I can use the built-in functions. I am after the visualization capability and a less manual workflow.
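If preprocessing the CSV is the route taken, a minimal Python sketch of the timestamp conversion might look like the following. It assumes TimeOfDay values shaped like "8:39:15.1234567 PM" (the sample quoted in this thread), that the collection date is supplied separately because the file carries only a time of day, and that UTC is acceptable. The function name `to_epoch_us` and its `collect_date` argument are hypothetical, not anything Splunk provides.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_us(time_of_day: str, collect_date: str) -> int:
    """Combine a 'h:mm:ss.fffffff AM/PM' time with a 'YYYY-MM-DD' date.

    Returns whole microseconds since the Unix epoch (UTC is an assumption).
    """
    clock, meridiem = time_of_day.strip().rsplit(" ", 1)
    whole, _, frac = clock.partition(".")
    # strptime's %f accepts at most six fractional digits, so digits beyond
    # microseconds are dropped and short fractions are right-padded.
    micro = frac[:6].ljust(6, "0")
    stamp = f"{collect_date} {whole}.{micro} {meridiem}"
    parsed = datetime.strptime(stamp, "%Y-%m-%d %I:%M:%S.%f %p")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer arithmetic avoids float rounding at microsecond precision.
    return (parsed - EPOCH) // timedelta(microseconds=1)
```

The result could be written back to the CSV as an extra column (in seconds or microseconds, whichever the ingest configuration expects) so that the real collection date and the sub-millisecond part are both preserved.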
Like I was saying, unless you can usefully aggregate your data somehow, this is going to be a problem for you. Splunk's "visualization capability" does not extend to plotting 500K events on a chart, no matter what the time scale is.
I think that, as suggested, I need to investigate the mapping of TimeOfDay to the _time value. We are getting warm. I actually saw a few results in statistics and visualization. I had to remove the eval _time=TimeOfDay and change to span=1ms. There are 55 events in my CSV data and they all happen in less than 11ms. Without the eval statement above and with span=1ms I get 12 results on the statistics tab and about 5 points on the visualization tab. On the statistics tab, multiple events are grouped within single _time entries. This seems to point to the resolution of _time.
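That grouping is what one would expect from fixed-width time buckets: events whose timestamps differ by less than the span land in the same row. A rough Python illustration follows. It is only loosely analogous to `timechart span=... values(OpValue) by process`, not Splunk's implementation, and the timestamps, process names, and the `bucket_values` helper are all made up for the example.

```python
from collections import defaultdict

def bucket_values(events, span_us):
    """Group (time_us, process, op_value) tuples into fixed-width buckets.

    Returns {bucket_start_us: {process: sorted distinct op_values}}.
    """
    buckets = defaultdict(lambda: defaultdict(set))
    for time_us, process, op_value in events:
        start = (time_us // span_us) * span_us  # floor to the bucket start
        buckets[start][process].add(op_value)
    return {
        start: {proc: sorted(vals) for proc, vals in procs.items()}
        for start, procs in sorted(buckets.items())
    }

# Hypothetical events: microsecond timestamps, process name, OpValue.
events = [
    (1000, "procA", 3),
    (1400, "procA", 5),
    (1900, "procB", 2),
    (2100, "procA", 3),
]

by_ms = bucket_values(events, span_us=1000)    # 1 ms buckets
by_100us = bucket_values(events, span_us=100)  # 100 us buckets
```

With 1 ms buckets the first three events share one row; with 100 µs buckets each event gets its own row, which only helps if the stored timestamps really carry that resolution.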
If your time value actually contains microseconds, you can use something like span=100us.
In general, though, this is not a good long-term strategy. It may work for this CSV file, but in other cases eventually you will run into the limits of either the chart or the result set. You might be able to make it work in this very focused case, but it's not really the best example of what Splunk is optimized for.
There are three issues here:
timechart does require an aggregation function of some kind to work. You can find a list of them here.
timechart
data settled, click the "Visualization" tab and see what you can do with it out of the box. If you really need a strip chart, there's a much more involved process to do that.
I call what I want a strip chart, but I think it would be a scatter chart with lines connecting the plotted OpValues against time by process. In either case, how do I plot the OpValue in each event by process against time, assuming Splunk does generate canned reports?
The raw CSV data does contain a "TimeOfDay" field, which looks like "8:39:15.1234567 PM". I believe that the Splunk ingest uses that field to create the time column, which looks like "6/9/15 8:39:15.123 PM". Of course, the date Splunk adds to the front of the field is not the date of collection but the current date at ingest. I have tried using a function such as sum(OpValue+0) as a NOP to satisfy Splunk, but that didn't seem to provide data in my visualization.
search: source="LogFileA.CSV" host="mycomputer" sourcetype="csv" | timechart sum(OpValue+0) by process