I am a new user trying Splunk for the first time. I am trying to visualize some csv files so we have trending information or a storage 'dashboard' of sorts.
I have a CSV file with these columns :
Date,Device,Type,Model,Protocols,location,UsedTB,FreeTB,TotalTB,Tier
5/1/2013,vnx5500,Array,VNX,FC,SiteX,54,3,57,2
7/1/2014,vnx7500,Array,VNX,FC,SiteY,518,435,953,2
7/1/2014,vnx5500,Array,VNX,FC,SiteX,54,28,82,2
1/1/2015,vnx5500,Array,VNX,FC,SiteX,62,22,84,2
1/1/2015,vnx7500,Array,VNX,FC,SiteY,586,423,1009,2
2/1/2015,xtrem-1,Array,XtremIO,FC,SiteY,0.3,7.2,7.5,1b
3/1/2015,xtrem-1,Array,XtremIO,FC,SiteY,0.7,6.8,7.5,1b
Every few weeks, rows are appended to the CSV. The values are almost always the same, aside from FreeTB, UsedTB, and TotalTB, which are the values we want to total and trend.
I have this data indexed in Splunk, but I'm struggling to figure out how to work with it. I'd like to be able to graph by any 'Device', 'location', etc., and by graph I mean trend the growth by date per ALL, Site, Array. The goal is to give us growth charts over time: overall, by site, by month, year, tier, etc.
I can easily search for single values, but having Splunk add all data for dateX and SiteY and show a graph, or even just showing overall growth (without Splunk adding all the values up and giving incorrect information), is proving to be a bit tricky.
Should I change how this data is formatted before indexing, or can it be used to provide what I'm looking for? Will I be able to use this source as I go forward with Splunk, or will my searches and graphs (charts) have to be updated every time a new set of rows (and date) is added?
Thanks for any help.
Try this (and adjust Thing as you see fit):
... | eval Thing = Device . "/" . Type . "/" . Model . "/" . Protocols . "/" . location | timechart span=1d sum(UsedTB) BY Thing
Can Thing be all sites or devices somehow? I've tried a few different 'things' and usually I end up with 1 data point for the time that the .csv was indexed, I can't determine why the value for that one point in time is what it is yet either.
I guess I'm unsure what the intention of 'Thing' is currently.
The visualization showed me one data point as well, for the date of indexing (not any of the dates in the CSV for the values). What I would hope to get is a graph trending those values as they go up or down over time in the CSV.
Thanks for the help,
Nick
That's the point; whatever you decide to put into "Thing" is how the UsedTB values are aggregated. If you'd like it ALL clumped together, then just remove BY Thing altogether.
You will only get one data point if your timepicker is set to less than a day (e.g. "Last Hour"), so you need to make it something like "Last 7 days" to see more datapoints.
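To make the aggregation choices above concrete, here are a few variants of the same search as a sketch, not a tested recipe. The `...` stands for whatever base search returns the CSV events, as in the original suggestion; the field names come from the CSV header, and `span=1mon` is an assumption chosen to match rows that arrive every few weeks.

```
... | timechart span=1mon sum(UsedTB) BY location
... | timechart span=1mon sum(UsedTB) BY Tier
... | timechart span=1mon sum(UsedTB) BY Device
... | timechart span=1mon sum(UsedTB)
```

Note that sum() adds every row that falls inside a span, so a device recorded twice in the same month would be counted twice. One possible way around that, assuming events are timestamped from the Date column, is to keep only the latest reading per device per month before summing (the field name month is made up for illustration):

```
... | bin _time span=1mon AS month
    | stats latest(UsedTB) AS UsedTB BY month, Device, location
    | eval _time = month
    | timechart span=1mon sum(UsedTB) BY location
```

Because these searches group by field values rather than by specific dates, they should not need editing when new rows are appended.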
Can it use the dates in the spreadsheet, or will it always use the date/time at which the file (CSV) was indexed? I think that may be part of my issue, as the dates in the spreadsheet go back 8 months (and will continue to go back further), whereas the dates that the file is written and that Splunk is stamping it with really mean nothing to me.
I had assumed that you already set it up so that it was timestamping based on the first field in each event; is that not the case? How are the events being timestamped? Naturally, if you did nothing to tell Splunk what to do, it should have used the first field of each line for each event's timestamp; is it not doing this?
I must not - It shows the creation/index time as the timestamp and I'm seeing my date field in the event section. I think this must be key as nothing is acting right as you've suggested.
The events are timestamped with a day they were entered, the first entry in the csv is a date but it's just mm/dd/yyyy format. Just like the example above.
Yes, actually there is only 1 event per thing every 2-4 weeks. Yes exactly!
How exactly do I leverage that? Using that syntax (TIME_FORMAT=%m/%d/%Y and TIME_PREFIX=^) in my search did not return anything. Sorry, you truly are dealing with a noob!
How did you set up your input? Did you use the GUI or the CLI (inputs.conf)? Both methods allow you to set these things.
I used the GUI and added it as a Data Source (Folder Monitor).
Go to Settings -> Data inputs -> Files & directories, find the place for the TIME* settings that I gave, and save them in the correct spot.
All I have is 'Host','Source Type','Index', and 'Advanced Options(Whitelist/blacklist)' - Nothing for reformatting the time that I can tell.
I'm adding a new source(file) - Which is letting me adjust the time format, it can't recognize my 'date' format as a time format it can use yet however.
It was due to me adding the folder the file was being written to. I've now added the file itself and I'm trying to get it to recognize the time format.
There absolutely should be something that walks you through identifying the timestamp. I never use the GUI/wizard so I cannot be more specific.
But there is only 1 event per "Thing" every day, so it would be good to timestamp based on that first field, right? If so, tell Splunk to do that with TIME_FORMAT=%m/%d/%Y and TIME_PREFIX=^. Then all the stuff we have discussed will work.
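Those two settings are applied at index time, which is why putting them in a search returns nothing. For events that are already indexed with the wrong timestamp, a search-time workaround may be enough to check the charts before re-indexing. This is only a sketch: it assumes the first column is available as an extracted field named Date, and that strptime accepts the non-zero-padded dates shown in the sample.

```
... | eval _time = strptime(Date, "%m/%d/%Y")
    | timechart span=1mon sum(UsedTB) BY location
```

The timepicker still filters on the timestamp Splunk originally assigned, so with this approach it has to cover the time the file was indexed (for example "All time"), not the dates inside the CSV.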