
First time importing data into Splunk

niiick
New Member

I am a new user trying Splunk for the first time. I am trying to visualize some csv files so we have trending information or a storage 'dashboard' of sorts.

I have a CSV file with these columns:

Date,Device,Type,Model,Protocols,location,UsedTB,FreeTB,TotalTB,Tier
5/1/2013,vnx5500,Array,VNX,FC,SiteX,54,3,57,2
7/1/2014,vnx7500,Array,VNX,FC,SiteY,518,435,953,2
7/1/2014,vnx5500,Array,VNX,FC,SiteX,54,28,82,2
1/1/2015,vnx5500,Array,VNX,FC,SiteX,62,22,84,2
1/1/2015,vnx7500,Array,VNX,FC,SiteY,586,423,1009,2
2/1/2015,xtrem-1,Array,XtremIO,FC,SiteY,0.3,7.2,7.5,1b
3/1/2015,xtrem-1,Array,XtremIO,FC,SiteY,0.7,6.8,7.5,1b

Every few weeks rows are appended to the CSV. The values are almost always the same, aside from FreeTB, UsedTB, and TotalTB, which are the values we want to total and trend.

I have this data indexed in Splunk, but I'm struggling to figure out how to work with it. I'd like to be able to graph by any 'Device', 'location', etc., and by graph I mean trend the growth by date for ALL, per Site, and per Array. The goal is to give us growth charts over time: overall, by site, by month, year, tier, etc.

I can easily search for single values, but having Splunk add all data for dateX and SiteY and show a graph, or even just showing overall growth (without Splunk adding all the values up and giving incorrect information), is proving to be a bit tricky.

Should I change how this data is formatted before indexing, or can it be used to provide what I'm looking for? Will I be able to use this source as I go forward with Splunk, or will my searches and graphs (charts) have to be updated every time a new set of rows (and date) is added?

Thanks for any help.


woodcock
Esteemed Legend

Try this (and adjust Thing as you see fit):

... | eval Thing = Device . "/" . Type . "/" . Model . "/" . Protocols . "/" . location | timechart span=1d sum(UsedTB) BY Thing

niiick
New Member

Can Thing be all sites or devices somehow? I've tried a few different 'things' and usually I end up with 1 data point for the time that the .csv was indexed. I can't determine why the value for that one point in time is what it is yet, either.

I guess I'm unsure what the intention of 'Thing' is currently.

The visualization showed me one data point as well, for the date of indexing (not any of the dates in the CSV for the values). What I would hope to get is a graph trending those values as they go up or down over time in the CSV.

Thanks for the help,

Nick


woodcock
Esteemed Legend

That's the point; whatever you decide to put into "Thing" is how the used-bytes are aggregated. If you'd like it ALL clumped together, then just remove BY Thing altogether.

You will only get one data point if your timepicker is set to less than a day (e.g. "Last Hour") so you need to make it something like "Last 7 days" to see more datapoints.
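
As a rough sketch of what that looks like in practice, here are three variations of the search above: everything clumped together, split per site, and split per device. The `span=1mon` bucket is an assumption based on the sample rows (one snapshot per device per month); if a device has more than one row in a bucket, `sum` will count it twice, so a different span or a function such as `max` may fit better. All three only trend properly once each event's `_time` reflects the Date column rather than the index time.

```
... | timechart span=1mon sum(UsedTB) AS UsedTB
... | timechart span=1mon sum(UsedTB) BY location
... | timechart span=1mon sum(UsedTB) BY Device
```

Because these searches aggregate whatever events fall inside the selected time range, they should not need editing when new rows are appended to the CSV.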


niiick
New Member

Can it use the dates in the spreadsheet, or will it always use the date/time at which the file (CSV) was indexed? I think that may be part of my issue, as the dates in the spreadsheet go back 8 months (and will continue to go back further), whereas the dates that the file is written and that Splunk is stamping it with really mean nothing to me.


woodcock
Esteemed Legend

I had assumed that you already set it up so that it was timestamping based on the first field in each event; is that not the case? How are the events being timestamped? Naturally, if you did nothing to tell Splunk what to do, it should have used the first field of each line for each event's timestamp; is it not doing this?


niiick
New Member

I must not have. It shows the creation/index time as the timestamp, and I'm seeing my date field in the event section. I think this must be key, as nothing is acting the way you've suggested.

The events are timestamped with the day they were entered. The first entry in the CSV is a date, but it's just in mm/dd/yyyy format, just like the example above.


niiick
New Member

Yes, actually there is only 1 event per thing every 2-4 weeks. Yes exactly!

How exactly do I leverage that? Using that syntax (TIME_FORMAT=%m/%d/%Y and TIME_PREFIX=^) in my search did not return anything. Sorry, you truly are dealing with a noob!


woodcock
Esteemed Legend

How did you set up your input? Did you use the GUI or CLI (inputs.conf)? Both methods allow you to set these things.


niiick
New Member

I used the GUI and added it as a Data Source (Folder Monitor).


woodcock
Esteemed Legend

Go to Settings -> Data inputs -> Files & directories and find the place for the `TIME_*` settings that I gave, and save them in the correct spot.


niiick
New Member

All I have is 'Host', 'Source Type', 'Index', and 'Advanced Options (Whitelist/blacklist)'. There is nothing for reformatting the time that I can tell.


niiick
New Member

I'm adding a new source (file), which is letting me adjust the time format. It can't recognize my 'date' format as a time format it can use yet, however.


niiick
New Member

It was due to me adding the folder the file was writing to. I've added the folder and I'm now trying to get it to recognize the time format.


woodcock
Esteemed Legend

There absolutely should be something that walks you through identifying the timestamp. I never use the GUI/wizard so I cannot be more specific.
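
If the wizard does not cooperate, one possible stopgap is to override the timestamp at search time instead. This is only a sketch, and it assumes the first column is available as an extracted field named `Date` (matching the CSV header); if that field is not extracted, the `eval` yields nothing useful. It leaves the indexed timestamps untouched, so the timepicker still has to cover the time the file was indexed (for example "All time").

```
... | eval _time = strptime(Date, "%m/%d/%Y") | timechart span=1mon sum(UsedTB) BY location
```

The `span=1mon` and `BY location` parts are illustrative choices; any of the other split fields could be used in the same way.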


woodcock
Esteemed Legend

But there is only 1 event per "Thing" every day, so it would be good to timestamp based on that first field, right? If so, tell Splunk to do that with TIME_FORMAT=%m/%d/%Y and TIME_PREFIX=^. Then all the stuff we have discussed will work.
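
For reference, a minimal sketch of how those two settings might look when written by hand. They are index-time settings that belong in props.conf (for example under $SPLUNK_HOME/etc/system/local/), in a stanza named after the sourcetype assigned to the input. The stanza name `storage_capacity_csv` is hypothetical, and `MAX_TIMESTAMP_LOOKAHEAD` is an optional addition not mentioned above.

```
# Hypothetical sourcetype name; use the one assigned to your input.
[storage_capacity_csv]
# The timestamp starts at the very beginning of each line.
TIME_PREFIX = ^
# Matches dates such as 5/1/2013 (month/day/four-digit year).
TIME_FORMAT = %m/%d/%Y
# Optional assumption: only scan the first 10 characters for the date.
MAX_TIMESTAMP_LOOKAHEAD = 10
```

These settings only affect data indexed after they are in place (a restart is typically needed for hand-edited .conf files), so events that already carry the index time as their timestamp would have to be re-indexed to pick up the dates from the file.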
