Mysterious data spike

Communicator

I have a plethora of survey data from several thousand video conference calls.

After each call, users are asked to fill out a survey.

I have found a mysterious spike in the number of survey results during the month of April that is skewing my visualizations.

The spike can be seen here:

[screenshot: chart showing the spike]

Here is a sample of the event data:
[screenshot: sample of the event data]

This makes no sense, as there was no surge in the number of calls during this time. I suspect my data is corrupt, but I'm not sure how or why.

Please advise.


Re: Mysterious data spike

SplunkTrust

Have you ruled out a change in response rate? Perhaps the April calls encouraged more people to answer the surveys.

---
If this reply helps you, an upvote would be appreciated.

Re: Mysterious data spike

Splunk Employee

Eyeballing the timeline in the Events tab, it does appear there is a spike in events. If you click "Format Timeline" and change it to Full, it is easier to analyze. It looks like there were double the events in the middle of the timeline.

My hunch would be duplicate events, or perhaps timestamping or line-breaking problems. Any reason you didn't use timechart instead? It looks like you're extracting time accurately...

Try this to validate the spike in records:

index=webex_sentiment | timechart span=1h count

Set your time picker to the week that shows the spike. Any hints?

If the data appears clean on those days with spikes, then there simply must have been more responses.
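
If you can get at the source CSV rows outside Splunk, the duplicate-event hunch can also be checked offline. The following is only a minimal sketch: the function name and the sample rows are made up for illustration and are not taken from the real data set.

```python
from collections import Counter


def find_duplicates(raw_lines):
    """Return {line: count} for raw events that appear more than once."""
    counts = Counter(line.strip() for line in raw_lines if line.strip())
    return {line: n for line, n in counts.items() if n > 1}


# Hypothetical sample rows, invented for this example.
sample = [
    "Satisfied,4,Germany,4/18/17 9:15",
    "Satisfied,4,Germany,4/18/17 9:15",
    "Very Satisfied,5,Japan,4/18/17 9:20",
]

print(find_duplicates(sample))
```

Keep in mind that two identical rows are not necessarily a duplicate ingest: if the timestamps only have minute resolution, two different people can legitimately submit the same answers in the same minute.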


Re: Mysterious data spike

Communicator

@mmodestino

I followed your instructions and I believe that the issue is duplicate events.

Here is my query with "Format Timeline": http://imgur.com/a/MEJJv

With this visual, you can truly see the data spike: http://imgur.com/a/Q4DxY

April 5 has 1,239 events

April 18 has 1,418 events
April 19 has 1,500 events
April 20 has 1,380 events

April 26 has 1,398 events
April 27 has 1,329 events
April 28 has 1,029 events

Outside of these few weeks in April, there are no days that come anywhere near 1,000 events.

I drilled down to the individual hours per your suggestion, and found what appears to be a multitude of duplicate events: http://imgur.com/a/laaWD

What should be my next steps from here?


Re: Mysterious data spike

Splunk Employee

Actually, it appears to be timestamp related. In other words, we need to clean up your props.conf to ensure we extract the timestamp properly. For example, in your second screenshot, we can see that the timestamp in the event is not the same as the _time field.

[screenshot: event whose embedded timestamp differs from its _time field]

As you can see, the event has been given a timestamp of July 10th. It can be tough for Splunk to determine the date, and because the automatic extraction failed, Splunk is assigning July 10 (mod time?) rather than the July 4 or June 28 that appears in the event itself.

What we need to do is to help Splunk with the time format, as the auto extraction is letting you down.

Can you paste a few of these raw events from this screenshot here so I can run them through the Add data wizard for you and help you build a better timestamp extraction?

Ideally we will use TIMESTAMP_FIELDS and a TIME_FORMAT like %m/%d/%y %k:%M in props.conf.

This demonstrates a best practice: we should always build our props to define where to find the timestamp and how to process its format, among other golden rules. The Add Data wizard is really great for that.
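
To illustrate what an explicit time format buys you, here is a rough Python sketch (not Splunk configuration) that parses the timestamp embedded in an event and compares it with the _time that was assigned. The row is shaped like the events posted later in this thread, with the empty columns trimmed for brevity. Which column holds the right timestamp is an assumption, since the CSV header is not known yet, and Python's %H stands in for Splunk's %k because %k is not portable in Python's strptime.

```python
import csv
from datetime import datetime

# Assumed format of the timestamp inside each event, e.g. "7/4/17 0:56".
EVENT_TIME_FORMAT = "%m/%d/%y %H:%M"


def embedded_time(raw_event, column=-2):
    """Parse the timestamp stored in one CSV column of a raw event.

    The default column (second to last) is a guess for illustration only.
    """
    fields = next(csv.reader([raw_event]))
    return datetime.strptime(fields[column], EVENT_TIME_FORMAT)


# Shortened event: most of the empty columns are removed for readability.
raw = "Very Satisfied,5,,,China,7/4/17 0:56,7/4/17 0:56"

# The _time shown for this event in the list view (7/10/17 12:40:28 AM).
assigned = datetime(2017, 7, 10, 0, 40, 28)

drift = assigned - embedded_time(raw)
print("embedded:", embedded_time(raw))
print("assigned:", assigned)
print("drift in whole days:", drift.days)
```

A drift of several days between the two values is the kind of mismatch that would pile events from many different dates onto a single day in the timeline.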


Re: Mysterious data spike

Communicator

@mmodestino

Thank you so much!!

Here are the first eight events in list form:

7/10/17
12:40:28.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,7/4/17 0:56,7/4/17 0:56
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 7.4.17.csv sourcetype = csv

7/10/17
12:40:28.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,7/4/17 0:04,7/4/17 0:04
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 7.4.17.csv sourcetype = csv

7/10/17
12:37:39.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Philippines,6/29/17 0:32,6/29/17 0:32
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 6.29.17.csv sourcetype = csv

7/10/17
12:37:39.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Philippines,6/29/17 0:10,6/29/17 0:10
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 6.29.17.csv sourcetype = csv

7/10/17
12:37:34.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,6/28/17 0:45,6/28/17 0:45
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 6.28.17.csv sourcetype = csv

7/10/17
12:37:34.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Japan,6/28/17 0:36,6/28/17 0:36
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 6.28.17.csv sourcetype = csv

7/10/17
12:37:34.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,6/28/17 0:08,6/28/17 0:08
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 6.28.17.csv sourcetype = csv

7/10/17
12:37:28.000 AM 
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Taiwan,6/27/17 0:08,6/27/17 0:08
host = BDC-ESSSPLK01 source = G:\AutoIndex\webex_sentiment\WebEx Sentiment Survey2_Responses 6.27.17.csv sourcetype = csv

Here are those same eight events in raw form:

Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,7/4/17 0:56,7/4/17 0:56
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,7/4/17 0:04,7/4/17 0:04
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Philippines,6/29/17 0:32,6/29/17 0:32
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Philippines,6/29/17 0:10,6/29/17 0:10
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,6/28/17 0:45,6/28/17 0:45
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Japan,6/28/17 0:36,6/28/17 0:36
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,China,6/28/17 0:08,6/28/17 0:08
Very Satisfied,5,,,,,,,,,,,,,,,,,,,,,,Taiwan,6/27/17 0:08,6/27/17 0:08

Re: Mysterious data spike

Splunk Employee

Ah, also, can you provide the header values from the CSV? I could just make one up, but it's better that we get you sorted the whole way.


Re: Mysterious data spike

Communicator

Where do I find that?


Re: Mysterious data spike

Splunk Employee

They are in the first line of one of the raw CSV files, or they were hardcoded in your props.conf when the data was onboarded.


Re: Mysterious data spike

Communicator

I must apologize for my lack of knowledge; I'm very new to Splunk.

Where can I find the raw csv files or the props.conf?
