Getting Data In

collect not storing the extracted fields into new index and what is way to save all extracted fields into new index

Path Finder


Description:
I'm currently experimenting with the task below on a standalone Splunk test instance. Once tested, it has to move to production.
The original log data contains partial JSON sent over syslog, so the format is roughly: _time server-name ipaddress INFO .... { <> }

What I need:
Extracting the JSON key-value pairs in the search query and loading the values is very slow, which is not acceptable. So I want to extract all the key-value pairs once and store them in a new index, and then write my queries against that new index, which should hold a plain log format with name-value pairs.

QUESTION-1: Should my new index be a summary index or a regular index?
QUESTION-2: How do I save all the extracted field values into the new index?


Because the logs contain only partial JSON, Splunk generally takes a long time to extract the key-value pairs with the 'spath' command. So I decided to create a scheduled search that runs every 5 minutes and writes the name/value pairs into a new summary index using the 'collect' command.

Below is the sample query. The original log is indexed in the webanalytics index, and the query extracts the key-value pairs from the JSON value:

index=webanalytics | rex max_match=10 "(?<jsonfield>\{[^}]+\})" | mvexpand jsonfield | spath input=jsonfield | rename activefeatures{} as AFeatures, actions{} as AActions | collect index=siwebanalytics

QUESTION-3: After the scheduled search executed, I still see the same raw JSON log in siwebanalytics. Why has the collect command not stored the extracted key-value pairs?

QUESTION-4: In a distributed deployment, where does the siwebanalytics index need to be created: on the search head or on the indexers?

Thanks in advance.


Re: collect not storing the extracted fields into new index and what is way to save all extracted fields into new index

Motivator

I don't think collect is doing what you think it's doing... so set that aside for now.

See this answer: http://answers.splunk.com/answers/131911/collect-command
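The short version, as I understand collect's behavior: collect writes each result's _raw to the destination index, so search-time extractions are dropped. If the results have no _raw (for example, after table or stats), collect instead serializes the remaining fields as key=value text. A sketch of the query from the question reworked on that assumption (index and field names as in the thread):

```spl
index=webanalytics
| rex max_match=10 "(?<jsonfield>\{[^}]+\})"
| mvexpand jsonfield
| spath input=jsonfield
| rename activefeatures{} as AFeatures, actions{} as AActions
| table _time, AFeatures, AActions
| collect index=siwebanalytics
```

The table step is what discards _raw; without it, collect simply copies the original JSON event, which would match what you are seeing.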

To extract the JSON fields at search time, add

REPORT-myJson = grabJsonField , breakUpJson

to props.conf.
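(For completeness, the REPORT line belongs in the stanza of props.conf that matches your data; the sourcetype name below is just a placeholder, not from this thread:)

```
[your_syslog_sourcetype]
REPORT-myJson = grabJsonField , breakUpJson
```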

And in the matching stanza in transforms.conf, extract your JSON key-value pairs. You can do that with many techniques. Something like:

[grabJsonField]
REGEX = (?<json_field>\{[^}]+\})

(or whatever regex you need to capture the whole field; with a named capture group like this, no FORMAT line is required)

[breakUpJson]
SOURCE_KEY = json_field
DELIMS = ",", ":"

This is just an example: JSON embedded in a message is often not properly structured, so adjust for whatever nuances your data has.

Take a look at this:

http://docs.splunk.com/Documentation/Splunk/6.1.1/Admin/Transformsconf

I don't think you really want a summary index here, since this isn't a subset of the data and you aren't summarizing anything.

There are also a number of examples on Answers of different techniques for extracting JSON from within a "mixed" event.

When you are reading these, note the following nuance: my answer uses REPORT, which extracts fields at search time (index-time extraction would use TRANSFORMS instead).

Since you are looking for an alert, take the fields from your JSON and create a lookup, then query the lookup.
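Something like this, run as a scheduled search (the lookup file name here is made up for illustration):

```spl
index=webanalytics
| rex max_match=10 "(?<jsonfield>\{[^}]+\})"
| mvexpand jsonfield
| spath input=jsonfield
| table _time, AFeatures, AActions
| outputlookup webanalytics_fields.csv
```

Your alert searches can then read it back quickly with | inputlookup webanalytics_fields.csv | search AFeatures=...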

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

Re: collect not storing the extracted fields into new index and what is way to save all extracted fields into new index

Path Finder

Thanks for the response.

But I don't want to extract fields at index time (my customers don't want it either), since it increases the latency of event arrival. Even the Splunk documentation suggests not extracting at index time.


Re: collect not storing the extracted fields into new index and what is way to save all extracted fields into new index

Path Finder

In brief, this is what I'm looking for:
1. Index the raw data as-is into, say, index1.
2. Extract the names and values that are in JSON format from index1 into a new index, say index2. I'm able to extract all 60 fields from index1, but how do I save the extracted fields into the new index?
3. Write queries against index2, and those searches will be faster.
4. Create real-time alerts based on pre-defined patterns from index1 (this is why the customer doesn't want index-time extraction). This is not the problem now.

I'm stuck at step 2, and I'm open to other ways of solving this problem.
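One sketch for step 2, assuming index2 already exists on the indexers and the search is scheduled every 5 minutes: drop _raw before collect, so that collect writes the extracted fields as key=value pairs instead of copying the original event (this relies on my understanding of how collect behaves when _raw is absent):

```spl
index=index1 earliest=-5m@m latest=@m
| rex max_match=10 "(?<jsonfield>\{[^}]+\})"
| mvexpand jsonfield
| spath input=jsonfield
| fields - _raw, jsonfield
| collect index=index2
```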


Re: collect not storing the extracted fields into new index and what is way to save all extracted fields into new index

Motivator

I've amended the answer to reflect search-time extractions only.
(There is nothing wrong with index-time extractions; best practice is just to consider them carefully, because they are forever.)

I have also suggested that you may want to create a lookup from your search and then query the lookup for faster response.

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

Re: collect not storing the extracted fields into new index and what is way to save all extracted fields into new index

Path Finder

Thanks for the response. Any help on step 2 in the post above?

Also, I might store only a subset of the data, since only 30-40 of the 60+ fields from index1 are useful.
