Getting Data In

splunk index cuts out some lines

New Member

Hi,
I am testing my Splunk config on my local machine before implementing it in production, so I am indexing a JSON file of about 5,000 lines. However, when it is indexed I get one event of only about 138 lines if I set SHOULD_LINEMERGE = true in props.conf. If I set it to false, I get about 218 events of about 2-3 lines each. How can I get Splunk to index all of the lines? I don't really care whether it shows up as one event or as multiple events; I just want to see the entire content of the file. Here is my props.conf:
[default]
CHARSET = UTF-8
LINE_BREAKER_LOOKBEHIND = 100
LINE_BREAKER =
TRUNCATE = 100000000000000000000
DATETIME_CONFIG = /etc/datetime.xml
ADD_EXTRA_TIME_FIELDS = True
ANNOTATE_PUNCT = True
HEADER_MODE =
MATCH_LIMIT = 100000
DEPTH_LIMIT = 1000
MAX_DAYS_HENCE=2
MAX_DAYS_AGO=2000
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_TIMESTAMP_LOOKAHEAD = 128
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = Path=
BREAK_ONLY_BEFORE_DATE = True
MAX_EVENTS = 6000000
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
TRANSFORMS =
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
LEARN_SOURCETYPE = true
LEARN_MODEL = true
maxDist = 100
AUTO_KV_JSON = true
detect_trailing_nulls = false
sourcetype =
priority =

1 Solution

Splunk Employee

Splunk can already handle JSON-formatted files. Make sure the file is valid JSON.

You'll probably need to add to your configuration:
KV_MODE = json (tells Splunk to automatically perform search-time extractions on JSON data)
INDEXED_EXTRACTIONS = json (tells Splunk to create index-time extractions for the data)
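As a point of reference, Splunk copes best when the file is genuinely valid JSON, e.g. one complete object per line. The field names and values below are made up purely for illustration:

 {"time": "2019-01-07T12:00:00Z", "Path": "/var/www", "status": "ok"}
 {"time": "2019-01-07T12:00:05Z", "Path": "/var/tmp", "status": "ok"}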



Esteemed Legend

Use INDEXED_EXTRACTIONS = json and let Splunk handle everything. If you do this, also use KV_MODE = AUTO (NOT KV_MODE = json, or you will get two copies of every field).
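A sketch of that pairing in props.conf, assuming a hypothetical sourcetype name lighthouse_json:

 [lighthouse_json]
 INDEXED_EXTRACTIONS = json
 KV_MODE = AUTO

With INDEXED_EXTRACTIONS creating the fields at index time, KV_MODE = AUTO avoids extracting the same JSON fields a second time at search time.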


Ultra Champion

Great, we ended up with:

 INDEXED_EXTRACTIONS = json
 category = Structured

See: What are the requirements for a perfect Splunk JSON document?


Communicator

On top of these, instead of setting TRUNCATE to a huge number, set TRUNCATE = 0.
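For illustration, that change in a sketch stanza (the stanza name lighthouse_json is hypothetical):

 [lighthouse_json]
 TRUNCATE = 0

TRUNCATE = 0 means "never truncate the line", which avoids having to guess at a large-enough number.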


New Member

Thanks nadlurinadluri and swong_splunk for your answers. TRUNCATE = 0 did the trick.

On a different note, now that it works on my local box, I need to move it to the live environment. I am not very conversant with Splunk. In production we use an indexer, deployment servers, forwarders, and search heads. Earlier I created an app for the index, which works in production but does not index everything, as I said before; you have now resolved that on my local box. For production, I am confused about which server should get the customised props.conf, and whether I need other files as well, since I also see online that some people use transforms.conf, indexes.conf, etc. I am also not sure how to specify the source of the file. Please help.

This is the config that works in my local box

[default]
CHARSET = UTF-8
LINE_BREAKER_LOOKBEHIND = 100
LINE_BREAKER =
TRUNCATE = 0
DATETIME_CONFIG = /etc/datetime.xml
ADD_EXTRA_TIME_FIELDS = True
ANNOTATE_PUNCT = True
HEADER_MODE =
MATCH_LIMIT = 100000
DEPTH_LIMIT = 1000
MAX_DAYS_HENCE=2
MAX_DAYS_AGO=2000
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_TIMESTAMP_LOOKAHEAD = 128
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = Path=
BREAK_ONLY_BEFORE_DATE = True
MAX_EVENTS = 6000000
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
TRANSFORMS =
SEGMENTATION          = indexing
SEGMENTATION-all      = full
SEGMENTATION-inner    = inner
SEGMENTATION-outer    = outer
SEGMENTATION-raw      = none
SEGMENTATION-standard = standard
LEARN_SOURCETYPE      = true
LEARN_MODEL           = true
maxDist = 100
AUTO_KV_JSON = true
detect_trailing_nulls = false
sourcetype =
priority =


Communicator

For the current configuration, I think adding these settings to props.conf on the indexer side should be sufficient.

Also, for future reference, keep these links handy to know where each configuration has to be set up:
http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings%3F
https://answers.splunk.com/answers/504273/which-properties-are-available-for-a-universal-for.html
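As a rough sketch of how the pieces are usually split (every name below is an illustrative assumption, not something from this thread): the deployment server pushes an inputs.conf to the forwarders, and props.conf settings apply where the data is parsed.

 # inputs.conf, deployed to the forwarders
 [monitor:///var/log/lighthouse/*.json]
 sourcetype = lighthouse_json

 # props.conf -- with a universal forwarder, structured-data settings
 # such as INDEXED_EXTRACTIONS are applied on the forwarder itself,
 # while parse-time settings such as TRUNCATE apply on the indexer
 [lighthouse_json]
 INDEXED_EXTRACTIONS = json
 TRUNCATE = 0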


New Member

nadlurinadluri, those links don't work, but thanks for getting back again. Would adding it to /opt/splunk/etc/system/local/props.conf on the indexer suffice? Also, would it be OK to change the default stanza on the first line to the source path, as below, bearing in mind that /var/log/lighthouse/*.json is the path of the files being indexed on the lighthouse server? Sorry to be a pain; I am relatively new to Splunk.

[/var/log/lighthouse/*.json]
CHARSET = UTF-8
LINE_BREAKER_LOOKBEHIND = 100
LINE_BREAKER =
TRUNCATE = 0
DATETIME_CONFIG = /etc/datetime.xml
ADD_EXTRA_TIME_FIELDS = True
ANNOTATE_PUNCT = True
HEADER_MODE =
MATCH_LIMIT = 100000
DEPTH_LIMIT = 1000
MAX_DAYS_HENCE=2
MAX_DAYS_AGO=2000
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_TIMESTAMP_LOOKAHEAD = 128
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = Path=
BREAK_ONLY_BEFORE_DATE = True
MAX_EVENTS = 6000000
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
TRANSFORMS =
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
LEARN_SOURCETYPE = true
LEARN_MODEL = true
maxDist = 100
AUTO_KV_JSON = true
detect_trailing_nulls = false
sourcetype =
priority =


Communicator

/opt/splunk/etc/system/local/props.conf on the indexer -- yes, this should be fine.

[/var/log/lighthouse/*.json]

You can put the sourcetype of those files in the stanza header instead of [default]. If you want to match on the source path instead, you need to use a source stanza:

[source::/var/log/lighthouse/*.json]
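Putting the thread's suggestions together, a minimal props.conf sketch (the sourcetype name lighthouse_json is an illustrative assumption):

 [lighthouse_json]
 INDEXED_EXTRACTIONS = json
 KV_MODE = AUTO
 TRUNCATE = 0

 # or, to match by source path instead of sourcetype:
 [source::/var/log/lighthouse/*.json]
 TRUNCATE = 0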
