Change Index and Sourcetype

I have a set of data where I want to send events with a 404 status code to a different index and, after processing the records, assign a final, different sourcetype. Neither is working. Please advise.

props.conf:

[weblogs]
SHOULD_LINEMERGE = false
LINE_BREAKER = (&&&)(?=\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
category = Custom
pulldown_type = true
TRANSFORMS-1 = notfound
TRANSFORMS-2 = setsourcetype
disabled = false

transforms.conf:

[notfound]
REGEX = 404
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex

[setsourcetype]
SOURCE_KEY = _raw
REGEX = ^.
DEST_KEY = Metadata:Sourcetype
FORMAT = sourcetype::access_combined

Contributor

For line breaking, use the regex (&&&).
In props.conf, set MAX_EVENTS to 40000.
Set TRUNCATE to 20000 (check the longest event with the len function and adjust).

Create a new index named notfoundindex (Settings --> Indexes).

props.conf
[weblogs]
LINE_BREAKER = (\&\&\&)
MAX_EVENTS = 40000
TRUNCATE = 20000

TRANSFORMS-01-notfound = notfound
TRANSFORMS-02-setsourcetype = setsourcetype

transforms.conf
[notfound]
REGEX = .*404.*
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex

[setsourcetype]
SOURCE_KEY = _raw
REGEX = .*
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::access_combined

Path Finder

First, one minor change to the REGEX for the 404 status events.

If you use REGEX =

.*404.*

it also picks up events where 404 appears anywhere in the event, for example a status of 200 followed by a 404 elsewhere in the line. To prevent this, you could use this REGEX instead:

"\s404\s

That way it only matches a 404 that is the first number after a quotation mark and a whitespace character.
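
As a rough sketch, the tighter pattern would slot into the notfound stanza from the question like this. It assumes the events follow the usual combined access log layout, where the status code comes right after the quoted request string; if the weblogs data is laid out differently, the pattern would need adjusting.

```
# transforms.conf
[notfound]
# Match 404 only when it directly follows a closing quote and whitespace
REGEX = "\s404\s
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex
```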

Also, if you try to change the index AND the sourcetype for one input, you might run into problems, since Splunk could potentially apply the new sourcetype first and only then try to route events to the new index using the regex above. But by that point the events are already sourcetype=access_combined and no longer weblogs, so either nothing works or only one of the transforms does.

The solution is as follows:

In props.conf, your stanza shouldn't address the sourcetype "weblogs" but rather the source your data originates from.

[source::access_combined_no_breaks.log]

This way it doesn't matter what happens to your data first, because the source always stays the same.

Hope this helps anyone who runs into the same problem. If so, please consider a thumbs up 🙂
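
A minimal sketch of what the source-based stanza could look like, reusing the transform names from the question. The line-breaking settings are left out for brevity. One assumption to check: source:: stanzas match against the full source value, so if the file is monitored by its full path, a wildcard form such as [source::.../access_combined_no_breaks.log] may be needed instead.

```
# props.conf
# Keyed on the source rather than the sourcetype
[source::access_combined_no_breaks.log]
TRANSFORMS-1 = notfound
TRANSFORMS-2 = setsourcetype
```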

New Member

Hi @jpcontrerasaditum - I am also trying to process a web log with nearly 36k events and exactly the same requirements:

  1. line break at &&&, then
  2. send 404 status code events to notfoundindex, and
  3. reassign all the events to the access_combined sourcetype.

But it doesn't seem to work with the entire log file. So I tried with only 10 events and was able to achieve 1, but not 2 and 3. I get the following error:

truncating at 10000 bytes because size exceeded splunk with a line length >= 15512

I tried TRUNCATE = 50000 and TRUNCATE = 0, but that makes Splunk unresponsive.

Were you able to resolve the issue? I would appreciate any help.

Legend

Your index rewrite transform looks OK to me, but you've made a typo in the sourcetype-changing section: it should be MetaData:Sourcetype (with a capital D), not Metadata:Sourcetype.

To make things easier to debug you could/should also combine the TRANSFORMS statements into one so you can see more clearly which order they're applied in.

TRANSFORMS-changestuff = notfound, setsourcetype
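
Put together, the two changes might look like the sketch below. It reuses the stanza and transform names from the question; the notfound stanza stays as posted, and only the combined TRANSFORMS line and the corrected DEST_KEY differ.

```
# props.conf
[weblogs]
# A single class; the listed transforms are applied in the order given
TRANSFORMS-changestuff = notfound, setsourcetype

# transforms.conf
[setsourcetype]
SOURCE_KEY = _raw
REGEX = ^.
# Note the capital D in MetaData
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::access_combined
```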

New Member

Hi @Ayn - I am also trying to process a web log with nearly 36k events and exactly the same requirements:

  1. line break at &&&, then
  2. send 404 status code events to notfoundindex, and
  3. reassign all the events to the access_combined sourcetype.

But it doesn't seem to work with the entire log file. So I tried with only 10 events and was able to achieve 1, but not 2 and 3. I get the following error:

truncating at 10000 bytes because size exceeded splunk with a line length >= 15512

I tried TRUNCATE = 50000 and TRUNCATE = 0, but that makes Splunk unresponsive.

Were you able to resolve the issue? I would appreciate any help.
