Good day Splunkers!
We have a case where one TSV file contains three types or categories of data.
The first and third sections of data can be ingested normally, but the problem is the second one: it is a receipt. Is there any way for the indexer to know how to divide and ingest those sections?
By the way, each section has a somewhat unique marker like:
This is a sample of the TSV. I only indicated the "unique" markers since I want to know if there's a way Splunk can determine how to divide and ingest those three sections of data.
Why can it not ingest the second section? And by "ingest", what exactly do we mean? And what effect would dividing it have?
Or let me ask a more specific question about something I'm guessing: you have extractions happening on the data, which turn gobbledygook raw logs into pretty fields. For the events with SECTION this is OK, but for the RECEIPT ones it's not working right?
Let us know if that sounds like the problem. If it's not, that's OK too; if you could provide a slightly better example of one event that works right and one that doesn't, plus a clearer description of what "not working right" means, I think that would help a lot. But never fear, I'm sure we can figure this out.
Thank you for responding!
Sorry, I think I poorly explained what the post meant.
The ingestion is fine; the question is whether there's a way to divide those three sections of data, like ingesting them into different indexes from one TSV.
Sorry if my explanation is poor.
Knowing better what's needed here, hopefully this is the right answer.
What's apparently needed is that for this one input, some events should go to indexA, other events should go to indexB.
This is possible. I'm pretty sure there's no way to do this in the UI, so you'll have to manually edit configuration files by hand. But it's not too hard if you take your time, think about what you are doing, and test. Also make backups of your configurations before you start! (It's as easy as tar'ing them up, or making a copy.)
Your main idea is to use the "Route and filter data" section in the Splunk documentation. It gives a good overview and specifics for quite a few scenarios; unfortunately, your specific one isn't in there.
But there is help, once you know how to look for it. For instance, a web search for "splunk dest_key=index" turns up this answer.
We can modify it though.
```
### transforms.conf
[index_redirect_section]
REGEX = ^SECTION
DEST_KEY = _MetaData:Index
FORMAT = name_of_index_for_section_events

[index_redirect_receipt]
REGEX = ^RECEIPT
DEST_KEY = _MetaData:Index
FORMAT = name_of_index_for_receipt_events

### props.conf
[sourcetype, host, or source that you want to redirect - see the docs for examples]
TRANSFORMS-route_different_indexes = index_redirect_section, index_redirect_receipt
```
Now, some caveats:
First, make sure you are editing local versions of the conf files, not default. So not $SPLUNK_HOME/etc/apps/myapp/default/props.conf, but instead $SPLUNK_HOME/etc/apps/myapp/local/props.conf.
Second, these regular expressions will only work if RECEIPT or SECTION is at the very beginning of the event. If it isn't, remove the "^" from the front of the regex. But then lots of testing needs to be done, because if the wrong word appears anywhere in the event, unpredictable things may happen. As long as that doesn't happen, it should work.
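For example, if the markers might be preceded by whitespace or a tab (not uncommon in a TSV), a slightly looser but still anchored regex could be a safer middle ground than dropping the "^" entirely. This is a sketch, not tested against your actual data; the stanza and index names are placeholders:

```
### transforms.conf - hypothetical variant: still anchored to the start
### of the event, but tolerant of leading whitespace before the marker
[index_redirect_receipt]
REGEX = ^\s*RECEIPT
DEST_KEY = _MetaData:Index
FORMAT = name_of_index_for_receipt_events
```

Either way, test against real sample events before trusting it in production.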
As for the [sourcetype, host or source that you want to redirect] stanza: you didn't provide the sourcetype, the source, or anything else to work with, so you're on your own implementing that part. We can help, but hopefully the examples in the "Route and filter data" docs, plus the explanation above, will be enough to get that sorted out.
So, generally, adding that section to the local props.conf tells Splunk to run a transform on the data as it comes in. Indeed, it tells it to run TWO transforms. It'll check transforms.conf for the stanzas it needs and run both in order. If the regex matches SECTION, it rewrites the destination index of that event to name_of_index_for_section_events. It then continues and runs the next one, which, if it matches RECEIPT, rewrites the index to name_of_index_for_receipt_events.
So, give that a try and see how it works. If you have problems, details will matter in this particular case: what you've tried, copies of the configurations you've put into place, and what exactly happens.
I'll consider this as an answer since this is the answer I'm looking for! Thank you for the concrete and detailed explanation about my inquiry. As I thought, I really should fiddle around with props/transforms.conf.
Again, thank you!
You are welcome. If you get into any specific minor issues with this, be sure to post back here (like, the regex may need a tiny bit of tweaking).
Otherwise, have fun in props.conf! It's a whole new world!
Ah! It seems I forgot to mention something. Do you think it's possible to handle this by playing with the same configs? If so, I'd like to know your idea regarding this:
I forgot to include the fields, but yeah, I think that's the structure of the log. Any thoughts about this, sir? Have you already experienced this case? Please enlighten me.
That looks like it should work fine with what we had discovered before. The leading ^ in the REGEX should probably be OK too, and it makes the match more efficient (that symbol tells the regex engine to look for each word, like "SECTION", only at the start of the string, so it doesn't waste time scanning the whole thing).
I do see one thing that might not be handled properly. Unless I'm misreading the logs, you have lines that contain only fields, with no leading "RECEIPT" or "SECTION". What did you want done with those? If you want them to go to one of the indexes we've already defined, there's an easy answer:
On the input itself, make sure you set the index=blah setting. That will be the default index those events go to, and indeed any event that doesn't match our specific redirections to other indexes will just go to that default index. It can be the same as one of the specific indexes - that's no problem at all.
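As a sketch of that, here's a hypothetical monitor input with a default index set; the file path, sourcetype, and index name are all placeholders you'd replace with your own:

```
### inputs.conf - hypothetical example: events that don't match any
### routing transform will land in this default index
[monitor:///path/to/your/file.tsv]
sourcetype = my_tsv_sourcetype
index = name_of_index_for_section_events
```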
Also how are you assigning time stamps?
The log presented is like three different logs compiled into one, so each log (SECTION1, RECEIPT, SECTION2) has different field data. What I'm worried about is that the field lines of the RECEIPT and SECTION2 logs might be treated as separate events when ingested. But I'll try fiddling with props.conf first and will give an update. Thank you, sir! I've at least got an idea now. 😃
Well, true, but I expect your line breaking will take care of that. Perhaps it'll need a little tweaking, but that's well documented. As long as the lines break properly - which is a very testable thing: just ingest into a temporary index, confirm the events look right, then delete that index and move those same settings into production - everything else should work fine.
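If the line breaking does need tweaking, it would happen in the same props.conf stanza. A rough sketch, assuming your events all start with one of the two markers (the sourcetype name and the regex are placeholders to adapt to your actual data):

```
### props.conf - hypothetical line-breaking settings: break the stream
### only where a new line starts with SECTION or RECEIPT
[my_tsv_sourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=SECTION|RECEIPT)
```

The lookahead keeps the marker itself as part of the new event, which is what the routing regexes above rely on. Again, verify this against a sample in a throwaway index before going live.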