Getting Data In

Data not being indexed

omuelle1
Communicator

Good morning,

I have an issue with a new file that I am trying to index:

I see that it is being monitored but I am not getting any events from it.

Monitored Files:
/opt/tibco/scripts/nohup.out

I looked at the file and saw that it has a very odd timestamp and I think it might have to do with Splunk not being able to break that up correctly:

^[[0m^[[0m11:43:06,113 INFO  [org.jboss.as] (MSC service thread 1-3) JBAS015950: JBoss EAP 6.3.0.GA (AS 7.4.0.Final-redhat-19) stopped in 1124ms

My question is: what would I need to change in my props.conf file to get events? Could someone maybe provide the regex and the line of configuration I need?

Thank you,

Oliver

0 Karma
1 Solution

mattymo
Splunk Employee
Splunk Employee

Hi Oliver!

That timestamp is void of any date and only contains a time, so while you may still have some issues to clean up in those logs (what's up with those characters preceding the time??), I can provide some suggestions and good habits for data onboarding:

I would recommend using the add data wizard to help you experiment with new data sources and to help get the inputs and props configurations you need.

Here's how I did it:

I saved your sample event to a text file, then on my search head, I navigated to Settings > Add Data.

I used the Upload File option to get the data in, then I used the Set Source Type screen to get to know your data and how Splunk's settings interact with it.

Right away, you can see that Splunk's auto discovery of the timestamp is having issues with your event.

[screenshot: Set Source Type preview showing timestamp auto-detection warnings]

As you can see, linebreaking and timestamping are set to 'auto'.

Best practice is to explicitly define line breaking and timestamp formatting so that Splunk doesn't have to guess. This will improve both performance and accuracy.

So I started by changing the line breaker to 'every line'.

And I configured the timestamp particulars, assuming that the TZ was UTC (you'll want to verify on the machine doing the logging), and providing Splunk with information about where to find the timestamp. PROTIP: I had to escape some of the gibberish characters in front of your timestamp because Splunk uses a regex to identify the timestamp prefix.

[screenshot: Timestamp settings with the TIME_PREFIX and TIME_FORMAT fields filled in]

Then you can use the Advanced tab to get familiar with the configurations to use in props.conf! Then you can save the sourcetype, or even copy it to the clipboard for easy manual creation of a props.conf file.

[screenshot: Advanced tab showing the generated props.conf settings]

[ __auto__learned__ ]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TIME_FORMAT=%H:%M:%S,%f
TIME_PREFIX=\^\[\[0m\^\[\[0m
TZ=UTC
MAX_TIMESTAMP_LOOKAHEAD=25
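To sanity-check that stanza, here's a rough sketch in Python (my own illustration, not anything Splunk runs) showing that the escaped TIME_PREFIX regex and the TIME_FORMAT line up against your sample event as displayed in the preview. Note the assumption: I'm treating the ^[ sequences as literal caret characters, the way the wizard renders them; in the raw file they may be actual ESC bytes.

```python
import re
from datetime import datetime

# Sample event as displayed in the Set Source Type preview (caret notation).
event = ("^[[0m^[[0m11:43:06,113 INFO  [org.jboss.as] "
         "(MSC service thread 1-3) JBAS015950: JBoss EAP 6.3.0.GA stopped in 1124ms")

# TIME_PREFIX from the stanza above: carets and brackets are escaped
# because Splunk treats the prefix as a regular expression.
prefix = re.compile(r"\^\[\[0m\^\[\[0m")

m = prefix.match(event)
rest = event[m.end():]

# TIME_FORMAT=%H:%M:%S,%f -- Python's strptime accepts 1-6 digits for %f,
# so the comma-millisecond form parses the same way.
ts = datetime.strptime(rest[:12], "%H:%M:%S,%f")
print(ts.time())  # 11:43:06.113000
```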

Because there is no date in the file, Splunk is simply using today's date.

This is a super helpful way to ensure you get some of the most important items right when ingesting your data, and gives you a workspace to experiment.

Now, as for what happened to your data, I would suggest searching over All Time for the sourcetype you set, because the auto timestamp recognition placed the event on Jan 3rd, 2016 when I ran it through the Add Data wizard. Failing that, I would use:

splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk list inputstatus

or

https://<yourHost>:8089/services/admin/inputstatus/TailingProcessor:FileStatus

to see what's up with the file ingestion.

You'll need to figure out what's up with the preamble to that time, or be OK with ingesting these logs with the current date... I would follow up on these logs... perhaps they can be cleaned up??

- MattyMo

View solution in original post

omuelle1
Communicator

Thank you so much for this answer, this is definitely the way I will use to ingest data from now on.

I'll try to figure out if there is a way to get a date into the logs from the owner of the application.

0 Karma

mattymo
Splunk Employee
Splunk Employee

My pleasure Oliver,

I did some Google-fu on JBoss logging, but came up empty. You definitely want to get a good solid timestamp in there, as it is one of the core items needed to effectively ingest logs with Splunk.

- MattyMo
0 Karma

omuelle1
Communicator

Would Splunk be able to assign a timestamp based on when the data was indexed?

0 Karma

mattymo
Splunk Employee
Splunk Employee

It can assign the timestamp based on file modification time, or current time, yes...

As you can see in my example, you can ingest the data as-is, but Splunk is likely going to use the file-mod date or the current date...

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Configuretimestamprecognition
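If current time turns out to be acceptable for these logs, a minimal props.conf sketch would look like the following (the stanza name jboss_nohup is just a placeholder I made up; check the docs link above for the exact DATETIME_CONFIG semantics on your version):

```ini
# Hypothetical sourcetype stanza -- sketch only.
# DATETIME_CONFIG = CURRENT tells Splunk to skip timestamp extraction
# and stamp each event with the time it was indexed.
[jboss_nohup]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
```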

We had a chat about your weird characters in the Splunk Slack chat (which you can join here: http://splunk402.com/ - don't mind the Nebraska flavor, your registration will get to our community manager).

The almighty @dwaddle aka duckfez opined:

"process thinks it's talking to a TTY, so it outputs ANSI escape sequences to colorize the event.

if the app is always trying to colorize, TTY or not, then it needs to be fixed

if it's confused thinking it's writing to a TTY when it's not, then that might be configuration"

...in case you want to follow up with the app owner to address the logging.

- MattyMo
0 Karma
