How to search directory with 80k plus log files with unique names?

mrkevinhoang · ‎08-10-2023

Hello Community,

I have tried searching, but I have not found an answer to my specific needs, or I don't know how to word my question.

I work in a company that manufactures servers. Each server manufactured creates a log file with a unique name. The log file is a txt file that has identifying lines like "Serial Number: FMXXXXXXX", "Station: 1", "Start Time: 12:00:00", etc.

I am trying to configure Splunk to search all these log files based on serial number (to start with) and eventually create a searchable dashboard where I can look up log files based on serial numbers.

I'm obviously new to Splunk, and have watched a lot of tutorials, but most tutorials focus on searching one big log file, or several log files.

So far, I have set up the Splunk UI and pointed it to a directory containing my log files. Under "Data Summary" my sources are over 100k and sourcetypes are over 14k.

Any help would be appreciated.

Kevin

bowesmana · ‎08-10-2023

If you have ingested those log files into Splunk, I assume you have directed them to a specific index, so all the log files will be in a single index.

You mention different sourcetypes - how do you get different sourcetypes?

As far as searching in Splunk, you simply start with an SPL statement, e.g.

index=<your_index_with_the_data> OTHER_SEARCH_CRITERIA

Splunk will often extract fields from your data automatically where it can. Where it can't, you can set up field extractions: in the left bar of your search window you will see 'Extract new fields'. You can either extract fields through this UI or use the rex command https://docs.splunk.com/Documentation/Splunk/9.0.3/SearchReference/Rex to extract fields based on a regular expression. For example, this statement

| rex "Serial Number: (?<serial_number>[^\"]*)\", \"Station: (?<station>\d+)"

will extract serial number and station from your event based on your example.
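
If the real files hold each field on its own line, without the quotes and commas (an assumption, since only the quoted example is shown above), a looser variant may be easier to start from. This is a minimal sketch and has not been run against your data. It uses one rex per field so the order of the lines does not matter, and it groups by source so it should behave the same whether Splunk split each file into one event or many. The index placeholder is the same one as above.

```
index=<your_index_with_the_data> ("Serial Number" OR "Station")
| rex "Serial Number:\s*(?<serial_number>\S+)"
| rex "Station:\s*(?<station>\d+)"
| stats values(serial_number) as serial_number values(station) as station by source
```

Each result row is one log file (source) with the serial number and station found in it.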

Come back with further questions as you make progress.

mrkevinhoang · ‎08-11-2023

Thanks for the quick reply!

I have imported the data, but I'm not sure if I have "directed it to an index". I looked at Settings > Indexes and I see about 13. One named MAIN has about 29G of data, so I assume this is the index you are referring to?

As for the sourcetypes, I don't know where they come from.
Hosts (1) | Sources (100,000) | Sourcetypes (14,506)

Seems like it's auto generated.

bowesmana · ‎08-13-2023

You can get an idea of what's ingested by running this search over the last 7 days

| tstats dc(sourcetype) as sourcetypes count latest(_time) as lastItem where index=* by index
| eval lastItem=strftime(lastItem, "%F %T")

That will tell you what's been ingested for each index over the last 7 days.

Assuming it is index=main that has your data, then a simple search

index=main

will return some events. The source field will generally be the name of the file that was ingested. I'm not sure what your sourcetypes will be - are the files you're ingesting CSV?

Anyway, find the data and then you can search.
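
As a starting point for the serial number lookup described in the question, a plain text search for the serial number is often enough to find the file it came from. The sketch below assumes the data is in index=main and uses FMXXXXXXX as a stand-in for a real serial number. It does not depend on any field extraction.

```
index=main "FMXXXXXXX"
| stats count as events latest(_time) as lastEvent by source
| eval lastEvent=strftime(lastEvent, "%F %T")
```

Each row is a log file that mentions that serial number, with the number of matching events and the time of the most recent one.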

Generally it's a good idea to plan where you want your data to go and what sourcetype you want it to become, as sourcetype is a key way to define behaviour for that data.
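
Before planning a sourcetype, it can help to see which ones were assigned automatically. A rough sketch, again assuming the data is in index=main, that lists the 20 sourcetypes with the most events:

```
| tstats count where index=main by sourcetype
| sort - count
| head 20
```

If the names look like they were derived from the individual file names, that would be consistent with the 14k sourcetypes shown in the data summary.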

In your case, since the data is already there, you can either plan it properly and start again, or, given that you are new to Splunk, play with the data you have and see if you can start to get some responses from it. I would do the latter - you'll rapidly get a better feel for what you have and what you can do.
