Getting Data In

How to search directory with 80k plus log files with unique names?

mrkevinhoang
New Member

Hello Community,

I have tried searching, but I've not found an answer to my specific needs... or I don't know how to word my question.

I work in a company that manufactures servers. Each server manufactured creates a log file with a unique name. The log file is a txt file that has identifying lines like "Serial Number: FMXXXXXXX", "Station: 1", "Start Time: 12:00:00", etc.

I am trying to configure Splunk to search all these log files based on serial number (to start with) and eventually create a searchable dashboard where I can look up log files based on serial numbers.

I'm obviously new to Splunk, and have watched a lot of tutorials, but most tutorials focus on searching one big log file, or several log files.

So far, I have set up the Splunk UI and pointed it to a directory containing my log files. Under "Data Summary" my sources are over 100k and sourcetypes are over 14k.

Any help would be appreciated.

Kevin

bowesmana
SplunkTrust

If you have ingested those log files into Splunk, I assume you have directed them to a specific index, so all the log files will be in a single index.

You mention different sourcetypes - how do you get different sourcetypes?

As far as searching in Splunk, you simply start with an SPL statement, e.g.

index=<your_index_with_the_data> OTHER_SEARCH_CRITERIA

Splunk will often 'extract' fields from your data automatically where it can, but if it can't, you can set up field extractions. In the left bar of your search window, you will see 'Extract new fields'. You can either extract fields through this UI or use the rex command (https://docs.splunk.com/Documentation/Splunk/9.0.3/SearchReference/Rex) to extract fields based on a regular expression. For example, this statement

| rex "Serial Number: (?<serial_number>[^\"]*)\", \"Station: (?<station>\d+)"

will extract serial number and station from your event based on your example.
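If the files actually put each identifier on its own line rather than in one quoted, comma-separated string, separate rex calls are probably easier to work with. A rough sketch, assuming the data sits in index=main and that the serial number and station end up in the same event; FM1234567 is a made-up serial number used only for illustration:

```spl
index=main "Serial Number"
| rex "Serial Number:\s*(?<serial_number>\S+)"
| rex "Station:\s*(?<station>\d+)"
| search serial_number="FM1234567"
| table _time source serial_number station
```

If Splunk has broken each line of a file into its own event, station will be empty on the event that carries the serial number, so treat this as a starting point rather than a finished extraction.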

Come back with further questions as you make progress.

mrkevinhoang
New Member

Thanks for the quick reply!

I have imported the data, but I'm not sure if I have "directed to an index". I looked at Settings > Indexes and I see about 13. One named main has about 29 GB of data, so I assume this is the index you are referring to?

As for the sourcetypes, I don't know where they come from.
Hosts (1) | Sources (100,000) | Sourcetypes (14,506)

Seems like they're auto-generated.

 

bowesmana
SplunkTrust

You can get an idea of what's ingested by running this search over the last 7 days

| tstats dc(sourcetype) as sourcetypes count latest(_time) as lastItem where index=* by index
| eval lastItem=strftime(lastItem, "%F %T")

that will tell you what's been ingested for each index over the last 7 days

Assuming it is index=main that has your data, then a simple search

index=main

will return you some events. source will generally be the name of the file that's ingested - not sure what your sourcetypes will be - are these files you're ingesting CSV?
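To see where the 14k sourcetypes come from, a breakdown by sourcetype may help. A minimal sketch, again assuming the data is in index=main:

```spl
| tstats count where index=main by sourcetype
| sort - count
| head 20
```

If most of the names look like your file names with a suffix such as -too_small, then as far as I know Splunk generated them itself because no sourcetype was set on the input and the files were too short for it to classify.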

Anyway, find the data and then you can search.
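For the original goal of looking up a log file by serial number, something along these lines might be enough to begin with. It is only a sketch, assuming index=main holds the data and that the serial number line appears verbatim in the events; FM1234567 is a placeholder, not a real serial number:

```spl
index=main "Serial Number: FM1234567"
| stats count min(_time) as firstSeen max(_time) as lastSeen by source
| eval firstSeen=strftime(firstSeen, "%F %T"), lastSeen=strftime(lastSeen, "%F %T")
```

The source column should then show the file or files that mention that serial number.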

Generally it's a good idea to plan where you want your data to go and what sourcetype you want it to become, as sourcetype is a key way to define behaviour for that data.

In your case, since the data is already there, you can either plan it properly and start again, or, given that you're still early in learning Splunk, play with the data you have to see if you can start to get some responses from it. I would do the latter: you'll rapidly get a better feel for what you have and what you can do.
