Splunk Search

File with a list of sources for search, OR file with a list of search requests

ryastrebov
Communicator

Hello!
I have a CSV file that contains a list of sources, for example:
source
MySource1
MySource2
MySource3
...
I also have a search request, which is the same for all sources.
I need to run this search automatically and sequentially over all the sources: the first search on MySource1, the second on MySource2, and so on.
Each subsequent search should start only when the previous search has finished.
Unfortunately, I am new to programming.

I can also create a text file with a list of searches, if that helps to solve my problem.
Any ideas?


alacercogitatus
SplunkTrust

==Update==

So if you have a file like below (call it tmsi_lookup.csv):

tmsi,tmsi_old
value1,value2
value3,value4

You can then do:

source=mobile_source | lookup tmsi_lookup.csv tmsi_old OUTPUT tmsi

This takes the "tmsi_old" field of events in source "mobile_source", looks up the corresponding tmsi in the lookup file, and populates the "tmsi" field with it.

Does this help more?

==Orig==

Map Command will do this. However, as mentioned by jonuwz, a more efficient search can be done if the sources all contain similar information.

Example (with similar information)

source=MySource1 OR source=MySource2 | stats sum(myField) by source

Example (let's just mash a bunch of crap together)

| inputlookup MySources.csv | map [ search source=$sourcecsv$ | stats count ]

The more information you provide, the better the searches I can write to get your end result.
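To make the map example above more concrete, here is a minimal Python sketch of the token substitution idea: each row of the lookup yields one search string, with `$sourcecsv$` replaced by that row's value. This only illustrates the concept and is not how Splunk implements map. The function name `expand_map_template` and the sample CSV contents are assumptions made for illustration. Note also that map limits how many searches it runs through its maxsearches option, so a long list of sources would need that limit raised.

```python
import csv
import io
import re


def expand_map_template(template, rows):
    """Return one search string per row, replacing $field$ tokens with row values."""
    def fill(row):
        # The replacement function's return value is inserted literally.
        return re.sub(r"\$(\w+)\$", lambda m: row[m.group(1)], template)
    return [fill(row) for row in rows]


# Hypothetical contents of MySources.csv, using the sourcecsv column from the thread.
SAMPLE_CSV = "sourcecsv\nMySource1\nMySource2\nMySource3\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
searches = expand_map_template('search source="$sourcecsv$" | stats count', rows)
for s in searches:
    print(s)
```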

ryastrebov
Communicator

alacercogitatus, yes, I use this approach. But the customer asked me to verify my lookup by testing it on a day's worth of data. So the question now is exactly how to read a line from the CSV file and use it as the name of a source.


ryastrebov
Communicator

Yes, it is. I have a sourcetype that contains all the sources. But how does that help?


alacercogitatus
SplunkTrust

But even though each is a separate source, they should have a common sourcetype. So in the search, just use "sourcetype=mobile_sourcetype" or whatever the sourcetype is.


ryastrebov
Communicator

The amount of data per day is 1440 files, one file per minute. One file is one source. Each file contains about 400,000 events.
Now I need to know whether my lookup works correctly. To do this, I want to check the whole chain for the day, exporting the results of each intermediate search to a CSV file. The export I can do. The only thing that does not work yet is making Splunk search all 1440 sources sequentially. I created a CSV file containing a list of all the sources in the right order.
Now I need to take an entry from this file, use it as a source, and search on it. This is what I can't do.
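The sequential workflow described here (read the ordered list of sources, run the same search for each, and start the next only after the previous has finished) could be driven from outside Splunk. Below is a minimal Python sketch under stated assumptions: the CSV column is named `sourcecsv` as shown elsewhere in this thread, and `run_search` is a hypothetical callable that submits one query and blocks until it completes. A stand-in runner is used so the sketch runs without a Splunk server.

```python
import csv
import io


def read_sources(csv_file, column="sourcecsv"):
    """Return the source names from the given CSV column, in file order."""
    return [row[column] for row in csv.DictReader(csv_file) if row[column]]


def run_sequentially(sources, query_template, run_search):
    """Run one search per source; the next starts only after the previous returns."""
    results = []
    for source in sources:
        query = query_template.format(source=source)
        results.append((source, run_search(query)))  # blocks until this search is done
    return results


def fake_run_search(query):
    """Stand-in runner (an assumption): a real one would submit the query and wait."""
    return "finished: " + query


sample = io.StringIO("sourcecsv\n/home/folder/MySource1.gz\n/home/folder/MySource2.gz\n")
sources = read_sources(sample)
results = run_sequentially(sources, 'search source="{source}" | stats count', fake_run_search)
for source, outcome in results:
    print(source, "->", outcome)
```

Because the loop calls the runner synchronously, the ordering guarantee comes from the runner itself blocking until its search has finished.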


ryastrebov
Communicator

Thank you for your answer.
I'll try to explain in more detail. Sorry, my English is poor.
I analyze the logs of a mobile operator.
The log contains the IMSI (unique identifier of the SIM card), TMSI (temporary ID), TMSI_OLD (previous TMSI), base station, etc. For security, the IMSI is not always present in the log; in that case it is replaced by a _. I created a lookup to track the TMSI-TMSI_OLD chain and to replace the _ with the IMSI. My lookup contains the values IMSI and TMSI. If the same IMSI would appear in the lookup more than once, the old value is removed.
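One possible reading of the chain tracking described above, sketched in Python: a dictionary plays the role of the lookup (current TMSI to IMSI), masked events are resolved through their TMSI_OLD, and only the newest TMSI is kept per IMSI. The lower-case field names, the sample events, and the exact update rule are assumptions for illustration, not the poster's actual lookup definition.

```python
def resolve_imsi(events):
    """Fill in masked IMSI values by following the TMSI_OLD -> TMSI chain."""
    tmsi_to_imsi = {}  # plays the role of the lookup: current TMSI -> IMSI
    resolved = []
    for event in events:  # events must be processed in time order
        imsi = event["imsi"]
        if imsi == "_":
            # Masked: the previous TMSI should already be in the lookup.
            imsi = tmsi_to_imsi.get(event["tmsi_old"], "_")
        if imsi != "_":
            # Keep one entry per IMSI: drop the stale TMSI before adding the new one.
            for old_tmsi in [t for t, i in tmsi_to_imsi.items() if i == imsi]:
                del tmsi_to_imsi[old_tmsi]
            tmsi_to_imsi[event["tmsi"]] = imsi
        resolved.append({**event, "imsi": imsi})
    return resolved


# Hypothetical events: the last one has no known predecessor, so it stays masked.
events = [
    {"imsi": "imsi-A", "tmsi": "t1", "tmsi_old": ""},
    {"imsi": "_", "tmsi": "t2", "tmsi_old": "t1"},
    {"imsi": "_", "tmsi": "t3", "tmsi_old": "t2"},
    {"imsi": "_", "tmsi": "t9", "tmsi_old": "t8"},
]

for event in resolve_imsi(events):
    print(event)
```

The sketch also shows why the processing order matters: each event can only be resolved if the events that introduced its previous TMSI have already been handled.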


jonuwz
Influencer

@ryastrebov - that's what you think you need to do. Running 1440 consecutive searches is not feasible, so we need to look at alternative methods.

sample data, example of the end result


ryastrebov
Communicator

I need to substitute source values from a text file into the search query. Once I have run a search on one source value from the file, I need to move on to the next value.
If it is possible to quickly select the source values in order (by name) from the sourcetype, that would also suit me.


jonuwz
Influencer

You need to update your question with

1) sample data
2) an example of what the end result should look like.

It's very hard to understand your requirements without this.

It's almost certain you can achieve what you want with stats / chart.


ryastrebov
Communicator

I tried using the inputcsv command, but I don't know how to use lines from the CSV as the source.
The search source=sourcecsv does not work.
My CSV contains lines as follows:

sourcecsv
/home/folder/MySource1.gz
/home/folder/MySource2.gz

Why is my search request not working?
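A likely explanation is that source=sourcecsv compares the source field with the literal text "sourcecsv" instead of reading values from the CSV column of that name. The values have to be expanded into the search string first. The Python sketch below shows that expansion as an OR-ed filter. The function name `build_source_filter` is an assumption for illustration; as far as I understand, a subsearch built on inputlookup or inputcsv (as suggested elsewhere in this thread) produces a comparable expression inside Splunk.

```python
import csv
import io


def build_source_filter(csv_file, column="sourcecsv"):
    """Build an OR-ed source filter from the values in one CSV column."""
    values = [row[column] for row in csv.DictReader(csv_file) if row[column]]
    return "(" + " OR ".join('source="{}"'.format(v) for v in values) + ")"


# Sample rows copied from the CSV shown in this post.
sample = io.StringIO("sourcecsv\n/home/folder/MySource1.gz\n/home/folder/MySource2.gz\n")
source_filter = build_source_filter(sample)
print(source_filter)
```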


ryastrebov
Communicator

The customer produces 1 source per minute. The final solution will work in real time. For now I want to verify my lookup over one day. The source data contains a unique attribute that, for security purposes, does not appear later in the log. This unique attribute is replaced by a temporary attribute, and I use the lookup to find all the events associated with it, using the pairing of unique attribute and temporary attribute. I can't use transaction, because it is very slow. One source contains approximately 400,000 events, and I have to process them within a minute. So I use the lookup.


Ayn
Legend

1440 o_O

I don't have a solution to your problem, sorry. Could you tell us more about why you want to run 1440 searches sequentially? Maybe there is another way to achieve the same end goal?


kristian_kolb
Ultra Champion

Sorry, I should have been more explicit. My idea was that you'd use inputlookup or inputcsv, or as Ayn suggests, use a subsearch to 'create' the search results.

/k


ryastrebov
Communicator

I have 1440 searches


Ayn
Legend

How many lookups are we talking about? If it's just a few you should consider using subsearches for this.


ryastrebov
Communicator

Thank you! I read this link, but I believe it will not work in my case, since map has to be fed by a previous search, and I have none: my searches are independent. It is just that each search updates the lookup, and the updated lookup is then used in the next search. Therefore, it is important to wait for the previous search to complete before starting the next one.


kristian_kolb
Ultra Champion

Perhaps the map search command can be of help here. I haven't used it myself, but you should take a look.

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Map

/K
