Splunk Search

How to do a relational search by getting a partial string as key from a subsearch?

Engager
  1. 2015-11-26 22:40:25,394 [] scheduled test:refs/changes/69/131269/1 to [6d230057] push http://user@example.com:9441/test/abc.git to run after 5s
  2. 2015-11-26 22:40:25,395 [] scheduled test:refs/changes/69/131269/1 to [2d7bc846] push http://user@example.com:9441/test/abc.git to run after 5s
  3. 2015-11-26 22:40:30,394 [6d230057] Replication to http://user@example.com:9441/test/abc.git started...
  4. 2015-11-26 22:40:30,395 [2d7bc846] Replication to http://user@example.com:9491/test/abc.git started...
  5. 2015-11-26 22:40:33,293 [6d230057] Replication to http://user@example.com:9441/test/abc.git completed in 2898 ms
  6. 2015-11-26 22:40:32,334 [2d7bc846] Replication to http://user@example.com:9491/test/abc.git completed in 1938 ms

Now I only have "131269" and "example.com:9491" as the input parameters,
and I want to find out No.4 and No.6 records listed above.

Thank you!

0 Karma
1 Solution

Influencer

Extract out the commit hash, then build up your search from there. EG:

sourcetype=my_log example.com:9491 (131269 or Replication )| rex field=_raw "^.*\[(?<commit_hash>\w.*?)\]" | stats values(_raw) by commit_hash

This should return something like this (

| commit_hash      | _raw                                                                                        
    6d230057       | 2015-11-26 22:40:25,394 [] scheduled test:refs/changes/69/131269/1 to [6d230057] push http://user@exam
                   | 2015-11-26 22:40:30,394 [6d230057] Replication to http://user@example.com:9441/test/abc.git started...
                   | 2015-11-26 22:40:33,293 [6d230057] Replication to http://user@example.com:9441/test/abc.git completed in 
     2d7bc846      | 2015-11-26 22:40:25,395 [] scheduled test:refs/changes/69/131269/1 to [2d7bc846] push http://user@example
                   | 2015-11-26 22:40:30,395 [2d7bc846] Replication to http://user@example.com:9491/test/abc.git started...
                   | 2015-11-26 22:40:32,334 [2d7bc846] Replication to http://user@example.com:9491/test/abc.git completed i

pipe a search in after stats if you want to filter down further. You shouldn't need a subsearch
See http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Rex for more details on rex

View solution in original post

0 Karma

Influencer

Extract out the commit hash, then build up your search from there. EG:

sourcetype=my_log example.com:9491 (131269 or Replication )| rex field=_raw "^.*\[(?<commit_hash>\w.*?)\]" | stats values(_raw) by commit_hash

This should return something like this (

| commit_hash      | _raw                                                                                        
    6d230057       | 2015-11-26 22:40:25,394 [] scheduled test:refs/changes/69/131269/1 to [6d230057] push http://user@exam
                   | 2015-11-26 22:40:30,394 [6d230057] Replication to http://user@example.com:9441/test/abc.git started...
                   | 2015-11-26 22:40:33,293 [6d230057] Replication to http://user@example.com:9441/test/abc.git completed in 
     2d7bc846      | 2015-11-26 22:40:25,395 [] scheduled test:refs/changes/69/131269/1 to [2d7bc846] push http://user@example
                   | 2015-11-26 22:40:30,395 [2d7bc846] Replication to http://user@example.com:9491/test/abc.git started...
                   | 2015-11-26 22:40:32,334 [2d7bc846] Replication to http://user@example.com:9491/test/abc.git completed i

pipe a search in after stats if you want to filter down further. You shouldn't need a subsearch
See http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Rex for more details on rex

View solution in original post

0 Karma

Influencer

For some reason The editor is dropping the capture group name. The full rex command should be

rex field=_raw "^.#42;\[(?<commit_hash>\w.#42;?)\]"

0 Karma

Engager

I use

"sourcetype=my_log (example.com:9442 OR example.com:9441) (131676 OR Replication) | rex field=_raw "^.*\[(?<commit_hash>\w.*?)\]" | stats values(_raw) by commit_hash | search commit_hash"

to get result, but it return "no results found".

Is there anything I miss?

0 Karma

Influencer

Do you get results for

sourcetype=my_log (example.com:9442 OR example.com:9441) (131676 OR Replication) | rex field=_raw "^.*\[(?<commit_hash>\w.*?)\]" | stats values(_raw) by commit_hash

?

0 Karma

Engager

yes, it shows two columns, commit_hash at the left column and matched data at the right column. but if I append "| search commit_hash", it shows no result

0 Karma

Influencer

Make sure you are supply a value for the field to search, eg:

...| search commit_hash=<some value here>

Also stats values(_raw) was just an example. It's probably better to run

...| stats values(*)  as * by commit_hash

That will make sure you get all the fields, so you can do further aggregation commands. Have a search through splunk ansers for stats values and chart values for some great examples.

0 Karma

Engager

When I use sourcetype=my_log (example.com:9442 OR example.com:9441) (131676 OR Replication) | rex field=_raw "^.*\[(?\w.*?)\]" without stats ..., it shows logs include (131676 OR Replication). that's right here.

while I append stats values ..., then it just shows results that matched the regex. It is make sense here, too.

Now the problem is I want to use the result (commit_hash) to find out log those contain these commit hash. It does not make any sense if I append search commit_hash to to end of stats...

Because after stats(), it only left the regex matched result, but what I want to search in is those logs that contain (131676 OR Replication). regex is just the way I grab commit_hash Ids. so I am thinking that I might need a subsearch?

0 Karma

Influencer

Are we still using your example data in your question?

I suggested you append a search command to filter down the list, but I'm not sure how you want to filter it. Are you trying to exclude all but a single commit hash? Or do you want to use the commit hashes to search a different log? I'm confused. Originally you said you want records 4 & 6 but now you say you want all the records?

Can you provide an example of what you want the result to be? Based on the example data you originally provided.

Of course if you just want to use the commit hashes to search another source then make the search string a subsearch. EG

sourcetype=my_other_log [search sourcetype=my_log (example.com:9442 OR example.com:9441) (131676 OR Replication) | rex field=_raw "^.*\[(?<commit_hash>\w.*?)\]" | dedup commit_hash | fields commit_hash | rename commit_hash as query]

(The rename command at the end of the subsearch removes the 'commit_hash=' from the returned values. See http://docs.splunk.com/Documentation/Splunk/6.3.1511/Search/Changetheformatofsubsearchresults)

Engager

yes! your solution above is exactly what I want!!!!
I am still using that example in the original post.
There is only one log(my_log), but if we use stats()... then the result will be filtered as only record 1 & 2, there will not be any 3,4,5,6 records left. So it is impossible to find out 4 & 6 depend on that filtered result.

the finally search string is:
sourcetype=my_log [search sourcetype=my_log (example.com:9442 OR example.com:9441) (131676 OR Replication) | rex field=_raw "^.*to.*\[(?\w.*?)\]" | dedup commit_hash | fields commit_hash | rename commit_hash as query]

and I can get record 4 & 6.

0 Karma

Influencer

You're going to get 3 & 5 as well though. Is that ok?

0 Karma

Engager

the second record is like this, there is no "apples" in it : 2015-11-25 18:35:05 [123456] there are 123 in A stock. Can I find out the second record only by the input token "apple"?

0 Karma

Legend

Assuming that you have fields extracted for each of the key data items, you can do this easily:

sourcetype=my_log
| stats latest(inventory_count) as count by stock_key name store

Where the variables have the following values, using your example:
inventory_count = 123
stock_key = 123456
name = apples
store = A

0 Karma

Legend

And of course, you might want to do the selection as part of the base search:

sourcetype=my_log *apple*

would work easily

0 Karma

Engager

sorry, my fault, let's assume that the second record is like this, there is no "apples" in it : 2015-11-25 18:35:05 [123456] there are 123 in A stock. can I find out the second record only by input token "apple" ?

0 Karma

Influencer

sourcetype=my_log apples
will return only record 2, and ignore record 1. Is that the use case you need? Or are you trying to find more generic data? If you post a sample of the actual data it would help.

Just reread your question, I guess you're stuck with apple as the input? If its a token just append s, eg

sourcetype=my_log $inputToken$s

0 Karma

Engager

sorry, my fault, let's assume that the second record is like this, there is no "apples" in it : 2015-11-25 18:35:05 [123456] there are 123 in A stock. can I search the second record only by input token "apple"

0 Karma

Engager

so much appreciated! I just upload my actual data to original post, hope that would be help.
thank you again

0 Karma

Influencer

Not unless there is some commonality between the two events. For example are any of these common between the two events?

  • The number you have in brackets 123456 -The time the event occurs ( look at the transaction command)
  • The phrase "A Stock" (search for that instead of apple)

Otherwise I suspect the answer to your question is "No, you can't". Once again if you post actual data samples along with a design of what you want the search result to look like (eg the actual results table or chart you are after) we can probably help you better.