Getting Data In
Provide Splunk Cloud feedback in this confidential UX survey by June 17
for a chance to win a $200 Amazon gift card!

Return correlating fields from more than one source (multiple condition statements)

HLVarian
Path Finder

This is long because I wanted to make sure all the right info was passed along. I've been racking my brain on this for a while and I'm not getting all my results back (maybe because there are so many events), and I can't figure out how to match on these properly. I might just need help with optimization.

I have an ID field taken from the sources for my sourcetypes. ID's that are the same from these different sourcetypes are from correlating files. But, some ID's were repeated on some sourcetypes so I need to either eliminate these files or match them in another way (I was thinking by the count of unique foos in each ID, or maybe the max(foo) for each ID of a certain sourcetypes. Tjhis then needs to be further evaluated and summed.

I've tried joins and OR statements, all seem to have a drawback, the first has wonky results for any file that has a repeat ID:

sourcetype=sourcetype1| stats sum(bar) AS Bar_Time by  ID, foo
|  sort ID, foo
| join type=outer foo max=1 
[|search  sourcetype=sourcetype2| stats values(baz) AS Bazzle values(naz) AS nazzle by ID fooNum
| rename fooNum AS foo | sort ID1, LayerNr] | eval product=Bar_Time*nazzle | stats sum(product) as Product by nazzle 

This second one almost works. It takes out replicated IDs, but sometimes the sum on bar comes up blank or the value isn't correct. I'm guessing it might be a timeout issue:

  (sourcetype=Sourcetype1 OR sourcetype=Sourcetype2) | rename booNum AS foo |  stats sum(bar) AS Bar_Time values(baz) AS Bazzle values(naz) AS nazzle by  ID, foo | join type=inner ID [ search  sourcetype=Sourcetype2 | stats values(source) AS Source by ID | stats count(Source) AS IDcount by ID | where IDcount<2]

I haven't yet figured out how to match against ID and count of unique foos to match up the duplicate ID's properly, but I can live without them, it will create a very small error.

I have data and fields like this:

sourcetype1

ID, foo , bar

1, 1, 0.01

1, 1, 0.02
1, 2, 0.04
1, 3, 0.01
2, 1, 0.05
2, 2, 0.06
2, 2, 0.07
2, 3, 0.08
2, 4, 0.02
2, 4, 0.01
1, 1, 0.11
1, 2, 0.12
1, 3, 0.13


sourcetype2

ID, fooNum, taz, naz
1, 1, 3, 4
1, 2, 5, 6
1, 3, 2, 8
2, 1, 6, 7

2, 2, 5, 6
2, 3, 10, 9
2, 4, 3, 12
1, 1, 12, 2
1, 2, 3, 4
1, 3, 8, 10

In these sourcetypes the IDs correlate and foo correlates with fooNum. I also want to return the sum of bar.
So where the IDs are equal and foo is equal It should return this. To which I can then pipe in the other maths:

ID, foo, sum(bar), taz, naz
1, 1, 0.03, 3, 4
1, 2, 0.04, 5, 6
1, 3, 0.01, 2, 8
2, 1, 0.05, 6, 7
2, 2, 0.13, 5, 6
.
.
.
1, 3, 0.13, 8, 10

0 Karma
1 Solution

HLVarian
Path Finder

I have verified that this search works for my data:

index=main  (sourcetype=sourcetype1 OR sourcetype=sourcetype1) 
| eval fooFoo=coalesce(foo, foonum) 
| eventstats max(fooFoo) AS maxFoo by source 
| stats values(taz) AS tazzle  values(naz) AS nazzle sum(bar) AS barTimeSum by ID maxFoo fooFoo 
| sort ID maxFoo fooFoo

Since editing the limits.conf didn't seem to be effective at returning more than 10,000 results. I'm currently trying to pipe in

| outputcsv fooFilename

I'll update if this gives me all of my wanted results.

View solution in original post

0 Karma

HLVarian
Path Finder

I have verified that this search works for my data:

index=main  (sourcetype=sourcetype1 OR sourcetype=sourcetype1) 
| eval fooFoo=coalesce(foo, foonum) 
| eventstats max(fooFoo) AS maxFoo by source 
| stats values(taz) AS tazzle  values(naz) AS nazzle sum(bar) AS barTimeSum by ID maxFoo fooFoo 
| sort ID maxFoo fooFoo

Since editing the limits.conf didn't seem to be effective at returning more than 10,000 results. I'm currently trying to pipe in

| outputcsv fooFilename

I'll update if this gives me all of my wanted results.

View solution in original post

0 Karma

HLVarian
Path Finder

I changed

| sort field1 field2 field3

to

| sort 0 field1 field2 field3

This causes sort to return all results, otherwise it defaults to 10,000. You can also specify the number of results you want sort to return.

0 Karma

HLVarian
Path Finder

Correction, in the second search: When I sum up Product (which I didn't put on that example, but have tried) I get erroneous or missing sums.

0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!