Monitoring Splunk

waterfall saved search

MaryvonneMB
Path Finder

Hi all!

I've got a problem. It's a bit complex to explain, but I'll try to be as clear as possible.
First, I want to generate a file based on 2 different inputs. One of these files is big (about 400,000 lines) and the other is about 100,000 lines. To generate my new file correctly I have to do 4 joins and then an append (to still keep the lines that don't match). So the result is a file which is the sum of these 2 inputs (about 500,000 lines).

After that we want to index this file. Currently I create my result file as a CSV based on these 2 inputs, which are CSVs too, and later I index the result file.
What my team wants is to use the 2 inputs (which for other reasons will already be indexed) to generate the result file and index it directly (so we can delete our result CSV file).

Here is my full search:

|inputlookup firstInputFile.csv
|rex field="fieldA" mode=sed "s/^0{10}//g" 
| rex field="fieldB" mode=sed "s/^0{10}//g" 

| join type=left max=1 "fieldA" [ 
|inputlookup secondInputFile.csv 
|table "fieldC"
|eval "fieldA"='fieldC' 
| eval "new_column"="X"
] 

| join type=left max=1 "fieldB" [
|inputlookup secondInputFile.csv 
|table "fieldC"
|eval "fieldB"='fieldC' 
|eval "new_column"="X"
]

| eval "new_column"=if('new_column'="X","X","Y")

| join type=left max=1 "fieldA" [
| inputlookup thirdInputFile.csv  | search "fieldD"!=""
| table some fields
| eval "fieldA"='fieldD'
]

| join type=left max=1 "fieldB" [
| inputlookup  thirdInputFile.csv  | search "fieldE"!=""
| table some fields
| eval "fieldB"='fieldE' 
] 

| join type=left max=1 "fieldB" [
| inputlookup  thirdInputFile.csv  |search "fieldD"!=""
| table some fields
| eval "fieldB"='fieldD'
]

| join type=left max=1 "fieldA" [
| inputlookup  thirdInputFile.csv  |search "fieldE"!=""
| table some fields
| eval "fieldA"='fieldE'
]

| append [
|inputlookup thirdInputFile.csv 
| rename "fieldD" as "fieldA" "fieldE" as "fieldB" 
| table some fields
]

| fields - some fields
| rex mode=sed field="fieldF" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldG" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldH" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldI" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
| rex mode=sed field="fieldJ" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldK" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"

When I run this search and add |collect index="myindex" sourcetype="mysourcetype" source="mysource", the job ends during result finalization and Splunk disconnects. If I replace the collect with just a table *, it also crashes. It succeeds if I remove at least the append and one of the joins. So I thought it might be possible to split this big search into parts, with temporary results used by the next part of the search. Since I want to automate this process, I looked at saved searches, but I don't really see how they can solve my problem.
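For what it's worth, one way to split a search like this is to have a first scheduled saved search collect an intermediate result into a summary index, and a second scheduled search continue from there. A rough sketch of the idea (the index name "tmp_waterfall" and the step labels are placeholders, not something from this thread):

First scheduled search (does part of the work and saves the intermediate result):

|inputlookup firstInputFile.csv
| join type=left max=1 "fieldA" [ |inputlookup secondInputFile.csv | table fieldC | eval fieldA=fieldC | eval new_column="X" ]
| collect index="tmp_waterfall" source="step1"

Second scheduled search (runs after the first and picks up where it left off):

index="tmp_waterfall" source="step1"
| join type=left max=1 "fieldB" [ |inputlookup secondInputFile.csv | table fieldC | eval fieldB=fieldC | eval new_column="X" ]
| collect index="myindex" sourcetype="mysourcetype" source="mysource"

This avoids running all the joins in one job, at the cost of scheduling the two searches so the second only runs once the first has finished.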

If anyone has an idea, thanks in advance!

1 Solution

woodcock
Esteemed Legend

Try this (EXACTLY as-is; translate your field names but do not change any commands):

|inputlookup firstInputFile.csv | eval DATASET="LeftJoin"
|appendpipe [|inputlookup secondInputFile.csv 
   | table fieldC
   | eval fieldA=fieldC 
   | eval new_column="X"
]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

| appendpipe [|inputlookup secondInputFile.csv 
   | table fieldC
   | eval fieldB=fieldC
   | eval new_column="X"
]
| stats values(*) AS * BY fieldB
| search DATASET="LeftJoin"

| eval new_column=coalesce(new_column, "Y")

| appendpipe [| inputlookup thirdInputFile.csv
  | search "fieldD"!=""
  | eval fieldA=fieldD
  | table some fields
]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

| appendpipe [| inputlookup  thirdInputFile.csv
  | search "fieldE"!=""
  | eval fieldB=fieldE
  | table some fields
]
| stats values(*) AS * BY fieldB
| search DATASET="LeftJoin"

| appendpipe [| inputlookup  thirdInputFile.csv
  |search "fieldD"!=""
  | eval fieldB=fieldD
  | table some fields
]
| stats values(*) AS * BY fieldB
| search DATASET="LeftJoin"

| appendpipe [| inputlookup  thirdInputFile.csv
  | search "fieldE"!=""
  | eval fieldA=fieldE
  | table some fields
]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

| appendpipe [|inputlookup thirdInputFile.csv 
  | rename fieldD AS fieldA fieldE AS fieldB
  | table some fields
]

| fields - some fields
| rex mode=sed field=fieldF "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldG "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldH "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldI "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
| rex mode=sed field=fieldJ "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldK "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
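The idea behind this pattern, reduced to a minimal sketch with hypothetical field names (mainFile.csv, lookupFile.csv, joinKey, extraField are illustrative only): the main rows are tagged with DATASET="LeftJoin", the lookup rows are appended, stats values(*) AS * BY joinKey merges each lookup row's fields onto the main row that shares the same key, and the final search drops the lookup-only rows. The net effect is a left join without the join command's subsearch and memory limits.

|inputlookup mainFile.csv | eval DATASET="LeftJoin"
| appendpipe [|inputlookup lookupFile.csv | table joinKey extraField]
| stats values(*) AS * BY joinKey
| search DATASET="LeftJoin"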

MaryvonneMB
Path Finder

Thanks a lot, and sorry for my delayed answer. Using your search with the lookups as input, the job still ended as before. But after I indexed my first and third inputs, your search performs much better (without ending unexpectedly ^^).
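For readers landing here later: once the inputs are indexed, each |inputlookup can be replaced by a search against the index, keeping the same merge pattern. A sketch of one block, with placeholder index and sourcetype names (not from this thread); append is used here because the data now comes from an indexed search rather than a lookup file:

index="myinputs" sourcetype="firstInput" | eval DATASET="LeftJoin"
| append [ search index="myinputs" sourcetype="secondInput"
  | table fieldC
  | eval fieldA=fieldC, new_column="X" ]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"
| collect index="myindex" sourcetype="mysourcetype" source="mysource"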
