Recursive Sub-search In Splunk

Many times we find ourselves having to search logs recursively by a unique id. To make things a little harder, we first have to find a candidate line in the logs, then cut that line to get the unique id, and then search the logs again with this unique id. In the good old days of shell scripts, this would have been a short one-liner.

Suppose the log we need to search looks something like the lines below.

Paragraph 1

2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'

2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'

We can write a simple script that first gets the unique id and then searches the log again for it, which would look something like this.

grep 834682 yourfile.log | cut -d "'" -f4 | while read i ; do grep "$i" yourfile.log ; done;

The first grep outputs only the request, which carries a unique id; the only way to find the response is through this unique id.

grep 834682 yourfile.log | cut -d "'" -f4  would output  4753065a-221f-11ea-b0b5-0a0634110000

and

while read i ; do grep "$i" yourfile.log ; done;  would output both the lines displayed in Paragraph 1.
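To try the pipeline above without a real log, here is a minimal sketch that writes the two lines from Paragraph 1 into a temporary file and runs the same grep, cut and loop over it. The function name find_related and the temporary file are assumptions made for this illustration; the added -F (fixed string) and sort -u (de-duplicate ids) are small safety tweaks, not part of the original one-liner.

```shell
#!/bin/sh
# Throwaway copy of the sample log from Paragraph 1 (temporary file, illustration only).
log=$(mktemp)
cat > "$log" <<'EOF'
2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'
2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'
EOF

# find_related TEXT FILE (hypothetical helper name)
# 1. find the candidate lines containing TEXT
# 2. cut out the 4th single-quote-delimited field, which is the unique id
# 3. search the file again once per distinct id
find_related() {
  grep -F -- "$1" "$2" | cut -d "'" -f4 | sort -u | while IFS= read -r id; do
    grep -F -- "$id" "$2"
  done
}

# Prints the request and the response that share the id.
find_related 834682 "$log"
```

Quoting the id and using grep -F keeps the loop from misbehaving if an id ever contains spaces or characters that grep would treat as a pattern.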

Writing this script for a single search seems like overkill, but what if you wanted to extract all the responses for which the request contained the text "fancy"? You could simply write

grep -i fancy yourfile.log | cut -d "'" -f4 | while read i ; do grep "$i" yourfile.log | grep SVRRES ; done;
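The loop above scans the log once per id. A possible variant, sketched here under the assumption that the ids fit comfortably in a small file, collects all ids first and then hands them to a single grep -f, which is conceptually close to what the Splunk sub-search does further down in this post. The names responses_for, log and ids are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Temporary files for the sample log and the collected ids (illustration only).
log=$(mktemp)
ids=$(mktemp)
cat > "$log" <<'EOF'
2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'
2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'
EOF

# responses_for TEXT FILE (hypothetical helper name)
responses_for() {
  # "Inner query": collect the distinct ids of lines containing TEXT (case-insensitive).
  grep -i -F -- "$1" "$2" | cut -d "'" -f4 | sort -u > "$ids"
  # Nothing matched: stop here instead of running grep with an empty pattern file.
  [ -s "$ids" ] || return 0
  # "Outer query": one pass over the log for all ids, keeping only the responses.
  grep -F -f "$ids" "$2" | grep -F "'SVRRES'"
}

# Prints the SVRRES line whose request mentioned "fancy".
responses_for fancy "$log"
```

With many ids this reads the log twice in total rather than once per id, at the cost of a temporary file.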

Now we start to see the benefit of this script. What if we had to do a similar thing in Splunk? Our task can be divided into three parts.

  1. Defining a Splunk Variable.
  2. Running Inner Query.
  3. Running Outer Query.

Defining a Splunk variable (what Splunk itself calls an extracted field) is pretty straightforward and can be done by following the steps described in the link below.

https://docs.splunk.com/Documentation/Splunk/8.0.1/Knowledge/FXSelectSamplestep

Once done, we can use our new variable with the search command "fields".

So:

  1. Defining a Splunk Variable.
    1. Based on the link provided above, let's assume we have named it TransactionId (field names are case sensitive).
  2. Running Inner Query.
    1. [search index = yourindex  source = "yourlogfile.log" fancy  | fields TransactionId]
  3. Running Outer Query.
    1. index = yourindex  source = "yourlogfile.log"  <Inner Query Result goes here> SVRRES

If we put everything together, the executable query looks like the one below.

index = yourindex  source = "yourlogfile.log" [search index = yourindex  source = "yourlogfile.log" fancy  | fields TransactionId ] SVRRES

This Splunk search first picks up all the lines that contain the word "fancy" and outputs only the field TransactionId, which the outer search then uses as a string of ORs on that field. This relies on TransactionId being extracted on the response lines as well. From the outer query's point of view, it is a query like the one below that runs, for example:

index = yourindex  source = "yourlogfile.log" ( ( TransactionId="4753065a-221f-11ea-b0b5-0a0634110000" ) OR ( TransactionId="4753065a-221f-11ea-b0b5-0a0634110001" ) ) SVRRES

Bear in mind that this method has a limit on the number of sub-search results (10,000 by default, and the setting cannot be raised to 10,500 or beyond), and that there are better approaches, such as lookup files or the join command, when the amount of data to be correlated is huge. Most of the time, the query explained in this post will fit our requirement.
