We often find ourselves having to search logs recursively on a unique id. To make things that little bit harder, we first have to find a candidate line in the logs, then cut that line to get the unique id, and then search the logs again with this id. In the good old days of shell scripts, a short pipeline would do the job.
Suppose the log we are supposed to search looks something like the lines below.
Paragraph 1
2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'
2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'
We can write a simple script that first gets the unique id and then searches the log again for it, which would look something like this.
grep 834682 yourfile.log | cut -d "'" -f4 | while read -r i ; do grep "$i" yourfile.log ; done
The first grep outputs only the request. The request carries a unique id, and the only way of finding the response is through this id.
grep 834682 yourfile.log | cut -d "'" -f4 would output 4753065a-221f-11ea-b0b5-0a0634110000
and
while read -r i ; do grep "$i" yourfile.log ; done would output both the lines displayed in Paragraph 1.
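The two steps above can be tried end to end with a minimal sketch like the one below. It assumes a scratch file created with mktemp that holds only the two sample lines from Paragraph 1; the variable names log, id and matches are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch file holding the two sample lines from Paragraph 1.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'
2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'
EOF

# Step 1: find the candidate line, then cut out the 4th quote-delimited field.
id=$(grep 834682 "$log" | cut -d "'" -f4)
echo "unique id: $id"

# Step 2: search the same log again, treating the id as a fixed string.
matches=$(grep -F -- "$id" "$log")
echo "$matches"
```

Field 4 is the right one because the text before the first quote counts as field 1, SVRREQ as field 2, and the space between the two quoted values as field 3.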
Writing this script for a single search seems like overkill, but what if you wanted to extract all the responses for which the request had the text "fancy"? You could simply write
grep -i fancy yourfile.log | cut -d "'" -f4 | while read -r i ; do grep "$i" yourfile.log | grep SVRRES ; done
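When many requests match, the loop above re-reads the log once per id. One possible variation, sketched here under assumptions, is to collect the ids into a pattern file and let a single grep -F -f pass find them all. The second request/response pair in the sample log and the names log, ids and responses are hypothetical, added only so that there is something to filter out.

```shell
#!/bin/sh
# Hypothetical sample log: the pair from Paragraph 1 plus one invented pair
# whose request does not mention "fancy".
log=$(mktemp)
ids=$(mktemp)
trap 'rm -f "$log" "$ids"' EXIT
cat > "$log" <<'EOF'
2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'
2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'
2019-12-19 09:20:34.100000 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110001' '{"ProductId":"834683","Remarks":"Plain"}'
2019-12-19 09:20:34.100010 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110001' '{"Response":"Noted"}'
EOF

# Collect the ids of all requests mentioning "fancy", one per line, de-duplicated.
grep -i fancy "$log" | cut -d "'" -f4 | sort -u > "$ids"

# One pass over the log: keep lines containing any collected id, then only responses.
# An empty pattern file matches nothing, so no ids means no output.
responses=$(grep -F -f "$ids" "$log" | grep SVRRES)
echo "$responses"
```

Writing the ids to a file rather than piping them keeps the empty case safe: an empty pattern file selects no lines, whereas an empty pattern string would select every line.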
Now we start to see the benefit of this script. What if we had to do a similar thing in Splunk? Our task can be divided into three parts.
- Defining a Splunk Variable.
- Running Inner Query.
- Running Outer Query.
Defining a Splunk variable (an extracted field) is pretty straightforward and can be done by following the steps described in the link below.
https://docs.splunk.com/Documentation/Splunk/8.0.1/Knowledge/FXSelectSamplestep
Once done, we can use our new variable with the search command "fields".
So:
- Defining a Splunk Variable.
- Based on the link provided above, let's assume we have named it TransactionId (field names are case sensitive).
- Running Inner Query.
- [search index = yourindex source = "yourlogfile.log" fancy | fields TransactionId]
- Running Outer Query.
- index = yourindex source = "yourlogfile.log" <Inner Query Result goes here> SVRRES
If we put everything together, the executable query would look like the one below.
index = yourindex source = "yourlogfile.log" [search index = yourindex source = "yourlogfile.log" fancy | fields TransactionId ] SVRRES
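This Splunk search first picks up all the lines that have the word "fancy" in them and outputs only the field TransactionId, which the outer search then uses as a string of ORs. Because the subsearch returns a field, each value comes back as a TransactionId="..." condition, so the same field extraction must also apply to the response lines. From the outer query's point of view, it is a query like the one below that runs, for example:
index = yourindex source = "yourlogfile.log" ( ( TransactionId="4753065a-221f-11ea-b0b5-0a0634110000" ) OR ( TransactionId="4753065a-221f-11ea-b0b5-0a0634110001" ) OR ( TransactionId="4753065a-221f-11ea-b0b5-0a0634110002" ) ) SVRRES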
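For readers coming from the shell side, the way the inner result turns into a string of ORs can be imitated with grep -E alternation. This is only an analogy sketched under assumptions, not how Splunk works internally: the sample log lines beyond the Paragraph 1 pair are invented, and inner, pattern and outer are hypothetical names.

```shell
#!/bin/sh
# Hypothetical sample log: two requests mention "fancy", a third does not.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
2019-12-19 09:20:33.414920 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110000' '{"ProductId":"834682","ProductNo":"123543673433212348567","Remarks":"Is This fancy?"}'
2019-12-19 09:20:33.414930 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110000' '{"Response":"Yes Indeed!"}'
2019-12-19 09:20:34.100000 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110001' '{"ProductId":"834683","Remarks":"Very Fancy"}'
2019-12-19 09:20:34.100010 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110001' '{"Response":"Agreed"}'
2019-12-19 09:20:35.200000 'SVRREQ' '4753065a-221f-11ea-b0b5-0a0634110002' '{"ProductId":"834684","Remarks":"Plain"}'
2019-12-19 09:20:35.200010 'SVRRES' '4753065a-221f-11ea-b0b5-0a0634110002' '{"Response":"Noted"}'
EOF

# "Inner query": ids of every request that mentions "fancy".
inner=$(grep -i fancy "$log" | cut -d "'" -f4 | sort -u)

# Join the ids with "|" so they act as one OR'ed pattern: id1|id2|...
pattern=$(printf '%s\n' "$inner" | paste -sd '|' -)
echo "expanded pattern: $pattern"

# "Outer query": a single search with the OR'ed ids, narrowed to responses.
outer=""
if [ -n "$pattern" ]; then
    outer=$(grep -E -- "$pattern" "$log" | grep SVRRES)
fi
echo "$outer"
```

The guard on an empty pattern matters here, because an empty regular expression would match every line of the log.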
Bear in mind that this method is limited in the number of results the subsearch can return: the default is 10,000, and the setting cannot be raised to 10,500 or beyond. When the amount of data to be correlated is huge, there are better approaches, such as using lookup files or the join command. Most of the time, the query explained in this post would fit our requirement.