My data set is web server access logs that include two custom values we insert: a list of keys and a list of values. Both can be seen at the end of this entry:
127.0.0.1 - - [27/Oct/2010:17:25:48 +0000] "GET /index.html HTTP/1.1" 200 13211 - "Mozilla/4.0 (compatible; MSIE 7.0)" - 12;14;15;18;22;23;25;35;37;106;47 0;23;1;0;0;1;0;0;1;1;0
The key list is "12;14;15;18;22;23;25;35;37;106;47" and the value list is "0;23;1;0;0;1;0;0;1;1;0".
These key and value lists can have arbitrary lengths. In this case, each list has 11 values separated by semicolons, but other log entries could have just 1 or 2 values, or even 20.
I'm trying to figure out how I can create fields based on the first list and assign values to them based on the second list. For example, the fields/values I'd like to see generated would be:
The variable number of key/value pairs makes this difficult in the normal search language, but it is easy in Python.
Consider implementing a custom search command.
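As a rough illustration of the Python side, here is a minimal sketch of the pairing logic only. It assumes the key list and value list are always the last two whitespace-separated tokens of the event, as in the sample entry. The function name `extract_pairs` and the `key_` field prefix are made up for this example, and the wiring into an actual custom search command is not shown.

```python
def extract_pairs(line, prefix="key_"):
    """Pair up the trailing key list and value list of one log line.

    Assumption: the last two whitespace-separated tokens are the
    semicolon-separated key list and value list, in that order.
    The "key_" prefix is a hypothetical naming choice.
    """
    # Split from the right so the quoted user agent earlier in the line
    # (which contains spaces and semicolons) is never touched.
    parts = line.rsplit(None, 2)
    if len(parts) < 3:
        return {}

    keys = parts[1].split(";")
    values = parts[2].split(";")

    # Lists of different lengths cannot be paired reliably, so emit nothing.
    if len(keys) != len(values):
        return {}

    # Works for any list length: 1, 2, 11, 20, ...
    return {prefix + k: v for k, v in zip(keys, values)}


if __name__ == "__main__":
    sample = (
        '127.0.0.1 - - [27/Oct/2010:17:25:48 +0000] "GET /index.html HTTP/1.1" '
        '200 13211 - "Mozilla/4.0 (compatible; MSIE 7.0)" - '
        "12;14;15;18;22;23;25;35;37;106;47 0;23;1;0;0;1;0;0;1;1;0"
    )
    for name, value in extract_pairs(sample).items():
        print(name, value)
```

Values are kept as strings here. Whether they should be converted to numbers, and what to do with entries where the lists are missing, depends on your data.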
I'll accept this answer, but I think the end result will be that we will make some changes to our log format instead.