In the logs I'm working with there is one key piece, the thread name, that I want indexed because most searches will use it to correlate system activity. The thread name shows up in different places depending on the log. In most log files it looks like [12345678@abc-1234], but it is not a fixed length. The thread name is always enclosed in square brackets or double quotes and should match a regex pattern like [0-9]+@[A-Za-z ]+-[0-9]+
Some examples would be:
"898661908@Default-1161" Id=29292 THREADS - [Count: 91] on com.mercury.topaz.cmdb.server.util.concurrent.MortbayQueuedThreadPoolFixed$PoolThread@35907e14
2014-11-05 01:00:51,473 [1205989746@History Update Task-0] INFO history.doExecute(33) - HistoryUpdateAddCmdbChanges took 0 [ms]
2014-11-03 08:49:06,967 [885504682@Default-1099] (QuotaMonitorCollector.java:57) INFO - server quota [quota.name.server.tql.active], current count set to : 432 quota is : 3400
2014-11-05 00:00:21,370 [1205989746@History Update Task-0] INFO 14ms 1775408592 SELECT VALUE FROM CCM_SETTINGS WHERE NAME = ? AND CUSTOMER_ID = ? FOR UPDATE; Values: 'SEQUENCE_HistoryDB',-2147483648
2014-11-05 08:13:55,472 [1206073570@qtp0-27386] - Request [SaveEntity, 631842233, Session{sessionId='164', username='admin', customerId=1, id name: Default Client, clientAddr='local', lastUsedInMillis=1413521516073}, logged in user: u306630], Input:
2014-11-05 08:14:06,888 [10162952@qtp0-27387] - Request [GetViewTree, 492270348, Session{sessionId='164', username='admin', customerId=1, id name: Default Client, clientAddr='local', lastUsedInMillis=1413521516073}, logged in user: Integration], Input:
INFO - TaskGenerator: no results for probe VCLD13GDAFHAP05, waiting 30 seconds [1544627621@qtp0-16772]
I tried modifying the grokit.properties, but I can't seem to get it working. Any suggestions?
There are a few online regex debuggers, such as Debuggex, which help with writing complex expressions. In this case the regex pattern does not appear to match the log lines. This link [https://www.debuggex.com/r/npzBjSsooK5GRIMA] shows the old regex, and this one [https://www.debuggex.com/r/f63EirYFvBP5Ftn0] demonstrates a regex tested against your sample data.
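The same kind of check can be done offline. Below is a minimal sketch in Python, whose regex syntax is close to, but not guaranteed identical to, what Logscape uses. It runs the pattern from the question against a few of the sample lines (shortened here) and shows one visible mismatch: names such as qtp0 contain a digit, which [A-Za-z ]+ does not allow. The BROADENED pattern and the find_thread helper are assumptions made for illustration, not necessarily what the second link contains.

```python
import re

# Pattern from the question: the name part allows only letters and spaces.
ORIGINAL = re.compile(r"[0-9]+@[A-Za-z ]+-[0-9]+")
# Hypothetical broadened pattern: also allows digits in the name (e.g. "qtp0").
BROADENED = re.compile(r"[0-9]+@[A-Za-z0-9 ]+-[0-9]+")

# Sample lines from the question, shortened for readability.
SAMPLES = [
    '"898661908@Default-1161" Id=29292 THREADS - [Count: 91]',
    "2014-11-05 01:00:51,473 [1205989746@History Update Task-0] INFO history.doExecute(33)",
    "2014-11-05 08:13:55,472 [1206073570@qtp0-27386] - Request [SaveEntity, 631842233]",
    "INFO - TaskGenerator: no results for probe VCLD13GDAFHAP05, waiting 30 seconds [1544627621@qtp0-16772]",
]


def find_thread(pattern, line):
    """Return the first thread name found anywhere in the line, or None."""
    match = pattern.search(line)
    return match.group(0) if match else None


for sample in SAMPLES:
    # Show what each pattern finds so mismatches are easy to spot.
    print(find_thread(ORIGINAL, sample), "|", find_thread(BROADENED, sample))
```

Note that this uses a search within the line, which is convenient for debugging the core pattern but is not how the pattern is applied in grokit.properties.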
One thing to watch out for when editing grokit.properties is that the regex must match the entire line. It is not a regex search within a line of text. It is important to have .*? at the beginning and end of the pattern. For example, the following regex would match the WARN, ERROR or INFO text in a typical log4j line:
.*?(INFO|WARN|ERROR).*?
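To see why the wrapping matters, here is a small sketch in Python that imitates whole-line matching with re.fullmatch. This is only an approximation of the behaviour described above, not Logscape's own code; the whole_line_group helper and the THREAD_WRAPPED pattern (with digits allowed in the thread name) are hypothetical and exist only for this illustration.

```python
import re

# One of the sample lines from the question.
LINE = (
    "2014-11-05 01:00:51,473 [1205989746@History Update Task-0] "
    "INFO history.doExecute(33) - HistoryUpdateAddCmdbChanges took 0 [ms]"
)

BARE = r"(INFO|WARN|ERROR)"               # fine for a search, fails as a whole-line match
WRAPPED = r".*?(INFO|WARN|ERROR).*?"      # the form shown above
# Hypothetical thread-name pattern wrapped the same way (assumption, not from the docs).
THREAD_WRAPPED = r".*?([0-9]+@[A-Za-z0-9 ]+-[0-9]+).*?"


def whole_line_group(pattern, line):
    """Return group 1 only if the pattern matches the entire line, else None."""
    match = re.fullmatch(pattern, line)
    return match.group(1) if match else None


print(whole_line_group(BARE, LINE))            # no whole-line match
print(whole_line_group(WRAPPED, LINE))         # captures the log level
print(whole_line_group(THREAD_WRAPPED, LINE))  # captures the thread name
```

The bare pattern would succeed with a search, which is why a regex that looks right in an online debugger can still fail once it has to cover the whole line.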
See docs: http://logscape.github.io/types-field_grokit.html