The Nuances of Advanced Correlation Rules for Authentication Logs

LogRhythm’s Advanced Intelligence (AI) Engine lets users correlate across all the logs in a network and alerts them when log patterns look unusual. My team, the Knowledge Engineers, is tasked with creating rules for advanced correlation and pattern recognition. In the early days of the AI Engine, however, we ran into a number of unexpected challenges while building our prepackaged correlation rules.

Let’s start with an example:

One of our AI Engine rules is designed to detect a brute force attempt followed by a successful login on the same account. The rule looks for at least 5 authentication failures followed by a successful logon by the same origin login. That seems simple enough, but during the AI Engine Beta program we discovered that this alert fired far too often.
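To make the logic concrete, here is a minimal Python sketch of the rule as first described. It is not the AI Engine's rule syntax or implementation. The `AuthEvent` record, the function name, and the five-minute window are assumptions made for illustration; the post itself only specifies "at least 5 failures followed by a success by the same origin login."

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class AuthEvent:
    """Hypothetical normalized authentication log (not a LogRhythm type)."""
    timestamp: float  # seconds
    login: str        # origin login (account name)
    success: bool     # True for authentication success, False for failure


def detect_bruteforce_then_success(events, min_failures=5, window_seconds=300):
    """Return (login, timestamp) for each success that follows at least
    min_failures failures by the same login within window_seconds.

    The time window is an assumption; the rule in the post does not state one.
    """
    recent_failures = defaultdict(deque)  # login -> timestamps of recent failures
    alerts = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        failures = recent_failures[ev.login]
        # Forget failures that are too old to count toward this attempt.
        while failures and ev.timestamp - failures[0] > window_seconds:
            failures.popleft()
        if ev.success:
            if len(failures) >= min_failures:
                alerts.append((ev.login, ev.timestamp))
            failures.clear()  # a success resets the failure streak
        else:
            failures.append(ev.timestamp)
    return alerts
```

Note that this version counts raw failure logs, so it trusts that one log equals one failed attempt.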

Why is this? Our investigation showed that the Windows logon process doesn’t produce logs exactly how we expected (or how we would like). It turns out that when the domain controllers in an environment are connected, a single authentication failure produces a failure log for every domain controller in the system. In a large environment, someone who mistypes their password just once can generate as many authentication failure logs as there are domain controllers running Active Directory in the network. As a result, this AI Engine rule, which is supposed to catch people logging into accounts that aren’t theirs, had a very high false positive rate during beta testing.
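One way to picture a fix for this duplication is to collapse failure logs that arrive close together from different domain controllers into a single logical attempt before counting. The sketch below is illustrative only and does not describe how LogRhythm actually tuned its rules. The `FailureLog` record, the function name, and the two-second merge interval are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureLog:
    """Hypothetical authentication failure log (not a LogRhythm type)."""
    timestamp: float         # seconds
    login: str               # account that failed to authenticate
    domain_controller: str   # host that reported the failure


def count_distinct_attempts(failures, merge_seconds=2.0):
    """Count failed attempts per login, treating logs from different domain
    controllers within merge_seconds of the first log as one attempt.

    A repeat report from a domain controller already seen in the current
    burst is assumed to be a new attempt rather than a duplicate.
    """
    bursts = {}    # login -> (burst start time, set of DCs seen in the burst)
    attempts = {}  # login -> number of distinct failed attempts
    for log in sorted(failures, key=lambda f: f.timestamp):
        burst = bursts.get(log.login)
        same_burst = (
            burst is not None
            and log.timestamp - burst[0] <= merge_seconds
            and log.domain_controller not in burst[1]
        )
        if same_burst:
            burst[1].add(log.domain_controller)
        else:
            bursts[log.login] = (log.timestamp, {log.domain_controller})
            attempts[log.login] = attempts.get(log.login, 0) + 1
    return attempts
```

With this kind of pre-processing, one mistyped password reported by six domain controllers counts as one failure, while five failures reported one after another by the same domain controller still count as five.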

That raised a further question: what about other Windows authentication events, such as logons? We decided to investigate. Testing logons just around our office yielded a wide variety of results. When I logged in, I produced 12 logs classified as Authentication Success and even 4 logs classified as Logoffs. Another employee produced 49 logs with a single logon! Some of these logon logs are simply connections to shared resources like a network drive, not actual logons. If a typical LogRhythm user wants to investigate a specific employee, they will want to find the exact time the employee logged on, what they accessed, and what they did with the objects they accessed. This is a daunting task for a user who doesn’t have a deep understanding of Windows authentication events.
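As a rough illustration of separating an actual logon from connections to shared resources, the sketch below filters on the Windows logon type, where type 2 is an interactive logon, type 3 is a network logon (such as connecting to a share), type 10 is a remote interactive logon, and type 11 is a cached interactive logon. The dictionary keys and the function name are hypothetical, and this is only one possible heuristic, not the filtering LogRhythm applies.

```python
# Windows logon types that correspond to a person signing in at a session:
# 2 = interactive, 10 = remote interactive, 11 = cached interactive.
# Type 3 (network), e.g. mapping a network drive, is deliberately excluded.
INTERACTIVE_LOGON_TYPES = {2, 10, 11}


def first_interactive_logon(logs, login):
    """Return the earliest interactive logon log for `login`, or None.

    Each log is a dict with the hypothetical keys "timestamp", "login",
    and "logon_type".
    """
    candidates = [
        log for log in logs
        if log["login"] == login and log["logon_type"] in INTERACTIVE_LOGON_TYPES
    ]
    return min(candidates, key=lambda log: log["timestamp"], default=None)
```

For example, given one type 2 log and a dozen type 3 logs for the same user, the function returns only the type 2 log, which is the timestamp an investigator most likely wants.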

With the creation of the AI Engine rules we realized just how noisy authentication events actually are in Windows. To address this issue, we fine-tuned our common events and AI Engine rules to filter out the extra noise. Because AI Engine rules can leverage all of LogRhythm’s extensive normalization, data enrichment and log parsing, they can be quickly modified for much greater accuracy. As a result, a typical LogRhythm user can investigate incidents without having to understand every detail of the Windows authentication process. That is what every SIEM user is looking for: a simple way to understand exactly what is happening in their network.

Not only did this analysis help us create more refined AI Engine rules with fewer false positives, it gave us more refined guidance for normalization, making LogRhythm easier for our customers to use.