Originally published: 18 April 2019
The Need for SIEM Solutions
For the past several years, organisations have been under increasing pressure to find a solution to mitigate many of the security risks that traditional, reactive defences have proven inadequate to deal with in the real world.
All too often, organisations learn through painful experience that they must assume they will be breached, and so need to focus on dealing with that fact to prevent an incident from becoming a terminal event for the business.
The modern "assume breach" approach to security requires the introduction of Security Information and Event Management (SIEM) as the key service delivered by Security Operations Centres (SOCs): teams of skilled cyber professionals using effective processes and technologies to deal with serious incidents when they occur.
SOCs are too expensive for most small and medium organisations, and even where there’s the budget, finding the skilled cyber professionals is not easy — yet their security assessments and audits are demanding that something is done.
This article explains the current drivers and pitfalls for SIEM adoption, and how new developments, including cloud-native SIEM offerings, make it easier for most enterprises to implement an effective and affordable pro-active security solution.
The Demand for Pro-active Security and Limitations of Conventional Approaches
Organisations have become increasingly aware that a purely defensive approach to security (where the focus is on fortifying every internal computer and network so well that they will never be breached) is simply unrealistic. There are so many potential attacks and vulnerabilities, both existing and newly appearing every day, that a breach is ultimately inevitable.
Many organisations have been advised (or compelled to adopt) pro-active security as a consequence. SIEM solutions are a common recommendation and often the first stop for investigations into a solution.
The procurement and roll-out of SIEM products (either proprietary or open-source) is often seen as too costly for organisations with a limited cybersecurity budget, however, due to the price of the equipment, the dependency on niche skills, or both.
After looking at the costs involved, many smaller organisations decide to forego the investment and compromise on meeting the security requirement. They do this by using individual tools (AV on endpoints, network devices, etc.), coupled with a series of individual central managers for each technology and (perhaps) a central log manager that would generate some alerts.
To become aware of all the other potential attacks and compromises beyond the limited and static set of events programmed into each security tool, staff would be expected to trawl through the logs regularly and manually spot unusual events — an activity that in practice ends up not being done at all, or not frequently enough to be effective.
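The manual trawl described above amounts to filtering raw logs for a handful of known-bad patterns. A minimal sketch (the patterns and log lines here are illustrative assumptions, not from any real deployment) shows both why automation helps and why a static pattern list falls short:

```python
import re

# Illustrative patterns a human reviewer might scan for; a real
# environment would need far more, kept continuously up to date.
SUSPICIOUS = [
    re.compile(r"authentication failure", re.IGNORECASE),
    re.compile(r"segfault"),
    re.compile(r"connection from .* refused"),
]

def flag_unusual(lines):
    """Return only the log lines matching any suspicious pattern."""
    return [ln for ln in lines if any(p.search(ln) for p in SUSPICIOUS)]

log = [
    "Jan 01 09:12:01 web01 sshd: Authentication failure for root",
    "Jan 01 09:12:05 web01 cron: job started",
    "Jan 01 09:13:44 db01 kernel: app[123]: segfault at 0x0",
]
print(flag_unusual(log))  # two suspicious lines; the cron line is filtered out
```

Anything not on the pattern list sails straight past this filter, which is precisely the gap a correlating SIEM is meant to close.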
Invariably, even if these organisations do try, they soon realise it is futile to expect one or two swivel-eyed operations staff to watch constantly for incoming events, identify genuine incidents, or trawl through the millions of events every month to see whether an incident went undetected by their vendors' point solutions.
If multiple minor events occur during the day on different systems (noted on consoles, reported by end-users/customers, etc.), then for the more obvious cases the operations staff may recognise that something more serious is happening, which could represent an incident, and quickly piece together the root cause through analysis and troubleshooting.
Adoption of SIEM
Those enterprises that identify that they must adopt a proactive approach to the analysis of various logs, events and other security-related data (e.g. threat intelligence feeds) most often turn to providers of monolithic SIEM tools (designed to ease the burden of connecting all these sources and deliver an easy-to-use dashboard for staff in the security operations) or go down the open-source route to “roll their own” SIEM from technologies such as ELK or EFK (Elasticsearch, Logstash/Fluentd and Kibana).
Many of these architectures have been perfectly adequate for individual enterprises. They can be expanded or bolted together ("scale up and scale out") to varying degrees depending on the architecture and the price. There has been little focus on supporting multi-tenant designs, however, and ultimately they reach a performance limit: the high ingest rates required (usually measured in events per second), the separation of clients' data within an optimised database engine, or the ability to synchronise often require sacrificing some management and security features to cope with the throughput demand.
For the more technically minded, Exabeam has produced an excellent guide explaining the technical differences and the historic development of SIEM platforms.
Given the desire to avoid the ongoing complexity and resource-intensive overheads of integration and ongoing support for home-grown open-source tools, many organisations view SIEM as being specifically a single tool or a family of tools from a specific vendor. And the product selection process is, therefore, based on researching various articles to determine which particular vendor “is the best SIEM”.
Such an overbalanced focus on technology often ends in disappointment, as the hidden costs of implementing and operating an inappropriate choice are only realised once the project concludes. It would be far better to treat the introduction of SIEM as developing the technology alongside the enterprise's existing environment, processes and people (essential components in defining how SIEM can work for an organisation), shaping the procurement to suit.
A (very) short guide to how SIEMs work and their main benefits
Having a SIEM solution in place allows the correlation of multiple events from different sources: for example, seeing events A+B+C on device X might indicate a 90% probability of a genuine priority 1 incident. Of course, this correlation only works if the SIEM has been programmed with a "rule" telling it about this specific "misuse case" of events A+B+C.
SIEMs use this correlation of Events of Interest (EoI), combined with other data worthy of investigation, to determine Indicators of Attack (IoA): pre-breach signs that hackers are at work.
In other cases, what is detected is the trace left after malicious activity rather than before it. Such Indicators of Compromise (IoC) also need rules to detect them, enabling rapid incident handling to contain the damage post-breach.
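At its simplest, the A+B+C correlation above is a set-membership check per device. The sketch below is purely illustrative (the rule, event IDs and stream are invented for this example); real SIEM rules also consider time windows, ordering and severity:

```python
from collections import defaultdict

# The "misuse case" from the text: events A, B and C all seen on one device.
RULE = {"A", "B", "C"}

def correlate(events):
    """Given (device, event_id) pairs in arrival order, return the devices
    where every event in RULE has been observed (i.e. the rule fired)."""
    seen = defaultdict(set)
    incidents = []
    for device, event_id in events:
        seen[device].add(event_id)
        if RULE <= seen[device]:      # all rule events observed on this device
            incidents.append(device)
            seen[device].clear()      # reset so the rule can fire again later
    return incidents

stream = [("X", "A"), ("Y", "A"), ("X", "B"), ("X", "C"), ("Y", "B")]
print(correlate(stream))  # device X has seen A, B and C; Y has not
```

The same mechanism detects IoCs: the only difference is that the rule matches post-breach traces (a dropped file, an unexpected outbound connection) rather than pre-breach attack steps.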
Artificial Intelligence and Rules
Common rules that would apply in every environment typically come built-in as standard for most off-the-shelf and open-source SIEMs (such as those identifying deliberately malformed traffic and patterns that can only be attacks).
These rules can also be learned dynamically through artificial intelligence (which, in the context of users, means UEBA or User Behaviour Analytics), automatically building the rule from knowledge of what is and isn't normal for that user. For example: up to now, Mr Jones has always logged in from Southend-on-Sea, has never worked beyond 17:00, and has never forgotten a password. Hence, the analytics flag this activity as suspicious.
However, if the user in question has performed this suspicious activity on timesheet-entry server X (not connected to the production network), and especially if today is the last day of the month for submitting timesheets to get paid, it's probably an employee on holiday who forgot to enter them, so you wouldn't want to be dragged out of bed on this probability. It could wait until the next day for checking, or you could have set the SIEM to disable that user's login immediately, since getting it wrong probably won't be serious (unless it's your boss or spouse!).
But what if a fourth check or rule identified the same situation occurring for different users, or a fifth check identified the source IP address as a known "bad actor", because a threat intelligence feed sending live updates via a watch list showed many other organisations had been attacked from the same source in the last hour?
This example is very simplified: far more sophisticated types of EoI are generated when the SIEM includes in its conditions such things as specific versions of specific operating systems having specific vulnerabilities (which a later version of operating system does not), specific processes stopping unexpectedly, and so on.
In essence, the SIEM is only as good as its rules and threat intelligence. Its effectiveness must be balanced against the complexity and cost of creating the rules and assimilating the data reliably, stringing them all together so that alerts are raised where they should be (true positives), not raised where they shouldn't be (false positives), and not missed altogether (false negatives).
Open-Source SIEMs, SIEM-as-a-Service and SOC-as-a-Service
There are many open-source SIEM software applications or appliances that organisations can obtain without licence costs, but they still have to pay for the underlying on-premises infrastructure and to upskill security operations staff to run and maintain it.
Even without the costs of the platform itself, many small and (increasingly) large companies can't swallow the cost and hassle of running their own SOC (or often a pair, since Disaster Recovery planning is an essential requirement) and of acquiring expensive (and hard-to-find) staff.
Frequently, smaller organisations don't have any operations staff on hand overnight either. Some (by no means all) make use of an on-call rota, relying on the individual security devices firing SMS messages under specific high-priority conditions (usually heavily filtered to avoid being kept up all night with false positives!), which are not fired at all when multiple lower-priority events occur. So the on-call operations staff would be none the wiser.
Consequently, there has been a significant rise of third-party provided, private and public, cloud-hosted virtual SOCs and SIEMs, either multi-tenant (where one platform is shared between multiple customers, with software-engineered data separation at the database level) or single-tenant, the latter dedicating a virtual platform and database to one client, presenting less risk for accidental exposure of data between tenants and greater flexibility, but at an increased cost.
In reality, third-party managed private clouds with single-tenant architecture still have the architectural overheads but reduce the operational overheads of running the SOC by sharing its resources across multiple customers. In the case of highly regulated, risk-averse organisations (such as the defence sector), this may represent the best compromise between security risk and cost savings.
Third-party managed multi-tenant public clouds reduce costs by optimising the sharing of resources for both the platform and the SOC staff, to the extent that even with the management fee overhead, the organisation saves money compared to running its own SOC. The potential added risk of running on shared resources is accepted (as it often already has been, with many customers having migrated their on-premises servers and data to large-scale cloud providers).
With so many different customer organisations (often national or international) covered by the same analysts, larger outsourced SIEM managed service providers suffer a further disadvantage: a highly diluted threat model. Smaller SIEM managed service providers with fewer (regional) clients can be tuned to a threat model more closely resembling your environment.
Requirements of Efficient SIEM Solutions
A SIEM solution's effectiveness in recognising a threat relies on how comprehensive its database of "misuse cases" is and how much external intelligence it gathers from threat feeds and from devices such as routers, firewalls, switches, anti-virus agents, cloud management consoles, etc.
Beyond the more obvious ones (often included off-the-shelf), many of these misuse cases won't have been experienced or anticipated by the IT operations and security staff working within a single enterprise (even a large one). Given the huge and continually increasing number of misuse cases appearing over time, the seemingly remote possibility of being affected by the more obscure and newer threats is not remote at all when considered in aggregate.
To illustrate, suppose a particular misuse case occurs, on average, only once per thousand companies per year. If there are a thousand other currently viable misuse cases affecting an asset (e.g. a server or laptop running several different programs, especially with obsolete or unpatched components), the probability of that asset being compromised within the year rises to roughly two-thirds, approaching certainty over a few years.
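The aggregate-risk arithmetic is easy to check directly (the per-case rate and case count are the illustrative figures from the text, and the calculation assumes the misuse cases occur independently):

```python
# Aggregate risk from many individually rare misuse cases (illustrative).
p_single = 1 / 1000   # chance one misuse case hits a given company in a year
n_cases = 1000        # independent misuse cases affecting the asset

# Probability that at least one of them occurs within the year:
p_compromise = 1 - (1 - p_single) ** n_cases
print(round(p_compromise, 3))   # ~0.632, already close to two-thirds

# Over three years the residual safety margin all but vanishes:
p_three_years = 1 - (1 - p_single) ** (n_cases * 3)
print(round(p_three_years, 3))  # ~0.95
```

Independence is a simplifying assumption, but it shows why "it's a one-in-a-thousand threat" offers little comfort once a thousand such threats apply to the same asset.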
If the SIEM is programmed with misuse cases created dynamically by a large number of highly skilled threat analysts, responding to live analysis of many millions of sensors around the world (a sensor could be a firewall from a vendor such as Cisco or Juniper, or anti-virus/endpoint protection software from McAfee or Microsoft), and these update your own SIEM instance for you, that is a huge advantage. Because the costs of the security analysts and of maintaining the sensor code are spread across millions of customers, access to this collective pool of resources becomes affordable.
Furthermore, if the SIEM functions and architectures themselves can become abstracted into the cloud services (becoming “cloud-native”), then the headache of scaling and connecting agents to services disappears.
Large-scale cloud-hosted service providers deliver a variety of tools that offer security functions and support integration with SIEMs — including the generation of log data based on their platform metrics (such as alerts when hosts are created, destroyed, started and stopped, or when cloud administrators fail security checks). Example tools from Amazon include CloudWatch, CloudTrail, Trusted Advisor and Inspector; and from Microsoft: Security Center, Azure Monitor and Log Analytics.
These are essentially proprietary tools designed to provide cloud workload protection within their own cloud offerings, with limited interoperability with wider third-party toolsets (such as Syslog) compared to the interfaces commonly included as standard by SIEM tool vendors.
Cloud Native SIEM – Microsoft’s New Initiative
Microsoft has an advantage over its cloud service provider rivals (including the larger Amazon AWS): the endpoint intelligence it collects (over 6.5 trillion signals a day, according to its announcement). Since introducing Windows Defender ATP (now known as Defender for Endpoint), it can provide advanced security defences to both enterprise servers and (for those who can afford the Windows Enterprise E5 licence) end-user Windows clients. Defender for Endpoint is also available on a competitively priced standalone licence, and ranks as one of the best Endpoint Detection and Response tools on the market, if not the best.
The "one-stop-shop" approach provides comprehensive integration of the agent with advanced features that would typically require additional software purchases plus security engineering design, integration and deployment. Now, through the "everything as a service" paradigm provided with Security Center, Microsoft facilitates the onboarding of devices for monitoring and logging through features such as auto-deployment of agents to virtual machines created in the Azure portal.
Microsoft has leveraged this advantage further to compete with traditional SIEM providers by introducing Azure Sentinel. Sentinel provides interoperability with other event logging standards, such as the Common Event Format (CEF), and with members of the Microsoft Intelligent Security Association (Check Point, Cisco, ServiceNow and many others), allowing security events and alerts from beyond Microsoft's own cloud to be monitored from a single point. Microsoft delivers and tunes the AI within the service to maximise efficiency, without requiring customers' own "threat hunters" to spend time on this "housekeeping" activity.
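CEF is simply a pipe-delimited header followed by key=value extension fields, which is what makes it a convenient lingua franca between devices and SIEMs. A minimal encoder sketch (vendor, product and field values below are invented for illustration, and a production encoder must also escape `|`, `\` and `=` per the CEF specification):

```python
def cef(vendor, product, version, event_id, name, severity, **ext):
    """Build a Common Event Format line:
    CEF:0|Vendor|Product|Version|EventClassID|Name|Severity|key=value ..."""
    header = f"CEF:0|{vendor}|{product}|{version}|{event_id}|{name}|{severity}"
    extension = " ".join(f"{k}={v}" for k, v in ext.items())
    return f"{header}|{extension}"

line = cef("ExampleVendor", "ExampleFW", "1.0", "100",
           "Port scan detected", 7, src="10.0.0.1", dst="10.0.0.2")
print(line)
```

Any device that can emit lines like this (typically over Syslog) can feed a CEF-aware SIEM such as Sentinel without a bespoke connector.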
Ultimately, artificial intelligence has its limits, and the most powerful instrument in the known universe, the human brain, is needed to control the logic rules: to understand and anticipate how other malicious humans may act or have acted, by recognising motives and reactions on a human level.
AI's fundamental limitation is that it needs to know what is "good" to determine what is "bad", and (as in the world of politics) what each organisation knows or classifies as either varies, so AI will only be as good as the human conscience feeding it.
Microsoft has also just introduced a managed threat hunting service to support its fully featured "everything as a service" Microsoft 365 with Windows Defender Advanced Threat Protection (ATP). Microsoft Threat Experts allows an organisation with a limited-capability SOC to contact Microsoft's experts on demand, rather than recruit its own expensive, skilled threat hunters.
Local Security Providers for Local People (with apologies to Royston Vasey)
A question that commonly arises is “Will the advent of software-defined networking and security, the real-time threat intelligence information provided by millions of sensors and the centralised human expertise provided by a large global corporation give me an affordable and fully effective defence that covers all cyber threats without any need for internal security expertise or consultants?”
My answer is no, at least not totally (admittedly not an unexpected answer from a security consultant!), although the main costs and overheads of provisioning, supporting and integrating the server and security architecture will have been overcome, along with the provision of expert "eyes on glass" 24x7x365 at a fraction of the cost.
- There remains the need for all the third parties in the supply chain to have mitigated the threats to the same extent to avoid being compromised. This requires independent skilled information security auditors to interact with humans to establish the maturity of the cyber hygiene employed in the organisation.
- Global SOC organisations run by the likes of Microsoft are indeed very powerful at recognising technical threats across many platforms. Still, they would not be able to identify deficiencies in individual organisations' governance and security practices, including weak physical security, unmanaged endpoints (PCs/laptops/tablets/mobile phones, etc.) and "shadow IT", which result in services and sensitive information going unmonitored (from USB sticks to rogue cloud providers). Local security services from local providers with an intimate knowledge of the client's security and IT needs will remain an essential benefit.
- Even if an organisation has made all of its servers cloud-native, it needs to establish how it will operate if a disaster does happen through effective disaster recovery and business continuity planning at a local level.
- Although the cost of securing cloud-native platforms is reduced, there will still be a cost, and board pressure will remain to reduce overhead and improve margin while protecting the business: focusing spend on the true areas of risk rather than wasting resources protecting assets that don't warrant it. Proper risk management, capturing the true risk appetite and the internal and external factors introducing risk, is still needed to build an accurate threat model and optimise the investment.
The most important fact to remember is that every organisation has different types of data, infrastructure, processes, staff, vulnerabilities and cyber-attackers interested in targeting it for various reasons. Hence, an individual threat model, considering all these factors, is unique to every organisation.
Buyers of packaged solutions should never assume that an "off-the-shelf" SIEM tool or service will meet their risk model, and should understand that quantity of rules and information does not equal quality: quality comes from a smaller set of more focused rules, chosen and developed with the specific organisation in mind.
While many vendors offer SIEM buyers guides, there’s a bias to the strengths of their own product, so it’s best to take in plenty of external advice in addition to those of the vendors. There are many excellent independent articles on the subject.
While SIEM solutions are evolving every day from a commercial standpoint (e.g. Microsoft's cloud-native, consumption-based price models) and a technical perspective (AI, UEBA, SOAR), there are relatively few articles about the process and people requirements for a successful SIEM acquisition. An organisation needs to consider how well it can maintain its threat model continuously, to avoid making an investment that does not meet the actual business need and rediscovering the GIGO principle (Garbage In, Garbage Out).
(Editor's note: this article refers to several SIEM tool vendors' implementations and articles, but does not recommend a particular vendor over another.) That being said, and just for clarity, we are big fans of the capabilities in Microsoft Sentinel. We now refer to Microsoft as a security company, with a 'few' productivity features.
Author: Graham Clements, Certified Information Systems Security Professional (CISSP), Senior Consultant at Optimising IT