Connecting The Dots With Quality Analytics Data

Get creative about sourcing data, find ways to improve its quality, and then normalize it to mine its value

Security analytics practices are only as good as the data they base their analysis on. If data simply isn't mined, if it is of poor quality or accuracy, if it isn't in a useable format, or if it isn't contextualized against complementary data or risk priorities, then the organization that holds it will be challenged to scratch value out of analytics.
"Your security analytics are only as accurate and useful as the data you put in," says Gidi Cohen, CEO of SkySecurity. "If the data has gaping holes, misses important network zones, or lacks input from security controls, then you will have gaping holes in your view and miss key dependencies between the myriad security tools and processes you use."
So what does a data-centric analysis process look like? It starts with recognizing that you've got access to more relevant data than you think you do. "Most organizations already have everything they need to know in order to know themselves for analytics' sake," says Kelly White, vice president and information security manager of a top 25 U.S. financial institution, who shared best practices on the condition of not naming his employer.
"If you just think about and internalize the amount of information your systems produce -- just by the fact that they're running on your network -- if you think about all of the security information that your users produce as they go about their daily work, it's not something that you have to go out and buy from somebody," White says. "You don't need to subscribe to a report. Really, everything you need to know yourself, you've got already."
Organizations that get creative with their sourcing of data tend to get more value out of analytics than those that simply lump together security system log data in a SIEM or that treat threat intelligence from outside sources as interchangeable with security analytics.
Data sources that could play a big part in forming more complete data sets include network footprint data, platform configuration information, log-in and identity management data, database server logs, and NetFlow data. White's organization is even creative enough to use a Google appliance to index and search unstructured data stores such as SharePoint servers to find relevant information, such as unstructured repositories of PII, and to create a map of relevant information that would otherwise present blind spots when assessing security risks.
Identifying potential internal sources of data is only the first step in ensuring that it can provide value to an analytics program. Organizations also must groom and prepare the data to make sure it is of reliable quality and in a useful format. This means doing a bit of quality assurance -- a sort of "pre-security analytics," as Mike Lloyd, CTO of RedSeal Networks, calls it -- to make sure gaps are filled and sources are refined so their feeds are accurate enough to base operational assumptions on.
"If the data quality is bad, you have to do analysis on that first to decide what's wrong with the data, how bad a problem is it and what you can do about it to make it useable," Lloyd says, explaining that the more data sources you combine to get slightly different views of the same environment, the easier it is to do this. "When you combine data, you can criticize the data feed itself and not rush headlong into security analytics."
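Lloyd's point about combining overlapping views to critique the feeds themselves can be illustrated with a minimal sketch. It assumes each feed can be reduced to the set of hosts it reports on; the feed names, the addresses, and the `coverage_gaps` helper are hypothetical and are not taken from any product mentioned in this article.

```python
def coverage_gaps(sources: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each feed, return the hosts that some other feed saw but this one did not."""
    # The union of every feed is the best available picture of the environment.
    all_hosts = set().union(*sources.values())
    return {name: all_hosts - hosts for name, hosts in sources.items()}


# Hypothetical sample data: three feeds with slightly different views.
feeds = {
    "asset_inventory": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
    "netflow": {"10.0.0.1", "10.0.0.2", "10.0.0.4"},
    "syslog": {"10.0.0.1"},
}

gaps = coverage_gaps(feeds)
for name, missing in sorted(gaps.items()):
    print(f"{name}: missing {sorted(missing)}")
```

In this made-up data, a host that appears in NetFlow but not in the asset inventory points to an inventory gap, and the thin syslog feed suggests that most hosts are not forwarding logs. Either finding is about the data feeds, not yet about threats.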
And this kind of criticism of data feeds shouldn't just happen on the front end of the analytics process -- it should be an ongoing routine because, as Rajesh Goel, CTO of Brainlink International, points out, changes from infrastructure vendors could greatly impact data feeds.
"Vendor updates, patches and changes can change the meaning of the raw data generated and subsequent analytics. Some vendors communicate the changes clearly, others bury them in massive updates, and do NOT take into account that the events being generated have changed," he says. "It's important to confirm/validate that we're still getting the needed data and that the value of threats or events hasn't changed."
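One way to make that validation routine is to compare per-event-type counts from before and after a vendor update. The sketch below is only an illustration under stated assumptions: the event names, the counts, the 50 percent tolerance, and the `feed_drift` helper are all hypothetical.

```python
from collections import Counter


def feed_drift(baseline: Counter, current: Counter, tolerance: float = 0.5) -> dict[str, list[str]]:
    """Compare event-type counts from two time windows and flag changes."""
    # Event types that used to arrive and no longer do.
    vanished = sorted(k for k in baseline if baseline[k] > 0 and current[k] == 0)
    # Event types that were never seen in the baseline window.
    appeared = sorted(k for k in current if current[k] > 0 and baseline[k] == 0)
    # Event types still present whose volume moved by more than the tolerance.
    shifted = sorted(
        k for k in baseline
        if baseline[k] > 0 and current[k] > 0
        and abs(current[k] - baseline[k]) / baseline[k] > tolerance
    )
    return {"vanished": vanished, "appeared": appeared, "shifted": shifted}


# Hypothetical counts for the week before and the week after an update.
before = Counter({"login_failed": 100, "fw_deny": 400, "av_alert": 10})
after = Counter({"login_failed": 95, "fw_deny": 120, "auth_failure": 5})

report = feed_drift(before, after)
print(report)
```

A vanished event type paired with a newly appeared one is a hint that the vendor may have renamed an event, which is the kind of silent change Goel describes. A report like this does not explain the change; it only tells the team where to look.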
Even if the data itself is good, it may not be dispensed by a particular piece of software or hardware in any kind of format useable to a security analytics team.
"The data required to perform accurate and thorough security big data analytics exists, however the challenge is in having to consume vast amounts of dissimilar and proprietary formats," says Jim Butterworth, CSO of HBGary.
This is why normalization may also play an important role in getting data ready for analytics prime time.
"In order for the data to be useful, it must be collected and normalized, so that all of the data is speaking the same language," says Cohen. "Once the data is normalized, your analytical tools can operate on that data in a common way, which reduces the amount of vendor-specific expertise needed."
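As a rough illustration of what "speaking the same language" can mean in practice, the sketch below maps records from two made-up vendor formats onto one common schema. The field names, the two formats, and the target schema are assumptions invented for this example, not real product formats.

```python
from datetime import datetime, timezone


def from_vendor_a(rec: dict) -> dict:
    """Hypothetical vendor A: epoch seconds in 'ts', plus 'src' and 'act'."""
    return {
        "timestamp": datetime.fromtimestamp(rec["ts"], tz=timezone.utc).isoformat(),
        "source_ip": rec["src"],
        "action": rec["act"].lower(),
    }


def from_vendor_b(rec: dict) -> dict:
    """Hypothetical vendor B: UTC time string, 'SourceAddress', 'Disposition'."""
    ts = datetime.strptime(rec["EventTime"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    # Translate the vendor's vocabulary into the common one.
    action = {"BLOCKED": "deny", "PERMITTED": "allow"}[rec["Disposition"]]
    return {
        "timestamp": ts.isoformat(),
        "source_ip": rec["SourceAddress"],
        "action": action,
    }


events = [
    from_vendor_a({"ts": 1700000000, "src": "10.0.0.4", "act": "DENY"}),
    from_vendor_b({"EventTime": "2023-11-14T22:13:20Z",
                   "SourceAddress": "10.0.0.4", "Disposition": "BLOCKED"}),
]

# After normalization, one query works across both vendors.
denied = [e for e in events if e["action"] == "deny"]
print(len(denied), "deny events from", {e["source_ip"] for e in denied})
```

The vendor-specific knowledge lives in the two small adapter functions, so the analysis that follows needs to understand only the common schema.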
However, organizations shouldn't worship at the normalization altar to the point where it holds back nimble analysis.
"I would argue that you don't necessarily have to normalize everything. There's going to be a lot of unstructured data that doesn't necessarily have to be structured," says Michael Roytman, data scientist for Risk I/O, explaining that, for example, an organization may take a piece of external data from a report like the DBIR that says its industry is 12% more likely to experience something like a SQL injection attack and add a fudge factor that increases the weight of those vulnerabilities. "It's about looking at that data and figuring out a quick, easy and dirty way to apply that to your target asset."
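A quick and dirty weighting of the kind Roytman describes might look like the sketch below. Treating "12% more likely" as a 1.12 multiplier on a 0-to-10 score is an assumption made for illustration, as are the category names, the scores, and the `weighted_score` helper; the article does not say how Risk I/O applies such factors.

```python
# Assumed mapping from an external report's finding to a score multiplier.
INDUSTRY_MULTIPLIERS = {"sql_injection": 1.12}


def weighted_score(base_score: float, category: str,
                   multipliers: dict[str, float] = INDUSTRY_MULTIPLIERS,
                   cap: float = 10.0) -> float:
    """Scale a vulnerability score by an industry fudge factor, capped at `cap`."""
    factor = multipliers.get(category, 1.0)  # categories without a finding are unchanged
    return min(cap, round(base_score * factor, 2))


# Hypothetical findings: (identifier, category, base score on a 0-10 scale).
findings = [
    ("web-01", "sql_injection", 7.5),
    ("web-02", "xss", 7.5),
    ("db-01", "sql_injection", 9.8),
]

ranked = sorted(
    ((name, weighted_score(score, cat)) for name, cat, score in findings),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

The external report is never parsed or normalized here. A single number from it is enough to break the tie between two otherwise equal findings.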