Researchers Enlist Machine Learning In Malware Detection

No sandbox required for schooling software to speedily spot malware, researchers will demonstrate at Black Hat USA.

In 100 milliseconds or less, researchers are now able to determine whether a piece of code is malware or not -- and without the need to isolate it in a sandbox for analysis.
Welcome to the age of machine learning as a tool for more efficiently detecting malware, via so-called deep learning techniques. Researchers have built a machine learning module that uses static analysis of a piece of code to quickly spot -- and ultimately, stop -- malware infections. A pair of researchers plans to demonstrate live at Black Hat USA next month how this approach can spot malware from live malware feeds.
Matt Wolff, chief data scientist at Cylance, says his team is applying deep learning -- a more granular subset of machine learning -- to malware detection by training the software on legitimate files and malicious ones, and teaching the application/algorithm which is which. The application then can take files it has never seen before and spot malware, he says.
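The train-on-labeled-files idea Wolff describes can be illustrated with a small Python sketch. It is not Cylance's method: plain logistic regression stands in for deep learning, a byte-frequency histogram stands in for whatever features the team actually extracts, and the sample byte strings are invented for the example. The function names (`byte_histogram`, `train`, `malware_score`) are hypothetical.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list:
    """Static feature: fraction of the file taken up by each byte value (0-255)."""
    counts = Counter(data)
    total = max(len(data), 1)
    return [counts.get(value, 0) / total for value in range(256)]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=200, learning_rate=0.5):
    """Fit logistic regression by stochastic gradient descent.

    labels: 1 = malicious, 0 = legitimate.
    """
    features = [byte_histogram(s) for s in samples]
    weights = [0.0] * 256
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def malware_score(data: bytes, weights, bias) -> float:
    """Score between 0 and 1 for an unseen file; higher means more suspicious."""
    x = byte_histogram(data)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical toy data: made-up byte strings, not real files.
LEGITIMATE = [b"print a friendly greeting", b"open the report and save it"]
MALICIOUS = [b"\x90\x90\x90\xeb\xfe\xcc\xcc", b"\xcc\x90\xeb\xfe\x90\x90\xe8\xff"]

weights, bias = train(LEGITIMATE + MALICIOUS, [0, 0, 1, 1])

# Neither of these byte strings appears in the training data.
unseen_clean = b"save the greeting and print it"
unseen_bad = b"\x90\xeb\xfe\xcc\x90\xe8"
print(malware_score(unseen_clean, weights, bias))
print(malware_score(unseen_bad, weights, bias))
```

In this toy setup the unseen malicious-looking bytes score higher than the unseen clean text, and at no point is either file executed. A production system would need far richer features and vastly more data than four hand-written samples.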
It uses a static analysis approach. "When you run malware to test it, the malware has a window to fight back before you can stop it," Wolff says. "We don't run it [the malware], so the malware doesn't have a chance." And it's fast, he says, faster than sandboxing and analyzing malware.
The concept of applying machine learning and deep learning to malware detection isn't really new, but it's only over the past few years that it has become realistic to deploy, thanks to cloud-based computing options that make big-data computing more affordable. You don't have to build a data center of hundreds of machines anymore; you can rent the necessary processing power for machine learning. "Advances in processors, memory, etc., lend themselves to help make these techniques more powerful," Wolff says. "We don't see anyone [else] applying algorithms to … malware detection yet," he says.
The main premise behind machine learning is matching patterns. "When you look at malware, you may not see any patterns. But when you look at a half of a billion samples, there may be tons of patterns that are relatively easy to discern," he says. "The goal of this model is to find these patterns."
A typical malware characteristic, for example, is the use of functions that capture and log keystrokes.
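One way such a characteristic could be turned into a static feature is sketched below in Python: scan the raw bytes of a file for the names of keystroke-related Windows API functions, without running the file. The API names are real Win32 functions, but the choice of these four, the helper names (`keylogging_api_flags`, `keylogging_api_count`), and the sample bytes are assumptions made for illustration, not details from the researchers' work.

```python
# Win32 API names often associated with keystroke capture. Legitimate programs
# (games, accessibility tools) use them too, so a hit is a signal, not a verdict.
KEYLOGGING_APIS = (
    b"SetWindowsHookExA",
    b"SetWindowsHookExW",
    b"GetAsyncKeyState",
    b"GetKeyboardState",
)


def keylogging_api_flags(data: bytes) -> dict:
    """Return a 0/1 flag per API name found in the raw bytes. Nothing is executed."""
    return {name.decode("ascii"): int(name in data) for name in KEYLOGGING_APIS}


def keylogging_api_count(data: bytes) -> int:
    """Number of distinct keystroke-related API names present in the file."""
    return sum(keylogging_api_flags(data).values())


# Hypothetical file contents, invented for this example.
sample = b"MZ\x00\x00user32.dll\x00GetAsyncKeyState\x00SetWindowsHookExA\x00"
print(keylogging_api_flags(sample))
print(keylogging_api_count(sample))
```

Flags like these would be only one input among many to a learned model. A plain string scan also misses names that are packed, encrypted, or resolved at run time, which is part of why a model trained on many features is more useful than any single check.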
Machine/deep learning is especially helpful in staying atop the increasingly polymorphic nature of malware. "If a malware author two months later comes up with a new [variant], there's a high probability the module you wrote is going to detect that. It has a predictive capability," Wolff says.
With the mountains of malware generated daily, a more automated and intelligent method to learn, adapt, and catch malware is crucial. Cylance has some one to two petabytes of data in its data set for machine learning: "We typically have a few hundred CPUs running for days to process and work through the data, and weeks and months running and training the machines to learn these things," Wolff says. "It takes hundreds of gigabytes of memory, CPUs and big machines," he says.
The machine learning-based method for now is all about detection. It's up to the security analyst or other tools to decide what to do next with the newly discovered malicious code, he says.
A deep learning system could ultimately replace today's malware detection tools, Wolff says. "A machine learning engine is more effective than a signature-based engine," he says.
Wolff and his colleague Andrew Davis, a machine learning scientist at Cylance, will feed their deep-learning module some "fresh meat" malware live during their talk at Black Hat, called "Deep Learning on Disassembly." "We'll … see what it catches," says Wolff.

