So Mogull is back on the bench and I'm glad to see him blogging again.
As I type this, I'm listening to James Blunt's new single "1973" which is unfortunately where Rich's timing seems to be on this topic. 'Salright though. Can't blame him. He's been out scouting the minors for a while, so being late to practice is nothing to be too wound up about.
<If you can't tell, I'm being sarcastic. I only wish that Rich was when he told me that his favorite TexMex place in his hometown is called the "Pink Taco." That's all I'm going to say about that...>
The notion of the HyperJackStack (Hypervisor Jacking & Stacking) is actually a problem set that has already been discussed at length; in the continuum of this debate, those discussions happened quite a while ago.
To put it bluntly, I believe the discussion -- for right or wrong -- stepped over this naughty little topic months ago in lieu of working from the bottom up for the purpose of exposing fundamental architectural deficiencies (or at least their potential) in the core of virtualization technology. The argument parallels dissecting a BLT sandwich...you're trying to get to the center of a symmetric stack, so which end you start at is almost irrelevant.
The good/bad VMM/HV problem has really been relegated to a push-pin on the to-do board of all of the virtualization vendors, and said vendors have framed this particular problem as something to be solved first operationally from the management plane and THEN dealt with from the security perspective.
So Rich, after boning up on Joanna and Thom's research, argues that they're arguing completely the wrong case for the dangers of virtualized rootkits. Instead of worrying about the undetectability of this or that -- pills and poultry be damned -- one should be focused on establishing the relative disposition of *any* VMM/Hypervisor running in/on a host:
Problem is, they’re looking at the wrong problem. I will easily concede that detecting virtualization is always possible, but that’s not the real problem. Long-term virtualization will be normal, not an exception, so detecting if you’re virtualized won’t buy you anything. The bigger problem is detecting a malicious hypervisor, either the main hypervisor or maybe some wacky new malicious hypervisor layered on top of the trusted hypervisor.
To Rich's credit, I think that this is a huge problem and one that deserves to be solved. That does not mean that I think one is the "right" versus "wrong" problem to solve, however. Nor does it mean this hasn't been discussed. I've talked about it many times already. Maybe not as eloquently...
The flexibility of virtualization is exactly what expands the threat surface and its vectors; you can spin up, move or kill a VM across an enterprise with a point and a click. So the first thing to do, before trying to determine whether a VMM/HV is malicious, is to detect its presence and layering in the first place...this is where Thom and Joanna's research really does make sense.
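For illustration only, here's a minimal sketch of the *cooperative* presence check -- CPUID's hypervisor-present bit and the vendor leaf. The file name is mine and this is nobody's shipping detection tool; it also demonstrates exactly why Joanna and Thom's work matters, because a Blue Pill-style hypervisor can simply lie in response to CPUID.

```c
/* hv_presence.c -- minimal sketch of the cooperative VMM-presence check:
 * CPUID leaf 1, ECX bit 31 is the "hypervisor present" flag, and leaf
 * 0x40000000 returns the hypervisor vendor signature. A stealthy or
 * malicious hypervisor can intercept CPUID and lie to both checks.
 * Assumes x86/x86-64 with GCC or Clang.
 */
#include <stdio.h>
#include <string.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    __cpuid(1, eax, ebx, ecx, edx);
    if (!(ecx & (1u << 31))) {
        puts("No hypervisor-present bit set (or the hypervisor is hiding).");
        return 0;
    }

    /* Leaf 0x40000000: vendor signature in EBX/ECX/EDX, e.g. "KVMKVMKVM",
     * "VMwareVMware", "Microsoft Hv", "XenVMMXenVMM". */
    char sig[13] = {0};
    __cpuid(0x40000000, eax, ebx, ecx, edx);
    memcpy(sig,     &ebx, 4);
    memcpy(sig + 4, &ecx, 4);
    memcpy(sig + 8, &edx, 4);
    printf("Hypervisor present, vendor signature: %s\n", sig);
    return 0;
}
```

A benign VMM will happily answer; a hostile one hands back whatever it likes, which is why the timing and side-channel detection work exists in the first place.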
You're approaching this from a different direction, is all.
Thom responded here, and I have to agree with his overall posture: given the threat, the notion of putting hooks into the VMM/HV to allow for "external" detection mechanisms solely for the sake of VMM/HV rootkit detection is unlikely. But we are already witness to the engineered capacity for "plug-ins" such as Blue Lane's that function alongside the HV/VMM, and there's nothing saying one couldn't adapt a similar function for this sort of detection (and/or prevention) as a value-add.
Ultimately, though, I think the response boils down to the definition of the mechanisms used to detect a malicious VMM/HV. I ask you, Rich: please define a "malicious" VMM/HV versus one steeped in goodness.
In practice, this sounds like it will come down to yet another iteration of the signature-driven IPS circle jerk to fingerprint/profile disposition. We'll no doubt see anomaly and behavioral analysis used here, and then we'll have hashing, memory firewalls, etc...it's going to be the Hamster Wheel all over again. For the same reason we have trouble validating security and compliance state with anything more than cursory checks @ 30K feet today, you'll face the same issue with virtualization -- only worse.
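To make the Hamster Wheel point concrete, here's a minimal sketch of the hash-based "disposition" check this devolves into. The file path and the known-good digest are hypothetical placeholders, and a real product would measure the running VMM (ideally from hardware) rather than a file on disk that a rootkit can simply leave untouched.

```c
/* vmm_hash_check.c -- sketch of a naive hash-whitelist disposition check.
 * Hypothetical path and known-good value; requires OpenSSL (-lcrypto). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/sha.h>

/* Hypothetical known-good SHA-256 of the approved VMM build. */
static const char *KNOWN_GOOD =
    "0000000000000000000000000000000000000000000000000000000000000000";

int main(void)
{
    const char *path = "/boot/hypervisor.bin";   /* hypothetical path */
    FILE *f = fopen(path, "rb");
    if (!f) { perror(path); return 1; }

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);
    unsigned char *buf = malloc(len);
    if (!buf || fread(buf, 1, len, f) != (size_t)len) {
        fclose(f); free(buf); return 1;
    }
    fclose(f);

    unsigned char md[SHA256_DIGEST_LENGTH];
    SHA256(buf, len, md);
    free(buf);

    char hex[2 * SHA256_DIGEST_LENGTH + 1];
    for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
        sprintf(hex + 2 * i, "%02x", md[i]);

    printf("measured: %s\n", hex);
    puts(strcmp(hex, KNOWN_GOOD) == 0 ? "disposition: known-good"
                                      : "disposition: unknown/suspect");
    return 0;
}
```

The obvious limitation is the same one every signature approach has: it only recognizes the images already on the list.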
I've got one for you...how about escaping from the VM "jail" entirely? Ed Skoudis over @ IntelGuardians just did an interview with the PaulDotCom boys on this topic...
I believe one must start from the bottom and work up; they're making up for the fact that this stuff wasn't properly thought through in this iteration by exposing the issue now. In fact, look at what Intel just announced today with vPro:
New in this product is Intel Trusted Execution Technology (Intel TXT, formerly codenamed LaGrande). Intel TXT protects data within virtualized computing environments, an important feature as IT managers are considering the adoption of new virtualization-enabled computer uses. Used in conjunction with a new generation of the company's virtualization technology - Intel Virtualization Technology for Directed I/O - Intel TXT ensures that virtual machine monitors are less vulnerable to attacks that cannot be detected by today's conventional software-security solutions. By isolating assigned memory through this hardware-based protection, it keeps data in each virtual partition protected from unauthorized access from software in another partition.
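As a quick aside, the hardware features the announcement leans on are already enumerable from userland. Here's a trivial sketch (mine, not Intel's) that checks whether the silicon even supports VT-x and the SMX extensions underneath TXT -- which says nothing, of course, about whether a measured launch actually took place.

```c
/* cpu_caps.c -- check for the CPU features referenced above: CPUID leaf 1,
 * ECX bit 5 is VMX (Intel VT-x) and ECX bit 6 is SMX (Safer Mode
 * Extensions, the instruction support underneath Intel TXT/LaGrande).
 * Capability only; TXT enablement is a BIOS/chipset/TPM question.
 * Assumes x86/x86-64 with GCC or Clang. */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;
    __cpuid(1, eax, ebx, ecx, edx);
    printf("VMX (VT-x): %s\n", (ecx & (1u << 5)) ? "yes" : "no");
    printf("SMX (TXT):  %s\n", (ecx & (1u << 6)) ? "yes" : "no");
    return 0;
}
```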
So no, Ptacek and Joanna aren't fighting the "wrong" battle, they're just fighting one that garners much more attention, notoriety, and terms like "HyperJackStack" than the one you're singling out. ;)
/Hoff
P.S. Please invest in a better setup for your blog...I can't trackback to you (you need Halo or something) and your comment system requires registration...bah! Those G-Boys have you programmed... ;)
I couldn't agree more. Most of the security components today, including those that run in our little security ecosystem, really don't intercommunicate. There is no shared understanding of telemetry or instrumentation and there's certainly little or no correlation of threats, vulnerabilities, risk or disposition.
The problem is bad inasmuch as even best-of-breed solutions usually require box sprawl and stacking, and they don't necessarily provide for a more secure posture, especially within the context of another of Thomas' interesting posts on defense in depth/mesh...
That's changing, however. Our latest generation of NPMs (Network Processing Modules) allows discrete security ISVs' applications (which run on intelligently load-balanced Application Processor Modules -- Intel blades in the same chassis) to interact with and control the network hardware through defined APIs. This provides the first step toward that common telemetry: while application A doesn't need to know the specifics of application B, they can functionally interact based upon the common output of disposition and/or classification of the flows between them.
Later, perhaps, they'll be able to control each other through the same set of APIs.
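To make the idea of a common disposition output concrete, here's an entirely hypothetical sketch -- none of these names come from our actual APIs -- of one application publishing a flow verdict and another acting on it without knowing anything about the first's internals:

```c
/* flow_disposition.c -- hypothetical sketch of a shared "disposition"
 * record: app A (an IPS) classifies a flow and publishes its verdict,
 * app B (a firewall) consumes it. Names are illustrative only. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

typedef enum { DISP_ALLOW, DISP_QUARANTINE, DISP_DROP } disposition_t;

typedef struct {
    uint32_t      src_ip, dst_ip;    /* IPv4, network byte order */
    uint16_t      src_port, dst_port;
    uint8_t       proto;             /* e.g. 6 = TCP, 17 = UDP   */
    disposition_t verdict;
    uint32_t      confidence;        /* 0-100, app-defined       */
    char          origin_app[32];    /* which blade produced it  */
} flow_disposition_t;

/* Toy in-process "telemetry bus": one subscriber, publish = callback. */
static void (*subscriber)(const flow_disposition_t *) = NULL;
static void fd_subscribe(void (*cb)(const flow_disposition_t *)) { subscriber = cb; }
static void fd_publish(const flow_disposition_t *d) { if (subscriber) subscriber(d); }

/* "Application B": acts on verdicts without knowing anything about A. */
static void firewall_on_disposition(const flow_disposition_t *d)
{
    printf("firewall: %s verdict from %s (confidence %u)\n",
           d->verdict == DISP_DROP ? "enforcing DROP" :
           d->verdict == DISP_QUARANTINE ? "zoning/QUARANTINE" : "ALLOW",
           d->origin_app, d->confidence);
}

int main(void)
{
    fd_subscribe(firewall_on_disposition);

    /* "Application A": an IPS blade classifies a flow as hostile. */
    flow_disposition_t d = {0};
    d.proto = 6; d.dst_port = 445;
    d.verdict = DISP_DROP; d.confidence = 90;
    strncpy(d.origin_app, "ips-blade-1", sizeof d.origin_app - 1);
    fd_publish(&d);
    return 0;
}
```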
So no, I don't think we're going to solve the interoperability issue completely anytime soon -- we won't go from 0 to 100% overnight -- but I think that consolidating these functions into smaller footprints that allow for intelligent traffic classification and disposition is a good first step.
I don't expect Thomas to agree with, or even resonate with, my statements below, but I found his explanation of the problem space to be dead on. Here's my explanation of an incremental step towards solving some of the bigger classes of problems in that space, which I believe hinges first and foremost on consolidating security functionality.
The three options for reducing this footprint are as follows:
Option 1 -- Embed the security functionality in the network infrastructure itself:

Pros: Supposedly fewer boxes, better communication between components, and good coverage, given that the security stuff is in the infrastructure. One vendor from which you get your infrastructure and your protection. Correlation across the network "fabric" will ultimately allow for near-time zoning and quarantine. A single management pane across the Enterprise for availability and security. Did I mention the platform is already there?

Cons: You rely on a single vendor's version of the truth, and you get closer to a monoculture wherein the safeguards protecting the network put at risk the very assets they seek to protect, because there is no separation of "church and state." Also, the expertise and coverage, as well as the agility for product development based upon evolving threats, are hampered by the many moving parts in this machine. Utility vs. Security? Utility wins. Good enough vs. best of breed? Probably somewhere in between.
Option 2 -- A single-vendor, multi-function security appliance:

Pros: Reduced footprint, consolidated functionality, and a single management pane across the multiple security functions within the box. Usually excels in one specific area, like AV, and can add "good enough" functionality as the needs arise. Software moves up and down the scalability stack depending upon the performance needed.

Cons: You again rely on a single vendor's version of the truth. These boxes tend to want to replace the switching infrastructure. Many of these platforms utilize ASICs to accelerate certain functions, with the bulk of the functionality residing in pure software with limited application- or network-level intelligence. You pay the price in terms of performance and scale, given architectures that do not easily allow for the addition of new classes of solutions to thwart new threats. They are not really routers/switches.
Option 3 -- An open, blade-based consolidated security platform:

Pros: The customer defines best of breed and can rapidly add new security functionality at a speed that keeps pace with the threats the customer needs to mitigate. Utilizing a scalable, high-performance switching architecture combined with all the benefits of an open, blade-based security application/appliance delivery mechanism gives the best of all worlds: self-healing, highly resilient, high-performance and highly available, utilizing a hardened Linux OS across load-balanced, virtualized security applications running on optimized hardware.

Cons: Currently based upon proprietary (even though Intel reference design) hardware for the application processing, while also utilizing a proprietary network switching fabric and load balancing. Can only offer software as quickly as it can be adapted and tested on the platforms. No ASICs means small-packet performance @ 64 bytes zero-loss isn't as high as ASIC-based packet-forwarding engines. No single pane of management.
I think that option #3 is a damned good start towards solving the consolidation issues whilst balancing the need to overlay synergistically with the network infrastructure. You're not locked into a single vendor's version of the truth, and although the hardware may be "proprietary," the operating system and choice of software are not. You can choose from COTS, Open Source, or write your own, all on a scalable platform that is just as much a collapsed switching/routing platform as it is a consolidated blade server.
I think it has the best chance of evolving to solve more classes of problems than the other two, at a rate and level of cost-effectiveness balanced against the higher efficacy you get from best of breed.
This, of course, depends upon how high the level of integration is between the apps -- or at least their dispositions. We're working very, very hard on that.
At any rate, Thomas ended with:
I like NAT. I think this is Paul Francis. The IETF has been hijacked by aliens, actually, and I'm getting a new tattoo: