Friday, June 4, 2010

Meta-theory practicality

As promised, I wanted to get back to the subject of theory versus practicality. Good thing I have tenure so I can muse freely, and even if I didn't, this is probably an obscure enough blog that it won't make a bit of difference.

One of the things that has bugged me about the security research world is the gap between theory and practicality. On one hand, we definitely want to get to secure systems, and security more often than not tends to be a binary thing, i.e. once the data is out you are pretty much hosed. On the other hand, we have had years upon years upon years of theoretically good security work, and yet how much traction has it really had? What is the ratio of reference datasets or papers with reproducible experiments versus the theoretical or never-to-be-duplicated-again variety?

Maybe it is cyclical or something, but it seems like we are living in a sort of real-time Groundhog Day when it comes to security. Every once in a while, someone gets up and says that we are doing security research wrong or that we need to investigate topic X. That creates a whole new set of conference papers (maybe some journals too), yet amazingly enough the research funding tends to flow into one of the existing groups, who tend to do one of the following:

* Get off my lawn: The grey-hair folks jumping up and down noting that we would have solved all of these issues if we had just paid attention to MULTICS and PSOS, aka the "our programmers are teh suck and need to get some r3@l skills" argument.
* Framework, ?? deploy ??, security: Beautiful papers, wonderful security properties, yet the papers never seem to answer how you bootstrap their "great solution" into the real world or how it functions amid the noise of the real world. Any competing solutions must of course benchmark their results against that virtual, hypothetical world. Sort of like assuming multicast and QoS can be deployed with a flick of the wand next to the dragons and unicorns, aka a SIGCOMM paper. I kid, I kid, sort of :)
* l33t exploits: Extremely practical papers that find the latest exploit of the day; always fun to read, but they don't necessarily change the current state of security, more like +1 developer emo rage.
* Data mining is cool: The anomaly / data mining-based papers, aka this time my anomaly detection scheme will work, I swear, just give me a good baseline. Look how well I can detect my own synthetically injected data :) (a toy sketch of that pitfall follows this list).
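
To make that last gripe concrete, here is a toy sketch, entirely my own and not drawn from any real paper or dataset: when the evaluator both injects the anomalies and chooses how far they sit from the baseline, even a trivial 3-sigma z-score "detector" posts near-perfect numbers. The Gaussian baseline, the injected cluster, and the threshold are all made up for illustration.

```python
# Toy illustration (assumed/made-up data) of the "detect my own injected data" pitfall.
import numpy as np

rng = np.random.default_rng(0)

# "Baseline" feature (say, bytes per flow), modeled as plain Gaussian noise.
baseline = rng.normal(loc=100.0, scale=10.0, size=10_000)

# Synthetic anomalies injected by the experimenter, conveniently placed
# many standard deviations away from everything in the baseline.
injected = rng.normal(loc=200.0, scale=5.0, size=100)

data = np.concatenate([baseline, injected])
labels = np.concatenate([np.zeros(baseline.size), np.ones(injected.size)])

# Trivial "detector": flag anything more than 3 sigma from the baseline mean.
mu, sigma = baseline.mean(), baseline.std()
flagged = np.abs(data - mu) > 3 * sigma

print(f"detection rate:      {flagged[labels == 1].mean():.2%}")  # ~100% by construction
print(f"false positive rate: {flagged[labels == 0].mean():.2%}")  # ~0.3%, just the Gaussian tails
```

Swap in messy real-world traffic for that tidy Gaussian baseline and the same detector's numbers typically collapse, which is roughly the complaint here.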

Folks leave said meetings (which I have been to plenty of) and vow that security is broken, repeating the first step, and then the funding flows into the above categories*. The circle of life continues unabated....

/end silliness

All glibness aside, I've seen this theme time and time again as a reviewer, attendee, and submitter. I know, and I am sure many others would vouch for this, that one is far better off submitting a paper that falls into one of those categories (maybe besides the get-off-my-lawn category), even if it has limited near-term impact (besides future citations), than one that is useful but way less sexy. Cue cries of lack of novelty, lack of innovation, or something to that effect. Or perhaps I'm just jaded, having sent too many papers that did not fall into those categories to conferences.

Roy Maxion gave a very cool talk at one of the Cyber Security Roundtables on reproducible, well-designed experimentation. Sometimes I wonder if the computer itself has lent itself to a sort of ADD-ish behavior where we always need something new and sexy versus simply doing solid experimentation on what we already have. Is the publish-or-perish game perhaps to blame for that? Dunno. Is the synthetic nature of man-made systems, which lets us just manufacture our own data and hypotheses to test, to blame? Dunno either. Is difficulty in sharing data causing issues? Maybe, but I don't think that is nearly as prevalent an issue. The requirement for transformative research from the grant agencies? Eh, maybe. Is the life cycle of a project too short? Yes, I definitely think so.

My point is that I think that, on average, we do a good job of conjuring up synthetic problems and solving them without any thought about real-world implications, which would often require time-consuming, appropriate metrics for measuring efficacy. A few properties are proven, or a few ROC curves look better, or a bunch of code is written and new metrics are created. By and large, though, good, solid experimental work tends to be left by the wayside as it is either too development-centric rather than research-centric, or far too time-consuming to do right. Security is messy and always will be, given the heavy human component involved. How does a solution work if one actually deploys it (even in a limited sense, on real rather than synthetic data)? Can you actually manage or use it? How far can one push the envelope of an existing system, at what point does it break, and can one do clever stuff with what is already out there or soon to be out there?

Until we start to acknowledge the fact that absolute security is not going to happen and start to value well-tested, robust, incremental improvements, I think we are doomed to keep repeating the same cycle. Sure, there will be big kerfuffles about this funding or that funding, but at the end of the day we'll still have the same discussion about how this organization's security or that organization's security was an epic fail and still is. I think practical security, and more importantly manageable security that is really, truly deployable, is how we get from epic fail to just a standard, normal-variety fail. Even that in and of itself would be a huge, huge improvement.


* There are topics which are interesting and worthy of future research. It is just that when you see the same folks submitting the same stuff and then getting funded despite the call being for an entirely different topic, it starts to drain one's soul.