So I’m in the bar in Seville and I meet my mysterious log guru, and I relate to him my various conversations about the problems of managing the sheer volume of information from all the audit logs. Out of the blue he says “use cubes”. Now you have to understand that I haven’t got a clue what a cube is, but for some reason it is like a magic phrase and I instantly know what he is talking about. Let me explain:
Large enterprises have too much security data to manage, both in terms of storage and in terms of filtering the information. So in order to rationalise it, they use the node aggregation technique that Raffael Marty suggested in my beer brawl podcast with him. In other words, if I have 100,000 servers, I can break them down into 1000 groups of 100 servers and treat each group as a single node. In addition, I can ignore certain types of events depending on my corporate policy. Together, these two factors help to reduce the deluge of information.
However, the problem is that if I now have an anomaly on one server in an aggregated group of a hundred, I may not detect it, because rolling everything up into 1000 groups has a normalising effect that smooths the outlier away.
This is where I think cubing comes in. In addition to having my big picture of 1000 aggregated nodes, I can take each individual aggregated group and run my automated log sensor across the information in that group, checking it for anomalies before I use that group’s collated data.
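The idea above can be sketched in a few lines of code. This is only an illustration, not anyone’s actual product: the group names, event counts, and the z-score threshold are all assumptions I’ve made up, and a real log sensor would of course look at far richer data than a per-server event count. The point is simply that each group is scanned for outliers *before* its data is collapsed into a single aggregated node.

```python
# A minimal sketch of "cubing": servers are aggregated into groups,
# but each group is scanned for per-server anomalies first, so that
# outliers are surfaced before aggregation smooths them away.
# All names and the z-score threshold of 3.0 are illustrative assumptions.

from statistics import mean, stdev

def find_anomalies(event_counts, threshold=3.0):
    """Flag servers whose event count deviates from the group mean
    by more than `threshold` standard deviations."""
    mu = mean(event_counts.values())
    sigma = stdev(event_counts.values())
    if sigma == 0:
        return []  # every server identical: nothing stands out
    return [server for server, count in event_counts.items()
            if abs(count - mu) / sigma > threshold]

def collate_groups(groups):
    """Collapse each group of servers into one aggregated node,
    running the anomaly check before the roll-up."""
    collated = {}
    for name, event_counts in groups.items():
        collated[name] = {
            "total_events": sum(event_counts.values()),  # the aggregated view
            "servers": len(event_counts),
            "anomalies": find_anomalies(event_counts),   # caught pre-aggregation
        }
    return collated

# 99 quiet servers plus one noisy outlier in a hypothetical "web" group.
group = {f"web-{i:03d}": 50 for i in range(99)}
group["web-099"] = 5000
result = collate_groups({"web": group})
print(result["web"]["anomalies"])  # → ['web-099']
```

Notice that the aggregated total (9950 events across 100 servers) looks unremarkable on its own; only the per-group scan reveals that one server is responsible for half of it.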
You can listen to Ben's informal interview of Raffael Marty here.