Logs are essentially records that an IT system keeps of its own activities. Indeed, one can think of a log file as a diary. To a large degree, the content and structure of logs are under the control of the system or application developer and, while enterprises and vendors alike try to impose some discipline on that structure and content, logs, at least on the surface, show little rhyme or reason from one system to another. Even within the boundaries of a single system, uniformity of any kind may be hard to discern.
And yet, when one looks at a log, it is possible to pick out the names of message sources, time stamps, and indicators of actions being taken - often written down in a language of sorts that bears a passing resemblance to English. Logs, in other words, suggest they might have started their existence as sentences whose words somehow got jumbled up along the way.
Depending on humans
The presence of human-readable words in logs lies at the heart of log management platforms like those offered by Splunk and Elastic. Although the search algorithms that drive these platforms are based on token and template matching and care nothing about what the human-readable words mean, the usefulness of these algorithms depends entirely on the fact that humans search for words that make sense to them. Indeed, it was precisely the decision to exploit the fact that logs contained word-like substrings that separated technologies like Splunkbase and the ELK stack from the older SIM-oriented log management platforms, whose algorithms searched in an unsupervised or semi-supervised manner for repeating alphanumeric string patterns with no care at all for human readability.
It is not simply a fortunate accident that logs look like jumbled sentences and contain human readable substrings. Let’s examine the purpose of logs. Systems and applications generate logs so that records of critical events can be maintained. While these records may be used for many different reasons, the important point - for our purposes - is that they are meant to record events. Now, an event is nothing more and nothing less than a significant happening and, as such, must have three important elements: 1) a time at which it happens; 2) a change of some sort which occurs; and 3) an object or collection of objects which either effect the change or suffer it. Now, if we look at any declarative sentence we will find something that directly associates with each of these three elements. Time is associated with tense; change is associated with verbs; and the objects are associated with nouns. One could even extend this further by associating elements indicating a source and destination with prepositions like ‘from’ and ‘to’ (or case markers if you are thinking of an inflected language like Sanskrit or Arabic).
In fact, a strong case can be made that whenever one tries to represent events - no matter what the context or rationale - one will end up with something analogous to the three or four representational components just mentioned. Hence, even though logs did not start out as jumbled-up sentences, it was inevitable that they would turn out that way, since logs and declarative sentences are both meant to signify events.
The linguistics of logs
Here, however, we begin to see a problem. Logs are a rich source of information about what is happening in digital environments. However, because the ‘nouns’, ‘verbs’, ‘prepositions’, etc. are all jumbled up, their positions changing from log to log and/or system to system, that information can only be fully exploited if a human being is involved. Only humans can recognise the events that the logs are representing. That, of course, is why log management databases have become so popular. They make it possible for humans to interpret logs by letting users search on strings that mean nothing to the search algorithms involved but mean something to the users. (Note, by contrast, that queries made to relational or object-oriented databases rely in critical ways on the semantic structure encoded in the data model.)
So, humans are critical to the success of log management systems. But here is where the problem begins to bite. IT systems generate millions of logs, and for a user to have any chance of finding the logs relevant to his or her concerns, he or she must have a good idea up front of what to look for. Most system issues, however, are unanticipated. Events take place that have not been planned for and, given modern modular, dynamic, distributed and ephemeral environments, in many cases not even imagined. In the old trichotomy of known knowns, known unknowns, and unknown unknowns, most of the events represented by logs fall into the last category.
In other words, although humans are critical to extracting information from log management systems, the number of logs, their varying structure, and the limits of flesh and blood users mean that most of the information potentially available remains trapped in unread logs.
How can this problem be resolved? At a high level, the obvious answer is AI. Clearly, what is needed is the sequential execution of data selection, pattern discovery, and inferencing algorithms that will, in an automated way, discover the information content of any large log file, communicate it to the human user, and, finally, support any actions to be taken in response to what has been communicated.
But what specific form should these algorithms take? A clue may be taken from an old idea in Linguistics first proposed by Noam Chomsky in 1968. When one looks at sentences that speakers consider to be grammatical, the structure they have might not be fundamental but, in fact, the outcome of operations and systematic changes made to an underlying ‘deep structure.’ For example, the sentence ‘I am being praised by my manager’ is best regarded as the result of modifying an underlying sentence, ‘my manager is praising me.’ The active form is the ‘deep structure’ while the passive form is the ‘surface structure.’ After this theory was proposed, many (although, interestingly, not Chomsky) argued that interpretation pertained only to the deep structure. If you want to know what a sentence means, you first need to go from surface structure to deep structure and then associate the components of that deep structure with meanings in various ways. This idea became known as ‘semantic syntax.’
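To make the surface-to-deep mapping concrete, here is a deliberately toy sketch. The regular expression handles only the exact passive pattern of the example above, and the (agent, verb, patient) triple - which keeps the surface verb form - is a schematic stand-in of my own for a deep structure; real transformational rules are far richer than this.

```python
import re

# Toy illustration of recovering a 'deep structure' (agent, verb, patient)
# from a passive 'surface structure'. The pattern is an assumption for this
# sketch and only covers sentences shaped like the example in the text.
PASSIVE = re.compile(
    r"^(?P<patient>\w[\w ]*?) (?:am|is|are) being (?P<verb>\w+) by (?P<agent>.+)$"
)

def deep_structure(sentence: str):
    m = PASSIVE.match(sentence.rstrip("."))
    if not m:
        return None
    return (m.group("agent"), m.group("verb"), m.group("patient"))

print(deep_structure("I am being praised by my manager"))
# → ('my manager', 'praised', 'I')
```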
AI to the rescue
To deal with the jumble of natural-language-like substrings in logs, we at Moogsoft are looking at an approach that may be described as a combination of deep neural network processing and semantic syntax. The first step involves running through a large training set of logs in order to isolate those substrings that correspond to nouns, tenses, verbs, and prepositional phrases. While the mark-up might initially prove labour intensive, the conventions implicitly used by system and application developers are, in fact, quite limited. System names vary very little, as do the ways in which time stamps are written down. Where there does seem to be some significant variation is in the way in which actions are represented.
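The isolation step can be pictured with a minimal tagger. The patterns and the verb list below are assumptions made for this sketch - they illustrate the kind of conventions being exploited, not Moogsoft's actual training pipeline.

```python
import re

# Illustrative tagger: isolate substrings that play the role of time
# stamps, system names (nouns), and actions (verbs). The patterns and
# the ACTIONS list are placeholders, not a real rule set.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")
HOSTNAME = re.compile(r"\b[a-z][\w-]*\d+\b")   # e.g. 'web01'
ACTIONS = {"started", "stopped", "failed", "restarted", "connected"}

def tag(log_line: str) -> dict:
    tokens = log_line.split()
    return {
        "time": [t for t in tokens if TIMESTAMP.match(t)],
        "nouns": [t for t in tokens if HOSTNAME.match(t)],
        "verbs": [t for t in tokens if t.lower().strip(".") in ACTIONS],
    }

print(tag("2024-01-01T12:00:00 web01 nginx restarted"))
```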
After the initial isolation is accomplished, a neural network style algorithm can be let loose on the results in order to stabilise the features of each of the element types. So far, so good, but we are still dealing with a jumble, and an IT system would not have an easy time figuring out what events are being indicated by the logs. Here is where the semantic syntax and deep structure thinking comes in. The jumbled-up strings get treated as surface structure, and the real trick is to go backwards from the surface structure to the deep structure. Once a jumbled-up string has been converted to its underlying deep structure equivalent, the event being represented can be determined.
Determining the rules for converting surface structure to deep structure will themselves require a dose of automated learning but given the highly constrained form that a deep structure can take (essentially Noun, Noun, Verb, Tense) that effort should prove reasonably inexpensive. With logs now converted into a deep structure form that clearly indicates the events they are meant to express, the typical work carried out by an AIOps platform can begin. Critically, it is precisely because the logs have been converted into a format that, in virtue of its structure alone, shows the kinds of meaning the various components have, that algorithms can read and work with them without significant human intervention.
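The constrained deep-structure format described above (essentially Noun, Noun, Verb, Tense) can be sketched as a fixed-shape record. The field names and the ordering rule below are illustrative placeholders standing in for the learned conversion rules.

```python
from collections import namedtuple

# Sketch of the constrained deep-structure format: Noun, Noun, Verb, Tense.
# Field names are assumptions made for this example.
DeepStructure = namedtuple("DeepStructure", ["subject", "object", "verb", "tense"])

def normalise(tagged: dict) -> DeepStructure:
    # Assumes a tagger has already isolated the element types; the
    # ordering and tense heuristics here are placeholders for rules
    # that would, in practice, be learned automatically.
    nouns = tagged.get("nouns", []) + [None, None]
    verb = (tagged.get("verbs") or [""])[0]
    return DeepStructure(
        subject=nouns[0],
        object=nouns[1],
        verb=verb or None,
        tense="past" if verb.endswith("ed") else "present",
    )

ds = normalise({"nouns": ["web01", "nginx"], "verbs": ["restarted"]})
print(ds)
```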
Many logs indicate the same event as other logs, and many logs, in fact, indicate events that never occurred at all. Hence, the file of deep structure format logs must be cleansed of noise and duplicates. This is an important point. The conversion to deep structure format - the unveiling of the event represented by the log - needs to take place before the event itself is examined. This is a layer of AI algorithm application that takes place before data set selection.
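The deduplication side of the cleansing step is straightforward once logs are in deep-structure form, because equality can be tested on the structure itself rather than on raw strings. The tuples below are illustrative values, not real log data.

```python
# Once logs are in deep-structure form (subject, object, verb, tense),
# duplicate reports of the same event collapse to a single entry.
events = [
    ("web01", "nginx", "restarted", "past"),
    ("web01", "nginx", "restarted", "past"),   # duplicate report of the same event
    ("db02", "postgres", "stopped", "past"),
]

seen = set()
unique = []
for e in events:
    if e not in seen:
        seen.add(e)
        unique.append(e)   # keep first occurrence, preserve order

print(len(unique))  # duplicates removed
```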
Once the log file is cleansed, once each of its contents represents an actual, unique event, distinct from all other events represented by file contents, then the events must be correlated, causally analysed, the results communicated, and remedial actions performed.
In summary, then, logs are an important source of information regarding events taking place within a digital environment. Their volume, however, and the ambiguities of their structure mean that something like human intelligence is required to take advantage of that source. That same volume and ambiguity, however, make it just about impossible for humans to work with log files. AI must be brought to bear to realise the promise of log files, but it is a very special kind of AI - one that starts by converting logs into their declarative-sentence-like deep structure.
Will Cappelli, CTO EMEA and Global VP of Product Strategy, Moogsoft