GCHQ releases Big Data analysis tool on GitHub

The UK's spy agency, GCHQ, has contributed to the open-source community by publishing one of its tools on GitHub.

The media noticed yesterday that a tool named Gaffer, essentially a large-scale graph database, had appeared on GCHQ's GitHub account.

According to the introductory description, Gaffer is a framework that makes it easy to store large-scale graphs in which the nodes and edges have statistics such as counts, histograms and sketches.

It is “optimised for retrieving data on nodes of interest”.
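
To make that description concrete, here is a minimal Python sketch of the idea: a graph whose edges carry summarised statistics (a count and a histogram) and which is indexed by node so that data on a node of interest can be fetched directly. It is an illustration only, not Gaffer's code or API; the names `EdgeStats`, `SummaryGraph`, `add_observation` and `edges_for` are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class EdgeStats:
    """Summarised properties for one edge: a running count and a histogram."""
    count: int = 0
    histogram: Counter = field(default_factory=Counter)

    def merge(self, bucket: str, n: int = 1) -> None:
        # Fold a new observation into the summary instead of storing it raw.
        self.count += n
        self.histogram[bucket] += n


class SummaryGraph:
    """Toy in-memory graph (hypothetical, not Gaffer's API), indexed by node."""

    def __init__(self) -> None:
        self._edges = defaultdict(dict)  # node -> {neighbour: EdgeStats}

    def add_observation(self, src: str, dst: str, bucket: str) -> None:
        stats = self._edges[src].get(dst)
        if stats is None:
            # Register the same summary under both endpoints so that
            # either node can be queried without scanning the whole graph.
            stats = EdgeStats()
            self._edges[src][dst] = stats
            self._edges[dst][src] = stats
        stats.merge(bucket)

    def edges_for(self, node: str) -> dict:
        # Return only the edges attached to the node of interest.
        return dict(self._edges.get(node, {}))


graph = SummaryGraph()
graph.add_observation("A", "B", "Mon")
graph.add_observation("A", "B", "Mon")
graph.add_observation("A", "B", "Tue")
graph.add_observation("A", "C", "Tue")
```

In this sketch, three observations between A and B collapse into a single edge with a count of 3 and a per-day histogram, which is the sense in which properties are "summarised" rather than stored one record at a time.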

Essentially, this is a Big Data analytics tool and, as ZDNet suggests, it might have been used by GCHQ to “discern patterns and plot the actions of groups of interest – such as criminal gangs or terrorists – statistically”.

Gaffer uses Accumulo for storing data, although other options are available. Accumulo, an Apache project, is an open-source distributed key/value store based on Google's Bigtable design.
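
A Bigtable-style store keeps its keys in sorted order, which is what makes lookups on a single node cheap: every entry for that node sits in one contiguous range. The Python sketch below shows the principle with a sorted list and a prefix scan. The class `SortedKV` and the `node|edge|neighbour` key layout are assumptions made for illustration; they are not Accumulo's API or Gaffer's actual schema.

```python
import bisect


class SortedKV:
    """Toy sorted key/value store (hypothetical), kept in one process."""

    def __init__(self) -> None:
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def put(self, key: str, value: int) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix: str) -> list:
        # Jump straight to the first key >= prefix, then read forward
        # until the keys stop matching; nothing else is touched.
        start = bisect.bisect_left(self._keys, prefix)
        results = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            results.append((key, self._values[key]))
        return results


store = SortedKV()
store.put("bob|edge|alice", 1)
store.put("alice|edge|bob", 3)
store.put("alicia|edge|bob", 2)
store.put("alice|edge|carol", 1)

alice_rows = store.scan_prefix("alice|")
```

Because the node identifier leads the key, the scan for `alice|` reads only that node's two entries and stops as soon as it reaches a key belonging to a different node.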

Here’s what Gaffer has to offer:

  • Allows the creation of graphs with summarised properties within Accumulo with a minimal amount of coding.
  • Allows flexibility of statistics that describe the entities and edges.
  • Allows easy addition of new types of nodes and edges.
  • Allows quick retrieval of data on nodes of interest.
  • Deals with data of different security levels - all data has a visibility, and this is used to restrict who can see data based on their authorizations.
  • Supports automatic age-off of data.
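
The last two items on that list, visibility-based access and age-off, can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Gaffer or Accumulo implements either feature: visibility is reduced to a set of labels the reader must hold in full (Accumulo's own visibility labels are richer boolean expressions), and age-off is applied as a filter at read time rather than by deleting data. The names `Entry` and `visible_entries` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    node: str
    value: int
    visibility: frozenset  # labels a reader must ALL hold (simplifying assumption)
    timestamp: float       # seconds since the epoch


def visible_entries(entries, authorizations, now, max_age_seconds):
    """Return entries the reader is cleared to see and that have not aged off."""
    auths = frozenset(authorizations)
    return [
        e for e in entries
        if e.visibility <= auths                  # reader holds every required label
        and now - e.timestamp <= max_age_seconds  # entry is still young enough
    ]


entries = [
    Entry("a", 1, frozenset(), 100.0),            # unrestricted, recent
    Entry("a", 2, frozenset({"secret"}), 100.0),  # restricted, recent
    Entry("a", 3, frozenset(), 0.0),              # unrestricted, too old
]

public_view = visible_entries(entries, {"public"}, now=150.0, max_age_seconds=60)
cleared_view = visible_entries(entries, {"secret"}, now=150.0, max_age_seconds=60)
```

A reader without the `secret` label sees only the first entry, a reader with it sees the first two, and the third is hidden from both because it is older than the 60-second limit chosen for this example.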

"GCHQ hopes that Gaffer will be useful to others in the community, as well as helping its own technical staff as they continue to develop the software in the future," a GCHQ representative told TechWeekEurope.

"As a government department and technology organisation, GCHQ software developers and technologists aim to contribute to open source software projects."