HIVE-XP

This space contains links to the HIVE-XP (HIVE large dataset explorer) project. HIVE-XP performs a number of analysis and translation tasks on large cybersecurity datasets, including several public datasets.



HIVE-XP Features

The philosophy behind HIVE-XP is to provide a framework that normalises analysis across a range of heterogeneous datasets, supplies standard high-level metadata for use in learning models, offers built-in methods to manipulate features for use in different learning techniques, and serves as a common platform for detecting anomalies and malicious activity within large datasets.

  • Advanced Flow Containers – takes flow summaries from existing dataset metadata, or creates new normalised and rich flow structures from raw trace data (such as PCAP files), using a standard container format - the Common Flow Format (CFF).
  • Analytics – analyses flow structures and produces a number of useful metadata summaries and features, which can be compared across different datasets and used as input features in learning models (reducing the dimensionality of raw data processing).
  • Advanced Traffic Features – produces advanced traffic features characterising flows, which can be highly useful in identifying malware or suspicious behaviour. Features include: flow symmetry, payload entropy, burst mode, as well as granular statistics at various levels per flow.
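
As an illustration of two of the traffic features listed above, the following is a minimal sketch in Go of how payload entropy and flow symmetry could be computed. It is not taken from the HIVE-XP source: the function names PayloadEntropy and FlowSymmetry are hypothetical, entropy is assumed to be Shannon entropy over byte values (in bits per byte), and symmetry is assumed to be a normalised difference between forward and reverse byte counts. HIVE-XP may define these features differently.

```go
package main

import (
	"fmt"
	"math"
)

// PayloadEntropy returns the Shannon entropy of a payload in bits per byte.
// The result lies between 0 (all bytes identical) and 8 (uniform byte values).
// Assumption: HIVE-XP's "payload entropy" is byte-level Shannon entropy.
func PayloadEntropy(payload []byte) float64 {
	if len(payload) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range payload {
		counts[b]++
	}
	n := float64(len(payload))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// FlowSymmetry returns (fwd - rev) / (fwd + rev), a value in [-1, 1].
// 0 means both directions carried the same number of bytes; +1 or -1 means
// the flow was entirely one-directional. An empty flow is reported as 0.
// Assumption: this normalised difference is one plausible definition only.
func FlowSymmetry(fwdBytes, revBytes uint64) float64 {
	total := float64(fwdBytes) + float64(revBytes)
	if total == 0 {
		return 0
	}
	return (float64(fwdBytes) - float64(revBytes)) / total
}

func main() {
	text := []byte("GET /index.html HTTP/1.1")
	fmt.Printf("entropy of sample text payload: %.3f bits/byte\n", PayloadEntropy(text))
	fmt.Printf("symmetry of 1200 B out, 48000 B in: %.3f\n", FlowSymmetry(1200, 48000))
}
```

High entropy values (close to 8 bits per byte) typically indicate compressed or encrypted payloads, while strongly asymmetric flows can point to bulk transfers in one direction; both are only indicators and need to be interpreted alongside other features.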

HIVE-XP Implementation

HIVE-XP is written predominantly in the Go language. Parts of the source code are planned to be published online on GitHub once documentation is completed and third-party interfaces have been defined. This is part of ongoing PhD research into various cybersecurity applications of deep learning and ensemble methods.