A Novel Feature Extraction Strategy for Hardware Trojan Detection

https://doi.org/10.1109/ISCAS45731.2020.9180479

Published in:
The IEEE International Symposium on Circuits and Systems (ISCAS): Proceedings

Document Version:
Peer reviewed version

Publisher rights
Copyright 2020 IEEE. This work is made available online in accordance with the publisher’s policies. Please refer to any applicable terms of use of the publisher.


Shichao Yu*, Chongyan Gu*, Weiqiang Liu†, Maire O’Neill*

*Centre for Secure Information Technologies, ECIT, Queen’s University Belfast, Belfast, UK
†College of EIE, Nanjing University of Aeronautics and Astronautics, Nanjing, China
E-mails: {suy08, c.gu}@qub.ac.uk, liuweiqiang@nuaa.edu.cn, m.oneill@ecit.qub.ac.uk

Abstract—Hardware Trojans (HTs) are acknowledged as a significant emerging security concern in the IC industry, resulting from the globalization of the semiconductor supply chain. Recently, taking advantage of the exponential growth in computing power, machine learning (ML) approaches such as neural networks (NNs) have been considered for HT detection. However, the circuit structure and components of an IC design differ from the data types that ML models commonly handle, so efficiently extracting HT features from complex IC designs for use with common ML-based detection approaches is challenging. In this paper, a novel HT feature extraction strategy based on gate-level circuit netlists is proposed to tackle this challenge. The HT features are extracted from the circuit topology rather than by statistical analysis as in previous research. A commonly utilized support vector machine (SVM)-based HT detection model is trained and tested with the extracted features on HT benchmarks from both an open-source library and an HT generation platform to demonstrate the feasibility and efficiency of the proposed strategy. The detection results show high recall on nearly all tested benchmarks, reaching up to 97.7% recall on sequential Trojans and 84.8% on combinational ones.

Index Terms—Hardware Trojans, feature extraction, netlist, circuit topology, directed graph, structural features, machine learning

I. INTRODUCTION

Due to the increasing complexity of integrated circuits (ICs), it has become more difficult for a single vendor to complete the entire manufacturing of chips, and the design and fabrication of ICs are now distributed worldwide. However, the use of overseas foundries, test facilities and third-party vendors increases the risk of IC products being attacked by Hardware Trojans (HTs).

An HT is a malicious tampering of an IC at any untrusted phase of the production chain. It can be a small malicious circuit inserted into a chip or a malicious modification of the chip. Fig. 1 shows the typical structure of a functional HT [1]. The trigger circuitry monitors a set of normal signals and activates the malicious payload when the trigger conditions are satisfied. As most HTs have rare trigger conditions and the trigger circuitry is small, common IC test and verification workflows can fail to detect HTs [2], [3].

Meanwhile, taking advantage of the exponential growth in computing power, ML methods have shown huge potential for HT detection in recent research [1], [4], [5], [6].

A gate-level netlist is a generic IC design file that describes the network topology and layout of a circuit and is provided by third-party vendors as the data exchange format for intellectual property (IP).

However, the data structure and basic components of a gate-level netlist are significantly different from the data types commonly used in ML models. It is challenging to efficiently extract HT features from netlists and to use common ML algorithms for HT detection.

Compared with sequential data such as audio traces and text, netlists not only have a dependency on the order of the data (instances), but also have multiple connections between instances. Compared with 2-dimensional image data, the positional information of netlist elements differs from the absolute position of image pixels: the position of an instance is relative and can only be retrieved from the net signals between instances. Data parsing and feature extraction are therefore much more complex and difficult for gate-level netlists than for the data types commonly used in ML.

Statistical analysis has been applied in existing research on ML-based HT detection to select netlist features [6], [7]. These features performed well to detect some HTs. However, to recognize a different type of Trojan, the circuit features had to be re-selected to gain an equivalent performance or prevent detection failure.

Furthermore, statistical analysis of netlist features does not involve a direct analysis of the circuit topology (Directed Graph), and as such, circuit structural information is lost in the process.

To address these problems, we propose a novel HT feature extraction strategy that extracts structural features from the network topology of gate-level netlists. In the experiment, a commonly utilized SVM-based HT detection model is employed for data training and testing, using the extracted relative positional structure features.

To the best of our knowledge, this is the first time that the relative positional relationship of a netlist graph is investigated as features for ML-based HT detection.

The main contributions of this paper are as follows:

1) A novel feature extraction strategy based on circuit topology rather than statistical analysis is proposed.
2) A pin-level feature searching algorithm for netlist blocks is proposed for the feature extraction.
3) The extracted pin-level feature traces are applied to a commonly utilized SVM model for HT detection.

The remainder of this paper is organized as follows: Section II discusses the related Trojan feature extraction work. The proposed feature extraction strategy and its implementation are presented in Section III. Section IV presents the experimental results and Section V provides some concluding remarks.

II. PREVIOUS RESEARCH ON GATE-LEVEL TROJAN FEATURE EXTRACTION

For gate-level HT detection at the design stage, [8] first adopted the controllability and observability features of nets in gate-level netlists for HT classification using K-means clustering. [9] adopted controllability as the feature and reduced the false positive rate by signal switching probability simulation.

Circuit-level statistics (i.e. the number of primitives, AND and OR gates) are utilized along with controllability and observability in [10] to form a four-dimensional vector to train an SVM-based Trojan classifier. In [5], 5 HT classification features were identified after statistical analysis of features in normal and Trojan circuits. [11] extracted features similar to those of [5] from the trigger nets of HTs and improved the HT recognition accuracy with an ensemble-learning-based method.

In [6], 11 Trojan-net features were selected from 51 netlist features for HT detection using the random forest algorithm. Neural networks were first proposed for use in HT detection on gate-level netlists in [7], using features from their earlier research work [5], [6].

Recently, [12] proposed two-level AONN (AND, OR, NAND, NOR) gates as a feature for combinational HTs and extracted the number of register-to-register and input-to-register paths as the feature for Deep State Machine (DSM) Trojans.

III. THE PROPOSED FEATURE EXTRACTION STRATEGY

A. The Topology of a Gate-level Netlist

In order to extract structural features, gate-level netlists are first parsed. Fig. 2(a) is the schematic of a gate-level netlist slice from the trigger part of a sequential FSM (finite-state machine)-based HT [1].

As shown in Fig. 2(a), each instance in the netlist is a basic logic cell (cells are defined in the cell library). The structural information of the netlist is composed of two parts: the cell type of each instance and the connection information between instances.

Using instances as the nodes and net names as the edges, the topology of a netlist can be drawn as a pseudo-graph in which both loops and multiple edges are permitted. In order to simplify the proposed searching algorithm, we separate all the pins belonging to each instance as independent nodes to generate a directed graph. Fig. 2(b) shows the pin-level graph layout of path 1 in the netlist in Fig. 2(a) with independent pin nodes. The generated pin nodes are named as cell_pin for both input and output pins.

The network topology of the netlist has been built as a directed graph. The cell node name refers to the cell type of each instance while the connection between pin nodes refers to the connection information between instances.
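
The pin-level graph construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `NETLIST` dictionary, the `OUTPUT_PINS` set and the function `build_pin_graph` are hypothetical names, pin directions are assumed to be known from the cell library, and the internal edge from each input pin to the output pin of the same instance is an assumption made so that paths can pass through a cell.

```python
from collections import defaultdict

# Hypothetical parsed netlist: instance -> (cell type, {pin name: net name}).
NETLIST = {
    "U1": ("NOR2", {"IN1": "n1", "IN2": "n2", "OUT": "g1"}),
    "U2": ("NAND2", {"IN1": "n3", "IN2": "n4", "OUT": "g10"}),
    "U3": ("NOR2", {"IN1": "g1", "IN2": "g10", "OUT": "n5"}),
}
OUTPUT_PINS = {"OUT"}  # assumed: pin directions are known from the cell library


def build_pin_graph(netlist):
    """Map each pin node (instance, cell_pin) to a list of (successor node, net)."""
    drivers = {}                 # net -> output pin node driving it
    loads = defaultdict(list)    # net -> input pin nodes reading it
    edges = defaultdict(list)
    for inst, (cell, pins) in netlist.items():
        inputs = [p for p in pins if p not in OUTPUT_PINS]
        for pin, net in pins.items():
            node = (inst, f"{cell}_{pin}")  # pin nodes are named cell_pin
            if pin in OUTPUT_PINS:
                drivers[net] = node
                # Assumed internal edges: each input pin feeds the output pin.
                for in_pin in inputs:
                    edges[(inst, f"{cell}_{in_pin}")].append((node, None))
            else:
                loads[net].append(node)
    # Net edges: from the driving output pin to every input pin on the same net.
    for net, src in drivers.items():
        for dst in loads.get(net, []):
            edges[src].append((dst, net))
    return dict(edges)


GRAPH = build_pin_graph(NETLIST)
```

In this toy netlist, the output pin of U1 is connected to input pin `NOR2_IN1` of U3 through net `g1`, which is the kind of (cell, output pin, input pin, cell) tuple that a pin-level structural feature records.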

B. Pin-level Structural Features

Based on the topology of the netlist generated in subsection III-A, pin-level structural features can be extracted with the breadth-first search (BFS) algorithm for graphs. As shown in TABLE I, the pin-level structure of Path 1 in Fig. 2(a) is extracted from the pin-level netlist graph in Fig. 2(b).

TABLE I
THE PIN-LEVEL STRUCTURAL FEATURES OF PATH 1 FROM FIG. 2(B)

<table>
<thead>
<tr>
<th>Name</th>
<th>Cell→Pin Out→Pin In→Cell</th>
<th>Corresponding Net</th>
</tr>
</thead>
<tbody>
<tr>
<td>U1</td>
<td>(NOR2, NOR2_OUT, NOR2_IN1, NOR2)</td>
<td>g1</td>
</tr>
<tr>
<td>U2</td>
<td>(NAND2, NAND2_OUT, NOR2_IN2, NOR2)</td>
<td>g10</td>
</tr>
</tbody>
</table>

TABLE II
PIN-LEVEL STRUCTURAL FEATURE TRACE (U194, Logic level = 4)

<table>
<thead>
<tr>
<th>Index</th>
<th>0</th>
<th>1</th>
<th>2</th>
<th>3</th>
</tr>
</thead>
<tbody>
<tr>
<td>L₃ PCP</td>
<td>L₄ PCP</td>
<td>L₅ PCP</td>
<td>L₆ PCP</td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td></td>
</tr>
<tr>
<td>DIN1,s13,Q</td>
<td>DIN1,nand2,s3,Q</td>
<td>DIN1,nor2,s3,Q</td>
<td>DIN1,nand2,s3,Q</td>
<td></td>
</tr>
<tr>
<td>U194</td>
<td>clk</td>
<td>U194</td>
<td>U194</td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>2</td>
<td>U194</td>
<td>U194</td>
<td></td>
</tr>
<tr>
<td>g1</td>
<td>g2</td>
<td>g3</td>
<td>g4</td>
<td></td>
</tr>
</tbody>
</table>

Fig. 3. Program architecture of pin-level structural feature extraction for HT detection.

Fig. 4. Pin-level netlist block of s1423_T426 U194.

Features extracted from different netlists should be comparable. If two designs are compiled with different cell libraries (A, B), a mapping table from cell library A to cell library B should be created to support HT detection on netlists from different cell libraries.
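
A mapping table between two cell libraries could look like the sketch below. The cell names, the table `LIB_A_TO_B` and the helpers `map_cell` and `map_pcp` are purely hypothetical, and the sketch assumes that only cell names differ between the libraries; if pin names also differ, a second table for pins would be needed.

```python
# Hypothetical mapping from library-A cell names to library-B cell names.
LIB_A_TO_B = {
    "nand2_a": "NAND2",
    "nor2_a": "NOR2",
    "inv_a": "INV",
}


def map_cell(cell, table=LIB_A_TO_B):
    """Translate a library-A cell name to library B.

    Unknown cells raise KeyError so that an incomplete table is noticed
    instead of silently producing incomparable features.
    """
    return table[cell]


def map_pcp(pcp, table=LIB_A_TO_B):
    """Translate the cell of an (input pin, cell, output pin) tuple."""
    in_pin, cell, out_pin = pcp
    return (in_pin, map_cell(cell, table), out_pin)


MAPPED = map_pcp(("DIN1", "nand2_a", "Q"))
```

Raising an error on unmapped cells is a design choice of this sketch; falling back to a generic "unknown" cell type would be an alternative.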

C. The Implementation of the Proposed Strategy

The program architecture for pin-level structural feature extraction is presented in Fig. 3. First, the Pin-Level Graph Generator loads the netlist and builds a pin-level directed graph for the netlist according to subsection III-A.

Then, netlist blocks around each instance are extracted from the graph using the BFS algorithm. The search depth is defined by the value of the parameter logic level. For example, if the current instance is $U_n$ and logic level is defined as 4, the module will search and extract all the pin connections and instances around $U_n$ within 4 logic levels. A sample of an extracted netlist block from s1423_T426, instance U194 [13], is shown in Fig. 4. For each line in Fig. 4, the pin-level path is formed as:

$$\text{Instance, Cell, Pin, Net}_{\text{PCP}}, \text{Instance, Cell, Pin, Logic}_{\text{level}}$$
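
The depth-limited search around a centre instance can be sketched as follows. This is a simplified, instance-level illustration under stated assumptions rather than the authors' pin-level implementation: `FANOUT`, `invert`, `bfs_levels` and `extract_block` are hypothetical names, `depth` counts hops from the centre instance (how this relates exactly to the logic level parameter is not assumed here), and negative levels denote the fan-in side.

```python
from collections import deque

# Hypothetical instance-level fan-out map: driver -> list of (net, load instance).
FANOUT = {
    "U1": [("g1", "U3")],
    "U2": [("g10", "U3")],
    "U3": [("n5", "U4")],
    "U4": [("n6", "U5")],
    "U5": [],
}


def invert(fanout):
    """Build the fan-in map (load -> [(net, driver)]) from the fan-out map."""
    fanin = {inst: [] for inst in fanout}
    for src, outs in fanout.items():
        for net, dst in outs:
            fanin.setdefault(dst, []).append((net, src))
    return fanin


def bfs_levels(adj, centre, depth, sign):
    """Breadth-first search from `centre`, at most `depth` hops, with signed levels."""
    levels = {}
    seen = {centre}
    queue = deque([(centre, 0)])
    while queue:
        inst, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand beyond the requested depth
        for _net, nxt in adj.get(inst, []):
            if nxt not in seen:
                seen.add(nxt)
                levels[nxt] = sign * (hops + 1)
                queue.append((nxt, hops + 1))
    return levels


def extract_block(fanout, centre, depth):
    """Instances within `depth` hops of `centre` (negative: fan-in, positive: fan-out)."""
    block = {centre: 0}
    block.update(bfs_levels(invert(fanout), centre, depth, -1))
    # If an instance is reachable on both sides (a feedback loop), the
    # fan-out level overwrites the fan-in level in this simplified sketch.
    block.update(bfs_levels(fanout, centre, depth, +1))
    return block


BLOCK = extract_block(FANOUT, "U3", 1)
```

A pin-level version would run the same search on pin nodes and record the cell, pin and net of every edge it traverses.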

Each netlist block is filtered by the pin-level feature filter. After the library mapping for netlists from different libraries and data labelling based on the known HT instances, each pin-level structural feature trace for $U_n$ can be extracted.

TABLE II shows one feature trace belonging to U194. The basic element of a pin-level feature trace is $L_n$ PCP, where $L_n$ means that the current PCP is n logic levels away from the centre instance ($L_0$ is the centre instance), and PCP stands for (pin, cell, pin): the names of the input pin, the cell and the output pin of the same instance on the feature trace. For example, $L_{-3}$ PCP (DIN, i1s3, Q) in TABLE II means that the input pin DIN and the output pin Q of cell i1s3 are on the pin-level feature trace, 3 logic levels away from the input of the centre instance U194.

When all netlist blocks in a netlist have been processed, each instance has a corresponding list of pin-level structural feature traces. As these extracted traces contain all the relative structural information surrounding the centre instance, they are used as the relative structural features of each instance, for both Trojan instances and normal instances, and can be applied to an ML-based classification model for HT detection.
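
As a small worked example of this notation, the sketch below converts between a position in a feature trace and its logic level. It assumes that a trace for logic level m stores the PCPs from $L_{-(m-1)}$ to $L_{m-1}$ at consecutive indices starting from 0, so that $L_{-3}$ sits at index 0 when m = 4; the function names are hypothetical.

```python
def level_of_index(index, logic_level):
    """Map trace index 0..2(m-1) to logic level -(m-1)..(m-1), with m = logic_level."""
    last = 2 * (logic_level - 1)
    if not 0 <= index <= last:
        raise IndexError(f"index {index} outside 0..{last}")
    return index - (logic_level - 1)


def index_of_level(level, logic_level):
    """Inverse mapping: logic level n of L_n PCP to its index in the trace."""
    if abs(level) > logic_level - 1:
        raise IndexError(f"level {level} outside the trace")
    return level + (logic_level - 1)


# For logic level 4, the trace holds 7 PCPs and the centre instance is in the middle.
LEVELS = [level_of_index(i, 4) for i in range(7)]
```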
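
Before such traces can be passed to a numerical classifier, each PCP has to be turned into a number. The text does not specify the encoding, so the sketch below shows one simple possibility as an assumption: every distinct PCP tuple gets an integer identifier, and 0 is reserved for missing positions or PCPs not seen when the vocabulary was built. All names and the sample traces are hypothetical.

```python
def build_vocab(traces):
    """Assign an integer id (starting at 1) to every distinct PCP tuple."""
    vocab = {}
    for trace in traces:
        for pcp in trace:
            if pcp is not None:
                vocab.setdefault(pcp, len(vocab) + 1)
    return vocab


def encode(trace, vocab):
    """Encode a trace as integers; 0 marks a missing or unseen PCP."""
    return [vocab.get(pcp, 0) for pcp in trace]


# Hypothetical traces, shortened to three positions for readability.
TRACES = [
    [("DIN1", "NAND2", "Q"), ("DIN", "INV", "Q"), None],
    [("DIN", "INV", "Q"), ("DIN2", "NOR2", "Q"), None],
]
VOCAB = build_vocab(TRACES)
ENCODED = [encode(t, VOCAB) for t in TRACES]
```

Integer identifiers impose an arbitrary ordering on cell types; a one-hot encoding would avoid this at the cost of a larger feature vector.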

IV. EXPERIMENTAL RESULTS

A. Experimental Setup

The proposed HT feature extraction strategy was implemented in Python. To prove the feasibility and efficiency of the strategy while controlling the training and hyperparameter tuning cost, a general C-Support Vector Machine [14]-based HT detection model is designed for data training and testing, using the features extracted from HT benchmarks from both Trust-hub [13] (groups 1, 2, 3) and the HT generation platform in [1] (group 4).

To keep the relative positional relationship in the extracted feature traces, we utilize data array from index 0 to 6 in TABLE II ($\text{Logic}_{\text{level}} = m$, PCP $L_{-(m-1)}$ to $L_{m-1}$) as a 7-dimensional feature and index 7 as the label for
For group 1, 2, 4 and 4 for group 3). The classification results on the sequential Trojan groups are much better than those on the combinational ones. This can be explained by the combinational Trojans having fewer structural features than the sequential Trojans in this experiment. In particular, the combinational Trojans in group 4, generated from the HT generation platform [1], only have AND gates in their logic, so they contain fewer structural features than all other groups. This results in the lowest classification precision and causes classification failure.

In conclusion, the extracted structural features can be successfully applied in ML models for HT detection. High recall results can be obtained in detecting Trojan traces in most HT benchmark netlists. Also, with more structural features from a Trojan circuit, the adopted SVM model can obtain a better classification result.
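
For reference, the recall and precision figures discussed in this section follow the standard definitions, recall = TP / (TP + FN) and precision = TP / (TP + FP), with Trojan traces as the positive class. The sketch below computes both from label lists; the labels are made-up illustration data, not results from the experiments.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels: 1 = Trojan trace, 0 = normal trace.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 0]
Y_PRED = [1, 1, 0, 1, 1, 1, 0, 0]
PRECISION, RECALL = precision_recall(Y_TRUE, Y_PRED)
```

In this toy case two of the three Trojan traces are found (recall of about 0.67), but three normal traces are also flagged (precision of 0.4), which illustrates how high recall can coexist with low precision.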

V. CONCLUSION

In this paper, we propose a new feature extraction strategy for hardware Trojan detection. We first build a pin-level directed graph for a netlist and then extract the relative pin-level structural features from the network topology of the netlist rather than using statistical analysis as in previous research. An SVM-based HT detection model is designed to test the extracted features. The experimental results show that the extracted features can be successfully applied in ML models for HT detection and that high recall can be achieved for most Trojan benchmark netlists.

In the future, we will improve the proposed feature extraction strategy and ML model to obtain better detection results for Trojans with fewer structural features.

ACKNOWLEDGMENT

This research is supported by RISE (ukrise.org - EP/R011494/1) and funded by the UK Engineering and Physical Sciences Research Council (EPSRC) and the National Cyber Security Centre (NCSC).

REFERENCES


