Machine Learning-Based AES Key Recovery via Side-Channel Analysis on the ASCAD Dataset

Investigating the application of machine learning (ML) and deep learning (DL) models to exploit electromagnetic (EM) side-channel leakage for AES key recovery. This project uses the public ASCAD dataset and focuses on the Key Rank metric for evaluation.

A paper detailing this work is currently in progress.


**Status: Paper in progress, preliminary results available.**


The Vulnerability

Cryptographic algorithms like Advanced Encryption Standard (AES) are mathematically robust. However, their physical implementations on devices can leak information through side channels, such as power consumption or electromagnetic (EM) emissions. This leakage can potentially compromise theoretically secure algorithms. Electromagnetic analysis (EMA) is a potent form of side-channel analysis (SCA) where attackers measure EM fields radiating from a device during cryptographic operations. These emissions often contain subtle variations correlated with the intermediate data being processed, which can be linked to the secret key.

Recent advances show that ML and DL models are powerful tools for automatically learning these complex correlations, often outperforming traditional statistical SCA techniques. This project focuses on leveraging ML/DL for AES key recovery using the ASCAD dataset.


Key Aspects & Contributions

Comparative Model Analysis

A comparative performance analysis of standard classifiers (Random Forest, Support Vector Machine) and a tailored Convolutional Neural Network (CNN) for AES key byte recovery on the ASCAD fixed-key and variable-key datasets.

Feature Importance & Reduction

Exploration of Random Forest-based feature importance for dimensionality reduction and its impact on model efficiency and effectiveness in the SCA context.

Key Rank Metric Demonstration

A clear demonstration of the necessity and superiority of the domain-specific Key Rank metric over standard accuracy for evaluating ML-based SCA success, especially in low Signal-to-Noise Ratio scenarios.

Successful Key Recovery

Confirmation of successful key recovery using both CNN and feature-selected RF models, highlighting the practical feasibility of ML-based side-channel attacks despite low per-trace classification accuracy.

Technical Methodology

Target: AES S-Box Operation

The attack targets the output of the first-round AES S-box operation. The S-box input for a byte $i$ is $\text{Plaintext}[i] \oplus \text{Key}[i]$. The output is:

$\text{SboxOutput}[i] = \text{Sbox}(\text{Plaintext}[i] \oplus \text{Key}[i])$

Because the plaintext is known, predicting this 256-class output allows the key byte to be deduced. We target the third key byte (index 2).
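For illustration, the label computation can be sketched in a few lines of NumPy; `plaintexts`, `keys`, and `sbox` are placeholder names for the per-trace metadata arrays and the standard 256-entry AES S-box table, not the project's exact code.

```python
import numpy as np

def sbox_labels(plaintexts, keys, sbox, byte_idx=2):
    """256-class labels for profiling: Sbox(plaintext[byte] XOR key[byte]).

    plaintexts, keys : uint8 arrays of shape (n_traces, 16), per-trace metadata
    sbox             : length-256 AES S-box lookup table (np.uint8)
    byte_idx         : targeted key-byte position (index 2 here)
    """
    return sbox[plaintexts[:, byte_idx] ^ keys[:, byte_idx]]
```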

Fig 1: Basic Steps of an AES Encryption Round

Attack Mechanics in Detail

The side-channel leakage arises primarily in the first masked multiplier of the S-box operation, where the XOR gates undergo different numbers of signal transitions depending on the data being processed. This produces distinctive power-consumption and EM patterns that correlate directly with the processed data values.

Our attack adopts a value-based leakage model, assuming the EM trace contains information correlated with the specific value (0-255) of the S-box output. Since this output depends on both the known plaintext and unknown key, predicting it allows us to deduce the key byte through a 256-class classification problem.

Key Rank Metric: Technical Details

The superiority of Key Rank over standard accuracy stems from the nature of side-channel attacks. With low signal-to-noise ratio, perfect classification of every trace is unrealistic. Instead, our goal is to distinguish the correct key from 255 incorrect hypotheses by aggregating subtle evidence across numerous traces.

For each key hypothesis $k_{\text{guess}}$ (0-255), we calculate:

$\text{Score}(k_{\text{guess}}) = \sum_{i=1}^{N} \log\left(P(\text{label} = Z_{\text{hyp},i} \mid \text{trace}_i) + \varepsilon\right)$

where $Z_{\text{hyp},i} = \text{Sbox}(\text{plaintext}_i \oplus k_{\text{guess}})$ for each trace $i$, and $\varepsilon$ is a small constant to prevent $\log(0)$. Working in log-probabilities turns the product of per-trace probabilities into a sum, which keeps the accumulation numerically stable and efficient.
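The score accumulation and ranking can be sketched as follows; this is an illustrative NumPy implementation that assumes `probs` holds the model's per-trace class probabilities and `sbox` the AES S-box table (all names are hypothetical).

```python
import numpy as np

def key_rank(probs, plaintexts, sbox, true_key, eps=1e-36):
    """Rank of the true key byte after accumulating evidence over N attack traces.

    probs      : (n_traces, 256) array of model probabilities P(label | trace)
    plaintexts : (n_traces,) plaintext byte at the targeted position
    sbox       : length-256 AES S-box lookup table (np.uint8)
    true_key   : known key byte, used only to report its rank
    """
    n_traces = probs.shape[0]
    scores = np.zeros(256)
    for k_guess in range(256):
        z_hyp = sbox[plaintexts ^ k_guess]                       # hypothetical S-box outputs
        scores[k_guess] = np.log(probs[np.arange(n_traces), z_hyp] + eps).sum()
    ranking = np.argsort(scores)[::-1]                           # best-scoring guess first
    return int(np.where(ranking == true_key)[0][0])              # 0 means key recovered
```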

Feature Importance Analysis

Our feature selection approach using Random Forest's Gini importance showed that EM leakage is distributed across the trace but concentrated in specific time regions. By selecting only the top 100 features, we reduced the number of attack traces required by approximately 50% for ASCADf and 40% for ASCADv.

This dimensionality reduction mitigates overfitting and focuses on the most informative leakage points, significantly improving model efficiency while maintaining attack effectiveness.
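A sketch of this selection step with scikit-learn might look as follows, using the Random Forest hyperparameters listed under Machine Learning Models; `X_profiling`, `y_profiling`, and `X_attack` are placeholder names for the standardized traces and their S-box labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative sketch: rank time samples by Gini importance and keep the top 100.
rf = RandomForestClassifier(n_estimators=100, max_depth=20,
                            min_samples_leaf=10, n_jobs=-1)
rf.fit(X_profiling, y_profiling)

top_idx = np.argsort(rf.feature_importances_)[::-1][:100]   # most informative sample points
X_profiling_red = X_profiling[:, top_idx]
X_attack_red = X_attack[:, top_idx]                          # same selection for attack traces
```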

Dataset & Preprocessing

Utilizes the public ASCAD 'fixed-key' (ASCADf: 50k training, 10k attack traces, 700 samples/trace) and 'variable-key' (ASCADv: 200k training, 100k attack traces, 1400 samples/trace) datasets. Raw EM traces are standardized (zero mean, unit variance) based on the profiling set.
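As a sketch, loading and standardization could be done as below; the file name and HDF5 group names follow the public ASCAD database layout, but treat them as assumptions rather than the project's exact pipeline.

```python
import numpy as np
import h5py
from sklearn.preprocessing import StandardScaler

# Illustrative loading/standardization sketch for one ASCAD HDF5 file.
with h5py.File("ASCAD.h5", "r") as f:
    X_profiling = np.array(f["Profiling_traces/traces"], dtype=np.float32)
    X_attack = np.array(f["Attack_traces/traces"], dtype=np.float32)

# Zero mean, unit variance per sample point, fitted on the profiling set only.
scaler = StandardScaler().fit(X_profiling)
X_profiling = scaler.transform(X_profiling)
X_attack = scaler.transform(X_attack)   # reuse profiling statistics on the attack traces
```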

Machine Learning Models

Random Forest (RF): Ensemble of decision trees ($n\_estimators=100$, $max\_depth=20$, $min\_samples\_leaf=10$). Used for classification and for Gini importance-based feature selection (top 100 features).
Support Vector Machine (SVM): Trained on the reduced feature set with an RBF kernel.
Convolutional Neural Network (CNN): Custom PyTorch CNN with four convolutional blocks (Conv1D, BatchNorm, ReLU, AvgPool1D) followed by dense layers, inspired by existing SCA literature.
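A minimal scikit-learn sketch of the SVM stage, reusing the reduced-feature arrays from the feature-selection sketch above (all names are placeholders):

```python
from sklearn.svm import SVC

# Illustrative sketch: RBF-kernel SVM on the 100 selected features.
# probability=True enables per-class probability estimates, which the
# Key Rank computation needs; predict_proba columns follow svm.classes_.
svm = SVC(kernel="rbf", probability=True)
svm.fit(X_profiling_red, y_profiling)
attack_probs = svm.predict_proba(X_attack_red)   # per-trace class probabilities
```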

Fig 2: CNN Architecture for SCA
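A minimal PyTorch sketch of such an architecture is shown below; the channel counts, kernel size, and dense-layer widths are illustrative assumptions rather than the exact trained network.

```python
import torch.nn as nn

class SCACNN(nn.Module):
    """Illustrative 1D CNN for SCA: four conv blocks followed by a dense head."""

    def __init__(self, trace_len=700, n_classes=256):
        super().__init__()
        blocks, in_ch = [], 1
        for out_ch in (64, 128, 256, 512):            # four convolutional blocks
            blocks += [nn.Conv1d(in_ch, out_ch, kernel_size=11, padding=5),
                       nn.BatchNorm1d(out_ch),
                       nn.ReLU(),
                       nn.AvgPool1d(2)]
            in_ch = out_ch
        self.features = nn.Sequential(*blocks)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * (trace_len // 16), 4096),  # trace length halved 4 times
            nn.ReLU(),
            nn.Linear(4096, n_classes),
        )

    def forward(self, x):                  # x: (batch, 1, trace_len)
        return self.head(self.features(x))
```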

Evaluation: Key Rank

The primary metric is Key Rank. For N attack traces, it involves:

1. Obtaining the model's probability distribution over the 256 S-box output values for each trace.
2. For each key-byte hypothesis (0-255), calculating the hypothetical S-box outputs and summing the corresponding log-probabilities from the model.
3. Ranking the key hypotheses by their total score.

Rank 0 for the true key means successful recovery. The metric aggregates evidence across traces and remains effective even with low per-trace accuracy.
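Tying this back to the `key_rank` sketch above, a hypothetical evaluation call on the attack set could look like this (array names are placeholders):

```python
# attack_probs: (n_attack_traces, 256) class probabilities from the trained model
# attack_plaintexts, attack_keys: ASCAD attack-set metadata (uint8, shape (n, 16))
rank = key_rank(attack_probs,
                attack_plaintexts[:, 2],            # targeted byte (index 2)
                sbox,
                true_key=int(attack_keys[0, 2]))
print(f"Key rank after {attack_probs.shape[0]} traces: {rank}")  # 0 => key recovered
```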

Fig 3: Example Key Rank Chart

Experimental Results & Outcomes

ASCADf = ASCAD fixed-key dataset; ASCADv = ASCAD variable-key dataset.
Full Features = all 700 samples per trace for ASCADf (1400 for ASCADv); Reduced Features = top 100 features selected by Gini importance.

| Model | Dataset | Feature Type | Attack Traces for Rank 0* |
| --- | --- | --- | --- |
| CNN | ASCADf | Full Features | ~65 traces |
| CNN | ASCADv | Full Features | -- |
| Random Forest | ASCADf | Reduced Features | ~200 traces |
| Random Forest | ASCADf | Full Features | ~492 traces |
| Random Forest | ASCADv | Full Features | ~750 traces |
| Random Forest | ASCADv | Reduced Features | ~470 traces |
| SVM | ASCADf | Reduced Features | ~320 traces |
| SVM | ASCADv | Reduced Features | ~320 traces |

Note: The Key Rank metric is crucial for evaluating side-channel attacks, demonstrating that models with low per-trace classification accuracy can still recover the key when evidence is aggregated across multiple traces.

*The results are preliminary and may differ from those in the final paper.

Key Contributions

The key contributions of this work are summarized under Key Aspects & Contributions above.

References & Further Reading

This work builds upon existing research in side-channel analysis and machine learning; key references are included in the paper draft.

For comprehensive results and methodology, please request a draft of the research paper.