DFST
This is the repository for the DFST paper, "Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification".
See https://arxiv.org/abs/2012.11212.
Dependencies
Python 3.6, tensorflow=1.13.1, keras=2.2.4, numpy, pickle (standard library), PIL (provided by the Pillow package).
How to use this repository
Note that we currently provide code only for VGG on CIFAR-10, and the attack target label is fixed to 0.
Prepare dataset
Create some folders: ./dataset, ./model, ./weights.
Download the CIFAR-10 dataset and re-define it as two dictionaries, cifar_train and cifar_test, in the following format:
- cifar_train['x_train'].shape = (50000, 32, 32, 3)
- cifar_test['x_test'].shape = (10000, 32, 32, 3)
- cifar_train['y_train'].shape = (50000, 1)
- cifar_test['y_test'].shape = (10000, 1)
Save the two dictionaries as the files cifar_train and cifar_test in ./dataset using pickle:
pickle.dump(cifar_train, open('./dataset/cifar_train', 'wb'))
pickle.dump(cifar_test, open('./dataset/cifar_test', 'wb'))
Download sunrise images from Weather-Dataset into ./CycleGAN/sunrise.
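The CIFAR-10 conversion described above could be sketched as follows. This is a minimal sketch, not the authors' script: it assumes you downloaded the "python version" of CIFAR-10 (the extracted cifar-10-batches-py folder with data_batch_1 to data_batch_5 and test_batch), and the helper names to_dfst_format, load_batch and build_and_save are hypothetical. It keeps the raw uint8 pixel values; whether the training scripts expect any further preprocessing is not stated here.

```python
import os
import pickle

import numpy as np


def to_dfst_format(raw_batches):
    """Stack CIFAR-10 'python version' batches into (N, 32, 32, 3) images and (N, 1) labels."""
    data = np.concatenate([np.asarray(b[b'data'], dtype=np.uint8) for b in raw_batches])
    labels = np.concatenate([np.asarray(b[b'labels']) for b in raw_batches])
    # Each row holds 1024 red, 1024 green, then 1024 blue values (channels first),
    # so reshape to (N, 3, 32, 32) and move the channel axis to the end.
    x = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    y = labels.reshape(-1, 1)
    return x, y


def load_batch(path):
    """Load one CIFAR-10 batch file; its dictionary keys are bytes (e.g. b'data')."""
    with open(path, 'rb') as f:
        return pickle.load(f, encoding='bytes')


def build_and_save(cifar_dir, out_dir='./dataset'):
    """Hypothetical helper: write cifar_train and cifar_test into out_dir."""
    train = [load_batch(os.path.join(cifar_dir, 'data_batch_%d' % i)) for i in range(1, 6)]
    test = [load_batch(os.path.join(cifar_dir, 'test_batch'))]
    x_train, y_train = to_dfst_format(train)
    x_test, y_test = to_dfst_format(test)

    os.makedirs(out_dir, exist_ok=True)
    cifar_train = {'x_train': x_train, 'y_train': y_train}
    cifar_test = {'x_test': x_test, 'y_test': y_test}
    with open(os.path.join(out_dir, 'cifar_train'), 'wb') as f:
        pickle.dump(cifar_train, f)
    with open(os.path.join(out_dir, 'cifar_test'), 'wb') as f:
        pickle.dump(cifar_test, f)
    return cifar_train, cifar_test


# Example call (the input path is an assumption about where you extracted the archive):
# build_and_save('./cifar-10-batches-py')
```

With the real dataset, the saved dictionaries should have the shapes listed above, which you can confirm by loading the two files with pickle.load and printing each array's shape.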
Train your own CycleGAN as the trigger generator
Change into the CycleGAN directory: cd CycleGAN.
Train your own CycleGAN: python CycleGAN.py.
Poison the training dataset: python data_poisoning.py.
Perform the DFST attack
Train a benign VGG classifier on CIFAR-10: python train.py.
Inject the trigger using the poisoned training data: python retrain.py.
Perform detoxification to force the model to learn deep features: sh run.sh.
Contact
Feel free to contact the author at cheng535@purdue.edu.