Adversarial Attacks

Artificial neural networks (ANNs) are extremely vulnerable to adversarial attacks...
* practical examples: https://boingboing.net/tag/adversarial-examples
* https://bdtechtalks.com/2018/12/27/deep-learning-adversarial-attacks-ai-malware/
* https://www.dailydot.com/debug/ai-malware/
* https://en.wikipedia.org/wiki/Deep_learning#Cyberthreat

----
----
=WHITE BOX ATTACKS=
* https://cv-tricks.com/how-to/breaking-deep-learning-with-adversarial-examples-using-tensorflow/
** Paper »ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD«: https://arxiv.org/pdf/1607.02533.pdf
----
==Untargeted Adversarial Attacks==
Adversarial attacks that just want '''your model to be confused and predict a wrong class''' are called untargeted adversarial attacks.
* not aimed at a specific target class
===Fast Gradient Sign Method (FGSM)===
FGSM is a single-step attack, i.e. the perturbation is added in a single step instead of being accumulated over a loop (iterative attack).
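A minimal sketch of the single FGSM step, here written in PyTorch (the framework and the [0, 1] pixel range are assumptions; <code>model</code>, <code>x</code>, <code>y</code> and <code>eps</code> stand for your own classifier, input batch, true labels and step size):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One FGSM step: perturb x along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # loss w.r.t. the true labels
    loss.backward()
    # moving *up* the loss surface makes a wrong prediction more likely
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()     # assumes inputs normalized to [0, 1]
</syntaxhighlight>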
===Basic Iterative Method===
The perturbation is applied in several small steps instead of in one single step.
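A sketch of the iterative variant under the same PyTorch assumptions as above; <code>alpha</code> is the per-step size and <code>eps</code> bounds the total perturbation:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def basic_iterative_method(model, x, y, eps, alpha, steps):
    """FGSM applied repeatedly in small steps, clipped to an eps-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project back so the total perturbation stays within eps of the original x
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
</syntaxhighlight>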
===Iterative Least-Likely Class Method===
Creates an image that is steered towards the class with the lowest score in the original prediction.
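A sketch under the same assumptions: take the class the clean input scores lowest on as the target and descend the loss towards it:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def iterative_least_likely(model, x, eps, alpha, steps):
    """Steer x towards the class the model originally rated least likely."""
    with torch.no_grad():
        y_ll = model(x).argmin(dim=1)      # least-likely class of the clean input
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_ll)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # minus sign: minimize the loss, i.e. make the least-likely class more likely
        x_adv = x_adv.detach() - alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
</syntaxhighlight>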
----
==Targeted Adversarial Attacks==
Attacks which compel the model to predict a '''(wrong) desired output''' are called targeted adversarial attacks.
* aimed at a specific target class
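Compared to the untargeted step above, only the sign changes: pick a target class and minimize the loss towards it. A one-step sketch under the same PyTorch assumptions (<code>y_target</code> is the class the attacker wants):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, y_target, eps):
    """Single targeted step: push x towards a chosen (wrong) class."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y_target)
    loss.backward()
    # subtract the gradient sign: *minimize* the loss w.r.t. the desired output
    x_adv = x - eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()
</syntaxhighlight>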
----
==(Un-)Targeted Adversarial Attacks==
can do both...
===Projected Gradient Descent (PGD)===
Find a perturbation that maximizes the loss of a model for a given input (a sketch follows the notebook link below):
** Jupyter Notebook: https://github.com/oscarknagg/adversarial/blob/master/notebooks/Creating_And_Defending_From_Adversarial_Examples.ipynb
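A sketch of PGD under the same PyTorch assumptions (an illustration of the idea, not code taken from the notebook): start at a random point inside the eps-ball, repeat gradient-sign steps, and project back onto the ball after each step:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps, alpha, steps):
    """Maximize the loss within an L-infinity ball of radius eps around x."""
    # random start inside the eps-ball (a common PGD variant)
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project onto the ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv
</syntaxhighlight>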
----
----
=BLACK BOX ATTACKS=
* https://medium.com/@ml.at.berkeley/tricking-neural-networks-create-your-own-adversarial-examples-a61eb7620fd8
** Jupyter Notebook: https://github.com/dangeng/Simple_Adversarial_Examples
----
==on computer vision==
===zeroth order optimization (ZOO)===
* ZOO attacks directly estimate the gradients of the targeted DNN (see the sketch below)
** https://arxiv.org/abs/1708.03999
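The core trick needs no framework: a black-box attacker has no gradients, so ZOO estimates them coordinate by coordinate from score queries via symmetric finite differences. A minimal NumPy illustration (the toy loss <code>f</code> is a stand-in for whatever scalar the attacker can query, e.g. a class score):

<syntaxhighlight lang="python">
import numpy as np

def zoo_partial_derivative(f, x, i, h=1e-4):
    """Estimate df/dx_i from two queries to the black-box scalar loss f."""
    e = np.zeros_like(x)
    e.flat[i] = h
    return (f(x + e) - f(x - e)) / (2 * h)

# Example: estimate the full gradient of a toy "black-box" loss at a random point.
f = lambda x: float(np.sum(x ** 2))       # stand-in for a queried model score
x = np.random.randn(8)
grad_est = np.array([zoo_partial_derivative(f, x, i) for i in range(x.size)])
print(grad_est)                           # should be close to 2 * x
</syntaxhighlight>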
===Black-Box Attacks using Adversarial Samples===
* a technique that uses the victim model as an oracle to label a synthetic training set for a substitute model, so the attacker does not even need to collect a training set to mount the attack (see the sketch below)
** https://arxiv.org/abs/1605.07277
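A minimal sketch of the substitute-model idea with scikit-learn; <code>victim_oracle</code> is a hypothetical stand-in for the real prediction API the attacker can query:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.neural_network import MLPClassifier

def victim_oracle(x):
    """Hypothetical black-box victim: the attacker only sees its output labels."""
    return (x.sum(axis=1) > 0).astype(int)

# 1. Synthesize inputs and let the victim label them (no real training data needed).
x_syn = np.random.uniform(-1.0, 1.0, size=(2000, 10))
y_syn = victim_oracle(x_syn)

# 2. Train a local substitute model on the oracle's labels.
substitute = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
substitute.fit(x_syn, y_syn)

# 3. White-box attacks (e.g. FGSM) crafted against the substitute often
#    transfer to the victim model.
</syntaxhighlight>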
===new Tesla Hack===
* https://spectrum.ieee.org/cars-that-think/transportation/self-driving/three-small-stickers-on-road-can-steer-tesla-autopilot-into-oncoming-lane
** https://boingboing.net/2019/03/31/mote-in-cars-eye.html
** Paper by the research team: https://keenlab.tencent.com/en/whitepapers/Experimental_Security_Research_of_Tesla_Autopilot.pdf
----
==on voice (ASR)==
* https://www.the-ambient.com/features/weird-ways-echo-can-be-hacked-how-to-stop-it-231
===hidden voice commands===
* https://www.theregister.co.uk/2016/07/11/siri_hacking_phones/
* https://www.fastcompany.com/90240975/alexa-can-be-hacked-by-chirping-birds
===Psychoacoustic Hiding (Attacking Speech Recognition)===
* https://adversarial-attacks.net/
** presentation slides: https://www.ndss-symposium.org/wp-content/uploads/ndss2019_08-2_Schonherr_slides.pdf
----
==on written text (NLP)==
===paraphrasing attacks===
* https://venturebeat.com/2019/04/01/text-based-ai-models-are-vulnerable-to-paraphrasing-attacks-researchers-find/
* https://bdtechtalks.com/2019/04/02/ai-nlp-paraphrasing-adversarial-attacks/
* https://motherboard.vice.com/en_us/article/9axx5e/ai-can-be-fooled-with-one-misspelled-word
----
==Anti Surveillance==
http://dismagazine.com/dystopia/evolved-lifestyles/8115/anti-surveillance-how-to-hide-from-machines/
----
==libraries==
* https://github.com/bethgelab
* https://github.com/tensorflow/cleverhans