Adversarial Attacks

Artificial neural networks are extremely vulnerable to...

* Practical examples: https://boingboing.net/tag/adversarial-examples
* https://bdtechtalks.com/2018/12/27/deep-learning-adv…
* https://en.wikipedia.org/wiki/Deep_learning#Cyberthreat

----
----
=WHITE BOX ATTACKS=

* https://cv-tricks.com/how-to/breaking-deep-learning-with-adversarial-examples-using-tensorflow/
** Paper »ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD«: https://arxiv.org/pdf/1607.02533.pdf

----
 
==Untargeted Adversarial Attacks==

Adversarial attacks that just want '''your model to be confused and predict a wrong class''' are called untargeted adversarial attacks.

* not aimed at any particular output class
 
===Fast Gradient Sign Method (FGSM)===

FGSM is a single-step attack, i.e. the perturbation is added in a single step instead of being accumulated over a loop (as in the iterative attacks below).
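A minimal FGSM sketch in PyTorch (the framework choice is an assumption; the cv-tricks tutorial above uses TensorFlow). model, x and y are hypothetical placeholders for a trained classifier, an input batch and its true labels:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """One signed-gradient step that increases the loss on the true label.

    model, x, y are hypothetical placeholders, see the note above.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Single step: move each pixel by eps in the direction that increases
    # the loss, then clamp back to the valid image range [0, 1].
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()
</syntaxhighlight>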
 
===Basic Iterative Method===

Apply the perturbation in several small steps instead of in one single step.
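A sketch of this iterative variant, under the same hypothetical model/x/y setup as the FGSM example above; alpha is the per-step size and eps the total perturbation budget:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def basic_iterative(model, x, y, eps=0.03, alpha=0.005, steps=10):
    """Many small FGSM-style steps, keeping the total change within eps."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            # Project back into the eps-ball around the original input.
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv
</syntaxhighlight>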
 
===Iterative Least-Likely Class Method===

Create an image that gets the lowest score in the model's prediction, i.e. perturb the input toward the least-likely class.
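The same loop with the sign flipped: a sketch (same hypothetical setup as above) that descends toward the class the model currently rates lowest:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def least_likely_iterative(model, x, eps=0.03, alpha=0.005, steps=10):
    """Push x toward the class with the lowest predicted score."""
    with torch.no_grad():
        y_ll = model(x).argmin(dim=1)  # least-likely class per sample
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_ll)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            # Minus sign: decrease the loss on the least-likely label.
            x_adv = x_adv - alpha * grad.sign()
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv
</syntaxhighlight>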
----
 
==Targeted Adversarial Attacks==

Attacks which compel the model to predict a '''(wrong) desired output''' are called targeted adversarial attacks.

* targeted at an attacker-chosen class (see the single-step sketch below)
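Compared with the untargeted FGSM sketch above, a targeted single-step variant only swaps in the attacker-chosen label and flips the sign (target is a hypothetical tensor of desired class ids):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, target, eps=0.03):
    """One step that makes the attacker's chosen class more likely."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), target)
    loss.backward()
    # Descend the loss on the desired (wrong) label.
    return (x_adv - eps * x_adv.grad.sign()).clamp(0, 1).detach()
</syntaxhighlight>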
----
 
==(Un-)Targeted Adversarial Attacks==

can do both...
 
===Projected Gradient Descent (PGD)===

Find a perturbation that maximizes the loss of a model for a given input (a sketch follows the link below):

** Jupyter Notebook: https://github.com/oscarknagg/adversarial/blob/master/notebooks/Creating_And_Defending_From_Adversarial_Examples.ipynb
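A sketch of the untargeted PGD loop, under the same hypothetical model/x/y setup as the white-box examples above; unlike the Basic Iterative Method it starts from a random point inside the eps-ball (the linked notebook is the reference here):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps=0.03, alpha=0.007, steps=40):
    """Random start, then repeat: gradient step, project into eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv
</syntaxhighlight>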
  
----
----
 
=BLACK BOX ATTACKS=

** Jupyter Notebook: https://github.com/dangeng/Simple_Adversarial_Examples
  
----
 
==on computer vision==
 
===zeroth order optimization (ZOO)===

* attacks that directly estimate the gradients of the targeted DNN from queries alone (see the sketch below)
** https://arxiv.org/abs/1708.03999
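The core idea, sketched with a hypothetical query_loss function that submits an image to the black-box model and returns a scalar attack loss: the gradient along one coordinate is estimated from two queries with a symmetric finite difference, no backpropagation needed.

<syntaxhighlight lang="python">
import numpy as np

def zoo_coordinate_gradient(query_loss, x, i, h=1e-4):
    """Estimate d(loss)/d(x_i) purely from black-box queries.

    query_loss is a hypothetical placeholder for the attacker's
    query interface to the victim model.
    """
    e = np.zeros_like(x)
    e.flat[i] = h  # perturb a single coordinate
    return (query_loss(x + e) - query_loss(x - e)) / (2 * h)
</syntaxhighlight>

The paper then runs coordinate-wise optimization on top of such estimates in place of a true gradient.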
 
===Black-Box Attacks using Adversarial Samples===

* a technique that uses the victim model as an oracle to label a synthetic training set for a substitute model, so the attacker need not even collect a training set to mount the attack (sketched below)
** https://arxiv.org/abs/1605.07277
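A rough sketch of the oracle-labelling step, with hypothetical placeholders throughout (victim_predict queries the black box and returns a class id, substitute is any differentiable local model, data is attacker-generated synthetic images):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def train_substitute(victim_predict, substitute, data, epochs=5, lr=1e-3):
    """Fit a local substitute model to imitate the victim's labels."""
    X = torch.stack(list(data))
    y = torch.tensor([victim_predict(x) for x in data])  # oracle queries
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(substitute(X), y).backward()
        opt.step()
    # White-box attacks crafted against the substitute (e.g. FGSM above)
    # then often transfer to the victim model.
    return substitute
</syntaxhighlight>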
 
===new Tesla Hack===

* https://spectrum.ieee.org/cars-that-think/transportation/self-driving/three-small-stickers-on-road-can-steer-tesla-autopilot-into-oncoming-lane
** Paper by the research team: https://keenlab.tencent.com/en/whitepapers/Experimental_Security_Research_of_Tesla_Autopilot.pdf
  
----
 
==on voice (ASR)==

* https://www.the-ambient.com/features/weird-ways-echo-can-be-hacked-how-to-stop-it-231
 
===hidden voice commands===

* https://www.theregister.co.uk/2016/07/11/siri_hacking_phones/
* https://www.fastcompany.com/90240975/alexa-can-be-hacked-by-chirping-birds
 
===Psychoacoustic Hiding (Attacking Speech Recognition)===

* https://adversarial-attacks.net/
** Presentation slides: https://www.ndss-symposium.org/wp-content/uploads/ndss2019_08-2_Schonherr_slides.pdf
  
----
 
==on written text (NLP)==
 
===paraphrasing attacks===

* https://venturebeat.com/2019/04/01/text-based-ai-models-are-vulnerable-to-paraphrasing-attacks-researchers-find/
* https://motherboard.vice.com/en_us/article/9axx5e/ai-can-be-fooled-with-one-misspelled-word
  
----
 
==Anti Surveillance==

http://dismagazine.com/dystopia/evolved-lifestyles/8115/anti-surveillance-how-to-hide-from-machines/
  
----
 
==libraries==

* https://github.com/bethgelab
* https://github.com/tensorflow/cleverhans
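For illustration, a minimal sketch assuming the CleverHans 2.x TensorFlow API (the interface has changed across releases, so treat this as an assumption and check the repository; keras_model, sess and x_test are hypothetical placeholders):

<syntaxhighlight lang="python">
from cleverhans.attacks import FastGradientMethod
from cleverhans.utils_keras import KerasModelWrapper

# Wrap an existing Keras classifier so CleverHans can attack it;
# sess is the surrounding tf.Session, x_test a numpy batch of images.
wrap = KerasModelWrapper(keras_model)
fgsm = FastGradientMethod(wrap, sess=sess)

# Generate adversarial versions of the batch as numpy arrays.
x_adv = fgsm.generate_np(x_test, eps=0.3, clip_min=0.0, clip_max=1.0)
</syntaxhighlight>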
