
Adversarial Attacks: Difference between revisions

From exmediawiki

Page created: „ANNs are extremely vulnerable to... * Practical examples: https://boingboing.net/tag/adversarial-examples * https://bdtechtalks.com/2018/12/27/deep-learning-adv…“
 
No edit summary
Line 8: Line 8:


* https://en.wikipedia.org/wiki/Deep_learning#Cyberthreat

----
=WHITE BOX ATTACKS=
* https://cv-tricks.com/how-to/breaking-deep-learning-with-adversarial-examples-using-tensorflow/
** Paper »ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD«: https://arxiv.org/pdf/1607.02533.pdf

----
==Untargeted Adversarial Attacks==
Adversarial attacks that merely aim to '''confuse your model so that it predicts a wrong class''' are called untargeted adversarial attacks.
* not targeted at a specific class
===Fast Gradient Sign Method (FGSM)===
FGSM is a single-step attack, i.e. the perturbation is added in one step rather than accumulated over a loop (as in an iterative attack).
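The single step can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the victim is a hypothetical logistic-regression model with made-up weights (<code>WEIGHTS</code>, <code>BIAS</code>), the input is a three-element vector rather than an image, and the names <code>predict_proba</code>, <code>loss_gradient</code> and <code>fgsm</code> are invented for this sketch. Clipping to a valid pixel range is left out.

```python
import math

# Hypothetical victim: a logistic-regression classifier with fixed weights.
# White-box setting: the attacker is assumed to know WEIGHTS and BIAS.
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.5

def predict_proba(x):
    """Return P(class 1 | x) under the toy model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_proba(x)
    return [(p - y) * w for w in WEIGHTS]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, epsilon):
    """Take one step of size epsilon along the sign of the loss gradient."""
    grad = loss_gradient(x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

x = [0.2, 0.1, 0.4]  # clean input with true label 1
y = 1
x_adv = fgsm(x, y, epsilon=0.3)
print(predict_proba(x), predict_proba(x_adv))
```

For this toy model the first printed probability lies above 0.5 and the second below it, so the single step is enough to flip the predicted class while no coordinate changes by more than <code>epsilon</code>.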
===Basic Iterative Method===
Apply the perturbation in several small steps instead of in a single step.
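A minimal sketch of the iterative variant, again on a hypothetical logistic-regression model with made-up weights. The step size <code>alpha</code>, the step count and the helper names are assumptions chosen for illustration. After every small step the result is clipped so that it stays within <code>epsilon</code> of the original input.

```python
import math

# Hypothetical white-box victim (assumed weights, for illustration only).
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.5

def predict_proba(x):
    """Return P(class 1 | x) under the toy model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_proba(x)
    return [(p - y) * w for w in WEIGHTS]

def sign(v):
    return (v > 0) - (v < 0)

def basic_iterative_method(x, y, epsilon, alpha, steps):
    """Repeat small signed-gradient steps, clipping to the epsilon box around x."""
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(x_adv, y)
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Keep every coordinate within epsilon of the original input.
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon)
                 for xa, xo in zip(x_adv, x)]
    return x_adv

x = [0.2, 0.1, 0.4]  # clean input with true label 1
y = 1
x_adv = basic_iterative_method(x, y, epsilon=0.3, alpha=0.1, steps=5)
print(predict_proba(x), predict_proba(x_adv))
```

Because the gradient is recomputed at every step, the iterative version can follow a curved loss surface more closely than a single large step, at the cost of more gradient evaluations.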
===Iterative Least-Likely Class Method===
Create an image that the model assigns to the class with the lowest score in its prediction for the original image (the least-likely class).
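The idea can be sketched as follows, assuming a hypothetical three-class softmax classifier with made-up weights and a two-element input. The least-likely class is determined once on the clean input, and the loop then takes small steps that ''decrease'' the loss for that class. All names and parameter values are illustrative assumptions.

```python
import math

# Hypothetical 3-class linear softmax model, one weight row per class.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

def predict_proba(x):
    """Softmax probabilities for each class."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in WEIGHTS]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss_gradient(x, label):
    """Gradient of the cross-entropy loss for `label` with respect to x."""
    probs = predict_proba(x)
    grad = [0.0] * len(x)
    for k, row in enumerate(WEIGHTS):
        coeff = probs[k] - (1.0 if k == label else 0.0)
        for i, w in enumerate(row):
            grad[i] += coeff * w
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def iterative_least_likely(x, epsilon, alpha, steps):
    """Step toward the class the model considers least likely for x."""
    probs = predict_proba(x)
    target = min(range(len(probs)), key=probs.__getitem__)
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(x_adv, target)
        # Descend: lower the loss for the least-likely class.
        x_adv = [xa - alpha * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon)
                 for xa, xo in zip(x_adv, x)]
    return x_adv, target

x = [1.0, 0.2]
x_adv, target = iterative_least_likely(x, epsilon=1.5, alpha=0.25, steps=10)
print(argmax(predict_proba(x)), target, argmax(predict_proba(x_adv)))
```

In this toy setup the clean input is classified as class 0, the least-likely class is 2, and the perturbed input is classified as class 2. The large <code>epsilon</code> is only needed because the toy input is two-dimensional.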
----
==Targeted Adversarial Attacks==
Attacks that compel the model to predict a '''specific (wrong) output chosen by the attacker''' are called targeted adversarial attacks.
* targeted at a specific class
----
==(Un-)Targeted Adversarial Attacks==
can be either...
===Projected Gradient Descent (PGD)===
Find a perturbation that maximizes the model's loss on a given input:
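A minimal sketch of PGD under an L-infinity constraint, assuming a hypothetical logistic-regression victim with made-up weights. It starts from a random point inside the <code>epsilon</code> box, takes signed-gradient steps, and projects back onto the box after each step. The <code>targeted</code> flag, the fixed seed and all helper names are assumptions for illustration: untargeted mode ascends the loss of the true label, targeted mode descends the loss of the desired label.

```python
import math
import random

# Hypothetical white-box victim (assumed weights, for illustration only).
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.5

def predict_proba(x):
    """Return P(class 1 | x) under the toy model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_proba(x)
    return [(p - y) * w for w in WEIGHTS]

def sign(v):
    return (v > 0) - (v < 0)

def project(x_adv, x, epsilon):
    """Project x_adv onto the L-infinity ball of radius epsilon around x."""
    return [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]

def pgd(x, y, epsilon, alpha, steps, targeted=False, seed=0):
    rng = random.Random(seed)
    # Random start inside the allowed perturbation box.
    x_adv = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
    direction = -1.0 if targeted else 1.0
    for _ in range(steps):
        grad = loss_gradient(x_adv, y)
        x_adv = [xa + direction * alpha * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = project(x_adv, x, epsilon)
    return x_adv

x = [0.2, 0.1, 0.4]  # clean input with true label 1
x_untargeted = pgd(x, y=1, epsilon=0.3, alpha=0.1, steps=10)
x_targeted = pgd(x, y=0, epsilon=0.3, alpha=0.1, steps=10, targeted=True)
print(predict_proba(x), predict_proba(x_untargeted), predict_proba(x_targeted))
```

With only two classes, moving away from label 1 and moving toward label 0 coincide, so both calls end at the same point here. With more classes the two modes generally produce different perturbations.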
Line 32: Line 43:
** Jupyter Notebook: https://github.com/oscarknagg/adversarial/blob/master/notebooks/Creating_And_Defending_From_Adversarial_Examples.ipynb

----
=BLACK BOX ATTACKS=

Line 37: Line 50:
** Jupyter Notebook: https://github.com/dangeng/Simple_Adversarial_Examples

----
==on computer vision==
===Zeroth Order Optimization (ZOO)===
* ZOO-based attacks directly estimate the gradients of the targeted DNN from its outputs in order to generate adversarial examples
** https://arxiv.org/abs/1708.03999
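The gradient estimation can be illustrated with symmetric finite differences, using only queries to the model's output. This is a simplified sketch and not the optimizer from the paper: the victim <code>query_model</code> is a hypothetical stand-in with hidden weights, and the estimated gradient is used in a single signed step instead of a coordinate-wise optimization loop. The step width <code>h</code> and all names are assumptions.

```python
import math

def query_model(x):
    """Black-box victim: the attacker only sees the returned probability.

    The weights below are hypothetical and never read by the attack code.
    """
    hidden_w, hidden_b = [2.0, -3.0, 1.0], 0.5
    z = sum(w * xi for w, xi in zip(hidden_w, x)) + hidden_b
    return 1.0 / (1.0 + math.exp(-z))

def estimate_gradient(f, x, h=1e-4):
    """Estimate df/dx coordinate by coordinate with symmetric differences."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def zeroth_order_step(f, x, epsilon):
    """Lower the score f(x) of the true class using only output queries."""
    grad = estimate_gradient(f, x)
    return [xi - epsilon * sign(g) for xi, g in zip(x, grad)]

x = [0.2, 0.1, 0.4]  # clean input, classified as class 1
x_adv = zeroth_order_step(query_model, x, epsilon=0.3)
print(query_model(x), query_model(x_adv))
```

Each gradient estimate costs two queries per input coordinate, which is why query-based attacks on high-dimensional images need additional tricks to stay practical.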
===Black-Box Attacks using Adversarial Samples===
* a technique that uses the victim model as an oracle to label a synthetic training set for a substitute model trained by the attacker, so the attacker need not even collect a training set to mount the attack
** https://arxiv.org/abs/1605.07277
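The oracle-and-substitute idea can be sketched as follows. Everything here is a simplifying assumption: the victim <code>oracle</code> is a hypothetical linear classifier that returns only hard labels, the synthetic inputs are drawn uniformly at random rather than generated by the paper's procedure, and the substitute is a small logistic regression trained by gradient descent. The adversarial example is crafted on the substitute and then sent to the victim.

```python
import math
import random

def oracle(x):
    """Victim as seen by the attacker: input in, hard label out."""
    hidden_w, hidden_b = [2.0, -3.0, 1.0], 0.5  # unknown to the attacker
    z = sum(w * xi for w, xi in zip(hidden_w, x)) + hidden_b
    return 1 if z > 0 else 0

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_substitute(inputs, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression substitute with full-batch gradient descent."""
    dim, n = len(inputs[0]), len(inputs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(inputs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_on_substitute(x, y, w, b, epsilon):
    """Craft the perturbation with the substitute's gradient only."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [xi + epsilon * sign((p - y) * wi) for xi, wi in zip(x, w)]

# 1. Build a synthetic data set and let the victim label it.
rng = random.Random(0)
synthetic = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
labels = [oracle(s) for s in synthetic]

# 2. Train the substitute on the oracle's labels.
w_sub, b_sub = train_substitute(synthetic, labels)

# 3. Attack the substitute and transfer the result to the victim.
x = [0.2, 0.1, 0.4]
y = oracle(x)
x_adv = fgsm_on_substitute(x, y, w_sub, b_sub, epsilon=0.3)
print(oracle(x), oracle(x_adv))
```

If the attack transfers, the second printed label differs from the first, even though the attacker never accessed the victim's weights or gradients.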
===new Tesla Hack===
* https://spectrum.ieee.org/cars-that-think/transportation/self-driving/three-small-stickers-on-road-can-steer-tesla-autopilot-into-oncoming-lane
Line 49: Line 66:
** Paper by the research team: https://keenlab.tencent.com/en/whitepapers/Experimental_Security_Research_of_Tesla_Autopilot.pdf

----
==on voice (ASR)==
* https://www.the-ambient.com/features/weird-ways-echo-can-be-hacked-how-to-stop-it-231
===hidden voice commands===
* https://www.theregister.co.uk/2016/07/11/siri_hacking_phones/
* https://www.fastcompany.com/90240975/alexa-can-be-hacked-by-chirping-birds
===Psychoacoustic Hiding (Attacking Speech Recognition)===
* https://adversarial-attacks.net/
Line 60: Line 80:
** Presentation slides: https://www.ndss-symposium.org/wp-content/uploads/ndss2019_08-2_Schonherr_slides.pdf

----
==on written text (NLP)==
===paraphrasing attacks===
* https://venturebeat.com/2019/04/01/text-based-ai-models-are-vulnerable-to-paraphrasing-attacks-researchers-find/
Line 67: Line 89:
* https://motherboard.vice.com/en_us/article/9axx5e/ai-can-be-fooled-with-one-misspelled-word

----
==Anti Surveillance==
http://dismagazine.com/dystopia/evolved-lifestyles/8115/anti-surveillance-how-to-hide-from-machines/

----
==libraries==
* https://github.com/bethgelab
* https://github.com/tensorflow/cleverhans

Revision as of 16 April 2019, 14:55

ANNs are extremely vulnerable to...


