Adversarial Attacks

From exmediawiki

 
* A Complete List of All (arXiv) Adversarial Example Papers https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html
 
* Real-world examples: https://boingboing.net/tag/adversarial-examples
 
 
* https://bdtechtalks.com/2018/12/27/deep-learning-adversarial-attacks-ai-malware/
 
 
* https://github.com/bethgelab
 
 
* https://github.com/tensorflow/cleverhans
 
  
 
[[Category:Hacking]]
 

Current revision as of 17 December 2020, 17:43

Blog by Francis Hunger and Flupke with nice examples: http://adversarial.io/blog/allgemein/

https://github.com/shangtse/robust-physical-attack

https://spectrum.ieee.org/cars-that-think/transportation/sensors/slight-street-sign-modifications-can-fool-machine-learning-algorithms

https://github.com/ifding/adversarial-examples/blob/master/notebooks/adversarial.ipynb

https://arxiv.org/pdf/1712.09665.pdf

https://github.com/zentralwerkstatt/adversarial/blob/master/adversarial.ipynb

https://christophm.github.io/interpretable-ml-book/adversarial.html

Book: Strengthening Deep Neural Networks: Making AI Less Susceptible to Adversarial Trickery: https://b-ok.cc/book/5260920/fee7e3



https://towardsdatascience.com/perhaps-the-simplest-introduction-of-adversarial-examples-ever-c0839a759b8d

notes

https://github.com/tensorflow/cleverhans

WHITE BOX ATTACKS


Untargeted Adversarial Attacks

Attacks that only aim to make the model predict some wrong class, no matter which one, are called untargeted adversarial attacks.

  • untargeted

Fast Gradient Sign Method (FGSM)

FGSM is a single-step attack, i.e. the perturbation is added in one step instead of over a loop of several steps (iterative attack).
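
The single step described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the two-class linear softmax model (`W`, `b`), the input `x` and the budget `eps` are hypothetical toy values chosen so that the gradient can be written by hand. For a real network, a library such as cleverhans would be used instead.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a network)."""
    logits = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss for label y with respect to the input x."""
    p = predict_proba(W, b, x)
    d = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]  # dL/dlogits
    return [sum(d[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(W, b, x, y, eps):
    """Take one step of size eps along the sign of the loss gradient."""
    g = input_gradient(W, b, x, y)
    return [xj + eps * sign(gj) for xj, gj in zip(x, g)]

# Hypothetical toy model and input (assumptions, not from the notes above)
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x, y = [0.6, 0.4], 0

x_adv = fgsm(W, b, x, y, eps=0.3)
p_clean = predict_proba(W, b, x)
p_adv = predict_proba(W, b, x_adv)
print("clean prediction:", p_clean.index(max(p_clean)))
print("adversarial prediction:", p_adv.index(max(p_adv)))
```

Because only the sign of the gradient is used, every input coordinate moves by exactly `eps`, which keeps the perturbation inside an L-infinity ball of that radius.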

Basic Iterative Method

Apply the perturbation in several small steps instead of in a single step.
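
A rough sketch of this iterative variant, under the same kind of assumptions as a toy setting allows: the linear softmax model, the step size `alpha`, the budget `eps` and the number of steps are hypothetical. After each small sign step the result is clipped so that it stays within `eps` of the original input.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    logits = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss for label y with respect to the input x."""
    p = predict_proba(W, b, x)
    d = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return [sum(d[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def basic_iterative_method(W, b, x, y, eps, alpha, steps):
    """Repeat small sign-gradient steps, clipping back into [x - eps, x + eps]."""
    x_adv = list(x)
    for _ in range(steps):
        g = input_gradient(W, b, x_adv, y)
        x_adv = [v + alpha * sign(gj) for v, gj in zip(x_adv, g)]
        x_adv = [min(max(v, xj - eps), xj + eps) for v, xj in zip(x_adv, x)]
    return x_adv

# Hypothetical toy model and input
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x, y = [0.6, 0.4], 0

x_adv = basic_iterative_method(W, b, x, y, eps=0.3, alpha=0.05, steps=10)
p_adv = predict_proba(W, b, x_adv)
print("adversarial prediction:", p_adv.index(max(p_adv)))
```

On a linear toy model the result matches a single large step. The iteration pays off on non-linear models, where the gradient direction changes along the way.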

Iterative Least-Likely Class Method

Create an image that the model assigns to the class with the lowest score in its prediction for the original image.
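
As a sketch of the idea: the least-likely class is read off the prediction for the clean input, and the input is then moved in small steps that lower the loss for that class. The three-class linear model and all parameter values below are hypothetical. The budget `eps` is deliberately large, because this toy input has only two dimensions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    logits = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss for label y with respect to the input x."""
    p = predict_proba(W, b, x)
    d = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return [sum(d[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def iterative_least_likely(W, b, x, eps, alpha, steps):
    """Step towards the class that has the lowest score on the clean input."""
    p = predict_proba(W, b, x)
    target = p.index(min(p))  # least-likely class
    x_adv = list(x)
    for _ in range(steps):
        g = input_gradient(W, b, x_adv, target)
        # Descend the loss of the target class (note the minus sign)
        x_adv = [v - alpha * sign(gj) for v, gj in zip(x_adv, g)]
        x_adv = [min(max(v, xj - eps), xj + eps) for v, xj in zip(x_adv, x)]
    return x_adv, target

# Hypothetical three-class toy model and input
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = [0.8, 0.3]

x_adv, target = iterative_least_likely(W, b, x, eps=1.0, alpha=0.1, steps=12)
p_adv = predict_proba(W, b, x_adv)
print("least-likely class:", target)
print("adversarial prediction:", p_adv.index(max(p_adv)))
```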


Targeted Adversarial Attacks

Attacks that compel the model to predict a specific (wrong) output chosen by the attacker are called targeted adversarial attacks.

  • targeted

(Un-)Targeted Adversarial Attacks

Can do both.

Projected Gradient Descent (PGD)

Find a perturbation that maximizes the model's loss on a given input.
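
A minimal PGD sketch, assuming an L-infinity budget `eps`, a random start inside that budget and a hypothetical toy linear model. The `targeted` flag is an illustrative parameter: when it is false the loss for the true label is maximized, when it is true the loss for the desired label is minimized, which is one way to cover both attack types.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    logits = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss for label y with respect to the input x."""
    p = predict_proba(W, b, x)
    d = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return [sum(d[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def pgd(W, b, x, label, eps, alpha, steps, targeted=False, seed=0):
    """Projected gradient steps on the loss within an L-infinity ball of radius eps."""
    rng = random.Random(seed)
    # Random start inside the eps-ball around x
    x_adv = [xj + rng.uniform(-eps, eps) for xj in x]
    direction = -1.0 if targeted else 1.0  # descend for targeted, ascend otherwise
    for _ in range(steps):
        g = input_gradient(W, b, x_adv, label)
        x_adv = [v + direction * alpha * sign(gj) for v, gj in zip(x_adv, g)]
        # Projection: clip each coordinate back into [x - eps, x + eps]
        x_adv = [min(max(v, xj - eps), xj + eps) for v, xj in zip(x_adv, x)]
    return x_adv

# Hypothetical toy model and input; the true label is 0
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x = [0.6, 0.4]

x_untargeted = pgd(W, b, x, label=0, eps=0.3, alpha=0.05, steps=20)
x_targeted = pgd(W, b, x, label=1, eps=0.3, alpha=0.05, steps=20, targeted=True)
p_u = predict_proba(W, b, x_untargeted)
p_t = predict_proba(W, b, x_targeted)
print("untargeted prediction:", p_u.index(max(p_u)))
print("targeted prediction:", p_t.index(max(p_t)))
```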

WHITE/BLACK BOX ATTACKS

on voice (ASR)

Psychoacoustic Hiding (Attacking Speech Recognition)


BLACK BOX ATTACKS


on computer vision

Zeroth Order Optimization (ZOO)
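
The core idea of a zeroth-order attack can be sketched as follows: the attacker only queries the model for output probabilities and estimates the gradient from symmetric finite differences. This is a simplified illustration and not the published ZOO algorithm. It assumes a plain cross-entropy loss and updates all coordinates in every step. The hidden toy model and the parameters `h`, `eps` and `alpha` are hypothetical.

```python
import math

def make_black_box():
    """Return a query function; the weights stay hidden from the attacker."""
    W = [[1.0, -1.0], [-1.0, 1.0]]
    b = [0.0, 0.0]
    def query(x):
        logits = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]
    return query

def loss(query, x, y):
    """Cross-entropy loss computed only from the queried probabilities."""
    return -math.log(query(x)[y])

def estimate_gradient(query, x, y, h=1e-4):
    """Symmetric finite differences: two queries per input coordinate."""
    grad = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        grad.append((loss(query, x_plus, y) - loss(query, x_minus, y)) / (2 * h))
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def zeroth_order_attack(query, x, y, eps, alpha, steps):
    """Sign steps on the estimated gradient, clipped to [x - eps, x + eps]."""
    x_adv = list(x)
    for _ in range(steps):
        g = estimate_gradient(query, x_adv, y)
        x_adv = [v + alpha * sign(gj) for v, gj in zip(x_adv, g)]
        x_adv = [min(max(v, xj - eps), xj + eps) for v, xj in zip(x_adv, x)]
    return x_adv

query = make_black_box()
x, y = [0.6, 0.4], 0
g_est = estimate_gradient(query, x, y)
x_adv = zeroth_order_attack(query, x, y, eps=0.3, alpha=0.05, steps=10)
p_adv = query(x_adv)
print("estimated gradient:", g_est)
print("adversarial prediction:", p_adv.index(max(p_adv)))
```

The cost is two queries per coordinate and step. For image-sized inputs this is expensive, so practical attacks of this kind usually update only a subset of coordinates per step.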

Black-Box Attacks using Adversarial Samples

  • a technique that uses the victim model as an oracle to label a synthetic training set for the substitute, so the attacker need not even collect a training set to mount the attack
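
The oracle idea can be sketched as below, under strong simplifying assumptions. The victim is a hypothetical rule that returns only a label, the synthetic inputs are drawn uniformly at random rather than grown by a dataset augmentation scheme, and the substitute is a small logistic regression trained by gradient descent. An FGSM step computed on the substitute is then tested against the victim.

```python
import math
import random

def oracle(x):
    """Victim model: the attacker sees only the predicted label."""
    return 1 if x[1] > x[0] else 0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_substitute(inputs, labels, lr=0.5, epochs=300):
    """Fit a logistic regression substitute on oracle-labelled data."""
    w, b = [0.0, 0.0], 0.0
    n = len(inputs)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(inputs, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_on_substitute(w, b, x, y, eps):
    """FGSM using the substitute's gradient; the victim's gradient is never needed."""
    err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
    grad = [err * wj for wj in w]  # gradient of the logistic loss w.r.t. x
    return [xj + eps * sign(gj) for xj, gj in zip(x, grad)]

# 1. Synthetic inputs, labelled by querying the oracle
rng = random.Random(0)
synthetic = [[rng.random(), rng.random()] for _ in range(200)]
labels = [oracle(x) for x in synthetic]

# 2. Train the substitute on the oracle's answers
w, b = train_substitute(synthetic, labels)

# 3. Craft the adversarial input on the substitute, then test it on the victim
x = [0.6, 0.4]
y = oracle(x)
x_adv = fgsm_on_substitute(w, b, x, y, eps=0.3)
print("victim on clean input:", y)
print("victim on adversarial input:", oracle(x_adv))
```

The attack relies on transferability: an input that fools the substitute often fools the victim as well, because both models approximate the same decision boundary.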

new Tesla Hack

Another Attack Against Driverless Cars

Schneier News:

In this piece of research, attackers successfully attack a driverless car system -- Renault Captur's "Level 0" autopilot (Level 0 systems advise human drivers but do not directly operate cars) -- by following them with drones that project images of fake road signs in 100ms bursts. The time is too short for human perception, but long enough to fool the autopilot's sensors.

Boing Boing post.


on voice (ASR)

hidden voice commands

BLACK BOX / WHITE BOX ATTACKS

on voice (ASR)

Psychoacoustic Hiding (Attacking Speech Recognition)


on written text (NLP)

paraphrasing attacks


Anti Surveillance

http://dismagazine.com/dystopia/evolved-lifestyles/8115/anti-surveillance-how-to-hide-from-machines/

How to Disappear Completely

https://www.youtube.com/watch?v=LOulCAz4S0M talk by Lilly Ryan at linux.conf.au 2019 — Christchurch, New Zealand


libraries


Detection through XAI

When Explainability Meets Adversarial Learning: Detecting Adversarial Examples using SHAP Signatures: https://arxiv.org/pdf/1909.03418.pdf

Using Explainabilty to Detect Adversarial Attacks: https://openreview.net/pdf?id=B1xu6yStPH