Skip to content

Latest commit

 

History

History
16 lines (16 loc) · 769 Bytes

Adversarial ML.md

File metadata and controls

16 lines (16 loc) · 769 Bytes
  • Information Security Triad
    • Confidentiality: of the ML model, training data, eval data
    • Integrity: of the training data, of the model, of the eval data
    • Availability: of training, of the trained modl
  • ML Basics: take small steps of gradient descent to minimize loss
  • Attack: - take gradient ascent to maximize loss- Black box attack- Boundary attack
  • Defense:- Adversarial training
  • Poisoning attacks
    • generate insecure code in targeted contexts
    • generates negative reviews for targeted products
  • Data extraction:
    • extract private data from language model
    • steps: 1. generate lots of text, 2. filter text with a membership inference attack
  • Defense: train models with differential
    • con: hurts accuracy
  • Current ML is not rebust and not private