Red Team Tactics and Techniques

H04 NeuroInvasion: Penetrating the Core of Artificial Intelligence

05/15/2025

11:30am - 12:30pm

Level: Intermediate to Advanced

Chen Shiri

Cyber Security Researcher

Accenture Security

This presentation covers my new research and methodologies for attacking Deep Neural Networks (DNNs) and AI models in black-box environments (that is, without access to internal parameters). Traditionally, adversarial attacks require access to the model's internals (white-box access), which limits their use in black-box settings. This talk introduces **two innovative techniques** to bypass that restriction. Attendees will gain a deep understanding of how these techniques work, from identifying a model’s architecture through **model enumeration** to adapting **white-box attack strategies** for black-box models.

  1. **Model Enumeration** techniques: using prompts, API probing, and output analysis to identify the architecture and behavior of black-box models.
  2. New technique 1, **Substitute Model Attacks**: how to train and use a substitute model to apply white-box adversarial techniques to black-box systems.
  3. New technique 2, **Open-Source Model Exploitation**: methods to exploit AI models based on open-source frameworks like GPT by targeting known vulnerabilities.

I will also give practical demonstrations of how **white-box attacks**, along with other real-world vulnerabilities in widely deployed AI applications, can be adapted to black-box models once the techniques discussed have been applied. The session includes **a live demo** and mitigations to defend against these attacks.

The presentation demystifies these attacks, making them accessible to security professionals without requiring deep mathematical expertise.

You will learn:

  • New methods to exploit black-box AI models using model enumeration and substitute models: Substitute Model Attacks and Open-Source Model Exploitation.
  • New Enumeration Techniques: Practical methods to infer black-box model architectures, enabling more effective attacks.
  • Practical steps for executing white-box attacks such as the Fast Gradient Sign Method (FGSM) in black-box environments.
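
To make the substitute-model idea concrete, here is a minimal sketch of the general pattern, not the speaker's actual method or demo. It assumes a toy target: `black_box_predict` is a hypothetical stand-in for a remote model that returns only hard labels, and the substitute is a plain logistic regression trained on those labels. FGSM is then computed on the substitute's gradients and the perturbed inputs are sent back to the black box. All names, sizes, and the `eps` value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box target: the attacker may call it but never read _w_secret.
_w_secret = np.array([2.0, -1.0])

def black_box_predict(x):
    """Return hard labels (0.0 or 1.0) only, as a remote API might."""
    return (x @ _w_secret > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_substitute(x, y, lr=0.1, steps=300):
    """Fit a logistic-regression substitute on labels obtained by querying the black box."""
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        grad_w = x.T @ (sigmoid(x @ w) - y) / len(y)  # gradient of mean cross-entropy
        w -= lr * grad_w
    return w

def fgsm(x, y, w, eps):
    """FGSM on the substitute: move each input by eps along the sign of the loss gradient."""
    grad_x = (sigmoid(x @ w) - y)[:, None] * w[None, :]  # d(loss)/d(x) per sample
    return x + eps * np.sign(grad_x)

# Step 1: query the black box to label attacker-chosen inputs.
queries = rng.normal(size=(500, 2))
labels = black_box_predict(queries)

# Step 2: train the substitute, which the attacker fully controls (white-box).
w_sub = train_substitute(queries, labels)

# Step 3: craft adversarial inputs on the substitute and replay them on the black box.
x_test = rng.normal(size=(200, 2))
y_test = black_box_predict(x_test)
x_adv = fgsm(x_test, y_test, w_sub, eps=0.5)

clean_acc = np.mean(black_box_predict(x_test) == y_test)
adv_acc = np.mean(black_box_predict(x_adv) == y_test)
print(f"black-box agreement on clean inputs: {clean_acc:.2f}")
print(f"black-box agreement on FGSM inputs:  {adv_acc:.2f}")
```

In this toy setting the attack transfers because the substitute learns roughly the same decision boundary as the target. Real models are far higher-dimensional, and how well an attack transfers depends on how closely the substitute matches the target.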