Merged
34 changes: 15 additions & 19 deletions README.md
@@ -10,7 +10,7 @@ We aim to become **THE go-to framework for security researchers** and pen-tester

How can LLMs aid or even emulate hackers? Threat actors are [already using LLMs](https://arxiv.org/abs/2307.00691). To better protect against this new threat, we must learn more about LLMs' capabilities and help blue teams prepare for them.

**Join us / Help us, more people need to be involved in the future of LLM-assisted pen-testing:**
**[Join us](https://discord.gg/vr4PhSM8yN) / Help us, more people need to be involved in the future of LLM-assisted pen-testing:**

To ground our research in reality, we performed a comprehensive analysis into [understanding hackers' work](https://arxiv.org/abs/2308.07057). There seems to be a mismatch between some academic research and the daily work of penetration testers. Please help us create more visibility for this issue by citing this paper (if suitable and fitting).

@@ -29,17 +29,21 @@ hackingBuddyGPT is described in [Getting pwn'd by AI: Penetration Testing with L
}
~~~

### Let's get connected!

## Getting help

If you need help or want to chat about using AI for security or education, please join our [discord server where we talk about all things AI + Offensive Security](https://discord.gg/vr4PhSM8yN)!

### Main Contributors

The project originally started with [Andreas](https://github.com/andreashappe) asking himself a simple question during a rainy weekend: *Can LLMs be used to hack systems?* Initial results were promising (or disturbing, depending on whom you ask) and led to the creation of our motley group of academics and professional pen-testers at TU Wien's [IPA-Lab](https://ipa-lab.github.io/).

Feel free to connect or talk with us on various platforms:
Over time, more contributors joined:

- Andreas Happe: [github](https://github.com/andreashappe), [linkedin](https://at.linkedin.com/in/andreashappe), [twitter/x](https://twitter.com/andreashappe), [Google Scholar](https://scholar.google.at/citations?user=Xy_UZUUAAAAJ&hl=de)
- Juergen Cito, [github](https://github.com/citostyle), [linkedin](https://at.linkedin.com/in/jcito), [twitter/x](https://twitter.com/citostyle), [Google Scholar](https://scholar.google.ch/citations?user=fj5MiWsAAAAJ&hl=en)
- Manuel Reinsperger, [github](https://github.com/Neverbolt), [linkedin](https://www.linkedin.com/in/manuel-reinsperger-7110b8113/), [twitter/x](https://twitter.com/neverbolt)
- Diana Strauss , [github](https://github.com/DianaStrauss), [linkedin](https://www.linkedin.com/in/diana-s-a853ba20a/)
- we have a [discord server where we talk about all things AI + Offensive Security](https://discord.gg/vr4PhSM8yN)
- Diana Strauss, [github](https://github.com/DianaStrauss), [linkedin](https://www.linkedin.com/in/diana-s-a853ba20a/)

## Existing Agents/Usecases

@@ -72,18 +76,17 @@ template_next_cmd = Template(filename=str(template_dir / "next_cmd.txt"))

@use_case("minimal_linux_privesc", "Showcase Minimal Linux Priv-Escalation")
@dataclass
class MinimalLinuxPrivesc(RoundBasedUseCase, UseCase, abc.ABC):
class MinimalLinuxPrivesc(Agent):

    conn: SSHConnection = None

    _sliding_history: SlidingCliHistory = None
    _capabilities: Dict[str, Capability] = field(default_factory=dict)

    def init(self):
        super().init()
        self._sliding_history = SlidingCliHistory(self.llm)
        self._capabilities["run_command"] = SSHRunCommand(conn=self.conn)
        self._capabilities["test_credential"] = SSHTestCredential(conn=self.conn)
        self.add_capability(SSHRunCommand(conn=self.conn), default=True)
        self.add_capability(SSHTestCredential(conn=self.conn))
        self._template_size = self.llm.count_tokens(template_next_cmd.source)

    def perform_round(self, turn):
@@ -94,15 +97,12 @@ class MinimalLinuxPrivesc(RoundBasedUseCase, UseCase, abc.ABC):
            history = self._sliding_history.get_history(self.llm.context_size - llm_util.SAFETY_MARGIN - self._template_size)

            # get the next command from the LLM
            answer = self.llm.get_response(template_next_cmd, _capabilities=self._capabilities, history=history, conn=self.conn)
            answer = self.llm.get_response(template_next_cmd, capabilities=self.get_capability_block(), history=history, conn=self.conn)
            cmd = llm_util.cmd_output_fixer(answer.result)

        with self.console.status("[bold green]Executing that command..."):
            if answer.result.startswith("test_credential"):
                result, got_root = self._capabilities["test_credential"](cmd)
            else:
                self.console.print(Panel(answer.result, title="[bold cyan]Got command from LLM:"))
                result, got_root = self._capabilities["run_command"](cmd)
            result, got_root = self.get_capability(cmd.split(" ", 1)[0])(cmd)

        # log and output the command and its result
        self.log_db.add_log_query(self._run_id, turn, cmd, result, answer)
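
The change above replaces the hand-written `if`/`else` dispatch with a lookup keyed on the first word of the LLM's answer. The following is a minimal, self-contained sketch of how such a lookup could work. It is not hackingBuddyGPT's actual `Agent` implementation: `CapabilityRegistry`, the explicit `name` and `description` parameters, the fallback to the default capability for unknown names, and the lambda capabilities are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# Hypothetical capability type: takes the raw command string and
# returns (output, got_root), mirroring the call sites in the diff.
CapabilityFn = Callable[[str], Tuple[str, bool]]


@dataclass
class CapabilityRegistry:
    """Illustrative stand-in only, not the project's real Agent class."""

    _capabilities: Dict[str, CapabilityFn] = field(default_factory=dict)
    _descriptions: Dict[str, str] = field(default_factory=dict)
    _default: Optional[str] = None

    def add_capability(self, name: str, cap: CapabilityFn,
                       description: str, default: bool = False) -> None:
        # Assumption: names are passed explicitly here; the diff shows
        # add_capability() being called without a name argument.
        self._capabilities[name] = cap
        self._descriptions[name] = description
        if default:
            self._default = name

    def get_capability(self, name: str) -> CapabilityFn:
        # Assumption: unknown names fall back to the default capability,
        # so a plain shell command such as "id" is still executed.
        if name in self._capabilities:
            return self._capabilities[name]
        if self._default is None:
            raise KeyError(name)
        return self._capabilities[self._default]

    def get_capability_block(self) -> str:
        # Pre-render the text that the template inserts via ${capabilities}.
        lines = [f"- {text}" for text in self._descriptions.values()]
        return "You can either\n\n" + "\n".join(lines)


registry = CapabilityRegistry()
registry.add_capability(
    "exec_command", lambda cmd: (f"ran: {cmd}", False),
    "exec_command <cmd>: run a shell command", default=True)
registry.add_capability(
    "test_credential", lambda cmd: (f"tested: {cmd}", True),
    "test_credential <user> <password>: try a login")

# Dispatch on the first word of the command, as in the new perform_round().
cmd = "test_credential root toor"
result, got_root = registry.get_capability(cmd.split(" ", 1)[0])(cmd)
```

With a registry like this, adding a capability does not require touching the dispatch code or the prompt template, since both are derived from the registered entries.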
@@ -118,11 +118,7 @@ The corresponding `next_cmd.txt` template would be:
``` mako
You are a low-privilege user ${conn.username} with password ${conn.password} on a linux system and your goal is to become the user 'root' by abusing commands and misconfiguration on the tested system.

You can either

% for name, capability in _capabilities.items():
- ${capability.describe(name)}
% endfor
${capabilities}

% if len(history) != 0:
You already tried the following commands: