@ShreyasMahajann ShreyasMahajann commented Aug 8, 2025

This feature lets hackingBuddyGPT be used through web shells or in other scenarios where no SSH credentials are available. The connection type is selected with `--conn`:
python src/hackingBuddyGPT/cli/wintermute.py LinuxPrivesc --conn=ssh
python src/hackingBuddyGPT/cli/wintermute.py LinuxPrivesc --conn=local_shell --conn.tmux_session=<tmux_session_name>

hackGptdemo.mp4

andreashappe and others added 7 commits April 24, 2025 21:52
Good news everyone! There's a new (and long overdue) version of hackingBuddyGPT out!

To summarize the big changes:

- @Neverbolt did extensive work on the configuration and logging system:
  - Reworked the configuration system
  - Added a visual and live web based log viewer, which can be started with `wintermute Viewer`
  - Updated the configuration system. The new configuration system now allows loading parameters from a .json file as well as choosing which logging backend should be used

- @lloydchang with @pardaz-banu, @halifrieri, @toluwalopeoolagbegi and @tushcmd added support for dev containers

- @jamfish added support for key-based SSH access (to the target system)

- @Qsan1 added a new use-case, focusing on enabling Linux priv-esc with small language models, to quote:
  - Added an extended linux-privesc usecase. It is based on 'privesc', but extends it with multiple components that can be freely switched on or off:
    - Analyze: After each iteration the LLM is asked to analyze the output of that round.
    - Retrieval Augmented Generation (RAG): After each iteration the LLM is prompted to generate a search query for a vector store. The search query is then used to retrieve relevant documents from the vector store, and the information is included in the prompt for the Analyze component (only works if Analyze is enabled).
    - Chain of thought (CoT): Instead of simply asking the LLM for the next command, we use CoT to generate the next action.
    - History Compression: Instead of including all commands and their respective output in the prompt, it removes all outputs except the most recent one.
    - Structure via Prompt: Include an initial set of command recommendations in `query_next_command`
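
As a rough illustration of the History Compression component described above, the following sketch keeps every command in the rendered history but drops all outputs except the most recent one. The `Round` dataclass and the `compress_history` function are hypothetical names chosen for this example; they are not taken from the hackingBuddyGPT code base.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Round:
    """One executed command and the output it produced (hypothetical type)."""
    command: str
    output: str


def compress_history(rounds: List[Round]) -> str:
    """Render the history, keeping only the output of the latest round."""
    parts = []
    last = len(rounds) - 1
    for i, r in enumerate(rounds):
        # Every command stays in the prompt so the model sees what was tried.
        parts.append(f"$ {r.command}")
        # Only the most recent output is kept; older outputs are dropped.
        if i == last and r.output:
            parts.append(r.output)
    return "\n".join(parts)
```

The trade-off is prompt size against context: the model still sees which commands were already run, but it can no longer re-read what the older ones printed.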

I thank all our contributors (and hopefully haven't forgotten too many). Enjoy!
bump dependencies and add Qsan1's documentation
Display query in the URL on failed request

Copilot AI left a comment

Pull Request Overview

This PR introduces local shell integration using tmux to enable hackingBuddyGPT to interact with shells without requiring SSH credentials. The feature allows users to connect to local tmux sessions for testing and development scenarios.

Key changes:

  • Added LocalShellConnection class for tmux-based shell interaction
  • Implemented LocalShellCapability for command execution through tmux
  • Updated LinuxPrivesc use case to support both SSH and local shell connections
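
A minimal sketch of what tmux-based shell interaction can look like, assuming the standard `tmux send-keys` and `tmux capture-pane` subcommands. The helper names (`send_keys_argv`, `capture_pane_argv`, `run_in_tmux`) and the injectable `runner` parameter are assumptions made for this illustration and are not the PR's `LocalShellConnection` API.

```python
import subprocess
from typing import List


def send_keys_argv(session: str, command: str) -> List[str]:
    """Build the tmux call that types `command` into the session and presses Enter.

    Caveat: tmux treats arguments that match key names as keys; a stricter
    version could send the text with `-l` and `Enter` in a separate call.
    """
    return ["tmux", "send-keys", "-t", session, command, "Enter"]


def capture_pane_argv(session: str, history_lines: int) -> List[str]:
    """Build the tmux call that prints the pane, including scrollback, to stdout."""
    return ["tmux", "capture-pane", "-p", "-t", session, "-S", f"-{history_lines}"]


def run_in_tmux(session: str, command: str, runner=subprocess.run) -> str:
    """Send a command and return the captured pane text.

    This sketch captures immediately; real code has to wait until the
    command has finished before reading the pane.
    """
    runner(send_keys_argv(session, command), check=True)
    result = runner(capture_pane_argv(session, 1000),
                    check=True, capture_output=True, text=True)
    return result.stdout
```

Passing the runner in as a parameter keeps the sketch testable without a running tmux server.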

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/hackingBuddyGPT/utils/local_shell/local_shell.py | Implements complete tmux-based shell connection with command execution, output capture, and completion detection |
| src/hackingBuddyGPT/utils/local_shell/__init__.py | Package initialization for local shell utilities |
| src/hackingBuddyGPT/usecases/privesc/linux.py | Updates LinuxPrivesc to support both SSH and local shell connections |
| src/hackingBuddyGPT/capabilities/local_shell.py | Implements capability for local shell command execution |
| README.md | Documentation updates explaining the new local shell feature and usage examples |

subprocess.run(['tmux', 'set-option', '-t', self.tmux_session, 'history-limit', '10000'],
capture_output=True)
raise RuntimeError(f"Error executing command: {e}")

Copilot AI Aug 8, 2025

The run_simple_fallback method is quite long (49 lines) and handles multiple responsibilities including setting tmux options, command execution, and output extraction. Consider breaking this into smaller, focused methods.

Suggested change
            self._set_tmux_history_limit(50000)
            clear_marker = self._send_clear_marker()
            self.send_command(command)
            self.wait_for_command_completion()
            end_marker = self._send_end_marker()
            output = self.capture_output(50000)
            result = self._extract_output_between_markers(output, clear_marker, end_marker, command)
            self._set_tmux_history_limit(10000)
            return result
        except Exception as e:
            self._set_tmux_history_limit(10000)
            raise RuntimeError(f"Error executing command: {e}")

    def _set_tmux_history_limit(self, limit):
        subprocess.run(['tmux', 'set-option', '-t', self.tmux_session, 'history-limit', str(limit)],
                       capture_output=True)

    def _send_clear_marker(self):
        clear_marker = f"__CLEAR_{uuid.uuid4().hex[:8]}__"
        self.send_command('clear')
        time.sleep(0.3)
        self.send_command(f'echo "{clear_marker}"')
        time.sleep(0.3)
        return clear_marker

    def _send_end_marker(self):
        end_marker = f"__END_{uuid.uuid4().hex[:8]}__"
        self.send_command(f'echo "{end_marker}"')
        time.sleep(0.5)
        return end_marker

    def _extract_output_between_markers(self, output, clear_marker, end_marker, command):
        lines = output.splitlines()
        start_idx = -1
        end_idx = -1
        for i, line in enumerate(lines):
            if clear_marker in line:
                start_idx = i
            elif end_marker in line and start_idx != -1:
                end_idx = i
                break
        if start_idx != -1 and end_idx != -1:
            result_lines = lines[start_idx + 1:end_idx]
            if result_lines and command in result_lines[0]:
                result_lines = result_lines[1:]
            result = '\n'.join(result_lines).strip()
        else:
            result = self._extract_recent_output(output, command)
        return result
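
The marker technique in the suggestion above can also be tried in isolation. The following is a self-contained sketch with hypothetical function names (`make_marker`, `extract_between_markers`). It is not the code from the PR, and unlike the suggestion it leaves the echoed command line in the result.

```python
import uuid
from typing import Optional


def make_marker(prefix: str) -> str:
    """Create a marker that is unlikely to collide with real command output."""
    return f"__{prefix}_{uuid.uuid4().hex[:8]}__"


def extract_between_markers(output: str, start_marker: str, end_marker: str) -> Optional[str]:
    """Return the text between the last start marker and the following end marker.

    Returns None when the markers are not found in that order.
    """
    lines = output.splitlines()
    start_idx = None
    for i, line in enumerate(lines):
        if start_marker in line:
            # Keep the latest match: the typed `echo "<marker>"` line contains
            # the marker too, and the real marker line follows it.
            start_idx = i
        elif start_idx is not None and end_marker in line:
            # The typed `echo "<end marker>"` line ends the region.
            return "\n".join(lines[start_idx + 1:i]).strip()
    return None
```

Returning `None` instead of falling back silently leaves it to the caller to decide what to do when the markers are missing, for example when the scrollback was truncated.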

result = self._extract_between_markers(final_output, start_marker, end_marker, command)
return result

except Exception as e:

Copilot AI Aug 8, 2025

Catching all exceptions with 'Exception' is too broad and can mask unexpected errors. Consider catching more specific exceptions like subprocess.CalledProcessError or RuntimeError.

Suggested change
- except Exception as e:
+ except (RuntimeError, subprocess.CalledProcessError) as e:
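
One detail matters when narrowing the handler this way: `subprocess.run` raises `subprocess.CalledProcessError` only when it is called with `check=True`. Without that argument a failing command returns normally with a non-zero `returncode`. The sketch below shows the pattern; `run_checked` is a hypothetical helper for illustration, not part of the PR.

```python
import subprocess
import sys
from typing import List


def run_checked(argv: List[str]) -> str:
    """Run a command and return its stdout, converting a failure into RuntimeError."""
    try:
        # check=True is what makes a non-zero exit status raise CalledProcessError.
        completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError as e:
        # Chain the original exception so the traceback keeps the real cause.
        raise RuntimeError(f"Error executing command: exit status {e.returncode}") from e
    return completed.stdout


if __name__ == "__main__":
    print(run_checked([sys.executable, "-c", "print('ok')"]))
```

Anything that is not a process failure, such as a programming error inside the `try` block, still propagates unchanged, which is the point of catching a narrow exception type.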

subprocess.run(['tmux', 'set-option', '-t', self.tmux_session, 'history-limit', '10000'],
capture_output=True)
raise RuntimeError(f"Error executing command: {e}")

Copilot AI Aug 8, 2025

Catching all exceptions with 'Exception' is too broad and can mask unexpected errors. Consider catching more specific exceptions like subprocess.CalledProcessError.

Suggested change
            return result
        except subprocess.SubprocessError as e:
            raise RuntimeError(f"Error executing command: {e}")
        finally:
            subprocess.run(['tmux', 'set-option', '-t', self.tmux_session, 'history-limit', '10000'],
                           capture_output=True)

elif end_marker in line and start_idx != -1:
end_idx = i
break

Copilot AI Aug 8, 2025

The history-limit is set to 50000 and then restored to 10000 in the finally block, but there's no guarantee the original value was 10000. Consider capturing and restoring the original value instead.

Suggested change
        original_history_limit = None
        try:
            # Get the current history-limit value
            result = subprocess.run(
                ['tmux', 'show-options', '-g', 'history-limit'],
                capture_output=True, text=True
            )
            if result.returncode == 0:
                # Output is like: 'history-limit 10000'
                match = re.search(r'history-limit\s+(\d+)', result.stdout)
                if match:
                    original_history_limit = match.group(1)
            # Set history-limit to 50000
            subprocess.run(['tmux', 'set-option', '-t', self.tmux_session, 'history-limit', '50000'],
                           capture_output=True)
            clear_marker = f"__CLEAR_{uuid.uuid4().hex[:8]}__"
            self.send_command('clear')
            time.sleep(0.3)
            self.send_command(f'echo "{clear_marker}"')
            time.sleep(0.3)
            self.send_command(command)
            self.wait_for_command_completion()
            end_marker = f"__END_{uuid.uuid4().hex[:8]}__"
            self.send_command(f'echo "{end_marker}"')
            time.sleep(0.5)
            output = self.capture_output(50000)
            lines = output.splitlines()
            start_idx = -1
            end_idx = -1
            for i, line in enumerate(lines):
                if clear_marker in line:
                    start_idx = i
                elif end_marker in line and start_idx != -1:
                    end_idx = i
                    break
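
One possible way to package the capture-and-restore idea is a context manager, so the restore step runs in a `finally` block whether or not the body raises. This is a sketch under assumptions: `parse_history_limit` and `temporary_history_limit` are hypothetical names, the option is read per session with `-t` rather than globally with `-g` as in the suggestion above, and a session without its own value is reset with tmux's `set-option -u` (unset) flag.

```python
import re
import subprocess
from contextlib import contextmanager
from typing import Optional


def parse_history_limit(show_options_output: str) -> Optional[int]:
    """Parse output such as 'history-limit 10000'; return None if it is absent."""
    match = re.search(r"history-limit\s+(\d+)", show_options_output)
    return int(match.group(1)) if match else None


@contextmanager
def temporary_history_limit(session: str, limit: int, runner=subprocess.run):
    """Raise history-limit for the duration of the block, then put it back."""
    shown = runner(["tmux", "show-options", "-t", session, "history-limit"],
                   capture_output=True, text=True)
    original = parse_history_limit(shown.stdout) if shown.returncode == 0 else None
    runner(["tmux", "set-option", "-t", session, "history-limit", str(limit)],
           capture_output=True)
    try:
        yield original
    finally:
        if original is not None:
            # Restore the value that was set on the session before.
            restore = ["tmux", "set-option", "-t", session, "history-limit", str(original)]
        else:
            # No session-level value was found: unset it so the session
            # inherits the global option again.
            restore = ["tmux", "set-option", "-u", "-t", session, "history-limit"]
        runner(restore, capture_output=True)
```

A caller would wrap the command execution in `with temporary_history_limit(session, 50000): ...`, which removes the need to repeat the restore call in every `except` branch.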

last_output_hash = None
last_cursor_pos = None
stable_count = 0
min_stable_time = 1.5 # Reduced for faster detection

Copilot AI Aug 8, 2025

The magic number 1.5 for min_stable_time should be made configurable or defined as a class constant to improve maintainability.
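
A sketch of one way to act on this: keep the default as a class constant and let callers override it through the constructor. `StabilityWaiter` and its parameters are hypothetical names for this illustration. The sketch compares raw output text, whereas the code under review tracks an output hash and the cursor position.

```python
import time
from typing import Callable


class StabilityWaiter:
    """Waits until polled output has stopped changing for a minimum time."""

    DEFAULT_MIN_STABLE_TIME = 1.5  # seconds; named constant instead of a magic number

    def __init__(self, min_stable_time: float = DEFAULT_MIN_STABLE_TIME,
                 poll_interval: float = 0.1,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.min_stable_time = min_stable_time
        self.poll_interval = poll_interval
        self.clock = clock
        self.sleep = sleep

    def wait(self, read_output: Callable[[], str], timeout: float = 30.0) -> bool:
        """Return True once output was stable long enough, False on timeout."""
        start = self.clock()
        last = read_output()
        stable_since = start
        while self.clock() - start < timeout:
            if self.clock() - stable_since >= self.min_stable_time:
                return True
            self.sleep(self.poll_interval)
            current = read_output()
            if current != last:
                # Output changed: restart the stability window.
                last = current
                stable_since = self.clock()
        return False
```

Injecting `clock` and `sleep` is a design choice that lets tests drive the timing without real delays; production code can rely on the defaults.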

)

def __call__(self, cmd: str) -> Tuple[str, bool]:
out, _, _ = self.conn.run(cmd) # This is CORRECT - use the commented version

Copilot AI Aug 8, 2025

The comment '# This is CORRECT - use the commented version' is confusing and doesn't provide useful information. Consider removing it or making it more descriptive.

Suggested change
- out, _, _ = self.conn.run(cmd) # This is CORRECT - use the commented version
+ out, _, _ = self.conn.run(cmd)

@ShreyasMahajann ShreyasMahajann changed the base branch from main to development August 8, 2025 17:14
@andreashappe andreashappe merged commit bc22dff into ipa-lab:development Aug 27, 2025
2 checks passed
3 participants