Owner

@xukunzh xukunzh commented Jun 4, 2025

Files Added

  • scripts/frida/java_monitor.js - Frida hook script for Java API monitoring
  • scripts/frida/log_converter.py - Converts Frida logs to capa JSON format
  • scripts/frida/README.md - Documents the complete workflow process

Files Changed

  • features/extractors/frida/extractor.py - FridaExtractor class
  • features/extractors/frida/models.py - Data models

java_monitor.js

  • Currently implements basic Java file operation monitoring as a proof of concept
  • Q: Should I go ahead and implement the selective API list idea from the proposal? It would automatically extract the APIs referenced in test_rules to build the monitoring list, then auto-generate the monitoring script from that list.
  • Q: How many APIs do capa's sandbox integrations typically monitor? I think hooking a few hundred Java APIs with Frida is probably the upper limit. Is that right?
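
The selective API list idea could be sketched roughly as follows. This is only an illustration, not code from this PR: the helper name `collect_api_names`, the example rule, and the assumption that rules list APIs on lines of the form `- api: <name>` (optionally followed by `= description`) are all hypothetical here.

```python
import re
from typing import Iterable, Set

# Assumption: API features appear in rule text as "- api: <name>",
# optionally followed by "= <description>", one per line.
API_LINE = re.compile(
    r"^[ \t]*-[ \t]*api:[ \t]*(?P<name>[^=#\n]+?)[ \t]*(?:=.*)?$",
    re.MULTILINE,
)


def collect_api_names(rule_texts: Iterable[str]) -> Set[str]:
    """Collect the unique API names mentioned across the given rule texts."""
    names: Set[str] = set()
    for text in rule_texts:
        for match in API_LINE.finditer(text):
            names.add(match.group("name").strip())
    return names


# Hypothetical rule used only to exercise the helper.
EXAMPLE_RULE = """
rule:
  meta:
    name: hypothetical file operations
  features:
    - or:
      - api: java.io.File.delete
      - api: java.io.FileOutputStream.write = write bytes to a file
"""

if __name__ == "__main__":
    for name in sorted(collect_api_names([EXAMPLE_RULE])):
        print(name)
```

The resulting set could then drive generation of the hook script, and its size gives a quick check against whatever hook count turns out to be practical for Frida.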

log_converter.py

  • Converts raw Frida logs to capa-compatible JSON format
  • Current JSON fields: api, arguments, thread_id, timestamp, caller, return_value
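
A minimal sketch of the conversion step, under a loud assumption: the raw log format is not shown in this PR, so the code below supposes that each raw log line is a JSON object and that non-JSON lines are noise to skip. The function name `convert_lines` is hypothetical; only the six field names come from the list above.

```python
import json
from typing import Any, Dict, Iterable, List

# Field names taken from the PR description.
FIELDS = ("api", "arguments", "thread_id", "timestamp", "caller", "return_value")


def convert_lines(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Normalize raw log lines into records that carry exactly the six fields."""
    records: List[Dict[str, Any]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: anything that is not JSON is log noise
        if not isinstance(raw, dict) or "api" not in raw:
            continue  # a record without an API name is not usable
        record = {name: raw.get(name) for name in FIELDS}
        if record["arguments"] is None:
            record["arguments"] = []  # keep the arguments field list-typed
        records.append(record)
    return records


if __name__ == "__main__":
    sample = ['{"api": "java.io.File.delete", "thread_id": 7}', "not json"]
    print(json.dumps(convert_lines(sample), indent=2))
```

Missing fields are kept as `null` rather than dropped, so downstream code can rely on every record having the same keys.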

extractor.py & models.py

  • Only updated to use the from_json_file() method instead of from_frida_log()
  • Now follows capa's standard model: JSON file → capa engine → feature extraction
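
To illustrate the JSON file → model step, here is a rough sketch using plain dataclasses. It is not the models.py from this PR: the class names `CallSketch` and `ReportSketch` are hypothetical, and the assumption that the file holds a JSON list of call records is mine.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class CallSketch:
    """One monitored API call (hypothetical stand-in for the real model)."""

    api: str
    arguments: List[Any] = field(default_factory=list)
    thread_id: Optional[int] = None
    timestamp: Optional[float] = None
    caller: Optional[str] = None
    return_value: Any = None


@dataclass
class ReportSketch:
    """All calls loaded from one converted log file."""

    calls: List[CallSketch]

    @classmethod
    def from_json_file(cls, path: str) -> "ReportSketch":
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: the file contains a JSON list of call records whose
        # keys match the CallSketch field names.
        return cls(calls=[CallSketch(**record) for record in data])
```

An extractor built on such a model would only ever read the JSON file, which keeps log parsing out of the feature extraction path.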

yield from self.global_features
"""Basic global features"""
yield OS("android"), NO_ADDRESS
yield Arch("aarch64"), NO_ADDRESS
Collaborator

There is no guarantee that the architecture is aarch64. Better to remove this.

Collaborator

@mike-hunhoff mike-hunhoff left a comment

Great first commit, thank you! See my comments

    """
    def __init__(self, report: FridaReport):
        super().__init__(
            hashes=SampleHashes(md5="", sha1="", sha256="")
Collaborator

Is it possible for Frida to log these? If not, we may need to require users to provide both the Frida-generated log file and the original file to capa, as we do with other extractors, e.g. BinExport, VMRay, etc.

Owner Author

From what I’ve found, Frida cannot access the original APK file to compute hashes at runtime. I've marked this as a TODO and will revisit it. :)

Collaborator

I think it would be possible for Frida to access the local storage and find the APK, then compute the digest.

But I agree it is not very important at this stage, you can revisit this part later. For the time being, maybe give the digests via command line?
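
If the user supplies the original APK alongside the Frida log, the digests could be computed on the host with the standard library. A minimal sketch, assuming a hypothetical helper named `compute_digests` and a local file path; how the result would be passed to SampleHashes is left open here.

```python
import hashlib
from typing import Dict


def compute_digests(path: str, chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Return hex MD5, SHA-1, and SHA-256 digests of the file at `path`."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large APKs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            sha256.update(chunk)
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "sha256": sha256.hexdigest(),
    }
```

The returned dictionary uses the same three keys as the hashes shown in the reviewed snippet, so the values could replace the empty strings once a file path is available.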

@xukunzh xukunzh merged commit afe17ed into master Jun 13, 2025

4 participants