Add vulnerability location + dna #120

ErebusZ · 2025-02-27T18:41:30Z

This PR computes and adds the dna attribute to the whatweb agent:
dna schema:

{
    "location": "location vulnerability showcasing the file path and bundle/package_name",
    "vuln_data": "vulnerability info containing target, library_type, library name."
}

codecov · 2025-02-27T18:43:51Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.58%. Comparing base (f4e84d5) to head (e5fb8ca).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #120      +/-   ##
==========================================
+ Coverage   97.51%   97.58%   +0.07%     
==========================================
  Files           3        3              
  Lines         683      703      +20     
==========================================
+ Hits          666      686      +20     
  Misses         17       17

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

oussamaessaji

I don't see the location in this PR

ostorlab-ai-pr-review

Code Review Summary

Total Issues Found: 4
Critical Issues: 1
Suggestions: 3

Key Findings:
The review identified four issues across the codebase. In 'agent/whatweb_agent.py', there are suggestions to add inline documentation to clarify the structure of 'vulnerable_target_data', update the type hint for 'vulnerability_location' to Optional[str], and importantly, implement inline validations to ensure required keys are present—this validation issue is critical as it can lead to runtime errors if data is malformed. In 'tests/whatweb_test.py', replacing the hard-coded JSON string with a dictionary constant (and comparing with json.loads) was suggested to improve readability and ensure order-insensitive comparisons. Overall, the code demonstrates a good base structure but would benefit significantly from improved documentation, stricter type handling, and robust validations to prevent potential issues.

ostorlab-ai-pr-review · 2025-02-27T18:49:59Z

agent/whatweb_agent.py

@@ -477,6 +477,9 @@ def _send_detected_fingerprints(
                elif isinstance(target, IPTarget) and target.version == 6:
                    self.emit(selector=IP_V6_LIB_SELECTOR, data=msg_data)

+                dna = _prepare_vulnerability_dna(


Add documentation and inline comments to clarify the expected structure of vulnerable_target_data. 2. Update the type hint for vulnerability_location to Optional[str] to denote that it may be None. 3. Add inline validations to ensure required keys are present in vulnerable_target_data.

Suggested change

dna = _prepare_vulnerability_dna(

from typing import Dict, Optional

def _prepare_vulnerability_dna(self, vulnerable_target_data: Dict, vulnerability_location: Optional[str]) -> str:

"""

Compute the DNA attribute for the vulnerability.

Parameters:

vulnerable_target_data (Dict): A dictionary representing the target vulnerability data. Expected structure:

{

"id": <vulnerability identifier>,

"name": <vulnerability name>,

"details": <vulnerability details>,

// Optionally, "version": <version string> if applicable

}

vulnerability_location (Optional[str]): The location of the vulnerability. If None, an empty string is used.

Returns:

str: The computed DNA string.

"""

# Validate required keys in vulnerable_target_data

required_keys = ["id", "name", "details"]

for key in required_keys:

if key not in vulnerable_target_data:

raise ValueError(f"Missing required key '{key}' in vulnerable_target_data.")

# Ensure vulnerability_location is a valid string

if vulnerability_location is None:

vulnerability_location = ""

# Compute the DNA attribute by concatenating key components

dna_components = [

str(vulnerable_target_data.get("id")),

vulnerable_target_data.get("name", ""),

vulnerability_location

]

dna_str = "-".join(dna_components)

return dna_str

ostorlab-ai-pr-review · 2025-02-27T18:50:00Z

tests/whatweb_test.py

@@ -118,6 +118,11 @@ def testWhatWebAgent_withLinkMsgAndAllChecksEnabled_emitsFingerprints(
            assert any(
                vuln_msg.data.get("security_issue") is True for vuln_msg in agent_mock
            )
+            assert any(


Replace the hard-coded JSON string with a dictionary constant and compare using json.loads to ensure order-insensitive comparisons. This also improves readability by breaking long string literals.

Suggested change

assert any(

import json

# Define the expected DNA output as a constant dictionary

EXPECTED_DNA = {

"info": {

"subinfo": "value",

"another": "value"

}

}

# Use json.loads to parse the actual output and compare with the expected dictionary

self.assertEqual(json.loads(result["dna"]), EXPECTED_DNA)

nassermohamedit0 · 2025-02-27T19:00:39Z

tests/whatweb_test.py

+                vuln_msg.data.get("dna")
+                == '{"location": {"domain_name": {"name": "ostorlab.co"}, "metadata": [{"type": "PORT", "value": "80"}]}, "vuln_data": {"detail": "Found fingerprint `cloudflare`, of type `BACKEND_COMPONENT` in target `ostorlab.co`", "library_name": "cloudflare", "library_type": "BACKEND_COMPONENT", "name": "ostorlab.co", "port": 80, "schema": "http"}}'
+                for vuln_msg in agent_mock
+            )


Do explicit check please. Same for the other asserts.

nmasdoufi-ol · 2025-02-28T09:27:52Z

tests/whatweb_test.py

+                vuln_msg.data.get("dna")
+                == '{"location": {"ipv6": {"host": "2a00:1450:4006:80c::2004", "mask": "128", "version": 6}, "metadata": [{"type": "PORT", "value": "443"}]}, "vuln_data": {"detail": "Found fingerprint `lighttpd/1.4.28`, of type `BACKEND_COMPONENT` in target `2a00:1450:4006:80c::2004`", "host": "2a00:1450:4006:80c::2004", "library_name": "lighttpd/1.4.28", "library_type": "BACKEND_COMPONENT", "mask": "128", "port": 443, "version": 6}}'
+                for vuln_msg in agent_mock
+            )


explicit please

deadly-panda · 2025-02-28T10:30:04Z

agent/whatweb_agent.py

@@ -495,6 +498,7 @@ def _send_detected_fingerprints(
                    f"of type `{fingerprint_type}` in target `{target.name}`",
                    risk_rating=vuln_mixin.RiskRating.INFO,
                    vulnerability_location=vulnerable_target_data,
+                    dna=dna,


This is an info finding, skip it please, we don't need to compute its DNA..
Skip the whole agent.. you can close the PR.

Add vulnerability location + dna

e5fb8ca

ErebusZ requested a review from a team as a code owner February 27, 2025 18:41

ErebusZ added the github label Feb 27, 2025

oussamaessaji reviewed Feb 27, 2025

View reviewed changes

ostorlab-ai-pr-review bot reviewed Feb 27, 2025

View reviewed changes

nassermohamedit0 requested changes Feb 27, 2025

View reviewed changes

ohachimOs approved these changes Feb 27, 2025

View reviewed changes

Oussama-Bakri approved these changes Feb 28, 2025

View reviewed changes

nmasdoufi-ol reviewed Feb 28, 2025

View reviewed changes

deadly-panda reviewed Feb 28, 2025

View reviewed changes

ErebusZ closed this Feb 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add vulnerability location + dna #120

Add vulnerability location + dna #120

ErebusZ commented Feb 27, 2025

codecov bot commented Feb 27, 2025

oussamaessaji left a comment

ostorlab-ai-pr-review bot left a comment

ostorlab-ai-pr-review bot Feb 27, 2025

ostorlab-ai-pr-review bot Feb 27, 2025

nassermohamedit0 Feb 27, 2025

nmasdoufi-ol Feb 28, 2025

deadly-panda Feb 28, 2025 •

edited

Loading

-                dna = _prepare_vulnerability_dna(
+from typing import Dict, Optional
+def _prepare_vulnerability_dna(self, vulnerable_target_data: Dict, vulnerability_location: Optional[str]) -> str:
+    """
+    Compute the DNA attribute for the vulnerability.
+    Parameters:
+        vulnerable_target_data (Dict): A dictionary representing the target vulnerability data. Expected structure:
+            {
+                "id": <vulnerability identifier>,
+                "name": <vulnerability name>,
+                "details": <vulnerability details>,
+                // Optionally, "version": <version string> if applicable
+            }
+        vulnerability_location (Optional[str]): The location of the vulnerability. If None, an empty string is used.
+    Returns:
+        str: The computed DNA string.
+    """
+    # Validate required keys in vulnerable_target_data
+    required_keys = ["id", "name", "details"]
+    for key in required_keys:
+        if key not in vulnerable_target_data:
+            raise ValueError(f"Missing required key '{key}' in vulnerable_target_data.")
+    # Ensure vulnerability_location is a valid string
+    if vulnerability_location is None:
+        vulnerability_location = ""
+    # Compute the DNA attribute by concatenating key components
+    dna_components = [
+        str(vulnerable_target_data.get("id")),
+        vulnerable_target_data.get("name", ""),
+        vulnerability_location
+    ]
+    dna_str = "-".join(dna_components)
+    return dna_str

-            assert any(
+import json
+# Define the expected DNA output as a constant dictionary
+EXPECTED_DNA = {
+    "info": {
+        "subinfo": "value",
+        "another": "value"
+    }
+}
+# Use json.loads to parse the actual output and compare with the expected dictionary
+self.assertEqual(json.loads(result["dna"]), EXPECTED_DNA)

Add vulnerability location + dna #120

Add vulnerability location + dna #120

Conversation

ErebusZ commented Feb 27, 2025

codecov bot commented Feb 27, 2025

Codecov Report

oussamaessaji left a comment

Choose a reason for hiding this comment

ostorlab-ai-pr-review bot left a comment

Choose a reason for hiding this comment

Code Review Summary

ostorlab-ai-pr-review bot Feb 27, 2025

Choose a reason for hiding this comment

ostorlab-ai-pr-review bot Feb 27, 2025

Choose a reason for hiding this comment

nassermohamedit0 Feb 27, 2025

Choose a reason for hiding this comment

nmasdoufi-ol Feb 28, 2025

Choose a reason for hiding this comment

deadly-panda Feb 28, 2025 • edited Loading

Choose a reason for hiding this comment

deadly-panda Feb 28, 2025 •

edited

Loading