feat(netwitness): add NetWitness collector (#428) #429

Open
SamuelHassine wants to merge 2 commits into main from feature/428-netwitness-collector

@SamuelHassine (Member) commented Jun 16, 2026

Summary

Adds an OpenAEV collector for NetWitness (requested in #428). It validates OpenAEV detection expectations by querying the NetWitness Core SDK (NWQL) for sessions that match the attack signatures produced during a simulation, then reports a verdict and a trace linking back to NetWitness Investigate.

Modeled on the existing splunk-es / elastic SIEM collector pattern (pyoaev CollectorDaemon plus expectation/trace service providers), with the SIEM-specific layer implemented for the NetWitness Core SDK.

What it does

  • Builds an NWQL query from the attack signatures, runs it against the Core SDK, and parses the per-session result metadata
  • Authentication via HTTP basic (Core SDK) or a bearer token
  • Signature matching on ip.src / ip.dst, and parent process via the url meta, bounded by the expectation time window
  • Retry mechanism with a configurable offset to handle ingestion latency
  • Trace generation with links to NetWitness Investigate
  • Detection expectations only (NetWitness is a detection/NDR source)
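The retry-with-offset behavior in the list above can be sketched as follows. This is a minimal illustration, not the collector's actual code: `fetch_with_retry`, `max_retry`, and `offset_seconds` are hypothetical names, and the real client may differ in how it counts attempts or handles errors.

```python
import time
from typing import Callable, List


def fetch_with_retry(
    fetch: Callable[[], List[dict]],
    max_retry: int = 3,
    offset_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call `fetch` until it returns sessions or the retries are exhausted.

    An empty result is treated as "not ingested yet", so the function waits
    `offset_seconds` before asking again. All names here are assumptions.
    """
    attempts = max_retry + 1  # one initial attempt plus `max_retry` retries
    for attempt in range(attempts):
        sessions = fetch()
        if sessions:
            return sessions
        if attempt < attempts - 1:
            # Give the SIEM time to ingest the sessions before the next query.
            sleep(offset_seconds)
    return []
```

Injecting `sleep` keeps the function testable without real delays, which is similar in spirit to the sleep patching described for the service tests.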

API endpoints

  • GET /sdk?msg=query&query=<NWQL>&force-content-type=application/json
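As a rough sketch of how a request to this endpoint could be assembled, the snippet below builds the query URL and an `Authorization` header for either auth mode. The helper names, the base URL, and the sample NWQL string are placeholders for illustration only; they are not taken from the collector's `client_api.py`.

```python
import base64
from typing import Dict, Optional
from urllib.parse import urlencode


def build_sdk_query_url(base_url: str, nwql: str) -> str:
    """Return the GET URL for the /sdk query endpoint listed above."""
    params = {
        "msg": "query",
        "query": nwql,  # urlencode percent-encodes spaces and operators
        "force-content-type": "application/json",
    }
    return f"{base_url.rstrip('/')}/sdk?{urlencode(params)}"


def build_auth_header(
    username: Optional[str] = None,
    password: Optional[str] = None,
    token: Optional[str] = None,
) -> Dict[str, str]:
    """Prefer a bearer token; otherwise fall back to HTTP basic auth."""
    if token:
        return {"Authorization": f"Bearer {token}"}
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError("either a token or a username/password pair is required")


# Hypothetical host and placeholder query, shown only to illustrate the shape.
url = build_sdk_query_url("https://nw.example.invalid", "select * where ip.src = 10.0.0.5")
```

Whether the token takes precedence over basic credentials when both are configured is an assumption of this sketch.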

Review feedback addressed

  • Fixed a detection-matching bug: _match_with_detection_helper initialized parent_process_match to False, so IP-only expectations (those without a parent_process_name signature) could never match. It now defaults to True and the parent-process check is enforced only when such a signature is present. Added regression tests for the IP-only match/no-match paths that the suite previously did not cover.
  • Removed the dead NetWitnessAlert._raw PrivateAttr (Pydantic v2 ignores private attrs passed to the constructor, so it was always None and never read).
  • Lowered the "no UUIDs found" parent-process / url parser logs from warning to debug (non-matching values are a normal case and should not be warning-level noise).
  • Corrected the _build_trace_url_from_expectation docstring to match the actual behavior (an Investigate query hint from source/destination IPs only).
  • Fixed the README configuration section: the loader selects a single source (first of .env, config.yml, environment variables); it does not merge them.
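The matching fix in the first item can be illustrated with a simplified stand-in. The real `_match_with_detection_helper` relies on the OpenAEV detection helper; the function below, its flat-dict inputs, and the `source_ipv4_address` / `target_ipv4_address` key names are assumptions made only to show the effect of the default.

```python
from typing import Mapping

# Assumed key names; the source only states that the keys end in "_ipv4_address".
IP_KEYS = ("source_ipv4_address", "target_ipv4_address")


def matches(expected: Mapping[str, str], detected: Mapping[str, str]) -> bool:
    """Simplified illustration of the corrected matching logic."""
    if not expected:
        return False

    # Before the fix this started as False, so the guard below rejected every
    # expectation that carried no parent_process_name signature.
    parent_process_match = True
    if "parent_process_name" in expected:
        parent_process_match = (
            expected["parent_process_name"] == detected.get("parent_process_name")
        )
    if not parent_process_match:
        return False

    # Every IP signature present on the expectation must agree with the session.
    expected_ips = [key for key in IP_KEYS if key in expected]
    return all(expected[key] == detected.get(key) for key in expected_ips)
```

With the default set to `True`, an IP-only expectation is decided by the IP comparison alone, which corresponds to the IP-only match and no-match paths the new regression tests cover.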

Known follow-ups (shared across the SIEM collector family, intentionally not changed here)

  • The query time bound uses a rolling now - time_window and does not honor explicit start_date / end_date signature values. This matches splunk-es / elastic / qradar / logrhythm; honoring explicit dates should be a single coordinated change across all of them.
  • IPv6 is not fully wired end to end: _build_query filters on the IPv4 ip.src / ip.dst meta and the converter writes into the *_ipv4_address keys, so IPv6 expectations do not match. Same shape as the sibling collectors; full IPv6 support needs a coordinated query+converter change.
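To make the first follow-up concrete, the sketch below contrasts the current rolling bound with one possible way of honoring explicit dates. `rolling_window` mirrors the `now - time_window` behavior described above; `resolve_window` is a hypothetical shape for the coordinated change, not something this PR implements, and it assumes ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional, Tuple


def rolling_window(
    time_window: timedelta, now: Optional[datetime] = None
) -> Tuple[datetime, datetime]:
    """Current behavior: the query covers [now - time_window, now]."""
    now = now or datetime.now(timezone.utc)
    return now - time_window, now


def resolve_window(
    signatures: Mapping[str, str],
    time_window: timedelta,
    now: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Hypothetical follow-up: explicit start_date / end_date override the bound."""
    start, end = rolling_window(time_window, now)
    if "start_date" in signatures:
        start = datetime.fromisoformat(signatures["start_date"])
    if "end_date" in signatures:
        end = datetime.fromisoformat(signatures["end_date"])
    return start, end
```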

Tests

  • Unit + flow tests green locally and in CI (Test netwitness), now including IP-only matching coverage
  • black / isort / flake8 clean
  • Registered in .circleci/config.yml (docker build + publish)

Notes

  • The collector icon (src/img/netwitness-logo.png) is a placeholder and should be replaced with the official NetWitness logo before release.

Closes #428

Detection-expectation collector that queries the NetWitness Core SDK (NWQL) to validate detections. Modeled on the splunk-es/elastic collector pattern. Supports basic auth or bearer token, source/destination IP (ip.src/ip.dst) and parent-process (url) matching, retry with offset, and is registered in the CircleCI docker build/publish pipeline.
Copilot AI review requested due to automatic review settings June 16, 2026 14:59
@Filigran-Automation added the "filigran team" label Jun 16, 2026

Copilot AI left a comment

Pull request overview

Adds a new OpenAEV “NetWitness” collector integration (requested in #428) that queries the NetWitness Core SDK (NWQL) to validate detection expectations, with retry/latency handling and trace links back to Investigate. It follows the existing collector pattern (CollectorDaemon + expectation/trace service providers) and wires the collector into CircleCI Docker image builds/publishing.

Changes:

  • Introduces NetWitness service layer: Core SDK query client, NWQL builder, response parsing, OAEV conversion, expectation matching, and trace creation.
  • Adds collector framework components (generic expectation handler/manager + trace manager + signature registry) and configuration models/samples.
  • Adds an extensive pytest suite (core services + flow tests) and registers Docker builds/publish steps in CircleCI.

Reviewed changes

Copilot reviewed 57 out of 62 changed files in this pull request and generated 6 comments.

File Description
netwitness/tests/test_trace_manager.py Unit tests for trace submission behavior (bulk + fallback).
netwitness/tests/test_signature_registry.py Unit tests for signature registry subscription/handler registration.
netwitness/tests/test_expectation_manager.py Unit tests for generic expectation manager bulk update/sleep/end-date logic.
netwitness/tests/test_expectation_handler.py Unit tests for generic expectation handler delegation and error wrapping.
netwitness/tests/test_create_collector.py Tests for collector initialization from env config.
netwitness/tests/test_collector_models.py Tests for Pydantic collector models (result/trace/summary).
netwitness/tests/services/test_trace_service.py Unit tests for NetWitness trace creation and link building.
netwitness/tests/services/test_parent_process_parser.py Tests for UUID extraction/URL query building for parent-process matching.
netwitness/tests/services/test_expectation_service_flow.py Flow tests covering end-to-end service processing/matching paths.
netwitness/tests/services/test_expectation_service_essential.py Essential tests for expectation service matching, batching, and result shaping.
netwitness/tests/services/test_converter_extra.py Extra branch coverage tests for converter edge cases.
netwitness/tests/services/test_converter_essential.py Essential converter tests for IP field extraction and filtering.
netwitness/tests/services/test_client_api_extra.py Extra tests for retry/error wrapping paths in client API.
netwitness/tests/services/test_client_api_essential.py Essential client API tests for auth, query building, and response parsing.
netwitness/tests/services/fixtures/factories.py Polyfactory-based fixtures for configs, alerts, results, and test data.
netwitness/tests/services/fixtures/__init__.py Package marker for service fixtures.
netwitness/tests/services/conftest.py Service-test fixtures and global logging/sleep patching.
netwitness/tests/services/__init__.py Package marker for service tests.
netwitness/tests/conftest.py Global test fixtures to mock OpenAEV client calls and config sources.
netwitness/tests/__init__.py Package marker for tests.
netwitness/src/services/utils/parent_process_parser.py Utility for parsing UUIDs from parent-process name / URL path patterns.
netwitness/src/services/utils/config_loader.py Loader wrapper around ConfigLoader with logging/error handling.
netwitness/src/services/utils/__init__.py Exports NetWitnessConfig helper.
netwitness/src/services/trace_service.py NetWitness trace service building ExpectationTrace models and links.
netwitness/src/services/models.py Pydantic models for NetWitness API results and grouped response parsing.
netwitness/src/services/expectation_service.py Core expectation processing: signature extraction, fetch/convert/match.
netwitness/src/services/exception.py NetWitness service exception hierarchy.
netwitness/src/services/converter.py Converts NetWitnessAlert into OAEV matching structures.
netwitness/src/services/client_api.py Core SDK client: auth/session setup, NWQL building, retries, parsing.
netwitness/src/services/__init__.py Public exports for NetWitness services/models/exceptions.
netwitness/src/py.typed Marks package as typed for type checkers.
netwitness/src/models/configs/netwitness_configs.py NetWitness configuration settings (auth, retries, time window, etc.).
netwitness/src/models/configs/config_loader.py Root settings loader and daemon config flattening for NetWitness collector.
netwitness/src/models/configs/collector_configs.py Base collector/OpenAEV settings models for this collector.
netwitness/src/models/configs/base_settings.py Shared BaseSettings configuration (env nesting, frozen, stripping, etc.).
netwitness/src/models/configs/__init__.py Exports config models used by the loader.
netwitness/src/models/__init__.py Exports ConfigLoader.
netwitness/src/config.yml.sample Sample YAML configuration for local/manual deployments.
netwitness/src/collector/trace_service_provider.py Protocol for trace service providers.
netwitness/src/collector/trace_manager.py Trace creation/submission orchestration with bulk + fallback logic.
netwitness/src/collector/signature_registry.py Registry for supported signatures and handler types.
netwitness/src/collector/models.py Collector-side Pydantic models for results/traces/summaries.
netwitness/src/collector/expectation_service_provider.py Protocol defining expectation service provider interface.
netwitness/src/collector/expectation_manager.py Generic manager for fetching, processing, updating expectations and traces.
netwitness/src/collector/expectation_handler.py Generic handler delegating to a service provider and post-processing results.
netwitness/src/collector/exception.py Collector exception hierarchy (config/setup/processing/tracing).
netwitness/src/collector/collector.py CollectorDaemon integration wiring services/manager/helper together.
netwitness/src/collector/__init__.py Exposes Collector class.
netwitness/src/.env.sample Sample environment variable configuration.
netwitness/src/__main__.py Entry point for running the collector as a script/module.
netwitness/src/__init__.py Package exports.
netwitness/README.md NetWitness collector documentation and configuration guide.
netwitness/pyproject.toml Poetry project definition (deps, extras, tooling).
netwitness/manifest-metadata.json Collector marketplace/manifest metadata.
netwitness/Dockerfile_ubi9 UBI9 container build for the collector.
netwitness/Dockerfile Alpine container build for the collector.
netwitness/docker-compose.yml Compose file for running the collector container with env vars.
netwitness/.gitignore Ignores config.yml, dist, and caches for this collector.
netwitness/.dockerignore Excludes config.yml, dist, and caches from Docker context.
netwitness/.build.env Build env metadata for collector command.
.circleci/config.yml Adds NetWitness docker image build/save/tag/push steps to CircleCI.

Comment thread netwitness/src/services/expectation_service.py Outdated
Comment thread netwitness/src/services/client_api.py
Comment thread netwitness/src/services/converter.py
Comment thread netwitness/src/services/utils/parent_process_parser.py
Comment thread netwitness/src/services/utils/parent_process_parser.py
Comment thread netwitness/README.md Outdated
…ogs (#428)

- Fix _match_with_detection_helper: parent_process_match was initialized
  to False, so the "if not parent_process_match: return False" guard
  rejected every expectation with no parent_process_name signature (i.e.
  all IP-only expectations, the common case). Initialize it to True so the
  parent-process check is only enforced when such a signature is present,
  matching the documented "(if present)" behavior. Add regression tests
  for the IP-only match and no-match paths, which the existing suite never
  exercised (every success case included a parent signature).
- Remove the dead NetWitnessAlert._raw PrivateAttr: it was assigned
  through the constructor, which Pydantic v2 ignores, so it always stayed
  None and was never read.
- Lower the "no UUIDs found" logs in the parent-process parser from
  warning to debug: non-matching parent-process names and url metas are a
  normal case (most sessions are unrelated to OpenAEV injects) and should
  not produce warning-level noise.
- Correct the _build_trace_url_from_expectation docstring to describe the
  actual behavior: an Investigate query hint from source/destination IPs
  only, not the full NWQL query.
- Fix the README configuration section: the loader selects a single
  source (first of .env, config.yml, environment variables); it does not
  merge env vars over YAML over defaults.
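The log-level change in the third item can be sketched with a small UUID extractor. The logger name, the function name, and the sample URL path are hypothetical; the actual parser in `parent_process_parser.py` may use different patterns.

```python
import logging
import re
from typing import List

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("netwitness.parent_process_parser")

UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_uuids(value: str) -> List[str]:
    """Return every UUID found in a parent-process name or url meta value."""
    found = [match.lower() for match in UUID_RE.findall(value)]
    if not found:
        # Most sessions are unrelated to OpenAEV injects, so a value without
        # UUIDs is expected and is logged at debug rather than warning level.
        logger.debug("no UUIDs found in %r", value)
    return found
```

For example, `extract_uuids("/index.html")` returns an empty list and emits only a debug record, so ordinary traffic no longer fills the log at warning level.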
@SamuelHassine

Member Author

Review and fix summary

Did a full independent senior review of the NetWitness collector (shares the SIEM framework with splunk-es / elastic / qradar / logrhythm) alongside the 6 Copilot threads. Changes pushed in 3b10c8c:

  • Fixed a real detection-matching bug (src/services/expectation_service.py): _match_with_detection_helper initialized parent_process_match to False, so the if not parent_process_match: return False guard rejected every expectation with no parent_process_name signature (that is, all IP-only expectations, the common case). It now defaults to True, so the parent check is enforced only when such a signature is present. Added regression tests for the IP-only match/no-match paths that the suite never covered.
  • Removed dead NetWitnessAlert._raw PrivateAttr (src/services/models.py): always None, never read (Pydantic v2 ignores private attrs passed to the constructor).
  • Lowered noisy parser logs to debug (src/services/utils/parent_process_parser.py): a parent-process name / url meta without inject UUIDs is a normal case, not a warning.
  • Corrected the trace-URL docstring (src/services/trace_service.py): it builds an Investigate query hint from source/destination IPs only.
  • Fixed README configuration precedence: the loader selects a single source (first of .env, config.yml, environment variables) and does not merge them.
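The configuration behavior in the last item (select one source, do not merge) can be sketched as below. `select_config_source` and its return values are hypothetical and only illustrate the "first available wins" rule; the real loader in `config_loader.py` is built on settings models and may detect sources differently.

```python
from pathlib import Path


def select_config_source(base_dir: Path) -> str:
    """Pick exactly one configuration source, in the documented order.

    Nothing is merged: if a .env file exists, config.yml and the process
    environment are not consulted at all.
    """
    if (base_dir / ".env").is_file():
        return ".env"
    if (base_dir / "config.yml").is_file():
        return "config.yml"
    return "environment"
```

A practical consequence, under these assumptions, is that a value set only in the process environment has no effect while a `.env` or `config.yml` file is present.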

Reviewed and intentionally left as-is (shared across the whole SIEM collector family; flagged as cross-collector follow-ups rather than a one-off divergence here):

  • Query time bound uses a rolling now - time_window and does not honor explicit start_date / end_date signatures (same in splunk-es / elastic / qradar / logrhythm).
  • IPv6 is not fully wired: _build_query filters on the IPv4 ip.src / ip.dst meta and the converter writes the *_ipv4_address keys, so IPv6 expectations do not match. A converter-only change would be incomplete; this needs a coordinated query+converter change across the family.
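As an illustration of the IPv6 gap, a coordinated change would first need to tell the two address families apart before building the query and filling the converter keys. The helper below is a hypothetical first step using the standard `ipaddress` module; it is not part of this PR.

```python
import ipaddress
from typing import Iterable, List, Tuple


def split_by_ip_version(values: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate signature values into IPv4 and IPv6 addresses.

    Values that are not valid IP addresses are skipped. Today only the IPv4
    list would be usable, since the query and converter are IPv4-only.
    """
    ipv4: List[str] = []
    ipv6: List[str] = []
    for value in values:
        try:
            address = ipaddress.ip_address(value)
        except ValueError:
            continue
        (ipv4 if address.version == 4 else ipv6).append(str(address))
    return ipv4, ipv6
```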

Heads-up: splunk-es and logrhythm carried the same parent_process_match = False initialization. logrhythm was fixed in its own PR; splunk-es still needs the one-line fix.

Status:

  • All CI checks are green (Test netwitness incl. the new tests, all sibling tests, linter, formatting, signed commits, codecov patch/project, docker image build).
  • All 6 review threads replied to and resolved.
  • mergeable: MERGEABLE. The only thing left is a maintainer approval (REVIEW_REQUIRED); I could not approve since I am the PR author.

