feat(template): replacing template-zero with source-based template (#321) #336

Draft
guzmud wants to merge 25 commits into release/current from feature/321-rework-template

Conversation

@guzmud guzmud commented Apr 21, 2026

Member

Proposed changes

  • Adding or refactoring a template for collectors (depending on whether #320, "feat(template): kickstarting the collector template (#319)", is merged first)
  • Replacing the collector vs services split with a collector vs source split (cf. internal documentation for now)
  • Defining a static name for the custom configuration (previously it was named after the collector, `$name-of-your-collector`, and so changed for each collector)

Testing Instructions

  1. Copy-paste the template and create a new collector with it

Related issues

Checklist

  • I consider the submitted work as finished
  • I tested the code for its functionality
  • I wrote test cases for the relevant use cases
  • I added/updated the relevant documentation (either on github or on notion)
  • Where necessary I refactored code to improve the overall quality
  • For bug fix -> I implemented a test that covers the bug

Further comments

Nota bene: this branch was forked from #320 (hence the commits related to template zero)

Executive summary from the internal documentation

Collectors can be seen as data processing units. With this in mind, the existing work pointing towards a possible template can be turned into a more actionable template built around the definition of a source fed into a generic collector. This should lead to a new template that is more dev-friendly and easier to explain, use and maintain.

@github-actions github-actions Bot added the "filigran team" label (Item from the Filigran team.) Apr 21, 2026
@guzmud guzmud force-pushed the feature/321-rework-template branch 6 times, most recently from 113bce7 to cc0c717 Compare April 28, 2026 15:13
codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 23.92027% with 229 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.00%. Comparing base (03b39fa) to head (cc0c717).
⚠️ Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| `template/tests/services/conftest.py` | 3.22% | 90 Missing ⚠️ |
| `template/tests/services/fixtures/factories.py` | 8.75% | 73 Missing ⚠️ |
| `template/tests/test_create_collector.py` | 9.52% | 38 Missing ⚠️ |
| `template/tests/conftest.py` | 28.57% | 20 Missing ⚠️ |
| `template/src/models/settings/config_loader.py` | 65.21% | 8 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (03b39fa) and HEAD (cc0c717). Click for more details.

HEAD has 1 upload less than BASE
| Flag | BASE (03b39fa) | HEAD (cc0c717) |
| --- | --- | --- |
| connectors | 1 | 0 |

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #336       +/-   ##
===========================================
- Coverage   82.43%   69.00%   -13.43%     
===========================================
  Files          41      139       +98     
  Lines        1674     7076     +5402     
===========================================
+ Hits         1380     4883     +3503     
- Misses        294     2193     +1899     

| Flag | Coverage Δ |
| --- | --- |
| connectors | ? |

Flags with carried forward coverage won't be shown. Click here to find out more.
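
The percentages in the coverage diff above follow directly from the hits and lines counts. As a quick check, the sketch below recomputes them; the helper name `truncated_percent` is made up for illustration. The reported figures are consistent with truncating to two decimals rather than rounding, which is an observation from these numbers and not a statement about how Codecov computes them.

```python
def truncated_percent(hits: int, lines: int) -> float:
    """Coverage percentage truncated (not rounded) to two decimals."""
    # Integer arithmetic avoids floating-point surprises before the final division.
    return (hits * 10_000 // lines) / 100


# Figures copied from the coverage diff: (hits, lines) for base and head.
base = truncated_percent(1380, 1674)
head = truncated_percent(4883, 7076)
delta = round(head - base, 2)

# Misses are simply lines minus hits.
base_misses = 1674 - 1380
head_misses = 7076 - 4883

print(base, head, delta, base_misses, head_misses)
```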

@guzmud guzmud force-pushed the feature/321-rework-template branch 8 times, most recently from f8264b6 to 02d5277 Compare May 7, 2026 09:48
@guzmud guzmud changed the base branch from main to release/current May 7, 2026 09:49
@guzmud guzmud changed the title from "[template] feat(collector): replacing template-zero with source-based template" to "[template] feat(collector): replacing template-zero with source-based template (#321)" May 7, 2026
@guzmud guzmud force-pushed the feature/321-rework-template branch from 02d5277 to 4cd73ec Compare May 7, 2026 10:14
@guzmud guzmud force-pushed the feature/321-rework-template branch from 4cd73ec to 1caa57c Compare May 7, 2026 11:10
@guzmud guzmud requested a review from Copilot May 7, 2026 11:28

Copilot AI left a comment

Pull request overview

Introduces a new collector template that models collectors as a generic engine fed by an implementer-provided source (data fetcher + source data + signatures), replacing the prior “collector vs services” shape with a “collector vs source” split.

Changes:

  • Adds a reusable collector framework (engine, protocols, models, resilient uploaders) plus a minimal runnable template collector entrypoint.
  • Adds configuration system based on Pydantic Settings (YAML / .env / env var sources) and docker/packaging scaffolding.
  • Adds a unittest suite covering protocols, models, engine behavior, and uploaders.
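
To make the "collector vs source" split more concrete, here is a minimal sketch of the idea using only the standard library. It is not the code from this PR: the names `DataFetcher`, `Source`, `GenericEngine`, `StaticFetcher`, the `fetch`/`match` methods, and the matching rule are all assumptions chosen for illustration. The point is only that the engine depends on the source contract, while the implementer supplies the fetcher and the supported signatures.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class DataFetcher(Protocol):
    """Anything able to return raw records from the monitored product."""

    def fetch(self) -> Iterable[dict[str, str]]: ...


@dataclass(frozen=True)
class Source:
    """Implementer-provided part: how to fetch data, which signatures are supported."""

    name: str
    fetcher: DataFetcher
    supported_signatures: tuple[str, ...]


class GenericEngine:
    """Source-agnostic part: it relies only on the Source contract."""

    def __init__(self, source: Source) -> None:
        self.source = source

    def match(self, expectation: dict[str, str]) -> list[dict[str, str]]:
        # Keep only the signature types the source declares as supported.
        wanted = {
            key: value
            for key, value in expectation.items()
            if key in self.source.supported_signatures
        }
        if not wanted:
            return []
        # A record matches when it agrees with every supported signature.
        return [
            record
            for record in self.source.fetcher.fetch()
            if all(record.get(key) == value for key, value in wanted.items())
        ]


class StaticFetcher:
    """Hypothetical fetcher returning canned records instead of calling an API."""

    def fetch(self) -> Iterable[dict[str, str]]:
        return [
            {"hostname": "web-01", "process_name": "cmd.exe"},
            {"hostname": "db-01", "process_name": "bash"},
        ]


source = Source("demo", StaticFetcher(), ("hostname", "process_name"))
engine = GenericEngine(source)
matches = engine.match({"hostname": "web-01", "unsupported": "x"})
print(matches)
```

Swapping `StaticFetcher` for another class with a `fetch()` method would change the source without touching the engine, which is the property the split is after.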

Reviewed changes

Copilot reviewed 50 out of 65 changed files in this pull request and generated 23 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `template/.dockerignore` | Docker build context exclusions for config/artifacts/caches. |
| `template/.gitignore` | Git ignores for config/artifacts/caches. |
| `template/CONTRIBUTING.md` | Contributor docs for the template (currently mismatched with new layout). |
| `template/Dockerfile` | Container build/run for the template collector. |
| `template/README.md` | Template usage/configuration documentation (currently references old layout). |
| `template/docker-compose.yml` | Example compose service for running the template collector. |
| `template/manifest-metadata.json` | Collector manifest metadata for the template image. |
| `template/pyproject.toml` | Poetry/PEP621 config, deps/extras, dev tooling config, entrypoint script. |
| `template/src/__init__.py` | Package exports for ConfigLoader. |
| `template/src/__main__.py` | Module entrypoint calling main(). |
| `template/src/config.yml.sample` | Sample YAML config for the template. |
| `template/src/img/template-logo.png` | Collector icon asset placeholder. |
| `template/src/models/__init__.py` | Exports ConfigLoader. |
| `template/src/models/settings/__init__.py` | Exports settings model components. |
| `template/src/models/settings/base_settings.py` | Base Pydantic Settings config for nested env parsing and immutability. |
| `template/src/models/settings/collector_configs.py` | OpenAEV + collector settings models. |
| `template/src/models/settings/config_loader.py` | ConfigLoader that merges sources and flattens config for daemon. |
| `template/src/models/settings/template_configs.py` | Template-specific settings (key, time window, batch size). |
| `template/src/py.typed` | Marks package as typed (PEP 561). |
| `template/src/source/__init__.py` | Source package marker. |
| `template/src/source/template_data_fetcher.py` | Placeholder data fetcher implementation. |
| `template/src/source/template_signatures.py` | Placeholder supported signature list. |
| `template/src/source/template_source_data.py` | Placeholder source data model implementation. |
| `template/src/template_collector.py` | Runnable example that wires Source + BaseCollector and starts it. |
| `template/src/collector/__init__.py` | Collector package marker. |
| `template/src/collector/collector.py` | BaseCollector daemon wrapper that instantiates/configures the engine. |
| `template/src/collector/engines/__init__.py` | Engines package marker. |
| `template/src/collector/engines/basic.py` | Generic processing engine (fetch/filter/process expectations, upload results/traces). |
| `template/src/collector/helpers/__init__.py` | Helpers package marker. |
| `template/src/collector/internals/__init__.py` | Internals package marker. |
| `template/src/collector/internals/oaev_uploaders.py` | OpenAEV expectation/trace uploaders built on resilient uploader. |
| `template/src/collector/internals/resilient_uploader.py` | Generic bulk uploader with fallback to individual uploads. |
| `template/src/collector/models/__init__.py` | Models package marker. |
| `template/src/collector/models/data.py` | OAEVData and TraceData models. |
| `template/src/collector/models/exception.py` | Custom exception hierarchy. |
| `template/src/collector/models/expectations.py` | Expectation result/trace/summary models and formatting helpers. |
| `template/src/collector/models/source.py` | Source definition and default SourceHandler implementation. |
| `template/src/collector/protocols/__init__.py` | Protocols package marker. |
| `template/src/collector/protocols/data_fetcher.py` | Protocol for data fetchers. |
| `template/src/collector/protocols/engine.py` | Protocol for collector engines. |
| `template/src/collector/protocols/source_data.py` | Protocol for source data models. |
| `template/src/collector/protocols/source_handler.py` | Protocol for source handlers. |
| `template/src/collector/types/__init__.py` | Types package marker. |
| `template/src/collector/types/collector.py` | Type aliases for collector structures. |
| `template/src/collector/types/internals.py` | Type aliases for resilient uploader injection points. |
| `template/src/collector/utils/__init__.py` | Utils package marker. |
| `template/src/collector/utils/retroport_itertools.py` | Python 3.11-compatible batched() helper. |
| `template/tests/__init__.py` | Tests package marker. |
| `template/tests/test_template_collector.py` | Tests wiring of template_collector.main(). |
| `template/tests/collector/__init__.py` | Tests subpackage marker. |
| `template/tests/collector/test_collector.py` | Tests BaseCollector initialization/setup behaviors. |
| `template/tests/collector/engines/__init__.py` | Tests engines package marker. |
| `template/tests/collector/engines/test_basic.py` | Tests BasicCollectorEngine behaviors and flow. |
| `template/tests/collector/internals/__init__.py` | Tests internals package marker. |
| `template/tests/collector/internals/test_oaev_uploaders.py` | Tests expectation/trace uploader behavior. |
| `template/tests/collector/internals/test_resilient_uploader.py` | Tests resilient uploader bulk+fallback behavior. |
| `template/tests/collector/models/__init__.py` | Tests models package marker. |
| `template/tests/collector/models/test_data.py` | Tests OAEVData/TraceData validation and formatting. |
| `template/tests/collector/models/test_expectations.py` | Tests expectation models behavior. |
| `template/tests/collector/models/test_source.py` | Tests Source and SourceHandler behavior. |
| `template/tests/collector/protocols/test_data_fetcher.py` | Tests protocol conformance for data fetchers. |
| `template/tests/collector/protocols/test_engine.py` | Tests protocol conformance for engines. |
| `template/tests/collector/protocols/test_source_data.py` | Tests protocol conformance for source data. |
| `template/tests/collector/protocols/test_source_handler.py` | Tests protocol conformance for source handlers. |
| `template/tests/collector/utils/__init__.py` | Tests utils package marker. |
| `template/tests/collector/utils/test_retroport_itertools.py` | Tests retroported batched() selection/behavior. |
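
Two of the pieces listed above, the `batched()` backport and the bulk uploader with fallback to individual uploads, can be illustrated together. The sketch below is not the implementation from this PR: the function names, the `batch_size` default, and the fake uploaders are assumptions made for the example. It relies on the fact that `itertools.batched` only exists from Python 3.12, so a Python 3.11 project needs its own helper.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[tuple[T, ...]]:
    """Yield tuples of at most `size` items, for interpreters without itertools.batched."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := tuple(islice(iterator, size)):
        yield batch


def resilient_upload(
    items: Iterable[T],
    bulk_upload: Callable[[tuple[T, ...]], None],
    single_upload: Callable[[T], None],
    batch_size: int = 2,
) -> list[T]:
    """Upload in batches; retry a failed batch item by item; return what still failed."""
    failed: list[T] = []
    for batch in batched(items, batch_size):
        try:
            bulk_upload(batch)
        except Exception:
            # Fallback path: one bad item should not sink the whole batch.
            for item in batch:
                try:
                    single_upload(item)
                except Exception:
                    failed.append(item)
    return failed


# Hypothetical uploaders: the bulk call rejects any batch containing "bad".
uploaded: list[str] = []


def fake_bulk(batch: tuple[str, ...]) -> None:
    if "bad" in batch:
        raise RuntimeError("bulk rejected")
    uploaded.extend(batch)


def fake_single(item: str) -> None:
    if item == "bad":
        raise RuntimeError("item rejected")
    uploaded.append(item)


failed = resilient_upload(["a", "b", "bad", "c", "d"], fake_bulk, fake_single)
print(uploaded, failed)
```

Injecting the two upload callables keeps the fallback logic independent of any particular API client, which is also what makes it straightforward to unit test.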
Comments suppressed due to low confidence (2)

template/src/collector/types/collector.py:4

  • SignatureGroups is defined as list[dict[str, str]], but get_expectation_signature_groups() and match_signature_groups_and_oaevdata() treat signature_groups as a mapping (.items()) from signature type to a list of signature dicts. This mismatch will either break typing (mypy) or lead implementers of the protocol to return the wrong shape. Update SignatureGroups to the actual structure used (e.g., dict[str, list[dict[str, str]]]) and align protocol/implementations accordingly.
from typing import TypeAlias

SignatureGroups: TypeAlias = list[dict[str, str]]

template/pyproject.toml:128

  • [tool.cmw] icon-path points to src/img/change-me-logo.png, but this template ships src/img/template-logo.png instead. This will break any tooling that relies on tool.cmw metadata for the icon. Update icon-path (or rename the asset) so the referenced file exists.
[tool.cmw]
install-command = "poetry install --extras local"
config-dump-command = "poetry run python -m src --dump-config-schema"
icon-path = "src/img/change-me-logo.png"

Comment thread template/src/collector/protocols/source_handler.py
Comment thread template/src/collector/internals/oaev_uploaders.py Outdated
Comment thread template/src/collector/engines/basic.py Outdated
Comment thread template/src/collector/collector.py Outdated
Comment thread template/src/collector/collector.py Outdated
Comment thread template/CONTRIBUTING.md
Comment on lines +39 to +46
- `--extra current`: Get pyoaev from Git release/current branch
- `--extra local`: Get pyoaev locally from `../../client-python`

### Development Installation

```bash
# Development setup with current pyoaev version
poetry install -E current --with dev,test
```
Comment thread template/Dockerfile
Comment on lines +21 to +25
RUN if [[ ${PYOAEV_GIT_BRANCH_OVERRIDE} ]] ; then \
echo "Forcing specific version of client-python" && \
apk add --no-cache git && \
pip install pip3-autoremove && \
pip-autoremove pyoaev -y && \
Comment thread template/src/collector/internals/resilient_uploader.py Outdated
Comment thread template/src/collector/models/data.py Outdated
Comment thread template/src/collector/models/source.py Outdated
@guzmud guzmud force-pushed the feature/321-rework-template branch from b3dc46b to 012d176 Compare May 11, 2026 07:29
@guzmud guzmud force-pushed the feature/321-rework-template branch from 6c1c2d6 to c6eb06a Compare May 11, 2026 07:59
@guzmud guzmud force-pushed the feature/321-rework-template branch from b61ba00 to fb2a085 Compare May 18, 2026 07:43
]

HttpUrlToString = Annotated[HttpUrl, PlainSerializer(str, return_type=str)]
TimedeltaInSeconds = Annotated[

Member Author

@Kakudou it seems TimedeltaInSeconds has never really been used (it exists in palo alto XDR, sentinelone and splunk-es, but is not actually used in any of them): should we replace the type of period with it, or just delete it? (I guess you were the one who wrote it in the first place in splunk-es; I don't know if it's a leftover to be deleted or an unfinished WIP.)

@SamuelHassine SamuelHassine changed the title from "[template] feat(collector): replacing template-zero with source-based template (#321)" to "feat(template): replacing template-zero with source-based template (#321)" Jun 7, 2026

Labels

filigran team Item from the Filigran team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Collector] Rework the codebase of the collector template

3 participants