How the Findings Generator works end-to-end.
Multiple ingestion methods populate the database with vendor-specific data:
- ingest_ldg.py: Imports Lucidum Data Gateway (LDG) CSV exports containing normalized asset fields (OS, agent status, IPs, threats, CVEs, etc.)
- ingest_openapi.py: Imports vendor OpenAPI specs from JSON files (e.g., SentinelOne API docs)
- api_doc_scraper.py: Auto-fetches and parses API docs from GitHub (Microsoft Defender)

Central store for all data:
- ldg_fields: Queryable asset fields per vendor (LDG + source fields)
- openapi_endpoints: API endpoints with parameters, permissions, and descriptions
- findings: Generated findings with payload, checks_hash for dedup, lucidum_id for FIN tracking
- field_map: 290 Lucidum field mappings (type, searchFieldName, operators, confirmed status)
- field_operators: 43 valid type/operator pairs across 8 field types
- export_config: Sourcetype mappings, created_by, field_collection for Lucidum export
- vendors, api_sync_config, ldg_catalog_rows, ldg_services: Reference data

mcp_server.py exposes the database as MCP tools that Claude can call:
- ldg_search_fields / ldg_list_fields: Discover available asset fields
- openapi_search_endpoints / openapi_get_endpoint: Look up API endpoints
- findings_save: Save a new finding (with automatic dedup via SHA256 hash of normalized checks)
- findings_list: List existing findings so Claude avoids duplicates
- field_map_lookup / field_map_search: Look up Lucidum field mappings, types, and valid operators
- field_validate_condition: Validate field+operator+value combinations before saving

The server communicates with Claude over stdio using the Model Context Protocol.
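The dedup step in findings_save can be sketched as follows. This is a minimal illustration, not the project's code: it assumes SQLite as the store, and the normalization rules (lowercasing and trimming string values, sorting keys and checks) as well as the `checks_hash` and `save_finding` helpers and the reduced table schema are hypothetical.

```python
import hashlib
import json
import sqlite3


def checks_hash(checks):
    """Hash a list of check dicts so that equivalent lists produce the same digest."""
    # Assumed normalization: trim and lowercase string values, sort keys,
    # and sort the serialized checks so their order does not matter.
    normalized = sorted(
        json.dumps(
            {k: v.strip().lower() if isinstance(v, str) else v for k, v in c.items()},
            sort_keys=True,
        )
        for c in checks
    )
    return hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()


def save_finding(conn, vendor_id, title, checks):
    """Insert a finding; return False if an equivalent one already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO findings (vendor_id, title, checks_hash) VALUES (?, ?, ?)",
                (vendor_id, title, checks_hash(checks)),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE(vendor_id, checks_hash) constraint rejected a duplicate.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings ("
    " id INTEGER PRIMARY KEY, vendor_id INTEGER, title TEXT, checks_hash TEXT,"
    " UNIQUE(vendor_id, checks_hash))"
)
a = [{"field": "OS", "operator": "match", "value": "Windows"}]
b = [{"value": "windows ", "operator": "match", "field": "os"}]
first = save_finding(conn, 1, "Windows hosts", a)
second = save_finding(conn, 1, "Windows hosts (reworded)", b)
```

Here the second call is rejected because both check lists normalize to the same hash for the same vendor; the same checks saved under a different vendor_id would be accepted.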
Claude generates findings by executing structured task prompts. Each of the 8 supported vendors has its own task folder:
- tasks/s1/: SentinelOne Singularity XDR (7 tasks, 322 findings)
- tasks/defender/: Microsoft Defender for Endpoint (7 tasks, 41 findings)
- tasks/crowdstrike/: CrowdStrike Falcon (7 tasks, 53 findings)
- tasks/wiz/: Wiz Cloud Security (7 tasks, 81 findings)
- tasks/orca/: Orca Security (7 tasks, 30 findings)
- tasks/okta/: Okta SSO (6 tasks, 10 findings)
- tasks/tenable/: Tenable Vulnerability Management (6 tasks, 10 findings)
- tasks/entra_id/: Microsoft Entra ID (6 tasks, 10 findings)

Categories are vendor-specific (6-7 tasks per vendor). Common categories include:
Task prompts include strict anti-duplication rules. Claude self-verifies every field via ldg_search_fields and field_map_lookup before saving.
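The kind of check performed before saving can be sketched like this. It is an illustration under stated assumptions: the two-entry `FIELD_MAP`, the type names, the searchFieldName values, and the `validate_condition` helper are hypothetical stand-ins for the field_map and field_operators tables, and only the operator names come from the export rules described in this document.

```python
# Hypothetical excerpts; the real tables hold 290 fields and 43 type/operator pairs.
FIELD_MAP = {
    "os": {"type": "string", "searchFieldName": "OS"},
    "last_seen": {"type": "date", "searchFieldName": "Last_Seen"},
}
FIELD_OPERATORS = {
    ("string", "match"),
    ("string", "not match"),
    ("date", "older_than_days"),
}


def validate_condition(field, operator, value):
    """Return (ok, reason) for a field+operator+value combination."""
    entry = FIELD_MAP.get(field)
    if entry is None:
        return False, f"unknown field: {field}"
    if (entry["type"], operator) not in FIELD_OPERATORS:
        return False, f"operator {operator!r} is not valid for type {entry['type']}"
    # Assumed value rule: day counts must be non-negative integers.
    if operator == "older_than_days" and not (isinstance(value, int) and value >= 0):
        return False, "older_than_days expects a non-negative integer"
    return True, "ok"


ok_result = validate_condition("os", "match", "Windows")
bad_operator = validate_condition("os", "older_than_days", 30)
```

Returning a reason alongside the boolean lets the caller correct the condition and retry instead of saving an invalid finding.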
Multiple ways to run the generation pipeline:
- Web UI (/generator): Select vendor and tasks, click Generate, view real-time streaming output, cancel with Stop button
- s1_find_gen.sh: Batch execution of all tasks

The Web UI features:
Automated API documentation fetching for cloud-based vendors:
Currently supports: Microsoft Defender for Endpoint (from MicrosoftDocs/defender-docs)
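The parsing half of such a fetcher might look like the sketch below. It assumes the fetched documentation pages are Markdown in which each endpoint appears as a request line (HTTP method followed by a URL or path); the `extract_endpoints` helper, the regular expression, and the sample text with its example.test host are all hypothetical, and no network access is performed.

```python
import re

# Matches a line such as "GET https://host/api/machines/{id}" or "POST /api/x".
REQUEST_LINE = re.compile(
    r"^(GET|POST|PUT|PATCH|DELETE)\s+(?:https?://[^/\s]+)?(/\S*)", re.MULTILINE
)


def extract_endpoints(markdown):
    """Return unique (method, path) pairs in order of first appearance."""
    seen = []
    for method, path in REQUEST_LINE.findall(markdown):
        if (method, path) not in seen:
            seen.append((method, path))
    return seen


# Made-up sample standing in for a fetched documentation page.
SAMPLE = """\
## HTTP request
GET https://api.example.test/api/machines/{id}

## Example
POST /api/machines/{id}/isolate
GET https://api.example.test/api/machines/{id}
"""
endpoints = extract_endpoints(SAMPLE)
```

Stripping the host keeps only the path, so the same endpoint documented with and without a base URL is stored once.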
Prevents duplicate findings at multiple levels:
- A UNIQUE(vendor_id, checks_hash) constraint on the findings table
- Claude calls findings_list before generating to avoid logical overlaps

webapp.py (Flask with threading) provides:
- /: Filter by vendor, weight (1-5), category; sort by date or weight; export as CSV/JSON
- /finding/<id>: Edit title, description, weight, confidence, MITRE techniques, remediation
- /generator: 8-vendor task selection with real-time output streaming and job persistence
- /lucidum-reverse: Upload Lucidum exports, condition builder, field registry (290 fields, 43 operators)
- /export-lucidum.json: Transform findings into Lucidum Smart Label JSON format
- /statistics: Per-vendor findings distribution, weight breakdown, FIN ranges
- /config: Auth, API sync, export settings (sourcetype mappings)

Each finding stored as JSON contains:
lucidum_export.py transforms findings into Lucidum Smart Label import format:
- Fields map to their Lucidum searchFieldName; vendor-specific fields route to Extra_Data Embed_List
- Operators (match, not match, older_than_days) map to Lucidum equivalents
- Export settings (/config/export)
- FIN-NNN IDs assigned at export time, persisted in DB

DB-backed field_map (290 fields, 7 types) with fallback to hardcoded base fields. Self-learning via Lucidum export uploads.
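Assigning FIN-NNN IDs at export time could work roughly as sketched here. The numbering policy (continue from the highest existing number, pad to three digits) and the `assign_fin_ids` helper are assumptions for illustration; in the real pipeline the resulting lucidum_id values would be written back to the findings table rather than kept in plain dictionaries.

```python
import re

FIN_PATTERN = re.compile(r"^FIN-(\d+)$")


def assign_fin_ids(findings):
    """Give every finding without a lucidum_id the next free FIN number."""
    highest = 0
    for finding in findings:
        match = FIN_PATTERN.match(finding.get("lucidum_id") or "")
        if match:
            highest = max(highest, int(match.group(1)))
    for finding in findings:
        if not finding.get("lucidum_id"):
            highest += 1
            finding["lucidum_id"] = f"FIN-{highest:03d}"
    return findings


rows = [
    {"title": "A", "lucidum_id": "FIN-007"},
    {"title": "B", "lucidum_id": None},
    {"title": "C"},
]
assign_fin_ids(rows)
```

Findings that already carry an ID keep it, so repeated exports stay stable and only new findings receive fresh numbers.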
/lucidum-reverse page and reverse_engineer.py for learning Lucidum's internal field/operator rules:
- field_map table with 290 fields (74 confirmed from 560 live Smart Labels mined via API)
- field_operators table with 43 type/operator pairs across 8 field types

8 integrated security platforms (557 findings total):
- sentinelone_agent
- wiz_asset, pre-formatted Embed_List checks
- crowdstrike_host
- defender_machine
- orca_asset
- okta_user
- tenable_asset
- azure_ad

The architecture is vendor-agnostic: new vendors can be added by importing LDG fields, providing OpenAPI docs, and creating task prompts in tasks/{vendor}/.