Welcome to Dagen
Dagen introduces the agentic pipeline: data infrastructure where pipelines are intent-driven—they learn, adapt, and improve. Specialist AI agents design, build, monitor, and refine the stack from ingestion to business-ready KPIs, alongside your existing SQL, Spark, and warehouse investments.
Highlights you should not miss
| Capability | What it is | Documentation |
|---|---|---|
| Agent Intelligence | Workspace-wide instructions, skills (loaded on demand via read_skill), rules, and lessons so agents behave like your data team. | Agent Intelligence & Skills |
| Skills | Packaged expertise with a trigger and body; saves context versus pasting large prompts every time. | Agent Intelligence & Skills, Custom Agents & Tools |
| Git Reviews | AI review on GitHub pull requests (SQL, dbt, PySpark, YAML) with webhooks and optional auto-post. | Git Reviews (AI on PRs) |
Core design principle: intent-awareness
Every pipeline should understand why it exists: the business outcome, who consumes the data, which decisions depend on it, and what quality means for that use case—not only which tasks it runs.
Key features
Intent-driven agentic pipelines
- Describe pipelines by purpose and outcome, not only by technical steps
- A Super Agent coordinates specialist agents across the lifecycle
- Three autonomy levels in AI Chat: Guided, Semi-Autonomous, Autonomous
- Natural language for architecture goals (for example, medallion or layered designs)
Specialist agent hierarchy
| Agent | Focus |
|---|---|
| Data Ingestion | Broad connector coverage, rate limits, retries, CDC |
| dbt | SQL transformations, tests, documentation aligned to intent |
| Metadata Discovery | Schema profiling, semantics, knowledge enrichment |
| Data Model Generation | Dimensional models, facts, medallion-style layouts |
| Data Cleansing | Pipeline-specific quality rules |
| Test Data Generation | Synthetic data for validation |
| Orchestration | Scheduling, coordination, monitoring |
| Spark Developer | PySpark and large-scale processing |
| Internet Search | External enrichment and public datasets |
Tri-layer memory
- Working memory — active session context and decisions in flight
- Episodic memory — structured history of runs, fixes, and outcomes
- Institutional knowledge — skills (on-demand playbooks), Knowledge Base documents, and lessons that compound over time
Configure institutional behaviour explicitly in Agent Intelligence & Skills (/agent-intelligence).
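The three layers above can be sketched in plain Python. This is illustrative only: the class and method names are assumptions for the sketch, not Dagen's API, though read_skill mirrors the on-demand skill loading described earlier.

```python
from dataclasses import dataclass, field

@dataclass
class WorkingMemory:
    """Active session context and in-flight decisions; discarded after the session."""
    context: list[str] = field(default_factory=list)

@dataclass
class EpisodicMemory:
    """Structured history of runs, fixes, and outcomes."""
    episodes: list[dict] = field(default_factory=list)

    def record(self, run_id: str, outcome: str) -> None:
        self.episodes.append({"run_id": run_id, "outcome": outcome})

@dataclass
class InstitutionalKnowledge:
    """Skills, Knowledge Base documents, and lessons that compound over time."""
    skills: dict[str, str] = field(default_factory=dict)  # skill name -> playbook body
    lessons: list[str] = field(default_factory=list)

    def read_skill(self, name: str) -> str:
        # Skills are loaded on demand rather than pasted into every prompt
        return self.skills[name]
```

The split matters because only the institutional layer persists and compounds: working memory is per-session, episodic memory is per-run history.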
Self-healing pipelines
Pipelines watch for schema drift, quality and volume signals, and missed freshness expectations, then remediate with a human in the loop where needed, typically via execution logs, Fix with Agent, and chat with pipeline or database context. Pair this with Agent Intelligence rules and lessons so repeated failure modes gain automatic guardrails, and shift bad SQL left with Git Reviews before merge.
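A minimal sketch of the schema-drift side of this, assuming a simple column-to-type mapping. Dagen's internals are not shown in this document, so the function name and the "additive-only drift is auto-fixable" rule are assumptions for illustration, not the platform's actual policy.

```python
def detect_schema_drift(expected: dict[str, str], observed: dict[str, str]) -> dict:
    """Compare an expected column->type mapping against what a run observed."""
    added = sorted(set(observed) - set(expected))
    removed = sorted(set(expected) - set(observed))
    retyped = sorted(c for c in set(expected) & set(observed)
                     if expected[c] != observed[c])
    # Example guardrail: purely additive drift can be remediated automatically,
    # while destructive changes (dropped or retyped columns) go to a human.
    auto_fixable = bool(added) and not removed and not retyped
    return {"added": added, "removed": removed, "retyped": retyped,
            "auto_fixable": auto_fixable}

drift = detect_schema_drift(
    {"id": "int", "email": "text"},
    {"id": "int", "email": "text", "signup_ts": "timestamp"},
)
```

A new nullable column passes the guardrail; a dropped or retyped column would set auto_fixable to False and trigger the human-in-the-loop path.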
Architecture support
Medallion (Bronze → Silver → Gold), star schema, Data Vault 2.0, and AI/RAG-ready, semantically rich outputs.
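To make the Bronze to Silver step concrete, here is a minimal sketch in plain Python standing in for a dbt or PySpark model: deduplicate raw records and standardize fields. The table shape and field names are illustrative assumptions, not a schema Dagen prescribes.

```python
def to_silver(bronze_rows: list[dict]) -> list[dict]:
    """Promote raw Bronze rows to Silver: latest record per key, normalized text."""
    latest: dict[str, dict] = {}
    for row in sorted(bronze_rows, key=lambda r: r["loaded_at"]):
        latest[row["customer_id"]] = row  # later load wins
    return [
        {"customer_id": key,
         "email": row["email"].strip().lower(),  # standardize on cleaned lowercase
         "loaded_at": row["loaded_at"]}
        for key, row in sorted(latest.items())
    ]

silver = to_silver([
    {"customer_id": "c1", "email": " Ada@Example.com ", "loaded_at": 1},
    {"customer_id": "c1", "email": "ada@example.com", "loaded_at": 2},
    {"customer_id": "c2", "email": "Grace@Example.com", "loaded_at": 1},
])
```

A Gold layer would then aggregate Silver into business-ready KPIs; in practice the dbt agent would express each step as a tested model rather than a Python function.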
Pipeline modernization (conceptual phases)
- Discovery and cataloging
- Intent reconstruction
- Agentic rebuild
- Activation and continuous improvement
Platform architecture (summary)
- Client experience — responsive UI with real-time updates for long-running work
- APIs — REST and streaming patterns, workspace isolation, RBAC
- Metadata and knowledge — stored platform state, semantic search, and lineage / graph capabilities for impact and discovery
- Integrations — 500+ ingestion connectors (Airbyte family), Git, dbt, Spark, Dataform, and configurable LLMs (Anthropic, OpenAI, Google, open source, and more)
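For the API layer above, a hedged sketch of an authenticated call using only the standard library. The base URL, endpoint path, and Bearer-token header are hypothetical; consult the External API and API Keys documentation for the real contract.

```python
import urllib.request

BASE_URL = "https://dagen.example.com/api/v1"  # placeholder host, not a real endpoint

def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Construct an authenticated GET request (built but not sent here)."""
    req = urllib.request.Request(f"{BASE_URL}{path}")
    req.add_header("Authorization", f"Bearer {api_key}")  # assumed auth scheme
    req.add_header("Accept", "application/json")
    return req

req = build_request("/pipelines", api_key="dgn_example_key")
```

Sending the request would then be urllib.request.urlopen(req) (or any HTTP client), with workspace isolation and RBAC enforced server-side per the architecture notes above.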
Security and compliance
Architecture-aware sovereignty (including GDPR, NIS2, EU AI Act considerations), data residency tracking per pipeline, encrypted credentials with audit logging, compliance-oriented records, and SOC 2–ready postures—exact obligations depend on your edition and deployment.
Key capabilities (docs)
AI-powered agents & intelligence
- Custom Agents & Tools — Agent Builder, templates, Python tools, Knowledge Base, graph.
- Agent Intelligence & Skills — Instructions, skills (read_skill), rules, lessons, templates, import/export (/agent-intelligence).
- Git Reviews (AI on PRs) — Automated PR review on connected GitHub repos.
Data movement and systems
Supported data sources and metadata, Database Connections, Data Ingestion.
Modeling and exploration
Data Modeling, Database Explorer.
Operations
Platform Dashboard, Building Pipelines, Git Reviews, API Keys, External API.
Use cases Dagen is built for
| Use case | How Dagen helps |
|---|---|
| Lift-and-shift modernization | Discovery and cataloging, intent reconstruction, then agentic rebuild (see Pipeline modernization above). |
| Medallion / layered warehouses | Natural language + Data Model + dbt agents for Bronze → Silver → Gold patterns. |
| Operational analytics | Data Insights for conversational KPIs and charts; Database Explorer for SQL. |
| Data quality at scale | Cleansing agent, tests in dbt, self-healing signals, Agent Intelligence & Skills rules and lessons. |
| Team scale-up | Workspace sharing, RBAC, Administration, Knowledge Base for tribal knowledge. |
| Automation & CI | External API, API Keys, Git Reviews, Slack. |
Documentation map (full product)
| Topic | Documentation |
|---|---|
| Sign-in, SSO | Authentication |
| DBs, warehouses, lakes, object storage | Database Connections, Supported data sources |
| GitHub / repos | Source Repositories |
| Home / overview cards, recent tasks | Platform Dashboard |
| Move data (Airbyte UI) | Data Ingestion |
| Schemas, DDL, test data | Data Modeling |
| Agent Builder, tools, Knowledge Base | Custom Agents & Tools |
| Agent Intelligence, skills, rules, lessons | Agent Intelligence & Skills |
| GitHub AI PR review | Git Reviews (AI on PRs) |
| Main agent UI | AI Chat |
| SQL browse / console | Database Explorer |
| Dashboards from chat | Data Insights |
| dbt, Spark, workflows | Building Pipelines |
| Jobs, usage, runtimes, team | Administration |
| LLM configuration | Model Settings |
| REST / A2A | External API, API Keys |
| In-app help | Magical Guide |
| Slack | Slack Integration |
| On-prem / AMI | Self-Hosted |
Getting started
- Authentication — Sign in (email, Google, GitHub).
- Connect your ecosystem — Database connections and source repositories.
- Declare intent — Describe the business purpose of your pipeline.
- Let agents build — Specialists design, build, and test.
- Choose autonomy — Guided, Semi, or Autonomous in AI Chat; iterate with self-healing and reviews.
Core concepts
- Custom Agents & Tools
- Agent Intelligence & Skills — instructions, skills, rules, lessons (/agent-intelligence)
- Data Ingestion
- Data Modeling
Features
- Platform Dashboard
- AI Chat
- Database Explorer
- Building Pipelines
- Git Reviews (AI on PRs)
- Data Insights
Guides
Reference
Deployment
Need help?
Use the Magical Guide in the application for interactive, context-aware assistance.