Wegent CRD Domain Model Analysis
Executive Summary
The Wegent platform implements a Kubernetes-style CRD (Custom Resource Definition) architecture for managing AI agents. The domain model separates concerns into distinct resource types with clear relationships. It uses a dual-table storage strategy: most resource types share the kinds table, while Task and Workspace resources live in a separate tasks table for query performance.
1. Core Database Models
1.1 Kind Model (Base CRD Class)
Location: shared/models/db/kind.py
class Kind(Base):
    """Unified Kind model for all Kubernetes-style resources."""

    __tablename__ = "kinds"

    id = Column(Integer, primary_key=True, index=True)
    user_id = Column(Integer, nullable=False)
    kind = Column(String(50), nullable=False, index=True)
    name = Column(String(100), nullable=False)
    namespace = Column(String(100), nullable=False, default="default")
    json = Column(JSON, nullable=False)
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, default=datetime.now)
    updated_at = Column(DateTime, default=datetime.now, onupdate=datetime.now)
Key Attributes:
| Attribute | Type | Description |
|---|---|---|
| id | Integer | Primary key, auto-increment |
| user_id | Integer | Owner reference (not an FK, for flexibility) |
| kind | String(50) | Resource type discriminator (Ghost, Model, Shell, Bot, Team, Skill, KnowledgeBase, Retriever, Device) |
| name | String(100) | Resource name within namespace |
| namespace | String(100) | Logical grouping (default: "default") |
| json | JSON | Resource-specific spec and status data |
| is_active | Boolean | Soft delete flag |
Design Pattern: Single Table Inheritance (STI) with JSON storage for polymorphic attributes. This provides:
- Unified querying across all CRD types
- Schema flexibility via JSON spec storage
- Namespace-based resource organization
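The single-table lookup described above can be sketched with the standard library alone. This is a minimal illustration, not the project's SQLAlchemy code: it assumes an in-memory SQLite table with a subset of the Kind columns, and the helpers `put_resource` and `get_spec` are hypothetical names introduced here.

```python
import json
import sqlite3

# Assumption: an in-memory SQLite stand-in for the "kinds" table,
# with only the columns needed for a lookup.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE kinds (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        kind TEXT NOT NULL,
        name TEXT NOT NULL,
        namespace TEXT NOT NULL DEFAULT 'default',
        json TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1
    )
    """
)


def put_resource(user_id, kind, name, spec, namespace="default"):
    """Store any resource type in the same table; the spec goes into the JSON column."""
    conn.execute(
        "INSERT INTO kinds (user_id, kind, name, namespace, json) VALUES (?, ?, ?, ?, ?)",
        (user_id, kind, name, namespace, json.dumps({"spec": spec})),
    )


def get_spec(user_id, kind, name, namespace="default"):
    """Look up one active resource by the kind discriminator plus namespace and name."""
    row = conn.execute(
        "SELECT json FROM kinds "
        "WHERE user_id = ? AND kind = ? AND namespace = ? AND name = ? AND is_active = 1",
        (user_id, kind, namespace, name),
    ).fetchone()
    return json.loads(row[0])["spec"] if row else None


# Two different resource types can share a name because `kind` discriminates them.
put_resource(1, "Ghost", "reviewer", {"systemPrompt": "Review code."})
put_resource(1, "Shell", "reviewer", {"shellType": "Chat"})
```

Because every type shares one table, the same query shape serves Ghost, Shell, Bot, and the rest; only the `kind` argument changes.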
1.2 TaskResource Model (Separate Table)
Location: backend/app/models/task.py
class TaskResource(Base):
    """TaskResource model for Task and Workspace resources.

    Separated from kinds table to improve query performance for task-related operations.
    """

    __tablename__ = "tasks"

    id = Column(Integer, primary_key=True, index=True)
    user_id = Column(Integer, nullable=False, default=0, index=True)
    kind = Column(String(50), nullable=False, index=True)  # "Task" or "Workspace"
    name = Column(String(100), nullable=False, default="")
    namespace = Column(String(100), nullable=False, default="default")
    json = Column(JSON, nullable=False)
    is_active = Column(Boolean, nullable=False, default=True)
    created_at = Column(DateTime, nullable=False, default=datetime.now, index=True)
    updated_at = Column(DateTime, nullable=False, default=datetime.now, onupdate=datetime.now)
    project_id = Column(Integer, nullable=False, default=0, index=True)
Rationale for Separation:
- Task/Workspace resources have high query frequency
- Tasks have additional indexing requirements (project_id)
- Isolates high-volume operations from other CRDs
- Improves cache locality for task-heavy workloads
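One consequence of the separation is that callers must pick a table from the resource kind. A minimal sketch of that routing follows; the function name `table_for_kind` and the constant are hypothetical, and only the table names and the Task/Workspace split come from the models above.

```python
# Kinds stored in the "tasks" table, per the TaskResource model above.
TASK_TABLE_KINDS = frozenset({"Task", "Workspace"})


def table_for_kind(kind: str) -> str:
    """Return the table that stores the given resource kind."""
    return "tasks" if kind in TASK_TABLE_KINDS else "kinds"
```

Keeping this decision in one place avoids scattering the Task/Workspace special case across query code.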
1.3 SkillBinary Model (Binary Storage)
Location: shared/models/db/skill_binary.py
class SkillBinary(Base):
    """Skill binary data storage for ZIP packages."""

    __tablename__ = "skill_binaries"

    id = Column(Integer, primary_key=True, index=True)
    kind_id = Column(Integer, ForeignKey("kinds.id", ondelete="CASCADE"), nullable=False, unique=True)
    binary_data = Column(LargeBinary, nullable=False)  # ZIP package binary data
    file_size = Column(Integer, nullable=False)  # File size in bytes
    file_hash = Column(String(64), nullable=False)  # SHA256 hash
    created_at = Column(DateTime, default=datetime.now)
Key Features:
- One-to-one relationship with Kind (Skill type only)
- SHA256 hash for integrity verification
- Cascading delete when Skill is deleted
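The integrity check implied by file_size and file_hash can be sketched with `hashlib`. The helper names `describe_package` and `verify_package` are hypothetical; the only facts taken from the model are that the hash is SHA256 (64 hex characters) and that the size is in bytes.

```python
import hashlib


def describe_package(data: bytes) -> dict:
    """Compute the values that would be stored in file_size and file_hash."""
    return {
        "file_size": len(data),
        "file_hash": hashlib.sha256(data).hexdigest(),  # 64 hex characters
    }


def verify_package(data: bytes, file_size: int, file_hash: str) -> bool:
    """Check downloaded bytes against the stored size and SHA256 digest."""
    return len(data) == file_size and hashlib.sha256(data).hexdigest() == file_hash
```

A caller would compute the metadata once at upload time and call `verify_package` before unpacking the ZIP.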
1.4 Subtask Model (Message/Conversation Storage)
Location: shared/models/db/subtask.py
class Subtask(Base):
    """Subtask model representing a message in a task conversation."""

    __tablename__ = "subtasks"

    id = Column(Integer, primary_key=True, index=True)
    user_id = Column(Integer, nullable=False)
    task_id = Column(Integer, nullable=False)  # References TaskResource.id
    team_id = Column(Integer, nullable=False)  # References Kind.id (Team type)
    title = Column(String(256), nullable=False)
    bot_ids = Column(JSON, nullable=False)
    role = Column(SQLEnum(SubtaskRole), nullable=False, default=SubtaskRole.ASSISTANT)
    executor_namespace = Column(String(100))
    executor_name = Column(String(100))
    executor_deleted_at = Column(Boolean, nullable=False, default=False)
    prompt = Column(Text)
    message_id = Column(Integer, nullable=False, default=1)
    parent_id = Column(Integer, nullable=True)  # For threaded conversations
    status = Column(SQLEnum(SubtaskStatus), nullable=False, default=SubtaskStatus.PENDING)
    progress = Column(Integer, nullable=False, default=0)
    result = Column(JSON)
    error_message = Column(Text)
    created_at = Column(DateTime, default=func.now())
    updated_at = Column(DateTime, default=func.now(), onupdate=func.now())
    completed_at = Column(DateTime, nullable=False, default="1970-01-01 00:00:00")

    # Group chat fields
    sender_type = Column(String(20), nullable=False, default="")
    sender_user_id = Column(Integer, nullable=False, default=0)
    reply_to_subtask_id = Column(Integer, nullable=False, default=0)
1.5 SubtaskContext Model (Attachment & Knowledge Base)
Location: shared/models/db/subtask_context.py
class SubtaskContext(Base):
    """Subtask context storage for various context types."""

    __tablename__ = "subtask_contexts"

    id = Column(Integer, primary_key=True, index=True)
    subtask_id = Column(Integer, nullable=False, default=0, index=True)
    user_id = Column(Integer, nullable=False, index=True)
    context_type = Column(String(50), nullable=False, index=True)  # 'attachment', 'knowledge_base', 'table', 'selected_documents'
    name = Column(String(255), nullable=False)
    status = Column(String(20), nullable=False, default=ContextStatus.PENDING.value, index=True)
    error_message = Column(Text, nullable=False, default="")
    binary_data = Column(BinaryDataType, nullable=False, default=b"")  # LONGBLOB for MySQL
    image_base64 = Column(LongTextType, nullable=False, default="")  # For vision models
    extracted_text = Column(LongTextType, nullable=False, default="")
    text_length = Column(Integer, nullable=False, default=0)
    type_data = Column(JSON, nullable=False, default=dict)  # Type-specific metadata
    created_at = Column(DateTime, nullable=False, default=func.now())
    updated_at = Column(DateTime, nullable=False, default=func.now(), onupdate=func.now())
Polymorphic Design: The type_data JSON field stores type-specific attributes:
- attachment: original_filename, file_extension, file_size, mime_type, storage_backend, storage_key, is_encrypted, encryption_version
- knowledge_base: knowledge_id, document_count
- table: url, source_config
- selected_documents: knowledge_base_id, document_ids[]
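Since the database places no constraints on type_data, an application-level check per context_type is one way to catch malformed rows. The sketch below assumes that every attribute listed above is required, which the source does not state; `REQUIRED_TYPE_DATA_KEYS` and `missing_type_data_keys` are hypothetical names.

```python
# Assumption: all attributes listed for each context_type are mandatory.
REQUIRED_TYPE_DATA_KEYS = {
    "attachment": {
        "original_filename", "file_extension", "file_size", "mime_type",
        "storage_backend", "storage_key", "is_encrypted", "encryption_version",
    },
    "knowledge_base": {"knowledge_id", "document_count"},
    "table": {"url", "source_config"},
    "selected_documents": {"knowledge_base_id", "document_ids"},
}


def missing_type_data_keys(context_type: str, type_data: dict) -> list:
    """Return the expected keys absent from type_data, sorted for stable output."""
    if context_type not in REQUIRED_TYPE_DATA_KEYS:
        raise ValueError(f"unknown context_type: {context_type!r}")
    return sorted(REQUIRED_TYPE_DATA_KEYS[context_type] - set(type_data))
```

An empty result means the payload carries every expected key for its type; extra keys are tolerated so that new attributes can be added without touching the check.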
2. CRD Type Hierarchy
3. CRD Resource Relationships
3.1 Resource Composition Hierarchy
3.2 Reference Patterns
| Reference Type | Pattern | Example |
|---|---|---|
| Composition | Direct nesting in JSON | Bot.spec contains ghostRef, shellRef, modelRef |
| Association | ID reference with loose coupling | Subtask.task_id → TaskResource.id |
| Inheritance | Single table with kind discriminator | All CRDs except Task and Workspace stored in kinds table |
| Binary Attachment | Separate table with FK | SkillBinary.kind_id → Kind.id |
4. CRD Schemas (Pydantic Models)
4.1 Common Base Schemas
class ObjectMeta(BaseModel):
    """Standard Kubernetes object metadata"""

    name: str
    namespace: str = "default"
    displayName: Optional[str] = None
    labels: Optional[Dict[str, str]] = None


class Status(BaseModel):
    """Standard status object"""

    state: str
    message: Optional[str] = None
4.2 Ghost CRD Schema
class GhostSpec(BaseModel):
    systemPrompt: str
    mcpServers: Optional[Dict[str, Any]] = None
    skills: Optional[List[str]] = None  # Skill names list
    preload_skills: Optional[List[str]] = None  # Preloaded skill names


class GhostStatus(Status):
    state: str = "Available"  # Available, Unavailable


class Ghost(BaseModel):
    apiVersion: str = "agent.wecode.io/v1"
    kind: str = "Ghost"
    metadata: ObjectMeta
    spec: GhostSpec
    status: Optional[GhostStatus] = None
4.3 Model CRD Schema (Multi-Type Support)
class ModelCategoryType(str, Enum):
    LLM = "llm"
    TTS = "tts"
    STT = "stt"
    EMBEDDING = "embedding"
    RERANK = "rerank"


class ModelSpec(BaseModel):
    modelConfig: Dict[str, Any]
    isCustomConfig: Optional[bool] = None
    protocol: Optional[str] = None  # 'openai', 'claude', etc.
    apiFormat: Optional[ApiFormat] = None  # 'chat/completions' or 'responses'
    contextWindow: Optional[int] = None
    maxOutputTokens: Optional[int] = None
    modelType: Optional[ModelCategoryType] = ModelCategoryType.LLM
    # Type-specific configs
    ttsConfig: Optional[TTSConfig] = None
    sttConfig: Optional[STTConfig] = None
    embeddingConfig: Optional[EmbeddingConfig] = None
    rerankConfig: Optional[RerankConfig] = None
4.4 Shell CRD Schema
class ShellSpec(BaseModel):
    shellType: str  # 'ClaudeCode', 'Agno', 'Dify', 'Chat'
    supportModel: Optional[List[str]] = None
    baseImage: Optional[str] = None  # Custom Docker image
    baseShellRef: Optional[str] = None  # Reference to base public shell
    requiresWorkspace: Optional[bool] = None  # Auto-inferred if None
Shell Types:
| Type | Category | Description |
|---|---|---|
| ClaudeCode | local_engine | Claude Code SDK in Docker |
| Agno | local_engine | Agno framework in Docker |
| Dify | external_api | External Dify API proxy |
| Chat | external_api | Direct LLM API (no Docker) |
4.5 Bot CRD Schema
class GhostRef(BaseModel):
    name: str
    namespace: str = "default"


class ShellRef(BaseModel):
    name: str
    namespace: str = "default"


class ModelRef(BaseModel):
    name: str
    namespace: str = "default"


class BotSpec(BaseModel):
    ghostRef: GhostRef
    shellRef: ShellRef
    modelRef: Optional[ModelRef] = None  # Optional for some shell types


class Bot(BaseModel):
    apiVersion: str = "agent.wecode.io/v1"
    kind: str = "Bot"
    metadata: ObjectMeta
    spec: BotSpec
    status: Optional[BotStatus] = None
4.6 Team CRD Schema
class TeamMember(BaseModel):
    botRef: BotTeamRef
    prompt: Optional[str] = None  # Role-specific prompt
    role: Optional[str] = None  # 'leader', 'member', etc.
    requireConfirmation: Optional[bool] = False  # Pipeline confirmation


class TeamSpec(BaseModel):
    members: List[TeamMember]
    collaborationModel: str  # 'pipeline', 'route', 'coordinate', 'collaborate'
    bind_mode: Optional[List[str]] = None  # ['chat', 'code'] or empty
    description: Optional[str] = None
    icon: Optional[str] = None
    requiresWorkspace: Optional[bool] = None
Collaboration Models:
| Model | Description |
|---|---|
| pipeline | Sequential execution with data flow |
| route | Leader assigns to appropriate bot |
| coordinate | Leader coordinates parallel execution |
| collaborate | Free-form multi-bot discussion |
4.7 Task & Workspace CRD Schemas
class Repository(BaseModel):
    gitUrl: str
    gitRepo: str
    gitRepoId: Optional[int] = None
    branchName: str
    gitDomain: str


class WorkspaceSpec(BaseModel):
    repository: Repository


class TaskSpec(BaseModel):
    title: str
    prompt: str
    teamRef: TeamTaskRef
    workspaceRef: WorkspaceTaskRef
    is_group_chat: bool = False
    knowledgeBaseRefs: Optional[List[KnowledgeBaseTaskRef]] = None
    device_id: Optional[str] = None


class TaskStatus(Status):
    state: str = "Available"
    status: str = "PENDING"  # PENDING, RUNNING, COMPLETED, FAILED, CANCELLED, DELETE
    progress: int = 0
    result: Optional[Dict[str, Any]] = None
    errorMessage: Optional[str] = None
    subTasks: Optional[List[Dict[str, Any]]] = None
    app: Optional[TaskApp] = None  # App preview from expose_service
4.8 Skill CRD Schema
class SkillToolDeclaration(BaseModel):
    name: str
    provider: str
    config: Optional[Dict[str, Any]] = None


class SkillProviderConfig(BaseModel):
    module: str = "provider"  # Python module in skill ZIP
    class_name: str  # Provider class name (alias: 'class')


class SkillSpec(BaseModel):
    description: str  # Trigger condition from SKILL.md
    displayName: Optional[str] = None  # Display during tool use
    prompt: Optional[str] = None  # Full prompt from SKILL.md
    version: Optional[str] = None
    author: Optional[str] = None
    tags: Optional[List[str]] = None
    bindShells: Optional[List[str]] = None  # ['ClaudeCode', 'Agno', 'Dify', 'Chat']
    config: Optional[Dict[str, Any]] = None  # Shared skill config
    tools: Optional[List[SkillToolDeclaration]] = None
    provider: Optional[SkillProviderConfig] = None  # Dynamic provider loading
    mcpServers: Optional[Dict[str, Any]] = None
    source: Optional[SkillSource] = None  # Git import tracking


class SkillStatus(Status):
    state: str = "Available"
    fileSize: Optional[int] = None
    fileHash: Optional[str] = None
5. Status State Machines
5.1 Task Status State Machine
TaskStatus Enum Values:
- PENDING: Waiting for execution
- RUNNING: Currently executing
- COMPLETED: Successfully finished
- FAILED: Execution failed with error
- CANCELLED: User cancelled
- CANCELLING: Cancel in progress
- DELETE: Marked for deletion
- PENDING_CONFIRMATION: Pipeline stage complete, awaiting user confirmation
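The values above map naturally onto a string-backed enum. The sketch below is illustrative rather than a copy of the project's definition. In particular, the set of "finished" states is an assumption inferred from the descriptions, since the transition diagram is not reproduced here, and `is_finished` is a hypothetical helper.

```python
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    CANCELLING = "CANCELLING"
    DELETE = "DELETE"
    PENDING_CONFIRMATION = "PENDING_CONFIRMATION"


# Assumption: these states accept no further execution. Not confirmed by the source.
ASSUMED_FINISHED = frozenset(
    {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.DELETE}
)


def is_finished(status: str) -> bool:
    """Parse a raw status string (raises ValueError if unknown) and test it."""
    return TaskStatus(status) in ASSUMED_FINISHED
```

Parsing through the enum rejects unknown strings early, which helps when the status is stored as free text inside the JSON column.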
5.2 Subtask Status State Machine
5.3 Context Status State Machine
ContextStatus Values:
- pending: Initial state
- uploading: Data being uploaded
- parsing: Content extraction in progress
- ready: Available for use
- failed: Processing failed
6. Database Schema Mapping
6.1 Table Summary
| Table | Model Class | Stores | Records |
|---|---|---|---|
| kinds | Kind | Ghost, Model, Shell, Bot, Team, Skill, KnowledgeBase, Retriever, Device | 9 CRD types |
| tasks | TaskResource | Task, Workspace | 2 CRD types |
| skill_binaries | SkillBinary | Skill ZIP packages | 1-to-1 with Skill kinds |
| subtasks | Subtask | Task messages/conversations | High volume |
| subtask_contexts | SubtaskContext | Attachments, knowledge bases | High volume |
6.2 Index Strategy
High-Performance Indexes:
-- kinds table
INDEX (kind, name, namespace) -- Resource lookup
INDEX (user_id, kind) -- User resource lists
-- tasks table
INDEX (user_id, kind, name, namespace) -- Unique constraint
INDEX (project_id) -- Project grouping
INDEX (created_at) -- Time-series queries
-- subtasks table
INDEX (task_id, created_at) -- Task message history
INDEX (user_id, created_at) -- User activity
-- subtask_contexts table
INDEX (subtask_id) -- Context retrieval
INDEX (user_id, context_type) -- User context lists
7. Design Patterns Analysis
7.1 Single Table Inheritance (STI)
Application: Kind model stores 9 different CRD types in one table.
Benefits:
- Unified resource management
- Simplified querying across types
- Schema evolution via JSON
Trade-offs:
- No type-specific constraints at DB level
- JSON querying limitations
- Task/Workspace separated for performance
7.2 Reference Pattern (Namespace + Name)
Application: All CRD references use name + namespace tuple.
class GhostRef(BaseModel):
    name: str
    namespace: str = "default"
Benefits:
- Human-readable references
- Kubernetes compatibility
- Easy resource relocation
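Resolving such references amounts to a keyed lookup on kind, namespace, and name. The following sketch uses a plain dictionary as a stand-in for the kinds table. The registry layout and the function `resolve_bot_refs` are hypothetical; the field names and the optional modelRef come from the BotSpec schema.

```python
# Reference fields on BotSpec and the kind each one points to.
REF_FIELDS = (("ghostRef", "Ghost"), ("shellRef", "Shell"), ("modelRef", "Model"))


def resolve_bot_refs(bot_spec: dict, registry: dict) -> dict:
    """Resolve each reference in a Bot spec against a
    {(kind, namespace, name): resource} registry (hypothetical layout)."""
    resolved = {}
    for field, kind in REF_FIELDS:
        ref = bot_spec.get(field)
        if ref is None:
            if field == "modelRef":
                continue  # modelRef is optional in BotSpec
            raise ValueError(f"{field} is required")
        # The namespace falls back to "default", mirroring the Ref schemas.
        key = (kind, ref.get("namespace", "default"), ref["name"])
        if key not in registry:
            raise LookupError(f"{kind} {key[1]}/{key[2]} not found")
        resolved[field] = registry[key]
    return resolved
```

Because the reference is a name rather than a row ID, a resource can be recreated under the same name and namespace without rewriting the Bots that point at it.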
7.3 Polymorphic Context Storage
Application: SubtaskContext uses context_type + type_data for multiple context types.
Benefits:
- Extensible to new context types
- Common fields shared
- Type-specific data in JSON
7.4 Soft Delete Pattern
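Application: The kinds and tasks tables carry an is_active boolean for soft deletion. The other models shown in Section 1 (skill_binaries, subtasks, subtask_contexts) do not define this column.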
Benefits:
- Data recovery capability
- Referential integrity preservation
- Audit trail maintenance
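In practice, soft deletion means an UPDATE instead of a DELETE, plus an is_active filter on every read. A minimal sketch against an in-memory SQLite table follows; the reduced column set and the helpers `soft_delete` and `active_names` are assumptions made for illustration.

```python
import sqlite3

# Assumption: a reduced stand-in for the "kinds" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kinds ("
    "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, name TEXT NOT NULL, "
    "is_active INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO kinds (kind, name) VALUES (?, ?)",
    [("Bot", "a"), ("Bot", "b")],
)


def soft_delete(resource_id: int) -> None:
    """Flip the flag instead of removing the row, so the data stays recoverable."""
    conn.execute("UPDATE kinds SET is_active = 0 WHERE id = ?", (resource_id,))


def active_names(kind: str) -> list:
    """List only rows that have not been soft-deleted."""
    rows = conn.execute(
        "SELECT name FROM kinds WHERE kind = ? AND is_active = 1 ORDER BY name",
        (kind,),
    )
    return [row[0] for row in rows]
```

The cost of the pattern is that every query must remember the is_active filter; a forgotten filter silently resurfaces deleted resources.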
7.5 Dual-Table Strategy
Application: Task/Workspace separated from other CRDs.
Benefits:
- Query performance optimization
- Different scaling characteristics
- Independent indexing strategies
8. Complex Design Decisions
8.1 Why Separate tasks Table?
Problem: Tasks have fundamentally different access patterns:
- High write volume (message creation)
- Time-series queries (history)
- Project grouping requirements
Solution: Separate table with additional project_id column.
Impact:
- ✅ Improved task query performance
- ✅ Reduced kinds table contention
- ❌ More complex model selection logic
8.2 Why No Foreign Key Constraints?
Notable Design: Most ID references (e.g., Subtask.user_id, Subtask.task_id, Subtask.team_id) lack FK constraints. Among the models shown, SkillBinary.kind_id is the exception.
Rationale:
- Cross-service references (executor service manages executors)
- Soft delete support without cascade complexity
- Performance optimization
- Eventual consistency model
8.3 Skill Binary Storage in Database
Decision: Store Skill ZIP packages in skill_binaries table as LargeBinary.
Alternatives Considered:
- File system: Simpler but harder to backup/scale
- Object storage (S3): More complex infrastructure
Trade-off: Database bloat vs. transactional integrity
8.4 Subtask Message Threading
Design: parent_id field enables threaded conversations.
Use Cases:
- Reply chains in group chat
- Follow-up questions
- Conversation branching
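A thread can be reconstructed by following parent_id links from a message back to its root. The sketch below works on plain dictionaries with `id` and `parent_id` keys. The function `thread_path` is hypothetical, and the convention that a root message has `parent_id` of None is an assumption based on the nullable column.

```python
def thread_path(subtasks: list, subtask_id: int) -> list:
    """Return the IDs from the thread root down to subtask_id.

    Stops at a missing parent and guards against cycles in bad data.
    """
    by_id = {s["id"]: s for s in subtasks}
    path, seen = [], set()
    current = subtask_id
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        path.append(current)
        current = by_id[current]["parent_id"]  # None marks a root (assumed)
    return path[::-1]
```

Two replies to the same parent produce two paths sharing a prefix, which is how conversation branching shows up in this representation.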
9. CRD API Version
All CRDs use consistent API versioning:
apiVersion: agent.wecode.io/v1
kind: <ResourceType>
metadata:
  name: <resource-name>
  namespace: <namespace>
spec:
  # Resource-specific configuration
status:
  state: <state>
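The same envelope can be built programmatically as a plain dictionary. This is a small sketch: `make_resource` is a hypothetical helper, and only the field names and the apiVersion string are taken from the template above.

```python
import json

API_VERSION = "agent.wecode.io/v1"


def make_resource(kind, name, spec, namespace="default", state=None):
    """Assemble the common CRD envelope; status is included only when a state is given."""
    doc = {
        "apiVersion": API_VERSION,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }
    if state is not None:
        doc["status"] = {"state": state}
    return doc


ghost = make_resource("Ghost", "reviewer", {"systemPrompt": "Review code."}, state="Available")
serialized = json.dumps(ghost)  # the form that would go into the JSON column
```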
10. Key Findings & Recommendations
10.1 Strengths
- Kubernetes-Compatible Design: Familiar patterns for DevOps users
- Flexible JSON Storage: Rapid iteration without migrations
- Clear Separation of Concerns: Ghost/Shell/Model → Bot → Team → Task
- Performance Optimization: Dual-table strategy for high-volume resources
- Extensible Skill System: Dynamic loading with ZIP packages
10.2 Areas for Investigation
- JSON Schema Validation: No DB-level constraints on json columns
  - Risk: Data corruption from application bugs
  - Mitigation: Pydantic validation at application layer
- Query Performance: JSON field queries may not use indexes efficiently
  - Investigation: Review query patterns for json column filtering
- Skill Binary Storage: Large binary data in database
  - Risk: Table bloat, backup size
  - Consider: Migration to object storage for skills > 1MB
- Status State Consistency: Multiple status enums across modules
  - shared/status.py: TaskStatus
  - backend/app/schemas/task.py: TaskStatus
  - shared/models/db/enums.py: SubtaskStatus
  - Investigation: Verify synchronization between definitions
- Reference Integrity: No FK constraints means potential orphaned references
  - Risk: References to deleted resources
  - Mitigation: Application-level validation, soft delete pattern
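Application-level validation of references can be as simple as a periodic scan for rows whose targets no longer exist. The sketch below checks subtasks against sets of known task and team IDs. The function `orphaned_subtasks` and the in-memory inputs are hypothetical stand-ins for queries against the subtasks, tasks, and kinds tables.

```python
def orphaned_subtasks(subtasks: list, task_ids: set, team_ids: set) -> list:
    """Return IDs of subtasks whose task_id or team_id points at nothing."""
    return [
        s["id"]
        for s in subtasks
        if s["task_id"] not in task_ids or s["team_id"] not in team_ids
    ]
```

With soft deletes in place, the ID sets would normally include inactive rows too, so a hit from this scan indicates a reference that was never valid or a row that was hard-deleted.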
10.3 Design Recommendations
- Add JSON Schema Constraints: Consider CHECK constraints for critical fields
- Implement Archive Strategy: Move completed tasks to archive tables after N days
- Consider Read Replicas: Task/subtask queries are read-heavy
- Add Resource Caching: Frequently referenced Ghosts/Models should be cached
11. Terminology Mapping
| Code/CRD Level | Frontend UI (Chinese) | Frontend UI (English) | Description |
|---|---|---|---|
| Team | 智能体 | Agent | User-facing AI agent |
| Bot | 机器人 | Bot | Building block component |
| Task | 任务 | Task | Executable work unit |
| Workspace | 工作空间 | Workspace | Code repository context |
| Ghost | 灵魂 | Ghost | System prompt & tools |
| Shell | 运行环境 | Shell | Execution runtime |
| Model | 模型 | Model | AI model configuration |
| Skill | 技能 | Skill | Dynamic capabilities |
12. Related Documentation
Analysis generated: 2026-02-01
Version: agent.wecode.io/v1