Action documentation

2025-07-25 00:08:00 +08:00
parent c78b706a50
commit a82de0a50e
2 changed files with 234 additions and 334 deletions
--- a/docs/plugins/action-components.md
+++ b/docs/plugins/action-components.md
@@ -4,42 +4,183 @@

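 An Action is an intelligent component that gives MaiMai (麦麦) extra capabilities beyond replying. **MaiMai's decision system chooses on its own whether to use it**, so invocation is randomized and human-like. An Action does not respond directly to user commands; instead, MaiMai picks a suitable action based on the chat context, which makes its behavior more natural and realistic.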

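-### 🎯 Action features
+### Action features

 - 🧠 **Intelligent activation**: MaiMai decides whether to use an Action based on multiple conditions
-- 🎲 **Randomness**: adds unpredictability to behavior, closer to how a real person chats
+- 🎲 **Optional randomness**: activation can be driven by a random number, adding unpredictability that is closer to how a real person chats
 - 🤖 **Human-like**: makes MaiMai's responses more natural and more distinctive
 - 🔄 **Context awareness**: reacts appropriately based on the chat context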

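-## 🎯 Two-layer decision mechanism
+---
+
+## 🎯 Basic structure of an Action component
+First, every Action should inherit from the `BaseAction` class.
+
+Second, every Action component should define the following basic information: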
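+```python
+class ExampleAction(BaseAction):
+    action_name = "example_action"  # unique identifier of the action
+    action_description = "This is an example action"  # description of the action
+    activation_type = ActionActivationType.ALWAYS  # ALWAYS is used here as an example
+    mode_enable = ChatMode.ALL  # ALL is used here as an example
+    associated_types = ["text", "emoji", ...]  # associated message types
+    parallel_action = False  # whether this Action may run in parallel with other Actions
+    action_parameters = {"param1": "description of param1", "param2": "description of param2", ...}
+    # Usage scenarios for the Action: these help the LLM decide when to "choose" it
+    action_require = ["usage scenario 1", "usage scenario 2", ...]
+
+    async def execute(self) -> Tuple[bool, str]:
+        """
+        Main logic of the Action
+        
+        Returns:
+            Tuple[bool, str]: (whether it succeeded, description of the result)
+        """
+        # ---- action logic goes here ----
+        return True, "Executed successfully"
+```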
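+#### associated_types: the message types this Action will send, such as text or emoji.
+
+These are passed by the Adapter to the processor.
+
+Taking MaiBot-Napcat-Adapter as an example, the available options are:
+| Type | Description | Format |
+| --- | --- | --- |
+| text | Text message | str |
+| emoji | Emoji/sticker message | str: headerless base64 of the sticker |
+| image | Image message | str: headerless base64 of the image |
+| reply | Reply message | str: ID of the message being replied to |
+| voice | Voice message | str: headerless base64 of WAV audio |
+| command | Command message | See the Adapter documentation |
+| voiceurl | Voice URL message | str: URL of WAV audio |
+| music | Music message | str: the song's music ID on NetEase Cloud Music |
+| videourl | Video URL message | str: URL of the video |
+| file | File message | str: path of the file |
+
+**Note that different processors may support different message types. Keep this in mind during development.**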
+
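+#### action_parameters: descriptions of this Action's parameters.
+This is a dictionary whose keys are parameter names and whose values are parameter descriptions. It helps the LLM understand how to use the Action; the LLM returns the corresponding parameters, which are then passed to the Action's `action_data` attribute. The format is exactly the one you defined **(unless the LLM misbehaves and returns malformed content)**.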
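Because the LLM fills in these parameters, a value may occasionally be missing or malformed. The following is a minimal sketch of reading them defensively, assuming `action_data` is a plain dict as described above. `read_param` and the stand-in data are hypothetical names used only for illustration; they are not part of the plugin system.

```python
from typing import Any, Dict


def read_param(action_data: Dict[str, Any], name: str, default: str = "") -> str:
    """Return a parameter as a stripped string, falling back to `default`.

    Hypothetical helper: a missing key, a None value, a non-string value,
    or an empty string all count as "no usable parameter was supplied".
    """
    value = action_data.get(name)
    if not isinstance(value, str):
        return default
    value = value.strip()
    return value if value else default


# Stand-in for what the LLM might return for
# action_parameters = {"param1": "...", "param2": "..."}
action_data = {"param1": "  hello  ", "param2": None}

param1 = read_param(action_data, "param1", default="fallback")
param2 = read_param(action_data, "param2", default="fallback")
```

Inside a real `execute()` the same pattern would presumably be applied to `self.action_data`, so that a malformed LLM reply degrades to a default instead of raising.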
+
+---
+
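+## 🎯 Decision mechanism for Action invocation

 Actions use a **two-layer decision mechanism** to optimize performance and decision quality:

-### Layer 1: Activation Control
+> Design goal: reduce the LLM's decision load when many plugins are loaded, so MaiMai does not get stuck choosing among too many options.

-**Activation determines whether MaiMai "knows" the Action exists**, that is, whether the Action enters the decision candidate pool. **MaiMai will never choose an Action that is not activated**.
+**Layer 1: Activation Control**

-> 🎯 **Design goal**: reduce the LLM's decision load when many plugins are loaded, so MaiMai does not get stuck choosing among too many options.
+Activation determines whether MaiMai **"knows"** the Action exists, that is, whether the Action enters the decision candidate pool. MaiMai will never choose an Action that is not activated.

-#### Activation types
+**Layer 2: Usage Decision**
+
+Once an Action is activated, the usage conditions determine when MaiMai will **"choose"** to use it.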
+
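+### Decision parameters in detail 🔧
+
+#### Layer 1: ActivationType activation types

 | Activation type | Description | Use case |
-| ------------- | ------------------------------------------- | ------------------------ |
-| `NEVER`     | Never activated; the Action is invisible to MaiMai | Temporarily disabling an Action |
-| `ALWAYS`    | Always activated; the Action is always in MaiMai's candidate pool | Core features such as reply and no-reply |
-| `LLM_JUDGE` | The LLM judges whether the current context calls for activating this Action | Complex scenarios that need intelligent judgment |
+| ----------- | ---------------------------------------- | ---------------------- |
+| [`NEVER`](#never-activation)     | Never activated; the Action is invisible to MaiMai | Temporarily disabling an Action |
+| [`ALWAYS`](#always-activation)    | Always activated; the Action is always in MaiMai's candidate pool | Core features such as reply and no-reply |
+| [`LLM_JUDGE`](#llm_judge-activation) | The LLM judges whether the current context calls for activating this Action | Complex scenarios that need intelligent judgment |
 | `RANDOM`    | Activation is decided by random probability | Features that add behavioral randomness |
 | `KEYWORD`   | Activated when specific keywords are detected | Features with explicit trigger conditions |

-#### Chat mode control
+#### `NEVER` activation

-| Mode                | Description                     |
-| ------------------- | ------------------------ |
-| `ChatMode.FOCUS`  | Can be activated only in focus chat mode |
-| `ChatMode.NORMAL` | Can be activated only in normal chat mode |
-| `ChatMode.ALL`    | Can be activated in all modes |
+`ActionActivationType.NEVER` means the Action is never activated.

-### Layer 2: Usage Decision
+```python
+class DisabledAction(BaseAction):
+    activation_type = ActionActivationType.NEVER  # never activated
+    
+    async def execute(self) -> Tuple[bool, str]:
+        # This Action is never executed
+        return False, "This Action is disabled"
+```
+
+#### `ALWAYS` activation
+
+`ActionActivationType.ALWAYS` means the Action is always activated, that is, it is always in the Action candidate pool.
+
+This activation type is typically used for core features such as reply or no-reply.
+
+```python
+class AlwaysActivatedAction(BaseAction):
+    activation_type = ActionActivationType.ALWAYS  # always activated
+    
+    async def execute(self) -> Tuple[bool, str]:
+        # Core feature logic
+        return True, "Executed the core feature"
+```
+
+#### `LLM_JUDGE` activation
+
+`ActionActivationType.LLM_JUDGE` lets the LLM decide whether this Action joins the candidate pool.
+
+The LLM bases that judgment on the `llm_judge_prompt` preset in code and on the chat context, which is supplied automatically.
+
+Using this type therefore requires defining the `llm_judge_prompt` attribute.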
+
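+```python
+class LLMJudgedAction(BaseAction):
+    activation_type = ActionActivationType.LLM_JUDGE  # activated via LLM judgment
+    # Prompt for the LLM judgment
+    llm_judge_prompt = (
+        "Conditions for deciding whether this action is needed:\n"
+        "1. The user wants to invoke the XXX action\n"
+        "...\n"
+        "Answer \"是\" (yes) or \"否\" (no).\n"
+    )
+
+    async def execute(self) -> Tuple[bool, str]:
+        # Runs only after the LLM judged activation appropriate
+        return True, "Executed the LLM-judged feature"
+```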
+
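+#### `RANDOM` activation
+
+`ActionActivationType.RANDOM` adds this Action to the candidate pool based on a random probability.
+
+The probability is controlled by `random_activation_probability` in code. Internally, `random.random()` generates a random number in the range [0, 1), which is compared with this probability.
+
+Using this type therefore requires defining the `random_activation_probability` attribute.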
+
+```python
+class SurpriseAction(BaseAction):
+    activation_type = ActionActivationType.RANDOM  # activated by random probability
+    # Random activation probability
+    random_activation_probability = 0.1  # 10% chance of activation
+  
+    async def execute(self) -> Tuple[bool, str]:
+        # Surprise action logic
+        return True, "Sent surprise content"
+```
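The comparison described for `RANDOM` activation can be sketched in isolation. This is an illustrative stand-in, not the framework's code: `passes_random_activation` and its optional `rng` parameter are hypothetical names, and only the `random.random() < probability` comparison comes from the text above.

```python
import random
from typing import Optional


def passes_random_activation(probability: float, rng: Optional[random.Random] = None) -> bool:
    """Return True when a draw from [0.0, 1.0) falls below `probability`."""
    source = rng if rng is not None else random
    return source.random() < probability


# Rough frequency check with a seeded generator so the run is repeatable.
rng = random.Random(42)
hits = sum(passes_random_activation(0.1, rng) for _ in range(10_000))
```

Since `random.random()` never returns 1.0, a probability of `1.0` always activates and `0.0` never does. Over many draws, `hits` should land near 10% of the attempts.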
+
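+#### `KEYWORD` activation
+
+`ActionActivationType.KEYWORD` activates this Action when specific keywords are detected.
+
+The keywords are defined by `activation_keywords` in code, and `keyword_case_sensitive` controls whether matching is case-sensitive. Internally, the `in` operator checks whether the message content contains any of these keywords.
+
+Using this type therefore requires defining the `activation_keywords` and `keyword_case_sensitive` attributes.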
+
+```python
+class GreetingAction(BaseAction):
+    activation_type = ActionActivationType.KEYWORD  # keyword activation
+    activation_keywords = ["你好", "hello", "hi", "嗨"]  # keyword configuration
+    keyword_case_sensitive = False  # case-insensitive
+  
+    async def execute(self) -> Tuple[bool, str]:
+        # Greeting logic
+        return True, "Sent a greeting"
+```
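The keyword check described above can be approximated as follows. This is a sketch under the stated assumption that matching is a plain `in` substring test; `matches_keyword` is a hypothetical helper, not the framework's function.

```python
from typing import Iterable


def matches_keyword(text: str, keywords: Iterable[str], case_sensitive: bool = False) -> bool:
    """Return True if any keyword occurs as a substring of `text`."""
    if not case_sensitive:
        text = text.lower()
        keywords = [k.lower() for k in keywords]
    return any(k in text for k in keywords)


greeting_keywords = ["你好", "hello", "hi", "嗨"]
greeted = matches_keyword("Hello there", greeting_keywords)
substring_hit = matches_keyword("this is fine", greeting_keywords)
```

Because `in` performs substring matching, a short keyword such as `"hi"` also matches inside `"this"`. Very short keywords may therefore activate an Action more often than intended.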
+
+#### Layer 2: Usage decision

 **Once an Action is activated, the usage conditions determine when MaiMai will "choose" to use it**.

@@ -49,16 +190,15 @@ Action采用**两层决策机制**来优化性能和决策质量：
 - `action_parameters`: required parameters, which affect whether the Action can be executed
 - The current chat context and MaiMai's decision logic

-### 🎬 Decision flow example
+---

-Suppose there is a "send emoji" Action:
+### Decision flow example

 ```python
 class EmojiAction(BaseAction):
     # Layer 1: activation control
-    focus_activation_type = ActionActivationType.RANDOM  # random activation in focus mode
-    normal_activation_type = ActionActivationType.KEYWORD  # keyword activation in normal mode
-    activation_keywords = ["表情", "emoji", "😊"]
+    activation_type = ActionActivationType.RANDOM  # random activation
+    random_activation_probability = 0.1  # 10% chance of activation

     # Layer 2: usage decision
     action_require = [
@@ -72,311 +212,85 @@ class EmojiAction(BaseAction):

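 1. **Layer 1 activation check**:

-   - Normal mode: MaiMai "knows" it can use this Action only when the user message contains "表情", "emoji", or "😊"
-   - Focus mode: random activation; there is some probability that MaiMai "sees" this Action
+    - A random number drives the decision: MaiMai "knows" it can use this Action only when `random.random() < self.random_activation_probability`
 2. **Layer 2 usage decision**:

    - Even when the Action is activated, MaiMai still checks the conditions in `action_require` before actually choosing to use it
    - For example: if an emoji was just sent, the "do not send several emojis in a row" requirement may lead MaiMai not to choose this Action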

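-## 📋 Checklist of required Action items
-
-Every Action class **must** include the following attributes:
-
-### 1. Required activation control items
+---

+## Built-in Action attributes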
 ```python
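-# Activation type in focus mode
-focus_activation_type = ActionActivationType.LLM_JUDGE
-
-# Activation type in normal mode
-normal_activation_type = ActionActivationType.KEYWORD
-
-# Enabled chat modes
-mode_enable = ChatMode.ALL
-
-# Whether this Action may run in parallel with other Actions
-parallel_action = False
-```
-
-### 2. Required basic information
-
-```python
-# Unique identifier name of the Action
-action_name = "my_action"
-
-# Description of what the Action does
-action_description = "Describe the Action's specific function and purpose"
-```
-
-### 3. Required feature definitions
-
-```python
-# Action parameter definitions: tell the LLM which parameters execution needs
-action_parameters = {
-    "param1": "description of param1",
-    "param2": "description of param2"
-}
-
-# Usage scenarios for the Action: help the LLM decide when to "choose" it
-action_require = [
-    "usage scenario 1",
-    "usage scenario 2"
-]
-
-# Associated message types: what kinds of content the Action can handle
-associated_types = ["text", "emoji", "image"]
-```
-
-### 4. Required execute method
-
-```python
-async def execute(self) -> Tuple[bool, str]:
-    """
-    Main logic of the Action
-    
-    Returns:
-        Tuple[bool, str]: (whether it succeeded, description of the result)
-    """
-    # Code that performs the action
-    success = True
-    message = "Action executed successfully"
-    
-    return success, message
-```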
-
-## 🔧 Activation types in detail
-
-### KEYWORD activation
-
-Activates the Action when specific keywords are detected:
-
-```python
-class GreetingAction(BaseAction):
-    focus_activation_type = ActionActivationType.KEYWORD
-    normal_activation_type = ActionActivationType.KEYWORD
-  
-    # Keyword configuration
-    activation_keywords = ["你好", "hello", "hi", "嗨"]
-    keyword_case_sensitive = False  # case-insensitive
-  
-    async def execute(self) -> Tuple[bool, str]:
-        # Greeting logic
-        return True, "Sent a greeting"
-```
-
-### LLM_JUDGE activation
-
-The LLM judges whether to activate:
-
-```python
-class HelpAction(BaseAction):
-    focus_activation_type = ActionActivationType.LLM_JUDGE
-    normal_activation_type = ActionActivationType.LLM_JUDGE
-  
-    # Prompt for the LLM judgment
-    llm_judge_prompt = """
-    Conditions for deciding whether the help action is needed:
-    1. The user expresses confusion or needs help
-    2. The user asked a question but did not get a satisfactory answer
-    3. Technical terms or complex concepts appear in the conversation
-  
-    Answer "是" (yes) or "否" (no).
-    """
-  
-    async def execute(self) -> Tuple[bool, str]:
-        # Help logic
-        return True, "Provided help"
-```
-
-### RANDOM activation
-
-Activates based on random probability:
-
-```python
-class SurpriseAction(BaseAction):
-    focus_activation_type = ActionActivationType.RANDOM
-    normal_activation_type = ActionActivationType.RANDOM
-  
-    # Random activation probability
-    random_activation_probability = 0.1  # 10% chance of activation
-  
-    async def execute(self) -> Tuple[bool, str]:
-        # Surprise action logic
-        return True, "Sent surprise content"
-```
-
-### ALWAYS activation
-
-Always activated; typically used for core features:
-
-```python
-class CoreAction(BaseAction):
-    focus_activation_type = ActionActivationType.ALWAYS
-    normal_activation_type = ActionActivationType.ALWAYS
-    
-    async def execute(self) -> Tuple[bool, str]:
-        # Core feature logic
-        return True, "Executed the core feature"
-```
-
-### NEVER activation
-
-Never activated; used to temporarily disable an Action:
-
-```python
-class DisabledAction(BaseAction):
-    focus_activation_type = ActionActivationType.NEVER
-    normal_activation_type = ActionActivationType.NEVER
-    
-    async def execute(self) -> Tuple[bool, str]:
-        # This method is never called
-        return False, "Disabled"
-```
-
-## 📚 Built-in BaseAction attributes and methods
-
-### Built-in attributes
-
-```python
-class MyAction(BaseAction):
+class BaseAction:
     def __init__(self):
         # Message-related attributes
-        self.message          # current message object
-        self.chat_stream      # chat stream object
-        self.user_id          # user ID
-        self.user_nickname    # user nickname
-        self.platform         # platform type (qq, telegram, etc.)
-        self.chat_id          # chat ID
-        self.is_group         # whether this is a group chat
+        self.log_prefix: str          # log prefix
+        self.group_id: str            # group ID
+        self.group_name: str          # group name
+        self.user_id: str             # user ID
+        self.user_nickname: str       # user nickname
+        self.platform: str            # platform type (qq, telegram, etc.)
+        self.chat_id: str             # chat ID
+        self.chat_stream: ChatStream  # chat stream object
+        self.is_group: bool           # whether this is a group chat
+
+        # Message body
+        self.action_message: dict     # message data

         # Action-related attributes
-        self.action_data      # data for the Action's execution
-        self.thinking_id      # thinking ID
-        self.matched_groups   # matched groups (if a regex matched)
+        self.action_data: dict        # data for the Action's execution
+        self.thinking_id: str         # thinking ID
 ```
-
-### Built-in methods
+`action_message` is a dictionary containing the following key-value pairs (unnecessary ones are omitted):

 ```python
-class MyAction(BaseAction):
+{
+    "message_id": "1234567890",  # message ID, str
+    "time": 1627545600.0,  # timestamp, float
+    "chat_id": "abcdef123456",  # chat ID, str
+    "reply_to": None,  # ID of the message being replied to, str or None
+    "interest_value": 0.85,  # interest value, float
+    "is_mentioned": True,  # whether the bot was mentioned, bool
+    "chat_info_last_active_time": 1627548600.0,  # last active time, float
+    "processed_plain_text": None,  # processed text, str or None
+    "additional_config": None,  # additional_config passed by the Adapter, dict or None
+    "is_emoji": False,  # whether this is an emoji/sticker, bool
+    "is_picid": False,  # whether this is a picture ID, bool
+    "is_command": False  # whether this is a command, bool
+}
+```
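Several of these fields may be `None`, so code that reads `action_message` benefits from explicit fallbacks. The sketch below assumes only the keys listed above; `summarize_message` is a hypothetical helper for illustration, not something `BaseAction` provides.

```python
from typing import Any, Dict


def summarize_message(action_message: Dict[str, Any]) -> str:
    """Build a short log line, tolerating None values and missing keys."""
    text = action_message.get("processed_plain_text") or "<no text>"
    parts = [f"id={action_message.get('message_id')}", f"text={text}"]
    reply_to = action_message.get("reply_to")
    if reply_to is not None:
        parts.append(f"reply_to={reply_to}")
    if action_message.get("is_mentioned"):
        parts.append("mentioned")
    return " ".join(parts)


sample = {
    "message_id": "1234567890",
    "reply_to": None,
    "is_mentioned": True,
    "processed_plain_text": None,
}
summary = summarize_message(sample)
```

Using `.get()` with a fallback keeps the helper safe when a key is absent as well as when its value is `None`.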
+
+For the exact format of some values, consult the database yourself.
+
+---
+
+## Built-in Action methods
+```python
+class BaseAction:
     # Configuration
     def get_config(self, key: str, default=None):
-        """Get a config value"""
-        pass
+        """Get a plugin config value, using nested-key access"""
     
-    # Message sending
-    async def send_text(self, text: str):
+    async def wait_for_new_message(self, timeout: int = 1200) -> Tuple[bool, str]:
+        """Wait for a new message or time out"""
+
+    async def send_text(self, content: str, reply_to: str = "", reply_to_platform_id: str = "", typing: bool = False) -> bool:
         """Send a text message"""
-        pass

-    async def send_emoji(self, emoji_base64: str):
+    async def send_emoji(self, emoji_base64: str) -> bool:
         """Send an emoji/sticker"""
-        pass

-    async def send_image(self, image_base64: str):
+    async def send_image(self, image_base64: str) -> bool:
         """Send an image"""
-        pass

-    # Action records
-    async def store_action_info(self, **kwargs):
-        """Record action info"""
-        pass
+    async def send_custom(self, message_type: str, content: str, typing: bool = False, reply_to: str = "") -> bool:
+        """Send a message of a custom type"""
+
+    async def store_action_info(self, action_build_into_prompt: bool = False, action_prompt_display: str = "", action_done: bool = True) -> None:
+        """Store action info in the database"""
+
+    async def send_command(self, command_name: str, args: Optional[dict] = None, display_message: str = "", storage_message: bool = True) -> bool:
+        """Send a command message"""
 ```
-
-## 🎯 Complete Action example
-
-```python
-from src.plugin_system import BaseAction, ActionActivationType, ChatMode
-from typing import Tuple
-
-class ExampleAction(BaseAction):
-    """Example Action: shows the complete structure of an Action"""
-    
-    # === Activation control ===
-    focus_activation_type = ActionActivationType.LLM_JUDGE
-    normal_activation_type = ActionActivationType.KEYWORD
-    mode_enable = ChatMode.ALL
-    parallel_action = False
-    
-    # Keyword activation configuration
-    activation_keywords = ["示例", "测试", "example"]
-    keyword_case_sensitive = False
-    
-    # Prompt for the LLM judgment
-    llm_judge_prompt = "Activate when the user needs an example or a test feature"
-    
-    # Random activation probability (if the RANDOM type is used)
-    random_activation_probability = 0.2
-    
-    # === Basic information ===
-    action_name = "example_action"
-    action_description = "This is an example Action that demonstrates the complete structure of an Action"
-    
-    # === Feature definitions ===
-    action_parameters = {
-        "content": "the content to process",
-        "type": "the processing type",
-        "options": "optional configuration"
-    }
-    
-    action_require = [
-        "Use when the user needs the example feature",
-        "Suitable for testing and demos",
-        "Do not use frequently in real conversations"
-    ]
-    
-    associated_types = ["text", "emoji"]
-    
-    async def execute(self) -> Tuple[bool, str]:
-        """Run the example Action"""
-        try:
-            # Read the Action parameters
-            content = self.action_data.get("content", "default content")
-            action_type = self.action_data.get("type", "default")
-            
-            # Read the configuration
-            enable_feature = self.get_config("example.enable_advanced", False)
-            max_length = self.get_config("example.max_length", 100)
-            
-            # Run the actual logic
-            if action_type == "greeting":
-                await self.send_text(f"Hello! Here is the example content: {content}")
-            elif action_type == "info":
-                await self.send_text(f"Info: {content[:max_length]}")
-            else:
-                await self.send_text("Executed the example Action")
-            
-            # Record the action info
-            await self.store_action_info(
-                action_build_into_prompt=True,
-                action_prompt_display=f"Executed the example action: {action_type}",
-                action_done=True
-            )
-            
-            return True, f"Example Action succeeded, type: {action_type}"
-            
-        except Exception as e:
-            return False, f"Execution failed: {str(e)}"
-```
-
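-## 🎯 Best practices
-
-### 1. Action design principles
-
-- **Single responsibility**: each Action handles exactly one clearly defined function
-- **Intelligent activation**: choose the activation type sensibly and avoid over-activation
-- **Clear descriptions**: provide accurate `action_require` entries to help the LLM decide
-- **Error handling**: handle exceptions raised during execution properly
-
-### 2. Performance optimization
-
-- **Activation control**: use a suitable activation type to reduce unnecessary LLM calls
-- **Parallel execution**: set `parallel_action` with care to avoid conflicts
-- **Resource management**: release held resources promptly
-
-### 3. Debugging tips
-
-- **Logging**: add logs at key points
-- **Parameter validation**: check that `action_data` is valid
-- **Configuration testing**: test behavior under different configurations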
+For the exact parameters and usage, see the definition of the `BaseAction` base class.
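The `get_config` docstring mentions nested keys such as `"section.subsection.key"`. The sketch below shows one way such a dotted lookup could work over a plain dict. It is an assumption for illustration and not MaiBot's actual implementation; `get_nested` is a hypothetical name.

```python
from typing import Any, Dict


def get_nested(config: Dict[str, Any], key: str, default: Any = None) -> Any:
    """Walk `config` along a dotted key such as "section.subsection.key"."""
    current: Any = config
    for part in key.split("."):
        # Stop as soon as the path leaves dict territory or a part is missing.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


plugin_config = {"example": {"enable_advanced": True, "max_length": 100}}
max_length = get_nested(plugin_config, "example.max_length", 50)
missing = get_nested(plugin_config, "example.not_there", "fallback")
```

Returning `default` at the first missing segment means callers never need to guard against `KeyError` for optional settings.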
--- a/src/plugin_system/base/base_action.py
+++ b/src/plugin_system/base/base_action.py
@@ -49,12 +49,10 @@ class BaseAction(ABC):
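             reasoning: the reason for executing this action
             cycle_timers: dictionary of timers
             thinking_id: thinking ID
-            expressor: expressor object
-            replyer: replyer object
             chat_stream: chat stream object
             log_prefix: log prefix
-            shutting_down: whether shutdown is in progress
             plugin_config: plugin configuration dictionary
+            action_message: message data
             **kwargs: other arguments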
        """
        if plugin_config is None:
@@ -414,23 +412,11 @@ class BaseAction(ABC):
        """
        return await self.execute()

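-    # def get_action_context(self, key: str, default=None):
-    #     """Get action context information
-
-    #     Args:
-    #         key: context key name
-    #         default: default value
-
-    #     Returns:
-    #         Any: the context value or the default
-    #     """
-    #     return self.api.get_action_context(key, default)
-
     def get_config(self, key: str, default=None):
-        """Get a plugin config value; nested-key access is supported
+        """Get a plugin config value, using nested-key access

         Args:
-            key: config key name; nested access is supported, e.g. "section.subsection.key"
+            key: config key name, using nested access, e.g. "section.subsection.key"
             default: default value

         Returns: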