feat(chat): add llm and punctuation modes to the reply splitter

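This update refactors the reply splitter and introduces a `split_mode` configuration option, which lets users choose between two splitting modes.

- **`llm` mode**: Keeps the existing behavior, in which the large language model decides where to break a reply by emitting `[SPLIT]` markers. The prompt for this mode has been further tuned to guide the model toward more natural splits.
- **`punctuation` mode**: Restores the traditional punctuation-based splitting logic. This is now the default mode, so behavior after updating stays consistent with older versions.

In addition, the `at_user` plugin has been adapted to correctly handle the multi-segment messages that `llm` mode may produce.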
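The mode selection described above can be sketched as a standalone function. This is a minimal illustration and not the project's code: `split_reply` is a hypothetical name, and the regex-based punctuation branch is only an assumed stand-in for `split_into_sentences_w_remove_punctuation`, whose real behavior is not shown in this commit.

```python
import re

SPLIT_MARKER = "[SPLIT]"


def split_reply(text: str, split_mode: str = "punctuation") -> list[str]:
    """Split a reply according to split_mode ("llm" or "punctuation")."""
    if split_mode == "llm":
        if SPLIT_MARKER in text:
            # Trim whitespace around each piece and drop empty pieces.
            return [part.strip() for part in text.split(SPLIT_MARKER) if part.strip()]
        # No marker present: send the reply as a single message.
        return [text]

    # Any other value falls through to punctuation splitting,
    # mirroring the diff's final else branch.
    # Assumption: a simple regex stands in for the real helper.
    parts = re.split(r"[。！？!?.\n]+", text)
    return [part.strip() for part in parts if part.strip()]
```

Note that in `llm` mode a reply without any `[SPLIT]` marker is returned whole and does not fall back to punctuation splitting, which matches the branch structure in the diff.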
2025-09-11 17:12:01 +08:00
parent 62c548ad2b
commit 0cb2fa3373
5 changed files with 36 additions and 19 deletions
--- a/src/chat/utils/utils.py
+++ b/src/chat/utils/utils.py
@@ -331,15 +331,21 @@ def process_llm_response(text: str, enable_splitter: bool = True, enable_chinese
    )

    if global_config.response_splitter.enable and enable_splitter:
-        logger.info("回复分割器已启用。")
-        if "[SPLIT]" in cleaned_text:
+        logger.info(f"回复分割器已启用，模式: {global_config.response_splitter.split_mode}。")
+        
+        split_mode = global_config.response_splitter.split_mode
+        
+        if split_mode == "llm" and "[SPLIT]" in cleaned_text:
+            logger.debug("检测到 [SPLIT] 标记，使用 LLM 自定义分割。")
            split_sentences_raw = cleaned_text.split("[SPLIT]")
-            # Strip any leading or trailing spaces and newlines the LLM may have added to each sentence, and drop empty sentences
            split_sentences = [s.strip() for s in split_sentences_raw if s.strip()]
-            logger.debug(f"LLM 自定义分割结果: {split_sentences}")
        else:
-            # If there is no [SPLIT] marker, do not split at all
-            split_sentences = [cleaned_text]
+            if split_mode == "llm":
+                logger.debug("未检测到 [SPLIT] 标记，本次不进行分割。")
+                split_sentences = [cleaned_text]
+            else: # mode == "punctuation"
+                logger.debug("使用基于标点的传统模式进行分割。")
+                split_sentences = split_into_sentences_w_remove_punctuation(cleaned_text)
    else:
        logger.debug("回复分割器已禁用。")
        split_sentences = [cleaned_text]