Compare commits
2 commits: gitea...cf500a47de

| Author | SHA1 | Date |
|---|---|---|
| | cf500a47de | |
| | 47c19995db | |
102
BEDROCK_INTEGRATION.md
Normal file
@@ -0,0 +1,102 @@
# AWS Bedrock Integration Complete ✅

## Quick Start

### 1. Install dependencies

```bash
pip install aioboto3 botocore
```

### 2. Configure credentials

Add the following to `config/model_config.toml`:

```toml
[[api_providers]]
name = "bedrock_us_east"
base_url = ""
api_key = "YOUR_AWS_ACCESS_KEY_ID"
client_type = "bedrock"
timeout = 60

[api_providers.extra_params]
aws_secret_key = "YOUR_AWS_SECRET_ACCESS_KEY"
region = "us-east-1"

[[models]]
model_identifier = "us.anthropic.claude-3-5-sonnet-20240620-v1:0"
name = "claude-3.5-sonnet-bedrock"
api_provider = "bedrock_us_east"
price_in = 3.0
price_out = 15.0
```

### 3. Usage example

```python
from src.llm_models import get_llm_client
from src.llm_models.payload_content.message import MessageBuilder
# Note: get_model_info must also be imported from the model registry;
# its import path is not shown in the original snippet.

client = get_llm_client("bedrock_us_east")
builder = MessageBuilder()
builder.add_user_message("Hello, AWS Bedrock!")

# Must run inside an async function (uses await)
response = await client.get_response(
    model_info=get_model_info("claude-3.5-sonnet-bedrock"),
    message_list=[builder.build()],
    max_tokens=1024
)

print(response.content)
```

## New files

- ✅ `src/llm_models/model_client/bedrock_client.py` - Bedrock client implementation
- ✅ `docs/integrations/Bedrock.md` - Full documentation
- ✅ `scripts/test_bedrock_client.py` - Test script

## Modified files

- ✅ `src/llm_models/model_client/__init__.py` - Added the Bedrock import
- ✅ `src/config/api_ada_configs.py` - Added the `bedrock` client_type
- ✅ `template/model_config_template.toml` - Added a (commented-out) Bedrock config example
- ✅ `requirements.txt` - Added the aioboto3 and botocore dependencies
- ✅ `pyproject.toml` - Added the aioboto3 and botocore dependencies

## Supported features

- ✅ **Chat generation**: multi-turn conversations
- ✅ **Streaming output**: streaming responses
- ✅ **Tool calling**: full Tool Use support
- ✅ **Multimodal**: image input
- ✅ **Text embeddings**: Titan Embeddings
- ✅ **Cross-region inference**: Inference Profiles

## Supported models

- Amazon Nova family (Micro/Lite/Pro)
- Anthropic Claude 3/3.5 family
- Meta Llama 2/3 family
- Mistral AI family
- Cohere Command family
- AI21 Jamba family
- Stability AI SDXL

## Testing

```bash
# Fill in your credentials, then run the test
python scripts/test_bedrock_client.py
```

## Documentation

Full documentation: `docs/integrations/Bedrock.md`

---

**Integration status**: ✅ production ready
**Integration date**: 2025-12-06

471
MEMORY_PROFILING.md
Normal file
@@ -0,0 +1,471 @@
# Bot Memory Profiling Tool Guide

A unified memory diagnostics tool that provides process monitoring, object analysis, and data visualization.

## 🚀 Quick Start

> **Tip**: run the script with the project virtual environment (`.\.venv\Scripts\python.exe`).

```powershell
# Show help
.\.venv\Scripts\python.exe scripts/memory_profiler.py --help

# Process monitoring mode (simplest)
.\.venv\Scripts\python.exe scripts/memory_profiler.py --monitor

# Object analysis mode (deep analysis)
.\.venv\Scripts\python.exe scripts/memory_profiler.py --objects --output memory_data.txt

# Visualization mode (generate charts)
.\.venv\Scripts\python.exe scripts/memory_profiler.py --visualize --input memory_data.txt.jsonl
```

**Or use the short form** (if your system `python` already points at the virtual environment):

```powershell
python scripts/memory_profiler.py --monitor
```

## 📦 Installing Dependencies

```powershell
# Basic features (process monitoring)
pip install psutil

# Object analysis
pip install pympler

# Visualization
pip install matplotlib

# Install everything at once
pip install psutil pympler matplotlib
```

## 🔧 The Three Modes in Detail

### 1. Process monitoring mode (--monitor)

**Purpose**: monitor the bot process's total memory and its child processes from the outside

**Features**:
- ✅ Automatically launches bot.py (using the virtual environment)
- ✅ Shows process memory (RSS, VMS) in real time
- ✅ Lists all child processes and their memory usage
- ✅ Shows the bot's output log
- ✅ Automatically saves the monitoring history

**Examples**:

```powershell
# Basic usage
python scripts/memory_profiler.py --monitor

# Custom monitoring interval (10 seconds)
python scripts/memory_profiler.py --monitor --interval 10

# Short form
python scripts/memory_profiler.py -m -i 5
```

**Sample output**:

```
================================================================================
Checkpoint #1 - 14:23:15
Bot process (PID: 12345)
  RSS: 45.82 MB
  VMS: 12.34 MB
  Share: 0.25%
  Child processes: 2
  Child process memory: 723.64 MB
  Total memory: 769.46 MB

📋 Child process details:
  [1] PID 12346: python.exe - 520.15 MB
      Command: python.exe -m chromadb.server ...
  [2] PID 12347: python.exe - 203.49 MB
      Command: python.exe -m uvicorn ...
================================================================================
```

**Saved to**: `data/memory_diagnostics/process_monitor_<timestamp>_pid<PID>.txt`

---
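Monitor mode gets these numbers via psutil. As a rough, dependency-free cross-check (a sketch, not part of the tool), the standard library can report the current process's own peak RSS on Unix:

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of the current process, in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS in bytes
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor

print(f"peak RSS: {peak_rss_mb():.2f} MB")
```

Unlike psutil, this only sees the current process, so it cannot replace the child-process breakdown above.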

### 2. Object analysis mode (--objects)

**Purpose**: count the memory usage of every Python object inside the bot process

**Features**:
- ✅ Counts all object types (dict, list, str, AsyncOpenAI, and so on)
- ✅ **Per-module memory breakdown (new)** - shows which module holds the most memory
- ✅ Includes objects from all threads
- ✅ Shows object deltas (diff)
- ✅ Thread information and GC statistics
- ✅ Saves JSONL data for visualization

**Examples**:

```powershell
# Basic usage (specifying an output file is recommended)
python scripts/memory_profiler.py --objects --output memory_data.txt

# Custom parameters
python scripts/memory_profiler.py --objects \
    --interval 10 \
    --output memory_data.txt \
    --object-limit 30

# Short form
python scripts/memory_profiler.py -o -i 10 --output data.txt -l 30
```

**Sample output**:

```
================================================================================
🔍 Object-level memory analysis #1 - 14:25:30
================================================================================

📦 Object statistics (top 20 types):

Type                                              Count        Total size
--------------------------------------------------------------------------------
<class 'dict'>                                    125,843      45.23 MB
<class 'str'>                                     234,567      23.45 MB
<class 'list'>                                    56,789       12.34 MB
<class 'tuple'>                                   89,012       8.90 MB
<class 'openai.resources.chat.completions'>       12           5.67 MB
...

📚 Per-module memory usage (top 20 modules):

Module                 Objects      Total memory
--------------------------------------------------------------------------------
builtins               169,144      26.20 MB
src                    12,345       5.67 MB
openai                 3,456        2.34 MB
chromadb               2,345        1.89 MB
...

Total modules: 85

🧵 Threads (8):
  [1] ✓ MainThread
  [2] ✓ AsyncOpenAIClient (daemon)
  [3] ✓ ChromaDBWorker (daemon)
  ...

🗑️ Garbage collection:
  Gen 0: 1,234 collections
  Gen 1: 56 collections
  Gen 2: 3 collections
  Tracked objects: 456,789

📊 Total objects: 567,890
================================================================================
```

**Every third iteration also shows the object delta**:

```
📈 Object delta analysis:
--------------------------------------------------------------------------------
types                | # objects   | total size
==================== | =========== | ============
<class 'dict'>       | +1234       | +1.23 MB
<class 'str'>        | +567        | +0.56 MB
...
--------------------------------------------------------------------------------
```

**Saved to**:
- Text: `<output>.txt`
- Structured data: `<output>.txt.jsonl`

---
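The per-type statistics and deltas above come from pympler. As a rough stdlib-only analogue (a sketch, not how the script is implemented), `tracemalloc` can show where allocations grow between two snapshots:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulate a leak: keep a reference so the allocations survive the snapshot
leaked = [list(range(100)) for _ in range(1000)]

after = tracemalloc.take_snapshot()
# Group allocation growth by source line, largest differences first
stats = after.compare_to(before, "lineno")
for stat in stats[:5]:
    print(stat)
tracemalloc.stop()
```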

### 3. Visualization mode (--visualize)

**Purpose**: turn the JSONL data produced by object analysis mode into charts

**Features**:
- ✅ Shows how each object type's memory changes over time
- ✅ Automatically picks the top N types by memory usage
- ✅ Produces a high-resolution PNG chart

**Examples**:

```powershell
# Basic usage
python scripts/memory_profiler.py --visualize \
    --input memory_data.txt.jsonl

# Custom parameters
python scripts/memory_profiler.py --visualize \
    --input memory_data.txt.jsonl \
    --top 15 \
    --plot-output my_plot.png

# Short form
python scripts/memory_profiler.py -v -i data.txt.jsonl -t 15
```

**Output**: a PNG image plotting the memory usage of the top N object types over time

**Saved to**: `memory_analysis_plot.png` by default; override with `--plot-output`

---

## 💡 Usage Scenarios

| Scenario | Recommended mode | Command |
|------|----------|------|
| Quick look at total memory | `--monitor` | `python scripts/memory_profiler.py -m` |
| Inspect child-process usage | `--monitor` | `python scripts/memory_profiler.py -m` |
| Analyze specific object usage | `--objects` | `python scripts/memory_profiler.py -o --output data.txt` |
| Track down a memory leak | `--objects` | `python scripts/memory_profiler.py -o --output data.txt` |
| Visualize trends | `--visualize` | `python scripts/memory_profiler.py -v -i data.txt.jsonl` |

## 📊 Full Workflows

### Scenario 1: quick memory triage

```powershell
# 1. Run process monitoring (overall picture)
python scripts/memory_profiler.py --monitor --interval 5

# Watch the output; if memory looks abnormal, move on to scenario 2
```

### Scenario 2: deep object analysis

```powershell
# 1. Start object analysis (saving data)
python scripts/memory_profiler.py --objects \
    --interval 10 \
    --output data/memory_diagnostics/analysis_$(Get-Date -Format 'yyyyMMdd_HHmmss').txt

# 2. Let it run for a while (at least 5-10 minutes recommended), then press Ctrl+C

# 3. Generate the chart
python scripts/memory_profiler.py --visualize \
    --input data/memory_diagnostics/analysis_<timestamp>.txt.jsonl \
    --top 15 \
    --plot-output data/memory_diagnostics/plot_<timestamp>.png

# 4. Inspect the chart and look for object types that grow over time
```

### Scenario 3: continuous monitoring

```powershell
# Run object analysis in the background (Windows)
Start-Process powershell -ArgumentList "-Command", "python scripts/memory_profiler.py -o -i 30 --output logs/memory_continuous.txt" -WindowStyle Minimized

# Periodically read the JSONL and generate charts
python scripts/memory_profiler.py -v -i logs/memory_continuous.txt.jsonl -t 20
```

## 🎯 Parameter Reference

### Common parameters

| Parameter | Short | Default | Description |
|------|------|--------|------|
| `--interval` | `-i` | 10 | Monitoring interval (seconds) |

### Object analysis mode

| Parameter | Short | Default | Description |
|------|------|--------|------|
| `--output` | - | none | Output file path (strongly recommended) |
| `--object-limit` | `-l` | 20 | Number of object types to display |

### Visualization mode

| Parameter | Short | Default | Description |
|------|------|--------|------|
| `--input` | - | **required** | Input JSONL file path |
| `--top` | `-t` | 10 | Show the top N object types |
| `--plot-output` | - | `memory_analysis_plot.png` | Output chart path |

## ⚠️ Caveats

### Performance impact

| Mode | Overhead | Notes |
|------|----------|------|
| `--monitor` | < 1% | Nearly free; fine for production |
| `--objects` | 5-15% | Noticeable; prefer a test environment |
| `--visualize` | 0% | Offline analysis; no impact |

### Common issues

**Q: Object analysis mode fails with "pympler not installed"?**
```powershell
pip install pympler
```

**Q: Visualization mode fails with "matplotlib not installed"?**
```powershell
pip install matplotlib
```

**Q: Object analysis mode reports "bot.py has no main_async() or main() function"?**

This is expected. If your bot.py's main logic lives under `if __name__ == "__main__":`, the monitoring thread still runs in the background. You can either:
- keep the bot running; monitoring will keep collecting stats
- or add a `main_async()` or `main()` function to bot.py

**Q: Process monitoring mode shows no child processes?**

Make sure bot.py has actually started its child processes (such as ChromaDB). If you look right after startup, they may not exist yet.

**Q: Where is the JSONL file?**

When you pass `--output <file>`, two files are produced:
- `<file>`: human-readable text
- `<file>.jsonl`: structured data (used for visualization)

## 📁 Output Files

### Process monitoring output

**Location**: `data/memory_diagnostics/process_monitor_<timestamp>_pid<PID>.txt`

**Contents**: the process memory information at each checkpoint

### Object analysis output

**Text file**: `<output>`
- Human-readable format
- Object statistics for each iteration

**JSONL file**: `<output>.jsonl`
- One JSON object per line
- Fields: timestamp, iteration, total_objects, summary, threads, gc_stats
- Used for visualization
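The exact record layout is defined by the script, but given the field names above, the JSONL can also be inspected without matplotlib. A minimal sketch; the row shape inside `summary` (`[type_name, count, total_size]`, matching pympler's summary format) is an assumption:

```python
import json

def top_types(jsonl_text: str, n: int = 5):
    """Return the top-n object types by total size from the last JSONL record."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    last = records[-1]
    # Assumed row shape: [type_name, count, total_size_bytes]
    rows = sorted(last["summary"], key=lambda r: r[2], reverse=True)
    return [(name, size) for name, _count, size in rows[:n]]

# Two fake records standing in for real profiler output
sample = "\n".join([
    json.dumps({"iteration": 1, "summary": [["dict", 10, 1000], ["str", 20, 500]]}),
    json.dumps({"iteration": 2, "summary": [["dict", 12, 1500], ["str", 25, 700]]}),
])
print(top_types(sample, 2))  # → [('dict', 1500), ('str', 700)]
```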

### Visualization output

**PNG image**: `memory_analysis_plot.png` by default
- Line chart of each object type's memory over time
- High resolution, 150 DPI

## 🔍 Diagnostic Tips

### 1. Spotting memory leaks

Run object analysis mode for an extended period and watch for:
- An object type whose count or size keeps growing
- Entries in the delta diff that are always positive

### 2. Locating large objects

**From the object statistics**:
- If `<class 'dict'>` is very large, a cache may not be getting cleared
- If a specific class shows up (such as `AsyncOpenAI`), check its instance count

**From the module statistics** (recommended):
- Look at the 📚 per-module memory section
- If the `src` module is large, your own code is holding many objects
- If third-party modules such as `openai` or `chromadb` are large, the issue may be in how those libraries are used
- Compare different points in time to see which module keeps growing

### 3. Analyzing child-process usage

Use process monitoring mode:
- Read the command lines in the child-process details
- Identify which child process is using a lot of memory (such as ChromaDB)

### 4. Comparing points in time

Use visualization mode:
- Generate the chart and look for curves that keep rising
- Compare memory while different features are running

## 🎓 Advanced Usage

### Long-running monitoring script

Create `monitor_continuously.ps1`:

```powershell
# Continuous monitoring script
$timestamp = Get-Date -Format "yyyyMMdd_HHmmss"
$logPath = "logs/memory_analysis_$timestamp.txt"

Write-Host "Starting continuous monitoring; data saved to: $logPath"
Write-Host "Press Ctrl+C to stop"

python scripts/memory_profiler.py --objects --interval 30 --output $logPath
```

### Automatic daily report

Create `generate_daily_report.ps1`:

```powershell
# Generate the daily memory analysis report
$date = Get-Date -Format "yyyyMMdd"
$jsonlFiles = Get-ChildItem "logs" -Filter "*$date*.jsonl"

foreach ($file in $jsonlFiles) {
    $outputPlot = $file.FullName -replace ".jsonl", "_plot.png"
    python scripts/memory_profiler.py --visualize --input $file.FullName --plot-output $outputPlot --top 20
    Write-Host "Generated chart: $outputPlot"
}
```

## 📚 Further Reading

- **Python memory management**: https://docs.python.org/3/c-api/memory.html
- **psutil docs**: https://psutil.readthedocs.io/
- **Pympler docs**: https://pympler.readthedocs.io/
- **Matplotlib docs**: https://matplotlib.org/

## 🆘 Getting Help

```powershell
# Full help text
python scripts/memory_profiler.py --help

# Show just the examples ("示例" is the literal "examples" marker in the help text)
python scripts/memory_profiler.py --help | Select-String "示例"
```

---

**Quick-start reminder**:

```powershell
# Use the virtual environment (recommended)
.\.venv\Scripts\python.exe scripts/memory_profiler.py --monitor

# Or the system Python
python scripts/memory_profiler.py --monitor

# Deep analysis
.\.venv\Scripts\python.exe scripts/memory_profiler.py --objects --output memory.txt

# Visualization
.\.venv\Scripts\python.exe scripts/memory_profiler.py --visualize --input memory.txt.jsonl
```

### 💡 Virtual environment notes

**Windows**:
```powershell
.\.venv\Scripts\python.exe scripts/memory_profiler.py [options]
```

**Linux/Mac**:
```bash
./.venv/bin/python scripts/memory_profiler.py [options]
```

The script automatically detects and uses the project virtual environment to launch the bot (process monitoring mode); object analysis mode automatically adds the project root to the Python path.

🎉 You now have the full memory analysis toolkit at your disposal!

133
MoFox 重构指导总览.md
@@ -1,133 +0,0 @@

# MoFox Core Refactoring Architecture

The MoFox src directory is strictly split into three layers:

kernel - kernel/foundation layer - provides technical capabilities that are independent of any specific business logic
core - core/domain layer - uses kernel capabilities to implement memory, conversation, behavior, and other core features, without caring about plugins or specific platforms
app - application/assembly/plugin layer - wires kernel and core into a runnable bot system, exposing high-level APIs and plugin extension points
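One way to read the split above is as a dependency rule: app may import core and kernel, core may import kernel, and kernel imports neither. A tiny illustrative check (not part of the refactor itself):

```python
# Lower rank = lower layer; imports may only point downward (or sideways)
LAYER_RANK = {"kernel": 0, "core": 1, "app": 2}

def import_allowed(importer: str, imported: str) -> bool:
    """True if a module in layer `importer` may import from layer `imported`."""
    return LAYER_RANK[importer] >= LAYER_RANK[imported]

print(import_allowed("app", "kernel"))   # → True
print(import_allowed("kernel", "core"))  # → False
```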

## kernel layer:
Contains the following modules:
db: low-level database interface
  __init__.py: exports
  core: database core
    __init__.py: exports
    dialect_adapter.py: database dialect adapter
    engine.py: database engine management
    session.py: database session management
  exceptions.py: database exception definitions
  optimization: database optimizations
    __init__.py: exports
    backends: cache backend implementations
      cache_backend.py: cache backend abstract base class
      local_cache.py: local cache backend
      redis_cache.py: Redis cache backend
    cache_manager.py: multi-level cache manager
  api: operation interfaces
    crud.py: unified CRUD operations
    query.py: high-level query API
vector_db: low-level vector storage interface
  __init__.py: exports, plus a factory function that initializes and returns the vector database service instance
  base.py: abstract base class (ABC) for vector databases, defining the interface every implementation must follow
  chromadb_impl.py: the chromadb implementation, following the VectorDBBase interface
config: low-level configuration system
  __init__.py: exports
  config_base.py: base class for configuration entries
  config.py: reading, modifying, and updating configuration
llm: low-level LLM networking system
  __init__.py: exports
  utils.py: basic utilities such as image compression and format conversion
  llm_request.py: all core logic for talking to large language models (LLMs)
  exceptions.py: LLM request exception classes
  client_registry.py: client registration management
  model_client: the client collection
    base_client.py: client base class
    aiohttp_gemini_clinet.py: Gemini client built on aiohttp
    bedrock_client.py: AWS client
    openai_client.py: OpenAI client
  payload: standard payload construction
    message.py: standard message construction
    resp_format.py: standard response parsing
    tool_option.py: standard tool payload construction
    standard_prompt.py: standard prompts (system, etc.)
logger: logging system
  __init__.py: exports
  core.py: main entry point of the logging system
  cleanup.py: log cleanup/compression
  metadata.py: log metadata
  renderers.py: log formatters
  config.py: configuration-related helpers
  handlers.py: log handlers (console handler, file handler, etc.)
concurrency: low-level async management
  __init__.py: exports
  task_manager.py: unified async task manager
  watchdog.py: global watchdog
storage: local persistent data management
  __init__.py: exports
  json_store.py: unified JSON local persistence helper

## core layer:
Contains the following modules:
components: basic plugin component management
  __init__.py: exports
  base: component base classes
    __init__.py: exports
    action.py
    adapter.py
    chatter.py
    command.py
    event_handler.py
    router.py
    service.py
    plugin.py
    prompt.py
    tool.py
  managers: component application management; where capabilities are actually invoked
    __init__.py: exports
    action_manager.py: action manager
    adapter_manager.py: adapter management
    chatter_manager.py: chatter management
    event_manager.py: event manager
    service_manager.py: service manager
    mcp_manager: MCP-related management
      __init__.py: exports
      mcp_client_manager.py: MCP client manager
      mcp_tool_manager.py: MCP tool manager
    permission_manager.py: permission manager
    plugin_manager.py: plugin manager
    prompt_component_manager.py: prompt component manager
    tool_manager: tool-related management
      __init__.py: exports
      tool_histoty.py: tool call history
      tool_use.py: the actual tool invoker
  types.py: component types
  registry.py: component registration management
  state_manager.py: component state management
prompt: prompt management system
  __init__.py: exports
  prompt.py: Prompt base class
  manager.py: global prompt manager
  params.py: prompt parameter system
perception: perception and learning system
  __init__.py: exports
  memory: regular memory
    ...
  knowledge: knowledge base
    ...
  meme: slang/meme library
    ...
  express: expression learning
    ...
transport: communication and transport system
  __init__.py: exports
  message_receive: message receiving
    ...
  message_send: message sending
    ...
  router: API routing
    ...
  sink: core sinks for adapters plus the WS receiver
    ...
models: basic models
  __init__.py: exports

3
TODO.md
@@ -35,7 +35,6 @@

- [x] Full integration tests (5/5 passing)



- Big project
  · Add a launcher with a Rust backend and a daisyui-styled frontend, with the following features:
  - A good-looking UI
@@ -45,4 +44,4 @@

  - Support freely editing the bot and LLM configuration
  - Matcha compatibility: embed the Matcha UI into the launcher as well
  - Database preview and editing
  - (TBD) development of a Live2D chat feature

@@ -1,22 +0,0 @@

# Emoji Replacement Candidate Count

## Background
`MAX_EMOJI_FOR_PROMPT` is used in scenarios such as `replace_a_emoji` to cap the number of candidate emojis fed to the LLM, avoiding overly long contexts that slow responses or inflate token costs.

## Why 20
- Balance: beyond a dozen or so candidates the decision quality gains diminish, while token/time costs grow linearly.
- Performance: with common models and hardware, 20 descriptions return a decision within acceptable latency.
- Compatibility: the historical implementation also used 20, keeping behavior stable.

## When to adjust
- Stronger hardware/model and broader coverage desired: raise it to 30-40, but watch latency and cost.
- Low compute or latency-sensitive: lower it to 10-15 for faster decisions.
- Special cases (a narrow theme, a very small library): lowering it avoids pointless redundant candidates.

## How to change it
- Constant location: `MAX_EMOJI_FOR_PROMPT` in `src/chat/emoji_system/emoji_constants.py`.
- For dynamic configuration, move it to a setting under `global_config.emoji` and read it in `emoji_manager`.

## Recommendations
- After adjusting, watch: replacement decision latency, model cost, and the false-deletion rate (whether deleted emojis were actually needed).
- If the emoji library keeps growing, add a pre-filter for the candidate list based on usage frequency or recency.
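The suggested pre-filter could be sketched like this (the `usage_count`/`last_used` fields are illustrative, not the actual `MaiEmoji` attributes):

```python
MAX_EMOJI_FOR_PROMPT = 20  # mirrors the constant described above

def prefilter_candidates(emojis, limit=MAX_EMOJI_FOR_PROMPT):
    """Keep the most-used, most-recently-used emojis before building the LLM prompt."""
    ranked = sorted(
        emojis,
        key=lambda e: (e["usage_count"], e["last_used"]),
        reverse=True,
    )
    return ranked[:limit]

pool = [
    {"name": "wave", "usage_count": 5, "last_used": 100},
    {"name": "lol", "usage_count": 9, "last_used": 90},
    {"name": "cry", "usage_count": 9, "last_used": 120},
]
print([e["name"] for e in prefilter_candidates(pool, limit=2)])  # → ['cry', 'lol']
```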
@@ -1,33 +0,0 @@

# Emoji System Refactoring Notes

Date: 2025-12-15

## Goals
- Break up the monolithic `emoji_manager.py`, decoupling entities, constants, and file utilities.
- Reduce event-loop blocking during scanning/registration.
- Preserve existing behavior (LLM/VLM flows, capacity-based replacement, cached lookups) while improving maintainability.

## New structure
- `src/chat/emoji_system/emoji_constants.py`: shared paths and prompt/count limits.
- `src/chat/emoji_system/emoji_entities.py`: `MaiEmoji` (hashing, format detection, DB insert/delete, cache invalidation).
- `src/chat/emoji_system/emoji_utils.py`: directory guarantees, temp cleanup, incremental file scanning, DB row-to-entity conversion.
- `src/chat/emoji_system/emoji_manager.py`: integrity checks, scanning, registration, VLM/LLM descriptions, replacement, and caching, now delegating to the modules above.
- `src/chat/emoji_system/README.md`: quick usage/lifecycle guide.

## Behavioral changes
- The integrity check now uses a cursor plus batched incremental scanning, yielding the event loop once every 50 items processed.
- Heavy file operations inside loops (exists, listdir, remove, makedirs) go through `asyncio.to_thread` to free the main loop.
- Directory scanning uses `os.scandir` (via `list_image_files`), cutting duplicate stat calls and returning both the file list and whether the directory is empty.
- Fast lookup: `_emoji_index` is rebuilt on load and kept in sync on insert/delete; `get_emoji_from_manager` hits the index first.
- The registration and replacement flows update the index while asynchronously cleaning up failed/duplicate files.
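A minimal sketch of the two loop-friendliness techniques above (yield every 50 items, push blocking filesystem calls through `asyncio.to_thread`); the function name is illustrative, not the actual `emoji_utils` API:

```python
import asyncio
import os

async def scan_files(paths, yield_every=50):
    """Collect existing paths without starving the event loop."""
    found = []
    for i, path in enumerate(paths, 1):
        # Blocking FS call runs in a worker thread, off the event loop
        if await asyncio.to_thread(os.path.exists, path):
            found.append(path)
        if i % yield_every == 0:
            # Give other tasks a chance to run
            await asyncio.sleep(0)
    return found

# Usage: asyncio.run(scan_files(["/tmp", "/does/not/exist"]))
```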

## Migration notes
- Existing callers keep using `get_emoji_manager()` and the `EmojiManager` API; external interfaces are unchanged.
- If you imported constants or utilities directly from `emoji_manager`, import them from `emoji_constants`, `emoji_entities`, or `emoji_utils` instead.
- Tests/scripts that depended on synchronous file timing may observe different durations, but the logic is equivalent.

## Follow-ups
1. Add unit tests for `list_image_files`, `clean_unused_emojis`, and the integrity-scan cursor behavior.
2. Externalize the VLM/LLM prompt templates into configuration for easier iteration.
3. Expose metrics for scan duration, cleanup counts, and registration latency for observability.
4. Add a retry cap to the LLM call in `replace_a_emoji`, and log prompts/decisions for auditing.
@@ -1,36 +0,0 @@

# Expression Similarity Strategy

This document describes the implementation and configuration of `calculate_similarity`, to help trade quality against performance.

## Overview
- Two paths are supported:
  1) **Vectorized path (preferred by default)**: TF-IDF + cosine similarity (requires `scikit-learn`)
  2) **Fallback path**: `difflib.SequenceMatcher`
- The `prefer_vector` parameter controls whether the vectorized path is tried first; default `True`.
- If the dependency is missing or the text is too short, it falls back automatically with no extra configuration.

## Usage
```python
from src.chat.express.express_utils import calculate_similarity

sim = calculate_similarity(text1, text2)  # vectorized path preferred by default
sim_fast = calculate_similarity(text1, text2, prefer_vector=False)  # force SequenceMatcher
```

## Dependencies and fallback
- Optional dependency: `scikit-learn`
- When missing, it falls back to `SequenceMatcher` without raising.
- Very short text (length < 2) falls back directly, avoiding sparse-vector noise.
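A minimal sketch of the behavior described above (a self-contained illustration, not the actual `express_utils` implementation):

```python
from difflib import SequenceMatcher

def calculate_similarity(a: str, b: str, prefer_vector: bool = True) -> float:
    """TF-IDF cosine similarity with an automatic SequenceMatcher fallback."""
    if not a or not b:
        return 0.0
    if prefer_vector and len(a) >= 2 and len(b) >= 2:
        try:
            from sklearn.feature_extraction.text import TfidfVectorizer
            from sklearn.metrics.pairwise import cosine_similarity

            tfidf = TfidfVectorizer().fit_transform([a, b])
            return float(cosine_similarity(tfidf[0], tfidf[1])[0][0])
        except (ImportError, ValueError):
            pass  # scikit-learn missing or empty vocabulary: fall back silently
    return SequenceMatcher(None, a, b).ratio()

print(calculate_similarity("hello world", "hello world"))
```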

## Recommendations
- Longer text, or higher robustness/semantic-similarity requirements: keep the default (vectorized first).
- No `scikit-learn` in the environment, or minimal dependencies desired: pass `prefer_vector=False`.
- High-concurrency, performance-sensitive paths: disable vectorization at the call site or add caching.

## Return range
- Similarity is always within `[0, 1]`.
- Empty string → `0.0`; identical text → `1.0`.

## Further suggestions
- For stronger semantics, switch to a vector database or a sentence-embedding model (new dependencies and configuration required).
- On hot paths, add caching (keyed by text hash) or cap input length to bound vector dimensionality and memory.

@@ -1,283 +0,0 @@

# Napcat Video Processing Configuration Guide

## Overview

This guide explains how to configure and control video message handling in the Napcat adapter for MoFox-Bot.

**Related issue**: [#10 - Request for a switch controlling whether videos are downloaded](https://github.com/MoFox-Studio/MoFox-Core/issues/10)

---

## Quick Start

### Turning video downloads off (recommended for low-spec machines or limited bandwidth)

Edit `config/bot_config.toml`, find the `[napcat_adapter.features]` section, and change:

```toml
[napcat_adapter.features]
enable_video_processing = false  # set to false to disable video processing
```

**Effect**: video messages show up as the `[视频消息]` ("video message") placeholder and are never downloaded.

---

## Configuration Options

### Main switch: `enable_video_processing`

| Property | Value |
|------|-----|
| **Type** | boolean (`true` / `false`) |
| **Default** | `true` |
| **Description** | Whether to download and process video messages |

**Enabled (`true`)**:
- ✅ Downloads videos automatically
- ✅ Converts videos to base64 and sends them to the AI
- ⚠️ Consumes network bandwidth and CPU

**Disabled (`false`)**:
- ✅ Skips video downloads
- ✅ Shows the `[视频消息]` placeholder
- ✅ Significantly reduces bandwidth and CPU usage

### Advanced options

#### `video_max_size_mb`

| Property | Value |
|------|-----|
| **Type** | integer |
| **Default** | `100` (MB) |
| **Suggested range** | 10 - 500 MB |
| **Description** | Maximum video file size allowed for download |

**Purpose**: prevents downloading oversized video files.

**Suggestions**:
- **Low-spec machine** (2GB RAM): 10-20 MB
- **Mid-range machine** (8GB RAM): 50-100 MB
- **High-spec machine** (16GB+ RAM): 100-500 MB

```toml
# Only download videos under 50MB
video_max_size_mb = 50
```

#### `video_download_timeout`

| Property | Value |
|------|-----|
| **Type** | integer |
| **Default** | `60` (seconds) |
| **Suggested range** | 30 - 180 seconds |
| **Description** | Video download timeout |

**Purpose**: avoids hanging on videos that cannot be downloaded.

**Suggestions**:
- **Poor network** (2-5 Mbps): 120-180 seconds
- **Average network** (5-20 Mbps): 60-120 seconds
- **Good network** (20+ Mbps): 30-60 seconds

```toml
# Raise the download timeout to 120 seconds
video_download_timeout = 120
```

---

## Common Configuration Scenarios

### Scenario 1: limited server bandwidth

**Symptom**: group chats are full of videos and network traffic is saturated.

**Solution**:
```toml
[napcat_adapter.features]
enable_video_processing = false  # turn it off entirely
```

### Scenario 2: low-powered machine

**Symptom**: CPU usage spikes while processing video messages and other features slow down.

**Solution**:
```toml
[napcat_adapter.features]
enable_video_processing = true
video_max_size_mb = 20       # only small videos
video_download_timeout = 30  # fail fast
```

### Scenario 3: disabling video processing during certain hours

To turn video processing off during specific periods, you can:

1. Edit the configuration file
2. Call an API to reload the configuration (if supported)

For example: off during working hours, on after hours.

### Scenario 4: keep all video processing (default behavior)

```toml
[napcat_adapter.features]
enable_video_processing = true
video_max_size_mb = 100
video_download_timeout = 60
```

---

## How It Works

### Flow with video processing enabled

```
Message arrives
  ↓
Check enable_video_processing
  ├─ false → return the [视频消息] placeholder ✓
  └─ true ↓
Check file size
  ├─ > video_max_size_mb → return an error message ✓
  └─ ≤ video_max_size_mb ↓
Start the download (waiting at most video_download_timeout seconds)
  ├─ success → return the video data ✓
  ├─ timeout → return a timeout error ✓
  └─ failure → return an error message ✓
```

### Flow with video processing disabled

```
Message arrives
  ↓
Check enable_video_processing
  └─ false → immediately return the [视频消息] placeholder ✓
     (saving bandwidth and CPU)
```
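The decision ladder above can be sketched as plain Python; the function and downloader names are illustrative, and only the three config keys and the placeholder strings come from this guide:

```python
def handle_video(size_mb, enabled=True, max_size_mb=100, download=None):
    """Mirror the flow chart: enabled check, then size check, then download."""
    if not enabled:
        return "[视频消息]"            # processing disabled: placeholder only
    if size_mb > max_size_mb:
        return "[视频消息] (文件过大)"  # over the size limit
    try:
        return download()              # real code would enforce video_download_timeout
    except TimeoutError:
        return "[视频消息] (下载失败)"  # timeout treated as a failed download
    except Exception:
        return "[视频消息处理出错]"     # any other unexpected error

print(handle_video(10, enabled=False))  # → [视频消息]
print(handle_video(250))                # → [视频消息] (文件过大)
```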

---

## Error Handling

When video processing fails, users see one of these placeholder messages:

| Message | Meaning |
|------|------|
| `[视频消息]` | Video processing disabled or information incomplete |
| `[视频消息] (文件过大)` | Video exceeds the size limit |
| `[视频消息] (下载失败)` | Network error or service unavailable |
| `[视频消息处理出错]` | Any other unexpected error |

These placeholders ensure a failed video download never crashes message handling.

---

## Performance Comparison

| Configuration | Bandwidth | CPU | Memory | Responsiveness |
|------|----------|---------|---------|----------|
| **Disabled** (`false`) | 🟢 minimal | 🟢 minimal | 🟢 minimal | 🟢 fastest |
| **Enabled, small videos** (≤20MB) | 🟡 moderate | 🟡 moderate | 🟡 moderate | 🟡 average |
| **Enabled, large videos** (≤100MB) | 🔴 high | 🔴 high | 🔴 high | 🔴 slow |

---

## Monitoring and Debugging

### Checking that the configuration took effect

After starting the bot, look for a log line like the following ("video downloader initialized"):

```
[napcat_adapter] 视频下载器已初始化: max_size=100MB, timeout=60s
```

If you see it, the configuration loaded successfully.

### Monitoring video processing

While handling a video message, the log records lines like these ("started downloading" / "download succeeded, size: 25.50 MB"):

```
[video_handler] 开始下载视频: https://...
[video_handler] 视频下载成功,大小: 25.50 MB
```

Or, when processing is disabled ("video processing disabled, skipping"):

```
[napcat_adapter] 视频消息处理已禁用,跳过
```

---

## FAQ

### Q1: Does disabling video processing affect the AI's replies?

**A**: No. The AI still sees the `[视频消息]` placeholder and can infer from context whether video content is involved.

### Q2: Can different groups use different video-processing policies?

**A**: Not in the current version; all groups share the same configuration. If you need this, please raise it in an issue or discussion.

### Q3: Does downloading videos add message-processing latency?

**A**: Yes. A large video can take several seconds to download. Suggestions:
- Set a sensible `video_download_timeout`
- Or disable video processing for the fastest responses

### Q4: Do I need to restart after changing the configuration?

**A**: Yes. Restart the bot to apply the new configuration.

### Q5: How do I quickly diagnose download problems?

**A**:
1. Check the log for error messages
2. Verify network connectivity
3. Check whether `video_max_size_mb` is set too low
4. Try increasing `video_download_timeout`

---

## Best Practices

1. **New users**: start with video processing enabled; tune the parameters or disable it only if performance problems appear.

2. **Production**:
   - Regularly check the log for video-processing errors
   - Tune the parameters to your actual network and CPU
   - Consider disabling video processing at peak times

3. **Development and debugging**:
   - Enable DEBUG-level log output
   - Test how different `video_max_size_mb` values behave in practice
   - Check that the timeout matches your network conditions

---

## Related Links

- **GitHub Issue #10**: [Request for a switch controlling whether videos are downloaded](https://github.com/MoFox-Studio/MoFox-Core/issues/10)
- **Configuration file**: `config/bot_config.toml`
- **Implementation**:
  - `src/plugins/built_in/napcat_adapter/plugin.py`
  - `src/plugins/built_in/napcat_adapter/src/handlers/to_core/message_handler.py`
  - `src/plugins/built_in/napcat_adapter/src/handlers/video_handler.py`

---

## Feedback

Questions and suggestions are welcome in the GitHub issues.

**Version**: v2.1.0
**Last updated**: 2025-12-16

@@ -1,12 +1,5 @@

# Enhanced Command System Guide

> ⚠️ **Important: plugin commands must use PlusCommand!**
>
> - ✅ **Recommended**: `PlusCommand` - the standard base class for plugin development
> - ❌ **Forbidden**: `BaseCommand` - for framework-internal use only
>
> If you use `BaseCommand` directly, you must handle argument parsing, regex matching, and other complexity yourself, and the `execute()` signature differs as well.

## Overview

The enhanced command system is an extension of the MoFox-Bot plugin system that makes defining and using commands much simpler and more intuitive. Instead of writing complex regular expressions, you only define the command name, aliases, and the argument-handling logic.

@@ -231,95 +224,24 @@ class ConfigurableCommand(PlusCommand):

## Return Values

The `execute` method must return a three-element tuple:

```python
async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:
    # ... your logic ...
    return (success_flag, log_description, intercept_flag)
```

### Return value details

| Position | Type | Name | Description |
|------|------|------|------|
| 1 | `bool` | success flag | `True` = the command succeeded<br>`False` = the command failed |
| 2 | `Optional[str]` | log description | descriptive text for internal logging<br>⚠️ **not a message sent to the user!** |
| 3 | `bool` | intercept flag | `True` = intercept, stop further processing (recommended)<br>`False` = don't intercept, continue processing |

### Important: sending messages vs. log descriptions

⚠️ **Common mistake: returning the user-facing message in the return value**

```python
# ❌ Wrong - don't do this!
async def execute(self, args: CommandArgs):
    message = "Hello, this is the message for the user"
    return True, message, True  # this message is NOT sent to the user!

# ✅ Right - use self.send_text()
async def execute(self, args: CommandArgs):
    await self.send_text("Hello, this is the message for the user")  # goes to the user
    return True, "Ran the greeting command", True  # log description
```

### Complete example

```python
async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:
    """A complete example of the execute method."""

    # 1. Validate arguments
    if args.is_empty():
        await self.send_text("⚠️ Please provide an argument")
        return True, "Missing argument", True

    # 2. Run the logic
    user_input = args.get_raw()
    result = process_input(user_input)

    # 3. Send the result to the user
    await self.send_text(f"✅ Result: {result}")

    # 4. Return: success, log description, intercept
    return True, f"Processed user input: {user_input[:20]}", True
```

### When to intercept

- **Return `True`** (recommended): the command fully handled the message; no further processing (such as an LLM reply) is needed
- **Return `False`**: let the system continue processing (for example, to let the LLM reply too)

## Best Practices

### 1. Command design
- ✅ **Naming**: use short, intuitive command names (such as `time`, `help`, `status`)
- ✅ **Aliases**: give frequently used commands short aliases (such as `echo` -> `e`, `say`)
- ✅ **Chat type**: pick `ChatType.ALL`/`GROUP`/`PRIVATE` to match the command's purpose

### 2. Argument handling
- ✅ **Always validate**: check arguments with `args.is_empty()` and `args.count()`
- ✅ **Friendly hints**: give clear usage instructions on bad arguments
- ✅ **Defaults**: provide sensible defaults for optional arguments

### 3. Sending messages
- ✅ **Use `self.send_text()`**: to send messages to the user
- ❌ **Never return user messages in the return value**: the return value is a log description
- ✅ **Intercept**: in most cases return `True` as the third element

### 4. Error handling
- ✅ **Try/except**: catch and handle likely exceptions
- ✅ **Clear feedback**: tell the user what went wrong
- ✅ **Log it**: put useful debugging information in the return value

### 5. Configuration
- ✅ **Make it configurable**: read important settings through `self.get_config()`
- ✅ **Provide defaults**: keep working even when configuration is missing

### 6. Code quality
- ✅ **Type annotations**: use complete type hints
- ✅ **Docstrings**: document the `execute()` method
- ✅ **Comments**: annotate complex logic

## Complete Example

@@ -1,265 +0,0 @@
|
# 📚 MoFox-Bot 插件开发文档导航

欢迎来到 MoFox-Bot 插件系统开发文档!本文档帮助你快速找到所需的学习资源。

---

## 🎯 我应该从哪里开始?

### 第一次接触插件开发?

👉 **从这里开始**:[快速开始指南](quick-start.md)

这是一个循序渐进的教程,带你从零开始创建第一个插件,包含完整的代码示例。

### 遇到问题了?

👉 **先看这里**:[故障排除指南](troubleshooting-guide.md) ⭐

包含 10 个最常见问题的解决方案,可能 5 分钟就能解决你的问题。

### 想深入了解特定功能?

👉 **查看下方分类导航**,找到你需要的文档。

---

## 📖 学习路径建议

### 🌟 新手路径(按顺序阅读)

1. **[快速开始指南](quick-start.md)** ⭐ 必读
   - 创建插件目录和配置
   - 实现第一个 Action 组件
   - 实现第一个 Command 组件
   - 添加配置文件
   - 预计阅读时间:30-45 分钟

2. **[增强命令指南](PLUS_COMMAND_GUIDE.md)** ⭐ 必读
   - 理解 PlusCommand 与 BaseCommand 的区别
   - 学习命令参数处理
   - 掌握返回值规范
   - 预计阅读时间:20-30 分钟

3. **[Action 组件详解](action-components.md)** ⭐ 必读
   - 理解 Action 的激活机制
   - 学习自定义激活逻辑
   - 掌握 Action 的使用场景
   - 预计阅读时间:25-35 分钟

4. **[故障排除指南](troubleshooting-guide.md)** ⭐ 建议收藏
   - 常见错误及解决方案
   - 最佳实践速查
   - 调试技巧
   - 随时查阅

---

### 🚀 进阶路径(根据需求选择)

#### 需要配置系统?

- **[配置文件系统指南](configuration-guide.md)**
  - 自动生成配置文件
  - 配置 Schema 定义
  - 配置读取和验证

#### 需要响应事件?

- **[事件系统指南](event-system-guide.md)**
  - 订阅系统事件
  - 创建自定义事件
  - 事件处理器实现

#### 需要集成外部功能?

- **[Tool 组件指南](tool_guide.md)**
  - 为 LLM 提供工具调用能力
  - 函数调用集成
  - Tool 参数定义

#### 需要依赖其他插件?

- **[依赖管理指南](dependency-management.md)**
  - 声明插件依赖
  - Python 包依赖
  - 依赖版本管理

#### 需要高级激活控制?

- **[Action 激活机制重构指南](action-activation-guide.md)**
  - 自定义激活逻辑
  - 关键词匹配激活
  - LLM 智能判断激活
  - 随机激活策略

---

## 📂 文档结构说明

### 核心文档(必读)

```
📄 quick-start.md             快速开始指南       ⭐ 新手必读
📄 PLUS_COMMAND_GUIDE.md      增强命令系统指南   ⭐ 必读
📄 action-components.md       Action 组件详解    ⭐ 必读
📄 troubleshooting-guide.md   故障排除指南       ⭐ 遇到问题先看这个
```

### 进阶文档(按需阅读)

```
📄 configuration-guide.md     配置系统详解
📄 event-system-guide.md      事件系统详解
📄 tool_guide.md              Tool 组件详解
📄 action-activation-guide.md Action 激活机制详解
📄 dependency-management.md   依赖管理详解
📄 manifest-guide.md          Manifest 文件规范
```

### API 参考文档

```
📁 api/                          API 参考文档目录
├── 消息相关
│   ├── send-api.md              消息发送 API
│   ├── message-api.md           消息处理 API
│   └── chat-api.md              聊天流 API
│
├── AI 相关
│   ├── llm-api.md               LLM 交互 API
│   └── generator-api.md         回复生成 API
│
├── 数据相关
│   ├── database-api.md          数据库操作 API
│   ├── config-api.md            配置读取 API
│   └── person-api.md            人物关系 API
│
├── 组件相关
│   ├── plugin-manage-api.md     插件管理 API
│   └── component-manage-api.md  组件管理 API
│
└── 其他
    ├── emoji-api.md             表情包 API
    ├── tool-api.md              工具 API
    └── logging-api.md           日志 API
```

### 其他文件

```
📄 index.md                      文档索引(旧版,建议查看本 README)
```

---

## 🎓 按功能查找文档

### 我想创建...

| 目标 | 推荐文档 | 难度 |
|------|----------|------|
| **一个简单的命令** | [快速开始](quick-start.md) → [增强命令指南](PLUS_COMMAND_GUIDE.md) | ⭐ 入门 |
| **一个智能 Action** | [快速开始](quick-start.md) → [Action 组件](action-components.md) | ⭐⭐ 中级 |
| **带复杂参数的命令** | [增强命令指南](PLUS_COMMAND_GUIDE.md) | ⭐⭐ 中级 |
| **需要配置的插件** | [配置系统指南](configuration-guide.md) | ⭐⭐ 中级 |
| **响应系统事件的插件** | [事件系统指南](event-system-guide.md) | ⭐⭐⭐ 高级 |
| **为 LLM 提供工具** | [Tool 组件指南](tool_guide.md) | ⭐⭐⭐ 高级 |
| **依赖其他插件的插件** | [依赖管理指南](dependency-management.md) | ⭐⭐ 中级 |

### 我想学习...

| 主题 | 相关文档 |
|------|----------|
| **如何发送消息** | [发送 API](api/send-api.md) / [增强命令指南](PLUS_COMMAND_GUIDE.md) |
| **如何处理参数** | [增强命令指南](PLUS_COMMAND_GUIDE.md) |
| **如何使用 LLM** | [LLM API](api/llm-api.md) |
| **如何操作数据库** | [数据库 API](api/database-api.md) |
| **如何读取配置** | [配置 API](api/config-api.md) / [配置系统指南](configuration-guide.md) |
| **如何获取消息历史** | [消息 API](api/message-api.md) / [聊天流 API](api/chat-api.md) |
| **如何发送表情包** | [表情包 API](api/emoji-api.md) |
| **如何记录日志** | [日志 API](api/logging-api.md) |

---

## 🆘 遇到问题?

### 第一步:查看故障排除指南

👉 [故障排除指南](troubleshooting-guide.md) 包含 10 个最常见问题的解决方案

### 第二步:查看相关文档

- **插件无法加载?** → [快速开始指南](quick-start.md)
- **命令无响应?** → [增强命令指南](PLUS_COMMAND_GUIDE.md)
- **Action 不触发?** → [Action 组件详解](action-components.md)
- **配置不生效?** → [配置系统指南](configuration-guide.md)

### 第三步:检查日志

查看 `logs/app_*.jsonl` 获取详细错误信息。

### 第四步:寻求帮助

- 在线文档:https://mofox-studio.github.io/MoFox-Bot-Docs/
- GitHub Issues:提交详细的问题报告
- 社区讨论:加入开发者社区

---

## 📌 重要提示

### ⚠️ 常见陷阱

1. **不要使用 `BaseCommand`**
   - ✅ 使用:`PlusCommand`
   - ❌ 避免:`BaseCommand`(仅供框架内部使用)

2. **不要在返回值中返回用户消息**
   - ✅ 使用:`await self.send_text("消息")`
   - ❌ 避免:`return True, "消息", True`

3. **手动创建 ComponentInfo 时必须指定 component_type**
   - ✅ 推荐:使用 `get_action_info()` 自动生成
   - ⚠️ 手动创建时:必须指定 `component_type=ComponentType.ACTION`

### 💡 最佳实践

- ✅ 总是使用类型注解
- ✅ 为 `execute()` 方法添加文档字符串
- ✅ 使用 `self.get_config()` 读取配置
- ✅ 使用异步操作 `async/await`
- ✅ 在发送消息前验证参数
- ✅ 提供清晰的错误提示

---

## 🔄 文档更新记录

### v1.1.0 (2024-12-17)

- ✨ 新增 [故障排除指南](troubleshooting-guide.md)
- ✅ 修复 [快速开始指南](quick-start.md) 中的 BaseCommand 示例
- ✅ 增强 [增强命令指南](PLUS_COMMAND_GUIDE.md) 的返回值说明
- ✅ 完善 [Action 组件](action-components.md) 的 component_type 说明
- 📝 创建本导航文档

### v1.0.0 (2024-11)

- 📚 初始文档发布

---

## 📞 反馈与贡献

如果你发现文档中的错误或有改进建议:

1. **提交 Issue**:在 GitHub 仓库提交文档问题
2. **提交 PR**:直接修改文档并提交 Pull Request
3. **社区反馈**:在社区讨论中提出建议

你的反馈对我们改进文档至关重要!🙏

---

## 🎉 开始你的插件开发之旅

准备好了吗?从这里开始:

1. 📖 阅读 [快速开始指南](quick-start.md)
2. 💻 创建你的第一个插件
3. 🔧 遇到问题查看 [故障排除指南](troubleshooting-guide.md)
4. 🚀 探索更多高级功能

**祝你开发愉快!** 🎊

---

**最后更新**:2024-12-17
**文档版本**:v1.1.0
@@ -38,44 +38,11 @@ class ExampleAction(BaseAction):
        执行Action的主要逻辑

        Returns:
            Tuple[bool, str]: 两个元素的元组
                - bool: 是否执行成功 (True=成功, False=失败)
                - str: 执行结果的简短描述(用于日志记录)

        注意:
            - 使用 self.send_text() 等方法发送消息给用户
            - 返回值中的描述仅用于内部日志,不会发送给用户
        """
        # 发送消息给用户
        await self.send_text("这是发给用户的消息")

        # 返回执行结果(用于日志)
        return True, "执行成功"
```

#### execute() 返回值 vs Command 返回值

⚠️ **重要:Action 和 Command 的返回值不同!**

| 组件类型 | 返回值 | 说明 |
|----------|----------|------|
| **Action** | `Tuple[bool, str]` | 2 个元素:成功标志、日志描述 |
| **Command** | `Tuple[bool, Optional[str], bool]` | 3 个元素:成功标志、日志描述、拦截标志 |

```python
# Action 返回值
async def execute(self) -> Tuple[bool, str]:
    await self.send_text("给用户的消息")
    return True, "日志:执行了XX动作"  # 2个元素

# Command 返回值
async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:
    await self.send_text("给用户的消息")
    return True, "日志:执行了XX命令", True  # 3个元素
```

---

#### associated_types:该 Action 会发送的消息类型,例如文本、表情等

这部分由 Adapter 传递给处理器。
@@ -101,65 +68,6 @@ async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:

---

## 组件信息注册说明

### 自动生成 ComponentInfo(推荐)

大多数情况下,你不需要手动创建 `ActionInfo` 对象。系统提供了 `get_action_info()` 方法来自动生成:

```python
# 推荐的方式 - 自动生成
class HelloAction(BaseAction):
    action_name = "hello"
    action_description = "问候动作"
    # ... 其他配置 ...

# 在插件中注册(BasePlugin 子类的方法)
def get_plugin_components(self):
    return [
        (HelloAction.get_action_info(), HelloAction),  # 自动生成 ActionInfo
    ]
```

### 手动创建 ActionInfo(高级用法)

⚠️ **重要:如果手动创建 ActionInfo,必须指定 `component_type` 参数!**

当你需要自定义 `ActionInfo` 时(例如动态生成组件),必须手动指定 `component_type`:

```python
from src.plugin_system import ActionInfo, ComponentType

# ❌ 错误 - 缺少 component_type
action_info = ActionInfo(
    name="hello",
    description="问候动作"
    # 错误:会报错 "missing required argument: 'component_type'"
)

# ✅ 正确 - 必须指定 component_type
action_info = ActionInfo(
    name="hello",
    description="问候动作",
    component_type=ComponentType.ACTION  # 必须指定!
)
```

**为什么需要手动指定?**

- `get_action_info()` 方法会自动设置 `component_type`
- 但手动创建时,系统无法自动推断类型,必须明确指定

**什么时候需要手动创建?**

- 动态生成组件
- 自定义 `get_handler_info()` 方法
- 需要特殊的 ComponentInfo 配置

大多数情况下,直接使用 `get_action_info()` 即可,无需手动创建。

---

## 🎯 Action 调用的决策机制

Action 采用**两层决策机制**来优化性能和决策质量:
@@ -5,7 +5,6 @@

## 新手入门

- [📖 快速开始指南](quick-start.md) - 快速创建你的第一个插件
- [🔧 故障排除指南](troubleshooting-guide.md) - 快速解决常见问题 ⭐ **新增**

## 组件功能详解

@@ -195,35 +195,29 @@ Command是最简单,最直接的响应,不由LLM判断选择使用

```python
# 在现有代码基础上,添加Command组件
import datetime
from src.plugin_system import PlusCommand, CommandArgs  # 导入增强命令基类 - 推荐使用!

class TimeCommand(PlusCommand):
    """时间查询Command - 响应/time命令"""

    command_name = "time"
    command_description = "查询当前时间"

    # 注意:使用 PlusCommand 不需要 command_pattern,会自动生成!

    async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:
        """执行时间查询

        Args:
            args: 命令参数(本例中不使用)

        Returns:
            (成功标志, 日志描述, 是否拦截消息)
        """
        # 获取当前时间
        time_format: str = "%Y-%m-%d %H:%M:%S"
        now = datetime.datetime.now()
        time_str = now.strftime(time_format)

        # 发送时间信息给用户
        await self.send_text(f"⏰ 当前时间:{time_str}")

        # 返回:成功、日志描述、拦截消息
        return True, f"显示了当前时间: {time_str}", True

@register_plugin

@@ -245,29 +239,14 @@ class HelloWorldPlugin(BasePlugin):
    ]
```

同样地,我们在 `get_plugin_components()` 方法中调用内置的 `get_command_info()`,将 `TimeCommand` 注册为插件的一个组件。

**Command组件解释:**

> ⚠️ **重要:请使用 PlusCommand 而不是 BaseCommand!**
>
> - ✅ **PlusCommand**:推荐使用,自动处理参数解析,无需编写正则表达式
> - ❌ **BaseCommand**:仅供框架内部使用,插件开发者不应直接使用

**PlusCommand 的优势:**

- ✅ 无需编写 `command_pattern` 正则表达式
- ✅ 自动解析命令参数(通过 `CommandArgs`)
- ✅ 支持命令别名(`command_aliases`)
- ✅ 更简单的 API,更容易上手

**execute() 方法说明:**

- 参数:`args: CommandArgs` - 包含解析后的命令参数
- 返回值:`(bool, str, bool)` 三元组
  - `bool`:命令是否执行成功
  - `str`:日志描述(**不是发给用户的消息**)
  - `bool`:是否拦截消息,阻止后续处理

有关增强命令的详细信息,请参考 [增强命令指南](./PLUS_COMMAND_GUIDE.md);有关 Command 组件的更多信息,请参考 [Command组件指南](./command-components.md)。

### 8. 测试时间查询Command

@@ -398,31 +377,28 @@ class HelloAction(BaseAction):

        return True, "发送了问候消息"

class TimeCommand(PlusCommand):
    """时间查询Command - 响应/time命令"""

    command_name = "time"
    command_description = "查询当前时间"

    # 注意:PlusCommand 不需要 command_pattern!

    async def execute(self, args: CommandArgs) -> Tuple[bool, str, bool]:
        """执行时间查询

        Args:
            args: 命令参数对象
        """
        import datetime

        # 从配置获取时间格式
        time_format: str = self.get_config("time.format", "%Y-%m-%d %H:%M:%S")  # type: ignore
        now = datetime.datetime.now()
        time_str = now.strftime(time_format)

        # 发送时间信息给用户
        await self.send_text(f"⏰ 当前时间:{time_str}")

        # 返回:成功、日志描述、拦截消息
        return True, f"显示了当前时间: {time_str}", True
```

@@ -1,395 +0,0 @@

# 🔧 插件开发故障排除指南

本指南帮助你快速解决 MoFox-Bot 插件开发中的常见问题。

---

## 📋 快速诊断清单

遇到问题时,首先按照以下步骤检查:

1. ✅ 检查日志文件 `logs/app_*.jsonl`
2. ✅ 确认插件已在 `_manifest.json` 中正确配置
3. ✅ 验证你使用的是 `PlusCommand` 而不是 `BaseCommand`
4. ✅ 检查 `execute()` 方法签名是否正确
5. ✅ 确认返回值格式正确

---

## 🔴 严重问题:插件无法加载

### 错误 #1: "未检测到插件"

**症状**:

- 插件目录存在,但日志中没有加载信息
- `get_plugin_components()` 返回空列表

**可能原因与解决方案**:

#### ❌ 缺少 `@register_plugin` 装饰器

```python
# 错误 - 缺少装饰器
class MyPlugin(BasePlugin):  # 不会被检测到
    pass

# 正确 - 添加装饰器
@register_plugin  # 必须添加!
class MyPlugin(BasePlugin):
    pass
```

#### ❌ `plugin.py` 文件不存在或位置错误

```
plugins/
└── my_plugin/
    ├── _manifest.json  ✅
    └── plugin.py       ✅ 必须在这里
```

#### ❌ `_manifest.json` 格式错误

```json
{
    "manifest_version": 1,
    "name": "My Plugin",
    "version": "1.0.0",
    "description": "插件描述",
    "author": {
        "name": "Your Name"
    }
}
```

---

### 错误 #2: "ActionInfo.__init__() missing required argument: 'component_type'"

**症状**:
```
TypeError: ActionInfo.__init__() missing 1 required positional argument: 'component_type'
```

**原因**:手动创建 `ActionInfo` 时未指定 `component_type` 参数

**解决方案**:

```python
from src.plugin_system import ActionInfo, ComponentType

# ❌ 错误 - 缺少 component_type
action_info = ActionInfo(
    name="my_action",
    description="我的动作"
)

# ✅ 正确方法 1 - 使用自动生成(推荐)
class MyAction(BaseAction):
    action_name = "my_action"
    action_description = "我的动作"

def get_plugin_components(self):
    return [
        (MyAction.get_action_info(), MyAction)  # 自动生成,推荐!
    ]

# ✅ 正确方法 2 - 手动指定 component_type
action_info = ActionInfo(
    name="my_action",
    description="我的动作",
    component_type=ComponentType.ACTION  # 必须指定!
)
```

---

## 🟡 命令问题:命令无响应

### 错误 #3: 命令被识别但不执行

**症状**:

- 输入 `/mycommand` 后没有任何反应
- 日志显示命令已匹配但未执行

**可能原因与解决方案**:

#### ❌ 使用了 `BaseCommand` 而不是 `PlusCommand`

```python
# ❌ 错误 - 使用 BaseCommand
from src.plugin_system import BaseCommand

class MyCommand(BaseCommand):  # 不推荐!
    command_name = "mycommand"
    command_pattern = r"^/mycommand$"  # 需要手动写正则

    async def execute(self):  # 签名错误!
        pass

# ✅ 正确 - 使用 PlusCommand
from src.plugin_system import PlusCommand, CommandArgs

class MyCommand(PlusCommand):  # 推荐!
    command_name = "mycommand"
    # 不需要 command_pattern,会自动生成!

    async def execute(self, args: CommandArgs):  # 正确签名
        await self.send_text("命令执行成功")
        return True, "执行了mycommand", True
```

#### ❌ `execute()` 方法签名错误

```python
# ❌ 错误的签名(缺少 args 参数)
async def execute(self) -> Tuple[bool, Optional[str], bool]:
    pass

# ❌ 错误的签名(参数类型错误)
async def execute(self, args: list[str]) -> Tuple[bool, Optional[str], bool]:
    pass

# ✅ 正确的签名
async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:
    await self.send_text("响应用户")
    return True, "日志描述", True
```

---

### 错误 #4: 命令发送了消息但用户没收到

**症状**:

- 日志显示命令执行成功
- 但用户没有收到任何消息

**原因**:在返回值中返回消息,而不是使用 `self.send_text()`

**解决方案**:

```python
# ❌ 错误 - 在返回值中返回消息
async def execute(self, args: CommandArgs):
    message = "这是给用户的消息"
    return True, message, True  # 这不会发送给用户!

# ✅ 正确 - 使用 self.send_text()
async def execute(self, args: CommandArgs):
    # 发送消息给用户
    await self.send_text("这是给用户的消息")

    # 返回日志描述(不是用户消息)
    return True, "执行了某个操作", True
```

---

### 错误 #5: "notice处理失败" 或重复消息

**症状**:

- 日志中出现 "notice处理失败"
- 用户收到重复的消息

**原因**:同时使用了 `send_api.send_text()` 和返回消息

**解决方案**:

```python
# ❌ 错误 - 混用不同的发送方式
from src.plugin_system.apis.chat_api import send_api

async def execute(self, args: CommandArgs):
    await send_api.send_text(self.stream_id, "消息1")  # 不要这样做
    return True, "消息2", True  # 也不要返回消息

# ✅ 正确 - 只使用 self.send_text()
async def execute(self, args: CommandArgs):
    await self.send_text("这是唯一的消息")  # 推荐方式
    return True, "日志:执行成功", True  # 仅用于日志
```

---

## 🟢 配置问题

### 错误 #6: 配置警告 "配置中不存在字空间或键"

**症状**:
```
获取全局配置 plugins.my_plugin 失败: "配置中不存在字空间或键 'plugins'"
```

**这是正常的吗?**

✅ **是的,这是正常行为!** 不需要修复。

**说明**:

- 系统首先尝试从全局配置加载:`config/plugins/my_plugin/config.toml`
- 如果不存在,会自动回退到插件本地配置:`plugins/my_plugin/config.toml`
- 这个警告可以安全忽略

**如果你想消除警告**:

1. 在 `config/plugins/` 目录创建你的插件配置目录
2. 或者直接忽略 - 使用本地配置完全正常

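上述"全局配置优先、本地配置兜底"的查找顺序,可以用如下假设性示意表达(`resolve_plugin_config` 是演示函数,并非框架的真实 API;路径结构取自上文说明):

```python
import tempfile
from pathlib import Path

def resolve_plugin_config(plugin_name, root):
    """先查全局配置,不存在则回退到插件本地配置"""
    global_path = root / "config" / "plugins" / plugin_name / "config.toml"
    local_path = root / "plugins" / plugin_name / "config.toml"
    if global_path.exists():
        return global_path  # 全局配置优先
    return local_path       # 回退本地配置(此时会出现上文那条可忽略的警告)

root = Path(tempfile.mkdtemp())

# 全局配置不存在:回退到本地路径
p_local = resolve_plugin_config("my_plugin", root)

# 创建全局配置后:改为命中全局路径,警告随之消失
(root / "config" / "plugins" / "my_plugin").mkdir(parents=True)
(root / "config" / "plugins" / "my_plugin" / "config.toml").write_text("")
p_global = resolve_plugin_config("my_plugin", root)
```

这解释了为什么该警告"可以安全忽略":它只是第一步查找落空的记录,第二步的本地配置仍会被正常加载。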
---

## 🔧 返回值问题

### 错误 #7: 返回值格式错误

**Action 返回值**(2 个元素):
```python
async def execute(self) -> Tuple[bool, str]:
    await self.send_text("消息")
    return True, "日志描述"  # 2个元素
```

**Command 返回值**(3 个元素):
```python
async def execute(self, args: CommandArgs) -> Tuple[bool, Optional[str], bool]:
    await self.send_text("消息")
    return True, "日志描述", True  # 3个元素(增加了拦截标志)
```

**对比表格**:

| 组件类型 | 返回值 | 元素说明 |
|----------|--------|----------|
| **Action** | `(bool, str)` | (成功标志, 日志描述) |
| **Command** | `(bool, str, bool)` | (成功标志, 日志描述, 拦截标志) |

---

## 🎯 参数解析问题

### 错误 #8: 无法获取命令参数

**症状**:

- `args` 为空或不包含预期的参数

**解决方案**:

```python
async def execute(self, args: CommandArgs):
    # 检查是否有参数
    if args.is_empty():
        await self.send_text("❌ 缺少参数\n用法: /command <参数>")
        return True, "缺少参数", True

    # 获取原始参数字符串
    raw_input = args.get_raw()

    # 获取解析后的参数列表
    arg_list = args.get_args()

    # 获取第一个参数
    first_arg = args.get_first("默认值")

    # 获取指定索引的参数
    second_arg = args.get_arg(1, "默认值")

    # 检查标志
    if args.has_flag("--verbose"):
        # 处理 --verbose 模式
        pass

    # 获取标志的值
    output = args.get_flag_value("--output", "default.txt")
```

---

## 📝 类型注解问题

### 错误 #9: IDE 报类型错误

**解决方案**:确保使用正确的类型导入

```python
from typing import Tuple, Optional, List, Type
from src.plugin_system import (
    BasePlugin,
    PlusCommand,
    BaseAction,
    CommandArgs,
    ComponentInfo,
    CommandInfo,
    ActionInfo,
    ComponentType
)

# 正确的类型注解
def get_plugin_components(self) -> List[Tuple[ComponentInfo, Type]]:
    return [
        (MyCommand.get_command_info(), MyCommand),
        (MyAction.get_action_info(), MyAction)
    ]
```

---

## 🚀 性能问题

### 错误 #10: 插件响应缓慢

**可能原因**:

1. **阻塞操作**:在 `execute()` 中使用了同步 I/O
2. **大量数据处理**:在主线程处理大文件或复杂计算
3. **频繁的数据库查询**:每次都查询数据库

**解决方案**:

```python
import asyncio

async def execute(self, args: CommandArgs):
    # ✅ 使用异步操作
    result = await some_async_function()

    # ✅ 对于同步操作,使用 asyncio.to_thread
    result = await asyncio.to_thread(blocking_function)

    # ✅ 批量数据库操作
    from src.common.database.optimization.batch_scheduler import get_batch_scheduler
    scheduler = get_batch_scheduler()
    await scheduler.schedule_batch_insert(Model, data_list)

    return True, "执行成功", True
```

---

## 📞 获取帮助

如果以上方案都无法解决你的问题:

1. **查看日志**:检查 `logs/app_*.jsonl` 获取详细错误信息
2. **查阅文档**:
   - [快速开始指南](./quick-start.md)
   - [增强命令指南](./PLUS_COMMAND_GUIDE.md)
   - [Action组件指南](./action-components.md)
3. **在线文档**:https://mofox-studio.github.io/MoFox-Bot-Docs/
4. **提交 Issue**:在 GitHub 仓库提交详细的问题报告

---

## 🎓 最佳实践速查

| 场景 | 推荐做法 | 避免 |
|------|----------|------|
| 创建命令 | 使用 `PlusCommand` | ❌ 使用 `BaseCommand` |
| 发送消息 | `await self.send_text()` | ❌ 在返回值中返回消息 |
| 注册组件 | 使用 `get_action_info()` | ❌ 手动创建不带 `component_type` 的 Info |
| 参数处理 | 使用 `CommandArgs` 方法 | ❌ 手动解析字符串 |
| 异步操作 | 使用 `async/await` | ❌ 使用同步阻塞操作 |
| 配置读取 | `self.get_config()` | ❌ 硬编码配置值 |

---

**最后更新**:2024-12-17
**版本**:v1.0.0

有问题欢迎反馈,帮助我们改进这份指南!
@@ -1,38 +0,0 @@

# 短期记忆压力泄压补丁

## 背景

部分场景下,短期记忆层在自动转移尚未触发时会快速堆积,可能导致短期记忆达到容量上限并阻塞后续写入。

## 变更(补丁)

- 新增"压力泄压"开关:可选择在占用率达到 100% 时,删除低重要性且最早的短期记忆,防止短期层持续膨胀。
- 默认关闭,需显式开启后才会执行自动删除。

## 开关配置

- 入口:`UnifiedMemoryManager` 构造参数
  - `short_term_enable_force_cleanup: bool = False`
  - 传递到短期层:`ShortTermMemoryManager(enable_force_cleanup=True)`
- 关闭示例:

```python
manager = UnifiedMemoryManager(
    short_term_enable_force_cleanup=False,
)
```

## 行为说明

- 当短期记忆占用率达到或超过 100%,且当前没有待转移批次时:
  - 触发 `force_cleanup_overflow()`
  - 按"低重要性优先、创建时间最早优先"删除一批记忆,将容量压回约 `max_memories * 0.9`
- 清理在后台持久化,不阻塞主流程。

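上述"压回约 90% 容量"的淘汰规则可以概括为如下可独立运行的示意(数据结构与函数名为演示假设,并非 `ShortTermMemoryManager` 的真实实现):

```python
from dataclasses import dataclass

@dataclass
class Memory:
    id: str
    importance: float
    created_at: float

def force_cleanup_overflow(memories, max_memories):
    """占用达到上限时,按(重要性升序, 创建时间升序)删除,压回约 90% 容量"""
    target = int(max_memories * 0.9)
    if len(memories) < max_memories:
        return memories, []  # 未达上限,不触发清理
    # 低重要性优先;重要性相同时,创建时间更早者优先被删
    ordered = sorted(memories, key=lambda m: (m.importance, m.created_at))
    removed = ordered[: len(memories) - target]
    removed_ids = {m.id for m in removed}
    kept = [m for m in memories if m.id not in removed_ids]
    return kept, removed

mems = [Memory(str(i), importance=(i % 3) / 10, created_at=i) for i in range(10)]
kept, removed = force_cleanup_overflow(mems, max_memories=10)
```

示意中保留列表维持原始顺序,仅剔除被淘汰的条目;真实实现还会在后台完成持久化。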
## 影响范围

- 默认行为保持与补丁前一致(开关默认 `off`)。
- 如果关闭开关,短期层将不再做强制删除,只依赖自动转移机制。

## 回滚

- 构造时将 `short_term_enable_force_cleanup=False` 即可关闭;无需代码回滚。

@@ -1,60 +0,0 @@

# StyleLearner 资源上限开关(默认开启)

## 概览

StyleLearner 支持资源上限控制,用于约束风格容量与清理行为。开关默认 **开启**,以防止模型无限膨胀;可在运行时动态关闭。

## 开关位置与用法(务必看这里)

开关在 **代码层**,默认开启,不依赖配置文件。

1) **全局运行时切换(推荐)**

路径:`src/chat/express/style_learner.py` 暴露的单例 `style_learner_manager`

```python
from src.chat.express.style_learner import style_learner_manager

# 关闭资源上限(放开容量,谨慎使用)
style_learner_manager.set_resource_limit(False)

# 再次开启资源上限
style_learner_manager.set_resource_limit(True)
```

- 影响范围:实时作用于已创建的全部 learner(逐个同步 `resource_limit_enabled`)。
- 生效时机:调用后立即生效,无需重启。

2) **构造时指定(不常用)**

- `StyleLearner(resource_limit_enabled: True|False, ...)`
- `StyleLearnerManager(resource_limit_enabled: True|False, ...)`

用于自定义实例化逻辑(通常保持默认即可)。

3) **默认行为**

- 开关默认 **开启**,即启用容量管理与清理。
- 没有配置文件项;若需持久化开关状态,可自行在启动代码中显式调用 `set_resource_limit`。

## 资源上限行为(开启时)

- 容量参数(每个 chat):
  - `max_styles = 2000`
  - `cleanup_threshold = 0.9`(≥90% 容量触发清理)
  - `cleanup_ratio = 0.2`(清理低价值风格约 20%)
- 价值评分:结合使用频率(log 平滑)与最近使用时间(指数衰减),得分低者优先清理。
- 仅对单个 learner 的容量管理生效;LRU 淘汰逻辑保持不变。

> ⚙️ 开关作用面:
> - **开启**:在 add_style 时会检查容量并触发 `_cleanup_styles`;预测/学习逻辑不变。
> - **关闭**:不再触发容量清理,但 LRU 管理器仍可能在进程层面淘汰不活跃 learner。

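"log 平滑的频率 + 指数衰减的新近度"这一评分思路,可以写成如下假设性公式示意。注意 0.5/0.5 的权重与 7 天半衰期均为演示取值,并非 StyleLearner 的真实参数:

```python
import math

def style_value(use_count, last_used_ts, now, half_life=7 * 24 * 3600):
    """演示用的价值评分:得分越低,越先被清理"""
    freq = math.log1p(use_count)  # 频率项:log 平滑,避免高频风格得分爆炸
    age = max(0.0, now - last_used_ts)
    recency = math.exp(-age * math.log(2) / half_life)  # 新近项:按半衰期指数衰减
    return 0.5 * freq + 0.5 * recency

now = 1_000_000.0
hot = style_value(50, now - 3600, now)                 # 高频且刚用过
cold = style_value(2, now - 30 * 24 * 3600, now)       # 低频且一个月未用
```

按这一示意,`hot` 风格得分明显高于 `cold` 风格,后者会在清理触发时优先被删除。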
## I/O 与健壮性

- 模型与元数据保存采用原子写(`.tmp` + `os.replace`),避免部分写入。
- `pickle` 使用 `HIGHEST_PROTOCOL`,并执行 `fsync` 确保落盘。

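上述"`.tmp` + `os.replace` + `fsync`"的保存策略可以概括为如下示意(文件名与字典内容为演示假设):

```python
import os
import pickle
import tempfile

def atomic_pickle_save(obj, path):
    """先写临时文件并 fsync,再原子替换目标文件,避免部分写入"""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
        f.flush()
        os.fsync(f.fileno())   # 确保数据落盘
    os.replace(tmp_path, path)  # 同一文件系统内为原子替换

workdir = tempfile.mkdtemp()
model_path = os.path.join(workdir, "style_model.pkl")
atomic_pickle_save({"styles": ["casual"]}, model_path)

with open(model_path, "rb") as f:
    loaded = pickle.load(f)
```

即使进程在写入途中崩溃,目标文件也只会是"旧的完整内容"或"新的完整内容",不会出现半截文件。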
## 兼容性

- 默认开启,无需修改配置文件;关闭后行为与旧版本类似。
- 已有模型文件可直接加载,开关仅影响运行时清理策略。

## 何时建议开启/关闭

- 开启(默认):内存/磁盘受限,或聊天风格高频增长,需防止模型膨胀。
- 关闭:需要完整保留所有历史风格且资源充足,或进行一次性数据收集实验。

## 监控与调优建议

- 监控:每 chat 风格数量、清理触发次数、删除数量、预测延迟 p95。
- 如清理过于激进:提高 `cleanup_threshold` 或降低 `cleanup_ratio`。
- 如内存/磁盘依旧偏高:降低 `max_styles`,或增加定期持久化与压缩策略。

@@ -1,134 +0,0 @@

# Napcat 适配器视频处理配置完成总结

## 修改内容

### 1. **增强配置定义** (`plugin.py`)

- 添加 `video_max_size_mb`:视频最大大小限制(默认 100MB)
- 添加 `video_download_timeout`:下载超时时间(默认 60 秒)
- 改进 `enable_video_processing` 的描述文字
- **位置**:`src/plugins/built_in/napcat_adapter/plugin.py` L417-430

### 2. **改进消息处理器** (`message_handler.py`)

- 添加 `_video_downloader` 成员变量存储下载器实例
- 改进 `set_plugin_config()` 方法,根据配置初始化视频下载器
- 改进视频下载调用,使用初始化时的配置
- **位置**:`src/plugins/built_in/napcat_adapter/src/handlers/to_core/message_handler.py` L32-54, L327-334

### 3. **添加配置示例** (`bot_config.toml`)

- 添加 `[napcat_adapter]` 配置段
- 添加完整的 Napcat 服务器配置示例
- 添加详细的特性配置(消息过滤、视频处理等)
- 包含详尽的中文注释和使用建议
- **位置**:`config/bot_config.toml` L680-724

### 4. **编写使用文档**(新文件)

- 创建 `docs/napcat_video_configuration_guide.md`
- 详细说明所有配置选项的含义和用法
- 提供常见场景的配置模板
- 包含故障排查和性能对比

---

## 功能清单

### 核心功能

- ✅ 全局开关控制视频处理(`enable_video_processing`)
- ✅ 视频大小限制(`video_max_size_mb`)
- ✅ 下载超时控制(`video_download_timeout`)
- ✅ 根据配置初始化下载器
- ✅ 友好的错误提示信息

### 用户体验

- ✅ 详细的配置说明文档
- ✅ 代码中的中文注释
- ✅ 启动日志反馈
- ✅ 配置示例可直接使用

---

## 如何使用

### 快速关闭视频下载(解决 Issue #10)

编辑 `config/bot_config.toml`:

```toml
[napcat_adapter.features]
enable_video_processing = false  # 改为 false
```

重启 bot 后生效。

### 调整视频大小限制

```toml
[napcat_adapter.features]
video_max_size_mb = 50  # 只允许下载 50MB 以下的视频
```

### 调整下载超时

```toml
[napcat_adapter.features]
video_download_timeout = 120  # 增加到 120 秒
```

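大小与超时两项限制的裁决逻辑,可以抽象为如下假设性示意(`check_video_limits` 为演示函数,并非 `message_handler.py` 的真实实现;错误文案也仅为示例):

```python
def check_video_limits(size_bytes, max_size_mb, elapsed_s, timeout_s):
    """根据配置判断视频是否允许下载,返回 (是否通过, 提示信息)"""
    if size_bytes > max_size_mb * 1024 * 1024:
        return False, f"视频超过大小限制({max_size_mb}MB),已跳过下载"
    if elapsed_s > timeout_s:
        return False, f"视频下载超时({timeout_s}s)"
    return True, "ok"

# 120MB 的视频在默认 100MB 限制下被拒绝,并返回友好提示
ok, msg = check_video_limits(120 * 1024 * 1024, max_size_mb=100,
                             elapsed_s=5, timeout_s=60)
```

这对应"测试场景"表中"视频超过大小限制 → 返回错误信息"与"下载超时 → 返回超时错误"两行的行为。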
---

## 向下兼容性

- ✅ 旧配置文件无需修改(使用默认值)
- ✅ 现有视频处理流程完全兼容
- ✅ 所有功能都带有合理的默认值

---

## 测试场景

已验证的工作场景:

| 场景 | 行为 | 状态 |
|------|------|------|
| 视频处理启用 | 正常下载视频 | ✅ |
| 视频处理禁用 | 返回占位符 | ✅ |
| 视频超过大小限制 | 返回错误信息 | ✅ |
| 下载超时 | 返回超时错误 | ✅ |
| 网络错误 | 返回友好错误 | ✅ |
| 启动时初始化 | 日志输出配置 | ✅ |

---

## 文件修改清单

```
修改文件:
- src/plugins/built_in/napcat_adapter/plugin.py
- src/plugins/built_in/napcat_adapter/src/handlers/to_core/message_handler.py
- config/bot_config.toml

新增文件:
- docs/napcat_video_configuration_guide.md
```

---

## 关联信息

- **GitHub Issue**:#10 - 强烈请求有个开关选择是否下载视频
- **修复时间**:2025-12-16
- **相关文档**:[Napcat 视频处理配置指南](./napcat_video_configuration_guide.md)

---

## 后续改进建议

1. **分组配置** - 为不同群组设置不同的视频处理策略
2. **动态开关** - 提供运行时 API 动态开启/关闭视频处理
3. **性能监控** - 添加视频处理的性能统计指标
4. **队列管理** - 实现视频下载队列,限制并发下载数
5. **缓存机制** - 缓存已下载的视频避免重复下载

---

**版本**:v2.1.0
**状态**:✅ 完成

@@ -1,303 +0,0 @@
|
||||
import asyncio
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# 添加项目根目录到 Python 路径
|
||||
project_root = Path(__file__).parent.parent
|
||||
sys.path.insert(0, str(project_root))
|
||||
|
||||
from src.common.logger import get_logger
|
||||
from src.memory_graph.manager_singleton import get_unified_memory_manager
|
||||
|
||||
logger = get_logger("memory_transfer_check")
|
||||
|
||||
|
||||
def print_section(title: str):
|
||||
"""打印分节标题"""
|
||||
print(f"\n{'=' * 60}")
|
||||
print(f" {title}")
|
||||
print(f"{'=' * 60}\n")
|
||||
|
||||
|
||||
async def check_short_term_status():
|
||||
"""检查短期记忆状态"""
|
||||
print_section("1. 短期记忆状态检查")
|
||||
|
||||
manager = get_unified_memory_manager()
|
||||
short_term = manager.short_term_manager
|
||||
|
||||
# 获取统计信息
|
||||
    stats = short_term.get_statistics()

    print(f"📊 当前记忆数量: {stats['total_memories']}/{stats['max_memories']}")

    # Compute the occupancy ratio
    if stats["max_memories"] > 0:
        occupancy = stats["total_memories"] / stats["max_memories"]
        print(f"📈 容量占用率: {occupancy:.1%}")

        # Give advice based on the occupancy
        if occupancy >= 1.0:
            print("⚠️ 警告:已达到容量上限!应该触发紧急转移")
        elif occupancy >= 0.5:
            print("✅ 占用率超过50%,符合自动转移条件")
        else:
            print(f"ℹ️ 占用率未达到50%阈值,当前 {occupancy:.1%}")

    print(f"🎯 可转移记忆数: {stats['transferable_count']}")
    print(f"📏 转移重要性阈值: {stats['transfer_threshold']}")

    return stats


async def check_transfer_candidates():
    """Analyze the memories currently eligible for transfer."""
    print_section("2. 转移候选记忆分析")

    manager = get_unified_memory_manager()
    short_term = manager.short_term_manager

    # Fetch the transfer candidates
    candidates = short_term.get_memories_for_transfer()

    print(f"🎫 当前转移候选: {len(candidates)} 条\n")

    if not candidates:
        print("❌ 没有记忆符合转移条件!")
        print("\n可能原因:")
        print(" 1. 所有记忆的重要性都低于阈值")
        print(" 2. 短期记忆数量未超过容量限制")
        print(" 3. 短期记忆列表为空")
        return []

    # Show details for the first five candidates
    print("前 5 条候选记忆:\n")
    for i, mem in enumerate(candidates[:5], 1):
        print(f"{i}. 记忆ID: {mem.id[:8]}...")
        print(f" 重要性: {mem.importance:.3f}")
        print(f" 内容: {mem.content[:50]}...")
        print(f" 创建时间: {mem.created_at}")
        print()

    if len(candidates) > 5:
        print(f"... 还有 {len(candidates) - 5} 条候选记忆\n")

    # Analyze the importance distribution
    importance_levels = {
        "高 (>=0.8)": sum(1 for m in candidates if m.importance >= 0.8),
        "中 (0.6-0.8)": sum(1 for m in candidates if 0.6 <= m.importance < 0.8),
        "低 (<0.6)": sum(1 for m in candidates if m.importance < 0.6),
    }

    print("📊 重要性分布:")
    for level, count in importance_levels.items():
        print(f" {level}: {count} 条")

    return candidates

async def check_auto_transfer_task():
    """Check the status of the automatic transfer task."""
    print_section("3. 自动转移任务状态")

    manager = get_unified_memory_manager()

    # Verify that the task exists
    if not hasattr(manager, "_auto_transfer_task") or manager._auto_transfer_task is None:
        print("❌ 自动转移任务未创建!")
        print("\n建议:调用 manager.initialize() 初始化系统")
        return False

    task = manager._auto_transfer_task

    # Inspect the task state
    if task.done():
        print("❌ 自动转移任务已结束!")
        try:
            exception = task.exception()
            if exception:
                print(f"\n任务异常: {exception}")
        except Exception:
            pass
        print("\n建议:重启系统或手动重启任务")
        return False

    print("✅ 自动转移任务正在运行")

    # Inspect the transfer cache
    if hasattr(manager, "_transfer_cache"):
        cache_size = len(manager._transfer_cache) if manager._transfer_cache else 0
        print(f"📦 转移缓存: {cache_size} 条记忆")

    # Check when the last transfer happened
    if hasattr(manager, "_last_transfer_time"):
        from datetime import datetime

        last_time = manager._last_transfer_time
        if last_time:
            time_diff = (datetime.now() - last_time).total_seconds()
            print(f"⏱️ 距上次转移: {time_diff:.1f} 秒前")

    return True

async def check_long_term_status():
    """Check the state of the long-term memory graph."""
    print_section("4. 长期记忆图谱状态")

    manager = get_unified_memory_manager()
    long_term = manager.long_term_manager

    # Fetch the graph statistics
    stats = long_term.get_statistics()

    print(f"👥 人物节点数: {stats.get('person_count', 0)}")
    print(f"📅 事件节点数: {stats.get('event_count', 0)}")
    print(f"🔗 关系边数: {stats.get('edge_count', 0)}")
    print(f"💾 向量存储数: {stats.get('vector_count', 0)}")

    return stats


async def manual_transfer_test():
    """Manually trigger a transfer as a test."""
    print_section("5. 手动转移测试")

    manager = get_unified_memory_manager()

    # Ask the user for confirmation
    print("⚠️ 即将手动触发一次记忆转移")
    print("这将把当前符合条件的短期记忆转移到长期记忆")
    response = input("\n是否继续? (y/n): ").strip().lower()

    if response != "y":
        print("❌ 已取消手动转移")
        return None

    print("\n🚀 开始手动转移...")

    try:
        # Run the manual transfer
        result = await manager.manual_transfer()

        print("\n✅ 转移完成!")
        print("\n转移结果:")
        print(f" 已处理: {result.get('processed_count', 0)} 条")
        print(f" 成功转移: {len(result.get('transferred_memory_ids', []))} 条")
        print(f" 失败: {result.get('failed_count', 0)} 条")
        print(f" 跳过: {result.get('skipped_count', 0)} 条")

        if result.get("errors"):
            print("\n错误信息:")
            for error in result["errors"][:3]:  # show only the first three errors
                print(f" - {error}")

        return result

    except Exception as e:
        print(f"\n❌ 转移失败: {e}")
        logger.exception("手动转移失败")
        return None


async def check_configuration():
    """Check the relevant configuration values."""
    print_section("6. 配置参数检查")

    from src.config.config import global_config

    config = global_config.memory

    print("📋 当前配置:")
    print(f" 短期记忆容量: {config.short_term_max_memories}")
    print(f" 转移重要性阈值: {config.short_term_transfer_threshold}")
    print(f" 批量转移大小: {config.long_term_batch_size}")
    print(f" 自动转移间隔: {config.long_term_auto_transfer_interval} 秒")
    print(f" 启用泄压清理: {config.short_term_enable_force_cleanup}")

    # Offer tuning suggestions
    print("\n💡 配置建议:")

    if config.short_term_transfer_threshold > 0.6:
        print(" ⚠️ 转移阈值较高(>0.6),可能导致记忆难以转移")
        print(" 建议:降低到 0.4-0.5")

    if config.long_term_batch_size > 10:
        print(" ⚠️ 批量大小较大(>10),可能延迟转移触发")
        print(" 建议:设置为 5-10")

    if config.long_term_auto_transfer_interval > 300:
        print(" ⚠️ 转移间隔较长(>5分钟),可能导致转移不及时")
        print(" 建议:设置为 60-180 秒")


async def main():
    """Entry point."""
    print("\n" + "=" * 60)
    print(" MoFox-Bot 记忆转移诊断工具")
    print("=" * 60)

    try:
        # Initialize the memory manager
        print("\n⚙️ 正在初始化记忆管理器...")
        manager = get_unified_memory_manager()
        await manager.initialize()
        print("✅ 初始化完成\n")

        # Run the individual checks
        await check_short_term_status()
        candidates = await check_transfer_candidates()
        task_running = await check_auto_transfer_task()
        await check_long_term_status()
        await check_configuration()

        # Aggregate diagnosis
        print_section("7. 综合诊断结果")

        issues = []

        if not candidates:
            issues.append("❌ 没有符合条件的转移候选")

        if not task_running:
            issues.append("❌ 自动转移任务未运行")

        if issues:
            print("🚨 发现以下问题:\n")
            for issue in issues:
                print(f" {issue}")

            print("\n建议操作:")
            print(" 1. 检查短期记忆的重要性评分是否合理")
            print(" 2. 降低配置中的转移阈值")
            print(" 3. 查看日志文件排查错误")
            print(" 4. 尝试手动触发转移测试")
        else:
            print("✅ 系统运行正常,转移机制已就绪")

            if candidates:
                print(f"\n当前有 {len(candidates)} 条记忆等待转移")
                print("转移将在满足以下任一条件时自动触发:")
                print(" • 转移缓存达到批量大小")
                print(" • 短期记忆占用率超过 50%")
                print(" • 距上次转移超过最大延迟")
                print(" • 短期记忆达到容量上限")

        # Optionally trigger a manual transfer
        if candidates:
            print()
            await manual_transfer_test()

        print_section("检查完成")
        print("详细诊断报告: docs/memory_transfer_diagnostic_report.md")

    except Exception as e:
        print(f"\n❌ 检查过程出错: {e}")
        logger.exception("检查脚本执行失败")
        return 1

    return 0


if __name__ == "__main__":
    exit_code = asyncio.run(main())
    sys.exit(exit_code)

@@ -1,74 +0,0 @@
"""Tool: clear the short-term memory store.

Usage:
    python scripts/clear_short_term_memory.py [--remove-file]

- Loads the short-term memory manager from the configured data directory
- Clears the in-memory cache and writes an empty short_term_memory.json
- Optionally deletes the storage file instead of writing an empty one
"""

import argparse
import asyncio
import sys
from pathlib import Path

# Make module imports work when running from the repository root
PROJECT_ROOT = Path(__file__).parent.parent
sys.path.insert(0, str(PROJECT_ROOT))

from src.config.config import global_config
from src.memory_graph.short_term_manager import ShortTermMemoryManager


def resolve_data_dir() -> Path:
    """Resolve the memory data directory from config, with a safe default."""
    memory_cfg = getattr(global_config, "memory", None)
    base_dir = getattr(memory_cfg, "data_dir", "data/memory_graph") if memory_cfg else "data/memory_graph"
    return PROJECT_ROOT / base_dir


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="清空短期记忆 (示例: python scripts/clear_short_term_memory.py --remove-file)"
    )
    parser.add_argument(
        "--remove-file",
        action="store_true",
        help="删除 short_term_memory.json 文件(默认写入空文件)",
    )
    return parser.parse_args()


async def clear_short_term_memories(remove_file: bool = False) -> None:
    data_dir = resolve_data_dir()
    storage_file = data_dir / "short_term_memory.json"

    manager = ShortTermMemoryManager(data_dir=data_dir)
    await manager.initialize()

    removed_count = len(manager.memories)

    # Clear the in-memory state
    manager.memories.clear()
    manager._memory_id_index.clear()  # internal id index
    manager._similarity_cache.clear()  # similarity cache

    if remove_file and storage_file.exists():
        storage_file.unlink()
        print(f"Removed storage file: {storage_file}")
    else:
        # Write an empty file, preserving the structure
        await manager._save_to_disk()
        print(f"Wrote empty short-term memory file: {storage_file}")

    print(f"Cleared {removed_count} short-term memories")


async def main() -> None:
    args = parse_args()
    await clear_short_term_memories(remove_file=args.remove_file)


if __name__ == "__main__":
    asyncio.run(main())

@@ -1,37 +0,0 @@
# New Emoji System Overview

This directory holds the collection, registration, and selection logic for emoji stickers.

## Modules
- `emoji_constants.py`: shared paths and count limits.
- `emoji_entities.py`: the `MaiEmoji` entity; handles hash/format detection, database registration, and deletion.
- `emoji_utils.py`: filesystem helpers (directory creation, temp cleanup, DB row conversion, file listing).
- `emoji_manager.py`: the core manager; periodic scanning, integrity checks, VLM/LLM annotation, capacity-based replacement, cached lookup.
- `emoji_history.py`: per-session in-memory history.

## Lifecycle
1. Start the background task via `EmojiManager.start()` (or await `start_periodic_check_register()` directly inside an existing event loop).
2. The loop loads database state, performs integrity cleanup, clears temp caches, and scans `data/emoji` for new files.
3. New images are hashed, described via VLM/LLM, registered in the database, and moved to `data/emoji_registed`.
4. When the capacity limit is reached, `replace_a_emoji()` may, with LLM assistance, delete a low-usage emoji before registering the new one.

## Key behaviors
- The integrity check scans incrementally, yielding to the event loop in batches to avoid long blocking.
- File operations inside the loop use `asyncio.to_thread` to keep the event loop responsive.
- The hash index `_emoji_index` speeds up in-memory lookups; the database is the source of truth, memory is a mirror.
- Descriptions and tags are cached (see `@cached` on the manager).
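The incremental integrity scan described above boils down to a cursor that advances a fixed-size batch through the emoji list and wraps to the front. A minimal, self-contained sketch of that idea; `next_batch` is our own illustrative name, not the manager's API:

```python
def next_batch(cursor: int, batch_size: int, total: int) -> tuple[list[int], int]:
    """Return the indices to check this round, plus the advanced cursor."""
    if total == 0:
        return [], 0
    start = cursor % total
    end = min(start + batch_size, total)
    indices = list(range(start, end))
    # Wrap around to the front if the batch ran past the end of the list.
    rest = batch_size - len(indices)
    if rest > 0:
        indices.extend(range(0, min(rest, total)))
    return indices, (start + len(indices)) % total


# Walking a list of 7 items in batches of 3 visits every index and wraps:
idx1, cur = next_batch(0, 3, 7)    # [0, 1, 2]
idx2, cur = next_batch(cur, 3, 7)  # [3, 4, 5]
idx3, cur = next_batch(cur, 3, 7)  # [6, 0, 1]
```

Because each round touches only `batch_size` items before returning, the real manager can `await asyncio.sleep(0)` between batches and keep the event loop responsive.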
## Common operations
- `get_emoji_for_text(text_emotion)`: pick an emoji path and description for a target emotion.
- `record_usage(emoji_hash)`: increment the usage count.
- `delete_emoji(emoji_hash)`: delete the file and database record and clear caches.

## Directories
- Pending registration: `data/emoji`
- Registered: `data/emoji_registed`
- Temp images: `data/image`, `data/images`

## Notes
- Limits and models are configured via `config/bot_config.toml` and `config/model_config.toml`.
- GIF support is retained; key frames are extracted before the image is sent to the VLM.
- Avoid using `Session` directly; use the APIs provided by this module.
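The hash-based dedup behind the lifecycle above can be sketched in a few lines. This is a self-contained illustration, not the project's actual `MaiEmoji` code; `emoji_hash` and the toy `index` dict are our own names:

```python
import hashlib
import os
import tempfile


def emoji_hash(path: str) -> str:
    """Reduce a file's raw bytes to a stable MD5 hex key (as MaiEmoji does)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


# A toy in-memory index keyed by hash, mirroring the role of _emoji_index.
index: dict[str, str] = {}

with tempfile.TemporaryDirectory() as tmp:
    p = os.path.join(tmp, "smile.png")
    with open(p, "wb") as f:
        f.write(b"\x89PNG fake bytes")
    h = emoji_hash(p)
    index[h] = p
    # The same bytes always map to the same key, so re-scans dedupe cleanly.
    assert emoji_hash(p) == h
```

Keeping the index keyed by content hash rather than filename is what lets the scanner recognize an already-registered image even after it is moved or renamed.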
@@ -1,6 +0,0 @@
import os

BASE_DIR = os.path.join("data")
EMOJI_DIR = os.path.join(BASE_DIR, "emoji")
EMOJI_REGISTERED_DIR = os.path.join(BASE_DIR, "emoji_registed")
MAX_EMOJI_FOR_PROMPT = 20
@@ -1,192 +0,0 @@
import asyncio
import base64
import binascii
import hashlib
import io
import os
import time
import traceback

from PIL import Image

from src.chat.emoji_system.emoji_constants import EMOJI_REGISTERED_DIR
from src.chat.utils.utils_image import image_path_to_base64
from src.common.database.api.crud import CRUDBase
from src.common.database.compatibility import get_db_session
from src.common.database.core.models import Emoji
from src.common.database.optimization.cache_manager import get_cache
from src.common.database.utils.decorators import generate_cache_key
from src.common.logger import get_logger

logger = get_logger("emoji")


class MaiEmoji:
    """Represents a single emoji sticker."""

    def __init__(self, full_path: str):
        if not full_path:
            raise ValueError("full_path cannot be empty")
        self.full_path = full_path
        self.path = os.path.dirname(full_path)
        self.filename = os.path.basename(full_path)
        self.embedding = []
        self.hash = ""
        self.description = ""
        self.emotion: list[str] = []
        self.usage_count = 0
        self.last_used_time = time.time()
        self.register_time = time.time()
        self.is_deleted = False
        self.format = ""

    async def initialize_hash_format(self) -> bool | None:
        """Create the emoji instance from its file: compute hash and format."""
        try:
            if not os.path.exists(self.full_path):
                logger.error(f"[初始化错误] 表情包文件不存在: {self.full_path}")
                self.is_deleted = True
                return None

            logger.debug(f"[初始化] 正在读取文件: {self.full_path}")
            image_base64 = image_path_to_base64(self.full_path)
            if image_base64 is None:
                logger.error(f"[初始化错误] 无法读取或转换Base64: {self.full_path}")
                self.is_deleted = True
                return None
            logger.debug(f"[初始化] 文件读取成功 (Base64预览: {image_base64[:50]}...)")

            logger.debug(f"[初始化] 正在解码Base64并计算哈希: {self.filename}")
            if isinstance(image_base64, str):
                image_base64 = image_base64.encode("ascii", errors="ignore").decode("ascii")
            image_bytes = base64.b64decode(image_base64)
            self.hash = hashlib.md5(image_bytes).hexdigest()
            logger.debug(f"[初始化] 哈希计算成功: {self.hash}")

            logger.debug(f"[初始化] 正在使用Pillow获取格式: {self.filename}")
            try:
                with Image.open(io.BytesIO(image_bytes)) as img:
                    self.format = (img.format or "jpeg").lower()
                logger.debug(f"[初始化] 格式获取成功: {self.format}")
            except Exception as pil_error:
                logger.error(f"[初始化错误] Pillow无法处理图片 ({self.filename}): {pil_error}")
                logger.error(traceback.format_exc())
                self.is_deleted = True
                return None

            return True

        except FileNotFoundError:
            logger.error(f"[初始化错误] 文件在处理过程中丢失: {self.full_path}")
            self.is_deleted = True
            return None
        except (binascii.Error, ValueError) as b64_error:
            logger.error(f"[初始化错误] Base64解码失败 ({self.filename}): {b64_error}")
            self.is_deleted = True
            return None
        except Exception as e:
            logger.error(f"[初始化错误] 初始化表情包时发生未预期错误 ({self.filename}): {e!s}")
            logger.error(traceback.format_exc())
            self.is_deleted = True
            return None

    async def register_to_db(self) -> bool:
        """Register the emoji: move its file to the registered directory and save it to the database."""
        try:
            source_full_path = self.full_path
            destination_full_path = os.path.join(EMOJI_REGISTERED_DIR, self.filename)

            if not await asyncio.to_thread(os.path.exists, source_full_path):
                logger.error(f"[错误] 源文件不存在: {source_full_path}")
                return False

            try:
                if await asyncio.to_thread(os.path.exists, destination_full_path):
                    await asyncio.to_thread(os.remove, destination_full_path)

                await asyncio.to_thread(os.rename, source_full_path, destination_full_path)
                logger.debug(f"[移动] 文件从 {source_full_path} 移动到 {destination_full_path}")
                self.full_path = destination_full_path
                self.path = EMOJI_REGISTERED_DIR
            except Exception as move_error:
                logger.error(f"[错误] 移动文件失败: {move_error!s}")
                return False

            try:
                async with get_db_session() as session:
                    emotion_str = ",".join(self.emotion) if self.emotion else ""

                    emoji = Emoji(
                        emoji_hash=self.hash,
                        full_path=self.full_path,
                        format=self.format,
                        description=self.description,
                        emotion=emotion_str,
                        query_count=0,
                        is_registered=True,
                        is_banned=False,
                        record_time=self.register_time,
                        register_time=self.register_time,
                        usage_count=self.usage_count,
                        last_used_time=self.last_used_time,
                    )
                    session.add(emoji)
                    await session.commit()

                logger.info(f"[注册] 表情包信息保存到数据库: {self.filename} ({self.emotion})")

                return True

            except Exception as db_error:
                logger.error(f"[错误] 保存数据库失败 ({self.filename}): {db_error!s}")
                return False

        except Exception as e:
            logger.error(f"[错误] 注册表情包失败 ({self.filename}): {e!s}")
            logger.error(traceback.format_exc())
            return False

    async def delete(self) -> bool:
        """Delete the emoji's file and database record."""
        try:
            file_to_delete = self.full_path
            if await asyncio.to_thread(os.path.exists, file_to_delete):
                try:
                    await asyncio.to_thread(os.remove, file_to_delete)
                    logger.debug(f"[删除] 文件: {file_to_delete}")
                except Exception as e:
                    logger.error(f"[错误] 删除文件失败 {file_to_delete}: {e!s}")

            try:
                crud = CRUDBase(Emoji)
                will_delete_emoji = await crud.get_by(emoji_hash=self.hash)
                if will_delete_emoji is None:
                    logger.warning(f"[删除] 数据库中未找到哈希值为 {self.hash} 的表情包记录。")
                    result = 0
                else:
                    await crud.delete(will_delete_emoji.id)
                    result = 1

                cache = await get_cache()
                await cache.delete(generate_cache_key("emoji_by_hash", self.hash))
                await cache.delete(generate_cache_key("emoji_description", self.hash))
                await cache.delete(generate_cache_key("emoji_tag", self.hash))
            except Exception as e:
                logger.error(f"[错误] 删除数据库记录时出错: {e!s}")
                result = 0

            if result > 0:
                logger.info(f"[删除] 表情包数据库记录 {self.filename} (Hash: {self.hash})")
                self.is_deleted = True
                return True
            if not os.path.exists(file_to_delete):
                logger.warning(
                    f"[警告] 表情包文件 {file_to_delete} 已删除,但数据库记录删除失败 (Hash: {self.hash})"
                )
            else:
                logger.error(f"[错误] 删除表情包数据库记录失败: {self.hash}")
            return False

        except Exception as e:
            logger.error(f"[错误] 删除表情包失败 ({self.filename}): {e!s}")
            return False

@@ -1,5 +1,6 @@
import asyncio
import base64
import binascii
import hashlib
import io
import json
@@ -12,18 +13,9 @@ from typing import Any, Optional, cast

import json_repair
from PIL import Image
from rich.traceback import install
from sqlalchemy import select

from src.chat.emoji_system.emoji_constants import EMOJI_DIR, EMOJI_REGISTERED_DIR, MAX_EMOJI_FOR_PROMPT
from src.chat.emoji_system.emoji_entities import MaiEmoji
from src.chat.emoji_system.emoji_utils import (
    _emoji_objects_to_readable_list,
    _ensure_emoji_dir,
    _to_emoji_objects,
    clean_unused_emojis,
    clear_temp_emoji,
    list_image_files,
)
from src.chat.utils.utils_image import get_image_manager, image_path_to_base64
from src.common.database.api.crud import CRUDBase
from src.common.database.compatibility import get_db_session
@@ -33,8 +25,367 @@ from src.common.logger import get_logger
from src.config.config import global_config, model_config
from src.llm_models.utils_model import LLMRequest

install(extra_lines=3)

logger = get_logger("emoji")

BASE_DIR = os.path.join("data")
EMOJI_DIR = os.path.join(BASE_DIR, "emoji")  # directory for incoming emoji files
EMOJI_REGISTERED_DIR = os.path.join(BASE_DIR, "emoji_registed")  # directory for registered emojis
MAX_EMOJI_FOR_PROMPT = 20  # max number of emoji descriptions allowed in the replacement prompt

"""
Not yet tested; database and in-memory state may not be fully synchronized in places.
"""


class MaiEmoji:
    """Represents a single emoji sticker."""

    def __init__(self, full_path: str):
        if not full_path:
            raise ValueError("full_path cannot be empty")
        self.full_path = full_path  # full path, including the filename
        self.path = os.path.dirname(full_path)  # directory containing the file
        self.filename = os.path.basename(full_path)  # file name
        self.embedding = []
        self.hash = ""  # empty at first; computed when the instance is created
        self.description = ""
        self.emotion: list[str] = []
        self.usage_count = 0
        self.last_used_time = time.time()
        self.register_time = time.time()
        self.is_deleted = False  # marks whether this emoji has been deleted
        self.format = ""

    async def initialize_hash_format(self) -> bool | None:
        """Create the emoji instance from its file: compute hash and format."""
        try:
            # Use full_path to check whether the file exists
            if not os.path.exists(self.full_path):
                logger.error(f"[初始化错误] 表情包文件不存在: {self.full_path}")
                self.is_deleted = True
                return None

            # Read the file via full_path
            logger.debug(f"[初始化] 正在读取文件: {self.full_path}")
            image_base64 = image_path_to_base64(self.full_path)
            if image_base64 is None:
                logger.error(f"[初始化错误] 无法读取或转换Base64: {self.full_path}")
                self.is_deleted = True
                return None
            logger.debug(f"[初始化] 文件读取成功 (Base64预览: {image_base64[:50]}...)")

            # Compute the hash
            logger.debug(f"[初始化] 正在解码Base64并计算哈希: {self.filename}")
            # Make sure the base64 string contains only ASCII characters
            if isinstance(image_base64, str):
                image_base64 = image_base64.encode("ascii", errors="ignore").decode("ascii")
            image_bytes = base64.b64decode(image_base64)
            self.hash = hashlib.md5(image_bytes).hexdigest()
            logger.debug(f"[初始化] 哈希计算成功: {self.hash}")

            # Determine the image format
            logger.debug(f"[初始化] 正在使用Pillow获取格式: {self.filename}")
            try:
                with Image.open(io.BytesIO(image_bytes)) as img:
                    self.format = (img.format or "jpeg").lower()
                logger.debug(f"[初始化] 格式获取成功: {self.format}")
            except Exception as pil_error:
                logger.error(f"[初始化错误] Pillow无法处理图片 ({self.filename}): {pil_error}")
                logger.error(traceback.format_exc())
                self.is_deleted = True
                return None

            # All steps succeeded
            return True

        except FileNotFoundError:
            logger.error(f"[初始化错误] 文件在处理过程中丢失: {self.full_path}")
            self.is_deleted = True
            return None
        except (binascii.Error, ValueError) as b64_error:
            logger.error(f"[初始化错误] Base64解码失败 ({self.filename}): {b64_error}")
            self.is_deleted = True
            return None
        except Exception as e:
            logger.error(f"[初始化错误] 初始化表情包时发生未预期错误 ({self.filename}): {e!s}")
            logger.error(traceback.format_exc())
            self.is_deleted = True
            return None

    async def register_to_db(self) -> bool:
        """
        Register the emoji:
        move its file from the current path into EMOJI_REGISTERED_DIR,
        update the instance attributes accordingly, then save the record to the database.
        """
        try:
            # The source path is the instance's current full path
            source_full_path = self.full_path
            # Destination full path
            destination_full_path = os.path.join(EMOJI_REGISTERED_DIR, self.filename)

            # Check that the source file exists
            if not os.path.exists(source_full_path):
                logger.error(f"[错误] 源文件不存在: {source_full_path}")
                return False

            # --- Move the file ---
            try:
                # If the destination already exists, remove it first so the move succeeds
                if os.path.exists(destination_full_path):
                    os.remove(destination_full_path)

                os.rename(source_full_path, destination_full_path)
                logger.debug(f"[移动] 文件从 {source_full_path} 移动到 {destination_full_path}")
                # Update the instance's path attributes to the new location
                self.full_path = destination_full_path
                self.path = EMOJI_REGISTERED_DIR
                # self.filename is unchanged
            except Exception as move_error:
                logger.error(f"[错误] 移动文件失败: {move_error!s}")
                # Should we try to restore the instance state on failure? Not handled yet; just fail
                return False

            # --- Database operation ---
            try:
                # Prepare the database record for the emoji collection
                async with get_db_session() as session:
                    emotion_str = ",".join(self.emotion) if self.emotion else ""

                    emoji = Emoji(
                        emoji_hash=self.hash,
                        full_path=self.full_path,
                        format=self.format,
                        description=self.description,
                        emotion=emotion_str,  # store as comma-separated string
                        query_count=0,  # default value
                        is_registered=True,
                        is_banned=False,  # default value
                        record_time=self.register_time,  # use MaiEmoji's register_time for DB record_time
                        register_time=self.register_time,
                        usage_count=self.usage_count,
                        last_used_time=self.last_used_time,
                    )
                    session.add(emoji)
                    await session.commit()

                logger.info(f"[注册] 表情包信息保存到数据库: {self.filename} ({self.emotion})")

                return True

            except Exception as db_error:
                logger.error(f"[错误] 保存数据库失败 ({self.filename}): {db_error!s}")
                return False

        except Exception as e:
            logger.error(f"[错误] 注册表情包失败 ({self.filename}): {e!s}")
            logger.error(traceback.format_exc())
            return False

    async def delete(self) -> bool:
        """Delete the emoji.

        Removes the emoji's file and its database record.

        Returns:
            bool: whether the deletion succeeded
        """
        try:
            # 1. Delete the file
            file_to_delete = self.full_path
            if os.path.exists(file_to_delete):
                try:
                    os.remove(file_to_delete)
                    logger.debug(f"[删除] 文件: {file_to_delete}")
                except Exception as e:
                    logger.error(f"[错误] 删除文件失败 {file_to_delete}: {e!s}")
                    # File deletion failed, but still try to delete the DB record

            # 2. Delete the database record
            try:
                # Delete via the CRUD layer
                crud = CRUDBase(Emoji)
                will_delete_emoji = await crud.get_by(emoji_hash=self.hash)
                if will_delete_emoji is None:
                    logger.warning(f"[删除] 数据库中未找到哈希值为 {self.hash} 的表情包记录。")
                    result = 0  # indicate no DB record was deleted
                else:
                    await crud.delete(will_delete_emoji.id)
                    result = 1  # successfully deleted one record

                # Invalidate the caches
                from src.common.database.optimization.cache_manager import get_cache
                from src.common.database.utils.decorators import generate_cache_key

                cache = await get_cache()
                await cache.delete(generate_cache_key("emoji_by_hash", self.hash))
                await cache.delete(generate_cache_key("emoji_description", self.hash))
                await cache.delete(generate_cache_key("emoji_tag", self.hash))
            except Exception as e:
                logger.error(f"[错误] 删除数据库记录时出错: {e!s}")
                result = 0

            if result > 0:
                logger.info(f"[删除] 表情包数据库记录 {self.filename} (Hash: {self.hash})")
                # 3. Mark the object as deleted
                self.is_deleted = True
                return True
            else:
                # The DB record deletion failed, but the file may already be gone; log a warning
                if not os.path.exists(file_to_delete):
                    logger.warning(
                        f"[警告] 表情包文件 {file_to_delete} 已删除,但数据库记录删除失败 (Hash: {self.hash})"
                    )
                else:
                    logger.error(f"[错误] 删除表情包数据库记录失败: {self.hash}")
                return False

        except Exception as e:
            logger.error(f"[错误] 删除表情包失败 ({self.filename}): {e!s}")
            return False

def _emoji_objects_to_readable_list(emoji_objects: list["MaiEmoji"]) -> list[str]:
    """Convert a list of emoji objects into readable strings.

    Args:
        emoji_objects: list of MaiEmoji objects

    Returns:
        list[str]: readable emoji info strings
    """
    emoji_info_list = []
    for i, emoji in enumerate(emoji_objects):
        # Convert the timestamp into a readable time
        time_str = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(emoji.register_time))
        # Build the info string for each emoji
        emoji_info = f"编号: {i + 1}\n描述: {emoji.description}\n使用次数: {emoji.usage_count}\n添加时间: {time_str}\n"
        emoji_info_list.append(emoji_info)
    return emoji_info_list


def _to_emoji_objects(data: Any) -> tuple[list["MaiEmoji"], int]:
    emoji_objects = []
    load_errors = 0
    emoji_data_list = list(data)

    for emoji_data in emoji_data_list:  # emoji_data is an Emoji model instance
        full_path = emoji_data.full_path
        if not full_path:
            logger.warning(
                f"[加载错误] 数据库记录缺少 'full_path' 字段: ID {emoji_data.id if hasattr(emoji_data, 'id') else 'Unknown'}"
            )
            load_errors += 1
            continue

        try:
            emoji = MaiEmoji(full_path=full_path)

            emoji.hash = emoji_data.emoji_hash
            if not emoji.hash:
                logger.warning(f"[加载错误] 数据库记录缺少 'hash' 字段: {full_path}")
                load_errors += 1
                continue

            emoji.description = emoji_data.description
            # Deserialize the emotion string from the DB into a list
            emoji.emotion = emoji_data.emotion.split(",") if emoji_data.emotion else []
            emoji.usage_count = emoji_data.usage_count

            db_last_used_time = emoji_data.last_used_time
            db_register_time = emoji_data.register_time

            # If last_used_time from the DB is None, fall back to the instance's register_time
            emoji.last_used_time = db_last_used_time if db_last_used_time is not None else emoji.register_time
            # If register_time from the DB is None, keep the instance's register_time (time.time())
            emoji.register_time = db_register_time if db_register_time is not None else emoji.register_time

            emoji.format = emoji_data.format

            emoji_objects.append(emoji)

        except ValueError as ve:
            logger.error(f"[加载错误] 初始化 MaiEmoji 失败 ({full_path}): {ve}")
            load_errors += 1
        except Exception as e:
            logger.error(f"[加载错误] 处理数据库记录时出错 ({full_path}): {e!s}")
            load_errors += 1
    return emoji_objects, load_errors


def _ensure_emoji_dir() -> None:
    """Ensure the emoji storage directories exist."""
    os.makedirs(EMOJI_DIR, exist_ok=True)
    os.makedirs(EMOJI_REGISTERED_DIR, exist_ok=True)


async def clear_temp_emoji() -> None:
    """Clean up temporary emoji files.

    Clears files under data/emoji, data/image, and data/images.
    When a directory holds more than 1000 files, all of them are deleted.
    """

    logger.info("[清理] 开始清理缓存...")

    for need_clear in (
        os.path.join(BASE_DIR, "emoji"),
        os.path.join(BASE_DIR, "image"),
        os.path.join(BASE_DIR, "images"),
    ):
        if os.path.exists(need_clear):
            files = os.listdir(need_clear)
            # Delete everything once the file count exceeds 1000
            if len(files) > 1000:
                for filename in files:
                    file_path = os.path.join(need_clear, filename)
                    if os.path.isfile(file_path):
                        os.remove(file_path)
                        logger.debug(f"[清理] 删除: {file_path}")


async def clean_unused_emojis(emoji_dir: str, emoji_objects: list["MaiEmoji"], removed_count: int) -> int:
    """Clean files in the given directory that are not tracked by emoji_objects."""
    if not os.path.exists(emoji_dir):
        logger.warning(f"[清理] 目标目录不存在,跳过清理: {emoji_dir}")
        return removed_count

    cleaned_count = 0
    try:
        # Collect the full paths of all valid in-memory emojis
        tracked_full_paths = {emoji.full_path for emoji in emoji_objects if not emoji.is_deleted}

        # Walk all files in the directory
        for file_name in os.listdir(emoji_dir):
            file_full_path = os.path.join(emoji_dir, file_name)

            # Only handle files, not subdirectories
            if not os.path.isfile(file_full_path):
                continue

            # Delete files that are not in the tracked set
            if file_full_path not in tracked_full_paths:
                try:
                    os.remove(file_full_path)
                    logger.info(f"[清理] 删除未追踪的表情包文件: {file_full_path}")
                    cleaned_count += 1
                except Exception as e:
                    logger.error(f"[错误] 删除文件时出错 ({file_full_path}): {e!s}")

        if cleaned_count > 0:
            logger.info(f"[清理] 在目录 {emoji_dir} 中清理了 {cleaned_count} 个破损表情包。")
        else:
            logger.info(f"[清理] 目录 {emoji_dir} 中没有需要清理的。")

    except Exception as e:
        logger.error(f"[错误] 清理未使用表情包文件时出错 ({emoji_dir}): {e!s}")

    return removed_count + cleaned_count


class EmojiManager:
    _instance = None
    _initialized: bool = False  # declared explicitly to avoid undefined-attribute errors
@@ -50,10 +401,6 @@ class EmojiManager:
            return  # already initialized, return directly

        self._scan_task = None
        self._emoji_index: dict[str, MaiEmoji] = {}
        self._integrity_yield_every = 50
        self._integrity_cursor = 0
        self._integrity_batch_size = 500

        if model_config is None:
            raise RuntimeError("Model config is not initialized")
@@ -221,40 +568,34 @@
        如果文件已被删除,则执行对象的删除方法并从列表中移除
        """
        try:
            # if not self.emoji_objects:
            #     logger.warning("[检查] emoji_objects为空,跳过完整性检查")
            #     return

            total_count = len(self.emoji_objects)
            self.emoji_num = total_count
            removed_count = 0
            if total_count == 0:
                return

            start = self._integrity_cursor % total_count
            end = min(start + self._integrity_batch_size, total_count)
            indices: list[int] = list(range(start, end))
            if end - start < self._integrity_batch_size and total_count > 0:
                wrap_rest = self._integrity_batch_size - (end - start)
                if wrap_rest > 0:
                    indices.extend(range(0, min(wrap_rest, total_count)))

            objects_to_remove: list[MaiEmoji] = []
            processed = 0
            for idx in indices:
                if idx >= len(self.emoji_objects):
                    break
                emoji = self.emoji_objects[idx]
            # Iterate over a copy of the list, since we modify it during iteration
            objects_to_remove = []
            for emoji in self.emoji_objects:
                try:
                    # Skip entries already marked deleted, to avoid double handling
                    if emoji.is_deleted:
                        objects_to_remove.append(emoji)
                        objects_to_remove.append(emoji)  # collect for one-shot removal
                        continue

                    exists = await asyncio.to_thread(os.path.exists, emoji.full_path)
                    if not exists:
                    # Check whether the file still exists
                    if not os.path.exists(emoji.full_path):
                        logger.warning(f"[检查] 表情包文件丢失: {emoji.full_path}")
                        await emoji.delete()
                        objects_to_remove.append(emoji)
                        # Run the emoji object's delete method
                        await emoji.delete()  # delete() now marks is_deleted
                        objects_to_remove.append(emoji)  # after marking, also collect for removal
                        # Update the counters
                        self.emoji_num -= 1
                        removed_count += 1
                        continue

                    # Treat an empty description as invalid too
                    if not emoji.description:
                        logger.warning(f"[检查] 表情包描述为空,视为无效: {emoji.filename}")
                        await emoji.delete()
@@ -263,24 +604,19 @@ class EmojiManager:
|
||||
removed_count += 1
|
||||
continue
|
||||
|
||||
processed += 1
|
||||
if processed % self._integrity_yield_every == 0:
|
||||
await asyncio.sleep(0)
|
||||
|
||||
except Exception as item_error:
|
||||
logger.error(f"[错误] 处理表情包记录时出错 ({emoji.filename}): {item_error!s}")
|
||||
# 即使出错,也尝试继续检查下一个
|
||||
continue
|
||||
|
||||
# 从 self.emoji_objects 中移除标记的对象
|
||||
if objects_to_remove:
|
||||
self.emoji_objects = [e for e in self.emoji_objects if e not in objects_to_remove]
|
||||
for e in objects_to_remove:
|
||||
if e.hash in self._emoji_index:
|
||||
self._emoji_index.pop(e.hash, None)
|
||||
|
||||
self._integrity_cursor = (start + processed) % max(1, len(self.emoji_objects))
|
||||
|
||||
# 清理 EMOJI_REGISTERED_DIR 目录中未被追踪的文件
|
||||
removed_count = await clean_unused_emojis(EMOJI_REGISTERED_DIR, self.emoji_objects, removed_count)
|
||||
|
||||
# 输出清理结果
|
||||
if removed_count > 0:
|
||||
logger.info(f"[清理] 已清理 {removed_count} 个失效/文件丢失的表情包记录")
|
||||
logger.info(f"[统计] 清理前记录数: {total_count} | 清理后有效记录数: {len(self.emoji_objects)}")
|
||||
@@ -303,30 +639,36 @@ class EmojiManager:
            logger.info("[扫描] 开始扫描新表情包...")

            # Make sure the emoji directory exists
            if not os.path.exists(EMOJI_DIR):
                logger.warning(f"[警告] 表情包目录不存在: {EMOJI_DIR}")
                os.makedirs(EMOJI_DIR, exist_ok=True)
                logger.info(f"[创建] 已创建表情包目录: {EMOJI_DIR}")
                await asyncio.sleep(global_config.emoji.check_interval * 60)
                continue

            # Skip this round if the directory is empty
            files = os.listdir(EMOJI_DIR)
            if not files:
                logger.warning(f"[警告] 表情包目录为空: {EMOJI_DIR}")
                await asyncio.sleep(global_config.emoji.check_interval * 60)
                continue

            # The emoji directory is checked regardless of steal_emoji, so manual registration works;
            # registration itself only runs when space must be freed or the library is not yet full
            if (self.emoji_num > self.emoji_num_max and global_config.emoji.do_replace) or (
                self.emoji_num < self.emoji_num_max
            ):
                try:
                    # Collect every image file in the directory
                    files_to_process = [
                        f
                        for f in files
                        if os.path.isfile(os.path.join(EMOJI_DIR, f))
                        and f.lower().endswith((".jpg", ".jpeg", ".png", ".gif"))
                    ]

                    # Process each qualifying file
                    for filename in files_to_process:
                        # Try to register the emoji
                        success = await self.register_emoji_by_filename(filename)
                        if success:
@@ -335,9 +677,8 @@ class EmojiManager:

                        # Registration failed: delete the file
                        file_path = os.path.join(EMOJI_DIR, filename)
                        os.remove(file_path)
                        logger.warning(f"[清理] 删除注册失败的表情包文件: (unknown)")
                except Exception as e:
                    logger.error(f"[错误] 扫描表情包目录失败: {e!s}")
@@ -357,7 +698,6 @@ class EmojiManager:
            # Refresh the in-memory list and count
            self.emoji_objects = emoji_objects
            self.emoji_num = len(emoji_objects)

            logger.info(f"[数据库] 加载完成: 共加载 {self.emoji_num} 个表情包记录。")
            if load_errors > 0:
@@ -413,15 +753,11 @@ class EmojiManager:
        Returns:
            MaiEmoji or None: the matching MaiEmoji object if found, otherwise None
        """
        for emoji in self.emoji_objects:
            # Only objects that are not marked deleted and whose hash matches qualify
            if not emoji.is_deleted and emoji.hash == emoji_hash:
                return emoji
        return None  # Nothing matched after the full scan
    @cached(ttl=1800, key_prefix="emoji_tag")  # cache for 30 minutes
    async def get_emoji_tag_by_hash(self, emoji_hash: str) -> str | None:
@@ -513,7 +849,6 @@ class EmojiManager:
        if success:
            # Remove the object from the emoji_objects list
            self.emoji_objects = [e for e in self.emoji_objects if e.hash != emoji_hash]
            # Update the count
            self.emoji_num -= 1
            logger.info(f"[统计] 当前表情包数量: {self.emoji_num}")
@@ -596,7 +931,6 @@ class EmojiManager:
            register_success = await new_emoji.register_to_db()
            if register_success:
                self.emoji_objects.append(new_emoji)
                self.emoji_num += 1
                logger.info(f"[成功] 注册: {new_emoji.filename}")
                return True
@@ -765,7 +1099,7 @@ class EmojiManager:
        Returns:
            bool: whether registration succeeded
        """
        file_full_path = os.path.join(EMOJI_DIR, filename)
        if not os.path.exists(file_full_path):
            logger.error(f"[注册失败] 文件不存在: {file_full_path}")
            return False

@@ -783,7 +1117,7 @@ class EmojiManager:
            logger.warning(f"[注册跳过] 表情包已存在 (Hash: {new_emoji.hash}): (unknown)")
            # Delete the duplicate source file
            try:
                os.remove(file_full_path)
                logger.info(f"[清理] 删除重复的待注册文件: (unknown)")
            except Exception as e:
                logger.error(f"[错误] 删除重复文件失败: {e!s}")
@@ -803,7 +1137,7 @@ class EmojiManager:
            logger.warning(f"[注册失败] 未能生成有效描述或审核未通过: (unknown)")
            # Delete the file whose description could not be generated
            try:
                os.remove(file_full_path)
                logger.info(f"[清理] 删除描述生成失败的文件: (unknown)")
            except Exception as e:
                logger.error(f"[错误] 删除描述生成失败文件时出错: {e!s}")
@@ -815,7 +1149,7 @@ class EmojiManager:
            logger.error(f"[注册失败] 生成描述/情感时出错 ((unknown)): {build_desc_error}")
            # Consider deleting the file in this case as well
            try:
                os.remove(file_full_path)
                logger.info(f"[清理] 删除描述生成异常的文件: (unknown)")
            except Exception as e:
                logger.error(f"[错误] 删除描述生成异常文件时出错: {e!s}")
@@ -829,7 +1163,7 @@ class EmojiManager:
            logger.error("[注册失败] 替换表情包失败,无法完成注册")
            # Replacement failed: delete the new emoji file
            try:
                os.remove(file_full_path)  # new_emoji's full_path still points at the source here
                logger.info(f"[清理] 删除替换失败的新表情文件: (unknown)")
            except Exception as e:
                logger.error(f"[错误] 删除替换失败文件时出错: {e!s}")
@@ -842,7 +1176,6 @@ class EmojiManager:
        if register_success:
            # Registration succeeded: add to the in-memory list
            self.emoji_objects.append(new_emoji)
            self.emoji_num += 1
            logger.info(f"[成功] 注册新表情包: (unknown) (当前: {self.emoji_num}/{self.emoji_num_max})")
            return True
@@ -850,9 +1183,9 @@ class EmojiManager:
            logger.error(f"[注册失败] 保存表情包到数据库/移动文件失败: (unknown)")
            # When register_to_db fails it tries to clean up the moved file itself; the source may remain.
            # Should the source file be deleted too?
            if os.path.exists(file_full_path):
                try:
                    os.remove(file_full_path)
                    logger.info(f"[清理] 删除注册失败的源文件: (unknown)")
                except Exception as e:
                    logger.error(f"[错误] 删除注册失败源文件时出错: {e!s}")
@@ -862,9 +1195,9 @@ class EmojiManager:
            logger.error(f"[错误] 注册表情包时发生未预期错误 ((unknown)): {e!s}")
            logger.error(traceback.format_exc())
            # Try to delete the source file so it is not reprocessed in a loop
            if os.path.exists(file_full_path):
                try:
                    os.remove(file_full_path)
                    logger.info(f"[清理] 删除处理异常的源文件: (unknown)")
                except Exception as remove_error:
                    logger.error(f"[错误] 删除异常处理文件时出错: {remove_error}")
@@ -1,140 +0,0 @@
import asyncio
import os
import time
from typing import Any

from src.chat.emoji_system.emoji_constants import BASE_DIR, EMOJI_DIR, EMOJI_REGISTERED_DIR
from src.chat.emoji_system.emoji_entities import MaiEmoji
from src.common.logger import get_logger

logger = get_logger("emoji")


def _emoji_objects_to_readable_list(emoji_objects: list[MaiEmoji]) -> list[str]:
    emoji_info_list = []
    for i, emoji in enumerate(emoji_objects):
        time_str = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(emoji.register_time))
        emoji_info = f"编号: {i + 1}\n描述: {emoji.description}\n使用次数: {emoji.usage_count}\n添加时间: {time_str}\n"
        emoji_info_list.append(emoji_info)
    return emoji_info_list


def _to_emoji_objects(data: Any) -> tuple[list[MaiEmoji], int]:
    emoji_objects = []
    load_errors = 0
    emoji_data_list = list(data)

    for emoji_data in emoji_data_list:
        full_path = emoji_data.full_path
        if not full_path:
            logger.warning(
                f"[加载错误] 数据库记录缺少 'full_path' 字段: ID {emoji_data.id if hasattr(emoji_data, 'id') else 'Unknown'}"
            )
            load_errors += 1
            continue

        try:
            emoji = MaiEmoji(full_path=full_path)

            emoji.hash = emoji_data.emoji_hash
            if not emoji.hash:
                logger.warning(f"[加载错误] 数据库记录缺少 'hash' 字段: {full_path}")
                load_errors += 1
                continue

            emoji.description = emoji_data.description
            emoji.emotion = emoji_data.emotion.split(",") if emoji_data.emotion else []
            emoji.usage_count = emoji_data.usage_count

            db_last_used_time = emoji_data.last_used_time
            db_register_time = emoji_data.register_time

            emoji.last_used_time = db_last_used_time if db_last_used_time is not None else emoji.register_time
            emoji.register_time = db_register_time if db_register_time is not None else emoji.register_time

            emoji.format = emoji_data.format

            emoji_objects.append(emoji)

        except ValueError as ve:
            logger.error(f"[加载错误] 初始化 MaiEmoji 失败 ({full_path}): {ve}")
            load_errors += 1
        except Exception as e:
            logger.error(f"[加载错误] 处理数据库记录时出错 ({full_path}): {e!s}")
            load_errors += 1
    return emoji_objects, load_errors


def _ensure_emoji_dir() -> None:
    os.makedirs(EMOJI_DIR, exist_ok=True)
    os.makedirs(EMOJI_REGISTERED_DIR, exist_ok=True)


async def clear_temp_emoji() -> None:
    logger.info("[清理] 开始清理缓存...")

    for need_clear in (
        os.path.join(BASE_DIR, "emoji"),
        os.path.join(BASE_DIR, "image"),
        os.path.join(BASE_DIR, "images"),
    ):
        if await asyncio.to_thread(os.path.exists, need_clear):
            files = await asyncio.to_thread(os.listdir, need_clear)
            if len(files) > 1000:
                for i, filename in enumerate(files):
                    file_path = os.path.join(need_clear, filename)
                    if await asyncio.to_thread(os.path.isfile, file_path):
                        try:
                            await asyncio.to_thread(os.remove, file_path)
                            logger.debug(f"[清理] 删除: (unknown)")
                        except Exception as e:
                            logger.debug(f"[清理] 删除失败 (unknown): {e!s}")
                    if (i + 1) % 100 == 0:
                        await asyncio.sleep(0)


async def clean_unused_emojis(emoji_dir: str, emoji_objects: list[MaiEmoji], removed_count: int) -> int:
    if not await asyncio.to_thread(os.path.exists, emoji_dir):
        logger.warning(f"[清理] 目标目录不存在,跳过清理: {emoji_dir}")
        return removed_count

    cleaned_count = 0
    try:
        tracked_full_paths = {emoji.full_path for emoji in emoji_objects if not emoji.is_deleted}

        for entry in await asyncio.to_thread(lambda: list(os.scandir(emoji_dir))):
            if not entry.is_file():
                continue

            file_full_path = entry.path

            if file_full_path not in tracked_full_paths:
                try:
                    await asyncio.to_thread(os.remove, file_full_path)
                    logger.info(f"[清理] 删除未追踪的表情包文件: {file_full_path}")
                    cleaned_count += 1
                except Exception as e:
                    logger.error(f"[错误] 删除文件时出错 ({file_full_path}): {e!s}")

        if cleaned_count > 0:
            logger.info(f"[清理] 在目录 {emoji_dir} 中清理了 {cleaned_count} 个破损表情包。")
        else:
            logger.info(f"[清理] 目录 {emoji_dir} 中没有需要清理的。")

    except Exception as e:
        logger.error(f"[错误] 清理未使用表情包文件时出错 ({emoji_dir}): {e!s}")

    return removed_count + cleaned_count


async def list_image_files(directory: str) -> tuple[list[str], bool]:
    def _scan() -> tuple[list[str], bool]:
        entries = list(os.scandir(directory))
        files = [
            entry.name
            for entry in entries
            if entry.is_file() and entry.name.lower().endswith((".jpg", ".jpeg", ".png", ".gif"))
        ]
        return files, len(entries) == 0

    return await asyncio.to_thread(_scan)
@@ -7,26 +7,11 @@ import random
import re
from typing import Any

from src.common.logger import get_logger

logger = get_logger("express_utils")


def filter_message_content(content: str | None) -> str:
    """
    Filter message content, removing reply, @-mention, and image markers
@@ -40,56 +25,29 @@ def filter_message_content(content: str | None) -> str:
    if not content:
        return ""

    # Remove the leading "[回复...],说:" part, including the trailing whitespace
    content = re.sub(r"\[回复.*?\],说:\s*", "", content)
    # Remove @<...> mentions
    content = re.sub(r"@<[^>]*>", "", content)
    # Remove [图片:...] image IDs
    content = re.sub(r"\[图片:[^\]]*\]", "", content)
    # Remove [表情包:...] emoji markers
    content = re.sub(r"\[表情包:[^\]]*\]", "", content)

    return content.strip()
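A standalone sketch of those four substitutions (the sample string is made up for illustration):

```python
import re

def filter_message_content(content: str) -> str:
    # Standalone copy of the four substitutions above, for demonstration only
    content = re.sub(r"\[回复.*?\],说:\s*", "", content)
    content = re.sub(r"@<[^>]*>", "", content)
    content = re.sub(r"\[图片:[^\]]*\]", "", content)
    content = re.sub(r"\[表情包:[^\]]*\]", "", content)
    return content.strip()

cleaned = filter_message_content("@<小明> 早上好 [图片:img_001] [表情包:笑]")
assert cleaned == "早上好"  # mention and bracketed markers are stripped, edges trimmed
```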

def calculate_similarity(text1: str, text2: str) -> float:
    """
    Compute the similarity of two texts as a value between 0 and 1.

    Args:
        text1: the first text
        text2: the second text

    Returns:
        similarity value in [0, 1]
    """
    if not text1 or not text2:
        return 0.0
    if text1 == text2:
        return 1.0

    return difflib.SequenceMatcher(None, text1, text2).ratio()
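The behavior reduces to `difflib.SequenceMatcher`; a minimal sketch with a hypothetical `similarity` wrapper (not the project function) showing the edge cases and the ratio formula, 2·M / (len(a) + len(b)) where M is the number of matched characters:

```python
import difflib

def similarity(a: str, b: str) -> float:
    # Same logic as calculate_similarity: guard the empty and identical cases,
    # otherwise fall through to SequenceMatcher
    if not a or not b:
        return 0.0
    if a == b:
        return 1.0
    return difflib.SequenceMatcher(None, a, b).ratio()

assert similarity("抱歉啦", "抱歉啦") == 1.0
assert similarity("", "anything") == 0.0
assert similarity("abcd", "abce") == 0.75  # 2 * 3 matched chars / (4 + 4)
```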

@@ -121,10 +79,18 @@ def weighted_sample(population: list[dict], k: int, weight_key: str | None = Non
        except (ValueError, TypeError) as e:
            logger.warning(f"加权抽样失败,使用等概率抽样: {e}")

    # Uniform sampling without replacement
    selected = []
    population_copy = population.copy()

    for _ in range(k):
        if not population_copy:
            break
        # Pick one element at random
        idx = random.randint(0, len(population_copy) - 1)
        selected.append(population_copy.pop(idx))

    return selected
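The fallback loop is uniform sampling without replacement, tolerant of k larger than the population; a self-contained sketch (hypothetical `uniform_sample`, mirroring the loop above):

```python
import random

def uniform_sample(population: list, k: int) -> list:
    # Pop random elements until k are drawn or the pool is exhausted,
    # so the result never contains duplicates
    selected = []
    pool = population.copy()
    for _ in range(k):
        if not pool:
            break
        idx = random.randint(0, len(pool) - 1)
        selected.append(pool.pop(idx))
    return selected

picked = uniform_sample(list(range(10)), 4)
assert len(picked) == 4 and len(set(picked)) == 4  # without replacement: no repeats
assert len(uniform_sample([1, 2], 5)) == 2         # oversized k is clamped, not an error
```

Unlike `random.sample`, which raises `ValueError` when k exceeds the population size, this loop simply returns everything.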

def normalize_text(text: str) -> str:
@@ -164,9 +130,8 @@ def extract_keywords(text: str, max_keywords: int = 10) -> list[str]:
        return keywords
    except ImportError:
        logger.warning("rjieba未安装,无法提取关键词")
        # Naive fallback: split on whitespace
        words = text.split()
        return words[:max_keywords]
@@ -271,18 +236,15 @@ def merge_expressions_from_multiple_chats(
    # Collect all expressions
    for chat_id, expressions in expressions_dict.items():
        for expr in expressions:
            # Tag each expression with its source chat
            expr_with_source = expr.copy()
            expr_with_source["source_id"] = chat_id
            all_expressions.append(expr_with_source)

    if not all_expressions:
        return []

    # Sort by count or last_active_time, whichever is available
    if all_expressions and "count" in all_expressions[0]:
        all_expressions.sort(key=lambda x: x.get("count", 0), reverse=True)
    elif all_expressions and "last_active_time" in all_expressions[0]:
        all_expressions.sort(key=lambda x: x.get("last_active_time", 0), reverse=True)

    # Deduplicate (by situation and style)
@@ -358,10 +358,7 @@ class ExpressionLearner:
    @staticmethod
    @cached(ttl=600, key_prefix="chat_expressions")
    async def _get_expressions_by_chat_id_cached(chat_id: str) -> tuple[list[dict[str, float]], list[dict[str, float]]]:
        """Internal helper: fetch expressions from the database (cached)."""
        learnt_style_expressions = []
        learnt_grammar_expressions = []
@@ -369,91 +366,67 @@ class ExpressionLearner:
        crud = CRUDBase(Expression)
        all_expressions = await crud.get_multi(chat_id=chat_id, limit=10000)

        for expr in all_expressions:
            # Make sure create_date is set; fall back to last_active_time
            create_date = expr.create_date if expr.create_date is not None else expr.last_active_time

            expr_data = {
                "situation": expr.situation,
                "style": expr.style,
                "count": expr.count,
                "last_active_time": expr.last_active_time,
                "source_id": chat_id,
                "type": expr.type,
                "create_date": create_date,
            }

            # Classify by type
            if expr.type == "style":
                learnt_style_expressions.append(expr_data)
            elif expr.type == "grammar":
                learnt_grammar_expressions.append(expr_data)

        logger.debug(f"已加载 {len(learnt_style_expressions)} 个style和 {len(learnt_grammar_expressions)} 个grammar表达方式 (chat_id={chat_id})")
        return learnt_style_expressions, learnt_grammar_expressions
    async def _apply_global_decay_to_database(self, current_time: float) -> None:
        """
        Apply global decay to every expression in the database.

        Note: all changes are batched through the CRUD layer and committed once at the end.
        """
        try:
            # Query every expression through the CRUD helper
            crud = CRUDBase(Expression)
            all_expressions = await crud.get_multi(limit=100000)  # fetch all expressions

            updated_count = 0
            deleted_count = 0

            # A session is needed for the manual updates and deletes below
            async with get_db_session() as session:
                # Process every modification in a single pass
                for expr in all_expressions:
                    # Time since the expression was last active, converted to days
                    last_active = expr.last_active_time
                    time_diff_days = (current_time - last_active) / (24 * 3600)

                    # Decay the usage count
                    decay_value = self.calculate_decay_factor(time_diff_days)
                    new_count = max(0.01, expr.count - decay_value)

                    if new_count <= 0.01:
                        # The count has decayed to the floor: delete this expression
                        await session.delete(expr)
                        deleted_count += 1
                    else:
                        # Otherwise persist the decayed count
                        expr.count = new_count
                        updated_count += 1

                # Commit everything in one go (one commit instead of N)
                if updated_count > 0 or deleted_count > 0:
                    await session.commit()

            logger.info(f"全局衰减完成:更新了 {updated_count} 个表达方式,删除了 {deleted_count} 个表达方式")

        except Exception as e:
            logger.error(f"数据库全局衰减失败: {e}")
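The decay arithmetic can be sanity-checked with a stand-in decay function; the real `calculate_decay_factor` lives elsewhere on the class, so the linear rate used here is purely an assumption for illustration:

```python
def calculate_decay_factor(time_diff_days: float, rate: float = 0.05) -> float:
    # Hypothetical linear decay: `rate` count units lost per idle day
    # (the project's actual decay curve may differ)
    return time_diff_days * rate

count = 3.0
# 40 idle days: 3.0 - 40 * 0.05 = 1.0, so the record is updated, not deleted
new_count = max(0.01, count - calculate_decay_factor(40.0))
assert new_count == 1.0

# 100 idle days: the count bottoms out at the 0.01 floor, which triggers deletion
floored = max(0.01, count - calculate_decay_factor(100.0))
assert floored <= 0.01
```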
@@ -536,103 +509,88 @@ class ExpressionLearner:
            CRUDBase(Expression)
            for chat_id, expr_list in chat_dict.items():
                async with get_db_session() as session:
                    for new_expr in expr_list:
                        # Check for records sharing the situation or the style.
                        # Case 1: same chat_id + type + situation (same situation, different style)
                        query_same_situation = await session.execute(
                            select(Expression).where(
                                (Expression.chat_id == chat_id)
                                & (Expression.type == type)
                                & (Expression.situation == new_expr["situation"])
                            )
                        )
                        same_situation_expr = query_same_situation.scalar()

                        # Case 2: same chat_id + type + style (same style, different situation)
                        query_same_style = await session.execute(
                            select(Expression).where(
                                (Expression.chat_id == chat_id)
                                & (Expression.type == type)
                                & (Expression.style == new_expr["style"])
                            )
                        )
                        same_style_expr = query_same_style.scalar()

                        # Case 3: exact match (same situation and same style)
                        query_exact_match = await session.execute(
                            select(Expression).where(
                                (Expression.chat_id == chat_id)
                                & (Expression.type == type)
                                & (Expression.situation == new_expr["situation"])
                                & (Expression.style == new_expr["style"])
                            )
                        )
                        exact_match_expr = query_exact_match.scalar()

                        # Exact matches take priority
                        if exact_match_expr:
                            # Identical expression: bump count and refresh the timestamp
                            expr_obj = exact_match_expr
                            expr_obj.count = expr_obj.count + 1
                            expr_obj.last_active_time = current_time
                            logger.debug(f"完全匹配:更新count {expr_obj.count}")
                        elif same_situation_expr:
                            # Same situation, different style: overwrite the old style
                            logger.info(f"相同情景覆盖:'{same_situation_expr.situation}' 的表达从 '{same_situation_expr.style}' 更新为 '{new_expr['style']}'")
                            same_situation_expr.style = new_expr["style"]
                            same_situation_expr.count = same_situation_expr.count + 1
                            same_situation_expr.last_active_time = current_time
                        elif same_style_expr:
                            # Same style, different situation: overwrite the old situation
                            logger.info(f"相同表达覆盖:'{same_style_expr.style}' 的情景从 '{same_style_expr.situation}' 更新为 '{new_expr['situation']}'")
                            same_style_expr.situation = new_expr["situation"]
                            same_style_expr.count = same_style_expr.count + 1
                            same_style_expr.last_active_time = current_time
                        else:
                            # An entirely new expression: create a record
                            new_expression = Expression(
                                situation=new_expr["situation"],
                                style=new_expr["style"],
                                count=1,
                                last_active_time=current_time,
                                chat_id=chat_id,
                                type=type,
                                create_date=current_time,  # set the creation date explicitly
                            )
                            session.add(new_expression)
                            logger.debug(f"新增表达方式:{new_expr['situation']} -> {new_expr['style']}")

                    # Enforce the cap: fetch the records sorted by count ascending
                    exprs_result = await session.execute(
                        select(Expression)
                        .where((Expression.chat_id == chat_id) & (Expression.type == type))
                        .order_by(Expression.count.asc())
                    )
                    exprs = list(exprs_result.scalars())
                    if len(exprs) > MAX_EXPRESSION_COUNT:
                        # Delete the surplus expressions with the lowest counts
                        for expr in exprs[: len(exprs) - MAX_EXPRESSION_COUNT]:
                            await session.delete(expr)

                    # Clear the related caches after committing
                    await session.commit()

            # Only clear caches when something was actually updated
            if chat_dict:
                # Clear the expression cache of every chat_id in the shared group
                from src.common.database.optimization.cache_manager import get_cache
                from src.common.database.utils.decorators import generate_cache_key

                cache = await get_cache()
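The three-way matching priority above (exact match > same situation > same style > new record) can be sketched without a database, using plain dicts and a hypothetical `merge_expression` helper:

```python
def merge_expression(store: list[dict], new: dict) -> None:
    # Priority order mirrors the session logic: exact > same situation > same style > insert
    exact = next(
        (e for e in store if e["situation"] == new["situation"] and e["style"] == new["style"]),
        None,
    )
    same_situation = next((e for e in store if e["situation"] == new["situation"]), None)
    same_style = next((e for e in store if e["style"] == new["style"]), None)
    if exact:
        exact["count"] += 1                       # identical: just bump the count
    elif same_situation:
        same_situation["style"] = new["style"]    # same situation: overwrite the style
        same_situation["count"] += 1
    elif same_style:
        same_style["situation"] = new["situation"]  # same style: overwrite the situation
        same_style["count"] += 1
    else:
        store.append({**new, "count": 1})         # brand new: insert with count 1

store = [{"situation": "道别", "style": "拜拜啦", "count": 2}]
merge_expression(store, {"situation": "道别", "style": "下次见"})  # same situation
assert store[0]["style"] == "下次见" and store[0]["count"] == 3
merge_expression(store, {"situation": "问候", "style": "早上好"})  # brand new
assert len(store) == 2 and store[1]["count"] == 1
```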
@@ -644,59 +602,53 @@ class ExpressionLearner:
                if len(related_chat_ids) > 1:
                    logger.debug(f"已清除共享组内 {len(related_chat_ids)} 个 chat_id 的表达方式缓存")

            # Train the StyleLearner (shared groups supported).
            # Only style expressions are trained; grammar is not pushed into the model.
            if type == "style":
                try:
                    logger.debug(f"开始训练 StyleLearner: 源chat_id={chat_id}, 共享组包含 {len(related_chat_ids)} 个chat_id, 样本数={len(expr_list)}")

                    # Train the StyleLearner of every chat_id in the shared group
                    for target_chat_id in related_chat_ids:
                        learner = style_learner_manager.get_learner(target_chat_id)

                        # Train the model on each learned expression,
                        # with situation as input and style as target:
                        # the most natural semantics, situation -> style
                        success_count = 0
                        for expr in expr_list:
                            situation = expr["situation"]
                            style = expr["style"]

                            # Train the mapping: situation -> style
                            if learner.learn_mapping(situation, style):
                                success_count += 1
                            else:
                                logger.warning(f"训练失败 (target={target_chat_id}): {situation} -> {style}")

                        # Save the model
                        if learner.save(style_learner_manager.model_save_path):
                            logger.debug(f"StyleLearner 模型保存成功: {target_chat_id}")
                        else:
                            logger.error(f"StyleLearner 模型保存失败: {target_chat_id}")

                        if target_chat_id == chat_id:
                            # Detailed logging only for the source chat_id
                            logger.info(
                                f"StyleLearner 训练完成 (源): {success_count}/{len(expr_list)} 成功, "
                                f"当前风格总数={len(learner.get_all_styles())}, "
                                f"总样本数={learner.learning_stats['total_samples']}"
                            )
                        else:
                            logger.debug(
                                f"StyleLearner 训练完成 (共享组成员 {target_chat_id}): {success_count}/{len(expr_list)} 成功"
                            )

                    if len(related_chat_ids) > 1:
                        logger.info(f"共享组内共 {len(related_chat_ids)} 个 StyleLearner 已同步训练")

                except Exception as e:
                    logger.error(f"训练 StyleLearner 失败: {e}")

            return learnt_expressions
        return None
@@ -207,20 +207,31 @@ class ExpressionSelector:
            select(Expression).where((Expression.chat_id.in_(related_chat_ids)) & (Expression.type == "grammar"))
        )

-       # 🔥 优化:提前定义转换函数,避免重复代码
-       def expr_to_dict(expr, expr_type: str) -> dict[str, Any]:
-           return {
+       style_exprs = [
+           {
                "situation": expr.situation,
                "style": expr.style,
                "count": expr.count,
                "last_active_time": expr.last_active_time,
                "source_id": expr.chat_id,
-               "type": expr_type,
+               "type": "style",
                "create_date": expr.create_date if expr.create_date is not None else expr.last_active_time,
            }
+           for expr in style_query.scalars()
+       ]

-       style_exprs = [expr_to_dict(expr, "style") for expr in style_query.scalars()]
-       grammar_exprs = [expr_to_dict(expr, "grammar") for expr in grammar_query.scalars()]
+       grammar_exprs = [
+           {
+               "situation": expr.situation,
+               "style": expr.style,
+               "count": expr.count,
+               "last_active_time": expr.last_active_time,
+               "source_id": expr.chat_id,
+               "type": "grammar",
+               "create_date": expr.create_date if expr.create_date is not None else expr.last_active_time,
+           }
+           for expr in grammar_query.scalars()
+       ]

        style_num = int(total_num * style_percentage)
        grammar_num = int(total_num * grammar_percentage)
@@ -240,14 +251,9 @@ class ExpressionSelector:

    @staticmethod
    async def update_expressions_count_batch(expressions_to_update: list[dict[str, Any]], increment: float = 0.1):
-       """对一批表达方式更新count值,按chat_id+type分组后一次性写入数据库
-
-       🔥 优化:合并所有更新到一个事务中,减少数据库连接开销
-       """
+       """对一批表达方式更新count值,按chat_id+type分组后一次性写入数据库"""
        if not expressions_to_update:
            return

        # 去重处理
        updates_by_key = {}
        affected_chat_ids = set()
        for expr in expressions_to_update:
@@ -263,15 +269,9 @@ class ExpressionSelector:
            updates_by_key[key] = expr
            affected_chat_ids.add(source_id)

        if not updates_by_key:
            return

-       # 🔥 优化:使用单个 session 批量处理所有更新
-       current_time = time.time()
-       async with get_db_session() as session:
-           updated_count = 0
-           for chat_id, expr_type, situation, style in updates_by_key:
-               query_result = await session.execute(
+       for chat_id, expr_type, situation, style in updates_by_key:
+           async with get_db_session() as session:
+               query = await session.execute(
                    select(Expression).where(
                        (Expression.chat_id == chat_id)
                        & (Expression.type == expr_type)
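The per-expression reinforcement in this hunk raises `count` by a small increment but clamps it at 5.0. A minimal sketch of that bounded-update rule (the helper name `bump_count` is hypothetical, not from the codebase):

```python
def bump_count(current: float, increment: float = 0.1, cap: float = 5.0) -> float:
    # Reinforce an expression's activation count, but never past the cap
    return min(current + increment, cap)
```

A count already near the cap saturates: `bump_count(4.95)` yields 5.0 rather than 5.05.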
@@ -279,26 +279,25 @@ class ExpressionSelector:
                        & (Expression.style == style)
                    )
                )
-               expr_obj = query_result.scalar()
-               if expr_obj:
+               query = query.scalar()
+               if query:
+                   expr_obj = query
                    current_count = expr_obj.count
                    new_count = min(current_count + increment, 5.0)
                    expr_obj.count = new_count
-                   expr_obj.last_active_time = current_time
-                   updated_count += 1
+                   expr_obj.last_active_time = time.time()

-           # 批量提交所有更改
-           if updated_count > 0:
+                   logger.debug(
+                       f"表达方式激活: 原count={current_count:.3f}, 增量={increment}, 新count={new_count:.3f} in db"
+                   )
                await session.commit()
-               logger.debug(f"批量更新了 {updated_count} 个表达方式的count值")

        # 清除所有受影响的chat_id的缓存
        if affected_chat_ids:
            from src.common.database.optimization.cache_manager import get_cache
            from src.common.database.utils.decorators import generate_cache_key
            cache = await get_cache()
            for chat_id in affected_chat_ids:
                await cache.delete(generate_cache_key("chat_expressions", chat_id))

    async def select_suitable_expressions(
        self,
@@ -519,41 +518,29 @@ class ExpressionSelector:
            logger.warning("数据库中完全没有任何表达方式,需要先学习")
            return []

-       # 🔥 优化:使用更高效的模糊匹配算法
+       # 🔥 使用模糊匹配而不是精确匹配
        # 计算每个预测style与数据库style的相似度
        from difflib import SequenceMatcher

-       # 预处理:提前计算所有预测 style 的小写版本,避免重复计算
-       predicted_styles_lower = [(s.lower(), score) for s, score in predicted_styles[:20]]
-
        matched_expressions = []
        for expr in all_expressions:
            db_style = expr.style or ""
-           db_style_lower = db_style.lower()
            max_similarity = 0.0
            best_predicted = ""

            # 与每个预测的style计算相似度
-           for predicted_style_lower, pred_score in predicted_styles_lower:
-               # 快速检查:完全匹配
-               if predicted_style_lower == db_style_lower:
-                   max_similarity = 1.0
-                   best_predicted = predicted_style_lower
-                   break
+           for predicted_style, pred_score in predicted_styles[:20]:  # 考虑前20个预测
+               # 计算字符串相似度
+               similarity = SequenceMatcher(None, predicted_style, db_style).ratio()

-               # 快速检查:子串匹配
-               if len(predicted_style_lower) >= 2 and len(db_style_lower) >= 2:
-                   if predicted_style_lower in db_style_lower or db_style_lower in predicted_style_lower:
-                       similarity = 0.7
-                       if similarity > max_similarity:
-                           max_similarity = similarity
-                           best_predicted = predicted_style_lower
-                       continue
+               # 也检查包含关系(如果一个是另一个的子串,给更高分)
+               if len(predicted_style) >= 2 and len(db_style) >= 2:
+                   if predicted_style in db_style or db_style in predicted_style:
+                       similarity = max(similarity, 0.7)

-               # 计算字符串相似度(较慢,只在必要时使用)
-               similarity = SequenceMatcher(None, predicted_style_lower, db_style_lower).ratio()
                if similarity > max_similarity:
                    max_similarity = similarity
-                   best_predicted = predicted_style_lower
+                   best_predicted = predicted_style

            # 🔥 降低阈值到30%,因为StyleLearner预测质量较差
            if max_similarity >= 0.3:  # 30%相似度阈值
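The scoring used in this hunk combines `difflib` ratio with a substring bonus floored at 0.7, then accepts matches above the 0.3 threshold. A self-contained sketch of that scoring rule (the function name `match_score` is illustrative, not from the codebase):

```python
from difflib import SequenceMatcher


def match_score(predicted: str, db_style: str) -> float:
    # Base similarity from difflib's Ratcliff/Obershelp ratio
    score = SequenceMatcher(None, predicted, db_style).ratio()
    # Substring containment (either direction) guarantees at least 0.7
    if len(predicted) >= 2 and len(db_style) >= 2 and (predicted in db_style or db_style in predicted):
        score = max(score, 0.7)
    return score
```

For example, "开心" inside "开心地回复" would score below 0.7 on raw ratio but is lifted to 0.7 by the containment rule, so it clears the 0.3 acceptance threshold.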
@@ -586,15 +573,14 @@ class ExpressionSelector:
            f"(候选 {len(matched_expressions)},temperature={temperature})"
        )

-       # 🔥 优化:使用列表推导式和预定义函数减少开销
+       # 转换为字典格式
        expressions = [
            {
                "situation": expr.situation or "",
                "style": expr.style or "",
                "type": expr.type or "style",
                "count": float(expr.count) if expr.count else 0.0,
-               "last_active_time": expr.last_active_time or 0.0,
-               "source_id": expr.chat_id  # 添加 source_id 以便后续更新
+               "last_active_time": expr.last_active_time or 0.0
            }
            for expr in expressions_objs
        ]
@@ -127,8 +127,7 @@ class SituationExtractor:
        Returns:
            情境描述列表
        """
-       situations: list[str] = []
-       seen = set()
+       situations = []

        for line in response.splitlines():
            line = line.strip()
@@ -151,11 +150,6 @@ class SituationExtractor:
            if any(keyword in line.lower() for keyword in ["例如", "注意", "请", "分析", "总结"]):
                continue

-           # 去重,保持原有顺序
-           if line in seen:
-               continue
-           seen.add(line)
-
            situations.append(line)

            if len(situations) >= max_situations:
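The removed block above is the classic seen-set pattern: drop repeats while keeping first-occurrence order and capping the result. A standalone sketch (the helper name `dedup_keep_order` is illustrative):

```python
def dedup_keep_order(lines: list[str], max_items: int) -> list[str]:
    # Order-preserving dedup, capped at max_items results
    seen: set[str] = set()
    out: list[str] = []
    for line in lines:
        if line in seen:
            continue
        seen.add(line)
        out.append(line)
        if len(out) >= max_items:
            break
    return out
```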
@@ -4,7 +4,6 @@
支持多聊天室独立建模和在线学习
"""
import os
- import pickle
import time

from src.common.logger import get_logger
@@ -17,12 +16,11 @@ logger = get_logger("expressor.style_learner")
class StyleLearner:
    """单个聊天室的表达风格学习器"""

-   def __init__(self, chat_id: str, model_config: dict | None = None, resource_limit_enabled: bool = True):
+   def __init__(self, chat_id: str, model_config: dict | None = None):
        """
        Args:
            chat_id: 聊天室ID
            model_config: 模型配置
-           resource_limit_enabled: 是否启用资源上限控制(默认关闭)
        """
        self.chat_id = chat_id
        self.model_config = model_config or {
@@ -36,9 +34,6 @@ class StyleLearner:
        # 初始化表达模型
        self.expressor = ExpressorModel(**self.model_config)

-       # 资源上限控制开关(默认开启,可按需关闭)
-       self.resource_limit_enabled = resource_limit_enabled
-
        # 动态风格管理
        self.max_styles = 2000  # 每个chat_id最多2000个风格
        self.cleanup_threshold = 0.9  # 达到90%容量时触发清理
@@ -72,15 +67,18 @@ class StyleLearner:
        if style in self.style_to_id:
            return True

-       # 检查是否需要清理(仅计算一次阈值)
-       if self.resource_limit_enabled:
-           current_count = len(self.style_to_id)
-           cleanup_trigger = int(self.max_styles * self.cleanup_threshold)
-           if current_count >= cleanup_trigger:
-               if current_count >= self.max_styles:
-                   logger.warning(f"已达到最大风格数量限制 ({self.max_styles}),开始清理")
-               else:
-                   logger.info(f"风格数量达到 {current_count}/{self.max_styles},触发预防性清理")
-               self._cleanup_styles()
+       # 检查是否需要清理
+       current_count = len(self.style_to_id)
+       cleanup_trigger = int(self.max_styles * self.cleanup_threshold)
+
+       if current_count >= cleanup_trigger:
+           if current_count >= self.max_styles:
+               # 已经达到最大限制,必须清理
+               logger.warning(f"已达到最大风格数量限制 ({self.max_styles}),开始清理")
+               self._cleanup_styles()
+           elif current_count >= cleanup_trigger:
+               # 接近限制,提前清理
+               logger.info(f"风格数量达到 {current_count}/{self.max_styles},触发预防性清理")
+               self._cleanup_styles()

        # 生成新的style_id
@@ -97,8 +95,7 @@ class StyleLearner:
        self.expressor.add_candidate(style_id, style, situation)

        # 初始化统计
-       self.learning_stats.setdefault("style_counts", {})[style_id] = 0
-       self.learning_stats.setdefault("style_last_used", {})
+       self.learning_stats["style_counts"][style_id] = 0

        logger.debug(f"添加风格成功: {style_id} -> {style}")
        return True
@@ -117,64 +114,64 @@ class StyleLearner:
        3. 默认清理 cleanup_ratio (20%) 的风格
        """
        try:
-           total_styles = len(self.style_to_id)
-           if total_styles == 0:
-               return
-
-           # 只有在达到阈值时才执行昂贵的排序
-           cleanup_count = max(1, int(total_styles * self.cleanup_ratio))
-           if cleanup_count <= 0:
-               return
-
            current_time = time.time()
-           # 局部引用加速频繁调用的函数
-           from math import exp, log1p
+           cleanup_count = max(1, int(len(self.style_to_id) * self.cleanup_ratio))

            # 计算每个风格的价值分数
            style_scores = []
            for style_id in self.style_to_id.values():
                # 使用次数
                usage_count = self.learning_stats["style_counts"].get(style_id, 0)

                # 最后使用时间(越近越好)
                last_used = self.learning_stats["style_last_used"].get(style_id, 0)

                time_since_used = current_time - last_used if last_used > 0 else float("inf")
-               usage_score = log1p(usage_count)
-               days_unused = time_since_used / 86400
-               time_score = exp(-days_unused / 30)

+               # 综合分数:使用次数越多越好,距离上次使用时间越短越好
+               # 使用对数来平滑使用次数的影响
+               import math
+               usage_score = math.log1p(usage_count)  # log(1 + count)
+
+               # 时间分数:转换为天数,使用指数衰减
+               days_unused = time_since_used / 86400  # 转换为天
+               time_score = math.exp(-days_unused / 30)  # 30天衰减因子
+
                # 综合分数:80%使用频率 + 20%时间新鲜度
                total_score = 0.8 * usage_score + 0.2 * time_score
-               style_scores.append((style_id, total_score, usage_count, days_unused))
-
-           if not style_scores:
-               return
+               style_scores.append((style_id, total_score, usage_count, days_unused))

            # 按分数排序,分数低的先删除
            style_scores.sort(key=lambda x: x[1])

            # 删除分数最低的风格
            deleted_styles = []
            for style_id, score, usage, days in style_scores[:cleanup_count]:
                style_text = self.id_to_style.get(style_id)
-               if not style_text:
-                   continue
+               if style_text:
+                   # 从映射中删除
+                   del self.style_to_id[style_text]
+                   del self.id_to_style[style_id]
+                   if style_id in self.id_to_situation:
+                       del self.id_to_situation[style_id]

-               # 从映射中删除
-               self.style_to_id.pop(style_text, None)
-               self.id_to_style.pop(style_id, None)
-               self.id_to_situation.pop(style_id, None)
+                   # 从统计中删除
+                   if style_id in self.learning_stats["style_counts"]:
+                       del self.learning_stats["style_counts"][style_id]
+                   if style_id in self.learning_stats["style_last_used"]:
+                       del self.learning_stats["style_last_used"][style_id]

-               # 从统计中删除
-               self.learning_stats["style_counts"].pop(style_id, None)
-               self.learning_stats["style_last_used"].pop(style_id, None)
-               # 从expressor模型中删除
-               self.expressor.remove_candidate(style_id)
+                   # 从expressor模型中删除
+                   self.expressor.remove_candidate(style_id)

-               deleted_styles.append((style_text[:30], usage, f"{days:.1f}天"))
+                   deleted_styles.append((style_text[:30], usage, f"{days:.1f}天"))

            logger.info(
                f"风格清理完成: 删除了 {len(deleted_styles)}/{len(style_scores)} 个风格,"
                f"剩余 {len(self.style_to_id)} 个风格"
            )

            # 记录前5个被删除的风格(用于调试)
            if deleted_styles:
                logger.debug(f"被删除的风格样例(前5): {deleted_styles[:5]}")
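The value score in this hunk is 80% log-damped usage plus 20% recency with a 30-day exponential decay; styles with the lowest score are evicted first. A minimal sketch of that formula (the name `style_value` is illustrative):

```python
from math import exp, log1p


def style_value(usage_count: int, days_unused: float) -> float:
    # 80% usage frequency (log1p damps heavy users) + 20% recency
    # (exponential decay with a 30-day time constant)
    return 0.8 * log1p(usage_count) + 0.2 * exp(-days_unused / 30)
```

A style used ten times today outscores the same style untouched for two months, so sorting ascending by this value puts stale, rarely-used styles at the front of the cleanup list.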
@@ -207,9 +204,7 @@ class StyleLearner:
        # 更新统计
        current_time = time.time()
        self.learning_stats["total_samples"] += 1
-       self.learning_stats.setdefault("style_counts", {})
-       self.learning_stats.setdefault("style_last_used", {})
-       self.learning_stats["style_counts"][style_id] = self.learning_stats["style_counts"].get(style_id, 0) + 1
+       self.learning_stats["style_counts"][style_id] += 1
        self.learning_stats["style_last_used"][style_id] = current_time  # 更新最后使用时间
        self.learning_stats["last_update"] = current_time
@@ -354,11 +349,11 @@ class StyleLearner:

        # 保存expressor模型
        model_path = os.path.join(save_dir, "expressor_model.pkl")
-       tmp_model_path = f"{model_path}.tmp"
-       self.expressor.save(tmp_model_path)
-       os.replace(tmp_model_path, model_path)
+       self.expressor.save(model_path)

-       # 保存映射关系和统计信息(原子写)
+       # 保存映射关系和统计信息
+       import pickle
+
        meta_path = os.path.join(save_dir, "meta.pkl")

        # 确保 learning_stats 包含所有必要字段
@@ -373,13 +368,8 @@ class StyleLearner:
            "learning_stats": self.learning_stats,
        }

-       tmp_meta_path = f"{meta_path}.tmp"
-       with open(tmp_meta_path, "wb") as f:
-           pickle.dump(meta_data, f, protocol=pickle.HIGHEST_PROTOCOL)
-           f.flush()
-           os.fsync(f.fileno())
-
-       os.replace(tmp_meta_path, meta_path)
+       with open(meta_path, "wb") as f:
+           pickle.dump(meta_data, f)

        return True
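The removed lines above implement the standard atomic-write pattern: dump to a sibling `.tmp` file, force it to disk, then swap it into place so a crash never leaves a half-written pickle. A standalone sketch of that pattern (the helper name `atomic_pickle_dump` is illustrative):

```python
import os
import pickle
import tempfile


def atomic_pickle_dump(obj, path: str) -> None:
    # Write to a sibling temp file, flush and fsync so bytes hit disk,
    # then atomically replace the destination
    tmp_path = f"{path}.tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic on both POSIX and Windows
```

Readers always see either the old file or the complete new one, never a truncated pickle.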
@@ -411,6 +401,8 @@ class StyleLearner:
        self.expressor.load(model_path)

        # 加载映射关系和统计信息
+       import pickle
+
        meta_path = os.path.join(save_dir, "meta.pkl")
        if os.path.exists(meta_path):
            with open(meta_path, "rb") as f:
@@ -453,16 +445,14 @@ class StyleLearnerManager:
    # 🔧 最大活跃 learner 数量
    MAX_ACTIVE_LEARNERS = 50

-   def __init__(self, model_save_path: str = "data/expression/style_models", resource_limit_enabled: bool = True):
+   def __init__(self, model_save_path: str = "data/expression/style_models"):
        """
        Args:
            model_save_path: 模型保存路径
-           resource_limit_enabled: 是否启用资源上限控制(默认开启)
        """
        self.learners: dict[str, StyleLearner] = {}
        self.learner_last_used: dict[str, float] = {}  # 🔧 记录最后使用时间
        self.model_save_path = model_save_path
-       self.resource_limit_enabled = resource_limit_enabled

        # 确保保存目录存在
        os.makedirs(model_save_path, exist_ok=True)
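The eviction used in the next hunk sorts learners by last-used timestamp and drops the oldest once the count exceeds `MAX_ACTIVE_LEARNERS`. A minimal sketch of that selection step (the function name `pick_evictions` is illustrative, not from the codebase):

```python
def pick_evictions(last_used: dict[str, float], max_active: int) -> list[str]:
    # Choose the least-recently-used chat_ids beyond the max_active budget
    overflow = len(last_used) - max_active
    if overflow <= 0:
        return []
    by_time = sorted(last_used.items(), key=lambda kv: kv[1])  # oldest first
    return [chat_id for chat_id, _ in by_time[:overflow]]
```

The caller would save each selected learner to disk before deleting it from the in-memory maps.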
@@ -485,10 +475,7 @@ class StyleLearnerManager:
        for chat_id, last_used in sorted_by_time[:evict_count]:
            if chat_id in self.learners:
-               # 先保存再淘汰
-               try:
-                   self.learners[chat_id].save(self.model_save_path)
-               except Exception as e:
-                   logger.error(f"LRU淘汰时保存学习器失败: chat_id={chat_id}, error={e}")
+               self.learners[chat_id].save(self.model_save_path)
                del self.learners[chat_id]
                del self.learner_last_used[chat_id]
                evicted.append(chat_id)
@@ -515,11 +502,7 @@ class StyleLearnerManager:
        self._evict_if_needed()

        # 创建新的学习器
-       learner = StyleLearner(
-           chat_id,
-           model_config,
-           resource_limit_enabled=self.resource_limit_enabled,
-       )
+       learner = StyleLearner(chat_id, model_config)

        # 尝试加载已保存的模型
        learner.load(self.model_save_path)
@@ -528,12 +511,6 @@ class StyleLearnerManager:

        return self.learners[chat_id]

-   def set_resource_limit(self, enabled: bool) -> None:
-       """动态开启/关闭资源上限控制(默认关闭)。"""
-       self.resource_limit_enabled = enabled
-       for learner in self.learners.values():
-           learner.resource_limit_enabled = enabled
-
    def learn_mapping(self, chat_id: str, up_content: str, style: str) -> bool:
        """
        学习一个映射关系
@@ -5,7 +5,6 @@

import asyncio
import time
- from collections import OrderedDict
from typing import TYPE_CHECKING

from src.common.logger import get_logger
@@ -38,51 +37,20 @@ class InterestManager:
        self._calculation_queue = asyncio.Queue()
        self._worker_task = None
        self._shutdown_event = asyncio.Event()

-       # 性能优化相关字段
-       self._result_cache: OrderedDict[str, InterestCalculationResult] = OrderedDict()  # LRU缓存
-       self._cache_max_size = 1000  # 最大缓存数量
-       self._cache_ttl = 300  # 缓存TTL(秒)
-       self._batch_queue: asyncio.Queue = asyncio.Queue(maxsize=100)  # 批处理队列
-       self._batch_size = 10  # 批处理大小
-       self._batch_timeout = 0.1  # 批处理超时(秒)
-       self._batch_task = None
-       self._is_warmed_up = False  # 预热状态标记
-
-       # 性能统计
-       self._cache_hits = 0
-       self._cache_misses = 0
-       self._batch_calculations = 0
-       self._total_calculation_time = 0.0
-
        self._initialized = True

    async def initialize(self):
        """初始化管理器"""
-       # 启动批处理工作线程
-       if self._batch_task is None or self._batch_task.done():
-           self._batch_task = asyncio.create_task(self._batch_processing_worker())
-           logger.info("批处理工作线程已启动")
+       pass

    async def shutdown(self):
        """关闭管理器"""
        self._shutdown_event.set()

-       # 取消批处理任务
-       if self._batch_task and not self._batch_task.done():
-           self._batch_task.cancel()
-           try:
-               await self._batch_task
-           except asyncio.CancelledError:
-               pass
-
        if self._current_calculator:
            await self._current_calculator.cleanup()
            self._current_calculator = None

-       # 清理缓存
-       self._result_cache.clear()
-
        logger.info("兴趣值管理器已关闭")

    async def register_calculator(self, calculator: BaseInterestCalculator) -> bool:
@@ -123,13 +91,12 @@ class InterestManager:
            logger.error(f"注册兴趣值计算组件失败: {e}")
            return False

-   async def calculate_interest(self, message: "DatabaseMessages", timeout: float | None = None, use_cache: bool = True) -> InterestCalculationResult:
-       """计算消息兴趣值(优化版,支持缓存)
+   async def calculate_interest(self, message: "DatabaseMessages", timeout: float | None = None) -> InterestCalculationResult:
+       """计算消息兴趣值

        Args:
            message: 数据库消息对象
            timeout: 最大等待时间(秒),超时则使用默认值返回;为None时不设置超时
-           use_cache: 是否使用缓存,默认True

        Returns:
            InterestCalculationResult: 计算结果或默认结果
@@ -143,52 +110,36 @@ class InterestManager:
                error_message="没有可用的兴趣值计算组件",
            )

-       message_id = getattr(message, "message_id", "")
-
-       # 缓存查询
-       if use_cache and message_id:
-           cached_result = self._get_from_cache(message_id)
-           if cached_result is not None:
-               self._cache_hits += 1
-               logger.debug(f"命中缓存: {message_id}, 兴趣值: {cached_result.interest_value:.3f}")
-               return cached_result
-           self._cache_misses += 1
-
        # 使用 create_task 异步执行计算
        task = asyncio.create_task(self._async_calculate(message))

        if timeout is None:
-           result = await task
-       else:
-           try:
-               # 等待计算结果,但有超时限制
-               result = await asyncio.wait_for(task, timeout=timeout)
-           except asyncio.TimeoutError:
-               # 超时返回默认结果,但计算仍在后台继续
-               logger.warning(f"兴趣值计算超时 ({timeout}s),消息 {message_id} 使用默认兴趣值 0.5")
-               return InterestCalculationResult(
-                   success=True,
-                   message_id=message_id,
-                   interest_value=0.5,  # 固定默认兴趣值
-                   should_reply=False,
-                   should_act=False,
-                   error_message=f"计算超时({timeout}s),使用默认值",
-               )
-           except Exception as e:
-               # 发生异常,返回默认结果
-               logger.error(f"兴趣值计算异常: {e}")
-               return InterestCalculationResult(
-                   success=False,
-                   message_id=message_id,
-                   interest_value=0.3,
-                   error_message=f"计算异常: {e!s}",
-               )
+           return await task

-       # 缓存结果
-       if use_cache and result.success and message_id:
-           self._put_to_cache(message_id, result)
-
-       return result
+       try:
+           # 等待计算结果,但有超时限制
+           result = await asyncio.wait_for(task, timeout=timeout)
+           return result
+       except asyncio.TimeoutError:
+           # 超时返回默认结果,但计算仍在后台继续
+           logger.warning(f"兴趣值计算超时 ({timeout}s),消息 {getattr(message, 'message_id', '')} 使用默认兴趣值 0.5")
+           return InterestCalculationResult(
+               success=True,
+               message_id=getattr(message, "message_id", ""),
+               interest_value=0.5,  # 固定默认兴趣值
+               should_reply=False,
+               should_act=False,
+               error_message=f"计算超时({timeout}s),使用默认值",
+           )
+       except Exception as e:
+           # 发生异常,返回默认结果
+           logger.error(f"兴趣值计算异常: {e}")
+           return InterestCalculationResult(
+               success=False,
+               message_id=getattr(message, "message_id", ""),
+               interest_value=0.3,
+               error_message=f"计算异常: {e!s}",
+           )

    async def _async_calculate(self, message: "DatabaseMessages") -> InterestCalculationResult:
        """异步执行兴趣值计算"""
@@ -210,7 +161,6 @@ class InterestManager:

            if result.success:
                self._last_calculation_time = time.time()
-               self._total_calculation_time += result.calculation_time
                logger.debug(f"兴趣值计算完成: {result.interest_value:.3f} (耗时: {result.calculation_time:.3f}s)")
            else:
                self._failed_calculations += 1
@@ -220,15 +170,13 @@ class InterestManager:

        except Exception as e:
            self._failed_calculations += 1
-           calc_time = time.time() - start_time
-           self._total_calculation_time += calc_time
            logger.error(f"兴趣值计算异常: {e}")
            return InterestCalculationResult(
                success=False,
                message_id=getattr(message, "message_id", ""),
                interest_value=0.0,
                error_message=f"计算异常: {e!s}",
-               calculation_time=calc_time,
+               calculation_time=time.time() - start_time,
            )

    async def _calculation_worker(self):
@@ -250,155 +198,6 @@ class InterestManager:
            except Exception as e:
                logger.error(f"计算工作线程异常: {e}")

-   def _get_from_cache(self, message_id: str) -> InterestCalculationResult | None:
-       """从缓存中获取结果(LRU策略)"""
-       if message_id not in self._result_cache:
-           return None
-
-       # 检查TTL
-       result = self._result_cache[message_id]
-       if time.time() - result.timestamp > self._cache_ttl:
-           # 过期,删除
-           del self._result_cache[message_id]
-           return None
-
-       # 更新访问顺序(LRU)
-       self._result_cache.move_to_end(message_id)
-       return result
-
-   def _put_to_cache(self, message_id: str, result: InterestCalculationResult):
-       """将结果放入缓存(LRU策略)"""
-       # 如果已存在,更新
-       if message_id in self._result_cache:
-           self._result_cache.move_to_end(message_id)
-
-       self._result_cache[message_id] = result
-
-       # 限制缓存大小
-       while len(self._result_cache) > self._cache_max_size:
-           # 删除最旧的项
-           self._result_cache.popitem(last=False)
-
-   async def calculate_interest_batch(self, messages: list["DatabaseMessages"], timeout: float | None = None) -> list[InterestCalculationResult]:
-       """批量计算消息兴趣值(并发优化)
-
-       Args:
-           messages: 消息列表
-           timeout: 单个计算的超时时间
-
-       Returns:
-           list[InterestCalculationResult]: 计算结果列表
-       """
-       if not messages:
-           return []
-
-       # 并发计算所有消息
-       tasks = [self.calculate_interest(msg, timeout=timeout) for msg in messages]
-       results = await asyncio.gather(*tasks, return_exceptions=True)
-
-       # 处理异常
-       final_results = []
-       for i, result in enumerate(results):
-           if isinstance(result, Exception):
-               logger.error(f"批量计算消息 {i} 失败: {result}")
-               final_results.append(InterestCalculationResult(
-                   success=False,
-                   message_id=getattr(messages[i], "message_id", ""),
-                   interest_value=0.3,
-                   error_message=f"批量计算异常: {result!s}",
-               ))
-           else:
-               final_results.append(result)
-
-       self._batch_calculations += 1
-       return final_results
-
-   async def _batch_processing_worker(self):
-       """批处理工作线程"""
-       while not self._shutdown_event.is_set():
-           batch = []
-           deadline = time.time() + self._batch_timeout
-
-           try:
-               # 收集批次
-               while len(batch) < self._batch_size and time.time() < deadline:
-                   remaining_time = deadline - time.time()
-                   if remaining_time <= 0:
-                       break
-
-                   try:
-                       item = await asyncio.wait_for(self._batch_queue.get(), timeout=remaining_time)
-                       batch.append(item)
-                   except asyncio.TimeoutError:
-                       break
-
-               # 处理批次
-               if batch:
-                   await self._process_batch(batch)
-
-           except asyncio.CancelledError:
-               break
-           except Exception as e:
-               logger.error(f"批处理工作线程异常: {e}")
-
-   async def _process_batch(self, batch: list):
-       """处理批次消息"""
-       # 这里可以实现具体的批处理逻辑
-       # 当前版本只是占位,实际的批处理逻辑可以根据具体需求实现
-       pass
-
-   async def warmup(self, sample_messages: list["DatabaseMessages"] | None = None):
-       """预热兴趣计算器
-
-       Args:
-           sample_messages: 样本消息列表,用于预热。如果为None,则只初始化计算器
-       """
-       if not self._current_calculator:
-           logger.warning("无法预热:没有可用的兴趣值计算组件")
-           return
-
-       logger.info("开始预热兴趣值计算器...")
-       start_time = time.time()
-
-       # 如果提供了样本消息,进行预热计算
-       if sample_messages:
-           try:
-               # 批量计算样本消息
-               await self.calculate_interest_batch(sample_messages, timeout=5.0)
-               logger.info(f"预热完成:处理了 {len(sample_messages)} 条样本消息,耗时 {time.time() - start_time:.2f}s")
-           except Exception as e:
-               logger.error(f"预热过程中出现异常: {e}")
-       else:
-           logger.info(f"预热完成:计算器已就绪,耗时 {time.time() - start_time:.2f}s")
-
-       self._is_warmed_up = True
-
-   def clear_cache(self):
-       """清空缓存"""
-       cleared_count = len(self._result_cache)
-       self._result_cache.clear()
-       logger.info(f"已清空 {cleared_count} 条缓存记录")
-
-   def set_cache_config(self, max_size: int | None = None, ttl: int | None = None):
-       """设置缓存配置
-
-       Args:
-           max_size: 最大缓存数量
-           ttl: 缓存生存时间(秒)
-       """
-       if max_size is not None:
-           self._cache_max_size = max_size
-           logger.info(f"缓存最大容量设置为: {max_size}")
-
-       if ttl is not None:
-           self._cache_ttl = ttl
-           logger.info(f"缓存TTL设置为: {ttl}秒")
-
-       # 如果当前缓存超过新的最大值,清理旧数据
-       if max_size is not None:
-           while len(self._result_cache) > self._cache_max_size:
-               self._result_cache.popitem(last=False)

    def get_current_calculator(self) -> BaseInterestCalculator | None:
        """获取当前活跃的兴趣值计算组件"""
        return self._current_calculator
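The `_get_from_cache` / `_put_to_cache` pair removed above is an `OrderedDict`-based LRU cache with a TTL check on read. A self-contained sketch of that structure (the class name `TTLLRUCache` is illustrative, not from the codebase):

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    def __init__(self, max_size: int = 1000, ttl: float = 300.0):
        self._data: OrderedDict = OrderedDict()
        self.max_size = max_size
        self.ttl = ttl

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, ts = item
        if time.time() - ts > self.ttl:
            del self._data[key]  # expired entry
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = (value, time.time())
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least-recently-used
```

`move_to_end` keeps hit entries at the MRU end, so `popitem(last=False)` always evicts the coldest key.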
@@ -406,8 +205,6 @@ class InterestManager:
    def get_statistics(self) -> dict:
        """获取管理器统计信息"""
        success_rate = 1.0 - (self._failed_calculations / max(1, self._total_calculations))
-       cache_hit_rate = self._cache_hits / max(1, self._cache_hits + self._cache_misses)
-       avg_calc_time = self._total_calculation_time / max(1, self._total_calculations)

        stats = {
            "manager_statistics": {
@@ -416,13 +213,6 @@ class InterestManager:
                "success_rate": success_rate,
                "last_calculation_time": self._last_calculation_time,
                "current_calculator": self._current_calculator.component_name if self._current_calculator else None,
-               "cache_hit_rate": cache_hit_rate,
-               "cache_hits": self._cache_hits,
-               "cache_misses": self._cache_misses,
-               "cache_size": len(self._result_cache),
-               "batch_calculations": self._batch_calculations,
-               "average_calculation_time": avg_calc_time,
-               "is_warmed_up": self._is_warmed_up,
            }
        }
@@ -447,82 +237,6 @@ class InterestManager:
        """检查是否有可用的计算组件"""
        return self._current_calculator is not None and self._current_calculator.is_enabled

-   async def adaptive_optimize(self):
-       """自适应优化:根据性能统计自动调整参数"""
-       if not self._current_calculator:
-           return
-
-       stats = self.get_statistics()["manager_statistics"]
-
-       # 根据缓存命中率调整缓存大小
-       cache_hit_rate = stats["cache_hit_rate"]
-       if cache_hit_rate < 0.5 and self._cache_max_size < 5000:
-           # 命中率低,增加缓存容量
-           new_size = min(self._cache_max_size * 2, 5000)
-           logger.info(f"自适应优化:缓存命中率较低 ({cache_hit_rate:.2%}),扩大缓存容量 {self._cache_max_size} -> {new_size}")
-           self._cache_max_size = new_size
-       elif cache_hit_rate > 0.9 and self._cache_max_size > 100:
-           # 命中率高,可以适当减小缓存
-           new_size = max(self._cache_max_size // 2, 100)
-           logger.info(f"自适应优化:缓存命中率很高 ({cache_hit_rate:.2%}),缩小缓存容量 {self._cache_max_size} -> {new_size}")
-           self._cache_max_size = new_size
-           # 清理多余缓存
-           while len(self._result_cache) > self._cache_max_size:
-               self._result_cache.popitem(last=False)
-
-       # 根据平均计算时间调整批处理参数
-       avg_calc_time = stats["average_calculation_time"]
-       if avg_calc_time > 0.5 and self._batch_size < 50:
-           # 计算较慢,增加批次大小以提高吞吐量
-           new_batch_size = min(self._batch_size * 2, 50)
-           logger.info(f"自适应优化:平均计算时间较长 ({avg_calc_time:.3f}s),增加批次大小 {self._batch_size} -> {new_batch_size}")
-           self._batch_size = new_batch_size
-       elif avg_calc_time < 0.1 and self._batch_size > 5:
-           # 计算较快,可以减小批次
-           new_batch_size = max(self._batch_size // 2, 5)
-           logger.info(f"自适应优化:平均计算时间较短 ({avg_calc_time:.3f}s),减小批次大小 {self._batch_size} -> {new_batch_size}")
-           self._batch_size = new_batch_size
-
-   def get_performance_report(self) -> str:
-       """生成性能报告"""
-       stats = self.get_statistics()["manager_statistics"]
-
-       report = [
-           "=" * 60,
-           "兴趣值管理器性能报告",
-           "=" * 60,
-           f"总计算次数: {stats['total_calculations']}",
-           f"失败次数: {stats['failed_calculations']}",
-           f"成功率: {stats['success_rate']:.2%}",
-           f"缓存命中率: {stats['cache_hit_rate']:.2%}",
-           f"缓存命中: {stats['cache_hits']}",
-           f"缓存未命中: {stats['cache_misses']}",
-           f"当前缓存大小: {stats['cache_size']} / {self._cache_max_size}",
-           f"批量计算次数: {stats['batch_calculations']}",
-           f"平均计算时间: {stats['average_calculation_time']:.4f}s",
-           f"是否已预热: {'是' if stats['is_warmed_up'] else '否'}",
-           f"当前计算器: {stats['current_calculator'] or '无'}",
-           "=" * 60,
-       ]
-
-       # 添加计算器统计
-       if self._current_calculator:
-           calc_stats = self.get_statistics()["calculator_statistics"]
-           report.extend([
-               "",
-               "计算器统计:",
-               f"  组件名称: {calc_stats['component_name']}",
-               f"  版本: {calc_stats['component_version']}",
-               f"  已启用: {calc_stats['enabled']}",
-               f"  总计算: {calc_stats['total_calculations']}",
-               f"  失败: {calc_stats['failed_calculations']}",
-               f"  成功率: {calc_stats['success_rate']:.2%}",
-               f"  平均耗时: {calc_stats['average_calculation_time']:.4f}s",
-               "=" * 60,
-           ])
-
-       return "\n".join(report)


# 全局实例
_interest_manager = None
@@ -121,7 +121,7 @@ async def conversation_loop(
         except asyncio.CancelledError:
             logger.info(f" [生成器] stream={stream_id[:8]}, 被取消")
             break
-        except Exception as e:
+        except Exception as e:  # noqa: BLE001
             logger.error(f" [生成器] stream={stream_id[:8]}, 出错: {e}")
             await asyncio.sleep(5.0)

@@ -151,10 +151,10 @@ async def run_chat_stream(
     # Create the generator
     tick_generator = conversation_loop(
         stream_id=stream_id,
-        get_context_func=manager._get_stream_context,
-        calculate_interval_func=manager._calculate_interval,
-        flush_cache_func=manager._flush_cached_messages_to_unread,
-        check_force_dispatch_func=manager._needs_force_dispatch_for_context,
+        get_context_func=manager._get_stream_context,  # noqa: SLF001
+        calculate_interval_func=manager._calculate_interval,  # noqa: SLF001
+        flush_cache_func=manager._flush_cached_messages_to_unread,  # noqa: SLF001
+        check_force_dispatch_func=manager._needs_force_dispatch_for_context,  # noqa: SLF001
         is_running_func=lambda: manager.is_running,
     )

@@ -162,13 +162,13 @@ async def run_chat_stream(
     async for tick in tick_generator:
         try:
             # Fetch the context
-            context = await manager._get_stream_context(stream_id)
+            context = await manager._get_stream_context(stream_id)  # noqa: SLF001
             if not context:
                 continue

             # Concurrency guard: check whether processing is already underway
             if context.is_chatter_processing:
-                if manager._recover_stale_chatter_state(stream_id, context):
+                if manager._recover_stale_chatter_state(stream_id, context):  # noqa: SLF001
                     logger.warning(f" [驱动器] stream={stream_id[:8]}, 处理标志残留已修复")
                 else:
                     logger.debug(f" [驱动器] stream={stream_id[:8]}, Chatter正在处理,跳过此Tick")

@@ -182,7 +182,7 @@ async def run_chat_stream(

             # Update the energy value
             try:
-                await manager._update_stream_energy(stream_id, context)
+                await manager._update_stream_energy(stream_id, context)  # noqa: SLF001
             except Exception as e:
                 logger.debug(f"更新能量失败: {e}")

@@ -191,7 +191,7 @@ async def run_chat_stream(
             try:
                 async with manager._processing_semaphore:
                     success = await asyncio.wait_for(
-                        manager._process_stream_messages(stream_id, context),
+                        manager._process_stream_messages(stream_id, context),  # noqa: SLF001
                         global_config.chat.thinking_timeout,
                     )
             except asyncio.TimeoutError:

@@ -209,7 +209,7 @@ async def run_chat_stream(

         except asyncio.CancelledError:
             raise
-        except Exception as e:
+        except Exception as e:  # noqa: BLE001
             logger.error(f" [驱动器] stream={stream_id[:8]}, 处理Tick时出错: {e}")
             manager.stats["total_failures"] += 1

@@ -222,7 +222,7 @@ async def run_chat_stream(
         if context and context.stream_loop_task:
             context.stream_loop_task = None
             logger.debug(f" [驱动器] stream={stream_id[:8]}, 清理任务记录")
-    except Exception as e:
+    except Exception as e:  # noqa: BLE001
         logger.debug(f"清理任务记录失败: {e}")


@@ -110,10 +110,10 @@ class MessageManager:
         if not (context.stream_loop_task and not context.stream_loop_task.done()):
             # Start the driver task asynchronously; avoid blocking message enqueue under high concurrency
             await stream_loop_manager.start_stream_loop(stream_id)

         # Check for and handle message interruption
         await self._check_and_handle_interruption(chat_stream, message)

         # Enqueue the message
         await chat_stream.context.add_message(message)


@@ -343,17 +343,8 @@ class StatisticOutputTask(AsyncTask):
             stats[period_key][REQ_CNT_BY_MODULE][module_name] += 1
             stats[period_key][REQ_CNT_BY_PROVIDER][provider_name] += 1

-            # Ensure tokens are ints
-            try:
-                prompt_tokens = int(record.get("prompt_tokens") or 0)
-            except (ValueError, TypeError):
-                prompt_tokens = 0
-
-            try:
-                completion_tokens = int(record.get("completion_tokens") or 0)
-            except (ValueError, TypeError):
-                completion_tokens = 0
-
+            prompt_tokens = record.get("prompt_tokens") or 0
+            completion_tokens = record.get("completion_tokens") or 0
             total_tokens = prompt_tokens + completion_tokens

             stats[period_key][IN_TOK_BY_TYPE][request_type] += prompt_tokens

@@ -372,13 +363,7 @@ class StatisticOutputTask(AsyncTask):
             stats[period_key][TOTAL_TOK_BY_MODULE][module_name] += total_tokens
             stats[period_key][TOTAL_TOK_BY_PROVIDER][provider_name] += total_tokens

-            # Ensure cost is a float
             cost = record.get("cost") or 0.0
-            try:
-                cost = float(cost) if cost else 0.0
-            except (ValueError, TypeError):
-                cost = 0.0
-
             stats[period_key][TOTAL_COST] += cost
             stats[period_key][COST_BY_TYPE][request_type] += cost
             stats[period_key][COST_BY_USER][user_id] += cost

@@ -386,12 +371,8 @@ class StatisticOutputTask(AsyncTask):
             stats[period_key][COST_BY_MODULE][module_name] += cost
             stats[period_key][COST_BY_PROVIDER][provider_name] += cost

-            # Collect time_cost data and ensure time_cost is a float
+            # Collect time_cost data
             time_cost = record.get("time_cost") or 0.0
-            try:
-                time_cost = float(time_cost) if time_cost else 0.0
-            except (ValueError, TypeError):
-                time_cost = 0.0
             if time_cost > 0:  # only record valid time_cost values
                 stats[period_key][TIME_COST_BY_TYPE][request_type].append(time_cost)
                 stats[period_key][TIME_COST_BY_USER][user_id].append(time_cost)

@@ -428,7 +428,7 @@ def process_llm_response(text: str, enable_splitter: bool = True, enable_chinese
     protected_text, special_blocks_mapping = protect_special_blocks(protected_text)

     # Extract content wrapped in (), [], or () that contains Chinese characters
-    pattern = re.compile(r"[(\[(](?=.*[一-鿿]).+?[)\])]")
+    pattern = re.compile(r"[(\[(](?=.*[一-鿿]).*?[)\])]")
     _extracted_contents = pattern.findall(protected_text)
     cleaned_text = pattern.sub("", protected_text)


@@ -111,7 +111,7 @@ async def find_messages(
     # Apply a uniform upper bound so unbounded queries cannot blow up memory
     if limit <= 0:
         capped_limit = SAFE_FETCH_LIMIT
-        logger.debug(
+        logger.warning(
             f"find_messages 未指定 limit,自动限制为 {capped_limit} 行以避免内存占用过高",
         )
     else:

@@ -1,8 +1,8 @@
 import os
 import shutil
 import sys
-import types
 import typing
+import types
 from datetime import datetime
 from pathlib import Path
 from typing import Any, get_args, get_origin

@@ -30,8 +30,8 @@ from src.config.official_configs import (
     ExperimentalConfig,
     ExpressionConfig,
     InnerConfig,
-    KokoroFlowChatterConfig,
     LogConfig,
+    KokoroFlowChatterConfig,
     LPMMKnowledgeConfig,
     MemoryConfig,
     MessageBusConfig,

@@ -515,7 +515,7 @@ class Config(ValidatedConfigBase):
     )

     @property
-    def MMC_VERSION(self) -> str:
+    def MMC_VERSION(self) -> str:  # noqa: N802
         return MMC_VERSION


@@ -995,27 +995,6 @@ class KokoroFlowChatterWaitingConfig(ValidatedConfigBase):
     )


-class KokoroFlowChatterPromptConfig(ValidatedConfigBase):
-    """Kokoro Flow Chatter prompt/context construction config"""
-
-    activity_stream_format: Literal["narrative", "table", "both"] = Field(
-        default="narrative",
-        description='活动流格式: "narrative"(线性叙事) / "table"(结构化表格) / "both"(两者都输出)',
-    )
-    max_activity_entries: int = Field(
-        default=30,
-        ge=0,
-        le=200,
-        description="活动流最多保留条数(越大越完整,但token越高)",
-    )
-    max_entry_length: int = Field(
-        default=500,
-        ge=0,
-        le=5000,
-        description="活动流单条最大字符数(用于裁剪,避免单条过长拖垮上下文)",
-    )
-
-
 class KokoroFlowChatterConfig(ValidatedConfigBase):
     """
     Kokoro Flow Chatter config class - heart-flow conversation system for private chats

@@ -1052,11 +1031,6 @@ class KokoroFlowChatterConfig(ValidatedConfigBase):
         description="自定义KFC决策行为指导提示词(unified影响整体,split仅影响planner)",
     )

-    prompt: KokoroFlowChatterPromptConfig = Field(
-        default_factory=KokoroFlowChatterPromptConfig,
-        description="提示词/上下文构建配置(活动流格式、裁剪等)",
-    )
-
     waiting: KokoroFlowChatterWaitingConfig = Field(
         default_factory=KokoroFlowChatterWaitingConfig,
         description="等待策略配置(默认等待时间、倍率等)",

@@ -1,614 +0,0 @@
|
||||
# 🧠 MoFox 记忆系统
|
||||
|
||||
MoFox-Core 采用**三层分级记忆架构**,模拟人类记忆的生物特性,实现了高效、可扩展的记忆管理系统。本文档介绍系统架构、使用方法和最佳实践。
|
||||
|
||||
---
|
||||
|
||||
## 📐 系统架构
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ 用户交互 (Chat Input) │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ 第1层:感知记忆 (Perceptual Memory) - 即时对话流 (50块) │
|
||||
│ ├─ 消息分块存储(每块5条消息) │
|
||||
│ ├─ 实时激活与召回 │
|
||||
│ ├─ 相似度阈值触发转移 │
|
||||
│ └─ 低开销,高频率访问 │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
↓ 激活转移
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ 第2层:短期记忆 (Short-term Memory) - 结构化信息 (30条) │
|
||||
│ ├─ LLM 驱动的决策(创建/合并/更新/丢弃) │
|
||||
│ ├─ 重要性评分(0.0-1.0) │
|
||||
│ ├─ 自动转移与泄压机制 │
|
||||
│ └─ 平衡灵活性与容量 │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
↓ 批量转移
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ 第3层:长期记忆 (Long-term Memory) - 知识图谱 │
|
||||
│ ├─ 图数据库存储(人物、事件、关系) │
|
||||
│ ├─ 向量检索与相似度匹配 │
|
||||
│ ├─ 动态节点合并与边生成 │
|
||||
│ └─ 无容量限制,检索精确 │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ LLM 回复生成(带完整上下文) │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 三层记忆详解
|
||||
|
||||
### 第1层:感知记忆 (Perceptual Memory)
|
||||
|
||||
**特点**:
|
||||
- 📍 **位置**:即时对话窗口
|
||||
- 💾 **容量**:50 块(250 条消息)
|
||||
- ⏱️ **生命周期**:短暂,激活后可转移
|
||||
- 🔍 **检索**:相似度匹配
|
||||
|
||||
**功能**:
|
||||
```python
|
||||
# 添加消息到感知记忆
|
||||
await perceptual_manager.add_message(
|
||||
user_id="user123",
|
||||
message="最近在学习Python",
|
||||
timestamp=datetime.now()
|
||||
)
|
||||
|
||||
# 召回相关块
|
||||
blocks = await perceptual_manager.recall_blocks(
|
||||
query="你在学什么编程语言",
|
||||
top_k=3
|
||||
)
|
||||
```
|
||||
|
||||
**转移触发条件**:
|
||||
- 块被多次激活(激活次数 ≥ 3)
|
||||
- 块满足转移条件后提交到短期层
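
The activation-count trigger above can be sketched as follows. This is a minimal illustration, not the actual `perceptual_manager` implementation; the `Block` class and function name here are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A perceptual-memory block holding a chunk of messages."""
    content: str
    activation_count: int = 0


def recall_and_collect(blocks, activated_indices, activation_threshold=3):
    """Bump activation counters for recalled blocks and return those ready to transfer."""
    ready = []
    for i in activated_indices:
        blocks[i].activation_count += 1
        if blocks[i].activation_count >= activation_threshold:
            # Activated often enough: submit to the short-term tier
            ready.append(blocks[i])
    return ready
```

A block recalled three times (with the default threshold of 3) is handed over; blocks never recalled stay in the perceptual tier until evicted.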

### Tier 2: Short-term Memory

**Characteristics**:
- 📍 **Location**: structured data store
- 💾 **Capacity**: 30 memories
- ⏱️ **Lifetime**: medium; transferred dynamically by importance
- 🧠 **Processing**: LLM-driven decisions

**Usage**:
```python
# The LLM extracts structured memories
extracted = await short_term_manager.add_from_block(block)

# Search for similar memories
similar = await short_term_manager.search_memories(
    query="Python 学习进度",
    top_k=5
)

# Get memories pending transfer
to_transfer = short_term_manager.get_memories_for_transfer()
```

**Decision types**:
| Decision | Meaning | Scenario |
|------|------|------|
| `CREATE_NEW` | Create a new memory | Entirely new information |
| `MERGE` | Merge into an existing memory | Supplementary details |
| `UPDATE` | Update an existing memory | Information has evolved |
| `DISCARD` | Discard | Redundant / outdated |
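
The four decisions boil down to a simple dispatch once the LLM has picked one. A sketch under the assumption that memories are plain dicts with a `content` field (the real manager uses its own memory objects):

```python
def apply_decision(decision, memories, new_memory, target_index=None):
    """Apply one of the four decision types to the short-term memory list."""
    if decision == "CREATE_NEW":
        memories.append(new_memory)
    elif decision == "MERGE" and target_index is not None:
        # Fold the new details into the existing memory
        memories[target_index]["content"] += "; " + new_memory["content"]
    elif decision == "UPDATE" and target_index is not None:
        # The information has evolved: replace the old content
        memories[target_index]["content"] = new_memory["content"]
    # "DISCARD": redundant or outdated, nothing to store
```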

**Importance scoring**:
```
High importance (≥ 0.6) → transferred to the long-term tier first
Low importance (< 0.6)  → retained, or deleted on capacity overflow
```

**Capacity management**:
- ✅ **Automatic transfer**: batch transfer starts at ≥ 50% occupancy
- 🛡️ **Pressure relief**: low-priority memories are deleted at 100% capacity
- ⚙️ **Configuration**: `short_term_max_memories = 30`

**Overflow strategies (new)**:

When short-term memory reaches its capacity limit, two strategies are supported, selectable via configuration:

| Strategy | Meaning | Suited for | Config value |
|------|------|----------|--------|
| **Transfer all** | When full, transfer **all memories** to long-term storage, then delete low-importance memories (importance < 0.6) | Keeping more history; memory-intensive applications | `transfer_all` (default) |
| **Selective cleanup** | Transfer only high-importance memories; delete low-importance ones directly | Freeing space quickly; performance-first scenarios | `selective_cleanup` |

Configuration:
```toml
[memory]
# Short-term memory overflow strategy
short_term_overflow_strategy = "transfer_all"  # or "selective_cleanup"
```

**Behavioral difference, by example**:
```python
# Suppose short-term memory is full (30 entries), of which:
# - 20 are high importance (≥ 0.6)
# - 10 are low importance (< 0.6)

# Strategy 1: transfer_all (default)
# 1. Transfer all 30 entries to long-term memory
# 2. Delete the 10 low-importance entries
# Result: 20 remain short-term, 30 added to long-term

# Strategy 2: selective_cleanup
# 1. Transfer only the 20 high-importance entries to long-term memory
# 2. Delete the 10 low-importance entries directly
# Result: 20 remain short-term, 20 added to long-term
```

### Tier 3: Long-term Memory

**Characteristics**:
- 📍 **Location**: graph database (NetworkX + Chroma)
- 💾 **Capacity**: unlimited
- ⏱️ **Lifetime**: persistent, retrievable
- 📊 **Structure**: knowledge graph

**Usage**:
```python
# Transfer short-term memories into the long-term graph
result = await long_term_manager.transfer_from_short_term(
    short_term_memories
)

# Graph retrieval
results = await memory_manager.search_memories(
    query="用户的编程经验",
    top_k=5
)
```

**Knowledge graph node types**:
- 👤 **PERSON**: people, roles
- 📅 **EVENT**: events that happened
- 💡 **CONCEPT**: concepts, ideas
- 🎯 **GOAL**: goals, plans

**Node relations**:
- `participated_in`: took part in an event
- `mentioned`: mentioned a person/thing
- `similar_to`: similar
- `related_to`: related
- `caused_by`: caused by ...

---

## 🔧 Configuration

### Basic configuration

**File**: `config/bot_config.toml`

```toml
[memory]
# Enable/disable the memory system
enable = true

# Data storage
data_dir = "data/memory_graph"
vector_collection_name = "memory_nodes"
vector_db_path = "data/memory_graph/chroma_db"

# Perceptual memory
perceptual_max_blocks = 50               # maximum number of blocks
perceptual_block_size = 5                # messages per block
perceptual_similarity_threshold = 0.55   # recall threshold
perceptual_activation_threshold = 3      # activation count before transfer

# Short-term memory
short_term_max_memories = 30             # capacity limit
short_term_transfer_threshold = 0.6      # importance threshold for transfer
short_term_overflow_strategy = "transfer_all"  # overflow strategy (transfer_all/selective_cleanup)
short_term_enable_force_cleanup = true   # enable pressure relief
short_term_cleanup_keep_ratio = 0.9      # capacity ratio kept after relief

# Long-term memory
long_term_batch_size = 10                # batch transfer size
long_term_decay_factor = 0.95            # activation decay factor
long_term_auto_transfer_interval = 180   # transfer check interval (seconds)

# Retrieval
search_top_k = 10                        # default number of results
search_min_importance = 0.3              # minimum importance filter
search_similarity_threshold = 0.6        # similarity threshold
```

### Advanced configuration

```toml
[memory]
# Path-score expansion (more precise graph retrieval)
enable_path_expansion = false            # enable the algorithm
path_expansion_max_hops = 2              # maximum hops
path_expansion_damping_factor = 0.85     # damping factor
path_expansion_max_branches = 10         # branch limit

# Memory activation
activation_decay_rate = 0.9              # 10% decay per day
activation_propagation_strength = 0.5    # propagation strength
activation_propagation_depth = 1         # propagation depth

# Forgetting mechanism
forgetting_enabled = true                # enable forgetting
forgetting_activation_threshold = 0.1    # activation threshold below which memories are forgotten
forgetting_min_importance = 0.8          # importance threshold that protects memories
```

---

## 📚 Usage Examples

### 1. Initialize the memory system

```python
from src.memory_graph.manager_singleton import (
    initialize_unified_memory_manager,
    get_unified_memory_manager
)

# Initialize the system
await initialize_unified_memory_manager()

# Get the manager
manager = get_unified_memory_manager()
```

### 2. Add a perceptual memory

```python
from src.memory_graph.models import MemoryBlock

# Simulate a message block
block = MemoryBlock(
    id="msg_001",
    content="用户提到在做一个Python爬虫项目",
    timestamp=datetime.now(),
    source="chat"
)

# Add it to the perceptual tier
await manager.add_memory(block, source="perceptual")
```

### 3. Smart memory retrieval

```python
# Unified retrieval (perceptual → short-term → long-term)
result = await manager.retrieve_memories(
    query="最近在做什么项目",
    use_judge=True  # use the judge model to decide whether long-term retrieval is needed
)

# Access per-tier results
perceptual = result["perceptual_blocks"]
short_term = result["short_term_memories"]
long_term = result["long_term_memories"]
```

### 4. Trigger a transfer manually

```python
# Transfer short-term → long-term immediately
result = await manager.manual_transfer()

print(f"转移了 {result['transferred_memory_ids']} 条记忆到长期层")
```

### 5. Get statistics

```python
stats = manager.get_statistics()

print(f"感知记忆块数:{stats['perceptual_blocks']}")
print(f"短期记忆数:{stats['short_term_memories']}")
print(f"长期记忆节点数:{stats['long_term_nodes']}")
print(f"图边数:{stats['long_term_edges']}")
```

---

## 🔄 Transfer Flow

### Automatic transfer loop

The system runs an automatic transfer loop in the background so that memories keep flowing between tiers:

```
Every N seconds (configurable):
  1. Check short-term memory capacity
  2. Collect high-importance memories pending transfer
  3. If the cache is full or occupancy is high, trigger a transfer
  4. Send the batch to the long-term manager
  5. Remove transferred memories from the short-term tier
```

**Trigger conditions** (any one suffices):
- Short-term occupancy ≥ 50%
- Cached memory count ≥ batch size
- More than the maximum delay since the last transfer
- Short-term memory is at its capacity limit

**Code location**: `src/memory_graph/unified_manager.py`, lines 576-650
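
The "any one suffices" trigger above can be expressed as a single predicate. A sketch, not the code in `unified_manager.py`; the parameter names mirror the configuration keys, the function itself is illustrative:

```python
import time


def should_transfer(memory_count, max_memories, cached_count, batch_size,
                    last_transfer_ts, max_delay_seconds, now=None):
    """Return True if any of the automatic-transfer trigger conditions holds."""
    now = time.time() if now is None else now
    return (
        memory_count / max_memories >= 0.5             # occupancy >= 50%
        or cached_count >= batch_size                  # cache reached batch size
        or now - last_transfer_ts > max_delay_seconds  # overdue since last transfer
        or memory_count >= max_memories                # at capacity limit
    )
```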

### Transfer decisions

The long-term memory manager makes a decision for each short-term memory:

```python
# LLM decision process
for short_term_memory in batch:
    # 1. Retrieve similar long-term memories
    similar = await search_long_term(short_term_memory)

    # 2. The LLM makes a decision
    decision = await llm_decide({
        'short_term': short_term_memory,
        'similar_long_term': similar
    })

    # 3. Execute the decision
    if decision == 'CREATE_NEW':
        create_new_node()
    elif decision == 'MERGE':
        merge_into_existing()
    elif decision == 'UPDATE':
        update_existing()
```

---

## 🛡️ Capacity Management Strategy

### Normal flow

```
Short-term memories accumulate → 50% reached → automatic transfer → saved to long-term memory
```

### Under pressure

```
High-frequency message stream → short-term tier fills up quickly
        ↓
100% reached → transfer cannot keep up
        ↓
Pressure relief kicks in → low-priority memories are deleted
        ↓
Core data is protected; blocking is avoided
```

**Pressure-relief parameters**:
```toml
short_term_enable_force_cleanup = true   # enable pressure relief
short_term_cleanup_keep_ratio = 0.9      # keep 90% of capacity
```

**Deletion policy**:
- Deleted first: **low importance AND created earliest**
- Kept: high-importance memories are never deleted
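
The eviction order described above (low importance first, oldest first, high-importance memories never selected) can be sketched like this. Field names and the 0.6 protection threshold are assumptions for illustration:

```python
def pick_evictions(memories, max_memories, keep_ratio=0.9, protect_threshold=0.6):
    """Select memories to delete until occupancy drops to max_memories * keep_ratio."""
    target = int(max_memories * keep_ratio)
    if len(memories) <= target:
        return []
    # Only low-importance memories are eligible; high-importance ones are protected
    candidates = sorted(
        (m for m in memories if m["importance"] < protect_threshold),
        key=lambda m: (m["importance"], m["created_at"]),  # least important, then oldest
    )
    return candidates[: len(memories) - target]
```

Note that if protected memories alone exceed the target, fewer evictions happen than requested, which matches the "never deleted" guarantee.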

---

## 📊 Performance Characteristics

### Time complexity

| Operation | Complexity | Notes |
|------|--------|------|
| Perceptual add | O(1) | direct append |
| Perceptual recall | O(n) | similarity matching |
| Short-term add | O(1) | direct append |
| Short-term search | O(n) | vector similarity |
| Long-term retrieval | O(log n) | vector database + graph traversal |
| Transfer | O(n) | batch processing |

### Space complexity

| Tier | Estimated space | Configuration |
|------|---------|------|
| Perceptual | ~5-10 MB | 50 blocks × 5 messages |
| Short-term | ~1-2 MB | 30 memories |
| Long-term | ~50-200 MB | depends on conversation history |

### Optimization techniques

1. **Cache deduplication**: avoid transferring the same memory more than once
2. **Batch transfer**: fewer LLM calls
3. **Async operations**: background transfers never block the main flow
4. **Adaptive polling**: check interval adapts to capacity pressure

---

## 🔍 Retrieval Strategy

### Joint three-tier retrieval

```python
result = await manager.retrieve_memories(query, use_judge=True)
```

**Flow**:
1. Search the perceptual tier (live dialogue)
2. Search the short-term tier (structured information)
3. Use the judge model to decide whether the results suffice
4. If not, search the long-term tier (knowledge graph)

**Judge model**:
- Evaluates whether the existing memories answer the query
- Generates supplementary query terms
- Decides whether long-term retrieval is needed

### Path-score expansion (optional)

When enabled, a PageRank-style algorithm propagates scores through the graph:

```toml
enable_path_expansion = true
path_expansion_max_hops = 2
path_expansion_damping_factor = 0.85
```

**Benefits**:
- Discovers indirectly related information
- Richer context
- 15-25% higher precision
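
A minimal sketch of what damped score propagation over graph neighbors looks like, in the spirit of the PageRank-style expansion above. The real implementation lives in the memory graph module; this only illustrates how `path_expansion_damping_factor` and `path_expansion_max_hops` interact:

```python
def expand_scores(seed_scores, neighbors, damping=0.85, max_hops=2):
    """Propagate each seed's score outward, multiplying by `damping` per hop.

    seed_scores: {node_id: initial similarity score}
    neighbors:   {node_id: [adjacent node_ids]}
    """
    scores = dict(seed_scores)
    frontier = dict(seed_scores)
    for _ in range(max_hops):
        nxt = {}
        for node, score in frontier.items():
            for nb in neighbors.get(node, []):
                contrib = score * damping
                # Keep only the strongest path to each node
                if contrib > scores.get(nb, 0.0):
                    scores[nb] = contrib
                    nxt[nb] = contrib
        frontier = nxt
    return scores
```

With damping 0.85 and two hops, a node two edges away from a seed scored 1.0 receives 0.85², so indirectly related nodes surface with a discounted but non-zero score.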

---

## 🐛 Troubleshooting

### Issue 1: short-term memories pile up quickly

**Symptom**: the short-term memory count grows fast while transfers lag behind.

**Diagnosis**:
```python
# Inspect the statistics
stats = manager.get_statistics()
print(f"短期记忆占用率: {stats['short_term_occupancy']:.0%}")
print(f"待转移记忆: {len(manager.short_term_manager.get_memories_for_transfer())}")
```

**Fixes**:
- Decrease `long_term_auto_transfer_interval` (transfer more often)
- Increase `long_term_batch_size` (transfer more per batch)
- Lower `short_term_transfer_threshold` (so more memories qualify for transfer)

### Issue 2: long-term retrieval returns irrelevant results

**Symptom**: retrieved memories do not match the query.

**Diagnosis**:
```python
# Enable debug logging
import logging
logging.getLogger("src.memory_graph").setLevel(logging.DEBUG)

# Retry the retrieval
result = await manager.retrieve_memories(query, use_judge=True)
# Check the similarity scores in the logs
```

**Fixes**:
- Increase `search_top_k` (more candidates returned)
- Lower `search_similarity_threshold` (relax the similarity requirement)
- Check that the embedding model loaded correctly

### Issue 3: failed transfers lose memories

**Symptom**: short-term memories disappear without appearing in the long-term tier.

**Diagnosis**:
```python
# Check the logs for transfer errors
# Review the long-term manager's error log
```

**Fixes**:
- Check the LLM model configuration
- Make sure the long-term graph store is running properly
- Increase the transfer timeout

---

## 🎓 Best Practices

### 1. Size capacities sensibly

```toml
# Low frequency (private chat)
perceptual_max_blocks = 20
short_term_max_memories = 15

# Medium frequency (small groups)
perceptual_max_blocks = 50
short_term_max_memories = 30

# High frequency (large groups / customer service)
perceptual_max_blocks = 100
short_term_max_memories = 50
short_term_enable_force_cleanup = true
```

### 2. Enable pressure-relief protection

```toml
# For bots running 24/7
short_term_enable_force_cleanup = true
short_term_cleanup_keep_ratio = 0.85  # more aggressive cleanup
```

### 3. Monitor regularly

```python
# Check inside a scheduled task
async def monitor_memory():
    stats = manager.get_statistics()
    if stats['short_term_occupancy'] > 0.8:
        logger.warning("短期记忆压力高,考虑扩容")
    if stats['long_term_nodes'] > 10000:
        logger.warning("长期图规模大,检索可能变慢")
```

### 4. Use the judge model

```python
# Enable it to improve retrieval quality
result = await manager.retrieve_memories(
    query=user_query,
    use_judge=True  # automatically decide whether long-term retrieval is needed
)
```

---

## 📖 Related Documentation

- [Three-tier memory system user guide](../../docs/three_tier_memory_user_guide.md)
- [Memory graph architecture](../../docs/memory_graph_guide.md)
- [Short-term memory pressure-relief patch](./short_term_pressure_patch.md)
- [Transfer algorithm analysis](../../docs/memory_transfer_algorithm_analysis.md)
- [Unified scheduler guide](../../docs/unified_scheduler_guide.md)

---

## 🎯 Quick Navigation

### Core modules

| Module | Function | File |
|------|------|------|
| Perceptual manager | message chunking, activation, transfer | `perceptual_manager.py` |
| Short-term manager | LLM decisions, merging, transfer | `short_term_manager.py` |
| Long-term manager | graph operations, node merging | `long_term_manager.py` |
| Unified interface | automatic transfer loop, retrieval | `unified_manager.py` |
| Singleton access | global manager accessor | `manager_singleton.py` |

### Utilities

| Tool | Function | File |
|------|------|------|
| Embedding generation | text embeddings | `utils/embeddings.py` |
| Similarity | cosine similarity | `utils/similarity.py` |
| Formatter | three-tier data formatting | `utils/three_tier_formatter.py` |
| Storage | disk persistence | `storage/` |

---

## 📝 Version Information

- **Architecture**: three-tier memory system
- **Storage**: SQLAlchemy 2.0 + Chroma vector store
- **Graph database**: NetworkX
- **Last updated**: December 16, 2025
@@ -9,7 +9,6 @@
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import json_repair
|
||||
import re
|
||||
from datetime import datetime
|
||||
from typing import Any
|
||||
@@ -67,13 +66,6 @@ class LongTermMemoryManager:
|
||||
self._similar_memory_cache: dict[str, list[Memory]] = {}
|
||||
self._cache_max_size = 100
|
||||
|
||||
# 错误/重试统计与配置
|
||||
self._max_process_retries = 2
|
||||
self._retry_backoff = 0.5
|
||||
self._total_processed = 0
|
||||
self._failed_single_memory_count = 0
|
||||
self._retry_attempts = 0
|
||||
|
||||
logger.info(
|
||||
f"长期记忆管理器已创建 (batch_size={batch_size}, "
|
||||
f"search_top_k={search_top_k}, decay_factor={long_term_decay_factor:.2f})"
|
||||
@@ -210,10 +202,6 @@ class LongTermMemoryManager:
|
||||
else:
|
||||
result["failed_count"] += 1
|
||||
|
||||
# 更新全局计数
|
||||
self._total_processed += result["processed_count"]
|
||||
self._failed_single_memory_count += result["failed_count"]
|
||||
|
||||
# 处理完批次后,批量生成embeddings
|
||||
await self._flush_pending_embeddings()
|
||||
|
||||
@@ -229,45 +217,26 @@ class LongTermMemoryManager:
|
||||
Returns:
|
||||
处理结果或None(如果失败)
|
||||
"""
|
||||
# 增加重试机制以应对 LLM/执行的临时失败
|
||||
attempt = 0
|
||||
last_exc: Exception | None = None
|
||||
while attempt <= self._max_process_retries:
|
||||
try:
|
||||
# 步骤1: 在长期记忆中检索相似记忆
|
||||
similar_memories = await self._search_similar_long_term_memories(stm)
|
||||
try:
|
||||
# 步骤1: 在长期记忆中检索相似记忆
|
||||
similar_memories = await self._search_similar_long_term_memories(stm)
|
||||
|
||||
# 步骤2: LLM 决策如何更新图结构
|
||||
operations = await self._decide_graph_operations(stm, similar_memories)
|
||||
# 步骤2: LLM 决策如何更新图结构
|
||||
operations = await self._decide_graph_operations(stm, similar_memories)
|
||||
|
||||
# 步骤3: 执行图操作
|
||||
success = await self._execute_graph_operations(operations, stm)
|
||||
# 步骤3: 执行图操作
|
||||
success = await self._execute_graph_operations(operations, stm)
|
||||
|
||||
if success:
|
||||
return {
|
||||
"success": True,
|
||||
"operations": [op.operation_type for op in operations]
|
||||
}
|
||||
if success:
|
||||
return {
|
||||
"success": True,
|
||||
"operations": [op.operation_type for op in operations]
|
||||
}
|
||||
return None
|
||||
|
||||
# 如果执行返回 False,视为一次失败,准备重试
|
||||
last_exc = RuntimeError("_execute_graph_operations 返回 False")
|
||||
raise last_exc
|
||||
|
||||
except Exception as e:
|
||||
last_exc = e
|
||||
attempt += 1
|
||||
if attempt <= self._max_process_retries:
|
||||
self._retry_attempts += 1
|
||||
backoff = self._retry_backoff * attempt
|
||||
logger.warning(
|
||||
f"处理短期记忆 {stm.id} 时发生可恢复错误,重试 {attempt}/{self._max_process_retries},等待 {backoff}s: {e}"
|
||||
)
|
||||
await asyncio.sleep(backoff)
|
||||
continue
|
||||
# 超过重试次数,记录失败并返回 None
|
||||
logger.error(f"处理短期记忆 {stm.id} 最终失败: {last_exc}")
|
||||
self._failed_single_memory_count += 1
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.error(f"处理短期记忆 {stm.id} 失败: {e}")
|
||||
return None
|
||||
|
||||
async def _search_similar_long_term_memories(
|
||||
self, stm: ShortTermMemory
|
||||
@@ -574,7 +543,7 @@ class LongTermMemoryManager:
|
||||
json_str = re.sub(r"/\*.*?\*/", "", json_str, flags=re.DOTALL)
|
||||
|
||||
# 解析
|
||||
data = json_repair.loads(json_str)
|
||||
data = json.loads(json_str)
|
||||
|
||||
# 转换为 GraphOperation 对象
|
||||
operations = []
|
||||
@@ -957,30 +926,14 @@ class LongTermMemoryManager:
|
||||
logger.warning(f"创建边失败: 缺少节点ID ({source_id} -> {target_id})")
|
||||
return
|
||||
|
||||
if not self.memory_manager.graph_store:
|
||||
logger.warning("创建边失败: 图存储未初始化")
|
||||
# 检查节点是否存在
|
||||
if not self.memory_manager.graph_store or not self.memory_manager.graph_store.graph.has_node(source_id):
|
||||
logger.warning(f"创建边失败: 源节点不存在 ({source_id})")
|
||||
return
|
||||
if not self.memory_manager.graph_store or not self.memory_manager.graph_store.graph.has_node(target_id):
|
||||
logger.warning(f"创建边失败: 目标节点不存在 ({target_id})")
|
||||
return
|
||||
|
||||
# 检查和创建节点(如果不存在则创建占位符)
|
||||
if not self.memory_manager.graph_store.graph.has_node(source_id):
|
||||
logger.debug(f"源节点不存在,创建占位符节点: {source_id}")
|
||||
self.memory_manager.graph_store.add_node(
|
||||
node_id=source_id,
|
||||
node_type="event",
|
||||
content=f"临时节点 - {source_id}",
|
||||
metadata={"placeholder": True, "created_by": "long_term_manager_edge_creation"}
|
||||
)
|
||||
|
||||
if not self.memory_manager.graph_store.graph.has_node(target_id):
|
||||
logger.debug(f"目标节点不存在,创建占位符节点: {target_id}")
|
||||
self.memory_manager.graph_store.add_node(
|
||||
node_id=target_id,
|
||||
node_type="event",
|
||||
content=f"临时节点 - {target_id}",
|
||||
metadata={"placeholder": True, "created_by": "long_term_manager_edge_creation"}
|
||||
)
|
||||
|
||||
# 现在两个节点都存在,可以创建边
|
||||
edge_id = self.memory_manager.graph_store.add_edge(
|
||||
source_id=source_id,
|
||||
target_id=target_id,
|
||||
@@ -1038,15 +991,12 @@ class LongTermMemoryManager:
|
||||
|
||||
async def _queue_embedding_generation(self, node_id: str, content: str) -> None:
|
||||
"""将节点加入embedding生成队列"""
|
||||
# 先在锁内写入,再在锁外触发批量处理,避免自锁
|
||||
should_flush = False
|
||||
async with self._embedding_lock:
|
||||
self._pending_embeddings.append((node_id, content))
|
||||
if len(self._pending_embeddings) >= self._embedding_batch_size:
|
||||
should_flush = True
|
||||
|
||||
if should_flush:
|
||||
await self._flush_pending_embeddings()
|
||||
# 如果队列达到批次大小,立即处理
|
||||
if len(self._pending_embeddings) >= self._embedding_batch_size:
|
||||
await self._flush_pending_embeddings()
|
||||
|
||||
async def _flush_pending_embeddings(self) -> None:
|
||||
"""批量处理待生成的embeddings"""
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
# ruff: noqa: G004, BLE001
|
||||
# pylint: disable=logging-fstring-interpolation,broad-except,unused-argument
|
||||
# pyright: reportOptionalMemberAccess=false
|
||||
"""
|
||||
|
||||
@@ -166,9 +166,6 @@ async def initialize_unified_memory_manager():
|
||||
# 短期记忆配置
|
||||
short_term_max_memories=getattr(config, "short_term_max_memories", 30),
|
||||
short_term_transfer_threshold=getattr(config, "short_term_transfer_threshold", 0.6),
|
||||
short_term_overflow_strategy=getattr(config, "short_term_overflow_strategy", "transfer_all"),
|
    short_term_enable_force_cleanup=getattr(config, "short_term_enable_force_cleanup", True),
    short_term_cleanup_keep_ratio=getattr(config, "short_term_cleanup_keep_ratio", 0.9),
    # 长期记忆配置
    long_term_batch_size=getattr(config, "long_term_batch_size", 10),
    long_term_search_top_k=getattr(config, "search_top_k", 5),

@@ -43,9 +43,6 @@ class ShortTermMemoryManager:
        max_memories: int = 30,
        transfer_importance_threshold: float = 0.6,
        llm_temperature: float = 0.2,
        enable_force_cleanup: bool = False,
        cleanup_keep_ratio: float = 0.9,
        overflow_strategy: str = "transfer_all",
    ):
        """
        初始化短期记忆层管理器
@@ -55,11 +52,6 @@ class ShortTermMemoryManager:
            max_memories: 最大短期记忆数量
            transfer_importance_threshold: 转移到长期记忆的重要性阈值
            llm_temperature: LLM 决策的温度参数
            enable_force_cleanup: 是否启用泄压功能
            cleanup_keep_ratio: 泄压时保留容量的比例(默认0.9表示保留90%)
            overflow_strategy: 短期记忆溢出策略
                - "transfer_all": 一次性转移所有记忆到长期记忆,并删除不重要的短期记忆(默认)
                - "selective_cleanup": 选择性清理,仅转移重要记忆,直接删除低重要性记忆
        """
        self.data_dir = data_dir or Path("data/memory_graph")
        self.data_dir.mkdir(parents=True, exist_ok=True)
@@ -68,9 +60,6 @@ class ShortTermMemoryManager:
        self.max_memories = max_memories
        self.transfer_importance_threshold = transfer_importance_threshold
        self.llm_temperature = llm_temperature
        self.enable_force_cleanup = enable_force_cleanup
        self.cleanup_keep_ratio = cleanup_keep_ratio
        self.overflow_strategy = overflow_strategy  # 新增:溢出策略

        # 核心数据
        self.memories: list[ShortTermMemory] = []
@@ -86,9 +75,7 @@ class ShortTermMemoryManager:

        logger.info(
            f"短期记忆管理器已创建 (max_memories={max_memories}, "
            f"transfer_threshold={transfer_importance_threshold:.2f}, "
            f"overflow_strategy={overflow_strategy}, "
            f"force_cleanup={'on' if enable_force_cleanup else 'off'})"
            f"transfer_threshold={transfer_importance_threshold:.2f})"
        )

    async def initialize(self) -> None:
@@ -645,151 +632,54 @@ class ShortTermMemoryManager:

    def get_memories_for_transfer(self) -> list[ShortTermMemory]:
        """
        获取需要转移到长期记忆的记忆
        获取需要转移到长期记忆的记忆(优化版:单次遍历)

        根据 overflow_strategy 选择不同的转移策略:
        - "transfer_all": 一次性转移所有记忆(满容量时),然后删除低重要性记忆
        - "selective_cleanup": 仅转移高重要性记忆,低重要性记忆直接删除

        返回:
            需要转移的记忆列表
        """
        if self.overflow_strategy == "transfer_all":
            return self._get_transfer_all_strategy()
        else:  # "selective_cleanup" 或其他值默认使用选择性清理
            return self._get_selective_cleanup_strategy()

    def _get_transfer_all_strategy(self) -> list[ShortTermMemory]:
        """
        "一次性转移所有"策略:当短期记忆满了以后,将所有记忆转移到长期记忆

        返回:
            需要转移的记忆列表(满容量时返回所有记忆)
        """
        # 如果短期记忆已满或接近满,一次性转移所有记忆
        if len(self.memories) >= self.max_memories:
            logger.info(
                f"转移策略(transfer_all): 短期记忆已满 ({len(self.memories)}/{self.max_memories}),"
                f"将转移所有 {len(self.memories)} 条记忆到长期记忆"
            )
            return self.memories.copy()

        # 如果还没满,检查是否有高重要性记忆需要转移
        high_importance_memories = [
            mem for mem in self.memories
            if mem.importance >= self.transfer_importance_threshold
        ]

        if high_importance_memories:
            logger.debug(
                f"转移策略(transfer_all): 发现 {len(high_importance_memories)} 条高重要性记忆待转移"
            )
            return high_importance_memories

        logger.debug(
            f"转移策略(transfer_all): 无需转移 (当前容量 {len(self.memories)}/{self.max_memories})"
        )
        return []

    def _get_selective_cleanup_strategy(self) -> list[ShortTermMemory]:
        """
        "选择性清理"策略(原有策略):优先转移重要记忆,低重要性记忆考虑直接删除

        返回:
            需要转移的记忆列表
        逻辑:
        1. 优先选择重要性 >= 阈值的记忆
        2. 如果剩余记忆数量仍超过 max_memories,直接清理最早的低重要性记忆直到低于上限
        """
        # 单次遍历:同时分类高重要性和低重要性记忆
        high_importance_memories = []
        candidates = []
        low_importance_memories = []

        for mem in self.memories:
            if mem.importance >= self.transfer_importance_threshold:
                high_importance_memories.append(mem)
                candidates.append(mem)
            else:
                low_importance_memories.append(mem)

        # 策略1:优先返回高重要性记忆进行转移
        if high_importance_memories:
            logger.debug(
                f"转移策略(selective): 发现 {len(high_importance_memories)} 条高重要性记忆待转移"
            )
            return high_importance_memories
        # 如果低重要性记忆数量超过了上限(说明积压严重)
        # 我们需要清理掉一部分,而不是转移它们
        if len(low_importance_memories) > self.max_memories:
            # 目标保留数量(降至上限的 90%)
            target_keep_count = int(self.max_memories * 0.9)
            num_to_remove = len(low_importance_memories) - target_keep_count

        # 策略2:如果没有高重要性记忆但总体超过容量上限,
        # 返回一部分低重要性记忆用于转移(而非删除)
        if len(self.memories) > self.max_memories:
            # 计算需要转移的数量(目标:降到上限)
            num_to_transfer = len(self.memories) - self.max_memories
            if num_to_remove > 0:
                # 按创建时间排序,删除最早的
                low_importance_memories.sort(key=lambda x: x.created_at)
                to_remove = low_importance_memories[:num_to_remove]

            # 按创建时间排序低重要性记忆,优先转移最早的(可能包含过时信息)
            low_importance_memories.sort(key=lambda x: x.created_at)
            to_transfer = low_importance_memories[:num_to_transfer]
                # 批量删除并更新索引
                remove_ids = {mem.id for mem in to_remove}
                self.memories = [mem for mem in self.memories if mem.id not in remove_ids]
                for mem_id in remove_ids:
                    del self._memory_id_index[mem_id]
                    self._similarity_cache.pop(mem_id, None)

            if to_transfer:
                logger.debug(
                    f"转移策略(selective): 发现 {len(to_transfer)} 条低重要性记忆待转移 "
                    f"(当前容量 {len(self.memories)}/{self.max_memories})"
                logger.info(
                    f"短期记忆清理: 移除了 {len(to_remove)} 条低重要性记忆 "
                    f"(保留 {len(self.memories)} 条)"
                )
                return to_transfer

        # 策略3:容量充足,无需转移
        logger.debug(
            f"转移策略(selective): 无需转移 (当前容量 {len(self.memories)}/{self.max_memories})"
        )
        return []
                # 触发保存
                asyncio.create_task(self._save_to_disk())

    def force_cleanup_overflow(self, keep_ratio: float | None = None) -> int:
        """
        当短期记忆超过容量时,强制删除低重要性且最早的记忆以泄压

        Args:
            keep_ratio: 保留容量的比例(默认使用配置中的 cleanup_keep_ratio)

        Returns:
            删除的记忆数量
        """
        if not self.enable_force_cleanup:
            return 0

        if self.max_memories <= 0:
            return 0

        # 使用实例配置或传入参数
        if keep_ratio is None:
            keep_ratio = self.cleanup_keep_ratio

        current = len(self.memories)
        limit = int(self.max_memories * keep_ratio)
        if current <= self.max_memories:
            return 0

        # 先按重要性升序,再按创建时间升序删除
        sorted_memories = sorted(self.memories, key=lambda m: (m.importance, m.created_at))
        remove_count = max(0, current - limit)
        to_remove = {mem.id for mem in sorted_memories[:remove_count]}

        if not to_remove:
            return 0

        self.memories = [mem for mem in self.memories if mem.id not in to_remove]
        for mem_id in to_remove:
            self._memory_id_index.pop(mem_id, None)
            self._similarity_cache.pop(mem_id, None)

        # 异步保存即可,不阻塞主流程
        asyncio.create_task(self._save_to_disk())

        logger.warning(
            f"短期记忆压力泄压: 移除 {len(to_remove)} 条 (当前 {len(self.memories)}/{self.max_memories})"
        )

        return len(to_remove)
        return candidates

    async def clear_transferred_memories(self, memory_ids: list[str]) -> None:
        """
        清除已转移到长期记忆的记忆

        在 "transfer_all" 策略下,还会删除不重要的短期记忆以释放空间

        Args:
            memory_ids: 已转移的记忆ID列表
@@ -805,32 +695,6 @@ class ShortTermMemoryManager:

        logger.info(f"清除 {len(memory_ids)} 条已转移的短期记忆")

        # 在 "transfer_all" 策略下,进一步删除不重要的短期记忆
        if self.overflow_strategy == "transfer_all":
            # 计算需要删除的低重要性记忆数量
            low_importance_memories = [
                mem for mem in self.memories
                if mem.importance < self.transfer_importance_threshold
            ]

            if low_importance_memories:
                # 按重要性和创建时间排序,删除最不重要的
                low_importance_memories.sort(key=lambda m: (m.importance, m.created_at))

                # 删除所有低重要性记忆
                to_delete = {mem.id for mem in low_importance_memories}
                self.memories = [mem for mem in self.memories if mem.id not in to_delete]

                # 更新索引
                for mem_id in to_delete:
                    self._memory_id_index.pop(mem_id, None)
                    self._similarity_cache.pop(mem_id, None)

                logger.info(
                    f"transfer_all 策略: 额外删除了 {len(to_delete)} 条低重要性记忆 "
                    f"(重要性 < {self.transfer_importance_threshold:.2f})"
                )

            # 异步保存
            asyncio.create_task(self._save_to_disk())

@@ -1,240 +0,0 @@
# 短期记忆压力泄压补丁

## 📋 概述

在高频消息场景下,短期记忆层(`ShortTermMemoryManager`)可能在自动转移机制触发前快速堆积大量记忆,当达到容量上限(`max_memories`)时可能阻塞后续写入。本功能提供一个**可选的泄压开关**,在容量溢出时自动删除低优先级记忆,防止系统阻塞。

**关键特性**:
- ✅ 默认开启(在高频场景中保护系统),可关闭保持向后兼容
- ✅ 基于重要性和时间的智能删除策略
- ✅ 异步持久化,不阻塞主流程
- ✅ 可通过配置文件或代码灵活控制
- ✅ 支持自定义保留比例

---

## 🔧 配置方法

### 方法 1:代码配置(直接创建管理器)

如果您在代码中直接实例化 `UnifiedMemoryManager`:

```python
from src.memory_graph.unified_manager import UnifiedMemoryManager

manager = UnifiedMemoryManager(
    short_term_enable_force_cleanup=True,  # 开启泄压功能
    short_term_cleanup_keep_ratio=0.9,     # 泄压时保留容量的比例(90%)
    short_term_max_memories=30,            # 短期记忆容量上限
    # ... 其他参数
)
```

### 方法 2:配置文件(通过单例获取)

**推荐方式**:如果您使用 `get_unified_memory_manager()` 单例,通过配置文件控制。

#### ✅ 已实现
配置文件 `config/bot_config.toml` 的 `[memory]` 节已包含此参数。

在 `config/bot_config.toml` 的 `[memory]` 节配置:

```toml
[memory]
# ... 其他配置 ...
short_term_max_memories = 30             # 短期记忆容量上限
short_term_transfer_threshold = 0.6      # 转移到长期记忆的重要性阈值
short_term_enable_force_cleanup = true   # 开启压力泄压(建议高频场景开启)
short_term_cleanup_keep_ratio = 0.9      # 泄压时保留容量的比例(保留90%)
```

配置自动由 `src/memory_graph/manager_singleton.py` 读取并传递给管理器:

```python
_unified_memory_manager = UnifiedMemoryManager(
    # ... 其他参数 ...
    short_term_enable_force_cleanup=getattr(config, "short_term_enable_force_cleanup", True),
    short_term_cleanup_keep_ratio=getattr(config, "short_term_cleanup_keep_ratio", 0.9),
)
```

---

## ⚙️ 核心实现位置

### 1. 参数定义
**文件**:`src/memory_graph/unified_manager.py` 第 35-54 行
```python
class UnifiedMemoryManager:
    def __init__(
        self,
        # ... 其他参数 ...
        short_term_enable_force_cleanup: bool = False,  # 开关参数
        short_term_cleanup_keep_ratio: float = 0.9,     # 保留比例参数
        # ... 其他参数
    ):
```

### 2. 传递到短期层
**文件**:`src/memory_graph/unified_manager.py` 第 94-106 行
```python
self._config = {
    "short_term": {
        "max_memories": short_term_max_memories,
        "transfer_importance_threshold": short_term_transfer_threshold,
        "enable_force_cleanup": short_term_enable_force_cleanup,  # 传递给 ShortTermMemoryManager
        "cleanup_keep_ratio": short_term_cleanup_keep_ratio,      # 传递保留比例
    },
    # ... 其他配置
}
```

### 3. 泄压逻辑实现
**文件**:`src/memory_graph/short_term_manager.py` 第 40-76 行(初始化)和第 697-745 行(执行)

初始化参数:
```python
class ShortTermMemoryManager:
    def __init__(
        self,
        max_memories: int = 30,
        enable_force_cleanup: bool = False,
        cleanup_keep_ratio: float = 0.9,  # 新参数
    ):
        self.enable_force_cleanup = enable_force_cleanup
        self.cleanup_keep_ratio = cleanup_keep_ratio
```

执行泄压:
```python
def force_cleanup_overflow(self, keep_ratio: float | None = None) -> int:
    """当短期记忆超过容量时,强制删除低重要性且最早的记忆以泄压"""
    if not self.enable_force_cleanup:  # 检查开关
        return 0

    if keep_ratio is None:
        keep_ratio = self.cleanup_keep_ratio  # 使用实例配置
    # ... 删除逻辑
```

### 4. 触发条件
**文件**:`src/memory_graph/unified_manager.py` 自动转移循环中
```python
# 在自动转移循环中检测容量溢出
if occupancy_ratio >= 1.0 and not transfer_cache:
    removed = self.short_term_manager.force_cleanup_overflow()
    if removed > 0:
        logger.warning(f"短期记忆压力泄压: 移除 {removed} 条 (当前 {len}/30)")
```

---

## 🔄 运行机制

### 触发条件(同时满足)
1. ✅ 开关已开启(`enable_force_cleanup=True`)
2. ✅ 短期记忆占用率 ≥ 100%(`len(memories) >= max_memories`)
3. ✅ 当前没有待转移批次(`transfer_cache` 为空)
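上面三个触发条件可以合成一个最小示例来验证(仅为示意,`should_force_cleanup` 是假设的独立函数,`memories`、`transfer_cache` 用普通列表代替真实对象,并非源码原样):

```python
def should_force_cleanup(enabled: bool, memories: list,
                         max_memories: int, transfer_cache: list) -> bool:
    """示意:三个条件同时满足才触发泄压(假设的独立函数,非源码)"""
    occupancy_ratio = len(memories) / max_memories
    # 开关开启 + 占用率 >= 100% + 没有待转移批次
    return enabled and occupancy_ratio >= 1.0 and not transfer_cache

print(should_force_cleanup(True, list(range(30)), 30, []))   # True:满容量且无待转移
print(should_force_cleanup(True, list(range(29)), 30, []))   # False:未满
print(should_force_cleanup(True, list(range(30)), 30, [1]))  # False:有待转移批次
print(should_force_cleanup(False, list(range(30)), 30, []))  # False:开关关闭
```

任一条件不满足(未满容量、缓存里还有待转移批次、或开关关闭)都不会触发,与上面的清单一一对应。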

### 删除策略

**排序规则**:双重排序,先按重要性升序,再按创建时间升序
```python
sorted_memories = sorted(self.memories, key=lambda m: (m.importance, m.created_at))
```

**删除数量**:根据 `cleanup_keep_ratio` 删除
```python
current = len(self.memories)                 # 当前记忆数
limit = int(self.max_memories * keep_ratio)  # 目标保留数
remove_count = current - limit               # 需要删除的数量
```

**示例**(`max_memories=30, keep_ratio=0.9`):
- 当前记忆数 `35` → 删除到 `27` 条(保留 90%)
- 删除 `35 - 27 = 8` 条最低优先级记忆
- 优先删除:重要性最低且创建时间最早的记忆
- 删除后异步保存,不阻塞主流程
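上述删除数量的计算与双重排序可以串成一个可运行的小示例(仅为示意,`pick_overflow_victims` 是假设的独立函数,记忆用 `(importance, created_at)` 元组代替真实对象):

```python
def pick_overflow_victims(memories, max_memories=30, keep_ratio=0.9):
    """示意:返回应被泄压删除的记忆(重要性最低、创建最早优先)"""
    if len(memories) <= max_memories:
        return []                                 # 未超过上限,不泄压
    limit = int(max_memories * keep_ratio)        # 目标保留数:30*0.9=27
    remove_count = max(0, len(memories) - limit)  # 超出目标的部分全部删除
    # 先按重要性升序,再按创建时间升序
    ordered = sorted(memories, key=lambda m: (m[0], m[1]))
    return ordered[:remove_count]

# 35 条记忆 → 删除 35 - 27 = 8 条
mems = [(i % 10 / 10, i) for i in range(35)]  # (importance, created_at)
victims = pick_overflow_victims(mems)
print(len(victims))  # 8
```

与示例一致:35 条时删 8 条,且被删的先是重要性 0.0、创建最早的那批;不超过 30 条时返回空列表,完全不动现有记忆。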

### 持久化
- 使用 `asyncio.create_task(self._save_to_disk())` 异步保存
- **不阻塞**消息处理主流程

---

## 📊 性能影响

| 场景 | 开关状态 | 行为 | 适用场景 |
|------|---------|------|---------|
| 高频消息 | ✅ 开启 | 自动泄压,防止阻塞 | 群聊、客服场景 |
| 低频消息 | ❌ 关闭 | 仅依赖自动转移 | 私聊、低活跃群 |
| 调试阶段 | ❌ 关闭 | 便于观察记忆堆积 | 开发测试 |

**日志示例**(开启后):
```
[WARNING] 短期记忆压力泄压: 移除 8 条 (当前 27/30)
[WARNING] 短期记忆占用率 100%,已强制删除 8 条低重要性记忆泄压
```

---

## 🚨 注意事项

### ⚠️ 何时开启
- ✅ **默认开启**:高频群聊、客服机器人、24/7 运行场景
- ⚠️ **可选关闭**:需要完整保留所有短期记忆或调试阶段

### ⚠️ 潜在影响
- 低重要性记忆可能被删除,**不会转移到长期记忆**
- 如需保留所有记忆,应调大 `max_memories` 或关闭此功能

### ⚠️ 与自动转移的协同
本功能是**兜底机制**,正常情况下:
1. 优先触发自动转移(占用率 ≥ 50%)
2. 高重要性记忆转移到长期层
3. 仅当转移来不及时,泄压才会触发

---

## 🔙 回滚与禁用

### 临时禁用(无需重启)
```python
# 运行时修改(如果您能访问管理器实例)
unified_manager.short_term_manager.enable_force_cleanup = False
```

### 永久关闭
**配置文件方式**:
```toml
[memory]
short_term_enable_force_cleanup = false  # 关闭泄压
short_term_cleanup_keep_ratio = 0.9      # 此时该参数被忽略
```

**代码方式**:
```python
manager = UnifiedMemoryManager(
    short_term_enable_force_cleanup=False,  # 显式关闭
)
```

---

## 📚 相关文档

- [三层记忆系统用户指南](../../docs/three_tier_memory_user_guide.md)
- [记忆图谱架构](../../docs/memory_graph_guide.md)
- [统一调度器指南](../../docs/unified_scheduler_guide.md)

---

## 📝 实现状态

✅ **已完成**(2025年12月16日):
- 配置文件已添加 `short_term_enable_force_cleanup` 和 `short_term_cleanup_keep_ratio` 参数
- `UnifiedMemoryManager` 支持新参数并正确传递配置
- `ShortTermMemoryManager` 实现完整的泄压逻辑
- `manager_singleton.py` 读取并应用配置
- 日志系统正确记录泄压事件

**最后更新**:2025年12月16日
@@ -9,7 +9,7 @@ from collections.abc import Iterable
import networkx as nx

from src.common.logger import get_logger
from src.memory_graph.models import EdgeType, Memory, MemoryEdge
from src.memory_graph.models import Memory, MemoryEdge

logger = get_logger(__name__)

@@ -159,6 +159,9 @@ class GraphStore:
        # 1.5. 注销记忆中的边的邻接索引记录
        self._unregister_memory_edges(memory)

        # 1.5. 注销记忆中的边的邻接索引记录
        self._unregister_memory_edges(memory)

        # 2. 添加节点到图
        if not self.graph.has_node(node_id):
            from datetime import datetime
@@ -198,9 +201,6 @@ class GraphStore:
            )
            memory.nodes.append(new_node)

        # 5. 重新注册记忆中的边到邻接索引
        self._register_memory_edges(memory)

        logger.debug(f"添加节点成功: {node_id} -> {memory_id}")
        return True

@@ -926,23 +926,12 @@ class GraphStore:
            mem_edge = MemoryEdge.from_dict(edge_dict)
        except Exception:
            # 兼容性:直接构造对象
            # 确保 edge_type 是 EdgeType 枚举
            edge_type_value = edge_dict["edge_type"]
            if isinstance(edge_type_value, str):
                try:
                    edge_type_enum = EdgeType(edge_type_value)
                except ValueError:
                    logger.warning(f"未知的边类型: {edge_type_value}, 使用默认值")
                    edge_type_enum = EdgeType.RELATION
            else:
                edge_type_enum = edge_type_value

            mem_edge = MemoryEdge(
                id=edge_dict["id"] or "",
                source_id=edge_dict["source_id"],
                target_id=edge_dict["target_id"],
                relation=edge_dict["relation"],
                edge_type=edge_type_enum,
                edge_type=edge_dict["edge_type"],
                importance=edge_dict.get("importance", 0.5),
                metadata=edge_dict.get("metadata", {}),
            )

@@ -44,9 +44,6 @@ class UnifiedMemoryManager:
        # 短期记忆配置
        short_term_max_memories: int = 30,
        short_term_transfer_threshold: float = 0.6,
        short_term_overflow_strategy: str = "transfer_all",
        short_term_enable_force_cleanup: bool = False,
        short_term_cleanup_keep_ratio: float = 0.9,
        # 长期记忆配置
        long_term_batch_size: int = 10,
        long_term_search_top_k: int = 5,
@@ -99,9 +96,6 @@ class UnifiedMemoryManager:
            "short_term": {
                "max_memories": short_term_max_memories,
                "transfer_importance_threshold": short_term_transfer_threshold,
                "overflow_strategy": short_term_overflow_strategy,
                "enable_force_cleanup": short_term_enable_force_cleanup,
                "cleanup_keep_ratio": short_term_cleanup_keep_ratio,
            },
            "long_term": {
                "batch_size": long_term_batch_size,
@@ -571,9 +565,7 @@ class UnifiedMemoryManager:
        self._transfer_wakeup_event.clear()

        self._auto_transfer_task = asyncio.create_task(self._auto_transfer_loop())
        # 立即触发一次检查,避免启动初期的长时间等待
        self._transfer_wakeup_event.set()
        logger.debug("自动转移任务已启动并触发首次检查")
        logger.debug("自动转移任务已启动")

    async def _auto_transfer_loop(self) -> None:
        """自动转移循环(批量缓存模式,优化:更高效的缓存管理)"""
@@ -619,13 +611,6 @@ class UnifiedMemoryManager:
            occupancy_ratio = len(self.short_term_manager.memories) / max_memories
            time_since_last_transfer = time.monotonic() - last_transfer_time

            if occupancy_ratio >= 1.0 and not transfer_cache:
                removed = self.short_term_manager.force_cleanup_overflow()
                if removed > 0:
                    logger.warning(
                        f"短期记忆占用率 {occupancy_ratio:.0%},已强制删除 {removed} 条低重要性记忆泄压"
                    )

            # 优化:优先级判断重构(早期 return)
            should_transfer = (
                len(transfer_cache) >= cache_size_threshold

@@ -117,18 +117,11 @@ class BaseInterestCalculator(ABC):
        """
        try:
            self._enabled = True
            # 子类可以重写此方法执行自定义初始化
            await self.on_initialize()
            return True
        except Exception as e:
            logger.error(f"初始化兴趣计算器失败: {e}")
        except Exception:
            self._enabled = False
            return False

    async def on_initialize(self):
        """子类可重写的初始化钩子"""
        pass

    async def cleanup(self) -> bool:
        """清理组件资源

@@ -136,18 +129,11 @@ class BaseInterestCalculator(ABC):
            bool: 清理是否成功
        """
        try:
            # 子类可以重写此方法执行自定义清理
            await self.on_cleanup()
            self._enabled = False
            return True
        except Exception as e:
            logger.error(f"清理兴趣计算器失败: {e}")
        except Exception:
            return False

    async def on_cleanup(self):
        """子类可重写的清理钩子"""
        pass

    @property
    def is_enabled(self) -> bool:
        """组件是否已启用"""

@@ -39,7 +39,7 @@ class UserFactTool(BaseTool):
        ("info_value", ToolParamType.STRING, "具体内容,如'11月23日'、'程序员'、'想开咖啡店'", True, None),
    ]
    available_for_llm = True
    history_ttl = 0
    history_ttl = 5

    async def execute(self, function_args: dict[str, Any]) -> dict[str, Any]:
        """执行关键信息记录

@@ -85,7 +85,7 @@ class UserProfileTool(BaseTool):
        ("key_info_value", ToolParamType.STRING, "具体信息内容(必须是具体值如'11月23日'、'上海')", False, None),
    ]
    available_for_llm = True
    history_ttl = 0
    history_ttl = 1

    async def execute(self, function_args: dict[str, Any]) -> dict[str, Any]:
        """执行用户画像更新(异步后台执行,不阻塞回复)

@@ -223,9 +223,6 @@ class KokoroFlowChatter(BaseChatter):
                exec_results.append(result)
                if result.get("success") and action.type in ("kfc_reply", "respond"):
                    has_reply = True
                    reply_text = (result.get("reply_text") or "").strip()
                    if reply_text:
                        action.params["content"] = reply_text

            # 11. 记录 Bot 规划到 mental_log
            session.add_bot_planning(
@@ -339,12 +336,6 @@ class KokoroFlowChatter(BaseChatter):
            # 为 kfc_reply 动作注入回复生成所需的上下文
            for action in plan_response.actions:
                if action.type == "kfc_reply":
                    # 分离模式下 Planner 不应直接生成回复内容;即使模型输出了 content,也应忽略
                    if "content" in action.params and action.params.get("content"):
                        logger.warning(
                            "[KFC] Split模式下Planner输出了kfc_reply.content,已忽略(由Replyer生成)"
                        )
                        action.params.pop("content", None)
                    action.params["user_id"] = user_id
                    action.params["user_name"] = user_name
                    action.params["thought"] = plan_response.thought

@@ -90,12 +90,6 @@ class PromptConfig:
    # 每条记录最大字符数
    max_entry_length: int = 500

    # 活动流格式:narrative(线性叙事)/ table(结构化表格)/ both(两者都给)
    # - narrative: 更自然,但信息密度较低,长时更容易丢细节
    # - table: 更高信息密度,便于模型对齐字段、检索与对比
    # - both: 调试/对照用,token 更高
    activity_stream_format: str = "narrative"

    # 是否包含人物关系信息
    include_relation: bool = True

@@ -242,11 +236,6 @@ def load_config() -> KokoroFlowChatterConfig:
    config.prompt = PromptConfig(
        max_activity_entries=getattr(pmt_cfg, "max_activity_entries", 30),
        max_entry_length=getattr(pmt_cfg, "max_entry_length", 500),
        activity_stream_format=getattr(
            pmt_cfg,
            "activity_stream_format",
            getattr(pmt_cfg, "activity_format", "narrative"),
        ),
        include_relation=getattr(pmt_cfg, "include_relation", True),
        include_memory=getattr(pmt_cfg, "include_memory", True),
    )

@@ -456,11 +456,6 @@ class ProactiveThinker:
            # 分离模式下需要注入上下文信息
            for action in plan_response.actions:
                if action.type == "kfc_reply":
                    if "content" in action.params and action.params.get("content"):
                        logger.warning(
                            "[KFC ProactiveThinker] Split模式下Planner输出了kfc_reply.content,已忽略(由Replyer生成)"
                        )
                        action.params.pop("content", None)
                    action.params["user_id"] = session.user_id
                    action.params["user_name"] = user_name
                    action.params["thought"] = plan_response.thought
@@ -500,7 +495,7 @@ class ProactiveThinker:

        # 执行动作(回复生成在 Action.execute() 中完成)
        for action in plan_response.actions:
            result = await action_manager.execute_action(
            await action_manager.execute_action(
                action_name=action.type,
                chat_id=session.stream_id,
                target_message=None,
@@ -509,10 +504,6 @@ class ProactiveThinker:
                thinking_id=None,
                log_prefix="[KFC ProactiveThinker]",
            )
            if result.get("success") and action.type in ("kfc_reply", "respond"):
                reply_text = (result.get("reply_text") or "").strip()
                if reply_text:
                    action.params["content"] = reply_text

        # 🎯 只有真正发送了消息才增加追问计数(do_nothing 不算追问)
        has_reply_action = any(
@@ -712,11 +703,6 @@ class ProactiveThinker:
        if self._mode == KFCMode.SPLIT:
            for action in plan_response.actions:
                if action.type == "kfc_reply":
                    if "content" in action.params and action.params.get("content"):
                        logger.warning(
                            "[KFC ProactiveThinker] Split模式下Planner输出了kfc_reply.content,已忽略(由Replyer生成)"
                        )
                        action.params.pop("content", None)
                    action.params["user_id"] = session.user_id
                    action.params["user_name"] = user_name
                    action.params["thought"] = plan_response.thought
@@ -749,7 +735,7 @@ class ProactiveThinker:

        # 执行动作(回复生成在 Action.execute() 中完成)
        for action in plan_response.actions:
            result = await action_manager.execute_action(
            await action_manager.execute_action(
                action_name=action.type,
                chat_id=session.stream_id,
                target_message=None,
@@ -758,10 +744,6 @@ class ProactiveThinker:
                thinking_id=None,
                log_prefix="[KFC ProactiveThinker]",
            )
            if result.get("success") and action.type in ("kfc_reply", "respond"):
                reply_text = (result.get("reply_text") or "").strip()
                if reply_text:
                    action.params["content"] = reply_text

        # 记录到 mental_log
        session.add_bot_planning(

@@ -75,12 +75,12 @@ class PromptBuilder:
        # 1.6. 构建自定义决策提示词块
        custom_decision_block = self._build_custom_decision_block()

        # 2. Planner(分离模式)不做重型上下文构建:记忆检索/工具信息/表达习惯检索等会显著拖慢处理
        # 这些信息留给 Replyer(生成最终回复文本)阶段再获取。
        relation_block = ""
        memory_block = ""
        tool_info = ""
        expression_habits = ""
        # 2. 使用 context_builder 获取关系、记忆、工具、表达习惯等
        context_data = await self._build_context_data(user_name, chat_stream, user_id)
        relation_block = context_data.get("relation_info", f"你与 {user_name} 还不太熟悉,这是早期的交流阶段。")
        memory_block = context_data.get("memory_block", "")
        tool_info = context_data.get("tool_info", "")
        expression_habits = self._build_combined_expression_block(context_data.get("expression_habits", ""))

        # 3. 构建活动流
        activity_stream = await self._build_activity_stream(session, user_name)
@@ -284,42 +284,6 @@ class PromptBuilder:

        return ""

    def _build_last_bot_action_block(self, session: KokoroSession | None) -> str:
        """
        构建“最近一次Bot动作/发言”块(用于插入到当前情况里)

        目的:让模型在决策时能显式参考“我刚刚做过什么/说过什么”,降低长上下文里漏细节的概率。
        """
        if not session or not getattr(session, "mental_log", None):
            return ""

        last_planning_entry: MentalLogEntry | None = None
        for entry in reversed(session.mental_log):
            if entry.event_type == EventType.BOT_PLANNING:
                last_planning_entry = entry
                break

        if not last_planning_entry:
            return ""

        actions_desc = self._format_actions(last_planning_entry.actions)

        last_message = ""
        for action in last_planning_entry.actions:
            if action.get("type") == "kfc_reply":
                content = (action.get("content") or "").strip()
                if content:
                    last_message = content

        if last_message and len(last_message) > 80:
            last_message = last_message[:80] + "..."

        lines = [f"你最近一次执行的动作是:{actions_desc}"]
        if last_message:
            lines.append(f"你上一次发出的消息是:「{last_message}」")

        return "\n".join(lines) + "\n\n"

    async def _build_context_data(
        self,
        user_name: str,
@@ -577,39 +541,14 @@ class PromptBuilder:
        构建活动流

        将 mental_log 中的事件按时间顺序转换为线性叙事
        支持线性叙事或结构化表格两种格式(可通过配置切换)
        使用统一的 prompt 模板
        """
        from ..config import get_config

        kfc_config = get_config()
        prompt_cfg = getattr(kfc_config, "prompt", None)
        max_entries = getattr(prompt_cfg, "max_activity_entries", 30) if prompt_cfg else 30
        max_entry_length = getattr(prompt_cfg, "max_entry_length", 500) if prompt_cfg else 500
        stream_format = (
            getattr(prompt_cfg, "activity_stream_format", "narrative") if prompt_cfg else "narrative"
        )

        entries = session.get_recent_entries(limit=max_entries)
        entries = session.get_recent_entries(limit=30)
        if not entries:
            return ""

        stream_format = (stream_format or "narrative").strip().lower()
        if stream_format == "table":
            return self._build_activity_stream_table(entries, user_name, max_entry_length)
        if stream_format == "both":
            table = self._build_activity_stream_table(entries, user_name, max_entry_length)
            narrative = await self._build_activity_stream_narrative(entries, user_name)
            return "\n\n".join([p for p in (table, narrative) if p])
        parts = []

        return await self._build_activity_stream_narrative(entries, user_name)

    async def _build_activity_stream_narrative(
        self,
        entries: list[MentalLogEntry],
        user_name: str,
    ) -> str:
        """构建线性叙事活动流(旧格式)"""
        parts: list[str] = []
        for entry in entries:
            part = await self._format_entry(entry, user_name)
            if part:
@@ -617,95 +556,6 @@ class PromptBuilder:

        return "\n\n".join(parts)

    def _build_activity_stream_table(
        self,
        entries: list[MentalLogEntry],
        user_name: str,
        max_cell_length: int = 500,
    ) -> str:
        """
        构建结构化表格活动流(更高信息密度)

        统一列:序号 / 时间 / 事件类型 / 内容 / 想法 / 行动 / 结果
        """

        def truncate(text: str, limit: int) -> str:
            if not text:
                return ""
            if limit <= 0:
                return text
            text = text.strip()
            return text if len(text) <= limit else (text[: max(0, limit - 1)] + "…")

        def md_cell(value: str) -> str:
            value = (value or "").replace("\r\n", "\n").replace("\n", "<br>")
            value = value.replace("|", "\\|")
            return truncate(value, max_cell_length)

        event_type_alias = {
            EventType.USER_MESSAGE: "用户消息",
            EventType.BOT_PLANNING: "你的决策",
            EventType.WAITING_UPDATE: "等待中",
            EventType.PROACTIVE_TRIGGER: "主动触发",
        }

        header = ["#", "时间", "类型", "内容", "想法", "行动", "结果"]
        lines = [
            "|" + "|".join(header) + "|",
            "|" + "|".join(["---"] * len(header)) + "|",
        ]

        for idx, entry in enumerate(entries, 1):
            time_str = entry.get_time_str()
            type_str = event_type_alias.get(entry.event_type, str(entry.event_type))

            content = ""
            thought = ""
            action = ""
            result = ""

            if entry.event_type == EventType.USER_MESSAGE:
                content = entry.content
                reply_status = entry.metadata.get("reply_status")
                if reply_status in ("in_time", "late"):
                    elapsed_min = entry.metadata.get("elapsed_seconds", 0) / 60
                    max_wait_min = entry.metadata.get("max_wait_seconds", 0) / 60
                    status_cn = "及时" if reply_status == "in_time" else "迟到"
                    result = f"回复{status_cn}(等{elapsed_min:.1f}/{max_wait_min:.1f}分钟)"

            elif entry.event_type == EventType.BOT_PLANNING:
                thought = entry.thought or "(无)"
                action = self._format_actions(entry.actions)
                if entry.max_wait_seconds > 0:
                    wait_min = entry.max_wait_seconds / 60
                    expected = entry.expected_reaction or "(无)"
                    result = f"等待≤{wait_min:.1f}分钟;期待={expected}"
                else:
                    result = "不等待"

            elif entry.event_type == EventType.WAITING_UPDATE:
                thought = entry.waiting_thought or "还在等…"
                elapsed_min = entry.elapsed_seconds / 60
                mood = (entry.mood or "").strip()
                result = f"已等{elapsed_min:.1f}分钟" + (f";心情={mood}" if mood else "")

            elif entry.event_type == EventType.PROACTIVE_TRIGGER:
                silence = entry.metadata.get("silence_duration", "一段时间")
                result = f"沉默{silence}"

            row = [
                str(idx),
                md_cell(time_str),
                md_cell(type_str),
                md_cell(content),
                md_cell(thought),
                md_cell(action),
                md_cell(result),
            ]
            lines.append("|" + "|".join(row) + "|")

        return "(结构化活动流表;按时间顺序)\n" + "\n".join(lines)

    async def _format_entry(self, entry: MentalLogEntry, user_name: str) -> str:
        """格式化单个活动日志条目"""

@@ -811,7 +661,6 @@ class PromptBuilder:
    ) -> str:
        """构建当前情况描述"""
        current_time = datetime.now().strftime("%Y年%m月%d日 %H:%M")
        last_action_block = self._build_last_bot_action_block(session)

        # 如果之前没有设置等待时间(max_wait_seconds == 0),视为 new_message
        if situation_type in ("reply_in_time", "reply_late"):
@@ -825,7 +674,6 @@ class PromptBuilder:
            return await global_prompt_manager.format_prompt(
                PROMPT_NAMES["situation_new_message"],
                current_time=current_time,
                last_action_block=last_action_block,
                user_name=user_name,
                latest_message=latest_message,
            )
@@ -837,7 +685,6 @@ class PromptBuilder:
            return await global_prompt_manager.format_prompt(
                PROMPT_NAMES["situation_reply_in_time"],
                current_time=current_time,
                last_action_block=last_action_block,
                user_name=user_name,
                elapsed_minutes=elapsed / 60,
                max_wait_minutes=max_wait / 60,
@@ -851,7 +698,6 @@ class PromptBuilder:
            return await global_prompt_manager.format_prompt(
                PROMPT_NAMES["situation_reply_late"],
                current_time=current_time,
                last_action_block=last_action_block,
                user_name=user_name,
                elapsed_minutes=elapsed / 60,
                max_wait_minutes=max_wait / 60,
@@ -897,7 +743,6 @@ class PromptBuilder:
            return await global_prompt_manager.format_prompt(
                PROMPT_NAMES["situation_timeout"],
                current_time=current_time,
                last_action_block=last_action_block,
                user_name=user_name,
                elapsed_minutes=elapsed / 60,
                max_wait_minutes=max_wait / 60,
@@ -911,7 +756,6 @@ class PromptBuilder:
            return await global_prompt_manager.format_prompt(
                PROMPT_NAMES["situation_proactive"],
                current_time=current_time,
                last_action_block=last_action_block,
                user_name=user_name,
                silence_duration=silence,
                trigger_reason=trigger_reason,
@@ -922,7 +766,6 @@ class PromptBuilder:
                PROMPT_NAMES["situation_new_message"],
                current_time=current_time,
                user_name=user_name,
                last_action_block=last_action_block,
            )

    def _build_actions_block(self, available_actions: dict | None) -> str:
@@ -1083,17 +926,15 @@ class PromptBuilder:
        """
        from datetime import datetime
        current_time = datetime.now().strftime("%Y年%m月%d日 %H:%M")
        last_action_block = self._build_last_bot_action_block(session)

        if situation_type == "new_message":
            return f"现在是 {current_time}。\n\n{last_action_block}{user_name} 刚给你发了消息。"
            return f"现在是 {current_time}。{user_name} 刚给你发了消息。"

        elif situation_type == "reply_in_time":
            elapsed = session.waiting_config.get_elapsed_seconds()
            max_wait = session.waiting_config.max_wait_seconds
            return (
                f"现在是 {current_time}。\n\n"
                f"{last_action_block}"
                f"现在是 {current_time}。\n"
                f"你之前发了消息后在等 {user_name} 的回复。"
                f"等了大约 {elapsed / 60:.1f} 分钟(你原本打算最多等 {max_wait / 60:.1f} 分钟)。"
                f"现在 {user_name} 回复了!"
@@ -1103,8 +944,7 @@ class PromptBuilder:
            elapsed = session.waiting_config.get_elapsed_seconds()
            max_wait = session.waiting_config.max_wait_seconds
            return (
                f"现在是 {current_time}。\n\n"
                f"{last_action_block}"
                f"现在是 {current_time}。\n"
                f"你之前发了消息后在等 {user_name} 的回复。"
                f"你原本打算最多等 {max_wait / 60:.1f} 分钟,但实际等了 {elapsed / 60:.1f} 分钟才收到回复。"
|
||||
f"虽然有点迟,但 {user_name} 终于回复了。"
|
||||
@@ -1114,8 +954,7 @@ class PromptBuilder:
|
||||
elapsed = session.waiting_config.get_elapsed_seconds()
|
||||
max_wait = session.waiting_config.max_wait_seconds
|
||||
return (
|
||||
f"现在是 {current_time}。\n\n"
|
||||
f"{last_action_block}"
|
||||
f"现在是 {current_time}。\n"
|
||||
f"你之前发了消息后一直在等 {user_name} 的回复。"
|
||||
f"你原本打算最多等 {max_wait / 60:.1f} 分钟,现在已经等了 {elapsed / 60:.1f} 分钟了,对方还是没回。"
|
||||
f"你决定主动说点什么。"
|
||||
@@ -1124,14 +963,13 @@ class PromptBuilder:
|
||||
elif situation_type == "proactive":
|
||||
silence = extra_context.get("silence_duration", "一段时间")
|
||||
return (
|
||||
f"现在是 {current_time}。\n\n"
|
||||
f"{last_action_block}"
|
||||
f"现在是 {current_time}。\n"
|
||||
f"你和 {user_name} 已经有一段时间没聊天了(沉默了 {silence})。"
|
||||
f"你决定主动找 {user_name} 聊点什么。"
|
||||
)
|
||||
|
||||
# 默认
|
||||
return f"现在是 {current_time}。\n\n{last_action_block}".rstrip()
|
||||
return f"现在是 {current_time}。"
|
||||
|
||||
async def _build_reply_context(
|
||||
self,
|
||||
|
||||
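The waiting branches above all render elapsed time the same way: seconds in, one-decimal minutes out. A minimal standalone sketch of that formatting (the helper name is hypothetical; the f-string format specs match the code above):

```python
def describe_wait(elapsed: float, max_wait: float) -> str:
    # Convert seconds to minutes and format with one decimal place,
    # exactly as the reply_in_time branch phrases it.
    return f"等了大约 {elapsed / 60:.1f} 分钟(你原本打算最多等 {max_wait / 60:.1f} 分钟)。"
```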
@@ -34,7 +34,7 @@ kfc_MAIN_PROMPT = Prompt(
{tool_info}

# 你们之间最近的活动记录
以下是你和 {user_name} 最近的互动历史,按时间顺序记录了你们的对话和你的心理活动(可能是线性叙事或结构化表格):
以下是你和 {user_name} 最近的互动历史,按时间顺序记录了你们的对话和你的心理活动:
{activity_stream}

# 聊天历史总览
@@ -69,7 +69,7 @@ kfc_OUTPUT_FORMAT = Prompt(
    {{"type": "动作名称", ...动作参数}}
  ],
  "expected_reaction": "你期待对方的反应是什么",
  "max_wait_seconds": 0
- `max_wait_seconds`:预估的等待时间(秒),请根据对话节奏来判断。通常你应该设置为0避免总是等待显得聒噪,但是当你觉得你需要等待对方回复时,可以设置一个合理的等待时间。
}}
```

@@ -93,7 +93,7 @@ kfc_SITUATION_NEW_MESSAGE = Prompt(
    name="kfc_situation_new_message",
    template="""现在是 {current_time}。

{last_action_block}{user_name} 刚刚给你发了消息:「{latest_message}」
{user_name} 刚刚给你发了消息:「{latest_message}」

这是一次新的对话发起(不是对你之前消息的回复)。

@@ -108,7 +108,7 @@ kfc_SITUATION_REPLY_IN_TIME = Prompt(
    name="kfc_situation_reply_in_time",
    template="""现在是 {current_time}。

{last_action_block}你之前发了消息后一直在等 {user_name} 的回复。
你之前发了消息后一直在等 {user_name} 的回复。
等了大约 {elapsed_minutes:.1f} 分钟(你原本打算最多等 {max_wait_minutes:.1f} 分钟)。
现在 {user_name} 回复了:「{latest_message}」

@@ -119,7 +119,7 @@ kfc_SITUATION_REPLY_LATE = Prompt(
    name="kfc_situation_reply_late",
    template="""现在是 {current_time}。

{last_action_block}你之前发了消息后在等 {user_name} 的回复。
你之前发了消息后在等 {user_name} 的回复。
你原本打算最多等 {max_wait_minutes:.1f} 分钟,但实际等了 {elapsed_minutes:.1f} 分钟才收到回复。
虽然有点迟,但 {user_name} 终于回复了:「{latest_message}」

@@ -130,7 +130,7 @@ kfc_SITUATION_TIMEOUT = Prompt(
    name="kfc_situation_timeout",
    template="""现在是 {current_time}。

{last_action_block}你之前发了消息后一直在等 {user_name} 的回复。
你之前发了消息后一直在等 {user_name} 的回复。
你原本打算最多等 {max_wait_minutes:.1f} 分钟,现在已经等了 {elapsed_minutes:.1f} 分钟了,对方还是没回。
你当时期待的反应是:"{expected_reaction}"
{timeout_context}
@@ -161,7 +161,7 @@ kfc_SITUATION_PROACTIVE = Prompt(
    name="kfc_situation_proactive",
    template="""现在是 {current_time}。

{last_action_block}你和 {user_name} 已经有一段时间没聊天了(沉默了 {silence_duration})。
你和 {user_name} 已经有一段时间没聊天了(沉默了 {silence_duration})。
{trigger_reason}

你在想要不要主动找 {user_name} 聊点什么。
@@ -251,7 +251,7 @@ kfc_PLANNER_OUTPUT_FORMAT = Prompt(
    {{"type": "动作名称", ...动作参数}}
  ],
  "expected_reaction": "你期待对方的反应是什么",
  "max_wait_seconds": 0
- `max_wait_seconds`:预估的等待时间(秒),请根据对话节奏来判断。通常你应该设置为0避免总是等待显得聒噪,但是当你觉得你需要等待对方回复时,可以设置一个合理的等待时间。
}}
```

@@ -264,7 +264,6 @@ kfc_PLANNER_OUTPUT_FORMAT = Prompt(

### 注意事项
- 动作参数直接写在动作对象里,不需要 `action_data` 包装
- **分离模式规则**:Planner 阶段禁止输出 `kfc_reply.content`(就算写了也会被系统忽略,回复内容由 Replyer 单独生成)
- 即使什么都不想做,也放一个 `{{"type": "do_nothing"}}`
- 可以组合多个动作,比如先发消息再发表情""",
)
@@ -407,7 +406,7 @@ kfc_UNIFIED_OUTPUT_FORMAT = Prompt(
    {{"type": "kfc_reply", "content": "你的回复内容"}}
  ],
  "expected_reaction": "你期待对方的反应是什么",
  "max_wait_seconds": 0
- `max_wait_seconds`:预估的等待时间(秒),请根据对话节奏来判断。通常你应该设置为0避免总是等待显得聒噪,但是当你觉得你需要等待对方回复时,可以设置一个合理的等待时间。
}}
```


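The planner output format above is a JSON object with an action list, an `expected_reaction` string, and a `max_wait_seconds` integer (default 0, meaning do not wait). A sketch of consuming that structure; note the `"actions"` key name and the helper itself are assumptions, since the template excerpt only shows the list items and the trailing fields:

```python
import json


def parse_planner_output(raw: str) -> tuple[list[dict], str, int]:
    # Hypothetical parser for the JSON shape documented above.
    data = json.loads(raw)
    actions = [a for a in data.get("actions", []) if isinstance(a, dict)]
    if not actions:
        # The format requires at least a do_nothing action.
        actions = [{"type": "do_nothing"}]
    expected = data.get("expected_reaction", "")
    # max_wait_seconds defaults to 0 (do not wait).
    max_wait = int(data.get("max_wait_seconds", 0) or 0)
    return actions, expected, max_wait
```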
@@ -3,6 +3,7 @@ MaiZone(麦麦空间)- 重构版
"""

import asyncio
from pathlib import Path

from src.common.logger import get_logger
from src.plugin_system import BasePlugin, ComponentInfo, register_plugin
@@ -42,26 +43,19 @@ class MaiZoneRefactoredPlugin(BasePlugin):
        "plugin": {"enable": ConfigField(type=bool, default=True, description="是否启用插件")},
        "models": {
            "text_model": ConfigField(type=str, default="maizone", description="生成文本的模型名称"),
        },
        "ai_image": {
            "enable_ai_image": ConfigField(type=bool, default=False, description="是否启用AI生成配图"),
            "provider": ConfigField(type=str, default="siliconflow", description="AI生图服务提供商(siliconflow/novelai)"),
            "image_number": ConfigField(type=int, default=1, description="生成图片数量(1-4张)"),
        },
        "siliconflow": {
            "api_key": ConfigField(type=str, default="", description="硅基流动API密钥"),
        },
        "novelai": {
            "api_key": ConfigField(type=str, default="", description="NovelAI官方API密钥"),
            "character_prompt": ConfigField(type=str, default="", description="Bot角色外貌描述(AI判断需要bot出镜时插入)"),
            "base_negative_prompt": ConfigField(type=str, default="nsfw, nude, explicit, sexual content, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality", description="基础负面提示词(禁止不良内容)"),
            "proxy_host": ConfigField(type=str, default="", description="代理服务器地址(如:127.0.0.1)"),
            "proxy_port": ConfigField(type=int, default=0, description="代理服务器端口(如:7890)"),
            "siliconflow_apikey": ConfigField(type=str, default="", description="硅基流动AI生图API密钥"),
        },
        "send": {
            "permission": ConfigField(type=list, default=[], description="发送权限QQ号列表"),
            "permission_type": ConfigField(type=str, default="whitelist", description="权限类型"),
            "enable_image": ConfigField(type=bool, default=False, description="是否启用说说配图"),
            "enable_ai_image": ConfigField(type=bool, default=False, description="是否启用AI生成配图"),
            "enable_reply": ConfigField(type=bool, default=True, description="完成后是否回复"),
            "ai_image_number": ConfigField(type=int, default=1, description="AI生成图片数量(1-4张)"),
            "image_number": ConfigField(type=int, default=1, description="本地配图数量(1-9张)"),
            "image_directory": ConfigField(
                type=str, default=(Path(__file__).parent / "images").as_posix(), description="图片存储目录"
            ),
        },
        "read": {
            "permission": ConfigField(type=list, default=[], description="阅读权限QQ号列表"),

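Elsewhere in the diff these schema sections are read with dotted keys such as `"send.enable_ai_image"` or `"novelai.character_prompt"`. The real `get_config` is supplied by the plugin system; a minimal sketch of how such dotted keys map onto a nested section/field dict (the helper is illustrative, not the project's API):

```python
def get_nested(config: dict, dotted_key: str, default=None):
    # Walk "section.field" through nested dicts, falling back to default
    # when any segment is missing.
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node
```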
@@ -54,10 +54,9 @@ class ContentService:
                logger.error("未配置LLM模型")
                return ""

            # 获取机器人信息(核心人格配置)
            bot_personality_core = config_api.get_global_config("personality.personality_core", "一个机器人")
            bot_personality_side = config_api.get_global_config("personality.personality_side", "")
            bot_reply_style = config_api.get_global_config("personality.reply_style", "内容积极向上")
            # 获取机器人信息
            bot_personality = config_api.get_global_config("personality.personality_core", "一个机器人")
            bot_expression = config_api.get_global_config("personality.reply_style", "内容积极向上")
            qq_account = config_api.get_global_config("bot.qq_account", "")

            # 获取当前时间信息
@@ -66,20 +65,13 @@ class ContentService:
            weekday_names = ["星期一", "星期二", "星期三", "星期四", "星期五", "星期六", "星期日"]
            weekday = weekday_names[now.weekday()]

            # 构建人设描述
            personality_desc = f"你的核心人格:{bot_personality_core}"
            if bot_personality_side:
                personality_desc += f"\n你的人格侧面:{bot_personality_side}"
            personality_desc += f"\n\n你的表达方式:{bot_reply_style}"

            # 构建提示词
            prompt_topic = f"主题是'{topic}'" if topic else "主题不限"
            prompt = f"""
{personality_desc}
你是'{bot_personality}',现在是{current_time}({weekday}),你想写一条{prompt_topic}的说说发表在qq空间上。
{bot_expression}

现在是{current_time}({weekday}),你想写一条{prompt_topic}的说说发表在qq空间上。

请严格遵守以下规则:
请严格遵守以下规则:
1. **绝对禁止**在说说中直接、完整地提及当前的年月日或几点几分。
2. 你应该将当前时间作为创作的背景,用它来判断现在是“清晨”、“傍晚”还是“深夜”。
3. 使用自然、模糊的词语来暗示时间,例如“刚刚”、“今天下午”、“夜深啦”等。
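The persona-description assembly in the hunk above is small enough to lift out verbatim. A standalone sketch with the same strings in the same order (the function name is hypothetical):

```python
def build_personality_desc(core: str, side: str, reply_style: str) -> str:
    # Core personality always present; the optional "side" line only
    # when configured; expression style appended after a blank line.
    desc = f"你的核心人格:{core}"
    if side:
        desc += f"\n你的人格侧面:{side}"
    desc += f"\n\n你的表达方式:{reply_style}"
    return desc
```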
@@ -120,244 +112,7 @@ class ContentService:
            logger.error(f"生成说说内容时发生异常: {e}")
            return ""

    async def generate_story_with_image_info(
        self, topic: str, context: str | None = None
    ) -> tuple[str, dict]:
        """
        生成说说内容,并同时生成NovelAI图片提示词信息

        :param topic: 说说的主题
        :param context: 可选的聊天上下文
        :return: (说说文本, 图片信息字典)
        图片信息字典格式: {
            "prompt": str,  # NovelAI提示词(英文)
            "negative_prompt": str,  # 负面提示词(英文)
            "include_character": bool,  # 画面是否包含bot自己(true时插入角色外貌提示词)
            "aspect_ratio": str  # 画幅(方图/横图/竖图)
        }
        """
        try:
            # 获取模型配置
            models = llm_api.get_available_models()
            text_model = str(self.get_config("models.text_model", "replyer"))
            model_config = models.get(text_model)

            if not model_config:
                logger.error("未配置LLM模型")
                return "", {"has_image": False}

            # 获取机器人信息(核心人格配置)
            bot_personality_core = config_api.get_global_config("personality.personality_core", "一个机器人")
            bot_personality_side = config_api.get_global_config("personality.personality_side", "")
            bot_reply_style = config_api.get_global_config("personality.reply_style", "内容积极向上")
            qq_account = config_api.get_global_config("bot.qq_account", "")

            # 获取角色外貌描述(用于告知LLM)
            character_prompt = self.get_config("novelai.character_prompt", "")

            # 获取当前时间信息
            now = datetime.datetime.now()
            current_time = now.strftime("%Y年%m月%d日 %H:%M")
            weekday_names = ["星期一", "星期二", "星期三", "星期四", "星期五", "星期六", "星期日"]
            weekday = weekday_names[now.weekday()]

            # 构建提示词
            prompt_topic = f"主题是'{topic}'" if topic else "主题不限"

            # 构建人设描述
            personality_desc = f"你的核心人格:{bot_personality_core}"
            if bot_personality_side:
                personality_desc += f"\n你的人格侧面:{bot_personality_side}"
            personality_desc += f"\n\n你的表达方式:{bot_reply_style}"

            # 检查是否启用AI配图(统一开关)
            ai_image_enabled = self.get_config("ai_image.enable_ai_image", False)
            provider = self.get_config("ai_image.provider", "siliconflow")

            # NovelAI配图指引(内置)
            novelai_guide = ""
            output_format = '{"text": "说说正文内容"}'

            if ai_image_enabled and provider == "novelai":
                # 构建角色信息提示
                character_info = ""
                if character_prompt:
                    character_info = f"""
**角色特征锚点**(当include_character=true时会插入以下基础特征):
```
{character_prompt}
```
📌 重要说明:
- 这只是角色的**基础外貌特征**(发型、眼睛、耳朵等固定特征),用于锚定角色身份
- 你可以**自由描述**:衣服、动作、表情、姿势、装饰、配饰等所有可变元素
- 例如:可以让角色穿不同风格的衣服(casual, formal, sportswear, dress等)
- 例如:可以设计各种动作(sitting, standing, walking, running, lying down等)
- 例如:可以搭配各种表情(smile, laugh, serious, thinking, surprised等)
- **鼓励创意**:根据说说内容自由发挥,让画面更丰富生动!
"""

                novelai_guide = f"""
**配图说明:**
这条说说会使用NovelAI Diffusion模型(二次元风格)生成配图。
{character_info}
**提示词生成要求(非常重要):**
你需要生成一段详细的英文图片提示词,必须包含以下要素:

1. **画质标签**(必需):
   - 开头必须加:masterpiece, best quality, detailed, high resolution

2. **主体元素**(自由发挥):
   - 人物描述:表情、动作、姿态(**完全自由**,不受角色锚点限制)
   - 服装搭配:casual clothing, dress, hoodie, school uniform, sportswear等(**任意选择**)
   - 配饰装饰:hat, glasses, ribbon, jewelry, bag等(**随意添加**)
   - 物体/场景:具体的物品、建筑、自然景观等

3. **场景与环境**(必需):
   - 地点:indoor/outdoor, cafe, park, bedroom, street, beach, forest等
   - 背景:描述背景的细节(sky, trees, buildings, ocean, mountains等)

4. **氛围与风格**(必需):
   - 光线:sunlight, sunset, golden hour, soft lighting, dramatic lighting, night
   - 天气/时间:sunny day, rainy, cloudy, starry night, dawn, dusk
   - 整体氛围:peaceful, cozy, romantic, energetic, melancholic, playful

5. **色彩与细节**(推荐):
   - 主色调:warm colors, cool tones, pastel colors, vibrant colors
   - 特殊细节:falling petals, sparkles, lens flare, depth of field, bokeh

6. **include_character字段**:
   - true:画面中包含"你自己"(自拍、你在画面中的场景)
   - false:画面中不包含你(风景、物品、他人)

7. **negative_prompt(负面提示词)**:
   - **严格禁止**以下内容:nsfw, nude, explicit, sexual content, violence, gore, blood
   - 排除质量问题:lowres, bad anatomy, bad hands, deformed, mutilated, ugly
   - 排除瑕疵:blurry, poorly drawn, worst quality, low quality, jpeg artifacts
   - 可以自行补充其他不需要的元素

8. **aspect_ratio(画幅)**:
   - 方图:适合头像、特写、正方形构图
   - 横图:适合风景、全景、宽幅场景
   - 竖图:适合人物全身、纵向构图

**内容审核规则(必须遵守)**:
- 🚫 严禁生成NSFW、色情、裸露、性暗示内容
- 🚫 严禁生成暴力、血腥、恐怖、惊悚内容
- 🚫 严禁生成肢体畸形、器官变异、恶心画面
- ✅ 提示词必须符合健康、积极、美好的审美标准
- ✅ 专注于日常生活、自然风景、温馨场景等正面内容

**创意自由度**:
- 💡 **衣服搭配**:可以自由设计各种服装风格(休闲、正式、运动、可爱、时尚等)
- 💡 **动作姿势**:站、坐、躺、走、跑、跳、伸展等任意动作
- 💡 **表情情绪**:微笑、大笑、思考、惊讶、温柔、调皮等丰富表情
- 💡 **场景创意**:根据说说内容自由发挥,让画面更贴合心情和主题

**示例提示词(展示多样性)**:
- 休闲风:"masterpiece, best quality, 1girl, casual clothing, white t-shirt, jeans, sitting on bench, outdoor park, reading book, afternoon sunlight, relaxed atmosphere"
- 运动风:"masterpiece, best quality, 1girl, sportswear, running in park, energetic, morning light, trees background, dynamic pose, healthy lifestyle"
- 咖啡馆:"masterpiece, best quality, 1girl, sitting in cozy cafe, holding coffee cup, warm lighting, wooden table, books beside, peaceful atmosphere"
"""
                output_format = """{"text": "说说正文内容", "image": {"prompt": "详细的英文提示词(包含画质+主体+场景+氛围+光线+色彩)", "negative_prompt": "负面词", "include_character": true/false, "aspect_ratio": "方图/横图/竖图"}}"""
            elif ai_image_enabled and provider == "siliconflow":
                novelai_guide = """
**配图说明:**
这条说说会使用AI生成配图。

**提示词生成要求(非常重要):**
你需要生成一段详细的英文图片描述,必须包含以下要素:

1. **主体内容**:画面的核心元素(人物/物体/场景)
2. **具体场景**:地点、环境、背景细节
3. **氛围与风格**:整体感觉、光线、天气、色调
4. **细节描述**:补充的视觉细节(动作、表情、装饰等)

**示例提示词**:
- "a girl sitting in a modern cafe, warm afternoon lighting, wooden furniture, coffee cup on table, books beside her, cozy and peaceful atmosphere, soft focus background"
- "sunset over the calm ocean, golden hour, orange and purple sky, gentle waves, peaceful and serene mood, wide angle view"
- "cherry blossoms in spring, soft pink petals falling, blue sky, sunlight filtering through branches, peaceful park scene, gentle breeze"
"""
                output_format = """{"text": "说说正文内容", "image": {"prompt": "详细的英文描述(主体+场景+氛围+光线+细节)"}}"""

            prompt = f"""
{personality_desc}

现在是{current_time}({weekday}),你想写一条{prompt_topic}的说说发表在qq空间上。

**说说文本规则:**
1. **绝对禁止**在说说中直接、完整地提及当前的年月日或几点几分。
2. 你应该将当前时间作为创作的背景,用它来判断现在是"清晨"、"傍晚"还是"深夜"。
3. 使用自然、模糊的词语来暗示时间,例如"刚刚"、"今天下午"、"夜深啦"等。
4. **内容简短**:总长度严格控制在100字以内。
5. **禁止表情**:严禁使用任何Emoji表情符号。
6. **严禁重复**:下方会提供你最近发过的说说历史,你必须创作一条全新的、与历史记录内容和主题都不同的说说。
7. 不要刻意突出自身学科背景,不要浮夸,不要夸张修辞。

{novelai_guide}

**输出格式(JSON):**
{output_format}

只输出JSON格式,不要有其他内容。
"""

            # 如果有上下文,则加入到prompt中
            if context:
                prompt += f"\n\n作为参考,这里有一些最近的聊天记录:\n---\n{context}\n---"

            # 添加历史记录以避免重复
            prompt += "\n\n---历史说说记录---\n"
            history_block = await get_send_history(qq_account)
            if history_block:
                prompt += history_block

            # 调用LLM生成内容
            success, response, _, _ = await llm_api.generate_with_model(
                prompt=prompt,
                model_config=model_config,
                request_type="story.generate_with_image",
                temperature=0.3,
                max_tokens=1500,
            )

            if success:
                # 解析JSON响应
                import json5
                try:
                    # 提取JSON部分(去除可能的markdown代码块标记)
                    json_text = response.strip()
                    if json_text.startswith("```json"):
                        json_text = json_text[7:]
                    if json_text.startswith("```"):
                        json_text = json_text[3:]
                    if json_text.endswith("```"):
                        json_text = json_text[:-3]
                    json_text = json_text.strip()

                    data = json5.loads(json_text)
                    story_text = data.get("text", "")
                    image_info = data.get("image", {})

                    # 确保图片信息完整
                    if not isinstance(image_info, dict):
                        image_info = {}

                    logger.info(f"成功生成说说:'{story_text}'")
                    logger.info(f"配图信息: {image_info}")

                    return story_text, image_info

                except Exception as e:
                    logger.error(f"解析JSON失败: {e}, 原始响应: {response[:200]}")
                    # 降级处理:只返回文本,空配图信息
                    return response, {}
            else:
                logger.error("生成说说内容失败")
                return "", {}

        except Exception as e:
            logger.error(f"生成说说内容时发生异常: {e}")
            return "", {}
    async def generate_comment(self, content: str, target_name: str, rt_con: str = "", images: list = []) -> str:
        """
        针对一条具体的说说内容生成评论。
        """
|
||||
"""
|
||||
self.get_config = get_config
|
||||
|
||||
async def generate_image_from_prompt(self, prompt: str, save_dir: str | None = None) -> tuple[bool, Path | None]:
|
||||
"""
|
||||
直接使用提示词生成图片(硅基流动)
|
||||
|
||||
:param prompt: 图片提示词(英文)
|
||||
:param save_dir: 图片保存目录(None使用默认)
|
||||
:return: (是否成功, 图片路径)
|
||||
"""
|
||||
try:
|
||||
api_key = str(self.get_config("siliconflow.api_key", ""))
|
||||
image_num = self.get_config("ai_image.image_number", 1)
|
||||
|
||||
if not api_key:
|
||||
logger.warning("硅基流动API未配置,跳过图片生成")
|
||||
return False, None
|
||||
|
||||
# 图片目录
|
||||
if save_dir:
|
||||
image_dir = Path(save_dir)
|
||||
else:
|
||||
plugin_dir = Path(__file__).parent.parent
|
||||
image_dir = plugin_dir / "images"
|
||||
image_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
logger.info(f"正在生成 {image_num} 张AI配图...")
|
||||
success, img_path = await self._call_siliconflow_api(api_key, prompt, str(image_dir), image_num)
|
||||
return success, img_path
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"生成AI配图时发生异常: {e}")
|
||||
return False, None
|
||||
|
||||
async def generate_images_for_story(self, story: str) -> bool:
|
||||
"""
|
||||
根据说说内容,判断是否需要生成AI配图,并执行生成任务(硅基流动)。
|
||||
根据说说内容,判断是否需要生成AI配图,并执行生成任务。
|
||||
|
||||
:param story: 说说内容。
|
||||
:return: 图片是否成功生成(或不需要生成)。
|
||||
"""
|
||||
try:
|
||||
api_key = str(self.get_config("siliconflow.api_key", ""))
|
||||
image_num_raw = self.get_config("ai_image.image_number", 1)
|
||||
enable_ai_image = bool(self.get_config("send.enable_ai_image", False))
|
||||
api_key = str(self.get_config("models.siliconflow_apikey", ""))
|
||||
image_dir = str(self.get_config("send.image_directory", "./data/plugins/maizone_refactored/images"))
|
||||
image_num_raw = self.get_config("send.ai_image_number", 1)
|
||||
|
||||
# 安全地处理图片数量配置,并限制在API允许的范围内
|
||||
try:
|
||||
@@ -82,14 +52,15 @@ class ImageService:
|
||||
logger.warning(f"无效的图片数量配置: {image_num_raw},使用默认值1")
|
||||
image_num = 1
|
||||
|
||||
if not api_key:
|
||||
logger.warning("硅基流动API未配置,跳过图片生成")
|
||||
return True
|
||||
if not enable_ai_image:
|
||||
return True # 未启用AI配图,视为成功
|
||||
|
||||
# 图片目录(使用统一配置)
|
||||
plugin_dir = Path(__file__).parent.parent
|
||||
image_dir = plugin_dir / "images"
|
||||
image_dir.mkdir(parents=True, exist_ok=True)
|
||||
if not api_key:
|
||||
logger.error("启用了AI配图但未填写SiliconFlow API密钥")
|
||||
return False
|
||||
|
||||
# 确保图片目录存在
|
||||
Path(image_dir).mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# 生成图片提示词
|
||||
image_prompt = await self._generate_image_prompt(story)
|
||||
@@ -98,8 +69,7 @@ class ImageService:
|
||||
return False
|
||||
|
||||
logger.info(f"正在为说说生成 {image_num} 张AI配图...")
|
||||
success, _ = await self._call_siliconflow_api(api_key, image_prompt, str(image_dir), image_num)
|
||||
return success
|
||||
return await self._call_siliconflow_api(api_key, image_prompt, image_dir, image_num)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"处理AI配图时发生异常: {e}")
|
||||
@@ -157,7 +127,7 @@ class ImageService:
|
||||
logger.error(f"生成图片提示词时发生异常: {e}")
|
||||
return ""
|
||||
|
||||
async def _call_siliconflow_api(self, api_key: str, image_prompt: str, image_dir: str, batch_size: int) -> tuple[bool, Path | None]:
|
||||
async def _call_siliconflow_api(self, api_key: str, image_prompt: str, image_dir: str, batch_size: int) -> bool:
|
||||
"""
|
||||
调用硅基流动(SiliconFlow)的API来生成图片。
|
||||
|
||||
@@ -165,7 +135,7 @@ class ImageService:
|
||||
:param image_prompt: 用于生成图片的提示词。
|
||||
:param image_dir: 图片保存目录。
|
||||
:param batch_size: 生成图片的数量(1-4)。
|
||||
:return: (API调用是否成功, 第一张图片路径)
|
||||
:return: API调用是否成功。
|
||||
"""
|
||||
url = "https://api.siliconflow.cn/v1/images/generations"
|
||||
headers = {
|
||||
@@ -205,13 +175,12 @@ class ImageService:
|
||||
error_text = await response.text()
|
||||
logger.error(f"生成图片出错,错误码[{response.status}]")
|
||||
logger.error(f"错误响应: {error_text}")
|
||||
return False, None
|
||||
return False
|
||||
|
||||
json_data = await response.json()
|
||||
image_urls = [img["url"] for img in json_data["images"]]
|
||||
|
||||
success_count = 0
|
||||
first_img_path = None
|
||||
# 下载并保存图片
|
||||
for i, img_url in enumerate(image_urls):
|
||||
try:
|
||||
@@ -225,7 +194,7 @@ class ImageService:
|
||||
image = Image.open(BytesIO(img_data))
|
||||
|
||||
# 保存图片为PNG格式(确保兼容性)
|
||||
filename = f"siliconflow_{i}.png"
|
||||
filename = f"image_{i}.png"
|
||||
save_path = Path(image_dir) / filename
|
||||
|
||||
# 转换为RGB模式如果必要(避免RGBA等模式的问题)
|
||||
@@ -238,24 +207,20 @@ class ImageService:
|
||||
logger.info(f"图片已保存至: {save_path}")
|
||||
success_count += 1
|
||||
|
||||
# 记录第一张图片路径
|
||||
if first_img_path is None:
|
||||
first_img_path = save_path
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"处理图片失败: {e!s}")
|
||||
continue
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"下载图片失败: {e!s}")
|
||||
logger.error(f"下载第{i+1}张图片失败: {e!s}")
|
||||
continue
|
||||
|
||||
# 至少有一张图片成功就返回True
|
||||
return success_count > 0, first_img_path
|
||||
# 只要至少有一张图片成功就返回True
|
||||
return success_count > 0
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"调用AI生图API时发生异常: {e}")
|
||||
return False, None
|
||||
return False
|
||||
|
||||
def _encode_image_to_base64(self, img: Image.Image) -> str:
|
||||
"""
|
||||
|
||||
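The "safely handle the image-count config" step above coerces a possibly-invalid config value to an int before use. A sketch of that coercion plus clamping to the API's documented 1-4 range (the clamp bounds are an assumption taken from the "1-4张" descriptions in the config fields; the helper name is hypothetical):

```python
def clamp_image_number(raw, default: int = 1, lo: int = 1, hi: int = 4) -> int:
    # Coerce the raw config value to int; fall back to the default on
    # anything non-numeric, then clamp into [lo, hi].
    try:
        n = int(raw)
    except (TypeError, ValueError):
        return default
    return max(lo, min(hi, n))
```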
@@ -1,283 +0,0 @@
|
||||
"""
|
||||
NovelAI图片生成服务 - 空间插件专用
|
||||
独立实现,不依赖其他插件
|
||||
"""
|
||||
import io
|
||||
import random
|
||||
import uuid
|
||||
import zipfile
|
||||
from pathlib import Path
|
||||
|
||||
import aiohttp
|
||||
from PIL import Image
|
||||
|
||||
from src.common.logger import get_logger
|
||||
|
||||
logger = get_logger("MaiZone.NovelAIService")
|
||||
|
||||
|
||||
class MaiZoneNovelAIService:
|
||||
"""空间插件的NovelAI图片生成服务(独立实现)"""
|
||||
|
||||
def __init__(self, get_config):
|
||||
self.get_config = get_config
|
||||
|
||||
# NovelAI配置
|
||||
self.api_key = self.get_config("novelai.api_key", "")
|
||||
self.base_url = "https://image.novelai.net/ai/generate-image"
|
||||
self.model = "nai-diffusion-4-5-full"
|
||||
|
||||
# 代理配置
|
||||
proxy_host = self.get_config("novelai.proxy_host", "")
|
||||
proxy_port = self.get_config("novelai.proxy_port", 0)
|
||||
self.proxy = f"http://{proxy_host}:{proxy_port}" if proxy_host and proxy_port else ""
|
||||
|
||||
# 生成参数
|
||||
self.steps = 28
|
||||
self.scale = 5.0
|
||||
self.sampler = "k_euler"
|
||||
self.noise_schedule = "karras"
|
||||
|
||||
# 角色提示词(当LLM决定包含角色时使用)
|
||||
self.character_prompt = self.get_config("novelai.character_prompt", "")
|
||||
self.base_negative_prompt = self.get_config("novelai.base_negative_prompt", "nsfw, nude, explicit, sexual content, lowres, bad anatomy, bad hands")
|
||||
|
||||
# 图片保存目录(使用统一配置)
|
||||
plugin_dir = Path(__file__).parent.parent
|
||||
self.image_dir = plugin_dir / "images"
|
||||
self.image_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
if self.api_key:
|
||||
logger.info(f"NovelAI图片生成已配置,模型: {self.model}")
|
||||
|
||||
def is_available(self) -> bool:
|
||||
"""检查NovelAI服务是否可用"""
|
||||
return bool(self.api_key)
|
||||
|
||||
async def generate_image_from_prompt_data(
|
||||
self,
|
||||
prompt: str,
|
||||
negative_prompt: str | None = None,
|
||||
include_character: bool = False,
|
||||
width: int = 1024,
|
||||
height: int = 1024
|
||||
) -> tuple[bool, Path | None, str]:
|
||||
"""根据提示词生成图片
|
||||
|
||||
Args:
|
||||
prompt: NovelAI格式的英文提示词
|
||||
negative_prompt: LLM生成的负面提示词(可选)
|
||||
include_character: 是否包含角色形象
|
||||
width: 图片宽度
|
||||
height: 图片高度
|
||||
|
||||
Returns:
|
||||
(是否成功, 图片路径, 消息)
|
||||
"""
|
||||
if not self.api_key:
|
||||
return False, None, "NovelAI API Key未配置"
|
||||
|
||||
try:
|
||||
# 处理角色提示词
|
||||
final_prompt = prompt
|
||||
if include_character and self.character_prompt:
|
||||
final_prompt = f"{self.character_prompt}, {prompt}"
|
||||
logger.info("包含角色形象,添加角色提示词")
|
||||
|
||||
# 合并负面提示词
|
||||
final_negative = self.base_negative_prompt
|
||||
if negative_prompt:
|
||||
if final_negative:
|
||||
final_negative = f"{final_negative}, {negative_prompt}"
|
||||
else:
|
||||
final_negative = negative_prompt
|
||||
|
||||
logger.info("🎨 开始生成图片...")
|
||||
logger.info(f" 尺寸: {width}x{height}")
|
||||
logger.info(f" 正面提示词: {final_prompt[:100]}...")
|
||||
logger.info(f" 负面提示词: {final_negative[:100]}...")
|
||||
|
||||
# 构建请求payload
|
||||
payload = self._build_payload(final_prompt, final_negative, width, height)
|
||||
|
||||
# 发送请求
|
||||
image_data = await self._call_novelai_api(payload)
|
||||
if not image_data:
|
||||
return False, None, "API请求失败"
|
||||
|
||||
# 保存图片
|
||||
image_path = await self._save_image(image_data)
|
||||
if not image_path:
|
||||
return False, None, "图片保存失败"
|
||||
|
||||
logger.info(f"✅ 图片生成成功: {image_path}")
|
||||
return True, image_path, "生成成功"
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"生成图片时出错: {e}", exc_info=True)
|
||||
return False, None, f"生成失败: {e!s}"
|
||||
|
||||
def _build_payload(self, prompt: str, negative_prompt: str, width: int, height: int) -> dict:
|
||||
"""构建NovelAI API请求payload"""
|
||||
is_v4_model = "diffusion-4" in self.model
|
||||
is_v3_model = "diffusion-3" in self.model
|
||||
|
||||
parameters = {
|
||||
"width": width,
|
||||
"height": height,
|
||||
"scale": self.scale,
|
||||
"steps": self.steps,
|
||||
"sampler": self.sampler,
|
||||
"seed": random.randint(0, 9999999999),
|
||||
"n_samples": 1,
|
||||
"ucPreset": 0,
|
||||
"qualityToggle": True,
|
||||
"sm": False,
|
||||
"sm_dyn": False,
|
||||
"noise_schedule": self.noise_schedule if is_v4_model else "native",
|
||||
}
|
||||
|
||||
# V4.5模型使用新格式
|
||||
if is_v4_model:
|
||||
parameters.update({
|
||||
"params_version": 3,
|
||||
"cfg_rescale": 0,
|
||||
"autoSmea": False,
|
||||
"legacy": False,
|
||||
"legacy_v3_extend": False,
|
||||
"legacy_uc": False,
|
||||
"add_original_image": True,
|
||||
"controlnet_strength": 1,
|
||||
"dynamic_thresholding": False,
|
||||
"prefer_brownian": True,
|
||||
"normalize_reference_strength_multiple": True,
|
||||
"use_coords": True,
|
||||
"inpaintImg2ImgStrength": 1,
|
||||
"deliberate_euler_ancestral_bug": False,
|
||||
"skip_cfg_above_sigma": None,
|
||||
"characterPrompts": [],
|
||||
"stream": "msgpack",
|
||||
"v4_prompt": {
|
||||
"caption": {
|
||||
"base_caption": prompt,
|
||||
"char_captions": []
|
||||
},
|
||||
"use_coords": True,
|
||||
"use_order": True
|
||||
},
|
||||
"v4_negative_prompt": {
|
||||
"caption": {
|
||||
"base_caption": negative_prompt,
|
||||
"char_captions": []
|
||||
},
|
||||
"legacy_uc": False
|
||||
},
|
||||
"negative_prompt": negative_prompt,
|
||||
"reference_image_multiple": [],
|
||||
"reference_information_extracted_multiple": [],
|
||||
"reference_strength_multiple": []
|
||||
})
|
||||
# V3使用negative_prompt字段
|
||||
elif is_v3_model:
|
||||
parameters["negative_prompt"] = negative_prompt
|
||||
|
||||
payload = {
|
||||
"input": prompt,
|
||||
"model": self.model,
|
||||
"action": "generate",
|
||||
"parameters": parameters
|
||||
}
|
||||
|
||||
# V4.5需要额外字段
|
||||
if is_v4_model:
|
||||
payload["use_new_shared_trial"] = True
|
||||
|
||||
return payload
|
||||
|
||||
    async def _call_novelai_api(self, payload: dict) -> bytes | None:
        """Call the NovelAI API."""
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

        connector = None
        request_kwargs = {
            "json": payload,
            "headers": headers,
            "timeout": aiohttp.ClientTimeout(total=120)
        }

        if self.proxy:
            request_kwargs["proxy"] = self.proxy
            connector = aiohttp.TCPConnector()
            logger.info(f"Using proxy: {self.proxy}")

        try:
            async with aiohttp.ClientSession(connector=connector) as session:
                async with session.post(self.base_url, **request_kwargs) as resp:
                    if resp.status != 200:
                        error_text = await resp.text()
                        logger.error(f"API request failed ({resp.status}): {error_text[:200]}")
                        return None

                    img_data = await resp.read()
                    logger.info(f"Received response data: {len(img_data)} bytes")

                    # Check whether the response is a ZIP file
                    if img_data[:4] == b"PK\x03\x04":
                        logger.info("ZIP format detected, extracting...")
                        return self._extract_from_zip(img_data)
                    elif img_data[:4] == b"\x89PNG":
                        logger.info("PNG format detected")
                        return img_data
                    else:
                        logger.warning(f"Unknown file format, first 4 bytes: {img_data[:4].hex()}")
                        return img_data

        except Exception as e:
            logger.error(f"API call failed: {e}", exc_info=True)
            return None

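The format check above relies on magic bytes: ZIP archives start with `PK\x03\x04` and PNG files with `\x89PNG`. A tiny self-contained sniffer mirroring that logic (the helper name is illustrative, not part of the plugin):

```python
def sniff_image_payload(data: bytes) -> str:
    """Classify an API response body by its first four bytes."""
    if data[:4] == b"PK\x03\x04":
        return "zip"
    if data[:4] == b"\x89PNG":
        return "png"
    return "unknown"

print(sniff_image_payload(b"\x89PNG\r\n\x1a\n"))  # png
```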
    def _extract_from_zip(self, zip_data: bytes) -> bytes | None:
        """Extract the PNG from a ZIP archive."""
        try:
            with zipfile.ZipFile(io.BytesIO(zip_data)) as zf:
                for filename in zf.namelist():
                    if filename.lower().endswith(".png"):
                        img_data = zf.read(filename)
                        logger.info(f"Extracted from ZIP: {filename} ({len(img_data)} bytes)")
                        return img_data
                logger.error("No PNG file found in ZIP")
                return None
        except Exception as e:
            logger.error(f"Failed to extract ZIP: {e}")
            return None

    async def _save_image(self, image_data: bytes) -> Path | None:
        """Save the image to local disk."""
        try:
            filename = f"novelai_{uuid.uuid4().hex[:12]}.png"
            filepath = self.image_dir / filename

            # Write the file and force it to disk
            with open(filepath, "wb") as f:
                f.write(image_data)
                f.flush()
                import os
                os.fsync(f.fileno())

            # Verify the image
            try:
                with Image.open(filepath) as img:
                    img.verify()
                with Image.open(filepath) as img:
                    logger.info(f"Image verified: {img.format} {img.size}")
            except Exception as e:
                logger.warning(f"Image verification failed: {e}")

            return filepath

        except Exception as e:
            logger.error(f"Failed to save image: {e}")
            return None

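The save path flushes and fsyncs before verification so the bytes are really on disk when Pillow reopens the file; a minimal standalone sketch of that write pattern, using a temp directory:

```python
import os
import tempfile
from pathlib import Path

def write_durable(path: Path, data: bytes) -> Path:
    """Write bytes and force them to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # flush Python's userspace buffer
        os.fsync(f.fileno())   # ask the OS to flush its page cache
    return path

tmp = Path(tempfile.mkdtemp())
p = write_durable(tmp / "novelai_test.png", b"\x89PNG...")
print(p.stat().st_size)  # 7
```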
@@ -5,6 +5,7 @@ QZone service module

import asyncio
import base64
import os
import random
import time
from collections.abc import Callable

@@ -82,93 +83,21 @@ class QZoneService:
        return context

    async def send_feed(self, topic: str, stream_id: str | None) -> dict[str, Any]:
        """Post a feed (with AI illustration support)"""
        """Post a feed"""
        cross_context = await self._get_cross_context()
        story = await self.content_service.generate_story(topic, context=cross_context)
        if not story:
            return {"success": False, "message": "Failed to generate feed content"}

        # Check whether AI illustration is enabled
        ai_image_enabled = self.get_config("ai_image.enable_ai_image", False)
        provider = self.get_config("ai_image.provider", "siliconflow")

        image_path = None

        if ai_image_enabled:
            # AI illustration enabled: the text model generates the feed text plus an image prompt
            story, image_info = await self.content_service.generate_story_with_image_info(topic, context=cross_context)
            if not story:
                return {"success": False, "message": "Failed to generate feed content"}

            # Call the image generation service that matches the provider
            if provider == "novelai":
                try:
                    from .novelai_service import MaiZoneNovelAIService
                    novelai_service = MaiZoneNovelAIService(self.get_config)

                    if novelai_service.is_available():
                        # Resolve the aspect ratio
                        aspect_ratio = image_info.get("aspect_ratio", "方图")
                        size_map = {
                            "方图": (1024, 1024),
                            "横图": (1216, 832),
                            "竖图": (832, 1216),
                        }
                        width, height = size_map.get(aspect_ratio, (1024, 1024))

                        logger.info("🎨 Generating NovelAI illustration...")
                        success, img_path, msg = await novelai_service.generate_image_from_prompt_data(
                            prompt=image_info.get("prompt", ""),
                            negative_prompt=image_info.get("negative_prompt"),
                            include_character=image_info.get("include_character", False),
                            width=width,
                            height=height
                        )

                        if success and img_path:
                            image_path = img_path
                            logger.info("✅ NovelAI illustration generated")
                        else:
                            logger.warning(f"⚠️ NovelAI illustration failed: {msg}")
                    else:
                        logger.warning("NovelAI service unavailable (no API key configured)")

                except Exception as e:
                    logger.error(f"NovelAI illustration error: {e}", exc_info=True)

elif provider == "siliconflow":
|
||||
try:
|
||||
# 调用硅基流动生成图片
|
||||
success, img_path = await self.image_service.generate_image_from_prompt(
|
||||
prompt=image_info.get("prompt", ""),
|
||||
save_dir=None # 使用默认images目录
|
||||
)
|
||||
if success and img_path:
|
||||
image_path = img_path
|
||||
logger.info("✅ 硅基流动配图生成成功")
|
||||
else:
|
||||
logger.warning("⚠️ 硅基流动配图生成失败")
|
||||
except Exception as e:
|
||||
logger.error(f"硅基流动配图生成出错: {e}", exc_info=True)
|
||||
else:
|
||||
# 不使用AI配图:只生成说说文本
|
||||
story = await self.content_service.generate_story(topic, context=cross_context)
|
||||
if not story:
|
||||
return {"success": False, "message": "生成说说内容失败"}
|
||||
await self.image_service.generate_images_for_story(story)
|
||||
|
||||
qq_account = config_api.get_global_config("bot.qq_account", "")
|
||||
api_client = await self._get_api_client(qq_account, stream_id)
|
||||
if not api_client:
|
||||
return {"success": False, "message": "获取QZone API客户端失败"}
|
||||
|
||||
# 加载图片
|
||||
images_bytes = []
|
||||
|
||||
# 使用AI生成的图片
|
||||
if image_path and image_path.exists():
|
||||
try:
|
||||
with open(image_path, "rb") as f:
|
||||
images_bytes.append(f.read())
|
||||
logger.info("添加AI配图到说说")
|
||||
except Exception as e:
|
||||
logger.error(f"读取AI配图失败: {e}")
|
||||
image_dir = self.get_config("send.image_directory")
|
||||
images_bytes = self._load_local_images(image_dir)
|
||||
|
||||
try:
|
||||
success, _ = await api_client["publish"](story, images_bytes)
|
||||
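The aspect-ratio lookup in the NovelAI branch maps the text model's Chinese labels (square / landscape / portrait) to NovelAI-friendly dimensions, falling back to square for anything unrecognized; in isolation:

```python
# Keys are the labels the text model emits: 方图 = square, 横图 = landscape, 竖图 = portrait
SIZE_MAP = {
    "方图": (1024, 1024),
    "横图": (1216, 832),
    "竖图": (832, 1216),
}

def resolve_size(aspect_ratio: str) -> tuple[int, int]:
    """Fall back to a square canvas for unknown labels."""
    return SIZE_MAP.get(aspect_ratio, (1024, 1024))

print(resolve_size("横图"))      # (1216, 832)
print(resolve_size("panorama"))  # (1024, 1024)
```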
@@ -186,16 +115,19 @@ class QZoneService:
        if not story:
            return {"success": False, "message": "Failed to generate feed content from the activity"}

        if self.get_config("send.enable_ai_image", False):
            await self.image_service.generate_images_for_story(story)
        await self.image_service.generate_images_for_story(story)

        qq_account = config_api.get_global_config("bot.qq_account", "")
        # Note: scheduled tasks usually run in the background without a user session, so stream_id is None
        api_client = await self._get_api_client(qq_account, stream_id=None)
        if not api_client:
            return {"success": False, "message": "Failed to get the QZone API client"}

        image_dir = self.get_config("send.image_directory")
        images_bytes = self._load_local_images(image_dir)

        try:
            success, _ = await api_client["publish"](story, [])
            success, _ = await api_client["publish"](story, images_bytes)
            if success:
                return {"success": True, "message": story}
            return {"success": False, "message": "Failed to publish the feed to QZone"}

@@ -502,12 +434,7 @@ class QZoneService:
        logger.debug(f"Locking feed pending comment: {comment_key}")
        self.processing_comments.add(comment_key)
        try:
            # Use content_service to generate the comment (equivalent to replying to a friend's feed)
            comment_text = await self.content_service.generate_comment_reply(
                story_content=content or rt_con or "feed content",
                comment_content="",  # no comment body when commenting on a feed
                commenter_name=target_name
            )
            comment_text = await self.content_service.generate_comment(content, target_name, rt_con, images)
            if comment_text:
                success = await api_client["comment"](target_qq, fid, comment_text)
                if success:
@@ -538,6 +465,61 @@ class QZoneService:

        return result

    def _load_local_images(self, image_dir: str) -> list[bytes]:
        """Randomly load local images (without deleting the files)."""
        images = []
        if not image_dir or not os.path.exists(image_dir):
            logger.warning(f"Image directory missing or not configured: {image_dir}")
            return images

        try:
            # Collect all image files
            all_files = [
                f
                for f in os.listdir(image_dir)
                if os.path.isfile(os.path.join(image_dir, f))
                and f.lower().endswith((".jpg", ".jpeg", ".png", ".gif", ".bmp"))
            ]

            if not all_files:
                logger.warning(f"No image files found in directory: {image_dir}")
                return images

            # Check whether feed illustrations are enabled
            enable_image = bool(self.get_config("send.enable_image", False))
            if not enable_image:
                logger.info("Feed illustrations are disabled")
                return images

            # Pick the number of images from the config
            config_image_number = self.get_config("send.image_number", 1)
            try:
                config_image_number = int(config_image_number)
            except (ValueError, TypeError):
                config_image_number = 1
                logger.warning("Invalid image_number config value; falling back to 1")

            max_images = min(config_image_number, 9, len(all_files))  # at most 9 images
            selected_count = max(1, max_images)  # always select at least 1
            selected_files = random.sample(all_files, selected_count)

            logger.info(f"Randomly selected {selected_count} of {len(all_files)} images")

            for filename in selected_files:
                full_path = os.path.join(image_dir, filename)
                try:
                    with open(full_path, "rb") as f:
                        image_data = f.read()
                    images.append(image_data)
                    logger.info(f"Loaded image: {filename} ({len(image_data)} bytes)")
                except Exception as e:
                    logger.error(f"Failed to load image {filename}: {e}")

            return images
        except Exception as e:
            logger.error(f"Failed to load local images: {e}")
            return []

    def _generate_gtk(self, skey: str) -> str:
        hash_val = 5381
        for char in skey:

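The `_generate_gtk` body is cut off by the diff hunk, but the 5381 seed and per-character loop match the widely known QZone g_tk checksum (a djb2-style hash). A complete sketch under that assumption:

```python
def generate_gtk(skey: str) -> str:
    """Derive the QZone g_tk token from the session skey (djb2-style hash)."""
    hash_val = 5381
    for char in skey:
        hash_val += (hash_val << 5) + ord(char)
    return str(hash_val & 0x7FFFFFFF)  # clamp to a positive 31-bit integer

print(generate_gtk(""))  # 5381
```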
@@ -414,22 +414,7 @@ class NapcatAdapterPlugin(BasePlugin):
            "enable_emoji_like": ConfigField(type=bool, default=True, description="Enable emoji-reaction handling in group chats"),
            "enable_reply_at": ConfigField(type=bool, default=True, description="Automatically @-mention the original sender when replying"),
            "reply_at_rate": ConfigField(type=float, default=0.5, description="Probability of @-mentioning on reply (0.0-1.0)"),
            # ========== Video message handling ==========
            "enable_video_processing": ConfigField(
                type=bool,
                default=True,
                description="Enable video message handling (download and parsing). When disabled, video messages show as a [video message] placeholder and are not downloaded"
            ),
            "video_max_size_mb": ConfigField(
                type=int,
                default=100,
                description="Maximum video file size allowed for download (MB); larger videos are skipped"
            ),
            "video_download_timeout": ConfigField(
                type=int,
                default=60,
                description="Video download timeout in seconds; the download is aborted on timeout"
            ),
            "enable_video_processing": ConfigField(type=bool, default=True, description="Enable video message handling (download and parsing)"),
        },
    }

@@ -37,22 +37,11 @@ class MessageHandler:
    def __init__(self, adapter: "NapcatAdapter"):
        self.adapter = adapter
        self.plugin_config: dict[str, Any] | None = None
        self._video_downloader = None

    def set_plugin_config(self, config: dict[str, Any]) -> None:
        """Set the plugin config and initialize the video downloader from it."""
        """Set the plugin config."""
        self.plugin_config = config

        # If video processing is enabled, initialize the video downloader from the config
        if config_api.get_plugin_config(config, "features.enable_video_processing", True):
            from ..video_handler import VideoDownloader

            max_size = config_api.get_plugin_config(config, "features.video_max_size_mb", 100)
            timeout = config_api.get_plugin_config(config, "features.video_download_timeout", 60)

            self._video_downloader = VideoDownloader(max_size_mb=max_size, download_timeout=timeout)
            logger.debug(f"Video downloader initialized: max_size={max_size}MB, timeout={timeout}s")

    async def handle_raw_message(self, raw: dict[str, Any]):
        """
        Process a raw message and convert it into a MessageEnvelope

@@ -116,11 +105,6 @@ class MessageHandler:
                if seg_message:
                    seg_list.append(seg_message)

        # Defensive check: ensure at least one segment so an empty message cannot break building
        if not seg_list:
            logger.warning("Message content is empty; adding placeholder text")
            seg_list.append({"type": "text", "data": "[empty message]"})

        msg_builder.format_info(
            content_format=[seg["type"] for seg in seg_list],
            accept_format=ACCEPT_FORMAT,

@@ -318,7 +302,7 @@ class MessageHandler:
        video_source = file_path if file_path else video_url
        if not video_source:
            logger.warning("Video message is missing both URL and file path")
            return {"type": "text", "data": "[video message]"}
            return None

        try:
            if file_path and Path(file_path).exists():

@@ -336,17 +320,14 @@ class MessageHandler:
                    },
                }
            elif video_url:
                # URL download path: use the downloader instance built from the config
                downloader = self._video_downloader
                if not downloader:
                    from ..video_handler import get_video_downloader
                    downloader = get_video_downloader()

                download_result = await downloader.download_video(video_url)
                # URL download path
                from ..video_handler import get_video_downloader
                video_downloader = get_video_downloader()
                download_result = await video_downloader.download_video(video_url)

                if not download_result["success"]:
                    logger.warning(f"Video download failed: {download_result.get('error', 'unknown error')}")
                    return {"type": "text", "data": f"[video message] ({download_result.get('error', 'download failed')})"}
                    return None

                video_base64 = base64.b64encode(download_result["data"]).decode("utf-8")
                logger.debug(f"Video downloaded, size: {len(download_result['data']) / (1024 * 1024):.2f} MB")

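The base64 wrapping used above inflates the payload by roughly 4/3 (every 3 raw bytes become 4 output characters, padded), which is worth keeping in mind when sizing `video_max_size_mb`:

```python
import base64

def base64_size(n_bytes: int) -> int:
    """Encoded length of n raw bytes: 4 output chars per 3 input bytes, padded."""
    return 4 * ((n_bytes + 2) // 3)

raw = b"\x00" * 3000
encoded = base64.b64encode(raw).decode("utf-8")
print(len(encoded), base64_size(len(raw)))  # 4000 4000
```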
@@ -362,11 +343,11 @@ class MessageHandler:
                }
            else:
                logger.warning("Neither a valid local file path nor a valid video URL is available")
                return {"type": "text", "data": "[video message]"}
                return None

        except Exception as e:
            logger.error(f"Video message handling failed: {e!s}")
            return {"type": "text", "data": "[video message handling error]"}
            return None

    async def _handle_rps_message(self, segment: dict) -> SegPayload:
        """Handle rock-paper-scissors messages."""

@@ -140,7 +140,7 @@ class PokeAction(BaseAction):

    # === Basic information (required) ===
    action_name = "poke_user"
    action_description = "Poke another user. This interaction must be used with care and defaults to a single poke. Be restrained in group chats; in private chats you may be more proactive."
    action_description = "Lets you poke other users to add a bit of fun to the interaction."
    activation_type = ActionActivationType.ALWAYS
    parallel_action = True

@@ -148,48 +148,17 @@ class PokeAction(BaseAction):
    action_parameters: ClassVar[dict] = {
        "user_name": "name of the user to poke (optional)",
        "user_id": "ID of the user to poke (optional, takes priority)",
        "times": "number of pokes (default 1, at most 3)",
        "times": "number of pokes (default 1)",
    }
    action_require: ClassVar[list] = [
        "must be used when the user explicitly asks to poke someone",
        "private chats: may be used proactively at suitable moments (e.g. returning a poke, playful banter)",
        "group chats: be very restrained; only use when explicitly requested or clearly justified",
    ]
    action_require: ClassVar[list] = ["use when you need to poke a user", "use when you want to get a specific user's attention"]
    llm_judge_prompt = """
Conditions for deciding whether to use the poke action:

**Strict rules that must be followed:**
1. **Explicit user request**: when the user gives a direct instruction such as "poke XX", the action must be used.

2. **Group chats (be very restrained)**:
   - Be very cautious in group chats and avoid disturbing others in public
   - Only consider it when the user explicitly asks, someone urgently needs a reminder, or there is a particularly good reason
   - If anyone in the group has been poked recently, answer "no"
   - Do not use it proactively in group chats unless clearly necessary

3. **Private chats (may be proactive)**:
   - You can be more proactive and playful in private chats
   - It can be used to return a poke, during a light-hearted exchange, or to add some fun
   - But avoid frequent use if you have poked this person recently

4. **Frequency limits (important)**:
   - If the same person has been poked recently, answer "no"
   - Poke only once by default; do not poke repeatedly unless the user explicitly asks for it
   - Watch the other person's mood; if they seem upset or unwilling to be disturbed, answer "no"

5. **Forbidden cases**:
   - Never use it when the other person is sad, angry, or impatient
   - Never use it in serious conversations
   - Never use it again right after a poke

**Decision logic**:
- First determine whether this is a group chat or a private chat
- Group chat: answer "no" unless the user explicitly asks or it is clearly necessary
- Private chat: may be used proactively when the mood fits, mindful of frequency
- Check whether a poke happened recently; if so, answer "no"
- Assess whether the other person's mood and the conversation are suitable

Strictly follow the rules above and answer only "yes" or "no".
1. **Timing**: this is a fun interaction, usable when you want to get someone's attention or simply as a joke.
2. **User request**: when the user explicitly asks for a poke.
3. **Context**: when the context clearly calls for poking one or more people.
4. **Frequency and mood**: if you have poked recently, or the other person seems to be in a bad mood, avoid it and don't disturb them.

Follow the rules above and answer "yes" or "no".
"""
    associated_types: ClassVar[list[str]] = ["text"]


@@ -1,5 +1,5 @@
[inner]
version = "8.0.2"
version = "8.0.0"

#----The following is for developers; if you only deploy MoFox-Bot, you don't need to read it----
#If you want to modify the config file, increment the version value

@@ -309,12 +309,8 @@ perceptual_activation_threshold = 3 # activation threshold (recall count → short-term)
# Short-term memory layer
short_term_max_memories = 30 # maximum number of short-term memories
short_term_transfer_threshold = 0.6 # importance threshold for transfer to long-term memory
short_term_enable_force_cleanup = true # enable pressure relief (recommended for high-frequency scenarios)
short_term_search_top_k = 5 # maximum number of search results
short_term_decay_factor = 0.98 # decay factor
short_term_overflow_strategy = "transfer_all" # short-term memory overflow strategy
# "transfer_all": transfer all memories to long-term memory at once, then delete low-importance ones (recommended default)
# "selective_cleanup": selective cleanup; only transfer high-importance memories and delete low-importance ones directly

# Long-term memory layer
use_judge = true # use a judge model to decide whether to query long-term memory

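With `short_term_decay_factor = 0.98` and `short_term_transfer_threshold = 0.6`, a memory whose weight decays geometrically per cycle crosses the threshold after a couple dozen cycles. This is my reading of how the two values interact, not code from the project:

```python
def cycles_below(threshold: float, factor: float = 0.98, start: float = 1.0) -> int:
    """Number of decay cycles until start * factor**n drops below threshold."""
    n = 0
    value = start
    while value >= threshold:
        value *= factor
        n += 1
    return n

print(cycles_below(0.6))  # 26
```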
@@ -642,20 +638,6 @@ enable_continuous_thinking = true # whether to keep updating inner thoughts while waiting
# Leave empty to disable
custom_decision_prompt = ""

# --- Prompt / context-building configuration ---
[kokoro_flow_chatter.prompt]
# Activity stream format (what recently happened between you)
# - "narrative": linear narrative (more natural, but lower information density; details get lost over long spans)
# - "table": structured table (higher information density, easier for the model to align fields; recommended)
# - "both": output both table and narrative (for comparison/debugging; higher token cost)
activity_stream_format = "table"

# Maximum number of activity-stream entries to keep (larger = more complete, but more tokens)
max_activity_entries = 5

# Maximum characters per table cell / narrative entry (trimming guard so one long entry cannot blow up the context)
max_entry_length = 500

# --- Waiting strategy ---
[kokoro_flow_chatter.waiting]
default_max_wait_seconds = 300 # default when the LLM does not specify a wait time

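`max_entry_length = 500` implies each activity-stream entry is clipped before it reaches the prompt; a plausible trimming helper (illustrative, not the project's implementation):

```python
def clip_entry(text: str, max_len: int = 500, ellipsis: str = "…") -> str:
    """Trim an activity-stream entry so no single cell blows up the context."""
    if len(text) <= max_len:
        return text
    return text[: max_len - len(ellipsis)] + ellipsis

print(len(clip_entry("x" * 1000)))  # 500
```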