Merge branch 'main-fix' of https://github.com/SengokuCola/MaiMBot into main-fix

This commit is contained in:
tcmofashi
2025-03-23 14:07:25 +08:00
17 changed files with 982 additions and 40 deletions

View File

@@ -155,6 +155,8 @@ MaiMBot是一个开源项目我们非常欢迎你的参与。你的贡献
- [🐳 Docker部署指南](docs/docker_deploy.md)
- [🖥️群晖 NAS 部署指南](docs/synology_deploy.md)
### 配置说明
- [🎀 新手配置指南](docs/installation_cute.md) - 通俗易懂的配置教程,适合初次使用的猫娘

View File

@@ -38,6 +38,9 @@
### MongoDB相关问题
- 我应该怎么清空bot内存储的表情包 ❓
>需要先安装`MongoDB Compass`[下载链接](https://www.mongodb.com/try/download/compass),软件支持`macOS、Windows、Ubuntu、Redhat`系统
>以Windows为例，保持如图所示选项，点击`Download`即可。如果是其他系统,请在`Platform`中自行选择:
><img src="./pic/compass_downloadguide.png" width=400>
>打开你的MongoDB Compass软件，你会在左上角看到这样的一个界面
>
@@ -68,7 +71,9 @@
- 为什么我连接不上MongoDB服务器 ❓
>这个问题比较复杂,但是你可以按照下面的步骤检查,看看具体是什么问题
>
>#### Windows
> 1. 检查有没有把 mongod.exe 所在的目录添加到 path。 具体可参照
>
>&emsp;&emsp;[CSDN-windows10设置环境变量Path详细步骤](https://blog.csdn.net/flame_007/article/details/106401215)
@@ -113,3 +118,172 @@
>MONGODB_PORT=27017 #修改这里
>DATABASE_NAME=MegBot
>```
<details>
<summary>Linux（点击展开）</summary>
#### **1. 检查 MongoDB 服务是否运行**
- **命令**
```bash
systemctl status mongod # 检查服务状态（Ubuntu/Debian/CentOS 7+）
service mongod status # 旧版系统（如 CentOS 6）
```
- **可能结果**
- 如果显示 `active (running)`,服务已启动。
- 如果未运行,启动服务:
```bash
sudo systemctl start mongod # 启动服务
sudo systemctl enable mongod # 设置开机自启
```
---
#### **2. 检查 MongoDB 端口监听**
MongoDB 默认使用 **27017** 端口。
- **检查端口是否被监听**
```bash
sudo ss -tulnp | grep 27017
sudo netstat -tulnp | grep 27017
```
- **预期结果**
```bash
tcp LISTEN 0 128 0.0.0.0:27017 0.0.0.0:* users:(("mongod",pid=123,fd=11))
```
- 如果无输出,说明 MongoDB 未监听端口。
---
#### **3. 检查防火墙设置**
- **Ubuntu/Debian（UFW 防火墙）**
```bash
sudo ufw status # 查看防火墙状态
sudo ufw allow 27017/tcp # 开放 27017 端口
sudo ufw reload # 重新加载规则
```
- **CentOS/RHEL（firewalld）**
```bash
sudo firewall-cmd --list-ports # 查看已开放端口
sudo firewall-cmd --add-port=27017/tcp --permanent # 永久开放端口
sudo firewall-cmd --reload # 重新加载
```
- **云服务器用户注意**:检查云平台安全组规则,确保放行 27017 端口。
---
#### **4. 检查端口占用**
如果 MongoDB 服务无法监听端口,可能是其他进程占用了 `27017` 端口。
- **检查端口占用进程**
```bash
sudo lsof -i :27017 # 查看占用 27017 端口的进程
sudo ss -ltnp 'sport = :27017' # 使用 ss 过滤端口
```
- **结果示例**
```bash
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 1234 root 12u IPv4 123456 0t0 TCP *:27017 (LISTEN)
```
- 输出会显示占用端口的 **进程名** 和 **PID**(此处 `PID=1234`)。
- **解决方案**
1. **终止占用进程**(谨慎操作!确保进程非关键):
```bash
sudo kill 1234 # 正常终止进程
sudo kill -9 1234 # 强制终止(若正常终止无效)
```
2. **修改端口**
编辑麦麦目录里的`.env.dev`文件,修改端口号:
```ini
MONGODB_HOST=127.0.0.1
MONGODB_PORT=27018 #修改这里，填写一个未被占用的端口
DATABASE_NAME=MegBot
```
##### **注意事项**
- 终止进程前,务必确认该进程不是系统关键服务(如有未知进程占用,建议先排查其来源)。如果无法确定该进程是否关键,请改用修改端口的方案。
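After editing `.env.dev`, you can sanity-check the values before restarting the bot. A minimal, standard-library-only sketch (the key names follow the `.env.dev` example above; `load_env` is an illustrative helper, not part of MaiMBot):

```python
import tempfile
from pathlib import Path

def load_env(path):
    """Parse simple KEY=VALUE lines, skipping blanks and inline '#'
    comments, matching the .env.dev format shown above."""
    env = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if "=" in line:
            key, value = line.split("=", 1)
            env[key.strip()] = value.strip()
    return env

# Demonstrate with the example configuration from above.
sample = "MONGODB_HOST=127.0.0.1\nMONGODB_PORT=27017  # change me\nDATABASE_NAME=MegBot\n"
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False, encoding="utf-8") as f:
    f.write(sample)
env = load_env(f.name)
print(f"mongodb://{env['MONGODB_HOST']}:{env['MONGODB_PORT']}/{env['DATABASE_NAME']}")
# mongodb://127.0.0.1:27017/MegBot
```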
</details>
<details>
<summary>macOS（点击展开）</summary>
### **1. 检查 MongoDB 服务状态**
**问题原因**：MongoDB 服务未启动
**操作步骤**
```bash
# 查看 MongoDB 是否正在运行（Homebrew 安装的默认服务名）
brew services list | grep mongodb
# 如果状态为 "stopped" 或 "error",手动启动
brew services start mongodb-community@8.0
```
✅ **预期结果**:输出显示 `started` 或 `running`
❌ **失败处理**
- 若报错 `unrecognized service`,可能未正确安装 MongoDB建议[重新安装](https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-os-x/#install-mongodb-community-edition)。
---
### **2. 检查端口是否被占用**
**问题原因**:其他程序占用了 MongoDB 的默认端口(`27017`),导致服务无法启动或连接
**操作步骤**
```bash
# 检查 27017 端口占用情况(需 sudo 权限查看完整信息)
sudo lsof -i :27017
# 或使用 netstat 快速检测
netstat -an | grep 27017
```
✅ **预期结果**
- 若无 MongoDB 运行,应无输出
- 若 MongoDB 已启动,应显示 `mongod` 进程
❌ **发现端口被占用**
#### **解决方案1：终止占用进程**
1. 从 `lsof` 输出中找到占用端口的 **PID**(进程号)
2. 强制终止该进程(谨慎操作!确保进程非关键):
```bash
kill -9 PID # 替换 PID 为实际数字（例如 kill -9 12345）
```
3. 重新启动 MongoDB 服务:
```bash
brew services start mongodb-community@8.0
```
#### **解决方案2：修改端口**
编辑麦麦目录里的`.env.dev`文件,修改端口号:
```ini
MONGODB_HOST=127.0.0.1
MONGODB_PORT=27018 #修改这里，填写一个未被占用的端口
DATABASE_NAME=MegBot
```
---
### **3. 检查防火墙设置**
**问题原因**：macOS 防火墙阻止连接
**操作步骤**
1. 打开 **系统设置 > 隐私与安全性 > 防火墙**
2. 临时关闭防火墙测试连接
3. 若需长期开放,添加 MongoDB 到防火墙允许列表(通过终端或 GUI)
---
### **4. 重置 MongoDB 环境**
***仅在以上步骤都无效时使用***
**适用场景**:配置混乱导致无法修复
```bash
# 停止服务并删除数据
brew services stop mongodb-community@8.0
rm -rf /usr/local/var/mongodb
# 重新初始化(确保目录权限)
sudo mkdir -p /usr/local/var/mongodb
sudo chown -R $(whoami) /usr/local/var/mongodb
# 重新启动
brew services start mongodb-community@8.0
```
</details>

View File

@@ -2,7 +2,7 @@
## 事前准备
为了能使麦麦不间断的运行,你需要一台一直开着的主机
为了能使麦麦不间断的运行,你需要一台一直开着的服务器
### 如果你想购买服务器
华为云、阿里云、腾讯云等等都是在国内可以选择的选择。
@@ -12,6 +12,8 @@
### 如果你不想购买服务器
你可以准备一台可以一直开着的电脑/主机,只需要保证能够正常访问互联网即可
**下文将统称它们为`服务器`**
我们假设你已经有了一台Linux系统的服务器。下文以Ubuntu 24.04为例，其他发行版原理相似。
## 0.我们就从零开始吧
@@ -120,6 +122,7 @@ sudo apt install python-is-python3
```
## 3.MongoDB的安装
*如果你是参考[官方文档](https://www.mongodb.com/zh-cn/docs/manual/administration/install-on-linux/#std-label-install-mdb-community-edition-linux)进行安装的,可跳过此步*
``` bash
cd /moi/mai
@@ -156,6 +159,7 @@ sudo systemctl enable mongod
curl -o napcat.sh https://nclatest.znin.net/NapNeko/NapCat-Installer/main/script/install.sh && sudo bash napcat.sh
```
执行后脚本会自动帮你部署好QQ及Napcat
*注：如果你已经手动安装了Napcat和QQ，可忽略此步*
成功的标志是：输入 `napcat` 后会出现炫酷的彩虹色界面
@@ -225,7 +229,8 @@ bot
└─ bot_config.toml
```
要会vim直接在终端里修改也行不过也可以把它们下到本地改好再传上去
可以使用vim、nano等编辑器直接在终端里修改这些配置文件；但如果你不熟悉它们的操作，也可以使用带图形界面的编辑器。
如果你的麦麦部署在远程服务器,也可以把它们下载到本地改好再传上去
### step 5 文件配置
@@ -244,7 +249,7 @@ bot
- [⚙️ 标准配置指南](./installation_standard.md) - 简明专业的配置说明,适合有经验的用户
**step # 6** 运行
### step 6 运行
现在再运行

View File

@@ -24,9 +24,6 @@
---
## 一键部署
请下载并运行项目根目录中的run.sh，并按照提示安装部署。完成后请参照后续配置指南进行配置
## 环境配置
### 1⃣ **确认Python版本**

Binary file not shown.


View File

@@ -1,5 +1,5 @@
call conda activate niuniu
cd src\gui
start /b python reasoning_gui.py
@REM call conda activate niuniu
cd ../src\gui
start /b ../../venv/scripts/python.exe reasoning_gui.py
exit

View File

@@ -6,6 +6,8 @@ import time
from datetime import datetime
from typing import Dict, List
from typing import Optional
sys.path.insert(0, sys.path[0]+"/../")
from src.common.logger import get_module_logger
import customtkinter as ctk

View File

@@ -296,7 +296,7 @@ class ChatBot:
return
raw_message = f"[戳了戳]{global_config.BOT_NICKNAME}" # 默认类型
if info := event.raw_info:
if info := event.model_extra["raw_info"]:
poke_type = info[2].get("txt", "戳了戳") # 戳戳类型,例如“拍一拍”、“揉一揉”、“捏一捏”
custom_poke_message = info[4].get("txt", "") # 自定义戳戳消息,若不存在会为空字符串
raw_message = f"[{poke_type}]{global_config.BOT_NICKNAME}{custom_poke_message}"

View File

@@ -260,9 +260,15 @@ def split_into_sentences_w_remove_punctuation(text: str) -> List[str]:
# print(f"处理前的文本: {text}")
# 统一将英文逗号转换为中文逗号
text = text.replace(",", "，")
text = text.replace("\n", " ")
# 检查是否为西文字符段落
if not is_western_paragraph(text):
# 当语言为中文时,统一将英文逗号转换为中文逗号
text = text.replace(",", "，")
text = text.replace("\n", " ")
else:
# 用"|seg|"作为分割符分开
text = re.sub(r"([.!?]) +", r"\1\|seg\|", text)
text = text.replace("\n", "\|seg\|")
text, mapping = protect_kaomoji(text)
# print(f"处理前的文本: {text}")
@@ -285,21 +291,29 @@ def split_into_sentences_w_remove_punctuation(text: str) -> List[str]:
for sentence in sentences:
parts = sentence.split("，")
current_sentence = parts[0]
for part in parts[1:]:
if random.random() < split_strength:
if not is_western_paragraph(current_sentence):
for part in parts[1:]:
if random.random() < split_strength:
new_sentences.append(current_sentence.strip())
current_sentence = part
else:
current_sentence += "，" + part
# 处理空格分割
space_parts = current_sentence.split(" ")
current_sentence = space_parts[0]
for part in space_parts[1:]:
if random.random() < split_strength:
new_sentences.append(current_sentence.strip())
current_sentence = part
else:
current_sentence += " " + part
else:
# 处理分割符
space_parts = current_sentence.split("\|seg\|")
current_sentence = space_parts[0]
for part in space_parts[1:]:
new_sentences.append(current_sentence.strip())
current_sentence = part
else:
current_sentence += "，" + part
# 处理空格分割
space_parts = current_sentence.split(" ")
current_sentence = space_parts[0]
for part in space_parts[1:]:
if random.random() < split_strength:
new_sentences.append(current_sentence.strip())
current_sentence = part
else:
current_sentence += " " + part
new_sentences.append(current_sentence.strip())
sentences = [s for s in new_sentences if s] # 移除空字符串
sentences = recover_kaomoji(sentences, mapping)
@@ -308,10 +322,12 @@ def split_into_sentences_w_remove_punctuation(text: str) -> List[str]:
sentences_done = []
for sentence in sentences:
sentence = sentence.rstrip(",")
if random.random() < split_strength * 0.5:
sentence = sentence.replace("，", "").replace(",", "")
elif random.random() < split_strength:
sentence = sentence.replace("，", " ").replace(",", " ")
# 西文字符句子不进行随机合并
if not is_western_paragraph(sentence):
if random.random() < split_strength * 0.5:
sentence = sentence.replace("，", "").replace(",", "")
elif random.random() < split_strength:
sentence = sentence.replace("，", " ").replace(",", " ")
sentences_done.append(sentence)
logger.debug(f"处理后的句子: {sentences_done}")
@@ -347,7 +363,11 @@ def random_remove_punctuation(text: str) -> str:
def process_llm_response(text: str) -> List[str]:
# processed_response = process_text_with_typos(content)
if len(text) > 100:
# 对西文字符段落的回复长度设置为汉字字符的两倍
if len(text) > 100 and not is_western_paragraph(text):
logger.warning(f"回复过长 ({len(text)} 字符),返回默认回复")
return ["懒得说"]
elif len(text) > 200:
logger.warning(f"回复过长 ({len(text)} 字符),返回默认回复")
return ["懒得说"]
# 处理长消息
@@ -509,3 +529,13 @@ def recover_kaomoji(sentences, placeholder_to_kaomoji):
sentence = sentence.replace(placeholder, kaomoji)
recovered_sentences.append(sentence)
return recovered_sentences
def is_western_char(char):
"""检测是否为西文字符"""
return len(char.encode('utf-8')) <= 2
def is_western_paragraph(paragraph):
"""检测是否为西文字符段落"""
return all(is_western_char(char) for char in paragraph if char.isalnum())
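The two helper functions above rely on a UTF-8 byte-length heuristic. A standalone sketch of the same idea (mirroring the added code, not importing the module itself):

```python
def is_western_char(char: str) -> bool:
    # Characters whose UTF-8 encoding fits in 2 bytes (ASCII plus most
    # Latin/Greek/Cyrillic letters) are treated as "Western"; CJK
    # characters need 3 or more bytes.
    return len(char.encode("utf-8")) <= 2

def is_western_paragraph(paragraph: str) -> bool:
    # Only alphanumeric characters are checked, so punctuation and
    # whitespace never affect the classification.
    return all(is_western_char(c) for c in paragraph if c.isalnum())

print(is_western_paragraph("Hello, world 123"))  # True
print(is_western_paragraph("你好 world"))         # False
```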

View File

@@ -10,6 +10,12 @@ from pathlib import Path
import matplotlib.pyplot as plt
import networkx as nx
from dotenv import load_dotenv
sys.path.insert(0, sys.path[0]+"/../")
from src.common.logger import get_module_logger
import jieba
# from chat.config import global_config
@@ -17,7 +23,6 @@ import jieba
root_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../.."))
sys.path.append(root_path)
from src.common.logger import get_module_logger # noqa: E402
from src.common.database import db # noqa E402
from src.plugins.memory_system.offline_llm import LLMModel # noqa E402

View File

@@ -274,6 +274,7 @@ class LLM_request:
raise RuntimeError(f"请求被拒绝: {error_code_mapping.get(response.status)}")
response.raise_for_status()
reasoning_content = ""
# 将流式输出转化为非流式输出
if stream_mode:
@@ -303,6 +304,8 @@ class LLM_request:
accumulated_content += delta_content
# 检测流式输出文本是否结束
finish_reason = chunk["choices"][0].get("finish_reason")
if delta.get("reasoning_content", None):
reasoning_content += delta["reasoning_content"]
if finish_reason == "stop":
chunk_usage = chunk.get("usage", None)
if chunk_usage:
@@ -314,7 +317,6 @@ class LLM_request:
except Exception as e:
logger.exception(f"解析流式输出错误: {str(e)}")
content = accumulated_content
reasoning_content = ""
think_match = re.search(r"<think>(.*?)</think>", content, re.DOTALL)
if think_match:
reasoning_content = think_match.group(1).strip()
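The change above accumulates `reasoning_content` from streaming deltas and falls back to `<think>` tags when the provider embeds reasoning inline. An illustrative sketch under the OpenAI-style chunk shape the code already assumes (`collect_stream` is a hypothetical helper, not the real `LLM_request` method):

```python
import re

def collect_stream(chunks):
    """Accumulate normal content and reasoning content from streamed
    delta dicts; if reasoning only appears inside <think> tags in the
    content, extract it afterwards."""
    content, reasoning = "", ""
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        content += delta.get("content") or ""
        if delta.get("reasoning_content"):
            reasoning += delta["reasoning_content"]
    if not reasoning:
        m = re.search(r"<think>(.*?)</think>", content, re.DOTALL)
        if m:
            reasoning = m.group(1).strip()
            content = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()
    return content, reasoning

chunks = [
    {"choices": [{"delta": {"reasoning_content": "step 1"}}]},
    {"choices": [{"delta": {"content": "answer"}}]},
]
print(collect_stream(chunks))  # ('answer', 'step 1')
```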

View File

@@ -0,0 +1,351 @@
"""
基于聊天记录的人格特征分析系统
"""
from typing import Dict, List
import json
import os
from pathlib import Path
from dotenv import load_dotenv
import sys
import random
from collections import defaultdict
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime
import matplotlib.font_manager as fm
current_dir = Path(__file__).resolve().parent
project_root = current_dir.parent.parent.parent
env_path = project_root / ".env.prod"
root_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../.."))
sys.path.append(root_path)
from src.plugins.personality.scene import get_scene_by_factor, PERSONALITY_SCENES # noqa: E402
from src.plugins.personality.questionnaire import FACTOR_DESCRIPTIONS # noqa: E402
from src.plugins.personality.offline_llm import LLMModel # noqa: E402
from src.plugins.personality.who_r_u import MessageAnalyzer # noqa: E402
# 加载环境变量
if env_path.exists():
print(f"{env_path} 加载环境变量")
load_dotenv(env_path)
else:
print(f"未找到环境变量文件: {env_path}")
print("将使用默认配置")
class ChatBasedPersonalityEvaluator:
def __init__(self):
self.personality_traits = {"开放性": 0, "严谨性": 0, "外向性": 0, "宜人性": 0, "神经质": 0}
self.scenarios = []
self.message_analyzer = MessageAnalyzer()
self.llm = LLMModel()
self.trait_scores_history = defaultdict(list) # 记录每个特质的得分历史
# 为每个人格特质获取对应的场景
for trait in PERSONALITY_SCENES:
scenes = get_scene_by_factor(trait)
if not scenes:
continue
scene_keys = list(scenes.keys())
selected_scenes = random.sample(scene_keys, min(3, len(scene_keys)))
for scene_key in selected_scenes:
scene = scenes[scene_key]
other_traits = [t for t in PERSONALITY_SCENES if t != trait]
secondary_trait = random.choice(other_traits)
self.scenarios.append({
"场景": scene["scenario"],
"评估维度": [trait, secondary_trait],
"场景编号": scene_key
})
def analyze_chat_context(self, messages: List[Dict]) -> str:
"""
分析一组消息的上下文,生成场景描述
"""
context = ""
for msg in messages:
nickname = msg.get('user_info', {}).get('user_nickname', '未知用户')
content = msg.get('processed_plain_text', msg.get('detailed_plain_text', ''))
if content:
context += f"{nickname}: {content}\n"
return context
def evaluate_chat_response(
self, user_nickname: str, chat_context: str, dimensions: List[str] = None) -> Dict[str, float]:
"""
评估聊天内容在各个人格维度上的得分
"""
# 使用所有维度进行评估
dimensions = list(self.personality_traits.keys())
dimension_descriptions = []
for dim in dimensions:
desc = FACTOR_DESCRIPTIONS.get(dim, "")
if desc:
dimension_descriptions.append(f"- {dim}{desc}")
dimensions_text = "\n".join(dimension_descriptions)
prompt = f"""请根据以下聊天记录,评估"{user_nickname}"在大五人格模型中的维度得分1-6分
聊天记录:
{chat_context}
需要评估的维度说明:
{dimensions_text}
请按照以下格式输出评估结果,注意,你的评价对象是"{user_nickname}"仅输出JSON格式
{{
"开放性": 分数,
"严谨性": 分数,
"外向性": 分数,
"宜人性": 分数,
"神经质": 分数
}}
评分标准:
1 = 非常不符合该维度特征
2 = 比较不符合该维度特征
3 = 有点不符合该维度特征
4 = 有点符合该维度特征
5 = 比较符合该维度特征
6 = 非常符合该维度特征
如果你觉得某个维度没有相关信息或者无法判断请输出0分
请根据聊天记录的内容和语气结合维度说明进行评分。如果维度可以评分确保分数在1-6之间。如果没有体现请输出0分"""
try:
ai_response, _ = self.llm.generate_response(prompt)
start_idx = ai_response.find("{")
end_idx = ai_response.rfind("}") + 1
if start_idx != -1 and end_idx != 0:
json_str = ai_response[start_idx:end_idx]
scores = json.loads(json_str)
return {k: max(0, min(6, float(v))) for k, v in scores.items()}
else:
print("AI响应格式不正确使用默认评分")
return {dim: 0 for dim in dimensions}
except Exception as e:
print(f"评估过程出错:{str(e)}")
return {dim: 0 for dim in dimensions}
def evaluate_user_personality(self, qq_id: str, num_samples: int = 10, context_length: int = 5) -> Dict:
"""
基于用户的聊天记录评估人格特征
Args:
qq_id (str): 用户QQ号
num_samples (int): 要分析的聊天片段数量
context_length (int): 每个聊天片段的上下文长度
Returns:
Dict: 评估结果
"""
# 获取用户的随机消息及其上下文
chat_contexts, user_nickname = self.message_analyzer.get_user_random_contexts(
qq_id, num_messages=num_samples, context_length=context_length)
if not chat_contexts:
return {"error": f"没有找到QQ号 {qq_id} 的消息记录"}
# 初始化评分
final_scores = defaultdict(float)
dimension_counts = defaultdict(int)
chat_samples = []
# 清空历史记录
self.trait_scores_history.clear()
# 分析每个聊天上下文
for chat_context in chat_contexts:
# 评估这段聊天内容的所有维度
scores = self.evaluate_chat_response(user_nickname, chat_context)
# 记录样本
chat_samples.append({
"聊天内容": chat_context,
"评估维度": list(self.personality_traits.keys()),
"评分": scores
})
# 更新总分和历史记录
for dimension, score in scores.items():
if score > 0: # 只统计大于0的有效分数
final_scores[dimension] += score
dimension_counts[dimension] += 1
self.trait_scores_history[dimension].append(score)
# 计算平均分
average_scores = {}
for dimension in self.personality_traits:
if dimension_counts[dimension] > 0:
average_scores[dimension] = round(final_scores[dimension] / dimension_counts[dimension], 2)
else:
average_scores[dimension] = 0 # 如果没有有效分数返回0
# 生成趋势图
self._generate_trend_plot(qq_id, user_nickname)
result = {
"用户QQ": qq_id,
"用户昵称": user_nickname,
"样本数量": len(chat_samples),
"人格特征评分": average_scores,
"维度评估次数": dict(dimension_counts),
"详细样本": chat_samples,
"特质得分历史": {k: v for k, v in self.trait_scores_history.items()}
}
# 保存结果
os.makedirs("results", exist_ok=True)
result_file = f"results/personality_result_{qq_id}.json"
with open(result_file, "w", encoding="utf-8") as f:
json.dump(result, f, ensure_ascii=False, indent=2)
return result
def _generate_trend_plot(self, qq_id: str, user_nickname: str):
"""
生成人格特质累计平均分变化趋势图
"""
# 查找系统中可用的中文字体
chinese_fonts = []
for f in fm.fontManager.ttflist:
try:
if '黑' in f.name or 'SC' in f.name or '宋' in f.name or '楷' in f.name or '微软' in f.name:
chinese_fonts.append(f.name)
except Exception:
continue
if chinese_fonts:
plt.rcParams['font.sans-serif'] = chinese_fonts + ['SimHei', 'Microsoft YaHei', 'Arial Unicode MS']
else:
# 如果没有找到中文字体,使用默认字体,并将中文昵称转换为拼音或英文
try:
from pypinyin import lazy_pinyin
user_nickname = ''.join(lazy_pinyin(user_nickname))
except ImportError:
user_nickname = "User" # 如果无法转换为拼音,使用默认英文
plt.rcParams['axes.unicode_minus'] = False # 解决负号显示问题
plt.figure(figsize=(12, 6))
plt.style.use('bmh') # 使用内置的bmh样式它有类似seaborn的美观效果
colors = {
"开放性": "#FF9999",
"严谨性": "#66B2FF",
"外向性": "#99FF99",
"宜人性": "#FFCC99",
"神经质": "#FF99CC"
}
# 计算每个维度在每个时间点的累计平均分
cumulative_averages = {}
for trait, scores in self.trait_scores_history.items():
if not scores:
continue
averages = []
total = 0
valid_count = 0
for score in scores:
if score > 0: # 只计算大于0的有效分数
total += score
valid_count += 1
if valid_count > 0:
averages.append(total / valid_count)
else:
# 如果当前分数无效,使用前一个有效的平均分
if averages:
averages.append(averages[-1])
else:
continue # 跳过无效分数
if averages: # 只有在有有效分数的情况下才添加到累计平均中
cumulative_averages[trait] = averages
# 绘制每个维度的累计平均分变化趋势
for trait, averages in cumulative_averages.items():
x = range(1, len(averages) + 1)
plt.plot(x, averages, 'o-', label=trait, color=colors.get(trait), linewidth=2, markersize=8)
# 添加趋势线
z = np.polyfit(x, averages, 1)
p = np.poly1d(z)
plt.plot(x, p(x), '--', color=colors.get(trait), alpha=0.5)
plt.title(f"{user_nickname} 的人格特质累计平均分变化趋势", fontsize=14, pad=20)
plt.xlabel("评估次数", fontsize=12)
plt.ylabel("累计平均分", fontsize=12)
plt.grid(True, linestyle='--', alpha=0.7)
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.ylim(0, 7)
plt.tight_layout()
# 保存图表
os.makedirs("results/plots", exist_ok=True)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
plot_file = f"results/plots/personality_trend_{qq_id}_{timestamp}.png"
plt.savefig(plot_file, dpi=300, bbox_inches='tight')
plt.close()
def analyze_user_personality(qq_id: str, num_samples: int = 10, context_length: int = 5) -> str:
"""
分析用户人格特征的便捷函数
Args:
qq_id (str): 用户QQ号
num_samples (int): 要分析的聊天片段数量
context_length (int): 每个聊天片段的上下文长度
Returns:
str: 格式化的分析结果
"""
evaluator = ChatBasedPersonalityEvaluator()
result = evaluator.evaluate_user_personality(qq_id, num_samples, context_length)
if "error" in result:
return result["error"]
# 格式化输出
output = f"QQ号 {qq_id} ({result['用户昵称']}) 的人格特征分析结果:\n"
output += "=" * 50 + "\n\n"
output += "人格特征评分:\n"
for trait, score in result["人格特征评分"].items():
if score == 0:
output += f"{trait}: 数据不足,无法判断 (评估次数: {result['维度评估次数'].get(trait, 0)})\n"
else:
output += f"{trait}: {score}/6 (评估次数: {result['维度评估次数'].get(trait, 0)})\n"
# 添加变化趋势描述
if trait in result["特质得分历史"] and len(result["特质得分历史"][trait]) > 1:
scores = [s for s in result["特质得分历史"][trait] if s != 0] # 过滤掉无效分数
if len(scores) > 1: # 确保有足够的有效分数计算趋势
trend = np.polyfit(range(len(scores)), scores, 1)[0]
if abs(trend) < 0.1:
trend_desc = "保持稳定"
elif trend > 0:
trend_desc = "呈上升趋势"
else:
trend_desc = "呈下降趋势"
output += f" 变化趋势: {trend_desc} (斜率: {trend:.2f})\n"
output += f"\n分析样本数量:{result['样本数量']}\n"
output += f"结果已保存至results/personality_result_{qq_id}.json\n"
output += "变化趋势图已保存至results/plots/目录\n"
return output
if __name__ == "__main__":
# 测试代码
# test_qq = "" # 替换为要测试的QQ号
# print(analyze_user_personality(test_qq, num_samples=30, context_length=20))
# test_qq = ""
# print(analyze_user_personality(test_qq, num_samples=30, context_length=20))
test_qq = "1026294844"
print(analyze_user_personality(test_qq, num_samples=30, context_length=30))
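The JSON-extraction-and-clamp step used in `evaluate_chat_response` can be sketched on its own (`extract_scores` is an illustrative name; the clamp range follows the 0-6 scoring above):

```python
import json

def extract_scores(ai_response, dimensions, lo=0, hi=6):
    """Pull the first {...} block out of a free-form LLM reply and clamp
    every score into [lo, hi]; on any failure fall back to 0 for every
    dimension, as the evaluator above does."""
    try:
        start = ai_response.find("{")
        end = ai_response.rfind("}") + 1
        if start == -1 or end == 0:
            raise ValueError("no JSON object found")
        scores = json.loads(ai_response[start:end])
        return {k: float(max(lo, min(hi, float(v)))) for k, v in scores.items()}
    except (ValueError, TypeError):
        return {dim: 0 for dim in dimensions}

reply = '评估结果如下:\n{"开放性": 7, "神经质": 2}'
print(extract_scores(reply, ["开放性", "神经质"]))  # {'开放性': 6.0, '神经质': 2.0}
```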

View File

@@ -0,0 +1,196 @@
"""
The definition of artificial personality in this paper follows the dispositional paradigm and adapts a definition of
personality developed for humans [17]:
Personality for a human is the "whole and organisation of relatively stable tendencies and patterns of experience and
behaviour within one person (distinguishing it from other persons)". This definition is modified for artificial
personality:
Artificial personality describes the relatively stable tendencies and patterns of behaviour of an AI-based machine that
can be designed by developers and designers via different modalities, such as language, creating the impression
of individuality of a humanized social agent when users interact with the machine."""
from typing import Dict, List
import json
import os
from pathlib import Path
from dotenv import load_dotenv
import sys
"""
第一种方案:基于情景评估的人格测定
"""
current_dir = Path(__file__).resolve().parent
project_root = current_dir.parent.parent.parent
env_path = project_root / ".env.prod"
root_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../.."))
sys.path.append(root_path)
from src.plugins.personality.scene import get_scene_by_factor, PERSONALITY_SCENES # noqa: E402
from src.plugins.personality.questionnaire import FACTOR_DESCRIPTIONS # noqa: E402
from src.plugins.personality.offline_llm import LLMModel # noqa: E402
# 加载环境变量
if env_path.exists():
print(f"{env_path} 加载环境变量")
load_dotenv(env_path)
else:
print(f"未找到环境变量文件: {env_path}")
print("将使用默认配置")
class PersonalityEvaluator_direct:
def __init__(self):
self.personality_traits = {"开放性": 0, "严谨性": 0, "外向性": 0, "宜人性": 0, "神经质": 0}
self.scenarios = []
# 为每个人格特质获取对应的场景
for trait in PERSONALITY_SCENES:
scenes = get_scene_by_factor(trait)
if not scenes:
continue
# 从每个维度选择3个场景
import random
scene_keys = list(scenes.keys())
selected_scenes = random.sample(scene_keys, min(3, len(scene_keys)))
for scene_key in selected_scenes:
scene = scenes[scene_key]
# 为每个场景添加评估维度
# 主维度是当前特质,次维度随机选择一个其他特质
other_traits = [t for t in PERSONALITY_SCENES if t != trait]
secondary_trait = random.choice(other_traits)
self.scenarios.append(
{"场景": scene["scenario"], "评估维度": [trait, secondary_trait], "场景编号": scene_key}
)
self.llm = LLMModel()
def evaluate_response(self, scenario: str, response: str, dimensions: List[str]) -> Dict[str, float]:
"""
使用 DeepSeek AI 评估用户对特定场景的反应
"""
# 构建维度描述
dimension_descriptions = []
for dim in dimensions:
desc = FACTOR_DESCRIPTIONS.get(dim, "")
if desc:
dimension_descriptions.append(f"- {dim}{desc}")
dimensions_text = "\n".join(dimension_descriptions)
prompt = f"""请根据以下场景和用户描述评估用户在大五人格模型中的相关维度得分1-6分
场景描述:
{scenario}
用户回应:
{response}
需要评估的维度说明:
{dimensions_text}
请按照以下格式输出评估结果仅输出JSON格式
{{
"{dimensions[0]}": 分数,
"{dimensions[1]}": 分数
}}
评分标准:
1 = 非常不符合该维度特征
2 = 比较不符合该维度特征
3 = 有点不符合该维度特征
4 = 有点符合该维度特征
5 = 比较符合该维度特征
6 = 非常符合该维度特征
请根据用户的回应结合场景和维度说明进行评分。确保分数在1-6之间并给出合理的评估。"""
try:
ai_response, _ = self.llm.generate_response(prompt)
# 尝试从AI响应中提取JSON部分
start_idx = ai_response.find("{")
end_idx = ai_response.rfind("}") + 1
if start_idx != -1 and end_idx != 0:
json_str = ai_response[start_idx:end_idx]
scores = json.loads(json_str)
# 确保所有分数在1-6之间
return {k: max(1, min(6, float(v))) for k, v in scores.items()}
else:
print("AI响应格式不正确使用默认评分")
return {dim: 3.5 for dim in dimensions}
except Exception as e:
print(f"评估过程出错:{str(e)}")
return {dim: 3.5 for dim in dimensions}
def main():
print("欢迎使用人格形象创建程序!")
print("接下来您将面对一系列场景共15个。请根据您想要创建的角色形象描述在该场景下可能的反应。")
print("每个场景都会评估不同的人格维度,最终得出完整的人格特征评估。")
print("评分标准1=非常不符合2=比较不符合3=有点不符合4=有点符合5=比较符合6=非常符合")
print("\n准备好了吗?按回车键开始...")
input()
evaluator = PersonalityEvaluator_direct()
final_scores = {"开放性": 0, "严谨性": 0, "外向性": 0, "宜人性": 0, "神经质": 0}
dimension_counts = {trait: 0 for trait in final_scores.keys()}
for i, scenario_data in enumerate(evaluator.scenarios, 1):
print(f"\n场景 {i}/{len(evaluator.scenarios)} - {scenario_data['场景编号']}:")
print("-" * 50)
print(scenario_data["场景"])
print("\n请描述您的角色在这种情况下会如何反应:")
response = input().strip()
if not response:
print("反应描述不能为空!")
continue
print("\n正在评估您的描述...")
scores = evaluator.evaluate_response(scenario_data["场景"], response, scenario_data["评估维度"])
# 更新最终分数
for dimension, score in scores.items():
final_scores[dimension] += score
dimension_counts[dimension] += 1
print("\n当前评估结果:")
print("-" * 30)
for dimension, score in scores.items():
print(f"{dimension}: {score}/6")
if i < len(evaluator.scenarios):
print("\n按回车键继续下一个场景...")
input()
# 计算平均分
for dimension in final_scores:
if dimension_counts[dimension] > 0:
final_scores[dimension] = round(final_scores[dimension] / dimension_counts[dimension], 2)
print("\n最终人格特征评估结果:")
print("-" * 30)
for trait, score in final_scores.items():
print(f"{trait}: {score}/6")
print(f"测试场景数:{dimension_counts[trait]}")
# 保存结果
result = {"final_scores": final_scores, "dimension_counts": dimension_counts, "scenarios": evaluator.scenarios}
# 确保目录存在
os.makedirs("results", exist_ok=True)
# 保存到文件
with open("results/personality_result.json", "w", encoding="utf-8") as f:
json.dump(result, f, ensure_ascii=False, indent=2)
print("\n结果已保存到 results/personality_result.json")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,155 @@
import random
import os
import sys
from pathlib import Path
import datetime
from typing import List, Dict, Optional
current_dir = Path(__file__).resolve().parent
project_root = current_dir.parent.parent.parent
env_path = project_root / ".env.prod"
root_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../.."))
sys.path.append(root_path)
from src.common.database import db # noqa: E402
class MessageAnalyzer:
def __init__(self):
self.messages_collection = db["messages"]
def get_message_context(self, message_id: int, context_length: int = 5) -> Optional[List[Dict]]:
"""
获取指定消息ID的上下文消息列表
Args:
message_id (int): 消息ID
context_length (int): 上下文长度(单侧,总长度为 2*context_length + 1
Returns:
Optional[List[Dict]]: 消息列表如果未找到则返回None
"""
# 从数据库获取指定消息
target_message = self.messages_collection.find_one({"message_id": message_id})
if not target_message:
return None
# 获取该消息的stream_id
stream_id = target_message.get('chat_info', {}).get('stream_id')
if not stream_id:
return None
# 获取同一stream_id的所有消息
stream_messages = list(self.messages_collection.find({
"chat_info.stream_id": stream_id
}).sort("time", 1))
# 找到目标消息在列表中的位置
target_index = None
for i, msg in enumerate(stream_messages):
if msg['message_id'] == message_id:
target_index = i
break
if target_index is None:
return None
# 获取目标消息前后的消息
start_index = max(0, target_index - context_length)
end_index = min(len(stream_messages), target_index + context_length + 1)
return stream_messages[start_index:end_index]
def format_messages(self, messages: List[Dict], target_message_id: Optional[int] = None) -> str:
"""
格式化消息列表为可读字符串
Args:
messages (List[Dict]): 消息列表
target_message_id (Optional[int]): 目标消息ID用于标记
Returns:
str: 格式化的消息字符串
"""
if not messages:
return "没有消息记录"
reply = ""
for msg in messages:
# 消息时间
msg_time = datetime.datetime.fromtimestamp(int(msg['time'])).strftime("%Y-%m-%d %H:%M:%S")
# 获取消息内容
message_text = msg.get('processed_plain_text', msg.get('detailed_plain_text', '无消息内容'))
nickname = msg.get('user_info', {}).get('user_nickname', '未知用户')
# 标记当前消息
is_target = "→" if target_message_id and msg['message_id'] == target_message_id else " "
reply += f"{is_target}[{msg_time}] {nickname}: {message_text}\n"
if target_message_id and msg['message_id'] == target_message_id:
reply += " " + "-" * 50 + "\n"
return reply
def get_user_random_contexts(
self, qq_id: str, num_messages: int = 10, context_length: int = 5) -> tuple[List[str], str]: # noqa: E501
"""
获取用户的随机消息及其上下文
Args:
qq_id (str): QQ号
num_messages (int): 要获取的随机消息数量
context_length (int): 每条消息的上下文长度(单侧)
Returns:
tuple[List[str], str]: (每个消息上下文的格式化字符串列表, 用户昵称)
"""
if not qq_id:
return [], ""
# 获取用户所有消息
all_messages = list(self.messages_collection.find({"user_info.user_id": int(qq_id)}))
if not all_messages:
return [], ""
# 获取用户昵称
user_nickname = all_messages[0].get('chat_info', {}).get('user_info', {}).get('user_nickname', '未知用户')
# 随机选择指定数量的消息
selected_messages = random.sample(all_messages, min(num_messages, len(all_messages)))
# 按时间排序
selected_messages.sort(key=lambda x: int(x['time']))
# 存储所有上下文消息
context_list = []
# 获取每条消息的上下文
for msg in selected_messages:
message_id = msg['message_id']
# 获取消息上下文
context_messages = self.get_message_context(message_id, context_length)
if context_messages:
formatted_context = self.format_messages(context_messages, message_id)
context_list.append(formatted_context)
return context_list, user_nickname
if __name__ == "__main__":
# 测试代码
analyzer = MessageAnalyzer()
test_qq = "1026294844" # 替换为要测试的QQ号
print(f"测试QQ号: {test_qq}")
print("-" * 50)
# 获取5条消息每条消息前后各3条上下文
contexts, nickname = analyzer.get_user_random_contexts(test_qq, num_messages=5, context_length=3)
print(f"用户昵称: {nickname}\n")
# 打印每个上下文
for i, context in enumerate(contexts, 1):
print(f"\n随机消息 {i}/{len(contexts)}:")
print("-" * 30)
print(context)
print("=" * 50)
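The context-window slicing in `get_message_context` reduces to a few lines; a sketch with plain lists standing in for database documents:

```python
def context_window(messages, target_index, context_length):
    """Return messages within context_length positions on each side of
    target_index (total at most 2*context_length + 1), clipped at the
    ends of the list — the same slicing get_message_context uses."""
    start = max(0, target_index - context_length)
    end = min(len(messages), target_index + context_length + 1)
    return messages[start:end]

msgs = list(range(10))
print(context_window(msgs, 1, 3))  # [0, 1, 2, 3, 4]
print(context_window(msgs, 5, 2))  # [3, 4, 5, 6, 7]
```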

View File

@@ -79,22 +79,42 @@ class HeartbeatThread(threading.Thread):
self.interval = interval
self.client_id = get_unique_id()
self.running = True
self.stop_event = threading.Event() # 添加事件对象用于可中断的等待
self.last_heartbeat_time = 0 # 记录上次发送心跳的时间
def run(self):
"""线程运行函数"""
logger.debug(f"心跳线程已启动客户端ID: {self.client_id}")
while self.running:
# 发送心跳
if send_heartbeat(self.server_url, self.client_id):
logger.info(f"{self.interval}秒后发送下一次心跳...")
else:
logger.info(f"{self.interval}秒后重试...")
time.sleep(self.interval) # 使用同步的睡眠
self.last_heartbeat_time = time.time()
# 使用可中断的等待代替 sleep
# 每秒检查一次是否应该停止或发送心跳
remaining_wait = self.interval
while remaining_wait > 0 and self.running:
# 每次最多等待1秒便于及时响应停止请求
wait_time = min(1, remaining_wait)
if self.stop_event.wait(wait_time):
break # 如果事件被设置,立即退出等待
remaining_wait -= wait_time
# 检查是否由于外部原因导致间隔异常延长
if time.time() - self.last_heartbeat_time >= self.interval * 1.5:
logger.warning("检测到心跳间隔异常延长,立即发送心跳")
break
def stop(self):
"""停止线程"""
self.running = False
self.stop_event.set() # 设置事件,中断等待
logger.debug("心跳线程已收到停止信号")
def main():

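The interruptible-wait pattern above can be demonstrated standalone (a minimal sketch with a toy `Waiter` class, not the actual `HeartbeatThread`):

```python
import threading
import time

class Waiter:
    """Instead of one long time.sleep(interval), wait on an Event in
    <=1 s slices so stop() takes effect almost immediately."""
    def __init__(self):
        self.running = True
        self.stop_event = threading.Event()

    def wait(self, interval):
        """Return True if the full interval elapsed, False if stopped."""
        remaining = interval
        while remaining > 0 and self.running:
            slice_ = min(1, remaining)
            if self.stop_event.wait(slice_):
                return False  # stop() was called during the wait
            remaining -= slice_
        return self.running

    def stop(self):
        self.running = False
        self.stop_event.set()

w = Waiter()
threading.Timer(0.05, w.stop).start()  # stop the waiter after 50 ms
t0 = time.time()
finished = w.wait(10)  # would block 10 s, but stop() interrupts it
print(finished, time.time() - t0 < 5)  # False True
```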
View File

@@ -101,6 +101,9 @@ class ScheduleGenerator:
except json.JSONDecodeError:
logger.exception("解析日程失败: {}".format(schedule_text))
return False
except Exception as e:
logger.exception(f"解析日程发生错误:{str(e)}")
return False
def _parse_time(self, time_str: str) -> str:
"""解析时间字符串,转换为时间"""
@@ -158,7 +161,7 @@ class ScheduleGenerator:
def print_schedule(self):
"""打印完整的日程安排"""
if not self._parse_schedule(self.today_schedule_text):
logger.warning("今日日程有误,将在下次运行时重新生成")
logger.warning("今日日程有误,将在两小时后重新生成")
db.schedule.delete_one({"date": datetime.datetime.now().strftime("%Y-%m-%d")})
else:
logger.info("=== 今日日程安排 ===")

View File

@@ -41,8 +41,8 @@ class WillingManager:
interested_rate = interested_rate * config.response_interested_rate_amplifier
if interested_rate > 0.5:
current_willing += interested_rate - 0.5
if interested_rate > 0.4:
current_willing += interested_rate - 0.3
if is_mentioned_bot and current_willing < 1.0:
current_willing += 1
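The new willingness update can be sketched as a pure function (parameter names here are illustrative, not the exact `WillingManager` attributes):

```python
def update_willing(current_willing, interested_rate, is_mentioned_bot, amplifier=1.0):
    """Sketch of the updated rule: the interest threshold drops from
    0.5 to 0.4 and the boost grows (rate - 0.3 instead of rate - 0.5);
    a mention still adds a flat +1 while willingness is below 1.0."""
    interested_rate *= amplifier
    if interested_rate > 0.4:
        current_willing += interested_rate - 0.3
    if is_mentioned_bot and current_willing < 1.0:
        current_willing += 1
    return current_willing

print(round(update_willing(0.2, 0.5, False), 2))  # 0.4
```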