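AI Agents (Part 2): Building an Agent from Scratch

Shenyiju (神译局) · March 27, 2025, 07:06
Exploring the design principles and real-world applications of AI agents.

Shenyiju is 36Kr's translation team. It covers technology, business, the workplace, and lifestyle, with a focus on new technologies, ideas, and trends from abroad.

Editor's note: 2025 is being called the first year of the AI agent. This series introduces the concepts, types, principles, architecture, and development of AI agents as a starting point for further study. This is the second article in the series and is a compiled translation.

The previous article surveyed the characteristics, components, history, challenges, and future directions of AI agents.

This article walks through building an AI agent from scratch in Python. The agent makes decisions based on user input, selects a tool, and carries out the task. Let's get started!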

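1. What is an agent?

An agent is an autonomous entity that perceives its environment, makes decisions, and takes actions to achieve a specific goal.

Agents vary in complexity, from reflex agents that simply respond to stimuli to advanced agents that learn and adapt. Common types include:

  1. Reflex agents: respond directly to changes in the environment and keep no internal memory.

  2. Model-based agents: make decisions using an internal model of the world.

  3. Goal-based agents: plan their actions around a goal.

  4. Utility-based agents: score candidate actions with a utility function and choose the one with the highest utility.

Chatbots, recommender systems, and self-driving cars all rely on different types of agents to carry out tasks efficiently.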
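
To make the difference between the simplest and the most deliberate of these types concrete, here is a minimal sketch contrasting a reflex agent with a utility-based one. The thermostat scenario, the function names, the assumed effect of each action, and the utility function are all illustrative assumptions and are not part of the original tutorial.

```python
# Illustrative sketch only: the thermostat scenario and all names are hypothetical.

def reflex_thermostat(temperature_c):
    """Reflex agent: maps the current percept straight to an action, with no memory."""
    if temperature_c < 18:
        return "heat"
    if temperature_c > 24:
        return "cool"
    return "idle"


def utility_thermostat(temperature_c, target_c=21.0):
    """Utility-based agent: scores each action's predicted outcome and picks the best."""
    # Assumed effect of each action on the temperature over one time step
    effects = {"heat": 2.0, "cool": -2.0, "idle": 0.0}

    def utility(action):
        predicted = temperature_c + effects[action]
        # Closer to the target is better; running the system has a small cost
        energy_cost = 0.0 if action == "idle" else 0.5
        return -abs(predicted - target_c) - energy_cost

    return max(effects, key=utility)


print(reflex_thermostat(19))    # idle
print(utility_thermostat(19))   # heat
```

At 19 °C the reflex rule stays idle because no threshold is crossed, while the utility-based version heats because doing so brings the predicted temperature closer to the target even after the energy cost.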

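The core components of an agent are:

  • Model: the agent's "brain", which processes input and generates responses.

  • Tools: predefined functions that the agent runs in response to user requests.

  • Toolbox: the collection of tools available to the agent.

  • System prompt: the set of instructions that tells the agent how to handle user input and choose a tool.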

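2. Implementation

Steps for developing an agent

2.1 Prerequisites

The complete code for this tutorial is available in the "Build an Agent from Scratch" GitHub repository.

Code: Build an Agent from Scratch

Before running the code, make sure the following prerequisites are met: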

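1. Set up the Python environment

Install Python from python.org (3.8 or later recommended).

Verify the installation:

python --version

Create and activate a virtual environment (recommended):

python -m venv ai_agents_env
source ai_agents_env/bin/activate  # Windows: ai_agents_env\Scripts\activate

Install the dependencies:

pip install -r requirements.txt

2. Set up Ollama locally

Download and install Ollama from ollama.ai.

Verify the installation:

ollama --version

Pull a model (if needed):

ollama pull mistral  # replace with the model you plan to use (the script in step 6 defaults to llama2)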
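
Before moving on, it can help to confirm from Python that the Ollama server is up and that the model was pulled. The sketch below queries the local /api/tags endpoint and uses only the standard library. The helper names are hypothetical, and the response shape (a "models" list whose entries carry a "name") is an assumption based on Ollama's API documentation.

```python
import json
import urllib.error
import urllib.request


def extract_model_names(tags_payload):
    """Pull model names out of the JSON returned by Ollama's /api/tags endpoint."""
    return [entry["name"] for entry in tags_payload.get("models", []) if "name" in entry]


def list_local_models(base_url="http://localhost:11434", timeout=3):
    """Return the names of locally available models, or [] if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return extract_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        return []


# Offline demonstration with a made-up, trimmed payload
sample = {"models": [{"name": "mistral:latest"}, {"name": "llama2:latest"}]}
print(extract_model_names(sample))  # ['mistral:latest', 'llama2:latest']
```

With the server running, calling list_local_models() should return the same kind of list for the models on your machine; an empty list means either no models are installed or Ollama is not reachable.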

2.2 Implementation steps

Step 1: Set up the environment

Install the required libraries:

pip install requests termcolor python-dotenv
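
A quick way to confirm that the environment is ready is to check the interpreter version and whether each library can be found. This is a minimal sketch; the helper name missing_modules is hypothetical. Note that the python-dotenv package is imported under the name dotenv.

```python
import importlib.util
import sys


def missing_modules(module_names):
    """Return the subset of module_names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names for the three packages installed above
required = ["requests", "termcolor", "dotenv"]

print("Python 3.8 or later:", sys.version_info >= (3, 8))
print("Missing modules:", missing_modules(required))
```

An empty list means all three libraries are importable from the active virtual environment.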

Step 2: Define the model class

Create an OllamaModel class that connects to the local Ollama API:

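import json
import os

import requests
from dotenv import load_dotenv
from termcolor import colored

load_dotenv()  # Load settings from a local .env file, if one exists


class OllamaModel:
    def __init__(self, model, system_prompt, temperature=0, stop=None):
        self.model_endpoint = "http://localhost:11434/api/generate"
        self.temperature = temperature
        self.model = model
        self.system_prompt = system_prompt
        self.headers = {"Content-Type": "application/json"}
        self.stop = stop

    def generate_text(self, prompt):
        """Send the prompt to Ollama and return the model's JSON reply as a dictionary."""
        # Ollama reads sampling parameters from "options", not from the top level
        options = {"temperature": self.temperature}
        if self.stop:
            # "stop" is a list of stop sequences
            options["stop"] = [self.stop] if isinstance(self.stop, str) else list(self.stop)
        payload = {
            "model": self.model,
            "format": "json",
            "prompt": prompt,
            "system": self.system_prompt,
            "stream": False,
            "options": options,
        }
        try:
            response = requests.post(self.model_endpoint, headers=self.headers, data=json.dumps(payload))
            body = response.json()
            if "error" in body:
                # For example, the requested model has not been pulled
                return {"error": body["error"]}
            # The generated text arrives as a JSON string in the "response" field
            return json.loads(body["response"])
        except (requests.RequestException, ValueError, KeyError) as e:
            return {"error": str(e)}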
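
With "stream" set to False, /api/generate answers with a single JSON object, and the text the model produced sits inside it as a string under the "response" key. Because the request also asks for "format": "json", that string is itself JSON and needs a second decoding step before fields such as tool_choice can be read. The sketch below shows that step in isolation; the helper name and the sample envelope are made up for illustration, and real replies contain additional fields.

```python
import json


def parse_agent_reply(envelope):
    """Decode the JSON text carried in the "response" field of an Ollama reply."""
    try:
        reply = json.loads(envelope["response"])
    except (KeyError, TypeError, ValueError) as exc:
        return {"error": f"Could not parse model reply: {exc}"}
    if not isinstance(reply, dict):
        return {"error": "Model reply is not a JSON object"}
    return reply


# Made-up, trimmed example of what the endpoint returns
sample_envelope = {
    "model": "mistral",
    "response": '{"tool_choice": "reverse_string", "tool_input": "Python"}',
    "done": True,
}

reply = parse_agent_reply(sample_envelope)
print(reply["tool_choice"], reply["tool_input"])  # reverse_string Python
```

Returning an {"error": ...} dictionary instead of raising keeps the calling code simple: it can test for one key rather than wrap every model call in a try block.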

Step 3: Create the agent's tools

Define a calculator tool and a string-reversal tool:

import json
import operator


def basic_calculator(input_str):
    """
    Perform a numeric operation on two numbers based on the input string or dictionary.

    Parameters:
    input_str (str or dict): Either a JSON string representing a dictionary with keys 'num1', 'num2', and 'operation',
                            or a dictionary directly. Example: '{"num1": 5, "num2": 3, "operation": "add"}'
                            or {"num1": 67869, "num2": 9030393, "operation": "divide"}

    Returns:
    str: The formatted result of the operation, or an error message if the input is invalid,
         the operation is unsupported, or the calculation fails (e.g., division by zero).
    """
    try:
        # Handle both dictionary and string inputs
        if isinstance(input_str, dict):
            input_dict = input_str
        else:
            # Clean and parse the input string
            input_str_clean = input_str.replace("'", "\"")
            input_str_clean = input_str_clean.strip().strip("\"")
            input_dict = json.loads(input_str_clean)

        # Validate required fields
        if not all(key in input_dict for key in ['num1', 'num2', 'operation']):
            return "Error: Input must contain 'num1', 'num2', and 'operation'"

        num1 = float(input_dict['num1'])  # Convert to float to handle decimal numbers
        num2 = float(input_dict['num2'])
        operation = input_dict['operation'].lower()  # Make case-insensitive
    except (json.JSONDecodeError, KeyError, AttributeError):
        # AttributeError covers input that is neither str nor dict, or a non-string operation
        return "Invalid input format. Please provide valid numbers and operation."
    except (ValueError, TypeError):
        return "Error: Please provide valid numerical values."

    # Define the supported operations
    operations = {
        'add': operator.add,
        'plus': operator.add,  # Alternative word for add
        'subtract': operator.sub,
        'minus': operator.sub,  # Alternative word for subtract
        'multiply': operator.mul,
        'times': operator.mul,  # Alternative word for multiply
        'divide': operator.truediv,
        'floor_divide': operator.floordiv,
        'modulus': operator.mod,
        'power': operator.pow,
        'lt': operator.lt,
        'le': operator.le,
        'eq': operator.eq,
        'ne': operator.ne,
        'ge': operator.ge,
        'gt': operator.gt
    }

    # Check if the operation is supported
    if operation not in operations:
        return f"Unsupported operation: '{operation}'. Supported operations are: {', '.join(operations.keys())}"

    try:
        # Special handling for division by zero
        if (operation in ['divide', 'floor_divide', 'modulus']) and num2 == 0:
            return "Error: Division by zero is not allowed"

        # Perform the operation
        result = operations[operation](num1, num2)

        # Format result based on type
        if isinstance(result, bool):
            result_str = "True" if result else "False"
        elif isinstance(result, float):
            # Show at most six decimals and drop trailing zeros
            result_str = f"{result:.6f}".rstrip('0').rstrip('.')
        else:
            result_str = str(result)

        return f"The answer is: {result_str}"
    except Exception as e:
        return f"Error during calculation: {str(e)}"


def reverse_string(input_string):
    """
    Reverse the given string.

    Parameters:
    input_string (str): The string to be reversed.

    Returns:
    str: The reversed string.
    """
    # Check if input is a string
    if not isinstance(input_string, str):
        return "Error: Input must be a string"

    # Reverse the string using slicing
    reversed_string = input_string[::-1]

    # Format the output
    result = f"The reversed string is: {reversed_string}"
    return result
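
The calculator formats float results by printing six decimals and then trimming, which is worth seeing on its own. In the sketch below, format_number is a hypothetical helper that isolates that one expression. The order of the two rstrip calls matters: zeros are removed first and the decimal point second, and because the fixed-point format always contains a decimal point, the zeros of a whole number such as 100 are never touched.

```python
def format_number(value):
    """Render a float with at most six decimals and no trailing zeros."""
    return f"{value:.6f}".rstrip("0").rstrip(".")


print(format_number(22.0))     # 22
print(format_number(2.5))      # 2.5
print(format_number(1 / 3))    # 0.333333
print(format_number(100.0))    # 100
```

One consequence of the fixed six-decimal precision is that any result smaller in magnitude than about half a millionth is displayed as 0.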

Step 4: Build the toolbox

The ToolBox class stores every tool the agent can use and provides a description of each one:

class ToolBox:
    def __init__(self):
        self.tools_dict = {}

    def store(self, functions_list):
        """
        Stores the literal name and docstring of each function in the list.

        Parameters:
        functions_list (list): List of function objects to store.

        Returns:
        dict: Dictionary with function names as keys and their docstrings as values.
        """
        for func in functions_list:
            self.tools_dict[func.__name__] = func.__doc__
        return self.tools_dict

    def tools(self):
        """
        Returns the dictionary created in store as a text string.

        Returns:
        str: Dictionary of stored functions and their docstrings as a text string.
        """
        tools_str = ""
        for name, doc in self.tools_dict.items():
            tools_str += f"{name}: \"{doc}\"\n"
        return tools_str.strip()
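
ToolBox copies each function's __doc__ attribute as written, so the descriptions keep their source indentation, and a tool without a docstring is described as None. One possible variant, shown here as a sketch and not as the tutorial's method, uses the standard library's inspect.getdoc to normalize indentation and supplies a fallback text. The names describe_tools, shout, and undocumented are hypothetical.

```python
import inspect


def describe_tools(functions):
    """Build a 'name: "docstring"' listing with docstring indentation normalized."""
    lines = []
    for func in functions:
        # inspect.getdoc removes common leading indentation; fall back if there is no docstring
        doc = inspect.getdoc(func) or "No description provided."
        lines.append(f'{func.__name__}: "{doc}"')
    return "\n".join(lines)


def shout(text):
    """
    Convert text to upper case.
    """
    return text.upper()


def undocumented(x):
    return x


print(describe_tools([shout, undocumented]))
```

Since these descriptions are pasted into the system prompt, shorter and cleaner text also means fewer tokens spent on whitespace.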

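Step 5: Create the agent class

The agent needs to think, decide which tool to use, and run that tool. Here is the system prompt template followed by the Agent class: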

agent_system_prompt_template = """
You are an intelligent AI assistant with access to specific tools. Your responses must ALWAYS be in this JSON format:
{{
    "tool_choice": "name_of_the_tool",
    "tool_input": "inputs_to_the_tool"
}}

TOOLS AND WHEN TO USE THEM:

1. basic_calculator: Use for ANY mathematical calculations
   - Input format: {{"num1": number, "num2": number, "operation": "add/subtract/multiply/divide"}}
   - Supported operations: add/plus, subtract/minus, multiply/times, divide
   - Example inputs and outputs:
     Input: "Calculate 15 plus 7"
     Output: {{"tool_choice": "basic_calculator", "tool_input": {{"num1": 15, "num2": 7, "operation": "add"}}}}

     Input: "What is 100 divided by 5?"
     Output: {{"tool_choice": "basic_calculator", "tool_input": {{"num1": 100, "num2": 5, "operation": "divide"}}}}

2. reverse_string: Use for ANY request involving reversing text
   - Input format: Just the text to be reversed as a string
   - ALWAYS use this tool when user mentions "reverse", "backwards", or asks to reverse text
   - Example inputs and outputs:
     Input: "Reverse of 'Howwwww'?"
     Output: {{"tool_choice": "reverse_string", "tool_input": "Howwwww"}}

     Input: "What is the reverse of Python?"
     Output: {{"tool_choice": "reverse_string", "tool_input": "Python"}}

3. no tool: Use for general conversation and questions
   - Example inputs and outputs:
     Input: "Who are you?"
     Output: {{"tool_choice": "no tool", "tool_input": "I am an AI assistant that can help you with calculations, reverse text, and answer questions. I can perform mathematical operations and reverse strings. How can I help you today?"}}

     Input: "How are you?"
     Output: {{"tool_choice": "no tool", "tool_input": "I'm functioning well, thank you for asking! I'm here to help you with calculations, text reversal, or answer any questions you might have."}}

STRICT RULES:
1. For questions about identity, capabilities, or feelings:
   - ALWAYS use "no tool"
   - Provide a complete, friendly response
   - Mention your capabilities

2. For ANY text reversal request:
   - ALWAYS use "reverse_string"
   - Extract ONLY the text to be reversed
   - Remove quotes, "reverse of", and other extra text

3. For ANY math operations:
   - ALWAYS use "basic_calculator"
   - Extract the numbers and operation
   - Convert text numbers to digits

Here is a list of your tools along with their descriptions:
{tool_descriptions}

Remember: Your response must ALWAYS be valid JSON with "tool_choice" and "tool_input" fields.
"""

class Agent:
    def __init__(self, tools, model_service, model_name, stop=None):
        """
        Initializes the agent with a list of tools and a model.

        Parameters:
        tools (list): List of tool functions.
        model_service (class): The model service class with a generate_text method.
        model_name (str): The name of the model to use.
        stop (str, optional): Stop sequence passed to the model service.
        """
        self.tools = tools
        self.model_service = model_service
        self.model_name = model_name
        self.stop = stop

    def prepare_tools(self):
        """
        Stores the tools in the toolbox and returns their descriptions.

        Returns:
        str: Descriptions of the tools stored in the toolbox.
        """
        toolbox = ToolBox()
        toolbox.store(self.tools)
        tool_descriptions = toolbox.tools()
        return tool_descriptions

    def think(self, prompt):
        """
        Runs the generate_text method on the model using the system prompt template and tool descriptions.

        Parameters:
        prompt (str): The user query to generate a response for.

        Returns:
        dict: The response from the model as a dictionary.
        """
        tool_descriptions = self.prepare_tools()
        agent_system_prompt = agent_system_prompt_template.format(tool_descriptions=tool_descriptions)

        # Create an instance of the model service with the system prompt
        if self.model_service == OllamaModel:
            model_instance = self.model_service(
                model=self.model_name,
                system_prompt=agent_system_prompt,
                temperature=0,
                stop=self.stop
            )
        else:
            model_instance = self.model_service(
                model=self.model_name,
                system_prompt=agent_system_prompt,
                temperature=0
            )

        # Generate and return the response dictionary
        agent_response_dict = model_instance.generate_text(prompt)
        return agent_response_dict

    def work(self, prompt):
        """
        Parses the dictionary returned from think and executes the appropriate tool.

        Parameters:
        prompt (str): The user query to generate a response for.

        Prints the result of the chosen tool, or the tool_input text itself if no matching tool is found.
        """
        agent_response_dict = self.think(prompt)
        if "error" in agent_response_dict:
            # The model call failed or its reply could not be used
            print(colored(f"Error: {agent_response_dict['error']}", 'red'))
            return

        tool_choice = agent_response_dict.get("tool_choice")
        tool_input = agent_response_dict.get("tool_input")

        for tool in self.tools:
            if tool.__name__ == tool_choice:
                response = tool(tool_input)
                print(colored(response, 'cyan'))
                return

        # "no tool" (or an unknown tool name): the model answered directly
        print(colored(tool_input, 'cyan'))
        return

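The class has three main methods:

  • prepare_tools: stores the tools and returns their descriptions.

  • think: asks the model which tool to use for the user's prompt.

  • work: runs the selected tool and outputs the result.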
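
The think method fills the system prompt template with str.format, which explains the doubled braces throughout the template: `{{` and `}}` produce literal braces for the JSON examples, while `{tool_descriptions}` is the one real placeholder. The following sketch shows the behavior on a shortened, made-up template.

```python
template = """
Reply in this JSON format:
{{
    "tool_choice": "name_of_the_tool",
    "tool_input": "inputs_to_the_tool"
}}

Available tools:
{tool_descriptions}
"""

# The substituted value may itself contain braces; it is inserted as-is, not re-parsed
description = 'basic_calculator: "Example: {"num1": 5, "num2": 3, "operation": "add"}"'
rendered = template.format(tool_descriptions=description)
print(rendered)

# Without the doubling, str.format treats the JSON braces as a replacement field and fails
try:
    '{"tool_choice": "x"} {tool_descriptions}'.format(tool_descriptions=description)
except (KeyError, ValueError) as exc:
    print("Unescaped braces fail with", type(exc).__name__)
```

This is also why the braces inside the tools' docstrings cause no trouble: they arrive through the placeholder's value and are never interpreted as format fields.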

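Step 6: Run the agent

The last step is to put everything together and run the agent. The script's main section initializes the agent and then reads user input in a loop: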

# Example usage
if __name__ == "__main__":
    """
    Instructions for using this agent:

    Example queries you can try:
    1. Calculator operations:
       - "Calculate 15 plus 7"
       - "What is 100 divided by 5?"
       - "Multiply 23 and 4"

    2. String reversal:
       - "Reverse the word 'hello world'"
       - "Can you reverse 'Python Programming'?"

    3. General questions (will get direct responses):
       - "Who are you?"
       - "What can you help me with?"

    Ollama Commands (run these in terminal):
    - Check available models:    'ollama list'
    - Check running models:      'ps aux | grep ollama'
    - List model tags:          'curl http://localhost:11434/api/tags'
    - Pull a new model:         'ollama pull mistral'
    - Run model server:         'ollama serve'
    """

    tools = [basic_calculator, reverse_string]

    # Uncomment below to run with OpenAI
    # model_service = OpenAIModel
    # model_name = 'gpt-3.5-turbo'
    # stop = None

    # Using Ollama with llama2 model
    model_service = OllamaModel
    # The model must already be pulled (e.g. 'ollama pull llama2');
    # other models such as 'mistral' or 'codellama' can be used instead.
    model_name = "llama2"
    # End-of-turn marker used by Llama 3 models; it has no effect on models that never emit it
    stop = "<|eot_id|>"

    agent = Agent(tools=tools, model_service=model_service, model_name=model_name, stop=stop)

    print("\nWelcome to the AI Agent! Type 'exit' to quit.")
    print("You can ask me to:")
    print("1. Perform calculations (e.g., 'Calculate 15 plus 7')")
    print("2. Reverse strings (e.g., 'Reverse hello world')")
    print("3. Answer general questions\n")

    while True:
        prompt = input("Ask me anything: ")
        if prompt.lower() == "exit":
            break

        agent.work(prompt)
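
The interactive loop above needs a running Ollama server. For a quick offline check of the choose-a-tool-then-run-it flow, the model can be replaced with a canned function. The sketch below is a simplified stand-in and does not use the tutorial's classes: run_agent_turn and canned_model are hypothetical names, the tool is a trimmed copy of reverse_string, and tools are looked up through a dictionary instead of a loop.

```python
def run_agent_turn(model_reply, tools):
    """Run the tool named in model_reply, or return the reply text when none matches."""
    registry = {tool.__name__: tool for tool in tools}
    tool_choice = model_reply.get("tool_choice")
    tool_input = model_reply.get("tool_input")
    if tool_choice in registry:
        return registry[tool_choice](tool_input)
    # "no tool" or an unknown name: tool_input carries the answer itself
    return tool_input


def reverse_string(text):
    """Trimmed stand-in for the tutorial's string-reversal tool."""
    return f"The reversed string is: {text[::-1]}"


def canned_model(prompt):
    """Hypothetical stub that imitates the model's JSON decision without calling Ollama."""
    if "reverse" in prompt.lower():
        return {"tool_choice": "reverse_string", "tool_input": prompt.split()[-1]}
    return {"tool_choice": "no tool", "tool_input": "I can reverse strings for you."}


for question in ["Please reverse Python", "Who are you?"]:
    print(run_agent_turn(canned_model(question), [reverse_string]))
```

Returning the result instead of printing it makes the dispatch step easy to assert on in a test, and the printing can stay in the calling loop.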

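3. Summary

Starting from the concept of an agent, this article set up the environment, defined the model, created the tools, built the toolbox, and finally assembled and ran the agent.

This structured approach lays the groundwork for building intelligent, interactive agents. AI agents are expected to spread across industries, driving efficiency and innovation.

Stay tuned for more in-depth analysis and advanced techniques to help you build more capable AI agents!

Further reading:

AI Agents (Part 1): Introduction

Translator: boxi.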
