Analyzing Excel data with ChatGPT

ChatGPT and Excel, 1 year ago (2023)

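ChatGPT's Assistants feature lets you upload up to 20 files, with a total capacity of 100 GB, and then answers questions based on those files. I recently tried out its data analysis ability and found that it works quite well. I also looked into how it works: ChatGPT turns the natural-language request into a Python program, runs that program in a sandbox, and derives the answer from the output, so the computed results should be reasonably accurate.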

Data preparation

Prepare two Excel files, product.xlsx and sales.xlsx. The first lists product ID, product name, and unit price; the second is the sales log:

Product ID    Product name    Unit price
1000          Rice cooker     100
1001          Air fryer       200
1002          Hair dryer      300

Time                   Product ID    Sales quantity
2021-01-01 08:00:00    1001
2021-01-01 08:10:00    1002
2021-01-01 08:20:00    1000
2021-01-01 08:30:00    1002
2021-01-01 08:40:00    1001
2021-01-01 08:50:00    1002
2021-01-01 09:00:00    1000
2021-01-01 09:10:00    1002
2021-01-01 09:20:00    1000
2021-01-01 09:30:00    1001
2021-01-01 09:40:00    1000
2021-01-01 09:50:00    1001
2021-01-01 10:00:00    1001
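As a rough illustration of the kind of program the sandbox ends up running for a question such as "how many sales records, and how much revenue, per product?", here is a minimal sketch in plain Python. The product rows and the product ID sequence come from the two tables above. The sales quantities do not appear there, so `ASSUMED_QUANTITY = 1` per record is purely an assumption for illustration, and the function name `summarize` is hypothetical. The code ChatGPT generates may well use pandas instead, but the join-then-aggregate logic would be the same.

```python
from collections import Counter

# Product table from product.xlsx: product ID -> (name, unit price)
PRODUCTS = {
    1000: ("Rice cooker", 100),
    1001: ("Air fryer", 200),
    1002: ("Hair dryer", 300),
}

# Product ID column of sales.xlsx, in time order (08:00 to 10:00, one row every 10 minutes)
SALES_PRODUCT_IDS = [1001, 1002, 1000, 1002, 1001, 1002, 1000,
                     1002, 1000, 1001, 1000, 1001, 1001]

# ASSUMPTION: the quantity column is not shown in the article, so every
# record is treated as one unit sold. Replace with the real values.
ASSUMED_QUANTITY = 1


def summarize(products, sales_ids, quantity=ASSUMED_QUANTITY):
    """Join sales records to products and aggregate per product name."""
    counts = Counter(sales_ids)  # number of sales records per product ID
    summary = {}
    for product_id, (name, unit_price) in products.items():
        records = counts.get(product_id, 0)
        summary[name] = {
            "records": records,
            "revenue": records * quantity * unit_price,
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(PRODUCTS, SALES_PRODUCT_IDS).items():
        print(name, row["records"], row["revenue"])
```

Being able to read the generated program in this way is what makes the result checkable: the arithmetic is done by the interpreter, not estimated by the language model.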

Upload the files to OpenAI

# Upload the files
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def upload_file():
    product = client.files.create(
        file=open("/Users/crazyicelee/MiniProjects/PythonDemo/product.xlsx", "rb"),
        purpose="assistants"
    )

    sales = client.files.create(
        file=open("/Users/crazyicelee/MiniProjects/PythonDemo/sales.xlsx", "rb"),
        purpose="assistants"
    )
    return [product.id, sales.id]
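A variation on the upload step, sketched under a few assumptions: the helper name `upload_files` and its `client` and `paths` parameters are hypothetical, and the only API call used is the same `client.files.create(file=..., purpose="assistants")` as above. Opening each file in a `with` block closes the handle once the upload returns, and passing the client in as an argument makes the function easy to try against a stand-in object.

```python
from pathlib import Path


def upload_files(client, paths):
    """Upload each path for use by an assistant and return the file IDs in order."""
    file_ids = []
    for path in paths:
        # The with-block closes the handle even if the upload raises.
        with Path(path).open("rb") as fh:
            uploaded = client.files.create(file=fh, purpose="assistants")
        file_ids.append(uploaded.id)
    return file_ids
```

With a real client this could be called as `upload_files(client, ["product.xlsx", "sales.xlsx"])`, and the returned list passed straight to the function that creates the assistant.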

Create a new Assistant object

# Create the assistant, a thread, and a run
def create_assistan(file_ids):
    assistant = client.beta.assistants.create(
        name="大表姐",
        instructions="You are a data analyst. Produce the requested statistics from the Excel files the user provides.",
        tools=[{"type": "code_interpreter"}],
        model="gpt-4-1106-preview",
        file_ids=file_ids
    )

    thread = client.beta.threads.create()

    run = client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant.id,
        instructions="The product file maps product ID to product name and unit price; the sales file is the sales log for each product. Answer all later questions in Chinese, formatted as markdown."
    )
    return assistant.id, thread.id, run.id

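Note: this creates the Assistant, thread, and run objects in one go. It is best to state the structure of the target files, and how they relate to each other, explicitly in the run's instructions so that ChatGPT knows what it is analyzing.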
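A run responds to whatever messages are already in the thread, so a question normally has to be posted to the thread before a run is created for it. The sketch below shows one way to do that with `client.beta.threads.messages.create` followed by `client.beta.threads.runs.create`. The helper name `ask`, its parameters, and the sample question in the usage note are assumptions for illustration, not part of the original script.

```python
def ask(client, thread_id, assistant_id, question, run_instructions=None):
    """Post a user question to the thread, then start a run to answer it."""
    # Add the question as a user message on the existing thread.
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=question,
    )
    # Only pass instructions when given: run-level instructions override
    # the assistant's own instructions for that run.
    kwargs = {"thread_id": thread_id, "assistant_id": assistant_id}
    if run_instructions is not None:
        kwargs["instructions"] = run_instructions
    run = client.beta.threads.runs.create(**kwargs)
    return run.id
```

A call might look like `ask(client, thread_id, assistant_id, "What is the total sales quantity per product?")`, with the returned run ID then polled until the run finishes.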

Start the run
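The script that follows polls the run once a second until its status is `completed`. A run can also end as `failed`, `cancelled`, or `expired`, in which case that loop would never exit, so here is a hedged sketch of a polling helper that stops on any of those states or after a timeout. The name `wait_for_run` and the `poll_interval` and `timeout` parameters are hypothetical; the only API call is the same `client.beta.threads.runs.retrieve` the script uses.

```python
import time

# Statuses after which a run will not change any more.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_run(client, thread_id, run_id, poll_interval=1.0, timeout=300.0):
    """Poll a run until it reaches a terminal status; return that status."""
    deadline = time.monotonic() + timeout
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in TERMINAL_STATUSES:
            return run.status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status} after {timeout}s")
        time.sleep(poll_interval)
```

The caller can then read the thread's messages only when the returned status is `"completed"` and report the other outcomes as errors.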

import time

if __name__ == "__main__":
    assistant_id, thread_id, run_id = create_assistan(["file-11Azcb6dozmxQmOUG4OP5tbU", "file-lZlPAABvUO9LoLHjFEPUJ8vh"])

    is_quit = True
    while is_quit:
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id,
            run_id=run_id
        )
        if run.status == "completed":
            messages = client.beta.threads.messages.list(
                thread_id=thread_id,
            )

            # Only print the first message
            i = 0
            for message in messages:
                if i