文本转换(Transforming)

文本转换(Transforming)

如何利用大语言模型(LLM)强大的文本转换能力,通过编程调用 API 接口,实现包括多语种翻译、拼写与语法纠正、语气调整及格式转换等多种功能。

1. 多语种翻译

模型不仅能进行基础的语言互译,还能识别语种、调整翻译语气,甚至构建一个通用的翻译工作流。

  • 基础翻译:将一种语言翻译成另一种。
    prompt = f""" 将以下中文翻译成西班牙语: \ ```您好,我想订购一个搅拌机。``` """ response = get_completion(prompt) print(response)
    • 输出:Hola, me gustaría ordenar una batidora.
  • 语种识别:让模型判断一段文本的语言。
    prompt = f""" 请告诉我以下文本是什么语种: ```Combien coûte le lampadaire?``` """ response = get_completion(prompt) print(response)
    • 输出这是法语。
  • 多语种翻译:一次性将文本翻译成多种语言。
    prompt = f""" 请将以下文本分别翻译成中文、英文、法语和西班牙语: ```I want to order a basketball.``` """ response = get_completion(prompt) print(response)
    • 输出
      • 中文:我想订购一个篮球。
      • 法语:Je veux commander un ballon de basket.
      • ...
  • 翻译+语气调整:在翻译的同时,指定输出的语气风格。
    prompt = f""" 请将以下文本翻译成中文,分别展示成正式与非正式两种语气: ```Would you like to order a pillow?``` """ response = get_completion(prompt) print(response)
    • 输出
      • 正式语气:请问您需要订购枕头吗?
      • 非正式语气:你要不要订一个枕头?
  • 通用翻译器:结合语种识别和翻译,构建一个自动化的翻译流程。
user_messages = [ "La performance du système est plus lente que d'habitude.", # System performance is slower than normal "Mi monitor tiene píxeles que no se iluminan.", # My monitor has pixels that are not lighting "Il mio mouse non funziona", # My mouse is not working "Mój klawisz Ctrl jest zepsuty", # My keyboard has a broken control key "我的屏幕在闪烁" # My screen is flashing ] for issue in user_messages: prompt = f"告诉我以下文本是什么语种,直接输出语种,如法语,无需输出标点符号: ```{issue}```" lang = get_completion(prompt) print(f"原始消息 ({lang}): {issue}\n") prompt = f""" 将以下消息分别翻译成英文和中文,并写成 中文翻译:xxx 英文翻译:yyy 的格式: ```{issue}``` """ response = get_completion(prompt) print(response, "\n=========================================")

结果:

原始消息 (法语): La performance du système est plus lente que d'habitude. 中文翻译:系统性能比平时慢。 英文翻译:The system performance is slower than usual. ========================================= 原始消息 (西班牙语): Mi monitor tiene píxeles que no se iluminan. 中文翻译:我的显示器有一些像素点不亮。 英文翻译:My monitor has pixels that don't light up. ========================================= ..............

2. 语气/风格调整

根据目标受众和场景,改变文本的写作风格。例如:将口语化的中文转换成正式的商务信函。

prompt = f""" 将以下文本翻译成商务信函的格式: ```小老弟,我小羊,上回你说咱部门要采购的显示器是多少寸来着?``` """ response = get_completion(prompt) print(response)

结果:

尊敬的XXX(收件人姓名):

您好!我是XXX(发件人姓名),在此向您咨询一个问题。上次我们交流时,您提到我们部门需要采购显示器,但我忘记了您所需的尺寸是多少英寸。希望您能够回复我,以便我们能够及时采购所需的设备。

谢谢您的帮助!

此致

敬礼

XXX(发件人姓名)

3. 格式转换

将文本从一种数据格式转换为另一种,例如从 JSON 转换为 HTML。

data_json = { "resturant employees" :[ {"name":"Shyam", "email":"shyamjaiswal@gmail.com"}, {"name":"Bob", "email":"bob32@gmail.com"}, {"name":"Jai", "email":"jai87@gmail.com"} ]} prompt = f""" 将以下Python字典从JSON转换为HTML表格,保留表格标题和列名:{data_json} """ response = get_completion(prompt) print(response) from IPython.display import display, Markdown, Latex, HTML, JSON display(HTML(response))

结果:

4. 拼写及语法纠正

充当一个智能校对工具,自动发现并纠正文本中的错误。

  • 基础纠错:循环处理一个句子列表,纠正其中的主谓不一致、同音异义词误用、拼写错误等。
    text = [ "The girl with the black and white puppies have a ball.", # The girl has a ball. "Yolanda has her notebook.", # ok "Its going to be a long day. Does the car need it’s oil changed?", # Homonyms "Their goes my freedom. There going to bring they’re suitcases.", # Homonyms "Your going to need you’re notebook.", # Homonyms "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms "This phrase is to cherck chatGPT for speling abilitty" # spelling ] for i in range(len(text)): prompt = f"""请校对并更正以下文本,注意纠正文本保持原始语种,无需输出原始文本。 如果您没有发现任何错误,请说“未发现错误”。 例如: 输入:I are happy. 输出:I am happy. ```{text[i]}```""" response = get_completion(prompt) print(i, response)

    结果:

    0 The girl with the black and white puppies has a ball. 1 未发现错误。 2 It's going to be a long day. Does the car need its oil changed? 3 Their goes my freedom. They're going to bring their suitcases. 4 You're going to need your notebook. 5 That medicine affects my ability to sleep. Have you heard of the butterfly effect? 6 This phrase is to check chatGPT for spelling abil。

5. 综合转换任务

将以上多种能力组合起来(文本翻译+拼写纠正+风格调整+格式转换),通过一个复杂的 Prompt 完成端到端的文本处理。

text = f""" Got this for my daughter for her birthday cuz she keeps taking \ mine from my room. Yes, adults also like pandas too. She takes \ it everywhere with her, and it's super soft and cute. One of the \ ears is a bit lower than the other, and I don't think that was \ designed to be asymmetrical. It's a bit small for what I paid for it \ though. I think there might be other options that are bigger for \ the same price. It arrived a day earlier than expected, so I got \ to play with it myself before I gave it to my daughter. """ prompt = f""" 针对以下三个反引号之间的英文评论文本, 首先进行拼写及语法纠错, 然后将其转化成中文, 再将其转化成优质淘宝评论的风格,从各种角度出发,分别说明产品的优点与缺点,并进行总结。 润色一下描述,使评论更具有吸引力。 输出结果格式为: 【优点】xxx 【缺点】xxx 【总结】xxx 注意,只需填写xxx部分,并分段输出。 将结果输出成Markdown格式。 ```{text}``` """ response = get_completion(prompt) display(Markdown(response))