Building a Cloud-Native Generative AI Education and Learning Platform on Amazon Web Services (AWS)

2024-08-16

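Project overview:

Xiao Li Ge (小李哥) continues this series of daily introductions to cutting-edge AI solutions built on the Amazon Web Services (AWS) cloud platform. The goal is to help readers quickly pick up AWS AI best practices and apply them in their day-to-day work.

This installment shows how to apply generative AI and the Amazon Titan models to education and learning scenarios, using the managed foundation-model service Amazon Bedrock together with cloud-native container management services. The platform uses the Titan models' embedding, text-generation, and image-generation capabilities to create course assignments and answers for students. The architecture is entirely cloud-native and serverless, providing a scalable and secure AI solution, and the application is integrated with the AI models through an Application Load Balancer and Amazon ECS. The architecture diagram for this solution is shown below: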

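Background knowledge for this solution

What is Amazon Bedrock?

Amazon Bedrock is an AWS service designed to help developers build and scale generative AI applications. Bedrock provides access to foundation models from multiple providers, including AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. Users can create, customize, and deploy generative AI applications with these models without training a model from scratch. The available models cover text generation, image generation, code generation, and more, which simplifies development and shortens the path to a working application.

What are the Amazon Titan models?

Amazon Titan is a family of foundation models developed by AWS. The Titan models are trained on large-scale data and, depending on the model, handle natural language understanding and generation. They suit scenarios such as text writing, dialogue generation, code generation, vector (embedding) generation, and image generation.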

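Vector generation

The Titan embeddings models convert text or other data into high-dimensional vector representations, which support similarity search, recommendation systems, and other vector-based tasks. With these vectors, developers can retrieve and classify items in large datasets efficiently, improving the performance and user experience of intelligent applications.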
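
As a rough illustration of the similarity search described above, the sketch below requests an embedding from the Titan embeddings model and compares two vectors with cosine similarity. It assumes the `{"inputText": ...}` request body and the `embedding` response field used by `amazon.titan-embed-text-v1`. The helper names `embed_text` and `cosine_similarity` are hypothetical and not part of the original project, and `embed_text` only runs with AWS credentials and model access in place.

```python
import json
import math


def embed_text(text, model_id="amazon.titan-embed-text-v1"):
    """Return the embedding vector for `text` (requires AWS credentials)."""
    import boto3  # imported lazily so the pure-Python helper below works offline

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    # Assumption: the Titan embeddings response carries the vector under "embedding"
    return json.loads(response["body"].read())["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)
```

In a learning platform, one possible use is to embed a student's answer and the stored reference answer, then treat a high cosine similarity as a hint that the two say roughly the same thing.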

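Image generation

Beyond text and vector generation, the Titan family also includes image generation. With the Titan image model, developers can generate high-quality images from text descriptions, or apply style transfer and enhancement to an input image. This capability is useful in education, creative design, advertising, game development, and similar fields.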

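What this solution covers

1. Manage access to the generative AI models hosted on Amazon Bedrock.

2. Evaluate and select AI models suited to the education platform.

3. Deploy the containerized education platform application with the ECS container management service.

4. Use generative AI in the education platform application to create course assignments for students.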

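Step-by-step project setup:

1. Open the AWS console and go to the home page of Amazon Bedrock, the managed foundation-model service.

2. Confirm that the following three models are enabled: Titan Embeddings G1 - Text, Titan Text G1 - Premier, and Titan Image Generator G1. They are the embedding, text-generation, and image-generation models, respectively.

3. The API request samples on each model's detail page in Amazon Bedrock show the IDs of these three models: "amazon.titan-embed-text-v1", "amazon.titan-text-premier-v1:0", and "amazon.titan-image-generator-v1".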
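
The model IDs can also be looked up programmatically instead of being copied from the console. The following is a minimal sketch, assuming the boto3 `bedrock` control-plane client and its `list_foundation_models` call with the `byProvider` filter. The helper names and the default region `us-east-1` are assumptions made for illustration.

```python
def titan_model_ids(model_summaries):
    """Pick the IDs of Titan models out of a list of model summaries."""
    return sorted(
        summary["modelId"]
        for summary in model_summaries
        if summary.get("modelId", "").startswith("amazon.titan")
    )


def fetch_titan_model_ids(region_name="us-east-1"):
    """Ask Bedrock for Amazon-provided models (requires AWS credentials)."""
    import boto3  # imported lazily so titan_model_ids can be used offline

    bedrock = boto3.client("bedrock", region_name=region_name)
    response = bedrock.list_foundation_models(byProvider="Amazon")
    return titan_model_ids(response["modelSummaries"])
```

Note that a model appearing in this list does not by itself mean access has been granted. The check in step 2 is still needed.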

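4. Open the cloud IDE Cloud9 on AWS, create a Python file named "1_Create_Assignments.py", and copy the following code into it. The code uses two of the model IDs obtained in the previous step, "amazon.titan-text-premier-v1:0" and "amazon.titan-image-generator-v1". It generates course assignments in the Streamlit application.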

import json
import logging
import math
import random
import re
import time
import base64
from io import BytesIO


import boto3
import numpy as np
import streamlit as st
from PIL import Image
from botocore.exceptions import ClientError
from components.Parameter_store import S3_BUCKET_NAME

dynamodb_client = boto3.resource("dynamodb")
bedrock_client = boto3.client("bedrock-runtime")
questions_table = dynamodb_client.Table("assignments")
user_name = "Demo-user"

titan_text_model_id = "amazon.titan-text-premier-v1:0"
titan_image_model_id = "amazon.titan-image-generator-v1"

if "input-text" not in st.session_state:
    st.session_state["input-text"] = None

if "question_answers" not in st.session_state:
    st.session_state["question_answers"] = None

if "reading_material" not in st.session_state:
    st.session_state["reading_material"] = None

# Method to call the Titan text foundation model 
def query_generate_questions_answers_endpoint(input_text):
    prompt = f"{input_text}\n Using the above context, please generate five questions and answers you could ask students about this information"
    input_body = json.dumps({
            "inputText": prompt,
            "textGenerationConfig": {
                "maxTokenCount": 3072,
                "stopSequences": [],
                "temperature": 0.7,
                "topP": 0.9
            }
        })
    titan_qa_response = bedrock_client.invoke_model(
        modelId=titan_text_model_id,
        contentType="application/json",
        accept="application/json",
        body=input_body,
    )
    
    response_body = json.loads(titan_qa_response.get("body").read())
    response_text = ""
    for result in response_body['results']:
        response_text = f"{response_text}\n{result['outputText']}"

    return parse_text_to_lines(response_text)

# method to call the Titan image foundation model
def query_generate_image_endpoint(input_text):
    seed = np.random.randint(1000)
    input_body = json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": f"An image of {input_text}"
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "height": 1024,
            "width": 1024,
            "cfgScale": 8.0,
            # Use the random seed drawn above so repeated calls can produce different images
            "seed": int(seed)
        }
    })
    if titan_image_model_id == "<model-id>":
        return None
    else:
        titan_image_api_response = bedrock_client.invoke_model(
            body=input_body,
            modelId=titan_image_model_id,
            accept="application/json",
            contentType="application/json",
        )
        response_body = json.loads(
            titan_image_api_response.get("body").read()
        )
            
        base64_image = response_body.get("images")[0]
        base64_bytes = base64_image.encode('ascii')
        image_bytes = base64.b64decode(base64_bytes)
        
        image = Image.open(BytesIO(image_bytes))
        return image

def generate_assignment_id_key():
    # Milliseconds since epoch
    epoch = round(time.time() * 1000)
    epoch = epoch - 1670000000000
    rand_id = math.floor(random.random() * 999)
    return (epoch * 1000) + rand_id


# create a function to load a file to S3 bucket
def load_file_to_s3(file_name, object_name):
    # Upload the file
    s3_client = boto3.client("s3")
    try:
        s3_client.upload_file(file_name, S3_BUCKET_NAME, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True


# Insert an assignment record into the DynamoDB table "assignments"
def insert_record_to_dynamodb(
    assignment_id, prompt, s3_image_name, question_answers
):
    questions_table.put_item(
        Item={
            "assignment_id": assignment_id,
            "teacher_id": user_name,
            "prompt": prompt,
            "s3_image_name": s3_image_name,
            "question_answers": question_answers,
        }
    )

# Parse a string of text to get a line at a time
def parse_text_to_lines(text):
    st.write(text)
    lines = text.split('\n')
    lines = [line.strip() for line in lines]
    # Loop through each line and check if it's a question
    question_answers = []
    question = None
    answer = None
    question_id = 0
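    # Patterns for "Q:", "Q1:", "Question 1:" and the matching answer prefixes.
    # Raw strings keep \s from being treated as an invalid string escape,
    # and compiling once outside the loop avoids repeated work.
    question_pattern = re.compile(r"Q[0-9]?:|Question[\s]?[0-9]?:|QUESTION[\s]?[0-9]?:")
    answer_pattern = re.compile(r"A[0-9]?:|Answer[\s]?[0-9]?:|ANSWER[\s]?[0-9]?:")
    for i in range(len(lines)):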
        question_match = question_pattern.search(lines[i])
        answer_match = answer_pattern.search(lines[i])
        if question_match:
            # Get the substring after the matching pattern
            question = lines[i][question_match.end() + 1:]

        if answer_match:
            # Get the substring after the matching pattern
            answer = lines[i][answer_match.end() + 1:]

        if question and answer:
            question_answer = {'id': question_id, 'question': question, 'answer': answer}
            question_answers.append(question_answer)
            question_id += 1
            question = None
            answer = None

    return question_answers

# Page configuration
st.set_page_config(page_title="Create Assignments", page_icon=":pencil:", layout="wide")

# Sidebar
st.sidebar.header("Create Assignments")

# Rest of the page
st.markdown("# Create Assignments")
st.sidebar.header("Input text to create assignments")

text = st.text_area("Input Text")
if text and text != st.session_state.get("input-text", None) and text != "None":
    try:
        if titan_image_model_id != "<model-id>":
            image = query_generate_image_endpoint(text)
            image.save("temp-create.png")
            st.session_state["input-text"] = text

        # generate questions and answer
        questions_answers = query_generate_questions_answers_endpoint(text)
        # st.write(questions_answers)
        st.session_state["question_answers"] = questions_answers
    except Exception as ex:
        st.error(f"There was an error while generating question. {ex}")

if st.session_state.get("question_answers", None):
    st.markdown("## Generated Questions and Answers")
    questions_answers = st.text_area(
        "Questions and Answers",
        json.dumps(st.session_state["question_answers"], indent=4),
        height=320,
        label_visibility="collapsed"
    )

if st.button("Generate Questions and Answers"):
    st.session_state["question_answers"] = query_generate_questions_answers_endpoint(text)
    # Newer Streamlit releases provide st.rerun() in place of st.experimental_rerun()
    st.experimental_rerun()

if st.session_state.get("input-text", None):
    if titan_image_model_id != "<model-id>":
        images = Image.open("temp-create.png")
        st.image(images, width=512)

if titan_image_model_id != "<model-id>":
    if st.button("Generate New Image"):
        image = query_generate_image_endpoint(text)
        image.save("temp-create.png")
        st.experimental_rerun()

st.markdown("------------")
if st.button("Save Question"):
    assignment_id = str(generate_assignment_id_key())
    # Serialize the generated Q&A once so both branches store the same format
    # (the original else branch could hit an undefined variable)
    questions_answers = json.dumps(st.session_state["question_answers"], indent=4)
    if titan_image_model_id != "<model-id>":
        # load to s3
        object_name = f"generated_images/{assignment_id}.png"
        validation_object_name = "generated_images/temp-create.png"
        load_file_to_s3("temp-create.png", object_name)
        load_file_to_s3("temp-create.png", validation_object_name)
        st.success(f"Image generated and uploaded successfully: {object_name}")
    else:
        object_name = "no image created"
    insert_record_to_dynamodb(assignment_id, text, object_name, questions_answers)

    st.success("An assignment created and saved successfully")

hide_streamlit_style = """
    <style>
        #MainMenu {visibility: hidden;}
        footer{ visibility: hidden;}
    </style>
    """

st.markdown(hide_streamlit_style, unsafe_allow_html=True)
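
The question-and-answer parsing in `parse_text_to_lines` can be tried outside Streamlit. The sketch below is a simplified standalone variant, not the code above. It assumes the model prefixes each item with "Q"/"Question" or "A"/"Answer", an optional number, and a colon, and it also accepts multi-digit numbers such as "Question 12:". The name `parse_qa` is hypothetical.

```python
import re

# Prefix, optional number, colon, then the text we want to keep
QUESTION_RE = re.compile(r"^(?:Q|Question)\s*\d*\s*:\s*(.+)$", re.IGNORECASE)
ANSWER_RE = re.compile(r"^(?:A|Answer)\s*\d*\s*:\s*(.+)$", re.IGNORECASE)


def parse_qa(text):
    """Pair each question line with the next answer line that follows it."""
    pairs = []
    question = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        question_match = QUESTION_RE.match(line)
        answer_match = ANSWER_RE.match(line)
        if question_match:
            question = question_match.group(1).strip()
        elif answer_match and question:
            pairs.append(
                {
                    "id": len(pairs),
                    "question": question,
                    "answer": answer_match.group(1).strip(),
                }
            )
            question = None  # wait for the next question before pairing again
    return pairs


if __name__ == "__main__":
    sample = "Q1: Where is New Orleans?\nA1: In Louisiana."
    print(parse_qa(sample))
```

Model output is not guaranteed to follow this layout, so a production parser would likely need a fallback for responses that yield no pairs.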

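5. Next, package the application code as a container image and push it to Amazon ECR, the AWS container image registry, by running the following Bash commands.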

# Authenticate the Docker client against the ECR registry
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 903982278766.dkr.ecr.us-east-1.amazonaws.com
# Build the Docker image learning-system-repo
docker build -t learning-system-repo .
# Tag the container image
docker tag learning-system-repo:latest 903982278766.dkr.ecr.us-east-1.amazonaws.com/learning-system-repo:latest
# Push the container image to ECR
docker push 903982278766.dkr.ecr.us-east-1.amazonaws.com/learning-system-repo:latest
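
The commands above hard-code one account ID and region. As a small sketch of how the same registry and image names could be derived for another account, the helpers below build them from their parts. The function names are hypothetical, and the host name pattern assumes a standard commercial AWS region.

```python
def ecr_registry(account_id, region):
    """Host name of the private ECR registry for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Full image URI as used by `docker tag` and `docker push`."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def current_account_id():
    """Look up the caller's account ID via STS (requires AWS credentials)."""
    import boto3  # imported lazily so the string helpers work offline

    return boto3.client("sts").get_caller_identity()["Account"]
```

For example, `ecr_image_uri(current_account_id(), "us-east-1", "learning-system-repo")` would yield the value to pass to `docker tag` and `docker push` for the account that is currently logged in.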

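6. Next, go to Amazon ECS, the AWS container management service. Create a cluster named "Learning-System-cluster" and a service, then deploy the uploaded container image to it. A successful deployment creates a public-facing load balancer, which provides high availability across the tasks of the service. Copy the URL of this load balancer.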
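
The article performs this step in the console. For readers who prefer to script it, the following is a hedged sketch of registering a task definition with boto3's `register_task_definition`. It assumes the Fargate launch type (the article only says the design is serverless), Streamlit's default port 8501, and hypothetical names for the task family and container. The execution role ARN must be supplied by the reader.

```python
def build_task_definition(image_uri, execution_role_arn, container_port=8501):
    """Assemble keyword arguments for ecs.register_task_definition."""
    return {
        "family": "learning-system-task",  # hypothetical family name
        "networkMode": "awsvpc",  # required for Fargate tasks
        "requiresCompatibilities": ["FARGATE"],  # assumption: Fargate launch type
        "cpu": "512",
        "memory": "1024",
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": "learning-system",  # hypothetical container name
                "image": image_uri,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
            }
        ],
    }


def register_task_definition(task_definition):
    """Register the task definition and return its ARN (requires AWS credentials)."""
    import boto3  # imported lazily so the builder can be inspected offline

    ecs = boto3.client("ecs")
    response = ecs.register_task_definition(**task_definition)
    return response["taskDefinition"]["taskDefinitionArn"]
```

The application also calls Bedrock, DynamoDB, and S3, so a real deployment would additionally need a task role with those permissions, plus the service and load balancer configuration that the console wizard creates.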

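7. Open the URL in a browser to log in to the AI learning platform. Click Create Assignments on the left, enter text about New Orleans as the reference answer, and click Generate Questions and Answers to create course assignment questions with generative AI.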

8. The learning platform generates several course assignment questions with generative AI. Click Save Question to store the questions in DynamoDB, the AWS NoSQL database.

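9. When students log in to the test page of the learning platform, they see the questions generated earlier and can enter their answers to complete the assignment.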
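
The code for the student test page is not shown in this article. As one possible sketch of how it might read the saved assignments, the helpers below scan the "assignments" table and decode the `question_answers` field, which step 4 stores as a JSON string. The function names are hypothetical.

```python
import json


def decode_assignment(item):
    """Turn a stored DynamoDB item into a structure a page could render."""
    return {
        "assignment_id": item["assignment_id"],
        "prompt": item["prompt"],
        "image": item["s3_image_name"],
        # question_answers was saved with json.dumps, so decode it back to a list
        "questions": json.loads(item["question_answers"]),
    }


def load_assignments(table_name="assignments"):
    """Read all assignments from DynamoDB (requires AWS credentials)."""
    import boto3  # imported lazily so decode_assignment works offline

    table = boto3.resource("dynamodb").Table(table_name)
    response = table.scan()
    return [decode_assignment(item) for item in response.get("Items", [])]
```

A single `scan` call returns at most one page of results. A table with many assignments would need to follow `LastEvaluatedKey` to read the rest, which this sketch leaves out.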

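These are all the steps for building a cloud-based education platform with generative AI on AWS, creating course assignments for students and providing a test page for them. Join me in future posts for more cutting-edge generative AI development solutions.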
