Checking How Hallucinations Occur in AWS Bedrock (Claude) Depending on the Information It Reads

Author: tsuyoshi.watanabe | February 22, 2024

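Introduction

I'm Watanabe from the Cloud Integration Department.
This time I compare the output of the following three patterns.
The verification focuses specifically on hallucination.

AWS Bedrock only
Integration with AWS Kendra
Agents for Amazon Bedrock (Google Cloud APIs)

This is based on the content of an event held exclusively for Jr. Champions.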

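What are the Japan AWS Jr. Champions?

Young APN engineers who belong to an AWS Partner Network (APN) member company, are currently in their first to third year of working life, and who actively learn AWS, take action, and influence those around them

Source: Announcement of the 2023 Japan AWS Jr. Champions

I take part in regular activities, such as study sessions and lightning talk (LT) events, with the other Japan AWS Jr. Champions members.

Photo from the 2023 Japan AWS Jr. Champions award ceremony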

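Summary up front

Integration with Kendra
- Well suited when the search target is fixed and you need correct answers
- When the search target is not fixed, something other than Kendra seems preferable
Agent (Google Cloud APIs)
- Referring to information on the internet does not guarantee an accurate answer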

Environment setup

Note that this walkthrough assumes the environment is created in the us-west-2 (US West (Oregon)) region.

Creating the stack

First, create a stack from the following CloudFormation template.

##This CloudFormation template creates an Amazon Kendra index. It adds a webcrawler datasource
##to the index and crawls the online AWS Documentation for Amazon Kendra, Amazon Lex and Amazon SageMaker
##After the datasource is configured, it triggers a datasource sync, i.e. the process to crawl the sitemaps
##and index the crawled documents.
##The output of the CloudFormation template shows the Kendra index id and the AWS region it was created in.
##It takes about 30 minutes to create an Amazon Kendra index and about 15 minutes more to crawl and index
##the content of these webpages to the index. Hence you might need to wait for about 45 minutes after
##launching the CloudFormation stack
Resources:
  ##Create the Role needed to create a Kendra Index
  KendraIndexRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: ''
            Effect: Allow
            Principal:
              Service: kendra.amazonaws.com
            Action: 'sts:AssumeRole'
      Policies:
        - PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Resource: '*'
                Condition:
                  StringEquals:
                    'cloudwatch:namespace': 'Kendra'
                Action:
                  - 'cloudwatch:PutMetricData'
              - Effect: Allow
                Resource: '*'
                Action: 'logs:DescribeLogGroups'
              - Effect: Allow
                Resource: !Sub
                  - 'arn:aws:logs:${region}:${account}:log-group:/aws/kendra/*'
                  - region: !Ref 'AWS::Region'
                    account: !Ref 'AWS::AccountId'
                Action: 'logs:CreateLogGroup'
              - Effect: Allow
                Resource: !Sub
                  - 'arn:aws:logs:${region}:${account}:log-group:/aws/kendra/*:log-stream:*'
                  - region: !Ref 'AWS::Region'
                    account: !Ref 'AWS::AccountId'
                Action: 
                  - 'logs:DescribeLogStreams'
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
          PolicyName: !Join
            - ''
            - - !Ref 'AWS::StackName'
              - '-DocsKendraIndexPolicy'
      RoleName: !Join
        - ''
        - - !Ref 'AWS::StackName'
          - '-DocsKendraIndexRole'
      
  ##Create the Kendra Index
  DocsKendraIndex:
    Type: 'AWS::Kendra::Index'
    Properties:
      Name: !Join
        - ''
        - - !Ref 'AWS::StackName'
          - '-Index'
      Edition: 'DEVELOPER_EDITION'
      RoleArn: !GetAtt KendraIndexRole.Arn

  ##Create the Role needed to attach the Webcrawler Data Source
  KendraDSRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: ''
            Effect: Allow
            Principal:
              Service: kendra.amazonaws.com
            Action: 'sts:AssumeRole'
      Policies:
        - PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Resource: !Sub
                  - 'arn:aws:kendra:${region}:${account}:index/${index}'
                  - region: !Ref 'AWS::Region'
                    account: !Ref 'AWS::AccountId'
                    index: !GetAtt DocsKendraIndex.Id
                Action:
                  - 'kendra:BatchPutDocument'
                  - 'kendra:BatchDeleteDocument'
          PolicyName: !Join
            - ''
            - - !Ref 'AWS::StackName'
              - '-DocsDSPolicy'
      RoleName: !Join
        - ''
        - - !Ref 'AWS::StackName'
          - '-DocsDSRole'
  
  #Docs Data Source
  KendraDocsDS:
    Type: 'AWS::Kendra::DataSource'
    Properties: 
      DataSourceConfiguration: 
        WebCrawlerConfiguration:
          UrlInclusionPatterns:
            - '.*https://docs.aws.amazon.com/ja_jp/lex/.*'
            - '.*https://docs.aws.amazon.com/ja_jp/kendra/.*'
            - '.*https://docs.aws.amazon.com/ja_jp/sagemaker/.*'
          Urls:
             SiteMapsConfiguration:
               SiteMaps:
                 - 'https://docs.aws.amazon.com/ja_jp/lex/latest/dg/sitemap.xml'
                 - 'https://docs.aws.amazon.com/ja_jp/kendra/latest/dg/sitemap.xml'
                 - 'https://docs.aws.amazon.com/ja_jp/sagemaker/latest/dg/sitemap.xml'
      IndexId: !GetAtt DocsKendraIndex.Id
      Name: 'KendraDocsDS'
      RoleArn: !GetAtt KendraDSRole.Arn
      Type: 'WEBCRAWLER'
      LanguageCode: ja
    
  DataSourceSyncLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Resource: !Sub
                  - 'arn:aws:kendra:${region}:${account}:index/${index}*'
                  - region: !Ref 'AWS::Region'
                    account: !Ref 'AWS::AccountId'
                    index: !GetAtt DocsKendraIndex.Id
                Action:
                  - 'kendra:StartDataSourceSyncJob'
          PolicyName: DataSourceSyncLambdaPolicy
          
  DataSourceSyncLambda:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.lambda_handler
      Runtime: python3.12
      Role: !GetAtt 'DataSourceSyncLambdaRole.Arn'
      Timeout: 900
      MemorySize: 1024
      Code:
        ZipFile: |

          # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
          # SPDX-License-Identifier: MIT-0
          
          import json
          import logging
          import boto3
          import cfnresponse
          import random
          import os
          
          logger = logging.getLogger()
          logger.setLevel(logging.INFO)
          
          INDEX_ID = os.environ['INDEX_ID']
          DS_ID = os.environ['DS_ID']
          AWS_REGION = os.environ['AWS_REGION']
          KENDRA = boto3.client('kendra')
          
          def start_data_source_sync(dsId, indexId):
              logger.info(f"start_data_source_sync(dsId={dsId}, indexId={indexId})")
              resp = KENDRA.start_data_source_sync_job(Id=dsId, IndexId=indexId)
              logger.info(f"response:" + json.dumps(resp))
          
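          def lambda_handler(event, context):
              logger.info("Received event: %s" % json.dumps(event))
              status = cfnresponse.SUCCESS
              try:
                  # A sync is pointless while the stack is being deleted, so skip it then
                  if event.get('RequestType') != 'Delete':
                      start_data_source_sync(DS_ID, INDEX_ID)
              except Exception:
                  logger.exception("Failed to start the data source sync")
                  status = cfnresponse.FAILED
              # Always reply to CloudFormation so the stack does not hang waiting for a response
              cfnresponse.send(event, context, status, {}, None)
              return status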

      Environment:
        Variables:
          INDEX_ID: !GetAtt DocsKendraIndex.Id
          DS_ID: !GetAtt KendraDocsDS.Id
  
  DataSourceSync:
    Type: Custom::DataSourceSync
    DependsOn: 
      - DocsKendraIndex
      - KendraDocsDS
    Properties:
      ServiceToken: !GetAtt DataSourceSyncLambda.Arn
    
Outputs:
  KendraIndexID:
    Value: !GetAtt DocsKendraIndex.Id
  AWSRegion:
    Value: !Ref 'AWS::Region'
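
For readers who prefer to launch the template from code rather than the console, the following is a minimal sketch using boto3. The function names and the stack name are placeholders of my own, not part of the original walkthrough. Because the template assigns explicit RoleName values, the request has to acknowledge CAPABILITY_NAMED_IAM.

```python
def build_create_stack_args(stack_name, template_body):
    """Build the keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # The template names its IAM roles explicitly, which requires this capability.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def create_stack(stack_name, template_body, region="us-west-2"):
    """Create the stack and wait until CloudFormation reports completion."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    cfn = boto3.client("cloudformation", region_name=region)
    cfn.create_stack(**build_create_stack_args(stack_name, template_body))
    # Creating a Kendra index is slow (the template header mentions about 30 minutes),
    # so poll once a minute for up to 90 minutes.
    cfn.get_waiter("stack_create_complete").wait(
        StackName=stack_name,
        WaiterConfig={"Delay": 60, "MaxAttempts": 90},
    )
```

Calling `create_stack("kendra-docs-demo", open("template.yaml").read())` would launch it; both the stack name and the file name here are hypothetical.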

Configuring access to Bedrock models

To use a particular model from the Amazon Bedrock console, you first have to request access to it.

Click [Get started] > [Request model access].

The "Base models" list appears; click [Manage Models], then select "Anthropic".
After a short while, access to the selected model is granted.
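
As a rough cross-check from code, the sketch below lists the Anthropic model IDs that Bedrock offers in the region. The helper names are my own. Note that this only shows what the region offers; it does not prove that access has been granted, which is only confirmed by an actual invocation succeeding.

```python
def anthropic_model_ids(response):
    """Pick the Anthropic model IDs out of a list_foundation_models response."""
    return sorted(
        summary["modelId"]
        for summary in response.get("modelSummaries", [])
        if summary.get("providerName", "").lower() == "anthropic"
    )


def list_anthropic_models(region="us-west-2"):
    """Query Bedrock's control-plane API for the models offered in the region."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    bedrock = boto3.client("bedrock", region_name=region)
    return anthropic_model_ids(bedrock.list_foundation_models())
```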

Preparing the API key and CSE ID in GCP

The verification uses an agent that relies on an external application (Google Cloud APIs) to answer.
Log in to your GCP account and prepare the API_KEY and CSE_ID.
GOOGLE_API_KEY: https://console.cloud.google.com/apis/credentials
GOOGLE_CSE_ID: https://programmablesearchengine.google.com/controlpanel/create

Note that the Custom Search API must be enabled.
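
Before wiring the key pair into an agent, it can help to confirm that it works on its own. The sketch below builds a request URL for the Custom Search JSON API; the helper name and the sample query are placeholders of my own.

```python
from urllib.parse import urlencode

CUSTOM_SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cse_id, query, num=3):
    """Build a Custom Search JSON API request URL for the given key, engine ID and query."""
    params = {"key": api_key, "cx": cse_id, "q": query, "num": num}
    return f"{CUSTOM_SEARCH_ENDPOINT}?{urlencode(params)}"
```

Opening the resulting URL (for example with `urllib.request.urlopen`) should return JSON with an `items` list when the key, the engine ID, and the API enablement are all in order.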

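Preparing the SageMaker notebook & changing the IAM role permissions

Create an instance from the SageMaker notebook console.
Also create a new IAM role; its permissions are changed later.

Once the IAM role has been created, set the instance name and create the instance.

Next, change the permissions of the IAM role you just created,
because the verification needs access to Bedrock and Kendra.

Policies already attached when the IAM role is created
- AmazonSageMaker-ExecutionPolicy-XXXXXXXXXXXXXX
- AmazonSageMakerFullAccess
Policies to add
- AmazonBedrockFullAccess
- AmazonKendraFullAccess

Set the trust policy as follows.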

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "sagemaker.amazonaws.com",
                    "kendra.amazonaws.com",
                    "bedrock.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Update the trust policy as well, referring to "Setting up SageMaker roles".
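
The same role changes can be scripted. The following is a minimal sketch with boto3, assuming the role name is known; `update_role` and `build_trust_policy` are names of my own, while the two managed policy ARNs correspond to the policies listed above.

```python
import json

ADDED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonBedrockFullAccess",
    "arn:aws:iam::aws:policy/AmazonKendraFullAccess",
]
TRUSTED_SERVICES = [
    "sagemaker.amazonaws.com",
    "kendra.amazonaws.com",
    "bedrock.amazonaws.com",
]


def build_trust_policy(services):
    """Build a trust policy document that lets the given services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def update_role(role_name):
    """Attach the two managed policies and replace the role's trust policy."""
    import boto3  # imported lazily so the helpers above work without boto3 installed

    iam = boto3.client("iam")
    for arn in ADDED_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    iam.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(build_trust_policy(TRUSTED_SERVICES)),
    )
```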

Open the notebook.

Uploading the notebook

Save the following notebook and upload it.
The file extension is ipynb.

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dfec2db5-c61a-40f4-9a65-6de2621cc390",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip install langchain==0.0.335"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0a0ceaf5-3899-42ed-b223-a7f9c58da83b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# AWS Bedrock only (Claude)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92312d13-5cb0-4962-8d1e-16367151ec5c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.llms import Bedrock\n",
    "from langchain.chains import ConversationChain\n",
    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
    "\n",
    "llmst = Bedrock(\n",
    "    model_id = \"anthropic.claude-v2\",\n",
    "    streaming=True,\n",
    "    callbacks=[StreamingStdOutCallbackHandler()],\n",
    ")\n",
    "\n",
    "conversation = ConversationChain(\n",
    "    llm=llmst, verbose=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "deced5af-5684-4921-8624-3bf8d368ad7b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "conversation.predict(input=\"Amazon Lexにアクセスするためにクライアントが満たすべきTLSの要件を日本語で教えてください\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4593ef2-3cf2-4f8d-8cd6-f961de82d80c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Integration with AWS Kendra"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "f16b04d0-496b-478f-85b2-bbe9d0d8490a",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import boto3\n",
    "from langchain.retrievers import AmazonKendraRetriever\n",
    "from langchain.chains import RetrievalQA\n",
    "from langchain.llms import Bedrock\n",
    "\n",
    "language_code = \"ja\"\n",
    "retriever = AmazonKendraRetriever(\n",
    "    index_id=\"YOUR_KENDRA_INDEX_ID\",\n",
    "    attribute_filter={\n",
    "      \"EqualsTo\": {\n",
    "        \"Key\": \"_language_code\",\n",
    "        \"Value\": {\"StringValue\": language_code},\n",
    "      }\n",
    "    }\n",
    "  )\n",
    "  \n",
    "qa = RetrievalQA.from_chain_type(\n",
    "    llm = Bedrock(model_id = \"anthropic.claude-v2\"),\n",
    "    chain_type=\"stuff\",\n",
    "    retriever=retriever,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "df4892d9-1ffe-4e74-b7e6-1b53951ad26f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "qa.run(\"Amazon Lexにアクセスするためにクライアントが満たすべきTLSの要件を日本語で教えてください\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "05869d8e-5db6-4877-a61e-0b9db185f3d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Agents for Amazon Bedrock (Google Cloud APIs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "c01de778-5107-4ab7-9a5d-b6dd1f1a1d6f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip install \"google-api-python-client>=2.100.0\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "d2fb223f-3bf7-48c2-b692-ef569686507c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import os\n",
    "from langchain.agents import Tool\n",
    "from langchain.utilities import GoogleSearchAPIWrapper\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "search = GoogleSearchAPIWrapper( google_api_key = \"YOUR_API_KEY\", google_cse_id=\"YOUR_CSE_ID\")\n",
    "\n",
    "tools = [\n",
    "    Tool(\n",
    "        name = \"現在の検索\",\n",
    "        func=search.run,\n",
    "        description=\"現在の出来事や世界の現状に関する質問に答える必要がある場合に役立ちます\"\n",
    "    )\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "936be6b3-11b8-4a95-9d1a-edb26e7d1509",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.llms import Bedrock\n",
    "from langchain.agents import initialize_agent\n",
    "from langchain.agents import AgentType\n",
    "\n",
    "\n",
    "llm = Bedrock(model_id=\"anthropic.claude-v2\", model_kwargs={\"temperature\":0.1, \"max_tokens_to_sample\": 20000 })\n",
    "\n",
    "agent_chain = initialize_agent(\n",
    "    tools,\n",
    "    llm,\n",
    "    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
    "    verbose=True,\n",
    "    handle_parsing_errors=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "645d010f-8a31-4790-8a43-cedfe242c747",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "agent_chain.run(input=\"Amazon Lexにアクセスするためにクライアントが満たすべきTLSの要件を日本語で教えてください\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
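
The Kendra cell in the notebook contains the placeholder YOUR_KENDRA_INDEX_ID. The stack exposes the real value as the KendraIndexID output, so one way to fetch it is the sketch below; the helper names and the stack name passed in are assumptions of my own.

```python
def output_value(describe_stacks_response, key):
    """Return the value of one stack output from a describe_stacks response."""
    outputs = describe_stacks_response["Stacks"][0].get("Outputs", [])
    for output in outputs:
        if output["OutputKey"] == key:
            return output["OutputValue"]
    raise KeyError(key)


def kendra_index_id(stack_name, region="us-west-2"):
    """Look up the KendraIndexID output of the stack created from the template."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    cfn = boto3.client("cloudformation", region_name=region)
    return output_value(cfn.describe_stacks(StackName=stack_name), "KendraIndexID")
```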

Trying out the hallucination verification

What was tested

AWS Bedrock only
- Using Claude through LangChain
Integration with AWS Kendra
Agents for Amazon Bedrock (Google Cloud APIs)

The kernel is conda_pytorch_p310.
LangChain in the notebook is used to connect to the LLM and run queries.
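
To make it clearer what LangChain does on the "Bedrock only" path, here is a minimal sketch of calling Claude v2 directly through boto3 with the text-completion request format. The function names and the default parameter values are my own choices, not something taken from the notebook.

```python
import json


def build_claude_v2_body(question, max_tokens=500, temperature=0.1):
    """Build the JSON request body for Claude v2's text-completion format."""
    return json.dumps(
        {
            # Claude v2 expects the Human/Assistant turn markers in the prompt.
            "prompt": f"\n\nHuman: {question}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
            "temperature": temperature,
        }
    )


def ask_claude(question, region="us-west-2"):
    """Send one question to Claude v2 on Bedrock and return the completion text."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    runtime = boto3.client("bedrock-runtime", region_name=region)
    response = runtime.invoke_model(
        modelId="anthropic.claude-v2",
        body=build_claude_v2_body(question),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["completion"]
```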

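This time I check whether hallucination occurs for the following question.
Question: Amazon Lexにアクセスするためにクライアントが満たすべきTLSの要件を日本語で教えてください (Please tell me, in Japanese, the TLS requirements a client must meet to access Amazon Lex)
Expected answer: Clients must support TLS (Transport Layer Security) 1.0

Reference: Infrastructure security in Amazon Lex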
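
When comparing several answers against the expected one, a small helper can pull out the TLS versions each answer mentions. This is only a crude keyword check of my own devising, not the method used in this article: an answer such as "TLS 1.0 is not supported" would still pass, so the text has to be read by a person in the end.

```python
import re

# Matches "TLS 1.2", "TLS1.0" and "TLS(Transport Layer Security)1.0".
_TLS_VERSION = re.compile(r"TLS\s*(?:\([^)]*\))?\s*(\d\.\d)")


def tls_versions(answer):
    """Return every TLS version number mentioned in the answer, in order."""
    return _TLS_VERSION.findall(answer)


def mentions_expected_version(answer, expected="1.0"):
    """Report whether the answer mentions the expected TLS version at all."""
    return expected in tls_versions(answer)
```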

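AWS Bedrock only (Claude)

I use Claude, one of the models selectable in Bedrock, to check whether hallucination occurs.

 Clients accessing Amazon Lex must meet the following TLS requirements:

- Use TLS version 1.2 or later
- Use a cipher suite that supports Perfect Forward Secrecy, such as ECDHE-ECDSA-AES128-GCM-SHA256
<omitted>

Hallucination occurred: the answer requires TLS 1.2 or later, whereas the expected answer is TLS 1.0.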

Integration with AWS Kendra

The AWS documentation prepared in advance in Kendra is passed to the LLM for RAG.
The official documentation for Amazon Lex, Amazon Kendra, and Amazon SageMaker is set as the data source.
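
The notebook uses the "stuff" chain type, which places all retrieved passages into a single prompt. The sketch below illustrates that idea in plain Python; the prompt wording and the function name are my own and are not LangChain's built-in template.

```python
def build_stuff_prompt(question, passages):
    """Concatenate the retrieved passages into one prompt, followed by the question."""
    context = "\n\n".join(passages)
    return (
        "Use only the context below to answer the question. "
        "If the context does not contain the answer, say that you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Because the model is given the relevant documentation text alongside the question, it can answer from that text rather than from memory alone, which is what this pattern relies on.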

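' To access Amazon Lex, clients must support at least TLS 1.0. TLS 1.2 or later is recommended. Clients must also support cipher suites with PFS, such as DHE or ECDHE.'

The answer is correct (no hallucination occurred).
It also includes related information about cipher suites, which is correct as well.
When the search target is fixed and you need correct answers, this seems to be the best fit among the three patterns compared here.

Aside: when asked about something not covered in the documentation, it replies that it has no information, as shown below.

'What is 鎌倉ぽっぽ?'

' I'm sorry, but I do not have enough information about "鎌倉ぽっぽ". Since I do not have enough information to answer, I cannot answer the question "What is 鎌倉ぽっぽ?". I am not sure what the question means, so I would appreciate it if you could give me more information.'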

Agents for Amazon Bedrock (Google Cloud APIs)

An agent is used to produce the answer.
Google Cloud APIs serve as the external application.
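
The notebook's agent type (ZERO_SHOT_REACT_DESCRIPTION) alternates between model output and tool calls. The toy loop below sketches that idea with a scripted stand-in for the model and a caller-supplied search function; every name in it is hypothetical, and it is far simpler than what LangChain actually does.

```python
import re

ACTION_PATTERN = re.compile(r"Action: (?P<tool>.+)\nAction Input: (?P<arg>.+)")
FINAL_PATTERN = re.compile(r"Final Answer: (?P<answer>.+)", re.DOTALL)


def run_react(question, llm, tools, max_steps=5):
    """Alternate between model replies and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        final = FINAL_PATTERN.search(reply)
        if final:
            return final.group("answer").strip()
        action = ACTION_PATTERN.search(reply)
        if action is None:
            continue  # unparseable reply: let the model try again
        tool = tools[action.group("tool").strip()]
        observation = tool(action.group("arg").strip())
        # The tool result is fed back to the model as plain text.
        transcript += f"Observation: {observation}\n"
    return None


def scripted_llm(transcript):
    """Stand-in for Claude: searches once, then answers from the observation."""
    if "Observation:" not in transcript:
        return "Thought: I should search.\nAction: search\nAction Input: Amazon Lex TLS"
    return "Thought: I have what I need.\nFinal Answer: See the observation above."
```

The final answer is whatever the model writes after reading the observations, so its accuracy depends both on what the search returns and on how the model interprets it.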

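Amazon Lex requires TLS 1.2 or later for communication with clients. The Lex console and APIs use HTTPS, and the server certificates are issued by Amazon. Verification of client certificates is optional but recommended. The cipher suites comply with PCI DSS.  Clients must also meet these requirements.

Hallucination occurred.
This shows that referring to information on the internet does not guarantee an accurate answer.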

[References]

What is LangChain?
