{
"cells": [
{
"cell_type": "markdown",
"id": "60bb467d-861d-4b07-a48d-8e5aa177c969",
"metadata": {},
"source": [
"# Semi-structured RAG\n",
"\n",
"Let's evaluate your architecture on a small semi-structured Q&A dataset: question-answer pairs over PDFs that contain tables."
]
},
{
"cell_type": "markdown",
"id": "f49db759-7ce6-4ab7-a58f-7fc3a6a7c8ec",
"metadata": {},
"source": [
"## Pre-requisites\n",
"\n",
"We will install quite a few prerequisites for this example, since we are comparing various techniques and models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9f44b59b",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain langsmith langchainhub langchain_benchmarks langchain_experimental\n",
"%pip install --quiet chromadb openai huggingface pandas \"unstructured[all-docs]\""
]
},
{
"cell_type": "markdown",
"id": "0aae13f6-cd40-41e6-bd02-bd683e91cbff",
"metadata": {},
"source": [
"For this code to work, please configure LangSmith environment variables with your credentials."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "62b518cf-99fb-44be-8acb-ee0a8ba62272",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LANGCHAIN_ENDPOINT\"] = \"https://api.smith.langchain.com\"\n",
"os.environ[\"LANGCHAIN_API_KEY\"] = \"sk-...\" # Your API key\n",
"\n",
"# Silence warnings from HuggingFace\n",
"os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\""
]
},
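{
"cell_type": "markdown",
"id": "0c3b4c1e-5a52-4c2e-9f3a-1d2e3f4a5b6c",
"metadata": {},
"source": [
"The chains below also call Anthropic (`claude-2`) and OpenAI (`gpt-3.5-turbo-16k`), so those provider credentials need to be set as well. A minimal sketch, assuming you keep your keys in environment variables (the placeholder values are hypothetical):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e8f9a0b-1c2d-4e3f-8a9b-0c1d2e3f4a5b",
"metadata": {},
"outputs": [],
"source": [
"# Set the model-provider keys used later in this notebook (hypothetical placeholders).\n",
"os.environ[\"ANTHROPIC_API_KEY\"] = \"sk-...\"  # used by ChatAnthropic below\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-...\"  # used by ChatOpenAI below"
]
},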
{
"cell_type": "markdown",
"id": "2e8a666d-8bf5-4bfd-8b20-8b7defdb8cd5",
"metadata": {},
"source": [
"## Review Q&A Tasks\n",
"\n",
"The registry provides configurations to test out common architectures on curated datasets."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b39159d0-9ea1-414f-a9d8-4a7b22b3d2cc",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_benchmarks import clone_public_dataset, registry"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3644d211-382e-41aa-b282-21b01d28fc35",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<thead>\n",
"<tr><th>Name</th><th>Type</th><th>Dataset ID</th><th>Description</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>LangChain Docs Q&A</td><td>RetrievalTask</td><td>452ccafc-18e1-4314-885b-edd735f17b9d</td><td>Questions and answers based on a snapshot of the LangChain python docs.\n",
"\n",
"The environment provides the documents and the retriever information.\n",
"\n",
"Each example is composed of a question and reference answer.\n",
"\n",
"Success is measured based on the accuracy of the answer relative to the reference answer.\n",
"We also measure the faithfulness of the model's response relative to the retrieved documents (if any).</td></tr>\n",
"<tr><td>Semi-structured Reports</td><td>RetrievalTask</td><td>c47d9617-ab99-4d6e-a6e6-92b8daf85a7d</td><td>Questions and answers based on PDFs containing tables and charts.\n",
"\n",
"The task provides the raw documents as well as factory methods to easily index them\n",
"and create a retriever.\n",
"\n",
"Each example is composed of a question and reference answer.\n",
"\n",
"Success is measured based on the accuracy of the answer relative to the reference answer.\n",
"We also measure the faithfulness of the model's response relative to the retrieved documents (if any).</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
"Registry(tasks=[RetrievalTask(name='LangChain Docs Q&A', dataset_id='https://smith.langchain.com/public/452ccafc-18e1-4314-885b-edd735f17b9d/d', description=\"Questions and answers based on a snapshot of the LangChain python docs.\\n\\nThe environment provides the documents and the retriever information.\\n\\nEach example is composed of a question and reference answer.\\n\\nSuccess is measured based on the accuracy of the answer relative to the reference answer.\\nWe also measure the faithfulness of the model's response relative to the retrieved documents (if any).\\n\", retriever_factories={'basic': <function ...>, 'parent-doc': <function ...>, 'hyde': <function ...>}, architecture_factories={'conversational-retrieval-qa': <function ...>}, get_docs=<function ...>), RetrievalTask(name='Semi-structured Reports', dataset_id='https://smith.langchain.com/public/c47d9617-ab99-4d6e-a6e6-92b8daf85a7d/d', description=\"Questions and answers based on PDFs containing tables and charts.\\n\\nThe task provides the raw documents as well as factory methods to easily index them\\nand create a retriever.\\n\\nEach example is composed of a question and reference answer.\\n\\nSuccess is measured based on the accuracy of the answer relative to the reference answer.\\nWe also measure the faithfulness of the model's response relative to the retrieved documents (if any).\\n\", retriever_factories={'basic': <function ...>, 'parent-doc': <function ...>, 'hyde': <function ...>}, architecture_factories={}, get_docs=<function ...>)])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"registry = registry.filter(Type=\"RetrievalTask\")\n",
"registry"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "671282f8-c455-4390-b018-e53bbd833093",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<tbody>\n",
"<tr><td>Name</td><td>Semi-structured Reports</td></tr>\n",
"<tr><td>Type</td><td>RetrievalTask</td></tr>\n",
"<tr><td>Dataset ID</td><td>c47d9617-ab99-4d6e-a6e6-92b8daf85a7d</td></tr>\n",
"<tr><td>Description</td><td>Questions and answers based on PDFs containing tables and charts.\n",
"\n",
"The task provides the raw documents as well as factory methods to easily index them\n",
"and create a retriever.\n",
"\n",
"Each example is composed of a question and reference answer.\n",
"\n",
"Success is measured based on the accuracy of the answer relative to the reference answer.\n",
"We also measure the faithfulness of the model's response relative to the retrieved documents (if any).</td></tr>\n",
"<tr><td>Retriever Factories</td><td>basic, parent-doc, hyde</td></tr>\n",
"<tr><td>Architecture Factories</td><td></td></tr>\n",
"<tr><td>get_docs</td><td>...</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
"RetrievalTask(name='Semi-structured Reports', dataset_id='https://smith.langchain.com/public/c47d9617-ab99-4d6e-a6e6-92b8daf85a7d/d', description=\"Questions and answers based on PDFs containing tables and charts.\\n\\nThe task provides the raw documents as well as factory methods to easily index them\\nand create a retriever.\\n\\nEach example is composed of a question and reference answer.\\n\\nSuccess is measured based on the accuracy of the answer relative to the reference answer.\\nWe also measure the faithfulness of the model's response relative to the retrieved documents (if any).\\n\", retriever_factories={'basic': <function ...>, 'parent-doc': <function ...>, 'hyde': <function ...>}, architecture_factories={}, get_docs=<function ...>)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"task = registry[\"Semi-structured Reports\"]\n",
"task"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "70369f67-deb4-467a-801a-6d38c3d0460d",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset Semi-structured Reports already exists. Skipping.\n",
"You can access the dataset at https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/datasets/f8f24935-cf57-4cb3-a30f-8df303a46962.\n"
]
}
],
"source": [
"clone_public_dataset(task.dataset_id, dataset_name=task.name)"
]
},
{
"cell_type": "markdown",
"id": "4b4fafb2-63d0-40b4-b803-0095c5b22ca6",
"metadata": {},
"source": [
"### Now, index the documents\n",
"\n",
"You can access the raw file paths directly, or use Unstructured to process the PDFs."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "c9657b27-4e10-4ba5-ab20-1f05f22fdbd4",
"metadata": {},
"outputs": [],
"source": [
"from langchain_benchmarks.rag.tasks.semi_structured_reports import get_file_names\n",
"\n",
"# If you want to completely customize the document processing, you can use the files directly\n",
"file_names = list(get_file_names())"
]
},
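{
"cell_type": "markdown",
"id": "2a1b3c4d-5e6f-4a7b-8c9d-0e1f2a3b4c5d",
"metadata": {},
"source": [
"If you want to see what fully custom processing might look like, here is a minimal sketch that calls unstructured's `partition_pdf` directly on the first file. The kwargs shown are illustrative, not the benchmark's defaults:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6f5e4d3c-2b1a-4098-8765-43210fedcba9",
"metadata": {},
"outputs": [],
"source": [
"# A sketch of custom processing: call unstructured directly on one file.\n",
"# `infer_table_structure=True` asks the layout model to recover table cells.\n",
"from unstructured.partition.pdf import partition_pdf\n",
"\n",
"elements = partition_pdf(\n",
"    filename=str(file_names[0]),\n",
"    infer_table_structure=True,\n",
")\n",
"elements[:3]"
]
},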
{
"cell_type": "code",
"execution_count": 8,
"id": "f5ad4b23-fbd4-4ebc-b5a5-d3d05efd0b9c",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Some weights of the model checkpoint at microsoft/table-transformer-structure-recognition were not used when initializing TableTransformerForObjectDetection: ['model.backbone.conv_encoder.model.layer2.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer3.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer4.0.downsample.1.num_batches_tracked']\n",
"- This IS expected if you are initializing TableTransformerForObjectDetection from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
"- This IS NOT expected if you are initializing TableTransformerForObjectDetection from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "b8a52c9983274c21a713ac8742e9c99b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"  0%|          | 0/26 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from langchain.embeddings import HuggingFaceEmbeddings\n",
"\n",
"embeddings = HuggingFaceEmbeddings(\n",
"    model_name=\"thenlper/gte-base\",\n",
"    model_kwargs={\"device\": 0},  # Comment out to use CPU\n",
")\n",
"\n",
"# Arguments to pass to partition_pdf\n",
"unstructured_config = {\n",
"    # Unstructured first finds embedded image blocks\n",
"    \"extract_images_in_pdf\": False,\n",
"    # Use the layout model (YOLOX) to get bounding boxes (for tables) and find titles\n",
"    # Titles are any sub-section of the document\n",
"    \"infer_table_structure\": True,\n",
"    # Post-process to aggregate text once we have the titles\n",
"    \"chunking_strategy\": \"by_title\",\n",
"    # Chunking params to aggregate text blocks:\n",
"    # cap chunks at 4000 chars, start a new chunk after 3800 chars,\n",
"    # and combine sections smaller than 2000 chars with their neighbors\n",
"    \"max_characters\": 4000,\n",
"    \"new_after_n_chars\": 3800,\n",
"    \"combine_text_under_n_chars\": 2000,\n",
"}\n",
"docs = list(task.get_docs(unstructured_config=unstructured_config))"
]
},
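{
"cell_type": "markdown",
"id": "9a8b7c6d-5e4f-4321-8abc-def012345678",
"metadata": {},
"source": [
"Each parsed chunk records its origin in `metadata[\"element_type\"]` (the summarization step later relies on this). A quick sanity check of the split between tables and text:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "13579bdf-2468-4ace-9bdf-013579bdf246",
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter\n",
"\n",
"# Tally how many chunks came from tables vs. plain text.\n",
"Counter(doc.metadata[\"element_type\"] for doc in docs)"
]
},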
{
"cell_type": "code",
"execution_count": 42,
"id": "fe0a05ad-5b57-40b0-aac4-e2d9cd9e6b4b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Chroma/semi-structured-earnings-b_Chroma_HuggingFaceEmbeddings_raw\n",
"[]\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "06a11e1a4d50416596d9dd953fdabafa",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"  0%|          | 0/26 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"retriever_factory = task.retriever_factories[\"basic\"]\n",
"# Indexes the documents with the specified embeddings\n",
"retriever = retriever_factory(embeddings, docs=docs)"
]
},
{
"cell_type": "markdown",
"id": "57efac89-12f9-47e3-b60f-65d9279ebc1e",
"metadata": {},
"source": [
"### Time to evaluate\n",
"\n",
"We will compose our retriever with a simple response-generation chain built on Anthropic's claude-2."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8d1bc360-d822-43a8-b6b7-ff66dc27caf4",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatAnthropic\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable.passthrough import RunnableAssign\n",
"\n",
"\n",
"def create_chain(retriever):\n",
"    prompt = ChatPromptTemplate.from_messages(\n",
"        [\n",
"            (\n",
"                \"system\",\n",
"                \"Answer based solely on the retrieved documents below:\\n\\n<Documents>\\n{docs}</Documents>\",\n",
"            ),\n",
"            (\"user\", \"{question}\"),\n",
"        ]\n",
"    )\n",
"    llm = ChatAnthropic(model=\"claude-2\")\n",
"    return (\n",
"        RunnableAssign({\"docs\": (lambda x: next(iter(x.values()))) | retriever})\n",
"        | prompt\n",
"        | llm\n",
"        | StrOutputParser()\n",
"    )"
]
},
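{
"cell_type": "markdown",
"id": "fedcba98-7654-4321-8fed-cba987654321",
"metadata": {},
"source": [
"Before running the full benchmark, it can help to smoke-test the chain on a single question. The question below is an illustrative example in the spirit of the dataset; any question about the underlying reports will do:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f1e2d3c-4b5a-4697-8877-665544332211",
"metadata": {},
"outputs": [],
"source": [
"# Smoke-test the chain on a single (illustrative) question before evaluating.\n",
"chain = create_chain(retriever)\n",
"chain.invoke({\"question\": \"What were the operating expenses in Q3 2023?\"})"
]
},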
{
"cell_type": "code",
"execution_count": 10,
"id": "935cacd9-e841-4c76-ac16-f3f0cf18df62",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"View the evaluation results for project 'cold-attachment-88' at:\n",
"https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/projects/p/d8e512b7-b63d-4eb5-8d73-d95f7fa7ffc2?eval=true\n",
"\n",
"View all tests for Dataset Semi-structured Reports at:\n",
"https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/datasets/f8f24935-cf57-4cb3-a30f-8df303a46962\n",
"[------------------------------------------------->] 5/5\n",
" Eval quantiles:\n",
" inputs.question \\\n",
"count 5 \n",
"unique 5 \n",
"top Analyzing the operating expenses for Q3 2023, ... \n",
"freq 1 \n",
"mean NaN \n",
"std NaN \n",
"min NaN \n",
"25% NaN \n",
"50% NaN \n",
"75% NaN \n",
"max NaN \n",
"\n",
" feedback.embedding_cosine_distance feedback.faithfulness \\\n",
"count 5.000000 5.0 \n",
"unique NaN NaN \n",
"top NaN NaN \n",
"freq NaN NaN \n",
"mean 0.137066 1.0 \n",
"std 0.011379 0.0 \n",
"min 0.123112 1.0 \n",
"25% 0.129089 1.0 \n",
"50% 0.137871 1.0 \n",
"75% 0.143398 1.0 \n",
"max 0.151860 1.0 \n",
"\n",
" feedback.score_string:accuracy error execution_time \n",
"count 5.0 0 5.000000 \n",
"unique NaN 0 NaN \n",
"top NaN NaN NaN \n",
"freq NaN NaN NaN \n",
"mean 0.1 NaN 7.940625 \n",
"std 0.0 NaN 1.380190 \n",
"min 0.1 NaN 6.416387 \n",
"25% 0.1 NaN 7.272528 \n",
"50% 0.1 NaN 7.324673 \n",
"75% 0.1 NaN 8.831243 \n",
"max 0.1 NaN 9.858293 \n"
]
}
],
"source": [
"from langsmith.client import Client\n",
"\n",
"from langchain_benchmarks.rag import get_eval_config\n",
"\n",
"client = Client()\n",
"RAG_EVALUATION = get_eval_config()\n",
"chain = create_chain(retriever)\n",
"test_run = client.run_on_dataset(\n",
" dataset_name=task.name,\n",
" llm_or_chain_factory=chain,\n",
" evaluation=RAG_EVALUATION,\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "3b54ef6c-0194-410a-aae9-f30c2097548a",
"metadata": {},
"source": [
"## Example processing the docs\n",
"\n",
"RAG apps are only as good as the information they are able to retrieve. Let's try indexing summaries of the tables to\n",
"improve the likelihood that they are retrieved whenever a user asks a relevant question.\n",
"\n",
"We will use unstructured's `partition_pdf` functionality and generate the summaries using an LLM.\n",
"\n",
"You can define your own indexing pipeline to see how it impacts the downstream performance."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "3378eddb-0a8d-4179-8e9c-54343469eef6",
"metadata": {},
"outputs": [],
"source": [
"from operator import itemgetter\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.document import Document\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable.passthrough import RunnableAssign\n",
"\n",
"# Prompt\n",
"prompt = ChatPromptTemplate.from_messages(\n",
"    [\n",
"        (\n",
"            \"system\",\n",
"            \"You are summarizing semi-structured tables or text in a pdf.\\n\\n```document\\n{doc}\\n```\",\n",
"        ),\n",
"        (\"user\", \"Write a concise summary.\"),\n",
"    ]\n",
")\n",
"\n",
"# Summary chain\n",
"model = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-16k\")\n",
"\n",
"\n",
"def create_doc(x) -> Document:\n",
"    return Document(\n",
"        page_content=x[\"output\"],\n",
"        metadata=x[\"doc\"].metadata,\n",
"    )\n",
"\n",
"\n",
"summarize_chain = (\n",
"    {\"doc\": lambda x: x}\n",
"    | RunnableAssign({\"prompt\": prompt})\n",
"    | {\n",
"        \"output\": itemgetter(\"prompt\") | model | StrOutputParser(),\n",
"        \"doc\": itemgetter(\"doc\"),\n",
"    }\n",
"    | create_doc\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "07a2f070-3b5a-4de0-b3da-ddfb6e6f8c2b",
"metadata": {},
"outputs": [],
"source": [
"summaries = summarize_chain.batch(\n",
"    [doc for doc in docs if doc.metadata[\"element_type\"] == \"table\"]\n",
")"
]
},
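{
"cell_type": "markdown",
"id": "abcdef01-2345-4678-9abc-def012345678",
"metadata": {},
"source": [
"Each summary is a new `Document` that carries over the source table's metadata, so retrieval hits can be traced back to the original element. A quick look at the first one:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11223344-5566-4788-99aa-bbccddeeff00",
"metadata": {},
"outputs": [],
"source": [
"# Inspect the first generated summary and its provenance metadata.\n",
"print(summaries[0].page_content[:500])\n",
"summaries[0].metadata"
]
},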
{
"cell_type": "markdown",
"id": "22dc0bf8-fa50-4be3-8d23-04f6129548e0",
"metadata": {},
"source": [
"Index the documents and create the retriever. We will re-use the same embeddings and the \"basic\" retriever factory, passing a unique transformation name to avoid cache collisions with the index we built earlier."
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "35a1ccf6-2c2f-46f2-838e-5a5bf89515f5",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "029921ed3d7c4f389c666583a7192144",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"  0%|          | 0/36 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Indexes the documents with the specified embeddings\n",
"retriever_with_summaries = retriever_factory(\n",
"    embeddings,\n",
"    docs=docs + summaries,\n",
"    # Specify a unique transformation name to avoid local cache collisions with other indices.\n",
"    transformation_name=\"docs-with_summaries\",\n",
")"
]
},
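{
"cell_type": "markdown",
"id": "99887766-5544-4332-8110-ffeeddccbbaa",
"metadata": {},
"source": [
"Before re-running the benchmark, you can spot-check what the augmented index retrieves. The query below is illustrative:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa11bb22-cc33-4d44-8e55-f66a77b88c99",
"metadata": {},
"outputs": [],
"source": [
"# Retrieve the top documents for an (illustrative) table-oriented query.\n",
"retrieved = retriever_with_summaries.get_relevant_documents(\n",
"    \"What were the operating expenses in Q3 2023?\"\n",
")\n",
"for doc in retrieved[:3]:\n",
"    print(doc.page_content[:200], \"\\n---\")"
]
},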
{
"cell_type": "markdown",
"id": "3821e4b0-8e67-418a-840c-470fcde42df0",
"metadata": {},
"source": [
"### Evaluate\n",
"\n",
"We'll evaluate the new chain on the same dataset."
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "e6f08c4c-a738-4449-9190-5a4f0b65b99a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"View the evaluation results for project 'crazy-harmony-39' at:\n",
"https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/projects/p/b69d796f-6ba4-4cde-822f-db363cf81f8f?eval=true\n",
"\n",
"View all tests for Dataset Semi-structured Reports at:\n",
"https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/datasets/f8f24935-cf57-4cb3-a30f-8df303a46962\n",
"[------------------------------------------------->] 5/5\n",
" Eval quantiles:\n",
" inputs.question \\\n",
"count 5 \n",
"unique 5 \n",
"top Analyzing the operating expenses for Q3 2023, ... \n",
"freq 1 \n",
"mean NaN \n",
"std NaN \n",
"min NaN \n",
"25% NaN \n",
"50% NaN \n",
"75% NaN \n",
"max NaN \n",
"\n",
" feedback.score_string:accuracy feedback.faithfulness \\\n",
"count 5.000000 5.0 \n",
"unique NaN NaN \n",
"top NaN NaN \n",
"freq NaN NaN \n",
"mean 0.720000 1.0 \n",
"std 0.408656 0.0 \n",
"min 0.100000 1.0 \n",
"25% 0.500000 1.0 \n",
"50% 1.000000 1.0 \n",
"75% 1.000000 1.0 \n",
"max 1.000000 1.0 \n",
"\n",
" feedback.embedding_cosine_distance error execution_time \n",
"count 5.000000 0 5.000000 \n",
"unique NaN 0 NaN \n",
"top NaN NaN NaN \n",
"freq NaN NaN NaN \n",
"mean 0.069363 NaN 8.659120 \n",
"std 0.023270 NaN 2.611724 \n",
"min 0.039593 NaN 6.283505 \n",
"25% 0.050176 NaN 6.723136 \n",
"50% 0.078912 NaN 7.441743 \n",
"75% 0.084389 NaN 10.673265 \n",
"max 0.093747 NaN 12.173952 \n"
]
}
],
"source": [
"chain_2 = create_chain(retriever_with_summaries)\n",
"\n",
"test_run_with_summaries = client.run_on_dataset(\n",
" dataset_name=task.name,\n",
" llm_or_chain_factory=chain_2,\n",
" evaluation=RAG_EVALUATION,\n",
" verbose=True,\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}