{ "cells": [ { "cell_type": "markdown", "id": "60bb467d-861d-4b07-a48d-8e5aa177c969", "metadata": { "tags": [] }, "source": [ "# Typewriter: Single Tool\n", "\n", "In this task, an agent is given access to a single tool called \"type_letter\".\n", "This tool takes one argument called \"letter\", which is expected to be a character.\n", "\n", "The agent must repeat the input string from the user, printing one\n", "character at a time on a piece of virtual paper.\n", "\n", "The agent is evaluated based on its ability to print the correct string using\n", "the \"type_letter\" tool.\n", "\n", "--------" ] }, { "cell_type": "code", "execution_count": 1, "id": "b39159d0-9ea1-414f-a9d8-4a7b22b3d2cc", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain_benchmarks import registry" ] }, { "cell_type": "code", "execution_count": 2, "id": "1aef2b32-a5df-421f-8be3-a2ef27372ece", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
Name | Tool Usage - Typewriter (1 tool) |
Type | ToolUsageTask |
Dataset ID | 59577193-8938-4ccf-92a7-e8a96bcf4f86 |
Description | Environment with a single tool that accepts a single letter as input, and prints it on a piece of virtual paper.\n", "\n", "The objective of this task is to evaluate the ability of the model to use the provided tools to repeat a given input string.\n", "\n", "For example, if the string is 'abc', the tools 'a', 'b', and 'c' must be invoked in that order.\n", "\n", "The dataset includes examples of varying difficulty. The difficulty is measured by the length of the string. |