You can use papers from https://openreview.net/ as your database! Here's a helper that fetches a list of all papers from a selected conference (like ICLR, ICML, or NeurIPS), queries that list with an LLM to find relevant papers, and downloads the relevant papers to a local directory that can then be used with paper-qa. Install `openreview-py` and get your username and password from the website. You can put them into a `.env` file under the `OPENREVIEW_USERNAME` and `OPENREVIEW_PASSWORD` variables, or pass them in the code directly.
It's been a while since we've tested this - so let us know if it runs into issues!
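A rough sketch of the setup (this assumes the OpenReview v2 API client; the venue ID below is only an example):

```python
import os

from dotenv import load_dotenv
from openreview.api import OpenReviewClient

load_dotenv()  # pulls OPENREVIEW_USERNAME / OPENREVIEW_PASSWORD from your .env file

client = OpenReviewClient(
    baseurl="https://api2.openreview.net",
    username=os.environ["OPENREVIEW_USERNAME"],
    password=os.environ["OPENREVIEW_PASSWORD"],
)

# fetch every submission for a venue; the venue ID here is just an example
notes = client.get_all_notes(content={"venueid": "ICLR.cc/2025/Conference"})
print(f"Fetched {len(notes)} papers")
```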
If you use Zotero to organize your personal bibliography, you can use `paperqa.contrib.ZoteroDB` to query papers from your library, which relies on `pyzotero`. Install `pyzotero` via the `zotero` extra for this feature:
First, note that PaperQA2 parses the PDFs of papers to store in the database, so all relevant papers should have PDFs stored inside your database. You can get Zotero to automatically do this by highlighting the references you wish to retrieve, right clicking, and selecting "Find Available PDFs". You can also manually drag-and-drop PDFs onto each reference.
To download papers, you need to get an API key for your account.

1. Get your library ID and set it as the environment variable `ZOTERO_USER_ID`.
   - For personal libraries, this ID is given here at the part "Your userID for use in API calls is XXXXXX".
   - For group libraries, go to your group page https://www.zotero.org/groups/groupname, and hover over the settings link. The ID is the integer after /groups/. (h/t pyzotero!)
2. Create a new API key here and set it as the environment variable `ZOTERO_API_KEY`. The key will need read access to the library.
With this, we can download papers from our library and add them to PaperQA2:
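A sketch of what that can look like (this follows the `ZoteroDB` interface described above; `library_type` and the 20-item limit are the knobs to adjust):

```python
from paperqa import Docs
from paperqa.contrib import ZoteroDB

docs = Docs()
zotero = ZoteroDB(library_type="user")  # use "group" for a group library

# walk the first 20 items in the library and add each attached PDF
for item in zotero.iterate(limit=20):
    docs.add(item.pdf, docname=item.key)
```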
which will download the first 20 papers in your Zotero database and add them to the `Docs` object.
We can also do specific queries of our Zotero library and iterate over the results:
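For example (the keyword arguments below follow pyzotero's search parameters and are meant as an illustrative sketch):

```python
for item in zotero.iterate(
    q="large language models",
    qmode="everything",
    sort="date",
    direction="desc",
    limit=100,
):
    print("Adding", item.title)
    docs.add(item.pdf, docname=item.key)
```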
You can read more about the search syntax by typing `zotero.iterate?` in IPython.
If you want to search for papers outside of your own collection, I've found an unrelated project called paper-scraper that looks like it might help. But beware, this project looks like it uses some scraping tools that may violate publishers' rights or be in a gray area of legality.
The LFRQA dataset was introduced in the paper RAG-QA Arena: Evaluating Domain Robustness for Long-Form Retrieval-Augmented Question Answering. It features 1,404 science questions (along with other categories) that have been human-annotated with answers. This tutorial walks through the process of setting up the dataset for use and benchmarking.
First, we need to obtain the annotated dataset from the official repository:
LFRQA is built upon Robust-QA, so we must download the relevant documents:
For more details, refer to the original paper: RAG-QA Arena: Evaluating Domain Robustness for Long-Form Retrieval-Augmented Question Answering.
We now load the documents into a pandas dataframe:
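A minimal sketch of this step; the file path and column names (`id`, `text`) are assumptions, so adjust them to match the files you actually downloaded:

```python
import pandas as pd

# hypothetical path: one JSON record per line with an "id" and a "text" field
docs_df = pd.read_json(
    "data/rag-qa-benchmarking/science_documents.jsonl", lines=True
)
print(f"Loaded {len(docs_df)} documents")
```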
RobustQA consists of 1.7M documents, so building the whole index will take around 3 hours. If you want to run a quick test, you can use a portion of the dataset and only the questions that can be answered from those documents.
We now create the document directory and store each document as a separate text file, so that paperqa can build the index.
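Continuing the sketch above (the commented `head()` call is how you would keep only a small portion for a quick test):

```python
import os

paper_directory = "data/rag-qa-benchmarking/papers"
os.makedirs(paper_directory, exist_ok=True)

# docs_df = docs_df.head(10_000)  # optional: index only a subset for a quick test

# write one plain-text file per document so paperqa can index them
for row in docs_df.itertuples():
    with open(os.path.join(paper_directory, f"{row.id}.txt"), "w") as f:
        f.write(row.text)
```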
The manifest file keeps track of document metadata for the dataset. We need to fill in some fields so that paperqa doesn't try to obtain the metadata with LLM calls, which makes the indexing process faster.
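A sketch of building that manifest as a CSV; the exact column set paperqa expects is documented with its indexing settings, and the columns used here (`file_location`, `doi`, `title`) are our assumption:

```python
manifest = pd.DataFrame(
    {
        "file_location": [f"{doc_id}.txt" for doc_id in docs_df.id],
        "doi": "",  # RobustQA documents have no DOIs
        "title": [str(doc_id) for doc_id in docs_df.id],
    }
)
manifest.to_csv("data/rag-qa-benchmarking/manifest.csv", index=False)
```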
Finally, we load the questions and filter them to ensure we only include questions that reference the selected documents:
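A sketch of the filtering step, again with assumed file and field names (`gold_doc_ids` stands in for whatever field links a question to its source documents):

```python
questions_df = pd.read_json(
    "data/rag-qa-benchmarking/annotations_science_with_citation.jsonl", lines=True
)

# keep a question only if every document it cites is in the subset we wrote out
available_ids = set(docs_df.id)
questions_df = questions_df[
    questions_df.gold_doc_ids.apply(lambda ids: all(i in available_ids for i in ids))
]
print(f"Kept {len(questions_df)} questions")
```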
From now on, we will be using the paperqa library, so we need to install it:
Copy the following to a file and run it. Feel free to adjust the concurrency as you like. You don't need any API keys to build this index, because no citation metadata is inferred (it comes from the manifest), but you do need LLM API keys to answer questions.
Remember that this process is quick for small portions of the dataset, but can take around 3 hours for the whole dataset.
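A sketch of such a script, reusing the hypothetical paths from above; `get_directory_index` and the `IndexSettings` fields follow paperqa's indexing API, but double-check the names against the version you have installed:

```python
import asyncio

from paperqa import Settings, ask
from paperqa.agents.search import get_directory_index
from paperqa.settings import AgentSettings, IndexSettings

settings = Settings(
    paper_directory="data/rag-qa-benchmarking/papers",
    agent=AgentSettings(
        index=IndexSettings(
            name="lfrqa_science_index",
            manifest_file="data/rag-qa-benchmarking/manifest.csv",
            concurrency=10,  # adjust to your machine
        )
    ),
)


async def build_index() -> None:
    # builds (or re-syncs) the full-text index over the paper directory
    await get_directory_index(settings=settings)


if __name__ == "__main__":
    asyncio.run(build_index())
    # a quick sanity-check question; this is the part that needs LLM API keys
    print(ask("How do apple trees grow?", settings=settings).session.answer)
```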
After this runs, you will get an answer!
After you have built the index, you are ready to run the benchmark. Copy the following into a file and run it. To run this, you will need to have the `ldp` and `fhaviary[lfrqa]` packages installed.
After running this, you can find the results in the `data/rag-qa-benchmarking/results` folder. Here is an example of how to read them:
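For instance, assuming the benchmark wrote one JSON file per question (the layout is an assumption; adapt it to whatever your run produced):

```python
import json
from pathlib import Path

results_dir = Path("data/rag-qa-benchmarking/results")
results = [json.loads(p.read_text()) for p in results_dir.glob("*.json")]
print(f"Loaded {len(results)} results")
```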
PaperQA2 now natively supports querying clinical trials in addition to any documents supplied by the user, via a new tool aptly named `clinical_trials_search`. Users don't have to provide any clinical trials themselves; the tool uses the clinicaltrials.gov API to retrieve them on the fly. As of January 2025, the tool is not enabled by default, but it's easy to configure. Here's an example where we query only clinical trials, without using any documents:
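A sketch using the named settings bundle discussed just below (the question is only a placeholder):

```python
from paperqa import Settings, ask

answer_response = ask(
    "What drugs have been found to effectively treat ulcerative colitis?",
    settings=Settings.from_name("search_only_clinical_trials"),
)
print(answer_response.session.answer)
```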
You can see the in-line citations for each clinical trial used in the response. If you'd like to see more data on the specific contexts that were used to answer the query:
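For example, each context carries the supporting snippet and a relevance score:

```python
# inspect the evidence behind the answer
for context in answer_response.session.contexts:
    print(context.score, context.context)
```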
Using `Settings.from_name('search_only_clinical_trials')` is a shortcut, but note that you can easily add `clinical_trials_search` into any custom `Settings` by just explicitly naming it as a tool:
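For example (the other tool names listed alongside it are paperqa's standard paper tools; treat the exact set as an assumption and check `AgentSettings` for the current defaults):

```python
from paperqa import Settings, ask
from paperqa.settings import AgentSettings

answer_response = ask(
    "What drugs have been found to effectively treat ulcerative colitis?",
    settings=Settings(
        agent=AgentSettings(
            tool_names={
                "clinical_trials_search",
                "paper_search",
                "gather_evidence",
                "gen_answer",
                "complete",
            }
        )
    ),
)
```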
We now see both papers and clinical trials cited in our response. For convenience, we have a `Settings.from_name` that works as well:
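Something like the following; the bundle name here is our assumption, so check the names shipped in paperqa's configs directory:

```python
from paperqa import Settings, ask

answer_response = ask(
    "What drugs have been found to effectively treat ulcerative colitis?",
    settings=Settings.from_name("clinical_trials"),  # assumed name of the combined bundle
)
```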
And this works with the `pqa` CLI as well:
This tutorial is available as a Jupyter notebook here.
This tutorial aims to show how to use the `Settings` class to configure PaperQA. Firstly, we will be using OpenAI and Anthropic models, so we need to set the `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` environment variables. We will use both providers to make it clear when the paperqa agent is using one or the other. We use `python-dotenv` to load the environment variables from a `.env` file. Hence, our first step is to create a `.env` file and install the required packages.
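A sketch of that first step, assuming the two keys live in a local `.env` file:

```python
from dotenv import load_dotenv

# expects a .env file next to the notebook containing:
# OPENAI_API_KEY=...
# ANTHROPIC_API_KEY=...
load_dotenv()
```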
We will use the `lmi` package to get the model names and the `.papers` directory to save the documents we will use.
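A sketch of that setup; `CommonLLMNames` and its members are assumptions about the `lmi` package, so substitute plain model name strings if they differ in your version:

```python
import os

from lmi import CommonLLMNames  # assumed helper exposing common model names

llm_openai = CommonLLMNames.OPENAI_TEST.value        # assumed small OpenAI model
llm_anthropic = CommonLLMNames.ANTHROPIC_TEST.value  # assumed small Anthropic model

os.makedirs(".papers", exist_ok=True)  # directory where we will keep the documents
```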
The `Settings` class is used to configure the PaperQA settings. Official documentation can be found here, and the open-source code can be found here.
Here is a basic example of how to use the `Settings` class. We will be unnecessarily verbose for the sake of clarity. Please note that most of the settings are optional and the defaults are good for most cases. Refer to the description of each setting for more information.
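A pared-down sketch of such a configuration (the original notebook example is more verbose), reusing the model names defined above:

```python
from paperqa import Settings
from paperqa.settings import AgentSettings

settings = Settings(
    llm=llm_openai,                      # main LLM
    summary_llm=llm_openai,              # LLM used to summarize evidence
    embedding="text-embedding-3-small",  # embedding model (the OpenAI default)
    temperature=0.5,
    paper_directory=".papers",           # where paperqa looks for documents
    agent=AgentSettings(agent_llm=llm_openai),  # LLM used by the agent
    verbosity=1,
)
```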
Within this `Settings` object, I'd like to discuss specifically how the LLMs are configured and how paperqa looks for papers.
A common source of confusion is that multiple LLMs are used in paperqa. We have `llm`, `summary_llm`, `agent_llm`, and `embedding`. Hence, if `llm` is set to an Anthropic model, `summary_llm` and `agent_llm` will still require an `OPENAI_API_KEY`, since OpenAI models are the default.
Among the objects that use LLMs in paperqa, we have `llm`, `summary_llm`, `agent_llm`, and `embedding`:

- `llm`: Main LLM used by the agent to reason about the question, extract metadata from documents, etc.
- `summary_llm`: LLM used to summarize the papers.
- `agent_llm`: LLM used to answer questions and select tools.
- `embedding`: Embedding model used to embed the papers.
Let's see some examples around this concept. First, we define the settings with `llm` set to an OpenAI model. Please notice this is not a complete list of settings, but take your time to read through the `Settings` class and all the customization that can be done.
As is evident, paperqa is highly customizable. We reiterate that despite this fine-grained customization, the defaults are good for most cases; still, you are welcome to explore the settings and customize paperqa to your needs.
We also set settings.verbosity to 1, which will print the agent configuration. Feel free to set it to 0 to silence the logging after your first run.
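A sketch of the run itself; the question is a placeholder, so ask about whatever you placed in `.papers`:

```python
from paperqa import ask

response = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",  # placeholder
    settings=settings,
)
```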
That probably worked fine. Let's now try removing `OPENAI_API_KEY` and running the same question with the same settings.
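For example, dropping the key from the environment and re-asking:

```python
import os

# remove the key only to demonstrate the failure mode
os.environ.pop("OPENAI_API_KEY", None)

response = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=settings,
)  # expected to raise an authentication error
```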
It obviously fails. We don't have a valid `OPENAI_API_KEY`, so the agent will not be able to use OpenAI models. Let's change it to an Anthropic model and see if it works.
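A sketch of the switched configuration; the Anthropic model name comes from the `lmi` sketch above, and the embedding string is our assumption for a local (non-OpenAI) sentence-transformers model:

```python
settings = Settings(
    llm=llm_anthropic,                         # main LLM: an Anthropic model
    summary_llm=llm_anthropic,
    embedding="st-multi-qa-MiniLM-L6-cos-v1",  # assumed local sentence-transformers embedding
    paper_directory=".papers",
    agent=AgentSettings(agent_llm=llm_anthropic),
    verbosity=1,
)

response = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=settings,
)
```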
Now the agent is able to use only Anthropic models, and although we don't have a valid `OPENAI_API_KEY`, the question is answered because the agent never calls OpenAI models. Notice that we also changed the `embedding`, because it was `text-embedding-3-small` by default, which is an OpenAI model. Paperqa implements a few embedding models; please refer to the documentation for more information.
Notice the `settings.agent.paper_directory` and `settings.agent.index` settings: paperqa actually uses the settings from `settings.agent`. However, for convenience, aliases are implemented at `settings.paper_directory` and `settings.index_directory`.
Paperqa returns a `PQASession` object, which contains not only the answer but also all the information gathered to answer the question. We recommend printing the `PQASession` object (`print(response.session)`) to understand the information it contains. Let's check the `PQASession` object:
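```python
# the session bundles the answer, the formatted references, and the contexts behind them
print(response.session)
```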
In addition to the answer, the `PQASession` object contains all the references and contexts used to generate the answer. Because paperqa splits the documents into chunks, each chunk is a valid reference. You can see that it also references the page where the context was found.
Lastly, `response.session.contexts` contains the contexts used to generate the answer. Each context has a score, which is the similarity between the question and the context. Paperqa uses this score to choose which contexts are most relevant for answering the question.
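A quick way to inspect them (the `text.name` attribute is our assumption about the context objects; `score` and `context` are described above):

```python
for context in response.session.contexts:
    print(f"{context.text.name} (score={context.score}): {context.context[:120]}...")
```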