Conversation
feat: v0 tab completion
Hi @wjiayis, thanks for the update. I've created a new issue to address the token expiration problem. Regarding the latency issue, do we have visibility into which part of the pipeline contributes most to the high latency? For example, a rough breakdown across the main stages?

I'll take a look at this PR later this evening as well.
@Junyi-99 I haven't gotten to the latency breakdown yet, but I've settled everything else and I'm gonna work on this next. Thanks for helping to review when convenient, I'll update my findings when I have them too!
@wjiayis Got it, thanks for the update. Looking forward to your findings. |
### Root Cause

There's a ~20s latency in the inline-suggestion loop, and >99% of the latency comes from waiting for the LLM to start responding. This issue arises because I'm passing in a large (but realistic) bibliography (the bibliography of …).

### Solution

I think it's reasonable to expect that a regular user's max latency tolerance is ~2s. I'll implement the following 3 solutions to achieve that.

### Model Selection
### Prompt Caching

Since the bibliography remains generally constant and takes up the bulk of the prompt, I'll use OpenAI's prompt caching, which is advertised to reduce latency by up to 80% and input token costs by up to 90%.
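For the cache to actually hit, the large, stable bibliography needs to sit at the start of the prompt so the shared prefix matches across requests, with only the short, changing sentence at the end. A minimal sketch with the OpenAI Node SDK; the backend here is Go, so this only illustrates the message ordering, and the model name and message wording are placeholders rather than what this PR uses:

```ts
import OpenAI from "openai";

const openai = new OpenAI();

// Sketch: keep the large, rarely-changing bibliography at the front of the
// prompt so OpenAI's automatic prompt caching can reuse the shared prefix;
// only the user's current sentence varies between requests.
async function suggestCitationKeys(bibliography: string, sentence: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model for illustration
    messages: [
      { role: "system", content: `You suggest BibTeX citation keys.\n\nBibliography:\n${bibliography}` },
      { role: "user", content: `Suggest up to 3 citation keys for: ${sentence}` },
    ],
  });
  return completion.choices[0]?.message?.content ?? "";
}
```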
### Prompt Refinement

I'll remove info-sparse fields (e.g. …) from the bibliography entries.

cc: @Junyi-99
With regards to latency, just changing the model fixed the bulk of the issue. I still implemented prompt caching and prompt refinement. I noticed that when using OpenRouter, the first few auto-completions (more than one) have 6-7s latency, and subsequent auto-completions have 1-2s latency, as a result of prompt caching. I feel that this is good enough UX, and a first "no-reply" LLM query is probably not necessary, as it complicates the solution. Nevertheless, feel free to let me know if you prefer it being implemented.
### 1. Local Storage Not Updated

Access tokens and refresh tokens are only saved to local storage on login; subsequent token refreshes do not update local storage.

**Example:** A user refreshes to a new token and exits Overleaf; the next time he re-opens PaperDebugger, it uses the old refresh token.

**Proposed solution:** Update authStore whenever tokens are set.

### 2. Race Conditions When Refreshing

PaperDebugger often calls multiple endpoints at the same time, which results in a race condition if the token needs to be refreshed.

**Example:** `v2/chats/models` and `v2/chats/conversations` are called at the same time and the access token needs refreshing, so the refresh endpoint is called twice. On some occasions, the frontend uses the 2nd refresh token received, which differs from the one stored in the backend. This can easily be reproduced by setting the JWT expiration in the backend to a very short time.

**Proposed solution:** Use a promise for `refresh()`.

Unsure if this fixes the exact problem in #110
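On the promise-based fix for issue 2: a common pattern is to share a single in-flight refresh promise across concurrent callers, so the refresh endpoint is hit at most once at a time. A minimal sketch, assuming a hypothetical refresh endpoint and token shape rather than PaperDebugger's actual auth API:

```ts
// Hypothetical helper: concurrent callers all await the same in-flight refresh,
// so only one request to the refresh endpoint is made at a time.
let refreshInFlight: Promise<string> | null = null;

async function refreshAccessToken(refreshToken: string): Promise<string> {
  if (!refreshInFlight) {
    refreshInFlight = fetch("/auth/refresh", { // placeholder endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ refreshToken }),
    })
      .then(async (res) => {
        if (!res.ok) throw new Error(`refresh failed: ${res.status}`);
        const { accessToken, refreshToken: newRefreshToken } = await res.json();
        // Persist both tokens so local storage stays in sync (issue 1 above).
        localStorage.setItem("accessToken", accessToken);
        localStorage.setItem("refreshToken", newRefreshToken);
        return accessToken as string;
      })
      .finally(() => {
        // Allow a future refresh once this one settles (success or failure).
        refreshInFlight = null;
      });
  }
  return refreshInFlight;
}
```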
Pull request overview
Implements a first version of citation-aware tab completion by detecting `\cite{` in the editor, fetching suggested BibTeX keys from a new backend endpoint, and suppressing Overleaf's default autocomplete/tab behavior to allow accepting inline suggestions.
Changes:
- Frontend: Adds a “citation suggestions” beta setting, detects `\cite{` triggers, fetches citation keys, and intercepts Tab to accept inline suggestions.
- Backend: Adds a `GetCitationKeys` RPC/HTTP endpoint, extracts and token-reduces `.bib` content, and queries an LLM for up to ~3 relevant citation keys.
- Plumbing: Updates generated proto clients/servers and ignores `CLAUDE.md`.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| webapp/_webapp/src/views/settings/sections/beta-feature-settings.tsx | Renames the completion toggle UI to “citation suggestions”. |
| webapp/_webapp/src/views/settings/index.tsx | Renders the beta features section in Settings. |
| webapp/_webapp/src/query/api.ts | Adds getCitationKeys() API wrapper for the new v2 endpoint. |
| webapp/_webapp/src/pkg/gen/apiclient/chat/v2/chat_pb.ts | Generated TS client types/service updated for GetCitationKeys. |
| webapp/_webapp/src/libs/inline-suggestion.ts | Adds `\cite{` trigger handling and suppresses Overleaf autocomplete/tab behavior when suggestions are active. |
| proto/chat/v2/chat.proto | Adds GetCitationKeys RPC + request/response messages. |
| pkg/gen/api/chat/v2/chat_grpc.pb.go | Generated gRPC server/client updated with GetCitationKeys. |
| pkg/gen/api/chat/v2/chat.pb.gw.go | Generated grpc-gateway bindings for /_pd/api/v2/chats/citation-keys. |
| pkg/gen/api/chat/v2/chat.pb.go | Generated Go proto types updated with new messages and RPC. |
| internal/services/toolkit/client/get_citation_keys.go | Implements bibliography extraction and LLM prompt to suggest citation keys. |
| internal/api/chat/get_citation_keys.go | Implements the ChatServerV2 handler for GetCitationKeys. |
| .gitignore | Ignores CLAUDE.md. |
```proto
// A comma-separated string of keys, or empty if none found
string citation_keys = 1;
```
Returning citation keys as a comma-separated string makes the API harder to use correctly (parsing, escaping, empty handling) and is inconsistent with other protos that use `repeated string` for lists. Consider changing `citation_keys` to `repeated string citation_keys = 1;` and letting clients join with commas as needed.
Suggested change:

```diff
-// A comma-separated string of keys, or empty if none found
-string citation_keys = 1;
+// A list of suggested citation keys, empty if none found
+repeated string citation_keys = 1;
```
```ts
const textBefore = state.doc.sliceString(0, cursorPos - triggerText.length);
const lastSentence = textBefore
  .split(/(?<=[.!?])\s+/)
  .filter((s) => s.trim().length > 0) // filter out empty sentences
  .slice(-1)[0];
```
If there is no sentence-ending punctuation before the trigger, lastSentence becomes the entire prefix of the document (potentially very large), which can increase latency/cost and exceed token limits. Consider capping the context (e.g. last N chars/words) and/or using a more robust sentence extraction that falls back to a bounded window.
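A possible shape for that fallback, sketched below in the same spirit as the existing extraction; the 500-character cap is an arbitrary illustration, not a value from this PR:

```ts
// Sketch: extract the last sentence before the \cite{ trigger, but never more
// than MAX_CONTEXT_CHARS, so a document without sentence-ending punctuation
// cannot blow up prompt size, latency, or cost.
const MAX_CONTEXT_CHARS = 500; // arbitrary cap for illustration

function extractContext(textBefore: string): string {
  const sentences = textBefore
    .split(/(?<=[.!?])\s+/)
    .filter((s) => s.trim().length > 0);
  // Fall back to the raw prefix when no sentence boundary exists.
  const last = sentences.length > 0 ? sentences[sentences.length - 1] : textBefore;
  // Keep only the tail of an overly long "sentence" (e.g. a preamble with no punctuation).
  return last.length > MAX_CONTEXT_CHARS ? last.slice(-MAX_CONTEXT_CHARS) : last;
}
```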
```diff
@@ -0,0 +1,142 @@
+package client
...
+// TODO: This file should not place in the client package.
```
This new file includes a TODO noting it should not live in the client package. Leaving known-misplaced code in-tree makes future refactors harder (imports, package boundaries). Please move this logic into an appropriate package (e.g. a citation/bibliography helper or toolkit subpackage) before merging, and keep client focused on LLM client orchestration.
```go
// TODO: This file should not place in the client package.
```
@Junyi-99 I'm planning to eventually move this file to wherever `internal/services/toolkit/client/get_conversation_title_v2.go` will be moved to, since both files contain high-order business logic that doesn't belong in the `client` folder. Shall I leave this file here for now?
I got Copilot to review the PR. I didn't really look through its suggestions. Could you resolve Copilot's comments and integrate any valid/viable suggestions? I think Jun Yi is busy these few days; I'll review later this week if it's still pending review! Thanks for your hard work, Jiayi!
@Junyi-99 actually, do we have a recommended PR-to-main workflow? Should we only review PRs to main and allow freely merging to the staging/dev branch for testing in a simulated prod env? Or should there be a preliminary review before merging to staging too?
```go
func (a *AIClientV2) GetCitationKeys(ctx context.Context, sentence string, userId bson.ObjectID, projectId string, llmProvider *models.LLMProviderConfig) (string, error) {
	bibliography, err := a.GetBibliographyForCitation(ctx, userId, projectId)
```
New citation-key functionality is added here but there are existing tests for other toolkit client behaviors (e.g. conversation title). Please add unit tests for GetBibliographyForCitation/GetCitationKeys (at minimum: field-exclusion behavior and prompt formatting / empty-citation handling) to prevent regressions.
```ts
export const getCitationKeys = async (data: PlainMessage<GetCitationKeysRequest>) => {
  const response = await apiclientV2.post(`/chats/citation-keys`, data);
  return fromJson(GetCitationKeysResponseSchema, response);
}
```
This API is invoked as the user types; failures (401, network, etc.) should not spam global error toasts. Pass `{ ignoreErrorToast: true }` to `apiclientV2.post` here and let the caller decide how to surface errors (the completion code already logs and returns `""`). Also, the file consistently terminates exported const assignments with `;`, so add it here to match lint/style.
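The suggested change would look roughly like the sketch below; it reuses the imports already shown in the snippet above and assumes `apiclientV2.post` accepts a per-request options object with `ignoreErrorToast`, as the review comment implies:

```ts
export const getCitationKeys = async (data: PlainMessage<GetCitationKeysRequest>) => {
  // Suppress the global error toast; the inline-suggestion caller already
  // logs failures and falls back to an empty suggestion string.
  const response = await apiclientV2.post(`/chats/citation-keys`, data, { ignoreErrorToast: true });
  return fromJson(GetCitationKeysResponseSchema, response);
};
```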
```go
citationKeys, err := s.aiClientV2.GetCitationKeys(
	ctx,
	req.GetSentence(),
	actor.ID,
	req.GetProjectId(),
```
`GetCitationKeysRequest` includes an optional `model_slug`, but this handler doesn't read `req.GetModelSlug()` (and the downstream call currently hardcodes a model). Either plumb the requested model through to the client call, or remove the field from the proto to avoid a misleading API surface.
```go
citationKeys, err := s.aiClientV2.GetCitationKeys(
	ctx,
	req.GetSentence(),
	actor.ID,
	req.GetProjectId(),
```
Please validate required inputs (at least `sentence` and `project_id`) and return a clear bad-request error when missing, consistent with other chat handlers (e.g. `shared.ErrBadRequest("title is required")`). Otherwise empty values will fail deeper in services with less actionable errors.
```ts
  if (styleEl) return;
  styleEl = document.createElement("style");
  styleEl.textContent = `.cm-tooltip { display: none !important; }`;
  document.head.appendChild(styleEl);
}
```
The injected CSS hides all .cm-tooltip elements globally while active, which can suppress unrelated tooltips (lint, hover help, etc.) elsewhere on the page and across multiple editors. Scope the selector to the specific editor instance/container and/or a more specific autocomplete tooltip class so only Overleaf’s dropdown is suppressed.
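One way to scope it, sketched below: tag the specific editor container and target CodeMirror's autocomplete tooltip class (`.cm-tooltip-autocomplete`) rather than every `.cm-tooltip` on the page. The container lookup and data attribute are illustrative assumptions, not code from this PR:

```ts
let styleEl: HTMLStyleElement | null = null;

// Sketch: suppress only the autocomplete tooltip inside one editor container,
// leaving lint/hover tooltips and other editors on the page untouched.
function hideAutocompleteTooltip(editorContainer: HTMLElement) {
  if (styleEl) return;
  // Mark the container so the rule cannot leak to unrelated CodeMirror instances.
  editorContainer.dataset.pdSuppressAutocomplete = "true";
  styleEl = document.createElement("style");
  styleEl.textContent = `[data-pd-suppress-autocomplete="true"] .cm-tooltip-autocomplete { display: none !important; }`;
  document.head.appendChild(styleEl);
}
```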
#33
In short, it'll
- detect the citation trigger (`\cite{`)
- pass in `.bib` files as raw text, and remove irrelevant fields to save tokens
- query an LLM (`gpt-5.2` for now) to get at most 3 citation keys

Overall latency is 1-2s (tested using OpenAI API).
There are some outstanding issues