Skip to content

feat: holder stats integrations#2401

Open
0xkr3p wants to merge 28 commits intoDefiLlama:masterfrom
0xkr3p:feat/holder-stats-integration
Open

feat: holder stats integrations#2401
0xkr3p wants to merge 28 commits intoDefiLlama:masterfrom
0xkr3p:feat/holder-stats-integration

Conversation

@0xkr3p
Copy link
Contributor

@0xkr3p 0xkr3p commented Feb 20, 2026

Summary

  • Adds a daily Lambda (triggerHoldersDaily, 6 AM UTC) that fetches holder data for all
    EVM pools with TVL >= $10k
  • Classifies tokens dynamically (standard, rebasing, share-based) via on-chain interface
    probing, with S3-cached classifications to avoid redundant RPC calls
  • Handles rebasing tokens (stETH, OETH, etc.) with incremental delta processing and
    periodic on-chain balance verification
  • Stores daily snapshots in a new holder_daily table with holder count, avg position,
    and top-10 concentration metrics
  • Exposes two new API endpoints: GET /holders (latest + 7d/30d change) and GET /holderHistory/:pool (time series)

New files

  • src/handlers/triggerHoldersDaily.js — orchestrator
  • src/utils/holderApi.js — API client, token classification, on-chain refinement
  • src/queries/holder.js — DB queries
  • migrations/1771242161000_add-holder-tables.js — schema

Test plan

  • Verify Lambda executes successfully with standard EVM pools
  • Confirm rebasing tokens (stETH, OETH) are classified and seeded correctly
  • Check incremental updates produce consistent holder counts after multiple runs
  • Validate /holders and /holderHistory/:pool return expected data shapes
  • Confirm pools with non-standard pool field formats (bare addresses, no chain suffix)
    are picked up via config.chain

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

A complete holder data pipeline is introduced, featuring a new PostgreSQL holder_daily table, a daily Lambda trigger for aggregating holder snapshots, API endpoints for querying holder data and history, database accessors, token classification utilities with API integration and caching, and supporting configuration.

Changes

Cohort / File(s) Summary
Database Migration
migrations/1771242161000_add-holder-tables.js
Creates holder_daily table with holder stats (count, avg position, top 10 percent, top 10 holders) per configID and timestamp, with unique and composite indexes; includes reverse drop function.
Infrastructure & Configuration
serverless.yml, env.js
Adds triggerHoldersDaily Lambda function with 900s timeout, daily 06:00 schedule, and HOLDERS_API_KEY environment variable; exposes API key in env config.
API Routes & Controllers
src/api/routes/yield.js, src/api/controllers/yield.js
Adds two GET endpoints (/holderHistory/:pool, /holders) with controllers that fetch latest holder snapshots, compute per-pool aggregations (count, avg position, 7/30-day changes), and return structured responses.
Data Access Layer
src/queries/holder.js
New module exposing five database accessors: insertHolder (upsert snapshots), getEligiblePools (TVL-filtered), getLatestHolders (latest 30-day snapshots), getHolderHistory (time series per pool), and getHolderOffset (historical count lookup).
Daily Holder Processor
src/handlers/triggerHoldersDaily.js
Implements comprehensive Lambda handler retrieving eligible pools, classifying tokens, performing multi-batch totalSupply queries, and processing standard/flagged pools via multiple strategies (incremental updates, seeding, on-chain refinement) with caching and detailed logging.
Holder API Utilities
src/utils/holderApi.js
Large utility module encapsulating Peluche API integration, S3-backed caching, token classification (true_rebase, share_based, standard), batch processing, ANKR comparison fallbacks, on-chain balance refinement, and block height resolution.
Testing & Validation
src/utils/testEnrichmentMerge.js
Standalone test script validating holder data merge logic for the GET /holders endpoint, testing aggregation, rounding, null handling, and field exclusion across multiple scenarios.
Miscellaneous Updates
src/handlers/triggerEnrichment.js, .gitignore
Adds welfordUpdate import; removes three adaptor directories from gitignore (maverick-protocol, metavault-v3, muuu-finance/navi-lending).

Sequence Diagram

sequenceDiagram
    participant Lambda as triggerHoldersDaily
    participant DB as PostgreSQL
    participant Cache as S3 Cache
    participant Chain as Blockchain
    participant API as Peluche API
    participant ANKR as ANKR API

    Lambda->>DB: Fetch eligible pools (TVL threshold)
    Lambda->>Lambda: Resolve chain IDs, batch pool tasks
    Lambda->>Chain: Multi-call totalSupply (batches)
    Chain-->>Lambda: Supply data (totalSupplyMap)
    Lambda->>Cache: Load token classifications per chain
    Cache-->>Lambda: Cached token types
    Lambda->>Chain: On-chain ABI probing (unclassified)
    Chain-->>Lambda: ABI check results
    Lambda->>ANKR: Compare holder counts (fallback)
    ANKR-->>Lambda: Holder count comparison
    Lambda->>Cache: Cache classified tokens
    Lambda->>API: fetchHolders (standard pools)
    API-->>Lambda: Holder data
    Lambda->>Chain: Refine balances on-chain
    Chain-->>Lambda: Refined holder balances
    Lambda->>Lambda: Build aggregated results (count, avg, top 10)
    Lambda->>DB: Insert/upsert snapshots (holder_daily)
    DB-->>Lambda: Acknowledged
    Lambda->>API: fetchHolders (flagged pools, incremental/seed)
    API-->>Lambda: Rebase-aware holder data
    Lambda->>DB: Insert flagged snapshots
    DB-->>Lambda: Acknowledged
    Lambda->>Lambda: Log summary (success/failure/skipped)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hop, hop! The holders dance,
In daily snapshots, cached with stance,
From rebase quirks to standard ways,
We track each wallet's bright displays!
With Lambda's tick and database's beat,
The holder pipeline is complete! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 67.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: holder stats integrations' directly corresponds to the main change: adding a complete holder statistics pipeline with database schema, API endpoints, data fetching, and daily scheduling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can disable sequence diagrams in the walkthrough.

Disable the reviews.sequence_diagrams setting to disable sequence diagrams in the walkthrough.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (6)
src/handlers/triggerEnrichment.js (1)

126-126: Use console.error for error logging in the catch block.

console.log for error conditions mixes error output with informational output, making it harder to filter in CloudWatch. The catch at line 125 is the only error path in the holder enrichment block.

✏️ Proposed fix
-    console.log('holder data fetch failed, continuing without it', err.message);
+    console.error('holder data fetch failed, continuing without it', err.message);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerEnrichment.js` at line 126, In the holder enrichment
catch block in src/handlers/triggerEnrichment.js (the catch that currently logs
"holder data fetch failed, continuing without it", err.message), replace the
console.log call with console.error and log the full error (e.g., err or
err.stack) instead of only err.message so error output is emitted to stderr and
is easier to filter in CloudWatch.
src/handlers/triggerHoldersBackfill.js (1)

1-1: aws-sdk v2 is in maintenance mode — consider migrating to @aws-sdk/client-sqs (v3).

AWS SDK v2 entered maintenance mode in September 2023 and is not included in Lambda runtimes for Node.js 18+. Although serverless-webpack bundles it, new Lambda code should use SDK v3. The equivalent v3 migration is straightforward.

♻️ Proposed refactor to SDK v3
-const SQS = require('aws-sdk/clients/sqs');
+const { SQSClient, SendMessageBatchCommand } = require('@aws-sdk/client-sqs');

Then replace client construction and call site:

-  const sqs = new SQS();
+  const sqs = new SQSClient({});
-    const result = await sqs
-      .sendMessageBatch({ QueueUrl: queueUrl, Entries: batch })
-      .promise();
+    const result = await sqs.send(
+      new SendMessageBatchCommand({ QueueUrl: queueUrl, Entries: batch })
+    );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersBackfill.js` at line 1, Replace the v2 aws-sdk
usage with the modular v3 client: remove the require('aws-sdk/clients/sqs') SQS
import and instead instantiate an `@aws-sdk/client-sqs` SQSClient; update all uses
of the old SQS methods (e.g., sendMessage, receiveMessage, deleteMessage) to
call client.send(...) with the corresponding command classes
(SendMessageCommand, ReceiveMessageCommand, DeleteMessageCommand). Ensure
configuration (region/credentials) is passed to the SQSClient constructor and
replace any direct method calls on the old SQS variable with calls to the new
SQSClient instance and command objects.
src/handlers/triggerHoldersWorker.js (1)

100-108: Add :json modifier to top10Holders ColumnSet entry for consistency with balanceMap handling.

upsertHolderState uses the :json pg-promise modifier for balanceMap in its raw SQL query, passing the raw object. The insertHolder ColumnSet currently lacks the :json modifier for top10Holders, causing the handler to pre-stringify the value. This forces PostgreSQL to implicitly cast from text to jsonb instead of relying on the library's built-in serialization.

♻️ Proposed fix
-    { name: 'top10Holders', def: null },
+    { name: 'top10Holders', mod: ':json', def: null },

And in src/handlers/triggerHoldersWorker.js (and src/handlers/triggerHoldersDaily.js), remove the JSON.stringify() call:

-    top10Holders: JSON.stringify(metrics.top10Holders),
+    top10Holders: metrics.top10Holders,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersWorker.js` around lines 100 - 108, The insert path
is pre-stringifying top10Holders before inserting, while upsertHolderState uses
the pg-promise :json modifier for balanceMap; update the insert ColumnSet entry
for top10Holders to use the :json modifier (matching balanceMap) and remove
JSON.stringify(...) in the call site (the insertHolder invocation in
triggerHoldersWorker.js and triggerHoldersDaily.js) so the raw object is passed
and pg-promise handles JSON/JSONB serialization; ensure the ColumnSet name
(top10Holders) and the insertHolder function are updated consistently.
src/queries/holder.js (2)

83-101: getLatestHolders scans the entire holder table — will degrade as snapshots accumulate.

DISTINCT ON ("configID") ... ORDER BY "configID", timestamp DESC over the full holder table becomes increasingly expensive as daily snapshots grow. Ensure there's an index on ("configID", timestamp DESC) on the holder table. Alternatively, consider selecting from holder_state joined with the latest snapshot if state already tracks the most recent metrics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/queries/holder.js` around lines 83 - 101, getLatestHolders currently
scans the entire holder table by using DISTINCT ON with ORDER BY timestamp DESC,
which will slow as snapshots grow; either add a DB index on ("configID",
timestamp DESC) for the holder table to make this query efficient, or rewrite
getLatestHolders to join holder_state (or another table that tracks the latest
snapshot) to fetch only the most-recent row per configID (use the configID join
key and the latest-timestamp marker) instead of scanning all snapshots; locate
the getLatestHolders function in holder.js and implement the index creation in
your migration/DDL or change the query to join the latest-state table
accordingly.

61-81: The latest_tvl CTE may be expensive on a large yield table — consider indexing or materialization.

Both getAllHolderStates and getPoolsWithoutHolderState use the same CTE:

SELECT DISTINCT ON ("configID") "configID", "tvlUsd"
FROM yield
ORDER BY "configID", timestamp DESC

This performs a full sort of the yield table. If this table is large (millions of rows), this will be slow. Ensure there's a composite index on ("configID", timestamp DESC) to support the DISTINCT ON efficiently. Alternatively, consider a materialized view or caching the latest TVL if this becomes a bottleneck.

#!/bin/bash
# Check if there's an existing index on the yield table for the DISTINCT ON query
rg -n 'CREATE INDEX' --type sql -g '!node_modules' | grep -i 'yield'
# Also check migration files for index definitions
fd -e js -p 'migration' --exec grep -l 'yield' {} \;

Also applies to: 117-137

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/queries/holder.js` around lines 61 - 81, The latest_tvl CTE used in
getAllHolderStates (and also in getPoolsWithoutHolderState) can cause expensive
full sorts on the yield table; add a database-side optimization by creating a
composite index on ("configID", timestamp DESC) (via a new migration or SQL
index file) to support the DISTINCT ON query, or alternatively introduce a
materialized view or cached table that stores the latest tvlUsd per configID and
update it periodically—update the project migrations or database setup to
include the index or materialized view so the SQL in getAllHolderStates and
getPoolsWithoutHolderState runs efficiently.
src/utils/holderScanner.js (1)

46-86: Checkpoint trigger condition is unreliable for the onCheckpoint callback.

Line 79: processedCount % CHECKPOINT_INTERVAL < logs.length — this condition is checked once per processor invocation (per batch of logs), but processedCount has already been incremented by logs.length inside the loop above. If a single processor call receives a batch that straddles two checkpoint intervals (e.g., processedCount goes from 499_950 to 500_150), the condition fires correctly. However, if individual batches are consistently smaller than the remainder, the condition can fire repeatedly on consecutive small batches (e.g., if processedCount is 500_003 and next batch is 5 logs → 500_008 % 500_000 = 8 < 5 is false, but 500_003 % 500_000 = 3 < 100 was true). More importantly, if batch sizes are always very large (>500k), the checkpoint fires at most once per batch which may be too infrequent.

The worker in triggerHoldersWorker.js (lines 63–77) implements its own count-based guard (processedCount - lastCheckpointCount >= 500_000) which is more deterministic. Consider aligning this logic or tracking lastCheckpointCount internally.

♻️ Suggested fix: use a deterministic checkpoint guard
   let processedCount = 0;
   let lastSeenBlock = fromBlock;
   const CHECKPOINT_INTERVAL = 500_000;
+  let lastCheckpointCount = 0;

   await sdk.indexer.getLogs({
     ...baseOpts,
     all: true,
     clientStreaming: true,
     collect: false,
     processor: (logs) => {
       for (const log of logs) {
         const args = log.args || log;
         applyTransfer(balances, args.from, args.to, args.value);
         processedCount++;
         if (log.blockNumber != null) lastSeenBlock = log.blockNumber;
       }
       // Intermediate checkpoint callback for long-running scans
-      if (onCheckpoint && processedCount % CHECKPOINT_INTERVAL < logs.length) {
-        onCheckpoint(balances, processedCount, lastSeenBlock);
+      if (onCheckpoint && processedCount - lastCheckpointCount >= CHECKPOINT_INTERVAL) {
+        onCheckpoint(balances, processedCount, lastSeenBlock);
+        lastCheckpointCount = processedCount;
       }
     },
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderScanner.js` around lines 46 - 86, scanTransfers's checkpoint
condition is unreliable; add a deterministic guard like lastCheckpointCount
inside scanTransfers and replace the modulus condition with a difference check.
Specifically, declare let lastCheckpointCount = 0; keep incrementing
processedCount as now, and inside the processor call onCheckpoint when
(processedCount - lastCheckpointCount) >= CHECKPOINT_INTERVAL, passing balances,
processedCount, lastSeenBlock, then set lastCheckpointCount = processedCount (or
to lastSeenBlock-dependent value if you prefer block-based checkpoints).
Reference symbols: scanTransfers, processedCount, CHECKPOINT_INTERVAL,
lastSeenBlock, onCheckpoint, and applyTransfer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@serverless.yml`:
- Around line 246-258: The HolderQueue's MessageRetentionPeriod (259200 s / 72h)
is too short for worst-case backfills; update the MessageRetentionPeriod
property for the HolderQueue to the SQS maximum (1209600) to match the
DeadLetterQueue retention, and/or increase the workers' reservedConcurrency
(where you configure reservedConcurrency for the Lambda consumers) to reduce
total processing time; ensure the RedrivePolicy referencing DeadLetterQueue
remains unchanged and document the change so we can track retention vs
concurrency tradeoffs.
- Around line 212-222: The triggerHoldersDaily function exposes unused indexer
credentials; remove LLAMA_INDEXER_V2_ENDPOINT and LLAMA_INDEXER_V2_API_KEY from
the environment block of triggerHoldersDaily in serverless.yml so only
HOLDER_QUEUE_URL remains, and verify that the handler
src/handlers/triggerHoldersDaily.handler does not rely on those variables (the
indexer usage is handled downstream by the worker). Also search for any
accidental references to LLAMA_INDEXER_V2_ENDPOINT or LLAMA_INDEXER_V2_API_KEY
in the triggerHoldersDaily-related code to ensure no runtime dependency remains.

In `@src/api/controllers/yield.js`:
- Around line 168-179: The getHolderHistory handler calls getHolderHistoryQuery
without error handling; wrap the body of getHolderHistory in a try/catch, call
await getHolderHistoryQuery(configID) inside the try, and in catch log the error
(e.g., using console.error or the existing logger) and return a
res.status(500).json({ status: 'error', message: 'Internal server error' }) (or
include a short error detail if safe); apply the same pattern to other handlers
in this file (all handlers calling DB queries) or extract a small helper to
centralize try/catch handling to avoid unhandled promise rejections.

In `@src/handlers/triggerHoldersBackfill.js`:
- Around line 15-68: Wrap the entire main() execution in a top-level try/catch
to surface and log failures from getPoolsWithoutHolderState() and
sqs.sendMessageBatch(...).promise(); inside the catch, use a structured error
log (include the caught error object and context like attempted operation and
affected batch or pool) and rethrow or exit with a non-zero status so the Lambda
invocation is marked failed. Update references around the main() body and the
SQS send loop (function main, getPoolsWithoutHolderState, sqs.sendMessageBatch)
to ensure any thrown errors are caught and logged with details before
terminating.

In `@src/handlers/triggerHoldersDaily.js`:
- Around line 138-169: The inline fallback iterates newPools and calls
scanTransfers(chain, tokenAddress, 0, toBlock) which starts from genesis and can
time out; update the inline path in the queueUrl-null branch to avoid
fromBlock=0 by using a safe floor (e.g., compute fromBlock = Math.max(toBlock -
configLookbackBlocks, deploymentBlockFor(tokenAddress) || 0)) or make the inline
path a no-op that logs a clear warning requiring HOLDER_QUEUE_URL, then skip
processing; adjust the call to scanTransfers to use that fromBlock and ensure
any downstream uses (insertHolder, upsertHolderState) only run when
scanTransfers is bounded or when configured to allow full scans.
- Around line 191-203: The SQS batching loop in sendMessageBatch (using
variables messages, batch, sqs, queueUrl, sqsQueued) logs result.Failed but
drops those entries, risking lost pool processing; modify the loop to either
retry failed entries (e.g., re-enqueue the failed Entries up to N retries with
backoff) or persist them for later processing (e.g., push failed Entries into a
retry queue or store in DB) and incrementally update sqsQueued only for
successful sends; if eventual consistency is acceptable, add a clear comment
near the result.Failed handling documenting that failures are intentionally not
retried and will be discovered in the next run via getPoolsWithoutHolderState.
- Around line 180-188: The SQS message Id generation using
pool.configID.replace(/-/g, '') can produce collisions and exceed SQS 80-char
limits; update the code that builds the messages array (where
messages.push({...}) using pool.configID) to create a deterministic, unique Id
per batch by combining a per-batch index (or the loop index) with a short stable
hash/truncated configID (e.g., first N chars or a hash digest) and ensure the
final Id is <=80 characters; use the same unique Id generation when building the
MessageBody so configID is preserved in the body but not relied on for the SQS
Id; ensure this change applies before calling sendMessageBatch to avoid
duplicate Ids in the same batch.

In `@src/handlers/triggerHoldersWorker.js`:
- Around line 87-92: When holderCount (computed from
Object.keys(balanceMap).length) is 0 we currently return early and never call
upsertHolderState, causing getPoolsWithoutHolderState to keep returning the
pool; instead, when holderCount === 0 call upsertHolderState for the given
tokenAddress with a sentinel/empty holder state (e.g., holderCount: 0 and a
processed flag or lastScannedBlock set) so the pool is marked processed and
won't be re-queued; update the branch in the code that logs "Skipping
${tokenAddress}: 0 holders..." to invoke upsertHolderState with that sentinel
state before returning.

In `@src/utils/holderScanner.js`:
- Around line 133-143: The median USD calculation loses precision by converting
large BigInt balances to Number (see medianPositionUsd calculation using
totalBalance, tvlUsd, entries and pricePerToken); fix by first scaling raw
BigInt balances and totalBalance down by the token decimals (divide by
10**decimals or use a BigInt-safe decimal representation) to produce safe JS
Numbers before computing pricePerToken and per-holder USD values, or compute the
median using BigInt-aware arithmetic (e.g., compare scaled BigInt values) so you
avoid calling Number(BigInt) on raw 18-decimal balances.

---

Nitpick comments:
In `@src/handlers/triggerEnrichment.js`:
- Line 126: In the holder enrichment catch block in
src/handlers/triggerEnrichment.js (the catch that currently logs "holder data
fetch failed, continuing without it", err.message), replace the console.log call
with console.error and log the full error (e.g., err or err.stack) instead of
only err.message so error output is emitted to stderr and is easier to filter in
CloudWatch.

In `@src/handlers/triggerHoldersBackfill.js`:
- Line 1: Replace the v2 aws-sdk usage with the modular v3 client: remove the
require('aws-sdk/clients/sqs') SQS import and instead instantiate an
`@aws-sdk/client-sqs` SQSClient; update all uses of the old SQS methods (e.g.,
sendMessage, receiveMessage, deleteMessage) to call client.send(...) with the
corresponding command classes (SendMessageCommand, ReceiveMessageCommand,
DeleteMessageCommand). Ensure configuration (region/credentials) is passed to
the SQSClient constructor and replace any direct method calls on the old SQS
variable with calls to the new SQSClient instance and command objects.

In `@src/handlers/triggerHoldersWorker.js`:
- Around line 100-108: The insert path is pre-stringifying top10Holders before
inserting, while upsertHolderState uses the pg-promise :json modifier for
balanceMap; update the insert ColumnSet entry for top10Holders to use the :json
modifier (matching balanceMap) and remove JSON.stringify(...) in the call site
(the insertHolder invocation in triggerHoldersWorker.js and
triggerHoldersDaily.js) so the raw object is passed and pg-promise handles
JSON/JSONB serialization; ensure the ColumnSet name (top10Holders) and the
insertHolder function are updated consistently.

In `@src/queries/holder.js`:
- Around line 83-101: getLatestHolders currently scans the entire holder table
by using DISTINCT ON with ORDER BY timestamp DESC, which will slow as snapshots
grow; either add a DB index on ("configID", timestamp DESC) for the holder table
to make this query efficient, or rewrite getLatestHolders to join holder_state
(or another table that tracks the latest snapshot) to fetch only the most-recent
row per configID (use the configID join key and the latest-timestamp marker)
instead of scanning all snapshots; locate the getLatestHolders function in
holder.js and implement the index creation in your migration/DDL or change the
query to join the latest-state table accordingly.
- Around line 61-81: The latest_tvl CTE used in getAllHolderStates (and also in
getPoolsWithoutHolderState) can cause expensive full sorts on the yield table;
add a database-side optimization by creating a composite index on ("configID",
timestamp DESC) (via a new migration or SQL index file) to support the DISTINCT
ON query, or alternatively introduce a materialized view or cached table that
stores the latest tvlUsd per configID and update it periodically—update the
project migrations or database setup to include the index or materialized view
so the SQL in getAllHolderStates and getPoolsWithoutHolderState runs
efficiently.

In `@src/utils/holderScanner.js`:
- Around line 46-86: scanTransfers's checkpoint condition is unreliable; add a
deterministic guard like lastCheckpointCount inside scanTransfers and replace
the modulus condition with a difference check. Specifically, declare let
lastCheckpointCount = 0; keep incrementing processedCount as now, and inside the
processor call onCheckpoint when (processedCount - lastCheckpointCount) >=
CHECKPOINT_INTERVAL, passing balances, processedCount, lastSeenBlock, then set
lastCheckpointCount = processedCount (or to lastSeenBlock-dependent value if you
prefer block-based checkpoints). Reference symbols: scanTransfers,
processedCount, CHECKPOINT_INTERVAL, lastSeenBlock, onCheckpoint, and
applyTransfer.

serverless.yml Outdated
Comment on lines +246 to +258
HolderQueue:
Type: AWS::SQS::Queue
Properties:
QueueName: ${self:service}-${self:custom.stage}-HolderQueue
VisibilityTimeout: 960
# 72 hours — backfills across ~15k pools can take days to drain
MessageRetentionPeriod: 259200
RedrivePolicy:
deadLetterTargetArn:
Fn::GetAtt:
- DeadLetterQueue
- Arn
maxReceiveCount: 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

MessageRetentionPeriod of 72 hours may be insufficient for a full backfill.

With reservedConcurrency: 10 workers and a worst-case 900 s per pool, processing ~15 k pools would require up to ~375 hours. Messages enqueued at the start of a backfill could expire before being consumed. Consider extending retention (max is 1,209,600 s / 14 days, matching the DLQ) or raising concurrency.

Do you want me to open an issue to track the retention period / concurrency tradeoff analysis?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@serverless.yml` around lines 246 - 258, The HolderQueue's
MessageRetentionPeriod (259200 s / 72h) is too short for worst-case backfills;
update the MessageRetentionPeriod property for the HolderQueue to the SQS
maximum (1209600) to match the DeadLetterQueue retention, and/or increase the
workers' reservedConcurrency (where you configure reservedConcurrency for the
Lambda consumers) to reduce total processing time; ensure the RedrivePolicy
referencing DeadLetterQueue remains unchanged and document the change so we can
track retention vs concurrency tradeoffs.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
src/handlers/triggerHoldersDaily.js (3)

97-108: insertHolder and upsertHolderState are not atomic — partial failure can cause duplicate snapshots.

If upsertHolderState fails after insertHolder succeeds, lastBlock isn't advanced, so the next run re-scans the same range and inserts another snapshot (different timestamp → no ON CONFLICT dedup). Consider wrapping both in a single DB transaction, or at minimum reversing the order so the state is updated first (making re-scans insert a benign duplicate that hits DO NOTHING).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 97 - 108, The snapshot
insert (insertHolder) and state update (upsertHolderState) are not atomic so a
failure after insert can cause duplicate snapshots; update this by performing
both operations inside a single DB transaction or, if transaction wiring is
nontrivial, flip the order to call upsertHolderState(pool.configID, toBlock,
updatedMap) before insertHolder(...) so lastBlock advances before any snapshot
is written; reference the functions insertHolder and upsertHolderState and the
key identifier pool.configID when making the change.

132-134: Wrap new-pool discovery in try/catch to isolate it from the incremental-update phase.

If getPoolsWithoutHolderState() (or the SQS/inline processing below it) throws, the Lambda reports failure despite having already committed all incremental updates. This could trigger spurious alerts or unnecessary retries. Wrapping this section in a try/catch with an error log would let the handler exit cleanly.

🛡️ Proposed isolation
   // 4. Process NEW pools — always queue to SQS
-  const newPools = await getPoolsWithoutHolderState();
-  if (newPools.length > 0) {
+  try {
+    const newPools = await getPoolsWithoutHolderState();
+    if (newPools.length > 0) {
       console.log(`${newPools.length} new pools found`);
       // ... existing logic ...
+    }
+  } catch (e) {
+    console.error(`New-pool processing failed: ${e.message}`);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 132 - 134, The new-pool
discovery and processing block using getPoolsWithoutHolderState() and the
subsequent SQS/inline queuing (the newPools variable and its processing) must be
isolated in its own try/catch so failures there do not fail the whole Lambda
after incremental updates have been committed; wrap the call to
getPoolsWithoutHolderState() and the loop that enqueues/processes newPools in a
try block and catch any errors, log them via the existing logger (include
context like "new pool discovery failed" and the error), and allow the handler
to continue/return normally without throwing so incremental-update work is not
retried.

156-165: Duplicate insertHolder payload construction — consider extracting a helper.

The same payload shape (lines 98-106 and 157-165) is built in two places. A small helper like buildHolderPayload(configID, timestamp, metrics) would centralize the JSON.stringify(metrics.top10Holders) detail and reduce copy-paste drift risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 156 - 165, There's
duplicate construction of the payload passed to insertHolder (built from
deriveMetrics and then properties like holderCount, avgPositionUsd, top10Pct,
JSON.stringify(top10Holders), medianPositionUsd); extract a small helper
function named buildHolderPayload(configID, timestamp, metrics) that returns the
object with JSON.stringify(metrics.top10Holders) done in one place, then replace
both inline payload constructions with
insertHolder(buildHolderPayload(pool.configID, now, metrics)); ensure the helper
accepts the same symbols used (configID, timestamp, metrics) and preserves all
existing property names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 69-74: Pool.lastBlock must be validated before doing arithmetic:
check pool.lastBlock (from getAllHolderStates) is a finite number (e.g., via
Number.isFinite after coercion) and only compute fromBlock =
Number(pool.lastBlock) + 1 when valid; if it is missing/NaN/null/invalid, treat
the pool as already up-to-date (return 'skipped') or set fromBlock to a safe
fallback (e.g., toBlock) so scanTransfers never receives NaN. Update the logic
around fromBlock/toBlock and the early-return so invalid lastBlock values are
handled defensively and reference the variables pool.lastBlock, fromBlock,
toBlock and the consumer scanTransfers in your fix.

In `@src/handlers/triggerHoldersWorker.js`:
- Around line 100-111: The two DB writes (insertHolder and upsertHolderState)
must be made atomic to avoid duplicate snapshots on retry: wrap the calls in a
single DB transaction (beginTransaction -> upsertHolderState(configID, toBlock,
balanceMap) -> insertHolder({... timestamp: now, holderCount..., top10Holders:
JSON.stringify(...) }) -> commit; rollback on error) so either both persist or
neither does; alternatively, if wrapping in a transaction is infeasible, flip
the order to call upsertHolderState first and only call insertHolder after that
commit succeeds, or change the snapshot dedupe key to (configID, toBlock)
instead of (configID, timestamp) to make insertHolder idempotent.
- Line 46: processToken is destructuring tvlUsd from the SQS body with no
fallback, allowing undefined to flow into deriveMetrics and produce NaN; update
the destructuring or immediately coerce tvlUsd to a safe numeric default (e.g.,
0) before calling deriveMetrics(balanceMap, tvlUsd) and do the same for the
other occurrence referenced around line 97 so deriveMetrics always receives a
number (or explicitly handle undefined inside deriveMetrics) to prevent NaN
values from being written to the DB.
- Around line 33-35: The catch block that logs failures uses err.message which
is undefined for non-Error throwables; update the catch in the handler
surrounding processToken so the log uses a safe message like err?.message ??
String(err) (and still logs the full err object), e.g., change the console.log
call in the catch that currently references failedMessageIds and
record.messageId to include the computed safe message, ensuring
failedMessageIds.push(record.messageId) remains unchanged.
- Line 107: The top10Holders value is being JSON-stringified before DB insert,
which causes pg-promise to store a string instead of native JSON for the jsonb
column; in triggerHoldersWorker.js remove JSON.stringify and pass
metrics.top10Holders directly when setting top10Holders (and make the identical
change in triggerHoldersDaily.js where top10Holders is stringified) so the
driver can auto-serialize the object into jsonb.

---

Duplicate comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 181-189: The SQS batch item Id generation using
pool.configID.replace(/-/g, '') risks collisions; update the messages.push block
that builds the SQS batch (the messages array population in
triggerHoldersDaily.js) to use a unique batch index-based Id (e.g.,
`${batchIndex}-${i}` or similar) instead of the hyphen-stripped configID, ensure
the chosen Id is <= 80 chars and contains only allowed characters, and preserve
the MessageBody structure (configID, chain, tokenAddress, tvlUsd) unchanged.
- Around line 138-170: The inline fallback loops newPools and calls
scanTransfers(chain, tokenAddress, 0, toBlock) which scans from genesis; change
this to use a bounded fromBlock (e.g. compute fromBlock = Math.max(toBlock -
MAX_LOOKBACK_BLOCKS, FALLBACK_MIN_BLOCK) or use lastProcessedBlock from
upsertHolderState if available) or skip inline processing entirely for pools
likely to be large; update the code around scanTransfers, newPools iteration and
upsertHolderState to pass the computed fromBlock into scanTransfers and ensure
you define/configure MAX_LOOKBACK_BLOCKS or FALLBACK_MIN_BLOCK and guard with
SUPPORTED_CHAINS/isValidEvmAddress checks so the inline path cannot start
scanning from block 0 for popular tokens.
- Around line 192-204: The code currently logs result.Failed from the
sendMessageBatch call (in the loop that slices messages into batches and calls
sqs.sendMessageBatch) but drops those failed Entries; update the handler to
either retry failed entries (map result.Failed[*].Id back to the corresponding
Entry in the current batch and resend with a bounded retry count and backoff)
or, if opting for eventual-consistency, explicitly persist/log the failed Entry
Ids so getPoolsWithoutHolderState can rediscover them on the next daily run;
implement the retry path inside the same loop around sqs.sendMessageBatch (use
sqs, batch, result.Failed, and batch Entries names from the diff) with at most N
retries and exponential/backoff delay, and only fall back to persistent logging
after retries are exhausted.

In `@src/handlers/triggerHoldersWorker.js`:
- Around line 87-93: The zero-holder sentinel must mark pools processed to avoid
re-queuing: in triggerHoldersWorker.js, when computing holderCount =
Object.keys(balanceMap).length, ensure the code path for holderCount === 0 logs
the token (tokenAddress), calls await upsertHolderState(configID, toBlock, {})
to persist the processed state, and then returns; verify the presence of
balanceMap, holderCount, tokenAddress and the upsertHolderState(configID,
toBlock, {}) call in that branch and keep it exactly before the return so the
pool is not re-queued.

---

Nitpick comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 97-108: The snapshot insert (insertHolder) and state update
(upsertHolderState) are not atomic so a failure after insert can cause duplicate
snapshots; update this by performing both operations inside a single DB
transaction or, if transaction wiring is nontrivial, flip the order to call
upsertHolderState(pool.configID, toBlock, updatedMap) before insertHolder(...)
so lastBlock advances before any snapshot is written; reference the functions
insertHolder and upsertHolderState and the key identifier pool.configID when
making the change.
- Around line 132-134: The new-pool discovery and processing block using
getPoolsWithoutHolderState() and the subsequent SQS/inline queuing (the newPools
variable and its processing) must be isolated in its own try/catch so failures
there do not fail the whole Lambda after incremental updates have been
committed; wrap the call to getPoolsWithoutHolderState() and the loop that
enqueues/processes newPools in a try block and catch any errors, log them via
the existing logger (include context like "new pool discovery failed" and the
error), and allow the handler to continue/return normally without throwing so
incremental-update work is not retried.
- Around line 156-165: There's duplicate construction of the payload passed to
insertHolder (built from deriveMetrics and then properties like holderCount,
avgPositionUsd, top10Pct, JSON.stringify(top10Holders), medianPositionUsd);
extract a small helper function named buildHolderPayload(configID, timestamp,
metrics) that returns the object with JSON.stringify(metrics.top10Holders) done
in one place, then replace both inline payload constructions with
insertHolder(buildHolderPayload(pool.configID, now, metrics)); ensure the helper
accepts the same symbols used (configID, timestamp, metrics) and preserves all
existing property names.

Comment on lines +69 to +74
const fromBlock = Number(pool.lastBlock) + 1;
const toBlock = chainBlocks[chain];

if (fromBlock >= toBlock) {
return 'skipped';
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

lastBlock should be validated before arithmetic — NaN silently bypasses the skip check.

Number(undefined) + 1NaN, and NaN >= toBlock evaluates to false, so the pool won't be skipped — scanTransfers would receive NaN as fromBlock. Similarly, Number(null)0, which would trigger a near-genesis scan. Both are unlikely given the JOIN in getAllHolderStates, but the failure mode is silent and expensive.

🛡️ Proposed defensive guard
         const fromBlock = Number(pool.lastBlock) + 1;
+        if (!Number.isFinite(fromBlock)) {
+          console.log(`Invalid lastBlock for ${pool.configID}, skipping`);
+          return 'skipped';
+        }
         const toBlock = chainBlocks[chain];
 
         if (fromBlock >= toBlock) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 69 - 74, Pool.lastBlock
must be validated before doing arithmetic: check pool.lastBlock (from
getAllHolderStates) is a finite number (e.g., via Number.isFinite after
coercion) and only compute fromBlock = Number(pool.lastBlock) + 1 when valid; if
it is missing/NaN/null/invalid, treat the pool as already up-to-date (return
'skipped') or set fromBlock to a safe fallback (e.g., toBlock) so scanTransfers
never receives NaN. Update the logic around fromBlock/toBlock and the
early-return so invalid lastBlock values are handled defensively and reference
the variables pool.lastBlock, fromBlock, toBlock and the consumer scanTransfers
in your fix.

Comment on lines +33 to +35
} catch (err) {
console.log(`Failed to process: ${err.message}`, err);
failedMessageIds.push(record.messageId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

err.message is undefined for non-Error throwables.

If processToken rejects with a non-Error value (plain string, object, etc.), err.message is undefined, making the log line read "Failed to process: undefined". Use err?.message ?? String(err) for a reliable message.

🛠️ Proposed fix
-      console.log(`Failed to process: ${err.message}`, err);
+      console.log(`Failed to process: ${err?.message ?? String(err)}`, err);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersWorker.js` around lines 33 - 35, The catch block
that logs failures uses err.message which is undefined for non-Error throwables;
update the catch in the handler surrounding processToken so the log uses a safe
message like err?.message ?? String(err) (and still logs the full err object),
e.g., change the console.log call in the catch that currently references
failedMessageIds and record.messageId to include the computed safe message,
ensuring failedMessageIds.push(record.messageId) remains unchanged.

};
};

async function processToken({ configID, chain, tokenAddress, tvlUsd }) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

tvlUsd can be undefined, silently producing NaN metrics.

tvlUsd is destructured from the SQS message body (line 46) with no fallback. If the producer omits it, deriveMetrics(balanceMap, undefined) produces NaN for avgPositionUsd and medianPositionUsd. PostgreSQL NUMERIC columns accept NaN as a valid value, so these will persist without error and corrupt holder analytics data.

🐛 Proposed fix
-async function processToken({ configID, chain, tokenAddress, tvlUsd }) {
+async function processToken({ configID, chain, tokenAddress, tvlUsd = 0 }) {

Also applies to: 97-97

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersWorker.js` at line 46, processToken is
destructuring tvlUsd from the SQS body with no fallback, allowing undefined to
flow into deriveMetrics and produce NaN; update the destructuring or immediately
coerce tvlUsd to a safe numeric default (e.g., 0) before calling
deriveMetrics(balanceMap, tvlUsd) and do the same for the other occurrence
referenced around line 97 so deriveMetrics always receives a number (or
explicitly handle undefined inside deriveMetrics) to prevent NaN values from
being written to the DB.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 98-108: The snapshot insert (insertHolder) and state write
(upsertHolderState) are not atomic which allows duplicate snapshots on retry;
wrap the pair in a DB transaction (or at minimum call upsertHolderState before
insertHolder) so state advancement is persisted together with the snapshot.
Concretely, in the code path that uses insertHolder(pool.configID,
timestamp=now, ...metrics) and upsertHolderState(pool.configID, toBlock,
updatedMap) (and the inline fallback path that mirrors this), perform
upsertHolderState first or execute both calls inside a single transaction so
failures do not leave lastBlock unchanged while a snapshot was inserted.
- Around line 90-93: When updatedMap is empty we must still advance the pool
sentinel so the scan window doesn't grow; instead of returning 'skipped'
immediately in the block that checks Object.keys(updatedMap).length === 0, call
upsertHolderState for the pool to persist the new lastBlock (set lastBlock to
the current toBlock used for this run or equivalent scan end variable) and then
return 'skipped'—mirror the behavior already implemented in
triggerHoldersWorker.js (see the logic around lines 91–92). Ensure you reference
updatedMap, upsertHolderState, pool.lastBlock and the toBlock variable when
making this change.
- Line 34: The current mapping that builds parsedPools calls parsePoolField for
every pool and will throw and abort the entire main run if any single pool row
is malformed; instead, update the mapping where parsedPools is created in
src/handlers/triggerHoldersDaily.js to wrap each parsePoolField(p.pool)
invocation in a try/catch, call processLogger.error (or the module logger) with
the pool id or identifying data and the caught error, and for failures return a
safe fallback object (e.g., empty parsed fields or a marker like {
poolParseError: true }) so the loop continues; ensure the symbol parsedPools and
the parsePoolField call are the only changes so other code that consumes
parsedPools can handle or skip entries with the fallback marker.

---

Duplicate comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 69-74: Validate pool.lastBlock before doing arithmetic: compute
const lastBlockNum = Number(pool.lastBlock) and if Number.isNaN(lastBlockNum)
then bail out (e.g., log and return 'skipped') instead of using NaN; otherwise
set const fromBlock = lastBlockNum + 1 and proceed to compare with toBlock and
call scanTransfers. This change (in src/handlers/triggerHoldersDaily.js around
fromBlock/toBlock/scanTransfers usage) prevents passing NaN to scanTransfers.
- Around line 199-202: The SQS failure branch currently logs failed IDs
(result.Failed) and silently drops those messages causing pools to be skipped
until the next discovery; update the handler in triggerHoldersDaily to either
(preferred) retry the failed messages by selecting the original entries from
batch (match by message Id), re-send them with a limited retry/backoff loop and
surface errors if retries exhaust, or (if eventual-consistency is acceptable)
replace the current console.error with a clear inline comment stating the drop
is intentional and that getPoolsWithoutHolderState will rediscover those pools
on the next run (include a TODO/issue ref so it isn’t mistaken for a bug later).
Ensure you reference result.Failed and batch when implementing the retry or when
documenting intent.
- Around line 181-182: The current messages.push call uses
pool.configID.replace(/-/g, '') which can collapse distinct IDs and cause
sendMessageBatch to fail; change the Id generation in the messages.push (the Id
property you set for SQS) to ensure uniqueness and preserve identity — e.g., use
pool.configID as-is (it’s allowed up to 80 chars) or if you must normalize
characters, use a deterministic short hash/encoding of pool.configID (like a
SHA1/hex or base64 truncated to 80 chars) or append a per-message index/suffix
(batchIndex or loop index) to pool.configID so each Id in the messages array is
unique before calling sendMessageBatch.
- Around line 138-170: The inline fallback is still calling scanTransfers(chain,
tokenAddress, 0, toBlock) which risks a full genesis scan; instead determine a
safe start block per pool using the existing holder state (e.g., read
getHolderState(pool.configID) or holderStates[pool.configID] if present) and
pass startBlock = lastProcessedBlock + 1 (or, if no state, use a bounded
fallback like Math.max(toBlock - MAX_INITIAL_SCAN_BLOCKS, 0)); then call
scanTransfers(chain, tokenAddress, startBlock, toBlock), and if the computed
window is unacceptably large (toBlock - startBlock > MAX_INITIAL_SCAN_BLOCKS)
skip or queue the pool instead of scanning inline; update
upsertHolderState(pool.configID, toBlock, balanceMap) as before.

In `@src/handlers/triggerHoldersWorker.js`:
- Line 46: processToken currently accepts tvlUsd that can be undefined, causing
deriveMetrics to produce NaN values (avgPositionUsd, medianPositionUsd) and
persist them; fix by ensuring tvlUsd has a safe numeric default before calling
deriveMetrics — e.g., coerce or default tvlUsd to 0 (or a configurable fallback)
inside processToken (or validate and early-return) so deriveMetrics always
receives a number; update the handling around processToken and any calls to
deriveMetrics to guard against undefined and prevent NaN metrics from being
written.
- Around line 33-35: The catch block uses err.message which is unsafe for
non-Error throwables; update the handler (the catch around processToken) to
compute a safe message like const safeMessage = err && err.message ? err.message
: String(err) and use that in the log call (and include the original err
object), and continue to push record.messageId into failedMessageIds so
failedMessageIds and record.messageId are unchanged.
- Around line 100-111: The problem is that insertHolder(...) can commit a
snapshot and then upsertHolderState(...) fails, causing retries to re-insert a
new timestamped snapshot; fix by making the snapshot/state write atomic: perform
upsertHolderState(configID, toBlock, balanceMap) and insertHolder(...) in a
single DB transaction (or at minimum run upsertHolderState first) using the same
computed timestamp (now) or include toBlock in the snapshot unique key so
retries are idempotent; update the code that calls insertHolder and
upsertHolderState to use a transactional API or swap order and ensure both
operations reference the same timestamp/toBlock and are idempotent.

Comment on lines +90 to +93
if (Object.keys(updatedMap).length === 0) {
console.log(`Skipping ${tokenAddress}: 0 holders (likely non-ERC20 pool)`);
return 'skipped';
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Zero-holder incremental scan never advances lastBlock — scan range grows unboundedly.

When updatedMap is empty for a pool with an existing sentinel state (non-ERC20 pool marked by the worker), the function returns 'skipped' without calling upsertHolderState. On every subsequent daily run, fromBlock stays at pool.lastBlock + 1 while toBlock grows, and the block range re-scanned keeps widening. The worker handles this correctly (lines 91–92 of triggerHoldersWorker.js) — apply the same fix here.

🐛 Proposed fix
         if (Object.keys(updatedMap).length === 0) {
           console.log(`Skipping ${tokenAddress}: 0 holders (likely non-ERC20 pool)`);
+          await upsertHolderState(pool.configID, toBlock, {});
           return 'skipped';
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (Object.keys(updatedMap).length === 0) {
console.log(`Skipping ${tokenAddress}: 0 holders (likely non-ERC20 pool)`);
return 'skipped';
}
if (Object.keys(updatedMap).length === 0) {
console.log(`Skipping ${tokenAddress}: 0 holders (likely non-ERC20 pool)`);
await upsertHolderState(pool.configID, toBlock, {});
return 'skipped';
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 90 - 93, When updatedMap is
empty we must still advance the pool sentinel so the scan window doesn't grow;
instead of returning 'skipped' immediately in the block that checks
Object.keys(updatedMap).length === 0, call upsertHolderState for the pool to
persist the new lastBlock (set lastBlock to the current toBlock used for this
run or equivalent scan end variable) and then return 'skipped'—mirror the
behavior already implemented in triggerHoldersWorker.js (see the logic around
lines 91–92). Ensure you reference updatedMap, upsertHolderState, pool.lastBlock
and the toBlock variable when making this change.

Comment on lines +98 to +108
await insertHolder({
configID: pool.configID,
timestamp: now,
holderCount: metrics.holderCount,
avgPositionUsd: metrics.avgPositionUsd,
top10Pct: metrics.top10Pct,
top10Holders: metrics.top10Holders,
medianPositionUsd: metrics.medianPositionUsd,
});

await upsertHolderState(pool.configID, toBlock, updatedMap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Non-atomic snapshot + state writes cause duplicate snapshot rows on next-day retry.

insertHolder (line 98) and upsertHolderState (line 108) are not in a transaction. If upsertHolderState fails, the pool is counted as failed but lastBlock is never advanced. The next daily run re-processes the same block range and calls insertHolder again — with a new now timestamp from the new Lambda invocation — inserting a duplicate snapshot, since ON CONFLICT ("configID", timestamp) DO NOTHING only deduplicates within the same exact millisecond. The same issue applies to the inline fallback path at lines 157–166.

Wrap both calls in a DB transaction, or at minimum flip the order (upsert state first, then insert snapshot) to make partial failures idempotent on retry.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 98 - 108, The snapshot
insert (insertHolder) and state write (upsertHolderState) are not atomic which
allows duplicate snapshots on retry; wrap the pair in a DB transaction (or at
minimum call upsertHolderState before insertHolder) so state advancement is
persisted together with the snapshot. Concretely, in the code path that uses
insertHolder(pool.configID, timestamp=now, ...metrics) and
upsertHolderState(pool.configID, toBlock, updatedMap) (and the inline fallback
path that mirrors this), perform upsertHolderState first or execute both calls
inside a single transaction so failures do not leave lastBlock unchanged while a
snapshot was inserted.


const query = `
WITH latest_tvl AS (
SELECT DISTINCT ON ("configID") "configID", "tvlUsd"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is to expensive/slow when running on the fully unfiltered yield table. i suggest adding a timestamp filter like WHERE timestamp >= NOW() - INTERVAL '7 DAY'

top10Holders: 'jsonb',
medianPositionUsd: 'numeric',
});
pgm.createIndex('holder', ['configID', 'timestamp'], { unique: true });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the query shapes below (the DISTINCT ON) important to add the sort direction DESC on timestamp


const query = `
WITH latest_tvl AS (
SELECT DISTINCT ON ("configID") "configID", "tvlUsd"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slasher125
Copy link
Collaborator

a general observation re holder table:
have you considered rounding the timestamp in the holder table to a day instead of keeping it exact? with a daily timestamp you could change the db inserts/updates in such a way that it never insert more than 1 row per configID/day, so our holder table (maybe rename to holder_daily?) would be deterministic in granularity.
(you can still keep the exact timestamp, for logging purposes)

@0xkr3p
Copy link
Contributor Author

0xkr3p commented Feb 23, 2026

a general observation re holder table: have you considered rounding the timestamp in the holder table to a day instead of keeping it exact? with a daily timestamp you could change the db inserts/updates in such a way that it never insert more than 1 row per configID/day, so our holder table (maybe rename to holder_daily?) would be deterministic in granularity. (you can still keep the exact timestamp, for logging purposes)

good call, yeah changing to daily timestamp makes sense, updating figures every 6hrs could be a bit overkill for this data and a set time each day is more consistent.. i'll get those changes updated for you to re-review

@slasher125
Copy link
Collaborator

slasher125 commented Feb 23, 2026

updating figures every 6hr

is that your intention? cause rn the cron job is set to once a day at 6am utc.
meaning i thought you wanted to trigger the daily handler once a day given it also has daily in the handler name.

@0xkr3p
Copy link
Contributor Author

0xkr3p commented Feb 23, 2026

updating figures every 6hr

is that your intention? cause rn the cron job is set to once a day at 6am utc. meaning i thought you wanted to trigger the daily handler once a day given it also has daily in the handler name.

no i think once a day is fine for this. I misunderstood, I thought we were taking additional snapshots every 6hrs but this is not the case, my bad

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (7)
src/handlers/triggerHoldersDaily.js (4)

94-97: ⚠️ Potential issue | 🟡 Minor

Zero-holder incremental scan never advances lastBlock — block range grows unboundedly

When updatedMap is empty, the function returns 'skipped' without calling upsertHolderState. On every subsequent daily run, fromBlock stays at pool.lastBlock + 1 while toBlock grows, causing an ever-widening re-scan. The worker already handles this correctly at lines 95–97 — apply the same fix here.

🐛 Proposed fix
         if (Object.keys(updatedMap).length === 0) {
           console.log(`Skipping ${tokenAddress}: 0 holders (likely non-ERC20 pool)`);
+          await upsertHolderState(pool.configID, toBlock, {});
           return 'skipped';
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 94 - 97, The function
currently returns 'skipped' when Object.keys(updatedMap).length === 0, which
prevents advancing pool.lastBlock and lets the fromBlock/toBlock range grow
unbounded; instead, call upsertHolderState (the same update used in the
non-empty case) to persist the new lastBlock for tokenAddress/pool so the
incremental scan advances, then return 'skipped' — update the branch that checks
updatedMap to invoke upsertHolderState with the computed lastBlock (and any
relevant holder counts) before returning.

73-74: ⚠️ Potential issue | 🟡 Minor

lastBlock is not validated before arithmetic — NaN silently bypasses the skip check

Number(undefined) + 1NaN, and NaN >= toBlock is false, so scanTransfers receives NaN as fromBlock. While the JOIN in getAllHolderStates makes this unlikely, the failure mode is silent and potentially expensive.

🛡️ Proposed defensive guard
         const fromBlock = Number(pool.lastBlock) + 1;
+        if (!Number.isFinite(fromBlock)) {
+          console.log(`Invalid lastBlock for ${pool.configID}, skipping`);
+          return 'skipped';
+        }
         const toBlock = chainBlocks[chain];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 73 - 74, The code computes
fromBlock = Number(pool.lastBlock) + 1 without validating pool.lastBlock, which
can produce NaN and pass the >= check; update the logic in
triggerHoldersDaily.js around the fromBlock/toBlock calculation (variables:
pool.lastBlock, chainBlocks[chain], fromBlock, toBlock) to defensively handle
non-numeric lastBlock — coerce and validate lastBlock (e.g., use Number.isFinite
or isNaN checks), set a sensible default (like 0) or skip processing for that
pool if invalid, and only call scanTransfers when fromBlock is a valid finite
number and fromBlock < toBlock; also add a brief log/error when lastBlock is
invalid to aid debugging.

35-35: ⚠️ Potential issue | 🟡 Minor

parsePoolField throws crash the entire daily run

A single malformed pool field in the DB causes pools.map(...) to throw, which propagates to the outer try/catch at line 216 and aborts processing for all pools.

🛡️ Proposed fix
-  const parsedPools = pools.map((p) => ({ ...p, ...parsePoolField(p.pool) }));
+  const parsedPools = pools.flatMap((p) => {
+    try {
+      return [{ ...p, ...parsePoolField(p.pool) }];
+    } catch (e) {
+      console.log(`Skipping pool ${p.configID}: failed to parse pool field — ${e.message}`);
+      return [];
+    }
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` at line 35, A malformed pool field
causes parsePoolField to throw inside the pools.map call (which builds
parsedPools), aborting the whole daily run; wrap the call to parsePoolField for
each pool in a local try/catch inside the mapping so a single bad pool is logged
and skipped (or returned with a safe fallback) instead of propagating the
exception: change const parsedPools = pools.map((p) => ({ ...p,
...parsePoolField(p.pool) })); to map each p through a small function that tries
parsePoolField(p.pool), on error logs the pool id/context (using the same logger
used elsewhere) and returns either null or a fallback object, then filter out
nulls before further processing so the rest of the pools continue to be
processed.

183-197: ⚠️ Potential issue | 🟡 Minor

parsePoolField in the SQS batch-building loop has no per-item error handling

Unlike the inline fallback (lines 148–177, which wraps each pool in try/catch), the SQS batching loop at lines 183–197 has no per-pool guard. A single malformed pool field would throw here, propagate to the outer main() catch, and abort the entire function before sendMessageBatch is called — dropping all new-pool discovery for the run.

Additionally, the Id field at line 189 (pool.configID.replace(/-/g, '')) can produce collisions for configIDs that differ only in hyphen placement, causing the whole sendMessageBatch call to fail.

🐛 Proposed fix
      for (const pool of newPools) {
+       let tokenAddress, chain;
+       try {
-         const { tokenAddress, chain } = parsePoolField(pool.pool);
+         ({ tokenAddress, chain } = parsePoolField(pool.pool));
+       } catch (e) {
+         console.log(`Skipping new pool ${pool.configID}: ${e.message}`);
+         continue;
+       }
        if (!chain || !SUPPORTED_CHAINS.has(chain)) continue;
        if (!isValidEvmAddress(tokenAddress)) continue;

        messages.push({
-         Id: pool.configID.replace(/-/g, ''),
+         Id: String(messages.length),
          MessageBody: JSON.stringify({
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 183 - 197, The loop
building messages for sendMessageBatch lacks per-item error handling and uses a
fragile Id derived from pool.configID; wrap the body that calls parsePoolField
and validates tokenAddress/chain in a per-pool try/catch so a single malformed
pool won't abort the whole batch, log and continue on error, and make the SQS
message Id unique and collision-resistant (e.g., append the loop index or a
short deterministic suffix to pool.configID.replace(/-/g, '')) to avoid
collisions that would cause sendMessageBatch to fail; update references in this
block (parsePoolField, messages.push, and pool.configID usage) accordingly.
src/handlers/triggerHoldersWorker.js (3)

108-118: ⚠️ Potential issue | 🟠 Major

Non-atomic snapshot + state write — state not advanced on partial failure

insertHolder (line 108) and upsertHolderState (line 118) are two separate writes. If upsertHolderState fails after insertHolder succeeds, lastBlock is not advanced. The ON CONFLICT ("configID", timestamp) DO UPDATE constraint ensures the same-day snapshot is not duplicated on retry, but the worker will re-scan the full block range unnecessarily on every subsequent run until the state is saved.

Flipping the order (state first, then snapshot) makes partial failures idempotent: if the snapshot insert fails, the next run picks up from the correctly advanced lastBlock.

♻️ Suggested ordering
+  await upsertHolderState(configID, toBlock, balanceMap);
   await insertHolder({
     configID,
     timestamp: today.toISOString(),
     holderCount: metrics.holderCount,
     avgPositionUsd: metrics.avgPositionUsd,
     top10Pct: metrics.top10Pct,
     top10Holders: metrics.top10Holders,
     medianPositionUsd: metrics.medianPositionUsd,
   });
-
-  await upsertHolderState(configID, toBlock, balanceMap);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersWorker.js` around lines 108 - 118, The snapshot
insert and state update are done in the wrong order: call
upsertHolderState(configID, toBlock, balanceMap) before insertHolder({...}) so
the worker advances lastBlock atomically first; this makes retries idempotent if
the snapshot insert later fails—locate the two calls (insertHolder and
upsertHolderState) and swap their order, keeping the same arguments (configID,
toBlock, balanceMap and the metrics/timestamp payload), and preserve existing
error handling so any failure still bubbles up.

46-46: ⚠️ Potential issue | 🟠 Major

tvlUsd has no default — undefined propagates to deriveMetrics and produces NaN metrics

If the SQS message omits tvlUsd, deriveMetrics(balanceMap, undefined) computes NaN for avgPositionUsd and medianPositionUsd. PostgreSQL NUMERIC / FLOAT columns silently accept NaN, persisting corrupt analytics rows.

🐛 Proposed fix
-async function processToken({ configID, chain, tokenAddress, tvlUsd }) {
+async function processToken({ configID, chain, tokenAddress, tvlUsd = 0 }) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersWorker.js` at line 46, processToken currently
accepts tvlUsd that can be undefined and is passed to deriveMetrics causing NaN
values to be stored; update processToken (function processToken) to
guard/normalize tvlUsd before calling deriveMetrics (e.g., default tvlUsd to 0
or explicitly coerce undefined to 0/Number) so deriveMetrics(balanceMap,
tvlUsdNormalized) never receives undefined, or alternatively add a defensive
check in deriveMetrics to handle undefined/NaN inputs; ensure the chosen fix
prevents avgPositionUsd/medianPositionUsd from being computed from undefined and
avoids persisting NaN to the DB.

34-34: ⚠️ Potential issue | 🟡 Minor

err.message is undefined for non-Error throwables

If processToken rejects with a plain string or object, err.message is undefined, producing "Failed to process: undefined" in the log.

🛠️ Proposed fix
-      console.log(`Failed to process: ${err.message}`, err);
+      console.log(`Failed to process: ${err?.message ?? String(err)}`, err);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersWorker.js` at line 34, The log uses err.message
which is undefined for non-Error throwables; update the catch/logging in the
processToken flow (where console.log(`Failed to process: ${err.message}`, err)
is called) to safely print the error by checking for err.message and falling
back to a serialized representation (e.g., err.message || String(err) ||
JSON.stringify(err)) or by logging the err object directly so both Error
instances and plain strings/objects are readable; change the console.log call in
the processToken error handler accordingly.
🧹 Nitpick comments (1)
src/queries/holder.js (1)

98-99: Table name constants unused in several query functions

getLatestHolders (line 98), getHolderHistory (line 117), getAllHolderStates (line 84), and getPoolsWithoutHolderState (line 141) all hardcode 'holder_daily' or 'holder_state' directly in SQL instead of the module-level holderTable/stateTable constants. A future table rename would require searching raw strings rather than updating one constant.

♻️ Suggested refactor (example for getLatestHolders)
  const query = `
    SELECT DISTINCT ON ("configID")
      "configID", "holderCount", "avgPositionUsd", "top10Pct", "medianPositionUsd"
-   FROM holder_daily
+   FROM ${holderTable}
    ORDER BY "configID", timestamp DESC
  `;

Note: ES6 template literal interpolation of module-level constants (not user input) carries no SQL-injection risk here.

Also applies to: 117-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/queries/holder.js` around lines 98 - 99, Replace hardcoded table names
with the module-level constants: update the SQL in getLatestHolders,
getHolderHistory, getAllHolderStates, and getPoolsWithoutHolderState to use the
holderTable and stateTable constants (e.g., via ES6 template literal
interpolation) instead of literal 'holder_daily' or 'holder_state'; ensure you
reference the existing holderTable and stateTable identifiers in the query
strings so future renames only require changing the constants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 94-97: The function currently returns 'skipped' when
Object.keys(updatedMap).length === 0, which prevents advancing pool.lastBlock
and lets the fromBlock/toBlock range grow unbounded; instead, call
upsertHolderState (the same update used in the non-empty case) to persist the
new lastBlock for tokenAddress/pool so the incremental scan advances, then
return 'skipped' — update the branch that checks updatedMap to invoke
upsertHolderState with the computed lastBlock (and any relevant holder counts)
before returning.
- Around line 73-74: The code computes fromBlock = Number(pool.lastBlock) + 1
without validating pool.lastBlock, which can produce NaN and pass the >= check;
update the logic in triggerHoldersDaily.js around the fromBlock/toBlock
calculation (variables: pool.lastBlock, chainBlocks[chain], fromBlock, toBlock)
to defensively handle non-numeric lastBlock — coerce and validate lastBlock
(e.g., use Number.isFinite or isNaN checks), set a sensible default (like 0) or
skip processing for that pool if invalid, and only call scanTransfers when
fromBlock is a valid finite number and fromBlock < toBlock; also add a brief
log/error when lastBlock is invalid to aid debugging.
- Line 35: A malformed pool field causes parsePoolField to throw inside the
pools.map call (which builds parsedPools), aborting the whole daily run; wrap
the call to parsePoolField for each pool in a local try/catch inside the mapping
so a single bad pool is logged and skipped (or returned with a safe fallback)
instead of propagating the exception: change const parsedPools = pools.map((p)
=> ({ ...p, ...parsePoolField(p.pool) })); to map each p through a small
function that tries parsePoolField(p.pool), on error logs the pool id/context
(using the same logger used elsewhere) and returns either null or a fallback
object, then filter out nulls before further processing so the rest of the pools
continue to be processed.
- Around line 183-197: The loop building messages for sendMessageBatch lacks
per-item error handling and uses a fragile Id derived from pool.configID; wrap
the body that calls parsePoolField and validates tokenAddress/chain in a
per-pool try/catch so a single malformed pool won't abort the whole batch, log
and continue on error, and make the SQS message Id unique and
collision-resistant (e.g., append the loop index or a short deterministic suffix
to pool.configID.replace(/-/g, '')) to avoid collisions that would cause
sendMessageBatch to fail; update references in this block (parsePoolField,
messages.push, and pool.configID usage) accordingly.

In `@src/handlers/triggerHoldersWorker.js`:
- Around line 108-118: The snapshot insert and state update are done in the
wrong order: call upsertHolderState(configID, toBlock, balanceMap) before
insertHolder({...}) so the worker advances lastBlock atomically first; this
makes retries idempotent if the snapshot insert later fails—locate the two calls
(insertHolder and upsertHolderState) and swap their order, keeping the same
arguments (configID, toBlock, balanceMap and the metrics/timestamp payload), and
preserve existing error handling so any failure still bubbles up.
- Line 46: processToken currently accepts tvlUsd that can be undefined and is
passed to deriveMetrics causing NaN values to be stored; update processToken
(function processToken) to guard/normalize tvlUsd before calling deriveMetrics
(e.g., default tvlUsd to 0 or explicitly coerce undefined to 0/Number) so
deriveMetrics(balanceMap, tvlUsdNormalized) never receives undefined, or
alternatively add a defensive check in deriveMetrics to handle undefined/NaN
inputs; ensure the chosen fix prevents avgPositionUsd/medianPositionUsd from
being computed from undefined and avoids persisting NaN to the DB.
- Line 34: The log uses err.message which is undefined for non-Error throwables;
update the catch/logging in the processToken flow (where console.log(`Failed to
process: ${err.message}`, err) is called) to safely print the error by checking
for err.message and falling back to a serialized representation (e.g.,
err.message || String(err) || JSON.stringify(err)) or by logging the err object
directly so both Error instances and plain strings/objects are readable; change
the console.log call in the processToken error handler accordingly.

---

Nitpick comments:
In `@src/queries/holder.js`:
- Around line 98-99: Replace hardcoded table names with the module-level
constants: update the SQL in getLatestHolders, getHolderHistory,
getAllHolderStates, and getPoolsWithoutHolderState to use the holderTable and
stateTable constants (e.g., via ES6 template literal interpolation) instead of
literal 'holder_daily' or 'holder_state'; ensure you reference the existing
holderTable and stateTable identifiers in the query strings so future renames
only require changing the constants.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71de47b and 9dda8bf.

📒 Files selected for processing (8)
  • migrations/1771242161000_add-holder-tables.js
  • src/api/controllers/yield.js
  • src/api/routes/yield.js
  • src/handlers/triggerEnrichment.js
  • src/handlers/triggerHoldersDaily.js
  • src/handlers/triggerHoldersWorker.js
  • src/queries/holder.js
  • src/utils/holderScanner.js
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/utils/holderScanner.js
  • migrations/1771242161000_add-holder-tables.js
  • src/api/routes/yield.js

@0xkr3p 0xkr3p force-pushed the feat/holder-stats-integration branch from e35690c to fd0d007 Compare March 6, 2026 12:36
@0xkr3p 0xkr3p force-pushed the feat/holder-stats-integration branch from fd0d007 to 821236e Compare March 9, 2026 12:13
@0xkr3p
Copy link
Contributor Author

0xkr3p commented Mar 12, 2026

@coderabbitai full review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

✅ Actions performed

Full review triggered.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/handlers/triggerHoldersDaily.js (1)

40-56: ⚠️ Potential issue | 🟡 Minor

parsePoolField can throw, crashing the entire daily run on a single malformed pool.

If pool.pool contains malformed data, parsePoolField will throw and abort processing for all pools. Consider wrapping in try/catch to skip malformed pools gracefully.

🛡️ Proposed defensive fix
   for (const pool of pools) {
     if (!pool.token || !isValidEvmAddress(pool.token)) continue;

-    const { chain } = parsePoolField(pool.pool);
-    if (!chain) continue;
+    let chain;
+    try {
+      ({ chain } = parsePoolField(pool.pool));
+    } catch {
+      console.log(`Skipping pool ${pool.configID}: failed to parse pool field`);
+      continue;
+    }
+    if (!chain) continue;

     const chainId = resolveChainId(chain);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 40 - 56, The loop currently
calls parsePoolField(pool.pool) which can throw and abort the whole run; wrap
the parsePoolField call in a try/catch inside the for (const pool of pools) loop
(the block that builds tasks) so malformed pool.pool values are skipped instead
of crashing; on catch, log a warning (include pool.configID and the error) and
continue to the next pool, and only proceed to resolveChainId and push into
tasks when parsePoolField succeeds.
🧹 Nitpick comments (8)
src/queries/holder.js (1)

80-90: Consider adding a timestamp bound to getHolderHistory.

Unlike getLatestHolders (30-day bound) and getEligiblePools (7-day bound), this query returns all historical data for a pool without limit. For long-running pools, this could return large result sets.

Consider adding pagination or a default time window, though this may be acceptable if the data volume is expected to remain manageable.

💡 Optional: Add a default time bound
 const getHolderHistory = async (configID, conn) => {
   conn = conn || (await connect());

   const query = `
     SELECT timestamp, "holderCount", "avgPositionUsd", "top10Pct"
     FROM holder_daily
     WHERE "configID" = $<configID>
+      AND timestamp >= NOW() - INTERVAL '1 year'
     ORDER BY timestamp ASC
   `;
   return conn.query(query, { configID });
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/queries/holder.js` around lines 80 - 90, getHolderHistory currently
returns all rows for a configID which can produce large result sets; update the
function signature (getHolderHistory) to accept optional time-bounds (e.g.,
startTimestamp and/or daysBack) or pagination params, add corresponding SQL
WHERE clauses (e.g., "AND timestamp >= $<startTimestamp>" or limit/offset) with
sensible defaults (for example default to last 30 days) and update the
conn.query parameter object accordingly; ensure callers of getHolderHistory are
updated to pass bounds or rely on defaults and add tests or notes where
applicable.
src/handlers/triggerHoldersDaily.js (1)

319-370: Consider using buildResult helper in processPool for consistency.

processPool duplicates the result-building logic that's centralized in buildResult. Using the shared helper would reduce duplication and ensure consistent formatting.

♻️ Refactored processPool using buildResult
 async function processPool(task, totalSupplyMap, today) {
   const { configID, chain, chainId, tokenAddress, tvlUsd } = task;

   const data = await fetchHolders(chainId, tokenAddress, 10, false);
   const holderCount = data.total_holders;

   if (holderCount == null) return null;

-  if (holderCount === 0) {
-    return {
-      configID,
-      timestamp: today.toISOString(),
-      holderCount: 0,
-      avgPositionUsd: null,
-      top10Pct: null,
-      top10Holders: null,
-    };
-  }
-
-  const avgPositionUsd = holderCount > 0 ? tvlUsd / holderCount : null;
-
-  let top10Pct = null;
-  let top10Holders = null;
-  const supplyKey = `${tokenAddress.toLowerCase()}-${chain}`;
-  const totalSupply = totalSupplyMap[supplyKey];
-
-  if (data.deltas && data.deltas.length > 0 && totalSupply && totalSupply > 0n) {
-    const top10Balance = data.deltas.reduce(
-      (sum, d) => sum + BigInt(d.balance || d.amount || 0),
-      0n
-    );
-    top10Pct = Number((top10Balance * 10000n) / totalSupply) / 100;
-
-    top10Holders = data.deltas.map((d) => ({
-      address: d.address || d.owner,
-      balance: String(d.balance || d.amount || 0),
-      balancePct:
-        totalSupply > 0n
-          ? Number((BigInt(d.balance || d.amount || 0) * 10000n) / totalSupply) / 100
-          : 0,
-    }));
-  }
-
-  return {
-    configID,
-    timestamp: today.toISOString(),
-    holderCount,
-    avgPositionUsd,
-    top10Pct,
-    top10Holders,
-  };
+  // Convert API response to holder format for buildResult
+  const holders = (data.deltas || [])
+    .map((d) => ({
+      address: d.address || d.owner,
+      balance: BigInt(d.balance || d.amount || 0),
+    }))
+    .filter((h) => h.balance > 0n)
+    .sort((a, b) => (b.balance > a.balance ? 1 : -1));
+
+  return buildResult(holders, holderCount, configID, tvlUsd, totalSupplyMap, chain, tokenAddress, today);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerHoldersDaily.js` around lines 319 - 370, Replace the
manual result construction in processPool with a call to the shared buildResult
helper: keep the existing logic that fetches holders (fetchHolders), computes
holderCount, avgPositionUsd, top10Pct and top10Holders (using totalSupplyMap and
the same BigInt math), then pass those computed values into buildResult (along
with configID, today/toISOString(), and tvlUsd as needed) instead of returning
the assembled object directly; ensure you import/require or reference
buildResult and preserve the exact property names (configID, timestamp,
holderCount, avgPositionUsd, top10Pct, top10Holders) so formatting remains
consistent with other callers.
src/utils/holderApi.js (6)

286-288: Address validation doesn't check hex characters.

isValidEvmAddress only validates prefix and length. An address like 0xGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG would pass but fail on-chain calls. Consider a regex check if stricter validation is desired.

Stricter validation
 function isValidEvmAddress(address) {
-  return address.startsWith('0x') && address.length === 42;
+  return /^0x[a-fA-F0-9]{40}$/.test(address);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderApi.js` around lines 286 - 288, isValidEvmAddress currently
only checks prefix and length; update the function (isValidEvmAddress) to also
validate that the 40 characters after "0x" are valid hexadecimal characters
(0-9, a-f, A-F) using a regex test (e.g. match /^0x[0-9a-fA-F]{40}$/) so inputs
like "0xGGG..." are rejected; keep the same function signature and return a
boolean.

175-182: getCurrentBlock lacks error handling and has inconsistent chain aliasing.

  1. No try/catch - if the API call fails, the error propagates and could crash the handler.
  2. The avalanche → avax alias on line 177 is separate from CHAIN_ALIASES (line 236), creating two places to maintain chain mappings.
Proposed fix
 async function getCurrentBlock(chain) {
-  const timestamp = Math.floor(Date.now() / 1000);
-  const chainKey = chain === 'avalanche' ? 'avax' : chain;
-  const { data } = await axios.get(
-    `https://coins.llama.fi/block/${chainKey}/${timestamp}`
-  );
-  return data.height;
+  try {
+    const timestamp = Math.floor(Date.now() / 1000);
+    const chainKey = chain === 'avalanche' ? 'avax' : chain;
+    const { data } = await axios.get(
+      `https://coins.llama.fi/block/${chainKey}/${timestamp}`,
+      { timeout: 10000 }
+    );
+    return data.height;
+  } catch (err) {
+    console.error(`Failed to get current block for ${chain}:`, err.message);
+    return null;
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderApi.js` around lines 175 - 182, getCurrentBlock currently
lacks error handling and duplicates the avalanche->avax alias; update it to use
the centralized CHAIN_ALIASES map (fall back to the original chain if not
present) instead of hardcoding 'avalanche', wrap the axios request in a
try/catch, log the error (or processLogger.error) when the call fails, and
return a safe value (e.g., null) on failure so callers don't crash; reference
the getCurrentBlock function and CHAIN_ALIASES constant when making these
changes.

163-170: Fallback assumptions when data is unavailable.

When ankrCount is null/0 or pelucheCount is 0, the function returns 'needs_rebase' (line 163-164). When any error occurs, it returns 'standard' (line 169). These opposing defaults could cause inconsistent behavior:

  • Missing ANKR data → 'needs_rebase'
  • Network error during comparison → 'standard'

If the intent is to be conservative, consider making both fallbacks consistent, or return a distinct value like 'unknown' to let the caller decide.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderApi.js` around lines 163 - 170, The current logic returns
'needs_rebase' when ankrCount is null/0 or pelucheCount is 0 but returns
'standard' from the catch block, causing inconsistent fallbacks; update the
function handling ankrCount and pelucheCount so the catch returns the same
conservative default (e.g., also 'needs_rebase') or return a distinct sentinel
like 'unknown' in both the ankrCount/pelucheCount checks and the catch; modify
the return in the error handler and the early-null checks (variables: ankrCount,
pelucheCount, and the catch block surrounding the diff calculation) to use the
chosen consistent fallback so callers get predictable behavior.

56-68: Empty catch blocks reduce observability.

Silent failures in classification loops make it difficult to diagnose issues when tokens are unexpectedly classified as 'unknown'. Consider logging errors at debug level.

Proposed improvement
   for (const abi of SHARE_BASED_ABIS) {
     try {
       const { output } = await sdk.api.abi.multiCall({ abi, calls: call, chain, permitFailure: true });
       if (output[0]?.output && output[0].output !== ZERO) return 'share_based';
-    } catch {}
+    } catch (err) {
+      // ABI not supported by this token - expected for most tokens
+    }
   }

   for (const abi of REBASE_ABIS) {
     try {
       const { output } = await sdk.api.abi.multiCall({ abi, calls: call, chain, permitFailure: true });
       if (output[0]?.output) return 'true_rebase';
-    } catch {}
+    } catch (err) {
+      // ABI not supported by this token - expected for most tokens
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderApi.js` around lines 56 - 68, The empty catch blocks in the
classification loops around SHARE_BASED_ABIS and REBASE_ABIS swallow runtime
errors from sdk.api.abi.multiCall, hindering observability; update the catch
handlers to log the caught error at debug level (including the abi identifier,
chain, and the call/target details) so you can trace failures when a token falls
through to 'unknown'—specifically modify the try/catch around
sdk.api.abi.multiCall in both loops to call the project's debug logger with a
clear message and the error and context (abi, chain, call) rather than leaving
the catch empty.

258-267: Consider exponential backoff for retries.

The current retry logic uses a fixed 2-second delay and only attempts one retry. For a daily batch job this may be sufficient, but for production resilience, exponential backoff would be more robust, especially for 429 (rate limit) responses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderApi.js` around lines 258 - 267, The current fetch in the
try/catch around axios.get (using url, headers and RETRYABLE_CODES) only does
one fixed 2s retry; replace that with a retry loop (e.g., maxRetries = 3-5) that
on retryable status codes (including 429) waits with exponential backoff and
jitter (baseDelay * 2^attempt ± small random ms) before re-calling axios.get,
and propagate the final error if all attempts fail; ensure you reference
RETRYABLE_CODES, the axios.get call, and preserve the existing timeout/headers
when retrying.

24-27: saveHolderCache lacks error handling.

If writeToS3 fails (network issue, permissions, bucket misconfiguration), this will throw an unhandled exception. Since caching is non-critical, consider wrapping this in try/catch to prevent cache failures from crashing the pipeline.

Proposed fix
 async function saveHolderCache(tokenAddress, chain, data) {
   const key = `${CACHE_PREFIX}/${tokenAddress.toLowerCase()}-${chain}.json`;
-  await writeToS3(CACHE_BUCKET, key, data);
+  try {
+    await writeToS3(CACHE_BUCKET, key, data);
+  } catch (err) {
+    console.error(`Failed to save holder cache for ${tokenAddress} on ${chain}:`, err.message);
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/holderApi.js` around lines 24 - 27, The saveHolderCache function
currently calls writeToS3 directly and can throw unhandled exceptions; wrap the
writeToS3 call in a try/catch inside saveHolderCache (referencing
saveHolderCache, writeToS3, CACHE_BUCKET, CACHE_PREFIX) to swallow or handle
errors for non-critical caching—log the error (using the module logger or
console.error) including context (tokenAddress, chain, key) and do not rethrow
so cache failures won't crash the pipeline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/handlers/triggerEnrichment.js`:
- Line 14: The welfordUpdate function in src/utils/welford.js currently assigns
variables (d, count, meanAPY, mean2APY, meanDR, mean2DR, productDR, deltaAPY,
delta2APY, deltaDR, delta2DR) without declarations which creates implicit
globals; update welfordUpdate to declare each of these with let/const as
appropriate within the function scope (e.g., let count = ..., const d = ..., let
deltaAPY = ..., etc.), initialize them before use, and ensure any variables that
are mutated use let while constants use const so the function is safe under
strict mode/ESM and when called from triggerEnrichment.js via welfordUpdate.

In `@src/handlers/triggerHoldersDaily.js`:
- Around line 108-114: The loop over cacheResults is accessing r.value.task even
when r.status === 'rejected', causing a TypeError; update the loop so you only
access r.value (and set tokenType) when r.status === 'fulfilled' (and r.value is
truthy), and for rejected entries handle them separately (e.g., skip or log
r.reason) instead of touching r.value.task; reference the cacheResults iteration
in triggerHoldersDaily.js and ensure the checks around r.status and r.value
guard all accesses to r.value.task.

In `@src/utils/holderApi.js`:
- Around line 93-96: The batch classification omitted the Lido-style ABI so
tokens with getTotalPooledEther get misclassified; update the interfaces array
used by classifyTokensBatch to include both true-rebase ABIs (add an entry for
'getTotalPooledEther' with type 'true_rebase') so it matches classifyToken's
behavior; look for the interfaces constant used by classifyTokensBatch in
src/utils/holderApi.js and add the getTotalPooledEther entry alongside
'rebasingCreditsPerToken'.

---

Duplicate comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 40-56: The loop currently calls parsePoolField(pool.pool) which
can throw and abort the whole run; wrap the parsePoolField call in a try/catch
inside the for (const pool of pools) loop (the block that builds tasks) so
malformed pool.pool values are skipped instead of crashing; on catch, log a
warning (include pool.configID and the error) and continue to the next pool, and
only proceed to resolveChainId and push into tasks when parsePoolField succeeds.

---

Nitpick comments:
In `@src/handlers/triggerHoldersDaily.js`:
- Around line 319-370: Replace the manual result construction in processPool
with a call to the shared buildResult helper: keep the existing logic that
fetches holders (fetchHolders), computes holderCount, avgPositionUsd, top10Pct
and top10Holders (using totalSupplyMap and the same BigInt math), then pass
those computed values into buildResult (along with configID,
today/toISOString(), and tvlUsd as needed) instead of returning the assembled
object directly; ensure you import/require or reference buildResult and preserve
the exact property names (configID, timestamp, holderCount, avgPositionUsd,
top10Pct, top10Holders) so formatting remains consistent with other callers.

In `@src/queries/holder.js`:
- Around line 80-90: getHolderHistory currently returns all rows for a configID
which can produce large result sets; update the function signature
(getHolderHistory) to accept optional time-bounds (e.g., startTimestamp and/or
daysBack) or pagination params, add corresponding SQL WHERE clauses (e.g., "AND
timestamp >= $<startTimestamp>" or limit/offset) with sensible defaults (for
example default to last 30 days) and update the conn.query parameter object
accordingly; ensure callers of getHolderHistory are updated to pass bounds or
rely on defaults and add tests or notes where applicable.

In `@src/utils/holderApi.js`:
- Around line 286-288: isValidEvmAddress currently only checks prefix and
length; update the function (isValidEvmAddress) to also validate that the 40
characters after "0x" are valid hexadecimal characters (0-9, a-f, A-F) using a
regex test (e.g. match /^0x[0-9a-fA-F]{40}$/) so inputs like "0xGGG..." are
rejected; keep the same function signature and return a boolean.
- Around line 175-182: getCurrentBlock currently lacks error handling and
duplicates the avalanche->avax alias; update it to use the centralized
CHAIN_ALIASES map (fall back to the original chain if not present) instead of
hardcoding 'avalanche', wrap the axios request in a try/catch, log the error (or
processLogger.error) when the call fails, and return a safe value (e.g., null)
on failure so callers don't crash; reference the getCurrentBlock function and
CHAIN_ALIASES constant when making these changes.
- Around line 163-170: The current logic returns 'needs_rebase' when ankrCount
is null/0 or pelucheCount is 0 but returns 'standard' from the catch block,
causing inconsistent fallbacks; update the function handling ankrCount and
pelucheCount so the catch returns the same conservative default (e.g., also
'needs_rebase') or return a distinct sentinel like 'unknown' in both the
ankrCount/pelucheCount checks and the catch; modify the return in the error
handler and the early-null checks (variables: ankrCount, pelucheCount, and the
catch block surrounding the diff calculation) to use the chosen consistent
fallback so callers get predictable behavior.
- Around line 56-68: The empty catch blocks in the classification loops around
SHARE_BASED_ABIS and REBASE_ABIS swallow runtime errors from
sdk.api.abi.multiCall, hindering observability; update the catch handlers to log
the caught error at debug level (including the abi identifier, chain, and the
call/target details) so you can trace failures when a token falls through to
'unknown'—specifically modify the try/catch around sdk.api.abi.multiCall in both
loops to call the project's debug logger with a clear message and the error and
context (abi, chain, call) rather than leaving the catch empty.
- Around line 258-267: The current fetch in the try/catch around axios.get
(using url, headers and RETRYABLE_CODES) only does one fixed 2s retry; replace
that with a retry loop (e.g., maxRetries = 3-5) that on retryable status codes
(including 429) waits with exponential backoff and jitter (baseDelay * 2^attempt
± small random ms) before re-calling axios.get, and propagate the final error if
all attempts fail; ensure you reference RETRYABLE_CODES, the axios.get call, and
preserve the existing timeout/headers when retrying.
- Around line 24-27: The saveHolderCache function currently calls writeToS3
directly and can throw unhandled exceptions; wrap the writeToS3 call in a
try/catch inside saveHolderCache (referencing saveHolderCache, writeToS3,
CACHE_BUCKET, CACHE_PREFIX) to swallow or handle errors for non-critical
caching—log the error (using the module logger or console.error) including
context (tokenAddress, chain, key) and do not rethrow so cache failures won't
crash the pipeline.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b230ce2f-8c23-42a3-998b-07391cc2c2e0

📥 Commits

Reviewing files that changed from the base of the PR and between 7eb5949 and 8496c1f.

📒 Files selected for processing (11)
  • .gitignore
  • env.js
  • migrations/1771242161000_add-holder-tables.js
  • serverless.yml
  • src/api/controllers/yield.js
  • src/api/routes/yield.js
  • src/handlers/triggerEnrichment.js
  • src/handlers/triggerHoldersDaily.js
  • src/queries/holder.js
  • src/utils/holderApi.js
  • src/utils/testEnrichmentMerge.js
💤 Files with no reviewable changes (1)
  • .gitignore

} = require('../queries/yield');
const { getStat } = require('../queries/stat');

const { welfordUpdate } = require('../utils/welford');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
file=$(fd '^welford\.js$' src | head -n1)
nl -ba "$file" | sed -n '1,80p'
echo
rg -n '^\s*(d|count|meanAPY|mean2APY|meanDR|mean2DR|productDR|deltaAPY|delta2APY|deltaDR|delta2DR)\s*=' "$file"

Repository: DefiLlama/yield-server

Length of output: 699


🏁 Script executed:

#!/bin/bash
# Read the full welford.js file
echo "=== Full src/utils/welford.js ==="
cat -n src/utils/welford.js

echo
echo "=== Check if file has strict mode ==="
head -5 src/utils/welford.js | grep -i "strict\|use"

Repository: DefiLlama/yield-server

Length of output: 2504


🏁 Script executed:

#!/bin/bash
echo "=== Check line 14 in src/handlers/triggerEnrichment.js ==="
sed -n '10,20p' src/handlers/triggerEnrichment.js | cat -n

echo
echo "=== Check around line 135 in src/handlers/triggerEnrichment.js ==="
sed -n '130,140p' src/handlers/triggerEnrichment.js | cat -n

echo
echo "=== Search for welfordUpdate calls in the file ==="
grep -n "welfordUpdate" src/handlers/triggerEnrichment.js

Repository: DefiLlama/yield-server

Length of output: 1217


Declare all variables in welfordUpdate before using them.

src/utils/welford.js assigns d, count, meanAPY, mean2APY, meanDR, mean2DR, productDR, deltaAPY, delta2APY, deltaDR, and delta2DR without const/let declarations. In non-strict CommonJS, these become implicit globals; under strict mode or ESM, they cause a ReferenceError. Since the import on Line 14 now adds this to the enrichment hot path, add declarations for these variables.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers/triggerEnrichment.js` at line 14, The welfordUpdate function in
src/utils/welford.js currently assigns variables (d, count, meanAPY, mean2APY,
meanDR, mean2DR, productDR, deltaAPY, delta2APY, deltaDR, delta2DR) without
declarations which creates implicit globals; update welfordUpdate to declare
each of these with let/const as appropriate within the function scope (e.g., let
count = ..., const d = ..., let deltaAPY = ..., etc.), initialize them before
use, and ensure any variables that are mutated use let while constants use const
so the function is safe under strict mode/ESM and when called from
triggerEnrichment.js via welfordUpdate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants