
AI reviews summary feature#1219

Open
phongo1 wants to merge 17 commits into dev from AI-reviews-summary

Conversation


@phongo1 phongo1 commented Feb 1, 2026

This PR adds an AI-powered review summary feature to the course/instructor page.

LLM summaries are generated manually by running generate_ai_summaries with the appropriate flags. This lets us control token usage and keep the budget for this feature extremely low.

ENV vars required:

# AI
OPENROUTER_API_KEY={key}
OPENROUTER_MODEL=meta-llama/llama-3.3-70b-instruct:free
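For reference, a minimal sketch of how these variables might be read into settings (the helper name and defaults are assumptions for illustration, not the PR's actual base.py code):

```python
import os

# Hypothetical helper showing env-backed OpenRouter settings; the real
# base.py may read os.environ directly or use a library like django-environ.
def load_openrouter_settings(env=os.environ):
    return {
        "OPENROUTER_API_KEY": env.get("OPENROUTER_API_KEY", ""),
        "OPENROUTER_MODEL": env.get(
            "OPENROUTER_MODEL", "meta-llama/llama-3.3-70b-instruct:free"
        ),
    }
```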

What I did

  • Added a table for storing generated summaries
  • Wrote helper functions to make the LLM call to generate summary (with system prompt)
  • Added a management command: tcf_website/management/commands/generate_ai_summaries.py
    • This command can query the top n most popular (most-reviewed) courses, make the LLM call, and generate summaries for those course/instructor pages; it can also generate a summary for a single course/instructor page given a courseId and instructorId
    • Documentation for this is in the comment block at the top of the file
  • Added a script in the /scripts directory to list all generated AI summaries that we have

Screenshots

Note: the UI is still being iterated on.

image

Running the list_ai_summaries.sh script
^ this lists all generated summaries that we have
image


Running docker compose exec web python manage.py generate_ai_summaries --limit 5 --min-reviews 10 --missing-only
^ this generates summaries for 5 course/instructor pairs that have at least 10 reviews and don't yet have a generated summary
image


Summary by CodeRabbit

  • New Features

    • AI-generated review summaries shown on course–instructor pages with a Beta badge and source-review counts.
    • New CLI command to generate summaries for single pairs or batches with filtering, limits, and dry-run.
  • UI

    • Added summary box styling and template blocks to display AI summaries.
  • Data

    • Persistent summary records to store generated summaries per course/instructor or per club.
  • Chores

    • Admin registration, new settings for the AI service, and a utility script to list generated summaries.


coderabbitai bot commented Feb 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.
📝 Walkthrough

Adds OpenRouter-backed AI review summarization: new settings, ReviewLLMSummary model + migration, an OpenRouter-based review_summary service and management command to generate summaries, template/CSS and view wiring to display stored summaries, and admin/model export updates.

Changes

Cohort / File(s) Summary
Settings
tcf_core/settings/base.py
Added environment-backed OpenRouter settings: OPENROUTER_API_KEY and OPENROUTER_MODEL under an "AI review summary configuration" header.
Model & migration
tcf_website/models/models.py, tcf_website/migrations/0024_reviewllmsummary.py
Added ReviewLLMSummary model and migration: fields for summary text, model_id, source counts/last id, JSON metadata, timestamps, FKs to course, instructor, optional club, plus two partial unique constraints (per course+instructor when club is null; unique per club when present).
Model export & admin
tcf_website/models/__init__.py, tcf_website/admin.py
Exported and registered ReviewLLMSummary.
Summarization service
tcf_website/services/review_summary.py
New service: constants (SUMMARY_THRESHOLD, MAX_REVIEWS_FOR_PROMPT, MAX_REVIEW_CHARS), prompt builder, OpenRouter HTTP client (_call_openrouter), and generate_review_summary returning (summary, error, model_used) with API key and response error handling.
Management command
tcf_website/management/commands/generate_ai_summaries.py
New Django command to generate/update summaries (args: --limit, --course-id, --instructor-id, --min-reviews, --max-reviews, --missing-only, --dry-run); validates API key and arg dependencies; creates/updates ReviewLLMSummary.
Views / Templates / CSS
tcf_website/views/browse.py, tcf_website/templates/course/course_professor.html, tcf_website/static/course/course_professor.css
Views fetch ReviewLLMSummary and add review_summary_text/review_summary_source_count to context; template conditionally renders an "AI Review Summary" block (inserted in two places); CSS adds .ai-summary-box and badge styling.
Utilities / Scripts
scripts/list_ai_summaries.sh
Added CLI script to list stored summaries (no club) with related course/instructor info.
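As a rough illustration of the (summary, error, model_used) contract the walkthrough describes, a caller can branch on the tuple like this (the stub below is a hypothetical stand-in, not the actual service):

```python
# Hypothetical stand-in for generate_review_summary, showing only the
# (summary, error, model_used) return contract described in the walkthrough.
def generate_review_summary_stub(api_key, reviews):
    if not api_key:
        return None, "OPENROUTER_API_KEY is not set", None
    if not reviews:
        return None, "no reviews to summarize", None
    # The real service would build a prompt and call OpenRouter here.
    return ("Students found the course engaging.", None,
            "meta-llama/llama-3.3-70b-instruct:free")

def handle_result(result):
    summary, error, model_used = result
    if error:
        return f"ERROR: {error}"
    return f"{summary} (model: {model_used})"
```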

Sequence Diagram

sequenceDiagram
    participant CLI as Management Command
    participant Service as Review Summary Service
    participant OpenRouter as OpenRouter API
    participant DB as Database
    participant View as Browse View
    participant Template as Template Renderer

    CLI->>DB: Query candidate reviews / pairs
    CLI->>Service: generate_review_summary(course,instructor,reviews_qs,max_reviews)
    Service->>Service: validate API key & build prompt
    Service->>OpenRouter: POST /chat/completions (model, messages)
    alt OpenRouter returns summary
        OpenRouter-->>Service: summary text
    else OpenRouter returns error
        OpenRouter-->>Service: error
    end
    Service-->>CLI: (summary, error, model_used)
    CLI->>DB: Create or update ReviewLLMSummary
    DB-->>CLI: saved

    View->>DB: Query ReviewLLMSummary for course/instructor
    DB-->>View: summary record or none
    View->>Template: Render context with review_summary_text and review_summary_source_count

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I nibbled through reviews, one by one,
Turned noisy bits into summaries—tidy and fun.
A boxed-up insight where feedback now gleams,
Carrots for courses, stitched up from dreams.
Signed, the rabbit who loves clever schemes.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title 'AI reviews summary feature' directly and clearly describes the main objective of the PR, adding an AI-powered summaries feature for course reviews.
  • Description check (✅ Passed): the description covers most required template sections: GitHub Issues (implied closure), What I did (comprehensive), Screenshots (included), and Testing (addressed via management command documentation). Minor improvements could be made, but the content is substantial and well-structured.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@tcf_website/management/commands/generate_ai_summaries.py`:
- Around line 36-37: Add a one-line docstring to the Command class to satisfy
pylint C0115 and match other management commands; update the class definition
for Command (subclassing BaseCommand) by inserting a short descriptive docstring
(e.g., "Management command to generate AI summaries for course/instructor
pairs.") immediately after the class header so the class has a proper docstring.
- Around line 1-23: The module docstring in generate_ai_summaries.py has lines
exceeding the line-length limit; wrap long lines in the module docstring (the
top-of-file string describing usage and examples for the generate_ai_summaries
management command) so no line exceeds the project's limit (e.g., 100 chars);
preserve all content and examples, breaking long sentences and example command
lines across multiple shorter lines while keeping the same wording and
formatting.
- Around line 63-66: The CLI flag --max-reviews in generate_ai_summaries.py
currently defaults to 25 but the service clamps to MAX_REVIEWS_FOR_PROMPT (20),
causing mismatch; update the argparse.add_argument call that defines
"--max-reviews" to set default=20 to match MAX_REVIEWS_FOR_PROMPT and update the
help text if necessary to reflect the 20-review cap; reference the argument
registration for "--max-reviews" and the MAX_REVIEWS_FOR_PROMPT constant to
ensure both are aligned.
- Around line 102-125: The f-string on the self.stdout.write call is too long;
break it into multiple shorter strings or use implicit string concatenation
inside parentheses to keep each line under the 100-char limit. Locate the call
that uses self.stdout.write(self.style.WARNING(f"Skipping course {course_id} /
instructor {instructor_id}: only {review_count} reviews.")) and split the
message across lines (e.g., start with f"Skipping course {course_id} /
instructor {instructor_id}:" on one line and f" only {review_count} reviews." on
the next), preserving the f-string interpolation and the surrounding
self.stdout.write(self.style.WARNING(...)) structure.

In `@tcf_website/views/browse.py`:
- Around line 400-426: The view computes visible_reviews and
current_review_count but never uses current_review_count and only sets
summary_state to "missing" or "available", causing template-state mismatches;
update the logic around ReviewLLMSummary/summary_record and summary_state so it
uses current_review_count to set "no_reviews" when there are zero
visible_reviews, keep "available" when summary_record and summary_text exist,
set "missing" when no summary_record exists, and set "fallback" when a
summary_record exists but summary_text is empty (or "error" if you can derive an
error flag from summary_record); reference ReviewLLMSummary, summary_record,
summary_text, current_review_count, visible_reviews, and summary_state when
making these changes.
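A pure-function sketch of the state logic this comment asks for (the state names come from the template; representing the record as a dict is an assumption for illustration):

```python
def resolve_summary_state(current_review_count, summary_record):
    """Map the visible review count and stored summary to a template state.

    summary_record is a dict with a 'summary_text' key, or None.
    States mirror those the template checks: no_reviews, missing,
    available, fallback.
    """
    if current_review_count == 0:
        return "no_reviews"
    if summary_record is None:
        return "missing"
    text = (summary_record.get("summary_text") or "").strip()
    return "available" if text else "fallback"
```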
🧹 Nitpick comments (5)
tcf_website/models/models.py (1)

1421-1450: Clarify nullable fields for club summaries.

With club optional but course/instructor required, you can’t persist club-only summaries, and __str__ would fail if those ever become null. If club summaries are intended, make course/instructor nullable and add a constraint to enforce either club or course+instructor (not both). Otherwise, consider removing club to avoid dead fields.

♻️ Example adjustment (will require a migration)
-    course = models.ForeignKey(Course, on_delete=models.CASCADE)
-    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE)
+    course = models.ForeignKey(Course, on_delete=models.CASCADE, null=True, blank=True)
+    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE, null=True, blank=True)
@@
     class Meta:
         constraints = [
+            models.CheckConstraint(
+                check=(
+                    models.Q(club__isnull=False, course__isnull=True, instructor__isnull=True)
+                    | models.Q(club__isnull=True, course__isnull=False, instructor__isnull=False)
+                ),
+                name="summary_requires_club_or_course_instructor",
+            ),
             models.UniqueConstraint(
                 fields=["course", "instructor"],
                 name="unique_summary_per_course_instructor",
                 condition=models.Q(club__isnull=True),
             ),
tcf_website/services/review_summary.py (1)

117-125: Narrow exception handling to request/parsing errors.

Catching Exception can mask programmer errors; prefer requests.RequestException and JSON parsing errors so unexpected bugs surface.

♻️ Suggested change
-        except Exception as exc:  # pylint: disable=broad-except
-            last_error = str(exc)
-            continue
+        except (requests.RequestException, ValueError) as exc:
+            last_error = str(exc)
+            continue
tcf_website/management/commands/generate_ai_summaries.py (1)

74-83: Reduce locals in handle and silence unused args.

CI flags ARG002 and R0914. Consider renaming args to _args and grouping options into a small dict/dataclass to cut locals.

🔧 Minimal lint fix for unused args
-    def handle(self, *args: Any, **options: Any):
+    def handle(self, *_args: Any, **options: Any):
tcf_website/templates/course/course_professor.html (1)

372-387: Conditional logic may not handle "fallback" state correctly.

The current condition ordering has an issue: if review_summary_text is truthy, line 372 will always match first, regardless of review_summary_state. This means:

  1. The "fallback" branch (lines 379-381) will only be reached when review_summary_text is falsy, making {{ review_summary_text|default:"" }} display nothing useful.
  2. Users won't see the warning message about showing a cached summary during errors.

If the "fallback" state should show both the text and the warning, consider checking state first:

♻️ Suggested fix to prioritize state checks
-            {% if review_summary_text %}
+            {% if review_summary_state == "fallback" and review_summary_text %}
+                <p class="mb-2">{{ review_summary_text|linebreaksbr }}</p>
+                <p class="text-warning small mb-0">Showing last saved summary. Error: {{ review_summary_error }}</p>
+            {% elif review_summary_text %}
                 <p class="mb-2">{{ review_summary_text|linebreaksbr }}</p>
                 <p class="text-primary small mb-0">
                     AI generated based on {{ review_summary_source_count }} review{% if review_summary_source_count != 1 %}s{% endif %}.
                 </p>
             {% elif review_summary_state == "no_reviews" %}
                 <p class="text-muted mb-0">No written reviews yet. Add yours to unlock a summary.</p>
-            {% elif review_summary_state == "fallback" %}
-                <p class="mb-2">{{ review_summary_text|default:"" }}</p>
-                <p class="text-warning small mb-0">Showing last saved summary. Error: {{ review_summary_error }}</p>
             {% elif review_summary_state == "error" %}

Note: The view (browse.py) currently only sets states "missing" and "available"—see related comment there about implementing the missing state logic.

tcf_website/views/browse.py (1)

401-411: Query is more expensive than necessary for just a count.

The query includes .order_by("-created") which is unnecessary when only calling .count(). If you implement the state logic suggested above, simplify the query:

♻️ Suggested optimization
     # AI summary block
-    visible_reviews = (
-        Review.objects.filter(
-            instructor=instructor_id,
-            course=course_id,
-            toxicity_rating__lt=settings.TOXICITY_THRESHOLD,
-            hidden=False,
-        )
-        .exclude(text="")
-        .order_by("-created")
-    )
-    current_review_count = visible_reviews.count()
+    current_review_count = (
+        Review.objects.filter(
+            instructor=instructor_id,
+            course=course_id,
+            toxicity_rating__lt=settings.TOXICITY_THRESHOLD,
+            hidden=False,
+        )
+        .exclude(text="")
+        .count()
+    )

@artiehumphreys
Member

Be sure to make the changes we discussed then I'll give a more in-depth review.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tcf_website/services/review_summary.py`:
- Around line 102-107: Currently you slice reviews_qs before removing empty-text
entries which can leave too few reviews; change the logic so you first
materialize a larger set (e.g., fetch up to MAX_REVIEWS_FOR_PROMPT or a higher
safe limit from reviews_qs), filter out entries with empty r.text using the
existing reviews = [r for r in reviews if (r.text or "").strip()] and only then
apply the cap using max_reviews = max(1, min(max_reviews or
MAX_REVIEWS_FOR_PROMPT, MAX_REVIEWS_FOR_PROMPT)) and slicing
(reviews[:max_reviews]) so the final list contains up to max_reviews non-empty
reviews; update code around reviews_qs, reviews, and MAX_REVIEWS_FOR_PROMPT
accordingly.
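In pure-Python terms, the fix is to drop empty-text reviews before applying the cap; a minimal sketch with review objects simplified to strings:

```python
MAX_REVIEWS_FOR_PROMPT = 20  # mirrors the service constant

def select_reviews_for_prompt(texts, max_reviews=None):
    """Filter out empty reviews first, then cap the list."""
    cap = max(1, min(max_reviews or MAX_REVIEWS_FOR_PROMPT, MAX_REVIEWS_FOR_PROMPT))
    non_empty = [t for t in texts if (t or "").strip()]
    return non_empty[:cap]
```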

@artiehumphreys
Member

Code looks great! Just take a look at the few comments I gave you earlier. Why is the token limit set to 220? I think that visually it looks good and is almost ready to merge.


phongo1 commented Feb 6, 2026

> Code looks great! Just take a look at the few comments I gave you earlier. Why is the token limit set to 220? I think that visually it looks good and is almost ready to merge.

Raised to 280 max tokens. That should be enough for 4-6 sentence summaries without truncation.

@phongo1 force-pushed the AI-reviews-summary branch from 96c012d to c70166c on February 6, 2026 at 23:47
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@tcf_website/migrations/0024_reviewllmsummary.py`:
- Around line 18-26: The migration
tcf_website/migrations/0024_reviewllmsummary.py creates its primary key using
BigAutoField which conflicts with the project's DEFAULT_AUTO_FIELD =
"django.db.models.AutoField" in tcf_core/settings/base.py; to fix, regenerate
this migration with the project's settings active (so Django will use AutoField)
or edit the migration to replace BigAutoField with models.AutoField for the "id"
field, ensuring the migration class/name remains unchanged and that migrations
are recompiled/tested after the change.

In `@tcf_website/services/review_summary.py`:
- Around line 52-90: The code in _call_openrouter currently calls
response.raise_for_status(), which raises an HTTPError and prevents parsing the
JSON error body; change this to inspect response.status_code and only parse JSON
on non-2xx responses to extract structured error details (use response.json() in
a try/except to handle invalid JSON), then if status is not in the 200 range
return (None, extracted_error_message_or_fallback) instead of raising; keep the
existing successful-path parsing of data.get("choices") and return (content,
None) when present, and ensure you still handle missing JSON by returning a
sensible fallback error message from the response text or a generic message.
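The error-extraction half of this suggestion can be sketched as a pure function over the status code and body text (the {"error": {"message": ...}} shape is an assumption about OpenRouter's error JSON):

```python
import json

def extract_openrouter_error(status_code, body_text):
    """Return None on 2xx, else a best-effort error message.

    Tries the JSON body first (assumed shape: {"error": {"message": ...}}),
    falling back to the raw text or a generic message.
    """
    if 200 <= status_code < 300:
        return None
    try:
        data = json.loads(body_text)
        msg = (data.get("error") or {}).get("message")
        if msg:
            return msg
    except (ValueError, AttributeError):
        pass  # invalid or non-dict JSON; fall through to text fallback
    return body_text.strip() or f"HTTP {status_code} with empty body"
```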
🧹 Nitpick comments (6)
tcf_website/models/models.py (1)

1424-1426: course and instructor are non-nullable, but club-based summaries may not have them.

The unique_summary_per_club constraint implies club summaries exist independently, but course and instructor are required (non-null) FKs. If you ever create a club-only summary, you'd be forced to provide arbitrary course/instructor values. Consider making these nullable if club summaries are planned:

-    course = models.ForeignKey(Course, on_delete=models.CASCADE)
-    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE)
+    course = models.ForeignKey(Course, on_delete=models.CASCADE, null=True, blank=True)
+    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE, null=True, blank=True)

Alternatively, if club summaries aren't needed yet, remove the club field and the unique_summary_per_club constraint to keep the schema focused.

tcf_website/management/commands/generate_ai_summaries.py (3)

76-95: options.get("max_reviews") or 20 silently converts --max-reviews 0 to 20.

Since argparse always provides the default, the or 20 only triggers for an explicit --max-reviews 0. If 0 is invalid, it's better to validate explicitly rather than silently substituting. Additionally, this hardcoded 20 duplicates the service constant MAX_REVIEWS_FOR_PROMPT.

Proposed fix
-        max_reviews = options.get("max_reviews") or 20
+        max_reviews = options["max_reviews"]

The service already clamps the value via max(1, min(max_reviews, MAX_REVIEWS_FOR_PROMPT)), so there's no need to double-guard here.


136-148: Course.objects.get() / Instructor.objects.get() can raise DoesNotExist and crash the loop.

In the aggregated-pairs path, course_id and instructor_id come from Review foreign keys, so they should exist. However, a race condition (e.g., a course deleted between aggregation and lookup) would raise an unhandled exception and abort the remaining pairs. Wrapping these lookups in a try/except (or using filter().first()) would make the loop resilient.

Proposed fix
         for course_id, instructor_id, review_count, latest_id in pairs:
-            course = Course.objects.get(id=course_id)
-            instructor = Instructor.objects.get(id=instructor_id)
+            try:
+                course = Course.objects.get(id=course_id)
+                instructor = Instructor.objects.get(id=instructor_id)
+            except (Course.DoesNotExist, Instructor.DoesNotExist):
+                self.stdout.write(
+                    self.style.WARNING(
+                        f"Skipping course {course_id} / instructor {instructor_id}: "
+                        "object not found."
+                    )
+                )
+                continue
             qs = base_reviews.filter(course=course, instructor=instructor).order_by(
                 "-created"
             )

150-178: generate_review_summary errors are caught, but consider logging the exception for operational visibility.

The error from generate_review_summary is printed to stdout via self.style.ERROR, which is good for interactive use. For production/cron scenarios, also consider using logging.warning or logging.error so failures appear in structured logs.

Otherwise, the update_or_create and success reporting logic looks correct.

tcf_website/services/review_summary.py (2)

17-49: Prompt construction looks solid, but line 27 is very long.

The ratings format string on line 27 packs a lot into one line. Consider breaking it up for readability and to stay within line-length limits.

Proposed fix
-            f"- Ratings (instructor {review.instructor_rating}/5, enjoyability {review.enjoyability}/5, recommend {review.recommendability}/5, difficulty {review.difficulty}/5): {truncated_text}"
+            f"- Ratings (instructor {review.instructor_rating}/5, "
+            f"enjoyability {review.enjoyability}/5, "
+            f"recommend {review.recommendability}/5, "
+            f"difficulty {review.difficulty}/5): "
+            f"{truncated_text}"

61-66: Consider making max_tokens and temperature configurable or at least top-level constants.

The max_tokens=220 and temperature=0.3 values are buried in the request payload. Since the PR discussion already flagged 220 as potentially too low for 4-6 sentence summaries, extracting these as module-level constants (or Django settings) would make them easier to tune without digging into the function body.
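One way to hoist them, sketched with the value from the PR discussion (280 after the bump; the payload keys follow the OpenAI-style chat-completions format OpenRouter accepts):

```python
# Module-level tuning knobs, easier to adjust than values buried in the payload.
MAX_COMPLETION_TOKENS = 280  # raised from 220 per the PR discussion
TEMPERATURE = 0.3

def build_chat_payload(model, messages):
    """Assemble the chat-completions request body from the constants above."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": MAX_COMPLETION_TOKENS,
        "temperature": TEMPERATURE,
    }
```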

Comment on lines +18 to +26
(
    "id",
    models.BigAutoField(
        auto_created=True,
        primary_key=True,
        serialize=False,
        verbose_name="ID",
    ),
),

⚠️ Potential issue | 🟡 Minor

BigAutoField inconsistent with project's DEFAULT_AUTO_FIELD.

The project explicitly sets DEFAULT_AUTO_FIELD = "django.db.models.AutoField" in tcf_core/settings/base.py (line 186), but this migration uses BigAutoField. This won't break anything, but it creates an inconsistency — this table will use bigint PKs while all other tables use int. Regenerate the migration with the correct settings active, or it will work fine as-is if you're okay with the mismatch.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@tcf_website/management/commands/generate_ai_summaries.py`:
- Around line 196-197: The current payload sets "source_review_count" to the
total matching review_count which overstates how many reviews were actually sent
to the LLM; change the assignment that builds the summary payload (the dict
containing "summary_text" and "source_review_count") to use the actual number
used in the prompt by storing min(review_count, max_reviews) (or compute
used_count = min(review_count, max_reviews)) and set "source_review_count" to
that value so the UI reflects the true number of reviews fed to the model;
locate this logic in the generate_ai_summaries flow where review_count and
max_reviews are available and replace the direct review_count reference with the
computed used_count.
- Around line 192-202: The update_or_create call on ReviewLLMSummary uses course
and instructor as the lookup keys and can inadvertently match club-scoped rows
(or raise MultipleObjectsReturned); modify the lookup to include club=None so it
only targets the club-less unique constraint. Update the
ReviewLLMSummary.objects.update_or_create invocation (the lookup args:
course=course, instructor=instructor) to also pass club=None, leaving the
defaults dict (summary_text, source_review_count, last_review_id,
source_metadata, model_id) unchanged.
- Line 94: The line setting max_reviews uses options.get("max_reviews") or 20
which is dead code because the CLI arg for --max-reviews already sets
default=25; update either the argparse add_argument for --max-reviews to
default=None if you intend 20 as the true fallback, or remove the "or 20" and
just use max_reviews = options.get("max_reviews") so the configured default (25
or user value) is respected; locate this in generate_ai_summaries.py where
max_reviews is assigned and adjust the add_argument('--max-reviews', ...) or the
assignment accordingly.
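Both fixes above are small; a pure-Python sketch of the lookup/defaults split (the Django ORM call itself is omitted, and the field names follow the walkthrough):

```python
def build_summary_upsert(course, instructor, summary_text, model_id,
                         review_count, max_reviews, last_review_id):
    """Build (lookup, defaults) for a ReviewLLMSummary.update_or_create call.

    club=None in the lookup keeps the match on the club-less unique
    constraint, and source_review_count reflects the number of reviews
    actually sent to the LLM rather than the total matching count.
    """
    used_count = min(review_count, max_reviews)
    lookup = {"course": course, "instructor": instructor, "club": None}
    defaults = {
        "summary_text": summary_text,
        "source_review_count": used_count,
        "last_review_id": last_review_id,
        "model_id": model_id,
    }
    return lookup, defaults
```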
🧹 Nitpick comments (1)
tcf_website/management/commands/generate_ai_summaries.py (1)

167-169: Unhandled DoesNotExist in loop could crash the entire batch.

If a course or instructor is deleted between the aggregation query and the .get() call, the command will abort with an unhandled exception, losing progress on remaining pairs. Consider wrapping in try/except and continuing to the next pair.

Suggested fix
         for course_id, instructor_id, review_count, latest_id in pairs:
-            course = Course.objects.get(id=course_id)
-            instructor = Instructor.objects.get(id=instructor_id)
+            try:
+                course = Course.objects.get(id=course_id)
+                instructor = Instructor.objects.get(id=instructor_id)
+            except (Course.DoesNotExist, Instructor.DoesNotExist):
+                self.stdout.write(
+                    self.style.WARNING(
+                        f"Skipping course {course_id} / instructor {instructor_id}: "
+                        "object not found."
+                    )
+                )
+                continue

Comment on lines +196 to +197
"summary_text": summary_text,
"source_review_count": review_count,

⚠️ Potential issue | 🟡 Minor

source_review_count records total reviews, not the number actually fed to the LLM.

review_count is the total number of matching reviews, but the service only uses up to max_reviews of them for the prompt. The UI reportedly renders this as "AI generated based on N reviews," which would overstate the actual input. Consider storing min(review_count, max_reviews) if the intent is to reflect how many reviews informed the summary.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tcf_website/management/commands/generate_ai_summaries.py`:
- Around line 168-170: The loop over pairs uses Course.objects.get and
Instructor.objects.get which can raise Course.DoesNotExist or
Instructor.DoesNotExist and abort the whole job; wrap the lookups in a
try/except that catches Course.DoesNotExist and Instructor.DoesNotExist (or use
.filter(...).first() and check for None), log a warning with identifying ids
(course_id, instructor_id) and continue to the next pair instead of letting the
exception propagate from the generate_ai_summaries command; ensure any
subsequent logic inside the loop only runs when both course and instructor are
present.
🧹 Nitpick comments (2)
scripts/list_ai_summaries.sh (1)

15-18: select_related is missing course__subdepartment, causing N+1 queries.

course.code() accesses self.subdepartment.mnemonic, which triggers a separate query per row since only "course" and "instructor" are prefetched.

Proposed fix
 summaries = (
     ReviewLLMSummary.objects.filter(club__isnull=True)
-    .select_related("course", "instructor")
+    .select_related("course", "course__subdepartment", "instructor")
     .order_by("-updated_at", "-id")
 )
tcf_website/management/commands/generate_ai_summaries.py (1)

213-214: The if not dry_run guard is redundant.

Dry-run pairs already continue on line 181, so execution only reaches line 213 when dry_run is False.

Proposed fix
-            if not dry_run and idx < len(pairs) - 1:
+            if idx < len(pairs) - 1:
                 time.sleep(4.0) # avoids rate limits

Comment on lines 168 to 170
for idx, (course_id, instructor_id, review_count, latest_id) in enumerate(pairs):
    course = Course.objects.get(id=course_id)
    instructor = Instructor.objects.get(id=instructor_id)

⚠️ Potential issue | 🟡 Minor

Course.objects.get() / Instructor.objects.get() can raise DoesNotExist, crashing the entire loop.

In the bulk path, if a course or instructor was deleted between the aggregation query and this lookup, the unhandled exception aborts processing of all remaining pairs.

Proposed fix
         for idx, (course_id, instructor_id, review_count, latest_id) in enumerate(pairs):
-            course = Course.objects.get(id=course_id)
-            instructor = Instructor.objects.get(id=instructor_id)
+            try:
+                course = Course.objects.get(id=course_id)
+                instructor = Instructor.objects.get(id=instructor_id)
+            except (Course.DoesNotExist, Instructor.DoesNotExist) as exc:
+                self.stdout.write(
+                    self.style.ERROR(
+                        f"Skipping course {course_id} / instructor {instructor_id}: {exc}"
+                    )
+                )
+                continue
             qs = base_reviews.filter(course=course, instructor=instructor).order_by(
