Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
📝 Walkthrough

Adds OpenRouter-backed AI review summarization: new settings, a ReviewLLMSummary model + migration, an OpenRouter-based review_summary service and management command to generate summaries, template/CSS and view wiring to display stored summaries, and admin/model export updates.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as Management Command
    participant Service as Review Summary Service
    participant OpenRouter as OpenRouter API
    participant DB as Database
    participant View as Browse View
    participant Template as Template Renderer
    CLI->>DB: Query candidate reviews / pairs
    CLI->>Service: generate_review_summary(course, instructor, reviews_qs, max_reviews)
    Service->>Service: validate API key & build prompt
    Service->>OpenRouter: POST /chat/completions (model, messages)
    alt OpenRouter returns summary
        OpenRouter-->>Service: summary text
    else OpenRouter returns error
        OpenRouter-->>Service: error
    end
    Service-->>CLI: (summary, error, model_used)
    CLI->>DB: Create or update ReviewLLMSummary
    DB-->>CLI: saved
    View->>DB: Query ReviewLLMSummary for course/instructor
    DB-->>View: summary record or none
    View->>Template: Render context with review_summary_text and review_summary_source_count
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ❌ Failed checks (1 warning) | ✅ Passed checks (2 passed)
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@tcf_website/management/commands/generate_ai_summaries.py`:
- Around line 36-37: Add a one-line docstring to the Command class to satisfy
pylint C0115 and match other management commands; update the class definition
for Command (subclassing BaseCommand) by inserting a short descriptive docstring
(e.g., "Management command to generate AI summaries for course/instructor
pairs.") immediately after the class header so the class has a proper docstring.
- Around line 1-23: The module docstring in generate_ai_summaries.py has lines
exceeding the line-length limit; wrap long lines in the module docstring (the
top-of-file string describing usage and examples for the generate_ai_summaries
management command) so no line exceeds the project's limit (e.g., 100 chars);
preserve all content and examples, breaking long sentences and example command
lines across multiple shorter lines while keeping the same wording and
formatting.
- Around line 63-66: The CLI flag --max-reviews in generate_ai_summaries.py
currently defaults to 25 but the service clamps to MAX_REVIEWS_FOR_PROMPT (20),
causing mismatch; update the argparse.add_argument call that defines
"--max-reviews" to set default=20 to match MAX_REVIEWS_FOR_PROMPT and update the
help text if necessary to reflect the 20-review cap; reference the argument
registration for "--max-reviews" and the MAX_REVIEWS_FOR_PROMPT constant to
ensure both are aligned.
- Around line 102-125: The f-string on the self.stdout.write call is too long;
break it into multiple shorter strings or use implicit string concatenation
inside parentheses to keep each line under the 100-char limit. Locate the call
that uses self.stdout.write(self.style.WARNING(f"Skipping course {course_id} /
instructor {instructor_id}: only {review_count} reviews.")) and split the
message across lines (e.g., start with f"Skipping course {course_id} /
instructor {instructor_id}:" on one line and f" only {review_count} reviews." on
the next), preserving the f-string interpolation and the surrounding
self.stdout.write(self.style.WARNING(...)) structure.
In `@tcf_website/views/browse.py`:
- Around line 400-426: The view computes visible_reviews and
current_review_count but never uses current_review_count and only sets
summary_state to "missing" or "available", causing template-state mismatches;
update the logic around ReviewLLMSummary/summary_record and summary_state so it
uses current_review_count to set "no_reviews" when there are zero
visible_reviews, keep "available" when summary_record and summary_text exist,
set "missing" when no summary_record exists, and set "fallback" when a
summary_record exists but summary_text is empty (or "error" if you can derive an
error flag from summary_record); reference ReviewLLMSummary, summary_record,
summary_text, current_review_count, visible_reviews, and summary_state when
making these changes.
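The state mapping requested above can be sketched as a small pure helper. The name `resolve_summary_state` and the exact state strings are assumptions based on the comment; the real view may inline this logic instead:

```python
def resolve_summary_state(summary_record, summary_text, current_review_count):
    """Pick the template state from the stored summary and the live review count."""
    if current_review_count == 0:
        return "no_reviews"   # zero visible reviews for this course/instructor
    if summary_record is None:
        return "missing"      # nothing has been generated yet
    if summary_text:
        return "available"    # normal happy path
    return "fallback"         # a record exists but its text is empty/stale
```

The view would then pass the returned string as `review_summary_state` into the template context, keeping the template's branch ordering simple.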
🧹 Nitpick comments (5)
tcf_website/models/models.py (1)
1421-1450: Clarify nullable fields for club summaries. With `club` optional but `course`/`instructor` required, you can't persist club-only summaries, and `__str__` would fail if those ever become null. If club summaries are intended, make `course`/`instructor` nullable and add a constraint to enforce either club or course+instructor (not both). Otherwise, consider removing `club` to avoid dead fields.

♻️ Example adjustment (will require a migration)

```diff
-    course = models.ForeignKey(Course, on_delete=models.CASCADE)
-    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE)
+    course = models.ForeignKey(Course, on_delete=models.CASCADE, null=True, blank=True)
+    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE, null=True, blank=True)
@@
     class Meta:
         constraints = [
+            models.CheckConstraint(
+                check=(
+                    models.Q(club__isnull=False, course__isnull=True, instructor__isnull=True)
+                    | models.Q(club__isnull=True, course__isnull=False, instructor__isnull=False)
+                ),
+                name="summary_requires_club_or_course_instructor",
+            ),
             models.UniqueConstraint(
                 fields=["course", "instructor"],
                 name="unique_summary_per_course_instructor",
                 condition=models.Q(club__isnull=True),
             ),
```

tcf_website/services/review_summary.py (1)
117-125: Narrow exception handling to request/parsing errors. Catching `Exception` can mask programmer errors; prefer `requests.RequestException` and JSON parsing errors so unexpected bugs surface.

♻️ Suggested change

```diff
-            except Exception as exc:  # pylint: disable=broad-except
-                last_error = str(exc)
-                continue
+            except (requests.RequestException, ValueError) as exc:
+                last_error = str(exc)
+                continue
```

tcf_website/management/commands/generate_ai_summaries.py (1)
74-83: Reduce locals in `handle` and silence unused `args`. CI flags ARG002 and R0914. Consider renaming `args` to `_args` and grouping options into a small dict/dataclass to cut locals.

🔧 Minimal lint fix for unused args

```diff
-    def handle(self, *args: Any, **options: Any):
+    def handle(self, *_args: Any, **options: Any):
```

tcf_website/templates/course/course_professor.html (1)
372-387: Conditional logic may not handle the "fallback" state correctly. The current condition ordering has an issue: if `review_summary_text` is truthy, line 372 will always match first, regardless of `review_summary_state`. This means:

- The "fallback" branch (lines 379-381) will only be reached when `review_summary_text` is falsy, making `{{ review_summary_text|default:"" }}` display nothing useful.
- Users won't see the warning message about showing a cached summary during errors.

If the "fallback" state should show both the text and the warning, consider checking state first:

♻️ Suggested fix to prioritize state checks

```diff
-            {% if review_summary_text %}
+            {% if review_summary_state == "fallback" and review_summary_text %}
+              <p class="mb-2">{{ review_summary_text|linebreaksbr }}</p>
+              <p class="text-warning small mb-0">Showing last saved summary. Error: {{ review_summary_error }}</p>
+            {% elif review_summary_text %}
               <p class="mb-2">{{ review_summary_text|linebreaksbr }}</p>
               <p class="text-primary small mb-0">
                 AI generated based on {{ review_summary_source_count }} review{% if review_summary_source_count != 1 %}s{% endif %}.
               </p>
             {% elif review_summary_state == "no_reviews" %}
               <p class="text-muted mb-0">No written reviews yet. Add yours to unlock a summary.</p>
-            {% elif review_summary_state == "fallback" %}
-              <p class="mb-2">{{ review_summary_text|default:"" }}</p>
-              <p class="text-warning small mb-0">Showing last saved summary. Error: {{ review_summary_error }}</p>
             {% elif review_summary_state == "error" %}
```

Note: The view (`browse.py`) currently only sets the states "missing" and "available"; see the related comment there about implementing the missing state logic.

tcf_website/views/browse.py (1)
401-411: Query is more expensive than necessary for just a count. The query includes `.order_by("-created")`, which is unnecessary when only calling `.count()`. If you implement the state logic suggested above, simplify the query:

♻️ Suggested optimization

```diff
         # AI summary block
-        visible_reviews = (
-            Review.objects.filter(
-                instructor=instructor_id,
-                course=course_id,
-                toxicity_rating__lt=settings.TOXICITY_THRESHOLD,
-                hidden=False,
-            )
-            .exclude(text="")
-            .order_by("-created")
-        )
-        current_review_count = visible_reviews.count()
+        current_review_count = (
+            Review.objects.filter(
+                instructor=instructor_id,
+                course=course_id,
+                toxicity_rating__lt=settings.TOXICITY_THRESHOLD,
+                hidden=False,
+            )
+            .exclude(text="")
+            .count()
+        )
```
Be sure to make the changes we discussed, then I'll give a more in-depth review.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tcf_website/services/review_summary.py`:
- Around line 102-107: Currently you slice reviews_qs before removing empty-text
entries which can leave too few reviews; change the logic so you first
materialize a larger set (e.g., fetch up to MAX_REVIEWS_FOR_PROMPT or a higher
safe limit from reviews_qs), filter out entries with empty r.text using the
existing reviews = [r for r in reviews if (r.text or "").strip()] and only then
apply the cap using max_reviews = max(1, min(max_reviews or
MAX_REVIEWS_FOR_PROMPT, MAX_REVIEWS_FOR_PROMPT)) and slicing
(reviews[:max_reviews]) so the final list contains up to max_reviews non-empty
reviews; update code around reviews_qs, reviews, and MAX_REVIEWS_FOR_PROMPT
accordingly.
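The filter-then-cap ordering described above can be sketched as follows. This is a minimal sketch: `select_prompt_reviews` is a hypothetical helper, plain objects stand in for `Review` instances, and the constant value mirrors the `MAX_REVIEWS_FOR_PROMPT` named in the comment:

```python
from types import SimpleNamespace

MAX_REVIEWS_FOR_PROMPT = 20  # assumed value of the service-side cap

def select_prompt_reviews(reviews, max_reviews=None):
    """Drop empty-text reviews first, then apply the cap, so the final
    list holds up to max_reviews non-empty reviews."""
    non_empty = [r for r in reviews if (r.text or "").strip()]
    cap = max(1, min(max_reviews or MAX_REVIEWS_FOR_PROMPT, MAX_REVIEWS_FOR_PROMPT))
    return non_empty[:cap]

# With the old slice-first order, a cap of 2 could have kept the blank review
# and returned only one usable entry; filtering first yields both non-empty ones.
reviews = [
    SimpleNamespace(text="Great lectures"),
    SimpleNamespace(text="   "),
    SimpleNamespace(text="Tough exams"),
]
picked = select_prompt_reviews(reviews, max_reviews=2)
```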
Code looks great! Just take a look at the few comments I gave you earlier. Why is the token limit set to 220? I think it looks good visually and is almost ready to merge.
Raised to 280 max tokens. Should be enough for 4-6 sentence summaries without truncation.
Force-pushed 96c012d to c70166c (compare)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@tcf_website/migrations/0024_reviewllmsummary.py`:
- Around line 18-26: The migration
tcf_website/migrations/0024_reviewllmsummary.py creates its primary key using
BigAutoField which conflicts with the project's DEFAULT_AUTO_FIELD =
"django.db.models.AutoField" in tcf_core/settings/base.py; to fix, regenerate
this migration with the project's settings active (so Django will use AutoField)
or edit the migration to replace BigAutoField with models.AutoField for the "id"
field, ensuring the migration class/name remains unchanged and that migrations
are recompiled/tested after the change.
In `@tcf_website/services/review_summary.py`:
- Around line 52-90: The code in _call_openrouter currently calls
response.raise_for_status(), which raises an HTTPError and prevents parsing the
JSON error body; change this to inspect response.status_code and only parse JSON
on non-2xx responses to extract structured error details (use response.json() in
a try/except to handle invalid JSON), then if status is not in the 200 range
return (None, extracted_error_message_or_fallback) instead of raising; keep the
existing successful-path parsing of data.get("choices") and return (content,
None) when present, and ensure you still handle missing JSON by returning a
sensible fallback error message from the response text or a generic message.
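A sketch of the status-code-first flow this comment asks for, with the HTTP layer factored out so only the parsing is shown. `parse_openrouter_response` is a hypothetical helper; the real `_call_openrouter` would pass it `response.status_code` and `response.text`:

```python
import json

def parse_openrouter_response(status_code, body_text):
    """Return (content, error) without calling raise_for_status()."""
    try:
        data = json.loads(body_text)
    except ValueError:
        data = None
    if not 200 <= status_code < 300:
        # Non-2xx: prefer the structured error message when the body is JSON.
        if isinstance(data, dict):
            message = (data.get("error") or {}).get("message")
            if message:
                return None, message
        # Fall back to the raw body, or a generic message if it is empty.
        return None, body_text[:200] or f"HTTP {status_code}"
    choices = (data or {}).get("choices") or []
    if choices:
        content = (choices[0].get("message") or {}).get("content")
        if content:
            return content, None
    return None, "No choices in OpenRouter response."
```

This keeps the successful-path parsing of `choices` intact while surfacing OpenRouter's structured error details instead of a bare `HTTPError`.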
🧹 Nitpick comments (6)
tcf_website/models/models.py (1)
1424-1426: `course` and `instructor` are non-nullable, but club-based summaries may not have them. The `unique_summary_per_club` constraint implies club summaries exist independently, but `course` and `instructor` are required (non-null) FKs. If you ever create a club-only summary, you'd be forced to provide arbitrary course/instructor values. Consider making these nullable if club summaries are planned:

```diff
-    course = models.ForeignKey(Course, on_delete=models.CASCADE)
-    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE)
+    course = models.ForeignKey(Course, on_delete=models.CASCADE, null=True, blank=True)
+    instructor = models.ForeignKey(Instructor, on_delete=models.CASCADE, null=True, blank=True)
```

Alternatively, if club summaries aren't needed yet, remove the `club` field and the `unique_summary_per_club` constraint to keep the schema focused.

tcf_website/management/commands/generate_ai_summaries.py (3)
76-95: `options.get("max_reviews") or 20` silently converts `--max-reviews 0` to 20. Since argparse always provides the default, the `or 20` only triggers for an explicit `--max-reviews 0`. If 0 is invalid, it's better to validate explicitly rather than silently substituting. Additionally, this hardcoded `20` duplicates the service constant `MAX_REVIEWS_FOR_PROMPT`.

Proposed fix

```diff
-        max_reviews = options.get("max_reviews") or 20
+        max_reviews = options["max_reviews"]
```

The service already clamps the value via `max(1, min(max_reviews, MAX_REVIEWS_FOR_PROMPT))`, so there's no need to double-guard here.
136-148: `Course.objects.get()` / `Instructor.objects.get()` can raise `DoesNotExist` and crash the loop. In the aggregated-pairs path, `course_id` and `instructor_id` come from `Review` foreign keys, so they should exist. However, a race condition (e.g., a course deleted between aggregation and lookup) would raise an unhandled exception and abort the remaining pairs. Wrapping these lookups in a try/except (or using `filter().first()`) would make the loop resilient.

Proposed fix

```diff
     for course_id, instructor_id, review_count, latest_id in pairs:
-        course = Course.objects.get(id=course_id)
-        instructor = Instructor.objects.get(id=instructor_id)
+        try:
+            course = Course.objects.get(id=course_id)
+            instructor = Instructor.objects.get(id=instructor_id)
+        except (Course.DoesNotExist, Instructor.DoesNotExist):
+            self.stdout.write(
+                self.style.WARNING(
+                    f"Skipping course {course_id} / instructor {instructor_id}: "
+                    "object not found."
+                )
+            )
+            continue
         qs = base_reviews.filter(course=course, instructor=instructor).order_by(
             "-created"
         )
```
150-178: `generate_review_summary` errors are caught, but consider logging the exception for operational visibility. The error from `generate_review_summary` is printed to stdout via `self.style.ERROR`, which is good for interactive use. For production/cron scenarios, also consider using `logging.warning` or `logging.error` so failures appear in structured logs. Otherwise, the `update_or_create` and success-reporting logic looks correct.

tcf_website/services/review_summary.py (2)
17-49: Prompt construction looks solid, but line 27 is very long. The ratings format string on line 27 packs a lot into one line. Consider breaking it up for readability and to stay within line-length limits.

Proposed fix

```diff
-            f"- Ratings (instructor {review.instructor_rating}/5, enjoyability {review.enjoyability}/5, recommend {review.recommendability}/5, difficulty {review.difficulty}/5): {truncated_text}"
+            f"- Ratings (instructor {review.instructor_rating}/5, "
+            f"enjoyability {review.enjoyability}/5, "
+            f"recommend {review.recommendability}/5, "
+            f"difficulty {review.difficulty}/5): "
+            f"{truncated_text}"
```
61-66: Consider making `max_tokens` and `temperature` configurable, or at least top-level constants. The `max_tokens=220` and `temperature=0.3` values are buried in the request payload. Since the PR discussion already flagged 220 as potentially too low for 4-6 sentence summaries, extracting these as module-level constants (or Django settings) would make them easier to tune without digging into the function body.
```python
(
    "id",
    models.BigAutoField(
        auto_created=True,
        primary_key=True,
        serialize=False,
        verbose_name="ID",
    ),
),
```
BigAutoField inconsistent with project's DEFAULT_AUTO_FIELD.
The project explicitly sets DEFAULT_AUTO_FIELD = "django.db.models.AutoField" in tcf_core/settings/base.py (line 186), but this migration uses BigAutoField. This won't break anything, but it creates an inconsistency — this table will use bigint PKs while all other tables use int. Regenerate the migration with the correct settings active, or it will work fine as-is if you're okay with the mismatch.
🤖 Prompt for AI Agents
In `@tcf_website/migrations/0024_reviewllmsummary.py` around lines 18 - 26, The
migration tcf_website/migrations/0024_reviewllmsummary.py creates its primary
key using BigAutoField which conflicts with the project's DEFAULT_AUTO_FIELD =
"django.db.models.AutoField" in tcf_core/settings/base.py; to fix, regenerate
this migration with the project's settings active (so Django will use AutoField)
or edit the migration to replace BigAutoField with models.AutoField for the "id"
field, ensuring the migration class/name remains unchanged and that migrations
are recompiled/tested after the change.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@tcf_website/management/commands/generate_ai_summaries.py`:
- Around line 196-197: The current payload sets "source_review_count" to the
total matching review_count which overstates how many reviews were actually sent
to the LLM; change the assignment that builds the summary payload (the dict
containing "summary_text" and "source_review_count") to use the actual number
used in the prompt by storing min(review_count, max_reviews) (or compute
used_count = min(review_count, max_reviews)) and set "source_review_count" to
that value so the UI reflects the true number of reviews fed to the model;
locate this logic in the generate_ai_summaries flow where review_count and
max_reviews are available and replace the direct review_count reference with the
computed used_count.
- Around line 192-202: The update_or_create call on ReviewLLMSummary uses course
and instructor as the lookup keys and can inadvertently match club-scoped rows
(or raise MultipleObjectsReturned); modify the lookup to include club=None so it
only targets the club-less unique constraint. Update the
ReviewLLMSummary.objects.update_or_create invocation (the lookup args:
course=course, instructor=instructor) to also pass club=None, leaving the
defaults dict (summary_text, source_review_count, last_review_id,
source_metadata, model_id) unchanged.
- Line 94: The line setting max_reviews uses options.get("max_reviews") or 20
which is dead code because the CLI arg for --max-reviews already sets
default=25; update either the argparse add_argument for --max-reviews to
default=None if you intend 20 as the true fallback, or remove the "or 20" and
just use max_reviews = options.get("max_reviews") so the configured default (25
or user value) is respected; locate this in generate_ai_summaries.py where
max_reviews is assigned and adjust the add_argument('--max-reviews', ...) or the
assignment accordingly.
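A minimal sketch of the aligned flag, assuming a module-level constant that mirrors the service's `MAX_REVIEWS_FOR_PROMPT`; names other than the `--max-reviews` flag itself are illustrative:

```python
import argparse

MAX_REVIEWS_FOR_PROMPT = 20  # assumed to mirror the service constant

parser = argparse.ArgumentParser()
parser.add_argument(
    "--max-reviews",
    type=int,
    default=MAX_REVIEWS_FOR_PROMPT,  # one source of truth; no later "or 20" fallback
    help=f"Reviews fed to the prompt (clamped to {MAX_REVIEWS_FOR_PROMPT} by the service).",
)

# The configured default (or the user's value) is then respected as-is.
default_value = parser.parse_args([]).max_reviews
user_value = parser.parse_args(["--max-reviews", "5"]).max_reviews
```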
🧹 Nitpick comments (1)
tcf_website/management/commands/generate_ai_summaries.py (1)
167-169: Unhandled `DoesNotExist` in loop could crash the entire batch. If a course or instructor is deleted between the aggregation query and the `.get()` call, the command will abort with an unhandled exception, losing progress on remaining pairs. Consider wrapping in try/except and continuing to the next pair.

Suggested fix

```diff
     for course_id, instructor_id, review_count, latest_id in pairs:
-        course = Course.objects.get(id=course_id)
-        instructor = Instructor.objects.get(id=instructor_id)
+        try:
+            course = Course.objects.get(id=course_id)
+            instructor = Instructor.objects.get(id=instructor_id)
+        except (Course.DoesNotExist, Instructor.DoesNotExist):
+            self.stdout.write(
+                self.style.WARNING(
+                    f"Skipping course {course_id} / instructor {instructor_id}: "
+                    "object not found."
+                )
+            )
+            continue
```
```python
"summary_text": summary_text,
"source_review_count": review_count,
```
`source_review_count` records total reviews, not the number actually fed to the LLM. `review_count` is the total number of matching reviews, but the service only uses up to `max_reviews` of them for the prompt. The UI reportedly renders this as "AI generated based on N reviews," which would overstate the actual input. Consider storing `min(review_count, max_reviews)` if the intent is to reflect how many reviews informed the summary.
🤖 Prompt for AI Agents
In `@tcf_website/management/commands/generate_ai_summaries.py` around lines 196 -
197, The current payload sets "source_review_count" to the total matching
review_count which overstates how many reviews were actually sent to the LLM;
change the assignment that builds the summary payload (the dict containing
"summary_text" and "source_review_count") to use the actual number used in the
prompt by storing min(review_count, max_reviews) (or compute used_count =
min(review_count, max_reviews)) and set "source_review_count" to that value so
the UI reflects the true number of reviews fed to the model; locate this logic
in the generate_ai_summaries flow where review_count and max_reviews are
available and replace the direct review_count reference with the computed
used_count.
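The fix amounts to recording the clamped count rather than the raw total; as a sketch (the helper name is hypothetical, the computation is just the `min` from the comment):

```python
def used_review_count(review_count, max_reviews):
    """Number of reviews actually fed to the LLM prompt, given the per-prompt cap."""
    return min(review_count, max_reviews)

# 42 reviews may match, but if only 20 were sent to the model, the stored
# source_review_count should be 20, not 42.
```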
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tcf_website/management/commands/generate_ai_summaries.py`:
- Around line 168-170: The loop over pairs uses Course.objects.get and
Instructor.objects.get which can raise Course.DoesNotExist or
Instructor.DoesNotExist and abort the whole job; wrap the lookups in a
try/except that catches Course.DoesNotExist and Instructor.DoesNotExist (or use
.filter(...).first() and check for None), log a warning with identifying ids
(course_id, instructor_id) and continue to the next pair instead of letting the
exception propagate from the generate_ai_summaries command; ensure any
subsequent logic inside the loop only runs when both course and instructor are
present.
🧹 Nitpick comments (2)
scripts/list_ai_summaries.sh (1)
15-18: `select_related` is missing `course__subdepartment`, causing N+1 queries. `course.code()` accesses `self.subdepartment.mnemonic`, which triggers a separate query per row since only `"course"` and `"instructor"` are prefetched.

Proposed fix

```diff
 summaries = (
     ReviewLLMSummary.objects.filter(club__isnull=True)
-    .select_related("course", "instructor")
+    .select_related("course", "course__subdepartment", "instructor")
     .order_by("-updated_at", "-id")
 )
```

tcf_website/management/commands/generate_ai_summaries.py (1)
213-214: The `if not dry_run` guard is redundant. Dry-run pairs already `continue` on line 181, so execution only reaches line 213 when `dry_run` is `False`.

Proposed fix

```diff
-        if not dry_run and idx < len(pairs) - 1:
+        if idx < len(pairs) - 1:
             time.sleep(4.0)  # avoids rate limits
```
```python
for idx, (course_id, instructor_id, review_count, latest_id) in enumerate(pairs):
    course = Course.objects.get(id=course_id)
    instructor = Instructor.objects.get(id=instructor_id)
```
`Course.objects.get()` / `Instructor.objects.get()` can raise `DoesNotExist`, crashing the entire loop.

In the bulk path, if a course or instructor was deleted between the aggregation query and this lookup, the unhandled exception aborts processing of all remaining pairs.
Proposed fix
```diff
 for idx, (course_id, instructor_id, review_count, latest_id) in enumerate(pairs):
-    course = Course.objects.get(id=course_id)
-    instructor = Instructor.objects.get(id=instructor_id)
+    try:
+        course = Course.objects.get(id=course_id)
+        instructor = Instructor.objects.get(id=instructor_id)
+    except (Course.DoesNotExist, Instructor.DoesNotExist) as exc:
+        self.stdout.write(
+            self.style.ERROR(
+                f"Skipping course {course_id} / instructor {instructor_id}: {exc}"
+            )
+        )
+        continue
     qs = base_reviews.filter(course=course, instructor=instructor).order_by(
```

🤖 Prompt for AI Agents
In `@tcf_website/management/commands/generate_ai_summaries.py` around lines 168 -
170, The loop over pairs uses Course.objects.get and Instructor.objects.get
which can raise Course.DoesNotExist or Instructor.DoesNotExist and abort the
whole job; wrap the lookups in a try/except that catches Course.DoesNotExist and
Instructor.DoesNotExist (or use .filter(...).first() and check for None), log a
warning with identifying ids (course_id, instructor_id) and continue to the next
pair instead of letting the exception propagate from the generate_ai_summaries
command; ensure any subsequent logic inside the loop only runs when both course
and instructor are present.
This PR adds an AI-powered review summary feature on the course/instructor page.

Manual LLM summary generation occurs by running `generate_ai_summaries` with appropriate flags. This helps us control token usage and keep the budget for this feature extremely low.

ENV vars required:

What I did

- Added `tcf_website/management/commands/generate_ai_summaries.py`: finds the top `n` popular (most reviewed) courses, does the LLM call and generates summaries for the top `n` course/instructor pages, and can also generate for a single specific course/instructor page given `courseId` and `instructorId`

Screenshots
Note: UI is still being iterated on.
Running the `list_ai_summaries.sh` script:

(screenshot)

^ this lists all generated summaries that we have
Running the `generate_ai_summaries` command:

(screenshot)

Example: `docker compose exec web python manage.py generate_ai_summaries --limit 5 --min-reviews 10 --missing-only`

^ this generates summaries for 5 course/instructor pairs with a minimum of 10 reviews, restricted to pairs that don't yet have a generated summary
Summary by CodeRabbit

- New Features
- UI
- Data
- Chores