Bug 2007112 - Perform a binary search backfill with the sheriffing bot with change detection technique integration by junngo · Pull Request #9237 · mozilla/treeherder

junngo · 2026-02-18T08:23:42Z

Hi there :)
I've integrated the smart backfill logic in this patch.
I haven't added unit tests yet. I'd like to confirm that the overall direction and policy make sense first. Once that’s agreed on, I’ll follow up with tests.
Please let me know if you have any questions or if there’s anything you’d like me to change.
Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=2007112
Related Patch: https://phabricator.services.mozilla.com/D282605

Features

This improves on the previous approach, which backfilled a fixed set of pushes around the alert and stopped without further analysis. In this patch, the suspected change point is refined by incrementally filling in missing performance data, while minimizing unnecessary test executions.
I kept the previous workflow and state machine [0]. With this logic, after a backfill completes successfully, the patch re-verifies the data to identify the actual culprit.
I added new columns to the BackfillRecord table.

iteration_count
last_detected_push_id
anchor_push_id
backfill_logs [1]

During verification, multiple potential culprit pushes can be detected. In such cases, we select the culprit that is closest to the previous alert point.
In the example below [1], the initial alert point is push 10 (previous_push_id: 10). After backfilling, verification detects changes at pushes 9 and 14. Since push 9 is closer to the previous alert point (10), it is selected as the culprit.
In this example, the search window is intentionally kept small for testing purposes. In production, we typically use a wider search window (around 12–24 pushes).
(Note: In this example, push 9 represents a regression, while push 14 represents an improvement.)
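The selection rule described above can be sketched as follows. This is a minimal illustration, not the code in this patch: `pick_closest_candidate` is a hypothetical helper, and measuring closeness as the absolute difference between push IDs is an assumption (the patch may compare timestamps or series positions instead).

```python
from typing import Optional


def pick_closest_candidate(candidates: list[dict], previous_push_id: int) -> Optional[dict]:
    """Return the candidate whose push_id is nearest to previous_push_id.

    Hypothetical helper: distance is measured on push IDs, which is an
    assumption made for this sketch. On a tie, the first candidate wins.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda c: abs(c["push_id"] - previous_push_id))


# Candidate values taken from the example log in [1].
candidates = [
    {"push_id": 9, "t_value": 7.831531802772834},
    {"push_id": 14, "t_value": 11.361080927457797},
]
closest = pick_closest_candidate(candidates, previous_push_id=10)
print(closest["push_id"])  # 9, since |9 - 10| = 1 is smaller than |14 - 10| = 4
```

Note that the rule ignores the t-value: push 14 has the larger t-value in the example, but push 9 is still chosen because it is nearer to the previous alert point.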
When searching to the right (toward newer pushes), the logic currently looks ahead by a fixed 25 pushes.
Ideally, it should match the number of pushes triggered by the real backfill action on the mozilla-central side. Also, if we need to manage this value, we can consider adding a new column for it.
I intentionally did not change the existing workflow. If a failure occurs during verification (in the verify_and_iterate method), the backfill record status remains SUCCESSFUL. We could introduce a new status to represent this case more explicitly, or alternatively change the status to FAILED. I left this as a policy decision to be discussed.
The re_run_detect_changes method is based on the alert generation logic [2], in order to recreate the same environment used when alerts are originally generated.

[0]

treeherder/treeherder/perf/models.py

Lines 887 to 891 in ce3a39b

    
    PRELIMINARY = 0
    READY_FOR_PROCESSING = 1
    BACKFILLED = 2
    SUCCESSFUL = 3
    FAILED = 4

[1] Example

[{
  "iteration": 1,
  "detected_push_id": 9,
  "detected_t_value": 7.831531802772834, 
  "candidates": [{"push_id": 9, "t_value": 7.831531802772834, "push_timestamp": 1770761099}, {"push_id": 14, "t_value": 11.361080927457797, "push_timestamp": 1770782865}], 
  "timestamp": "2026-02-17T16:31:34.889370",
  "previous_push_id": 10,
  "direction": "left",
  "notes": "Detected push moved left (from 10 to 9)"
},{
  "iteration": 2,
  "detected_push_id": 9,
  "detected_t_value": 7.831531802772834, 
  "candidates": [{"push_id": 9, "t_value": 7.831531802772834, "push_timestamp": 1770761099}, {"push_id": 14, "t_value": 11.361080927457797, "push_timestamp": 1770782865}],
  "timestamp": "2026-02-17T16:36:49.724971",
  "previous_push_id": 9,
  "direction": "stabilized",
  "notes": "Detected push same as previous, culprit stabilized"
}]
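The `direction` field in the log above can be read as a comparison between the previously detected push and the newly detected one. The sketch below illustrates that reading only; `classify_direction` is a hypothetical name, comparing raw push IDs is an assumption, and the `"right"` label is assumed by symmetry (the example shows only `"left"` and `"stabilized"`).

```python
import json


def classify_direction(previous_push_id: int, detected_push_id: int) -> str:
    """Label how the detected push moved relative to the previous one.

    Hypothetical helper: "right" is an assumed label, mirroring "left".
    """
    if detected_push_id < previous_push_id:
        return "left"
    if detected_push_id > previous_push_id:
        return "right"
    return "stabilized"


# Reduced copy of the two iterations shown in the example log.
raw_logs = """[
  {"iteration": 1, "previous_push_id": 10, "detected_push_id": 9},
  {"iteration": 2, "previous_push_id": 9, "detected_push_id": 9}
]"""

directions = [
    classify_direction(entry["previous_push_id"], entry["detected_push_id"])
    for entry in json.loads(raw_logs)
]
print(directions)  # ['left', 'stabilized']
```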

[2]

treeherder/treeherder/perf/alerts.py

Lines 59 to 126 in ce3a39b

    
    def generate_new_alerts_in_series(signature):
        # get series data starting from either:
        # (1) the last alert, if there is one
        # (2) the alerts max age
        # (use whichever is newer)
        max_alert_age = alert_after_ts = datetime.now() - settings.PERFHERDER_ALERTS_MAX_AGE
        series = PerformanceDatum.objects.filter(signature=signature, push_timestamp__gte=max_alert_age)
        latest_alert_timestamp = (
            PerformanceAlert.objects.filter(series_signature=signature)
            .select_related("summary__push__time")
            .order_by("-summary__push__time")
            .values_list("summary__push__time", flat=True)[:1]
        )
        if latest_alert_timestamp:
            latest_ts = latest_alert_timestamp[0]
            series = series.filter(push_timestamp__gt=latest_ts)
            if latest_ts > alert_after_ts:
                alert_after_ts = latest_ts

        datum_with_replicates = (
            PerformanceDatum.objects.filter(
                signature=signature,
                repository=signature.repository,
                push_timestamp__gte=alert_after_ts,
            )
            .annotate(
                has_replicate=Exists(
                    PerformanceDatumReplicate.objects.filter(performance_datum_id=OuterRef("pk"))
                )
            )
            .filter(has_replicate=True)
        )
        replicates = PerformanceDatumReplicate.objects.filter(
            performance_datum_id__in=Subquery(datum_with_replicates.values("id"))
        ).values_list("performance_datum_id", "value")
        replicates_map: dict[int, list[float]] = {}
        for datum_id, value in replicates:
            replicates_map.setdefault(datum_id, []).append(value)

        revision_data = {}
        for d in series:
            if not revision_data.get(d.push_id):
                revision_data[d.push_id] = RevisionDatum(
                    int(time.mktime(d.push_timestamp.timetuple())), d.push_id, [], []
                )
            revision_data[d.push_id].values.append(d.value)
            revision_data[d.push_id].replicates.extend(replicates_map.get(d.id, []))

        min_back_window = signature.min_back_window
        if min_back_window is None:
            min_back_window = settings.PERFHERDER_ALERTS_MIN_BACK_WINDOW
        max_back_window = signature.max_back_window
        if max_back_window is None:
            max_back_window = settings.PERFHERDER_ALERTS_MAX_BACK_WINDOW
        fore_window = signature.fore_window
        if fore_window is None:
            fore_window = settings.PERFHERDER_ALERTS_FORE_WINDOW
        alert_threshold = signature.alert_threshold
        if alert_threshold is None:
            alert_threshold = settings.PERFHERDER_REGRESSION_THRESHOLD

        data = revision_data.values()
        analyzed_series = detect_changes(
            data,
            min_back_window=min_back_window,
            max_back_window=max_back_window,
            fore_window=fore_window,
        )


beatrice-acasandrei

Awesome work! Could you add some inline comments for the main methods that you updated? Since a significant portion of the core logic resides on the m-c side, could we add a high-level summary of the cross-project workflow? It would be very helpful to document which actions are performed in m-c first and exactly how those results influence the logic in Perfherder.

beatrice-acasandrei · 2026-03-05T15:36:42Z

treeherder/perf/models.py

+    iteration_count = models.IntegerField(default=0)
+    last_detected_push_id = models.IntegerField(null=True, blank=True)
+    anchor_push_id = models.IntegerField(null=True, blank=True)
+    backfill_logs = models.TextField(default="[]")


I think adding some more context via comments would be helpful.
For example:

iteration_count: Tracks search iterations.

last_detected_push_id: The most recent culprit identification.

anchor_push_id: The reference point for searches.

backfill_logs: Detailed iteration history.
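Since `backfill_logs` is a `TextField` defaulting to `"[]"`, it presumably stores a JSON-encoded list of iteration entries. A minimal sketch of appending one entry under that assumption is shown below; `append_backfill_log` is a hypothetical helper and not part of the patch.

```python
import json


def append_backfill_log(raw_logs: str, entry: dict) -> str:
    """Append one iteration entry to a JSON-encoded list stored as text.

    Hypothetical helper: assumes the text column always holds a JSON list
    and treats an empty string as an empty list.
    """
    logs = json.loads(raw_logs or "[]")
    logs.append(entry)
    return json.dumps(logs)


stored = "[]"  # the column default
stored = append_backfill_log(stored, {"iteration": 1, "detected_push_id": 9})
stored = append_backfill_log(stored, {"iteration": 2, "detected_push_id": 9})
print(len(json.loads(stored)))  # 2
```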

beatrice-acasandrei · 2026-03-05T16:12:50Z

treeherder/perf/models.py

        (READY_FOR_PROCESSING, "Ready for processing"),
        (BACKFILLED, "Backfilled"),
        (SUCCESSFUL, "Successful"),
        (FAILED, "Failed"),


It may be useful for debugging to have a separate status for the verification process, e.g. (VERIFICATION_FAILED, "Verification Failed").
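One way to picture this suggestion is sketched below. It is an illustration only: the existing model uses plain integer constants with a choices tuple rather than an enum, the value `5` for `VERIFICATION_FAILED` is an assumption, and `status_after_verification` is a hypothetical helper.

```python
from enum import IntEnum


class BackfillStatus(IntEnum):
    """Existing status values from [0], plus one hypothetical addition."""

    PRELIMINARY = 0
    READY_FOR_PROCESSING = 1
    BACKFILLED = 2
    SUCCESSFUL = 3
    FAILED = 4
    VERIFICATION_FAILED = 5  # assumed value, not present in the patch


def status_after_verification(verification_ok: bool) -> BackfillStatus:
    """Pick the status once the backfill itself has already succeeded."""
    if verification_ok:
        return BackfillStatus.SUCCESSFUL
    return BackfillStatus.VERIFICATION_FAILED


print(status_after_verification(False).name)  # VERIFICATION_FAILED
```

Keeping the new status distinct from `FAILED` would let a failed verification be told apart from a failed backfill when debugging.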

beatrice-acasandrei · 2026-03-05T16:14:53Z

treeherder/perf/auto_perf_sheriffing/secretary.py

                logger.error(ex)

+    def verify_and_iterate(self, record: BackfillRecord, max_iterations: int = 5):
+        if record.iteration_count >= max_iterations:


Should we consider adding a timeout check alongside the max-iterations check?
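A combined guard could look roughly like the sketch below. The names `should_stop_iterating`, `started_at`, and `max_duration`, as well as the 24-hour default, are assumptions for illustration; only `max_iterations = 5` comes from the diff above.

```python
from datetime import datetime, timedelta, timezone


def should_stop_iterating(
    iteration_count: int,
    started_at: datetime,
    now: datetime,
    max_iterations: int = 5,
    max_duration: timedelta = timedelta(hours=24),  # assumed default
) -> bool:
    """Stop when either the iteration budget or the time budget is used up."""
    if iteration_count >= max_iterations:
        return True
    return now - started_at >= max_duration


started = datetime(2026, 2, 17, 16, 0, tzinfo=timezone.utc)
print(should_stop_iterating(2, started, started + timedelta(minutes=30)))  # False
print(should_stop_iterating(5, started, started + timedelta(minutes=30)))  # True
print(should_stop_iterating(1, started, started + timedelta(hours=25)))  # True
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to unit test.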

Commit ab2ae26: Bug 2007112 - Perform a binary search backfill with the sheriffing bot with change detection technique integration

junngo requested review from beatrice-acasandrei and esanuandra as code owners February 18, 2026 08:23

beatrice-acasandrei reviewed Mar 5, 2026

junngo wants to merge 1 commit into mozilla:master from junngo:sheriff-automated-backfill
