Skip to content

issues/1597: Temperature Monitoring#1876

Open
MitchellThompkins wants to merge 102 commits intounraid:mainfrom
MitchellThompkins:mthompkins/issues/1597-temperature-monitoring
Open

issues/1597: Temperature Monitoring#1876
MitchellThompkins wants to merge 102 commits intounraid:mainfrom
MitchellThompkins:mthompkins/issues/1597-temperature-monitoring

Conversation

@MitchellThompkins
Copy link
Contributor

@MitchellThompkins MitchellThompkins commented Feb 1, 2026

Temperature Monitoring System Feature Request

Description

This PR implements a comprehensive temperature monitoring system integrated into the existing MetricsResolver. It provides real-time access to temperature data from various system sensors via GraphQL queries and subscriptions.

I tested these changes locally using the provided Apollo Server on Unraid 7.2.0.

Addresses #1597

Key Features

  • Multi-Source Support:
    • CPU & Motherboard: Via lm-sensors integration.
    • Disks: Modifies smartctl integration
      • Modifies smartctl to leverage json parsing instead of raw strings.
    • IPMI: Support for ipmitool sensors (see notes below!).
  • GraphQL API:
    • New temperature field on the Metrics node.
    • systemMetricsTemperature subscription for real-time updates.
    • Exposes current, min, max, and history for each sensor.
  • Configuration:
    • Fully configurable via api.json (enabled status, polling intervals, thresholds).
    • Support for default_unit preference (Celsius/Fahrenheit) with automatic conversion.
  • History & Aggregation:
    • In-memory history tracking with configurable retention.
    • Summary statistics (average temp, hottest/coolest sensor counts).

Implementation Details

  • Binary Management: Relies on system-installed tools (sensors, smartctl, ipmitool) rather than bundling binaries, aligning with the base OS integration strategy. There is no attempt here to package sensor tooling as part of this feature. This is per a conversation on the feature scope with the Unraid team.
  • Architecture: Implemented a modular TemperatureSensorProvider interface allowing for easy addition of new sensor types.
  • Robustness: DisksService was updated to parse smartctl JSON output directly, resolving issues with raw string parsing on certain drive models. There was an issue with some Seagate drives reporting raw values with extra data in parentheses: 24 (0 14 0 0 0). Parsing the last value in the string returned incorrect data. Parsing with json prevents this.
  • Testing: Added unit tests for all additional modules.

Documentation

  • Added developer documentation at docs/developer/temperature.md detailing configuration options and API usage.

Scope Notes

  • IPMI Integration: The IpmiSensorsService has been implemented to parse standard ipmitool output, but it has not been tested on live hardware as I do not have access to a system with IPMI support. It relies on standard ipmitool output formats (at least the documentation I saw and what AI told me).
  • GPU: GPU temperature monitoring is currently out of scope and has not been implemented. I do not have access to a machine with a GPU and could not reliably test it.
  • Alerts: The API calculates and exposes WARNING and CRITICAL statuses based on thresholds but does not currently trigger active system notifications. This type of alert is passive only.

Summary by CodeRabbit

New Features

  • Added comprehensive temperature monitoring with real-time metrics for CPU, disk, and system sensors
  • Configurable temperature thresholds and alerts (Warning/Critical states)
  • Support for multiple temperature units (Celsius, Fahrenheit, Kelvin, Rankine)
  • Temperature history tracking and trend analysis capability
  • Real-time temperature metric subscriptions

@elibosley elibosley linked an issue Feb 13, 2026 that may be closed by this pull request
3 tasks
@MitchellThompkins
Copy link
Contributor Author

MitchellThompkins commented Feb 23, 2026

@Ajit-Mehrotra let me know if there's anything here that warrants an explanation, thanks!

@Ajit-Mehrotra
Copy link
Contributor

Will review this tomorrow - sorry for the delays!

@Ajit-Mehrotra
Copy link
Contributor

Half-way through this PR review. Will finish up later today!

Copy link
Contributor

@Ajit-Mehrotra Ajit-Mehrotra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey man, pretty good job with this. Just a couple small things for cleanup that shouldn't take long.

Another, quick thing. Not a big deal, but would be great if you could regenerate the graphql schema and graphql types.

Here's what you'd need to do:

  • cd to api/api --> pnpm dev (keep server running)
  • open new terminal
    • cd to api/web --> pnpm codegen
    • pnpm dev

That should regenerate all the files you need. I've already done this on my computer for your branch, so I can also just create a PR on your fork or something, if needed. But it's probably easier just to run it locally on your machine.

Also, you might know this but I found it helpful to run pnpm docker:build-and-run in the api/plugin directory to create a local plugin which you can install onto your unraid machine.

Feel free to reach out with any questions or if you need help, and thank you for your contribution to the Unraid community. Really cool stuff!

line.includes('Temperature_Celsius') || line.includes('Airflow_Temperature_Cel')
);
const { stdout } = await execa('smartctl', ['-n', 'standby', '-A', '-j', device]);
const data = JSON.parse(stdout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using zod schema for type validation would be useful here. Would be great if you could also implement that :)

Not just useful here but it would help make everything here type-safe.

Comment on lines 16 to 17
export class LmSensorsService implements TemperatureSensorProvider {
readonly id = 'lm-sensors';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename this to LinuxMonitorSensorService just to make it clearer and less ambiguous. It might not be immediately clear what lm means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with b1739a3.

}

const { stdout } = await execa('sensors', args, { timeout: this.timeoutMs });
const data: unknown = JSON.parse(stdout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, would be nice to use zod schema


private async initializeProviders(): Promise<void> {
// 1. Get sensor specific configs
const lmSensorsConfig = this.configService.get('api.temperature.sensors.lm_sensors');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't include the temperature configs in the api.json. Instead, try adding a temperature.json config using ConfigFilePersister.

The Api config json is for configuring how the api itself is run. So adding it here wouldn't be appropriate. Instead, we have created a config file persister that you can use for something like this :)

As-is, this might cause instability/config-loss with upgrades/downgrades (not a major problem, but still something I think we should address).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, we have created a config file persister that you can use for something like this :)

Ah good catch. I totally didn't appreciate how configs were being passed in. I asked deepwiki where to find the config and it lead me to just pulling the config file directly.

If I did this correctly, it should be addressed with ecffc73 but please double check me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this graphql types to the api instead. Doesn't need to be in here.

If you move temperature into its own config, that will help facilitate this change as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay; that makes more sense. I didn't realize how NestJS actually read in these types and I assumed it was all coming from 1 file. (I'm new to typescript/graphql/this code base and this was the first place I found to put these that worked). Between ecffc73 and a307a38 I think those cover this.

( cd usr/local/bin ; ln -sf ../lib/node_modules/npm/bin/npm-cli.js npm )
( cd usr/local/bin ; rm -rf npx )
( cd usr/local/bin ; ln -sf ../lib/node_modules/npm/bin/npx-cli.js npx )
( cd usr/local/unraid-api/node_modules/.bin ; rm -rf apollo-pbjs )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'd do here is just use what's on the main branch for doinst.sh. This is a generated file, so it shouldn't impact anything. I tested it locally on docker and when it regenerates with what's on the main branch, it doesn't generate what you have here. Lmk if that's not the case for you though.

Comment on lines 61 to 64
const name = parts[0];
const readingParts = parts[1].split(' ');
const valueStr = readingParts[0];
const unitStr = readingParts.slice(1).join(' '); // "degrees C"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you can use array destructuring here to make it easier to follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with 8ae9b74d.

@MitchellThompkins
Copy link
Contributor Author

Hey man, pretty good job with this. Just a couple small things for cleanup that shouldn't take long.

Another, quick thing. Not a big deal, but would be great if you could regenerate the graphql schema and graphql types.

Here's what you'd need to do:

  • cd to api/api --> pnpm dev (keep server running)
  • open new terminal
    • cd to api/web --> pnpm codegen
    • pnpm dev

That should regenerate all the files you need. I've already done this on my computer for your branch, so I can also just create a PR on your fork or something, if needed. But it's probably easier just to run it locally on your machine.

Also, you might know this but I found it helpful to run pnpm docker:build-and-run in the api/plugin directory to create a local plugin which you can install onto your unraid machine.

Feel free to reach out with any questions or if you need help, and thank you for your contribution to the Unraid community. Really cool stuff!

Thanks! Those are all good suggestions. I'll try to knock them out tonight or tomorrow sometime.

@MitchellThompkins
Copy link
Contributor Author

  • pnpm codegen

Hmmmm, so I tried to do the gen but get:

~/workspace/api/web > pnpm codegen

> @unraid/web@4.29.2 codegen /home/mthompkins/workspace/api/web
> graphql-codegen --config codegen.ts -r dotenv/config

✔ Parse Configuration
⚠ Generate outputs
  ❯ Generate to src/composables/gql/
    ✔ Load GraphQL schemas
    ✔ Load GraphQL documents
    ✖ GraphQL Document Validation failed with 5 errors;
        Error 0: Unknown type "ConnectSignInInput".
          at /home/mthompkins/workspace/api/web/src/store/account.fragment.ts:2:34

      Error 1: Cannot query field "connectSignIn" on type "Mutation".
          at /home/mthompkins/workspace/api/web/src/store/account.fragment.ts:3:5

      Error 2: Cannot query field "connectSignOut" on type "Mutation".
          at /home/mthompkins/workspace/api/web/src/store/account.fragment.ts:3:5

      Error 3: Cannot query field "cloud" on type "Query". Did you mean "rclone"?
          at /home/mthompkins/workspace/api/web/src/store/server.fragment.ts:3:5

      Error 4: Unknown type "Cloud". Did you mean "CpuLoad" or "Float"?
          at /home/mthompkins/workspace/api/web/src/store/server.fragment.ts:2:28
  ✖ One or more errors occurred, no files were generated. To allow output on errors, set config.allowPartialOutputs=true
 ELIFECYCLE  Command failed with exit code 1.

And yeah the pnpm docker:build-and-run workflow is actually how I've been testing this :)

Here's the machine:

~/workspace/api > cat /etc/os-release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://gitlab.archlinux.org/groups/archlinux/-/issues"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo


private async initializeProviders(): Promise<void> {
// 1. Get sensor specific configs
const lmSensorsConfig = this.configService.get('api.temperature.sensors.lm_sensors');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, we have created a config file persister that you can use for something like this :)

Ah good catch. I totally didn't appreciate how configs were being passed in. I asked deepwiki where to find the config and it lead me to just pulling the config file directly.

If I did this correctly, it should be addressed with ecffc73 but please double check me.

Comment on lines 16 to 17
export class LmSensorsService implements TemperatureSensorProvider {
readonly id = 'lm-sensors';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with b1739a3.

Comment on lines 61 to 64
const name = parts[0];
const readingParts = parts[1].split(' ');
const valueStr = readingParts[0];
const unitStr = readingParts.slice(1).join(' '); // "degrees C"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with 8ae9b74d.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay; that makes more sense. I didn't realize how NestJS actually read in these types and I assumed it was all coming from 1 file. (I'm new to typescript/graphql/this code base and this was the first place I found to put these that worked). Between ecffc73 and a307a38 I think those cover this.

@MitchellThompkins
Copy link
Contributor Author

MitchellThompkins commented Feb 26, 2026

I'll get to the last few comments tomorrow or the next day probably.

I'm also hitting an issue where when I try to write my new temperature.json with some configuration values, no matter what when I restart the API it seems to overwrite whatever my configuration was =/

Before starting API:

{
  "enabled": true,
  "polling_interval": 10000,
  "default_unit": "RANKINE",
  "history": {
    "max_readings": 4,
    "retention_ms": 300000
  },
  "thresholds": {
    "cpu_warning": 75,
    "cpu_critical": 90,
    "disk_warning": 50,
    "disk_critical": 60
  },
  "sensors": {
    "lm_sensors": {
      "enabled": true,
      "config_path": "/etc/sensors3.conf"
    },
    "smartctl": {
      "enabled": true
    },
    "ipmi": {
      "enabled": false
    }
  }
}

After starting API:

root@server:/boot/config/plugins/dynamix.my.servers/configs# cat temperature.json 
{
  "default_unit": "CELSIUS",
  "sensors": {
    "lm_sensors": {
      "enabled": true,
      "config_path": ""
    },
    "smartctl": {
      "enabled": true
    },
    "ipmi": {
      "enabled": true,
      "args": []
    }
  },
  "thresholds": {
    "cpu_warning": 70,
    "cpu_critical": 85,
    "disk_warning": 50,
    "disk_critical": 60,
    "warning": 80,
    "critical": 90
  }
}

Clearly the default is missing fields...but I feel like I have them in the right places so I'll need to figure that out.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts (1)

264-270: ⚠️ Potential issue | 🟠 Major

Guard against empty sensors before calling buildSummary.

After filtering on line 264, sensors could be empty (if all sensor data was invalid). Calling buildSummary with an empty array will throw an error (line 381-383). Add a guard before building the result.

🐛 Proposed fix
         .filter((s): s is TemperatureSensor => s !== null);

+    if (sensors.length === 0) {
+        return null;
+    }
+
     return {
         id: 'temperature-metrics',
         sensors,
         summary: this.buildSummary(sensors),
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`
around lines 264 - 270, After filtering to get `sensors: TemperatureSensor[]`,
guard against the case where `sensors` is empty before calling
`this.buildSummary(sensors)`; if empty, return the result object with a safe
summary (e.g., null or an empty/default summary shape) instead of invoking
`buildSummary` which throws on empty input — update the code that returns the
object with id 'temperature-metrics' to check `sensors.length === 0` and supply
a safe summary value, referencing the `sensors` variable and the
`this.buildSummary` method.
🧹 Nitpick comments (6)
api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/ipmi_sensors.service.ts (1)

21-21: Remove unused ConfigService dependency.

The ConfigService is injected but never used in this service. If it's intended for future configuration (as the comment on line 33 suggests), consider removing it until actually needed to keep the code clean.

♻️ Proposed fix
-import { ConfigService } from '@nestjs/config';
-
 import { execa } from 'execa';

 import {
@@ -13,10 +11,6 @@ import {
 export class IpmiSensorsService implements TemperatureSensorProvider {
     readonly id = 'ipmi-sensors';
     private readonly logger = new Logger(IpmiSensorsService.name);
     private readonly timeoutMs = 3000;

-    constructor(private readonly configService: ConfigService) {}
+    constructor() {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/ipmi_sensors.service.ts`
at line 21, The ConfigService injected into the IpmiSensorsService constructor
is unused; remove the unused dependency and any related import to keep the class
clean. Edit the IpmiSensorsService (constructor) to drop the private readonly
configService: ConfigService parameter and remove the ConfigService import at
the top of ipmi_sensors.service.ts; if there are any comments referencing future
use, leave the comment but do not inject ConfigService until it's actually
required.
api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.module.ts (1)

1-1: Remove the filename comment.

As per coding guidelines, comments should only be added when needed for clarity of function. The filename is already evident from the file path.

♻️ Proposed fix
-// temperature/temperature.module.ts
 import { Module } from '@nestjs/common';

As per coding guidelines: Never add comments unless they are needed for clarity of function.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.module.ts`
at line 1, Remove the unnecessary top-of-file comment "//
temperature/temperature.module.ts" from the module file; simply delete that
single-line filename comment so the file contains only real code and meaningful
comments (no other changes to exports, imports, or any symbols in the
temperature.module.ts file).
api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.spec.ts (1)

59-68: Consider using a simpler mock pattern for TemperatureConfigService.

Using Object.create(TemperatureConfigService.prototype) and then assigning methods is unusual. A typed partial mock object would be cleaner and more consistent with the other mocks in this file.

♻️ Alternative approach
-        temperatureConfigService = Object.create(TemperatureConfigService.prototype);
-        temperatureConfigService.getConfig = vi.fn().mockReturnValue({
+        temperatureConfigService = {
+            getConfig: vi.fn().mockReturnValue({
             default_unit: 'celsius',
             sensors: {
                 lm_sensors: { enabled: true },
                 smartctl: { enabled: true },
                 ipmi: { enabled: false },
             },
             thresholds: {},
-        });
+            }),
+        } as unknown as TemperatureConfigService;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.spec.ts`
around lines 59 - 68, Replace the unusual
Object.create(TemperatureConfigService.prototype) pattern with a simple typed
partial mock for temperatureConfigService: create a plain object literal that
implements getConfig as vi.fn().mockReturnValue(...), cast it to
Partial<TemperatureConfigService> or TemperatureConfigService as appropriate,
and use that mock in tests instead of TemperatureConfigService.prototype to
match the other mocks in this file.
api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts (3)

28-30: Remove unused cache fields (dead code).

The cache and cacheTimestamp fields are declared but never used. The actual caching logic uses this.history.getMostRecentReading() instead. These fields should be removed to avoid confusion.

🧹 Proposed fix
 export class TemperatureService implements OnModuleInit {
     private readonly logger = new Logger(TemperatureService.name);
     private availableProviders: TemperatureSensorProvider[] = [];
 
-    private cache: TemperatureMetrics | null = null;
-    private cacheTimestamp = 0;
     private readonly CACHE_TTL_MS = 1000;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`
around lines 28 - 30, Remove the dead private fields `cache` and
`cacheTimestamp` from the TemperatureService class (they're declared but never
referenced); locate their declarations (private cache: TemperatureMetrics | null
= null; private cacheTimestamp = 0;) and delete them so only the actual caching
via `this.history.getMostRecentReading()` remains; ensure no other code
references `cache` or `cacheTimestamp` and run type checks to confirm no further
fixes are needed.

22-22: Remove unnecessary comment.

This comment just states the filename, which provides no additional value. As per coding guidelines: "Never add comments unless they are needed for clarity of function."

🧹 Proposed fix
-// temperature.service.ts
 `@Injectable`()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`
at line 22, Remove the redundant top-of-file comment that only repeats the
filename in temperature.service.ts; open the file and delete the lone comment
line "// temperature.service.ts" (or any similar filename-only comment at the
top) so the file contains only meaningful code/comments.

333-338: Confusing parameter name sourceUnit.

The parameter sourceUnit on line 338 actually receives targetUnit from the caller (line 149). Consider renaming to targetUnit or displayUnit for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`
around lines 333 - 338, The parameter name sourceUnit on the computeStatus
function is misleading because callers pass a target/display unit; rename the
parameter to targetUnit (or displayUnit) in computeStatus and update its
signature and all references inside computeStatus to the new name, then update
every call site that passes the unit (e.g., where computeStatus(...) is invoked)
to match the renamed parameter; also update any related type annotations or
JSDoc for TemperatureUnit to reflect the new parameter name to keep intent
clear.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.ts`:
- Around line 86-102: The resolver is reading wrong config keys so it always
uses defaults; update the config access in the metrics resolver (where
isTemperatureEnabled, pollingInterval and this.configService are used) to use
the correct keys or inject TemperatureConfigService: either replace
'api.temperature.enabled' and 'api.temperature.polling_interval' with
'temperature.enabled' and 'temperature.polling_interval', or inject
TemperatureConfigService and call its getConfig() to obtain enabled and
polling_interval and use those values when calling
subscriptionTracker.registerTopic (the same place you call
temperatureService.getMetrics and publish to
PUBSUB_CHANNEL.TEMPERATURE_METRICS).

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`:
- Around line 170-180: getThresholdsForType is incorrectly treating stored
threshold numbers as being in the user's target unit; update
getThresholdsForType so it returns thresholds as-is in Celsius (do not convert
based on targetUnit) and remove the targetUnit parameter from its signature,
then update all callers (e.g., where getThresholdsForType(r.type,
thresholdConfig, targetUnit) is used) to call getThresholdsForType(type,
thresholdConfig) only; keep the convertValue(...) calls that convert from
TemperatureUnit.CELSIUS to the user's targetUnit so displayed thresholds are
converted at the call site, not inside getThresholdsForType.

---

Duplicate comments:
In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`:
- Around line 264-270: After filtering to get `sensors: TemperatureSensor[]`,
guard against the case where `sensors` is empty before calling
`this.buildSummary(sensors)`; if empty, return the result object with a safe
summary (e.g., null or an empty/default summary shape) instead of invoking
`buildSummary` which throws on empty input — update the code that returns the
object with id 'temperature-metrics' to check `sensors.length === 0` and supply
a safe summary value, referencing the `sensors` variable and the
`this.buildSummary` method.

---

Nitpick comments:
In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/ipmi_sensors.service.ts`:
- Line 21: The ConfigService injected into the IpmiSensorsService constructor is
unused; remove the unused dependency and any related import to keep the class
clean. Edit the IpmiSensorsService (constructor) to drop the private readonly
configService: ConfigService parameter and remove the ConfigService import at
the top of ipmi_sensors.service.ts; if there are any comments referencing future
use, leave the comment but do not inject ConfigService until it's actually
required.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.module.ts`:
- Line 1: Remove the unnecessary top-of-file comment "//
temperature/temperature.module.ts" from the module file; simply delete that
single-line filename comment so the file contains only real code and meaningful
comments (no other changes to exports, imports, or any symbols in the
temperature.module.ts file).

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.spec.ts`:
- Around line 59-68: Replace the unusual
Object.create(TemperatureConfigService.prototype) pattern with a simple typed
partial mock for temperatureConfigService: create a plain object literal that
implements getConfig as vi.fn().mockReturnValue(...), cast it to
Partial<TemperatureConfigService> or TemperatureConfigService as appropriate,
and use that mock in tests instead of TemperatureConfigService.prototype to
match the other mocks in this file.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`:
- Around line 28-30: Remove the dead private fields `cache` and `cacheTimestamp`
from the TemperatureService class (they're declared but never referenced);
locate their declarations (private cache: TemperatureMetrics | null = null;
private cacheTimestamp = 0;) and delete them so only the actual caching via
`this.history.getMostRecentReading()` remains; ensure no other code references
`cache` or `cacheTimestamp` and run type checks to confirm no further fixes are
needed.
- Line 22: Remove the redundant top-of-file comment that only repeats the
filename in temperature.service.ts; open the file and delete the lone comment
line "// temperature.service.ts" (or any similar filename-only comment at the
top) so the file contains only meaningful code/comments.
- Around line 333-338: The parameter name sourceUnit on the computeStatus
function is misleading because callers pass a target/display unit; rename the
parameter to targetUnit (or displayUnit) in computeStatus and update its
signature and all references inside computeStatus to the new name, then update
every call site that passes the unit (e.g., where computeStatus(...) is invoked)
to match the renamed parameter; also update any related type annotations or
JSDoc for TemperatureUnit to reflect the new parameter name to keep intent
clear.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 27a04af and bbed1e5.

📒 Files selected for processing (11)
  • api/dev/configs/api.json
  • api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.spec.ts
  • api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/ipmi_sensors.service.spec.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/ipmi_sensors.service.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/lm_sensors.service.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/temperature-config.model.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/temperature-config.service.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.module.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.spec.ts
  • api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/src/unraid-api/graph/resolvers/metrics/temperature/sensors/lm_sensors.service.ts

Comment on lines +170 to +180
const rawThresholds = this.getThresholdsForType(r.type, thresholdConfig, targetUnit);
const warning = this.convertValue(
rawThresholds.warning,
TemperatureUnit.CELSIUS,
targetUnit
);
const critical = this.convertValue(
rawThresholds.critical,
TemperatureUnit.CELSIUS,
targetUnit
);
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Threshold unit conversion logic is incorrect.

The threshold values in config (e.g., warning: 80, cpu_warning: 70) are stored in Celsius, but getThresholdsForType treats them as if they're in the user's targetUnit. If a user configures default_unit: FAHRENHEIT, the code will interpret 80 as 80°F (~26.7°C) instead of 80°C, causing warnings to trigger at very low temperatures.

The thresholds should always be treated as Celsius values (their storage format), not converted from targetUnit.

🐛 Proposed fix

Update getThresholdsForType to not convert threshold values since they're already in Celsius:

 private getThresholdsForType(
     type: SensorType,
-    thresholds: TemperatureThresholdsConfig,
-    sourceUnit: TemperatureUnit
+    thresholds: TemperatureThresholdsConfig
 ): { warning: number; critical: number } {
-    const getVal = (val: number | undefined, defaultCelsius: number): number => {
-        if (val === undefined || val === null) return defaultCelsius;
-        return this.convertValue(val, sourceUnit, TemperatureUnit.CELSIUS);
-    };
+    const getVal = (val: number | undefined, defaultCelsius: number): number =>
+        val ?? defaultCelsius;

     switch (type) {

Then update all call sites to remove the targetUnit argument:

  • Line 170: this.getThresholdsForType(r.type, thresholdConfig)
  • Line 236: this.getThresholdsForType(metadata.type, thresholdConfig)
  • Line 342: this.getThresholdsForType(type, thresholdConfig)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.service.ts`
around lines 170 - 180, getThresholdsForType is incorrectly treating stored
threshold numbers as being in the user's target unit; update
getThresholdsForType so it returns thresholds as-is in Celsius (do not convert
based on targetUnit) and remove the targetUnit parameter from its signature,
then update all callers (e.g., where getThresholdsForType(r.type,
thresholdConfig, targetUnit) is used) to call getThresholdsForType(type,
thresholdConfig) only; keep the convertValue(...) calls that convert from
TemperatureUnit.CELSIUS to the user's targetUnit so displayed thresholds are
converted at the call site, not inside getThresholdsForType.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can specify thresholds in whatever unit they desire, that's why there is a default_unit field in the configuration. Comparisons against the for thresholds are all converted to Celsius so that a standard unit is used.

getThresholdsForType always returns {warning,critical} data in degC through it's getVal function.

Then warning and critical are converted from whatever the target unit the user specified to degC.

                const warning = this.convertValue(
                    rawThresholds.warning,
                    TemperatureUnit.CELSIUS,
                    targetUnit
                );
                const critical = this.convertValue(
                    rawThresholds.critical,
                    TemperatureUnit.CELSIUS,
                    targetUnit
                );

This code then converts them from degC to the unit the user specified.

It does beg the question of why is this circular then.

  • Input: User provides a threshold (e.g., 158°F).
  • Inside getThresholdsForType: You convert 158°F → 70°C.
  • The Return: The function returns 70 (Celsius).
  • Outside the function: You take 70 and convert 70°C → 158°F.

And the reason is just to give an easy way to provide defaults (in the cases that call getVal) if they weren't provided. There's probably a better way to do this but it's not "wrong".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MitchellThompkins, thank you for the detailed explanation—I misunderstood the flow. You're right that the logic is functionally correct: thresholds are stored in the user's configured unit, normalized to Celsius by getThresholdsForType, and then converted back to the target unit for display.

I see now that the circular conversion (user unit → Celsius → user unit) exists to handle the default values, which are expressed in Celsius. The getVal helper converts user-provided thresholds to Celsius, and if none are provided, it uses the Celsius defaults directly.

Since you mentioned there's "probably a better way," one alternative could be to store defaults as a separate constant structure (e.g., DEFAULT_THRESHOLDS_CELSIUS = { cpu_warning: 70, cpu_critical: 85, ... }) and only convert user-provided thresholds. That would eliminate the convert-back step. But as you said, what's there isn't wrong—just a bit circular.

Thanks for taking the time to clarify!

__
(o.o)

^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather just leave it alone for now and leave it as a clean-up exercise in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MitchellThompkins, sounds good—we can revisit it later if needed. Thanks for walking through the logic with me!

@Ajit-Mehrotra
Copy link
Contributor

I'll put some time aside on monday or tuesday to pull your code locally and review your changes again. Will also be able to guide you better when I have time to debug. Tomorrow is going to be a bit busy for me.

For the pnpm codegen thing, make sure your api dev server is running while you run pnpm codegen. Love that you're using arch :)

@MitchellThompkins
Copy link
Contributor Author

MitchellThompkins commented Feb 27, 2026

Sorry for this wall of text in advance.

Regarding the test issues I was having

So I was having 3 issues:

  1. I had a cached build in api/dist that I was testing with that caused me to load the wrong build. Removing that solved that issue (which is why my types didn’t align). This was just me being stupid.
  2. When I added a TemperatureConfigService to handle the configuration, I should have provided a async factory provider. Also just me being stupid and missing something. I fixed that with f31513a.
  3. I think I found a bug in ConfigFilePersister that does not allow configurations to be modified while the API is live. I fixed that with 06df963. I'm not an expert on this though and I left you a comment to take a look over in the chagnes.

Regarding (3), this took me forever to find b/c I wasn’t really familiar with the intricacies of ConfigFilePersister. I found that if I modified /boot/config/plugins/dynamix.my.servers/configs/temperature.json and restarted the server, any changes I made to temperature.json disappeared. I think it is because this is happening:

  1. API starts -> loads temperature.json -> in-memory config has values contained in file
  2. Edit the file
  3. Restart the API -> shutdown of old instance calls onModuleDestroy() -> calls persist() -> compares in-memory config with config in file -> not equal -> overwrites file on disk
  4. Go to 1

fee7d46 introduced this change and I think it’s in order to handle the case where a config change was made via the API while shutting down and wanting to capture that change, though I'm not 100% sure. As my temperature.json is now managed by ConfigFilePersister I think I was hitting this.

Here’s the explicit behavior I observed:

Start API service, then edit temperature.json to modify the polling_interval field from 5000 (default) to 4000:

root@server:/boot/config/plugins/dynamix.my.servers/configs# cat temperature.json 
{
  "enabled": true,
  "polling_interval": 4000,
  "default_unit": "CELSIUS",
  ...
}

Restart API service and polling_interval field has reverted:

root@server:/boot/config/plugins/dynamix.my.servers/configs# cat temperature.json 
{
  "enabled": true,
  "polling_interval": 5000,
  "default_unit": "CELSIUS",
 ...
}

Regarding the codegen

I did follow the instructions, however the issue was that api/dev/configs/api.jsonwas missing "plugins": ["unraid-api-plugin-connect"] (though tbh I'm not sure why it was different?). I checked out what was in main with 6a008e5 and that fixed it.

Regenerated with 53bf97e...though now the pipline fails due to some type checking issues in the generated content.

Rest of the comments

I'll get to the rest of your comments this weekend probably, it's mostly just small stuff.

Oh also I only added the machine info in case it was something weird with my VM (which is hosted on unraid actually :)). I'm no Arch evangelist lol, I'll use whatever works.

*/
async onModuleDestroy() {
this.configObserver?.unsubscribe();
await this.persist();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ajit-Mehrotra this was what I changed to let my config managed by ConfigFilePersister actually persist. Let me know if that's wrong but w/o this I could not get my changes to actually stick around between API restarts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Bounty] System Temperature Monitoring in the Unraid API

2 participants