392 convert layers to numpower #393

Open

apphp wants to merge 28 commits into RubixML:3.0 from apphp:392-convert-layers-to-numpower

Conversation


@apphp apphp commented Dec 6, 2025

No description provided.

apphp added 2 commits December 6, 2025 18:55
…hensive unit tests and fixed broken source file link in the documentation
@apphp apphp self-assigned this Dec 6, 2025
@apphp apphp changed the base branch from master to 3.0 December 6, 2025 17:15
apphp added 20 commits December 7, 2025 00:41
…d updated documentation with fixed source file link. Added `Parametric` interface to define parameterized layers.
…ical stability during inference, and gradient computation logic
…ical stability during inference, and gradient computation logic
…ence/backward passes, unit tests, and documentation updates
…rward/inference/backward passes, unit tests
@andrewdalpino andrewdalpino requested review from a team and Copilot and removed request for SkibidiProduction and andrewdalpino December 30, 2025 00:57
Copilot AI left a comment

Pull request overview

This pull request converts neural network layers and optimizers to use the NumPower library for numerical operations. The changes include implementing new optimizer classes (StepDecay, RMSProp, Momentum, Cyclical, Adam, AdaMax, AdaGrad), new layer implementations (Swish, Placeholder1D, PReLU, Noise, Multiclass, Dropout, Dense, Continuous, Binary, BatchNorm, Activation), updating tests to use PHPUnit attributes, and reorganizing code into namespaced subdirectories with corresponding documentation updates.

Key Changes

  • Migration from tensor library to NumPower for all neural network operations
  • Reorganization of optimizers and layers into subdirectory namespaces (e.g., Adam/Adam.php instead of Adam.php)
  • Conversion of PHPUnit test annotations to attributes (#[Test], #[TestDox], etc.)
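
As a point of reference for the last item, a minimal sketch of what the annotation-to-attribute conversion looks like in PHPUnit 10+ (the class and method names here are hypothetical, not taken from the PR):

use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\Attributes\Test;
use PHPUnit\Framework\Attributes\TestDox;
use PHPUnit\Framework\TestCase;

final class ExampleOptimizerTest extends TestCase
{
    // Previously expressed as @test, @testdox, and @dataProvider docblock annotations.
    #[Test]
    #[TestDox('Computes a parameter update step')]
    #[DataProvider('rateProvider')]
    public function step(float $rate) : void
    {
        self::assertGreaterThan(0.0, $rate);
    }

    public static function rateProvider() : array
    {
        return [[0.001], [0.01]];
    }
}

Note that attribute-based data providers must be static methods in PHPUnit 10 and later.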

Reviewed changes

Copilot reviewed 71 out of 71 changed files in this pull request and generated 2 comments.

Summary per file:

File | Description
tests/Specifications/SamplesAreCompatibleWithDistanceTest.php | Updated to use PHPUnit attributes for test annotations
tests/NeuralNet/Optimizers/Stochastic/StochasticTest.php | Refactored test with data providers and PHPUnit attributes
tests/NeuralNet/Optimizers/StepDecay/StepDecayTest.php | New comprehensive test suite for StepDecay optimizer
tests/NeuralNet/Optimizers/RMSProp/RMSPropTest.php | New test suite covering RMSProp optimizer with NumPower
tests/NeuralNet/Optimizers/Momentum/MomentumTest.php | New test suite for Momentum optimizer
tests/NeuralNet/Optimizers/Cyclical/CyclicalTest.php | New test suite for Cyclical learning rate optimizer
tests/NeuralNet/Optimizers/Adam/AdamTest.php | New comprehensive Adam optimizer tests
tests/NeuralNet/Optimizers/AdaMax/AdaMaxTest.php | New AdaMax optimizer test suite
tests/NeuralNet/Optimizers/AdaGrad/AdaGradTest.php | New AdaGrad optimizer tests
tests/NeuralNet/Layers/*/Test.php (multiple) | New test suites for all layer types using NumPower
tests/NeuralNet/FeedForwardTest.php | Updated to use PHPUnit attributes
tests/Helpers/GraphvizTest.php | Updated to use PHPUnit attributes
src/Traits/AssertsShapes.php | Updated exception import and error message
src/NeuralNet/Parameters/Parameter.php | Formatting improvements and method signature updates
src/NeuralNet/Optimizers/*/Optimizers.php (multiple) | New optimizer implementations using NumPower
src/NeuralNet/Layers/*/Layers.php (multiple) | New layer implementations using NumPower
src/NeuralNet/Initializers/*/Initializers.php (multiple) | Added explicit loc parameter to NumPower calls
phpunit.xml | Increased memory limit to 256M
docs/neural-network/**/*.md (multiple) | Updated file paths and added mathematical formulations


Copilot AI left a comment

Pull request overview

Copilot reviewed 72 out of 72 changed files in this pull request and generated no new comments.



@andrewdalpino andrewdalpino (Member) left a comment

This looks great @apphp, mostly good to go. I just had a few questions before I approve.

*
* @var int
*/
protected const int AXIS_SAMPLES = 0;
andrewdalpino (Member)
I would just double check that this is correct and not the other way around. From my understanding, the input to each layer is shaped [num_features, num_samples], which would mean that axis 0 is the features, not the samples.

@apphp apphp (Author) Feb 20, 2026

Done. It was rewritten to use the convention [num_features, num_samples], meaning features-first.
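
For context, a minimal sketch of what the features-first convention implies in code (the class, the AXIS_FEATURES constant, and the untyped parameter are illustrative; only shape() is assumed from the snippets in this PR):

final class ExampleLayer
{
    // Features-first layout: each row is a feature, each column is a sample.
    protected const int AXIS_FEATURES = 0;
    protected const int AXIS_SAMPLES = 1;

    public function batchSize($input) : int
    {
        // With an input of shape [num_features, num_samples],
        // the batch size lives on axis 1, not axis 0.
        return $input->shape()[self::AXIS_SAMPLES];
    }
}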

$m = $dOut->shape()[0];

// Compute gradient per formula: dX = (dXHat * m - dXHatSigma - xHat * xHatSigma) * (stdInv / m)
return NumPower::multiply(
andrewdalpino (Member)

Can we break this apart to help make it more human-readable?

apphp (Author)

Done
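
For readers following the thread, a sketch of one way to break the quoted expression into named steps (variable names follow the formula comment above; subtract() and divide() are assumed to exist in NumPower alongside the multiply() already used, and scalar operands are assumed to broadcast):

// dX = (dXHat * m - dXHatSigma - xHat * xHatSigma) * (stdInv / m)
$scaled = NumPower::multiply($dXHat, $m);              // dXHat * m
$lhs = NumPower::subtract($scaled, $dXHatSigma);       // dXHat * m - dXHatSigma
$correction = NumPower::multiply($xHat, $xHatSigma);   // xHat * xHatSigma
$numerator = NumPower::subtract($lhs, $correction);

return NumPower::multiply($numerator, NumPower::divide($stdInv, $m));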

{
$n = $output->shape()[1];

if ($this->costFn instanceof CrossEntropy) {
andrewdalpino (Member)

I don't remember if I left a comment here or not in the original implementation, but it's worth noting to future developers that this is an optimization specific to the Binary Cross Entropy loss function. Through some clever mathematical cancellation tricks this gradient calculation is just a simple subtraction operation.

apphp (Author)

No, but this was a good point - I added a comment for both implementations.
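
For future readers, the cancellation in question: when the output activation is a sigmoid and the cost is binary cross entropy, the derivative of the loss with respect to the pre-activation collapses to (activations - expected) / n, so no explicit cost derivative is needed on that branch. A minimal sketch under that assumption ($expected and the subtract()/divide() calls are assumptions, not code from the PR):

if ($this->costFn instanceof CrossEntropy) {
    // Sigmoid + binary cross entropy: dL/dz = (a - y) / n.
    $gradient = NumPower::divide(
        NumPower::subtract($this->output, $expected),
        $n
    );
}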


if ($this->biases) {
// Reshape bias vector [fanOut] to column [fanOut, 1] to match output [fanOut, n]
$bias = NumPower::reshape($this->biases->param(), [$this->neurons, 1]);
andrewdalpino (Member)

Was this reshaping required in the original Tensor code?

@apphp apphp (Author) Feb 20, 2026

No, the original Tensor code didn't need it, but the NumPower version does in order to get correct bias broadcasting across the batch dimension.

This is because NumPower::add($output, $biasVector) does not reliably broadcast a 1-D vector across the second dimension.

andrewdalpino (Member)

That's great to know, thanks!
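
To summarize the exchange above, the reshape-then-add pattern in isolation (only reshape() and add() are taken from the PR; $z and the surrounding shapes are illustrative):

// $z has shape [fanOut, n] (features-first); the bias parameter has shape [fanOut].
// Reshaping the bias to a column [fanOut, 1] lets add() broadcast it across the
// n samples in the batch, which a bare 1-D vector would not do reliably.
$bias = NumPower::reshape($this->biases->param(), [$this->neurons, 1]);

$z = NumPower::add($z, $bias);

The same column-reshape trick applies anywhere a per-feature vector has to be combined with a [features, samples] matrix.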

// Gradient of the loss with respect to beta
// dL/dbeta = sum_over_batch(dL/dy * dy/dbeta)
// Here we use a simplified formulation: dL/dbeta ~ sum(dOut * input)
$dBetaFull = NumPower::multiply($dOut, $this->input);
andrewdalpino (Member)

Interesting comment. You know, I'm not entirely sure I had the original gradient calculation 100% correct in the first place.

apphp (Author)

Do you want me to refactor this calculation?

andrewdalpino (Member)

Let's leave it for now, as these changes are more about converting than fixing. But if the calculation is indeed wrong, we'll fix it in a follow-up.

@andrewdalpino andrewdalpino (Member)

At some point we should resolve the static analysis warnings as well


Labels: none yet

Projects: none yet

2 participants