Conversation
Port 5 examples from pytorch/examples that were missing:
- SuperResolution: ESPCN model using PixelShuffle for image upscaling
- GCN: Graph Convolutional Network for node classification
- GAT: Graph Attention Network with multi-head attention
- ForwardForward: Forward-Forward algorithm (Hinton, 2022) for MNIST
- SiameseNetwork: Siamese network for image similarity using BCELoss

Each example includes both the model (in Models/) and the training harness (in CSharpExamples/), all wired up in Program.cs.
Port two missing PyTorch examples to TorchSharp:
- TimeSequencePrediction: LSTM-based sine wave prediction using stacked LSTMCells, matching pytorch/examples/time_sequence_prediction. Uses synthetic sine wave data with the LBFGS optimizer.
- WordLanguageModel: LSTM/GRU/RNN word-level language model on WikiText-2, matching pytorch/examples/word_language_model. Supports four model types (wlm-lstm, wlm-gru, wlm-rnn-tanh, wlm-rnn-relu) with manual SGD, gradient clipping, and learning rate annealing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR adds seven new machine learning model examples to the TorchSharp C# examples collection, porting PyTorch examples to demonstrate various neural network architectures and training techniques.
Changes:
- Adds RNN-based word language models (LSTM/GRU/RNN variants) for text prediction
- Implements time-series prediction with stacked LSTM cells for sine wave forecasting
- Adds super-resolution model using sub-pixel convolution (ESPCN)
- Implements Siamese networks for image similarity comparison
- Adds graph neural networks (GCN and GAT) for node classification
- Implements Forward-Forward algorithm as an alternative to backpropagation
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| src/CSharp/Models/WordLanguageModel.cs | RNN model class supporting LSTM, GRU, and vanilla RNN with weight tying |
| src/CSharp/Models/TimeSequencePrediction.cs | Stacked LSTMCell model for sequential prediction |
| src/CSharp/Models/SuperResolution.cs | ESPCN model with PixelShuffle for image upscaling |
| src/CSharp/Models/SiameseNetwork.cs | Dual CNN backbone for image pair comparison |
| src/CSharp/Models/GCN.cs | Graph convolutional layer and 2-layer GCN model |
| src/CSharp/Models/GAT.cs | Graph attention layer with multi-head attention and 2-layer GAT model |
| src/CSharp/Models/ForwardForward.cs | Layer-wise training implementation of Forward-Forward algorithm |
| src/CSharp/CSharpExamples/WordLanguageModel.cs | Training runner for word language model on WikiText-2 |
| src/CSharp/CSharpExamples/TimeSequencePrediction.cs | Training runner with synthetic sine wave data |
| src/CSharp/CSharpExamples/SuperResolution.cs | Training runner using MNIST for super-resolution demonstration |
| src/CSharp/CSharpExamples/SiameseNetwork.cs | Training runner with MNIST pair generation |
| src/CSharp/CSharpExamples/GCN.cs | Training runner with synthetic graph data |
| src/CSharp/CSharpExamples/GAT.cs | Training runner with synthetic graph data and multi-head attention |
| src/CSharp/CSharpExamples/ForwardForward.cs | Training runner implementing layer-wise Forward-Forward training |
| src/CSharp/CSharpExamples/Program.cs | Adds command-line switches for all new examples |
```csharp
if (tie_weights)
{
    if (nhid != ninp)
        throw new ArgumentException("When using the tied flag, nhid must be equal to emsize");
```
The error message refers to 'nhid must be equal to emsize', but the parameter name is 'ninp' not 'emsize'. This inconsistency between the error message and the actual parameter name could confuse users. Consider changing the error message to: "When using the tied flag, nhid must be equal to ninp"
Suggested change:
```diff
-throw new ArgumentException("When using the tied flag, nhid must be equal to emsize");
+throw new ArgumentException("When using the tied flag, nhid must be equal to ninp");
```
```csharp
lstm1 = LSTMCell(1, 51);
lstm2 = LSTMCell(51, 51);
linear = Linear(51, 1);
```
The model hard-codes the hidden size as 51 in multiple places (lines 26, 27, 28, 49, 50, 51, 52). This magic number should be extracted to a named constant or passed as a constructor parameter to improve maintainability and make the model more flexible.
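One way to address this is a constructor parameter with the current value as the default. A minimal sketch (the `hiddenSize` parameter name and the surrounding class shape are illustrative, not the PR's actual code):

```csharp
// Sketch: hidden size as a constructor parameter instead of a hard-coded 51.
// The name `hiddenSize` is illustrative; 51 is kept as the default to
// preserve the current behavior.
public Sequence(string name, int hiddenSize = 51) : base(name)
{
    this.hiddenSize = hiddenSize;
    lstm1 = LSTMCell(1, hiddenSize);
    lstm2 = LSTMCell(hiddenSize, hiddenSize);
    linear = Linear(hiddenSize, 1);
    RegisterComponents();
}
```

The forward pass would then allocate its hidden-state tensors from `this.hiddenSize` instead of repeating the literal.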
```csharp
total_loss += loss.to(torch.CPU).item<float>();
```
The loss calculation here is inconsistent with the Evaluate function. In Train, it accumulates raw loss values (line 170), while in Evaluate it multiplies by data.shape[0] (line 204). This means the two loss values are not directly comparable and could lead to confusion. Consider making the loss calculation consistent between training and evaluation.
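One way to make the two comparable is to weight the accumulated loss by the number of elements in both loops and divide by the total count at the end. A sketch (variable names such as `totalTokens` are illustrative):

```csharp
// Sketch: accumulate loss weighted by batch size in both Train and Evaluate
// so the reported averages are directly comparable.
total_loss += loss.to(torch.CPU).item<float>() * data.shape[0];
totalTokens += data.shape[0];
// ... after the loop:
var avgLoss = total_loss / totalTokens;
```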
```csharp
var h = OverlayLabelOnInput(x, torch.full(x.shape[0], label, dtype: ScalarType.Int64, device: device));
var goodness = torch.tensor(0.0f, device: device);
foreach (var layer in layers) {
    h = layer.forward(h);
    goodness = goodness + h.pow(2).mean(new long[] { 1 });
}
goodnessList.Add(goodness.unsqueeze(1));
```
The goodnessList accumulates tensors that are not disposed between loop iterations. This could lead to memory leaks during prediction, especially when iterating over all 10 labels. Consider using a NewDisposeScope or explicitly disposing tensors that are no longer needed after each iteration.
Suggested change:
```diff
-var h = OverlayLabelOnInput(x, torch.full(x.shape[0], label, dtype: ScalarType.Int64, device: device));
-var goodness = torch.tensor(0.0f, device: device);
-foreach (var layer in layers) {
-    h = layer.forward(h);
-    goodness = goodness + h.pow(2).mean(new long[] { 1 });
-}
-goodnessList.Add(goodness.unsqueeze(1));
+using (var d = torch.NewDisposeScope()) {
+    var labelTensor = torch.full(x.shape[0], label, dtype: ScalarType.Int64, device: device);
+    var h = OverlayLabelOnInput(x, labelTensor);
+    var goodness = torch.tensor(0.0f, device: device);
+    foreach (var layer in layers) {
+        h = layer.forward(h);
+        goodness = goodness + h.pow(2).mean(new long[] { 1 });
+    }
+    var goodnessUnsqueezed = goodness.unsqueeze(1).MoveToOuterDisposeScope();
+    goodnessList.Add(goodnessUnsqueezed);
+}
```
```csharp
var sourceScores = torch.matmul(hTransformed, a.index(new TensorIndex[] {
    TensorIndex.Colon, TensorIndex.Slice(null, nHidden), TensorIndex.Colon }));
var targetScores = torch.matmul(hTransformed, a.index(new TensorIndex[] {
    TensorIndex.Colon, TensorIndex.Slice(nHidden), TensorIndex.Colon }));
```
The slicing operation on line 64 uses 'TensorIndex.Slice(nHidden)', which is equivalent to 'TensorIndex.Slice(nHidden, null)', meaning it slices from nHidden to the end. For the attention mechanism, this should slice from nHidden to 2*nHidden. While the current code works if the tensor has exactly 2*nHidden elements in that dimension, it is less explicit and could be confusing. Consider using 'TensorIndex.Slice(nHidden, 2 * nHidden)' for clarity.
Suggested change:
```diff
-    TensorIndex.Colon, TensorIndex.Slice(nHidden), TensorIndex.Colon }));
+    TensorIndex.Colon, TensorIndex.Slice(nHidden, 2 * nHidden), TensorIndex.Colon }));
```
```csharp
var idxTrain = idx.slice(0, 1600, numNodes, 1);
var idxVal = idx.slice(0, 1200, 1600, 1);
var idxTest = idx.slice(0, 0, 1200, 1);
```
The train/val/test splits appear inverted. idx.slice(0, 1600, numNodes, 1) on line 65 assigns training only the indices from 1600 to numNodes, while the test split takes the first 1200 indices, so training ends up with the smallest subset and test with the largest. Conventionally the training set should get the largest subset, validation a medium one, and test the smallest. Verify that the slice parameters match the intended data split.
Suggested change:
```diff
-var idxTrain = idx.slice(0, 1600, numNodes, 1);
-var idxVal = idx.slice(0, 1200, 1600, 1);
-var idxTest = idx.slice(0, 0, 1200, 1);
+int numTrain = 1600;
+int numVal = 400;
+int trainEnd = numTrain;
+int valEnd = numTrain + numVal;
+var idxTrain = idx.slice(0, 0, trainEnd, 1);
+var idxVal = idx.slice(0, trainEnd, valEnd, 1);
+var idxTest = idx.slice(0, valEnd, numNodes, 1);
```
```csharp
// Mask non-existent edges
var connectivityMask = -9e16 * torch.ones_like(e);
```
The connectivity mask uses -9e16 which may be too large and could cause numerical instability or overflow issues. PyTorch's standard practice is to use -9e15 or a safer value like -1e9. Consider using a more conservative masking value to avoid potential NaN values after softmax.
Suggested change:
```diff
-// Mask non-existent edges
-var connectivityMask = -9e16 * torch.ones_like(e);
+// Mask non-existent edges with a large negative value to approximate -infinity
+var connectivityMask = -1e9 * torch.ones_like(e);
```
```csharp
// Add edges and self-loops
for (int i = 0; i < numNodes; i++) {
    adjMat[i, i] = 1.0f; // self-loops
}
// Note: In a real implementation, you'd construct the adjacency matrix properly
// and apply the renormalization trick D^(-1/2) A D^(-1/2)
// For now, use identity + random edges normalized by degree
adjMat = adjMat + torch.eye(numNodes, device: device) * 0.1f;
```
The code adds an identity matrix scaled by 0.1 to the adjacency matrix that already has self-loops (line 66). This results in diagonal values of 1.1 instead of 1.0, which may not be the intended behavior. The comment on line 63 mentions "In a real implementation, you'd construct the adjacency matrix properly", but this double-addition of self-loops should be corrected even in this synthetic example.
Suggested change:
```diff
-// Add edges and self-loops
-for (int i = 0; i < numNodes; i++) {
-    adjMat[i, i] = 1.0f; // self-loops
-}
-// Note: In a real implementation, you'd construct the adjacency matrix properly
-// and apply the renormalization trick D^(-1/2) A D^(-1/2)
-// For now, use identity + random edges normalized by degree
-adjMat = adjMat + torch.eye(numNodes, device: device) * 0.1f;
+// Add self-loops
+for (int i = 0; i < numNodes; i++) {
+    adjMat[i, i] = 1.0f; // self-loops
+}
+// Note: In a real implementation, you'd construct the adjacency matrix properly,
+// add edges based on the dataset, and apply the renormalization trick D^(-1/2) A D^(-1/2).
+// Here we only add self-loops for demonstration; the adjacency is then normalized below.
```
```csharp
// Find another image with the same label
int j = rng.Next(batchSize);
int attempts = 0;
while (labels[j].item<long>() != sameLabel && attempts < batchSize) {
    j = rng.Next(batchSize);
    attempts++;
}
images2.Add(data[j].unsqueeze(0));
targets.Add(1.0f);
} else {
    // Different class pair
    var thisLabel = labels[i].item<long>();
    int j = rng.Next(batchSize);
    int attempts = 0;
    while (labels[j].item<long>() == thisLabel && attempts < batchSize) {
        j = rng.Next(batchSize);
        attempts++;
    }
```
The CreatePairs function has a potential infinite loop issue. When trying to find a matching or different label, the code gives up after 'batchSize' attempts (lines 112-115, 123-126), but if no suitable pair is found, it uses the last randomly selected index 'j'. This could result in incorrect pairs (same label when different was requested, or vice versa). Consider either guaranteeing valid pairs or throwing an exception if no valid pair can be found.
Suggested change:
```diff
-// Find another image with the same label
-int j = rng.Next(batchSize);
-int attempts = 0;
-while (labels[j].item<long>() != sameLabel && attempts < batchSize) {
-    j = rng.Next(batchSize);
-    attempts++;
-}
-images2.Add(data[j].unsqueeze(0));
-targets.Add(1.0f);
-} else {
-    // Different class pair
-    var thisLabel = labels[i].item<long>();
-    int j = rng.Next(batchSize);
-    int attempts = 0;
-    while (labels[j].item<long>() == thisLabel && attempts < batchSize) {
-        j = rng.Next(batchSize);
-        attempts++;
-    }
+// Find another image with the same label (and not the same index)
+int j = -1;
+int start = rng.Next(batchSize);
+for (int offset = 0; offset < batchSize; offset++) {
+    int idx = (start + offset) % batchSize;
+    if (idx == i) continue;
+    if (labels[idx].item<long>() == sameLabel) {
+        j = idx;
+        break;
+    }
+}
+if (j == -1) {
+    throw new InvalidOperationException("Unable to find a pair with the same label in the current batch.");
+}
+images2.Add(data[j].unsqueeze(0));
+targets.Add(1.0f);
+} else {
+    // Different class pair
+    var thisLabel = labels[i].item<long>();
+    int j = -1;
+    int start = rng.Next(batchSize);
+    for (int offset = 0; offset < batchSize; offset++) {
+        int idx = (start + offset) % batchSize;
+        if (idx == i) continue;
+        if (labels[idx].item<long>() != thisLabel) {
+            j = idx;
+            break;
+        }
+    }
+    if (j == -1) {
+        throw new InvalidOperationException("Unable to find a pair with a different label in the current batch.");
+    }
```
```csharp
var idx = torch.randperm(numNodes, device: device);
var idxTrain = idx.slice(0, 1500, numNodes, 1);
var idxVal = idx.slice(0, 1000, 1500, 1);
var idxTest = idx.slice(0, 0, 1000, 1);
```
The slice arguments here are (dim, start, end, step), so idx.slice(0, 1500, numNodes, 1) on line 77 already selects indices 1500 through numNodes. The actual concern, as in the GCN example, is that the split ordering looks inverted: training takes only the tail of the permutation while the test split takes the first 1000 indices, the largest share. Conventionally the training set should get the first and largest contiguous range of the shuffled indices, with validation and test taking the remainder. Similar issues exist on lines 78-79.
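A fix along the same lines as the GCN suggestion, as a sketch (the boundary values below are illustrative; in practice derive them from numNodes, ensuring valEnd does not exceed it):

```csharp
// Sketch: give training the first and largest slice of the shuffled indices.
// trainEnd/valEnd are illustrative constants, not taken from the PR.
int trainEnd = 1500;
int valEnd = trainEnd + 500;
var idxTrain = idx.slice(0, 0, trainEnd, 1);
var idxVal = idx.slice(0, trainEnd, valEnd, 1);
var idxTest = idx.slice(0, valEnd, numNodes, 1);
```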