
Add missing examples #59

Open
alinpahontu2912 wants to merge 3 commits into dotnet:main from alinpahontu2912:add-missing-examples

Conversation

@alinpahontu2912 (Member)

No description provided.

alinpahontu2912 and others added 3 commits February 13, 2026 13:46
Port 5 examples from pytorch/examples that were missing:

- SuperResolution: ESPCN model using PixelShuffle for image upscaling (sketched below)
- GCN: Graph Convolutional Network for node classification
- GAT: Graph Attention Network with multi-head attention
- ForwardForward: Forward-Forward algorithm (Hinton 2022) for MNIST
- SiameseNetwork: Siamese network for image similarity using BCELoss

Each example includes both the Model (in Models/) and the training
harness (in CSharpExamples/). All wired up in Program.cs.
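For reference, the PixelShuffle idea behind the SuperResolution port can be sketched in a few lines of TorchSharp. This is a minimal illustration of an ESPCN-style module, not the PR's actual code; the layer sizes are assumptions and the API shapes assume a recent TorchSharp:

using TorchSharp;
using static TorchSharp.torch;
using static TorchSharp.torch.nn;

// Ordinary convolutions produce upscale^2 output channels; PixelShuffle
// then rearranges those channels into an image upscale times larger.
class EspcnSketch : Module<Tensor, Tensor>
{
    private readonly Module<Tensor, Tensor> conv1, conv2, shuffle;

    public EspcnSketch(int upscale) : base("EspcnSketch")
    {
        conv1 = Conv2d(1, 64, 5, padding: 2);
        conv2 = Conv2d(64, upscale * upscale, 3, padding: 1);
        shuffle = PixelShuffle(upscale);
        RegisterComponents();
    }

    public override Tensor forward(Tensor x)
        => shuffle.forward(conv2.forward(functional.relu(conv1.forward(x))));
}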
Port two missing PyTorch examples to TorchSharp:

- TimeSequencePrediction: LSTM-based sine wave prediction using stacked
  LSTMCells, matching pytorch/examples/time_sequence_prediction. Uses
  synthetic sine wave data with LBFGS optimizer.

- WordLanguageModel: LSTM/GRU/RNN word-level language model on
  WikiText-2, matching pytorch/examples/word_language_model. Supports
  four model types (wlm-lstm, wlm-gru, wlm-rnn-tanh, wlm-rnn-relu)
  with manual SGD, gradient clipping, and learning rate annealing (sketched below).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
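As a rough illustration of the "manual SGD, gradient clipping, and learning rate annealing" mentioned above, the loop could look like the following sketch. The model, batches, and hyperparameters are placeholders, and it assumes a recent TorchSharp where Tensor exposes a grad property and torch.nn.utils.clip_grad_norm_ is available (same usings as the sketch above):

// One epoch of word-language-model training with a manual SGD step
// (no optimizer object), gradient clipping, and LR annealing driven
// by validation loss.
double lr = 20.0, bestValLoss = double.MaxValue;
for (int epoch = 1; epoch <= maxEpochs; epoch++) {
    model.train();
    foreach (var (data, targets) in trainBatches) {
        model.zero_grad();
        using var output = model.forward(data);
        using var loss = functional.cross_entropy(output, targets);
        loss.backward();
        utils.clip_grad_norm_(model.parameters(), 0.25);
        using (no_grad()) {
            foreach (var p in model.parameters())
                p.sub_(p.grad * lr);   // manual SGD update
        }
    }
    var valLoss = Evaluate(model, validBatches);
    if (valLoss < bestValLoss) bestValLoss = valLoss;
    else lr /= 4.0;                    // anneal when validation stalls
}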
Copilot AI left a comment

Pull request overview

This PR adds seven new machine learning model examples to the TorchSharp C# examples collection, porting PyTorch examples to demonstrate various neural network architectures and training techniques.

Changes:

  • Adds RNN-based word language models (LSTM/GRU/RNN variants) for text prediction
  • Implements time-series prediction with stacked LSTM cells for sine wave forecasting
  • Adds super-resolution model using sub-pixel convolution (ESPCN)
  • Implements Siamese networks for image similarity comparison
  • Adds graph neural networks (GCN and GAT) for node classification
  • Implements Forward-Forward algorithm as an alternative to backpropagation (sketched below)
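Of these, the Forward-Forward example is probably the least familiar, so here is a minimal sketch of the per-layer objective it implements; layer, optimizer, xPos, xNeg, and threshold are assumed placeholders, not names from the PR (usings as in the sketches above):

// One Forward-Forward layer update (Hinton 2022): raise "goodness"
// (mean squared activation) on positive inputs, lower it on negatives,
// training each layer locally instead of backpropagating end to end.
using var hPos = layer.forward(xPos);
using var hNeg = layer.forward(xNeg);
using var gPos = hPos.pow(2).mean(new long[] { 1 });
using var gNeg = hNeg.pow(2).mean(new long[] { 1 });
// Logistic loss pushes gPos above the threshold and gNeg below it.
using var logits = torch.cat(new[] { -gPos + threshold, gNeg - threshold }, 0);
using var loss = (torch.exp(logits) + 1).log().mean();
optimizer.zero_grad();
loss.backward();
optimizer.step();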

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 11 comments.

Summary per file:

src/CSharp/Models/WordLanguageModel.cs: RNN model class supporting LSTM, GRU, and vanilla RNN with weight tying
src/CSharp/Models/TimeSequencePrediction.cs: Stacked LSTMCell model for sequential prediction
src/CSharp/Models/SuperResolution.cs: ESPCN model with PixelShuffle for image upscaling
src/CSharp/Models/SiameseNetwork.cs: Dual CNN backbone for image pair comparison
src/CSharp/Models/GCN.cs: Graph convolutional layer and 2-layer GCN model
src/CSharp/Models/GAT.cs: Graph attention layer with multi-head attention and 2-layer GAT model
src/CSharp/Models/ForwardForward.cs: Layer-wise training implementation of the Forward-Forward algorithm
src/CSharp/CSharpExamples/WordLanguageModel.cs: Training runner for the word language model on WikiText-2
src/CSharp/CSharpExamples/TimeSequencePrediction.cs: Training runner with synthetic sine wave data
src/CSharp/CSharpExamples/SuperResolution.cs: Training runner using MNIST for super-resolution demonstration
src/CSharp/CSharpExamples/SiameseNetwork.cs: Training runner with MNIST pair generation
src/CSharp/CSharpExamples/GCN.cs: Training runner with synthetic graph data
src/CSharp/CSharpExamples/GAT.cs: Training runner with synthetic graph data and multi-head attention
src/CSharp/CSharpExamples/ForwardForward.cs: Training runner implementing layer-wise Forward-Forward training
src/CSharp/CSharpExamples/Program.cs: Adds command-line switches for all new examples


if (tie_weights)
{
    if (nhid != ninp)
        throw new ArgumentException("When using the tied flag, nhid must be equal to emsize");
Copilot AI Feb 16, 2026

The error message refers to 'nhid must be equal to emsize', but the parameter name is 'ninp' not 'emsize'. This inconsistency between the error message and the actual parameter name could confuse users. Consider changing the error message to: "When using the tied flag, nhid must be equal to ninp"

Suggested change
throw new ArgumentException("When using the tied flag, nhid must be equal to emsize");
throw new ArgumentException("When using the tied flag, nhid must be equal to ninp");

Comment on lines +26 to +28
lstm1 = LSTMCell(1, 51);
lstm2 = LSTMCell(51, 51);
linear = Linear(51, 1);
Copilot AI Feb 16, 2026

The model hard-codes the hidden size as 51 in multiple places (lines 26, 27, 28, 49, 50, 51, 52). This magic number should be extracted to a named constant or passed as a constructor parameter to improve maintainability and make the model more flexible.
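One way to apply this suggestion, sketched assuming the model follows the usual TorchSharp Module pattern (class and field names here are illustrative, not the PR's):

// Hoist the magic number into a constructor parameter with a default,
// so the two LSTM cells and the output projection always agree.
public TimeSequenceModel(string name, int hiddenSize = 51) : base(name)
{
    this.hiddenSize = hiddenSize;
    lstm1 = LSTMCell(1, hiddenSize);
    lstm2 = LSTMCell(hiddenSize, hiddenSize);
    linear = Linear(hiddenSize, 1);
    RegisterComponents();
}
private readonly int hiddenSize; // reused when allocating h/c state tensors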

}
}

total_loss += loss.to(torch.CPU).item<float>();
Copilot AI Feb 16, 2026

The loss calculation here is inconsistent with the Evaluate function. In Train, it accumulates raw loss values (line 170), while in Evaluate it multiplies by data.shape[0] (line 204). This means the two loss values are not directly comparable and could lead to confusion. Consider making the loss calculation consistent between training and evaluation.
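A minimal sketch of one consistent convention, weighting each batch by its size in both loops (variable names are illustrative):

// Accumulate size-weighted loss in Train and Evaluate alike, then
// divide by the total count when reporting, so the numbers compare.
totalLoss += loss.to(torch.CPU).item<float>() * data.shape[0];
totalCount += data.shape[0];
// ... after the loop:
var avgLoss = totalLoss / totalCount;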

Comment on lines +148 to +154
var h = OverlayLabelOnInput(x, torch.full(x.shape[0], label, dtype: ScalarType.Int64, device: device));
var goodness = torch.tensor(0.0f, device: device);
foreach (var layer in layers) {
    h = layer.forward(h);
    goodness = goodness + h.pow(2).mean(new long[] { 1 });
}
goodnessList.Add(goodness.unsqueeze(1));
Copilot AI Feb 16, 2026

The goodnessList accumulates tensors that are not disposed between loop iterations. This could lead to memory leaks during prediction, especially when iterating over all 10 labels. Consider using a NewDisposeScope or explicitly disposing tensors that are no longer needed after each iteration.

Suggested change
var h = OverlayLabelOnInput(x, torch.full(x.shape[0], label, dtype: ScalarType.Int64, device: device));
var goodness = torch.tensor(0.0f, device: device);
foreach (var layer in layers) {
    h = layer.forward(h);
    goodness = goodness + h.pow(2).mean(new long[] { 1 });
}
goodnessList.Add(goodness.unsqueeze(1));
using (var d = torch.NewDisposeScope()) {
    var labelTensor = torch.full(x.shape[0], label, dtype: ScalarType.Int64, device: device);
    var h = OverlayLabelOnInput(x, labelTensor);
    var goodness = torch.tensor(0.0f, device: device);
    foreach (var layer in layers) {
        h = layer.forward(h);
        goodness = goodness + h.pow(2).mean(new long[] { 1 });
    }
    var goodnessUnsqueezed = goodness.unsqueeze(1).MoveToOuterDisposeScope();
    goodnessList.Add(goodnessUnsqueezed);
}

var sourceScores = torch.matmul(hTransformed, a.index(new TensorIndex[] {
    TensorIndex.Colon, TensorIndex.Slice(null, nHidden), TensorIndex.Colon }));
var targetScores = torch.matmul(hTransformed, a.index(new TensorIndex[] {
    TensorIndex.Colon, TensorIndex.Slice(nHidden), TensorIndex.Colon }));
Copilot AI Feb 16, 2026

The slicing operation on line 64 uses 'TensorIndex.Slice(nHidden)', which is equivalent to 'TensorIndex.Slice(nHidden, null)', meaning it slices from nHidden to the end. For the attention mechanism this should be a slice from nHidden to 2 * nHidden. While the current code may work if the tensor has exactly 2 * nHidden elements in that dimension, it is less explicit and could be confusing. Consider using 'TensorIndex.Slice(nHidden, 2 * nHidden)' for clarity.

Suggested change
TensorIndex.Colon, TensorIndex.Slice(nHidden), TensorIndex.Colon }));
TensorIndex.Colon, TensorIndex.Slice(nHidden, 2 * nHidden), TensorIndex.Colon }));

Comment on lines +65 to +67
var idxTrain = idx.slice(0, 1600, numNodes, 1);
var idxVal = idx.slice(0, 1200, 1600, 1);
var idxTest = idx.slice(0, 0, 1200, 1);
Copilot AI Feb 16, 2026

The slice operations for the train/val/test splits appear reversed. Line 65 uses idx.slice(0, 1600, numNodes, 1), which takes indices 1600 through numNodes with step 1 for training, while line 67 gives the test set indices 0 through 1200, so the test split ends up at least as large as the training split. Training should get the largest subset, validation a medium subset, and test the smallest. Verify the slice parameters match the intended data split.

Suggested change
var idxTrain = idx.slice(0, 1600, numNodes, 1);
var idxVal = idx.slice(0, 1200, 1600, 1);
var idxTest = idx.slice(0, 0, 1200, 1);
int numTrain = 1600;
int numVal = 400;
int trainEnd = numTrain;
int valEnd = numTrain + numVal;
var idxTrain = idx.slice(0, 0, trainEnd, 1);
var idxVal = idx.slice(0, trainEnd, valEnd, 1);
var idxTest = idx.slice(0, valEnd, numNodes, 1);

Comment on lines +85 to +86
// Mask non-existent edges
var connectivityMask = -9e16 * torch.ones_like(e);
Copilot AI Feb 16, 2026

The connectivity mask uses -9e16 which may be too large and could cause numerical instability or overflow issues. PyTorch's standard practice is to use -9e15 or a safer value like -1e9. Consider using a more conservative masking value to avoid potential NaN values after softmax.

Suggested change
// Mask non-existent edges
var connectivityMask = -9e16 * torch.ones_like(e);
// Mask non-existent edges with a large negative value to approximate -infinity
var connectivityMask = -1e9 * torch.ones_like(e);

Comment on lines +59 to +67
// Add edges and self-loops
for (int i = 0; i < numNodes; i++) {
    adjMat[i, i] = 1.0f; // self-loops
}
// Note: In a real implementation, you'd construct the adjacency matrix properly
// and apply the renormalization trick D^(-1/2) A D^(-1/2)
// For now, use identity + random edges normalized by degree
adjMat = adjMat + torch.eye(numNodes, device: device) * 0.1f;

Copilot AI Feb 16, 2026

The code adds an identity matrix scaled by 0.1 to the adjacency matrix that already has self-loops (line 66). This results in diagonal values of 1.1 instead of 1.0, which may not be the intended behavior. The comment on line 63 mentions "In a real implementation, you'd construct the adjacency matrix properly", but this double-addition of self-loops should be corrected even in this synthetic example.

Suggested change
// Add edges and self-loops
for (int i = 0; i < numNodes; i++) {
    adjMat[i, i] = 1.0f; // self-loops
}
// Note: In a real implementation, you'd construct the adjacency matrix properly
// and apply the renormalization trick D^(-1/2) A D^(-1/2)
// For now, use identity + random edges normalized by degree
adjMat = adjMat + torch.eye(numNodes, device: device) * 0.1f;
// Add self-loops
for (int i = 0; i < numNodes; i++) {
    adjMat[i, i] = 1.0f; // self-loops
}
// Note: In a real implementation, you'd construct the adjacency matrix properly,
// add edges based on the dataset, and apply the renormalization trick D^(-1/2) A D^(-1/2).
// Here we only add self-loops for demonstration; the adjacency is then normalized below.
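For the renormalization trick the comment refers to, a minimal sketch with plain tensor ops, assuming adjMat starts without self-loops and that sum/pow/diag behave as in recent TorchSharp:

// A_hat = D^(-1/2) (A + I) D^(-1/2): add self-loops exactly once,
// then symmetrically normalize by the node degrees.
using var aTilde = adjMat + torch.eye(numNodes, device: device);
using var deg = aTilde.sum(1);      // row sums = degrees (incl. self-loop)
using var dInvSqrt = deg.pow(-0.5);
using var dMat = torch.diag(dInvSqrt);
var aHat = torch.matmul(torch.matmul(dMat, aTilde), dMat);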

Comment on lines +109 to +126
    // Find another image with the same label
    int j = rng.Next(batchSize);
    int attempts = 0;
    while (labels[j].item<long>() != sameLabel && attempts < batchSize) {
        j = rng.Next(batchSize);
        attempts++;
    }
    images2.Add(data[j].unsqueeze(0));
    targets.Add(1.0f);
} else {
    // Different class pair
    var thisLabel = labels[i].item<long>();
    int j = rng.Next(batchSize);
    int attempts = 0;
    while (labels[j].item<long>() == thisLabel && attempts < batchSize) {
        j = rng.Next(batchSize);
        attempts++;
    }
Copilot AI Feb 16, 2026

The CreatePairs function has a potential infinite loop issue. When trying to find a matching or different label, the code gives up after 'batchSize' attempts (lines 112-115, 123-126), but if no suitable pair is found, it uses the last randomly selected index 'j'. This could result in incorrect pairs (same label when different was requested, or vice versa). Consider either guaranteeing valid pairs or throwing an exception if no valid pair can be found.

Suggested change
    // Find another image with the same label
    int j = rng.Next(batchSize);
    int attempts = 0;
    while (labels[j].item<long>() != sameLabel && attempts < batchSize) {
        j = rng.Next(batchSize);
        attempts++;
    }
    images2.Add(data[j].unsqueeze(0));
    targets.Add(1.0f);
} else {
    // Different class pair
    var thisLabel = labels[i].item<long>();
    int j = rng.Next(batchSize);
    int attempts = 0;
    while (labels[j].item<long>() == thisLabel && attempts < batchSize) {
        j = rng.Next(batchSize);
        attempts++;
    }
    // Find another image with the same label (and not the same index)
    int j = -1;
    int start = rng.Next(batchSize);
    for (int offset = 0; offset < batchSize; offset++) {
        int idx = (start + offset) % batchSize;
        if (idx == i) continue;
        if (labels[idx].item<long>() == sameLabel) {
            j = idx;
            break;
        }
    }
    if (j == -1) {
        throw new InvalidOperationException("Unable to find a pair with the same label in the current batch.");
    }
    images2.Add(data[j].unsqueeze(0));
    targets.Add(1.0f);
} else {
    // Different class pair
    var thisLabel = labels[i].item<long>();
    int j = -1;
    int start = rng.Next(batchSize);
    for (int offset = 0; offset < batchSize; offset++) {
        int idx = (start + offset) % batchSize;
        if (idx == i) continue;
        if (labels[idx].item<long>() != thisLabel) {
            j = idx;
            break;
        }
    }
    if (j == -1) {
        throw new InvalidOperationException("Unable to find a pair with a different label in the current batch.");
    }

Comment on lines +76 to +79
var idx = torch.randperm(numNodes, device: device);
var idxTrain = idx.slice(0, 1500, numNodes, 1);
var idxVal = idx.slice(0, 1000, 1500, 1);
var idxTest = idx.slice(0, 0, 1000, 1);
Copilot AI Feb 16, 2026

The slice operations for the train/val/test splits look inconsistent with the intended proportions. Line 77 uses idx.slice(0, 1500, numNodes, 1), which takes indices 1500 through numNodes for training, while the test set takes indices 0 through 1000, so the training split may not be the largest as intended. Verify the ranges, and consider the more explicit indexer form, e.g. idx[TensorIndex.Slice(1500, numNodes)]. Similar issues exist on lines 78-79.
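A sketch of the explicit indexer form suggested here (the split sizes are illustrative):

// Explicit, readable split; training takes the largest share.
var idxTrain = idx[TensorIndex.Slice(0, 1500)];
var idxVal = idx[TensorIndex.Slice(1500, 2000)];
var idxTest = idx[TensorIndex.Slice(2000, numNodes)];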
