For a non-strictly convex quadratic program, there is often more than one optimal extreme point, and some of these extreme points can be far apart. Why can a neural network accurately approximate one of them? The labels obtained from a solver do not necessarily follow any particular distribution and carry little feature information. Yet a neural network can approximate these solutions almost perfectly despite that lack of feature information. Does this indicate that the network is overfitting?
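To make the degeneracy concrete, here is a minimal sketch (my own illustrative example, not from any specific training setup) of a non-strictly convex QP whose optimal set is a whole face: the Hessian `Q` is positive semidefinite but singular, so every point on a line segment is optimal, and a solver's answer depends on where it starts.

```python
import numpy as np
from scipy.optimize import minimize

# Q is PSD but rank-deficient, so the objective
# 0.5*x^T Q x - c^T x depends only on s = x1 + x2.
Q = np.array([[1.0, 1.0], [1.0, 1.0]])
c = np.array([1.0, 1.0])

def f(x):
    return 0.5 * x @ Q @ x - c @ x  # = 0.5*(x1+x2)^2 - (x1+x2)

bounds = [(0.0, 1.0), (0.0, 1.0)]

# Every point on the segment x1 + x2 = 1 inside the box is optimal
# (objective value -0.5); the extreme points (1, 0) and (0, 1) of that
# segment are far apart. Different starting points yield different
# optimal solutions with the same objective value.
sol_a = minimize(f, x0=np.array([0.9, 0.0]), bounds=bounds, method="SLSQP")
sol_b = minimize(f, x0=np.array([0.0, 0.9]), bounds=bounds, method="SLSQP")

print(sol_a.x, f(sol_a.x))  # one optimal point
print(sol_b.x, f(sol_b.x))  # a different optimal point, same objective
```

If the training labels come from runs like these, two near-identical problem instances can be labeled with optimal solutions that sit far apart on the optimal face, which is exactly the ambiguity the question is about.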
Or to rephrase the question: since the labels are obtained more or less arbitrarily from the solver during training, with no obvious feature information, which extreme point should the network's predicted solutions approach at test time, and why that extreme point rather than another?