Error of computational accuracy in the unit test of Modified Huber Loss Op #4406

@JiayiFeng

Description

The Modified Huber Loss Op can be represented by the following function:

L(g) = max(0, 1 - g)^2    for g >= -1,
       -4g                otherwise.

g = y * f(x)

This is a piecewise function made up of three distinct ranges: (-∞, -1), [-1, 1) and [1, +∞). In the range [1, +∞), the function result is constantly zero.
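The piecewise definition above can be sketched in NumPy (a minimal illustration, not the operator's actual kernel; the function name is ours):

```python
import numpy as np

def modified_huber_loss(g):
    """Modified Huber loss over the margin g = y * f(x):
    max(0, 1 - g)^2 for g >= -1, and -4g otherwise."""
    return np.where(g >= -1.0, np.maximum(0.0, 1.0 - g) ** 2, -4.0 * g)

# one sample per range: g < -1, -1 <= g < 1, and g >= 1
print(modified_huber_loss(np.array([-2.0, 0.0, 2.0])))  # [8. 1. 0.]
```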

Although the three pieces join their neighbors smoothly, the gradient changes at different rates on the two sides of the junctions -1 and 1 (the second derivative is discontinuous there). That makes our gradient estimation results from Python and C++ differ considerably near those points.

The difference is then divided by the gradient itself, to express the error as a relative result. However, near the junction 1 the gradient itself is very close to zero: a considerable difference divided by a near-zero value finally leads to a big relative error.
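The blow-up can be reproduced with a plain central-difference check (a sketch of the mechanism, not the actual gradient checker; function names and the step size are ours):

```python
import numpy as np

def loss(g):
    # scalar modified Huber loss: max(0, 1 - g)^2 for g >= -1, else -4g
    return max(0.0, 1.0 - g) ** 2 if g >= -1.0 else -4.0 * g

def analytic_grad(g):
    # dL/dg, piecewise: -4 for g < -1, -2(1 - g) for -1 <= g < 1, 0 for g >= 1
    if g < -1.0:
        return -4.0
    if g < 1.0:
        return -2.0 * (1.0 - g)
    return 0.0

def numeric_grad(g, h=5e-3):
    # central difference, as a typical Python-side checker would compute it
    return (loss(g + h) - loss(g - h)) / (2.0 * h)

def rel_err(g):
    num, ana = numeric_grad(g), analytic_grad(g)
    return abs(num - ana) / max(abs(ana), 1e-12)

# far from the junction the quadratic piece is differenced exactly,
# but just below g = 1 the stencil straddles the junction and the
# near-zero analytic gradient inflates the relative error
print(rel_err(0.5))    # tiny
print(rel_err(0.999))  # ≈ 0.8, an 80% relative error
```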

A simple solution is to keep all the g values fed to the above function in our unit tests far enough from 1.
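One way to do that when generating test data is rejection sampling (a hypothetical helper, not the test's actual code; it conservatively keeps g away from -1 as well, though the issue's fix targets 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_margins(n, margin=0.05):
    """Draw n margins g, resampling any that land within `margin`
    of the junctions -1 and 1."""
    g = rng.uniform(-3.0, 3.0, size=n)
    bad = (np.abs(g - 1.0) < margin) | (np.abs(g + 1.0) < margin)
    while bad.any():
        g[bad] = rng.uniform(-3.0, 3.0, size=int(bad.sum()))
        bad = (np.abs(g - 1.0) < margin) | (np.abs(g + 1.0) < margin)
    return g
```

With inputs kept clear of the junctions, the finite-difference stencil never straddles a piece boundary and the relative-error check passes at ordinary tolerances.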
