Add docs on dimensional constraints
docs/examples.md +89 -1
docs/examples.md
CHANGED
|
@@ -433,9 +433,97 @@ equal to:
|
|
| 433 |
$\frac{x_0^2 x_1 - 2.0000073}{x_2^2 - 1.0000019}$, which
|
| 434 |
is nearly the same as the true equation!
|
| 435 |
|
| 436 |
|
| 437 |
|
| 438 |
-
|
| 439 |
|
| 440 |
For the many other features available in PySR, please
|
| 441 |
read the [Options section](options.md).
|
|
|
|
| 433 |
$\frac{x_0^2 x_1 - 2.0000073}{x_2^2 - 1.0000019}$, which
|
| 434 |
is nearly the same as the true equation!
|
| 435 |
|
| 436 |
+
## 10. Dimensional constraints
|
| 437 |
|
| 438 |
+
One other feature we can exploit is dimensional analysis.
|
| 439 |
+
Say that we know the physical units of each feature and output,
|
| 440 |
+
and we want to find an expression that is dimensionally consistent.
|
| 441 |
|
| 442 |
+
We can do this by using `DynamicQuantities.jl` to assign units:
|
| 443 |
+
we pass a string specifying the units of each variable.
|
| 444 |
+
First, let's make some data on Newton's law of gravitation, using
|
| 445 |
+
astropy for units:
|
| 446 |
+
|
| 447 |
+
```python
|
| 448 |
+
import numpy as np
|
| 449 |
+
from astropy import units as u, constants as const
|
| 450 |
+
|
| 451 |
+
M = (np.random.rand(100) + 0.1) * const.M_sun
|
| 452 |
+
m = 100 * (np.random.rand(100) + 0.1) * u.kg
|
| 453 |
+
r = (np.random.rand(100) + 0.1) * const.R_earth
|
| 454 |
+
G = const.G
|
| 455 |
+
|
| 456 |
+
F = G * M * m / r**2
|
| 457 |
+
```
|
| 458 |
+
|
| 459 |
+
We can see the units of `F` with `F.unit`.
|
| 460 |
+
|
| 461 |
+
Now, let's create our model.
|
| 462 |
+
Since this data has such a large dynamic range,
|
| 463 |
+
let's also create a custom loss function
|
| 464 |
+
that looks at the error in log-space:
|
| 465 |
+
|
| 466 |
+
```python
|
| 467 |
+
loss = """function loss_fnc(prediction, target)
|
| 468 |
+
    scatter_loss = abs(log((abs(prediction)+1e-20) / (abs(target)+1e-20)))
|
| 469 |
+
    sign_loss = 10 * (sign(prediction) - sign(target))^2
|
| 470 |
+
    return scatter_loss + sign_loss
|
| 471 |
+
end
|
| 472 |
+
"""
|
| 473 |
+
```
|
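To make the behaviour of this loss concrete, here is a minimal NumPy transliteration of the Julia string above. It is only an illustration: `log_space_loss` is a hypothetical helper name, and PySR itself evaluates the Julia code, not this function.

```python
import numpy as np


def log_space_loss(prediction, target):
    """Illustrative NumPy version of the Julia `loss_fnc` (not used by PySR)."""
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    # Absolute log-ratio of the magnitudes; the 1e-20 offset avoids log(0).
    scatter_loss = np.abs(
        np.log((np.abs(prediction) + 1e-20) / (np.abs(target) + 1e-20))
    )
    # Heavy penalty when the signs disagree: the squared difference is 0, 1, or 4.
    sign_loss = 10 * (np.sign(prediction) - np.sign(target)) ** 2
    return scatter_loss + sign_loss


# A factor-of-two error is penalized equally at very different scales.
small_scale = log_space_loss(2e-10, 1e-10)
large_scale = log_space_loss(2e30, 1e30)

# A prediction with the right magnitude but the wrong sign is penalized heavily.
wrong_sign = log_space_loss(-1.0, 1.0)
```

Because the loss depends on the ratio of prediction to target, a plain squared error would instead be dominated by the largest values of `F`.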
| 474 |
+
|
| 475 |
+
Now let's define our model:
|
| 476 |
+
|
| 477 |
+
```python
|
| 478 |
+
model = PySRRegressor(
|
| 479 |
+
    binary_operators=["+", "-", "*", "/"],
|
| 480 |
+
    unary_operators=["square"],
|
| 481 |
+
    loss=loss,
|
| 482 |
+
    complexity_of_constants=2,
|
| 483 |
+
    maxsize=25,
|
| 484 |
+
    niterations=100,
|
| 485 |
+
    populations=50,
|
| 486 |
+
    # Amount to penalize dimensional violations:
|
| 487 |
+
    dimensional_constraint_penalty=10**5,
|
| 488 |
+
)
|
| 489 |
+
```
|
| 490 |
+
|
| 491 |
+
Next, we fit it, passing the unit information.
|
| 492 |
+
To do this, we need to use the format of [DynamicQuantities.jl](https://symbolicml.org/DynamicQuantities.jl/dev/#Usage).
|
| 493 |
+
|
| 494 |
+
```python
|
| 495 |
+
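import pandas as pd

# Get numerical arrays to fit, converted to the units declared in `X_units` and `y_units`:
|
| 496 |
+
X = pd.DataFrame(dict(
|
| 497 |
+
    M=M.to("M_sun").value,
|
| 498 |
+
    m=m.to("kg").value,
|
| 499 |
+
    r=r.to("R_earth").value,
|
| 500 |
+
))
|
| 501 |
+
y = F.to("N").value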
|
| 502 |
+
|
| 503 |
+
model.fit(
|
| 504 |
+
    X,
|
| 505 |
+
    y,
|
| 506 |
+
    X_units=["Constants.M_sun", "kg", "Constants.R_earth"],
|
| 507 |
+
    y_units="kg * m / s^2",
|
| 508 |
+
)
|
| 509 |
+
```
|
| 510 |
+
|
| 511 |
+
You can observe that all expressions with a loss under
|
| 512 |
+
our penalty are dimensionally consistent!
|
| 513 |
+
(The `"[⋅]"` indicates free units in a constant, which can cancel out other units in the expression.)
|
| 514 |
+
For example,
|
| 515 |
+
|
| 516 |
+
```julia
|
| 517 |
+
"y[m s⁻² kg] = (M[kg] * 2.6353e-22[⋅])"
|
| 518 |
+
```
|
| 519 |
+
|
| 520 |
+
would indicate that the expression is dimensionally consistent, with
|
| 521 |
+
a constant `"2.6353e-22[m s⁻²]"`.
|
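The unit bookkeeping behind this example can be checked by hand. The sketch below is a toy illustration, not PySR or DynamicQuantities.jl code: it represents each dimension as a dictionary of base-unit exponents, and `divide` is a hypothetical helper.

```python
def divide(numerator, denominator):
    """Return the dimensions of numerator / denominator, dropping zero exponents."""
    units = set(numerator) | set(denominator)
    result = {
        unit: numerator.get(unit, 0) - denominator.get(unit, 0) for unit in units
    }
    return {unit: power for unit, power in result.items() if power != 0}


y_dims = {"m": 1, "s": -2, "kg": 1}  # y[m s⁻² kg], a force
M_dims = {"kg": 1}                   # M[kg]

# For y = M * c to be dimensionally consistent, c must carry the units of y / M.
constant_dims = divide(y_dims, M_dims)
```

Here `constant_dims` contains only `m` with exponent 1 and `s` with exponent -2, matching the `m s⁻²` units assigned to the constant.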
| 522 |
+
|
| 523 |
+
Note that this expression has a large dynamic range, so it may be difficult to find. Consider searching with a larger `niterations` if needed.
|
| 524 |
+
|
| 525 |
+
|
| 526 |
+
## 11. Additional features
|
| 527 |
|
| 528 |
For the many other features available in PySR, please
|
| 529 |
read the [Options section](options.md).
|