author: Angel Ezquerra <AngelEzquerra@users.noreply.github.com> 2024-01-20 06:39:49 +0100
committer: GitHub <noreply@github.com> 2024-01-20 06:39:49 +0100
commit: 83f2708909e0c59b84f61c2721907ff896285dce
tree: b361b53fbc60223d29c984002d48ce9cede45e28 /lib/pure/algorithm.nim
parent: 720021908db1f6622c1ebcdad60dff5c2740a80b
Speed up complex.pow when the exponent is 2.0 or 0.5 (#23237)
This PR speeds up the calculation of the power of a complex number when
the exponent is 2.0 or 0.5 (i.e. the square and the square root of a
complex number). These are probably two of the most common exponents, if
not the most common. According to my measurements (using the timeit
library), the speedup when the exponent is 2.0 or 0.5 is roughly 5.6x to
7x, while there is no measurable difference for other exponents.
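
To illustrate the idea, here is a minimal sketch of such a fast path. It is not the code from this PR: `fastPow` is a hypothetical helper written only for this explanation, and it assumes the special cases can be detected by comparing the exponent's real and imaginary parts directly.

```nim
import std/complex

# Hypothetical helper, NOT the actual patch: check for the two common
# exponents first and fall back to the general `pow` otherwise.
proc fastPow(x, y: Complex64): Complex64 =
  if y.im == 0.0 and y.re == 2.0:
    result = x * x      # square: a single complex multiplication
  elif y.im == 0.0 and y.re == 0.5:
    result = sqrt(x)    # principal square root from std/complex
  else:
    result = pow(x, y)  # general path, unchanged
```

For example, `fastPow(complex(3.0, 4.0), complex(2.0))` takes the first branch and computes `(3 + 4i) * (3 + 4i)` with one multiplication instead of going through the general formula.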

For the record, this is the function I used to measure the performance:

```nim
import std/complex
import std/sugar  # provides `collect`
import timeit

# Raise every element of `v` to the power `factor`.
proc calculatePows(v: seq[Complex], factor: Complex): seq[Complex] {.noinit, discardable.} =
  result = newSeq[Complex](v.len)
  for n in 0 ..< v.len:
    result[n] = pow(v[n], factor)

# 1000 real-valued test inputs: 0.0, 1.0, ..., 999.0
let v: seq[Complex64] = collect:
  for n in 0 ..< 1000:
    complex(float(n))

echo timeGo(calculatePows(v, complex(1.5)))  # general path
echo timeGo(calculatePows(v, complex(0.5)))  # square root
echo timeGo(calculatePows(v, complex(2.0)))  # square
```

With the original code, this produced:

> [177μs 857.03ns] ± [1μs 234.85ns] per loop (mean ± std. dev. of 7
runs, 1000 loops each)
> [128μs 217.92ns] ± [1μs 630.93ns] per loop (mean ± std. dev. of 7
runs, 1000 loops each)
> [136μs 220.16ns] ± [3μs 475.56ns] per loop (mean ± std. dev. of 7
runs, 1000 loops each)

With the improved code, it produced:

> [176μs 884.30ns] ± [1μs 307.30ns] per loop (mean ± std. dev. of 7
runs, 1000 loops each)
> [23μs 160.79ns] ± [340.18ns] per loop (mean ± std. dev. of 7 runs,
10000 loops each)
> [19μs 93.29ns] ± [1μs 128.92ns] per loop (mean ± std. dev. of 7 runs,
10000 loops each)

That is, the new optimized path is 5.6 times (23 vs 128 μs per loop) to
7.16 times (19 vs 136 μs per loop) faster, while the non-optimized path
takes the same time as the original code.