For my diploma thesis, I have implemented Satoh’s algorithm for counting points on elliptic curves over finite fields of small characteristic in C++ (using NTL and GMP).
$\mathrm{GF}(23^{37}) = \mathrm{GF}(23)[X]/(F(X))$, $F(X) = 22 + X^2 + X^{37}$
$E/\mathrm{GF}(23^{37})$: $y^2 = x^3 + ax + b$
$a = 1 + X + X^2 + 19X^4 + (F(X))$
$b = 2 + 3X + 4X^2 + 7X^5 + (F(X))$
$\#E(\mathrm{GF}(23^{37})) = 11 \times 47 \times 67 \times 37313 \times q$
$q = 187285613829952805985761367930316131263729$

A combination of Satoh’s algorithm and an “early-abort strategy” was used to find cryptographically strong curves (with prime order) for all characteristics up to 1999 (representation). These results may also be the current point-counting records for elliptic curves over finite fields $\mathrm{GF}(p^n)$ with $5 \le p < 2000$ and $n \gg 1$.
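
As a quick plausibility check, the group order quoted above for the curve over GF(23^37) can be compared with Hasse's bound, |p^n + 1 - #E| ≤ 2·sqrt(p^n). The following is a minimal sketch in Python (not the C++/NTL code of the thesis); the helper name `hasse_holds` is made up for this illustration, and the numbers are copied from the example.

```python
# Data copied from the example: the field GF(23^37) and the stated
# factorisation of the group order.
p, n = 23, 37
small_factors = [11, 47, 67, 37313]
q = 187285613829952805985761367930316131263729


def hasse_holds(field_size: int, order: int) -> bool:
    """Return True if |field_size + 1 - order| <= 2 * sqrt(field_size).

    hasse_holds is a hypothetical helper name. The comparison is done in
    squared form so that only exact integer arithmetic is needed.
    """
    trace = field_size + 1 - order      # trace of Frobenius
    return trace * trace <= 4 * field_size


# Rebuild the group order from its factors.
order = q
for factor in small_factors:
    order *= factor

field_size = p ** n
print("field size (bits) :", field_size.bit_length())
print("group order (bits):", order.bit_length())
print("Hasse bound holds :", hasse_holds(field_size, order))
```

This only checks that the order lies in the Hasse interval; it does not verify the point count itself or the primality of q.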
The required modular polynomials have to be provided separately. For 367 ≤ p < 1000, I gratefully used the coefficients published by Andrew Sutherland. For p ≥ 1009, the coefficients were generated using PARI/GP (modulo a power of p).
In addition to (classical) modular polynomials, the implementation also relies on division polynomials, which can be partially precomputed (based on the work of James McKee). These computations were done for the first 301 primes greater than 3.
If you are interested in running times, you may follow these links (ordered by speed), where n is always the smallest prime such that $p^n > 2^{512}$ (comparable to AES-256).

| GMP | NTL | CPU | Speed-up (vs. previous config) |
|---|---|---|---|
| 5.0.1 | 5.5.2 | 2.53 GHz Intel Core 2 Duo P8700 | |
| 5.1.1 | 6.0.0 | Intel Core i5-3427U (up to 2.8 GHz) | 2.14× |
| 6.0.0 | 8.1.0 | Intel Xeon E5-2666 v3 (up to 3.5 GHz) | 1.53× |
| 6.2.1 | 11.4.3 | Intel Celeron N4100 (up to 2.4 GHz) | 1.05× |
| 5.1.3 | 6.0.0 | Intel Core i7-4770 (up to 3.9 GHz) | 1.06× |
| 6.2.0 | 11.4.3 | Intel Xeon E5-2667 v3 (up to 3.6 GHz) | 2.09× |
| 6.2.0 | 11.4.3 | Intel Core i7-5775R (up to 3.8 GHz) | 1.07× |
| 6.2.1 | 11.5.1 | 2.6 GHz AWS Graviton3 | 1.15× |
| 6.2.0 | 11.4.3 | Intel Xeon Platinum 8151 (up to 4.0 GHz) | 1.06× |
| 6.2.1 | 11.5.1 | Intel Xeon Platinum 8252C (up to 4.5 GHz) | 1.09× |
| 6.3.0 | 11.5.1 | AMD Ryzen 7 3700X (up to 4.4 GHz) | 1.05× |
| 6.3.0 | 11.6.0 | 2.8 GHz AWS Graviton4 | 1.04× |
| 6.3.0 | 11.5.1 | AMD Ryzen 7 PRO 5750G (up to 4.6 GHz) | 1.08× |
| 6.3.0 | 11.6.0 | AMD Ryzen 9 5900XT (up to 4.8 GHz) | 1.12× |
| 6.3.0 | 11.5.1 | Apple M2 | 1.23× |
| 6.3.0 | 11.6.0 | AMD EPYC 9R05 (up to 5 GHz) | 1.07× |
| 6.3.0 | 11.5.1 | Apple M4 | 1.18× |

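
Because each speed-up in the table is relative to the configuration directly above it, the overall gain over the first configuration is the product of the factors. The short Python sketch below does this arithmetic; the factors are copied from the table, the function name `cumulative_speedups` is made up for this illustration, and the first row is assumed to be the baseline (factor 1).

```python
# Speed-up factors copied from the table; each is relative to the row above.
# Assumption: the first row has no factor in the table and is the baseline.
rows = [
    ("Intel Core 2 Duo P8700", 1.00),
    ("Intel Core i5-3427U", 2.14),
    ("Intel Xeon E5-2666 v3", 1.53),
    ("Intel Celeron N4100", 1.05),
    ("Intel Core i7-4770", 1.06),
    ("Intel Xeon E5-2667 v3", 2.09),
    ("Intel Core i7-5775R", 1.07),
    ("AWS Graviton3", 1.15),
    ("Intel Xeon Platinum 8151", 1.06),
    ("Intel Xeon Platinum 8252C", 1.09),
    ("AMD Ryzen 7 3700X", 1.05),
    ("AWS Graviton4", 1.04),
    ("AMD Ryzen 7 PRO 5750G", 1.08),
    ("AMD Ryzen 9 5900XT", 1.12),
    ("Apple M2", 1.23),
    ("AMD EPYC 9R05", 1.07),
    ("Apple M4", 1.18),
]


def cumulative_speedups(rows):
    """Return (cpu, speed-up relative to the first row) for every row."""
    total = 1.0
    result = []
    for cpu, factor in rows:
        total *= factor              # chain the per-row factors
        result.append((cpu, total))
    return result


for cpu, total in cumulative_speedups(rows):
    print(f"{cpu:28s} {total:6.2f}x")
```

The published factors are rounded to two decimals, so the chained values are only approximate. The rows also differ in GMP and NTL versions, so the totals mix hardware and software effects.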
Yes, the single-core performance of Apple's M4 is really impressive. See also hashcat benchmarks. Moreover, software matters: for example, the Intel Celeron N4100 with modern libraries (2020) matches the much faster Intel Xeon E5-2666 v3 with older libraries (2014-2015).
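
For reference, the extension degree used in the timings ("the smallest prime n such that p^n > 2^512") can be reproduced with a few lines of Python. This is only a sketch of that selection rule, not part of the C++ implementation; the names `is_prime` and `smallest_prime_degree` are made up for this illustration.

```python
def is_prime(m: int) -> bool:
    """Trial division; sufficient for the small degrees involved here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def smallest_prime_degree(p: int, bits: int = 512) -> int:
    """Return the smallest prime n with p**n > 2**bits (hypothetical helper)."""
    bound = 1 << bits
    n = 2
    # p**n grows with n, so the first prime n that exceeds the bound is minimal.
    while not (is_prime(n) and p ** n > bound):
        n += 1
    return n


# Print the degree for a few characteristics in the range covered above.
for p in (5, 23, 1999):
    print(p, smallest_prime_degree(p))
```

Exact integer arithmetic is used for the comparison, so no rounding issues from logarithms can affect the choice of n.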