A1222 Guide: Performance and Benchmarking

How do we gauge performance on the A1222 against other systems?

Why on Earth I'm bothering with this I don't know, but people seem interested nonetheless. The truth is that most benchmarks are fairly worthless: this one favors this CPU architecture, that one favors that graphics card, the other is misleading when using certain memory types, and so forth. The only benchmark that really matters is how your workload runs on your system with your configuration. Everything else is noise. That said, there are a few benchmarking tools that we tend to run on our NG Amiga systems to compare them with one another, and some can be useful in determining overall performance trends. The systems I'll be comparing against are the AmigaOne X5000/20, the SAM460EX, and the Pegasos II.

Each benchmark listed here was tested on an original Tabor beta board-based A1222 build using the latest AmigaOS beta components as of mid-2024. It's beta. Beta. BETA!!!! That means that the final numbers may be similar or wildly different. The whole reason for things like NDAs is to prevent stuff like this from coloring people's expectations before release, either for good or for bad. If anyone starts quoting this page on the forums, I'll send the Golem of Prague after them. Don't think I can't.

CPU Benchmarks

The tests listed below are designed to characterize CPU computation performance and memory bandwidth rather than any sort of real-world workload. Just like anything else on the A1222, if a test uses PowerPC floating-point instructions and data, the results will be affected by the FPU emulation layer. For example, RageMem uses float64 datatypes for its 64-bit operations, so those numbers would be misleading. For that reason, tests like the Whetstone benchmark, LAME MP3 conversion, and the 64-bit read/write tests from RageMem have been removed from the table below for now, but I can make the raw data available to those genuinely interested.

Test                          Units     A1222       X5000       SAM460      Pegasos 2
RageMem
  L1 cache read (32-bits)     MB/sec    4419        7537        4386        3809
  L1 cache write (32-bits)    MB/sec    4420        7538        4362        3265
  L2 cache read (32-bits)     MB/sec    1557        4288        1033        1719
  L2 cache write (32-bits)    MB/sec    983         5022        448         1545
  RAM read (32-bits)          MB/sec    587         677         277         139
  RAM write (32-bits)         MB/sec    905         1469        448         211
SortBench
  1K elements                 MB/sec    2225.08     3729.67     1700.54     1859.10
  2K elements                 MB/sec    2210.70     3755.91     1706.85     1860.45
  4K elements                 MB/sec    2214.75     3763.17     1697.80     1867.98
  8K elements                 MB/sec    2211.39     3748.58     1650.10     1863.68
  16K elements                MB/sec    1399.56     3380.38     472.58      1723.76
  32K elements                MB/sec    1266.79     3277.59     390.23      1644.92
Dhrystone 2.1
  Dhrystones per second                 1205265.9   1959940.6   1002608.4   947880.0
  Dhrystones MIPS             DMIPS     685         1115        570         539
Sieve of Eratosthenes (ten iterations, array size 8191)
  Average iteration runtime   seconds   0.005267    0.002744    0.011243    0.016283
  Maximum MIPS result         MIPS      1172.9      1973.0      873.2       1145.4
  Minimum MIPS result         MIPS      180.2       213.2       80.7        44.8
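
The DMIPS row appears to follow from the Dhrystones-per-second row by the usual convention of dividing by 1757, the Dhrystone score of the VAX 11/780 reference machine. Here is a minimal Python sketch of that conversion. It assumes that convention, and it assumes the table truncates rather than rounds, which is what the figures above are consistent with; the names are purely illustrative.

```python
# Conventional DMIPS conversion: divide by the VAX 11/780 score of
# 1757 Dhrystones per second. This is an assumption about how the
# table was produced, not something the benchmark output confirms.
VAX_11_780_DHRYSTONES_PER_SEC = 1757


def dmips(dhrystones_per_sec: float) -> float:
    """Convert a raw Dhrystones/sec score to Dhrystone MIPS."""
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC


# Raw scores copied from the table above.
results = {
    "A1222": 1205265.9,
    "X5000": 1959940.6,
    "SAM460": 1002608.4,
    "Pegasos 2": 947880.0,
}

for system, score in results.items():
    # int() truncates, matching the whole-number DMIPS figures in the table.
    print(f"{system:<10} {int(dmips(score))} DMIPS")
```

The same function lets you drop in your own Dhrystone score and see where your system would land in the table.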


Disk Benchmarks

Back in the Disk Options section of this guide we discussed disk hardware and filesystem choices. Now I'd like to compare the relative performance of disk I/O between my A1222 and my X5000. For this test the same Samsung SSD was used, with the same partitioning scheme and the same filesystem -- NGFS\01. While I'd like to do some real-world testing here, like streaming video and file copies of various sizes, I don't have the instrumentation to capture results accurately, so like before, we'll use DiskSpeed 4.5 for our tests. I have no idea if FPU emulation impacts the results, but it hardly matters, since in the end that's how it would work on our A1222s anyway.

Some of these numbers seem, well, a little off. But generally we would expect the A1222 to show less performance than the flagship X5000, and that's what it does show. The p1022sata.device driver probably has many areas where improvements could be made; my understanding is that it is based on the Freescale Linux driver, and it may be possible to take better advantage of ExecSG functionality. No idea. In real-world use the slower disk I/O can show, especially when loading many tiny files, such as AISS images or games with tons of image and sound assets. With other applications you don't notice it at all. The point is that the A1222 disk I/O has quite a bit of room for software improvement; there is no reason for some of these numbers to be as low as they are.

After a request from a reader, I have added results from my Pegasos 2 system, but I don't consider them of much use. With the A1222 and X5000 we are comparing apples to apples, i.e., same SSD, same filesystem, same interface. The Pegasos 2 is so old that the drive is mechanical, not solid-state; it is IDE-based, not SATA2; and the filesystem is SFS\0, not the newer NGFS\0. Nonetheless it is a typical setup for older, pre-PCIe systems, so maybe it is useful. You be the judge.

Test                          Units      A1222       X5000       Pegasos 2
Directory manipulation speed
  File create                 files/sec  2894        4109        3872
  File open                   files/sec  36.57K      72.24K      23.61K
  Directory scan              files/sec  174.89K     297.27K     138.71K
  File delete                 files/sec  6.04K       14.45K      0.63K
  Seek/read                   seeks/sec  294.30K     532.19K     80.02K
File operations
  512B buffer: create file    MB/sec     18.00       18.18       9.08
  512B buffer: write file     MB/sec     26.63       45.39       30.71
  512B buffer: read file      MB/sec     121.94      223.65      52.77
  4KB buffer: create file     MB/sec     40.34       24.90       5.80
  4KB buffer: write file      MB/sec     150.19      323.09      248.40
  4KB buffer: read file       MB/sec     364.36      785.62      111.47
  32KB buffer: create file    MB/sec     49.81       27.61       7.43
  32KB buffer: write file     MB/sec     367.67      1170        584.35
  32KB buffer: read file      MB/sec     346.45      1100        156.83
  256KB buffer: create file   MB/sec     47.66       27.69       30.34
  256KB buffer: write file    MB/sec     315.81      1620        23.81
  256KB buffer: read file     MB/sec     338.97      1100        15.09
  1MB buffer: create file     MB/sec     48.88       26.88       30.38
  1MB buffer: write file      MB/sec     327.50      959.12      23.38
  1MB buffer: read file       MB/sec     340.38      1000        64.38
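
To give a feel for what a "buffer size" row measures, here is a rough Python sketch of a write-throughput test that pushes a fixed amount of data to a temporary file in chunks of a given size. This is not how DiskSpeed itself is implemented; the function name, the 1 MiB default total, and the choice of 1 MB = 1,000,000 bytes are all my own assumptions for illustration.

```python
import os
import tempfile
import time


def write_throughput_mb_s(buffer_size: int, total_bytes: int = 1024 * 1024) -> float:
    """Write roughly total_bytes in buffer_size chunks; return MB/sec.

    Hypothetical helper for illustration only, not DiskSpeed's method.
    """
    chunk = b"\0" * buffer_size
    chunks = max(1, total_bytes // buffer_size)
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        # buffering=0 so each write() call goes straight to the OS,
        # making the chunk size actually matter.
        with os.fdopen(fd, "wb", buffering=0) as f:
            for _ in range(chunks):
                f.write(chunk)
            os.fsync(f.fileno())  # include the flush to disk in the timing
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (chunks * buffer_size) / elapsed / 1e6  # assumes 1 MB = 10^6 bytes


for size in (512, 4096, 32768):
    print(f"{size:>6} byte buffer: {write_throughput_mb_s(size):.2f} MB/sec")
```

Small buffers mean many more calls into the filesystem for the same amount of data, which is one plausible reason the 512B rows sit so far below the larger ones on every system.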

I also did some experiments back in 2021 on NGFS\00 and SFS filesystems on the A1222 and X5000 that may be of interest. Compared to the table above they show either the maturation of NGFS over time, the flakiness of DiskSpeed, both, or neither. I'm not trying to be coy; I'm just presenting the numbers as I measured them. That older data is available for review. Oh, and those experiments were with DiskSpeed 4.3, an older version.

Graphics Benchmarks

We have a few graphics tools we can use for benchmarking, and for starters we will use actual benchmarking tools that are well-known. Later I hope to add real-world examples like Blender rendering times, ShaderJoy average FPS numbers, timed demo levels in Quake, etc. Those are far more interesting than the tests below, especially since there is a high-end RadeonHD card in the X5000 and a low-end RadeonRX card in the A1222. The Pegasos II results are with a Radeon 9000 Pro AGP card.

Test                          Units       A1222       X5000       SAM460      Pegasos 2
SDLbench
  320x240 SW: Slow points     frames/sec  14.4        29.2        5.3         11.8
  320x240 SW: Fast points     frames/sec  275.9       483.0       164.0       266.7
  320x240 SW: Rect fill       rects/sec   28643.4     59362.3     27489.9     20277.2
  320x240 SW: 32x32 blits     blits/sec   66064.5     157538.0    53194.8     56888.9
  320x240 HW: Slow points     frames/sec  444.4       800.0       363.6       288.6
  320x240 HW: Fast points     frames/sec  267.8       536.7       222.4       114.6
  320x240 HW: Rect fill       rects/sec   45011.0     68266.7     69423.7     3150.0
  320x240 HW: 32x32 blits     blits/sec   11804.0     35008.5     16516.1     3150.0
  640x480 SW: Slow points     frames/sec  1.9         3.8         1.3         1.5
  640x480 SW: Fast points     frames/sec  68.9        121.3       47.4        66.7
  640x480 SW: Rect fill       rects/sec   8359.2      18367.7     4740.7      6370.1
  640x480 SW: 32x32 blits     blits/sec   48761.9     157538.0    50567.9     56109.6
  640x480 HW: Slow points     frames/sec  105.3       216.2       93.0        54.4
  640x480 HW: Fast points     frames/sec  66.8        134.7       55.9        29.2
  640x480 HW: Rect fill       rects/sec   30117.6     50567.9     55531.4     11702.0
  640x480 HW: 32x32 blits     blits/sec   11160.8     35617.4     16384.0     3150.0
GfxBench2D
  Overall score                           6743.22     6678.69     n/a         1711.32
  WritePixelArray             MB/sec      396.04      923.11      n/a         174.1
  ReadPixelArray              MB/sec      332.31      895.61      n/a         42.6
  copyToVRAM                  MB/sec      147.02      534.54      n/a         129.9
  copyFromVRAM                MB/sec      6.69        34.58       n/a         46.8


Memory Benchmarks

After reading a thread on amigans.net I realized I didn't have anything that highlighted memory throughput other than the RageMem numbers above -- and those are limited, because the 64-bit tests use floats and so do little more than exercise the FPU emulation. So I've gone ahead and used the STREAM benchmark for this purpose, and you can grab a copy on OS4Depot to compare against your systems. At some point I'll dig up my old Pegasos II from my basement storage and test there as well. Here we compare the A1222 using native floating point via the SPE unit, the A1222 using FPU emulation, and the X5000/20:

Test    Units     A1222 (SPE)   A1222 (emulated)   X5000
Copy    MB/sec    778.2         777.6              1173.8
Scale   MB/sec    479.0         149.4              1081.1
Add     MB/sec    576.1         155.8              1436.7
Triad   MB/sec    549.7         149.2              1433.7
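
For reference, the four rows above correspond to the four STREAM kernels. Here is a rough Python sketch of what each one does per element, along with the bytes and floating-point operations the standard benchmark attributes to it, assuming 8-byte doubles. The real benchmark is written in C and works on much larger arrays; the function and dictionary names here are just illustrative.

```python
def stream_kernels(a, b, c, scalar=3.0):
    """Run the four STREAM kernels once, in order, over lists a, b, c."""
    n = len(a)
    for i in range(n):  # Copy:  c = a             (no arithmetic)
        c[i] = a[i]
    for i in range(n):  # Scale: b = scalar * c    (one multiply)
        b[i] = scalar * c[i]
    for i in range(n):  # Add:   c = a + b         (one add)
        c[i] = a[i] + b[i]
    for i in range(n):  # Triad: a = b + scalar*c  (multiply and add)
        a[i] = b[i] + scalar * c[i]


# Bytes moved per element with 8-byte doubles: Copy and Scale touch
# two arrays, Add and Triad touch three.
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}
FLOPS_PER_ELEMENT = {"Copy": 0, "Scale": 1, "Add": 1, "Triad": 2}

a, b, c = [1.0] * 4, [2.0] * 4, [0.0] * 4
stream_kernels(a, b, c)
print(a, b, c)
```

Copy is the only kernel with no floating-point arithmetic at all, which would be consistent with it being the one row where the SPE and emulated results are essentially identical.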

There are a few things to keep in mind. Firstly, I'm not using the fastest known SO-DIMM in my A1222, so your mileage may vary. Secondly, although this benchmark is designed to demonstrate relative memory bandwidth, it shows that even for well-behaved code, if it hits floating-point instructions on the A1222 and is not native SPE code, you're going to notice it. Thirdly, with well-behaved code, the impact is substantial but not nearly as bad as one might think. If your tool uses floating-point math sparingly, the A1222 is perfectly usable. If your tool is virtually nothing but floating-point code, such as Heretic II, for example, and isn't SPE-native, don't bother.
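
To put a number on "substantial", here is a quick Python sketch that divides the SPE figure by the emulated figure for each STREAM kernel. The values are copied straight from the table above; the helper name is just for illustration.

```python
# (SPE MB/sec, emulated MB/sec) for the A1222, from the STREAM table above.
stream_mb_s = {
    "Copy": (778.2, 777.6),
    "Scale": (479.0, 149.4),
    "Add": (576.1, 155.8),
    "Triad": (549.7, 149.2),
}


def slowdown(spe: float, emulated: float) -> float:
    """How many times slower the emulated run is than the SPE run."""
    return spe / emulated


for kernel, (spe, emulated) in stream_mb_s.items():
    print(f"{kernel:<6} {slowdown(spe, emulated):.1f}x")
```

That works out to roughly 3.2x to 3.7x slower for the three kernels that do floating-point arithmetic, and no measurable difference for Copy.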

Last updated: 25.06.24