BlueSmoke - Review : VIA C3 Processor
|Date||Mar 6th, 2002|
|Author||Jin-Wei Tioh|
Flash forward to the present day. The concept of a "value PC" has been blurred, at least in terms of raw CPU power, thanks largely to the AMD Duron. The Celeron is still around, with the Coppermine128 core in this iteration. However, two CPU manufacturers are noticeably missing from the landscape: Cyrix and IDT. Cyrix started out in 1988 producing math co-processors for Intel 286 and 386 systems. They went on to produce the well-received Cyrix 6x86 and 6x86MX processors in collaboration with IBM. The 6x86 line caused quite a stir thanks to its low price and competent integer performance, which translated into good performance in business applications. National Semiconductor acquired Cyrix in 1997, churning out the Cyrix MediaGX and MII chips, both of which were either short-lived or unpopular. You know what happens to a company when its main product line fails. VIA bought Cyrix in 1999 for US$167 million, including the rights to all future Cyrix products. In September of the same year, VIA also acquired Centaur, makers of the WinChip processor, from IDT for a cool US$51 million. This included the rights to the Centaur x86 microprocessor design.
So what did this buying spree produce? The first product out of VIA's doors was the Cyrix III, based on the Cyrix-developed Joshua processor core, around April 2000. With reasonably priced Celerons on the market, the Cyrix III did not impress. Next came a new revision of the Cyrix III, based on the Centaur x86 core, a.k.a. "Samuel". The infamous PR (Performance Rating) system was scrapped in favor of the actual CPU MHz, but unfortunately, the Samuel core performed worse than the original Joshua core running at a lower clock speed. You can just guess how the market received it...
Now, VIA has retooled the Samuel core, producing the new Samuel2-based C3 processor. The Cyrix name has been dropped, which is just as well given its stigma. What is new and how does it fare? Find out.
As mentioned earlier, the original Cyrix Joshua core outperformed the Samuel core despite running at a lower clock speed. Changes were clearly needed, and the result is the new Samuel2 core. While Intel and AMD have yet to move from the current 0.18 micron process to a 0.13 micron process, VIA forged ahead and started production of the C3 on a 0.15 micron process, making the C3 the world's first 0.15 micron x86 CPU.
However, the price for this is a decrease in complexity, with the C3 having a generally simpler internal architecture. The two 64KB L1 caches from the original Samuel core remain, with a new 64KB L2 cache added to the Samuel2. The L2 cache is an exclusive cache, instead of the more conventional inclusive cache architecture. The Pentium III (Coppermine) and the Celeron (Coppermine128) both implement an inclusive L2 cache, meaning that all of the data stored in their L1 cache is duplicated in their L2 cache.
An exclusive cache is the anti-matter of an inclusive cache: it does not duplicate L1 cache data in the L2 cache. The L2 cache now contains only the copy-back cache blocks, i.e. blocks that are designated to be written back to the memory sub-system. What does this mumbo-jumbo mean? Data that doesn't fit in a CPU's L1 cache would normally go back into main system memory if no L2 cache were present, a bad thing considering that system memory has a higher latency (waiting period) than cache memory. So although the C3 only has a 64KB L2 cache, that cache does not contain a copy of the L1 cache, giving an effective total of 192KB of frequently used data (128KB L1 + 64KB L2). The Celeron, having an inclusive L2 cache, duplicates the L1 cache's contents. Because of this, the Celeron actually only has 128KB of cache to store frequently used data.
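The capacity arithmetic above can be written down as a small sketch. This is a simplified model, not a description of either chip's actual cache controller: it assumes an exclusive L2 never duplicates L1 and an inclusive L2 always holds a full copy of L1. The function name is made up for illustration, and the Celeron's 32KB of L1 (16KB instruction + 16KB data) is not stated in this review and is supplied here as an assumption.

```python
def effective_cache_kb(l1_kb: int, l2_kb: int, exclusive: bool) -> int:
    """Distinct data (in KB) that the L1 + L2 hierarchy can hold at once.

    Simplified model: an exclusive L2 never duplicates L1 contents, while an
    inclusive L2 always holds a full copy of whatever is in L1.
    """
    if exclusive:
        return l1_kb + l2_kb
    # Inclusive: everything in L1 is also in L2, so the larger level is the ceiling.
    return max(l1_kb, l2_kb)


# Figures from the review: the C3 has 2 x 64KB of L1 and a 64KB exclusive L2.
c3 = effective_cache_kb(l1_kb=128, l2_kb=64, exclusive=True)

# Celeron (Coppermine128): 128KB inclusive L2. The 32KB L1 figure is an
# assumption added for this example; it does not appear in the review.
celeron = effective_cache_kb(l1_kb=32, l2_kb=128, exclusive=False)

print(f"C3: {c3}KB, Celeron: {celeron}KB")
```

Under these assumptions the sketch reproduces the 192KB and 128KB figures quoted above.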
Another aspect of L2 cache memory is its associativity. The C3 only has a 4-way set associative L2 cache, on par with the Celeron. By comparison, the Pentium III's is 8-way, and both the Duron's and Thunderbird's are 16-way. This is one of the drawbacks of the C3's and Celeron's smaller die sizes. A higher associativity allows for a higher hit rate for the L2 cache, bringing about obvious performance benefits.
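To see why associativity matters, here is a toy set-associative cache with LRU replacement. It is only a sketch: the 32-byte line size and the LRU policy are assumptions, not figures from VIA or Intel, and the access pattern is deliberately contrived (eight addresses that all map to the same set) to show the worst case for a 4-way design.

```python
from collections import OrderedDict


def hit_rate(addresses, cache_bytes, line_bytes, ways):
    """Hit rate of a set-associative cache with LRU replacement."""
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        line = addr // line_bytes
        index = line % num_sets      # which set the line maps to
        tag = line // num_sets       # identifies the line within that set
        entries = sets[index]
        if tag in entries:
            hits += 1
            entries.move_to_end(tag)         # mark as most recently used
        else:
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict the least recently used line
            entries[tag] = True
    return hits / len(addresses)


# Eight addresses spaced 16KB apart, walked 100 times. In a 64KB cache with
# 32-byte lines they all land in set 0 for 4-, 8- and 16-way organisations.
pattern = [k * 16384 for k in range(8)] * 100

for ways in (4, 8, 16):
    rate = hit_rate(pattern, cache_bytes=64 * 1024, line_bytes=32, ways=ways)
    print(f"{ways:2d}-way: hit rate {rate:.2%}")
```

With only four ways, the eight conflicting lines keep evicting each other and the cache never hits; with eight or more ways they all fit in one set and only the first pass misses. Real workloads are nowhere near this extreme, but the direction of the effect is the same.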
VIA did not implement OOO (out-of-order) instruction execution on the C3. OOO execution dynamically reschedules instructions that the compiler cannot statically reorder, letting the CPU work on instructions whose operands are ready instead of stalling behind one that is still waiting. This keeps the CPU's instruction pipelines full and maximizes efficiency and throughput. While the lack of this architectural feature will certainly hurt the C3's performance, it is a necessary sacrifice to achieve its small die size.
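The benefit of out-of-order execution can be illustrated with a toy single-issue machine. This is a deliberately crude model, not how the C3 or the Pentium III pipelines actually work: one instruction may issue per cycle, each has a fixed latency, and the instruction names and latencies are invented for the example.

```python
def run(program, out_of_order):
    """Cycles needed to finish `program` on a toy single-issue CPU.

    program: list of (name, latency, deps); deps are the names of earlier
    instructions whose results this instruction needs.
    """
    done_at = {}            # name -> cycle at which its result is available
    pending = list(program)
    cycle = 0
    while pending:
        # In-order: only the oldest pending instruction may issue.
        # Out-of-order: the oldest *ready* pending instruction may issue.
        candidates = pending if out_of_order else pending[:1]
        for instr in candidates:
            name, latency, deps = instr
            if all(d in done_at and done_at[d] <= cycle for d in deps):
                done_at[name] = cycle + latency
                pending.remove(instr)
                break       # at most one issue per cycle
        cycle += 1
    return max(done_at.values())


# Two independent load/use pairs; the 3-cycle load latency is hypothetical.
program = [
    ("load_a", 3, []),
    ("use_a", 1, ["load_a"]),
    ("load_b", 3, []),
    ("use_b", 1, ["load_b"]),
]

print("in-order cycles:    ", run(program, out_of_order=False))
print("out-of-order cycles:", run(program, out_of_order=True))
```

The in-order machine sits idle while `load_a` completes, whereas the out-of-order machine uses that time to start `load_b`, so the same four instructions finish in fewer cycles.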
Lastly, the VIA C3 runs on the same GTL+ (Gunning Transceiver Logic Plus) bus as the Pentium III / Celeron CPUs, meaning that the C3 is electrically compatible with these Intel CPUs. Thus, it is no surprise that the C3 uses the same socket format as the Pentium III / Celeron. The following table is a technical summary of several processors, including the original Samuel-based Cyrix III.
|Socket-370 CPU Technical Comparison|
|Feature||VIA Cyrix III||VIA C3||Intel Celeron||Intel Pentium III|
|Clock Speed||500 - 667MHz||667 - ?MHz||533 - 1000MHz||500 - 1130MHz|
|L2 Cache Speed||N/A||Core Clock||Core Clock||Core Clock|
|L2 Cache Type||N/A||Exclusive||Inclusive||Inclusive|
|SIMD Support||MMX, 3DNow!||MMX, 3DNow!||MMX, SSE||MMX, SSE|
|Electrical Bus||100 - 133MHz GTL+||100 - 133MHz GTL+||66 - 100MHz GTL+||100 - 133MHz GTL+|
Let's see how the C3 holds up under our barrage of tests...
|CPUs||VIA C3 733MHz; Intel Pentium III 733MHz|
|Motherboard||AOpen AX3S Plus|
|Interface Material||Arctic Silver II|
|Memory||2 x 128MB PC-150 CAS 3 (Kingmax)|
|Hard Drive||Seagate U10 10GB 5400rpm U-ATA 66|
|CD-ROM Drive||AOpen 36x|
|Video Card/s||ABIT Siluro MX400 64MB (default clock - 200/166)|
|Operating System||Windows 2000 Professional (Service Pack 2)|
|Video Drivers||4.13.01.1241 (ver 12.41)|
|Benchmarks||ZDLabs WinBench 99; ZD Business Winstone 2001; ZD Content Creation Winstone 2001; SiSoft Sandra 2001te Professional; 3DMark 2001 Pro; Quake III Arena (Retail) - demo001; Max Payne (Retail)|
|Stability Tests||FreeBSD 4.3 - makeworld; StabilityTest + HotCPU Lite; Ultra-X RAM Stress Test|
For the following results, both the VIA C3 and Intel Pentium III were run on the AX3S Plus using standard parameters (i.e. no overclocking) at 733MHz (5.5 x 133MHz FSB), with the memory at CAS 2.
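For readers who want to check the clock arithmetic, the core clock is simply the multiplier times the front-side bus speed. The sketch below assumes the nominal "133MHz" bus is really 133.33MHz (400/3), which is how a 5.5x multiplier lands on 733MHz rather than 731.5MHz; the function name is made up for illustration.

```python
def core_clock_mhz(multiplier: float, fsb_mhz: float) -> float:
    """Core clock = multiplier x front-side bus clock."""
    return multiplier * fsb_mhz


# Assumption: the nominal 133MHz FSB is exactly 400/3 MHz.
clock = core_clock_mhz(5.5, 400 / 3)
print(f"{clock:.1f}MHz")
```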
[Chart: VIA C3 733 vs. Pentium III 733, both on AOpen AX3S Plus / i815E / 133MHz / SDRAM]
Strictly speaking, CPUMark doesn't just test the integer performance of a CPU. Rather, it measures the efficiency of the CPU coupled with the memory subsystem (especially cache) in performing integer operations. This is the area where the VIA C3 is supposed to shine. However, something seems to be holding it back, and we can only surmise the main culprit is the C3's L2 cache. As we will see in the next section, it seems that the C3's L2 cache is not too efficient, which hampers its overall performance.
On the other hand, FPUMark can be taken as an indication of a CPU's FPU (floating point unit) performance, since most floating point operations are not L2 cache intensive. As can be observed in the graph, FPU performance is the C3's Achilles heel, as it always was with the older Cyrix and WinChip CPUs. This hardly comes as a surprise, since the C3 is derived from the Centaur x86 core, a.k.a. Samuel. Moreover, VIA is targeting the C3 at business and value users, who mainly use internet and productivity applications. These applications rarely, if ever, use floating point or MMX instructions. Thus, a weaker FPU would not be of much consequence in this scenario.
SiSoft Sandra 2001te's memory benchmark tests a system's memory subsystem, i.e. the interplay between the CPU, chipset and memory. Sandra's memory benchmark is based on the C STREAM memory benchmark by John McCalpin. STREAM measures sustained memory bandwidth (not burst or peak), and has been used to benchmark all types of systems, from personal computers to supercomputers. Sandra's version basically uses dynamic data (roughly 50% of physical system RAM) in lieu of STREAM's static data, coupled with aggressive scheduling of instructions to maximize memory throughput. Since two variables have been held constant (the chipset and memory), the difference in results between the C3 and the Pentium III depends only on the CPU, especially its caches. The C3 is no match for the Pentium III's 256-bit ATC (advanced transfer cache) design. The Pentium III's integer score of 356MB/s is twice that of the C3. Interestingly, however, the performance delta drops significantly in the floating point STREAM test.
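For a rough idea of what a STREAM-style test does, the sketch below times the "triad" kernel (a[i] = b[i] + scalar * c[i]) and converts the bytes touched into MB/s. It is an illustration only and is neither Sandra's nor McCalpin's code: the function names are made up, the array size is an arbitrary choice, and because the loop is interpreted Python the number it prints reflects interpreter overhead far more than real memory bandwidth. What it does show is how the bandwidth figure is accounted for.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """STREAM-style triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def measure_triad_mb_s(n=200_000, repeats=3, scalar=3.0):
    """Best-of-N sustained rate for the triad kernel, in MB/s (1MB = 10^6 bytes)."""
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(a, b, c, scalar)
        best = min(best, time.perf_counter() - start)
    bytes_moved = 3 * n * a.itemsize   # two arrays read, one array written
    return bytes_moved / best / 1e6


print(f"triad: {measure_triad_mb_s():.1f} MB/s (interpreter-bound, illustrative only)")
```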
Another integral subsystem is the storage subsystem. This is the brightest spot in the C3's theoretical performance, as it stays neck and neck with the Pentium III in both the Business and High-End Disk WinMarks, despite the latter's more advanced architecture. This effectively opens up some avenues for the C3, such as being used in an entry-level webserver or fileserver. Here, storage access speed and memory are the main bottlenecks. The CPU in such systems basically just shuttles data back and forth between the client and the server. There is also the C3's ridiculously low power consumption and heat output. This allows it to be run with just a passive cooling solution, even in a tight enclosure such as a 1U rackmount case. The direct cost savings (less power needed to run the server) and indirect cost savings (e.g. a lessened need for air-conditioning) make this a perfectly feasible and appealing platform that many would want to consider. Just ask any Californian web hosting company.
Ziff Davis' older Winstone benchmarks used to be the industry's de facto standard for systems benchmarking. However, the results obtained were based on obsolete applications such as Microsoft Office 97 and thus offered little value. Now, Ziff Davis has updated its aging Business Winstone 2000 and Content Creation Winstone 2000, replacing them with the 2001 versions. Both measure overall, real world system performance. Business Winstone 2001 does so by seeing how fast a system can finish a set of common business tasks using Microsoft Word 2000, Excel 2000, Access 2000, FrontPage 2000, PowerPoint 2000, Project 98, Norton AntiVirus, Netscape Communicator 4.7, Lotus Notes R5 and Nico Mak WinZip. Content Creation Winstone 2001 does much the same using more intensive applications such as Macromedia Dreamweaver 3.0, Macromedia Director 8.0, Adobe Photoshop 5.5, Adobe Premiere 5.1, Sonic Foundry Sound Forge 4.5 and Netscape Navigator 4.73.
All applications are run on a multi-tasking basis. The whole benchmark suite is run five times (highest score taken), with automatic hard drive defragmentation and a reboot between each run. This effectively keeps the margin of variation to within 3%, something very welcome in a benchmark. The final score indicates how much faster the system is compared to ZD's base system, i.e. a score of 50 indicates the system is 5 times faster.
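The scoring scheme described above boils down to a little arithmetic. In the sketch below, the base score of 10 is inferred from the statement that a score of 50 means "5 times faster"; the five run scores are made-up numbers for illustration and are not results from this review.

```python
BASE_SCORE = 10.0  # assumption inferred from "a score of 50 = 5 times faster"


def speedup_vs_base(score):
    """How many times faster than ZD's base system a given score represents."""
    return score / BASE_SCORE


def summarize(runs):
    """Return the reported score (the highest run) and the run-to-run spread."""
    best = max(runs)
    spread = (best - min(runs)) / best
    return best, spread


# Hypothetical scores from five runs of the suite (not measured data).
runs = [24.1, 24.6, 24.3, 24.5, 24.4]
best, spread = summarize(runs)

print(f"reported score: {best}, spread between runs: {spread:.1%}")
print(f"relative to base system: {speedup_vs_base(best):.2f}x")
```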
As with the theoretical CPU benchmarks shown earlier, the 733MHz C3 also trails the 733MHz Pentium III on both test suites. The Pentium III leads by 30% on the Business Winstone, with the performance delta widening to 50% on the more demanding Content Creation Winstone. VIA has made it very clear that "the C3 processor is targeted at the value segment of the desktop and notebook PC market. It is therefore designed to run the type of applications most commonly used by people who purchase Value PC systems". We can imagine that this would include mainstream productivity applications such as Microsoft Office and internet applications such as browsers, e-mail clients, etc. (basically the applications used in Business Winstone 2001). We were therefore a little disappointed to see such a large performance delta between the C3 and the Pentium III. However, this was guaranteed to happen right from the start, since the C3 was designed to compete with Intel's value solution, the Celeron 2. The Celeron is just a Pentium III with half of its L2 cache disabled. In doing so, however, Intel halves the associativity of the Celeron's L2 cache, which severely impacts its performance, as numerous other sites have shown. While the 733MHz Celeron will definitely perform worse than its Pentium III counterpart, we can only speculate how it will measure up against the C3.
The VIA C3 lags significantly behind the Pentium III in 3DMark 2001. Glancing at the C3's Quake III Arena and Max Payne results, one can clearly see that the games are CPU constrained at 640x480 and 1024x768, with only a 2fps change in framerate between the two resolutions. The C3 only catches up to the Pentium III in Quake III Arena at 1600x1200. In this case, however, the ABIT Siluro MX400 (a 64MB GeForce2 MX400) becomes the limiting factor rather than raw CPU power. One interesting note is that the final scene of Max Payne is very intensive on the system, stressing both the graphics card and the CPU. Even the Pentium III, with its vastly superior FPU (floating point unit) power, only scores double the C3's result at 640x480.
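The "CPU constrained" versus "graphics card constrained" behaviour can be modelled very roughly by taking the lower of two ceilings: what the CPU can feed and what the graphics card can draw. The sketch below is a simplification with entirely hypothetical numbers (the CPU caps and the pixel throughput are invented, not measurements from this review); it only shows why a slow and a fast CPU converge at high resolutions.

```python
def frame_rate(cpu_fps_cap, gpu_pixels_per_s, width, height):
    """Frame rate as the lower of the CPU ceiling and the GPU ceiling."""
    gpu_fps_cap = gpu_pixels_per_s / (width * height)
    return min(cpu_fps_cap, gpu_fps_cap)


# Hypothetical figures for illustration only.
GPU_PIXELS_PER_S = 60e6
SLOW_CPU_CAP = 40.0   # frames/s a slower CPU can prepare
FAST_CPU_CAP = 80.0   # frames/s a faster CPU can prepare

for width, height in [(640, 480), (1024, 768), (1600, 1200)]:
    slow = frame_rate(SLOW_CPU_CAP, GPU_PIXELS_PER_S, width, height)
    fast = frame_rate(FAST_CPU_CAP, GPU_PIXELS_PER_S, width, height)
    print(f"{width}x{height}: slow CPU {slow:.1f}fps, fast CPU {fast:.1f}fps")
```

In this model the slower CPU's framerate barely moves between 640x480 and 1024x768 because the CPU is the ceiling, while at 1600x1200 both CPUs hit the same graphics card ceiling.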
The cache size and clock speed of the VIA C3 are fine by any standards. So what is constraining the C3's potential in 3D gaming? Based on what we've seen earlier in the CPU and subsystem benchmarks, it is most likely a combination of two factors: (1) the C3's weak FPU and (2) its relatively inefficient cache. A CPU's cache and FPU performance are integral factors in 3D gaming performance. In fact, turning off the cache on any CPU will instantly result in a drastic drop in overall performance. At any rate, although its performance could have been better for a 733MHz chip, the C3 does provide somewhat decent framerates in less intensive games, probably enough for the very casual gamer.
The VIA C3 processor is somewhat of a love-hate affair. It provides very decent business application performance, but only passable floating point and 3D gaming performance. 2D games such as Sid Meier's Civilization III, and lighter-duty 3D games such as The Sims, will run just fine. Other distinct advantages are the C3's low cost, low power consumption and compatibility with many existing Socket-370 motherboards. The emerging trend in the market (even the enthusiast market to some extent) is the so-called "silent PC", either a "traditional" full desktop unit or a small, highly-integrated solution aimed at the living room. Barring drastic solutions such as water cooling or running one's CPU in a fridge, the only real option is the C3, which in our tests was able to run just fine even without a heatsink, even when overclocked. The potential for its use is huge: as a DVD decoder, an in-car computer, a portable LAN box, a laptop CPU, or even simply as a cheap, entry-level family computer. What about the Transmeta Crusoe? The reaction of Japanese consumers (who sent C3s flying off the shelves) should be a testament to the promises that Transmeta's much-hyped Crusoe did not meet, but on which the C3 delivers.
Particularly interesting is the C3's storage performance, where there is practically no decrease compared to an equivalent Intel CPU. Coupled with its aforementioned low power consumption and correspondingly low heat output, the C3 becomes an extremely attractive option for entry-level fileservers and webservers. The potential cost savings (less cooling, lower system cost and lower power consumption) are hard to resist in a server farm.
In conclusion, it basically boils down to what the end-user is looking for. If a quiet or low power consumption system is a priority, the C3 would definitely be a step in the right direction, meeting those goals whilst providing very decent applications performance. However, those simply looking for raw performance might want to look elsewhere, though to be fair, this is not the target market of the C3. Highly Recommended!