The Cores

From Intel : Pentium series

P5 (0.8)

The original Pentium core. It had two integer pipelines with arcane pairing rules and a decent floating-point core, but it topped out at 66MHz (synchronous with the bus) and ran very hot, and the earliest ones suffered from the famed FDIV bug. It used Intel's BiCMOS process.

P54 (0.6)

The second revision of the Pentium. A bit smaller, ran a bit cooler and accordingly could be run at up to 100MHz. The FDIV bug was fixed; this chip also used BiCMOS.

P54 (0.35/0.6)

According to Intel's site, there was a special revision of the Pentium used only at the 120MHz clock speed, which used BiCMOS technology but built the CMOS in a 0.35u process. If anyone knows more about this, please email me.

P54C

Intel's last classic-Pentium core, this was built in pure 0.35u CMOS, and ran at up to 200MHz.

P55C

The Pentium MMX core. This fixed several design problems in the previous Pentium cores - most important for performance, it had branch prediction technology taken from the P6 which actually predicted branches effectively. Coupled with enlarged L1 caches and MMX instructions, this produced a significant performance increase. Intel also took the opportunity to fix the F00FC7C8 bug in hardware, but since modern OSes have an F00FC7C8 fix in software this is not important.

P6 series (PPro, P2, P3)

The P6 core, introduced in November 1995, took everyone by surprise. It's essentially a fast RISC chip fed by decoders which convert the x86 instructions into simpler RISC instructions; it has three integer pipelines and a very good floating-point core. Moreover, it used a long pipeline, effective branch prediction, and (most importantly) out-of-order execution and register renaming to produce an extremely fast system; at the time it beat all existing RISC chips in the SpecINT benchmarks.

Pentium Pro

The P6 core was used in the Pentium Pro series of CPUs, which coupled it with one or two custom-made very fast SRAM chips, mounted in the same ceramic package.

Klamath

The Pentium Pro package was extremely expensive for Intel to manufacture; accordingly, in May 1997, they released the first P2 chips. These used the Klamath core, which was a P6 core with MMX added and various tweaks made to increase performance in 16-bit code, packaged in a large cartridge containing commodity SRAM chips on a standard PCB.

Deschutes

Deschutes is an enhanced Pentium 2 core - one or two extra instructions were added, along with support for a 100MHz front-side bus, and the higher clock speeds made possible by a shrink to a 0.25u process. The cache was usually 512k of commodity SRAM running at half the core speed; this core was later repackaged, using larger amounts of Intel-manufactured custom SRAM running at the full core speed and with four-way multiprocessing, as the Pentium 2 Xeon.

Covington

The Covington core, based on Deschutes, was brought out in a great hurry, in order to provide the original Celeron to compete with AMD's K6 at the low end of the market. It was shipped without external cache, but it's not clear whether the cache interface was deleted or simply ignored.

Mendocino

By mid-1998, the 0.25u process had matured enough that it became practical to put medium-sized caches on the same die as the core. Intel released the CeleronA series of chips, which used a die containing a Deschutes-derived core and 128k of fast L2 cache SRAM. These were sold on the 66MHz front-side bus, but the 300MHz model runs fine at 450MHz with a 100MHz front-side bus for really quite startling performance; they were the last cheap Intel chips to support dual processor configurations, and dual overclocked Celerons were at one point a popular enthusiast configuration.
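The overclocking arithmetic above is easy to check. This is a minimal sketch, assuming (it isn't stated above) that the chip's clock multiplier is fixed and comes in half-integer steps; the function names are mine, purely for illustration.

```python
def locked_multiplier(rated_mhz: float, rated_fsb_mhz: float) -> float:
    """Infer the clock multiplier, rounded to the nearest half step.

    Assumption: multipliers come in steps of 0.5 and cannot be changed.
    """
    return round(2 * rated_mhz / rated_fsb_mhz) / 2


def core_clock_mhz(multiplier: float, fsb_mhz: float) -> float:
    """Core clock is simply multiplier times front-side bus clock."""
    return multiplier * fsb_mhz


# The 300MHz part on its nominal 66MHz bus, then on a 100MHz bus.
mult = locked_multiplier(300, 66)
overclocked = core_clock_mhz(mult, 100)
print(f"{mult}x multiplier gives {overclocked:.0f}MHz on a 100MHz bus")
```

With the multiplier fixed, raising the bus clock is the only lever, which is why the bus speed and the core speed rise together.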

Dixon

In early 1999, Intel released some mobile P2 processors based on the Dixon core, which is very similar to the Mendocino but contains 256k of L2 cache on-die.

Katmai

The Katmai core is Deschutes with the SSE instructions added; it's built in 0.25u and, like Deschutes, is designed to interface to off-chip cache running at half the core clock speed. As with Deschutes, versions interfacing to faster custom-made off-chip cache and supporting four-way multiprocessing were sold as Pentium III Xeon.

Coppermine

The Coppermine core was long suspected to be the final iteration of the P6. It's built in a 0.18u process, and has 256k of fast on-die L2 cache, with a latency of six cycles and capable of transferring 32 bytes every two cycles. Intel's benchmarks for the Coppermine P3 chips suggest that FP performance has been improved substantially, though the suspicion is that this is caused by clever prefetch-capable compilers: tests at Anandtech suggest the improvement on current software is much smaller. Coppermine was sold as Pentium III Xeon without changing the cache or multiprocessing abilities; only the packaging was changed.
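To put the cache figures in more familiar units, here is a rough sketch that turns "32 bytes every two cycles" and "six cycles of latency" into bandwidth and time at a given core clock. The function names are hypothetical, and I'm assuming the cache runs at the full core clock, as "fast on-die" suggests.

```python
def l2_bandwidth_gb_per_s(core_mhz: float,
                          bytes_per_transfer: int = 32,
                          cycles_per_transfer: int = 2) -> float:
    """Peak L2 bandwidth in GB/sec (decimal gigabytes)."""
    return core_mhz * 1e6 * bytes_per_transfer / cycles_per_transfer / 1e9


def l2_latency_ns(core_mhz: float, latency_cycles: int = 6) -> float:
    """L2 latency in nanoseconds; one cycle lasts 1000/core_mhz ns."""
    return latency_cycles * 1000 / core_mhz


for mhz in (500, 1000):
    print(f"{mhz}MHz: {l2_bandwidth_gb_per_s(mhz):.0f}GB/sec peak, "
          f"{l2_latency_ns(mhz):.0f}ns latency")
```

These are peak figures only; sustained bandwidth on real code will be lower.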

Cu128

To increase the capabilities of low-end systems, and to provide some resistance to AMD at the bottom end of the market, Intel introduced the CeleronA series of processors. These are based on the Coppermine core with half its internal L2 cache disabled and a 66MHz bus speed (or 100MHz on the 2001 models). These chips were really not comparable with the AMD Duron they competed with, though they overclocked well and were the cheapest way of getting a processor with SSE capability.

Cascades

The original Xeon model of using off-chip fast caches isn't easily extensible beyond 550MHz. So, for the 700MHz and 900MHz Pentium III Xeons, Intel took advantage of their 0.18u process and the expertise in producing physically-large chips that they gained manufacturing HP's PA-8500 processors, and integrated up to two megabytes of L2 cache onto a die otherwise similar to the Coppermine but supporting four-way multi-processing.

Tualatin

As Intel got their 0.13u process into production, the first chip they produced on it was a P6 core equipped with 512k on-die cache. Marketing this chip so as not to compete too much with the P4 is not straightforward; it's targeted at the server and mobile markets at the moment, and sold at a price uncompetitive with Athlons and P4s of comparable raw speed. Tualatin processors shipped at up to 1400MHz, and have been over-clocked to over 2GHz using home-built cryonics.

Versions of the chip with half the L2 cache, with the pre-fetch features disabled, and running with a 100MHz front-side bus rather than the 133MHz of the normal version, are sold as Celerons at the 1GHz to 1.4GHz level; these are reasonably competitive with Durons of the same speed.

P4 series

By the end of 2000, the Coppermine core was showing its age; its speed limitation in 0.18u technology was dramatically shown when Intel were forced to recall the 1133MHz P3. However, they'd had its successor in development for a long time; the P4 was launched at the end of November, based on the Willamette core.

Willamette is an even more aggressively out-of-order chip than P6, with larger queues and deeper buffers at all opportunities. Instead of caching raw instruction data, it uses a large trace cache to store decoded instructions, thus decoupling the execution core from decoder performance.

The high clock rates come from a very long (20-stage) pipeline, which is backed up by very highly resourced branch prediction. The ALU is clocked at twice the speed of the rest of the processor, completing some operations every half-tick; this is technically impressive though the advantage it gives in real applications is not clear.

Willamette also introduces the SSE2 instructions, which by providing double-precision vector FP in the SSE registers allow x87 code to be almost entirely retired; using a sensible FP model is probably the main reason beyond high clock rates for its excellent SpecFP scores. However, these instructions still mostly have a latency of two cycles, suggesting that the functional units are still only 64 bits wide.

Willamette

The first P4 incarnation, codenamed Willamette, is made in a 0.18u process, and features an 8K L1 data cache, a 12000-micro-op trace cache, and 256K of L2 capable of delivering 32 bytes a cycle. The trace cache is described as 12kT in the table below; micro-ops are about 64 bits long, so it takes about 100k of SRAM, but since instructions are expanded by about a factor of four in the decoding process it's equivalent to about 20k un-decoded. The front-side bus is 100MHz QDR, for an equivalent of 400MHz; this allows memory bandwidth of 3.2GB per second, unprecedented for an x86. Versions with half the L2 cache were used in 2002 for the desktop Celeron chip at 1.8GHz and below; they were not competitive with the Tualatin-based Celerons despite their greater clock speed.
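The bandwidth and trace-cache numbers can be reproduced with a few lines of arithmetic. This is only a sketch: the 8-byte (64-bit) bus width is my assumption, since the text gives only the clock and the resulting bandwidth, and the function names are made up for the example.

```python
def qdr_bandwidth_gb_per_s(bus_mhz: float, bus_width_bytes: int = 8) -> float:
    """Peak bandwidth of a quad-pumped (QDR) bus in GB/sec.

    Assumption: the bus is 8 bytes wide and moves four transfers per clock.
    """
    return bus_mhz * 1e6 * 4 * bus_width_bytes / 1e9


def trace_cache_bytes(micro_ops: int = 12000, bits_per_op: int = 64) -> int:
    """Approximate SRAM needed to hold the decoded micro-ops."""
    return micro_ops * bits_per_op // 8


# The 100MHz, 133MHz and 200MHz QDR buses used by successive P4 cores.
for mhz in (100, 133, 200):
    print(f"{mhz}MHz QDR: {qdr_bandwidth_gb_per_s(mhz):.1f}GB/sec")

decoded = trace_cache_bytes()
print(f"trace cache: about {decoded} bytes decoded, "
      f"about {decoded // 4} bytes before four-fold expansion")
```

The trace-cache result lands in the same range as the rounded figures quoted above; both the micro-op width and the expansion factor are approximate, so only the order of magnitude is meaningful.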

Foster

Released more than a year after the Willamette, this server chip is a Willamette equipped with a 512k or 1024k on-chip L3 cache, and support for hyper-threading. It supports up to four processors.

Northwood

This is the 0.13u port of the Pentium 4; apart from higher clock rates, it offers a 512K L2 cache. In June 2002, versions with a 133MHz QDR front-side bus (4.2GB/sec) were introduced. Running significantly cooler than Willamette, this was the chip that brought P4 into the mainstream market and made it a viable competitor to Athlon. Versions with 256k and 128k of L2 cache exist, for the mobile-Celeron and desktop-Celeron (2GHz+) markets respectively.

Northwood HT

In November 2002, perhaps concerned by the then-imminent launch of the AMD Athlon64, Intel released a version of Northwood with hyper-threading enabled. They also fixed certain performance problems with the P4 core at this stage, in particular L1 cache aliasing which had caused trouble in Prestonia. This chip suffers from very high leakage power (40% of total CPU power at 3GHz). The initial launch was with a 133MHz QDR bus, but in March 2003 the bus speed was increased further to 200MHz QDR (6.4GB/sec). In May 2003, Intel launched slower versions on the same bus, down to a 2400MHz model, to revitalise the Pentium 4 range.

Prestonia

Sold as "Xeon DP", this is a Northwood core with hyper-threading enabled, and with support for up to two processors in a system.

Gallatin

Sold as "Xeon MP", this is a Northwood core, with hyper-threading, with an on-die L3 cache of one, two or four megabytes, and with support for up to four processors in a system. It is also sold as the "Pentium 4 Extreme Edition", without multi-processor support; in this form it holds most of the SPEC performance records.

Prescott

This is likely to go down as one of Intel's worst mistakes. It was late (expected in mid-2003, it appeared in February 2004), it is hot, and it is slower than an equivalently-clocked Northwood; thermal issues (plus, presumably, heroic engineering efforts from the Northwood team) meant that it launched at maximum clock speeds no higher than Northwood's.

It is the 90nm version of the Pentium 4, with 1MB of L2 cache and 16k of L1 data cache; it supports SSE3 (a rather unexciting set of 13 new instructions), and has "improved hyperthreading" (though still only two processor contexts). The front-side bus is at 6.4GB/sec (200MHz QDR), though this may rise later.

Nocona

This is Prescott's answer to the Xeon. It supports AMD's 64-bit extensions to x86.

Potomac

This is Prescott's answer to the Xeon MP.

After Prescott

A whole series of chips were announced after Prescott, but in mid-May 2004, Intel made it clear that they did not intend to develop the Netburst architecture much further. The expectation is that they will build multi-core processors based on the Banias core, but all is still very unclear.

Banias series

Banias is the core, designed at Intel's development labs in Israel, used for the Pentium M processor.  It's a very sophisticated chip designed for high performance at low power consumption: an advanced P6 core (with enlarged L1 cache, equipped with SSE2 instructions and what seem quite sophisticated micro-architecture enhancements -- better branch prediction, micro-op fusion to send decoded instructions around in bundles, better stack handling), with elaborate clock gating so that most of the chip can be turned off at any given moment, fitted to a low-voltage 400 MHz Pentium 4 bus and a 1MB level-2 cache designed for low power consumption. Peak power consumption for any model is 24.5 watts.

Intel encourages manufacturers to pair it with the low-power 855 'Odem' chipset and an Intel wireless network device, by providing very substantial advertising support to the producers of this configuration, and not permitting any other configuration to use the much-hyped name Centrino. At launch, it was available in a variety of thin, light, powerful and expensive laptops, running at up to 1.6GHz and offering performance comparable to a 2.4GHz Mobile P4.

Dothan, released after some delay in May 2004, is the 90nm shrink of Banias. Essentially the only difference is a faster clock and a doubled cache size; unlike Prescott, this chip runs rather cooler than its 130nm predecessor. On SPEC benchmarks, the Dothan at 2GHz is comparable for integer work with a 3.4GHz Northwood or a 2.4GHz Opteron, and for FP with a P4/2666 - not bad for 21 watts peak power.

From AMD : K5 series

SSA/5

I'm told that this was an earlier version of the K5; when I find out some more about the differences between it and the later K5, I'll put that here. Apparently, it had more internal wait states and rather less effective branch prediction.

K5

Designed as a competitor for the Pentium, the AMD K5 failed in the market essentially because it was not possible to produce it at very competitive clock rates. It used the P2-like technique of converting X86 operations to RISC-like ones, and was a four-issue (two integer ops, an FP op and a load/store) superscalar chip with rather less strict pairing rules than the Pentium, and rather different instruction timing - apparently it is by a long way the fastest chip for its clock speed at running the distributed.net RC5 problem.

K6 series

The K5 did not compete realistically with the Pentium 2, and so AMD moved to its first post-RISC design, the K6. Again this translates x86 instructions to a RISC core, but it has seven execution units (load, store, two integer, FP, branch, MMX), and a more sophisticated out-of-order core. It has a large L1 cache, but is rather let down by a non-pipelined FPU which is half as fast at running optimised code as the one on the P2 (and, infuriatingly, is slowed down by the standard FXCH technique used to speed up Pentium code). There are one or two AMD internal extensions, most noticeably SYSCALL/SYSRET for lower-latency system calls.

K6

The K6 core first appeared in the eponymous chips.

K6/2

Because of the FPU problems of the K6, AMD introduced 3DNow!; the K6/2 was the first of their chips to use this. It also provided support for a 100MHz front-side bus on Super 7 boards, though it was still hindered by having the L2 cache on the front-side bus.

CXT

In late 1998, AMD introduced the 400MHz K6/2, which used the CXT core. This is a slight enhancement of the K6/2, with write combining used to improve the memory performance and write allocation extended to physical memory sizes above 504M (though this second option is redundant since no Super 7 board supports more than 384M of memory). The same core is now used in 300MHz-and-up K6 chips.

K6/3

Super 7 boards, even with the faster front-side bus, are hindered by having only 100MHz L2 cache. The K6/3 core includes 256k of full-speed L2 cache (as well as 64k of L1), and is otherwise the same as the CXT. The extra cache provides an impressive speed increase.

Athlon series

Between late summer 1999 and winter 2000, AMD took the lead in the performance x86 market with their Athlon design. This is a second-generation post-RISC design, building on the features of the K6 but adding significantly greater execution resources - most notably a fully-pipelined, superscalar FPU capable of issuing one add and one multiply per cycle, and very large (64k for both I and D) level-1 caches. The bus interface is borrowed from Digital's Alpha EV6.

K7

The first Athlon core was the K7. This used a 200MHz front-side bus, and a separate bus to commodity SRAMs for the level-2 cache; these were packaged in a cartridge physically but not electronically similar to the P2 one.

K75

In December 1999, AMD started producing Athlon processors made in a 0.18u process; by February 2000, most new Athlons used the K75 core. This isn't just a process shrink of the K7, though about the only programmer-visible modification is that it supports the Deschutes FXSAVE/FXRSTOR instructions (possibly in preparation for the K8 64-bit x86 extensions).

Thunderbird

As the 1000MHz mark was approached and eventually reached, the deficiencies of the Athlon's off-chip L2 cache became clear, as the various versions were forced to slower and slower L2-to-internal clock ratios because of the slow speed of off-the-shelf SRAM. So, in June 2000, AMD released a new range of Athlons with 256k of full-speed L2 cache on-die, though attached by a 64-bit bus.

Spitfire

This core is to Thunderbird roughly as Cu128 is to Coppermine; it has 64k rather than 256k of full-speed L2 cache, is sold at lower speed grades, and is significantly cheaper. However, it runs with the full 200MHz bus speed of its larger brother, and outperforms the Cu128 core very substantially.

Palomino

This is a significantly modified Thunderbird core intended at first for portable and workstation applications; it's designed for lower power consumption, supports SSE1 instructions, is equipped with hardware pre-fetch to improve performance on patterns of dynamically-predictable memory accesses, and in some models has multi-processor support. It also offers slightly higher clock speeds, at least in its later desktop incarnations, than the 0.18u Thunderbird.

Morgan

A Palomino with 64k rather than 256k of full-speed L2 cache, and with the hardware pre-fetch disabled; used for Durons of 1GHz and above.

Thoroughbred A/B

These are 0.13u versions of the Palomino core. Thoroughbred A had significant performance problems, with 1850MHz or so its absolute upper limit; Thoroughbred B, pre-released later though with severe availability restrictions, has a slightly different layout using two more metal layers, and upon its release was pushed to 2400MHz and more without exotic cooling. Some versions of Thoroughbred B released in late 2002 support a 166MHz DDR bus.

Barton

This chip, announced at the start of February 2003 after a succession of delays, is a Thoroughbred B with a 166MHz DDR bus and 512K of level-2 cache. At the topmost levels it was something of a disappointment; even at the 3000+ performance rating it was not very competitive with the P4/3066, and initially there were some availability problems. At the 2500+ performance rating, this is a spectacular reasonably-priced core, competitive on many applications with the P4/2533.

Enter the Opteron

The Opteron represents AMD's move into 64-bit computing, with a chip that supports the x86-64 architecture, a fairly straight extension of the x86 to 64-bit registers. The architecture is a noticeably more advanced version of the K7, implemented in a 0.13u SOI (silicon on insulator) process; the initial version has a megabyte of L2 cache, occupying about two thirds of the die area. Like the Alpha EV7, Hammer has a memory controller on the chip itself; unlike the ten (eight for data and two for parity) Rambus channels on the Alpha EV7, the memory controller uses two channels of standard PC2700 DDR.

Sledgehammer

This is the core used for Opteron and Athlon FX; a full megabyte of cache, two memory controllers, and for many tasks the fastest Windows-running chip on the planet. With only one memory controller enabled, it is known as Clawhammer.

Newcastle

This is a K8 with only 512k of L2 cache; it can support single or dual memory controllers, depending on whether it's destined for Socket 754 or Socket 939.

Sempron

This is a K8 with 256k of L2 cache, targeted at the bottom of the 64-bit market, or to compete with the Celeron D chips.

Oakville

This is the first 90nm port of K8, used to provide low-power-consumption processors for thin and light notebooks.

From Cyrix

6x86

The 6x86 is a two-issue superscalar chip, without explicit pairing rules. Its FPU is not pipelined, and is substantially slower than the Pentium's even without this disadvantage; this is not a chip for high-performance floating-point work. The cache is unified, and the chip therefore has explicit support for locking down instructions in the cache, to ensure they don't get thrown out by long streams of data.

6x86MX

The 6x86MX core is a 6x86 with a substantially larger (64k) L1 cache, support for MMX, EMMX and the Pentium's timestamp counter.

M II

Cyrix's current processor is the M2. As far as I can tell, this is a process shrink of the 6x86MX.

MediaGX (now owned by National Semiconductor and called "Geode")

This is an interesting chip, and likely to be the direction in which portable and low-end PCs evolve. The chip contains a 6x86 core, graphics hardware, sound hardware and memory management; you can build an entire system using this chip and very little else. Unfortunately, Cyrix don't publish very much information about it.

From VIA

In autumn 1999, VIA purchased Cyrix and Centaur Technology, and set their processor development teams to working on new designs.

C3 (from VIA)

This recently-released chip uses the Cayenne core (an improved M2 with, amongst other things, a pipelined FPU and 3DNow! support) and features 256k of internal L2 cache as well as a 64k unified L1, supporting 66-133MHz front-side buses. It is designed to be used wherever a Socket-370 Celeron is. I haven't got hold of the data sheet for it yet.

Samuel II (from VIA)

This core is designed for low power consumption and for the very bottom end of the x86 market: it's a drop-in replacement for a Socket-370 Celeron, though can run at 100 and 133MHz bus speeds. The core is relatively straightforward, and backed up by a large 128k L1 cache.

Ezra

Ezra is VIA's 0.13u competitor against the FCBGA Celerons. Designed by the Centaur team, it appears primitive compared to Intel and AMD designs; the design is somewhere between the 486 and the Pentium, though with a longer pipeline than either and equipped with significantly better caches (comparable to the AMD Duron). It supports MMX and 3DNow. It does no out-of-order execution, and runs its FPU at half the core clock speed; this does, however, mean that it runs at a very low supply voltage and needs only a passive heatsink. This is a chip designed for really cheap computers; Wal-Mart sells a machine containing it for about £130. Reviews suggest that it's comparable to a Celeron of between half and two-thirds the MHz rating.

Nehemiah

Nehemiah is a newer VIA low-end chip, supporting SSE, and with its FPU running at the full core clock speed. It is supposedly 70% faster than Ezra at playing Quake 3. There are some availability problems: there have been reports in February 2003 of people ordering a Nehemiah and getting an Ezra instead.

From other companies

Winchip

This chip, designed by Centaur, was introduced by IDT in 1998 as a very cheap Pentium replacement for low-end systems. It features low power consumption and decent-sized caches, but, dismissing out-of-order and superscalar execution (and even branch prediction) as 'complicated hocus-pocus', it has a primitive internal design (one-wide in-order issue, incompletely-pipelined FPU) which makes it uncompetitive in performance per clock, and a short pipeline making high clock-speeds difficult to attain.

Winchip 2

Recanting somewhat of their previous heresies, Centaur's Winchip 2 has branch prediction, an improved pipelined floating-point unit, and a two-way superscalar MMX unit supporting 3DNow!

Rise mP6

This rather unusual chip features three-way superscalar execution of MMX and a dual-issue FP unit; the documentation, sadly, sucks. I believe the intellectual property ended up with SiS, who produce a Geode-like everything-integrated chip called the 552, comparable to a Pentium2 at 200MHz, and intended for set-top boxes and PDAs.

Summary of features

I'm using abbreviations in the column headings so the table fits nicely on narrower screens.

SSE comes at three levels (SSE, SSE2 and the Prescott extensions SSE3), or is absent. MMX comes at three levels: MMX introduced on the P55C, the MMX+ additional MMX instructions from the P3, and Cyrix's EMMX. 3DNow! comes at two levels: the 3DNow! instructions introduced on the K6, and the 3DN+ extension introduced on the original Athlon.

SYS is the SYSCALL extension, CMV is the FCMOV extension, FST is the Deschutes FXSAVE/FXRSTOR extension. HT is hyper-threading; either "None", or the number of logical processors supported.
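If you want to work out these columns for a chip you actually have, one route is the flags line of /proc/cpuinfo on Linux. The sketch below assumes the usual Linux flag names ("pni" for SSE3, "mmxext" for the extended MMX instructions, "cxmmx" for Cyrix's EMMX, "fxsr" for FXSAVE/FXRSTOR); the mapping onto my column names, and the function itself, are my own illustration rather than anything official. Note that the "ht" flag only says the feature is present, not how many logical processors there are.

```python
def feature_columns(flags_line: str) -> dict:
    """Map a Linux /proc/cpuinfo flags string onto the table's columns."""
    flags = set(flags_line.split())

    def yes_no(present: bool) -> str:
        return "Y" if present else "N"

    # SSE level: highest generation reported.
    if "pni" in flags:
        sse = "3"
    elif "sse2" in flags:
        sse = "2"
    elif "sse" in flags:
        sse = "1"
    else:
        sse = "None"

    # MMX level: chips with SSE also have the extra MMX instructions.
    if "cxmmx" in flags:
        mmx = "EMMX"
    elif "mmx" in flags and ("mmxext" in flags or "sse" in flags):
        mmx = "MMX+"
    elif "mmx" in flags:
        mmx = "MMX"
    else:
        mmx = "None"

    if "3dnowext" in flags:
        three_d = "3DNow+"
    elif "3dnow" in flags:
        three_d = "3DNow!"
    else:
        three_d = "None"

    return {
        "HT": yes_no("ht" in flags),
        "SYS": yes_no("syscall" in flags),
        # FCMOV needs both conditional moves and an x87 FPU.
        "CMV": yes_no("cmov" in flags and "fpu" in flags),
        "FST": yes_no("fxsr" in flags),
        "SSE": sse,
        "MMX": mmx,
        "3DNow!": three_d,
    }


# A made-up flags line of the sort a Palomino-class chip might report.
sample = "fpu tsc msr cx8 sep mtrr cmov mmx fxsr sse syscall mmxext 3dnowext 3dnow"
print(feature_columns(sample))
```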

MP is multi-processing, and counts the number of processors supported on a single system bus (much larger multi-processor machines tend to be clusters of four-processor systems). AMD systems don't share the bus; the (2) in the MP column indicates that the core was used for Athlon MP chips, which differ rather little from the XP chips but were intended for use in two-processor systems with the AMD760 chipset. Nobody ever produced an SMP system with more than two Athlons in it.

The "2 (4)" by Deschutes, Katmai and Klamath indicates that the core was used in Pentium 3 Xeon chips, which offered four-way multi-processing, as well as in Pentium 3 chips which only went up to two-way. Likewise, the Sledgehammer core is used in Opteron 1-, 2- and 8-series CPUs (which support 1, 2 and 8-way multi-processing respectively); it has no concept of a single system bus, linking directly to its memory, and to peripherals and other processors by several point-to-point Hypertransport connections.

| Core name | Process size | Speed range (MHz) | MP | L1 cache | L2 cache | L3 cache | HT | SYS | CMV | FST | SSE level | MMX level | 3DNow! level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P5 (0.8) | 0.8u | 60-66 | 2 | 8k I / 8k D | 0k | 0k | None | N | N | N | None | None | None |
| P54 (0.6) | 0.6u | 75-100 | 2 | 8k I / 8k D | 0k | 0k | None | N | N | N | None | None | None |
| P54 (0.35/0.6) | 0.35u CMOS, 0.6u bipolar | 120 | 2 | 8k I / 8k D | 0k | 0k | None | N | N | N | None | None | None |
| P54C | 0.35u | 133-200 | 2 | 8k I / 8k D | 0k | 0k | None | N | N | N | None | None | None |
| P55C | 0.35u | 166-233 | 2 | 16k I / 16k D | 0k | 0k | None | N | N | N | None | MMX | None |
| P6 | 0.35u | 150-200 | 4 | 16k I / 16k D | 0k | 0k | None | N | Y | N | None | None | None |
| Klamath | 0.35u | 233-333 | 2 (4) | 16k I / 16k D | 0k | 0k | None | N | Y | N | None | MMX | None |
| Deschutes | 0.25u | 350-450 | 2 (4) | 16k I / 16k D | 0k | 0k | None | N | Y | Y | None | MMX | None |
| Mendocino | 0.25u | 300-533 | 2 | 16k I / 16k D | 128k | 0k | None | N | Y | Y | None | MMX | None |
| Dixon | 0.25u | 266-366 | 1 | 16k I / 16k D | 256k | 0k | None | N | Y | Y | None | MMX | None |
| Katmai | 0.25u | 450-600 | 2 (4) | 16k I / 16k D | 0k | 0k | None | N | Y | Y | 1 | MMX+ | None |
| Coppermine | 0.18u | 500-1000 | 2 | 16k I / 16k D | 256k | 0k | None | N | Y | Y | 1 | MMX+ | None |
| Cu128 | 0.18u | 533-1100 | 1 | 16k I / 16k D | 128k | 0k | None | N | Y | Y | 1 | MMX+ | None |
| Cascades | 0.18u | 700 or 900 | 4 | 16k I / 16k D | 512k, 1024k or 2048k | 0k | None | N | Y | Y | 1 | MMX+ | None |
| Tualatin | 0.13u | 866-1400 | 2 | 16k I / 16k D | 512k | 0k | None | N | Y | Y | 1 | MMX+ | None |
| Tualatin256 | 0.13u | 1200-1400 | 1 | 16k I / 16k D | 256k | 0k | None | N | Y | Y | 1 | MMX+ | None |
| Banias | 0.13u | 900-1700 | 1 | 32k I / 32k D | 1024k | 0k | None | N | Y | Y | 2 | MMX+ | None |
| Dothan | 0.09u | 1600-2000+ | 1 | 32k I / 32k D | 2048k | 0k | None | N | Y | Y | 2 | MMX+ | None |
| Willamette | 0.18u | 1300-2000 | 1 | 12kT / 8k D | 256k | 0k | None | N | Y | Y | 2 | MMX+ | None |
| Northwood | 0.13u | 1600-2800 | 1 | 12kT / 8k D | 512k | 0k | None | N | Y | Y | 2 | MMX+ | None |
| Northwood HT | 0.13u | 3400+ | 1 | 12kT / 8k D | 512k | 0k | 2 | N | Y | Y | 2 | MMX+ | None |
| Prestonia | 0.13u | 1600-3066 | 2 | 12kT / 8k D | 512k | 0k | 2 | N | Y | Y | 2 | MMX+ | None |
| Gallatin | 0.13u | 1500-3000+ | 4 | 12kT / 8k D | 512k | 1024k, 2048k, 4096k | 2 | N | Y | Y | 2 | MMX+ | None |
| Prescott | 0.09u | 2800-3600+ | 1 | 12kT / 16k D | 1024k | 0k | 2 | N | Y | Y | 3 | MMX+ | None |
| Nocona | 0.09u | 2800-3600+ | 1 | 12kT / 16k D | 1024k | 0k | 2 | N | Y | Y | 3 | MMX+ | None |
| K6 (model 6) | 0.35u | 166-233 | 1 | 32k I / 32k D | 0k | 0k | None | Y | N | N | None | MMX | None |
| K6 (model 7) | 0.25u | 266-300 | 1 | 32k I / 32k D | 0k | 0k | None | Y | N | N | None | MMX | None |
| K6/2 | 0.25u | 266-380 | 1 | 32k I / 32k D | 0k | 0k | None | Y | N | N | None | MMX | 3DNow! |
| CXT | 0.25u | 400-533 | 1 | 32k I / 32k D | 0k | 0k | None | Y | N | N | None | MMX | 3DNow! |
| K6/3 | 0.25u | 400-500 | 1 | 32k I / 32k D | 256k | 0k | None | Y | N | N | None | MMX | 3DNow! |
| K7 | 0.25u | 500-700 | 1 | 64k I / 64k D | 0k | 0k | None | Y | Y | N | None | MMX | 3DNow+ |
| K75 | 0.18u | 500-1000 | 1 | 64k I / 64k D | 0k | 0k | None | Y | Y | N | None | MMX+ | 3DNow+ |
| Spitfire | 0.18u | 600-950 | 1 | 64k I / 64k D | 64k | 0k | None | Y | Y | N | None | MMX+ | 3DNow+ |
| Thunderbird | 0.18u | 750-1400 | (2) | 64k I / 64k D | 256k | 0k | None | Y | Y | N | None | MMX+ | 3DNow+ |
| Palomino | 0.18u | 850-1733 | (2) | 64k I / 64k D | 256k | 0k | None | Y | Y | Y | 1 | MMX+ | 3DNow+ |
| Morgan | 0.18u | 1000-1400 | 1 | 64k I / 64k D | 64k | 0k | None | Y | Y | Y | 1 | MMX+ | 3DNow+ |
| Thoroughbred A | 0.13u | 1800 only | 1 | 64k I / 64k D | 256k | 0k | None | Y | Y | Y | 1 | MMX+ | 3DNow+ |
| Thoroughbred B | 0.13u | 2000-2250+ | (2) | 64k I / 64k D | 256k | 0k | None | Y | Y | Y | 1 | MMX+ | 3DNow+ |
| Barton | 0.13u | 1833-2166+ | 1 | 64k I / 64k D | 512k | 0k | None | Y | Y | Y | 1 | MMX+ | 3DNow+ |
| Sledgehammer | 0.13u | 1400-2400+ | 8 | 64k I / 64k D | 1024k | 0k | None | Y | Y | Y | 2 | MMX+ | 3DNow+ |
| Newcastle | 0.13u | 1800-2400+ | 1 | 64k I / 64k D | 512k | 0k | None | Y | Y | Y | 2 | MMX+ | 3DNow+ |
| Sempron | 0.13u | 1800+ | 1 | 64k I / 64k D | 256k | 0k | None | Y | Y | Y | 2 | MMX+ | 3DNow+ |
| Oakville | 0.09u | 2000+ | 1 | 64k I / 64k D | 1024k | 0k | None | Y | Y | Y | 2 | MMX+ | 3DNow+ |
| 6x86 | 0.65u | | 1 | 16k | 0k | 0k | None | N | N | N | None | None | None |
| 6x86MX | 0.35u | | 1 | 64k | 0k | 0k | None | N | Y | N | None | EMMX | None |
| M2 | 0.25u | 225-300 | 1 | 64k | 0k | 0k | None | N | Y | N | None | EMMX | None |
| Cyrix3 | 0.18u | ? | 1 | 64k | 256k | 0k | None | N | N | N | None | MMX+ | 3DNow+ |
| Samuel II | 0.15u | 733+ | 1 | 128k | 64k | 0k | None | N | ? | N | None | MMX | 3DNow |
| Ezra | 0.13u | 800-1000+ | 1 | 64k I / 64k D | 64k | 0k | None | N | ? | N | None | MMX | 3DNow |
| Nehemiah | 0.13u | 1000+ | 1 | 64k I / 64k D | 64k | 0k | None | N | N | N | 1 | MMX+ | 3DNow! |
| Winchip | 0.35u | | 1 | 32k I / 32k D | 0k | 0k | None | N | N | N | None | None | None |
| Winchip 2 | 0.25u | | 1 | 32k I / 32k D | 0k | 0k | None | N | N | N | None | MMX | 3DNow! |
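For anyone who wants to query the summary rather than read it, here is a small sketch of how a few of its rows might be held as records. The class and field names are hypothetical, only five cores are copied in, and an SSE level of "None" is stored as 0 so that levels can be compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    process: str
    l2_kb: int
    sse_level: int  # 0 stands for "None" in the summary


# A handful of rows copied from the summary, not the whole list.
CORES = [
    Core("K6/3", "0.25u", 256, 0),
    Core("Coppermine", "0.18u", 256, 1),
    Core("Northwood", "0.13u", 512, 2),
    Core("Barton", "0.13u", 512, 1),
    Core("Sledgehammer", "0.13u", 1024, 2),
]


def cores_with(min_sse: int, min_l2_kb: int, cores=CORES) -> list:
    """Names of cores meeting both an SSE level and an L2 size floor."""
    return [c.name for c in cores
            if c.sse_level >= min_sse and c.l2_kb >= min_l2_kb]


print(cores_with(min_sse=2, min_l2_kb=512))
```

Keeping the levels as integers makes "at least SSE2" a simple comparison; the other columns could be added as further fields in the same way.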