Intel made several dramatic changes in the Nehalem microarchitecture to offer new features and capabilities in the Core i7 family of processors. The following paragraphs explore some of these features and their effect on control and measurement applications. Intel moved the memory controller and the PCI Express controller from the northbridge onto the CPU die, reducing the number of external data buses. These changes increase data throughput and reduce latency for memory and PCI Express data transactions (Figure 1).
Intel also introduced a distributed shared-memory architecture based on the Intel QuickPath Interconnect (QPI). QPI is a new point-to-point interconnect that connects a CPU to either a chipset or another CPU.
These design decisions have an even greater impact on multiprocessor systems. The improvements make the Core i7 family of processors well suited to test and measurement applications such as high-speed design validation and high-speed data record and playback.
CPU Performance Boost via Intel Turbo Boost Technology
To deliver good performance while optimizing processor power consumption, Intel introduced a feature called Intel Turbo Boost. Turbo Boost automatically allows active processor cores to run faster than the base operating frequency when certain conditions are met, namely when the processor is operating below its power, current, and temperature limits. It is activated when the operating system requests the highest processor performance state. The maximum frequency of a given core on the Core i7 processor depends on the number of active cores, and the amount of time the processor spends in the Turbo Boost state depends on the workload and the operating environment.
Figure 3. Intel Turbo Boost features offer processing performance gains for all applications regardless of the number of execution threads created.
Figure 3 illustrates how the operating frequencies of the processing cores in a quad-core Core i7 processor change to offer the best performance for a specific type of workload. In the idle state, all four cores operate at their base clock frequency. When an application that creates four discrete execution threads starts, all four cores begin operating at the quad-core turbo frequency. If the application creates only two execution threads, the two idle cores are put in a low-power state and their power budget is diverted to the two active cores, allowing them to run at an even higher clock frequency. Similar behavior applies when an application generates only a single execution thread.
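The relationship between active core count and clock frequency described above can be sketched as a small toy model. This is only an illustration under stated assumptions, not Intel's actual algorithm: the class name TurboModel, the 133 MHz step size, the base multiplier, and the number of extra frequency steps per active-core count are all hypothetical values chosen to show the shape of the behavior (fewer active cores, higher per-core frequency).

```python
from dataclasses import dataclass

# Hypothetical frequency step ("bin") size in MHz, for illustration only.
BCLK_MHZ = 133


@dataclass(frozen=True)
class TurboModel:
    """Toy model of a quad-core processor; every number here is an assumption."""

    # Hypothetical base multiplier: 20 x 133 MHz = 2660 MHz base clock.
    base_multiplier: int = 20
    # Hypothetical extra bins available with 1, 2, 3, or 4 active cores.
    extra_bins: tuple = (3, 2, 1, 1)

    def frequency_mhz(self, active_cores: int, turbo_allowed: bool = True) -> int:
        """Return the modelled clock of each active core in MHz.

        turbo_allowed stands in for the real preconditions (the OS requesting
        the highest performance state, and power/thermal headroom).
        """
        total_cores = len(self.extra_bins)
        if not 0 <= active_cores <= total_cores:
            raise ValueError(f"active_cores must be between 0 and {total_cores}")
        # Idle, or turbo not permitted: cores stay at the base clock.
        if active_cores == 0 or not turbo_allowed:
            return self.base_multiplier * BCLK_MHZ
        # Fewer active cores leave more headroom, so more bins are available.
        bins = self.extra_bins[active_cores - 1]
        return (self.base_multiplier + bins) * BCLK_MHZ


if __name__ == "__main__":
    model = TurboModel()
    for cores in (0, 4, 2, 1):
        print(f"{cores} active core(s): {model.frequency_mhz(cores)} MHz")
```

In this sketch, a four-thread workload runs every core one step above the base clock, while a two-thread or single-thread workload gets progressively higher per-core clocks, mirroring the qualitative behavior shown in Figure 3. Real processors decide the frequency in hardware, and the actual step counts vary by processor model.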