Wednesday, July 3, 2013

How instruction sets affect CPU performance




Technology: SIMD / MMX / SSE / SSE2 / SSE3 / 3DNow!
An increasing number of newer processors come with extensions to their instruction set, commonly referred to as SIMD instructions. SIMD stands for Single Instruction, Multiple Data, meaning that a single instruction, such as "add", operates on several data items in parallel. A typical SIMD instruction will, for example, add eight 16-bit values in parallel. This can increase execution speed dramatically, which is why these instruction set extensions were created.
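To make the "eight 16-bit values in parallel" idea concrete, here is a minimal pure-Python model of a packed add. It only illustrates the lane-by-lane semantics (including wrap-around on overflow); it does not execute a real SIMD instruction, and the name packed_add_u16 is a hypothetical helper chosen for this sketch.

```python
# Pure-Python model of a packed 16-bit add. It shows the semantics only
# and does not execute any real SIMD instruction.

LANES = 8        # eight 16-bit lanes fill one 128-bit register
MASK = 0xFFFF    # each lane keeps only its low 16 bits


def packed_add_u16(a, b):
    """Add two 8-lane vectors lane by lane, wrapping on overflow."""
    if len(a) != LANES or len(b) != LANES:
        raise ValueError("each operand must hold exactly 8 lanes")
    return [(x + y) & MASK for x, y in zip(a, b)]


a = [1, 2, 3, 4, 5, 6, 7, 65535]
b = [10, 20, 30, 40, 50, 60, 70, 1]
result = packed_add_u16(a, b)
print(result)
```

On real hardware the eight additions are carried out by one instruction instead of a loop; the last lane shows that each lane overflows independently of its neighbours.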

MMX Instruction Set

In 1997 Intel launched the Pentium Processor with MMX Technology. The Pentium II, III and 4 followed, and other processors, such as the AMD Athlon and Duron, adopted this instruction set extension as well.

SSE Instruction Set

The Pentium III processor, introduced in 1999, was the first processor with the SSE instruction set extension. By that time, however, AMD had already added the similar 3DNow! instruction set extension to its K6-2 CPU.
The SSE instruction set in fact consists of 3 distinct enhancements:
  • Additional MMX instructions such as min/max
  • Prefetch and non-temporal (streaming) store instructions for optimizing data movement between the L2/L3 caches and main memory
  • New 128-bit XMM registers and corresponding 32-bit floating-point (single precision) instructions

SSE2 Instruction Set

The Pentium 4 processor introduced the SSE2 instruction set extensions in 2000.
Two important improvements were made:
  • Support for 64-bit (double precision) floating-point data types
  • Integer instructions for the XMM registers

Speed-Ups

SIMD instructions have the potential to speed up software by factors of 2 to 16 or even more. This applies especially to compute-intensive software that deals with arrays of data. However, because most compilers do not exploit the full potential of these instructions, such speed-ups can generally be achieved only by an optimization expert.
You need a processor whose instruction set is designed to improve performance with 3D graphics and other multimedia data. To take full advantage of MMX, SSE, SSE2, and 3DNow! technology, software must be written to use their specific capabilities.
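One way to read the range of 2 to 16 is as the number of elements that fit into a single 128-bit XMM register. That reading is an assumption of this sketch, not something the figures above state, and it gives an idealized upper bound per instruction, not a measured speed-up. The function name lanes is hypothetical.

```python
def lanes(register_bits, element_bits):
    """Return how many elements of a given width fit into one register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


XMM_BITS = 128  # width of an SSE/SSE2 XMM register

# Element widths covered by SSE (32-bit float) and SSE2 (64-bit float, integers).
ELEMENT_BITS = {
    "64-bit double": 64,
    "32-bit float": 32,
    "16-bit integer": 16,
    "8-bit integer": 8,
}

for name, bits in ELEMENT_BITS.items():
    print(f"{name}: {lanes(XMM_BITS, bits)} elements per instruction")
```

The counts run from 2 (doubles) to 16 (bytes). Real programs rarely reach these factors, because memory bandwidth and the non-vectorized parts of the code limit the gain.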

Performance test



Video Converter software: FFmpeg
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video. It includes libavcodec - the leading audio/video codec library.
FFmpeg accepts the global command-line option -cpuflags, which sets or clears the CPU capability flags that FFmpeg will use.
Possible flags for this option are:
  • x86: mmx, mmxext, sse, sse2, sse2slow, sse3, sse3slow, ssse3, atom, sse4.1, sse4.2, avx, xop, fma4, 3dnow, 3dnowext, cmov
  • ARM: armv5te, armv6, armv6t2, vfp, vfpv3, neon
  • PowerPC: altivec
  • Specific processors: pentium2, pentium3, pentium4, k6, k62, athlon, athlonxp, k8
ffmpeg -cpuflags -sse+mmx ...
ffmpeg -cpuflags mmx ...
ffmpeg -cpuflags 0 ...
     ffmpeg -cpuflags mmx -i originalvideo.mkv convertedvideo.avi
Download ffmpeg.exe, ffplay.exe and ffprobe.exe from http://ffmpeg.zeranoe.com/builds/win64/static/ and run the following start.bat file:
 


@echo off

rem Example fixed values, kept for reference:
rem set inputfilename=org.wmv
rem set outfilename=out.wmv

rem Enter the file names without surrounding quotes.
set /p "inputfilename=Please enter input file: "
set /p "outfilename=Please enter output file: "

echo ======= converting %inputfilename% to %outfilename% ======= >>result.txt

set /p "cpuflags=Please enter your -cpuflags option: "
echo Your -cpuflags option is %cpuflags% >>result.txt

rem -benchmark is a global option, so it goes before the output file.
rem 2^>^&1 also captures ffmpeg's log, which is written to stderr.
ffmpeg -cpuflags %cpuflags% -benchmark -i "%inputfilename%" "%outfilename%" >>result.txt 2>&1

echo ------------------------------------------------------------ >>result.txt
rem "echo." writes an empty line; a bare "echo" would write "ECHO is off."
echo. >>result.txt

pause


Keep ffmpeg.exe, ffplay.exe, ffprobe.exe and start.bat in the same directory, then run the batch file.
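The same comparison can also be scripted so that several -cpuflags settings are timed in one go. The following is a rough sketch, not a tuned benchmark: it assumes ffmpeg is on the PATH, the helper names build_command and compare_flags are hypothetical, and it measures plain wall-clock time instead of parsing the -benchmark output. The only ffmpeg options used are -y, -cpuflags and -i.

```python
import subprocess
import time


def build_command(cpuflags, input_file, output_file):
    """Return the ffmpeg argument list for one conversion run."""
    return [
        "ffmpeg",
        "-y",                   # overwrite the output file without asking
        "-cpuflags", cpuflags,  # global option, so it precedes -i
        "-i", input_file,
        output_file,
    ]


def time_conversion(cpuflags, input_file, output_file):
    """Run one conversion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        build_command(cpuflags, input_file, output_file),
        check=True,           # raise if ffmpeg exits with an error
        capture_output=True,  # keep ffmpeg's log off the console
    )
    return time.perf_counter() - start


def compare_flags(flag_sets, input_file, output_file):
    """Time the same conversion once per -cpuflags value."""
    return {flags: time_conversion(flags, input_file, output_file)
            for flags in flag_sets}


# Show the command that would be run, without starting ffmpeg.
print(" ".join(build_command("mmx", "originalvideo.mkv", "convertedvideo.avi")))
```

Calling compare_flags(["0", "mmx", "-sse+mmx"], "originalvideo.mkv", "convertedvideo.avi") would return a dictionary of elapsed seconds per setting. Running each setting several times and comparing the averages gives more reliable numbers than a single run.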
