Technology: SIMD / MMX / SSE / SSE2 / SSE3 / 3DNow!
An increasing number of newer processors come with
extensions to their instruction set commonly referred to as SIMD instructions.
SIMD stands for Single Instruction, Multiple Data, meaning that a single
instruction, such as "add", operates on a number of data items in
parallel. A typical SIMD instruction will, for example, add eight 16-bit values in
parallel. This can increase execution speed dramatically, which is why
these instruction set extensions were created.
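As a minimal sketch of the "add eight 16-bit values in parallel" case, the C function below uses the SSE2 intrinsic _mm_add_epi16 when the compiler targets SSE2 and falls back to a plain loop otherwise. The function name add8_i16 is hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> /* SSE2 intrinsics */
#define USE_SSE2 1     /* illustrative macro, not a standard one */
#endif

/* Adds eight 16-bit lanes: out[i] = a[i] + b[i], wrapping on overflow. */
void add8_i16(const int16_t a[8], const int16_t b[8], int16_t out[8])
{
#ifdef USE_SSE2
    /* Load all eight values of each input into one 128-bit register. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* One instruction performs the eight additions. */
    _mm_storeu_si128((__m128i *)out, _mm_add_epi16(va, vb));
#else
    /* Scalar fallback: the same result, one lane at a time. */
    for (int i = 0; i < 8; ++i)
        out[i] = (int16_t)(uint16_t)((uint16_t)a[i] + (uint16_t)b[i]);
#endif
}
```

Calling add8_i16 with a = {1, 2, ..., 8} and b = {10, 20, ..., 80} fills out with {11, 22, ..., 88}. The unaligned load and store variants are used so that the caller does not have to align the arrays to 16 bytes.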
MMX Instruction Set
In 1997 Intel launched the Pentium Processor with MMX Technology. Subsequently the Pentium II, III and 4 were launched, and other processors, such as the AMD Athlon and Duron, added this instruction set extension.
SSE Instruction Set
The Pentium III processor, introduced in 1999, was the first processor with the SSE instruction set extension. However, AMD had by then already added the similar 3DNow! instruction set extension to its K6-2 CPU.
The SSE instruction set in fact consists of three distinct enhancements:
- Additional MMX instructions such as min/max
- Prefetch and non-temporal (streaming) store instructions for optimizing data movement from and to the L2/L3 caches and main memory
- New 128-bit XMM registers and corresponding 32-bit floating point (single precision) instructions
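The third enhancement can be sketched in C as follows: each 128-bit XMM register holds four single-precision floats, so an array can be processed four elements per iteration, with a scalar loop for the leftover elements. The function name add_f32 and the macro USE_SSE are hypothetical; the intrinsics come from xmmintrin.h, and the code falls back to plain C when SSE is not available at compile time.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h> /* SSE intrinsics */
#define USE_SSE 1      /* illustrative macro, not a standard one */
#endif

/* Element-wise sum of two float arrays: out[i] = a[i] + b[i]. */
void add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#ifdef USE_SSE
    /* Main loop: four single-precision values per XMM register. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail (or the whole array when SSE is unavailable). */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

The tail loop is what lets the function accept any length n, not only multiples of four.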
SSE2 Instruction Set
The Pentium 4 processor introduced the SSE2 instruction set extensions in 2000.
Two important improvements were made:
- Support for 64-bit (double precision) floating point data types
- Integer instructions for the XMM registers
Speed-Ups
SIMD instructions have the potential to speed up software by factors of 2 to 16 or even more. This especially applies to compute-intensive software dealing with arrays of data. However, as most compilers do not utilize the (full) potential of these instructions, these speed-ups can generally only be achieved by an optimization expert.
You need a processor with an instruction set designed to improve performance with 3D graphics and other multimedia data. To take full advantage of MMX, SSE, SSE2, and 3DNow! technology, software must be written to use its specific capabilities.
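Software that uses these capabilities usually checks at run time which extensions the processor offers before choosing a code path. The sketch below shows one way to do this, assuming GCC or Clang on x86, where the builtins __builtin_cpu_init and __builtin_cpu_supports are available. The struct simd_support and the function detect_simd are hypothetical names; on other compilers or architectures the sketch simply reports no support.

```c
#include <stdbool.h>

/* Which x86 SIMD extensions were detected (illustrative type). */
struct simd_support {
    bool mmx;
    bool sse;
    bool sse2;
    bool sse3;
};

struct simd_support detect_simd(void)
{
    struct simd_support s = { false, false, false, false };
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* Assumption: GCC/Clang x86 builtins that query CPUID. */
    __builtin_cpu_init();
    s.mmx  = __builtin_cpu_supports("mmx")  != 0;
    s.sse  = __builtin_cpu_supports("sse")  != 0;
    s.sse2 = __builtin_cpu_supports("sse2") != 0;
    s.sse3 = __builtin_cpu_supports("sse3") != 0;
#endif
    return s;
}
```

A program could call detect_simd once at start-up and then dispatch to an SSE2 routine or a scalar routine depending on the result.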
Performance test
Video Converter software: FFmpeg
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video. It includes libavcodec, the leading audio/video codec library.
Support for the -cpuflags option (global) on the command line
Possible flags for this option are:
- x86: mmx, mmxext, sse, sse2, sse2slow, sse3, sse3slow, ssse3, atom, sse4.1, sse4.2, avx, xop, fma4, 3dnow, 3dnowext, cmov
- ARM: armv5te, armv6, armv6t2, vfp, vfpv3, neon
- PowerPC: altivec
- Specific processors: pentium2, pentium3, pentium4, k6, k62, athlon, athlonxp, k8
Sample commands (http://ffmpeg.org/ffmpeg.html#Preset-files)
ffmpeg -cpuflags -sse+mmx ...
ffmpeg -cpuflags mmx ...
ffmpeg -cpuflags 0 ...
ffmpeg -cpuflags mmx -i originalvideo.mkv convertedvideo.avi
Download ffmpeg.exe, ffplay.exe and ffprobe.exe from http://ffmpeg.zeranoe.com/builds/win64/static/ and run the start.bat file:
@echo off
rem set inputfilename=org.wmv
rem set outfilename=out.wmv
set /p inputfilename=please enter input file :
set /p outfilename=please enter output file :
rem echo converting, %inputfilename% to %outfilename% >>result.txt
echo =======converting, %inputfilename% to %outfilename%=========== >>result.txt
set /p cpuflags=please enter your CPUFLAGS option :
echo your CPUFLAGS option is %cpuflags% >>result.txt
rem -benchmark is a global option, so it is placed before -i.
rem 2>&1 also appends stderr, where ffmpeg writes its log output, to result.txt.
ffmpeg -benchmark -cpuflags %cpuflags% -i %inputfilename% %outfilename% >>result.txt 2>&1
echo ------------------------------------------------------------ >>result.txt
rem "echo." writes an empty line; a bare "echo" would print "ECHO is off."
echo. >>result.txt
pause