Intel provides higher value through a balanced platform: its balanced platform solutions help
to improve both performance and cost models.
The benefit comes from matching the degree of parallelism in your software to the parallel resources of the CPU.
Intel processors also have wide vector units, so they can execute a single instruction simultaneously
across multiple data points.
These parallel execution resources can dramatically improve performance, but some applications
require more parallelism than others for best performance.
Intel architecture gives you unmatched flexibility, for example by adding Intel Xeon Phi
coprocessors. Intel processors and coprocessors include a variety of unique technologies
that help to improve parallel throughput, overall performance, and security,
while reducing energy consumption.
These technologies can increase per-
Intel offers two solutions that address the full range of networking needs for technical
computing: 1) 10/40 Gigabit Intel Ethernet technology, included in the Intel Ethernet
Controller XL710, provides a flexible, high-performance solution for connecting nodes in a
loosely coupled cluster or for connecting a workstation or cluster to a site network; and 2)
Intel True Scale Fabric is a purpose-built interconnect solution for HPC clusters, designed
to support the most demanding performance requirements.
Optimize Your Software for Fast, Parallel Execution with Intel Parallel Studio XE
Professional Edition and Cluster Edition
High Performance Compilers, Libraries, Parallel Models, and Analysis Tools:
Intel Fortran and C++ Compilers help to boost application performance through
explicit vectorization.
Intel Math Kernel Library provides high performance for linear algebra, Fast Fourier
Transforms (FFT), vector math, and statistics functions on the latest Intel architectures.
Standards-based parallel models, including OpenMP 4.0.
Intel MPI Library provides sustained scalability with low latencies.
Powerful analysis tools help to accelerate software development (Intel Advisor XE,
Intel Inspector XE, Intel VTune Amplifier XE, Intel Trace Analyzer and
Collector).
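As an illustration of the explicit vectorization and OpenMP 4.0 items above, the following is a minimal sketch rather than code from the original text. The function names saxpy and dot are hypothetical and chosen only for this example. It uses the OpenMP 4.0 simd construct to ask the compiler to vectorize two simple loops; if OpenMP support is not enabled, the pragmas are ignored and the loops run as ordinary scalar code with the same results.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: y[i] = a * x[i] + y[i] for every element.
// "#pragma omp simd" (OpenMP 4.0) requests vectorization of the loop.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xp = x.data();
    float* yp = y.data();
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}

// Hypothetical helper: dot product of x and y.
// The reduction clause tells the compiler that partial sums held in
// separate vector lanes may be combined at the end of the loop.
float dot(const std::vector<float>& x, const std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xp = x.data();
    const float* yp = y.data();
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (std::size_t i = 0; i < n; ++i) {
        sum += xp[i] * yp[i];
    }
    return sum;
}
```

With GCC or Clang, the simd pragmas can be honored without the full OpenMP runtime by compiling with -fopenmp-simd. Whether a given loop is actually vectorized still depends on the compiler and target, so the compiler's vectorization report is worth checking.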
Improving Auto-vectorization and Cache Behavior. Even though the vast majority of the
loads encountered in our stencil computation will be contiguous, the compiler may choose to use a
sequence of relatively expensive gather/scatter operations instead of simple packed loads.
Alternative optimizations are: 1) peeling the first and last iterations from the innermost loop,
such that all of the remaining iterations are known not to handle any edge cases; or 2)
introducing halo or ghost cells (i.e., a layer of additional cells around the grid that
stores values representative of the boundary condition).
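The two alternatives above can be sketched on a one-dimensional three-point averaging stencil. This is an illustrative example under stated assumptions, not the stencil from the original text: the function names stencil_peeled and stencil_halo are hypothetical, and the boundary condition is assumed to be a single constant value that out-of-range neighbours take.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Variant 1: loop peeling. The first and last iterations are handled
// separately, so the main loop touches only interior points and needs
// no edge-case branches.
std::vector<double> stencil_peeled(const std::vector<double>& in, double boundary) {
    const std::size_t n = in.size();
    std::vector<double> out(n);
    if (n == 0) return out;
    if (n == 1) {
        out[0] = (boundary + in[0] + boundary) / 3.0;
        return out;
    }
    out[0] = (boundary + in[0] + in[1]) / 3.0;        // peeled first iteration
    for (std::size_t i = 1; i + 1 < n; ++i) {         // interior: contiguous loads only
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    }
    out[n - 1] = (in[n - 2] + in[n - 1] + boundary) / 3.0;  // peeled last iteration
    return out;
}

// Variant 2: halo (ghost) cells. The input is copied into an array with one
// extra cell on each side holding the boundary value, so every output point
// is computed by the same branch-free expression.
std::vector<double> stencil_halo(const std::vector<double>& in, double boundary) {
    const std::size_t n = in.size();
    std::vector<double> padded(n + 2, boundary);      // halo cells at index 0 and n + 1
    std::copy(in.begin(), in.end(), padded.begin() + 1);
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = (padded[i] + padded[i + 1] + padded[i + 2]) / 3.0;
    }
    return out;
}
```

Both variants give the compiler an inner loop with unit-stride accesses and no conditionals. In a real application the grid would normally be allocated with its halo from the start rather than copied on every call, which is done here only to keep the sketch self-contained.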
Such halo cells are a commonly used design pattern in many high performance
applications. For example, adding halo cells to an array of 1024 x 1024 x 1024
doubles increases its 8 GB footprint by only 48 MB.
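The footprint figures above can be checked with a short calculation. The sketch below assumes a halo one cell thick on every face of the grid (the text does not state the width) and 8-byte doubles. The function names grid_bytes and halo_overhead_bytes are hypothetical.

```cpp
#include <cstdint>

// Bytes needed for an n x n x n grid of doubles without any halo.
constexpr std::uint64_t grid_bytes(std::uint64_t n) {
    return n * n * n * sizeof(double);
}

// Extra bytes needed when the grid is wrapped in a halo `width` cells
// thick on every face, i.e. the padded grid has (n + 2 * width) cells
// along each dimension.
constexpr std::uint64_t halo_overhead_bytes(std::uint64_t n, std::uint64_t width) {
    return ((n + 2 * width) * (n + 2 * width) * (n + 2 * width) - n * n * n)
           * sizeof(double);
}
```

Under these assumptions, grid_bytes(1024) is 8,589,934,592 bytes (8 GiB) and halo_overhead_bytes(1024, 1) is 50,430,016 bytes, roughly 48 MiB, or about 0.6% of the original footprint.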
These are some of the optimizations that help the compiler make the best use of
these CPUs.
Other examples of optimizations that can be applied when running applications are
the following: