Dr Edward Schofield, A*STAR / Singapore Computational Sciences Club Seminar, June 14, 2011

Most scientists and engineers are:
- programming for 50+% of their work time (and rising)
- self-taught programmers
- using inefficient programming practices
- using the wrong programming languages: C++, FORTRAN, C#, PHP, Java, ...

- Rapid prototyping
- Efficiency for computational kernels
- Pre-written packages! Vectors, matrices, modelling, simulations, visualisation
- Extensibility; web front-ends; database backends; ...

- PhD in statistical pattern recognition: 2001-2006
- Needed good tools for my research!
- Discovered Python in 2002 after frustration with C++, Matlab, Java, Perl
- Contributed to NumPy and SciPy: maxent, sparse matrices, optimization, Monte Carlo, etc.
- Managed six releases of SciPy in 2005-6

1. Why Python?

Introducing Python

What is Python?

- interpreted
- strongly but dynamically typed
- object-oriented
- intuitive, readable
- open source, free
- batteries included

batteries included

Python's standard library is:
- very large
- well-supported
- well-documented

It covers: data types, operating system, CGI, testing, calendar, strings, compression, complex numbers, multimedia, email, networking, GUI, FTP, databases, XML, threads, arguments, cryptography, CSV files, serialization

Native Python code executes 10x more slowly than C and FORTRAN

... to get to Kuala Lumpur ASAP?

| Date      | Cost per GFLOPS (US $) | Technology                          |
|-----------|------------------------|-------------------------------------|
| 1961      | $1.1 trillion          | 17 million IBM 1620s                |
| 1984      | $15,000,000            | Cray X-MP                           |
| 1997      | $30,000                | Two 16-CPU clusters of Pentiums     |
| 2000, Apr | $1000                  | Bunyip Beowulf cluster              |
| 2003, Aug | $82                    | KASY0                               |
| 2007, Mar | $0.42                  | Ambric AM2045                       |
| 2009, Sep | $0.13                  | ATI Radeon R800                     |

Source: Wikipedia: FLOPS

Proxy for cost of programmer time

Efficiency

When FORTRAN was invented, computer time was more expensive than programmer time. In the 1980s and 1990s that reversed.

Efficient programming

What if ...

... you now need to reach Sydney?

Advantages of Python

- Easy to write
- Easy to maintain
- Great standard libraries
- Thriving ecosystem of third-party packages
- Open source

Question

What is the date 177 days from now?
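One way to answer this question with the standard library alone is sketched below. The helper name `date_after` is an illustrative assumption, not something from the slides; it simply adds a `timedelta` to a `date`.

```python
from datetime import date, timedelta


def date_after(days, start=None):
    """Return the calendar date `days` days after `start` (default: today)."""
    if start is None:
        start = date.today()
    return start + timedelta(days=days)


# 177 days from the day this script is run (output depends on the run date)
print(date_after(177))

# A fixed start date makes the result reproducible
print(date_after(177, start=date(2011, 6, 14)))  # 2011-12-08
```

The point is that no third-party package is needed: `datetime` handles month lengths and leap years.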

- Rapid prototyping
- Plotting, visualisation, 3D
- Numerical computing
- Web and database programming
- All-purpose glue

Python, Fortran, Java, C, C++, C#

- A different language for each task?
- A language you know?
- A language others in your team are using: support and help?

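| Feature                   | Python   |
|---------------------------|----------|
| Interpreted               | Yes      |
| Powerful data input/output| Yes      |
| Great plotting            | Yes      |
| General-purpose language  | Powerful |
| Cost                      | Free     |
| Open source               | Yes      |

| Feature                    | Python |
|----------------------------|--------|
| Powerful                   | Yes    |
| Portable                   | Yes    |
| Standard libraries         | Vast   |
| Easy to write and maintain | Yes    |
| Easy to learn              | Yes    |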

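| Feature                                                          | Python | Other language |
|------------------------------------------------------------------|--------|----------------|
| Fast to write                                                    | Yes    | No             |
| Good for embedded systems, device drivers and operating systems  | No     | Yes            |
| Good for most other high-level tasks                             | Yes    | No             |
| Standard library                                                 | Vast   | Limited        |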

| Feature                          | Python |
|----------------------------------|--------|
| Powerful, well-designed language | Yes    |
| Standard libraries               | Vast   |
| Easy to learn                    | Yes    |
| Code brevity                     | Short  |
| Easy to write and maintain       | Yes    |

Open source

Python is open source software. Benefits:
- No vendor lock-in
- Cross-platform
- Insurance against bugs in the platform
- Free

- Computer graphics: Industrial Light & Magic
- Web: Google (News, Groups, Maps, Gmail)
- Legacy system integration: AstraZeneca (collaborative drug discovery)
- Aerospace: NASA
- Research: universities worldwide ...
- Others: YouTube, Reddit, BitTorrent, Civilization IV, ...

- Python spread from scripting to the entire production pipeline
- Numerous reviews since 1996: Python is still the best tool for them

A common sentiment: "We achieve immediate functioning code so much faster in Python than in any other language that it's staggering." - Robin Friedrich, Senior Project Engineer

Eric Newton, Python for Critical Applications: http://metaslash.com/brochure/recall.html

Metaslash, Inc: 1999 to 2001
- Mission-critical system for air-traffic control
- Replicated, fault-tolerant data storage

Python prototype -> C++ implementation -> Python again

Why?
- C++ dependencies were buggy
- C++ threads, STL were not portable enough

Python's advantages over C++:
- More portable
- 75% less code: more productivity, fewer bugs

See http://www.python.org/about/success/ for lots more case studies and success stories

- Small beginnings
- Piecemeal growth, quirky interfaces ...
- Large, cumbersome systems

NumPy

- An n-dimensional array/matrix package
- Centre of Python's numerical computing ecosystem
- The most fundamental tool for numerical computing in Python
- Fast multi-dimensional array capability

Two fundamental objects:
1. n-dimensional array
2. universal function

Plus:
- a rich set of numerical data types
- nearly 400 functions and methods on arrays: type conversions, mathematical, logical

NumPy's features

- Fast. Written in C with BLAS/LAPACK hooks.
- Rich set of data types
- Linear algebra: matrix inversion, decompositions, ...
- Discrete Fourier transforms
- Random number generation
- Trig, hypergeometric functions, etc.

Loops are mostly unnecessary. Operate on entire arrays!

    >>> import numpy
    >>> a = numpy.array([20, 30, 40, 50])
    >>> a < 35
    array([ True,  True, False, False])
    >>> b = numpy.arange(4)
    >>> a - b
    array([20, 29, 38, 47])
    >>> b**2
    array([0, 1, 4, 9])

Universal functions

NumPy defines 'ufuncs' that operate on entire arrays and other sequences (hence 'universal'). Example: sin()

    >>> a = numpy.array([20, 30, 40, 50])
    >>> c = 10 * numpy.sin(a)
    >>> c
    array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

Array slicing

    >>> a = numpy.arange(10)**3
    >>> a
    array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
    >>> a[2:5]
    array([ 8, 27, 64])

Fancy indexing

    >>> a = numpy.arange(12)**2
    >>> ind = numpy.array([1, 1, 3, 8, 5])
    >>> a[ind]
    array([ 1,  1,  9, 64, 25])

- Matrix inversion: mat(A).I or linalg.inv(A)
- Linear solvers: linalg.solve(A, x)
- Pseudoinverse: linalg.pinv(A)
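The linear algebra calls listed above can be tried on a small system. The following is a minimal sketch; the matrices `A` and `B` and the vector `b` are made-up illustration data, not values from the slides.

```python
import numpy as np

# A small, well-conditioned 2x2 system A x = b (illustrative values)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

# Explicit inverse
A_inv = np.linalg.inv(A)

# Solving directly is usually preferable to forming the inverse
x = np.linalg.solve(A, b)

# Pseudoinverse of a non-square matrix with full column rank
B = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
B_pinv = np.linalg.pinv(B)

print(x)                                      # solution of A x = b
print(np.allclose(A_inv @ b, x))              # both routes give the same x
print(np.allclose(B_pinv @ B, np.eye(2)))     # left inverse of B
```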

What is SciPy?

- Back-end: computational work
- Front-end: input / output, visualization, GUIs
- Dozens of great scientific packages exist

- NumPy: numerical / array module
- Matplotlib: great 2D and 3D plotting library
- IPython: nice interactive Python shell
- SciPy: set of scientific libraries: sparse matrices, signal processing, ...
- RPy: integration with the R statistical environment
- Cython: C language extensions
- Mayavi: 3D graphics, volumetric rendering
- Nitime, Nipype: Python tools for neuroimaging
- SymPy: symbolic mathematics library
- VPython: easy, real-time 3D programming
- UCSF Chimera, PyMOL, VMD: molecular graphics
- PyRAF: Hubble Space Telescope interface to IRAF astronomical data
- BioPython: computational molecular biology
- Natural language toolkit: symbolic + statistical NLP
- Physics: PyROOT

BSD-licensed software for maths, science, engineering

integration, optimization, interpolation, FFTs, clustering, signal processing, linear algebra, ODEs, n-dim image processing, sparse matrices, maximum entropy, statistics, scientific constants, C/C++ and Fortran integration

Fit a model to noisy data: $y = \frac{a}{x^{b}} \sin(cx) + \varepsilon$

scipy.optimize

Task: Fit a model of the form $y = \frac{a}{x^{b}} \sin(cx) + \varepsilon$ to noisy data.

Spec:
1. Generate noisy data
2. Choose parameters (a, b, c) to minimize sum squared errors
3. Plot the data and fitted model (next session)

    import numpy
    import matplotlib.pyplot as plt
    from scipy.optimize import leastsq

    def myfunc(params, x):
        # Model: y = a / x**b * sin(c * x)
        (a, b, c) = params
        return a / (x**b) * numpy.sin(c * x)

    true_params = [1.5, 0.1, 2.]

    def f(x):
        return myfunc(true_params, x)

    def err(params, x, y):
        # Error function: residuals between model and data
        return myfunc(params, x) - y

    # Generate noisy data to fit
    n = 30; xmin = 0.1; xmax = 5
    x = numpy.linspace(xmin, xmax, n)
    y = f(x)
    y += numpy.random.rand(len(x)) * 0.2 * (y.max() - y.min())

    v0 = [3., 1., 4.]  # initial parameter estimate

    # Fitting
    v, success = leastsq(err, v0, args=(x, y), maxfev=10000)
    print('Estimated parameters: ', v)
    print('True parameters: ', true_params)

    # Plot the data and the fitted model
    X = numpy.linspace(xmin, xmax, 5 * n)
    plt.plot(x, y, 'ro', X, myfunc(v, X))
    plt.show()

Construct and solve a sparse linear system

Sparse matrices

- Sparse matrices are mostly zeros.
- They can be symmetric or asymmetric.
- Sparsity patterns vary: block sparse, band matrices, ...
- They can be huge!
- Only non-zeros are stored.

SciPy supports seven sparse storage schemes ... and sparse solvers in Fortran.

To construct a 1000x1000 lil_matrix and add values:

    >>> from scipy.sparse import lil_matrix
    >>> from numpy.random import rand
    >>> from scipy.sparse.linalg import spsolve
    >>> A = lil_matrix((1000, 1000))
    >>> A[0, :100] = rand(100)
    >>> A[1, 100:200] = A[0, :100]
    >>> A.setdiag(rand(1000))

Now convert the matrix to CSR format and solve Ax=b:

    >>> A = A.tocsr()
    >>> b = rand(1000)
    >>> x = spsolve(A, b)

Convert it to a dense array and solve, and check that the result is the same:

    >>> from numpy.linalg import solve, norm
    >>> x_ = solve(A.toarray(), b)

Compute the norm of the error:

    >>> err = norm(x - x_)
    >>> err < 1e-10
    True

Matplotlib

- Great plotting package in Python
- Matlab-like syntax
- Great rendering: anti-aliasing etc.
- Many backends: Cairo, GTK, Cocoa, PDF
- Flexible output: to EPS, PS, PDF, TIFF, PNG, ...

Search the web for 'Matplotlib gallery'

1. Use a Monte Carlo algorithm to estimate π:
   1. Generate uniform random variates (x, y), with x and y each over [0, 1].
   2. Estimate π from the proportion p that land in the unit circle (π ≈ 4p).
2. Time two ways of doing this:
   1. Using for loops
   2. Using array operations (vectorized)
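One possible solution to this exercise is sketched below. It assumes the estimate is 4p, since the quarter of the unit circle inside the unit square has area π/4. The function names, the sample size, and the seeds are illustrative choices, not taken from the slides.

```python
import math
import random
import time

import numpy as np


def estimate_pi_loop(n, seed=0):
    """Estimate pi with a plain Python for loop."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n


def estimate_pi_vectorized(n, seed=0):
    """Estimate pi with NumPy array operations (no explicit loop)."""
    rng = np.random.default_rng(seed)
    x = rng.random(n)
    y = rng.random(n)
    inside = np.count_nonzero(x * x + y * y <= 1.0)
    return 4.0 * inside / n


def timed(func, n):
    """Return (result, elapsed seconds) for one call of func(n)."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start


n = 200_000
for func in (estimate_pi_loop, estimate_pi_vectorized):
    estimate, seconds = timed(func, n)
    print(f"{func.__name__}: pi ~ {estimate:.4f} "
          f"(error {abs(estimate - math.pi):.4f}) in {seconds:.4f} s")
```

The timings vary by machine, so no specific speedup is claimed here; the vectorized version is typically much faster because the loop runs in compiled code.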

3. Scaling

HPC

High-performance computing

Aspects to HPC

- Supercomputers
- Parallel programming
- Caches, shared memory
- Code porting
- Distributed clusters / grids
- Scripting
- Job control
- Specialized hardware

Advantages:
- Portability
- Easy scripting, glue
- Maintainability
- Profiling to identify hotspots
- Vectorization with NumPy

Disadvantages:
- Global interpreter lock
- Less control than C
- Native loops are slow

Useful Python language features:
- Generators, iterators

Useful packages:
- Great HDF5 support from PyTables!
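As a rough illustration of why generators matter for large data sets, the sketch below streams values one at a time instead of loading everything into memory. The helper names `read_values` and `running_mean` and the sample input are hypothetical.

```python
def read_values(lines):
    """Lazily yield one float per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield float(line)


def running_mean(values):
    """Lazily yield the mean of all values seen so far."""
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        yield total / count


# Any iterable of text lines works here, including an open file object
lines = iter(["1.0\n", "2.0\n", "\n", "6.0\n"])
means = list(running_mean(read_values(lines)))
print(means)  # [1.0, 1.5, 3.0]
```

Because each stage pulls one item at a time from the previous one, memory use stays constant regardless of how many lines the input has.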

Hierarchical data

Databases without the relational baggage

Efficient support for massive data sets

Applications of PyTables

aeronautics, drug discovery, financial analysis, climate prediction, telecommunications, data mining, statistical analysis, etc.

PyTables Pro is now being open sourced.
- Indexed searches for speed
- Merging with PyTables
- Working project name: NewPyTables

PyTables performance

OPSI indexing engine speed:
- Querying 10 billion rows can take hundredths of a second!
- Target use-case: mostly read-only or append-only data

Important principles

1. "Premature optimization is the root of all evil"
   Don't write cryptic code just to make it more efficient!

2. 1-5% of the code takes up the vast majority of the computing time!
   ... and it might not be the 1-5% that you think!

From most to least important:
1. Check: Do you really need to make it more efficient?
2. Check: Are you using the right algorithms and data structures?
3. Check: Are you reusing pre-written libraries wherever possible?
4. Check: Which parts of the code are expensive? Measure, don't guess!
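"Measure, don't guess" can be put into practice with the standard library profiler. The sketch below uses `cProfile` and `pstats`; the functions being profiled are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop (stand-in for a hotspot)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant(n):
    """Cheap function that should barely register in the profile."""
    return n


def main():
    slow_sum_of_squares(200_000)
    fast_constant(10)


# Profile one call of main() and collect the statistics
profiler = cProfile.Profile()
profiler.runcall(main)

# Format the five most expensive entries, sorted by cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and times per function, which shows where optimization effort would actually pay off.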

Exponential-order and polynomial-order speedups are possible by choosing the right algorithm for a task.
- These require the right data structures!
- These dwarf 10-25x linear-order speedups from:
  - using lower-level languages
  - using different language constructs.
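A small illustration of how a data structure choice changes the order of growth: membership tests on a list scan every element, while a set uses hashing. The function names and the data sizes below are illustrative assumptions, and the printed timings depend on the machine.

```python
import timeit


def count_hits_list(haystack, needles):
    """Count needles found in a list: each lookup is O(len(haystack))."""
    return sum(1 for item in needles if item in haystack)


def count_hits_set(haystack, needles):
    """Count needles found via a set: each lookup is O(1) on average."""
    lookup = set(haystack)
    return sum(1 for item in needles if item in lookup)


haystack = list(range(2_000))
needles = list(range(0, 4_000, 2))  # 2000 even numbers, half of them present

hits_list = count_hits_list(haystack, needles)
hits_set = count_hits_set(haystack, needles)
print(hits_list, hits_set)  # both approaches agree on the count

t_list = timeit.timeit(lambda: count_hits_list(haystack, needles), number=3)
t_set = timeit.timeit(lambda: count_hits_set(haystack, needles), number=3)
print(f"list: {t_list:.4f} s, set: {t_set:.4f} s")
```

The gap widens as the data grows, which is the kind of speedup that no change of language alone provides.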

The largest Python training provider in South-East Asia

Delighted customers include:

Courses:
- Python for Programmers: 4 days
- Python for Scientists and Engineers: 3 days
- Python for Geoscientists: 3 days
- Python for Bioinformaticians: 4 days

New courses:
- Python for Financial Engineers: 4 days
- Python for IT Security Professionals: 4 days

Topics:
- Python: beginners, advanced
- Scientific data processing with Python
- Software engineering with Python
- Large-scale problems: HPC, huge data sets, grids
- Statistics and Monte Carlo problems
- Spatial data analysis / GIS
- General scripting, job control, glue
- GUIs with PyQt
- Integrating with other languages: R, C, C++, Fortran, ...
- Web development in Django
