You are on page 1of 91

Introduction to Python, COM and

PythonCOM
Mark Hammond
Skippi-Net, Melbourne, Australia
mhammond@skippinet.com.au
http://starship.python.net/crew/mhammond
Introduction to Python, COM and
PythonCOM
The plan
• Section I - Intro to Python
– 30-45 mins
• Section II - Intro to COM
– 30-45 mins
• Python and COM
– The rest!
Section I

Introduction to Python
What Is Python?
• Created in 1990 by Guido van Rossum
– While at CWI, Amsterdam
– Now hosted by centre for national research
initiatives, Reston, VA, USA
• Free, open source
– And with an amazing community
• Object oriented language
– “Everything is an object”
Why Python?
• Designed to be easy to learn and master
– Clean, clear syntax
– Very few keywords
• Highly portable
– Runs almost anywhere - high end servers and workstations,
down to windows CE
– Compiled to machine independent byte-codes
• Extensible
– Designed to be extensible using C/C++, thereby allowing
access to many external libraries
Most obvious and notorious
features
• Clean syntax plus high-level data types
– Leads to fast coding
• Uses white-space to delimit blocks
– Humans generally do, so why not the language?
– Try it, you will end up liking it
• Variables do not need declaration
– Although not a type-less language
Pythonwin
• We are using Pythonwin
– Only available on Windows
– GUI toolkit using Tkinter available for most
platforms
– Standard console Python available on all platforms
• Has interactive mode for quick testing of code
• Includes debugger and Python editor
Interactive Python
• Starting Python.exe, or any of the GUI
environments present an interactive mode
–>>> prompt indicates start of a statement or
expression
– If incomplete, ... prompt indicates second and
subsequent lines
– All expression results printed back to interactive
console
Variables and Types (1 of 3)
• Variables need no declaration
• >>> a=1
>>>
• As a variable assignment is a statement, there is no
printed result
• >>> a
1
• Variable name alone is an expression, so the result
is printed
Variables and Types (2 of 3)
• Variables must be created before they can be
used
• >>> b
Traceback (innermost last):
File "<interactive input>", line
1, in ?
NameError: b
>>>
• Python uses exceptions - more detail later
Variables and Types (3 of 3)
• Objects always have a type
• >>> a = 1
>>> type(a)
<type 'int'>
>>> a = "Hello"
>>> type(a)
<type 'string'>
>>> type(1.0)
<type 'float'>
Assignment versus Equality
Testing
• Assignment performed with single =
• Equality testing done with double = (==)
– Sensible type promotions are defined
– Identity tested with is operator.
• >>> 1==1
1
>>> 1.0==1
1
>>> "1"==1
0
Simple Data Types
• Strings
– May hold any data, including embedded NULLs
– Declared using either single, double, or triple quotes
– >>> s = "Hi there"
>>> s
'Hi there'
>>> s = "Embedded 'quote'"
>>> s
"Embedded 'quote'"
Simple Data Types
– Triple quotes useful for multi-line strings
– >>> s = """ a long
... string with "quotes" or
anything else"""
>>> s
' a long\012string with "quotes"
or anything else'
>>> len(s)
45
Simple Data Types
• Integer objects implemented using C longs
– Like C, integer division returns the floor
– >>> 5/2
2
• Float types implemented using C doubles
• Long Integers have unlimited size
– Limited only by available memory
– >>> long = 1L << 64
>>> long ** 5
213598703592091008239502170616955211460270452235665276994
7041607822219725780640550022962086936576L
High Level Data Types
• Lists hold a sequence of items
– May hold any object
– Declared using square brackets
• >>> l = []# An empty list
>>> l.append(1)
>>> l.append("Hi there")
>>> len(l)
2
High Level Data Types
• >>> l
[1, 'Hi there']
>>>
>>> l = ["Hi there", 1, 2]
>>> l
['Hi there', 1, 2]
>>> l.sort()
>>> l
[1, 2, 'Hi there']
High Level Data Types
• Tuples are similar to lists
– Sequence of items
– Key difference is they are immutable
– Often used in place of simple structures
• Automatic unpacking
• >>> point = 2,3
>>> x, y = point
>>> x
2
High Level Data Types
• Tuples are particularly useful to return
multiple values from a function
• >>> x, y = GetPoint()
• As Python has no concept of byref
parameters, this technique is used widely
High Level Data Types
• Dictionaries hold key-value pairs
– Often called maps or hashes. Implemented using
hash-tables
– Keys may be any immutable object, values may be
any object
– Declared using braces
• >>> d={}
>>> d[0] = "Hi there"
>>> d["foo"] = 1
High Level Data Types
• Dictionaries (cont.)
• >>> len(d)
2
>>> d[0]
'Hi there'
>>> d = {0 : "Hi there", 1 :
"Hello"}
>>> len(d)
2
Blocks
• Blocks are delimited by indentation
– Colon used to start a block
– Tabs or spaces may be used
– Maxing tabs and spaces works, but is discouraged
• >>> if 1:
... print "True"
...
True
>>>
Blocks
• Many people hate this when they first see it
– Almost all Python programmers come to love it
• Humans use indentation when reading code to
determine block structure
– Ever been bitten by the C code?:
• if (1)
printf("True");
CallSomething();
Looping
• The for statement loops over sequences
• >>> for ch in "Hello":
... print ch
...
H
e
l
l
o
>>>
Looping
• Built-in function range() used to build
sequences of integers
• >>> for i in range(3):
... print i
...
0
1
2
>>>
Looping
• while statement for more traditional loops
• >>> i = 0
>>> while i < 3:
... print i
... i = i + 1
...
0
1
2
>>>
Functions
• Functions are defined with the def
statement:
• >>> def foo(bar):
... return bar
>>>
• This defines a trivial function named foo
that takes a single parameter bar
Functions
• A function definition simply places a function
object in the namespace
• >>> foo
<function foo at fac680>
>>>
• And the function object can obviously be called:
• >>> foo(3)
3
>>>
Classes
• Classes are defined using the class statement
• >>> class Foo:
... def __init__(self):
... self.member = 1
... def GetMember(self):
... return self.member
...
>>>
Classes
• A few things are worth pointing out in the
previous example:
– The constructor has a special name __init__,
while a destructor (not shown) uses __del__
– The self parameter is the instance (ie, the this
in C++). In Python, the self parameter is explicit
(c.f. C++, where it is implicit)
– The name self is not required - simply a
convention
Classes
• Like functions, a class statement simply adds a
class object to the namespace
• >>> Foo
<class __main__.Foo at 1000960>
>>>
• Classes are instantiated using call syntax
• >>> f=Foo()
>>> f.GetMember()
1
Modules
• Most of Python’s power comes from modules
• Modules can be implemented either in
Python, or in C/C++
• import statement makes a module available
• >>> import string
>>> string.join( ["Hi", "there"] )
'Hi there'
>>>
Exceptions
• Python uses exceptions for errors
– try / except block can handle exceptions
• >>> try:
... 1/0
... except ZeroDivisionError:
... print "Eeek"
...
Eeek
>>>
Exceptions
• try / finally block can guarantee execute of
code even in the face of exceptions
• >>> try:
... 1/0
... finally:
... print "Doing this anyway"
...
Doing this anyway
Traceback (innermost last): File "<interactive input>",
line 2, in ?
ZeroDivisionError: integer division or modulo
>>>
Threads
• Number of ways to implement threads
• Highest level interface modelled after Java
• >>> class DemoThread(threading.Thread):
... def run(self):
... for i in range(3):
... time.sleep(3)
... print i
...
>>> t = DemoThread()
>>> t.start()
>>> t.join()
0
1 <etc>
Standard Library
• Python comes standard with a set of modules,
known as the “standard library”
• Incredibly rich and diverse functionality
available from the standard library
– All common internet protocols, sockets, CGI, OS
services, GUI services (via Tcl/Tk), database,
Berkeley style databases, calendar, Python parser,
file globbing/searching, debugger, profiler,
threading and synchronisation, persistency, etc
External library
• Many modules are available externally
covering almost every piece of functionality
you could ever desire
– Imaging, numerical analysis, OS specific
functionality, SQL databases, Fortran interfaces,
XML, Corba, COM, Win32 API, etc
• Way too many to give the list any justice
Python Programs
• Python programs and modules are written as text
files with traditionally a .py extension
• Each Python module has its own discrete namespace
• Python modules and programs are differentiated
only by the way they are called
– .py files executed directly are programs (often referred to
as scripts)
– .py files referenced via the import statement are
modules
Python Programs
• Thus, the same .py file can be a
program/script, or a module
• This feature is often used to provide
regression tests for modules
– When module is executed as a program, the
regression test is executed
– When module is imported, test functionality is not
executed
Python “Protocols”
• Objects can support a number of different
“protocols”
• This allows your objects to be treated as:
– a sequence - ie, indexed or iterated over
– A mapping - obtain/assign keys or values
– A number - perform arithmetic
– A container - perform dynamic attribute fetching and
setting
– Callable - allow your object to be “called”
Python “Protocols”
• Sequence and container example
• >>> class Protocols:
... def __getitem__(self, index):
... return index ** 2
... def __getattr__(self, attr):
... return "A big " + attr
...
>>> p=Protocols()
>>> p[3]
9
>>> p.Foo
'A big Foo’
>>>
More Information on Python
• Can’t do Python justice in this short time frame
– But hopefully have given you a taste of the
language
• Comes with extensive documentation,
including an excellent tutorial and library
reference
– Also a number of Python books available
• Visit www.python.org for more details
Section II

Introduction to COM
What is COM?
• Acronym for Component Object Model, a
technology defined and implemented by
Microsoft
• Allows “objects” to be shared among many
applications, without applications knowing
the implementation details of the objects
• A broad and complex technology
• We can only provide a brief overview here
What was COM
• COM can trace its lineage back to DDE
• DDE was expanded to Object Linking and
Embedding (OLE)
• VBX (Visual Basic Extensions) enhanced
OLE technology for visual components
• COM was finally derived as a general
purpose mechanism
– Initially known as OLE2
COM Interfaces
• COM relies heavily on interfaces
• An interface defines functionality, but not
implementation
– Each object (or more correctly, each class) defines
implementation of the interface
– Each implementation must conform to the interface
• COM defines many interfaces
– But often does not provide implementation of these
interfaces
COM Interfaces
• Interfaces do not support properties
– We will see how COM properties are typically
defined later
• Interfaces are defined using a vtable scheme
similar to how C++ defines virtual methods
• All interfaces have a unique ID (an IID)
– Uses a Universally Unique Identifer (UUID)
– UUIDs used for many COM IDs, including IIDs
Classes
• COM defines the concept of a class, used to create
objects
– Conceptually identical to a C++ or Python class
– To create an object, COM locates the class factory, and
asks it to create an instance
• Classes have two identifiers
– Class ID (CLSID) is a UUID, so looks similar to an IID
– ProgID is a friendly string, and therefore not guaranteed
unique
IUnknown interface
• Base of all COM interfaces
– By definition, all interfaces also support the
IUnknown interface
• Contains only three methods
– AddRef() and Release() for managing COM
lifetimes
• COM lifetimes are based on reference counts
– QueryInterface() for obtaining a new interface
from the object
Creating objects, and obtaining
interfaces
• To create an object, the programmer specifies
either the ProgID, or the CLSID
• This process always returns the requested
interface
– or fails!
• New interfaces are obtained by using the
IUnknown::QueryInterface() method
– As each interface derives from IUnknown, each
interface must support QI
Other standard interfaces
• COM defines many interfaces, for example:
• IStream
– Defines file like operations
• IStorage
– Defines file system like semantics
• IPropertyPage
– Defines how a control exposes a property page
• etc - many many interfaces are defined
– But not many have implementations
Custom interfaces
• COM allows you to define your own
interfaces
• Interfaces are defined using an Interface
Definition Language (IDL)
• Tools available to assign unique IIDs for the
interface
• Any object can then implement or consume
these interfaces
IDispatch - Automation objects
• IDispatch interface is used to expose dynamic object
models
• Designed explicitly for scripting languages, or for
those languages that can not use normal COM
interfaces
– eg, where the interface is not known at compile time, or
there is no compile time at all
• IDispatch is used extensively
– Microsoft Office, Netscape, Outlook, VB, etc - almost
anything designed to be scripted
IDispatch - Automation objects
• Methods and properties of the object model
can be determined at runtime
– Concept of Type Libraries, where the object
model can be exposed at compile time
• Methods and properties are used indirectly
– GetIDsOfNames() method is used to get an ID for
a method or property
– Invoke() is used to make the call
IDispatch - Automation objects
• Languages usually hide these implementation
details from the programmer
• Example: object.SomeCall()
– Behind the scenes, your language will:
id = GetIDsOfNames("SomeCall")
Invoke(id, DISPATCH_METHOD)
• Example: object.SomeProp
– id = GetIDsOfNames("SomeProp")
Invoke(id, DISPATCH_PROP_GET)
VARIANTs
• The IDispatch interface uses VARIANTs as its
primary data type
• Simply a C union that supports the common data
types
– and many helper functions for conversion etc
• Allows the single Invoke() call to accept almost
any data type
• Many languages hide these details, performing
automatic conversion as necessary
Implementation models
• Objects can be implemented in a number of ways
– InProc objects are implemented as DLLs, and loaded
into the calling process
• Best performance, as no marshalling is required
– LocalServer/RemoteServer objects are implemented as
stand-alone executables
• Safer due to process isolation, but slower due to remoting
• Can be both, and caller can decide, or let COM
choose the best
Distributed COM
• DCOM allows objects to be remote from their
caller
• DCOM handles all marshalling across machines
and necessary security
• Configuration tools allow an administrator to
configure objects so that neither the object nor
the caller need any changes
– Although code changes can be used to explicitly
control the source of objects
The Windows Registry
• Information on objects stored in the Windows
registry
– ProgID to CLSID mapping
– Name of DLL for InProc objects
– Name of EXE for LocalServer objects
– Other misc details such as threading models
• Lots of other information also maintained in
registry
– Remoting proxies, object security, etc.
Conclusion for Section II
• COM is a complex and broad beast
• Underlying it all is a fairly simple Interface
model
– Although all the bits around the edges combine to
make it very complex
• Hopefully we have given enough framework
to put the final section into some context
Section III

Python and COM


PythonCOM architecture
• Underpinning everything is an extension
module written in C++ that provides the core
COM integration
– Support for native COM interfaces exist in this
module
– Python reference counting is married with COM
reference counting
• Number of Python implemented modules that
provide helpers for this core module
Interfaces supported by
PythonCOM
• Over 40 standard interfaces supported by the
framework
• Extension architecture where additional
modules can add support for their own
interfaces
– Tools supplied to automate this process
– Number of extension modules supplied, bringing
total interfaces to over 100
Using Automation from Python
• Automation uses IDispatch to determine
object model at runtime
• Python function
win32com.client.Dispatch() provides
this run-time facility
• Allows a native “look and feel” to these
objects
Using Automation - Example
• We will use Excel for this demonstration
• Excel ProgId is Excel.Application
• >>> from win32com.client import Dispatch
>>> xl=Dispatch("Excel.Application")
>>> xl
<COMObject Excel.Application>
>>>
Using Automation - Example
• Now that we have an Excel object, we can call
methods and set properties
• But we can’t see Excel running?
– It has a visible property that may explain things
• >>> xl.Visible
0
• Excel is not visible, let’s make it visible
>>> xl.Visible = 1
>>>
Automation - Late vs. Early
Bound
• In the example we just saw, we have been
using late bound COM
– Python has no idea what properties or methods
are available
– As we attempt a method or property access,
Python dynamically asks the object
– Slight performance penalty, as we must resolve
the name to an ID (GetIDsOfNames()) before
calling Invoke()
Automation - Late vs. Early
Bound
• If an object provides type information via a
Type Library, Python can use early bound
COM
• Implemented by generating a Python source
file with all method and property definitions
– Slight performance increase as all names have
been resolved to IDs at generation time, rather
than run-time
Automation - Late vs. Early
Bound
• Key differences between the 2 techniques:
– Late bound COM often does not know the specific
types of the parameters
• Type of the Python object determines the VARIANT type
created
• Does not know about ByRef parameters, so no parameters
are presented as ByRef
– Early bound COM knows the types of the parameters
• All Python types are coerced to the correct type
• ByRef parameters work (returned as tuples)
Playing with Excel
• >>> xl.Workbooks.Add()
<COMObject <unknown>>
>>> xl.Range("A1:C1").Value = "Hi",
"From", "Python"
>>> xl.Range("A1:C1").Value
((L'Hi', L'From', L'Python'),)
>>> xl.Range("A1:C1").PrintOut()
>>>
How did we know the methods?
• Indeed, how did we know to use
“Excel.Application”?
• No easy answer - each application/object defines
their own object model and ProgID
• Documentation is the best answer
– Note that MSOffice does not install the COM
documentation by default - must explicitly select it
during setup
• COM browsers can also help
Native Interfaces from Python
• Examples so far have been using using
IDispatch
– win32com.client.Dispatch() function
hides the gory details from us
• Now an example of using interfaces natively
– Need a simple example - Windows Shortcuts fits the
bill
• Develop some code that shows information
about a Windows shortcut
Native Interfaces from Python
• 4 main steps in this process
– Import the necessary Python modules
– Obtain an object that implements the IShellLink
interface
– Obtain an IPersistFile interface from the object,
and load the shortcut
– Use the IShellLink interface to get information
about the shortcut
Native Interfaces from Python
Step 1: Import the necessary Python modules
– We need the pythoncom module for core COM
support
– We need the win32com.shell.shell
module
• This is a PythonCOM extension that exposes
additional interfaces
• >>> import pythoncom
>>> from win32com.shell import shell
Native Interfaces from Python
Step 2: Obtain an object that implements the
IShellLink interface
• Use the COM function
CoCreateInstance()
• Use the published CLSID for the shell
• The shell requires that the object request be
for an InProc object.
• Request the IShellLink interface
Native Interfaces from Python
Step 2: Obtain an object that implements the
IShellLink interface
• >>> sh =
pythoncom.CoCreateInstance(shell.CLSID_ShellLink,

... None,
... pythoncom.CLSCTX_INPROC_SERVER,
... shell.IID_IShellLink)
>>> sh
<PyIShellLink at 0x1630b04 with obj at 0x14c9d8>
>>>
Native Interfaces from Python
Step 3: Obtain an IPersist interface from the
object, and load the shortcut
• Use QueryInterface to obtain the new interface
from the object
• >>> pe=
sh.QueryInterface(pythoncom.IID_IPersistFile)
>>>
pe.Load("c:/winnt/profiles/skip/desktop/IDLE.lnk")
Native Interfaces from Python
Step 4: Use the IShellLink interface to get
information about the shortcut

• >>> sh.GetWorkingDirectory()
'L:\\src\\python-cvs\\tools\\idle'
>>> sh.GetArguments()
'l:\\src\\python-cvs\\tools\\idle\\idle.pyw'
>>>
Implementing COM using Python.
• Final part of our tutorial is implementing
COM objects using Python
• 3 main steps we must perform
– Implement a Python class that exposes the
functionality
– Annotate the class with special attributes required
by the COM framework.
– Register our COM object
Implementing COM using Python.
Step 1: Implement a Python class that exposes
the functionality
• We will build on our last example, by
providing a COM object that gets information
about a Windows shortcut
• Useful to use from VB, as it does not have
direct access to these interfaces.
• We will design a class that can be initialised
to a shortcut, and provides methods for
obtaining the info.
Implementing COM using Python
Step 1: Implement a Python class that exposes
the functionality
• Code is getting to big for the slides
• Code is basically identical to that presented
before, except:
– The code has moved into a class.
– We store the IShellInfo interface as an instance
attribute
Implementing COM using Python
Step 1: Implement a Python class that exposes
the functionality

• Provide an Init method that takes the name of


the shortcut.
• Also provide two trivial methods that simply
delegate to the IShellInfo interface
• Note Python would allow us to automate the
delegation, but that is beyond the scope of this!
Implementing COM using Python
Step 2: Annotate the class with special
attributes
• PythonCOM requires a number of special
attributes
– The CLSID of the object (a UUID)
• Generate using
print pythoncom.CreateGuid()
– The ProgID of the object (a friendly string)
• Make one up!
– The list of public methods
• All methods not listed as public are not exposed via COM
Implementing COM using Python
Step 2: Annotate the class with special
attributes
• class PyShellLink:
_public_methods_=['Init',
'GetWorkingDirectory',
'GetArguments']
_reg_clsid_="{58DE4632-9323-11D3-85F9-00C04FEFD0A7}"
_reg_progid_="Python.ShellDemo"
...
Implementing COM using Python
Step 3: Register our COM object
• General technique is that the object is
registered when the module implementing the
COM object is run as a script.
– Recall previous discussion on modules versus
scripts
• Code is trivial
– Call the UseCommandLine() method, passing
the class objects we wish to register
Implementing COM using Python
Step 3: Register our COM object
• Code is simple
– note we are passing the class object, not a class
instance.
• if __name__=='__main__':
UseCommandLine(PyShellLink)
• Running this script yields:
• Registered: Python.ShellDemo
Testing our COM object
• Our COM object can be used by any
automation capable language
– VB, Delphi, Perl - even VC at a pinch!
• We will test with a simple VBScript program
– VBScript is free. Could use full blown VB, or
VBA. Syntax is identical in all cases
Testing our COM object
• set shdemo =
CreateObject("Python.ShellDemo")

shdemo.Init("c:\winnt\profiles\skip\deskt
op\IDLE.lnk")

WScript.Echo "Working dir is " +


shdemo.GetWorkingDirectory()
• Yields
• Working dir is L:\src\python-cvs\tools\idle
Summary
• Python can make COM simple to work with
– No reference count management
– Many interfaces supported.
– Natural use for IDispatch (automation) objects.
• Simple to use COM objects, or implement
COM objects
Questions?
More Information
• Python itself
– http://www.python.org
– news:comp.lang.python
• Python on Windows
– http://www.python.org/windows
• Mark Hammond’s Python extensions
– http://starship.python.net/crew/mhammond
– http://www.python.org/windows

You might also like