Using type annotations¶
You will get the most out of mypyc if you compile code with precise type annotations. Not all type annotations will help performance equally, however. Using types such as primitive types, native classes, union types, trait types, and tuple types as much as possible is a key to major performance gains over CPython.
In contrast, some other types, including Any
, are treated as
erased types. Operations on erased types use
generic operations that work with arbitrary objects, similar to how
the CPython interpreter works. If you only use erased types, the only
notable benefits over CPython will be the removal of interpreter
overhead (from compilation) and a bit of early binding, which will usually only give minor performance
gains.
Primitive types¶
The following built-in types are treated as primitive types by mypyc, and many operations on these types have efficient implementations:
int
(native operations)i64
(documentation, native operations)i32
(documentation, native operations)i16
(documentation, native operations)float
(native operations)bool
(native operations)str
(native operations)List[T]
(native operations)Dict[K, V]
(native operations)Set[T]
(native operations)Tuple[T, ...]
(variable-length tuple; native operations)None
The link after each type lists all supported native, optimized operations for the type. You can use all operations supported by Python, but native operations will have custom, optimized implementations.
Primitive containers¶
Primitive container objects such as list
and dict
don’t
maintain knowledge of the item types at runtime – the item type is
erased.
This means that item types are checked when items are accessed, not
when a container is passed as an argument or assigned to another
variable. For example, here we have a runtime type error on the final
line of example
(the Any
type means an arbitrary, unchecked
value):
from typing import List, Any
def example(a: List[Any]) -> None:
b: List[int] = a # No error -- items are not checked
print(b[0]) # Error here -- got str, but expected int
example(["x"])
Native classes¶
Classes that get compiled to C extensions are called native classes. Most common operations on instances of these classes are optimized, including construction, attribute access and method calls.
Native class definitions look exactly like normal Python class definitions. A class is usually native if it’s in a compiled module (though there are some exceptions).
Consider this example:
class Point:
def __init__(self, x: int, y: int) -> None:
self.x = x
self.y = y
def shift(p: Point) -> Point:
return Point(p.x + 1, p.y + 1)
All operations in the above example use native operations, if the file is compiled.
Native classes have some notable different from Python classes:
Only attributes and methods defined in the class body or methods are supported. If you try to assign to an undefined attribute outside the class definition,
AttributeError
will be raised. This enables an efficient memory layout and fast method calls for native classes.Native classes usually don’t define the
__dict__
attribute (they don’t have an attribute dictionary). This follows from only having a specific set of attributes.Native classes can’t have an arbitrary metaclass or use most class decorators.
Native classes only support single inheritance. A limited form of
multiple inheritance is supported through trait types. You generally
must inherit from another native class (or object
). By default,
you can’t inherit a Python class from a native class (but there’s
an override to allow that).
See Native classes for more details.
Tuple types¶
Fixed-length
tuple types
such as Tuple[int, str]
are represented
as value types when stored in variables,
passed as arguments, or returned from functions. Value types are
allocated in the low-level machine stack or in CPU registers, as
opposed to heap types, which are allocated dynamically from the
heap.
Like all value types, tuples will be boxed, i.e. converted to corresponding heap types, when stored in Python containers, or passed to non-native code. A boxed tuple value will be a regular Python tuple object.
Union types¶
Union types and optional types that contain primitive types, native class types and trait types are also efficient. If a union type has erased items, accessing items with non-erased types is often still quite efficient.
A value with a union types is always boxed, even if it contains a value that also has an unboxed representation, such as an integer or a boolean.
For example, using Optional[int]
is quite efficient, but the value
will always be boxed. A plain int
value will usually be faster, since
it has an unboxed representation.
Trait types¶
Trait types enable a form of multiple inheritance for native classes.
A native class can inherit any number of traits. Trait types are
defined as classes using the mypy_extensions.trait
decorator:
from mypy_extensions import trait
@trait
class MyTrait:
def method(self) -> None:
...
Traits can define methods, properties and attributes. They often define abstract methods. Traits can be generic.
If a class subclasses both a non-trait class and traits, the traits must be placed at the end of the base class list:
class Base: ...
class Derived(Base, MyTrait, FooTrait): # OK
...
class Derived2(MyTrait, FooTrait, Base):
# Error: traits should come last
...
Traits have some special properties:
You shouldn’t create instances of traits (though mypyc does not prevent it yet).
Traits can subclass other traits or native classes, but the MRO must be linear (just like with native classes).
Accessing methods or attributes through a trait type is somewhat less efficient than through a native class type, but this is much faster than through Python class types or other erased types.
You need to install mypy-extensions
to use @trait
:
pip install --upgrade mypy-extensions
Erased types¶
Mypyc supports many other kinds of types as well, beyond those
described above. However, these types don’t have customized
operations, and they are implemented using type erasure. Type
erasure means that all other types are equivalent to untyped values at
runtime, i.e. they are the equivalent of the type Any
. Erased
types include these:
Python classes (including ABCs)
Non-mypyc extension types and primitive types (including built-in types that are not primitives)
Type Any
Protocol types
Using erased types can still improve performance, since they can
enable better types to be inferred for expressions that use these
types. For example, a value with type Callable[[], int]
will not
allow native calls. However, the return type is a primitive type, and
we can use fast operations on the return value:
from typing import Callable
def call_and_inc(f: Callable[[], int]) -> int:
# Slow call, since f has an erased type
n = f()
# Fast increment; inferred type of n is int (primitive type)
n += 1
return n
If the type of the argument f
was Any
, the type of n
would
also be Any
, resulting in a generic, slower increment operation
being used.
Strict runtime type checking¶
Compiled code ensures that any variable or expression with a non-erased type only has compatible values at runtime. This is in contrast with using optional static typing, such as by using mypy, when type annotations are not enforced at runtime. Mypyc ensures type safety both statically and at runtime.
Any
types and erased types in general can compromise type safety,
and this is by design. Inserting strict runtime type checks for all
possible values would be too expensive and against the goal of
high performance.
Value and heap types¶
In CPython, memory for all objects is dynamically allocated on the
heap. All Python types are thus heap types. In compiled code, some
types are value types – no object is (necessarily) allocated on the
heap. bool
, float
, None
, native integer types
and fixed-length tuples are value types.
int
is a hybrid. For typical integer values, it is a value
type. Large enough integer values, those that require more than 63
bits (or 31 bits on 32-bit platforms) to represent, use a heap-based
representation (same as CPython).
Value types have a few differences from heap types:
When an instance of a value type is used in a context that expects a heap value, for example as a list item, it will transparently switch to a heap-based representation (boxing) as needed.
Similarly, mypyc transparently changes from a heap-based representation to a value representation (unboxing).
Object identity of integers, floating point values and tuples is not preserved. You should use
==
instead ofis
if you are comparing two integers, floats or fixed-length tuples.When an instance of a subclass of a value type is converted to the base type, it is implicitly converted to an instance of the target type. For example, a
bool
value assigned to a variable with anint
type will be converted to the corresponding integer.
The latter conversion is the only implicit type conversion that happens in mypyc programs.
Example:
def example() -> None:
# A small integer uses the value (unboxed) representation
x = 5
# A large integer uses the heap (boxed) representation
x = 2**500
# Lists always contain boxed integers
a = [55]
# When reading from a list, the object is automatically unboxed
x = a[0]
# True is converted to 1 on assignment
x = True
Since integers and floating point values have a different runtime representations and neither can represent all the values of the other type, type narrowing of floating point values through assignment is disallowed in compiled code. For consistency, mypyc rejects assigning an integer value to a float variable even in variable initialization. An explicit conversion is required.
Examples:
def narrowing(n: int) -> None:
# Error: Incompatible value representations in assignment
# (expression has type "int", variable has type "float")
x: float = 0
y: float = 0.0 # Ok
if f():
y = n # Error
if f():
y = float(n) # Ok
Native integer types¶
You can use the native integer types i64
(64-bit signed integer),
i32
(32-bit signed integer), i16
(16-bit signed integer), and
u8
(8-bit unsigned integer) if you know that integer values will
always fit within fixed bounds. These types are faster than the
arbitrary-precision int
type, since they don’t require overflow
checks on operations. They may also use less memory than int
values. The types are imported from the mypy_extensions
module
(installed via pip install mypy_extensions
).
Example:
from mypy_extensions import i64
def sum_list(l: list[i64]) -> i64:
s: i64 = 0
for n in l:
s += n
return s
# Implicit conversions from int to i64
print(sum_list([1, 3, 5]))
Note
Since there are no overflow checks when performing native integer arithmetic, the above function could result in an overflow or other undefined behavior if the sum might not fit within 64 bits.
The behavior when running as interpreted Python program will be
different if there are overflows. Declaring native integer types
have no effect unless code is compiled. Native integer types are
effectively equivalent to int
when interpreted.
Native integer types have these additional properties:
Values can be implicitly converted between
int
and a native integer type (both ways).Conversions between different native integer types must be explicit. A conversion to a narrower native integer type truncates the value without a runtime overflow check.
If a binary operation (such as
+
) or an augmented assignment (such as+=
) mixes native integer andint
values, theint
operand is implicitly coerced to the native integer type (native integer types are “sticky”).You can’t mix different native integer types in binary operations. Instead, convert between types explicitly.
For more information about native integer types, refer to native integer operations.