librt.vecs ========== The ``librt.vecs`` module defines the ``vec`` type, a low-level, uniform growable array type. It's part of the ``librt`` package on PyPI. When constructing a ``vec``, the item type ``T`` is always explicitly given via ``vec[T]``:: from librt.vecs import append, vec v = vec[float]([1.0, 2.5]) # Construct vec[float] with two items ``vec`` supports many sequence operations, though it's not a full sequence type:: len(v) # 2 v[0] # 1.0 v[-1] # 2.5 for x in v: print(x) The length of each ``vec`` value is immutable. Appending an item is still a fast operation, but it returns a new ``vec`` value:: v = append(v, -0.5) print(v) # vec[float]([1.0, 2.5, -0.5]) ``vec`` only supports simple, uniform item types. It uses an efficient packed binary encoding for these *value item types*: * ``mypy_extensions.i64`` (signed 64-bit integer) * ``mypy_extensions.i32`` (signed 32-bit integer) * ``mypy_extensions.i16`` (signed 16-bit integer) * ``mypy_extensions.u8`` (unsigned byte) * ``float`` (64-bit float) * ``bool`` ``int`` is not a valid item type, since it has an arbitrary precision, and vec is an efficiency-focused type. Use one of the fixed-length integer types instead. Class item types (e.g. ``str`` or ``MyNativeClass``) are represented as regular object references. Optional class item types (e.g. ``str | None``) are supported for convenience, but arbitrary union types are not supported as item types. Nested vecs are supported, e.g. ``vec[vec[i64]]``. A vec value is often used as an efficient alternative to ``list`` or ``array.array`` in code compiled using mypyc. Its primary advantages are an efficient packed memory representation for value item types and very fast inlined get and set item operations. Vec instances perform runtime checking of item types. Since values of type variables are not available at runtime (they are *erased*), type variables can't be used as item types. A vec value is effectively an immutable (length, buffer) pair. This means that any operation that changes the length of a vec, including ``append`` as we saw above, returns a modified value. .. note:: An immutable length allows more efficient code to be generated by mypyc, and vec values can be allocated to machine registers effectively. However, vec values must be boxed if used in a non-native context, such as if added to a list or dict. Here are some examples of valid vec types: .. list-table:: :header-rows: 1 * - Type - Item representation * - ``vec[i32]`` - Packed 32-bit integers * - ``vec[float]`` - Packed 64-bit floats * - ``vec[str]`` - Object references * - ``vec[vec[u8]]`` - Packed vec values The ``vec`` class ----------------- .. class:: vec[T](items: Iterable[T] = ..., *, capacity: i64 = ...) A generic growable array type. The runtime type parameter ``T`` used when calling ``vec[T](...)`` determines the element type. The ``capacity`` parameter allows defining the minimum initial capacity of the buffer, some of which may be unused after construction. Unused capacity allows fast ``append`` and ``extend`` operations that don't need to reallocate the buffer. Actual capacity will be larger than ``capacity`` if ``items`` has more than ``capacity`` items. Construction from ``list`` and ``tuple`` objects is optimized. Also, for value item types, construction from an object that implements the buffer protocol is optimized (such as ``bytes``), if the format is compatible with the vec item type. Mypyc treats ``vec[T]([x] * n)`` as a special form. For example, ``vec[u8]([0] * n)`` constructs a zero-initialized vec object efficiently, without building an intermediate list. There are also other constructor-related special forms -- see `Special forms`_ below. It's an error to construct a ``vec`` object without providing an item type: ``vec()`` raises an exception. .. describe:: len(v) → i64 Return the length of ``v``. .. describe:: v[i] → T Return item at index ``i`` (index may be negative). .. describe:: v[i:j] → vec[T] Return a slice. This constructs a new ``vec`` object. ``i`` and ``j`` may be negative. .. describe:: v[i] = o Assign to an item (index may be negative). .. describe:: o in v → bool Return True if ``v`` contains ``o``. .. describe:: for o in v Iterate over items. .. describe:: memoryview(v) ``vec`` implements the buffer protocol, but only for value item types that use a packed representation. Functions --------- Since the following operations return a modified value, they are module-level functions instead of methods. .. function:: append(v: vec[T], o: T) -> vec[T] Return ``v`` with item ``o`` appended to it. If ``v`` has unused capacity, reuse the existing buffer. The time complexity is O(1) on average. Example:: v = vec[i32]() v = append(v, 1) .. function:: extend(v: vec[T], it: Iterable[T]) -> vec[T] Return ``v`` with all items from iterable ``it`` appended to it. If ``v`` has sufficient unused capacity, reuse the existing buffer. The time complexity is O(n) on average, where n is the length of ``it``. Example:: v = vec[u8]() v = extend(v, b"foo") .. function:: remove(v: vec[T], o: T) -> vec[T] Return ``v`` with the first instance of item ``o`` removed. Reuse the buffer from ``v``. Raise ``ValueError`` if value doesn't exist. Example:: v = vec[i32]([1, 2, 3]) v = remove(v, 2) # v has items [1, 3] .. function:: pop(v: vec[T], i: i64 = -1) -> tuple[vec[T], T] Return ``(new_v, item)``, where ``item`` is the value at index ``i`` and ``new_v`` is ``v`` with that item removed. Reuse the buffer from ``v``. Example:: v = vec[i32]([1, 2, 3]) v, x = pop(v) # x is 3; v has items [1, 2] Special forms -------------- Certain combinations of operations that would be multiple separate operations in regular Python are guaranteed to be compiled by mypyc to direct operations with no unnecessary temporary objects. .. list-table:: :header-rows: 1 * - Special form - Description * - ``vec[T]()`` - Construct empty vec with no buffer. This doesn't perform any dynamic allocation (at least for non-nested vecs). * - ``vec[T]([element1, ...])`` - Directly construct a vec object with given items, without a temporary list. * - ``vec[T]([element1] * n)`` - Directly construct a vec with length n, without any temporary list. * - ``vec[T]([ for ... in ])`` - Vec comprehension creates no temporary list. Thread safety ------------- In free-threaded Python builds, it's unsafe to write or modify an item if other threads might be concurrently accessing *the same item*. For example, writing ``v[4]`` is not safe to do if another thread might be reading ``v[4]``. Similarly, two threads concurrently calling ``append`` or ``remove`` on the same vec object is not safe. This is different from list objects, since vec is a lower-level type where implicit synchronization would have a significant performance cost. However, since vec lengths are immutable, some race conditions that lists can be susceptible to are not possible with vecs. Implementation details ---------------------- In a native context, such as in a local variable or a parameter in a native function, or in an attribute of a native class, vec values are implemented as value objects with two fields: length and buffer. The buffer is a normal Python object, but it's not directly accessible to users. If a vec object is empty, no buffer object is required. This means that empty vecs are particularly efficient in a native context (usually 16 bytes). A packed representation is used for buffers with supported value item types, including for nested vecs. The packed representation is much more efficient than a Python list object, and it's also significantly more efficient than ``array.array`` for small sequences. Multiple vec values can share the same underlying buffer. For example, assigning a vec to another variable creates an alias that refers to the same buffer:: v = vec[i32]([1, 2, 3], capacity=3) w = v # v and w share the same buffer w[0] = 99 print(v[0]) # 99 -- both see the change However, this sharing is not guaranteed to persist if there are operations that change the length (such as ``append``). These may reallocate the buffer, breaking the sharing silently:: v = append(v, 4) # reallocates the buffer since there is no free capacity v[0] = 0 print(w[0]) # still 99 -- v and w no longer share a buffer If you need independent copies, use slicing (``v[:]``) to explicitly create a vec with its own buffer. It's not recommended to rely on the details of buffer reallocation, as these might change between ``librt`` releases. Using vecs outside compiled code -------------------------------- ``vec`` is fully supported in non-compiled code, but ``vec`` values will be boxed in such non-native contexts. There will be always two objects, a boxed vec object and a buffer object, whereas in native contexts usually only the buffer is a dynamically allocated object. ``vec`` is primarily useful in code compiled using mypyc, and it's been heavily optimized for this use case. There may be no performance benefit in interpreted code over using ``list`` or ``array.array``.