vllm.compilation.multi_output_match
MultiOutputMatch
Bases: ABC
This class provides utilities to process multi-output matches and manually insert replacements.
This is necessary because the automatic replacement for multi-output matches is broken: https://github.com/pytorch/pytorch/issues/137280
Source code in vllm/compilation/multi_output_match.py
__init__
insert_auto_fn
insert_auto_fn(op: OpOverload, kwargs) -> Node
Insert an auto_functionalized node with the given op and kwargs.
insert_getitems
Insert operator.getitem nodes to extract elements from a tuple node.
:param tuple_node: The tuple node to extract elements from.
:param indices: The indices of the elements to extract.
:return: Tuple of the new getitem nodes, corresponding to the indices.
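A minimal sketch of this helper under torch.fx (parameter names taken from the docstring above; the body is an assumption, not the vLLM source):

```python
import operator
from torch import fx

# Sketch: emit one operator.getitem call per requested index, placed
# immediately after the tuple-producing node, and return the new nodes
# in the same order as the indices.
def insert_getitems(graph: fx.Graph, tuple_node: fx.Node, indices):
    with graph.inserting_after(tuple_node):
        return tuple(
            graph.call_function(operator.getitem, (tuple_node, i))
            for i in indices
        )
```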
inserting_after_match
Insert nodes after the last node in the match. This is done to avoid use-before-definition errors after inserting replacement nodes.
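A hedged sketch of the idea with torch.fx (the match is represented here as a plain list of nodes; the real class stores the match differently):

```python
from torch import fx

# Sketch: find the match node that appears last in topological order and
# move the graph's insert point just after it, so replacement nodes come
# after every node they might read from.
def inserting_after_match(graph: fx.Graph, match_nodes):
    order = {node: i for i, node in enumerate(graph.nodes)}
    last = max(match_nodes, key=order.__getitem__)
    return graph.inserting_after(last)
```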
process
abstractmethod
Process a multi-output match and manually insert the replacement.
This method should:
1. Insert the replacement nodes after the last node in the match.
2. Rebind the users of nodes in the match to use the new nodes.
3. Set meta["val"] for de-functionalization.
The result of an auto-functionalized node is a tuple of tensors. The first element is the return value of the function, usually None. The remaining elements are the mutated args of the function.
All auto-functionalized nodes must contain a proper meta["val"], as it is used by de-functionalization. meta["val"] has to contain the value of the node (tuple of tensors) that would be returned by the functionalized node during tracing.
Existing nodes in the graph all have this property set, but we have to set it manually for new nodes we insert.
Example:
op schema: foo(a: Tensor!, b: Tensor, c: Tensor!) -> None
at = auto_functionalized(torch.ops._C.foo.default, a, b, c)
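Continuing the example, a hedged sketch of setting meta["val"] by hand (foo is the hypothetical op from the schema above, and the empty tensors stand in for the real traced values): per the tuple layout described earlier, foo returns None and mutates a and c, so the tuple has three elements.

```python
import torch
from torch import fx

def foo(a, b, c):  # stand-in for torch.ops._C.foo.default (hypothetical op)
    return None

g = fx.Graph()
a = g.placeholder("a")
b = g.placeholder("b")
c = g.placeholder("c")

# The inserted node; in the real pass this would be an auto_functionalized
# call rather than a plain call_function.
at = g.call_function(foo, (a, b, c))

# meta["val"] layout: (return value, *mutated args).
# foo returns None and mutates a and c, so the tuple has three elements.
at.meta["val"] = (None, torch.empty(4), torch.empty(4))
```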