Skip to content

Conversation

codrut3
Copy link
Contributor

@codrut3 codrut3 commented Jul 13, 2025

Currently merge_operations represents merged components as a CircuitOperation. This means in a merge of n operations, n-1 CircuitOperations are created, with a complexity of O(n^2).
I use a disjont-set data structure to reduce the complexity to O(n) for merge_k_qubit_unitaries_to_circuit_op. merge_operations itself can't be improved because it uses a merge_func that requires creating a CircuitOperation at every step.

…s in the merge methods. Currently merge_operations represents merged components as a CircuitOperation. This means in a merge of n operations, n-1 CircuitOperations are created, with a complexity of O(n^2). We use a disjont-set data structure to reduce the complexity to O(n) for merge_k_qubit_unitaries_to_circuit_op. merge_operations itself can't be improved because it uses a merge_func that requires creation of CircuitOperation at every step.
@codrut3 codrut3 requested review from a team and vtomole as code owners July 13, 2025 17:29
@codrut3 codrut3 requested a review from dabacon July 13, 2025 17:29
@github-actions github-actions bot added the size: XL lines changed >1000 label Jul 13, 2025
@codrut3
Copy link
Contributor Author

codrut3 commented Jul 13, 2025

I didn't change the algorithm in merge operations, but switched from CircuitOperation to Component for intermediate merges.
I wrote an explanation for the algorithm here.

I found this inefficiency while investigating why merge_single_qubit_gates_to_phased_x_and_z in the Shannon decomposition is slow. Right now this method is not called anymore by the Shannon decomposition, I replaced it with a custom version that is faster.

Still this change improves transpilation. On my laptop, I measured an average improvement of 5,09% from 86.89 seconds to 82,46 seconds for 7 qubits.

I also measured the improvement, if merge_single_qubit_gates_to_phased_x_and_z would still be used in the Shannon decomposition. In this case it is 20.7% from 40.88 seconds to 32.40 seconds for 8 qubits.

This is the script I used to measure the transpiler improvement:

"""Tests performance of quantum Shannon decomposition. """ import sys import time from scipy.stats import unitary_group import cirq if not sys.warnoptions: import warnings warnings.simplefilter("ignore") def main(): measurements = [] n_qubits = 7 for i in range(10): print(f'Iteration {i}') U = unitary_group.rvs(2**n_qubits) qubits = [cirq.NamedQubit(f'q{i}') for i in range(n_qubits)] start_time = time.time() circuit = cirq.Circuit(cirq.quantum_shannon_decomposition(qubits, U)) elapsed_time = time.time() - start_time print(f'Time taken: {elapsed_time:.2f} seconds') # Run cirq transpiler start_time = time.time() circuit = cirq.optimize_for_target_gateset(circuit, gateset=cirq.CZTargetGateset(preserve_moment_structure=False), max_num_passes=None) elapsed_time = time.time() - start_time measurements.append(elapsed_time) print(f'Cirq depth for {n_qubits} qubits (with transpiler): {len(circuit)} ({elapsed_time:.2f} sec)') print(f'Average time taken for transpiling: {sum(measurements) / len(measurements):.2f} seconds') if __name__ == '__main__': main()
@NoureldinYosri NoureldinYosri self-assigned this Jul 23, 2025
@babacry babacry self-assigned this Jul 23, 2025
Copy link
Collaborator

@babacry babacry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think the PR greatly improved the performance and the readability!

The unit tests look solid and confirm the merge behavior is working as intended. I've focused my review on the overall design in this iteration.

For performance, IIUC, given m connected ops,

  • before: construct CircuitOperation m-1 times.
  • after: construct CircuitOperation 1 time + extra on-average-constant disjoint-set union and find time.
    The bottleneck is the CircuitOperation construction time in the previous implementation. Correct me if I am wrong.

Since this is a big design change, we should be careful here for the compatibility and extensibility. I've put a couple of comments for discussion.

Copy link
Collaborator

@NoureldinYosri NoureldinYosri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we do online queries about the connected components?

@codrut3
Copy link
Contributor Author

codrut3 commented Jul 30, 2025

do we do online queries about the connected components?

The components are used throughout _merge_operations_impl, and there are queries about the component representatives during the merge algorithm. Please, is this what you had in mind?

@codrut3
Copy link
Contributor Author

codrut3 commented Jul 30, 2025

Thank you a lot for the review @babacry and @NoureldinYosri ! I made changes following your comments, please have a look!

Copy link
Collaborator

@babacry babacry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me with a couple of nits.

Over to Nour @NoureldinYosri

containing measurement operations with the same key.
ckey_indexes: Mapping from measurement keys to (sorted) list of component moments
containing classically controlled operations controlled on the same key.
components_by_index: List of circuit moments containing components. We use a dictionary
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not from your pr, but the naming of the var is strange: should it be the opposite as something like index_by_component_list? see go/python-tips/054

and could you update the explanation of the attribute with the map details? The key of the map is component, what does the value of the map represent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is correct, but confusing. The data structure is:
list of dictionary {key: component, value: 0}, where the position in the list is given by the moment id

So the dictionaries are in a list, with the index the moment id. For a given moment, a dictionary is used to preserve the insertion order of the components in the moment. The value has no meaning, it's only there to take advantage of the dict order preserving feature.

I added more details in the docstring, please have a look!

Copy link
Collaborator

@babacry babacry Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, thanks for the clarification! It's a bit tricky, I rephrased a bit:

components_by_index: List of components indexed by moment id. For a moment id, we use a dictionary to keep track of the components in the moment. The dictionary instead of a set is used to preserve insertion order and the dictionary's values are intentionally unused. 
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I updated the description.

merged_circuit_op_tag: tag to use for CircuitOperations
Returns:
the circuit with merged components as a CircuitOperation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: keep the docstring format uniform.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, can you give me more details about what's wrong here? I don't find the issue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: uppercase at the beginning and period at the end.

The circuit with merged components as a CircuitOperation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

circuit: CIRCUIT_TYPE,
factory: ComponentFactory,
*,
merged_circuit_op_tag: str = "Merged connected component",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a good practice to put space in a tag? @NoureldinYosri I am not sure, though existing code before this pr have some mixed style of tags.

maybe "merged_connected_component"? or "_merged_connected_component" (start with _ to separate a default tag named by cirq pkg with user specified tag)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the tag value from line 621 in merge_operations_to_circuit_op. I think the two should be in sync.

Copy link
Collaborator

@NoureldinYosri NoureldinYosri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codrut3 sorry about the late reply but I was wondering if instead of manually implementing union-find we can do the required the functionality using https://networkx.org/documentation/stable/_modules/networkx/utils/union_find.html

this will split the logic of merging moments into two parts where one of them where we need to maintain only one of them

 def merge(self, a, b): self._merge_moments(a, b) self.union_find.union(a, b)

is it possible to do that or will it break the logic?

@codrut3
Copy link
Contributor Author

codrut3 commented Aug 10, 2025

@codrut3 sorry about the late reply but I was wondering if instead of manually implementing union-find we can do the required the functionality using https://networkx.org/documentation/stable/_modules/networkx/utils/union_find.html

this will split the logic of merging moments into two parts where one of them where we need to maintain only one of them

 def merge(self, a, b): self._merge_moments(a, b) self.union_find.union(a, b)

is it possible to do that or will it break the logic?

I think it works to use a separate data structure to do the union-find. This structure would have to be initialized by the user and passed as an argument to the component factory.

However, I would prefer to use scipy.DisjointSet because it allows me to add new singletons on the fly, while https://networkx.org/documentation/stable/_modules/networkx/utils/union_find.html requires all elements to be known from the beginning.

If you are ok with adding a dependency on the scipy package, I will rewrite the code. Do I need to add the package version as a requirement somewhere?

Let me know how to proceed! @NoureldinYosri

@NoureldinYosri
Copy link
Collaborator

@codrut3 that sounds good, Cirq already depends on scipy

scipy~=1.12

sorry for the extra work

@babacry babacry self-requested a review August 12, 2025 23:34
Copy link
Collaborator

@babacry babacry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! @codrut3

Single comments are added in the reviews.

@codrut3
Copy link
Contributor Author

codrut3 commented Aug 15, 2025

@codrut3 that sounds good, Cirq already depends on scipy

scipy~=1.12

sorry for the extra work

No worries! I changed the implementation to use scipy.DisjointSet.

I took this opportunity to rewrite the connected component classes. I moved the merge() and find() methods to the factory class, which I renamed as ComponentSet. This solved the issue that can_merge and merge_func were defined at the component level, and could potentially differ for the two components at merge time. Now they are defined at the level of ComponentSet which ensures all created components have the same callable (as well as are part of the same scipy.DisjointSet).

If memory consumption ever becomes an issue, we can replace with a no-op trivial operation. Also clarify comment for similar action for ComponentWithOps.
The Component class uses default Python object hash and equality functions. If retrieved from sets or dictionaries, they must be identical.
Make them throw NotImplementedError which is ignored in coverage.
These operate on a general list of sorted integers and do not need any class data. Replaced with `_insort_last` and `_remove_last`.
Iteration over Circuit yields Moments.
Indicate these are internal functions not intended for use outside of cirq.
default_factory=lambda: defaultdict(lambda: [-1])
)
ops_by_index: list[dict[cirq.Operation, int]] = dataclasses.field(default_factory=list)
components_by_index: list[dict[Component, int]] = dataclasses.field(default_factory=list)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - if dictionary values are unused, it is perhaps better to declare it as dict[Component, NoneType] and use None instead of 0 to add to dictionary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

Copy link
Collaborator

@pavoljuhas pavoljuhas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after addressing some minor issues - mainly renaming the new module connected_component to _connected_component to mark it as for internal use only.

Rather than writing a lot of comments I pushed the suggestions directly to the PR, these should be easy to review commit-by-commit.

Please let me know if this looks OK to you, LGTM on my side (with one tiny inline suggestion).

Copy link

codecov bot commented Oct 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.38%. Comparing base (34f1f99) to head (7637dde).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@ Coverage Diff @@ ## main #7484 +/- ## ======================================== Coverage 99.38% 99.38% ======================================== Files 1089 1091 +2 Lines 97551 97813 +262 ======================================== + Hits 96950 97212 +262  Misses 601 601 

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
Optimized for the majority case when the value is the last in the list.
"""
indices.reverse()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't reverse O(n)?
I think it would be better to check with an if the last element before trying reverse and remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some changes in my commit, please have a look!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that! I was under impression that list.reverse is O(1), but it is indeed O(n). I have debug-printed the removed positions when not last and they are close to the list end. I think we can use bisection to optimize the removal a bit more, here is my take on it in a735c01.

LMK if this looks OK to you.

@codrut3
Copy link
Contributor Author

codrut3 commented Oct 16, 2025

Thank you a lot for your review and changes!

I addressed the nit and made another change, PTAL.

Copy link
Collaborator

@pavoljuhas pavoljuhas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go on my side.

As of quantumlib#7590 typing_extensions is not in cirq-core requirements. mypy can detect mismatched `merge` arguments w/r to the base class without the `@typing_extensions.override` decorator. It does not seem worth reintroducing this dependency, here we drop the decorator instead. NB: Python 3.12 added a similar `typing.override` decorator, but we need to support 3.11.
@pavoljuhas
Copy link
Collaborator

One last tweak in 7637dde - typing_extensions is not in the cirq-core requirements anymore so I removed its import.

@codrut3
Copy link
Contributor Author

codrut3 commented Oct 17, 2025

Nice, looks good to me, thank you a lot for the changes!

@pavoljuhas pavoljuhas added this pull request to the merge queue Oct 17, 2025
@pavoljuhas
Copy link
Collaborator

Nice, looks good to me, thank you a lot for the changes!

Thanks a lot for contributing this!

Merged via the queue into quantumlib:main with commit 253b7b1 Oct 17, 2025
35 checks passed
@github-project-automation github-project-automation bot moved this from Waiting to Resolved in Follow-up tracker Oct 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

5 participants