@codeflash-ai codeflash-ai bot commented Nov 8, 2025

📄 140% (1.40x) speedup for Graph.topologicalSort in code_to_optimize/topological_sort.py

⏱️ Runtime : 2.67 milliseconds → 1.11 milliseconds (best of 13 runs)

📝 Explanation and details

The optimized code achieves a 140% speedup by replacing an inefficient list operation with a more performant approach. The key optimization is changing stack.insert(0, v) to stack.append(v) followed by a single stack.reverse() call.

What changed:

  • In topologicalSortUtil: stack.insert(0, v) → stack.append(v)
  • In topologicalSort: Added stack.reverse() before returning
  • Minor improvement: visited[i] == False → not visited[i] (slightly more Pythonic)
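The three changes above can be sketched in context. The class below is a reconstruction, not the file from the PR: the constructor, the `V` attribute, and the `uuid4` call are assumptions inferred from how the generated tests use `Graph(n)`, `g.graph[u].append(v)`, and the `(order, sort_id)` return value.

```python
import uuid
from collections import defaultdict


class Graph:
    """Assumed shape of code_to_optimize/topological_sort.py after the change."""

    def __init__(self, vertices):
        self.graph = defaultdict(list)  # adjacency list: node -> successors
        self.V = vertices               # number of vertices (assumed attribute name)

    def topologicalSortUtil(self, v, visited, stack):
        visited[v] = True
        for i in self.graph[v]:
            if not visited[i]:          # was: visited[i] == False
                self.topologicalSortUtil(i, visited, stack)
        stack.append(v)                 # was: stack.insert(0, v)

    def topologicalSort(self):
        visited = [False] * self.V
        stack = []
        sort_id = str(uuid.uuid4())     # assumed source of the returned id
        for i in range(self.V):
            if not visited[i]:
                self.topologicalSortUtil(i, visited, stack)
        stack.reverse()                 # added: one O(N) reversal restores the order
        return stack, sort_id
```

Appending in DFS finish order and reversing once yields the same list that repeated head insertion would have built, so callers see no difference in the result.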

Why this is faster:
The original code performs stack.insert(0, v) for every node visited, which is an O(N) operation since Python lists must shift all existing elements when inserting at the head. For a graph with N nodes, this results in O(N²) total time complexity just for list operations.

The optimized version uses stack.append(v) (O(1) operation) for each node, then performs a single stack.reverse() (O(N)) at the end. This reduces the list operation complexity from O(N²) to O(N).
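To make the complexity argument concrete, here is a small standalone sketch (the helper names are hypothetical and not part of the PR). It shows that both strategies build the same list, and counts how many element shifts the head-insert strategy performs.

```python
def order_with_head_insert(finish_order):
    """Original strategy: every insert at index 0 shifts all existing elements."""
    stack = []
    for v in finish_order:
        stack.insert(0, v)
    return stack


def order_with_append_reverse(finish_order):
    """Optimized strategy: amortized O(1) appends, then one O(N) reversal."""
    stack = []
    for v in finish_order:
        stack.append(v)
    stack.reverse()
    return stack


def head_insert_shifts(n):
    """Element shifts for n head inserts: the k-th insert (0-based) moves k items."""
    return sum(range(n))  # n * (n - 1) / 2


finish_order = [2, 1, 3, 0]  # an example DFS finish order
assert order_with_head_insert(finish_order) == order_with_append_reverse(finish_order)
```

For N = 1000, `head_insert_shifts(1000)` is 499,500 element moves, compared with 1000 appends plus a single reversal in the optimized version.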

Performance impact:
The line profiler shows the stack operation time dropped from 3.06ms (21% of total time) to 1.78ms (12.6% of total time) in topologicalSortUtil. The optimization is particularly effective for larger graphs - test cases show 157-197% speedup for graphs with 1000 nodes, while smaller graphs (≤5 nodes) show minimal or mixed results since the O(N²) vs O(N) difference isn't significant at small scales.

This optimization maintains identical functionality and correctness while dramatically improving performance for larger topological sorting workloads.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 89 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import uuid
from collections import defaultdict

# imports
import pytest
from code_to_optimize.topological_sort import Graph

# unit tests

# ----------- BASIC TEST CASES -----------

def test_empty_graph():
    # Edge: Empty graph (0 vertices)
    g = Graph(0)
    result, sort_id = g.topologicalSort() # 5.25μs -> 5.33μs (1.56% slower)

def test_single_node_graph():
    # Basic: Single node, no edges
    g = Graph(1)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.62μs (2.22% slower)

def test_two_nodes_no_edges():
    # Basic: Two nodes, no edges
    g = Graph(2)
    result, sort_id = g.topologicalSort() # 5.92μs -> 5.67μs (4.41% faster)

def test_two_nodes_one_edge():
    # Basic: Two nodes, one edge 0->1
    g = Graph(2)
    g.graph[0].append(1)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.33μs (3.13% faster)

def test_three_nodes_linear():
    # Basic: Linear chain 0->1->2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.54μs -> 5.46μs (1.52% faster)

def test_three_nodes_branching():
    # Basic: Branching 0->1, 0->2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    result, sort_id = g.topologicalSort() # 5.38μs -> 5.58μs (3.73% slower)

def test_three_nodes_diamond():
    # Basic: Diamond 0->1, 0->2, 1->2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.42μs -> 5.17μs (4.84% faster)

def test_disconnected_components():
    # Basic: Two disconnected chains: 0->1, 2->3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    result, sort_id = g.topologicalSort() # 5.42μs -> 5.42μs (0.018% faster)

# ----------- EDGE TEST CASES -----------

def test_cycle_detection():
    # Edge: Graph with a cycle (should not be a valid topological sort)
    # The provided implementation does not detect cycles, so it will return a result
    # We check that the output does not satisfy topological sort constraints for a cycle
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
    result, sort_id = g.topologicalSort() # 5.12μs -> 5.17μs (0.813% slower)
    # For a cycle, no valid topological sort exists. The output will be some permutation.
    # We check that at least one dependency is violated
    # 0->1, so 0 before 1; 1->2, so 1 before 2; 2->0, so 2 before 0
    # This is impossible; so at least one constraint must be violated.
    violations = 0
    if result.index(0) > result.index(1):
        violations += 1
    if result.index(1) > result.index(2):
        violations += 1
    if result.index(2) > result.index(0):
        violations += 1

def test_self_loop():
    # Edge: Node with a self-loop
    g = Graph(1)
    g.graph[0].append(0)
    result, sort_id = g.topologicalSort() # 4.62μs -> 4.79μs (3.46% slower)

def test_multiple_edges():
    # Edge: Multiple edges between same nodes
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.08μs -> 5.12μs (0.820% slower)

def test_isolated_nodes():
    # Edge: Some nodes have no edges
    g = Graph(4)
    g.graph[0].append(1)
    # nodes 2 and 3 are isolated
    result, sort_id = g.topologicalSort() # 5.29μs -> 5.29μs (0.000% faster)

def test_reverse_edges():
    # Edge: All edges reversed (2->1, 1->0)
    g = Graph(3)
    g.graph[2].append(1)
    g.graph[1].append(0)
    result, sort_id = g.topologicalSort() # 4.96μs -> 5.00μs (0.820% slower)

def test_graph_with_no_edges():
    # Edge: Graph with nodes but no edges
    g = Graph(5)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.62μs (2.22% slower)

def test_graph_with_duplicate_edges():
    # Edge: Duplicate edges between nodes
    g = Graph(3)
    g.graph[0].extend([1,1,1])
    result, sort_id = g.topologicalSort() # 5.21μs -> 5.17μs (0.813% faster)

def test_large_graph_sparse_edges():
    # Large: 100 nodes, only 1 edge
    g = Graph(100)
    g.graph[0].append(99)
    result, sort_id = g.topologicalSort() # 24.7μs -> 20.9μs (18.1% faster)

# ----------- LARGE SCALE TEST CASES -----------

def test_large_linear_chain():
    # Large: Linear chain of 1000 nodes
    N = 1000
    g = Graph(N)
    for i in range(N-1):
        g.graph[i].append(i+1)
    result, sort_id = g.topologicalSort()

def test_large_branching_tree():
    # Large: Tree structure, 1000 nodes, each node i points to i+1 and i+2 (if possible)
    N = 1000
    g = Graph(N)
    for i in range(N):
        if i+1 < N:
            g.graph[i].append(i+1)
        if i+2 < N:
            g.graph[i].append(i+2)
    result, sort_id = g.topologicalSort()
    # For each i, i before i+1 and i+2
    for i in range(N):
        if i+1 < N:
            pass
        if i+2 < N:
            pass

def test_large_disconnected_graph():
    # Large: 10 chains of 100 nodes each, disconnected
    chains = 10
    chain_len = 100
    N = chains * chain_len
    g = Graph(N)
    for c in range(chains):
        start = c * chain_len
        for i in range(chain_len - 1):
            g.graph[start + i].append(start + i + 1)
    result, sort_id = g.topologicalSort() # 415μs -> 155μs (167% faster)
    # For each chain, order must be preserved
    for c in range(chains):
        start = c * chain_len
        for i in range(chain_len - 1):
            pass

def test_large_graph_all_isolated():
    # Large: 1000 nodes, all isolated
    N = 1000
    g = Graph(N)
    result, sort_id = g.topologicalSort() # 405μs -> 150μs (169% faster)

def test_large_graph_with_cycles():
    # Large: 100 nodes, create a cycle between last three
    N = 100
    g = Graph(N)
    for i in range(N-3):
        g.graph[i].append(i+1)
    # Create cycle: N-3 -> N-2 -> N-1 -> N-3
    g.graph[N-3].append(N-2)
    g.graph[N-2].append(N-1)
    g.graph[N-1].append(N-3)
    result, sort_id = g.topologicalSort() # 25.2μs -> 19.8μs (26.9% faster)
    # At least one constraint must be violated due to cycle
    violations = 0
    if result.index(N-3) > result.index(N-2):
        violations += 1
    if result.index(N-2) > result.index(N-1):
        violations += 1
    if result.index(N-1) > result.index(N-3):
        violations += 1

# ----------- DETERMINISM AND UUID TESTS -----------

def test_uuid_is_unique():
    # Each call should produce a unique uuid string
    g = Graph(1)
    uuid_set = set()
    for _ in range(10):
        _, sort_id = g.topologicalSort() # 29.7μs -> 29.9μs (0.699% slower)
        uuid_set.add(sort_id)

def test_uuid_format():
    # UUID should be a valid string representation
    g = Graph(1)
    _, sort_id = g.topologicalSort() # 4.88μs -> 5.00μs (2.50% slower)
    import re
    # Check for UUID format
    uuid_regex = r'^[a-f0-9\-]{36}$'

# ----------- RANDOMIZED TEST CASES (DETERMINISTIC) -----------

def test_random_graph_small():
    # Random graph, small size, deterministic edges
    g = Graph(5)
    g.graph[0].append(2)
    g.graph[1].append(2)
    g.graph[2].append(3)
    g.graph[3].append(4)
    result, sort_id = g.topologicalSort() # 5.71μs -> 5.54μs (3.01% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import uuid
from collections import defaultdict

# imports
import pytest # used for our unit tests
from code_to_optimize.topological_sort import Graph

# unit tests

def is_valid_topo_sort(order, graph_edges, num_vertices):
    """Helper function to check if the returned order is a valid topological sort."""
    position = {node: idx for idx, node in enumerate(order)}
    for u in range(num_vertices):
        for v in graph_edges[u]:
            if position[u] >= position[v]:
                return False
    return set(order) == set(range(num_vertices)) and len(order) == num_vertices

# ------------------------
# 1. Basic Test Cases
# ------------------------

def test_single_node_graph():
    # Graph with a single node and no edges
    g = Graph(1)
    order, sort_id = g.topologicalSort() # 4.96μs -> 4.88μs (1.70% faster)

def test_two_nodes_one_edge():
    # Graph: 0 -> 1
    g = Graph(2)
    g.graph[0].append(1)
    order, sort_id = g.topologicalSort() # 4.96μs -> 4.92μs (0.854% faster)

def test_three_nodes_chain():
    # Graph: 0 -> 1 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    order, sort_id = g.topologicalSort() # 5.08μs -> 5.04μs (0.813% faster)

def test_three_nodes_branch():
    # Graph: 0 -> 1, 0 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    order, sort_id = g.topologicalSort() # 5.17μs -> 5.08μs (1.63% faster)

def test_multiple_components():
    # Graph: 0->1, 2->3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    order, sort_id = g.topologicalSort() # 5.21μs -> 5.17μs (0.813% faster)

# ------------------------
# 2. Edge Test Cases
# ------------------------

def test_empty_graph():
    # Empty graph (no nodes)
    g = Graph(0)
    order, sort_id = g.topologicalSort() # 4.25μs -> 4.38μs (2.86% slower)

def test_graph_with_no_edges():
    # Graph with 5 nodes, no edges
    g = Graph(5)
    order, sort_id = g.topologicalSort() # 5.58μs -> 5.46μs (2.31% faster)

def test_graph_with_cycle():
    # Graph: 0->1->2->0 (cycle)
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
    # The function does NOT detect cycles, so it will recurse infinitely or stack overflow.
    # We expect a RecursionError or stack overflow.
    with pytest.raises(RecursionError):
        g.topologicalSort()

def test_graph_with_self_loop():
    # Graph: 0->0 (self loop)
    g = Graph(1)
    g.graph[0].append(0)
    with pytest.raises(RecursionError):
        g.topologicalSort()

def test_disconnected_graph():
    # Graph: 0->1, 2, 3 (disconnected nodes)
    g = Graph(4)
    g.graph[0].append(1)
    order, sort_id = g.topologicalSort() # 7.21μs -> 7.79μs (7.48% slower)
    # 2 and 3 can be anywhere

def test_duplicate_edges():
    # Graph: 0->1, 0->1 (duplicate edge)
    g = Graph(2)
    g.graph[0].append(1)
    g.graph[0].append(1)
    order, sort_id = g.topologicalSort() # 6.08μs -> 6.46μs (5.81% slower)

# ------------------------
# 3. Large Scale Test Cases
# ------------------------

def test_large_linear_chain():
    # Graph: 0->1->2->...->999
    n = 1000
    g = Graph(n)
    for i in range(n-1):
        g.graph[i].append(i+1)
    order, sort_id = g.topologicalSort()
    for i in range(n-1):
        pass

def test_large_wide_graph():
    # Graph: 0->i for i=1 to 999
    n = 1000
    g = Graph(n)
    for i in range(1, n):
        g.graph[0].append(i)
    order, sort_id = g.topologicalSort() # 432μs -> 168μs (157% faster)
    # 0 must come before all others
    for i in range(1, n):
        pass

def test_large_sparse_graph():
    # Graph: edges only from even to next odd (0->1, 2->3, ...)
    n = 1000
    g = Graph(n)
    for i in range(0, n-1, 2):
        g.graph[i].append(i+1)
    order, sort_id = g.topologicalSort() # 387μs -> 130μs (197% faster)
    for i in range(0, n-1, 2):
        pass

def test_large_graph_with_multiple_components():
    # Two chains: 0->1->...->499 and 500->501->...->999
    n = 1000
    g = Graph(n)
    for i in range(0, 499):
        g.graph[i].append(i+1)
    for i in range(500, 999):
        g.graph[i].append(i+1)
    order, sort_id = g.topologicalSort() # 404μs -> 141μs (187% faster)
    # Check chains are respected
    for i in range(0, 499):
        pass
    for i in range(500, 999):
        pass

def test_large_graph_with_no_edges():
    # 1000 nodes, no edges
    n = 1000
    g = Graph(n)
    order, sort_id = g.topologicalSort() # 401μs -> 148μs (169% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from code_to_optimize.topological_sort import Graph

def test_Graph_topologicalSort():
    Graph.topologicalSort(Graph(1))

🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_7dnvqc5e/tmptwugmlm2/test_concolic_coverage.py::test_Graph_topologicalSort | 6.46μs | 6.46μs | 0.000%✅ |

To edit these changes, run git checkout codeflash/optimize-Graph.topologicalSort-mhq0bhxy and push.

@codeflash-ai codeflash-ai bot requested a review from aseembits93 November 8, 2025 08:12
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 8, 2025