Graph Theory for Software Engineers: Practical Concepts
Why Graph Theory?
Graphs aren't abstract math — they're everywhere in software:
- Social networks: Friend graphs, follow graphs
- CI/CD pipelines: Dependency graphs of build steps
- Package management: npm/Maven dependency trees
- Distributed systems: Service mesh topologies
- Knowledge graphs: Entity-relationship networks
Fundamental Algorithms
Breadth-First Search (BFS)
Find the shortest path in unweighted graphs:
from collections import deque
def bfs(graph, start, target):
visited = {start}
queue = deque([(start, [start])])
while queue:
node, path = queue.popleft()
if node == target:
return path
for neighbor in graph[node]:
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, path + [neighbor]))
return None
Time complexity: O(V + E). Use for: shortest path, connectivity checks, social degrees of separation.
Depth-First Search (DFS)
Explore deeply before broadly. Use for: topological sorting, cycle detection, maze solving.
def has_cycle(graph):
visited = set()
rec_stack = set()
def dfs(node):
visited.add(node)
rec_stack.add(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
if dfs(neighbor):
return True
elif neighbor in rec_stack:
return True
rec_stack.discard(node)
return False
for node in graph:
if node not in visited:
if dfs(node):
return True
return False
Centrality Metrics
Which nodes matter most?
- Degree centrality: Node with most connections
- Betweenness centrality: Nodes that bridge communities (gatekeepers)
- Closeness centrality: Nodes that can reach others quickly
- PageRank: Nodes endorsed by other important nodes
import networkx as nx
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
pagerank = nx.pagerank(G)
Graph Representations for Code
Choose based on your use case:
| Representation | Storage | Add Edge | Check Edge | Neighbors |
|---|---|---|---|---|
| Adjacency Matrix | O(V²) | O(1) | O(1) | O(V) |
| Adjacency List | O(V+E) | O(1) | O(deg(V)) | O(deg(V)) |
| Edge List | O(E) | O(1) | O(E) | O(E) |
For most applications, adjacency list wins — it's compact and iterating neighbors is fast.
Why This Matters for Knowledge Graphs
Every knowledge graph algorithm builds on these fundamentals:
- Graph traversal (BFS/DFS) → Querying multi-hop relationships in Neo4j
- Cycle detection → Ensuring RDF graphs are valid
- Shortest path → Finding connection paths in entity graphs
- Centrality → Identifying key entities in a knowledge domain
- Community detection → Grouping related concepts (used by Microsoft's GraphRAG)
Understanding these primitives makes you a better engineer whether you're building a recommendations engine, a dependency resolver, or a knowledge graph-powered AI application.