Graph Theory 101: From First Principles

You have already seen a graph today. The web of links between Wikipedia pages. Your social feed where some people follow you and you do not follow back. The map app that found you a shorter route to work. All of them share a single underlying abstraction: a collection of things with connections between them. Mathematicians formalised this idea 300 years ago when Leonhard Euler asked whether you could cross every bridge in Königsberg exactly once and return to where you started. That puzzle gave us graph theory.

What Is a Graph?

A graph is a pair of sets: G = (V, E). V is the set of vertices (also called nodes). E is the set of edges (the connections between vertices). An edge is an unordered pair of vertices: if you have vertices called Alice and Bob, the edge {Alice, Bob} means they are connected.

That is the whole definition. Everything else in graph theory is about asking interesting questions of this simple structure.

Vertices vs Nodes

The terms are interchangeable. Computer scientists tend to say "nodes". Mathematicians and physicists tend to say "vertices". You will see both in the wild. This article uses them interchangeably.

The Size of a Graph

Quantity	Notation	Meaning
Number of vertices	\|V\| or n	The order of the graph
Number of edges	\|E\| or m	The size of the graph

A graph with n vertices could have anywhere from zero edges to roughly n squared edges, depending on what kind of graph it is.

Directed vs Undirected Graphs

This is the first big fork in the road.

Undirected graphs have edges with no direction. An edge is a two-way street. If Alice and Bob are connected, either can reach the other. Think of Facebook friendships: if you are friends with someone, they are friends with you.

Directed graphs (digraphs) have edges with an arrow. Each edge goes from one vertex to another. You can have an edge from Alice to Bob without Bob having one back to Alice. Think of Twitter follows: you can follow someone who does not follow you.

Property	Undirected	Directed
Edge notation	{u, v}	(u, v)
Symmetry	Always symmetric	May be asymmetric
Real example	Facebook friends	Twitter follows
Common use case	Road networks (two-way streets)	Web page links, payment flows

Some graphs are neither fully directed nor fully undirected. A mixed graph has both types of edges, though they are less common in practice.

Weighted vs Unweighted Graphs

An edge can carry extra information called a weight. In a road network, the vertices are intersections and the edge weights are distances. Your satnav finds the path with the smallest total weight.

Unweighted graphs treat every edge as having equal cost. When you use BFS to find the shortest number of hops between two people in a social network, you are working with an unweighted graph.

Signed Graphs

Occasionally you see graphs with positive and negative edge weights. These model friendly and hostile relationships, or trust and distrust in a recommendation system.

Loops and Multigraphs

A loop is an edge that connects a vertex to itself. A simple graph bans loops.

A multigraph allows multiple edges between the same two vertices. Why would you need this? Consider a map of airline routes between cities. If two airlines fly the same route, that might be represented as two parallel edges. Or consider a collaboration network where two researchers co-author multiple papers.

A simple graph bans both loops and multiple edges. Most of graph theory starts with simple graphs and adds complexity from there.

Degree, In-Degree, Out-Degree

The degree of a vertex is the number of edges incident to it. In a directed graph, you split this into in-degree (edges coming in) and out-degree (edges going out). A Twitter user followed by 500 people and following 200 has in-degree 500 and out-degree 200.

The Handshake Lemma

This is the first theorem worth remembering. The sum of the degrees of all vertices in an undirected graph equals twice the number of edges.

sum of degrees = 2 * |E|

Why? Every edge contributes to the degree of two vertices. Count carefully and you get a factor of two. This lemma appears everywhere. It tells you immediately that the number of vertices with odd degree must be even.

In a directed graph, the sum of in-degrees equals the sum of out-degrees, and both equal |E|.

Walks, Trails, Paths, Cycles

The terminology here is precise. Each term adds a stricter constraint.

A walk is any sequence of vertices where each consecutive pair is connected by an edge. You can revisit vertices and reuse edges as much as you like.

A trail is a walk that does not reuse any edge. You can revisit vertices but not edges.

A path is a trail that does not revisit any vertex. This is what most people mean when they say "find a path from A to B." It is a simple, non-repeating route.

A cycle is a path that starts and ends at the same vertex, with at least one edge. Think of a loop you can walk around and return to where you began.

Term	Reuse vertices?	Reuse edges?	Start = End?
Walk	Yes	Yes	Optional
Trail	Yes	No	Optional
Path	No	No	No
Cycle	Only the start/end	No	Yes

Connectivity

A graph is connected if there is a path between every pair of vertices. You can get from any vertex to any other vertex.

A disconnected graph falls apart into pieces. Each piece is a connected component. A graph with three islands has three connected components.

For directed graphs, connectivity is more subtle. A digraph is strongly connected if there is a directed path from any vertex to any other vertex in both directions. It is weakly connected if ignoring direction makes the underlying undirected graph connected.

Connected Components in Practice

When a deployment tool analyses a microservice architecture, it might find one giant connected component of services that talk to each other, plus a few isolated services. Each isolated group is its own connected component.

Special Graph Families

Some graphs appear so often that they have their own names.

Complete Graphs Kₙ

Every vertex is connected to every other vertex. K₅ has five vertices, each joined to the other four. The number of edges in Kₙ is n(n-1)/2. Complete graphs are the densest possible simple graphs.

Bipartite Graphs

You can split the vertices into two groups such that every edge crosses from one group to the other. No edge connects two vertices inside the same group. Many real problems are bipartite: users and the movies they rate, job applicants and positions, patients and symptoms.

Trees

A tree is a connected graph with no cycles. It is the minimal way to connect all vertices. Trees show up everywhere: file systems, organisational hierarchies, parse trees in compilers, spanning trees in network routing.

A forest is a disjoint union of trees (a disconnected acyclic graph).

Property	Tree	Forest
Connected?	Yes	No
Has cycles?	No	No
Edges relative to vertices	\|E\| = \|V\| - 1	\|E\| = \|V\| - (number of trees)

Cycle Graphs Cₙ

A cycle graph is a single loop of n vertices, each connected to exactly two neighbours. C₃ is a triangle, C₄ is a square, C₅ is a pentagon.

Graph Isomorphism

Two graphs are isomorphic if they are the same graph drawn differently. The vertex labels do not matter. What matters is the pattern of connections.

Take a triangle graph. Whether you draw it as an equilateral triangle, a squashed triangle, or three points connected by curved lines, the underlying graph is the same: each vertex connects to the other two.

Determining whether two arbitrary graphs are isomorphic is a famously difficult problem. No one knows whether it can be solved in polynomial time in general, though it is almost certainly not NP-complete. For small graphs you can often tell by comparing degree sequences, but for large graphs the problem remains tantalisingly open.

Adjacency Matrix vs Adjacency List

When you need to store a graph in a computer, you have two main choices.

An adjacency matrix is a |V| by |V| grid where cell [i][j] is 1 if there is an edge from vertex i to vertex j. Lookups are instant. But the matrix takes O(|V|²) space regardless of how many edges you actually have. For a sparse graph with millions of vertices and only a few edges each, that is wasteful.

An adjacency list stores for each vertex a list of its neighbours. Space scales with |V| + |E|. Iterating over a vertex's neighbours is fast. Checking whether a specific edge exists is slower, but that is rarely the bottleneck.

Consideration	Adjacency Matrix	Adjacency List
Space	O(V²)	O(V + E)
Edge lookup	O(1)	O(degree(v))
Iterate neighbours	O(V)	O(degree(v))
Best for	Dense graphs, quick edge checks	Sparse graphs, traversals

For almost all real-world graphs, the adjacency list wins because real graphs are sparse. The Twitter follow graph has hundreds of millions of vertices, but each user follows only a tiny fraction of everyone else.

Subgraphs and Graph Minors

A subgraph is a graph formed by taking a subset of vertices and a subset of edges that only use those vertices. If you delete some vertices (and the edges attached to them), you get an induced subgraph.

A graph minor is what you get when you contract edges (merge two vertices into one) or delete vertices and edges. Minors matter in deep structural graph theory. The famous Robertson-Seymour theorem says that every family of graphs closed under taking minors can be characterised by a finite set of forbidden minors. It took twenty papers and 500 pages to prove.

Why This Matters

Everything else in graph theory and graph algorithms builds on these foundations. In the companion article, "Graph Theory for Software Engineers," you will see how BFS and DFS traverse these structures, how shortest-path algorithms find routes through weighted graphs, and how centrality metrics identify important nodes. You will also see how every knowledge graph platform from Neo4j to Amazon Neptune relies on these same primitives under the hood.

Graph theory starts with a deceptively simple idea: things and their connections. From that seed grows an entire field that models the internet, the brain, social networks, logistics, protein interactions, and the structure of knowledge itself.