In the previous topic, we discussed the basic terminology used with graphs. If you remember correctly, we associated every definition with social networks, namely, with the graph below:
In this topic, we will continue with some important concepts related to graphs, which are crucial to understanding and writing cool graph algorithms. So, buckle up, and let's get started!
Paths and cycles
A path is a path. Literally... Jokes aside, a path in a graph is an alternating sequence of nodes and edges. Consider the following examples:
The blue nodes and edges correspond to the path . If each node in a path appears only once, this path is called simple. The path above is a simple path since there are no nodes appearing multiple times. If we add the edge to this path, it will stop being simple, since now the node appears twice.
If the path starts and ends at the same node, this path is called a cycle. For example, the green nodes and edges form a cycle .
For directed graphs, the terms are the same. Notice that the edges are directed; therefore, the alternating sequence should look like this: (node 1) — (outgoing edge from node 1) — (node 2) — (outgoing edge from node 2) and so on. Let's get back to our Instagraph:
Assume that you follow Alex, Alex follows Dora, she follows Kate, and Kate listens to songs of L&D, a popular band that posts its songs on Instagram. Recently, they posted a song called "Just the beginning". What are the chances that this song shows up on your feed? Well, Kate will obviously repost it. If Dora and Alex decide to do so as well, the post with the original song will appear on your feed thanks to the repost from Alex. Therefore, the song will have a chance to get a like from you. Formally, a path from you to L&D will be created, namely, You Alex Dora Kate L&D. This is the basic idea of how the algorithms that intend to raise users' popularity on social media work. They tend to find users that are potential reposters and have friends with similar qualities as well.
Connectivity in graphs
Consider the initial graph. You may think that there are two graphs, but you would be wrong. Actually, this is a disconnected graph with two components. Let's try to define formally what connectivity means.
In an undirected graph, two nodes and are said to be connected if there is a path between them. A set of nodes where each pair is connected by a path forms a connected component. A graph is called a connected graph if it contains exactly one connected component. Otherwise, the graph is disconnected.
Disconnected graphs arise pretty often. Let the users in the graph above be computer science students. Assume they've had a disagreement leading to the creation of two groups of friends. Everyone from the first group unfriended the students from the other group and vice versa. In one word, teen stuff. From the graphs' point of view, this implies the formation of two connected components: and as shown in the picture.
Note that now we are talking only about undirected graphs. As for directed graphs, a definition of a connected component is a bit trickier: we will discuss it in the following topics.
Subgraphs
Filters play an essential role in searching, don't they? What happens with a graph of users when filtering? Let's suppose we want to search among the female users. From the initial structure, we are left with the graph on the left. Or assume you want to find users whose name contains four letters. We end up with the mini-graph on the right.
Formally, a graph obtained from another graph by removing nodes or edges is called a subgraph. From the example above, the graphs and are subgraphs of our initial graph, because we get them by removing some nodes and edges: it shouldn't be difficult for you to see which ones.
Note that when we remove a node from a graph, we also remove all edges incident to this node.
Summary
In this topic, we've considered some important concepts related to graphs. Here is a quick recap:
-
A path is an alternating sequence of nodes and edges. A cycle is a path that starts and ends at the same node.
-
Two nodes are said to be connected if there is a path between them. A set of nodes where each pair is connected by a path forms a connected component. A graph is called a connected graph if it contains exactly one connected component. Otherwise, the graph is disconnected.
-
A subgraph is a graph obtained from another graph by removing nodes or edges.
Although this is just the beginning (not the song from L&D), this vocab is already enough to discuss more sophisticated graph algorithms and concepts – so let's go and do that!