# 5.8. Data Structures for Disjoint Sets

In 
  • ref
  • Partition
    • A partition of a set X is a set of non-empty subsets of X such that every element x in X is in exactly one of these subsets.
    • Example: X = \{1, 2, 3, 4, 5, 6\}→\{ \{1, 3, 5\}, \{2\}, \{4, 6\} \}
  • Disjoint-set data structure (union-find data structure)
    • Used to effectively manage a collection of subsets that partition a given set of elements.
  • Basic operations on disjoint sets
    • Makeset(x)
      • Make a set containing only the given element x.
    • S = Find(x)
      • Determine which set the particular element x is in.
        • Typically return an element that serves as the subset’s representative.
        • May be used to determine if two elements are in the same subset.
    • Union(x, y) (or Merge(x, y))
      • Merge two subsets into a single subset.
  • Applications
    • Tracking the connected components of an undirected graph
      • Decide if two vertices belong to the same component, or if adding an edge between them would result in a cycle.
      • Useful for implementing the Kruskal’s algorithm for finding minimum spanning tree
  • Scheduling with deadlines
  • Computing shorelines of a terrain
  • Classifying a set of atoms into molecules or fragments.
  • Connected component labeling in image analysis
  • Example
    • U=\{a,b,c,d,e\}
    • For (each x \in U)
      • Makeset(x); → {a}, {b}, {c}, {d}, {e}
      • Union(a, c); → {a, c}, {b}, {d}, {e}
      • {a, c} = Find(a);
      • Union(c, e); → {a, c, e}, {b}, {d}
  • Implementation of disjoint sets using reversed trees
    • Makeset(x)
      Makeset(x){
      	parent(x) := x
      	rank(x) := 0
      }
      • Time complexity: O(1) image
  • Two ways of implementing the Union Operation
    • Time complexity: O(n) image
    • Time complexity: O(\log n) image
  • Union by rank
    • Always attach the smaller tree to the root of the larger tree.
    • The rank increases by one only if two trees of the same rank are merged.
      • The rank of a one-element tree is zero.
    • The Union and Find operations can be done in O(\log n) in the worst case.
      • The number of elements in a tree of rank r is at least 2^r (Proof by induction)
      • The maximum possible rank of a tree with n elements is O(\log n).

Disjoint set의 path compression 연산에 대해서도 알아볼 것.

  • S = Find(x)
    • Time complexity: O(depth of x in the tree)
      Find(x) {
        if (x == parent(x))
          return x 
        else
          return Find(parent(x)) 
      }
      Find(x) {
        while (x != parent(x))
          x := parent(x)
        return x
      }
  • Union(x, y)
    • Time complexity: 2 Find op’s + O (1)
      Union(x, y) { 
        x0 := Find(x) 
        y0 := Find(y) 
        if (x0 == y0)
          return
        if (rank(x0) > rank(y0))
          parent(y0) := x0
        else
          parent(x0) := y0
        if (rank(x0) == rank(y0))
          rank(y0) := rank(y0)+1
      }

# Another Algorithm Based on Disjoint Sets

  • Method
    • d_{max} : the maximum of the deadlines for n jobs.
    • Add a job as late as possible to the schedule being built, but no later than its deadline.
    • Sort the jobs in non-increasing order by profit.
    • Initialize d_{max}+1 disjoint sets, containing the integers 0, 1, 2, ..., d_{max}
    • For each job in the sorted order,
      • Find the set S containing the minimum of its deadline and n.
        • If small(S) = 0, reject the job.
        • Otherwise, schedule it at time small(S), and merge S with the set containing small(S)-1.
  • Time complexity
    • O(n \log m) for the disjoint set manipulation, where m is the minimum of n and d_{max}
    • O(n \log n) for sorting the profits.
  • Example image