#5.8. Data Structures for Disjoint Sets

In 
  • ref
  • Partition
    • A partition of a set is a set of non-empty subsets of such that every element x in is in exactly one of these subsets.
    • Example:
  • Disjoint-set data structure (union-find data structure)
    • Used to effectively manage a collection of subsets that partition a given set of elements.
  • Basic operations on disjoint sets
    • Makeset(x)
      • Make a set containing only the given element x.
    • S = Find(x)
      • Determine which set the particular element x is in.
        • Typically return an element that serves as the subset’s representative.
        • May be used to determine if two elements are in the same subset.
    • Union(x, y) (or Merge(x, y))
      • Merge two subsets into a single subset.
  • Applications
    • Tracking the connected components of an undirected graph
      • Decide if two vertices belong to the same component, or if adding an edge between them would result in a cycle.
      • Useful for implementing the Kruskal’s algorithm for finding minimum spanning tree
  • Scheduling with deadlines
  • Computing shorelines of a terrain
  • Classifying a set of atoms into molecules or fragments.
  • Connected component labeling in image analysis
  • Example
    • For (each )
      • Makeset(); → {}, {}, {}, {}, {}
      • Union(, ); → {, }, {}, {}, {}
      • {a, c} = Find(a);
      • Union(, ); → {, , }, {}, {}
  • Implementation of disjoint sets using reversed trees
    • Makeset(x)
      Makeset(x){ parent(x) := x rank(x) := 0 }
      • Time complexity: image
  • Two ways of implementing the Union Operation
    • Time complexity: image
    • Time complexity: image
  • Union by rank
    • Always attach the smaller tree to the root of the larger tree.
    • The rank increases by one only if two trees of the same rank are merged.
      • The rank of a one-element tree is zero.
    • The Union and Find operations can be done in in the worst case.
      • The number of elements in a tree of rank is at least (Proof by induction)
      • The maximum possible rank of a tree with n elements is .

Disjoint set의 path compression 연산에 대해서도 알아볼 것.

  • S = Find(x)
    • Time complexity:
      Find(x) { if (x == parent(x)) return x else return Find(parent(x)) } Find(x) { while (x != parent(x)) x := parent(x) return x }
  • Union(x, y)
    • Time complexity: 2 Find op’s +
      Union(x, y) { x0 := Find(x) y0 := Find(y) if (x0 == y0) return if (rank(x0) > rank(y0)) parent(y0) := x0 else parent(x0) := y0 if (rank(x0) == rank(y0)) rank(y0) := rank(y0)+1 }

#Another Algorithm Based on Disjoint Sets

  • Method
    • : the maximum of the deadlines for jobs.
    • Add a job as late as possible to the schedule being built, but no later than its deadline.
    • Sort the jobs in non-increasing order by profit.
    • Initialize disjoint sets, containing the integers
    • For each job in the sorted order,
      • Find the set containing the minimum of its deadline and .
        • If , reject the job.
        • Otherwise, schedule it at time , and merge with the set containing .
  • Time complexity
    • for the disjoint set manipulation, where is the minimum of and
    • for sorting the profits.
  • Example image