Topic 6: Performance and Scalabilitys

Goal:
Explain how to increase the performance of Java code using threading.
Illustrate some of the pitfalls there are in doing this.

Java Executors - Tasks

  • Tasks are a central concept for executors

    • When designing a program using executors, first think about the tasks to be executed
    • Like for threads, tasks can be conveniently defined in their own class
    • Ideally, tasks should be independent (if not, thread satefty must be ensured)
    @Override
        public void run() {
            if ((high-low) < threshold) { 
                for (int i=low; i<=high; i++) if (isPrime(i)) lc.increment();
            } else {
                int mid= low+(high-low)/2;
                Future<?> f1 = pool.submit(new countPrimesTask(lc, low, mid, pool, threshold) );
                Future<?> f2 = pool.submit(new countPrimesTask(lc, mid+1, high, pool, threshold) );
    
                try { f1.get();f2.get(); }
                catch (InterruptedException | ExecutionException e) { }
        }
    }
    
    
    class PrimeCounter {
      private int count= 0;
      public synchronized void increment() {
          count= count + 1;
      }
      public synchronized int get() { 
          return count; 
      }
      public synchronized void setZero() {
          count= 0;
      }
    }
    

    initializes the Executor service

    class PrimeCountExecutor {
          private ExecutorService pool;
          ...
          public PrimeCountExecutor () {
                pool= new ForkJoinPool();
                Future<?> done= pool.submit(new countPrimesTask( ... ));
                try { done.get(); }
          }
      }
    
      // pool= new ForkJoinPool( n );
      // n specifies the number of CPUs that will be used
    
  • The ExecutorService can be shut down.

    pool.shutdown();
    
    • After shutdown the pool cannot be reused, but can assign it a new value
  • Runnable vs. Callable

    • Both are used to specify the code of a thread
    • Runnable
      • cannot return a result
      • overrides run()
    • Callable
      • returns a result (via a Future)
      • overrides call()
    • Runnables may use shared data (e.g., to deliver a result , in countPrimes example), callable can also use shared data.
    public class DotProductTask implements Callable<Integer> {
          final int pos;
          final int[] x, y;
    
          public DotProductTask(int[] x, int[] y, int pos) {
                this.x = x;
                this.y = y;
                this.pos = pos;
          }
    
          @Override
          public Integer call() {
              return x[pos] * y[pos];
          }
    }
    
    List<DotProductTask> tasks = new ArrayList<DotProductTask>();
    
    // Randomly initialize arrays x and y...
    // Create the list of tasks (Futures) to execute
    for (int i = 0; i < N; i++)
          tasks.add(new DotProductTask(x,y,i));
    ...
    // Add all futures to the execution pool at once
    List<Future<Integer>> futures = pool.invokeAll(tasks);
    for(Future<Integer> f : futures) {
          result += f.get(); // Wait for each future to be executed
    
    pool.shutdown();
    
  • Submit vs Execute

    • Both are used to spawn a tasks
    • pool.execute
      • cannot return a result
    • pool.submit
      • returns a result (via a Future)

Scalability, speed-up and loss (of scalablity) classifications

Quicksort executor

class QuicksortTask implements Runnable {
      Task p; // low and high boundaries
      ExecutorService pool;

      @Override public void run() { qsort(p, pool, ... ); }
      
      public static void qsort(Task p, ExecutorService pool, ... 
      ) { 
            Future<?> lowTask= null; Future<?> highTask= null;

            //split task in two: Low and High
            if (Low.size>= threshold) 
                  lowTask = pool.submit( new QuicksortTask( pLow, pool, ... ))
            else 
                  Quicksort(pLow); //sequential sort
            if (High.size>= threshold) 
                  highTask = pool.submit( new QuicksortTask( pHigh, pool, ... ))
            else 
                  Quicksort(pHigh);
 }

 //Waiting for longest running subtask to finish
try {
        lowTask.get();
        highTask.get();
} catch (InterruptedException | ExecutionException e) { e.printStackTrace(); }
  • Sorting results (Does not scale perfectly)
    Executor
    1 8.5 s
    2 4.8 s
    4 2.6 s
    8 2.2 s
    16 2.2 s
  • Loss of scalability
    • Starvation loss
      • QuickSort (第一阶段划分数组时只用了一个cpu横扫)
      • Minimize the time that the task pool is empty
    • Separation loss (best threshold)
      • Prime count
      • Find a good threshold to distribute workload evenly
    • Saturation loss (locking common data structure)
      • Minimize high thread contention in the problem
    • Braking loss
      • String search
      • Stop all tasks as soon as the problem is solved
  • What limits performance
    • CPU-bound
      • Eg. counting prime numbers, sorting, ….
      • To speed up, add more CPUs (cores) (exploitation)
    • Input/output-bound
      • Eg. I/O or reading from network
      • To speed up, use more threads/tasks (inherent)
    • Synchronization-bound (Saturation loss)
      • Eg. Algorithm using shared data structure
      • To speed up, improve shared data structure (eg. Lock striping)

Lock striping