High Productivity Language Systems: Next-Generation Petascale Programming

Aniruddha G. Shet, Wael R. Elwasif, David E. Bernholdt, and Robert J. Harrison
Computer Science and Mathematics Division, Oak Ridge National Laboratory

A revolutionary approach to large-scale parallel programming.
Candidate languages:
  • Chapel (Cray)
  • Fortress (Sun)
  • X10 (IBM)
  • Million-way concurrency (and more) will be required on coming HPC systems.
  • The current “Fortran+MPI+OpenMP” model will not scale.
  • New languages from the DARPA HPCS program point the way toward the next-generation programming environment.
  • Emphasis on performance and productivity.
  • Not SPMD:
  • Lightweight “threads,” LOTS of them
  • Different approaches to locality awareness/management
  • High-level (sequential) language constructs:
  • Rich array data types (part of the base languages)
  • Strongly typed object oriented base design
  • Extensible language model
  • Generic programming
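The "index set" idea behind these rich array types can be mimicked in plain Python (a loose, shared-memory sketch; the names D and A are illustrative, and in the HPCS languages the index set additionally carries the array's distribution across nodes):

```python
# An "index set" (Chapel calls it a domain): the set of indices is a
# first-class value, defined once and reused.
D = [(i, j) for i in range(4) for j in range(4)]

# An array declared over that index set: the domain, not hand-written
# loop bounds, drives both allocation and iteration.
A = {idx: 0.0 for idx in D}
for (i, j) in D:
    A[(i, j)] = float(i + j)
```

In Chapel, for example, domains are first-class enough that reassigning a domain resizes every array declared over it.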
  • Based on joint work with
  • Argonne National Laboratory
  • Lawrence Berkeley National Laboratory
  • Rice University
  • And the DARPA HPCS program

Concurrency: The next generation
  • Single initial thread of control
  • Parallelism through language constructs
  • True global view of memory, one-sided access model
  • Support task and data parallelism
  • “Threads” grouped by “memory locality”
  • Extensible, rich distributed array capability
  • Advanced concurrency constructs:
  • Parallel loops
  • Generator-based looping and distributions
  • Local and remote futures
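These constructs map loosely onto mainstream threading APIs. A minimal sketch with Python's concurrent.futures — purely local and illustrative, where the HPCS languages provide the same ideas as first-class, potentially remote operations:

```python
from concurrent.futures import ThreadPoolExecutor

def work(i):
    # Stand-in for a per-iteration computation.
    return i * i

with ThreadPoolExecutor(max_workers=4) as pool:
    # "Parallel loop": iterations dispatched across worker threads.
    squares = list(pool.map(work, range(8)))

    # "Future": a handle to a value still being computed;
    # result() blocks until the value is ready.
    fut = pool.submit(work, 100)
    value = fut.result()
```

A remote future in X10 or Fortress works the same way from the caller's side; the difference is that the computation may run in another memory locality entirely.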

What about productivity?
  • Index sets/regions for arrays
  • “Array language” (Chapel, X10)
  • Safe(r) and more powerful language constructs
  • Atomic sections vs locks
  • Sync variables and futures
  • Clocks (X10)
  • Type inference
  • Leverage advanced IDE capabilities
  • Units and dimensions (Fortress)
  • Component management, testing, contracts (Fortress)
  • Math/science-based presentation (Fortress)
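Atomic sections remove the bookkeeping that lock-based code must carry. For contrast, here is the lock-based version of a simple shared update in Python (illustrative; counter and deposit are made-up names) — an atomic section in Chapel or X10 expresses the same update without an explicit lock object:

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(amount):
    global counter
    # What an atomic section states declaratively, lock-based code must
    # spell out: acquire, update, release — and never forget the lock.
    with lock:
        counter += amount

threads = [threading.Thread(target=deposit, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now 100
```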

Exploring new languages: Quantum chemistry
  • Fock matrix construction is a key kernel.
  • Used in pharmaceutical and materials design, understanding combustion and catalysis, and many other areas.
  • Scalable algorithm is irregular in both data and work distribution.
  • Cannot be expressed efficiently using MPI.
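The irregular work distribution is what drives a dynamic work-pool design: blocks of integrals are pulled by whichever processor is free, rather than being statically assigned. A minimal shared-memory stand-in in Python (the integral blocks are represented by bare indices, and the per-block computation is a placeholder):

```python
import queue
import threading

# Work pool: each item is one block of integrals, identified here
# by a bare index.
pool = queue.Queue()
for block_id in range(16):
    pool.put(block_id)

results = []
results_lock = threading.Lock()

def worker():
    # Each "CPU" repeatedly pulls the next available block: dynamic
    # load balancing, so expensive blocks don't stall the others.
    while True:
        try:
            block = pool.get_nowait()
        except queue.Empty:
            return
        contribution = block * block  # placeholder for the integral work
        with results_lock:
            results.append(contribution)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In the distributed setting the pool itself must be a global-view structure and the D, F updates one-sided accumulates — exactly the features the HPCS languages provide and MPI makes awkward.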
  • D and F are global-view distributed arrays; each CPU works on task-local blocks of D and F, drawing from a shared work pool of integral blocks (mn|ls).
  • Fock matrix contribution: Fmn ← Dls [ 2 (mn|ls) − (ml|ns) ], accumulated over ls for each block.

[Figure: Parallelism and global-view data in Fock matrix build — CPUs 0 … P−1 pull integral blocks (mn|ls) from the work pool and update the distributed D and F arrays.]
[Figure: Load balancing approaches for Fock matrix build]

Tradeoffs in HPLS language design
  • Emphasis on parallel safety (X10) vs expressivity (Chapel, Fortress)
  • Locality control and awareness:
  • X10: explicit placement and access
  • Chapel: user-controlled placement, transparent access
  • Fortress: placement “guidance” only; the local/remote distinction is blurred (data may move!)
  • What about mental performance models?
  • Programming language representation:
  • Fortress: Allow math-like representation
  • Chapel, X10: Traditional programming language front end
  • How much do developers gain from mathematical representation?
  • Productivity/performance tradeoff
  • Different users have different “sweet spots”

Remaining challenges
  • (Parallel) I/O model
  • Interoperability with (existing) languages and programming models
  • Better (preferably portable) performance models and scalable memory models
  • Especially for machines with 1M+ processors
  • Other considerations:
  • Viable gradual adoption strategy
  • Building a complete development ecosystem
Contacts

Aniruddha G. Shet, Computer Science Research Group, Computer Science and Mathematics Division, (865) 576-5606, shetag@ornl.gov
Wael R. Elwasif, Computer Science Research Group, Computer Science and Mathematics Division, (865) 241-0002, elwasifwr@ornl.gov
David E. Bernholdt, Computer Science Research Group, Computer Science and Mathematics Division, (865) 574-3147, bernholdtde@ornl.gov
Robert J. Harrison, Computational Chemical Sciences, Computer Science and Mathematics Division, (865) 241-3937, harrisonrj@ornl.gov

Elwasif_HPCS_SC07