Enjoy light refreshments, coffee, tea and water prior to starting your day.
TUTORIAL: Enabling Your Campus to Simplify Research Data Management with Globus Online
ABSTRACT: In this tutorial, XSEDE Campus Champions and owners of campus resources will learn how to deliver easy-to-use yet sophisticated data management solutions to their researchers using Globus services. Globus provides research data management capabilities using software-as-a-service (SaaS) approaches, without requiring construction of custom end-to-end systems. Globus services provide secure, scalable, robust solutions to the issues faced by users when moving, storing, and sharing "big data" among distributed research groups. The Globus Transfer service addresses the challenges of moving large data sets between campus/XSEDE resources and researchers' personal computers. The Globus Storage service enables users to place data on campus storage and other cloud storage systems, and allows them to access, update, snapshot, and share versions of their data with anyone on campus as well as with collaborators at other institutions. This tutorial will demonstrate how a campus can easily enable data management services for their end users using Globus. Participants will learn how to set up a Globus Transfer endpoint using Globus Connect Multi-User (GCMU) and how to create and manage Globus Storage endpoints. Participants also will learn how to set up a MyProxy OAuth server and configure their endpoint to use it for user authentication.
REQUIRES: Laptop
TUTORIAL: Hands-on Tutorial for Building Science Gateway Applications on Cyberinfrastructure
ABSTRACT: The science gateway approach has been widely adopted to bridge cyberinfrastructure (CI) and domain science communities by establishing an online problem-solving environment that seamlessly manages domain-specific computations on CI and provides usable Web- and/or desktop-based gateway applications to community users. As CI resources become increasingly available and accessible for researchers and scientists, agile and effective gateway application development becomes crucial to efficiently leverage CI computing power in domain-specific computation and allow researchers to concentrate on their domain problem solving. This tutorial uses SimpleGrid, a toolkit for efficient learning and development of science gateway building blocks, to provide hands-on experience in leveraging CI (XSEDE in particular) for domain-specific scientific computing, developing XSEDE-enabled science gateway applications as software and Web services, and integrating gateway applications as modular and highly usable Web 2.0 applications. Apache Rave will be used for hands-on exercises of building gateway app gadgets and a prototype Web portal. The intended audience for this tutorial includes researchers and developers who have their scientific codes ready for planned use of CI and are interested in providing community access to the codes by creating a science gateway.
REQUIRES: Web browser, SSH
PREREQUISITES: Web development, grid computing experience
ABSTRACT: Visualization is largely understood and utilized by researchers as an excellent communication tool; this narrow view often keeps scientists from using and developing visualization skill sets. This tutorial will provide a ground-up understanding of visualization and its utility in error diagnostics and in the exploration of data for scientific insight. When used effectively, visualization can provide a complementary and effective toolset for data analysis, which is one of the most challenging problems in computational domains. In this tutorial, we plan to bridge these gaps by providing end users with fundamental visualization concepts, execution tools, and usage examples. The tutorial will be presented in three sessions covering visualization fundamentals, visualization with ParaView, and visualization with VisIt.
REQUIRES: Laptop with VisIt and ParaView installed
TUTORIAL: Introduction to BigJob - A SAGA-Based Interoperable, Extensible and Scalable Pilot-Job for XSEDE
ABSTRACT: The SAGA-based Pilot-Job, known as BigJob, provides the unique capability to use Pilot-Jobs on the highest-performing machines as well as collectively on distributed cyberinfrastructure. It supports very large-scale parallel jobs, as well as high throughput of many smaller jobs. In addition to the number and range of job sizes that it supports, what makes BigJob unique among all Pilot-Jobs is its ability to be programmatically extended to support a range of "simple workflows," provide application-level control of both the Pilots and the tasks assigned to the Pilot-Job, and its interoperability over all XSEDE and OSG platforms. This half-day tutorial will bring together SAGA team members, XSEDE staff, and XSEDE end users (scientists using BigJob) to deliver: The basic concepts behind Pilot-Jobs, several science exemplars that routinely use BigJob on XSEDE for extreme-scale science, introduction on how to use BigJob, how to use BigJob on XSEDE and OSG, how to program and customize BigJob for your needs, building frameworks using BigJob, and advanced concepts and application-level scheduling using BigJob.
REQUIRES: Laptop
TUTORIAL: Preparing for Stampede: Programming Heterogeneous Many-Core Supercomputers
ABSTRACT: The next round of supercomputing technology will feature heterogeneous architectures with many-core CPU and accelerator technologies. In the coming year, the Texas Advanced Computing Center (TACC) will deploy a 2PF Intel Sandy Bridge, 8PF Intel MIC Architecture hybrid cluster named Stampede, featuring hundreds of thousands of heterogeneous cores. This tutorial will introduce experienced C/C++ and Fortran programmers to techniques essential for preparing their scientific applications for future systems. These future architectures will support wider and more powerful SIMD vector units, so the first half of the tutorial will focus on understanding modern vector programming, compiler vectorization reports, solutions to common vectorization problems, and ways to employ new SIMD instructions. The second half of the tutorial will concentrate on using OpenMP directives and tasks to exploit the parallelism inherent in high core-count processors and heterogeneous systems. Hands-on exercises will be conducted. Motivating examples from Intel's future MIC Architecture will be presented.
PREREQUISITES: Parallel programming experience
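For the vectorization and OpenMP portions of this tutorial, a minimal sketch of the programming style involved is shown below. It is an illustration written for this listing, not material taken from the tutorial itself: a unit-stride, dependence-free loop that a compiler can auto-vectorize, wrapped in an OpenMP directive for thread-level parallelism.

    /* Hedged illustration: a loop structured for compiler auto-vectorization
     * (unit-stride access, no loop-carried dependence, restrict-qualified
     * pointers), combined with OpenMP for thread-level parallelism.
     * Typical build lines (illustrative, not prescriptive):
     *   gcc -std=c99 -O3 -fopenmp saxpy.c
     *   icc -O3 -openmp -vec-report=2 saxpy.c
     */
    #include <stdio.h>
    #include <stdlib.h>

    void saxpy(int n, float a, const float *restrict x, float *restrict y)
    {
        /* Each iteration is independent, so the compiler can vectorize the
         * arithmetic with SIMD instructions while OpenMP splits the
         * iteration space across cores. */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    int main(void)
    {
        int n = 1 << 20;
        float *x = malloc(n * sizeof *x);
        float *y = malloc(n * sizeof *y);
        for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        saxpy(n, 3.0f, x, y);

        printf("y[0] = %f\n", y[0]); /* expect 5.0 */
        free(x);
        free(y);
        return 0;
    }

The compiler's vectorization report for a loop like this is the kind of output the first half of the tutorial teaches attendees to read and act on.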
TUTORIAL: Selecting and Using XSEDE Resources for Maximum Productivity
ABSTRACT: The XSEDE program provides a wide variety of resources to the research and academic community. Due to the varied nature of these resources, it is not always easy or even clear which resources to select for a project. This tutorial will provide an overview of the requirements that need to be determined and how to find resources that "match" these requirements. It also will cover some basic information about the computational and data resources available to the user community, as well as the tools and information provided to assist in the selection of these resources. Some examples of "matching users to resources" will be provided, as well as information on the various methods of accessing the resources. Tips on making the most of the resources selected also will be covered. Example serial and parallel code and access to several resources for a "hands-on" workshop will be provided in the second half of the tutorial. Some of the code examples will highlight the utilization of the "special" Condor Pool and OSG resources, as well as a few others.
REQUIRES: Laptop, GSISSH, Web browser
PREREQUISITES: XSEDE ID with access to resource
ABSTRACT: This tutorial provides an overview of the grid middleware UNICORE 6 covering both server and client components. First, the system's overall architecture will be introduced, followed by a discussion of the features and some technical details on the main components. This includes a discussion of services for job execution, data storage, and user management, as well as service discovery, data transfers, and security issues.
REQUIRES: Laptop, Java
ABSTRACT: This tutorial will provide an introduction to the basic ideas and key technologies that comprise high-performance computing today. Although the sessions are targeted toward students, any researcher new to the field of high-performance scientific computing could benefit from this introduction. No previous HPC experience is required. Although there are no strict requirements, a laptop with a command line terminal program will be helpful in following along with the examples. The topics covered in this morning session will be an introduction to Linux, which will introduce students to the command line interface and explore some useful command line tools; an introduction to high-performance computing, which will focus on clusters and how they are used in scientific computing; and an introduction to high-performance file systems, which will detail the different types of storage available in HPC and explore best practices for data movement and curation.
ABSTRACT:
STRONGLY ENCOURAGED: Laptop (Windows, MacOS or Linux); free software might need to be downloaded during the session
PREREQUISITES: One recent semester of programming in C or C++; recent basic experience with any Unix-like operating system (could be Linux but doesn't have to be). (Attendees with Fortran experience will be able to follow along.) No previous HPC experience will be required.
TUTORIAL: A New and Improved Eclipse Parallel Tools Platform: Advancing the Development of Scientific Applications
ABSTRACT: Many HPC developers still use command-line tools and tools with disparate and sometimes confusing user interfaces for the different aspects of the HPC project life cycle. The Eclipse Parallel Tools Platform (PTP) combines tools for coding, debugging, job scheduling, tuning, revision control, and more into an integrated environment for increased productivity. Leveraging the successful open source Eclipse platform, PTP helps manage the complexity of HPC scientific code development and optimization on diverse platforms and provides tools to gain insight into complex code that is otherwise difficult to attain. This tutorial will provide attendees with a hands-on introduction to Eclipse and PTP. Access to a parallel system from XSEDE for the hands-on portions will be provided.
REQUIRES: Laptop pre-installed with Eclipse and PTP.
See http://wiki.eclipse.org/PTP/tutorials/XSEDE12 for installation instructions.
TUTORIAL: Accelerator Programming with OpenACC and CUDA
ABSTRACT: Scope: full day, beginning through intermediate material. Presenters: Jon Urbanic (PSC), Introduction to OpenACC; Lars Koesterke (TACC), Intermediate CUDA.
This full-day tutorial will address the two main programming models in use today for adapting HPC codes to effectively use GPU accelerators. The two half-day sessions will share some common techniques for achieving best performance with an accelerator.
Introduction to OpenACC:
The Intro to OpenACC session will cover the newest programming model for accelerators, based on OpenMP-like directives. While there are no prerequisites, reading or skimming one of the many introductory CUDA tutorials would be helpful, and a working knowledge of C or Fortran is required. Students may participate in hands-on programming sessions hosted on a cluster at PSC. Topics and hands-on sessions will include (a brief directive sketch follows the topic list):
-Welcome/Intro to the Environment (10 mins)
-Parallel Computing Overview (10 mins)
-Introduction to OpenACC (2 hrs)
-OpenACC with CUDA Libraries (30 mins)
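The directive sketch below gives a flavor of the OpenACC style this session covers; it is an illustration assembled for this listing, not part of the tutorial materials, and the compiler flag shown is only one possibility.

    /* Hedged OpenACC sketch: a simple loop offloaded to an accelerator with a
     * single "parallel loop" directive and explicit data clauses.
     * Example build (illustrative): pgcc -acc -std=c99 vecadd.c
     */
    #include <stdio.h>

    #define N 1000000

    int main(void)
    {
        static float x[N], y[N];
        for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        /* copyin: x is only read on the device; copy: y is copied in,
         * updated on the device, and copied back when the region ends. */
        #pragma acc parallel loop copyin(x[0:N]) copy(y[0:N])
        for (int i = 0; i < N; i++)
            y[i] = 2.0f * x[i] + y[i];

        printf("y[0] = %f\n", y[0]); /* expect 4.0 */
        return 0;
    }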
Intermediate CUDA:
This portion of the tutorial will cover intermediate programming techniques and performance tools/tips for CUDA programmers. There are many introductory materials for the beginning CUDA programmer and one or more of these is considered a prerequisite for this portion of the tutorial.
- http://developer.nvidia.com/cuda-education-training
The intermediate CUDA material will be presented in lecture-discussion style using well-developed example codes and walkthroughs. Students will receive working sample code and routines they can incorporate into their own HPC projects. Topics and examples will include the following (a short streams sketch appears after the list):
- CUDA Fortran
- CUDA C
- Optimizing shared memory use in the accelerator
- Using streams to overlap communication and computation on the accelerator
- MPI and accelerators with CUDA
- Driving multiple GPUs per process
- Optimizing CPU core-GPU device affinity for maximum memory bandwidth (see http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/DellNVIDIACluster/Doc/Architecture.html#Affinity)
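As a taste of the streams topic above, the sketch below overlaps host-device transfers with kernel execution by issuing independent chunks of work into separate CUDA streams. It is an illustrative example written for this listing (the kernel, sizes, and launch parameters are arbitrary), not the tutorial's sample code.

    /* Hedged CUDA C sketch: process the data in chunks so the copy of chunk
     * k+1 can overlap the kernel working on chunk k, each chunk in its own
     * stream. Example build (illustrative): nvcc -O2 streams.cu
     */
    #include <stdio.h>
    #include <cuda_runtime.h>

    __global__ void scale(float *d, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) d[i] *= 2.0f;
    }

    int main(void)
    {
        const int nChunks = 4, chunk = 1 << 20;
        const int n = nChunks * chunk;
        float *h, *d;
        cudaStream_t s[4];

        cudaMallocHost((void **)&h, n * sizeof(float)); /* pinned memory is required for async copies */
        cudaMalloc((void **)&d, n * sizeof(float));
        for (int i = 0; i < n; i++) h[i] = 1.0f;
        for (int k = 0; k < nChunks; k++) cudaStreamCreate(&s[k]);

        for (int k = 0; k < nChunks; k++) {
            int off = k * chunk;
            cudaMemcpyAsync(d + off, h + off, chunk * sizeof(float),
                            cudaMemcpyHostToDevice, s[k]);
            scale<<<(chunk + 255) / 256, 256, 0, s[k]>>>(d + off, chunk);
            cudaMemcpyAsync(h + off, d + off, chunk * sizeof(float),
                            cudaMemcpyDeviceToHost, s[k]);
        }
        cudaDeviceSynchronize();

        printf("h[0] = %f\n", h[0]); /* expect 2.0 */
        for (int k = 0; k < nChunks; k++) cudaStreamDestroy(s[k]);
        cudaFreeHost(h);
        cudaFree(d);
        return 0;
    }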
ABSTRACT: This tutorial, suitable for attendees with intermediate-level experience in parallel programming, will provide a comprehensive overview of the optimization techniques used to speed up parallel programs. It will focus on analyzing and tuning the performance of the code to efficiently use XSEDE computing resources. An overview of parallel programming models (MPI, OpenMP, threads, and hybrid) and performance tools (FPMPI, PAPI, IPM, TAU, and PerfExpert) will be presented, emphasizing performance measurement, profiling, tracing, and analysis to improve application performance. Moreover, a discussion of various MPI/OpenMP techniques, I/O paradigms, and state-of-the-art numerical linear algebra packages will be presented with emphasis on best practices and scalability. A hands-on session will be conducted for each part to give participants the opportunity to investigate techniques and performance optimizations on an HPC system. Example codes with instructions will be provided to facilitate individual discovery. Additionally, participants may bring and work on their own codes.
REQUIRES: Laptop, SSH
PREREQUISITES: Parallel programming experience
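To make the hybrid MPI/OpenMP model concrete, the sketch below shows the pattern the tutorial's profiling tools instrument: OpenMP threads inside each MPI rank, an MPI reduction across ranks, and MPI_Wtime() bracketing the region of interest. It is a generic illustration assembled for this listing, not one of the tutorial's example codes.

    /* Hedged hybrid MPI + OpenMP sketch with simple hand timing; tools such
     * as IPM or TAU gather this kind of measurement automatically.
     * Example build/run (illustrative):
     *   mpicc -std=c99 -O2 -fopenmp hybrid.c && mpirun -np 4 ./a.out
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 22;
        double local = 0.0, global = 0.0;
        double t0 = MPI_Wtime();

        /* Thread-level parallelism inside each rank. */
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < n; i++)
            local += 1.0 / (double)(i + 1 + rank * n);

        /* Process-level reduction across ranks. */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("sum = %.6f  elapsed = %.3f s  threads/rank = %d\n",
                   global, t1 - t0, omp_get_max_threads());

        MPI_Finalize();
        return 0;
    }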
TUTORIAL: Developing Science Gateways using Apache Airavata API
ABSTRACT: The tutorial will be based on Apache Airavata, a software toolkit to build science gateways. Airavata provides features to compose, manage, execute, and monitor small- to large-scale applications and workflows on computational resources ranging from local clusters to national grids such as XSEDE and computing clouds such as Amazon Elastic Compute Cloud. Airavata builds on general concepts of service-oriented computing, distributed messaging, workflow composition, and orchestration. The Airavata suite includes tools for workflow composition and monitoring. The standout feature of the workflow engine is its ability to interpret the workflow at each step, providing dynamic, interactive capabilities. The core capabilities provide the ability to wrap command line-driven science applications and make them into robust, network-accessible services. The Airavata Registry provides a persistent data store. The gateway building toolkit also includes a publish/subscribe-based messaging system with features to incorporate clients behind firewalls and overcome network glitches.
REQUIRES: SSH, SVN, Web browser
PREREQUISITES: Grid computing, portal development experience
ABSTRACT: High-throughput computing is not as well understood as some of the other types of computing available on XSEDE resources. However, the Purdue Condor Pool and OSG are very good resources for certain types of computational scientific research and can provide a large number of compute hours if the type of research computing required is a good match. This tutorial will cover the concept of high-throughput computing, the types of jobs that might be a good "match" for the Purdue Condor Pool and OSG resources, as well as a background in the Condor software that provides the main job submission and workflow mechanism for both resources. Additional details will be provided on how both the Purdue Condor Pool and OSG are set up. Example serial code, as well as access to both the Purdue Condor Pool and OSG for a "hands-on" workshop, will be provided in the second half of the tutorial.
REQUIRES: Laptop, GSISSH, Web browser
PREREQUISITES: XSEDE ID on Condor Pool
ABSTRACT: Infrastructure-as-a-service (IaaS) cloud computing (sometimes also called "infrastructure cloud computing") has recently emerged as a promising outsourcing paradigm; it has been widely embraced commercially and is also beginning to make inroads in scientific communities. Although popular, the understanding of its benefits, challenges, modes of use, and general applicability as an outsourcing paradigm for science is still in its infancy, which gives rise to many myths and misconceptions. Without specific and accurate information, it is hard for the scientific communities to understand whether this new paradigm is worthwhile – and if so, how to best develop, leverage, and invest in it. Our objective in this tutorial is to facilitate the introduction to infrastructure cloud computing to scientific communities and provide accurate and up-to-date information about features that could affect its use in science: to conquer myths, highlight opportunities, and equip the attendees with a better understanding of the relevance of cloud computing to their scientific domains. To this end, we have developed a tutorial that mixes the discussion of various aspects of cloud computing for science, such as performance, privacy, and standards, with practical exercises using infrastructure clouds and state-of-the-art tools. We will be using FutureGrid clouds for the examples.
REQUIRES: Laptop, SSH
PREREQUISITES: Linux experience
ABSTRACT: This tutorial aims to present OpenACC in the context of a code migration methodology. Besides providing a clear view of how to migrate applications onto new many-core processors (currently GPUs), the main objective of a methodology is to reduce risks and improve efficiency. The CAPS HMPP compiler for OpenACC was released on April 24, 2012. The purpose of OpenACC is to define a common set of directives across the CAPS, Cray, and PGI compilers. This initiative comes in anticipation of the eventual extension of OpenMP to accelerators.
ABSTRACT: The Appro Gordon Compute Cluster was put into production at the San Diego Supercomputer Center (SDSC) in early 2012. In addition to providing academic users with their earliest access to the Intel EM64T Xeon E5 (Sandy Bridge) processor, it contains a number of unique features that make it ideal for data-intensive applications. All current and potential Gordon users are invited to attend, but we especially encourage practitioners from domains that have not traditionally been major users of NSF compute resources (e.g., political science, linguistics, economics, finance, data analytics, and sociology) to participate. The tutorial covers the Gordon architecture, the types of applications that are ideally suited for the system, and how to run jobs and get the best performance on the system.
PREREQUISITES: Linux experience
ABSTRACT: This tutorial will provide training and hands-on activities to help new users learn and become comfortable with the basic steps necessary to first obtain, and then successfully employ, an XSEDE allocation to accomplish their research goals. The tutorial will consist of three sections. The first section, "Information Security Training for XSEDE Researchers," will review basic information security principles for XSEDE users, including how to protect yourself from online threats and risks, how to secure your desktop/laptop, safe practices for social networking, email and instant messaging, how to choose a secure password, and what to do if your account or machine has been compromised. The second part will explain the XSEDE allocations process and how to write and submit successful allocation proposals. The instructor will describe the contents of an outstanding proposal and the process for generating each part. Topics covered will include the scientific justification, the justification of the request for resources, techniques for producing meaningful performance and scaling benchmarks, and navigating the POPS system through the XSEDE Portal for electronic submission of proposals. The last part of the tutorial will cover the New User Training material that has been delivered remotely each quarter, but it will delve deeper into these topics. New topics will be covered, including how to troubleshoot why a job has not run and how to improve job turnaround by understanding differences in batch job schedulers on different platforms. We will demonstrate how to perform the various tasks on the XSEDE Portal with live, hands-on activities and personalized help. In the event of network issues, we will have demos available as a backup. We anticipate significant interest from Campus Champions, and therefore, we will explain how attendees can assist others, as well as briefly describe projects that are currently being carried out in non-traditional HPC disciplines.
REQUIRES: Laptop, Web browser
TUTORIAL: XSEDE/Genesis II – From the Global Federated File System to Running Jobs
ABSTRACT: XSEDE introduces a new approach to satisfying user needs. The approach combines an emphasis on interoperability and standards as a means to reduce risk and provide a diversity of software sources; the inclusion of campus and research-group resources as first-class members of XSEDE; and particular attention to non-functional quality attributes such as ease of use and availability. This tutorial introduces students to using XSEDE access layer tools and sharing capabilities such as the Global Federated File System (GFFS), the XSEDE queue, and the XSEDE client user interfaces.
REQUIRES: Laptop
PREREQUISITES: XSEDE ID preferred; Linux experience
Enjoy light refreshments, coffee, tea and water prior to starting your day.
Welcome - Craig Stewart
Speaker: John Towns, NCSA
Title: State of XSEDE
Abstract - XSEDE has just wrapped up the first year of the program, a year filled with a myriad of activities. In this presentation we will review the purpose and mission of XSEDE. In the spirit of the conference theme, "Bridging from the eXtreme to the campus and beyond," we will also review the expanded focus of the program with respect to the broad community it intends to support. Throughout the talk we will highlight some of the significant accomplishments of XSEDE in its inaugural year.
Speaker: Richard Tapia, Rice
Title: Crisis In Higher Education: The Need For New Leadership
Abstract: Extreme growth in the nation's Hispanic population, primarily Mexican American, is pushing educational challenges to a crisis level for the country. The problem is exacerbated by the fact that this fastest growing segment of the nation's population continues to be the least educated. The speaker warns that the rate at which the minority population is growing outpaces the rate at which we are improving our effectiveness in educating this segment of the population. Because the economic health of the country is based in large measure upon technical advances, the country must find a way to incorporate this growing population into the mainstream of scientific and technical endeavors.
The speaker's remarks will focus on the successes and failures of the nation’s tier 1 universities regarding their representation at the undergraduate, graduate, and faculty levels in science, engineering, and mathematics. The speaker will also relate how he became a leader in underrepresentation issues at the campus, state, and national levels, and will discuss challenges he's faced throughout this journey.
Invited Speaker: Edith Gummer, NSF Program Director, Directorate for Education and Human Resources (EHR)/ Research on Learning in Formal and Informal Settings (DRL)
Title: Evaluation in the Extreme Setting: Micro and Macro Challenges and Opportunities
Abstract: This presentation will focus on the challenges and opportunities in evaluation of learning environments that address complex outcomes of computational literacy and effective analysis of large-scale data. The challenges include the design and implementation of assessments of these learning outcomes, including the intersection of disciplinary knowledge and computational literacy. Gathering information about the effective elements of multifaceted instructional environments requires the efforts of a team of disciplinary and educational researchers. Evaluations need to include both quantitative and qualitative aspects to ascertain not only how effectively the design and implementation of courses and programs improves student learning, but also for whom and under what circumstances. Course instructors and the educational researchers and evaluators with whom they work should consider the opportunities they have to link with others who are working in this area to facilitate the use of common instruments and processes so that resources and findings can be shared across the community.
Science: Membrane protein simulations under asymmetric ionic concentrations
Abstract: Important cellular processes, such as cell-cell recognition, signal transduction, and transport of electrical signals, are controlled by membrane proteins. Membrane proteins act as gatekeepers of the cellular environment by allowing passage of ions, small molecules, or nascent proteins under specific environmental signals such as transmembrane voltage, changes in ionic concentration, or binding of a ligand. Molecular dynamics simulations of membrane proteins, performed in a lipid bilayer environment, mimic the cellular environment by representing the solvent, lipids, and the protein in full atomistic detail. These simulations employ periodic boundary conditions in three dimensions to avoid artifacts associated with the finite size of the system. Under these conditions, the membrane protein system is surrounded by ionic solutions on either side of the membrane whose properties cannot be changed independently. We have developed a computational method that allows simulations of membrane proteins under periodic boundary conditions while controlling the properties of the two ionic solutions independently. In this method, an energy barrier is introduced between the two adjacent unit cells and separates the two ionic solutions. The height of the barrier affects the chemical potential of the ions on each side of the barrier, and thus allows for individual control over ionic properties. During the course of the simulation, the height of the barrier is adjusted dynamically to reach the proper ionic concentration on each side. This method has been implemented in the Tcl interface of the molecular dynamics program NAMD.
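For background only (a textbook estimate, not a statement about the authors' specific control scheme): the rate at which ions thermally cross an energy barrier of height ΔU is suppressed roughly by the Boltzmann factor,

    k_{\mathrm{cross}} \;\propto\; \exp\!\left(-\frac{\Delta U}{k_B T}\right),

so a barrier of about 5 k_B T (roughly 3 kcal/mol at 298 K) already reduces exchange between the two solutions by more than two orders of magnitude, which is why a modest, dynamically tuned barrier can keep the two compartments at different concentrations.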
We have applied this method to simulate the voltage-gated potassium channel Kv1.2 under physiological conditions, in which the extracellular solution is made of 10 mM KCl and 100 mM NaCl, while the intracellular solution has an ionic concentration of 100 mM KCl and 10 mM NaCl. The simulations maintain 1:10 and 10:1 ratios between the ionic concentrations on each side. The simulations are performed under a voltage bias of 100 mV and provide the first simulation of potassium channels under these exact physiological conditions.
The method has also been applied to simulate ionic currents passing through OmpF, an outer membrane porin, under membrane potentials. Here we were able to accurately calculate the reversal potential of the OmpF channel in a tenfold salt gradient of 0.1 M intracellular to 1 M extracellular KCl. Our results agree with experimental ion conductance measurements and reproduce key features of ion permeation and selectivity of the OmpF channel. Specifically, the I-V plots obtained under asymmetric ionic solutions revealed the natural asymmetry in the channel caused by increased conductance rates observed at positive potentials, as well as the inherent cation-selectivity of the OmpF pore. Therefore, we have developed a method that directly relates molecular dynamics simulations of ionic currents to electrophysiological measurements in ion channels.
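For orientation (not a claim from the abstract): the ideal Nernst limit for a perfectly cation-selective pore in this tenfold KCl gradient is

    E_{\mathrm{rev}} \;=\; \frac{RT}{zF}\,\ln\frac{[\mathrm{KCl}]_{\mathrm{high}}}{[\mathrm{KCl}]_{\mathrm{low}}} \;\approx\; 25.7\ \mathrm{mV} \times \ln 10 \;\approx\; 59\ \mathrm{mV} \qquad (T = 298\ \mathrm{K},\ z = 1),

while the corresponding anion-selective limit is about -59 mV; where the computed reversal potential falls between these two limits is one way to quantify the mild cation selectivity of OmpF reported here.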
Software: UltraScan Solution Modeler: Integrated Hydrodynamic Parameter and Small Angle Scattering Computation and Fitting Tools
Abstract: UltraScan Solution Modeler (US-SOMO) processes atomic and lower-resolution bead model representations of biological and other macromolecules to compute various hydrodynamic parameters, such as the sedimentation and diffusion coefficients, relaxation time, and intrinsic viscosity, and small angle scattering curves that contribute to our understanding of molecular structure in solution. Knowledge of biological macromolecules' structure aids researchers in understanding their function as a path to disease prevention and therapeutics for conditions such as cancer, thrombosis, Alzheimer's disease and others. US-SOMO provides a convergence of experimental, computational, and modeling techniques, in which detailed molecular structure and properties are determined from data obtained in a range of experimental techniques that, by themselves, give incomplete information. Our goal in this work is to develop the infrastructure and user interfaces that will enable a wide range of scientists to carry out complicated experimental data analysis techniques on XSEDE. Our user community consists of biophysics and structural biology researchers. A recent search on PubMed reports 9,205 papers in the past decade referencing the techniques we support. We believe our software will provide these researchers with a convenient and unique framework to determine structure, thus advancing their research.
The computed hydrodynamic parameters and scattering curves are screened against experimental data, effectively pruning potential structures into equivalence classes. Experimental methods may include analytical ultracentrifugation, dynamic light scattering, small angle X-ray scattering, NMR, fluorescence spectroscopy, and others. One source of macromolecular models is X-ray crystallographic studies. A molecule's behavior in solution may not match that observed in the crystal form. Using computational techniques, an initial fixed model can be expanded into a search space utilizing high-temperature molecular dynamics approaches or stochastic methods such as Brownian dynamics. The number of structures produced can vary greatly, ranging from hundreds to tens of thousands or more. This introduces a number of cyberinfrastructure challenges. Computing hydrodynamic parameters and small angle scattering curves can be computationally intensive for each structure, and therefore cluster compute resources are essential for timely results. Input and output data sizes can vary greatly from less than 1 MB to 2 GB or more. Although the parallelization is trivial, along with data size variability there is a large range of compute sizes, ranging from one to potentially thousands of cores with compute times of minutes to hours.
In addition to the distributed computing infrastructure challenges, an important concern was how to allow a user to conveniently submit, monitor and retrieve results from within the C++/Qt GUI application while maintaining a method for authentication, approval and throttling of usage. Middleware supporting these design goals has been integrated into the application with assistance from the Open Gateway Computing Environments (OGCE) collaboration team. The approach was tested on various XSEDE clusters and local compute resources. This paper reviews current US-SOMO functionality and implementation with a focus on the newly deployed cluster integration.
Tech: Gordon: Design, Performance, and Experiences Deploying and Supporting a Data-Intensive Supercomputer
Abstract: The Gordon data-intensive supercomputer entered service in early 2012 as an allocable computing system in the NSF Extreme Science and Engineering Discovery Environment (XSEDE) program. Gordon has several innovative features that make it ideal for data-intensive computing, including: 1,024 dual-socket, 16-core, 64 GB compute nodes based on Intel's Sandy Bridge processor; 64 I/O nodes with an aggregate of 300 TB of high performance flash (SSD); large, virtual SMP "supernodes" of up to 2 TB DRAM; a dual-rail, QDR InfiniBand, 3D torus network based on commodity hardware and open source software; and a 100 GB/s Lustre-based parallel file system, with over 4 PB of disk space. In this paper we present the motivation, design, and performance of Gordon. We provide: low-level micro-benchmark results to demonstrate processor, memory, I/O, and network performance; standard HPC benchmarks; and performance on data-intensive applications to demonstrate Gordon's performance on typical workloads. We highlight the inherent risks in, and offer mitigation strategies for, deploying a data-intensive supercomputer like Gordon which embodies significant innovative technologies. Finally, we present our experiences thus far in supporting users and managing a system like Gordon.
The fifteen students in the first XSEDE Summer Immersion program will present brief summaries of the projects they are working on with XSEDE staff and researchers. We are looking for input from the students, supervisors, and XSEDE community on the projects and the program, and recommendations for improvements. This session is open to the entire conference. Come by and see what the students are working on!
EOT: Enhancing Chemistry Teaching and Learning through Cyberinfrastructure
Abstract: In this paper, we discuss strategies used in the Institute for Chemistry Literacy through Computational Science (ICLCS) that have impacted high school teachers and their students from small rural communities in Illinois. The overarching goal of the ICLCS is to infuse high school Chemistry curricula with computational models and simulations in order to increase the content knowledge of both teachers and their students. Schools in rural communities present issues that are closely related to their size and location, but these issues may resonate in larger communities as well. By bringing leading-edge technologies to classrooms and intensive professional development to teachers, not only have we impacted student achievement and teacher content knowledge in Chemistry, but we have also used technology to improve access and opportunity for this underserved audience. This paper gives an overview of two aspects of the program: the cloud-enabled environment designed to deliver a computational chemistry web service for education, and the virtual professional learning environment that serves as a community of practice.
Science: Exploiting HPC Resources for the 3D-Time Series Analysis of Caries Lesion Activity.
Abstract: We present a research framework to analyze 3D-time series caries lesion activity based on collections of SkyScan μ-CT images taken at different times during the dynamic caries process. Analyzing caries progression (or reversal) is data-driven and computationally demanding. It involves segmenting high-resolution μ-CT images, constructing 3D models suitable for interactive visualization, and analyzing 3D and 4D (3D + time) dental images. Our development exploits XSEDE's supercomputing, storage, and visualization resources to facilitate the knowledge discovery process. In this paper, we describe the required image processing algorithms and then discuss the parallelization of these methods to utilize XSEDE's high performance computing resources. We then present a workflow for visualization and analysis using ParaView. This workflow enables quantitative analysis as well as three-dimensional comparison of multiple temporal datasets from the longitudinal dental research studies. Such quantitative assessment and visualization can help us to understand and evaluate the underlying processes that arise from dental treatment, and therefore can have significant impact in the clinical decision-making process and caries diagnosis.
Software: Trinity RNA-Seq Assembler Performance Optimization
Abstract: RNA-sequencing is a technique to study RNA expression in biological material. It is quickly gaining popularity in the field of transcriptomics. Trinity is a software tool that was developed for efficient de novo reconstruction of transcriptomes from RNA-Seq data. In this paper we first conduct a performance study of Trinity and compare it to previously published data from 2011. We examine the runtime behavior of Trinity as a whole as well as its individual components and then optimize the most performance critical parts. We find that standard best practices for HPC applications can also be applied to Trinity, especially on systems with large amounts of memory. When combining best practices for HPC applications along with our specific performance optimization, we can decrease the runtime of Trinity by a factor of 3.9. This brings the runtime of Trinity in line with other de novo assemblers while maintaining superior quality.
Tech: A Tale of Two Systems: Flexibility of Usage of Kraken and Nautilus at the National Institute for Computational Sciences
Abstract: The National Institute for Computational Sciences (NICS) currently operates two computational resources for the eXtreme Science and Engineering Discovery Environment (XSEDE): Kraken, a 112,896-core Cray XT5 for general-purpose computation, and Nautilus, a 1,024-core SGI Altix UV 1000 for data analysis and visualization. We analyze a year's worth of accounting logs for Kraken and Nautilus to understand how users take advantage of these two systems and how analysis jobs differ from general HPC computation. We find that researchers take advantage of the flexibility offered by these systems, running a wide variety of jobs at many scales and using the full range of core counts and available memory for their jobs. The jobs on Nautilus tend to use less walltime and more memory per core than the jobs run on Kraken. Additionally, researchers are more likely to run interactive jobs on Nautilus than on Kraken. Small jobs experience a good quality of service on both systems. This information can be used for the management and allocation of time on existing HPC and analysis systems as well as for planning for deploying future HPC and analysis systems.
EOT: The Role of Evaluation in a Large-scale Multi-site Project
Abstract: The initial role of evaluation in a large-scale multi-site project is presented. The evaluation utilized an educative, values-engaged approach (EVEN) [4]. This paper also presents the evaluation questions and metrics used to structure an evaluation of this scale. The evaluation team has initiated the process of addressing the evaluation questions by including stakeholders in constructing a detailed evaluation matrix, conducting data collection, and regularly presenting formative information to project leads and managers to guide program improvement. This process as well as how this evaluation addresses key issues in evaluating STEM training, education, and outreach while contributing to advances in the field is also discussed.
Science: Transforming molecular biology research through extreme acceleration of AMBER molecular dynamics simulations: Sampling for the 99%.
Abstract: This talk will cover recent developments in the acceleration of molecular dynamics simulations using NVIDIA graphics processing units with the AMBER software package. In particular, it will focus on recent algorithmic improvements aimed at accelerating the rate at which phase space is sampled. A recent success has been the reproduction and extension of key results from the D. E. Shaw 1-millisecond Anton MD simulation of BPTI (Science, Vol. 330, no. 6002, pp. 341-346) with just 2.5 days of dihedral-boosted AMD sampling on a single GPU workstation (Pierce L, Walker R.C. et al. JCTC, 2012, in review). These results show that with careful algorithm design it is possible to obtain sampling of rare biologically relevant events that occur on the millisecond timescale using just a single $500 GTX580 graphics card and a desktop workstation. Additional developments highlighted will include the acceleration of AMBER MD simulations on graphics processing units in the cloud, including Amazon EC2 and Microsoft Azure based automated ensemble calculations, a new precision model optimized for the upcoming Kepler architecture (Walker R.C. et al, JCP, 2012, in prep), as well as approaches for running large-scale multi-dimensional GPU-accelerated replica exchange calculations on Keeneland and Blue Waters.
Software: Exploring Similarities Among Many Species Distributions
Abstract: Collecting species presence data and then building models to predict species distribution has been long practiced in the field of ecology for the purpose of improving our understanding of species relationships with each other and with the environment. Due to limitations of computing power as well as limited means of using modeling software on HPC facilities, past species distribution studies have been unable to fully explore diverse data sets. We build a system that can, for the first time to our knowledge, leverage HPC to support effective exploration of species similarities in distribution as well as their dependencies on common environmental conditions. Our system can also compute and reveal uncertainties in the modeling results enabling domain experts to make informed judgments about the data. Our work was motivated by and centered around data collection efforts within the Great Smoky Mountains National Park that date back to the 1940s. Our findings present new research opportunities in ecology and produce actionable field-work items for biodiversity management personnel to include in their planning of daily management activities.
Tech: Analyzing Throughput and Utilization on Trestles
Abstract: The Trestles system is targeted to modest-scale and gateway users, and is operated to enhance users' productivity by maintaining good turnaround time as well as offering other user-friendly features such as long run times and user reservations. However, the goal of maintaining good throughput competes with the goal of high system utilization. This paper analyzes one year of Trestles operations to characterize the empirical relationship between utilization and throughput, with the objectives of understanding their trade-off and informing allocations and scheduling policies to optimize this trade-off. There is considerable scatter in the correlation between utilization and throughput, as measured by expansion factor. There are periods of good throughput at both low and high utilizations, while there are other periods when throughput degrades significantly not only at high utilization but even at low utilization. However, throughput consistently degrades above ~90% utilization. User behavior clearly impacts the expansion factor metrics; the great majority of jobs with extreme expansion factors are associated with a very small fraction of users who either flood the queue with many jobs or request run times far in excess of actual run times. While the former is a user workflow choice, the latter clearly demonstrates the benefit for users to request run times that are well-matched to actual run times. Utilization and throughput metrics derived from XDMoD are compared for Trestles with two other XSEDE systems, Ranger and Kraken, with different sizes and allocation/scheduling policies. Both Ranger and Kraken have generally higher utilization and, not surprisingly, higher expansion factors than Trestles over the analysis period. As a result of this analysis, we intend to increase the target allocation fraction from the current 70% to ~75-80%, and strongly advise users to reasonably match requested run times to actual run times.
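For readers unfamiliar with the throughput metric used throughout this abstract, the expansion factor is commonly defined as

    \text{expansion factor} \;=\; \frac{t_{\mathrm{wait}} + t_{\mathrm{run}}}{t_{\mathrm{run}}},

so a value of 1 means a job started immediately and a value of 2 means it waited in the queue for as long as it ran.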
EOT: Longitudinal User and Usage Patterns in the XSEDE User Community
Abstract: The XSEDE user community is often assumed to be dominated by a (mostly) fixed set of users in a largely static pecking order. This assumption is based primarily on anecdotal experience but often the only quantitative data are the consistent patterns of overall resource consumption observed at many time scales. The XSEDE accounting system offers a unique opportunity to study consumption patterns over time by project teams and individuals. This analysis shows some tendency for larger-scale consumers to remain among the larger-scale consumers; however, the XSEDE user community is much more dynamic than often assumed. In addition, small-scale user behavior over time differs distinctly from large-scale user behavior, with the “long tail” more often comprised of short-lived projects. We define a number of metrics for describing these patterns and consider their implications for the outreach activities and user support within XSEDE and other HPC environments.
Science: Invited Talk: Multiscale simulations of blood-flow: from a platelet to an artery
Abstract: We review our recent advances in multiscale modeling of blood flow, including blood rheology. We focus on the objectives, methods, computational complexity, and overall methodology for simulations at the level of the glycocalyx (<1 micron), blood cells (2-8 microns), and up to larger arteries (O(cm)). The main findings of our research and future directions are summarized. We discuss the role of high-performance computers for multiscale modeling and present new parallel visualization tools. We also present results of simulations performed with our coupled continuum-atomistic solver on up to 300K cores, modeling initial stages of blood clot formation in a brain aneurysm.
Software: The Prickly Pear Archive: A Portable Hypermedia for Scholarly Publication
Abstract: An executable paper is a hypermedia for publishing, reviewing, and reading scholarly papers which include a complete HPC software development or scientific code. A hypermedia is an integrated interface to multimedia, including text, figures, video, and executables, on a subject of interest. Results within the executable paper include numeric output, graphs, charts, tables, equations, and the underlying codes which generated such results. These results are dynamically regenerated and included in the paper upon recompilation and re-execution of the code. This enables a scientifically enriched environment which functions not only as a journal but as a laboratory in itself, in which readers and reviewers may interact with and validate the results.
The Prickly Pear Archive (PPA) is such a system [1]. One distinguishing feature of the PPA is the inclusion of an underlying component-based simulation framework, Cactus [3], which simplifies the process of composing, compiling, and executing simulation codes. Code creation is simplified using common bits of infrastructure; each paper adds to the functionality of the framework. Other distinguishing features include the portability and reproducibility of the archive, which allow researchers to re-create the software environment in which the simulation code was created.
A PPA production system hosted on HPC resources (e.g., an XSEDE machine) unifies the computational scientific process with the publication process. A researcher may use the production archive to test simulations; upon arriving at a scientifically meaningful result, the user may then incorporate the result in an executable paper on the very same resource on which the simulation was conducted. Housed within a virtual machine, the PPA allows multiple accounts within the same production archive, enabling users across campuses to bridge their efforts in developing scientific codes.
The executable paper incorporates into its markup code references to manipulable parameters, including symbolic equations, which the simulation executable accepts as input. An interface to this markup code enables authors, readers, and reviewers to control these parameters and re-generate the paper, potentially arriving at a novel result. Thus, the executable paper functions not only as a publication in itself, but also as an interactive laboratory from which novel science may be extracted. One can imagine the executable paper environment, encapsulated and safeguarded by a virtual machine, as a portable laboratory in which the computational scientist arrived at the result. The notion of an executable paper is particularly useful in the context of computer and computational science, where the code underlying a (scientific) software development is of interest to the wider development community.
Why are executable papers to be preferred over traditional papers? As Gavish et al. have observed [2], the current workflow in the life cycle of a traditional paper may be summarized in the following five steps:
1. Store a private copy of the original data to be processed on the local machine.
2. Write a script or computer program, with hard-coded tuning parameters, to load the data from the local file, analyze it, and output selected results (a few graphical figures, tables, etc.) to the screen or to local files.
3. Withhold the source code that was executed, and the copy of the original data that was used, and keep them in the local file system in a directory called e.g. “code-final.”
4. Copy and paste the results into a publication manuscript containing a textual description of the computational process that presumably took place.
5. Submit the manuscript for publication as a package containing the word processor source file and the graphical figure files.
Three issues with this workflow lie with the communication, validation, and reproducibility of the method used to arrive at the result. The novel execution-based publication paradigm for computer and computational science has a few advantages over the traditional paper-based paradigm which allow it to overcome these issues.
The first advantage is the enhanced explanatory power offered by multimedia such as graphs, charts, and numeric values which are generated from manipulable parameters. The reader and reviewer may view the parameters to see how such figures were arrived at. They may also vary the parameters, re-execute the code, and witness changes to the media themselves. This level of interactivity offered by the executable archive allows the audience to understand the simulation by allowing them to conduct it first-hand.
The second advantage is the increased level of validation and peer review of the experimental method, owing to the inclusion of the code and logs used to arrive at a result, which is essential for computational science to self-correct. Though a computational scientist may reproduce a simulation code according to a natural-language or pseudocode description, implementation differences (minutiae such as the order of nested loops, the order of computations, and the language constructs used) may have a dramatic impact on the result. Were these details subject to review, inefficiencies and errors could be more easily corrected. Fortunately, since the experimental apparatus in computational science is digitized, it is potentially easier for it to achieve this level of validation relative to paper-based media.
The third is reproducibility, which supports efforts to modify or extend the computational method. In other sciences, the experimental method is fleshed out in the traditional paper in such detail that anyone with the proper equipment may repeat it and expect the same results. Perhaps the computational scientist may be able to supply pseudocode in the traditional paper for others to implement. However, reproducing codes (especially simulation codes) from pseudocode is a laborious and time-consuming process. Even when such reproduction is completed, gaining access to the proper equipment is often as problematic for the computational scientist as for any other, as it requires allocations on high-security supercomputers and time invested in building the software dependencies necessary to run the code.
Given the digital nature of computational science, it seems that communicating, validating, and reproducing experimental work should be easier. To facilitate the process of extending or further investigating a computational model, ideally modifiable code would come with the paper, along with the process of building and executing it (in the form of a script) and the total environment in which the code was written, built, and executed. In addition, the paper results would be dynamically generated upon recompilation and execution following any modifications to the code.
The latter scenario may seem like a distant possibility, but the production of many executable archives is already underway. One such archive is the Prickly Pear Archive, an executable paper journal which uses the Cactus Computational Toolkit as its underlying framework. The award-winning Cactus framework is used for general problem-solving on regular meshes, and is used across disciplines including numerical relativity, astrophysics, and coastal science.
Figure 1: The PPA workflow integrates several components through its PHP interface, including Cactus, Kranc, SimFactory and LaTeX.
In addition to Cactus, the PPA interfaces with several other components, including: SimFactory, a set of utilities for building and running Cactus applications and managing Cactus jobs; Kranc, a script for creating Cactus thorns from equations rendered in a Mathematica-style language; the Piraha parsing expression grammar for parsing parameter and configuration files; the LaTeX markup language; and an encapsulating virtual machine which enables portability and reproducibility.
Tech: Invited Talk: UNICORE 6 in XSEDE
Abstract: UNICORE (Uniform Interface to Computing Resources) offers a ready-to-run Grid system including client and server software. UNICORE makes distributed computing and data resources available in a seamless and secure way in intranets and the internet. UNICORE 6 is deployed at PRACE sites all over Europe, as well as in the D-Grid (Deutsches Grid). UNICORE 6 implementations have been demonstrated to work during the XSEDE proposal process, and XSEDE is currently working toward deployment of UNICORE through the XSEDE Software Development and Integration process. An overview of UNICORE for XSEDE and the current status of the development activities and deployment will be discussed.
Speaker - James Gutowski, Dell
Title - Stampede: Enabling more science with XSEDE
Abstract - When deployed in 2013, Stampede, built by TACC in partnership with Dell and Intel, will be the most powerful system in the NSF's eXtreme Digital (XD) program, and will support the nation's scientists in addressing the most challenging scientific and engineering problems over four years. Stampede will have a peak performance of 10 petaflops, with 272 terabytes of total memory, and 14 petabytes of disk storage. This presentation will provide an overview of Stampede and the innovative technologies that will enable more science for XSEDE users.
Speaker - Thomas Eickermann, Juelich
Title - PRACE - The European HPC Research Infrastructure
Abstract - PRACE - the Partnership for Advanced Computing in Europe is the European HPC Research Infrastructure.
Since its creation in 2010, PRACE has grown to 24 member states. It provides access to a set of high-end Tier-0 HPC systems in Europe for the European research communities and offers supporting services such as application enabling and training. The presentation will give an overview of the mission, status, and achievements as well as future plans of PRACE.
EOT: Computational Science Certificates for the Current Workforce: Lessons Learned
Abstract: One of the keys to the future competitiveness of U.S. industry is the integration of modeling and simulation into the development, design, and manufacturing process. A related challenge is to retrain the current workforce in the use of computational modeling to enable its effective use in the workplace. We review our implementation of a new, computational science certificate program aimed at the current workforce in Ohio. The structure of the current program is discussed along with the problems associated with meeting the educational needs of this population.
Science: Optimization of Density Functional Tight-Binding and Classical Reactive Molecular Dynamics for High-Throughput Simulations of Carbon Materials
Abstract: Carbon materials and nanostructures (fullerenes, nanotubes) are promising building blocks of nanotechnology. Potential applications include optical and electronic devices, sensors, and nano-scale machines. The multiscale character of processes related to fabrication and physics of such materials requires using a combination of different approaches such as (a) classical dynamics, (b) direct Born-Oppenheimer dynamics, (c) quantum dynamics for electrons, and (d) quantum dynamics for selected nuclei. We describe our effort on optimization of classical reactive molecular dynamics and the density-functional tight-binding method, which is a core method in our direct and quantum dynamics studies. We find that optimization is critical for efficient use of high-end machines. Choosing the optimal configuration for the numerical library and compilers can result in a four-fold speedup of direct dynamics as compared with the default programming environment. The integration algorithm and parallelization approach must also be tailored for the computing environment. The efficacy of possible choices is discussed.
Science: Three-dimensional Simulations of Geometrically Complex Subduction with Large Viscosity Variations
Abstract: The incorporation of geologic realism into numerical models of subduction is becoming increasingly necessary as observational and experimental constraints indicate plate boundaries are inherently three-dimensional (3D) in nature and contain large viscosity variations. However, large viscosity variations occurring over short distances pose a challenge for computational codes, and models with complex 3D geometries require substantially greater numbers of elements, increasing the computational demands. We modified a community mantle convection code, CitcomCU, to model realistic subduction zones that use an arbitrarily shaped 3D plate boundary interface and incorporate the effects of a strain-rate dependent viscosity based on an experimentally derived flow law for olivine aggregates. Tests of this implementation on 3D models with a simple subduction zone geometry indicate that limiting the overall viscosity range in the model, as well as limiting the viscosity jump across an element, improves model runtime and convergence behavior, consistent with what has been shown previously. In addition, the choice of interpolation method and averaging scheme used to transfer the viscosity structure to the different levels in the multigrid solver can significantly improve model performance. These optimizations can improve model runtime by over 20%. 3D models of a subduction zone with a complex plate boundary geometry were then constructed, containing over 100 million finite element nodes with a local resolution of up to 2.35 km, and run on the TeraGrid. These complex 3D models representative of the Alaska subduction zone-transform plate boundary contain viscosity variations of up to seven orders of magnitude. The optimizations in solver parameters determined from the simple 3D models of subduction applied to the much larger and more complex models of an actual subduction zone improved model convergence behavior and reduced runtimes by on the order of 25%. One scientific result from 3D models of Alaska is that a laterally variable mantle viscosity emerges in the mantle as a consequence of variations in the flow field, with localized velocities of greater than 80 cm/yr occurring close to the subduction zone where the negative buoyancy of the slab drives the flow. These results are a significant departure from the paradigm of two-dimensional (2D) models of subduction where the slab velocity is often fixed to surface plate motion. While the solver parameter optimization can improve model performance, the results also demonstrate the need for new solvers to keep pace with the demands for increasingly complex numerical simulations in mantle convection.
Software: Invited Talk: Building your personal HTC Science Gateway. Miron Livny
EOT: A Blended, Multimodal Access eTextBook in Computational Physics
Abstract: A complete eTextBook with multiple executable elements has been created, and Web-based versions have been placed in the National Science Digital Library. The book's file formats and executable elements are chosen to be platform independent, highly usable, and free. While future technologies and operating systems promise improved executable books, the created eTextBook highlights some of the features possible with existing technologies. The prototype eTextBook includes text, computational laboratories, demonstrations, and video-based lecture modules.
Science: A toolkit for the analysis and visualization of free volume in materials
Abstract: A suite of tools is presented that enables analysis of free volume in terms of accepted standard metrics. Through the use of standard UNIX utilities, the tools can work with the output of many common simulation packages. They also include utilities for rapid development of visual output not available in other packages, and they are extensible and modifiable for other types of spatial data.
Science: Multiscale Modeling of High Explosives for Transportation Accidents
Abstract: The development of a reaction model to simulate the accidental detonation of a large array of seismic boosters in a semi-truck subject to fire is considered. To test this model, large-scale simulations of explosions and detonations were performed by leveraging the massively parallel capabilities of the Uintah Computational Framework and XSEDE computational resources. Computed stress profiles in bulk-scale explosive materials were validated using compaction simulations of hundred-micron-scale particles and found to compare favorably with experimental data. A validation study of reaction models for deflagration and detonation showed that computational grid cell sizes up to 10 mm could be used without loss of fidelity. The Uintah Computational Framework shows linear strong scaling up to 180K cores, which, combined with coarse resolution and validated models, will now enable simulations of semi-truck-scale transportation accidents for the first time.
Software: Roadmaps, Not Blueprints: Paving the Way to Science Gateway Success
Abstract: As science today grows ever more digital, it poses exciting challenges and opportunities for researchers. The existence of science gateways—and the advanced cyberinfrastructure (CI) tools and resources behind the accessible web interfaces—can significantly improve the productivity of researchers facing the most difficult challenges, but designing the most effective tools requires an investment of time, effort, and money. Because not all gateways can be funded in the long term, it is important to identify the characteristics of successful gateways and make early efforts to incorporate whatever strategies will set up new gateways for success. Our research seeks to identify why some gateway projects change the way science is conducted in a given community while other gateways do not. Through a series of five full-day, iterative, multidisciplinary focus groups, we have gathered input and insights from sixty-six participants representing a diverse array of gateways and portals, funding organizations, research institutions, and industrial backgrounds. In this paper, we describe the key factors for success as well as the situational enablers of these factors. These findings are grouped into five main topical areas—the builders, the users, the roadmaps, the gateways, and the support systems—but we find that many of these factors and enablers are intertwined and inseparable, and there is no easy prescription for success.
EOT: FutureGrid Education: Using Case Studies to Develop Curriculum for Communicating Parallel and Distributed Computing Concepts
Abstract: The shift to parallel computing---including multi-core computer architectures, cloud distributed computing, and general-purpose GPU programming---leads to fundamental changes in the design of software and systems. As a result, learning parallel computing techniques so that software can take advantage of the shift toward parallelism is critically important. To this end, FutureGrid, an experimental testbed for cloud, grid, and high performance computing, provides a resource for anyone to find, share, and discuss modular teaching materials and computational platform support.
FutureGrid advances education and training in distributed computing for organizations with less diverse computational resources; it accomplishes this through the development of instructional resources, including preconfigured environments that provide students with sandboxed virtual clusters. These can be used for either self-learning or teaching courses in parallel, cloud, and grid computing. FutureGrid's education and outreach initiatives allow computational frameworks, such as Google's proprietary MapReduce and the open-source Apache Hadoop, to be applied to datasets of web scale. The availability of cloud computing platforms offers a variety of programming models, such as Hadoop and Twister, an iterative MapReduce framework, making it feasible for anyone to explore the data deluge without extensive local or personal investment in cluster computing.
FutureGrid provides users with community-driven teaching modules, which provide conceptual principles of parallelism and hands-on practice with parallel computing, in self-contained units, which can be inserted in various environments in multiple curricular contexts. These modules offer an incremental approach to getting interested individuals the exposure to parallelism they will need to become participants in the concurrency evolution.
This paper presents a series of case studies of experiences in parallel and distributed computing education using the FutureGrid testbed. Building on previous experiences from courses, workshops, and summer schools associated with FutureGrid, we present a viable approach to developing a curriculum by leveraging collaboration with partner organizations. Our approach stems from the idea that anyone interested in learning parallel and distributed computing can do so with minimal assistance from a domain expert, and it addresses the educational goals and objectives needed to meet many of the challenges that lie ahead in the discipline. We validate our approach to developing a community-driven curriculum by describing use cases and their experiences with the teaching modules. Examples include an NCSA summer school for big data science, a workshop for faculty members of historically black colleges and universities, and courses in distributed and cloud computing at universities such as Indiana University, the University of Florida, Louisiana State University, and the University of Piemonte Orientale - A. Avogadro.
Science: Extending Parallel Scalability of LAMMPS and Multiscale Reactive Molecular Simulations
Abstract: Conducting molecular dynamics (MD) simulations involving chemical reactions in large-scale condensed phase systems (liquids, proteins, fuel cells, etc.) is a computationally prohibitive task, even though many new ab initio-based methodologies (e.g., AIMD, QM/MM) have been developed. Chemical processes occur over a range of length scales and are coupled to slow (long time scale) system motions, which makes adequate sampling a challenge. Multistate methodologies, such as the multistate empirical valence bond (MS-EVB) method, which are based on effective force fields, are more computationally efficient and enable the simulation of chemical reactions over the time and length scales necessary to properly converge statistical properties.
The typical parallel scaling bottleneck in both reactive and nonreactive all-atom MD simulations is the accurate treatment of long-range electrostatic interactions. Currently, Ewald-type algorithms rely on three-dimensional Fast Fourier Transform (3D-FFT) calculations. The parallel scaling of these 3D-FFT calculations can be severely degraded at higher processor counts due to necessary MPI all-to-all communication. This poses an even bigger problem in MS-EVB calculations, since the electrostatics, and hence the 3D-FFT, must be evaluated many times during a single time step.
Due to the limited scaling of the 3D-FFT in MD simulations, the traditional single-program-multiple-data (SPMD) parallelism model is only able to utilize several hundred CPU cores, even for very large systems. However, with a proper implementation of a multi-program (MP) model, large systems can scale to thousands of CPU cores. This paper will discuss recent efforts in collaboration with XSEDE advanced support to implement the MS-EVB model in the scalable LAMMPS MD code, and to further improve parallel scaling by implementing MP parallelization algorithms in LAMMPS. These algorithms improve parallel scaling in both the standard LAMMPS code and LAMMPS with MS-EVB, thus facilitating the efficient simulation of large-scale condensed phase systems, which include the ability to model chemical reactions.
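To make the multi-program (MP) idea concrete, the sketch below shows one generic way MPI ranks can be split so that the communication-heavy 3D-FFT work is confined to a small subset of processes while the remaining ranks handle real-space work. It is a minimal mpi4py illustration with hypothetical names and an assumed 1-in-4 partition ratio, not the LAMMPS/MS-EVB implementation described above.

```python
# Illustrative sketch only: a multi-program style rank partition.
# Run with at least two MPI ranks; the ratio below is an assumption.
from mpi4py import MPI

world = MPI.COMM_WORLD
rank, size = world.Get_rank(), world.Get_size()

KSPACE_STRIDE = 4                       # hypothetical: 1 in 4 ranks does k-space work
color = 0 if rank % KSPACE_STRIDE == 0 else 1
part = world.Split(color=color, key=rank)

if color == 0:
    # k-space partition: the all-to-all communication of the 3D-FFT stays
    # inside this smaller communicator rather than spanning every rank.
    pass
else:
    # real-space partition: short-range forces, neighbor lists, integration.
    pass

# An inter-communicator lets the two partitions exchange forces and energies
# once per time step; the leader ranks below follow from the coloring above.
remote_leader = 1 if color == 0 else 0  # world rank of the other partition's leader
inter = part.Create_intercomm(0, world, remote_leader, tag=0)
```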
Science: Multiscale simulations of Langmuir cells and submesoscale eddies using XSEDE resources
Abstract: A proper treatment of upper ocean mixing is an essential part of accurate climate modeling. This problem is difficult because the upper ocean is home to many competing processes. Vertical turbulent mixing acts to unstratify the water column, while lateral submesoscale eddies attempt to stratify the column. Langmuir turbulence, which often dominates the vertical mixing, is driven by an interaction of the wind stress and surface wave (Stokes) drift, while the submesoscale eddies are driven by lateral density and velocity changes. Taken together, these processes span a large range of spatial and temporal scales. They have been studied separately via theory and modeling. It has been demonstrated that the way these scales are represented in climate models has a nontrivial impact on the global climate system. The largest impact is on upper ocean processes, which filter air-sea interactions. This interaction is especially interesting, because it is the interface between nonhydrostatic and hydrostatic, quasigeostrophic and ageostrophic, and small-scale and large-scale ocean dynamics. Previous studies have resulted in parameterizations for Langmuir turbulence and submesoscale fluxes, but these parameterizations assume that there is no interaction between these important processes. In this work we have utilized a large XSEDE allocation (9 million SUs) to perform multiscale simulations that encompass the Langmuir Scale (O(10-100m)) and the submesoscale eddies (O(1-10km)). One simulation includes a Stokes drift, and hence Langmuir turbulence, while the other does not.
To adequately represent such disparate spatial scales is a challenge in numerous regards. Numerical prediction algorithms must balance efficiency, scalability, and accuracy. These simulations also present a large challenge for data storage and transfer. However, the results of these simulations will influence climate modeling for decades.
Software: Offline Parallel Debugging: A Case Study Report
Abstract: Debugging is difficult; debugging parallel programs at large scale is particularly so. Interactive debugging tools continue to improve in ways that mitigate the difficulties, and the best such systems will continue to be mission critical. Such tools have their limitations, however. They are often unable to operate across many thousands of cores. Even when they do function correctly, mining and analyzing the right data from the results of thousands of processes can be daunting, and it is not easy to design interfaces that are useful and effective at large scale. One additional challenge goes beyond the functionality of the tools themselves. Leadership-class systems typically operate in a batch mode intended to maximize utilization and throughput. It is generally unrealistic to expect to schedule a large block of time to operate interactively across a substantial fraction of such a system. Even when large-scale interactive sessions are possible, they can be expensive, and can impact system access for others.
Given these challenges, there is potential value in research into other non-traditional debugging models. Here we describe our progress with one such approach: offline debugging. We first describe the concept of offline debugging in general terms, then provide an overview of GDBase, a prototype offline debugger. We describe proof-of-concept demonstrations of GDBase, focusing on the first attempts to deploy it in large-scale debugging efforts of major research codes, and conclude with lessons learned and recommendations.
BOF: "How Did We Get Here?" Gordon Design and Planning
Appro will hold a drawing for a $200 amazon.com gift card at this BOF. Details available at the event.
Steve Lyness and Shawn Strande
Abstract -- Learn how Appro, Intel and the San Diego Supercomputer Center (SDSC) at the University of California collaborated on a major design win with a skip-generation architecture, the "Gordon" supercomputer, three years in advance of the system being deployed. Learn how this early preparation resulted in a grant from the National Science Foundation (NSF) that allowed the system to be built and made available as a powerful supercomputing resource dedicated to solving critical science and societal problems using advanced HPC technology. Discover what is behind the Gordon design innovations: the processors, flash memory, interconnect network and the entire system configuration. Explore the ideas and planning behind how industry trends, partnerships, early access to future technology roadmaps and system configuration adjustments were used to extrapolate to the time the system would actually be built. Learn how reliability, availability, manageability and system configuration compatibility were essential for this data-intensive supercomputer to deliver over 200 TFlops of peak performance and up to 35M IOPS from 300TB of solid state storage. Also, learn how Gordon's scientific applications benefit from fast interaction with and manipulation of large volumes of structured data, and how Gordon is helping the HPC research community by being available through an open-access national grid.
Abstract: “Big Data” is a major force in the current scientific environment. In March 2012, President Barack Obama released the details of the administration’s big data strategy. Dr. John P. Holdren, Assistant to the President and Director of the White House Office of Science and Technology Policy, announced, “the initiative we are launching today promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security.”
The datasets used by XSEDE researchers are getting larger as well. In the 2012 survey of researchers using RDAV’s Nautilus system at NICS, over two-thirds of respondents indicated that their data will grow in size over the next year. As more powerful resources, such as the upcoming Stampede system at TACC, become available to XSEDE researchers, simulation sizes will continue to grow. Furthermore, the XSEDE Campus Champions represent college and university researchers who are confronted by large datasets.
There are many unsolved problems associated with analyzing, moving, storing, and understanding large scale data sets. At this BoF, researchers and XSEDE staff can discuss their challenges and successes with working with large datasets as well as the hardware, software, and support resources that XSEDE service providers can offer to researchers working with big data.
BOF: Hosting Cloud, HPC and Grid Educational Activities on FutureGrid
Abstract: FutureGrid is an XSEDE resource which has, at the core of its educational mission, the ability to create consistent, controlled and repeatable educational environments in all areas of computer science related to parallel, large‐scale or distributed computing and networking as well as the availability, repeatability, and open sharing of electronic educational materials. FutureGrid has deployed a distributed platform where educators and students can create and access such customized environments for hands-on activities on cloud, HPC and grid computing. This Birds-of-a-Feather session will provide a forum for users to get informed about the opportunities available to use FutureGrid in education, present user stories describing different ways in which it has been used in classes, and encourage discussion from the participants on features that they would like to see in this infrastructure to support their educational needs.
The general format of the BOF includes brief overview presentations to provide context, followed by discussions with attendees, focused on their needs in education and how FutureGrid can help, facilitated by the BOF organizers. The primary target audience for this BOF session draws from attendees of the conference’s typically well-attended EOT track, but the BOF will also be of interest to a broader set of XSEDE users interested in use of a flexible platform for education and training.
Overview presentations will describe FutureGrid capabilities supporting educational activities – including the user portal, creating classes and user accounts, available tutorials and community materials, and cloud/HPC/Grid platforms available in FutureGrid, such as Nimbus, OpenStack, and Eucalyptus. Given the increased interest in the use of cloud computing in educational activities, the presentations will describe, in particular, FutureGrid support for user-customized virtual machine appliances which integrate pre-configured software such that educational environments can be easily created, customized, shared among users, and deployed on FutureGrid’s cloud resources, as well as support for users to collaborate on the development of curriculum for classes using FutureGrid.
Abstract: Eclipse is a widely used, open source integrated development environment that includes support for C, C++, Fortran, and Python. The Parallel Tools Platform (PTP) extends Eclipse to support development on high performance computers. PTP allows the user to run Eclipse on her laptop, while the code is compiled, run, debugged, and profiled on a remote HPC system. PTP provides development assistance for MPI, OpenMP, and UPC; it allows users to submit jobs to the remote batch system and monitor the job queue; and it provides a visual parallel debugger.
In this paper, we will describe the capabilities we have added to PTP to support XSEDE resources. These capabilities include submission and monitoring of jobs on systems running Sun/Oracle Grid Engine, support for GSI authentication and MyProxy logon, support for environment modules, and integration with compilers from Cray and PGI. We will also describe ongoing work and directions for future collaboration, including OpenACC support and parallel debugger integration.
BOF: Stampede Alert! Come Contribute or Get Outta the Way
Abstract: Stampede, the next big HPC system in the XSEDE program, will go into production in January 2013. Stampede will be a tremendously powerful computational platform, leveraging Dell nodes containing Intel Sandy Bridge processors and forthcoming MIC coprocessors to provide 10PF peak performance. Stampede will also have tremendous memory, disk, and visualization capabilities, a set of large shared memory nodes, software that enables high throughput computing, excellent interconnect latency and bandwidth, a rich set of software and services, and outreach activities, including campus bridging efforts to help other sites deploy MIC-based clusters. Most importantly, the Stampede project is designed to support hundreds of diverse science applications and requirements spanning domains and usage models. We invite you to come learn about the system and project, and to provide your suggestions for how we can deliver the most productive system and services for the open science community.
BOF: XSEDE: Review and Directions After Year One
Abstract: A BOF of XSEDE users will be held to discuss progress that has been made and proposed future directions.
Enjoy light refreshments, coffee, tea and water prior to starting your day.
Speaker - Gayatri Buragohain, Feminist Approach to Technology
Title - Women, technology and feminism - reflections from India.
Abstract - The gender disparity in technical fields has been a concern throughout the world in recent times. Drawing on inputs from eminent feminist activists and academicians in India who participated in a consultation on "Women and Technology" organized by Feminist Approach to Technology (FAT) last year, I will provide a feminist perspective on the need to mend this gender gap, one that looks beyond women's right to education and fair employment. I will compare the status of women's participation in technical fields in India and the US to explain the similarities and differences between the challenges faced in the two countries. Lastly, I will share some insights from the work being done in India to address this gender gap, including the work of my organization.
Speaker - Jim Kinter, Center for Ocean-Land-Atmosphere Studies (COLA) and George Mason University
Title - Benefits and Challenges of High Spatial Resolution in Climate Models
Abstract - In three separate projects, the sensitivity of climate simulations to increasing spatial resolution was explored. This talk will summarize some of the benefits afforded by increasing resolution, as well as the challenges associated with large computations. In 2009-2010, the convergence of the outcomes from the World Modeling Summit and the windfall availability of a dedicated supercomputing resource at the National Institute for Computational Sciences (NICS, an XSEDE partner) enabled a large international collaborative project called Athena to be undertaken. The objective of the project was to evaluate the value of dramatically increased spatial resolution in climate models, specifically with regard to changes in simulation fidelity and differences in projected climate change. The Athena team, composed of investigators from COLA, the European Centre for Medium-range Weather Forecasts, the University of Reading (U.K.), the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) and the University of Tokyo, in partnership with the computational science support team at NICS, confirmed that several important features of atmospheric circulation and precipitation are significantly better simulated when mesoscales are more accurately represented. The project also exposed a number of tensions that may be viewed as either problems or opportunities, including insight into the challenges of handling a petabyte of data in a single project, with some prospects for the coming "exaflood".
In a separate project, a team of researchers from COLA, the University of Miami, the National Center for Atmospheric Research (NCAR) and the University of California at Berkeley, funded under the National Science Foundation PetaApps program and provisioned with supercomputing resources on Kraken, explored the roles of ocean eddies in simulations of climate. The possibility that noise in the climate system has an impact on predictability was specifically explored using a novel technique called "interactive ensemble" modeling. The volume of data generated in the project was very large and continues to be a valuable resource for understanding how ocean eddies can impact various features of climate variability.
In a third project, researchers from COLA and CSU, funded through the Center for Multiscale Modeling of Atmospheric Processes, an NSF Science and Technology Center, applied the novel technique of embedding cloud-resolving models in global climate model gridboxes, sometimes called super-parameterization, to a fully coupled global climate model. While the intent was to better represent very short time-scale processes associated with convective clouds, the representation of tropical variability on time scales of months and seasons to years was significantly improved. The computational challenges posed by accurately representing clouds in global models for climate time-scale simulations are described.
All three projects, while separately funded and composed of different multi-institutional teams of investigators, highlight the benefits and challenges of bringing high-end computing facilities - with their necessary complement of software, networks, and visualization tools - to bear on transformational computational science problems, as envisioned in the NSF CIF21 initiative.
Student Programming Contest (Room #1)
Student Programming Contest (Room #2)
EOT: Cloud-enabling Biological Simulations for Scalable and Sustainable Access: An Experience Report
Abstract: We present our experiences with cloud-enabling an evolutionary genetics learning environment to achieve sustainability and scalability. The project, called Pop!World, features three major levels: (i) the Gateway module, catering to K-12 students, (ii) the Discovery module for undergraduates, and (iii) the Research module for advanced learners and researchers. The Discovery module of Pop!World is currently in use in the introductory Biological Sciences course at UB (BIO 200). The project, which began as the design and development of a prototype tool for learning and teaching, soon faced two major issues: scalability and sustainability. Scalability in our case is about the ability to serve thousands of users at a reasonable quality of service. Sustainability is about accessibility and availability beyond the classroom. Learners are often introduced to useful tools and environments during their enrollment in a course. Yet continued access to the tools beyond the duration of the course is critical for sustaining the learning that happened during the course and for enabling experimentation, discovery, and application of the knowledge acquired. Therefore, we used cloud cyberinfrastructure to successfully address the dual issues of scalability and sustainability. In this paper we discuss the details of the cloud deployment of Pop!World and our experiences in using it in an educational setting. The tool is currently deployed on Google App Engine (GAE).
Significance to XSEDE: We expect the learning model we have developed for evolutionary biology simulations can be adapted to disseminate the capabilities of XSEDE and to serve as an educational and training model for XSEDE.
Science: Comparing the performance of a group detection algorithm in serial and parallel processing environments
Abstract: Developing an algorithm for group identification from a collection of individuals without grouping data has been receiving significant attention because of the need for increased understanding of groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm to detect groups over time. The group detection algorithm was primarily developed for a serial processing environment and later modified to allow for parallel processing on Gordon. For a collection of data representing 192 days of game play (approximately 140 gigabytes of log data), the computation required 270 minutes for the major step of the analysis when running on a single processor. The same computation required 22 minutes when running on Gordon with 16 processors. The provision of massive compute nodes and the rich shared memory environment on Gordon improved the performance of our analysis by a factor of 12. Besides demonstrating the potential to save time and effort, this study also highlights some lessons learned in transforming a serial detection algorithm for parallel environments.
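As a generic illustration of the kind of restructuring involved, the sketch below spreads a per-day detection pass over a pool of 16 worker processes. The function body, file names, and log format are hypothetical stand-ins, not the authors' algorithm.

```python
# Illustrative sketch only: parallelizing a per-day detection pass with a
# process pool. detect_groups, the log paths, and the tab-separated log
# format are hypothetical; real co-presence logic would be far richer.
from multiprocessing import Pool
from collections import Counter

def detect_groups(day_log_path):
    """Read one day of space/time/task/behavior records and return a rough
    tally of co-located players as a stand-in for group detection."""
    groups = Counter()
    with open(day_log_path) as fh:
        for line in fh:
            player, zone, timestamp = line.rstrip("\n").split("\t")[:3]
            groups[zone] += 1
    return day_log_path, groups

if __name__ == "__main__":
    day_logs = [f"logs/day_{d:03d}.tsv" for d in range(192)]   # hypothetical paths
    with Pool(processes=16) as pool:                            # mirrors the 16-core run
        daily_results = pool.map(detect_groups, day_logs)
    # A final serial pass would then link per-day groups across days.
```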
Software: The CIPRES Science Gateway: Enabling High-Impact Science for Phylogenetics Researchers with Limited Resources
Abstract: The CIPRES Science Gateway (CSG) provides browser-based access to computationally demanding phylogenetic codes run on large HPC resources. Since its release in December 2009, there has been sustained, near-linear growth in the rate of CSG use, both in the number of users submitting jobs each month and in the number of jobs submitted. The average amount of computational time used per month by the CSG increased more than 5-fold over that period. As of April 2012, more than 4,000 unique users have run parallel tree inference jobs on TeraGrid/XSEDE resources using the CSG. The steady growth in resource use suggests that the CSG is meeting an important need for computational resources in the Systematics/Evolutionary Biology community. To ensure that XSEDE resources accessed through the CSG are used effectively, policies for resource consumption were developed, and an advanced set of management tools was implemented. Studies of usage trends show that these new management tools helped in distributing XSEDE resources across a large user population that has low-to-moderate computational needs. In the last quarter of 2012, 29% of all active XSEDE users accessed computational resources through the CSG, while the analyses conducted by these users accounted for 0.7% of all allocatable XSEDE computational resources. User survey results showed that the easy access to XSEDE/TeraGrid resources through the CSG had a critical and measurable scientific impact: at least 300 scholarly publications spanning all major groups within the Tree of Life have been enabled by the CSG since 2009. The same users reported that 82% of these publications would not have been possible without access to computational resources available through the CSG. The results indicate that the CSG is a critical and cost-effective enabler of science for phylogenetic researchers with limited resources.
Tech: Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System
Abstract: The Uintah Computational Framework was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids on large-scale, long-running, data-intensive problems. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids, together with a novel asynchronous task-based approach with fully automated load balancing. Uintah demonstrates excellent weak and strong scalability at full machine capacity on XSEDE resources such as Ranger and Kraken, and through the use of a hybrid memory approach based on a combination of MPI and Pthreads, Uintah now runs on up to 262k cores on the DOE Jaguar system. In order to extend Uintah to heterogeneous systems, with ever-increasing CPU core counts and additional on-node GPUs, a new dynamic CPU-GPU task scheduler is designed and evaluated in this study. This new scheduler enables Uintah to fully exploit these architectures with support for asynchronous, out-of-order scheduling of both CPU and GPU computational tasks. A new runtime system has also been implemented with an added multi-stage queuing architecture for efficient scheduling of CPU and GPU tasks. This new runtime system automatically handles the details of asynchronous memory copies to and from the GPU and introduces a novel method of pre-fetching and preparing GPU memory prior to GPU task execution. In this study this new design is examined in the context of a developing, hierarchical GPU-based ray tracing radiation transport model that provides Uintah with additional capabilities for heat transfer and electromagnetic wave propagation. The capabilities of this new scheduler design are tested by running at large scale on the modern heterogeneous systems, Keeneland and TitanDev, with up to 360 and 960 GPUs respectively. On these systems, we demonstrate significant speedups per GPU against a standard CPU core for our radiation problem.
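The toy sketch below illustrates the general idea of a multi-stage queue that lets ready GPU tasks launch out of order ahead of CPU tasks. It is a simplified, hypothetical illustration, not the Uintah runtime system itself; the task fields and stages are stand-ins.

```python
# Toy sketch of a multi-stage queueing idea in the spirit of the scheduler
# described above; it is not the Uintah runtime. Task fields, stages, and
# the copy/launch steps are hypothetical stand-ins.
from collections import deque

class Task:
    def __init__(self, name, needs_gpu):
        self.name, self.needs_gpu = name, needs_gpu
        self.h2d_done = False   # flipped once the async host-to-device copy completes

cpu_ready = deque()   # CPU tasks whose data dependencies are satisfied
gpu_copy  = deque()   # GPU tasks waiting on asynchronous H2D copies
gpu_ready = deque()   # GPU tasks whose device data is resident

def enqueue(task):
    (gpu_copy if task.needs_gpu else cpu_ready).append(task)

def schedule_step():
    # Stage 1: promote GPU tasks whose asynchronous copies have completed.
    for _ in range(len(gpu_copy)):
        task = gpu_copy.popleft()
        (gpu_ready if task.h2d_done else gpu_copy).append(task)
    # Stage 2: launch out of order, preferring a GPU task whose data is ready.
    if gpu_ready:
        return f"launch {gpu_ready.popleft().name} on GPU"
    if cpu_ready:
        return f"run {cpu_ready.popleft().name} on a CPU core"
    return "idle"
```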
EOT: Using Stereoscopic 3D Videos to Inform the Public about the Benefits of Computational Science
Abstract: This paper describes an effort to create and disseminate a series of stereoscopic 3D videos that raise awareness about the value of computational science. While the videos target the general population, including the K-12 community, the audience for this paper includes scientific or technical peers who may be interested in sharing or demonstrating their own work more broadly. After outlining the motivation and goals of the project, the authors describe the visual content and computational science behind each of the videos. We then discuss our highly collaborative production workflow that has evolved over the past decade, as well as our distribution mechanisms. We include a summary of the most relevant and appropriate stereoscopic display technologies for the intended audience. Lastly, we analyze and compare this work to other forms of engagement, summarize best practices, and describe potential improvements to future stereoscopic 3D video production.
Science: High performance data mining of social media
Abstract: Difficulties in designing a system to mine social media lie in web service restrictions, legal permissions and security, as well as in network and execution engine latency. In small-scale tests on Twitter data, our data mining algorithm achieves an F-score of 0.90 for identifying streets, buildings, place names and place abbreviations. At large scale, however, maintaining accuracy and efficiency has required us to develop techniques to manage the real-time data load. Our contributions are algorithm and architecture strategies for multi-core and parallel processing that avoid major program refactoring.
Software: Mojave: A Development Environment for the Cactus Computational Framework
Abstract: This paper presents "Mojave," a set of plug-ins for the Eclipse Integrated Development Environment (IDE) which provides a unified interface for HPC code development and job management. Mojave facilitates code creation, refactoring, building, and running of a set of HPC scientific codes based on the Cactus Computational Toolkit, a computational framework for general problem-solving on regular meshes. The award-winning Cactus framework has been used in numerous fields including numerical relativity, cosmology, and coastal science. Cactus, like many high-level frameworks, leverages DSLs and generated distributed data structures. Mojave facilitates the development of Cactus applications and the submission of Cactus runs to high-end resources (e.g. XSEDE systems) using built-in Eclipse features, C/C++ Development Tooling (CDT), Parallel Tools Platform (PTP) plug-ins [6], and SimFactory (a Cactus-specific set of command line utilities) [5].
Numerous quality and productivity gains can be achieved using integrated development environments (IDEs) [2, 3, 4]. IDEs offer advanced search mechanisms that give developers a broader range of utilities to manage their code base; refactoring capabilities that enable developers to easily perform tedious and error-prone code transformations and keep their code more maintainable [1]; static analysis tools that help locate and correct bugs; and many other advantages.
In order for Eclipse to provide these features, however, it needs to be able to index a Cactus codebase, and to do this it needs to understand how Cactus organizes its generated files. Originally, Cactus used per-directory macro definitions in its generated code to enable individual modules to access module-specific code; as a result, displaying or analyzing the content of many header files depended on which file included them, making it impossible for any IDE to properly render or index the code. Part of the Mojave development effort includes refactoring Cactus to use per-directory include mechanisms instead of defines, generating multiple versions of the same header files in different directories. These changes are invisible to codes that use Cactus, but required teaching Mojave to correctly and automatically configure (or dynamically reconfigure) the numerous module directories within a Cactus build directory.
Mojave also leverages the Parallel Tools Platform's (PTP) feature-rich JAXB-based resource management interface. This new extensible resource management component enables Mojave to add its own specialized resource manager, which describes the commands used to interact with the remote resource manager and provides a means to display workload distribution on remote machines graphically. Thus one may view one's job graphically in the context of, for example, an XSEDE resource's workload.
Mojave provides specialized integration points for SimFactory, a command-line tool for Cactus job submission and monitoring. These integration points enable scientists to (1) develop new thorns and (2) create, (3) manage, and (4) monitor simulations or sets of simulations on remote machines. Mojave offers a menu-driven interface to SimFactory, allowing users to add their own commands under the menu or attach an action script for Cactus simulation management operations triggered by the matching of regular expressions on the console output stream. This allows for flexible responses to job monitoring information.
Because the Cactus SimFactory tools are designed to enable remote submissions to multiple resources and manage jobs on many machines at once, Mojave introduces a new job information sharing feature which enables a research group to view and monitor a set of jobs running on diverse resources submitted by multiple scientists. The research group may then be better informed about its existing runs. In addition, information shared about the results of the runs enables a quick response from the community, conveniently from within the same environment in which the code was developed. Job sharing capabilities, combined with the productivity gains offered by Eclipse and the CDT and PTP plug-ins as well as the job submission and monitoring capabilities offered by SimFactory, lend flexibility to Mojave as a unified computational science interface for Cactus, thereby bridging scientific efforts across campuses around the globe.
REFERENCES
[1] D. Dig, F. Kjolstad, and M. Snir. Bringing the HPC Programmer's IDE into the 21st Century through Refactoring. In SPLASH 2010 Workshop on Concurrency for the Application Programmer (CAP'10). Association for Computing Machinery (ACM), Oct. 2010.
[2] A. Frazer. CASE and its contribution to quality. In Layman's Guide to Software Quality, IEE Colloquium on, pages 6/1-6/4, Dec. 1993.
[3] M. J. Granger and R. A. Pick. Computer-aided software engineering's impact on the software development process: An experiment. In Proceedings of the 24th Hawaii International Conference on System Sciences, pages 28-35, January 1991.
[4] P. H. Luckey and R. M. Pittman. Improving software quality utilizing an integrated CASE environment. In Aerospace and Electronics Conference, 1991. NAECON 1991., Proceedings of the IEEE 1991 National, pages 665-671 vol. 2, May 1991.
[5] M. W. Thomas and E. Schnetter. Simulation Factory: Taming Application Configuration and Workflow on High-End Resources. ArXiv e-prints, August 2010.
[6] G. R. Watson, C. E. Rasmussen, and B. R. Tibbitts. An integrated approach to improving the parallel application development process. In Parallel Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pages 1-8, May 2009.
Tech: An Analysis of GPU Utilization Trends on the Keeneland Initial Delivery System
Abstract: In late 2010, the Georgia Institute of Technology, along with its partners – Oak Ridge National Laboratory, the University of Tennessee-Knoxville, and the National Institute for Computational Sciences – deployed the Keeneland Initial Delivery System (KIDS), a 201-teraflop, 120-node HP SL390 system with 240 Intel Xeon CPUs and 360 NVIDIA Fermi graphics processors, as part of the Keeneland Project. The Keeneland Project is a five-year Track 2D cooperative agreement awarded by the National Science Foundation (NSF) in 2009 for the deployment of an innovative high performance computing system that brings emerging architectures to the open science community; KIDS is being used to develop programming tools and libraries to ensure that the project can productively accelerate important scientific and engineering applications. Until late 2011, there was no formal mechanism in place for quantifying the efficiency of GPU usage on the Keeneland system because most applications did not have the appropriate administrative tools and vendor support. GPU administration has largely been an afterthought, as vendors in this space are focused on gaming and video applications. There is a compelling need to monitor GPU utilization on Keeneland for the purposes of proper system administration and future planning for the Keeneland Final System, which is expected to be in production in July 2012. With the release of CUDA 4.1, NVIDIA added enhanced functionality to the NVIDIA System Management Interface (nvidia-smi) tool, a management and monitoring command line utility that leverages the NVIDIA Management Library (NVML). NVML is a C-based API for monitoring and managing various states of NVIDIA GPU devices; it provides direct access to the queries and commands exposed via nvidia-smi. Using nvidia-smi, a monitoring tool was built for KIDS to track utilization and memory usage on the GPUs. In this paper, we discuss the development of the GPU utilization tool in depth, and its implementation details on KIDS. We also provide an analysis of the utilization statistics generated by this tool. For example, we identify utilization trends across jobs submitted on KIDS – such as overall GPU utilization as compared to CPU utilization (Figure 1) – and we investigate how GPU utilization changes for different job sizes and with changes in GPU hours requested. We also examine GPU utilization from the perspective of software: which packages are most frequently used, and how do they compare with respect to GPU utilization (Figure 2) and memory usage. Collection and analysis of this data is essential for facilitating heterogeneous computing on the Keeneland Initial Delivery System. Future directions for the use of these statistics are to provide insight into overall usage of the system, determine appropriate ratios for jobs (CPU to GPU, GPU to host memory), assist in scheduling policy management, and determine software utilization. These statistics become even more relevant as the center prepares for the deployment of the Keeneland Final System. As heterogeneous computing becomes increasingly common and moves toward being the standard, this information will help greatly in delivering consistently high uptime and will assist software developers in writing more efficient code for the majority of codebases aimed at heterogeneous systems.
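As a generic illustration of this kind of monitoring (not the KIDS tool itself), the sketch below periodically samples per-GPU utilization and memory with nvidia-smi's query interface; the log path and sampling interval are arbitrary choices, and a driver recent enough to support --query-gpu is assumed.

```python
# Minimal sketch of polling GPU utilization with nvidia-smi; not the KIDS
# tool described above. The query fields are real nvidia-smi fields, while
# the output file and 60-second interval are arbitrary assumptions.
import csv, subprocess, time

QUERY = "index,utilization.gpu,memory.used"

def sample():
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True).stdout
    return [(int(idx), int(util), int(mem))
            for idx, util, mem in csv.reader(out.splitlines())]

if __name__ == "__main__":
    with open("gpu_usage.log", "a") as log:
        while True:
            for idx, util, mem_mb in sample():
                log.write(f"{time.time():.0f} gpu{idx} util={util}% mem={mem_mb}MiB\n")
            log.flush()
            time.sleep(60)   # hypothetical sampling interval
```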
EOT: WaterHUB – A resource for students and educators for learning hydrology
Abstract: The study of surface water hydrology involves understanding the occurrence, distribution and movement of water on the surface of the earth. Because of human impacts in the form of land use change, the hydrologic processes at one geographic location may differ from those at other locations under the same or different climatic settings. As a result, a tool that educators and students can use to explore hydrology through observed data and computational simulations is needed. The objective of this paper is to present a prototype model sharing and data exploration tool that users can employ for education as well as for research. A GIS-enabled model sharing platform for the Soil and Water Assessment Tool (SWAT), called SWAT Share, has been developed that enables students not only to run model simulations online, but also to publish, share and visualize model results to study the impact of land use change on hydrology at the watershed scale. SWAT Share is developed as a part of the WaterHUB system, which is built by combining Purdue's HUBzero technology and TeraGrid/XSEDE computational resources. Experiences in developing and using SWAT Share in a classroom will be presented and discussed. In addition, the development and implementation plan of another tool, called the Hydrology Exploration Tool, will be presented.
Science: Multiple Concurrent Queries on Demand: Large Scale Video Analysis in a Flash Memory Environment as a Case for Humanities Supercomputing
Abstract: The Large Scale Video Analytics (LSVA) research project is a newly supported effort that explores the viability of a human-machine hybrid approach to managing massive video archives for the purposes of research and scholarship. Video databases are characterized by incomplete metadata and wildly divergent content tags; and while machine reading has a low efficacy rate, human tagging is generally too labor intensive to be viable. Thus, in the LSVA project, a prototype approach that integrates multiple algorithms for image recognition, scene completion, and crowd-sourced image tagging will be developed such that the system grows smarter and more valuable with increased usage. Building on interdisciplinary research in the humanities and social sciences on one hand (film theory, collective intelligence, visual communication), and computer science on the other (signal processing, large feature extraction for machine reading, algorithmic pattern recognition), the LSVA project will enable researchers to test different algorithms by placing them into a workflow, applying them to the same video dataset in real time, and finally analyzing the results using cinematic theories of representation.
Currently, the process of understanding and utilizing the content of a large database of video archives is time consuming and laborious. Besides the large size of the archives, other key challenges to effectively analyzing the video archives are limited metadata and lack of precise understanding of the actual content of the archive.
For many years, scholars have required high-performance computing resources for analyzing and examining digital videos. However, due to usage policies and technical limitations, supercomputers have required scholars to work in a batch-oriented workflow. A batch-oriented workflow is at odds with the typical workflow of scholars, which is exploratory and iterative in nature: the results of one query are used to inform the next query. A batch-oriented workflow interrupts this process and can hinder rather than help discovery.
The arrival of the XSEDE resource “Gordon”, the supercomputer that has extensive flash memory, transformed the relationship between this research method and HPC. Its architecture opened the possibility for researchers to interactively, and on-demand, query large databases in real-time, including databases of digital videos. Additionally, the computational capability of Gordon is sufficient for extensive analysis of video-assets in real-time for determining which videos to return in response to a query. This is a computationally intensive process involving queries that cannot be anticipated ahead of time.
This project will use the Gordon supercomputer not only to pre-process videos to automatically extract meaningful metadata, but also as an interactive engine that allows researchers to generate queries on the fly for which metadata extracted a priori is not sufficient. In order to be useful to researchers, we are combining an interactive database, a robust web-based front-end (Medici), and powerful visualization representations to aid the researcher in understanding the contents of the video footage without requiring them to watch every frame of every movie. Given that there is more video than one could ever view in a lifetime on YouTube alone, with more added to it and other video hosting sites on a daily basis, the need for and implications of this type of meta-level analysis are great indeed.
Due to the need for a high-quality end-user experience (low latency and high throughput), the LSVA project has dedicated and interactive access to Gordon's I/O nodes. In the first phase of this project, the database and video archive will be resident on Gordon's I/O node and Lustre file system. In the future, we will experiment with federated databases located at different sites across the country.
This work builds on the NCSA Medici system as the front-end that the user interacts with. Medici comes well-equipped to allow automated processes to be dropped into a technology-supported workflow. Medici also provides easy tagging and grouping of data elements using an RDF model at the back-end.
Conclusions
Though we are in the preliminary stages of this project, we are enthusiastic and confident about building an on-demand interactive query engine for video archives and designing a user-interface with appropriate visualizations to support real-time video analysis and querying. Ultimately, we hope to turn this system into a science gateway that can be used by the community of film scholars, social scientists, computer scientists, and artists.
Software: Enabling Large-scale Scientific Workflows on Petascale Resources Using MPI Master/Worker
Abstract: Computational scientists often need to execute large, loosely-coupled parallel applications such as workflows and bags of tasks in order to do their research. These applications are typically composed of many, short-running, serial tasks, but frequently demand large amounts of computation and storage. In order to produce results in a reasonable time, scientists would like to execute these applications using petascale resources. In the past this has been a challenge because petascale systems are not designed to execute such workloads efficiently. In this paper we describe a new approach to executing large, fine-grained workflows on distributed petascale systems. Our solution involves partitioning the workflow into independent subgraphs, and then submitting each subgraph as a self-contained MPI job to remote resources. We describe how the partitioning and job management has been implemented in the Pegasus Workflow Management System. We also explain how this approach provides an end to end solution for challenges related to system architecture, queue policies and priorities, and application reuse and development. Finally, we describe how the system is being used to enable the execution of a very large seismic hazard analysis application on XSEDE resources.
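As a rough illustration of the partitioning idea (not the Pegasus implementation), the sketch below splits a small, hypothetical workflow DAG into independent subgraphs, each of which could then be bundled into its own self-contained job.

```python
# Sketch only: one simple way to split a workflow DAG into independent
# subgraphs, loosely following the partitioning idea described above. The
# tasks and edges are hypothetical, and this is not Pegasus's partitioner.
import networkx as nx

dag = nx.DiGraph()
dag.add_edges_from([
    ("extract_a", "simulate_a"), ("simulate_a", "reduce_a"),   # chain A
    ("extract_b", "simulate_b"), ("simulate_b", "reduce_b"),   # chain B
])
dag.add_node("standalone_postprocess")                          # isolated task

# Tasks with no dependency path between them can run as separate self-contained
# jobs; weakly connected components give one such partition.
subgraphs = [dag.subgraph(c).copy()
             for c in nx.weakly_connected_components(dag)]

for i, sub in enumerate(subgraphs):
    order = list(nx.topological_sort(sub))   # execution order within one job
    print(f"job {i}: {order}")
```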
Tech: A Distributed Memory Out-of-Core Method and Its Application to Quantum Chemistry Applications
Abstract: Out-of-core methods, which repeatedly offload data to disk in order to overcome local on-node memory constraints, are encountered in a range of scientific computing disciplines, including quantum chemistry. Unfortunately, these methods often do not map nicely onto the global parallel file systems employed on modern HPC clusters and can overwhelm even the most capable of file systems, causing unacceptably low application performance (while also degrading I/O performance for all system users). To address this bottleneck and explore more efficient use of HPC clusters for a quantum chemistry application, CFOUR, a new MPI-based utility has been developed to support out-of-core methods on distributed memory systems. This MPI Ocore utility leverages the high-speed interconnect available on HPC clusters to offload and retrieve out-of-core records to and from one or more remote memory storage pools, avoiding excessive I/O transactions on local or global file systems. In this paper, we present an overview of the Ocore implementation, its direct application within a large quantum chemistry application, and micro-benchmark and application performance results from an HPC cluster interconnected with quad-data-rate InfiniBand.
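The sketch below illustrates the general pattern of parking out-of-core records in remote memory over MPI rather than on disk. It is a minimal mpi4py illustration with hypothetical tags and a single storage rank, not the Ocore/CFOUR implementation; run it with at least two MPI ranks.

```python
# Illustrative sketch only: compute ranks offload records to a remote
# in-memory pool over MPI instead of writing scratch files. Tags, record
# contents, and the single pool rank are hypothetical simplifications.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
STORE, FETCH, STOP = 1, 2, 3            # message tags (arbitrary)
POOL = size - 1                         # last rank acts as the memory pool

if rank == POOL:
    records, active = {}, size - 1
    while active:
        status = MPI.Status()
        msg = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        src, tag = status.Get_source(), status.Get_tag()
        if tag == STORE:
            key, data = msg
            records[(src, key)] = data                    # record stays in RAM
        elif tag == FETCH:
            comm.send(records[(src, msg)], dest=src, tag=FETCH)
        else:                                             # STOP
            active -= 1
else:
    record = np.arange(4, dtype="f8") * rank              # stand-in scratch data
    comm.send(("r0", record), dest=POOL, tag=STORE)       # offload the record
    comm.send("r0", dest=POOL, tag=FETCH)                 # ...later, ask for it back
    assert np.allclose(comm.recv(source=POOL, tag=FETCH), record)
    comm.send(None, dest=POOL, tag=STOP)
```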
EOT: A Learning Outcome Driven Cyber Infrastructure for Thermodynamics Education
Abstract: The web portal TEST, the Expert System for Thermodynamics (www.thermofluids.net), is courseware that is being used in engineering thermodynamics classes by more than 2000 registered educators around the world. The courseware combines a number of resources: tables and charts, a library of animations covering every major topic, rich internet applications (RIAs) for simulating important thermodynamic systems, thermodynamic calculators called daemons (named after Maxwell's daemon) to verify manual solutions and pursue what-if studies, and sixteen chapters of multimedia problems and examples. In this work, we present an outcome-driven approach that links the various resources offered by this courseware to motivate a student to learn. For this purpose, we have associated 24 learning outcomes with engineering thermodynamics, a two-semester course in most universities. For each outcome, a fixed number of problems (10) are turned into key problems, designated by a key icon next to the problem. As a student tries to solve a key problem, the outcomes fulfilled by the problem are displayed and the progress made by the student is monitored. On successful solution, or after an unsuccessful attempt, a customized recommendation of resources - what to read, which animations to browse, which tables to become familiar with, which daemon and RIA to use, etc. - is made. A student can keep track of his or her progress and estimate the work required to reach various proficiency levels on a given outcome. When a large number of students start using this system, it may become possible to track which resources are more helpful in thermodynamic problem solving. A potential benefit of this approach is direct assessment of student achievement of learning outcomes, an area of high emphasis from the accreditation board.
Invited Talk: Mythbusting with nanoHUB.org - the first science-gateway software as a service cloud focused on end-to-end application users AND application developers
Abstract: Gordon Moore’s 1965 prediction of continued semiconductor device down-scaling and circuit up-scaling has become a self-fulfilling prophecy in the past 40 years. Open source code development and sharing of the process modeling software SUPREM and the circuit modeling software SPICE were two critical technologies that enabled the down-scaling of semiconductor devices and up-scaling of circuit complexity. SPICE was originally a teaching tool that transitioned into a research tool, was disseminated by an inspired engineering professor via tapes, and improved by users who provided constructive feedback to a multidisciplinary group of electrical engineers, physicists, and numerical analysts. Ultimately SPICE and SUPREM transitioned into the electronic design software packages that power today’s 280 billion dollar semiconductor industry.
Can we duplicate such multi-disciplinary software development starting from teaching and research in a small research group leading to true economic impact? What are technologies that might advance such a process? How can we deliver such software to a broad audience? How can we teach the next generation of engineers and scientists to use the latest research software? What are critical user requirements? What are critical developer requirements? What are the incentives for faculty members to share their competitive advantages? How do we know early on if such an infrastructure is successful? This presentation will show how nanoHUB.org addresses these questions.
By serving a community of 230,000 users in the past 12 months with an ever-growing collection of 3,000 resources, including over 220 simulation tools, nanoHUB.org has established itself as “the world’s largest nanotechnology user facility” [1]. nanoHUB.org is driving significant knowledge transfer among researchers and speeding transfer from research to education, quantified with usage statistics, usage patterns, collaboration patterns, and citation data from the scientific literature. Over 850 nanoHUB citations in the literature, resulting in a secondary citation h-index of 41, prove that high quality research by users outside of the pool of original tool developers can be enabled by nanoHUB processes. In addition to high-quality content, critical attributes of nanoHUB success are its open access, ease of use, utterly dependable operation, low-cost and rapid content adaptation and deployment, and open usage and assessment data. The open-source HUBzero software platform, built for nanoHUB and now powering many other hubs, is architected to deliver a user experience corresponding to these criteria.
In June 2011 the National Science and Technology Council published Materials Genome Initiative for Global Competitiveness [2], writing “Accelerating the pace of discovery and deployment of advanced material systems will therefore be crucial to achieving global competitiveness in the 21st century.” The Council goes on to say, "Open innovation will play a key role in accelerating the development of advanced computational tools. … An existing system that is a good example of a first step toward open innovation is the nanoHUB, a National Science Foundation program run through the Network for Computational Nanotechnology."
[1] Quote by Mikhail Roco, Senior Advisor for Nanotechnology, National Science Foundation.
[2] http://www.whitehouse.gov/sites/default/files/microsites/ostp/materials_genome_initiative-final.pdf
Software: The Eclipse Parallel Tools Platform: Toward an Integrated Development Environment for XSEDE Resources
Abstract: Eclipse is a widely used, open source integrated development environment that includes support for C, C++, Fortran, and Python. The Parallel Tools Platform (PTP) extends Eclipse to support development on high performance computers. PTP allows the user to run Eclipse on her laptop, while the code is compiled, run, debugged, and profiled on a remote HPC system. PTP provides development assistance for MPI, OpenMP, and UPC; it allows users to submit jobs to the remote batch system and monitor the job queue; and it provides a visual parallel debugger.
In this paper, we will describe the capabilities we have added to PTP to support XSEDE resources. These capabilities include submission and monitoring of jobs on systems running Sun/Oracle Grid Engine, support for GSI authentication and MyProxy logon, support for environment modules, and integration with compilers from Cray and PGI. We will describe ongoing work and directions for future collaboration, including OpenACC support and parallel debugger integration.
Tech: Achieve Better Performance with PEAK on XSEDE Resources
Abstract: As the leading distributed cyberinfrastructure for open scientific research in the United States, XSEDE supports several supercomputers across the country, as well as computational tools that are critical to the success of the researchers who use them. In most cases, users are looking for a systematic way of selecting and configuring the available systems software and libraries for their applications so as to obtain optimal application performance. However, few scientific application developers have the time for an exhaustive search of all the possible configurations to determine the best one, and performing such a search empirically can consume a significant proportion of their allocation hours. We present here a framework, called the Performance Environment Autoconfiguration frameworK (PEAK), to help developers and users of scientific applications select the optimal configuration for their application on a given platform and to update that configuration when changes in the underlying hardware and systems software occur. The choices to be made include the compiler and its compilation options, the numerical libraries and settings of library parameters, and settings of other environment variables to take advantage of NUMA systems. The framework has helped us choose the optimal configuration to obtain a significant speedup for some scientific applications executed on XSEDE platforms such as Kraken, Ranger, Nautilus and Blacklight.
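As a rough illustration of the kind of search PEAK automates, the sketch below times a benchmark under every combination of compiler, flags, and BLAS library and keeps the fastest; the compiler names, flags, libraries, and the bench.c target are all invented placeholders, not PEAK's actual interface:

```python
# Exhaustive configuration search sketch: build and time a benchmark for
# every (compiler, flags, BLAS) combination, keeping the fastest one.
import itertools, subprocess, time

compilers = ["gcc", "icc"]
flag_sets = ["-O2", "-O3 -march=native"]
blas_libs = ["-lopenblas", "-lmkl_rt"]

best = (float("inf"), None)
for cc, flags, blas in itertools.product(compilers, flag_sets, blas_libs):
    build = f"{cc} {flags} bench.c {blas} -o bench"
    if subprocess.run(build, shell=True).returncode != 0:
        continue                          # skip configurations that fail to build
    t0 = time.perf_counter()
    subprocess.run("./bench", shell=True, check=True)
    elapsed = time.perf_counter() - t0
    if elapsed < best[0]:
        best = (elapsed, (cc, flags, blas))

print("fastest configuration:", best[1], f"({best[0]:.2f} s)")
```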
EOT: Conducting K-12 Outreach to Evoke Early Interest in IT, Science, and Advanced Technology
Abstract: The Indiana University Pervasive Technology Institute has engaged for several years in K-12 Education, Outreach and Training (EOT) events related to technology in general and computing in particular. In each event we strive to positively influence children’s perception of science and technology. We view K-12 EOT as a channel for technical professionals to engage young people in the pursuit of scientific and technical understanding. Our goal is for students to see these subjects as interesting, exciting, and worth further pursuit. By providing opportunities for pre-college students to engage in science, technology, engineering and mathematics (STEM) activities first hand, we hope to influence their choices of careers and field-of-study later in life.
In this paper we give an account of our experiences with providing EOT: we describe several of our workshops and events; we provide details regarding techniques that we found to be successful in working with both students and instructors; we discuss program costs and logistics; and we describe our plans for the future.
Science: Excited States in Lattice QCD using the Stochastic LapH Method
Abstract: A new method for computing the mass spectrum of excited baryons and mesons from the temporal correlations of quantum-field operators in quantum chromodynamics is described. The correlations are determined using Markov-chain Monte Carlo estimates of QCD path integrals formulated on an anisotropic space-time lattice. Access to the excited states of interest requires determinations of lower-lying multi-hadron state energies, necessitating the use of multi-hadron operators. Evaluating the correlations of such multi-hadron operators is difficult with standard methods. A new stochastic method of treating the low-lying modes of quark propagation which exploits a new procedure for spatially-smearing quark fields, known as Laplacian Heaviside smearing, makes such calculations possible for the first time. A new operator for studying glueballs, a hypothetical form of matter comprised predominantly of gluons, is also tested, and computing the mixing of this glueball operator with a quark-antiquark operator and multiple two-pion operators is shown to be feasible.
Science: Monte Carlo strategies for first-principles simulations of elemental systems
Abstract: We discuss the application of atomistic Monte Carlo simulation based on electronic structure calculations to elemental systems such as metals and alloys. As in prior work in this area [1,2], an approximate "pre-sampling" potential is used to generate large moves with a high probability of acceptance. Even with such a scheme, however, such simulations are extremely expensive and may benefit from algorithmic developments that improve acceptance rates and/or enable additional parallelization.
Here we consider various such developments. The first of these is a three-level hybrid algorithm in which two pre-sampling potentials are used. The lowest level is an empirical potential, and the "middle" level uses a low-quality density functional theory. The efficiency of the multistage algorithm is analyzed and an example application is given.
Two other schemes for reducing overall run-time are also considered. In the first, the Multiple-try Monte Carlo algorithm [4], a series of moves is attempted in parallel, with the choice of the next state in the chain made using all the information gathered. This is found to be a poor choice for simulations of this type. In the second scheme, "tree sampling," multiple trial moves are made in parallel such that if the first is rejected, the second is ready and can be considered immediately. This scheme is shown to be quite effective under reasonable run parameters.
[1] S. Wang et al., Comp. Mater. Sci. 29 (2004) 145-151.
[2] M. J. McGrath et al., Comp. Phys. Comm. 169 (2005) 289-294.
[3] L. D. Gelb and T. Carnahan, Chem. Phys. Letts. 417 (2006) 283-287.
[4] J. S. Liu, Monte Carlo Strategies in Scientific Computing (2001), Springer, New York.
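A minimal sketch of the two-stage ("pre-sampling") acceptance idea described in this abstract is given below; the toy one-dimensional energies stand in for an empirical potential and an expensive electronic-structure energy, and the scheme shown is the standard two-stage Metropolis correction rather than the authors' specific three-level algorithm:

```python
# Two-stage Metropolis sketch: a cheap surrogate energy screens trial moves,
# and only pre-accepted moves are evaluated with the expensive energy in a
# second stage that preserves the exact target distribution.
import math, random

beta = 1.0
cheap  = lambda x: 0.5 * x * x               # surrogate (pre-sampling) energy
costly = lambda x: 0.5 * x * x + 0.1 * x**4  # "expensive" target energy

x = 0.0
for _ in range(10000):
    x_new = x + random.uniform(-1.0, 1.0)    # symmetric proposal

    # Stage 1: screen with the cheap potential.
    d_cheap = cheap(x_new) - cheap(x)
    if random.random() >= math.exp(-beta * d_cheap):
        continue                             # rejected cheaply; keep current x

    # Stage 2: correct with the expensive potential.
    d_costly = costly(x_new) - costly(x)
    if random.random() < math.exp(-beta * (d_costly - d_cheap)):
        x = x_new
```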
Tech: Invited Talk: Improving XSEDE Software Quality using Software Engineering Best Practice
Abstract: XSEDE is introducing a range of system and software engineering practices to achieve systematic and continuous improvement in the quality of its integrated and supported software. This paper will describe XSEDE’s software engineering practices and what we hope to obtain from them. We discuss the technical and cultural challenges of establishing community-defined practices, and the techniques we have been using to address these challenges. We will introduce the initial engineering practices implemented in project year 1, outline additional engineering practice improvements planned during project year 2, and suggest how these engineering practices could be leveraged by the broader XSEDE community.
EOT: Computing MATTERS: Building Pathways to Cyberinfrastructure
Abstract: As we prepare students for the 21st century workforce, three of the most important skills for advancing modern mathematics and science are quantitative reasoning, computational thinking, and multi-scale modeling. Computing MATTERS: Pathways to Cyberinfrastructure (1) program, funded in part by the National Science Foundation Cyberinfrastructure Training, Education, Advancement, and Mentoring (CI-TEAM) program, provides opportunities for students to explore and engage in skills needed to advance computational science education through the use of cyberinfrastructure. Computing MATTERS is a program of Shodor (2), a national resource for computational science education. It combines the best of Shodor’s efforts from workshops, apprenticeships and internships. The program provides a continuum of activities from middle school grades through college for students to encounter the excitement of discovery, the power of inquiry and the joy of learning enabled by cyberinfrastructure and advanced technologies.
Based on Shodor’s initial demonstration program, Computing MATTERS: Pathways to Cyberinfrastructure is an implementation program that expands on partnerships with universities, community colleges, school districts, community centers, and Sigma Xi to extend Computing MATTERS first throughout the Research Triangle, NC area and then to span the state of North Carolina as the program develops. In Computing MATTERS, “MATTERS” is itself an acronym and has meaning in its own right: Mentoring Academic Transitions Through Experiences in Research and Service. Significant progress has been accomplished to implement Computing MATTERS and to excite and attract students all over North Carolina and beyond to consider study and careers in science, technology, engineering and mathematics (STEM). The program has proven to attract many individuals from groups underrepresented in science, technology, engineering and mathematics.
Through Computing MATTERS: Pathways to Cyberinfrastructure, Shodor seeks to build self-sustaining local infrastructures, adapting the best features of Shodor’s local initiatives, while expanding the mentor base to include area colleges and universities as well as members of Sigma Xi living and working in North Carolina.
Science: Benchmark Calculations for Multi-Photon Ionization of the Hydrogen Molecule and the Hydrogen Molecular Ion by Short-Pulse Intense Laser Radiation
Abstract: We provide an overview of our recent work on the implementation of the finite-element discrete-variable representation to study the interaction of a few-cycle intense laser pulse with the H$_2$ and H$_2^{\,+}$ molecules. The problem is formulated in prolate spheroidal coordinates, the ideal system for a diatomic molecule, and the time-dependent Schr\"odinger equation is solved on a space-time grid. The physical information is extracted by projecting the time-evolved solution to the appropriate field-free states of the problem.
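As a loose illustration of solving a time-dependent Schroedinger equation on a space-time grid, the sketch below propagates a one-dimensional toy wave packet with Crank-Nicolson; the actual work uses a finite-element discrete-variable representation in prolate spheroidal coordinates, which this sketch does not attempt to reproduce (atomic units and a harmonic potential are assumed purely for illustration):

```python
# 1-D Crank-Nicolson propagation of a toy time-dependent Schroedinger equation.
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import spsolve

n, dx, dt = 400, 0.1, 0.01
x = (np.arange(n) - n / 2) * dx
V = 0.5 * x**2                                   # toy harmonic potential
lap = diags([1, -2, 1], [-1, 0, 1], shape=(n, n)) / dx**2
H = -0.5 * lap + diags(V)                        # grid Hamiltonian

psi = np.exp(-(x - 1.0) ** 2)                    # displaced Gaussian packet
psi = psi / np.linalg.norm(psi)

I = identity(n)
A = (I + 0.5j * dt * H).tocsc()                  # Crank-Nicolson matrices
B = (I - 0.5j * dt * H).tocsc()
for _ in range(100):
    psi = spsolve(A, B @ psi)                    # one time step

print("norm after propagation:", np.linalg.norm(psi))  # ~1; CN is unitary
```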
Science: Computational challenges in nanoparticle partition function calculation
Abstract: Bottom-up building block assembly is a useful technique for determining thermodynamically stable configurations of certain physical particles. This paper provides a description of the computational bottlenecks encountered when generating large configurations of particles. We identify two components, cluster pairing and shape matching, that dominate the run time. We present scaling data for a simple example particle and discuss opportunities for enhancing implementations of bottom-up building block assembly for studying larger or more complex systems.
Tech: A Framework for Federated Two-Factor Authentication Enabling Cost-Effective Secure Access to Distributed Cyberinfrastructure
Abstract: As cyber attacks become increasingly sophisticated, the security measures used to mitigate the risks must also increase in sophistication. One time password (OTP) systems provide strong authentication because security credentials are not reusable, thus thwarting credential replay attacks. The credential changes regularly, making brute-force attacks significantly more difficult. In high performance computing, end users may require access to resources housed at several different service provider locations. The ability to share a strong token between multiple computing resources reduces cost and complexity. The National Science Foundation (NSF) Extreme Science and Engineering Discovery Environment (XSEDE) provides access to digital resources, including supercomputers, data resources, and software tools. XSEDE will offer centralized strong authentication for services amongst service providers that leverage their own user databases and security profiles. This work implements a scalable framework built on standards to provide federated secure access to distributed cyberinfrastructure.
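To illustrate why one-time credentials resist replay, the sketch below derives a time-based code using RFC 4226/6238-style HMAC truncation; it is a generic illustration, not XSEDE's or any provider's actual implementation, and the shared secret is a placeholder:

```python
# Time-based one-time password sketch: the code changes every time step, so an
# intercepted credential cannot be replayed later.
import hmac, hashlib, struct, time

def totp(secret: bytes, step: int = 30, digits: int = 6) -> str:
    counter = int(time.time()) // step                  # changes every 30 s
    msg = struct.pack(">Q", counter)                    # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                          # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp(b"shared-secret-provisioned-in-the-token"))
```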
EOT: Motivating Minority Student Involvement in High Performance Computing activities at TSU
Abstract: none
Science: Electrostatic Screening Effects on a Model System for Molecular Electronics
Abstract:
Science: Ensemble modeling of storm interaction with XSEDE
Abstract: We applied TG/XSEDE HPCs to an ensemble modeling study of how thunderstorm severity depended on the proximity of nearby storms. The Weather Research and Forecasting model was used to investigate 52 idealized thunderstorm scenarios, changing the position of a nearby convective cell when another was developing. We found a large impact from having any other storm cell nearby, as well as very high sensitivity to where that cell was placed. This represents a significant change from forecast thinking that currently relies on expected storm behavior guidance for a quiescent environment. We are also studying a new, tornado-scale simulation based on the above findings.
In carrying out this study over the last 3 years we utilized five (perhaps 6-7 by July) XSEDE HPCs, and made use of both traditional batch capabilities and grid-based computing. We found our greatest challenges were less in high-performance computing than in data and storage. Moving, storing, and analyzing multi-TB data sets proved to be challenging, and we found our data processing could significantly degrade the performance of high-performance disk systems. In addition to the details of our study, we will discuss these experiences and how we hope to make the most of XSEDE resources, and note some of the near-term and long-term challenges we expect to encounter in our numerical research.
Tech: Running Many Molecular Dynamics Simulations on Many Supercomputers
Abstract: The challenges facing biomolecular simulations are manifold. In addition to long time simulations of a single large system, an important challenge is the ability to run a large number of identical copies (ensembles) of the same system. Ensemble-based simulations are important for effective sampling, and due to the low level of coupling between them, ensemble-based simulations are good candidates to utilize distributed cyberinfrastructure. The problem for the practitioner is thus effectively marshaling thousands if not millions of high-performance simulations on distributed cyberinfrastructure. Here we assess the ability of an interoperable and extensible pilot-job tool (BigJob) to support high-throughput execution of high-performance molecular dynamics simulations across distributed supercomputing infrastructure. Using a nucleosome positioning problem as an exemplar, we demonstrate how we have addressed this challenge on the TeraGrid/XSEDE. Specifically, we compute 336 independent trajectories of 20 ns each. Each trajectory is further divided into twenty 1 ns long simulation tasks. A single task requires ≈ 42 MB of input, 9 hours of compute time on 32 cores, and generates 3.8 GB of data. In total we have 6,720 tasks (6.7 μs) and approximately 25 TB to manage. There is natural task-level concurrency, as these 6,720 tasks can be executed with 336-way task concurrency. Using NAMD 2.7, this project requires approximately 2 million hours of CPU time and could be completed in just over 1 month on a dedicated supercomputer containing 3,000 cores. In practice even such a modest supercomputer is a shared resource, and our experience suggests that a simple scheme to automatically batch queue the tasks might require several years to complete the project. In order to reduce the total time-to-completion, we need to scale up, out, and across various resources. Our approach is to aggregate many ensemble members into pilot-jobs, distribute pilot-jobs over multiple compute resources concurrently, and dynamically assign tasks across the available resources.
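A quick check of the workload arithmetic quoted above (all figures are taken from the abstract; the one-month estimate assumes a dedicated 3,000-core machine):

```python
# Verify the task count and CPU-hour estimate quoted in the abstract.
trajectories, tasks_per_traj = 336, 20
tasks = trajectories * tasks_per_traj            # 6,720 tasks
cpu_hours = tasks * 9 * 32                       # 9 h on 32 cores per task
print(tasks, cpu_hours)                          # 6720, 1935360 (~2 million hours)

dedicated_cores = 3000
days = cpu_hours / dedicated_cores / 24
print(round(days, 1))                            # ~26.9 days, i.e. just over a month
```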
EOT: Building a Regional Partnership for Computer Science Education
Abstract: University of California San Diego (UCSD) piloted a new computer science course for the undergraduate curriculum, “Introduction to Computer Science (CS) Principles” in fall of 2010, with significant gains in student interest in CS and performance in the course (when compared with the traditional introductory CS course), particularly among women and minority students. Approved for credit and taught by Dr. Beth Simon, the course was contextual, conceptual, and constructivist in its approach to programming; building interest and enthusiasm for the “magic” of computing before introducing programming mechanics. This lower division course was designed as part of a revised AP Computer Science sequence for high schools, to be coupled with revised and updated AP tests that include the underlying computing principles. Since AP tests drive high school curriculum, their revision has the potential to significantly change the way that computer science is introduced to students in high schools nationwide.
But introducing new courses into a region with many districts and a highly diverse student population poses many challenges. The San Diego Supercomputer Center (SDSC) at UCSD conducted an exploratory project to identify and build the collaborative networks necessary to support sustainable change in the San Diego region’s pre-college computer science education programs. This paper reports on SDSC’s activities to engage the diverse stakeholders whose support is critical to successfully introducing and sustaining this new course in San Diego County schools.
Though the project was spearheaded by SDSC, it involved many other leadership partners from UCSD, other colleges in the region, the San Diego County Office of Education, the San Diego chapter of the Computer Science Teachers Association, district administrators and technology specialists, and high school teachers.
The primary activities of this project were identification of key stakeholders within the K-12 and community college school districts serving students in the San Diego County region, meeting with those stakeholders, presenting the case for the course to elected officials identified by district personnel, and development of action agendas to support the establishment of functional partnerships for further collaboration toward project objectives. Those objectives included:
• Outline protocols, processes, and decision criteria for strengthening the computer science curriculum in each district; and strategies for using them to support the larger project goals;
• Identify key decision makers whose endorsement was needed for district-wide implementation;
• Identify leaders within districts (teachers, professional development specialists, technology specialists, and administrators) willing to become the first cohort Teacher Leadership Team to introduce the not-yet-but-soon-to-be AP CS Principles course into the San Diego region; and
• Provide professional development for the leadership team to become familiar with the pilot “Introduction to Computer Science Principles” course launched at UCSD in 2010-2011.
This project benefited tremendously from the generous sharing of lessons learned by experienced colleagues from the Los Angeles-based “Into the Loop” project. Their advice helped us better understand the complex challenges to implementation of comprehensive revision of the high school computer science curriculum, and to adapt some of their strategies to create a San Diego regional coalition.
Science: Improving Tornado Prediction Using Data Mining
Abstract: Responsible for over four hundred deaths in the United States in 2011, tornadoes are a major source of preventable deaths and property damage. One major cause of the high number of fatalities caused by tornadoes each year is the high False Alarm Ratio (FAR). FAR is a forecasting metric that describes the fraction of issued tornado warnings for which no tornado actually occurs.
While tornado forecasting has improved dramatically over the past decades, the FAR has consistently remained between seventy and eighty percent. This indicates that as much as eighty percent of all tornado warnings are false alarms. Consequently, the public has gradually become desensitized to tornado warnings and thus does not seek shelter when it is appropriate [1]. If the number of fatalities caused by tornadoes is to be reduced, the FAR must first be improved.
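As a worked example of the ratio as defined above (the warning counts are invented for illustration):

```python
# False Alarm Ratio: the fraction of issued warnings with no verifying tornado.
false_alarms = 75      # warnings issued, no tornado occurred
hits = 25              # warnings verified by an actual tornado
far = false_alarms / (false_alarms + hits)
print(far)             # 0.75, within the 70-80% range cited above
```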
The reasons for the high FAR are complex and manifold. Arguably the most pragmatic reason is simply caution. Faced with the decision of whether or not to issue a warning, a meteorologist can err in one of two ways. The first error, a type I error, occurs when the meteorologist predicts a tornado, but one does not occur. The second error, a type II error, occurs when a meteorologist does not predict a tornado, but one does occur. Because the former is merely inconvenient and the latter is potentially fatal, meteorologists tend to err on the side of caution, and typically issue a tornado warning given a storm which shows any signs of being tornadic.
In addition to caution, a critical cause of the high FAR is a limited understanding of tornadogenesis, or the process by which tornadoes form. Fortunately, tornadoes are a relatively infrequent phenomenon. However, tornado scarcity, coupled with the limitations of current radar technology, has resulted in a limited amount of real-world data.
Because of the extraordinary complexity of tornadogenesis and the limited amount of real-world data, tornadogenesis has stubbornly resisted complete understanding for centuries. However, for the first time in history, technology has reached sufficient maturity to begin solving this ancient problem.
The absence of real-world data can be addressed by running highly sophisticated and computationally intensive mathematical models on Kraken, a high-performance computer provided by XSEDE, to numerically simulate supercells. By using simulated data in lieu of real-world data, datasets of arbitrary resolution and scale can be generated as needed [2].
Unfortunately, while the use of a simulator mitigates the issue of data scarcity, it also creates a new one. If the simulations are of sufficiently high resolution to be useful, the size of the dataset becomes astronomical. At present our dataset consists of approximately fifty individual simulations, each approximately one terabyte in size, totaling over fifty terabytes of data. Because the dataset is too large for any individual to analyze and understand, new techniques capable of autonomously analyzing the dataset needed to be developed.
Specifically, Spatiotemporal Relational Probability Trees (SRPTs) were developed. SRPTs are an augmentation of a classic data mining algorithm, the Probability Tree (PT). Probability Trees were chosen for their strong predictive ability, efficiency, and ability to scale to large datasets [3]. Additionally, unlike many data mining algorithms, a Probability Tree is human-readable. Consequently, after the PT has been grown using the simulated dataset, further insights can be drawn from its structure by domain scientists.
SRPTs differ from PTs in several ways. The most important difference being SRPTs are capable of creating spatial, temporal, and relational distinctions within the tree. This gives SRPTs the ability to reason about spatiotemporal and relational data. Because the natural world is inherently spatiotemporal and relational, and many scientific datasets share these properties, this greatly enhances the strength and applicability of SRPTs.
However, while SRPTs are powerful predictors, they do suffer from one major weakness: overfitting. Overfitting occurs when the SRPTs cease to discover generalizable, meaningful patterns within the data, and instead begin fitting to minutia and noise within the dataset. Overfitting in SRPTs can largely be mitigated by limiting the depth to which the tree grows. However, this also limits the predictive ability of an individual SRPT.
To address these issues, Spatiotemporal Relational Random Forests (SRRFs) were developed. Much like a traditional random forest, an SRRF is an ensemble of SRPTs. By growing hundreds of individual SRPTs from bootstrap samples of the dataset, then combining each individual tree’s prediction in an intelligent way, SRRFs are capable of discovering far more complex patterns within the data [4]. Additionally, because each tree within the SRRF is limited in size, overfitting is far less of an issue.
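A conventional stand-in for this ensemble idea is sketched below using scikit-learn's random forest on synthetic data: hundreds of depth-limited trees are grown on bootstrap samples and their votes are combined. SRRFs additionally make spatiotemporal and relational distinctions, which this sketch does not capture; the data and labels are invented:

```python
# Ensemble of depth-limited trees grown on bootstrap samples (random forest)
# as a conventional analogue of the SRRF approach described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                  # synthetic storm attributes
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # synthetic "tornadic" label

forest = RandomForestClassifier(
    n_estimators=200,      # hundreds of trees grown from bootstrap samples
    max_depth=4,           # depth limit keeps individual trees from overfitting
    bootstrap=True,
).fit(X, y)

# Combining the trees' votes yields class probabilities for new storms.
print(forest.predict_proba(X[:3]))
```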
By training SRRFs on the simulated tornado dataset, we hope to discover salient patterns and conditions necessary for the formation of tornadoes. By discovering and understanding these patterns, traditional tornado forecasting may be improved dramatically. Specifically, these patterns may aid in differentiating storms which will produce a tornado from those which will not. This ability would have the immediate effect of reducing the False-Alarm Ratio, and ultimately aid in restoring the public’s confidence in tornado warnings. By doing so, we may ultimately reduce the number of preventable fatalities caused by tornadoes.
References
1. Rosendahl, D. H., 2008: Identifying precursors to strong low-level rotation with numerically simulated supercell thunderstorms: A data mining approach. Master's thesis, University of Oklahoma, School of Meteorology.
2. Xue, Ming, Kevin Droegemeier, and V. Wong. "The Advanced Regional Prediction System (ARPS) - A Multiscale Nonhydrostatic Atmospheric Simulation and Prediction Model. Part 1: Model Dynamics and Verification." Meteorology and Atmospheric Physics 75 (2000): 161-193. Print.
3. McGovern, Amy and Hiers, Nathan and Collier, Matthew and Gagne II, David J. and Brown, Rodger A. (2008). Spatiotemporal Relational Probability Trees. Proceedings of the 2008 IEEE International Conference on Data Mining, Pages 935-940. Pisa, Italy. 15-19 December 2008.
4. Supinie, Timothy and McGovern, Amy and Williams, John and Abernethy, Jennifer. Spatiotemporal Relational Random Forests. Proceedings of the 2009 IEEE International Conference on Data Mining (ICDM) workshop on Spatiotemporal Data Mining, electronically published.
Science: Quantum Algorithms for Predicting the Properties of Complex Materials
Abstract: A central goal in computational materials science is to find efficient methods for solving the Kohn-Sham equation. The realization of this goal would allow one to predict materials properties such as phase stability, structure and optical and dielectric properties for a wide variety of materials. Typically, a solution of the Kohn-Sham equation requires computing a set of low-lying eigenpairs. Standard methods for computing such eigenpairs require two procedures: (a) maintaining the orthogonality of an approximation space, and (b) forming approximate eigenpairs with the Rayleigh-Ritz method. These two procedures scale cubically with the number of desired eigenpairs. Recently, we presented a method, applicable to any large Hermitian eigenproblem, by which the spectrum is partitioned among distinct groups of processors. This "divide and conquer" approach serves as a parallelization scheme at the level of the solver, making it compatible with existing schemes that parallelize at a physical level and at the level of primitive operations, e.g., matrix-vector multiplication. In addition, among all processor sets, the size of any approximation subspace is reduced, thereby reducing the cost of orthogonalization and the Rayleigh-Ritz method. We will address the key aspects of the algorithm, its implementation in real space, and demonstrate the accuracy of the algorithm by computing the electronic structure of some representative materials problems.
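The spectrum-partitioning idea can be loosely illustrated with shift-invert Lanczos, where different shifts target different slices of the spectrum that independent processor groups could compute; the toy operator below and the use of SciPy are illustrative assumptions, not the authors' real-space implementation:

```python
# Spectrum-slicing sketch: each shift (sigma) targets a different slice of the
# spectrum, so each slice could be assigned to its own group of processors.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

n = 2000
H = diags([np.linspace(0.0, 10.0, n),            # toy sparse Hermitian operator
           -0.1 * np.ones(n - 1),
           -0.1 * np.ones(n - 1)],
          [0, -1, 1]).tocsc()

slices = [0.5, 2.5, 4.5]                         # one shift per processor group
for sigma in slices:
    vals, _ = eigsh(H, k=10, sigma=sigma, which="LM")   # eigenpairs near sigma
    print(f"slice around {sigma}: {np.sort(vals)[:3]} ...")
```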
Tech: A Systematic Process for Efficient Execution on Intel's Heterogeneous Computation Nodes
Abstract: Heterogeneous architectures (mainstream CPUs with accelerators / co-processors) are expected to become more prevalent in high performance computing clusters. This paper deals specifically with attaining efficient execution on nodes which combine Intel's multicore Sandy Bridge chips with MIC manycore chips. The architecture and software stack for Intel's heterogeneous computation nodes attempt to make migration from the now common multicore chips to the manycore chips straightforward. However, specific execution characteristics are favored by these manycore chips such as making use of the wider vector instructions, minimal inter-thread conflicts, etc. Additionally manycore chips have lower clock speed and no unified last-level cache. As a result, and as we demonstrate in this paper, it will commonly be the case that not all parts of an application will execute more efficiently on the manycore chip than on the multicore chip. This paper presents a process, based on measurements of execution on Westmere-based multicore chips, which can accurately predict which code segments will execute efficiently on the manycore chips and illustrates and evaluates its application to three substantial full programs -- HOMME, MOIL and MILC. The effectiveness of the process is validated by verifying scalability of the specific functions and loops that were recommended for MIC execution on a Knights Ferry computation node.
Abstract: The XSEDE Campus Bridging team has been facilitating a pilot program for the Global Federated File System software at 2 XSEDE test sites and 4 pilot sites, with the goal of making use of the GFFS software on campus and at XSEDE resources in order to share data and facilitate computational workflows for research. Representatives from the pilot sites will discuss their use cases and requirements for GFFS and their fit with the use cases developed by the XSEDE Campus Bridging and Architecture and Design teams.
Panel Moderator:
- Jim Ferguson, National Institute for Computational Sciences, University of Tennessee
Panel Participants:
- Guy Almes, Texas A&M University
- Toby Axelsson, University of Kansas
Panel: Managing the Long Tail of Science: Data and Communities
Abstract: Long tail statistical distributions have been used for years to describe service time distributions in queueing theory and network performance decay. The first reference to the long tail in connection with data appears to be a 1992 article that refers to long tail kinetics, the variety of data from complex, disordered materials that cannot be described by conventional kinetics. The application of the long tail that gave the term its wide popularity is to online business. Chris Anderson, in his 2004 Wired magazine article, proposed first that merchandise assortments can grow because goods are not limited by shelf space, and second, that online venues change the demand curve because consumers value niche products. These complementary forces result in a tail that steadily grows not only longer, as more obscure products are made available, but also fatter, as consumers discover products better suited to their tastes. How is the long tail interpreted for scientific and scholarly data? And perhaps more importantly, what is the growth path of the long tail? Will it too grow fatter as “consumers” discover products better suited to their tastes? The long tail in science has been variously used to refer to scientific data and to the scientific communities that produce the data. The long tail is many creators with small amounts of data. The scientific research making up long tail data can be regional or localized – a population survey of a minority group, or the study of a coastline basin. Where some scientific communities coalesce around community models, big instruments, or community repositories, long tail communities generally have not. Today much of the long tail data is lost as an individual retires or leaves science. Additionally, the methods and tools used within a long tail community are varied, the data are distinct, and the challenges for preservation are great. The implications and opportunities for managing the long tail will be discussed by this panel of distinguished experts:
Panel moderator: Beth Plale, Professor of Computer Science, Director, Data To Insight Center, Managing Director, Pervasive Technology Institute, Indiana University
Panel participants:
Geoffrey Fox, FutureGrid PI, and Professor of Computer Science and Informatics, Indiana University
Nassib Nassar, Senior Research Scientist, RENCI, University of North Carolina at Chapel Hill
John Kunze, Associate Director, University of California Curation Center in the California Digital Library
Anne Thessen, Data Conservancy, Marine Biological Lab, Woods Hole
NOTE: The list of panelists was revised on Monday, July 16.
Abstract: With the world moving to web-based tools for everything from photo sharing to research publication, it’s no wonder scientists are now seeking online technologies to support their research. But the requirements of large-scale computational research are both unique and daunting: massive data, complex software, limited budgets, and demand for increased collaboration. While “the cloud” promises to alleviate some of these pressures, concerns about feasibility still exist for scientists and the resource providers that support them.
This panel will explore the capacity of Software as a Service (SaaS) to transform computational research so the challenges above can be leveraged to advance, not hinder, innovation and discovery. Leaders from each constituency of a scientific research environment (investigator, campus champion, supercomputing facility, SaaS provider) will debate the feasibility of SaaS-based research, examining the delta between current and desired state from a technology and adoptability perspective. We will explore the delta between where we are – and where we need to be – for scientists to reliably and securely perform research in the cloud.
Panel Moderator:
- Ian Foster, Computation Institute, University of Chicago and Argonne
Panel participants:
- Nancy Cox, University of Chicago
- Brock Palen, University of Michigan
- J. Ray Scott, Pittsburgh Supercomputing Center
- Steve Tuecke, University of Chicago
Abstract: The XSEDE science gateway and campus bridging programs share a mission to expand access to cyberinfrastructure, for scientific communities and campus researchers. Since the TeraGrid science gateway program began in 2003, science gateways have served researchers in a wide range of scientific disciplines, from astronomy to seismology. In its 2011 report, the NSF ACCI Task Force on Campus Bridging identified the critical need for seamless integration of cyberinfrastructure from the scientist’s desktop to the local campus, to other campuses, and to regional, national, and international cyberinfrastructure.
To effectively expand access to cyberinfrastructure across communities and campuses, XSEDE must address security challenges in areas such as identity/access management, accounting, risk assessment, and incident response. Interoperable authentication, as provided by the InCommon federation, enables researchers to conveniently "sign on" to access cyberinfrastructure across campus and across the region/nation/world. Coordinated operational protection and response, as provided by REN-ISAC, maintains the availability and integrity of highly connected cyberinfrastructure. Serving large communities of researchers across many campuses requires security mechanisms, processes, and policies to scale to new levels.
This panel will discuss the security challenges introduced by science gateways and campus bridging, potential approaches for addressing these challenges (for example, leveraging InCommon and REN-ISAC), and plans for the future. Panelists will solicit requirements and recommendations from attendees as input to future work.
Panel Moderator:
- Jim Basney, University of Illinois
Panel participants:
- Randy Butler, University of Illinois
- Dan Fraser, Argonne National Laboratory
- Suresh Marru, Indiana University
- Craig Stewart, Indiana University
BOF: XSEDE User Portal, XUP Mobile and Social Media Integration
Abstract: The XSEDE User Portal (XUP) provides an integrated interface for XSEDE users to access the information and services available to them through the XSEDE project. The XUP allows users to accomplish many things, including:
• View system account information
• Log in to XSEDE resources
• Transfer files both between XSEDE resources and between their desktop and XSEDE resources
• Request allocations, and view and manage project allocation usage
• Monitor the status of HPC, storage, and visualization resources
• Access documentation and news
• Register for training
• Receive consulting support
A companion to the XUP, the XSEDE User Portal Mobile, gives users access to many of the above capabilities through a mobile interface.
The XUP team will lead a discussion designed to enhance the capabilities of the XSEDE User Portal and to improve the XSEDE User Portal mobile interface and potentially native mobile app versions. This will include exploring ideas for leveraging and potentially integrating other popular web-based services into the XUP, including social media. Social media has revolutionized how users communicate with each other and make effective use of the services available to them. The challenge is how to leverage and integrate social media to advance scientific research.
The purpose of this BoF is to collect user feedback about the current XSEDE User Portal and its mobile interface, and to discuss how to best integrate social media and other popular online capabilities into the XUP to help make XSEDE users more productive and promote the science that is accomplished through XSEDE.
BOF: A Focus Group Discussion on Software Sustainability Characteristics
Abstract: The National Science Foundation Software Infrastructure for Sustained Innovation (SI2) program solicitation states that software is "central to NSF's vision of a Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21)," and goes on to emphasize that, in general, software is essential to computational and data-enabled science. The SI2 program is one vehicle by which the NSF hopes to enable sustained and well-supported software to provide services and functionality needed by the U.S. science and engineering community. Two NSF-funded Early Concept Grants for Exploratory Research (EAGER) projects are studying and analyzing the state of comprehensive software infrastructure to provide a more accurate understanding, new insight, and increased awareness of characteristics of software sustainability. These projects complement one another in developing a methodology and strategy for conducting the studies and identifying common elements and best practices for software sustainability.
The purpose of NSF award #11129017 (PI James Bottum, Clemson University), “EAGER: A Study of the National Software Cyberinfrastructure Environment,” is to develop a framework that provides NSF with data-driven insight into the software cyberinfrastructure it has invested in since 2000. This effort will create a software CI data collection methodology and a taxonomy that inventories and categorizes NSF’s investments by software type for awards made since the year 2000. In addition, this project will conduct in-depth case study analysis that will provide an increased understanding of characteristics common to sustaining software environments. Findings from the case studies will be used to generate a set of recommendations of areas of future study for NSF consideration in developing a software sustainability strategy as well as a target audience for these analyses.
Through a combination of detailed case studies and surveys of software producers and users, NSF award # 1147606 (PI Craig Stewart, Indiana University), “EAGER: Best Practices and Models for Sustainability for Robust Cyberinfrastructure Software” is identifying best practices for the process of moving software from a "discovery" process to well-maintained and sustainable infrastructure for 21st Century science and engineering. The work is focusing in particular on the following scenario: Given a piece of software that provides interesting capabilities and a community that wants to use (and possibly contribute to the further development of) that software, what steps are necessary to transform that software from "interesting tool" to "robust and widely used element of national infrastructure, contributing to the NSF vision for CIF21?" This research will lead to greater availability of widely usable software tools and curriculum materials, increasing the quality of education in computer science, computational science, and STEM disciplines.
Speakers: Dustin Atkins, Clemson; Nathan Bohlmann, Clemson; and Julie Wernert, Indiana.
NOTE ADDED JULY 16: In conjunction with this BOF, three informal focus group discussions also will be held: Thursday, July 19, from 7:30-8:30 a.m. in the Root room, 8th floor, and during the Tuesday and Wednesday conference lunches -- Look for reserved tables. For more info, contact Julie Wernert, jwernert@iu.edu, or Nathan Bohlmann, nlb@clemson.edu.
BOF: Cloud Computing for Science: Challenges and Opportunities
Abstract: Outsourcing compute infrastructure and services has many potential benefits for scientific projects: it offers access to sophisticated resources that may be beyond the means of a single institution to acquire, allows for more flexible usage patterns, creates potential for access to economies of scale via consolidation, and eliminates the overhead of system acquisition and operation for an institution, allowing it to focus on its scientific mission. Cloud computing recently emerged as a promising paradigm for realizing such outsourcing: it offers on-demand, short-term access, which allows users to flexibly manage peaks in demand; a pay-as-you-go model, which helps save costs for bursty usage patterns (i.e., helps manage “valleys” in demand); and convenience, as users and institutions no longer have to maintain specialized IT departments. However, cloud computing also brings challenges as we seek to understand how to best leverage the paradigm.
Many scientific communities are experimenting with this new model, among others using FutureGrid resources as a testbed for initial exploration. The objective of this BOF is to focus discussion on experiences to date as well as to define challenges and priorities in understanding how cloud computing can be best leveraged in the scientific context. We plan to discuss application patterns as well as highlight and discuss the priority of the current challenges and open issues in cloud computing for science. Specifically, we will discuss the following challenges. What types of applications are currently considered suitable for the cloud and what are the obstacles to enlarging that set? What is the state of the art of cloud computing performance relative to scientific applications and how is it likely to change in the future? How would programming models have to change (or what new programming models need to be developed) to support scientific applications in the clouds? Given the current cloud computing offerings, what middleware needs to be developed to enable scientific communities to leverage clouds? How does cloud computing change the potential for new attacks and what new security tools and mechanisms will be needed to support it? How can we facilitate the transition to this new paradigm for the scientific community; what needs to be done/established first? Depending on the profile of attendance, we expect the last question in particular to form a substantial part of the discussion.
The BOF will be structured as follows. We will begin with a short structured talk session, led by the organizers, that will summarize and update several previous discussions on this topic, notably the MAGIC meetings in September, April and May as well as several parallel developments that took place in the scientific context such as the Magellan report, cloud-related experimentation status on the FutureGrid project, and application activity. The second session of the BOF will be devoted to the discussion, elaboration, and prioritization of the challenges listed above. Finally, we will address the prioritization and shape of concrete transition measures. The time allocated to the last two issues will depend on the structure of the attendance; if we can get feedback from XSEDE users we will emphasize the transition measures, if we attract CS practitioners we will focus on technical challenges.
BOF by Gold Sponsor - Penguin Computing. Discussion lead: Matt Jacobs
BOF: Gordon Data Intensive HPC System
Abstract: This BOF will give current and prospective users the background on Gordon’s architecture, examples of application results obtained on Gordon, and insights in the types of problems that may be well-suited for Gordon. A brief presentation on these topics will help orient the audience, but the majority of the session will be dedicated to a discussion with attendees about their applications and how they might take advantage of Gordon’s unique architectural features. The goal of the BOF is to begin building a community of data intensive application users who can share best practices, and begin to learn from one another. Users from non-traditional HPC domains such as the humanities, social science, and economics are encouraged to attend. This will also be a good opportunity for Campus Champions to learn about Gordon so they are well-prepared to engage their local communities.
BOF: MapReduce and Data Intensive Applications
Abstract: We are in the era of data deluge, and future success in science depends on the ability to leverage and utilize large-scale data. This proposal follows up on our successful first meeting in this series, “MapReduce Applications and Environments,” at TeraGrid 2011. Further, we will use it to kick-start an XSEDE forum. It aligns directly with several NSF goals, including the Cyberinfrastructure Framework for 21st Century Science and Engineering (CF21) and Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA). In particular, MapReduce-based programming models and run-time systems such as the open-source Hadoop system have increasingly been adopted by researchers in the HPC, Grid and Cloud communities with data-intensive problems, in areas including bioinformatics, data mining and analytics, and text processing. While MapReduce run-time systems such as Hadoop are currently not supported across XSEDE systems (Hadoop is available on some systems, including FutureGrid), there is increased demand for these environments from the science community. This BOF session will provide a forum for discussions with users on challenges and opportunities for the use of MapReduce as an interoperable framework on HPC, Grid and Cloud.
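For attendees new to the model, the map/shuffle/reduce pattern can be illustrated with a minimal pure-Python word count; Hadoop and similar run-time systems apply the same pattern across a cluster rather than within one process (the documents below are invented):

```python
# Minimal illustration of the MapReduce programming model: word count.
from collections import defaultdict

docs = ["big data on xsede", "mapreduce on hpc and cloud", "big data mining"]

# Map: emit (key, value) pairs.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle: group values by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce: combine each key's values.
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts)
```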
BOF: Supporting Genomics and other Biological Research
Abstract: Are you spending more time and resources working with biologists? Are you inundated with ‘omics: genomics, proteomics, metabolomics – what about ‘seq’s: RNA-seq, ChIP-seq, and who-knows-what-seq? How is that working out for you? Want to talk about it?
Increasingly biologists and biomedical researchers are interacting with high performance computing resources. This session is for HPC professionals in campus bridging, people tasked with recruiting and interacting with biologists, and/or systems administrators who are supporting a growing number of bioinformatic applications. We will consider how to meet the computing needs of life scientists by discussing topics such as using Galaxy to support HPC clusters, archiving raw biological data, and dealing with the heterogeneity of bioinformatic applications.
Poster session on balcony
Display of posters and Student Poster Competition on balcony
Enjoy light refreshments, coffee, tea and water prior to starting your day.
In conjunction with the Wed. afternoon BOF, three informal focus group discussions will be held: Thursday, July 19, from 7:30-8:30 a.m. in the Root room, 8th floor, and during the Tuesday and Wednesday conference lunches -- Look for reserved tables. For more info, contact Julie Wernert, jwernert@iu.edu, or Nathan Bohlmann, nlb@clemson.edu.
Science: Massively parallel direct numerical simulations of forced compressible turbulence: a hybrid MPI/OpenMP approach
Abstract: A highly scalable simulation code for turbulent flows which solves the fully compressible Navier-Stokes equations is presented. The code, which supports one, two and three dimensional domain decompositions, is shown to scale well on up to 262,144 cores. Introducing multiple levels of parallelism based on distributed message passing and shared-memory paradigms results in a reduction of up to 33% of communication time at large core counts. The code has been used to generate a large database of homogeneous isotropic turbulence in a stationary state created by forcing the largest scales in the flow. The scaling of spectra of velocity and density fluctuations is presented. While the former follow classical theories strictly valid for incompressible flows, the latter present a more complicated behavior. Fluctuations in velocity gradients and derived quantities exhibit extreme though rare fluctuations, a phenomenon known as intermittency. The simulations presented provide data to disentangle Reynolds and Mach number effects.
Software: What Is Campus Bridging and What is XSEDE Doing About It?
Abstract: The term “campus bridging” was first used in the creation of an NSF Advisory Committee for Cyberinfrastructure task force. That task force eventually arrived at the following description of campus bridging:
“Campus bridging is the seamlessly integrated use of cyberinfrastructure operated by a scientist or engineer with other cyberinfrastructure on the scientist’s campus, at other campuses, and at the regional, national, and international levels as if they were proximate to the scientist, and when working within the context of a Virtual Organization (VO) make the ‘virtual’ aspect of the organization irrelevant (or helpful) to the work of the VO.”
That definition and the task force report detail many things that could conceivably be done under the rubric of campus bridging.
But unlike other topics such as software or data, there is little ability to point to something and say, “Aha, there is a campus bridge.” Campus bridging is more a viewpoint and a set of usability, software, and information concerns that should inform everything done within XSEDE and the more general NSF strategy Cyberinfrastructure for 21st Century Innovation.
In this paper we outline several specific use cases of campus bridging technologies that have been identified as priorities for XSEDE in the next four years, ranging from documentation to software used entirely outside of XSEDE to software that helps bridge from individual researcher to campus to XSEDE cyberinfrastructure.
Tech: Evaluation of Parallel and Distributed File System Technologies for XSEDE
Science: High Accuracy Gravitational Waveforms from Black Hole Binary Inspirals Using OpenCL
Abstract: There is a strong need for high-accuracy and efficient modeling of extreme-mass-ratio binary black hole systems (EMRIs) because these are strong sources of gravitational waves that would be detected by future observatories. In this article, we present sample results from our Teukolsky EMRI code: a time-domain Teukolsky equation solver (a linear, hyperbolic, partial differential equation solver using finite-differencing), that takes advantage of several mathematical and computational enhancements to efficiently generate long-duration and high-accuracy EMRI waveforms.
We emphasize here the computational advances made in the context of this code. Currently there is considerable interest in making use of many-core processor architectures, such as Nvidia and AMD graphics processing units (GPUs), for scientific computing. Our code uses the Open Computing Language (OpenCL) to take advantage of the massive parallelism offered by modern GPU architectures. We present the performance of our Teukolsky EMRI code on multiple modern processor architectures and demonstrate the high level of accuracy and performance it is able to achieve. We also present the code's scaling performance on a large supercomputer, i.e., NSF's XSEDE resource Keeneland.
Software: Campus Bridging Made Easy via Globus Services
Tech: Using Kerberized Lustre Over the Wide Area Network for High Energy Physics Data
Abstract: This paper reports the design and implementation of a secure, wide area network, distributed filesystem by the ExTENCI project, based on the Lustre filesystem. The system is used for remote access to and analysis of data from the CMS experiment at the Large Hadron Collider (LHC), and from the Lattice Quantum ChromoDynamics (LQCD) project. Security is provided by Kerberos, with additional fine-grained control based on Lustre ACLs and quotas. We show the impact of Kerberos on the I/O rates of CMS and LQCD applications on client nodes, both real and virtual.
Science: A High-Throughput Workflow Environment for Cosmological Simulations
Abstract: The cause of cosmic acceleration remains an important unanswered question in cosmology. The Dark Energy Survey (DES) is a joint DoE-NSF project that will perform a sensitive survey of cosmic structure traced by galaxies and quasars across 5000 sq deg of sky. DES will be the first project to combine four different methods (supernova brightness, the acoustic scale of galaxy clustering, the population of groups and clusters of galaxies, and weak gravitational lensing) to study dark matter, dark energy, and departures from general relativistic gravity via evolution of the cosmic expansion rate and growth rate of linear density perturbations. Realizing the full statistical power of this and complementary surveys requires support from cosmological simulations to address the many potential sources of systematic error, particularly errors that are shared jointly across the tests of cosmic acceleration using cosmic structure.
We are coordinating a Blind Cosmology Challenge (BCC) process for DES, in which a variety of synthetic sky realizations in different cosmologies will be analyzed, in a blind manner, by DES science teams. The BCC process requires us to generate a suite of roughly 50 2048^3-particle N-body simulations that sample the space-time structure in a range of cosmic volumes. These simulations are dressed with galaxies, and the resulting catalog-level truth tables are then processed with physical (e.g., gravitational lensing) and telescope/instrument effects (e.g., survey mask) before their release to science teams. We describe here our efforts to embed control of the catalog production process within a workflow engine that employs a service-oriented architecture to manage XSEDE job requests. We describe the approach, including workflow tests and extensions, and present first production results for the N-body portion of the workflow. We propose future extensions aimed toward a science gateway service for astronomical sky surveys.
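The deliberately simplified Python sketch below conveys the shape of the workflow idea described above: a driver walks each synthetic-sky realization through a pipeline of stages and hands each stage to a submission backend. The stage names, the SubmitBackend class, and the in-memory "submission" are hypothetical stand-ins for the paper's service-oriented XSEDE job management, not its actual implementation.

    # Toy workflow driver: advance each realization through its pipeline stages.
    from dataclasses import dataclass, field

    STAGES = ["nbody", "galaxy_catalog", "lensing_and_mask"]

    @dataclass
    class Realization:
        name: str
        completed: list = field(default_factory=list)

        def next_stage(self):
            for stage in STAGES:
                if stage not in self.completed:
                    return stage
            return None

    class SubmitBackend:
        """Stand-in for a service that would translate stage requests into
        XSEDE job submissions; here it just records and 'completes' them."""
        def run(self, realization, stage):
            print(f"submitting {realization.name}:{stage}")
            realization.completed.append(stage)

    def drive(realizations, backend):
        # Keep advancing every realization until all stages are done.
        pending = list(realizations)
        while pending:
            for r in list(pending):
                stage = r.next_stage()
                if stage is None:
                    pending.remove(r)
                else:
                    backend.run(r, stage)

    if __name__ == "__main__":
        sims = [Realization(f"bcc_box{i:02d}") for i in range(3)]
        drive(sims, SubmitBackend())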
Software: The Anatomy of Successful ECSS Projects: Lessons of Supporting High-Throughput High-Performance Ensembles on XSEDE
Abstract: The Extended Collaborative Support Service (ECSS) of XSEDE is a program to provide support for advanced user requirements that cannot and should not be supported via a regular ticketing system. Recently, two ECSS projects have been awarded by XSEDE management to support the high-throughput of high-performance (HTHP) molecular dynamics (MD) simulations; both of these ECSS projects use a SAGA-based Pilot-Jobs approach as the technology required to support the HTHP scenarios. Representative of the underlying ECSS philosophy, these projects were envisioned as three-way collaborations between the application stakeholders, the advanced/research software development team, and the resource providers. In this paper, we describe the aims and objectives of these ECSS projects, how the deliverables have been met, and some preliminary results obtained. We also describe how SAGA has been deployed on XSEDE in a Community Software Area as a necessary precursor for these projects.
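The fragment below sketches only the lowest layer beneath the Pilot-Jobs approach mentioned above: plain job submission through the SAGA-Python bindings as we understand them (the Pilot-Job layer itself is not shown). The endpoint URL, executable, and file names are placeholders, and attribute names may differ between SAGA-Python releases.

    # Minimal SAGA-style job submission sketch; assumes the saga-python
    # package is installed. Real ECSS runs would target an XSEDE machine
    # through the appropriate adaptor and credentials, not fork://localhost.
    import saga

    def submit_md_task():
        service = saga.job.Service("fork://localhost")   # placeholder endpoint

        jd = saga.job.Description()
        jd.executable = "/bin/echo"          # stand-in for an MD engine binary
        jd.arguments = ["ensemble-member-0"]
        jd.output = "md_task.out"
        jd.error = "md_task.err"

        job = service.create_job(jd)
        job.run()
        job.wait()                           # block until the task finishes
        print("final state:", job.state)

    if __name__ == "__main__":
        submit_md_task()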
Tech: The Data Supercell
Abstract: In April of 2012, the Pittsburgh Supercomputing Center unveiled a unique mass storage platform named 'The Data Supercell'. The Data Supercell (DSC) is based on the SLASH2 filesystem, also developed at PSC, and incorporates multiple classes of systems into its environment for the purposes of aiding scientific users and storage administrators.
The Data Supercell aims to play a major role in the XSEDE storage ecosystem. Besides serving the conventional role of PSC's mass storage system, DSC exposes the novel features of the SLASH2 filesystem. Outfitted with these features, DSC seeks to provide a new class of integrated storage services to users of large scientific data and to XSEDE resource providers.
This submission will cover the major aspects of the DSC, including:
* Software and Hardware Architecture
Here we explain the types of storage systems that compose the DSC, with emphasis on the heterogeneous nature of the assembly. In particular, the SLASH2 I/O service is highly portable across system classes and operating systems. We will detail how this feature was instrumental in constructing the DSC by enabling the inclusion of a legacy tape system alongside dense storage bricks running ZFS.
* Performance Analysis of DSC
The community will be interested in the performance of any new distributed/parallel filesystem. Since DSC is the flagship SLASH2 deployment at this time, we shall disseminate I/O performance measurements for data and metadata operations through this submission.
* Novel features such as File Replication and Poly-residencies
At this time, DSC is primarily used as PSC's mass storage system; however, the system has interesting capabilities that extend beyond the features of a traditional archiver. Among these is the ability to move data in parallel between a scratch filesystem (e.g., Lustre or GPFS) and the highly dense storage nodes; a schematic sketch of such a user-driven parallel transfer appears after this list. Further, such features can be invoked by ordinary users of the system, allowing them to transfer data between mass storage and (any) parallel filesystem with exceptional performance.
SLASH2's file replication capabilities allow users and administrators to determine the layout, residency, and number of replicas on a per-block basis or for a whole file. Our paper will illuminate these capabilities as used on the DSC.
* Upcoming integrated scientific cloud data services
Here we shall describe how existing and upcoming SLASH2 features will be used to aid XSEDE's large-data users. The focus is on users and/or research groups with considerable on-campus storage and compute resources that frequently operate on XSEDE resources. The section will describe in detail the vision of incorporating data replication, user-specific eventually consistent metadata volumes, data multi-residency, and system-managed parallel replication for creating tightly integrated storage environments between large-scale campus and XSEDE RP resources.
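The Python sketch referenced in the file-replication item above is a schematic illustration only; it is not SLASH2's mechanism. It shows the general idea of user-driven parallel data movement between a scratch filesystem and a mass-storage staging area using multiple concurrent workers. Paths and worker counts are hypothetical.

    # Schematic parallel copy from a scratch area to a staging area
    # (illustrative only; not SLASH2). Paths below are placeholders.
    import shutil
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    SCRATCH = Path("/scratch/project/run42")       # hypothetical scratch area
    ARCHIVE = Path("/archive/project/run42")       # hypothetical staging area

    def replicate(src: Path) -> Path:
        dest = ARCHIVE / src.relative_to(SCRATCH)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)                    # preserves timestamps/modes
        return dest

    def replicate_tree(workers: int = 8):
        files = [p for p in SCRATCH.rglob("*") if p.is_file()]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for dest in pool.map(replicate, files):
                print("replicated ->", dest)

    if __name__ == "__main__":
        replicate_tree()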
General Session: Campus Champion Panel
Awards Luncheon and Closing Speaker Steven Reiner