CSM #1 - Secure Communications
CSM #2 - Quantum Physics
CSM #3 - Regional climate change projection
CSM #4 - Reconstructed climate fields
CSM #5 - Java Tutorials for Software Engineering
CSM #6 - BPC Tech Club
Data Verity #1
Data Verity #2
Los Alamos #1 - IO Traces
Los Alamos #2 - OpenSpeedShop Usability
Los Alamos #3 - Resilient License Serving
Los Alamos #4 - HPC Consistent User Environment
Los Alamos #5 - Parallel I/O Performance Measurement & Testing project
Spatial Corporation
Sun Microsystems #1 and #2 (OS Projects)
Toilers #1 - Visualizer for Event Tracking
Toilers #2 - Wireless Sensor Data Acquisition System for Underground Contaminant Monitoring
Toilers #3 - iNSpect Visualization Tool
USGS - ShakeMap GUI
Background Information:
In mathematical physics it is common to model the time evolution of certain types of microscopic and macroscopic phenomena, where quantum effects are non-negligible, by the linear and nonlinear Schrödinger equations, respectively. Currently, there are various numerical methods that can approximate the solutions to these equations under various conditions. In this research we take a more focused approach and seek to approximate strict upper bounds on the ground-state solutions of each equation for arbitrary conditions. To do this we will use the principle of least action from the calculus of variations to determine the weights/coefficients of the eigenbasis expansion of our bounding functions.
Project Goals:
The initial goal of this project is to define, using the model equations and variational principles, a set of approximating equations and an algorithm to evaluate them. After this, an evaluation routine should be constructed, using standard mathematical packages, that reports the data both graphically and numerically. The output of this routine will then be compared to standard results and assessed for numerical accuracy. Time permitting, multiple physically motivated cases will be analyzed. Lastly, a final report summarizing the theoretical developments, implementation, error analysis, and usage should be created for the end user and future development.
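To make the variational approach concrete, here is a minimal Python sketch (our illustration, not project code) of a Rayleigh-Ritz calculation for a one-dimensional linear Schrödinger problem: the trial function is expanded in a particle-in-a-box sine basis and the Rayleigh quotient is minimized over the expansion coefficients, which reduces to an eigenvalue problem whose lowest eigenvalue is an upper bound on the ground-state energy. The harmonic potential, box size, and basis size below are placeholder choices.

    # Illustrative sketch only: Rayleigh-Ritz upper bound on the ground-state
    # energy of a 1-D linear Schroedinger equation (hbar = m = 1).  The
    # potential, box size, and basis size are placeholders, not project values.
    import numpy as np
    from scipy.linalg import eigh

    L = 20.0                      # box [-L/2, L/2] with hard walls
    N = 40                        # number of sine basis functions
    M = 2000                      # quadrature points
    x = np.linspace(-L / 2, L / 2, M)
    V = 0.5 * x**2                # example potential: harmonic oscillator

    def basis(n, x):
        """n-th particle-in-a-box eigenfunction, already normalized."""
        return np.sqrt(2.0 / L) * np.sin(n * np.pi * (x + L / 2) / L)

    # Hamiltonian matrix H_mn = <phi_m| -1/2 d^2/dx^2 + V |phi_n>.
    # The kinetic part is diagonal in this basis; the potential part is
    # evaluated by simple trapezoidal quadrature.
    H = np.zeros((N, N))
    for m in range(1, N + 1):
        for n in range(1, N + 1):
            pot = np.trapz(basis(m, x) * V * basis(n, x), x)
            kin = 0.5 * (n * np.pi / L) ** 2 if m == n else 0.0
            H[m - 1, n - 1] = kin + pot

    # Minimizing the Rayleigh quotient over the coefficient vector is
    # equivalent to an eigenvalue problem in this orthonormal basis: the
    # smallest eigenvalue is an upper bound on the true ground-state energy.
    energies, coeffs = eigh(H)
    print("Variational upper bound on E0:", energies[0])   # exact value is 0.5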
Skills Required:
CSM #3 - Regional climate change projection
Background information
Recent work on probabilistic climate change projections has focused
mainly on the evolution of global mean temperature. However, impacts
and adaptations are determined mostly by local climate change and thus
require a quantitative picture of the expected change on regional and
seasonal scales. Global fields of probabilistic projections of future
climate change are available from a multivariate Bayesian analysis.
Project Goals
Starting from the global fields of future climate change, the goal is
to calculate and visualize regional climate change projections by
means of a web interface. More specifically, the user should be able
to choose elements such as region, emission scenario, time frame,
quantiles, or thresholds, etc. The web interface should communicate
with the statistical software package R that extracts and summarizes
the required quantities and creates summary statistics and plots.
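The back end itself is to be written in R; purely as an illustration of the kind of extraction and summary step the web interface would trigger, here is a small Python sketch that computes area-weighted regional quantiles from a hypothetical gridded ensemble of projected temperature change (the array, grid, and region box are made up).

    # Illustration only: the actual back end will be written in R.  This
    # sketch shows the kind of regional summary the web interface would
    # request: quantiles of projected temperature change over a chosen region.
    # The ensemble array, coordinates, and region box are hypothetical.
    import numpy as np

    rng = np.random.default_rng(0)
    lats = np.linspace(-88.75, 88.75, 72)
    lons = np.linspace(-178.75, 178.75, 144)
    # Hypothetical posterior ensemble: (sample, latitude, longitude) of
    # projected change for one emission scenario and time frame.
    delta_t = rng.normal(loc=2.0, scale=0.8, size=(500, lats.size, lons.size))

    def regional_quantiles(field, lats, lons, lat_range, lon_range, probs):
        """Area-averaged change over a lat/lon box, summarized by quantiles."""
        lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
        lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
        box = field[:, lat_mask, :][:, :, lon_mask]
        # Weight grid cells by cos(latitude) so the average respects cell area.
        w = np.cos(np.deg2rad(lats[lat_mask]))[None, :, None]
        regional_mean = (box * w).sum(axis=(1, 2)) / (w.sum() * lon_mask.sum())
        return np.quantile(regional_mean, probs)

    # Example request: western Europe, 5%/50%/95% quantiles.
    print(regional_quantiles(delta_t, lats, lons, (35, 60), (-10, 20),
                             [0.05, 0.5, 0.95]))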
Skills Required
CSM #4 - Reconstructed climate fields
Background information
A major contribution of the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF) is the ``re-analysis'' climate fields. The re-analysis consists of blending sparse past observations with a numerical weather prediction (NWP) model to derive ``best guess'' (also called optimal interpolation, or filtered) fields. The construction is a very time-consuming computing process. There exist several methodologies that furnish a similar product with a fraction of the computing time.
Project Goals
Our goal is to refine one of those existing approaches, called aggregation-kriging, which consists of blending a primary variable (sparse gridded observations) with several secondary variables (atmosphere-ocean general circulation model (AOGCM) data). It is important to note that the AOGCM temperature fields represent one possible realization of a specific monthly temperature field and are generally biased. The project team needs to discuss the differences between the observed and simulated fields as well as to identify and quantify the model bias.
Each secondary variable has 2520 observations, so straightforward implementations of aggregation-kriging are computationally challenging. However, the use of sparse matrix algebra allows the calculations to be carried out on ordinary desktop computers. Starting from existing R code, the project team needs to write R code that will handle the massive datasets in a reasonable way.
The current implementation is based on a stationary and isotropic spherical covariance structure, an assumption which needs to be relaxed.
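To make the sparse-matrix point concrete, here is an illustrative sketch (in Python; the project itself extends existing R code) of a simple kriging prediction with the compactly supported spherical covariance mentioned above, so that the covariance matrix can be stored and solved as a sparse matrix. The locations, covariance parameters, and data are made up.

    # Illustration only: the project works in R.  This sketch shows why a
    # compactly supported (spherical) covariance makes kriging feasible on a
    # desktop: the covariance matrix is sparse and can be solved with sparse
    # linear algebra.  Locations, range, sill, and data are made-up values.
    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.sparse import lil_matrix, csc_matrix
    from scipy.sparse.linalg import spsolve

    RANGE, SILL, NUGGET = 0.1, 1.0, 1e-6      # made-up covariance parameters

    def spherical_cov(h):
        """Spherical covariance: exactly zero beyond the range, hence sparsity."""
        c = SILL * (1.0 - 1.5 * h / RANGE + 0.5 * (h / RANGE) ** 3)
        return np.where(h < RANGE, c, 0.0)

    rng = np.random.default_rng(1)
    n = 2520                                   # observations per secondary variable
    sites = rng.uniform(size=(n, 2))           # hypothetical observation locations
    y = np.sin(4 * sites[:, 0]) + 0.1 * rng.normal(size=n)   # toy data

    # Build the sparse covariance matrix using only pairs within the range.
    tree = cKDTree(sites)
    C = lil_matrix((n, n))
    for i, j in tree.query_pairs(r=RANGE):
        C[i, j] = C[j, i] = float(spherical_cov(np.linalg.norm(sites[i] - sites[j])))
    C.setdiag(SILL + NUGGET)                   # small nugget for stability
    C = csc_matrix(C)

    # Simple kriging prediction at one new location x0
    # (using the sample mean as a stand-in for the known mean).
    x0 = np.array([0.5, 0.5])
    c0 = spherical_cov(np.linalg.norm(sites - x0, axis=1))
    weights = spsolve(C, c0)
    print("Kriging prediction at x0:", weights @ (y - y.mean()) + y.mean())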
Skills Required
The Colorado School of Mines (CSM) has received funding to develop a sequence of educational experiences designed to attract and retain female students in computer science. One of these experiences will be creating an animatronic creature for use in a video production project at the high school level. Another segment of the project will be retrofitting a robot with a wireless sensor and creating an applet for interacting with it. There are additional projects available that may be added or substituted if needed.
Project Requirements:
This project requires knowledge of Java and a desire to have fun. Experience in robotics may be helpful, but is not required.
There may be an opportunity for one or more students to continue lesson development throughout the summer with funding.
Data Verity is a retail consulting firm that provides its clients with different types of reports and training. Our ESP system (programmed in MySQL, PHP, Javascript and some Java) provides clients with online data collection and reporting. We are looking for innovative ways to analyze and forecast client behavior and success.
Project Objectives:
Create and implement a versatile, automated statistics package which:
Data Verity is a retail consulting firm that provides its clients with different types of reports and training. Our ESP system (programmed in MySQL, PHP, Javascript and some Java) provides clients with online data collection and reporting. The user interface is in need of improvement via Javascript.
Project Objectives:
Los Alamos National Laboratory is home to many of the world's fastest supercomputers. Our Storage and I/O group at LANL is responsible for the maintenance of current storage systems as well as the design of future ones. This work requires an in-depth knowledge of how storage systems are used by the applications. Our group has developed trace mechanisms to capture the I/O usage of large parallel jobs.
There is a large community of storage vendors and storage researchers who are very interested in the I/O patterns of LANL applications due to their very large scale. Unfortunately, many of the applications running on LANL supercomputers contain restricted information which cannot be publicized. Therefore, there is a great need for utilities to turn our classified traces into something useful for the public audience. We would like to solicit CSM students for a Field Session whose aim is to solve this tricky problem. Specifically, we would like these students to use some previously collected unclassified traces that we have available and create a replay application. Further, the students would use our trace mechanism to capture the I/O patterns of their new replay application. Finally, the students would use our visualization tool (or some other mechanism of their own devising) to verify how accurately the replay application mirrors the I/O patterns of the original application.
This work requires knowledge of C/C++, parallel computing, and possibly Perl or some other scripting language. Publicly available LANL traces can be studied to learn more about this project. They are available here: http://institute.lanl.gov/data/tdata/
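A minimal sketch of the replay idea follows, assuming a simplified, hypothetical text trace format (elapsed time, operation, file, offset, byte count per line); the real LANL trace format is defined by the published traces, and the production replay tool is expected to be written in C/C++ with MPI.

    # Minimal replay sketch.  The trace format assumed here (time op file
    # offset nbytes, whitespace-separated) is hypothetical; the real LANL
    # traces at http://institute.lanl.gov/data/tdata/ define their own format.
    import os
    import sys
    import time

    def replay(trace_path):
        handles = {}
        start = time.time()
        with open(trace_path) as trace:
            for line in trace:
                t, op, path, offset, nbytes = line.split()
                t, offset, nbytes = float(t), int(offset), int(nbytes)
                # Preserve the original timing by sleeping until the recorded
                # elapsed time for this operation has passed.
                delay = t - (time.time() - start)
                if delay > 0:
                    time.sleep(delay)
                if op == "open":
                    handles[path] = os.open(path, os.O_RDWR | os.O_CREAT)
                elif op == "write":
                    os.pwrite(handles[path], b"\0" * nbytes, offset)
                elif op == "read":
                    os.pread(handles[path], nbytes, offset)
                elif op == "close":
                    os.close(handles.pop(path))

    if __name__ == "__main__":
        replay(sys.argv[1])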
Programming models are being revisited in light of computer architectures moving to heterogeneous or multi-core implementations. This has increased the need for multi-layer MPI implementations, use of threads below the node level, and other possibilities. Computer scientists at LANL have become intimately involved in the issues and possible approaches during the past year while working with the Roadrunner Cell hybrid architecture. One of the things that they have found is that the core structure of most applications has to be better understood before any transitions to new architectures can be made. Analysis tools are needed to assess time, I/O, memory usage, etc., in order to identify possible approaches to move specific data, functions, and memory patterns to operate on appropriate segments of the processor architecture.
We are collecting the various approaches and tools that are being used, with the goal of identifying how to support the transition process, as well as the types of plugins that could further add value. In this context, we are also working with the Open|SpeedShop tool framework, home page: http://www.openspeedshop.org/
The focus is to validate the tool framework and its results, and compare them against the actual needs of the individuals doing the application assessment. Possible outcomes include modifications to the individual tools, to the usability of the framework, and/or to the analysis process. Scope ranges from individual use up to collective processes for larger audiences, as this type of programming transition becomes more mainstream.
Students will install the Open|SpeedShop Performance Analysis Tool on a compatible cluster, and assess usability, from a user perspective, for serial jobs (no MPI). Outcomes are (in increasing order of complexity): identify gaps in documentation (from initial installation through tool usage); assess ease of use, including identifying non-intuitive operations; evaluate the quality of the experiment results from the various built-in plugin analysis tools (more difficult).
The students must have basic programming skills and a willingness to work on challenging and open-ended problems. They will need to run programs through the Open|SpeedShop framework and identify usability issues. An optional deliverable is to validate the experiments to check reported results. This will also require some ability to think critically in terms of designing tests and scenarios that will exercise usage models. The initial focus is on serial programs. Later usage tests will go into MPI runs on clusters. A machine running a recent release of Linux (RH, SUSE, or Fedora) will be sufficient for non-MPI tests.
More information about this project can be found here.
The Los Alamos HPC environment includes 7 (soon to be 8) major computing clusters. The user environment for large-scale scientific computing includes software tools (compilers and debuggers) that require special licensing. Currently, a single machine “serves” product licenses to a set of clusters and their users (“clients”), using FLEXnet, a standard management tool. If any problem occurs on that machine, the compilers and debuggers cannot be used by any users until the problem is fixed.
Licensed software is tightly coupled to its server. For example, compiler version 6 must be served, using version 6 keys, by FLEXnet version 10. Upgrading the compiler to version 7 requires version 7 keys, which can only be handled by FLEXnet version 11. Our environment often requires continued availability of version 6 while version 7 is being evaluated. This is difficult, if not impossible, to accommodate in a single-server configuration.
Goal:
The Ptools team in the HPC-3 group at LANL would like to investigate additional product management functions available in the FLEXnet tool. Specifically:
The students should have: basic scripting skills and a willingness to work on detail-oriented tasks. Quorum server configurations require 3 machines; LANL is running FLEXnet on Linux machines, but the package is available for a wide variety of platforms, including Windows, Macintosh, and most UNIX/Linux variants, any of which would be acceptable for this project. The reporting functions require a web server.
Additional information about this project can be found
here.
The Los Alamos HPC environment includes 7 (soon to be 8) major computing clusters. The user environment for large-scale scientific computing requires multiple software tools – compilers, debuggers, special-purpose libraries, performance analysis tools, etc. The Ptools Team in the HPC-3 group at LANL provides basic support for these tools. Our goal is to provide exactly the same software environment on each of the major clusters (except where architecture or other technical factors prevent it).
Vendors are constantly revising and updating their products, sometimes as often as monthly! Thus, maintenance is never-ending. Each HPC cluster has a different maintenance schedule, so the software environments can drift. The problem is further compounded by the fact that each cluster has multiple sub-sections, and internal mechanisms can break down and cause discrepancies. Thus, users may see different behavior even within a single cluster.
Goal:
We need a tool to:
Ptools can provide sample data for this project. Optionally, students could be provided accounts on one of the clusters, to improve the collection and/or transmittal mechanisms.
Student/Mentor/Machine Requirements:
The students should have: basic scripting skills, including sed, awk, Perl, and/or Python; basic web programming skills; willingness to work on detail-oriented tasks; database skills desirable. Data manipulation and display could also involve spreadsheet, database, or other formats. A machine running a stable web server will be sufficient for most of the tasks.
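As an illustration of the kind of drift check such a tool would perform, here is a minimal Python sketch that compares per-cluster tool/version inventories and reports discrepancies; the inventory file format is an assumption, since the Ptools sample data will define the real inputs.

    # Sketch only: compares software inventories collected from several
    # clusters and reports tools whose versions drift.  The "tool version"
    # per-line file format is assumed, not the real Ptools data format.
    import sys

    def load_inventory(path):
        """Return {tool: version} from a file with 'tool version' per line."""
        inv = {}
        with open(path) as f:
            for line in f:
                tool, version = line.split()
                inv[tool] = version
        return inv

    def report_drift(paths):
        inventories = {p: load_inventory(p) for p in paths}
        all_tools = sorted(set().union(*(inv.keys() for inv in inventories.values())))
        for tool in all_tools:
            versions = {p: inventories[p].get(tool, "MISSING") for p in paths}
            if len(set(versions.values())) > 1:      # drift or missing install
                print(tool)
                for p, v in versions.items():
                    print(f"  {p}: {v}")

    if __name__ == "__main__":
        report_drift(sys.argv[1:])   # e.g. clusterA.txt clusterB.txt clusterC.txt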
More information about this project can be found
here.
The MPI-2 specification provides, for the first time, a standard high-performance I/O interface. While this has allowed I/O-intensive MPI codes to run on multi-process platforms, consistent performance levels have yet to be realized. Moreover, because of rapid changes to the software and hardware providing this capability, we need a test and validation suite based on the types of I/O currently in use at LANL.
Current I/O benchmark and testing software packages suffer from several shortcomings: they do not stress the system to the degree that our real-world applications do, they may characterize only one or a few levels of the entire I/O subsystem, and they may quickly become obsolete.
Goal:
We need to evaluate our existing performance needs and develop a measurement and testing framework that helps us to achieve an established performance level. We need to evaluate existing I/O benchmark and testing software packages, if any, or develop an in-house solution.
Students will evaluate known I/O test suites and relevant research on the topic. In conjunction with a LANL researcher, students will aid in the design and implementation of a test suite for use on LANL production systems.
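For a sense of what a minimal MPI-2 parallel I/O measurement looks like, here is an illustrative sketch using mpi4py (Python; the production tests would be in Fortran or C). Each rank writes its own block with a collective call and rank 0 reports aggregate bandwidth; the file name and block size are placeholders. It would be launched with, e.g., mpiexec -n 4 python io_test.py.

    # Illustration only: production tests will be in Fortran/C, but this
    # mpi4py sketch shows the shape of a minimal MPI-2 parallel I/O
    # measurement: every rank writes its own block collectively and the
    # aggregate write bandwidth is reported.  File name and size are placeholders.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    block = np.full(16 * 1024 * 1024 // 8, rank, dtype=np.float64)  # 16 MiB/rank

    fh = MPI.File.Open(comm, "io_test.dat",
                       MPI.MODE_CREATE | MPI.MODE_WRONLY)
    comm.Barrier()
    t0 = MPI.Wtime()
    fh.Write_at_all(rank * block.nbytes, block)   # collective write, disjoint offsets
    comm.Barrier()
    elapsed = MPI.Wtime() - t0
    fh.Close()

    if rank == 0:
        total_mib = size * block.nbytes / 2**20
        print(f"{total_mib:.0f} MiB in {elapsed:.3f} s "
              f"= {total_mib / elapsed:.1f} MiB/s aggregate")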
Student/Mentor/Machine Requirements:
The students must have basic programming skills and a willingness to work on challenging and open-ended problems. As most codes of interest to DOE/LANL are written in Fortran and C, these would be the desired language skills a student should possess. This project will also require some ability to think critically in terms of designing software test and validation code. Access to a multi-core or multi-processor machine running a recent release of Linux (RH, SUSE, or Fedora) and some version of an MPI-2 implementation is required. In addition, the machine should be able to build/run the necessary MPI-2 parallel I/O codes.
More information about this project can be found here.
Spatial provides high-performance 3D software components and services for design-centric markets. By integrating our 3D software technologies with new or existing software applications, our partners get the 3D functionality they desire while better managing development costs, optimizing resources, and decreasing time-to-market, allowing them to more closely focus on their core competencies. Learn more at www.spatial.com.
Productivity and Automation:
The goal of this department is to ensure that the tools and processes used within Spatial not only work as desired but are optimal. The Productivity and Automation team has also pioneered the use of the agile software methodology, which means we complete work in two-week iterations, work closely with an internal customer, utilize pair programming and test-driven development, and hold short daily meetings (15 minutes or less) called “stand-ups”. Therefore, the interns who take on this project will be part of two iterations and work closely with the other members of the team during all phases of work (story/task creation, testing, design, development, documentation, etc.).
Project Background:
The Productivity and Automation team has recently been tasked with rewriting our automated testing infrastructure from the ground up. The main testing mechanism is a C++ based application that is utilized by a Java wrapper, which in turn communicates with a Java “brain” module that divides the work. The “brain” module communicates with all machines doing the testing and keeps all information in a database. A web application written in JSP (that utilizes beans and servlet code) communicates with this database to allow users to configure, run, and view test results. Other technologies worth mentioning include JUnit, FitNesse, Selenium, MySQL, Python, and Apache Tomcat.
Project Objectives:
Since the time allotted for field session covers two iterations, we have come up with two objectives (one to be completed within each iteration cycle):
Sun and the OpenSolaris community have been working on Indiana, an open-source version of the Solaris operating system. Among other features, it is very much designed for a student/developer audience: it fits on a single CD and uses a Debian-like network repository to provide additional software resources.
Project #1 - Web-based ISO Creation Tool
This project is to create a Web-based application that allows users to create a custom install CD or DVD based on their specific software preferences and needs. A broadly similar example of what might be interesting is http://www.instalinux.com/cgi-bin/coe_bootimage.cgi
The specific features to be included are as yet undetermined, and ideally the students can help make those decisions.
Students working on this project should have reasonable operating system experience, plus good scripting and Web-development skills. Experience with various Linux distributions and Solaris would be beneficial.
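As a rough sketch of the web-form side of such a tool (the framework, package list, and file names here are our assumptions, and the step that actually assembles a bootable image is deliberately left as a placeholder), something like the following Flask application could collect a user's package selection and write a manifest for a later image-construction stage.

    # Front-end sketch only (Flask is our choice for illustration, not a
    # project requirement).  It collects the user's package selection and
    # writes a manifest file; assembling a bootable ISO from that manifest is
    # left as a placeholder, since it depends on OpenSolaris image tooling.
    from flask import Flask, request

    app = Flask(__name__)

    AVAILABLE_PACKAGES = ["base", "gcc", "python", "apache", "mysql"]  # placeholder list

    FORM = "".join(
        f'<label><input type="checkbox" name="pkg" value="{p}">{p}</label><br>'
        for p in AVAILABLE_PACKAGES
    )

    @app.route("/", methods=["GET"])
    def choose():
        return f'<form method="post" action="/build">{FORM}<button>Build image</button></form>'

    @app.route("/build", methods=["POST"])
    def build():
        selected = request.form.getlist("pkg")
        with open("image_manifest.txt", "w") as f:
            f.write("\n".join(selected))
        # TODO: hand image_manifest.txt to the image-construction step here.
        return f"Manifest written for: {', '.join(selected) or 'no packages'}"

    if __name__ == "__main__":
        app.run(debug=True)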
Project #2 - Port Indiana to SPARC Architecture
Until now, Project Indiana has been x86-based. There is a need to make Indiana run on the SPARC architecture too. The source code is the same on both, so that part is easy. The boot process on SPARC, though, differs greatly from the x86 architecture. (E.g., BIOS doesn't exist, OBP does, and they function differently.) This project would be well scoped and bounded, and senior resources would be available to advise, review, and consult, but it will be challenging.
Students working on this project MUST have VERY strong operating system skills, plus good scripting and interpersonal skills. This project would be quite interesting, but a stretch even for good students. Senior engineers would be available to support and help, but it will likely be one of the most technically challenging school projects you might do.
Working Environment:
This project offers an opportunity to experience software development at the Sun Microsystems campus in Broomfield, CO. The team would be working with local engineers as well as other engineers in locations across the US and in Prague, Bangalore, and Beijing.
Qi Han (CS faculty)
Nick Hubbell (CS graduate student)
Colorado School of Mines
Background:
Wireless sensor networks (WSNs) are a new type of embedded computer system. Sensors, microprocessors, memory, and low-power radios can be integrated into small, inexpensive, intelligent sensing devices. When organized into a wireless network, these devices can give scientists environmental monitoring capability on a scale and level of precision that was impossible in years past. One of the chief applications for WSNs is the task of event detection and tracking. Events are phenomena of interest that must be identified and followed by nodes in the network, such as a person walking through the area or a chemical plume seeping through the ground. We have developed an algorithm that allows nodes within the network to detect an event, organize themselves into a group around each event, and move these groups to follow the events.
Project Goals:
This project is to create a GUI visualizer that enables users to actually watch the algorithm form and change node groups to follow the underlying events. The visualizer takes the output generated by a previous simulation run of the algorithm and converts it to a graphical representation. As such, the GUI is non-interactive and useful as a debugging aid and for demo presentations of the previously developed event tracking algorithm.
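A minimal sketch of such a replay-style visualizer is shown below, assuming a hypothetical trace format of one record per line (time, node id, x, y, group id); the real format will be whatever the existing simulation produces.

    # Visualizer sketch.  The trace format is an assumption for illustration:
    # each line holds "time node_id x y group_id", produced by a prior
    # simulation run; the real format will come from the existing simulator.
    import sys
    from collections import defaultdict
    import matplotlib.pyplot as plt

    def load_frames(path):
        """Group trace records by timestamp."""
        frames = defaultdict(list)
        with open(path) as f:
            for line in f:
                t, node, x, y, group = line.split()
                frames[float(t)].append((float(x), float(y), int(group)))
        return [frames[t] for t in sorted(frames)]

    def play(frames, pause=0.2):
        fig, ax = plt.subplots()
        for records in frames:        # non-interactive replay, frame by frame
            ax.clear()
            xs = [r[0] for r in records]
            ys = [r[1] for r in records]
            groups = [r[2] for r in records]
            ax.scatter(xs, ys, c=groups, cmap="tab10")   # color nodes by group
            ax.set_xlim(0, 100); ax.set_ylim(0, 100)     # assumed field size
            plt.pause(pause)
        plt.show()

    if __name__ == "__main__":
        play(load_frames(sys.argv[1]))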
Project Requirements:
The visualizer should:
Qi Han (CS faculty)
Philip Loden (CS undergraduate student)
Colorado School of Mines
Background:
Release of chemicals or biological agents into the subsurface often results in plumes migrating through the medium, posing a risk to human and ecological environments. Temporal and spatial monitoring of plume concentrations is needed to assess risk, make decisions, and take remedial action. Current underground contaminant plume monitoring technologies are inefficient, expensive, and ineffective. Wireless sensor networks (WSNs) have the potential to dramatically improve this process. WSNs are composed of numerous small embedded computer systems that communicate wirelessly. Sensors, microprocessors, memory, and low-power radios can be integrated into small, inexpensive, and intelligent sensing devices.
We have previously developed a sensor data acquisition system using TinyOS 1.0. The system has been used in a porous media aquifer test bed consisting of Crossbow TelosB motes and EchoTE conductivity sensors.
Project Goals:
iNSpect is an ns-2 visualization tool that is used by over 420 researchers in 51 countries (as of December 2007) for visualizing mobile ad hoc and/or wireless networks. The current version of iNSpect is version 3.5. The original version of iNSpect was developed by Tracy Camp, Stuart Kurkowski, Mike Colagrosso, Neil (Tuli) Mushell, Matthew Gimlin, Neal Erickson, and Jeff Boleng from the Toilers Research Group at the Colorado School of Mines. In April 2005, Fatih Gey and Peter Eginger from the Department Security Technology at the Fraunhofer Institute for Computer Graphics Research overhauled the iNSpect (version 3.3) visualization tool for wireless networks. This overhaul provided a cleaner and more logical class structure, making the visualization tool much easier to extend.
In November 2007, the Toilers decided to update iNSpect to handle both wired and wireless visualization and to work in the next version of the network simulator (ns-3). Our hope is that the new iNSpect visualizer (for wired and wireless simulations) will be included with future ns-3 distributions. We are extending the Fraunhofer IGD implementation of iNSpect by adding simulation support for wired networks (dubbed version 4). So far we have redesigned the internal model to handle wired and wireless simulations, and we have added the ability to turn nodes on and off during the simulation (e.g., to handle limited energy models). Our update to iNSpect is still able to handle the ns-2 new-trace format for wireless, but can now also handle ns-2 wired traces.
Project:
The summer team will need to help update iNSpect to be more maintainable and also more functional. The list of things we need is as follows:
Team members will need to know how to program in C++ using polymorphic classes and pointers. The team will also need to include at least one member familiar with thread safety and the monitor design pattern. Skills that will be useful but that can be learned during the project include GTK threads and GUI elements. Also, learning a handful of design patterns will be necessary.
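The iNSpect work itself is in C++; the following short Python sketch is only meant to illustrate the monitor design pattern mentioned above: all access to the shared state goes through one lock, and a condition variable lets consumer threads wait for work.

    # Language-agnostic illustration of the monitor pattern (the actual
    # project code is C++): one lock guards the shared queue, and a condition
    # variable coordinates producers and consumers.
    import threading
    from collections import deque

    class EventQueueMonitor:
        """Monitor guarding a queue of trace events shared between threads."""
        def __init__(self):
            self._events = deque()
            self._lock = threading.Lock()
            self._not_empty = threading.Condition(self._lock)

        def put(self, event):
            with self._not_empty:          # acquire the monitor lock
                self._events.append(event)
                self._not_empty.notify()   # wake one waiting consumer

        def get(self):
            with self._not_empty:
                while not self._events:    # re-check after every wakeup
                    self._not_empty.wait()
                return self._events.popleft()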
U.S. Geological Survey (USGS)
National Earthquake Information Center (NEIC)
David Wald & Kuo-Wan Lin
1711 Illinois Street
Golden, CO 80401
(303) 273-8441
wald@usgs.gov, klin@usgs.gov
Background:
ShakeMap is a well-established tool used to portray the extent of potentially damaging shaking following an earthquake. ShakeMaps are automatically generated for small and large earthquakes in areas where the system is available and can be found on the Internet at http://earthquake.usgs.gov/shakemap/. ShakeMap is designed to rapidly produce shaking and intensity maps for use by emergency response organizations; local, county, state, and federal government agencies; public and private companies and organizations; the media; and the general public. NEIC currently operates the Global ShakeMap system, which automatically generates ShakeMaps for earthquakes of magnitude 5.5 or greater worldwide and 3.5 or greater inside the contiguous U.S.
ShakeMap is a collection of programs, largely written in the Perl programming language in a UNIX environment. These programs are run sequentially to produce ground motion maps (as PostScript and JPEG images, GIS files, etc.) as well as web pages and email notifications. In addition to Perl, a number of other software packages are used, including Generic Mapping Tools (GMT), Ghostscript, make, SCCS, a C compiler, MySQL, Metadata Parser (MP), Zip, and SSH. Operating the ShakeMap application requires in-depth knowledge of both Earth science and computer science.
Objectives:
The objective is to create a Graphical User Interface (GUI) for ShakeMap to lower the technical barrier to operation. NEIC could greatly benefit from a ShakeMap GUI, as most staff members, who are experts in Earth science, do not possess enough computer skills to operate ShakeMap confidently. With a ShakeMap GUI, an operator will be able to review and update ShakeMaps during earthquake response and to prepare scenarios for research and earthquake exercises. The GUI approach is to be determined by the needs assessment, but could be MVC-based or use comparable technologies.
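As a very rough sketch of the GUI idea (not a design decision), the following Python/Tkinter example drives a placeholder sequence of external commands for a given event ID and shows their output in a log window; the real ShakeMap stages are Perl programs run in sequence, and the actual GUI approach will come out of the needs assessment.

    # Sketch of the GUI idea only.  The commands in PIPELINE are placeholders
    # (the real ShakeMap stages are Perl programs), and Tkinter is used purely
    # for illustration; the GUI approach is to be decided by the assessment.
    import subprocess
    import tkinter as tk
    from tkinter import scrolledtext

    PIPELINE = [["echo", "stage 1: retrieve data"],      # placeholder commands
                ["echo", "stage 2: grind ground motions"],
                ["echo", "stage 3: make maps"]]

    def run_pipeline(event_id, log):
        log.delete("1.0", tk.END)
        for cmd in PIPELINE:
            result = subprocess.run(cmd + [event_id], capture_output=True, text=True)
            log.insert(tk.END, result.stdout + result.stderr)

    root = tk.Tk()
    root.title("ShakeMap runner (sketch)")
    tk.Label(root, text="Event ID:").pack()
    event_entry = tk.Entry(root)
    event_entry.pack()
    log = scrolledtext.ScrolledText(root, width=80, height=20)
    tk.Button(root, text="Run ShakeMap",
              command=lambda: run_pipeline(event_entry.get(), log)).pack()
    log.pack()
    root.mainloop()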
Scope of Work: