The Duke Human Vaccine Institute (DHVI) is a global leader in the fight against HIV/AIDS.  At DHVI, researchers focus on the scientific “bottlenecks” in the development of vaccines for HIV and other viruses.  For DHVI’s Antibodyome project, SciMed developed the Laboratory Assay System (LAS) to manage millions of data records generated and used by HIV scientists around the world.  With the LAS, DHVI has significantly increased the speed and accuracy with which scientists can work toward HIV/AIDS prevention and cure while simultaneously reducing the cost of research.


A large segment of DHVI’s research is focused on antibodies and their relationship to HIV infection.  The diversity of human antibodies is extraordinary: they appear in more than 100 trillion combinations.

In any scientific effort, the value of data can be realized only if the data is accurately and effectively managed.  In the case of DHVI, its antibody and process databases had grown so large that the work of entry, storage, access, and manipulation had become a nontrivial challenge.  DHVI staff recognized that they needed an entirely new system for managing their data.  This new system would need to address a broad range of requirements, including:

  • Data integrity, security, and auditability
  • Robust software to match the scale of research processes and data generation
  • Automated data acquisition
  • Intelligent data interrogation
  • Good usability that would not discourage adoption
  • Support for intellectual property (IP) designations and claims

The Solution

To address DHVI’s data challenges, SciMed built the LAS in a collaborative project that initially spanned nine months; the first version of the LAS was completed in March 2010.  Since going live, the LAS has delivered a substantial set of results:

  • The LAS addresses all of DHVI’s original data management requirements related to accuracy, interrogation, completeness, and auditability.

    • The LAS now acquires antibody and process data directly from laboratory equipment—a change that eliminates errors related to manual entry and manipulation.
    • New data-interrogation tools allow scientists to access information quickly, with greater confidence that the data they retrieve are both complete and correct.
    • Researchers now have fully auditable data trails to support both scientific claims and intellectual property claims.
    • Lastly, the carefully designed user interface allowed the LAS to be adopted quickly by researchers.

In addition, the LAS is generating other economic and scientific benefits:

  • HIV researchers are saving money and time on multiple fronts:

    • Fewer wasted or repeated analyses due to lost, incorrect, or untrustworthy data
    • Less time entering, tracking, and correcting data
    • Less time retrieving, confirming, and auditing data
  • An important secondary effect is that scientists now have more time for thinking and collaboration, which further speeds their research progress.

  • Lastly, the LAS infrastructure is designed for expansion and extensibility—not only for HIV research but also for other challenging vaccine research areas such as tuberculosis and influenza.

Project Details

In 2005, DHVI became home to the Center for HIV/AIDS Vaccine Immunology (CHAVI), newly funded by the National Institutes of Health’s National Institute of Allergy and Infectious Diseases (NIH/NIAID).  Under CHAVI, antibody research accelerated at the global level, with scientists from many institutions generating and using a growing body of data on human antibody responses to HIV.  By 2009, it was evident that the large-scale research effort had outgrown its methods for managing information.

Tony Moody, MD, director of DHVI’s B cell Immunotechnology Laboratory, describes the history:  

As part of the CHAVI team, we at DHVI focus on the “scientific bottlenecks” on the development path to vaccines for HIV, TB, and other diseases.  As our research processes began to generate more and more data under the CHAVI programs, we discovered that data integrity and data auditability had become not a scientific bottleneck but a “technical bottleneck.”  Our science was being slowed by the mundane but critically important task of keeping all the data in order.

For many years, we’d been handling our data in the traditional ways: with lab notebooks, spreadsheets, manual data transfers, and the like.  While these methods had worked in the past, they were failing under the weight of our expanding research efforts and increasingly massive data sets.  We were accumulating spreadsheets with 50 columns and four or five thousand rows of data.  Manual transcription and on-the-fly programming introduced errors in the data going into the spreadsheets and storage files.  Then manual file manipulation and do-it-yourself macros and cell formulas introduced new errors in the data coming out of the research process.

The LAS plan was born of these needs. DHVI knew that the software development project would not be quick or easy, but, if done well, the benefits related to cost, auditability, and speed would more than justify the effort.

Successful projects start with good design and planning.  By digging deep into DHVI’s research methodologies, SciMed was able to offer several design approaches that allowed the DHVI project team to shape the software architecture into an application that reflected DHVI’s scientific processes.  From the beginning, the DHVI researchers, biostatisticians, and IT staff, together with the SciMed team, concentrated their design efforts on a short list of priorities: automatic data handling, auditability tools, and a usable interface.

As Moody describes it:

Part of our design really focused on getting away from data manipulation prior to its entry into a database.  For example: it’s common practice for scientists to use lab equipment that generates data on a paper strip, which a technician transcribes by hand into a spreadsheet, which someone else later reformats and feeds into a database.  That might work for a small lab, but not for science on the scale we’re doing.  As much as possible, we wanted a raw feed to go from our lab machines straight into the databases.  That was an important capability for SciMed to develop for the LAS.  Our labs conduct a wide variety of processes on many types of equipment, but the development team dug in thoroughly to make sure the application could connect all the pieces and move the data correctly.
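The kind of direct instrument-to-database feed Moody describes can be sketched in outline.  The file format, column names, and table schema below are purely illustrative assumptions, not details of the actual LAS:

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a plate reader; in practice this would be
# read directly from the instrument's output file or data stream.
RAW_FEED = """well,sample_id,od450
A1,AB-0001,1.92
A2,AB-0002,0.08
"""

def ingest(raw_text, db):
    """Load instrument output into the database verbatim, with no
    intermediate spreadsheet or manual transcription step."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS assay_results "
        "(well TEXT, sample_id TEXT, od450 REAL)"
    )
    rows = csv.DictReader(io.StringIO(raw_text))
    db.executemany(
        "INSERT INTO assay_results VALUES (:well, :sample_id, :od450)",
        rows,
    )
    db.commit()

db = sqlite3.connect(":memory:")
ingest(RAW_FEED, db)
count = db.execute("SELECT COUNT(*) FROM assay_results").fetchone()[0]
print(count)  # 2 rows loaded straight from the raw feed
```

The point of the design is that nothing touches the data between the instrument and the database, so transcription and reformatting errors have no opportunity to occur.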

Auditability was a second critical need.  DHVI’s primary focus is on scientific discovery: finding the things that can impact HIV.  That said, the journey from scientific discovery to vaccine development involves many other players and requires an auditable record of which party did which part of the work. Moody explains:

When we discover something that’s going to be important, there’s a whole new set of questions around intellectual property (IP).  We need an adequate data record to support our IP claims if we’re going to develop a product, or look at licensure or sale, or work with the Food and Drug Administration.  We must be able to say "yes, we did all these specific steps: this is how we did them, when we did them, who did them,” and everything else that supports the assertion that our IP claims are true.  Ample time was spent with our scientists to make sure that the LAS could track and report the history behind every piece of data.
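The auditable history Moody describes, recording what was done, by whom, and when for every piece of data, can be sketched as an append-only event log.  The field names and hash-chaining scheme here are hypothetical illustrations, not the actual LAS design:

```python
import datetime
import hashlib
import json

# Minimal sketch of an auditable data trail as an append-only event log.
audit_log = []

def record_event(record_id, action, actor, details):
    """Append an audit entry capturing what was done, by whom, and when,
    chained to the previous entry's hash so tampering is detectable."""
    prev_hash = audit_log[-1]["hash"] if audit_log else ""
    entry = {
        "record_id": record_id,
        "action": action,
        "actor": actor,
        "details": details,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    audit_log.append(entry)
    return entry

record_event("AB-0001", "assay_run", "jsmith", {"assay": "ELISA"})
record_event("AB-0001", "result_entered", "instrument_feed", {"od450": 1.92})

# The full history of a record can then be reported to support IP claims.
history = [e for e in audit_log if e["record_id"] == "AB-0001"]
print(len(history))  # 2 audited steps for this record
```

Because each entry embeds the hash of its predecessor, the reported history supports the assertion that the recorded steps happened in the stated order and have not been altered.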

A third design emphasis was ease of use.  DHVI’s Director of Information Technology, Paul Debien, explains why this mattered:

The best tools in the world are worthless if they’re a pain to use, because if you add pain or extra effort, nobody will actually use them.  Dr. Moody coined the term “pain neutral,” and that became the mantra for our usability design: the LAS would need to require no more effort from any of its users than the methods or tools they were using before.  To make this happen, the DHVI-SciMed team did the design and early testing with users across the spectrum of our research process.  We included everyone from principal investigators like Tony Moody, to statisticians, to the people in the lab handling the pipettes and filling the plates.  By digging into the details with CHAVI and DHVI staff, SciMed obtained a clear sense of which features were essential to our users’ needs and also prevented many potential design errors.

As a last critical requirement, DHVI needed SciMed to finish a complete, working version of the LAS without delay, even if the first version did not include all the features that the CHAVI team might like to have.

Paul Debien describes the team’s direction:

Early on, we made a very conscious appraisal of which parts of the data generation and data management system needed our priority attention.  “Velocity over complete perfection” was our guiding philosophy.  We needed enough pieces of the LAS completed that researchers could start using it and buying into the changes in their labs’ workflow.  SciMed understood our request and, in conjunction with the DHVI project team, deployed the good development practices needed to meet DHVI’s milestones: detailed Q&A meetings to understand how the scientific processes worked, pre-development mockups that let users see whether the LAS was on track, and programming efficiencies that moved the project forward as fast as possible.

After nine months of design and development, SciMed delivered the first production-ready version of the LAS to DHVI in March 2010.  Evaluated against its original technical and usability requirements, the LAS is a clear success.  Scientists now have more accurate, auditable data, managed with less work and more speed, and the user interface and process flow have allowed the LAS to be adopted quickly by all its users.

Tony Moody summarizes the results:

What the LAS has given us is the ability to move the process forward faster.  We know how to do the science, and we might have limped along with the myriad spreadsheets we had been using to track our data, processes, and projects.  But that really wasn’t sustainable.  The LAS allows us to perform all of those tasks with the accuracy, auditability, and speed that we really need.

Building the LAS was not cheap, but the multiple cost savings we now realize have more than justified the project.  First, there’s the immediate dollar savings of not having to repeat experiments.  Experiments are expensive propositions, easily in the range of $5,000 to $10,000 each.  Eliminating repeat experiments gets us to real savings very quickly.  Second, we’ve reduced costs by reducing the time it takes to get things done: time to enter data, time to retrieve data, and time to make sure that the data we’re looking at are correct.

Paul Debien concludes:

Lastly, when it comes to the actual science and medicine that we’re here to do, we’re reducing the time it takes to translate “finding 1” into “finding 2.”  In the macro sense, we’re preventing the costs of errors in vaccine development and trials, and speeding our progress toward preventions and cures.  These are meaningful contributions to the public health effort.