Generic infrastructure

SARA Huygens

This part of the e-BioGrid program involves only dedicated projects. Working efficiently in an e-science environment requires a generic infrastructure that connects the life science tools to the hardware resource facilities.

The e-BioGrid program is open to dedicated, generic infrastructure projects. Contact us if you have a generic need for support in software or hardware infrastructure.


Projects in this technology area

HPC cloud to scale the NBIC Galaxy
description:The Galaxy server, serviced by NBIC, is used to run generic bioinformatics tools for sequence and proteomics analysis through a standard web user interface. As many of the tools are CPU- and storage-demanding, running the Galaxy server on a high-performance computing cloud expands its compute capacity. In collaboration with NBIC, the e-BioGrid team developed NBIC Galaxy on Cloud. The application scales dynamically with increasing workload. Access to the HPC Cloud enables processing of large data volumes, and the high-speed network connection provides rapid data transfers.
applicant:Marc van Driel, Netherlands Bioinformatics Centre
results:Galaxy images are being built for installation on the clouds
status:completed
team:Marc van Driel, Leon Mei, Floris Sluiter, Tom Visser, Niek Bosch
type:This is a dedicated project.
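
The dynamic scaling mentioned in the description above can be illustrated with a minimal sketch. This is not the NBIC Galaxy on Cloud implementation; the function name, the jobs-per-worker ratio, and the worker limits are all hypothetical values chosen for illustration.

```python
import math


def workers_needed(queued_jobs: int,
                   jobs_per_worker: int = 4,
                   min_workers: int = 1,
                   max_workers: int = 16) -> int:
    """Return how many worker machines to keep running for a given queue length.

    All parameters are illustrative assumptions, not values from the project.
    """
    if queued_jobs < 0 or jobs_per_worker <= 0:
        raise ValueError("queue length must be >= 0 and jobs_per_worker > 0")
    # One worker per `jobs_per_worker` queued jobs, rounded up.
    wanted = math.ceil(queued_jobs / jobs_per_worker)
    # Clamp to the allowed range so the cluster never vanishes or grows unbounded.
    return max(min_workers, min(max_workers, wanted))
```

A scheduler would call such a function periodically and start or stop virtual machines until the running count matches the returned value.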
HPC Cloud beta testing
description:The Microarray Department/Integrative Bioinformatics Unit at the University of Amsterdam will set up the HPC Cloud as a flexible and scalable environment for microarray design and analysis. From a local R session we want to be able to initialize an HPC Cloud compute cluster on the fly, use it from that session, and shut the cluster down when it is no longer needed.
applicant:Timo Breit, University of Amsterdam
results:Faster submission of computationally intensive jobs to the Cloud. Dynamic up- and down-scaling of a Cloud cluster from a local R session.
status:completed
team:Han Rauwerda, Wim de Leeuw
type:This is a dedicated project.
AMC e-infrastructure for Biomedical Research
description:The e-BioInfra platform provides facilities to run large data analysis experiments on the Dutch Grid. The project covers software and system design, development, and deployment as services for the AMC research community. The platform is based on workflow technology and also includes data transfer, monitoring, and provenance services. The team also supports researchers who wish to perform experiments on the grid infrastructure. The web interface of the e-BioInfra gateway gives novice users easy access.
BiG Grid funds one member of the e-bioscience team (Mark Santcroos) to improve the link between e-BioInfra and the Dutch grid resources and services. Activities include development and integration of new middleware tools, user support, definition of guidelines and best practices, and dissemination of the platform to a larger community of biomedical and life science researchers.
applicant:Silvia Olabarriaga, Antoine van Kampen and Jan Just Keijser, Amsterdam Medical Centre / University of Amsterdam
results:S.D. Olabarriaga, T. Glatard, P.T. de Boer, "A Virtual Laboratory for Medical Image Analysis", IEEE Transactions on Information Technology In Biomedicine (TITB), 2010 Apr 5.
M.W.A. Caan, F.M. Vos, L.J. van Vliet, A.H.C. van Kampen, S.D. Olabarriaga. Gridifying a Diffusion Tensor Imaging Analysis Pipeline. Proceedings of the 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing (CCGrid 2010) Melbourne, VIC, Australia, May 17-May 20. IEEE Computer Society. pp.733-738, 2010.
S. Shahand, M. Santcroos, Y. Mohammed, V. Korkhov, A. Luyf, A. van Kampen and S. Olabarriaga. Front-ends to Biomedical Data Analysis on Grids. Proceedings of HealthGrid 2011 (in press), 2011
status:ongoing
team:Mark Santcroos, Silvia Olabarriaga, Jan Just Keijser, Antoine van Kampen, Shayan Shahand, Vladimir Korkhov, Souley Madougou
type:This is a dedicated project.
Data Analysis Framework (DAF)
description:Development of the Data Analysis Framework (DAF), a generic infrastructure program that simplifies the processing of data-intensive tasks on the Grid. DAF provides each user with quota-limited disk space to upload input files and store processing results. Data processing tasks can be executed on the data in the user's disk space using command-line tools integrated in DAF. All data processing services, including file I/O to and from the user's disk space, can be accessed via web service requests. DAF uses gLite for job submission and an enhanced ToPoS pilot job system to reduce job submission errors. We plan to integrate DAF with data management software such as Molgenis or OpenBIS to develop a fully integrated data analysis platform. We also plan to provide an easy-to-use web interface, in which generic web pages are created for the integrated tools and workflows, and a scientific visualization platform that provides visualization support.
See also the application to a proteomics data analysis infrastructure here.

applicant:Ishtiaq Ahmad, University of Groningen, Department of Pharmacy, Analytical Biochemistry
results:In its current state, DAF is already used to provide a high-throughput time alignment service¹ based on the Warp2D tool² for LC-MS peak lists, accessible at http://www.nbpp.nl/warp2d.html. The msCompare⁶ workflow has been integrated in the NBIC Galaxy server. Other results relate to the tools and workflows mentioned above that we intend to integrate in DAF.

1. Ahmad I, Suits F, Hoekman B, Swertz MA, Byelas H, Dijkstra M, Hooft R, Katsubo D, van Breukelen B, Bischoff R, Horvatovich P., A high-throughput processing service for retention time alignment of complex proteomics and metabolomics LC-MS data, Bioinformatics, 2011, 27(8):1176-1178, PMID: 21349866
2. Suits F, Lepre J, Du P, Bischoff R, Horvatovich P., Two-dimensional method for time aligning liquid chromatography-mass spectrometry data, Anal Chem., 2008, 80(9):3095-3104, PMID: 18396914
3. Christin C, Hoefsloot HC, Smilde AK, Suits F, Bischoff R, Horvatovich PL., Time alignment algorithms based on selected mass traces for complex LC-MS data, J Proteome Res., 2010, 9(3):1483-1495, PMID: 20070124
4. Christin C, Smilde AK, Hoefsloot HC, Suits F, Bischoff R, Horvatovich PL., Optimized time alignment algorithm for LC-MS data: correlation optimized warping using component detection algorithm-selected mass chromatograms, Anal Chem., 2008, 80(18):7012-7021, PMID: 18715018
5. Christin, C., Hoefsloot, H. C. J., Smilde, A. K., Hoekman, B., Bischoff, R., Horvatovich, P., A critical assessment of statistical methods for biomarker discovery in clinical proteomics, manuscript submitted to Molecular & Cellular Proteomics.
6. Hoekman, B., Breitling, R., Suits, F., Bischoff, R., Horvatovich, P., msCompare: a framework for quantitative analysis of label-free LC-MS data for comparative
status:ongoing
team:Ishtiaq Ahmad, Berend Hoekman, Peter Horvatovich, Rainer Bischoff and collaborators from the Gaining Momentum Initiative
type:This is a dedicated project.
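
The pilot job approach mentioned in the DAF description can be sketched as follows. This is a minimal in-memory illustration of the general idea (workers pull task tokens from a pool, and failed tokens return to the pool for a later attempt); it is not the ToPoS or DAF API, and the class name, method names, and retry limit are assumptions made for this example.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class TokenPool:
    """In-memory stand-in for a token pool server (hypothetical, for illustration)."""

    def __init__(self, tasks: List[str], max_attempts: int = 3) -> None:
        # Each token is a (task, attempts so far) pair.
        self._pending: Deque[Tuple[str, int]] = deque((t, 0) for t in tasks)
        self.max_attempts = max_attempts
        self.results: Dict[str, str] = {}
        self.failed: List[str] = []

    def run_pilot(self, process: Callable[[str], str]) -> None:
        """Pull tokens until the pool is empty, requeueing tokens whose task failed."""
        while self._pending:
            task, attempts = self._pending.popleft()
            try:
                self.results[task] = process(task)
            except Exception:
                if attempts + 1 < self.max_attempts:
                    # Put the token back so this or another pilot can retry it.
                    self._pending.append((task, attempts + 1))
                else:
                    # Give up after max_attempts tries.
                    self.failed.append(task)
```

Because a task is only removed from the pool once it succeeds or exhausts its attempts, a transient failure on one worker does not lose the task.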
SHIWA - SHaring Interoperable Workflows for large-scale scientific simulations on Available DCIs
description:The SHIWA VO was created for the SHIWA project (shiwa-workflow.eu). It will be used for testing the SHIWA Simulation Platform (SSP), which will enable scientists to share and run workflows on DCIs. The project develops solutions for interoperable workflows, including management of credentials across DCIs. The initial tests on the SHIWA VO resources are intended to show the viability of the adopted solutions on production infrastructures. This VO is already supported by the French NGI, but we need more sites to support it to enable testing under more realistic conditions.
applicant:Vladimir Korkhov, Amsterdam Medical Centre
results:will follow shortly
status:completed
team:Vladimir Korkhov, Silvia Olabarriaga
type:This is a dedicated project.
Web service fail-over
description:We want to test a web service fail-over system for a Danish web service, using a virtual machine that can perform the same task. This virtual machine will be parked at a computer centre in Munich and at one in the UK. At fixed times we will take the web service in Denmark down, at which point our calling program will, one way or another, launch one of those two virtual machines to handle the one-second call in your cloud.
applicant:Gert Vriend, Radboud University Nijmegen
results:will follow shortly
status:ongoing
team:Gert Vriend
type:This is a dedicated project.
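
The fail-over behaviour described for this project can be sketched on the client side as below. This is an assumption-laden illustration, not the project's calling program: the function name is hypothetical, and each service is modelled as a plain Python callable rather than a real web service or virtual machine.

```python
from typing import Callable, List, Sequence


def call_with_failover(primary: Callable[[], str],
                       fallbacks: Sequence[Callable[[], str]]) -> str:
    """Call the primary service; if it fails, try each fallback in order."""
    errors: List[Exception] = []
    for service in [primary, *fallbacks]:
        try:
            return service()
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} services failed")
```

In the scenario above, `primary` would wrap the call to the Danish web service, and each fallback would launch one of the parked virtual machines and send it the same request.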
Managing cloud computing for life sciences research via smart interfaces
description:In a large class of bioinformatics applications, the required processing power fluctuates strongly, and it is neither feasible nor necessary to keep the maximum processing capability available locally all the time. The SARA HPC Cloud offers compute power on demand in the form of freely configurable virtual machines. In the cloud one can configure a system (number of cores, amount of memory, secondary storage, and network) and freely install the desired software on it. These machines are stored as images, which can be deployed at a later time. We have implemented a system that controls the deployment of machine images in the cloud while bypassing the cloud user interface. With this system it is easy to set up applications that use cloud resources transparently from outside. It consists of a lightweight server that can start and stop machine images, just as is possible through the web interface, and that keeps track of the running machines under its control. Clients can request the starting of machines or request information about running machines. These clients can be used in applications to access cloud resources with minimal user intervention. We describe two use cases. The first is creating an R cluster in the cloud: a user starts an R cluster on the cloud from within a local R session and distributes the calculation work over the cloud using the normal R cluster commands. The second is the back end of the array designer web application, in which the user generates a microarray design based on input sequence data and a number of additional parameters. The resources required to generate the array design are not available on the web server, so the work is done in the cloud. For each array design a machine is instantiated on the cloud and stopped when the design is ready.
applicant:Wim de Leeuw, UvA
results:will follow soon
status:completed
team:Wim de Leeuw, Linda Bakker, Han Rauwerda, Timo Breit
type:This is a dedicated project.
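
The lightweight server described for this project (start and stop machine images, keep track of running machines) can be illustrated with a small in-memory sketch. It is not the project's implementation: the class, its methods, and the image name are hypothetical, and a real server would call the cloud's own interface where this sketch only updates a dictionary.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Callable, Dict


@dataclass
class MachineManager:
    """Tracks machines started from stored images (in-memory stand-in for a cloud)."""
    running: Dict[int, str] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def start(self, image: str) -> int:
        """Start a machine from an image and return its id."""
        machine_id = next(self._ids)
        self.running[machine_id] = image
        return machine_id

    def stop(self, machine_id: int) -> None:
        """Stop a machine; raises KeyError if it is not under our control."""
        del self.running[machine_id]

    def status(self) -> Dict[int, str]:
        """Return a copy of the id -> image map of running machines."""
        return dict(self.running)


def run_job(manager: MachineManager, image: str,
            job: Callable[[int], str]) -> str:
    """Start a machine for one job and stop it when the job is done, even on error."""
    machine_id = manager.start(image)
    try:
        return job(machine_id)
    finally:
        manager.stop(machine_id)
```

The `run_job` helper mirrors the array designer use case: one machine is instantiated per design and stopped once the design is ready.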
