Abstract:
Recent trends in science have made computational capabilities an essential part of scientific discovery. This is often referred to as enhanced scientific discovery, or eScience. eScience has been an integral part of high energy physics for several decades due to the complexity and volume of data produced by experiments. In the 1990s, eScience became central to biology with the sequencing of the human genome. More recently, eScience has become integral to neuroscience as a means of understanding neural circuits and human behavior. It is my view that the demands of 21st-century science will mean that eScience is largely done in the Cloud. There are several reasons for this. First, many of the computing requirements of scientists are bursty, requiring massive capabilities for short periods of time. This requirement is well suited to the Cloud. Second, 21st-century science will frequently require the publication of large datasets such as the Allen Institute's Brain Atlas and the worldwide network of genomics data. Hosting these datasets in public clouds will be much easier than requiring individual scientists (or even universities) to build their own data hosting systems. Third, progress in science increasingly requires collaborations among many distributed groups. The Cloud can greatly facilitate these collaborations. This talk discusses the requirements for science in the Cloud and the efforts underway to address them. I will provide considerable detail about Google's Exacycle project, which is donating one billion core hours to scientific discovery in molecular modeling, drug analysis, and astronomy.