Mastering Cloud Computing is designed for undergraduate students learning to develop cloud computing applications. Tomorrow's applications won't live on a single computer; they will be deployed from and reside on virtual servers, accessible anywhere, at any time. Tomorrow's application developers need to understand the requirements of building apps for these virtual systems, including concurrent programming, high-performance computing, and data-intensive systems. The book introduces the principles of distributed and parallel computing underlying cloud architectures and focuses specifically on virtualization, thread programming, task programming, and MapReduce programming. Examples demonstrate all of these topics and more, with exercises and labs throughout. The book also explains the design choices and tradeoffs to consider when building applications to run in a virtual cloud environment. Real-world case studies cover scientific, business, and energy-efficiency considerations.
Acknowledgments xi
Preface xiii
PART 1 FOUNDATIONS
Chapter 1 Introduction 3 (26)
1.1 Cloud computing at a glance 3 (12)
1.1.1 The vision of cloud computing 5 (2)
1.1.2 Defining a cloud 7 (2)
1.1.3 A closer look 9 (2)
1.1.4 The cloud computing reference model 11 (2)
1.1.5 Characteristics and benefits 13 (1)
1.1.6 Challenges ahead 14 (1)
1.2 Historical developments 15 (7)
1.2.1 Distributed systems 15 (3)
1.2.2 Virtualization 18 (1)
1.2.3 Web 2.0 19 (1)
1.2.4 Service-oriented computing 20 (1)
1.2.5 Utility-oriented computing 21 (1)
1.3 Building cloud computing environments 22 (7)
1.3.1 Application development 22 (1)
1.3.2 Infrastructure and system development 23 (1)
1.3.3 Computing platforms and technologies 24 (2)
Summary 26 (1)
Review questions 27 (2)
Chapter 2 Principles of Parallel and Distributed Computing 29 (42)
2.1 Eras of computing 29 (1)
2.2 Parallel vs. distributed computing 29 (2)
2.3 Elements of parallel computing 31 (8)
2.3.1 What is parallel processing? 31 (1)
2.3.2 Hardware architectures for parallel processing 32 (4)
2.3.3 Approaches to parallel programming 36 (1)
2.3.4 Levels of parallelism 36 (1)
2.3.5 Laws of caution 37 (2)
2.4 Elements of distributed computing 39 (15)
2.4.1 General concepts and definitions 39 (1)
2.4.2 Components of a distributed system 39 (2)
2.4.3 Architectural styles for distributed computing 41 (10)
2.4.4 Models for interprocess communication 51 (3)
2.5 Technologies for distributed computing 54 (17)
2.5.1 Remote procedure call 54 (2)
2.5.2 Distributed object frameworks 56 (5)
2.5.3 Service-oriented computing 61 (8)
Summary 69 (1)
Review questions 70 (1)
Chapter 3 Virtualization 71 (40)
3.1 Introduction 71 (2)
3.2 Characteristics of virtualized environments 73 (4)
3.2.1 Increased security 74 (1)
3.2.2 Managed execution 75 (2)
3.2.3 Portability 77 (1)
3.3 Taxonomy of virtualization techniques 77 (14)
3.3.1 Execution virtualization 77 (12)
3.3.2 Other types of virtualization 89 (2)
3.4 Virtualization and cloud computing 91 (2)
3.5 Pros and cons of virtualization 93 (2)
3.5.1 Advantages of virtualization 93 (1)
3.5.2 The other side of the coin: disadvantages 94 (1)
3.6 Technology examples 95 (16)
3.6.1 Xen: paravirtualization 96 (1)
3.6.2 VMware: full virtualization 97 (7)
3.6.3 Microsoft Hyper-V 104 (5)
Summary 109 (1)
Review questions 109 (2)
Chapter 4 Cloud Computing Architecture 111 (32)
4.1 Introduction 111 (1)
4.2 The cloud reference model 112 (12)
4.2.1 Architecture 112 (2)
4.2.2 Infrastructure- and hardware-as-a-service 114 (3)
4.2.3 Platform as a service 117 (4)
4.2.4 Software as a service 121 (3)
4.3 Types of clouds 124 (9)
4.3.1 Public clouds 125 (1)
4.3.2 Private clouds 126 (2)
4.3.3 Hybrid clouds 128 (3)
4.3.4 Community clouds 131 (2)
4.4 Economics of the cloud 133 (2)
4.5 Open challenges 135 (8)
4.5.1 Cloud definition 135 (1)
4.5.2 Cloud interoperability and standards 136 (1)
4.5.3 Scalability and fault tolerance 137 (1)
4.5.4 Security, trust, and privacy 138 (1)
4.5.5 Organizational aspects 138 (1)
Summary 139 (1)
Review questions 139 (4)
PART 2 CLOUD APPLICATION PROGRAMMING AND THE ANEKA PLATFORM
Chapter 5 Aneka 143 (28)
5.1 Framework overview 143 (3)
5.2 Anatomy of the Aneka container 146 (9)
5.2.1 From the ground up: the platform abstraction layer 147 (1)
5.2.2 Fabric services 147 (3)
5.2.3 Foundation services 150 (3)
5.2.4 Application services 153 (2)
5.3 Building Aneka clouds 155 (7)
5.3.1 Infrastructure organization 155 (1)
5.3.2 Logical organization 155 (3)
5.3.3 Private cloud deployment mode 158 (1)
5.3.4 Public cloud deployment mode 158 (2)
5.3.5 Hybrid cloud deployment mode 160 (2)
5.4 Cloud programming and management 162 (9)
5.4.1 Aneka SDK 162 (5)
5.4.2 Management tools 167 (1)
Summary 168 (1)
Review questions 168 (3)
Chapter 6 Concurrent Computing 171 (40)
6.1 Introducing parallelism for single-machine computation 171 (2)
6.2 Programming applications with threads 173 (16)
6.2.1 What is a thread? 174 (1)
6.2.2 Thread APIs 174 (3)
6.2.3 Techniques for parallel computation with threads 177 (12)
6.3 Multithreading with Aneka 189 (6)
6.3.1 Introducing the thread programming model 190 (1)
6.3.2 Aneka thread vs. common threads 191 (4)
6.4 Programming applications with Aneka threads 195 (16)
6.4.1 Aneka threads application model 195 (1)
6.4.2 Domain decomposition: matrix multiplication 196 (7)
6.4.3 Functional decomposition: Sine, Cosine, and Tangent 203 (1)
Summary 203 (7)
Review questions 210 (1)
Chapter 7 High-Throughput Computing 211 (42)
7.1 Task computing 211 (5)
7.1.1 Characterizing a task 212 (1)
7.1.2 Computing categories 213 (1)
7.1.3 Frameworks for task computing 214 (2)
7.2 Task-based application models 216 (9)
7.2.1 Embarrassingly parallel applications 216 (1)
7.2.2 Parameter sweep applications 217 (1)
7.2.3 MPI applications 218 (4)
7.2.4 Workflow applications with task dependencies 222 (3)
7.3 Aneka task-based programming 225 (28)
7.3.1 Task programming model 226 (1)
7.3.2 Developing applications with the task model 227 (16)
7.3.3 Developing a parameter sweep application 243 (5)
7.3.4 Managing workflows 248 (2)
Summary 250 (1)
Review questions 251 (2)
Chapter 8 Data-Intensive Computing 253 (62)
8.1 What is data-intensive computing? 253 (7)
8.1.1 Characterizing data-intensive computations 254 (1)
8.1.2 Challenges ahead 254 (1)
8.1.3 Historical perspective 255 (5)
8.2 Technologies for data-intensive computing 260 (16)
8.2.1 Storage systems 260 (8)
8.2.2 Programming platforms 268 (8)
8.3 Aneka MapReduce programming 276 (39)
8.3.1 Introducing the MapReduce programming model 276 (17)
8.3.2 Example application 293 (16)
Summary 309 (1)
Review questions 310 (5)
PART 3 INDUSTRIAL PLATFORMS AND NEW DEVELOPMENTS
Chapter 9 Cloud Platforms in Industry 315 (38)
9.1 Amazon web services 315 (17)
9.1.1 Compute services 316 (5)
9.1.2 Storage services 321 (8)
9.1.3 Communication services 329 (3)
9.1.4 Additional services 332 (1)
9.2 Google AppEngine 332 (9)
9.2.1 Architecture and core concepts 333 (5)
9.2.2 Application life cycle 338 (2)
9.2.3 Cost model 340 (1)
9.2.4 Observations 341 (1)
9.3 Microsoft Azure 341 (12)
9.3.1 Azure core concepts 342 (5)
9.3.2 SQL Azure 347 (2)
9.3.3 Windows Azure platform appliance 349 (1)
9.3.4 Observations 349 (1)
Summary 350 (1)
Review questions 351 (2)
Chapter 10 Cloud Applications 353 (20)
10.1 Scientific applications 353 (5)
10.1.1 Healthcare: ECG analysis in the cloud 353 (2)
10.1.2 Biology: protein structure prediction 355 (2)
10.1.3 Biology: gene expression data analysis for cancer diagnosis 357 (1)
10.1.4 Geoscience: satellite image processing 358 (1)
10.2 Business and consumer applications 358 (15)
10.2.1 CRM and ERP 359 (3)
10.2.2 Productivity 362 (3)
10.2.3 Social networking 365 (1)
10.2.4 Media applications 366 (3)
10.2.5 Multiplayer online gaming 369 (1)
Summary 370 (1)
Review questions 371 (2)
Chapter 11 Advanced Topics in Cloud Computing 373 (56)
11.1 Energy efficiency in clouds 373 (4)
11.1.1 Energy-efficient and green cloud computing architecture 375 (2)
11.2 Market-based management of clouds 377 (13)
11.2.1 Market-oriented cloud computing 378 (1)
11.2.2 A reference model for MOCC 379 (5)
11.2.3 Technologies and initiatives supporting MOCC 384 (5)
11.2.4 Observations 389 (1)
11.3 Federated clouds/InterCloud 390 (32)
11.3.1 Characterization and definition 391 (1)
11.3.2 Cloud federation stack 392 (7)
11.3.3 Aspects of interest 399 (18)
11.3.4 Technologies for cloud federations 417 (5)
11.3.5 Observations 422 (1)
11.4 Third-party cloud services 422 (7)
11.4.1 MetaCDN 423 (2)
11.4.2 SpotCloud 425 (1)
Summary 425 (2)
Review questions 427 (2)
References 429 (10)
Index 439