Wednesday, September 21, 2011

Why I think cloud computing should be considered a “platform” by adopters!

In this post I attempt to explore the various components that collectively define cloud computing as a platform. Note that I have deliberately used the term platform in this context to highlight the importance of the services extended by each layer to enable the notion of unlimited available resources. Virtualization technologies enable efficient use of resources by modestly over-committing computing resources, on the assumption that not all partitions of a system will consume all of their resources at all times. This assumption is reinforced by a policy-driven approach that factors in resource prioritization at times of contention, allocating resources to the most important partition or application. While virtualization can be seen as a foundation for a cloud computing platform, virtualization alone cannot be and should not be mistaken for a cloud computing platform. As a platform, cloud computing attempts to address several IT operational and business challenges, including (but not limited to):

a. Escalating costs of hosting ‘over-provisioned’ and application-specific environments.

b. Reduced ‘ramp-up’ time for provisioning hardware and software resources for various environments (development, staging and production).

c. Cost-effective solutions through the re-use of software and hardware assets.

d. Reduction in support and maintenance costs through the use of standardized (at times virtual) images.

So it is evident that the overarching goal of a cloud computing platform is to provide a cost-effective solution to the end user with tremendous flexibility and agility. Flexibility and agility thus become important facets of any cloud platform, as these concepts are basic to the intrinsic value enabled by a cloud of resources. Achieving elasticity and agile, on-demand provisioning requires a system that is not only self-sustaining but also sensitive to growth. Let’s explore what this means. A true cloud platform provides an illusion of infinite computing resources available on demand. The notion of on-demand, infinite computing resources requires a systemic approach that includes sense-and-respond subsystems, tied into a system-level monitoring subsystem, fronted by a rich user interface and held together by a robust governance subsystem. These subsystems are complete and inseparable components of a cloud computing platform. We will discuss them in detail in later posts, but it is important to understand their relevance as vital design imperatives of a cloud computing platform. Cloud computing is a new consumption and delivery model nudged forward by consumer demand and continual growth in internet services. Cloud computing exhibits at least the following six key characteristics:

a. Elastic Environment

b. Provisioning Automation

c. Extreme Scalability

d. Advanced Virtualization

e. Standards-based delivery

f. Usage-based equitable chargeback

I thus deliberately use the term platform in the context of a cloud computing environment that facilitates flexibility, robustness and agility, as a systemic approach to providing a stage for hosting applications without concern for the availability or provisioning of underlying resources.

Sunday, September 11, 2011

Changing Landscape of Middleware

Lately I have been engaged with many clients that are maniacally focused on reducing costs by means of a reduced footprint. In this post, I attempt to discuss some of my thoughts and experiences around this trend:

  1. Growth poses a problem: As the business grows, so does the resulting infrastructure, and primarily the middleware, which houses the business logic and in some cases shares the presentation tier or storefront with the business logic. Growth in the business, whether due to a larger client base or a new business model, implies similar growth in the middleware. This consistent growth poses a few challenges. Some obvious ones include costs, not only of software but also of the hosting infrastructure and hardware. Some non-obvious challenges include manageability of the platform, i.e. general administration, handling performance and service level agreements (SLAs), and addressing scalability.

  2. Consolidation – an Answer? Many clients have taken a consolidation approach to address this issue. This consolidation can come at many levels, including (but not limited to) data center consolidation, IT and middleware virtualization, and automation for installation and configuration. Many clients are even bold enough to claim this effort moves them toward being ‘cloud’ ready. However, I think that virtualization can only offer so much; at some point, to achieve better resource economies, clients and leaders have to think about better design, an understanding of business and user behavior, and designs that not only appeal to the end user but also nudge the user towards a certain desired behavior.

  3. Design: I have always advocated dedicating a significant amount of time to the design phase. While a design phase may not produce a sizable amount of tangible application artifacts, it does enable a better design pattern for future improvements and upgrades. (More on this later.)

  4. Changing landscape of middleware: What I am seeing, and what enabling technologies seeping into enterprise infrastructure now make possible, is the movement of content to the outer tiers. One way to address the scalability of middleware processing is to not let a request traverse to the middleware tiers until absolutely necessary. This can be done in many ways, and one can be creative about how to accomplish it; this is where application design comes in. Here are some examples:
    1. Many clients push as much content as possible to the ultra edge, the publicly accessible content domain, for example the Akamai content network. This is generally static content. Pushing it out enables faster access to site and catalog content and has a high user satisfaction rate. It also keeps the ‘window shoppers’ from consuming your precious middleware cycles.
    2. Caching at the edge tier: not all content can be cached or served from the ultra edge, but some content, such as domain-specific content, page fragments and JSPs, can be cached at the edge tier (which is usually behind the firewall). This does a great deal to save processing costs in the middleware presentation tier.
    3. Caching at the middleware tier: patterns like the side cache and the in-line database cache further reduce resource usage such as database lookups and database connections, and in-memory access enables faster access to various types of content. (A minimal sketch of the side-cache pattern appears after this list.)

  5. The idea: By caching strategically at many tiers, we offload processing to those tiers and dedicate middleware processing ONLY when it matters most, that is, when the ‘window shoppers’ now mean business. We then dedicate our cycles to those clients who mean business and serve them better, with an enhanced experience.
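
To make the side-cache idea above concrete, here is a minimal, hedged sketch of the cache-aside pattern in plain Java. A ConcurrentHashMap stands in for the data grid, and the ProductSideCache class, its keys and the loadFromDatabase call are all hypothetical placeholders, not any particular product's API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal side-cache (cache-aside) sketch. A ConcurrentHashMap stands in for
// the grid; in practice the cache would be an IMDG client (e.g. a WXS map).
public class ProductSideCache {

    // key: product id, value: rendered product detail (page fragment, JSON, etc.)
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String getProduct(String productId) {
        // 1. Try the cache first so the request never reaches the database
        //    (or the middleware tier behind it) unless it has to.
        String cached = cache.get(productId);
        if (cached != null) {
            return cached;
        }
        // 2. Cache miss: fall back to the expensive lookup, then populate the cache.
        String fromDb = loadFromDatabase(productId);   // hypothetical DB call
        cache.put(productId, fromDb);
        return fromDb;
    }

    public void invalidate(String productId) {
        // Called when the product changes so stale content is not served.
        cache.remove(productId);
    }

    private String loadFromDatabase(String productId) {
        // Placeholder for the real JDBC/DAO lookup.
        return "product-details-for-" + productId;
    }
}
```

In a real deployment the local map would be replaced by a client to the grid (for example a WXS map or a Coherence cache), so the cached content is shared across the presentation tier and survives individual JVM restarts.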

Challenge: I discussed the design phase above; the challenge is to ensure an application design that is modular enough to enable these various tiers of caching and still present a unified front, where the end user is oblivious to the inner workings of an application that derives its content from various layers. An intentional design will keep content and business logic isolated, thus enabling caching at the various tiers.

Enabling technologies:

  1. Ultra edge caching – Akamai content network
  2. Edge caching – edge caching appliances (such as IBM XI50, XC10), in-memory data grid at the edge (IMDG, such as WebSphere eXtreme Scale (IBM) or Coherence (Oracle))
  3. Side cache and in-line database buffers – such as WebSphere eXtreme Scale (IBM) or Coherence (Oracle)
  4. Smart routing – IBM XI50, IBM AO, F5, Cisco, etc.

Thoughts?

:)

Nitin

Friday, September 9, 2011

Is cloud just virtualization and automation? NO!!

Day in and day out, we see every technology vendor attempting to position themselves in the cloud realm, struggling to find a niche in this 'cloudy' topic.

I think:
1. Virtualization and automation are building blocks of a cloud computing platform. They alone will not solve any problems.

2. Cloud computing is premised on being a new model that accommodates a new service delivery and consumption model. Now, the term 'service' is very elusive, and I think this is what is exploited by every vendor trying to find that 'fit'.

3. Without a vision and a set of expectations for the 'cloud', all the investment in cloud is pointless and will create more problems than solutions. The goals and vision are very important; they will, and should, drive the investment in cloud strategy and supporting technologies.

4. Chargeback: This is so important that I think without an equitable chargeback model, a cloud initiative will be a failure. When we address 'service delivery and consumption', the metering at the consumption end balances the resources at the delivery end. This balance is probably the most important concept! Otherwise you will have a buffet of services, unhealthy consumers and low-quality services!!
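
To illustrate what I mean by metering consumption and billing it back, here is a toy sketch in Java; the resource rates, metrics and department names are entirely made up and are only meant to show the shape of a usage-based chargeback calculation:

```java
import java.util.Map;

// Toy usage-based chargeback calculation. The rates and metrics are made up;
// the point is only that consumption is metered and billed back per consumer.
public class Chargeback {

    // Hypothetical unit rates
    static final double RATE_PER_CPU_HOUR = 0.12;   // currency units per CPU-hour
    static final double RATE_PER_GB_MONTH = 0.05;   // currency units per GB-month of storage

    static double monthlyCharge(double cpuHours, double gbMonths) {
        return cpuHours * RATE_PER_CPU_HOUR + gbMonths * RATE_PER_GB_MONTH;
    }

    public static void main(String[] args) {
        // Metered usage per consuming department for the month: {CPU-hours, GB-months}
        Map<String, double[]> usage = Map.of(
                "web-storefront", new double[]{1200, 300},
                "reporting",      new double[]{250, 900});

        usage.forEach((dept, u) ->
                System.out.printf("%s owes %.2f%n", dept, monthlyCharge(u[0], u[1])));
    }
}
```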

So I ask myself: when every technology decision maker, investor and consumer chooses to embark on the journey to achieve 'cloud driven' economies of scale, do they have a strategy? Or are we all just consuming the hype, until the next one surfaces and we drop cloud and adore the 'rainmaker'!!

Thoughts?
:)
Nitin

Wednesday, September 7, 2011

How does IMDG solve the Scalability problem?

WebSphere eXtreme Scale (WXS) is an In-Memory Data Grid (IMDG) implementation.


Fundamentals: How does IMDG solve the Scalability problem?

Understanding Scalability:

In understanding the scalability challenge addressed by WebSphere eXtreme Scale, let us first define and understand scalability.

Wikipedia defines scalability as a "desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added."

· Scalability in a system is about the ability to do more, whether it is processing more data or handling more traffic, resulting in a higher transaction volume.

· Scalability poses great challenges to database and transaction systems.

· An increase in data can expose demand constraints on back-end database servers.

· Simply adding more hardware can be a very expensive and short-term approach to processing ever-growing data and transaction volumes.

At some point, either due to practical, fiscal or physical limits, enterprises are unable to continue to "scale up" by simply adding hardware to a single server. The progressive approach then adopted is to "scale out" by adding additional database servers and using a high-speed connection between them to provide a fabric of database servers. This approach, while viable, poses challenges around keeping the database servers synchronized. It is important to keep the databases in sync for data integrity and crash recovery.

Solution: In-Memory Data Grid, or IBM's WebSphere eXtreme Scale

WebSphere eXtreme Scale complements the database layer to provide a fault-tolerant, highly available and scalable data layer that addresses the growing concerns around the data and, eventually, the business.

· Scalability is never an IT problem alone. It directly impacts the business applications and the business unit that owns the applications.

· Scalability is treated as a competitive advantage.

· The applications that are scalable can easily accommodate growth and aid the business functions in analysis and business development.

WebSphere eXtreme Scale provides a set of interconnected Java processes that hold data in memory, thereby acting as shock absorbers for the back-end databases. This not only enables faster data access, as the data is accessed from memory, but also reduces the stress on the database.
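
As a rough illustration of the programming model, the sketch below uses the WXS local ObjectGrid API as I recall it; the grid name, map name, key and value are purely illustrative, and a production grid would normally run as separate container JVMs fronted by catalog servers rather than in-process. Treat it as a hedged sketch, not a definitive sample:

```java
import com.ibm.websphere.objectgrid.ObjectGrid;
import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
import com.ibm.websphere.objectgrid.ObjectMap;
import com.ibm.websphere.objectgrid.Session;

// Sketch of the WXS programming model using a local, in-process grid.
public class WxsSketch {
    public static void main(String[] args) throws Exception {
        // Create a local grid and define a map to hold cached customer records.
        ObjectGrid grid = ObjectGridManagerFactory.getObjectGridManager()
                .createObjectGrid("sampleGrid");
        grid.defineMap("customers");

        // All data access happens through a Session and is transactional.
        Session session = grid.getSession();
        ObjectMap customers = session.getMap("customers");

        session.begin();
        customers.insert("C001", "Jane Doe");   // write into the in-memory map
        session.commit();

        session.begin();
        Object value = customers.get("C001");   // read back from memory, not the DB
        session.commit();
        System.out.println(value);
    }
}
```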

more on this.....later.

So what is cloud computing?

As IT operations continue to evolve and transform the business towards agility and adaptability to the ever-changing rules of the marketplace, the efficiency of any IT operation is of paramount significance. The phrase ‘time to market’ has a completely new meaning in today’s dynamic business environment, where the only constant is change. This rapidly changing environment has led IT and business leaders alike to re-think the ‘procurement to provisioning’ process with one goal in mind: efficient use of resources. These resources include IT assets such as hardware and software; human capital such as administrators, developers, testers and other IT management staff; and the facilities employed in hosting the overall IT infrastructure. The efficiency goals are not only about cost savings but are also defined by business requirements, usually driven by external market forces and the availability of various enabling technologies. Cloud computing as a platform is an amalgamation of such enabling technologies. While the concept of cloud computing is not new, past efforts such as net(work) computing and various hardware and software virtualization technologies have attempted to address the need for an ‘unlimited’ resource pool capable of handling varying workloads. These efforts, while they did contribute towards a more mature cloud platform, each fell short, as a singular technology, of the vision of a true cloud computing platform.

So what is a cloud computing platform? Is it simply an automated provisioning system coupled with resource virtualization, where the workload is policy driven, resources are over-committed and any resource contention is handled by policy-driven resolution? As it turns out, technologies that provide provisioning, virtualization and policy enforcement form the building blocks of a true cloud computing platform, but no one technology is a cloud offering in and of itself.

Drawing differences between Apache Hadoop and WebSphere eXtreme Scale (WXS)


Hadoop:

Apache Hadoop is a software framework (platform) that enables the distributed manipulation of vast amounts of data. Introduced in 2006, it is supported by Google, Yahoo!, and IBM, to name a few. At the heart of its design are the MapReduce implementation and HDFS (Hadoop Distributed File System), which were inspired by MapReduce (introduced in a Google paper) and the Google File System.

MapReduce: MapReduce is a software framework introduced by Google that supports distributed computing on large data sets on clusters of computers (or nodes). It is the combination of two processes named Map and Reduce.

Note: MapReduce applications must have the characteristic of "Map" and "Reduce," meaning that the task or job can be divided into smaller pieces to be processed in parallel. Then the result of each sub-task can be reduced to make the answer for the original task. One example of this is Website keyword searching. The searching and grabbing tasks can be divided and delegated to slave nodes, then each result can be aggregated and the outcome (the final result) is on the master node.

In the Map process, the master node takes the input, divides it up into smaller sub-tasks, and distributes those to worker nodes. The worker node processes that smaller task, and passes the answer back to the master node.

In the Reduce process, the master node then takes the answers of all the sub-tasks and combines them to get the output, which is the result of the original task.

The advantage of MapReduce is that it allows for the distributed processing of the map and reduction operations. Because each mapping operation is independent, all maps can be performed in parallel, thus reducing the total computing time.
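
The canonical illustration of this model is word count: the Map step emits a (word, 1) pair for every token in its input split, and the Reduce step sums the counts per word. The sketch below uses the standard Hadoop MapReduce Java API; the input and output paths are placeholders supplied on the command line:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Word-count job: the Map step emits (word, 1) pairs in parallel,
// the Reduce step sums the counts for each word.
public class WordCount {

    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);   // emit (word, 1)
                }
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();                 // combine the partial counts
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));     // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));   // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```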


HDFS

From the perspective of an end user, HDFS appears as a traditional file system. You can perform CRUD actions on files at a given directory path. But, due to the characteristics of distributed storage, there are a "NameNode" and "DataNodes," each with its own responsibilities.

The NameNode is the master of the DataNodes. It provides metadata services within HDFS; the metadata describes how files map onto blocks on the DataNodes. It also accepts operation commands and determines which DataNode should perform an action or replication.

The DataNodes serve as the storage blocks for HDFS. They also respond to the commands to create, delete, and replicate blocks that they receive from the NameNode.
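
From the client's point of view, those CRUD actions go through the Hadoop FileSystem API; the NameNode resolves the path while the DataNodes stream the bytes. A small sketch (the path is just an example):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// The client talks to HDFS through the FileSystem API; the NameNode resolves
// the path to blocks and the DataNodes actually store and stream the bytes.
public class HdfsCrudSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();        // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/hello.txt");    // illustrative path

        // Create
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.writeUTF("hello hdfs");
        }

        // Read
        try (FSDataInputStream in = fs.open(file)) {
            System.out.println(in.readUTF());
        }

        // Delete (second argument: recursive)
        fs.delete(file, false);
        fs.close();
    }
}
```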

Use case:

  1. MapReduce Application
  2. Querying the data stored on the Hadoop cluster
  3. Data integration and processing (grid batch-type (ETL) applications)

WXS:

WebSphere eXtreme Scale complements the database layer to provide a fault-tolerant, highly available and scalable data layer that addresses the growing concerns around the data and, eventually, the business.

· Scalability is never an IT problem alone. It directly impacts the business applications and the business unit that owns the applications.

· Scalability is treated as a competitive advantage.

· The applications that are scalable can easily accommodate growth and aid the business functions in analysis and business development.

WebSphere eXtreme Scale provides a set of interconnected Java processes that hold data in memory, thereby acting as shock absorbers for the back-end databases. This not only enables faster data access, as the data is accessed from memory, but also reduces the stress on the database.

WebSphere® eXtreme Scale is an elastic, scalable, in-memory data grid. It dynamically caches, partitions, replicates, and manages application data and business logic across multiple servers. WebSphere eXtreme Scale performs massive volumes of transaction processing with high efficiency and linear scalability, and provides qualities of service such as transactional integrity, high availability, and predictable response times.

The elastic scalability is possible through the use of distributed object caching. Elastic means the grid monitors and manages itself, allows scale-out and scale-in, and is self-healing by automatically recovering from failures. Scale-out allows memory capacity to be added while the grid is running, without requiring a restart. Conversely, scale-in allows for immediate removal of memory capacity.

WebSphere eXtreme Scale can be used in different ways: as a very powerful cache, as an in-memory database processing space to manage application state, or as a platform for building powerful Extreme Transaction Processing (XTP) applications.

Use Case:

1. Extensible network-attached cache

2. In-memory data grid

3. Application cache (session and data)

References:

  1. http://www.ibm.com/developerworks/aix/library/au-cloud_apache/
  2. http://wiki.apache.org/hadoop/
  3. http://publib.boulder.ibm.com/infocenter/wxsinfo/v7r0/index.jsp?topic=/com.ibm.websphere.extremescale.over.doc/cxsoverview.html
  4. http://hadoop.apache.org/hbase/

Why Middleware in clouds?

Folks!
I have been working in the middleware realm for many years, and I figured I would blog about my experiences and thoughts and see what you all think.
All the posts on this blog are my own thoughts, and I kind of take full responsibility for them!

Enjoy and do provide your feedback!

:)
Nitin