Saturday, June 23, 2012

Testing Tools in SOA and Cloud Era


Traditionally, when an organization or project team talks about quality, the focus is on functional testing, regression testing, and load/stress testing, and the marketplace has been dominated by a few players such as IBM, Micro Focus, and HP/Mercury supplying the tools needed to meet customers' demands. But recent technological advancements in the areas of SOA/ESB, cloud, and Business Process Management (BPM) have redefined the quality management market, and many new players are trying to unsettle the so-called leaders in this market space.

Nowadays organizations are looking for tools not only to improve software quality but also to improve their productivity in quality-related activities, which eventually reduces time to market and improves the business bottom line. Competition in the quality market space is also heating up with the advent of new players such as iTKO's LISA, Parasoft's SOAtest, and Crosscheck Networks' SOAPSonar, at a time when most applications being developed are shifting from client/server or vanilla Web applications to SOA-based systems built around Web services and ESBs, and to cloud-based systems that combine services running internally with services deployed on cloud providers.

At a macro level, a few open source tools are also playing a vital role:
  • soapUI, a complete and automated testing solution that supports everything from SOAP- and REST-based Web services to JMS enterprise messaging layers, databases, Rich Internet Applications, and much more.
  • WebInject, a Perl-based tool for automated testing of web applications and web services.
  • TestMaker, which offers an easier way to expose performance bottlenecks and functional issues in Web, Rich Internet Application (Ajax, Flex), SOA, and BPM applications.
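The core idea behind all of these tools — call a service endpoint, then assert on the response — can be sketched in a few lines of plain Python. The endpoint, the JSON payload, and the order-status field below are invented for illustration; a real functional test would point at an actual deployed Web service instead of the stand-in server started here.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Stand-in for the service under test (hypothetical "order status" API);
# a real SOA test would target an actual Web service endpoint.
class OrderStatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"orderId": "42", "status": "SHIPPED"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the test output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), OrderStatusHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The functional "test case": call the endpoint and assert on the payload.
url = f"http://127.0.0.1:{server.server_port}/orders/42"
with urlopen(url) as resp:
    assert resp.status == 200
    payload = json.loads(resp.read())

assert payload["status"] == "SHIPPED"
print("order-status check passed")
server.shutdown()
```

Commercial suites layer test management, data-driven inputs, and load generation on top, but the request/assert cycle above is the building block they all automate.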


With the quality management area still very volatile, as a growing number of vendors continue to enter the market on one side and vendor consolidation (such as the acquisition of iTKO by CA Technologies and of Green Hat by IBM) happens on the other, the coming years will be very interesting for the organizations, developers, and QA teams involved in SOA, BPM, and cloud application development.

Monday, June 4, 2012

Hadoop in Big Data Era


Hadoop is a generic processing framework designed to execute queries and batch read operations against massive datasets across clusters of computers. It enables organizations to scan through tons of data (first loaded into the Hadoop Distributed File System, HDFS) and produce results that are meaningful to them. Simply put, Hadoop is the key open source technology that provides a Big Data engine.
Hadoop operates on massive datasets by horizontally scaling the processing across very large numbers of servers through an approach called MapReduce, rather than by vertical scaling, which requires a single powerful server to process the huge volume of data in a timely manner.

Hundreds or thousands of small, inexpensive, commodity servers do have the power if the processing can be horizontally scaled and executed in parallel. Using the MapReduce approach, Hadoop splits up a problem, sends the sub-problems to different servers, and lets each server solve its sub-problem in parallel. It then merges all the sub-problem solutions together and writes out the solution into files which may in turn be used as inputs into additional MapReduce steps.
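The split/solve/merge flow described above can be illustrated with the classic word-count example, simulated here in plain Python. This is only a toy sketch of the MapReduce idea; a real Hadoop job would express the same map and reduce functions through Hadoop's API and run them in parallel across many servers over HDFS data.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    # Map: each server emits (word, 1) pairs for its input split.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as Hadoop does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: merge the sub-problem solutions for one key.
    return key, sum(values)

# Two "input splits" standing in for files stored in HDFS.
splits = ["big data big engine", "data engine"]
mapped = chain.from_iterable(map_phase(s) for s in splits)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'big': 2, 'data': 2, 'engine': 2}
```

The output files of one such job can, as noted above, feed further MapReduce steps — for instance a second job that sorts the counts.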
Although Hadoop provides a platform for data storage and parallel processing, the real value comes from add-on subprojects (ZooKeeper, Pig, Hive, Lucene, HBase, etc.), which add functionality and new capabilities to the platform. Most implementations of a Hadoop platform will include at least some of these subprojects; for example, an organization may choose HDFS as the primary distributed file system, HBase as the database for storing billions of rows of data, and MapReduce as the framework for distributed processing.

A number of companies are emerging with different plans to help organizations use Hadoop: by extending support, by providing professional services, by producing tools that work alongside Hadoop and make it easier to use, or by providing a complete platform (based on Hadoop) that addresses many enterprise needs. It is worthwhile to look at a few of the players in this segment.

InfoSphere BigInsights
IBM took the open source Big Data technology, Hadoop, and extended it into an enterprise-ready Big Data platform.
IBM delivers a Hadoop platform that is hardened for enterprise use, with deep consideration for high availability, scalability, performance, ease of use, and the other things one normally expects of a solution deployed in a production environment.
InfoSphere BigInsights also flattens the time-to-value curve associated with Big Data analytics by providing development and runtime environments for developers to build advanced analytical applications, and tools for business users to analyze the data.
Cloudera CDH
(Cloudera's Distribution for Hadoop)
Cloudera delivers an integrated Apache Hadoop-based stack containing all the components needed for production use, tested and packaged to work together. It incorporates only software from open source projects (no forks or proprietary underpinnings) and comes with Cloudera Manager, an end-to-end management application for Apache Hadoop that includes features such as proactive health checks and intelligent log management.
M5
MapR’s M5 makes Hadoop more reliable (full data protection, no single point of failure), more affordable, more manageable, better performing, and significantly easier to use.


To put it in perspective, Hadoop should never be considered a replacement for relational databases or data warehousing, but something that will coexist with and complement the traditional data store to provide richer capabilities to the organization. While traditional warehouses are ideal for analyzing structured data from various systems, the sheer magnitude of unstructured and semi-structured data involved makes it very sensible to use the cheap cycles of server farms to transform masses of unstructured data with low information density into smaller amounts of dense, structured data that is then loaded into a traditional database for further analysis.
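A minimal sketch of that condensation step, in plain Python: raw, high-volume log lines (the log format and field names here are invented for illustration) are boiled down into a handful of dense, structured rows that a traditional database could ingest. On Hadoop, this same transformation would typically run as a MapReduce job over files in HDFS.

```python
import re
from collections import Counter

# Hypothetical raw web-server lines: lots of them, low information density.
raw_logs = [
    "10.0.0.1 GET /checkout 200",
    "10.0.0.2 GET /search 500",
    "10.0.0.1 GET /checkout 200",
]

line_re = re.compile(r"(?P<ip>\S+) (?P<method>\S+) (?P<path>\S+) (?P<status>\d+)")

# Condense the mass of lines into small aggregated facts:
# hit counts per (path, status) pair.
hits = Counter()
for line in raw_logs:
    m = line_re.match(line)
    if m:
        hits[(m["path"], m["status"])] += 1

# Dense structured rows, ready to load into a traditional database.
rows = [{"path": p, "status": int(s), "hits": n} for (p, s), n in hits.items()]
print(rows)
```

Three raw lines become two summary rows here; at Hadoop scale, billions of lines become a table small enough for a conventional warehouse to analyze.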

To conclude, open source Hadoop offers a great deal of potential for enterprises to harness data (structured, semi-structured, or with no structure at all) that was until now difficult to manage and analyze. Hadoop is also gaining wider acceptance with vendors, who are coming out with various Hadoop-based stacks to provide a significantly better user experience.