Tag: Hadoop

Google partners have another Cloud product

Google is adding another product to its range of big data services on the Google Cloud Platform today.

Dubbed the Cloud Dataproc service, the product is in beta, but Google Beta products normally stay that way for years.

The service sits between managing the Spark data processing engine or Hadoop framework directly on virtual machines and a fully managed service like Cloud Dataflow.

This allows the partner to orchestrate data pipelines on Google’s platform.

Dataproc users can create a Hadoop cluster in under 90 seconds and Google will only charge 1 cent per virtual CPU/hour in the cluster. That is on top of the usual cost of running virtual machines and data storage, but you can add Google’s cheaper preemptible instances to your cluster to save a bit on compute costs. Billing is per-minute, with a 10-minute minimum.
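As a rough illustration of how that pricing adds up, here is a minimal Python sketch of the Dataproc fee alone. The rate and the 10-minute minimum come from the figures above; the function name, the 24-vCPU cluster size and the choice to round part-minutes up are assumptions made for the example, not anything Google has published here.

```python
import math

# Figures quoted in the article.
DATAPROC_RATE_PER_VCPU_HOUR = 0.01  # 1 cent per virtual CPU per hour
MINIMUM_BILLED_MINUTES = 10         # per-minute billing, 10-minute minimum


def dataproc_fee(vcpus: int, runtime_minutes: float) -> float:
    """Return the Dataproc fee in dollars (hypothetical helper).

    This excludes the underlying virtual machine and storage costs,
    which are billed separately.
    """
    # Assumption: part-minutes are rounded up before the minimum is applied.
    billed_minutes = max(MINIMUM_BILLED_MINUTES, math.ceil(runtime_minutes))
    return vcpus * DATAPROC_RATE_PER_VCPU_HOUR * billed_minutes / 60


if __name__ == "__main__":
    # A made-up 24-vCPU cluster: a 5-minute job is billed as 10 minutes.
    print(f"5-minute job:  ${dataproc_fee(24, 5):.2f}")
    print(f"90-minute job: ${dataproc_fee(24, 90):.2f}")
```

On these assumptions, the fee is small next to the cost of the machines themselves, which is why the preemptible instances matter more to the final bill.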

Users can set up ad-hoc clusters when needed and, because the service is managed, Google will handle the administration for them.

It is compatible with all existing Hadoop-based products, and it should be a doddle to port existing workloads over to Google’s new service.

Some punters want total control over their data pipeline and processing architecture and are more likely to want to run and manage their own virtual machines. Dataproc users won’t have to make any real tradeoffs when compared to setting up their own infrastructure.

IBM teams up with Twitter

Big Blue is very busy with its cloud data services and data analytics and today has penned an agreement with Twitter aimed at enterprises and developers.

The deal means IBM will deliver cloud data services with Twitter built in, meaning that companies can use analytics to mine meaningful data from the flood of tweets that hit cyberspace every day.

IBM described Twitter as unlike any other data source in the world because it happens in real time, and is public and conversational.

IBM claims it can separate the signal from the noise by analysing tweets alongside millions of data points from other public data.

The deal means that developers can search, explore and examine data using IBM’s Insights for Twitter service on Bluemix.

The company said it can also analyse Twitter data by configuring BigInsights on Cloud and combining the tweets with IBM’s Enterprise Hadoop-as-a-service.

It has already given 4,000 of its own staff access to Twitter data.


ITC intros high speed analytics

ITC Infotech said that it has introduced an enterprise analytics system that lets users more easily access high speed data analytics.

The product, called ZEAS (Z Enterprise Analytics Solution), uses a graphical user interface to analyse big data with the minimum of coding.

The product supports Hadoop open source technology and ITC claims that it will let enterprises analyse big data five times faster than its competitors’ offerings.

It also claimed that data analysis projects that would have taken months for experienced Hadoop developers to implement can now be done in weeks.

ZEAS also includes a data operation centre that gives enterprise grade access controls, monitoring and alerting mechanisms for data management.

The company introduced the offering at the Strata+Hadoop World conference held in San Jose this week.

ITC Infotech is a subsidiary of $7 billion company ITC that provides services to global customers. It targets the banking, financial services and insurance sectors.

IBM makes big data push

Big Blue said it has launched a data analytics offering, IBM BigInsights for Apache Hadoop.

The offering provides machine learning, R, and other features that can tackle big data.

IBM claimed that while many think Apache Hadoop is powerful for collecting and storing large sets of variable data, companies are failing to realise its potential.

Its offering has a broad data science toolset for querying and visualising data, and provides scalable distributed machine learning.

The offering includes Analyst, which includes IBM’s SQL engine, and Data Scientist, which provides a machine learning engine that ranges over big data to find patterns.

Enterprise Management includes tools to optimise workflows, and management software to give faster results.

IBM also said it has joined the Open Data Platform (ODP) association, which is aiming to standardise Hadoop and big data technologies.

IBM intros big mainframe

Big Blue said it has built a mainframe which is the most powerful and secure system ever.

The Z13 can churn through 2.5 billion transactions a day, and includes embedded analytics.

IBM said the system took five years and $1 billion to develop, includes 500 new patents and is a collaborative venture with more than 60 of its customers.

The machine allows real time encryption of mobile transactions, which uses some of these patents.

The Z13’s embedded analytics allow it to give real time insights on transactions, including fraud protection.

IBM said that the Z13 also includes the fastest microprocessor in the world, is twice as fast as Intel microprocessors and has 300 percent more memory.

The Z13 also includes native support for Hadoop and improvements to the IBM DB2 analytics accelerator.

IT ready for Big Data

A survey of 100 IT decision makers from top dollar firms has revealed that enterprises are doing more than dipping their toes in the ocean of Big Data.

Syncsort, which is in the Big Data business itself, said that 62 percent of its respondents will optimise their enterprise data warehouses by sending data and batch workloads to Hadoop.

And 69 percent of the people it polled said they expect to make their enterprise wide data available in Hadoop.

Meanwhile, just over half of the respondents are likely to spend between five and 10 percent of their budgets on Big Data projects.

Over 70 percent of the respondents work for companies with turnovers of more than $50 million.

It seems that the IT guys don’t have problems proving the benefits of Big Data to the senior suits that authorise the buys. It appears from the survey that fewer than 25 percent of those polled have problems allocating budgets to their Big Data plans.

Teradata snaps up Think Big Analytics

Analytic data company Teradata has bought Think Big Analytics.

It bought the company for its Hadoop and big data consulting capabilities, it said in a statement.

Teradata didn’t say how much it paid for the firm, but said Think Big’s team will stay in place. It will continue to use the Think Big brand.

CEO Mike Koehler said it is Teradata’s third buy in six weeks. All, he said, will help to achieve its goal of being the market leader.

“Think Big’s consulting expertise enhances Teradata’s capability to advise customers on the best way to leverage diverse, open source big data technologies to grow their businesses,” he said.

Think Big, said Teradata, has heaps of experience with a number of Hadoop distributions, including Hortonworks, Cloudera and MapR.

Hadoop makes the enterprise grade

An IDC survey commissioned by Red Hat indicates Hadoop is reaching critical mass in the business world.

According to IDC, 32 percent of those surveyed already deploy Hadoop; 31 percent will deploy it in the next 12 months and 36 percent indicated they would deploy it in the future.

And the use of Hadoop is not just for analysing big data.

IDC said that 39 percent of the respondents use NoSQL databases such as HBase, Cassandra and MongoDB, and 36 percent said they use MPP databases such as Greenplum and Vertica.

While businesses use Hadoop for analysis of raw data, 39 percent of the respondents use it for “if-then” modelling of products and services.

The IDC survey also showed that many businesses use alternatives to HDFS such as Big Blue’s Global File System, EMC’s Isilon OneFS and Red Hat Storage (that is, GlusterFS).

Rackspace intros on-demand e-learning

Open cloud provider Rackspace has introduced an on demand e-learning training course with a view to bringing about wider adoption of OpenStack technologies.

Customers will be able to register for courses that promise to teach ways to use and deploy an OpenStack powered cloud. The on demand e-learning version of Rackspace’s OpenStack Fundamentals will be available to the public in October, though pre-registration is available now.

Additionally, Rackspace is introducing four further in-person courses.

These are Networking-Neutron, where students can learn how to use Neutron to provide Networking-as-a-Service and are encouraged to use an API to build and configure networking infrastructure; Building Cloudy Apps, which sees students using Python to learn about horizontal scaling and APIs; Security in the Cloud, which is self explanatory; and Hadoop on OpenStack, which is too.

Certified training partners for the Fundamentals courses include, worldwide, New Horizons, Skyline Advanced Technology Services, and Intelligent Cloud Technologies.

Course overviews and schedules are available at Rackspace’s training website.

Rackspace boasts it’s expanding the program because of rapid growth in OpenStack, which had over 10,000 contributors at its three year anniversary in July this year. Citing the BSA global cloud scorecard for 2013, it said 14 million cloud jobs should emerge by 2015, so there’s plenty of room for Rackspace to work.

“Rackspace recognises the need for comprehensive educational courses and delivery models and is fundamentally revolutionising OpenStack training to include a Certified Training Partner Programme and on demand e-learning course,” said Tony Campbell, director of training and certification for OpenStack.

HP expands its big data products

HP has expanded its big data products portfolio so that partners can tailor products to help clients squeeze more out of their business information.

There is a lot of money in these sorts of products. According to HP research, nearly 60 percent of companies surveyed will spend at least 10 percent of their innovation budget on big data this year.

However, the study also found that one in three organisations has failed with a big data initiative and is wary of getting its fingers burnt again.

HP thinks its new enhanced portfolio delivers big data out of the box so that it can enable enterprises to handle the growing volume, variety, velocity and vulnerability of data that can cause these initiatives to fail.

The new product range is based around HAVEn, a big data analytics platform that uses HP’s analytics software, hardware and services.

George Kadifa, executive vice president, Software, said that big data enables organisations to take advantage of the totality of their information, both internal and external, in real time.

It produces extremely fast decision making, resulting in unique and innovative ways to serve customers and society.

HAVEn combines proven technologies from HP Autonomy, HP Vertica, HP ArcSight and HP Operations Management, as well as key industry initiatives such as Hadoop.

It avoids vendor lock-in with an open architecture that supports a broad range of analytics tools and protects investments with support for multiple virtualisation technologies.

HAVEn uses all information collected including structured, semistructured and unstructured data, via HP’s portfolio of more than 700 connectors into HAVEn.

It means that organisations can consume, manage and analyse massive streams of IT operational data from a variety of HP products, including HP ArcSight Logger and the HP Business Service Management portfolio, as well as third-party sources.

In addition to this, HP announced its Vertica Community Edition. This is free, downloadable software that delivers the same functionality as the HP Vertica Analytics Platform Enterprise Edition with no commitments or time limits. Clients can analyse up to a terabyte of data before spending more cash on an enterprise-wide solution.

There is also HP Autonomy Legacy Data Cleanup, an information governance package. According to HP this helps clients analyse legacy data, lower costs and reduce risks while squeezing value from big data.