Google’s cloud portfolio keeps growing, with new services and products arriving at a steady pace. At its ongoing I/O conference, Google announced the launch of Cloud Dataflow, a managed data processing service.
Developers can create data pipelines to analyze data, and can use Cloud Dataflow both on live streaming data and on data uploaded to the system. Cloud Dataflow is still in beta, and pricing and other terms of usage are yet to be announced, but the service looks promising. It takes over from MapReduce and is meant to be a refined offering based on Google's Flume and MillWheel technologies. The Cloud Dataflow SDK is Java-based, and developers get a dashboard for monitoring their data pipelines. The end purpose is to help users draw insights from data at reduced operational cost, without the bother of deploying and maintaining infrastructure. The service can take in any streaming data and, in batch mode, sources such as newline-delimited text files and BigQuery tables.
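To make the idea of a data pipeline concrete, here is a minimal sketch in plain Python. It does not use the Cloud Dataflow SDK (which is Java-based) and does not reflect its API. The record format (`user,latency_ms`), the step names, and the `run_pipeline` helper are all assumptions made up for illustration. It only shows how the same chain of steps can run over a finite batch of newline-delimited text or over records that arrive one at a time.

```python
def parse_lines(lines):
    """Turn each non-empty 'user,latency_ms' line into a dict (hypothetical format)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, latency = line.split(",")
        yield {"user": user, "latency_ms": int(latency)}


def keep_slow(records, threshold_ms=100):
    """Keep only records at or above the latency threshold."""
    for record in records:
        if record["latency_ms"] >= threshold_ms:
            yield record


def run_pipeline(source, steps):
    """Chain the steps lazily over any iterable source (illustrative helper)."""
    data = iter(source)
    for step in steps:
        data = step(data)
    return data


# Batch mode: a newline-delimited text payload read all at once.
batch = "alice,120\nbob,40\n\ncarol,300\n".splitlines()
batch_result = list(run_pipeline(batch, [parse_lines, keep_slow]))


# Streaming-style input: records are produced one at a time by a generator.
def live_source():
    yield "dave,500"
    yield "erin,10"


stream_result = list(run_pipeline(live_source(), [parse_lines, keep_slow]))

print(batch_result)
print(stream_result)
```

Because every step is a generator, nothing is processed until the result is consumed, so the same step definitions serve both inputs. A managed service would additionally handle distribution, scaling, and fault tolerance, which this sketch leaves out.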
Cloud Dataflow plugs a gap in Google’s cloud offerings, bringing it on par with Amazon and its Kinesis service. The beta also integrates with BigQuery: large volumes of data can be filtered and prepared for writing into BigQuery, and Dataflow can read from BigQuery and combine data from multiple sources. Google is currently using Twitter to gauge the developer community's reaction to Dataflow.