Google Cloud: What to Expect. Next data release: BigLake supports Apache Iceberg, Hudi, and Delta Lake; BigQuery now supports unstructured data, Apache Spark, and Datastream; Looker Studio unifies business intelligence products; and Vertex AI Vision is now available
Google announced on Tuesday that it will add support for the most commonly used open-source data lake table formats to BigLake, as part of its ongoing effort to support all types of data and provide a one-stop data platform.
"Our BigLake storage engine will gain support for Apache Iceberg, Databricks' Delta Lake, and Apache Hudi," Gerrit Kazmaier, Google Cloud's vice president of data analytics, wrote in a blog post. "We can help eliminate barriers that prevent organizations from getting the most out of their data by supporting these widely used data formats."
The company stated that support for Apache Iceberg will be available in preview and that support for Hudi and Delta Lake will be available soon. A timetable for the preview and general availability was not provided.
Matt Aslett, who is the research director at Ventana Research, says that Google has decided to support open-source table formats because adding them will give data lakes the ability to manage transactions.
"More than half (57%) of data lake adopters are currently using at least one of these emerging table formats," Aslett said. "This could increase the likelihood that data lakes, rather than data warehouses, will be used to support structured data processing-based analytics workloads."
However, a recent study by Ventana Research called Data Lakes Dynamics Insights found that less than one-quarter of organizations have used a data lake to replace an existing data warehouse environment. In nearly three-quarters of organizations, both data lakes and data warehouse environments co-exist.
"This benefits Google's BigLake because it can address both data warehousing and data lake approaches in a single environment," Aslett said.
Doug Henschen, a principal analyst at Constellation Research, says that Google's support for these open-source table formats appears to be a reaction to product updates from Snowflake and Databricks.
"Apache Iceberg is the hot new option gaining traction because it promises openness as well as performance gains," said Henschen. "However, Google is making it clear that it is not picking sides by promising support for Delta Lake and Hudi as well."
Tony Baer, the chief analyst at dbInsight, says that Oracle, a competitor of Google, may also announce similar features at its upcoming annual CloudWorld conference.
BigQuery can handle unstructured data.
As part of its Cloud Next announcements, Google talked about new features for its managed enterprise data warehouse, BigQuery. One of these features is support for unstructured data.
"Starting today, data teams can analyze both structured and unstructured data in BigQuery, with easy access to Google Cloud's capabilities in machine learning (ML), speech recognition, computer vision, translation, and text processing," Kazmaier wrote.
According to Google, most enterprise data teams primarily use structured data, which accounts for only 10% of all data produced. Structured data includes information from operational databases, SaaS apps like Adobe, SAP, ServiceNow, and Workday, and semistructured data in the form of JSON log files.
On the other hand, unstructured data includes video from TV archives, audio from call centers or radio, and documents in many different formats.
According to Google, enterprises are increasingly required to work with unstructured data.
Analysts think that Google's choice to support unstructured data is a way for the company to stand out among cloud service providers.
No other competing cloud service provider is currently addressing the need for unstructured data support as aggressively as Google, Henschen explained.
Henschen added that responding to all data types on a single platform promises to simplify things for CIOs, data scientists, and developers alike.
Google also announced support for Apache Spark, an open-source unified analytics engine. Analysts say that the move fits with the company's plan to make its cloud service a modern lakehouse for analytics, data warehousing, and data science.
According to the company, the new integration, which will be in private preview, will allow enterprise data teams to use Apache Spark to create BigQuery procedures that integrate with their SQL pipelines.
By embracing Spark, Google is embracing the most popular choice among data scientists, Henschen explained.
"Unlike Google, Snowflake is still in the early stages of data science. Its Snowpark offering on top of its database uses Python and other languages, and it relies heavily on partners for help," Henschen said.
Another competitor, Databricks, has also made its platform better at supporting data warehouse and business intelligence (BI) workloads.
Meanwhile, Google has integrated BigQuery with its change stream service, Datastream.
According to the company's blog, the new integration will help organizations more effectively replicate data from various sources, including real-time data in AlloyDB, PostgreSQL, MySQL, and third-party databases such as Oracle.
Google has also updated its Dataplex data unifier service to automate data quality processes.
In a blog post, Kazmaier wrote, "For example, users will be able to more easily understand data lineage—where data comes from and how it has been transformed and moved over time, reducing the need for time-consuming manual processes."
Looker Studio brings together business intelligence products.
At Cloud Next, the company said that it will combine its business intelligence products by putting Looker and Data Studio together to make Looker Studio, which will come in three different flavors.
In a blog post, Wright, senior director of BI product management at Google Cloud, wrote, "Looker Studio currently supports over 800 data sources with a catalog of over 600 connectors, making it simple to explore data from various sources."
Looker Studio will offer access to data models in private preview, and the company says it will also get a new interface. The company has also said that the base version of Looker Studio will be free.
Looker was a paid service before the merger, and Data Studio was a free service. According to Aslett, the free version will not include support. Enterprises will have to upgrade to Looker Studio Pro to get support and new features.
"Upgrading to Looker Studio Pro will provide customers with new enterprise management features, team collaboration capabilities, and SLAs [service level agreements]. This is only the first release, and we've developed a roadmap of capabilities that our enterprise customers have requested, beginning with Dataplex integration for data lineage and metadata visibility," Wright said.
According to the company, Looker also now supports data access via visualization tools such as Tableau and Microsoft Power BI.
Vertex AI Vision has been released.
Google has added a new feature called Vertex AI Vision to its machine learning platform Vertex AI to help developers and data scientists build and deploy computer vision-based applications.
With the launch of the Vertex AI platform last May, followed by the release of the collaborative development environment Vertex AI Workbench in October, the company has been working to simplify machine learning (ML) operations.
The new end-to-end application development environment will help users ingest, analyze, and store visual data, the company said, claiming that the new service can cut the time it takes to create computer vision applications from weeks to hours, at one-tenth the cost of current offerings.
Google says it can do this by providing a more user-friendly interface and a library of machine learning models that have already been trained to do common tasks like counting the number of people in a room, recognizing products, and finding objects.
"You can also import your existing AutoML or custom ML models from Vertex AI into your Vertex AI Vision applications," the company said. "All of our new AI products, as always, adhere to our AI principles."