Arcion now reads logs from Oracle, promises 10x faster data replication

California-based Arcion (formerly Blitzz), which offers a fully managed platform for replicating transactional data to cloud-based data platforms in real time, is making data extraction from Oracle databases faster with a new native log reader.

The capability, part of Arcion’s latest release, enables enterprises to read logs directly from their Oracle instance during replication, eliminating the need for LogMiner or other less efficient sources. According to the company, this, combined with its distributed and parallel architecture, delivers unlimited scalability and 10 times faster data extraction to target platforms such as Databricks, Snowflake, MySQL, PostgreSQL, SingleStore and Yugabyte.

“Arcion is the only end-to-end multithreaded CDC [change data capture] solution that auto-scales vertically and horizontally. Any process Arcion runs on source and target is parallelized using patent-pending techniques to achieve maximum throughput. There isn’t a single step within the pipeline that is single-threaded. It gives Arcion users ultra-low latency CDC replication and can always keep up with the forever increasing data volume on the source. If an enterprise wants to migrate or replicate terabyte-scale data that requires high throughput, Arcion is the answer,” Gary Hagmueller, the CEO of the company, told VentureBeat.

While newer data integration tools such as Airbyte, Debezium, StreamSets and Kafka connectors lack a native Oracle log reader, many older CDC tools (Qlik Attunity, Fivetran-acquired HVR) do offer the capability. However, as Hagmueller pointed out, all these older solutions require material effort to both set up and manage, which is not the case with Arcion.

Making data replication easier

In addition to the native reader for Oracle users, the latest Arcion release also simplifies the handling of DDL (data definition language) schema changes and data transformation for enterprises.

As part of the former, the platform's schema evolution capability has been extended to automatically capture DDL changes from a source database and replicate them in the target data platform. The feature saves data engineers the manual work of keeping schemas aligned between source and target databases. Previously, if the DDL or schema changed on the source database, they had to stop the replication process and rebuild it from scratch by snapshotting the source system. This led to downtime, wasted expensive compute resources and created opportunities for user error and data loss.
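To make the idea concrete, here is a minimal Python sketch of schema evolution in a change stream. It is not Arcion's implementation or API: the event format, the `TargetTable` class and the `apply_event` and `replicate` functions are all hypothetical names invented for illustration. It only shows the general pattern of applying a DDL change in line with the data changes, so that replication continues without a fresh snapshot.

```python
from dataclasses import dataclass, field


@dataclass
class TargetTable:
    """Hypothetical in-memory stand-in for a table on the target platform."""
    columns: dict = field(default_factory=dict)  # column name -> type name
    rows: list = field(default_factory=list)     # each row is a dict


def apply_event(table: TargetTable, event: dict) -> None:
    """Apply one change event (DDL or DML) to the target table in place."""
    kind = event["kind"]
    if kind == "add_column":
        # DDL: extend the schema and backfill existing rows with None.
        table.columns[event["name"]] = event["type"]
        for row in table.rows:
            row.setdefault(event["name"], None)
    elif kind == "drop_column":
        # DDL: shrink the schema and remove the column from existing rows.
        table.columns.pop(event["name"], None)
        for row in table.rows:
            row.pop(event["name"], None)
    elif kind == "insert":
        # DML: keep only columns known to the current target schema.
        table.rows.append({c: event["row"].get(c) for c in table.columns})
    else:
        raise ValueError(f"unsupported event kind: {kind}")


def replicate(events: list) -> TargetTable:
    """Replay an ordered stream of events without stopping on DDL changes."""
    table = TargetTable(columns={"id": "int", "name": "text"})
    for event in events:
        apply_event(table, event)
    return table


# Assumed sample stream: a schema change arrives between two inserts.
EVENTS = [
    {"kind": "insert", "row": {"id": 1, "name": "Ada"}},
    {"kind": "add_column", "name": "email", "type": "text"},
    {"kind": "insert", "row": {"id": 2, "name": "Bo", "email": "bo@example.com"}},
]

if __name__ == "__main__":
    result = replicate(EVENTS)
    print(result.columns)
    print(result.rows)
```

In this sketch the row inserted before the schema change is backfilled with an empty `email` value, and the row inserted afterward carries the new column, with no restart in between.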

“Oracle Golden Gate is one CDC solution that supports automatic schema evolution (DDL). But Arcion is the only CDC platform that supports out-of-the-box DDL with modern analytic warehouses like Snowflake or Databricks. Oracle Golden Gate does not provide very robust support for Snowflake and Databricks, so anyone adopting such systems will find that solution inadequate. Alternatively, the data team has to be ready to invest in manual resources to handle the schema evolution with other alternative CDC solutions,” the CEO noted.

Meanwhile, to help enterprises better handle data transformations, Arcion is introducing a zero-code feature that delivers flexible, high-performance streaming column transformations on the fly. This eliminates the need to expend engineering resources on creating a staging layer (e.g., Kafka) and writing custom code to transform data on the target, a practice that also delayed SLAs.
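As a rough illustration of what an in-flight column transformation means, the Python sketch below applies per-column functions to records as they stream past, with no staging step. It does not reflect how Arcion's zero-code feature is configured or built: `transform_stream`, `TRANSFORMS` and the sample record are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator


def transform_stream(
    records: Iterable[dict],
    column_transforms: Dict[str, Callable],
) -> Iterator[dict]:
    """Lazily apply per-column transforms to each record as it streams by."""
    for record in records:
        out = dict(record)  # copy so the source record is left untouched
        for column, fn in column_transforms.items():
            if column in out:
                out[column] = fn(out[column])
        yield out


# Assumed mapping of column name -> transformation function.
TRANSFORMS = {
    "email": str.lower,  # normalize case
    "name": str.strip,   # trim surrounding whitespace
}

if __name__ == "__main__":
    source = [{"id": 1, "name": "  Ada ", "email": "ADA@Example.COM"}]
    for row in transform_stream(source, TRANSFORMS):
        print(row)
```

Because the function is a generator, each record is transformed and handed on as soon as it is read, rather than being landed in an intermediate store first.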

Oracle log reader availability

The Oracle log reader is currently available in beta and will see a wider rollout later this month, while the other two capabilities are now generally available as part of the fully hosted version of Arcion.

With this release, Arcion is also adding Google BigQuery and Azure-Managed SQL Server as new sources and Imply (founded by the original creators of Apache Druid) as a new target. In all, the platform supports over 20 enterprise databases and data warehouses for data replication. A few months ago, the company also raised $13 million in Series A funding at a valuation of $65 million.

"The data replication and protection software market showed much greater-than-expected resilience in 2020 despite the pandemic," Phil Goodwin, research director at IDC's infrastructure systems, platforms and technologies group, said. "We expect this market to return to its normal growth pattern, with a 2.7% CAGR through 2025. The public cloud services portion of the market is the bright spot, with an expected 11.6% CAGR during that time."
