News

Spark SQL, part of Apache Spark, is used for structured data processing by running SQL queries on Spark data. Srini Penchikala discusses the Spark SQL module and how it simplifies data analytics using SQL.
Doris, according to the Apache Software Foundation, is based on the integration of Google Mesa and Apache Impala, an open source MPP SQL query engine, developed in 2012 and based on the ...
Apache Phoenix is a relatively new open source Java project that provides a JDBC driver and SQL access to Hadoop’s NoSQL database: HBase. It was created as an internal project at Salesforce ...
The Apache Drill framework aims to provide just such a SQL engine. Drill can operate across multiple distributed data stores such as HDFS or Amazon S3, relational databases that support JDBC or ODBC, ...
The Apache Software Foundation (ASF) this week updated an open source Apache Drill tool that enables end users to query multiple data sources using SQL — without waiting for enterprise IT teams ...
Has anyone had any experience using VMware Converter to convert a CentOS 4.4 box running Apache and Sybase SQL Anywhere to an ESXi server? Any particular suggestions or tips for things to keep an ...
In addition to new SQL features and myriad other improvements, Hive 0.13 was released in coordination with Apache Tez 0.4, an interactive alternative to the oft-maligned batch-oriented MapReduce ...
Apache Kafka is a key component in data pipeline architectures when it comes to ingesting data. Confluent, the commercial entity behind Kafka, wants to leverage this position to become a platform ...
Confluent released KSQL: interactive, distributed streaming SQL engine for Apache Kafka. KSQL supports stream processing operations like aggregations, joins, windowing, and sessionization on topics in ...
In the latest release, Hive 0.12, the same query runs in 10 to 12 seconds. “So you’re seeing 60, 70, 100 times performance increase just because of this Stinger initiative works at making Apache Hive, ...