Data Storage and Management

Data Storage and Management news, information, and how-to advice

16 for '16: What you must know about Hadoop and Spark right now

Amazingly, Hadoop has been redefined in the space of a year. Let's take a look at all the salient parts of this roiling ecosystem and what they mean.

5 things we hate about Spark

Spark has dethroned MapReduce and changed big data forever, but that rapid ascent has been accompanied by persistent frustrations.

MariaDB pops up on Azure with new cluster service

Companies interested in deploying the increasingly popular open-source database on Microsoft's Azure cloud platform now have an easy way to do so.

Oracle fixes critical flaws in Database Server, MySQL, Java

The bad news: Java and Oracle's database products had lots of vulnerabilities. The good news: None are currently under attack.

Hadoop is slowly eating conventional analytics

The components of the Hadoop ecosystem won't overthrow Teradata or IBM Netezza any time soon, but ultimately, the commodity solution almost always wins.

Hadoop, in trouble? Only in Gartner-land

A new poll of customers provides a brighter, more detailed picture of Hadoop adoption than Gartner's famously downbeat survey.

Open source Java projects: Apache Spark

Set up and use Spark to analyze data contained in Hadoop, Splunk, files on a file system, local databases, and more.

How Apache Ranger and Chuck Norris help secure Hadoop

The Hadoop ecosystem has always been a bag of parts, each of which had to be secured separately -- until Apache Ranger came to town.

Review: MongoDB 3.0 reaches for the enterprise

MongoDB zeroes in on operations with pluggable storage engines and revamped management tools.

9 big data pain points

Do enough Hadoop and NoSQL deployments, and the same problems crop up again and again. It's time for the industry to nail them sooner rather than later.

Why streaming analytics is such a big deal

Analytics drive decisions, but some decisions shouldn't wait until batch processes complete -- which is why, eventually, we'll all analyze data as it streams in.

Which freaking Hadoop engine should I use?

These four truths will help you determine which Hadoop technology to use for the types of workloads you anticipate.

Big data, big challenges: Hadoop in the enterprise

Fresh from the front lines: Common problems encountered when putting Hadoop to work -- and the best tools to make Hadoop less burdensome.

Spark 1.4 adds support for R, Python 3, cluster management

Spark data processing framework adds languages used by many data crunchers, as well as container-based cluster management features.

A better mousetrap: A JSON data warehouse takes on Hadoop

Sure, a NoSQL or JSON data warehouse sounds faddish, but SonarW is a better solution for many.

LinkedIn fills another SQL-on-Hadoop niche

LinkedIn's open source, home-brew OLAP project is a new way for Hadoop users (and others) to query both real-time and historical data.

Spark and Storm face new competition for real-time Hadoop processing

DataTorrent is releasing its real-time data processing engine for Hadoop and beyond as the open source Project Apex.

Review: Storm’s real-time processing comes at a price

Storm may be the only real-time processing framework that has been proven to process millions of messages per second, but there's a steep learning curve ahead.