Big Data

Big Data | News, how-tos, features, reviews, and videos

Built for realtime: Big data messaging with Apache Kafka, Part 1

Apache Kafka scales horizontally and offers much higher throughput than some traditional messaging systems. Get started with installation, then build your first Kafka messaging system

Real-time data processing with data streaming: new tools for a new era

Real-time data streaming is still early in its adoption, but over the next few years organizations with successful rollouts will gain a competitive advantage

Why there are no shortcuts to machine learning

As long as companies understand that good data science takes time in an enterprise, and give these people room to learn and grow, they won’t need shortcuts

3 big data platforms look beyond Hadoop

Learn how the Cloudera, Hortonworks, and MapR data platforms are evolving to meet the demands for real-time analytics and machine learning

The era of the cloud database has finally begun

Enterprises are waking up to discover that their database needs have changed dramatically—and that the old-school RDBMS is no longer the best tool

What’s new in Apache Spark? Low-latency streaming and Kubernetes

Continuous processing and native Kubernetes support in Apache Spark 2.3 spell the end for micro-batching and Hadoop

Cython tutorial: How to speed up Python

How to use Cython and its Python-to-C compiler to give your Python applications a rocket boost

TensorFlow review: The best deep learning library gets better

At version r1.5, Google's open source machine learning and neural network library is more capable, more mature, and easier to learn and use

No, you shouldn’t keep all that data forever

Most of your old data is useless trash. So throw it away, rather than spend all the time and money hoping AI will figure something out about it

What is Apache Spark? The big data analytics platform explained

Fast, flexible, and developer-friendly, Apache Spark is the leading platform for large-scale SQL, batch processing, stream processing, and machine learning

What is big data? Everything you need to know

Analyzing lots of data is only part of what makes big data different from previous data analytics. Learn what the other three aspects are

13 frameworks for mastering machine learning

Venturing into machine learning? These open source tools do the heavy lifting for you

What is machine learning? Software derived from data

Building systems that learn from data is a better way to solve complex problems, given enough meaningful data to learn from

How to avoid big data analytics failures

Follow these six best practices to blow past the competition, generate new revenue sources, and better serve customers

Data is eating the software that is eating the world

The data-driven machine learning algorithms that power AI will not only upend programming, but lower the barriers to AI itself

Apache Spark 2.2 gets streaming, R language boosts

The latest additions to Apache's all-in-one in-memory processing framework simplify stream processing and flesh out support for the R language

NoSQL, no problem: Why MySQL is still king

You'd think the advent of 'webscale' NoSQL databases would have consigned MySQL to history. But you'd be very wrong

Aggregating with Apache Spark

Get an overview of threadless, multithreaded, and distributed aggregation using the Streams API, Java threads, and MapReduce, then see for yourself what Spark's cluster computing engine brings to the equation
