HBase lets you store big data on commodity machines. A table can scale to millions of rows and millions of columns, which puts real-time Big Data processing within reach of even individual developers. That is extremely difficult, if not impossible, with a Relational Database Management System (RDBMS). For an introduction to HBase, be sure to read HBase – An OpenSource BigTable Database. In this post, I will talk about data migrations and show you how to migrate your data from MySQL into an HBase data store.
Continue reading →
Migrating Data from MySQL into HBase
.NET Entity Framework: Using Database Migration to Seed a Database
If you have been following along, you know that I’ve been writing a series of posts on the .NET Entity Framework, and this post is the last in the series.
The Entity Framework (EF) is an object-relational database mapper that enables easy database access for .NET developers using domain-specific .NET objects that EF can generate for you. As shown in the Using Database Migration post, you can easily use EF to create your database from the classes that are defined in your project code. You can then use Code First Migrations to handle any database changes for the life cycle of your application.
Continue reading →
AngularJS Tips and Tricks Using ngResource
AngularJS is a JavaScript framework that can help you write web applications quickly using less code. AngularJS is not a library; rather, it is an entirely different way of thinking about writing web applications. As such, AngularJS has certain opinions about how you should use it. Whereas the vanilla JavaScript world gives you a blank canvas on which to paint, AngularJS provides structure, organization and patterns to work with.
In this post we are focusing on using the ngResource module in AngularJS, which is “a factory that creates a resource object and lets you interact with RESTful server-side data sources.” Read Chapter 9: Practical Applications, Ajax to see an example using ngResource with Ajax.
Continue reading →
Monoids for Programmers: A Scala Example
The term monoid frustrates a lot of programmers who otherwise are pretty versatile with higher-order generics, mutexes and even XSLT. This blog post will show how using monoids can be very simple and practical. Monoids are the basis of more complicated algebraic structures in mathematics, and are the underlying entities in many operations that we do while coding. There is a code sample in this post for Scala, which is becoming a lingua franca these days, and subsequent posts will focus on Scala.
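To make the idea concrete before you open the post, here is a minimal sketch of a monoid: a type with an associative combine operation and an identity element. It is written in TypeScript rather than Scala and is not the code sample from the post; the names `Monoid`, `sumMonoid`, `stringMonoid` and `combineAll` are hypothetical and chosen only for illustration.

```typescript
// A monoid bundles an identity element with an associative combine operation.
// (Hypothetical names for illustration; not the post's Scala sample.)
interface Monoid<A> {
  empty: A;
  combine(x: A, y: A): A;
}

// Numbers under addition: identity is 0.
const sumMonoid: Monoid<number> = { empty: 0, combine: (x, y) => x + y };

// Strings under concatenation: identity is the empty string.
const stringMonoid: Monoid<string> = { empty: "", combine: (x, y) => x + y };

// Folding a list needs nothing beyond the monoid itself.
function combineAll<A>(m: Monoid<A>, xs: readonly A[]): A {
  return xs.reduce((acc, x) => m.combine(acc, x), m.empty);
}

const total = combineAll(sumMonoid, [1, 2, 3, 4]);
const joined = combineAll(stringMonoid, ["mono", "id"]);
const nothing = combineAll(sumMonoid, []); // an empty list yields the identity
```

Because `combine` is associative, the list could be split into chunks, folded separately and merged afterwards, which is part of why the structure shows up so often in everyday code.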
Continue reading →
.NET Entity Framework: Using Database Migration
Entity Framework (EF) is an object-relational database mapper that enables easy database access for .NET developers using domain-specific .NET objects that EF can generate for you. As we saw in the first blog post on Entity Framework, you can use EF to create your database from the classes you have defined in your project code. You can add or remove properties from your classes and have EF migrate the changes into your database as well.
As developers, you often face projects where the specifications change over time for many reasons. Sometimes the client isn’t sure exactly what they need or their needs change during the course of the project. The design and analysis may determine that you need to add fields to the database or even remove fields from the database for one reason or another. For example, you may have new requirements to keep track of a user’s email address. Or you may need to add and track a user’s cell phone. A field/column entitled fName may even need to be renamed to firstName, or maybe you need to split a name into separate fields. There are many reasons that database fields/columns may need to change on a given project.
For this blog post, we are going to revisit the Visual Studio project used in The .NET Entity Framework Code First post. If you haven’t created that project, you can still follow along with this post, but we recommend viewing that sample code first.
Continue reading →
Making Node.js Talk
In our previous Using Streams in Node.js post, we introduced the concept of streams and what abstractions and tools Node.js has for consuming or producing them.
Streams have many uses, ranging from abstracting file access to streaming Hypertext Transfer Protocol (HTTP) request and response bodies. In this post, we’re going to see how we can make two or more Node.js processes talk to each other by building a communication protocol on top of streams. Be sure to read Pedro’s Chapter 9: Reading and Writing Streams of Data in Professional Node.js: Building Javascript Based Scalable Software to learn more about Node.js streams.
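One common way to layer a protocol over a byte stream is to frame each message as a line of JSON. The sketch below assumes that framing, which may differ from the protocol the post builds; `LineDecoder` and `encode` are hypothetical names. It is kept free of Node.js imports so the framing logic stands on its own: in a real program the chunks would arrive from a stream's data events.

```typescript
// Newline-delimited JSON framing (an assumed protocol, for illustration only).
class LineDecoder {
  private buffer = "";

  // Feed one chunk of text; returns every message completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece is an unfinished line (possibly empty); keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines.filter((line) => line.length > 0).map((line) => JSON.parse(line));
  }
}

// JSON.stringify escapes newlines inside strings, so "\n" is a safe delimiter.
function encode(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// Simulate two messages arriving split across arbitrary chunk boundaries.
const decoder = new LineDecoder();
const wire = encode({ cmd: "ping" }) + encode({ cmd: "add", args: [1, 2] });
const first = decoder.push(wire.slice(0, 10)); // no complete line yet
const rest = decoder.push(wire.slice(10)); // both messages complete here
```

The decoder buffers partial input because a stream makes no promise that chunk boundaries line up with message boundaries.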
Continue reading →
Debugging Backbone’s Events
Backbone.js is a client-side MV* framework that provides a simple, but powerful, events API. All Backbone objects (Models, Collections, Views & Routers) trigger events and can listen to events. This architecture is an implementation of the Observer design pattern and using events makes it easier to keep the various parts of your application loosely coupled.
In this post we’ll look at how to set a handler to listen to all of the events on a Backbone object, and why this is useful. Backbone has two main methods for setting up event handlers: .on and .listenTo. For the examples in this post, we’ll be using the .on method. This method works similarly to those in other event libraries, such as jQuery or Node.js’s EventEmitter.
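The catch-all idea can be sketched without Backbone itself. The TypeScript below is a hypothetical, stripped-down emitter (`TinyEvents` is not Backbone's implementation) that mimics one documented Backbone behavior: handlers bound to the special "all" event run for every event and receive the event name as their first argument.

```typescript
type Handler = (...args: unknown[]) => void;

// Hypothetical minimal emitter; only illustrates the "all" catch-all pattern.
class TinyEvents {
  private handlers = new Map<string, Handler[]>();

  on(name: string, handler: Handler): this {
    const list = this.handlers.get(name) ?? [];
    list.push(handler);
    this.handlers.set(name, list);
    return this;
  }

  trigger(name: string, ...args: unknown[]): this {
    // Handlers registered for this specific event.
    for (const h of this.handlers.get(name) ?? []) h(...args);
    // Catch-all handlers get the event name first, then the original arguments.
    if (name !== "all") {
      for (const h of this.handlers.get("all") ?? []) h(name, ...args);
    }
    return this;
  }
}

// Log every event an object fires, which is handy while debugging.
const seen: string[] = [];
const model = new TinyEvents();
model.on("all", (eventName) => {
  seen.push(String(eventName));
});
model.trigger("change:title", "new title");
model.trigger("sync");
```

After the two `trigger` calls, `seen` holds the names of both events in the order they fired.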
Continue reading →
Node.js and Windows Azure: Getting Started
If you’re looking for a cloud platform to build and deploy highly scalable Node.js applications and services, Windows Azure is a good fit. It supports a wide variety of programming languages, frameworks, and development tools, including Node.js.
In this post, we’ll take a look at various options for building and deploying Node applications on Windows Azure. If you’re new to Windows Azure, you can sign up for a free trial to get started. Windows Azure development is a vast topic, and here we’ll just focus on the basics of getting up and running with Node.js. To dive deeper, take a look at Chapter 1: Windows Azure Platform Overview in Windows Azure Platform.
Continue reading →
Running Hadoop MapReduce Jobs on Amazon EMR
With Apache Hadoop MapReduce, users who previously had to store data in a relational database and process it with SQL queries can now work with mammoth volumes of unstructured data. MapReduce has made data analytics simpler and, more importantly, scalable. Hadoop is by far the most widely used implementation of the MapReduce paradigm.
You can get started using the MapReduce framework on your own machine with a local mode installation of Hadoop. Simply unzip, configure, and you’re ready to write your first MapReduce example. Local mode is strictly for beginners who are getting started with Hadoop. You can find more on getting started with Hadoop on your machine on the official page.
Since MapReduce is ideal for processing huge amounts of unstructured data, any serious Hadoop deployment requires a sizeable compute and storage infrastructure. This is where Amazon EMR comes in. EMR, which is short for Elastic MapReduce, is a ready-made web service that offers MapReduce running on Amazon EC2. If you’re familiar with Amazon Web Services (AWS), it will only take you a couple of minutes to get an EMR service up and running. This article will walk you through this process. The best thing about EMR is that you do not have to take care of EC2 instance provisioning, since EMR fires up and shuts down instances on demand. With Amazon EMR, you don’t have to worry about configuring a Hadoop cluster, so you can focus on crunching your big data.
Continue reading →
Parallel Programming Paradigms in Clojure, Part II
This is the second part in the Parallel Programming in Clojure series, and it introduces some advanced topics and functionality for concurrency in Clojure. In Part 1, we gave an overview of concurrency in Clojure and covered some basic topics including Agents, Concurrency Macros, Futures and Java Interoperability. In this Part 2 blog post, we will discuss some mutable types such as Refs and Atoms along with their use cases and examples. We will also shed some light on fork/join parallelism through Reducers, which should further enhance your understanding of the possibilities for using Clojure.
Continue reading →
