Healthcare providers, pharmaceutical manufacturers and biotechnology companies are spawning their own health tech start-up ecosystems to solve some of the most complex health problems. Often, this is accomplished through the use of high-performance computing (HPC) and Big Data analytics. Patient-derived data, such as genomic data, can now be compared against very large data sets to identify patterns, matches and other indicators that can inform new treatment plans and, ultimately, lead to better health outcomes.
Recently, Amazon Web Services (AWS) began offering different types of EBS drives. Apart from the magnetic EBS drive, there are now two types of SSD-backed EBS drives: a General Purpose SSD drive and a Provisioned IOPS SSD drive.
Five years ago, Amazon found that every 100 ms of latency cost it 1% of sales. Google discovered that a half-second increase in search latency dropped traffic by 20%.
Amazon recently introduced new types of storage-optimized instances. This new generation of instances is available within the I2 and HI1 families. Both families provide high storage capacity and better I/O performance than other instance families in AWS. Flux7 Labs decided to benchmark these new instances to better understand the tradeoffs our customers face when choosing between them.
The Amazon I2 Instance Type
Amazon has announced immediate availability of the I2 instance type, the next generation of Amazon EC2 High I/O instances and a strong fit for transactional systems and high-performance NoSQL databases such as Cassandra and MongoDB. I2 instances run on Intel Xeon E5-2670 v2 (Ivy Bridge) processors, and each virtual CPU (vCPU) is a hardware hyperthread on one of those processors. Taken together, the I2's features, price and availability support performance-oriented deployments and open up new use cases.
In our previous post here, we detailed why Ganglia is a good tool for monitoring clusters. However, when monitoring a Hadoop cluster you often need more information about CPU, disk, memory, and per-node network statistics than the generic Ganglia config can provide. For those who need more finely tuned monitoring, Hadoop supports a framework for recording internal statistics and then posting them to an external destination, either a file or Ganglia. In fact, Hadoop now supports an implementation of the Metrics2 Framework for Ganglia. In this post we’ll discuss the Hadoop Metrics2 Framework’s design and how it enables Ganglia metrics.
Recently at Flux7 Labs we developed an end-to-end Internet of Things project that collected sensor data and turned it into reports for service-provider end users. Our client then asked us to support multiple service providers for his new business venture. We knew that rearchitecting the application to incorporate major changes would prove to be both time-consuming and expensive for our client. It would also have resulted in a far more complicated, rigid and difficult-to-maintain codebase.
On January 11, Aater and I attended Data Day Texas 2014 here in Austin. Sponsored by Geek Austin, it was such a great event that I thought I’d share some highlights. Data Day Texas holds special significance for Flux7 Labs because it was at Data Day 2013 that we made our first presentation, when Aater gave a talk on the role of microservers in big data, which you can find here.
At Flux7 Labs we solve a variety of problems for our customers, and often that includes guiding clients to the right tools for their needs. In our previous post on NoSQL, we discussed how NoSQL solutions offer a better alternative to RDBMSs. In this post we’ll walk you through different types of NoSQL database models and solutions and show you how different architectures and design philosophies support various features. We’ll explain how NoSQL can serve your needs better than a one-size-fits-all tool like an RDBMS.
As mentioned in part 1 of this series (Creating a LAMP Stack AMI), a common concern among customers is choosing the right instance type.