The world’s first gigabyte hard drive was the size of a refrigerator -- and that wasn’t all that long ago. Clearly, technology has evolved, and so have our data storage and analysis needs. With data serving a key role in helping companies unearth intelligence that can provide a competitive advantage, solutions that allow organizations to end data silos and help create actionable business outcomes from intelligent data analysis are gaining traction.
Keeping in line with the principles of a Well-Architected Review (WAR), we are constantly challenged by our customers to help evolve their requirements into repeatable, automated patterns deployed in their AWS environment, using the latest AWS has to offer in its growing list of managed services. In this case, a research wing of a global industrial firm wanted a solution to replace its current VPN and bastion host setup, with access control topping the list of requirements. The answer: AWS Client VPN.
According to research by Aberdeen, business data is growing exponentially, with the average company seeing its data volume grow at a rate exceeding 50% per year. A customer of ours who provides specialized healthcare services approached us for help with its data growth challenges. With the volume and complexity of its data expanding rapidly, the firm sought to implement a data lake solution in the AWS cloud.
We recently had the opportunity to work with a popular quick-service restaurant (QSR) that reached out asking if Flux7 could help speed its developer outcomes for faster time to market. For this global enterprise, the goal manifested itself in a project where Flux7 helped the QSR create one-click automated installations of various products, including Amazon Redshift, through AWS Service Catalog. This made development more efficient and productive through automation that minimized process overhead. Today we'd like to share the story of this AWS case study project.
We recently had the opportunity to work with a privately held clinical research organization that was interested in updating the systems its internal team of research scientists uses for data analysis. The organization wanted to move to the AWS cloud because the team's large data-related demands had outgrown its on-premises system, and it needed the benefits of a highly secure, elastic, high-performance computing environment.
In 2018, the number of connected IoT devices is projected to grow to over 23 billion, according to Statista. And these devices will create a volume of 400 zettabytes of data by the end of the year, as reported by Datastax. With such an explosion in device-driven data, it’s important to have a strategy for maximizing the business impact of IoT big data, transforming it into actionable intelligence.
Today we are delighted to be recognized as having achieved AWS Service Delivery Partner status for Amazon Aurora. As you can see from the news release we issued, the AWS Service Delivery Program is designed to highlight AWS Consulting Partners who have a track record of delivering verified customer success for specific Amazon Web Services (AWS) products.
We have been working closely with a customer who is undergoing a business transformation. As a multimedia equipment manufacturer, the organization has a loyal following for its high-quality devices. However, like many companies facing the convergence of markets and new customer demands, the company has embarked on a metamorphosis. Traditionally very focused on hardware, the company had largely ignored its software, even though it offered customers real value. Part of the company's transformation was a move to treat its software as a full-fledged offering rather than a free supplement. An upcoming product release marked the first (and biggest) step in cementing this change in company direction.
There are many reasons an organization might choose Amazon Aurora over a standard Amazon Relational Database Service (Amazon RDS) engine. Superior performance, greater scalability, and the ability to restart without losing cache are just a few. However, for organizations already running an important application or website on top of the RDS managed service, migrating to Aurora can be a challenge, despite the latter's obvious benefits. After all, you can't just take down a service that customers expect to access 24/7.
This month’s re:Invent in Las Vegas drew over 32,000 attendees, and the show did not disappoint as AWS kept with precedent by unveiling a number of new features and products. With AWS news peppered throughout two days of lengthy keynote sessions, we asked Ali Hussain, Flux7 co-founder and CTO, to weigh in on what caught his attention and where he thinks the most impact will be felt by enterprise organizations like those Flux7 serves.
AWS Case Studies: DevOps
A Fortune 500 manufacturer was using Hadoop, internal data centers, Rackspace and CenturyLink to facilitate services that connected its customers with data insights using an Internet of Things model. The overarching goal: to facilitate continuous data-driven improvement within its customers’ operations. To help achieve this goal and overcome its Hadoop scaling issues, the company engaged Flux7, a DevOps consulting group and AWS partner. Additionally, the manufacturer sought a global solution that would comply with EU data privacy laws.
Flux7 engineer Ahsan Ali and CTO Ali Hussain collaborated on this post
The rise of IoT has created a new generation of needs in the world of big data processing. We now need to handle data ingress from many sensors around the world and make real-time decisions to be executed by those devices. As such, it is no surprise that we see new services for processing streaming data, such as Amazon Kinesis.
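As a toy illustration of the kind of real-time decision logic such a streaming pipeline feeds (the window size, threshold, and sample values below are hypothetical, not from any specific deployment), consider a sliding-window average over a stream of sensor readings:

```python
from collections import deque

def alert_stream(readings, window=3, threshold=30.0):
    """Yield True for each reading whose trailing moving average
    (over up to `window` readings) exceeds `threshold`."""
    buf = deque(maxlen=window)  # keeps only the most recent `window` values
    for value in readings:
        buf.append(value)
        yield sum(buf) / len(buf) > threshold

# Hypothetical temperature readings arriving from a sensor:
temps = [25, 28, 31, 35, 38]
print(list(alert_stream(temps)))  # -> [False, False, False, True, True]
```

In a Kinesis-based architecture this kind of logic would live in a stream consumer; the sketch above just shows the windowed decision step in isolation.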
Last week, Amazon Web Services announced the availability of larger and faster Elastic Block Store (EBS) volumes, something we’ve been looking forward to since the original announcement at re:Invent 2014. AWS continues to add rich features to its platform, and it can be difficult to stay on top of them and understand which new capabilities will impact an individual business, and how.
Healthcare providers, pharmaceutical manufacturers and biotechnology companies are spawning their own health tech start-up ecosystems to solve some of the most complex health problems. Often, this is accomplished through the use of high performance computing (HPC) and Big Data analytics. Patient-derived data, such as genomics, can now be compared against very large data sets to identify patterns, matches and other indicators that can inform new treatment plans and, ultimately, better health outcomes.
Recently, Amazon Web Services (AWS) began offering different types of EBS volumes. Apart from the magnetic EBS volume, there are now two types of Amazon EBS SSD volumes: a General Purpose SSD volume and a Provisioned IOPS SSD volume.
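To get a feel for the difference, here is a minimal sketch of how General Purpose (gp2) baseline performance scales with volume size, following AWS's published formula of 3 IOPS per GiB with a 100-IOPS floor. The cap has been raised over time, so treat the default values as illustrative rather than current:

```python
def gp2_baseline_iops(size_gib, per_gib=3, floor=100, cap=10000):
    """Approximate baseline IOPS for a General Purpose (gp2) EBS volume.

    Follows AWS's published gp2 formula (3 IOPS/GiB, 100-IOPS floor);
    the cap value has changed over time, so it is illustrative here.
    """
    return min(max(floor, per_gib * size_gib), cap)

print(gp2_baseline_iops(8))     # small volumes share the 100-IOPS floor -> 100
print(gp2_baseline_iops(500))   # larger volumes scale at 3 IOPS/GiB -> 1500
print(gp2_baseline_iops(5000))  # very large volumes hit the cap -> 10000
```

Provisioned IOPS (io1) volumes, by contrast, let you set the IOPS figure directly rather than deriving it from size.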
Five years ago, Amazon found that every 100ms of latency cost it 1% of sales. Google discovered that a half-second increase in search latency dropped traffic by 20%.
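A quick back-of-the-envelope calculation shows why those numbers matter. The revenue figure below is hypothetical; only the 1%-per-100ms rate comes from the statistic cited above, and the linear extrapolation is a simplifying assumption:

```python
def latency_cost(annual_revenue, added_latency_ms, pct_lost_per_100ms=0.01):
    """Estimate revenue lost to added latency, assuming a linear
    1%-of-sales penalty per 100 ms (the figure cited above)."""
    return annual_revenue * pct_lost_per_100ms * (added_latency_ms / 100)

# A hypothetical retailer doing $500M/year that adds 200 ms of page latency:
print(latency_cost(500_000_000, 200))  # -> 10000000.0 (i.e., $10M/year)
```

Even small per-request delays compound into material losses at scale, which is why latency benchmarking matters.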
Amazon recently introduced new types of storage-optimized instances, available in the I2 and HI1 families. All provide high storage density and better I/O performance compared to other instance families in AWS. Flux7 Labs decided to benchmark these new instances to better understand the tradeoffs between them that our customers face.
The Amazon I2 Instance Type
Amazon has announced immediate availability of the I2 instance type, the next generation of Amazon EC2 High I/O instances and a strong fit for transactional systems and high-performance NoSQL databases such as Cassandra and MongoDB. I2 instances feature the latest generation of Intel Ivy Bridge processors, the Intel Xeon E5-2670 v2; each virtual CPU (vCPU) is a hardware hyperthread of that processor. Taken together, the instance type’s features, pricing and availability enable performance-oriented deployments and open up new use cases.
In our previous post here, we detailed why Ganglia is a good tool for monitoring clusters. However, when monitoring a Hadoop cluster you often need more information about CPU, disk, memory, and per-node network statistics than the generic Ganglia config can provide. For those who need more finely tuned monitoring, Hadoop supports a framework for recording internal statistics and posting them to an external sink, either a file or Ganglia. In fact, Hadoop now ships a Ganglia implementation of the Metrics2 framework. In this post we’ll discuss the Hadoop Metrics2 framework’s design and how it enables Ganglia metrics.
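As a preview of where this wiring happens, the Metrics2 framework is configured through `hadoop-metrics2.properties`. A minimal sketch for pointing NameNode and DataNode metrics at a Ganglia 3.1+ collector might look like the following (the `gmond.example.com` host is a placeholder for your own gmond endpoint):

```properties
# Load the Ganglia 3.1+ sink for all metrics prefixes
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
# Push metrics every 10 seconds
*.sink.ganglia.period=10
# Send each daemon's metrics to the gmond collector (hostname is a placeholder)
namenode.sink.ganglia.servers=gmond.example.com:8649
datanode.sink.ganglia.servers=gmond.example.com:8649
```

We’ll walk through what each of these prefixes and sinks means as we discuss the framework’s design below.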
Recently at Flux7 Labs we developed an end-to-end Internet of Things project that received sensor data to provide reports to service-provider end users. Our client asked us to support multiple service providers for its new business venture. We knew that rearchitecting the application to incorporate major changes would prove both time-consuming and expensive for our client. It also would have required a far more complicated, rigid and difficult-to-maintain codebase.
On January 11, Aater and I attended Data Day Texas 2014 here in Austin. Sponsored by Geek Austin, it was such a great event that I thought I’d share some highlights. Data Day Texas holds special significance for Flux7 Labs because it was at Data Day 2013 that we made our first presentation, when Aater gave a talk on the role of microservers in big data, which you can find here.
At Flux7 Labs we solve a variety of problems for our customers and often that includes guiding clients to the right tools for their needs. In our previous post on NoSQL, we discussed how NoSQL solutions offer a better alternative to RDBMSs. In this post we’ll walk you through different types of NoSQL database models and solutions and show you how different architectures and design philosophies support various features. We’ll explain how NoSQL can be a tool that better serves your needs than a one-size-fits-all tool like an RDBMS.
As mentioned in part 1 of this series (Creating a LAMP Stack AMI), a common concern for most customers is choosing the right instance type.
Gone are the days when an RDBMS (Relational Database Management System) was the appropriate solution for every database need. Why? Let’s discuss two scenarios in this post.
Big companies, including Amazon, Google, Facebook and Yahoo, first adopted NoSQL for in-house solutions because RDBMSs lacked feature support for their ever-changing needs. By providing weak consistency and optimizing for certain use cases, they are able to use large distributed systems to handle their required workloads. There are five ways in which NoSQL handles large workloads differently from traditional RDBMSs, and in which it outperforms them.
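One of those tradeoffs, tunable (weak) consistency, can be sketched with the classic quorum rule used by Dynamo-style stores such as Cassandra: with N replicas, a read quorum R and write quorum W are guaranteed to overlap on at least one up-to-date replica only when R + W > N. This is a minimal sketch of the rule, not any particular database's implementation:

```python
def is_strongly_consistent(n_replicas, r, w):
    """Dynamo-style quorum rule: reads of R replicas and writes of W
    replicas are guaranteed to overlap (strong consistency) iff R + W > N."""
    return r + w > n_replicas

# N=3 with quorum reads and writes overlaps -> reads see the latest write.
print(is_strongly_consistent(3, 2, 2))  # -> True
# N=3 with single-replica reads and writes trades consistency for latency.
print(is_strongly_consistent(3, 1, 1))  # -> False
```

Choosing R and W below the quorum threshold is exactly the "weak consistency" lever that lets these systems favor availability and latency at scale.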
Cassandra is a go-to choice for data-driven organizations that run real-time Big Data operations at the core of their business. What makes it so popular with developers and organizations managing huge databases is the set of features it offers for working with stored data.
"Big Data" is a term that has been buzzing around for the last few years, and when you hear this buzz, you'll hear "Hadoop" as well. In the last 2-3 years, many big players in the industry have come out with their own distributions of Apache Hadoop, be it Intel, Microsoft, IBM or EMC. Also, some startups that focus solely on Hadoop, such as Cloudera and Hortonworks, have become big players in this area.