What is Apache Mahout? Website Brand Review

Website Brand Review of Apache Mahout

While we’ve all heard about Apache Hadoop, did you know there are over a dozen big data projects at Apache? We host projects that provide everything for your big data stack: databases, storage, streaming, logging, analysis, machine learning, and more. Apache Mahout is one of the pieces that puts a big data stack to work doing higher-level tasks for you.

Here’s my quick review of the Apache Mahout project, told purely from the point of view of a new user finding the project website.

Happy Birthday! This month is the Apache Mahout project’s 6th #ApacheBirthday!

What Is Apache Mahout?

“The Apache Mahout™ project’s goal is to build an environment for quickly creating scalable performant machine learning applications.”

While this is a laudable statement – and nicely emphasises the community behind the project – it doesn’t directly say what the software they provide does.

“The three major components of Mahout are an environment for building scalable algorithms, many new Scala + Spark and H2O (Apache Flink in progress) algorithms, and Mahout’s mature Hadoop MapReduce algorithms.”

Who’s Who at Apache: Roles and Responsibilities

I’ve turned this post into several pages on the ASF’s official website, including a graphical org chart, and a detailed description of all corporate officers at Apache. See the improved versions there!


There’s a huge amount of volunteer energy that flows around Apache’s Annual Member Meeting every year. Old members and new alike come together and brainstorm all sorts of new ideas, both organizational and technical – and we have plenty of online… discussions, let us say. There is an amazing amount of energy from a lot of very smart people, and when we focus this energy, we make real improvements to the Foundation and sometimes to some of our projects.

As we’ve grown, keeping a full shared understanding of all the details of membership and corporate operations has become much harder.  We have some documentation, but we also still have a lot of tribal knowledge and decisions hidden in our mailing list archives.  To understand the same things, we need to be able to see what rules or policies we’ve actually decided on – or at least written down.

So here is an overview of all the different roles that people can have, either with the ASF as a Foundation or with specific Apache projects. In particular, I’m focusing on the specific agreements we make with individuals, or the explicitly posted policies that we expect people to abide by. For more information on how Apache works, see /dev, /governance, and Community.

Shane’s Apache Director Position Statement, 2016

The ASF is holding its annual Members’ Meeting this week to elect a new board and a number of new Members to the ASF. I’m honored to have been nominated to stand for the board election, and I’m continuing my tradition of publicly posting my vision for Apache each year.

We are lucky to have both a large, involved membership and another excellent slate of candidates, including a couple of great new faces. No matter how Apache STeVe ends up computing the results, Apache will have a great board for the year to come.

Please read on for my take on what’s important for the ASF’s future…

What is Apache Hive? Website Branding Review

Website Brand Review of Apache Hive

While we’ve all heard about Apache Hadoop®, did you know there are over a dozen big data projects at Apache? We host projects that provide all the different functions for your big data stack: databases, storage, streaming, logging, analysis, and more. Apache Hive™ is one of these pieces of the whole big data ecosystem.

Here’s my quick review of the Apache Hive project, told purely from the point of view of a new user finding the project website.

What Is Apache Hive?

“The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage”.

What is Apache Flex? Website Branding Review

Website Brand Review of Apache Flex

Many projects come to Apache from software vendors donating them to the Apache community, where the Apache Incubator works to form an open and independent community around the project. Here, Adobe donated both the code and the brand for their Flex project to Apache. Now, the ASF is both the steward of the vibrant Apache Flex community and the new owner of the Flex brand and registered trademark.

Here’s my quick review of the Apache Flex project, told purely from the point of view of a new user finding the project website. While we’re all familiar with the Adobe Flash browser plugin, not everyone may be familiar with the Flex environment for building Flash (and other!) applications.

What Is Apache Flex?

Apache Flex® is the open-source framework for building expressive web and mobile applications.

In other words, Flex is a toolkit for building general applications that can be run on a variety of web browsers and mobile platforms that include the Adobe Flash or Adobe AIR runtimes or application containers. Flex is the coding language and environment you use to write applications for the Flash/AIR containers.

No, Really, What Is Apache Flex For?

What is Apache HBase? Website Branding Review

Website Brand Review of Apache HBase

How do open source projects get popular? By providing some useful functionality that users want to have. How do open source projects thrive over the long term? By turning those users into contributors who then help improve and maintain the project. How well a project showcases itself on the web is an important part of the adoption and growth cycle.

Here’s my quick review of the Apache HBase project, told purely from the point of view of a new user finding the project website. HBase is a key part of the big data storage stack, so although you may not work directly with it, it’s probably underlying some systems you use.

What Is Apache HBase?

“Apache HBase™ is the Hadoop® database, a distributed, scalable, big data store”.

What Is Apache Mesos? Website Branding Review

Website Brand Review of Apache Mesos

How do open source projects get popular? By providing some useful functionality that users want to have. How do open source projects thrive over the long term? By turning those users into contributors who then help improve and maintain the project. How well a project showcases itself on the web is an important part of the adoption and growth cycle.

Here’s my quick review of the Apache Mesos project, told purely from the point of view of a new user finding the project website. Mesos is turning into a major project in the big data and cloud space; perhaps not yet as obviously popular as Apache Spark, but certainly big.

What Is Apache Mesos?

Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

Apache CMS: Adding static data tables easily?

Did you know that the ASF has its own CMS / static generator / magic update system that runs the apache.org homepage and many Apache project homepages? While it’s more of an Apache infra tool than a full Apache top level project, it’s still a full service solution for allowing multiple static website builds that are integrated into our servers.

While there are plenty of great technical CMS systems, when choosing a system for your company, many of the questions are organizational and deployment related. How easy is it for your IT team to manage the core system? How easy is it for various teams (or projects) to store and update their own content, perhaps using different templates in the system? How can you support anonymous editing/patch submission from non-committers? Does it support a safe and processor-respectful static workflow, minimizing the load on production servers while maximizing backups? And how can you do all this with a permissive license, and only hosting your own work?

What Is Apache Spark? Website Branding Review

Volunteering at the ASF and elsewhere in open source, I think a lot about open source brands. In particular: how do various open source projects – run by a wide variety of typically very geeky volunteers – present themselves publicly to new users? We spend so much time working on the great new code – and explaining it to other developers we already know – that sometimes I wonder if we’re really showcasing what our great new code can do for new users and contributors.

Here’s my quick review of the Apache Spark project, told purely from the point of view of a new user who just came to the project website. I’m trying to show what I think someone new to the project might think about the project once they get to the homepage. Since Spark is a major project in the big data space, there are a lot of search hits for Spark, including a wide variety of other software vendors.

ApacheCon Big Data/Core News Wrapup

Our annual Apache:Big Data and ApacheCon:Core events were held recently at the lovely Corinthia Hotel Budapest, and the content and attendees were amazing. The weather was great too, and sightseeing and shopping in Budapest were lovely. Attendance was still good despite other software conferences competing for the same dates and the refugee crisis unfolding in the region.

While they were booked as separate events, many people stayed for the whole week. Going forward, we will likely have a single event, but be even clearer about the strength of the content on specific track days. The broad array of very deep and well-received technical content in the big data space was truly impressive; Apache has over a dozen big data related projects and probably 20 more incoming Incubator podlings, so we certainly have the space covered!
