Congratulations to six new Apache projects!

In last week’s monthly meeting of the ASF Board of Directors, we approved the creation of six new Top Level Projects (TLPs). This is the largest number of TLPs ever created at a single meeting; the next largest was the November 2008 meeting, where five new TLPs were created (CouchDB, Buildr, the Attic, Qpid, and Abdera).

In this particular case, much of the growth comes from within existing projects: subproject communities within Hadoop and Lucene have matured enough to manage their own fates and to create their own Project Management Committees (PMCs) to take charge. To put this in perspective, it also reflects the ASF’s overall growth; before this meeting we had over 70 TLPs and over 30 Incubator podlings, so an addition of 6 new TLPs is less than 10% growth for the month.

We should congratulate the Apache Traffic Server community first, since they went through the Incubation process and successfully graduated from an Incubator Podling into their own TLP. Soon to be served (once the website migration is complete) from http://trafficserver.apache.org/, Apache Traffic Server is a fast, scalable, and extensible HTTP/1.1 compliant caching proxy server. Congratulations to the whole team on showing a strong and diverse community around this new product.

Next up come three subprojects of the well-known Apache Lucene project, which have grown organically from modules within Lucene into diverse and active projects in their own right. You may recognize some of these product names from the Lucene world.

  • Apache Mahout is building a system for creating scalable and effective machine learning libraries that can perform recommendation mining, clustering, classification, and grouping into itemsets.
  • Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
  • Apache Nutch, which integrates with both Lucene and Hadoop, adds web-specific crawling, fetching, and organization features.

The Apache Hadoop project – another wildly distributed computing technology – has also grown two of its subprojects to the point where they deserve their own fame.

  • Apache Avro is a fast data serialization system that includes rich and dynamic schemas in all its processing.
  • Apache HBase is the Hadoop database – designed to provide random, realtime read/write access to Big Data – billions of records – using commodity hardware.

Why did these subprojects spin out to become their own TLPs? The driving factor is not the technology, but rather the community and oversight aspects of how the ASF organizes its mostly self-running projects.

From the oversight perspective, the ASF Board relies on every project’s PMC to manage their project’s operations within the broad guidelines of the Apache Way, and to report their project’s progress and issues to the board. This means that there must be enough PMC members who can actively monitor and participate in their project’s activities, and can especially show due diligence and responsibility in voting on any official product releases the project makes. With the rapid growth in both community and technology areas in the Hadoop and Lucene projects, it’s a difficult job for the PMCs to truly understand and help manage all the subprojects they’ve created or added over the past two years.

While the scope of oversight may have hinted that some subprojects should be promoted to TLP status, the gating factor is community. Does a subproject have a strong and diverse enough community to provide its own independent PMC that can manage its own affairs? Becoming a TLP is both a benefit and a responsibility: the community, through its new, more focused PMC, can better run itself; however, the new PMC is also expected to provide accurate reports and responsible oversight of its community and product releases.

Congratulations to all six new projects! Please note that as the websites are updated, each project will be moving its home page to http://projectname.apache.org in the near future.

0x12 days until ApacheCon: Meet the MeetUps

Along with the usual array of Special Events at ApacheCon, we’ve had tremendous interest so far in the MeetUps scheduled on Monday and Tuesday evenings between 18:00 and 23:00.

MeetUps are free for attendees, and are focused on individual project communities. In fact, the community is the primary driver behind each MeetUp: they’re doing the organizing, and the conference is happy to be able to provide space in our hotel this year. The only registration for the MeetUps is to sign up on the wiki page. If you have something to present, great, put it on the wiki; if not, just come with your questions about the project. Laptops are expected.

The ApacheMeetupsEu09 wiki page really has all the key details. If you go, tell them that Shane sent you.

  • Wicket Meetup (day TBD depending on signups); they have their own flickr tags.
  • Jackrabbit Meetup (day TBD depending on signups), covering Jackrabbit or anything about JCR; they had a great meetup in 2008 too.
  • Portals Meetup (day TBD depending on signups), including Portals Pluto, Jetspeed, Bridges and WSRP4J and other related projects.
  • Lucene Meetup, which will be Tuesday evening; ask about Lucene, Mahout, Solr, Droids, and more.
  • Maven Meetup (will be held if there are enough signups).

Many, many thanks to Arjé Cahn for his patience with everyone! His tireless drive ensured that the MeetUps got organized and could share our space this year, a win for everyone, MeetUp’ers and ApacheCon attendees alike. Thanks also to all the MeetUp organizers and project committers who are planning on presenting or being there to answer questions. We hope that a number of Amsterdam area locals will come just for the MeetUps and get to see a little bit of what ApacheCon is about.