OSCON 2009

It’s time again for the annual OSCON report.


The conference

Every other OSCON that I’ve been to (since 2003) has been in Portland, and in some ways the two have become synonymous for me. I’m not taking the move to San Jose very well. There are a variety of little things, like the fact that you could end up walking 1/4 of a mile to get from one talk to another, only to reverse the trip for the next session. At the end of Thursday, I bagged going to a talk because I was tired of walking back and forth. I had a bad experience (much worse than usual) with the WiFi connection in the hotel where I was staying, something that I don’t tolerate very well. The fact that the hotel acknowledged the problem and then offered drink vouchers as an apology didn’t help any. I had to ask the checkout agent to remove the charges for the days when I got under 20 kb/s. If you take the view (which I do) that OSCON really starts at 6pm and ends at 3am, then downtown San Jose doesn’t really hold a candle to downtown Portland. My understanding is that OSCON only has a one-year contract for San Jose, so maybe we’ll get something else next year. I hope so.

Another thing about OSCON relates to the attendees themselves. I was unsurprised to hear that attendance was down; the combination of the economy and the move away from Portland could explain some of that. The lunch hall seemed pretty full (and the food was very good for a conference lunch, maybe the best I’ve ever had), and the crowd seemed a decent size to me. What I noticed was something else. Normally when I show up at OSCON, even on the first day of tutorials, it is pretty hard to go very far before I run into someone that I know. This year that was not the case, and I don’t feel that it improved much once the conference proper began. In combination with the move to San Jose, this had a pretty major impact on the value that I got out of the conference.

The talks


This year I wound up all over the map, session-wise. I took in some sessions on tools: the SD distributed bug tracker, and Theo Schlossnagle’s talk on his new monitoring system, Reconnoiter. I also attended Tom Preston-Werner’s talk on GitHub, which ended up being much more about git in general. I was hoping that he would have more to say on the social/community behaviors that they’ve observed on projects at GitHub. There’s not a lot of data on how the use of DVCSs is impacting the social/community dynamics of open source projects, and the folks at GitHub are in a unique position to observe some of this. Maybe next year.

I also continued to gather more information on things related to cloud computing. In this case there was some storage stuff in the form of Neo4j and Cassandra. Adam Jacob’s talk on Chef was well attended despite being in the last session block of the conference, and people stayed well past the ending time for the Q&A. Reconnoiter also falls into the cloud tools space. I attended Kirrily Robert, Yoz Grahame, and Jason Douglas’ talk titled “Forking Encouraged: Folk Programming, Open Source, and Social Software Development”, hoping to glean some insight or data into “fork oriented” open source. That wasn’t really what I got. The talk was fairly philosophical for a while. The most interesting (and surprising) part of the presentation was a brief demonstration of Metaweb’s new Freebaseapps.com, a development environment for Freebase that embodies some of the principles discussed in the philosophical portion of the talk. From my cloud computing oriented point of view, it looks like an “IDE for the cloud”. I need to dig into this a bit more.

One topic which was brand new to me this year was R, a functional language for statistical computing and graphics. I’d been hearing a little bit of buzz about R via Twitter, and I was just invited to join the advisory board of REvolution Computing, a startup that is working to foster the R community and to support those users who want a more commercialized offering of R. Since I didn’t know much about R, I sought out Michael Driscoll’s talk “Open Source Analytics: Visualization and Predictive Modeling of Big Data with the R Programming Language”. Analytics of all kinds are going to be much more important as the amount of data in web applications grows. If you are interested in big data and don’t know about R, that seems like a problem. I know that I am going to rectify my own personal lack of knowledge.

My talk

As in previous years, I gave a talk at the conference. One of the presentations that I’ve done in several places has a large section about the problem of programming concurrent systems, motivated by the arrival of multicore processors. For OSCON, I took that section of the talk and expanded it into a session of its own. Despite two one-hour out-loud run-throughs, I still got the pacing a little bit wrong and had to rush at the end to get all the content in. If I’m not careful, this is going to wind up turning into a three-hour tutorial. I’ve embedded the SlideShare version for those of you who are interested.

Photography

OSCON is a significant event for me photographically, since OSCON 2005 happened days after I got my first digital SLR. It’s also one of the times that I usually see my friend James Duncan Davidson, who has been one of the people that has helped me along my photographic journey.   

This year things were a little different. Regular readers will know that I am getting a little burned out on conference photographs. I’ve been to a lot of conferences and shot a lot of pictures. After a while, they start to look and feel the same. It’s hard for me to concentrate fully on the conference (the talks, the hallway track, and so on) while also concentrating on doing stuff that would be interesting photographically. All of which is a long way of saying, “I shot less. A lot less”.

One other reason that I don’t feel bad about shooting less at OSCON is that Duncan is there. Or normally he is. This year, he was absent because he got the nod to be the main stage photographer for TED Global 2009. Those of you who follow Duncan will know that when he needs a second camera he turns to Pinar Ozger. They’ve been working together for a while, but I’ve never met Pinar in person, because I don’t usually end up at the two camera events, and the one time that she, Duncan, and I were all in the same place, we just never ended up meeting. So this was the year that I got to meet Pinar – we bumped into each other at the OSCON speaker’s party and had a great chat. This was also my first time to really get a sense for her eye. When you second shoot for someone, you try to follow the lead of the main photographer. So I am the most familiar with Pinar’s work when she’s working with Duncan. Since she was the lead this year, I (and everybody else) got to see her eye at work. There are some wonderfully artistic shots in her coverage of the show.

One person that you’ll see in Pinar’s set is photographer Julian Cash, the head of the Human Creativity Project. I first met Julian at ApacheCon in San Diego back in 2005. At the time, I didn’t really know much about photography, and I didn’t really get to see much of what he had done with his light painting portraits. Today, I have a much better appreciation for his light paintings. He did one of me at the MySQL conference earlier this year, and he did a bunch at OSCON too.

The photographic tradition at OSCON is going strong.

Bar Camp Seattle 2009

This past Saturday I hopped over to Seattle for Bar Camp Seattle 2009. I wasn’t able to make it to last year’s event, and since we haven’t had many Bar Camps in Seattle, I wanted to see what was happening. As usual, there was a wall where the schedule was developed as the day went on.


I was happy to see that there were a number of people involved in organizing the event. Those folks were easily distinguishable by their red and yellow propeller beanies.


The Bar Camp / Foo Camp structure for a conference is pretty liberating when compared to the usual pre-planned, eyes forward conference. However, it’s not enough to guarantee a good event. I’ve attended a number of these kinds of events, and in my view, who actually turns up is just as important as how the event is structured. Bar Camp Seattle had a lot of people who were interested in talking about various social media related topics. I have no idea if any of those sessions were any good, because I didn’t end up going to any of them. I ran into and met some developer type people, but not as many as I hoped to. Despite the use of Pathable’s cool registration / attendee matchmaking system, I didn’t find it that easy to make use of the information printed on my badge. I would have loved some clever mixer based on the badge labels or something along those lines.

Brian Rice and I (but really mostly Brian) ran a session for people interested in programming languages. I was pretty happy because the session was very interactive, but the session length of 30 minutes made it hard to get very far in the allotted time.


The best session that I attended was one that was literally and figuratively off the grid. Brian Dorsey put up a session for sitting outside under the nearby bridge. The weather in Seattle was really beautiful last Saturday, so it practically begged us to be outside. There were about 9 or 10 of us who wandered out and sat talking about a wide range of topics. I enjoyed the sense of flowing from topic to topic, shifting naturally with the flow of conversation, the changing roles of the various participants as the topics changed, and so forth.

I think that I am at a crossroads as to the value of these generic, unstructured events. I helped organize the first Seattle Mind Camp, and I am glad to see that there is now a Bar Camp in Seattle as well. At the same time, I’ve frequently left these events feeling unsatisfied. Beforehand I am filled with excitement at the possibility of meeting new people from other tribes / fields and somehow stirring the pot of creative juices. Most of the time, I end up leaving without that stirring having occurred. If I look back over the last four or five years’ worth of conferences that I’ve attended, only a handful of “unstructured” events really stand out. Those events were Foo Camp and the Scala Liftoff, and in both cases, I would say that the particular sets of people involved made a huge difference.

Thoughts on WWDC

Some thoughts on yesterday’s announcements:

MacBook Pros

The laptop refresh was a surprise to me. I wasn’t expecting anything until Intel’s Nehalem-based laptop CPUs and chipsets hit the market in late summer or early fall. The basics of the machines haven’t improved that much, and won’t until that happens. I’m wary of the unibody’s built-in battery – I had to have my MacBook Pro batteries replaced recently, and a built-in battery would make that a lot harder. As a photographer, I like the wider color gamut of the LCD, but I don’t like the glossy finish. I also find the replacement of the ExpressCard slot with an SD card slot odd. It would have been more “Pro” to at least use a CompactFlash slot.

In any case, I’m not in the market for a new laptop, so the minor changes and the nice price reduction don’t mean much to me at the moment.

Snow Leopard

Snow Leopard, on the other hand, is of great interest to me now that my primary box is a Mac Pro. I’m eager to have OS X taking better advantage of all the hardware threads in the box. I was disappointed that there wasn’t more discussion of this in the keynote, but I also understand that having more than 2 cores is still a bit out there. I’m also disappointed that there was no mention of ZFS in either the workstation or server editions of Snow Leopard.

I guess that Snow Leopard is not as ready as many people (including me) thought. It won’t be shipping until September. Apple has taken a very reasonable approach to pricing the upgrade. The biggest issue for me is that I’ve been having problems with 10.5.7. I uninstalled it from the Mac Pro, and my work laptop wigged out on me last week during JavaOne; I am very suspicious that the problems are 10.5.7 related. Jeffrey Zeldman is chronicling his own set of problems with the update. It’s going to be a long time between now and September if Apple doesn’t sort this out.

iPhone 3.0

The iPhone 3.0 stuff was pretty much a rehash of what was previewed back in March. The only surprise was the “Find My iPhone” feature, which really ought to be a standard feature. I’m not sure if I’m going to buy MobileMe just to get this ability. Everybody is going to get an upgrade to this version of the software so there’s nothing but happiness all around.

What’s not so happy is that some of the features will be unavailable because AT&T isn’t ready to support them: MMS and tethering. I’m not really sure that I would actually use MMS; I do most of my picture sharing via Twitter or Facebook. I am pretty sure that I would use tethering, either when riding the ferry or when traveling for work. However, if AT&T adds another $30 a month for the privilege, I probably won’t do it. I can get a Boingo account for $10 a month. True, it won’t work everywhere, but it will work on the ferry and in major airports. Does AT&T really think that we don’t know how to comparison shop?

iPhone 3GS

The iPhone 3GS is a nice upgrade. I’d be happy with the speed, but I’m going to get a speed increase (supposedly) from the iPhone 3.0 software. Faster 3G data would also be nice. The battery life improvements don’t cover the 3G radio usage, which is how I pound my iPhone.   

There are two features which really stand out to me: the compass and the camera.

I travel a lot, and I get mixed up a lot. Having the compass to help decipher directions would really be a help to me. I can think of several occasions in the last 6 months where I could have saved some aggravation if I had known which direction I was pointed in.

The improvements to the camera look really good. Chase Jarvis is calling it the photographer’s iPhone, which is pretty much a no-brainer. There was no mention of reducing the time it takes for the camera to come on, which is one of my biggest gripes with it. Is it really a decisive-moment camera? No way. But it looks like it is a much better camera than what we have now. I could probably justify $199 to upgrade my 16GB iPhone 3G – it’d be a lot cheaper than a camera.

Unfortunately, I’m not going to get to do that, at least not until December 2009, due to the subsidized pricing of the iPhone. Lots of people are complaining about this, but that’s the way that the carriers have always worked. It’s not something new; in fact, it’s a sign that AT&T has a little more pull on Apple than we thought. So I’ll be waiting at least until December. The problem is that if I wait till December, I’m only 6 months away from the next iPhone product launch (if they keep to the current schedule), and as TechCrunch points out, if Apple lets its exclusive contract with AT&T expire in 2010, then you’d actually have carrier choice. That would be a good thing, and since getting onto Verizon’s huge network can only help iPhone sales, I’d bet that the iPhone is on Verizon in 2010. That’s not an impossible thing: Verizon made its first appearance ever at JavaOne this year, a sign that things are starting to change over there. I guess I’m going to wait and see how AT&T treats me between now and then. But they should be painfully aware that people are buying the iPhone, not the carrier.

CommunityOne / JavaOne 2009

This was my second year attending these events as a Sun employee. Everything in software at Sun seems to revolve around these two events. For quite some time before the show, people are working away furiously getting things ready to be unveiled, myself included.

CommunityOne

For me, the big theme at CommunityOne was cloud computing. Sun itself was emphasizing cloud stuff and the latest release of OpenSolaris, 2009.06, which were the main topics of the CommunityOne general session. The Sun Cloud is due out sometime this summer, so much of the cloud part of the session consisted of partners coming up and talking about their experiences working with our cloud. The OpenSolaris team has done a huge amount of work in 2009.06. The feature that stuck out to me the most is “Crossbow”, which is a completely rewritten networking stack. Solaris already had operating system virtualization built into it via the zones feature; Crossbow makes it possible to virtualize networking configurations as well. This means that you could run an instance of OpenSolaris on your laptop (either natively or via VirtualBox, VMware, or whatever) and actually have a virtualized data center configuration running right there. That’s pretty interesting stuff.

I went to several cloud sessions, and I’d have to say that the current state of cloud computing is pretty rough. At least that’s true at the Infrastructure as a Service level where Amazon and the Sun Cloud are. As an example, I went to a good presentation by fellow Sun employees on cloud computing patterns. I happened to be sitting with James Governor and Stephen O’Grady of Redmonk, and I turned to Stephen and said “these patterns are all at a level that I never want to have to worry about”. The patterns themselves were fine, but I personally don’t want to have to deal with things at that level in a cloud platform. There is lots of room for improvement and innovation in this space.

My CommunityOne talk was called “Programming Languages for the Cloud”.

The talk is based on my experience as a language guy who has been asked to work on cloud computing stuff. As such, I’m really trying to raise questions (for which I don’t yet have answers) about places where work on programming languages might usefully intersect with cloud computing. I figured that this would be a niche kind of talk, so I was very surprised to find myself in one of the larger rooms at Moscone, complete with a live video feed. I was even more surprised to see that the room was pretty full. After the presentation, one of the Salesforce.com engineers working on Apex (their domain-specific language for the cloud) came up to the front. We ended up having lunch, and I learned a bunch of interesting stuff about their experience with Apex. This sort of thing is what makes conferences worthwhile.

JavaOne

I spent the first day of JavaOne prepping for my presentation, “Seeding the Cloud”, which was about some ways that tools could help developers who choose to build applications in the cloud.

Ashwin Rao and I had some pretty interesting demos lined up, but we had problems with the internet connection in the room so a number of the demos failed. I learned later that the internet connection for all of Moscone Center had gone out, which made me feel slightly better. As someone commented to me, it was a good illustration of some of the weak points of the cloud (web, really) model.
The demonstration that I really wanted to show was an extension of some work that the Kenai team has done. Kenai is going to have support for doing continuous integration via Hudson, and the machines for doing that can be allocated as cloud instances. This is great if you have a project in Java or some other language that has major build steps. Another use for a dynamically allocated farm of machines is to do web UI testing on browser combinations. Back at PyCon, I put a bug in Adam Christian and Mikeal Rogers’ ears about this. Adam and Mikeal are the primary guys behind the Windmill web UI testing framework. Adam has been working with Hudson author Kohsuke Kawaguchi, and between the two of them they came up with a way for Hudson to start up a bunch of different browsers on different operating systems. If my demo had worked, people would have seen me kick off a Hudson build from inside of NetBeans 6.7, and then we would have watched (via RDC) the various browsers running through some UI tests on a web application. Oh well.
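
(For the mechanics-minded: the “kick off a build” part is just an HTTP request, since Hudson lets you trigger a job remotely by hitting its build URL. A minimal Python sketch; the host, job name, and token are invented for illustration:)

    import urllib.request  # urllib2, in the Python of the day

    # Hudson exposes a remote trigger at /job/<name>/build; everything
    # in this URL is made up for the sake of the example.
    url = "http://hudson.example.com/job/windmill-ui-tests/build?token=SECRET"
    urllib.request.urlopen(url)
    print("build queued")
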
I spent the rest of JavaOne ducking into various language and concurrency talks. Jonas Boner gave a very nice talk comparing some of the concurrency mechanisms that are available on the JVM. Alex Miller gave a talk on Java concurrency gotchas. The net effect of Alex’s talk was to reinforce the fact that we need one or more of the mechanisms that Jonas covered in his talk. Also in the concurrency vein, I stopped in on Philipp Haller and Frank Sommers’ talk on Scala Actors. Probably the most fun concurrency thing was a random conversation with Clojure author Rich Hickey and Jonas Boner in the speaker room.

JavaOne is big on the keynote / general sessions. I only went to two: the opening session and Bob Brewin’s technical keynote. The big news (to me) in Bob’s keynote was Mark Reinhold’s demonstration of a modularized JDK. This is cool for a variety of reasons, like reducing the footprint of the downloads, the ability to build distribution packages trivially, and so forth. But the thing that made me happiest was the news that the CLASSPATH is finally going away, to be replaced by a module-info.java file.

The opening general session was very subdued. There were a variety of partner / sponsor segments, but things were really running at a low energy level until the end, when Scott McNealy took the stage and then introduced Oracle CEO Larry Ellison. Despite Ellison’s reassurances to the Java community, it was a sad moment. I’ve only been at Sun for a little over a year, but my history with the company is pretty long. When I was in grad school, Brown was one of the first large installations of Solaris (replacing SunOS). Like many developers, I’ve used Java over the years. Sun has made a number of very important contributions to the computer industry, and it’s sad to me that a company so full of innovation was unable to remain independent.

Photography

This year things were so busy and frenetic that I really didn’t have much time to pull out the camera. Between presentations and meeting up with Sun people from all over the world, there just wasn’t time. Here are a few from the few times that my camera escaped its bag:
  • Bob Brewin’s Technical Keynote (JavaOne 2009)
  • The Extra Action Marching Band on the CommunityOne Expo Floor (CommunityOne 2009)
  • The CommunityOne Party (six shots, CommunityOne 2009)

CommunityOne and JavaOne

I’m here in San Francisco for CommunityOne and the first 3 days (Tuesday through Thursday) of JavaOne.

Today at CommunityOne I’ll be giving a talk called “Programming Languages for the Cloud”. This is an exploration of areas where programming languages might intersect cloud computing.

Tuesday afternoon at JavaOne, Ashwin Rao and I will be giving a talk called “Seeding the Cloud”, where we’ll be giving a developer-centric view of cloud computing and cloud computing issues. We’ll also have demos of some of the cloud developer stuff we’ve been working on.

If you are around and want to meetup, you can always get a hold of me via Twitter.

On Twitter Data

I’ve been getting various kinds of private communication about this, so it’s probably worth some commentary…

For some time now, I’ve been wondering when someone would start to use systems like Twitter as a way to deliver information between programs. A few weeks ago, Todd Fast, a colleague at Sun, gave me a preview of what is now the Twitter Data proposal. Todd and Jiri Kopsa have done all the heavy lifting on this, so if you have substantive comments or requests, they are really the people you should be dealing with. They were kind enough to recognize me as a reviewer of their work, but the initial idea is theirs.

Twitter Data is a bit different from what I was envisioning. I was thinking more along the lines of jamming JSON or XML data into a Twitter message as a starting point for program-level data exchange. That would allow us to leverage existing tools and libraries and make the entire thing straightforward. The interesting part, then, would be in the distribution network that arose from programs following other programs. This could also be embedded into a person’s Twitter feed by having clients ignore tweet payloads that were structured data.
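
To make that concrete, here is a minimal sketch of what I was envisioning. The payload fields are invented, and a real system would need conventions for versioning and the like:

    import json

    # Pack a small structured payload into the body of a tweet.
    def make_data_tweet(payload):
        body = json.dumps(payload, separators=(",", ":"))  # compact encoding
        if len(body) > 140:
            raise ValueError("payload too big for a single tweet")
        return body

    # A consuming program just tries to parse each tweet; a human-oriented
    # client could use the same test to hide machine-only tweets.
    def maybe_decode(tweet_text):
        try:
            return json.loads(tweet_text)
        except ValueError:
            return None  # an ordinary human tweet

    tweet = make_data_tweet({"lat": 47.62, "lon": -122.51, "battery": 0.83})
    print(tweet, maybe_decode(tweet))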

Twitter Data proposes a way to annotate the data-oriented parts of a regular tweet in order to make it easier for machines to extract the data. Some people think this is a good idea, and some people think it’s a terrible idea. It’s easy to see the arguments on both sides. The pro is that you could turn your tweet stream into a way to deliver information about you to programs, and Twitter Data would make that much easier to do. The cons (that I’ve seen so far) are that people don’t want this kind of data exchange mixed into their Twitter stream, or that parsing the natural language that appears in the 140 characters of a tweet shouldn’t be that hard.
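
For the mixed-in approach, a toy extractor gives the flavor. The “$name value” framing below follows the examples I’ve seen around the proposal, but treat the exact grammar as my assumption rather than the spec:

    import re

    # Toy extractor for inline name/value annotations in a tweet; the
    # dollar-sign framing is my reading of the proposal's examples.
    FRAME = re.compile(r"\$(\w+)\s+(\S+)")

    def extract(tweet_text):
        return dict(FRAME.findall(tweet_text))

    print(extract("Reading the #twitterdata proposal $vote +1 $mood hopeful"))
    # -> {'vote': '+1', 'mood': 'hopeful'}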

So we have two dimensions (at least) to the problem that Twitter Data is trying to address:

  1. Is it a useful thing to have structured or semi-structured information about a person included in their Twitter feed?
  2. If so, should that data be out of band, mixed in, or extracted (natural language processing)?

Independent of the merits of the specific Twitter Data proposal (and I definitely think that there are merits), I think that these two questions are worth some discussion and pondering.

Mac Pro time

For the past three or four years, I’ve been promising myself that I was going to buy myself a Mac Pro. This is mostly a result of digital photography, which makes rapacious demands on computer systems. In the last 9 months or so, it’s also been because I am doing more work using virtualized machine images. In any case, every time Apple had an event, I was telling myself that I was going to buy the machine, but there was always some reason why it never happened. The announcement of the Nehalem-based Mac Pro earlier this year finally pushed me over the edge. And pushing was required. There’s been a lot of benchmarking which casts the performance of these machines in a questionable light when compared with the machines that they replaced. Until a bunch of applications are rewritten to take advantage of the large number of cores in Nehalem-based systems, these boxes are only slightly better than the ones they replace, and a bit more expensive.

I ended up getting an 8-core machine, because those are the machines that can be expanded to an outrageous amount of memory, something which is a necessity for systems doing a lot of Photoshop. Due to the benchmarking controversy, I got the 2.66GHz processors, so that single-threaded programs wouldn’t suffer as much. Here’s a quick rundown of my experience after having the machine for a few weeks.

Hardware

All of my hardware moved over without a hiccup, except for my Logitech Z-5500 speakers. I needed a TOSLINK to TOSLINK cable, a problem that was rectified by a trip to Radio Shack (yes, we have one on Bainbridge Island. It’s not Fry’s, but once a year or so they save my bacon). The machine is much quieter than I expected. The last desktop machine that I owned was a homebuilt Windows box, and that thing was really loud. The Mac Pro is quieter than some of the external FireWire drives that are plugged into it. Heat is a different story. My office is already several degrees warmer than the rest of the house, and now it’s probably another several degrees warmer still. I’m having to be very careful about leaving my office doors open in order for things to cool down. Figuring out how this works in the summer is going to be interesting.

Performance-wise I am pretty happy. Things are definitely snappier than on my Sun-supplied 2.6GHz MacBook Pro. I moved some external disks off of FireWire and into the Mac Pro’s internal SATA drive bays, and I am sure that the change in interface made a big contribution to the improved speed. The machine has 12GB of Other World Computing RAM in it, so it basically doesn’t page unless I am doing something big in Photoshop or have several VirtualBox VMs open at the same time.

There are some things that I miss:

We don’t have TV, so we do a lot of Netflix and other DVDs. This happened mostly on the MacBook Pro via Front Row and the Apple Remote. The Mac Pro doesn’t talk to the Apple Remote, and I miss that. If people have suggestions for controlling Front Row on a Mac Pro, please leave them in the comments.

I got used to having the laptop hooked up to the LCD display, and using the laptop LCD as my “communications display” for IM, IRC, Twitter and so forth. Now I’m back down to a single display and missing it. I’m also missing it in Lightroom.

The Mac Pro came with an Apple keyboard, and the keyboard I was using was a Microsoft Natural Keyboard from 2000, and some of the keys were starting to get hard to push. So I figured that I would try the Apple keyboard. So far I don’t mind it, but keys are in different places, and the new keyboard has 9 years of muscle memory working against it. But that would be true of just about any keyboard.

Software

Any time I get a new machine I update my Macintosh Tips and Tricks page. I definitely have some updates that I could make, and I might make some of them after JavaOne. The rumor mill is suggesting that Mac OS X 10.6 Snow Leopard is going to ship this summer, so I might just wait until that happens, since I expect a lot of things to need updating, rearranging, etc.

I did have a problem when I tried to update the machine to 10.5.7. Things were behaving very oddly, so I restored the machine back to 10.5.6 with Time Machine. Time Machine backups on an internal SATA drive take less time (and make less noise) than on an external FireWire drive. I’m going to give this another try after JavaOne. And for prospective commenters, yes, I repaired permissions and used the Combo Updater.

Photoshop occasionally makes use of the additional cores, but it’s the large amount of RAM that is really making the difference at the moment. The same is true for Lightroom. Perhaps the next editions of these programs, coupled with 10.6, will do a better job of keeping multiple cores busy. In the meantime, my Lightroom to Photoshop batch jobs are definitely running quite a bit faster than before.

On the whole

On the whole, I am happy with the machine, and I expect to be a lot happier when 10.6 ships this summer.

Erlang Factory 2009

I spent Thursday and Friday of last week at the Erlang Factory in San Francisco (although the event was actually in Palo Alto).

Why did I go?

I’ve written about Erlang in this space before. Erlang is having a major influence on other languages, such as Scala on the JVM side and Axum on the CLR side. In addition, every language seems to have several implementations of Erlang-style “actors” (despite the fact that crediting actors to Erlang is historically incorrect). Erlang has been around for a long time and has seen industrial usage in demanding telecom applications. As a dynamically typed functional language with good support for concurrency and distribution, it is (if nothing else) a source of interesting ideas. Earlier this year, my boss asked me to start doing some thinking about cloud computing in addition to the stuff that I was already doing around dynamic languages — another good match for Erlang. This was the first large-scale gathering of Erlang people in the US (at least that I am aware of), so I wanted to drop in and see what was going on, what the community is like, and so on.

Talks

The program at the Erlang Factory was very strong. In many of the session slots, there were 3 excellent talks to choose from. Every single talk that I went to was of very high quality. The conflicts were bad enough that I wasn’t able to explore all the areas that I wanted to. Fortunately, the sessions were videotaped and are supposed to be made available on the web. Also, there was a decent amount of twittering going on, so a Twitter search for #erlangfactory will turn up some useful information.

I attended a number of “experience” talks by companies and individuals. There were experience talks from Facebook, SAP, Orbitz, and Kreditor (the fastest growing company in Sweden). I made it to the Facebook talk and the Kreditor talk. Facebook’s deployment is on the order of 100 machines, which provide the chat facility for Facebook. Erlang is doing all the heavy lifting, and PHP is doing the web UI part. There was a lot of this kind of architecture floating around the conference. It seemed like the most popular combination was Ruby/Erlang, but there was definitely Python and PHP as well. The Kreditor talk was interesting because their site has been running for 3 years with very small amounts of downtime. Unfortunately, their entire deployment is probably less than 10 machines, which blunts the impressiveness of what they have done. Still, it was interesting to hear how they accomplished this using features of Erlang. In addition to the talks, I spoke with many attendees who are using Erlang in their companies. One such person was eBay founder Pierre Omidyar, who is running Ginx, a web-based Twitter client. Pierre is doing the coding and deployment of the site, and was well versed in the Erlang way of doing things. An interesting data point.

The Erlang community (like all communities) has its old guard. These are folks who have worked with Erlang for years, before its recent burst of interest. There were a pair of keynotes by Erlang long-timers Robert Virding (The Erlang Rationale) and Ulf Wiger (Multicore Programming in Erlang). Both of these talks shared a common trait — the speakers were pretty honest about what was good about Erlang, and where there were problems. Given how prone the computing business is to fashion, I found this to be refreshing. Virding talked about the reasons why Erlang is designed the way it is. He accepted the blame for inconsistencies in the libraries, talked about the need to avoid the process dictionary, and agreed that “a char type is probably not wrong”. Wiger’s talk was about why parallelizing code is hard (even with Erlang). He used the example of parallelizing map to demonstrate this, and showed the use of the QuickCheck testing tool to aid in finding parallelism bugs. The Erlang version of QuickCheck was inspired by the Haskell version, and it’s a very, very useful tool. The adaptations for parallelism look very nice. It’s a shame that the Erlang version is commercial software. I don’t begrudge the authors the right to charge money for their software, but I do think that this will hold back adoption of an important tool.
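
Wiger’s parallel map example translates to just about any language. Here is a minimal Python rendering of the idea, with the QuickCheck-style property (a parallel map must agree with the sequential map for a pure function) reduced to a plain assertion:

    from multiprocessing import Pool

    def square(x):  # a pure function, so parallelizing it is safe
        return x * x

    if __name__ == "__main__":
        data = list(range(100))
        pool = Pool(4)                     # four worker processes
        parallel = pool.map(square, data)  # the parallel "pmap"
        pool.close()
        pool.join()
        # The property a QuickCheck-style tool would check on random inputs:
        assert parallel == [square(x) for x in data]
        print("pmap agrees with map for this input")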

There were many talks on what I would describe as “cloud problems”. For example, Ezra Zygmuntowicz’s “You got your Erlang in my Ruby” was really about how he built a self-assembling cluster of Ruby daemons (Nanite); there was also Dave Fayram and Abhay Kumar’s “Building Reliable Distributed Heterogenous Services with Katamari/Fuzed” and Lennart Ohman’s “A service fail over and take-over system for Erlang/OTP”. Like PyCon, there was a lot of interest in eventually consistent databases / key-value stores / non-relational databases. Cliff Moon’s talk on dynomite (a clone of Amazon’s Dynamo system) was particularly encouraging because he was reaching out to other people in the audience (and there were a decent number of them) to try and consolidate all their efforts into a single project. From what I could tell, people seemed receptive to that idea.

CouchDB also fits into that last category of non-relational databases, but it gets its own paragraph. One reason is that I helped mentor the project through the Apache Incubator (and chauffeured those CouchDB committers who were present). Another is that CouchDB creator Damien Katz got a keynote. Third is that there was basically a CouchDB track on the second day of the conference. There was a lot of interest in CouchDB, and a lot of activity as well. I was told that some of the people who took the CouchDB training during the training days had already submitted patches to the project. Damien’s talk was not about the technical details of CouchDB, but about his personal journey to CouchDB, which included selling his house and living off his savings in order to see CouchDB come to life.

Activity has really picked up in the Erlang web framework space. In addition to Erlang Web and Yariv Sadan’s Erlyweb, there is also Rusty Klophaus’ Nitrogen. Nitrogen focuses more on the UI side of the web framework, omitting any kind of data storage. It’s very easy to create an AJAX-based user interface using Nitrogen, and there is nice support for Comet. As part of his presentation, Rusty showed his slides on a Nitrogen-based webcast reflector. You specify the UI using Erlang terms, which then cause HTML/JavaScript/etc. to be generated; that caused a stir in part of the Twitter peanut gallery. I was mostly happy to see people focusing on solving the current generation of problems. My favorite web-space talk was probably Justin Sheehy’s talk on Webmachine. I think that I prefer the description of Webmachine as a REST or HTTP toolkit. Webmachine gives you what you need to implement any HTTP method correctly, and then provides a set of callback functions that can be implemented to customize that processing to do actual work. One of the coolest things about Webmachine is its ability to visually show you the path taken in processing a particular HTTP request, and being able to inspect/dump data at various points in the diagram. It makes for a very nice demo.
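
The callback idea is easy to show outside of Erlang. Here is a tiny Python sketch of the pattern; the callback names mirror Webmachine’s, but this is a cartoon of its decision core, not a port:

    class Resource:
        """Defaults for a few Webmachine-style callbacks; override to customize."""
        def allowed_methods(self, request):
            return ["GET"]

        def resource_exists(self, request):
            return True

        def to_html(self, request):
            return "<html><body>hello</body></html>"

    def handle(resource, request):
        # A tiny slice of the HTTP decision flow; the real toolkit walks
        # the full HTTP decision graph and can show you the path it took.
        if request["method"] not in resource.allowed_methods(request):
            return 405, "Method Not Allowed"
        if not resource.resource_exists(request):
            return 404, "Not Found"
        return 200, resource.to_html(request)

    print(handle(Resource(), {"method": "GET", "path": "/"}))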

There were not that many “language geek” talks. This contrasts with the early years of PyCon (at least for as long as I have attended) where there were quite a number. I missed Robert Virding’s talk on Lisp Flavored Erlang (but I saw some example usage in a CouchDB talk), because it overlapped the dynomite talk. I was able to attend Tony Arcieri’s talk on “Building Languages on Erlang (and an introduction to Reia)”. During the first part of his talk, Tony showed how to construct an Erlang module on the fly in the Erlang shell. He then discussed some tools which are useful to people trying to build languages on top of BEAM, the Erlang virtual machine:

  • Robert Virding has written leex, a lexical analyzer generator.
  • yecc, a Yacc-style parser generator, is included in the Erlang distribution.
  • The erl_syntax_lib library aids in constructing Erlang abstract syntax trees, which can then be compiled to Erlang bytecode.
  • Erlyweb contains the smerl (simple metaprogramming) library for creating and manipulating Erlang modules at runtime.

After that, he launched into a description of Reia. I’m not sure that I agree with some of the choices that he has made, but I am happy to see people experimenting with languages on top of BEAM, and in keeping with Erlang’s process model and the OTP infrastructure. One of the things that Tony mentioned was abandoning indentation-based syntax; he wrote an entire postmortem on that experience in his blog. Python’s indentation-based syntax has won me over and made me a fan, and I am sad to see that indentation syntax, blocks/closures, and expression orientation continue to be at odds.

Coda

It looks like Erlang is starting to find a home. Companies are using it in production. There are books starting to be written about it. Many (not all) of the things which make Erlang seem odd to “mainstream” programmers also appear in languages like Scala, Haskell, and F#. At the same time, Erlang has a long history of industrial deployment, albeit in a single (large) market segment. Many of the problems that we now face in large web systems (and the cloud), such as concurrency, distribution, high availability, and scalability, are strengths for Erlang. Indeed, many of the people that I heard from or talked to basically said that they couldn’t solve their problem with any other technology, or that their solutions were dramatically simpler than ones built on the technologies that they already knew. Will that be enough to propel Erlang into the mainstream? I don’t know. I also don’t know if our current state of mainstreamness is going to last. More and more I’m seeing an attitude of “let’s use the best tool for the job”, not only in languages, but in all parts of (web) applications.

There’s also the issue of the Erlang community itself. Around 120 people showed up for the conference. As I mentioned previously, there are the folks who have been doing Erlang for years. Then there are the relative newcomers, who are web oriented / web savvy, and solving problems in very different domains than the original problem domain of Erlang and its inventors. Thus far, the two segments seem to be getting along fine. I hope that will continue — success or the potential for success has a tendency to bend relationships.

Why I finally believe in hashtags

I’ve been using Twitter for a while now, but I’ve never really used hashtags much. I’ve never been much for doing the stuff it takes to get a highly promoted blog or Twitter stream. I figure that if my content is worthwhile, that should be enough. At PyCon I finally found the compelling hashtag use case for me.

There were a lot of people using hashtags in their PyCon tweets, and Jacob Kaplan-Moss showed me Twitterfall, which made it easy to keep track of uses of the tag. That made it *much* easier to find the virtual Twitter stream for PyCon. This was also true at Lang.NET, the DSL DevCon, and the MySQL conference. This week(end) I’ll be using hashtags to track the progress of JSConf. From now on I’ll always use hashtags when I’m at a conference or event.
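
For the curious, the search side of this is simple enough that you can roll a mini-Twitterfall yourself. A rough Python sketch against the Twitter search API of the day; the endpoint and response fields are as I remember them, so verify before relying on this:

    import json
    import time
    import urllib.request  # urllib2, back in the day

    def poll_hashtag(tag, interval=30):
        # The search.twitter.com JSON endpoint, as it existed at the time.
        url = "http://search.twitter.com/search.json?q=%23" + tag
        seen = set()
        while True:
            with urllib.request.urlopen(url) as resp:
                results = json.load(resp).get("results", [])
            for tweet in reversed(results):  # oldest first
                if tweet["id"] not in seen:
                    seen.add(tweet["id"])
                    print(tweet["from_user"], ":", tweet["text"])
            time.sleep(interval)

    poll_hashtag("jsconf")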

One reason that it’s taken me so long to get the hashtag thing is that I use Twitter primarily via rich desktop (or iPhone) clients, and until recently I wasn’t using clients that could do searching. I had tried TweetDeck, and it never stuck with me. When Nambu came along, I was pretty enthusiastic because it was like a native TweetDeck. Unfortunately, I had crashing problems with it at Lang.NET (since fixed, I think), and I put it aside when I realized that Syrinx 2.0 had searches. While Syrinx doesn’t save searches across restarts, its memory use is tolerable enough to leave it running all the time, so it’s not a big problem, and I am hopeful that MRR will include saved searches in a future version. Commenters: yes, I tried Tweetie for Mac, and I didn’t like it. I love Tweetie for iPhone, though. Go figure.

MySQL Conference 2009

I spent most of this week at the MySQL Conference. I was giving a talk on Python and MySQL, which came about as a favor to some folks in the marketing department at Sun. This was a fair exchange, because I’ve been curious about the MySQL community. MySQL is at the other end of the open source spectrum from the ASF, so I wanted to see for myself what it was like. The MySQL conference is the MySQL community’s equivalent of ApacheCon. There is a mix of talks, some aimed at users of MySQL, and others aimed at developers of MySQL or related products.

There is a sizeable ecosystem around MySQL. There are extension patches from Google and Percona, which were mentioned in many of the talks that I was in. There’s MariaDB, Monty’s community-oriented fork of MySQL. There’s the Drizzle project, which looks really interesting. There’s lots going on, and I got the feeling that there’s lots of innovation happening in various parts of the ecosystem. It feels energetic and fun, and what I would expect of a big open source community, despite it being a long way from Apache or Python.

I attended all kinds of talks. I went to a number of talks about analyzing performance and monitoring, including 3 talks on DTrace. Sadly, these talks were sparsely attended, which is a symptom of some of the problems that Solaris/OpenSolaris has been having. What was interesting was that all of these talks were given by former MySQL employees, and all of them were genuinely enthusiastic about DTrace. The best of these talks was Domas Mituzas’ Deep-inspecting MySQL with DTrace, where he showed some very cool MySQL specific DTrace scripts. If DTrace got ported to Linux as a result of the Oracle/Sun acquisition, that would be a good outcome for the world.

I also went to several cloud computing talks, where the topic was how to run MySQL in the cloud. These were pretty interesting, because it turns out that there is a bunch of stuff that you need to do and be aware of when running the current versions of MySQL in a cloud environment. I hope that the Drizzle folks are aware of some of these issues and are able to solve some of these problems so that running in the cloud is pretty simple.

Here are my 3 favorite talks:

  • Don MacAskill’s The SmugMug Tale – I’m a photo guy, but not a SmugMug customer. Don’s been tweeting his experiences using Amazon Web Services to build SmugMug, and he’s been blogging his experiences with ZFS, the Sun Storage 7000, and so forth. I’ve been following his stuff for a while, so this was mostly a chance to see an in-person rendering of an online personality.
  • One talk that I didn’t expect to enjoy was Mark Madden’s Using Open Source BI in the Real World. I’m not really a Business Intelligence guy per se, but the world of blogging and twittering and so forth starts to make you attuned to the usefulness of various kinds of analytics. Anyone building any kind of non-trivial web software needs analytics capabilities, so having open source solutions for this is good. It probably also didn’t hurt that I talked to several BI vendors on the expo floor the night before. What I really enjoyed about the talk was the beginning sections on how to be an analyst and how to think about and project the future. I’m given to a bit of that now and then, so I found this part of the talk pretty interesting.
  • The best talk that I went to was Yoshinori Matsunobu’s Mastering the Art of Indexing. The speaker pretty much covered all the kinds of indexing in MySQL, and which indexes work best under which conditions (both for selecting and inserting; there were some interesting surprises for insert), and even tested the differences between hard disks and solid state drives. Maybe I loved this talk because it brought back all the research that I did in query optimization back in graduate school. But that wouldn’t explain all the other people in the room, which was standing room only. (A quick sketch of checking index usage from Python follows this list.)
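
Since my own talk was about using MySQL from Python, here’s the kind of two-minute check that Yoshinori’s talk sent me home wanting to run. The table and credentials are invented for illustration:

    import MySQLdb  # the DB-API driver; any MySQL driver works the same way

    # Invented credentials and schema, purely for illustration.
    conn = MySQLdb.connect(host="localhost", user="test",
                           passwd="test", db="test")
    cur = conn.cursor()

    # Ask the optimizer whether this query can use an index, or whether
    # it has to fall back to a full table scan.
    cur.execute("EXPLAIN SELECT * FROM users WHERE last_name = %s",
                ("Matsunobu",))
    for row in cur.fetchall():
        print(row)  # watch the 'key' and 'rows' columns of the plan
    cur.close()
    conn.close()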

Based on what I saw this week, I’m not in any way worried about the future of MySQL.