Tag Archives: functional programming

Strange Loop 2012

I think that the most ringing endorsement that I can give Strange Loop is that it has been a very long time since I experienced so much agony when trying to pick which talks to go to during any given block.

Emerging Languages Camp

This year Strange Loop hosted the Emerging Languages Camp (ELC), which previously had been hosted at OSCON. I liked the fact that it was its own event, not yet another track in the OSCON panoply. That, coupled with a very PLT oriented audience this year, made Strange Loop a much better match for ELC than OSCON.

I definitely went into ELC interested in a particular set of talks. There is a lot of buzz around big data and the problems of data management more generally, and I did my graduate work on implementing “database programming languages”, so there was academic interest to go along with the practical necessity. There were three talks that fell into that bucket: Bandicoot: code reuse for the relational model, The Reemergence of Datalog, and Julia: A Fast Dynamic Language for Technical Computing.

I found Bandicoot a little disappointing. I think that the mid-90s work of Buneman’s group at UPenn on Structural Recursion as a Query Language and Comprehension Syntax would be a better basis for a modular, reusable system for programming with relations.

Logic programming may be making a resurgence via the work on core.logic in Clojure and the influence of Datalog on Cascalog, Datomic, and Bloom. The Reemergence of Datalog was a tutorial on Datalog for those who had never seen it before, as well as a survey of Datalog usage in those modern-day systems.

Julia is a language that sits in the same conceptual space as R, SAS, SPSS, and so forth. The problem with most of those systems is that they were designed by statisticians rather than programmers. So while they are great for statistical analysis, they are less good for statistical programming. Julia aims to improve on this, while adding support for distributed computation and a very high-performance implementation. There’s no decisive winner in the technical computing space, and it seems like Julia might have a chance to shine.

There were, of course, some other interesting language talks at ELC.   

Dave Herman from Mozilla talked about Rust for the first time (at least to a large group). Rust is being developed as a systems programming language. There are some interesting ideas in it, particularly a very Erlang-like concurrency model. At the same time, there were some scary things. Part of what Rust is trying to do is achieve performance, and part of how that happens is via explicit specification of memory/variable lifetimes. Syntactically this is accomplished via punctuation prefixes, and I found myself wondering whether the code was going to look very Perl-ish. Servo is a browser engine being written in Rust, and looking at the source code of a real application will help me see whether my Perl-ishness concern is valid.

Elixir: Modern Programming for the Erlang VM looks like a very nice way to program atop BEAM (the Erlang VM). Eliminating the Prolog-inspired syntax goes a long way, and it appears that Elixir also addresses some of the issues around using strings in Erlang. It wasn’t clear to me that all of the string issues have been addressed, but I was definitely impressed with what I saw.

Strange Loop Talks and Unsessions

I’m going to cover these by themes. I’m not sure these are the actual themes of the conference, but they are the themes that emerged from the talks that I went to.

First, and unsurprisingly, a data theme. The opening keynote, In Memory Databases: the Future is Now!, was by Mike Stonebraker. It’s been a long time since I saw Stonebraker speak – I think that the last time was when I was in graduate school. He was basically making the case that transaction processing (TP) is not going away, and that there might be applications for a new generation of TP systems in some of the places where the various NoSQL systems are now being used. Based on that hypothesis/assumption, he then went on to describe the trends in modern systems and how they would lead to a different design, much of which is embodied in VoltDB. This was a very controversial talk, at least for some people. I considered the trend/system analysis part to be reasonable in a TP setting. I’m not sure that I agree with his views on the applicability of TP, but I’m fairly sure that time will sort all of that out. I think that this is an important point for the NoSQL folks to keep in mind. When the original work on RDBMSs was done, it was mocked and called impractical, not useful, and so forth. It took many years of research and technology development before relational systems became dominant. I think that we should expect to see something similar with NoSQL, although I have no idea how long that timeline will be.

Nathan Marz’s talk, Runaway Complexity in Big Data… and a plan to stop it, made the case for, and explained, the hybrid batch/realtime architecture that he pioneered at BackType and which is now in production at Twitter. That same architecture led to Cascalog and Storm, which are pretty interesting systems. Marz is working on a book with Manning that will go into the details of his approach.

The other interesting data talks revolved around Datomic. Unfortunately, I was unable to attend Rich Hickey’s The Database as a Value, so I didn’t get to hear him speak directly about Datomic. There are several Datomic-related videos floating around, so I’ll be catching up on those. I was able to attend the evening unsession Datomic Q&A / Hackfest. This session was at 9pm and was standing room only. I didn’t have quite enough background on Datomic to follow all of what was said, but I was very interested in what I saw: the time model, the immutability of data (which leads to interesting scalability properties), and the use of Datalog. I’m definitely going to be looking into it some more. The one thing that troubles me is that it is not open source. I have no problem with a paid, supported version, but it’s hard to make the argument for proprietary system or infrastructure software nowadays.
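
To give a flavor of the Datalog angle, here is a hypothetical sketch of a Datomic query; the attributes are made up for illustration, and the details are from memory rather than from the session, so treat it as an approximation.

```clojure
;; Hypothetical sketch of a Datalog-style Datomic query. The :person/*
;; attributes are invented for illustration; details may be off.
(require '[datomic.api :as d])

(defn adults
  "Names of all entities whose :person/age is 18 or more."
  [db]
  (d/q '[:find ?name
         :where
         [?e :person/name ?name]
         [?e :person/age ?age]
         [(>= ?age 18)]]
       db))

;; Usage, assuming conn is a connection to a database with these attributes:
;; (adults (d/db conn))
```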

Another theme, which carried over from ELC, was logic programming. I had already heard Friedman and Byrd speak at last fall’s Clojure/conj, and I was curious to see where they have taken miniKanren since then. In their talk, Relational Programming in miniKanren, they demonstrated some of what they showed previously, and then they ran out of material. So, on the fly, they decided to implement a type inferencer for simple lambda terms live on stage. Not only were they able to finish it, but since it was a logic program, they were also able to run it in reverse, which was pretty impressive. I was hoping that they might have some additional work on constraints to talk about, but other than disequality constraints, they didn’t discuss anything new. Afterwards on Twitter, Alex Payne pointed out that there are some usability issues with miniKanren’s APIs. I think that this is true, but it’s also true that miniKanren is a research system. You might look at something like Clojure’s core.logic for a system that’s being implemented for practitioners.
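
For readers who have never seen a relational program run in reverse, here is a tiny core.logic sketch of my own (not the type inferencer from the talk) that shows the same trick on a much smaller scale.

```clojure
;; A toy illustration of "running a relation in reverse" using core.logic,
;; Clojure's port of miniKanren. Not the type inferencer from the talk.
(require '[clojure.core.logic :refer [run run* membero]])

;; Forward: which values are members of [1 2 3]?
(run* [q] (membero q [1 2 3]))
;;=> (1 2 3)

;; Reverse: which lists contain 1? Ask for the first three answers
;; (the _0, _1, ... are fresh "don't care" variables).
(run 3 [q] (membero 1 q))
;;=> ((1 . _0) (_0 1 . _1) (_0 _1 1 . _2))
```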

David Nolen did an unsession Core Logic: A Tutorial Reconstruction where he walked the audience through the operation of core.logic, and by extension, miniKanren, since the two systems are closely related. He pointed out that he read parts of “The Reasoned Schemer” 8 times until he understood it enough to implement it, and then he found that he didn’t really understand it until after the implementation was done. There was also a large crowd in this session, and Christopher Petrelli made a video recording on his phone, since InfoQ wasn’t recording the unsessions.

The final talk in the logic programming theme was Oleg Kiselyov’s Guess lazily! Making a program guess and guess well. Kiselyov has been around for a long time and has written or coauthored many important papers related to Scheme and continuations. I’ve been following his work (off and on) for a long time, but this was the first time I was at a conference where he was speaking, and I was shocked to find that the room was packed. His talk was about how to defer making the inevitable choices required by non-determinism, especially in the context of logic and type systems. His examples were in OCaml, which I had some trouble following, but after Friedman and Byrd the day before, he apparently felt compelled to write a type inferencer that could be run backwards as well. His code was a bit longer than the miniKanren version.

The next theme is what I’d call effective use of functional programming. The first talk was Stuart Sierra’s Functional Design Patterns. This was a very worthwhile talk, which I won’t attempt to summarize since the slides are available. Needless to say, he found a number of examples that could be called design patterns. This was one of the talks where I need to sit down and look at the patterns and think on them for a while. That’s hard to do during the talk (and the conference, really). Some things require pondering, and this is one of them.

The other talk in this category was Graph: composable production systems in Clojure, which described the Prismatic team’s approach to composing systems in Clojure. What they have is an abstraction that allows them to declaratively specify how the parts of the system are connected. For a while it just looked to me like a way to encode a dataflow graph in a Clojure abstraction. The aha moment was when he showed how they use Clojure metadata to annotate the arguments, or pipe connectors if you will. The graphs can be compiled in a variety of ways, including into Clojure lazy maps, which presents some interesting possibilities. Unfortunately, I had to leave halfway through the talk, so I missed the examples of how they apply this abstraction in their system.
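
To make the idea concrete, here is my own toy reconstruction of the pattern. This is not Prismatic’s actual Graph API; it is just meant to show what it means to describe a computation as a map of keyword-named steps with declared dependencies.

```clojure
;; A toy version of the "graph" idea (NOT Prismatic's API): a system is a map
;; from output keys to functions, and each function's metadata declares the
;; keys it depends on.
(def stats-graph
  {:n    (with-meta (fn [m] (count (:xs m)))               {:deps [:xs]})
   :mean (with-meta (fn [m] (/ (reduce + (:xs m)) (:n m))) {:deps [:xs :n]})
   :var  (with-meta (fn [m] (let [d (map #(- % (:mean m)) (:xs m))]
                              (/ (reduce + (map * d d)) (:n m))))
                    {:deps [:xs :mean :n]})})

(defn run-graph
  "Naive evaluator: repeatedly compute any node whose dependencies are ready."
  [graph input]
  (loop [m input]
    (if-let [[k f] (first (filter (fn [[k f]]
                                    (and (not (contains? m k))
                                         (every? #(contains? m %)
                                                 (:deps (meta f)))))
                                  graph))]
      (recur (assoc m k (f m)))
      m)))

;; (run-graph stats-graph {:xs [1 2 3 6]})
;;=> {:xs [1 2 3 6], :n 4, :mean 3, :var 7/2}
```

A real implementation would compile the declaration instead of interpreting it, which is presumably where the lazy-map (and other) compilation targets mentioned above come in.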

Theme number four was programming environments. I hesitate to use the term IDE, because it connotes a class of tools that is loved by some and reviled by others, and when you throw that term around, it seems to limit people’s imagination. I contributed to the Kickstarter for Light Table, so I definitely wanted to attend Chris Granger’s talk Behind the Mirror: The birth of Light Table. Chris gave a philosophical preamble before showing off the current version of Light Table. He demonstrated adding support for Git in a short amount of code, and went on to demonstrate a mode for developing games. He said that they are planning to release version 1 sometime in May, and that Light Table will be open source. I also learned that Kickstarter money is counted as revenue, so they have lost a significant amount of the donations to taxes, which is part of the reason that Kodowa participated in Y Combinator and is trying to raise some money to get a bigger team.

Not long after the Light Table Kickstarter, this video by Bret Victor made the rounds. It went really well with all the buzz about Light Table, and Alex Miller, the organizer of Strange Loop, went out and persuaded Bret to come and talk. Bret’s title was Taking off the Blindfold, and I found this to be a very well motivated talk. In it, Bret discussed the kinds of properties that our programming tools should have. The talk was very philosophical despite the appearance of a number of toy demos of environment features.

During both of these talks there was a lot of chatter. Some of it harked back to the Smalltalk (but sadly, not the Lisp Machine) environments, while some questioned the value of a more visual style of tools (those emacs and vi graybeards). When I first got into computers I read a book called “Interactive Programming Environments”, and ever since then I’ve wished for better tools. I am glad to see some experimentation coming back into this space.

Some old friends are busy making hay in the Node.js and Javascript communities, and it probably horrifies them that I have ClojureScript as a theme, but so be it. I went to two ClojureScript talks. One was David Nolen’s ClojureScript: Better Semantics at Low Prices!, which was really a state of the union for ClojureScript. The second was Kevin Lynagh’s Building visual data driven UI’s with ClojureScript. Visualization is becoming more and more important, and ClojureScript’s C2 library looks really appealing.

It’s fitting that the last theme should be Javascript. Well, maybe. I went to two Javascript talks, and both of them were keynotes, so I didn’t actually choose them. But Javascript is so important these days that it really is a theme. In fact, it’s so much of a theme that I’ve been going to Javascript conferences for the last two years. It’s been several years since I saw Lars Bak speak. His talk on Pushing the Limits of Web Browsers was in two parts. Or so I think. I arrived just as he was finishing the first part, which seemed like an account of the major things that the V8 team has learned during their amazing journey of speeding up Javascript. The second part of his talk was about Dart. I didn’t know that Bak was the lead of the Dart project, but that doesn’t change how I feel about Dart. I see the language, I understand the rationale, and I just can’t get excited about it.

I’ve been to enough Javascript conferences to have heard Brendan Eich talk about The State of Javascript before. Brendan opened by giving a brief history of how Javascript got to be the way it is, and then launched into a list of the improvements coming in EcmaScript 6 (ES6). That was all well and good, and towards the end, after the ES6 material, he threw in some items that were new, like the sweet.js hygienic macro project and the lljs typed JavaScript project. It seemed like this was a good update for this audience, who seemed unaware of all the goings-on over in JavaScript land. From a PLT point of view, I guess that’s understandable, but at the same time, JavaScript is too important to ignore.

Final Thoughts

Strange Loop has grown to over 1000 people, much larger than when I attended in 2010 (I had to miss 2011). I think that Alex Miller is doing a great job of running the conference and of finding interesting and timely speakers. This was definitely the best conference that I attended this year, and probably the best of the last 2-3 years as well.

If you’re looking for more information on what happened at Strange Loop 2012:

Slides: https://github.com/strangeloop/strangeloop2012/tree/master/slides

Other Strange Loop Coverage: https://github.com/strangeloop/strangeloop2012/wiki/Coverage

Clojure Conj 2011

Last week I was in Raleigh, attending the second Clojure/Conj. The last time that I attended a Lisp conference was the 1986 ACM Conference on Lisp and Functional Programming. I am a Lisp guy. I took the famed “Structure and Interpretation of Computer Programs” course from Sussman and Abelson. I spent some time doing undergraduate research on Symbolics Lisp Machines. When Apple invested some energy into Dylan, I hoped that I’d be able to use a Lisp on a personal computer. Java pretty much ruined that. Over the years, I pretty much gave up on the idea of being able to use Lisp for my day to day work. So much so, that when I first heard Rich Hickey talk about Clojure, my reaction going in was unenthusiastic. By the end of Rich’s talk, he had my attention. Clojure has been doing some growing up since then, and I really wanted to attend last year’s Clojure/Conj, but wasn’t able to.

Almost all of my conversations at the conference involved the questions “Why are you at Clojure/Conj?” and “How did you get interested in Clojure?”. I’ve answered the second question in the previous paragraph. The answer to “why” boils down to three themes: Clojure itself, Data, and ClojureScript. I’m going to use these three themes to report on the conference talks.

Clojure itself
Clojure is a Lisp dialect that runs on the JVM and has great interoperability with existing Java code. It has great support for functional programming, as well as several innovative features for dealing with concurrency.
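
For readers who haven’t seen Clojure, two tiny examples of what the Java interop and the concurrency support look like:

```clojure
;; Java interop: call instance and static methods directly.
(.toUpperCase "clojure")        ;=> "CLOJURE"
(java.util.UUID/randomUUID)     ; static method call

;; Concurrency: an atom holds shared state that is updated by applying a
;; pure function to the current value; no explicit locking is needed.
(def counter (atom 0))
(let [tasks (doall (for [_ (range 10)] (future (swap! counter inc))))]
  (doseq [t tasks] @t))         ; wait for all the futures to finish
@counter                        ;=> 10
```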

Stuart Sierra started off with a talk that pointed out areas where people could learn beyond the books and online exercises that are available. In each of those areas, he also proposed projects that people could work on. One of the things that stood out for me was his use of the Clojure reader to deal with Java Resources. I always found Resources to be annoying, and the use of the Reader is a clever way to make them more palatable and useful.
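
I don’t have Stuart’s exact code, but the basic trick looks something like the sketch below: hand a classpath resource to the Clojure reader and get data back, instead of fiddling with raw streams. The config.clj file name is just an assumption for the example.

```clojure
;; A sketch of the idea (not Stuart's exact code): read a classpath resource
;; with the Clojure reader so it comes back as Clojure data.
(require '[clojure.java.io :as io])

(defn read-resource
  "Read a single Clojure form from a named classpath resource."
  [resource-name]
  (with-open [r (java.io.PushbackReader. (io/reader (io/resource resource-name)))]
    (read r)))

;; Assuming a resource config.clj containing a Clojure map is on the classpath:
;; (read-resource "config.clj")  ;=> {:host "localhost", :port 8080}
```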

Clojail is a system for executing Clojure code in a sandbox. The system is quite flexible and the applications aren’t just limited to security. I can imagine using clojail to implement something like the Sponsors described in the original Actor model. Anthony Grimes, one of the committers for clojail gave the presentation. He is 17 years old.

One thing that made me happy was to see the bridge building between the Scala and Clojure communities. Phil Bagwell, who pioneered many of the persistent data structures used in Clojure, is now at Typesafe, the Scala company. He came and gave a nice talk about Scala’s parallel collection classes; perhaps those classes will one day find their way into Clojure. Daniel Spiewak gave a very solid presentation on the computer science behind the persistent data structures in Clojure.
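
The practical upshot of persistent data structures is easy to see at the REPL: “updating” a collection returns a new one that shares most of its structure with the original, and the original is untouched.

```clojure
;; Persistent vectors: assoc returns a new vector sharing almost all of its
;; internal tree with the original, which remains unchanged.
(def v1 (vec (range 1000)))
(def v2 (assoc v1 500 :changed))

(nth v1 500)        ;=> 500        (v1 is untouched)
(nth v2 500)        ;=> :changed
(identical? v1 v2)  ;=> false, but most internal nodes are shared
```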

At many conferences, a talk like Clojure on Android would be at the higher end of the technical scale. The technical level of the talks at the Conj was high enough to make the task of getting Clojure onto Android seem mundane, which takes nothing away from the very impressive work that has been done. There are some issues remaining, like footprint and startup time, but it looks like some effort is going to happen at the Clojure core team level to address them. The thought of talking to a REPL running on a phone or tablet is a tasty one.

Rich Hickey’s keynote reminded me very much of a Guido keynote at PyCon: a discussion of language issues that he was looking at, and a solicitation for discussion. Rich was very careful to say that the stuff he was discussing was not a roadmap, so I’ll repeat that disclaimer. Here are some of the items that stood out to me. There are plans to allow multiple builds of Clojure: a regular version, a leaner deployment version, a really lean Android version, a super deluxe development/debugging version, and so on. There is discussion about allowing the reader to be extensible, in order to allow new data types to be round-tripped. I didn’t follow the history of ClojureScript, so it was useful to see that Rich is pretty committed to the idea, and that bits of technology might even flow “backward” from the ClojureScript compiler into Clojure on the JVM. I was also very interested in Rich’s view that a logic system like the one in core.logic would be a far better tool than a traditional type system. More on the logic system below.

The last talk of the conference was Sam Aaron’s talk on Overtone, which is a computer music system written in Clojure. The major point was that he used Clojure to define a language for describing computer music, much in the same way that sheet music describes regular music. There was lots of cool music along the way, including a pretty good simulation of the sound portion of the THX commercial that often plays before movies. The description of that commercial fit in a single projected screen of code.
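
For the curious, the Overtone hello world looks roughly like this (written from memory, so names and details may be off): a unit generator wrapped in an instrument definition, played from the REPL.

```clojure
;; Roughly what minimal Overtone code looks like, from memory; exact names
;; may differ. definst builds an instrument from unit generators.
(use 'overtone.live)            ; boots the SuperCollider synthesis server

(definst beep [freq 440]
  (sin-osc freq))

(beep)      ; play a 440 Hz sine wave
(beep 880)  ; an octave up
(stop)      ; silence everything
```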

Data

One thing that I’ve been looking at recently is exploratory environments for working with “federated” data. I’ve grown to dislike the term Big Data, because it’s come to mean almost nothing; however, that ship has already sailed. Most people are familiar with the idea of sitting down at their relational database’s SQL command prompt and issuing ad-hoc queries. As the use of varied kinds of storage systems grows, we are losing that kind of interactive relationship with data. Some of the people in the Clojure community have built some interesting data systems, and Clojure is itself amenable to exploratory work with data, between its orientation around functional programming and a development style oriented around a REPL.

David McNeil talked about Revelytix’s federated (across RDBMSs and RDF triple stores) SPARQL query engine. Their system uses s-expressions to represent the nodes in a graph of stream-processing nodes. These expressions are then compiled down to a form that can be executed in parallel using the Java Fork/Join framework. The operators in the s-expressions mirror built-in Clojure sequence functions, and can use and be used in Clojure expressions. It’s not hard to imagine extending the set of federatable storage systems.
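
Here is my own toy illustration of that general idea (not Revelytix’s code): a query plan written as an s-expression whose operators mirror Clojure’s sequence functions, plus a naive single-threaded evaluator. Their system compiles such expressions for parallel execution; this sketch just interprets them.

```clojure
;; A toy s-expression query plan whose operators mirror Clojure sequence
;; functions, with a naive interpreter. Purely illustrative.
(def plan
  '(take 5
     (filter (fn [row] (> (:age row) 30))
       (map (fn [row] (select-keys row [:name :age]))
         (scan :people)))))

(defn evaluate [expr tables]
  (if (seq? expr)
    (let [[op & args] expr]
      (case op
        scan   (get tables (first args))
        map    (map (eval (first args)) (evaluate (second args) tables))
        filter (filter (eval (first args)) (evaluate (second args) tables))
        take   (take (first args) (evaluate (second args) tables))))
    expr))

;; (evaluate plan {:people [{:id 1 :name "Ann" :age 42}
;;                          {:id 2 :name "Bo"  :age 7}]})
;;=> ({:name "Ann", :age 42})
```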

Heroku’s Mark McGranaghan talked about viewing log data. What he really meant was treating log data as akin to a native data type in Clojure, so that Clojure’s built-in functions can be used on it in a natural way. Heroku has built a system called Pulse which takes this view. I particularly liked the small functions that he defined for expressing the intervals at which statistics get recomputed. It’s the cleanest formulation of that kind of thing that I’ve seen, and it’s enabled by that logs-as-data thesis and by Clojure.
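
My own sketch of what the logs-as-data idea looks like in practice (this is not Pulse itself): once each log line is parsed into a Clojure map, ordinary sequence functions become the query language.

```clojure
;; Not Pulse itself, just the logs-as-data idea: log lines as Clojure maps,
;; queried with ordinary sequence functions.
(def log
  [{:at "router" :status 200 :service 12}
   {:at "router" :status 500 :service 340}
   {:at "worker" :status nil :service 87}])

;; Requests per status code:
(frequencies (keep :status log))
;;=> {200 1, 500 1}

;; Mean service time on the router:
(let [times (map :service (filter #(= "router" (:at %)) log))]
  (/ (reduce + times) (count times)))
;;=> 176
```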

Nathan Marz has been doing some great work at BackType and now Twitter. At Strange Loop he open sourced Storm, a set of general primitives for doing realtime computation. At the Conj he talked about Cascalog, which is a Clojure DSL for Hadoop. Both Cascalog and Storm are in use at Twitter. Cascalog is inspired by Datalog and targets the same space as Pig. Cascalog has the full power of Clojure available to it, as well as the power of Datalog. It’s a little unclear to me exactly how much of Datalog is supported, but this is a powerful idea. Imagine combining the best of Cascalog and the Revelytix system. The source code for Marz’s examples is on GitHub.
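
As I remember it from the Cascalog examples, a query looks roughly like the sketch below; the `age` generator is assumed to already exist (e.g. a tap over data in HDFS), and the details may be off.

```clojure
;; Roughly what a Cascalog query looks like, from memory of its published
;; examples; names and details may be off. `age` is assumed to be an existing
;; generator of [person age] tuples.
(use 'cascalog.api)

;; "Find every person younger than 30." The vector names the output logic
;; variables; each following form is a predicate over logic variables,
;; which is where the Datalog flavor comes from.
(?<- (stdout)
     [?person]
     (age ?person ?age)
     (< ?age 30))
```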

Clojure has a logic programming library, core.logic, which is based on the miniKanren system developed at Indiana University by Daniel Friedman, William Byrd, and Oleg Kiselyov. Ambrose Bonnaire-Sergeant presented an excellent tutorial on logic programming in general, and miniKanren in particular. David Nolen talked about predicate dispatch, a much more general way of doing method dispatch, and about his plans to tie that together with core.logic. The surprise highlight in this area was that Dan Friedman and William Byrd came to the conference and did a BOF on miniKanren and their constraint extensions to it. The BOF was surprisingly well attended (over 60 people), due in part to Ambrose’s excellent talk earlier that day. A key philosophical point about miniKanren is that there is a straightforward, mechanical conversion from a functional program to a logical/relational (miniKanren) program. This looks very promising, and it has me thinking about mashups of miniKanren (core.logic) and Datalog (Cascalog).

Professor Friedman and his students have done some very important work in the Scheme area over the years, and it was a great experience to meet him and spend some time over dinner. After dinner, we were sitting in the hotel lobby, and David Nolen was walking Friedman and Byrd through the implementation of core.logic, which was ported from the Scheme version of miniKanren and then optimized for Clojure. There was a free flow of ideas back and forth, and it was a great example of collaboration between academia and practice (it’s hard to say industry, because Nolen and company are doing this in their free time). This is one of the things that I’ve always hoped for around open source, and it was nice to see such a concrete example. MiniKanren is described in Byrd’s PhD dissertation and in the book “The Reasoned Schemer”.
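
To give a feel for the functional-to-relational conversion mentioned above, here is my own small sketch (not code from the BOF): an ordinary append function and the corresponding core.logic relation.

```clojure
;; My own sketch of the functional-to-relational conversion, not code from
;; the BOF.
(ns append.demo
  (:refer-clojure :exclude [==])
  (:require [clojure.core.logic :refer [run* fresh conde conso ==]]))

;; Ordinary functional append:
(defn append [xs ys]
  (if (empty? xs)
    ys
    (cons (first xs) (append (rest xs) ys))))

;; The relational version: the return value becomes an extra argument, and
;; each branch of the function becomes a clause of a conde.
(defn appendo* [xs ys out]
  (conde
    [(== xs '()) (== ys out)]           ; empty xs: out is just ys
    [(fresh [h t rest-out]
       (conso h t xs)                   ; xs  = (cons h t)
       (conso h rest-out out)           ; out = (cons h rest-out)
       (appendo* t ys rest-out))]))     ; recurse on the tail

;; Because it is a relation, it also runs "backwards", enumerating every way
;; to split a list (output shown roughly as core.logic prints it):
;; (run* [q] (fresh [x y] (appendo* x y [1 2 3]) (== q [x y])))
;;=> ([() (1 2 3)] [(1) (2 3)] [(1 2) (3)] [(1 2 3) ()])
```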

ClojureScript

ClojureScript is a Clojure compiler that emits Javascript, which is then run through Google’s Closure compiler. I’ve been doing some prototyping work using Node.js and HTML/Javascript, so ClojureScript looks kind of interesting, particularly because it is good at some of the data-intensive stuff that Javascript makes so laborious. There were three ClojureScript sessions. Chris Houser took us on a deep dive into the compiler, Kevin Lynagh showed us some basic applications of ClojureScript in the browser, and David Nolen did a BOF where he showed off the browser-connected ClojureScript REPL. ClojureScript is still in its infancy, but it’s interesting nonetheless. Once David gets the constraint version of core.logic working in ClojureScript, it should get a lot more interesting.

Community

The thing that stood out to me about the Clojure community was the presence of the “young Jedi”, Anthony Grimes and Ambrose Bonnaire-Sergeant. Both of them were able to attend their first Clojure/Conj (Anthony’s was last year) due to a fundraising campaign initiated by Chas Emerick. Anthony is 17, and Ambrose has not yet graduated from college. Both of them are lead developers on highly technical projects within the Clojure community, and both did a great job of speaking in front of 300+ people, most of whom were older than them. When I worked at OSAF, I worked with Stuart Parmenter, who started working in open source when he was 14. It’s great to work with these young, very gifted people, and I love seeing the community welcome them and make a space for them.

The flip side of this is that like many open source, programming language oriented conferences, there were very few women in attendance. Perhaps the Clojure community could take a page from the very successful work that my friend Sarah Allen has done on RailsBridge.

Learning More

O’Reilly has finally recanted and is doing a Lisp book. Clojure Programming should be done soon, and Manning has Clojure in Action and The Joy of Clojure. If you are looking for an interactive way of learning Clojure, there is Try Clojure. Those looking to sharpen their Clojure skills can look at the Clojure Koans and 4Clojure.

The speaker slides from the Clojure/Conj 2011 are available on GitHub.

Update: corrected the name of Indiana University – thanks to Lindsey Kuper

Update: linked to a more up to date Overtone repository – thanks to Sam Aaron

Haskell Workshop and CUFP 2010

It has been many years since I attended an ACM conference, and even more years since I attended the Lisp and Functional Programming Conference, which has evolved into the International Conference on Functional Programming (ICFP). ICFP was in the United States this year, and I’ve wanted to drop in for quite some time. There are many ideas pioneered by the functional programming community, and as much as possible I like to go to the original sources of ideas. ICFP is a long conference with many attached events, and it turns out that the best use of my time was to drop in for the Haskell workshop at the tail end of the conference, and the Commercial Users of Functional Programming (CUFP) conference.

Haskell Workshop

I’ve been around long enough to remember when Haskell first came out, and despite my stint as a database programming languages grad student, I’ve never had the chance to give Haskell the attention that I feel it deserves. Twenty years after its appearance, Haskell is still barely on the radar. At the same time, I heard some very interesting talks at the workshop: things like the Hoopl library for implementing dataflow optimizations in compilers, and the Orc DSL for concurrent scripting. The Haskell systems hackers have made great progress and are doing some great work. Bryan O’Sullivan described his work on improving GHC’s ability to handle lots of long-lived open network connections. Given the recent burst of interest in event-based programming models, such as Node.js, this is an interesting result. Simon Marlow presented a redesign of the Evaluation Strategies mechanism that GHC uses to control parallelism. Many of the talks that I heard have ideas that are applicable to problems that exist in modern systems. I just wish that I could see a path that involved using Haskell itself to solve those problems, instead of the ideas migrating into another language/system.

Surgecon

Unbeknownst to me, my friend Theo Schlossnagle ran Surge, a conference on scalability, in Baltimore, and it overlapped the parts of ICFP that I attended. Surge seems to have flown pretty low under the radar: Google doesn’t return many relevant results for it, and the best information (other than talking to Surge attendees) that I’ve been able to find is on Lanyrd. Theo told me that he was counting on this year’s attendees to be his PR for next year. I didn’t attend, but based on the tweets and dinner conversations, it sounds like it was great. I had dinner/beers with some Apache folks who were in town for Surge, as well as some Surge attendees like Bryan Cantrill. The “systems guys” gave me a good ribbing about being at a conference for “irrelevant languages”, and I had a really good conversation with Bryan about Node.js, cloud computing, and the Oracle acquisition (ok, that part wasn’t so good). Node.js is on a lot of people’s minds at the moment, and it was good to hear Bryan’s perspective on it. It was an interesting sidebar to the immersion in functional programming. I do think that in the medium term there are some interesting connections between Node and FP, but that’s probably an entire post of its own.

CUFP

There was a lot of F# related content at CUFP, and I think that Microsoft deserves kudos for the work that they are doing. I think it’s pretty clear that shipping F# in the box with Visual Studio 2010 is not a huge money maker for Microsoft at this point, and I’m impressed with their willingness to take a long term view of the future of programming. Unfortunately I’m not a Windows ecosystem person, so as attractive as F# and Visual Studio are, I doubt that I’ll be playing with this anytime soon.

Marius Eriksen’s talk on Scala at Twitter was interesting because of the way that he described the conceptualization of Rockdove operations as folds, taking clear advantage of the benefits offered by a functional style. He also had some thought-provoking comments about giving applications access to the behavior of the garbage collector. There are some interesting possibilities if you start to give developers control of the behavior of various parts of the runtime system.

Michael Fogus talked about his company’s experience using Scala. His talk was pretty entertaining, and there were some interesting comparisons between Scala features that they thought would be useful and Scala features that actually turned out to be useful. My only issue with his talk was the size of the sample, which isn’t something that he could do anything about. This was also true of the talk by the Intel compiler folks.

I’ve seen a number of talks on the Microsoft Reactive Extensions, mostly with respect to JavaScript. I continue to believe that RxJS could be a great help to Javascript programmers, particularly as things like Node.js take hold. Matt Podwysocki’s Node.js file server example shows how.

Warren Harris from Metaweb talked about his use of monads, arrows, and OCaml to build a more efficient query processor for Freebase’s MQL query language. This was a really interesting talk, because query optimization was the topic of my graduate school research, and at the time the connections between query languages and functional programming were a relatively new topic.

Final thoughts

It doesn’t take much to fan the flames of functional love in me. There are lots of smart people working on beautiful and interesting solutions. I wish that I could see a better path for those ideas to make it into mainstream practice.

Erlang Factory 2009

I spent Thursday and Friday of last week at the Erlang Factory in San Francisco (although the event was actually in Palo Alto).

Why did I go?

I’ve written about Erlang in this space before. Erlang is having a major influence on other languages, such as Scala on the JVM side and Axum on the CLR side. In addition, every language seems to have several implementations of Erlang-style “actors” (despite the fact that crediting actors to Erlang is historically incorrect). Erlang has been around for a long time, and has seen industrial usage in demanding telecom applications. As a dynamically typed functional language with good support for concurrency and distribution, it is (if nothing else) a source of interesting ideas. Earlier this year, my boss asked me to start doing some thinking about cloud computing in addition to the stuff that I was already doing around dynamic languages — another good match for Erlang. This was the first large scale gathering of Erlang people in the US (at least that I am aware of), so I wanted to drop in and see what was going on, what the community is like, and so on.

Talks

The program at the Erlang Factory was very strong. In many of the session slots, there were three excellent talks to choose from. Every single talk that I went to was of very high quality. The choices were hard enough that I wasn’t able to explore all the areas that I wanted to. Fortunately, the sessions were videotaped and are supposed to be made available on the web. Also, there was a decent amount of twittering going on, so a Twitter search for #erlangfactory will turn up some useful information.

I attended a number of “experience” talks by companies / individuals. There were experience talks from Facebook, SAP, Orbitz, and Kreditor (the fastest growing company in Sweden). I made it to the Facebook talk and the Kreditor talk. Facebook’s usage/deployment is on the order of 100 machines, which provide the chat facility for Facebook. Erlang is doing all the heavy lifting, and PHP is doing the web UI part. There was a lot of this kind of architecture floating around the conference. It seemed like the most popular combination was Ruby/Erlang, but there was definitely Python and PHP as well. The Kreditor talk was interesting because their site has been running for 3 years with very small amounts of downtime. Unfortunately, their entire deployment is probably less than 10 machines, so that blunts the impressiveness of what they have done. Still it was interesting to hear how they accomplished this using features of Erlang. In addition to the talks, I spoke with many attendees who are using Erlang in their companies. One such person was eBay founder Pierre Omidyar, who is running Ginx, a web based Twitter client. Pierre is doing the coding and deployment of the site, and was well versed in the Erlang way of doing things. An interesting data point.

The Erlang community (like all communities) has its old guard: folks who have worked with Erlang for years, before its recent burst of interest. There were a pair of keynotes by Erlang long-timers Robert Virding (The Erlang Rationale) and Ulf Wiger (Multicore Programming in Erlang). Both of these talks shared a common trait: the speakers were pretty honest about what was good about Erlang and where there were problems. Given how prone the computing business is to fashion, I found this refreshing. Virding talked about the reasons why Erlang is designed the way it is. He accepted the blame for inconsistencies in the libraries, talked about the need to avoid the process dictionary, and agreed that “a char type is probably not wrong”. Wiger’s talk was about why parallelizing code is hard (even with Erlang). He used the example of parallelizing map to demonstrate this, and showed the use of the QuickCheck testing tool to aid in finding parallelism bugs. The Erlang version of QuickCheck was inspired by the Haskell QuickCheck, and it’s a very, very useful tool. The adaptations for parallelism look very nice. It’s a shame that the Erlang version is commercial software. I don’t begrudge the authors the right to charge money for their software, but I do think that this will hold back adoption of this important tool.

There were many talks on what I would describe as “cloud problems”. For example, Ezra Zygmuntowicz’s “You got your Erlang in my Ruby” was really about how he built a self-assembling cluster of Ruby daemons (Nanite), Dave Fayram and Abhay Kumar presented “Building Reliable Distributed Heterogeneous Services with Katamari/Fuzed”, and Lennart Ohman presented “A service fail over and take-over system for Erlang/OTP”. Like PyCon, there was a lot of interest in eventually consistent databases/key-value stores/non-relational databases. Cliff Moon’s talk on dynomite (a clone of Amazon’s Dynamo system) was particularly encouraging, because he was reaching out to other people in the audience (and there were a decent number of them) to try and consolidate all their efforts into a single project. From what I could tell, people seemed receptive to that idea.

CouchDB also fits into that last category of non-relational databases, but it gets its own paragraph. One reason is that I helped mentor the project through the Apache Incubator (and chauffeured those CouchDB committers who were present). Another is that CouchDB creator Damien Katz got a keynote. Third is that there was basically a CouchDB track on the second day of the conference. There was a lot of interest in CouchDB, and a lot of activity as well. I was told that some of the people who took the CouchDB training during the training days had already submitted patches to the project. Damien’s talk was not about the technical details of CouchDB, but about his personal journey to CouchDB, which included selling his house and living off his savings in order to see CouchDB come to life.

Activity has really picked up in the Erlang web framework space. In addition to Erlang Web and Yariv Sadan’s Erlyweb, there is also Rusty Klophaus’ Nitrogen. Nitrogen focuses more on the UI side of the web framework, omitting any kind of data storage. It’s very easy to create an AJAX-based user interface using Nitrogen, and there is nice support for Comet. As part of his presentation, Rusty showed his slides on a Nitrogen-based webcast reflector. You specify the UI using Erlang terms, which are then used to generate HTML/Javascript/etc., something that caused a stir in part of the Twitter peanut gallery. I was mostly happy to see people focusing on solving the current generation of problems. My favorite web-space talk was probably Justin Sheehy’s talk on Webmachine. I think that I prefer the description of Webmachine as a REST or HTTP toolkit. Webmachine gives you what you need to implement any HTTP method correctly, and then provides a set of callback functions that can be implemented to customize that processing to do actual work. One of the coolest things about Webmachine is its ability to visually show you the path taken in processing a particular HTTP request, and to let you inspect/dump data at various points in the diagram. It makes for a very nice demo.

There were not that many “language geek” talks. This contrasts with the early years of PyCon (at least for as long as I have attended) where there were quite a number. I missed Robert Virding’s talk on Lisp Flavored Erlang (but I saw some example usage in a CouchDB talk), because it overlapped the dynomite talk. I was able to attend Tony Arcieri’s talk on “Building Languages on Erlang (and an introduction to Reia)”. During the first part of his talk, Tony showed how to construct an Erlang module on the fly in the Erlang shell. He then discussed some tools which are useful to people trying to build languages on top of BEAM, the Erlang virtual machine:

  • Robert Virding has written leex, a lexical analyzer generator
  • yecc, a Yacc style parser generator is included in the Erlang distribution
  • the erl_syntax_lib library aids in constructing Erlang abstract syntax trees, which can then be compiled to Erlang bytecode.
  • Erlyweb contains the smerl (simple metaprogramming) library for creating and manipulating Erlang modules at runtime.

After that, he launched into a description of Reia. I’m not sure that I agree with some of the choices that he has made, but I am happy to see people experimenting with languages on top of BEAM that keep with Erlang’s process model and the OTP infrastructure. One of the things that Tony mentioned was abandoning indentation-based syntax; he wrote an entire postmortem on that experience in his blog. Python’s indentation-based syntax has won me over and made me a fan, and I am sad to see that indentation syntax, blocks/closures, and expression orientation continue to be at odds.

Coda

It looks like Erlang is starting to find a home. Companies are using it in production. There are books starting to be written about it. Many (not all) of the things which make Erlang seem odd to “mainstream” programmers also appear in languages like Scala, Haskell, and F#. At the same time, Erlang has a long history of industrial deployment, albeit in a single (large) market segment. Many of the problems which we now face in large web systems (and the cloud): concurrency, distribution, high availability, and scalability are strengths for Erlang. Indeed, many of the people that I heard from or talked to basically said that they couldn’t solve their problem with any other technology, or that their solutions were dramatically simpler than the technologies that they already knew. Will that be enough to propel Erlang into the mainstream? I don’t know. I also don’t know if our current state of mainstreamness is going to remain. More and more I’m seeing an attitude of “let’s use the best tool for the job”, not only in languages, but in all parts of (web) applications.

There’s also the issue of the Erlang community itself. Around 120 people showed up for the conference. As I mentioned previously, there are the folks who have been doing Erlang for years. Then there are the relative newcomers, who are web-oriented/web-savvy and are solving problems in very different domains than the original problem domain of Erlang and its inventors. Thus far, the two segments seem to be getting along fine. I hope that will continue; success, or the potential for success, has a tendency to bend relationships.

Refactoring in the Functional Programming world

I’m an Emacs guy. I was first exposed to Emacs back in 1984 on a VAX running BSD. This was prior to GNU Emacs, so the Emacs that I saw was James Gosling’s Emacs. At the time, I was working on a compiler for a functional programming language called SUPER, which was evaluated using combinator graph reduction.

For many years, and across many languages, including Scheme, C, C++, Perl, TCL, and Java, Emacs was my tool of choice. My hands had the muscle memory for the keystrokes, and over those years I accumulated a file full of Emacs-Lisp customizations for Emacs (by this time, mostly GNU Emacs). When Eclipse started to support refactoring I started using Eclipse as my primary tool for editing Java programs. Refactoring is an example of the kind of high leverage features that I want in my programming tool set.

A few days ago I found some gems buried in a thread on the Scala mailing list. Dave Griffith has been accumulating a list of refactorings for Scala. Here’s his complete list:

Curry Method (split a parameter list, and the arg lists of all callers).

Uncurry Method (merge split parameter list, including merging the arg lists of callers. If method is called with partial args, either complain or automatically create a helper method which represents the partial application, and replace partial calls with it.)

Extract Trait (including searching for other classes which can have the same trait extracted. Tricky with super calls, but not impossible)

Split Trait (splits trait into two traits (putting in self-types if needed), change all extending classes to extend both traits)

Extract Extractor (select a pattern, automatically create an extractor)

Extract Closure (similar to extract method, but creating a function object)

Introduce by-name parameter

Extract type definition (obvious)

Merge nested for-comprehensions into single for-comprehension (and converse)

Split guard from for-comprehension into nested if (and converse)

Convert for-comprehension into map/filter/flatmap chain (and converse)

Wrap parameter as Option (converting null checks, etc.)

Convert instanceOf/asInstance pair to match

Replace case clause with if body to guarded case clause(s)

I was particularly interested in those refactorings related to functional/higher-order programming and pattern matching. Between the surge of interest in Scala, F# and Haskell, it looks like there’s room for some more work in refactoring.