Node.js research
V Introduction
* Hi, I'm Ryan Wilcox. I've been programming for about 15 years on various things, and been around the block a few times. I've done classic Mac OS applications, cross-platform applications in C++, Python web apps of all sorts (including some Twisted), declarative programming, and I've spent the last 3 years doing Ruby and Ruby on Rails (including some Event Machine).
* I know the best practices of these frameworks, the pitfalls, and the "why" of those best practices.
* So, when evaluating node.js for use in a potential project, I asked around for the node.js best practices. I didn't get as much discussion as I was hoping for.
* I decided to dig in and look at node.js from the eyes of someone who's Been There, Done That.
* I am a node.js outsider. This potential project will be my first node.js project of any size. I don't speak from direct experience with node.js, but from my research and my own knowledge dealing with other event systems. I'd love to know if I'm wrong on a topic.
V Audience
* This document assumes you have tried out node.js a little. You installed it on your machine and ran the Hello World HTTP server. You saw the "don't block, use callbacks" philosophy of node, and thought, "OK, I can do this."
* Then maybe you put node.js down because you didn't need an asynchronous Hello World server in your mostly Ruby (or Python, Clojure, or Bog knows what else) shop. This is exactly what I did. Until the other day.
* This document assumes you've been around the block a few times, and are looking at node.js with an evaluating eye. "Can I use this for a new potential project, and what are the best practices in the community?"
* Yes, the Node.js Modules page (https://github.com/joyent/node/wiki/modules) is there. It's also 35 pages long - a great show of what node.js can do… you see all the practices, but not which ones are the best.
* "But seriously, I just need to write some code, not check out 100+ node.js projects that may or may not still work or be any good. And really, that 'node.js is cancer' rant was a big deal a while back, WTF's up with that? And then there were those non-blocking Fibonacci servers…."
Node.js is cancer: http://teddziuba.com/2011/10/node-js-is-cancer.html
Node.js has jumped the shark: http://www.unlimitednovelty.com/2011/10/nodejs-has-jumped-shark.html
node.js non-blocking Fibonacci code: https://github.com/glenjamin/node-fib/blob/master/app.js
* Still here? Good - keep reading, because you're my audience.
* Or did that kind of go over your head? If you said "Wait, huh, what?!", this article links to a fair bit of reference material, so take a look at it and come back. This research was pretty frustrating to gather, even for me.
V The node.js Event Model
* Cooperative multi-threading: process.nextTick() lets you defer stuff to the next time the event loop is idle. So you can let other things have a chunk of time if you're in the middle of a long, blocking operation.
http://nodejs.org/docs/v0.3.1/api/process.html#process.nextTick
http://en.wikipedia.org/wiki/Thread_(computer_science)#Multithreading
* Node.js (highly) encourages a non-blocking style of programming. Thus all the callbacks: "do this at some time after this other thing happens".
V Wait, what does "non-blocking" mean?
* Yielding to process.nextTick in lengthy operations
* using callbacks when performing low level operations
V using small blocking operations to build up larger sequences of events that happen asynchronously
V for example:
* for (var i = 0; i < records.length; i++) {
var record = records[i];
record.updateTimeRemainingAsynchronously(function() {});
}
* the FOR loop in this case is blocking, but it's spawning N record.updateTimeRemainingAsynchronously functions to run sometime in the future. (One just has to hope/know that record.updateTimeRemainingAsynchronously() is actually non-blocking.)
* Using the event loop to split things up, or shoving it to workers
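The loop above can be fleshed out into something runnable. This is only a sketch: makeRecord and updateAll are hypothetical names invented for illustration, and the record's updateTimeRemainingAsynchronously is faked with setImmediate rather than real I/O. It shows the (blocking, but short) loop kicking off N asynchronous updates, and a counter that notices when the last one has called back.

```javascript
// Hypothetical record type. The "asynchronous" update is simulated with
// setImmediate; a real one would be waiting on a database or the network.
function makeRecord(secondsLeft) {
  return {
    secondsLeft: secondsLeft,
    updateTimeRemainingAsynchronously: function (callback) {
      var self = this;
      setImmediate(function () {
        self.secondsLeft -= 1;
        callback(null, self.secondsLeft);
      });
    }
  };
}

// Start every update, then call done() once all of them have called back.
function updateAll(records, done) {
  var pending = records.length;
  if (pending === 0) {
    return setImmediate(done); // keep done() asynchronous even with no work
  }
  records.forEach(function (record) {
    record.updateTimeRemainingAsynchronously(function () {
      pending -= 1;
      if (pending === 0) {
        done();
      }
    });
  });
}

var records = [makeRecord(10), makeRecord(20), makeRecord(30)];
updateAll(records, function () {
  // Logged second: [ 9, 19, 29 ]
  console.log(records.map(function (r) { return r.secondsLeft; }));
});
// Logged first, because the loop only *started* the updates.
console.log('loop finished; updates still pending');
```

The counting is what libraries such as async do for you; it is written out here only to show there is no magic in it.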
V So, blocking is bad, right?
* Right
* "in node, everything runs in parallel except your code"
Technically the common quote is wrong, and could be refined: "Everything runs, one thing at a time, in the event loop, where everyone tries to be polite and give others time to do their thing. Your code should also be polite and give others time to do their thing." It's more accurate, but more unwieldy. Maybe "We are nice because node is nice" is a better (if obfuscated) quote.
* But why? You have one event loop per node.js process. If your code hogs the processor (event loop) for 5 seconds (by sleeping, or doing a large calculation, or Fibonacci numbers) node.js will not respond to anything else for those 5 seconds. "What other things?", you ask. Things like responding to other HTTP requests. The longer your code blocks, the more you are Denial Of Service Attacking yourself.
http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
http://www.slideshare.net/shivercube/functional-nodejs
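To see the self-inflicted denial of service for yourself, here is a small sketch. blockFor and measureTimerDelay are hypothetical helpers, not part of node. A timer is scheduled for "0 ms from now", then the code hogs the event loop with a busy-wait; the timer cannot fire until the hog lets go.

```javascript
// Busy-wait: hogs the one and only event loop for roughly `ms` milliseconds.
function blockFor(ms) {
  var end = Date.now() + ms;
  while (Date.now() < end) {
    // spin, doing nothing useful
  }
}

// Schedule a 0 ms timer, then block. Reports how late the timer actually fired.
function measureTimerDelay(blockMs, done) {
  var scheduledAt = Date.now();
  setTimeout(function () {
    done(Date.now() - scheduledAt);
  }, 0);
  blockFor(blockMs);
}

measureTimerDelay(200, function (delay) {
  // The delay is at least as long as the blocking call, not ~0 ms.
  console.log('0 ms timer fired after ' + delay + ' ms');
});
```

Swap the timer for an incoming HTTP request and the effect is the same: the request waits until your code is finished.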
V So how do you make your code not blocking where you can?
* Send really long stuff to the background: put it in a queue and process it in the background. This is Rails 301 stuff.
* Use Events to split work up into events that can be listened to. (Note that EventEmitter's emit() calls its listeners synchronously, so sending an event does not by itself let the event loop run; the listener still has to defer its work.)
* use the profiler (?)
* Be polite and use process.nextTick in places, but don't overdo it. (The more time you give away to Other Things, the more wall-clock time the request will take.)
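Here is a rough sketch of the "queue plus events" idea from the list above. createBackgroundQueue is a hypothetical helper, and this queue lives inside the same process; a real "send it to the background" setup would more likely hand jobs to a separate worker process. The sketch uses setImmediate as its yield point (an assumption on my part: in current node.js releases setImmediate lets pending I/O run between jobs, whereas process.nextTick callbacks run before I/O).

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical in-process queue: runs one job per turn of the event loop and
// announces progress with 'result' and 'drain' events.
function createBackgroundQueue() {
  var jobs = [];
  var running = false;
  var queue = new EventEmitter();

  function runNext() {
    var job = jobs.shift();
    if (!job) {
      running = false;
      queue.emit('drain'); // nothing left to do
      return;
    }
    queue.emit('result', job());
    setImmediate(runNext); // give other things a turn before the next job
  }

  queue.push = function (job) {
    jobs.push(job);
    if (!running) {
      running = true;
      setImmediate(runNext);
    }
  };

  return queue;
}

var reports = createBackgroundQueue();
reports.on('result', function (value) { console.log('job finished:', value); });
reports.on('drain', function () { console.log('queue empty'); });
reports.push(function () { return 'report A'; });
reports.push(function () { return 'report B'; });
```

Each job is still a small blocking operation, so the jobs themselves need to be short; the queue only guarantees that other callbacks get a look-in between them.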
V Possible Frameworks to use:
V Application Frameworks
V http://expressjs.com/ <- it's Sinatra-like
* Personally, I like Sinatra when I *know* I'm going into a very small project that will not grow a ton of features (thus resulting in a "big ball of mud" app). Elsewise, I like frameworks that assume a big structure the project can grow into.
V http://geddyjs.org/ <-- Rails like infrastructure,
* and offers geddy.util.async.execNonBlocking function to perform things in a non-blocking manner
* geddy.util.async.AsyncChain is a chain of functors that will be executed asynchronously
* ORM is DataMapper like
* supports generation, but not coffeescript generation
V http://railwayjs.com/ <-- another Rails like infrastructure framework
* Rails like
* Its generators can output coffeescript (pass --coffee to the rw commands)
* ORMs: mysql, mongoid, redis
* can also use Sequelize (a Datamapper based ORM) <-- BUT mysql only
* It is relatively easy to write Adaptors for the ORM that Railway.js uses, to support other things (postgres clients, for example)
> Other ORMs
* http://persistencejs.org/ <-- Datamapper based, mysql + sqlite adaptors
> User authentication frameworks
* https://github.com/ciaranj/connect-auth <-- node.js middleware version of Ruby's Warden framework
* https://github.com/bnoguchi/everyauth <-- node.js middleware version of Ruby's OmniAuth framework
V Testing
* http://vowsjs.org/ <-- Vows
* https://github.com/caolan/nodeunit <-- nodeunit
V Standard Library Stuff
* https://github.com/caolan/async <-- Async Iteration tools for node.js
* http://howtonode.org/do-it-fast <-- avoid event loop hell
* https://github.com/wdavidw/node-each <-- async each loops
* https://github.com/substack/node-seq <-- chainable async Iteration etc
* Something to keep the nested callbacks at bay. For example, an Observer pattern. Or Fibers/Promises. (Or Coffeescript...)
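As a taste of what "keeping the nested callbacks at bay" can look like, here is a minimal sketch using Promises, which are built into current JavaScript runtimes. addLater and addLaterPromise are hypothetical functions made up for this example; addLater stands in for any callback-style asynchronous call.

```javascript
// Hypothetical callback-style operation: adds two numbers "later".
function addLater(a, b, callback) {
  setImmediate(function () {
    callback(null, a + b);
  });
}

// Wrap the callback style in a Promise so that steps can be chained.
function addLaterPromise(a, b) {
  return new Promise(function (resolve, reject) {
    addLater(a, b, function (err, sum) {
      if (err) {
        reject(err);
      } else {
        resolve(sum);
      }
    });
  });
}

// Three dependent steps, written flat instead of nested three levels deep.
addLaterPromise(1, 2)
  .then(function (sum) { return addLaterPromise(sum, 3); })
  .then(function (sum) { return addLaterPromise(sum, 4); })
  .then(function (total) { console.log('total:', total); }) // total: 10
  .catch(function (err) { console.error('something failed:', err); });
```

The libraries listed above solve the same problem with plain callbacks; which style reads better is mostly a matter of taste.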
V Useful Javascript stuff I can NOT use (because it implicitly blocks):
V Underscore.js for Iterations: each implicitly blocks
* source (the underscore.js source): http://documentcloud.github.com/underscore/docs/underscore.html
V What I would love, but can't find
* MochiKit.Base's partial/bind, extend, repr, and Adaptor functionality without the (blocking?) functional tools
* But, if a library makes a blocking call available to me, I would rather not mistakenly reach for it when I want a non-blocking tool.
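For comparison, this is roughly what a non-blocking "each" looks like. eachAsync is a hypothetical helper written for this document, not Underscore's or anyone else's API. It handles one item per turn of the event loop and uses setImmediate to yield (again an assumption: on current node.js releases setImmediate is the call that lets pending I/O run in between).

```javascript
// Hypothetical non-blocking each: one item per event loop turn, then done().
function eachAsync(items, iterator, done) {
  var index = 0;

  function step() {
    if (index >= items.length) {
      return done();
    }
    iterator(items[index], index);
    index += 1;
    setImmediate(step); // yield before touching the next item
  }

  setImmediate(step); // never call back synchronously, even for an empty list
}

var squares = [];
eachAsync([1, 2, 3], function (n) {
  squares.push(n * n);
}, function () {
  console.log('squares:', squares);
});
```

Yielding after every single item is the most polite and the slowest option; processing a batch of items per turn is a common middle ground.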
V Development Tools
> Reloading code when files change (for example, during development)
* https://github.com/isaacs/node-supervisor <-- watches a directory structure and reloads the Javascript files when changes are made. (node.js does not do this by default, and neither does express apparently)
* cluster (mentioned elsewhere in this document) can also do this.
* http://github.com/mde/jake <-- Make/Rake like tool
V Deployment Tools
* http://railsbros.de/2011/02/18/deploying_a_node_js_server_with_capistrano_and_cluster.html <-- Deploying node.js with Capistrano and Cluster
V Production Tools
* http://learnboost.github.com/cluster/ <-- create a cluster of node.js servers. Thus your node.js app is load-balanced on the one machine, and you are running N event loops on the same machine. (Need multiple machines running cluster? Stick a load-balancer in front of your load-balanced machines!) Plugin community.
* https://github.com/pgte/fugue <-- billed as "Unicorn for node", but Fugue's own author says that you should probably use Cluster
V Q: "How do I get a local copy of all the libraries I use, like `bundle package` in Bundler?"
* A: npm will by default install packages into a local space (the node_modules folder of your project)
V Good node.js reads
* http://stella.laurenzo.org/2011/03/bulletproof-node-js-coding/
V Node.js & Coffeescript
* http://zappajs.org/ <-- A Coffeescript Sinatra-like built on top of (node.js) Express
* http://ariejan.net/2011/06/10/vows-and-coffeescript <-- Vows + Coffeescript
* "CS automatically integrates with require.extensions so if your scripts have a "coffee" extension they will run as coffeescript.."
V References / Presentations / Slideshows to watch
* http://www.slideshare.net/fleegix/mde-txjs-2011fullstackfallacies <-- no recorded audio, :(
* http://blip.tv/jsconf/jsconf2011-tom-hughes-croucher-5478056 <-- Tom Hughes-Croucher's node.js talk at JSConf 2011. I picked up a lot of information from this talk!
