NPM: Install Packages Not Yet Published
This morning I was working on a project and one of the modules I depended on had a small bug in it. As I was about to log an issue on the project’s GitHub page I discovered that it was already fixed, just not yet released. I really wanted to push my changes out to our staging server, and my build process relies on npm gathering all the dependencies my project needs, so I looked for ways to install the unreleased fix through npm without much modification to my build process.
What I discovered could be considered an abuse of npm’s preinstall hook, but it works.
This brings in not just the module but all of its transitive dependencies as well. The trick worked for me, but I’m still a little doubtful that it is the “right” way around the problem.
Blog Rolling with MongoDB, Node.js and Coffeescript
This morning I woke up with a lingering thought on my mind that was left over from recent conversations. In the technical community we often get so invested in our work that rather than talk about the simple building blocks that build our success we talk about the huge breakthroughs we make. The problem however is that our breakthroughs most often aren’t accessible to someone who wants to just get started. So today I will give an intro tutorial to using node.js, coffeescript and mongodb to build a simple blog. It builds off the concept in a tutorial I first used to learn node.js more than a year ago, but with a completely from scratch approach. In this tutorial I will also cover practicing Behavior Driven Development using Mocha.
Getting Started
Quite obviously, you’re going to need node.js and mongodb installed. For node.js, I recommend downloading it from the node.js website and following the installation instructions there. I will note that this tutorial covers 0.6.7, so if you come across this post a year from now the API might have changed significantly. For mongodb, you can download it here. If you use Ubuntu (or some other Debian derivative) you should consider installing from their apt repository. OSX? No problem, you can also install it via homebrew or macports.
Finally, since we’ll be using coffeescript for this tutorial, run npm -g install coffee-script (you might need to sudo) to install coffeescript. Run coffee from the command line to access the coffeescript REPL. If all works well, use npm to install the additional packages listed below, which we’ll be using throughout the tutorial:
- express
- mocha
Now let’s bootstrap our project structure. Type express coffeepress to generate a skeleton express project structure. You should see output similar to the following:
Notice how at the end it says to cd to the directory and type npm install? Let’s follow those instructions. Let’s run what we have so far by typing node app.js and navigating to http://localhost:3000. This is the default structure that gives a good starting point. Feel free to investigate the files under the directory before moving on. I even suggest poking around by changing the view a bit and changing the title from express to “My Coffeepress Blog”.
Porting to Coffeescript
At this point, let’s port our backend to coffeescript. I used to copy and paste files into the js2coffee website but you can also install js2coffee via npm. So run the following:
Now you can run coffee app.coffee to run the same app, but now in coffeescript. Take a look at the resulting files to get a feel for what has changed. New to coffeescript? Then I recommend taking a gander at coffeescript.org before moving on. Here is the project structure so far.
Basic Navigation
I like to try and work my way from the outside in while developing a site or feature, materializing components into existence as I need them. So let’s start by working on the initial navigation of the site with some simple in-memory storage of blog posts. This is a good time to get our test framework set up and write a few simple tests against our routes. Normally I prefer not to write tests against my routes, shoving logic into heavy models or services instead. However, I have come to learn that untested components of a system serve as a gravity well for untested code that eventually leads to clients calling you about broken applications. What follows serves as an introduction to both Mocha and express’ routing mechanism.
Let’s edit our package.json to include our test framework dependencies.
We include should so that we can use BDD style assertions (more on this in a bit). Write a simple test case located at test/routes-test.coffee with the following code to get us started with mocha:
Now run this by typing mocha from the root of the project directory. It should pass. Let’s go ahead and make it fail by changing 4 to 5 and rerunning it. Hopefully this gives you a good feel for our test framework before we move on and change this test to reflect our existing index route. Swap the code in this test out with the following.
Here we fake our requests and response in order to capture what is passed into the response. We fake the render method and verify that our rendered view is “index” and that the variable title is equal to what we expect to be passed in. Run the tests and make changes to your route to make it pass.
Now let’s add a posts variable that will be an array of posts we’ll display on the front page. Add the following assertion right after the title assertion:
Run the tests to see it fail and change the route to have a posts array variable available in the template.
Unfortunately you’ll notice that the test fails. This is due to a subtle difference between equal and eql. The former enforces strict equality while the latter is a bit looser, so we change our assertion to use eql. Take a look at the should documentation for more information.
Next let’s write tests for the “new post” route.
Run it, see the failure, and rework our routes.coffee file to include the route (with no implementation yet):
You’ll notice our test passes. That’s not good. Why? Because we put our assertion in our res.render callback, which never gets executed. Doh! How can we make absolutely sure it gets called during our test run? Old school thinking would have you assign a local variable outside the scope of the callback that gets assigned during execution and then can be verified against later on. However we have no guarantee that the routing logic will be synchronous!
Thankfully mocha has a feature that allows for easy testing in these situations. In the method declaration of our test specify a parameter named done. This is a callback that we can call anywhere to indicate the test is done. Basically the test will wait up to a default of 2000ms for it to be called. With this in mind, let’s modify our tests with the following:
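Separately from the tutorial’s own test, here is a rough plain JavaScript sketch of the idea behind done. The runTest function is a toy runner I invented for illustration and is not how Mocha is implemented. The two routes are hypothetical as well.

```javascript
// Toy runner: the test passes only if done() is called before the timeout
function runTest(testFn, timeoutMs, report) {
  let finished = false;

  const timer = setTimeout(() => {
    if (finished) return;
    finished = true;
    report('failed: done() was not called within ' + timeoutMs + 'ms');
  }, timeoutMs);

  testFn(function done(err) {
    if (finished) return;
    finished = true;
    clearTimeout(timer);
    report(err ? 'failed: ' + err.message : 'passed');
  });
}

// A route with no implementation never renders, so done() is never reached
const emptyRoute = (req, res) => {};
runTest((done) => {
  emptyRoute({}, { render: () => done() });
}, 50, (result) => console.log('empty route:', result));

// A route that renders reaches done() and the test passes
const workingRoute = (req, res) => res.render('add_post');
runTest((done) => {
  workingRoute({}, { render: () => done() });
}, 50, (result) => console.log('working route:', result));
```

The point is that a test whose callback never fires can no longer pass silently: it has to signal completion or it times out.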
If we run this via mocha now we’ll notice that we have one failure. Let’s go ahead and implement the route and connect it into our router.
And connect it up in app.coffee:
Modifying the Views
This code is useless without views, so let’s modify our views a bit. Let’s modify our layout.jade to link to the new posts page. This layout also makes use of twitter bootstrap because I’m too lazy to design something for this tutorial.
And create our add_post view at views/add_post.jade. An interesting thing to note here that I’ll touch on in a bit is that I prefix the input names with post.
Now let’s add another route to handle the post. This time I’m going to kind of skip delving into the details of writing the test, but you can look at what I have so far here if you’d like to see it.
For now, we’re just going to store each post in an array. Nothing fancy yet. We also add a new route to app.coffee. We could refactor or use some express-mvc plugin to reduce adding each route by hand, but I think it’s good to do it like this to get a feel for express’ low level routing mechanisms.
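As a minimal sketch of what array-backed storage can look like, in plain JavaScript: the addPost and findPost names and the id scheme are my own assumptions for illustration, not the tutorial’s route code.

```javascript
// In-memory stand-in for a database: posts live only as long as the process
const posts = [];

// Store a post and give it an id based on its position in the array
function addPost(fields) {
  const post = { id: posts.length, title: fields.title, body: fields.body };
  posts.push(post);
  return post;
}

// URL parameters arrive as strings, so normalise before comparing
function findPost(id) {
  return posts.find((post) => post.id === Number(id)) || null;
}

const first = addPost({ title: 'Hello', body: 'First post' });
console.log(findPost(String(first.id)).title);
```

Everything is lost on restart, which is exactly the limitation a real database removes.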
Finally, we’ll add one last view for viewing a single post:
Go ahead and start the application up and navigate to http://localhost:3000. Post a few posts and play around a bit. You can see the finished application we have so far here.
Mongoose
Whew. I hope I haven’t lost you yet. Especially with the tests against the routes… I know those are always a bit painful! Now that we have a functional blog, let’s make it persistent by storing posts in mongodb using Mongoose.
Let’s add a dependency on mongoose to our project and freeze it at version 2.4.10. As always, run npm install to bring it in. Now we’ll create an initial test to just test mongoose out.
Here we import both mongoose and the model object that we’re going to create. Since we want our test to start with a clean slate, we use the before hook (which runs once before anything else in the test runs) to both connect to the database and then remove all of the Post objects from mongodb. We pass the done callback to the remove call so that tests don’t run until all Posts have been removed.
Now we create a new Post instance. You can pass an object literal in to set properties on the model, so we do that here. Finally, in our post.save callback we look the post back up and verify certain attributes have been set. It’s a dumb test (and in fact I rarely test mongoose’s behavior like this), but it does verify that we’ve configured our model correctly.
Now let’s implement our model to make the test pass.
Pretty simple. Now let’s refit our routes to use the Post model instead of an in memory array.
That’s all good and dandy, but one last hiccup is that our route tests now fail. Chalk this one up to not having any abstraction or dependency injection in place. That is fine for now; we’ll live with it and change the tests.
Finally we need our app to actually connect to mongoose when we run it. I like to do this based on the express configuration. This is immensely important if you have mongodb running on servers separated from your application. For this example we’ll just use the databases coffeepress-dev and coffeepress-prod.
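One way to picture the per-environment choice is a small lookup, sketched here in plain JavaScript. The databaseUrl helper and the localhost host are assumptions for illustration. Only the two database names come from the text above.

```javascript
// Hypothetical helper: map an express environment name to a MongoDB URL
function databaseUrl(env) {
  const names = {
    development: 'coffeepress-dev',
    production: 'coffeepress-prod'
  };
  if (!Object.prototype.hasOwnProperty.call(names, env)) {
    throw new Error('No database configured for environment: ' + env);
  }
  // Assumes mongod runs on localhost; a real deployment would use its own host
  return 'mongodb://localhost/' + names[env];
}

console.log(databaseUrl('development'));
console.log(databaseUrl('production'));
```

Failing loudly on an unknown environment avoids quietly writing to the wrong database.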
Run it and write a few posts. Restart the app and you’ll see the posts still there. Woot!
Conclusion
Well, that about wraps it up… you can see this tutorial in its finished glory on the finished branch of the repository. There’s a bit missing here that we’d implement in the real world. Obviously some kind of authentication would be in order if we took this further, possibly using mongoose-auth. We’d also want to add some validation when posting. These are all excellent topics for future posts but for now I hope this was enough to help you get going!
Streaming Files from MongoDB GridFS
Not too long ago I tweeted what I felt was a small triumph on my latest project, streaming files from MongoDB GridFS for downloads (rather than pulling the whole file into memory and then serving it up). I promised to blog about this but unfortunately my specific usage was a little coupled to the domain on my project so I couldn’t just show it off as is. So I’ve put together an example node.js+GridFS application and shared it on github and will use this post to explain how I accomplished it.
First off, special props go to tjholowaychuk, who responded in the #node.js IRC channel when I asked if anyone has had luck with using GridFS from mongoose. A lot of my resulting code is derived from a gist he shared with me. Anyway, to the code. I’ll describe how I’m using gridfs and, after laying the groundwork, illustrate how simple it is to stream files from GridFS.
I created a gridfs module that accesses GridStore through mongoose (which I use throughout my application), so it can share the db connection created when mongoose connects to the mongodb server.
We can’t get files from mongodb if we cannot put anything into it, so let’s create a putFile operation.
This really just delegates to the putFile operation that exists in GridStore as part of the mongodb module. I also have a little logic in place to parse options, providing defaults if none were provided. One interesting feature to note is that I store the filename in the metadata because at the time I ran into a funny issue where files retrieved from gridFS had the id as the filename (even though a look in mongo reveals that the filename is in fact in the database).
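The option handling described above might look roughly like this in plain JavaScript. The parseOptions name is hypothetical, and I am assuming content_type and metadata as option keys with a binary default. Treat it as a sketch rather than the module’s actual code.

```javascript
// Hypothetical option parsing for a putFile-style helper
function parseOptions(name, options) {
  // Start from a default content type (an assumption), then apply overrides
  const opts = Object.assign({ content_type: 'binary/octet-stream' }, options || {});
  // Also keep the filename in metadata so it can be recovered on retrieval
  opts.metadata = Object.assign({}, opts.metadata, { filename: name });
  return opts;
}

console.log(parseOptions('receipt.pdf'));
console.log(parseOptions('receipt.pdf', { content_type: 'application/pdf' }));
```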
Now the get operation. The original implementation of this simply passed the contents as a buffer to the provided callback by calling store.readBuffer(), but this is now changed to pass the resulting store object to the callback. The value in this is that the caller can use the store object to access metadata, contentType, and other details. The user can also determine how they want to read the file (either into memory or using a ReadableStream).
This code just has a small blight in that it checks to see if the filename and fileId are equal. If they are, it then checks to see if metadata.filename is set and sets store.filename to the value found there. I’ve tabled the issue to investigate further later.
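In plain JavaScript the workaround amounts to something like the following sketch. The restoreFilename helper is a hypothetical name, and the store objects here are plain stand-ins rather than real GridStore instances.

```javascript
// If the filename came back as the id, fall back to the copy in metadata
function restoreFilename(store) {
  const nameIsId = String(store.filename) === String(store.fileId);
  if (nameIsId && store.metadata && store.metadata.filename) {
    store.filename = store.metadata.filename;
  }
  return store;
}

// Stand-in object shaped like the fields the check relies on
const store = { fileId: '4f1d', filename: '4f1d', metadata: { filename: 'receipt.pdf' } };
console.log(restoreFilename(store).filename);
```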
In my specific instance, I wanted to attach files to a model. In this example, let’s pretend that we have an Application for something (job, a loan application, etc) that we can attach any number of files to. Think of tax receipts, a completed application, other scanned documents.
Here I define files as an array of Mixed object types (meaning they can be anything) and a method addFile which basically takes an object that at least contains a path and filename attribute. It uses this to save the file to gridfs and stores the resulting gridstore file object in the files array (this contains stuff like an id, uploadDate, contentType, name, size, etc).
Handling Requests
This all plugs in to the request handler to handle form submissions to /new. All this entails is creating an Application model instance, adding the uploaded file from the request (in this case we named the file field “file”, hence req.files.file) and saving it.
Now the sum of all this work allows us to reap the rewards by making it super simple to download a requested file from gridFS.
Here we simply look up a file by id and use the resulting file object to set Content-Type and Content-Disposition fields and finally make use of ReadableStream::pipe to write the file out to the response object (which is an instance of WritableStream). This is the piece of magic that streams data from MongoDB to the client side.
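The pattern can be tried without MongoDB at all. The sketch below, in plain JavaScript, uses Node’s built-in stream module with stand-ins for both ends: a readable made from two chunks plays the GridFS file and a writable plays the response. The downloadHeaders and sendFile helpers and the file field names are assumptions for illustration.

```javascript
const { Readable, Writable } = require('stream');

// Build the download headers from a file object (field names are assumptions)
function downloadHeaders(file) {
  return {
    'Content-Type': file.contentType,
    'Content-Disposition': 'attachment; filename="' + file.filename + '"'
  };
}

// Pipe the source to the response chunk by chunk instead of buffering it all
function sendFile(file, source, res) {
  const headers = downloadHeaders(file);
  Object.keys(headers).forEach((key) => res.setHeader(key, headers[key]));
  return source.pipe(res);
}

// Stand-in for the stored file: a readable that yields two chunks
const source = Readable.from([Buffer.from('hello, '), Buffer.from('gridfs')]);

// Stand-in for the HTTP response: a writable that collects what it receives
const chunks = [];
const res = new Writable({
  write(chunk, encoding, callback) {
    chunks.push(chunk);
    callback();
  }
});
res.headers = {};
res.setHeader = (key, value) => { res.headers[key] = value; };
res.on('finish', () => console.log(Buffer.concat(chunks).toString()));

sendFile({ contentType: 'text/plain', filename: 'greeting.txt' }, source, res);
```

Since pipe handles backpressure, a slow client only slows down reading from the source rather than filling up memory.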
Ideas
This is just a humble beginning. Other ideas include completely encapsulating gridfs within the model. Taking things further we could even turn the gridfs model into a mongoose plugin to allow completely blackboxed usage of gridfs.
Feel free to check the project out and let me know if you have ideas to take it even further. Fork away!
Enabling JMX in Gradle’s jetty Plugin
It’s another day, which means another gradle tip. I have been experimenting with JMX lately and using MBeanExporter to export spring beans so that I can interact with them over JMX (specifically, stopping and starting rabbitMQ consumers). I can get this working on any container easily enough but I really wanted to get it working with my locally running jetty instance launched by gradle.
First you’ll set a jettyConfig for the jettyRun task. I usually do this for both jettyRun and jettyRunWar:
The additionalRuntimeJars is needed because of a transitive dependency on mx4j. I don’t know why this is, but it is required. I add mx4j as a providedRuntime dependency along with jetty-management:
Finally you need to set up your jetty configuration to start up a JMX server. There’s a bit of freedom here with what you can do but here is one that I stole shamelessly from the jetty website:
Now run gradle jettyRun and have jconsole open a remote connection to service:jmx:rmi://localhost:2100/jndi/rmi://localhost:2099/jmxrmi and go do whatever you want to do with JMX.
Gradle Tip: Start/Stop Embedded Jetty for System Tests
I thought I’d share another feature of gradle that I have found extremely useful: starting and stopping an embedded jetty server when my tests run. This is really useful for projects that host web services, as it allows me to hit them and verify the correct results, plus it verifies that the full stack is configured correctly. One could quite possibly also use this setup on web projects and have Geb based tests run against their project.
Given a project set up using the jetty plugin as I described in my previous post, all you need to do is hook jetty in to run before and after the test task:
And that’s it. Now whenever you run gradle test the embedded jetty server will run along with your tests.
Gradle: Using JNDI with the Jetty Plugin
I use gradle a lot at work, and one discovery that was a true win was how to fake JNDI when using the jettyRun task for local development. Googling and searching the documentation didn’t yield anything at first, so I thought I’d write a quick post detailing how to do it in case you’re like me and googling for the same thing.
First off you need to create a jetty-env configuration file and put it somewhere in your project (I prefer src/test/resources). Here’s a sample of one I use that uses H2 for the dataSource:
This uses an already running H2 instance and runs an init script located in src/test/resources for populating the database with some tables. From our gradle script we need to reference the file from the jettyRun task (I also add it to the jettyRunWar task).
Finally, to complete this example, we want H2 running before jetty kicks off. So we add the h2 dependency to our build script and run the main method of org.h2.tools.Server.
I’ve created a sample spring MVC 3 project that makes use of all the above for local development. Just clone it and run gradle jettyRun to see it in action!
Big Company vs. Small Company
The other day I was having lunch with a friend of mine who works for a medium-sized company (by medium-sized I mean large, but not Fortune 500 large). Our discussion touched on a variety of topics, but one that caught my attention was when he voiced his frustration with his current project. “We’re not doing much programming right now,” he quipped, “for the most part we’re doing static content management and updating pages that are basically a ‘Recent News’ section for the organization.” With all respect to my friend (who is a really good programmer), this discussion really reminded me of what I dislike about companies with large company mindsets.
It’s hard for me to put my finger on it, but the gist of it is that developers paid upwards of $60,000 a year were doing static content management, while a $15 an hour developer fresh out of college would most likely set up Drupal or a WordPress blog and let the organization update it while he focused on more important things. I’d know… I was once that $15 an hour developer.
It’s a common problem I’ve seen in large companies in my experience… problem A is easily solved by tool B, but tool B is written in language C while the company has embraced language D (and its derivatives). No good equivalent to tool B exists in language D, so the developers will spend lots of time doing manual work, in which case you have one to five developers with degrees updating text (I’ve seen it). The worst case (and common) scenario is that the company will spend upwards of a quarter million developing a poor imitation of tool B in language D that will only work in house and will never be usable outside the company.
Why is this? Why is it so hard to just use tool B and keep focusing on the more important tasks at hand? What I’ve observed in my career is that companies with large company mindsets can rarely consider using languages outside of their core language choice. It requires server provisioning, training, hiring developers or server administrators experienced with language C… and quite possibly the hiring of consultants who specialize in tool B. The prep time for such a task could easily take six months or more, and in the end it will probably be decided that tool B isn’t up to the task.
Compare this to a more fluid development environment. The developer probably uses language D too but knows tool B would be the best choice to get the job done. He’d probably have the company drop 40 stones on a rackspace instance and install tool B and language C on it and integrate it with the current site written in language D. I’ve seen this done in days, if not hours.
Notice I really try to emphasize “big company mindset” over “big companies” as the guilty party of this tomfoolery. Just because you’re big doesn’t mean you have to act like this, and I’ve seen small companies engage in this behavior as well. “Large Company Minded” companies tend to prefer a great deal of process in the way they do business and identify success with solid, foolproof process that can be adhered to by anyone. Don’t get me wrong, I’m not saying the absolute chaos of having no process is a good thing, but I believe that a business should have a certain degree of fluidity to be successful. Why not just do a quick cost analysis of how much it would cost to have a single server (probably even one already running some other application server) run language C and tool B rather than redevelop tool B in language D? Why not just determine that that would be the best option and just do it instead of lollygagging around and wasting your company’s money?
Accessing a Connect Session From Socket.IO
UPDATE 11/28/2011: After talking with TJ Holowaychuk I discovered I was doing it wrong… there are better ways to do this than the hack I had come up with. Using connect to parse the cookie you can use this instead:
Which is much cleaner than what I originally posted. The original post is intact for historical reasons.
I thought I’d post on a technique I’ve been using to associate the user’s session with a socket.io server. Although this technique was done in a pure node.js app, it’s probably possible to do the same to grab the session id from your PHP app or Grails app that is utilizing socket.io.
Anyhow, here’s what I’m rolling with:
Yep, it’s sneaky. I sniff the sid out and simply use redisClient (since redis is my backing session store) to look the session up. Now on all socket requests I can access the session directly.
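For readers who want to see the sniffing step in isolation, here is a minimal plain JavaScript sketch. It assumes the session cookie is named connect.sid and that its value is the bare session id (no signing). The sessionIdFromCookie helper is hypothetical, and a Map stands in for the redis-backed session store.

```javascript
// Pull a named value out of a raw Cookie header string
function sessionIdFromCookie(cookieHeader, key) {
  const pairs = (cookieHeader || '').split(';');
  for (const pair of pairs) {
    const index = pair.indexOf('=');
    if (index === -1) continue;
    const name = pair.slice(0, index).trim();
    if (name === key) {
      return decodeURIComponent(pair.slice(index + 1).trim());
    }
  }
  return null;
}

// Stand-in for the session store lookup (the post uses redis; a Map here)
const sessions = new Map([['abc123', { user: 'jane' }]]);

const sid = sessionIdFromCookie('theme=dark; connect.sid=abc123', 'connect.sid');
console.log(sessions.get(sid));
```

In a socket.io handler the header would come from the handshake request instead of a literal string.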
