April 19, 2009

I hate SVN, or: here’s a tool I use to make diffs better

Filed under: General Information — Ryan Wilcox @ 9:33 am

The Intro, and the frustration

So SVN’s command line interface could be a lot better. Specifically, svn diff.

Use case: I update and see a change someone else has made. I want to see what modifications they made relative to what it used to be… how do I do this?

The answer using the straight svn command line is so:

  1. Figure out what revision the file changed under (mindful that the revision subversion reported to you when you did an update may actually not be the revision of the last change, if you have multiple projects in the same repository and some other part of the revision changed between this change and the HEAD of the repository).
  2. Figure out what the revision-before-that-was. If you’ve figured out #1 by doing an svn log, well you have this number too.
  3. svn diff -r revision-from-step-2:revision-from-step-1 somefile

Which sucks. For one it’s too many steps, and then there’s general svn unhelpfulness.

To make your life more ‘wonderful’, svn diff will return nothing in the following situations:

  1. The file has not changed between the revisions you specified. (Obviously you didn’t do steps #1 and #2, but svn is not going to tell you this)
  2. The file doesn’t exist in the repository at all. (Obviously you screwed up the file name, but svn is again going to remain silent).

So if svn returns nothing you have one of these two things going on. Good luck debugging, buddy.

But yes, the suck. Non-obvious, and WAY too much work if I just want to see what changes some coworker made to my codebase. This needs to be a fast step: I just want to do a code review quickly and get back to work!!!

A tool to make it better

So I wrote a tool to make it better. Granted, an inefficient shell script to make it better, but hey.

This shell script accomplishes three things:

  • Gets rid of having to do steps #1 and #2. Instead you give the shell script what I call a revision offset (“Compare the current version to the version one change ago”). This is NOT a revision number (remember that there may have been other changes in the repository since that file changed); a “revision offset” is a relative number: “N changes ago”.

    Sadly, not in English like that, but it’s a bash script — maybe I’ll rewrite it someday and use Python and PythonMethodMatcher.

  • HELPS you debug. “Hey, buddy, I think that file doesn’t exist. You sure about that?”
  • This tool will also helpfully tell you what it’s doing, so it’s not opaque magic — if you want to do something different you can take what the script generates, modify it and off you go. This tool is transparent and wants to help you.

The shell script is far from perfect (it’s pretty inefficient, because I’m not a master bash coder), but it does make this less frustrating.
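
To make the “revision offset” idea concrete, here is a rough sketch in Ruby. This is not the bash script itself, just my guess at the core logic: the method names are made up for illustration, and it works on `svn log` text you hand it rather than calling svn.

```ruby
# Pull revision numbers (newest first) out of `svn log` text.
def revisions_from_log(log_text)
  log_text.scan(/^r(\d+) \|/).flatten.map(&:to_i)
end

# Build the svn diff command for two "N changes ago" offsets (0 = newest change).
# Both method names are hypothetical; they are not part of svn or the script.
def relative_diff_command(log_text, newer_offset, older_offset, file)
  revs = revisions_from_log(log_text)
  newer = revs[newer_offset]
  older = revs[older_offset]
  raise ArgumentError, "not enough history for #{file}" if newer.nil? || older.nil?
  "svn diff -r #{older}:#{newer} #{file}"
end

sample_log = <<LOG
------------------------------------------------------------------------
r170 | someone | 2009-04-09 22:52:59 -0400 (Thu, 09 Apr 2009) | 1 line

a change
------------------------------------------------------------------------
r167 | someone | 2009-04-06 09:43:25 -0400 (Mon, 06 Apr 2009) | 1 line

another change
LOG

puts relative_diff_command(sample_log, 0, 1, "README.txt")
```

The offsets index into the file's own history, so unrelated commits elsewhere in the repository never enter the picture.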

Example Use Case

$ ls
README.txt somefile.sh someotherfile.sh
$ svn log README.txt
------------------------------------------------------------------------
r170 | someone | 2009-04-09 22:52:59 -0400 (Thu, 09 Apr 2009) | 1 line

a change
------------------------------------------------------------------------
r167 | someone | 2009-04-06 09:43:25 -0400 (Mon, 06 Apr 2009) | 1 line

another change

....

# ok, enough background for the reader, on with the show!

$ svndiffrelative 0 1 README # <-- "compare the current revision to 1 change ago"
file README does not exist. CHECK THE NAME! Use --force to if you want to do this on a deleted file
$ svndiffrelative 0 1 README.txt
comparing revision: 170 to 167
running: svn diff -r 167:170 README.txt
Index: README.txt
===================================================================
--- README.txt (revision 167)
+++ README.txt (revision 170)
@@ -125,6 +125,8 @@
+ hello world!

The Script

Download the script from my site!

March 18, 2009

New Open Source Project: PyMethodMatch

Filed under: Uncategorized — Ryan Wilcox @ 9:35 am

I’ve been doing a lot of Ruby on Rails Cucumber testing, and was inspired by the simplicity of controlling the flow of execution by matching on regular expressions.

The idea behind Cucumber is you write English steps and assertions to test a web app. Plain English means just that: no “tell box 1 of window 1” almost-english-but-not-quite, and not nerd English like the “1.days” that Rails makes popular.

Plain English.

Here’s a (slightly modified) example from something I’ve been working on:


Given a logged in, registered user
And I have a blog post that I can edit
Then I should see "Edit"

This all happens by the parser matching those phrases with a bunch of regular expression code blocks elsewhere in the app, then executing one if one matches. Which is brilliant in its simplicity.
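
Here is a minimal sketch of that matching idea in plain Ruby. To be clear, StepRegistry is a name I made up for illustration: this is neither Cucumber's real API nor PyMethodMatch's, just the "regex picks the code block" trick on its own.

```ruby
# StepRegistry is a hypothetical name; this is not Cucumber's or PyMethodMatch's API.
class StepRegistry
  def initialize
    @steps = []
  end

  # Register a block to run when a phrase matches the pattern.
  def step(pattern, &block)
    @steps << [pattern, block]
  end

  # Strip the leading keyword, then run the first block whose pattern matches,
  # passing the regex captures as block arguments.
  def run(line)
    phrase = line.sub(/\A(Given|When|Then|And) /, "")
    @steps.each do |pattern, block|
      match = pattern.match(phrase)
      return block.call(*match.captures) if match
    end
    raise ArgumentError, "no step matches: #{line}"
  end
end

registry = StepRegistry.new
registry.step(/\AI should see "([^"]*)"\z/) { |text| "looking for #{text} on the page" }
registry.step(/\AI have a blog post that I can edit\z/) { "creating an editable post" }

puts registry.run('Then I should see "Edit"')
puts registry.run("And I have a blog post that I can edit")
```

A phrase that matches nothing raises an error instead of being silently skipped, which is probably what you want from a test tool.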

So I wrote this same kind of engine in Python. Just a generic engine where regex matches call functions, suitable for any purpose (not just test driven development like Cucumber).

I put it on GitHub. Find it at PyMethodMatch’s home on GitHub.

February 18, 2009

Product Launch: Experienced Man’s Guide to Cross-Platform Programming with wxWidgets

Filed under: Uncategorized — Ryan Wilcox @ 10:08 pm

Today I’m proud to announce a product launch: Experienced Man’s Guide To Cross-Platform Programming with wxWidgets.

When I buy a computer book I write notes in it. Sometimes these notes are back and forward references, sometimes I found a bug in something and I write the solution down, sometimes something behaves one way one place and another way another. Maybe time passes, things change, but the book has (now) outdated information.

So I write notes in the margin. When there isn’t any room in the margin I take out half a piece of paper and write my notes there.

Over the course of a few years my books become pretty note-ridden, especially if I was using the book every day.

I bought Cross-Platform GUI Programming with wxWidgets about 3 years ago, and during those 3 years I’ve been using wxWidgets almost non-stop for clients. I have a lot of notes in this book.

So I copied all my notes from my book and put them in a handy dandy PDF, which I’m now selling as a DRM free PDF.

Since this is a PDF, and not a traditional printed book, I’m selling it like software: Version 1.0 is out today, but there will be upgrades and fixes and revision-level releases, and so on and so forth. If you have knowledge to add, or ideas or comments, please contact me and we’ll see what we can do to make you happy.

So please support this product, and my effort to keep the wxWidgets book up to date, relevant and useful!

February 8, 2009

Installation Hints for ReviewBoard on Debian PPC (Sarge)

Filed under: ResearchAndDevelopment, General Information — Ryan Wilcox @ 12:12 am

So, after trying to install ReviewBoard on a virtual machine (and failing), I decided to install it on my real server, a machine running Debian PPC (sarge).

Eventually I got it up and running, but here are my notes:

I decided to run it off fastcgi (asking apt-get to install lighttpd pulled in tons of other software I didn’t want… and I didn’t want to screw around with setting up mod_python - although in retrospect maybe that was silly). You need to apt-get libapache2-mod-fastcgi. EXCEPT this doesn’t work by default: you need to set your /etc/apt/sources.list file to download “non-free” software too. So your lines will look something like deb http://debian.lcs.mit.edu/debian/ testing main non-free. What makes this confusing is that there is libapache2-mod-fcgid, which is a different implementation of the fastcgi protocol… but not what you want at all. Anyway there’s more yack about sources.list here.

Also, if you’re having Apache2 listen on an oddball port for requests to your ReviewBoard, make sure your firewall lets those ports through :).

Make sure to uncomment the line in apache2.conf that Includes the files in sites-enabled!

I thought Everburning: Fighting Review Board was going to be helpful, but turns out it mostly works out of the box (without the stuff he does).

You might have to fiddle with settings_local.py’s FORCE_SCRIPT_NAME, and what the rewrite rule in the Review Board generated Apache2 config file says, to make them agree. This will be pretty obvious when the front review board page is ugly (because the CSS didn’t load), OR you’ll get an error about URLconf defined in djblets.util.rooturl, Django tried these. URL patterns...

I think that’s it. I’m pretty sure I used easy_install ReviewBoard to get it installed, but I think there might be an apt-get package for it too.

So it wasn’t a bad experience, just one I was hoping to avoid with a virtual machine for it.. oh well I guess. Hope this helps someone else.

February 5, 2009

Running ReviewBoard Virtual Machine by rPath/rBuilder

Filed under: Uncategorized — Ryan Wilcox @ 6:31 pm

Today I got distracted by Review Board, a tool to help companies do code reviews. Because I was too lazy to deal with the dependencies I decided to look for a virtual machine for Review Board, all configured for me.

This was probably a stupid idea in retrospect.

I found a virtual image provided by rPath/rBuilder with (come to find out) no documentation. None at all. Which is annoying because to set up Review Board you need a “superuser” for the web app.

So, in order to set up Review Board on this box you need to do these things:

  1. Log into the machine. root with no password will log you into the shell
  2. rm /srv/reviewboard/data/reviewboard.sqlite
  3. /srv/reviewboard/manage.py syncdb. It’ll ask you if you want to create a superuser. Say yes.
  4. chmod 777 /srv/reviewboard/data/reviewboard.sqlite

Yes, the chmod 777 is a security risk. However, if you were to actually deploy this you’d want to use MySQL or something like that.

The settings files are in the /srv/reviewboard/ directory.

Also: the virtual machine hosts Review Board on port 80.

What is neat about the rPath/rBuilder virtual appliance is that they offer an EC2 virtual machine with Review Board too…

Hope this saves someone some time…

Update: And, having done all that, the machine doesn’t have PySVN on it, nor developer tools (gcc, easy_install) to bootstrap it myself. Arg.

November 2, 2008

Rails migrations and model validations

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 8:42 pm

So Rails has a feature called “migrations”. The idea here is that your database will change over time and you need a way to change this incrementally as the project changes and new requirements come up. So essentially your database moves through time.

Rails also has a feature where you can define validations for a field at the model level (aka: application level, vs database level, validations): make sure it is defined, is in a certain range, etc. These are declarative statements, meaning that they are set up when the class is defined and run when an instance is validated (for example, when it is saved).

In certain situations, however, there comes an issue with this. For example, if you’re using Cruise Control to do integration tests, you might find that you’re migrating down to the first migration and then migrating all the way back up to current. Which is fine and a good practice, except for one thing: your model validations don’t move through time with your migrations. This is no problem… unless you’re trying to create or modify records using these model objects in your migrations. Then you have a paradox: the model tries to validate a field, but was given no value because that field doesn’t exist yet.

Let’s say in migration 2 we are creating vendors and populating some default ones:

class CreateVendors < ActiveRecord::Migration
  def self.up
    create_table :vendors do |t|
      t.string :name
      t.string :email
    end

    # Populate some default vendors through the model (this runs the model's validations)
    a = Vendor.new(:name => "Fred Flintstone", :email => "fred@example.com")
    a.save!

    b = Vendor.new(:name => "Barney Rubble", :email => "barney@example.com")
    b.save!
  end

  def self.down
    drop_table :vendors
  end
end

But we’ve been developing the app for some time, and in migration 20 we add a rating field to the Vendor table, as well as the following validation to the model:

validates_presence_of :rating, :on => :create, :message => "can't be blank"

Now when we migrate down to 0 and migrate back up, we get an error in migration 2:

$ rake db:migrate
(in .... )
== 2 CreateVendors: migrating =======================================
rake aborted!
Validation failed: Rating can't be blank

(See full trace by running task with --trace)

Oops! We’re validating something that doesn’t exist yet!!

Rails saves the day again, because each validates_* declaration takes an :if option, which should be a Symbol (naming a method) or a Proc. The validation only runs if that method or Proc returns true.

So our problem can be solved by changing the model validation to read:

validates_presence_of :rating, :on => :create, :message => "can't be blank",
:if => Proc.new { |record| record.respond_to? :rating }

respond_to? asks the model object: “Do you have a method named ‘rating’?”. Because Rails magically creates methods for every column the model’s table has in the database, the model will either have that method (because the column has been defined in the database), or not (because no such column exists yet). So we only run the validation if we have the column, avoiding the mess of trying to validate data that doesn’t make sense (yet).
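
If you want to see the guard in isolation, here is a plain-Ruby sketch with no Rails involved. TinyModel and validate_presence are stand-ins I invented for illustration (they are not ActiveRecord APIs); the only real piece is the Proc calling respond_to?, same as in the validation above.

```ruby
# Plain-Ruby stand-in for a model validation guarded by an :if Proc.
# TinyModel and validate_presence are hypothetical names, not Rails APIs.
class TinyModel
  def self.validate_presence(attr, options = {})
    (@checks ||= []) << [attr, options[:if]]
  end

  def self.checks
    @checks || []
  end

  # Collect error messages, skipping any check whose guard returns false.
  def errors
    self.class.checks.each_with_object([]) do |(attr, guard), found|
      next if guard && !guard.call(self)
      found << "#{attr} can't be blank" if send(attr).nil?
    end
  end
end

# The "table" before the rating column exists: no rating method at all.
class OldVendor < TinyModel
  attr_accessor :name
  validate_presence :rating, :if => Proc.new { |record| record.respond_to?(:rating) }
end

# The "table" after migration 20: rating exists, so the check applies.
class NewVendor < TinyModel
  attr_accessor :name, :rating
  validate_presence :rating, :if => Proc.new { |record| record.respond_to?(:rating) }
end

p OldVendor.new.errors
p NewVendor.new.errors
```

The old-schema object sails through because the guard skips the check, while the new-schema object gets the "can't be blank" complaint.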

July 27, 2008

Book Recommendations: Geek Leisure

Filed under: Uncategorized — Ryan Wilcox @ 8:52 pm

The release of a new book from one of my favorite authors gave me incentive to write this list.

Techno-geekery Fiction

Because I read so quickly, for me to spend any length of time with a book requires the book to be pretty massive… as in around a thousand pages. These books fit that bill, in a very technological setting (or diving into very technical topics)

Geek “History”

  • The Cuckoo’s Egg: A hippy-ish astronomer turned sys admin/hacker tracks hackers through all kinds of early 1990s systems to… you guessed it… East Germany. All real.
  • Microserfs: Were you paying attention to the programming technologies coming out of Apple and Microsoft in say 1992-1993? (Ie: Do you remember yack about Pink or OpenDoc?) If so this book will bring back a lot of memories… and is just a good story about work, family, and companionship.

Geek Philosophy

  • Things A Computer Scientist Rarely Talks About: A lecture and question and answer format where one of computer science’s great writers talks about his Christianity, his experiences writing a book where he analyzes verse 3:16 of every book in the Bible, etc. Fascinating.

Enjoy!

July 21, 2008

Book Recommendations: Python

Filed under: Uncategorized — Ryan Wilcox @ 10:35 pm

I’m starting something new on this blog - along with my normal content, I’ll give out book recommendations on topics or languages. I’ve got a pretty big bookshelf, and I’ll pick the best of the best and list them here.

So, now on with my first topic: Python!

  • Python In A Nutshell: A BIG nutshell, a great reference book. The sections on performance profiling and testing are particularly excellent. When one of my alumni went to Hong Kong to do Python work, I gave him my first edition copy of this book.
  • Dive Into Python: This is online for free at Dive Into Python.org. The book is meant to help experienced programmers ramp up into Python by taking examples and ripping them apart to see how they work.
  • Python Cookbook: 15 minutes after cracking this book open I learned so much about what Python can do, Python idioms, etc. Excellent if you’re coming from another language too and just need to learn the lay of the land.

Hope this helps speed people on their Python adventures!

July 9, 2008

Everything you ever wanted to know about debugging backgrounDRb in Rails… before it eats your soul

Filed under: Uncategorized — Ryan Wilcox @ 8:20 pm

Introduction

BackgrounDRb is both a blessing and a curse. First, it allows you to farm off work to other Ruby processes, so that your main process can get back to the work of serving your client (before a proxy timeout). All kinds of things can be done: re-encoding data, doing long queries, importing data. You can query BackgrounDRb to see what the status of an item is: so giving a simple progress indicator.

That’s the promise. The curse is that BackgrounDRb is very hard to debug. Partially due to the fact that it’s like multithreaded programming (technically farmed out to other Ruby processes, presumably via fork)… but multithreaded programming with bad error messages is worse. BackgrounDRb is in this “worse” category.

So here are my tips to avoid having your soul eaten by debugging this beast. Or, at least, to let you put up a good fight.

Zeroth up - Information

Go, read these articles. I’ll be waiting.

Don’t expect the RubyDocs to give you much information… there’s no actual documentation there. It just barely beats browsing the source files in your editor… but you may prefer this.

First Up - Know your versions

Install from trunk if you can. The December 2007 build of BackgrounDRb has an issue where you can’t pass more than let’s say 4K of data to or from a worker. I’m serious. Yes, that means register_status too. Now you might want to consider files as an interprocess communication method anyway… but I think BackgrounDRb is one of those projects where you should Trust Head.

Second Up - Architecture

Ok, so this stuff is hard. You want to be able to use regular old Ruby to test things out before you add BackgrounDRb to the mix. My suggestion: do all your work in a totally separate, isolated, and unit-tested class, and use your worker only to set that up and run it. Seriously: the more you can debug in unit tests, normally, the more hair you’ll have.
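
A sketch of what I mean, with made-up names (CsvImporter is purely illustrative, and the worker method in the trailing comment just follows the my_worker_method / register_status naming used elsewhere in this post):

```ruby
# The real work lives in a plain Ruby class that knows nothing about BackgrounDRb.
# CsvImporter is a hypothetical name used only for illustration.
class CsvImporter
  attr_reader :imported

  def initialize
    @imported = []
  end

  # rows: array of [name, email] pairs; rows without an email are skipped.
  # Returns the running total of imported rows.
  def import(rows)
    rows.each do |name, email|
      next if email.nil? || email.empty?
      @imported << { :name => name, :email => email }
    end
    @imported.size
  end
end

importer = CsvImporter.new
count = importer.import([["Fred", "fred@example.com"], ["Barney", nil], ["Wilma", ""]])
puts count

# The worker method then shrinks to a thin wrapper, roughly:
#   def my_worker_method(data)
#     register_status(CsvImporter.new.import(data))
#   end
```

Everything interesting can now be exercised from a regular unit test, with no backgroundrb process running.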

Third Up - The Pain

There are two ways to approach things at this point: the mock way and the interpreter’s way. I don’t care which you pick.

The Mock Way

Instead of inheriting from BackgrounDRb::MetaWorker, inherit from this sucker….

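require 'logger'

class FakeyWorker
  attr_reader :logger

  def initialize
    # Use the Rails logger when it is available, otherwise log to standard output
    @logger = defined?(RAILS_DEFAULT_LOGGER) ? RAILS_DEFAULT_LOGGER : Logger.new($stdout)
  end

  def register_status(data)
    @logger.debug("FROM REGISTER_STATUS")
    @logger.debug(data)
    @currdata = data # remember the latest status so get_status can return it
  end

  def self.set_worker_name(something)
    # ignore
  end

  def get_status
    return @currdata
  end
end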

Import your worker file into your controller, and call it directly. Call get_status after the call to your worker method, and see what comes back. Since this is (still) happening synchronously, you’ll see exceptions in your browser. Fix errors. Rinse and repeat. Then lose the training wheels and use MiddleMan like you’re supposed to.

The Interpreter Way

If that doesn’t appeal to you (or you’ve done the above but still don’t trust it), do the following:

  1. Fire up one console window with script/backgroundrb running
  2. Fire up another console window, and run script/console
  3. Trigger your worker from here. Here’s some example code:

    >> MiddleMan.new_worker( :worker => :import_data_worker, :job_key => 41258 )
    >> data = {....}
    >> MiddleMan.worker(:import_data_worker, 41258).my_worker_method(data)
    >> MiddleMan.worker(:import_data_worker, 41258).ask_status

If ask_status returns nil, you have a problem. If a traceback shows up in the script/backgroundrb window, you have a problem…. but don’t bother reading the traceback - it’s probably worthless.

If you’re having troubles, break it down: are you 100% sure your worker is not nil? (Answer: No). Save the worker to a variable and check it! Also a worker’s worker_info method is nice.

If you did get nil from ask_status, I’d ask: is my_worker_method even getting called? Do something as obvious as possible: I like writing to a file in my_worker_method. No way to miss that, or have it go to the wrong log.

Fourth Up - BackgrounDRb STILL wants your soul

BackgrounDRb really wants to eat your soul. Really. Here are some suggestions:

  • Wrap your entire worker method in begin… rescue and register_status $!.backtrace (or, write it to a file)
  • Logging MiddleMan.all_worker_info.to_yaml in your controller action is brilliant. Verify you have the worker you’re looking for!
  • I’m not joking about checking to make sure MiddleMan.worker returns you a worker.
  • Is your worker method not getting called for “no reason at all”? Are you passing a lot of data to it and using the December 2007 version?
  • I’m assuming you’re copying off a worker and action that works already?
  • Some of these things could be full of crap. If so, let me know.
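
The begin… rescue suggestion looks something like this. SketchWorker is a stand-in I wrote so the example runs by itself: its register_status just stores what it is given, whereas in a real worker that method comes from BackgrounDRb.

```ruby
# Stand-in for a worker so the sketch runs without BackgrounDRb.
# SketchWorker is a hypothetical name; register_status here only stores its argument.
class SketchWorker
  attr_reader :status

  def register_status(data)
    @status = data
  end

  def my_worker_method(data)
    begin
      total = data.fetch(:rows).size # raises if :rows is missing
      register_status(:done => total)
    rescue StandardError
      # $! holds the exception currently being handled
      register_status(:error => $!.message, :backtrace => $!.backtrace)
    end
  end
end

worker = SketchWorker.new
worker.my_worker_method({})
puts worker.status[:error]
```

Now a failure shows up in the status you are already polling, instead of vanishing into the script/backgroundrb window.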

The End

With any luck this article helped your debugging, and you spent way less time than I did debugging your workers. Yes, BackgrounDRb sucks: the documentation isn’t good, the error messages are horrible, it’s a ton of infrastructure, insanely complicated, and sometimes it Just Doesn’t Want To Work. It’s also a fair tool. For all that, I might not be able to submit patches to help out the project, but I can write this article.

June 21, 2008

Ruby/Rails Performance Links

Filed under: Uncategorized — Ryan Wilcox @ 7:02 am

Update: Yikes, I hit the Post button before I really wanted to! So you’ll get to see more of my revision process than normal. Sorry about that. It also might be incomplete when you read it, as it might take me a few days to finish it.

So I haven’t been posting here - or anywhere - a lot. That’s because I got a full-time contract doing some Ruby On Rails… while also trying to keep 2 other long-term clients happy.

Anyway, enough of that. One of my assignments has been performance optimization on part of the RoR website. A quick search will find you links… or you can read on to find mine.

First off, I was testing just one batch-type action in a specific controller, not anything massive. I bet my links will help you with that too, but some of that advice will be specific to my task, not yours.

First, I separated out the testable parts from the “un-testable” parts. Or rather, the stuff I wanted to test vs the view (which I didn’t want to test). Basically following the performance advice in the Agile Web Development With Rails book: create unit tests to make sure your performance doesn’t go backwards. This is OK advice, and with this model in place I was able to do profiling pretty easily… but there’s a catch.

Advice #1: Don’t trust Benchmark.realtime

I used Benchmark.realtime to test performance. Ok… except I was getting some pretty big differences between runs of a chunk of code - without having touched it! Then I remembered about Python’s timeit module, which says it “avoids common timing traps”, and makes mention of something Tim Peters wrote in the Python Cookbook. The money quote is:

… time.time measures wall-clock time [which I think Ruby’s Time.now does as well]. So, for example, it includes time consumed by the operating system when a burst of network activity demands attention….

So checking Benchmark.realtime once, like the code example in the Rails book does, is naive and probably won’t deliver consistent results. It’ll help if you run the test multiple times and take the mean, for example, or use maybe an even better statistical approach. Want to read more? Read The Ruby Programming Language or a blog post from the authors on profiling (where they suggest this).
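
A minimal sketch of the “run it several times and take the mean” idea. mean_realtime is a helper name I made up; Benchmark.realtime is the real standard-library call, and the squaring loop is just a placeholder workload.

```ruby
require 'benchmark'

# Run the block `runs` times and return [mean, min, max] of wall-clock seconds.
# mean_realtime is a hypothetical helper, not part of the Benchmark module.
def mean_realtime(runs = 5)
  times = Array.new(runs) { Benchmark.realtime { yield } }
  [times.inject(:+) / times.size, times.min, times.max]
end

mean, fastest, slowest = mean_realtime(10) { (1..10_000).map { |i| i * i } }
puts "mean #{mean}s (fastest #{fastest}s, slowest #{slowest}s)"
```

Printing the fastest and slowest runs next to the mean gives you a feel for how noisy the machine is before you trust the number.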

But, there’s a catch! (Database Debris)

At the end of a test, Rails (at least to my understanding) rolls back the records etc you changed during the test. If you’re doing N test runs inside a test function, cruft will build up if you do database manipulation. If you’re consistently creating record id 1 (or any other column that must be unique), your tests will fail after the first time. Which sucks. I don’t know of a way to roll back the transaction in the middle of the test.

See also: Unit Testing: Leave No Trace vs. Schmutz.

And another one (Timing Ranges, Again)

So when you have your mean values, it’s very (very) possible that you can pick a high threshold and say “it should never take longer than this” (which is what the Ruby book does). There are two problems with this approach. First, a simple assert MAX > ACTUAL only says “false is not equal to true” when it fails. Thanks guys. Second, it doesn’t account for variability. Yes, we (should have) gotten rid of some of this via our work above… but what if the computer has a full load while you’re running the entire test? Or if you’re testing on a slower computer than you develop on?

Both these problems can be avoided by using Test::Unit’s assert_in_delta(). Let’s say we want our operation to take 3 seconds, give or take 1 second:


expected_float = 3.0
delta_float = 1.0
assert_in_delta(expected_float, meantime_float, delta_float)

Apparently, assert_in_delta will pass if (expected_float - meantime_float).abs <= delta_float (Thanks WikiBooks!)

If this does fail (say with a meantime_float of 5.0) we get something like:

1) Failure:
test_it(MyTestCase)
[test/my_test_case.rb:156:in `test_it'
/home/ryan/my_test_case.rb:150:in `test_it']:
"3" and
"5" expected to be within
"1" of each other.

Which is way better than “Hey, your test failed! That’s all!”
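
One thing worth seeing in code: assert_in_delta compares the absolute difference, so a run that is too fast fails just like one that is too slow. The helper below is a made-up stand-in that mirrors that check; it is not part of Test::Unit.

```ruby
# Hypothetical helper mirroring the absolute-difference check; not part of Test::Unit.
def within_delta?(expected, actual, delta)
  (expected - actual).abs <= delta
end

puts within_delta?(3.0, 3.8, 1.0) # inside the 1-second window
puts within_delta?(3.0, 5.0, 1.0) # 2 seconds too slow
puts within_delta?(3.0, 1.5, 1.0) # 1.5 seconds too fast also falls outside
```

If you only care about “not slower than”, a plain upper-bound comparison with a custom failure message may be the better fit.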

Use A Profiler

Know What ActiveRecord Is Doing

Read Other Articles
