Alexey's blog

twitter_images

4 minute read Published: 2016-02-10

Recently I've been refactoring and improving one of my older projects. It's a Ruby gem that downloads images from Twitter based on the search terms you provide.

Here's a relevant XKCD:

XKCD 512

Initially it was a somewhat simpler script, inspired by the material provided in the Bastards Book of Ruby.

As of now, it is a much better-structured app that implements PIN-based Twitter authorization (no more registering your own app for credentials), CLI arguments (no more gets.chomp'ing every single parameter), relative directory paths (no more typing out /home/username) and a nice-looking progress bar, courtesy of ruby-progressbar.

Granted, there are still more things to add and improve. The test coverage could use a little tune-up, some methods should be refactored, and perhaps some additional features could be added as well.

In this blog post I'll mention some of the difficulties I've dealt with and some of the challenges I've yet to conquer.

Initial authorization

Currently, the additional flags that you can use are "-v" for finding out the gem's version, "-h" for showing you how to use the gem and "-a" for authorizing the app with Twitter.

The last one has to be run at least once before you can use the app. It stores your credentials for later use in your $HOME/.twitter_imagesrc file. To re-authorize the app, run the command with this flag again.

I am not too comfortable with this. Taking notes from rainbowstream, I'd like the app to authorize you automatically when you're running it for the first time, unless there is a previous configuration in place. It would also be nice if, after the initial authorization, the app would run normally and download those images you've requested.
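A minimal sketch of that first-run behaviour, assuming the configuration lives in $HOME/.twitter_imagesrc as described above. The names authorized? and run, and the two callables passed in, are hypothetical and not part of the gem:

```ruby
# Hypothetical sketch: authorize automatically on the first run.
# Only the config file location comes from the post; every name
# below is an assumption made for illustration.
CONFIG_PATH = File.join(Dir.home, ".twitter_imagesrc")

# Treat a missing or empty config file as "not authorized yet".
def authorized?(config_path = CONFIG_PATH)
  File.exist?(config_path) && !File.zero?(config_path)
end

# `authorize` and `download` are callables (lambdas, method objects)
# standing in for the real authorization and download steps.
def run(config_path: CONFIG_PATH, authorize:, download:)
  authorize.call unless authorized?(config_path)
  download.call
end
```

With something like this, the "-a" flag would only be needed to force a re-authorization, and a first-time user would go straight from the PIN prompt to the download.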

CLI flags

Another thing is that I am not entirely certain about the flags I've mentioned. Do they serve any significant purpose? Can they be improved somehow? Perhaps some new flags might be added, like a "-d" one, right before you specify the target directory to save the images in?

I am not entirely sure yet. Reconfiguring this would involve changing how the ARGV parameters are parsed, which might be a good thing.
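One way this could look is with OptionParser from Ruby's standard library. This is only a sketch: the "-d" flag is the idea floated above rather than something the gem has, and parse_args and the option names are assumptions:

```ruby
require "optparse"

# Hypothetical sketch of flag parsing; not the gem's actual code.
# Returns the parsed options plus whatever positional arguments remain.
def parse_args(argv)
  options = { directory: Dir.pwd, authorize: false }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: twitter_images [options] ARGS"

    opts.on("-a", "--authorize", "Authorize the app with Twitter") do
      options[:authorize] = true
    end

    # Relative paths are expanded against the current directory.
    opts.on("-d", "--directory DIR", "Directory to save the images in") do |dir|
      options[:directory] = File.expand_path(dir)
    end
  end

  # parse (without the bang) leaves argv untouched and returns the
  # non-option arguments.
  remaining = parser.parse(argv)
  [options, remaining]
end
```

Calling parse_args(ARGV) would then replace the manual ARGV handling, and the remaining array would carry the search terms and any other positional arguments.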

Getting the right amount of images

Beyond that, I have a minor annoyance with the way I get the right number of images downloaded.

You see, when you make a "GET search/tweets" request, you get back at most 100 tweets per request. But not all of those tweets will have an image attached. Googling around yields mentions of adding "filter:images" to your queries along with "include_entities=true". This didn't work for me: it seems that the filtering no longer works, and the entities are included by default, according to this page.

So what I am doing now is breaking out of the downloading loop once I have more links than the required amount and then running this:

def trim_links(amount)
  @links = @links.slice(0...amount)
end

where @links is an Array of assembled links. I do this before passing this object to the Downloader class that saves the images.

It doesn't seem like a very elegant solution, but I am not sure what else to do here.
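One possible alternative is a lazy enumerator, which stops asking for more pages as soon as enough links have been gathered, so no trimming is needed afterwards. This is a sketch under assumptions: pages stands in for successive search responses, each tweet is a plain hash with an optional :image_url key, and collect_links is a hypothetical name. None of that mirrors the gem's real data structures:

```ruby
# Hypothetical sketch: gather exactly `amount` image links.
# `pages` is any enumerable of pages, each page an array of tweets;
# a tweet here is a hash that may or may not have an :image_url.
def collect_links(pages, amount)
  pages.lazy
       .flat_map { |tweets| tweets.map { |tweet| tweet[:image_url] }.compact }
       .first(amount)
end
```

If pages is an Enumerator that performs one request per page, evaluation stops as soon as the required number of links has been reached, and the result is already the right size.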

Overall code quality

I believe the current project structure is decent. I also believe things can always be improved. Refactoring and improving the tests is something I am planning on doing soon.

The structure of the classes seems reasonable and in line with the OOP principles.

The tests should be tweaked to test the results and not the implementation of the methods, since those change often. I should also mention that there are plenty of tests covering private methods.

Normally, the private methods should be left alone and the public methods should be the ones tested. But, in my case, there is a lot going on in there, so I wanted to have some additional safety net that would tell me what's broken in my private methods when I introduce a new feature.

The POODR book says:

"The rules-of-thumb for testing private methods are thus: Never write them, and if you do, never ever test them, unless of course it makes sense to do so. Therefore, be biased against writing these tests but do not fear to do so if this would improve your lot."

This is what I am going to strive towards in the future: simplify the interfaces, reduce the number of private methods, remove the unneeded tests. But for now, I do feel that I need those tests. Until "the fog clears and a design reveals itself", I will keep them.

Improving the overall test coverage is on the horizon as well.

In conclusion

I've had lots of fun refactoring this project and playing with it. Adding new features and figuring out how the Twitter API works was quite enjoyable. And even though I've had to pause my exploration of Haskell, I still feel pretty happy about it.

I encourage you to check it out over at GitHub and give it a go.