Sikuli: scripting with screenshots

Now this “play iTunes” script is hardly groundbreaking, as the same effect could be achieved manually in just a few clicks, but already people are using Sikuli to do some far more advanced things than this. For instance, there’s a demo on YouTube of someone using it to export music from Logic Pro into a new movie file, then using QuickTime Player to email the file using the Share function. Another script automates an entire Coda workflow (Coda is a web development tool), uploading new versions of the pages to a website, and there’s even a demo of Sikuli playing a game of Bejeweled.

It’s true that creating the scripts in the first place can be a little tricky, especially while you’re still getting the hang of the language (and since I’m not a Python programmer, getting used to that language’s various – let’s be kind and call them “idiosyncrasies” – took me a while).

But once you do get into the groove it’s remarkably easy to do some really quite sophisticated things. If you find that you’re performing the same repetitive tasks over and over again, this program may well save you huge amounts of time.

For example, the first demo on the Sikuli site shows how to change your machine’s IP address, which isn’t particularly exciting unless you have to do that every day when you move your laptop from home to the office, in which case you’ll find it a massive relief.

While the scripting language is based on Python, it includes many extra features such as moving the mouse to a particular place on the screen and clicking, typing values into dialog boxes, and searching for a visual element based on some other element. For instance, if there are multiple sliders on a page, you could find the one you want by identifying a label to the left of it, then tell Sikuli to look for a slider to the right of that label. Clearly, a lot of thought has gone into this product, and it’s improving with every new release.
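To make the “slider to the right of a label” idea concrete, here is a minimal sketch in plain Python. It isn’t Sikuli’s own API: the Box type, the right_of helper and the screen coordinates are all hypothetical, made up purely for illustration. In a real Sikuli script the label and slider would be screenshots and Sikuli would do the searching on screen; this just models the spatial rule being applied.

```python
from collections import namedtuple

# A screen rectangle: top-left corner plus size, in pixels.
# (Hypothetical type for illustration; not part of Sikuli.)
Box = namedtuple("Box", "name x y w h")


def right_of(label, candidates):
    """Return the nearest candidate that starts to the right of `label`
    and overlaps it vertically, or None if there is no such candidate."""
    label_right = label.x + label.w
    matches = [
        c for c in candidates
        if c.x >= label_right            # starts past the label's right edge
        and c.y < label.y + label.h      # vertical ranges overlap...
        and label.y < c.y + c.h          # ...in both directions
    ]
    return min(matches, key=lambda c: c.x - label_right) if matches else None


# Made-up layout: three sliders, each with a label on its left.
labels = [
    Box("Volume", 20, 100, 80, 20),
    Box("Balance", 20, 140, 80, 20),
    Box("Bass", 20, 180, 80, 20),
]
sliders = [
    Box("slider-1", 120, 98, 200, 24),
    Box("slider-2", 120, 138, 200, 24),
    Box("slider-3", 120, 178, 200, 24),
]

# Pick out the slider that sits next to the "Balance" label.
target = right_of(labels[1], sliders)
print(target.name)
```

The point is that the label acts as an anchor: you never need to know which slider is which in advance, only what sits beside it.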

One criticism I do have is that the documentation could do with being beefed up quite a bit: it currently consists of a few examples, a few How-Tos, and a rather limited command reference that details the Sikuli extensions to the Python language. The command reference isn’t awfully clear in places, and several times I had to resort to web searches to find out how to do various things. However, these web searches pretty well always paid off and found me the information I was looking for, so don’t let this put you off the project as a whole.

I’ve been through plenty of scripting tools in my time, all the way back to QuicKeys in the early days of the Mac, and Project Sikuli is the first one in a very long time that’s really caught my interest.

It’s a totally new way to think about automating some of your workflow, and even non-programmers should be able to pick up its basics quite quickly, given how visual everything is. I’ll admit that I haven’t tried it on a Windows machine yet, but the reports I’ve read say that it works well on that platform too, although the screengrabs throughout the project’s web pages demonstrate clearly that much of the development team is Mac-centric (which I, of course, wholeheartedly applaud!).

And with that, I’m off to work on a script that automatically rips portions of DVDs, a job my company does for a PR client. It should save the person who’s responsible for that job about 30 minutes every time he has to do it.

Disclaimer: Some pages on this site may include an affiliate link. This does not affect our editorial in any way.
