Messing around with AppleScript, MapQuest API & ImageMagick
I got a request from a teacher who wanted to download a year’s worth of images from a Glow Blog (for an end-of-year slideshow).
Although there are plugins that can do this, they are not available on Glow Blogs. I was stumped, apart from going through the site and downloading them one by one. But after a wee bit of thinking I thought I’d try using the REST API via AppleScript.
The REST API will list in JSON format the media:
Look at that in Firefox for a pretty view.
JSON Helper is
an agent (or scriptable background application) which allows you to do useful things with JSON directly from AppleScript.
So I can grab the list of media from a site in JSON format and use AppleScript to download all the files.
The script I wrote is not great: you can’t download from a particular year, but a quick look at the JSON will help in working out how many files to download.
I am sure there are more efficient ways to do this, and I’ve only tested on a couple of sites, but it seems to do the trick and might be useful again sometime.
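The same idea can be sketched in the shell. This is not the AppleScript described above, just a rough illustration: it assumes the standard WordPress REST route /wp-json/wp/v2/media and its source_url field, and the function name media_urls and the SITE placeholder are made up for the example.

```shell
#!/bin/sh
# Read WordPress media JSON on stdin and print each distinct source_url.
# This is crude text matching rather than a real JSON parser: it expects
# compact (not pretty-printed) JSON, it also picks up the resized versions
# listed under media_details, and it assumes URLs contain no escaped quotes.
media_urls() {
    grep -o '"source_url":"[^"]*"' |
        sed -e 's/^"source_url":"//' -e 's/"$//' -e 's#\\/#/#g' |
        sort -u
}

# Hypothetical usage (SITE is a placeholder, not a real blog):
#   SITE="https://example.org/myblog"
#   curl -s "$SITE/wp-json/wp/v2/media?per_page=100&page=1" |
#       media_urls | while read -r url; do curl -sO "$url"; done
```

The per_page and page parameters fetch the list one page at a time, so a quick look at the JSON still decides how many pages to ask for.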
Figured out how to automate posting of old AudioBoos with AppleScript, test here more to come then #Edutalk.
AudioBoom is closing its free tier:
If you take no action, then after 2nd October 2017, you will no longer be able to upload new content and your account will become private. We will continue to enable distribution of your existing content for a period of a month so all your RSS feeds and web embeds will continue to work for that period. If you choose to move to another podcast provider, let us know by emailing us at firstname.lastname@example.org and we will redirect your RSS feeds for you. We’ll need at least 5 working days to comply with your request. After 36 months from 30th August 2017, your account will be deleted (including your old podcasts and your RSS feeds, so we recommend that you arrange for redirection of your RSS feeds, download your old podcasts and back them up elsewhere, before that period expires.
from: Subscription Changes
Which is depressing news for me and for Edutalk. I have 50-odd boos which range over field recordings, audio recorded for Edutalk and some microcast type posts. Edutalk has had several hundred contributions from many different people over the years.
The situation at Edutalk is more worrying. I could pay $9.99 a month to keep my own account alive. But Edutalk has had contributions from many different people, we could not expect them to pay up for the privilege of having their content syndicated onto Edutalk.
AudioBoom did not provide any export that would help with importing into WordPress (or anything else). This differs from the Posterous closedown, which did give a WordPress export option.
We do have a while to sort this out. There is a month until the accounts become private.
AudioBoom does have an API, and we used it before.
I am not intending to rush, so this is the plan.
- Download the information about the posts using the API.
- Download all the mp3s by parsing the JSON the API provides.
- Delete all the posts on Edutalk that have been syndicated from AudioBoom.
- Upload all the mp3s.
- Create posts that embed all these mp3s with the matching titles, descriptions etc.
Today I managed to download the JSON files and the mp3s. I used AppleScript as I find it easier to get stuff done with that than pure shell scripting.
Thank goodness for JSON Helper for AppleScript, which worked a treat.
I’ve put the script here:
in case anyone is interested.
I had to run it 10 times. I guess I could have just made a loop, but as I ended up downloading 890 mp3s for a total of 2.6 GB, batches of 100 files at a time seemed like a good idea.
I am a wee bit worried that there are 2186 posts syndicated from AudioBoo on the Edutalk site, but there does seem to be a lot of duplication, presumably caused by FeedWordPress.
I’ve now got all of the data and the mp3 files I can get.
I know how to post to WordPress from AppleScript, but I’ve discovered a couple of hurdles. I don’t seem to be able to add an enclosure with AppleScript and I can’t see how to add multiple tags to a post.
The first is probably not a problem. These posts are all so old that they will not feature in our RSS feed. I would like to include all of the tags. I may end up creating a WordPress export file or try one of the csv import plugins. There is now not such a rush. I can test these approaches on this blog with my own boos.
I guess the main lesson to be learnt here is about the temporary nature of the free layer of the web. The AudioBoo app and service were wonderful in their day but reliance on free services costs.
The featured image is a gif, captured with LICEcap, of an mp3 download.
I’ve been beta testing micro.blog. There is a new page here for status type posts, these get sent to micro.blog/johnjohnston and to twitter.
This has renewed my interest in finding different ways to post to the blog especially for short posts that would have previously gone straight to twitter.
Some thoughts about making choices about the software and systems you use, they may have hidden positives or negatives.
- Ian Guest (@IaninSheffield)
- Aaron Davis (@mrkrndvs)
- My Secret Art of Blogging – Read Write Respond
- Banning Ads Is Nice, but the Problem Is Facebook’s Underlying Model | Hapgood
- Sal Soghoian
Featured image, iPhone screenshot, edited in snapseed
The other day a colleague and I were trying to remember how to get the icon art for iOS apps to help write notes. We thought we remembered a way to get them out by examining the package. Later I was reading the ADE list, where there was a bit of bemoaning that you can no longer copy the art from iTunes. Someone mentioned that the art is now in a file, iTunesArtwork, inside the .ipa files in the iTunes folder, the .ipa files being zip files.
This means you can get the artwork by changing the extension on an iOS app file to .zip, expanding the archive, and adding a .png extension to the iTunesArtwork file. You end up with the artwork png file.
This seems like a fairly long road for a short cut. A wee bit of thought led me to try a few shell scripts. Basically you can use the unzip command to extract the iTunesArtwork file with a png extension and you get a png file of the artwork.
To make this a little easier I wrapped up the shell script in an AppleScript. Drag a bunch of .ipa files onto the droplet and it will create a folder on your desktop and extract the artwork as png files. Double click the droplet and it will prompt you for a file and do the same. The files are named the same as the .ipa files, except that I replace all non-alphanumeric characters with an underscore. I’ve put the script in my dropbox in case anyone would find it useful, and uploaded the text so you can View the Script.
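The shell part of this can be sketched as below. It is a minimal illustration rather than the script inside the droplet: the function names safe_name and extract_art are invented here, and it assumes the archive member is found under the name iTunesArtwork as described above.

```shell
#!/bin/sh
# Turn an .ipa path into a safe file name: drop the directory and the .ipa
# suffix, then replace every non-alphanumeric character with an underscore.
safe_name() {
    base=$(basename "$1" .ipa)
    printf '%s' "$base" | LC_ALL=C tr -c 'A-Za-z0-9' '_'
}

# Extract the artwork from one .ipa into an output folder as a png.
# unzip -p writes the named archive member to standard output.
extract_art() {
    ipa=$1
    out_dir=$2
    mkdir -p "$out_dir"
    unzip -p "$ipa" iTunesArtwork > "$out_dir/$(safe_name "$ipa").png"
}

# Hypothetical usage:
#   for f in *.ipa; do extract_art "$f" "$HOME/Desktop/icons"; done
```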
BTW: Rounded Corners
So the artwork extracted does not have the rounded corners:
You can change the way that looks on the web with a bit of css:
style="-moz-border-radius: 20%;-webkit-border-radius: 20%;border-radius: 20%;"
This might help other folk documenting iOS stuff. I’ve now got a folder of >600 icons ready to go.
I like listening to podcasts. I usually listen to them while driving. I use Instacast to play podcasts on my iPhone. Instacast allows you to subscribe to podcast feeds; it downloads episodes while you are on WiFi for playback later. I subscribe to a few educational podcasts, some mac ones, the Scottish Poetry Library and Machine of Death. I change these about occasionally.
Sometimes though I want to listen to individual podcast episodes without subscribing to the whole feed. Recently I’ve been doing this by downloading the podcast media to dropbox, making the files favourites on my phone while on WiFi (which downloads them onto the phone) and listening later. To speed this up a bit, and to allow me to do this from my phone or an iPad, I have a folder in my dropbox with an AppleScript Folder Action attached to it. I add a text file with the url to a media file to this folder (typically with droptext) and it is automatically downloaded to my desktop in a dropbox folder. I can then favourite etc as normal.
This still leaves a bit to be desired: I need to remember to favourite the files while on WiFi so that they are ready to play in the car.
Huffduffer looks like it is made to solve this problem. It is a service that allows you to create a podcast feed from episodes of different podcasts or just mp3 files found on the web. You use a bookmarklet which finds any mp3 files on the current webpage and adds them to your podcast.
Earlier this week I saw a link to huffduffer and created an account: Johnjohnston on Huffduffer. The only problem is I created the account on my phone and left it a few days to install the bookmarklet on my desktop. By then I had forgotten the password!
So today I decided to try a wee bit of DIY with AppleScript. I’ve already got a few dropbox folders set up with Folder Actions to do some automation, so had a rough idea of how to go about this.
What I want to do is, on iOS, copy the url to a webpage, switch to droptext, make a new text file containing the url and save it into the folder. The Folder Action script then parses the webpage for mp3 and m4a files and adds them to an RSS file. I’ve subscribed to this file in Instacast so don’t need to think about it much, other than opening Instacast when on WiFi and letting it download episodes.
Google helped with a couple of tricky parts. Getting the address of mp3 files out of the web page: how to extract an mp3’s url from m3u…: Apple Support Communities. Getting the correct style of date so that the RSS feed validates: RFC 822 Dates with AppleScript | Joe Maller.
The script basically adds the mp3 urls to a text file along with the date they are added. This text file is parsed to produce an RSS feed. The script certainly lacks any polish, but it works. Here is the RSS feed in my dropbox. And here is what it looks like in
As you can see, the feed is quite minimal; the names come from the mp3 file name. The script (I’ve uploaded it here) needs lots of work. I briefly tried to get the titles from the title of the webpage, but ran into some odd characters which threw things off. I’ve also hard coded file paths into the script and it would be better not to. Most of the script, dealing with detecting the files added, is a lift from the examples that Apple ship. My bit just processes the url. I’ve also adapted this to run from a mac, grabbing the front url from Safari; this script is in my FastScripts folder so I can run it with a keyboard shortcut.
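The three moving parts (finding media links, an RFC 822 date, a feed item named after the file) can be sketched in the shell. This is an illustration only, not the Folder Action script: the function names are invented, the link matching is a simple pattern that only sees absolute http(s) links, and nothing is XML-escaped.

```shell
#!/bin/sh
# Print every absolute mp3/m4a URL found in HTML read from stdin.
media_links() {
    grep -Eo "https?://[^\"'<> ]+\.(mp3|m4a)"
}

# Current time as an RFC 822 style date in UTC, the format RSS pubDate uses.
rfc822_now() {
    LC_ALL=C date -u '+%a, %d %b %Y %H:%M:%S +0000'
}

# Build one RSS <item>; the title is simply the file name from the URL.
# length="0" is a placeholder and the type is hard-coded for mp3, so an
# m4a file would need a different type.
rss_item() {
    url=$1
    pub_date=$2
    title=$(basename "$url")
    printf '<item>\n'
    printf '  <title>%s</title>\n' "$title"
    printf '  <enclosure url="%s" length="0" type="audio/mpeg"/>\n' "$url"
    printf '  <guid>%s</guid>\n' "$url"
    printf '  <pubDate>%s</pubDate>\n' "$pub_date"
    printf '</item>\n'
}

# Hypothetical usage:
#   curl -s "$page_url" | media_links |
#       while read -r u; do rss_item "$u" "$(rfc822_now)"; done
```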
Not sure if anyone is interested in this stuff here, but it fascinates me and posting it is one way of keeping track.
This is going to be another slightly geeky post. The previous one, Testing a new system, was about a way to blog using dropbox and AppleScript folder actions, and it had me thinking about other things that could be done using this sort of system. The way I am doing this relies on having dropbox and a mac that is on when you want it. If you don’t have a mac you might like Wappwolf, which is a web service that can do a lot of things with files in your dropbox automatically.
So I already have a system for blogging by dropping files into a folder on my dropbox and was looking around for another idea to play with. There seem to be a few OCR apps for iPhones, but I had noticed that Tesseract was available on Google Code and googled around to see how it could be installed and run on a mac. One I found was TesseractOCR Mac, a Cocoa front end to the Tesseract OCR program. I downloaded this and gave it a try. It worked well on my desktop. I then struck gold: Installing and using Tesseract 2.04 on Mac OS X 10.6.6 with Homebrew | Ramble On. This post explains clearly how to install Tesseract on a mac so that it can be used on the command line. It is also a good intro to homebrew.
Homebrew is the easiest and most flexible way to install the UNIX tools Apple didn’t include with OS X.
For someone who has struggled with this sort of thing before, homebrew is pretty straightforward. Installing homebrew is just a case of copying a line of code from the installation page, pasting it into the terminal and pressing return.
Following the instructions from Ramble On I just typed brew install imagemagick in the terminal and hit return. Lots of scary text scrolls by:
Once imagemagick was installed I repeated the process for Tesseract.
As I was wanting to figure out how to use my phone for OCR I took a photo of a bit of newspaper. I used Camera+ with the clarity filter, cropped it and made the image black and white:
I used Wifi Photo Transfer to grab the photo from my camera and put it on the desktop.
The OCR process is in two steps using the terminal and the newly installed applications:
- Convert to 200dpi tiff:
convert -density 200 -units PixelsPerInch -type Grayscale +compress fr_160.jpg fr_160.tif
- Perform OCR on the tif
tesseract fr_160.tif fr_160 -l eng
I now have two extra files on my desktop, fr_160.tif and fr_160.txt, the txt file contains the OCR text:
(_;oogle is facing fresh criticism after admitting that it has not deleted all of the private data, including emails and pass- words, it secretly collected from internet users around the UK as it gathered data for its Street View maps. The search ﬁrm was ordered in Decem- ber 2010 to delete the private information hoovered up by its Street View cars from open Wi-Fi networks. r But yesterday Google told the Infor- mation Commissioner’s Ofﬁce “human error” had prevented it from erasing the data, which could include the millions of emails and passwords . Google admitted in May 2010 its Street View cars had “mistakenly” collected pri-
Which is pretty good.
OCR for dropbox
I now can see that tesseract works well and needed to make it work on images added to a particular dropbox folder.
There are a few folder action scripts that come with a mac; they are in /Library/Scripts/Folder Action Scripts/. Several of these deal with image files and contain routines for handling the dropping of files. These ‘standard’ routines move added files of the correct file type to a subfolder and then pass them on to a sub-routine that deals with the files. I could just duplicate one of these and edit the process_item sub-routine. Basically I just scripted the process tested above. I’ve uploaded the script ocr folder action as html, in case anyone will find it useful or fun.
To use the script you put it in the Folder Action Scripts folder (copy the text of the html file and paste it into the AppleScript script editor). Add a folder to dropbox and attach the script to it (right click on the folder and choose Folder Actions Setup…).
Most of my bit of the script just uses do shell script to run the scripts above, the only gotcha was that although I can use convert in the terminal, in a script I have to use the full path to the script:
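-- quoted form of keeps file names with spaces or apostrophes safe in the shell
set ocrscript to "/usr/local/Cellar/tesseract/3.01/bin/tesseract " & quoted form of tif_file & " " & quoted form of tif_file & " -l eng"
do shell script ocrscript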
This is to do with the way homebrew installs applications and the fact that AppleScript’s do shell script does not look for commands in /usr/local/….
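One way round hard coding the Cellar path is to look the command up with a wider search path. The sketch below is an assumption-laden illustration: find_tool is an invented name, and it assumes homebrew has linked its tools into /usr/local/bin, which is not part of the short default PATH that do shell script gives to /bin/sh.

```shell
#!/bin/sh
# Print the full path of a tool, searching the current PATH plus
# /usr/local/bin. The subshell keeps the PATH change from leaking out.
find_tool() {
    ( PATH="$PATH:/usr/local/bin"; command -v "$1" )
}

# Hypothetical usage:
#   tesseract_bin=$(find_tool tesseract) || echo "tesseract not found" >&2
```

The path this prints can then be used in place of a version-specific one, so the script does not break when homebrew upgrades the tool.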
My script is fairly crude, especially about file endings: if I add Photo 28-07-2012 12 35 55.jpg to the dropbox folder, it is moved into the processed files folder and Photo 28-07-2012 12 35 55.jpg.tif and Photo 28-07-2012 12 35 55.jpg.tif.txt are created. Not elegant.
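A tidier way to name the files is to strip the extension before building the new names. The following is a sketch of that idea, not a change to the uploaded script: strip_ext and ocr_image are invented names, it assumes convert and tesseract are on the PATH, and it passes the language with tesseract’s -l option.

```shell
#!/bin/sh
# Remove the last extension: "Photo.jpg" becomes "Photo".
# A name with no dot at all is returned unchanged.
strip_ext() {
    printf '%s\n' "${1%.*}"
}

# Convert an image to a 200dpi greyscale tiff, then OCR it, so that
# Photo.jpg produces Photo.tif and Photo.txt.
ocr_image() {
    src=$1
    stem=$(strip_ext "$src")
    convert -density 200 -units PixelsPerInch -type Grayscale +compress "$src" "$stem.tif"
    tesseract "$stem.tif" "$stem" -l eng
}

# Hypothetical usage:
#   ocr_image "Photo 28-07-2012 12 35 55.jpg"
```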
The whole process from taking a photo to opening the txt file in dropbox only takes a couple of minutes when using 3G. The system will not deal with columns or more than a single block of text but it does that fairly well. Mostly it was fun to figure out how to do.
This is a quick test of an alternative way to post to my blog.
I tend to blog from my MacBook. I’ve been testing various iOS systems for blogging on the go. I’ve also read a lot about blogging systems that use Dropbox files to produce a post. In the past I’ve experimented with posting to a blog with AppleScript and thought I could knit something simple together. This consists of several parts:
- a folder on my Dropbox called BlogThis
- a Folder Action AppleScript on this folder on my always on work mac.
- The MetaweblogAPI enabled on this blog and supported by AppleScript.
When a file arrives on Dropbox and syncs to my work mac the Folder Action AppleScript posts it to my blog. It uses the first line of the file as a title. If the file is HTML it posts that, if it is markdown it converts it to HTML first.
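The splitting step is simple enough to sketch in the shell. This is only an illustration of the idea, not the Folder Action itself: the function names are invented, and telling markdown from HTML by a .md or .markdown file extension is an assumption, since the post does not say how the script decides.

```shell
#!/bin/sh
# The first line of the file is the post title.
post_title() {
    head -n 1 "$1"
}

# Everything after the first line is the post body.
post_body() {
    tail -n +2 "$1"
}

# Assumption: markdown files are recognised by extension.
is_markdown() {
    case "$1" in
        *.md|*.markdown) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical usage:
#   if is_markdown "$f"; then echo "convert $(post_title "$f") to HTML first"; fi
```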
This post was created with [NOCs](http://www.wisd.com/) on my iPhone. Once I’ve finished a local, iPhone, draft, NOCs allows me to move it to any Dropbox folder.
About 6 years ago I was testing posting images via the MetaWeblogAPI and it should be easy enough to use a Dropbox folder for that, or to script an FTP upload. This would mean I could add an image from my phone to my Dropbox. This would upload to this site and could be incorporated into a post. Until then I could use Flickr.
I am not sure if anyone is very interested in this sort of thing. If they are I’ll be able to post more details from a desktop. This is about as long a post as I’d like to write on a phone.
Not quite perfect yet, I had to edit the img tag here. More fun to be had.
Update: it was like magic watching my home mac when posting this from my phone. Growl told me that a file had been added to my dropbox and almost immediately that a file had been moved (by the work mac).