A gif of the terminal running videogrep

I’ve followed the #ds106 daily create for quite a few years now. The other day the invite was to use PlayPhrase.

PlayPhrase will assemble a clip of movie scenes all having the same phrase, a small supercut if you will.

The results are slick and amusing.

I remember creating a few Supercuts using the amazing Videogrep python script. I thought I’d give it another go. I’ve made quite a few notes on using Videogrep before, but I think I’ve smoothed out a few things on this round. I thought I might write up the process DS106 style just for memory & fun¹. The following brief summary assumes you have command line basics.

I decided to just go for people saying ds106 in videos about ds106. I searched for ds106 on YouTube and found quite a few. I needed to download the video and an srt (subtitle) file. Like most videos on YouTube, none of the ds106 videos I chose had uploaded subtitles. But you can download the autogenerated subtitles in vtt format and convert them to srt with yt-dlp. The downloading and subtitle conversion is handled by yt-dlp².

I had installed Videogrep a long time ago, but decided to start with a clean install. I understand very little about python and have run into various problems getting things to work. Recently I discovered that using a virtual environment seems to help. This creates a separate space to avoid problems with different versions of things. I’d be lying if I said I could explain much about what these things are. Fortunately it is easy to set up and use if you are at all comfortable with the command line.

The following assumes you are in the terminal and have moved to the folder you want to use.

Create a virtual environment:

python3 -m venv venv

Turn it on:

source venv/bin/activate

Your prompt now looks something like this:

(venv) Mac-Mini-10:videos john$

You will also have a folder venv full of stuff.

I am happy to ignore this and go on with the ‘knowledge’ that I can’t mess too much up.

Install Videogrep:

pip install videogrep

I am using yt-dlp to get the videos. As usual I am right in the middle when I realise I should have updated it before I started. I’d advise you to do that first.
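If, like me, you installed yt-dlp with Homebrew, updating it is a one liner:

brew upgrade yt-dlp

(A standalone install can update itself with yt-dlp -U instead.)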

You can get a video and generate an srt file from the YouTube auto-generated subtitles:

yt-dlp --sub-lang "en" --write-auto-sub -f 18 --convert-subs srt "https://www.youtube.com/watch?v=tuoOKNJW7EY"

This should download the video and the auto-generated subtitles, and convert them to an srt file!

I edit the video & srt file names to make them easier to see/type.

Then you can run Videogrep:

videogrep --input ds106.mp4 --search "ds106"

This makes a file Supercut.mp4 of all the bits of video with the text ‘ds106’ in the srt file.

I did a little editing of the srt file to find and replace ds-106 with ds106, and ds16 with ds106. I think I could work round that by using a regular expression in videogrep.
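I’ve not actually tested this, but I believe the search term is treated as a regular expression, so something like the line below might catch the mis-transcriptions in one pass (treat it as a guess rather than gospel):

videogrep --input ds106.mp4 --search "ds.?106|ds16"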

After trying the sentence search I realised I wanted a fragment, not a whole sentence; for that you need the vtt file. I can download that with:

yt-dlp --write-auto-sub --sub-lang en --skip-download "https://www.youtube.com/watch?v=tuoOKNJW7EY"

Then I rename the file to ds106.vtt, delete the srt file and run:

videogrep --input ds106.mp4 --search "106" --search-type fragment

I shortened ds106 to 106 as vtt files seem to split the text into ds and 106.

I ended up with 4 nice wee Supercut files. I could have run through the whole lot at once but I did it one at a time.
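If I remember right, videogrep will also take more than one video at a time, so something along these lines should make a single big supercut in one run (untested this time round, so treat it as a sketch):

videogrep --input *.mp4 --search "106" --search-type fragment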

I thought I could join all the videos together with ffmpeg, but ran into bother with dimensions and formats so I just opened up iMovie and dragged the clips in.

At the end, close the virtual environment with:

deactivate

Reactivate it with:

source venv/bin/activate

This is about the simplest use of videogrep; it can do much more interesting and complex things.

  1. I am retired, it is raining & Alan mentioned it might be a good idea. ↩︎
  2. I assume you have installed yt-dlp, GitHub – yt-dlp/yt-dlp: A feature-rich command-line audio/video downloader. As I use a Mac I use homebrew to install this and some other command line tools. This might feel as if things are getting complicated. I think that is because it is. ↩︎

screenshot of pi.johnj.info/gb

One of the things I am interested in as part of my work on Glow Blogs is what people are using Glow Blogs for.

Glow Blogs is made up of 33 different WordPress multi-sites. One for each Local Authority in Scotland and one central one.

The home page of each LA lists the last few posts. Visiting these pages will give you an idea of what is going on. In the past I’ve opened up each L.A. in a tab in my browser and gone through them. I had a script that would open them all up. I’ve now worked out an easy way to give a quick overview.

Recently I noticed shot-scraper, Tools for taking automated screenshots of websites. I’ve used various automatic webpage screenshot tools in the past. These have usually been services that either charge money or have shut down. I used webkit2png a wee bit, but ran into now forgotten problems, perhaps around https?

shot-scraper can be automated and extended. It is a command line tool and using these is always an interesting struggle. I usually just follow any instructions blindly, searching any problems as I go. In this case it didn’t take too long.

Once installed, shot-scraper is pretty easy to use:

shot-scraper https://johnjohnston.info

dumps an image johnjohnston-info.png.

There are a lot of options: you can output jpegs rather than pngs, run some javascript before taking a screenshot, wait for a while, or even choose a section of the page to grab.
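For example, something like this should wait a couple of seconds, grab just one section of the page and save a jpeg (the selector here is invented, so swap in one from the page you are grabbing):

shot-scraper https://johnjohnston.info --wait 2000 -s "#content" --quality 80 -o johnjohnston.jpg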

So I can use shot-scraper to create screenshots of each LA homepage. Then display them on a web page for a quick overview of Glow Blogs.


    #!/bin/bash

    # folder the screenshots are saved into
    cd /Users/john/Documents/scripts/glowscrape/img

    # short codes for each of the 33 Glow Blogs multi-sites
    URLLIST="ab as ac an ce cl dd dg ea ed el er es fa fi gc glowblogs hi in mc my na nl or pk re sa sb sh sl st wd wl"
    for i in $URLLIST ;
    do
        # hide the cookie banner, then grab just the #glow-latest-posts section as a jpeg;
        # the && continue keeps the loop moving on to the next LA (see note below)
        /usr/local/bin/shot-scraper -s "#glow-latest-posts" -j "jQuery('.pea_cook_wrapper').hide()" --quality 80 https://blogs.glowscotland.org.uk/"$i" -o  "$i".jpg && continue
    done;
    

This first hides the cookie banner displayed by blogs and then screenshots the #glow-latest-posts section of the page only.

The script continues by copying the images over to my raspberry pi, where they are shown on a web page.
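That copying step is just an rsync (or scp) at the end of the script; roughly this sort of thing, where the address of the pi and the web folder are made up for the example:

rsync -av /Users/john/Documents/scripts/glowscrape/img/ pi@raspberrypi.local:/var/www/html/gb/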

I hit a couple of problems along the way. The first was that the script stopped running when it could not find the #glow-latest-posts section. This happens on a couple of LAs which have no public blogs. Adding && continue to the screenshot command fixed that.

The second problem came when I wanted to run the script regularly. OSX schedules tasks with launchd. I’ve used Lingon X to schedule a few of these. Since I recently updated my system I first needed to get a new version of Lingon X. I then found that increased security gave me a few hoops to jump through to get the script to run.

I think it would have been simpler to do the whole job on a raspberry pi. But I was not sure if it would run shot-scraper. I’ll leave that for another day and a newer pi.

This is a pretty trivial use of a very powerful tool. I’ve now got a webpage that gives me a quick overview of what is going on in Glow Blogs and took another baby step in bash.

The first thing that surprised me was the lack of featured images on the blog posts. These not only make the LA home pages look nicer, they also make blog posts displayed on twitter more attractive.

Since 2014 I’ve been making “movies” with my flickr photos for the year. I make them with a script which downloads the year’s photos, puts them together into a movie and used to add music. The music bit is broken (https) so I downloaded some manually.

This year pretty much stopped in October, then I got covid in November and have not been out much since.

I also average the photos (below) and montage them for the featured image. This year I made a version of the script to download wee square images for the montage (average & montage scripts here).
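The linked scripts are the real thing, but the heart of both jobs is a couple of ImageMagick commands along these lines (tile size and filenames just for illustration):

    # average every jpeg in the folder into a single blended image
    convert *.jpg -evaluate-sequence mean average.jpg
    # tile the wee square images into one big montage, 20 to a row
    montage *.jpg -tile 20x -geometry +0+0 montage.jpg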

I enjoy both the process and watching my photos flickr by. I like the fact that I can easily tweak bits of the script or run the video creation again quickly to try out different speeds, music etc.

Replied to Command Line — The MagPi magazine by Aaron Davis (collect.readwriterespond.com)
MagPi / RaspberryPi put together a guide to getting going with command line.

Hi Aaron,
This is a useful guide. I remember  Oliver Quinlan, a guest on Radio EDUtalk talking about the eloquence of the command line compared to pointing and grunting.
I enjoy using the command line, often with Raspberry PIs, but it is easy to miss some of the basics which this guide covers well.

After seeing @adders on micro.blog posting some timelapse I thought I might have another go. My first thought was to just use the feature built into my phone. I then thought to repurpose a raspberry pi. This led to the discovery that two of my Pis were at school, leaving only one at home with a camera. This pi zero had done sterling service taking over 1 million pictures of the sky and stitching them into 122918 gifs and posting them to tumblr. I decommissioned that when tumblr started mistaking these for unsuitable photos.

My first idea was to just write a simple bash script that would take a pic and copy it to my mac. I’ve done that before; I’d just need to timestamp the image names (there is a rough sketch of that below). Then I found RPi-Cam-Web-Interface. This is really cool. It turns your pi into a camera and a webserver where you can control the camera and download the photos.
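For the record, the DIY version I had in mind is only a couple of lines, something like this sketch (the folder and hostnames are invented):

    #!/bin/bash
    # take a photo named with the current date & time
    raspistill -o "/home/pi/timelapse/$(date +%Y-%m-%d_%H%M%S).jpg"
    # then copy the folder over to the mac
    rsync -av /home/pi/timelapse/ john@mac-mini.local:~/timelapse/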

I am fairly used to setting up a headless pi and getting it on my WiFi now. So the next step was just to follow all the instructions from the RPi-Cam-Web-Interface page. The usual fairly incomprehensible stuff in the terminal ensued. All worked fine though.

I then downloaded the folder full of images onto my mac and stitched them together with ffmpeg.

ffmpeg is a really complex beast, I think this worked ok:

make a list of the files with

for f in *.jpg; do echo "file '$f'" >> mylist.txt; done

then stitch them together:

ffmpeg -r 10 -f concat -i mylist.txt -c:v libx264 -pix_fmt yuv420p out.mp4

I messed about quite a bit; resizing the images before starting made for a smaller movie, and finally I ran

ffmpeg -i out.mp4 -vf scale=720:-2 outscaled.mp4

to make an even smaller version.

I am now on the look out for more interesting weather or a good sunset.

graph of number twitter clients used by schools

I’ve talked to a fair number of teachers who find it easier to use twitter than to blog to share their classroom learning. I’ve been thinking a little of how to make that easier but got side tracked wondering how schools, teachers and classes use twitter.

If you use twitter on the web it tells you the application used to post the tweet. At the bottom of a tweet there is the date and the app that posted the tweet.

I’ve got a list made up of North Lanarkshire schools that I started when I was supporting ICT in the authority.

I could go down the list and count the methods but I thought there might be a better way. I recalled having played with the twitter api a wee bit so searched for and found: GET lists/statuses — Twitter Developers. I was hoping there was some sort of console to use, but could not find one. A wee bit more searching found how to authenticate to the api using a token and how to generate that token (Using bearer tokens).
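If I’m remembering the docs right, generating the bearer token is itself just one curl call to twitter’s oauth2 endpoint using the app’s key and secret (the key and secret below are obviously placeholders):

curl -u "API_KEY:API_SECRET" --data 'grant_type=client_credentials' 'https://api.twitter.com/oauth2/token'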

It then didn’t take too long to work out how to pull in a pile of status updates from the list using the terminal:

curl --location --request GET 'https://api.twitter.com/1.1/lists/statuses.json?list_id=229235515&count=200&max_id=1225829860699930600' --header 'Authorization: Bearer BearerTokenGoesHere'

This gave me a pile of tweets in json format. I had a vague recollection that google sheets could parse json so gave that a go. I had to upload the json somewhere I could import it into a sheet. This felt somewhat clunky. I did see some indications that I could use a script to grab the json in sheets, but thought it might be simpler to do it all on my mac. More searching followed, but I fairly quickly came up with this:

curl --location --request GET 'https://api.twitter.com/1.1/lists/statuses.json?list_id=229235515&count=200&' --header 'Authorization: Bearer BearerTokenGoesHere' | jq '.[].source' | sed -e 's/<[^>]*>//g' | sort -bnr | uniq -c | sort -bnr

This does the following:

  1. downloads the statuses in json format
  2. passes them to the jq application (which I had installed in the past), which pulls out a list of the sources
  3. passes that to sed, which strips the html tags leaving the text (I just searched for this, I have no idea how it works)
  4. next the list is sorted
  5. then uniq pulls out the unique entries and counts them
  6. finally the counts are sorted, which gave:
119 "Twitter for iPhone"
  28 "Twitter for Android"
  22 "Twitter Web App"
   8 "Twitter for iPad"
   1 "Twitter Web Client"

This surprised me. I use my school iPad to post to twitter and sort of expected iPads to be highest or at least higher.

It may be that the results are skewed by the Monday and Tuesday holiday and 2 inservice days, so I’ll run this a few times next week and see. You can also use a max_id parameter so I could gather more than 200 (less retweeted content) tweets.

This does give me the idea that it might be worth explaining how to make posting to Glow Blogs simpler using a phone.

Update, Friday, back to school and NLC looks like:

 74 "Twitter for iPhone"
  51 "Twitter for iPad"
  18 "Twitter for Android"
  10 "Twitter Web App"
   1 "dlvr.it"

I liked the Pummelvision service, so when it went I sort of made my own. Which led to this: Flickr 2014 and DIY pummelvision and 2016 Flickring by.

I went a little early this year:

I’ve updated the script (gist) to handle a couple of new problems.

  1. Some of my iphone photos were upside down in the video as ffmpeg doesn’t see the rotation EXIF. I installed jhead via homebrew to deal with this.
  2. I installed sox to duplicate the background track, as I took more photos and slowed them down a bit this year (rough commands for both fixes are sketched below).
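The sketch, with invented filenames rather than the ones in the gist, boils down to something like:

    # rotate jpegs to match their EXIF orientation so ffmpeg gets them the right way up
    jhead -autorot *.jpg
    # play the backing track twice in a row to make a longer one (assumes sox was built with mp3 support)
    sox track.mp3 track.mp3 track-doubled.mp3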

I have great fun with this every time I try it. I quite like the results but the tinkering with the script is the fun bit. I’m sure it could be made a lot more elegant but it works for me.