Sam's news

Here are some of the news sources I follow.

My main website is at

W3C to work with MDN on Web Platform documentation

Published 18 Oct 2017 by Dominique Hazaël-Massieux in W3C Blog.

I am pleased to announce that W3C has joined in collaboration with Mozilla, Google, and Samsung to support MDN Web Docs. MDN documents cross-browser Web standards to allow Web developers to learn and to share information about building the open Web.

MDN is a web development documentation portal created by Mozilla with the mission to provide unbiased, browser-agnostic documentation of HTML, CSS, JavaScript, and Web APIs.

Building the Web platform through standards can only work if these standards can be effectively adopted by developers. As a developer myself, MDN has been an invaluable resource in learning about the Web, and I am enthused by the prospects of contributing to its future and how standardization and dev documentation can work better together.

How can I get the table headers to be vertical in a MediaWiki table?

Published 18 Oct 2017 by Regis May in Newest questions tagged mediawiki - Stack Overflow.

I'm trying to turn the headers in a MediaWiki table from horizontal to vertical. This way there should be more room for the table data.

I found this MediaWiki template that can turn text in an arbitrary direction.

This basically works but it does not solve the problem: within tables it seems that web browsers FIRST lay out all the cells and THEN rotate the text. This way all table header cells cover the same amount of space after the rotation is performed, which does not solve my problem. I require the opposite approach: FIRST rotate the text and THEN lay out the table.

How can this be achieved with CSS? How can the text in table headers be made vertical in a way that saves horizontal space?

Mediawiki error empty response

Published 18 Oct 2017 by Samuel Panicucci in Newest questions tagged mediawiki - Stack Overflow.

I installed MediaWiki on my server following a tutorial, but sometimes ERR_EMPTY_RESPONSE appears as an error, without any Apache error log message.

My Mediawiki version is 1.29.1

Embed PHP in a Mediawiki page

Published 18 Oct 2017 by Doug in Newest questions tagged mediawiki - Stack Overflow.

I have been googling this for the best part of yesterday without any joy, so I'm here as a last hope for an answer before I give up!

Can anyone tell me if it's possible to embed a .php page in a MediaWiki page? I was hoping for an extension of some sort or some other type of solution - I did see a post here somewhere during my search that suggested using a hook to make an extension - but I have absolutely no idea where to even start!

Essentially the problem I'm trying to solve is that I have a .php script that spits out the results of a couple of array queries, and I would like to present this directly on a MediaWiki page. I'm not bothered how; in an inline frame, I assume.

Has anyone tried anything like this and made it work? Any pointers would be very much appreciated. Thanks

What's New With the DigitalOcean Network

Published 17 Oct 2017 by Luca Salvatore in DigitalOcean: Cloud computing designed for developers.

Early this year the network engineering team at DigitalOcean embarked on a fairly ambitious project. We were thinking about areas of our network that needed improvement both for our customers and for our internal systems. One of the key things that we strive for at DO is to provide our customers with a stable and high-performing cloud platform. As we continue to grow and release new products, it becomes clear that network infrastructure is a critical component and it must keep up with our customers' needs. In order to allow our customers to grow, the network must be able to scale, it must be performant, and above all, must be reliable.

With those factors in mind, we went to work building out the DigitalOcean global backbone. It’s not finished yet, but we wanted to share what has been done so far, what is in progress, and what the end state will be.

Creating a Backbone Network

DigitalOcean currently operates 12 datacenter regions (DCs) all around the world. Up until recently, these datacenters have functioned as independent “island” networks. This means that if you have Droplets in multiple locations and they need to communicate with each other, that communication goes across the public internet. For the most part, that “just works”, but the internet is susceptible to a multitude of potential problems: ISPs can have technical problems, congestion is common, and there are malicious attacks that can cause widespread issues. If you have an application that requires communication between multiple regions, the factors mentioned above could throw a wrench in even the most well designed system. To mitigate this risk, we are building our own backbone network.

A backbone network allows us to interconnect our DCs using a variety of technologies such as dark fiber and wavelengths. This means that communication between DO locations no longer needs to traverse the public internet. Instead, traffic between locations runs over dedicated links that DigitalOcean manages. This gives our customers predictable and reliable transport between regions. Predictable and reliable are the key words here, and this is immensely important for anyone who is building mission critical applications. It allows developers and engineers to know exactly how their application will perform, and feel safe in the fact that their traffic is running over dedicated and redundant infrastructure.

Our customers have probably noticed a number of “Network Maintenance Notifications” that we’ve sent out. In order to build out our backbone and ensure that it is scalable, reliable, and performant, we’ve had to make a number of changes to our existing network infrastructure. This includes software upgrades, new hardware, and a number of complex configuration changes. The end result will ensure that our current and future customers will benefit from all of this work.

Now, onto the details. This is what we have built so far, and what we'll build in the future.

Networking Through DO-Owned Links

We’ve interconnected our three NYC locations; all Droplet-to-Droplet traffic between NYC1, NYC2, and NYC3 now traverses DO-owned links. Latency is predictable and stable, and packet loss is nonexistent.

We’ve done the same thing around all of our European locations: LON1, AMS2, AMS3, and FRA1 are all now interconnected together. Again, all traffic between Droplets within the EU now stays within the DO network. Here is how it looks:

We’ve also provisioned transatlantic links connecting our NYC regions to our European regions. This means that your communication between NYC and any datacenter in Europe also stays within our own network:

Adding more to the mix, we’ve connected our NYC locations to our two facilities in California, SFO1 and SFO2. All communication around North America as well as communication within and to Europe now stays within the DO backbone:

Next up will be connectivity from the SFO region to SGP1. We also have plans to link Singapore to Europe, slated for Q1 2018, as well as TOR1 to NYC. Once fully completed, the DO global backbone will look like this:

We are very excited about what these upgrades mean for DO and for you, our users. We’re continually striving to create better performing and more reliable infrastructure, and I feel that these upgrades to the network will set the stage for some really awesome things to be built on top of the DO platform.

Luca Salvatore is currently the manager of the Networking Engineering Team at DigitalOcean. Over the past decade Luca has held various network engineering roles both in Australia and the USA. He has designed and built large enterprise and datacenter networks and has first hand experience dealing with massively scalable networks such as DigitalOcean. He has been working in the cloud networking space for the past 5 years and is committed to peering and an open internet for all.

Reminder: Research Before You Sell Out

Published 17 Oct 2017 by Ipstenu (Mika Epstein) in Make WordPress Plugins.

Are you thinking of selling your plugin? Did someone offer you money to put a link to their sites in your readme or wp-admin settings page?


I’m sure most of you are aware of the recent bad behaviour that’s gone on with regards to unscrupulous people purchasing plugins and using them to leverage malware, spam, and backdoors. While we would never tell you that it’s wrong to sell the plugins (they’re yours after all), we do want to help you recognize the warning signs of a bad-faith purchase.

Above all, if anything in the process makes you nervous and feel like something is wrong, call the deal off. You can email us at and we can help vet the buyer for you.

But remember this: The primary reason people want to buy ‘popular’ plugins is to use them to spam.

Signs To Watch Out For

Here are some basic red flags:

Do Your Homework

When people come to us asking to adopt plugins, we vet them. We look at the code first. If there’s no new version of the code, with fixes, we don’t even consider it. If the prospective buyer of your plugin can’t show you how they’ll update it, don’t do it. Period.

No matter what, you must do the work to vet these people. Ask them serious questions. How do they plan to handle support and reviews? How familiar are they with the directory guidelines? Do they already know how to use SVN? How will they take care of your existing users?

Review their code. Sit down and look over every single line of code and make sure it’s safe and well written. If you see base64 and it’s not for images, tell them no. If you see them phoning home, tell them no. If you see them doing things in an insecure way, tell them no.

At the end of the day, what they do is going to reflect on YOU, and your reputation could suffer.

Many times, good developers find their names dragged through the mud when a plugin they own is purchased by people who do horrible things with their code. Make absolutely certain, beyond a shadow of a doubt, that they understand what owning the plugin means, and that they must abide by all the plugin and forum guidelines.

Worst Case Scenario

If we find out you sold your plugin to someone who does evil with it, the odds are you won't get that plugin back. Among other reasons, you sold it. Letting you take someone's money for the access and then handing the plugin back to you would be tantamount to theft. At the very least, it would be a bad-faith action on our part. Once you sell a plugin, accept the money, and your access is removed, that's it. You've indicated you're done with it, and we will enforce that.

This means if evil is done and we need to fix the plugin, we'll roll it back to a safe version, remove everyone's access, and disable the plugin permanently. That will push a final update, but no one new will be able to install it. We feel that once a plugin has been sold and used like that, it's near impossible to recover any reputation, and it's better for the community to walk away.

Should You Sell Your Plugins?

The directory was never intended to be a sales marketplace, and it's unlikely it will ever be one. If your deepest wish is to make a super popular plugin and sell it for gobs of money, this is probably not the place for you. Selling your plugin is a chancy business, and it's hard to make money legitimately on a free plugin. After all, anyone can legally just fork it and make a new one.

You certainly can sell your plugins, but sell them smartly. At the end of the day, it may be better to retire a plugin than to sell it or give it away to someone you're not sure will do good.

#notice, #warning

MediaWiki: Convert seconds to HH:MM:SS

Published 17 Oct 2017 by gooflab in Newest questions tagged mediawiki - Stack Overflow.

How do I convert seconds to HH:MM:SS via MediaWiki (plus Semantic MediaWiki)? I tried and tried and just can't find a way.

The duration in seconds comes via filling out a template. As I can't install more extensions on the wiki, the only ways are via SMW and parser functions.

Thank you very much in advance!
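For reference, the conversion itself is just integer division and remainders; here is the arithmetic sketched in Python (on the wiki, the same steps would be expressed with parser functions such as #expr using floor and mod):

```python
def seconds_to_hms(total_seconds: int) -> str:
    """Format a duration given in seconds as HH:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

print(seconds_to_hms(3661))  # → 01:01:01
```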

Semantic Search with Subproperties

Published 16 Oct 2017 by Crysis in Newest questions tagged mediawiki - Stack Overflow.

I'm using Semantic MediaWiki on an offline wiki. I set up these properties in a chain.

Book -> Series -> Setting with book being a subproperty of series, which is a subproperty of setting.

So on the book page "Earth X", the series property is listed as "Earth Trilogy". On the series page "Earth Trilogy", the setting is listed as "Starborn Earth".

Various character pages contain the property "Book::Earth X", to tie them to the book.

What I am trying to set this up to do, is to be able to pull up the Starborn Earth setting, and via subproperties, also pull up the Earth Trilogy series page, Earth X book page, and all of those character pages. Because all of that is a part of the Starborn Earth setting.

But I'm not sure how this search should be written. I've tried using "Setting::+" on the Special:Ask page. It gives me all the pages, but also gives ANY other pages with Setting. (ie, if I have a page with "Setting::Dark Earth", all of those pages will be returned as well. Making the search useless.)

If I do a search with "Setting:Starborn Earth", all that is returned is the page that directly has that property set, the Earth Trilogy series page. The book and character pages are not there at all.

Basically I'm trying to set my Wiki so I can use subproperties to organize various pages.

Using some video games as an example:

Setting:Zelda would return ALL pages in the Zelda setting, while Setting:Mario would return ALL pages in the Mario setting.

So Setting:Zelda might spit out pages like this: Link, Zelda, Ganondorf, Ocarina of Time, Majora's Mask, Hyrule, Hylian.

And then Setting:Mario might spit out pages like this: Mario, Princess Peach, Mario Kart 8, Super Mario Galaxy, Chain Chomp.

And then if I wanted to break it down further, I would then be able to step down in the chain.

Series::Ocarina of Time, while being in Setting::Zelda, would only return pages like this: Lon Lon Ranch (OoT), Zelda (OoT), Link (OoT)

Yet it would ignore a page like: Link (LttP)

Because that page would be under Series::A Link to the Past.

So I would be able to search for content at every level of the chain, and narrow down the focus on what to display.

Updated Kubuntu 17.10 RC ISOs now available

Published 16 Oct 2017 by rikmills in Kubuntu.

Following on from yesterday’s 1st spin of the 17.10 RC images by the ubuntu release team, today the RC images (marked Artful Final on the QA tracker) have been re-spun and updated.

Please update your ISOs if you downloaded previous images, and test as before.

Please help us by testing as much as you have time for. Remember, in particular we need i386 testers, on “bare metal” rather than VMs if possible.

Builds are available from:

The CD image to the left of the ISO names is a link that takes you to download URLs/options.

Take note of the Ubuntu Community ISO testing party on Monday 16th at 15:00 UTC:

Please attend and participate if you are able. The #ubuntu-on-air IRC channel can be joined via a web client found beneath the live stream, or of course you can join in a normal IRC client.

Happy testing,

Rik Mills

Kubuntu Developer
Kubuntu Release team

Disestablish and be damned

Published 16 Oct 2017 by in New Humanist Articles and Posts.

The Church of England continues to hold incredible constitutional power, to the detriment of the UK. Can this be challenged?

Kubuntu Artful Aardvark (17.10) initial RC images now available

Published 14 Oct 2017 by valorie-zimmerman in Kubuntu.

Artful Aardvark (17.10) initial Release Candidate (RC) images are now available for testing. Help us make 17.10 the best release yet!

Note: This is an initial spin of the RC images. It is likely that at least one more rebuild will be done on Monday.

Adam Conrad from the Ubuntu release team list:

Today, I spun up a set of images for everyone with serial 20171015.

Those images are *not* final images (ISO volid and base-files are still
not set to their final values), intentionally, as we had some hiccups
with langpack uploads that are landing just now.

That said, we need as much testing as possible, bugs reported (and, if
you can, fixed), so we can turn around and have slightly more final
images produced on Monday morning. If we get no testing, we get no
fixing, so no time like the present to go bug-hunting.

… Adam

The Kubuntu team will be releasing 17.10 on October 19, 2017.

This is an initial pre-release. Kubuntu RC pre-releases are NOT recommended for:

Kubuntu pre-releases are recommended for:

Getting Kubuntu 17.10 Initial Release Candidate:

To upgrade to Kubuntu 17.10 pre-releases from 17.04, run

sudo do-release-upgrade -d

from a command line.

Download a Bootable image and put it onto a DVD or USB Drive here: (the little CD icon)

See our release notes:

Please report any bugs on Launchpad using the commandline:

ubuntu-bug packagename

Check on IRC channels, the Kubuntu forum, or the Kubuntu mailing lists if you don’t know the package name. Once the bug is reported on Launchpad, please link to it on the QA tracker where you got your RC image. Join the community ISO testing party:

KDE bugs (bugs in Plasma or KDE applications) are still filed at

How to use curl with the MediaWiki API to register

Published 14 Oct 2017 by Midlaj in Newest questions tagged mediawiki - Stack Overflow.

I want to use the MediaWiki API to register a user, and I want to use cURL for the API calls. I tried with

function RegisterCurl()
{
    // Is cURL installed?
    if (!function_exists('curl_init')) {
        die('Sorry cURL is not installed!');
    }

    $myvar1 = "createaccount";
    $myvar2 = "http://localhost/WC/gccfwiki/";
    $myvar3 = $this->GetToken();
    $myvar4 = "Bob";
    $myvar5 = "ExamplePassword";
    $myvar6 = "ExamplePassword";
    $myvar7 = "";
    $myvar8 = "Robert20Example";

    // Build the POST body; the token must be URL-encoded because it ends in "+\"
    $myvars = 'action=' . $myvar1 . '&createreturnurl=' . $myvar2 . '&createtoken=' . urlencode($myvar3) . '&username=' . $myvar4 . '&password=' . $myvar5 . '&retype=' . $myvar6 . '&email=' . $myvar7 . '&realname=' . $myvar8;

    $url = 'http://localhost/WC/gccfwiki/api.php';

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 25);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $myvars);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $output = curl_exec($ch);
    // Close the cURL resource, and free system resources
    curl_close($ch);
    return $output;
}

public function GetToken()
{
    $url = 'http://localhost/WC/gccfwiki/api.php?action=query&format=json&meta=tokens&type=csrf|createaccount';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $output = curl_exec($ch);
    curl_close($ch);
    // Decode the JSON response before reading the token from it
    $out = json_decode($output);
    $token = $out->query->tokens->createaccounttoken;
    return $token;
}

I got the response { "error": { "code": "badtoken", "info": "Invalid CSRF token.", "*": "See http://localhost/WC/gccfwiki/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list for notice of API deprecations and breaking changes." } }

Is there any solution? How do I solve it?
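Worth noting: the createaccount token must come from the decoded JSON response (the createaccounttoken field), and the token request and the createaccount request must share session cookies, because the token is bound to the session (in cURL that means setting CURLOPT_COOKIEJAR/CURLOPT_COOKIEFILE on both handles); a missing shared session is the most common cause of badtoken. A sketch of just the decode-and-assemble step, in Python with a made-up sample response:

```python
import json

def build_createaccount_payload(token_response, username, password, return_url):
    """Decode the meta=tokens JSON response and assemble the POST
    fields for action=createaccount."""
    data = json.loads(token_response)
    token = data["query"]["tokens"]["createaccounttoken"]
    return {
        "action": "createaccount",
        "createreturnurl": return_url,
        "createtoken": token,  # send verbatim; MediaWiki tokens end in "+\"
        "username": username,
        "password": password,
        "retype": password,
        "format": "json",
    }

# Made-up sample response for illustration.
sample = '{"query": {"tokens": {"createaccounttoken": "abc123+\\\\"}}}'
payload = build_createaccount_payload(sample, "Bob", "ExamplePassword",
                                      "http://localhost/WC/gccfwiki/")
print(payload["createtoken"])
```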

Perth Heritage Days 2017

Published 13 Oct 2017 by in Category:Blog_posts.

, Perth CBD.


Moana Chambers

Moana Chambers. The café only recently closed, but perhaps will re-open again soon under new management. An amazing stairwell, retrofitted certainly, but it's hard to get an idea of the original layout.

Trinity Buildings

Trinity Arcade. A great place, I'd love to work in these offices. Lots of empty rooms by the looks of it.

The sign boards read as follows:

The first congregational church in Perth was founded by Henry Trigg who opened a chapel in 1846 on a site in William Street.

In 1863 the present site in Saint George's Terrace was purchased and a new building, Trinity Congregational Chapel, was opened by Governor Hampton in 1865. This colonial building, designed by Richard Roach Jewell in the Gothic Revival style known as Commissioners' Gothic, was located on a rise set well back from the Terrace.

Constructed in Flemish Bond brickwork and with a shingled roof, that building exists today and appears externally very much as it did in 1865. The original open space around the chapel has been replaced with arcades and buildings.

The new schoolroom was added to the northern side of the chapel in 1872. Constructed in Flemish Bond brickwork with a shingled roof, that building also survives but with some minor alterations and later additions to the western side.

The present Trinity Church on Saint George's Terrace was erected in front of the former chapel. Designed by Henry Trigg, the first architect to be born and trained in W.A. and grandson of the founder, the church was dedicated in December 1893. This building constructed in Flemish Bond brickwork with twin towers and elaborate stucco decoration was designed in the Victorian style known as "Dissenter's Mediaevalism."

Trinity Church maintains a presence in the city and an historic link with the past. The arcades provide a connection between the business centre in St. George's Terrace and the commercial district of the Hay Street Mall.

(Architects: Duncan Stephen and Mercer 1982.)

In 1927 Trinity Arcade and Buildings were constructed on the Hay Street frontage of the property to provide three floors of shops and commercial premises and a basement. The building was designed for the trustees of Trinity Congregational Church by E. Allwood, architect.

In 1981, Trinity Arcade was extended on three levels and the basement of Trinity Buildings upgraded to provide a shopping arcade link from St. George's Terrace and under Hay Street. Trinity Church and Halls were restored at the same time.

Mary Raine exhibition

A corporate exhibition with reasonable research and few artifacts. Lots to learn, though, about Mary Raine. Shame about the venue (the foyer of the Bank West corporate offices).


Cast Iron Pillar Boxes of WA

A great talk by Sue Hobson, who has published a book about the history of cast-iron pillar boxes of Western Australia. She's been giving this talk regularly since first presenting to the RHSWA, and has a family connection to the topic as her great-great grandfather started the company (J & E Ledger) that made the pillar boxes.


Is there a WikiBooks API or a handle from the mediawiki API

Published 13 Oct 2017 by Ab Sin in Newest questions tagged mediawiki - Stack Overflow.

I am trying to find an API for Wikibooks. I have been going through the MediaWiki and Wikipedia API documentation but have not been able to figure out a way. For example, here is a Wikibooks link for the Java Programming Language book. Can I get a list of the hyperlinks on this page, like Statements, Conditional_blocks, etc.? I am not expecting exact answers; a documentation link or a working example will be great too.
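Wikibooks runs MediaWiki, so the standard Action API applies: the wiki links on a page are available via action=query&prop=links. A sketch in Python that just builds the request URL (the en.wikibooks.org endpoint and the page title are illustrative assumptions):

```python
from urllib.parse import urlencode

def page_links_url(page_title):
    """Build a MediaWiki Action API request that lists the wiki links
    on a page (e.g. "Statements", "Conditional blocks")."""
    params = {
        "action": "query",
        "titles": page_title,
        "prop": "links",
        "pllimit": "max",  # return as many links per request as allowed
        "format": "json",
    }
    return "https://en.wikibooks.org/w/api.php?" + urlencode(params)

print(page_links_url("Java Programming"))
```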

How to check if Wikipedia Article is Featured or not using API?

Published 13 Oct 2017 by fattah.safa in Newest questions tagged mediawiki - Stack Overflow.

I need to check whether a Wikipedia article is Featured or not. How do I do that using the Wikipedia API? If this is not supported, is there a Wikipedia API function to get the list of Wikipedia Featured Articles?
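One common approach: on English Wikipedia, featured articles are members of Category:Featured articles, so a query with prop=categories filtered by clcategories can answer the yes/no question in one request. A sketch of checking such a response in Python, using made-up sample responses for illustration:

```python
import json

FEATURED_CATEGORY = "Category:Featured articles"  # English Wikipedia convention

def is_featured(api_response):
    """Given the JSON response of a query like
    action=query&prop=categories&clcategories=Category:Featured articles,
    return True if the page is in the featured-articles category."""
    data = json.loads(api_response)
    for page in data["query"]["pages"].values():
        for cat in page.get("categories", []):
            if cat["title"] == FEATURED_CATEGORY:
                return True
    return False

# Made-up sample responses for illustration.
featured = '{"query": {"pages": {"123": {"categories": [{"title": "Category:Featured articles"}]}}}}'
plain = '{"query": {"pages": {"456": {}}}}'
print(is_featured(featured), is_featured(plain))  # → True False
```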

Working with modules on MediaWiki

Published 12 Oct 2017 by GFL in Newest questions tagged mediawiki - Stack Overflow.

I need to repeat a string n number of times on my wiki.

It looks like I can do that using Module:String


But instead of getting hellohellohello I get {{#invoke:String|rep|hello|3}}

Do I need to install or turn on modules? I'm familiar with MediaWiki extensions but I've never come across modules before and can't find any documentation.

WordPress 4.9 Beta 2

Published 12 Oct 2017 by Mel Choyce in WordPress News.

WordPress 4.9 Beta 2 is now available!

This software is still in development, so we don’t recommend you run it on a production site. Consider setting up a test site just to play with the new version. To test WordPress 4.9, try the WordPress Beta Tester plugin (you’ll want “bleeding edge nightlies”). Or you can download the beta here (zip).

For more information on what’s new in 4.9, check out the Beta 1 blog post. Since then, we’ve made 70 changes in Beta 2.

Do you speak a language other than English? Help us translate WordPress into more than 100 languages!

If you think you’ve found a bug, you can post to the Alpha/Beta area in the support forums. We’d love to hear from you! If you’re comfortable writing a reproducible bug report, file one on WordPress Trac, where you can also find a list of known bugs.

Let’s test all of these:
code editing, theme switches,
widgets, scheduling.

Is it possible to edit Mediawiki Gallery Builder to display more than 50 images when adding images?

Published 10 Oct 2017 by user2152004 in Newest questions tagged mediawiki - Stack Overflow.

So using "Add a photo to this Gallery" is pretty self-explanatory. However, when I click on "Add a Photo" it only displays the last 50 images uploaded. How would I go about tweaking it to display more than 50 images using JavaScript or HTML? At minimum it should display 100.

Cthulhu: Organizing Go Code in a Scalable Repo

Published 10 Oct 2017 by Matt Layher in DigitalOcean: Cloud computing designed for developers.

At DigitalOcean, we’ve used a “mono repo” called cthulhu to organize our Go code for nearly three years. A mono repo is a monolithic code repository which contains many different projects and libraries. Bryan Liles first wrote about cthulhu in early 2015, and I authored a follow-up post in late 2015.

A lot has changed over the past two years. As our organization has scaled, we have faced a variety of challenges while scaling cthulhu, including troubles with vendoring, CI build times, and code ownership. This post will cover the state of cthulhu as it is today, and dive into some of the benefits and challenges of using a mono repo for all of our Go code at DigitalOcean.

Cthulhu Overview

Our journey using Go with a mono repo began in late 2014. Since then, the repository, called "cthulhu", has grown exponentially in many ways. As of October 6th, 2017, cthulhu has:

As the scale of the repository has grown over the past three years, it has introduced some significant tooling and organizational challenges.

Before we dive into some of these challenges, let’s discuss how cthulhu is structured today (some files and directories have been omitted for brevity):

├── docode
│   └── src
│       └── do
│           ├── doge
│           ├── exp
│           ├── services
│           ├── teams
│           ├── tools
│           └── vendor
└── script

docode/ is the root of our GOPATH. Readers of our previous posts may notice that third_party no longer exists, and do/ is now the prefix for all internal code.

Code Structure

All Go code lives within our GOPATH, which starts at cthulhu/docode. Each directory within the do/ folder has a unique purpose, although we have deprecated the use of services/ and tools/ for the majority of new work.

doge/ stands for “DigitalOcean Go Environment”, our internal “standard library”. A fair amount of code has been added and removed from doge/ over time, but it still remains home to a great deal of code shared across most DigitalOcean services. Some examples include our internal logging, metrics, and gRPC interaction packages.

exp/ is used to store experimental code: projects which are in a work-in-progress state and may never reach production. Use of exp/ has declined over time, but it remains a useful place to check in prototype code which may be useful in the future.

services/ was once used as a root for all long-running services at DO. Over time, it became difficult to keep track of ownership of code within this directory, and it was replaced by the teams/ directory.

teams/ stores code owned by specific teams. As an example, a project called “hypervisor” owned by team “compute” would reside in do/teams/compute/hypervisor. This is currently the preferred method for organizing new projects, but it has its drawbacks as well. More on this later on.

tools/ was once used to store short-lived programs used for various purposes. These days, it is mostly unused except for CI build tooling, internal static analysis tools, etc. The majority of team-specific code that once resided in tools/ has been moved to teams/.

Finally, vendor/ is used to store third-party code which is vendored into cthulhu and shared across all projects. We recently added the prefix do/ to all of our Go code because existing Go vendoring solutions did not work well when vendor/ lived at the root of the GOPATH (as was the case with our old third_party/ approach).

script/ contains shell scripts which assist with our CI build process. These scripts perform tasks such as static analysis, code compilation and testing, and publishing newly built binaries.

CI Build Tooling

One of the biggest advantages of using a mono repo is being able to effectively make large, cross-cutting changes to the entire repository without fear of breaking any “downstream” repositories. However, as the amount of code within cthulhu has grown, our CI build times have grown exponentially.

Even though Go code builds rather quickly, in early 2016, CI builds took an average of 20 minutes to complete. This resulted in extremely slow development cycles. If a poorly written test caused a spurious failure elsewhere in the repo, the entire build could fail, causing a great deal of frustration for our developers.

After experiencing a great deal of pain because of slow and unreliable builds, one of our engineers, Justin Hines, set out to solve the problem once and for all. After a few hours of work, he authored a build tool called gta, which stands for “Go Test Auto”. gta inspects the git history to determine which files changed between master and a feature branch, and uses this information to determine which packages must be tested for a given build (including packages that import the changed package).

As an example, suppose a change is committed which modifies a package, do/teams/example/droplet. Suppose this package is imported by another package, do/teams/example/hypervisor. gta is used to inspect the git history and determine that both of these packages must be tested, although only the first package was changed.
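The core of this idea is a reverse-dependency walk over the package import graph; here is a minimal sketch of that walk in Python (an illustration of the technique, not gta's actual implementation):

```python
from collections import deque

def affected_packages(import_graph, changed):
    """Given a map of package -> set of packages it imports, return every
    package that must be retested: the changed packages plus all packages
    that transitively import them."""
    # Invert the graph: package -> packages that import it.
    importers = {}
    for pkg, deps in import_graph.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(pkg)
    # Breadth-first walk from the changed packages along reverse edges.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        pkg = queue.popleft()
        for parent in importers.get(pkg, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected

graph = {
    "do/teams/example/hypervisor": {"do/teams/example/droplet"},
    "do/teams/example/droplet": set(),
    "do/teams/other/unrelated": set(),
}
print(sorted(affected_packages(graph, {"do/teams/example/droplet"})))
# → ['do/teams/example/droplet', 'do/teams/example/hypervisor']
```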

For very large changes, it can occasionally be useful to test the entire repository, regardless of which files have actually changed. Adding “force-test” anywhere in a branch name disables the use of gta in CI builds, restoring the old default behavior of “build everything for every change”.

The introduction of gta into our CI build process dramatically reduced the amount of time taken by builds. An average build now takes approximately 2-3 minutes—a dramatic improvement over the 20 minute builds of early 2016. This tool is used almost everywhere in our build pipeline, including static analysis checks, code compilation and testing, and artifact builds and deployment.

Static Analysis Tooling

Every change committed to cthulhu is run through a bevy of static analysis checks, including tools such as gofmt, go vet, golint, and others. This ensures a high level of quality and consistency between all of our Go code. Some teams have even introduced additional tools such as staticcheck for code that resides within their teams/ folder.

We have also experimented with the creation of custom linting tools that resolve common problems found in our Go code. One example is a tool called buildlint that checks for a blessed set of build tags, ensuring that tags such as !race (exclude this file from race detector builds) cannot be used.
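A buildlint-style check can be sketched as a scan over the build-constraint comments in each source file. This is a minimal, hypothetical sketch assuming the old-style `// +build` comment syntax and an illustrative forbidden-tag list; the real blessed set is internal to DigitalOcean.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// forbiddenTags lists build tags a buildlint-style check rejects.
// (Hypothetical set for illustration.)
var forbiddenTags = map[string]bool{"!race": true}

// checkBuildTags scans Go source text for "// +build" constraint lines
// and returns any tag that is not allowed.
func checkBuildTags(src string) []string {
	var bad []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "// +build") {
			continue
		}
		// Each whitespace-separated field after the prefix is a tag.
		for _, tag := range strings.Fields(strings.TrimPrefix(line, "// +build")) {
			if forbiddenTags[tag] {
				bad = append(bad, tag)
			}
		}
	}
	return bad
}

func main() {
	src := "// +build !race linux\n\npackage example\n"
	fmt.Println(checkBuildTags(src)) // flags the !race tag
}
```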

Static analysis tools are incredibly valuable, but it can be tricky to introduce a new tool into the repository. Before we decided to run golint in CI, the tool generated nearly 1,500 errors across the entirety of cthulhu. It took a concerted effort and several Friday afternoons to fix all of these errors, but it was well worth it. Our internal godoc instance now provides a vast amount of high-quality documentation for every package that resides within cthulhu.


While there are many advantages to the mono repo approach, it can be challenging to maintain as well.

Though many different teams contribute to the repository, it can be difficult to establish overall ownership of the repository, its tooling, and its build pipelines. In the past, we tried several different approaches, but most were unsuccessful because customer-facing project work typically takes priority over internal tooling improvements. However, this has recently changed, and we now have engineers working specifically to improve cthulhu and our build pipelines, alongside regular project work. Time will tell if this approach suits our needs.

The issue of code vendoring remains unsolved, though we have made efforts to improve the situation. As of now, we use the tool “govendor” to manage our third-party dependencies. The tool works well on Linux, but many of our engineers who run macOS have reported frustrating issues when running it locally. In some cases, the tool runs for a very long time before completing. In others, it eventually fails and requires deleting and re-importing a dependency to succeed. In the future, we’d also like to try out “dep”, the “official experiment” vendoring tool for the Go project. At this time, GitHub Enterprise does not support Go vanity imports, which we would need in order to make use of dep.

As with most companies, our organizational structure has also evolved over time. Because we typically work in the teams/ directory in cthulhu, this presents a problem. As of now, our code structure is somewhat reliant on our organizational structure. Because of this, code in teams/ can become out of sync with the organizational structure, causing issues with orphaned code, or stale references to teams that no longer exist. We don’t have a concrete solution to this problem yet, but we are considering creating a discoverable “project directory service” of sorts so that our code structure need not be tied to our organizational structure.

Finally, as mentioned previously, scaling our CI build process has been a challenge over time. One problem in particular is that non-deterministic or “flappy” tests can cause spurious failures in unrelated areas of the repository. A test typically flaps when it relies on some assumption which cannot be guaranteed, such as timing or ordering of concurrent operations. This problem is compounded when interacting with a service such as MySQL in an integration test. For this reason, we encourage our engineers to do everything in their power to make their tests as deterministic as possible.
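For instance, a test that sleeps and hopes a goroutine has finished is timing-dependent, while one that synchronizes explicitly is not. A generic Go sketch of the deterministic style (not code from cthulhu):

```go
package main

import (
	"fmt"
	"sync"
)

// deterministicStyle waits for the goroutine via a WaitGroup instead of
// sleeping, so the result never depends on scheduler timing.
func deterministicStyle() int {
	var result int
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		result = 42 // simulated concurrent work
	}()
	wg.Wait() // happens-before: the write above is visible after Wait
	return result
}

func main() {
	fmt.Println(deterministicStyle()) // prints 42 on every run
}
```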


We’ve been using cthulhu for three years at DigitalOcean, and while we’ve faced some significant hurdles along the way, the mono repo approach has been a huge benefit to our organization as a whole. Over time, we’d like to continue sharing our knowledge and tools, so that others can reap the benefits of a mono repo just as we have.

Matt Layher is a software engineer on the Network Services team, and a regular contributor to a wide variety of open source networking applications and libraries written in Go. You can find Matt on Twitter and GitHub.

Mediawiki running rebuildtextindex.php made phrase searches take minutes

Published 10 Oct 2017 by Austin in Newest questions tagged mediawiki - Stack Overflow.

I've inherited an in-production MediaWiki server that has had pages automatically generated by Python scripts using large datasets. There are about 2 million pages on this particular wiki. We noticed that some phrase searches were not turning up results that clearly should have appeared. So we ran maintenance/rebuildtextindex.php as MediaWiki prescribes. Now the search results are returning correctly. The issue is that single words search fine, but phrases do not. E.g. searching word1 word2 without quotes returns results in a few seconds or less, but "word1 word2" with quotes literally takes minutes, with the browser stating "waiting for [domain]..." before finally returning the results.

I've tried looking into issues with this script. The MediaWiki version is 1.24, so I tried running the script both with and without dropping the search index table. Both bring the same result.

I'm new to MediaWiki and have enough PHP knowledge to understand basic PHP; I'm okay with SQL, but I can't see what I'm missing. How can I make the phrase searches faster?

Installing MediaWiki, WikiHow Version

Published 10 Oct 2017 by Adrian Vidican in Newest questions tagged mediawiki - Stack Overflow.

Does anyone know how to install the WikiHow version of MediaWiki, since it does not have the "mw-config" folder at all? Is there any way it can be installed?

The open source code is available at


Send login to mediawiki

Published 10 Oct 2017 by Pram in Newest questions tagged mediawiki - Stack Overflow.

I've created my React web app and my own MediaWiki site. I just want to connect users from my React web app to my own MediaWiki. My React web app uses JSON. Does anybody know the flow for sending a login from my React web app to MediaWiki?

The business of death in a growing world

Published 9 Oct 2017 by in New Humanist Articles and Posts.

As the population increases, so does the number of corpses – and the way we dispose of the dead may be about to change.

Why do I write environmental history?

Published 8 Oct 2017 by Tom Wilson in thomas m wilson.

Why bother to tell the history of the plants and animals that make up my home in Western Australia? Partly it's about reminding us of what was here on the land before, and in some ways, could be here again. In answering this question I’d like to quote the full text of Henry David Thoreau’s […]

mediawiki - make link evaluation case insensitive

Published 6 Oct 2017 by david furst in Newest questions tagged mediawiki - Stack Overflow.

I'm running a small wiki and our users would like an interface they find less confusing. The complaint is that a page titled something like 'Big_news' displays as a redlink if the link is 'Big News' or 'big news' or some other upper/lower case permutation, and they'd like these to appear as normal-coloured links if the page exists. When a user clicks on the link, the appropriate page is displayed correctly, but it would be better to see that the page already exists beforehand.

I've tried to implement solutions such as those presented here, here, and here, but they don't work -- links still display as redlinks on the page. (Indeed, I think some of the articles are out of date; MediaWiki 1.27 doesn't seem to have the tables mentioned in them.)

Any ideas how I might go about doing this?

When our sun was young

Published 5 Oct 2017 by in New Humanist Articles and Posts.

The "faint young Sun paradox" has puzzled scientists for decades.

WordPress 4.9 Beta 1

Published 5 Oct 2017 by Jeffrey Paul in WordPress News.

WordPress 4.9 Beta 1 is now available!

This software is still in development, so we don’t recommend you run it on a production site. Consider setting up a test site just to play with the new version. To test WordPress 4.9, try the WordPress Beta Tester plugin (you’ll want “bleeding edge nightlies”). Or you can download the beta here (zip).

WordPress 4.9 is slated for release on November 14, but we need your help to get there. We’ve been working on making it even easier to customize your site. Here are some of the bigger items to test and help us find as many bugs as possible in the coming weeks:

As always, there have been exciting changes for developers to explore as well, such as:

If you want a more in-depth view of what major changes have made it into 4.9, check out posts tagged with 4.9 on the main development blog, or look at a list of everything that’s changed. There will be more developer notes to come, so keep an eye out for those as well.

Do you speak a language other than English? Help us translate WordPress into more than 100 languages!

If you think you’ve found a bug, you can post to the Alpha/Beta area in the support forums. We’d love to hear from you! If you’re comfortable writing a reproducible bug report, file one on WordPress Trac, where you can also find a list of known bugs.

Happy testing!

Without your testing,
we might hurt the internet.
Please help us find bugs.🐛

Is it possible to upload video on mediawiki using api?

Published 4 Oct 2017 by MasuD RaNa in Newest questions tagged mediawiki - Stack Overflow.

I want to upload video to MediaWiki using the API. Is there any way to upload video? If it's possible, please give me example code and teach me how to do it. Thanks.

Hatch at One Year: Helping More Startups Grow

Published 3 Oct 2017 by Hollie Haggans in DigitalOcean: Cloud computing designed for developers.

Our global incubator program Hatch turned one year this past September. Since 2016, we’ve partnered with over 170 incubators, venture capital firms, and accelerators globally to provide infrastructure credit and technical support to startups in over 61 countries, and we’ve moved Hatch out of the beta phase to make the program more widely available across the world.

Hatch startup participants include companies like:

In addition to providing infrastructure support, we’ve hosted forums, called Hatch Founders’ Circles, in New York, Berlin, and Bangalore that facilitate thought partnership between our Hatch startups and other successful tech entrepreneurs, and are launching an invite-only Slack community for our Hatch startup founders.

To celebrate this milestone, DigitalOcean co-founder Mitch Wainer recently interviewed DO CEO Ben Uretsky for an episode of The Deep End podcast. They discussed DO’s humble beginnings and what’s changed for the company over the past six years.

The following excerpts were edited and adapted from the podcast, which you can listen to in full here:

How DigitalOcean Got Its Start

Mitch Wainer: So Ben, why don't you just quickly introduce yourself. You’re the CEO of DigitalOcean, but give a little background on your history.

Ben Uretsky: I was born in Russia, immigrated here when I was five years old with my family, my brother, my mom and my dad, and one of our grandmas as well. I went to school in New York City, graduated from Pace University. I actually managed to start my first company while attending college, so that was great. I built that business over a number of years, and had the pleasure of starting DigitalOcean in the summer of 2011 with four other co-founders, you being one of them. That was definitely a fun journey. We rented a three-bedroom ranch out in Boulder, Colorado.

That was for the Techstars program. What was the most exciting or the most interesting memory that you can share from Techstars? Which memory stands out in your mind?

I'd say demo day. A lot of practice and months of preparation went into getting ready…and there were about a thousand people in the audience. I think it was a high pressure situation because it's investors and people from the community; it's not just a general crowd.

The other event that came to mind the year prior to that, or actually just a few months earlier—the New York Tech Meetup. That was 800 people, but it felt much more supportive because it's the tech community coming out to see a few companies demo and showcase their work, whereas the Techstars demo day, you feel like you're being judged by a thousand people. So that was definitely an intense experience. I remember doing practice sessions with you in the backyard; getting ready for demo day, and you did the Karate Kid on me: “Wax on, wax off.”

Overcoming Challenges

DigitalOcean has grown, not only on the people side, but also on the tech side. We've gone through a lot of different technology challenges and struggles, so I would love for you to talk about some of those struggles and how we overcame those challenges.

Initially, most of the software was actually written by a single person, Jeff Carr, our co-founder. And in those days, the way that we would reference cloud health could be measured in hours. Essentially, how many hours can Jeff stay away from the console before something would break and he would need to get back in there and fix it? The good news is that we applied simplicity to our architecture as well. So we ensured that, no matter what happens, customer Droplets and customer environments wouldn't be affected by most of the outages and most of the service interruptions.

It allowed us to maintain a really high level of uptime and build the performance and reliability that our customers expect, but at the same time, if you're a single person building the system, a lot of difficulties, [and] challenges come up that you may not have foreseen. [And] the product really scaled. Jeff more or less single-handedly got it to nearly 100,000 customers. What you start building day one looks radically different when you have 100,000 users on the service. I'd say that was one challenge.

The second came as we started to grow: building an engineering team around the service and getting people familiar [with] the ins and outs of the systems. And what was really exciting is that first team, one of their main driving objectives was to refactor a lot of the code that Jeff wrote. Turning it from a proof-of-concept into a much more stable and reliable environment, with documentation, with a more modular understanding, and so that kind of speaks to the shift that we're still going through today. Moving away from the monolithic app that was originally built into a more distributed, micro-service enabled architecture. We're making good progress, but with a more scalable service environment comes more complexity. We have to invest a lot more engineers into building new features and capabilities. And so there are trade-offs in each of those scenarios.

It All Comes Down to People

How has the engineering team structure changed throughout the years to support that evolution of our back-end code base and stack?

There are a few interesting mutations along the way: Going from one engineer to 30; bringing in our first set of engineering managers. We really promoted from within our first six. And I think what was really inspiring is a few years ago, we sat down and came up with a mission document, and said, "Okay, if we're gonna scale this to a million customers, and even more revenue, how do we see ourselves getting there?" And everyone contributed towards what their team's mission and objective was.

[For a while] it was more or less a few frontend teams and quite a few backend teams, but nonetheless, that structure held for a couple of years. And prior to that, I feel like we were reworking maybe every other quarter. So that stability allowed us to grow the team, from 30 or 40 people to a little bit over a hundred. Just a few months ago, engineering management along with [the] Product [team] had the opportunity to re-envision a different way to organize the teams, and today, we've moved to a much more vertical structure, building a team around each of the products. We [now] have a team for Droplet, a team for our Storage services, and a team for the Network services. And that's full stack from the frontend, the API, all the way to the backend services. We're in a much more verticalized structure today.

As CEO of the company, what are some of your challenges and what really keeps you up at night?

The interesting thing is that the role has changed year by year, and different challenges come up and are top of mind. I would say the two that I feel are most recurrent [are] related to the people. Whether it's employees or even the senior leadership team, and making sure that you have that right, that everyone's engaged, they're motivated, that you're making the right hiring decisions. That's all pretty complex stuff when we only hired 20 people [at first]. Today, DigitalOcean is roughly 350 people. And as a result, the amount of decisions that you have to make multiplies, and also the effects within the company become that much more complicated. That's always an interesting aspect of the work.

The second challenge that ties very close to that is making sure you paint the right vision for the business, so that people feel like when they come to work, they know what needs to be done. They're in alignment with where the company is headed. And that they're motivated and inspired by what we're trying to build.

So it all comes down to people?

Companies are collections of people first and foremost. They're not the service, they're not the product, it's really people, and once you comprehend that, I think it allows you to take your leadership to the next level.

Hollie Haggans heads up Global Partnerships for DigitalOcean’s Hatch program. She is passionate about startups and cold brew coffee. Get in touch with questions at

Is your filter going to break the layout?

Published 3 Oct 2017 by JS Morisset in Make WordPress Plugins.

If you’re not clear about the difference between WordPress actions and filters, you may end up breaking the page layout of WordPress – maybe days, months, or even years after you’ve written and implemented a new filter hook.

The difference can be difficult for new developers to grasp – after all, hooking an action or filter runs your code, either way, right? Well, yes, it does, but filters can be executed several times, in different locations as the webpage is being built (header, body, footer, etc.), and even in the admin back-end. More importantly, filters do not send anything to the webpage! Filter hooks receive their data / text as an argument, and then “return” the modified (or original) data / text at the end. They do not use “echo”, “print”, “printf”, etc. – they should not send anything to the webpage. If you need to output something directly to the webpage, use an action – that’s what they’re for.

A good filter hook:

function my_filter_hook( $text ) {
    $text .= 'Adding some text.';
    return $text;
}

A bad filter hook:

function my_filter_hook( $text ) {
    echo 'Adding some text.'; // wrong: filters must not echo
    return $text;
}

How common is this problem?

Unfortunately, it’s much more common that you might think. The the_content filter, for example, is often a source of problems – developers may think their content filter hook is executed only once, when WordPress includes the post content, but post content may be required in header meta tags, widgets, in the footer, or even filtered in the admin back-end. The the_content filter may even be used on completely unrelated text, to expand shortcodes or format text. If you’re sending anything to the webpage from a filter hook, you’re doing it wrong – use an action instead.

How do you know if you have a badly coded filter?

If you activate a plugin that uses the the_content filter (as one example), and the webpage layout is affected, you may have an incorrectly coded filter hook. This problem is so common that plugin authors often avoid using the the_content filter for that very reason.

This problem can be frustrating for end-users to diagnose as well – often the only way is to start disabling plugins / change themes until the problem goes away, and reporting the problem to the plugin / theme author can also be challenging since it may, or may not, be caused by a filter hook – and if it is, determining which filter is affected can require some coding effort / knowledge.

This problem is so common, and so challenging for users to diagnose, that I even wrote a plugin specifically to fix and report which content filter hooks incorrectly send output to the webpage.

#action, #doingitwrong, #filter, #hook, #output

Can the BBC remain impartial?

Published 3 Oct 2017 by in New Humanist Articles and Posts.

The BBC’s commitment to a tepid impartiality is inappropriate for these chaotic times.

The Month in WordPress: September 2017

Published 2 Oct 2017 by Hugh Lashbrooke in WordPress News.

This has been an interesting month for WordPress, as a bold move on the JavaScript front brought the WordPress project to the forefront of many discussions across the development world. There have also been some intriguing changes in the WordCamp program, so read on to learn more about the WordPress community during the month of September.

JavaScript Frameworks in WordPress

Early in the month, Matt Mullenweg announced that WordPress will be switching away from React as the JavaScript library for WordPress Core — this was in response to Facebook’s decision to keep a controversial patent clause in the library’s license, which made many WordPress users uncomfortable.

A few days later, Facebook reverted the decision, making React a viable option for WordPress once more. Still, the WordPress Core team is exploring a move to make WordPress framework-agnostic, so that the framework being used could be replaced by any other framework without affecting the rest of the project.

This is a bold move that will ultimately make WordPress core a lot more flexible, and will also protect it from potential license changes in the future.

You can get involved in the JavaScript discussion by joining the #core-js channel in the Making WordPress Slack group and following the WordPress Core development blog.

Community Initiative to Make WordCamps More Accessible

A WordPress community member, Ines van Essen, started a new nonprofit initiative to offer financial assistance to community members to attend WordCamps. DonateWC launched with a crowdsourced funding campaign to cover the costs of getting things up and running.

Now that she’s raised the initial funds, Ines plans to set up a nonprofit organization and use donations from sponsors to help people all over the world attend and speak at WordCamps.

If you would like to support the initiative, you can do so by donating through their website.

The WordCamp Incubator Program Returns

Following the success of the first WordCamp Incubator Program, the Community Team is bringing the program back to assist more underserved cities in kick-starting their WordPress communities.

The program’s first phase aims to find community members who will volunteer to mentor, assist, and work alongside local leaders in the incubator communities — this is a time-intensive volunteer role that would need to be filled by experienced WordCamp organizers.

If you would like to be a part of this valuable initiative, join the #community-team channel in the Making WordPress Slack group and follow the Community Team blog for updates.

WordPress 4.8.2 Security Release

On September 19, WordPress 4.8.2 was released to the world — this was a security release that fixed nine issues in WordPress Core, making the platform more stable and secure for everyone.

To get involved in building WordPress Core, jump into the #core channel in the Making WordPress Slack group, and follow the Core team blog.

Further Reading:

If you have a story we should consider including in the next “Month in WordPress” post, please submit it here.

How can I pass some MediaWiki markup (here: a table) to a template?

Published 1 Oct 2017 by Regis May in Newest questions tagged mediawiki - Stack Overflow.

I'm looking for a way for a user to pass a table to a template, i.e. by specifying it in MediaWiki syntax. The template should then put this table in the context of some larger output provided by the template.

A simple example. Is there a way a user could specify something like this:

{| class="wikitable"
| Something
| Useful
|}

and then the template outputs the specified data FooBar and the table somewhere?

If this doesn't work, is there some alternative way of doing this? i.e. by specifying some arbitrary (!) CSV data and outputting it in a formatted way?

Collect all values passed to a template

Published 1 Oct 2017 by GFL in Newest questions tagged mediawiki - Stack Overflow.

I am trying to create a spreadsheet with all the values of articles that use a variable template.

For example, the "Johann Sebastian Bach" article uses "Infobox person" template. And it has some values it passes to the template:

name, birth_place, death_place, etc...

What is the best way to collect all the pages that use the "Infobox person" template as well as all the values being passed to the template?

I'm not trying to get the template variables, but the actual information. In the case of the person template I'd like to be able to create a spreadsheet with:

article,name,birth_place,death_place,...
"Johann Sebastian Bach","Johann Sebastian Bach","Eisenach","Leipzig",...
"Phil Spector","Phil Spector","The Bronx, New York, U.S.",,...
...

How to get all allowed languages for Wikidata

Published 30 Sep 2017 by Lokal_Profil in Newest questions tagged mediawiki - Stack Overflow.

I'm writing a tool for interacting with Wikidata where labels and descriptions are added to items. But I would like to validate that the language is supported before trying to add it.

So my question is how do I get a list of the allowed language codes. The documentation describes this as UserLanguageCode but gives no info on retrieving the allowed values.

I know I can get a list of all of the used languages by doing the following SQL operation on the database, but that is both slow and inefficient: SELECT DISTINCT term_language FROM wb_terms.

As an aside is the list of allowed languages the same for MonolingualText statements?

XAMPP + MediaWiki + Composer installing extensions

Published 30 Sep 2017 by Crysis in Newest questions tagged mediawiki - Stack Overflow.

I used the Windows Setup.exe to install Composer. It now works from the command line.

I have an installation of XAMPP in C:\ that contains MediaWiki installed in the HTDOCS folder.

My goal: Install the extensions Semantic Breadcrumb Links and Semantic Extra Special Properties into MediaWiki.

The problem: Composer floods me with a bunch of nonsense downloads of vendor folders I already possess, ignoring the SBL/SESP stuff entirely.

I deleted my composer.json file in the MW root folder entirely. Then went to the command line and entered this to recreate and add to it.

composer require mediawiki/semantic-breadcrumb-links ~1.4
composer require mediawiki/semantic-extra-special-properties ~1.5

The composer.json is created with those two in it. I already have Semantic MediaWiki and its dependencies installed. The Composer Merge Plugin is in my vendor folder and is installed according to MediaWiki's instructions.

I run this command:

composer update

And then I get slammed with a bunch of talk about composer.json being read, then not being found in a Composer folder located in my Roaming folder. Then it starts spitting out Github information. And then it proceeds to download file after file, and write them to a cache in my AppData/Local/Composer/repo folder.

All of the files written already exist inside my Vendor folder in the MediaWiki installation.

I can't find anything online about the two extensions I want to install. An example of the output:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\Ghaleon>cd c:\xampp\htdocs\zephyr

c:\xampp\htdocs\zephyr>composer require mediawiki/semantic-extra-special-properties "~1.5" -vvv
Reading ./composer.json
Loading config file ./composer.json
Checked CA file C:\Users\Ghaleon\AppData\Local\Temp\composer-cacert-e78c8ab7b4432bd466e64bb942d988f6c0ac91cd785017e465bdc96d42fe9dd0.pem: valid
Executing command (C:\xampp\htdocs\zephyr): git branch --no-color --no-abbrev -v

Executing command (C:\xampp\htdocs\zephyr): git describe --exact-match --tags
Executing command (C:\xampp\htdocs\zephyr): git log --pretty="%H" -n1 HEAD
Executing command (C:\xampp\htdocs\zephyr): hg branch
Executing command (C:\xampp\htdocs\zephyr): fossil branch list
Executing command (C:\xampp\htdocs\zephyr): fossil tag list
Executing command (C:\xampp\htdocs\zephyr): svn info --xml
Failed to initialize global composer: Composer could not find the config file: C:/Users/Ghaleon/AppData/Roaming/Composer/composer.json
To initialize a project, please create a composer.json file as described in the getcomposer "Getting Started" section
Reading C:\xampp\htdocs\zephyr/vendor/composer/installed.json
Loading plugin Wikimedia\Composer\MergePlugin
Running 1.5.2 (2017-09-11 16:59:25) with PHP 7.1.7 on Windows NT / 6.1
./composer.json has been updated
Reading ./composer.json
Loading config file ./composer.json
Executing command (C:\xampp\htdocs\zephyr): git branch --no-color --no-abbrev -v

Executing command (C:\xampp\htdocs\zephyr): git describe --exact-match --tags
Executing command (C:\xampp\htdocs\zephyr): git log --pretty="%H" -n1 HEAD
Executing command (C:\xampp\htdocs\zephyr): hg branch
Executing command (C:\xampp\htdocs\zephyr): fossil branch list
Executing command (C:\xampp\htdocs\zephyr): fossil tag list
Executing command (C:\xampp\htdocs\zephyr): svn info --xml
Failed to initialize global composer: Composer could not find the config file: C:/Users/Ghaleon/AppData/Roaming/Composer/composer.json
To initialize a project, please create a composer.json file as described in the getcomposer "Getting Started" section
Reading C:\xampp\htdocs\zephyr/vendor/composer/installed.json
Loading plugin Wikimedia\Composer\MergePlugin_composer_tmp0
Loading composer repositories with package information
Downloading packages.json
Writing C:/Users/Ghaleon/AppData/Local/Composer/repo/https-- into cache
Updating dependencies (including require-dev)
Reading C:/Users/Ghaleon/AppData/Local/Composer/repo/
vider-2013.json from cache
Reading C:/Users/Ghaleon/AppData/Local/Composer/repo/
vider-2014.json from cache
Reading C:/Users/Ghaleon/AppData/Local/Composer/repo/
vider-2015.json from cache
Reading C:/Users/Ghaleon/AppData/Local/Composer/repo/
vider-2016.json from cache
Reading C:/Users/Ghaleon/AppData/Local/Composer/repo/
vider-2016-10.json from cache
Downloading provider-2017-01%24d88eddadd51717e910cd1bad2989204217e7add9448808859860c1520b3d294d.json

Oil, Love and Oxygen – Album Launch

Published 29 Sep 2017 by Dave Robertson in Dave Robertson.

“Oil, Love and Oxygen” is a collection of songs about kissing, climate change, cult 70s novels and more kissing. Recorded across ten houses and almost as many years, the album is a diverse mix of bittersweet indie folk, pop, rock and blues. The Kiss List bring a playful element to Dave Robertson’s songwriting, unique voice and percussive acoustic guitar work. This special launch night also features local music legends Los Porcheros, Dave Johnson, Sian Brown, Rachel Armstrong and Merle Fyshwick.

Tickets $15 through , or on the door if still available


Kubuntu Artful Aardvark (17.10) final Beta images now available

Published 28 Sep 2017 by valorie-zimmerman in Kubuntu.

Artful Aardvark (17.10) final Beta images are now available for testing. Help us make 17.10 the best release yet!

The Kubuntu team will be releasing 17.10 on October 19, 2017.

This is the final Beta pre-release. Kubuntu Beta pre-releases are NOT recommended for:

Kubuntu Beta pre-releases are recommended for:

Getting Kubuntu 17.10 Beta 2:

To upgrade to Kubuntu 17.10 pre-releases from 17.04, run

sudo do-release-upgrade -d

from a command line.

Download a Bootable image and put it onto a DVD or USB Drive here:

Torrents are also available.

See our release notes:

Please report any bugs on Launchpad using the commandline:

ubuntu-bug packagename

Check on IRC channels, the Kubuntu Forum, or the Kubuntu mailing lists if you don’t know the package name.

KDE bugs (bugs in Plasma or KDE applications) are still filed at

What are my options for documenting code in SVN [closed]

Published 28 Sep 2017 by SoulOfSet in Newest questions tagged mediawiki - Stack Overflow.

In my company we work on a very large Java codebase that has some JavaDoc documentation, which is fine. I would like to take a higher-level approach to it, though. In addition to the JavaDoc, I would like to start adding another form of documentation that is more accessible and contains more detail about the flow of the programs in general. I would also like to be able to include the JavaDoc data, maybe as a reference, and the headers, maybe with some custom scripted parsing.

It would be great if this could also be tied to subversion in some way. So in the documentation I can link revisions, and the actual file itself and it would be opened via svn browser interface.

I'm leaning towards something like MediaWiki. I'm just not seeing all the features I want without some involved customization which I would rather not put the time into if there are other options.

So, do I have other options? Can you point me in the right direction?

Block Storage Comes to NYC3 and LON1; One More Datacenter on the Way!

Published 27 Sep 2017 by DigitalOcean in DigitalOcean: Cloud computing designed for developers.

Today, we're excited to share that Block Storage is available to Droplets in NYC3 and LON1. With Block Storage, you can scale your storage independently of your compute and have more control over how you grow your infrastructure, enabling you to build and scale larger applications more easily. Block Storage has been a key part of our overall focus on strengthening the foundation of our platform to increase performance and enable our customers to scale.

We've seen incredible engagement since our launch last July. Users have created Block Storage volumes in SFO2, NYC1, FRA1, SGP1, TOR1, and BLR1 to scale databases, take backups, store media, and much more; NYC3 and LON1 are our seventh and eighth datacenters with Block Storage respectively.

As we continue to upgrade and augment our other datacenters, we'll be ensuring that Block Storage is added too. In order to help you plan your deployments, we've finalized the timeline for AMS3. Here is the schedule we're targeting for Block Storage rollout in 2017:

Inside LON1, our London datacenter region.

Additionally, Kubernetes now offers support for DigitalOcean Block Storage thanks to StackPointCloud. Learn more about it here.

Thanks to everyone who has given us feedback and used Block Storage so far. Please keep it coming. You can create your first Block Storage volume in NYC3 or LON1 today!

Please note: For our NYC3 region, we recommend that you add a volume at the time you create your Droplet to ensure access to Block Storage.

—DigitalOcean Storage Team


Published 27 Sep 2017 by fabpot in Tags from Twig.


Mediawiki: creating a table with automatic numeration

Published 27 Sep 2017 by Sascha R. in Newest questions tagged mediawiki - Stack Overflow.

I'm using MediaWiki and I'm presently struggling with creating a table with automatic numbering.
Right now I have about 50 pages which belong to a naming concept (let's call it) .TEST.
I'm getting a list of all my pages with this query:

{{#ask: [[Category:Server]] [[N-Segment::TEST]]  

Mediawiki outputs me a table with all pages and the attributes I want.


Additionally I desire a column, which shows me the present number in each row just like the following example:

1    Bla.TEST
2    test2.TEST
3    another.TEST
50   last.TEST

I have already searched for this but I couldn't find any suitable solution.
How can I achieve my goal?
Additional information: I'm using Mediawiki version 1.23.1.
Thank you very much.
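One possible direction (a sketch, not from the original post, and assuming the standard Semantic MediaWiki result formats are available on this wiki): SMW's `ol` result format renders query results as an ordered list, so each result is numbered automatically. It produces a list rather than a table, so a true numbered table column would still need a template-based format, but it may be enough:

```wikitext
{{#ask: [[Category:Server]] [[N-Segment::TEST]]
 |format=ol
}}
```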

Global WordPress Translation Day 3

Published 27 Sep 2017 by Hugh Lashbrooke in WordPress News.

On September 30 2017, the WordPress Polyglots Team – whose mission is to translate WordPress into as many languages as possible – will hold its third Global WordPress Translation Day: a 24-hour, round-the-clock, digital and physical global marathon dedicated to the localisation and internationalisation of the WordPress platform and ecosystem, which today powers over 28% of all existing websites.

The localisation process allows WordPress and all WordPress-related products (themes and plugins) to be available in local languages, so as to improve their accessibility and usage and to allow as many people as possible to take advantage of the free platform and services available.

In a (not completely) serendipitous coincidence, September 30 has also been declared “International Translation Day” by the United Nations, to pay homage to the great service of translators everywhere, a service that enables communication and exchange.

The event will feature a series of multi-language live sessions (training sessions, tutorials, case histories, etc.) that will be live-streamed, starting from Australia and the Far East and ending in the Western parts of the United States.

In that same 24-hour time frame, Polyglots worldwide will gather physically in local events, for dedicated training and translations sprints (and for some fun and socializing as well), while those unable to physically join their teams will do so remotely.

A big, fun, useful and enlightening party and a lovely mix of growing, giving, learning and teaching, to empower, and cultivate, and shine.

Here are some stats about the first two events:

Global WordPress Translation Day 1

Global WordPress Translation Day 2

We would like your help in spreading this news and in reaching out to all four corners of the world to make the third #WPTranslationDay a truly amazing one and to help celebrate the unique and fundamental role that translators have in the Community but also in all aspects of life.

A full press release is available, along with more information and visual assets at

For any additional information please don’t hesitate to contact the event team on

The first UK AtoM user group meeting

Published 27 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday the newly formed UK AtoM user group met for the first time at St John's College Cambridge and I was really pleased that a colleague and I were able to attend.
Bridge of Sighs in Autumn (photo by Sally-Anne Shearn)

This group has been established to provide the growing UK AtoM community with a much needed forum for exchanging ideas and sharing experiences of using AtoM.

The meeting was attended by about 15 people though we were informed that there are nearly 50 people on the email distribution list. Interest in AtoM is certainly increasing in the UK.

As this was our first meeting, those who had made progress with AtoM were encouraged to give a brief presentation covering the following points:
  1. Where are you with AtoM (investigating, testing, using)?
  2. What do you use it for? (cataloguing, accessions, physical storage locations)
  3. What do you like about it/ what works?
  4. What don’t you like about it/ what doesn’t work?
  5. How do you see AtoM fitting into your wider technical infrastructure? (do you have separate location or accession databases etc?)
  6. What unanswered questions do you have?
It was really interesting to find out how others are using AtoM in the UK. A couple of attendees had already upgraded to the new 2.4 release so that was encouraging to see.

I'm not going to summarise the whole meeting but I made a note of people's likes and dislikes (questions 3 and 4 above). There were some common themes that came up.

Note that most users are still using AtoM 2.2 or 2.3, those who have moved to 2.4 haven't had much chance to explore it yet. It may be that some of these comments are already out of date and fixed in the new release.

What works?

AtoM seems to have lots going for it!

The words 'intuitive', 'user friendly', 'simple', 'clear' and 'flexible' were mentioned several times. One attendee described some user testing she carried out during which she found her users just getting on and using it without any introduction or explanation! Clearly a good sign!

The fact that it was standards compliant was mentioned as well as the fact that consistency was enforced. When moving from unstructured finding aids to AtoM it really does help ensure that the right bits of information are included. The fact that AtoM highlights which mandatory fields are missing at the top of a page is really helpful when checking through your own or others records.

The ability to display digital images was highlighted by others as a key selling point, particularly the browse by digital objects feature.

The way that different bits of the AtoM database interlink was a plus point that was mentioned more than once - this allows you to build up complex interconnecting records using archival descriptions and authority records and these can also be linked to accession records and a physical location.

The locations section of AtoM was thought to be 'a good thing' - for recording information about where in the building each archive is stored. This works well once you get your head around how best to use it.

Integration with Archivematica was mentioned by one user as being a key selling point for them - several people in the room were either using, or thinking of using Archivematica for digital preservation.

The user community itself and the quick and helpful responses to queries posted on the user forum were mentioned by more than one attendee. Also praised was the fact that AtoM is in continuous active development and very much moving in the right direction.

What doesn't work?

Several attendees mentioned the digital object functionality in AtoM. As well as being a clear selling point, it was also highlighted as an area that could be improved. The one-to-one relationship between an archival description and a digital object wasn't thought to be ideal and there was some discussion about linking through to external repositories - it would be nice if items linked in this way could be displayed in the AtoM image carousel even where the url doesn't end in a filename.

The typeahead search suggestions when you enter search terms were not thought to be helpful all of the time. Sometimes the closest matches do not appear in the list of suggested results.

One user mentioned that they would like a publication status that is somewhere in between draft and published. This would be useful for those records that are complete and can be viewed internally by a selected group of users who are logged in but are not available to the wider public.

More than one person mentioned that they would like to see a conservation module in AtoM.

There was some discussion about the lack of an audit trail for descriptions within AtoM. It isn't possible to see who created a record, when it was created and information about updates. This would be really useful for data quality checking, particularly when training new members of staff and volunteers.

Some concerns about scalability were mentioned - particularly for one user with a very large number of records within AtoM - the process of re-indexing AtoM can take three days.

When creating creator or access points, the drop down menu doesn’t display all the options so this causes difficulties when trying to link to the right point or establishing whether the desired record is in the system or not. This can be particularly problematic for common surnames as several different records may exist.

There are some issues with the way authority records are created currently, with no automated way of creating a unique identifier and no ability to keep authority records in draft.

A comment about the lack of auto-save and the issue of the web form timing out and losing all of your work seemed to be a shared concern for many attendees.

Other things that were mentioned included an integration with Active Directory and local workarounds that had to be put in place to make finding aids bi-lingual.

Moving forward

The group agreed that it would be useful to keep a running list of these potential areas of development for AtoM and that perhaps in the future members may be able to collaborate to jointly sponsor work to improve AtoM. This would be a really positive outcome for this new network.

I was also able to present on a recent collaboration to enable OAI-PMH harvesting of EAD from AtoM and use it as an opportunity to try to drum up support for further development of this new feature. I had to try and remember what OAI-PMH stood for and think I got 83% of it right!

Thanks to St John's College Cambridge for hosting. I look forward to our next meeting which we hope to hold here in York in the Spring.

How to send web pages for which explicit files do not exist

Published 27 Sep 2017 by SlipperyPete in Newest questions tagged mediawiki - Stack Overflow.

How do CMSes such as MediaWiki, Drupal, WordPress etc. display the correct pages when a URL points to a directory/file which doesn't exist?

To clarify, if I go to the URL, there is no directory /wiki/Example on Wikipedia's server; instead MediaWiki creates the page from templates and information in databases etc. I'm asking how the CMS "hijacks" the request for that directory/file in order to send its own page back rather than a 404.

I'm asking with regards to php as that's what I'm using and what most CMS's seem to be primarily based on.
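The usual mechanism (a sketch, not from the original question) has two parts: the web server is configured to route every request for a file or directory that does not exist on disk to a single PHP entry point (a "front controller"), and that script then looks the requested path up in the database and renders the page. On Apache, the WordPress-style rewrite rules look like this:

```apache
# .htaccess: send requests for non-existent files/directories to index.php
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
```

index.php then inspects $_SERVER['REQUEST_URI'] to decide what content to build and return, which is why no physical /wiki/Example directory is ever needed.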

Hacktoberfest 2017: The Countdown Begins!

Published 26 Sep 2017 by Stephanie Morillo in DigitalOcean: Cloud computing designed for developers.

Contributors of the world, we’re excited to announce that DigitalOcean’s fourth annual Hacktoberfest officially kicks off on Sunday, October 1. If you’ve been meaning to give back to your favorite open source projects—or if you want to make your first-ever contributions—set aside time this October to start hacking. You can earn a limited-edition Hacktoberfest T-shirt and stickers!

This year, we have resources available on local Hacktoberfest Meetups (and how to start one), finding issues to work on, learning how to contribute to open source, and resources for project maintainers who want to attract participants to their projects. You can find all of these resources and register to participate on the official Hacktoberfest website.

The Details

If you’re wondering what Hacktoberfest is, it’s a month-long celebration of all things open source. Here’s what you need to know:

Over the course of the month, you can find new projects to work on from the Hacktoberfest site. Every time you visit the site, you'll see issues labeled "Hacktoberfest". Additionally, we’ll send registered participants digests with resources and projects that you can look at if you need ideas.

The Fine Print

To get a free T-shirt, you must register and make four pull requests between October 1-31. You can open a PR in any public, GitHub-hosted repo—not just on issues that have been labeled “Hacktoberfest”.

(Please note: Review a project’s Code of Conduct before submitting a PR. If a maintainer reports your PR as spam, or if you violate the project’s Code of Conduct, you will be ineligible to participate in Hacktoberfest.)

Mark Your Calendars

With just four days to go until Hacktoberfest 2017 gets underway, take a look at what Hacktoberfest 2016 and Hacktoberfest 2015 looked like.

Have you participated in Hacktoberfest before? If so, share some of your stories or tips for newcomers in the comments below. If you have favorite projects, or if you’re a project maintainer, tell us what projects participants should visit in the comments. And be sure to see what others are saying in the #Hacktoberfest hashtag on your favorite social media platforms!

See you all on October 1!

Book review: Religion and Atheism

Published 26 Sep 2017 by in New Humanist Articles and Posts.

This collection of thought-provoking essays shows that dividing lines often cut in different ways.

W3C Executive Forum – a discussion on the Future of the Web for YOU!

Published 26 Sep 2017 by J. Alan Bird in W3C Blog.

Thirty years ago the Web changed how you do business. We’re at another inflection point today with new emerging technologies on the Web, and new Web application areas that are already revolutionizing many business models.  Among these are the names you would recognize in Digital Publishing, FinTech, Automotive, Telco, Smart Manufacturing and Entertainment.

As the work in these areas continues to solve real-world concerns it has become apparent to the W3C Leadership that many of the senior executives in these industries don’t have a sufficient background to fully appreciate how the continued evolution of the Web will positively impact their technology and organizational strategies.

W3C has created an event designed to provide value and insight to executives in these industries with a goal of providing rich content and some food for thought.   W3C is pleased to announce the inaugural W3C Executive Forum, which will be co-located with the annual W3C TPAC event, in Burlingame, CA on 08 November 2017.  The Executive Forum event will have three great Industry panels with different themes and the capstone will be an interview of the Inventor of the Web and W3C Director, Sir Tim Berners-Lee.

Our kickoff panel will discuss the Future of Payments on the Web.  Moderated by Worldpay’s Nick Telford-Reed, the panel includes Sang Ahn, Vice President and General Manager, Samsung Pay;  Souheil Badran, President, Alipay North America; and Karen Czack, Vice President of Industry Engagement and Regulation at American Express.  Quite a collection of diverse opinions and markets – this should be an insightful dialogue!

As automobiles become smarter, so do our cities.  We’ve got Steve Crumb, Executive Director, GENIVI moderating our second panel – Connected Cars, Cities and the Web!  In order to represent all aspects of this hot topic, we’ve invited Greg Bohl, Vice President HARMAN Connected Services; Patricia (Patti) Robb, Head of the Silicon Valley Innovation Center, Intel Corporation; and Tina Quigley, Regional Transportation Commission of Southern Nevada to participate on the panel.

Many of you have heard or seen a presentation by Mark Pesce, Futurist and Web 3D pioneer.   We’re excited that he’s agreed to moderate the Emerging Technologies panel with experts Stina Ehrensvard, CEO & Founder, Yubico; Sean White, Senior Vice President of Emerging Technologies, Mozilla; and Megan Lindsay, Product Manager WebVR, Google. They will consider new forms on the Web, such as virtual and mixed reality and voice interfaces as well as user experiences and security.

Each of these sessions is designed to provide thought provoking insight and we expect there to be time for interactive engagement of the audience so bring your questions to participate in the conversations.

In addition to these three exciting panels, we’re going to wrap up the day by having Brad Stone of Bloomberg engage Sir Tim Berners-Lee in a discussion on the overall future of the Web.  This is a rare chance to hear Sir Tim’s thoughts on the overall future of the Web and the role of Industry in that evolution.

We’ve included a couple of networking sessions with the event as well – one with just the attendees at a break, and one with the broader W3C Community after the event over drinks and dinner.

If you’re interested in more information, you can find that at or by contacting J. Alan Bird, Global Business Development Leader at

We look forward to seeing you in Burlingame this November and use #W3CForum to spread the word about this great event!

Kubuntu Artful Aardvark (17.10) Beta 2 testing

Published 25 Sep 2017 by valorie-zimmerman in Kubuntu.

Artful Aardvark (17.10) Beta 2 images are now available for testing.

The Kubuntu team will be releasing 17.10 in October. The final Beta 2 milestone will be available on September 28.

This is the first spin in preparation for the Beta 2 pre-release. Kubuntu Beta pre-releases are NOT recommended for:

Kubuntu Beta pre-releases are recommended for:

Getting Kubuntu 17.10 Beta 2:

To upgrade to Kubuntu 17.10 pre-releases from 17.04, run sudo do-release-upgrade -d from a command line.

Download a Bootable image and put it onto a DVD or USB Drive via the download link at This is also the direct link to report your findings and any bug reports you file.

See our release notes:

Please report your results on the Release tracker.

MediaWiki: It's not possible to create user account using LDAP Authentication extension

Published 25 Sep 2017 by Ana Carvalho in Newest questions tagged mediawiki - Stack Overflow.

Not all users in LDAP are authorized to have a user account in my MediaWiki. I already have users logging in because I created their accounts before installing the LDAP plugin. Now I need to create accounts for new employees, and through Special:CreateAccount I always receive the message "Username entered already in use. Please choose a different name."

Obviously, if I disable all LDAP configuration in LocalSettings, I'm able to create a local user account with the same LDAP username. Then, if I enable the LDAP configuration again, the user is recognized with the LDAP password and can log in. The fact is that I don't want to edit LocalSettings every time I have a new employee.

MediaWiki: 1.29.1

PHP: 5.5.21 (apache2handler)

PostgreSQL: 9LDAP

My configuration is below. Thanks in advance.

require_once ('.../extensions/LdapAuthentication/LdapAuthentication.php');

$wgAuth = new LdapAuthenticationPlugin();

$wgLDAPDomainNames = array( 'AD' );

$wgLDAPServerNames = array( 'AD' => 'url' );

$wgLDAPUseLocal = false;

$wgLDAPEncryptionType = array( 'AD' => 'clear' );

$wgLDAPPort = array( 'AD' => 389 );

$wgLDAPProxyAgent = array( 'AD' => 'CN=a,OU=b,DC=c,DC=d' );

$wgLDAPProxyAgentPassword = array( 'UFPE-AD' => 'password' );

$wgLDAPSearchAttributes = array( 'AD' => 'description' );

$wgLDAPBaseDNs = array( 'AD' => 'DC=c,DC=d' );

$wgLDAPDisableAutoCreate = array( 'AD' => true );

$wgLDAPPreferences = array( 'AD' => array( 'email' => 'mail', 'realname' => 'cn','nickname' => 'givenname') );

$wgLDAPLowerCaseUsername = array( 'AD' => true);

$wgGroupPermissions['*']['createaccount'] = false;

Announcing DigitalOcean Currents: A Quarterly Report on Developer Cloud Trends

Published 24 Sep 2017 by Ryan Quinn in DigitalOcean: Cloud computing designed for developers.

The landscape developers work in is ever-changing. Keeping up means following numerous press sources, blogs, and social media sites and joining the communities they are involved in. We decided that the best way to truly understand how developers work and how the tools we build help them was to ask—so we did!

DigitalOcean Currents is our inaugural quarterly report where we will share insights into how developers work and how they are affected by the latest tech industry trends. To get the data for this report, we reached out to developers through social media and our newsletter, our community, social news aggregators like Reddit, and more. We collected opinions from more than 1,000 developers across the world and company sectors.

Among the many insights we gained from the survey, we found that developers rely on online documentation and tutorials more than any other method of learning about new technologies. Will this continuing trend mean developers have a wider or narrower base of knowledge (as bite-sized pieces of technical content displace lengthy books)?

Despite the tech industry’s important focus on maintaining a good work-life balance, only 12% of the developers we surveyed reported that they put the keyboard away at home; many opt to use their free time to write code for work or for personal projects. While developers are often passionate about their work, this result indicates that developers may be more likely to face burnout even when working for employers who make work-life balance a focus.

Here are other key findings from the first report:

Helpful Companies and Projects Make Developers Successful

52% of respondents said their preferred way of learning about new technologies is through online tutorials, and 28% said official documentation is their preferred way of learning. This appears to indicate that companies who invest in great documentation see a payoff in developer loyalty.

Linux Marketshare is More Than Meets the Eye

While recent market share numbers show Linux rising to just over 3% of the desktop market, this number may be misleading. Instead of simply asking our respondents which desktop they used, we asked which operating system environment they spent most of their time using. 39% of respondents reported spending more time in a Linux environment than elsewhere, outpacing both macOS (36%) and Windows (23%).

PHP and MySQL Still Reign Supreme

Despite all the buzz about the latest and greatest languages and frameworks, PHP remains the most popular language among our respondents with MySQL as the most popular database. Meanwhile, Nginx far outpaced Apache as the preferred web server.

Vendor Lock-in Does Not Scare Developers

With the rise of SaaS, PaaS, and IaaS over legacy hosting platforms, vendor lock-in could be a concern. But with modern APIs and interoperability either directly or indirectly available through multi-vendor libraries, 77% of our respondents said that they’ve never decided not to use a cloud service for fear of being locked into that vendor’s ecosystem.

Moving to Hybrid and Multi-cloud Isn’t a Given

According to Gartner, 90% of organizations will adopt hybrid infrastructure management by 2020, but the majority of survey respondents said they aren’t planning to use simultaneous cloud services in the next year; only 15% said they would consider their current strategy a hybrid cloud strategy. Just 10% of respondents said they would consider their current strategy multi-cloud, and 70% said they have no plans to implement a multi-cloud strategy in the next year.

The full DigitalOcean Currents (September 2017) report can be found here.

The tech industry moves fast and the cutting edge moves even faster. In order to bring you the most recent information, DigitalOcean Currents will be shared every quarter, highlighting the latest trends among developers.

If you would like to be among the first to receive Currents each quarter, sign up here. You’ll receive the latest report each time it is released and will be among those asked to share your views and experiences.

Making a killing: examining the arms trade

Published 24 Sep 2017 by in New Humanist Articles and Posts.

An examination of Britain’s arms trade tells us who we are as a country and what our role in the world really is.

A Member Perspective on Publishing@W3C

Published 22 Sep 2017 by Bill McCoy in W3C Blog.

Bill Kasdorf of Apex Covantage has long been a major contributor to the establishment of publishing industry standards and best practices. His work also uniquely bridges the disparate fields of publishing, from EDU to ebooks to scholarly to magazines and news. Bill has just written an important blog post providing his personal perspective on the activities under the Publishing@W3C umbrella, concluding that the convergence is well underway.

You’ll hear much more about the future of  publishing and the Open Web Platform at the upcoming W3C Publishing Summit Nov 9-10 in San Francisco (among his copious volunteer contributions, Bill Kasdorf is also the program chair for this event). Register now to ensure your seat at what is expected to be a sell-out conference.

Moving a proof of concept into production? It's harder than you might think...

Published 20 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

My colleagues and I blogged a lot during the Filling the Digital Preservation Gap project but I’m aware that I’ve gone a bit quiet on this topic since…

I was going to wait until we had a big success to announce, but follow on work has taken longer than expected. So in the meantime here is an update on where we are and what we are up to.


Just to re-cap, by the end of phase 3 of Filling the Digital Preservation Gap we had created a working proof of concept at the University of York that demonstrated that it is possible to create an automated preservation workflow for research data using PURE, Archivematica, Fedora and Samvera (then called Hydra!).

This is described in our phase 3 project report (and a detailed description of the workflow we were trying to implement was included as an appendix in the phase 2 report).

After the project was over, it was agreed that we should go ahead and move this into production.

Progress has been slower than expected. I hadn’t quite appreciated just how different a proof of concept is to a production-ready environment!

Here are some of the obstacles we have encountered (and in some cases overcome):

Error reporting

One of the key things that we have had to build in to the existing code in order to get it ready for production is error handling.

This was not a priority for the proof of concept. A proof of concept is really designed to demonstrate that something is possible, not to be used in earnest.

If errors happen and things stop working (which they sometimes do) you can just kill it and rebuild.

In a production environment we want to be alerted when something goes wrong so we can work out how to fix it. Alerts and errors are crucial to a system like this.

We are sorting this out by enabling Archivematica's own error handling and error catching within Automation Tools.
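The pattern we are after can be sketched in a few lines of Python (an illustration only: the function and alert-hook names here are hypothetical, not the project's actual Automation Tools code). The point is that a failing transfer step should be caught, logged, and surfaced as an alert rather than silently killing the pipeline:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("transfers")

def run_with_alert(task, alert, *args, **kwargs):
    """Run one transfer step; on failure, log it and fire an alert
    instead of letting the whole pipeline die silently."""
    try:
        return task(*args, **kwargs)
    except Exception as exc:
        logger.error("Transfer step failed: %s", exc)
        alert(str(exc))  # e.g. send an email or post to a chat channel
        return None

# Usage sketch: a failing step triggers the alert callback.
alerts = []
result = run_with_alert(lambda: 1 / 0, alerts.append)
```

In production the alert callback would be wired to email or monitoring, so an operator finds out about a failed transfer without having to notice a stalled queue.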

What happens when something goes wrong?

...and of course once things have gone wrong in Archivematica and you've fixed the underlying technical issue, you then need to deal with any remaining problems with your information packages in Archivematica.

For example, if the problems have resulted in failed transfers in Archivematica then you need to work out what you are going to do with those failed transfers. Although it is (very) tempting to just clear out Archivematica and start again, colleagues have advised me that it is far more useful to actually try and solve the problems and establish how we might handle a multitude of problematic scenarios if we were in a production environment!

So we now have scenarios in which an automated transfer has failed, and in order to get things moving again we need to carry out a manual transfer of the dataset into Archivematica. Will the other parts of our workflow still work if we intervene in this way?

One issue we have encountered along the way is that though our automated transfer uses a specific 'datasets' processing configuration that we have set up within Archivematica, when we push things through manually it uses the 'default' processing configuration which is not what we want.

We are now looking at how we can encourage Archivematica to use the specified processing configuration. As described in the Archivematica documentation, you can do this by including an XML file describing your processing configuration within your transfer.
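As a minimal sketch of this approach (the directory name is illustrative, and the stub XML stands in for a real configuration exported from the Archivematica dashboard), a manual transfer can carry its own processing configuration by including a file named processingMCP.xml at its top level, which is the file name Archivematica looks for according to its documentation:

```python
from pathlib import Path

# Stage a transfer directory that carries its own processing configuration.
# Archivematica applies the processingMCP.xml found at the top level of a
# transfer instead of falling back to the 'default' configuration.
transfer = Path("my-dataset-transfer")
transfer.mkdir(exist_ok=True)

# Stub content only; a real file would be exported from the dashboard
# (Administration > Processing configuration).
stub_config = "<processingMCP><preconfiguredChoices/></processingMCP>\n"
(transfer / "processingMCP.xml").write_text(stub_config)

print(sorted(p.name for p in transfer.iterdir()))  # ['processingMCP.xml']
```

With the file in place, the manual transfer should pick up the intended choices rather than the defaults.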

It is useful to learn lessons like this outside of a production environment!

File size/upload

Although our project recognised that there would be a limit to the size of dataset that we could accept and process with our application, we didn't really bottom out what size of dataset we intended to support.

It has now been agreed that we should reasonably expect the data deposit form to accept datasets of up to 20 GB in size. Anything larger than this would need to be handled in a different way.

Testing the proof of concept in earnest showed that it was not able to handle datasets of over 1 GB in size. Its primary purpose was to demonstrate the necessary integrations and workflow, not to handle larger files.

Additional (and ongoing) work was required to enable the web deposit form to work with larger datasets.


In testing the application we of course ended up trying to push some quite substantial datasets through it.

This was fine until everything abruptly seemed to stop working!

The problem was actually a fairly simple one but because of our own inexperience with Archivematica it took a while to troubleshoot and get things moving in the right direction again.

It turned out that we hadn’t allocated enough space in one of the bits of filestore that Archivematica uses for failed transfers (/var/archivematica/sharedDirectory/failed). This had filled up and was stopping Archivematica from doing anything else.

Once we knew the cause of the problem, the available space was increased, but everything soon ground to a halt again. Increasing the space had got things moving, but while we were trying to demonstrate that it wasn't working we had deposited several further datasets; these were waiting in the transfer directory and quickly filled the extra space.

On a related issue, one of the test datasets I had been using to see how well Research Data York could handle larger datasets consisted of c.5 GB of data: around 2,000 JPEG images. Of course one of the default normalisation tasks in Archivematica is to convert all of these JPEGs to TIFF.

Once this collection of JPEGs was converted to TIFF, the size of the dataset increased to around 80 GB. Until I witnessed this it hadn't really occurred to me that this could cause problems.
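The arithmetic is worth doing up front when budgeting filestore. A throwaway estimator (the 16x default is simply what we observed on this one dataset; real ratios will vary with image dimensions, content and compression settings):

```python
def estimated_normalised_gb(source_gb, expansion_factor=16):
    """Rough filestore budget after JPEG -> TIFF normalisation.

    The default expansion factor reflects one observed case
    (c.5 GB of JPEGs became c.80 GB of TIFFs); treat it as a
    starting point, not a constant.
    """
    return source_gb * expansion_factor

print(estimated_normalised_gb(5))  # 80
```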

The solution - allocate Archivematica much more space than you think it will need!

We also now have the filestore set up so that it will inform us when the space in these directories gets to 75% full. Hopefully this will allow us to stop the filestore filling up in the future.
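A minimal sketch of such a check, assuming the path and threshold are your local choices (how the warning is actually delivered, e.g. by mail or a monitoring agent, is omitted):

```python
import shutil

def usage_percent(path):
    """Percentage of the filesystem holding `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return 100 * used / total

def check_filestore(path, threshold=75):
    """Return a warning string once usage crosses the threshold, else None."""
    pct = usage_percent(path)
    if pct >= threshold:
        return f"WARNING: {path} is {pct:.0f}% full"
    return None

# e.g. point this at /var/archivematica/sharedDirectory on the server
print(check_filestore("/"))  # None unless the root filesystem is >= 75% full
```

Run from cron (or wired into existing monitoring), this catches the filestore before it fills up and blocks Archivematica entirely.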


The proof of concept did not undergo rigorous testing - it was designed for demonstration purposes only.

During the project we thought long and hard about the deposit, request and preservation workflows that we wanted to support, but we were always aware that once we had it in an environment that we could all play with and test, additional requirements would emerge.

As it happens, we have discovered that the workflow implemented is very true to that described in the appendix of our phase 2 report and does meet our needs. However, there are lots of bits of fine tuning required to enhance the functionality and make the interface more user friendly.

The challenge here is to try to carry out the minimum of work required to turn it into an adequate solution to take into production. There are so many enhancements we could make – I have a wish list as long as my arm – but until we better understand whether a local solution or a shared solution (provided by the Jisc Research Data Shared Service) will be adopted in the future it is not worth trying to make this application perfect.

Making it fit for production is the priority. Bells and whistles can be added later as necessary!

My thanks to all those who have worked on creating, developing, troubleshooting and testing this application and workflow. It couldn't have happened without you!

W3C hearing the concerns the EME recommendation raises

Published 20 Sep 2017 by Coralie Mercier in W3C Blog.

We are hearing a great deal of anger, concern and disagreement: in the press, in e-mail we’ve received, and in responses on social media. We have been reading and hearing assertions that we made a mistake when we published Encrypted Media Extensions (EME) as a W3C Recommendation.

A lot has been written about EME in the many years the work has been conducted at the W3C, in more or less sourced and more or less unbiased articles. Among those writings are the materials we prepared, the stories we shared with the media, the Blog posts our Director Tim Berners-Lee and CEO Jeff Jaffe have written. So much has been written that I wonder how much, and what has been read and heard.

We hear furious feedback from some of our community, and we are sorry: sorry for those who feel this way about this contentious topic, for those we failed to convince, and for those whose trust we lost; trust that we hope and believe we have worked many years to earn, and that we will work to regain in the future, although we know many viewpoints are unlikely to change. We also hear those eager to get back to their non-EME interactions with the W3C, whether they support or oppose EME.

Relevant materials

The W3C leadership has been accused of overriding objections and of obstruction

At W3C all decisions are informed by discussions amongst the W3C membership and the general public. The recent vote was not a decision of the W3C management but a vote of the W3C Members themselves.

Our leaders have led and facilitated a very divisive debate, one that spans the W3C Membership and the Web community but also touches society at large. To quote from a blog post our CEO published Monday: “DRM has been used for decades prior to the EME debate. But in recent years it is a credit to the world wide web that the web has become the delivery vehicle for everything, including movies. Accordingly it was inevitable that we would face issues of conflicting values and the appropriate accommodations for commercial use of the web.”

W3C is a technical standards body, and thus the debate helped improve the specification in areas of security, privacy, and accessibility.

The W3C is accused of disowning its mission statement and principles, even of greed

We take our principles to heart, and greed is far from being among them. W3C is a Membership organization, which means that Members pay a fee to contribute to setting the agenda of the Open Web Platform, to participate in W3C technical work and to send engineers to conduct that work, but our Process is built around community work, and throughout the standardization track our Process ensures our work is reviewed.

Refer to section “Wide Review” of the Process document in particular:
The objective is to ensure that the entire set of stakeholders of the Web community, including the general public, have had adequate notice of the progress of the Working Group, and were able to actually perform reviews of and provide comments on the specification.
A second objective is to encourage groups to request reviews early enough that comments and suggested changes can still be reasonably incorporated in response to the review.
Before approving transitions, the Director will consider who has been explicitly offered a reasonable opportunity to review the document, who has provided comments, the record of requests to and responses from reviewers, and seek evidence of clear communication to the general public about appropriate times and which content to review and whether such reviews actually occurred.

Wide Review is a requirement and such review needs to be demonstrated as part of W3C work to proceed through the various maturity levels.

The W3C is accused of keeping votes secret, and not being transparent

The W3C follows the W3C Process Document which describes the organizational structure of the W3C and the processes related to the responsibilities and functions they exercise to enable W3C to accomplish its mission to lead the Web to its full potential.

Furthermore, W3C has co-signed with the IEEE, Internet Architecture Board (IAB), Internet Engineering Task Force (IETF), and Internet Society a statement affirming the importance of a jointly developed set of principles establishing a modern paradigm for global, open standards.

Also, W3C is a level playing field. No matter how much big companies at W3C are worth, they have exactly the same vote, exactly the same voice, as a small start-up, university or advocacy group: each member gets one vote.

Consider these two facts: our statutes provide for member confidentiality, and the process by which W3C Members can appeal certain decisions was used for the very first time and concluded last week.

In the case at hand, there were several milestones where the W3C Members expressed their preference on EME, through the following decision-making and voting:

Reviews and votes last about four weeks under the Process (although the appeal process isn’t yet very specific, so we followed the existing process for votes). Reviews are governed by consensus while votes are governed by majority, and in both cases all W3C Members had the same options for visibility of their responses:

Screenshot of the options available to W3C Members regarding visibility of their responses

Those options regarding visibility of W3C Member responses have been available since December 2014, further to W3C Member request, for all calls for review of proposed work and proposed recommendations.

Lastly, although our practice had been not to share numerical results, we informed our Members that, given the controversy, we would share this information for the appeal vote that was about to end. Our long-standing practice has been to respect member confidentiality, yet we still shared the vote totals; and while we did not expect any particular recognition, we never anticipated being accused of abandoning consensus, or being blamed because no member chose to make their responses publicly visible.

EME is a framework for DRM implementation, not DRM itself

Many have protested EME, or W3C making a “DRM standard”. Implying that EME is DRM is false: EME is an API to DRM, and is agnostic of the DRM behind it. Using “DRM” in place of “EME” furthers the wrong assumption that W3C controls DRM. W3C does not create, standardize or mandate DRM directly. That is out of scope, and is stated clearly in the specification itself, the charter of the working group that conducted the work, and our public communication and materials.

DRM exists whether the W3C as a Consortium or a Team wants it or not.

It is equally wrong to assume that browsers would not implement DRM without EME; the alternatives are closed, dedicated applications, which media outlets would insist upon for access to their content.

What EME achieves is that, by allowing the decryption to take place in the browser, users benefit from the security, privacy and accessibility that are built into the Open Web Platform.

Furthermore, because all of the functionality involved is provided by the HTML specification or its extensions, current and future security, privacy and accessibility enhancements of the Open Web Platform can be leveraged.

W3C is accused of betraying the free and open Web

EME is an extension of the Open Web Platform. 49 W3C groups contribute to grow and improve the Open Web Platform, enabling W3C to pursue its mission to lead the Web to its full potential through the creation of Web standards, guidelines, and supporting materials.

233 other specifications are under active development. You can see all the other areas W3C is actively involved in. We tackle every aspect of the Web. We ensure the long-term growth of the Web. We care to make the benefits of the Web available to all people, whatever their hardware, software, network infrastructure, native language, culture, geographical location, or physical or mental ability.

We have heard from some that we have not listened to our community

The steps leading to EME becoming a Recommendation involved several rounds through which W3C Members were consulted, culminating in a vote last week, but wide review and comments from the public are built into the W3C Process. While, as our CEO Jeff Jaffe noted, we know that many people are not satisfied with the result, we feel we took the appropriate time to have a respectful debate about a complex set of issues and provide a result that will improve the Web for its users.

We have heard your concerns and anger and so we have tried to clarify some misconceptions and explain the process and rationales in these decisions.

We understand and admire the passion we’ve seen about the open Web and the role of the W3C in the world, and while there may be points where not everyone always agrees on the methods, please know that we at the W3C also care deeply and passionately about the Web, about its users and about our community.

While this has been a time of some disappointment, passion and confusion, we continue to reach out to you, our community, to clarify and engage – to continue to have discussions and to hear from you on important issues that face the Web.

Call for design: Artful Banner for website

Published 19 Sep 2017 by valorie-zimmerman in Kubuntu.

Kubuntu 17.10 — code-named Artful Aardvark — will be released on October 19th, 2017. We need a new banner for the website, and invite artists and designers to submit designs to us based on the Plasma wallpaper and perhaps the mascot design.

The banner is a 1500×385 SVG.

Please submit designs to the Kubuntu-devel mail list.

Introducing Spaces: Scalable Object Storage on DigitalOcean

Published 19 Sep 2017 by John Gannon in DigitalOcean: Cloud computing designed for developers.

Today we’re excited to announce the launch of DigitalOcean Spaces: a simple, standalone object storage service that enables developers to store and serve any amount of data with automatic scalability, performance, and reliability.

Object storage has been one of the products our community has requested most. When we embarked on developing a scalable storage product that is abstracted from compute resources, we realized we had an opportunity to refactor and improve how developers solve this problem today.


We believe in simplifying our products to enable developers to build great software. To do that, we look at every opportunity to remove friction from the development process including spending less time estimating costs associated with storage, transfer, number of requests, pricing tiers, and regional pricing.

Spaces is available for a simple $5 per month price and includes 250GB of storage and 1TB of outbound bandwidth. There are no costs per request and additional storage is priced at the lowest rate available: $0.01 per GB transferred and $0.02 per GB stored. Uploads are free.

Spaces provides cost savings of up to 10x along with predictable pricing and no surprises on your monthly bill.

To make it easy for anyone to try, we are offering a 2 month free trial.

Scales with Your Data

Spaces is designed to scale automatically; as your application data grows, you won't need to worry about scaling any storage infrastructure. Although your Space can be configured to be accessed from anywhere, we realize that some customers prefer to keep their data close to their customers or to their own compute nodes.

To that end, Spaces is available in our NYC3 region as of today, and will be rolled out in AMS3 before the end of 2017. More regions will follow in early 2018—stay tuned for future updates.

Designed for Developers

Our goal was to simplify the essential components of object storage into a clean design. We tested several designs with developers to ensure Spaces was easy to use and manage with deployed applications. With Spaces, you can:

You can use your favorite storage management tools and libraries with Spaces. A large ecosystem of S3-compatible tools and libraries can be used to manage your Space. (We’ve published articles about some of these tools on our Community site; find the links in the “Getting Started” section below.)
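For example, most S3-compatible clients only need the regional endpoint plus your Spaces access keys. A tiny helper for building the endpoint URL (the "nyc3" slug matches the region named above; how your particular client addresses buckets against that endpoint may vary):

```python
def spaces_endpoint(region):
    """Regional endpoint for DigitalOcean Spaces, e.g. region 'nyc3'."""
    return f"https://{region}.digitaloceanspaces.com"

# An S3-compatible library (boto3, s3cmd, etc.) would be pointed at this
# endpoint together with a Spaces access key and secret.
print(spaces_endpoint("nyc3"))  # https://nyc3.digitaloceanspaces.com
```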

Secure, Reliable, and Performant

Files you store in Spaces are encrypted on physical disks with 256-bit AES-XTS full-disk encryption. In addition, you can encrypt files with your own keys before uploading them to Spaces. You can limit access to Spaces and the files within using your Spaces API key(s) and permissioning.

Files stored in Spaces are distributed using a fault-tolerant placement technique called erasure coding. Spaces can tolerate multiple host failures without blocking any client I/O or experiencing any data loss.

Spaces is designed to provide high availability for storing and serving web assets, media, backups, log files, and application data. At DigitalOcean, we use Spaces for a variety of applications, including serving web assets (HTML, images, JS) and backing up data critical to our business. During the early access period, thousands of users stored millions of objects, and Spaces performed as expected with low latency and high throughput.

Getting Started

Almost 90,000 developers and businesses signed up to try Spaces during early access. Find out more about how your application could use Spaces for cost effective and scalable object storage by reading these articles and tutorials:


API Documentation

Command-Line Clients

GUI Clients

We’ll be adding new features and regions over the coming months and look forward to hearing your feedback!

WordPress 4.8.2 Security and Maintenance Release

Published 19 Sep 2017 by Aaron D. Campbell in WordPress News.

WordPress 4.8.2 is now available. This is a security release for all previous versions and we strongly encourage you to update your sites immediately.

WordPress versions 4.8.1 and earlier are affected by these security issues:

  1. $wpdb->prepare() can create unexpected and unsafe queries leading to potential SQL injection (SQLi). WordPress core is not directly vulnerable to this issue, but we’ve added hardening to prevent plugins and themes from accidentally causing a vulnerability. Reported by Slavco
  2. A cross-site scripting (XSS) vulnerability was discovered in the oEmbed discovery. Reported by xknown of the WordPress Security Team.
  3. A cross-site scripting (XSS) vulnerability was discovered in the visual editor. Reported by Rodolfo Assis (@brutelogic) of Sucuri Security.
  4. A path traversal vulnerability was discovered in the file unzipping code. Reported by Alex Chapman (noxrnet).
  5. A cross-site scripting (XSS) vulnerability was discovered in the plugin editor. Reported by 陈瑞琦 (Chen Ruiqi).
  6. An open redirect was discovered on the user and term edit screens. Reported by Yasin Soliman (ysx).
  7. A path traversal vulnerability was discovered in the customizer. Reported by Weston Ruter of the WordPress Security Team.
  8. A cross-site scripting (XSS) vulnerability was discovered in template names. Reported by Luka (sikic).
  9. A cross-site scripting (XSS) vulnerability was discovered in the link modal. Reported by Anas Roubi (qasuar).

Thank you to the reporters of these issues for practicing responsible disclosure.

In addition to the security issues above, WordPress 4.8.2 contains 6 maintenance fixes to the 4.8 release series. For more information, see the release notes or consult the list of changes.

Download WordPress 4.8.2 or venture over to Dashboard → Updates and simply click “Update Now.” Sites that support automatic background updates are already beginning to update to WordPress 4.8.2.

Thanks to everyone who contributed to 4.8.2.

How do you deal with mass spam on MediaWiki?

Published 19 Sep 2017 by sau226 in Newest questions tagged mediawiki - Webmasters Stack Exchange.

What would be the best way to find a users IP address on MediaWiki if all the connections were proxied through squid proxy server and you have access to all user rights?

I am a steward on a centralauth based wiki and we have lots of spam accounts registering and making 1 spam page each.

Can someone please tell me what the best way to mass block them is as I keep on having to block each user individually and lock their accounts?

Plasma 5.11 beta available in unofficial PPA for testing on Artful

Published 18 Sep 2017 by valorie-zimmerman in Kubuntu.

Adventurous users and developers running the Artful development release can now also test the beta version of Plasma 5.11. This is experimental and can possibly kill kittens!

Bug reports on this beta go to, not to Launchpad.

The PPA comes with a WARNING: Artful will ship with Plasma 5.10.5, so please be prepared to use ppa-purge to revert changes. Plasma 5.11 will ship too late for inclusion in Kubuntu 17.10, but should be available via the main backports PPA as soon as is practical after release day, October 19th, 2017.

Read more about the beta release:

If you want to test on Artful: sudo add-apt-repository ppa:kubuntu-ppa/beta && sudo apt-get update && sudo apt full-upgrade -y

The purpose of this PPA is testing, and bug reports go to

Mediawiki parser function addon that prints out the amount of time in server-side processing so far?

Published 18 Sep 2017 by user1258361 in Newest questions tagged mediawiki - Stack Overflow.

Some (Semantic Mediawiki) queries and other processing (table output and query formatting) could take significant time on a Mediawiki instance. Is there a parser function addon available where the new function reports the amount of time spent server-side processing since the page started being processed?

For example, if I guess that a query is taking a long time, I can place calls to the parser function above and below it, subtract and get the amount of time the query required on the server.


Published 18 Sep 2017 by timbaker in Tim Baker.

The author (centre) with Ruth and Ian Gawler Recently a great Australian, a man who has helped thousands of others in their most vulnerable and challenging moments, a Member of the Order of Australia, quietly retired from a long and remarkable career of public service....

How to change my Heroku + MediaWiki database from clearDB one to a local DB

Published 18 Sep 2017 by IWI in Newest questions tagged mediawiki - Stack Overflow.

I built a wiki using MediaWiki. Initially, I used a remote database (clearDB), as setup was faster. I now want to migrate the data and use a database local to the server instead.

My current db settings in LocalSettings.php:

## Database settings
$wgDBtype = "mysql";
$wgDBserver = "";
$wgDBname = "heroku_XXXXXXXXX";
$wgDBuser = "XXXXXXXXX";
$wgDBpassword = "XXXXXXXXXX";

Obviously, if I just "changed" the $wgDBserver to localhost, it won't work.

What needs to be done to migrate the old data and default MediaWiki architecture to a new db local to the server?
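One common path (names below are placeholders, not a definitive recipe) is to `mysqldump` the clearDB database, create an empty database on a local MySQL server, load the dump into it, and only then update LocalSettings.php. After the import, the settings might look like:

```php
## Database settings (after importing the clearDB dump into a local MySQL)
$wgDBtype = "mysql";
$wgDBserver = "localhost";
$wgDBname = "wikidb";        # whatever the local database was named
$wgDBuser = "wikiuser";
$wgDBpassword = "XXXXXXXXXX";
```

Keeping the wiki read-only (or offline) during the dump avoids the two databases drifting apart mid-migration.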

Harvesting EAD from AtoM: we need your help!

Published 18 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Back in February I published a blog post about a project to develop AtoM to allow EAD (Encoded Archival Description) to be harvested via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting): “Harvesting EAD from AtoM: a collaborative approach”.

Now that AtoM version 2.4 is released (hooray!), containing the functionality we have sponsored, I thought it was high time I updated you on what has been achieved by this project, where more work is needed and how the wider AtoM community can help.

What was our aim?

Our development work had a few key aims:

  • To enable finding aids from AtoM to be exposed as EAD 2002 XML for others to harvest. The partners who sponsored this project were particularly keen to enable the Archives Hub to harvest their EAD.
  • To change the way that EAD was generated by AtoM in order to make it more scalable. Moving EAD generation from the web browser to the job scheduler was considered to be the best approach here.
  • To make changes to the existing DC (Dublin Core) metadata generation feature so that it also works through the job scheduler - making this existing feature more scalable and able to handle larger quantities of data

A screen shot of the job scheduler in AtoM - showing the EAD and
DC creation jobs that have been completed

What have we achieved?

The good

We believe that the EAD harvesting feature as released in AtoM version 2.4 will enable a harvester such as the Archives Hub to harvest our catalogue metadata from AtoM as EAD. As we add new top level archival descriptions to our catalogue, subsequent harvests should pick up and display these additional records. 

This is a considerable achievement and something that has been on our wishlist for some time. This will allow our finding aids to be more widely signposted. Having our data aggregated and exposed by others is key to ensuring that potential users of our archives can find the information that they need.

Changes have also been made to the way metadata (both EAD and Dublin Core) is generated in AtoM. This means that the solution going forward is more scalable for those AtoM instances that have very large numbers of records or large descriptive hierarchies.

The new functionality in AtoM around OAI-PMH harvesting of EAD and settings for moving XML creation to the job scheduler is described in the AtoM documentation.
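By way of illustration, a harvester's first request can be sketched as follows. The `/;oai` endpoint path and the `oai_ead` metadata prefix are what we understand AtoM exposes; confirm against your own instance with a ListMetadataFormats request before relying on them:

```python
from urllib.parse import urlencode

def oai_ead_request(base_url, resumption_token=None):
    """Build an OAI-PMH ListRecords URL asking an AtoM site for EAD 2002 XML.

    Assumes AtoM's OAI endpoint lives at <base_url>/;oai and that the
    metadata prefix for EAD is 'oai_ead'; both should be verified
    against the target instance.
    """
    params = {"verb": "ListRecords"}
    if resumption_token:
        # Continue a paged harvest where the previous response left off.
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = "oai_ead"
    return f"{base_url}/;oai?{urlencode(params)}"

print(oai_ead_request("https://archives.example.org"))
```

A harvester like the Archives Hub would issue this request, then follow the resumptionToken in each response until the full set of top-level descriptions has been fetched.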

The not-so-good

Unfortunately the EAD harvesting functionality within AtoM 2.4 will not do everything we would like it to do. 

It does not at this point include the ability for the harvester to know when metadata records have been updated or deleted. It also does not pick up new child records that are added into an existing descriptive hierarchy. 

We want to be able to edit our records once within AtoM and have any changes reflected in the harvested versions of the data. 

We don’t want our data to become out of sync. 

So clearly this isn't ideal.

The task of enabling full harvesting functionality for EAD was found to be considerably more complex than first anticipated. This has no doubt been compounded by the hierarchical nature of EAD, which differs from the simplicity of the traditional Dublin Core approach.

The problems encountered are certainly not insurmountable, but lack of additional resources and timelines for the release of AtoM 2.4 stopped us from being able to finish off this work in full.

A note on scalability

Although the development work deliberately set out to consider issues of scalability, it turns out that scalability is actually on a sliding scale!

The National Library of Wales had the forethought to include one of their largest archival descriptions as sample data for inclusion in the version of AtoM 2.4 that Artefactual deployed for testing. Their finding aid for St David’s Diocesan Records is a very large descriptive hierarchy consisting of 33,961 individual entries. This pushed the capabilities of EAD creation (even when done via the job scheduler) and also led to discussions with The Archives Hub about exactly how they would process and display such a large description at their end even if EAD generation within AtoM were successful.

Some more thought and more manual workarounds will need to be put in place to manage the harvesting and subsequent display of large descriptions such as these.

So what next?

We are keen to get AtoM 2.4 installed at the Borthwick Institute for Archives over the next couple of months. We are currently on version 2.2 and would like to start benefiting from all the new features that are now available... and of course to test in earnest the EAD harvesting feature that we have jointly sponsored.

We already know that this feature will not fully meet our needs in its current form, but would like to set up an initial harvest with the Archives Hub and further test some of our assumptions about how this will work.

We may need to put some workarounds in place to ensure that we have a way of reflecting updates and deletions in the harvested data – either with manual deletes or updates or a full delete and re-harvest periodically.

Harvesting in AtoM 2.4 - some things that need to change

So we have a list of priority things that need to be improved in order to get EAD harvesting working more smoothly in the future:

In line with the OAI-PMH specification

  • AtoM needs to expose updates to the metadata to the harvester
  • AtoM needs to expose new records (at any level of description) to the harvester
  • AtoM needs to expose information about deletions to the harvester
  • AtoM also needs to expose information about deletions of DC metadata to the harvester (it has come to my attention during the course of this project that this isn’t happening at the moment) 

Some other areas of potential work

I also wanted to bring together and highlight some other areas of potential work for the future. These are all things that were discussed during the course of the project but were not within the scope of our original development goals.

  • Harvesting of EAC (Encoded Archival Context) - this is the metadata standard for authority records. Is this something people would like to see enabled in the future? Of course this is only useful if you have someone who actually wants to harvest this information!
  • On the subject of authority records, it would be useful to change the current AtoM EAD template to use @authfilenumber and @source - so that an EAD record can link back to the relevant authority record in the local AtoM site. The ability to create rich authority records is such a key strength of AtoM, allowing an institution to weave rich interconnecting stories about their holdings. If harvesting doesn’t preserve this inter-connectivity then I think we are missing a trick!
  • EAD3 - this development work has deliberately not touched on the new EAD standard. Firstly, this would have been a much bigger job and secondly, we are looking to have our EAD harvested by The Archives Hub and they are not currently working with EAD3. This may be a priority area of work for the future.
  • Subject source - the subject source (for example "Library of Congress Subject Headings") doesn't appear in AtoM generated EAD at the moment even though it can be entered into AtoM - this would be a really useful addition to the EAD.
  • Visible elements - AtoM allows you to decide which elements you wish to display/hide in your local AtoM interface. With the exception of information relating to physical storage, the XML generation tasks currently do not take account of visible elements and will carry out an export of all fields. Further investigation of this should be carried out in the future. If an institution is using the visible elements feature to hide certain bits of information that should not be more widely distributed, they would be concerned if this information was being harvested and displayed elsewhere. As certain elements will be required in order to create valid EAD, this may get complicated!
  • ‘Manual’ EAD generation - the project team discussed the possibility of adding a button to the AtoM user interface so that staff users can manually kick-off EAD regeneration for a single descriptive hierarchy. Artefactual suggested this as a method of managing the process of EAD generation for large descriptive hierarchies. You would not want the EAD to regenerate with each minor tweak if a large archival description was undergoing several updates, however, you need to be able to trigger this task when you are ready to do so. It should be possible to switch off the automatic EAD re-generation (which normally triggers when a record is edited and saved) but have a button on the interface that staff can click when they want to initiate the process - for example when all edits are complete. 
  • As part of their work on this project, Artefactual created a simple script to help with the process of generating EAD for large descriptive hierarchies - it basically provides a way of finding out which XML files relate to a specific archival description so that EAD can be manually enhanced and updated if it is too large for AtoM to generate via the job scheduler. It would be useful to turn this script into a command-line task that is maintained as part of the AtoM codebase.

We need your help!

Although we believe we have something we can work with here and now, we are under no illusions that this feature does everything it needs to in order to meet our requirements in the longer term.

I would love to find out what other AtoM users (and harvesters) think of the feature. Is it useful to you? Are there other things we should put on the wishlist? 

There is a lot of additional work described in this post which the original group of project partners are unlikely to be able to fund on their own. If EAD harvesting is a priority to you and your organisation and you think you can contribute to further work in this area either on your own or as part of a collaborative project please do get in touch.


I’d like to finish with a huge thanks to those organisations who have helped make this project happen, either through sponsorship, development or testing and feedback.

Reflections on the EME debate

Published 18 Sep 2017 by Jeff Jaffe in W3C Blog.

For the past several years we have engaged in one of the most divisive debates in the history of the W3C Community. This is the debate about whether W3C should release the Encrypted Media Extensions (EME) Recommendation without requiring that vendors provide a covenant that protects security and interoperability researchers.

This debate is an offshoot of a larger debate in society – whether it is appropriate to protect content using Digital Rights Management (DRM), and whether it is appropriate for nations to pass laws such as the Digital Millennium Copyright Act (DMCA), which impose penalties on those who attempt to break DRM. Since EME is an interface to Content Decryption Modules (CDMs) that decrypt content protected by DRM, the larger debate in society came into our own halls.

The debate within W3C was passionate and well informed on both sides. Supporters of EME argued that watching movies (protected by DRM) was happening on the web – and providing a common, secure, private, accessible interface from W3C was well within our mission and hence an appropriate activity for W3C. Opponents insisted that such a spec must be accompanied by a covenant that protects researchers from the overreach of DMCA. The arguments on both sides were far deeper than what I summarized above – indeed there were an intricate set of arguments with good logic on all sides of the debate.

There is potential fallout from the debate. Some on both sides of the issue complained about the intensity of the arguments on the other side. Others told me that they felt that the overall intensity of the debate was harmful to W3C. Personally, I approach this with a high degree of equanimity. I don’t think there was anything untoward in the debate itself. Only respectful passion from those who are passionate. And I feel that the intensity of the debate shows W3C in its best light. Here’s why.

First of all, I don’t think that this was a debate about W3C standards alone. This was part of a larger debate in society. W3C did not create DRM and we did not create DMCA. DRM has been used for decades prior to the EME debate. But in recent years it is a credit to the world wide web that the web has become the delivery vehicle for everything, including movies. Accordingly it was inevitable that we would face issues of conflicting values and the appropriate accommodations for commercial use of the web. I cannot envision a situation where this debate would not have erupted in our community given the larger trends that are happening in the world.

Secondly, we have had an incredibly respectful debate. The debate started years ago in the restricted-media W3C Community Group soon after EME was chartered. There were hundreds of posts with many points of view professionally stated on all sides of the issue. Each side contributed understanding to the other side. That doesn’t mean that people with passionate viewpoints were swayed. But W3C played its role as the venue for an open debate in the public square.

Third, the existence of the debate improved the specification. Critics of early versions of the spec raised valid issues about security, privacy, and accessibility. The resultant work of the Working Group then improved the spec in those dimensions. Critics might not have achieved their ultimate goal of a covenant that protected security researchers – but they did help improve security on the web nonetheless.

Finally, the deliberative process of W3C, which several times took a step back to look for suggestions and/or objections, played itself out properly. At multiple places the debate caused the entire community to better scrutinize the work. All voices were heard. Not all contradictory voices could be simultaneously satisfied – but the debate was influential. And in the end, the inventor of the world wide web, the Director of W3C, Tim Berners-Lee, took in all of the diverse input and provided a thoughtful decision which addressed all objections in detail.

I know from my conversations that many people are not satisfied with the result. EME proponents wanted a faster decision with less drama. EME critics want a protective covenant. And there is reason to respect those who want a better result. But my personal reflection is that we took the appropriate time to have a respectful debate about a complex set of issues and provide a result that will improve the web for its users.

My main hope, though, is that whatever point-of-view people have on the EME covenant issue, that they recognize the value of the W3C community and process in arriving at a decision for an inherently contentious issue. We are in our best light when we are facilitating the debate on important issues that face the web.

What is behind the violence in Myanmar?

Published 17 Sep 2017 by in New Humanist Articles and Posts.

Francis Wade, author of "Myanmar's Enemy Within" explains the deep roots of the violence, and the long-term persecution of the Rohingya people.

MediaWiki - How to align different sections

Published 17 Sep 2017 by Smithy55 in Newest questions tagged mediawiki - Stack Overflow.

I'm not sure how to achieve this look. I want three sections all in a horizontal line. This is for MediaWiki.

Here's what I want it to look like:

I'm trying to make them all line up in a single horizontal line, not like this:

I'm essentially trying to achieve three sections horizontally, not vertically.

Here's the code

{| style="border: 2px solid #aaa; border-radius: 5px; width: 25%; padding: 5px 10px;"
| style="width: 100%" | '''1'''
|}

How can I achieve this look?

Mediawiki - could not install extensions

Published 15 Sep 2017 by Arunkumar in Newest questions tagged mediawiki - Stack Overflow.

I could not install certain extensions in MediaWiki. For example, I could not install extensions that require inserting a line like require_once "$IP/extensions/..." in LocalSettings.php. Once I do that, the site goes completely down. However, I can install extensions that are loaded with the line wfLoadExtension('xxx');
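For reference, the two loading mechanisms look like this in LocalSettings.php (a sketch with placeholder extension names – substitute the real ones):

```php
// LocalSettings.php (fragment) -- extension names here are placeholders.

// Newer extensions ship an extension.json manifest and are loaded with:
wfLoadExtension( 'SomeNewExtension' );

// Older extensions ship a PHP entry point file and are loaded with:
require_once "$IP/extensions/SomeOldExtension/SomeOldExtension.php";

// If a require_once line takes the site down, temporarily enable error
// details to see the underlying PHP error (often a wrong path, or an
// extension that has already migrated to extension.json):
$wgShowExceptionDetails = true;
```

A white screen after adding require_once usually means PHP hit a fatal error while loading that file, so checking the PHP/Apache error log (or enabling the setting above) is the quickest way to find the cause.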

Any help much appreciated.

From here to eternity

Published 15 Sep 2017 by in New Humanist Articles and Posts.

As Vladimir Putin seeks to extend his rule, Russia's Orthodox religious right is on the rise.

Jason Scott Talks His Way Out of It: A Podcast

Published 14 Sep 2017 by Jason Scott in ASCII by Jason Scott.

Next week I start a podcast.

There’s a Patreon for the podcast with more information here.

Let me unpack a little of the thinking.

Through the last seven years, since I moved back to NY, I’ve had pretty variant experiences of debt or huge costs weighing me down. Previously, I was making some serious income from a unix admin job, and my spending was direct but pretty limited. Since then, even with full-time employment (and I mean, seriously, a dream job), I’ve made some grandiose mistakes with taxes, bills and tracking down old obligations that means I have some notable costs floating in the background.

Compound that with a new home I’ve moved to with real landlords that aren’t family and a general desire to clean up my life, and I realized I needed some way to make extra money that will just drop directly into the bill pit, never to really pass into my hands.

How, then, to do this?

I work very long hours for the Internet Archive, and I am making a huge difference in the world working for them. It wouldn’t be right or useful for me to take on any other job. I also don’t want to be doing something like making “stuff” that I sell or otherwise speculate into some market. Leave aside I have these documentaries to finish, and time has to be short.

Then take into account that I can no longer afford to drop money going to anything other than a small handful of conferences that aren’t local to me (the NY-CT-NJ Tri-State area), and that people really like the presentations I give.

So, I thought, how about me giving basically a presentation once a week? What if I recorded me giving a sort of fireside chat or conversational presentation about subjects I would normally give on the road, but make them into a downloadable podcast? Then, I hope, everyone would be happy: fans get a presentation. I get away from begging for money to pay off debts. I get to refine my speaking skills. And maybe the world gets something fun out of the whole deal.

Enter a podcast, funded by a Patreon.

The title: Jason Talks His Way Out of It, my attempt to write down my debts and share the stories and thoughts I have.

I announced the Patreon on my 47th birthday. Within 24 hours, about 100 people had signed up, paying some small amount (or not small, in some cases) for each published episode. I had a goal of $250/episode to make it worthwhile, and we passed that handily. So it’s happening.

I recorded a prototype episode, and that’s up there, and the first episode of the series drops Monday. These are story-based presentations roughly 30 minutes long apiece, and I will continue to do them as long as it makes sense to.

Public speaking is something I’ve done for many, many years, and I enjoy it, and I get comments that people enjoy them very much. My presentation on That Awesome Time I Was Sued for Two Billion Dollars has passed 800,000 views on the various copies online.

I spent $40 improving my sound setup, which should work for the time being. (I already had a nice microphone and a SSD-based laptop which won’t add sound to the room.) I’m going to have a growing list of topics I’ll work from, and I’ll stay in communication with the patrons.

Let’s see what this brings.

One other thing: Moving to the new home means that a lot of quality of life issues have been fixed, and my goal is to really shoot forward finishing those two documentaries I owe people. I want them done as much as everyone else! And with less looming bills and debts in my life, it’ll be all I want to do.

So, back the new podcast if you’d like. It’ll help a lot.

Parsec start-of-row pattern?

Published 14 Sep 2017 by LudvigH in Newest questions tagged mediawiki - Stack Overflow.

I am trying to parse mediawiki text using Parsec. Some of the constructs in mediawiki markup can only occur at the start of a line (such as the header markup ==header level 2==). In a regexp I would use an anchor (such as ^) to find the start of a line.

One attempt in GHCi is

Prelude Text.Parsec> parse (char '\n' *> string "==" *> many1 letter <* string "==") "" "\n==hej=="
Right "hej"

but this is not too good since it will fail on the first line of a file. I feel like this should be a solved problem...

What is the most idiomatic "Start of line" parsing in Parsec?
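One idiomatic approach (a sketch, assuming the input's column tracking has not been altered with setPosition) is a zero-width combinator that succeeds only at column 1, instead of consuming a newline:

```haskell
import Control.Monad (guard)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Succeeds without consuming input when we are at the start of a line,
-- i.e. the current source column is 1. Works on the first line too.
startOfLine :: Parser ()
startOfLine = do
  pos <- getPosition
  guard (sourceColumn pos == 1)

-- A level-2 header that must begin a line:
header2 :: Parser String
header2 = startOfLine *> string "==" *> many1 letter <* string "=="
```

Because startOfLine consumes nothing, header2 succeeds both at the very beginning of the input and immediately after a newline consumed elsewhere in the grammar, so parse header2 "" "==hej==" yields Right "hej" without requiring a leading '\n'.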

First Public Working Drafts for Web of Things

Published 14 Sep 2017 by Dave Raggett in W3C Blog.

There is widespread agreement on the huge potential for exploiting connected sensors and actuators, but such devices vary considerably in their capabilities and communication technologies. This makes it challenging to create services involving devices from different vendors and using different standards.

The Web of Things seeks to reduce the costs and risks of developing such services through an object-oriented approach in which devices are exposed to applications as things with properties, actions and events, decoupling developers from the details of the underlying communication standards.

The Web of Things Working Group, launched in early 2017, has released three First Public Working Drafts introducing the Web of Things Architecture, JSON-LD based programming language neutral descriptions of thing interaction models, and an associated scripting API.
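To make the interaction model concrete, here is a loose, simplified sketch of how a thing might be described in terms of properties, actions and events; the field names are illustrative only, and the actual JSON-LD vocabulary is defined in the Working Drafts themselves:

```json
{
  "name": "MyLampThing",
  "properties": {
    "status": { "type": "string", "writable": false }
  },
  "actions": {
    "toggle": {}
  },
  "events": {
    "overheated": { "type": "number" }
  }
}
```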

We welcome your feedback via email to

How to modify Mediawiki Login form

Published 14 Sep 2017 by Ashok Gj in Newest questions tagged mediawiki - Stack Overflow.

I'm trying to add the attribute autocomplete="off" in mediawiki login form. Being completely new, I'm unable to find where the form is being built.

I'm using MediaWiki 1.29.1

Any help would be greatly appreciated.

Things I did to find it:

1. Added the following code in LocalSettings.php:

$wgHooks['UserLoginForm'][] = 'modifyLoginForm';
function modifyLoginForm( &$template ) {
   //Printed template and got the following

 UserloginTemplate Object
 (
     [data] => Array
         (
             [link] => 
             [header] => 
             [name] => Admin
             [password] => 
             [retype] => 
             [email] => 
             [realname] => 
             [domain] => 
             [reason] => 
             [action] => /mediawiki/index.php?title=Special:UserLogin&action=submitlogin&type=login&returnto=Main+Page
             [message] => 
             [messagetype] => error
             [createemail] => 
             [userealname] => 1
             [useemail] => 1
             [emailrequired] => 
             [emailothers] => 1
             [canreset] => 1
             [resetlink] => 1
             [canremember] => 1
             [usereason] => 
             [remember] => 
             [cansecurelogin] => 
             [stickHTTPS] => 
             [token] => 18955182baa69e0a66edefghi4e0ef
             [loginend] => 
             [signupend] => 
             [usedomain] => 
         )
 )
  2. Tried to modify includes/templates/Userlogin.php. There is a login form, but it didn't get affected.

  3. Checked LoginSignupSpecialPage.php, but not sure how to add the attribute.

Book review: Protest - Stories of Resistance

Published 14 Sep 2017 by in New Humanist Articles and Posts.

This anthology examines protest over seven centuries, showing how it is part of our social and political fabric.

Mediawiki - Bulk upload new version of images

Published 13 Sep 2017 by Arunkumar in Newest questions tagged mediawiki - Stack Overflow.

I would like to upload a large number of images to MediaWiki. It should support uploading newer versions of files that are already uploaded. I tried using the extension UploadLocal; however, it doesn't seem to support uploading newer versions.

Any help much appreciated.

Regards Arun

Announcing Topicbox – our new product for teams

Published 12 Sep 2017 by Bron Gondwana in FastMail Blog.

Topicbox logo

Email is important to everybody. Email is your electronic memory, your archive of immutable truth. It is built on open standards, and it's the one truly global and interoperable written communication network. While instant messaging is great for immediacy, email is best for considered thought and permanence.

FastMail provides a great product for individuals and individual team members, and we're justifiably proud of that. FastMail staff use our own email archives every day to check whether our memories are correct or to answer questions like "what were the specifications for those servers we discussed 3 months ago and what did the various suppliers quote for them".

That email history – the team memory – gets locked away in individual mailboxes. New team members start from scratch with no easy way to look back on past decisions or discussions. This became very clear to us as our own company grew, and we had to forward important emails to new hires to help them understand our history. Or not! Nobody has time to go back and curate their email to forward every possibly relevant message to the new hire. They either asked around, interrupting others, or did without information that could have helped with their jobs.

With our company values firmly in mind we set out to solve this problem (which is common to all teams with changing membership) and redefine what a mailing list can be.

Topicbox is our new group email product which makes team history discoverable, email archives searchable, and collaboration easier – we know because we used the product ourselves as we were building it!

Many websites and services are designed to keep your eyeballs on their properties for the longest possible time. We are not like that. Our goal is to solve your email needs as quickly as possible so you can get on with everything else in your life.

I'm not going to list all the features here – you can read all about them on the Topicbox site, or sign up for a free trial straight away if you prefer doing rather than reading.

We have deliberately designed Topicbox so you don't need to be an existing FastMail customer – just start sending group emails to your team's Topicbox address instead of CCing individuals. It is effortless to add Topicbox to your existing workflow, and you can start with a single team from your organisation.

Later when you add new members to a team, they come on board with access to all the best information in your archives from day one – as discoverable and searchable as the email in an individual's account.

Try it out, we'd love to hear what you think.

Email us at or tweet @Topicboxer.

Of course, all good launches come with cake, and we celebrated Topicbox at our last quarterly meeting, with colleagues from the USA and India.

FastMail staff with Topicbox logo cake

Kubuntu Council Election Results Announced

Published 11 Sep 2017 by valorie-zimmerman in Kubuntu.

The Kubuntu Council is happy to announce the results of the election, and welcome the following members: Rik Mills, Aaron Honeycutt (returning) and Rick Timmis.

We thank Simon for running and making this a race, Clay and Ovidiu-Florin for their service on the Council, and Kubuntu Member Clive Johnston for stepping up and running this election.

See the official announcement from Clive on the mailing list:

Does Mediawiki encrypt logins by default as the browser sends them to the server?

Published 11 Sep 2017 by user1258361 in Newest questions tagged mediawiki - Server Fault.

Several searches only turned up questions about encrypting login info on the server side. Does Mediawiki encrypt logins after you type them in the browser and send them? (to prevent a man-in-the-middle from reading them in transit and taking over an account)

Exit strategies

Published 11 Sep 2017 by in New Humanist Articles and Posts.

After a spate of terrorist attacks on European soil, getting deradicalisation programmes to work has become more important than ever.

Update to Terms of Service

Published 11 Sep 2017 by Bron Gondwana in FastMail Blog.

On October 1st 2017, FastMail updated our Terms of Service to make them friendlier for our customers, and easier to understand.

These changes also allow us to progressively unify our Terms of Service across all our products: FastMail, Topicbox, Pobox and Listbox.

Summary of changes

UPDATE: changed tense and links on October 1st to reflect the activation of new terms.

UPDATE: we made one change after first publishing this post - in response to feedback we changed the wording in the termination section. We now say we may terminate accounts for non-renewal rather than for non-usage. So long as you keep paying to keep an account open, we don't mind if you're actually using it or not!

We now make it clear you aren’t responsible if a third party accesses your account as a result of FastMail’s negligence.

We have removed sections which allowed us to claim liquidated damages from you if you breached some of our Terms.

We have amended our termination clause. We used to be able to terminate your account at any time and for any reason. Now, we can only do so if you fail to comply with the Terms and Conditions, if we are required to by law, or if your account is inactive for an extended period of time.

We have made the terms easier to read: in particular we have more consistent use of defined terms, so that we now refer to ourselves as “FastMail” rather than the “Service Provider”.

We have replaced clauses UNNECESSARILY WRITTEN IN SCARY-LOOKING ALL-CAPS with friendly-looking normal text.

We have removed an unnecessary clause specifying that the document is in English (given it’s already pretty clear it is!).

We have added specific provisions to outline the rights and responsibilities between FastMail, our authorised resellers, and their customers.

We have added a clause that we don't endorse the content or views of our customers.

The full document is available at

The Bounty of the Ted Nelson Junk Mail

Published 9 Sep 2017 by Jason Scott in ASCII by Jason Scott.

At the end of May, I mentioned the Ted Nelson Junk Mail project, where a group of people were scanning in boxes of mailings and pamphlets collected by Ted Nelson and putting them on the Internet Archive. Besides the uniqueness of the content, it was also unique in that we were trying to set it up to be self-sustaining from volunteer monetary contributions, and to compensate the scanners doing the work.

This entire endeavor has been wildly successful.

We are well past 18,000 pages scanned. We have taken in thousands in donations. And we now have three people scanning and one person entering metadata.

Here is the spreadsheet with transparency and donation information.

I highly encourage donating.

But let’s talk about how this collection continues to be amazing.

Always, there are the pure visuals. As we’re scanning away, we’re starting to see trends in what we have, and everything seems to go from the early 1960s to the early 1990s, a 30-year scope that encompasses a lot of companies and a lot of industries. These companies are trying to thrive in a whirlpool of competing attention, especially in certain technical fields, and they try everything from humor to class to rudimentary fear-and-uncertainty plays in the art.

These are exquisitely designed brochures, in many cases – obviously done by a firm or with an in-house group specifically tasked with making the best possible paper invitations and with little expense spared. After all, this might be the only customer-facing communication a company could have about its products, and might be the best convincing literature after the salesman has left or the envelope is opened.

Scanning at 600dpi has been a smart move – you can really zoom in and see detail, find lots to play with or study or copy. Everything is at this level, like this detail about a magnetic eraser that lets you see the lettering on the side.

Going after these companies for gender roles or other out-of-fashion jokes almost feels like punching down, but yeah, there’s a lot of it. Women draped over machines, assumptions that women will be doing the typing, and clunky humor about fulfilling your responsibilities as a (male) boss abounds. Cultural norms regarding what fears reigned in business or how companies were expected to keep on top of the latest trends are baked in there too.

The biggest obstacle going forward, besides bringing attention to this work, is going to be one of findability. The collection is not based on some specific subject matter other than what attracted Ted’s attention over the decades. He tripped lightly among aerospace, lab science, computers, electronics, publishing… nothing escaped his grasp, especially in technical fields.

If people are looking for pure aesthetic beauty, that is, “here’s a drawing of something done in a very old way” or “here are old fonts”, then this bounty is already, at 1,700 items, a treasure trove that could absorb weeks of your time. Just clicking around to items that on first blush seem to have boring title pages will often expand into breathtaking works of art and design.

I’m not worried about that part, frankly – these kind of sell themselves.

But there’s so much more to find among these pages, and as we’re now up to so many examples, it’s going to be a challenge to get researching folks to find them.

We have the keywording active, so you can search for terms like monitor, circuit, or hypercard and get more specific matches without concentrating on what the title says or what graphics appear on the front. The Archive has a full-text search, and so people looking for phrases will no doubt stumble into this collection.

But how easily will people even think to know about a wristwatch for the Macintosh from 1990, a closed-circuit camera called the Handy Looky… or this little graphic, nestled away inside a bland software catalog:

…I don’t know. I’ll mention that this is actually twitter-fodder among archivists, who are unhappy when someone is described as “discovering” something in the archives, when it was obvious a person cataloged it and put it there.

But that’s not the case here. Even Kyle, who’s doing the metadata, is doing so in a descriptive fashion, and on a rough day of typing in descriptions, he might not particularly highlight unique gems in the pile (he often does, though). So, if you discover them in there, you really did discover them.

So, the project is deep, delightful, and successful. The main consideration of this is funding; we are paying the scanners $10/hr to scan and the metadata is $15/hr. They work fast and efficiently. We track them on the spreadsheet. But that means a single day of this work can cause a notable bill. We’re asking people on twitter to raise funds, but it never hurts to ask here as well. Consider donating to this project, because we may not know for years how much wonderful history is saved here.

Please share the jewels you find.

4 Months!

Published 9 Sep 2017 by Jason Scott in ASCII by Jason Scott.

It’s been 4 months since my last post! That’s one busy little Jason summer, to be sure.

Obviously, I’m still around, with no lingering problems from the heart attack. My doctor told me that my heart is basically healed, and he wants more exercise out of me. My diet’s continued to be lots of whole foods, leafy greens and occasional shameful treats that don’t turn into a staple.

I spent a good month working with good friends to clear out the famous Information Cube, sorting out and mailing/driving away all the contents to other institutions, including the Internet Archive, the Strong Museum of Play, the Vintage Computer Federation, and parts worldwide.

I’ve moved homes, no longer living with my brother after seven up-and-down years of siblings sharing a house. It was time! We’re probably not permanently scarred! I love him very much. I now live in an apartment with very specific landlords with rules and an important need to pay them on time each and every month.

To that end, I’ve cut back on my expenses and will continue to, so it’s the end of me “just showing up” to pretty much any conferences that I’m not being compensated for, which will of course cut things down in terms of Jason appearances you can find me at.

I’ll still be making appearances as people ask me to go, of course – I love travel. I’m speaking in Amsterdam in October, as well as being an Emcee at the Internet Archive in October as well. So we’ll see how that goes.

What that means is more media ingestion work, and more work on the remaining two documentaries. I’m going to continue my goal of clearing my commitments before long, so I can choose what I do next.

What follows will be (I hope) lots of entries going deep into some subjects and about what I’m working on, and I thank you for your patience as I was not writing weblog entries while upending my entire life.

To the future!

Check out Web Payments Demos @ Money20/20

Published 7 Sep 2017 by Ian Jacobs in W3C Blog.

Please join me on Monday, 23 October at Money20/20, where my colleagues from Google, Mastercard, and Airbnb will demonstrate how to streamline online checkout using new Web standards from W3C.

In our session, Zach Koch (Google) will highlight new browser features to accelerate checkout. James Anderson (Mastercard) will show how native mobile applications could integrate with this checkout experience, and also enhance payment security through tokenization and multi-factor authentication. Michel Weksler (Airbnb) will provide the merchant site driving the demos.

Industry leaders are collaborating on these open standards at W3C to enable a streamlined and consistent user experience across the Web, designed to increase conversions and lower merchant integration costs. All major browser makers are now implementing Payment Request API, the heart of the new Web checkout experience. So Money20/20 will be a great opportunity to check out the future of Web payments. Please join us!

SWFUpload To Be Removed From Core

Published 7 Sep 2017 by Ipstenu (Mika Epstein) in Make WordPress Plugins.

Removing SWFUpload

If your plugin is using SWFUpload, please remove it and switch to Plupload. If you’re a security plugin scanning for it, you’re fine. If your plugin is using it, or including your own, it’s time to upgrade.


Godless for God’s Sake: Now available for Kindle for just $5.99

Published 6 Sep 2017 by James Riemermann in


Godless for God’s Sake: Nontheism in Contemporary Quakerism

In this book edited by British Friend and author David Boulton, 27 Quakers from 4 countries and 13 yearly meetings tell how they combine active and committed membership in the Religious Society of Friends with rejection of traditional belief in the existence of a transcendent, personal and supernatural God.

For some, God is no more (but no less) than a symbol of the wholly human values of “mercy, pity, peace and love”. For others, the very idea of God has become an archaism.

Readers who seek a faith free of supernaturalism, whether they are Friends, members of other religious traditions or drop-outs from old-time religion, will find good company among those whose search for an authentic 21st century understanding of religion and spirituality has led them to declare themselves “Godless – for God’s Sake”.


Preface: In the Beginning…

1. For God’s Sake? An Introduction


David Boulton

2. What’s a Nice Nontheist Like You Doing Here?


Robin Alpern

3. Something to Declare


Philip Gross

4. It’s All in the Numbers

Joan D Lucas

5. Chanticleer’s Call: Religion as a Naturalist Views It

Os Cresson

6. Mystery: It’s What we Don’t Know

James T Dooley Riemermann

7. Living the Questions

Sandy Parker

8. Listening to the Kingdom

Bowen Alpern

9. The Making of a Quaker Nontheist Tradition

David Boulton and Os Cresson

10. Facts and Figures

David Rush

11. This is my Story, This is my Song…


Ordering Info

Links to forms for ordering online will be provided here as soon as they are available. In the meantime, contact the organizations listed below, using the book details at the bottom of this page.

QuakerBooks of Friends General Conference

(formerly FGC Bookstore)

1216 Arch St., Ste 2B

Philadelphia, PA 19107

215-561-1700 fax 215-561-0759

(this is the “Universalism” section of Quakerbooks, where the book is currently located)




Quaker Bookshop

173 Euston Rd London NW1 2BJ

020 7663 1030, fax 020 7663 1008


Those outside the United Kingdom and United States should be able to order through a local bookshop, quoting the publishing details below – particularly the ISBN number. In case of difficulty, the book can be ordered direct from the publisher’s address below.

Title: “Godless for God’s Sake: Nontheism in Contemporary Quakerism” (ed. David Boulton)

Publisher: Dales Historical Monographs, Hobsons Farm, Dent, Cumbria LA10 5RF, UK. Tel 015396 25321. Email

Retail price: £9.50 ($18.50). Prices elsewhere to be calculated on UK price plus postage.

Format: Paperback, full colour cover, 152 pages, A5

ISBN number: 0-9511578-6-8 (to be quoted when ordering from any bookshop in the world)

How We Created a People-First Hiring Experience

Published 5 Sep 2017 by Olivia Melman in DigitalOcean: Cloud computing designed for developers.

This post is the first installment of a two-part series we’re publishing this fall around recruiting and the new hire experience at DigitalOcean.

When I joined DigitalOcean in March of this year, I was the 281st employee at the company. We’re now over 350 employees, over 120 of whom were hired this year alone, and our goal is to surpass 400 by the end of 2017.

These numbers should tell you a few things:

Fusing together these two takeaways, the need for a Recruiting Operations function becomes critical, especially in today's HR-tech landscape. The war for talent is real, leaving recruiters with limited bandwidth to focus on process improvements and a constant sense of urgency to make great hires. I joined DigitalOcean from LinkedIn, where I worked as a Customer Success Manager helping Enterprise Recruiting Teams (similar in size and scale to DO) optimize their LinkedIn solutions to hire most effectively. When I was given the chance to expand upon this experience by coming in-house to DO, I was inspired by the opportunity to add value to a growing company in brand new ways.

Having a dedicated resource to ensure we are using best practices, understanding the analytics behind our pipeline performance, evaluating recruitment tools and implementing data-driven recruiting strategies enables a company like ours to grow and scale. So, while my role is largely focused on increasing hiring efficiency and implementing technologies and tools to automate, my true passion is fueled by the experiences I’m impacting for everyone throughout the recruiting process.

Experience matters at DO—from user, to candidate, to employee. A study from CareerBuilder showed that “nearly 4 in 5 candidates (78%) say the overall candidate experience they receive is an indicator of how a company values its people.” In recruiting, a positive experience is intertwined with anticipating what’s going to happen next, and the candidate should feel as respected, valued, and loved as any employee. That includes proactive communication about the process and why it’s important. We believe it’s important to be transparent about what you can expect here at DigitalOcean, in any capacity.

The nature of Recruiting Ops impacts several groups. In this post, I’ll touch on the candidate and employee experiences.


We’ve all been a candidate before. Anyone who has ever looked for a job has experienced the black hole of “I applied, now what?”. At DO, you don’t have to wonder, wait, and then wait some more. Humanizing the candidate experience is key to fostering a positive one, so here’s what we share with every applicant when they apply:

In addition, we believe both candidates and Recruiters win when the recruiting process is more personalized. We’ve replaced the traditional “Recruiter Phone Call” first step with a video chat over Google Hangout. By encouraging candidates to participate in a Google Hangout, our Recruiters can get a truer sense of the candidate, and the candidate a truer sense of who their partner will be throughout the remainder of their interview experience. And because we’re a highly remote-friendly organization, we include video conferencing as a normal part of how we connect on a daily basis.

We want to be transparent about what we’re working on internally and how we’re measuring the success of our well-oiled recruiting machine. A few specific metrics we’re monitoring are time to hire, length of candidate journey, and new hire engagement scores. Here’s why and how:

From our DigitalOcean Neighborhood Guide for non-local candidates joining us onsite to interview, to our employee handbook (available to the public soon), we offer resources to ease the candidate journey every step of the way. Stay tuned for that employee handbook.

In the meantime, watch our short culture video (produced by The Muse), which gives a sneak peek into what it’s like to work here. You’ll get to see our office, and get a sense of DO’s culture via interviews with some of our most talented employees. By partnering with awesome companies like The Muse, LinkedIn, Glassdoor, and more, I’m able to help showcase the amazing work our teams are doing and the experiences our employees are having. Our LinkedIn page introduces candidates to our leadership team, and there you can also read employee testimonials and employee-written blog posts. We also share product and content updates on LinkedIn to tell our story to users and candidates alike in real-time.

We’re actively building tools to give folks outside of DigitalOcean a better understanding of what the employee experience is like here in hopes that they—or you!—will want to join us (we're hiring!).

In our next post, we’ll touch more upon what new employees can expect upon joining DO as well as how our referrals program works.

Olivia Melman joined DigitalOcean in March 2017 as the People team’s first Program Manager. She is heavily focused on automation and collaboration within the full-cycle recruitment process, strengthening external partnerships to promote DO’s employment brand, and leveraging data to drive Recruiting strategy.

Plugin Support Reps

Published 4 Sep 2017 by Sergey Biryukov in Make WordPress Plugins.

Some of the larger plugin shops have a support team to help out on the forums. It would be useful to be able to give those people a special “support rep” role on the forums so they could be recognized as such.

Support representatives can mark forum topics as resolved or sticky (same as plugin authors and contributors), but don’t have commit access to the plugin.

The UI for managing plugin support reps can be found in Advanced View on the plugin page, next to managing committers:

Screenshot of the UI for managing support reps

Once someone is added as a support rep, they will get a Plugin Support badge when replying to the plugin support topics or reviews:

Screenshot of Plugin Support badge on the forums

The Month in WordPress: August 2017

Published 1 Sep 2017 by Hugh Lashbrooke in WordPress News.

While there haven’t been any major events or big new developments in the WordPress world this past month, a lot of work has gone into developing a sustainable future for the project. Read on to find out more about this and other interesting news from around the WordPress world in August.

The Global WordPress Translation Day Returns

On September 30, the WordPress Polyglots team will be holding the third Global WordPress Translation Day. This is a 24-hour global event dedicated to the translation of the WordPress ecosystem (core, themes, plugins), and is a mix of physical, in-person translation work with online streaming of talks from WordPress translators all over the world.

Meetup groups will be holding events where community members will come together to translate WordPress. To get involved in this worldwide event, join your local meetup group or, if one is not already taking place in your area, organize one for your community.

You can find out more information on the Translation Day blog and in the #polyglots-events channel in the Making WordPress Slack group.

WordPress Foundation to Run Open Source Training Worldwide

The WordPress Foundation is a non-profit organization that exists to provide educational events and resources for hackathons, support of the open web, and promotion of diversity in the global open source community.

In an effort to push these goals forward, the Foundation is going to be offering assistance to communities who would like to run local open source training workshops. A number of organizers have applied to be a part of this initiative, and the Foundation will be selecting two communities in the coming weeks.

Follow the WordPress Foundation blog for updates.

Next Steps in WordPress Core’s PHP Focus

After last month’s push to focus on WordPress core’s PHP development, a number of new initiatives have been proposed and implemented. The first of these initiatives is a page on that will educate users on the benefits of upgrading PHP. The page and its implementation are still in development, so you can follow and contribute on GitHub.

Along with this, plugin developers are now able to specify the minimum required PHP version for their plugins. This version will then be displayed on the Plugin Directory page, but it will not (yet) prevent users from installing it.

The next evolution of this is for the minimum PHP requirement to be enforced so that plugins will only work if that requirement is met. You can assist with this implementation by contributing your input or a patch on the open ticket.

As always, discussions around the implementation of PHP in WordPress core are done in the #core-php channel in the Making WordPress Slack group.

New Editor Development Continues

For a few months now, the core team has been steadily working on Gutenberg, the new editor for WordPress core. While Gutenberg is still in development and is some time away from being ready, a huge amount of progress has already been made. In fact, v1.0.0 of Gutenberg was released this week.

The new editor is available as a plugin for testing and the proposed roadmap is for it to be merged into core in early 2018. You can get involved in the development of Gutenberg by joining the #core-editor channel in the Making WordPress Slack group and following the WordPress Core development blog.

Further reading:

If you have a story we should consider including in the next “Month in WordPress” post, please submit it here.

How Data and Models Feed Computing

Published 30 Aug 2017 by Alejandro (Alex) Jaimes in DigitalOcean: Cloud computing designed for developers.

This post is the second in a three-part series on artificial intelligence by DigitalOcean’s Head of R&D, Alejandro (Alex) Jaimes. (Click here to read the first installment.)

Not every company, nor every developer will have the resources or the time to collect vast amounts of data to create models from scratch. Fortunately, the same repetition that I described in my last post occurs within and across industries. Because of this, particularly with deep learning, we’ve seen two very important trends:

While the companies that have the most data may never release it, such data is not a requirement for every problem. It’s clear, however, that teams that leverage existing public models and combine public and proprietary datasets will have a competitive advantage. They must be “smart” about how they use and leverage the data they are able to collect, again with an AI mindset and strategy in mind.

Supervised and Unsupervised Learning

The majority of successes in AI so far have been based on supervised learning, in which machine learning algorithms are fed with labeled data—labeled data refers to a sample group that can be identified with a meaningful label or tag—versus unlabeled data. Labeling data is expensive, time consuming, and difficult (e.g., maintaining the desired quality, dealing with subjectivity, etc). For this reason, the ideal algorithms will be “unsupervised”—in other words, learning from unlabeled data. While promising, those algorithms have not shown the success levels needed to have the desired impact. Teams should then rely on creative strategies to leverage existing datasets, and combine supervised and unsupervised methods for now.

A number of companies offer labeling and data collection services. But there are ways to use algorithms to simplify the manual labeling process (e.g., with a “small” dataset one can create an algorithm that labels a much larger unlabeled dataset, so that humans have to correct errors made by the algorithm instead of labeling all of the data from scratch), or to create synthetic datasets (e.g., by using algorithms to generate “fake” data that looks like the original data). The bottom line is that no matter what size the project is, there are almost always alternatives to either obtain new data or augment existing datasets.
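The first of those strategies can be sketched in a few lines. In this toy (the two-cluster data and the nearest-centroid "model" are illustrative stand-ins, not anything from the post), a small hand-labeled set trains a model that pseudo-labels a much larger pool, and only the low-confidence points near the decision boundary are routed to humans for correction:

```python
# Pseudo-labeling sketch: train on a "small" labeled set, auto-label a
# larger unlabeled pool, and send only low-confidence cases for review.
import random

random.seed(0)

def make_point(label):
    # Two fuzzy clusters around (0, 0) and (4, 4)
    c = (0.0, 0.0) if label == 0 else (4.0, 4.0)
    return (c[0] + random.gauss(0, 1.5), c[1] + random.gauss(0, 1.5))

labeled = [(make_point(l), l) for l in [0, 1] * 50]           # small labeled set
unlabeled = [make_point(random.choice([0, 1])) for _ in range(1000)]

# "Train": one centroid per class, computed from the labeled data
def centroid(points):
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

centroids = {l: centroid([p for p, pl in labeled if pl == l]) for l in (0, 1)}

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# Pseudo-label the pool; ambiguous points (small margin) go to humans
auto, review = [], []
for p in unlabeled:
    d0, d1 = dist(p, centroids[0]), dist(p, centroids[1])
    label = 0 if d0 < d1 else 1
    margin = abs(d0 - d1)
    (auto if margin > 1.0 else review).append((p, label))

print(f"auto-labeled: {len(auto)}, sent for human review: {len(review)}")
```

The payoff is the ratio: humans correct a small review queue instead of labeling all 1,000 points from scratch.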

AI as a Service

Generally, significant efforts are required in developing models to perform tasks in accurate, efficient ways. For that reason, many companies and teams focus on specific verticals—building functionalities that are limited, but that work well in practice (versus the ideal of building a “human-like” AI capable of doing many things at once).

In some cases, those functionalities can be applied across domains. Developing a speech recognition system from scratch, for example, is a major effort, and most companies and teams that need it would be better off using a service than building it from scratch.

As the AI industry advances, we can expect to see more and more of those functionalities coming from specific vendors and open source initiatives, similar to the way software is built today: combinations of libraries, APIs, and open source and commercial components, coupled with custom software for specific applications.

In addition, given the nature of AI, building an infrastructure that quickly scales as needs shift is a major challenge. This implies that running AI will mostly happen on the cloud. Note that in the new AI computing paradigm, growing datasets, experimentation, and constant “tweaking” of models is a critical component.

Therefore, AI will be used as a cloud-based service for many applications. That’s a natural progression and in many ways leads to the commoditization of AI, which will lead to greater efficiency, opportunities, innovation, and positive economic impact. In our next installment, we’ll explore what all of this means for today’s developers.

In line with the trends we’re seeing in research and industry, we’re releasing a powerful set of tools that allow developers to easily re-use existing models, work with large quantities of data, and easily scale, on the cloud. We encourage you to take a look at our machine learning one-click. What other tools or functionalities would you be interested in having us provide? Feel free to leave feedback in the comments section below.

Alejandro (Alex) Jaimes is Head of R&D at DigitalOcean. Alex enjoys scuba diving and started coding in Assembly when he was 12. In spite of his fear of heights, he's climbed a peak or two, gone paragliding, and ridden a bull in a rodeo. He's been a startup CTO and advisor, and has held leadership positions at Yahoo, Telefonica, IDIAP, FujiXerox, and IBM TJ Watson, among others. He holds a Ph.D. from Columbia University.
Learn more by visiting his personal website or LinkedIn profile. Find him on Twitter: @tinybigdata.

Hello everyone, some of you…

Published 29 Aug 2017 by Konstantin Obenland in Make WordPress Plugins.

Hello everyone, some of you will have the following email in your inbox:

Your password on has been deactivated, and you need to reset it to log in again.

We recently discovered your login credentials in a list of compromised emails and passwords published by a group of security researchers. This list was not generated as the result of any exploit on, but rather someone gaining access to the email & password combination you also used on another service.

To reset your password and get access to your account, please follow these steps:
1. Go to
2. Click on the link “Lost your password?”
3. Enter your username:
4. Click the “Get New Password” button

It is very important that your password be unique. Using the same password on different web sites increases the risk of your account being hacked.

If you have any further questions or trouble resetting your password, please reply to this message to get help from our support team. We will never ask you to supply your account password via email.

At this point we don’t have a reason to believe any accounts have been compromised, but out of an abundance of caution passwords are proactively disabled just to make sure.

If you have any questions don’t hesitate to post them in the comments.

[EDIT]: Updated the list typo to now go in order.

[EDIT]: Comments are closed. Reply to the email folks.

MassMessage hits 1,000 commits

Published 28 Aug 2017 by legoktm in The Lego Mirror.

The MassMessage MediaWiki extension hit 1,000 commits today, following an update of the localization messages for the Russian language. MassMessage replaced a Toolserver bot that allowed sending a message to all Wikimedia wikis, by integrating it into MediaWiki and using the job queue. We also added some nice features like input validation and previewing. Through it, I became familiar with different internals of MediaWiki, including submitting a few core patches.

I made my first commit on July 20, 2013. It would get a full rollout to all Wikimedia wikis on November 19, 2013, after a lot of help from MZMcBride, Reedy, Siebrand, Ori, and other MediaWiki developers.

I also mentored User:wctaiwan, who worked on a Google Summer of Code project that added a ContentHandler backend to the extension, to make it easier for people to create and maintain page lists. You can see it used by The Wikipedia Signpost's subscription list.

It's still a bit crazy to think that I've been hacking on MediaWiki for over four years now, and how much it has changed my life in that much time. So here's to the next four years and next 1,000 commits to MassMessage!

Minimum PHP Version Requirement

Published 28 Aug 2017 by Sergey Biryukov in Make WordPress Plugins.

Not all plugins can work on PHP 5.2, like WordPress core currently does. Not all plugin developers want to support PHP 5.2, like core does. As a project, WordPress would like to move forward and encourage people to use more recent PHP versions.

As one of the first steps to reach that goal, plugin authors can now specify a minimum required PHP version for their plugin in readme.txt file with a new Requires PHP header:

=== Plugin Name ===
Contributors: (this should be a list of userid's)
Donate link:
Tags: comments, spam
Requires at least: 4.6
Tested up to: 4.8
Requires PHP: 5.6
Stable tag: 4.3
License: GPLv2 or later
License URI:

Users will see this displayed in the plugin directory, like this:

Screenshot of the Requires PHP Version line on plugin page

As a next step, the WordPress core team is going to look into showing users a notice that they cannot install a certain plugin or theme because their install does not meet the required criteria, with some user-oriented and host-specific instructions on how to switch their site to a newer PHP version.

If you have any feedback on the subject, please leave a comment or join the next PHP meeting in #core-php channel on Slack.

Kubuntu Artful Aardvark (17.10) Beta 1 testing

Published 28 Aug 2017 by valorie-zimmerman in Kubuntu.

Artful Aardvark (17.10) Beta 1 images are now available for testing.

The Kubuntu team will be releasing 17.10 in October. The final Beta 1 milestone will be available on August 31st.

This is the first spin in preparation for the Beta 1 pre-release. Kubuntu Beta pre-releases are NOT recommended for:

Kubuntu Beta pre-releases are recommended for:

Getting Kubuntu 17.10 Beta 1:

To upgrade to Kubuntu 17.10 pre-releases from 17.04, run sudo do-release-upgrade -d from a command line.

Download a Bootable image and put it onto a DVD or USB Drive via the download link at This is also the direct link to report your findings and any bug reports you file.

See our release notes:

Please report your results on the Release tracker:

Requiring HTTPS for my Toolforge tools

Published 27 Aug 2017 by legoktm in The Lego Mirror.

My Toolforge (formerly "Tool Labs") tools will now start requiring HTTPS, and redirecting any HTTP traffic. It's a little bit of common code for each tool, so I put it in a shared "toolforge" library.

from flask import Flask
import toolforge

app = Flask(__name__)
# Register the shared library's hook so plain-HTTP requests get redirected
app.before_request(toolforge.redirect_to_https)

And that's it! Your tool will automatically be HTTPS-only now.

$ curl -I ""
HTTP/1.1 302 FOUND
Server: nginx/1.11.13
Date: Sat, 26 Aug 2017 07:58:39 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 281
Connection: keep-alive
X-Clacks-Overhead: GNU Terry Pratchett

My DebConf 17 presentation - Bringing MediaWiki back into Debian

Published 26 Aug 2017 by legoktm in The Lego Mirror.

Full quality video available on Wikimedia Commons, as well as the slides.

I had a blast attending DebConf '17 in Montreal, and presented about my efforts to bring MediaWiki back into Debian. The talks I went to were all fantastic, and I got to meet some amazing people. But the best parts of the conference were the laid-back atmosphere and the food. I've never been to another conference that had food that comes even close to DebConf's.

Feeling very motivated, I have three new packages in the pipeline: LuaSandbox, uprightdiff, and libkiwix.

I hope to be at DebConf again next year!

Plasma 5.10.5 and Frameworks 5.37 updates now in backports PPA for Zesty 17.04

Published 24 Aug 2017 by rikmills in Kubuntu.

The final 5.10.5 bugfix update of the Plasma 5.10 series is now available for users of Kubuntu Zesty Zapus 17.04 to install via our backports PPA.

KDE Frameworks is also updated to the latest version 5.37


To update, use the Software Repository Guide to add the following repository to your software sources list:


or if it is already added, the updates should become available via your preferred update method.

The PPA can be added manually in the Konsole terminal with the command:

sudo add-apt-repository ppa:kubuntu-ppa/backports

and packages then updated with

sudo apt update
sudo apt full-upgrade


Upgrade notes:

~ The Kubuntu backports PPA already contains backported KDE PIM 16.12.3 packages (KMail, Kontact, KOrganizer, Akregator, etc.) from previous updates, plus various other backported applications, so please be aware that enabling the backports PPA for the first time and doing a full upgrade will result in a substantial number of upgraded packages in addition to Plasma 5.10.

~ While we believe that these packages represent a beneficial and stable update, please bear in mind that they have not been tested as comprehensively as those in the main ubuntu archive, and are supported only on a limited and informal basis. Should any issues occur, please provide feedback on our mailing list [1], IRC [2], file a bug against our PPA packages [3], or optionally contact us via social media.

1. Kubuntu-devel mailing list:
2. Kubuntu IRC channels: #kubuntu & #kubuntu-devel on
3. Kubuntu ppa bugs:


Help needed testing newest bugfix release of Plasma on Kubuntu 17.04

Published 22 Aug 2017 by valorie-zimmerman in Kubuntu.

Are you using Kubuntu 17.04, our current release? Help us test a new bugfix release for KDE Plasma! Go here for more details:

Unfortunately that page illustrates Xenial and Ubuntu Unity rather than Zesty in Kubuntu. In Discover or Muon, go to Settings > More, enter your password, and ensure that Pre-release updates (zesty-proposed) is ticked in the Updates tab.

Or from the commandline, you can modify the software sources manually by adding the following line to /etc/apt/sources.list:

deb zesty-proposed restricted main multiverse universe

If you are going to be testing from proposed frequently, you might try the pinning process as described on the wiki page about how to enable proposed. Otherwise, you can just

sudo apt update and then sudo apt install packagename/zesty-proposed for each of the packages listed in the bug report:

If you do not pin, remove the proposed repository immediately after you finish installing the test packages, or you risk wrecking your system in interesting ways.
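For those who do choose the pinning route, it boils down to an apt preferences entry; a minimal sketch (the file name is illustrative, and the wiki page mentioned above is the authoritative guide) would be a file such as /etc/apt/preferences.d/proposed-updates containing:

```
Package: *
Pin: release a=zesty-proposed
Pin-Priority: 100
```

A priority of 100 keeps zesty-proposed packages out of routine upgrades while still allowing explicit installs with apt install packagename/zesty-proposed, so there is no need to remove the repository afterwards.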

Please report your findings on the bug report. If you need some guidance on how to structure your report, please see Testing is very important to the quality of the software Ubuntu and Kubuntu developers package and release.

We need your help to get this important bug-fix release out the door to all of our users.

Thanks! Please stop by the Kubuntu-devel IRC channel or Telegram group if you need clarification of any of the steps to follow.

How to Manage, Build, and Nurture Distributed Teams

Published 21 Aug 2017 by Dave "Dizzy" Smith in DigitalOcean: Cloud computing designed for developers.

This blog post was adapted from Dizzy’s OSCON 2017 talk, “Managing, Nurturing, and Building Distributed Teams”.

There’s a lot of talk out there about what worked for one company or another when it comes to distributed teams. The challenge we all face is how to make successful, distributed teams reproducible. Lots of companies have made it work—by luck or by force of personality. I, however, want to engineer it using effective communication.

And at the end of my life, I want to look back and know I made the world a tiny bit better for people who create amazing things. This is the work of a manager—to create environments in which people do the best work of their lives. It is especially important in a distributed team since you can’t rely on personality or stage presence; you must be disciplined and focused, and work deliberately to construct these environments. Most importantly, you have to understand the forces you face and how to counteract them.

How Distributed Teams Differ From Other Teams

A distributed team can be succinctly summed up as: “People + Work - Location; United by purpose”.

There are lots of implications related to the removal of location. Without location, managers have to find something more fundamental to bind people together. As managers, we must consider how to unite distributed teams by rallying them around a collective purpose. There are forces at work making it difficult for distributed teams to unite—and stay united—around any purpose. Thankfully, there are ways to counteract these forces, too.

Dimensions of Communication

Communication always takes the easiest path. Being cognizant of this will save you a lot of pain as a manager (this is also a reason why many distributed teams fail).

Consider how having a team mostly in office with a few dispersed remote team members affects communication. It might be easier for people in-office to walk over to someone—or yell across the table—to discuss something versus typing away in Slack. Or if it’s difficult to set up a conference call, the group of people may choose to run a meeting without waiting for the remotes to log in. At DigitalOcean, every conference room has a Hangout-equipped computer, so it’s never hard to get a meeting going with everyone.

When running a distributed team, it’s helpful to think about communication in three dimensions:

With these dimensions, we can begin characterizing the common forms or modalities of communication:

It’s important to remember that all modalities matter, and choosing one over the other means making explicit tradeoffs. Where possible, seek balance (e.g., choosing only IM as a form of communication but no email might mean deeper thinking is discouraged). All of this framework is important to consider as you manage and nurture a team, since conversations may need to span some or all of them, so choose your modalities wisely.

Make sure your team is using the right modalities for the communication they need to have. Guide conversations to the correct forums. Set guidelines, like knowing when to start a Hangout. Moderate the amount of face to face to give people time to think and ensure everyone who needs to be a part of the conversation is present. If you don’t, communication will be slower and far less efficient, taking you off the rails.

How to Keep Distributed Teams Moving

There are (at least) three things you need to be doing in a distributed team to keep things moving:

Stay aligned. Alignment means that you and your team(s) have a shared contextual view of the (business/operating) world that allows them to function despite being out of time sync or in different locations. Staying aligned means paying attention to the following: progress (what, why, blockers), priorities (what’s most important), and people (offering praise and feedback, and addressing challenges).

Stay in touch. Where alignment ensures the team can operate without physical presence, staying in touch ensures the team can remain human in the face of space/time differences. Staying in touch includes scheduling regular 1:1s to connect with a person individually, schedule leadership syncs to connect with the leaders in your business, and holding office hours.

Keep an eye on the horizon. If you’re paying attention to the people and the work they do, the final major step is making sure you’re not overwhelmed by the details. You can’t lose track of the purpose when the details threaten to overwhelm you. A few simple things you can do to that end are asking when a task will be done (to drive urgency and encourage an environment open to discussing problems) and carving out time for thinking and planning.

In conclusion, remember the forces at work against all distributed teams—space and time—and focus on creating effective communication practices to counteract those forces. Staying aligned and in touch will help you foster communication, and keeping an eye on the horizon will get you in the habit of talking through problems and setting aside time to think ahead.

Dave “Dizzy” Smith is a senior director of engineering at DigitalOcean. A software industry veteran with over 21 years in the field, he has a broad range of experience across real-time messaging systems, identity federation and authentication, and low-latency peer-to-peer data stores, and has been an active contributor to many open source projects. Follow him on Twitter at @dizzyd.

Benchmarking with the NDSA Levels of Preservation

Published 18 Aug 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Anyone who has heard me talk about digital preservation will know that I am a big fan of the NDSA Levels of Preservation.

This is also pretty obvious if you visit me in my office – a print out of the NDSA Levels is pinned to the notice board above my PC monitor!

When talking to students and peers about how to get started in digital preservation in a logical, pragmatic and iterative way, I always recommend using the NDSA Levels to get started. Start at level 1 and move forward to the more advanced levels as and when you are able. This is a much more accessible and simple way to start addressing digital preservation than digesting some of the bigger and more complex certification standards and benchmarking tools.

Over the last few months I have been doing a lot of documentation work. Both ensuring that our digital archiving procedures are written down somewhere and documenting where we are going in the future.

As part of this documentation it seemed like a good idea to use the NDSA Levels:

Previously I have used the NDSA Levels in quite a superficial way – as a guide and a talking point, it has been quite a different exercise actually mapping where we stand.

It was not always straightforward to establish where we are and to unpick and interpret exactly what each level meant in practice. I guess this is one of the problems of using a relatively simple set of metrics to describe what is really quite a complex set of processes.

Without publishing the whole document that I've written on this, here is a summary of where I think we are currently. I'm also including some questions I've been grappling with as part of the process.

Storage and geographic location

Currently at LEVEL 2: 'know your data' with some elements of LEVEL 3 and 4 in place

See the full NDSA levels here

Four years ago we carried out a ‘rescue mission’ to get all digital data in the archives off portable media and on to the digital archive filestore. This now happens as a matter of course when born digital media is received by the archives.

The data isn’t in what I would call a proper digital archive but it is on a fairly well locked down area of University of York filestore.

There are three copies of the data available at any one time (not including the copy that is on original media within the strongrooms). The University stores two copies on spinning disk, one at a data centre on each of its two campuses, with a third copy backed up to tape and kept for 90 days.

I think I can argue that storing the data on two different campuses counts as two different geographic locations, but these locations are both in York and only about one mile apart. I'm not sure whether they could be described as having different disaster threats, so I'm going to hold back from putting us at Level 3, though IT do seem to have systems in place to ensure that filestore is migrated on a regular schedule.


File fixity and data integrity

Currently at LEVEL 4: 'repair your data'

See the full NDSA levels here

Having been in this job for five years now I can say with confidence that I have never once received file fixity information alongside data that has been submitted to us. Obviously if I did receive it I would check it on ingest, but I cannot envisage this scenario occurring in the near future! I do, however, create fixity information for all content as part of the ingest process.

I use a tool called Foldermatch to ensure that the digital data I have copied into the archive is identical to the original. Foldermatch allows you to compare the contents of two folders and one of the comparison methods (the one I use at ingest) uses checksums to do this.
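Foldermatch is a point-and-click tool, but the underlying idea is simple enough to sketch. The following Python is purely illustrative (my own sketch, not how Foldermatch itself is implemented): it compares two folders by checksumming every file in each.

```python
import hashlib
import os

def file_checksum(path):
    """SHA-256 of a single file, read in chunks so large files are safe."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def checksum_map(root):
    """Map each relative file path under root to its checksum."""
    result = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = file_checksum(full)
    return result

def folders_match(original, copy):
    """True if both trees hold the same files with identical checksums."""
    return checksum_map(original) == checksum_map(copy)
```

If the two maps differ, the mismatched relative paths tell you exactly which files to investigate.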

Last year I purchased a write blocker for use when working with digital content delivered to us on portable hard drives and memory sticks. A check for viruses is carried out on all content that is ingested into the digital archive, so this fulfils the requirements of level 2 and some of level 3.

Despite putting us at Level 4, I am still very keen to improve our processes and procedures around fixity. Fixity checks are carried out at intervals (several times a month) and these checks are logged but at the moment this is all initiated manually. As the digital archive gets bigger, we will need to re-think our approaches to this important area and find solutions that are scalable.


Information Security

Currently at LEVEL 2: 'know your data' with some elements of LEVEL 3 in place

See the full NDSA levels here

Access to the digital archive filestore is limited to the digital archivist and IT staff who administer the filestore. If staff or others need to see copies of data within the digital archive filestore, copies are made elsewhere after appropriate checks are made regarding access permissions. The master copy is always kept on the digital archive filestore to ensure that the authentic original version of the data is maintained. Access restrictions are documented.

We are also moving towards the higher levels here. A recently reported issue concerning a mysterious change of last modified dates for .eml files has led to discussions with colleagues in IT, and I have been informed that an operating system upgrade for the server should include the ability to provide logs of who has done what to files in the archive.

It is worth pointing out that, as I don't currently have systems in place for recording PREMIS (preservation) metadata, I am taking a hands-off approach to preservation planning within the digital archive. Preservation actions such as file migration are few and far between and are recorded in a temporary way until a more robust system is established.


Metadata

Currently at LEVEL 3: 'monitor your data'

See the full NDSA levels here

We do OK with metadata currently (considering a full preservation system is not yet in place). Using DROID at ingest helps fulfil some of the requirements of levels 1 to 3 (essentially, having a record of what was received and where it is).

Our implementation of AtoM as our archival management system has helped fulfil some of the other metadata requirements. It gives us a place to store administrative metadata (who gave it to us and when) as well as providing a platform to surface descriptive metadata about the digital archives that we hold.

Whether we actually have descriptive metadata for digital archives will remain an issue. Much metadata for the digital archive can be generated automatically but descriptive metadata isn't quite as straightforward. In some cases a basic listing is created for files within the digital archive (using Dublin Core as a framework) but this will not happen in all cases. Descriptive metadata typically will not be created until an archive is catalogued, which may come at a later date.

Our plans to implement Archivematica next year will help us get to Level 4 as this will create full preservation metadata for us as PREMIS.


File formats

Currently at LEVEL 2: 'know your data' with some elements of LEVEL 3 in place

See the full NDSA levels here

It took me a while to convince myself that we fulfilled Level 1 here! This is a pretty hard one to crack, especially if you have lots of different archives coming in from different sources, and sometimes with little notice. I think it is useful that the requirement at this level is prefaced with "When you can..."!

Thinking about it, we do do some work in this area - for example:

To get us to Level 2, as part of the ingest process we run DROID to get a list of file formats included within a digital archive. Summary stats are kept within a spreadsheet that covers all content within the digital archive so we can quickly see the range of formats that we hold and find out which archives they are in.
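As an aside, the summary stats can be scripted rather than compiled by hand. The sketch below assumes the default DROID CSV export, which includes TYPE, PUID and FORMAT_NAME columns (adjust the column names if your export profile differs):

```python
import csv
from collections import Counter

def format_summary(droid_csv_path):
    """Count each (PUID, format name) pair in a DROID CSV export.

    Assumes the default DROID export columns TYPE, PUID and
    FORMAT_NAME; folder rows (TYPE == "Folder") are skipped.
    """
    counts = Counter()
    with open(droid_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("TYPE") == "File":
                counts[(row.get("PUID", ""), row.get("FORMAT_NAME", ""))] += 1
    return counts
```

The resulting counts give a quick overview of the range of formats held, which can then be pasted into a spreadsheet or tracked over time.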

This should allow us to move towards Level 3 but we are not there yet. Some pretty informal and fairly ad hoc thinking goes into file format obsolescence but I won't go as far as saying that we 'monitor' it. I have an awareness of some specific areas of concern in terms of obsolete files (for example I've still got those WordStar 4.0 files and I really do want to do something with them!) but there are no doubt other formats that need attention that haven't hit my radar yet.

As mentioned earlier, we are not really doing migration right now - not until I have a better system for creating the PREMIS metadata, so Level 4 is still out of reach.



This has been a useful exercise and it is good to see where we need to progress. Going from using the Levels in the abstract to actually applying them as a tool has been a bit challenging in some areas. I think additional information and examples would be useful to help clear up some of the questions that I have raised.

I've also found that even where we meet a level there are often other ways we could do things better. File fixity and data integrity looks like a strong area for us, but I am all too aware that I would like to find a more sustainable and scalable way to do this. This is something we'll be working on as we get Archivematica in place. Reaching Level 4 shouldn't lead to complacency!

An interesting blog post last year by Shira Peltzman from the UCLA Library talked about Expanding the NDSA Levels of Preservation to include an additional row focused on Access. This seems sensible given that the ability to provide access is the reason why we preserve archives. I would be keen to see this developed further so long as the bar wasn't set too high. At the Borthwick my initial consideration has been preservation - getting the stuff and keeping it safe - but access is something that will be addressed over the next couple of years as we move forward with our plans for Archivematica and AtoM.

Has anyone else assessed themselves against the NDSA Levels?  I would be keen to see how others have interpreted the requirements.

Botanical Wonderland events

Published 18 Aug 2017 by carinamm in State Library of Western Australia Blog.

From pressed seaweed, to wildflower painting, embroidery, to photography – botanical wonders have inspired and defined Western Australia. Hear from art historian, author, artist and curator Dr Dorothy Erickson in two events at the State Library of Western Australia.


Lecture: Professional Women Artists in the Wildflower State by Dr Dorothy Erickson
Wednesday 23 August 2017 – 5:00-6:00 pm
Great Southern Room – State Library of Western Australia
Free. No bookings required

The first profession considered acceptable for middle-class women to practise was that of an artist. They were the ‘Angels in the Studio’ at the time when gold was first being found in Western Australia. While a few Western Australian-born women were trained artists, many others came in the wake of the gold rushes when Western Australia was the world’s El Dorado. A number were entranced by the unique wildflowers and made these the mainstay of their careers. This talk will focus on the professional women artists in Western Australia from 1890 to WWI, with particular attention to those who painted our unique botanical wonderland.


Lilian Wooster Greaves was a prolific Western Australian wildflower artist: “no one else seems to be able to equal her skill in pressing and mounting wildflower specimens, in the form of panels, cards and booklets” – The West Australian, 21 May 1927. Portrait of Lilian Wooster Greaves, Out of Doors in W.A., 1927, State Library of Western Australia 821A(W)GRE.

Floor Talk on Botanical Wonderland exhibition with Dr Dorothy Erickson
Friday 1 September 2017  – 1:00-1:30 pm
The Nook – State Library of Western Australia
Free. No bookings required.

Be inspired by the botanical wonders of Western Australia as Australian artist Dr Dorothy Erickson discusses some of the marvels on display in the exhibition.


Nature’s Showground, 1940. The Western Mail, State Library of Western Australia, 630.5WES.

Botanical Wonderland is a partnership between the Royal Western Australian Historical Society, the Western Australian Museum and the State Library of Western Australia. The exhibition is on display at the State Library until 24 September 2017.

Image: Acc 9131A/4: Lilian Wooster Greaves, pressed wildflower artwork, ‘Westralia’s Wonderful Wildflowers’, c1929


Program for W3C Publishing Summit Announced

Published 17 Aug 2017 by Bill McCoy in W3C Blog.

The program for the inaugural W3C Publishing Summit (taking place November 9-10, 2017 in the San Francisco Bay Area) has just been announced. The program will feature keynotes from Internet pioneer and futurist Tim O’Reilly and Adobe CTO Abhay Parasnis, along with dozens of other speakers and panelists who will showcase and discuss how web technologies are shaping publishing today, tomorrow, and beyond.

Publishing and the web interact in innumerable ways. From schools to libraries, from design to production to archiving, from metadata to analytics, from New York to Paris to Buenos Aires to Tokyo, the Summit will show how web technologies are making publishing more accessible, more global, and more efficient and effective. Mozilla user experience lead and author Jen Simmons will showcase the ongoing revolution in CSS. Design experts Laura Brady, Iris Febres and Nellie McKesson will cover putting the reader first when producing ebooks and automating publishing workflows. We’ll also hear from reading system creator Micah Bowers (Bluefire) and EPUB pioneers George Kerscher (DAISY) and Garth Conboy (Google).

The newly-unveiled program will also showcase insights from senior leaders from across the spectrum of publishing and digital content stakeholders including Jeff Jaffe (CEO, W3C), Yasushi Fujita (CEO, Media DO), Rick Johnson (SVP Product and Strategy, Ingram/VitalSource), Ken Brooks (COO, Macmillan Learning), Liisa McCloy-Kelley (VP, Penguin Random House), and representatives from Rakuten Kobo, NYPL, University of Michigan Library/Publishing, Wiley, Hachette Book Group, Editis, EDRLab, and more.

I’m very excited about this new event which represents an important next milestone in the expanded Publishing@W3C initiative and I hope you will join us. Register now. For more information on the event, see the W3C Publishing Summit 2017 homepage and Media Advisory.

Sponsors of the W3C Publishing Summit include Ingram/VitalSource, SPi Global, and Apex. Additional sponsorship opportunities are available, email me at for more information. The Publishing Summit is one of several co-located events taking place during W3C’s major annual gathering, TPAC, for which registration is open for W3C members.

x-post: Community Conduct Project Kick-off Meeting

Published 15 Aug 2017 by Ipstenu (Mika Epstein) in Make WordPress Plugins.

Community Conduct Project – Kick off meeting scheduled for 17:00 UTC on the 5th September 2017


Launching the WebAssembly Working Group

Published 3 Aug 2017 by Bradley Nelson in W3C Blog.

We’d like to announce the formation of a WebAssembly Working Group.

For over two years the WebAssembly W3C Community Group has served as a forum for browser vendors and others to come together to develop an elegant and efficient compilation target for the Web. A first version is available in 4 browser engines and is on track to become a standard part of the Web. We’ve had several successful in-person CG meetings, while continuing our robust online collaboration on github. We also look forward to engaging the wider W3C community at the WebAssembly meeting at this year’s TPAC.

With the formation of this Working Group, we will soon be able to recommend an official version of the WebAssembly specification.

For those of you unfamiliar with WebAssembly, its initial goal is to provide a good way for C/C++ programs to compile to run on the Web, safely and at near-native speeds.

WebAssembly improves or enables a range of use cases, including:

WebAssembly is also about bringing more programming languages to the Web.

By offering a compact and well specified compilation target, WebAssembly enables not only compiled languages like C/C++ and Rust, but also interpreted languages like Lua, Python, and Ruby. As we enhance WebAssembly to support managed objects and better DOM+JS bindings, the list of supported languages will continue to grow.

Even if you develop primarily in JavaScript, you’ll benefit as a wealth of libraries from other languages are exposed to JavaScript. Imagine using JavaScript to access powerful libraries from outside the Web for things like physical simulation, fast number crunching, and machine learning.

There is still a lot of work to do with WebAssembly, which we will continue to incubate in our Community Group. We plan to make Wasm an even better compilation target and are already exploring adding features like: threads, managed object support, direct DOM/JS bindings, SIMD, and memory mapping.

A warm thanks to everyone involved with the WebAssembly effort.

Keep expecting the Web to do more!

WordPress 4.8.1 Maintenance Release

Published 2 Aug 2017 by Weston Ruter in WordPress News.

After over 13 million downloads of WordPress 4.8, we are pleased to announce the immediate availability of WordPress 4.8.1, a maintenance release.

This release contains 29 maintenance fixes and enhancements, chief among them are fixes to the rich Text widget and the introduction of the Custom HTML widget. For a full list of changes, consult the release notes, the tickets closed, and the list of changes.

Download WordPress 4.8.1 or visit Dashboard → Updates and simply click “Update Now.” Sites that support automatic background updates are already beginning to update to WordPress 4.8.1.

Thanks to everyone who contributed to 4.8.1:
Adam Silverstein, Andrea Fercia, Andrew Ozz, Atanas Angelov, bonger, Boone Gorges, Boro Sitnikovski, David Herrera, James Nylen, Jeffrey Paul, Jennifer M. Dodd, K. Adam White, Konstantin Obenland, Mel Choyce, r-a-y, Reuben Gunday, Rinku Y, Said El Bakkali, Sergey Biryukov, Siddharth Thevaril, Timmy Crawford, and Weston Ruter.

FastMail apps and signup available in Germany again

Published 2 Aug 2017 by Bron Gondwana in FastMail Blog.

For the past two months, FastMail has been rejecting new signups from Germany, and our app has been unavailable in the German app stores. We took these actions out of caution after receiving letters from the German Bundesnetzagentur requiring us to register with them as an email provider.

Until we obtained legal advice, we did not know whether we could comply with German requirements while fulfilling our obligations under Australian law.

Having conferred with lawyers in Germany, we have confidence that we can safely register, which allows us to provide apps in German stores again, and to accept new signups from German users.

From our lawyers:

you are required to notify your company and your business as a “commercial, publicly available telecommunications service”. Section 6 of the German Telecommunications Act (TKG) provides the legal basis for this notification requirement: “Any person operating a public telecommunications network on a commercial basis or providing a publicly available telecommunications service on a profit-oriented basis shall notify the Bundesnetzagentur without undue delay of beginning to provide, of providing with differences or of ceasing to provide his activity and of any changes in his undertaking. Such notification requires written form.”

you are not required to offer interception facilities. Most of the obligations following from part 7 (section 110 and 113) of the TKG are only relevant if the provider has more than 100,000 German customers. So it’s misleading if the Bundesnetzagentur talks about “100,000 customers” without mentioning that this means German customers only.

We currently have significantly fewer than 100,000 customers in Germany. We will re-assess our legal situation when we get closer to 100,000 German customers as international law in this area is changing quickly, and the situation may have changed again by the time those clauses become material.

FastMail continues to be an Australian company subject to Australian law as described in our privacy policy. Our understanding of Australian law is that it is illegal for us to directly provide any customer data or metadata to law enforcement authorities from outside Australia. If we receive requests from the Bundesnetzagentur we will follow our existing process for all foreign requests and refer them to their country's mutual assistance treaty with Australia.

The Month in WordPress: July 2017

Published 2 Aug 2017 by Hugh Lashbrooke in WordPress News.

After a particularly busy month in June, things settled down a bit in the WordPress world — WordPress 4.8’s release went very smoothly, allowing the Core team to build up some of the community infrastructure around development. Read on for more interesting news from around the WordPress world in July.

Weekly meeting for new core contributors

Onboarding new contributors is a persistent issue for most WordPress contribution teams. While every team welcomes any new contributors, the path to getting deeply involved can be tricky to find at times.

This month, the Core team implemented a fantastic new initiative: weekly meetings for new core contributors as a way to encourage involvement and foster fresh contributions. The meetings not only focus on bugs suited to first-time contributors, they also make space for experienced contributors to help out individuals who may be new to developing WordPress core.

The meetings are held every Wednesday at 19:00 UTC in the #core channel in the Making WordPress Slack group.

Increased focus on PHP practices in WordPress core

In bringing people together to improve WordPress core, a new channel in the Making WordPress Slack group named #core-php is designed to focus on PHP development in the project.

Along with this increased concentration on PHP, a new weekly meeting is now taking place every Monday at 18:00 UTC in #core-php to improve WordPress core’s PHP practices.

Sharp rise in meetup group growth

The dashboard events widget in WordPress 4.8 displays local, upcoming WordPress events for the logged in user. The events listed in this widget are pulled from the meetup chapter program, as well as the WordCamp schedule.

This widget provides greater visibility of official WordPress events, and encourages community involvement in these events. It’s safe to say that the widget has achieved its goals admirably — since WordPress 4.8 was released a little over a month ago, 31 new meetup groups have been formed with 15,647 new members across the whole program. This is compared to 19 new groups and only 7,071 new members in the same time period last year.

You can find a local meetup group to join, and if you would like to get involved in organizing events for your community, you can find out more about the inner workings of the program on the Community Team site or by joining the #community-events channel in the Making WordPress Slack group.

WordPress 4.8.1 due for imminent release

The first maintenance release of the WordPress 4.8 cycle will be published in the coming week, more than a month after 4.8 was released. This release fixes some important issues in WordPress core, and the majority of users will find that their sites update to this new version automatically.

If you would like to help out by testing this release before it goes live, you can follow the beta testing guide for WordPress core. To get further involved in building WordPress core, jump into the #core channel in the Making WordPress Slack group, and follow the Core team blog.

Further reading:

If you have a story we should consider including in the next “Month in WordPress” post, please submit it here.

Marley Spoon: A Look into Their Stack and Team Structure

Published 31 Jul 2017 by Hollie Haggans in DigitalOcean: Cloud computing designed for developers.

Marley Spoon: A Look into Their Stack and Team Structure

Over the past eleven months, more than 1,600 startups from around the world have built their infrastructure on DigitalOcean through Hatch, our global incubator program designed to help startups as they scale. Launched in 2016, the goal of the program is to help support the next generation of startups get their products off the ground.

Marley Spoon, a subscription meal kit company based in Berlin and Hatch startup, sees infrastructure as an integral part of every engineer’s workflow. “We are trying to build a team where people don’t feel responsible just for a small bit, but we want to build a team where people feel responsible for the whole architecture,” says Stefano Zanella, Head of Software Engineering at Marley Spoon. “In order to do this, we believe that people need to know how the system works.”

In this interview, Zanella gives us a glimpse into Marley Spoon’s unique engineering team structure, and the technologies they use to power both their customer-facing platform and the internal-facing production and distribution platform. The following has been transcribed from our Deep End podcast, and adapted for this blog post.

DigitalOcean: How do you model your engineering teams?

Stefano Zanella: Our teams are shaped around user flows to some extent. We have currently four teams: three teams are product teams—they are related to the product itself—and one team actually takes care of the platform for the infrastructure.

The [first] three teams, we shape them around the user flow. So, we have a team that takes care of the new customers. We call it the acquisition team because they focus mostly on marketing, but they also provide data insights, manage the customer experience for new customers, shorten the subscription flow, and so on.

Then we have a team that focuses on recurring customers. It’s the team that takes care of functionality like adding to an order, pausing subscriptions, skipping a delivery, changing your address, changing the time that you want your box at home, etc.

And then the third team actually takes care of what we call the “back office” in the sense that we do it in our own production centers; we have warehouses all across the world. We have a tool that tracks how many orders need to be done, when, where, and [by] which warehouse. We have them organize the batches because we work a lot with shippers and we try to be just in time, because of course the food is fresh and we want to keep it just like that. So this team takes care of all the production-related issues.

And how do you organize these teams? Do you have teams with maybe product managers, designers, engineers in the same group? Or [do] you isolate teams depending on their skill set or area of expertise?

The interesting thing about Marley Spoon is that the situation is always changing. We are very proud of the fact that we believe in owning the process and changing the process and structure as we see fit.

When we started we had an engineering team and a product team. Then, at some point, we realized that the communication structure wasn’t working well enough for us to be productive and effective enough. So we actually put the product managers inside the [engineering] teams. Then, we [also] figured out that the relationship with the designers wasn’t good enough, so we put the designers inside the team as well.

For a certain period of time, we had teams [that] were functional from my point of view, and now since we are growing a lot, the team is growing, and we have different needs. We are [now] focusing on product managers aligning with the rest of the business, rather than with engineers because the relationship with engineers is really good right now. We moved the product team outside of the teams again, so they are their own team because we want them to also work as a team, not just be disconnected. We assign specific product managers to specific departments and then internally, the team shuffles the work to the engineering team. But it’s a situation that can change every time, because it really depends on where we see the problems.

Going down the technology side of things, what’s your stack and architecture right now? Or maybe you want to talk about how Marley Spoon evolved?

Well, actually let me answer the last part of your question, because I think it’s really interesting speaking about the engineers. So, we do believe that the main role of an engineer is not writing code, but it's actually running the system.

And in order to do this, we believe that people need to know how the system works. They need to have a feeling of how the whole system is working. From that point of view, we don’t see all of the teams related to technology, for example. We use a workflow based on the Kanban workflow. Since it’s based on Kanban, every time somebody runs out of work, they are free to pick new work from the backlog. And the product managers manage the backlog, which means that whoever is free should pick stuff from the top because that’s the most important thing to do.

We don’t have this clear distinction between backend and frontend developers. We do have people that are more skilled at frontend or backend, but we try to broaden their scope of action all the time. So, from that point of view, we try to help each other a lot because we believe that’s the best way to grow.

Getting back to the stack question, what are the technologies you have in your architecture?

So, mainly we are a Ruby-based company. We use Rails mainly for our web apps. We have a couple of projects that are pure Ruby because they are projects for background processing. We started them in Ruby, but we are considering switching to a different technology.

We are currently in the process of upgrading the stack because we were using Backbone as a library and Coffeescript as a language because that was what was coming out with default Rails 4. Now we are slowly moving toward React because we see a lot of traction outside and inside the team as well. So, we would like to give it a try.

We hope that will also help us shape and improve our relationship with the designers, for example. Then we have a small command line tool for our Kanban board, which we wrote ourselves. We wrote our own Kanban board because we like to have a tool that can evolve with the process, and the command line tool lets you create tickets and move them around without leaving the terminal.

Tune into the full interview on our podcast, or learn more about our Hatch program today.

Hollie Haggans heads up Global Partnerships for DigitalOcean’s Hatch program. She is passionate about startups and cold brew coffee. Get in touch with questions at

The mysterious case of the changed last modified dates

Published 31 Jul 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Today's blog post is effectively a mystery story.

Like any good story it has a beginning (the problem is discovered, the digital archive is temporarily thrown into chaos), a middle (attempts are made to solve the mystery and make things better, several different avenues are explored) and an end (the digital preservation community come to my aid).

This story has a happy ending (hooray) but also includes some food for thought (all the best stories do) and as always I'd be very pleased to hear what you think.

The beginning

I have probably mentioned before that I don't have a full digital archive in place just yet. While I work towards a bigger and better solution, I have a set of temporary procedures in place to ingest digital archives on to what is effectively a piece of locked down university filestore. The procedures and workflows are both 'better than nothing' and 'good enough' as a temporary measure and actually appear to take us pretty much up to Level 2 of the NDSA Levels of Preservation (and beyond in some places).

One of the ways I ensure that all is well in the little bit of filestore that I call 'The Digital Archive' is to run frequent integrity checks over the data, using a free checksum utility. Checksums (effectively unique digital fingerprints) for each file in the digital archive are created when content is ingested and these are checked periodically to ensure that nothing has changed. IT keep back-ups of the filestore for a period of three months, so as long as this integrity checking happens within this three month period (in reality I actually do this 3 or 4 times a month) then problems can be rectified and digital preservation nirvana can be seamlessly restored.
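For anyone wanting to try something similar without a dedicated checksum utility, a minimal fixity workflow can be sketched in Python (an illustrative sketch of the general technique, not the tool I actually use): build a manifest of checksums at ingest, then re-verify it periodically.

```python
import hashlib
import os

def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """At ingest: map each relative file path under root to its checksum."""
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest

def verify_manifest(root, manifest):
    """Periodically: return paths whose checksum differs or which are missing."""
    failures = []
    for rel_path, expected in manifest.items():
        path = os.path.join(root, rel_path)
        if not os.path.exists(path) or sha256_of(path) != expected:
            failures.append(rel_path)
    return failures
```

An empty failure list is the "all is well" notification; anything else falls within the back-up window and can be restored.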

Checksum checking is normally quite dull. Thankfully it is an automated process that runs in the background and I can just get on with my work and cheer when I get a notification that tells me all is well. Generally all is well, it is very rare that any errors are highlighted - when that happens I blog about it!

I have perhaps naively believed for some time that I'm doing everything I need to do to keep those files safe and unchanged, because if the checksum is the same then all is well. However, this month I encountered a problem...

I've been doing some tidying of the digital archive structure and alongside this have been gathering a bit of data about the archives, specifically looking at things like file formats, number of unidentified files and last modified dates.

Whilst doing this I noticed that one of the archives that I had received in 2013 contained 26 files with a last modified date of 18th January 2017 at 09:53. How could this be so if I have been looking after these files carefully and the checksums are the same as they were when the files were deposited?

The 26 files were all EML files - email messages exported from Microsoft Outlook. These were the only EML files within the whole digital archive. The files weren't all in the same directory and other files sitting in those directories retained their original last modified dates.
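For what it's worth, a sweep like this can be scripted: GNU find can list files whose last modified date falls on a particular day (the paths, filenames and dates below are illustrative, not the real archive):

```shell
# Sketch: flag files modified on a specific day (example data only).
mkdir -p /tmp/archive-scan
touch -d "2017-01-18 09:53:00" /tmp/archive-scan/msg1.eml /tmp/archive-scan/msg2.eml
touch -d "2013-05-01 12:00:00" /tmp/archive-scan/notes.txt

# List everything whose mtime falls on 18 January 2017
find /tmp/archive-scan -type f -newermt "2017-01-18" ! -newermt "2017-01-19"
```

Run periodically, a listing like this would surface a cluster of suspiciously identical dates much sooner than a manual review.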

The middle

So this was all a bit strange...and worrying too. Am I doing my job properly? Is this something I should be bringing to the supportive environment of the DPC's Fail Club?

The last modified dates of files are important to us as digital archivists. This is part of the metadata that comes with a file. It tells us something about the file. If we lose this date are we losing a little piece of the authentic digital object that we are trying to preserve?

Instead of beating myself up about it I wanted to do three things:

  1. Solve the mystery (find out what happened and why)
  2. See if I could fix it
  3. Stop it happening again

So how could it have happened? Has someone tampered with these 26 files? That seems unlikely, considering they all have exactly the same date/time stamp, which to me suggests a more automated process. Also, the digital archive isn't widely accessible: quite deliberately, it is only really me (and the filestore administrators) who have access.

I asked IT whether they could explain it. Had some process been carried out across all filestores that involved EML files specifically? They couldn't think of a reason why this may have occurred. They also confirmed my suspicions that we have no backups of the files with the original last modified dates.

I spoke to a digital forensics expert from the Computer Science department and he said he could analyse the files for me and see if he could work out what had acted on them and also suggest a methodology of restoring the dates.

I have a record of the last modified dates of these 26 files when they arrived - the checksum tool that I use writes the last modified date to the hash file it creates. I wondered whether manually changing the last modified dates back to what they were originally was the right thing to do or whether I should just accept and record the change.

...but I decided to sit on it until I understood the problem better.

The end

I threw the question out to the digital preservation community on Twitter and as usual I was not disappointed!

In fact, along with a whole load of discussion and debate, Andy Jackson was able to track down what appears to be the cause of the problem.

He very helpfully pointed me to a thread on StackExchange which described the issue I was seeing.

It was a great comfort to discover that the cause of this problem was apparently a bug and not something more sinister. It appears I am not alone!

...but what now?

So now I think I know what caused the problem, but questions remain around how to catch issues like this more quickly (not six months after it has happened) and what to do with the files themselves.

IT have mentioned to me that an OS upgrade may provide us with better auditing support on the filestore. Being able to view reports on changes made to digital objects within the digital archive would be potentially very useful (though perhaps even that wouldn't have picked up this Windows bug?). I'm also exploring whether I can make particular directories read only and whether that would stop issues such as this occurring in the future.
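On the read-only idea, the crudest version is just permission bits. A managed filestore would use ACLs, but the effect can be sketched like this (the path is an invented example):

```shell
# Sketch: strip write permission from an ingested directory so its files
# can't be modified in place (example path; a real filestore would use ACLs).
mkdir -p /tmp/archive/accession-001
echo "data" > /tmp/archive/accession-001/letter.txt
chmod -R a-w /tmp/archive/accession-001
ls -l /tmp/archive/accession-001/letter.txt   # write bits are now gone
```

Of course, this wouldn't stop a bug or process running with elevated privileges, which is why proper auditing support is still worth pursuing.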

If anyone knows of any other tools that can help, please let me know.

The other decision to make is what to do with the files themselves. Should I try and fix them? There was more interesting debate on Twitter on this topic, and even on the value of these dates in the first place. If we can fudge them then so can others - they may have already been fudged before they got to the digital archive - in which case, how much value do they really have?

So should we try and fix last modified dates, or should we focus our attention on capturing and storing them within the metadata? The latter may be a more sustainable solution in the longer term, given their slightly slippery nature!
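Mechanically, both options are simple; the hard part is deciding on policy. A sketch of capturing a date into sidecar metadata and later restoring it with GNU touch (the file name and timestamp are invented examples, and the real hash files record dates in the tool's own format):

```shell
# Sketch: record a file's last modified date, then restore it after an
# unwanted change. The file and timestamp here are invented examples.
echo "message body" > /tmp/deposit.eml
touch -d "2013-06-14 10:22:00" /tmp/deposit.eml       # pretend original mtime

# Capture: store the date alongside the file as metadata
date -r /tmp/deposit.eml "+%Y-%m-%d %H:%M:%S" > /tmp/deposit.eml.mtime

# ...something resets the date; restore it from the recorded value
touch /tmp/deposit.eml                                # simulate the unwanted change
touch -d "$(cat /tmp/deposit.eml.mtime)" /tmp/deposit.eml
```

Which is precisely why the recorded copy in the metadata may matter more than the filesystem's own, easily disturbed, timestamp.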

I know there are lots of people interested in this topic - just see this recent blog post by Sarah Mason and in particular the comments - When was that?: Maintaining or changing ‘created’ and ‘last modified’ dates. It is great that we are talking about real nuts and bolts of digital preservation and that there are so many people willing to share their thoughts with the community.

...and perhaps if you have EML files in your digital archive you should check them too!

Don't lose your mail: a tale of horror

Published 31 Jul 2017 by Nicola Nye in FastMail Blog.

Caution: this blog post is not for the faint-hearted. It will take you on a journey through the darkest depths of despair and anxiety. But fear not, there is a happy ending. So come with us, gentle reader, on a tale of thrilling adventure sure to set your hair on end.

There you are, sitting at your desk (or on a train, or a bus) catching up on your email, making plans, following up with friends and colleagues. This is good. This is going great. Except...

You've lost your email.

Your heart sinks, your palms sweat.

You double-check. Definitely gone. Not all of it: some folders are still there. But mail is definitely... gone.

You check that your internet service provider isn't offline. You check that your internet is working. You check your mail on your phone and tablet.

Everything else is fine, but mail is missing. MISSING!

Do you use FastMail on the web?

Don't lose your mail: not with FastMail

Published 31 Jul 2017 by Nicola Nye in FastMail Blog.

Looking for the start of this adventure?

We know how vital email is to the everyday running of our lives. We know the sick feeling you get when you lose even one important mail.

We want to make it hard for you to accidentally lose mail, and easy for you to recover in the unfortunate event a disaster occurs.

FastMail. We've got your back.

Jazz and MediaWiki package

Published 28 Jul 2017 by in Category:Blog_posts.

, Fremantle.

And rain, I mustn't forget the rain. I'm worrying about the roof, although far less than I used to (it's a different roof). The jazz is the radio; it's on.

But the main point this morning is exploring the mediawiki-lts package maintained by Legoktm. I've been meaning to look at it for a while, and switch my (non-playground) wikis over to it, but there's never enough time. Not that there's enough time now, but I'm just trying to get it running locally for two wikis (yes, the smallest possible farm).

So, in simple steps, I first added the PPA:

sudo add-apt-repository ppa:legoktm/mediawiki-lts

This created /etc/apt/sources.list.d/legoktm-ubuntu-mediawiki-lts-xenial.list. Then I updated the package info:

sudo apt-get update

And installed the package:

sudo apt install mediawiki

At this point, the installation prompt for MediaWiki 1.27.3 was available at http://localhost/mediawiki/ (which luckily doesn't conflict with anything I already had locally) and I stepped through the installer, creating a new database and DB user via phpMyAdmin as I went, and answering all the questions appropriately. (It's actually been a while since I last saw the installer properly.) The only tricky thing I found was that it asks for the "Directory for deleted files" but not for the actual directory for all files — because I want the files to be stored in a particular place and not in /usr/share/mediawiki/images/, especially as I want there to be two different wikis that don't share files.

I made a typo in my database username in the installation form, and got an "Access denied for user x to database y" error. I hit the browser's back button, and then the installer's back buttons, to go back to the relevant page in the installer, fixed the typo and proceeded. It remembered everything correctly, and this time installed the database tables, with only one error: "Notice: JobQueueGroup::__destruct: 1 buffered job(s) of type(s) RecentChangesUpdateJob never inserted. in /usr/share/mediawiki/includes/jobqueue/JobQueueGroup.php on line 447". It didn't seem to matter.

At the end of the installer, it prompted me to download LocalSettings.php and put it at /etc/mediawiki/LocalSettings.php which I did:

sudo mv ~/LocalSettings.php /etc/mediawiki/.
sudo chown root:root /etc/mediawiki/LocalSettings.php
sudo chmod 644 /etc/mediawiki/LocalSettings.php

And then I had a working wiki at http://localhost/mediawiki/index.php!


I wanted a different URL, so edited /etc/apache2/sites-available/000-default.conf (in order to not modify the package-provided /etc/mediawiki/mediawiki.conf) to add:

Alias /mywiki /var/lib/mediawiki

And changed the following in LocalSettings.php:

$wgScriptPath = "/mywiki";

The multiple wikis will have to wait until later, as will the backup regime.


Roundup: Welcome, on news, bad tools and great tools

Published 28 Jul 2017 by Carlos Fenollosa in Carlos Fenollosa — Blog.

I'm starting a series of posts with a summary of the most interesting links I found. The concept of "social bookmarks" has always been interesting, but no implementation is perfect. was probably the closest to a good enough service, but in the end, we all just post them to Twitter and Facebook for shares and likes.

Unfortunately, Twitter search sucks, and browser bookmarks rot quickly. That's why I'm trying this new model of social + local, not only for my readers but also for myself. Furthermore, writing a tapas-sized post is much faster than a well-thought one.

Hopefully, forcing myself to post periodically —no promises, though— will also encourage me to write full-length articles from time to time.

Anyway, these posts will try to organize links I post on my Twitter account and provide a bit more context.

While other friends publish newsletters, I still believe RSS can work well, so subscribe to the RSS feed if you want to get these updates. Another option is to use one of the services which deliver feeds by email, like Feenbox (which, by the way, may never leave alpha, so drop me an email if you want an invitation).


RTVE, the Spanish public TV, has uploaded a few Bit a bit episodes. It was a rad early-90s show that presented video games and the early Internet.

On news

I quit reading news 3 years ago. A recent article from Tobias Rose-Stockwell digs deep into how your fear and outrage are being sold for profit by the Media.

@xurxof recommended a 2012 article from Rolf Dobelli, Avoid News. Towards a Healthy News Diet

LTE > Fiber

I was having router issues and realized how my cellphone internet is sometimes more reliable than my home fiber.

It seems to be more common than you'd think (read the Twitter replies!). XKCD also recently posted a comic on this.


There was a discussion on tools to journal your workday, which was one of the reasons that led me to try out these roundup posts.

New keyboard

I bought a Matias Clicky mechanical keyboard which sounds like a minigun. If you're interested in mechanical keyboards, you must watch Thomas's YouTube channel.

The new board doesn't have a nav cluster, so I configured Ctrl-HJKL to be the arrow keys. It takes a few days to get used to, but since then I've been using that combination even when I'm using a keyboard with arrow keys.

Slack eats CPU cycles

Slack was eating a fair amount of my CPU while my laptop was trying to build a Docker image and sync 3000 files on Dropbox. Matthew O'Riordan also wrote Where’s all my CPU and memory gone? The answer: Slack

Focus, focus, focus!

I'm a subscriber and use it regularly, especially when I'm working on the train or in a busy cafe.

musicForProgramming() is a free resource with a variety of music and also provides a podcast feed for updates.

Tags: roundup


My letter to the Boy Scouts of America

Published 25 Jul 2017 by legoktm in The Lego Mirror.

The following is a letter I just mailed to the Boy Scouts of America, following President Donald Trump's speech at the National Jamboree. I implore my fellow scouts to also contact the BSA to express their feelings.

25 July 2017

Boy Scouts of America
PO Box 152079
Irving, TX

Dear Boy Scouts of America,

Like many others I was extremely disappointed and disgusted to hear about the contents of President Donald Trump’s speech to the National Jamboree. Politics aside, I have no qualms with inviting the president, or having him speak to scouts. I was glad that some of the Eagle Scouts currently serving at high levels of our government were recognized for their accomplishments.

However above all, the Boy Scouts of America must adhere to the values of the Scout Law, and it was plainly obvious that the president’s speech did not. Insulting opponents is not “kindness”. Threatening to fire a colleague is not “loyal”. Encouraging boos of a former President is not “courteous”. Talking about fake news and media is not “trustworthy”. At the end of the day, the values of the Scout Law are the most important lesson we must instill in our youth – and President Trump showed the opposite.

The Boy Scouts of America must send a strong message to the public, and most importantly the young scouts that were present, that the president’s speech was not acceptable and does not embody the principles of the Boy Scouts of America.

I will continue to speak well of scouting and the program to all, but incidents like this will only harm future boys who will be dissuaded from joining the organization in the first place.

Kunal Mehta
Eagle Scout, 2012
Troop 294
San Jose, CA

How do I get my MediaWiki site to use templates? [closed]

Published 21 Jul 2017 by Cyberherbalist in Newest questions tagged mediawiki - Webmasters Stack Exchange.

My MediaWiki site is currently using v1.24.4.

I don't seem to have many templates installed, and some very important ones seem to be missing. For example, I can't use the Reference List template. If I do put references in an article, with {{reflist}} at the bottom, the template comes across as a redlink:


Are templates something that have to be installed separately? And if so, how do I go about it?

My site is hosted by DreamHost.

“Fixing the Web” with Jeff Jaffe, Brewster Kahle and Steven Gordon

Published 20 Jul 2017 by Amy van der Hiel in W3C Blog.

On 14 July 2017, W3C CEO Jeff Jaffe (MIT ’76) was featured as part of an MIT Alumni Association Panel “Fixing the Web” with Brewster Kahle, (’82) Founder and Digital Librarian, Internet Archive and Steven Gordon (’75), Professor of IT Management, Babson College.

When talking about the history of the Web and Tim Berners-Lee, Jeff noted that after its invention:

“He created a consortium called the W3C so that everyone who was interested in enhancing the web technology base can work together collaboratively.”

Jeff added about W3C:

“Most of our work recently has been transforming the web from being a large database of static information to dynamic information; a web of application where people build web applications which work essentially as distributed applications across multiple systems, making sure that we address societal problems such as web accessibility for people that have challenges or security privacy issues.”

The panel was moderated by science journalist Barbara Moran, and the topics were wide-ranging and interesting – from the Internet Archive to government control of the Web, advertising, social media, innovation and more.

In the discussion, a question was raised from Twitter about the EME standard:

Jeff noted:

We’ve developed a new proposed standard called EME, Encrypted Media Extensions, that instead of displaying these movies to hundreds of millions of people in an insecure and privacy violating fashion, we’ve built it in a way that makes it secure for people to watch movies.


Please watch the video if you’d like to see more.


Building the Lego Saturn V rocket 48 years after the moon landing

Published 20 Jul 2017 by legoktm in The Lego Mirror.

Full quality video available on Wikimedia Commons.

On this day 48 years ago, astronauts first landed on the moon after flying there in a Saturn V rocket.

Today I spent four hours building the Lego Saturn V rocket - the largest Lego model I've ever built. Throughout the process I was constantly impressed with the design of the rocket, and how it all came together. The attention paid to the little details is outstanding, and made it such a rewarding experience. If you can find a place that has them in stock, get one. It's entirely worth it.

The rocket is designed to be separated into the individual stages, and the lander actually fits inside the rocket. Vertically, it's 3ft, and comes with three stands so you can show it off horizontally.

As a side project, I also created a timelapse of the entire build, using some pretty cool tools. After searching online how to have my DSLR take photos on a set interval and being frustrated with all the examples that used a TI-84 calculator, I stumbled upon gphoto2, which lets you control digital cameras. I ended up using a command as simple as gphoto2 --capture-image-and-download -I 30 to have it take and save photos every 30 seconds. The only negative part is that it absolutely killed the camera's battery, and within an hour I needed to switch the battery.

To stitch the photos together (after renaming them a bit), ffmpeg came to the rescue: ffmpeg -r 20 -i "%04d.jpg" -s hd1080 -vcodec libx264 time-lapse.mp4. Pretty simple in the end!
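The "renaming them a bit" step is because ffmpeg's image-sequence input expects zero-padded sequential names like 0001.jpg; a sketch of that renumbering (the camera's original filenames here are invented):

```shell
# Sketch: renumber camera output into the 0001.jpg, 0002.jpg, ... sequence
# that ffmpeg's "%04d.jpg" pattern expects. Original names are invented.
mkdir -p /tmp/timelapse && cd /tmp/timelapse
touch capt0012.jpg capt0013.jpg capt0015.jpg   # gaps happen after deleting bad shots

n=1
for f in capt*.jpg; do
    mv "$f" "$(printf '%04d.jpg' "$n")"
    n=$((n + 1))
done

# Then stitch exactly as above:
# ffmpeg -r 20 -i "%04d.jpg" -s hd1080 -vcodec libx264 time-lapse.mp4
```

The glob is expanded once before the loop starts, so the freshly renamed files don't get picked up a second time.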

Riding the Jet Stream to 1 Million Users

Published 18 Jul 2017 by Ben Uretsky in DigitalOcean: Cloud computing designed for developers.

Riding the Jet Stream to 1 Million Users

Today, we’re excited to share a recent milestone with you: DO now supports 1 million users around the world. We’ve grown with our users, and have worked hard to give them the products they need to run their services without compromising the user experience they’ve come to love. We’re grateful to our users and community, and to the people that have helped us grow and learn along the way.

In 2012, DigitalOcean had a modest start. Our staging environment was around 4 or 5 servers, and we had a handful of engineers running the platform. We had two datacenter regions, 200 Droplets deployed, and a vision for what cloud computing could become. But most importantly, we had the support of a community of developers that helped us realize that vision.

A Maiden Voyage

Holding user groups in our early stages really helped us answer key questions about what aspects of the user experience could be improved. We launched our first datacenter, NYC1, and opened up our first international datacenter, AMS1, in January 2012.

Our users have played a huge part in helping us determine where to launch new datacenters to serve them better; in addition to NYC and Amsterdam, we now have them in San Francisco, Frankfurt, London, Singapore, Toronto, and Bangalore. Our dedicated team of network engineers, software engineers, datacenter technicians, and platform support specialists have worked tirelessly to give all of our users a great experience and access to simple cloud computing at any scale.

Making Waves

Among our early adopters were projects and companies like AudioBox and GitLab, who have scaled along with us as we've grown. Projects like Laravel Forge also chose to host their applications on DO. We've also partnered with companies like GitHub (Student Developer Pack and Hacktoberfest), Docker (Docker Student Developer Kit and our Docker one-click application), CoreOS, and Mesosphere on major initiatives.

Developers that helped spread the word when we first started include John Resig (jQuery), Jeff Atwood (Stack Overflow), Ryan Bates (Railscast), Xavier Noria (core Rails contributor), and Salvatore Sanfilippo (Redis).

Pere Hospital, co-founder of Cloudways, found DigitalOcean in 2014 while looking for an IaaS partner that could add value to his clients’ business processes. When Cloudways hit 5,000 DO compute instances they had their own internal celebration—and they’ve added thousands more since.

John O’Nolan, founder of Ghost, shared this anecdote: “On one of DigitalOcean's birthdays, the team sent us a couple of vinyl shark toys as a surprise and a thank you for being a customer. These sharks quickly became a mainstay of our weekly team meetings, along with the most horrific slew of puns: “Are you being ‘shark-astic’?” “That sounds a bit fishy.” etc. The jokes went so far that six months later we somehow found ourselves on a retreat in Thailand with our CTO, Hannah, coding at a table in a full-body shark costume.”

Additionally, several community members embraced DO and created tools that extended our API early on. Jack Pearkes created the command line tool, Tugboat, in 2013. Ørjan Blom created Barge, a Ruby library that pre-dated our official Ruby library, droplet_kit. Lorenzo Setale created python-digitalocean, which remains the most widely used Python library on DO. And Antoine Corcy created DigitalOceanV2, a library that helps PHP applications interact with v2 of the DO API. There have also been many others that have shared feedback with us and created tools of their own. We thank all of you for being a part of this.

All Hands on Deck

Members of the DO community have become a part of the DO family. We’ve reached over 1,600 tutorials on our Community site, in large part due to technologists that have contributed articles through participation in our Get Paid to Write program. Marko Mudrinić, for example, has written a number of articles for the Community site, frequently engages with other users in our Q&A section, and contributes to the official DO command line tool, doctl.

We’ve been lucky to have community members go on to join the DO team, like Community Manager Kamal Nasser and Platform Support Specialist Jonathan Tittle. Jonathan was an early adopter, having migrated his company’s customers to DO back in 2012. He then became one of our most engaged Community members. Jonathan told me, “When I look over questions posted to the DigitalOcean Community, I can honestly look back and say 'I’ve been there' and recall the countless times that I ran into an issue and couldn’t find the answer on my own, much less get the help I needed from someone who knew. When the questions were stacking up one day, I dove in and did my best to help. I quickly found myself spending countless hours troubleshooting alongside a user until an issue was resolved. I was simply trying to offer a helping hand when and where I could.”

Over the Horizon

The journey to 1 million is full of stories, people, moments, events, and companies that have crossed paths with us and have inspired us. Our users have been with us every step of the way, and we’ve tasked ourselves with meeting their growing infrastructure needs, and their goals for engaging and collaborating with us. There is so much more to come, and we’re excited to share it all with you. Thank you!

Song Club Showcase

Published 14 Jul 2017 by Dave Robertson in Dave Robertson.

While the finishing touches are being put on the album, I'm going solo with other Freo songwriters at the Fib.


Net Neutrality: Why the Internet Must Remain Open and Accessible

Published 11 Jul 2017 by Ben Uretsky in DigitalOcean: Cloud computing designed for developers.

Net Neutrality: Why the Internet Must Remain Open and Accessible

DigitalOcean is proud to be taking part in today’s Day of Action to Save Net Neutrality. Access to an open internet is crucial to allowing companies like DigitalOcean and the thousands of businesses we power to exist. This is not something we can take for granted. Efforts to roll back the protections provided by net neutrality rules will stifle innovation and create an uneven playing field for smaller companies competing with entrenched players.

I want to share the letter that I sent to our representatives in the Senate and encourage you to join us in speaking up while there's still time.

DigitalOcean Inc. supports the Federal Communication Commission’s Open Internet Order and the principles of network neutrality that it upholds. As an infrastructure provider serving over one million registered users, we support our customers’ rights to fair, equal, and open network access as outlined in the Order. We have not experienced, nor do we anticipate experiencing, any negative impact on broadband investment or service as a result of the Order.

We strongly oppose the Commission’s recent proposal to dismantle the 2015 Open Internet Order. As evidenced by the federal judiciary over the past two decades in Comcast Corp. v. FCC and other cases, the Commission cannot enforce unbiased and neutral Internet access without the Title II classification of broadband providers as telecommunications providers. Therefore, we ask you to codify Title II reclassification into law. It is the only way to uphold network neutrality principles.

As a direct competitor to the largest technology infrastructure providers in the nation, we are concerned that the Commission’s recent Notice for Proposed Rulemaking (WC Docket No. 17-108) will create an anti-competitive market environment because the costs of unfair networking practices will be forced onto infrastructure providers such as ourselves. Furthermore, many of our customers are individuals or small edge providers for whom changes to current network neutrality policies would significantly raise barriers to entry in various markets. Without legal protections against network blocking, throttling, unreasonable interference, and paid prioritization, it will be more difficult for us and for our customers to innovate, compete, and support the free flow of information.

By protecting network neutrality, we hope that the 115th Congress can promote investment in New York, eliminate business uncertainty with regards to FCC rulemaking, support competition in the broadband market, and encourage small businesses to innovate. We look forward to working with you on passing legislation related to this issue.


Ben Uretsky, CEO, DigitalOcean Inc.

If you’re in the US, join us and stand up for your right to an open and accessible internet by submitting your own letter to the FCC today.

Wikidata Map July 2017

Published 11 Jul 2017 by addshore in Addshore.

It’s been 9 months since my last Wikidata map update, and once again many new noticeable areas have appeared, including Norway, South Africa, Peru and New Zealand to name but a few. As with the last map generation post, I once again created a diff image so that the areas of change are easily identifiable, comparing the data from July 2017 with that from my last post in October 2016.

The various sizes of the generated maps can be found on Wikimedia Commons:

Reasons for increases

If you want to have a shot at figuring out the cause of the increases in specific areas then take a look at my method described in the last post using the Wikidata Query Service.

People's discoveries so far:

I haven’t included the names of those that discovered reasons for areas of increase above, but if you find your discovery here and want credit just ask!

Introducing High CPU Droplets

Published 10 Jul 2017 by Ben Schaechter in DigitalOcean: Cloud computing designed for developers.

Introducing High CPU Droplets

Today, we’re excited to announce new High CPU Droplet plans for CPU-intensive workloads. As applications grow and evolve we have found that there are certain workloads that need more powerful underlying computing power.

Use Cases

Here are some use cases that can benefit from CPU-optimized compute servers:


We are offering five new Droplet plans. They start from $40/mo for two dedicated vCPUs, up to $640/mo for 32 dedicated vCPUs.

Introducing High CPU Droplets

We've partnered with Intel to back these Droplets with Intel's most powerful processors, delivering a maximum, reliable level of performance. Going forward, we’ll regularly evaluate and use the best CPUs available to ensure they always deliver the best performance for your applications.

The current CPUs powering High CPU Droplets are the Intel Broadwell 2697Av4 with a clock speed of 2.6GHz, and the Intel Skylake 8168 with a clock speed of 2.7GHz. Customers in our early access period have seen up to four times the performance of Standard Droplet CPUs, and on average see about 2.5 times the performance.

These Droplets are available through the Control Panel and the API starting today as capacity allows in SFO2, NYC1, NYC3, TOR1, BLR1, AMS3, FRA1, and LON1.

Ben Schaechter
Product Manager, Droplet

Preserving Google docs - decisions and a way forward

Published 7 Jul 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Back in April I blogged about some work I had been doing around finding a suitable export (and ultimately preservation) format for Google documents.

This post has generated a lot of interest and I've had some great comments both on the post itself and via Twitter.

I was also able to take advantage of a slot I had been given at last week's Jisc Research Data Network event to introduce the issue to the audience (who had really come to hear me talk about something else but I don't think they minded).

There were lots of questions and discussion at the end of this session, mostly focused on the Google Drive issue rather than the rest of the talk. I was really pleased to see that the topic had made people think. In a lightning talk later that day, William Kilbride, Executive Director of The Digital Preservation Coalition, mused on the subject of "What is data?". Google Drive was one of the examples he used, asking where the data ends and the software application begins.

I just wanted to write a quick update on a couple of things - decisions that have been made as a result of this work and attempts to move the issue forward.

Decisions decisions

I took a summary of the Google docs data export work to my colleagues in a Research Data Management meeting last month in order to discuss a practical way forward for the institutional research data we are planning on capturing and preserving.

One element of the Proof of Concept that we had established at the end of phase 3 of Filling the Digital Preservation Gap was a deposit form to allow researchers to deposit data to the Research Data York service.

As well as the ability to enable researchers to browse and select a file or a folder on their computer or network, this deposit form also included a button to allow deposit to be carried out via Google Drive.

As I mentioned in a previous post, Google Drive is widely used at our institution. It is clear that many researchers are using Google Drive to collect, create and analyse their research data so it made sense to provide an easy way for them to deposit direct from Google Drive. I just needed to check out the export options and decide which one we should support as part of this automated export.

However, given the inconclusive findings of my research into export options it didn't seem that there was one clear option that adequately preserved the data.

As a group we decided the best way out of this imperfect situation was to ask researchers to export their own data from Google Drive in whatever format they consider best captures the significant properties of the item. Exporting manually prior to upload gives them the opportunity to review and check their files and to make their own decision on issues such as whether comments are included in the version of their data that they upload to Research Data York.

So for the time being we are disabling the Google Drive upload button from our data deposit interface... which is a shame because a certain amount of effort went into getting that working in the first place.

This is the right decision for the time being though. Two things need to happen before we can make this available again:

  1. Understanding the use case - We need to gain a greater understanding of how researchers use Google Drive and what they consider to be 'significant' about their native Google Drive files.
  2. Improving the technology - We need to make some requests to Google to make the export options better.

Understanding the use case

We've known for a while that some researchers use Google Drive to store their research data. The graphic below was taken from a survey we carried out with researchers in 2013 to find out about current practice across the institution. 

Of the 188 researchers who answered the question "Where is your digital research data stored (excluding back up copies)?" 22 mentioned Google Drive. This is only around 12% of respondents but I would speculate that over the last four years, use of Google Drive will have increased considerably as Google applications have become more embedded within the working practices of staff and students at the University.

Where is your digital research data stored (excluding back up copies)?

To understand the Google Drive use case today I really need to talk to researchers.

We've run a couple of Research Data Management teaching sessions over the last term. These sessions are typically attended by PhD students but occasionally a member of research staff also comes along. When we talk about data storage I've been asking the researchers to give a show of hands as to who is using Google Drive to store at least some of their research data.

About half of the researchers in the room raise their hand.

So this is a real issue. 

Of course what I'd like to do is find out exactly how they are using it - whether they are creating native Google Drive files or just using Google Drive as a storage location or filing system for data that they create in another application.

I did manage to get a bit more detail from one researcher who said that they used Google Drive as a way of collaborating on their research with colleagues working at another institution but that once a document has been completed they will export the data out of Google Drive for storage elsewhere. 

This fits well with the solution described above.

I also arranged a meeting with a Researcher in our BioArCh department. Professor Matthew Collins is known to be an enthusiastic user of Google Drive.

Talking to Matthew gave me a really interesting perspective on Google Drive. For him it has become an essential research tool. He and his colleagues use many of the features of the Google Suite of tools for their day to day work and as a means to collaborate and share ideas and resources, both internally and with researchers in other institutions. He showed me PaperPile, an extension to Google Drive that I had not been aware of. He uses this to manage his references and share them with colleagues. This clearly adds huge value to the Google Drive suite for researchers.

He talked me through a few scenarios of how they use Google - some (such as the comments facility) I was very much aware of; others I've not used myself, such as using the Google APIs to visualise activity on preparing a report in Google Drive - showing a timeline of when different individuals edited the document. Now that looks like fun!

He also talked about the importance of the 'previous versions' information that is stored within a native Google Drive file. When working collaboratively it can be useful to be able to track back and see who edited what and when. 

He described a real scenario in which he had had to go back to a previous version of a Google Sheet to show exactly when a particular piece of data had been entered. I hadn't considered that the previous versions feature could be used to demonstrate that you made a particular discovery first. Potentially quite important in the competitive world of academic research.

For this reason Matthew considered the native Google Drive file itself to be "the ultimate archive" and "a virtual collaborative lab notebook". A flat, static export of the data would not be an adequate replacement.

He did however acknowledge that the data can only exist for as long as Google provides us with the facility and that there are situations where it is a good idea to take a static back up copy.

He mentioned that the precursor to Google Docs was a product called Writely (which he was also an early adopter of). Google bought Writely in 2006 after seeing the huge potential in this online word processing tool. Matthew commented that backwards compatibility became a problem when Google started making some fundamental changes to the way the application worked. This is perhaps the issue that is being described in this blog post: Google Docs and Backwards Compatibility.

So, I'm still convinced that even if we can't preserve a native Google Drive file perfectly in a static form, this shouldn't stop us having a go!

Improving the technology

Alongside trying to understand how researchers use Google Drive and what they consider to be significant and worthy of preservation, I have also been making some requests and suggestions to Google around their export options. There are a few ideas I've noted that would make it easier for us to archive the data.

I contacted the Google Drive forum and was told that as a Google customer I was able to log in and add my suggestions to Google Cloud Connect, so this I did... and what I asked for was as follows:

  • Please can we have a PDF/A export option?
  • Please could we choose whether or not to export comments ...and if we are exporting comments, could we choose whether historic/resolved comments are also exported?
  • Please can metadata be retained - specifically the created and last modified dates. (Author is a bit trickier - in Google Drive a document has an owner rather than an author. The owner probably is the author (or one of them) but not necessarily if ownership has been transferred).
  • I also mentioned a little bug relating to comment dates that I found when exporting a Google document containing comments out into docx format and then importing it back again.
Since I submitted these feature requests and comments in early May it has all gone very very quiet...

I have a feeling that ideas only get anywhere if they are popular ...and none of my ideas are popular ...because they do not lead to new and shiny functionality.

Only one of my suggestions (re comments) has received a vote by another member of the community.

So, what to do?

Luckily, since I spoke about my problem at the Jisc Research Data Network, two people have mentioned they have Google contacts who might be interested in hearing my ideas.

I'd like to follow up on this, but in the meantime it would be great if people could give me some feedback.

  • Are my suggestions sensible? 
  • Are there are any other features that would help the digital preservation community preserve Google Drive? I can't imagine I've captured everything...

The UK Archivematica group goes to Scotland

Published 6 Jul 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday the UK Archivematica group met in Scotland for the first time. The meeting was hosted by the University of Edinburgh and as always it was great to be able to chat informally to other Archivematica users in the UK and find out what everyone is up to.

The first thing to note was that since this group of Archivematica ‘explorers’ first met in 2015 real and tangible progress seems to have been made. This was encouraging to see. This is particularly the case at the University of Edinburgh. Kirsty Lee talked us through their Archivematica implementation (now in production) and the steps they are taking to ingest digital content.

One of the most interesting bits of her presentation was a discussion about appraisal of digital material and how to manage this at scale using the available tools. When using Archivematica (or other digital preservation systems) it is necessary to carry out appraisal at an early stage before an Archival Information Package (AIP) is created and stored. It is very difficult (perhaps impossible) to unpick specific files from an AIP at a later date.

Kirsty described how one of her test collections has been reduced from 5.9GB to 753MB using a combination of traditional and technical appraisal techniques. 

Appraisal is something that is mentioned frequently in digital preservation discussions. There was a group talking about just this a couple of weeks ago at the recent DPC unconference ‘Connecting the Bits’. 

As ever it was really valuable to hear how someone is moving forward with this in a practical way. 

It will be interesting to find out how these techniques can be applied at scale to some of the larger collections Kirsty intends to work with.

Kirsty recommended an article by Victoria Sloyan, Born-digital archives at the Wellcome Library: appraisal and sensitivity review of two hard drives which was helpful to her and her colleagues when formulating their approach to this thorny problem.

She also referenced the work that the Bentley Historical Library at University of Michigan have carried out with Archivematica and we watched a video showing how they have integrated Archivematica with DSpace. This approach has influenced Edinburgh’s internal discussions about workflow.

Kirsty concluded with something that rings very true for me (in fact I think I said it myself in the two presentations I gave last week!). Striving for perfection isn't helpful; the main thing is just to get started and learn as you go along.

Rachel McGregor from the University of Lancaster gave an entertaining presentation about the UK Archivematica Camp that was held in York in April, covering topics as wide ranging as the weather, the food and finally feeling the love for PREMIS!

I gave a talk on work at York to move Archivematica and our Research Data York application towards production. I had given similar talks last week at the Jisc Research Data Network event and a DPC briefing day but I took a slightly different focus this time. I wanted to drill in a bit more detail into our workflow, the processing configuration within Archivematica and some problems I was grappling with. 

It was really helpful to get some feedback and solutions from the group on an error message I'd encountered whilst preparing my slides the previous day and to have a broader discussion on the limitations of web forms for data upload. This is what is so good about presenting within a small group setting like this as it allows for informality and genuinely productive discussion. As a result of this I overran and made people wait for their lunch (very bad form, I know!)

After lunch John Kaye updated the group on the Jisc Research Data Shared Service. This is becoming a regular feature of our meetings! There are many members of the UK Archivematica group who are not involved in the Jisc Shared Service so it is really useful to be able to keep them in the loop. 

It is clear that there will be a substantial amount of development work within Archivematica as a result of its inclusion in the Shared Service and features will be made available to all users (not just those who engage directly with Jisc). One example of this is containerisation which will allow Archivematica to be more quickly and easily installed. This is going to make life easier for everyone!

Sean Rippington from the University of St Andrews gave an interesting perspective on some of the comparison work he has been doing of Preservica and Archivematica. 

Both of these digital preservation systems are on offer through the Jisc Shared Service and as a pilot institution St Andrews has decided to test them side by side. Although he hasn’t yet got his hands on both, he was still able to offer some really useful insights on the solutions based on observations he has made so far. 

First he listed a number of similarities - for example alignment with the OAIS Reference Model, the migration-based approach, the use of microservices and many of the tools and standards that they are built on.

He also listed a lot of differences - some are obvious, for example one system is commercial and the other open source. This leads to slightly different models for support and development. He mentioned some of the additional functionality that Preservica has, for example the ability to handle emails and web archives and the inclusion of an access front end. 

He also touched on reporting. Preservica does this out of the box whereas with Archivematica you will need to use a third party reporting system. He talked a bit about the communities that have adopted each solution and concluded that Preservica seems to have a broader user base (in terms of the types of institution that use it). The engaged, active and honest user community for Archivematica was highlighted as a specific selling point, as was the work of the Filling the Digital Preservation Gap project (thanks!).

Sean intends to do some more detailed comparison work once he has access to both systems and we hope he will report back to a future meeting.

Next up we had a collaborative session called ‘Room 101’ (even though our meeting had been moved to room 109). Considering we were encouraged to grumble about our pet hates, this session came out with some useful nuggets:

After coffee break we were joined (remotely) by several representatives from the OSSArcFlow project from Educopia and the University of North Carolina. This project is very new but it was great that they were able to share with us some information about what they intend to achieve over the course of the two year project. 

They are looking specifically at preservation workflows using open source tools (Archivematica, BitCurator and ArchivesSpace) and they are working with 12 partner institutions who will all be using at least two of these tools. The project will not only provide training and technical support, but will fully document the workflows put in place at each institution. This information will be shared with the wider community. 

This is going to be really helpful for those of us who are adopting open source preservation tools, helping to answer some of those niggling questions such as how to fill the gaps and what happens when there are overlaps in the functionality of two tools.

We registered our interest in continuing to be kept in the loop about this project and we hope to hear more at a future meeting.

The day finished with a brief update from Sara Allain from Artefactual Systems. She talked about some of the new things that are coming in versions 1.6.1 and 1.7 of Archivematica.

Before leaving Edinburgh it was a pleasure to be able to join the University at an event celebrating their progress in digital preservation. Celebrations such as this are pretty few and far between - perhaps because digital preservation is a task that doesn’t have an obvious end point. It was really refreshing to see an institution publicly celebrating the considerable achievements made so far. Congratulations to the University of Edinburgh!

Hot off the press…

Published 4 Jul 2017 by Tom Wilson in thomas m wilson.


Can't connect to MediaWiki on Nginx server [duplicate]

Published 4 Jul 2017 by Marshall S. Lee in Newest questions tagged mediawiki - Server Fault.

This question is an exact duplicate of:

I downloaded and configured MediaWiki on the Ubuntu server. I'm running it on Nginx, so I opened the nginx.conf file and modified the server part as follows.

    server {
        listen 80;
        server_name;

        access_log /var/log/nginx/access-wiki.log;
        error_log /var/log/nginx/error-wiki.log;

        charset utf-8;
        passenger_enabled on;
        client_max_body_size 50m;

        location / {
            root /var/www/html/mediawiki;
            index index.php;
        }

        # pass the PHP scripts to FastCGI server
        location ~ \.php$ {
            root           html;
            fastcgi_pass   unix:/var/run/php/php7.0-fpm.sock;
            fastcgi_index  index.php;
            fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
            include        fastcgi_params;
        }

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        location ~ /\.ht {
            deny  all;
        }
    }

After editing, I restarted Nginx and now I face another problem. Every time I try to access the site via the domain above, instead of the MediaWiki main page I receive a file which says the following.

 * This is the main web entry point for MediaWiki.
 * If you are reading this in your web browser, your server is probably
 * not configured correctly to run PHP applications!
 * See the README, INSTALL, and UPGRADE files for basic setup instructions
 * and pointers to the online documentation.
 * ----------
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * @file

// Bail on old versions of PHP, or if composer has not been run yet to install
// dependencies. Using dirname( __FILE__ ) here because __DIR__ is PHP5.3+.
// @codingStandardsIgnoreStart MediaWiki.Usage.DirUsage.FunctionFound
require_once dirname( __FILE__ ) . '/includes/PHPVersionCheck.php';
// @codingStandardsIgnoreEnd
wfEntryPointCheck( 'index.php' );

require __DIR__ . '/includes/WebStart.php';

$mediaWiki = new MediaWiki();

Now, in the middle of the setup, I'm almost lost and have no idea how to work it out. I created a file hello.html in the root directory and accessed the page in my browser; this works. I do believe that the PHP configuration part is causing the errors, but I don't know how to fix it.
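For what it's worth, seeing the raw PHP source usually means the request never reached PHP-FPM with a usable script path. Two things in the config above look suspect: `SCRIPT_FILENAME` still points at the `/scripts` placeholder from the stock nginx.conf, and the PHP location sets `root html;` instead of the MediaWiki directory. A sketch of a more conventional MediaWiki server block, keeping the paths from the question (the server_name is a placeholder, and whether `passenger_enabled` should stay depends on what else this server hosts):

```nginx
server {
    listen 80;
    server_name wiki.example.com;   # placeholder - use your real domain

    # One document root for the whole site, inherited by all locations
    root  /var/www/html/mediawiki;
    index index.php;

    location / {
        try_files $uri $uri/ =404;
    }

    location ~ \.php$ {
        fastcgi_pass  unix:/var/run/php/php7.0-fpm.sock;
        fastcgi_index index.php;
        # Resolve the script against the real document root,
        # not the /scripts placeholder:
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include       fastcgi_params;
    }

    location ~ /\.ht {
        deny all;
    }
}
```

After `nginx -t` passes and the service is reloaded, requests for index.php should be executed by PHP-FPM rather than served as a plain file.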


Published 4 Jul 2017 by fabpot in Tags from Twig.

The Month in WordPress: June 2017

Published 3 Jul 2017 by Hugh Lashbrooke in WordPress News.

We’re starting a new regular feature on this blog today. We’d like to keep everyone up-to-date about the happenings all across the WordPress open source project and highlight how you can get involved, so we’ll be posting a roundup of all the major WordPress news at the end of every month.

Aside from other general news, the three big events in June were the release of WordPress 4.8, WordCamp Europe 2017, and the WordPress Community Summit. Read on to hear more about these as well as other interesting stories from around the WordPress world.

WordPress 4.8

On June 8, a week before the Community Summit and WordCamp Europe, WordPress 4.8 was released. You can read the Field Guide for a comprehensive overview of all the features of this release (the News and Events widget in the dashboard is one of the major highlights).

Most people would either have their version auto-updated, or their hosts would have updated it for them. For the rest, the updates have gone smoothly with no major issues reported so far.

This WordPress release saw contributions from 346 individuals; you can find their names in the announcement post. To get involved in building WordPress core, jump into the #core channel in the Making WordPress Slack group, and follow the Core team blog.

WordCamp Europe 2017

WordCamp Europe 2017 was held in Paris between June 15-17. The event began with a Contributor Day, followed by two days of talks and community goodness. The talks were live-streamed, and you can still catch all the recordings online. The organisers also published a handy wrap-up of the event.

WordCamp Europe exists to bring together the WordPress community from all over the continent, as well as to inspire local communities everywhere to get their own events going — to that end, the event was a great success, as a host of new meetup groups have popped up in the weeks following WordCamp Europe.

The work that Contributor Day participants accomplished was both varied and valuable, covering all aspects of the WordPress project — have a look through the Make blogs for updates from each team.

Finally, we also learned during the event that WordCamp Europe 2018 will be held in Belgrade, Serbia, continuing the tradition of exploring locations and communities across the continent.

WordPress Community Summit

The fourth WordPress Community Summit took place during the two days leading up to WordCamp Europe 2017. This event is an invite-only unconference where people from all over the WordPress community come together to discuss some of the more difficult issues in the community, as well as to make plans for the year ahead in each of the contribution teams.

As the Summit is designed to be a safe space for all attendees, the notes from each discussion are in the process of being anonymized before we publish them on the Summit blog (so stay tuned – they’ll show up there over the next few weeks).

You can already see the final list of topics that were proposed for the event here (although a few more were added during the course of the two day Summit).

WordPress marketing push continues apace

As part of the push to be more intentional in marketing WordPress (as per Matt Mullenweg’s 2016 State of the Word), the Marketing team has launched two significant drives to obtain more information about who uses WordPress and how that information can shape their outreach and messaging efforts.

The team is looking for WordPress case studies and is asking users, agencies, and freelancers to take a WordPress usage survey. This will go a long way towards establishing a marketing base for WordPress as a platform and as a community — and many people in the community are looking forward to seeing this area develop further.

To get involved in the WordPress Marketing team, you can visit their team blog.

New Gutenberg editor available for testing

For some time now, the Core team has been hard at work on a brand-new text editor for WordPress — this project has been dubbed “Gutenberg.” The project’s ultimate goal is to replace the existing TinyMCE editor, but for now it is in beta and available for public testing — you can download it here as a plugin and install it on any WordPress site.

This feature is still in beta, so we don’t recommend using it on a production site. If you test it out, though, you’ll find that it is a wholly different experience to what you are used to in WordPress. It’s a more streamlined, altogether cleaner approach to the text-editing experience than we’ve had before, and something that many people are understandably excited about. Matt Mullenweg discussed the purpose of Gutenberg in more detail during his Q&A at WordCamp Europe.

There are already a few reviews out from Brian Jackson at Kinsta, Aaron Jorbin, and Matt Cromwell (among many others). Keep in mind that the project is in constant evolution at this stage; when it eventually lands in WordPress core (probably in v5.0), it could look very different from its current iteration — that’s what makes this beta stage and user testing so important.

To get involved with shaping the future of Gutenberg, please test it out, and join the #core-editor channel in the Making WordPress Slack group. You can also visit the project’s GitHub repository to report issues and contribute to the codebase.

Further reading:

If you have a story we should consider including in the next “Month in WordPress” post, please submit it here.

2017 Community Summit Notes

Published 28 Jun 2017 by Ipstenu (Mika Epstein) in Make WordPress Plugins.

The Plugin team is small but mighty. We had a very productive summit and contributor day this year, pushing forward some of the changes we’ve been working on for a while. The following notes are the product of the sessions as well as some hallway chats over red wine, gin, and cheese.


To Do:

Most of that to-do is on me to at least get the tickets started, but if these are things you’re interested in, then I encourage you to come to the open office hours! I’m hoping to have the first in August, as I have July Vacations 🙂 Sorry, family first!

I’ll post more about what I plan to do with the open office hours soon, including topics and schedules.

#community-summit, #contributor-day

Test With Gutenberg Please!

Published 27 Jun 2017 by Ipstenu (Mika Epstein) in Make WordPress Plugins.

Call for testing: Gutenberg

This is especially important if your plugin adds meta boxes or otherwise makes changes to the editor. PLEASE test early and often.

A Dark Theme for FastMail

Published 26 Jun 2017 by Neil Jenkins in FastMail Blog.

As we pass the winter solstice here in Australia, the darkest night has come and the days are finally getting longer. But for those that miss the murky blackness (and for all our northern hemisphere customers journeying towards winter), you can now choose a Dark theme for your FastMail account and relive the inky twilight.

Want to try it out? Change your theme on the Settings → General & Preferences screen.

Here's what mail in the Dark theme looks like…

Mailbox in dark theme

(Please note that rich text (HTML) messages get a white background, as unfortunately too many messages set font colours presuming a light background).

And our Dark calendar…

Calendar in dark theme

Wikimedia Hackathon at home project

Published 24 Jun 2017 by legoktm in The Lego Mirror.

This is the second year I haven't been able to attend the Wikimedia Hackathon due to conflicts with my school schedule (I finish at the end of June). So instead I decided I would try and accomplish a large-ish project that same weekend, but at home. I'm probably more likely to get stuff done while at home because I'm not chatting up everyone in person!

Last year I converted OOjs-UI to use PHP traits instead of a custom mixin system. That was a fun project for me since I got to learn about traits and do some non-MediaWiki coding, while still reducing our technical debt.

This year we had some momentum on MediaWiki-Codesniffer changes, so I picked up one of our largest tasks which had been waiting - to upgrade to the 3.0 upstream PHP_CodeSniffer release. Being a new major release there were breaking changes, including a huge change to the naming and namespacing of classes. My current diffstat on the open patch is +301, -229, so it is roughly the same size as last year. The conversion of our custom sniffs wasn't too hard, the biggest issue was actually updating our test suite.
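To give a flavour of what that renaming involved: a sniff written for PHP_CodeSniffer 2.x implemented an underscore-named global interface, while 3.x expects real namespaces imported with `use`. A rough before/after sketch (the sniff itself is invented for illustration, not taken from MediaWiki-Codesniffer):

```php
<?php
// PHPCS 2.x style: one global, underscore-delimited class name, e.g.
//   class MediaWiki_Sniffs_Example_MySniff implements PHP_CodeSniffer_Sniff { ... }

// PHPCS 3.x style: real namespaces.
namespace MediaWiki\Sniffs\Example;

use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Sniffs\Sniff;

class MySniff implements Sniff {

	/** Token types this sniff wants to be called for */
	public function register() {
		return [ T_FUNCTION ];
	}

	/** Invoked once per registered token found in a file */
	public function process( File $phpcsFile, $stackPtr ) {
		$phpcsFile->addWarning( 'Example warning', $stackPtr, 'Found' );
	}
}
```

The mechanical part (renaming classes and adding `use` statements) is straightforward; as the post notes, the harder part was the test suite.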

We run PHPCS against test PHP files and verify the output matches the sniffs that we expect. Then we run PHPCBF, the auto-fixer, and check that the resulting "fixed" file is what we expect. The first wasn't too bad, as it just calls the relevant internal functions to run PHPCS, but the latter would write PHPCBF output to a virtual filesystem, shell out to create a diff, and then try to put it back together. Now, we just get the output from the relevant PHPCS class and compare it to the expected test output.

This change was included in the 0.9.0 release of MediaWiki-Codesniffer and is in use by many MediaWiki extensions.

Emulation for preservation - is it for me?

Published 23 Jun 2017 by Jenny Mitcham in Digital Archiving at the University of York.

I’ve previously been of the opinion that emulation isn’t really for me.

I’ve seen presentations about emulation at conferences such as iPRES and it is fair to say that much of it normally goes over my head.

This hasn’t been helped by the fact that I’ve not really had a concrete use case for it in my own work - I find it so much easier to relate and engage to a topic or technology if I can see how it might be directly useful to me.

However, for a while now I've been aware that emulation is what all the ‘cool kids’ in the digital preservation world seem to be talking about. From the very migration-heavy thinking of the 2000s it appears that things are now moving in a different direction.

This fact first hit my radar at the 2014 Digital Preservation Awards where the University of Freiburg won the OPF Award for Research and Innovation for their work on Emulation as a Service with bwFLA Functional Long Term Archiving and Access.

So I was keen to attend the DPC event Halcyon, On and On: Emulating to Preserve to keep up to speed... not least because it was hosted on the doorstep, in the centre of my home town of York!

It was an interesting and enlightening day. As usual the Digital Preservation Coalition did a great job of getting all the right experts in the room (sometimes virtually) at the same time, and a range of topics and perspectives were covered.

After an introduction from Paul Wheatley we heard from the British Library about their experiences of doing emulation as part of their Flashback project. No day on emulation would be complete without a contribution from the University of Freiburg. We had a thought provoking talk via WebEx from Euan Cochrane of Yale University Library and an excellent short film created by Jason Scott from the Internet Archive. One of the highlights for me was Jim Boulton talking about Digital Archaeology - and that wasn't just because it had ‘Archaeology’ in the title (honest!). His talk didn't really cover emulation; it related more to that other preservation strategy that we don't talk about much anymore - hardware preservation. However, many of the points he raised were entirely relevant to emulation - for example, how to maintain an authentic experience, how you define what the significant properties of an item actually are, and what decisions you have to make as a curator of the digital past. It was great to see how engaged the public were with his exhibitions and how people interacted with them.

Some of the themes of the day and take away thoughts for me:

Thinking about how this all relates to me and my work, I am immediately struck by two use cases.

Firstly research data - we are taking great steps forward in enabling this data to be preserved and maintained for the long term but will it be re-usable? For many types of research data there is no clear migration strategy. Emulation as a strategy for accessing this data ten or twenty years from now needs to be seriously considered. In the meantime we need to ensure we can identify the files themselves and collect adequate documentation - it is these things that will help us to enable reuse through emulators in the future.

Secondly, there are some digital archives that we hold at the Borthwick Institute from the 1980s. For example I have been working on a batch of WordStar files in my spare moments over the last few years. I'd love to get a contemporary emulator fired up and see if I could install WordStar and work with these files in their native setting. I've already gone a little way down the technology preservation route, getting WordStar installed on an old Windows 98 PC and viewing the files, but this isn't exactly contemporary. These approaches will help to establish the significant properties of the files and assess how successful subsequent migration strategies are... but this is a future blog post.

It was a fun event and it was clear that everybody loves a bit of nostalgia. Jim Boulton ended his presentation saying "There is something quite romantic about letting people play with old hardware".

We have come a long way and this is most apparent when seeing artefacts (hardware, software, operating systems, data) from early computing. Only this week whilst taking the kids to school we got into a conversation about floppy disks (yes, I know...). I asked the kids if they knew what they looked like and they answered "Yes, it is the save icon on the computer"(see Why is the save icon still a floppy disk?)...but of course they've never seen a real one. Clearly some obsolete elements of our computer history will remain in our collective consciousness for many years and perhaps it is our job to continue to keep them alive in some form.

Quick Method to wget my local wiki... need advice (without dumping mysql)

Published 23 Jun 2017 by WubiUbuntu1980 in Newest questions tagged mediawiki - Ask Ubuntu.

I need advice.

I have a webserver VM (on the LAN, not on the internet) that hosts 2 wikis:



I want to wget only the HomeWorkWiki pages, without crawling into the GameWiki.

My goal is to grab just the .html files (ignoring all other files: images etc.) with wget. (I don't want to do a mysqldump or MediaWiki export, but rather use wget for my (non-IT) boss, who just wants to double-click the HTML.)

How can I run wget so that it only crawls the HomeWorkWiki, and not the GameWiki, on this VM?
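Assuming the two wikis live under separate URL paths (the hostname and the /HomeWorkWiki/ path below are placeholders, not taken from the question), a recursive wget restricted to one path might look like this:

```shell
# Mirror only the HomeWorkWiki, skipping the GameWiki entirely.
# --no-parent keeps wget from climbing above /HomeWorkWiki/, so links
# into /GameWiki/ are never followed.
# --reject drops images and other non-HTML files.
# --adjust-extension saves pages with a .html suffix and
# --convert-links rewrites links so the saved pages open offline
# with a double click.
wget --recursive --no-parent \
     --reject 'jpg,jpeg,png,gif,ico,css,js,pdf,zip' \
     --adjust-extension --convert-links \
     http://webserver-vm/HomeWorkWiki/
```

For a MediaWiki site you may also want `--reject-regex '(Special:|action=)'` so the crawl skips edit, history and special pages rather than mirroring every dynamic view of each article.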


Using MediaWiki and external data, how can I show an image in a page, returned as a blob from a database?

Published 20 Jun 2017 by Masutatsu in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I'm creating a wiki (using MediaWiki) which pulls data from a MySQL instance, and uses this alongside a template to generate the page dynamically.

My MySQL instance contains images, stored in a field of type BLOB.

Is it possible for MediaWiki to interpret this BLOB data into the actual image desired to be shown on the page?

The Value of Having A Bug Bounty Program

Published 19 Jun 2017 by Rusty Bower in DigitalOcean: Cloud computing designed for developers.


In Spring of 2017, DigitalOcean transitioned from a private bug bounty program to a public bounty program on Bugcrowd. There were many drivers behind this decision, including getting more researcher engagement with our products, leveraging the researchers already active in the Bugcrowd ecosystem, and creating a scalable solution for the DO security team to manage. Although researchers were actively engaged in our original private bug bounty program, we immediately began to see quality vulnerabilities reported once we made the switch. Our old bug bounty program consisted of manual verification and a reward of Droplet credit and/or DO swag. While this worked when we were a much smaller company, the need to level up our bug bounty program has grown as we’ve scaled.

While we already conform to secure coding practices and undergo regular code audits and reviews, bug bounty researchers are able to find valuable bugs for our engineers to fix. Currently, any Bugcrowd researcher is able to test against the platform (although we’ve limited the scope to our API and Cloud endpoints for the time being).

Once we launched, we immediately saw results as hundreds of new researchers began testing the platform in the first few days alone.

In the rest of this post, we’ll explore some examples of bugs we received within 24 hours of launching our new bug bounty program.

One of the first submissions we received was a stored XSS in the notifications field via team name. This bug coincided with the launch of our new Teams product, and although our engineers had built client-side sanitization, server-side sanitization was not properly implemented. For this reason, an attacker could modify the team name, and any users invited to that team would have a stored XSS in their notifications area. After triaging and verifying this vulnerability, we worked closely with our engineers to get proper sanitization of all inputs, both on the client side and server side.

A second—and slightly more severe—reported vulnerability was a misconfiguration of Google OAuth in one of our applications. Even though all access to this application was supposed to be restricted to only valid DigitalOcean email addresses, the misconfiguration resulted in any valid Google Apps domain being able to authenticate successfully. Once we received this vulnerability, we worked quickly to check our logs for any potential unauthorized access. We didn’t find any, and as we reviewed the logs, we updated the OAuth provider to restrict appropriately.

The most exciting and severe vulnerability we received was a blind SQL injection into a specific search field in our API. We alerted the engineering team as soon as we received the report, and then audited the logs in search of any malicious activity exploiting this vulnerability. Thankfully, none were found, and our engineers were able to implement a fix within 24 hours of being notified of the issue.

As a company that takes security very seriously, we’ve gotten tremendous value from our new bug bounty program by finding issues and vulnerabilities that have passed code review and third-party penetration tests. It has afforded us the opportunity to work with and yield amazing results from skilled researchers we didn’t have access to before.

If you are interested in learning more—or participating in—our bug bounty program, visit our Bugcrowd program page.

Rusty Bower is an Information Security Engineer who manages the DigitalOcean Bug Bounty Program. When he is not triaging vulnerabilities, Rusty enjoys speaking about security topics and tinkering with random InfoSec projects in his basement.

A typical week as a digital archivist?

Published 16 Jun 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Sometimes (admittedly not very often) I'm asked what I actually do all day. So at the end of a busy week being a digital archivist I've decided to blog about what I've been up to.


Today I had a couple of meetings, one specifically to talk about digital preservation of electronic theses submissions. I've also had a work experience student on placement this week, so I have set up a metadata creation task which he has been busy working on.

When I had a spare moment I did a little more testing work on the EAD harvesting feature the University of York is jointly sponsoring Artefactual Systems to develop in AtoM. Testing this feature from my perspective involves logging into the test site that Artefactual has created for us and tweaking some of the archival descriptions. Once those descriptions are saved, I can take a peek at the job scheduler and make sure that new EAD files are being created behind the scenes for the Archives Hub to attempt to harvest at a later date.

This piece of development work has been going on for a few months now and communications have been technically quite complex so I'm also trying to ensure all the organisations involved are happy with what has been achieved and will be arranging a virtual meeting so we can all get together and talk through any remaining issues.

I was slightly surprised today to have a couple of requests to talk to the media. This has sprung from the news that the Queen's Speech will be delayed. One of the reasons for the delay relates to the fact that the speech has to be written on goat's skin parchment, which takes a few days to dry. I had previously been interviewed for an article entitled Why is the UK still printing its laws on vellum? and am now mistaken for someone who knows about vellum. I explained to potential interviewers that this is not my specialist subject!


In the morning I went to visit a researcher at the University of York. I wanted to talk to him about how he uses Google Drive in relation to his research. This is a really interesting topic to me right now as I consider how best we might be able to preserve current research datasets. Seeing how exactly Google Drive is used and what features the researcher considers to be significant (and necessary for reuse) is really helpful when thinking about a suitable approach to this problem. I sometimes think I work a little bit too much in my own echo chamber, so getting out and hearing different perspectives is incredibly valuable.

Later that afternoon I had an unexpected meeting with one of our depositors (well, there were two of them actually). I'd not met them before but have been working with their data for a little while. In our brief meeting it was really interesting to chat and see the data from a fresh perspective. I was able to reunite them with some digital files that they had created in the mid-1980s, had saved on to floppy disk and had not been able to access for a long time.

Digital preservation can be quite a behind-the-scenes sort of job - we always give a nod to the reason why we do what we do (i.e. we preserve for future reuse), but actually seeing the results of that work unfold in front of your eyes is genuinely rewarding. I had rescued something from the jaws of digital obsolescence so it could now be reused and revitalised!

At the end of the day I presented a joint webinar for the Open Preservation Foundation called 'PRONOM in practice'. Alongside David Clipsham (The National Archives) and Justin Simpson (Artefactual Systems), I talked about my own experiences with PRONOM, particularly relating to file signature creation, and ending with a call to arms "Do try this at home!". It would be great if more of the community could get involved!

I was really pleased that the webinar platform worked OK for me this time round (always a bit stressful when it doesn't) and that I got to use the yellow highlighter pen on my slides.

In my spare moments (which were few and far between), I put together a powerpoint presentation for the following day...


I spent the day at the British Library in Boston Spa. I'd been invited to speak at a training event they regularly hold for members of staff who want to find out a bit more about digital preservation and the work of the team.

I was asked specifically to talk through some of the challenges and issues that I face in my work. I found this pretty easy - there are lots of challenges - and I eventually realised I had too many slides so had to cut it short! I suppose that is better than not having enough to say!

Visiting Boston Spa meant that I could also chat to the team over lunch and visit their lab. They had a very impressive range of old computers and were able to give me a demonstration of Kryoflux (which I've never seen in action before) and talk a little about emulation. This was a good warm up for the DPC event about emulation I'm attending next week: Halcyon On and On: Emulating to Preserve.

Still left on my to-do list from my trip is to download Teracopy. I currently use Foldermatch for checking that files I have copied have remained unchanged. From the quick demo I saw at the British Library I think that Teracopy would be a simpler, one-step solution. I need to have a play with this and then think about incorporating it into the digital ingest workflow.
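The core of what these tools check - that a copy is bit-for-bit identical to the original - can be approximated with standard checksum utilities. A minimal sketch, with placeholder paths (this is not the author's actual workflow, just an illustration of the underlying fixity check):

```shell
# Record a checksum for every file under the original directory,
# then verify the copy against that manifest. Any changed, missing
# or corrupted file is reported by 'md5sum -c'.
(cd /path/to/original && find . -type f -exec md5sum {} + | sort -k2) > /tmp/manifest.md5
(cd /path/to/copy && md5sum -c /tmp/manifest.md5)
```

Swapping `md5sum` for `sha256sum` gives the same workflow with a stronger hash, which many digital preservation workflows now prefer.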

Sharing information and collaborating with others working in the digital preservation field really is directly beneficial to the day to day work that we do!


Back in the office today and a much quieter day.

I extracted some reports from our AtoM catalogue for a colleague and did a bit of work with our test version of Research Data York. I also met with another colleague to talk about storing and providing access to digitised images.

In the afternoon I wrote another powerpoint presentation, this time for a forthcoming DPC event: From Planning to Deployment: Digital Preservation and Organizational Change.

I'm going to be talking about our experiences of moving our Research Data York application from proof of concept to production. We are not yet in production and some of the reasons why will be explored in the presentation! Again I was asked to talk about barriers and challenges and again, this brief is fairly easy to fit! The event itself is over a week away so this is unprecedentedly well organised. Long may it continue!


On Fridays I try to catch up on the week just gone and plan for the week ahead as well as reading the relevant blogs that have appeared over the week. It is also a good chance to catch up with some admin tasks and emails.

Lunch time reading today was provided by William Kilbride's latest blog post. Some of it went over my head but the final messages around value and reuse and the need to "do more with less" rang very true.

Sometimes I even blog myself - as I am today!

Was this a typical week - perhaps not, but in this job there is probably no such thing! Every week brings new ideas, challenges and surprises!

I would say the only real constant is that I've always got lots of things to keep me busy.

Search Issues

Published 14 Jun 2017 by Ipstenu (Mika Epstein) in Make WordPress Plugins.

UPDATE (@dd32): All issues should be resolved as of 2:15AM UTC. The root cause was a change in the behaviour of Jetpack Search, which we rely upon, that caused queries to fail. A network outage had caused issues for some queries earlier in the day, but it was completely unrelated.

You may have noticed that search is acting up. Per @dd32: we are experiencing a few network issues at present in the datacenter; it’s likely that connectivity between the API and the site’s Elasticsearch backend is up and down, and when it’s down, search will be offline.

Yes, that means search for plugins too.

There’s nothing to do but wait at this point. It may be up and down while the connectivity is being sorted.

WordPress 4.8 “Evans”

Published 8 Jun 2017 by Matt Mullenweg in WordPress News.

An Update with You in Mind

Gear up for a more intuitive WordPress!

Version 4.8 of WordPress, named “Evans” in honor of jazz pianist and composer William John “Bill” Evans, is available for download or update in your WordPress dashboard. New features in 4.8 add more ways for you to express yourself and represent your brand.

Though some updates seem minor, they’ve been built by hundreds of contributors with you in mind. Get ready for new features you’ll welcome like an old friend: link improvements, three new media widgets covering images, audio, and video, an updated text widget that supports visual editing, and an upgraded news section in your dashboard which brings in nearby and upcoming WordPress events.

Exciting Widget Updates

Image Widget

Adding an image to a widget is now a simple task that is achievable for any WordPress user without needing to know code. Simply insert your image right within the widget settings. Try adding something like a headshot or a photo of your latest weekend adventure — and see it appear automatically.

Video Widget

A welcome video is a great way to humanize the branding of your website. You can now add any video from the Media Library to a sidebar on your site with the new Video widget. Use this to showcase a welcome video to introduce visitors to your site or promote your latest and greatest content.

Audio Widget

Are you a podcaster, musician, or avid blogger? Adding a widget with your audio file has never been easier. Upload your audio file to the Media Library, go to the widget settings, select your file, and you’re ready for listeners. This would be an easy way to add a more personal welcome message, too!

Rich Text Widget

This feature deserves a parade down the center of town! Rich-text editing capabilities are now native for Text widgets. Add a widget anywhere and format away. Create lists, add emphasis, and quickly and easily insert links. Have fun with your newfound formatting powers, and watch what you can accomplish in a short amount of time.

Link Boundaries

Have you ever tried updating a link, or the text around a link, and found you can’t seem to edit it correctly? When you edit the text after the link, your new text also ends up linked. Or you edit the text in the link, but your text ends up outside of it. This can be frustrating! With link boundaries, a great new feature, the process is streamlined and your links will work well. You’ll be happier. We promise.

Nearby WordPress Events

Did you know that WordPress has a thriving offline community with groups meeting regularly in more than 400 cities around the world? WordPress now draws your attention to the events that help you continue improving your WordPress skills, meet friends, and, of course, publish!

This is quickly becoming one of our favorite features. While you are in the dashboard (because you’re running updates and writing posts, right?) all upcoming WordCamps and official WordPress Meetups — local to you — will be displayed.

Being part of the community can help you improve your WordPress skills and network with people you wouldn’t otherwise meet. Now you can easily find your local events just by logging in to your dashboard and looking at the new Events and News dashboard widget.

Even More Developer Happiness 😊

More Accessible Admin Panel Headings

New CSS rules mean extraneous content (like “Add New” links) no longer need to be included in admin-area headings. These panel headings improve the experience for people using assistive technologies.

Removal of Core Support for WMV and WMA Files

As fewer and fewer browsers support Silverlight, file formats which require the presence of the Silverlight plugin are being removed from core support. Files will still display as a download link, but will no longer be embedded automatically.

Multisite Updates

New capabilities have been introduced to 4.8 with an eye towards removing calls to is_super_admin(). Additionally, new hooks and tweaks to more granularly control site and user counts per network have been added.

Text-Editor JavaScript API

With the addition of TinyMCE to the text widget in 4.8 comes a new JavaScript API for instantiating the editor after page load. This can be used to add an editor instance to any text area, and customize it with buttons and functions. Great for plugin authors!

Media Widgets API

The introduction of a new base media widget REST API schema to 4.8 opens up possibilities for even more media widgets (like galleries or playlists) in the future. The three new media widgets are powered by a shared base class that covers most of the interactions with the media modal. That class also makes it easier to create new media widgets and paves the way for more to come.

Customizer Width Variable

Rejoice! New responsive breakpoints have been added to the customizer sidebar to make it wider on high-resolution screens. Customizer controls should use percentage-based widths instead of pixels.

The Squad

This release was led by Matt and Jeff Paul, with the help of the following fabulous folks. There are 346 contributors with props in this release, with 106 of them contributing for the first time. Pull up some Bill Evans on your music service of choice, and check out some of their profiles:

Aaron D. Campbell, Aaron Jorbin, abrightclearweb, Achal Jain, achbed, Acme Themes, Adam Silverstein, adammacias, Ahmad Awais, ahmadawais, airesvsg, ajoah, Aki Björklund, akshayvinchurkar, Alain Schlesser, Alex Concha, Alex Dimitrov, Alex Hon, alex27, allancole, Amanda Rush, Andrea Fercia, Andreas Panag, Andrew Nacin, Andrew Ozz, Andrey "Rarst" Savchenko, Andy Meerwaldt, Andy Mercer, Andy Skelton, Aniket Pant, Anil Basnet, Ankit K Gupta, Anthony Hortin, antisilent, Anton Timmermans, apokalyptik, artoliukkonen, Arunas Liuiza, attitude, backermann1978, Bappi, Ben Cole, Bernhard Gronau, Bernhard Kau, binarymoon, Birgir Erlendsson (birgire), BjornW, bobbingwide, boblinthorst, boboudreau, bonger, Boone B. Gorges, Brady Vercher, Brainstorm Force, Brandon Kraft, Brian Hogg, Brian Krogsgard, Bronson Quick, Caroline Moore, Casey Driscoll, Caspie, Chandra Patel, Chaos Engine, cheeserolls, chesio, chetansatasiya, choong, Chouby, chredd, Chris Jean, Chris Marslender, Chris Smith, Chris Van Patten, Chris Wiegman, chriscct7, chriseverson, Christian Chung, Christian Nolen, Christian Wach, Christoph Herr, Clarion Technologies, Claudio Sanches, Claudio Sanches, ClaudioLaBarbera,, coderkevin, codfish, coreymcollins, Curdin Krummenacher, Curtiss Grymala, Cătălin Dogaru, danhgilmore, Daniel Bachhuber , Daniel Kanchev, Daniel Pietrasik, Daniele Scasciafratte, Daryl L. L. Houston (dllh), Dave Pullig, Dave Romsey (goto10), David A. 
Kennedy, David Chandra Purnama, David Herrera, David Lingren, David Mosterd, David Shanske, davidbhayes, Davide 'Folletto' Casali, deeptiboddapati, delphinus, deltafactory, Denis de Bernardy, Derek Herman, Derrick Hammer, Derrick Koo, dimchik, Dinesh Chouhan, Dion Hulse, dipeshkakadiya, dmsnell, Dominik Schilling, Dotan Cohen, Doug Wollison, doughamlin, DreamOn11, Drew Jaynes, duncanjbrown, dungengronovius, DylanAuty, Eddie Hurtig, Eduardo Reveles, Edwin Cromley, ElectricFeet, Elio Rivero, Ella Iseulde Van Dorpe, elyobo, enodekciw, enshrined, Eric Andrew Lewis, Eric Lanehart, Evan Herman, Felix Arntz, Fencer04, Florian Brinkmann, Florian TIAR, FolioVision, fomenkoandrey, Francesco Taurino, Frank Klein, Frankie Jarrett, Fred, Fredrik Forsmo, fuscata, Gabriel Maldonado, Garth Mortensen, Gary Jones, Gary Pendergast, Geeky Software, George Stephanis, Goran Šerić, Graham Armfield, Grant Derepas, Gregory Karpinsky (@tivnet), Hardeep Asrani, Helen Hou-Sandí, Henry Wright, hiddenpearls, Hinaloe, Hristo Pandjarov, Hugo Baeta, Iain Poulson, Ian Dunn, Ian Edington, idealien, Ignacio Cruz Moreno, imath, implenton, Ionut Stanciu, Ipstenu (Mika Epstein), ivdimova, J.D. Grimes, Jacob Peattie, Jake Spurlock, James Nylen, jamesacero, Japh, Jared Cobb, jayarjo, jdolan, jdoubleu, Jeff Bowen, Jeff Paul, Jeffrey de Wit, Jeremy Felt, Jeremy Pry, jimt, Jip Moors, jmusal, Joe Dolson, Joe Hoyle, Joe McGill, Joel James, johanmynhardt, John Blackbourn, John Dittmar, John James Jacoby, John P. 
Bloch, John Regan, johnpgreen, Jon (Kenshino), Jonathan Bardo, Jonathan Brinley, Jonathan Daggerhart, Jonathan Desrosiers, Jonny Harris, jonnyauk, jordesign, JorritSchippers, Joseph Fusco, Josh Eaton, Josh Pollock, joshcummingsdesign, joshkadis, Joy, jrf, JRGould, Juanfra Aldasoro, Juhi Saxena, Junko Nukaga, Justin Busa, Justin Sainton, Justin Shreve, Justin Sternberg, K.Adam White, kacperszurek, Kailey (trepmal), KalenJohnson, Kat Hagan, Keanan Koppenhaver, keesiemeijer, kellbot, Kelly Dwan, Kevin Hagerty, Kirk Wight, kitchin, Kite, kjbenk, Knut Sparhell, koenschipper, kokarn, Konstantin Kovshenin, Konstantin Obenland, Konstantinos Kouratoras, kuchenundkakao, kuldipem, Laurel Fulford, Lee Willis, Leo Baiano, LittleBigThings (Csaba), Lucas Stark, Luke Cavanagh, Luke Gedeon, Luke Pettway, lyubomir_popov, Mário Valney, mageshp, Mahesh Waghmare, Mangesh Parte, Manish Songirkar, mantismamita, Marcel Bootsman, Marin Atanasov, Marius L. J., Mariyan Belchev, Mark Jaquith, Mark Root-Wiley, Mark Uraine, Marko Heijnen, markshep, matrixik, Matt Banks, Matt King, Matt PeepSo, Matt van Andel, Matt Wiebe, Matthew Haines-Young, mattyrob, Max Cutler, Maxime Culea, Mayo Moriyama, mckernanin, Mel Choyce, mhowell, Michael Arestad, Michael Arestad, michalzuber, Miina Sikk, Mike Auteri, Mike Crantea, Mike Glendinning, Mike Hansen, Mike Little, Mike Schroder, Mike Viele, Milan Dinić, modemlooper, Mohammad Jangda, Mohan Dere, monikarao, morettigeorgiev, Morgan Estes, Morten Rand-Hendriksen, moto hachi ( ), mrbobbybryant, Naim Naimov, Nate Reist, NateWr, nathanrice, Nazgul, Ned Zimmerman, net, Nick Halsey , Nicolas GUILLAUME, Nikhil Chavan, Nikhil Vimal, Nikolay Bachiyski, Nilambar Sharma, noplanman, nullvariable, odie2, odyssey, Okamoto Hidetaka, orvils, oskosk, Otto Kekäläinen, ovann86, Pantip Treerattanapitak (Nok), Pascal Birchler, patilvikasj, Paul Bearne, Paul Wilde, Payton Swick, pdufour, Perdaan, Peter Wilson, phh, php, Piotr Delawski, pippinsplugins, pjgalbraith, pkevan, Pratik, 
Pressionate, Presskopp, procodewp, Rachel Baker, Rahul Prajapati, Ramanan, Rami Yushuvaev, ramiabraham, ranh, Red Sand Media Group, Riad Benguella, Rian Rietveld, Richard Tape, Robert D Payne, Robert Jolly, Robert Noakes, Rocco Aliberti, Rodrigo Primo, Rommel Castro, Ronald Araújo, Ross Wintle, Roy Sivan, Ryan Kienstra, Ryan McCue, Ryan Plas, Ryan Welcher, Sal Ferrarello, Sami Keijonen, Samir Shah, Samuel Sidler, Sandesh, Sang-Min Yoon, Sanket Parmar, Sarah Gooding, Sayed Taqui, schrapel, Scott Reilly, Scott Taylor,, scribu, seancjones, Sebastian Pisula, Sergey Biryukov, Sergio De Falco, sfpt, shayanys, shazahm1, shprink, simonlampen, skippy, smerriman, snacking, solal, Soren Wrede, Stanimir Stoyanov, Stanko Metodiev, Steph, Steph Wells, Stephanie Leary, Stephen Edgar, Stephen Harris, Steven Word, stevenlinx, Sudar Muthu, Swapnil V. Patil, swapnild, szaqal21, Takahashi Fumiki, Takayuki Miyauchi, Tammie Lister, tapsboy, Taylor Lovett, team, tg29359, tharsheblows, the, themeshaper, thenbrent, thomaswm, Thorsten Frommen, tierra, Tim Nash, Timmy Crawford, Timothy Jacobs, timph, Tkama, tnegri, Tom Auger, Tom J Nowell, tomdxw, Toro_Unit (Hiroshi Urabe), Torsten Landsiedel, transl8or, traversal, Travis Smith, Triet Minh, Trisha Salas, tristangemus, truongwp, tsl143, Ty Carlson, Ulrich, Utkarsh, Valeriu Tihai, Viljami Kuosmanen, Vishal Kakadiya, vortfu, Vrunda Kansara, webbgaraget, WebMan Design | Oliver Juhas, websupporter, Weston Ruter, William Earnhardt, williampatton, Wolly aka Paolo Valenti, WraithKenny, yale01, Yoav Farhi, Yoga Sukma, Zach Wills, Zack Tollman, Ze Fontainhas, zhildzik, and zsusag.


Finally, thanks to all the community translators who worked on WordPress 4.8. Their efforts bring WordPress 4.8 fully translated to 38 languages at release time with more on the way.

Do you want to report on WordPress 4.8? We’ve compiled a press kit featuring information about the release features, and some media assets to help you along.

If you want to follow along or help out, check out Make WordPress and our core development blog. Thanks for choosing WordPress — we hope you enjoy!

Five minutes with Kylie Howarth

Published 7 Jun 2017 by carinamm in State Library of Western Australia Blog.

Kylie Howarth is an award-winning Western Australian author, illustrator and graphic designer. Original illustrations and draft materials from her most recent picture book 1, 2, Pirate Stew (Five Mile Press) are currently showing in The Story Place Gallery.

We spent some time hearing from Kylie Howarth about the ideas and inspiration behind her work. Here’s what she had to say…


1, 2, Pirate Stew is all about the power of imagination and the joys of playing in a cardboard box. How do your real life experiences influence your picture book ideas? What role does imagination play?

The kids and I turned the box from our new BBQ into a pirate ship. We painted it together and made anchors, pirate hats and oars. They loved it so much they played in it every day for months… and so the idea for 1, 2, Pirate Stew was born. It eventually fell apart and so did our hot water system, so we used that box to build a rocket. Boxes live long lives around our place. I also cut them up and take them to school visits to do texture rubbings with the students.

Your illustrations for 1, 2, Pirate Stew are unique in that they incorporate painted textures created during backyard art sessions with your children. What encouraged you to do this? How do your children’s artworks inspire you?

I just love children’s paintings. They have an energy I find impossible to replicate. Including them in my book illustrations encourages kids to feel their art is important and that they can make books too. Kids sometimes find highly realistic illustrations intimidating and feel they could never do it themselves. During school and library visits, they love seeing the original finger paintings and potato stamp prints that were used in my books.

Through digital illustration you have blended hand drawings with painted textures. How has your background and training as a graphic designer influenced your illustrative style?

Being a graphic designer has certainly influenced the colour and composition of my illustrations - in 1, 2, Pirate Stew, particularly the use of white space. Many illustrators and designers are afraid of white space but it can be such an effective tool; it allows the book to breathe. The main advantage, though, is that I have been able to design all my own book covers, select fonts and arrange the text layout.

Sometimes ideas for picture books evolve and change a lot when working with the publisher. Sometimes the ideas don’t change much at all. What was your experience when creating 1, 2, Pirate Stew? Was it similar or different to your previous books Fish Jam and Chip?

I worked with a fabulous editor, Karen Tayleur on all three books. We tweaked the text for Fish Jam and Chip a little to make them sing as best we could. With 1, 2, Pirate Stew however, the text was based on the old nursery rhyme 1, 2, Buckle My Shoe. So there was little room to move as I was constrained to a limited number of syllables and each line had to rhyme. I think we only added one word. I did however further develop the illustrations from my original submission. Initially the character’s faces were a little more stylised so I refined them to be more universal. Creating the mini 3D character model helped me get them looking consistent from different angles throughout the book. I also took many photographs of my boys to sketch from.

1, 2, Pirate Stew – an exhibition is on display at the State Library of Western Australia until 22 June 2017. The exhibition is part of a series showcasing the diverse range of illustrative styles in picture books published by Western Australian authors and illustrators. For more information go to

Filed under: Children's Literature, Exhibitions, Illustration, SLWA displays, SLWA Exhibitions, SLWA news Tagged: 1 2 Pirate Stew, Five Mile Press, Kylie Howarth, State Library of Western Australia, State Library WA, Story Place Gallery, WA authors, WA illustrators


Published 7 Jun 2017 by fabpot in Tags from Twig.



MediaWiki fails to show Ambox

Published 7 Jun 2017 by lucamauri in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I am writing you about the use of Template:Ambox in MediaWiki.

I have a version 1.28 hosted MediaWiki installation that apparently works well at everything, but I can't get the boxes explained here to work properly.

As a test I implemented the following code in this page:

{{Ambox
| type       = notice
| text       = Text for a big box, for the top of articles.
| smalltext  = Text for the top of article sections.
}}

and I expected a nice box to show up. Instead I simply see the text Template:Ambox shown at the top of the page.
It seems like this template is not defined in my MediaWiki but, as far as I understood, it is built-in, and in all the examples I saw it seems it should work out-of-the-box.

I guess I miss something basic here, but it really escapes me: any help you might provide will be appreciated.



New Customer Support System!

Published 6 Jun 2017 by Nicola Nye in FastMail Blog.

You are our customer, not our product. This is our number one value. Your subscription fee doesn't just go towards keeping your email safe, secure and reliable; it means if you ever have a problem that our detailed help pages don't resolve, one of our friendly support team will help you out.

And now, as part of our commitment to providing first rate customer service, we have made it even easier to access FastMail Support.

We have made changes to our contact form and ticketing system to make them clearer and more user-friendly. We've also been busy behind the scenes improving our back-end systems, which means we can address issues faster and more efficiently than before.

The biggest change is that you can now send support requests via email, to!

Not all users will have access to the new system yet. We are gradually rolling it out over the next few weeks. We hope you like what you see.

With FastMail, you're not left alone. We know you would rather spend your time getting things done, not trying to fix a glitch by experimenting with something a random stranger on the internet tried once.

Ticket creation screenshot

We pride ourselves on the quality of our personal support service. As a result we don't use social media for individual support: instead please create a ticket. Most requests need us to send or receive the kind of information we (and you!) don't want to be exposed on social media. For detailed and rapid assistance, our ticketing system has the tools to help us help you.

We also don’t provide telephone support. Telephone support is expensive and time-consuming. In order to provide high-quality, round-the-clock assistance while keeping your costs reasonable, we provide support by email only.

We do still love hearing from you! Drop us a line any time on Twitter, Facebook, or Google+, where we post system-wide announcements.

While our system is changing, our policies remain the same. Here's how to recognise true requests from the FastMail support team. Our team will never ask you for a password or credit card number via email or phone. We will never offer to fix your problems by remotely accessing your computer. Messages from us will be identified with a green tick in the web interface. We encourage you to check out our full customer support policies.


Published 5 Jun 2017 by fabpot in Tags from Twig.


Ted Nelson’s Junk Mail (and the Archive Corps Pilot)

Published 31 May 2017 by Jason Scott in ASCII by Jason Scott.

I’ve been very lucky over the past few months to dedicate a few days here and there to helping legend Ted Nelson sort through his archives. We’ve known each other for a bunch of years now, but it’s always a privilege to get a chance to hang with Ted and especially to help him with auditing and maintaining his collection of papers, notes, binders, and items. It also helps that it’s in pretty fantastic shape to begin with.

Along with sorting comes some discarding – mostly old magazines and books; they’re being donated wherever it makes sense to. Along with these items was junk mail that Ted got over the decades.

About that junk mail….

After glancing through it, I requested to keep it and take it home. There was a lot of it, and even going through it with a cursory view showed me it was priceless.

There’s two kinds of people in the world – those who look at ephemera and consider it trash, and those who consider it gold.

I’m in the gold camp.

I’d already been doing something like this for years, myself – when I was a teenager, I circled so many reader service cards and pulled in piles and piles of flyers and mailings from companies so fleeting or so weird, and I kept them. These became and later the reader service collection, which encapsulates completely. There’s well over a thousand pages in that collection, which I’ve scanned myself.

Ted, basically, did what I was doing, but with more breadth, more variety, and with a few decades more time.

And because he was always keeping an eye out on many possibilities for future fields of study, he kept his mind (and mailbox) open to a lot of industries. Manufacturing, engineering, film-making, printing, and of course “computers” as expressed in a thousand different ways. The mail dates from the 1960s through to the mid 2000s, and it’s friggin’ beautiful.

Here’s where it gets interesting, and where you come in.

There’s now a collection of scanned mail from this collection up at the Internet Archive. It’s called Ted Nelson’s Junk Mail and you can see the hundreds of scanned pages that will soon become thousands and maybe tens of thousands of scanned pages.

They’re separated by mailing, and over time the metadata and the contents will get better, increase in size, and hopefully provide decades of enjoyment for people.

The project is being coordinated by Kevin Savetz, who has hired a temp worker to scan in the pages across each weekday, going through the boxes and doing the “easy” stuff (8.5×11 sheets) which, trust me, is definitely worth going through first. As they’re scanned, they’re uploaded, and (for now) I am running scripts to add them as items to the Junk Mail collection.

The cost of doing this is roughly $80 a day, during which hundreds of pages can be scanned. We’re refining the process as we go, and expect it to get even more productive over time.

So, here’s where Archive Corps comes in; this is a pilot program for the idea behind the new Archive Corps, which is providing a funnel for all the amazing stuff out there to get scanned. If you want to see more stuff come from the operation that Kevin is running, he has a paypal address up at – the more you donate the more days we are able to have the temp come in to scan.

I’m very excited to watch this collection grow, and see the massive variety of history that it will reveal. A huge thank-you to Ted Nelson for letting me take these items, and a thank-you to Kevin Savetz for coordinating.

Let’s enjoy some history!

Local illustration showcase

Published 30 May 2017 by carinamm in State Library of Western Australia Blog.

From digital illustration to watercolour painting and screen-printing, three very different styles of illustration highlight the diversity and originality of picture books published this year.

In a series of exhibitions, The Story Place Gallery will showcase original artwork by Western Australian illustrators from the picture books 1, 2, Pirate Stew (Five Mile Press 2017), One Thousand Trees and Colour Me (Fremantle Press 2017).


7, 8, he took the bait © Kylie Howarth 2017

In 1, 2, Pirate Stew, Kylie Howarth has used a digital illustration process to merge her drawings, created using water-soluble pencils, with background textures painted by her two adventurous children Beau and Jack. Kylie Howarth’s playful illustrations in gentle colours, together with her entertaining rhyming verse, take readers on an imaginative adventure all about the joys of playing in a cardboard box. Illustrations from 1, 2, Pirate Stew are on display from 26 May – 22 June.


Among © Kyle Hughes-Odgers 2017

Kyle Hughes-Odgers’ distinctive illustrations blend geometric shapes, patterns and forms. In his watercolour illustrations for One Thousand Trees, he uses translucent colours and a restricted colour palette to explore the relationship between humankind and the environment. Shades of green browns and grey blues emphasise contrasts between urban and natural scenes. Kyle Hughes-Odgers places the words of the story within his illustrations to accentuate meaning. One Thousand Trees is on display from 24 June to 23 July.


If I was red © Moira Court

Moira Court’s bold illustrations for the book Colour Me (written by Ezekiel Kwaymullina) were created using a woodcut and screen-printing technique. Each final illustration is made from layers of silk-screen prints created using hand-cut paper stencils and transparent ink. Each screen print was then layered with a patchy, textural woodcut or linoleum print. Colours were printed one at a time to achieve a transparent effect. The story celebrates the power of each individual colour, as well as the power of their combination. Colour Me is on display from 26 July – 16 August.

Each exhibition in this series is curated especially for children and is accompanied by a story sharing area, self-directed activity, and discussion prompters for families.

Filed under: Children's Literature, community events, Exhibitions, Illustration, SLWA displays, SLWA Exhibitions, SLWA news

A Lot of Doing

Published 28 May 2017 by Jason Scott in ASCII by Jason Scott.

If you follow this weblog, you saw there was a pause of a couple months. I’ve been busy! Better to do than to talk about doing.

A flood of posts are coming – they reflect accomplishments and thoughts of the last period of time, so don’t be freaked out as they pop up in your life very quickly.


TV Interview on Stepping Off

Published 26 May 2017 by Tom Wilson in thomas m wilson.

All accounts updated to version 2.9

Published 20 May 2017 by Pierrick Le Gall in The Blog.

17 days after Piwigo 2.9.0 was released and 4 days after we started to update, all accounts are now up-to-date.

Piwigo 2.9 and new design on administration pages

As you will learn from the release notes, your history will now be automatically purged to keep “only” the last 1 million lines. Yes, some of you, 176 to be exact, have more than 1 million lines, with a record set at 27 million lines!
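A purge like this amounts to keeping only the newest N rows of a history table. Here is a minimal sketch in Python with SQLite; the table and column names are illustrative, not Piwigo's actual schema:

```python
import sqlite3

def purge_history(conn: sqlite3.Connection, keep: int = 1_000_000) -> int:
    """Delete all but the most recent `keep` rows (by id) from a history table."""
    cur = conn.execute(
        "DELETE FROM history WHERE id NOT IN "
        "(SELECT id FROM history ORDER BY id DESC LIMIT ?)",
        (keep,),
    )
    conn.commit()
    return cur.rowcount  # number of purged rows

# Demo against an in-memory table with 10 rows, keeping the last 3
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO history (id) VALUES (?)", [(i,) for i in range(1, 11)])
purged = purge_history(conn, keep=3)
remaining = [r[0] for r in conn.execute("SELECT id FROM history ORDER BY id")]
```

Running the purge on a schedule keeps the table bounded without the user ever noticing.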

Wikimedia Commons Android App Pre-Hackathon

Published 19 May 2017 by addshore in Addshore.

Wikimedia Commons Logo

The Wikimedia Commons Android App allows users to upload photos to Commons directly from their phone.

The website for the app details some of the features and the code can be found on GitHub.

A hackathon was organized in Prague to work on the app in the run up to the yearly Wikimedia Hackathon which is in Vienna this year.

A group of 7 developers worked on the app over a few days; as well as meeting and learning from each other, they managed to work on various improvements, which I have summarised below.

2 factor authentication (nearly)

Work has been done towards allowing 2fa logins to the app.

Lots of the login & authentication code has been refactored and the app now uses the clientlogin API module provided by MediaWiki instead of the older login module.

When building for debug, the 2FA input box will appear if you have 2FA login enabled; however, the current production build will not show this box and will simply display a message saying that 2FA is not currently supported. This is due to a small amount of session handling work that the app still needs.

Better menu & Logout

As development on the app was fairly non-existent between mid-2013 and 2016, the UI generally fell behind. This is visible in forms and buttons, as well as the overall app layout.

One significant push was made to drop the old-style ‘burger’ menu from the top right of the app and replace it with a new slide-out menu drawer including a feature image and icons for menu items.

Uploaded images display limit

Some users have run into issues with the number of upload contributions that the app loads by default in the contributions activity. The default has always been 500 and this can cause memory exhaustion / OOM and a crash on some memory limited phones.

In an attempt to fix this, and generally speed up the app, a recent upload limit has been added to the settings which will limit the number of images and image details that are displayed; however, the app will still fetch and store more than this on the device.
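The idea — cache everything you fetch, but show only a bounded slice — can be sketched as follows (hypothetical class and method names; the real app is written in Java):

```python
class ContributionsCache:
    """Sketch: store every fetched upload, but hand the UI only the
    first `display_limit` items, so low-memory devices don't OOM."""

    def __init__(self, display_limit: int = 100):
        self.display_limit = display_limit
        self._items: list[str] = []

    def store(self, fetched: list[str]) -> None:
        # Everything fetched stays available on the device...
        self._items.extend(fetched)

    def visible(self) -> list[str]:
        # ...but the contributions view only ever renders this many.
        return self._items[: self.display_limit]

cache = ContributionsCache(display_limit=2)
cache.store(["a.jpg", "b.jpg", "c.jpg", "d.jpg"])
```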

Nearby places enhancements

The nearby places enhancements probably account for the largest portion of development time at the pre-hackathon. The app has always had a list of nearby places that don’t have images on Commons, but now the app also has a map!

The map is powered by the Mapbox SDK and the current beta uses the Mapbox tiles; however, part of the plan for the Vienna hackathon is to switch this to using the Wikimedia-hosted map tiles at

The map also contains clickable pins that provide a small pop up pulling information from Wikidata including the label and description of the item as well as providing two buttons to get directions to the place or read the Wikipedia article.

Image info coordinates & image date

Extra information has also been added to the image details view and the image date and coordinates of the image can now be seen in the app.

Summary of hackathon activity

The contributions and authors that worked on the app during the pre-hackathon can be found on GitHub at the following link.

Roughly 66 commits were made between the 11th and 19th of May 2017 by 9 contributors.

Screenshot Gallery

A New Maintainer Appears!

Published 18 May 2017 by Beau Simensen in Sculpin's Blog.

Effective immediately, I have handed over full ownership of the Sculpin project to Chris Tankersley. Until otherwise specified, the rest of the Sculpin Organization will remain intact.

This was not an easy decision for me to make. I've been thinking about it for a few years now. It isn't fair to the community for my continued lack of time and energy to hold Sculpin back from moving forward.

The hardest thing for me is that Sculpin, as it stands right now, works great for me. I maintain dozens of Sculpin sites and they've all worked great for the last two to three years.

There are things I'd love to change but it has become clear to me that I neither have the time nor energy to make it happen.

Thanks for your support over the years. I'm sure Chris and the team will treat you better than I have over the last few.



I am honored that Beau is allowing me to take over ownership of the Sculpin project. I have been the FIG representative for Sculpin for a few years now, making sure that Sculpin's interests are heard for new PSRs that take shape in the PHP community. Since the beginning of the year I've also represented the FIG as part of the Core Committee.

While I may not exactly be a very heavy committer to the base code so far, if at all, I have been serving on the Sculpin organization committee since 2015. I've spent much of that time extolling the virtues of Sculpin, and have helped guide what features and roadmaps we have worked on in that time.

Sculpin has been a big part of my workflow since I started working with it, and it is one of the projects near and dear to my heart. When Beau decided to step down, it was not a hard decision to step up and help keep this project going. Sculpin is a stable, dependable static site builder, and I would hate to see it go away.

I plan on coming up with some exciting new features for Sculpin in addition to updating the codebase. I hope you all come along for the ride.

Thank you Beau. For Sculpin, and letting me help keep it alive.


Privacy Awareness Week 2017

Published 16 May 2017 by David Gurvich in FastMail Blog.

We are excited to announce that FastMail is a partner in this year’s Privacy Awareness Week (PAW) — the largest campaign in the Asia Pacific that raises awareness on privacy issues and how personal information can be better protected.

The campaign runs from 15 May to 19 May and this year’s theme is ‘trust and transparency’, to highlight how clear privacy practices build trust between individuals and organisations.

At FastMail, we have a great responsibility to keep your email secure. We continually review our code and processes for potential vulnerabilities and we take new measures wherever possible to further secure your data.

In our recent blog post on FastMail’s Values we made it clear that not only does your data belong to you, but that we also strive to be good stewards of your data.

By being privacy-aware, you can make more informed decisions about managing your data. Here are a few quick best practice tips you can use with your FastMail account:

1. Protect your email, protect your identity

Passwords are like locks, and some doors are more important than others. Your email is the front door and master key to most of your online identities. If a malicious user controls your email, they can reset your passwords everywhere else (like your bank account).

The best protection? Just like in your home, it’s two sets of locks — two-step verification (also known as two-factor authentication or 2FA). It combines something you know (your password) and something you have (your phone or a security key). We make it easy to set up and use two-step verification on your FastMail account.

Not all your online accounts require two-step verification, but we recommend it for identity services (like Facebook or Twitter), financial services (your bank, your credit card company), and other services with critical data (Dropbox, your DNS provider).
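The "something you have" factor in two-step verification is typically a TOTP code (RFC 6238): your phone and the server each derive a short-lived number from a shared secret and the current time. A minimal Python sketch — the secret below is the RFC test key, not a real one:

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 of a counter, dynamically truncated."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with a counter derived from the current time."""
    return hotp(secret, unix_time // step, digits)
```

An authenticator app and the server run the same computation; a login succeeds when the codes match within a small time window.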

2. Protect your keys

The two most common ways for an attacker to get your password are either knowing enough about a user’s personal information to guess it, or reuse of a password compromised from another site. You can protect against both of these attacks with one simple tool: a password manager. A password manager makes it easy to use a distinct password for every service. Good password managers will even generate random passwords for you, making it impossible for someone to guess.

Many browsers have a basic password manager built in. We prefer stand-alone tools like 1Password or LastPass — their syncing tools let you access your passwords on both your computer and your phone.
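The random-generation part is simple to sketch: a few lines with Python's `secrets` module (a cryptographically secure generator, unlike `random`) produce passwords that are impractical to guess:

```python
import secrets
import string

# Letters, digits and punctuation: 94 possible characters per position.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 20) -> str:
    """Pick each character with the OS CSPRNG via the secrets module."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

password = generate_password(24)
```

A 24-character password over this alphabet has far more entropy than anything memorable, which is exactly why a manager should remember it for you.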

3. Trust, but verify

Less common than password reuse or guessable passwords, but a growing problem, is phishing. Phishing is a targeted attack, where a malicious user claims to be a trusted contact (FastMail, your bank, a loved one) to get you to provide your password or other personal information.

When you receive an email from the FastMail team, it will always have our security check mark. (Want to know what the security check mark looks like?) If you want to be sure you're on the proper FastMail website, look for the green padlock badge in the URL bar.

For any service, when in doubt, do not click on the links in a message – go directly to their website instead. If it's urgent enough for a company to email you, you should expect to see an alert on your account, too.

Follow these simple tips, and better protect your privacy online.

Visit the PAW 2017 website to find out more and be sure to join the conversation on social media with #2017PAW, and help raise privacy awareness.

NYI Datacentre Move

Published 13 May 2017 by Bron Gondwana in FastMail Blog.

You're reading this version of this blog post, so we succeeded. Our primary datacentre moved and you didn't even notice!

Over the past week, we have moved all of the FastMail, Pobox and Listbox hardware from New York Internet's Manhattan datacentre to their Bridgewater, New Jersey location. The new location gives us more space to expand while keeping the same great service we have always received from NYI's network and remote hands teams.

The wide open plains of New Jersey

Pre-move view

To prepare for this move, we performed numerous "fire drills" over the last 3 weeks. We shut down half the FastMail infrastructure at a time during the Australian day, to make sure nobody noticed and that all the hardware would come back up cleanly.

Our design goal is to have sufficient redundancy that we can run on half capacity - comfortably during non-peak times - with a little slowdown during peak times. This is due to our commitment to high availability. The fire drills gave us a high level of confidence that our systems were still meeting this goal in practice.

In 2014 I spent a week in New York and moved all the servers to a new set of racks. In the process we reconfigured that redundancy such that we could take down entire racks at a time if we ever had to - either for this type of move within NYI or even to move to a new provider! It's part of our regular contingency planning, and has been very valuable for this week's work.

We had migrated the Pobox/Listbox hardware from Philadelphia up to New York over a few batches in the last 12 months. While not the same 50/50 plan we used for FastMail, we felt confident we could repeat the batched moves over a much shorter timeline for this move.

Those plans in hand, we are pleased to say that we moved every service, and virtually every server, in only two days!

Getting it done

We have to start by thanking NYI for their assistance and diligent preparation leading up to this move. They set up racks in the new datacentre and bridged all our networks through dark fibre between their two locations.

This week, two of our operations team flew to New York to put our plan into effect. The moves were scheduled for Monday and Tuesday nights (8th and 9th of May), starting at 6pm New York time (8am Melbourne). Rob and Jon are the operations leads for the two sets of infrastructure. They led the move on the ground, working with NYI staff and a team of movers. Back in Australia, I was one of the two operations staff monitoring the move and keeping services running smoothly.

On Tuesday during the day, we were running with half the hardware in each datacentre across the bridged network. We're now entirely in New Jersey with nothing left in Manhattan.

The moves took longer than planned, as moves always do! Missing rack rails, slightly smaller racks than expected, networks not quite coming together on the first go, etc. meant Rob and Jon were up until 5am their time getting the last bits up and working. The datacentre crew at NYI-NJ were amazing as well. We were very fortunate 15 years ago when we found NYI; they really are a gem amongst datacentres! As I've said before, a lot of our reliability can be attributed to having a really good datacentre partner. With their help, we were back up and running for the US day.

But enough talking about the move, let's see some more photos!

Christmas morning, all the presents are unwrapped

What Now?

Packed and ready to go

Pobox is colourful

Hang on, we've seen this before

Even though I knew the plan and had confidence that we had tested each of the individual tasks required, you never know what's going to happen on the ground. (Yes, we even had a plan for what would happen if the truck crashed on either day.) So I speak for everyone when I give Rob and Jon huge high fives for pulling this off so smoothly!

Things customers may have noticed

There will always be a few hitches with a massively coordinated operation. Issues we dealt with in the process:

  1. A handful of FastMail App users were using the QA server, which was offline for about 8 hours on Monday night. Likewise the beta server was offline for about 8 hours on Tuesday night.

  2. Pobox and Listbox new logins broke on Monday night because of an undeclared dependency on the billing service, which was offline during the Monday part of the move. Once that was identified as the cause, the quickest fix was to push forwards and bring up the billing service again in New Jersey.

  3. A bug in Pobox service provisioning cropped up, unrelated to the move. But, because other services were intentionally offline, the bad behavior persisted long enough to cause Pobox DNS to break for 30-40 minutes. During that time, some Pobox users could not send mail, and others reported bouncing messages from their correspondents. As continuous delivery of mail is always our highest goal, we deeply apologize to anyone affected by this issue.

Thank you to everyone who did notice issues for your support and patience. We know how important your email is to you, so your kind comments make us very happy.

AtoM Camp take aways

Published 12 May 2017 by Jenny Mitcham in Digital Archiving at the University of York.

The view from the window at AtoM Camp ...not that there was any time to gaze out of the window of course...
I’ve spent the last three days in Cambridge at AtoM Camp. This was the second ever AtoM Camp, and the first in Europe. A big thanks to St John’s College for hosting it and to Artefactual Systems for putting it on.

It really has been an interesting few days, with a packed programme and an engaged group of attendees from across Europe and beyond bringing different levels of experience with AtoM.

As a ‘camp counsellor’ I was able to take to the floor at regular intervals to share some of our experiences of implementing AtoM at the Borthwick, covering topics such as system selection, querying the MySQL database, building the community and overcoming implementation challenges.

However, I was also there to learn!

Here are some bits and pieces that I’ve taken away.

My first real take away is that I now have a working copy of the soon to be released AtoM 2.4 on my Macbook - this is really quite cool. I'll never again be bored on a train - I can just fire up Ubuntu and have a play!

Walk to Camp takes you over Cambridge's Bridge of Sighs
During the camp it was great to be able to hear about some of the new features that will be available in this latest release.

At the Borthwick Institute our catalogue is still running on AtoM 2.2 so we are pretty excited about moving to 2.4 and being able to take advantage of all of this new functionality.

Just some of the new features I learnt about that I can see an immediate use case for are:

On day two of camp I enjoyed the implementation tours, seeing how other institutions have implemented AtoM and the tweaks and modifications they have made. For example, it was interesting to see the shopping cart feature developed for the Mennonite Archival Image Database and the most popular images carousel feature on the front page of the Chinese Canadian Artifacts Project. I was also interested in some of the modifications the National Library of Wales have made to meet their own needs.

It was also nice to hear the Borthwick Catalogue described by Dan as “elegant”!

There was a great session on community and governance at the end of day two which was one of the highlights of the camp for me. It gave attendees the chance to really understand the business model of Artefactual (as well as alternatives to the bounty model in use by other open source projects). We also got a full history of the evolution of AtoM and saw the very first project logo and vision.

The AtoM vision hasn't changed too much but the name and logo have!

Dan Gillean from Artefactual articulated the problem of trying to get funding for essential and ongoing tasks, such as code modernisation. Two examples he used were updating AtoM to work with the latest version of Symfony and Elasticsearch - both of these tasks need to happen in order to keep AtoM moving in the right direction but both require a substantial amount of work and are not likely to be picked up and funded by the community.

I was interested to see Artefactual’s vision for a new AtoM 3.0 which would see some fundamental changes to the way AtoM works and a more up-to-date, modular and scalable architecture designed to meet the future use cases of the growing AtoM community.

Artefactual's proposed modular architecture for AtoM 3.0

There is no time line for AtoM 3.0, and whether it goes ahead or not is entirely dependent on a substantial source of funding being available. It was great to see Artefactual sharing their vision and encouraging feedback from the community at this early stage though.

Another highlight of Camp: a tour of the archives of St John's College from Tracy Deakin
A session on data migrations on day three included a demo of OpenRefine from Sara Allain from Artefactual. I’d heard of this tool before but wasn’t entirely sure what it did and whether it would be of use to me. Sara demonstrated how it could be used to bash data into shape before import into AtoM. It seemed to be capable of doing all the things that I’ve previously done in Excel (and more) ...but without so much pain. I’ll definitely be looking to try this out when I next have some data to clean up.

Dan Gillean and Pete Vox from IMAGIZ talked through the process of importing data into AtoM. Pete focused on an example from Croydon Museum Service, whose data needed to be migrated from CALM. He talked through some of the challenges of the task and how he would approach this differently in future. It is clear that the complexities of data migration may be one of the biggest barriers to institutions moving to AtoM from an alternative system, but it was encouraging to hear that none of these challenges are insurmountable.

My final take away from AtoM Camp is a long list of actions - new things I have learnt that I want to read up on or try out for myself ...I best crack on!

Permission denied for files in www-data

Published 11 May 2017 by petergus in Newest questions tagged mediawiki - Ask Ubuntu.

I have image files being uploaded with MediaWiki, and they are getting www-data set as their owner. Viewing the files results in 403 Forbidden (all other site files are owned by SITE_USER).

The SITE_USER and www-data are both in each others (secondary) groups.

What am I missing here?

EDIT: My Apache directives

DocumentRoot "/home/SITE_USER/public_html/"
# Alias for Wiki so images work
Alias /images "/home/SITE_USER/public_html/mediawiki/sites/images"    
<Directory "/home/SITE_USER/public_html/">
RewriteRule ^(.*)$ %{DOCUMENT_ROOT}//index.php [L]
# Enable the rewrite engine
RewriteEngine On
# Short url for wiki pages
RewriteRule ^/?wiki(/.*)?$ %{DOCUMENT_ROOT}/index.php [L]
# Redirect / to Main Page
RewriteRule ^/*$ %{DOCUMENT_ROOT}/index.php [L]
Options -Indexes +SymLinksIfOwnerMatch
allow from all
AllowOverride All Options=ExecCGI,Includes,IncludesNOEXEC,Indexes,MultiViews,SymLinksIfOwnerMatch
Require all granted
</Directory>

Maintenance report of April 28th 2017

Published 11 May 2017 by Pierrick Le Gall in The Blog.

clients have already received this message. Many users told us they were happy to receive such details about our technical operations, so let’s make it more “public” with a blog post!

A. The short version

On April 27th 2017, we replaced one of our main servers. The replacement itself was successful. No downtime. The read-only mode lasted only 7 minutes, from 6:00 to 6:07 UTC.

While sending the notification email to our clients, we encountered difficulties with Gmail users. Solving this Gmail issue made the website unavailable for a few users for maybe an hour. Everything was back to normal in a few hours. Of course, no data was lost during this operation.

The new server and Piwigo are now good friends. They both look forward to receiving version 2.9 in the next few days 😉

B. Additional technical details

The notification message had already been sent to the first 390 users when we realized emails sent to Gmail addresses were returned in error. Indeed, Gmail now asks for a “reverse DNS IPv6”. Sorry for this very technical detail. We already had it on the old server, so we added it on the new server. And then the problems started… Unfortunately, the new server does not manage IPv6 the same way. A few users, on IPv6, told us they only saw “Apache2 Debian Default Page” instead of their Piwigo. Here is the timeline:

Unfortunately adding or removing an IPv6 is not an immediate action. It relies on the “DNS propagation” which may take a few hours, depending on each user.

We took the rest of the day to figure out how to make Gmail accept our emails and web visitors see your Piwigo. Instead of “”, we now use a sub-domain of “” (Pigolabs is the company running service) with an IPv6: no impact on web traffic.

We also have a technical solution to handle IPv6 for web traffic. We have decided not to use it because IPv6 lacks an important feature, the FailOver. This feature, only available on IPv4, let us redirect web traffic from one server to another in a few seconds without worrying about DNS propagation. We use it when a server fails and web traffic goes to a spare server.

In the end, the move did not go so well and we sweated quite a bit this Friday, but everything came back to normal and the “Apache2 Debian Default Page” issue eventually affected only a few people!

At the J Shed

Published 7 May 2017 by Dave Robertson in Dave Robertson.

We can’t wait to play here again soon… in June… stay tuned! Photo by Alex Chapman


composer - Semantic MediaWiki require onoi/callback-container, but it can't be installed

Published 5 May 2017 by Сергей Румянцев in Newest questions tagged mediawiki - Server Fault.

I am trying to install the latest release of Semantic MediaWiki. When I run composer update, it returns the following:

> ComposerHookHandler::onPreUpdate
Loading composer repositories with package information
Updating dependencies (including require-dev)
Your requirements could not be resolved to an installable set of packages.

  Problem 1
    - mediawiki/semantic-media-wiki 2.4.x-dev requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - mediawiki/semantic-media-wiki 2.4.6 requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - mediawiki/semantic-media-wiki 2.4.5 requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - mediawiki/semantic-media-wiki 2.4.4 requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - mediawiki/semantic-media-wiki 2.4.3 requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - mediawiki/semantic-media-wiki 2.4.2 requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - mediawiki/semantic-media-wiki 2.4.1 requires onoi/callback-container ~1.0 -> satisfiable by onoi/callback-container[1.0.0, 1.1.0] but these conflict with your requirements or minimum-stability.
    - Installation request for mediawiki/semantic-media-wiki ~2.4.1 -> satisfiable by mediawiki/semantic-media-wiki[2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.4.5, 2.4.6, 2.4.x-dev].

I have even set minimum-stability to dev and prefer-stable to false. Nothing resolves it.

This is not the first problem I have had with Composer. Earlier it returned an error because no version was set in the mediawiki/core package, which this SMW release still required. But that is not the issue this time, surprisingly.

And Composer doesn't see the package when I run composer show onoi/callback-container, even though there is a stable version 2.0 after all.
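For reference, in recent Composer versions `composer why-not onoi/callback-container 1.1.0` (and its sibling `composer why`, an alias of `composer depends`) will show which root or package constraint is blocking the 1.x releases. A minimal root `composer.json` sketch that satisfies SMW's `~1.0` requirement (version constraints are illustrative, not a known fix):

```json
{
    "require": {
        "mediawiki/semantic-media-wiki": "~2.4.1",
        "onoi/callback-container": "~1.1.0"
    },
    "minimum-stability": "dev",
    "prefer-stable": true
}
```

The key point is that no other root requirement may pin onoi/callback-container at 2.x, since every SMW 2.4.x release requires ~1.0.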

After upgrade to 14.04 I get "You don't have permission to access /wiki/ on this server."

Published 3 May 2017 by Finn Årup Nielsen in Newest questions tagged mediawiki - Ask Ubuntu.

After dist-upgrade to 14.04 I get "You don't have permission to access /wiki/ on this server." for a MediaWiki installation with alias. /w/index.php is also failing.

So far I have seen a difference in configuration between 12.04 and 14.04 and I did

cd /etc/apache2/sites-enabled
sudo ln -s ../sites-available/000-default.conf .

This fixed other problems, but not the MediaWiki problem.
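A likely culprit is Apache itself rather than MediaWiki: 12.04 ships Apache 2.2 while 14.04 ships Apache 2.4, which dropped the old `Order allow,deny` / `Allow from all` access-control directives. A sketch of the 2.4-style stanza, assuming the wiki files live under /var/www/wiki (the path is hypothetical):

```apache
# Apache 2.4: old "Order allow,deny" / "Allow from all" stanzas no longer
# grant access; replace them with Require (or enable mod_access_compat).
<Directory /var/www/wiki>
    Require all granted
</Directory>
```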

How can we preserve Google Documents?

Published 28 Apr 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Last month I asked (and tried to answer) the question How can we preserve our wiki pages?

This month I am investigating the slightly more challenging issue of how to preserve native Google Drive files, specifically documents*.


At the University of York we work a lot with Google Drive. We have the G Suite for Education (formerly known as Google Apps for Education) and as part of this we have embraced Google Drive; it is now widely used across the University. For many (me included) it has become the tool of choice for creating documents, spreadsheets and presentations. The ability to share documents and collaborate directly is key.

So of course it is inevitable that at some point we will need to think about how to preserve them.

How hard can it be?

Quite hard actually.

The basic problem is that documents created in Google Drive are not really "files" at all.

The majority of the techniques and models that we use in digital preservation are based around the fact that you have a digital object that you can see in your file system, copy from place to place and package up into an Archival Information Package (AIP).

In the digital preservation community we're all pretty comfortable with that way of working.

The key challenge with stuff created in Google Drive is that it doesn't really exist as a file.

Always living in hope that someone has already solved the problem, I asked the question on Twitter and that really helped with my research.

Isn't the digital preservation community great?

Exporting Documents from Google Drive

I started off testing the different download options available within Google Docs. For my tests I used two native Google documents. One was the working version of our Phase 1 Filling the Digital Preservation Gap report. This report was originally authored as a Google doc, was 56 pages long and consisted of text, tables, images, footnotes, links, formatted text, page numbers, colours etc (i.e. lots of significant properties I could assess). I also used another, simpler document for testing - this one was just basic text and tables but also included comments by several contributors.

I exported both of these documents into all of the different export formats that Google supports and assessed the results, looking at each characteristic of the document in turn and establishing whether or not I felt it was adequately retained.

Here is a summary of my findings, looking specifically at the Filling the Digital Preservation Gap phase 1 report document:

...but what about the comments?

My second test document was chosen so I could look specifically at the comments feature and how these were retained (or not) in the exported version.

  • docx - Comments are exported. On first inspection they appear to be anonymised, however this seems to be just how they are rendered in Microsoft Word. Having unzipped and dug into the actual docx file and looked at the XML file that holds the comments, it is clear that a more detailed level of information is retained - see images below. The placement of the comments is not always accurate. In one instance the reply to a comment is assigned to text within a subsequent row of the table rather than to the same row as the original comment.
  • odt - Comments are included, are attributed to individuals and have a date and time. Again, matching up of comments with the right section of text is not always accurate - in one instance a comment and its reply are linked to the table cell underneath the one they referenced in the original document.
  • rtf - Comments are included but appear to be anonymised when displayed in MS Word...I haven't dug around enough to establish whether or not this is just a rendering issue.
  • txt - Comments are retained but appear at the end of the document with a [a], [b] etc prefix - these letters appear in the main body text to show where the comments appeared. No information about who made the comment is preserved.
  • pdf - Comments not exported
  • epub - Comments not exported
  • html - Comments are present but appear at the end of the document with a code which also acts as a placeholder in the text where the comment appeared. References to the comments in the text are hyperlinks which take you to the right comment at the bottom of the document. There is no indication of who made the comment (not even hidden within the html tags).

A comment in original Google doc

The same comment in docx as rendered by MS Word

...but in the XML buried deep within the docx file structure - we do have attribution and date/time
(though clearly in a different time zone)
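That unzip-and-inspect step is easy to script. A minimal sketch in Python (the sample below is a hand-built stand-in, not a real Google export; real .docx files carry their comments part at word/comments.xml, with w:author and w:date attributes on each w:comment element):

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def extract_comments(docx_bytes):
    """Return (author, date, text) tuples from word/comments.xml in a .docx."""
    comments = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        try:
            xml_data = zf.read("word/comments.xml")
        except KeyError:
            return comments  # document has no comments part
    root = ET.fromstring(xml_data)
    for c in root.findall(f"{W}comment"):
        text = "".join(t.text or "" for t in c.iter(f"{W}t"))
        comments.append((c.get(f"{W}author"), c.get(f"{W}date"), text))
    return comments

# Build a tiny stand-in .docx (just the comments part) to demonstrate;
# the author name and date here are invented for the example.
sample = (
    '<w:comments xmlns:w="http://schemas.openxmlformats.org/'
    'wordprocessingml/2006/main">'
    '<w:comment w:id="0" w:author="J. Mitcham" w:date="2017-03-01T10:00:00Z">'
    '<w:p><w:r><w:t>Check this figure</w:t></w:r></w:p>'
    '</w:comment></w:comments>'
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/comments.xml", sample)
print(extract_comments(buf.getvalue()))
# → [('J. Mitcham', '2017-03-01T10:00:00Z', 'Check this figure')]
```

So even where a word processor renders the comments anonymised, the attribution and timestamps are recoverable from the package itself.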

What about bulk export options?

Ed Pinsent pointed me to the Google Takeout Service which allows you to:
"Create an archive with your data from Google products"
[Google's words not mine - and perhaps this is a good time to point you to Ed's blog post on the meaning of the term 'Archive']

This is really useful. It allows you to download Google Drive files in bulk and to select which formats you want to export them as.

I tested this a couple of times and was surprised to discover that if you select pdf or docx (and perhaps other formats that I didn't test) as your export format of choice, the takeout service creates the file in the format requested and an html file which includes all comments within the document (even those that have been resolved). The content of the comments/responses including dates and times is all included within the html file, as are names of individuals.

The downside of the Google Takeout Service is that it only allows you to select folders and not individual files. Another incentive for us to organise our files better! The other issue is that it will only export documents that you are the owner of - and you may not own everything that you want to archive!

What's missing?

Quite a lot actually.

The owner, creation and last modified dates of a document in Google Drive are visible when you click on Document details... within the File menu. Obviously this is really useful information for the archive but is lost as soon as you download it into one of the available export formats.

Creation and last modified dates as visible in Document details

Update: I was pleased to see that if using the Google Takeout Service to bulk export files from Drive, the last modified dates are retained, however on single file export/download these dates are lost and the last modified date of the file becomes the date that you carried out the export. 

Part of the revision history of my Google doc
But of course in a Google document there is more metadata. Similar to the 'Page History' that I mentioned when talking about preserving wiki pages, a Google document has a 'Revision history'.

Again, this *could* be useful to the archive. Perhaps not so much so for my document which I worked on by myself in March, but I could see more of a use case for mapping and recording the creative process of writing a novel for example. 

Having this revision history would also allow you to do some pretty cool stuff such as that described in this blog post: How I reverse engineered Google Docs to play back any document's keystrokes (thanks to Nick Krabbenhoft for the link).

It would seem that the only obvious way to retain this information would be to keep the documents in their original native Google format within Google Drive but how much confidence do we have that it will be safe there for the long term?


If you want to preserve a Google Drive document there are several options but no one-size-fits-all solution.

As always it boils down to what the significant properties of the document are. What is it we are actually trying to preserve?

  • If we want a fairly accurate but non-interactive digital 'print' of the document, pdf might be the most accurate representation, though even the pdf export can't be relied on to retain the exact pagination. Note that I didn't try to validate the pdf files that I exported, and sadly there is no pdf/a export option.
  • If comments are seen to be a key feature of the document then docx or odt will be a good option but again this is not perfect. With the test document I used, comments were not always linked to the correct point within the document.
  • If it is possible to get the owner of the files to export them, the Google Takeout Service could be used. Perhaps creating a pdf version of the static document along with a separate html file to capture the comments.

A key point to note is that all export options are imperfect so it would be important to check the exported document against the original to ensure it accurately retains the important features.

Another option would be simply keeping them in their native format but trying to get some level of control over them - taking ownership and managing sharing and edit permissions so that they can't be changed. I've been speaking to one of our Google Drive experts in IT about the logistics of this. A Google Team Drive belonging to the Archives could be used to temporarily store and lock down Google documents of archival value whilst we wait and see what happens next. 

...I live in hope that export options will improve in the future.

This is a work in progress and I'd love to find out what others think.

* note, I've also been looking at Google Sheets and that may be the subject of another blog post

Legal considerations regarding hosting a MediaWiki site

Published 27 Apr 2017 by Oliver K in Newest questions tagged mediawiki - Webmasters Stack Exchange.

What legal considerations are there when creating a wiki using MediaWiki for people to use worldwide?

For example, I noticed there are privacy policies & terms and conditions; are these required to safeguard me from any legal battles?

mosh, the disconnection-resistant ssh

Published 22 Apr 2017 by Carlos Fenollosa in Carlos Fenollosa — Blog.

The second post on this blog was devoted to screen and how to use it to make persistent SSH sessions.

Recently I've started using mosh, the mobile shell. It's targeted to mobile users, for example laptop users who might get short disconnections while working on a train, and it also provides a small keystroke buffer to get rid of network lag.

It really has few drawbacks, and if you ever ssh to remote hosts and get annoyed because your vim sessions or tail -F windows get disconnected, give mosh a try. I strongly recommend it.

Tags: software, unix

Comments? Tweet  

In conversation with the J.S. Battye Creative Fellows

Published 19 Apr 2017 by carinamm in State Library of Western Australia Blog.

How can contemporary art lead to new discoveries about collections and ways of engaging with history?  Nicola Kaye and Stephen Terry will discuss this idea drawing from the experience of creating Tableau Vivant and the Unobserved.

In conversation with the J.S. Battye Creative Fellows
Thursday 27 April, 6pm
State Library Theatre.

April 4 Tableau Vivant Image_darkened_2

Tableau Vivant and the Unobserved is the culmination of the State Library’s inaugural J.S. Battye Creative Fellowship.  The Creative Fellowship aims to enhance engagement with the Library’s heritage collections and provide new experiences for the public.

Tableau Vivant and the Unobserved
visually questions how history is made, commemorated and forgotten. Through digital art installation, Nicola Kaye and Stephen Terry expose the unobserved and manipulate our perception of the past.  Their work juxtaposes archival and contemporary imagery to create an interactive experience for the visitor where unobserved lives from the archive collide with the contemporary world. The installation is showing at the State Library until 12 May 2017.

For more information visit:

Filed under: community events, Exhibitions, Pictorial, SLWA collections, SLWA displays, SLWA Exhibitions, SLWA news, State Library of Western Australia, talks, Western Australia Tagged: contemporary art, discussion, installation, J.S. Battye Creative Fellowship, Nicola Kaye, Stephen Terry, talk


Published 17 Apr 2017 by mblaney in Tags from simplepie.

Merge pull request #510 from mblaney/master

Version bump to 1.5 due to changes to Category class.

Interview on Stepping Off: Rewilding and Belonging in the South West

Published 14 Apr 2017 by Tom Wilson in thomas m wilson.

You can listen to a recent radio interview I did about my new book with Adrian Glamorgan here.

Wikimania submission: apt install mediawiki

Published 9 Apr 2017 by legoktm in The Lego Mirror.

I've submitted a talk to Wikimania titled apt install mediawiki. It's about getting the MediaWiki package back into Debian, and efforts to improve the overall process. If you're interested, sign up on the submissions page :)

Archivematica Camp York: Some thoughts from the lake

Published 7 Apr 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Well, that was a busy week!

Yesterday was the last day of Archivematica Camp York - an event organised by Artefactual Systems and hosted here at the University of York. The camp's intention was to provide a space for anyone interested in or currently using Archivematica to come together, learn about the platform from other users, and share their experiences. I think it succeeded in this, bringing together 30+ 'campers' from across the UK, Europe and as far afield as Brazil for three days of sessions covering different aspects of Archivematica.

Our pod on the lake (definitely a lake - not a pond!)
My main goal at camp was to ensure everyone found their way to the rooms (including the lakeside pod) and that we were suitably fuelled with coffee, popcorn and cake. Alongside these vital tasks I also managed to partake in the sessions, have a play with the new version of Archivematica (1.6) and learn a lot in the process.

I can't possibly capture everything in this brief blog post so if you want to know more, have a look back at all the #AMCampYork tweets.

What I've focused on below are some of the recurring themes that came up over the three days.


Archivematica is just one part of a bigger picture for institutions that are carrying out digital preservation, so it is always very helpful to see how others are implementing it and what systems they will be integrating with. A session on workflows in which participants were invited to talk about their own implementations was really interesting. 

Other sessions also helped highlight the variety of different configurations and workflows that are possible using Archivematica. I hadn't quite realised there were so many different ways you could carry out a transfer! 

In a session on specialised workflows, Sara Allain talked us through the different options. One workflow I hadn't been aware of before was the ability to include checksums as part of your transfer. This sounds like something I need to take advantage of when I get Archivematica into production for the Borthwick. 
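As I understand the convention, a transfer can carry a metadata/checksum.md5 file listing a digest for every file under objects/, which Archivematica then verifies on ingest. A hedged sketch of generating one (the exact layout and filename are my reading of the documentation, so treat them as an assumption):

```python
import hashlib
import pathlib
import tempfile

def write_md5_manifest(transfer_dir):
    """Write metadata/checksum.md5 covering every file under objects/."""
    transfer = pathlib.Path(transfer_dir)
    objects = transfer / "objects"
    lines = []
    for f in sorted(objects.rglob("*")):
        if f.is_file():
            digest = hashlib.md5(f.read_bytes()).hexdigest()
            lines.append(f"{digest}  objects/{f.relative_to(objects)}")
    metadata = transfer / "metadata"
    metadata.mkdir(exist_ok=True)
    (metadata / "checksum.md5").write_text("\n".join(lines) + "\n")
    return lines

# Demo on a throwaway transfer directory:
demo = pathlib.Path(tempfile.mkdtemp())
(demo / "objects").mkdir()
(demo / "objects" / "report.txt").write_text("hello")
print(write_md5_manifest(demo))
# → ['5d41402abc4b2a76b9719d911017c592  objects/report.txt']
```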

Justin talking about Automation Tools
A session on Automation Tools with Justin Simpson highlighted other possibilities - using Archivematica in a more automated fashion. 

We already have some experience of using Automation Tools at York as part of the work we carried out during phase 3 of Filling the Digital Preservation Gap, however I was struck by how many different ways these can be applied. Hearing examples from other institutions and for a variety of different use cases was really helpful.


The camp included a chance to play with Archivematica version 1.6 (which was only released a couple of weeks ago) as well as an introduction to the new Appraisal and Arrangement tab.

A session in progress at Archivematica Camp York
I'd been following this project with interest so it was great to be able to finally test out the new features (including the rather pleasing pie charts showing what file formats you have in your transfer). It was clear that there were a few improvements that could be made to the tab to make it more intuitive to use and to deal with things such as the ability to edit or delete tags, but it is certainly an interesting feature and one that I would like to explore more using some real data from our digital archive.

Throughout camp there was a fair bit of discussion around digital appraisal and at what point in your workflow this would be carried out. This was of particular interest to me being a topic I had recently raised with colleagues back at base.

The Bentley Historical Library who funded the work to create the new tab within Archivematica are clearly keen to get their digital archives into Archivematica as soon as possible and then carry out the work there after transfer. The addition of this new tab now makes this workflow possible.

Kirsty Lee from the University of Edinburgh described her own pre-ingest methodology and the tools she uses to help her appraise material before transfer to Archivematica. She talked about some tools (such as TreeSize Pro) that I'm really keen to follow up on.

At the moment I'm undecided about exactly where and how this appraisal work will be carried out at York, and in particular how this will work for hybrid collections so as always it is interesting to hear from others about what works for them.

Metadata and reporting

Evelyn admitting she loves PREMIS and METS
Evelyn McLellan from Artefactual led a 'Metadata Deep Dive' on day 2 and despite the title, this was actually a pretty interesting session!

We got into the details of METS and PREMIS and how they are implemented within Archivematica. Although I generally try not to look too closely at METS and PREMIS it was good to have them demystified. On the first day through a series of exercises we had been encouraged to look at a METS file created by Archivematica ourselves and try and pick out some information from it so these sessions in combination were really useful.

Across various sessions of the camp there was also a running discussion around reporting. Given that Archivematica stores such a detailed range of metadata in the METS file, how do we actually make use of this? Being able to report on how many AIPs have been created, how many files and what size is useful. These are statistics that I currently collect (manually) on a quarterly basis and share with colleagues. Once Archivematica is in place at York, digging further into those rich METS files to find out which file formats are in the digital archive would be really helpful for preservation planning (among other things). There was discussion about whether reporting should be a feature of Archivematica or a job that should be done outside Archivematica.

In relation to the latter option - I described in one session how some of our phase 2 work on Filling the Digital Preservation Gap was designed to help expose metadata from Archivematica to a third-party reporting system. The Jisc Research Data Shared Service was also mentioned in this context, as reporting outside of Archivematica will need to be addressed as part of that project.
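A first pass at that kind of format reporting only needs a few lines. A sketch that tallies PREMIS formatName values from a METS file (the sample here is a toy stand-in: in a real Archivematica METS the elements sit much deeper, inside amdSec/techMD, and the PREMIS namespace version may differ from the one assumed below):

```python
import xml.etree.ElementTree as ET
from collections import Counter

PREMIS = "{info:lc/xmlns/premis-v2}"

def count_formats(mets_xml):
    """Tally PREMIS formatName values found anywhere in a METS document."""
    root = ET.fromstring(mets_xml)
    return Counter(el.text for el in root.iter(f"{PREMIS}formatName"))

# Toy METS fragment for demonstration only:
sample = (
    '<mets xmlns:premis="info:lc/xmlns/premis-v2">'
    '<premis:formatName>PDF/A-1b</premis:formatName>'
    '<premis:formatName>JPEG File Interchange Format</premis:formatName>'
    '<premis:formatName>PDF/A-1b</premis:formatName>'
    '</mets>'
)
print(count_formats(sample))
```

Run across a directory of AIP METS files, a tally like this would answer the "which file formats are in the digital archive" question without manual counting.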


As with most open source software, community is important. This was touched on throughout the camp and was the focus of the last session on the last day.

There was a discussion about the role of Artefactual Systems and the role of Archivematica users. Obviously we are all encouraged to engage and help sustain the project in whatever way we are able. This could be by sharing successes and failures (I was pleased that my blog got a mention here!), submitting code and bug reports, sponsoring new features (perhaps something listed on the development roadmap) or helping others by responding to queries on the mailing list. It doesn't matter - just get involved!

I was also able to highlight the UK Archivematica group and talk about what we do and what we get out of it. As well as encouraging new members to the group, there was also discussion about the potential for forming other regional groups like this in other countries.

Some of the Archivematica community - class of Archivematica Camp York 2017

...and finally

Another real success for us at York was having the opportunity to get technical staff at York working with Artefactual to resolve some problems we had with getting our first Archivematica implementation into production. Real progress was made and I'm hoping we can finally start using Archivematica for real at the end of next month.

So, that was Archivematica Camp!

A big thanks to all who came to York and to Artefactual for organising the programme. As promised, the sun shined and there were ducks on the lake - what more could you ask for?

Thanks to Paul Shields for the photos

Failover in local accounts

Published 7 Apr 2017 by MUY Belgium in Newest questions tagged mediawiki - Server Fault.

I would like to use MediaWiki for documentation with access privileges. I use the LdapAuthentication extension (here: ) to get users authenticated against an LDAP directory.

For various reasons, authentication should continue working even if the LDAP server fails.

How can I get a fail-over (for example using the passwords in the local SQL database?) that would enable the wiki to remain accessible even if the infrastructure fails?
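As far as I can tell from the extension's documentation, LdapAuthentication has a built-in switch for exactly this: $wgLDAPUseLocal adds a "local" domain to the login form, authenticated against the wiki's own SQL password table. A LocalSettings.php sketch (domain and server names are placeholders):

```php
<?php
// Sketch only – domain/server names are hypothetical.
$wgLDAPDomainNames    = array( 'example' );
$wgLDAPServerNames    = array( 'example' => 'ldap.example.org' );
$wgLDAPEncryptionType = array( 'example' => 'tls' );
// Adds a "local" option to the login form so users can fall back to
// MediaWiki's own SQL-stored passwords if LDAP is unreachable.
$wgLDAPUseLocal = true;
```

Note this gives a manual fallback (the user selects the "local" domain at login) rather than automatic failover.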

Shiny New History in China: Jianshui and Tuanshan

Published 6 Apr 2017 by Tom Wilson in thomas m wilson.

  The stones in this bridge are not all in a perfect state of repair.  That’s part of its charm.  I’m just back from a couple of days down at Jianshui, a historic town a few hours south of Kunming with a large city wall and a towering city gate.  The trip has made me reflect on […]

Tableau Vivant and the Unobserved

Published 30 Mar 2017 by carinamm in State Library of Western Australia Blog.

April 4 Tableau Vivant Image_darkened_2.jpg

Still scene: Tableau Vivant and the Unobserved, 2016, Nicola Kaye, Stephen Terry.

Tableau Vivant and the Unobserved visually questions how history is made, commemorated and forgotten. Through digital art installation, Nicola Kaye and Stephen Terry expose the unobserved and manipulate our perception of the past.  Their work juxtaposes archival and contemporary imagery to create an experience for the visitor where unobserved lives from the archive collide with the contemporary world.

Tableau Vivant and the Unobserved is the culmination of the State Library’s inaugural J.S. Battye Creative Fellowship.  The Creative Fellowship aims to enhance engagement with the Library’s heritage collections and provide new experiences for the public.

Artists floor talk
Thursday 6 April, 6pm
Ground Floor Gallery, State Library of Western Australia.

Nicola Kaye and Stephen Terry walk you through Tableau Vivant and the Unobserved

In conversation with the J.S. Battye Creative Fellows
Thursday 27 April, 6pm
State Library Theatre.

How can contemporary art lead to new discoveries about collections and ways of engaging with history?  Nicola Kaye and Stephen Terry will discuss this idea drawing from the experience of creating Tableau Vivant and the Unobserved.

Tableau Vivant and the Unobserved is showing at the State Library from 4 April – 12 May 2017.
For more information visit:

Filed under: community events, Exhibitions, SLWA collections, SLWA displays, SLWA events, SLWA Exhibitions, SLWA news, State Library of Western Australia, talks, WA history, Western Australia Tagged: exhibitions, installation art, J.S. Battye Creative Fellowship, Nicola Kaye, Perth, Perth Cultural Centre, State Library of Western Australai, Stephen Terry, Tableau Vivant and the Unobserved

Remembering Another China in Kunming

Published 29 Mar 2017 by Tom Wilson in thomas m wilson.

Last weekend I headed out for a rock climbing session with some locals and expats.  First I had to cross town, and while doing so I came across an old man doing water calligraphy by Green Lake.  I love the transience of this art: the beginning of the poem is starting to fade by the time he reaches […]

Week #11: Raided yet again

Published 27 Mar 2017 by legoktm in The Lego Mirror.

If you missed the news, the Raiders are moving to Las Vegas. The Black Hole is leaving Oakland (again) for a newer, nicer, stadium in the desert. But let's talk about how we got here, and how different this is from the moving of the San Diego Chargers to Los Angeles.

The current Raiders stadium is outdated and old. It needs renovating to keep up with other modern stadiums in the NFL. Owner Mark Davis isn't a multi-billionaire who could finance such a stadium. And the City of Oakland is definitely not paying for it. So the options left were to find outside financing for Oakland, or to find said financing somewhere else. And unfortunately it was the latter option that won out in the end.

I think it's unsurprising that more and more cities are refusing to put public money into stadiums that they will see no profit from - it makes no sense whatsoever.

Overall I think the Raider Nation will adapt and survive just as it did when they moved to Los Angeles. The Raiders still have an awkward two-to-three years left in Oakland, and with Derek Carr at the helm, it looks like they will be good ones.

Week #10: March Sadness

Published 23 Mar 2017 by legoktm in The Lego Mirror.

In California March Madness is really...March Sadness. The only Californian team that is still in is UCLA. UC Davis made it in but was quickly eliminated. USC and Saint Mary's both fell in the second round. Cal and Stanford didn't even make it in. At best we can root for Gonzaga, but that's barely it.

Some of us root for schools we went to, but for those of us who grew up here and support local teams, we're left hanging. And it's not bias in the selection committee; those schools just aren't good enough.

On top of that we have a top-notch professional team in the Warriors, but our amateur players just don't pass muster.

So good luck to UCLA, represent California hella well. We somewhat believe in you.

Week #9: The jersey returns

Published 23 Mar 2017 by legoktm in The Lego Mirror.

And so it has been found. Tom Brady's jersey was in Mexico the whole time, stolen by a member of the press. And while it's great news for Brady, sports memorabilia fans, and the FBI, it doesn't look good for journalists. Journalists are given a lot of access to players, allowing them to obtain better content and get better interviews. It would not be surprising if the NFL responds to this incident by locking down the access that journalists are given. And that would be a real bummer.

I'm hoping this is seen as an isolated incident and that journalists as a whole are not punished for the offenses of one.

Enterprise plans, now official!

Published 23 Mar 2017 by Pierrick Le Gall in The Blog.

In the shadow of the standard plan for several years and yet already adopted by more than 50 organizations, it is time to officially introduce the Enterprise plans. They were designed for organizations, private or public, looking for a simple, affordable and yet complete tool to manage their collection of photos.

The main idea behind Enterprise is to democratize photo library management for organizations of all kinds and sizes. We are not targeting Fortune 500 companies, although some of them are already clients, but Fortune 5,000,000 companies! Enterprise plans can replace, at a reasonable cost, inadequate solutions relying on intranet shared folders, where photos are sometimes duplicated or deleted by mistake, without an appropriate permission system.

Introduction to Enterprise plans


Why announce these plans officially today? Because the current trend clearly shows that our Enterprise plans have found their market. Although semi-official, Enterprise plans represented nearly 40% of our revenue in February 2017! It is time to put these plans under the spotlight.

In practice, here is what changes with the Enterprise plans:

  1. they can be used by organizations, as opposed to the standard plan
  2. additional features, such as support for non-photo files (PDF, videos …)
  3. higher level of service (priority support, customization, presentation session)

Discover Enterprise

Please Help Us Track Down Apple II Collections

Published 20 Mar 2017 by Jason Scott in ASCII by Jason Scott.

Please spread this as far as possible – I want to reach folks who are far outside the usual channels.

The Summary: Conditions are very, very good right now for easy, top-quality, final ingestion of original commercial Apple II software, and if you know people sitting on a pile of it, or even if you have a small handful of boxes, please get in touch with me to arrange for the disks to be imaged.

The rest of this entry says this in much longer, hopefully compelling fashion.

We are in a golden age for Apple II history capture.

For now, and it won’t last (because nothing lasts), an incredible amount of interest and effort and tools are all focused on acquiring Apple II software, especially educational and engineering software, and ensuring it lasts another generation and beyond.

I’d like to take advantage of that, and I’d like your help.

Here’s the secret about Apple II software: Copy Protection Works.

Copy protection, that method of messing up easy copying from floppy disks, turns out to have been very effective at doing what it is meant to do – slow down the duplication of materials so a few sales can eke by. For anything but the most compelling, most universally interesting software, copy protection did a very good job of ensuring that only the approved disks that went out the door are the remaining extant copies for a vast majority of titles.

As programmers and publishers laid logic bombs and coding traps and took the brilliance of watchmakers and used it to design alternative operating systems, they did so to ensure people wouldn’t take the time to actually make the effort to capture every single bit off the drive and do the intense and exacting work to make it easy to spread in a reproducible fashion.

They were right.

So, obviously it wasn’t 100% effective at stopping people from making copies of programs, or so many people who used the Apple II wouldn’t remember the games they played at school or at user-groups or downloaded from AE Lines and BBSes, with pirate group greetings and modified graphics.

What happened is that pirates and crackers did what was needed to break enough of the protection on high-demand programs (games, productivity) to make them work. They used special hardware modifications to “snapshot” memory and pull out a program. They traced the booting of the program by stepping through its code and then snipped out the clever tripwires that freaked out if something wasn’t right. They tied it up into a bow so that instead of a horrendous 140 kilobyte floppy, you could have a small 15 or 20 kilobyte program instead. They even put multiple cracked programs together on one disk so you could get a bunch of cool programs at once.

I have an entire section of TEXTFILES.COM dedicated to this art and craft.

And one could definitely argue that the programs (at least the popular ones) were “saved”. They persisted, they spread, they still exist in various forms.

And oh, the crack screens!

I love the crack screens, and put up a massive pile of them here. Let’s be clear about that – they’re a wonderful, special thing and the amount of love and effort that went into them (especially on the Commodore 64 platform) drove an art form (demoscene) that I really love and which still thrives to this day.

But these aren’t the original programs and disks, and in some cases, not the originals by a long shot. What people remember booting in the 1980s were often distant cousins to the floppies that were distributed inside the boxes, with the custom labels and the nice manuals.


On the left is the title screen for Sabotage. It’s a little clunky and weird, but it’s also something almost nobody who played Sabotage back in the day ever saw; they only saw the instructions screen on the right. The reason for this is that there were two files on the disk, one for starting the title screen and then the game, and the other was the game. Whoever cracked it long ago only did the game file, leaving the rest as one might leave the shell of a nut.

I don’t think it’s terrible these exist! They’re art and history in their own right.

However… the mistake, which I completely understand making, is to see programs and versions of old Apple II software up on the Archive and say “It’s handled, we’re done here.” You might be someone with a small stack of Apple II software, newly acquired or decades old, and think you don’t have anything to contribute.

That’d be a huge error.

It’s a bad assumption because there’s a chance the original versions of these programs, unseen since they were sold, are sitting in your hands. It’s a version different than the one everyone thinks is “the” version. It’s precious, it’s rare, and it’s facing the darkness.

There is incredibly good news, however.

I’ve mentioned some of these folks before, but there is now a powerful allegiance of very talented developers and enthusiasts who have been pouring an enormous amount of skills into the preservation of Apple II software. You can debate if this is the best use of their (considerable) skills, but here we are.

They have been acquiring original commercial Apple II software from a variety of sources, including auctions, private collectors, and luck. They’ve been duplicating the originals on a bits level, then going in and “silent cracking” the software so that it can be played on an emulator or via the web emulation system I’ve been so hot on, and not have any change in operation, except for not failing due to copy protection.

With a “silent crack”, you don’t take the credit, you don’t make it about yourself – you just make it work, and work entirely like it did, without yanking out pieces of the code and program to make it smaller for transfer or to get rid of a section you don’t understand.

Most prominent of these is 4AM, who I have written about before. But there are others, and they’re all working together at the moment.

These folks, these modern engineering-minded crackers, are really good. Really, really good.

They’ve been developing tools from the ground up that are focused on silent cracks, of optimizing the process, of allowing dozens, sometimes hundreds of floppies to be evaluated automatically and reducing the workload. And they’re fast about it, especially when dealing with a particularly tough problem.

Take, for example, the efforts required to crack Pinball Construction Set, and marvel not just that it was done, but that a generous and open-minded article was written explaining exactly what was being done to achieve this.

This group can be handed a stack of floppies, image them, evaluate them, and find which have not yet been preserved in this fashion.

But there’s only one problem: They are starting to run out of floppies.

I should be clear that there’s plenty left in the current stack – hundreds of floppies are being processed. But I also have seen the effort chug along and we’ve been going through direct piles, then piles of friends, and then piles of friends of friends. We’ve had a few folks from outside the community bring stuff in, but those are way more scarce than they should be.

I’m working with a theory, you see.

My theory is that there are large collections of Apple II software out there. Maybe someone’s dad had a store long ago. Maybe someone took in boxes of programs over the years and they’re in the basement or attic. I think these folks are living outside the realm of the “Apple II Community” that currently exists (and which is a wonderful set of people, be clear). I’m talking about the difference between a fan club for surfboards and someone who has a massive set of surfboards because his dad used to run a shop and they’re all out in the barn.

A lot of what I do is put groups of people together and then step back to let the magic happen. This is a case where this amazingly talented group of people are currently a well-oiled machine – they help each other out, they are innovating along this line, and Apple II software is being captured in a world-class fashion, with no filtering being done because it’s some hot ware that everyone wants to play.

For example, piles and piles of educational software have returned from potential oblivion, because it’s about the preservation, not the title. Wonderfully done works are being brought back to life and are playable on the Internet Archive.

So like I said above, the message is this:

Conditions are very, very good right now for easy, top-quality, final ingestion of original commercial Apple II Software and if you know people sitting on a pile of it or even if you have a small handful of boxes, please get in touch with me to arrange the disks to be imaged.

I’ll go on podcasts or do interviews, or chat with folks on the phone, or trade lots of e-mails discussing details. This is a very special time, and I feel the moment to act is now. Alliances and communities like these do not last forever, and we’re in a peak moment of talent and technical landscape to really make a dent in what are likely acres of unpreserved titles.

It’s 4am and nearly morning for Apple II software.

It’d be nice to get it all before we wake up.


Managing images on an open wiki platform

Published 19 Mar 2017 by Oliver K in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I'm developing a wiki using MediaWiki, and there are a few ways of implementing images in wiki pages, such as uploading them to the website itself, or hosting them on external websites (with the risk of them being removed or banned) and requesting others to place an image.

Surely images may be difficult to manage as one day someone may upload a vulgar image and many people will then see it. How can I ensure vulgar images do not get through and that administrators aren't scarred for life after monitoring them?

Does the composer software have a command like python -m compileall ./

Published 18 Mar 2017 by jehovahsays in Newest questions tagged mediawiki - Server Fault.

I want to use composer in a MediaWiki root folder with multiple directories that need composer to install their dependencies, using a command like composer -m installall ./. For example, if the root folder were all written in Python, I could use the command python -m compileall ./.
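Composer has no built-in recursive install, but the same effect can be had by walking the tree and running `composer install` wherever a `composer.json` is found. A minimal sketch in Python (the function name and the injectable `run` parameter are mine, for illustration):

```python
import os
import subprocess

def composer_install_all(root=".", run=subprocess.run):
    """Run `composer install` in every directory under `root` that
    contains a composer.json, skipping the vendor/ trees that
    Composer itself creates."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune vendor/ so we never recurse into installed dependencies.
        dirnames[:] = [d for d in dirnames if d != "vendor"]
        if "composer.json" in filenames:
            run(["composer", "install"], cwd=dirpath, check=True)
```

The `run` parameter just makes the walk testable; calling `composer_install_all("/var/www/mediawiki")` would invoke the real composer binary in each matching directory.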

Hilton Harvest Earth Hour Picnic and Concert

Published 18 Mar 2017 by Dave Robertson in Dave Robertson.


Sandpapering Screenshots

Published 15 Mar 2017 by Jason Scott in ASCII by Jason Scott.

The collection I talked about yesterday was subjected to the Screen Shotgun, which does a really good job of playing the items, capturing screenshots, and uploading them into the item to allow people to easily see, visually, what they’re in for if they boot them up.

In general, the screen shotgun does the job well, but not perfectly. It doesn’t understand what it’s looking at, at all, and the method I use to decide the “canonical” screenshot is inherently shallow – I choose the largest filesize, because that tends to be the most “interesting”.

The bug in this is that if you have, say, these three screenshots:

…it’s going to choose the first one, because those middle-of-loading graphics for an animated title screen have tons of little artifacts, and the filesize is bigger. Additionally, the second is fine, but it’s not the “title”, the recognized “welcome to this program” image. So the best choice turns out to be the third.
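The "largest filesize wins" heuristic described above is essentially a one-liner; a minimal sketch (function name mine):

```python
import os

def canonical_screenshot(screenshot_paths):
    """Pick the screenshot with the largest file size. Busy images
    (noise, mid-load artifacts) compress worse, so the biggest file
    tends to be the most visually 'interesting' one -- which is
    exactly why this heuristic often prefers a mid-load frame over
    the real title screen."""
    return max(screenshot_paths, key=os.path.getsize)
```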

I don’t know why I’d not done this sooner, but while waiting for 500 disks to screenshot, I finally wrote a program to show me all the screenshots taken for an item, and declare a replacement canonical title screenshot. The results have been way too much fun.

It turns out, doing this for Apple II programs in particular, where it’s removed the duplicates and is just showing you a gallery, is beautiful:

Again, the all-text “loading screen” in the middle, which is caused by blowing program data into screen memory, wins the “largest file” contest, but literally any other of the screens would be more appropriate.

This is happening all over the place: crack screens win over the actual main screen, the mid-loading noise of Apple II programs win over the final clean image, and so on.

Working with tens of thousands of software programs, primarily alone, means that I’m trying to find automation wherever I can. I can’t personally boot up each program and do the work needed to screenshot/describe it – if a machine can do anything, I’ll make the machine do it. People will come to me with fixes or changes if the results are particularly ugly, but it does leave a small amount that no amount of automation is likely to catch.

If you watch a show or documentary on factory setups and assembly lines, you’ll notice they can’t quite get rid of people along the entire line, especially the sign-off. Someone has to keep an eye to make sure it’s not going all wrong, or, even more interestingly, a table will come off the line and you see one person giving it a quick run-over with sandpaper, just to pare down the imperfections or missed spots of the machine. You still did an enormous amount of work with no human effort, but if you think that’s ready for the world with no final sign-off, you’re kidding yourself.

So while it does mean another hour or two looking at a few hundred screenshots, it’s nice to know I haven’t completely automated away the pleasure of seeing some vintage computer art, for my work, and for the joy of it.

Thoughts on a Collection: Apple II Floppies in the Realm of the Now

Published 15 Mar 2017 by Jason Scott in ASCII by Jason Scott.

I was connected with The 3D0G Knight, a long-retired Apple II pirate/collector who had built up a set of hundreds of floppy disks acquired from many different locations and friends decades ago. He generously sent me his entire collection to ingest into a more modern digital format, as well as the Internet Archive’s software archive.

The floppies came in a box without any sort of sleeves for them, with what turned out to be roughly 350 of them removed from “ammo boxes” by 3D0G from his parents’ house. The disks all had labels of some sort, and a printed index came along with it all, mapped to the unique disk ID/Numbers that had been carefully put on all of them years ago. I expect this was months of work at the time.

Each floppy is 140k of data on each side, and in this case, all the floppies had been single-sided and clipped with an additional notch with a hole punch to allow the second side to be used as well.

Even though they’re packed a little strangely, there was no damage anywhere, nothing bent or broken or ripped, and all the items were intact. It looked to be quite the bonanza of potentially new vintage software.

So, this activity is at the crux of the work going on with both the older software on the Internet Archive, as well as what I’m doing with web browser emulation and increasing easy access to the works of old. The most important thing, over everything else, is to close the air gap – get the data off these disappearing floppy disks and into something online where people or scripts can benefit from them and research them. Almost everything else – scanning of cover art, ingestion of metadata, pulling together the history of a company or cross-checking what titles had which collaborators… that has nowhere near the expiration date of the magnetized coated plastic disks going under. This needs us and it needs us now.

The way that things currently work with Apple II floppies is to separate them into two classes: Disks that Just Copy, and Disks That Need A Little Love. The Little Love disks, when found, are packed up and sent off to one of my collaborators, 4AM, who has the tools and the skills to get data off particularly tenacious floppies, as well as doing “silent cracks” of commercial floppies to preserve what’s on them as best as possible.

Doing the “Disks that Just Copy” is a mite easier. I currently have an Apple II system on my desk that connects via USB-to-serial connection to my PC. There, I run a program called Apple Disk Transfer that basically turns the Apple into a Floppy Reading Machine, with a pretty interface and everything.

Apple Disk Transfer (ADT) has been around a very long time and knows what it’s doing – a floppy disk with no trickery on the encoding side can be ripped out and transferred to a “.DSK” file on the PC in about 20 seconds. If there’s something wrong with the disk in terms of being an easy read, ADT is very loud about it. I can do other things while reading floppies, and I end up with a whole pile of filenames when it’s done. The workflow, in other words, isn’t so bad as long as the floppies aren’t in really bad shape. In this particular set, the floppies were in excellent shape, except when they weren’t, and the vast majority fell into the “excellent” camp.

The floppy drive that sits at the middle of this looks like some sort of nightmare, but it helps to understand that with Apple II floppy drives, you really have to have the cover removed at all times, because you will be constantly checking the read head for dust, smudges, and so on. Unscrewing the whole mess and putting it back together for looks just doesn’t scale. It’s ugly, but it works.

It took me about three days (while doing lots of other stuff) but in the end I had 714 .dsk images pulled from both sides of the floppies, which works out to 357 floppy disks successfully imaged. Another 20 or so are going to get a once over but probably are going to go into 4am’s hands to get final evaluation. (Some of them may in fact be blank, but were labelled in preparation, and so on.) 714 is a lot to get from one person!

As mentioned, an Apple II 5.25″ floppy disk image is pretty much always 140k. The names of the floppy are mine, taken off the label, or added based on glancing inside the disk image after it’s done. For a quick glance, I use either an Apple II emulator called Applewin, or the fantastically useful Apple II disk image investigator Ciderpress, which is frankly the gold standard for what should be out there for every vintage disk/cartridge/cassette image. As might be expected, labels don’t always match contents. C’est la vie.

As for the contents of the disks themselves; this comes down to what the “standard collection” was for an Apple II user in the 1980s who wasn’t afraid to let their software library grow utilizing less than legitimate circumstances. Instead of an elegant case of shiny, professionally labelled floppy diskettes, we get a scribbled, messy, organic collection of all range of “warez” with no real theme. There’s games, of course, but there’s also productivity, utilities, artwork, and one-off collections of textfiles and documentation. Games that were “cracked” down into single-file payloads find themselves with 4-5 other unexpected housemates and sitting behind a menu. A person spending the equivalent of $50-$70 per title might be expected to have a relatively small and distinct library, but someone who is meeting up with friends or associates and duplicating floppies over a few hours will just grab bushels of strange.

The result of the first run is already up on the Archive: A 37 Megabyte .ZIP file containing all the images I pulled off the floppies. 

In terms of what will be of relevance to later historians, researchers, or collectors, that zip file is probably the best way to go – it’s not munged up with the needs of the Archive’s structure, and is just the disk images and nothing else.

This single .zip archive might be sufficient for a lot of sites (go git ‘er!) but as mentioned infinite times before, there is a very strong ethic across the Internet Archive’s software collection to make things as accessible as possible, and hence there are nearly 500 items in the “3D0G Knight Collection” besides the “download it all” item.

The rest of this entry talks about why it’s 500 and not 714, and how it is put together, and the rest of my thoughts on this whole endeavor. If you just want to play some games online or pull a 37mb file and run, cackling happily, into the night, so be it.

The relatively small number of people who have exceedingly hard opinions on how things “should be done” in the vintage computing space will also want to join the folks who are pulling the 37mb file. Everything else done by me after the generation of the .zip file is in service of the present and near future. The items that number in the hundreds on the Archive that contain one floppy disk image and interaction with it are meant for people to find now. I want someone to have a vague memory of a game or program once interacted with, and if possible, to find it on the Archive. I also like people browsing around randomly until something catches their eye and to be able to leap into the program immediately.

To those ends, and as an exercise, I’ve acquired or collaborated on scripts to do the lion’s share of analysis on software images to prep them for this living museum. These scripts get it “mostly” right, and the rough edges they bring in from running are easily smoothed over by a microscopic amount of post-processing manual attention, like running a piece of sandpaper over a machine-made joint.

Again, we started out with 714 disk images. The first thing done was to run them against a script that has hash checksums for every exposed Apple II disk image on the Archive, which now number over 10,000. Doing this dropped the “uniquely new” disk images from 714 to 667.

Next, I concatenated disk images that are part of the same product into one item: if a paint program has two floppy disk images for each of the sides of its disk, those become a single item. In one or two cases, the program spans multiple floppies, so 4-8 (and in one case, 14!) floppy images become a single item. Doing this dropped the total from 667 to 495 unique items. That’s why the number is significantly smaller than the original total.

Let’s talk for a moment about this.

Using hashes and comparing them is the roughest of rough approaches to de-duplicating software items. I do it with Apple II images because they tend to be self-contained (a single .dsk file) and because Apple II software has a lot of people involved in it. I’m not alone by any means in acquiring these materials and I’m certainly not alone in terms of work being done to track down all the unique variations and most obscure and nearly lost packages written for this platform. If I was the only person in the world (or one of a tiny sliver) working on this I might be super careful with each and every item to catalog it – but I’m absolutely not; I count at least a half-dozen operations involved in Apple II floppy image ingestion.

And as a bonus, it’s a really nice platform. When someone puts their heart into an Apple II program, it rewards them and the end user as well – the graphics can be charming, the program flow intuitive, and the whole package just gleams on the screen. It’s rewarding to work with this corpus, so I’m using it as a test bed for all these methods, including using hashes.

But hash checksums are seriously not the be-all for this work. Anything can make a hash different – an added file, a modified bit, or a compilation of already-on-the-archive-in-a-hundred-places files that just happen to be grouped up slightly different than others. That said, it’s not overwhelming – you can read about what’s on a floppy and decide what you want pretty quickly; gigabytes will not be lost and the work to track down every single unique file has potential but isn’t necessary yet.

(For the people who care, the Internet Archive generates three different hashes (md5, crc32, sha1) and lists the size of the file – looking across all of those for comparison is pretty good for ensuring you probably have something new and unique.)
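That comparison can be sketched in a few lines of Python, using the same four values (function names are mine; the Archive's actual tooling isn't shown in the post):

```python
import hashlib
import zlib
from pathlib import Path

def image_fingerprint(path):
    """md5, crc32, sha1, and size -- the four values the Archive lists
    for every file. Matching on all four is a strong (if not
    infallible) signal that a disk image is a duplicate."""
    data = Path(path).read_bytes()
    return (hashlib.md5(data).hexdigest(),
            format(zlib.crc32(data), "08x"),
            hashlib.sha1(data).hexdigest(),
            len(data))

def uniquely_new(paths, known_fingerprints):
    """Return only the images whose fingerprint is not already in
    `known_fingerprints`, adding each new fingerprint as it is seen."""
    fresh = []
    for p in paths:
        fp = image_fingerprint(p)
        if fp not in known_fingerprints:
            known_fingerprints.add(fp)
            fresh.append(p)
    return fresh
```

Seeding `known_fingerprints` with the hashes of everything already on the Archive and running the new pile through `uniquely_new` is the kind of pass that dropped 714 images to 667.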

Once the items are up there, the Screen Shotgun whips into action. It plays the programs in the emulator, takes screenshots, leafs off the unique ones, and then assembles it all into a nice package. Again, not perfect but left alone, it does the work with no human intervention and gets things generally right. If you see a screenshot in this collection, a robot did it and I had nothing to do with it.

This leads, of course, to scaring out which programs are a tad not-bootable, and by that I mean that they boot up in the emulator and the emulator sees them and all, but the result is not that satisfying:

On a pure accuracy level, this is doing exactly what it’s supposed to – the disk wasn’t ever a properly packaged, self-contained item, and it needs a boot disk to go in the machine first before you swap the floppy. I intend to work with volunteers to help with this problem, but here is where it stands.

The solution in the meantime is a java program modified by Kevin Savetz, which analyzes the floppy disk image and prints all the disk information it can find, including the contents of BASIC programs and textfiles. Here’s a non-booting disk where this worked out. The result is that this all gets ingested into the search engine of the Archive, and so if you’re looking for a file within the disk images, there’s a chance you’ll be able to find it.

Once the robots have their way with all the items, I can go in and fix a few things, like screenshots that went south, or descriptions and titles that don’t reflect what actually boots up. The amount of work I, a single person, have to do is therefore reduced to something manageable.

I think this all works well enough for the contemporary vintage software researcher and end user. Perhaps that opinion is not universal.

What I can say, however, is that the core action here – of taking data away from a transient and at-risk storage medium and putting it into a slightly less transient, less at-risk storage medium – is 99% of the battle. To have the will to do it, to connect with the people who have these items around and to show them it’ll be painless for them, and to just take the time to shove floppies into a drive and read them, hundreds of times… that’s the huge mountain to climb right now. I no longer have particularly deep concerns about technology failing to work with these digital images, once they’re absorbed into the Internet. It’s this current time, out in the cold, unknown and unloved, that they’re the most at risk.

The rest, I’m going to say, is gravy.

I’ll talk more about exactly how tasty and real that gravy is in the future, but for now, please take a pleasant walk in the 3D0G Knight’s Domain.

The Followup

Published 14 Mar 2017 by Jason Scott in ASCII by Jason Scott.

Writing about my heart attack garnered some attention. I figured it was only right to fill in later details and describe what my current future plans are.

After the previous entry, I went back into the emergency room of the hospital I was treated at, twice.

The first time was because I “felt funny”; I just had no grip on “is this the new normal” and so just to understand that, I went back in and got some tests. They did an EKG, a blood test, and let me know all my stats were fine and I was healing according to schedule. That took a lot of stress away.

Two days later, I went in because I was having a marked shortness of breath, where I could not get enough oxygen in and it felt a little like I was drowning. Another round of tests, and one of the cardiologists mentioned a side effect of one of the drugs I was taking was this sort of shortness/drowning. He said it usually went away and the company claimed 5-7% of people got this side effect, but that they observed more like 10-15%. They said I could wait it out or swap drugs. I chose swap. After that, I’ve had no other episodes.

The hospital thought I should stay in Australia for 2 weeks before flying. Thanks to generosity from both MuseumNext and the ACMI, my hosts, that extra AirBnB time was basically paid for. MuseumNext also worked to help move my international flight ahead the weeks needed; a very kind gesture.

Kind gestures abounded, to be clear. My friend Rochelle extended her stay from New Zealand to stay an extra week; Rachel extended hers to match my new departure date. Folks rounded up funds and sent them along, which helped cover some additional costs. Visitors stopped by the AirBnB when I wasn’t really taking any walks outside, to provide additional social contact.

Here is what the blockage looked like, before and after. As I said, roughly a quarter of my heart wasn’t getting any significant blood and somehow I pushed through it for nearly a week. The insertion of a balloon and then a metal stent opened the artery enough for the blood flow to return. Multiple times, people made it very clear that this could have finished me off handily, and mostly luck involving how my body reacted was what kept me going and got me in under the wire.

From the responses to the first entry, it appears that a lot of people didn’t know heart attacks could be a lingering, growing issue and not just a bolt of lightning that strikes in the middle of a show or while walking down the street. If nothing else, I’m glad that it’s caused a number of people to be aware of how symptoms present themselves, as well as getting people to check their cholesterol, which I didn’t see as a huge danger compared to other factors, and which turned out to be significant indeed.

As for drugs, I’ve got a once a day waterfall of pills for blood pressure, cholesterol, heart healing, anti-clotting, and my long-handled annoyances of gout (which I’ve not had for years thanks to the pills). I’m on some of them for the next few months, some for a year, and some forever. I’ve also been informed I’m officially at risk for another heart attack, but the first heart attack was my hint in that regard.

As I healed, and understood better what was happening to me, I got better remarkably quick. There is a single tiny dot on my wrist from the operation, another tiny dot where the IV was in my arm at other times. Rachel gifted a more complicated Fitbit to replace the one I had, with the new one tracking sleep schedule and heart rate, just to keep an eye on it.

A day after landing back in the US, I saw a cardiologist at Mt. Sinai, one of the top doctors, who gave me some initial reactions to my charts and information: I’m very likely going to be fine, maybe even better than before. I need to take care of myself, and I was. If I was smoking or drinking, I’d have to stop, but since I’ve never had alcohol and I’ve never smoked, I’m already ahead of that game. I enjoy walking, a lot. I stay active. And as of getting out of the hospital, I am vegan for at least a year. Caffeine’s gone. Raw vegetables are in.

One might hesitate putting this all online, because the Internet is spectacularly talented at generating hatred and health advice. People want to help – it comes from a good place. But I’ve got a handle on it and I’m progressing well; someone hitting me up with a nanny-finger-wagging paragraph and 45 links to isn’t going to help much. But go ahead if you must.

I failed to mention it before, but when this was all going down, my crazy family of the Internet Archive jumped in, everyone from Dad Brewster through to all my brothers and sisters scrambling to find me my insurance info and what they had on their cards, as I couldn’t find mine. It was something really late when I first pinged everyone with “something is not good” and everyone has been rather spectacular over there. Then again, they tend to be spectacular, so I sort of let that slip by. Let me rectify that here.

And now, a little bit on health insurance.

I had travel insurance as part of my health insurance with the Archive. That is still being sorted out, but a large deposit had to be put on the Archive’s corporate card as a down-payment during the sorting out, another fantastic generosity, even if it’s technically a loan. I welcome the coming paperwork and nailing down of financial brass tacks for a specific reason:

I am someone who once walked into an emergency room with no insurance (back in 2010), got a blood medication IV, stayed around a few hours, and went home, generating a $20,000 medical bill in the process. It got knocked down to $9k over time, and I ended up being thrown into a low-income program they had that allowed them to write it off (I think). That bill could have destroyed me, financially. Therefore, I’m super sensitive to the costs of medical care.

In Australia, it is looking like the heart operation and the 3 day hospital stay, along with all the tests and staff and medications, are going to round out around $10,000 before the insurance comes in and knocks that down further (I hope). In the US, I can’t imagine that whole thing being less than $100,000.

The biggest culture shock for me was how little any of the medical staff, be they doctors or nurses or administrators, cared about the money. They didn’t have any real info on what things cost, because pretty much everything is free there. I’ve equated it to asking a restaurant where the best toilets are a few hours after your meal – they might have some random ideas, but nobody’s really thinking that way. It was a huge factor in my returning to the emergency room so willingly; each visit, all-inclusive, was $250 AUD, which is even less in US dollars. $250 is something I’ll gladly pay for peace of mind, and I did, twice. The difference in the experience is remarkable. I realize this is a hot button issue now, but chalk me up as another person for whom a life-changing experience could come within a remarkably close distance of being an influence on where I might live in the future.

Dr. Sonny Palmer, who performed the insertion of my stent in the operating room.

I had a pile of plans and things to get done (documentaries, software, cutting down on my possessions, and so on), and I’ll be getting back to them. I don’t really have an urge to maintain some sort of health narrative on here, and I certainly am not in the mood to urge any lifestyle changes or preach a way of life to folks. I’ll answer questions if people have them from here on out, but I’d rather be known for something other than powering through a heart attack, and maybe, with some effort, I can do that.

Thanks again to everyone who has been there for me, online and off, in person and far away, over the past few weeks. I’ll try my best to live up to your hopes about what opportunities my second chance at life will give me.


Want to learn about Archivematica whilst watching the ducks?

Published 13 Mar 2017 by Jenny Mitcham in Digital Archiving at the University of York.

We are really excited to be hosting the first European Archivematica Camp here at the University of York next month - on the 4-6th April.

Don't worry - there will be no tents or campfires...but there may be some wildlife on the lake.

The Ron Cooke Hub on a frosty morning - hoping for some warmer weather for Camp!

The event is taking place at the Ron Cooke Hub over on our Heslington East campus. If you want to visit the beautiful City of York (OK, I'm biased!) and meet other European Archivematica users (or Archivematica explorers) this event is for you. Artefactual Systems will be leading the event and the agenda is looking very full and interesting.

I'm most looking forward to learning more about the workflows that other Archivematica users have in place or are planning to implement.

One of these lakeside 'pods' will be our breakout room

There are still places left and you can register for Camp here or contact the organisers at

...and if you are not able to attend in person, do watch this blog in early April as you can guarantee I'll be blogging after the event!

Through the mirror-glass: Capture of artwork framed in glass.

Published 13 Mar 2017 by slwacns in State Library of Western Australia Blog.


State Library’s collection material that is selected for digitisation comes to the Digitisation team in a variety of forms. This blog describes capture of artwork that is framed and encased within glass.

So let’s see how the item is digitised.


Two large framed original artworks from the picture book Teacup written by Rebecca Young and illustrated by Matt Ottley posed some significant digitisation challenges.

When artwork from the Heritage collection is framed in glass, the glass acts like a mirror and without great care during the capture process, the glass can reflect whatever is in front of it, meaning that the photographer’s reflection (and the reflection of capture equipment) can obscure the artwork.

This post shows how we avoided this issue during the digitisation of two large framed paintings, Cover illustration for Teacup and also page 4-5 [PWC/255/01] and The way the whales called out to each other [PWC/255/09].

Though it is sometimes possible to remove the artwork from its housing, there are occasions when this is not suitable. In this example, the decision was made to not remove the artworks from behind glass as the Conservation staff assessed that it would be best if the works were not disturbed from their original housing.

PWC/255/01                                                         PWC/255/09

The most critical issue was to be in control of the light. Rearranging equipment in the workroom allowed for the artwork to face a black wall, a method used by photographers to eliminate reflections.


We used black plastic across the entrance of the workroom to eliminate all unwanted light.


The next challenge was to set up the camera. For this shoot we used our Hasselblad H3D11 (a 39 megapixel camera with excellent colour fidelity).


Prior to capture, we gave the glass a good clean with an anti-static cloth. In the images below, you can clearly see the reflection caused by the mirror effect of the glass.


Since we don’t have a dedicated photographic studio we needed to be creative when introducing extra light to allow for the capture. Bouncing the light off a large white card prevented direct light from falling on the artwork and reduced a significant number of reflections. We also used a polarizing filter on the camera lens to reduce reflections even further.


Once every reflection was eliminated and the camera set square to the artwork, we could test colour balance and exposure.

In the image below, you can see that we made the camera look like ‘Ned Kelly’ to ensure any shiny metal from the camera body didn’t reflect in the glass. We used the camera’s computer controlled remote shutter function to further minimise any reflections in front of the glass.



The preservation file includes technically accurate colour and greyscale patches to allow for colour fidelity and a ruler for accurate scaling in future reproductions.


The preservation file and a cropped version for access were then ingested into the State Library’s digital repository. The repository allows for current access and future reproductions to be made.

From this post you can see the care and attention that goes into preservation digitisation, ‘Do it right, do it once’ is our motto.

Filed under: Children's Literature, Exhibitions, Illustration, Picture Books, SLWA collections, SLWA Exhibitions, State Library of Western Australia, Uncategorized, WA, Western Australia Tagged: digitisation, illustration, slwa, SLWA collections, WA, WA Author

Week #8: Warriors are on the right path

Published 12 Mar 2017 by legoktm in The Lego Mirror.

As you might have guessed due to the lack of previous coverage of the Warriors, I'm not really a basketball fan. But the Warriors are in an interesting place right now. After setting an NBA record for being the fastest team to clinch a playoff spot, Coach Kerr has started resting his starters and the Warriors have a three game losing streak. This puts the Warriors in danger of losing their first seed spot with the San Antonio Spurs only half a game behind them.

But I think the Warriors are doing the right thing. Last year the Warriors set the record for having the best regular season record in NBA history, but also became the first team in NBA history to have a 3-1 advantage in the finals and then lose.

No doubt there was immense pressure on the Warriors last year. It was just expected of them to win the championship, there really wasn't anything else.

So this year they can easily avoid a lot of that pressure by not being the best team in the NBA on paper. They shouldn't worry about being the top seed - just finish in the top four and play their best in the playoffs. They should get some rest; they have a huge advantage over every other team simply by already being in the playoffs with so many games left to play.

How can we preserve our wiki pages

Published 10 Mar 2017 by Jenny Mitcham in Digital Archiving at the University of York.

I was recently prompted by a colleague to investigate options for preserving institutional wiki pages. At the University of York we use the Confluence wiki and this is available for all staff to use for a variety of purposes. In the Archives we have our own wiki space on Confluence which we use primarily for our meeting agendas and minutes. The question asked of me was how can we best capture content on the wiki that needs to be preserved for the long term? 

Good question and just the sort of thing I like to investigate. Here are my findings...

Space export

The most sensible way to approach the transfer of a set of wiki pages to the digital archive would be to export them using the export options available within the Space Tools.

The main problem with this approach is that a user will need to have the necessary permissions on the wiki space in order to be able to use these tools ...I found that I only had the necessary permissions on those wiki spaces that I administer myself.

There are three export options as illustrated below:

Space export options - available if you have the right permissions!


HTML export

Once you select HTML, there are two options - a standard export (which exports the whole space) or a custom export (which allows you to select the pages you would like included within the export).

I went for a custom export and selected just one section of meeting papers. Each wiki page is saved as an HTML file. DROID identifies these as HTML version 5. All relevant attachments are included in the download in their original format.

There are some really good things about this export option:
  • The inclusion of attachments in the export - these are often going to be as valuable to us as the wiki page content itself. Note that they were all renamed with a number that tied them to the page that they were associated with. The original file name was, however, preserved in the linking wiki page text
  • The metadata at the top of a wiki page is present in the HTML pages: ie Created by Jenny Mitcham, last modified by Jenny Mitcham on 31, Oct, 2016 - this is really important to us from an archival point of view
  • The links work - including links to the downloaded attachments, other wiki pages and external websites or Google Docs
  • The export includes an index page which can act as a table of contents for the exported files - this also includes some basic metadata about the wiki space
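As an aside, the "Created by ..., last modified ..." line is plain text in the exported HTML, so a simple script could harvest it into a metadata record. The regex below is a hypothetical sketch keyed to the phrasing shown above; a real export may vary, so treat the pattern as an assumption to verify.

```python
import re

# Hypothetical pattern matching the metadata line seen in Confluence HTML exports.
METADATA_RE = re.compile(
    r"Created by (?P<creator>[^,]+), last modified(?: by (?P<modifier>[^ ].*?))? on (?P<date>[^<]+)"
)

def extract_page_metadata(html_text):
    """Return creator/modifier/date from an exported page, or None if absent."""
    match = METADATA_RE.search(html_text)
    return match.groupdict() if match else None

sample = "<p>Created by Jenny Mitcham, last modified by Jenny Mitcham on 31, Oct, 2016</p>"
print(extract_page_metadata(sample))
```

Harvesting this line for every exported page would let the archival metadata travel alongside the HTML files rather than living only inside them.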


XML export

Again, there are two options here - either a standard export (of the whole space) or a custom export, which allows you to select whether or not you want comments to be exported and to choose exactly which pages you want to export.

I tried the custom export. It seemed to work and also did export all the relevant attachments. The attachments were all renamed as '1' (with no file extension), and the wiki page content is all bundled up into one huge XML file.

On the plus side, this export option may contain more metadata than the other options (for example the page history) but it is difficult to tell as the XML file is so big and unwieldy and hard to interpret. Really it isn't designed to be usable. The main function of this export option is to move wiki pages into another instance of Confluence.


PDF export

Again you have the option to export the whole space or choose your pages. There are also other configurations you can make to the output but these are mostly cosmetic.

I chose the same batch of meeting papers to export as PDF and this produced a 111 page PDF document. The first page is a contents page which lists all the other pages alphabetically with hyperlinks to the right section of the document. It is hard to use the document as the wiki pages seem to run into each other without adequate spacing, and because of the linear nature of a PDF document you feel drawn to read it in the order it is presented (which in this case is not a logical order for the content). Attachments are not included in the download though links to the attachments are maintained in the PDF file and they do continue to resolve to the right place on the wiki. Creation and last modified metadata is also not included in the export.

Single page export

As well as the Space Export options in Confluence there are also single page export options. These are available to anyone who can access the wiki page so may be useful if people do not have necessary permissions for a space export.

I exported a range of test pages using the 'Export to PDF' and 'Export to Word' options.

Export to PDF

The PDF files created in this manner are version 1.4. Sadly no option to export as PDF/A, but at least version 1.4 is closer to the PDF/A standard than some, so perhaps a subsequent migration to PDF/A would be successful.
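As a quick sanity check (a sketch, not a substitute for a characterisation tool like DROID), the PDF version can be read straight from the file header, which begins with the bytes `%PDF-x.y`:

```python
def pdf_version(path):
    """Return the version from a %PDF-x.y file header, or None if not a PDF."""
    with open(path, "rb") as f:
        header = f.read(8)
    return header[5:8].decode("ascii") if header.startswith(b"%PDF-") else None
```

Running this over a batch of exports would quickly confirm whether they are all version 1.4 before attempting a migration to PDF/A.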

Export to Word

Surprisingly the 'Word' files produced by Confluence appear not to be Word files at all!

Double click on the files in Windows Explorer and they open in Microsoft Word no problem, but DROID identifies the files as HTML (with no version number) and reports a file extension mismatch (because the files have a .doc extension).

If you view the files in a text application you can clearly see the Content-Type marked as text/html and <html> tags within the document. Quick View Plus, however, views them as an Internet Mail Message with the following text displayed at the top of each page:

Subject: Exported From Confluence
1024x640 72 Print 90

All very confusing and certainly not giving me a lot of faith in this particular export format!
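A simple content sniff (a hypothetical sketch, again no substitute for DROID) can flag '.doc' files that are actually HTML, by checking whether the first bytes look like markup or a text/html content type:

```python
from pathlib import Path

def is_really_html(path, probe=1024):
    """Heuristic: does this file's content look like HTML, whatever its
    extension says? Catches exports that carry a .doc extension but
    contain text/html, like the Confluence 'Word' files described above."""
    head = Path(path).read_bytes()[:probe].lstrip().lower()
    return head.startswith((b"<html", b"<!doctype html")) or b"content-type: text/html" in head
```

Something like `[p for p in Path("export").rglob("*.doc") if is_really_html(p)]` would then list the mismatched files in an export directory.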


Both of these single page export formats do a reasonable job of retaining the basic content of the wiki pages - both versions include many of the key features I was looking for - text, images, tables, bullet points, colours. 

Where advanced formatting has been used to lay out a page using coloured boxes, the PDF version does a better job at replicating this than the 'Word' version. Whilst the PDF attempts to retain the original formatting, the 'Word' version displays the information in a much more linear fashion.

Links were also more usefully replicated in the PDF version. The absolute URL of all links, whether internal, external or to attachments, was included within the PDF file so that it is possible to follow them to their original location (if you have the necessary permissions to view the pages). In the 'Word' versions, only external links worked in this way. Internal wiki links and links to attachments were exported as relative links, which become 'broken' once the page is taken out of its original context.

The naming of the files that were produced is also worthy of comment. The 'Word' versions are given a name which mirrors the name of the page within the wiki space, but the naming of the PDF versions is much more useful, including the name of the wiki space itself, the page name and a date and timestamp showing when the page was exported.

Neither of these single page export formats retained the creation and last modified metadata for each page and this is something that it would be very helpful to retain.


So, if we want to preserve pages from our institutional wiki, what is the best approach?

The Space Export in HTML format is a clear winner. It reproduces the wiki pages in a reusable form that replicates the page content well. As HTML is essentially just plain text it is also a good format for long term preservation.

What impressed me about the HTML export was the fact that it retained the content, included basic creation and last modified metadata for each page and downloaded all relevant attachments, updating the links to point to these local copies.

What if someone does not have the necessary permissions to do a space export? My first suggestion would be that they ask for their permissions to be upgraded. If not, perhaps someone who does have necessary permissions could carry out the export?

If all else fails, the export of a single page using the 'Export as PDF' option could be used to provide ad hoc content for the digital archive. PDF is not the best preservation format but may be able to be converted to PDF/A. Note that any attachments would have to be exported separately and manually if this option were selected.

Final thoughts

A wiki space is a dynamic thing which can involve several different types of content - blog posts, labels/tags and comments can all be added to wiki spaces and pages. If these elements are thought to be significant then more work is required to see how they can be captured. It was apparent that comments could be captured using the HTML and XML exports and I believe blog posts can be captured individually as PDF files.

What is also available within the wiki platform itself is a very detailed Page History. Within each wiki page it is possible to view the Page History and see how a page has evolved over time - who has edited it and when those edits occurred. As far as I could see, none of the export formats included this level of information. The only exception may be the XML export but this was so difficult to view that I could not be sure either way.

So, there are limitations to all these approaches and as ever this goes back to the age old discussion about Significant Properties. What is significant about the wiki pages? What is it that we are trying to preserve? None of the export options preserve everything. All are compromises, but perhaps some are compromises we could live with.

China – Arrival in the Middle Kingdom

Published 9 Mar 2017 by Tom Wilson in thomas m wilson.

I’ve arrived in Kunming, the little red dot you can see on the map above.  I’m here to teach research skills to undergraduate students at Yunnan Normal University.  As you can see, I’ve come to a point where the foothills of the Himalayas fold up into a bunch of deep creases.  Yunnan province is the area of […]

Introducing Similarity Search at Flickr

Published 7 Mar 2017 by Clayton Mellina in

At Flickr, we understand that the value in our image corpus is only unlocked when our members can find photos and photographers that inspire them, so we strive to enable the discovery and appreciation of new photos.

To further that effort, today we are introducing similarity search on Flickr. If you hover over a photo on a search result page, you will reveal a “…” button that exposes a menu that gives you the option to search for photos similar to the photo you are currently viewing.

In many ways, photo search is very different from traditional web or text search. First, the goal of web search is usually to satisfy a particular information need, while with photo search the goal is often one of discovery; as such, it should be delightful as well as functional. We have taken this to heart throughout Flickr. For instance, our color search feature, which allows filtering by color scheme, and our style filters, which allow filtering by styles such as “minimalist” or “patterns,” encourage exploration. Second, in traditional web search, the goal is usually to match documents to a set of keywords in the query. That is, the query is in the same modality—text—as the documents being searched. Photo search usually matches across modalities: text to image. Text querying is a necessary feature of a photo search engine, but, as the saying goes, a picture is worth a thousand words. And beyond saving people the effort of so much typing, many visual concepts genuinely defy accurate description. Now, we’re giving our community a way to easily explore those visual concepts with the “…” button, a feature we call the similarity pivot.

The similarity pivot is a significant addition to the Flickr experience because it offers our community an entirely new way to explore and discover the billions of incredible photos and millions of incredible photographers on Flickr. It allows people to look for images of a particular style, it gives people a view into universal behaviors, and even when it “messes up,” it can force people to look at the unexpected commonalities and oddities of our visual world with a fresh perspective.

What is “similarity”?

To understand how an experience like this is powered, we first need to understand what we mean by “similarity.” There are many ways photos can be similar to one another. Consider some examples.

It is apparent that all of these groups of photos illustrate some notion of “similarity,” but each is different. Roughly, they are: similarity of color, similarity of texture, and similarity of semantic category. And there are many others that you might imagine as well.

What notion of similarity is best suited for a site like Flickr? Ideally, we’d like to be able to capture multiple types of similarity, but we decided early on that semantic similarity—similarity based on the semantic content of the photos—was vital to facilitate discovery on Flickr. This requires a deep understanding of image content for which we employ deep neural networks.

We have been using deep neural networks at Flickr for a while for various tasks such as object recognition, NSFW prediction, and even prediction of aesthetic quality. For these tasks, we train a neural network to map the raw pixels of a photo into a set of relevant tags, as illustrated below.

Internally, the neural network accomplishes this mapping incrementally by applying a series of transformations to the image, which can be thought of as a vector of numbers corresponding to the pixel intensities. Each transformation in the series produces another vector, which is in turn the input to the next transformation, until finally we have a vector that we specifically constrain to be a list of probabilities for each class we are trying to recognize in the image. To be able to go from raw pixels to a semantic label like “hot air balloon,” the network discards lots of information about the image, including information about  appearance, such as the color of the balloon, its relative position in the sky, etc. Instead, we can extract an internal vector in the network before the final output.

For common neural network architectures, this vector—which we call a “feature vector”—has many hundreds or thousands of dimensions. We can’t necessarily say with certainty that any one of these dimensions means something in particular as we could at the final network output, whose dimensions correspond to tag probabilities. But these vectors have an important property: when you compute the Euclidean distance between these vectors, images containing similar content will tend to have feature vectors closer together than images containing dissimilar content. You can think of this as a way that the network has learned to organize information present in the image so that it can output the required class prediction. This is exactly what we are looking for: Euclidean distance in this high-dimensional feature space is a measure of semantic similarity. The graphic below illustrates this idea: points in the neighborhood around the query image are semantically similar to the query image, whereas points in neighborhoods further away are not.
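As an illustration (not Flickr's production code), the brute-force version of this nearest-neighbor search is a few lines of NumPy; the random vectors below stand in for real network features:

```python
import numpy as np

rng = np.random.default_rng(0)
index = rng.normal(size=(10_000, 128))            # stand-ins for indexed feature vectors
query = index[42] + 0.01 * rng.normal(size=128)   # a query very close to item 42

# Euclidean distance from the query to every indexed vector, then rank ascending.
dists = np.linalg.norm(index - query, axis=1)
nearest = np.argsort(dists)[:5]                   # ids of the 5 most similar items
```

This exhaustive scan is exactly what becomes intractable at billions of images, motivating the approximate search described next.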

This measure of similarity is not perfect and cannot capture all possible notions of similarity—it will be constrained by the particular task the network was trained to perform, i.e., scene recognition. However, it is effective for our purposes, and, importantly, it contains information beyond merely the semantic content of the image, such as appearance, composition, and texture. Most importantly, it gives us a simple algorithm for finding visually similar photos: compute the distance in the feature space of a query image to each index image and return the images with lowest distance. Of course, there is much more work to do to make this idea work for billions of images.

Large-scale approximate nearest neighbor search

With an index as large as Flickr’s, computing distances exhaustively for each query is intractable. Additionally, storing a high-dimensional floating point feature vector for each of billions of images takes a large amount of disk space and poses even more difficulty if these features need to be in memory for fast ranking. To solve these two issues, we adopt a state-of-the-art approximate nearest neighbor algorithm called Locally Optimized Product Quantization (LOPQ).

To understand LOPQ, it is useful to first look at a simple strategy. Rather than ranking all vectors in the index, we can first filter a set of good candidates and only do expensive distance computations on them. For example, we can use an algorithm like k-means to cluster our index vectors, find the cluster to which each vector is assigned, and index the corresponding cluster id for each vector. At query time, we find the cluster that the query vector is assigned to and fetch the items that belong to the same cluster from the index. We can even expand this set if we like by fetching items from the next nearest cluster.

This idea will take us far, but not far enough for a billions-scale index. For example, with 1 billion photos, we need 1 million clusters so that each cluster contains an average of 1000 photos. At query time, we will have to compute the distance from the query to each of these 1 million cluster centroids in order to find the nearest clusters. This is quite a lot. We can do better, however, if we instead split our vectors in half by dimension and cluster each half separately. In this scheme, each vector will be assigned to a pair of cluster ids, one for each half of the vector. If we choose k = 1000 to cluster both halves, we have k² = 1000 × 1000 = 10⁶ possible pairs. In other words, by clustering each half separately and assigning each item a pair of cluster ids, we can get the same granularity of partitioning (1 million clusters total) with only 2 × 1000 distance computations with half the number of dimensions, for a total computational savings of 1000x. Conversely, for the same computational cost, we gain a factor of k more partitions of the data space, providing a much finer-grained index.
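A toy sketch of that split-and-cluster idea, using a minimal hand-rolled k-means on tiny data (the real system uses far larger k, higher dimensions, and optimized implementations):

```python
import numpy as np

def kmeans(data, k, iters=10, seed=0):
    """Minimal Lloyd's k-means; returns centroids and cluster assignments."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((data[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            members = data[assign == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, assign

rng = np.random.default_rng(1)
vectors = rng.normal(size=(2000, 8))           # toy 8-d "feature vectors"
left, right = vectors[:, :4], vectors[:, 4:]   # split each vector in half by dimension

k = 16                                         # 16 clusters per half -> 256 virtual cells
c_left, id_left = kmeans(left, k)
c_right, id_right = kmeans(right, k)

# Each vector is now indexed by a pair of cluster ids: k*k cells are addressed
# with only 2*k (half-dimensional) distance computations at query time.
codes = np.stack([id_left, id_right], axis=1)
```

With k = 16 here the savings are modest, but the same structure at k = 1000 yields the million-cell index described above.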

This idea of splitting vectors into subvectors and clustering each split separately is called product quantization. When we use this idea to index a dataset it is called the inverted multi-index, and it forms the basis for fast candidate retrieval in our similarity index. Typically the distribution of points over the clusters in a multi-index will be unbalanced as compared to a standard k-means index, but this unbalance is a fair trade for the much higher resolution partitioning that it buys us. In fact, a multi-index will only be balanced across clusters if the two halves of the vectors are perfectly statistically independent. This is not the case in most real world data, but some heuristic preprocessing—like PCA-ing and permuting the dimensions so that the cumulative per-dimension variance is approximately balanced between the halves—helps in many cases. And just like the simple k-means index, there is a fast algorithm for finding a ranked list of clusters to a query if we need to expand the candidate set.

After we have a set of candidates, we must rank them. We could store the full vector in the index and use it to compute the distance for each candidate item, but this would incur a large memory overhead (for example, 256 dimensional vectors of 4 byte floats would require 1TB for 1 billion photos) as well as a computational overhead. LOPQ solves these issues by performing another product quantization, this time on the residuals of the data. The residual of a point is the difference vector between the point and its closest cluster centroid. Given a residual vector and the cluster indexes along with the corresponding centroids, we have enough information to reproduce the original vector exactly. Instead of storing the residuals, LOPQ product quantizes the residuals, usually with a higher number of splits, and stores only the cluster indexes in the index. For example, if we split the vector into 8 splits and each split is clustered with 256 centroids, we can store the compressed vector with only 8 bytes regardless of the number of dimensions to start (though certainly a higher number of dimensions will result in higher approximation error). With this lossy representation we can produce a reconstruction of a vector from the 8 byte codes: we simply take each quantization code, look up the corresponding centroid, and concatenate these 8 centroids together to produce a reconstruction. Likewise, we can approximate the distance from the query to an index vector by computing the distance between the query and the reconstruction. We can do this computation quickly for many candidate points by computing the squared difference of each split of the query to all of the centroids for that split. After computing this table, we can compute the squared difference for an index point by looking up the precomputed squared difference for each of the 8 indexes and summing them together to get the total squared difference.
This caching trick allows us to quickly rank many candidates without resorting to distance computations in the original vector space.
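The table-based distance computation can be sketched as follows; random codebooks and codes stand in for the fitted quantizers, and the sanity check at the end confirms the lookup-table distance equals the explicit query-to-reconstruction distance:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k = 32, 8, 256            # dimensions, number of splits, centroids per split
sub = d // m                    # dimensions per split

codebooks = rng.normal(size=(m, k, sub))        # one codebook per split
codes = rng.integers(0, k, size=(5000, m))      # one 1-byte code per split per item

query = rng.normal(size=d)
q_splits = query.reshape(m, sub)

# Once per query: squared distance from each query split to every centroid
# of that split, giving an (m, k) lookup table.
tables = ((codebooks - q_splits[:, None, :]) ** 2).sum(-1)

# Approximate squared distance to each item = sum of m table lookups.
approx = tables[np.arange(m), codes].sum(axis=1)

# Explicit reconstruction of item 0, for comparison with the table lookup.
recon = np.concatenate([codebooks[s, codes[0, s]] for s in range(m)])
```

Because the per-centroid distances are computed once and reused for every candidate, ranking thousands of candidates costs only table lookups and additions.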

LOPQ adds one final detail: for each cluster in the multi-index, LOPQ fits a local rotation to the residuals of the points that fall in that cluster. This rotation is simply a PCA that aligns the major directions of variation in the data to the axes, followed by a permutation to heuristically balance the variance across the splits of the product quantization. Note that this is the exact preprocessing step that is usually performed at the top-level multi-index. It tends to make the approximate distance computations more accurate by mitigating errors introduced by assuming that each split of the vector in the product quantization is statistically independent from other splits. Additionally, since a rotation is fit for each cluster, the rotations serve to fit the local data distribution better.

Below is a diagram from the LOPQ paper that illustrates the core ideas of LOPQ. K-means (a) is very effective at allocating cluster centroids, illustrated as red points, that target the distribution of the data, but it has other drawbacks at scale as discussed earlier. In the 2d example shown, we can imagine product quantizing the space with 2 splits, each with 1 dimension. Product Quantization (b) clusters each dimension independently and cluster centroids are specified by pairs of cluster indexes, one for each split. This is effectively a grid over the space. Since the splits are treated as if they were statistically independent, we will, unfortunately, get many clusters that are “wasted” by not targeting the data distribution. We can improve on this situation by rotating the data such that the main dimensions of variation are axis-aligned. This version, called Optimized Product Quantization (c), does a better job of making sure each centroid is useful. LOPQ (d) extends this idea by first coarsely clustering the data and then doing a separate instance of OPQ for each cluster, allowing highly targeted centroids while still reaping the benefits of product quantization in terms of scalability.

LOPQ is state-of-the-art for quantization methods, and you can find more information about the algorithm, as well as benchmarks, here. Additionally, we provide an open-source implementation in Python and Spark which you can apply to your own datasets. The algorithm produces a set of cluster indexes that can be queried efficiently in an inverted index, as described. We have also explored use cases that use these indexes as a hash for fast deduplication of images and large-scale clustering. These extended use cases are studied here.


We have described our system for large-scale visual similarity search at Flickr. Techniques for producing high-quality vector representations for images with deep learning are constantly improving, enabling new ways to search and explore large multimedia collections. These techniques are being applied in other domains as well to, for example, produce vector representations for text, video, and even molecules. Large-scale approximate nearest neighbor search has importance and potential application in these domains as well as many others. Though these techniques are in their infancy, we hope similarity search provides a useful new way to appreciate the amazing collection of images at Flickr and surface photos of interest that may have previously gone undiscovered. We are excited about the future of this technology at Flickr and beyond.


Yannis Kalantidis, Huy Nguyen, Stacey Svetlichnaya, Arel Cordero. Special thanks to the rest of the Computer Vision and Machine Learning team and the Vespa search team, which manages Yahoo’s internal search engine.

Thumbs.db – what are they for and why should I care?

Published 7 Mar 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Recent work I’ve been doing on the digital archive has made me think a bit more about those seemingly innocuous files that Windows (XP, Vista, 7 and 8) puts into any directory that has images in – Thumbs.db.

Getting your folder options right helps!
Windows uses a file called Thumbs.db to create little thumbnail images of any images within a directory. It stores one of these files in each directory that contains images and it is amazing how quickly they proliferate. Until recently I wasn’t aware I had any in my digital archive at all. This is because although my preferences in Windows Explorer were set to display hidden files, the "Hide protected operating system files" option also needs to be disabled in order to see files such as these.

The reason I knew I had all these Thumbs.db files was through a piece of DROID analysis work published last month. Thumbs.db ranked at number 12 in my list of the most frequently occurring file formats in the digital archive. I had 210 of these files in total. I mentioned at the time that I could write a whole blog post about this, so here it is!

Do I really want these in the digital archive? In my mind, what is in the ‘original’ folders within the digital archive should be what OAIS would call the Submission Information Package (SIP). Just those files that were given to us by a donor or depositor. Not files that were created subsequently by my own operating system.

Though they are harmless enough they can be a bit irritating. Firstly, when I’m trying to run reports on the contents of the archive, the number of files for each archive is skewed by the Thumbs.db files that are not really a part of the archive. Secondly, and perhaps more importantly, I was trying to create a profile of the dates of files within the digital archive (admittedly not an exact science when using last modified dates) and the span of dates for each individual archive that we hold. The presence of Thumbs.db files in each archive that contained images gave the false impression that all of the archives had had content added relatively recently, when in fact all that had happened was that a Thumbs.db file had automatically been added when I had transferred the data to the digital archive filestore. It took me a while to realise this - gah!

So, what to do? First I needed to work out how to stop them being created.

After a bit of googling I quickly established the fact that I didn’t have the necessary permissions to be able to disable this default behaviour within Windows so I called in the help of IT Services.

IT clearly thought this was a slightly unusual request, but made a change to my account which now stops these thumbnail images being created by me. Given that I am the only person who has direct access to the born digital material within the archive, this should solve that problem.

Now I can systematically remove the files. This means that they won’t skew any future reports I run on numbers of files and last modified dates.
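A systematic clean-up like this can be sketched in a few lines of Python, assuming all you want is to walk the archive filestore and delete every Thumbs.db. The dry_run default is there so you can review the list before actually deleting anything in a real archive:

```python
import os

def remove_thumbs_db(root, dry_run=True):
    """Walk `root` and remove every Thumbs.db file found.

    With dry_run=True, nothing is deleted; the function just
    returns the list of matching paths for review.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() == "thumbs.db":
                path = os.path.join(dirpath, name)
                removed.append(path)
                if not dry_run:
                    os.remove(path)
    return removed
```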

Perhaps once we get a proper digital archiving system in place here at the Borthwick we won’t need to worry about these issues as we won’t directly interact with the archive filestore? Archivematica will package up the data into an AIP and put it on the filestore for me.

However, I will say that now IT have stopped the use of Thumbs.db from my account I am starting to miss them. This setting applies to my own working filestore as well as the digital archive. It turns out that it is actually incredibly useful to be able to see thumbnails of your image files before double clicking on them! Perhaps I need to get better at practicing what I preach and make some improvements to how I name my own image files – without a preview thumbnail, an image file *really* does benefit from a descriptive filename!

As always, I'm interested to hear how other people tackle Thumbs.db and any other system files within their digital archives.

This Month’s Writer’s Block

Published 7 Mar 2017 by Dave Robertson in Dave Robertson.



Published 6 Mar 2017 by timbaker in Tim Baker.

The image on the left was taken a year ago when I had to renew my driver’s license, so I am stuck with it for the next 10 years. I don’t mind so much as it reminds me how far I’ve come. The photo on...

Week #7: 999 assists and no more kneeling

Published 4 Mar 2017 by legoktm in The Lego Mirror.

Joe Thornton is one assist away from reaching 1,000 in his career. He's a team player - the recognition of scoring a goal doesn't matter to him; he just wants his teammates to score. And his teammates want him to achieve this milestone too, as shown by the Sharks passing to Thornton and him passing back instead of going directly for the easy empty netter.

Oh, and now that the trade deadline has passed with no movement on the goalie front, it's time for In Jones We Trust:

via /u/MisterrAlex on reddit

In other news, Colin Kaepernick announced that he's going to be a free agent and opted out of the final year of his contract. But in even bigger news, he said he will stop kneeling for the national anthem. I don't know if he is doing that to make himself more marketable, but I wish he would have stood (pun intended) with his beliefs.

Songs for the Beeliar Wetlands

Published 2 Mar 2017 by Dave Robertson in Dave Robertson.

The title track of the forthcoming Kiss List album has just been included on an awesome fundraising compilation of 17 songs by local songwriters for the Beeliar wetlands. All proceeds go to #rethinkthelink. Get it while it's hot! You can purchase the whole album or just the songs you like.

Songs for the Beeliar Wetlands: Original Songs by Local Musicians (Volume 1) by Dave Robertson and The Kiss List


Stepping Off Meets the Public

Published 1 Mar 2017 by Tom Wilson in thomas m wilson.

At the start of February I launched my new book, Stepping Off: Rewilding and Belonging in the South-West, at an event at Clancy’s in Fremantle.  On Tuesday evening this week I was talking about the book down at Albany Library.     As I was in the area I decided to camp for a couple of […]

Digital Deli, reading history in the present tense

Published 1 Mar 2017 by Carlos Fenollosa in Carlos Fenollosa — Blog.

Digital Deli: The Comprehensive, User Lovable Menu Of Computer Lore, Culture, Lifestyles, And Fancy is an obscure book published in 1984. I found out about it after learning that the popular Steve Wozniak article titled "Homebrew and How the Apple Came to Be" belonged to a compilation of short articles.

The book

I'm amazed that this book isn't more cherished by the retrocomputing community, as it provides an incredible insight into the state of computers in 1984. We've all read books about their history, but Digital Deli provides a unique approach: it's written in present tense.

Articles are written with a candid and inspiring narrative. Micro computers were new back then, and the authors could only speculate about how they might change the world in the future.

The book is adequately structured in sections covering topics from the origins of computing and Silicon Valley startups to reviews of specific systems. But the most interesting part for me is not the tech articles, but rather the sociological essays.

There are texts on how families welcome computers into the home, the applications of artificial intelligence, micros on Wall Street and computers in the classroom.

How the Source works

Fortunately, a copy of the book has been preserved online, and I highly encourage you to check it out and find some copies online.

Besides Woz explaining how Apple was founded, don't miss out on Paul Lutus describing how he programmed AppleWriter in a cabin in the woods, Les Solomon envisioning the "magic box" of computing, Ted Nelson on information exchange and his Project Xanadu, Nolan Bushnell on video games, Bill Gates on software usability, the origins of the Internet... the list goes on and on.

Les Solomon

If you love vintage computing you will find a fresh perspective, and if you were alive during the late 70s and early 80s you will feel a big nostalgia hit. In any case, do yourself a favor, grab a copy of this book, and keep it as a manifesto of the greatest revolution in computer history.

Tags: retro, books

Comments? Tweet  

Week #6: Barracuda win streak is great news for the Sharks

Published 24 Feb 2017 by legoktm in The Lego Mirror.

The San Jose Barracuda, the Sharks' AHL affiliate team, is currently riding a 13-game winning streak and is on top of the AHL — and that's great news for the Sharks.

Ever since the Barracuda moved here from Worcester, Mass., it's only been great news for the Sharks. Because they play in the same stadium, sending players up or down becomes as simple as a little paperwork and asking them to switch locker rooms, not cross-country flights.

This allows the Sharks to have a significantly deeper roster, since they can call up new players at a moment's notice. So the Barracuda's win streak is great news for Sharks fans, since it demonstrates how even the minor league players are ready to play in the pros.

And if you're watching hockey, be on the watch for Joe Thornton to score his 1,000 assist! (More on that next week).

How can I keep mediawiki not-yet-created pages from cluttering my google webmaster console with 404s?

Published 24 Feb 2017 by Sean in Newest questions tagged mediawiki - Webmasters Stack Exchange.

We have a MediaWiki install as part of our site. As on all wikis, people will add links for not-yet-created pages (red links). When followed, these links return a 404 status (as there is no content) along with an invite to add content.

I'm now getting buried in 404 notices in the Google webmaster console for this site. Is there a best way to handle this?

Thanks for any help.

The Other Half

Published 24 Feb 2017 by Jason Scott in ASCII by Jason Scott.

On January 19th of this year, I set off to California to participate in a hastily-arranged appearance in a UCLA building to talk about saving climate data in the face of possible administrative switchover. I wore a fun hat, stayed in a nice hotel, and saw an old friend from my MUD days for dinner. The appearance was a lot of smart people doing good work and wanting to continue with it.

While there, I was told my father’s heart surgery, which had some complications, was going to require an extended stay and we were running out of relatives and companions to accompany him. I booked a flight for seven hours after I’d arrive back in New York to go to North Carolina and stay with him. My father has means, so I stayed in a good nearby hotel room. I stayed with him for two and a half weeks, booking ten to sixteen hour days to accompany him through a maze of annoyances, indignities, smart doctors, variant nurses ranging from saints to morons, and generally ensure his continuance.

In the middle of this, I had a non-movable requirement to move the manuals out of Maryland and send them to California. Looking through several possibilities, I settled with: Drive five hours to Maryland from North Carolina, do the work across three days, and drive back to North Carolina. The work in Maryland had a number of people helping me, and involved pallet jacks, forklifts, trucks, and crazy amounts of energy drinks. We got almost all of it, with a third batch ready to go. I drove back the five hours to North Carolina and caught up on all my podcasts.

I stayed with my father another week and change, during which I dented my rental car, and hit another hard limit: I was going to fly to Australia. I also, to my utter horror, realized I was coming down with some sort of cold/flu. I did what I could – stabilized my father’s arrangements, went into the hotel room, put on my favorite comedians in a playlist, turned out the lights, drank 4,000mg of Vitamin C, banged down some orange juice, drank Mucinex, and covered myself in 5 blankets. I woke up 15 hours later in a pool of sweat and feeling like I’d crossed the boundary with that disease. I went back to the hospital to assure my dad was OK (he was), and then prepped for getting back to NY, where I discovered almost every flight for the day was booked due to so many cancelled flights the previous day.

After lots of hand-wringing, I was able to book a very late flight from North Carolina to New York, and stayed there for 5 hours before taking a 25 hour two-segment flight through Dubai to Melbourne.

I landed in Melbourne on Monday the 13th of February, happy that my father was stable back in the US, and prepping for my speech and my other commitments in the area.

On Tuesday I had a heart attack.

We know it happened then, or began to happen, because of the symptoms I started to show – shortness of breath, a feeling of fatigue and an edge of pain that covered my upper body like a jacket. I was fucking annoyed – I felt like I was just super tired and needed some energy, and energy drinks and caffeine weren’t doing the trick.

I met with my hosts for the event I’d do that Saturday, and continued working on my speech.

I attended the conference for that week, did a couple interviews, saw some friends, took some nice tours of preservation departments and discussed copyright with very smart lawyers from the US and Australia.

My heart attack continued, blocking off what turned out to be a quarter of my bloodflow to my heart.

This was annoying me, but I didn’t know what it was, so according to my fitbit I walked 25 miles, walked up 100 flights of stairs, and maintained hours of exercise to snap out of it, across the week.

I did a keynote for the conference. The next day I hosted a wonderful event for seven hours. I asked for a stool because I said I was having trouble standing comfortably. They gave me one. I took rests during it, just so the DJ could get some good time with the crowds. I was praised for keeping the crowd jumping and giving it great energy. I had now been having a heart attack for four days.

That Sunday, I walked around Geelong, a lovely city near Melbourne, and ate an exquisite meal at Igni, a restaurant whose menu basically has one line to tell you you’ll be eating what they think you should have. Their choices were excellent. Multiple times during the meal, I dozed a little, as I was fatigued. When we got to the tram station, I walked back to the apartment to get some rest. Along the way, I fell to the sidewalk and got up after resting.

I slept off more of the growing fatigue and pain.

The next day I had the second exquisite meal of the trip at Vue Le Monde, a meal that lasted from about 8pm to midnight. My partner Rachel loves good meals and this is one of the finest you can have in the city, and I enjoyed it immensely. It would have been a fine last meal. I had now been experiencing a heart attack for about a week.

That night, I had a lot of trouble sleeping. The pain was now a complete jacket of annoyance on my body, and there was no way to rest that didn’t feel awful. I decided medical attention was needed.

The next morning, Rachel and I walked 5 blocks to a clinic, found it was closed, and walked further to the RealCare Health Clinic. I was finding it very hard to walk at this point. Dr. Edward Petrov saw me, gave me some therapy for reflux, found it wasn’t reflux, and got concerned, especially as having my heart checked might cost me something significant. He said he had a cardiologist friend who might help, and he called him, and it was agreed we could come right over.

We took a taxi over to Dr. Georg Leitl’s office. He saw me almost immediately.

He was one of those doctors that only needed to take my blood pressure and check my heart with a stethoscope for 30 seconds before looking at me sadly. We went to his office, and he told me I could not possibly get on the plane I was leaving on in 48 hours. He also said I needed to go to Hospital very quickly, and that I had some things wrong with me that needed attention.

He had his assistants measure my heart and take an ultrasound, wrote something on a notepad, put all the papers in an envelope with the words “SONNY PALMER” on them, and drove me personally over in his car to St. Vincent’s Hospital.

Taking me up to the cardiology department, he put me in the waiting room of the surgery, talked to the front desk, and left. I waited 5 anxious minutes, and then was brought into a room with two doctors, one of whom turned out to be Dr. Sonny Palmer.

Sonny said Georg thought I needed some help, and I’d be checked within a day. I asked if he’d seen the letter with his name on it. He hadn’t. He went and got it.

He came back and said I was going to be operated on in an hour.

He also explained I had a rather blocked artery in need of surgery. Survival rate was very high. Nerve damage from the operation was very unlikely. I did not enjoy phrases like survival and nerve damage, and I realized what might happen very shortly, and what might have happened for the last week.

I went back to the waiting room, where I tweeted what might have been my possible last tweets, left a message for my boss Alexis on the slack channel, hugged Rachel tearfully, and then went into surgery, or potential oblivion.

Obviously, I did not die. The surgery was done with me awake, and involved making a small hole in my right wrist, where Sonny (while blasting Bon Jovi) went in with a catheter, found the blocked artery, installed a 30mm stent, and gave back the blood to the quarter of my heart that was choked off. I listened to instructions on when to talk or when to hold myself still, and I got to watch my beating heart on a very large monitor as it got back its function.

I felt (and feel) legions better, of course – surgery like this rapidly improves life. Fatigue is gone, pain is gone. It was also explained to me what to call this whole event: a major heart attack. I damaged the heart muscle a little, although that bastard was already strong from years of high blood pressure and I’m very young comparatively, so the chances of recovery to the point of maybe even being healthier than before are pretty good. The hospital, St. Vincents, was wonderful – staff, environment, and even the food (including curry and afternoon tea) were a delight. My questions were answered, my needs met, and everyone felt like they wanted to be there.

It’s now been 4 days. I was checked out of the hospital yesterday. My stay in Melbourne was extended two weeks, and my hosts (MuseumNext and ACMI) paid for basically all of the additional AirBNB that I’m staying at. I am not cleared to fly until the two weeks is up, and I am now taking six medications. They make my blood thin, lower my blood pressure, cure my kidney stones/gout, and stabilize my heart. I am primarily resting.

I had lost a lot of weight and I was exercising, but my cholesterol was a lot worse than anyone really figured out. The drugs and lifestyle changes will probably help knock that back, and I’m likely to adhere to them, unlike a lot of people, because I’d already been on a whole “life reboot” kick. The path that follows is, in other words, both pretty clear and going to be taken.

Had I died this week, at the age of 46, I would have left behind a very bright, very distinct and rather varied life story. I’ve been a bunch of things, some positive and negative, and projects I’d started would have lived quite neatly beyond my own timeline. I’d have also left some unfinished business here and there, not to mention a lot of sad folks and some extremely quality-variant eulogies. Thanks to a quirk of the Internet Archive, there’s a little statue of me – maybe it would have gotten some floppy disks piled at its feet.

Regardless, I personally would have been fine on the accomplishment/legacy scale, if not on the first-person/relationships/plans scale. That my Wikipedia entry is going to have a different date on it than February 2017 is both a welcome thing and a moment to reflect.

I now face the Other Half, whatever events and accomplishments and conversations I get to engage in from this moment forward, and that could be anything from a day to 100 years.

Whatever and whenever that will be, the tweet I furiously typed out on cellphone as a desperate last-moment possible-goodbye after nearly a half-century of existence will likely still apply:

“I have had a very fun time. It was enormously enjoyable, I loved it all, and was glad I got to see it.”


Three takeaways to understand Cloudflare's apocalyptic-proportions mess

Published 24 Feb 2017 by Carlos Fenollosa in Carlos Fenollosa — Blog.

It turns out that Cloudflare's proxies have been dumping uninitialized memory that contains plain HTTPS content for an indeterminate amount of time. If you're not familiar with the topic, let me summarize it: this is the worst crypto news in the last 10 years.

As usual, I suggest you read the HN comments to understand the scandalous magnitude of the bug.

If you don't see this as a news-opening piece on TV it only confirms that journalists know nothing about tech.

How bad is it, really? Let's see

I'm finding private messages from major dating sites, full messages from a well-known chat service, online password manager data, frames from adult video sites, hotel bookings. We're talking full HTTPS requests, client IP addresses, full responses, cookies, passwords, keys, data, everything

If the bad guys didn't find the bug before Tavis, you may be in the clear. However, as usual in crypto, you must assume that any data you submitted through a Cloudflare HTTPS proxy has been compromised.

Three takeaways

A first takeaway: crypto may be mathematically perfect, but humans err and implementations are not. Just because something is using strong crypto doesn't mean it's immune to bugs.

A second takeaway: MITMing the entire Internet doesn't sound so compelling when you put it that way. Sorry to be that guy, but this only confirms that the centralization of the Internet by big companies is a bad idea.

A third takeaway: change all your passwords. Yep. It's really that bad. Your passwords and private requests may be stored somewhere, on a proxy or on a malicious actor's servers.

Well, at least change your banking ones, important services like email, and master passwords on password managers -- you're using one, right? RIGHT?

You can't get back any personal info that got leaked but at least you can try to minimize the aftershock.

Update: here is a provisional list of affected services. Download the full list, export your password manager data into a csv file, and compare both files by using grep -f sorted_unique_cf.txt your_passwords.csv.
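The same check the grep one-liner performs can be sketched in Python if you prefer. The two filenames are the ones mentioned above, and the matching is the same naive substring test grep would do:

```python
def affected_entries(domains_file, passwords_csv):
    """Report lines of the password export that mention an affected domain.

    Same idea as `grep -f sorted_unique_cf.txt your_passwords.csv`:
    a line matches if any domain from the list appears anywhere in it.
    """
    with open(domains_file) as f:
        domains = [line.strip() for line in f if line.strip()]
    matches = []
    with open(passwords_csv) as f:
        for line in f:
            if any(d in line for d in domains):
                matches.append(line.rstrip("\n"))
    return matches
```

Like the grep version, this can over-match on short domain strings, so treat hits as a list to review, not a verdict.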

Afterwards, check the list of potentially affected iOS apps

Let me conclude by saying that unless you were the victim of a targeted attack it's improbable that this bug is going to affect you at all. However, that small probability is still there. Your private information may be cached somewhere or stored on a hacker's server, waiting to be organized and leaked with a flashy slogan.

I'm really sorry about the overly dramatic post, but this time it's for real.

Tags: security, internet, news

Comments? Tweet  

The localhost page isn’t working on MediaWiki

Published 23 Feb 2017 by hasanghaforian in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I want to use Widget PDF to embed PDF files on my MediaWiki pages. So at first I installed Extension:Widgets on MediaWiki and it seems it is installed (I can see it in the Installed extensions list in Special:Version of the Wiki). Then I copied and pasted the entire source of the PDF widget code page into a page called Widget:PDF on my Wiki:

<big>This widget allows you to '''embed PDF files''' on your wiki page.</big>

Created by [ühler Wilhelm Bühler] and adapted by [ Karsten Hoffmeyer].

== Using this widget ==
For information on how to use this widget, see [ widget description page on].

== Copy to your site ==
To use this widget on your site, just install [ MediaWiki Widgets extension] and copy the [{{fullurl:{{FULLPAGENAME}}|action=edit}} full source code] of this page to your wiki as page '''{{FULLPAGENAME}}'''.
</noinclude><includeonly><object class="pdf-widget" data="<!--{$url|validate:url}-->" type="application/pdf" wmode="transparent" style="z-index: 999; height: 100%; min-height: <!--{$height|escape:'html'|default:680}-->px; width: 100%; max-width: <!--{$width|escape:'html'|default:960}-->px;"><param name="wmode" value="transparent">
<p>Currently your browser does not use a PDF plugin. You may however <a href="<!--{$url|validate:url}-->">download the PDF file</a> instead.</p></object></includeonly>

My PDF file is under this URL:


And its name is File:GraphicsandAnimations-Devoxx2010.pdf. So, as described here, I added this code to my Wiki page:


But this error occured:

The localhost page isn’t working
localhost is currently unable to handle this request. 

What I did:

  1. Also I tried this (original example of the Widget PDF)


    But the result was the same.

  2. I read Extension talk:Widgets but did not find anything.

  3. I opened Chrome DevTools (Ctrl+Shift+I), but there was no error.

How can I solve the problem?


After some time, I tried to uninstall Widget PDF and Extension:Widgets and reinstall them. So I removed the Extension:Widgets files/folder from $IP/extensions/ and also deleted the Widget:PDF page from the Wiki. Then I installed Extension:Widgets again, but now I cannot open the Wiki pages at all (I see the above error again), unless I delete require_once "$IP/extensions/Widgets/Widgets.php"; from LocalSettings.php. So I cannot even try to load Extension:Widgets.

Now I see this error in DevTools:

Failed to load resource: the server responded with a status of 500 (Internal Server Error)

Also, after uninstalling Extension:Widgets, I tried Extension:PDFEmbed and unfortunately saw the above error again.

Editing MediaWiki pages in an external editor

Published 20 Feb 2017 by in Category:Blog_posts.

, Fremantle.

I've been working on a MediaWiki gadget lately, for editing Wikisource authors' metadata without leaving the author page. It's fun working with and learning more about OOjs-UI, but it's also a pain because gadget code is kept in Javascript pages in the MediaWiki namespace, and so every single time you want to change something it's a matter of saving the whole page, then clicking 'edit' again, and scrolling back down to find the spot you were at. The other end of things—the re-loading of whatever test page is running the gadget—is annoying and slow enough, without having to do much the same thing at the source end too.

So I've added a feature to the ExternalArticles extension that allows a whole directory full of text files to be imported at once (namespaces are handled as subdirectories). More importantly, it also 'watches' the directories and every time a file is updated (i.e. with Ctrl-S in a text editor or IDE) it is re-imported. So this means I can have MediaWiki:Gadget-Author.js and MediaWiki:Gadget-Author.css open in PhpStorm, and just edit from there. I even have these files open inside a MediaWiki project and so autocompletion and documentation look-up works as usual for all the library code. It's even quite a speedy set-up, luckily: I haven't yet noticed having to wait at any time between saving some code, alt-tabbing to the browser, and hitting F5.
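The watching half of that workflow can be sketched with nothing more than stdlib mtime polling. This is not the ExternalArticles code; import_page is a hypothetical callback standing in for whatever pushes a file's content into the wiki:

```python
import os
import time

def watch_and_reimport(directory, import_page, poll_seconds=1.0, max_polls=None):
    """Poll `directory` for changed files and re-import each one on save.

    `import_page` is a hypothetical callback that receives the path of a
    new or modified file. `max_polls` bounds the loop (None = run forever).
    """
    mtimes = {}
    polls = 0
    while max_polls is None or polls < max_polls:
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue
            mtime = os.stat(path).st_mtime
            # A path we haven't seen, or whose mtime changed, gets re-imported.
            if mtimes.get(path) != mtime:
                mtimes[path] = mtime
                import_page(path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
    return mtimes
```

A real implementation would recurse into namespace subdirectories and use filesystem notifications rather than polling, but the save-detect-reimport loop is the same shape.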

I dare say my bodged-together script has many flaws, but it's working for me for now!

+ Add a commentComments on this blog post
No comments yet

Mediawiki doesn't send any email

Published 19 Feb 2017 by fpiette in Newest questions tagged mediawiki - Ask Ubuntu.

My MediaWiki installation (1.28.0, PHP 7.0.13) doesn't send any email, and yet no error is emitted. I checked using the Special:EmailUser page.

What I have tried: 1) A simple PHP script to send a mail using PHP's mail() function. It works. 2) I have turned on the PHP mail log. There is a normal line for each MediaWiki email "sent".

PHP is configured (correctly since it works) to send email using Linux SendMail. MediaWiki is not configured to use direct SMTP.

Any suggestion appreciated. Thanks.

Week #5: Politics and the Super Bowl – chewing a pill too big to swallow

Published 17 Feb 2017 by legoktm in The Lego Mirror.

For a little change, I'd like to talk about the impact of sports upon us this week. The following opinion piece was first written for La Voz, and can also be read on their website.

Super Bowl commercials have become the latest victim of extreme politicization. Two commercials stood out from the rest by featuring pro-immigrant advertisements in the midst of a political climate deeply divided over immigration law. Specifically, Budweiser aired a mostly fictional story of their founder traveling to America to brew, while 84 Lumber’s ad followed a mother and daughter’s odyssey to America in search of a better life.

The widespread disdain toward non-white outsiders, which in turn has created massive backlash toward these advertisements, is no doubt repulsive, but caution should also be exercised when critiquing the placement of such politicization. Understanding the complexities of political institutions and society is no doubt essential, yet it is alarming that every facet of society has become so politicized; ironically, this desire to achieve an elevated political consciousness actually turns many off from the importance of politics.

Football — what was once simply a calming means of unwinding from the harsh winds of an oppressive world — has now become another headline news center for political drama.

Former President George H. W. Bush and his wife practically wheeled themselves out of a hospital to prepare for hosting the game. New England Patriots owner Robert Kraft and quarterback Tom Brady received sharp criticism for their support of Donald Trump, even to the point of losing thousands of dedicated fans.

Meanwhile, the NFL Players Association publicly opposed President Trump’s immigration ban three days before the game, with the NFLPA’s president saying “Our Muslim brothers in this league, we got their backs.”

Let’s not forget the veterans and active service members that are frequently honored before NFL games, except that’s an advertisement too – the Department of Defense paid NFL teams over $5 million over four years for those promotions.

Even though it’s America’s pastime, football and other similar mindless outlets serve the role of allowing us to escape whenever we need a break from reality, and for nearly three hours on Sunday, America got its break, except for those commercials. If we keep getting nagged about an issue, even if we’re generally supportive, it will eventually become incessant to the point of promoting nihilism.

When Meryl Streep spoke out at the Golden Globes, she turned a relaxing event of celebrity fawning into a political shitstorm that redirected all attention back toward Trump controversies. Even though she was mostly correct, the efficacy becomes questionable after such repetition, as many will become desensitized.

Politics are undoubtedly more important than ever now, but for our sanity’s sake, let’s keep it to a minimum in football. That means commercials too.

What have we got in our digital archive?

Published 13 Feb 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Do other digital archivists find that the work of a digital archivist rarely involves doing hands-on stuff with digital archives? When you have to think about establishing your infrastructure, writing policies and plans and attending meetings, it leaves little time for activities at the coal face. This makes it all the more satisfying when we do actually get the opportunity to work with our digital holdings.

In the past I've called for more open sharing of profiles of digital archive collections, but I am aware that I have not yet done this for the contents of our born digital collections here at the Borthwick Institute for Archives. So here I try to fill that gap.

I ran DROID (v 6.1.5, signature file v 88, container signature 20160927) over the deposited files in our digital archive and have spent a couple of days crunching the results. Note that this just covers the original files as they have been given to us. It does not include administrative files that I have added, or dissemination or preservation versions of files that have subsequently been created.

I was keen to see:
...and also use these results to:
  • Inform future preservation planning and priorities
  • Feed further information to the PRONOM team at The National Archives
  • Get us to Level 2 of the NDSA Levels of Digital Preservation which asks for "an inventory of file formats in use" and which until now I haven't been collating!

Digital data has been deposited with us since before I started at the Borthwick in 2012 and continues to be deposited with us today. We do not have huge quantities of digital archives here as yet (about 100GB) and digital deposits are still the exception rather than the norm. We will be looking to chase digital archives more proactively once we have Archivematica in place and appropriate workflows established.

Last modified dates (as recorded by DROID) appear to range from 1984 to 2017 with a peak at 2008. This distribution is illustrated below. Note however, that this data is not always to be trusted (that could be another whole blog post in itself...). One thing that it is fair to say though is that the archive stretches back right to the early days of personal computers and up to the present day.

Last modified dates on files in the Borthwick digital archive

Here are some of the findings of this profiling exercise:

Summary statistics

  • DROID reported that 10005 individual files were present
  • 9431 (94%) of the files were given a file format identification by DROID. This is a really good result ...or at least it seems so in comparison to my previous data profiling efforts, which have focused on research data. This result is also comparable with those found within other digital archives, for example 90% at Bentley Historical Library, 96% at Norfolk Record Office and 98% at Hull University Archives
  • 9326 (99%) of those files that were identified were given just one possible identification. 1 file was given 2 different identifications (an xlsx file) and 104 files (with a .DOC extension) were given 8 identifications. In all these cases of multiple identifications, identification was done by file extension rather than signature - which perhaps explains the uncertainty

Files that were identified

So perhaps these are things I'll look into in a bit more detail if I have time in the future.

  • 90 different file formats were identified within this collection of data

  • Of the identified files 1764 (19%) were identified as Microsoft Word Document 97-2003. This was followed very closely by JPEG File Interchange Format version 1.01 with 1675 (18%) occurrences. The top 10 identified files are illustrated below:

  • This top 10 is in many ways comparable to other similar profiles that have been published recently from Bentley Historical Library, Hull University Archive and Norfolk Record Office, with high occurrences of Microsoft Word, PDF and JPEG images. In contrast, what is not so common in this profile are HTML files and GIF image files - these only just make it into the top 50.

  • Also notable in our top ten are the Sibelius files which haven't appeared in other recently published profiles. Sibelius is musical notation software and these files appear frequently in one of our archives.

Files that weren't identified

  • Of the 574 files that weren't identified by DROID, 125 different file extensions were represented. For most of these there was just a single example of each.

  • 160 (28%) of the unidentified files had no file extension at all. Perhaps not surprisingly it is the earlier files in our born digital collection (files from the mid 80's) that are most likely to fall into this category. These were created at a time when operating systems seemed to be a little less rigorous about enforcing the use of file extensions! Approximately 80 of these files are believed to be WordStar 4.0 (PUID: x-fmt/260), which DROID would only be able to recognise by file extension. Of course, if no extension is included, DROID has little chance of being able to identify them!

  • The most common file extensions of those files that weren't identified are visible in the graph below. I need to do some more investigation into these but most come from 2 of our archives that relate to electronic music composition:

I'm really pleased to see that the vast majority of the files that we hold can be identified using current tools. This is a much better result than for our research data. Obviously there is still room for improvement so I hope to find some time to do further investigations and provide information to help extend PRONOM.

Other follow on work involves looking at system files that have been highlighted in this exercise. See for example the AppleDouble Resource Fork files that appear in the top ten identified formats. Also appearing quite high up (at number 12) were Thumbs.db files but perhaps that is the topic of another blog post. In the meantime I'd be really interested to hear from anyone who thinks that system files such as these should be retained.

Harvesting EAD from AtoM: a collaborative approach

Published 10 Feb 2017 by Jenny Mitcham in Digital Archiving at the University of York.

In a previous blog post AtoM harvesting (part 1) - it works! I described how archival descriptions within AtoM are being harvested as Dublin Core for inclusion within our University Library Catalogue.* I also hinted that this wouldn’t be the last you would hear from me on AtoM harvesting and that plans were afoot to enable much richer metadata in EAD 2002 XML (Encoded Archival Description) format to be harvested via OAI-PMH.

I’m pleased to be able to report that this work is now underway.

The University of York along with five other organisations in the UK have clubbed together to sponsor Artefactual Systems to carry out the necessary development work to make EAD harvesting possible. This work is scheduled for release in AtoM version 2.4 (due out in the Spring).

The work is being jointly sponsored by:

We are also receiving much needed support in this project from The Archives Hub who are providing advice on the AtoM EAD and will be helping us test the EAD harvesting when it is ready. While the sponsoring institutions are all producers of AtoM EAD, The Archives Hub is a consumer of that EAD. We are keen to ensure that the archival descriptions that we enter into AtoM can move smoothly to The Archives Hub (and potentially to other data aggregators in the future), allowing the richness of our collections to be signposted as widely as possible.

Adding this harvesting functionality to AtoM will enable The Archives Hub to gather data direct from us on a regular schedule or as and when updates occur, ensuring that:

So, what are we doing at the moment?

What we are doing at the moment is good and a huge step in the right direction, but perhaps not perfect. As we work together on this project we are coming across areas where future work would be beneficial in order to improve the quality of the EAD that AtoM produces or to expand the scope of what can be harvested from AtoM. I hope to report on this in more detail at the end of the project, but in the meantime, do get in touch if you are interested in finding out more.
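Once the harvesting functionality is in place, an aggregator like The Archives Hub would pull records with a standard OAI-PMH ListRecords request. A minimal sketch of building such a request follows; the hostname and the oai_ead metadata prefix are illustrative assumptions, not confirmed AtoM settings (check your instance's OAI plugin configuration for the real values):

```python
from urllib.parse import urlencode

# Hypothetical AtoM OAI-PMH endpoint (AtoM conventionally exposes OAI at /;oai,
# but verify this for your installation).
base_url = "https://archives.example.ac.uk/;oai"

# ListRecords asks for every record; the metadataPrefix selects the format
# the harvester wants back ("oai_ead" here is an assumed prefix for EAD).
params = {
    "verb": "ListRecords",
    "metadataPrefix": "oai_ead",
}

request_url = base_url + "?" + urlencode(params)
print(request_url)
```

A harvester would fetch this URL on a schedule and page through the results using the resumption tokens the protocol defines.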

* It is great to see that this is working well and our Library Catalogue is now appearing in the referrer reports for the Borthwick Catalogue on Google Analytics. People are clearly following these new signposts to our archives!

Week #4: 500 for Mr. San Jose Shark

Published 9 Feb 2017 by legoktm in The Lego Mirror.

He did it: Patrick Marleau scored his 500th career goal. He truly is Mr. San Jose Shark.

I had the pleasure of attending the next home game on Saturday right after he reached the milestone in Vancouver, and nearly lost my voice cheering for Marleau. They mentioned his accomplishment once before the game and again during a break, and each time Marleau would only stand up and acknowledge the crowd cheering for him when he realized they would not stop until he did.

He's had his ups and downs, but he's truly a team player.

“I think when you hit a mark like this, you start thinking about everyone that’s helped you along the way,” Marleau said.

And on Saturday at home, Marleau assisted on both Sharks goals, helping out the teammates who had helped him reach his milestone over the past two weeks.

Congrats Marleau, and thanks for the 20 years of hockey. Can't wait to see you raise the Cup.

Simpson and his Donkey – an exhibition

Published 9 Feb 2017 by carinamm in State Library of Western Australia Blog.

Illustrations by Frané Lessac and words by Mark Greenwood share the heroic story of John Simpson Kirkpatrick in the picture book Simpson and his Donkey.  The exhibition is on display at the State Library until  27 April. 

Unpublished spread 14 for pages 32 – 33
Collection of draft materials for Simpson and his Donkey, PWC/254/18 

The original illustrations, preliminary sketches and draft materials displayed in this exhibition form part of the State Library’s Peter Williams’ collection: a collection of original Australian picture book art.

Known as ‘the man with the donkey’, Simpson was a medic who rescued wounded soldiers at Gallipoli during World War I.

The bravery and sacrifice attributed to Simpson is now considered part of the ‘Anzac legend’. It is the myth and legend of John Simpson that Frané Lessac and Mark Greenwood tell in their book.

Frané Lessac and Mark Greenwood also travelled to Anzac Cove to explore where Simpson and Duffy had worked.  This experience and their research enabled them to layer creative interpretation over historical information and Anzac legend.


On a moonless April morning, PWC254/6 

Frané Lessac is a Western Australian author-illustrator who has published over forty books for children. Frané speaks at festivals in Australia and overseas, sharing the process of writing and illustrating books. She often illustrates books by Mark Greenwood, of which Simpson and his Donkey is just one example.

Simpson and his Donkey is published by Walker Books, 2008. The original illustrations are on display in the Story Place Gallery until 27 April 2017.


Filed under: Children's Literature, community events, Exhibitions, Illustration, Picture Books, SLWA collections, SLWA displays, WA books and writers, WA history, Western Australia Tagged: children's literature, exhibitions, Frane Lessac, Mark Greenwood, Peter Williams collection, Simpson and his Donkey, State Library of Western Australia, The Story Place

New feature for ia-upload

Published 7 Feb 2017 by in Category:Blog_posts.

, Fremantle.

I have been working on an addition to the IA Upload tool these last few days, and it's ready for testing. Hopefully we'll merge it tomorrow or the next day.

This is the first time I've done much work with the internal structure of DjVu files, and really it's all been pretty straight-forward. A couple of odd bits about matching element and page names up between things, but once that was sorted it all seems to be working as it should.

It's a shame that the Internet Archive has discontinued their production of DjVu files, but I guess they've got their reasons, and it's not like anyone's ever heard of DjVu anyway. I don't suppose anyone other than Wikisource was using those files. Thankfully they're still producing the DjVu XML that we need to make our own DjVus, and it sounds like they're going to continue doing so (because they use the XML to produce the text versions of items).

Update two days later: this feature is now live.



Published 6 Feb 2017 by timbaker in Tim Baker.

So I’ve got this speaking gig coming up at the Pursue Your Passion conference in Byron Bay Saturday week, February 18. And I’ve been thinking a lot about what I want to say. One of my main qualifications for this gig is my 2011 round...

Published 2 Feb 2017 by in Category:Blog_posts.

Sounds like a cool thing built on top of the existing blogosphere, allowing anyone to microblog (i.e. tweet) from the comfort of their own personally-controlled blog installation (e.g. WordPress).

Week #3: All-Stars

Published 2 Feb 2017 by legoktm in The Lego Mirror.

via /u/PAGinger on reddit

Last weekend was the NHL All-Star game and skills competition, with Brent Burns, Martin Jones, and Joe Pavelski representing the San Jose Sharks in Los Angeles. And to no one's surprise, they were all booed!

Pavelski scored a goal during the tournament for the Pacific Division, and Burns scored during the skills competition's "Four Line Challenge". But since they represented the Pacific, we have to talk about the impossible shot Mike Smith made.

And across the country, the 2017 NFL Pro Bowl (their all-star game) was happening at the same time. The Oakland Raiders had seven Pro Bowlers (tied for most from any team), and the San Francisco 49ers had...none.

In the meantime the 49ers managed to hire a former safety with no general manager experience as their new GM. It's really not clear what Jed York, the 49ers owner, is trying to do here, and why he would sign John Lynch to a six-year contract.

But really, how much worse could it get for the 49ers?

Abandoned code projects

Published 30 Jan 2017 by in Category:Blog_posts.

One of the sad things about open source software is the process of working on some code, feeling like it's going somewhere good and is useful to people, but then at some point having to abandon it. Normally just because life moves on and the higher-priority code always has to be the stuff that earns an income, or just that there are only so many slots for projects in my brain.

I feel this way about Tabulate, the WordPress plugin I was working on until a year ago, and about a few Dokuwiki plugins that I used to maintain. All were good fun to work on, and served reasonably useful places in some people's websites. But I don't have time, especially as it takes even more time & concentration to switch between completely separate codebases and communities — the latter especially. So these projects just languish, usually until some wonderful person comes along on Github and asks to take over as maintainer.

I am going to try to keep up with Tabulate, however. It doesn't need that much work, and the WordPress ecosystem is a world that I actually find quite rewarding to inhabit (I know lots of people wouldn't agree with that, and certainly there's a commercial side to it that I find a bit tiring).

Not this morning, though, but maybe later this week... :-)

Updates to

Published 29 Jan 2017 by legoktm in The Lego Mirror.

Over the weekend I migrated and associated services over to a new server. It's powered by Debian Jessie instead of the slowly aging Ubuntu Trusty. Most services were migrated with no downtime by rsync'ing content over and then updating DNS. Only had some downtime due to needing to stop the service before copying over the database.

I did not migrate my IRC bouncer history or configuration, so I'm starting fresh. So if I'm no longer in a channel, feel free to PM me and I'll rejoin!

At the same time I moved the main homepage to MediaWiki. Hopefully that will encourage me to update the content on it more often.

Finally, the tor relay node I'm running was moved to a separate server entirely. I plan on increasing the resources allocated to it.


Published 26 Jan 2017 by legoktm in The Lego Mirror.

The only person who would dare upstage Patrick Marleau's four goal night is Randy Hahn, with his hilarious call after Marleau's third goal to finish a natural hat-trick: "NATTY HATTY FOR PATTY". And after scoring another, Marleau became the first player to score four goals in a single period since the great Mario Lemieux did in 1997. He's also the third Shark to score four goals in a game, joining Owen Nolan (no video available, but his hat-trick from the 1997 All-Star game is fabulous) and Tomáš Hertl.

Marleau is also ready to hit his next milestone of 500 career goals - he's at 498 right now. Every impressive stat he puts up just further solidifies him as one of the greatest hockey players of his generation. But he's still missing the one achievement that all the greats need - a Stanley Cup. The Sharks made their first trip to the Stanley Cup Finals last year, but realistically had very little chance of winning; they simply were not the better team.

The main question these days is how long Marleau and Joe Thornton will keep playing for, and if they can stay healthy until they eventually win that Stanley Cup.

Discuss this post on Reddit.

Creating an annual accessions report using AtoM

Published 24 Jan 2017 by Jenny Mitcham in Digital Archiving at the University of York.

So, it is that time of year where we need to complete our annual report on accessions for the National Archives. Along with lots of other archives across the UK we send The National Archives summary information about all the accessions we have received over the course of the previous year. This information is collated and provided online on the Accessions to Repositories website for all to see.

The creation of this report has always been a bit time consuming for our archivists, involving a lot of manual steps and some re-typing but since we have started using AtoM as our Archival Management System the process has become much more straightforward.

As I've reported in a previous blog post, AtoM does not do all that we want to do in the way of reporting via its front end.

However, AtoM has an underlying MySQL database and there is nothing to stop you bypassing the interface, looking at the data behind the scenes and pulling out all the information you need.

One of the things we got set up fairly early in our AtoM implementation project was a free MySQL client called Squirrel. Using Squirrel or another similar tool, you can view the database that stores all your AtoM data, browse the data and run queries to pull out the information you need. It is also possible to update the data using these SQL clients (very handy if you need to make any global changes to your data). All you need initially is a basic knowledge of SQL and you can start pulling some interesting reports from AtoM.

The downside of playing with the AtoM database is of course that it isn't nearly as user friendly as the front end.

It is always a bit of an adventure navigating the database structure and trying to work out how the tables are linked. Even with the help of an Entity Relationship Diagram from Artefactual creating more complex queries is ...well ....complex!

AtoM's database tables - there are a lot of them!

However, on a positive note, the AtoM user forum is always a good place to ask stupid questions and Artefactual staff are happy to dive in and offer advice on how to formulate queries. I'm also lucky to have help from more technical colleagues here in Information Services (who were able to help me get Squirrel set up and talking to the right database and can troubleshoot my queries) so what follows is very much a joint effort.

So for those AtoM users in the UK who are wrestling with their annual accessions report, here is a query that will pull out the information you need:

SELECT accession.identifier, accession_i18n.title, accession_i18n.scope_and_content, accession_i18n.received_extent_units,
accession_i18n.location_information,
case when cast(event.start_date as char) like '%-00-00' then left(cast(event.start_date as char),4)
else cast(event.start_date as char)
end as start_date,
case when cast(event.end_date as char) like '%-00-00' then left(cast(event.end_date as char),4)
else cast(event.end_date as char)
end as end_date
from accession
-- the join and filter conditions below were garbled in transcription; they are
-- reconstructed here from the AtoM schema, so verify them against your own database
LEFT JOIN event on event.object_id = accession.id
LEFT JOIN event_i18n on event_i18n.id = event.id
JOIN accession_i18n ON accession_i18n.id = accession.id
where accession.date like '2016%'
order by identifier

A couple of points to make here:

  • In a previous version of the query, we included some other tables so we could also capture information about the creator of the archive. The addition of the relation, actor and actor_i18n tables made the query much more complicated and for some reason it didn't work this year. I have not attempted to troubleshoot this in any great depth for the time being as it turns out we are no longer recording creator information in our accessions records. Adding a creator record to an accessions entry creates an authority record for the creator that is automatically made public within the AtoM interface and this ends up looking a bit messy (as we rarely have time at this point in the process to work this into a full authority record that is worthy of publication). Thus as we leave this field blank in our accession record there is no benefit in trying to extract this bit of the database.
  • In an earlier version of this query there was something strange going on with the dates that were being pulled out of the event table. This seemed to be a quirk that was specific to Squirrel. A clever colleague solved this by casting the date to char format and including a case statement that will list the year when there's only a year and the full date when fuller information has been entered. This is useful because in our accession records we enter dates to different levels. 
So, once I've exported the results of this query, put them in an Excel spreadsheet and sent them to one of our archivists, all that remains for her to do is to check through the data, do a bit of tidying up, ensure the column headings match what is required by The National Archives and the spreadsheet is ready to go!
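That export step can be sketched in a few lines of Python. The rows below are hypothetical stand-ins for the query results (in practice they would come from a MySQL client or connector); the point is simply that csv.writer handles the quoting and layout for the spreadsheet:

```python
import csv

# Column headings matching the fields the accessions query pulls out.
header = ["identifier", "title", "scope_and_content", "start_date", "end_date"]

# Hypothetical sample rows standing in for real query results.
rows = [
    ["2016/01", "Parish records", "Registers of baptisms and burials", "1984", "2001"],
    ["2016/02", "Composer's papers", "Scores and correspondence", "1990-05-01", "1995"],
]

# Write a CSV that opens directly in Excel; csv.writer takes care of
# quoting any commas or quotes inside field values.
with open("accessions_2016.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)
```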

Bromptons in Museums and Art Galleries

Published 23 Jan 2017 by Andy Mabbett in Andy Mabbett, aka pigsonthewing.

Every time I visit London, with my Brompton bicycle of course, I try to find time to take in a museum or art gallery. Some are very accommodating and will cheerfully look after a folded Brompton in a cloakroom (e.g. Tate Modern, Science Museum) or, more informally, in an office or behind the security desk (Bank of England Museum, Petrie Museum, Geffrye Museum; thanks folks).

Brompton bicycle folded

When folded, Brompton bikes take up very little space

Others, without a cloakroom, have lockers for bags and coats, but these are too small for a Brompton (e.g. Imperial War Museum, Museum of London) or they simply refuse to accept one (V&A, British Museum).

A Brompton bike is not something you want to chain up in the street, and carrying a hefty bike-lock would defeat the purpose of the bike’s portability.

Jack Wills, New Street (geograph 4944811)

This Brompton bike hire unit, in Birmingham, can store ten folded bikes each side. The design could be repurposed for use at venues like museums or galleries.

I have an idea. Brompton could work with museums — in London, where Brompton bikes are ubiquitous, and elsewhere, though my Brompton and I have never been turned away from a museum outside London — to install lockers which can take a folded Brompton. These could be inside with the bag lockers (preferred) or outside, using the same units as their bike hire scheme (pictured above).

Where has your Brompton had a good, or bad, reception?


Less than two hours after I posted this, Will Butler-Adams, MD of Brompton, replied to me on Twitter:

so now I’m reaching out to museums, in London to start with, to see who’s interested.

The post Bromptons in Museums and Art Galleries appeared first on Andy Mabbett, aka pigsonthewing.

Wikisource Hangout

Published 23 Jan 2017 by in Category:Blog_posts.


I wonder how long it takes after someone first starts editing a Wikimedia project that they figure out that they can read lots of Wikimedia news on — and when, after that, they realise they can also post to the news there? (At which point they probably give up if they haven't already got a blog.)

Anyway, I forgot that I can post news, but then I remembered. So:

There's going to be a Wikisource meeting next weekend (28 January, on Google Hangouts), if you're interested in joining: metawikimedia:Wikisource Community User Group/January 2017 Hangout.


Running with the Masai

Published 23 Jan 2017 by Tom Wilson in thomas m wilson.

What are you going to do if you like tribal living and you’re in the cold winter of the Levant?  Head south to the Southern Hemisphere, and to the wilds of Africa. After leaving Israel and Jordan that is exactly what I did. I arrived in Nairobi and the first thing which struck me was […]

Week #1: Who to root for this weekend

Published 22 Jan 2017 by legoktm in The Lego Mirror.

For the next 10 weeks I'll be posting sports content related to Bay Area teams. I'm currently taking an intro to features writing class, and we're required to keep a blog that focuses on a specific topic. I enjoy sports a lot, so I'll be covering Bay Area sports teams (Sharks, Earthquakes, Raiders, 49ers, Warriors, etc.). I'll also be trialing using Reddit for comments. If it works well, I'll continue using it for the rest of my blog as well. And with that, here goes:

This week the Green Bay Packers will be facing the Atlanta Falcons in the very last NFL game at the Georgia Dome for the NFC Championship. A few hours later, the Pittsburgh Steelers will meet the New England Patriots in Foxboro competing for the AFC Championship - and this will be only the third playoff game in NFL history featuring two quarterbacks with multiple Super Bowl victories.

Neither Bay Area football team has a direct stake in this game, but Raiders and 49ers fans have a lot to root for this weekend.

49ers: If you're a 49ers fan, you want to root for the Falcons to lose. This might sound a little weird, but currently the 49ers are looking to hire the Falcons' offensive coordinator, Kyle Shanahan, as their new head coach. However, until the Falcons' season ends, they cannot officially hire him. And since the 49ers' general manager search depends upon having a head coach, they can get a two-week head start if the Falcons lose this weekend.

Raiders: Do you remember the Tuck Rule Game? If so, you'll still probably be rooting for anyone but Tom Brady, quarterback for the Patriots. If not, well, you'll probably want to root for the Steelers, who eliminated the Raiders' division rival Kansas City Chiefs last weekend in one of the most bizarre playoff games. Even though the Steelers could not score a single touchdown, they topped the Chiefs' two touchdowns with a record six field goals. Raiders fans who had to endure two losses to the Chiefs this season surely appreciated how the Steelers embarrassed the Chiefs on prime time television.

Discuss this post on Reddit.

Four Stars of Open Standards

Published 21 Jan 2017 by Andy Mabbett in Andy Mabbett, aka pigsonthewing.

I’m writing this at UKGovCamp, a wonderful unconference. This post constitutes notes, which I will flesh out and polish later.

I’m in a session on open standards in government, convened by my good friend Terence Eden, who is the Open Standards Lead at Government Digital Service, part of the United Kingdom government’s Cabinet Office.

Inspired by Tim Berners-Lee’s “Five Stars of Open Data“, I’ve drafted “Four Stars of Open Standards”.

These are:

  1. Publish your content consistently
  2. Publish your content using a shared standard
  3. Publish your content using an open standard
  4. Publish your content using the best open standard

Bonus points for:

Point one, if you like, is about having your own local standard — if you publish three related data sets for instance, be consistent between them.

Point two could simply mean agreeing a common standard with other parts of your organisation, neighbouring local authorities, or suchlike.

In points three and four, I’ve taken “open” to be the term used in the “Open Definition“:

Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).

Further reading:

The post Four Stars of Open Standards appeared first on Andy Mabbett, aka pigsonthewing.

Supporting Software Freedom Conservancy

Published 17 Jan 2017 by legoktm in The Lego Mirror.

Software Freedom Conservancy is a pretty awesome non-profit that does some great stuff. They currently have a fundraising match going on, that was recently extended for another week. If you're able to, I think it's worthwhile to support their organization and mission. I just renewed my membership.

Become a Conservancy Supporter!

A Doodle in the Park

Published 16 Jan 2017 by Dave Robertson in Dave Robertson.

The awesome Carolyn White is doing a doodle a day, but in this case it was a doodle of Dave, with Tore and The Professor, out in the summer sun of the Manning Park Farmers and Artisan Market.


MediaWiki - powered by Debian

Published 16 Jan 2017 by legoktm in The Lego Mirror.

Barring any bugs, the last set of changes to the MediaWiki Debian package for the stretch release landed earlier this month. There are some documentation changes, and updates for changes to other, related packages. One of the other changes is the addition of a "powered by Debian" footer icon (drawn by the amazing Isarra), right next to the default "powered by MediaWiki" one.

Powered by Debian

This will only be added by default to new installs of the MediaWiki package. But existing users can just copy the following code snippet into their LocalSettings.php file (adjust paths as necessary):

# Add a "powered by Debian" footer icon
$wgFooterIcons['poweredby']['debian'] = [
    "src" => "/mediawiki/resources/assets/debian/poweredby_debian_1x.png",
    "url" => "",
    "alt" => "Powered by Debian",
    "srcset" =>
        "/mediawiki/resources/assets/debian/poweredby_debian_1_5x.png 1.5x, " .
        "/mediawiki/resources/assets/debian/poweredby_debian_2x.png 2x",
];

The image files are included in the package itself, or you can grab them from the Git repository. The source SVG is available from Wikimedia Commons.


Published 12 Jan 2017 by timbaker in Tim Baker.

The Pursue Your Passion conference is on in Byron Bay on Saturday February 18 and is designed for anyone wanting to make 2017 the year they really start following their dreams, living their bliss and all that good stuff. I’m one of three speakers on...

Importing pages breaks category feature

Published 10 Jan 2017 by Paul in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I just installed MediaWiki 1.27.1 and setup completes without issue on a server with Ubuntu 16.04, nginx, PHP 5.6, and MariaDB 10.1.

I created an export file with a different wiki using the Special:Export page. I then imported the articles to the new wiki using the Special:Import page. The file size is smaller than any limits and time the operation takes to complete is much less than configured timeouts.

Before import, I have created articles and categories and everything works as expected.

However, after importing, when I create a category tag on an article, clicking the link to the category's page doesn't show the article in the category.

I am using this markup within the article to create the category:

[[Category:Category Name]]

Is this a bug or am I missing something?

Teacup – One Boy’s Story of Leaving His Homeland

Published 8 Jan 2017 by carinamm in State Library of Western Australia Blog.


“Once there was a boy who had to leave home …and find another. In his bag he carried a book, a bottle and a blanket. In his teacup he held some earth from where he used to play”

A musical performance adapted from the picture book Teacup written by Rebecca Young and illustrated Matt Ottley, will premiere at the State Library of Western Australia as part of Fringe Festival. 

Accompanied by musicians from Perth chamber music group Chimera Ensemble, Music Book’s Narrator Danielle Joynt and Lark Chamber Opera’s soprano composer Emma Jayakumar, the presentation of Teacup will be a truly ‘multi-modal’ performance, where the music of Matt Ottley will ‘paint’ the colours, scenery and words into life.

Performance Times:

Fri 27 January 2:30pm
Sat 28 January 10:30am, 1pm and 2:30pm
Sun 29 January 10:30am, 1pm and 2:30pm

Matt Ottley’s original paintings from the picture book Teacup form part of the State Library’s Peter Williams collection of original picture book art. The artworks will be displayed in Teacup – an exhibition in the ground floor gallery from 20 January to 24 March 2017.

Image credit: Cover illustration for Teacup, Matt Ottley, 2015. State Library of Western Australia, PWC/255/01. Reproduced in the book Teacup written by Rebecca Young with illustrations by Matt Ottley. Published by Scholastic, 2015.

This event is supported by the City of Perth 

Filed under: Children's Literature, community events, Concerts, Exhibitions, Illustration, Music, SLWA collections, SLWA displays, SLWA events, SLWA Exhibitions, Uncategorized Tagged: exhibitions, Matt Ottley, Music Book Stories Inc., Peter Williams collection, State Library of Western Australia, Teacup - One Boy's Story of Leaving His Homeland

Big Tribes

Published 5 Jan 2017 by Tom Wilson in thomas m wilson.

In Jerusalem yesterday I encountered three of the most sacred sites of some of the biggest religions on earth. First the Western Wall, the most sacred site for Jews worldwide. Then, after some serious security checks and a long wait in line, we were allowed up a long wooden walkway, up to the Temple Mount.   […]

A Year Without a Byte

Published 4 Jan 2017 by Archie Russell in

One of the largest cost drivers in running a service like Flickr is storage. We’ve described multiple techniques to get this cost down over the years: use of COS, creating sizes dynamically on GPUs and perceptual compression. These projects have been very successful, but our storage cost is still significant.
At the beginning of 2016, we challenged ourselves to go further — to go a full year without needing new storage hardware. Using multiple techniques, we got there.

The Cost Story

A little back-of-the-envelope math shows storage costs are a real concern. On a very high-traffic day, Flickr users upload as many as twenty-five million photos. These photos require an average of 3.25 megabytes of storage each, totalling over 80 terabytes of data. Stored naively in a cloud service similar to S3, this day’s worth of data would cost over $30,000 per year, and continue to incur costs every year.

And a very large service will have over two hundred million active users. At a thousand images each, storage in a service similar to S3 would cost over $250 million per year (or $1.25 / user-year) plus network and other expenses. This compounds as new users sign up and existing users continue to take photos at an accelerating rate. Thankfully, our costs, and every large service’s costs, are different than storing naively at S3, but remain significant.
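The back-of-the-envelope figures above are easy to reproduce. The sketch below uses the ~2014 S3 price quoted later in the post ($0.03 per gigabyte-month); the results land just under the post's rounded figures, which also allow for network and other overhead.

```python
# Rough storage-cost arithmetic for the figures quoted above.
# Unit price is the approximate 2014 S3 rate from the post, not a real Flickr cost.

PHOTOS_PER_DAY = 25_000_000    # uploads on a very high-traffic day
MB_PER_PHOTO = 3.25            # average storage per photo
PRICE_PER_GB_MONTH = 0.03      # ~2014 S3 price, $/GB-month

day_tb = PHOTOS_PER_DAY * MB_PER_PHOTO / 1_000_000             # MB -> TB
day_cost_per_year = day_tb * 1_000 * PRICE_PER_GB_MONTH * 12   # TB -> GB, 12 months
print(f"one day of uploads: {day_tb:.1f} TB, ~${day_cost_per_year:,.0f}/year")

# Steady state: 200M active users with 1,000 images each
users, images_each = 200_000_000, 1_000
total_gb = users * images_each * MB_PER_PHOTO / 1_000          # MB -> GB
yearly = total_gb * PRICE_PER_GB_MONTH * 12
print(f"200M users: ~${yearly / 1e6:,.0f}M/year (${yearly / users:.2f}/user-year)")
```

One day of uploads works out to about 81 TB and roughly $29k/year at this rate, consistent with the "over 80 terabytes" and "over $30,000" figures once overhead is included.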

Cost per byte has decreased, but bytes per image from iPhone-type platforms have increased. Cost per image hasn’t changed significantly.

Storage costs do drop over time. For example, S3 costs dropped from $0.15 per gigabyte-month in 2009 to $0.03 per gigabyte-month in 2014, and cloud storage vendors have added low-cost options for data that is infrequently accessed. NAS vendors have also delivered large price reductions.

Unfortunately, these lower costs per byte are counteracted by other forces. On iPhones, increasing camera resolution, burst mode and the addition of short animations (Live Photos) have increased bytes-per-image rapidly enough to keep storage cost per image roughly constant. And iPhone images are far from the largest.

In response to these costs, photo storage services have pursued a variety of product options. To name a few: storing lower quality images or re-compressing, charging users for their data usage, incorporating advertising, selling associated products such as prints, and tying storage to purchases of handsets.

There are also a number of engineering approaches to controlling storage costs. We sketched out a few and cover three that we implemented below: adjusting thresholds on our storage systems, rolling out existing savings approaches to more images, and deploying lossless JPG compression.

Adjusting Storage Thresholds

As we dug into the problem, we looked at our storage systems in detail. We discovered that our settings were based on assumptions about high write and delete loads that didn’t hold. Our storage is pretty static. Users only rarely delete or change images once uploaded. We also had two distinct areas of just-in-case space. 5% of our storage was reserved space for snapshots, useful for undoing accidental deletes or writes, and 8.5% was held free in reserve. This resulted in about 13.5% of our storage going unused. Trade lore states that disks should remain 10% free to avoid performance degradation, but we found 5% to be sufficient for our workload. So we combined our two just-in-case areas into one and reduced our free space threshold to that level. This was our simplest approach to the problem (by far), but it resulted in a large gain. With a couple simple configuration changes, we freed up more than 8% of our storage.
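The reserve-space arithmetic behind that gain can be sketched in a few lines (percentages are the ones given in the post):

```python
# Sketch of the just-in-case space arithmetic described above.
snapshot_reserve = 0.05   # space reserved for snapshots (undo accidental deletes)
free_reserve = 0.085      # space held free in reserve
combined_target = 0.05    # single threshold found sufficient for this workload

unused_before = snapshot_reserve + free_reserve   # fraction of storage sitting idle
freed = unused_before - combined_target           # fraction reclaimed by the change
print(f"idle before: {unused_before:.1%}, freed: {freed:.1%}")
```

Merging the two reserves into a single 5% threshold reclaims 8.5 points of capacity, matching the "more than 8%" quoted above.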

Adjusting storage thresholds

Extending Existing Approaches

In our earlier posts, we have described dynamic generation of thumbnail sizes and perceptual compression. Combining the two approaches decreased thumbnail storage requirements by 65%, though we hadn’t applied these techniques to many of our images uploaded prior to 2014. One big reason for this: large-scale changes to older files are inherently risky, and require significant time and engineering work to do safely.

Because we were concerned that further rollout of dynamic thumbnail generation would place a heavy load on our resizing infrastructure, we targeted only thumbnails from less-popular images for deletes. Using this approach, we were able to handle our complete resize load with just four GPUs. The process put a heavy load on our storage systems; to minimize the impact we randomized our operations across volumes. The entire process took about four months, resulting in even more significant gains than our storage threshold adjustments.

Decreasing the number of thumbnail sizes

Lossless JPG Compression

Flickr has had a long-standing commitment to keeping uploaded images byte-for-byte intact. This has placed a floor on how much storage reduction we can do, but there are tools that can losslessly compress JPG images. Two well-known options are PackJPG and Lepton, from Dropbox. These tools work by decoding the JPG, then very carefully compressing it using a more efficient approach. This typically shrinks a JPG by about 22%. At Flickr’s scale, this is significant. The downside is that these re-compressors use a lot of CPU. PackJPG compresses at about 2MB/s on a single core, or about fifteen core-years for a single petabyte’s worth of JPGs. Lepton uses multiple cores and, at 15MB/s, is much faster than PackJPG, but uses roughly the same amount of CPU time.
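The "fifteen core-years per petabyte" figure follows directly from the quoted throughput:

```python
# Check on the "about fifteen core-years per petabyte" figure for PackJPG at 2 MB/s.
PB_IN_MB = 1_000_000_000        # 1 PB = 10^9 MB
packjpg_mb_per_s = 2.0          # single-core throughput quoted above

seconds = PB_IN_MB / packjpg_mb_per_s
core_years = seconds / (365 * 24 * 3600)
print(f"{core_years:.1f} core-years per PB")
```

The result is about 15.9 core-years, so "about fifteen" is a fair rounding; Lepton's higher wall-clock speed comes from parallelism, not from doing less total CPU work.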

This CPU requirement also complicated on-demand serving. If we recompressed all the images on Flickr, we would need potentially thousands of cores to handle our decompress load. We considered putting some restrictions on access to compressed images, such as requiring users to login to access original images, but ultimately found that if we targeted only rarely accessed private images, decompressions would occur only infrequently. Additionally, restricting the maximum size of images we compressed limited our CPU time per decompress. We rolled this out as a component of our existing serving stack without requiring any additional CPUs, and with only minor impact to user experience.

Running our users’ original photos through lossless compression was probably our highest-risk approach. We can recreate thumbnails easily, but a corrupted source image cannot be recovered. Key to our approach was a re-compress-decompress-verify strategy: every recompressed image was decompressed and compared to its source before removing the uncompressed source image.

This is still a work in progress. We have compressed many images, but compressing our entire corpus is a lengthy process, and we had already reached our zero-new-storage-gear goal by mid-year.

On The Drawing Board

We have several other ideas which we’ve investigated but haven’t implemented yet.

In our current storage model, we have originals and thumbnails available for every image, each stored in two datacenters. This model assumes that the images need to be viewable relatively quickly at any point in time. But private images belonging to accounts that have been inactive for more than a few months are unlikely to be accessed. We could “freeze” these images, dropping their thumbnails and recreating them when the dormant user returns. This “thaw” process would take under thirty seconds for a typical account. Additionally, for photos that are private (but not dormant), we could go to a single uncompressed copy of each thumbnail, storing a compressed copy in a second datacenter that would be decompressed as needed.

We might not even need two copies of each dormant original image available on disk. We’ve pencilled out a model where we place one copy on a slower, but underutilized, tape-based system while leaving the other on disk. This would decrease availability during an outage, but as these images belong to dormant users, the effect would be minimal and users would still see their thumbnails. The delicate piece here is the placement of data, as seeks on tape systems are prohibitively slow. Depending on the details of what constitutes a “dormant” photo these techniques could comfortably reduce storage used by over 25%.

We’ve also looked into de-duplication, but we found our duplicate rate is in the 3% range. Users do have many duplicates of their own images on their devices, but these are excluded by our upload tools. We’ve also looked into using alternate image formats for our thumbnail storage. WebP can be much more compact than ordinary JPG, but our use of perceptual compression gets us close to WebP byte size and permits much faster resize. The BPG project proposes a dramatically smaller, H.265-based encoding but has IP and other issues.

There are several similar optimizations available for videos. Although Flickr is primarily image-focused, videos are typically much larger than images and consume considerably more storage.


Optimization over several releases

Since 2013 we’ve optimized our usage of storage by nearly 50%. Our latest efforts helped us get through 2016 without purchasing any additional storage, and we still have a few more options available.

Peter Norby, Teja Komma, Shijo Joy and Bei Wu formed the core team for our zero-storage-budget project. Many others assisted the effort.

Hello 2017

Published 4 Jan 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Looking back

2016 was a busy year.

I can tell that from just looking at my untidy desk...I was going to include a photo at this point but that would be too embarrassing.

The highlights of 2016 for me were getting our AtoM catalogue released and available to the world in April, completing Filling the Digital Preservation Gap (and seeing the project move from the early 'thinking' phases to actual implementation) and of course having our work on this project shortlisted in the Research and Innovation category of the Digital Preservation Awards.

...but other things happened too. Blogging really is a great way of keeping track of what I've been working on and of course what people are most interested to read about.

The top 5 most viewed posts from 2016 on this blog have been as follows:

Looking forward

So what is on the horizon for 2017?

Here are some of the things I'm going to be working on - expect blog posts on some or all of these things as the year progresses.


I blogged about AtoM a fair bit last year as we prepared our new catalogue for release in the wild! I expect I'll be talking less about AtoM this year as it becomes business as usual at the Borthwick, but don't expect me to be completely silent on this topic.

A group of AtoM users in the UK is sponsoring some development work within AtoM to enable EAD to be harvested via OAI-PMH. This is a very exciting new collaboration and will see us being able to expose our catalogue entries to the wider world, enabling them to be harvested by aggregators such as the Archives Hub. I'm very much looking forward to seeing this take shape.
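For readers unfamiliar with the protocol, a harvest of the kind described above boils down to simple HTTP requests. The sketch below builds one such request; the endpoint URL and the `oai_ead` metadata prefix are placeholders for illustration, not details of the actual AtoM development work.

```python
# A hedged sketch of an OAI-PMH ListRecords request of the kind an aggregator
# such as the Archives Hub might issue to harvest EAD records. The endpoint
# and metadata prefix below are hypothetical placeholders.
from urllib.parse import urlencode

endpoint = "https://atom.example.org/oai"   # hypothetical AtoM OAI-PMH endpoint
params = {
    "verb": "ListRecords",        # standard OAI-PMH verb for bulk harvesting
    "metadataPrefix": "oai_ead",  # assumed prefix identifying EAD payloads
    "from": "2017-01-01",         # optional selective-harvest date filter
}
request_url = f"{endpoint}?{urlencode(params)}"
print(request_url)
```

The aggregator would fetch that URL, parse the XML response, and follow `resumptionToken`s for subsequent pages, as the OAI-PMH specification describes.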

This year I'm also keen to explore the Locations functionality of AtoM to see whether it is fit for our purposes.


Work with Archivematica is of course continuing. 

Post Filling the Digital Preservation Gap at York we are working on moving our proof of concept into production. We are also continuing our work with Jisc on the Research Data Shared Service. York is a pilot institution for this project so we will be improving and refining our processes and workflows for the management and preservation of research data through this collaboration.

Another priority for the year is to make progress with the preservation of the born digital data that is held by the Borthwick Institute for Archives. Over the year we will be planning a different set of Archivematica workflows specifically for the archives. I'm really excited about seeing this take shape.

We are also thrilled to be hosting the first European ArchivematiCamp here in York in the Spring. This will be a great opportunity to get current and potential Archivematica users across the UK and the rest of Europe together to share experiences and find out more about the system. There will no doubt be announcements about this over the next couple of months once the details are finalised so watch this space.

Ingest processes

Last year a new ingest PC arrived on my desk. I haven't yet had much chance to play with this but the plan is to get this set up for digital ingest work.

I'm keen to get BitCurator installed and to refine our current digital ingest procedures. After some useful chats about BitCurator with colleagues in the UK and the US over 2016 I'm very much looking forward to getting stuck into this.

...but really the first challenge of 2017 is to tidy my desk!

Impressions of Jerusalem and Tel Aviv

Published 3 Jan 2017 by Tom Wilson in thomas m wilson.

Arriving in Israel… Coming over the border from Jordan it was forbidding and stern – as though I was passing through a highly militarised zone, which indeed I was. Machine gun towers, arid, blasted dune landscape, and endless security checks and waiting about. Then I was in the West Bank. The first thing I noticed […]


Published 29 Dec 2016 by Tom Wilson in thomas m wilson.

I have been travelling West from Asia.  When I was in Colombo I photographed a golden statue of the Buddha facing the Greco-Roman heritage embodied in Colombo’s Town Hall.  And now I’ve finally reached a real example of the Roman Empire’s built heritage – the city of Jerash in Jordan.  Jerash is one of the […]

We Are Bedu

Published 26 Dec 2016 by Tom Wilson in thomas m wilson.

While in Wadi Musa I had met our Bedu guide’s 92 year old mother. She was living in an apartment in the town. I asked her if she preferred life when she was a young woman and there was less access to Western conveniences, or if she preferred life