Sam's news

Here are some of the news sources I follow.

My main website is at https://samwilson.id.au/.


Updating to a much higher MediaWiki version?

Published 25 May 2018 by Erik L in Newest questions tagged mediawiki - Stack Overflow.

I currently have 1.23.1 installed and would like to update to 1.30, and I have a bunch of extensions installed. What would be the best way to update the wiki? Should I update version by version or jump straight to the final version that I want?
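
Whichever path is chosen, each MediaWiki upgrade step follows the same shape: back up, replace the code and update the extensions, then run the maintenance updater. A minimal sketch, with hypothetical paths and database names:

# back up the database first (MySQL shown as an assumption)
mysqldump -u wikiuser -p wikidb > wikidb-backup.sql

# unpack the new release over the install, update the extensions to matching
# versions, then run MediaWiki's schema updater
cd /var/www/wiki
php maintenance/update.php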


Extract titles of pages in category using the WikipediR package in R

Published 25 May 2018 by Sophie in Newest questions tagged mediawiki - Stack Overflow.

Using the "WikipediR" package in R, I would like to extract all category members of the category "Person_der_Reformation" on Wikipedia:

library(WikipediR)

# Retrieve all pages in the "Person_der_Reformation" category on de.wiki
persons <- pages_in_category("de", "wikipedia", categories = "Person_der_Reformation", limit = 500)
persons

My result seems to be a threefold nested list (persons > query > categorymembers). But how do I process this further? In the end, I want to get a list of all personal names in order to build a list of all the Wikipedia articles I want to scrape for building a corpus for text mining in R. Does anybody have a clue about this?
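
Continuing the snippet above, a minimal sketch (assuming the nested structure the question describes, persons > query > categorymembers, with each member carrying a "title" element) would be:

# Extract the title of every category member into a character vector
titles <- sapply(persons$query$categorymembers, function(member) member$title)
head(titles)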

My alternative idea was to read the xml document resulting from the API call https://de.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Person_der_Reformation&cmlimit=500&format=xml

but there I'm struggling with the XPath to address the title attributes. This is what the result page from the API call looks like:

<api batchcomplete="">
<query>
<categorymembers>
<cm pageid="2720179" ns="0" title="Jacobus Acontius"/>
<cm pageid="1347785" ns="0" title="Sebastian Aitinger"/>
<cm pageid="7892887" ns="0" title="Martial Alba"/>
<cm pageid="2960360" ns="0" title="Albrecht (Nassau-Weilburg)"/>
.....
</categorymembers>
</query>
</api> 
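
In that result, every page title is an attribute of a <cm> element, so the XPath //cm/@title reaches it. A sketch using the xml2 package (one option among several; the package choice is an assumption):

library(xml2)

# Read the API response and pull the title attribute off every <cm> node
doc <- read_xml("https://de.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Person_der_Reformation&cmlimit=500&format=xml")
titles <- xml_attr(xml_find_all(doc, "//cm"), "title")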

GDPR is Here, and We've Got You Covered

Published 25 May 2018 by DigitalOcean in The DigitalOcean Blog.

Today, the new European General Data Protection Regulation (GDPR) goes into effect. (You might have received a few emails about it.) There are a lot of moving parts, but it’s an important step in protecting the fundamental right of privacy for European citizens, and it also raises the bar for data protection, security, and compliance in the industry. This post is here to guide you to our GDPR-related resources.

We’ve created a new GDPR section on our website to go over what GDPR means for you and the steps we’ve taken to ensure the protection of your privacy. In this section, you’ll find:

In addition, we updated our Privacy Policy and Terms of Service Agreement to comply with the new requirements of GDPR. If you’re interested in seeing what changed in the Privacy Policy and TOS, check out our GitHub repo where you can compare versions.

We take this new regulation seriously, and we want to get you back to doing what you love—building great software.


Mediawiki 1.27.4 jquery not loaded

Published 24 May 2018 by Andy Johnson in Newest questions tagged mediawiki - Stack Overflow.

I am new to MediaWiki and ResourceLoader. I recently downloaded the MediaWiki 1.27.4 LTS version, and when it was installed, I found that even though they say jQuery is loaded by default, it is nowhere to be found (I am looking in the Sources tab in Chrome developer tools). In one of my extensions, which uses the BeforePageDisplay hook, I wanted to use jquery.cookie, so I declared the following ResourceLoader module:

$wgResourceModules['ext.myFirstExtension'] = array(           
        'dependencies' => array( 'jquery.cookie'),            
        'localBasePath' => dirname( __FILE__ ),            
        'remoteExtPath' => 'myFirstExtension',
        'position' => 'top'
);

And in my extension file, I am autoloading one of the classes. In the script, I am simply executing the following code, and it throws me the typical error of $ being undefined, since jQuery is not loaded.

$(document).ready(function(){
    alert("here");
});

And yes, I am using the Vector skin without any modifications. In addition, I am not using any other extensions except VisualEditor, and it works fine.

I also tried mw.loader.load('jquery') in my script, and it also complains that mw is not recognized.

I also added $wgResourceLoaderDebug = true; in my LocalSettings.php so that ResourceLoader doesn't bundle up my scripts and CSS.

I suspect that MediaWiki internally can't function without jQuery, but how can I get jQuery to load in my extension correctly so that I can use jquery.cookie?
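
For reference, the conventional ResourceLoader pattern is to register the extension's own script file inside the module and then queue the module from the hook, letting ResourceLoader deliver jQuery and jquery.cookie with it. This is a sketch only; ext.myFirstExtension.js and the handler method are hypothetical names:

$wgResourceModules['ext.myFirstExtension'] = array(
    'scripts'       => array( 'ext.myFirstExtension.js' ), // the extension's own JS, served by ResourceLoader
    'dependencies'  => array( 'jquery.cookie' ),
    'localBasePath' => dirname( __FILE__ ),
    'remoteExtPath' => 'myFirstExtension',
);

// In the BeforePageDisplay handler, queue the module instead of emitting raw <script> tags.
public static function onBeforePageDisplay( OutputPage &$out, Skin &$skin ) {
    $out->addModules( 'ext.myFirstExtension' );
    return true;
}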

Thanks


GDPR is here!

Published 24 May 2018 by Bron Gondwana in FastMail Blog.

The European Union and the United Kingdom have been leaders in writing regulations to protect something we've long known you value -- your personal information and privacy. We talked about the basics of GDPR protection last month; now it's time to talk about what's changing.

For us, it's been an opportunity to make sure that our practices are in line with our values.

For FastMail, not much is changing. We have high standards for ourselves, and you don't have to change much if you aren't monetizing customers' personal data! Where we've spent the bulk of our time (besides converting our policies from code into words) is thinking about areas where being helpful comes into tension with privacy.

Being helpful vs. protecting your privacy

We pride ourselves on solving unusual problems like buggy mail client behavior, and helping customers out of tough situations (even when that tough situation is something like "my aged parent forgot to pay for their account for two years.") It feels great to go above and beyond for customers! But this process made us think about what kind of personal data might be collected incidentally in the logs we use for debugging, or how long a reasonable person might expect that their information is retained if they choose not to pay for an account.

Reducing our data retention periods, especially in the case where the retained data was likely to contain personal customer information, was one of our biggest changes. We've tried to strike the right balance between making sure you still get the support you expect from us, and protecting your personal information.

You've got rights - know how to use them

We know our new privacy policy is longer. We went with one that sacrifices brevity for coverage, but we hope it remains clear and easy to understand.

Due to our commitment to open standards, it's always been possible to get your personal data from us in a downloadable, machine-readable format. The privacy policy now includes much more specific language detailing the laws under which those rights are granted - but at FastMail, everyone has them, not just European residents.

What's a DPA, and do I need one?

One of GDPR's other major goals is to try to keep companies from passing the buck in the case of a breach of personal information. As such, corporations that process data on behalf of other people need a contract with all the vendors they use who might hold that information. That contract is a Data Protection Addendum. If you're an individual, you get your services directly from us, and you don't need a DPA.

If you're a corporation and you do need a DPA, the process depends on which product you're using:

What's next?

This is not our last revision to our Terms of Service and Privacy Policy. Protecting your data is not something we need a law to push us to do! It did push us to formally name Privacy Officers (who you can contact at privacy@fastmailteam.com). They are staffers who are receiving additional training on security and privacy considerations, and are explicitly empowered to question decisions we're making in all our products to make sure we're always making good choices around your privacy.

Our revised documents and new related resources:

If you have further questions about GDPR, your data, or your privacy rights, feel free to reach out to our support team for assistance. Thank you for using FastMail!


Perspectives on Payments Canada Summit 2018

Published 23 May 2018 by Karen Myers in W3C Blog.

At the Payments Canada Summit, 9-11 May 2018 in Toronto, the historically conservative Canadian Financial Industry heard some consistent themes from many of the conference speakers: technology change is happening now and will continue to accelerate, so adapt quickly or be left behind. And data is the new currency.

The World Wide Web Consortium’s (W3C) Web Payments lead Ian Jacobs was invited to moderate one of the first panels titled, “Streamlined Checkout on the Web,” together with Shopify’s Andre Lyver, Director of Engineering, and Google’s Anthony Vallee-Dubois, Software Developer, who are active contributors to the technical work.

Andre Lyver, Anthony Vallee-Dubois and Ian Jacobs speaking at Payments Canada Summit 2018

The trio presented an overview of the W3C’s royalty-free standards work to make online checkout faster and more secure on the Web, and showed demos of implementations by Shopify and in the Google Chrome browser that are working today. At the W3C table in the exhibit hall, Jacobs and Lyver demonstrated how using the simplified “buy button”, and reusing browser-stored data, enables the completion of shopping transactions more quickly and securely.

Lyver presented some very early findings based on Shopify’s experimentation with the W3C’s Payment Request API, including reduced checkout times through the browser interface and the popularity among shoppers of surfaced discount codes. Coupons, discount codes, and loyalty programs are under discussion in the W3C’s Web Payments Working Group and Commerce Interest Group.
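
As an illustration only (not code shown at the summit), the browser-side shape of a Payment Request API checkout is roughly the following; the payment method and amounts are assumptions:

// Ask the browser to collect payment details it already stores for the user.
const request = new PaymentRequest(
  [ { supportedMethods: 'basic-card' } ],
  { total: { label: 'Order total', amount: { currency: 'CAD', value: '25.00' } } }
);

request.show()
  .then(response => {
    // Hand response.details to the payment processor, then dismiss the sheet.
    return response.complete('success');
  })
  .catch(err => console.error(err));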

Following the W3C panel session on Web Payments, the opening day keynote presentations previewed technology changes happening today, or arriving in the very near future. Stacey Madge, Country Manager and President at Visa Canada, envisioned a “wave of connectivity of everything” where Visa will embed credentials into all types of connected devices. Madge referenced a partnership between Visa and Honda to develop APIs for pay-at-the-pump and parking location and payment scenarios. W3C’s Automotive Web Payments Task Force is currently looking at how the Web might apply to these same use cases.

MasterCard’s Jessica Turner, EVP Digital Payments and Labs, emphasized EMVCo’s secure approach to payments using tokenization technology and echoed the coming ubiquity of IoT payments across devices.

Asheesh Birla, SVP of Product at Ripple, explained how blockchain technology is solving problems such as cross-border payments, and how smart contracts will help to reduce costs and help to build “the Internet of Value.” Ripple is building community around the related Interledger protocol in the W3C Interledger Payments Community Group.

Ulku Rowe, Technical Director Financial Services at Google, stressed the need for financial institutions to create a culture of innovation to keep pace with today’s environments of cloud computing, machine learning modules, and data analytics. Rowe said the old model of ‘big versus small’ is no longer relevant; it’s whether you are ‘fast or slow’ and can accelerate the transformation of financial services companies to become technology companies.

From a more personal perspective, Frank W. Abagnale, Jr., cybersecurity, fraud and identity theft prevention consultant and author (Catch Me If You Can), addressed attendees via videoconference on the final day of the summit. Abagnale offered pragmatic tips for companies and individuals to be more responsible about instilling rigorous security practices.

To protect personal identities, Abagnale advised: avoid paying with checks whenever possible because of the personal and bank account information printed on them; use confetti rather than strip shredders for any paper documents, even direct mail flyers, that carry any personal information; use credit monitoring services and follow up immediately on any anomalies; and use credit cards rather than debit cards for purchases because the liability protections are better.

In closing comments, main stage host Bruce Croxon, recognized in Canada for his success as an entrepreneur, venture capitalist for startups, and media personality on CBC-TV’s “Dragons’ Den” and now BNN’s “The Disruptors,” encouraged fintech entrepreneurs to create solutions to real problems that have the potential for large market impact.

Payments Canada CEO Jan Pilbauer’s closing keynote painted a future of many innovations for the financial services and payments industry, including embedded payment systems as part of connected devices in home, work and outdoor environments.

Jan Pilbauer

Jan Pilbauer, Payments Canada CEO at closing keynote, Payments Canada Summit 2018

Payments Canada, a W3C member organization headquartered in Ottawa, Ontario, ensures that financial transactions in Canada are carried out safely and securely each day. The organization underpins the Canadian financial system and economy by owning and operating Canada’s payment clearing and settlement infrastructure, including associated systems, bylaws, rules and standards. The value of payments cleared by Payments Canada’s systems in 2017 was approximately $50 trillion, or $200 billion every business day. These encompass a wide range of payments made by Canadians and businesses involving inter-bank transactions, including those made with debit cards, pre-authorized debits, direct deposits, bill payments, wire payments and cheques.

Payments Canada is currently undergoing a multi-year modernization initiative based on a comprehensive roadmap for policy, process and technological improvements for all ecosystem participants.


"People cannot say they did not know what was happening"

Published 23 May 2018 by in New Humanist Articles and Posts.

Q&A with journalist and author Rania Abouzeid.

Cannot upload mp4 file to my mediawiki site

Published 23 May 2018 by feiffy in Newest questions tagged mediawiki - Stack Overflow.

I cannot upload an mp4 file to my MediaWiki site.

When I upload an mp4 file, it shows this error:

Exception caught: No specifications provided to ArchivedFile constructor

I have searched Google for this error message but found nothing useful.

I have enabled uploads and allowed the mp4 file type; this is my LocalSettings.php:

$wgEnableUploads = true;
...
$wgFileExtensions = array_merge( $wgFileExtensions,
    array( 'mp4')
);

cardiParty 2018.06 - Melbourne Birthday cardiParty!

Published 22 May 2018 by Hugh Rundle in newCardigan.

6:30pm, Friday 8 June at the Upper Terrace room, Duke of Wellington Hotel.

Find out more...


A Message About Intel’s Latest Security Findings

Published 21 May 2018 by Josh Feinblum in The DigitalOcean Blog.

In response to Intel’s statement today regarding new vulnerabilities, we wanted to share all the information we have to date with our customers and community.

Current information does not suggest that this latest vulnerability, Variant 4, would allow Droplets to gain access to the host hypervisor, or access to other Droplets. We also do not believe that we will need to reboot our entire fleet of hypervisors, as was necessary to mitigate impact from the initial Spectre and Meltdown vulnerabilities. However, there is a remote potential for exploit and we are working with Intel to validate microcode to patch for the vulnerabilities. We are accelerating the fix, but applying these updates takes coordination and time.

Our security and engineering teams are monitoring our hypervisors and following this issue closely. We remain in communication with our contacts at Intel regarding any new developments. The security of our users’ data is one of our highest priorities, and we are ready to take action if and when appropriate. At this time, we strongly recommend ensuring that you have the latest packages from your distributions, and you use the latest browser versions with fixes for Variant 4.

We will update this blog as more information becomes available. In addition to posting here, we will notify customers directly if there is a need to take action.


May Community Doers: Open Source Contributors

Published 21 May 2018 by Daniel Zaltsman in The DigitalOcean Blog.

Since DigitalOcean came to be, the founders believed that the developer community is far greater than the sum of its parts. Six years later we continue to learn and grow thanks to the tireless work of our global community. Instrumental to increasing collaboration and ease-of-use, the Projects section of the Community received its first submission four years ago and today boasts a total of 186 apps, wrappers, and integrations using the DigitalOcean API.

In this month’s “Doers” spotlight, we highlight three builders who continue to maintain technology that makes a difference for users in the DigitalOcean ecosystem. When they are not working on software engineering and DevOps, they give back in a way that enriches the community. Please join us in recognizing May’s featured Doers:

Jeevanandam M. (@myjeevablog)

When he is not building out and supporting aah, the secure, flexible, and rapid Go web framework, Jeeva has been making valuable contributions that enable developers to use DigitalOcean. Since early 2014, he has maintained a widely used DigitalOcean API client library written in Java. The client is used by the Jenkins DigitalOcean plugin, powering a large number of CI use cases on top of DigitalOcean. We are immensely thankful for Jeeva’s commitment to quality and community and believe this recognition is long overdue.

Lorenzo Setale (@koalalorenzo)

Lorenzo is a Copenhagen-based Italian developer of ideas who has been involved in the community since 2012. Anyone who has spun up Droplets using the python-digitalocean Python library will be familiar with Lorenzo’s tireless work. He has long authored and maintained one of the most used and best supported DigitalOcean API libraries. For some it is a playground for experimentation, for others a tool for building their first project; thanks to Lorenzo for the technology that keeps on giving.

Peter Souter (@petersouter)

Peter is an open source citizen who leads by example. With regard to his work on Tugboat, a CLI that predates doctl, he notes on his blog that “as long as people are interested I will keep maintaining and helping with open source software I maintain.” Previously at Puppet, Peter currently works at HashiCorp out of London, and we’re proud to say he’s been around our community for a long time. In addition to being the main contributor to Tugboat, he has made a few contributions to droplet_kit, the Ruby API client. Thanks for all your work, Peter; we appreciate it all.

Jeeva, Lorenzo, and Peter showcase the qualities we are proud to see in our community and we hope that they inspire others as well. We’re grateful to have this opportunity to recognize our amazing community contributors and if you’re interested in getting more involved in the DigitalOcean community, here are a few places to start:

Want to recognize someone in the community? Leave their name in the comments or reach out to Doers [at] DigitalOcean [.] com.


"Computers aren’t capable of using common sense"

Published 21 May 2018 by in New Humanist Articles and Posts.

Q&A with journalist and software developer Meredith Broussard.

How elite networks shape our society

Published 21 May 2018 by in New Humanist Articles and Posts.

What the Cambridge Analytica scandal reveals about Britain.

From Another View in Geraldton

Published 21 May 2018 by carinamm in State Library of Western Australia Blog.

The From Another View project team visited Geraldton, opened a pop-up exhibition at the Museum of Geraldton and conducted a Storylines session at the Geraldton Regional Library.

Pop up exhibition at Museum of Geraldton (c) State Library of Western Australia, 2018

At the opening of the exhibition, Pop Robert Ronan welcomed audience members to Southern Yamaji country, the land of the Nhanhagardi, Wilunyu and Amangu. Robert reminisced about life in Geraldton, and as a younger man sitting near the John Forrest statue on the foreshore. Robert recollected wondering about what it might be like for the expedition party to travel his country.

Museum of Geraldton (c) State Library of Western Australia 2018

Members of the Museum of Geraldton Site Advisory Committee and the Walkaway Station Museum attended. In later life, Lady Forrest (Margaret Elvire Hammersley), John Forrest’s wife, lived at Georgina near Walkaway. Some of Lady Forrest’s belongings were donated to the Walkaway Station Museum.

The project team helped a number of families reconnect with photographs of family during the two day visit. Here are some of the stories.

Fred Mallard

Fred Mallard and Con Kelly and some of the children camped at Galena. Taken at Galena on 2nd October, 1937, at about 6 p.m. by F.I. Bray, D.C.N.A. (Deputy Commissioner [Dept. of] Native Affairs). https://storylines.slwa.wa.gov.au/archive-store/view/6/1403

Charlie Cameron

Mr & Mrs Charlie Cameron at Cue. Photograph taken on 30/9/37 by F.I. Bray, D.C.N.A. https://storylines.slwa.wa.gov.au/archive-store/view/6/1408

During the Storylines session, Trudi Cornish from the Geraldton Regional Library explained that the story of the woman in the photograph is known; however, her name is not. The woman was a contemporary of King Billy and ‘gave as good as she got’ when people would mock her with the name ‘Ugly Legs’ due to some scars she had.

Photograph of “Ugly Legs”, Geraldton 1900. https://storylines.slwa.wa.gov.au/archive-store/view/6/9854

The project team is packed up and ready for the onward journey to Wiluna to conduct a Storylines session and pop-up exhibition on Thursday 24 May 2018 at Tjukurba Art Gallery. The team will then head out to Martu, Birriliburu country along the Canning Stock Route and Gunbarrel Highway to the Mangkili Claypans, with two groups of traditional owners.

Onward

(c) State Library of Western Australia, 2018

Artist Bill Gannon will stop at Pia Wadjarri and visit the school, to discuss his artwork and John Forrest’s trek. Then he will travel to Wiluna via Mt Gould.

Looking at the map. Museum of Geraldton exhibition. (c) State Library of Western Australia, 2018


Semantic wiki: using the wiki like a database

Published 20 May 2018 by Black S. in Newest questions tagged mediawiki - Stack Overflow.

I'm a newbie with semantic wikis. I want to build something like a database and overview of computer components for my organization.

I have read about the semantic wiki language but can't understand whether I can do this in a semantic wiki or not. Please help me, or give me directions on where to look.

For example, I have HDDs. Each of these has:

- a status: used or unused
- if used, a computer (parent); if unused, a storage room
- a serial number
- a specification
- etc.

I also have a storage room hierarchy and so on.

How can I do this in a semantic wiki?

Will each HDD have its own page?

I found that it can be done with subobjects, but a subobject can't be shown on its own page. How can I describe it and make that description visible, or can it only be shown with #ask? Maybe it can be done with something other than subobjects?
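
A rough sketch of the page-per-HDD approach in Semantic MediaWiki; every property name below is made up for illustration. Each HDD page carries property annotations, and an overview page queries them with #ask:

<!-- On a page such as "HDD:WX41A1234567" (hypothetical title and properties) -->
[[Category:HDD]]
[[Has status::used]]
[[Is installed in::Computer:Workstation-07]]
[[Has serial number::WX41A1234567]]
[[Has specification::1 TB, 7200 rpm, SATA]]

<!-- On the overview page: list every unused drive with its details -->
{{#ask: [[Category:HDD]] [[Has status::unused]]
 |?Has serial number
 |?Has specification
 |format=table
}}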

Thanks for your time


How to get description of an entity/search query from Wikidata API

Published 19 May 2018 by Aakash Singh in Newest questions tagged mediawiki - Stack Overflow.

I want to get the description of a search query using the Wikidata API. I have found that setting the action parameter to wbsearchentities gives the descriptions of all strings that match the entity. Can anyone tell me how to get a fuller description of a selected entity, like the panel shown on the right-hand side of search engine results?
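
One way to go further (shown purely as an illustration, with Q42 as a placeholder entity ID): take the ID returned by wbsearchentities and pass it to wbgetentities, which can return the label, description and claims in one call:

https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q42&props=labels|descriptions|claims&languages=en&format=json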


cardiCast episode 31 Reece Harley

Published 18 May 2018 by Justine in newCardigan.

Perth February 2018 cardiParty

Recorded live

The Museum of Perth chronicles the social, cultural, political and architectural history of Perth. Their exhibition space serves as a meeting place of ideas and stories, a retail space, micro-cinema and a cultural hub in a forgotten part of the city.

For our March Perth cardiParty, Reece Harley, Executive Director and founder, gave an introductory talk about Museum of Perth, covering background info about the museum and the current exhibition.

The Museum is an initiative of the Perth History Association Inc, a not-for-profit organisation founded in 2015.

newcardigan.org
glamblogs.newcardigan.org

Music by Professor Kliq ‘Work at night’ Movements EP.
Sourced from Free Music Archive under a Creative Commons licence.

 


UK Archivematica meeting at Westminster School

Published 18 May 2018 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday the UK Archivematica user group meeting was held in the historic location of Westminster School in central London.

A pretty impressive location for a meeting!
(credit: Elizabeth Wells)


In the morning once fuelled with tea, coffee and biscuits we set about talking about our infrastructures and workflows. It was great to hear from a range of institutions and how Archivematica fits into the bigger picture for them. One of the points that lots of attendees made was that progress can be slow. Many of us were slightly frustrated that we aren't making faster progress in establishing our preservation infrastructures but I think it was a comfort to know that we were not alone in this!

I kicked things off by showing a couple of diagrams of our proposed and developing workflows at the University of York. Firstly illustrating our infrastructure for preserving and providing access to research data and secondly looking at our hypothetical workflow for born digital content that comes to the Borthwick Institute.

Now that our AtoM upgrade is complete and Archivematica 1.7 has been released, I am hoping that colleagues can set up a test instance of AtoM talking to Archivematica that I can start to play with. In a parallel strand, I am encouraging colleagues to consider and document access requirements for digital content. This will be invaluable when thinking about what sort of experience we are trying to implement for our users. The decision is yet to be made around whether AtoM and Archivematica will meet our needs on their own or whether additional functionality is needed through an integration with Fedora and Samvera (the software on which our digital library runs)... but that decision will come once we better understand what we are trying to achieve and what the solutions offer.

Elizabeth Wells from Westminster School talked about the different types of digital content that she would like Archivematica to handle and different workflows that may be required depending on whether it is born digital or digitised content, whether a hybrid or fully digital archive and whether it has been catalogued or not. She is using Archivematica alongside AtoM and considers that her primary problems are not technical but revolve around metadata and cataloguing. We had some interesting discussion around how we would provide access to digital content through AtoM if the archive hadn't been catalogued.

Anna McNally from the University of Westminster reminded us that information about how they are using Archivematica is already well described in a webinar that is now available on YouTube: Work in Progress: reflections on our first year of digital preservation. They are using the PERPETUA service from Arkivum and they use an automated upload folder in NextCloud to move digital content into Archivematica. They are in the process of migrating from CALM to AtoM to provide access to their digital content. One of the key selling points of AtoM for them is its support for different languages and character sets.

Chris Grygiel from the University of Leeds showed us some infrastructure diagrams and explained that this is still very much a work in progress. Alongside Archivematica, he is using BitCurator to help appraise the content and EPrints and EMU for access.

Rachel MacGregor from Lancaster University updated us on work with Archivematica at Lancaster. They have been investigating both Archivematica and Preservica as part of the Jisc Research Data Shared Service pilot. The system that they use has to be integrated in some way with PURE for research data management.

After lunch in the dining hall (yes it did feel a bit like being back at school),
Rachel MacGregor (shouting to be heard over the sound of the bells at Westminster) kicked off the afternoon with a presentation about DMAonline. This tool, originally created as part of the Jisc Research Data Spring project, is under further development as part of the Jisc Research Data Shared Service pilot.

It provides reporting functionality for a range of systems in use for research data management including Archivematica. Archivematica itself does not come with advanced reporting functionality - it is focused on the primary task of creating an archival information package (AIP).

The tool (once in production) could be used by anyone regardless of whether they are part of the Jisc Shared Service or not. Rachel also stressed that it is modular - though it can gather data from a whole range of systems, it could also work just with Archivematica if that is the only system you are interested in reporting on.

An important part of developing a tool like this is to ensure that communication is clear - if you don’t adequately communicate to the developers what you want it to do, you won’t get what you want. With that in mind, Rachel has been working collaboratively to establish clear reporting requirements for preservation. She talked us through these requirements and asked for feedback. They are also available online for people to comment on:


Sean Rippington from the University of St Andrews talked us through some testing he has carried out, looking at how files in SharePoint could be handled by Archivematica. St Andrews are one of the pilot organisations for the Jisc Research Data Shared Service, and they are also interested in the preservation of their corporate records. There doesn’t seem to be much information out there about how SharePoint and Archivematica might work together, so it was really useful to hear about Sean’s work.

He showed us inside a sample SharePoint export file (a .cmp file). It consisted of various office documents (the documents that had been put into SharePoint) and other metadata files. The office documents themselves had lost much of their original metadata - they had been renamed with a consecutive number and given a .DAT file extension. The date last modified had changed to the date of export from SharePoint. However, all was not lost: a manifest file was included in the export and contained lots of valuable metadata, including the last modified date, the filename, the file extension and the name of the person who created the file and last modified it.

Sean tried putting the .cmp file through Archivematica to see what would happen. He found that Archivematica correctly identified the MS Office files (regardless of the change of file extension) but obviously the correct (original) metadata was not associated with the files. This continued to be stored in the associated manifest file. This has the potential to confuse future users of the digital archive - the metadata gives useful context to the files and, if hidden in a separate manifest file, it may not be discovered.

Another approach he took was to use the information in the manifest file to rename the files and assign them with their correct file extensions before pushing them into Archivematica. This might be a better solution in that the files that will be served up in the dissemination information package (DIP) will be named correctly and be easier for users to locate and understand. However, this was a manual process and probably not scalable unless it could be automated in some way.
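
As a hedged sketch of what that automation might look like: assume (purely for illustration) that the relevant manifest entries have already been exported to a CSV mapping each consecutive .DAT name to its original filename; the real SharePoint manifest is XML and would need its own parsing step.

import csv
import os

# Hypothetical two-column mapping file: exported_name,original_name
# e.g. "00000001.DAT,Meeting minutes 2017-03.docx"
MAPPING_FILE = "manifest_mapping.csv"
EXPORT_DIR = "cmp_export"

with open(MAPPING_FILE, newline="") as f:
    for exported_name, original_name in csv.reader(f):
        src = os.path.join(EXPORT_DIR, exported_name)
        dst = os.path.join(EXPORT_DIR, original_name)
        if os.path.exists(src):
            # Restore the original filename and extension before ingest into Archivematica
            os.rename(src, dst)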

He ended with lots of questions and would be very glad to hear from anyone who has done further work in this area.

Hrafn Malmquist from the University of Edinburgh talked about his use of Archivematica’s appraisal tab and described a specific use case for Archivematica. The records of the University court have been deposited as born digital since 2007 and need to be preserved and made accessible with full-text searching to aid retrieval. This has been achieved using a combination of Archivematica and DSpace and by adding a package.csv file containing appropriate metadata that can be understood by DSpace.

Laura Giles from the University of Hull described ongoing work to establish a digital archive infrastructure for the Hull City of Culture archive. They had an appetite for open source and prior experience with Archivematica so they were keen to use this solution, but they did not have the in-house resource to implement it. Hull are now working with CoSector at the University of London to plan and establish a digital preservation solution that works alongside their existing repository (Fedora and Samvera) and archives management system (CALM). Once this is in place they hope to use similar principles for other preservation use cases at Hull.

We then had time for a quick tour of Westminster School archives followed by more biscuits before Sarah Romkey from Artefactual Systems joined us remotely to update us on the recent new Archivematica release and future plans. The group is considering taking her up on her suggestion to provide some more detailed and focused feedback on the appraisal tab within Archivematica - perhaps a task for one of our future meetings.

Talking of future meetings ...we have agreed that the next UK Archivematica meeting will be held at the University of Warwick at some point in the autumn.


Forrest’s Exploration Diaries now online

Published 17 May 2018 by carinamm in State Library of Western Australia Blog.

Artist Bill Gannon and surveyor Rod Schlenker visited the State Library to see the original diaries of John and Alexander Forrest’s 1874 expedition from Geraldton to Adelaide. The diaries, which are held in the State Library collections, are now accessible online through the catalogue (ACC 1241A).

From Another View Project Coordinator Tui Raven with Rod Schlenker and Bill Gannon as they look at the diaries. (C) State Library of Western Australia, 2018. 

This week Bill Gannon and a team from the State Library will embark on a trip to engage with Aboriginal communities and visit key locations along the 1874 trek route. This artistic and community engagement is part of the ‘From Another View’ project, a collaboration between the State Library and Minderoo Foundation. The project considers the trek ‘from another view’, or rather from many views, incorporating various creative and Aboriginal community perspectives.

Explore some of the camp locations referenced in John and Alexander Forrest’s diaries through the Google map.

Forrest’s Expedition to Central Australia, State Library of Western Australia, ACC 1241A

For more information about the From Another View project, go to https://fromanotherview.blog/. Follow the From Another View blog to keep updated with the project.

 


Do human rights have a future? The summer 2018 New Humanist

Published 17 May 2018 by in New Humanist Articles and Posts.

Special report - plus Ha-Joon Chang on sci-fi, in defence of propaganda, ancient DNA and more.

Free speech and Malaysia's ban on fake news

Published 16 May 2018 by in New Humanist Articles and Posts.

The Anti-Fake News Act leaves the question of what qualifies as fake news vague and ill-defined.

Block Storage Volumes Gets a Performance Burst

Published 15 May 2018 by Priya Chakravarthi in The DigitalOcean Blog.

At DigitalOcean, we’ve been rapidly adding new products and features on our mission to simplify cloud computing, and today we're happy to announce our latest enhancement.

Over the first half of 2018, we've improved performance for Block Storage Volumes with backend upgrades that reduce cluster latency by 50% and provide new burst support for higher performance for spiky workloads.

Burst Performance Characteristics

Block Storage Volumes have a wide variety of use cases, like database reads and writes as well as storing logs, static assets, backups, and more. The performance expectations from a particular volume will depend on how it's used.

Database workloads, for example, need single-digit millisecond latency. Most workloads in the cloud today are bursty, however, and don't require sustained high performance at all times. Use cases like web servers, backups, and data warehousing can require higher performance due to short increases in traffic or a temporary need for more bandwidth.

To meet the need for very low latency, we upgraded Ceph to its latest version, Luminous v12.2.2, in all regions containing Block Storage. This reduced our cluster latency by 50% and provides the infrastructure you need to manage databases with Block Storage Volumes.

To support spiky workloads, we added burst support, which automatically increases Block Storage Volumes' IOPS and bandwidth rates for short periods of time (60 seconds) before returning to baseline performance to cool off (60 seconds).

Here's a summary of the burst performance characteristics, which compares a Standard Droplet (SD) plan and an Optimized Droplet (OD) plan:

Metric                         SD       OD
Baseline IOPS (IOPS/volume)    5000     7500
Baseline BW (MB/s)             200      300
Burst IOPS (IOPS/volume)       7500     10000
Burst BW (MB/s)                300      350
Avg Latency                    <10 ms   <10 ms

We don't scale performance by the size of the volume you create, so every Block Storage Volume is configured to provide the same level of performance for your applications. However, your application needs to be written to realize these limits, and the kind of performance you get will depend on your app's configuration and a number of other parameters.

Performance and Latency Benchmarking

To learn more about the performance you're getting, we wrote How To Benchmark DigitalOcean Volumes, which explains not only how to benchmark your volumes but also how to interpret the results.
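
For a sense of how numbers like these are gathered, a typical fio run for random reads at a given block size and queue depth looks roughly like this (an illustration rather than the tutorial's exact commands; the mount path and job name are assumptions):

# 60 seconds of 4K random reads at queue depth 32 against a mounted volume
fio --name=randread-test --filename=/mnt/myvolume/fio-testfile \
    --ioengine=libaio --direct=1 --rw=randread \
    --bs=4k --iodepth=32 --size=4G --runtime=60 --time_based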

We then ran some of these tests internally to share the numbers and performance of our offering. You can find all the details in the tutorial, but here's a sample of results, which shows typical performance based on the queue depth (QD) of the application and the block size (on the x-axis) versus IOPS (on the y-axis).

These graphs show that the IOPS rate increases as queue depth increases until we hit our practical IOPS cap. Smaller block sizes tend to be IOPS limited, while larger block sizes tend to be bandwidth limited.

What about latency? Most real-world customer applications won't run the same kind of workload often used as a baseline (QD = 1 4K I/O), so these graphs show latency in µsec (or microseconds) as we add load to the cluster.

We see the same behavior in reads and writes. Because of how the backend storage stores the data, our results show that 16K has better latency at high queue depth, so we recommend you tune for 16K workloads if possible.

What's Next?

The performance improvements aren’t the only thing we have in store. There are several QoS features and infrastructure investments in the pipeline to improve your experience of Block Storage Volumes. (Ready to get started? Create a Volume now.)

We'd love to hear your thoughts, questions, and feedback. Feel free to leave a comment here or reach out to us through our UserVoice.


Episode 8: BTB Digest 1

Published 15 May 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

The best of episodes 1-5! Well, not really the best, but the most relevant (and maybe interesting) parts of the first five episodes, condensed into a short(-ish) 30 minute digest.


Mediawiki Stylesheet Error

Published 15 May 2018 by Sauseee in Newest questions tagged mediawiki - Stack Overflow.

I have a problem with my MediaWiki installation. Some pages are not loading properly. I have already been reading error posts for many hours but couldn't find a solution. Most pages work fine:

Main Page

Version Page

At the "edit" site formatting breaks: Edit Page

I got the following errors: Warning, Error

Warning (translated): Loading failed for the script with source ....

The warning and the error appear on all pages.

URL/load.php gives me this:

/* This file is the Web entry point for MediaWiki's ResourceLoader:
   <https://www.mediawiki.org/wiki/ResourceLoader>. In this request,
   no modules were requested. Max made me put this here. */

LocalSettings.php:

<?php
if ( !defined( 'MEDIAWIKI' ) ) {
    exit;
}

$wgSitename = "Some_Name";
$wgScriptPath = "";
$wgServer = "http://wiki.intern.zz";
$wgResourceBasePath = $wgScriptPath;
$wgLogo = "$wgResourceBasePath/resources/assets/wiki.png";
$wgEnableEmail = false;
$wgEnableUserEmail = true; # UPO
$wgEmergencyContact = "apache@wiki.intern.zz";
$wgPasswordSender = "apache@wiki.intern.zz";
$wgEnotifUserTalk = false; # UPO
$wgEnotifWatchlist = false; # UPO
$wgEmailAuthentication = true;
$wgDBtype = "mysql";
$wgDBserver = "localhost";
$wgDBname = "wikidb";
$wgDBuser = "wikiuser";
$wgDBpassword = "7YdLrbOCrtXnOS2FFGzF";
$wgDBprefix = "";
$wgDBTableOptions = "ENGINE=InnoDB, DEFAULT CHARSET=binary";
$wgDBmysql5 = false;
$wgMainCacheType = CACHE_ACCEL;
$wgMemCachedServers = [];
$wgEnableUploads = true;
$wgUseImageMagick = true;
$wgImageMagickConvertCommand = "/usr/bin/convert";
$wgUseInstantCommons = false;
$wgPingback = false;
$wgShellLocale = "en_US.utf8";
#$wgCacheDirectory = "$IP/cache";
$wgLanguageCode = "de";
$wgSecretKey="46c344c2c3d7af5f4210a0ead3f30002075087328e91aeece74cd086f22bbbb6";
$wgAuthenticationTokenVersion = "1";
$wgUpgradeKey = "d585d23ed115e0bd";
$wgRightsPage = ""; 
$wgRightsUrl = "";
$wgRightsText = "";
$wgRightsIcon = "";
$wgDiff3 = "/usr/bin/diff3";
$wgDefaultSkin = "vector";
wfLoadSkin( 'CologneBlue' );
wfLoadSkin( 'Modern' );
wfLoadSkin( 'MonoBook' );
wfLoadSkin( 'Vector' );

$wgShowExceptionDetails = TRUE;

nginx.conf:

        server {
                listen 80;
                root /data/www/mediawiki;
                index index.php;
                server_name wiki.intern.zz;

                location ~ \.php$ {
                        fastcgi_pass 127.0.0.1:9000;
                        fastcgi_index index.php;
                        include fastcgi.conf;
                        include fastcgi_params;
                }
        }

Cache is always cleared.

OS is Alpine Linux with Nginx and php-fpm.

/tmp folder is writable.

There are NO rewrites set in Nginx.

The version used is 1.30.0. The same error occurs with English and with 1.29.2.

If you need any additional info, please ask. Thank you.


The activists working to roll back progress on sexual and reproductive rights

Published 14 May 2018 by in New Humanist Articles and Posts.

A Vatican-inspired network known as Agenda Europe is organising to overturn existing laws on divorce, abortion and LGBT rights.

DDD Perth Survival Tips

Published 13 May 2018 by Derek in DDD Perth - Medium.

So, you are going to DDD Perth, it’s your first time at a conference and you need some help navigating the uncharted territory. Well, you have come to the right place. If you have 5 minutes to spare, please take the time to read this survival guide; it will hopefully help you get the most out of the conference.

TIP 1 #earlybirdie

At conferences, registration queues can be long, the waits can be lengthy and anxiety levels are often at their maximum.

At DDD Perth we endeavour to make this wait as fun and action-packed as possible. However, why not simply avoid the queues altogether and turn up early? Early birds not only avoid queues, they get first dibs on any swag being offered by sponsors (hint: the best laptop stickers go early), and they get to the coffee purveyors before anyone else, so they find themselves fully caffeinated and ready for the first keynote. (Note: at DDD we champion diversity, so we also encourage non-coffee addicts to stay ‘hydrated’ … )

TIP 2 #beprepared

Being prepared for a conference at first glance seems a bit odd — after all, it’s a day off work and being prepared sounds a bit like work!! But nooooo, being prepared will enhance your experience more than 3D glasses at the latest Avengers movie. Here is a checklist to help:

  1. Get your hands on the running order Before The Event ( website the night before is your best option as it’s the most up to date )
  2. Bring a BackPack — very important for swag
  3. Put your laptop in it — if you want to break out and build something cool
  4. Don’t forget a portable charger for your phone — for mobile PUBG perhaps
  5. Bring a small snack — see #stayfuelled
  6. Bring a jumper — a hoody for that Rami Malek in ‘Mr Robot’ hacker look

TIP 3 #planyourday

“All good plans never survive first contact with the enemy” — does not mean it is not a good idea to have one. Below is a quick list you can checkoff when preparing your DDD Perth plan. They may seem like no brainers but then again most plans are (Note: not a dig at project managers)

  1. Know the Venue’s location — be familiar with how you are going to get to DDD Perth (and home after the after party)
  2. Pick your talks — it’s good to have a chosen talk and a reserve talk for each time slot, cause life.
  3. Research your speakers — finding out about your chosen speakers adds to the talk as it gives you some possible insight into their talk.
  4. Get a good seat — be early to your chosen talks — standing for 40 min talks isn’t fun.

TIP 4 #volunteerlove

DDD is a not-for-profit event; it is run by people who give up their time and use personal compute cycles to make sure you have a most awesome time. Please be kind and generous with your time when dealing with them. They will be easily recognised by the multi-sponsor-emblazoned green t-shirts, grimacing and sweating, red faces… trust me, they are having fun. Without these giant Oompa Loompas this event wouldn’t happen. (Note: some sport luxurious beards and should be treated like any other animal at the zoo … )

TIP 5 #stayfuelled

There is a lot of information to take in and digest at DDD Perth, so staying fuelled throughout is important. Luckily we thought of that and the event will be fully catered; however, it’s always handy to bring a snack for when you feel sugar levels heading towards dangerous (sleepy) levels.

TIP 6 #schmoozing

We are all introverted nerds who, beset with imposter syndrome, don’t like venturing beyond our keyboards. This is true, but the beauty of DDD is that we are ALL introverted nerds beset by imposter syndrome (even the speakers!)

So, get out there and mingle, make connections. Start conversations with complete strangers, safe in the knowledge that they feel exactly like you… It’s a pretty unique situation, be a shame to waste it. Who knows what cool stuff you might find out about, just by talking to the introverted nerd beside you — it might even be me.

tl;dr — respect your peers and have an awesome time at DDD Perth !!


DDD Perth Survival Tips was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


Mediawiki VisualEditor doesn't show TemplateData

Published 11 May 2018 by Espen in Newest questions tagged mediawiki - Stack Overflow.

I have just installed a MediaWiki server with the extensions VisualEditor and TemplateData. VisualEditor is working just fine, and I can use TemplateData when editing a template. My problem is that when I edit a page with VisualEditor and try to use a template, the parameters I have put in TemplateData are not showing. I can manually enter the parameters, and then it works fine. If I look at "templates used" it is not showing there either. Does anyone know what the problem could be?

If more information is needed please tell me.

The versions I am running are: Ubuntu Server 18.04 LTS, MediaWiki: 1.31.0-rc.0, PHP: 7.2.3-1ubuntu1 (apache2handler), VisualEditor: 0.1.0 (e8c4bf2), Parsoid: 0.9.0, TemplateData: 0.1.2 (ab7c322)

I tried looking at the API calls; there seem to be three of them when I search for templates:

Request 1:

GET /w/api.php?action=query&format=json&prop=info%7Cpageprops&generator=prefixsearch&gpssearch=t&gpsnamespace=10&gpslimit=10&ppprop=disambiguation&redirects=true

Response 1:

{"batchcomplete":"","query":{"pages":{"6":{"pageid":6,"ns":10,"title":"Template:Test","index":1,"contentmodel":"wikitext","pagelanguage":"en","pagelanguagehtmlcode":"en","pagelanguagedir":"ltr","touched":"2018-05-12T14:04:05Z","lastrevid":50,"length":233}}}}

Request 2:

GET /w/api.php?action=templatedata&format=json&formatversion=2&titles=Template%3ATest&doNotIgnoreMissingTitles=1&lang=en

Response 2:

{"batchcomplete":true,"pages":{"6":{"title":"Template:Test","notemplatedata":true}}}

Request 3:

GET /w/api.php?action=templatedata&format=json&titles=Template%3ATest&lang=en&formatversion=2&doNotIgnoreMissingTitles=1&redirects=1

Response 3:

{"batchcomplete":true,"pages":{"6":{"title":"Template:Test","notemplatedata":true}}}

One "POST /w/api.php" request when the template is inserted. It has this form data:

action=visualeditor&format=json&paction=parsefragment&page=Rolf_Rolfsen&wikitext=%7B%7BTest%7D%7D&pst=true

With this response:

{"visualeditor":{"result":"success","content":"<span about=\"#mwt1\" typeof=\"mw:Transclusion\" data-parsoid='{\"pi\":[[]],\"dsr\":[0,8,null,null]}' data-mw='{\"parts\":[{\"template\":{\"target\":{\"wt\":\"Test\",\"href\":\"./Mal:Test\"},\"params\":{},\"i\":0}}]}'>\n\n</span><p about=\"#mwt1\"><br/>\n{{{testparameter}}}</p>"}}

"Template:Test" looks like this:

<noinclude>
<templatedata>
{
    "params": {
        "testparameter": {
            "description": "a testparameter",
            "example": "bla",
            "default": "lib"
        }
    },
    "description": "Test template"
}
</templatedata>
</noinclude>

{{{testparameter}}}

Change automatic email new account

Published 11 May 2018 by Antonio Andrés in Newest questions tagged mediawiki - Stack Overflow.

Using MediaWiki, I can't find the file where the text of the automatic new-account email is written. I would like to add some more information to it. Do you know which file it is? This is the text:

Someone created an account for your email address on XXXX (https://XXXX) named "XXXX", with password "XXXX". You should log in and change your password now. You may ignore this message, if this account was created in error.

EDIT

Just found it. It's a translation key: MediaWiki:Createaccount-text


Untitled

Published 10 May 2018 by Sam Wilson in Sam's notebook.

It is time I think (5AM on a Friday) to finally try to get the Flickr2Piwigo CLI script working. Small job before breakfast?


Restore a corrupted MediaWiki to a newer version of MediaWiki

Published 10 May 2018 by G_G in Newest questions tagged mediawiki - Webmasters Stack Exchange.

A working installation of MediaWiki was corrupted by a user "touching" all files under the directory structure, ending with all files having exactly the same permissions and modification dates. I'm not sure which of the above caused the wiki to stop working, but in fact that's what happened.

The MediaWiki installation is version 1.26 - currently out of support, I know.

Every single file of the wiki is available, and the directory structure is indeed intact. The wiki's DB is no longer available. However, the images/media (if indeed stored using the DB) are not as critical to the user as the actual page text content.

Is there a way to save this wiki? I've looked into restoring MediaWiki, but that assumes the wiki has been backed up properly, and this is not our case, unfortunately.

Thank you.


Why do we venerate CEOs despite their obvious inadequacies?

Published 9 May 2018 by in New Humanist Articles and Posts.

Q&A with Peter Bloom and Carl Rhodes, authors of "CEO Society: The Corporate Takeover of Everyday Life".

Written any good books recently?

Published 9 May 2018 by in New Humanist Articles and Posts.

Laurie Taylor on the struggles of literary aspiration.

Open PDF in MediaWiki

Published 9 May 2018 by Konrad in Newest questions tagged mediawiki - Stack Overflow.

In my mediawiki, I can link to internal PDFs using:

[[File:test.pdf]]

But if the user clicks the link, a single page for this file is opened like:

http://localhost/mediawiki/index.php/File:test.pdf

How can I make the PDF file open directly, with no intermediate page needed?


Introducing Updates for Load Balancers

Published 8 May 2018 by Tyler Crandall in The DigitalOcean Blog.

In February 2017, we launched Load Balancers, our highly available and managed load balancing service. Thousands of users rely on them to distribute traffic across Web and application servers.

Today, we’re announcing significant upgrades to Load Balancers, including Let's Encrypt integration and HTTP/2 support. All users now have access to these features at no additional cost and with no action required. In fact, all existing Load Balancers already have been upgraded.

Let’s Encrypt Integration

Load Balancers now support a simple method to generate, manage, and maintain SSL certificates using Let’s Encrypt.

With a couple of clicks, you can add a free Let’s Encrypt SSL certificate to your Load Balancer to secure your traffic and offload SSL processing. Certificates will automatically renew, so you don't have to worry about a thing.

HTTP/2

Load Balancers now also support the HTTP/2 protocol, which is a major update to HTTP/1.x designed primarily to reduce page load time and resource usage. You can find this under the Forwarding Rules dropdown in your Load Balancer settings.

Load Balancers can additionally terminate HTTP/2 client connections to act as a gateway to HTTP/1.x applications, allowing you to take advantage of HTTP/2's performance and security improvements without upgrading your backend servers.

Keep a look out for more performance-focused announcements in the coming months.

Our improved Load Balancers are available in all regions for the same price of $20/month. For more information about Load Balancers, please check out our website and these community articles:

Happy coding,
Tyler Crandall
Product Manager


Building the best group email for teams – an interview with our COO.

Published 8 May 2018 by David Gurvich in FastMail Blog.


Topicbox launched in August 2017 and since then we’ve been busy creating the best group email product we can for teams, whatever that team looks like.

We recently sat down with Helen Horstmann-Allen, Topicbox COO and Head of Product to talk about the history of Topicbox, the future of email and how group email can help a wide range of teams be more productive and organized than ever before.

Helen has worked in tech and email for more than 20 years and is still in love with email today.


Firstly, tell us a bit about your background in email and tech?

Helen: I got started in 1995 with Pobox, an email forwarding service (and now part of the FastMail family). It was the world’s first lifetime email address, and having ‘one address for life’ was our initial concept.

And like many companies, we started getting feedback from customers right away asking us for more features they wish we would add. And a very early feature request was group email, colloquially known as ‘listserves’.

The most popular open source product at that point in time was a program called Majordomo, and so majordomo.pobox.com launched in early 1996. We had tons of people sign up for it and we very quickly decided that it actually should be its own product, Listbox, which we launched late in 1996.

Listbox is still around today, but it went through many iterations. Initially it was just people who wanted to talk to each other, over email, which was so novel back then!

Over time we expanded the service offering to include email marketing and newsletters, but my first passion was always the group email product.

I think email is a tool that everybody has access to. It is one of the only pieces of technology that is almost truly universal. It’s accessible to almost everyone and when you talk about the value of email as a discussion tool it’s incredibly inclusive.

In 2015 I sold my business (which had created Pobox and Listbox) to FastMail and when we started talking about what we could do together, group email was one of the first things we both thought sounded like a really interesting idea.

Listserves have been around for a very, very, very long time. They are one of the foundational technologies of the internet, but nobody does them really well. And we thought what a great opportunity. If we could make a great group email solution, could we change the way people use all their email?

After quite a bit of work and many iterations we launched Topicbox. Topicbox was originally built to serve people at a pretty large size and as we started working on it, and we started testing it with people, we discovered that in fact even very small groups can get a lot of benefit out of group email.

What were some of the other challenges in creating Topicbox?

Helen: Email is one of those technologies that people love to hate. There’s a lot of challenges in it and most of them have to do not with the sending of email but the receiving and the organizing.

There are always the clients, or the products or the projects that you absolutely positively need to hear everything about the moment it happens. There are other things where you just need to kind of know stuff is going on but you don’t need to be interrupted by it all day long. And then there are plenty of projects that other people are working on, you need to be aware of maybe and you might need access to it in the future but you don’t need to actually see it now.

If Topicbox could take all that information – some of which you get today and you’re frustrated with; and lots of which you don’t get today and then you don’t have that information when you need it – and put it in one place that everyone in your organization can share, then how much of your team’s best knowledge could we make accessible to you?

How can Topicbox help teams communicate better?

Helen: In many ways using Topicbox is just like using your regular email. The only difference is instead of sending it one-to-one, or CCing a whole group of people, you send it to your group.

The group can be predefined, either by you when you create it, or by someone else who is running the project, and that’s really it. You still send the same information but you get it to the right audience.

I use Topicbox just by myself sometimes, just for one-to-one correspondence because I know that some day, someone besides me is going to need to read it and it is somewhere they can now get at it.

We use it for groups of two or three people. Instead of some people getting CCd on some messages and not others and they have an incomplete history … a Topicbox group lets you have a complete history for everyone to see, even if they end up joining a project later.

And then of course for big groups it makes perfect sense. You don’t want to have to have your ‘All company’ messages in one place, you want to have one central tool that you use for everything.

Where does Topicbox sit amongst a suite of modern communication tools such as more traditional email, instant messaging and CRMs?

Helen: Topicbox is something that almost every company who uses email can use. If in your company today you CC people then you probably should be using Topicbox.

Imagine Topicbox as a tool to put the control and the flow of messages into the hands of the people who receive it.

Chat is terrific, but it’s kind of like a replacement for a telephone call or walking by a group. When you have a really active chat platform it can feel like reading a transcript of everything that happened in your office over the course of a day. That’s too much information for lots of people. And it’s not a great way to get oversight over an organization.

If you feel overwhelmed by the volume of chat, moving your important discussions up to Topicbox is a wonderful solution and a great add-on to those existing tools.

But if you’re using regular email in almost any way, if you ever CC somebody, a Topicbox group is going to help you retain that information in a more useful, more searchable and more categorized way. And that helps when you bring more people on, it helps when you transition people out, it helps when you start up a new project and it helps when you retire that project.

You can say all that information is now gone to one contained place and we’ll start a new group to discuss a new thing and we don’t have to have our history backlog littered with information about old projects that you get when you’re re-using a lot of tools.

What’s your favourite Topicbox feature?

Helen: I love the Daily Summary! I love getting one message that I can quickly skim through and just see what other people are working on who aren’t necessarily my department, or aren’t necessarily in my team. It’s definitely not what I am working on but it gives me a little oversight into what everybody else has got going on, and [gives me insights] if something important is happening in an area of the company that I’m not dealing with.

I also love organization-wide search. Who hasn’t found themselves in the position of knowing that something has been discussed, not necessarily knowing where to look for it? Topicbox helps you find what you are looking for and then immediately places you in the context where you can also see what else has happened around there very, very quickly.

How else has Topicbox improved your own business communication?

Helen: One of my favourite places to use it is with clients. When you are dealing with any type of external organization you may have one, two or three different touchpoints there and you may also have multiple people on your staff who need to deal with them.

Creating a Topicbox group is a really easy way to make sure that everybody is on the same page all the time.

Do you use Topicbox through the web browser, mobile or your email program?

Helen: I started out using Topicbox almost exclusively through email and as time went on I found myself more often going to the website and using the Message Composer to respond to old threads.

And what’s great about that is then I know I can just go back to my inbox and throw away everything because now I know it’s in Topicbox so I don’t have to hold onto it myself.

What is planned for Topicbox in 2018?

Helen: We’ve got some big things planned so stay tuned! I can’t share anything just yet but we’re currently looking at more ways to make Topicbox even better for teams.

We also love hearing from our customers, so if you have any feedback or feature wishes please let us know via Twitter or contact us directly.


How to create global variable in Mediawiki?

Published 7 May 2018 by Erik L in Newest questions tagged mediawiki - Stack Overflow.

I'm trying to create a global variable that should be set on each refresh/pageview. I tried adding the following code in LocalSettings.php:

$randomNumber = rand(0,1);

and then tried to access it in one of my skin files with a simple echo, which did not work; it does work, however, if the variable is defined in the skin file itself. Does anyone have a clue about where I would define a global variable that is set whenever the page is reloaded? It should be user-specific, so if 2 users browse the website at the same time then it should be possible that one gets $randomNumber = 0 while the other user maybe gets $randomNumber = 1.
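
For what it's worth, variables assigned in LocalSettings.php do live in PHP's global scope, but they are not visible inside skin or extension functions unless pulled in with the global keyword. A minimal sketch, using a hypothetical $wgMyRandomNumber name (the $wg prefix is only the usual MediaWiki convention, not a requirement):

// In LocalSettings.php: evaluated once per request, so the value changes on every page load.
$wgMyRandomNumber = rand( 0, 1 );

// In a skin or extension function, file-level globals are not visible automatically;
// they have to be declared with the "global" keyword before use.
function showRandomNumber() {
    global $wgMyRandomNumber;
    echo $wgMyRandomNumber;
}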


Migrate Mediawiki to Confluence using UWC java regex failing to match with |thumb on line

Published 7 May 2018 by droidian in Newest questions tagged mediawiki - Stack Overflow.

I am trying to modify an existing Java regex line to compensate for cases where an embedded picture in MediaWiki is formatted with "|thumb|none" at the end of the attachment name. I am not very familiar with regex and have been struggling to make this work. When the program is run I need "[[file:send-rec.jpg|thumb|none]]" to be turned into "!send-rec.jpg!", but only when the regex detects that it is an image file (jpg|gif|bmp|png). There is a single line in the conversion file that I am able to modify:

Mediawiki.0402-re_file_to_images.java-regex=(?i)\[\[file:\s*([^\]\|\s]+)\s*\]\]{replace-with}!$1!

the "{replace-with}" is a placeholder.

Thanks for any help or guidance.


Berlin Underground

Published 6 May 2018 by jenimcmillan in Jeni McMillan.

UBahn

It’s 3.19 am. Berlin time. I am dancing in the underground. Sweet violin plays the strings of my heart. Ride of the Valkyries. My soul in question.


Our Journey towards Diversity

Published 6 May 2018 by Rebecca Waters in DDD Perth - Medium.

DDD Perth 2017 Speakers

DDD Perth has a tagline that reads:

DDD Perth is an inclusive non-profit event for the Perth software community. Our goal is to create an approachable conference that anyone can attend or speak at, especially people that don’t normally get to attend / speak at conferences.

It’s an admirable mission statement, if I do say so myself! DDD Perth aims to do this by adhering to a few golden DDD rules, and a few more that are DDD Perth-ified:

Focussing on creating a safe and inclusive environment where everyone is welcome

The last line I’ve called out here is one I want to spend a bit of time on. In 2016, the conference was into its second year. The inaugural conference in 2015 went well. The founders, Matt Davies and Rob Moore, might have had minor personal breakdowns and maxed out their personal credit cards to finance the venture, but on the whole they pulled it off and were ready for the challenge of a repeat performance.

So off on the conference train it went! The ticket price stayed as low as possible. The event was on a Saturday. The Call for Presentations was broad and the voting democratic. There was a code of conduct and it was followed by all attendees.

It was also a fantastic conference. Held at the Mercure Inn, the conference ran 3 tracks, was attended by 200 people and simply was a great day. From talks on Authentication to presenters drinking growlers on stage, the day was a success. This was the first time I managed to get to the conference and I was in awe of how enjoyable the day was.

The thing is, the safe and inclusive environment was welcoming for sure, but the agenda didn’t feature a single woman. It was something that the organisers saw, the attendees saw, the speakers saw. I’m also proud to say that in his opening address, Rob stood up and owned it. He made a pledge to do something about it for 2017.

In 2017, I joined the organising committee, as did many other talented and motivated folks.

When we came to reflect on the 2016 conference and look ahead to 2017, we set our sights on changing that.

We weren’t sure where to start, but we decided that the next step towards an inclusive conference was addressing gender diversity in speakers. There are other diversity and inclusion points to address, but as a group of volunteers, we recognised that if we were to be successful, we needed to concentrate our efforts, so we focussed on gender diversity.

As I talk about gender diversity in this post, I want to make mention of non-binary genders. In 2017, DDD Perth had no attendees or speakers that identified as anything except Female or Male. Our registration questionnaire allowed for ‘other’ genders to be included, but did not feature a free text field.

We asked our contacts in the Perth community who had experience in this area, and put that advice together with our own ideas.

We looked at two parts of forming an agenda; submissions and selection.

Submissions

When we looked at where our submissions were coming from, it was pretty evident early on that we didn’t have a big reach into the software community.

The number of submissions jumped from 48 in 2016 to 104 in 2017

Reaching Out

We reached out through our contacts to as many community groups and companies as we could. The feeling was that a bigger reach would naturally result in submissions from women.

We also spearheaded a grassroots campaign to increase diversity in submissions. We all knew of impressive women in software in Perth — who doesn’t — so we asked them to submit, and we asked them to convince their peers as well.

Michelle Sandford, one of Microsoft’s Top Social Media Influencers worldwide, had previously mentioned how much she enjoyed the conference, and agreed to help us promote submissions from women. Fenders, with the incomparable Mandy Michael, helped spread the word. There are many others who helped promote DDD Perth in 2017.

Flexibility in Submissions

Donna Edwards, State Delivery Manager at Readify, makes a great point in her 2017 DDD Perth talk about criteria and the willingness of women to apply. She’s talking about attraction strategies in the workplace; looking back we applied similar thinking to our conference submission process. (I should have put a spoiler alert on this paragraph as to our success in attracting women speakers, huh?)

We took a long look at our conference submission criteria. We looked from the eyes of a first time presenter, from women, from minority groups in our community, and ultimately we removed what we thought were two barriers to submission: unconscious bias and length of talk.

The thought was that for a first time presenter, a 45 minute talk could be intimidating. We introduced 20 minute talk lengths to encourage people who maybe thought they didn’t have enough content for a long talk to still consider submitting.

37% of 2017 talk submissions were 20 minutes in length

Unconscious Bias is an ugly thing to think about, isn’t it? Still, we forced ourselves to think on this point. Quite possibly, the people voting recognised some names and voted for those talks based on notoriety (be that fame, unconscious gender bias, unconscious race bias…the list goes on). We recognised that this could lead to our conference getting old very quick, should we only hear from the same 10 presenters every year. We decided that we would introduce anonymous voting.

I mention this whilst talking about Submissions, because we said upfront on the submission page that all identifying information would be hidden come voting time. We felt that in order to remove the barrier, we needed to be upfront about our intent. Being transparent about our process was important to us.

21% of submissions in 2017 included a presenter who identified as female

Selection

DDD Perth, as with all DDD conferences, has a democratically chosen agenda. What this means is that anyone is able to influence the agenda during our one week voting window.

I’ve already mentioned our anonymous voting changes. We stripped identifying information such as names and pronouns, but left experience and credentials.

However, we didn’t prohibit people mentioning the title of their talk on social media. This was a point of discussion for us; would it detract from the anonymous voting? Would it unfairly advantage those people with a following already?

Look, the answer to that is Yes. There was an advantage for people who vocally encouraged their networks to vote for them. This worked for both male and female speakers. However we felt that as well as being impossible to police, the promotion of the conference was a good byproduct. It also allowed our allies, such as Michelle, to promote the submissions by women she admired.

When it came time to tally the votes, we found that the process employed had yielded positive results.

A quarter of speakers in 2017 identified as female

I also want to mention our Code of Conduct. On securing speakers, we required that each person explicitly agree to our enforceable Code of Conduct. We discussed what to do on the day if this became an issue and every volunteer knew what to do and who to contact.

Uncomfortable as confrontation is, I’m glad we spent time discussing the possibility of expelling people from the conference. It hasn’t been required yet, but that doesn’t take away from the need to be prepared.

Not there yet

As you can see in the above statistics, DDD Perth is improving, but we’re not there yet. That’s why in 2018, we have a dedicated champion, Matt Ward, to help focus our efforts, not only in gender diversity.

We’re concentrating on achieving better representation in gender, seniority, ethnicity, accessibility and role. This isn’t an exhaustive list, nor will 2018 be the perfect conference. We just hope it’s another step forward.


Our Journey towards Diversity was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


How to create sitemaps with only non-www URLs in MediaWiki website?

Published 5 May 2018 by Arnb in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I usually prefer the non-www URL for my website (http://example.com, not http://www.example.com). While submitting a site to Google one needs to set a preferred domain by choosing either the www or the non-www version of the URL. Then, when submitting the sitemap, only the sitemap containing the preferred version of the URLs should be submitted to Google Search Console.

I have a website run by the MediaWiki software. It uses a PHP script (named generateSitemap.php, bundled with the MediaWiki installation package) to create the sitemaps. One can set cron jobs to automate the process of updating the sitemap at regular intervals.

The problem is that my sitemaps are being generated containing the www version of the page URLs.

How should I instruct the programs to generate sitemaps without www in the URLs?
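
One thing worth checking (an assumption on my part, not a verified fix): generateSitemap.php expands its URLs from the server name configured in LocalSettings.php, so making sure that setting uses the non-www form may be enough. A minimal sketch:

// In LocalSettings.php: use the bare domain as the (canonical) server name,
// so URLs built by maintenance scripts such as generateSitemap.php omit the www.
$wgServer = 'http://example.com';
$wgCanonicalServer = 'http://example.com';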


The anatomy of an AtoM upgrade

Published 4 May 2018 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday we went live with our new upgraded production version of AtoM.

We've been using AtoM version 2.2 since we first unveiled the Borthwick Catalogue to the world two years ago. Now we have finally taken the leap to version 2.4.

We are thrilled to benefit from some of the new features - including the clipboard, being able to search by date range and the full width treeview. Of course we are also keen to test the work we jointly sponsored last year around exposing EAD via OAI-PMH for harvesting.

But what has taken us so long you might ask?

...well, upgrading AtoM has been a new experience for us and one that has involved a lot of planning behind the scenes. The technical process of upgrading has been ably handled by our systems administrator. Much of his initial work behind the scenes has been on 'puppetising' AtoM to make it easier to manage multiple versions of AtoM going forward. In this post though I will focus on the less technical steps we have taken to manage the upgrade and the decisions we have made along the way.

Checking the admin settings

One of the first things I did when I was given a test version of 2.4 to play with was to check out all of the admin settings to see what had changed.

All of our admin settings for AtoM are documented in a spreadsheet alongside a rationale for our decisions. I wanted to take some time to understand the new settings, read the documentation and decide what would work for us.

Some of these decisions were taken to a meeting for a larger group of staff to discuss. I've got a good sense of how we use AtoM but I am not really an AtoM user so it was important that others were involved in the decision making.

Most decisions were relatively straightforward and uncontroversial but the one that we spent most time on was deciding whether or not to change the slugs...

Slugs

In AtoM, the 'slug' is the last element of the url for each individual record within the catalogue - it has to be unique so that all the urls go to the right place. In previous versions of AtoM the slugs were automatically generated from the title of each record. This led to some interesting and varied urls.


Slugs are therefore hard to predict ...and it is not always possible to look at a slug and know which archive it refers to.

This possibly doesn't matter, but could become an issue for us in the future should we wish to carry out more automated data manipulation or system integrations.

AtoM 2.4 now allows you to choose which fields your slugs are generated from. We have decided that it would be better if ours were generated from the identifier of the record rather than the title. The reason being that identifiers are generally quite short and sweet and of course should be unique (though we recently realised that this isn't enforced in AtoM).

But of course this is not a decision that can be taken lightly. Our catalogue has been live for 2 years now and users will have set up links and bookmarks to particular records within it. On balance we decided that it would be better to change the slugs and do our best to limit the impact on users.

So, we have changed the admin setting to ensure future slugs are generated using the identifier. We have run a script provided by Artefactual Systems that changed all the slugs that are already in the database. We have set up a series of redirects from all the old urls of top level descriptions in the catalogue to the new urls (note that having had a good look at the referrer report in Google Analytics it was apparent that external links to the catalogue generally point at top level descriptions).

Playing and testing

It was important to do a certain amount of testing and playing around with AtoM 2.4 and it was important that it wasn't just myself who did this - I encouraged all my colleagues to also have a go.

First I checked the release notes for versions 2.3 and 2.4 so I had a good sense of what had changed and where I should focus my attention. I was then able to test these new features and direct colleagues to them as appropriate for further testing or discussion.

While doing so, I tried to think about whether any of these changes would necessitate changes in our workflows and processes or updates to our staff handbook.

As an example - it was noted that there was a new field to record occupations for authority records. Rather than letting individuals decide how to use this field, it is important to agree an institutional approach and consider an appropriate methodology or taxonomy. As it happens, we have decided not to use this field for the time being and this will be documented accordingly.

Assessing known bugs

Being a bit late to the upgrade party gives us the opportunity to assess known bugs and issues with a release. I spent some time looking at Artefactual's issues log for AtoM to establish whether any of them were going to cause us major problems or required a workaround to be put in place.

There are lots of issues recorded and I looked through many of them (but not all!). Fortunately, very few looked like they would have an impact on us. Most related to functionality we don't utilise - such as the ability to use AtoM with multiple institutions or translate it into multiple languages.

The one bug that I thought would be irritating for us was related to the accessions counter which was not incrementing in version 2.4. Having spent a bit of time testing, it seemed that this wasn't a deal breaker for us and there was a workaround we could put in place to enable staff to continue to create accession records with a unique identifier relatively easily.

Testing local workarounds

Next I tested one of the local workarounds we have for AtoM. We use a CSS print stylesheet to help us to generate an accessions report to send donors and depositors to confirm receipt of an archive. This still worked in the new version of AtoM with no issues. Hoorah!

Look and feel

We gave a bit of thought to how AtoM should be styled. Two years ago we went live with a slightly customised version of the Dominion theme. This had been styled to look similar to our website (which at the time was branded orange).

In the last year, the look and feel of the University website has changed and we are no longer orange! Some thought needed to be given to whether we should change the look of our catalogue now to keep it consistent with our website. After some discussion it was agreed that our existing AtoM theme should be maintained for the time being.

We did however think it was a good idea to adopt the font of the University website, but when we tested this out on our AtoM instance it didn't look as clear...so that decision was quickly reversed.

Usability testing

When we first launched our catalogue we carried out a couple of rounds of user testing (read about it here and here) but this was quite a major piece of work and took up a substantial amount of staff time.

With this upgrade we were keen to give some consideration to the user experience but didn't have resource to invest in more user testing.

Instead we recruited the Senior User Experience Designer at our institution to cast his eye over our version of AtoM 2.4 and give us some independent feedback on usability and accessibility. It was really useful to get a fresh pair of eyes to look at our site, but as this could be a whole blog post in itself I won't say any more here...watch this space!

Updating our help pages

Another job was to update both the text and the screenshots on our static help pages within AtoM. There have been several changes since 2.2 and some of these are reflected in the look and feel of the interface. 

The advanced search looks a bit different in version 2.4 - here is the refreshed screenshot for our help pages

We were also keen to add in some help for our users around the clipboard feature and to explain how the full width treeview works.

The icons for different types of information within AtoM have also been brought out more strongly in this version, so we also wanted to flag up what these meant for our users.


...and that reminds me, we really do need a less Canada-centric way to indicate place!

Updating our staff handbook

Since we adopted AtoM a few years ago we have developed a whole suite of staff manuals which record how we use AtoM, including tips for carrying out certain procedures and information about what to put in each field. With the new changes brought in with this upgrade, we of course had to update our internal documentation.

When to upgrade?

As we drew ever closer to our 'go live' date for the upgrade we were aware that Artefactual were busy preparing their 2.4.1 bug fix release. We were very keen to get the bug fixes (particularly for that accessions counter bug that I mentioned) but were not sure how long we were prepared to wait.

Luckily with helpful advice from Artefactual we were able to follow some instructions from the user forum and install from the GitHub code repository instead of the tarball download on the website. This meant we could benefit from those bug fixes that were already stable (and pull others to test as they become available) without having to wait for the formal 2.4.1 release.

No need to delay our upgrade further!

As it happens it was good news we upgraded when we did. The day before the upgrade we hit a bug in version 2.2 during a re-index of elasticsearch. Nice to know we had a nice clean version of 2.4 ready to go the next day!

Finishing touches

On the 'go live' date we'd put word around to staff not to edit the catalogue while we did the switch. Our systems administrator got all the data from our production version of 2.2 freshly loaded into 2.4, ran the scripts to change the slugs and re-indexed the database. I just needed to do a few things before we asked IT to do the Domain Name System switch.

First I needed to check all the admin settings were right - a few final tweaks were required here and there. Second I needed to load up the Borthwick logo and banner to our archival institution record. Thirdly I needed to paste the new help and FAQ text into the static pages (I already had this prepared and saved elsewhere).

Once the DNS switch was done we were live at last! 

Sharing the news

Of course we wanted to publicise the upgrade to our users and tell them about the new features that it brings.

We've put AtoM back on the front page of our website and added a news item.

Let's tell the world all about it, with a catalogue banner and news item

My colleague has written a great blog post aimed at our users and telling them all about the new features, and of course we've all been enthusiastically tweeting!


...and a whole lot of tweeting

Future work

The upgrade is done but work continues. We need to ensure harvesting to our library catalogue still works and of course test out the new EAD harvesting functionality. Later today we will be looking at Search Engine Optimisation (particularly important since we changed our slugs). We also have some remaining tasks around finding aids - uploading pdfs of finding aids for those archives that aren't yet fully catalogued in AtoM using the new functionality in 2.4.

But right now I've got a few broken links to fix...


Integrate wiki geoData into custom site

Published 4 May 2018 by RayTest in Newest questions tagged mediawiki - Stack Overflow.

I need to retrieve geographic data like countries, cities, latitude and longitude from Wikipedia into a custom site, which will have a search-with-filter feature, using PHP & PostgreSQL. I found that Wikipedia's GeoData should be able to fetch the required data, but I don't know how to implement this in my custom site. Please help me get started. :)
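
For illustration, a minimal sketch (not a tested integration) of pulling coordinates from Wikipedia's GeoData extension through the public action API, using the list=geosearch module, which returns pages near a coordinate together with their latitude and longitude:

// A sketch assuming the geosearch module on en.wikipedia.org:
// fetch pages within 10 km of an example coordinate and print title, lat, lon.
$url = 'https://en.wikipedia.org/w/api.php?' . http_build_query( array(
    'action'   => 'query',
    'list'     => 'geosearch',
    'gscoord'  => '48.8566|2.3522', // example coordinate (Paris)
    'gsradius' => 10000,
    'gslimit'  => 10,
    'format'   => 'json',
) );

$data = json_decode( file_get_contents( $url ), true );
foreach ( $data['query']['geosearch'] as $page ) {
    printf( "%s (%f, %f)\n", $page['title'], $page['lat'], $page['lon'] );
}

The decoded rows could then be inserted into a PostgreSQL table and filtered with ordinary SQL on the custom site.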

Thank you.


Mediawiki Hiding extra URL parameter

Published 3 May 2018 by user3103115 in Newest questions tagged mediawiki - Stack Overflow.

I have a MediaWiki extension that requires a "dbid" parameter in the URL to work. URLs for normal pages are in /w/Main_Page format, and where the extension is supposed to be launched, /w/Page?dbid=1234. Now, I have been trying to hide dbid= with a / (slash). I tried setting up .htaccess to

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/(.*)$ index.php?title=$1&dbid=$2 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L]

But it only works as long as I don't enable short URLs in the wiki ($wgArticlePath = "{$wgScriptPath}/$1";). If I do so, the wiki keeps thinking "Page/1234" is the title. I have found this https://www.mediawiki.org/wiki/Manual:Entry_point_routing but I have no idea how to use it.

I made a simple extension with just

$wgHooks['WebRequestPathInfoRouter'][] = 'ePathRouter';
function ePathRouter( $router ) {
 $router->addStrict( "/Page/100650030", array( 'title' => 'Main_Page' ) );
 return true;
}

to test whether it redirects "/Page/100650030" to "Main_Page", but it doesn't work at all.

So my question is, how do I use this hook?


Web Inventor, Tim Berners-Lee, and W3C Track at The Web Conference 2018

Published 3 May 2018 by Coralie Mercier in W3C Blog.

W3C once again joined The Web Conference, previously known as the WWW Conference, for The Web Conference 2018, which took place last week in Lyon and featured a W3C track, exhibition booth, and tutorials.

W3C Tutorials took place on Monday and Tuesday of the week with a focus on Data Visualization, Audio on the Web, and Media advances on the Web Platform.
Between Wednesday and Friday, we welcomed conference attendees at the W3C booth.
At the W3C Track on Friday, experts from W3C Members and W3C Team broached topics including new trends on the Web Platform, WebAssembly, WebXR, Web of Things, Social Web Protocols and the foundations of trust as well as Intelligent Search on the Web. The Next Big Thing of Web: Future Web Outlook session ended with a panel about the future of the Web and gave way to active interaction with the audience on hot topics they care about for the Web.

The conference was a huge success, with 2,300 participants from more than 60 countries; its program was packed with interesting topics and the line-up of speakers and panelists was outstanding. The W3C Tutorials and W3C Track were very well attended.

Our Director, Tim Berners-Lee participated in the Thursday plenary panel on “Artificial Intelligence & the Web” with five other distinguished panelists from Amazon, Google, eBay, Facebook and Southampton University after the keynote on Conversational AI for Interacting with Digital and Physical World by Ruhi Sarikaya, Director of Applied Science, from the Amazon Alexa team.

You can watch the panel: AI and the future of the Web and the Internet (Sir Tim Berners-Lee (MIT, W3C), Antoine Bordes (Facebook, FAIR Paris), Vinton Cerf (Google), Kira Radinsky (eBay), Ruhi Sarikaya (Amazon), Prof. Dame Wendy Hall (Southampton University) – Chair), as well as the closing address by Mounir Mahjoubi, French Secretary of State for Digital, who shared excellent and insightful remarks on the state of AI and the role of the State on AI.

“This week brings together so many vital areas – industry, research, and those continuing to build the Web. Later this year we will reach a tipping point where 50% of the people in the world are on the Web. At this time of uncertainty and concerns about how the Web can be used for ill as well as good, we may ask what kind of Web this next 50% will be joining. So it is heartening to be amongst so many people working to make the Web better. At W3C, we’re working on innovative technology and Web standards. We ensure the process remains open, fair, international, fosters accessibility, and with a level-playing field. For such powerful technologies require our standards to be developed with trust, consensus and in the open.”

— Sir Tim Berners-Lee, MIT/World Wide Web Consortium, Inventor of the Web.

Tim Berners-Lee talking at The Web Conference 2018

The full list of videos and interviews is available from The Web Conference’s channel. Here are the links to the videos of this year’s three keynotes:


"War is the way Americans learn geography"

Published 3 May 2018 by in New Humanist Articles and Posts.

Q&A with peace activist Medea Benjamin, author of the new book "Inside Iran".

New subscription form

Published 3 May 2018 by Pierrick Le Gall in The Piwigo.com Blog.

The Piwigo.com subscription form gets a full redesign, with 3 goals in mind: improve VAT management, give a choice between Individual and Enterprise plans, and make it possible to subscribe for several years.

1) VAT and european laws

The Piwigo.com hosting service is managed by a French company. Clients come from all over the world, from more than 70 countries. Depending on your country, VAT (Value Added Tax) does not apply in the same way. Furthermore, if you or your organisation has a VAT number, other rules apply. No need to go further into technical details: keep in mind that we need to collect a little data to accurately remit VAT to the appropriate countries.

Piwigo.com subscription: we need a few data from you!

As long as we don’t have this data, the subscription form will keep asking for it.

Piwigo.com subscription: give your country and your VAT number, if you have one.

Side note: this new rule dates back to 2016 and we were not always asking for your country/VAT number. Until now, we guessed your country based on your IP address and we considered that Enterprise clients had a VAT number and Individual clients had none. From now on we won’t have to guess anymore!

2) Choice between Individual and Enterprise plans

Piwigo.com subscription: are you an individual or an organisation?

In its early days Piwigo.com only sold a 39-euros-per-year plan, for individuals. Piwigo.com Enterprise plans became official in 2017, but Enterprise clients still had to contact us to create an order. The new subscription form gives organisations the power to create Enterprise orders by themselves, without needing help from Piwigo.com support.

Piwigo.com subscriptions: several options for the Enterprise plan

3) Directly subscribe for 2 or 3 years

Piwigo.com subscription for Individual plan: select your duration

Interesting for Piwigo.com: it will obviously give us the opportunity to increase our available cash. More cash in the bank makes some investments possible, like recruitment.

Interesting for clients: by taking 2 years, you get a 10% discount, or 8€ off your bill. By taking 3 years, the discount is 20%, meaning 23€ kept in your pocket. As our clients very often stay for 5 years or more, we think it is a good deal!


New tool: Wikimedia APT browser

Published 2 May 2018 by legoktm in The Lego Mirror.

I've created a new tool to make it easier for humans to browse Wikimedia's APT repository: apt.wikimedia.org. Wikimedia's servers run Debian (Ubuntu is nearly phased out), and for the most part use the standard packages that Debian provides. But in some cases we use software that isn't in the official Debian repositories, and distribute it via our own APT repository.

For a while now I've been working on different things where it's helpful for me to be able to see which packages are provided for each Debian version. I was unable to find any existing, reusable HTML browsers for APT repositories (most people seem to use the commandline tools), so I quickly wrote my own.

Introducing the Wikimedia APT browser. It's a short (less than 100 lines) Python and Flask application that reads the Packages/Release files that APT uses and presents them in a simple HTML page. You can see the different versions of Debian and Ubuntu that are supported, the different sections in each one, and then the packages and their versions.
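(For a sense of the general approach, here is a hedged sketch, not the actual apt.wikimedia.org code: a minimal Flask app that fetches a Packages file and renders package names and versions as HTML. The repository URL below is a placeholder.)

# Minimal sketch of an APT repository browser; illustrative only, not the
# real apt.wikimedia.org code. REPO_URL is a placeholder.
import urllib.request
from flask import Flask, render_template_string

app = Flask(__name__)

REPO_URL = "https://apt.example.org/dists/stretch-wikimedia/main/binary-amd64/Packages"

TEMPLATE = """
<h1>Packages</h1>
<table>
  <tr><th>Package</th><th>Version</th></tr>
  {% for name, version in packages %}
  <tr><td>{{ name }}</td><td>{{ version }}</td></tr>
  {% endfor %}
</table>
"""

def parse_packages(text):
    """Parse an APT Packages file: RFC 822-style stanzas separated by blank lines."""
    packages, name, version = [], None, None
    for line in text.splitlines() + [""]:
        if line.startswith("Package:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Version:"):
            version = line.split(":", 1)[1].strip()
        elif not line.strip():  # a blank line ends a stanza
            if name and version:
                packages.append((name, version))
            name, version = None, None
    return packages

@app.route("/")
def index():
    with urllib.request.urlopen(REPO_URL) as response:
        text = response.read().decode("utf-8")
    return render_template_string(TEMPLATE, packages=parse_packages(text))

if __name__ == "__main__":
    app.run()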

There's nothing really Wikimedia-specific about this, it would be trivial to remove the Wikimedia branding and turn it into something general if people are interested.

The source code is published on Phabricator and licensed under the AGPL v3, or any later version.


Simplify Container Orchestration: Announcing Early Access to DigitalOcean Kubernetes

Published 2 May 2018 by Jamie Wilson in The DigitalOcean Blog.

Simplify Container Orchestration: Announcing Early Access to DigitalOcean Kubernetes

Over the last 18 months, we’ve delivered many cloud primitives to serve developers and their teams in our unique DO-Simple way. We introduced Load Balancers, Monitoring and Alerts, Cloud Firewalls, Spaces, CPU-Optimized Droplets, a new Dashboard, and new Droplet pricing plans. We extended the availability of Block Storage to all regions. All of these primitives make it easier to go from an idea to production without the overhead and complexity of managing cloud infrastructure.

Today, we’re excited to build on those primitives and announce DigitalOcean Kubernetes, a simple and cost-effective way to deploy, orchestrate, and manage container workloads. Deploying workloads as containers provides many benefits for developers, from rapid deployment to isolation and security. But orchestrating those workloads comes with additional layers of complexity that can be difficult for development teams to manage.

Kubernetes has become the leading open source platform for orchestration, with thousands of contributors in the last year alone. DigitalOcean has been running large workloads on Kubernetes over the past two years, and we’re excited to bring our learnings and expertise to our customers.

We designed DigitalOcean Kubernetes with developers and their teams in mind, so you can save time and deploy your container workloads without needing to configure everything from scratch. Automatic deployment of load balancers, block storage, firewalls, ingress controllers, and more makes configuring your cluster on DigitalOcean as simple as deploying a Droplet.

We understand having your data close to your cluster is essential, so you’ll have the option to deploy a private container registry to your cluster with no configuration, and store the images on DigitalOcean Spaces.

In addition to offering Kubernetes on our platform, we are also upgrading our CNCF membership to Gold. We’re committed to contributing to and supporting the open source technologies around containers, and are looking forward to working with CNCF members to continue the evolution of these and related technologies.

The DigitalOcean Kubernetes Early Access Program sign-up starts today, and access for select users begins next month. If you’re part of the program, your cluster will be free through September 2018.

Simplify Container Orchestration: Announcing Early Access to DigitalOcean Kubernetes

We’re also at KubeCon EU this week, and we’d love to share what we’ve been working on! You can find us in Hall C, at booth number G-C06 at these times:

Jamie Wilson,
Sr. Product Manager


W3C and the W3C Australia Office bring you a GREAT Smart Cities Tour!

Published 1 May 2018 by J. Alan Bird in W3C Blog.

banner for Future of the Web: Data Drives the Smart City

If you're in Australia, or have colleagues who are, and haven't registered for this great event, you need to do it soon! We've got a great panel of speakers looking at a topic that is HOT in Australia! The good news is that if you're a W3C Member, this event is free – what a deal!

Our world is increasingly being shaped by the vast amount of data being produced in every aspect of our lives. As more devices get connected through the Internet of Things (IoT), harnessing big data in an integrated way can offer valuable insights that can help achieve smart city goals. This comes with important and interesting challenges to solve in order to actualise the smart city vision. Challenges include data collection, integration and privacy.

The World Wide Web Consortium (W3C), in partnership with the Australian National University, invites you to Future of the Web: Data Drives the Smart City. The event explores the challenges and the progress made in the technology, and the underpinning standards framework, needed to enable smart cities. You will hear from leading experts in the field on how these challenges are being tackled.

Dates

Topics

Topics to be addressed include perspectives from Government, tech industry leadership, Web standards for spatial data and city sensing, technical solutions to privacy management, and smart grid futures. A panel session will discuss capacity building for smart cities.

Speakers

Speakers include Dr Ian Oppermann (NSW Chief Data Scientist), Dr Ole Nielsen (ACT Chief Digital Officer), J. Alan Bird (W3C Global Business Development Lead), Dr Mukesh Mohania (IBM Distinguished Engineer in IBM Research), Dr David Hyland-Wood (Blockchain Protocol Architect, Consensys), Dr Lachlan Blackhall (Entrepreneurial Fellow and Head, Battery Storage and Grid Integration Program), Dr Kerry Taylor (Chair, W3C Spatial Data on the Web), Dr Peter Christen (Professor, Data Mining and Matching, ANU), Christine Cowper (Principal Consultant with Information Integrity Solutions), and Dr Armin Haller (W3C Office Manager, ANU).

Schedule

Coffee/tea on arrival, morning tea and lunch will be provided.

See the program details.

Registration: $290. Free to W3C members and ANU and alumni.


Episode 7: Dan Barrett

Published 1 May 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Dan Barrett is a longtime developer and project manager who worked for fifteen years at Vistaprint, where he created, and oversaw the development and maintenance of, the MediaWiki installation. He has also written numerous technical books for O'Reilly, including the 2008 book MediaWiki: Wikipedia and Beyond.   Links for some of the topics discussed:  

Mediawiki - How to disable diacritic/special characters in the URL

Published 1 May 2018 by Małgorzata Mrówka Zielona Mrówka in Newest questions tagged mediawiki - Webmasters Stack Exchange.

My MediaWiki site is in Polish, and MediaWiki converts page titles to URLs 1:1 – meaning that a page named Truteń will have a URL containing Truteń, with the Polish diacritic character "ń".

This causes a 400 error when I try to preview pages that contain Polish diacritic letters (ąęćśźżółń) in the Facebook Sharing Debugger. Those pages also do not display information from extensions (excerpts from pages while sharing). Miraculously, the main page works, even though it has a diacritic character in its URL ("Strona główna").

How can I disable those characters? I should add that I'm an almost total newbie and am using OVH services (the cheapest plan they had). Here is an example page - http://wiki.mrowki.ovh/index.php?title=Truteń


GLAM Blog Club – May 2018

Published 1 May 2018 by Nik McGrath in newCardigan.

Andrew kicked off on the theme of control with Joy Division's She's Lost Control. Play the track and read on… And she gave away the secrets of her past… Andrew argues for and against copyright in the case of researchers accessing special collections. Some libraries put control measures in place that prevent digital copies of donor material being made without donor permission. Should libraries take a risk, like some do, and place the onus, or control, back in the hands of the user to do the right thing, making digital copies for reference but trusting users not to break copyright?

Phillipa, a PhD student, took time off from her PhD to care for her daughter who was diagnosed with Stage 4 lymphatic cancer. “I am outwardly an organised student, but library books were the last thing on my mind as I struggled to appear normal and in control”. The tale of 23 Overdue Books is about feeling out of control, receiving a $1000 library fine, and ultimately the compassion of a librarian who waived the fine.

Michelle’s blog Controlling your online data and privacy gives some fantastic tips about how to protect your privacy online. “You don’t need these companies to control all your data for you…”

Control your files, Niamh states: “Control over your files does take a little time to set up, but the benefits are that your information will be searchable, backed up, restorable and reusable.” “Try to leave your files in a state that the future version of you can use.” Walk the talk.

Hugh is the technical genius behind newCardigan’s systems. In his blog Building our own house Hugh describes the journey to setup systems protecting the privacy of our members and participants. “We’re not quite running our own servers in the spare room, but I’m pretty happy with how far we’ve managed to move towards running our own systems so we don’t force members and participants to hand over data to third parties just so they can socialise with other GLAM people. As much as possible, it’s newCardigan members, or at worst, newCardigan as an organisation, in control.”

Control those tabs, Kathryn gives a guide on how to set up preferences in Chrome for websites that you access daily.

Sam’s blog Getting to “good enough”: thoughts on perfectionism is an honest analysis and reflection on the negative aspects of perfectionism in the workplace.

Libraries becoming the new park, Melly argues for the need for librarians and library technicians to continue to manage public libraries, pushing back against the trend of using library spaces for other purposes and understaffing them on the assumption that customers can serve their own needs within the library. "If public parks cannot control human behaviour, what about libraries without staff?"

Amy challenges us to control our present, future, environment, thoughts, voice and relationships in her blog Taking control of the small things.

Want to be a happier librarian? You’re in control! Anne believes that happiness is something we control: “It doesn’t help to get upset or anxious about things you can’t control so focus on the things that you can.”

Sarah's Control of GLAMR information … in my inbox is all about taking control of the subscriptions that overload us with information in our inbox, in this case GLAMR information: what is still relevant, and what does she already receive in other ways?

GLAM Blog Club – Control, Kara acknowledges that control of her career is difficult to attain, but perhaps it's important to celebrate the small wins. I think most people often feel out of control of their career, but joining the conversation here is definitely a win! Thanks Kara.

In my blog Democratisation in action, I argue that: "Although it's important that archivists maintain control of the systems that ensure items are trackable and findable, it is also important that archivists enable access. Raising the profile of archival collections and awareness of the content available within collections provides more opportunities for individuals from diverse backgrounds to interpret archival material in new and interesting ways. This is democratisation in action."

Matthew’s Custodial Control of Digital Assets makes a compelling argument for case by case consideration in collecting born digital items: “…you cannot always control what you receive when it comes to digital collections. Standards are there for guidance and sometimes decisions need to be made on whether to allow something into the collection that does not meet them. The intrinsic value of the object, its uniqueness and rarity may very well trump the technical requirements for digital collecting. When dealing with born-digital photographs for example, where some institutions prefer a Camera Raw or uncompressed TIFF file format, a low resolution JPEG would also be accepted under the right circumstances.”

The terror and value of asking for feedback, Stacey gives advice that feedback is valuable, so it’s worth giving up control by: “Putting things out there and asking for feedback…”

Queerying the catalogue: Control, classification, chaos, curiosity, care and communities, Clare is “reflecting on the problematic histories of classification in librarianship and in psychology, particularly in relation to LGBTIQA+ communities, my complicated relationship with labels, and the power of play to help librarians become more comfortable with letting go of at least some of our control and authority, find courage in chaos, embrace fluidity, and change the system.”

Associate, collocate, disambiguate, infuriate, Alissa on her thoughts on “…relinquishing some of my control over the form and display of titles within a catalogue.”

GLAM Blog Club – Control, Rebecca questions: “So what happens when you put a control freak into the world of museums?” Weekly goal lists, problem solving skills and throwing yourself into the deep end, will help you no end.

Authority Control – Can I haz it? Clare on the world of cataloguing and control vocabs, putting theory into practice.

Thank you for your blogs on control, it proved to be a popular theme!

Have you ever walked into a gallery and cried at the sight of a painting? Felt waves of emotion reading a letter in the archives? Have you reacted passionately about something you care deeply about in a meeting at work?

Passion is our theme for GLAM Blog Club this month.

Some might argue that passion is the opposite of control. We anticipate a lovely contrast between last month and this month’s blogs.

Please don’t forget to use the tag GLAM Blog Club in your post, and #GLAMBlogClub for any social media posts linking to it. If you haven’t done so yet, remember to register your blog at Aus GLAM Blogs. Happy blogging!


In a word: hub

Published 30 Apr 2018 in New Humanist Articles and Posts.

Michael Rosen's column on language and its uses.

Wikibase of Wikibases

Published 30 Apr 2018 by addshore in Addshore.

The Wikibase Registry was one of the outcomes of the first in a series of federated Wikibase workshops organised in partnership with the European Research Council.

The aim of the registry is to act as a central point for details of public Wikibase installs hosted around the web. Data held about each install currently includes the home page URL, the query service frontend URL and the SPARQL API endpoint URL (if a query service exists).

During the workshop an initial data set was added, and this can be easily seen using the timeline view of the query service and a query that is explained within this post.

Setting up the Wikibase install

The registry is running on the WMF Cloud infrastructure using the wikibase and query service docker images on a single m1.medium VPS with 2 CPUs, 4GB RAM and 40GB disk.

The first step was to request the creation of a project for the install. The current process for this is to create a Phabricator ticket, and that ticket can be seen here.

Once the project was created I could head to Horizon (the OpenStack management interface) and create a VPS to host the install.

I chose the m1.medium flavour for the 4GB memory allowance. As is currently documented in the wikibase-docker docker-compose example README, the setup can fail with less than 3GB of memory due to the initial spike in memory usage when setting up the collection of Docker services.

Once the machine was up and running I could install docker and docker-compose by following the docker docs for Debian (the OS I chose during the machine creation step).

With docker and docker-compose installed it was time to craft my own docker-compose.yml file based on the example currently present in the wikibase-docker repo.

The key environment variables to change were:

The docs for the environment variables are visible in the README for each image used for the service. For example, the ‘wikibase’ image docs can be found in this README.
Once the file was created it was time to start running the services using the following command:

user@wbregistry-01:~/wikibase-registry# docker-compose up -d
Creating network "wikibase-registry_default" with the default driver
Creating volume "wikibase-registry_mediawiki-mysql-data" with default driver
Creating volume "wikibase-registry_mediawiki-images-data" with default driver
Creating volume "wikibase-registry_query-service-data" with default driver
Creating wikibase-registry_mysql_1 ... done
Creating wikibase-registry_wdqs_1     ... done
Creating wikibase-registry_wikibase_1   ... done
Creating wikibase-registry_wdqs-proxy_1   ... done
Creating wikibase-registry_wdqs-updater_1  ... done
Creating wikibase-registry_wdqs-frontend_1 ... done

The output of the command showed that everything started correctly, and I double-checked using the following:

user@wbregistry-01:~/wikibase-registry# docker-compose ps
              Name                             Command               State          Ports
-------------------------------------------------------------------------------------------------
wikibase-registry_mysql_1           docker-entrypoint.sh mysqld      Up      3306/tcp
wikibase-registry_wdqs-frontend_1   /entrypoint.sh nginx -g da ...   Up      0.0.0.0:8282->80/tcp
wikibase-registry_wdqs-proxy_1      /bin/sh -c "/entrypoint.sh"      Up      0.0.0.0:8989->80/tcp
wikibase-registry_wdqs-updater_1    /entrypoint.sh /runUpdate.sh     Up      9999/tcp
wikibase-registry_wdqs_1            /entrypoint.sh /runBlazegr ...   Up      9999/tcp
wikibase-registry_wikibase_1        /bin/sh /entrypoint.sh           Up      0.0.0.0:8181->80/tcp

Wikibase and the query service UI were exposed on ports 8181 and 8282 on the machine respectively, but the OpenStack firewall rules would block any access from outside the project by default, so I created 2 new rules allowing ingress from within the labs network (range 10.0.0.0/8).

I could then set up a web proxy in Horizon to map some domains to the exposed ports on the machine.

With the proxies created, the 2 services were then accessible to the outside world:

Adding some initial data

The first version of this registry was planned to just hold Items for Wikibase installs, so the initial list of properties could be pretty straightforward. A link to the homepage of the wiki is of course useful, and enables navigating to the site. Sites may not expose a query service in a uniform way, so a property was also needed for this. The SPARQL endpoint used by the query service could also differ, so another property was needed. And finally, to be able to display the initial data on a timeline, an initial creation date was needed. I added a property for the install's logo to make the timeline a little prettier.

The properties initially created to describe Wikibase installs (with example data values for wikidata.org) can be seen below:

Some other properties were also created:

I then added all other Wikibase instances run by the WMF, which included the test and beta Wikidata sites. Wikiba.se also contains a list of Wikibase installs (although it is out of date). I also managed to find some new installs on WikiApiary by looking at Wikibase Repo extension usage. And of course some of the people in the room had instances to add to the list.

I based the creation date on the rough creation date of the first Item, or on an official inception date. All of the creation date statements should probably have references.

The timeline query

The SPARQL queries below show the step-by-step creation of a federated timeline query spanning the local Wikibase query service (for the registry) and the wikidata.org query service.

1) Select all Items with our date property (P5):

SELECT ?item ?date
WHERE {
    ?item wdt:P5 ?date .
}

2) Use the label service to select the Item Labels instead of IDs:

SELECT ?itemLabel ?date
WHERE {
    ?item wdt:P5 ?date .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}

3) Also select the logo (P8) if it exists:

SELECT ?itemLabel ?date ?logo
WHERE {
    ?item wdt:P5 ?date .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
    OPTIONAL { ?item wdt:P8 ?logo }
}

4) Display the results on a timeline by default:

#defaultView:Timeline
SELECT ?itemLabel ?date ?logo
WHERE{
    ?item wdt:P5 ?date .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
    OPTIONAL { ?item wdt:P8 ?logo }
}

5) Also include some results from the wikidata.org query service (using federated queries) to show the WikidataCon events:

In this query, new prefixes are needed for wikidata.org, as the default “wd” and “wdt” prefixes point to the local Wikibase install.
Q37807168 on wikidata.org is “WikidataCon” and P31 is “instance of”.

#defaultView:Timeline

PREFIX wd-wd: <http://www.wikidata.org/entity/>
PREFIX wd-wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?itemLabel ?date (SAMPLE(?logo) AS ?image)
WHERE
{
  {
   ?item wdt:P5 ?date .
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
   OPTIONAL { ?item wdt:P8 ?logo }
  }
 UNION
  {
   SERVICE <https://query.wikidata.org/sparql> {
    ?item wd-wdt:P31 wd-wd:Q37807168 .
    ?item wd-wdt:P580 ?date .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
    OPTIONAL { ?item wd-wdt:P154 ?logo }
   } 
  }
}
GROUP BY ?itemLabel ?date

This generates the timeline that you see at the top of the post.
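(As an aside not in the original post: a query like the above can also be run programmatically. The sketch below uses Python and the standard SPARQL JSON results format; the endpoint URL is a placeholder for the registry's real query service endpoint.)

# Minimal sketch: run a SPARQL query against a Wikibase query service endpoint
# and print the results. The endpoint URL is a placeholder, not the real one.
import requests

ENDPOINT = "https://wikibase-registry-query.example.org/sparql"  # hypothetical

QUERY = """
SELECT ?itemLabel ?date
WHERE {
  ?item wdt:P5 ?date .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"Accept": "application/sparql-results+json"},
)
response.raise_for_status()

for row in response.json()["results"]["bindings"]:
    print(row["itemLabel"]["value"], row["date"]["value"])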

Other issues noticed during setup

Some of the issues were known before this blog post, but others were fresh. Nonetheless, if you are following along, the following issues and tickets may be of help:

The post Wikibase of Wikibases appeared first on Addshore.


Catch Us in Copenhagen for KubeCon EU

Published 27 Apr 2018 by Jaime Woo in The DigitalOcean Blog.

Catch Us in Copenhagen for KubeCon EU

UPDATE: Catch the talks, now embedded below!

Next week is KubeCon EU in Copenhagen, Denmark. We're already drooling at the idea of diving into smørrebrød, perhaps near the famed Little Mermaid statue.

DigitalOcean will have two speakers and a booth at KubeCon EU:

On Wednesday, May 2, from 2:45 PM-3:20 PM, Matt Layher presents "How To Export Prometheus Metrics From Just About Anything."

Prometheus exporters bridge the gap between Prometheus and systems which cannot export metrics in the Prometheus format. During this talk, you will learn how to gather metrics from a wide variety of data sources, including files, network services, hardware devices, and system calls to the Linux kernel. You will also learn how to build a reliable Prometheus exporter using the Go programming language. This talk is intended for developers who are interested in bridging the gap between Prometheus and other hardware or software.
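(Purely as an illustrative aside: the talk itself covers Go, but the general shape of an exporter can be sketched in Python with the prometheus_client library. The metric name and its data source below are invented.)

# Minimal sketch of a Prometheus exporter using the Python prometheus_client
# library (the talk covers the Go approach). The metric and its source are
# invented for illustration.
import random
import time

from prometheus_client import Gauge, start_http_server

# A gauge that could be fed from a file, a network service, a device, etc.
EXAMPLE_TEMPERATURE = Gauge(
    "example_device_temperature_celsius",
    "Temperature reported by a hypothetical device.",
)

def read_temperature():
    # Stand-in for reading from real hardware or a system call.
    return 20.0 + random.random() * 5.0

if __name__ == "__main__":
    start_http_server(9100)  # metrics served at http://localhost:9100/metrics
    while True:
        EXAMPLE_TEMPERATURE.set(read_temperature())
        time.sleep(15)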

Then, on Thursday, May 3, Andrew Kim speaks from 2:45PM-3:20PM on "Global Container Networks on Kubernetes at DigitalOcean."

Building a container network that is reliable, fast and easy to operate has become increasingly important in DigitalOcean’s distributed systems running on Kubernetes. Today’s container networking technologies can be restrictive, as Pod and Service IPs are not reachable externally, which forces cluster administrators to operate load balancers. The addition of load balancers introduces new points of failure in a cluster and hinders observability, since source IPs are either NAT’d or masqueraded.

This talk will be a deep dive of how DigitalOcean uses BGP, Anycast and a variety of open source technologies (kube-router, CNI, etc) to achieve a fast and reliable container network where Pod and Service IPs are reachable from anywhere on DigitalOcean’s global network. Design considerations for scalability, lessons learned in production and advanced use cases will also be discussed.

You can also catch us in Hall C, at booth number G-C06. We’ll be tending the booth, where we'll be giving demos and answering questions:

Vi snakkes ved! (Talk soon!)


Why I always get same content from MediaWiki API call?

Published 27 Apr 2018 by George Marti in Newest questions tagged mediawiki - Stack Overflow.

I am using the MediaWiki API to get images through an AJAX call. I used the MediaWiki API sandbox to create an API call to query these images from Wikimedia Commons. I used the title 'California'.

This is the url:

var url_wiki = "https://commons.wikimedia.org/w/api.php?action=query&titles=California&list=allimages&ailimit=10&format=json&callback=?";

And this is the AJAX call I am using:

$.ajax({
 type: 'GET',
 url: url_wiki,
 data: {
   action:'query',
   format:'json'
 },
 dataType: 'json',
 success: function(result){
   console.log(result);
 }
});

Now, it seems to work fine, since I get 10 images related to 'California' (even though they are quite weird): Console Result for 'California'

BUT, now my question is: if I change the 'titles' value in my URL and type 'Europe', for example, instead of California, I still get the same images as for 'California'...

New url:

var url_wiki = "https://commons.wikimedia.org/w/api.php?action=query&titles=Europe&list=allimages&ailimit=10&format=json&callback=?";

Console Result for 'Europe' -> Same images!

I can't understand this... is something wrong in my API call?
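(Editorial aside, offered with some hedging: list=allimages is a list module that enumerates the wiki's files independently of the titles parameter, which would explain why both URLs return the same first ten files. If the goal is the images used on a specific page, prop=images is one alternative. A minimal sketch in Python rather than jQuery, with illustrative parameter values:)

# Sketch (not from the original question): fetch images used on a given page
# via prop=images, which does respect the titles parameter, unlike
# list=allimages.
import requests

API = "https://commons.wikimedia.org/w/api.php"

def images_on_page(title):
    params = {
        "action": "query",
        "titles": title,
        "prop": "images",
        "imlimit": 10,
        "format": "json",
    }
    data = requests.get(API, params=params).json()
    for page in data["query"]["pages"].values():
        for image in page.get("images", []):
            print(image["title"])

images_on_page("California")
images_on_page("Europe")  # returns different results from "California"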


cardiParty 2018-05 with Anna Burkey

Published 26 Apr 2018 by Justine in newCardigan.

Join Anna Burkey for a preview of the SLV's new StartSpace centre for early-stage entrepreneurs.

Find out more...


Getting Started with an Incident Communications Plan

Published 26 Apr 2018 by Blake Thorne in The DigitalOcean Blog.

Getting Started with an Incident Communications Plan

At Statuspage, we believe it’s never too early for a team to start thinking about an incident communications plan. When your first big incident happens is way too late. Unplanned downtime can cause customer churn and unmanageable inbound support volume. Just one hour of unplanned downtime can cost organizations more than $100,000—and often much more—according to the latest annual downtime survey from Information Technology Intelligence Consulting Research.

Some downtime is inevitable, even massive organizations experience outages from time to time. The good news is the harm from downtime can be mitigated by deploying reassuring context and information in a timely fashion. You may hope to never need an incident communications plan but, as any good Site Reliability Engineer (SRE) will tell you, hope is not a strategy.

Mapping out your team’s first incident communications strategy doesn’t have to be overly complex or resource-draining. In fact, you can accomplish it fairly quickly using these four steps:

Before the Incident

Know what constitutes an incident

Sometimes it’s hard to know what exactly to label as an “incident.” Here’s a set of guidelines Google SREs use, where if any one of the following is true the event is considered an incident:

Feel free to adopt these exact guidelines, adjust them, or write your own. “If any one of the following is true” is a good format. (Another helpful resource for mapping incident severity is this Severity Definitions guide from VMware.)

A note on playing it safe: in our experience it’s better to overcommunicate in situations where you’re uncertain. The inconvenience of closing the loop on a suspected incident that never took off is far outweighed by the downside of playing catch-up on incident comms hours into an incident.

“I’ll just fix this quickly before anyone notices,” is a slippery slope. You might gamble and win the first time you try that, but play the game enough and eventually you’ll lose.

Team Roles

Define key roles and expectations for incident responders. Clear labels and expectations can prevent a lot of damage in the heat of an incident. While large teams and complex SRE organizations have a web of roles and responsibilities, we see two roles as a good starting point.

Incident commander

The incident commander is in charge of the incident response, making sure everyone is working toward resolution and following through on their tasks. They also are in charge of setting up any communications and documentation channels for the incident. That could be chat rooms, shared pages for documenting the incident, and even physical spaces in the office. This person also drives the post-incident review.

Communicator

The communicator is in charge of translating the technical information into customer communications and getting those communications out via the right channels. They also monitor incoming customer communications and notify the incident commander if new groups of customers become impacted. After the incident, they ensure the post-mortem gets sent out.

Our recommendation: make it clear from the beginning who has what role in an incident. Even if these people have the bandwidth to help with other areas of the incident, they should respond to these primary objectives first and delegate other tasks where necessary.

Preparation

With a lean team, any time saved during an incident means a lot. Figuring out the right way to wordsmith an announcement can take up precious time in the heat of an incident.

Decide on boilerplate language ahead of time and save it in a template somewhere. Use it to plug in the relevant details during an incident when you need it.

Here is one incident template we use here for our own status page:

"The site is currently experiencing a higher than normal amount of load, and may be causing pages to be slow or unresponsive. We're investigating the cause and will provide an update as soon as possible.”

This language is very simple and generic, and can be deployed as-is in a lot of cases where this is all we know. We can also amend the language to add more relevant details if we have them. For example:

“The site is currently experiencing a higher than normal amount of load due to an incident with one of our larger customers. This is causing about 50% of pages to be unresponsive. We're investigating the cause and will provide an update as soon as possible.”
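(One way to keep such boilerplate ready, sketched here in Python purely for illustration; the placeholder names are invented and this is not part of Statuspage's tooling.)

# Sketch: keep incident boilerplate as a template and fill in details at
# incident time. Placeholder names are invented for illustration.
from string import Template

INVESTIGATING = Template(
    "The site is currently experiencing $symptom. "
    "$impact We're investigating the cause and will provide an update "
    "as soon as possible."
)

message = INVESTIGATING.substitute(
    symptom="a higher than normal amount of load",
    impact="This is causing about 50% of pages to be unresponsive.",
)
print(message)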

You should also define your communications channels during an incident. While we obviously recommend Statuspage, there are a lot of tools you can use: Twitter, email, and company blog, as examples. Just make sure you’re clear where you will be posting messages.

During the incident

Once the incident begins, we recommend keeping these three “golden rules” in mind.

Early

It’s important to communicate as soon as there is any sign that the incident is impacting customers. Get a message posted as early as possible. It doesn’t have to be perfect. This message serves to reassure users that you’re aware of the issue and actively looking into it. This will also slow down the flood of support tickets and inbound messaging you’re sure to receive during incidents.

Often

When you’re heads-down working on an incident, it can be easy to let regular updates slide. But these long gaps between updates can cause uncertainty and anxiety for your customers. They can start to expect the worst. Even if you’re just updating to say that you’re still investigating the matter, that’s better than no communication. Bonus points if you give an estimate on when next comms will be (and stick to it).

Here's an example from a 2016 HipChat incident.

Precision

In your messaging during the incident, be as precise as you can be without guessing or giving non-committal answers.

Instead of:

“We think we know what’s going on but we need more time.”

Try:

“We’re still working to verify the root cause.”

Instead of:

“The problem seems to be database related.”

Try:

“We’re continuing to investigate the problem.”

At first glance this second example may seem counterintuitive. Why leave out the fact that the issue could be database related? Because you aren't sure yet. Avoid hedging words like "we think." Don't say you "think" you found the root cause. Either you have actually found the cause or you haven't.

Once you’ve confirmed the cause, then clearly state as much detail as you’re able to.

For example:

“We’ve identified a corruption with our database related to our last deploy. We are currently rolling back that deploy and monitoring results.”

After the Incident

Some of the biggest opportunities for your team come in the moments after the dust settles from an incident. Your team ideally will run a Post Incident Review session to unpack what happened on the technical side. It’s also a great time to build customer trust by letting them know that you’re taking the incident seriously and taking steps to ensure it doesn’t happen again.

An incident post-mortem is meant to be written after the incident and give a big picture update of what happened, how it happened, and what steps the team is taking to ensure it isn’t repeated. Here are our post-mortem rules.

Empathize

Apologize for the inconvenience, thank customers for their patience, and ensure you’re working on a fix.

Be personal

We see it all the time: teams depersonalize themselves in an effort to seem professional or official. This leads to a cold, distant tone in post-mortems that doesn't build trust.

Use active voice and “we” pronouns to tell your story. Steer away from words that are overly academic or corporate sounding when simple ones will do.

Instead of:

“Remediation applications on the new load balancer configurations are finalized.”

Try:

“We’ve completed the configuration on the new load balancer.”

Details inspire confidence

People have a good sense for when you’re using a lot of words but not really saying anything. Details are the way to keep your post-mortem from sounding like a lot of hot air.

Here’s an example from a post-mortem Facebook engineers posted after a 2010 incident.

Consider this paragraph:

“Today we made a change to the persistent copy of a configuration value that was interpreted as invalid. This meant that every single client saw the invalid value and attempted to fix it. Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second.”

That's likely a more technical explanation than most readers will need. The ones who want this level of detail will appreciate it. The ones who don't will at least recognize that you're going above and beyond to explain what happened. A lot of teams worry about being too technical in their messaging and instead wind up sending watered-down communications. Opt for specific details instead.

Close the loop

The post-mortem is your chance to have the last word in an incident. Leave the reader with a sense of trust and confidence by laying out clearly what you’re doing to keep this from happening again.

Here’s an example from a Twilio post-mortem:

“In the process of resolving the incident, we replaced the original redis cluster that triggered the incident. The incorrect configuration for redis-master was identified and corrected. As a further preventative measure, Redis restarts on redis-master are disabled and future redis-master recoveries will be accomplished by pivoting a slave.

The simultaneous loss of in-flight balance data and the ability to update balances also exposed a critical flaw in our auto-recharge system. It failed dangerously, exposing customer accounts to incorrect charges and suspensions. We are now introducing robust fail-safes, so that if billing balances don’t exist or cannot be written, the system will not suspend accounts or charge credit cards. Finally, we will be updating the billing system to validate against our double-bookkeeping databases in real-time.”

Notice how specific this is with outlining what went wrong and exactly what the team is putting in place to keep the problem from repeating.

Even though users today expect 24/7 services that are always up, people are tolerant of outages. We've heard a lot of stories about outages over the years at Statuspage, and nobody ever went out of business by being too transparent or communicative during an incident. Consider the kind of information and transparency you'd like to receive from the products and vendors you use, and try to treat your users the way you'd like to be treated.

Looking to go even deeper on incident comms exercises? Check out our recent Team Playbook plays on incident response values and incident response communications.

Blake Thorne is a Product Marketing Manager at Statuspage, which helps teams big and small with incident communications. He can be reached at bthorne@atlassian.com or on Twitter.


GDPR, FastMail and you

Published 24 Apr 2018 by Bron Gondwana in FastMail Blog.

Following on from December's blog post, our executive team has been hard at work for the past few months making preparations for the upcoming GDPR. Current customers, no matter where you are located, should expect to receive notices soon about changes.

GDPR has been a great opportunity for us to confirm everything we believe about our products. Your data is yours, and we should be able to clearly articulate how we touch it.

What is GDPR?

General Data Protection Regulation (GDPR) is a new set of rules from the European Union (EU) and sets a standard for how companies use and protect people’s personal data. It comes into effect on May 25, 2018.

While aimed specifically at EU citizens, we feel it aligns closely with our own privacy values (you are our customer, not the product) and we will be providing the same transparency and protection for all our customers, regardless of where they live or their country of citizenship.

It covers:

What does “personal data” mean?

Anything that can help identify an individual is personal data.

Some examples: your email address, your IP address, your physical address, appointments you might have coming up, where you work, who your family members are.

There is some obvious personal information that we collect for users of any of our products (FastMail, Topicbox, Pobox, Listbox): your email address, billing information, and IP address. But any email content is also considered personal information because it can contain anything. We can't know what you might have put in your email, so we must treat it as personal data.

What this regulation means for you

FastMail is serious about protecting your privacy. It is one of our core values. We believe that security is more than a checkbox.

We will be updating our policies before GDPR comes into effect, and we continue our commitment to plain language and a clear outline of what you can expect from us and any data processing vendors we use.

You control your data when it comes to email, contacts and calendars. We provide processing of that data in order to supply you with an email service. Our job is to execute your wishes faithfully, efficiently, and with low friction so you can get on with your day.

We process your data to ensure we can deliver your mail, to keep your mailbox free from spam and to make it easy to search.

Our support team do not have access to your email content beyond what’s minimally necessary to supply you with service, unless you explicitly provide consent for the purposes of resolving a support issue.

We periodically profile data in aggregate to test and validate the design of software to ensure we can handle size, scale, and throughput of our customer base.

There are only two ways we use your information for anything other than directly providing the email service you pay us for:

  1. If you opt-in to our newsletters, you occasionally will receive information about changes to our service, company news, or surveys to help us find out how we can help our customers.
  2. Information in aggregate for marketing purposes to better understand the people interested in our service and how we can better meet their needs.

How is FastMail preparing for GDPR?

Because of our longstanding commitment to your privacy, this is a problem we’ve given a lot of thought. We are continuing to review our processes and data to make sure that the only staff who have access to your information are the ones who need it in order to provide you with the service you pay us for.

We are ensuring that in meeting our obligations, we don't get in your way: our service will remain fast, and easy to use. We believe that your privacy is a right, not a chore.

You have the "right to be forgotten" under the GDPR. This means you can request that we delete all your personal data from our platform, without this becoming a security risk that a malicious attacker could exploit. You can request that your account be removed from our platform, and the data will be cleared after a waiting period (just in case a hacker was the one who closed your account).

Our work through open standards means your data has always been portable and you can download it at any time.

We are working with our vendors who help us provide our service to ensure that they, too, are upholding the GDPR and updating our contracts as necessary.

We are preparing Data Protection Agreements (DPAs) for customers to sign, where needed.

We are appointing a Privacy Officer (and you can contact them at privacy@fastmail.com). Their role is to manage FastMail’s compliance with the GDPR regulation, with the help of an externally appointed Data Protection Officer.

Stay tuned

Customers can expect a policy update soon. More information will be published on our blog and help pages as we complete the steps necessary to guarantee compliance.

If you have any questions or concerns not addressed here, please contact privacy@fastmail.com.


Publishing WG Telco, 2018-04-23: Offlining, Infoset

Published 23 Apr 2018 by Tzviya Siegman in W3C Blog.

See minutes online for a more detailed record of the discussions.

Offlining

Our current draft does not go into detail about offlining WP. The call focused on some of the scenarios in which offlining is most valuable and desired. From there, the conversation expanded to the other concerns around offline-enhanced experiences, such as versioning, identification, updating, and distribution. The group discussed existing browser-based storage systems but is concerned about persistence, the risk of browser or (unintentional) user storage “clearing,” as well as quotas. Issues will be pulled from the conversation and added to GitHub.

Infoset

Luc Audrain has reviewed the existing Infoset and asked the group whether it is sufficient or whether it needs to reflect all of the issues raised by Hadrien Gardeur in his comparison to EPUB. We concluded that we need to assess the functionality behind the issues raised and ensure that it is addressed by WP. These functionalities might not be part of the Infoset.


Together

Published 20 Apr 2018 by Matthew Roth in code.flickr.com.

Flickr is excited to be joining SmugMug!

We’re looking forward to some interesting and challenging engineering projects in the next year, and would love to have more great people join the team!

We want to talk to people who are interested in working on an inclusive, diverse team, building large-scale systems that are backing a much-loved product.

You can reach us by email at: iwanttowork@flickr.com

Read our announcement blog post and our extended Q&A for more details.

~The Flickr Team


2017 Year Review

Published 20 Apr 2018 by addshore in Addshore.

2017 was a great year, with continued work at WMDE on both Technical Wishes projects and Wikibase/Wikidata related areas. Along the way I shared a fair amount of this through this blog, although not as much as I would have liked. Hopefully I'll be slightly more active in 2018. Here are some fun stats:

Top 5 posts by page views in 2017 were:

  1. Guzzle 6 retry middleware
  2. Misled by PHPUnit at() method
  3. Wikidata Map July 2017
  4. Add Exif data back to Facebook images
  5. Github release download count – Chrome Extension

To make myself feel slightly better, we can have a look at GitHub and the apparent 1,203 contributions in 2017:

The post 2017 Year Review appeared first on Addshore.


The 2nd UK AtoM user group meeting

Published 20 Apr 2018 by Jenny Mitcham in Digital Archiving at the University of York.

I was pleased to be able to host the second meeting of the UK AtoM user group here in York at the end of last week. AtoM (or Access to Memory) is the Archival Management System that we use here at the Borthwick Institute and it seems to be increasing in popularity across the UK.

We had 18 attendees from across England, Scotland and Wales representing both archives and service providers. It was great to see several new faces and meet people at different stages of their AtoM implementation.

We started off with introductions and everyone had the chance to mention one recent AtoM triumph and one current problem or challenge. A good way to start the conversation and perhaps a way of considering future development opportunities and topics for future meetings.

Here is a selection of the successes that were mentioned:

  • Establishing a search facility that searches across two AtoM instances
  • Getting senior management to agree to establishing AtoM
  • Getting AtoM up and running
  • Finally having an online catalogue
  • Working with authority records in AtoM
  • Working with other contributors and getting their records displaying on AtoM
  • Using the API to drive another website
  • Upgrading to version 2.4
  • Importing legacy EAD into AtoM
  • Uploading finding aids into AtoM 2.4
  • Adding 1000+ urls to digital resources into AtoM using a set of SQL update statements

...and here are some of the current challenges or problems users are trying to solve:
  • How to bar code boxes - can this be linked to AtoM?
  • Moving from CALM to AtoM
  • Not being able to see the record you want to link to when trying to select related records
  • Using the API to move things into an online showcase
  • Advocacy for taking the open source approach
  • Working out where to start and how best to use AtoM
  • Sharing data with the Archives Hub
  • How to record objects alongside archives
  • Issues with harvesting EAD via OAI-PMH
  • Building up the right level of expertise to be able to contribute code back to AtoM
  • Working out what to do when AtoM stops working
  • Discovering that AtoM doesn't enforce uniqueness in identifiers for archival descriptions

After some discussion about some of the issues that had been raised, Louise Hughes from the University of Gloucestershire showed us her catalogue and talked us through some of the decisions they had made as they set this up. 

The University of Gloucestershire's AtoM instance

She praised the digital object functionality and has been using this to add images and audio to the archival descriptions. She was also really happy with the authority records, in particular, being able to view a person and easily see which archives relate to them. She discussed ongoing work to enable records from AtoM to be picked up and displayed within the library catalogue. She hasn't yet started to use AtoM for accessioning but hopes to do so in the future. Adopting all the functionality available within AtoM needs time and thought and tackling it one step at a time (particularly if you are a lone archivist) makes a lot of sense.

Tracy Deakin from St John's College, Cambridge talked us through some recent work to establish a shared search page for their two institutional AtoM instances. One holds the catalogue of the college archives and the other is for the Special Collections Library. They had taken the decision to implement two separate instances of AtoM as they required separate front pages and the ability to manage the editing rights separately. However, as some researchers will find it helpful to search across both instances a search page has been developed that accesses the Elasticsearch index of each site in order to cross search.

The interface for a shared search across St John's College AtoM sites
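(As an illustrative aside, not a description of St John's College's actual implementation: a cross-search along these lines can be as simple as querying each site's Elasticsearch index and merging the hits. The hosts, index names and query shape below are assumptions.)

# Sketch: cross-search two AtoM Elasticsearch indexes and merge the hits.
# Hosts and index names are placeholders, not the real St John's setup.
import requests

SITES = {
    "Archives": "http://archives.example.org:9200/atom/_search",
    "Special Collections": "http://speccoll.example.org:9200/atom/_search",
}

def cross_search(term):
    results = []
    for site, url in SITES.items():
        query = {"query": {"query_string": {"query": term}}, "size": 10}
        hits = requests.post(url, json=query).json()["hits"]["hits"]
        for hit in hits:
            results.append((site, hit.get("_source", {})))
    return results

for site, doc in cross_search("chapel"):
    print(site, doc)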

Vicky Phillips from the National Library of Wales talked us through their processes for upgrading their AtoM instance to version 2.4 and discussed some of the benefits of moving to 2.4. They are really happy to have the full width treeview and the drag and drop functionality within it.

The upgrade has not been without it's challenges though. They have had to sort out some issues with invalid slugs, ongoing issues due to the size of some of their archives (they think the XML caching functionality will help with this) and sometimes find that MySQL gets overwhelmed with the number of queries and needs a restart. They still have some testing to do around bilingual finding aids and have also been working on testing out the new functionality around OAI PMH harvesting of EAD.

Following on from this I gave a presentation on upgrading AtoM to 2.4 at the Borthwick Institute. We are not quite there yet but I talked about the upgrade plan and process and some decisions we have made along the way. I won't say any more for the time being as I think this will be the subject of a future blog post.

Before lunch my colleague Charles Fonge introduced VIAF (Virtual International Authority File) to the group. This initiative will enable Authority Records created by different organisations across the world to be linked together more effectively. Several institutions may create an authority record about the same individual and currently it is difficult to allow these to be linked together when data is aggregated by services such as The Archives Hub. It is worth thinking about how we might use VIAF in an AtoM context. At the moment there is no place to store a VIAF ID in AtoM and it was agreed this would be a useful development for the future.

After lunch Justine Taylor from the Honourable Artillery Company introduced us to the topic of back up and disaster recovery of AtoM. She gave the group some useful food for thought, covering techniques and the types of data that would need to be included (hint: it's not solely about the database). This was particularly useful for those working in small institutions who don't have an IT department that just does all this for them as a matter of course. Some useful and relevant information on this subject can be found in the AtoM documentation.

Max Communications are a company who provide services around AtoM. They talked through some of their work with institutions and what services they can offer.  As well as being able to provide hosting and support for AtoM in the UK, they can also help with data migration from other archival management systems (such as CALM). They demonstrated their crosswalker tool that allows archivists to map structured data to ISAD(G) before import to AtoM.

They showed us an AtoM theme they had developed to allow Vimeo videos to be embedded and accessible to users. Although AtoM does have support for video, the files can be very large in size and there are large overheads involved in running a video server if substantial quantities are involved. Keeping the video outside of AtoM and managing the permissions through Vimeo provided a good solution for one of their clients.

They also demonstrated an AtoM plugin they had developed for Wordpress. Though they are big fans of AtoM, they pointed out that it is not the best platform for creating interesting narratives around archives. They were keen to be able to create stories about archives by pulling in data from AtoM where appropriate.

At the end of the meeting Dan Gillean from Artefactual Systems updated us (via Skype) about the latest AtoM developments. It was really interesting to hear about the new features that will be in version 2.5. Note, that none of this is ever a secret - Artefactual make their road map and release notes publicly available on their wiki - however it is still helpful to hear it enthusiastically described.

The group was really pleased to hear about the forthcoming audit logging feature, the clever new functionality around calculating creation dates, and the ability for users to save their clipboard across sessions (and share them with the searchroom when they want to access the items). Thanks to those organisations that are funding this exciting new functionality. Also worth a mention is the slightly less sexy, but very valuable work that Artefactual is doing behind the scenes to upgrade Elasticsearch.

Another very useful meeting and my thanks go to all who contributed. It is certainly encouraging to see the thriving and collaborative AtoM community we have here in the UK.

Our next meeting will be in London in the autumn.

The 2nd UK AtoM user group meeting

Published 20 Apr 2018 by Jenny Mitcham in Digital Archiving at the University of York.

I was pleased to be able to host the second meeting of the UK AtoM user group here in York at the end of last week. AtoM (or Access to Memory) is the Archival Management System that we use here at the Borthwick Institute and it seems to be increasing in popularity across the UK.

We had 18 attendees from across England, Scotland and Wales representing both archives and service providers. It was great to see several new faces and meet people at different stages of their AtoM implementation.

We started off with introductions and everyone had the chance to mention one recent AtoM triumph and one current problem or challenge. A good way to start the conversation and perhaps a way of considering future development opportunities and topics for future meetings.

Here is a selection of the successes that were mentioned:

  • Establishing a search facility that searches across two AtoM instances
  • Getting senior management to agree to establishing AtoM
  • Getting AtoM up and running
  • Finally having an online catalogue
  • Working with authority records in AtoM
  • Working with other contributors and getting their records displaying on AtoM
  • Using the API to drive another website
  • Upgrading to version 2.4
  • Importing legacy EAD into AtoM
  • Uploading finding aids into AtoM 2.4
  • Adding 1000+ urls to digital resources into AtoM using a set of SQL update statements

...and here are some of the current challenges or problems users are trying to solve:
  • How to bar code boxes - can this be linked to AtoM?
  • Moving from CALM to AtoM
  • Not being able to see the record you want to link to when trying to select related records
  • Using the API to move things into an online showcase
  • Advocacy for taking the open source approach
  • Working out where to start and how best to use AtoM
  • Sharing data with the Archives Hub
  • How to record objects alongside archives
  • Issues with harvesting EAD via OAI-PMH
  • Building up the right level of expertise to be able to contribute code back to AtoM
  • Working out what to do when AtoM stops working
  • Discovering that AtoM doesn't enforce uniqueness in identifiers for archival descriptions

After some discussion about some of the issues that had been raised, Louise Hughes from the University of Gloucestershire showed us her catalogue and talked us through some of the decisions they had made as they set this up. 

The University of Gloucestershire's AtoM instance

She praised the digital object functionality and has been using this to add images and audio to the archival descriptions. She was also really happy with the authority records, in particular, being able to view a person and easily see which archives relate to them. She discussed ongoing work to enable records from AtoM to be picked up and displayed within the library catalogue. She hasn't yet started to use AtoM for accessioning but hopes to do so in the future. Adopting all the functionality available within AtoM needs time and thought and tackling it one step at a time (particularly if you are a lone archivist) makes a lot of sense.

Tracy Deakin from St John's College, Cambridge talked us through some recent work to establish a shared search page for their two institutional AtoM instances. One holds the catalogue of the college archives and the other is for the Special Collections Library. They had taken the decision to implement two separate instances of AtoM as they required separate front pages and the ability to manage the editing rights separately. However, as some researchers will find it helpful to search across both instances a search page has been developed that accesses the Elasticsearch index of each site in order to cross search.

The interface for a shared search across St John's College AtoM sites

Vicky Phillips from the National Library of Wales talked us through their processes for upgrading their AtoM instance to version 2.4 and discussed some of the benefits of moving to 2.4. They are really happy to have the full width treeview and the drag and drop functionality within it.

The upgrade has not been without its challenges though. They have had to sort out some issues with invalid slugs and ongoing issues due to the size of some of their archives (they think the XML caching functionality will help with this), and they sometimes find that MySQL gets overwhelmed by the number of queries and needs a restart. They still have some testing to do around bilingual finding aids and have also been working on testing the new functionality around OAI-PMH harvesting of EAD.

Following on from this I gave a presentation on upgrading AtoM to 2.4 at the Borthwick Institute. We are not quite there yet but I talked about the upgrade plan and process and some decisions we have made along the way. I won't say any more for the time being as I think this will be the subject of a future blog post.

Before lunch my colleague Charles Fonge introduced VIAF (Virtual International Authority File) to the group. This initiative will enable Authority Records created by different organisations across the world to be linked together more effectively. Several institutions may create an authority record about the same individual and currently it is difficult to allow these to be linked together when data is aggregated by services such as The Archives Hub. It is worth thinking about how we might use VIAF in an AtoM context. At the moment there is no place to store a VIAF ID in AtoM and it was agreed this would be a useful development for the future.

After lunch Justine Taylor from the Honourable Artillery Company introduced us to the topic of back up and disaster recovery of AtoM. She gave the group some useful food for thought, covering techniques and the types of data that would need to be included (hint: it's not solely about the database). This was particularly useful for those working in small institutions who don't have an IT department that just does all this for them as a matter of course. Some useful and relevant information on this subject can be found in the AtoM documentation.

Max Communications are a company who provide services around AtoM. They talked through some of their work with institutions and what services they can offer.  As well as being able to provide hosting and support for AtoM in the UK, they can also help with data migration from other archival management systems (such as CALM). They demonstrated their crosswalker tool that allows archivists to map structured data to ISAD(G) before import to AtoM.

They showed us an AtoM theme they had developed to allow Vimeo videos to be embedded and accessible to users. Although AtoM does have support for video, the files can be very large in size and there are large overheads involved in running a video server if substantial quantities are involved. Keeping the video outside of AtoM and managing the permissions through Vimeo provided a good solution for one of their clients.

They also demonstrated an AtoM plugin they had developed for Wordpress. Though they are big fans of AtoM, they pointed out that it is not the best platform for creating interesting narratives around archives. They were keen to be able to create stories about archives by pulling in data from AtoM where appropriate.

At the end of the meeting Dan Gillean from Artefactual Systems updated us (via Skype) about the latest AtoM developments. It was really interesting to hear about the new features that will be in version 2.5. Note that none of this is ever a secret - Artefactual make their road map and release notes publicly available on their wiki - however it is still helpful to hear it enthusiastically described.

The group was really pleased to hear about the forthcoming audit logging feature, the clever new functionality around calculating creation dates, and the ability for users to save their clipboard across sessions (and share them with the searchroom when they want to access the items). Thanks to those organisations that are funding this exciting new functionality. Also worth a mention is the slightly less sexy, but very valuable work that Artefactual is doing behind the scenes to upgrade Elasticsearch.

Another very useful meeting and my thanks go to all who contributed. It is certainly encouraging to see the thriving and collaborative AtoM community we have here in the UK.

Our next meeting will be in London in the autumn.

Back to the classroom - the Domesday project

Published 20 Apr 2018 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday I was invited to speak to a local primary school about my job. The purpose of the event was to inspire kids to work in STEM subjects (science, technology, engineering and maths) and I was faced with an audience of 10 and 11 year old girls.

One member of the audience (my daughter) informed me that many of the girls were only there because they had been bribed with cake.

This could be a tough gig!

On a serious note, there is a huge gender imbalance in STEM careers with women only making up 23% of the workforce in core STEM occupations. In talking to the STEM ambassador who was at this event, it was apparent that recruitment in engineering is quite hard, with not enough boys OR girls choosing to work in this area. This is also true in my area of work and is one of the reasons we are involved in the "Bridging the Digital Gap" project led by The National Archives. They note in a blog post about the project that:

"Digital skills are vital to the future of the archives sector ...... if archives are going to keep up with the pace of change, they need to attract members of the workforce who are confident in using digital technology, who not only can use digital tools, but who are also excited and curious about the opportunities and challenges it affords."

So why not try and catch them really young and get kids interested in our profession?

There were a few professionals speaking at the event and subjects were varied and interesting. We heard from someone who designed software for cars (who knew how many different computers are in a modern car?), someone who had to calculate exact mixes of seed to plant in Sites of Special Scientific Interest in order to encourage the right wild birds to nest there, a scientist who tested gelatin in sweets to find out what animal it was made from, an engineer who uses poo to heat houses....I had some pretty serious competition!

I only had a few minutes to speak so my challenge was to try and make digital preservation accessible, interesting and relevant in a short space of time. You could say that this was a bit of an elevator pitch to school kids.

Once I got thinking about this I had several ideas of different angles I could take.

I started off looking at the Mount School Archive that is held at the Borthwick. This is not a digital archive but was a good introduction to what archives are all about and why they are interesting and important. Up until 1948 the girls at this school created their own school magazine that is beautifully illustrated and gives a fascinating insight into what life was like at the school. I wanted to compare this with how schools communicate and disseminate information today and discuss some of the issues with preserving this more modern media (websites, twitter feeds, newsletters sent to parents via email).

Several powerpoint slides down the line I realised that this was not going to be short and snappy enough.

I decided to change my plans completely and talk about something that they may already know about, the Domesday Book.

I began by asking them if they had heard of the Domesday Book. Many of them had. I asked what they knew about it. They thought it was from 1066 (not far off!), someone knew that it had something to do with William the Conqueror, they guessed it was made of parchment (and they knew that parchment was made of animal skin). They were less certain of what it was actually for. I filled in the gaps for them.

I asked them whether they thought this book (that was over 900 years old) could still be accessed today and they weren't so sure about this. I was able to tell them that it is being well looked after by The National Archives and can still be accessed in a variety of ways. The main barrier to understanding the information is that it is written in Latin.

I talked about what the Domesday Book tells us about our local area. A search on Open Domesday tells us that Clifton only had 12 households in 1086. Quite different from today!

We then moved forward in time, to a period of history known as 'The 1980's' (a period that the children had recently been studying at school - now that makes me feel old!). I introduced them to the BBC Domesday Project of 1986. Without a doubt one of digital preservation's favourite case studies!

I explained how school children and communities were encouraged to submit information about their local areas. They were asked to include details of everyday life and anything they thought might be of interest to people 1000 years from then. People took photographs and wrote information about their lives and their local area. The data was saved on to floppy disks (what are they?) and posted to the BBC (this was before email became widely available). The BBC collated all the information on to laser disc (something that looks a bit like a CD but with a diameter of about 30cm).

I asked the children to consider the fact that the 900-year-old Domesday Book is still accessible and think about whether the 30-year-old BBC Domesday Project discs were equally accessible. In discussion this gave me the opportunity to finally mention what digital archivists do and why it is such a necessary and interesting job. I didn't go into much technical detail but all credit to the folks who actually rescued the Domesday Project data. There is lots more information here.

Searching the Clifton and Rawcliffe area on Domesday Reloaded


Using the Domesday Reloaded website I was then able to show them what information is recorded about their local area from 1986. There was a picture of houses being built, and narratives about how a nearby lake was created. There were pieces written by a local school child and a teacher describing their typical day. I showed them a piece that was written about 'Children's Crazes' which concluded with:

" Another new activity is break-dancing
 There is a place in York where you can
 learn how to break-dance. Break     
 dancing means moving and spinning on
 the floor using hands and body. Body-
 popping is another dance craze where
 the dancer moves like a robot."


Disappointingly, the presentation didn't entirely go to plan - my PowerPoint only partially worked and the majority of my carefully selected graphics didn't display.

A very broken powerpoint presentation

There was thus a certain amount of 'winging it'!

This did however allow me to make the point that working with technology can be challenging as well as perhaps frustrating and exciting in equal measure!



W3C is in Singapore for Seamless Payments!

Published 20 Apr 2018 by J. Alan Bird in W3C Blog.

W3C Seamless Payments booth

The W3C Web Payments Working Group produced the Web Payment API, which has been adopted by the major web browsers, and we’re continuing to see adoption by others in the Payments Ecosystem. With the rechartering of the Web Commerce Interest Group we’re seeing a lot of exciting new work starting in areas that are of interest to merchants, retailers, banks and others in the Commerce segment of the Payments Industry.

While our participation has been fairly broad in coverage, we would like to see more participation from the Commerce and Payments Industry in the Asian countries. To bring our message to this community W3C has chosen to participate in Seamless Payments Asia on 3 and 4 May 2018 in Singapore.

Our participation at this event has three dimensions: I am chairing a Keynote Panel on the afternoon of the first day, we will have a booth in the Exhibition area, and I will be giving a presentation in the Exhibition Theater.

I’m really excited about our Panel! The topic is “Frictionless Commerce: How to Make Cross-Border Payments, e-Commerce and Retail Easier”. On that panel we have two W3C Members, Airbnb and Rakuten, and two non-Members, National Australia Bank and Rocket International. I think it will be an exciting conversation.

We are still finalizing the agenda for our booth, but our goal is to have over 10 demonstrations from W3C Members about how the work in various parts of W3C is impacting their products and business.

Last, but not least, I will be giving a presentation in the Exhibition Theater on the second morning at 10:00 AM, on “Improving Payments on the Web – an update from W3C”.

If you or others from your company are going to be at the event, let’s get together! Contact either myself or Naomi Yoshizawa to set up a time.


Slim 3.10.0 released

Published 19 Apr 2018 by in Slim Framework Blog.

We are delighted to release Slim 3.10.0. This version has a couple of minor new features and a couple of bug fixes.

The most noticeable improvement is that we now support $app->redirect('/from', '/to') to allow quick and easy redirecting of one path to another without having to write a route handler yourself. We have also added support for the SameSite flag in Slim\Http\Cookies.
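
As a rough sketch (not taken from the release notes; the route paths and cookie values are invented for illustration, and the exact 'samesite' property key is an assumption), the two additions might be used together like this:

<?php
// Minimal Slim 3.10 sketch: the new redirect() helper plus a SameSite cookie.
require 'vendor/autoload.php';

$app = new \Slim\App();

// Redirect one path to another without writing a route handler yourself.
$app->redirect('/old-report', '/new-report', 301);

$app->get('/hello', function ($request, $response) {
    // Slim\Http\Cookies with the newly supported SameSite flag
    // (assuming the property key is 'samesite').
    $cookies = new \Slim\Http\Cookies();
    $cookies->set('visited', [
        'value'    => 'yes',
        'expires'  => '7 days',
        'samesite' => 'Lax',
    ]);
    return $response
        ->withHeader('Set-Cookie', $cookies->toHeaders())
        ->write('Hello!');
});

$app->run();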

As usual, there are also some bug fixes; in particular, we no longer override the Host header in the request if it's already defined.

The full list of changes is here


Pittsburgh, We’ll See Yinz at RailsConf!

Published 18 Apr 2018 by Jaime Woo in The DigitalOcean Blog.

Pittsburgh, We’ll See Yinz at RailsConf!

RailsConf has left the desert and makes its way to Steel City April 17-19, 2018. We’ll have Sam Phippen presenting, and several DO-ers checking out talks and tending our booth. Here’s what you need to know about RailsConf 2018.

In Sam’s talk, “Quick and easy browser testing using RSpec and Rails 5.1,” you'll learn about the new system specs in RSpec, how to set them up, and what benefits they provide. It’s for anyone wanting to improve their RSpec suite with full-stack testing.

From the talk description:

Traditionally doing a full-stack test of a Rails app with RSpec has been problematic. The browser wouldn't automate, capybara configuration would be a nightmare, and cleaning up your DB was difficult. In Rails 5.1 the new 'system test' type was added to address this. With modern RSpec and Rails, testing every part of your stack including Javascript from a browser is now a breeze.

Make sure you don’t miss it, Thursday, April 19, from 10:50 AM-11:30 AM in the Spirit of Pittsburgh Ballroom. If you’re interested in RSpec, you might dig his talk from 2017, “Teaching RSpec to Play Nice with Rails.”

You can also catch us in the Exhibit Hall, at booth number 520. The Hall is on Level 2, in Hall A. We’ll be hanging at our booth Wednesday, April 18 from 9:30 AM-6:00 PM, and Thursday, April 19 from 9:30 AM-5:15 PM.

See you there, or, as they say in Pittsburgh, meechinsdahnair!


How to receive a notification when a wiki page has not been updated in a certain amount of time?

Published 18 Apr 2018 by NETGEAR R6220 in Newest questions tagged mediawiki - Stack Overflow.

I would like to receive a notification when a wiki page hasn't been updated for a certain amount of time. I am currently checking 'Special:AncientPages', and this is good for checking manually, but is there a way of doing this automatically?

The notification could be via email or anything else, it doesn't really matter. Just need to find out if this is possible or not.
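
For what it's worth, a minimal sketch of such an automatic check (a cron-run PHP script with a placeholder wiki URL, threshold and address, and assuming the usual shape of the API's querypage response) might look like this:

<?php
// Hypothetical cron job: mail a list of pages that have not been edited for a while,
// using the same data that Special:AncientPages shows, via the MediaWiki API.
$api       = 'https://wiki.example.org/w/api.php';   // placeholder wiki
$threshold = new DateTime('-180 days');              // "a certain amount of time"
$to        = 'me@example.org';                       // placeholder address

$url  = $api . '?action=query&list=querypage&qppage=Ancientpages&qplimit=50&format=json';
$data = json_decode(file_get_contents($url), true);

$stale = [];
foreach ($data['query']['querypage']['results'] ?? [] as $row) {
    // For Ancientpages the 'value' field is assumed to hold the last-edit timestamp.
    if (new DateTime($row['value']) < $threshold) {
        $stale[] = $row['title'] . ' (last edited ' . $row['value'] . ')';
    }
}

if ($stale) {
    mail($to, 'Wiki pages not updated recently', implode("\n", $stale));
}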


MediaWiki with two database servers

Published 18 Apr 2018 by Sam Wilson in Sam's notebook.

I’ve been trying to replicate locally a bug with MediaWiki’s GlobalPreferences extension. The bug is about the increased number of database reads that happen when the extension is loaded, and the increase happens not on the database table that stores the global preferences (as might be expected) but rather on the ‘local’ tables. However, locally I’ve had all of these running on the same database server, which makes it hard to watch the standard monitoring tools to see differences; so, I set things up on two database servers locally.

Firstly, this was a matter of starting a new MySQL server in a Docker container (accessible at 127.0.0.1:3305 and with its data in a local directory so I could destroy and recreate the container as required):

docker run -it -e MYSQL_ROOT_PASSWORD=pwd123 -p3305:3306 -v$PWD/mysqldata:/var/lib/mysql mysql

(Note that because we’re keeping local data, root’s password is only set on the first set-up, and so the MYSQL_ROOT_PASSWORD can be left off future invocations of this command.)

Then it’s a matter of setting up MediaWiki to use the two servers:

$wgLBFactoryConf = [
	'class' => 'LBFactory_Multi',
	'sectionsByDB' => [
		// Map of database names to section names.
		'mediawiki_wiki1' => 's1',
		'wikimeta' => 's2',
	],
	'sectionLoads' => [
		// Map of sections to server-name/load pairs.
		'DEFAULT' => [ 'localdb'  => 0 ],
		's1' => [ 'localdb'  => 0 ],
		's2' => [ 'metadb' => 0 ],
	],
	'hostsByName' => [
		// Map of server-names to IP addresses (and, in this case, ports).
		'localdb' => '127.0.0.1:3306',
		'metadb' => '127.0.0.1:3305',
	],
	'serverTemplate' => [
		'dbname'        => $wgDBname,
		'user'          => $wgDBuser,
		'password'      => $wgDBpassword,
		'type'          => 'mysql',
		'flags'         => DBO_DEFAULT,
		'max lag'       => 30,
	],
];
$wgGlobalPreferencesDB = 'wikimeta';
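
With the two servers separated, it becomes possible to watch the read counters on each one independently. A rough way to do that (not part of the extension or MediaWiki itself; this sketch assumes the root/pwd123 credentials from the Docker command above, and whatever credentials the local 3306 server uses) is to snapshot Com_select on both servers around a page request:

<?php
// Read the cumulative SELECT counter from a MySQL server.
function comSelect( string $host, int $port, string $user, string $pass ): int {
	$db  = mysqli_connect( $host, $user, $pass, '', $port );
	$row = mysqli_fetch_row( mysqli_query( $db, "SHOW GLOBAL STATUS LIKE 'Com_select'" ) );
	mysqli_close( $db );
	return (int)$row[1];
}

$servers = [ 'localdb' => 3306, 'metadb' => 3305 ];
$before = [];
foreach ( $servers as $name => $port ) {
	$before[$name] = comSelect( '127.0.0.1', $port, 'root', 'pwd123' );
}

sleep( 15 ); // load a wiki page in the browser during this pause

foreach ( $servers as $name => $port ) {
	$delta = comSelect( '127.0.0.1', $port, 'root', 'pwd123' ) - $before[$name];
	echo "$name: +$delta SELECT queries\n";
}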

How can I check that my mediawiki in my virtual machine is not accessible from the internet?

Published 17 Apr 2018 by user8886193 in Newest questions tagged mediawiki - Stack Overflow.

I downloaded a virtual machine for MediaWiki from Bitnami. The installation is done and I can access the website from the host (my real operating system). During the installation I received the IP address for access.

How can I check that my mediawiki in my virtual machine is not accessible from the internet?

It is important to me that the guest and its web server are only accessible by me, and that there is no communication from the internet through the host, or the other way round.

(I am a beginner; I searched the internet and decided to ask my question here. If this is the wrong place, please suggest a more suitable place for this question.)


A short update on the web-platform-test project invitation

Published 17 Apr 2018 by Philippe le Hegaret in W3C Blog.

For those of you involved in the effort to build a cross-browser testsuite for the majority of the web platform, you are currently noticing that the project is moving to a new organization on GitHub. The move is intended to improve the management and handling of the WPT project on GitHub and Travis.

There is a myriad of services and repositories related to this project, and the complex transition is happening cautiously, led by Philip Jägenstedt. This week, past collaborators and teams are invited to join the new GitHub organization to prevent breaking read/write access in the future. If you’re planning to continue contributing, you should accept the invitation to make things easier for you after the transition is over.

This transition does not change or impact the W3C relationship with the project. It has been and will continue to be a consensus-driven open-source project with a mission to improve the Web through testing.


Episode 6: Daren Welsh and James Montalvo

Published 17 Apr 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Daren Welsh and James Montalvo are flight controllers and instructors at the Extravehicular Activity (EVA) group at the Johnson Space Center at NASA. They first set up MediaWiki for their group in 2011; since then, they have overseen the spread of MediaWiki throughout the flight operations directorate at Johnson Space Center. They have also done a significant amount of MediaWiki development, including, most recently, the creation of Meza, a Linux-based tool that allows for easy installation and maintenance of MediaWiki.

Links for some of the topics discussed:


How to force Mediawiki to write an ampersand in HTML without it being encoded as &amp;

Published 16 Apr 2018 by user1258361 in Newest questions tagged mediawiki - Stack Overflow.

Automatic encoding of & as &amp; breaks expected functionality for extensions that write HTML or JS if the script/HTML text being written depends on them.

For example, if you write a script with a boolean AND conditional, the && gets encoded as &amp;&amp; which makes no sense.

https://www.mediawiki.org/wiki/Help:Formatting

Nowiki tags don't apply to this encoding. Is there an extension available that allows HTML output of plain &?


Why Specs Change: EPUB 3.2 and the Evolution of the Ebook Ecosystem

Published 16 Apr 2018 by Dave Cramer in W3C Blog.

Drawing of four species of Galapagos Finches, showing the different beak shapes

It takes much more than a village to make an ebook. Authors, publishers, developers, distributors, retailers, and readers must all work together. EPUB* requires authoring and validation tools as well as reading systems. The EPUB standard depends on the HTML and CSS standards, among others. There are millions of existing EPUB 2 and EPUB 3 files out there. Change anywhere is felt everywhere.

As this ecosystem evolves, the EPUB standard itself sometimes has to change to keep up. When the Web moved to HTML5, enabling better semantic markup and better accessibility, it was clear that EPUB could benefit. EPUB 3.0, which was released in October 2011, supported HTML5 as well as scripting and multimedia. EPUB could now be used for more kinds of books, better books, more accessible books. EPUB 3 was a big deal, significantly different from, and better than, EPUB 2. Today there’s no reason to use EPUB 2, and yesterday is the best day to start producing EPUB 3.

Sometimes the need for change comes from innovation inside the ebook world. As Apple and Amazon developed fixed-layout ebooks in the early 2010s, the IDPF knew they had to create a standard, to avoid fragmenting the marketplace. Sometimes specs just have bugs, or implementations discover an ambiguity. Some changes are large, like moving to HTML5, and some changes are small, like allowing multiple dc:source elements in EPUB 3.0.1. EPUB 3.0.1 was ultimately a maintenance release, incorporating the fixed-layout spec, slightly expanding what sorts of attributes were valid in EPUB, and fixing various bugs. Existing EPUB 3s didn’t need to change to support 3.0.1.

In 2016, the IDPF’s EPUB Working Group started working on a more substantive revision, which would become EPUB 3.1. The goal was to bring EPUB closer to the rest of the Web Platform, and make the spec simpler and easier to read. The former was done partly by trying to remove seldom-used features in EPUB that were not part of the larger Web, such as the epub:switch and epub:trigger elements. The Group also clarified the relationship with CSS, moving from an explicit profile of supported properties (which had little bearing on what was actually supported) to using the W3C’s own official definition of CSS, which evolves. It did the same with HTML, referring to the latest version of HTML5, whatever version that might be. But most of our ambitious ideas were scaled back or dropped, such as allowing the regular HTML serialization of HTML5 in EPUB. EPUB 3.1 was officially finished in January 2017, before the IDPF became part of the W3C.

But remember that the spec is only a part of the ecosystem. Two factors proved fatal to EPUB 3.1. First, there are hundreds of thousands of EPUB 3.0.X files already out there. EPUB 3.1 changed the value of the version attribute in the package file, and so those existing files would need to be edited to comply with the new spec, even if they didn’t use any of the removed features.

Second, the validation tool EpubCheck was never updated to support EPUB 3.1.  Unlike the web, the ebook ecosystem is highly dependent on formal validation. EpubCheck is the gatekeeper of the digital publishing world, the tool that verifies compliance with EPUB standards. But EpubCheck is in trouble. It’s maintained by a handful of volunteers, and has almost no resources. There’s a backlog of maintenance work and bug fixes to do. Fifteen months after the release of EPUB 3.1, it still is not supported by EpubCheck, and thus no one can distribute or sell EPUB 3.1 through the major retailers. The Publishing Business Group is currently working to ensure EpubCheck’s future. Stay tuned!

EPUB 3.1 was a good spec—better-organized, easier to understand, clearer about the relationship between EPUB and the underlying web technologies. The EPUB 3.0.1 features it removed were seldom used, and often unsupported. But after 3.1 was completed, many people decided that, even if almost no existing EPUB 3 content was rendered incompatible with the new spec (aside from the version attribute), the price was too high. Better to live with some obsolete features, and guarantee compatibility, than require too much change. EPUB was having its “don’t break the Web” moment.

Early this year, Makoto Murata and Garth Conboy proposed that we roll back some of the changes in EPUB 3.1. This updated spec would be known as EPUB 3.2. The goals were:

  1. Guarantee that any EPUB 3.0.1 publication conforms to EPUB 3.2.
  2. Ensure that EPUB 3.0.1 Reading systems would accept and render any EPUB 3.2 publication, although graceful fallback may sometimes be required.

If you already have EPUB 3 files, you don’t need to make any changes to existing content or workflow to adopt the forthcoming 3.2 spec. You just have a few more options, much like the change from 3.0 to 3.0.1. If you don’t already have EPUB 3 files, start now (making 3.0.1)! There’s no reason to wait.

EPUB 3.2 will still be based on EPUB 3.1, and keep many of the changes in 3.1 that don’t affect compatibility, such as referring to the latest versions of HTML5 and SVG, and using the official CSS Snapshot rather than the old profile. 3.2 will also continue to include WOFF2 and SFNT fonts as core media types. Perhaps most importantly, making EPUB 3.2 closer to EPUB 3.0.1 will require much less work to upgrade EpubCheck.


The W3C EPUB 3 Community Group has started to work on EPUB 3.2, with the explicit goal of remaining compatible with all existing EPUB 3.0.1 files, while retaining the best features of EPUB 3.1. I expect this work to take six months or so; others are more optimistic. When final, EPUB 3.2 will become a W3C Community Group Report, as Community Groups do not create W3C Recommendations.

We need your help! Join the EPUB 3 Community Group at https://www.w3.org/community/epub3/. It’s free, you don’t have to be a W3C member, and everyone is welcome. Much of the discussion of technical issues will happen on GitHub; our repository is at https://github.com/w3c/publ-epub-revision/.

You can look at the early drafts of our spec, too:

    1. EPUB 3.2 Overview
    2. EPUB 3.2 Specification
    3. EPUB Packages 3.2
    4. EPUB Content Documents 3.2
    5. EPUB Media Overlays 3.2
    6. EPUB Open Container Format

*EPUB® is an interchange and delivery format for digital publications, based on XML and Web Standards. An EPUB Publication can be thought of as a reliable packaging of Web content that represents a digital book, magazine, or other type of publication, and that can be distributed for online and offline consumption.


Collapsing section heading in Mediawiki MobileFrontend

Published 15 Apr 2018 by Manu in Newest questions tagged mediawiki - Stack Overflow.

In MobileFrontend for MediaWiki, sections can be collapsed using the parameter $collapseSectionsByDefault = true. However, this only applies to H2 sections (= first-level titles), not to lower-level sections (H3, H4, H5 and H6 titles).

However, this functionality seems to exist; I've found these lines at line 240 of extensions/MobileFrontend/ressources/mobile.toggle/toggle.js (see here):

// Also allow .section-heading if some extensions like Wikibase
// want to toggle other headlines than direct descendants of $container.
$firstHeading = $container.find( '> h1,> h2,> h3,> h4,> h5,> h6,.section-heading' ).eq( 0 );
tagName = $firstHeading.prop( 'tagName' ) || 'H1';

How to toggle other headlines than H2 in MobileFrontend 1.30 or 1.31?

Thanks in advance!


Concurrent Python Wikipedia Package Requests

Published 14 Apr 2018 by delhics in Newest questions tagged mediawiki - Stack Overflow.

I am making a python application which uses the python Wikipedia package to retrieve the body text of 3 different Wikipedia pages. However, I am noticing very slow performance when retrieving the articles one at a time. Is there a method that I can use to retrieve the body text of 3 Wikipedia pages in parallel?


Firefox Add-on to skip mobile Wikipedia redirect

Published 14 Apr 2018 by legoktm in The Lego Mirror.

Skip Mobile Wikipedia on Firefox Add-ons

Lately, I've been reading Wikipedia on my phone significantly more than I used to. I get 15 minutes on the train each morning, which makes for some great reading time. But when I'm on my phone, Wikipedia redirects to the mobile website. I'm sure there are some people out there who love it, but it's not for me.

There's a "Desktop" button at the bottom of the page, but it's annoying and inconvenient. So I created my first Firefox Add-on, "Skip Mobile Wikipedia". It rewrites all requests to the mobile Wikipedia website to the standard canonical domain, and sets a cookie to prevent any further redirects. It works on the standard desktop Firefox and on Android.

Install the Add-on and view the source code.


April Community Doers: Meetup Edition

Published 13 Apr 2018 by Daniel Zaltsman in The DigitalOcean Blog.

April Community Doers: Meetup Edition

On the six-year voyage toward becoming the cloud platform for developers and their teams, we have received tremendous support from the larger developer community. We’ve seen hundreds of Meetups organized, pull requests submitted, tutorials written, and Q&As contributed, with even more ongoing activity. To show our appreciation, last month we introduced a new way to highlight some of our most active community contributors - our Community Doers!

Community Doers help make the community better through the content they create and the value they add. In addition to the Community homepage, we’ll regularly highlight Community Doers on the blog, Infrastructure as a Newsletter, social media, and to our growing internal community. In March, we were excited to bring you the trio of Marko, Mateusz, and Peter. This month, with a focus on our global Meetup community, we have three new individuals for you to get to know and celebrate with us. Without further ado, meet April’s featured Community Doers:

Aditya Patawari (@adityapatawari)

Aditya is an early adopter and advocate of DigitalOcean, so it’s no surprise that he became the first organizer of our second largest Meetup group, based in Bangalore. He has been producing Meetups since 2016 and has served as a speaker and panelist at consecutive DigitalOcean TIDE conferences. His talk on foolproofing business through infrastructure gap analysis was well received at TIDE New Delhi, and we later invited him to conduct an online webinar on setting up a multi-tier web application with Ansible. We’re extremely proud and excited to be working with him because of his passion for education and for helping the wider community.

Samina Fu (@sufuf3149)

For the second month running, we are proud to highlight the work of our active Taiwan community. Specifically, we are excited to recognize Samina Fu, a Network and Systems Engineering graduate of National Chiao Tung University in Taiwan. Samina is a co-organizer of our Hsinchu community, which she has been bringing together since early 2017. She helped to organize our first of 120 Hacktoberfest Meetups last year, and works closely with Peter Hsu (who we highlighted last month) as a core contributor to the CDNJS project.

David Endersby (@davidendersby1)

When David filled out our Meetup Organizer Application Form in September 2016, we didn’t know he would go on to lead one of our largest and most active Meetup communities. Since early 2017, David has worked hard to develop a blueprint for successfully running a new Meetup community, covering everything from starting out, to finding speakers, to time management, choosing a location, feeding attendees, and more. His efforts have produced a wealth of content and he has an ambitious plan for 2018. If you’re interested in joining, he welcomes you with open arms!


Aditya’s, Samina’s, and David’s efforts exemplify the qualities we are proud to see in our community. They all have a knack for educating the community (off- and online), promoting both learning and community collaboration. But there are so many others we have yet to recognize! We look forward to highlighting more of our amazing community members in the months to come.

Are you interested in getting more involved in the DigitalOcean community? Here are a few places to start:

Know someone who fits the profile? Nominate a member to be recognized in the comments!


MediaWiki CSS not loading, MIME type error

Published 13 Apr 2018 by user3462317 in Newest questions tagged mediawiki - Stack Overflow.

My MediaWiki install on Gandi.net is having CSS problems: The main page works fine. However, all of the other pages are unstyled, as though the browser can't access the CSS.

I've tried using the console to debug in Chrome and get the following error message:

Refused to apply style from 'http://jollof.mariadavydenko.com/wiki/load.php?debug=false&lang=en&modules=mediawiki.feedlink%2Chelplink%2CsectionAnchor%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.skinning.content.externallinks%7Cmediawiki.skinning.interface%7Cmediawiki.special.changeslist%7Cmediawiki.special.changeslist.enhanced%2Clegend%7Cskins.monobook.styles&only=styles&skin=monobook' because its MIME type ('text/html') is not a supported stylesheet MIME type, and strict MIME checking is enabled.

I am running PHP version 5.6 and MySQL version 5.7. I've tried the load.php .htaccess fix recommended for these symptoms but it doesn't work -- load.php loads just fine. Any help would be greatly appreciated.


Correct robots.txt structure? (Mediawiki)

Published 13 Apr 2018 by Filip Torphage in Newest questions tagged mediawiki - Stack Overflow.

I've been checking around in different sites' robots.txt files and stumbled upon something I didn't expect in MediaWiki's robots.txt. From what I've read so far, you can write in a robots.txt file like below:

Disallow: foo
Noindex: bar

I then wonder if:

Disallow: /wiki/Category:Noindexed_pages

is a correct structure in a robots.txt file, or at least for MediaWiki's part? I also want to know if Noindexed_pages can be anything or if it is static.

The last code block was taken from a Wikipedia article about MediaWiki's robots.txt.


W3C’s WAI-ACT Project Identified as ‘Key Innovator’

Published 10 Apr 2018 by Shadi Abou-Zahra in W3C Blog.

Today the new EU Innovation Radar was launched to recognize high potential innovations and innovators in EU-funded research. Among them is the W3C WAI-ACT Project (grant 287725 of the 7th Framework Programme), which was ranked among the top ten ‘high capacity innovation projects’ (third place in SME ranking) in the JRC Science and Policy Report of 2014. The project was recognized for innovation in ‘Practical guidance on evaluation of Web Accessibility Initiative guidelines’.

The workflow diagram above depicts five sequential steps: 1. Define the evaluation scope; 2. Explore the target website; 3. Select a representative sample; 4. Audit the selected sample and 5. Report the findings. Each step has an arrow to the next step, and arrows back to all prior steps. This illustrates how evaluators proceed from one step to the next, and may return to any preceding step in the process as new information is revealed to them during the evaluation process.

The WAI-ACT Project carried out its work in 2011-2014 through existing W3C working groups under the W3C consensus process, with broad involvement from different key stakeholders. Results of the project include:

The WAI-ACT Project also included dedicated efforts to engage with the community and liaise with related efforts in Europe and internationally. For example, the project organized the W3C Workshop on Referencing and Applying WCAG 2.0 in Different Contexts, to better support the uptake of WCAG 2.0 internationally.

WAI-ACT was a multi-partner project led by W3C through ERCIM as its European host. The project partners included:

WAI-ACT is part of a series of EU-funded projects led by the W3C Web Accessibility Initiative (WAI). The latest in this series is the currently on-going WAI-Tools Project, which in many ways builds on the efforts of the WAI-ACT Project and later W3C efforts to provide more guidance on accessibility conformance evaluation.

We would like to take the opportunity of this recognition to thank the European Commission for their support over many years, the project partners, the W3C working groups, and the broader community, which made this work happen. I look forward to continued open collaboration through the W3C process.


Untitled

Published 10 Apr 2018 by Sam Wilson in Sam's notebook.

I find autogenerated API docs for Javascript projects (e.g.) so much more useful than those for PHP projects.


Morning joy

Published 9 Apr 2018 by Sam Wilson in Sam's notebook.

I love the morning time, while the brain is still sharp enough to focus on one thing and get it done, but dull enough not to remember the other things and derail everything with panic about there being too much to do. The morning is when the world properly exists, and is broad and friendly.


How can I display spaces in Media Wiki Page Title but not in the URL?

Published 9 Apr 2018 by David Ruess in Newest questions tagged mediawiki - Stack Overflow.

How can I display spaces in Media Wiki Page Title but not in the URL?

Desired Result: if someone types in example.com/w/John1:1-5 then I'd like the page title on that page to show John 1:1-5.

I realize I could create a page at example.com/w/John_1:1-5, but I don't want users to have to type the underscore.

Is there a way to do this without creating a redirect?

Thanks!


Mediawiki MathJax use with Template:math

Published 9 Apr 2018 by CatMan in Newest questions tagged mediawiki - Stack Overflow.

With the SimpleMathJax extension installed on MediaWiki 1.27, I would like to provide offline access to some Wikipedia page code. The code uses MathML tags and a math template. Both the MediaWiki and SimpleMathJax installations use the default settings. Only the default MediaWiki:Common.css and MediaWiki:Common.js content is installed.

All math tags are working fine. However, I am seeing strange artifacts when trying to use e.g. the following expressions

<math>A = 42</math>      // base for (1)
<math>A {=} 42</math>    // base for (2)
<math>'''V'''</math>     // base for (3)

with a template using the code

{{math|A = 42}}          // (1)
{{math|A {=} 42}}        // (2)
{{math|'''V'''}}         // (3)

The versions using the <math> tags are working as expected. This shows that SimpleMathJax is installed correctly, I would guess.

The original template code from Wikipedia's "Template:Math" did not do anything in my installation, so I used this code for the Template:Math

{{#tag:math|{{{1}}}}}

(For reproducing the problem in Mediawiki, simply create a new page named "Template:math" and copy the code above. Then add the template code above to any page and check it with "Preview")

Quite a few things work with this template, e.g. {{math|{{!}}\alpha_{minor}-q^4{{!}}}}. So it can not be totally wrong. However, for the examples above I get the following output:

1                      // for (1)
A[[:Template :=]]42    // for (2)
'''V'''                // for (3)

On the web I found that (1) would fail, because in a template the '=' character is interpreted by the parser; it needs to read '{{=}}'. But (2) shows that this does not work. The two curly brackets seem to be interpreted as a template. The other parts are OK. In (3) I would have expected 'V' to be bold. There are other cases where the template fails as well, e.g. italics and <sup>..</sup>.

My Question: What is wrong or what is the proper template code to get the MathML tags working with the {{math|}} syntax?


Maintain mediawiki pages templates

Published 9 Apr 2018 by Kosho-b in Newest questions tagged mediawiki - Stack Overflow.

I'm working with pywikibot to create and update several pages from serialized python objects.

Those Python objects update once a week. After each update I want to run a bot that takes the current state of each object and writes it to the corresponding wiki page; at the moment I'm only talking about updating template arguments.

I can translate the Python object into the required template arguments, and I'm searching for a convenient library to work with. I ran into these problems:

  1. Logging the diff between the old and new template args
  2. Saving the arguments as pretty-printed output (each one on a new line and so on, for future manual editing).
  3. When creating a new page based on a known template, I didn't find a way to get a Python object with the current template arguments and create it on its own.

I checked those libraries:

  1. pywikibot - bah, working with templates is very hard and not intuitive (extract_templates_and_params_regex_simple & glue_template_and_params).
  2. mwparserfromhell - parsed_page.filter_templates() is a good start, but I can't see the diff in an easy way and I need to create the template for new pages manually.
  3. Wikipedia\mwclient - these don't seem to give any advantage for working with templates.

Thank you.


Email doesn’t disappear

Published 9 Apr 2018 by Bron Gondwana in FastMail Blog.

More and more often we are seeing stories like this one from Facebook about who has control over your messages on closed platforms.

I keep saying in response: email is your electronic memory. Your email is your copy of a conversation. Nobody, from the lowliest spammer to the grand exulted CEO of a massive company, can remove or change the content of an email message they have sent to you.

At first glance, Facebook Messenger seems to work the same way. You can delete your copy of any message in a conversation, but the other parties keep their unchanged copy. However, it turns out that insiders with privileged access can change history for somebody else, creating an effect similar to gaslighting where you can no longer confirm your recollection of what was once said.

In short, centralised social networks are not a safe repository for your electronic memory. They can change their policies and retroactively change messages underneath you.

With email, it’s all based on open standards, and you can choose a provider you trust to retain messages for you.

FastMail is a provider you can trust

We have built our business on a very simple proposition: we proudly charge money in exchange for providing a service. This means our loyalties are not split. We exist to serve your needs.

Our top three values are all about exactly this. You are our customer, your data belongs to you, we are good stewards of your data.

The right to remember, and the right to forget

We provide tools to allow you to implement rules around retention (for example, you can have your Trash folder automatically purge messages after 30 days), but we don’t ever remove messages without your consent and intent.

If you do delete messages, we don’t destroy them immediately, because our experience has shown that people make mistakes. We allow a window of between one and two weeks in which deleted messages can be recovered (see technical notes at the end of this post for exact details).

Since 2010, our self-service tool has allowed you to restore those recently deleted messages. We don't charge using this service, it’s part of making sure that decisions about your data are made by you, and helping you recover gracefully from mistakes.

Because we only scan message content to build the indexes that power our great search tools and (on delivery) for spam protection – once messages are deleted, they’re really gone. You have the right to forget emails you don’t want to keep.

You’re in control

Thanks as always to our customers who choose what to remember, and what to forget. It’s your email, and you are in control of its lifecycle. Our role is to provide the tools to implement your will.

Nobody else decides how long you keep your email for, and nobody can take back a message they’ve sent you. Your email, your memory, your choice.

An Update

Since I started drafting this article, Facebook have doubled down on the unsend feature, saying that they will make it possible for anybody to remove past messages.

While it's definitely more equitable, I still don't think this is a good idea. People will work around it by screenshotting conversations, and it just makes the platform more wasteful of everybody's time and effort. Plus it's much easier to fake a screenshot than to fake up a live Facebook Messenger interface while scrolling back to show messages.

There are really a lot of bad things about unreliable messaging systems, which is exactly what Wired has to say about this rushed and poorly thought-out feature. Stick with email for important communications.


Technical notes:

We currently purge messages every Sunday when the server load is lowest – and only messages which were deleted over a week ago. Therefore the exact calculation for message retention is one week plus the time until the next Sunday plus however long it takes the server to get to your mailbox as it scans through all the mailboxes containing purged messages. Deleting files is surprisingly expensive on most filesystems, which is why we save it until the servers are least busy.

We also have backups, which may retain deleted messages for longer based on repack schedules, but which can’t be automatically restored with messages that were deleted longer than two weeks ago.


cardiParty 2018-04 Melbourne Open Mic Night

Published 8 Apr 2018 by Justine in newCardigan.

a GLAMRous storytelling event 20 April 2018 6.30pm

Find out more...


Mediawiki search form autocomplete not working on my custom skin

Published 7 Apr 2018 by Jose in Newest questions tagged mediawiki - Stack Overflow.

Hello gurus of Mediawiki,

I am having trouble modifying one of my custom MediaWiki skins (1.26). I followed the MediaWiki skinning guide to create a search form within my BaseTemplate. I am using the provided API method makeSearchInput to create the search input box. But for some reason, it's not doing the auto-complete as it is supposed to. I have looked into other MediaWiki skin examples and tried to duplicate the settings to see if I can get it to work, but nothing really helped.

 <form class="mw-search" role="form" id="searchform" action="<?php $this->text('wgScript'); ?>">
            <?php
              echo $this->makeSearchInput(array('id' => 'searchInput'));

              echo Html::hidden( 'title', $this->get( 'searchtitle' ) );
             ?>
 </form>

When I look at the network activity for all of the other skins where the autocomplete works, I can see requests being sent to api.php each time I enter a character into the input box. But for some reason, it doesn't send anything on my own custom skin; it almost looks like it doesn't even attempt to send the query. I have been searching online but without any luck in discovering what the problem is. Since it works for the other skins on the same server, it's probably not a global setting that I am missing, but it could be something I missed in the skin configuration. I am not trying to do any fancy modification, so I must be doing something silly. I have been struggling and wasting many hours on this, so now I am here asking for help...

Does anyone have any idea on what could be causing this? Any help would be very very much appreciated.

Sincerely,


Untitled

Published 6 Apr 2018 by Sam Wilson in Sam's notebook.

I want a login-by-emailed-link feature for MediaWiki, so it’s easier to log in from mobile.


Wikidata Map March 2018

Published 6 Apr 2018 by addshore in Addshore.

It’s time for the first 2018 instalment of the Wikidata Map. It has been roughly 4 months since the last post, which compared July 2017 to November 2017. Here we will compare November 2017 to March 2018. For anyone new to this series of posts, you can see the progression of these maps by looking at the posts on the series page.

Each Wikidata Item with a Coordinate Location (P625) will have a single pixel dot. The more Items present, the more pixel dots and the more the map will glow in that area. The pixel dots are plotted on a totally black canvas, so any land mass outline simply comes from the mass of dots. You can find the raw data for these maps and all historical maps on Wikimedia Tool Labs.

Looking at the two maps below (the more recent map being on the right) it is hard to see the differences by eye, which is why I’ll use ImageMagick to generate a comparison image. Previous comparisons have used Resemble.js.

ImageMagick has a compare tool that can highlight areas of change in another colour and soften the unchanged areas of the image. The image below highlights the changed areas in violet while fading everything that remains unchanged between the two images. As a result, all areas highlighted in violet have either had Items added or removed; these areas can then be compared with the originals to confirm that they are in fact additions.

If you want to try comparing two maps, or two other images, using ImageMagick then you can try out https://online-image-comparison.com/ which allows you to do this online!
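
For anyone wanting to reproduce the diff locally, the same kind of comparison can be generated with ImageMagick's compare command. The file names below are just placeholders for the two map renders:

# Highlight changed pixels in violet; unchanged areas are rendered faded.
compare -highlight-color violet \
    wikidata-map-2017-11.png wikidata-map-2018-03.png wikidata-map-diff.png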

What has changed?

The main areas of change that are visible on the diff are:

There is a covering of violet across the entire map, but these are the key areas.

If you know the causes for these areas of greatest increase, or think I have missed something important, then leave a comment below and I’ll be sure to update this post with links to the projects and or users.

Files on Commons

All sizes of the Wikidata map for March have been uploaded to Wikimedia Commons.

The post Wikidata Map March 2018 appeared first on Addshore.


New MediaWiki extension: AutoCategoriseUploads

Published 5 Apr 2018 by Sam Wilson in Sam's notebook.

New MediaWiki extension: AutoCategoriseUploads. It “automatically adds categories to new file uploads based on keyword metadata found in the file. The following metadata types are supported: XMP (many file types, including JPG, PNG, PDF, etc.); IPTC (JPG); ID3 (MP3)”.

Unfortunately there’s no code yet in the repository, so there’s nothing to test. Sounds interesting though.


integrate existing PHP session with MediaWiki

Published 5 Apr 2018 by Saleh Altahini in Newest questions tagged mediawiki - Stack Overflow.

I have a site and want to integrate a Wiki into it.

I know I can change my code to register and/or set wiki cookies when the user logs in, but this will slow the system down, especially since not every user will visit the wiki.

Is there a way to make the wiki check whether a PHP session exists and automatically treat users who are logged in on the main site as logged in on the wiki as well?

I tried looking into SessionManager and AuthManager, but the documentation is too complicated for me since it's my first time working with MediaWiki. If anyone can point me to the right part of the docs, it would be very much appreciated.
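
(Editorial note: the usual place to hook this in MediaWiki 1.27+ is a custom session provider registered via $wgSessionProviders. The sketch below is untested and written from memory of the SessionManager framework, so treat the exact class, method and option names as assumptions to verify against Manual:SessionManager and AuthManager, rather than as working code.)

<?php
// Untested sketch of a session provider that trusts a session created by the
// main site. All names are illustrative; check the interfaces against your
// MediaWiki version before using anything like this.
use MediaWiki\Session\SessionInfo;
use MediaWiki\Session\UserInfo;

class MainSiteSessionProvider extends MediaWiki\Session\ImmutableSessionProviderWithCookie {

    public function provideSessionInfo( WebRequest $request ) {
        // Hypothetical helper: map the main site's session cookie to a username
        // using your own session storage (database table, Redis, etc.).
        $username = $this->lookupMainSiteUser( $request->getCookie( 'MAINSITESESSID', '' ) );
        if ( $username === null ) {
            return null; // no external session: fall back to normal wiki login
        }

        return new SessionInfo( SessionInfo::MAX_PRIORITY, [
            'provider' => $this,
            'id' => $this->getSessionIdFromCookie( $request ),
            'userInfo' => UserInfo::newFromName( $username, true ),
            'persisted' => true,
        ] );
    }

    private function lookupMainSiteUser( $cookieValue ) {
        // Query your own session store here; return null if nothing matches.
        return null;
    }
}

The provider then has to be registered in $wgSessionProviders in LocalSettings.php. Note that this only ties the two logins together; it does not create wiki accounts for main-site users, which is what AuthManager (or an existing extension such as Auth_remoteuser) is for.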


Disparity between Wikipedia's "What links here" count and backlink count using recommended tool

Published 5 Apr 2018 by user1200 in Newest questions tagged mediawiki - Stack Overflow.

I am trying to retrieve a list of backlinks to a list of pages on the English Wikipedia. I first tried using the MediaWiki API to collect all of the links, using the blcontinue parameter; however, when I queried certain pages (e.g., Canada) there was an inordinate number of backlinks, i.e., many thousands.

When I look at the "What links here" page for Canada and exclude redirects, there again seems to be an inordinate number (https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Canada&namespace=0&limit=5000&hideredirs=1). I decided that for now I could make do with the total rather than the full list of links, so I used the recommended tool (https://en.wikipedia.org/wiki/Help:What_links_here#Number_of_links) and queried it for Canada, non-redirects (the default namespace is 0), effectively replicating the above query. Here's the documentation, https://dispenser.info.tm/~dispenser/cgi-bin/backlinkscount.py, and here's some sample R code:

library(httr)

bl_url <- "https://dispenser.info.tm/~dispenser/cgi-bin/backlinkscount.py"
query_param <- list(
  title = "Canada",
  filterredir = "nonredirects")

bbl <- GET(bl_url, query = query_param)

num_bl <- as.numeric(content(bbl))

> num_bl
[1] 353

Here's the URL produced by the call to the API:

https://dispenser.info.tm/~dispenser/cgi-bin/backlinkscount.py?title=Canada&filterredir=nonredirects

So the total returned is 353, far fewer than on the "What links here" page.

Am I missing something obvious?
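
(Editorial note: one quick cross-check is to page through list=backlinks yourself and count the results. The following is a rough, untested sketch using httr, with the parameter names taken from the API documentation for list=backlinks:)

# Count non-redirect backlinks to "Canada" in namespace 0 directly from the
# MediaWiki API, following the continuation parameters until exhausted.
library(httr)

api <- "https://en.wikipedia.org/w/api.php"
params <- list(action = "query", format = "json", list = "backlinks",
               bltitle = "Canada", blnamespace = 0,
               blfilterredir = "nonredirects", bllimit = 500)

total <- 0
repeat {
  res <- content(GET(api, query = params), as = "parsed")
  total <- total + length(res$query$backlinks)
  if (is.null(res$continue)) break          # no more pages of results
  params$blcontinue <- res$continue$blcontinue
  params$continue <- res$continue$continue
}
total

Comparing that number with both the third-party counter and Special:WhatLinksHere should at least show which of the two is out of line; note that WhatLinksHere can also include pages that reach the article indirectly through redirects, which is one common source of larger totals.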


nginx and mediawiki on a subdirectory of a different server

Published 4 Apr 2018 by It support in Newest questions tagged mediawiki - Stack Overflow.

So, Marketing has requested that our wiki, currently at wiki.example.com, be moved to www.example.com/wiki

Now, www.example.com and wiki.example.com are two different servers; not only that, www.example.com runs nginx and wiki.example.com runs Apache2.

I need to be sure wiki.example.com keeps working, so I cannot touch its LocalSettings to adapt it (unless I create a copy?), and no combination of proxy_pass, rewrites, etc. has got me through this, so I'm asking for help :)

If anyone asks, I can list all the different options I have tried, from location /wiki to location ~ ^/wiki(.*)? ... Anything I've found, I've tried (even proxy_redirect, which IMHO doesn't make much sense here).

I see the problem though: nginx sends the request to Apache + MediaWiki, which rewrites the URLs and sends the response back... then nginx doesn't know how to treat that, and I get nothing but a funny URL (http://www.example.com/wiki/index.php/Main_Page) and a 404 error.

I have just deleted all configurations and am starting from scratch again; any idea/comment will be greatly appreciated.

Edit> Currently using this:

location /wiki {
    access_log /home/example/logs/wiki-access.log combined;
    error_log /home/example/logs/wiki-error.log;

    try_files $uri $uri/ @rewrite;
}
location @rewrite {
    rewrite ^/(.*)$ /index.php?title=$1&$args;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host www.example.com;
    proxy_pass http://wiki.example.com;
    proxy_redirect http://wiki.example.com /wiki;
}

location / {
    # try to serve file directly, fallback to front controller
    try_files $uri /index.php$is_args$args;
}

This forwards the whole page to http://wiki.example.com. That's not what I want :(


See My Hat! new exhibition for children and families coming soon

Published 3 Apr 2018 by carinamm in State Library of Western Australia Blog.

SeeMyHat_JGP

Studio portrait of Ella Mackay wearing a paper hat, 1915, State Library of Western Australia, 230179PD

Featuring photographs and picture books from the State Library collections, this exhibition is designed especially for children and families. Dress hats, uniform hats and fancy dress hats are just some of the millinery styles to explore. Children and their families have the opportunity to make a hat and share a picture book together.

See My Hat! will be on display in the Story Place Gallery, Mezzanine floor from Tuesday 10 April – Wednesday 11 July.


Episode 5: Brian Wolff

Published 3 Apr 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Brian Wolff (username Bawolff) works in the Security team at the Wikimedia Foundation, and has been doing MediaWiki and MediaWiki extension development since 2009.   Links for some of the topics discussed:

Mentoring Engineers Through An Engineering Fellowship Program

Published 3 Apr 2018 by Tom Spiegelman in The DigitalOcean Blog.


For two years, I’ve managed the Infrastructure Engineering (“Infra”) team at DigitalOcean. We’re responsible for managing all servers and machines up to the application layer. This includes hardware spec, firmware, component validation, base OS, configuration management, and hardware management (hardware management database).

In addition to my core responsibilities managing the Infra team, I wanted to foster an environment where mentorship was possible and worked with colleagues to create the Infrastructure Engineering Fellowship Program. It’s an immersive program where DigitalOcean employees from other teams “join” the Infra team for two weeks. Employees with fundamental Linux knowledge and some configuration management experience are eligible to participate.

“Fellows”—as they are known—are invited to a private Slack channel with fellowship alumni. They work through JIRA tickets assigned to the team (all while pairing with Infra team engineers), attend team stand-ups, and finally pick a project to work on for the two-week duration. Additionally, fellows meet with me at the start and end of each week to discuss what they worked on and to answer any questions they have. To date, nine people have completed the fellowship, and we continue to open it up to other engineers at DO.

How the Fellowship Started

This program started as a cross-team training experience between my team and the Tier-2 Cloud Operations team (the 24/7 team responsible for uptime on our servers and services), since both of our teams interacted with each other on a daily basis. After a few successful trials with the Cloud Operations team, we realized that there were several other teams that were interested in learning what we do and wanted to take advantage of the fellowship program. We have now had people from five different teams sign up and participate in the program.

My team gets so much more out of the fellowship than we put in. First, we build camaraderie between the wider organization and my team. Individuals we only worked with through JIRA and Slack now have a personal relationship with the team and are more eager to engage and work with us. My team also gains a better perspective of what other teams go through and work on daily, which helps us build better tools and workflows to support them. Finally, it is a great way to recruit: engineers who have been hired onto my team came through the fellowship program.

Growing people internally is one of the greatest things I have done in my career. I have had three people join my team from inside the company, and they have been very successful in their new roles. In a perfect world, we would pair every senior engineer on the team with one engineer still early in their career. In my experience, when looking at Tuckman's stages of group development, you will have the best-performing team when you have mentors and mentees going through the four stages together as a team:


Tuckman's stages of group development. Photo credit: Tutorials Point

Managing the Fellowship Program

One of the things that we keep top of mind is sustainability. Although two weeks isn’t very long, properly mentoring someone takes a lot of time, and we want to make sure no one feels overwhelmed by the experience. We currently take on just one fellow at a time, and we cater the program to each participant. For example, if a fellow is more interested in hardware than big data, they might pair with our integration team who is charged with managing hardware and firmware, rather than our DevOps-focused team.

There are a few benefits of managing the fellowship this way. One, we can iterate quickly since the program lasts just two weeks. And two, we can focus our energies on mentoring just one person at a time to limit straining the team’s bandwidth. Based on feedback from past fellows, we’ve changed how we handle our 1:1s with engineers and code pairing sessions. We now conduct 1:1s with specific goals in mind. Each fellow is asked to give feedback at the very end of the program to help us guide future fellows.

That said, the same benefits are in some ways ongoing challenges. Working with each fellow individually takes up my time, but it also affects the engineers on my team. They need to take time out of their busy schedules to pair with the fellow by breaking their usual workflow and compelling them to walk through projects step by step. This means something that may take them an hour ends up taking most of a day.

That said, we’re able to make this work because we work on a number of tasks and projects at any given time. If a team is working on one long-term project, the time it takes to explain the project to someone won’t actually yield any benefit in a two-week long program. The fellowship program (and programs like it) really need to be catered to the participant and the team that they are embedding with.

What Makes It Worthwhile

As I pointed out earlier, pairing engineers with more senior engineers leads to better-performing teams. Furthermore, the connection is even stronger when you pair engineers who have proprietary or historical knowledge from inside the company. I am a firm believer that if strong-minded, eager-to-learn engineers exist within the company, you shouldn't hire from outside it. Creating infrastructure that supports mentorship leads to strong engineers, strong teams, and a strong company.

I love seeing people continue to have conversations and work on projects with my team after the fellowship is over. It is simply amazing to see, and I give all the credit to the engineers on my team. Every one of them is eager to pass on knowledge that they have, and they’ve embraced the fellowship and its goals. The fellowship wouldn’t have been successful if my team didn’t share the same beliefs around mentorship and its cross-team benefits that I have.

Future of the Fellowship

When I started my career in IT, I had an amazing mentor (shout out to Rob Lahnemann) who really took me under his wing and taught me everything he could about programming, Linux, and networking. My manager at the time (shout out to Eric Austin) set this up and put me in a place to succeed as a mentee. This experience really shaped what I believe it means to be a good manager. Pairing engineers who are eager to learn with senior engineers is a key factor in any successful team. In the current engineering community, it is not uncommon to find engineers who are not encouraged to share their knowledge or are not given the time to be a mentor. But in my opinion, growing as an engineer means being a mentor.

In the future, I would love to see the program become more of a revolving door, with people doing more work with the Infrastructure Engineering team and going through the fellowship program multiple times (hopefully sometimes for longer than two weeks). I would also love to encourage programs like this both inside and outside DigitalOcean. One of my biggest goals in writing this is to inspire similar programs in the industry as a whole. My career and pace of growth were directly influenced by a strong mentor, so my passion for encouraging more mentor/mentee relationships in the industry is high.

Tom Spiegelman is an Infrastructure Engineering Manager at DigitalOcean. He has an awesome dog, a great team, and is married to the amazing Chantal Spiegelman. He is passionate about all things tech, specifically infrastructure. You can find him on LinkedIn or on Twitter.


From 0 to Kubernetes cluster with Ingress on custom VMs

Published 2 Apr 2018 by addshore in Addshore.

While working on a new Mediawiki project, and trying to setup a Kubernetes cluster on Wikimedia Cloud VPS to run it on, I hit a couple of snags. These were mainly to do with ingress into the cluster through a single static IP address and some sort of load balancer, which is usually provided by your cloud provider. I faffed around with various NodePort things, custom load balancer setups and ingress configurations before finally getting to a solution that worked for me using ingress and a traefik load balancer.

Below you’ll find my walk through, which works on Wikimedia Cloud VPS. Cloud VPS is an openstack powered public cloud solution. The walkthrough should also work for any other VPS host or a bare metal setup with few or no alterations.

Step 0 – Have machines to run Kubernetes on

This walkthrough will use 1 master and 4 nodes, but the principle should work with any other setup (single master single node OR combined master and node).

In the below setup m1.small and m1.medium are VPS flavours on Wikimedia Cloud VPS. m1.small has 1 CPU, 2 GB mem and 20 GB disk; m1.medium has 2 CPU, 4 GB mem and 40 GB disk. Each machine was running debian-9.3-stretch.

One of the nodes needs to have a publicly accessible IP address (a Floating IP on Wikimedia Cloud VPS). In this walkthrough we will assign this to the first node, node-01. Eventually all traffic will flow through this node.

If you have firewalls around your machines (as is the case with Wikimedia Cloud VPS) then you will also need to setup some firewall rules. The ingress rules should probably be slightly stricter as the below settings will allow ingress on any port.

Make sure you turn swap off, or you will get issues with kubernetes further down the line (I’m not sure if this is actually the correct way to do this, but it worked for my testing):

sudo swapoff -a
sudo sed -i '/ swap /d' /etc/fstab

Step 1 – Install packages (Docker & Kubernetes)

You need to run the following on ALL machines.

These instructions basically come from the docs for installing kubeadm, specifically, the docker and kube cli tools section.

If these machines are new, make sure you have updated apt:

sudo apt-get update

And install some basic packages that we need as part of this install step:

sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

Next add the Docker and Kubernetes apt repos to the sources and update apt again:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") $(lsb_release -cs) stable"
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update

Install Docker:

sudo apt-get install -y docker-ce=$(apt-cache madison docker-ce | grep 17.03 | head -1 | awk '{print $3}')

Install the Kube packages:

sudo apt-get install -y kubelet kubeadm kubectl

You can make sure that everything installed correctly by checking the docker and kubeadm version on all machines:

docker --version
kubeadm version

Step 2.0 – Setup the Master

Setup the cluster with a CIDR range by running the following:

sudo kubeadm init --pod-network-cidr=10.244.0.0/16

The init command will spit out a token; you can copy it now, but don't worry, we can retrieve it later.

At this point you can choose to update your own user .kube config so that you can use kubectl from your own user in the future:

mkdir -p $HOME/.kube
rm -f $HOME/.kube/config
sudo cp -if /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Setup a Flannel virtual network:

sudo sysctl net.bridge.bridge-nf-call-iptables=1
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml

These yml files are coming directly from the coreos/flannel git repository on GitHub and you can easily pin these files at a specific commit (or run them from your own copies). I used kube-flannel.yml and kube-flannel-rbac.yml

Step 2.1 – Setup the Nodes

Run the following for networking to be correctly setup on each node:

sudo sysctl net.bridge.bridge-nf-call-iptables=1

In order to connect the nodes to the master you need to get the join command by running the following on the master:

sudo kubeadm token create --print-join-command

Then run this join command (the one output by the command above) on each of the nodes. For example:

sudo kubeadm join 10.68.17.50:6443 --token whverq.hwixqd5mb5dhjz1f --discovery-token-ca-cert-hash sha256:d15bb42ebb761691e3c8b49f31888292c9978522df786c4jui817878a48d79b4

Step 2.2 – Setup the Ingress (traefik)

On the master, mark node-01 with a label stating that it has a public IP address:

kubectl label nodes node-01 haspublicip=true --overwrite

And apply a manifest for traefik:

kubectl apply -f https://gist.github.com/addshore/a29affcf75868f018f2f586c0010f43d

This manifest comes from a gist on GitHub. Of course, you should really run it from a local static copy.

Step 3.0 – Setup the Kubernetes Dashboard

This isn’t really required; at this stage your Kubernetes cluster should already be working. But for testing things and visualizing the cluster, the Kubernetes dashboard can be a nice bit of eye candy.

You can use this gist deployment manifest to run the dashboard.

Note: You should alter the Ingress configuration at the bottom of the manifest. Ingress is currently set to kubernetes-dashboard.k8s-example.addshore.com and kubernetes-dashboard-secure.k8s-example.addshore.com. Some basic authentication is also added with the username “dashuser” and password “dashpass”

Step 3.1 – Setup a test service (guids)

Again, your cluster should already be fully set up at this point, but if you want a simple service to play around with you can use the alexellis2/guid-service Docker image, which was used in the blog post “Kubernetes on bare-metal in minutes”.

You can use this gist deployment manifest to run the service.

Note: You should alter the Ingress configuration at the bottom of the manifest. Ingress is currently set to guids.k8s-example.addshore.com.

This service returns simple GUIDs, including the name of the container that each GUID was generated from. For example:

$ curl http://guids.k8s-example.addshore.com/guid
{"guid":"fb426500-4668-439d-b324-6b34d224a7df","container":"guids-5b7f49454-2ct2b"}

Automating this setup

While setting up my own Kubernetes cluster using the steps above, I actually used the Python library and command-line tool called fabric.

This allowed me to minimize my entire installation and setup to a few simple commands:

fab provision
fab initCluster
fab setupIngressService
fab deployDashboard
fab deployGuids

I might write a blog post about this in the future; until then, fabric is definitely worth a read. I much prefer it to other tools (such as Ansible) for fast prototyping and repeatability.
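
For the curious, a fabfile for this kind of setup might look roughly like the sketch below. It is untested and purely illustrative (host names and task bodies are placeholders), and it targets classic Fabric 1.x as used at the time:

# fabfile.py -- illustrative sketch only, not the fabfile used for this post.
from fabric.api import env, hosts, sudo

env.hosts = [
    'k8s-master-01.example.org',   # placeholder host names
    'k8s-node-01.example.org',
    'k8s-node-02.example.org',
]
env.use_ssh_config = True


def provision():
    """Prepare every machine: disable swap and install Docker + kube packages."""
    sudo('swapoff -a')
    sudo("sed -i '/ swap /d' /etc/fstab")
    sudo('apt-get update')
    sudo('apt-get install -y apt-transport-https ca-certificates curl '
         'software-properties-common')
    # ... then add the Docker and Kubernetes apt repositories and install
    # docker-ce, kubelet, kubeadm and kubectl, as in the manual steps above.


@hosts('k8s-master-01.example.org')
def initCluster():
    """Initialise the control plane on the master only."""
    sudo('kubeadm init --pod-network-cidr=10.244.0.0/16')

Running "fab provision initCluster" then covers the manual provisioning and master setup steps in one go.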

Other notes

This setup was tested roughly an hour before writing this blog post with some brand new VMs and everything went swimmingly; however, that doesn’t mean things will go perfectly for you.

I don’t think I ever correctly set swap to remain off for any of the machines.

If a machine goes down, it will not rejoin the cluster automatically; you will have to rejoin it manually (the last part of step 2.1).

The post From 0 to Kubernetes cluster with Ingress on custom VMs appeared first on Addshore.


cardiCast Episode 30 – Annika Kristensen

Published 2 Apr 2018 by Justine in newCardigan.

Melbourne February 2018 cardiParty

Recorded live

Our February Melbourne cardiParty was held at ACCA, the Australian Centre for Contemporary Art, with Senior Curator Annika Kristensen taking us on a special tour of the exhibition Unfinished Business: Perspectives on art and feminism.
Annika concentrated her discussion on a number of key artworks in the exhibition.

https://acca.melbourne

newcardigan.org
glamblogs.newcardigan.org

 

Music by Professor Kliq ‘Work at night’ Movements EP.
Sourced from Free Music Archive under a Creative Commons licence.


v2.4.8

Published 2 Apr 2018 by fabpot in Tags from Twig.


GLAM Blog Club April 2018

Published 1 Apr 2018 by Hugh Rundle in newCardigan.

April is here, Daylight Savings is over, and we’re all happy to be on Easter holidays. Happiness was our theme for March and as usual the newCardigan community shared some great blog posts.

Rebecca was first in with a post about her amazing trip to the Anna Amalia Bibliothek. Kara, meanwhile, told us about the moment she realised that librarianship would make her happier than search engine optimisation. Our own happiness specialist Anne, shares seven tips to make you a happier librarian. The Andrews took us on a long digression about archiving Twitch streams, took happiness in other people’s happiness, and waxed lyrical about …books! Stacey’s happiness comes from climbing (literal) cliffs, whereas Clare likes to play word games and think about utopias. Alissa loves being a librarian and she’s not even sorry, whereas I am sorry that my GLAM Blog Club post in March was actually on our February theme: Watch. Nik loves looking at photos of people (hello Instagram!), whilst Lydia’s personal Happiness Project turns out to be her profession: lucky Lydia! Lucinda found happiness at Geelong Gallery’s Kylie on Stage exhibition, whilst Michaela, despite being overseas at an amazing conference, still found time to blog about the surprising happiness that comes from eating real poutine. Finally, new GLAM Blog Clubber Donna finds happiness in libraries.

For April, our theme is Control. Are you, perhaps, a little bit of a control freak? Or are you more interested in finding ways that GLAM institutions can hand control back to the communities we serve? Do you hope to one day take control as a CEO or Manager of an institution? Or are you just trying to find a way to control your email inbox? Let us all know!

Most importantly, make sure you use a controlled vocabulary before you publish your blog post. Use the tag GLAM Blog Club in your post, and #GLAMBlogClub for any social media posts linking to it. And of course if you haven’t done so yet, remember to register your blog at Aus GLAM Blogs. Happy blogging!


Include and parse file (hosted on external site) as wikitext with MediaWiki

Published 30 Mar 2018 by Punknoodles in Newest questions tagged mediawiki - Stack Overflow.

With MediaWiki, is there any way to include a text file hosted on another site/server and parse that file as wikitext? Is there any way to include a text file at all?
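
(Editorial note: there is no built-in way to transclude remote files, but a small tag extension can fetch a URL and run the result back through the parser. The sketch below is untested and purely illustrative; it ignores caching, and rendering wikitext you do not control is a genuine security concern.)

<?php
// Illustrative, untested sketch. Usage on a wiki page:
//   <remotewikitext src="https://example.org/snippet.txt" />
$wgHooks['ParserFirstCallInit'][] = function ( Parser $parser ) {
    $parser->setHook( 'remotewikitext',
        function ( $input, array $args, Parser $parser, PPFrame $frame ) {
            $src = isset( $args['src'] ) ? $args['src'] : '';
            if ( $src === '' ) {
                return '';
            }
            // Http::get() is MediaWiki's built-in HTTP helper (pre-1.34 wikis).
            $text = Http::get( $src );
            if ( !is_string( $text ) ) {
                return 'Could not fetch ' . htmlspecialchars( $src );
            }
            // Feed the fetched text back through the parser as wikitext.
            return $parser->recursiveTagParse( $text, $frame );
        }
    );
    return true;
};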


cardiParty 2018-04 with Eddie Marcus

Published 29 Mar 2018 by Andrew Kelly in newCardigan.

Join Eddie Marcus (the sharp mind behind the Dodgy Perth blog) for the shortest heritage pub trail ever. Eddie will explore the Greek and Roman architecture of three iconic Northbridge pubs. 6:30pm, Friday 13 April.

Find out more...


Digital preservation begins at home

Published 29 Mar 2018 by Jenny Mitcham in Digital Archiving at the University of York.

A couple of things happened recently to remind me of the fact that I sometimes need to step out of my little bubble of digital preservation expertise.

It is a bubble in which I assume that everyone knows what language I'm speaking, in which everyone knows how important it is to back up your data, knows where their digital assets are stored, how big they might be and even what file formats they hold.

But in order to communicate with donors and depositors I need to move outside that bubble otherwise opportunities may be missed.

A disaster story

Firstly a relative of mine lost their laptop...along with all their digital photographs, documents etc.

I won't tell you who they are or how they lost it for fear of embarrassing them...

It wasn’t backed up...or at least not in a consistent way.

How can this have happened?

I am such a vocal advocate of digital preservation and do try and communicate outside my echo chamber (see for example my blog for International Digital Preservation Day "Save your digital stuff!") but perhaps I should take this message closer to home.

Lesson #1:

Digital preservation advocacy should definitely begin at home

When a back up is not a back up...

In a slightly delayed response to this sad event I resolved to help another family member ensure that their data was 'safe'. I was directed to their computer and a portable hard drive that is used as their back up. They confessed that they didn’t back up their digital photographs very often...and couldn’t remember the last time they had actually done so.

I asked where their files were stored on the computer and they didn’t know (well at least, they couldn’t explain it to me verbally).

They could however show me how they get to them, so from that point I could work it out. Essentially everything was in ‘My Documents’ or ‘My Pictures’.

Lesson #2:

Don’t assume anything. Just because someone uses a computer regularly it doesn’t mean they know where they put things.

Having looked firstly at what was on the computer and then what was on the hard drive it became apparent that the hard drive was not actually a ‘back up’ of the PC at all, but contained copies of data from a previous PC.

Nothing on the current PC was backed up and nothing on the hard drive was backed up.

There were however multiple copies of the same thing on the portable hard drive. I guess some people might consider that a back up of sorts but certainly not a very robust one.

So I spent a bit of time ensuring that there were 2 copies of everything (one on the PC and one on the portable hard drive) and promised to come back and do it again in a few months' time.

Lesson #3:

Just because someone says they have 'a back up' it does not mean it actually is a back up.

Talking to donors and depositors

All of this made me re-evaluate my communication with potential donors and depositors.

Not everyone is confident in communicating about digital archives. Not everyone speaks the same language or uses the same words to mean the same thing.

In a recent example of this, someone who was discussing the transfer of a digital archive to the Borthwick talked about a 'database'. I prepared myself to receive a set of related tables of structured data alongside accompanying documentation describing field names and table relationships; however, as the conversation evolved it became apparent that there was actually no database at all. The term 'database' had simply been used to describe a collection of unstructured documents and images.

I'm taking this as a timely reminder that I should try and leave my assumptions behind me when communicating about digital archives or digital housekeeping practices from this point forth.











The challenge of calendaring

Published 29 Mar 2018 by David Gurvich in FastMail Blog.

We often focus on email functionality as it is the main focus of our product. However, FastMail has two other components - calendaring and contacts.

In this post we’re focusing on our calendar.

While calendaring has become an integral part of our flagship service, our calendar feature was only introduced in 2014, making it still relatively young in the history of FastMail. Remember we’ve been around since 1999, which might equate to around 100 in modern tech years…

Just like email, providing a calendar function presents its own challenges. In short, doing calendaring well is, well, hard. One of the main reasons is that the standards related to calendaring are still all over the place. We’re working hard on making these standards more consistent so that we can improve online calendaring for everyone.

One of our core values is a commitment to open standards. We’re not looking to create a walled garden by developing proprietary technology where your data is locked down to one source or provider.

By continuing to use CalDAV and iCalendar, FastMail helps drive open standards in online calendaring, and helps you use your information as you choose, syncing between different service providers and devices (just as with email).

The data in your FastMail calendars are stored in open formats and can be downloaded or backed up using any number of standard tools that speak standard protocols.

Community-minded calendaring

We are responsible members of many open source communities. We use, create, sponsor and contribute back to a number of projects, including the Cyrus email server.

A significant part of FastMail’s infrastructure runs on Cyrus, the open source email communication technology that was initially developed at CMU.

Right now one of our biggest projects is implementing JMAP as a new standard, which will help to extend the functionality of calendaring and replace CalDAV.

In order for us to live our values we also invest in our people. And when it comes to calendaring we’ve got a great team that helps us to improve and advance calendaring for all of our users, and hopefully the internet in general.

Ken Murchison, one of our calendar experts, was crucial to getting calendaring off the ground. Without Ken, calendaring and Cyrus may never have happened.

When Cyrus lacked any calendaring functionality it was Ken, then a CMU employee, who took up a casual challenge as a pet project and managed to build a calendaring function with very basic features.

Ken is quick to point out that part of Cyrus’ ongoing calendaring development was made possible by attending CalConnect and meeting and speaking with other developers.

Photo of Ken presenting at CalConnect

Ken met Bron around the 2.5 release of Cyrus, and this fortuitous meeting has laid the foundation for several improvements to the calendar and ongoing CalConnect attendances (and of course, Ken becoming a permanent member of the FastMail team).

For the last few years FastMail has been a member of CalConnect and attending this conference really is important to our ongoing development. Robert, another important part of our calendar team, recently wrote about the importance of CalConnect to FastMail.

Looking ahead

We’re hoping to see JMAP recognized as a new standard during 2018; once it is fully implemented it will bring many more improvements across email, calendars and contacts.

At a top level this will help us to continually improve the backend, performance, scheduling and subscriptions.

At a feature level we’re already testing out some exciting new technology. One of these is ‘consensus scheduling’ – recently discussed at CalConnect – which extends the original scheduling functionality and enables a client to send multiple time options for a meeting or appointment to a group of people. So instead of going back and forth to confirm a meeting time, it can all be done within the calendar.

Another feature we’ve started to explore is a polling function that could eventually be applied to things such as meeting confirmations for service providers, further reducing the reliance on telephone-based appointment making. Currently, a formal RFC is underway to help implement a standard.

We’re looking forward to introducing ongoing calendar improvements and features into FastMail and we’ll formally announce these as they enter our production environment.

A special event on the calendar

Earlier this year Ken was the ninth recipient of the CalConnect Distinguished Service Award.

Photo of the service award trophy

This award is a testament to Ken’s dedication to improving calendaring specification and standards. He is also the author of several RFCs and specifications, which have helped to define calendaring for users the world over.

Reflecting on his achievement, Ken remains as modest as ever, “it’s this interaction with other developers (in attending CalConnect) that is so important, testing and banging out code together.”

Ken’s achievements in the calendaring space are immense and he continues to help improve calendaring for all of us.

As our CEO Bron noted, “CyrusIMAP now has best-in-the-world implementation of calendars and contacts due to Ken’s involvement in CalConnect.”

Well done Ken!


Speaker Profile: Donna Edwards

Published 28 Mar 2018 by Rebecca Waters in DDD Perth - Medium.

Donna Edwards presenting at DDD Perth 2017 (DDD Perth Flickr Album)

Donna Edwards, a well known figure in the Perth software industry, presented at DDD Perth 2017 on Attraction and retention strategies for Women in Tech. She is the Events Manager for Women in Technology WA (WiTWA), on the committee for SQL Saturday, and VP of the Central Communicator Toastmasters club. I asked Donna about her experiences at DDD Perth.

From a Director of ACR, to a General Manager at Ignia, and more recently the State Delivery Manager at Readify, you have 20-odd years’ experience in the IT industry. Can you tell me a little about your career to date?

I’ve worked in different roles within the IT industry from sales, to crawling under desks setting up PCs, to phone support and even installing hardware and software for people. In the past ten years I’ve focused on culture and business growth. My passion has always been creating awesome places to work, winning high quality work and growing a phenomenal team. More than anything I believe life is too short to not love what you do — so follow what you love and everything will work out 😊

Words to live by right there. You’re a seasoned presenter on speaker panels; I’ve seen you speak at a number of events. Was DDD Perth one of your first solo presentations at a conference?

Yes I really enjoy panels and have done quite a few previously however DDD was my first solo presentation (over ten minutes long). Getting selected for a 45 minute slot was a huge achievement and pretty scary I have to admit 😊

What helped you decide to submit a talk to DDD Perth?

I knew that DDD was trying to attract more women presenters after 2016 and I’d never actually submitted for a conference before so I saw it as a challenge! My partner was also submitting so we actually spent a day whilst we were on a cruise sitting out on the deck writing out submissions 😊 we both submitted two talks. I certainly didn’t expect to get selected and was probably hoping not to haha!

That sounds like a bit of a #BeBoldForChange pledge from International Women’s Day 2017. Have you a #PressForProgress goal for 2018?

For me, its always about doing more and continuing to strive to be better both personally as well as achieve more for the community each year. This year I am currently about to take on another three committee roles as well as continuing to focus on taking the WiTWA events to another level. We've sold out our last three events hitting record numbers of attendees (200). It is super exciting to see the level of engaged women in our tech community. Just this week I shared four tech events with all female panels / speakers which is brilliant to see! And it will only get bigger and better 😊

Back to DDD Perth…Did you enjoy the day? How about presenting?

The day was fantastic. I got to hear some brilliant talks from the likes of Patima and Nathan and also got roped into being on a panel with them later in the day! There was a great vibe and everyone seemed to be really enjoying themselves along with lots of familiar faces as is the Perth IT industry 😊 Presenting was actually super fun! We had a few technical issues so it started a bit late which made me a little nervous but once I got started I thoroughly enjoyed the experience. I had done LOADS of practice so I felt pretty comfortable with the slides and content which definitely saved me! It didn’t help that I was in a job interview process and the two potential bosses were both watching my presentation — no pressure. I must have done ok cause I got the job 😉

Oh Wow! That’s an interesting point. As someone who makes hiring decisions for the company you work for, do you like seeing presentations and the like on a curriculum vitae?

Absolutely - whether they get involved in community events by either presenting or volunteering is a huge positive when I am choosing between applicants.

What are you looking forward to seeing in DDD Perth 2018?

The level of diversity for 2017 was great so I’m keen to see that remain or improve for 2018. I’m pretty sure it will be even bigger and better after last year sold out so that’s super exciting! More great sponsors no doubt and hopefully an even bigger after party (which means it will be huge). Finally looking forward to learning a lot — the best thing about DDD is the variety of awesome speakers and topics so you can really tailor the day for what you are interested in.

Thanks for chatting to us, Donna!


Speaker Profile: Donna Edwards was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


Publishing @ W3C goes to ebookcraft

Published 28 Mar 2018 by Tzviya Siegman in W3C Blog.

For many of us who work with ebooks, the highlight of our year is ebookcraft in Toronto. ebookcraft is a two-day conference devoted to ebook production, sponsored by Booknet Canada. The fifth edition was held last week, and it was a veritable who’s who of Publishing @ W3C.

Why do we love ebookcraft? It’s full of “practical tips and forward-thinking inspiration.” It’s impeccably organized, by the wizardly Lauren Stewart and her team. It’s warm and welcoming. There are cookies. More than half the speakers are women. It really is about making beautiful, accessible ebooks. Of course, that requires standards. The ebook world has suffered more than most, with interoperability being a dream rather than a reality. Many of the presenters are involved with standards work at W3C.

The first day of ebookcraft was devoted to workshops, where longer talks and smaller audiences allow for in-depth coverage of various topics. Naomi Kennedy (Penguin Random House) kicked off the day speaking about “Images in Ebooks,” addressing approaches to format, size, and color with the ever-popular Bill the Cat.

Romain Deltour (DAISY) asked his audience “Is Your EPUB Accessible?” I found out that mine was almost there but not quite (and I wrote some of the specs he was featuring, uh-oh!). Romain walked us through concepts such as how information gets from HTML to the user, what assistive technologies are, how to figure out if your content has accessibility support, and how to test your files. Romain is one of the developers behind Ace by DAISY, a command-line program to check EPUBs for accessibility, and he did a demo for us. Ace by DAISY is based on the EPUB Accessibility 1.0 spec.

There was a panel over lunch called “Everybody’s Working on the Weekend,” about volunteerism in digital publishing. The panelists were from Booknet Canada, some of the wonderful planners of the conference. Many of them also devote their time to standards development at Booknet Canada and other organizations. When it was time for audience participation, it was pretty clear that publishing is a world of volunteers. Everyone wants to help, but there’s a serious shortage of time and resources, given busy day jobs. And standards work can be daunting at first—we need to find ways to gently welcome newcomers.

Deborah Kaplan picked up after lunch with ”WAI-ARIA in Practice.” She walked us through ARIA best practices, perhaps most importantly when NOT to use ARIA. She also opened our eyes to the wide world of keyboard navigation and gave us a hefty reading list for learning more.

Peter Krautzberger’s talk “MathML: Equation Rendering in Ebooks” offered an overview of the options available for equational content in EPUB. We looked at equations in SVG and MathML and many options for making them accessible.

Conference organizer Laura Brady participated in a panel with the NNELS (National Network of Equitable Library Services) called “We Tear Apart Your Ebooks.” The panel discussed the NNELS open system for sharing accessible publications. Once a book is in the NNELS system, it can be shared throughout Canada. Authorized users request accessible publications, and the NNELS team works to make them accessible. Laura recently audited several publishers in Canada to assess their level of accessibility (really not that great) and trained them to get much better.

On Day 2, we shifted from workshops to the big room. Who better to kick off the day than Liisa McCloy-Kelley, co-chair of the Publishing Business Group? Liisa’s topic was “Laser Focus: Don’t Get Distracted by that Shiny Object.” Liisa gave us a short tour of the history of ebooks and EPUB (and made sure we knew how to spell it). Publishing, reading, and writing have changed a lot over the years. We all get caught up on “shiny objects” that might catch our attention briefly, but it’s important to explore why you want to do it. Is it because a feature is cool? Is someone asking you to add it? Are you fixing something that’s annoying? Do you have a unique solution? There are many questions to ask that can help you decide whether you should implement a change, and when (and if) you will make the change. There are some issues that the entire industry must address. We need to stop making proprietary formats and embrace standards. Focus on improving image quality as screen quality improves. We should consider the external contexts provided by reading systems, how voice, AR, and VR might affect our content, and be patient.

The highlight of the day was Rachel Comerford’s “epub and chill” talk. Somehow Rachel managed to compare online dating with ebooks. The whole room was chanting “expose your metadata, not yourself.” The rules for dating and ebooks are pretty similar: 1. Remember Your Audience 2. Use Standards 3. Be Transparent 4. Don’t Play Hard to Get. I strongly recommend checking out the video when it becomes available.

Karen Myers (W3C) and I spoke about standards in Publishing@W3C in a talk entitled “Great Expectations—The Sequel.” We offered a brief history of Publishing@W3C and a deep dive into the work happening in the EPUB3 Community Group, the Publishing Business Group, and the Publishing Working Group. We offered a quick tour of the cast of characters that makes up the rest of the W3C. We shared some highlights from groups such as WOFF, WAI, and Verifiable Claims that could be of real interest and value to the publishing community. We spoke about how to get involved and how to stay current.

Dave Cramer (co-chair of the EPUB 3 CG) and Jiminy Panoz went on an “Excellent CSS Adventure.” You’ll have to watch the video for Dave’s biblical opening. Dave and Jiminy explained the magic of CSS with some great tips, from the power of selectors and the cascade to the mysteries of pseudo-elements and inline layout.

Benjamin Young and I discussed an HTML-First Workflow at Wiley. We spoke briefly of Wiley’s 200+ year history of publishing books and journals. We have recently begun exploring an HTML-first workflow for our journal articles that looks at content apart from metadata. We have focused on layers of material. The content is in HTML. Metadata is in RDFa. Style is achieved with CSS, and internal processing is accomplished using HTML’s data-* attribute. The Wiley team that is working on this project began with a set of technical requirements with the goal of improving output. It is still a work in progress, but we heard that lots of people are ready to dive into HTML now.

Ben Dugas offered his perspective as an ebook retailer at the End of the Conveyer Belt. Ben works in Content Operations at Kobo. His team looks at all the files that pass through Kobo’s pipeline. To summarize, content creation is hard, spec creation is hard, content QA is hard, and building support is hard. My favorite part of Ben’s presentation was when he pointed out that it takes a little time to get used to standards work, but once they got used to our quirks, they realized they had actual opinions and it was okay to offer them. Ben’s advice is to move on to EPUB 3 (and beyond), use epubcheck and Ace, test across platforms, think about the reader, and not accept the status quo. Sound advice.

If you’re involved in the creation of ebooks, be sure to come to ebookcraft in 2019! In the meantime, you can see what people said about ebookcraft on social media, follow @ebookcraft on Twitter, and eagerly await the videos of this year’s conference.

Many thanks to Dave Cramer for his thoughtful editing of this post.


How to Conduct Effective Code Reviews

Published 28 Mar 2018 by Billie Cleek in The DigitalOcean Blog.


A code review, at its core, is a conversation about a set of proposed changes. Early in my career, I viewed code reviews as a mostly technical exercise that should be devoid of non-technical concerns. I now see them as one of the few opportunities to concurrently learn and teach while also strengthening my relationship with my peers and colleagues.

My team, Delivery, has been working together for at least six months (some much longer), but only two members work in the New York City office while the rest are spread across North America. Because of our familiarity with each other, most of our daily interactions take place via text or video chat. Code reviews are often short, but we also go out of our way to communicate when we are stating an opinion or being nit-picky.

Most software developers are expected to participate in code reviews, and yet few are offered any training or guidance on conducting and participating in an effective code review. Participants attempt to find the most appropriate solution to a problem given the constraints of time, effort, and skills of all involved. But how do we have that conversation? What does an effective conversation look like? And what are the challenges of participating in a code review, and how can you overcome them?

Whether your tool of choice is GitHub, GitLab, Gerrit, or another tool, the goal of this article is to help you get as much out of the code review process as possible.

What Are Code Reviews For?

Code reviews happen in a wide range of contexts, and often the skills and depth of experience of participants vary widely. On open source projects, for example, participants may not have any sort of personal relationship with each other. Indeed, they may never communicate outside of the code review process. At the other end of the spectrum are code reviews where the participants have daily face-to-face interactions, such as when everyone works at the same company. A good participant will adjust how they participate in a code review according to their knowledge of the other participants.

While it is important to adjust one's communication style in accordance with the intended recipient, how to adjust is influenced by three primary factors: the purpose of the code review, the intended audience, and one's relationship to the audience.

Identifying the Purpose of a Code Review

Code reviews serve both technical and cultural purposes: finding bugs before they're integrated, identifying security concerns, ensuring style consistency with the existing codebase, maintaining code quality, training, fostering a greater sense of ownership, and giving other maintainers an opportunity to get familiar with the code before it's integrated are just some of the reasons you may be asked to participate in code reviews. Make sure you know why you're participating in a code review beforehand.

Regardless of why you’re conducting a code review, it is important to respect the purposes that code reviews serve for the codebase. If the only purpose of a code review is to check for security concerns, then drop whatever personal concerns you may have about coding style or naming patterns. Unfortunately, it is not uncommon for the purpose of code reviews to be poorly defined or non-existent. In that case, once you've determined that the proposed changes are necessary and add value, I'd suggest reviewing for correctness, bug identification, and security concerns. Secondary to those concerns may be overall quality and long term maintainability of the proposed changes.

Submissions: What to Include

Code reviews typically start with a contributor submitting a proposed set of changes to the project. The submission should include:

Depending on the complexity of the changes, reviewers may find an overview of the trade-offs the submitter made in the patch helpful in order to better understand why the patch is the most appropriate of the possible alternatives.

Written communication about technical subjects can be difficult: people have limited time, and each of us is on a journey of confronting challenges and personal growth. In code reviews every participant has a role to play, each with its own set of objectives:

Regardless of your role in the review process, respect that others may be at a different place in their journey, and assume that all participants are engaging in the process in good faith and because of shared values and goals. The process is easiest when one assumes that all other participants are doing their utmost to help you succeed and get better.

Here's an example of a pull request from our team where I asked for clarification, discussed my concerns, and ultimately landed on a compromise that made the submission better and easier to maintain, all while gaining personal knowledge of the subject at hand:


Example of how my team communicates in our code reviews.

Knowing Your Audience

Start by reading all the code. As a reviewer, recognize that the submitter gave their time and energy and tried to improve the product in some way. As you read and strive to understand the patch, record your questions and concerns privately so that you understand the full context before providing any feedback. As mentioned previously, make an honest effort to restrict your feedback to the purposes for which the code review is being conducted.

Prepare and submit your feedback after reading and understanding the changes. Be gracious. Try to keep your comments focused on the code and the solution it offers; avoid digressing into unrelated matters. If you see something surprising, ask questions. If you don't have a strong history with a submitter, go the extra mile to communicate your good intentions. It's OK to use emojis to communicate tone. Strive to begin fostering a healthy, productive relationship with this new contributor.

Your feedback in code reviews is one of the primary ways to build a community of developers eager to contribute to your project. By nurturing a strong community, you will promote a quality product. Especially for open source maintainers, an authentic, explicit “thank you for the contribution” or other nice words can go a long way towards making people feel appreciated and fostering a supportive community.

Take the feedback, evaluate it, and decide what to do next. For submitters, it can be difficult to read criticism of the code you have written. When a reviewer asks for changes, they are doing so for the same reason a patch author submits a patch: a genuine desire to improve the product. Remind yourself that feedback about code is not personal. You may decide to accept the feedback and change something. Or you may decide that there was a misunderstanding, and that some requested changes are unwarranted or would simply be wrong or add no value. It’s OK to push back.

Developing a Partnership Through Code Reviews

When there is an asymmetric level of experience between the submitter and reviewer, use the opportunity to mentor. As a reviewer with more experience than the submitter, you may choose to accept that submitter's patch as-is and then improve upon it, contacting the submitter to let them know about your changes later. In a professional setting, such an approach isn't always feasible. Have the conversation in the open so that observers (i.e. other readers) can learn too, but reach out for a more personal touch if the extent of feedback is becoming overwhelming in written form. In my experience, patches submitted by someone significantly more experienced than the reviewer are usually accepted as-is or with only very minor changes requested.

When you're thinking out loud, make it clear to the reader, so that they understand you are evaluating a possibility rather than asking for a change. If you're nitpicking, explain your reasons for doing so. On our team, we often preface nit-picky comments with (nit), in order to help contributors recognize these types of comments. This usually serves as a signal that the contributor can ignore that feedback if they want. Without that marker, nitpicks are indistinguishable from the feedback that the reviewer feels more strongly about. For all participants: when you're unsure about something, ask, and err on the side of clarity and friendliness.

A successful code review will result in a higher quality change, strengthen the relationship between reviewer and submitter, and increase the understanding that everyone involved has of the project. Code reviews are not just a formality requiring a rubber stamp before a change is merged; they are an essential aspect of modern software development that provides real value to projects and teams by promoting good software engineering practices.

Conclusion

Through code reviews, I've learned to be more gracious and more understanding about the personal challenges and technical struggles that everyone experiences. I have learned to more thoughtfully examine the trade-offs that we all make when writing software. I hope the ideas presented here can help you grow your community and increase your effectiveness.

Billie Cleek is a Senior Software Engineer on the Delivery team where he supports internal tools to provide a consistent deployment surface for DigitalOcean's microservices. In his spare time, Billie is a maintainer of vim-go, infrequent contributor to other open source projects, and can be found working on his 100-year-old house or in the forests of the Pacific Northwest regardless of the weather. You may also find Billie on GitHub and Twitter.


Expanding Participation in W3C – a new Membership Level!

Published 27 Mar 2018 by J. Alan Bird in W3C Blog.

As W3C continues to evolve the breadth and depth of our work, we need to continue to address how we’re packaging and pricing our Membership options. In 2012 we added a tier designed for Startup organizations, which has enabled a significant number of these organizations to join our work. In 2014 we introduced the Introductory Industry Membership to allow organizations with a singular focus on a specific industry segment’s use cases to have a seat at the table.  Again, by all measures this has been a successful program. With the IDPF combination, we put a Transitional Publishing Industry Membership together to allow former IDPF Members a way to engage with us at a rate that is based on their prior IDPF fees with the understanding that it would lead to becoming regular Members at the end of that program. As with the other two, we’ve seen a significant number of organizations join the work at W3C via this program.

In December 2017, a discussion between W3C and a significant publishing company showed that the step up from TPI to regular Membership was too big for that publisher. We found this informative, as they are one of the bigger organizations in the Publishing Industry. Based on that conversation, and on exploratory discussions with potential Members, the W3C Business Development Team, and our current Membership, we defined a new Membership level aimed at public organizations that have revenues between $50M and $500M USD.

Today, I’m pleased to announce that W3C is moving ahead with that as a trial program to determine if this offering will be successful in attracting new Members to W3C. To determine if you qualify for this program, please go to this site. If you do qualify, the Membership Application System is ready for you! If you have any questions about the program, please don’t hesitate to send me a note at abird@w3.org.

Cheers,

J. Alan Bird, W3C Global Business Development Leader


Midwest Heritage of Western Australia

Published 27 Mar 2018 by Sam Wilson in Sam's notebook.

Midwest Heritage of Western Australia is a terrific database of records of graves and deceased people in the mid-west region of WA.


Untitled

Published 27 Mar 2018 by Sam Wilson in Sam's notebook.

I joined newCardigan today.


AggregateIQ Brexit and SCL

Published 25 Mar 2018 by addshore in Addshore.

UPDATE 02/04/2018: Looks like AggregateIQ may have had a contract with Cambridge Analytica, but didn’t disclose it because of an NDA… But it was all spoilt by an unsecured GitLab instance. https://nakedsecurity.sophos.com/2018/03/28/cambridge-analyticas-secret-coding-sauce-allegedly-leaked/


I wonder why AggregateIQ state that they have never entered into a contract with Cambridge Analytica, but say nothing about a contract with SCL. They do, however, say that they have never been part of SCL or Cambridge Analytica…

Channel 4 report on Brexit and AggregateIQ

From the AggregateIQ website & press release:

AggregateIQ is a digital advertising, web and software development company based in Canada. It is and has always been 100% Canadian owned and operated. AggregateIQ has never been and is not a part of Cambridge Analytica or SCL. Aggregate IQ has never entered into a contract with Cambridge Analytica. Chris Wylie has never been employed by AggregateIQ.
AggregateIQ works in full compliance within all legal and regulatory requirements in all jurisdictions where it operates. It has never knowingly been involved in any illegal activity. All work AggregateIQ does for each client is kept separate from every other client.

Links

The post AggregateIQ Brexit and SCL appeared first on Addshore.


Love is

Published 24 Mar 2018 by jenimcmillan in Jeni McMillan.

Lovers

I am passing through countries, discarding them like forgotten lovers. Now when I think about love, I have many more things to say. I think love is a vulnerability, a willingness to trust someone with a precious heart. To be so child-like and joyous that dancing and singing is a natural state. A heightened awareness of the beloved. A look, a tiny movement, a sigh, a tremor, a breath, a heartbeat, these are the signs that reveal the inner state. But love passes, in the same way that cities fade into the distance as I travel across Europe. That is what you tell me. And so, I continue my journey.

‘Take your joy and spread it across the world, he wrote.

At least begin with a smile and hug yourself, she thought.’


Resurrecting a MediaWiki instance

Published 24 Mar 2018 by in Posts on The bugalore.

This was my first time backing up and setting up a new MediaWiki (-vagrant) instance from scratch so I decided to document it in the hope that future me might find it useful. We (teams at Wikimedia) often use MediaWiki-Vagrant instances on Labs, err, Cloud VPS to test and demonstrate our projects. It’s also pretty handy to be able to use it when one’s local dev environment is out of order (way more common than you’d think).

Stories Behind the Songs

Published 24 Mar 2018 by Dave Robertson in Dave Robertson.

Every song has a story. Here’s a little background on the writing and recording of each of the songs on Oil, Love & Oxygen. It is sometimes geeky, sometimes political and usually personal, though I reserve the right to be coy when I choose!

  1. Close Your Mouth is a funny one to start with, because it’s the most vague in terms of meaning – I think there were ideas floating around in my head about over-thinking in relationships, but it is not about anything specific. The “bed” of this track was a live take with drums and semi-electric guitar using just a pair of ribbon microphones – very minimalist! There is some beautiful crazy saxophone from Professor Merle in the background of the mix at the 1:02 minute mark.
  2. Good Together is one of my oldest songs, and the recording of it started eight years ago! It features catchy accordion from Cat Kohn (now Melbourne based) and a dreamy electric guitar solo from Ken Williford (now works for NASA). The lyrics are fairly direct storytelling, so I don’t feel the need to elaborate.
  3. Oil, Love & Oxygen. I’ve been banging on about the climate crisis for more than twenty years, and this is the song where I most directly address the emotional side of it. For the lyric writing nerds: I used a triplet syllable stress pattern in the verses. The choir part was an impromptu gathering of friends at the end of a house concert. I first played this song as a duo with Marie O’Dwyer who plays the piano part on this version. The almost subliminal organ part is Rachel playing a 1960s electric organ she found on the side of the road.
  4. The Relation Ship I wrote this on the ferry to Rotto. The “pain body” concept in the chorus comes from Eckhart Tolle’s book A New Earth, and is similar to the sankhara concept in Vipassana. For the a cappella intro I experimented with double tracking the band singing together around a mid (omni) / side (ribbon) microphone setup, without using headphones.
  5. Perfect as Cats. As a kid I was fascinated by the big cats, especially snow leopards. This song is not about snow leopards. The drums and bass here were the only parts of the album recorded in a purpose-built studio (the old Shanghai Twang). Ben Franz plays the double bass and Rob Binelli the drums (one of the six drummers on the album!).
  6. Dull Ache. Sometimes I wished I lived in Greece, Italy, The Philippines, Costa Rica, Mexico, Ecuador, Nigeria or Spain. The common theme here is the siesta! I’m not at my best in the mid arvo, partly because my sensitive eyes get weary in our harsh sun. Around 4 or 5pm the world becomes a softer place to me, and my mojo returns. This song is also more generally about existential angst and depression. Always reach out for support when you need it – it is not easy dealing with these crazy grey soft things behind our eyes. I love Rob’s crazy guitars on the second half of this song – they are two full takes panned either side without edits.
  7. Kissing and Comedy was inspired by a quote from Tom Robbins’s novel Even Cowgirls Get The Blues: “Maybe the human animal has contributed really nothing to the universe but kissing and comedy–but by God that’s plenty.” I wrote it on the Overland Train. The drums are a single playful take by Angus Diggs, recorded in Dave Johnson’s bedroom with my trusty pair of ribbon mics, and the song was built up from there.
  8. Now That We’ve Kissed was co-written with Ivy Penny and is about being kissed by famous people (which I haven’t) and the implications of kisses in general. The things that “come from a kiss” were literally phoned in by friends.
  9. Rogue State was written in 2007, just prior to the Australian federal election and the Bali Climate Change Conference. It reflects on Australia’s sabotage of progress on climate change at the Kyoto conference in 1997, as documented in books such as Guy Pearse’s “High & Dry: John Howard, Climate Change and the Selling of Australia’s Future” and Clive Hamilton’s “Scorcher”. I had no intention of putting this old song on the album, until the last minute when I decided it was still sadly relevant given so many politicians still show a lack of respect and understanding of science and the planet that supports us. The recording of the song was also an excuse to feature a bit of Peter Grayling cello magic.
  10. Montreal was the first song I wrote on ukulele, though I ended up recording it with guitar. Sian Brown, who helped greatly with recording my vocals for the album, makes a harmony cameo at the end of the song. As for the lyrics, it’s a fairly obvious bittersweet love song.
  11. I Stood You Up is my account of attending a fantastic music camp called Rhythm Song, and kicking myself for not following through with a potential jam with songwriting legend Kristina Olsen. One of her pieces of advice to performers is to make their audience laugh to balance out the sadder songs in a set. The song was written in a mad rush two hours before a Song Club when I thought “What music can I write quickly?… Well I don’t have a blues song yet!”. This version was largely recorded prior to The Kiss List taking shape, so it features multiple guest musicians who are listed in the liner notes.
  12. Measuring the Clouds I wrote for my Dad’s birthday a few years ago. He used to be a Weather Observer in the 60s, sending up the big balloons etc. from many locations around WA such as Cocos Island. He had a beautiful eccentric sense of humour and would answer the phone with “Charlie’s Chook House and Chicken Factory, Chief Chook speaking”. The musical challenge I set myself with this song was to use a five bar pattern in the verse. A cello part was recorded, but was dropped in the mixing when I decided it made the song feel too heavy and I wanted it to feel light and airy.

Share


gitgraph.js and codepen.io for git visualization

Published 22 Mar 2018 by addshore in Addshore.

I was looking for a new tool for easily visualizing git branches and workflows, to try and visually show how Gerrit works (in terms of git basics) and clear up some confusion. I spent a short while reading Stack Overflow, although most of the suggestions weren’t really any good, as I didn’t want to visualize a real repository but a fake set of hypothetical branches and commits.

I was suggested Graphviz by a friend, and quickly found webgraphviz.com which was going in the right direction, but this would require me to learn how to write DOT graph files.

Eventually I found gitgraph.js, which is a small JavaScript library for visualizing branching ‘things’ such as git (well, mainly git, hence the name), and can produce graphics such as the one below.

In order to rapidly prototype with gitgraph I set up a blueprint codepen.io pen with the following HTML …

<html>
  <head>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/gitgraph.js/1.11.4/gitgraph.css" />
    <script src="https://cdnjs.cloudflare.com/ajax/libs/gitgraph.js/1.11.4/gitgraph.min.js"></script>
  </head>
  <body><canvas id="graph"></canvas></body>
</html>

… and the following JS …

var graph = new GitGraph({
  template: "metro", // or blackarrow
  orientation: "vertical",
  elementId: 'graph',
  mode: "extended", // or compact if you don't want the messages  
});

var master = graph.branch("master");
master.commit( { message: "Initial Commit" });

… to render the rather simple single commit branch below …

Styling can be adjusted by passing a template into the GitGraph object …

var myTemplateConfig = {
  colors: ["#008fb5", "#979797", "#f1c109", "#33cc33"],
  branch: {
    lineWidth: 3,
    spacingX: 30,
    labelRotation: 0
  },
  commit: {
    spacingY: 40,
    dot: {
      size: 10
    },
    message: {
      displayAuthor: false,
      displayBranch: true,
      displayHash: true,
      font: "normal 14pt Arial"
    }
  }
};
var myTemplate = new GitGraph.Template( myTemplateConfig );

var graph = new GitGraph({
  template: myTemplate,
  orientation: "vertical",
  elementId: 'graph',
  mode: "extended" // or compact if you don't want the messages
});

… which would render …

The blueprint codepen for this style can be found at https://codepen.io/addshore/pen/xWdZXQ.

With this blueprint set up I now have a starting point for further visualizations using gitgraph and codepen comparing Gerrit and GitHub, for example below comparing a merged pull request consisting of two commits, the second of which contains fixes for the first, with a single Gerrit change that has two separate versions.
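
As a rough sketch of the GitHub half of that comparison, the same API as above can model a two-commit feature branch being merged back into master. The branch names and commit messages below are made up, and merge() is used as I understand it from the gitgraph.js 1.x documentation, so treat the merge direction as an assumption to verify:

var graph = new GitGraph({
  template: "metro",
  orientation: "vertical",
  elementId: 'graph',
  mode: "extended"
});

// A pull request made of two commits, the second fixing up the first ...
var master = graph.branch("master");
master.commit({ message: "Initial commit" });

var feature = graph.branch("feature");
feature.commit({ message: "Add widget" });
feature.commit({ message: "Fix review comments on widget" });

// ... which lands on master as a single merge commit once the PR is accepted.
// (Assumption: merge() merges the receiving branch into the branch passed in.)
feature.merge(master);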

Keep an eye out on this blog for any more progress I make with this.

The post gitgraph.js and codepen.io for git visualization appeared first on Addshore.


DigitalOcean Currents: March 2018

Published 22 Mar 2018 by Ryan Quinn in The DigitalOcean Blog.


Currents is back with our third report on the developer experience. This February we asked 5,993 participants about their thoughts on hot topics like artificial intelligence and machine learning, new ways of working with codebases and services like continuous integration and delivery, and important issues like the European Union’s General Data Protection Regulation (GDPR) and the FCC’s decision on net neutrality.

Among our findings this quarter:

If you work in a larger organization, chances are you are using CI/CD


While only 45% of developers in organizations with five employees or fewer are using continuous integration (CI), and only 35% are using continuous delivery (CD), developers report that the likelihood of using these technologies increases with the size of the organization. This is somewhat intuitive, as many of the benefits of these methods provide ways for groups of developers to work together. In large organizations with over 1,000 employees, 68% of developers report using continuous integration and 52% are using continuous delivery.

Developers strongly disagree with the US FCC’s recent decision on net neutrality


Worldwide, the developers we surveyed voiced a strong opinion against the repeal of net neutrality in the US by the FCC. Among those in the United States this opinion was even more pronounced with 83% of developers against the decision and only 3.6% in favor of the change.

Adoption of the GDPR in Europe has many developers working to ensure compliance


Thirty-seven percent of the developers we surveyed reported that their teams were currently working to prepare for the GDPR. Unsurprisingly, developers in European countries are leading in this regard, with 58% of respondents in the Netherlands, 62% in Belgium, and 68% in Sweden stating their teams were actively working to ensure GDPR compliance. The United Kingdom saw the most engagement at 70%.


DigitalOcean Currents is published quarterly, highlighting the latest trends among developers.

If you would like to be among the first to receive Currents each quarter, sign up here. You’ll receive the latest report once it is released, share your ideas on what topics we should cover, and participate in our next survey.

Read more about these and other findings in the full report. Download the full Currents report here.


How do I check how much memory a Mediawiki instance has available to it?

Published 21 Mar 2018 by user1258361 in Newest questions tagged mediawiki - Server Fault.

Before anyone posts something about checking php.ini, bear in mind there are all sorts of ways it could be overridden. Where's the admin page or panel that lists the amount of RAM available to mediawiki?

(Due diligence: Searches turned up nothing. Proof in links below)

https://www.google.com/search?q=mediawiki+admin+panel&ie=utf-8&oe=utf-8&client=firefox-b-1 only relevant link is https://www.mediawiki.org/wiki/Manual:System_administration which contains nothing about memory or RAM

https://www.google.com/search?q=mediawiki+admin+UI+how+much+memory+is+allocated&ie=utf-8&oe=utf-8&client=firefox-b-1 again nothing

https://www.google.com/search?q=mediawiki+how+to+check+how+much+memory+is+allocated&ie=utf-8&oe=utf-8&client=firefox-b-1 again, nothing. First link suggests increasing amount of RAM but that isn't useful if my php.ini is being ignored for unknown reasons


Who's a senior developer?

Published 21 Mar 2018 by in Posts on The bugalore.

Something at work today prompted me to get thinking about what people generally mean when they say they are/someone is a senior developer. There are some things which are a given - long-term technical experience, fairly good knowledge of complex languages and codebases, past experience working on products and so on. But in my opinion, there are a fair number of things which we don’t really talk about but are important skills a “senior” developer must possess to actually deserve that title.

Spike in Adam Conover Wikipedia page views | WikiWhat Epsiode 4

Published 21 Mar 2018 by addshore in Addshore.

This post relates to the WikiWhat YouTube video entitled “Adam Conover Does Not Like Fact Checking | WikiWhat Epsiode 4” by the channel Cntrl+Alt+Delete. It would appear that the video went slightly viral over the past few days, so let’s take a quick look at the impact that has had on the Wikipedia page views for Adam’s article.

The video was published back in January, and although the viewing metrics are behind closed doors, this video has had a lot of activity in the past 5 days (judging by the comments).

It is currently the top viewed video in the WikiWhat series at 198,000 views, while the other 3 videos (John Bradley, Kate Upton & Lawrence Gillard Jr.) only have 6,000 views between them.

The sharp increase in video views translates rather well into Wikipedia page views for the Adam Conover article.

Generated at https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2018-02-28&end=2018-03-20&pages=Adam_Conover|Talk:Adam_Conover|User:Adam_Conover|User_talk:Adam_Conover

Interestingly this doesn’t just show a page view increase for the article, but also the talk page and Adam Conover’s user pages, all of which are shown in the video.

It’s a shame that 200,000 YouTube views only translate to roughly 15,000 views on Wikipedia, but it’s still interesting to see the effect videos such as this can have on the visibility of the site.

You can watch the page views for any Wikipedia page using the Pageviews tool.
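
For anyone who prefers raw numbers to the charts, the same figures can be pulled from Wikimedia’s pageviews REST API. A minimal sketch is below; the endpoint shape, project string, date format and response fields are my reading of the API documentation rather than anything from the post, so treat them as assumptions to verify:

// Daily page views for the Adam Conover article over roughly the same window
// as the chart above; "user" excludes bots and spiders. The URL format follows
// my reading of the AQS docs and should be double-checked.
var url = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/" +
  "en.wikipedia.org/all-access/user/Adam_Conover/daily/20180228/20180320";

fetch(url)
  .then(function (res) { return res.json(); })
  .then(function (data) {
    data.items.forEach(function (item) {
      console.log(item.timestamp, item.views);
    });
  });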

The post Spike in Adam Conover Wikipedia page views | WikiWhat Epsiode 4 appeared first on Addshore.


Introducing Dashboard: View Your Infrastructure At A Glance

Published 21 Mar 2018 by Josh Viney in The DigitalOcean Blog.


Simplifying the developer experience in the cloud has been a priority for DigitalOcean since we launched Droplets in 2013. As our product capabilities grow, we're taking great care to ensure that using DigitalOcean to run your applications remains as easy and intuitive as possible.

Today, we’re announcing the Control Panel Dashboard, the first of many Control Panel updates planned for 2018 as part of our mission to make it simple for development teams to operate and scale production applications in the cloud.

Introducing The Dashboard

Every day as we talk to developers, read feedback from the community, and witness the amazing applications being launched on our platform, the message that rings the clearest is that everyone values simplicity and ease of use. Visualizing, understanding, and controlling your cloud infrastructure in a single place is not inherently simple or easy, and it can get significantly more difficult as complexity increases.

The release of the new Dashboard is specifically meant to help you quickly access your existing resources and key account-related information, while highlighting additional products and features we think you’ll find useful when deploying scalable, production-ready infrastructure.

For existing users, the Dashboard replaces the Droplets page as the new default home page of the Control Panel. It provides “at-a-glance” visibility into active resources, like Droplets, Spaces, Load Balancers, Domains, Floating IPs, month-to-date current billing usage, shortcuts to team management, and other common tasks without having to navigate to different, often hard-to-find, sections of the Control Panel.

A look at the new Control Panel Dashboard.

Additionally, we’ve made changes to the top and bottom navigation to expose more helpful links to our status page, Community tutorials, API docs, and the support portal. All with the goal of surfacing more ways to help keep your applications running smoothly without overloading the UI.

The Dashboard is just the beginning. We have many more updates planned this year, and we can’t do it without your continued feedback. When you log in to take a look, please leave us some feedback using the little megaphone icon in the bottom right corner of the Control Panel. Or get early access to upcoming features by completing this survey.

The new Control Panel Dashboard is available starting today and will roll out to all DigitalOcean users over the course of the week. Stay tuned for more UI updates in the future!


Cambridge Analytica, #DeleteFacebook, and adding EXIF data back to your photos

Published 20 Mar 2018 by addshore in Addshore.

Back in 2016 I wrote a short hacky script for taking HTML from facebook data downloads and adding any data possible back to the image files that also came with the download. I created this as I wanted to grab all of my photos from Facebook and be able to upload them to Google Photos and have Google automatically slot them into the correct place in the timeline. Recent news articles about Cambridge Analytica and harvesting of Facebook data have led to many people deciding to leave the platform, so I decided to check back with my previous script and see if it still worked, and make it a little easier to use.

Step #1 – Move it to Github

Originally I hadn’t really planned on anyone else using the script; in fact I still don’t really plan on it. But let’s keep code on GitHub, not in aging blog posts.

https://github.com/addshore/facebook-data-image-exif

Step #2 – Docker

The previous version of the script had hard coded paths, and required a user to modify the script, and also download things such as the ExifTool before it would work.

Now the GitHub repo contains a Dockerfile that includes the script and all of the necessary dependencies.

If you have Docker installed running the script is now as simple as docker run --rm -it -v //path/to/facebook/export/photos/directory://input facebook-data-image-exif.

Step #3 – Update the script for the new format

As far as I know, the format of the facebook data dump downloads is not documented anywhere. The format totally sucks; it would be quite nice to have some JSON included, or anything slightly more structured than HTML.

The new format moved the location of the HTML files for each photos album, but luckily the format of the HTML remained mostly the same (or at least the crappy parsing I created still worked).

The new data download did, however, do something odd with the image sources. Instead of loading them from the local directory (all of the data you have just downloaded), the srcs would still point to the facebook CDN. Not sure if this was intentional, but it’s rather crappy. I imagine that if you delete your whole facebook account these static HTML files will actually stop working. Sounds like someone needs to write a little script for this…
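
As a starting point, here is a minimal sketch of what such a script might look like, assuming the export’s album HTML files and downloaded images sit in local folders and that a CDN URL’s final path segment matches a local filename. The directory names are hypothetical, and this is not the script from the repo above:

// rewrite-srcs.js - rewrite CDN image srcs in exported HTML to local copies.
const fs = require('fs');
const path = require('path');

const htmlDir = './html';     // hypothetical location of the album HTML files
const photosDir = './photos'; // hypothetical location of the downloaded images

// Index local images by basename so CDN URLs can be matched back to them.
const localImages = new Map();
for (const file of fs.readdirSync(photosDir)) {
  localImages.set(file, path.join(photosDir, file));
}

for (const file of fs.readdirSync(htmlDir)) {
  if (!file.endsWith('.html')) continue;
  const htmlPath = path.join(htmlDir, file);
  let html = fs.readFileSync(htmlPath, 'utf8');

  // Replace src="https://...fbcdn.../name.jpg" with the local copy, when one exists.
  html = html.replace(/src="(https?:\/\/[^"]+)"/g, (match, url) => {
    const name = url.split('/').pop().split('?')[0];
    return localImages.has(name) ? 'src="' + localImages.get(name) + '"' : match;
  });

  fs.writeFileSync(htmlPath, html);
}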

Step #4 – Profit!

Well, no profit, but hopefully some people can make use of this again, especially those currently fleeing facebook.

You can find the “download a copy” link for your data at the bottom of your facebook settings.

I wonder if there are any public figures for the rate of facebook account deactivations and deletions…

The post Cambridge Analytica, #DeleteFacebook, and adding EXIF data back to your photos appeared first on Addshore.


Episode 4: Bernhard Krabina

Published 20 Mar 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Bernhard Krabina is a researcher and consultant for KDZ, the Centre for Public Administration Research, a Vienna, Austria-based nonprofit that focuses on improving and modernizing technology-based solutions in government at all levels within Europe. He has been involved with MediaWiki in government for the last 10 years.

Links for some of the topics discussed:


v2.4.7

Published 20 Mar 2018 by fabpot in Tags from Twig.


v1.35.3

Published 20 Mar 2018 by fabpot in Tags from Twig.


How We Support Remote Employees at DigitalOcean

Published 14 Mar 2018 by Amanda Brazzell in The DigitalOcean Blog.


Remote culture at DigitalOcean is one of my favorite things to talk about when discussing my job. When I first joined the company in June of 2015, there was already a substantial percentage of existing remote employees (better known as our “remotees”). Working with the remotees wasn’t initially a part of my function, but as a member of the Employee Experience Team, I gradually found myself getting to know many of them more personally. I learned about their experiences as distributed employees, some of their pain points, and how it influences their engagement.

Since I've never been remote, I educated myself on best practices for companies with remote employees and how we could expand our top-notch employee experience to those outside of our HQ.

Two and a half years later, our remotee population totals over 200 people, making up over 50% of our employees, and our program has grown to support both the needs of our business and those who work remotely. To date, remotees score higher in engagement than any other subgroup at the company. This has been attributed to the attention and effort we have actively given to supporting the remotee experience.

Here’s what we learned and how we adjusted our efforts to better support the remotee experience:

Remote Communication

“Watercooler talk” is an important aspect of working in-office, and it’s a practice that companies seeking to become more remote-friendly have trouble replicating. Being able to easily communicate with other colleagues helps improve team bonds and makes people feel part of the company fabric. At DO, we use several different mediums to avoid excluding remotees from conversation and to keep information from falling through the cracks:

Remote-inclusive Programs

While most of our teams at DigitalOcean are made up of both in-office and remote employees, there is definite value in giving teams the opportunity to get together in person at different times during the year. Here are the processes we have in place to ensure teams get face time:

Perks for Remotees

While some companies see working from home as a perk in and of itself, we recreate many of the in-office perks and make them available to remotees. This is key to building a cohesive company culture and experience, and one where remotees feel engaged with the company at large.

Our remotes are able to participate in our workstation program, where they get access to different monitors, mouse/keyboards, and trackpads for their home offices, as well as credit up to $100 for headphones of their choice. The equivalent of our commuter benefit for in-house employees is providing remotes a credit toward the cost of either their monthly internet bill or their monthly coworking space membership. Additionally, remotes can opt into a monthly subscription snack box (because snacks are awesome!). Finally, DO covers travel and per diem costs, and provides accommodation at our corporate apartments for remotee visits to HQ.

"Love is What Makes Us Great"

DigitalOcean’s employee experience programs strive to be inclusive of all of our employees. We do this by keeping the needs of both in-office and remote employees in mind, and by adjusting our programs as needed to ensure they can change and scale with our growing organization. Removing obstacles to communication between people in our offices and remotes is essential for building cohesion across teams and for helping everyone be the most productive employee they can be, no matter where they’re located.

Apply For a Job @ DO


Amanda Brazzell is DigitalOcean’s Office Experience Team Lead. She has helped build an effective Remote Experience program that drives dispersed employee engagement and job satisfaction. Amanda is a California native who moved to NYC without having ever visited the city before, and has been at DO since 2015.


Facebook Inc. starts cannibalizing Facebook

Published 13 Mar 2018 by Carlos Fenollosa in Carlos Fenollosa — Blog.

Xataka is probably the biggest Spanish blogging company. I have always admired them, from my amateur perspective, for their ability to make a business out of writing blogs.

That is why, when they invited me to contribute an article about the decline of Facebook, I couldn't refuse. Here it is.

Facebook se estanca, pero Zuckerberg tiene un plan: el porqué de las adquisiciones millonarias de WhatsApp e Instagram, or Facebook is stagnating, but Zuckerberg has a plan: the reason behind the billion dollar acquisitions of WhatsApp and Instagram.

Tags: facebook, internet, mobile

Comments? Tweet  


Vaporous perfection

Published 12 Mar 2018 by jenimcmillan in Jeni McMillan.

DSC_0405

Clouds, so impermanent, advise her that reality is a mere dream. The illusion of solidity in their shape and comforting forms is exactly that, illusion, disappearing as temperature changes, wind blows or night extinguishes day. Why would a cloud be other than this? I marvel at such simplicity. I will endeavour to leave clouds to their journey, not fall in love with them in any other way than to share their pleasure of being vaporous perfection.


Budapest Blues

Published 11 Mar 2018 by jenimcmillan in Jeni McMillan.

budapest

It’s Sunday and I’m in the most beautiful city in the world.

Cigarette butts crushed into broken tiles.

At my feet is another death, in the street,

Broken buildings and hollow dreams.

I’m in her arms like a stillborn child.

Feeling nothing, it seems,

But old.


cardiParty 2018-03-Perth - Museum of Perth

Published 7 Mar 2018 by Andrew Kelly in newCardigan.

Reece Harley, Executive Director and founder, will give an introductory talk about Museum of Perth, covering background info about the museum and the current exhibition. 6pm, Friday 16 March.

Find out more...


Episode 3: Mike Cariaso

Published 6 Mar 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Mike Cariaso is the co-founder of SNPedia, a MediaWiki-based repository of genomic information (founded in 2006), and the creator of Promethease, personal genetic analysis software that uses SNPedia's data.

Links for some of the topics discussed:


cardiCast episode 29 – Adam Trainer

Published 5 Mar 2018 by Justine in newCardigan.

Perth February 2018 cardiParty

Recorded live

Adam Trainer, curator of the exhibition Alternative Frequencies: 40 Years of RTRFM, which celebrates four decades of the state’s longest-running FM community radio station, discusses the history of the station, the curation process and the exhibition’s relationship to SLWA’s broader local music collections.

newcardigan.org
glamblogs.newcardigan.org

Music by Professor Kliq ‘Work at night’ Movements EP.
Sourced from Free Music Archive under a Creative Commons licence.


Self-hosted websites are doomed to die

Published 3 Mar 2018 by Sam Wilson in Sam's notebook.

I keep wanting to be able to recommend the ‘best’ way for people (who don’t like command lines) to get research stuff online. Is it Flickr, Zenodo, Internet Archive, Wikimedia, and Github? Or is it a shared hosting account on Dreamhost, running MediaWiki, WordPress, and Piwigo? I’d rather the latter! Is it really that hard to set up your own website? (I don’t think so, but I probably can’t see what I can’t see.)

Anyway, even if running your own website, one should still be putting stuff on Wikimedia projects. And even if not using it for everything, Flickr is a good place for photos (in Australia) because you can add them to the Australia in Pictures group and they’ll turn up in searches on Trove. The Internet Archive, even if not a primary and cited place for research materials, is a great place to upload wikis’ public page dumps. So it really seems that the remaining trouble with self-hosting websites is that they’re fragile and subject to complete loss if you abandon them (i.e. stop paying the bills).

My current mitigation to my own sites’ reliance on me is to create annual dumps in multiple formats, including uploading public stuff to IA, and printing some things, and burning all to Blu-ray discs that get stored in polypropylene sleeves in the dark in places I can forget to throw them out. (Of course, I deal in tiny amounts of data, and no video.)

What was it Robert Graves said in I, Claudius about the best way to ensure the survival of a document being to just leave it sitting on one’s desk and not try at all to do anything special — because it’s all perfectly random anyway as to what persists, and we cannot influence the universe in any meaningful way?


v2.4.6

Published 3 Mar 2018 by fabpot in Tags from Twig.


v1.35.2

Published 3 Mar 2018 by fabpot in Tags from Twig.


Untitled

Published 2 Mar 2018 by Sam Wilson in Sam's notebook.

I think I am learning to love paperbacks. (Am hiding in New Editions this morning.)


v2.4.5

Published 2 Mar 2018 by fabpot in Tags from Twig.


v1.35.1

Published 2 Mar 2018 by fabpot in Tags from Twig.


Weather Report

Published 1 Mar 2018 by jenimcmillan in Jeni McMillan.

DSC_0437

It is Minus 11 in Berlin.

Heart rate slow.

Breath freezing.

It’s Minus 12 in Berlin.

Heart is warming.

Breath responding.

I think of the Life, Death, Rebirth cycle.

Again and again and again.

DSC_0429

Thank you Clarissa Pinkola Estés.

 

 

 


Conference at UWA – Home 2018

Published 26 Feb 2018 by Tom Wilson in thomas m wilson.

I’ll be presenting a paper at the following conference in July 2018.  It will be looking at the theme of aspirations for home ownership from the perspective of Big History.  Hope to see you there.

Missing

Published 20 Feb 2018 by jenimcmillan in Jeni McMillan.

Trees

Sometimes I just miss people. I want to hold them in my arms and feel their heart beat. I want to look into their souls. Share stories. Linger in all the delicious ways. This isn’t lust. There are many ways to be in the world. Lust has its place. But the kind of desire I speak of is a love so deep that it may only last a second yet find perfection. The willingness to be absolutely present. This is not a contradiction. The longing is a sweetness, something that poetry holds hands with and prose takes a long walk through aimless streets.


Episode 2: Niklas Laxström

Published 20 Feb 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Niklas Laxström is the creator and co-maintainer of translatewiki.net, the site where MediaWiki and most of its extensions (along with other software, like OpenStreetMap) gets translated into hundreds of languages. Niklas also works for the Wikimedia Foundation as part of the Language team, where he helps to develop code related to translation and internationalization, most notably the Translate extension.

Links for some of the topics discussed:


Volunteer Spotlight: David Schokker

Published 20 Feb 2018 by Rebecca Waters in DDD Perth - Medium.

View from the DDD Perth 2017 Speaker and Sponsor party (David Schokker)

Volunteers are the lifeblood of DDD Perth. In order to pull off such a conference, we need volunteers on the ground before, during and after the big day. We simply couldn’t do it without them.

This week, I spent some time chatting with one of the many volunteers of DDD Perth, David Schokker.

battlepanda (@battlepanda_au) | Twitter

David, how did you come across DDD?

I was introduced to DDD by a fellow volunteer. As I have done other large scale events I felt that I could help with DDD and share my experience.

How did you help out on the day?

I was one of the event photographers, responsible for documenting the various interesting things that happen on the day. (Ed: you can check out photos from the day, taken by David and others, over on Flickr)

DDD Perth 2017

What was the most memorable part of your volunteering experience?

The overwhelming amount of appreciation not only myself but for the entire volunteer team. The personal gratitude is why I do events like this.

Would you recommend volunteering at DDD? Why?

Of course, the team is wonderful and diverse.
Not only do you get to help make an event such as DDD happen, but you also get a chance to mingle with some of the best people in their fields of expertise.

Did you meet and mingle with anyone that was particularly awesome?

Yeah meeting Kris (@web_goddess) was amazing, I got to hang out with her before her preso so it was unique to meet a new friend before seeing what they excel in. It really opened my eyes to how strong of a person she is and what great things she does with the community.

Will you be volunteering in 2018?

If you guys want me, of course!

David, I promise, we want you.


Volunteer Spotlight: David Schokker was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


How to use toggleToc() in a MediaWiki installation

Published 18 Feb 2018 by lucamauri in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I admin a wiki site running MediaWiki 1.29 and I have a problem collapsing the TOC on pages.

I would be interested in keeping the Contents box, but loading the page with it collapsed by default.

It appears there is a simple solution here https://www.mediawiki.org/wiki/Manual_talk:Table_of_contents#Improved_Solution, but I have failed to implement it and I have no idea where the error is; hopefully someone can help.

I integrated the code as explained and checked that MediaWiki:Common.js is used by the site.

During page rendering, I checked that the JavaScript code is loaded and executed, but it appears to fail because

ReferenceError: toggleToc is not defined

I also checked this page https://www.mediawiki.org/wiki/ResourceLoader/Migration_guide_(users)#MediaWiki_1.29 , but in the table there is an empty cell where it should be explained how to migrate toggleToc();. I am not even entirely sure it should be migrated.

Any help on this topic will be appreciated.

Thanks

Luca
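
A minimal sketch of one toggleToc()-free direction for MediaWiki:Common.js, which collapses the list directly instead of calling the removed function; the selector and the behaviour around the [show]/[hide] label are assumptions that may need adjusting for the skin and MediaWiki version in use:

/* Collapse the table of contents by default (assumption: the TOC is rendered
   as a #toc container with a top-level ul). Because this bypasses MediaWiki's
   own toggle state, the built-in [show]/[hide] label may not update until it
   is clicked. */
$( function () {
    $( '#toc > ul' ).hide();
} );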


How to use mw.site.siteName in Module:Asbox

Published 17 Feb 2018 by Rob Kam in Newest questions tagged mediawiki - Webmasters Stack Exchange.

Exporting Template:Stub from Wikipedia for use on a non-WMF wiki, it transcludes the Scribunto Module:Asbox, which has on line 233:

' is a [[Wikipedia:stub|stub]]. You can help Wikipedia by [',

Substituting Wikipedia with the magic word {{SITENAME}} doesn't work here. How can Wikipedia be replaced with the comparable Lua function mw.site.siteName, so that pages transcluding the stub template show the local wiki name instead?
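
A minimal sketch of the direction this points in, written as a stand-alone Scribunto module rather than as a patch to Module:Asbox itself; it assumes only that mw.site.siteName returns the local wiki's name, and the module and function names below are hypothetical:

-- Module:StubLine (hypothetical): build the stub sentence from the local
-- site name instead of the hard-coded "Wikipedia".
local p = {}

function p.stubLine()
    return ' is a [[' .. mw.site.siteName .. ':stub|stub]]. You can help '
        .. mw.site.siteName .. ' by ['
end

return p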


Feel the love for digital archives!

Published 15 Feb 2018 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday was Valentine's Day.

I spent most of the day at work thinking about advocacy for digital preservation. I've been pretty quiet this month, beavering away at a document that I hope might help persuade senior management that digital preservation matters. That digital archives are important. That despite their many flaws and problems, we should look after them as best we can.

Yesterday I also read an inspiring blog post by William Kilbride: A foot in the door is worth two on the desk. So many helpful messages around digital preservation advocacy in here but what really stuck with me was this:

"Digital preservation is not about data loss, it’s about coming good on the digital promise. It’s not about the digital dark age, it’s about a better digital future."

Perhaps we should stop focusing on how flawed and fragile and vulnerable digital archives are, but instead celebrate all that is good about them! Let's feel the love for digital archives!

So whilst cycling home (in the rain) I started thinking about Valentine's cards that celebrate digital archives. Then with a glass of bubbly in one hand and a pen in the other I sketched out some ideas.


Let's celebrate that obsolete media that is still in good working
order (against all odds)

Even file migration can be romantic..

A card to celebrate all that is great about Broadcast
WAV format

Everybody loves a well-formed XML file

I couldn't resist creating one for all you PREMIS fans out there



I was also inspired by a Library of Congress blog post by Abbie Grotke that I keep going back to: Dear Husband: I’m So Sorry for Your Data Loss. I've used these fabulous 'data loss' cards several times over the years to help illustrate the point that we need to look after our digital stuff.



I'm happy for you to use these images if you think they might help with your own digital preservation advocacy. An acknowledgement is always appreciated!

I don't think I'll give up my day job just yet though...

Best get back to the more serious advocacy work I have to do today.





Email is your electronic memory

Published 14 Feb 2018 by Bron Gondwana in FastMail Blog.

From the CEO’s desk.

Sometimes you write planned blog posts; sometimes events in the news prompt you to re-examine your values. This is one of the latter times.

Gmail and AMP

Yesterday, Google announced that Gmail will use AMP to make emails dynamic, up-to-date and actionable. At first that sounds like a great idea. Last week’s news is stale. Last week’s special offer from your favourite shop might not be on sale any more. The email is worthless to you now. Imagine if it could stay up-to-date.

TechCrunch wrote about AMP in Gmail and then one of their columnists wrote a followup response about why it might not be a good idea – which led to a lot of discussion on Hacker News.

Devin used the word static. In the past I have used the word immutable. I think “immutable” is more precise, though maybe less plain and simple language than “static” – because I don’t really care about how dynamic and interactive email becomes – usability is great, I’m all in favour.

But unchanging-ness... that’s really important. In fact, it’s the key thing about email. It is the biggest thing that email has over social networking or any of the hosted chat systems.

An email which is just a wrapper for content pulled from a website is no longer an unchangeable copy of anything.

To be totally honest, email already has a problem with mutability – an email which is just a wrapper around remotely hosted images can already be created, though FastMail offers you the option of turning them off or restricting them to senders in your address book. Most sites and email clients offer an option to block remote images by default, both for privacy and because they can change after being delivered (even more specifically, an email with remote images can totally change after being content scanned).

Your own memory

The email in your mailbox is your copy of what was said, and nobody else can change it or make it go away. The fact that the content of an email can’t be edited is one of the best things about POP3 and IMAP email standards. I admit it annoyed me when I first ran into it – why can’t you just fix up a message in place – but the immutability is the real strength of email. You can safely forget the detail of something that you read in an email, knowing that when you go back to look at it, the information will be exactly the same.

Over time your mailbox becomes an extension of your memory – a trusted repository of history, in the way that an online news site will never be. Regardless of the underlying reasons, it is a fact that websites can be “corrected” after you read them, tweets can be deleted and posts taken down.

To be clear, often things are taken down or edited for good reasons. The problem is, you can read something online, forward somebody a link to it or just go back later to re-read it, and discover that the content has changed since you were last there. If you don’t have perfect memory (I sure don’t!) then you may not even be sure exactly what changed – just be left with a feeling that it’s not quite how you remember it.

Right now, email is not like that. Email is static, immutable, unchanging. That’s really important to me, and really important to FastMail. Our values are very clear – your data belongs to you, and we promise to be good stewards of your data.

I'm not going to promise that FastMail will “never implement AMP” because compatibility is also important to our users, but we will proceed cautiously and skeptically on any changes that allow emails to mutate after you’ve seen them.

An online datastore

Of course, we’re a hosted “cloud” service. If we turned bad, we could start silently changing your email. The best defence against any cloud service doing that is keeping your own copies, or at least digests of them.

Apart from trusting us, and our multiple replicas and backups of every email, we make it very easy to keep your own copies of messages:

  1. Full standards-compliant access to email. You can use IMAP or POP3 to download messages. IMAP provides the triple of “foldername / uidvalidity / uid” as a unique key for every message (there’s a rough sketch of working with these identifiers after this list). Likewise we provide CalDAV and CardDAV access to the raw copies of all your calendars and contacts.

  2. Export in useful formats. Multiple formats for contacts, and standard ICS files for calendars. It’s rather hidden, but at the bottom of the Folders screen there’s a link called “Mass delete or remove duplicates”, and that screen also includes a facility to download entire folders as a zip file.

  3. Working towards new standards for email. Our team is working hard on JMAP and will be participating in a hackathon at IETF in London in March to test interoperability with other implementations.

  4. We also provide a DIGEST.SHA1 non-standard fetch item via IMAP that allows you to fetch the SHA1 of any individual email. It’s not a standard though. We plan to offer something similar via JMAP, but for any attachment or sub-part of emails as well.
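
If you like to script this kind of personal archive, here is a minimal sketch of the idea using Python's standard imaplib. It is not FastMail tooling: the host name, account and app password are placeholders, and the SHA-1 digest is computed locally rather than via the DIGEST.SHA1 fetch item mentioned above.

import hashlib
import imaplib

# A rough sketch, not FastMail tooling: key every message in a folder by
# "foldername / uidvalidity / uid" and record a locally computed SHA-1 digest.
# Host, account and app password below are placeholders.
imap = imaplib.IMAP4_SSL("imap.example.com")
imap.login("user@example.com", "app-password")
imap.select("INBOX", readonly=True)

# UIDVALIDITY changes whenever the server reassigns UIDs in a folder.
resp = imap.response("UIDVALIDITY")[1]
uidvalidity = resp[0].decode() if resp and resp[0] else "0"

typ, data = imap.uid("SEARCH", None, "ALL")
for uid in data[0].split():
    typ, msg_data = imap.uid("FETCH", uid, "(RFC822)")
    raw = msg_data[0][1]
    key = "INBOX/%s/%s" % (uidvalidity, uid.decode())
    print(key, hashlib.sha1(raw).hexdigest())

imap.logout()

Run it against the same folder twice and the keys and digests should come out identical – which is exactly the unchanging-ness described above.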

Your data, your choice

We strongly believe that our customers stay with us because we’re the best, not because it’s hard to leave. If for any reason you want to leave FastMail, we make it as easy as possible to migrate your email away. Because it’s all about trust – trust that we will keep your email confidential, trust that we will make your email easy to access, and trust that every email will be exactly the same, every time you come back to read it.

Thank you to our customers for choosing us, and staying with us. If you’re not our customer yet, please do grab yourself a free trial account and check out our product. Let us know via support or twitter whether you decide to stay, and particularly if you decide not to! The only thing we don’t want to hear is “it should be free” – we’re not interested in that discussion, we provide a good service and we proudly charge for it so that you are our customer, not our product.

And if you’re not ready to move all your email, you can get a lot of the same features for a whole group of people using Topicbox – a shared memory without having to change anything except the “To:” line in the emails you send!

Cheers,

Bron.


MySQL socket disappears

Published 9 Feb 2018 by A Child of God in Newest questions tagged mediawiki - Server Fault.

I am running Ubuntu 16.04 LTS with MySQL server for MediaWiki 1.30.0, along with Apache2 and PHP 7.0. The installation was successful for everything, and I managed to get it all running. Then I started installing extensions for MediaWiki. Everything was fine until I installed the VisualEditor extension. It requires that I have both Parsoid and RESTBase installed, so I installed those alongside VisualEditor. Then I went to check my wiki and saw this message (the database name for the wiki is "bible"):

Sorry! This site is experiencing technical difficulties.

Try waiting a few minutes and reloading.

(Cannot access the database: Unknown database 'bible' (localhost))

Backtrace:

#0 /var/www/html/w/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1028): Wikimedia\Rdbms\Database->reportConnectionError('Unknown databas...')

#1 /var/www/html/w/includes/libs/rdbms/loadbalancer/LoadBalancer.php(670): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()

#2 /var/www/html/w/includes/GlobalFunctions.php(2858): Wikimedia\Rdbms\LoadBalancer->getConnection(0, Array, false)

#3 /var/www/html/w/includes/user/User.php(493): wfGetDB(-1)

#4 /var/www/html/w/includes/libs/objectcache/WANObjectCache.php(892): User->{closure}(false, 3600, Array, NULL)

#5 /var/www/html/w/includes/libs/objectcache/WANObjectCache.php(1012): WANObjectCache->{closure}(false, 3600, Array, NULL)

#6 /var/www/html/w/includes/libs/objectcache/WANObjectCache.php(897): WANObjectCache->doGetWithSetCallback('global:user:id:...', 3600, Object(Closure), Array, NULL)

#7 /var/www/html/w/includes/user/User.php(520): WANObjectCache->getWithSetCallback('global:user:id:...', 3600, Object(Closure), Array)

#8 /var/www/html/w/includes/user/User.php(441): User->loadFromCache()

#9 /var/www/html/w/includes/user/User.php(405): User->loadFromId(0)

#10 /var/www/html/w/includes/session/UserInfo.php(88): User->load()

#11 /var/www/html/w/includes/session/CookieSessionProvider.php(119): MediaWiki\Session\UserInfo::newFromId('1')

#12 /var/www/html/w/includes/session/SessionManager.php(487): MediaWiki\Session\CookieSessionProvider->provideSessionInfo(Object(WebRequest))

#13 /var/www/html/w/includes/session/SessionManager.php(190): MediaWiki\Session\SessionManager->getSessionInfoForRequest(Object(WebRequest))

#14 /var/www/html/w/includes/WebRequest.php(735): MediaWiki\Session\SessionManager->getSessionForRequest(Object(WebRequest))

#15 /var/www/html/w/includes/session/SessionManager.php(129): WebRequest->getSession()

#16 /var/www/html/w/includes/Setup.php(762): MediaWiki\Session\SessionManager::getGlobalSession()

#17 /var/www/html/w/includes/WebStart.php(114): require_once('/var/www/html/w...')

#18 /var/www/html/w/index.php(40): require('/var/www/html/w...')

#19 {main}

I checked the error logs in MySQL, and the error message said that the database was trying to be accessed without a password. I restarted my computer and restarted Apache, Parsoid, RESTBase, and MySQL, but I could not successfully restart MySQL. I checked the error log by typing the command journalctl -xe and saw that it failed to start because it couldn't write to /var/lib/mysql/. I went to Stack Exchange to see if I could find a solution, and one answer said to use the command mysql -u root -p. I did, typed in the password, and it gave this error:

ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)

I also checked its status by typing sudo mysqladmin status, which said:

mysqladmin: connect to server at 'localhost' failed error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)' Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!

I wanted to verify that it existed, but upon browsing to the location of the socket, I found it was not there. I saw an answer about a missing MySQL socket which said to use the touch command to create the socket and another file. I did that and still had the same issues. I went back to the directory and found the two files to be missing, so I created them again with the touch command and watched the folder to see what happened. After about half a minute, the folder seems to be deleted and recreated: I get kicked out of the folder into its parent directory, and when I go back in, the files are gone.

Does anybody know why this is happening, or at least how I can fix this and get MySQL back up and running?


Global Diversity Call for Proposals Day

Published 7 Feb 2018 by Rebecca Waters in DDD Perth - Medium.

Photo by Hack Capital on Unsplash

February 3rd, 2018: Global Diversity Call for Proposals (CFP) Day. Around the globe, over 50 cities across 23 countries participated by running CFP workshops.

The workshops were aimed at first time would-be speakers, from any field (technology focus not required). Mentors were available to help with proposals, provide speaking advice and share their enthusiasm to get newcomers up on stage.

Workshops were held in Brisbane, Melbourne, Perth and Sydney in Australia, by some of the most vocal supporters of diversity in technology in the country.

In Perth, Fenders and DDD Perth run proposal writing workshops to help reduce the barrier to submitting, and so it made sense for us to join in this February fun and encourage a whole new group of conference potentials to get up and share their knowledge!

The workshop in Perth was well attended with participants from different backgrounds, both personally and professionally, coming together to work on their proposals. Mentors from Fenders and DDD Perth brought their children down and the entire building at Meerkats was filled with excitement (and snacks!).

A special mention goes to the company who provided the space, Meerkats; complete with breakout rooms and comfy couches, our supportive DDD Dad was able to entertain the children easily and the workshop participants could move around the space and find a quiet corner to work in when the workshop required.

Personally, I can’t wait to see the 13 proposals that were written get accepted at local user groups, conferences and maybe further afield.

Mark Lockett’s open source software talk captured my interest (and not just me, I’m sure). Rosemary Lynch’s experience with online publishing for organisations is eye opening, and Amy Kapernick’s take on failure is sure to be a hit in the future.

If these topics sound interesting, be sure to follow these people and lend your support for their first foray into speaking in the WA community!

If you missed this workshop, be sure to follow DDD Perth on Twitter or join our mailing list as we will be holding more workshops throughout the year to help aspiring speakers on their way!


Global Diversity Call for Proposals Day was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


Page-specific skins in MediaWiki?

Published 7 Feb 2018 by Alexander Gorelyshev in Newest questions tagged mediawiki - Webmasters Stack Exchange.

Is there a way to force a particular skin to be applied while displaying specific MediaWiki articles?

In my wiki many articles will have a "flip" version with alternative content (think "good" and "evil" perspectives of the same topic). I was thinking about using namespaces to separate these versions, but I need a definitive way to visually contrast them.


Episode 1: Cindy Cicalese

Published 6 Feb 2018 by Yaron Koren in Between the Brackets: a MediaWiki Podcast.

Cindy Cicalese was a Principal Software Systems Engineer at MITRE for many years, where, among other things, she oversaw the creation and maintenance of over 70 MediaWiki installations, as well as development of many MediaWiki extensions. Last year she joined the Wikimedia Foundation as Product Manager for the MediaWiki Platform team.

Links for some of the topics discussed:


How do I edit Login Required Page [closed]

Published 5 Feb 2018 by jehovahsays in Newest questions tagged mediawiki - Webmasters Stack Exchange.

On my private MediaWiki, view & read are set to false.
My website visitors would see "Please Login to view other pages."
What I needed to do was edit the login link located in this error message.


According to You

Published 2 Feb 2018 by Rebecca Waters in DDD Perth - Medium.

Ian Hughes presenting at DDD Perth 2017

Following DDD Perth 2017, a bunch of attendees gave the internet their take on our conference. Apart from becoming one of the top five trending topics in Australia for conference day…

…there were also some blushingly complimentary posts about the conference.

If perchance you missed one, we’ve rounded them up for you. Take a look!

Kris Howard

Thank you again to the DDD Perth organisers for inviting me to participate in this wonderful event!

Kris, we’re the ones who are thanking you for delivering such an inspiring locknote. Catch the video of Kris’ talk here.

DDD Perth 2017

Dash Digital

They put on a great event that was well-priced at only $50 a ticket, and included top-notch speakers, lunch and networking opportunities at the afterparty.

What we love about this round up from Dash is the coverage of three of our speakers, Kris Howard, Nathan Jones and Will Webster.

Dash developers do #DDDPerth

Nathan Jones

It was a credit to the organisers and a shout-out to the sponsors for helping make it happen.

We love how Nathan invites future conversations and connections. As he says in the blog post, his DM’s are open! You can catch Nathan’s talk here.

DDD Perth Wrap-Up

Gaia Resources

Having the opportunity to hear some of Perth’s best developers talk about their experiences and recommendations is an excellent opportunity to get away from our day-to-day work and find some fresh perspective.

Fun fact: Six developers from Gaia turned out to represent at DDD Perth. Talk about commitment to learning!

Serverless Architecture at the Small Scale - Gaia Resources

LiveHire

Thank you, DDD Perth for creating such a worthwhile and exciting event.

LiveHire is a continued proud sponsor of DDD Perth and we couldn’t do what we do without them. Thanks for the round up!

LiveHire @ Developer Developer Developer! Perth Event 👏 #DDDPerth

Amy Kapernick

Despite the small price tag and the fact that I had to get up early on a Saturday, not only did it not disappoint but I was blown away by the number of supporters, the calibre of the speakers and the overall experience.

Amy’s glowing review blew us out of the water. Thanks for the kind words, Amy - you keep us motivated for 2018!

Amy Goes to Perth

Donna Edwards

I was also really impressed with the diversity at the conference and how the committee had really made a huge effort to attract more women speakers

Donna’s talk was extremely informative; you can check it out here. Donna herself organises large scale events and we really value her feedback on DDD!

DDD Perth 2017

Did we miss a post? Let us know so we can include it.


According to You was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


Cleaning up your inbox

Published 31 Jan 2018 by David Gurvich in FastMail Blog.

With email forming such a big part of our life it’s possible you had a New Year’s resolution to clean up your inbox.

Perhaps you spent last year, or even previous years, at the mercy of your unruly inbox? Or maybe you’ve come back to your email account after some time off and been overwhelmed with cleaning out all those emails.

Putting aside any regular email blasts from friends or family (read on for how to manage that), it’s likely that a lot of your inbox spam or clutter is from marketing lists you have signed up to.

What once seemed like an invitation too good to ignore might now be taking over your email life, so that every time you visit your inbox you’re confronted with more and more emails.

Types of unwanted email

Unwanted email may come in several forms and can include:

  1. Marketing lists - from retailers and organisations.
  2. Social media notifications – linked to an account you’ve already set up.
  3. Spam – communication from people you have no prior relationship with.

So let’s take a look at each of those kinds of unwanted mail in more detail and the best way to keep their effect on your inbox to a minimum.

1. Marketing lists

Imagine you signed up to a marketing list some years ago for a particular retailer. Maybe at a certain period in time you were really interested in throw pillows. But in the intervening years you’ve forgotten about ever signing up to this list and are wondering why your inbox keeps filling up with offers for something you don’t want, featured within emails you don’t want to receive. Now you simply find these emails annoying – and consider them to be spam.

Unsubscribe from a list

So how do you stop receiving all of those throw pillow emails? Well, rather than using the 'Report Spam' button the best thing to do is to manually unsubscribe from the list you once signed up to.

Most lists by law should have an unsubscribe link included somewhere within the body of the email; often this is located in the footer. If you can't see an unsubscribe link you may need to contact the sender directly to request removal.

Find lists

There are a few ways you can audit your inbox for lists. The first is to use the 'Mailing lists' tab button. (Note that this is not visible if your screen layout is configured to use the reading pane.)

Image of filter buttons in the UI with the mailing list button selected

You can click on this to quickly filter your inbox by senders. Then you can go through and decide what you want to keep and what you want to unsubscribe from.

The other way to find a known list is to use our search toolbar and look for it by name.
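
If you prefer to audit outside the web interface, a rough equivalent can be scripted over IMAP with Python's standard library. This is only a sketch – the host and credentials are placeholders – and it simply counts List-Id headers so you can see which lists send you the most mail.

from collections import Counter
import email
import imaplib

# A rough sketch with placeholder host and credentials: count which mailing
# lists are landing in INBOX by reading their List-Id headers.
imap = imaplib.IMAP4_SSL("imap.example.com")
imap.login("user@example.com", "app-password")
imap.select("INBOX", readonly=True)

lists = Counter()
typ, data = imap.uid("SEARCH", None, "ALL")
for uid in data[0].split():
    typ, msg_data = imap.uid("FETCH", uid, "(BODY.PEEK[HEADER.FIELDS (LIST-ID)])")
    headers = email.message_from_bytes(msg_data[0][1])
    if headers.get("List-Id"):
        lists[headers["List-Id"]] += 1

# The most frequent lists are the strongest unsubscribe candidates.
for list_id, count in lists.most_common(20):
    print(count, list_id)

imap.logout()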

2. Social media notifications

These days there seems to be a never-ending list of social media platforms to use. Most of us would be aware of, or likely use, some or all of the biggest platforms such as Twitter, Facebook and LinkedIn.

And while social media can be great for staying in touch and promoting your business, notifications are often linked to the email address you set up your account with.

At times this can be convenient, however as these platforms continue to evolve you might find you have endless social media notifications taking over your inbox too.

Switching off notifications at the source

The good thing is that these notifications can be turned off, or managed, directly from the user settings for each individual social media platform you are using.

Visiting the ‘Settings’ or ‘Help’ menu of any social media platform you use should give you step-by-step instructions on how to control what gets sent to your inbox.

3. Spam

At FastMail we define spam as unsolicited mail sent to a large number of users in an attempt to get you to click on a link to buy a product, market an otherwise dubious service, scam you out of some money or install a virus or malware on your computer.

We’re often asked why you would keep receiving certain emails if they had previously been marked as spam.

For example, you may have previously received email you consider to be spam and decide to report the sender as spam using the 'Report Spam' button. However, some days later you find another email from the same sender in your inbox, rather than automatically being moved to your Spam folder upon delivery.

There are a few reasons for this. The first is that at some stage you likely consented to receiving these emails (in some form) so that tells our systems you do want to receive these emails (and we’re all about making sure you receive your email).

The second reason is to do with how our spam filters work. You can choose a range of settings to ensure spam filtering works the best for your needs. We’ve talked about this previously, but essentially you train the spam filter.

Everybody's spam is different. When you report spam that's slipped through our filters, or non-spam that we've mistakenly classified, we feed this information into a database that's tuned just for you. We also automatically train this with spam you've deleted permanently from your spam folder, and non-spam you've moved to your Archive folder or replied to.

And while we never sell email addresses, nor disclose email addresses at our site to anyone else, there are other instances where unscrupulous marketers may have placed you on mailing lists you didn’t consent to – let’s just call them spammers – using a range of methods to spam you.


Taking action

FastMail gives you the power to control your inbox, using a range of features to manage which mail comes to you.

Block the sender

If you can't unsubscribe or switch off notifications, you can block a particular sender by setting up a rule to permanently discard a message upon delivery. We do recommend sending mail into a folder when first setting up the rule, because mail discarded in this way is gone forever: we can't even get it back from backups.

If you have lots of senders you want to block, add them to a group in your addressbook, then create a rule to discard or file mail from that group. You can also create a contact for a wildcard on the whole domain in this group: this will also block mail being sent from any address at that domain.

Mail you want to keep

If there are senders you never want blocked, add them to your address book. This also means mail from these trusted senders bypasses any spam checking. This might be a good option for online retailers you regularly use, making sure you receive any correspondence straight to your inbox.

Using rules to keep mail organised

Sometimes you might still want to receive email from particular senders but not have these messages taking over your inbox.

We recently wrote about organising your mail with rules and this is ideal for any correspondence that you still want but maybe not at the expense of your day-to-day inbox experience.

When you’re viewing a message you can use the 'More' button, then 'Add rule from Message…' option to directly create a new Rule for that particular mail. For example, you might send all mail from online retailers to a folder called ‘Purchases’.

image showing the Add Rule function when viewing a message in the FastMail web interface

Welcome to your streamlined inbox

So now, rather than waiting for your inbox to fill up and then manually batch-deleting every few weeks or months you can take back control today!

And whether you want to completely unsubscribe from lists or set up rules, the choice is up to you.

Either way, this year you may finally get to utter those words, “I finally unsubscribed from those throw pillow emails”, making 2018 the year you bring more peace and control to your inbox.


My back to school free software toolkit

Published 30 Jan 2018 by legoktm in The Lego Mirror.

The 2018 spring semester started last Wednesday. I think I've set up a pretty good free software toolkit for a successful year:

Total software cost: $0


The Journey

Published 26 Jan 2018 by jenimcmillan in Jeni McMillan.

I’m on a bus. Denmark has faded into the distance and now I’m passing through wind generator infested fields on the way to Berlin.  You know I care about climate change.  I’ve even vowed not to get on a plane again so that could very bad news for anyone expecting me back soon. I guess there’s always sea travel but I can’t decide what worries me more… pirates or seasickness.  I’ll start by doing laps of the sauna. (I know that doesn’t make sense but they’re great).
News trickles through to the remote corners of the world where I’ve been thigh deep in snow, that Australia has been experiencing a heatwave. When I was in Russia someone told me that Sydney had 48 degrees that day. He wasn’t Russian. In general, they’re not friendly with foreigners, unless one is in a sparse, white-tiled community bathhouse with a crowd of large, naked women. Trust me, it was fabulous. If only I had my sketchbook and charcoal.
Along with breathtaking architecture and cheap hostels that were once palaces,  and some photo opportunities that were golden, the lack of smiles was a constant during my three weeks in post Soviet Russia.
When I arrived in Stockholm,  laughter surprised me and the variety of different backgrounds were striking. What a relief to be amongst other humans who could laugh even when life isn’t perfect. It was still minus 5, the metro crowded and I was a foreigner. Of course I loved Russia but a huge thank you to the Swedes, Norwegians and Danish people for being you. I had a fabulous time and I’m sure I’ll go back for my friend’s wedding in August, assuming I manage the next round of paperwork in France.
I’m making my way back to France slowly.  There’s a whole mini series in my dental tourism escapades that happen before I get there. Hello Budapest.. I don’t require being picked up at the airport or help with a discounted hotel but bus and hostel will be fine to get me to your lovely dental suites. 12 February. Stay tuned.
In the meantime, Berlin with its politics, art, contact improvisation and some lovely friends are less than an hour away. I’m excited! The bus is approaching Frankfurt and it’s time I started looking out the window.
Take care, smile and give hugs. It’s a wonderful gift.
PS I didn’t pose naked in the snow but I did take the photograph.

How can I allow sysops and specific users to hide spam articles in MediaWiki?

Published 24 Jan 2018 by jehovahsays in Newest questions tagged mediawiki - Webmasters Stack Exchange.

My website is powered by MediaWiki Version 1.29.1.

Sometimes the Recent Changes results page becomes cluttered with spam articles that I wish to hide from the results page. How can I allow specific users to hide them?

Keep in mind, I don't need spam protection and I only need to know how to hide spam articles from the results page.


Meet the Team

Published 23 Jan 2018 by Rebecca Waters in DDD Perth - Medium.

The 2017 DDD Perth team (credit: David Schokker)

The last 6 months at DDD Perth has seen us hit some amazing milestones:

DDD Perth started in 2015, and two dedicated people, Robert Daniel Moore and Matt Davies, have been driving the conference forward each and every year. In 2017, we spread the load amongst a wider group, and in 2018 we’ve formalised the conference with the incorporation of DDD WA Inc. It’s the DDD Perth committee you know and love, with a new name (for the association only) and a big ‘ole cup of motivation for the future.

So, let’s meet the team involved in DDD Perth for 2018.

Lee Campbell

Lee Campbell (@LeeRyanCampbell) | Twitter

Lee describes himself as an angry, messy, intolerant dev who has a compelling need to contribute to the community to make up for his sins. He’s a freelance developer, author and presenter with extensive experience in concurrent programming, composite applications, web and integration technologies.
Lee’s vision for DDD WA is a platform where we can grow the broad base of Junior members of the community so that they can challenge the seniors and, at the other end of the spectrum, provide the content and challenging ideas that can help our seniors become world leaders.

Rebecca Waters

Rebecca Waters | Professional Profile | LinkedIn

Rebecca is a software engineer and project manager who feeds off the enthusiasm of others and contributes widely to the Perth software industry. She is a mentor in and outside of her company to junior developers and other professionals, an ex-officio board member of the Australian Industry and Defence Network of WA, and committee member of the Australian Computer Society — Women chapter.
Rebecca’s vision for DDD WA is to be the place you want to be at, and DDD Perth the conference you can’t afford to miss.

Rob Moore

Robert Daniel Moore (@robdmoore) | Twitter

Rob is a Principal Consultant for Readify. He’s passionate about leading software teams to deliver business value.
Rob wants an inclusive and accessible conference for the Perth software industry, and will work hard to make DDD Perth that conference!

Ian Hughes

Ian Hughes (@ian_hughes) | Twitter

Ian Hughes likes code, science, travel, beer, and footy. He’s a Principal Consultant with Readify during the week, primarily in Agile, DevOps and Mobile; he does other stuff on the weekend, like trying to bring the amazing developer community in WA together to talk about and share their experiences at DDD Perth and the Agile Perth Meetup.

Rob Crowley

Rob Crowley (@robdcrowley) | Twitter

Rob is a software consultant, developer and team lead with a passion for delivering systems that perform at scale. Rob has over 15 years of experience building distributed systems on the web stack and has read more RFCs than he cares to admit. He has spoken at various conferences around Australia such as Elastic{ON}, DDD Melbourne and NDC Sydney and brings a wealth of experience to DDD WA.

Aidan Morgan

Aidan Morgan (@aidanjmorgan) | Twitter

Aidan likes whiskey and making things. He is an experienced CTO and is currently the head of engineering at Boundlss. He is most passionate about machine learning, Agile and the Perth Startup community, but is mainly interested in getting things done. He also hates talking about himself in the third person.
Aidan is passionate about DDD Perth because he likes connecting people and learning more about the cool things that are going on in the Perth software community.

Matt Ward

matt ward (@mattyjward) | Twitter

Matt is a technical lead at Bankwest and full stack developer. Matt is passionate about Junior Developers and is involved in the juniordev.io movement in Perth as well as DDD Perth.

Ashley Aitken

Ashley Aitken (@AshleyAitken) | Twitter

Ashley is an entrepreneur, software engineer, IT trainer and academic, and family man. He’s trained software developers and IT professionals for companies like IBM, Apple, and HP, and organised and presented at IT conferences around the world. He’s also a big fan of Lean Startup with Customer Development and runs the Lean Startup Perth meetup.
Ashley’s vision for DDD Perth is for it to encourage and support WA software developers to lead the world in software development practices. He believes we don’t have to just follow, we can set the pace and direction and DDD Perth will play a big part in that.

Marjan Zia

Tweets with replies by Marjan Zia (@zia_marjan) | Twitter

Marjan is a passionate software developer who feels lucky to work in the finance industry. Her main hobby is to build applications that bring benefits to the end users. She is a very customer focused person who puts the customers at the heart of her development cycle. Her main career goal is to become a tech-lead to aid development teams with designing software packages.
Her main goal in being part of DDD is to help organise events that WA tech lovers want to attend and get great benefit from.

Derek Bingham

Derek Bingham (@deekob) | Twitter

Derek is a journeyman developer, building software applications over the past 20 years in many stacks you can and can’t imagine.
Witnessing the inclusivity and diversity that DDD has brought to the Perth community has been inspirational and he hopes to make a contribution to that continuing. Currently plying his trade at Amazon Web Services.

Jake Ginnivan

Jake Ginnivan (@JakeGinnivan) | Twitter

Jake is the Digital Development Manager for Seven West Media and an Open Source Software enthusiast. Jake’s a seasoned presenter and workshop facilitator. Jake has spoken most recently at NDC Sydney, NDC London, and DDD Perth. He brings a wealth of experience to DDD WA.

Andrea Chagas

Andrea Chagas (@andrealchagas) | Twitter

Andrea Chagas is a mobile developer at Bankwest. She is a tech enthusiast who is passionate about collaboration with colleagues. She is constantly cooking up new ideas and ways to do things.


Meet the Team was originally published in DDD Perth on Medium, where people are continuing the conversation by highlighting and responding to this story.


The Full BBS Documentary Interviews are Going Online

Published 23 Jan 2018 by Jason Scott in ASCII by Jason Scott.

This year, the full 250 hours of interviews I conducted for the BBS Documentary are going online at the Internet Archive.

There’s already a collection of them up, from when I first set out to do this. Called “The BBS Documentary Archive”, it’s currently 32 items from various interviews, including a few clip farms and full interviews of a bunch of people who sat with me back in the years of 2002-2004 to talk about all manner of technology and bulletin board history.

That collection, as it currently stands, is a bit of an incomplete mess. Over the course of this project, it will become a lot less so. I’ll be adding every minute of tape I can recover from my storage, as well as fixing up metadata where possible. Naturally you will be asked to help as well.

A bit of background for people coming into this cold: I shot a movie called “BBS: The Documentary” which ended up being an eight-episode mini-series. It tried to be the first and ultimately the last large documentary about bulletin board systems, those machines hooked up to phone lines that lived far and wide from roughly 1978 to the 2000s. They were brilliant and weird and they’re one of the major examples of life going online. They laid the foundation for a population that used the Internet and the Web, and I think they’re terribly interesting.

I was worried that we were going to never get The Documentary On BBSes and so I ended up making it. It’s already 10 years and change since the movie came out, and there’s not been another BBS Documentary, so I guess this is it. My movie was very North American-centric and didn’t go into blistering detail about Your Local BBS Scene, and some people resented that, but I stand by both decisions; just getting the whole thing done required a level of effort and energy I’m sure I’m not capable of any more.

Anyway, I’m very proud of that movie.

I’m also proud of the breadth of interviews – people who pioneered BBSes in the 1970s, folks who played around in scenes both infamous and obscure, and experts in areas of this story that would never, ever have been interviewed by any other production. This movie has everything: Vinton Cerf (co-creator of the Internet) along with legends of Fidonet like Tom Jennings and Ken Kaplan and even John Madill, who drew the FidoNet dog logo. We’ve got ANSI kids and Apple II crackers and writers of a mass of the most popular BBS software packages. The creator of .QWK packets and multiple members of the Cult of the Dead Cow. There’s so much covered here that I just think would never, ever be immortalized otherwise.

And the movie came out, and it sold really well, and I open licensed it, and people discover it every day and play it on YouTube or pull out the package and play the original DVDs. It’s a part of culture, and I’m just so darn proud of it.

Part of the reason the movie is watchable is because I took the 250 hours of footage and made it 7.5 hours in total. Otherwise… well….

…unless, of course, you’re a maniac, and you want to watch me talking with people about subjects decades in the past and either having it go really well or fall completely apart. The shortest interview is 8 minutes. The longest is five hours. There are legions of knowledge touched on in these conversations, stuff that can be a starting point for a bunch of research that would otherwise be out of options to even find what the words are.

Now, a little word about self-doubt.

When I first started uploading hours of footage of BBS Documentary interviews to the Internet Archive, I was doing it from my old job, and I had a lot going on. I’d not done much direct work with the Internet Archive and didn’t know anything about what was going on behind the scenes, or how things worked, or frankly much about the organization in any meaningful amount. I just did it, and sent along something like 20 hours of footage. Things were looking good.

Then, reviews.

Some people started writing a few scathing responses to the uploads, pointing out how rough they were, my speech patterns, the interview style, and so on. Somehow, I let that get into my head, and so, with so much else to do, I basically walked away from it.

12 years later (12 years!) I’m back, and circumstances have changed.

I work for the Archive, I’ve uploaded hundreds of terabytes of stuff, and the BBS documentary rests easily on its laurels of being a worthwhile production. Comments by randos about how they wish I’d done some prettify-ing of the documentary “raw” footage don’t even register. I’ve had to swim upstream through a cascade of poor responses to things I’ve done in public since then – they don’t get at me. It took some time to get to this place of comfort, which is why I bring it up. For people who think of me as some bulletproof soul, let it be known that “even I” had to work up to that level, even when sitting on something like BBS Documentary and years of accomplishment. And those randos? Never heard from them again.

The interview style I used in the documentary raw footage should be noted because it’s deliberate: they’re conversations. I sometimes talk as much as the subjects. It quickly became obvious that people in this situation of describing BBS history would have aspects that were crystal clear, but would also have a thousand little aspects lost in fuzzy clouds of memory. As I’d been studying BBSes intensely for years at this point, it would often take me telling them some story (and often the same stories) to trigger a long-dormant tale that they would fly with. In many cases, you can see me shut up the second people talk, because that was why I was talking in the first place. I should have known people might not get that, and I shouldn’t have listened to them so long ago.

And from these conversations come stories and insights that are priceless. Folks who lived this life in their distant youth have all sorts of perspectives on this odd computer world and it’s just amazing that I have this place and collection to give them back to you.

But it will still need your help.

Here’s the request.

I lived this stupid thing; I really, really want to focus on putting a whole bunch of commitments to bed. Running the MiniDV recorder is not too hard for me, and neither is the basic uploading process, which I’ve refined over the years. But having to listen to myself for hundreds of hours using whatever time I have on earth left… it doesn’t appeal to me at all.

And what I really don’t want to do, beyond listening to myself, is enter the endless amount of potential metadata, especially about content. I might be inspired to here and there, especially with old friends or interviews I find joyful every time I see them again. But I can’t see myself doing this for everything and I think metadata on a “subjects covered” and “when was this all held” is vital for the collection having use. So I need volunteers to help me. I run a Discord server that communicates with people collaborating with me and I have a bunch of other ways to be reached. I’m asking for help here – turning this all into something useful beyond just existing is a vital step that I think everyone can contribute to.

If you think you can help with that, please step forward.

Otherwise… step back – a lot of BBS history is about to go online.

 


The Undiscovered

Published 19 Jan 2018 by Jason Scott in ASCII by Jason Scott.

There’s a bit of a nuance here; this entry is less about the specific situation I’m talking about, than about the kind of situation it is.

I got pulled into this whole thing randomly, when someone wrote me to let me know it was going on. Naturally, I fired into it with all cylinders, but after a while I figured out that very good people were already on it, by days, and so I don’t actually have to do much of anything. That works for me.

It went down like this.

MOS Technology designed the 6502 chip which was in a mass of home computers in the 1970s and 1980s. (And is still being sold today.) The company, founded in 1969, was purchased in 1976 by Commodore (they of the 64 and Amiga) and became their chip production arm. A lot of the nitty gritty details are in the Wikipedia page for MOS. This company, now a subsidiary, lived a little life in Pennsylvania throughout the 1980s as part of the Commodore family. I assume people went to work, designed things, parked in the parking lot, checked out prototypes, responded to crazy Commodore administration requests… the usual.

In 1994, Commodore went out of business and its pieces bought by various groups. In the case of the MOS Technology building, it was purchased by various management and probably a little outside investment, and became a new company, called GMT Microelectronics. GMT did whatever companies like that do, until 2001, when they were shut down by the Environmental Protection Agency because it turns out they kind of contaminated the groundwater and didn’t clean it up very well.

Then the building sat, a memory to people who cared about the 6502 (like me), to former employees, and probably nobody else.

Now, welcome to 2017!

The building has gotten a new owner who wants to turn the property into something useful. To do this, they basically have to empty it, raze the building to the ground, clean the ground, and then build a new building. Bravo, developer. Remember, this building has sat for 16 years, unwanted and unused.

The sign from the GMT days still sits outside, unchanged and just aged from when the building was once that business. Life has certainly gone on. By the way, these photos are all from Doug Crawford of the Vintage Computing Federation, who took this tour in late 2017.

Inside, as expected, it is a graffiti and firepit shitshow, the result of years of kids and others camping out in the building’s skeletal remains and probably whiling away the weekends hanging out.

And along with these pleasant scenes of decay and loss are some others involving what Doug thought were “Calcium Deposits” and which I personally interpret as maybe I never need to set foot in this building at any point in my future life and probably will have to burn any clothing I wear should I do so.

But damn if Doug didn’t make the journey into this environmentally problematic deathtrap to document it, and he even brought a guest of some renown related to Commodore history: Bil Herd, one of the designers of the Commodore 128.

So, here’s what I want to get to: In this long-abandoned building, decades past prime and the province of trespassers and neglect, there turns out to have been quite a bit of Commodore history lying about.

There’s unquestionably some unusually neat items here – old printed documentation, chip wafers, and those magnetic tapes of who knows what; maybe design or something else that needed storage.

So here’s the thing; the person who was cleaning up this building for demolition was put into some really weird situations – he wanted people to know this was here, and maybe offer it up to collectors, but as blowback came from folks when he revealed he’d been throwing stuff out, he was thrown into a defensive position and ultimately ended up sticking with looking into selling it, like salvage.

I think there’s two lessons here:

  1. There’s no question there are caches of materials out there, be they in old corporate offices, warehouses, storerooms, or what have you, that are likely precious windows into bygone technology. There’s an important lesson in not assuming “everything” is gone and maybe digging a bit deeper. That means contacting places, inquiring with simple non-badgering questions, and being known as someone interested in some aspect of history so people might contact you about opportunities going forward.
  2. Being a shouty toolbox about these opportunities will not improve the situation.

I am lucky enough to be offered a lot of neat materials in a given month; people contact me about boxes, rooms and piles, unsure of what the right steps are. They don’t want to be lectured or shouted at; they want ideas and support as they work out their relationship to the material. These are often commercial products now long-gone and there’s a narrative that old automatically means “payday at auction” and that may or may not be true; but it’s a very compelling narrative, especially when times are hard.

So much has been saved and yes, a lot has been lost. But if the creators of the 6502 can have wafers and materials sitting around for 20 years after the company closed, I think there’s some brightness on the horizon for a lot of other “lost” materials as well.


User Dictionaries – a Fundamental Design Flaw

Published 18 Jan 2018 by Andy Mabbett in Andy Mabbett, aka pigsonthewing.

I have just had to add several words to the user dictionary for the spell-checker in Notepad++, that I have already added to my user dictionary in LibreOffice, and to my user dictionary in (all under Windows 10 – does this happen with user dictionaries under Unix & Mac operating systems?).

Notepad++ spell-checker, not recognising the word 'Mabbett'

Under , a user should not have to accept a word’s spelling more than once.

User dictionaries should not be in a “walled garden” within an application. They should exist at operating-system level, or more specifically, at user-account level.

Or, until Microsoft (and other operating system vendors) implement this, applications — at least, open source applications like those listed above — should make their user dictionaries accessible to each other.

Some issues to consider: users with dictionaries in more than one language; security.

Prior art: I raised a Notepad++ ticket about this. It was (not unreasonably) closed, with a pointer to this DSpellCheck ticket on the same subject.

The post User Dictionaries – a Fundamental Design Flaw appeared first on Andy Mabbett, aka pigsonthewing.


Edit existing wiki page with libreoffice writer

Published 18 Jan 2018 by Rafiek in Newest questions tagged mediawiki - Ask Ubuntu.

I've read about sending a page to a MediaWiki wiki using LibreOffice Writer.
But is it possible to call up an existing wiki page, edit it, and send it back to the wiki?
If so, how is this done?


Use remote Tomcat/Solr for BlueSpice ExtendedSearch

Published 15 Jan 2018 by Dominic P in Newest questions tagged mediawiki - Webmasters Stack Exchange.

Is it possible to configure the BlueSpice ExtendedSearch extension to connect to a remote Apache Tomcat/Solr instance instead of installing all of that on the same machine that runs BlueSpice?

I looked through the install guide for ExtendedSearch, but I couldn't find any mention of this as an option.

Any ideas?


THIS IS NOT A WELLNESS BLOG

Published 15 Jan 2018 by timbaker in Tim Baker.

I have found myself recently in the unfamiliar and uncomfortable position of defending natural therapies. This is not a role I ever foresaw for myself. I understand the rigorous scientific process of developing and testing theories, assessing evidence and requiring proof. I studied science into...

New year, new tool - TeraCopy

Published 12 Jan 2018 by Jenny Mitcham in Digital Archiving at the University of York.

For various reasons I'm not going to start 2018 with an ambitious to-do list as I did in 2017... I've still got to do much of what I said I was going to do in 2017, and my desk needs another tidy!

In 2017 I struggled to make as much progress as I would have liked - that old problem of having too much to do and simply not enough hours in the day.

So it seems like a good idea to blog about a new tool I have just adopted this week to help me use the limited amount of time I've got more effectively!

The latest batch of material I've been given to ingest into the digital archive consists of 34 CD-ROMs and I've realised that my current ingest procedures were not as efficient as they could be. Virus checking, copying files over from 1 CD and then verifying the checksums is not very time consuming, but when you have to do this 34 times, you do start to wonder whether your processes could be improved!

In my previous ingest processes, copying files and then verifying checksums had been a two stage process. I would copy files over using Windows Explorer and then use FolderMatch to confirm (using checksums) that my copy was identical to the original.

But why use a two stage process when you can do it in one go?

The dialog that pops up when you copy
I'd seen TeraCopy last year whilst visiting The British Library (thanks Simon!) so decided to give it a go. It is a free file transfer utility with a focus on data integrity.

So, I've installed it on my PC. Now, whenever I try and copy anything in Windows it pops up and asks me whether I want to use TeraCopy to make my copy.

One of the nice things about this is that it will also pop up when you accidentally click and drop a directory into another directory in Windows Explorer (who hasn't done that at least once?), giving you the opportunity to cancel the operation.

When you copy with TeraCopy it doesn't just copy the files for you, but also creates checksums as it goes along and then at the end of the process verifies that the checksums are the same as they were originally. Nice! You need to tweak the settings a little to get this to work.


TeraCopy busy copying some files for me and creating checksums as it goes


When copying and verifying is complete it tells you how many files it has verified and shows matching checksums for both copies - job done!
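
For anyone who prefers to script the same copy-and-verify idea, here is a minimal sketch in Python. It is not TeraCopy itself, and the source and destination paths are made up: it copies every file and then confirms each copy is bit-identical by comparing SHA-256 checksums.

import hashlib
import shutil
from pathlib import Path

# A minimal copy-and-verify sketch (not TeraCopy); paths are placeholders.
SOURCE = Path("D:/cd_contents")
DEST = Path("E:/digital_archive/accession_001")

def sha256(path):
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

for src in SOURCE.rglob("*"):
    if src.is_file():
        dst = DEST / src.relative_to(SOURCE)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents and timestamps
        assert sha256(src) == sha256(dst), "checksum mismatch: %s" % src
        print("verified", dst)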

So, this has made the task of copying data from 34 CDs into the digital archive a little bit less painful and has made my digital ingest process a little bit more efficient.

...and that from my perspective is a pretty good start to 2018!


Keeping it real

Published 4 Jan 2018 by jenimcmillan in Jeni McMillan.


It’s true, cities are not places for wild goats. It’s difficult to reflect amongst the chaos of a human built landscape, unless it’s on our generation of narcissists. This is not personal. Who hasn’t taken a gratuitous selfie once in a while? I’m right in there, fudging the edges with art in my heart. You know me, I adore a good self-portrait, usually without clothes.

Now that I have a backpack instead of a room and a bank account that dives gracefully toward the abyss, I’ve crossed borders and fallen in love with a number of foreign places. All on the cheap. Hitch-hiking. Sleeping in my tent. Washing under greek waterfalls or in post-soviet sauna houses. Wherever I find myself, there are people with smartphones. We Insta and Facebook, Gab, Google+, MySpace, LinkedIn, Pinterest, Reddit, Tumblr, Twitter, Viber, VK, WeChat, Weibo, WhatsApp, Wikia, Snapchat and YouTube. Sometimes we even email friends who are detoxing from social media overload. Then we write blogs.

Yesterday I went to The Winter Place of Peter the Great. Yes, I’m in St Petersburg where people rarely smile, unless they are really happy. That can be infuriating but somehow, in a world of manufactured happiness and political turmoil, perhaps it is a good thing.


Looking forwards, looking backwards (2017/2018)

Published 2 Jan 2018 by inthemailbox in In the mailbox.

Yesterday, 1 January, I was reminded by the British Museum that the month is named after Janus, the Roman god. The International Council on Archives (ICA) uses a stylised form of Janus for their logo, because archivists do the same thing. Archives are identified and appraised based on their ongoing value to the community and to the organisations and people that created them. Over time, they develop historicity, which leads to the common, but mistaken, belief that archives are about “old stuff”.

January 1 is also the time for looking back over the previous year, and making resolutions about the forthcoming year. Personally, I think the latter is foolish, because I ascribe to the aphorism that no plan survives contact with reality, and 2017 demonstrates that perfectly.

I started with grand plans for a new blog on the Pie Lunch Lot, my mother and her cronies’ answer to what we now call the FIFO lifestyle, without benefit of modern social media. This would mean that I would take charge of my personal archives, and work within an historian’s framework. Yeah, right.

Blogs on this site were also few and far between. I left Curtin, and the luxury of reading and reviewing articles as part of my work there. Back at SRO, I’ve been involved with archival description and with developing our archival management system. This has had congruences with my private work, including  a poster at the Association of Canadian Archivists conference in Ottawa (Disrupting description – Canada3) and developing a workshop on archival description for the ASA conference in Melbourne (of which more another time).

I also became the program chair for the ASA conference in Perth in 2018 – “Archives in a Blade Runner age”, which has led to the creation of another blogsite, this one on science fiction and archives. (Don’t forget to submit an abstract before February 28, and, yes, there will be no extensions.) And, I became a National Councillor for the ASA, which has its own steep learning curve.

Add in the usual chaos that is life, and there you have it. 2017 not as planned, 2018 already out of control 🙂


Looking forward to 2018

Published 24 Dec 2017 by Bron Gondwana in FastMail Blog.

This is the final post in the 2017 FastMail Advent Calendar. In the previous post we met Rik.

We’ve done it! 24 blog posts, one per day.

To begin this aspirational blog post, a goal. We plan to post more frequently overall next year. At least one post every month.

This should be fairly easy since we have a Pobox blog, a Topicbox blog and of course this FastMail blog.

One company

In 2018 we will continue the process of becoming a single company where everybody “owns” all our products, rather than two separate companies flying in close formation, each with their own products. Rik is driving the process of integrating development while Rob N ★ leads merging of operations.

We no longer wake somebody with automated notifications if there’s a person awake in a different timezone who can perform triage, leading to better sleep for everybody. We’re also distributing first-line support between the offices, and training support staff in all our products, for a closer working relationship between the people seeing emerging issues, and the people responding to them.

Our 4 products have their own branding, but internally we’re becoming a single team who love all our children equally (ssssh … I think we each still have our favourite)

Settling in to our new digs

FastMail Melbourne moved to a new office in the middle of the year, and the process was not entirely painless.

Special mention to Jamie who somehow didn’t go mad through all of this. What a welcome to the company – he’s just coming up to the completion of his first year with us, and when I asked him to take point on the office fit-out, I had no idea what I was asking him to do. I’m sure he had no idea either, or he wouldn’t have said yes!

Our office is a lovely space, just 50 metres from our old office, so we can still go to our favourite coffee places in the morning! We have a favourite place we normally go, but we can be fickle – if their coffee isn’t up to our snobby standards, Melbourne has plenty of nearby hipster options just waiting for our custom. This year we’ve mostly dragged people who used disposable paper cups into bringing reusable keep-cups instead. Reusable keep-cups are totally Melbourne.

The morning coffee run is our main regular social gathering, and a highlight of the day. Even non-coffee-drinkers join us for the walk and chat.

Improving our products

The world keeps changing, and no product can keep still and stay successful. But we’re not just reacting, we have plans for new features too!

Next year we will keep polishing Topicbox based on feedback from our early adopters. We also have some neat ideas for new features which will make it even more compelling for teams working together.

FastMail hasn’t seen many user-visible changes in the past year, largely because we’ve been focusing on getting JMAP finished and the interface ported over to use it. Three years since our first blog post about JMAP, we’re really close to a finished mail spec. 2018 will be the year of JMAP on the desktop, and then we can start adding new features that build upon the new protocol.

More and more people are accessing our products primarily on mobile devices. We have neglected our apps in 2017, and we will remedy that in 2018. Mobile experience is an explicit focus for us in the coming year, and we’ve engaged outside help to assist with our app development.

Continuing to support Open Source and the community

We fund large amounts of the development work going into the Cyrus IMAP server, as well as the many other open source projects we work on.

We have sponsored various conferences in the past year, and provide free email service to some groups that we feel are well aligned with our mission, like the Thunderbird project, one of the most well known open source email clients.

And of course we send people, and give our time to standards work and collaboration at IETF, M3AAWG and CalConnect.

Pragmatism

This is always the most interesting thing to me when I follow discussions about issues that affect us and our customers. Privacy and security are key features for everybody, as are usability and speed. Ideally, as a customer, these things are invisible. You only notice speed when things get slow. You only notice usability when you’re struggling to achieve your goals. You only notice privacy and security when it turns out you didn’t have them.

Neil wrote a great post earlier in this advent series about our mindset around security. Security and usability are frequently in opposition – the most secure computer is one which is turned off and unplugged from the network. The problem is, it’s easy to believe that something is more secure just because it’s harder to use – that is rarely true.

For example if you pop up dialogs all the time to ask users to make security decisions, they will just click “Yes” without reading and actually be less secure than if asked rarely. Our preferred interaction is to perform the requested action immediately, but make undo simple, so the common case is painless. We also provide a restore from backup feature which allows recovery from most mistakes.

As we review our systems for GDPR compliance next year, we will have pragmatic and effective security in mind.

To 2018

The advent calendar is over, Christmas is almost here, and the New Year just a week away. 2018 will be an exciting year for us.

Thank you again for reading, and for your loyal support throughout 2017. We depend upon the existence of people who are willing to pay to be the customer rather than the product. We’re not the cheapest option for your email, but we firmly believe we are the best. We love what we do, and we love the direct relationship with our customers, payment for service provided.

FastMail the company is growing strongly. We have great people, great products, great customers, and funky new t-shirts.

Cheers! See you next year :)


Team Profile: Rik

Published 23 Dec 2017 by Helen Horstmann-Allen in FastMail Blog.

This is the twenty-third post in the 2017 FastMail Advent series. The previous post was about our response to the GDPR. We finish the series with a post looking forward to next year.


2017 has been a year of big changes for FastMail, team-wise. As Bron Gondwana stepped up to CEO, the role of CTO has been taken on by one of our new American colleagues, Ricardo Signes. We picked him up in 2015 when we acquired Pobox and Listbox, and he’s spent the bulk of his time since then building our newest team email product, Topicbox. Get to know Rik!

Photo of Ricardo Signes

What I work on

Historically, I have been the primary programmer on Pobox and Listbox, and I did a lot of work in the last few years building the framework of Topicbox. But nowadays, I spend most of my time coordinating the work of the development and operations teams, figuring out who’s doing what and whose work might be blocking whom, so that people aren’t sitting frustrated at their desks.

As CTO, I balance the technology requirements across different groups. Generally we don’t have people who want contradictory things, but sorting out work often requires determining invisible pre-requisites and doing that work first. It requires figuring out the way to get from here to there… And preferably after I’ve already figured out what people are likely to want next.

Figuring out what people will want next is often a natural by-product of talking to people about what they want. As we take things from the abstract to the concrete, I try to stay focused on the goals (and really understanding them!) rather than the suggested technical implementation they’ve requested. Time is often a consideration; a lot of times, just keeping in mind the next logical iteration of the solution you can get today is all the plan for the future you need.

How did you get involved with FastMail?

They bought me? I got involved with Pobox in 2005 when Dieter Pearcey heard me saying I was looking for somewhere else to hang my hat. He and I had debugged some problems earlier that year on IRC, so when he told me to apply, I did. About 8 years later, I met Bron at OSCON. We were having a beer when super-connector Paul Fenwick realized we worked at Pobox and FastMail, respectively, and asked if we were going to brawl. We did not; we ended up discussing the common problems and solutions of our load balancers and user migrator implementations. About a year after that, we started the long process of acquisition. A year after that, it happened. 16 months after that, I was the CTO.

I took a photo at the time, recording our meeting for posterity.

Bron and RJBS

What’s your favourite thing you’ve worked on this year?

In technical terms, it’s Topicbox. When building Topicbox, we treated it like a greenfield project. We didn’t reuse many of our standard components, but the technical decisions I made were based on years of using our tools and thinking about how I would do it if I did it from scratch. As many of those plans were borne out in successful technical solutions, it was really rewarding — a pleasure to build and see used.

But, more than that, I have loved organizing the technical team. It’s a really talented group of people, with many unique areas of expertise. Orchestrating all of them requires having a handle on what everyone is doing. Doing it successfully also requires I have at least a basic understanding of what everyone is working on. It is either an excuse or demand for me to be learning a little more all the time, which is great! It forces me to get off the same old path.

What’s your preferred mobile platform?

I use iOS. I don’t have really strong feelings about it. It has a bunch of things I prefer, but … I’ll use anything that has Spotify.

What other companies, people or projects inspire you?

The Rust language project is really inspirational. Like most technical projects, they’ve always striven to have a technically excellent product, but they also decided early on that they were unwilling to compromise on their community to get it. So, the community is not the toxic, or even merely odious, one that you can find in other projects with a lesser commitment to community.

Julia Evans, who is an endlessly prolific source of interesting and instructive material, and who is always positive in attitude, is the kind of technical role model I aspire to be. She says my favorite thing, which is that computers are not magical; you can start from first principles and figure out what is going on, always.

Companies are impressive for lots of reasons, but I’m pleased when companies doing interesting work make it clear what their values are, especially when you can see it’s true. They make it clear they have a direction beyond just making money. They promote that the direction they’ve chosen has value to them. They make it easy to guess what it would be like to work there, and what kind of work and behavior would be rewarded. Netflix and Stripe are two examples that come to mind; I hope I do my part to expose a similar ethos here at FastMail.

What’s your favourite FastMail feature?

I like FastMail’s push support, because it makes clear that FastMail really is fast. It looks like a super simple feature, but the technical details are way more complicated than they should be. It’s technically interesting and you can always get Rob N to tell a good story about it!

My favorite Pobox feature is the RSS feed of spam reports, which lets you review the small amount of mail Pobox isn’t quite sure about. I like it both because RSS is something that I wish had gotten wider adoption, and because I like having it in a separate place than my email or the web (which are the two other places you can review it.)

My favorite Topicbox feature is organization-wide search! Topicbox makes it easy for team members to create new groups, which is awesome for making sure all the right people are included in a discussion. But as soon as you start having more information than you can see in one screen, you want to search for it. The Topicbox search technology is based on FastMail’s, so it’s fast, thorough, and easy to refine. You find the full thread… and the conclusion. Organization-wide search is, to me, the best reason to move your organization’s email discussions to Topicbox. (And, yes, we can help you import from an archive or even a personal mailbox!)

What’s your favourite or most used piece of technology?

My bicycle! It embodies everything I think of as technology. It lets you solve a problem that you could probably solve without it, but much more efficiently. It also rewards curiosity. You don’t need to know how it works to use it. But it’s really easy to learn how to take it apart, fix it, and make it better. Also, like most of the technology I like, I don't use it as often as I'd like.

This isn't my bike. It's a photo I took while on a trip to the Australian office. It's a sculpture of three bicycles occupying the same space!

Three bicycles

What are you listening to / watching these days?

I’m finally catching up on Song-by-Song podcasts, which discusses every Tom Waits song, one per episode. But that means I’m listening to a lot of Tom Waits again too. It’s good to listen to full albums!

We talk a lot about music at FastMail, and we’ve gotten almost everyone on Spotify. We have a bot who tracks people’s Discover Weekly playlists, looking for duplicates, and determining who has compatible (and diverse!) musical tastes. I’ve found a bunch of good music that I wouldn’t have heard before because staffers have been listening. I also know who has consistent enough taste that I know I can always hit up their weekly playlist for, say, synth pop and 80s hits (Rob Mueller!).

What do you like to do outside of work?

I do coding projects outside of work, too, though less this year than in years past. I used to manage the perl5 project for many years, but now I'm just an active member of the peanut gallery.

I watch a lot of movies. I talk a lot about the horror movies I watch because they are the funniest to discuss, but I actually watch plenty of movies across all genres.

I run a D&D game, and I’ve been playing a lot of Mario Kart on my Nintendo Switch.

What's your favourite animal?

I used to have pet guinea pigs, so they’re up there! They’re my favorite animal that I would actually consider palling around with. But I’m also a fan of any animal that is really bizarre in some way. It reminds you that evolution goes in a lot of crazy ways.

Any FM staffers you want to brag on?

Everybody’s great! If I was going to call somebody out in particular, though, it would be Bron. We had reached an inflection point in terms of scale, where we needed to rethink the way we organized our work. Bron stepped up to make that happen, and we’re all better off for it.

What are you proudest of in your work?

In my technical work, over many years, I’m proudest we’ve been able to use a slowly evolving set of patterns without finding out they were fundamentally bankrupt. With Topicbox, we were able to test that theory in the biggest way — we started from scratch using those patterns as first principles, and it worked. So that was really rewarding.

On a larger scale than that, it’s always a blast to meet people in a professional setting who have heard of FastMail or Pobox. They will be excited to talk about what they know of the company, and often tell me they think it would be an exciting and great place to work. In large part, that's because of the people and culture we have, and I’m proud to have been part of making that the case!


Review of Stepping Off from Limina.

Published 17 Dec 2017 by Tom Wilson in thomas m wilson.

This book review is a good summary of Stepping Off  – thanks to Amy Budrikis.  Read the review at Limina.  

How would you change Archivematica's Format Policy Registry?

Published 15 Dec 2017 by Jenny Mitcham in Digital Archiving at the University of York.

A train trip through snowy Shropshire to get to Aberystwyth
This week the UK Archivematica user group fought through the snow and braved the winds and driving rain to meet at the National Library of Wales in Aberystwyth.

This was the first time the group had visited Wales and we celebrated with a night out at a lovely restaurant on the evening before our meeting. Our visit also coincided with the National Library cafe’s Christmas menu so we were treated to a generous Christmas lunch (and crackers) at lunch time. Thanks NLW!

As usual the meeting covered an interesting range of projects and perspectives from Archivematica users in the UK and beyond. As usual there was too much to talk about and not nearly enough time. Fortunately this took my mind off the fact I had damp feet for most of the day.

This post focuses on a discussion we had about Archivematica's Format Policy Registry or FPR. The FPR in Archivematica is a fairly complex beast, but a crucial tool for the 'Preservation Planning' step in digital archiving. It is essentially a database which allows users to define policies for handling different file formats (including the actions, tools and settings to apply to a specific file type for the purposes of preservation or access). The FPR comes ready populated with a set of rules based on agreed best practice in the sector, but institutions are free to change these and add new tools and rules to meet their own requirements.

Jake Henry from the National Library of Wales kicked off the discussion by telling us about some work they had done to make the thumbnail generation for pdf files more useful. Instead of supplying a generic thumbnail image for all pdfs they wanted the thumbnail to actually represent the file in question. They made changes to the FPR so that the pdf thumbnail generation uses GhostScript.
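
For anyone who wants to experiment outside Archivematica, the sort of GhostScript command such a rule wraps is easy to try. The sketch below is purely illustrative (I don't know the exact options NLW used, and the file names are invented): it simply renders page one of a PDF to a small PNG.

import subprocess

def pdf_thumbnail(pdf_path, thumbnail_path, resolution=72):
    """Render the first page of a PDF to a PNG thumbnail using Ghostscript."""
    subprocess.run(
        [
            "gs",
            "-dBATCH", "-dNOPAUSE",           # run non-interactively and exit when done
            "-sDEVICE=png16m",                # 24-bit colour PNG output
            f"-r{resolution}",                # a low resolution is fine for a thumbnail
            "-dFirstPage=1", "-dLastPage=1",  # only the first page is needed
            f"-sOutputFile={thumbnail_path}",
            str(pdf_path),
        ],
        check=True,
    )

# pdf_thumbnail("annual_report.pdf", "annual_report_thumb.png")  (example file names - substitute your own)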

NLW liked the fact that Archivematica converted pdf files to pdf/a but also wanted that same normalisation pathway to apply to existing pdf/a files. Just because a pdf/a file is already in a preservation file format it doesn’t mean it is a valid file. By also putting pdf/a files through a normalisation step they had more confidence that they were creating and preserving pdf/a files with some consistency.

Sea view from our meeting room!
Some institutions had not had any time to look in any detail at the default FPR rules. It was mentioned that there was trust in how the rules had been set up by Artefactual and that people didn’t feel expert enough to override these rules. The interface to the FPR within Archivematica itself is also not totally intuitive and requires quite a bit of time to understand. It was mentioned that adding a tool and a new rule for a specific file format in Archivematica is quite a complex task and not for the faint-hearted...!

Discussion also touched on the subject of those files that are not identified. A file needs to be identified before an FPR rule can be set up for it. Ensuring files are identified in the first instance was seen to be a crucial step. Even once a format makes its way into PRONOM (TNA’s database of file formats), Artefactual Systems have to carry out extra work to get Archivematica to pick up that new PUID.
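
As an aside (this wasn't something we discussed at the meeting), identification can also be scripted outside Archivematica, which can be handy when investigating a problem file before transfer. Here is a rough sketch using FIDO, a PRONOM-based identification tool - assuming it is installed (it is on PyPI as opf-fido) and simply capturing its default CSV output, which includes the matched PUID where there is one:

import subprocess

def identify(path):
    """Run FIDO against a single file and return its raw CSV output."""
    result = subprocess.run(
        ["fido", str(path)],
        capture_output=True, text=True, check=False,
    )
    return result.stdout

print(identify("mystery-file.dat"))  # file name invented for illustration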

Unfortunately normalisation tools do not exist for all files and in many cases you may just have to accept that a file will stay in the format in which it was received. For example a Microsoft Word document (.doc) may not be an ideal preservation format but in the absence of open source command line migration tools we may just have to accept the level of risk associated with this format.

Moving on from this, we also discussed manual normalisations. This approach may be too resource intensive for many (particularly those of us who are implementing automated workflows) but others would see this as an essential part of the digital preservation process. I gave the example of the WordStar files I have been working with this year. These files are already obsolete and though there are other ways of viewing them, I plan to migrate them to a format more suitable for preservation and access. This would need to be carried out outside of Archivematica using the manual normalisation workflow. I haven’t tried this yet but would very much like to test it out in the future.

I shared some other examples that I'd gathered outside the meeting. Kirsty Chatwin-Lee from the University of Edinburgh had a proactive approach to handling the FPR on a collection-by-collection and PUID-by-PUID basis. She checks all of the FPR rules for the PUIDs she is working with as she transfers a collection of digital objects into Archivematica and ensures she is happy before proceeding with the normalisation step.

Back in October I'd tweeted to the wider Archivematica community to find out what people do with the FPR and had a few additional examples to share. For example, using Unoconv to convert office documents and creating PDF access versions of Microsoft Word documents. We also looked at some more detailed preservation planning documentation that Robert Gillesse from the International Institute of Social History had shared with the group.

We had a discussion about the benefits (or not) of normalising a compressed file (such as a JPEG) to an uncompressed format (such as TIFF). I had already mentioned in my presentation earlier that this default migration rule was turning 5GB of JPEG images into 80GB of TIFFs - and this is without improving the quality or the amount of information contained within the image. The same situation would apply to compressed audio and video which would increase even more in size when converted to an uncompressed format.

If storage space is at a premium (or if you are running this as a service and charging for storage space used) this could be seen as a big problem. We discussed the reasons for and against leaving this rule in the FPR. It is true that we may have more confidence in the longevity of TIFFs and see them as more robust in the face of corruption, but if we are doing digital preservation properly (checking checksums, keeping multiple copies etc) shouldn't corruption be easily spotted and fixed?

Another reason we may migrate or normalise files is to restrict the file formats we are preserving to a limited set of known formats in the hope that this will lead to fewer headaches in the future. This would be a reason to keep on converting all those JPEGs to TIFFs.

The FPR is there to be changed, and given that not all organisations have exactly the same requirements it is not surprising that we are starting to tweak it here and there – if we don’t understand it, don’t look at it and don’t consider changing it, perhaps we aren’t really doing our jobs properly.

However there was also a strong feeling in the room that we shouldn’t all be re-inventing the wheel. It is incredibly useful to hear what others have done with the FPR and the rationale behind their decisions.

Hopefully it is helpful to capture this discussion in a blog post, but this isn’t a sustainable way to communicate FPR changes for the longer term. There was a strong feeling in the room that we need a better way of communicating with each other around our preservation planning - the decisions we have made and the reasons for those decisions. This feeling was echoed by Kari Smith (MIT Libraries) and Nick Krabbenhoeft (New York Public Library) who joined us remotely to talk about the OSSArcFlow project - so this is clearly an international problem! This is something that Jisc are considering as part of their Research Data Shared Service project so it will be interesting to see how this might develop in the future.

Thanks to the UK Archivematica group meeting attendees for contributing to the discussion and informing this blog post.


The Squeeze

Published 8 Dec 2017 by jenimcmillan in Jeni McMillan.

 

I’m about to squeeze this hulk between solid stone buildings that have withstood two world wars and four hundred years of seasonal change, love and laughter in the Aveyron. I’m not the first. This is the through road between wheat fields on high and the ancient moulins along the river that ground the grain to fine flour for the communal bread ovens. Tractors, horses, wagons, and more recently cars and the occasional truck have traversed this route. Today I’m driving the old Mercedes.

I’ve been in Europe for six months now. Hitching my way around France and Greece, meeting the strange, the interesting and the humorous along my way. Striding with backpack or pedalling the tiny trails that connect villages. I don’t drive cars. I’m on the wrong side of the road, the wrong side of the car, and I’m trying to brake with my right foot on the pedal. Sure I have been granted a temporary French permit to drive, but do I really want to exchange a life of adventurous travel for the easy option? I will decide after I have safely parked the car on the wild and wintery hill-top back at my friend’s house.

 

 


Cakes, quizzes, blogs and advocacy

Published 4 Dec 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Last Thursday was International Digital Preservation Day and I think I needed the weekend to recover.

It was pretty intense...

...but also pretty amazing!

Amazing to see what a fabulous international community there is out there working on the same sorts of problems as me!

Amazing to see quite what a lot of noise we can make if we all talk at once!

Amazing to see such a huge amount of advocacy and awareness raising going on in such a small space of time!

International Digital Preservation Day was crazy but now I have had a bit more time to reflect, catch up...and of course read a selection of the many blog posts and tweets that were posted.

So here are some of my selected highlights:

Cakes

Of course the highlights have to include the cakes and biscuits, including those produced by Rachel MacGregor and Sharon McMeekin. Turning the problems that we face into something edible does seem to make our challenges easier to digest!

Quizzes and puzzles

A few quizzes and puzzles were posed on the day via social media - a great way to engage the wider world and have a bit of fun in the process.


There was a great quiz from the Parliamentary Archives (the answers are now available here) and a digital preservation pop quiz from Ed Pinsent of CoSector which started here. Also for those hexadecimal geeks out there, a puzzle from the DP0C Fellows at Oxford and Cambridge which came just at the point that I was firing up a hexadecimal viewer as it happens!

In a blog post called Name that item in...? Kirsty Chatwin-Lee at Edinburgh University encourages the digital preservation community to help her to identify a mysterious large metal disk found in their early computing collections. Follow the link to the blog to see a picture - I'm sure someone out there can help!

Announcements and releases

There were lots of big announcements on the day too. IDPD just kept on giving!

Of course the 'Bit List' (a list of digitally endangered species) was announced and I was able to watch this live. Kevin Ashley from the Digital Curation Centre discusses this in a blog post. It was interesting to finally see what was on the list (and then think further about how we can use this for further advocacy and awareness raising).

I celebrated this fact with some Fake News but to be fair, William Kilbride had already been on the BBC World Service the previous evening talking about just this so it wasn't too far from the truth!

New versions of JHOVE and VeraPDF were released, as was a new PRONOM release. A digital preservation policy for Wales was announced and a new course on file migration was launched by CoSector at the University of London. Two new members also joined the Digital Preservation Coalition - and what a great day to join!

Roadshows

Some institutions did a roadshow or a pop up museum in order to spread the message about digital preservation more widely. This included the revival of the 'fish screensaver' at Trinity College Dublin and a pop up computer museum at the British Geological Survey.

Digital Preservation at Oxford and Cambridge blogged about their portable digital preservation roadshow kit. I for one found this a particularly helpful resource - perhaps I will manage to do something similar myself next IDPD!

A day in the life

Several institutions chose to mark the occasion by blogging or tweeting about the details of their day. This gives an insight into what we DP folks actually do all day and can be really useful, given that the processes behind digital preservation work are often less tangible and understandable than those used for physical archives!

I particularly enjoyed the nostalgia of following ex colleagues at the Archaeology Data Service for the day (including references to those much loved checklists!) and hearing from  Artefactual Systems about the testing, breaking and fixing of Archivematica that was going on behind the scenes.

The Danish National Archives blogged about 'a day in the life' and I was particularly interested to hear about the life-cycle perspective they have as new software is introduced, assessed and approved.

Exploring specific problems and challenges

Plans are my reality from Yvonne Tunnat of the ZBW Leibniz Information Centre for Economics was of particular interest to me as it demonstrates just how hard the preservation tasks can be. I like it when people are upfront and honest about the limitations of the tools or the imperfections of the processes they are using. We all need to share more of this!

In Sustaining the software that preserves access to web archives, Andy Jackson from the British Library tells the story of an attempt to maintain a community of practice around open source software over time and shares some of the lessons learned - essential reading for any of us that care about collaborating to sustain open source.

Kirsty Chatwin-Lee from Edinburgh University invites us to head back to 1985 with her as she describes their Kryoflux-athon challenge for the day. What a fabulous way to spend the day!

Disaster stories

Digital Preservation Day wouldn't be Digital Preservation Day without a few disaster stories too! Despite our desire to move beyond the 'digital dark age' narrative, it is often helpful to refer to worst-case scenarios when advocating for digital preservation.

Cees Hof from DANS in the Netherlands talks about the loss of digital data related to rare or threatened species in The threat of double extinction, Sarah Mason from Oxford University uses the recent example of the shutdown of DCist to discuss institutional risk, José Borbinha from Lisbon University, Portugal talks about his own experiences of digital preservation disaster and Neil Beagrie from Charles Beagrie Ltd highlights the costs of inaction.

The bigger picture

Other blogs looked at the bigger picture

Preservation as a present by Barbara Sierman from the National Library of the Netherlands is a forward thinking piece about how we could communicate and plan better in order to move forward.

Shira Peltzman from the University of California, Los Angeles tries to understand some of the results of the 2017 NDSA Staffing Survey in It's difficult to solve a problem if you don't know what's wrong.

David Minor from the University of San Diego Library, provides his thoughts on What we’ve done well, and some things we still need to figure out.

I enjoyed reading a post from Euan Cochrane from Yale University Library on The Emergence of “Digital Patinas”. A really interesting piece... and who doesn't like to be reminded of the friendly and helpful Word 97 paperclip?

In Towards a philosophy of digital preservation, Stacey Erdman from Beloit College, Wisconsin USA asks whether archivists are born or made and discusses her own 'archivist "gene"'.




So much going on and there were so many other excellent contributions that I missed.

I'll end with a tweet from Euan Cochrane which I thought nicely summed up what International Digital Preservation Day is all about and of course the day was also concluded by William Kilbride of the DPC with a suitably inspirational blog post.



Congratulations to the Digital Preservation Coalition for organising the day and to the whole digital preservation community for making such a lot of noise!




Wikidata Map November 2017

Published 3 Dec 2017 by addshore in Addshore.

It has only been 4 months since my last Wikidata map update post, but the difference on the map in these 4 months is much greater than the diff shown in my last post covering 9 months. The whole map is covered with pink (additions to the map). The main areas include Norway, Germany, Malaysia, South Korea, Vietnam and New Zealand to name just a few.

As with previous posts varying sizes of the images generated can be found on Wikimedia Commons along with the diff image:

July to November in numbers

In the last 4 months (roughly speaking):

All of these numbers were roughly pulled out of graphs by eye. The graphs can be seen below:



Wikibase docker images

Published 3 Dec 2017 by addshore in Addshore.

This is a belated post about the Wikibase docker images that I recently created for the Wikidata 5th birthday. You can find the various images on docker hub and matching Dockerfiles on github. These images combined allow you to quickly create docker containers for Wikibase backed by MySQL, with a SPARQL query service running alongside and updating live from the Wikibase install.

A setup was demoed at the first Wikidatacon event in Berlin on the 29th of October 2017, and can be seen at roughly 41:10 in the 'demo of presents' video below.

The images

The ‘wikibase‘ image is based on the new official mediawiki image hosted on the docker store. The only current version, which is also the version demoed, is for MediaWiki 1.29. This image contains MediaWiki running on PHP 7.1 served by apache. Right now the image does some sneaky auto installation of the MediaWiki database tables, which might disappear in the future to make the image more generic.

The ‘wdqs‘ image is based on the official openjdk image hosted on the docker store. This image also only has one version, the current latest version of the Wikidata Query Service which is downloaded from maven. This image can be used to run the blazegraph service as well as run an updater that reads from the recent changes feed of a wikibase install and adds the new data to blazegraph.

The ‘wdqs-frontend‘ image hosts the pretty UI for the query service served by nginx. This includes auto completion and pretty visualizations. There is currently an issue which means the image will always serve the examples for Wikidata, which will likely not work on your custom install.

The ‘wdqs-proxy‘ image hosts an nginx proxy that restricts external access to the wdqs service, meaning it is READONLY and also has a time limit for queries (not currently configurable). This is very important, as if the wdqs image is exposed directly to the world then people can also write to your blazegraph store.

You’ll also need to have a MySQL server set up for wikibase to use; you can use the default mysql or mariadb images for this, as covered in the example below.

All of the wdqs images should probably be renamed as they are not specific to Wikidata (which is where the wd comes from), but right now the underlying repos and packages have the wd prefix and not a wb prefix (for Wikibase) so we will stick to them.

Compose example

The below example configures volumes for all locations with data that should / could persist. Wikibase is exposed on port 8181 with the query service UI on 8282 and the queryservice itself (behind the proxy) on 8989.

Each service has a network alias defined (that probably isn’t needed in most setups), but while running on WMCS it was required to get around some bad name resolving.

version: '3'

services:
  wikibase:
    image: wikibase/wikibase
    restart: always
    links:
      - mysql
    ports:
     - "8181:80"
    volumes:
      - mediawiki-images-data:/var/www/html/images
    depends_on:
    - mysql
    networks:
      default:
        aliases:
         - wikibase.svc
  mysql:
    image: mariadb
    restart: always
    volumes:
      - mediawiki-mysql-data:/var/lib/mysql
    environment:
      MYSQL_DATABASE: 'my_wiki'
      MYSQL_USER: 'wikiuser'
      MYSQL_PASSWORD: 'sqlpass'
      MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
    networks:
      default:
        aliases:
         - mysql.svc
  wdqs-frontend:
    image: wikibase/wdqs-frontend
    restart: always
    ports:
     - "8282:80"
    depends_on:
    - wdqs-proxy
    networks:
      default:
        aliases:
         - wdqs-frontend.svc
  wdqs:
    image: wikibase/wdqs
    restart: always
    build:
      context: ./wdqs/0.2.5
      dockerfile: Dockerfile
    volumes:
      - query-service-data:/wdqs/data
    command: /runBlazegraph.sh
    networks:
      default:
        aliases:
         - wdqs.svc
  wdqs-proxy:
    image: wikibase/wdqs-proxy
    restart: always
    environment:
      - PROXY_PASS_HOST=wdqs.svc:9999
    ports:
     - "8989:80"
    depends_on:
    - wdqs
    networks:
      default:
        aliases:
         - wdqs-proxy.svc
  wdqs-updater:
    image: wikibase/wdqs
    restart: always
    command: /runUpdate.sh
    depends_on:
    - wdqs
    - wikibase
    networks:
      default:
        aliases:
         - wdqs-updater.svc

volumes:
  mediawiki-mysql-data:
  mediawiki-images-data:
  query-service-data:
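
Once the stack is up, a quick way to check that the query service is reachable through the read-only proxy is to throw a trivial SPARQL query at it. The sketch below is only a smoke test and makes a couple of assumptions: that the compose file above is running locally, and that the proxy forwards requests to blazegraph's usual SPARQL endpoint path - adjust the URL if your setup differs.

import requests

# Assumptions: port 8989 is the wdqs-proxy mapping from the compose file above,
# and /bigdata/namespace/wdq/sparql is blazegraph's usual SPARQL path.
ENDPOINT = "http://localhost:8989/bigdata/namespace/wdq/sparql"

query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

response = requests.get(
    ENDPOINT,
    params={"query": query},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
response.raise_for_status()
for binding in response.json()["results"]["bindings"]:
    print(binding)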

Questions

I’ll vaguely keep this section up to date with Qs & As, but if you don’t find your answer here, leave a comment, send an email or file a phabricator ticket.

Can I use these images in production?

I wouldn’t really recommend running any of these in ‘production’ yet as they are new and not well tested. Various things such as upgrade for the query service and upgrades for mediawiki / wikibase are also not yet documented very well.

Can I import data into these images from an existing wikibase / wikidata? (T180216)

In theory, although this is not documented. You’ll have to import everything using an XML dump of the existing mediawiki install; the configuration will also have to match on both installs. When importing using an XML dump, the query service will not be updated automatically, and you will likely have to read the manual.

Where was the script that you ran in the demo video?

There is a copy in the github repo called setup.sh, but I can’t guarantee it works in all situations! It was specifically made for a WMCS debian jessie VM.

Links



What shall I do for International Digital Preservation Day?

Published 30 Nov 2017 by Jenny Mitcham in Digital Archiving at the University of York.

I have been thinking about this question for a few months now and have only recently come up with a solution.

I wanted to do something big on International Digital Preservation Day. Unfortunately other priorities have limited the amount of time available and I am doing something a bit more low key. To take a positive from a negative I would like to suggest that as with digital preservation more generally, it is better to just do something rather than wait for the perfect solution to come along!

I am sometimes aware that I spend a lot of time in my own echo chamber - for example talking on Twitter and through this blog to other folks who also work in digital preservation. Though this is undoubtedly a useful two-way conversation, for International Digital Preservation Day I wanted to target some new audiences.

So instead of blogging here (yes I know I am blogging here too) I have blogged on the Borthwick Institute for Archives blog.

The audience for the Borthwick blog is a bit different to my usual readership. It is more likely to be read by users of our services at the Borthwick Institute and those who donate or deposit with us, perhaps also by staff working in other archives in the UK and beyond. Perfect for what I had planned.

In response to the tagline of International Digital Preservation Day ‘Bits Decay: Do Something Today’ I wanted to encourage as many people as possible to ‘Do Something’. This shouldn’t be just limited to us digital preservation folks, but to anyone anywhere who uses a computer to create or manage data.

This is why I decided to focus on Personal Digital Archiving. The blog post is called “Save your digital stuff!” (credit to the DPC Technology Watch Report on Personal Digital Archiving for this inspiring title - it was noted that at a briefing day hosted by the Digital Preservation Coalition (DPC) in April 2015, one of the speakers suggested that the term ‘personal digital archiving’ be replaced by the more urgent exhortation, ‘Save your digital stuff!’).

The blog post aimed to highlight the fragility of digital resources and then give a few tips on how to protect them. Nothing too complicated or technical, but hopefully just enough to raise awareness and perhaps encourage engagement. Not wishing to replicate all the great work that has already been done on Personal Digital Archiving by the Library of Congress, the Paradigm project and others, I decided to focus on just a few simple pieces of advice and then link out to other resources.

At the end of the post I encourage people to share information about any actions they have taken to protect their own digital legacies (of course using the #IDPD17 hashtag). If I inspire just one person to take action I'll consider it a win!

I'm also doing a 'Digital Preservation Takeover' of the Borthwick twitter account @UoYBorthwick. I lined up a series of 'fascinating facts' about the digital archives we hold here at the Borthwick and tweeted them over the course of the day.



OK - admittedly they won't be fascinating to everyone, but if nothing else it helps us to move further away from the notion that an archive is where you go to look at very old documents!

...and of course I now have a whole year to plan for International Digital Preservation Day 2018 so perhaps I'll be able to do something bigger and better?! I'm certainly feeling inspired by the range of activities going on across the globe today.



Preserving Google Drive: What about Google Sheets?

Published 29 Nov 2017 by Jenny Mitcham in Digital Archiving at the University of York.

There was lots of interest in a blog post earlier this year about preserving Google Docs.

Often the issues we grapple with in the field of digital preservation are not what you'd call 'solved problems' and that is what makes them so interesting. I always like to hear how others are approaching these same challenges so it is great to see so many comments on the blog itself and via Twitter.

This time I'm turning my focus to the related issue of Google Sheets. This is the native spreadsheet application for Google Drive.

Why?

Again, this is an application that is widely used at the University of York in a variety of different contexts, including for academic research data. We need to think about how we might preserve data created in Google Sheets for the longer term.


How hard can it be?

Quite hard actually - see my earlier post!


Exporting from Google Drive

For Google Sheets I followed a similar methodology to Google Docs: taking a couple of sample spreadsheets, downloading them in the formats that Google provides, then examining these exported versions to assess how well specific features of the spreadsheet were retained.

I used the File...Download as... menu in Google Sheets to test out the available export formats
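
As an aside (this wasn't part of the original testing), the same exports can also be scripted via the Google Drive API's export endpoint, which is handy if you have a large collection of sheets to capture. The sketch below is illustrative only: the file ID and OAuth access token are placeholders, and it simply requests each of the export MIME types corresponding to the formats examined below.

import requests

# Illustrative sketch only: the file ID and access token below are placeholders.
FILE_ID = "your-google-sheet-file-id"
ACCESS_TOKEN = "your-oauth2-access-token"

# Export MIME types documented for Google Sheets in the Drive v3 API.
EXPORT_FORMATS = {
    "xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "ods": "application/x-vnd.oasis.opendocument.spreadsheet",
    "pdf": "application/pdf",
    "csv": "text/csv",  # note: CSV/TSV exports only cover a single sheet
}

for extension, mime_type in EXPORT_FORMATS.items():
    response = requests.get(
        f"https://www.googleapis.com/drive/v3/files/{FILE_ID}/export",
        params={"mimeType": mime_type},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    response.raise_for_status()
    with open(f"sheet-export.{extension}", "wb") as output_file:
        output_file.write(response.content)

Scripting the download doesn't change what each format preserves or loses, of course - that is exactly what the findings below describe.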

The two spreadsheets I worked with were a complex 'flexisheet' (full of formulas, calculated values and colour coding) and a much simpler 'menu choices' sheet that included comments.


Here is a summary of my findings:

Microsoft Excel - xlsx

I had high hopes for the xlsx export option - however, on opening the exported xlsx version of my flexisheet I was immediately faced with an error message telling me that the file contained unreadable content and asking whether I wanted to recover the contents.

This doesn't look encouraging...

Clicking 'Yes' on this dialogue box then allows the sheet to open and another message appears telling you what has been repaired. In this case it tells me that a formula has been removed.


Excel can open the file if it removes the formula

This is not ideal if the formula is considered to be worthy of preservation.

So clearly we already know that this isn't going to be a perfect copy of the Google sheet.

This version of my flexisheet looks pretty messed up. The dates and values look OK, but none of the calculated values are there - they are all replaced with "#VALUE".

The colours on the original flexisheet are important as they flag up problems and issues with the data entered. These however are not fully retained - for example, weekends are largely (but not consistently) marked as red and in the original file they are green (because it is assumed that I am not actually meant to be working weekends).

The XLSX export does however give a better representation of the simpler menu choices Google sheet. The data is accurate, and comments are present in a partial way. Unfortunately though, replies to comments are not displayed and the comments are not associated with a date or time.


Open Document Format - ods

I tried opening the ODS version of the flexisheet in LibreOffice on a Macbook. There were no error messages (which was nice) but the sheet was a bit of a mess. There were similar issues to those that I encountered in the Excel export though it wasn't identical. The colours were certainly applied differently, neither entirely accurate to the original.

If I actually tried to use the sheet to enter more data, the formulas do not work - they do not calculate anything, though the formulas themselves do appear to be retained. Any values that are calculated on the original sheet are not present.

Comments are retained (and replies to comments) but no date or time appears to be associated with them (note that the data may be there but just not displaying in LibreOffice).

I also tried opening the ODS file in Microsoft Office. On opening it, the same error message was displayed as the one originally encountered in the XLSX version described above, and this was followed by a notification that “Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded.” Unlike the XLSX file there didn't appear to be any additional information available about exactly what had been repaired or discarded - this didn't exactly fill me with confidence!

PDF document - pdf

When downloading a spreadsheet as a PDF you are presented with a few choices - for example:
  • Should the export include all sheets, just the current sheet, or the current selection? (Note that the current sheet is the default.)
  • Should the export include the document title?
  • Should the export include sheet names?
To make the export as thorough as possible I chose to export all sheets and include document title and sheet names.

As you might expect this was a good representation of the values on the spreadsheet - a digital print if you like - but all functionality and interactivity was lost. In order to re-use the data, it would need to be copied and pasted or re-typed back into a spreadsheet application.

Note that comments within the sheet were not retained and also there was no option to export sheets that were hidden.

Web page - html

This gave an accurate representation of the values on the spreadsheet, but, similar to the PDF version, not in a way that really encourages reuse. Formulas were not retained and the resulting copy is just a static snapshot.

Interestingly, the comments in the menu choices example weren't retained. This surprised me because when using the html export option for Google documents one of the noted benefits was that comments were retained. Seems to be a lack of consistency here.

Another thing that surprised me about this version of the flexisheet was that it included hidden sheets (I hadn't until this point realised that there were hidden sheets!). I later discovered that the XLSX and ODS also retained the hidden sheets ...but they were (of course) hidden so I didn't immediately notice them! 

Tab delimited and comma separated values - tsv and csv

It is made clear on export that only the current sheet is exported, so if using this as an export strategy you would need to export each individual sheet one by one.
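
As a workaround of my own (again, not something tested in the original exercise), each sheet can be pulled out individually via the per-sheet export URL, which takes the numeric gid of the sheet. This assumes the spreadsheet is accessible to whoever makes the request and that you already know the gids, which appear in the browser's address bar as each sheet is selected; the ID and gid values below are placeholders.

import requests

# Illustration only: the spreadsheet ID and gid values below are placeholders.
SHEET_ID = "your-google-sheet-file-id"
GIDS = {"flexisheet": "0", "second-sheet": "123456789"}  # sheet label -> numeric gid

for label, gid in GIDS.items():
    response = requests.get(
        f"https://docs.google.com/spreadsheets/d/{SHEET_ID}/export",
        params={"format": "csv", "gid": gid},
    )
    response.raise_for_status()
    with open(f"{label}.csv", "wb") as output_file:
        output_file.write(response.content)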

The tab delimited export of the flexisheet surprised me. In order to look at the data properly I tried importing it into MS Excel. It came up with a circular reference warning - were some of the dynamic properties of the sheets being somehow retained (albeit in a way that was broken)?

A circular reference warning when opening the tab delimited file in Microsoft Excel

Both of these formats did a reasonable job of capturing the simple menu choices data (though note that the comments were not retained) but neither did an acceptable job of representing the complex data within the flexisheet (given that the more complex elements such as formulas and colours were not retained).

What about the metadata?

I won't go into detail again about the other features of a Google Sheet that won't be saved with these export options - for example information about who created it and when and the complete revision history that is available through Google Drive - this is covered in a previous post. Given my findings when I interviewed a researcher here at the University of York about their use of Google Sheets, the inability of the export options to capture the version history will be seen as problematic for some use cases.

What is the best export format for Google Sheets?

The short answer is 'it depends'.

The export options available all have pros and cons and as ever, the most suitable one will very much depend on the nature of the original file and the properties that you consider to be most worthy of preservation.


  • If for example the inclusion of comments is an essential requirement, XLSX or ODS will be the only formats that retain them (with varying degrees of success). 
  • If you just want a static snapshot of the data in its final form, PDF will do a good job (you must specify that all sheets are saved), but note that if you want to include hidden sheets, HTML may be a better option. 
  • If the data is required in a usable form (including a record of the formula used) you will need to try XLSX or ODS but note that calculated values present in the original sheet may be missing. Similar but not identical results were noted with XLSX and ODS so it would be worth trying them both and seeing if either is suitable for the data in question.


It should be possible to export an acceptable version of the data for a simple Google Sheet but for a complex dataset it will be difficult to find an export option that adequately retains all features.

Exporting Google Sheets seems even more problematic and variable than Google Documents and for a sheet as complex as my flexisheet it appears that there is no suitable option that retains the functionality of the sheet as well as the content.

So, here's hoping that native Google Drive files appear on the list of World's Endangered Digital Species...due to be released on International Digital Preservation Day! We will have to wait until tomorrow to find out...



A disclaimer: I carried out the best part of this work about 6 months ago but have only just got around to publishing it. Since I originally carried out the exports and noted my findings, things may have changed!


Server failures in October and November 2017

Published 28 Nov 2017 by Pierrick Le Gall in The Piwigo.com Blog.

The huge downtime at OVH that occurred on November 9th 2017 was quite like an earthquake for the European web. Of course Piwigo.com was impacted. But before that, we went through a server failure on October 7th and another one on October 14th. Let’s describe and explain what happened.

Photo by Johannes Plenio on Unsplash

A) October 7th, the first server failure

On October 7th 2017, on the Saturday evening, our “reverse-proxy” server, the one through which all web traffic goes, crashed. OVH, our technical host, identified a problem with the motherboard and replaced it. Web traffic was routed to the spare server during the short downtime. A server failure of no real consequence, with no loss of data, but one which announced the start of a painful series of technical problems.

B) October 14th, a more serious server failure

A week later, on October 14th, the very same “reverse-proxy” server saw its load climb to such high levels that it was unable to deliver web pages… Web traffic was again switched to the spare server, in read-only mode for accounts hosted on this server. About 10 hours of investigation later, we were still not able to understand the origin of the problem, and we had to decide to switch the spare server to write mode. This decision was difficult to take because it meant losing the data produced between the last backup (1am) and the switch to the spare server (about 8am). In other words, for the accounts hosted on this server, the photos added during the night simply “disappeared” from their Piwigo.

This was the first time in the history of Piwigo.com that we had switched a spare server to write mode. Unfortunately, another problem happened, related to the first one. To explain this problem, it is necessary to understand how the Piwigo.com server infrastructure works.

On the Piwigo.com infrastructure, servers work in pairs: a main server and its spare server. There are currently 4 pairs in production. The main server takes care of the “live operations”, while the spare server is synchronized with its main server every night and receives web traffic, in read-only mode, during downtimes.

Normally, spare servers only allow read operations, i.e. you can visit the albums or view the photos, but not access the administration pages or add photos.

One of the server pairs is what we call the “reverse-proxy”: all the web traffic of *.piwigo.com goes through this server and, depending on the Piwigo concerned, the traffic goes to one pair or another. Normally the reverse-proxy is configured to point to the main servers, not the spare servers.

When a problem occurs on one of the main servers, we switch the traffic to its spare server. If the reverse-proxy server itself is concerned, we switch the Fail-Over IP address (IPFO): a mechanism that we manage in our OVH administration panel. For other servers, we change the reverse-proxy configuration.

That’s enough infrastructure detail… let’s go back to October 14th: we switched the IPFO to use the spare reverse-proxy server. Unfortunately, we hit 2 problems in cascade:

  1. the spare reverse-proxy server, for one of the server pairs, pointed to the spare server
  2. this very spare server was configured in write mode instead of read-only

Why such an unexpected configuration?

Because we sometimes use the spare infrastructure to do real-life tests. In this case, these were IPV6 tests.

What impact for users?

During the many hours when web traffic went through the spare reverse-proxy server, accounts hosted on the faulty server were returned to the state of the previous night: photos added during the night and morning had apparently disappeared, but users were able to keep adding photos. This state did not trigger any specific alert: the situation seemed “normal” to the users concerned and to our monitoring system. When the problem was detected, we changed the reverse-proxy configuration to point back to the main server. Consequence: all the photos added during the downtime apparently disappeared.

What actions have been taken after October 14th?

1) Checks on reverse-proxy configuration

A new script was pushed to production. It frequently checks that the reverse-proxy is configured to send web traffic to the main servers only.

2) Checks on write Vs read-only mode

Another script was pushed to production. This one checks that main servers are configured in write mode and spare servers are in read-only mode.

3) Isolate third-party web applications

The “non-vital” web applications, on which we have less expertise, were moved to a separate server dedicated to this use: 2 WordPress blogs, the wiki, the forum and Piwik (visit analytics). Indeed, one possible cause of the server failure is that an application entered the 4th dimension or was under attack. Moving these applications onto an “isolated” server helps to limit the impact of any future issue.

4) New backup system

The decision to switch a spare server to write mode, i.e. turn it into a main server, is a hard one to take. Indeed, it means giving up any hope of returning to the main server. This decision is difficult because it involves accepting a loss of data.

To make this decision simpler, two measures have been taken. First, we defined a time threshold after which we apply the switch: in our case, if the failure lasts more than 2 hours, we will switch. Second, backups must be more frequent than once a day: if the backups had been only 1 or 2 hours old, the decision would have been much easier!

In addition to the daily backup, we have added a new “rolling backups” system: every 15 minutes, a script analyzes each Piwigo against specific criteria (new/modified/deleted photos/users/albums/groups…). If anything has changed since the last backup, the script backs up that Piwigo (files + database) and synchronizes it to the spare server.
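
Purely as an illustration (the details of the actual script aren't described beyond the criteria above), a change-detection loop of this kind might look roughly like the sketch below. The paths, the rsync destination and the use of filesystem metadata as the change criterion (rather than the database-level criteria listed above) are all assumptions, not details of the real Piwigo.com setup.

import json
import subprocess
from pathlib import Path

# Illustrative sketch only: paths and the backup command are assumptions.
PIWIGO_ROOT = Path("/var/www/piwigos")        # one sub-directory per hosted Piwigo
STATE_FILE = Path("/var/backups/piwigo-state.json")

def snapshot(piwigo_dir: Path) -> dict:
    """Cheap change detector: file count plus the most recent modification time."""
    files = [p for p in piwigo_dir.rglob("*") if p.is_file()]
    return {
        "count": len(files),
        "latest_mtime": max((p.stat().st_mtime for p in files), default=0),
    }

def backup(piwigo_dir: Path) -> None:
    # Stand-in for "dump the database, copy the files, sync to the spare server".
    subprocess.run(
        ["rsync", "-a", f"{piwigo_dir}/", f"spare.example.org:/backups/{piwigo_dir.name}/"],
        check=True,
    )

def main() -> None:
    previous = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    for piwigo_dir in sorted(p for p in PIWIGO_ROOT.iterdir() if p.is_dir()):
        current = snapshot(piwigo_dir)
        if current != previous.get(piwigo_dir.name):  # anything changed since last run?
            backup(piwigo_dir)
            previous[piwigo_dir.name] = current
    STATE_FILE.write_text(json.dumps(previous))

if __name__ == "__main__":
    main()

Run from cron every 15 minutes, a loop like this only pays the cost of a backup when a gallery has actually changed.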

C) What about the giant downtime on the OVH network, on November 9th and 10th?

Since we are hosted at OVH, and in particular in the Strasbourg datacenter (France), the downtime greatly impacted our own infrastructure. First, because our main reverse-proxy server is in Strasbourg: the datacenter failure put Piwigo.com completely out of order during the morning of November 9th (Central European time). Then, because we could not switch the IP Fail-Over. Or rather, OVH allowed us to do it, but instead of taking ~60 seconds, it took ~10 hours! Hours during which the accounts hosted on the reverse-proxy server were in read-only mode.

Unlike the October 14th situation, we could not make the decision to switch the spare server to write mode because an IPFO switch request was in progress, and we had no idea how long it would take OVH to apply it.

The Piwigo.com infrastructure returned to its normal state on November 10th at 14:46, Paris time.

OVH has just provided compensation for these failures; we were waiting for it before publishing this blog post. The compensation is not much compared to the actual damage, but we will pass it on in full to our customers. After some very high-level calculations, 3 days of time credit have been added to each account. It’s a small commercial gesture, but we think we should pass it on to you as a symbol!

We are sorry for these inconveniences. As you read in this blog post, we’ve improved our methods to mitigate risk in the future and reduce the impact of an irreversible server failure.


Delirious Sky

Published 27 Nov 2017 by jenimcmillan in Jeni McMillan.

DSC_0856

It is a delicious moment,

Delirious sky.

The sun burning deeply,

Her skin starts to fry.

She gathers her senses,

Surrounded by life.

When death beckons shyly,

She submits to his knife.

It’s only a metaphor,

We grow and we die,

And laugh at the Present,

The Goddess on High.


Slim 3.9.1 (and 3.9.2) released

Published 26 Nov 2017 by in Slim Framework Blog.

After the release of 3.9.0, a regression and an unexpected side-effect of a bug fix were noticed.

Firstly, you could not clear the user’s password when using Uri::withUserInfo(''), so this is fixed in #2332.

Secondly, we discovered that return $response->withHeader('Location', '/login'); no longer redirected in a browser. This isn’t a surprise: as the 302 status code isn’t explicitly set, developers were relying on a feature of PHP’s header() function that set 302 for them. This side-effect was causing other issues such as #1730, so it was fixed in 3.9.0. To mitigate the effect of this change, 3.9.1 includes #2345, which sets the status code to 302 when you add a Location header if the status code is currently 200. This change will not be forward-ported to 4.x though.

The full list of changes is here

Update: Shortly after the release of 3.9.1, it was discovered that #2342 should not have been merged as it breaks BC, so this PR was reverted in 3.9.2.


My best books of 2017

Published 25 Nov 2017 by Tom Wilson in thomas m wilson.

My best books of 2017… Deeply insightful works from Yale University Press on geopolitics today, a history of consumerism in the West, a wave making read on the Anthropocene as a new era, a powerful explanation of the nature/nurture question for human identity by a very funny Californian, and a charming meander through the English […]

How do I promote a user automatically in Mediawiki and create a log of those promotions?

Published 24 Nov 2017 by sau226 in Newest questions tagged mediawiki - Webmasters Stack Exchange.

I control a Mediawiki site. Here you can see users being automatically updated and added into the extended confirmed user group.

If I have a group called "util", and I just want to add the relevant code to enable autopromotion with a log entry like that, make an edit, get promoted automatically into the group, and then remove that bit of code, would it be possible? Also, what code would I have to use to gain that level of access?


Is it possible to find broken link anchors in MediaWiki?

Published 24 Nov 2017 by Lyubomyr Shaydariv in Newest questions tagged mediawiki - Webmasters Stack Exchange.

Probably a simple question answered a million times, but I can't find an answer. MediaWiki can track missing pages and report them with Special:WantedPages. I'm not sure if it's possible, but can MediaWiki report broken anchors? Say I have the Foo page that refers to the Bar page like this: [[Bar#Name]]. Let's assume the Bar page lacks this section, and therefore the Name anchor does not exist there, but Special:WantedPages won't report this link as broken because the Bar page exists. Is there any way to find all broken anchors? Thanks in advance!


SLAM POETRY DAD

Published 16 Nov 2017 by timbaker in Tim Baker.

I recently made my public slam poetry debut at the Men of Letters event in Brisbane in the salubrious surrounds of one of Australia’s most iconic live music venues, the Zoo, in Fortitude Valley. Men of Letters is a spin off of the hugely successful...

Your Fun and Informative Guide to Consuming “Oil, Love & Oxygen”

Published 16 Nov 2017 by Dave Robertson in Dave Robertson.

The Paradox of Choice says that too many options can demotivate people, so here’s a short guide to the options for getting your ears on “Oil, Love & Oxygen”.

Gigs
For the personal touch you can always get CDs at our shows. They come with a lush booklet of lyrics and credits, and the enchanting artwork of Frans Bisschops. Discounted digital download codes are also available for Bandcamp…

Bandcamp
Bandcamp is a one-stop online shop for your album consumption needs. You can get a digital download in your choice of format, including high-resolution formats for “audiophiles and nerds”. If you go for one of the “lossless” formats such as ALAC, then you are getting the highest sound quality possible (higher than CD). Downloads also come with a digital version of the aforementioned booklet.

Bandcamp is also where you can place a mail-order for the CD if you want to get physical. Another feature of Bandcamp is fans can pay more than the minimum price if they want to support the artist.

iTunes
The iTunes store is a great simple option for those in the Apple ecosystem, because it goes straight into the library on your device(s). You also get the same digital booklet as Bandcamp, and the audio for this release has been specially “Mastered for iTunes”. This means the sound quality is a bit better than most digital downloads (though not as good as the lossless formats available on Bandcamp).

This album was mastered by Ian Shepherd who has been a vigorous campaigner against the “loudness wars”. Did you ever notice that much, maybe most, music after the early 90s started to sound flat and bland? Well one reason was the use of “brick wall limiters” to increase average loudness, but this came at the expense of dynamics. I’m glad my release is not a casualty of this pointless war, but I digress.

Other Digital Download Services
The album is on many other services, so just search for “Oil, Love & Oxygen” on your preferred platform. These services don’t provide you the booklet though and are not quite as high sound quality as the above two.

Streaming (Spotify etc.)
The album is also available across all the major streaming platforms. While streaming is certainly convenient, it is typically low sound quality and pays tiny royalties to artists.

Vinyl and Tape
Interestingly these formats are seeing a bit of a resurgence around the world. I would argue this is not because they are inherently better than digital, but because digital is so often abused (e.g. the aforementioned loudness wars and the use of “lossy” formats like mp3). If you seriously want vinyl or tape though, let me know and I will consider getting old school!

Share the Love
If you like the album, then please consider telling friends, rating or reviewing the album on iTunes etc., liking our page on the book of face…

Short enough?

Share


1.5.1

Published 11 Nov 2017 by mblaney in Tags from simplepie.

1.5.1 (#559)

* Revert sanitisation type change for author and category.

* Check if the Sanitize class has been changed and update the registry.
Also preference links in the headers over links in the body to
comply with WebSub specification.

* Improvements to mf2 feed parsing.

* Switch from regex to xpath for microformats discovery.

* 1.5.1 release.

* Remove PHP 5.3 from testing.


Slim 3.9.0 released

Published 4 Nov 2017 by in Slim Framework Blog.

We are delighted to release Slim 3.9.0. As Slim 3 is stable, there are mostly bug fixes in this version.

Probably the most noticeable changes are that we now allow any HTTP method name in the Request object and the Uri now correctly encodes the user information, which will ensure user names and passwords with reserved characters such as @ will work as you expect. Also in the HTTP component, the Request’s getParams() now allows you to provide a list of the parameters you want returned, allowing you to filter for a specific set.

As usual, there are also some bug fixes, particularly around the output buffering setting and you can now use any HTTP method you want to without getting an error.

The full list of changes is here


Understanding WordStar - check out the manuals!

Published 20 Oct 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Last month I was pleased to be able to give a presentation at 'After the Digital Revolution' about some of the work I have been doing on the WordStar 4.0 files in the Marks and Gran digital archive that we hold here at the Borthwick Institute for Archives. This event specifically focused on literary archives.

It was some time ago now that I first wrote about these files that were recovered from 5.25 inch floppy (really floppy) disks deposited with us in 2009.

My original post described the process of re-discovery, data capture and file format identification - basically the steps that were carried out to get some level of control over the material and put it somewhere safe.

I recorded some of my initial observations about the files but offered no conclusions about the reasons for the idiosyncrasies.

I’ve since been able to spend a bit more time looking at the files and investigating the creating application (WordStar) so in my presentation at this event I was able to talk at length (too long as usual) about WordStar and early word processing. A topic guaranteed to bring out my inner geek!

WordStar is not an application I had any experience with in the past. I didn’t start word processing until the early 90’s when my archaeology essays and undergraduate dissertation were typed up into a DOS version of Word Perfect. Prior to that I used a typewriter (now I feel old!).

WordStar by all accounts was ahead of its time. It was the first Word Processing application to include mail merge functionality. It was hugely influential, introducing a number of keyboard shortcuts that are still used today in modern word processing applications (for example control-B to make text bold). Users interacted with WordStar using their keyboard, selecting the necessary keystrokes from a set of different menus. The computer mouse (if it was present at all) was entirely redundant.

WordStar was widely used as home computing and word processing increased in popularity through the 1980’s and into the early 90’s. However, with the introduction of Windows 3.0 and Word for Windows in 1989, WordStar gradually fell out of favour (info from Wikipedia).

Despite this it seems that WordStar had a loyal band of followers, particularly among writers. Of course the word processor was the key tool of their trade so if they found an application they were comfortable with it is understandable that they might want to stick with it.

I was therefore not surprised to hear that others presenting at 'After the Digital Revolution' also had WordStar files in their literary archives. Clear opportunities for collaboration here! If we are all thinking about how to provide access to and preserve these files for the future then wouldn't it be useful to talk about it together?

I've already learnt a lot through conversations with the National Library of New Zealand who have been carrying out work in this area (read all about it here: Gattuso J, McKinney P (2014) Converting WordStar to HTML4. iPres.)

However, this blog post is not about defining a preservation strategy for the files it is about better understanding them. My efforts have been greatly helped by finding a copy of both a WordStar 3 manual and a WordStar 4 manual online.

As noted in my previous post on this subject there were a few things that stand out when first looking at the recovered WordStar files and I've used the manuals and other research avenues to try and understand these better.


Created and last modified dates

The Marks and Gran digital archive consists of 174 files, most of which are WordStar files (and I believe them to be WordStar version 4).

Looking at the details that appear on the title pages of some of the scripts, the material appears to be from the period 1984 to 1987 (though not everything is dated).

However the system dates associated with the files themselves tell a different story. 

The majority of files in the archive have a creation date of 1st January 1980.

This was odd. Not only would that have been a very busy New Year's Day for the screen writing duo, but the timestamps on the files suggest that they were also working in the very early hours of the morning - perhaps unexpected when many people are out celebrating having just seen in the New Year!

This is the point at which I properly lost my faith in technical metadata!

In this period computers weren't quite as clever as they are today. When you switched them on they would ask you what date it was. If you didn't tell them the date, the PC would fall back to a system default ....which just so happens to be 1st January 1980.

I was interested to see Abby Adams from the Harry Ransom Center, University of Texas at Austin (also presenting at 'After the Digital Revolution') flag up some similarly suspicious dates on files in a digital archive held at her institution. Her dates differed just slightly from mine, falling on the evening of the 31st December 1979. Again, these dates looked unreliable as they were clearly out of line with the rest of the collection.

This is the same issue as mine, but the differences relate to the timezone. There is further explanation here highlighted by David Clipsham when I threw the question out to Twitter. Thanks!
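
As a small aside of my own (not something from the original appraisal work), default dates like this are easy to flag in bulk once you know what to look for. The sketch below simply walks a folder of recovered files and reports anything whose modification time sits suspiciously close to the DOS default of 1st January 1980; the folder name is a placeholder, and the timestamps you see will of course depend on how the files were captured from the original disks.

import datetime
from pathlib import Path

# Illustration only: flag files whose timestamp is suspiciously close to the
# DOS default of 1 January 1980 (a day either side allows for timezone offsets).
DOS_DEFAULT = datetime.datetime(1980, 1, 1, tzinfo=datetime.timezone.utc)

def suspicious_dates(folder: str) -> list[Path]:
    flagged = []
    for path in Path(folder).rglob("*"):
        if not path.is_file():
            continue
        mtime = datetime.datetime.fromtimestamp(path.stat().st_mtime, tz=datetime.timezone.utc)
        if abs(mtime - DOS_DEFAULT) < datetime.timedelta(days=1):
            flagged.append(path)
    return flagged

if __name__ == "__main__":
    for path in suspicious_dates("recovered_disks"):
        print(f"Unreliable timestamp: {path}")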


Fragmentation

Another thing I had noticed about the files was the way that they were broken up into fragments. The script for a single episode was not saved as a single file but typically as 3 or 4 separate files. These files were named in such a way that it was clear that they were related and that the order that the files should be viewed or accessed was apparent - for example GINGER1, GINGER2 or PILOT and PILOTB.

This seemed curious to me - why not just save the document as a single file? The WordStar 4 manual didn't offer any clues but I found this piece of information in the WordStar 3 manual which describes how files should be split up to help manage the storage space on your diskettes:

From the WordStar 3 manual




Perhaps some of the files in the digital archive are from WordStar 3, or perhaps Marks and Gran had been previously using WordStar 3 and had just got into the habit of splitting a document into several files in order to ensure they didn't run out of space on their floppy disks.

I can not imagine working this way today! Technology really has come on a long way. Imagine trying to format, review or spell check a document that exists as several discrete files potentially sitting on different media!


Filenames

One thing that stands out when browsing the disks is that all the filenames are in capital letters. DOES ANYONE KNOW WHY THIS WAS THE CASE?

File names in this digital archive were also quite cryptic. This is the 1980’s so filenames conform to the 8.3 limit. Only 8 characters are allowed in a filename and it *may* also include a 3 character file extension.

Note that the file extension really is optional and WordStar version 4 doesn’t enforce the use of a standard file extension. Users were encouraged to use those last 3 characters of the file name to give additional context to the file content rather than to describe the file format itself.

Guidance on file naming from the WordStar 4 manual
Some of the tools and processes we have in place to analyse and process the files in our digital archives use the file extension information to help understand the format. The file naming methodology described here therefore makes me quite uncomfortable!

Marks and Gran tended not to use the file extension in this way (though there are a few examples of this in the archive). The majority of WordStar files have no extension at all. The real consistent use of file extensions related to their back up files.


Backup files

Scattered amongst the recovered data were a set of files that had the extension BAK. This clearly is a file extension that WordStar creates and uses consistently. These files clearly contained very similar content to other documents within the archive but typically with just a few differences in content. These files were clearly back up files of some sort but I wondered whether they had been created automatically or by the writers themselves.

Again the manual was helpful in moving forward my understanding on this:

Backup files from the WordStar 4 manual

This backup procedure is also summarised with the help of a diagram in the WordStar 3 manual:


The backup procedure from WordStar 3 manual


This does help explain why there were so many back up files in the archive. I guess the next question is 'should we keep them?'. It does seem that they are an artefact of the application rather than representing a conscious decision by the writers to back their files up at a particular point in time, and that may impact on their value. However, as discussed in a previous post on preserving Google documents, there could be some benefit in preserving revision history (even if only partial).
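
For what it's worth (this is my own illustration, not part of the original analysis), pairing each BAK file with its current counterpart is straightforward to script, which makes it easier to appraise the backups alongside the files they shadow. The folder name below is a placeholder.

from pathlib import Path

# Illustration only: group WordStar .BAK files with the current file(s) of the
# same name so that each pair can be compared or appraised together.
def pair_backups(folder: str) -> dict[str, dict]:
    pairs: dict[str, dict] = {}
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        entry = pairs.setdefault(path.stem.upper(), {"current": [], "backup": None})
        if path.suffix.upper() == ".BAK":
            entry["backup"] = path
        else:
            entry["current"].append(path)
    return pairs

if __name__ == "__main__":
    for name, entry in sorted(pair_backups("recovered_disks").items()):
        if entry["backup"]:
            print(name, "->", entry["current"] or "no current file", "+", entry["backup"])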



...and finally

My understanding of these WordStar files has come on in leaps and bounds by doing a bit of research and in particular through finding copies of the manuals.

The manuals even explain why alongside the scripts within the digital archive we also have a disk that contains a copy of the WordStar application itself. 

The very first step in the manual asks users to make a copy of the software:


I do remember having to do this sort of thing in the past! From WordStar 4 manual


Of course the manuals themselves are also incredibly useful in teaching me how to actually use the software. Keystroke based navigation is hardly intuitive to those of us who are now used to using a mouse, but I think that might be the subject of another blog post!



Understanding WordStar - check out the manuals!

Published 20 Oct 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Last month I was pleased to be able to give a presentation at 'After the Digital Revolution' about some of the work I have been doing on the WordStar 4.0 files in the Marks and Gran digital archive that we hold here at the Borthwick Institute for Archives. This event specifically focused on literary archives.

It was some time ago now that I first wrote about these files that were recovered from 5.25 inch floppy (really floppy) disks deposited with us in 2009.

My original post described the process of re-discovery, data capture and file format identification - basically the steps that were carried out to get some level of control over the material and put it somewhere safe.

I recorded some of my initial observations about the files but offered no conclusions about the reasons for the idiosyncrasies.

I’ve since been able to spend a bit more time looking at the files and investigating the creating application (WordStar) so in my presentation at this event I was able to talk at length (too long as usual) about WordStar and early word processing. A topic guaranteed to bring out my inner geek!

WordStar is not an application I had any experience with in the past. I didn’t start word processing until the early 90’s when my archaeology essays and undergraduate dissertation were typed up into a DOS version of Word Perfect. Prior to that I used a typewriter (now I feel old!).

WordStar by all accounts was ahead of its time. It was the first Word Processing application to include mail merge functionality. It was hugely influential, introducing a number of keyboard shortcuts that are still used today in modern word processing applications (for example control-B to make text bold). Users interacted with WordStar using their keyboard, selecting the necessary keystrokes from a set of different menus. The computer mouse (if it was present at all) was entirely redundant.

WordStar was widely used as home computing and word processing increased in popularity through the 1980’s and into the early 90’s. However, with the introduction of Windows 3.0 and Word for Windows in 1989, WordStar gradually fell out of favour (info from Wikipedia).

Despite this it seems that WordStar had a loyal band of followers, particularly among writers. Of course the word processor was the key tool of their trade so if they found an application they were comfortable with it is understandable that they might want to stick with it.

I was therefore not surprised to hear that others presenting at 'After the Digital Revolution' also had WordStar files in their literary archives. Clear opportunities for collaboration here! If we are all thinking about how to provide access to and preserve these files for the future then wouldn't it be useful to talk about it together?

I've already learnt a lot through conversations with the National Library of New Zealand who have been carrying out work in this area (read all about it here: Gattuso J, McKinney P (2014) Converting WordStar to HTML4. iPres.)

However, this blog post is not about defining a preservation strategy for the files it is about better understanding them. My efforts have been greatly helped by finding a copy of both a WordStar 3 manual and a WordStar 4 manual online.

As noted in my previous post on this subject there were a few things that stand out when first looking at the recovered WordStar files and I've used the manuals and other research avenues to try and understand these better.


Created and last modified dates

The Marks and Gran digital archive consists of 174 files, most of which are WordStar files (and I believe them to be WordStar version 4).

Looking at the details that appear on the title pages of some of the scripts, the material appears to be from the period 1984 to 1987 (though not everything is dated).

However the system dates associated with the files themselves tell a different story. 

The majority of files in the archive have a creation date of 1st January 1980.

This was odd. Not only would that have been a very busy New Year's Day for the screen writing duo, but the timestamps on the files suggest that they were also working in the very early hours of the morning - perhaps unexpected when many people are out celebrating having just seen in the New Year!

This is the point at which I properly lost my faith in technical metadata!

In this period computers weren't quite as clever as they are today. When you switched them on they would ask you what date it was. If you didn't tell them the date, the PC would fall back to a system default... which just so happens to be 1st January 1980.

I was interested to see Abby Adams from the Harry Ransom Center, University of Texas at Austin (also presenting at 'After the Digital Revolution') flag up some similarly suspicious dates on files in a digital archive held at her institution. Her dates differed just slightly from mine, falling on the evening of the 31st December 1979. Again, these dates looked unreliable as they were clearly out of line with the rest of the collection.

This is the same issue as mine; the difference relates to the timezone. There is further explanation here, highlighted by David Clipsham when I threw the question out to Twitter. Thanks!
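For anyone who wants to convince themselves of this, here is a minimal Python sketch of how the same DOS fallback timestamp can appear as either 1st January 1980 or the evening of 31st December 1979, assuming the timestamp is treated as UTC and then redisplayed in a US Central (UTC-6) timezone - the offsets are my assumption, not something recorded in the files themselves:

from datetime import datetime, timezone, timedelta

# The DOS fallback date: what the PC assumed if you never answered the
# date prompt at boot.
dos_default = datetime(1980, 1, 1, 0, 0, tzinfo=timezone.utc)

# The same instant rendered in a UTC-6 timezone (roughly Texas in winter)
# slides back into the evening of 31st December 1979.
central = timezone(timedelta(hours=-6))
print(dos_default)                      # 1980-01-01 00:00:00+00:00
print(dos_default.astimezone(central))  # 1979-12-31 18:00:00-06:00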


Fragmentation

Another thing I had noticed about the files was the way that they were broken up into fragments. The script for a single episode was not saved as a single file but typically as 3 or 4 separate files. These files were named in such a way that it was clear they were related and what order they should be viewed or accessed in - for example GINGER1, GINGER2 or PILOT and PILOTB.

This seemed curious to me - why not just save the document as a single file? The WordStar 4 manual didn't offer any clues but I found this piece of information in the WordStar 3 manual which describes how files should be split up to help manage the storage space on your diskettes:

From the WordStar 3 manual




Perhaps some of the files in the digital archive are from WordStar 3, or perhaps Marks and Gran had been previously using WordStar 3 and had just got into the habit of splitting a document into several files in order to ensure they didn't run out of space on their floppy disks.

I cannot imagine working this way today! Technology really has come on a long way. Imagine trying to format, review or spell check a document that exists as several discrete files potentially sitting on different media!


Filenames

One thing that stands out when browsing the disks is that all the filenames are in capital letters. DOES ANYONE KNOW WHY THIS WAS THE CASE?

File names in this digital archive were also quite cryptic. This is the 1980s, so filenames conform to the 8.3 limit: only 8 characters are allowed in a filename, and it *may* also include a 3 character file extension.

Note that the file extension really is optional and WordStar version 4 doesn’t enforce the use of a standard file extension. Users were encouraged to use those last 3 characters of the file name to give additional context to the file content rather than to describe the file format itself.

Guidance on file naming from the WordStar 4 manual

Some of the tools and processes we have in place to analyse and process the files in our digital archives use the file extension information to help understand the format. The file naming methodology described here therefore makes me quite uncomfortable!

Marks and Gran tended not to use the file extension in this way (though there are a few examples of this in the archive). The majority of WordStar files have no extension at all. The only consistent use of file extensions related to their backup files.
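To get a quick picture of this sort of thing, a few lines of Python can survey the extensions (or lack of them) across the recovered files. This is just a sketch - the directory name is hypothetical and not part of the archive itself:

from collections import Counter
from pathlib import Path

# Hypothetical path to the files recovered from the floppy disks.
archive_dir = Path("recovered_files")

# Tally the file extensions actually in use. With 8.3 naming, the characters
# after the dot may describe content rather than format, or be missing entirely.
extensions = Counter(
    p.suffix.upper() or "(none)"
    for p in archive_dir.rglob("*")
    if p.is_file()
)
for ext, count in extensions.most_common():
    print(f"{ext:8} {count}")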


Backup files

Scattered amongst the recovered data was a set of files that had the extension BAK. This is clearly a file extension that WordStar creates and uses consistently. These files contained very similar content to other documents within the archive, but typically with just a few differences. They were clearly backup files of some sort, but I wondered whether they had been created automatically or by the writers themselves.

Again the manual was helpful in moving forward my understanding on this:

Backup files from the WordStar 4 manual

This backup procedure is also summarised with the help of a diagram in the WordStar 3 manual:


The backup procedure from WordStar 3 manual


This does help explain why there were so many backup files in the archive. I guess the next question is 'should we keep them?'. It does seem that they are an artefact of the application rather than representing a conscious decision by the writers to back their files up at a particular point in time, and that may impact on their value. However, as discussed in a previous post on preserving Google documents, there could be some benefit in preserving revision history (even if only partial).
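Out of curiosity, it would be fairly easy to check how much the BAK files actually differ from their current counterparts before making that decision. A minimal sketch (the directory name is hypothetical, and it assumes the current version of each document sits alongside its backup under the same name minus the extension):

import filecmp
from pathlib import Path

# Hypothetical location of the recovered files.
archive_dir = Path("recovered_files")

for bak in sorted(archive_dir.rglob("*.BAK")):
    current = bak.with_suffix("")  # most of the current documents have no extension
    if not current.exists():
        print(f"{bak.name}: no current version found - the backup may be the only copy")
    elif filecmp.cmp(bak, current, shallow=False):
        print(f"{bak.name}: identical to {current.name}")
    else:
        print(f"{bak.name}: differs from {current.name} - a fragment of revision history")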



...and finally

My understanding of these WordStar files has come on in leaps and bounds by doing a bit of research and in particular through finding copies of the manuals.

The manuals even explain why, alongside the scripts within the digital archive, we also have a disk that contains a copy of the WordStar application itself.

The very first step in the manual asks users to make a copy of the software:


I do remember having to do this sort of thing in the past! From WordStar 4 manual


Of course the manuals themselves are also incredibly useful in teaching me how to actually use the software. Keystroke based navigation is hardly intuitive to those of us who are now used to using a mouse, but I think that might be the subject of another blog post!



Crime and Punishment

Published 19 Oct 2017 by leonieh in State Library of Western Australia Blog.

Many Western Australians have a convict or pensioner guard in their ancestral family. The State Library has digitised some items from our heritage collections relating to convicts, the police and the early criminal justice system.


Convicts Tom the dealer, Davey Evans and Paddy Paternoster b2462917

Police Gazette of Western Australia, 1876-1900
The Police Gazettes include information under various headings including apprehensions (name of person arrested, arresting constable, charge and sentence), police appointments, tickets of leave, certificates of freedom, and conditional pardons issued to convicts. You may find physical descriptions of prisoners. Deserters from military service and escaped prisoners are sought. Mention is also made of expirees leaving the colony; inquests (where held, date, name and date of death of person, verdict); licences (publican, gallon, eating, boarding and lodging houses, railway refreshment rooms, wine and beer and spirit merchants, etc. giving name of licensee, name of hotel and town or district). There are listings for missing friends; prisoners discharged; people tried at Quarter Sessions (name, offence, district, verdict); and warrants issued. There are many reasons for a name to appear in the gazettes.

We thank the Friends of Battye Library and the Sholl Bequest, for supporting the digitising of the Police Gazettes.


A great resource for researching the broader experience of WA convicts is The convict system in Western Australia, 1850-1870 by Cherry Gertzel. This thesis explains the workings of the convict system, and explores the conditions under which the convicts lived and worked, their effect on the colony and, to some extent, the attitudes of colonists to the prisoners.


Another valuable publication is Further correspondence on the subject of convict discipline and transportation. This comprises official documents relating to the transportation of convicts to Australia, covering the period 1810-1865, and is bound in 8 volumes.
This set from our rare book collection gives an excellent background to the subject for anyone researching convicts or convict guards, with individuals (very) occasionally being named.
The easiest way to access this wonderful resource is to type convict system under Title in our catalogue and select State Library Online from the drop-down box. Once you’ve selected a volume, you can browse through the pages by placing your cursor on the edge of a page and clicking. If you have the volume turned on, this makes a very satisfying page-turning noise! If you want to search for names, scroll down and select the Download button. You can then save a searchable PDF version to your PC. The files are fairly large so you may need to be patient.


Return of the number of wives and families of ticket-of-leave holders to be sent out to Western Australia 1859 From: Further correspondence on the subject of convict discipline and transportation, 1859-1865 p.65. [vol.8]

There are several online diaries relating to convict voyages. The diary, including copies of letters home, of convict John Acton Wroth was kept during his transportation to Western Australia on the Mermaid in 1851 and for a while after his arrival. Wroth was only 17 years old at the time of his conviction. Apparently he was enamoured of a young woman and resorted to fraud in order to find the means to impress her. The diary spans 1851-1853 and it reveals one young man’s difficulty in finding himself far from the love and support of his family while accepting of the circumstance he has brought upon himself. Wroth subsequently settled in Toodyay and became a respected resident, raising a large family and running several businesses as well as acting for some time as local school master.

Another interesting read is the transcript of the diary of John Gregg, carpenter on the convict ship York. This 1862 diary gives details of work each day, which was often difficult when the weather was foul and the carpenter sea-sick, and uncommon events such as attempts by convicts to escape –

“…the affair altogether must be admitted to reflect little credit on the military portion of the convict guard, for although the officer of the watch called loud and long for the guard, none were forthcoming until the prisoners were actually in custody.”


Diary of John Gregg, carpenter on the convict ship ‘York’, with definitions of nautical terms, compiled by Juliet Ludbrook.


A letter from a convict in Australia to a brother in England, originally published in the Cornhill Magazine, April 1866, contains insights into the experience of a more educated felon and some sharp observations on convict life as lived by him upon his arrival in Western Australia-

“…you can walk about and talk with your friends as you please. So long as there is no disturbance, there is no interference”

and

“…the bond class stand in the proportion of fully five-sevenths of the entire population, and are fully conscious of their power…”

Other miscellaneous convict-related items include:

Two posters listing convict runaways with details of their convictions and descriptions:
Return of convicts who have escaped from the colony, and whose absconding has been notified to this office between the 1st June, 1850, and the 31st of March, 1859
and
List of convicts who are supposed to have escaped the Colony (a broadsheet giving the name, number and description of 83 escaped convicts).


Parade state of the Enrolled Guard, 30 March 1887, on the occasion of the inspection of the guard by Sir Frederick Napier Broome, prior to disbandment.


Parade state of the Enrolled Guard… b1936163

 

British Army pensioners came out to Western Australia as convict guards. This document gives the following details for those still serving in 1887:- rank, name, regiment, age, rate of pension, length of Army service, rank when pensioned, date of joining the Enrolled Guard, medals and clasps.
Scale of remission for English convicts sentenced to penal servitude subsequent to 1 July 1857  is a table showing how much time in good behaviour convicts needed to accrue in order to qualify for privileges.

Certificate of freedom, 1869 [Certificates of freedom of convict William Dore]

This is just a small sample of convict-related material in the State Library collections that you can explore online. You can also visit the Battye Library of West Australian History to research individual convicts, policemen, pensioner guards or others involved in the criminal justice system.

 


“Why archivists need a shredder…”

Published 13 Oct 2017 by inthemailbox in In the mailbox.

Struggling to explain what it is that you do and why you do it? President of the Australian Society of Archivists, Julia Mant, gives it a red hot go in an interview for the University of Technology Sydney: https://itunes.apple.com/au/podcast/glamcity/id1276048279?mt=2

https://player.whooshkaa.com/player/playlist/show/1927?visual=true&sharing=true

 


Google Books and Mein Kampf

Published 10 Oct 2017 by Karen Coyle in Coyle's InFormation.

I hadn't looked at Google Books in a while, or at least not carefully, so I was surprised to find that Google had added blurbs to most of the books. Even more surprising (although perhaps I should say "troubling") is that no source is given for the book blurbs. Some at least come from publisher sites, which means that they are promotional in nature. For example, here's a mildly promotional text about a literary work, from a literary publisher:



This gives a synopsis of the book, starting with:

"Throughout a single day in 1892, John Shawnessy recalls the great moments of his life..." 

It ends by letting the reader know that this was a bestseller when published in 1948, and calls it a "powerful novel."

The blurb on a 1909 version of Darwin's The Origin of Species is mysterious because the book isn't a recent publication with an online site providing the text. I do not know where this description comes from, but because the  entire thrust of this blurb is about the controversy of evolution versus the Bible (even though Darwin did not press this point himself) I'm guessing that the blurb post-dates this particular publication.


"First published in 1859, this landmark book on evolutionary biology was not the first to deal with the subject, but it went on to become a sensation -- and a controversial one for many religious people who could not reconcile Darwin's science with their faith."
That's a reasonable view to take of Darwin's "landmark" book but it isn't what I would consider to be faithful to the full import of this tome.

The blurb on Hitler's Mein Kampf is particularly troubling. If you look at different versions of the book you get both pro- and anti-Nazi sentiments, neither of which really belongs on a site that claims to be a catalog of books. Also note that because each book entry has only one blurb, the tone changes considerably depending on which publication you happen to pick from the list.


First on the list:
"Settling Accounts became Mein Kampf, an unparalleled example of muddled economics and history, appalling bigotry, and an intense self-glorification of Adolf Hitler as the true founder and builder of the National Socialist movement. It was written in hate and it contained a blueprint for violent bloodshed."

Second on the list:
"This book has set a path toward a much higher understanding of the self and of our magnificent destiny as living beings part of this Race on our planet. It shows us that we must not look at nature in terms of good or bad, but in an unfiltered manner. It describes what we must do if we want to survive as a people and as a Race."
That's horrifying. Note that both books are self-published, and the blurbs are the ones that I find on those books in Amazon, perhaps indicating that Google is sucking up books from the Amazon site. There is, or at least there once was, a difference between Amazon and Google Books. Google, after all, scanned books in libraries and presented itself as a search engine for published texts; Amazon will sell you Trump's tweets on toilet paper. The only text on the Google Books page still claims that Google Books is about search: "Search the world's most comprehensive index of full-text books." Libraries partnered with Google with lofty promises of gains in scholarship:
"Our participation in the Google Books Library Project will add significantly to the extensive digital resources the Libraries already deliver. It will enable the Libraries to make available more significant portions of its extraordinary archival and special collections to scholars and researchers worldwide in ways that will ultimately change the nature of scholarship." Jim Neal, Columbia University
I don't know how these folks now feel about having their texts intermingled with publications they would never buy and described by texts that may come from shady and unreliable sources.

Even leaving aside the grossest aspects of the blurbs and Google's hypocrisy about the commercialization of its books project, adding blurbs to the book entries with no attribution and clearly not vetting the sources is extremely irresponsible. It's also very Google to create sloppy algorithms that illustrate their basic ignorance of the content they are working with -- in this case, the world's books.


Why do I write environmental history?

Published 8 Oct 2017 by Tom Wilson in thomas m wilson.

Why bother to tell the history of the plants and animals that make up my home in Western Australia?  Partly its about reminding us of what was here on the land before, and in some ways, could be here again. In answering this question I’d like to quote the full text of Henry David Thoreau’s […]

Oil, Love & Oxygen – Album Launch

Published 29 Sep 2017 by Dave Robertson in Dave Robertson.

“Oil, Love & Oxygen” is a collection of songs about kissing, climate change, cult 70s novels and more kissing. Recorded across ten houses and almost as many years, the album is a diverse mix of bittersweet indie folk, pop, rock and blues. The Kiss List bring a playful element to Dave Robertson’s songwriting, unique voice and percussive acoustic guitar work. This special launch night also features local music legends Los Porcheros, Dave Johnson, Sian Brown, Rachel Armstrong and Merle Fyshwick.

Tickets $15 through https://www.trybooking.com/SDCA, or on the door if still available



v2.4.4

Published 27 Sep 2017 by fabpot in Tags from Twig.


v1.35.0

Published 27 Sep 2017 by fabpot in Tags from Twig.


The first UK AtoM user group meeting

Published 27 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Yesterday the newly formed UK AtoM user group met for the first time at St John's College Cambridge and I was really pleased that a colleague and I were able to attend.
Bridge of Sighs in Autumn (photo by Sally-Anne Shearn)

This group has been established to provide the growing UK AtoM community with a much needed forum for exchanging ideas and sharing experiences of using AtoM.

The meeting was attended by about 15 people though we were informed that there are nearly 50 people on the email distribution list. Interest in AtoM is certainly increasing in the UK.

As this was our first meeting, those who had made progress with AtoM were encouraged to give a brief presentation covering the following points:
  1. Where are you with AtoM (investigating, testing, using)?
  2. What do you use it for? (cataloguing, accessions, physical storage locations)
  3. What do you like about it/ what works?
  4. What don’t you like about it/ what doesn’t work?
  5. How do you see AtoM fitting into your wider technical infrastructure? (do you have separate location or accession databases etc?)
  6. What unanswered questions do you have?
It was really interesting to find out how others are using AtoM in the UK. A couple of attendees had already upgraded to the new 2.4 release so that was encouraging to see.

I'm not going to summarise the whole meeting but I made a note of people's likes and dislikes (questions 3 and 4 above). There were some common themes that came up.

Note that most users are still using AtoM 2.2 or 2.3, those who have moved to 2.4 haven't had much chance to explore it yet. It may be that some of these comments are already out of date and fixed in the new release.


What works?


AtoM seems to have lots going for it!

The words 'intuitive', 'user friendly', 'simple', 'clear' and 'flexible' were mentioned several times. One attendee described some user testing she carried out during which she found her users just getting on and using it without any introduction or explanation! Clearly a good sign!

The fact that it was standards compliant was mentioned, as well as the fact that consistency was enforced. When moving from unstructured finding aids to AtoM it really does help ensure that the right bits of information are included. The fact that AtoM highlights which mandatory fields are missing at the top of a page is really helpful when checking through your own or others' records.

The ability to display digital images was highlighted by others as a key selling point, particularly the browse by digital objects feature.

The way that different bits of the AtoM database interlink was a plus point that was mentioned more than once - this allows you to build up complex interconnecting records using archival descriptions and authority records and these can also be linked to accession records and a physical location.

The locations section of AtoM was thought to be 'a good thing' - for recording information about where in the building each archive is stored. This works well once you get your head around how best to use it.

Integration with Archivematica was mentioned by one user as being a key selling point for them - several people in the room were either using, or thinking of using Archivematica for digital preservation.

The user community itself and the quick and helpful responses to queries posted on the user forum were mentioned by more than one attendee. Also praised was the fact that AtoM is in continuous active development and very much moving in the right direction.


What doesn't work?


Several attendees mentioned the digital object functionality in AtoM. As well as being a clear selling point, it was also highlighted as an area that could be improved. The one-to-one relationship between an archival description and a digital object wasn't thought to be ideal and there was some discussion about linking through to external repositories - it would be nice if items linked in this way could be displayed in the AtoM image carousel even where the url doesn't end in a filename.

The typeahead search suggestions when you enter search terms were not thought to be helpful all of the time. Sometimes the closest matches do not appear in the list of suggested results.

One user mentioned that they would like a publication status that is somewhere in between draft and published. This would be useful for those records that are complete and can be viewed internally by a selected group of users who are logged in but are not available to the wider public.

More than one person mentioned that they would like to see a conservation module in AtoM.

There was some discussion about the lack of an audit trail for descriptions within AtoM. It isn't possible to see who created a record, when it was created and information about updates. This would be really useful for data quality checking, particularly when training new members of staff and volunteers.

Some concerns about scalability were mentioned - particularly for one user with a very large number of records within AtoM - the process of re-indexing AtoM can take three days.

When creating creator or access points, the drop down menu doesn’t display all the options so this causes difficulties when trying to link to the right point or establishing whether the desired record is in the system or not. This can be particularly problematic for common surnames as several different records may exist.

There are some issues with the way authority records are created currently, with no automated way of creating a unique identifier and no ability to keep authority records in draft.

A comment about the lack of auto-save and the issue of the web form timing out and losing all of your work seemed to be a shared concern for many attendees.

Other things that were mentioned included an integration with Active Directory and local workarounds that had to be put in place to make finding aids bi-lingual.


Moving forward


The group agreed that it would be useful to keep a running list of these potential areas of development for AtoM and that perhaps in the future members may be able to collaborate to jointly sponsor work to improve AtoM. This would be a really positive outcome for this new network.

I was also able to present on a recent collaboration to enable OAI-PMH harvesting of EAD from AtoM and use it as an opportunity to try to drum up support for further development of this new feature. I had to try and remember what OAI-PMH stood for and think I got 83% of it right!

Thanks to St John's College Cambridge for hosting. I look forward to our next meeting which we hope to hold here in York in the Spring.


Moving a proof of concept into production? it's harder than you might think...

Published 20 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

My colleagues and I blogged a lot during the Filling the Digital Preservation Gap project but I’m aware that I’ve gone a bit quiet on this topic since…

I was going to wait until we had a big success to announce, but follow on work has taken longer than expected. So in the meantime here is an update on where we are and what we are up to.

Background


Just to re-cap, by the end of phase 3 of Filling the Digital Preservation Gap we had created a working proof of concept at the University of York that demonstrated that it is possible to create an automated preservation workflow for research data using PURE, Archivematica, Fedora and Samvera (then called Hydra!).

This is described in our phase 3 project report (and a detailed description of the workflow we were trying to implement was included as an appendix in the phase 2 report).

After the project was over, it was agreed that we should go ahead and move this into production.

Progress has been slower than expected. I hadn’t quite appreciated just how different a proof of concept is to a production-ready environment!

Here are some of the obstacles we have encountered (and in some cases overcome):

Error reporting


One of the key things that we have had to build in to the existing code in order to get it ready for production is error handling.

This was not a priority for the proof of concept. A proof of concept is really designed to demonstrate that something is possible, not to be used in earnest.

If errors happen and things stop working (which they sometimes do) you can just kill it and rebuild.

In a production environment we want to be alerted when something goes wrong so we can work out how to fix it. Alerts and errors are crucial to a system like this.

We are sorting this out by enabling Archivematica's own error handling and error catching within Automation Tools.


What happens when something goes wrong?


...and of course once things have gone wrong in Archivematica and you've fixed the underlying technical issue, you then need to deal with any remaining problems with your information packages in Archivematica.

For example, if the problems have resulted in failed transfers in Archivematica then you need to work out what you are going to do with those failed transfers. Although it is (very) tempting to just clear out Archivematica and start again, colleagues have advised me that it is far more useful to actually try and solve the problems and establish how we might handle a multitude of problematic scenarios if we were in a production environment!

So we now have scenarios in which an automated transfer has failed, and in order to get things moving again we need to carry out a manual transfer of the dataset into Archivematica. Will the other parts of our workflow still work if we intervene in this way?

One issue we have encountered along the way is that though our automated transfer uses a specific 'datasets' processing configuration that we have set up within Archivematica, when we push things through manually it uses the 'default' processing configuration which is not what we want.

We are now looking at how we can encourage Archivematica to use the specified processing configuration. As described in the Archivematica documentation, you can do this by including an XML file describing your processing configuration within your transfer.
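As I understand the documentation, this means dropping a file called processingMCP.xml (a saved copy of the chosen processing configuration) into the top level of the transfer before it is started. A minimal sketch of what that might look like - the paths here are hypothetical:

import shutil
from pathlib import Path

# Hypothetical paths: a saved copy of our 'datasets' processing configuration
# and the folder we are about to push through Archivematica manually.
datasets_config = Path("/home/archivist/configs/datasets-processingMCP.xml")
transfer_dir = Path("/data/manual-transfers/dataset-001")

# Archivematica looks for processingMCP.xml in the root of the transfer and,
# if present, uses it instead of the default processing configuration.
shutil.copy(datasets_config, transfer_dir / "processingMCP.xml")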

It is useful to learn lessons like this outside of a production environment!


File size/upload


Although our project recognised that there would be a limit to the size of dataset that we could accept and process with our application, we didn't really bottom out what size of dataset we intended to support.

It has now been agreed that we should reasonably expect the data deposit form to accept datasets of up to 20 GB in size. Anything larger than this would need to be handled in a different way.

Testing the proof of concept in earnest showed that it was not able to handle datasets of over 1 GB in size. Its primary purpose was to demonstrate the necessary integrations and workflow, not to handle larger files.

Additional (and ongoing) work was required to enable the web deposit form to work with larger datasets.


Space


In testing the application we of course ended up trying to push some quite substantial datasets through it.

This was fine until everything abruptly seemed to stop working!

The problem was actually a fairly simple one but because of our own inexperience with Archivematica it took a while to troubleshoot and get things moving in the right direction again.

It turned out that we hadn’t allocated enough space in one of the bits of filestore that Archivematica uses for failed transfers (/var/archivematica/sharedDirectory/failed). This had filled up and was stopping Archivematica from doing anything else.

Once we knew the cause of the problem, the available space was increased, but then everything ground to a halt again because we had quickly used that up too. Increasing the space had got things moving, but while we were trying to demonstrate the fact that it wasn't working we had deposited several further datasets, which were waiting in the transfer directory and quickly blocked things up again.

On a related issue, one of the test datasets I had been using to see how well Research Data York could handle larger datasets consisted of about 2000 JPEG images totalling c.5 GB. Of course one of the default normalisation tasks in Archivematica is to convert all of these JPEGs to TIFF.

Once this collection of JPEGs was converted to TIFF the size of the dataset increased to around 80 GB. Until I witnessed this it hadn't really occurred to me that this could cause problems.
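A rough back-of-the-envelope calculation shows why a jump of this size is to be expected once the JPEG compression is undone - all of the per-image figures below are assumed averages rather than measurements:

# Figures from the test dataset: ~2000 JPEGs totalling roughly 5 GB.
jpeg_count = 2000
jpeg_dataset_gb = 5
avg_jpeg_mb = jpeg_dataset_gb * 1024 / jpeg_count   # ~2.5 MB per JPEG

# Uncompressed TIFF undoes the JPEG compression; a 15-20x ratio is typical
# for photographic images (16x assumed here).
assumed_compression_ratio = 16
avg_tiff_mb = avg_jpeg_mb * assumed_compression_ratio

estimated_gb = jpeg_count * avg_tiff_mb / 1024
print(f"Estimated size after normalisation: {estimated_gb:.0f} GB")  # ~80 GB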

The solution - allocate Archivematica much more space than you think it will need!

We also now have the filestore set up so that it will inform us when the space in these directories gets to 75% full. Hopefully this will allow us to stop the filestore filling up in the future.
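Our filestore does this alerting for us, but the same check is easy to sketch in Python for anyone without that option. The directories listed reflect our setup and the 75% threshold mentioned above; note that shutil.disk_usage reports on the whole filesystem containing each path:

import shutil

# Directories Archivematica relies on (the 'failed' directory is the one
# that caught us out).
watched = [
    "/var/archivematica/sharedDirectory",
    "/var/archivematica/sharedDirectory/failed",
]

THRESHOLD = 0.75  # warn when the filesystem is three-quarters full

for path in watched:
    usage = shutil.disk_usage(path)
    fraction_used = usage.used / usage.total
    status = "WARNING" if fraction_used >= THRESHOLD else "ok"
    print(f"{status}: {path} is {fraction_used:.0%} full")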


Workflow


The proof of concept did not undergo rigorous testing - it was designed for demonstration purposes only.

During the project we thought long and hard about the deposit, request and preservation workflows that we wanted to support, but we were always aware that once we had it in an environment that we could all play with and test, additional requirements would emerge.

As it happens, we have discovered that the workflow implemented is very true to that described in the appendix of our phase 2 report and does meet our needs. However, there are lots of bits of fine tuning required to enhance the functionality and make the interface more user friendly.

The challenge here is to try to carry out the minimum of work required to turn it into an adequate solution to take into production. There are so many enhancements we could make – I have a wish list as long as my arm – but until we better understand whether a local solution or a shared solution (provided by the Jisc Research Data Shared Service) will be adopted in the future it is not worth trying to make this application perfect.

Making it fit for production is the priority. Bells and whistles can be added later as necessary!





My thanks to all those who have worked on creating, developing, troubleshooting and testing this application and workflow. It couldn't have happened without you!




How do you deal with mass spam on MediaWiki?

Published 19 Sep 2017 by sau226 in Newest questions tagged mediawiki - Webmasters Stack Exchange.

What would be the best way to find a user's IP address on MediaWiki if all the connections were proxied through a Squid proxy server and you have access to all user rights?

I am a steward on a centralauth based wiki and we have lots of spam accounts registering and making 1 spam page each.

Can someone please tell me what the best way to mass block them is, as I keep on having to block each user individually and lock their accounts?


HAPPY RETIREMENT, MR GAWLER

Published 18 Sep 2017 by timbaker in Tim Baker.

The author (centre) with Ruth and Ian Gawler

Recently a great Australian, a man who has helped thousands of others in their most vulnerable and challenging moments, a Member of the Order of Australia, quietly retired from a long and remarkable career of public service....

Harvesting EAD from AtoM: we need your help!

Published 18 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Back in February I published a blog post about a project to develop AtoM to allow EAD (Encoded Archival Description) to be harvested via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting): “Harvesting EAD from AtoM: a collaborative approach”

Now that AtoM version 2.4 is released (hooray!), containing the functionality we have sponsored, I thought it was high time I updated you on what has been achieved by this project, where more work is needed and how the wider AtoM community can help.


What was our aim?


Our development work had a few key aims:

  • To enable finding aids from AtoM to be exposed as EAD 2002 XML for others to harvest. The partners who sponsored this project were particularly keen to enable the Archives Hub to harvest their EAD.
  • To change the way that EAD was generated by AtoM in order to make it more scalable. Moving EAD generation from the web browser to the job scheduler was considered to be the best approach here.
  • To make changes to the existing DC (Dublin Core) metadata generation feature so that it also works through the job scheduler - making this existing feature more scalable and able to handle larger quantities of data

A screen shot of the job scheduler in AtoM - showing the EAD and
DC creation jobs that have been completed

What have we achieved?

The good

We believe that the EAD harvesting feature as released in AtoM version 2.4 will enable a harvester such as the Archives Hub to harvest our catalogue metadata from AtoM as EAD. As we add new top level archival descriptions to our catalogue, subsequent harvests should pick up and display these additional records. 

This is a considerable achievement and something that has been on our wishlist for some time. This will allow our finding aids to be more widely signposted. Having our data aggregated and exposed by others is key to ensuring that potential users of our archives can find the information that they need.

Changes have also been made to the way metadata (both EAD and Dublin Core) are generated in AtoM. This means that the solution going forward is more scalable for those AtoM instances that have very large numbers of records or large descriptive hierarchies.

The new functionality in AtoM around OAI-PMH harvesting of EAD and settings for moving XML creation to the job scheduler is described in the AtoM documentation.
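From the harvester's side, a request for EAD over OAI-PMH looks something like the sketch below. The base URL is a made-up example, and the oai_ead metadata prefix is an assumption on my part - check the ListMetadataFormats response from your own AtoM instance to confirm what it actually advertises:

import xml.etree.ElementTree as ET
import requests

# Hypothetical AtoM OAI-PMH endpoint; the 'oai_ead' prefix is assumed.
base_url = "https://archives.example.ac.uk/;oai"
params = {"verb": "ListRecords", "metadataPrefix": "oai_ead"}

response = requests.get(base_url, params=params, timeout=60)
response.raise_for_status()

# Count the records in this page of results. A resumptionToken (part of the
# standard OAI-PMH protocol) signals that further pages should be fetched.
ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
root = ET.fromstring(response.content)
records = root.findall(".//oai:record", ns)
token = root.find(".//oai:resumptionToken", ns)
print(f"Fetched {len(records)} records; more pages: {token is not None and bool(token.text)}")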

The not-so-good

Unfortunately the EAD harvesting functionality within AtoM 2.4 will not do everything we would like it to do. 

It does not at this point include the ability for the harvester to know when metadata records have been updated or deleted. It also does not pick up new child records that are added into an existing descriptive hierarchy. 

We want to be able to edit our records once within AtoM and have any changes reflected in the harvested versions of the data. 

We don’t want our data to become out of sync. 

So clearly this isn't ideal.

The task of enabling full harvesting functionality for EAD was found to be considerably more complex than first anticipated. This has no doubt been compounded by the hierarchical nature of EAD, which differs from the simplicity of the traditional Dublin Core approach.

The problems encountered are certainly not insurmountable, but lack of additional resources and timelines for the release of AtoM 2.4 stopped us from being able to finish off this work in full.

A note on scalability


Although the development work deliberately set out to consider issues of scalability, it turns out that scalability is actually on a sliding scale!

The National Library of Wales had the forethought to include one of their largest archival descriptions as sample data for inclusion in the version of AtoM 2.4 that Artefactual deployed for testing. Their finding aid for St David’s Diocesan Records is a very large descriptive hierarchy consisting of 33,961 individual entries. This pushed the capabilities of EAD creation (even when done via the job scheduler) and also led to discussions with The Archives Hub about exactly how they would process and display such a large description at their end even if EAD generation within AtoM were successful.

Some more thought and more manual workarounds will need to be put in place to manage the harvesting and subsequent display of large descriptions such as these.

So what next?


We are keen to get AtoM 2.4 installed at the Borthwick Institute for Archives over the next couple of months. We are currently on version 2.2 and would like to start benefiting from all the new features that have been introduced... and of course to test in earnest the EAD harvesting feature that we have jointly sponsored.

We already know that this feature will not fully meet our needs in its current form, but would like to set up an initial harvest with the Archives Hub and further test some of our assumptions about how this will work.

We may need to put some workarounds in place to ensure that we have a way of reflecting updates and deletions in the harvested data – either with manual deletes or updates or a full delete and re-harvest periodically.

Harvesting in AtoM 2.4 - some things that need to change


Harvesting EAD from AtoM: we need your help!

Published 18 Sep 2017 by Jenny Mitcham in Digital Archiving at the University of York.

Back in February I published a blog post about a project to develop AtoM to allow EAD (Encoded Archival Description) to be harvested via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting): “Harvesting EAD from AtoM: a collaborative approach”.

Now that AtoM version 2.4 is released (hooray!), containing the functionality we have sponsored, I thought it was high time I updated you on what has been achieved by this project, where more work is needed and how the wider AtoM community can help.


What was our aim?


Our development work had a few key aims:

  • To enable finding aids from AtoM to be exposed as EAD 2002 XML for others to harvest. The partners who sponsored this project were particularly keen to enable the Archives Hub to harvest their EAD.
  • To change the way that EAD was generated by AtoM in order to make it more scalable. Moving EAD generation from the web browser to the job scheduler was considered to be the best approach here.
  • To make changes to the existing DC (Dublin Core) metadata generation feature so that it also works through the job scheduler - making this existing feature more scalable and able to handle larger quantities of data.

A screenshot of the job scheduler in AtoM, showing the EAD and DC creation jobs that have been completed.

What have we achieved?

The good

We believe that the EAD harvesting feature as released in AtoM version 2.4 will enable a harvester such as the Archives Hub to harvest our catalogue metadata from AtoM as EAD. As we add new top level archival descriptions to our catalogue, subsequent harvests should pick up and display these additional records. 

This is a considerable achievement and something that has been on our wishlist for some time. This will allow our finding aids to be more widely signposted. Having our data aggregated and exposed by others is key to ensuring that potential users of our archives can find the information that they need.

Changes have also been made to the way metadata (both EAD and Dublin Core) are generated in AtoM. This means that the solution going forward is more scalable for those AtoM instances that have very large numbers of records or large descriptive hierarchies.

The new functionality in AtoM around OAI-PMH harvesting of EAD and settings for moving XML creation to the job scheduler is described in the AtoM documentation.
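
To make the harvesting side of this more concrete, here is a rough sketch (in Python) of what a harvester’s request loop against an AtoM OAI-PMH endpoint can look like. The endpoint path and the oai_ead metadata prefix are assumptions for illustration only; check your own instance’s Identify and ListMetadataFormats responses, and the AtoM documentation, for the actual values.

# Rough sketch of harvesting EAD over OAI-PMH. The endpoint path and the
# "oai_ead" metadata prefix are illustrative assumptions; confirm them against
# your own AtoM instance before relying on this.
import requests
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
ENDPOINT = "https://atom.example.org/;oai"  # hypothetical AtoM OAI-PMH URL

def list_records(metadata_prefix="oai_ead"):
    """Yield (identifier, datestamp) for every record, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(requests.get(ENDPOINT, params=params).content)
        for record in root.iter(OAI_NS + "record"):
            header = record.find(OAI_NS + "header")
            yield header.findtext(OAI_NS + "identifier"), header.findtext(OAI_NS + "datestamp")
        token = root.find(OAI_NS + "ListRecords/" + OAI_NS + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

for identifier, datestamp in list_records():
    print(datestamp, identifier)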

The not-so-good

Unfortunately the EAD harvesting functionality within AtoM 2.4 will not do everything we would like it to do. 

It does not at this point include the ability for the harvester to know when metadata records have been updated or deleted. It also does not pick up new child records that are added into an existing descriptive hierarchy. 

We want to be able to edit our records once within AtoM and have any changes reflected in the harvested versions of the data. 

We don’t want our data to become out of sync. 

So clearly this isn't ideal.

The task of enabling full harvesting functionality for EAD was found to be considerably more complex than first anticipated. This has no doubt been compounded by the hierarchical nature of EAD, which differs from the simplicity of the traditional Dublin Core approach.

The problems encountered are certainly not insurmountable, but a lack of additional resources and the timeline for the AtoM 2.4 release stopped us from finishing this work in full.

A note on scalability


Although the development work deliberately set out to consider issues of scalability, it turns out that scalability is actually on a sliding scale!

The National Library of Wales had the forethought to include one of their largest archival descriptions as sample data for inclusion in the version of AtoM 2.4 that Artefactual deployed for testing. Their finding aid for St David’s Diocesan Records is a very large descriptive hierarchy consisting of 33,961 individual entries. This pushed the capabilities of EAD creation (even when done via the job scheduler) and also led to discussions with The Archives Hub about exactly how they would process and display such a large description at their end even if EAD generation within AtoM were successful.

Some more thought and more manual workarounds will need to be put in place to manage the harvesting and subsequent display of large descriptions such as these.

So what next?


We are keen to get AtoM 2.4 installed at the Borthwick Institute for Archives over the next couple of months. We are currently on version 2.2 and would like to start benefiting from all the new features that have been made available... and of course to test in earnest the EAD harvesting feature that we have jointly sponsored.

We already know that this feature will not fully meet our needs in its current form, but would like to set up an initial harvest with the Archives Hub and further test some of our assumptions about how this will work.

We may need to put some workarounds in place to ensure that we have a way of reflecting updates and deletions in the harvested data – either with manual deletes or updates or a full delete and re-harvest periodically.

Harvesting in AtoM 2.4 - some things that need to change


So we have a list of priority things that need to be improved in order to get EAD harvesting working more smoothly in the future:


In line with the OAI-PMH specification

  • AtoM needs to expose updates to the metadata to the harvester
  • AtoM needs to expose new records (at any level of description) to the harvester
  • AtoM needs to expose information about deletions to the harvester
  • AtoM also needs to expose information about deletions of DC metadata to the harvester (it has come to my attention during the course of this project that this isn’t happening at the moment). A sketch of how a harvester would pick up these updates and deletions via OAI-PMH follows this list.
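
In OAI-PMH terms, the items above come down to selective harvesting: the harvester passes a from argument, the repository compares it against record datestamps, and deletions are signalled by headers carrying status="deleted". The sketch below (in Python, reusing the hypothetical endpoint from the earlier example) shows how a harvester would pick up only what has changed since its last run; it is illustrative only, since AtoM 2.4 does not yet advertise these changes.

# Sketch of an incremental harvest using ListIdentifiers: fetch only the headers
# of records changed since the last run and separate deletions from updates.
# The endpoint and metadata prefix are the same hypothetical values as before.
import requests
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
ENDPOINT = "https://atom.example.org/;oai"  # hypothetical

def changes_since(last_run, metadata_prefix="oai_ead"):
    """Return (updated, deleted) identifier lists for records changed since last_run (YYYY-MM-DD)."""
    updated, deleted = [], []
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix, "from": last_run}
    while True:
        root = ET.fromstring(requests.get(ENDPOINT, params=params).content)
        for header in root.iter(OAI_NS + "header"):
            identifier = header.findtext(OAI_NS + "identifier")
            if header.get("status") == "deleted":
                deleted.append(identifier)   # tombstone: remove our harvested copy
            else:
                updated.append(identifier)   # new or updated: re-fetch with GetRecord
        token = root.find(OAI_NS + "ListIdentifiers/" + OAI_NS + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break
        params = {"verb": "ListIdentifiers", "resumptionToken": token.text.strip()}
    return updated, deleted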

Some other areas of potential work


I also wanted to bring together and highlight some other areas of potential work for the future. These are all things that were discussed during the course of the project but were not within the scope of our original development goals.

  • Harvesting of EAC (Encoded Archival Context) - this is the metadata standard for authority records. Is this something people would like to see enabled in the future? Of course this is only useful if you have someone who actually wants to harvest this information!
  • On the subject of authority records, it would be useful to change the current AtoM EAD template to use @authfilenumber and @source - so that an EAD record can link back to the relevant authority record in the local AtoM site. The ability to create rich authority records is such a key strength of AtoM, allowing an institution to weave rich interconnecting stories about their holdings. If harvesting doesn’t preserve this inter-connectivity then I think we are missing a trick!
  • EAD3 - this development work has deliberately not touched on the new EAD standard. Firstly, this would have been a much bigger job and secondly, we are looking to have our EAD harvested by The Archives Hub and they are not currently working with EAD3. This may be a priority area of work for the future.
  • Subject source - the subject source (for example "Library of Congress Subject Headings") doesn't appear in AtoM generated EAD at the moment even though it can be entered into AtoM - this would be a really useful addition to the EAD.
  • Visible elements - AtoM allows you to decide which elements you wish to display/hide in your local AtoM interface. With the exception of information relating to physical storage, the XML generation tasks currently do not take account of visible elements and will carry out an export of all fields. Further investigation of this should be carried out in the future. If an institution is using the visible elements feature to hide certain bits of information that should not be more widely distributed, they would be concerned if this information was being harvested and displayed elsewhere. As certain elements will be required in order to create valid EAD, this may get complicated!
  • ‘Manual’ EAD generation - the project team discussed the possibility of adding a button to the AtoM user interface so that staff users can manually kick-off EAD regeneration for a single descriptive hierarchy. Artefactual suggested this as a method of managing the process of EAD generation for large descriptive hierarchies. You would not want the EAD to regenerate with each minor tweak if a large archival description was undergoing several updates, however, you need to be able to trigger this task when you are ready to do so. It should be possible to switch off the automatic EAD re-generation (which normally triggers when a record is edited and saved) but have a button on the interface that staff can click when they want to initiate the process - for example when all edits are complete. 
  • As part of their work on this project, Artefactual created a simple script to help with the process of generating EAD for large descriptive hierarchies - it basically provides a way of finding out which XML files relate to a specific archival description so that EAD can be manually enhanced and updated if it is too large for AtoM to generate via the job scheduler. It would be useful to turn this script into a command-line task that is maintained as part of the AtoM codebase.

We need your help!


Although we believe we have something we can work with here and now, we are not under any illusions that this feature does all that it needs to in order to meet our requirements in the longer term. 

I would love to find out what other AtoM users (and harvesters) think of the feature. Is it useful to you? Are there other things we should put on the wishlist? 

There is a lot of additional work described in this post which the original group of project partners are unlikely to be able to fund on their own. If EAD harvesting is a priority for you and your organisation, and you think you can contribute to further work in this area either on your own or as part of a collaborative project, please do get in touch.


Thanks


I’d like to finish with a huge thanks to those organisations who have helped make this project happen, either through sponsorship, development or testing and feedback.



Jason Scott Talks His Way Out of It: A Podcast

Published 14 Sep 2017 by Jason Scott in ASCII by Jason Scott.

Next week I start a podcast.

There’s a Patreon for the podcast with more information here.

Let me unpack a little of the thinking.

Through the last seven years, since I moved back to NY, I’ve had pretty varied experiences with debt or huge costs weighing me down. Previously, I was making some serious income from a unix admin job, and my spending was direct but pretty limited. Since then, even with full-time employment (and I mean, seriously, a dream job), I’ve made some grandiose mistakes with taxes, bills and tracking down old obligations, which means I have some notable costs floating in the background.

Compound that with a new home I’ve moved to with real landlords that aren’t family and a general desire to clean up my life, and I realized I needed some way to make extra money that will just drop directly into the bill pit, never to really pass into my hands.

How, then, to do this?

I work very long hours for the Internet Archive, and I am making a huge difference in the world working for them. It wouldn’t be right or useful for me to take on any other job. I also don’t want to be making “stuff” that I sell or otherwise speculate into some market. Leave aside that I have these documentaries to finish, so time is short.

Then take into account that I can no longer afford to drop money on travelling to anything other than a small handful of conferences that aren’t local to me (the NY-CT-NJ Tri-State area), and that people really like the presentations I give.

So, I thought, how about me giving basically a presentation once a week? What if I recorded me giving a sort of fireside chat or conversational presentation about subjects I would normally give on the road, but make them into a downloadable podcast? Then, I hope, everyone would be happy: fans get a presentation. I get away from begging for money to pay off debts. I get to refine my speaking skills. And maybe the world gets something fun out of the whole deal.

Enter a podcast, funded by a Patreon.

The title: Jason Scott Talks His Way Out of It, my attempt to write down my debts and share the stories and thoughts I have.

I announced the Patreon on my 47th birthday. Within 24 hours, about 100 people had signed up, paying some small amount (or not small, in some cases) for each published episode. I had a goal of $250/episode to make it worthwhile, and we passed that handily. So it’s happening.

I recorded a prototype episode, and that’s up there, and the first episode of the series drops Monday. These are story-based presentations roughly 30 minutes long apiece, and I will continue to do them as long as it makes sense to.

Public speaking is something I’ve done for many, many years, and I enjoy it, and I’m told that people enjoy the talks very much. My presentation on That Awesome Time I Was Sued for Two Billion Dollars has passed 800,000 views across the various copies online.

I spent $40 improving my sound setup, which should work for the time being. (I already had a nice microphone and an SSD-based laptop which won’t add sound to the room.) I’m going to have a growing list of topics I’ll work from, and I’ll stay in communication with the patrons.

Let’s see what this brings.

One other thing: Moving to the new home means that a lot of quality of life issues have been fixed, and my goal is to really shoot forward finishing those two documentaries I owe people. I want them done as much as everyone else! And with less looming bills and debts in my life, it’ll be all I want to do.

So, back the new podcast if you’d like. It’ll help a lot.


An Eventlogging adventure

Published 14 Sep 2017 in Posts on The bugalore.

What the heck is EventLogging? EventLogging is a MediaWiki extension which lets us log events, such as how users interact with a certain feature (client-side logging), or capture the state of a system (user, permissions, etc.) when a certain event happens (server-side logging). There are three different parts to logging an event: the schema, the code and the log data. I won’t be going into the details of that because there’s a detailed guide for it.
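
As a rough illustration of the overall shape (not the extension’s actual API; the detailed guide has the real schemas and endpoints), an event is just a small, schema-described payload fired at a collection endpoint. Everything in the sketch below, from the beacon URL to the schema and field names, is hypothetical.

# Illustrative sketch only: the general shape of logging an event, i.e. a small,
# schema-described payload sent to a collection endpoint. The beacon URL, the
# schema name and the field names are all hypothetical, not EventLogging's API.
import json
import urllib.parse
import urllib.request

BEACON = "https://example.org/beacon/event"  # hypothetical collection endpoint

def log_event(schema, revision, wiki, event):
    """Wrap the event in a capsule identifying its schema and fire it at the beacon."""
    capsule = {"schema": schema, "revision": revision, "wiki": wiki, "event": event}
    url = BEACON + "?" + urllib.parse.quote(json.dumps(capsule))
    urllib.request.urlopen(url)  # fire-and-forget; the response body is ignored

# Example: record that a user clicked a button in some hypothetical feature.
log_event("ExampleFeatureClick", 1, "examplewiki", {"userId": 42, "button": "save"})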

Does Mediawiki encrypt logins by default as the browser sends them to the server?

Published 11 Sep 2017 by user1258361 in Newest questions tagged mediawiki - Server Fault.

Several searches only turned up questions about encrypting login info on the server side. Does Mediawiki encrypt logins after you type them in the browser and send them? (to prevent a man-in-the-middle from reading them in transit and taking over an account)


The Bounty of the Ted Nelson Junk Mail

Published 9 Sep 2017 by Jason Scott in ASCII by Jason Scott.

At the end of May, I mentioned the Ted Nelson Junk Mail project, where a group of people were scanning in boxes of mailings and pamphlets collected by Ted Nelson and putting them on the Internet Archive. Besides the uniqueness of the content, the project was also unusual in that we were trying to set it up to be self-sustaining from volunteer monetary contributions, and to compensate the scanners doing the work.

This entire endeavor has been wildly successful.

We are well past 18,000 pages scanned. We have taken in thousands in donations. And we now have three people scanning and one person entering metadata.

Here is the spreadsheet with transparency and donation information.

I highly encourage donating.

But let’s talk about how this collection continues to be amazing.

Always, there are the pure visuals. As we’re scanning away, we’re starting to see trends in what we have, and everything seems to go from the early 1960s to the early 1990s, a 30-year scope that encompasses a lot of companies and a lot of industries. These companies are trying to thrive in a whirlpool of competing attention, especially in certain technical fields, and they try everything from humor to class to rudimentary fear-and-uncertainty plays in the art.

These are exquisitely designed brochures, in many cases – obviously done by a firm or with an in-house group specifically tasked with making the best possible paper invitations and with little expense spared. After all, this might be the only customer-facing communication a company could have about its products, and might be the best convincing literature after the salesman has left or the envelope is opened.

Scanning at 600dpi has been a smart move – you can really zoom in and see detail, find lots to play with or study or copy. Everything is at this level, like this detail about a magnetic eraser that lets you see the lettering on the side.

Going after these companies for gender roles or other out-of-fashion jokes almost feels like punching down, but yeah, there’s a lot of it. Women draped over machines, assumptions that women will be doing the typing, and clunky humor about fulfilling your responsibilities as a (male) boss abounds. Cultural norms regarding what fears reigned in business or how companies were expected to keep on top of the latest trends are baked in there too.

The biggest obstacle going forward, besides bringing attention to this work, is going to be one of findability. The collection is not based on some specific subject matter other than what attracted Ted’s attention over the decades. He tripped lightly among aerospace, lab science, computers, electronics, publishing… nothing escaped his grasp, especially in technical fields.

If people are looking for pure aesthetic beauty, that is, “here’s a drawing of something done in a very old way” or “here are old fonts”, then this bounty is already, at 1,700 items, a treasure trove that could absorb weeks of your time. Just clicking around to items that on first blush seem to have boring title pages will often expand into breathtaking works of art and design.

I’m not worried about that part, frankly – these kind of sell themselves.

But there’s so much more to find among these pages, and as we’re now up to so many examples, it’s going to be a challenge to get researching folks to find them.

We have the keywording active, so you can search for terms like monitor, circuit, or hypercard and get more specific matches without concentrating on what the title says or what graphics appear on the front. The Archive has a full-text search, and so people looking for phrases will no doubt stumble into this collection.

But how easily will people even think to look for a wristwatch for the Macintosh from 1990, a closed circuit camera called the Handy Looky, or this little graphic, nestled away inside a bland software catalog:

…I don’t know. I’ll mention that this is actually twitter-fodder among archivists, who are unhappy when someone is described as “discovering” something in the archives, when it was obvious a person cataloged it and put it there.

But that’s not the case here. Even Kyle, who’s doing the metadata, is doing so in a descriptive fashion, and on a rough day of typing in descriptions, he might not particularly highlight unique gems in the pile (he often does, though). So, if you discover them in there, you really did discover them.

So, the project is deep, delightful, and successful. The main consideration is funding; we are paying the scanners $10/hr to scan and the metadata work is $15/hr. They work fast and efficiently. We track it all on the spreadsheet. But that means a single day of this work can generate a notable bill. We’re asking people on twitter to help raise funds, but it never hurts to ask here as well. Consider donating to this project, because we may not know for years how much wonderful history is saved here.

Please share the jewels you find.


Blog? Bleurgh.

Published 9 Sep 2017 in Posts on The bugalore.

The what, the why, the when, the who, the how: I’ve been coding for a while now, mostly on Wikimedia projects. Every time I got stuck on a bug or came across something I didn’t know of, I’d learn something new, fix it and move on. While this works great for me, it’s a wealth of knowledge that I’m keeping all to myself. I wanted to be able to pass on the intriguing lessons I’ve learnt to other people who might want to hear the stories I brought back from the pits I fell into (no, not like Bruce Wayne).

4 Months!

Published 9 Sep 2017 by Jason Scott in ASCII by Jason Scott.

It’s been 4 months since my last post! That’s one busy little Jason summer, to be sure.

Obviously, I’m still around, with no lingering heart attack problems. My doctor told me that my heart is basically healed, and he wants more exercise out of me. My diet continues to be lots of whole foods and leafy greens, with occasional shameful treats that don’t turn into a staple.

I spent a good month working with good friends to clear out the famous Information Cube, sorting out and mailing/driving away all the contents to other institutions, including the Internet Archive, the Strong Museum of Play, the Vintage Computer Federation, and parts worldwide.

I’ve moved homes, no longer living with my brother after seven up-and-down years of siblings sharing a house. It was time! We’re probably not permanently scarred! I love him very much. I now live in an apartment with very specific landlords with rules and an important need to pay them on time each and every month.

To that end, I’ve cut back on my expenses and will continue to, so it’s the end of me “just showing up” at pretty much any conference I’m not being compensated for, which will of course cut down on the number of Jason appearances you can catch.

I’ll still be making appearances as people ask me to go, of course – I love travel. I’m speaking in Amsterdam in October, as well as emceeing at the Internet Archive that same month. So we’ll see how that goes.

What that means is more media ingestion work, and more work on the remaining two documentaries. I’m going to continue my goal of clearing my commitments before long, so I can choose what I do next.

What follows will be (I hope) lots of entries going deep into some subjects and about what I’m working on, and I thank you for your patience as I was not writing weblog entries while upending my entire life.

To the future!


Godless for God’s Sake: Now available for Kindle for just $5.99

Published 6 Sep 2017 by Nontheist Friends in NontheistFriends.org.


Godless for God’s Sake: Nontheism in Contemporary Quakerism

In this book edited by British Friend and author David Boulton, 27 Quakers from 4 countries and 13 yearly meetings tell how they combine active and committed membership in the Religious Society of Friends with rejection of traditional belief in the existence of a transcendent, personal and supernatural God.

For some, God is no more (but no less) than a symbol of the wholly human values of “mercy, pity, peace and love”. For others, the very idea of God has become an archaism.

Readers who seek a faith free of supernaturalism, whether they are Friends, members of other religious traditions or drop-outs from old-time religion, will find good company among those whose search for an authentic 21st century understanding of religion and spirituality has led them to declare themselves “Godless – for God’s Sake”.

Contents

Preface: In the Beginning…

1. For God’s Sake? An Introduction (David Boulton)
2. What’s a Nice Nontheist Like You Doing Here? (Robin Alpern)
3. Something to Declare (Philip Gross)
4. It’s All in the Numbers (Joan D Lucas)
5. Chanticleer’s Call: Religion as a Naturalist Views It (Os Cresson)
6. Mystery: It’s What we Don’t Know (James T Dooley Riemermann)
7. Living the Questions (Sandy Parker)
8. Listening to the Kingdom (Bowen Alpern)
9. The Making of a Quaker Nontheist Tradition (David Boulton and Os Cresson)
10. Facts and Figures (David Rush)
11. This is my Story, This is my Song…

Ordering Info

Links to forms for ordering online will be provided here as soon as they are available. In the meantime, contact the organizations listed below, using the book details at the bottom of this page.

QuakerBooks of Friends General Conference

(formerly FGC Bookstore)

1216 Arch St., Ste 2B

Philadelphia, PA 19107

215-561-1700 fax 215-561-0759

http://www.quakerbooks.org/get/333011

(this is the “Universalism” section of Quakerbooks, where the book is currently located)


or

The Quaker Bookshop

173 Euston Rd, London NW1 2BJ

020 7663 1030, fax 020 7663 1008 bookshop@quaker.org.uk

 

Those outside the United Kingdom and United States should be able to order through a local bookshop, quoting the publishing details below – particularly the ISBN number. In case of difficulty, the book can be ordered direct from the publisher’s address below.

Title: “Godless for God’s Sake: Nontheism in Contemporary Quakerism” (ed. David Boulton)

Publisher: Dales Historical Monographs, Hobsons Farm, Dent, Cumbria LA10 5RF, UK. Tel 015396 25321. Email davidboulton1@compuserve.com.

Retail price: £9.50 ($18.50). Prices elsewhere to be calculated on the UK price plus postage.

Format: Paperback, full colour cover, 152 pages, A5

ISBN number: 0-9511578-6-8 (to be quoted when ordering from any bookshop in the world)


MassMessage hits 1,000 commits

Published 28 Aug 2017 by legoktm in The Lego Mirror.

The MassMessage MediaWiki extension hit 1,000 commits today, following an update of the localization messages for the Russian language. MassMessage replaced a Toolserver bot that allowed sending a message to all Wikimedia wikis, by integrating it into MediaWiki and using the job queue. We also added some nice features like input validation and previewing. Through it, I became familiar with different internals of MediaWiki, including submitting a few core patches.

I made my first commit on July 20, 2013. It would get a full rollout to all Wikimedia wikis on November 19, 2013, after a lot of help from MZMcBride, Reedy, Siebrand, Ori, and other MediaWiki developers.

I also mentored User:wctaiwan, who worked on a Google Summer of Code project that added a ContentHandler backend to the extension, to make it easier for people to create and maintain page lists. You can see it used by The Wikipedia Signpost's subscription list.

It's still a bit crazy to think that I've been hacking on MediaWiki for over four years now, and how much it has changed my life in that much time. So here's to the next four years and next 1,000 commits to MassMessage!


Requiring HTTPS for my Toolforge tools

Published 26 Aug 2017 by legoktm in The Lego Mirror.

My Toolforge (formerly "Tool Labs") tools will now start requiring HTTPS, and redirecting any HTTP traffic. It's a little bit of common code for each tool, so I put it in a shared "toolforge" library.

from flask import Flask
import toolforge  # the shared helper library mentioned above

app = Flask(__name__)
# Run the check before every request; plain HTTP traffic gets redirected to HTTPS.
app.before_request(toolforge.redirect_to_https)

And that's it! Your tool will automatically be HTTPS-only now.

$ curl -I "http://tools.wmflabs.org/mwpackages/"
HTTP/1.1 302 FOUND
Server: nginx/1.11.13
Date: Sat, 26 Aug 2017 07:58:39 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 281
Connection: keep-alive
Location: https://tools.wmflabs.org/mwpackages/
X-Clacks-Overhead: GNU Terry Pratchett
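
The 302 above comes from a before_request hook that is typically only a few lines long. Below is a minimal sketch of the pattern (not necessarily the actual toolforge library source), assuming the web server in front of the tool sets the X-Forwarded-Proto header:

# Minimal sketch of an HTTP-to-HTTPS redirect hook for Flask, shown as an
# illustration of the pattern rather than the actual library source. Assumes
# the proxy in front of the tool sets the X-Forwarded-Proto header.
from flask import redirect, request

def redirect_to_https():
    """Return a 302 to the https:// version of the current URL, or None to continue."""
    proto = request.headers.get("X-Forwarded-Proto", request.scheme)
    if proto != "https":
        return redirect(request.url.replace("http://", "https://", 1), code=302)
    return None  # already HTTPS: let the request through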

My DebConf 17 presentation - Bringing MediaWiki back into Debian

Published 25 Aug 2017 by legoktm in The Lego Mirror.

Full quality video available on Wikimedia Commons, as well as the slides.

I had a blast attending DebConf '17 in Montreal, and presented about my efforts to bring MediaWiki back into Debian. The talks I went to were all fantastic, and I got to meet some amazing people. But the best parts about the conference were the laid-back atmosphere and the food. I've never been to another conference that had food that comes even close to DebConf.

Feeling very motivated, I have three new packages in the pipeline: LuaSandbox, uprightdiff, and libkiwix.

I hope to be at DebConf again next year!


Benchmarking with the NDSA Levels of Preservation