Demystifying Digital Storage Solutions for Indigenous Data Sovereignty

2019-11-08

The movement towards Indigenous Data Sovereignty is important because Indigenous Peoples have long been measured and classified by governments according to colonial interests. It is time to put the control over how and what is measured back into the hands of Indigenous Peoples. This article will not attempt to tackle the more challenging political aspects of Indigenous Data Sovereignty (see our Decolonizing Digital series) but will examine some data storage technologies that can be useful for Indigenous organizations moving towards data sovereignty.

Background

One of the things we noticed at Animikii is that many of our Indigenous clients are hiring us to build web applications that allow them to collect, store, and manage data. As Indigenous Nations and organizations grow, the need for Indigenous Data Sovereignty tools becomes both more apparent and more pressing.

The First Nations Information Governance Centre defined a set of principles for how Indigenous data should be collected, protected, used, or shared. This guideline is called OCAP®, for the four guiding principles: Ownership, Control, Access, and Possession.

Possession, as defined by OCAP®, means having physical control of the data, which in turn means that the responsibility for safeguarding that data falls on the Indigenous organization directly. 

This includes:

  • Ensuring that individual and community privacy is respected both within and outside the organization.
  • Ensuring that data is not lost in the event of a hardware failure or a natural disaster.
  • Sharing subsets of the data with organization staff, communities, or third parties and revoking access when necessary.

The Problem with “Cloud” Storage services

Many organizations choose to use cloud storage services like Dropbox or Google Drive, but these services may not be appropriate for Indigenous organizations, and especially not for Nation governments, because of varying global privacy laws. These services use encryption to prevent data from being accessed by unauthorized third parties. However, because the provider typically holds the encryption keys, it is important to note that the data can still be accessed by the provider and its employees.

With most cloud service providers, the way your data is secured is analogous to giving valuables to a concierge for safekeeping. You can instruct the concierge to only allow certain people to get the items, but that’s putting a lot of trust in the concierge and in the concierge’s ability to not make a mistake when they interpret your instructions about who should get access.

Even if an Indigenous organization were to fully trust the intentions of a cloud service provider and their ability to technically secure the data, they would still have to contend with the risk of the data being physically seized from a data centre located outside of Canada. This same concern is the reason why organizations connected with the Canadian government are legally required to store their data in data centres that are physically located in Canada.

So what does this mean for Indigenous organizations and Nations? Does every Nation out there need to build a data centre on their land in order to have true possession over the data? Not necessarily. Data can be stored in any data centre as long as the right encryption technology is used.

How End-to-End Encryption Can Help

This is where a technology called end-to-end encryption can help. To go back to the concierge analogy, securing data with end-to-end encryption is akin to putting your things in an unbreakable lockbox. You can leave the lockbox with a concierge and instruct the concierge to give the box to certain people, but unless you also give those people the key to the box, they won’t be able to see the contents.
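The lockbox idea can be sketched in a few lines of Python. This is a teaching sketch only, not a recommendation: it uses a toy stream cipher built from HMAC-SHA256 so that it runs with the standard library alone, and it does not check for tampering. A real deployment would use a vetted encryption library instead. The names lock, unlock, and Concierge are hypothetical and exist only for this illustration.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (HMAC-SHA256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def lock(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt on the owner's machine; the result is what gets handed to the provider."""
    nonce = secrets.token_bytes(16)  # fresh random nonce for every box
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def unlock(key: bytes, box: bytes) -> bytes:
    """Decrypt a box; only someone holding the right key gets the original bytes back."""
    nonce, ciphertext = box[:16], box[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


class Concierge:
    """Hypothetical storage provider: it stores and returns boxes but never sees a key."""

    def __init__(self) -> None:
        self._boxes: dict[str, bytes] = {}

    def store(self, name: str, box: bytes) -> None:
        self._boxes[name] = box

    def fetch(self, name: str) -> bytes:
        return self._boxes[name]


secret = b"community census records"
key = secrets.token_bytes(32)  # stays with the organization, never uploaded

concierge = Concierge()
concierge.store("census", lock(key, secret))

box = concierge.fetch("census")
recovered = unlock(key, box)  # the key holder reads the data
wrong = unlock(secrets.token_bytes(32), box)  # anyone else gets unreadable bytes
```

The point of the design is that access rules enforced by the concierge become a second line of defence: even if a box is handed to the wrong person, the contents stay unreadable without the key.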

In fact, this lockbox is so good that one of the risks with using end-to-end encryption is that if you lose your encryption keys, you can no longer access the data. This has happened before with Bitcoin, where holders who lost their private keys permanently lost access to their funds. Bitcoin is a different use case of encryption, but it serves as an example of how strong encryption can be. Each organization would have to develop the right protocols for storing paper copies or other offline versions of the keys in a safe place to ensure the data can be accessed, even if a key is misplaced.
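One way to make a paper copy of a key less error-prone is to write it down together with a short checksum, so that a copying mistake is caught when the key is typed back in. The following is a minimal sketch of that idea, assuming a 32-byte key; the function names and the format (base32 groups with a 4-byte SHA-256 checksum) are illustrative choices, not an established standard.

```python
import base64
import hashlib
import secrets


def key_to_paper(key: bytes) -> str:
    """Render a key as hyphen-separated groups that can be printed or written by hand."""
    checksum = hashlib.sha256(key).digest()[:4]  # detects copying mistakes
    text = base64.b32encode(key + checksum).decode("ascii").rstrip("=")
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


def paper_to_key(written: str) -> bytes:
    """Rebuild the key from its written form, rejecting copies that fail the checksum."""
    text = written.replace("-", "").replace(" ", "").upper()
    text += "=" * (-len(text) % 8)  # restore the base32 padding stripped above
    raw = base64.b32decode(text)
    key, checksum = raw[:-4], raw[-4:]
    if hashlib.sha256(key).digest()[:4] != checksum:
        raise ValueError("checksum mismatch: the key was copied incorrectly")
    return key


key = secrets.token_bytes(32)
paper = key_to_paper(key)
restored = paper_to_key(paper.lower())  # case and spacing do not matter

# Simulate a transcription error in the first character.
typo = ("B" if paper[0] == "A" else "A") + paper[1:]
try:
    paper_to_key(typo)
    typo_detected = False
except ValueError:
    typo_detected = True
```

The checksum only guards against mistakes. Anyone who finds the paper copy can read the key, so where the copy is kept matters as much as how it is written.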

End-to-End Encrypted Cloud Storage

For organizations that do not have the resources to operate their own data centre, there are cloud storage services that offer end-to-end encryption.

One option that is worth looking at is Resilio Connect. One of the advantages of Resilio Connect compared to the other products covered in this article is that it is based on the BitTorrent protocol, which means it should work well for Nations with slow Internet connections. This article explains how Resilio uses mesh networks to make the best of poor Internet connections.

Another product that is due to launch in November 2019 is Tardigrade, an encrypted distributed storage system that uses the Amazon S3 protocol. To put it plainly, what Tardigrade can do for organizations is run the data centers that store the organization’s data in a distributed fashion and maintain the Open Source software that manages how the data is stored and retrieved. Their website mentions that the software encrypts the data at the source (client-side encryption), before sending it over to the storage nodes. This service would allow Nations to have their data encrypted from the moment it’s uploaded to the storage location.

End-to-End Encrypted On-Premise Storage

Data storage technologies have advanced quite a bit and we are seeing some exciting possibilities for organizations looking to take possession of their data. 

From a technological standpoint, it is not necessary for organizations to maintain their own server hardware but, for the organizations that can afford it, it can be a worthwhile investment, as nothing beats having physical access to and possession of the data.

Decentralized end-to-end encrypted storage tools can allow organizations to host their data within the organization as well as have off-site backups without having to worry about the data being accessed by third parties without permission.

One example of this type of technology is the open-source Tahoe-LAFS distributed storage system, which is also available as a cloud solution called the Simple Secure Storage Service (S4).

The video on their homepage introduces how the technology works and mentions an important consideration when choosing a data storage server:

“If you just hand away data without encrypting it first, you’re trusting whoever you’re giving it to, to keep it safe”.

With a system like S4, when somebody is uploading data, the files are encrypted before they leave the computer to be sent to the storage nodes. This is very similar to the way Tardigrade operates. Because all the data is encrypted before it leaves the computer, the people operating the S4 service cannot read or modify the data.
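Saying that operators cannot modify the data means, in practice, that the client can detect any change made to what it stored. Here is a minimal sketch of that idea, assuming an HMAC tag is appended to the already-encrypted blob before upload. The names seal and open_sealed are hypothetical, and this does not describe how Tahoe-LAFS or S4 implement their integrity checks.

```python
import hashlib
import hmac
import secrets

TAG_LENGTH = 32  # size of an HMAC-SHA256 tag in bytes


def seal(key: bytes, blob: bytes) -> bytes:
    """Append an authentication tag so that later changes to the blob can be detected."""
    return blob + hmac.new(key, blob, hashlib.sha256).digest()


def open_sealed(key: bytes, sealed: bytes) -> bytes:
    """Return the blob if its tag is valid; raise ValueError if anything was altered."""
    blob, tag = sealed[:-TAG_LENGTH], sealed[-TAG_LENGTH:]
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("stored data was modified")
    return blob


key = secrets.token_bytes(32)  # held by the organization, not by the storage operator
blob = b"already-encrypted bytes"  # stands in for ciphertext produced on the client
sealed = seal(key, blob)

# The storage operator flips one bit of the stored data.
tampered = bytes([sealed[0] ^ 1]) + sealed[1:]

try:
    open_sealed(key, tampered)
    tampering_detected = False
except ValueError:
    tampering_detected = True
```

An operator can still delete or corrupt what it stores. What it cannot do without the key is produce a changed blob that passes the check.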

S4 also has a ‘Magic Folders’ feature, which means it is possible to designate certain folders on a computer to automatically sync to the distributed cloud storage system. Basically, any file you drop into that folder will be available on any other machine connected to the Tahoe-LAFS system.
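The core of any folder-sync feature is working out which files are new or changed since the last sync. The sketch below shows one simple way to do that by comparing content hashes. It illustrates the general idea only and is not the Magic Folders implementation; snapshot and files_to_sync are hypothetical names, and deleted files are ignored for brevity.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to the folder) to the SHA-256 digest of its contents."""
    return {
        str(path.relative_to(folder)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def files_to_sync(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """List files that are new or whose contents changed since the previous snapshot."""
    return sorted(name for name, digest in current.items() if previous.get(name) != digest)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "minutes.txt").write_text("council minutes")
    before = snapshot(folder)

    (folder / "minutes.txt").write_text("council minutes, revised")  # edited file
    (folder / "map.txt").write_text("territory map notes")  # new file
    after = snapshot(folder)

    changed = files_to_sync(before, after)
```

In an end-to-end encrypted system, each file in the changed list would be encrypted on the local machine before being sent to the storage nodes.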

The main difference between Tardigrade and S4 is that the latter seems to be aimed more at individuals, so the support level and ability to specify file access rules may not be appropriate for larger organizations. However, the Tahoe-LAFS technology itself would make a good candidate as a backbone technology for an Indigenous Data Sovereignty product. This technology is Open Source and was developed with funding from the Open Technology Fund.

Conclusion

Taking possession of data will bring with it new challenges for Indigenous organizations. For organizations with smaller tech budgets, end-to-end encrypted cloud storage could be a good solution, while organizations with larger budgets could look into building their own platform on top of end-to-end encrypted storage like Tahoe-LAFS or Tardigrade. 

At Animikii we are in the early stages of building a Data Sovereignty tool that can help smaller organizations collect, store, and manage access to their data. This article, along with our Decolonizing Digital series, invites our readers to explore not only what Indigenous Data Sovereignty is but also how we at Animikii can make Indigenous Data Sovereignty a reality.
