Upspin provides a global name space to name all your files.
Given an Upspin name, a file can be shared securely, copied efficiently without
“download” and “upload”, and accessed from anywhere that has a network
Its target audience is personal users, families or groups of friends.
Although it might have application in corporate environments, that is not its
Upspin provides a uniform naming mechanism for all data, along with
easy-to-understand and easy-to-use secure sharing, as well as end-to-end
encryption that guarantees privacy.
Upspin is not an “app” or a web service, but rather a suite of software
components, intended to run in the network and on devices connected to it, that
together provide a secure information storage and sharing network.
Upspin is a layer of infrastructure that other software and services can build
on to facilitate secure access and sharing.
Upspin looks a bit like a global file system, but its real contribution is a
set of interfaces, protocols and components from which an information
management system can be built, with properties such as security and access
control suited to a modern, networked world.
The rest of this document describes the problem Upspin is aiming to solve,
provides an outline of its structure and how it works, and describes some uses
both familiar and novel.
As information has moved from personal computers to networked servers, it has
also been separated from the people that use it.
Users have lost significant control of their data.
Instead of owning a CD or DVD, one pays to access a song or movie from a
provider such as Apple or Google, but the access can be rescinded if the user
loses the account.
Photos that one uploads to Facebook or Twitter are managed by, and effectively
belong to, those providers.
Other than occasional workarounds using a URL, information given to these
services becomes accessible only through those services.
If one wants to post a Facebook picture on one’s Twitter feed, one does that by
downloading the data from Facebook and then uploading it to Twitter.
Shouldn’t it be possible to have the image flow directly from Facebook to
This “information silo” model we have migrated to over the last few years makes
sense for the service providers but penalizes the users, those who create and
should therefore be in charge of their data.
There are surely advantages to hosting all one’s photos on a network service
rather than maintaining the archive oneself, but those advantages come with a
significant loss of control.
The situation also means that which data is available to a user depends on the
device being used.
For example, without special prearrangement, the data stored on one’s smart
phone is not visible from one’s computer, and vice versa.
Another issue is sharing.
Users have little control over who can access their data; essentially the
choice is “public”, where everyone can, or “private”, where only the user can,
and the workaround for other scenarios is the old pattern of upload, download,
and maybe mail or reposting or some other ad hoc mechanism.
There is also a security question.
Although it makes sense to share data in the network, that usually means the
service provider has access to the information itself.
Users should be able to gain the advantages of networked services but maintain
a measure of privacy when they so desire, and the rise of end-to-end encryption
in services demonstrates a way forward.
All of these problems are in principle easy to solve.
If every item of interest had a unique name, and every person, server, PC or
phone could evaluate that name to access the item, these problems would fall
Add to that proper end-to-end security and a coherent sharing model, and a
future with uniform, secure, ubiquitous access to data becomes feasible.
Download and upload, unauthorized access, siloed data, even email attachments
could become relics of the past.
Upspin is an attempt to enable this future by providing uniform access and
sharing methods for all of a user’s data.
Data can be accessed by any authenticated entity, be that a user or a server in
the network, but only when the user grants explicit permission to that entity.
The rest of this document outlines the pieces of Upspin and how it addresses
the problems of the current information world.
Users in Upspin are identified by an email address that, within Upspin, is
called the user name.
When the user is first registered with the system, the address is used to
verify the user’s identity; after that, it just serves as the user’s
identifier, for example,
Some email services provide a way to modify the user name to allow multiple
addresses to refer to the same user.
The usual syntax is to follow the user’s name before the
@ with a plus sign
and a suffix.
Upspin uses this technique to allow a user to own multiple Upspin services.
firstname.lastname@example.org might be the Upspin user name for an
Internet-connected video camera owned by
User names with suffixes belong to their primary user (the one without the
Primary users can therefore make changes to entries on the key server for their
name with a suffix.
email@example.com may change the keys and the store and directory
The converse, however, is not allowed:
firstname.lastname@example.org cannot make
User names with suffixes can be created only by their owners.
For this reason, they do not require a working email address.
Suffixed users are created using the
email@example.com can create the user ‘firstname.lastname@example.org`
$ upspin createsuffixeduser email@example.com
Every Upspin file name has the same basic structure.
It begins with the user’s name—an email address—followed by a slash-separated
Unix-like path name.
is an Upspin path name.
We say that
firstname.lastname@example.org is its owner.
As another example, the path name
is called the user root for
email@example.com, analogous to Unix’s
Looking at the full path,
firstname.lastname@example.org/dir/file, there is a directory
dir in the user root; it in turn contains the named file.
Any user with appropriate permission can access the contents of this file by
using Upspin services to evaluate the full path name.
Upspin names usually identify regular static files and directories, but the
naming scheme itself makes no guarantees about the kind of data served.
Some servers may offer dynamic content such as information generated by devices.
Also, some Upspin names identify links, which are analogous to Unix symbolic
A link allows a name in one tree to refer to an item anywhere else in the
Upspin name space, including a different user’s tree.
Thus links provide a way for a single user to organize into her own tree the
full set of Upspin names of interest.
Upspin has no concept of a local path name.
All Upspin path names are fully qualified, that is, they always begin with a
Upspin provides an easy-to-use model for sharing folders with any other Upspin
The access control mechanism determines who can read, write, delete, or even
discover the existence of Upspin files.
A separate document Access Control describes it in
detail, but the basic idea is simple.
By default, nothing is shared.
When a user is added to Upspin, all data created by that user is invisible to
If the user wishes to share a directory (the unit at which sharing is defined),
she adds a file called
Access to that directory.
In that file she describes, using a simple textual format, the rights she
wishes to grant and the users she wishes to grant them to.
For instance, an
Access file that contains
read: email@example.com, firstname.lastname@example.org
email@example.com to read any of the files in the
directory holding the
Access file, and also in its subdirectories.
If a subdirectory contains another
Access file, from that subdirectory
Access file completely replaces the rights granted by the
The owner (the user whose email address begins the Upspin path name) always has
permission to read all the files in the owner’s tree, as well as to update any
Access files within.
There is also a mechanism for defining groups of users for the purpose of
Full details are available in the access control document.
Upspin is implemented by three key pieces, each a networked server that
provides a simple RPC interface to its clients.
The pieces are:
1) A key server.
This is the server that holds the public (not private!) keys for all the users
in the system.
It also holds the network address for the directory server holding each user’s
For the global Upspin ecosystem, this service is provided by a server running
All users can save their public keys there, and can in turn ask the key server
for the public keys of any other user in the system.
2) A storage server.
These servers hold the actual data for the items in the system.
We expect there to be many storage servers, typically one or more per user or
perhaps family or organization.
Items are stored not by their name but by a reference, which for the default
implementation is a hash computed from the contents of the data.
3) A directory server.
These servers give names to the data held in the storage servers, on behalf of
the Upspin users.
Like storage servers, we expect there to be many directory servers.
Often, a directory server will hold only a single user’s tree, or perhaps the
trees of a single family.
The Upspin naming model allows this to work well.
Moreover, the separation of storage and naming means that the directory
structure for a single user is a modest amount of data even when it references
many large files.
key.upspin.io provides a single place to store all public keys.
Users’ actual data is stored across multiple directory and storage servers in
the network; these servers are run by the users themselves (or by agents on
A typical operation, say getting the contents of a file, is implemented using
For example, to read the file
firstname.lastname@example.org/file, the operation proceeds
- Extract the user name of its owner, which is always the beginning of
the Upspin name:
- Look up
email@example.com in the key server at
key.upspin.io to find the
network address of her directory server.
- Contact that directory server and ask it to evaluate the full Upspin name.
It returns the network address of the storage server holding the contents, and
the references under which they are stored.
- Contact the storage server to retrieve the data.
Of course this sequence is packaged into a library, called the client, that
is provided as part of the Upspin project.
It is also available through command-line tools or through a FUSE plugin for
Unix systems that turns the Upspin name space into a Unix file tree.
The storage and directory servers may be run anywhere, but we expect most to be
run as cloud services for easy availability, scalability, and maintenance.
The interfaces in Upspin make it easy to insert caching layers between the
client and servers to mitigate the cost of remote
Although Upspin allows a user to store and share data without encryption, we
expect most users will want their data to be protected from unauthorized access,
and so the system offers high security as the default.
There are two separate issues about security.
One is deciding who can access an item; that is the subject of the Access
section of this document.
Here we talk about a deeper guarantee, the promise that even if an intruder
breaks into the network servers, the user’s data is still protected.
By default, the content of files stored in Upspin is encrypted.
(File metadata stored in the directory hierarchy, such as the file name and the
list of users permitted to read the file, is guarded only by access controls in
the directory server, as the server must be able to navigate the tree.) The
user keeps, in a private location not part of Upspin, a key that is used both
during encryption of the data before it is written, and during decryption when
Both the encryption and decryption happen on the user’s client machine, not in
the network or on Upspin servers.
This is called end-to-end encryption, and prevents a snoop
from being able to read the user’s data by tapping the network or the
To share a file with a second user, that user must also be able to decrypt it.
Upspin handles this automatically, using encryption techniques that allow two
users to share encrypted data without disclosing their private keys to each
The public keys of all users are registered in a central server to enable
sharing even between strangers.
The details of the encryption algorithm and the security guarantees they can
make are described in a separate Upspin Security document.
Upspin also provides authentication mechanisms to block unauthorized users from
accessing the network servers in the first place.
The solution to this is also covered in the security document.
With the software components, protocols, and security mechanisms Upspin
provides one can construct secure, shared, distributed information systems.
An obvious example is a distributed file system, and the reference
implementation of the system provides exactly that: a global collection of
uniquely-named files that can be read, written, and shared securely.
It also provides, for Unix systems at least, a mechanism (the
daemon) to connect that distributed file system to the local file tree.
But it is important to understand that Upspin can name information from any
data service, not just traditional files.
In this section we mention a few possibilities, but many more can be imagined.
First, due to the content-addressable form of the standard storage server’s
references for static storage, the server can provide a simple, efficient
mechanism for backups, called a snapshot, and present it as an Upspin service
alongside the data it is preserving.
In effect, a snapshot provides a historical view of the tree at regular
intervals, so a user can view the data from the past using regular Upspin (or
It is presented by the same server and constructed automatically, and cheaply,
by building a secondary tree of historical roots of the user’s tree, named by
It is identified by the suffix
+snapshot attached to the user name.
firstname.lastname@example.org would be able to access her tree from December
3, 2016 through the Upspin tree rooted at
The files in that tree are not copies of the originals, but immutable
references to the same files as they appeared on that date.
The APIs for Upspin are easy to implement, making it simple to transform
existing data services into named, file-like data that can be joined to the
Upspin (or through
upspinfs, Unix) name space.
One could imagine simple but convenient connectors to do things like provide a
Twitter feed through a file system interface, or a greppable issue tracker for
GitHub bug reports, or an aggregator for music or video that provides a unified
view of all of one’s entertainment subscriptions.
As a more dynamic example, earlier we mentioned the idea of a connected device
such as a video camera.
The owner of the camera could register a special user to host the camera or, as
with the snapshot, serve it through a suffixed name like
The full path might be
email@example.com/video.jpg, with the idea that
every read of the file retrieves the most recent frame.
What is included
Upspin is an open source project, so naturally all the source code is available.
(See the section on contributing if you are interested in helping to develop
That source serves several purposes.
It includes definitions of the fundamental interfaces in the system (see the
Any service that implements one of the server interfaces called
StoreServer can participate in the Upspin ecosystem.
The source tree includes reference implementations of all three of these, of
course, and these are what is run at
The source tree also includes a couple of unusual implementations that use the
Upspin interfaces to provide access to services not usually thought of as
These include the snapshot mechanism, integral to the storage server and
described in the previous section, as well as a few experimental
components in the directory
Finally, the system includes a cache server that interposes between the remote
servers and the local client.
This implement exactly the same interfaces, so its existence is transparent to
key.upspin.io we run a
KeyServer and allow authenticated users to
register there and share in that name space.
We encourage anyone interested in using the system to register.
StoreServer, you will need to deploy your own
If you want to use the reference implementations, you can run your own instance
upspinserver by following
the server setup instructions.
Or you could write your own (and we encourage you to think about doing so).
It’s up to you.
Upspin is just a way to access things; what and where those things are is your
To join the Upspin system, a user needs to have a set of keys to access the
The private key is kept private and is the user’s responsibility to save and
The public key on the other hand must be made public so all participating
servers can authenticate the user.
The public key is made truly public by creating a user record to hold it in the
key server at key.upspin.io.
The process for doing this is described in a
Once registered in the key server, the user can access existing content stored
in Upspin, assuming permission is granted.
To store new data requires setting up a place to store it, that is, finding or
running Upspin directory and store servers to host the data.
To do so may require working with friends or family, signing up to an
organization that runs Upspin servers for outside users, or launching personal
instances on local machines or cloud systems.
The addresses of the user’s servers are then stored in the public key server’s
record for the user, at which point the user can start saving data in the
Again the details are presented separately.
Comparison with existing systems
Upspin has a number of similarities and differences when compared to other
systems that aim to give users secure shared access to data.
Rather than make explicit comparisons to such systems, this section focuses on
the salient or unusual aspects of its design that, when put together, make
The fundamental purpose of Upspin is universal access to secure, sharable
Those three italicized terms all matter.
Universal means that no single entity maintains the data; Upspin is in effect
(Key management works well with a single space of keys, but it is not a
requirement that a single server host all the keys; the current single server
can and likely will develop mirrors and replicas.) Any Upspin user can, with
permission, access any item on any Upspin server.
Secure means data can be, and usually will be, protected through end-to-end
encryption that not only guards it from prying eyes, but provides a controlled,
safe mechanism for sharing, by giving keys only to those who have the right to
unlock the data.
Sharable means that there is an easy-to-use, easy-to-understand mechanism for
sharing items with individuals, groups, or the public.
Despite the fine-grained nature of the access permission model, it’s always
easy to see who is allowed to access a file, easy to grant access, and easy to
Many systems provide either all-or-nothing models (public and private), or
allow fine-grained sharing but in a way that is not visible as part of the data
space itself, and not enforced by end-to-end encryption.
The target user for Upspin is also unusual.
Upspin is intended for securing and sharing personal data, and is not designed
for corporate information systems.
Although it could be used for corporate work, the fine-grained sharing model
and use of individual keys could make that clumsier than it would be for most
For individuals or small groups, however, Upspin works well.
Also, Upspin’s system architecture is unusual.
It is at its core just a set of simple APIs that anyone may implement; it is
not a data service one must acquire from a particular provider.
Although there are reference implementations that provide many of the features
we feel are central to the system, we expect and hope that many other
implementations will arise, allowing a wide variety of information services to
coexist in the Upspin space.
One particularly distinct element of Upspin’s design is the separation of
directory and storage servers.
This separation has a number of properties, most important of which is that it
guarantees that the directory servers are not granted any access to users’ data
beyond knowing where it is located.
Directory servers never even see user’s data, even unencrypted data, other than
file names and access permission information.
A user’s data need not even be hosted by the same organization that hosts the
Installing and Contributing
The source for Upspin is hosted on Gerrit
and mirrored to GitHub.
Install Go if you haven’t
already (golang.org/dl) and then run
$ git clone https://upspin.googlesource.com/upspin
$ cd upspin
$ go install cmd/...
There are useful auxiliary commands and interfaces to various cloud storage
providers in (upspin.googlesource.com).
Guidelines for contributing back to the project are in a