How to set up your own internet time machine on a virtual server using ArchiveBox

Hello everyone, PQ.Hosting is here! My name is Igor, and since 2024 I have been working in the company's technical support. And starting from this day, I will also be writing for our page on tekkix ;)

And you know what I noticed during my work? Many people think that a virtual server or even a dedicated one is not very interesting. Well, what is the maximum you can do with it? Host an online store or any other website - not much fun.

That's why I took it upon myself to add a bit of rock and roll and show that a server is actually gigabytes of fresh information and a virtual techno-laboratory for interesting projects, experiments, and even professional growth. The main thing is to have a smartphone or computer with an SSH client such as OpenSSH at hand. And all this for the price of a couple of cups of coffee a month.

Get ready - I plan to write a whole series of articles that will cover useful and unusual self-hosted services that can be set up on VPS or Dedicated.

By the way, if you want me to talk about a specific service, be sure to write about it in the comments - I will consider all suggestions in detail!

In the first issue, I will talk about ArchiveBox - a service that allows you to independently launch an analogue of the Wayback Machine.

A few more words about the ArchiveBox project

ArchiveBox is an open-source utility designed for archiving websites. It allows you to collect, save, and conveniently organize snapshots of the pages you are interested in.

In general, the service replicates the functionality of the well-known Wayback Machine. However, there is one fundamental difference: ArchiveBox can also save non-public pages, including content from social networks, YouTube, SoundCloud, and others.

At the same time, as with all other self-hosted applications, you have full control over all the information and are not dependent on the decisions of the service owners. And anything can happen to the Internet Archive - let's remember at least the multi-day outage that happened quite recently.

What equipment is required for installation

Judging by the information on the developer's website, ArchiveBox does not require the most powerful hardware. For example, in theory, the service can even run on a single-board computer Raspberry Pi 3, which was released 8 years ago.

For the test, I will use our entry-level VPS Aluminium with one Xeon E5-2697A processor core, 1 GB of RAM, and a 25 GB SSD. This configuration should be more than enough for the job.

Installing ArchiveBox on a virtual server to create an internet time machine

What needs to be done before installation

In theory, ArchiveBox can also be installed using Apt, pip, or Pacman, but the developers strongly recommend using Docker for installation. Well, let's be chill guys and not resist the advice of the software authors.

Or maybe we should have been bad guys and installed the service through a package manager? Write in the comments which option you would choose.

If Docker is not yet installed on your server, I will leave a small instruction on how to do it from the apt repository.

First, update the package lists and add Docker's GPG key:

sudo apt-get update

sudo apt-get install ca-certificates curl

sudo install -m 0755 -d /etc/apt/keyrings

sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc

sudo chmod a+r /etc/apt/keyrings/docker.asc

Then add the repository to apt and update the list again:

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
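The long one-liner above is easier to follow once you see that only two parts of it vary: the CPU architecture and the Ubuntu codename. Here is a small sketch that builds the same repository line from those two values. The function name docker_repo_line is made up for this example, and amd64 and noble are only sample values; on a real server both are detected automatically by the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker's instructions): prints the apt
# source line for a given CPU architecture and Ubuntu codename.
docker_repo_line() {
    arch="$1"      # e.g. the output of: dpkg --print-architecture
    codename="$2"  # e.g. VERSION_CODENAME from /etc/os-release
    printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu %s stable\n' "$arch" "$codename"
}

# Sample values only, used here to show what ends up in docker.list.
docker_repo_line amd64 noble
```

If the contents of /etc/apt/sources.list.d/docker.list on your server look like this line, with your own architecture and codename, the repository was added correctly.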


After that, you can install Docker itself:

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin


If you are worried that something went wrong, you can run a command to check the functionality of Docker on your machine:

sudo docker run hello-world

If everything is fine, the terminal will show a greeting that begins with "Hello from Docker!".

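If you would rather check this from a script than by eye, here is a minimal sketch. It only tests whether the docker command is available and prints its version; the helper name have_cmd is made up for this example.

```shell
#!/bin/sh
# Sketch: succeed if the given command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
    # Prints the installed client version.
    docker --version
else
    echo "docker not found: repeat the installation steps above"
fi
```

This does not replace the hello-world test, since it does not start a container, but it is a quick first check after installation.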

How to install

There are many options for installing ArchiveBox, but here I decided to go the KISS route and use the Docker run installation option. It seemed less complicated to me — I have enough difficulties and hassles during the workday.

First, let's create a directory where the site snapshots will be stored:

mkdir -p ~/archivebox/data && cd ~/archivebox/data


After that, let's finally run the ArchiveBox installer:

docker run -v $PWD:/data -it archivebox/archivebox init --setup


Next, the installer will prompt you to create a user for the web interface (this will come in handy later).


Come up with a username, then enter your email and password.

And everything is ready! You can start using ArchiveBox!

How to use ArchiveBox

There are 2 options here. The first is to continue using ArchiveBox directly in the terminal. For example, to add a link that the utility will store and track, just one command is enough:

echo 'https://example.com' | docker run -i -v $PWD:/data archivebox/archivebox add
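Since the add command reads links from standard input, you can also feed it a whole list at once. Below is a rough sketch under a few assumptions of mine: the links live in a file called urls.txt, one per line, and the helper clean_urls (a made-up name) keeps only the lines that start with http:// or https://, so comments and blank lines are skipped.

```shell
#!/bin/sh
# Sketch: print only the lines of a file that look like http(s) URLs.
# The function name and the urls.txt file are assumptions for this example.
clean_urls() {
    grep -E '^https?://' "$1"
}

# Add every URL from urls.txt in one go.
# Runs only if Docker is installed and the file exists in the current directory.
if command -v docker >/dev/null 2>&1 && [ -f urls.txt ]; then
    clean_urls urls.txt | docker run -i -v "$PWD":/data archivebox/archivebox add
fi
```

Run it from the same data directory as the other commands so that $PWD points at your archive.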

And to view all saved sites, you will need to enter:

docker run -v $PWD:/data -it archivebox/archivebox list

The full list of commands is available on the project's GitHub page.

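The second option is to use the web interface. Whether this is convenient or not is up to you. True Linux users will grumble, but it's nice to sometimes switch from the harsh console to a graphical interface.

To do this, start the web server with the first command below. The second one is optional: run it in another terminal session to print the list of available commands.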

docker run -v $PWD:/data -p 8000:8000 archivebox/archivebox

docker run -v $PWD:/data -it archivebox/archivebox help

After that, open http://your-server-IP:8000 in your browser. You should land on the ArchiveBox index page. Log in with the user you created earlier, and all that's left is to add the sites you're interested in!
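The server command above keeps your terminal occupied while it runs. If you would rather leave ArchiveBox running in the background, one possible approach is Docker's detached mode with a restart policy. The sketch below only prints such a command so you can review it first. The container name archivebox, the DATA_DIR path, and the PORT value are my assumptions, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch: build the docker command line for a background ArchiveBox server.
# DATA_DIR and PORT are example defaults; "archivebox" is just a container name.
DATA_DIR="${DATA_DIR:-$HOME/archivebox/data}"
PORT="${PORT:-8000}"

server_cmd() {
    # -d runs the container in the background,
    # --restart unless-stopped brings it back after a reboot or crash.
    echo "docker run -d --name archivebox --restart unless-stopped -v $DATA_DIR:/data -p $PORT:8000 archivebox/archivebox"
}

# Print the command for review; paste it into the terminal to start the server.
server_cmd
```

To stop the background server later, docker stop archivebox should be enough.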


That's it — you have your own internet time machine! And one that is practically independent of various external factors. No one will complain about the content on your personal VPS, and you will be able to save any information for the future. And besides, it's all very easy: if Docker is already on your server, you will literally need to enter a couple of commands in the terminal.

The virtual server Aluminium, on which everything was tested, costs only 4.77 euros. And with our promo code HABR you can get it and other tariffs even cheaper — with a 15% discount! Go to the website, choose the right service and order a fast and reliable VPS/VDS in one of 42 locations around the world.

And remember that the server is not boring. It gives you the opportunity to be creative and spend time usefully. ArchiveBox is just the beginning! There are many more interesting reviews of self-hosted services waiting for you.
