Running a Local Arweave Gateway for Development

About this Lesson

Greetings! Welcome to the first practical lesson of the Arweave 201 track. In the intro lesson, I explained how Arweave ensures permanent data storage and how gateways enable access to data stored on Arweave. This lesson will be hands-on! You'll learn how to install and configure an Arweave gateway for local development.

A local gateway is useful when developing DApps for the Permaweb, and setting one up is a good way to understand the process before running a gateway in production.

Prerequisites

To complete this lesson, you need a basic understanding of HTTP, TLS/SSL, DNS, Linux, Node.js, and Docker.

A basic understanding of blockchains (e.g., transactions and wallets) is also helpful.

You also need a Linux system with Docker, Git, and Node.js.

Arweave Gateways Explained

There are currently two types of Arweave nodes: miners and gateways (Figure 1).

Figure 1: Arweave gateways

The miner nodes are responsible for data storage. They form the Arweave proof-of-work blockchain network and are incentivized to store data permanently by Arweave’s native AR token.

The gateway nodes are responsible for data delivery to clients. There are several gateway implementations, but they all index and cache on-chain data to ensure clients can access it quickly. The implementation you will use in this course is the AR.IO gateway.

AR.IO gateways form their own proof-of-stake network controlled by smart contracts on Arweave. They are incentivized by IO tokens (similar to ERC-20 tokens) generated via sales of ArNS names, which are Arweave’s equivalent of DNS domains that get resolved by AR.IO gateways (Figure 2).

Figure 2: AR.IO Network

The AR.IO gateway is implemented as a Docker Compose application. It consists of containers for data delivery, data upload to Arweave, and third-party containers that manage the gateway’s internal state and routing. For this lesson, only the core and envoy containers are important; the rest are experimental, optional, or both.

Figure 3: AR.IO Gateway containers

Now that you understand our Arweave gateway of choice, let’s set one up!

Installing the Gateway

To install the gateway, you only need a docker-compose.yaml file and a .env file. To make the gateway easier to use, you will also initialize a Node.js project for scripting and add an SSL proxy for HTTPS access to your gateway.

mkdir arweave-gateway && cd arweave-gateway
npm init -y
npm i -D local-ssl-proxy

Next, you need three scripts. One will start all Docker containers in the background, one will stop them, and one will start an SSL proxy so you can test web APIs that require HTTPS. Update the scripts in your package.json file:

"scripts": {
  "start:gateway": "docker compose up -d",
  "stop:gateway": "docker compose down",
  "start:proxy": "local-ssl-proxy --source 443 --target 3000"
},

The docker-compose.yaml of the gateway contains the basic setup configuration. It defines the container images, storage, and network configuration. Download it from GitHub with this command:

curl https://raw.githubusercontent.com/ar-io/ar-io-node/develop/docker-compose.yaml > docker-compose.yaml

You can use a .env file to add your custom configuration. Create one with the following content:

TRUSTED_NODE_URL=https://arweave.net
START_WRITERS=false
RUN_OBSERVER=false
RUN_RESOLVER=false

This is the absolute basic configuration that doesn't start unnecessary containers or processes. It uses arweave.net to resolve all requests; nothing will be indexed locally. However, it caches the responses.

Setting the START_WRITERS variable to false prevents the start of the worker processes that synchronize the Arweave blockchain to your gateway. You need them if you want to handle everything from your gateway without falling back to the trusted node, but without additional configuration, they would start fetching the whole chain, which can take a few weeks to complete.

The observer and resolver are containers that run alongside your gateway; they aren't needed for most development tasks, so you can deactivate them.

Testing the Gateway

To test your gateway, run the following command:

npm run start:gateway

Docker will start pulling the container images and start each container. After a few seconds, you can access your gateway via http://localhost:3000/ in your browser.

If the gateway started correctly, you will see info about your gateway that looks like this:

{
  "network": "arweave.N.1",
  "version": 5,
  "release": 70,
  "height": 1462197,
  "current": "IY0Hw4RZIARVrln-VhP_kPbjBYzU5txT_njzFI8y5xljgIGgB2A1nhuHaRq3gzOh",
  "blocks": 1462198,
  "peers": 336,
  "queue_length": 0,
  "node_state_latency": 1
}
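If you prefer to check the response from a script rather than the browser, the following is a minimal sketch. It assumes Node.js 18 or later (for the global fetch) and a gateway listening on http://localhost:3000. The list of expected fields is taken from the sample response above; treating them as required is an assumption of this sketch, not something the gateway guarantees.

```javascript
// Fields copied from the sample response; treating them as required is an assumption.
const EXPECTED_FIELDS = ['network', 'height', 'current', 'peers'];

// Returns the names of expected fields that are absent from the info object.
function missingInfoFields(info) {
  if (info === null || typeof info !== 'object') {
    return [...EXPECTED_FIELDS];
  }
  return EXPECTED_FIELDS.filter((field) => !(field in info));
}

// Fetches the gateway's root endpoint and throws if the response looks wrong.
async function checkGateway(baseUrl = 'http://localhost:3000') {
  const response = await fetch(new URL('/', baseUrl));
  if (!response.ok) {
    throw new Error(`Gateway answered with HTTP ${response.status}`);
  }
  const info = await response.json();
  const missing = missingInfoFields(info);
  if (missing.length > 0) {
    throw new Error(`Info response is missing: ${missing.join(', ')}`);
  }
  return info;
}
```

With the gateway running, you could call checkGateway().then((info) => console.log(info.height)) to print the current block height.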

When you're done, you can stop the gateway with this command:

npm run stop:gateway

Enabling HTTPS

If you want to access the gateway via HTTPS, for example, to test a DApp on the Permaweb that uses a browser feature only available in a secure context, start the gateway and the proxy like this:

npm run start:gateway
npm run start:proxy

Now, you can access your gateway via https://localhost.

Enabling ArNS Names

ArNS names are subdomains of your gateway domain; if you don’t host your gateway behind a domain, you can’t use this feature. To test an ArNS name, you must update your hosts and .env files.

On Linux, you will find the hosts file at /etc/hosts.

On Windows, you will find the hosts file at C:\Windows\System32\drivers\etc\hosts.

Add the following line to this file to enable a specific ArNS name:

127.0.0.1 ardrive.arweave.local

Then add the following line to the .env file:

ARNS_ROOT_HOST=arweave.local

After you have made the changes, restart your gateway. You should be able to access the ArDrive Permapage via https://ardrive.arweave.local/.

Sandboxing

The ARNS_ROOT_HOST variable also activates sandboxing. Gateways will redirect URLs with transaction IDs (TXIDs) to a dynamically generated subdomain. This ensures all websites delivered by a gateway are sandboxed and can’t access each other's content.

https://arweave.local/transaction-id-99999

becomes

https://sandbox-id-99999.arweave.local/transaction-id-99999

However, the hosts file doesn’t support wildcard subdomains, so you must either add every sandbox domain manually or (which is easier) remove the ARNS_ROOT_HOST variable to deactivate sandboxing again.
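If you do want to add individual sandbox domains to your hosts file, you need to know the subdomain for a given TXID. The sketch below assumes the gateway derives the sandbox subdomain by base32-encoding (lowercase, unpadded) the decoded transaction ID, which is how arweave.net-style gateways are commonly described. This is an assumption; compare the result with the redirect your gateway actually sends before relying on it. The function names are hypothetical.

```javascript
const BASE32_ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567';

// RFC 4648 base32 without padding, lowercase.
function toBase32(bytes) {
  let bits = 0; // number of bits currently buffered
  let value = 0; // the buffered bits
  let output = '';
  for (const byte of bytes) {
    value = (value << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      output += BASE32_ALPHABET[(value >>> (bits - 5)) & 31];
      bits -= 5;
    }
    value &= (1 << bits) - 1; // keep only the bits not yet emitted
  }
  if (bits > 0) {
    output += BASE32_ALPHABET[(value << (5 - bits)) & 31];
  }
  return output;
}

// Assumed derivation: base32 of the base64url-decoded TXID, prefixed to the root host.
function sandboxHost(txId, rootHost) {
  return `${toBase32(Buffer.from(txId, 'base64url'))}.${rootHost}`;
}

// Builds the line you would add to your hosts file for one TXID.
function hostsLine(txId, rootHost = 'arweave.local') {
  return `127.0.0.1 ${sandboxHost(txId, rootHost)}`;
}
```

Calling hostsLine with a TXID prints a line in the same format as the ArNS entry shown earlier, so you can append it to the hosts file by hand.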

Basic Configuration Options

Now that you have your own gateway up and running, let's look at the basic configuration options you can set up via your .env file.

TRUSTED_NODE_URL

This is an Arweave miner node URL your gateway will use to get network information and fetch data it hasn't cached locally.

ARNS_ROOT_HOST

This is the domain or hostname of your gateway. If you use non-standard HTTP/HTTPS ports, include the port. If you used a hosts file to set the domain, remember to add entries for all sandbox/ArNS subdomains as well.

If not set, ArNS names and transaction sandboxing will be disabled.

LOG_LEVEL

This variable allows you to control the logger's output. By default, it's set to notice; if you set it to info, you drastically lower the log output.

ADMIN_API_KEY

The password for all endpoints under /ar-io/admin. You must send this password in the Authorization header as a bearer token.

By default, the gateway generates a random password.

START_WRITERS, START_HEIGHT, and STOP_HEIGHT

The START_WRITERS variable controls the worker processes that synchronize and index blockchain data with your gateway. The writers are activated by default; synchronizing the whole chain can take multiple weeks. Set the START_HEIGHT variable before starting the workers to ensure you only index the latest transactions.

💡

Note: If you set START_WRITERS to true without setting a START_HEIGHT value, or if you later change the START_HEIGHT value, you have to delete your index files and restart your gateway to start indexing with the new value.

You can get the latest block height from https://arweave.net/height; your gateway will eventually index all transactions you sent after this block. This can save you time when synchronizing.

If you also use STOP_HEIGHT, you can force your gateway to index only a specific time window of the chain, which is helpful for debugging.
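As a small illustration of the steps above, the following sketch fetches the current height and prints matching .env lines. It assumes Node.js 18 or later for the global fetch and that https://arweave.net/height returns the height as plain text; the helper names are hypothetical.

```javascript
// Builds the .env lines for indexing from startHeight (and optionally up to stopHeight).
function heightEnvLines(startHeight, stopHeight) {
  if (!Number.isInteger(startHeight) || startHeight < 0) {
    throw new RangeError('startHeight must be a non-negative integer');
  }
  const lines = ['START_WRITERS=true', `START_HEIGHT=${startHeight}`];
  if (stopHeight !== undefined) {
    if (!Number.isInteger(stopHeight) || stopHeight < startHeight) {
      throw new RangeError('stopHeight must be an integer >= startHeight');
    }
    lines.push(`STOP_HEIGHT=${stopHeight}`);
  }
  return lines.join('\n');
}

// Reads the latest block height from a trusted node.
async function fetchCurrentHeight(nodeUrl = 'https://arweave.net') {
  const response = await fetch(new URL('/height', nodeUrl));
  if (!response.ok) {
    throw new Error(`Height request failed with HTTP ${response.status}`);
  }
  return Number.parseInt(await response.text(), 10);
}
```

For example, fetchCurrentHeight().then((height) => console.log(heightEnvLines(height))) prints lines you can paste into your .env file.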

RUN_OBSERVER & RUN_RESOLVER

The observer is an additional container that runs alongside your gateway to check that other gateways in the network behave appropriately. It starts by default, so you must set this variable to false in development.

The resolver is responsible for resolving ArNS names, but you can deactivate it and let your trusted node resolve ArNS names.

Advanced Configuration Options

Sometimes, you need more control over your gateway. There are plenty of options for customizing your gateway.

Modifying the Data Location

If you want to change the locations where the gateway stores its data, you can use the following variables:

CHUNKS_DATA_PATH, CONTIGUOUS_DATA_PATH, and HEADERS_DATA_PATH define the raw transaction data paths. Multiple gateways can share the same locations to save space and traffic. Only one should have START_WRITERS set to true so they don’t overwrite each other's downloads.

The index and caching paths are defined by SQLITE_DATA_PATH, TEMP_DATA_PATH, LMDB_DATA_PATH, REDIS_DATA_PATH, REPORTS_DATA_PATH, WARP_CACHE_PATH, and ARNS_CACHE_PATH. These paths must be unique for every gateway.

Unbundling Bundled Transactions

Arweave has two types of transactions:

  1. Layer 1 transactions are like Bitcoin transactions.
  2. Layer 2 transactions (called bundled transactions) are stored inside the transaction body of a layer 1 transaction.

Usually, a gateway will only index layer 1 transactions, but you can configure it to unbundle the bundled ones. This way, your gateway doesn’t have to fetch them from your trusted node. Keep in mind that this will inflate the stored data.

You can activate unbundling by setting the ANS104_UNBUNDLE_FILTER and ANS104_INDEX_FILTER variables.

They have the same format as the webhooks filter:

{"never": true} // (Default)

{"always": true}

{"attributes": { "owner": <owner key>, ... }}

{"tags": [
  { "name": <utf8 tag name>, "value": <utf8 tag value> },
  { "name": <utf8 tag name> },
  ...
  ]
}

{ "and": [ <nested filter>, ... ]}

{ "or": [ <nested filter>, ... ]}

To manage your storage, you can use the owner attribute to ensure only transactions from your wallet are stored.
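Writing nested filter JSON by hand is error-prone, so a few helpers can build it for you. This is a sketch based only on the filter format listed above. The helper names are hypothetical, and wrapping the value in single quotes assumes your Docker Compose version strips surrounding quotes from .env values.

```javascript
// Matches transactions by owner, following the "attributes" format above.
function ownerFilter(ownerKey) {
  return { attributes: { owner: ownerKey } };
}

// Matches a tag by name, and by value too if one is given.
function tagFilter(name, value) {
  return { tags: [value === undefined ? { name } : { name, value }] };
}

// Combines several filters so that all of them must match.
function allOf(...filters) {
  return { and: filters };
}

// Serializes a filter into a single .env line (quoting style is an assumption).
function toEnvLine(variable, filter) {
  return `${variable}='${JSON.stringify(filter)}'`;
}

// Example: only index data items from one owner ("<owner key>" is a placeholder).
console.log(toEnvLine('ANS104_INDEX_FILTER', ownerFilter('<owner key>')));
```

The same helpers work for any variable that accepts this filter format, since they only produce JSON.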

Calling APIs via Webhooks

The gateway can call webhooks if it encounters specific transactions or blocks while indexing.

The WEBHOOK_TARGET_SERVERS variable accepts a string with comma-separated API URLs. The endpoints must accept POST requests.

The WEBHOOK_INDEX_FILTER and WEBHOOK_BLOCK_FILTER allow you to filter for transactions and blocks.

The filters have the following format:

{"never": true} // (Default)

{"always": true}

{"attributes": { "owner": <owner key>, ... }}

{"tags": [
  { "name": <utf8 tag name>, "value": <utf8 tag value> },
  { "name": <utf8 tag name> },
  ...
  ]
}

{ "and": [ <nested filter>, ... ]}

{ "or": [ <nested filter>, ... ]}

💡

Note: You must enable unbundling for your gateway to ensure the webhooks trigger for bundled transactions. This includes the transactions you uploaded via the Turbo SDK.
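To try webhooks locally, you need an endpoint that accepts POST requests. The following is a minimal receiver sketch using Node's built-in http module. It assumes the gateway sends a JSON body; the exact payload shape isn't covered in this lesson, so the sketch only parses and logs whatever arrives. The function names and the port in the usage note are hypothetical.

```javascript
const http = require('node:http');

// Parses the request body as JSON; returns null if it isn't valid JSON.
function parseWebhookBody(raw) {
  try {
    return JSON.parse(raw);
  } catch {
    return null;
  }
}

// Creates a server that accepts POST requests and hands the payload to onEvent.
function createWebhookServer(onEvent = console.log) {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405, { Allow: 'POST' }).end();
      return;
    }
    let raw = '';
    req.setEncoding('utf8');
    req.on('data', (chunk) => {
      raw += chunk;
    });
    req.on('end', () => {
      const payload = parseWebhookBody(raw);
      if (payload === null) {
        res.writeHead(400).end('invalid JSON');
        return;
      }
      onEvent(payload);
      res.writeHead(204).end();
    });
  });
}
```

Start it with createWebhookServer().listen(4000) and set WEBHOOK_TARGET_SERVERS to a URL under which the gateway container can reach that port.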

Content Moderation

You might not want to serve all data on Arweave. To block specific transactions, you can use the admin API like this:

curl -X PUT -H "Authorization: Bearer <ADMIN_KEY>" \
  -H "Content-Type: application/json" \
  "http://<HOST>:<PORT>/ar-io/admin/block-data" \
  -d '{ "id": "<ID>", "notes": "Example notes", "source": "Example source" }'

Make sure you set the ADMIN_API_KEY inside your .env file.
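If you want to block transactions from a script instead of the command line, the curl call above can be mirrored with fetch. This sketch assumes Node.js 18 or later; the function names are hypothetical, and the request fields are copied from the curl example.

```javascript
// Builds the URL and fetch options equivalent to the curl command above.
function blockDataRequest({ baseUrl, adminKey, id, notes = '', source = '' }) {
  return {
    url: new URL('/ar-io/admin/block-data', baseUrl).toString(),
    options: {
      method: 'PUT',
      headers: {
        Authorization: `Bearer ${adminKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ id, notes, source }),
    },
  };
}

// Sends the request and throws if the gateway doesn't answer with a success status.
async function blockData(params) {
  const { url, options } = blockDataRequest(params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Blocking failed with HTTP ${response.status}`);
  }
  return response;
}
```

A call could look like blockData({ baseUrl: 'http://localhost:3000', adminKey: process.env.ADMIN_API_KEY, id: '<ID>' }), with the key read from the environment rather than hard-coded.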

Simulating Request Failures

With the SIMULATED_REQUEST_FAILURE_RATE variable that goes from 0 to 1, you can define the probability of request errors. This way, you can check how clients handle a flaky gateway.
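One way a client can cope with a flaky gateway is to retry failed requests with a growing delay. The sketch below is one possible approach, not part of the gateway itself. It treats both network errors and HTTP 5xx responses as retryable, because this lesson doesn't specify what form the simulated failures take. The function name and default values are hypothetical.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries a request with exponential backoff; fetchImpl and sleep are injectable for testing.
async function fetchWithRetry(
  url,
  { retries = 3, baseDelayMs = 200, fetchImpl = fetch, sleep = defaultSleep } = {},
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      const response = await fetchImpl(url);
      if (response.status < 500) {
        return response; // success or a client error that a retry won't fix
      }
      lastError = new Error(`HTTP ${response.status}`);
    } catch (error) {
      lastError = error; // network-level failure
    }
    if (attempt < retries) {
      await sleep(baseDelayMs * 2 ** attempt); // 200 ms, 400 ms, 800 ms, ...
    }
  }
  throw lastError;
}
```

Setting SIMULATED_REQUEST_FAILURE_RATE to a value such as 0.5 and calling fetchWithRetry('http://localhost:3000/') repeatedly lets you observe how often the retries are needed.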

Summary

In this lesson, you learned what an Arweave gateway is and how to run one on your development machine. You learned how to test ArNS names locally, how to use SSL, and how to enable advanced features like calling webhooks and simulating a flaky gateway. Now you know the basics of hosting your own gateway to improve your development experience when building DApps on the Permaweb.

Conclusion

Hosting an Arweave gateway is similar to running a web server. The difference is that a web server delivers files from its local disk, while an Arweave gateway delivers files fetched from Arweave. This means you can run your own gateway to make all data on Arweave locally available and improve latency. With features like webhooks, you can even call other servers when the right transactions are ready on Arweave, which allows you to react to new data dynamically.

In the next lesson, you will learn how to deploy an Arweave gateway on AWS.