Usage
Becoming an index provider involves three main parts:
A DagStore migration
Indexing announcement of available content
Content routing verification
DagStore Migration
The DagStore migration is required to roll out the new CARv2 indexing format, MultihashIndexSorted. It regenerates the indices with the latest CARv2 index format, which includes the full multihash of CIDs in each DagStore shard. For more information, see the DagStore Lotus documentation.
Indexing Announcement
The indexing announcement publishes advertisements about the content available for retrieval onto a gossipsub topic, which is listened to by a set of indexer nodes.
The advertisements are then processed by the indexer in order to provide an endpoint that allows clients to look up where and how to retrieve data for a given multihash. For more information, see Indexer Node Design.
Content Routing Verification
Content routing verification involves checking that the advertisements published by the Lotus instance are ingested by the indexer nodes and that, for a given stored multihash, the indexer node returns the correct provider.
Note that ingestion of advertisements by the indexer nodes is progressive; total ingestion time depends on the number of multihashes being advertised. Therefore, verification needs to be performed with some delay after the indexing announcement is made.
Usage Expectations:
New CAR index format in DagStore
The new index format stores the multihash code as well as the digest. As a result, DagStore index files are slightly larger; the size increase is negligible.
Index Provider GossipSub Announcements
The index provider integration announces changes to the chain of advertisements onto a gossipsub topic named /indexer/ingest/mainnet, which is propagated through the Lotus daemon onto the indexer nodes.
Index Provider GraphSync Server
The index provider integration exposes a GraphSync server which serves requests from indexer nodes to sync the list of advertised multihashes provided by the SP.
Note that the server is exposed on the address configured under the IndexProvider configuration section, with key ListenAddresses. The advertisements will include the address configured under the same section with key AnnounceAddresses. You must make sure both sets of addresses are publicly reachable. For more information, see Indexer Provider Configuration.
Index Provider Storage Usage
The index provider integration shares a datastore with the markets process, wrapped under the namespace index-provider. The datastore entries stored include:
internal mappings used by the index provider engine, and
cache of chained multihash chunks.
The storage used by the internal mappings is negligible.
The storage used by caching is bound by an LRU cache, the maximum size of which is configured to 1024, i.e. the number of chunks cached. The maximum length of each chunk is configured to 16,384 multihashes.
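The exact storage usage, then, depends on the number of multihashes stored in a single chunk and the size of each multihash, and can be calculated as:
maximum cache size = number of cached chunks × multihashes per chunk × size of each multihash
For example, caching 128-bit (16-byte) multihashes with the settings above results in chunk sizes of 0.25 MiB (16,384 × 16 bytes) and a maximum cache growth of 256 MiB (1024 × 0.25 MiB).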
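The arithmetic in this example can be reproduced with a few lines of shell. This is a minimal sketch: the helper name max_cache_bytes is hypothetical (it is not part of Lotus), and the two constants are simply the values quoted in this section (1024 cached chunks, 16,384 multihashes per chunk).

```shell
#!/bin/sh
# Values quoted in this section; adjust them if your configuration differs.
MAX_CACHED_CHUNKS=1024        # LRU cache capacity, in chunks
MULTIHASHES_PER_CHUNK=16384   # maximum length of each chunk

# max_cache_bytes SIZE: nominal upper bound on the cache size in bytes when
# each multihash is SIZE bytes long (hypothetical helper, not part of Lotus).
max_cache_bytes() {
  echo $(( MAX_CACHED_CHUNKS * MULTIHASHES_PER_CHUNK * $1 ))
}

# 128-bit multihashes are 16 bytes each.
bytes=$(max_cache_bytes 16)
echo "chunk: $(( MULTIHASHES_PER_CHUNK * 16 / 1024 )) KiB, cache: $(( bytes / 1024 / 1024 )) MiB"
# prints "chunk: 256 KiB, cache: 256 MiB"
```

Passing a different multihash size in bytes (for example 32 for 256-bit digests) gives the corresponding bound.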
Note that the LRU cache may grow beyond its maximum size if the generated chain of chunks is longer than the configured LinkChunkSize. This is to avoid partial caching of chunks within a single advertisement. The cache expansion is logged at INFO level in the provider/engine logging subsystem and can be monitored for diagnostic purposes.
Steps to Become an Index Provider
Stop the daemon, miner and markets processes
Stop all Lotus processes to suspend any changes to the state of the DagStore during the rollout.
The daemon needs to be stopped in order to roll out a change to the API that protects connections between the markets and daemon libp2p nodes.
Back up the existing DagStore repository
The DagStore repository is located at $LOTUS_MARKETS_PATH/dagstore by default. Make a copy of that folder. This is necessary for:
verifying that the expected shard indices are re-generated after migration, and
❗ rolling back the changes in case of an error.
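The backup itself can be as simple as a recursive copy. The sketch below is one way to do it, assuming the default repository location; the helper name backup_dagstore and the dagstore.bak target are illustrative choices, not part of Lotus.

```shell
#!/bin/sh
# backup_dagstore SRC DEST: copy the DagStore repository at SRC to DEST,
# preserving file attributes, and refuse to overwrite an existing backup.
# (Hypothetical helper; the name and paths are illustrative.)
backup_dagstore() {
  src=$1
  dest=$2
  [ -d "$src" ] || { echo "no DagStore repository at $src" >&2; return 1; }
  [ ! -e "$dest" ] || { echo "backup target $dest already exists" >&2; return 1; }
  cp -Rp "$src" "$dest"
}

# Example (run only after the Lotus processes have been stopped):
# backup_dagstore "$LOTUS_MARKETS_PATH/dagstore" "$LOTUS_MARKETS_PATH/dagstore.bak"
```

Refusing to overwrite an existing target keeps a second run from silently mixing two backups, which matters because the copy is later used both for comparison and for rollback.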
Delete the existing DagStore repository
Delete the DagStore repository, located at $LOTUS_MARKETS_PATH/dagstore by default. The absence of the repository signals to the Lotus instance that a DagStore migration is needed, and one is triggered automatically when the markets instance starts up.
Rotate any existing Lotus log files and adjust log level
For easier debugging, rotate any existing logs so that the new logs only include output generated by the target release.
Deploy daemon and miner processes
Deploy the index provider tag, master-spx.idxprov.rc-1, to the daemon and miner processes and wait until they are fully started and ready.
Note: This tag is based off release/v1.15.0, thus it also supports the upcoming OhSnap network v15 upgrade!
They are ready when the following commands succeed:
lotus-miner storage-deals list
lotus-miner sectors list
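Since the processes may take a while to come up, the two readiness commands can be polled rather than run by hand. The following is a minimal sketch of such a loop; wait_until_ok is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```shell
#!/bin/sh
# wait_until_ok ATTEMPTS DELAY CMD...: run CMD until it exits 0, trying at
# most ATTEMPTS times and sleeping DELAY seconds between tries.
# (Hypothetical helper, not part of Lotus.)
wait_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" >/dev/null 2>&1 && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$(( i + 1 ))
    sleep "$delay"
  done
}

# Example: poll both readiness commands, 30 tries each, 10 seconds apart.
# wait_until_ok 30 10 lotus-miner storage-deals list &&
#   wait_until_ok 30 10 lotus-miner sectors list &&
#   echo "daemon and miner are ready"
```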
Deploy the target release on the markets process
Download and deploy master-spx.idxprov.rc-1 to the markets process.
Start the markets process
Start only the markets process and wait for the following log line in the markets process logs:
dagstore migration completed successfully
This indicates that the list of shards that require initialisation has been queued for processing. See [DagStore First-time Migration](https://lotus.filecoin.io/docs/storage-providers/dagstore/#first-time-migration) for more information. See Indexer Provider Config to customize the configuration of the subsystem that announces indexes.
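Waiting for the migration log line can be scripted if the markets logs are written to a file. The sketch below assumes exactly that; the helper name wait_for_log_line and the log path in the example are hypothetical, so substitute wherever your markets process actually logs.

```shell
#!/bin/sh
# wait_for_log_line FILE TEXT TIMEOUT: poll FILE once per second until it
# contains TEXT (matched as a fixed string) or TIMEOUT seconds have passed.
# (Hypothetical helper, not part of Lotus.)
wait_for_log_line() {
  file=$1
  text=$2
  timeout=$3
  waited=0
  until grep -qF -- "$text" "$file" 2>/dev/null; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$(( waited + 1 ))
  done
  return 0
}

# Example; the log path is an assumption:
# wait_for_log_line /var/log/lotus/markets.log \
#   "dagstore migration completed successfully" 600 || echo "timed out" >&2
```

Matching with grep -F treats the message as literal text, so punctuation in a log line is never interpreted as a pattern.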
Configure logging subsystems
Make sure your Lotus installation persists the log files for future debugging. Set the log level for the following subsystems on the markets node to INFO:
go-legs-gpubsub
provider/engine
dagstore
To do this run the following command:
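A sketch of one way to build that command follows. It assumes that lotus-miner log set-level accepts a repeated --system flag followed by the level; confirm this against lotus-miner log set-level --help on your release before relying on it. The script only prints the command so that it can be reviewed first, and the helper name build_set_level_cmd is hypothetical.

```shell
#!/bin/sh
# Subsystems and level named in this step.
SUBSYSTEMS="go-legs-gpubsub provider/engine dagstore"
LEVEL="info"

# build_set_level_cmd: print a single set-level invocation covering every
# subsystem in SUBSYSTEMS. (Hypothetical helper; the --system flag usage is
# an assumption to verify with `lotus-miner log set-level --help`.)
build_set_level_cmd() {
  cmd="lotus-miner log set-level"
  for s in $SUBSYSTEMS; do
    cmd="$cmd --system $s"
  done
  printf '%s %s\n' "$cmd" "$LEVEL"
}

build_set_level_cmd
```

If you have split your market subsystem, the command presumably needs to be directed at the markets process in the same way as the other lotus-miner commands in this guide.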
Initialise the DagStore shards
To start the initialisation of DagStore shards, run:
lotus-miner dagstore initialize-all --concurrency=N
if you run a monolith miner process, or
lotus-miner --call-on-market dagstore initialize-all --concurrency=N
if you have split your market subsystem.
⚠️ Initialisation places IO workload on your storage system. N controls the number of deals that are concurrently initialised. See the DagStore Force Bulk Initialisation docs for more information.
Wait for the initialisation to complete. The initialisation time depends on the volume of data stored, since it involves re-indexing the data blocks.
Verify re-creation of DagStore repository
The successful completion of the previous step should recreate the DagStore repository, located at $LOTUS_MARKETS_PATH/dagstore. Navigate to that directory. Under the subfolder index, verify that a matching *.full.idx file can be found for every file under the same sub-directory in the backup of the DagStore taken in step 2.
✨ Announce all indices to the indexers
To announce all the indices in bulk to the indexers, run:
lotus-miner index announce-all
if you run a monolith miner process, or
lotus-miner --call-on-market index announce-all
if you have split your market subsystem.
This command generates advertisements and publishes indices onto the indexer gossipsub channel. In the markets logs, look for a series of log lines that include "deal announcement sent to index provider". You should see one such line per deal. Each line also includes the advertisement CID, the deal proposal CID to which it belongs and the shardKey from which its multihash entries are generated. The logs should also include lines that report the number of multihash entries each advertisement includes.
Note that the bulk advertisement only announces deals that have not expired and have been handed over to the sealing subsystem. Expired deals will not be advertised. Any remaining deals are advertised once they are handed over to the sealing subsystem.
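To compare the number of announcements against the number of deals you expect, the per-deal log lines can be counted. This is a minimal sketch assuming the markets logs are available as a file; both helper names are hypothetical, and the second one checks for the completion message described in the next paragraph.

```shell
#!/bin/sh
# count_announcements FILE: print the number of per-deal announcement lines.
# (Hypothetical helper; matches the log text quoted in this step.)
count_announcements() {
  grep -cF -- "deal announcement sent to index provider" "$1"
}

# announcements_finished FILE: succeed once the completion line is logged.
announcements_finished() {
  grep -qF -- "finished announcing active deals to index provider" "$1"
}

# Example; the log path is an assumption:
# count_announcements /var/log/lotus/markets.log
# announcements_finished /var/log/lotus/markets.log && echo "bulk announcement done"
```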
Wait for the bulk indexing announcement to complete. The bulk announcement is complete when "finished announcing active deals to index provider" is logged.
Verify indices in DagStore repository are ingested by indexer nodes
To verify ingestion, download and install the latest provider CLI tool from:
The built binaries can be found under assets attached for each target platform.
Once installed, download the following script.
Script below to be provided as a downloadable .sh file; for now pasted below for review purposes.
The script takes two mandatory arguments:
Miner peer ID as the first argument, and
Path to the DagStore repository as the second argument, e.g. $LOTUS_MARKETS_PATH/dagstore
A third argument may optionally be specified as a number between 0.0 and 1.0, which sets the selection probability of the multihash sample that is verified for ingestion. It is set to 5% by default.
You can find out what your peer ID is using lotus-miner net id
This script will:
iterate over all index files whose names match *.full.idx,
select 5% of the multihashes at random from each index, and
verify that the indexer node has those multihashes associated with the given MINER_ID as the provider.
A verification result is printed for each file. Check that verification is successful for each of the files. See provider verify-ingest -h for example output.
Indexer Provider Configuration
You can adjust the values under the IndexProvider section in the config.toml of your markets process to configure how indexes are announced to the indexer.
If the section doesn't exist, you can manually add it: