
feat(homeserver, testnet): add configurable setup for pubky-homeserver and pubky-testnet #361

Open
techraed wants to merge 10 commits into pubky:main from techraed:issue-185

Conversation


@techraed techraed commented Mar 28, 2026

Resolves issue #185

Changes summary:
1. The homeserver CLI now accepts 3 path args: the data dir, the secret file, and the config file. The resolution algorithm:
- If a path is defined, its value is used
- If a path is not defined, the data dir path is used as the base path
- If the data dir is not defined, the home directory is used (~/.pubky)
The CLI behavior doesn't change if no params are provided or if only the data dir param is provided, so the changes are backward compatible.
2. PersistentDataDir is renamed to HomeserverPaths, and the latter one stores all three resolved paths. Consequently, the module data_directory is also renamed to homeserver_config.
3. Due to the semantic changes described above, DataDir is now renamed to SetupSource (the module is renamed as well). The trait no longer requires Clone, as the homeserver app wraps the trait object in an Arc.
4. MockDataDir is renamed to MockSetupSource. The module is also renamed. The type now can be instantiated with a path to data dir.
5. EphemeralTestnet can now define a data dir in its builder. If a data dir is defined for an ephemeral testnet, OpenDAL uses the file system. Otherwise testnet storage remains in-memory.
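
The resolution algorithm from item 1 can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `ResolvedPaths` struct, the `resolve_paths` function, and the default file names `config.toml` and `secret` are assumptions made for the example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical container for the three resolved paths.
#[derive(Debug, PartialEq)]
pub struct ResolvedPaths {
    pub data_dir: PathBuf,
    pub config_file: PathBuf,
    pub secret_file: PathBuf,
}

/// Resolves the three CLI paths.
/// `home` is the user's home directory, passed in so the function has no side effects.
pub fn resolve_paths(
    home: &Path,
    data_dir: Option<PathBuf>,
    config_file: Option<PathBuf>,
    secret_file: Option<PathBuf>,
) -> ResolvedPaths {
    // No data dir given: fall back to ~/.pubky.
    let data_dir = data_dir.unwrap_or_else(|| home.join(".pubky"));
    // An explicit path wins; otherwise the data dir is the base path.
    // The default file names here are assumptions.
    let config_file = config_file.unwrap_or_else(|| data_dir.join("config.toml"));
    let secret_file = secret_file.unwrap_or_else(|| data_dir.join("secret"));
    ResolvedPaths {
        data_dir,
        config_file,
        secret_file,
    }
}
```

With no arguments everything lands under ~/.pubky, and with only a data dir everything lands under that dir, which is what keeps the old invocations working.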

UPD: the original issue was changed. See further comments to understand the context and changes summary.

@techraed techraed changed the title feat: add configurable setup for pubky-homeserver and pubky-testnet feat(homeserver, testnet) add configurable setup for pubky-homeserver and pubky-testnet Mar 28, 2026
@techraed techraed changed the title feat(homeserver, testnet) add configurable setup for pubky-homeserver and pubky-testnet feat(homeserver, testnet): add configurable setup for pubky-homeserver and pubky-testnet Mar 28, 2026
Author

techraed commented Mar 28, 2026

cc @SeverinAlexB @86667
Ready for the review

Also I have a question regarding pubky-testnet having defined a path to data dir. What's the use case for that?

  1. Is it only to get access to stored files after the test?
  2. Or is it to run tests using some real homeserver data dir (like historical/snapshot tests)?

If the second option is the intended use case, may I implement it in a separate PR rather than this one? That case requires careful thought about how to handle the metadata stored in Postgres.

Collaborator

86667 commented Apr 1, 2026

Hey @techraed thanks for your work!

Nicely done, I have a couple comments around making changes as simple as possible in open codebases.

There are a lot of changes here. I'm not sure why the CLI has changed, and the refactor and directory rename makes it difficult to understand what is happening and is technically a breaking change.

Please separate behavioral changes from refactors either in different PRs or at least different commits, and seriously consider whether a rename of a common type is necessary. It may be that the renaming/refactor is absolutely necessary for the desired behaviour to work, if so please justify it.

I have a question regarding pubky-testnet having defined a path to data dir. What's the use case for that?

Hm, if the second option is more difficult then I agree, let's keep it for a separate piece of work. Achieving only 1. is still progress in the right direction.

Author

techraed commented Apr 1, 2026

making changes as simple as possible in open codebases.

refactor and directory rename makes it difficult to understand

Yeah, you're right about that. Sorry, I forgot that open codebases usually prefer small, strictly relevant changes.

Anyway, let me try to justify these changes.

The CLI

The original issue states the following: "It's not possible to choose the data dir for the homeserver". And the same issue is for the pubky-testnet.

Frankly speaking, the CLI on master already lets you define a path to a different file storage location. But there are several issues.

Problem-1: Config and data share the same path

The problem is that the config and secrets files must live under that same path. So if you keep your files in another location, you have to move the config and secrets files there too, because the homeserver setup code strongly couples the config and secrets files to the data directory (they must be in the same directory).

Problem-2: Mitigations with poor UX

Alright, you could mitigate that with the following change to the storage config TOML:

pub enum StorageConfigToml {
    /// Files are stored on the local file system.
    FileSystem { path: PathBuf }, // <------------- Added here the path

    // other variants here
}

That would avoid touching the CLI, but the UX would be poor: you would need to create a directory and put a config file inside it, and that config file would then define the data path.

Solution

So what's the best and most ergonomic way (imho) to define a custom data dir without having to keep the config and secrets files next to it? Change the CLI!

# Calling like this would create everything inside the ~/.pubky
./target/release/homeserver

# Remains the same behavior
./target/release/after_this_pr_homeserver

# Calling like this would create everything inside the /etc/some/path/to/data
./target/release/homeserver --data-dir /etc/some/path/to/data

# Remains the same behavior
./target/release/after_this_pr_homeserver --data-dir /var/lib/homeserver/data

# Now config and data are in the separate dirs. Secrets will be created in the data dir.
./target/release/after_this_pr_homeserver --data-dir ~/data --config /etc/configs/homeserver_config.toml

# Now everything is accessible from their own paths
./target/release/after_this_pr_homeserver --data-dir /var/lib/homeserver/data --config /etc/configs/homeserver_config.toml --secret-key /etc/secrets/homeserver_secrets

Renaming because of the CLI changes

The CLI changes separate the different setup sources, so we can no longer say the path points to a persistent data directory. It is now a set of paths to different configs. That's why the directory is renamed: data_directory -> homeserver_configs. The same renaming applies to the data_dir.rs module and the DataDir trait: data_dir.rs -> setup_source.rs and trait DataDir -> SetupSource.

pubky-testnet related changes

The original issue states "pubky-testnet always uses an ephemeral tmp dir. It can be useful to point it at a persistent location for some local testing.".

To do that the original MockDataDir

pub struct MockDataDir {
    pub(crate) temp_dir: std::sync::Arc<tempfile::TempDir>,
    pub config_toml: super::ConfigToml,
    pub keypair: pubky_common::crypto::Keypair,
}

is changed to

pub struct MockSetupSource {
    root: MockDataDir, // THE CHANGE !
    pub config_toml: super::ConfigToml,
    pub keypair: pubky_common::crypto::Keypair,
}

enum MockDataDir {
    Temp(std::sync::Arc<tempfile::TempDir>),
    Persistent(PathBuf),
}
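
To illustrate why the enum helps, here is a rough sketch of how both variants could expose the same path accessor while differing in cleanup. It is not the PR's code: `MockRoot` mirrors the enum above under a different name, and `TempRoot` is a hypothetical std-only stand-in for `tempfile::TempDir` so the example has no external dependencies.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Hypothetical stand-in for `tempfile::TempDir`:
/// creates a directory and removes it when dropped.
pub struct TempRoot(PathBuf);

impl TempRoot {
    pub fn new(name: &str) -> std::io::Result<Self> {
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(TempRoot(path))
    }
}

impl Drop for TempRoot {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored in this sketch.
        let _ = fs::remove_dir_all(&self.0);
    }
}

/// Mirrors the two-variant root from the comment above.
pub enum MockRoot {
    /// Removed once the last Arc clone is dropped.
    Temp(Arc<TempRoot>),
    /// Left on disk after the test finishes.
    Persistent(PathBuf),
}

impl MockRoot {
    /// Callers get a path either way and never need to know which variant is used.
    pub fn path(&self) -> &Path {
        match self {
            MockRoot::Temp(dir) => dir.0.as_path(),
            MockRoot::Persistent(path) => path.as_path(),
        }
    }
}
```

The temp variant keeps the old behavior (the directory disappears with the mock), while the persistent variant only stores a path and never deletes anything.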

Renaming because of the MockDataDir changes

This follows from renaming the DataDir trait to SetupSource.

Final comment

@86667 You are right about the size (scope) of the changes: these renamings move files around and complicate the review. But the changed nature of the original entities made me bring all the changes (feature and refactoring) together. I hope that makes everything clear.

Author

techraed commented Apr 2, 2026

@86667 should I leave the PR as it is, or split the logic changes and the renamings into different commits?

Collaborator

86667 commented Apr 7, 2026

Hey @techraed

Anyway, let me try to justify these changes.

I understand that there is some way to justify 41 changed files and breaking the interface between modules. The question is whether it's worth the pain. Ideally, we would keep changes as minimal as possible so that we can quickly test and confirm all is well, then iterate quickly.

@86667 shall I remain the same the PR? or divide changes related to logic and renamings (different commits)?

This would be great, if we could have a minimal PR which solves a single problem.

That being said, I do agree that the task here is fairly under-specified and that developing it is itself research into what is best. I think that maybe we could achieve what we need (a data dir specified so that it can be reused later) with the simpler change of allowing the data path to be specified while keeping config.toml and secret in HOME/.pubky. What do you think?

That way, we could later add an optional var to config.toml which points to the data location, and then not have to worry about the --data-dir flag each time we run. I think the DataDir trait may do most of the heavy lifting for us here, though I may be incorrect.

Looking forward to seeing what you think, thanks again for getting involved!

Collaborator

86667 commented Apr 7, 2026

I reconsidered what we're doing here and decided to update the original issue. The actual need is in the testnet only; I think we can get away with not changing the homeserver.

See this issue comment - #185 (comment)

Author

techraed commented Apr 9, 2026

Sure, that's the right thing to do. After #361 (comment) I was thinking along the same lines.

Will do

@techraed
Author

hey @86667
Now it's ready for the review.

I focused only on making it possible to use persistent data storage (i.e., the file system) for tests. The basic scenario described in #185 (comment) is implemented here: https://github.com/pubky/pubky-core/pull/361/changes#diff-87ec3cd0feb2ec7271c55b5a28635507f0a79f26433b869d2977b872c07cf6a9R922.

Summary of changes:

  1. MockDataDir can be instantiated either with temp file storage or with persistent storage, by providing a path to it.
  2. Dropping the test DB on cleanup is now configurable. EphemeralTestnetBuilder::new().drop_db_on_cleanup(false) sets a special param in the query part of the connection string (like postgres://localhost:5432/postgres?pubky-test-persist) that signals TestDbDropper not to drop the DB, so a test DB can survive across testnet restarts. That's vital for a test database (created via pubky_homeserver::persistence::sql::SqlDb::create_test_database) when we opt to persist file storage: if the database is dropped, all the metadata for the file storage is lost, and a homeserver from a new testnet could not use the existing file storage.
  3. For test databases, the DB connection string can now be obtained from the running homeserver. The returned value also includes the target DB name as a query param (like postgres://localhost:5432/postgres?pubky-test-db-name=<some_name>), so a second testnet can reconnect to that DB directly instead of creating a new one. The connection string can be passed to EphemeralTestnetBuilder::new().postgres(connection_with_db_name_param).
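
As a rough illustration of items 2 and 3, the two test-only query params could be read from a connection string along these lines. This is a sketch under assumptions, not the PR's implementation: the `TestDbOptions` struct and `parse_test_db_options` function are hypothetical names, only the param names come from the examples above, and no percent-decoding is done.

```rust
/// Hypothetical grouping of the two test-only options.
#[derive(Debug, Default, PartialEq)]
pub struct TestDbOptions {
    /// Set when `pubky-test-persist` is present: the DB should not be dropped.
    pub persist: bool,
    /// Set when `pubky-test-db-name=<name>` is present: reconnect to this DB.
    pub db_name: Option<String>,
}

/// Extracts the test-only params from the query part of a connection string.
pub fn parse_test_db_options(conn: &str) -> TestDbOptions {
    let mut opts = TestDbOptions::default();
    // Everything after the first '?' is the query part; no '?' means no options.
    let Some((_, query)) = conn.split_once('?') else {
        return opts;
    };
    for pair in query.split('&') {
        // A bare key such as `pubky-test-persist` has no '=' and an empty value.
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        match key {
            "pubky-test-persist" => opts.persist = true,
            "pubky-test-db-name" if !value.is_empty() => {
                opts.db_name = Some(value.to_string());
            }
            // Unrelated params are ignored in this sketch.
            _ => {}
        }
    }
    opts
}
```

Carrying the options inside the connection string means the existing postgres(...) builder input can transport them without adding a separate parameter for each one.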
