Not possible to use my own OpenAI API #65

@mossishahi

Hi

When I change the annotator.run call from:

ss_adata = annotator.run(
    study_context=study_context,
    override_existing_results=True,
)

To:

ss_adata = annotator.run(
    study_context=study_context,
    override_existing_results=True,
    llm_configs=[{
        "provider": "openai",
        "name": "gpt-4o",
        "apiKey": o_api,
        "baseUrl": "https://api.openai.com/v1",   # optional
        "modelSettings": {                        # optional
            "temperature": 0.0,
            "max_tokens": 8192
        }
    }],
)

CyteType runs for a few clusters, then stops and never proceeds any further. I've tried different agents, and it seems that specifying a custom LLM is not possible.
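Before submitting, a small pre-flight check on each llm_configs entry can rule out obvious mistakes such as an empty key or a malformed base URL. This is only a sketch: `check_llm_config` is a hypothetical helper, the key names are copied from the snippet above, and treating `provider`, `name` and `apiKey` as required is an assumption rather than CyteType's documented schema.

```python
from urllib.parse import urlparse

# Keys copied from the llm_configs entry in this issue. Treating them as
# required is an assumption, not CyteType's documented schema.
REQUIRED_KEYS = ("provider", "name", "apiKey")


def check_llm_config(cfg: dict) -> list[str]:
    """Return a list of problems found in one config entry (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        value = cfg.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"'{key}' is missing or empty")

    # baseUrl is marked optional in the snippet, so only validate it if given.
    base_url = cfg.get("baseUrl")
    if base_url is not None:
        parsed = urlparse(base_url) if isinstance(base_url, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("'baseUrl' is not an http(s) URL")
    return problems
```

Calling `check_llm_config(cfg)` on each entry and printing the returned list before `annotator.run(...)` only catches local mistakes; it says nothing about whether the server accepts the key.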

I should mention that I didn't run this subsampling step, as I wanted to run on the full data:

max_n = 1000
subsampled_adatas = []
n_groups = adata.obs[group_key].nunique()
for n, group in tqdm(enumerate(sorted(adata.obs[group_key].unique()))):
    idx = adata.obs[group_key] == group
    if idx[idx].sum() > max_n:
        ss_idx = pd.Series(False, index=adata.obs.index)
        ss_idx[idx[idx].sample(n=max_n, random_state=999).index] = True
        subsampled_adatas.append(adata[ss_idx].to_memory())
    else:
        subsampled_adatas.append(adata[idx].to_memory())

# Concatenate the individual subsampled anndata objects into a single object
ss_adata = anndata.concat(subsampled_adatas)

# Anndata removes the `var` entry, let's add back gene names
ss_adata.var['gene_symbols'] = [x.split('_')[0] for x in adata.var.feature_name]

Instead, I needed to do:
ss_adata = adata
ss_adata.var['gene_symbols'] = ss_adata.var.index

I should also highlight that setting backed=True causes errors in:

sc.tl.rank_genes_groups(
    ss_adata,
    groupby=group_key,
    use_raw=False,
    key_added='rank_genes_'+group_key
)

so I had to do:
ss_adata = ss_adata.to_memory()
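The workaround above can be wrapped in a small guard so the same notebook works for both backed and in-memory objects. This is a minimal sketch: `ensure_in_memory` is a hypothetical helper name, and it relies only on the `isbacked` attribute and the `to_memory()` method that AnnData objects expose.

```python
def ensure_in_memory(adata):
    """Return an in-memory copy if `adata` is backed, otherwise return it unchanged."""
    # AnnData sets `isbacked` to True when the object was opened with backed=...;
    # getattr keeps the helper harmless for objects that lack the attribute.
    if getattr(adata, "isbacked", False):
        return adata.to_memory()
    return adata
```

Usage would be `ss_adata = ensure_in_memory(ss_adata)` just before `sc.tl.rank_genes_groups(...)`. Note that loading the full dataset into memory may need a lot of RAM.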

This is an example report:
https://prod.cytetype.nygen.io/report/4c8a952a-8b08-4a30-ab18-90a64628dd12

This is the final output of my job:

---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
Cell In[18], line 8
      1 study_context = """
      2 Spatial transcriptomics Xenium 5k data, from pancreas of human male sample.
      3 The cells were from PDAC cancer cells with the status of impaired glucose tolerance.
      4 
      5 """
      6 o_api = ""
----> 8 ss_adata = annotator.run(
      9     study_context=study_context,
     10     override_existing_results=True,
     11     llm_configs=[{
     12         "provider": "openai",
     13         "name": "gpt-4o",
     14         "apiKey": o_api,
     15         "baseUrl": "https://api.openai.com/v1",   # optional
     16         # "modelSettings": {                        # optional
     17         #     "temperature": 0.0,
     18         #     "max_tokens": 8192
     19         # }
     20     }],
     21 )

File /opt/venvs/cytetype/lib/python3.12/site-packages/cytetype/main.py:428, in CyteType.run(self, study_context, llm_configs, metadata, n_parallel_clusters, results_prefix, poll_interval_seconds, timeout_seconds, api_url, auth_token, save_query, query_filename, vars_h5_path, obs_duckdb_path, upload_timeout_seconds, upload_max_workers, cleanup_artifacts, require_artifacts, show_progress, override_existing_results)
    425 store_job_details(self.adata, job_id, self.api_url, results_prefix)
    427 # Wait for completion
--> 428 result = wait_for_completion(
    429     self.api_url,
    430     self.auth_token,
    431     job_id,
    432     poll_interval_seconds,
    433     timeout_seconds,
    434     show_progress,
    435 )
    437 # Store results
    438 store_annotations(
    439     self.adata,
    440     result,
   (...)    445     check_unannotated=True,
    446 )

File /opt/venvs/cytetype/lib/python3.12/site-packages/cytetype/api/client.py:391, in wait_for_completion(base_url, auth_token, job_id, poll_interval, timeout, show_progress)
    389 if progress:
    390     progress.finalize({})
--> 391 raise TimeoutError(f"Job {job_id} did not complete within {timeout}s")

TimeoutError: Job 4c8a952a-8b08-4a30-ab18-90a64628dd12 did not complete within 7200s
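For context, the traceback shows that the client polls the job until a deadline and then raises TimeoutError, and that the `run` signature accepts `poll_interval_seconds` and `timeout_seconds`. The generic pattern looks roughly like the sketch below. This is not CyteType's implementation: `poll_until_done`, the status strings, and the injectable `clock` and `sleep` parameters are all hypothetical and only illustrate why a job that never reaches a terminal state ends in this error after 7200s.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],
    poll_interval: float,
    timeout: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `get_status` until it reports a terminal state or the deadline passes."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        # "completed" and "failed" are hypothetical terminal states.
        if status in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job did not complete within {timeout}s")
        sleep(poll_interval)
```

A job that stalls on the server side never returns a terminal status, so raising the timeout on the client would only delay the same error.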
