Open
Description
Hi
When I change the annotator call
from:
ss_adata = annotator.run(
    study_context=study_context,
    override_existing_results=True,
)
To:
ss_adata = annotator.run(
    study_context=study_context,
    override_existing_results=True,
    llm_configs=[{
        "provider": "openai",
        "name": "gpt-4o",
        "apiKey": o_api,
        "baseUrl": "https://api.openai.com/v1",  # optional
        "modelSettings": {  # optional
            "temperature": 0.0,
            "max_tokens": 8192
        }
    }],
)
CyteType runs for a few clusters and then stops and never proceeds further. I've tried different agents, and it seems that specifying a custom LLM is not possible.
I should mention that I didn't run this step, as I wanted to run on the full data:
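Before submitting a job with a custom LLM, it can help to sanity-check the config entry locally. The following is a minimal sketch, not part of the cytetype package: `check_llm_config` is a hypothetical helper, and treating `provider`, `name`, and `apiKey` as mandatory is an assumption based only on the keys used in the call above. It does not establish why the job stalls; it only rules out an obviously incomplete entry.

```python
# Hypothetical helper for illustration; not part of the cytetype package.
def check_llm_config(config):
    """Return a list of problems found in one llm_configs entry."""
    problems = []
    # Keys taken from the call above; treating them as mandatory is an assumption.
    for key in ("provider", "name", "apiKey"):
        value = config.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key!r} is missing or empty")
    # baseUrl is marked optional above, so only check it when present.
    base_url = config.get("baseUrl")
    if base_url is not None and not str(base_url).startswith(("http://", "https://")):
        problems.append("'baseUrl' should start with http:// or https://")
    return problems


example = {
    "provider": "openai",
    "name": "gpt-4o",
    "apiKey": "",  # an empty key, e.g. left over from a redacted notebook cell
    "baseUrl": "https://api.openai.com/v1",
}
problems = check_llm_config(example)
print(problems)
```

An empty list means the entry passed these basic checks; anything else is worth fixing before the job is submitted.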
max_n = 1000
subsampled_adatas = []
n_groups = adata.obs[group_key].nunique()
for n, group in tqdm(enumerate(sorted(adata.obs[group_key].unique()))):
    idx = adata.obs[group_key] == group
    if idx[idx].sum() > max_n:
        ss_idx = pd.Series(False, index=adata.obs.index)
        ss_idx[idx[idx].sample(n=max_n, random_state=999).index] = True
        subsampled_adatas.append(adata[ss_idx].to_memory())
    else:
        subsampled_adatas.append(adata[idx].to_memory())
# Concatenate the individual subsampled anndata objects into a single object
ss_adata = anndata.concat(subsampled_adatas)
# Anndata removes the `var` entry, let's add back gene names
ss_adata.var['gene_symbols'] = [x.split('_')[0] for x in adata.var.feature_name]
Instead, I had to do:
ss_adata = adata
ss_adata.var['gene_symbols'] = ss_adata.var.index
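For reference, the skipped subsampling step caps every group at `max_n` cells and keeps smaller groups whole. The sketch below shows that logic on plain Python lists so it can be read without AnnData. `subsample_per_group` is a hypothetical name, and because it uses the standard `random` module, the cells it draws will not match those picked by pandas' `sample(random_state=999)`.

```python
import random
from collections import defaultdict


def subsample_per_group(labels, max_n=1000, seed=999):
    """Return sorted positions of the cells kept after capping each group at max_n."""
    # Collect the positions of the cells belonging to each group label.
    positions_by_group = defaultdict(list)
    for position, label in enumerate(labels):
        positions_by_group[label].append(position)

    rng = random.Random(seed)
    kept = []
    for label in sorted(positions_by_group):
        positions = positions_by_group[label]
        if len(positions) > max_n:
            # Large group: draw max_n positions without replacement.
            kept.extend(rng.sample(positions, max_n))
        else:
            # Small group: keep every cell.
            kept.extend(positions)
    return sorted(kept)


labels = ["a"] * 5 + ["b"] * 2
kept = subsample_per_group(labels, max_n=3)
print(len(kept))
```

With an AnnData object, the group labels would presumably come from `adata.obs[group_key]`, and the returned positions could then be used to index the object.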
I should also point out that setting backed=True causes errors in:
sc.tl.rank_genes_groups(
    ss_adata,
    groupby=group_key,
    use_raw=False,
    key_added='rank_genes_'+group_key
)
so I had to do:
ss_adata = ss_adata.to_memory()
This is an example:
https://prod.cytetype.nygen.io/report/4c8a952a-8b08-4a30-ab18-90a64628dd12
This is the final output of my job:
---------------------------------------------------------------------------
TimeoutError Traceback (most recent call last)
Cell In[18], line 8
1 study_context = """
2 Spatial transcriptomics Xenium 5k data, from pancreas of human male sample.
3 The cells were from PDAC cancer cells with the status of impaired glucose tolerance.
4
5 """
6 o_api = ""
----> 8 ss_adata = annotator.run(
9 study_context=study_context,
10 override_existing_results=True,
11 llm_configs=[{
12 "provider": "openai",
13 "name": "gpt-4o",
14 "apiKey": o_api,
15 "baseUrl": "https://api.openai.com/v1", # optional
16 # "modelSettings": { # optional
17 # "temperature": 0.0,
18 # "max_tokens": 8192
19 # }
20 }],
21 )
File [/opt/venvs/cytetype/lib/python3.12/site-packages/cytetype/main.py:428](http://supergpu25.scidom.de:8889/opt/venvs/cytetype/lib/python3.12/site-packages/cytetype/main.py#line=427), in CyteType.run(self, study_context, llm_configs, metadata, n_parallel_clusters, results_prefix, poll_interval_seconds, timeout_seconds, api_url, auth_token, save_query, query_filename, vars_h5_path, obs_duckdb_path, upload_timeout_seconds, upload_max_workers, cleanup_artifacts, require_artifacts, show_progress, override_existing_results)
425 store_job_details(self.adata, job_id, self.api_url, results_prefix)
427 # Wait for completion
--> 428 result = wait_for_completion(
429 self.api_url,
430 self.auth_token,
431 job_id,
432 poll_interval_seconds,
433 timeout_seconds,
434 show_progress,
435 )
437 # Store results
438 store_annotations(
439 self.adata,
440 result,
(...) 445 check_unannotated=True,
446 )
File [/opt/venvs/cytetype/lib/python3.12/site-packages/cytetype/api/client.py:391](http://supergpu25.scidom.de:8889/opt/venvs/cytetype/lib/python3.12/site-packages/cytetype/api/client.py#line=390), in wait_for_completion(base_url, auth_token, job_id, poll_interval, timeout, show_progress)
389 if progress:
390 progress.finalize({})
--> 391 raise TimeoutError(f"Job {job_id} did not complete within {timeout}s")
TimeoutError: Job 4c8a952a-8b08-4a30-ab18-90a64628dd12 did not complete within 7200s
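The traceback shows that `CyteType.run` accepts `poll_interval_seconds` and `timeout_seconds`, and that `wait_for_completion` raised once the 7200 s limit was reached. As a rough illustration of that polling pattern (not the library's actual implementation), here is a minimal sketch. `wait_until_done`, the `fetch_status` callable, and the status strings `"completed"` and `"processing"` are all hypothetical.

```python
import time


def wait_until_done(fetch_status, poll_interval=1.0, timeout=10.0):
    """Poll fetch_status() until it reports completion or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == "completed":  # hypothetical status string
            return status
        if time.monotonic() >= deadline:
            # A job that never finishes always ends up here, however long the timeout.
            raise TimeoutError(f"Job did not complete within {timeout}s")
        time.sleep(poll_interval)


# Fake job that reports completion on the third poll.
calls = {"n": 0}


def fake_status():
    calls["n"] += 1
    return "completed" if calls["n"] >= 3 else "processing"


result = wait_until_done(fake_status, poll_interval=0.01, timeout=1.0)
print(result)
```

In a loop of this shape, a job that stalls after a few clusters produces exactly this kind of `TimeoutError`. Raising `timeout_seconds` would therefore only delay the error unless the job itself starts making progress again.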