Add create from generator #88

Draft
jaclark5 wants to merge 1 commit into main from ds_from_gen

@jaclark5
Collaborator

Description

Currently, creating a dataset requires passing a list[dict] to descent.targets.energy.create_dataset() all at once, so the entire dataset must be held in memory.

A Hugging Face Dataset can be built from a generator with Dataset.from_generator(), but descent abstracts the Dataset away, so that option is not available to callers.

This PR adds the function descent.targets.energy.create_dataset_from_generator() to expose that capability of the datasets library.
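To illustrate the memory argument, here is a minimal standard-library sketch, not the code in this PR. The `iter_entries` and `iter_batches` helpers and the entry keys (`smiles`, `coords`, `energy`, `forces`) are assumptions made up for the example; the actual entry layout and the signature of `create_dataset_from_generator()` are whatever the diff defines.

```python
import itertools
from typing import Iterator


def iter_entries(n_entries: int) -> Iterator[dict]:
    """Yield one hypothetical entry at a time instead of building a list."""
    for i in range(n_entries):
        # Placeholder values; a real generator would load or compute these lazily.
        yield {
            "smiles": "[H:1][O:2][H:3]",
            "coords": [0.0] * 9,  # flattened coordinates for 3 atoms
            "energy": [float(i)],
            "forces": [0.0] * 9,
        }


def iter_batches(entries: Iterator[dict], batch_size: int) -> Iterator[list[dict]]:
    """Group a stream of entries into lists of at most batch_size items."""
    while batch := list(itertools.islice(entries, batch_size)):
        yield batch


# Eager approach: every entry exists in memory at the same time.
eager = list(iter_entries(10))

# Lazy approach: only one batch of entries is alive at any moment.
batch_sizes = [len(batch) for batch in iter_batches(iter_entries(10), batch_size=4)]
```

With the datasets library installed, a generator function such as the hypothetical `iter_entries` can be handed to `Dataset.from_generator(iter_entries, gen_kwargs={"n_entries": 10})`, which consumes it without requiring the full list up front.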

Marking this as a draft until I have used it successfully in my current work, in case I run into an issue.

Status

  • Ready to go

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.49%. Comparing base (40d4051) to head (3ab21eb).

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #88   +/-   ##
=======================================
  Coverage   99.49%   99.49%           
=======================================
  Files          11       11           
  Lines         981      997   +16     
=======================================
+ Hits          976      992   +16     
  Misses          5        5           
Flag Coverage Δ
unittests 99.49% <100.00%> (+<0.01%) ⬆️


2 participants