Skip to content

Support arrow struct#739

Open
ColinLeeo wants to merge 3 commits intodevelopfrom
support_arrow_struct
Open

Support arrow struct#739
ColinLeeo wants to merge 3 commits intodevelopfrom
support_arrow_struct

Conversation

@ColinLeeo
Copy link
Contributor

No description provided.

@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 39.96540% with 347 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.78%. Comparing base (dd27faf) to head (014580b).

Files with missing lines Patch % Lines
cpp/src/cwrapper/arrow_c.cc 40.45% 272 Missing and 40 partials ⚠️
cpp/src/cwrapper/tsfile_cwrapper.cc 0.00% 22 Missing ⚠️
cpp/src/reader/table_result_set.cc 50.00% 1 Missing and 5 partials ⚠️
cpp/src/writer/tsfile_writer.cc 0.00% 3 Missing ⚠️
cpp/src/reader/tsfile_reader.cc 60.00% 0 Missing and 2 partials ⚠️
cpp/src/reader/qds_with_timegenerator.h 0.00% 1 Missing ⚠️
cpp/src/reader/qds_without_timegenerator.h 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #739      +/-   ##
===========================================
- Coverage    62.09%   61.78%   -0.32%     
===========================================
  Files          700      701       +1     
  Lines        40142    40719     +577     
  Branches      5650     5784     +134     
===========================================
+ Hits         24926    25157     +231     
- Misses       14522    14821     +299     
- Partials       694      741      +47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines +325 to +352
uint8_t* null_bitmap = nullptr;
if (has_null) {
size_t null_bitmap_size = GetNullBitmapSize(row_count);
null_bitmap = static_cast<uint8_t*>(
common::mem_alloc(null_bitmap_size, common::MOD_TSBLOCK));
if (null_bitmap == nullptr) {
common::mem_free(array_data->buffers);
common::mem_free(array_data);
return common::E_OOM;
}
common::BitMap& vec_bitmap = vec->get_bitmap();
char* vec_bitmap_data = vec_bitmap.get_bitmap();
for (size_t i = 0; i < null_bitmap_size; ++i) {
null_bitmap[i] = ~static_cast<uint8_t>(vec_bitmap_data[i]);
}
array_data->buffers[0] = null_bitmap;

int64_t null_count = 0;
for (uint32_t i = 0; i < row_count; ++i) {
if (vec_bitmap.test(i)) {
null_count++;
}
}
out_array->null_count = null_count;
} else {
array_data->buffers[0] = nullptr;
out_array->null_count = 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as BuildFixedLengthArrowArrayC? May extract.

char* vec_data = vec->get_value_data().get_data();
uint32_t vec_offset = 0;

// 获取 vec_bitmap 用于后续检查
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Comment on lines +377 to +379
uint32_t len = 0;
std::memcpy(&len, vec_data + vec_offset, sizeof(uint32_t));
vec_offset += sizeof(uint32_t);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len = ((uint32_t) (vec_data + vec_offset)) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may copy length and content in one pass, because the total content length can be calculated by len(vec_data) - row_count*sizeof(uint32_t).

Comment on lines +434 to +457
// Convert TsFile YYYYMMDD integer to days since Unix epoch (1970-01-01)
static int32_t YYYYMMDDToDaysSinceEpoch(int32_t yyyymmdd) {
int year = yyyymmdd / 10000;
int month = (yyyymmdd % 10000) / 100;
int day = yyyymmdd % 100;

std::tm date = {};
date.tm_year = year - 1900;
date.tm_mon = month - 1;
date.tm_mday = day;
date.tm_hour = 12;
date.tm_isdst = -1;

std::tm epoch = {};
epoch.tm_year = 70;
epoch.tm_mon = 0;
epoch.tm_mday = 1;
epoch.tm_hour = 12;
epoch.tm_isdst = -1;

time_t t1 = mktime(&date);
time_t t2 = mktime(&epoch);
return static_cast<int32_t>((t1 - t2) / (60 * 60 * 24));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to utils?

Comment on lines +481 to +505
common::BitMap& vec_bitmap = vec->get_bitmap();
uint8_t* null_bitmap = nullptr;
if (has_null) {
size_t null_bitmap_size = GetNullBitmapSize(row_count);
null_bitmap = static_cast<uint8_t*>(
common::mem_alloc(null_bitmap_size, common::MOD_TSBLOCK));
if (null_bitmap == nullptr) {
common::mem_free(array_data->buffers);
common::mem_free(array_data);
return common::E_OOM;
}
char* vec_bitmap_data = vec_bitmap.get_bitmap();
for (size_t i = 0; i < null_bitmap_size; ++i) {
null_bitmap[i] = ~static_cast<uint8_t>(vec_bitmap_data[i]);
}
int64_t null_count = 0;
for (uint32_t i = 0; i < row_count; ++i) {
if (vec_bitmap.test(i)) null_count++;
}
out_array->null_count = null_count;
array_data->buffers[0] = null_bitmap;
} else {
out_array->null_count = 0;
array_data->buffers[0] = nullptr;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract

Comment on lines +54 to +60
if (batch_size == 0) {
block_size_ = 1024;
batch_mode_ = false;
} else {
block_size_ = batch_size;
batch_mode_ = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batch_size == 0 -> batch_size <= 0, or use uint

Comment on lines +421 to +426
// // Test 2: Batch query (batch_size = 1000)
// {
// storage::TsFileReader reader;
// int ret = reader.open(file_name_);
// ASSERT_EQ(ret, common::E_OK);
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why commentted?

Comment on lines +162 to +166
if code != 0:
return None

if arrow_schema.release == NULL or arrow_array.release == NULL:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raise an exception?

pandas==2.2.2
setuptools==78.1.1
wheel==0.46.2
pyarrow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to specify a minimum version?


tsfile_include_dir = os.path.join(project_dir, "tsfile", "include")

extra_compile_args = ["-O0", "-g3", "-fno-omit-frame-pointer"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is O0 enforced?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants