A selection of algorithms and systems I've developed over my career, largely in the public domain at this point.
- Quicksort Improvement (2019) — Designed an efficient modification to quicksort/introsort. By reusing the pivot sample to detect when input is already sorted, reverse-sorted, or all-equal, the algorithm finishes in O(n) with a single verification pass. Yields 5–30× speedups for sorted inputs with no regressions on adversarial cases.
- Ad Inventory Overlapping Set Problem (2011–2012) — Designed a real-time ad inventory model to handle millions of buys with overlapping targeting criteria using inverted indices and bitmap intersections. Enabled accurate forecasting and scalable allocation at web scale.
- Expected Frequency & Long Correlation (2009+) — An algorithm for search and recommendations leveraging entire user histories. By adjusting local co-occurrence with an expected vs. actual frequency correction, it systematically removes global popularity bias.
- Click-Based Search & Recommendations (2003–2005) — A framework leveraging full user sessions (queries + clicks) to learn correlations, improving ranking, personalization, and localization.
- Efficient Near-Shingling (2001) — An approach to near-duplicate detection that approximates shingling accuracy while reducing storage and compute costs. Each document is reduced to hashes of its title, its first sentence, and its 10 longest sentences, indexed in dual forward/inverted indices.
- EzResult Search Engine (1998–1999) — A distributed search engine written entirely in C/C++/assembly, independently implementing inverted indices, tries, and cosine-similarity ranking. Supported instant index updates and was acquired in 1999.
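The quicksort entry above can be sketched as follows. This is a minimal illustration of the idea, not the tuned introsort variant: the function name is hypothetical, and the "pivot sample" is approximated here by the classic first/middle/last candidates.

```python
def sort_with_presort_check(a):
    """Sort a list in place, short-circuiting in O(n) when a single
    verification pass confirms the input is already sorted,
    reverse-sorted, or all-equal. Sketch of the idea only."""
    n = len(a)
    if n < 2:
        return a
    # Reuse the pivot-sample positions (first, middle, last) as a cheap hint.
    first, mid, last = a[0], a[n // 2], a[n - 1]
    if first <= mid <= last:
        # Hint says "maybe ascending": verify with one linear pass.
        if all(a[i] <= a[i + 1] for i in range(n - 1)):
            return a                  # already sorted (covers all-equal too)
    elif first >= mid >= last:
        # Hint says "maybe descending": verify, then reverse in O(n).
        if all(a[i] >= a[i + 1] for i in range(n - 1)):
            a.reverse()
            return a
    a.sort()                          # fall back to the normal sort path
    return a
```

The key property claimed in the entry is preserved: a pre-sorted input costs one comparison pass, while a failed hint costs only the three-element sample check before falling through to the regular sort.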
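The ad-inventory entry's inverted-index-plus-bitmap-intersection approach can be sketched like this. All names are hypothetical, and Python's arbitrary-precision integers stand in for the compressed bitmaps a production system would use:

```python
class AdInventoryIndex:
    """Inverted index mapping a targeting (attribute, value) pair to a
    bitmap of impression ids; matching all criteria of a buy is a chain
    of bitwise ANDs. A sketch of the overlapping-set idea."""

    def __init__(self):
        self.postings = {}   # (attribute, value) -> int used as a bitmap

    def add_impression(self, imp_id, attrs):
        # Set bit imp_id in the posting bitmap of every attribute pair.
        for key in attrs.items():
            self.postings[key] = self.postings.get(key, 0) | (1 << imp_id)

    def matching(self, criteria):
        """Bitmap of impressions matching ALL (attribute, value) criteria."""
        bm = None
        for key in criteria.items():
            p = self.postings.get(key, 0)
            bm = p if bm is None else bm & p
        return bm or 0

    def count(self, criteria):
        # Popcount of the intersection = available inventory for this buy.
        return bin(self.matching(criteria)).count("1")
```

Because each overlapping buy is just another AND over shared postings, forecasting many buys reuses the same index rather than materializing each audience separately.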
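One plausible reading of the expected-vs-actual frequency correction in the 2009 entry is a lift/PMI-style score: divide a pair's observed co-occurrence count by the count expected if the two items were independent. The exact formula isn't given above, so this sketch (with hypothetical names) is an assumption:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_scores(histories):
    """Score item pairs by actual vs. expected co-occurrence across user
    histories. Expected count assumes independence:
        freq(a)/n * freq(b)/n * n
    so scores > 1 mean the pair co-occurs more than global popularity
    alone predicts; popular-with-everything items are pushed toward 1."""
    n = len(histories)
    item_users = Counter()   # item -> number of histories containing it
    pair_users = Counter()   # {a, b} -> number of histories with both
    for items in histories:
        items = set(items)
        item_users.update(items)
        pair_users.update(frozenset(p) for p in combinations(sorted(items), 2))
    scores = {}
    for pair, actual in pair_users.items():
        a, b = tuple(pair)
        expected = (item_users[a] / n) * (item_users[b] / n) * n
        scores[pair] = actual / expected
    return scores
```

Dividing out the expected count is what removes global popularity bias: a blockbuster item co-occurs with everything, but only pairs exceeding their independence baseline score highly.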
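The near-shingling reduction from the 2001 entry can be sketched directly from its description: hash the title, the first sentence, and the 10 longest sentences, then compare documents by signature overlap. Function names and the choice of SHA-1 here are illustrative, not the original implementation:

```python
import hashlib
import re

def near_shingle_signature(title, body, k=10):
    """Reduce a document to a small set of sentence hashes: the title,
    the first sentence, and the k longest sentences. Far cheaper to
    store and intersect than full shingling."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", body) if s.strip()]
    picked = [title]
    if sentences:
        picked.append(sentences[0])
        picked.extend(sorted(sentences, key=len, reverse=True)[:k])

    def h(s):
        # Short, case-insensitive fingerprint of one sentence.
        return hashlib.sha1(s.lower().encode()).hexdigest()[:16]

    return {h(s) for s in picked}     # set: duplicates collapse naturally

def resemblance(sig_a, sig_b):
    """Jaccard overlap between two signatures; near 1.0 => near-duplicates."""
    return len(sig_a & sig_b) / len(sig_a | sig_b)
```

Each signature is at most k + 2 short hashes per document, which is what makes the dual forward/inverted indexing mentioned above affordable at crawl scale.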
A modular, multi-tier dependency graph of C libraries built for out-of-core data processing, search, and system reliability. Designed strictly for performance, simplicity, and composability.