5 changes: 5 additions & 0 deletions .env.example
@@ -0,0 +1,5 @@
# voSINT v2 is token-free by default.
# Optional flags only.
VOSINT_MODE=deep
VOSINT_HEADFUL=false
VOSINT_CASES_DIR=cases
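The diff does not show how these three flags are consumed. A minimal sketch of one way to read them, assuming the defaults match the example values above and that the accepted modes are the `fast`, `deep`, and `stealth` values listed in the README. `Settings` and `load_settings` are hypothetical names, not part of this PR:

```python
import os
from dataclasses import dataclass
from pathlib import Path

# Assumption: the allowed modes are the three documented in the README.
ALLOWED_MODES = {"fast", "deep", "stealth"}
TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; fields mirror the three VOSINT_* variables.
    mode: str
    headful: bool
    cases_dir: Path


def load_settings(env=None) -> Settings:
    """Read the optional VOSINT_* flags, falling back to the example values."""
    env = os.environ if env is None else env
    mode = env.get("VOSINT_MODE", "deep").strip().lower()
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported VOSINT_MODE: {mode!r}")
    headful = env.get("VOSINT_HEADFUL", "false").strip().lower() in TRUTHY
    cases_dir = Path(env.get("VOSINT_CASES_DIR", "cases"))
    return Settings(mode=mode, headful=headful, cases_dir=cases_dir)
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test without touching the process environment.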
93 changes: 56 additions & 37 deletions README.md
@@ -1,55 +1,74 @@
<p align="center">
<img src="Results/logo.png" width="500">
</p>
# voSINT: Video Reverse Search OSINT Tool
# voSINT v2

## Description
voSINT is an Open Source Intelligence (OSINT) tool designed for video reverse search. It enables users to trace the digital footprint of a video across the internet. By listing the results in descending order, voSINT reveals where a video first appeared and its subsequent occurrences online. This tool is invaluable for cybersecurity experts, digital forensics analysts, and anyone interested in the origin and spread of digital content.
Token-free reverse-video OSINT workflow focused on **origin hunting** and **repost spread** analysis.

Key Features:
- Track video appearances online in descending order.
- Generate approximate results, prioritizing data scope.
- Beta version focused on user feedback and continuous improvement.
## Highlights
- No SerpApi key, no `config.ini` setup, no upload-to-host requirement.
- Playwright provider adapters (default order): **Pinterest**, Google Lens, Bing Visual, Yandex, TinEye.
- Multi-frame extraction with quality scoring and timeline-first ranking.
- OCR pivots, transcript pivots (optional local dependencies), and Video DNA artifact.
- Case-folder outputs: HTML/JSON/CSV + raw/normalized artifacts.
- Commands: `scan`, `diff`, `report`.

## Installation Guide
Navigate to the directory where you want to create your project.

### Setting up a Virtual Environment
Run the following command to create a virtual environment (replace 'venv' with your desired environment name):
## Install
```bash
python3 -m venv venv
python -m venv .venv
source .venv/bin/activate
pip install -e .
python -m playwright install chromium
```
Activate the virtual environment:

## CLI
### Single scan
```bash
source venv/bin/activate
vosint scan video.mp4
```
Install the required packages:

### Batch scan
```bash
pip install -r requirements.txt
vosint scan ./videos --batch --mode deep
```

## Usage Instructions
For using the tool with a single video:
### Compare videos
```bash
python voSINT.py <video_path>
vosint diff a.mp4 b.mp4
```
For multiple videos in a directory:

### Re-open a case report
```bash
python voSINT.py <videos_dir>
vosint report cases/case_YYYYMMDD_HHMMSS
```

By creating and activating a virtual environment, you ensure that the installed packages and dependencies are isolated from your system's global Python environment, providing a clean and separate environment for your project.

## API Key Configuration
Before using voSINT, you need to obtain an API key from SerpApi.com. This key is essential for the tool to perform video reverse searches in Google and Yandex without dealing with CAPTCHA. Follow these steps to configure your API key:

Visit SerpApi.com and sign up to receive an API key.
## Modes
- `fast`: few top frames, Pinterest + Google Lens + Bing.
- `deep` (default): more frames, OCR+transcript, all default providers.
- `stealth`: local-only extraction/pivots, no provider submission; emits manual query pack.

Once you have your API key, open the config.ini file in the voSINT directory.

Insert your API key in the designated section of config.ini.

Ensure your API key is correctly saved in the configuration file to enable the full functionality of voSINT.
## Common flags
```bash
--mode fast|deep|stealth
--providers pinterest,google_lens,bing_visual,yandex,tineye
--max-frames 8
--ocr
--transcribe
--json --csv --html
--keep-frames
--no-browser
--headful
```

## Output layout
Each run writes:
```
cases/<case_id>/
    input/
    frames/
    raw/
    normalized/
    report.html
    report.json
    timeline.csv
```

![](https://raw.githubusercontent.com/Meshall/voSINT/master/walkthrough.gif)
## Privacy note
Reverse-image providers receive submitted frames unless `--mode stealth` or `--no-browser` is used.
4 changes: 0 additions & 4 deletions config.ini

This file was deleted.

Binary file removed modules/.html_generator.py.swo
Binary file not shown.
Binary file removed modules/.html_generator.py.swp
Binary file not shown.
1 change: 0 additions & 1 deletion modules/__init__.py

This file was deleted.

137 changes: 0 additions & 137 deletions modules/html_generator.py

This file was deleted.

7 changes: 0 additions & 7 deletions modules/upload.py

This file was deleted.

39 changes: 0 additions & 39 deletions modules/video_search.py

This file was deleted.

26 changes: 26 additions & 0 deletions pyproject.toml
@@ -0,0 +1,26 @@
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "vosint"
version = "2.0.0"
description = "Token-free video reverse-search and provenance OSINT toolkit"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "numpy>=1.25",
    "playwright>=1.45",
    "imageio>=2.34",
    "pillow>=10.0",
]

[project.optional-dependencies]
ocr = ["pytesseract>=0.3.10"]
transcribe = ["openai-whisper>=20231117"]

[project.scripts]
vosint = "vosint.cli:main"

[tool.setuptools.packages.find]
include = ["vosint*"]
13 changes: 4 additions & 9 deletions requirements.txt
@@ -1,9 +1,4 @@
certifi==2023.7.22
charset-normalizer==3.2.0
google-search-results==2.4.2
idna==3.4
numpy==1.25.2
opencv-python==4.8.0.76
requests==2.31.0
tqdm==4.66.1
urllib3==2.0.4
numpy>=1.25
playwright>=1.45
imageio>=2.34
pillow>=10.0
13 changes: 13 additions & 0 deletions tests/test_normalize.py
@@ -0,0 +1,13 @@
from vosint.core.normalize import normalize_hits
from vosint.models import Hit


def test_normalize_merges_duplicates():
    hits = [
        Hit(engine="pinterest", frame_id="f1", url="HTTPS://Example.com/path/", title="a"),
        Hit(engine="bing_visual", frame_id="f2", url="https://example.com/path", title="b"),
    ]
    merged = normalize_hits(hits)
    assert len(merged) == 1
    assert merged[0].domain == "example.com"
    assert merged[0].support_engines == {"pinterest", "bing_visual"}
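The implementation of `vosint.core.normalize` is not part of the visible diff. As an illustration of the behaviour this test pins down (case-insensitive host, trailing slash ignored, engines collected per merged URL), here is a minimal sketch. The `Hit` dataclass below is a stand-in with only the fields the test touches, and `canonical_url` is a hypothetical helper. The real module may differ:

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit, urlunsplit


@dataclass
class Hit:
    # Stand-in for vosint.models.Hit (assumed shape, not the PR's code).
    engine: str
    frame_id: str
    url: str
    title: str = ""
    domain: str = ""
    support_engines: set = field(default_factory=set)


def canonical_url(url: str) -> str:
    """Lowercase scheme and host, drop trailing slash and fragment."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def normalize_hits(hits):
    """Merge hits that share a canonical URL, recording every engine that found it."""
    merged = {}
    for hit in hits:
        key = canonical_url(hit.url)
        kept = merged.get(key)
        if kept is None:
            # The first hit seen for a URL supplies frame_id and title.
            kept = Hit(hit.engine, hit.frame_id, key, hit.title,
                       domain=urlsplit(key).hostname or "")
            merged[key] = kept
        kept.support_engines.add(hit.engine)
    return list(merged.values())
```

Keying the merge on a canonical string keeps it a single pass over the hits and preserves first-seen order, since Python dicts keep insertion order.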
10 changes: 10 additions & 0 deletions tests/test_timeline.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
from vosint.core.timeline import rank_timeline
from vosint.models import Hit


def test_rank_timeline_known_dates_first():
    a = Hit(engine="p", frame_id="1", url="https://a", date_raw="2021-01-01")
    b = Hit(engine="p", frame_id="2", url="https://b")
    a.date_parsed = __import__("datetime").datetime(2021, 1, 1)
    out = rank_timeline([b, a])
    assert out[0].url == "https://a"
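`vosint.core.timeline` is likewise not shown in the diff. One way to satisfy this test is a stable sort that puts dated hits first. The sketch below also orders dated hits oldest-first, which is an assumption based on the README's "origin hunting" goal and is not confirmed by the test. `Hit` is again a stand-in dataclass:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Hit:
    # Stand-in for vosint.models.Hit (assumed shape, not the PR's code).
    engine: str
    frame_id: str
    url: str
    date_raw: Optional[str] = None
    date_parsed: Optional[datetime] = None


def rank_timeline(hits):
    """Return dated hits first (oldest to newest), then undated hits in input order."""
    # Key is (is_undated, date). False sorts before True, so dated hits lead.
    # datetime.max is only a placeholder so undated hits remain comparable.
    return sorted(hits, key=lambda h: (h.date_parsed is None, h.date_parsed or datetime.max))
```

Because `sorted` is stable, undated hits keep their original relative order, so the provider ordering upstream is not lost.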