Merged
3 changes: 0 additions & 3 deletions .github/workflows/staging-deploy.yml
@@ -4,9 +4,6 @@ on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
  workflow_dispatch:

jobs:
55 changes: 55 additions & 0 deletions .github/workflows/test-lint.yml
@@ -0,0 +1,55 @@
name: Test and Lint

on:
  pull_request:

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      actions: read
      checks: write
      pull-requests: write
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.8"]
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        id: python-setup
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m venv .venv
          .venv/bin/pip install pytest
          .venv/bin/pip install -r app/requirements.txt

      - name: Lint with flake8
        run: |
          source .venv/bin/activate
          .venv/bin/pip install flake8
          # stop the build if there are Python syntax errors or undefined names
          flake8 ./app --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 ./app --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

      - name: Run tests
        if: always()
        run: |
          source .venv/bin/activate
          pytest --junit-xml=./reports/pytest.xml --tb=auto -v

      - name: Upload test results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: test-results-${{ matrix.python-version }}
          path: ./reports/pytest.xml
          if-no-files-found: warn
10 changes: 5 additions & 5 deletions app/docs/user.md
@@ -8,10 +8,10 @@ Note that this function is designed to handle comparisons of mathematical expres

### Optional parameters

There are 15 optional parameters that can be set: `atol`, `complexNumbers`, `convention`, `criteria`, `elementary_functions`, `feedback_for_incorrect_response`, `multiple_answers_criteria`, `physical_quantity`, `plus_minus`/`minus_plus`, `rtol`, `specialFunctions`, `strict_syntax`, `strictness`, `symbol_assumptions`.
There are 15 optional parameters that can be set: `absolute_tolerance`, `complexNumbers`, `convention`, `criteria`, `elementary_functions`, `feedback_for_incorrect_response`, `multiple_answers_criteria`, `physical_quantity`, `plus_minus`/`minus_plus`, `relative_tolerance`, `specialFunctions`, `strict_syntax`, `strictness`, `symbol_assumptions`.

#### `atol`
Sets the absolute tolerance, $e_a$, i.e. if the answer, $x$, and response, $\tilde{x}$, are numerical values then the response is considered equal to the answer if $|x-\tilde{x}| \leq e_a$. By default `atol` is set to `0`, which means the comparison will be done with as high accuracy as possible. If either the answer or the response aren't numerical expressions this parameter is ignored.
#### `absolute_tolerance` (`atol`)
Sets the absolute tolerance, $e_a$, i.e. if the answer, $x$, and response, $\tilde{x}$, are numerical values then the response is considered equal to the answer if $|x-\tilde{x}| \leq e_a$. By default `absolute_tolerance` is set to `0`, which means the comparison will be done with as high accuracy as possible. If either the answer or the response aren't numerical expressions this parameter is ignored.
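The rule described above can be sketched in plain Python (a minimal illustration of the tolerance check, with a hypothetical helper name — not the function's actual implementation):

```python
def within_absolute_tolerance(answer, response, atol):
    # Accept the response when |x - x~| <= e_a.
    # atol == 0 means "compare with as high accuracy as possible",
    # sketched here as exact equality.
    if atol == 0:
        return answer == response
    return abs(answer - response) <= atol
```

With the values from the physical-quantity tests below, `within_absolute_tolerance(7500, 7504.1, 5)` accepts the response, since $|7500 - 7504.1| = 4.1 \leq 5$.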

#### `complexNumbers`

@@ -76,8 +76,8 @@ When `physical_quantity` the evaluation function will generate feedback based on

**TODO:** Generate new flowchart for updated physical quantity feedback generation procedure.

#### `rtol`
Sets the relative tolerance, $e_r$, i.e. if the answer, $x$, and response, $\tilde{x}$, are numerical values then the response is considered equal to the answer if $\left|\frac{x-\tilde{x}}{x}\right| \leq e_r$. By default `rtol` is set to `0`, which means the comparison will be done with as high accuracy as possible. If either the answer or the response aren't numerical expressions this parameter is ignored.
#### `relative_tolerance` (`rtol`)
Sets the relative tolerance, $e_r$, i.e. if the answer, $x$, and response, $\tilde{x}$, are numerical values then the response is considered equal to the answer if $\left|\frac{x-\tilde{x}}{x}\right| \leq e_r$. By default `relative_tolerance` is set to `0`, which means the comparison will be done with as high accuracy as possible. If either the answer or the response aren't numerical expressions this parameter is ignored.
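As with the absolute tolerance, the relative check can be sketched in plain Python (a minimal illustration with a hypothetical helper name, not the function's actual implementation):

```python
def within_relative_tolerance(answer, response, rtol):
    # Accept the response when |(x - x~) / x| <= e_r.
    # rtol == 0 means "compare with as high accuracy as possible",
    # sketched here as exact equality.
    if rtol == 0:
        return answer == response
    return abs((answer - response) / answer) <= rtol
```

Note that the difference is normalized by the answer $x$, so the same absolute error is judged more leniently for larger answers.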

#### `strictness`

7 changes: 6 additions & 1 deletion app/evaluation.py
@@ -233,6 +233,12 @@ def evaluation_function(response, answer, params, include_test_data=False) -> di
    - if set to True, use basic dimensional analysis functionality.
    """

    if "relative_tolerance" in params:
        params["rtol"] = params["relative_tolerance"]

    if "absolute_tolerance" in params:
        params["atol"] = params["absolute_tolerance"]

    evaluation_result = EvaluationResult()
    evaluation_result.is_correct = False

@@ -318,7 +324,6 @@ def evaluation_function(response, answer, params, include_test_data=False) -> di
        "reserved_expressions": reserved_expressions_parsed,
        "criteria": criteria,
        "disabled_evaluation_nodes": parameters.get("disabled_evaluation_nodes", set()),
        "evaluation_result": evaluation_result,
        "parsing_parameters": parsing_parameters,
        "evaluation_result": evaluation_result,
        "syntactical_comparison": parameters.get("syntactical_comparison", False),
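The alias handling added to `evaluation_function` above can be summarized in isolation (a sketch of the same mapping; `normalize_tolerance_params` is a hypothetical helper name, not part of the app):

```python
def normalize_tolerance_params(params):
    # Map the long-form parameter names introduced in this PR onto the
    # short names (rtol/atol) that the rest of the evaluation code reads.
    # The mapping mutates the params dict in place, as the PR does.
    if "relative_tolerance" in params:
        params["rtol"] = params["relative_tolerance"]
    if "absolute_tolerance" in params:
        params["atol"] = params["absolute_tolerance"]
    return params
```

For example, `normalize_tolerance_params({"relative_tolerance": 0.05})` yields a dict whose `rtol` key is `0.05`, while the original key is left in place.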
24 changes: 24 additions & 0 deletions app/tests/physical_quantity_evaluation_test.py
@@ -337,6 +337,18 @@ def test_physical_quantity_with_rtol(self):
        result = evaluation_function(res, ans, params, include_test_data=True)
        assert result["is_correct"] is True

    def test_physical_quantity_with_rel_tol(self):
        ans = "7500 m/s"
        res = "7504.1 m/s"
        params = {
            'relative_tolerance': 0.05,
            'strict_syntax': False,
            'physical_quantity': True,
            'elementary_functions': True,
        }
        result = evaluation_function(res, ans, params, include_test_data=True)
        assert result["is_correct"] is True

    def test_physical_quantity_with_atol(self):
        ans = "7500 m/s"
        res = "7504.1 m/s"
@@ -349,6 +361,18 @@ def test_physical_quantity_with_atol(self):
        result = evaluation_function(res, ans, params, include_test_data=True)
        assert result["is_correct"] is True

    def test_physical_quantity_with_abs_tol(self):
        ans = "7500 m/s"
        res = "7504.1 m/s"
        params = {
            'absolute_tolerance': 5,
            'strict_syntax': False,
            'physical_quantity': True,
            'elementary_functions': True,
        }
        result = evaluation_function(res, ans, params, include_test_data=True)
        assert result["is_correct"] is True

    def test_tolerance_given_as_string(self):
        ans = "4.52 kg"
        res = "13.74 kg"
13 changes: 13 additions & 0 deletions app/tests/symbolic_evaluation_test.py
@@ -928,6 +928,19 @@ def test_pi_with_rtol(self):
        result = evaluation_function(response, answer, params)
        assert result["is_correct"] is True

    def test_pi_with_rel_tol(self):
        answer = "pi"
        response = "3.14"
        params = {
            "strict_syntax": False,
            "relative_tolerance": 0.05,
            "symbols": {
                "pi": {"aliases": ["Pi", "PI", "π"], "latex": "\\(\\pi\\)"},
            }
        }
        result = evaluation_function(response, answer, params)
        assert result["is_correct"] is True

    @pytest.mark.parametrize(
        "response,outcome",
        [