Installation on Mac
macOS Monterey ver. 12.0.1
# --------------------------------------------------------------
I first tried the official installation docs, without a virtual env:
- https://airflow.apache.org/docs/apache-airflow/1.10.13/installation.html
and ran into many package conflicts.
# --------------------------------------------------------------
So I decided to follow these instructions:
Setting up Apache-Airflow in MacOS
By Hiren Rupchandani & Mukesh Kumar
INSAID, Aug 17, 2021
- https://insaid.medium.com/setting-up-apache-airflow-in-macos-2b5e86eeaf1
# first make sure to install CommandLineTools
xcode-select --install
# then install virtualenv
pip install virtualenv
cd
mkdir airflow_workspace
cd airflow_workspace
virtualenv airflow_env
source airflow_env/bin/activate        # in bash
source airflow_env/bin/activate.fish   # in fish
pip3 install "apache-airflow[gcp,sentry,statsd]"   # quotes keep zsh (default macOS shell) from globbing the brackets
/Users/levselector/airflow_workspace/airflow_env/bin/python -m pip install --upgrade pip
pip install pyspark
pip install scikit-learn   # "sklearn" on PyPI is only a deprecated placeholder name
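The package conflicts mentioned at the top are what Airflow's published constraint files are meant to avoid. A minimal sketch of a pinned install follows. The versions 2.2.3 and 3.9 are assumptions (these notes do not say which Airflow version was installed), and the snippet is written for bash/zsh rather than fish. It only prints the command, so it can be reviewed before running it inside airflow_env.

```shell
# Assumed versions (not stated in these notes); adjust to match your setup.
AIRFLOW_VERSION="2.2.3"
PYTHON_VERSION="3.9"   # major.minor of the interpreter inside airflow_env

# Constraint files are published per Airflow version and per Python version.
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Build the install command and print it for review before running it.
INSTALL_CMD="pip install \"apache-airflow[gcp,sentry,statsd]==${AIRFLOW_VERSION}\" --constraint \"${CONSTRAINT_URL}\""
echo "${INSTALL_CMD}"
```

Copy the printed line into the terminal where airflow_env is active to perform the actual install.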
mkdir airflow
cd airflow # ~/airflow_workspace/airflow/
airflow db init # creates SQLite DB under ~/airflow/
# make directory for DAGs
mkdir dags # ~/airflow_workspace/airflow/dags/
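Note that Airflow keeps its config, database, logs and DAGs under AIRFLOW_HOME, which defaults to ~/airflow. That is why the DB above ends up in ~/airflow/ and not in the workspace, and why a dags directory under ~/airflow_workspace/airflow/ is not scanned by default. A minimal sketch, assuming you want everything to live inside the workspace instead: export AIRFLOW_HOME before running "airflow db init", and again in every terminal that runs the scheduler or the webserver.

```shell
# Assumption: keep all Airflow files under the workspace created above.
# In fish, use instead: set -x AIRFLOW_HOME "$HOME/airflow_workspace/airflow"
export AIRFLOW_HOME="$HOME/airflow_workspace/airflow"

# With default settings, Airflow looks for DAG files in $AIRFLOW_HOME/dags.
mkdir -p "$AIRFLOW_HOME/dags"
echo "AIRFLOW_HOME=$AIRFLOW_HOME"
```

If AIRFLOW_HOME is left unset, put DAG files in ~/airflow/dags instead.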
# create an admin user
airflow users create --username admin --password q --firstname Lev --lastname Selector --role Admin --email lev.selector@gmail.com
# check the list of users
airflow users list
# start airflow scheduler
airflow scheduler # takes time
# start webserver in new terminal window:
cd ~/airflow_workspace/
source airflow_env/bin/activate.fish
cd airflow
airflow webserver
Open in a browser: http://localhost:8080/
Select "DAGs" in the top horizontal menu.
You will see many example DAGs.
Select "example_python_operator"
and switch to the "Code" tab.
You will see the Python script.
The corresponding source file is here:
~/airflow_workspace/airflow_env/lib/python3.9/site-packages/airflow/example_dags/example_python_operator.py
The scheduler log file is here:
~/airflow/logs/scheduler/2022-02-03/native_dags/example_dags/example_python_operator.py.log
At the top right there is a triangular "run" button; click it to trigger a run.
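The same run can also be started from the command line. A minimal sketch, assuming the Airflow 2.x CLI and a terminal where airflow_env is active. The guard around the commands is only there so the snippet does nothing harmful when airflow is not on PATH.

```shell
DAG_ID="example_python_operator"

if command -v airflow >/dev/null 2>&1; then
    # New DAGs are paused by default; unpause so the scheduler picks up the run.
    airflow dags unpause "$DAG_ID"
    # Queue a manual run, like pressing the "run" button in the web UI.
    airflow dags trigger "$DAG_ID"
else
    echo "airflow not found on PATH; activate airflow_env first" >&2
fi
```

The scheduler must be running for the triggered run to actually execute.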
# --------------------------------------------------------------
Next tutorial:
Hello World using Apache-Airflow
- https://insaid.medium.com/hello-world-using-apache-airflow-91859e3bbfd5
Yet another tutorial:
How to build a data extraction pipeline with Apache Airflow
- https://towardsdatascience.com/how-to-build-a-data-extraction-pipeline-with-apache-airflow-fa83cb8dbcdf
# --------------------------------------------------------------